meeko package

class meeko.MoleculePreparation(merge_these_atom_types=('H',), hydrate=False, flexible_amides=False, rigid_macrocycles=False, untyped_macrocycles=False, min_ring_size=7, max_ring_size=33, keep_chorded_rings=False, keep_equivalent_rings=False, double_bond_penalty=50, macrocycle_allow_A=False, rigidify_bonds_smarts=[], rigidify_bonds_indices=[], input_atom_params=None, load_atom_params='ad4_types', add_atom_types=None, input_offatom_params=None, load_offatom_params=None, charge_model='gasteiger', charge_atom_prop=None, dihedral_model=None, reactive_smarts=None, reactive_smarts_idx=None, add_index_map=False, remove_smiles=False)[source]#

Bases: object

A class representing the protocol for preparing a molecule for docking.

deprecated_setup_access#

Deprecated access to the setup object (deprecated since v0.5).

Type:

list

merge_these_atom_types#

A tuple of atom types to merge. For example, (“H”,) will merge all hydrogens.

Type:

tuple

hydrate#

If True, the molecule will be hydrated with water molecules.

Type:

bool

flexible_amides#

If True, amide bonds will be treated as flexible.

Type:

bool

rigid_macrocycles#

If True, macrocycles will be treated as rigid.

Type:

bool

untyped_macrocycles#

If True, macrocycles will not be typed.

Type:

bool

min_ring_size#

The minimum size of a ring to be considered a macrocycle.

Type:

int

max_ring_size#

The maximum size of a ring to be considered a macrocycle.

Type:

int

keep_chorded_rings#

If True, chorded rings will be kept in the molecule.

Type:

bool

keep_equivalent_rings#

If True, equivalent rings will be kept in the molecule.

Type:

bool

double_bond_penalty#

The penalty for double bonds in macrocycles.

Type:

float

macrocycle_allow_A#

If True, macrocycles will be allowed to have atom type A (aromatic carbon).

Type:

bool

rigidify_bonds_smarts#

A list of SMARTS patterns for bonds to be rigidified.

Type:

list

rigidify_bonds_indices#

A list of tuples of indices representing bonds to be rigidified.

Type:

list[tuple[int, int]]

input_atom_params#

A dictionary of input atom parameters.

Type:

dict

load_atom_params#

A string representing the name of the atom parameters to load.

Type:

str

add_atom_types#

A list of additional atom types to add.

Type:

list

charge_model#

The charge model to use. Options are “espaloma”, “gasteiger”, “zero”, or “read”.

Type:

str

charge_atom_prop#

The name of the atom property to use for charges.

Type:

str

dihedral_model#

The dihedral model to use. Options are None, “openff”, or “espaloma”.

Type:

str

reactive_smarts#

A string representing the SMARTS pattern for reactive groups.

Type:

str

reactive_smarts_idx#

An index representing the position of the reactive atom in the SMARTS pattern.

Type:

int

add_index_map#

If True, an index map will be added to the molecule.

Type:

bool

remove_smiles#

If True, the SMILES representation of the molecule will be removed.

Type:

bool

atom_params#

A dictionary of atom parameters.

Type:

dict

offatom_params#

A dictionary of off-atom parameters.

Type:

dict

dihedral_params#

A list of dihedral parameters.

Type:

list

espaloma_model#

An instance of the EspalomaTyper class.

Type:

EspalomaTyper

packaged_params = {'ad4_desolv_param': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_desolv_param.json'), 'ad4_desolv_volume': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_desolv_volume.json'), 'ad4_hb': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_hb.json'), 'ad4_types': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_types.json'), 'ad4_vdw': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_vdw.json'), 'example_offatom_charge': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/example_offatom_charge.json'), 'vina_params': PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/vina_params.json')}#
name = 'ad4_desolv_volume'#
classmethod from_config(config)[source]#

Create a MoleculePreparation instance from a configuration dictionary.

Parameters:

config (dict) – A dictionary containing the configuration parameters for the MoleculePreparation instance.

Returns:

A new instance of the MoleculePreparation class with the specified configuration.

Return type:

MoleculePreparation

classmethod from_json_file(filename)[source]#

Create a MoleculePreparation instance from a JSON file containing the configuration.

Parameters:

filename (str) – The path to the JSON file containing the configuration parameters.

Returns:

A new instance of the MoleculePreparation class with the specified configuration.

Return type:

MoleculePreparation
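As an illustration of the JSON route, the sketch below writes a small configuration file and reads it back. It assumes that the configuration keys mirror the constructor keyword names shown in the class signature; that is an assumption of this example rather than something stated here. The helper names `write_config` and `load_preparation` are hypothetical, and `load_preparation` imports Meeko only when it is called.

```python
import json
import os
import tempfile

# Keys are copied from the constructor signature. That from_config accepts
# exactly these keys is an assumption of this sketch, not a documented fact.
config = {
    "merge_these_atom_types": ["H"],  # JSON has no tuple type, so a list is used
    "hydrate": False,
    "rigid_macrocycles": True,
    "min_ring_size": 7,
    "charge_model": "gasteiger",
}


def write_config(config, path):
    """Serialize the configuration dictionary to a JSON file."""
    with open(path, "w") as handle:
        json.dump(config, handle, indent=2)


def load_preparation(path):
    """Build a MoleculePreparation from the JSON file (requires meeko)."""
    from meeko import MoleculePreparation  # imported lazily on purpose

    return MoleculePreparation.from_json_file(path)


config_path = os.path.join(tempfile.mkdtemp(), "mk_config.json")
write_config(config, config_path)

# Read the file back to confirm that it round-trips unchanged.
with open(config_path) as handle:
    reloaded = json.load(handle)
```

With Meeko installed, `load_preparation(config_path)` would then return the configured instance.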

calc_flex(setup, root_atom_index=None, not_terminal_atoms=None, delete_ring_bonds=None, glue_pseudo_atoms=None)[source]#

Calculate the flexibility model for the molecule setup.

Parameters:
  • setup (RDKitMoleculeSetup) – The molecule setup for which to calculate the flexibility model.

  • root_atom_index (int, optional) – The index of the root atom in the molecule setup. If None, the root atom will be determined automatically.

  • not_terminal_atoms (list, optional) – A list of atom indices that should be treated as non-terminal atoms.

  • delete_ring_bonds (list[tuple[int, int]], optional) – Bonds deleted for macrocycle flexibility. Each bond is a tuple of two ints (atom 0-indices).

  • glue_pseudo_atoms (dict, optional) – A dictionary for pseudo atoms mapping atom indices to their corresponding coordinates.

Return type:

None

static get_atom_params(input_atom_params, load_atom_params, add_atom_types, packaged_params)[source]#

Load and combine atom parameters from various sources.

Parameters:
  • input_atom_params (dict) – A dictionary of input atom parameters.

  • load_atom_params (str) – A string representing the name of the atom parameters to load.

  • add_atom_types (list) – A list of additional atom types to add.

  • packaged_params (dict) – A dictionary of packaged parameters. Keys are parameter group names and values are file paths to the corresponding JSON files.

Raises:
  • ValueError – If there are overlapping parameter groups or if the name of the parameter group is not recognized.

  • RuntimeError – If there are multiple groups of parameters when using add_atom_types.

Returns:

atom_params – A dictionary of combined atom parameters.

Return type:

dict

property setup#

Deprecated access to the setup object.

Returns:

The setup object in the deprecated_setup_access list.

Return type:

RDKitMoleculeSetup

Raises:

RuntimeError – If there are multiple setups in the deprecated_setup_access list.

classmethod get_defaults_dict()[source]#

Get the default values for all parameters in the class.

Returns:

A dictionary containing the default values for all parameters in the class.

Return type:

dict

prepare(mol, root_atom_index=None, not_terminal_atoms=None, delete_ring_bonds=None, glue_pseudo_atoms=None, conformer_id=-1, rename_atoms=False)[source]#

Create an RDKitMoleculeSetup from an RDKit Mol object.

Parameters:
  • mol (Chem.Mol) – An RDKit Mol with explicit hydrogens and 3D coordinates.

  • root_atom_index (int, optional) – Used to set ROOT of torsion tree instead of searching. Default is None, which means the root atom will be determined automatically.

  • not_terminal_atoms (list, optional) – A list of atom indices that should be treated as non-terminal atoms.

  • delete_ring_bonds (list[tuple[int, int]], optional) – Bonds deleted for macrocycle flexibility. Each bond is a tuple of two ints (atom 0-indices).

  • glue_pseudo_atoms (dict, optional) – Mapping from parent atom indices to coordinates.

  • conformer_id (int) – Conformer ID to use for the molecule. Default is -1, which means the current conformer will be used.

  • rename_atoms (bool) – If True, atoms will be renamed to include their index. Default is False, which means atoms will not be renamed.

Returns:

setups – Returns a list of generated RDKitMoleculeSetups

Return type:

list[RDKitMoleculeSetup]
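A minimal sketch of a typical call, assuming RDKit and Meeko are installed. The function name `prepare_from_smiles` and the `seed` argument are hypothetical; the RDKit calls add explicit hydrogens and embed 3D coordinates because `prepare` expects both. Imports are done inside the function so that defining it does not require either package.

```python
def prepare_from_smiles(smiles, seed=42):
    """Return the list of setups produced by MoleculePreparation.prepare.

    Requires RDKit and Meeko; both are imported lazily so that defining
    this function does not need them.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem
    from meeko import MoleculePreparation

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # prepare() expects explicit hydrogens
    if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:  # -1 signals failure
        raise RuntimeError("3D embedding failed")

    mk_prep = MoleculePreparation()  # all defaults from the class signature
    return mk_prep.prepare(mol)
```

Calling `prepare_from_smiles("CCO")` would return a list, so callers that expect a single setup should index into it.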

static check_external_ring_break(molsetup, break_ring_bonds, glue_pseudo_atoms)[source]#

Check that the external ring break bonds are in the molecule setup and that the glue pseudo atoms are present and have the correct number of coordinates.

Parameters:
  • molsetup (RDKitMoleculeSetup) – The molecule setup to check.

  • break_ring_bonds (list[tuple[int, int]]) – A list of tuples representing the bonds to break.

  • glue_pseudo_atoms (dict) – A dictionary mapping atom indices to their corresponding coordinates.

Return type:

None

Raises:

ValueError – If bonds are missing from the MoleculeSetup, if glue_pseudo_atoms is missing certain atom indices, or if there is an incorrect number of coordinates in glue_pseudo_atoms.
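The three checks described above can be sketched in plain Python. This is an illustrative re-implementation, not Meeko's code: `check_ring_break_inputs` and `known_bonds` are hypothetical names, and the sketch assumes that each atom of a broken bond needs one pseudo atom with exactly three coordinates.

```python
def check_ring_break_inputs(known_bonds, break_ring_bonds, glue_pseudo_atoms):
    """Raise ValueError if the ring-break inputs are inconsistent.

    known_bonds stands in for the bonds of a molecule setup (hypothetical).
    """
    bonds = {tuple(sorted(bond)) for bond in known_bonds}
    for bond in break_ring_bonds:
        if tuple(sorted(bond)) not in bonds:
            raise ValueError(f"bond {bond} is not in the molecule setup")
        for atom_index in bond:
            # Assumption: each atom of a broken bond needs one pseudo atom.
            if atom_index not in glue_pseudo_atoms:
                raise ValueError(f"no glue pseudo atom for atom {atom_index}")
            if len(glue_pseudo_atoms[atom_index]) != 3:
                raise ValueError(f"atom {atom_index} needs x, y, z coordinates")


ring_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]
glue = {0: (1.0, 0.0, 0.0), 5: (0.0, 1.0, 0.0)}

check_ring_break_inputs(ring_bonds, [(5, 0)], glue)  # consistent: no exception

# Atoms 0 and 3 are not bonded, so this call is rejected.
try:
    check_ring_break_inputs(ring_bonds, [(0, 3)], glue)
    error = None
except ValueError as exc:
    error = str(exc)
```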

path = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/apidocmeeko/envs/latest/lib/python3.10/site-packages/meeko/data/params/ad4_desolv_volume.json')#
write_pdbqt_string()[source]#

Writes a PDBQT string. Deprecated in Meeko v0.5.

Returns:

The PDBQT string representation of the molecule setup.

Return type:

str

Raises:

RuntimeError – If there is an error generating the PDBQT string.

write_pdbqt_file(pdbqt_filename)[source]#

Writes a PDBQT file. Deprecated in Meeko v0.5.

Parameters:

pdbqt_filename (str) – PDBQT filename to write to

Return type:

None

class meeko.RDKitMoleculeSetup(name=None, source=None)[source]#

Bases: MoleculeSetup, MoleculeSetupExternalToolkit, BaseJSONParsable

Subclass of MoleculeSetup, used to represent MoleculeSetup objects working with RDKit objects

mol#

An RDKit Mol object to base the MoleculeSetup on.

Type:

Chem.Mol

modified_atom_positions#

List of dictionaries keyed by atom index. Used to store sets of coordinates, e.g. docked poses; dictionaries are used because not all atoms need to have new coordinates specified. Unspecified positions of hydrogens bonded to heavy atoms with modified positions are calculated “on-the-fly”.

Type:

list
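To illustrate the sparse layout of modified_atom_positions, the sketch below overlays one dictionary of new coordinates on a full list of base coordinates. The helper `apply_modified_positions` is hypothetical and uses plain tuples instead of RDKit conformers; it does not recompute hydrogen positions.

```python
def apply_modified_positions(base_positions, new_atom_positions):
    """Return a copy of base_positions with the listed atoms moved."""
    updated = [tuple(xyz) for xyz in base_positions]
    for atom_index, xyz in new_atom_positions.items():
        updated[atom_index] = tuple(xyz)
    return updated


base = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.0, 0.0)]
# One dictionary per pose; only atom 1 moves in this pose.
modified_atom_positions = [{1: (1.5, 0.5, 0.0)}]

poses = [apply_modified_positions(base, pose) for pose in modified_atom_positions]
```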

dihedral_interactions#

A list of unique fourier_series, each of which is represented as a list of dictionaries.

Type:

list

dihedral_partaking_atoms#

A mapping from tuples of atom indices to the indices in dihedral_interactions.

Type:

dict

dihedral_labels#

A mapping from tuples of atom indices to dihedral labels.

Type:

dict

atom_to_ring_id#

Mapping from atom index to ring ID for each atom belonging to a ring.

Type:

dict

rmsd_symmetry_indices#

Tuples of the indices of the molecule’s atoms that match a substructure query.

Type:

tuple

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = frozenset({'atom_params', 'atom_to_ring_id', 'atoms', 'bond_info', 'dihedral_interactions', 'dihedral_labels', 'dihedral_partaking_atoms', 'flexibility_model', 'modified_atom_positions', 'mol', 'name', 'pseudoatom_count', 'restraints', 'ring_closure_info', 'rings', 'rmsd_symmetry_indices', 'rotamers'})#
copy()[source]#

Returns a copy of the current RDKitMoleculeSetup.

Returns:

newsetup – A copy of the current RDKitMoleculeSetup object.

Return type:

RDKitMoleculeSetup

classmethod from_mol(mol, keep_chorded_rings=False, keep_equivalent_rings=False, compute_gasteiger_charges=True, read_charges_from_prop=None, conformer_id=-1)[source]#

Creates an RDKitMoleculeSetup object from an RDKit Mol object.

Parameters:
  • mol (Chem.Mol) – RDKit Mol object to build the RDKitMoleculeSetup from.

  • keep_chorded_rings (bool, optional) – Indicates whether to keep chorded rings in the molecule. Default is False.

  • keep_equivalent_rings (bool, optional) – Indicates whether to keep equivalent rings in the molecule. Default is False.

  • compute_gasteiger_charges (bool, optional) – Indicates whether to compute Gasteiger charges for the molecule. Default is True.

  • read_charges_from_prop (str, optional) – Indicates whether to read charges from a property in the molecule. Default is None.

  • conformer_id (int) – The index of the conformer to use. Default is -1, which means the current conformer will be used.

Returns:

molsetup – A populated RDKitMoleculeSetup object

Return type:

RDKitMoleculeSetup

Raises:

ValueError – If the RDKit Mol has implicit Hydrogens, or if there are no conformers for the given RDKit Mol, or if the RDKit Mol has multiple fragments, or if the RDKit Mol has a query.

static remove_elements(mol, to_rm=(12, 20, 25, 26, 30))[source]#

Removes elements from the RDKit molecule and returns the modified molecule.

Parameters:
  • mol (Chem.Mol) – The RDKit molecule to modify.

  • to_rm (tuple, optional) – A tuple of atomic numbers to remove from the molecule. Default is (12, 20, 25, 26, 30), i.e. Mg, Ca, Mn, Fe and Zn.

Returns:

  • mol (Chem.Mol) – The modified RDKit molecule.

  • idx_to_rm (dict) – A dictionary mapping atom indices to their formal charges.

  • rm_to_neigh (dict) – A dictionary mapping atom indices to their neighbors.

init_atom(compute_gasteiger_charges, read_charges_from_prop, coords)[source]#

Generates information about the atoms in an RDKit Mol and adds them to an RDKitMoleculeSetup.

Parameters:
  • compute_gasteiger_charges (bool) – Indicates whether we should compute gasteiger charges.

  • coords (list[np.ndarray]) – Atom coordinates for the RDKit Mol.

Return type:

None

Raises:
  • ValueError – If the input for read_charges_from_prop is not a string or is empty, or if the list of charges contains None.

  • RuntimeError – If the number of atoms in the modified molecule does not match the original molecule.

init_bond()[source]#

Uses the RDKit mol to initialize bond info for the RDKitMoleculeSetup

Return type:

None

find_pattern(smarts)[source]#

Given a SMARTS pattern, finds substruct matches in the molecule.

Parameters:

smarts (str) – A SMARTS string to find in the RDKit Mol object.

Returns:

The substruct matches in the RDKit Mol for the given SMARTS.

Return type:

list[tuple]

get_mol_name()[source]#

Gets the RDKit Mol’s name from self.mol.

Returns:

If the mol has a name, returns the name property.

Return type:

str

get_smiles_and_order()[source]#

Returns the SMILES string and the mapping between atom indices in the SMILES and in self.mol after running RDKit’s RemoveHs function.

Returns:

  • smiles (str) – The SMILES string of the molecule.

  • order (list[int]) – A list of integers representing the mapping between atom indices in the SMILES and self.mol.

Raises:

RuntimeError – If the number of atoms in the molecule after removing hydrogens does not match the number of atoms in the original molecule.

perceive_rings(keep_chorded_rings, keep_equivalent_rings)[source]#

Uses Hanser-Jauffret-Kaufmann exhaustive ring detection to find the rings in the molecule.

Parameters:
  • keep_chorded_rings (bool) – Indicates whether we want to keep chorded rings.

  • keep_equivalent_rings (bool) – Indicates whether we want to keep equivalent rings.

Return type:

None

get_conformer_with_modified_positions(new_atom_positions)[source]#

Gets a conformer with the specified new atom positions. We operate on one conformer at a time because SetTerminalAtomPositions acts on all conformers of a molecule, and we do not want to guarantee that all conformers require the same set of terminal atoms to be updated.

Parameters:

new_atom_positions (dict) – The new atom positions we want to use.

Returns:

new_conformer – A new conformer with the input new atom positions.

Return type:

Chem.Conformer

get_mol_with_modified_positions(new_atom_positions_list=None)[source]#

Modifies the stored RDKit Mol to a new set of atom positions, either those provided or the ones stored in self.modified_atom_positions, and returns the modified Mol object.

Parameters:

new_atom_positions_list (list[dict]) – New atom positions to add to the RDKit Mol object.

Returns:

new_mol – A new RDKit Mol object with conformers that have the desired new atom positions.

Return type:

Chem.Mol

get_num_mol_atoms()[source]#

Gets the number of atoms in the RDKit Mol object.

Returns:

Number of atoms in the RDKit Mol object.

Return type:

int

get_equivalent_atoms()[source]#

Gets the indices of the equivalent atoms in the RDKit Mol object.

Returns:

A list of indices of the equivalent atoms in the RDKit Mol object.

Return type:

list[int]

static get_symmetries_for_rmsd(mol, max_matches=17)[source]#

Finds the symmetry indices for RMSD calculation in the RDKit Mol object.

Parameters:
  • mol (Chem.Mol) – The RDKit Mol object to find symmetry indices for.

  • max_matches (int, optional) – The maximum number of matches to find. Default is 17.

Returns:

A list of tuples representing the symmetry indices for RMSD calculation.

Return type:

list[tuple]

static has_implicit_hydrogens(mol)[source]#

Checks if the RDKit molecule has implicit hydrogens.

Parameters:

mol (Chem.Mol) – The RDKit molecule to check.

Returns:

True if the molecule has implicit hydrogens, False otherwise.

Return type:

bool

restrain_to(target_mol, kcal_per_angstrom_square=1.0, delay_angstroms=2.0)[source]#

Restrains the current molecule to a target molecule using a stereo isomorphism mapping.

Parameters:
  • target_mol (Chem.Mol) – The target RDKit molecule to restrain to.

  • kcal_per_angstrom_square (float) – The force constant for the restraint.

  • delay_angstroms (float) – The distance at which the restraint is applied.

Return type:

None

Raises:

ImportError – If the misctools module is not available.

class meeko.AtomTyper[source]#

Bases: object

classmethod type_everything(molsetup, atom_params, charge_model, offatom_params=None, dihedral_params=None)[source]#
class meeko.PDBQTMolecule(pdbqt_string, name=None, poses_to_read=None, energy_range=None, is_dlg=False, skip_typing=False)[source]#

Bases: object

PDBQTMolecule class for reading PDBQT (or dlg) files from AutoDock4, AutoDock-GPU or AutoDock-Vina.

_current_pose#

Index of the current pose.

Type:

int

_pdbqt_filename#

Filename of the PDBQT file.

Type:

str

_atoms#

Array of atoms.

Type:

ndarray

_positions#

Array of positions.

Type:

ndarray

_bonds#

Dictionary of bonds.

Type:

dict

_atom_annotations#

Dictionary of atom annotations.

Type:

dict

_pose_data#

Dictionary of pose data.

Type:

dict

_name#

Name of the molecule.

Type:

str

_KDTrees#

List of KDTree objects for each pose.

Type:

list

classmethod from_file(pdbqt_filename, name=None, poses_to_read=None, energy_range=None, is_dlg=False, skip_typing=False)[source]#

Read PDBQT file and return PDBQTMolecule object.

Parameters:
  • pdbqt_filename (str) – Filename of the PDBQT file.

  • name (str, optional) – Name of the molecule. Default is None (use filename without pdbqt suffix).

  • poses_to_read (int, optional) – Total number of poses to read. Default is None (read all).

  • energy_range (float, optional) – Read docked poses until the maximum energy difference from best pose is reached, for example 2.5 kcal/mol. Default is None (read all).

  • is_dlg (bool, optional) – Input file is in dlg (AutoDock docking log) format. Default is False (input file is not in dlg format).

  • skip_typing (bool, optional) – Flag indicating that atomtyping should be skipped. Default is False (do not skip typing).
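The interaction of poses_to_read and energy_range can be pictured with a small stand-alone filter. This is a sketch, not the parser's implementation: `select_poses` is a hypothetical helper, and it assumes that scores are listed best-first and that a pose exactly at the energy cutoff is kept.

```python
def select_poses(scores, poses_to_read=None, energy_range=None):
    """Return the indices of the poses that would be read.

    scores are in kcal/mol and assumed to be sorted best (lowest) first.
    """
    selected = []
    for index, score in enumerate(scores):
        if poses_to_read is not None and len(selected) >= poses_to_read:
            break  # enough poses collected
        if energy_range is not None and score - scores[0] > energy_range:
            break  # too far above the best pose
        selected.append(index)
    return selected


scores = [-9.1, -8.7, -7.0, -6.2]
within_range = select_poses(scores, energy_range=2.5)
first_two = select_poses(scores, poses_to_read=2)
everything = select_poses(scores)
```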

property name#

Return the name of the molecule.

property pose_id#

Return the index of the current pose.

property score#

Return the score (kcal/mol) of the current pose.

available_atom_properties(ignore_properties=None)[source]#

Return all the available atom properties for that molecule. The following properties are always ignored:

  • ligand

  • flexible_residue

  • water

Parameters:

ignore_properties (list, optional) – List of properties to ignore. Default is None (no properties are ignored).

Returns:

list – List of available atom properties.

Return type:

list

has_flexible_residues()[source]#

Tell if the molecule contains a flexible residue or not.

Returns:

True if contains flexible residues, otherwise False.

Return type:

bool

has_water_molecules()[source]#

Tell if the molecule contains water molecules or not in the current pose.

Returns:

True if contains water molecules in the current pose, otherwise False.

Return type:

bool

atoms(atom_idx=None, only_active=True)[source]#

Return one or more atoms by index, or all atoms if no index is given.

Parameters:
  • atom_idx (int, list, optional) – Index of one or multiple atoms (0-based). Default is None (return all atoms).

  • only_active (bool, optional) – Return only active atoms. Default is True (return only active atoms).

Returns:

atoms – ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t).

Return type:

ndarray

positions(atom_idx=None, only_active=True)[source]#

Return coordinates (xyz) of all atoms or a certain atom.

Parameters:
  • atom_idx (int, list, optional) – Index of one or multiple atoms (0-based). Default is None (return all atoms).

  • only_active (bool, optional) – Return only active atoms. Default is True (return only active atoms).

Returns:

ndarray of coordinates (xyz).

Return type:

ndarray

atoms_by_properties(atom_properties, only_active=True)[source]#

Return atoms based on their properties.

Parameters:
  • atom_properties (str or list) – Property of the atoms to retrieve (properties: ligand, flexible_residue, vdw, hb_don, hb_acc, metal, water, reactive, glue).

  • only_active (bool, optional) – Return only active atoms. Default is True (return only active atoms).

Returns:

ndarray of atoms (atom_id, atom_name, resname, resid, chainid, xyz, q, t).

Return type:

ndarray

Raises:

KeyError – If the atom property is not valid.

closest_atoms_from_positions(xyz, radius, atom_properties=None, ignore=None)[source]#

Retrieve indices of the closest atoms around a position/coordinates within a certain radius.

Parameters:
  • xyz (np.ndarray) – array of 3D coordinates.

  • radius (float) – radius.

  • atom_properties (str or list, optional) – Property of the atoms to retrieve (properties: ligand, flexible_residue, vdw, hb_don, hb_acc, metal, water, reactive, glue).

  • ignore (int or list, optional) – Ignore atom for the search using atom id (0-based). Default is None (no atoms are ignored).

Returns:

ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t).

Return type:

ndarray

Raises:

KeyError – If the atom property is not valid.

closest_atoms(atom_idx, radius, atom_properties=None)[source]#

Retrieve indices of the closest atoms around one or more atoms within a certain radius.

Parameters:
  • atom_idx (int, list) – index of one or multiple atoms (0-based).

  • radius (float) – radius.

  • atom_properties (str or list, optional) – Property of the atoms to retrieve (properties: ligand, flexible_residue, vdw, hb_don, hb_acc, metal, water, reactive, glue).

Returns:

ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t).

Return type:

ndarray
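The radius query can be illustrated with a brute-force search over plain coordinate tuples. This is a sketch only: the class stores KDTree objects per pose, whereas the hypothetical helper `indices_within_radius` below checks every atom with `math.dist` and returns indices rather than atom records.

```python
import math


def indices_within_radius(positions, query, radius, ignore=()):
    """Return 0-based indices of positions within radius of query."""
    ignored = set(ignore)
    return [
        index
        for index, xyz in enumerate(positions)
        if index not in ignored and math.dist(xyz, query) <= radius
    ]


positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0), (5.0, 5.0, 5.0)]
near_origin = indices_within_radius(positions, (0.0, 0.0, 0.0), 2.0)
# Skipping the query atom itself mirrors the idea of the ignore argument.
neighbors_only = indices_within_radius(positions, positions[0], 2.0, ignore=[0])
```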

neighbor_atoms(atom_idx)[source]#

Return neighbor (bonded) atoms of certain atom(s) by their index.

Parameters:

atom_idx (int, list) – index of one or multiple atoms (0-based).

Returns:

list of lists containing the neighbor (bonded) atoms (0-based).

Return type:

list

write_pdbqt_string(as_model=True)[source]#

Write PDBQT output string of the current pose.

Parameters:

as_model (bool, optional) – Add MODEL/ENDMDL keywords to the output PDBQT string. Default is True (add MODEL/ENDMDL keywords).

Returns:

PDBQT string of the current pose.

Return type:

str

write_pdbqt_file(output_pdbqtfilename, overwrite=False, as_model=False)[source]#

Write PDBQT file of the current pose.

Parameters:
  • output_pdbqtfilename (str) – Filename of the output PDBQT file.

  • overwrite (bool, optional) – Overwrite on existing PDBQT file. Default is False (do not overwrite).

  • as_model (bool, optional) – Add MODEL/ENDMDL keywords to the output PDBQT string. Default is False (do not add MODEL/ENDMDL keywords).

Raises:

RuntimeError – If the output PDBQT file already exists and overwrite is False.

Return type:

None
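As a usage sketch for the members documented above, the function below reads a docking output file and collects a few values for the current pose. It assumes Meeko is installed; `summarize_current_pose` is a hypothetical helper name, and only members listed in this class reference are used.

```python
def summarize_current_pose(pdbqt_filename):
    """Return a small summary of the current pose in a docking result.

    Requires Meeko, which is imported lazily inside the function.
    """
    from meeko import PDBQTMolecule

    pdbqt_mol = PDBQTMolecule.from_file(pdbqt_filename)
    return {
        "name": pdbqt_mol.name,
        "pose_id": pdbqt_mol.pose_id,
        "score": pdbqt_mol.score,  # kcal/mol
        "n_atoms": len(pdbqt_mol.positions()),
        "has_water": pdbqt_mol.has_water_molecules(),
    }
```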

class meeko.PDBQTReceptor(pdbqt_string, skip_typing=False)[source]#

Bases: object

skip_types = ('H',)#
classmethod from_pdbqt_filename(pdbqt_filename, skip_typing=False)[source]#
static get_atom_indices_by_residue(atoms)[source]#
Return a dictionary where residues are keys and values are lists of atom indices.

>>> atom_idx_by_res = {("A", "LYS", 417): [0, 1, 2, 3, ..., 8]}
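
The grouping shown in the example above can be reproduced with a short stand-alone function. This is a sketch under stated assumptions: the hypothetical helper `atom_indices_by_residue` takes plain dictionaries with the field names `chainid`, `resname` and `resid` instead of the receptor's atom array.

```python
from collections import defaultdict


def atom_indices_by_residue(atoms):
    """Group atom indices by (chain, resname, resid)."""
    grouped = defaultdict(list)
    for index, atom in enumerate(atoms):
        key = (atom["chainid"], atom["resname"], atom["resid"])
        grouped[key].append(index)
    return dict(grouped)


atoms = [
    {"chainid": "A", "resname": "LYS", "resid": 417},
    {"chainid": "A", "resname": "LYS", "resid": 417},
    {"chainid": "A", "resname": "GLU", "resid": 418},
]
atom_idx_by_res = atom_indices_by_residue(atoms)
```
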
atoms(atom_idx=None)[source]#

Return one or more atoms by index.

Args:

atom_idx (int, list): index of one or multiple atoms

Returns:

ndarray: 2d ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t)

positions(atom_idx=None)[source]#

Return coordinates (xyz) of all atoms or a certain atom

Args:

atom_idx (int, list): index of one or multiple atoms (0-based)

Returns:

ndarray: 2d ndarray of coordinates (xyz)

closest_atoms_from_positions(xyz, radius, atom_properties=None, ignore=None)[source]#

Retrieve indices of the closest atoms around a position/coordinates within a certain radius.

Args:

xyz (np.ndarray): array of 3D coordinates

radius (float): radius

atom_properties (str): property of the atoms to retrieve (properties: ligand, flexible_residue, vdw, hb_don, hb_acc, metal, water, reactive, glue)

ignore (int or list): ignore atom for the search using atom id (0-based)

Returns:

ndarray: 2d ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t)

closest_atoms(atom_idx, radius, atom_properties=None)[source]#

Retrieve indices of the closest atoms around one or more atoms within a certain radius.

Args:

atom_idx (int, list): index of one or multiple atoms (0-based)

radius (float): radius

atom_properties (str or list): property of the atoms to retrieve (properties: ligand, flexible_residue, vdw, hb_don, hb_acc, metal, water, reactive, glue)

Returns:

ndarray: 2d ndarray (atom_id, atom_name, resname, resid, chainid, xyz, q, t)

neighbor_atoms(atom_idx)[source]#

Return neighbor (bonded) atoms

Args:

atom_idx (int, list): index of one or multiple atoms (0-based)

Returns:

list_of_list: list of lists containing the neighbor (bonded) atoms (0-based)

class meeko.Polymer(raw_input_mols, bonds, residue_chem_templates, mk_prep=None, set_template=None, blunt_ends=None, get_atomprop_from_raw=None)[source]#

Bases: BaseJSONParsable

Represents polymer with its subunits as individual RDKit molecules.

Used for proteins and nucleic acids. The key class is Monomer, which contains a padded RDKit molecule that includes part of the adjacent residues to enable chemically meaningful parameterization. Instances of ResidueTemplate make sure that the input, which may originate from a PDB string, matches the RDKit molecule of the template, even if hydrogens are missing.

monomers#

Dictionary to store monomers where keys are residue IDs in the format <chain>:<resnum> such as “A:42” and values are instances of Monomer.

Type:

dict[str, Monomer]

log#

Dictionary to store log messages during monomer processing (_get_monomer).

Type:

dict[str, list[str]]

residue_chem_templates#

An instance of the ResidueChemTemplates class used to initialize this Polymer.

Type:

ResidueChemTemplates

raw_input_mols#

A dictionary of raw input mols used to initialize this Polymer, where keys are residue IDs in the format <chain>:<resnum> such as “A:42” and values are tuples of an RDKit Mol and the input resname.

Type:

dict[str, tuple[Chem.Mol, str]]

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = {'log', 'monomers', 'residue_chem_templates'}#
stitch(residues_to_add=None, bonds_to_use=None)[source]#

Function to stitch together monomers into a single molecule.

Parameters:
  • residues_to_add (set[str], optional) – A set of residue IDs to add to the stitched molecule. If None, all valid monomers will be added. Default is None (all valid monomers).

  • bonds_to_use (dict[tuple[str], list[tuple[int]]], optional) – A dictionary of bonds to use for stitching. The keys are tuples of residue IDs, and the values are lists of tuples of atom indices (in rdkit_mol). If None, all available bonds in the polymer will be used. Default is None (all available bonds).

Returns:

An RDKit molecule that results from adding bonds between the specified residues. It may contain multiple fragments if there are multiple chains or gaps.

Return type:

Chem.Mol

classmethod from_pdb_string(pdb_string, chem_templates, mk_prep, set_template=None, residues_to_delete=None, allow_bad_res=False, bonds_to_delete=None, blunt_ends=None, wanted_altloc=None, default_altloc=None)[source]#

Construct a Polymer object from a PDB string.

Parameters:
  • pdb_string (str) – The PDB string containing the polymer structure.

  • chem_templates (ResidueChemTemplates) – An instance of the ResidueChemTemplates class to construct the polymer.

  • mk_prep (MoleculePreparation) – An instance of the MoleculePreparation class to construct the polymer.

  • set_template (dict[str, str], optional) – A dictionary mapping residue IDs in the format <chain>:<resnum> such as “A:42” to the user-specified ResidueTemplate names. If None, no specific templates will be set. Default is None (the built-in ResidueTemplate ambiguous name mapping will be used).

  • residues_to_delete (set[str], optional) – A set of residue IDs to delete from the polymer. If None, no residues will be deleted. Default is None (no residues will be deleted).

  • allow_bad_res (bool, optional) – If True, allows residues that do not match templates to be ignored (rdkit_mol will be None). If False, raises an error if any residues do not match templates. Default is False.

  • bonds_to_delete (list[tuple[str, str]], optional) – A list of tuples of residue IDs to delete bonds between. If None, no bonds will be deleted. Default is None (no bonds will be deleted).

  • blunt_ends (list[tuple[str, int]], optional) – A list of tuples where each tuple is a residue ID and a 0-based atom index (in raw_mol). If None, no blunt ends will be added. Default is None (no blunt ends will be added).

  • wanted_altloc (dict[str, str], optional) – A dictionary mapping residue IDs in the format <chain>:<resnum> such as “A:42” to the desired alternate location (altloc) for that residue.

  • default_altloc (str, optional) – A string representing the default alternate location (altloc) to be used for residues that do not have a specific altloc specified.

Returns:

An instance of the Polymer class constructed from the PDB string.

Return type:

Polymer

Raises:
  • NotImplementedError – If bonds_to_delete includes residue pairs with more than bond2 between them.

  • PolymerCreationError – If there are residues that could not be parsed or matched to templates, and allow_bad_res is False.
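Several of the parameters above are keyed by residue IDs of the form <chain>:<resnum>. The sketch below builds such IDs and assembles the corresponding keyword arguments as plain Python data. The helpers `residue_id` and `split_residue_id` are hypothetical, and the template name "HIE" is only a placeholder, not a name confirmed by this page.

```python
def residue_id(chain, resnum):
    """Format a residue ID as <chain>:<resnum>, for example "A:42"."""
    return f"{chain}:{resnum}"


def split_residue_id(res_id):
    """Split a residue ID into (chain, resnum); both are kept as strings."""
    chain, resnum = res_id.split(":", 1)
    return chain, resnum


polymer_kwargs = {
    "set_template": {residue_id("A", 42): "HIE"},  # placeholder template name
    "residues_to_delete": {residue_id("A", 1)},
    "bonds_to_delete": [(residue_id("A", 10), residue_id("A", 11))],
    "blunt_ends": [(residue_id("A", 2), 0)],  # residue ID and 0-based atom index
}
```

A dictionary like `polymer_kwargs` could then be unpacked into a constructor call alongside the required arguments.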

classmethod from_pqr_string(pqr_string, chem_templates, mk_prep, set_template=None, residues_to_delete=None, allow_bad_res=False, bonds_to_delete=None, blunt_ends=None)[source]#

Construct a Polymer object from a PQR string. Adapted from PDB2PQR.

Parameters:
  • pqr_string (str) – The PQR string containing the polymer structure.

  • chem_templates (ResidueChemTemplates) – An instance of the ResidueChemTemplates class to construct the polymer.

  • mk_prep (MoleculePreparation) – An instance of the MoleculePreparation class to construct the polymer.

  • set_template (dict[str, str], optional) – A dictionary mapping residue IDs in the format <chain>:<resnum> such as “A:42” to the user-specified ResidueTemplate names. If None, no specific templates will be set. Default is None (the built-in ResidueTemplate ambiguous name mapping will be used).

  • residues_to_delete (set[str], optional) – A set of residue IDs to delete from the polymer. If None, no residues will be deleted. Default is None (no residues will be deleted).

  • allow_bad_res (bool, optional) – If True, allows residues that do not match templates to be ignored (rdkit_mol will be None). If False, raises an error if any residues do not match templates. Default is False.

  • bonds_to_delete (list[tuple[str, str]], optional) – A list of tuples of residue IDs to delete bonds between. If None, no bonds will be deleted. Default is None (no bonds will be deleted).

  • blunt_ends (list[tuple[str, int]], optional) – A list of tuples where each tuple is a residue ID and a 0-based atom index (in raw_mol). If None, no blunt ends will be added. Default is None (no blunt ends will be added).

Returns:

An instance of the Polymer class constructed from the PQR string.

Return type:

Polymer

Raises:
  • NotImplementedError – If bonds_to_delete includes residue pairs with more than one bond between them.

  • PolymerCreationError – If there are residues that could not be parsed or matched to templates, and allow_bad_res is False.

classmethod from_prody(prody_obj, chem_templates, mk_prep, set_template=None, residues_to_delete=None, allow_bad_res=False, bonds_to_delete=None, blunt_ends=None, wanted_altloc=None, default_altloc=None)[source]#

Construct a Polymer object from a ProDy Selection or AtomGroup object.

Parameters:
  • prody_obj (ProDy.Selection or ProDy.AtomGroup) – The ProDy object to construct the polymer from.

  • chem_templates (ResidueChemTemplates) – An instance of the ResidueChemTemplates class to construct the polymer.

  • mk_prep (MoleculePreparation) – An instance of the MoleculePreparation class to construct the polymer.

  • set_template (dict[str, str], optional) – A dictionary mapping residue IDs in the format <chain>:<resnum> such as “A:42” to the user-specified ResidueTemplate names. If None, no specific templates will be set. Default is None (the built-in ResidueTemplate ambiguous name mapping will be used).

  • residues_to_delete (set[str], optional) – A set of residue IDs to delete from the polymer. If None, no residues will be deleted. Default is None (no residues will be deleted).

  • allow_bad_res (bool, optional) – If True, allows residues that do not match templates to be ignored (rdkit_mol will be None). If False, raises an error if any residues do not match templates. Default is False.

  • bonds_to_delete (list[tuple[str, str]], optional) – A list of tuples of residue IDs to delete bonds between. If None, no bonds will be deleted. Default is None (no bonds will be deleted).

  • blunt_ends (list[tuple[str, int]], optional) – A list of tuples where each tuple is a residue ID and a 0-based atom index (in raw_mol). If None, no blunt ends will be added. Default is None (no blunt ends will be added).

  • wanted_altloc (dict[str, str], optional) – A dictionary mapping residue IDs in the format <chain>:<resnum> such as “A:42” to the desired alternate location (altloc) for that residue.

  • default_altloc (str, optional) – A string representing the default alternate location (altloc) to be used for residues that do not have a specific altloc specified.

Returns:

An instance of the Polymer class constructed from the ProDy object.

Return type:

Polymer

Raises:
  • NotImplementedError – If bonds_to_delete includes residue pairs with more than one bond between them.

  • PolymerCreationError – If there are residues that could not be parsed or matched to templates, and allow_bad_res is False.
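The optional arguments shared by these constructors are all keyed by residue ID, so it can help to assemble and sanity-check them in one place. The following is a minimal sketch in plain Python: the helper names `check_residue_ids` and `build_polymer_kwargs`, the regular expression, and every example value (including the template name "HID") are illustrative assumptions, not part of meeko.

```python
import re

# Assumed pattern for "<chain>:<resnum>" IDs such as "A:42". The optional
# trailing letter (an insertion code) is a guess, not stated in the docs above.
RESIDUE_ID = re.compile(r"^[^:]*:-?\d+[A-Za-z]?$")


def check_residue_ids(ids):
    """Raise ValueError if any ID does not look like <chain>:<resnum>."""
    bad = [rid for rid in ids if not RESIDUE_ID.match(rid)]
    if bad:
        raise ValueError(f"unexpected residue IDs: {bad}")


def build_polymer_kwargs():
    """Assemble the optional keyword arguments (hypothetical helper)."""
    kwargs = {
        "set_template": {"A:42": "HID"},        # residue ID -> template name
        "residues_to_delete": {"A:1"},          # set of residue IDs
        "bonds_to_delete": [("A:10", "A:11")],  # pairs of residue IDs
        "blunt_ends": [("A:2", 0)],             # (residue ID, 0-based atom index)
        "wanted_altloc": {"A:42": "B"},         # residue ID -> altloc
        "default_altloc": "A",
        "allow_bad_res": False,
    }
    check_residue_ids(kwargs["set_template"])
    check_residue_ids(kwargs["residues_to_delete"])
    check_residue_ids([rid for pair in kwargs["bonds_to_delete"] for rid in pair])
    check_residue_ids([rid for rid, _ in kwargs["blunt_ends"]])
    check_residue_ids(kwargs["wanted_altloc"])
    return kwargs


kwargs = build_polymer_kwargs()
# With meeko and ProDy installed, the call would then look like:
# polymer = Polymer.from_prody(prody_obj, chem_templates, mk_prep, **kwargs)
```

Note that from_pqr_string does not accept wanted_altloc or default_altloc, so those two keys would have to be dropped before calling it.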

parameterize(mk_prep, get_atomprop_from_raw=None)[source]#

Parameterize the monomers in the polymer using the provided MoleculePreparation instance.

Parameters:
  • mk_prep (MoleculePreparation) – An instance of the MoleculePreparation class to parameterize the monomers.

  • get_atomprop_from_raw (dict, optional) – A dictionary mapping atom properties to pass from raw_input_mols. The keys are the property names and the values are the default values.

Return type:

None

flexibilize_sidechain(residue_id, mk_prep)[source]#

Set the sidechain of a residue as flexible. The Monomer must have been processed and have valid attributes before calling this method.

Parameters:
  • residue_id (str) – The ID of the residue to be made flexible.

  • mk_prep (MoleculePreparation) – An instance of the MoleculePreparation class to construct the polymer.

Return type:

None

to_pdb(new_positions=None)[source]#

Convert the polymer to a PDB string while (optionally) updating the coordinates of specified monomers.

Parameters:

new_positions (dict[str, dict[int, tuple[float, float, float]]], optional) – A dictionary mapping residue IDs to dictionaries of atom indices and their new coordinates.

Returns:

pdb_string – The PDB string representation of the polymer.

Return type:

str

Raises:

ValueError – If any residue IDs in new_positions are not valid monomers.
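The nested shape of new_positions is easy to get wrong, so a small pre-check can be useful. This is a sketch under assumptions: `check_new_positions` is a hypothetical helper, and the set of valid residue IDs is passed in by hand here, whereas with a real Polymer it would presumably come from the keys of get_valid_monomers().

```python
def check_new_positions(new_positions, valid_residue_ids):
    """Validate a new_positions mapping before handing it to to_pdb (sketch)."""
    unknown = sorted(set(new_positions) - set(valid_residue_ids))
    if unknown:
        raise ValueError(f"not valid monomers: {unknown}")
    for res_id, atoms in new_positions.items():
        for atom_index, xyz in atoms.items():
            # Each entry maps an integer atom index to an (x, y, z) triple.
            if not isinstance(atom_index, int) or len(xyz) != 3:
                raise ValueError(f"bad entry for {res_id}: {atom_index} -> {xyz}")
    return new_positions


# residue ID -> {atom index -> (x, y, z)}
new_positions = {"A:42": {0: (1.0, 2.0, 3.0), 1: (1.5, 2.5, 3.5)}}
checked = check_new_positions(new_positions, valid_residue_ids={"A:41", "A:42"})
# With a real Polymer: pdb_string = polymer.to_pdb(new_positions=checked)
```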

export_static_atom_params()[source]#

Export static atom parameters from the polymer.

Returns:

  • atom_params (dict) – A dictionary containing atom parameters.

  • coords (list) – A list of coordinates for the atoms.

get_ignored_monomers()[source]#

Get monomers that are ignored.

Returns:

A dictionary of ignored monomers. The keys are the residue IDs and the values are the corresponding Monomer objects.

Return type:

dict[str, Monomer]

get_valid_monomers()[source]#

Get monomers that are valid (not ignored).

Returns:

A dictionary of valid monomers. The keys are the residue IDs and the values are the corresponding Monomer objects.

Return type:

dict[str, Monomer]
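The split between valid and ignored monomers can be pictured as a partition of one dictionary. The sketch below assumes that "ignored" means rdkit_mol is None, which is how the allow_bad_res description characterizes unmatched residues. `FakeMonomer` and `split_monomers` are hypothetical stand-ins, not meeko classes or functions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeMonomer:
    """Hypothetical stand-in for meeko.Monomer with only the field used here."""
    rdkit_mol: Optional[Any]


def split_monomers(monomers):
    """Partition residue ID -> monomer into (valid, ignored) dictionaries."""
    valid = {rid: m for rid, m in monomers.items() if m.rdkit_mol is not None}
    ignored = {rid: m for rid, m in monomers.items() if m.rdkit_mol is None}
    return valid, ignored


monomers = {"A:1": FakeMonomer(rdkit_mol=object()), "A:2": FakeMonomer(rdkit_mol=None)}
valid, ignored = split_monomers(monomers)
```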

class meeko.Monomer(raw_input_mol, rdkit_mol, mapidx_to_raw, input_resname=None, template_key=None, atom_names=None)[source]#

Bases: BaseJSONParsable

Individual subunit in a Polymer. Often called residue.

raw_rdkit_mol#

An RDKit Mol that defines element and connectivity within a residue. Bond orders and formal charges may be incorrect, and hydrogens may be missing. This molecule may originate from a PDB string, and it also defines the positions of the atoms.

Type:

Chem.Mol

rdkit_mol#

Copy of the molecule from a ResidueTemplate, with positions from raw_rdkit_mol. All hydrogens are real atoms except for those at connections with adjacent residues.

Type:

Chem.Mol

mapidx_to_raw#

Mapping of atom indices in rdkit_mol to raw_rdkit_mol.

Type:

dict[int, int]

residue_template_key#

The matched residue template key of this Monomer.

Type:

str

input_resname#

The input residue name of this Monomer.

Type:

str

atom_names#

List of atom names in the same order as rdkit_mol.

Type:

list[str]

padded_mol#

Padded molecule with additional atoms around link atoms from adjacent residues.

Type:

Chem.Mol

molsetup#

An RDKitMoleculeSetup associated with this residue.

Type:

RDKitMoleculeSetup

molsetup_mapidx#

Mapping of atom indices in padded_mol to rdkit_mol.

Type:

dict[int, int]

is_flexres_atom#

List indicating whether each atom is a flexible residue atom.

Type:

list[bool]

is_movable#

Indicates whether the residue is movable.

Type:

bool

mapidx_from_raw#

Mapping of atom indices in raw_rdkit_mol to rdkit_mol.

Type:

dict[int, int]
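Going by the two descriptions, mapidx_from_raw reads as the inverse of mapidx_to_raw. A minimal sketch of that relationship follows; `invert_index_map` is a hypothetical helper, and treating the two attributes as exact inverses is an assumption.

```python
def invert_index_map(mapidx_to_raw):
    """Invert an rdkit_mol -> raw_rdkit_mol index map (sketch)."""
    inverse = {}
    for idx, raw_idx in mapidx_to_raw.items():
        if raw_idx in inverse:
            # Two rdkit_mol atoms pointing at one raw atom cannot be inverted.
            raise ValueError(f"raw index {raw_idx} is mapped more than once")
        inverse[raw_idx] = idx
    return inverse


mapidx_to_raw = {0: 2, 1: 0, 2: 1}
mapidx_from_raw = invert_index_map(mapidx_to_raw)
```

Atoms that exist only in rdkit_mol (for example hydrogens missing from the raw molecule) would simply have no entry in either map.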

template#

provides access to link_labels in the template

Type:

ResidueTemplate

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = frozenset({'atom_name', 'input_resname', 'is_flexres_atom', 'is_movable', 'mapidx_from_raw', 'mapidx_to_raw', 'molsetup', 'molsetup_mapidx', 'padded_mol', 'raw_rdkit_mol', 'rdkit_mol', 'residue_template_key'})#
set_atom_names(atom_names_list)[source]#

Set the atom names for the monomer.

Parameters:

atom_names_list (list[str]) – A list of atom names to set for the monomer. The length of this list must match the number of atoms in the RDKit molecule.

Return type:

None

parameterize(mk_prep, residue_id, get_atomprop_from_raw=None)[source]#

Parameterize the monomer using the provided mk_prep object.

Parameters:
  • mk_prep (MoleculePreparation) – A MoleculePreparation object that provides the parameterization method.

  • residue_id (str) – The residue ID to be used for parameterization.

  • get_atomprop_from_raw (dict[str, any], optional) – A dictionary mapping atom property names to default values. If provided, these properties will be set on the atoms in the raw RDKit molecule. The default is None (not passing any properties).

Raises:
  • ValueError – If the atom property names in get_atomprop_from_raw are not strings.

  • NotImplementedError – If the number of molsetups is not equal to 1.

class meeko.ResiduePadder(rxn_smarts, adjacent_res_smarts=None, auto_blunt=False)[source]#

Bases: BaseJSONParsable

A class for padding RDKit molecules of residues with parts from adjacent residues.

rxn#

Reaction SMARTS of a single-reactant, single-product reaction for padding.

Type:

rdChemReactions.ChemicalReaction

adjacent_smartsmol#

SMARTS molecule with mapping numbers to copy atom positions from part of adjacent residue.

Type:

Chem.Mol

adjacent_smartsmol_mapidx#

Mapping for atoms in adjacent_smartsmol, from mapping numbers to atom indices.

Type:

list

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = {'adjacent_res_smarts', 'auto_blunt', 'rxn_smarts'}#
class meeko.ResidueTemplate(smiles, link_labels=None, atom_names=None)[source]#

Bases: BaseJSONParsable

Data and methods to pad rdkit molecules of polymer residues with parts of adjacent residues.

mol#

molecule with the exact atoms that constitute the system. All Hs are explicit, but atoms bonded to adjacent residues miss an H.

Type:

Chem.Mol

link_labels#

Keys are indices of atoms that need padding. Values are strings that identify instances of ResiduePadder.

Type:

dict[int, str]

atom_names#

list of atom names, matching order of atoms in rdkit mol

Type:

list[str]

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = {'atom_name', 'link_labels', 'mol'}#
check(mol, link_labels, atom_names)[source]#

Check the validity of a ResidueTemplate using the rdkit mol.

Parameters:
  • mol (Chem.Mol) – The molecule to check.

  • link_labels (dict[int, str], optional) – Keys are indices of atoms that need padding. Values are strings that identify instances of ResiduePadder.

  • atom_names (list[str], optional) – List of atom names in the same order as the atoms in the given smiles.

Raises:
  • ValueError – If the number of atoms in the molecule does not match the length of atom_names.

  • RuntimeError – If the molecule is not a valid SMILES representation.

match(input_mol)[source]#

Match the input molecule with the template molecule and return the mapping.

Parameters:

input_mol (Chem.Mol) – The input molecule to be matched with the template.

Returns:

A tuple containing two dictionaries: - The first dictionary contains the results of the matching, including the number of found and missing atoms. - The second dictionary contains the mapping between the template and input molecule atom indices.

Return type:

tuple[dict, dict]

Raises:

RuntimeError – If there are repeated values with different keys in the mapping.

class meeko.ResidueChemTemplates(residue_templates, padders, ambiguous)[source]#

Bases: BaseJSONParsable

Holds template data required to initialize Polymer

residue_templates#

instances of ResidueTemplate keyed by their ID (a string)

Type:

dict[str, ResidueTemplate]

padders#

instances of ResiduePadder keyed by a link_label (a string). The link_labels establish the relationship between ResidueTemplates and ResiduePadders, determining which padder is used to pad each atom of a Monomer instance that needs padding.

Type:

dict

ambiguous#

mapping between input residue names (e.g. the three-letter residue name from PDB files) and IDs (strings) of ResidueTemplates

Type:

dict[str, list[str]]

classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = {'ambiguous', 'padders', 'residue_templates'}#
add_dict(data, overwrite=False)[source]#

Add or update data from a dictionary to the current instance.

Parameters:
  • data (dict) – A dictionary containing new data to be added. The dictionary may contain keys within “residue_templates”, “padders”, and “ambiguous”.

  • overwrite (bool, optional) – If True, existing data will be overwritten with new data. If False, new data will be added without overwriting existing data. Default is False.

static lookup_filename(filename, data_path)[source]#

Look for a file in the current directory or in the data_path directory. If the file is not found in either location, raise a ValueError. If the file is found in the data_path directory, return the full path. If the file is found in the current directory, return the filename.

Parameters:
  • filename (str) – The name of the file to look for.

  • data_path (pathlib.Path) – The path to the directory where the file may be located.

Returns:

The full path to the file if found, otherwise raises ValueError.

Return type:

str

Raises:

ValueError – If the file is not found in either the current directory or the data_path directory.
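The lookup rule described above can be sketched with pathlib. The order of the two checks (current directory first, then data_path) is an assumption, and `lookup_filename_sketch` is an illustrative name rather than the library's implementation.

```python
from pathlib import Path


def lookup_filename_sketch(filename, data_path):
    """Return a usable path for filename, or raise ValueError (sketch)."""
    # Assumption: the current directory takes precedence over data_path.
    if Path(filename).exists():
        return str(filename)
    candidate = Path(data_path) / filename
    if candidate.exists():
        return str(candidate)
    raise ValueError(
        f"{filename} not found in the current directory or in {data_path}"
    )
```

Returning the bare filename for a file in the current directory and the full path for one under data_path matches the behavior stated in the description.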

classmethod from_json_file(filename)[source]#

Create an instance of ResidueChemTemplates from a JSON file.

Parameters:

filename (str) – The name of the JSON file to read. The file may contain residue templates, padders, and ambiguous residue names.

Returns:

An instance of ResidueChemTemplates created from the JSON file.

Return type:

ResidueChemTemplates

classmethod create_from_defaults()[source]#

Create an instance of ResidueChemTemplates using default data (residue_chem_templates.json).

Returns:

An instance of ResidueChemTemplates created from the default JSON file.

Return type:

ResidueChemTemplates

add_json_file(filename)[source]#

Add data from a JSON file to the current instance.

Parameters:

filename (str) – The name of the JSON file to read. The file may contain residue templates, padders, and ambiguous residue names.

Return type:

None

meeko.add_rotamers_to_polymer_molsetups(rotamer_states_list, polymer)[source]#

Add rotamer states to the monomers’ molecule setups in a polymer and get the state indices.

Parameters:
  • rotamer_states_list (list[dict[str, list[float]]]) – A list of dictionaries, where each dictionary contains residue IDs as keys and lists of angles as values.

  • polymer (Polymer) – The polymer object to which the rotamer states will be added.

Returns:

state_indices_list – A list of dictionaries, where each dictionary contains residue IDs as keys and the corresponding rotamer state indices as values.

Return type:

list[dict[str, int]]

class meeko.RDKitMolCreate[source]#

Bases: object

Utilities for constructing RDKit molecules from PDBQT docking results.

ambiguous_flexres_choices = {'ARG': ['ARG', 'ARG_mgltools'], 'ASN': ['ASN', 'ASN_mgltools'], 'ASP': ['ASP', 'ASH'], 'CYS': ['CYS', 'CYM'], 'GLN': ['GLN', 'GLN_mgltools'], 'GLU': ['GLU', 'GLH'], 'HIS': ['HIE', 'HID', 'HIP'], 'LYS': ['LYS', 'LYN']}#
flexres = {'ARG': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'NE', 'CZ', 'NH1', 'NH2'], 'h_to_parent_index': {'HE': 4, 'HH11': 6, 'HH12': 6, 'HH21': 7, 'HH22': 7}, 'smiles': 'CCCCNC(N)=[NH2+]'}, 'ARG_mgltools': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'NE', 'CZ', 'NH1', 'NH2'], 'h_to_parent_index': {'1HH1': 6, '1HH2': 7, '2HH1': 6, '2HH2': 7, 'HE': 4}, 'smiles': 'CCCCNC(N)=[NH2+]'}, 'ASH': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'OD1', 'OD2'], 'h_to_parent_index': {'HD2': 4}, 'smiles': 'CCC(=O)O'}, 'ASN': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'OD1', 'ND2'], 'h_to_parent_index': {'HD21': 4, 'HD22': 4}, 'smiles': 'CCC(=O)N'}, 'ASN_mgltools': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'OD1', 'ND2'], 'h_to_parent_index': {'1HD2': 4, '2HD2': 4}, 'smiles': 'CCC(=O)N'}, 'ASP': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'OD1', 'OD2'], 'h_to_parent_index': {}, 'smiles': 'CCC(=O)[O-]'}, 'CYM': {'atom_names_in_smiles_order': ['CA', 'CB', 'SG'], 'h_to_parent_index': {}, 'smiles': 'CC[S-]'}, 'CYS': {'atom_names_in_smiles_order': ['CA', 'CB', 'SG'], 'h_to_parent_index': {'HG': 2}, 'smiles': 'CCS'}, 'GLH': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'OE1', 'OE2'], 'h_to_parent_index': {'HE2': 5}, 'smiles': 'CCCC(=O)O'}, 'GLN': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'OE1', 'NE2'], 'h_to_parent_index': {'HE21': 5, 'HE22': 5}, 'smiles': 'CCCC(=O)N'}, 'GLN_mgltools': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'OE1', 'NE2'], 'h_to_parent_index': {'1HE2': 5, '2HE2': 5}, 'smiles': 'CCCC(=O)N'}, 'GLU': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'OE1', 'OE2'], 'h_to_parent_index': {}, 'smiles': 'CCCC(=O)[O-]'}, 'HID': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD2', 'NE2', 'CE1', 'ND1'], 'h_to_parent_index': {'HD1': 6}, 'smiles': 'CCc1cnc[nH]1'}, 'HIE': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD2', 'NE2', 'CE1', 'ND1'], 'h_to_parent_index': {'HE2': 4}, 
'smiles': 'CCc1c[nH]cn1'}, 'HIP': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD2', 'NE2', 'CE1', 'ND1'], 'h_to_parent_index': {'HD1': 6, 'HE2': 4}, 'smiles': 'CCc1c[nH+]c[nH]1'}, 'ILE': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG2', 'CG1', 'CD1'], 'h_to_parent_index': {}, 'smiles': 'CC(C)CC'}, 'LEU': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD1', 'CD2'], 'h_to_parent_index': {}, 'smiles': 'CCC(C)C'}, 'LYN': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'CE', 'NZ'], 'h_to_parent_index': {'HZ2': 5, 'HZ3': 5}, 'smiles': 'CCCCCN'}, 'LYS': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD', 'CE', 'NZ'], 'h_to_parent_index': {'HZ1': 5, 'HZ2': 5, 'HZ3': 5}, 'smiles': 'CCCCC[NH3+]'}, 'MET': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'SD', 'CE'], 'h_to_parent_index': {}, 'smiles': 'CCCSC'}, 'PHE': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD1', 'CE1', 'CZ', 'CE2', 'CD2'], 'h_to_parent_index': {}, 'smiles': 'CCc1ccccc1'}, 'SER': {'atom_names_in_smiles_order': ['CA', 'CB', 'OG'], 'h_to_parent_index': {'HG': 2}, 'smiles': 'CCO'}, 'THR': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG2', 'OG1'], 'h_to_parent_index': {'HG1': 3}, 'smiles': 'CC(C)O'}, 'TRP': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD1', 'NE1', 'CE2', 'CD2', 'CE3', 'CZ3', 'CH2', 'CZ2'], 'h_to_parent_index': {'HE1': 4}, 'smiles': 'CCc1c[nH]c2c1cccc2'}, 'TYR': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG', 'CD1', 'CE1', 'CZ', 'CE2', 'CD2', 'OH'], 'h_to_parent_index': {'HH': 8}, 'smiles': 'CCc1ccc(cc1)O'}, 'VAL': {'atom_names_in_smiles_order': ['CA', 'CB', 'CG1', 'CG2'], 'h_to_parent_index': {}, 'smiles': 'CC(C)C'}}#
classmethod from_pdbqt_mol(pdbqt_mol, only_cluster_leads=False, keep_flexres=False)[source]#

Convert a PDBQT molecule into a list of RDKit molecules.

Parameters:
  • pdbqt_mol (PDBQT) – The PDBQT molecule to convert.

  • only_cluster_leads (bool, optional) – If True, only cluster leads are converted. Default is False.

  • keep_flexres (bool, optional) – If True, flexible residues are kept. Default is False.

Returns:

A list of RDKit molecules. Each molecule corresponds to a pose in the PDBQT molecule.

Return type:

list

Raises:
  • RuntimeError – If no cluster leads are found in the PDBQT molecule and only_cluster_leads is True.

  • ValueError – If some atoms of a ligand are parsed as sidechain but not all.

classmethod guess_flexres_smiles(resname, atom_names)[source]#

Determine a SMILES string for flexres based on atom names, as well as the equivalent of smile_index_map and smiles_h_parent which are written to PDBQT remarks for regular small molecules.

Parameters:
  • resname (str) – The residue name of the flexible residue.

  • atom_names (list of str) – The names of the atoms in the flexible residue.

Returns:

A tuple containing:
  • smiles: SMILES string starting at C-alpha (excludes most of the backbone)

  • index_map: list of pairs of integers, first in pair is index in the smiles, second is index of corresponding atom in atom_names

  • h_parent: list of pairs of integers, first in pair is index of a heavy atom in the smiles, second is index of a hydrogen in atom_names. The hydrogen is bonded to the heavy atom.

Return type:

tuple[str, list, list]
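To make the role of ambiguous_flexres_choices and flexres concrete, here is a sketch of how a histidine tautomer might be picked from atom names. The three entries are copied from the flexres dictionary above. The exact-set matching rule, the use of 1-based tuple pairs, the None return for no match, and the function name `guess_flexres_sketch` are assumptions for illustration only.

```python
HEAVY = ["CA", "CB", "CG", "CD2", "NE2", "CE1", "ND1"]

# Subset copied from RDKitMolCreate.flexres
FLEXRES = {
    "HID": {"smiles": "CCc1cnc[nH]1", "atom_names_in_smiles_order": HEAVY,
            "h_to_parent_index": {"HD1": 6}},
    "HIE": {"smiles": "CCc1c[nH]cn1", "atom_names_in_smiles_order": HEAVY,
            "h_to_parent_index": {"HE2": 4}},
    "HIP": {"smiles": "CCc1c[nH+]c[nH]1", "atom_names_in_smiles_order": HEAVY,
            "h_to_parent_index": {"HD1": 6, "HE2": 4}},
}
# Copied from RDKitMolCreate.ambiguous_flexres_choices
AMBIGUOUS = {"HIS": ["HIE", "HID", "HIP"]}


def guess_flexres_sketch(resname, atom_names):
    """Return (smiles, index_map, h_parent) for the first matching candidate."""
    for name in AMBIGUOUS.get(resname, [resname]):
        entry = FLEXRES.get(name)
        if entry is None:
            continue
        heavy = entry["atom_names_in_smiles_order"]
        hydrogens = entry["h_to_parent_index"]  # H name -> 0-based parent index
        # Assumed rule: the candidate must account for exactly the given atoms.
        if set(atom_names) != set(heavy) | set(hydrogens):
            continue
        index_map = [(i + 1, atom_names.index(n) + 1) for i, n in enumerate(heavy)]
        h_parent = [(p + 1, atom_names.index(h) + 1) for h, p in hydrogens.items()]
        return entry["smiles"], index_map, h_parent
    return None  # sketch choice; the real method may behave differently


result = guess_flexres_sketch("HIS", HEAVY + ["HD1"])
```

Here the presence of HD1 without HE2 rules out HIE and HIP, so the HID entry is selected.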

classmethod add_pose_to_mol(mol, ligand_coordinates, index_map)[source]#

Add given coordinates to given molecule as new conformer.

Parameters:
  • mol (Chem.Mol) – The RDKit molecule to which the coordinates will be added.

  • ligand_coordinates (list[list[float]]) – 2D array of shape (nr_atom, 3).

  • index_map (list[int]) – list of nr_atom pairs of integers, 1-indexed. In each pair, the first int is the index in mol, and the second int is the index in ligand_coordinates (PDBQT).

Returns:

mol – The RDKit molecule with the new conformer added.

Return type:

Chem.Mol

Raises:

RuntimeError – If the number of coordinates provided does not match the number of atoms that require coordinates.
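The index_map convention (1-indexed pairs, molecule index first, PDBQT index second) can be illustrated without RDKit. The sketch below assumes the pairs are stored as a flat list of integers, as the list[int] type suggests, and that the consistency check compares the number of pairs with the number of atoms in the molecule. `coordinates_in_mol_order` is a hypothetical helper.

```python
def coordinates_in_mol_order(n_mol_atoms, ligand_coordinates, index_map):
    """Reorder PDBQT coordinates into molecule atom order (sketch)."""
    if len(index_map) != 2 * n_mol_atoms:
        raise RuntimeError(
            f"expected {n_mol_atoms} index pairs, got {len(index_map) / 2}"
        )
    ordered = [None] * n_mol_atoms
    for k in range(n_mol_atoms):
        mol_idx = index_map[2 * k]        # 1-based index in mol
        pdbqt_idx = index_map[2 * k + 1]  # 1-based index in ligand_coordinates
        ordered[mol_idx - 1] = tuple(ligand_coordinates[pdbqt_idx - 1])
    return ordered


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ordered = coordinates_in_mol_order(3, coords, [1, 3, 2, 1, 3, 2])
```

In the real method, the reordered positions would be written into a new conformer of mol rather than returned as a list.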

static add_hydrogens(mol, coordinates_list, h_parent)[source]#

Add hydrogen atoms to the ligand RDKit mol and adjust the positions of polar hydrogens to match the PDBQT.

Parameters:
  • mol (Chem.Mol) – The RDKit molecule to which the hydrogens will be added.

  • coordinates_list (list[list[float]]) – 2D array of shape (nr_atom, 3).

  • h_parent (list[int]) – list of pairs of integers, 1-indexed. In each pair, the first int is the index in mol, and the second int is the index in coordinates_list

Returns:

mol – The RDKit molecule with the new hydrogens added.

Return type:

Chem.Mol

static combine_rdkit_mols(mol_list)[source]#

Combines list of rdkit molecules into a single one using Chem.CombineMols.

Parameters:

mol_list (list[Chem.Mol]) – List of RDKit molecules to combine.

Returns:

Combined RDKit molecule with all conformers from the input list. If all input molecules are None, returns None.

Return type:

Chem.Mol

classmethod add_sandbox_coordinates(dlgstring, rdmol, index_map, h_parent, groupname=None)[source]#

Parse coordinates from a DLG file and append them to the RDKit mol as conformers. The coordinates are sorted by energy, and the best pose is added first. The function also adds the pose index to the RDKit molecule.

Parameters:
  • dlgstring (str) – The DLG file content as a string.

  • rdmol (Chem.Mol) – The RDKit molecule to which the coordinates will be added.

  • index_map (list[int]) – List of pairs of integers, 1-indexed. In each pair, the first int is the index in mol, and the second int is the index in coordinates_list.

  • h_parent (list[int]) – List of pairs of integers, 1-indexed. In each pair, the first int is the index in mol, and the second int is the index in coordinates_list.

  • groupname (str, optional) – The name of the group to filter the coordinates. If None, all coordinates are used. Default is None.

Returns:

A tuple containing:
  • rdmol: The RDKit molecule with the new conformers added.

  • energy: A dictionary containing the intermolecular and intramolecular energies for each pose, as well as the pose index.

Return type:

tuple[Chem.Mol, dict]

static write_sd_string(pdbqt_mol, only_cluster_leads=False, keep_flexres=False)[source]#

Write a multi-conformer SDF string from a PDBQT molecule.

Parameters:
  • pdbqt_mol (PDBQT) – The PDBQT molecule to convert.

  • only_cluster_leads (bool, optional) – If True, only cluster leads are converted. Default is False.

  • keep_flexres (bool, optional) – If True, flexible residues are kept. Default is False.

Returns:

A tuple containing:
  • output_string: The multi-conformer SDF string.

  • failures: A list of indices of failed conversions.

Return type:

tuple[str, list]

class meeko.PDBQTWriterLegacy[source]#

Bases: object

classmethod write_string_from_polymer(polymer)[source]#
classmethod write_from_polymer(polymer)[source]#
classmethod write_string(setup, add_index_map=False, remove_smiles=False, bad_charge_ok=False)[source]#

Output a PDBQT file as a string.

Parameters:

setup (RDKitMoleculeSetup) – The molecule setup to write.

Returns:

  • str – PDBQT string of the molecule.

  • bool – Success.

  • str – Error message.

classmethod remark_index_map(setup, numbering, order=None, prefix='REMARK INDEX MAP', missing_h=())[source]#

Write mapping of atom indices from input molecule to output PDBQT.

order[ob_index (i.e. ‘key’)] = smiles_index

static break_long_remark_lines(strings, prefix, max_line_length=79)[source]#
static adapt_pdbqt_for_autodock4_flexres(pdbqt_string, res, chain, num, skip_rename_ca_cb=False, atom_count=None)[source]#
Adapt pdbqt_string to be compatible with AutoDock4 requirements:
  • first and second atoms named CA and CB

  • write BEGIN_RES / END_RES

  • remove TORSDOF

This is for covalent docking (tethered).

meeko.get_reactive_config(types_1, types_2, eps12, r12, r13_scaling, r14_scaling, ignore=['HD', 'F'], coeff_vdw=0.1662)[source]#
Parameters:
  • types_1 (list) – 1st set of atom types.

  • types_2 (list) – 2nd set of atom types.

Returns:

  • derivtypes (dict)

  • modpairs (list)

meeko.oids_block_from_setup(molsetup, name='LigandFromMeeko')[source]#
meeko.parse_offxml(offxml_filename)[source]#

Convert OpenFF XML entries to autodockdev dictionaries

class meeko.Hydrate(water_model='tip3p', planar_tol=0.05)[source]#

Bases: object

defaults = [{'IDX': 1, 'geometries': [{'distance': 3.0, 'phi': 0.0}], 'is_donor': False, 'smarts': '[#7X2;v3;!+](=,:[*])[*]', 'z': [2, 3]}, {'IDX': 1, 'geometries': [{'distance': 2.0, 'phi': 0.0}], 'is_donor': True, 'smarts': '[#1][#7,#8,#9]', 'z': [2]}, {'IDX': 1, 'geometries': [{'distance': 3.0, 'phi': 60.0, 'theta': 0}, {'distance': 3.0, 'phi': 60.0, 'theta': 180}], 'is_donor': False, 'smarts': '[#8X1]=[X3][*]', 'x': [3], 'z': [2]}]#
static orient_water(coords, target_xyz, anchor_xyz, anchor_is_donor)[source]#

Coordinates are changed in place. Expects the starting O to be at (0, 0, 0) and an H along the x-axis.

Parameters:
  • coords – Coordinates of the water molecule, changed in place.

  • target_xyz – Where the water oxygen will be.

  • anchor_xyz – The atom to which this molecule belongs.

  • anchor_is_donor – True for H, and False for O, N, S.

class meeko.Restraint(atom_index, target_coords, kcal_per_angstrom_square, delay_angstroms)[source]#

Bases: BaseJSONParsable

A class representing a restraint on an atom in a molecule.

atom_index: int#
target_coords: tuple[float, float, float]#
kcal_per_angstrom_square: float#
delay_angstroms: float#
classmethod json_encoder(obj)[source]#
Return type:

Optional[dict[str, Any]]

expected_json_keys: Optional[frozenset[str]] = {'atom_index', 'delay_angstroms', 'kcal_per_angstrom_square', 'target_coords'}#
copy()[source]#

Creates a copy of the Restraint object.

Returns:

new_restraint – A new Restraint object with the same attributes as the original.

Return type:

Restraint

class meeko.CovalentBuilder(receptor_mol, residue_string)[source]#

Bases: object

Class to perform structural alignments of ligands that have the target side chain attached, in order to run tethered covalent dockings. The class is instantiated for a given target with a list of one or more residues; ligands can then be processed sequentially with CovalentBuilder.process().

rec#

ProDy molecule object for the target residue

Type:

ProDy molecule

residues#

Dictionary with the structure { res_id : (at1_coord, at2_coord), …}

Type:

dict

process(ligand, smarts=None, smarts_indices=None, indices=None, first_only=False)[source]#

Process the ligand for the residue(s) specified for the current receptor.

Parameters:
  • ligand (RDKit molecule) – RDKit molecule object for the ligand to be processed.

  • smarts (str, optional) – SMARTS pattern to search for in the ligand. Default is None.

  • smarts_indices (tuple, optional) – Tuple containing the indices of the atoms in the SMARTS pattern. Default is None.

  • indices (list, optional) – List of tuples containing the indices of the atoms in the ligand. Default is None.

  • first_only (bool, optional) – If True, only the first match of the SMARTS pattern will be used. Default is False.

Yields:

CovLigandPrepared – A named tuple containing the prepared ligand molecule, residue information, atom names, SMARTS pattern, indices, and label.

Raises:
  • ValueError – If neither smarts nor indices are specified.

  • RuntimeWarning – If the SMARTS pattern doesn’t match any atoms in the ligand.

find_smarts(mol, smarts, smarts_indices, first_only)[source]#

Find the ligand atoms that correspond to the requested SMARTS indices in each match of the SMARTS pattern.

Parameters:
  • mol (RDKit molecule) – RDKit molecule object for the ligand to be processed.

  • smarts (str) – SMARTS pattern to search for in the ligand.

  • smarts_indices (tuple) – Tuple containing the indices of the atoms in the SMARTS pattern.

  • first_only (bool) – If True, only the first match of the SMARTS pattern will be used.

Returns:

List of tuples containing the indices of the atoms in the ligand.

Return type:

list

Raises:
  • ValueError – If the SMARTS index exceeds the number of atoms in the SMARTS pattern.

  • RuntimeWarning – If the specified ligand pattern returned more than one match.

transform(ligand, index_pair, coord)[source]#

Generate translated and aligned molecules for each of the indices requested and for all the residues defined in the class constructor. SOURCE: https://sourceforge.net/p/rdkit/mailman/message/36750909/

Parameters:
  • ligand (RDKit molecule) – RDKit molecule object for the ligand to be processed.

  • index_pair (tuple) – Tuple containing the indices of the atoms in the ligand.

  • coord (list) – List of tuples containing the coordinates of the atoms in the target residue.

Returns:

RDKit molecule object for the transformed ligand.

Return type:

RDKit molecule

classmethod parse_residue_string(string, force_CA_CB=True)[source]#

Parse the residue string and return a tuple with the residue information. The string can be in the format “CHAIN:RES:NUM” or “CHAIN:RES:NUM:ATOM1,ATOM2”.

Parameters:
  • string (str) – String specifying residues to process.

  • force_CA_CB (bool, optional) – If True, force the atom names to be CA and CB. Default is True.

Returns:

Tuple containing the residue information (chain, res, num, atname1, atname2).

Return type:

tuple

Raises:
  • ValueError – If the residue string is not in the expected format.

  • RuntimeError – If the atom names are not CA and CB and force_CA_CB is True.
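The two accepted string formats and the two error conditions can be sketched in plain Python. This is not the library's implementation: defaulting the atom names to CA and CB when they are omitted, converting the residue number to int, and the name `parse_residue_string_sketch` are all assumptions.

```python
def parse_residue_string_sketch(string, force_CA_CB=True):
    """Parse "CHAIN:RES:NUM" or "CHAIN:RES:NUM:ATOM1,ATOM2" (sketch)."""
    fields = string.split(":")
    if len(fields) not in (3, 4):
        raise ValueError(f"unexpected residue string: {string!r}")
    chain, res = fields[0], fields[1]
    try:
        num = int(fields[2])  # assumption: residue number is an integer
    except ValueError:
        raise ValueError(f"residue number is not an integer: {fields[2]!r}")
    if len(fields) == 4:
        names = [name.strip() for name in fields[3].split(",")]
        if len(names) != 2:
            raise ValueError(f"expected two atom names, got: {fields[3]!r}")
        atname1, atname2 = names
    else:
        atname1, atname2 = "CA", "CB"  # assumed default when names are omitted
    if force_CA_CB and (atname1, atname2) != ("CA", "CB"):
        raise RuntimeError("atom names must be CA and CB when force_CA_CB is True")
    return chain, res, num, atname1, atname2


default_atoms = parse_residue_string_sketch("A:LYS:204")
custom_atoms = parse_residue_string_sketch("A:CYS:12:CB,SG", force_CA_CB=False)
```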

Subpackages#

Submodules#