7.1. bpforms package

7.1.2. Submodules

7.1.3. bpforms.__main__ module

bpforms command line interface

Author:Jonathan Karr <karr@mssm.edu>
Date:2019-01-31
Copyright:2019, Karr Lab
License:MIT
class bpforms.__main__.App(label=None, **kw)[source]

Bases: cement.core.foundation.App

Command line application

class Meta[source]

Bases: object

base_controller = 'base'[source]
handlers = [<class 'bpforms.__main__.BaseController'>, <class 'bpforms.__main__.ValidateController'>, <class 'bpforms.__main__.GetPropertiesController'>, <class 'bpforms.__main__.ProtonateController'>, <class 'bpforms.__main__.BuildAlphabetsController'>][source]
label = 'bpforms'[source]
class bpforms.__main__.BaseController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Base controller for command line application

class Meta[source]

Bases: object

arguments = [(['-v', '--version'], {'action': 'version', 'version': '0.0.1'})][source]
description = 'bpforms'[source]
label = 'base'[source]
class bpforms.__main__.BuildAlphabetsController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Build DNA, RNA, and protein alphabets from DNAmod, MODOMICS, and RESID

class Meta[source]

Bases: object

arguments = [(['--ph'], {'type': <class 'float'>, 'default': 7.4, 'help': 'pH at which calculate major protonation state of each monomer'}), (['--not-major-tautomer'], {'action': 'store_true', 'default': False, 'help': 'If set, do not calculate the major tautomer'}), (['--max-monomers'], {'type': <class 'float'>, 'default': inf, 'help': 'Maximum number of monomers to build. Used for testing'})][source]
description = 'Build DNA, RNA, and protein alphabets from DNAmod, MODOMICS, and RESID'[source]
label = 'build-alphabets'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class bpforms.__main__.GetPropertiesController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Calculate physical properties such as length, chemical formula, molecular weight, and charge

class Meta[source]

Bases: object

arguments = [(['type'], {'type': <class 'str'>, 'help': 'Type of biopolymer'}), (['structure'], {'type': <class 'str'>, 'help': 'Biopolymer structure'}), (['--ph'], {'default': None, 'type': <class 'float'>, 'help': 'pH at which calculate major protonation state of each monomer'}), (['--major-tautomer'], {'action': 'store_true', 'default': False, 'help': 'If set, calculate the major tautomer'})][source]
description = 'Calculate physical properties such as length, chemical formula, molecular weight, and charge'[source]
label = 'get-properties'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class bpforms.__main__.ProtonateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Calculate the major protonation and tautomerization

class Meta[source]

Bases: object

arguments = [(['type'], {'type': <class 'str'>, 'help': 'Type of biopolymer'}), (['structure'], {'type': <class 'str'>, 'help': 'Biopolymer structure'}), (['ph'], {'type': <class 'float'>, 'help': 'pH'}), (['--major-tautomer'], {'action': 'store_true', 'default': False, 'help': 'If set, calculate the major tautomer'})][source]
description = 'Calculate the major protonation and tautomerization state of a biopolymer form to a specific pH'[source]
label = 'protonate'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class bpforms.__main__.ValidateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Validate a biopolymer form

class Meta[source]

Bases: object

arguments = [(['type'], {'type': <class 'str'>, 'help': 'Type of biopolymer'}), (['structure'], {'type': <class 'str'>, 'help': 'Biopolymer structure'})][source]
description = 'Validate a biopolymer form'[source]
label = 'validate'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
bpforms.__main__.main()[source]
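
The nested controllers above amount to four sub-commands, each with its own argument list. The sketch below rebuilds that layout with the standard-library argparse module rather than cement, purely to make the documented arguments easier to read. build_parser is a hypothetical helper, not part of bpforms, and the values 'dna' and 'ACGT' are placeholders.

```python
import argparse


def build_parser():
    """Hypothetical stand-in that mirrors the documented arguments."""
    parser = argparse.ArgumentParser(prog='bpforms')
    parser.add_argument('-v', '--version', action='version', version='0.0.1')
    commands = parser.add_subparsers(dest='command')

    # validate: two positional arguments
    validate = commands.add_parser('validate')
    validate.add_argument('type', type=str, help='Type of biopolymer')
    validate.add_argument('structure', type=str, help='Biopolymer structure')

    # get-properties: pH is optional here
    props = commands.add_parser('get-properties')
    props.add_argument('type', type=str, help='Type of biopolymer')
    props.add_argument('structure', type=str, help='Biopolymer structure')
    props.add_argument('--ph', type=float, default=None)
    props.add_argument('--major-tautomer', action='store_true', default=False)

    # protonate: pH is a required positional argument
    protonate = commands.add_parser('protonate')
    protonate.add_argument('type', type=str, help='Type of biopolymer')
    protonate.add_argument('structure', type=str, help='Biopolymer structure')
    protonate.add_argument('ph', type=float, help='pH')
    protonate.add_argument('--major-tautomer', action='store_true', default=False)

    # build-alphabets: only optional arguments
    build = commands.add_parser('build-alphabets')
    build.add_argument('--ph', type=float, default=7.4)
    build.add_argument('--not-major-tautomer', action='store_true', default=False)
    build.add_argument('--max-monomers', type=float, default=float('inf'))
    return parser


# Placeholder inputs; the accepted type names are not listed in this section
args = build_parser().parse_args(['protonate', 'dna', 'ACGT', '7.4', '--major-tautomer'])
print(args.command, args.type, args.structure, args.ph, args.major_tautomer)
```

Note the asymmetry the sketch makes visible: get-properties and protonate opt in to tautomer calculation with --major-tautomer, while build-alphabets opts out with --not-major-tautomer.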

7.1.4. bpforms.core module

Classes to represent modified forms of DNA, RNA, and proteins

Author:Jonathan Karr <karr@mssm.edu>
Date:2019-01-31
Copyright:2019, Karr Lab
License:MIT
class bpforms.core.Alphabet(id=None, name=None, description=None, monomers=None)[source]

Bases: object

Alphabet for monomers

id[source]

id

Type:str
name[source]

name

Type:str
description[source]

description

Type:str
monomers[source]

monomers

Type:dict
from_dict(dict)[source]

Create alphabet from a dictionary representation

Parameters:dict (dict) – dictionary representation of alphabet
Returns:alphabet
Return type:Alphabet
from_yaml(path)[source]

Read alphabet from YAML file

Parameters:path (str) – path to YAML file which defines alphabet
Returns:alphabet
Return type:Alphabet
get_monomer_code(monomer)[source]

Get the code for a monomer in the alphabet

Parameters:monomer (Monomer) – monomer
Returns:code for monomer
Return type:str
Raises:ValueError – if monomer is not in alphabet
is_equal(other)[source]

Determine if two alphabets are semantically equal

Parameters:other (Alphabet) – other alphabet
Returns:True, if the alphabets are semantically equal
Return type:bool
monomers[source]

Get the monomers

Returns:monomers
Return type:MonomerDict
protonate(ph, major_tautomer=False)[source]

Calculate the major protonation and tautomerization of each monomer

Parameters:
  • ph (float) – pH
  • major_tautomer (bool, optional) – if True, calculate the major tautomer
to_dict()[source]

Get dictionary representation of alphabet

Returns:dictionary representation of alphabet
Return type:dict
to_yaml(path)[source]

Save alphabet to YAML file

Parameters:path (str) – path to save alphabet in YAML format
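
As a rough illustration of the to_dict/from_dict round trip and the ValueError raised by get_monomer_code, the sketch below uses a simplified, hypothetical SimpleAlphabet class. It stores monomers as plain strings and serializes with the standard-library json module instead of YAML; both are assumptions made to keep the example self-contained, not a description of the bpforms implementation.

```python
import json


class SimpleAlphabet:
    """Hypothetical, simplified stand-in for an alphabet of monomers."""

    def __init__(self, id=None, name=None, description=None, monomers=None):
        self.id = id
        self.name = name
        self.description = description
        self.monomers = dict(monomers or {})  # maps code -> monomer (a str here)

    def get_monomer_code(self, monomer):
        # Reverse lookup: find the code under which the monomer is stored
        for code, candidate in self.monomers.items():
            if candidate == monomer:
                return code
        raise ValueError('Monomer {} is not in alphabet'.format(monomer))

    def to_dict(self):
        return {
            'id': self.id,
            'name': self.name,
            'description': self.description,
            'monomers': dict(self.monomers),
        }

    @classmethod
    def from_dict(cls, data):
        return cls(id=data.get('id'), name=data.get('name'),
                   description=data.get('description'),
                   monomers=data.get('monomers'))


alphabet = SimpleAlphabet(id='toy', name='Toy alphabet',
                          monomers={'A': 'adenosine', 'C': 'cytidine'})

# Serialize to text and rebuild an equivalent alphabet from it
text = json.dumps(alphabet.to_dict())
restored = SimpleAlphabet.from_dict(json.loads(text))
print(restored.get_monomer_code('cytidine'))  # C
```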
class bpforms.core.AlphabetBuilder(_max_monomers=inf)[source]

Bases: abc.ABC

Builder for alphabets

_max_monomers[source]

maximum number of monomers to build; used to limit length of tests

Type:float
build()[source]

Build alphabet

Returns:alphabet
Return type:Alphabet
run(ph=None, major_tautomer=False, path=None)[source]

Build alphabet and, optionally, save to YAML file

Parameters:
  • ph (float, optional) – pH at which to calculate the major protonation state of each monomer
  • major_tautomer (bool, optional) – if True, calculate the major tautomer
  • path (str, optional) – path to save alphabet
Returns:alphabet
Return type:Alphabet
save(alphabet, path)[source]

Save alphabet to YAML file

Parameters:
  • alphabet (Alphabet) – alphabet
  • path (str) – path to save alphabet
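
The division of labour between build(), run(), and save() can be sketched with an abstract base class. Everything below is hypothetical: SketchBuilder and ToyDnaBuilder are illustrative names, the alphabet is a plain dict, the protonation step is reduced to recording the pH, and JSON stands in for YAML.

```python
import abc
import json


class SketchBuilder(abc.ABC):
    """Hypothetical builder; not the bpforms implementation."""

    def __init__(self, _max_monomers=float('inf')):
        self._max_monomers = _max_monomers

    def run(self, ph=None, major_tautomer=False, path=None):
        alphabet = self.build()
        if ph is not None:
            # Placeholder for the protonation step: only record the settings
            alphabet['ph'] = ph
            alphabet['major_tautomer'] = major_tautomer
        if path is not None:
            self.save(alphabet, path)
        return alphabet

    @abc.abstractmethod
    def build(self):
        """Subclasses return the alphabet they construct."""

    def save(self, alphabet, path):
        with open(path, 'w') as file:
            json.dump(alphabet, file)


class ToyDnaBuilder(SketchBuilder):
    def build(self):
        codes = ['A', 'C', 'G', 'T']
        # Keep at most _max_monomers entries; the default of inf keeps all
        kept = [code for i, code in enumerate(codes) if i < self._max_monomers]
        return {'id': 'toy-dna', 'monomers': kept}


print(ToyDnaBuilder(_max_monomers=2).run(ph=7.4))
```

Using a float for the limit lets infinity serve as the "no limit" default, which matches the documented type of _max_monomers.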
class bpforms.core.Atom(element, position=None, charge=0)[source]

Bases: object

An atom in a compound or bond

element[source]

code for the element (e.g. ‘H’)

Type:str
position[source]

IUPAC position of the atom within the compound

Type:int
charge[source]

charge of the atom

Type:int
charge[source]

Get the charge

Returns:charge
Return type:int
element[source]

Get the element

Returns:element
Return type:str
is_equal(other)[source]

Determine if two atoms are semantically equal

Parameters:other (Atom) – other atom
Returns:True if the atoms are semantically equal
Return type:bool
position[source]

Get the position

Returns:position
Return type:int
class bpforms.core.AtomList(atoms=None)[source]

Bases: list

List of atoms

__setitem__(slice, atom)[source]

Set atom(s) at slice

Parameters:
  • slice (int or slice) – position(s) to set atom
  • atom (Atom or AtomList) – atom or atoms
append(atom)[source]

Add an atom

Parameters:atom (Atom) – atom
Raises:ValueError – if the atom is not an instance of Atom
extend(atoms)[source]

Add a list of atoms

Parameters:atoms (iterable of Atom) – iterable of atoms
insert(i, atom)[source]

Insert an atom at a position

Parameters:
  • i (int) – position to insert atom
  • atom (Atom) – atom
is_equal(other)[source]

Determine if two lists of atoms are semantically equal

Parameters:other (AtomList) – other list of atoms
Returns:True, if the lists of atoms are semantically equal
Return type:bool
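
A list subclass that only accepts one element type, as AtomList is documented to do, can be sketched as follows. SketchAtom and TypedAtomList are hypothetical names and the code is a minimal illustration of the pattern under that assumption, not the bpforms source.

```python
class SketchAtom:
    """Hypothetical minimal atom with an element, position, and charge."""

    def __init__(self, element, position=None, charge=0):
        self.element = element
        self.position = position
        self.charge = charge

    def is_equal(self, other):
        return (isinstance(other, SketchAtom)
                and self.element == other.element
                and self.position == other.position
                and self.charge == other.charge)


class TypedAtomList(list):
    """List that rejects anything that is not a SketchAtom."""

    def __init__(self, atoms=None):
        super().__init__()
        if atoms is not None:
            self.extend(atoms)

    @staticmethod
    def _check(atom):
        if not isinstance(atom, SketchAtom):
            raise ValueError('`atom` must be an instance of `SketchAtom`')

    def append(self, atom):
        self._check(atom)
        super().append(atom)

    def extend(self, atoms):
        for atom in atoms:
            self.append(atom)

    def insert(self, i, atom):
        self._check(atom)
        super().insert(i, atom)

    def __setitem__(self, index, atom):
        if isinstance(index, slice):
            atoms = list(atom)
            for item in atoms:
                self._check(item)
            super().__setitem__(index, atoms)
        else:
            self._check(atom)
            super().__setitem__(index, atom)

    def is_equal(self, other):
        # Semantic equality: same length and pairwise-equal atoms, in order
        return (isinstance(other, TypedAtomList)
                and len(self) == len(other)
                and all(a.is_equal(b) for a, b in zip(self, other)))


atoms = TypedAtomList([SketchAtom('H', position=1), SketchAtom('O', charge=-1)])
same = TypedAtomList([SketchAtom('H', position=1), SketchAtom('O', charge=-1)])
print(atoms.is_equal(same))  # True
```

Routing extend() and the constructor through append() keeps the type check in one place, so every way of adding an element is covered.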
class bpforms.core.Backbone(structure=None, backbone_bond_atoms=None, monomer_bond_atoms=None, backbone_displaced_atoms=None, monomer_displaced_atoms=None)[source]

Bases: object

Backbone of a monomer

structure[source]

chemical structure

Type:openbabel.OBMol
backbone_bond_atoms[source]

atoms from backbone that bond to monomer

Type:AtomList
monomer_bond_atoms[source]

atoms from monomer that bond to backbone

Type:AtomList
backbone_displaced_atoms[source]

atoms from backbone displaced by bond to monomer

Type:AtomList
monomer_displaced_atoms[source]

atoms from monomer displaced by bond to backbone

Type:AtomList
backbone_bond_atoms[source]

Get the backbone bond atoms

Returns:backbone bond atoms
Return type:AtomList
backbone_displaced_atoms[source]

Get the backbone displaced atoms

Returns:backbone displaced atoms
Return type:AtomList
get_charge()[source]

Get the charge

Returns:charge
Return type:int
get_formula()[source]

Get the formula

Returns:formula
Return type:EmpiricalFormula
get_inchi()[source]

Get InChI representation of structure

Returns:InChI representation of structure
Return type:str
get_mol_wt()[source]

Get the molecular weight

Returns:molecular weight
Return type:float
is_equal(other)[source]

Determine if two backbones are semantically equal

Parameters:other (Backbone) – other backbone
Returns:True if the backbones are semantically equal
Return type:bool
monomer_bond_atoms[source]

Get the monomer bond atoms

Returns:monomer bond atoms
Return type:AtomList
monomer_displaced_atoms[source]

Get the monomer displaced atoms

Returns:monomer displaced atoms
Return type:AtomList
structure[source]

Get the structure

Returns:structure
Return type:openbabel.OBMol
class bpforms.core.Bond(left_participant=None, right_participant=None, left_bond_atoms=None, right_bond_atoms=None, left_displaced_atoms=None, right_displaced_atoms=None)[source]

Bases: object

Bond between monomers

left_participant[source]

type of left participant (monomer or backbone)

Type:type
right_participant[source]

type of right participant (monomer or backbone)

Type:type
left_bond_atoms[source]

atoms from left monomer that bond with right monomer

Type:AtomList
right_bond_atoms[source]

atoms from right monomer that bond with left monomer

Type:AtomList
left_displaced_atoms[source]

atoms from left monomer displaced by bond

Type:AtomList
right_displaced_atoms[source]

atoms from right monomer displaced by bond

Type:AtomList
get_charge()[source]

Get the charge

Returns:charge
Return type:int
get_formula()[source]

Get the formula

Returns:formula
Return type:EmpiricalFormula
get_mol_wt()[source]

Get the molecular weight

Returns:molecular weight
Return type:float
is_equal(other)[source]

Determine if two bonds are semantically equal

Parameters:other (Bond) – other bond
Returns:True if the bonds are semantically equal
Return type:bool
left_bond_atoms[source]

Get the left bond atoms

Returns:left bond atoms
Return type:AtomList
left_displaced_atoms[source]

Get the left displaced atoms

Returns:left displaced atoms
Return type:AtomList
left_participant[source]

Get type of the left participant

Returns:type of the left participant
Return type:type
right_bond_atoms[source]

Get the right bond atoms

Returns:right bond atoms
Return type:AtomList
right_displaced_atoms[source]

Get the right displaced atoms

Returns:right displaced atoms
Return type:AtomList
right_participant[source]

Get type of the right participant

Returns:type of the right participant
Return type:type
class bpforms.core.BpForm(monomer_seq=None, alphabet=None, backbone=None, bond=None, circular=False)[source]

Bases: object

Biopolymer form

monomer_seq[source]

monomers of the biopolymer

Type:MonomerSequence
alphabet[source]

monomer alphabet

Type:Alphabet
backbone[source]

backbone that connects monomers

Type:Backbone
bond[source]

bonds between (backbones of) monomers

Type:Bond
circular[source]

if True, indicates that the biopolymer is circular

Type:bool
_parser[source]

parser

Type:lark.Lark
DEFAULT_FASTA_CODE = '?'[source]
__contains__(monomer)[source]

Determine if a monomer is in the form

Parameters:monomer (Monomer) – monomer
Returns:True if the monomer is in the sequence
Return type:bool
__delitem__(slice)[source]

Delete monomer(s) at slice

Parameters:slice (int or slice) – position(s)
__getitem__(slice)[source]

Get monomer(s) at slice

Parameters:slice (int or slice) – position(s)
Returns:monomer or monomers
Return type:Monomer or Monomers
__iter__()[source]

Get iterator over monomer sequence

Returns:iterator of monomers
Return type:iterator of Monomer
__len__()[source]

Get the length of the sequence of the form

Returns:length
Return type:int
__reversed__()[source]

Get reverse iterator over monomer sequence

Returns:iterator of monomers
Return type:iterator of Monomer
__setitem__(slice, monomer)[source]

Set monomer(s) at slice

Parameters:
  • slice (int or slice) – position(s)
  • monomer (Monomer or Monomers) – monomer or monomers
__str__()[source]

Get a string representation of the biopolymer form

Returns:string representation of the biopolymer form
Return type:str
alphabet[source]

Get the alphabet

Returns:alphabet
Return type:Alphabet
backbone[source]

Get the backbone

Returns:backbone
Return type:Backbone
bond[source]

Get the bonds

Returns:bonds
Return type:Bond
circular[source]

Get the circularity

Returns:circularity
Return type:bool
file = <_io.TextIOWrapper name='/root/project/bpforms/grammar.lark' mode='r' encoding='UTF-8'>[source]
from_str(str)[source]

Create biopolymer form from its string representation

Parameters:str (str) – string representation of the biopolymer
Returns:biopolymer form
Return type:BpForm
get_charge()[source]

Get the charge

Returns:charge
Return type:int
get_formula()[source]

Get the chemical formula

Returns:chemical formula
Return type:EmpiricalFormula
get_mol_wt()[source]

Get the molecular weight

Returns:molecular weight
Return type:float
get_monomer_counts()[source]

Get the frequency of each monomer within the biopolymer

Returns:dictionary that maps monomers to their counts
Return type:dict
is_equal(other)[source]

Check if two biopolymer forms are semantically equal

Parameters:other (BpForm) – another biopolymer form
Returns:True, if the objects have the same structure
Return type:bool
monomer_seq[source]

Get the monomer sequence

Returns:monomer sequence
Return type:MonomerSequence
protonate(ph, major_tautomer=True)[source]

Update to the major protonation and tautomerization state of each monomer at the pH

Parameters:
  • ph (float) – pH
  • major_tautomer (bool, optional) – if True, calculate the major tautomer
to_fasta()[source]

Get FASTA representation of a monomer with bases represented by the character codes of their parent monomers (e.g. methyl-2-adenosine is represented by ‘A’)

Returns:FASTA representation of a monomer
Return type:str
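
The mapping from modified monomers to parent codes can be sketched by combining the documented base_monomers relationship, get_root_monomers(), and DEFAULT_FASTA_CODE. ToyMonomer and the free function to_fasta below are hypothetical. In particular, falling back to '?' when a monomer has no coded root or has roots with different codes is an assumption of this sketch.

```python
DEFAULT_FASTA_CODE = '?'


class ToyMonomer:
    """Hypothetical monomer that only tracks the monomers it derives from."""

    def __init__(self, id, base_monomers=None):
        self.id = id
        self.base_monomers = set(base_monomers or [])

    def get_root_monomers(self):
        # A monomer with no bases is its own root
        if not self.base_monomers:
            return {self}
        roots = set()
        for base in self.base_monomers:
            roots.update(base.get_root_monomers())
        return roots


def to_fasta(monomer_seq, codes):
    """codes maps root monomers to single-character codes."""
    chars = []
    for monomer in monomer_seq:
        root_codes = {codes.get(root, DEFAULT_FASTA_CODE)
                      for root in monomer.get_root_monomers()}
        # Assumption: ambiguous or unknown roots fall back to the default code
        chars.append(root_codes.pop() if len(root_codes) == 1
                     else DEFAULT_FASTA_CODE)
    return ''.join(chars)


adenosine = ToyMonomer('A')
cytidine = ToyMonomer('C')
m2a = ToyMonomer('m2A', base_monomers=[adenosine])  # derived from adenosine
unknown = ToyMonomer('X')                           # no code assigned
hybrid = ToyMonomer('AC', base_monomers=[adenosine, cytidine])
codes = {adenosine: 'A', cytidine: 'C'}

print(to_fasta([adenosine, m2a, cytidine, unknown, hybrid], codes))  # AAC??
```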
class bpforms.core.Identifier(ns, id)[source]

Bases: object

An identifier in a namespace for an external database

ns[source]

namespace

Type:str
id[source]

id in namespace

Type:str
__eq__(other)[source]

Check if two identifiers are semantically equal

Parameters:other (Identifier) – another identifier
Returns:True, if the identifiers are semantically equal
Return type:bool
__hash__()[source]

Generate a hash

Returns:hash
Return type:int
id[source]

Get the id

Returns:id
Return type:str
ns[source]

Get the namespace

Returns:namespace
Return type:str
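
Defining __eq__ and __hash__ together is what lets identifiers be deduplicated in a set such as IdentifierSet. The sketch below shows the general pattern with a hypothetical SketchIdentifier class; the namespace and id strings are placeholders, and hashing the (ns, id) pair is an assumption rather than a statement about the bpforms source.

```python
class SketchIdentifier:
    """Hypothetical identifier: a namespace plus an id within it."""

    def __init__(self, ns, id):
        self.ns = ns
        self.id = id

    def __eq__(self, other):
        return (isinstance(other, SketchIdentifier)
                and self.ns == other.ns
                and self.id == other.id)

    def __hash__(self):
        # Equal identifiers must hash equally, so hash the same fields
        # that __eq__ compares
        return hash((self.ns, self.id))


first = SketchIdentifier('example-ns', 'id-1')
duplicate = SketchIdentifier('example-ns', 'id-1')
different = SketchIdentifier('example-ns', 'id-2')

identifiers = {first, duplicate, different}
print(len(identifiers))  # 2
```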
class bpforms.core.IdentifierSet(identifiers=None)[source]

Bases: set

Set of identifiers

add(identifier)[source]

Add an identifier

Parameters:identifier (Identifier) – identifier
Raises:ValueError – if the identifier is not an instance of Identifier
symmetric_difference_update(other)[source]

Remove common elements with other and add elements from other not in self

Parameters:other (IdentifierSet) – other set of identifiers
update(identifiers)[source]

Add a set of identifiers

Parameters:identifiers (iterable of Identifier) – identifiers
class bpforms.core.Monomer(id=None, name=None, synonyms=None, identifiers=None, structure=None, delta_mass=None, delta_charge=None, start_position=None, end_position=None, base_monomers=None, comments=None)[source]

Bases: object

A monomer in a biopolymer

id[source]

id

Type:str
name[source]

name

Type:str
synonyms[source]

synonyms

Type:set
identifiers[source]

identifiers in namespaces for external databases

Type:set
structure[source]

chemical structure

Type:openbabel.OBMol
delta_mass[source]

additional mass (Dalton) relative to structure

Type:float
delta_charge[source]

additional charge relative to structure

Type:int
start_position[source]

uncertainty in the location of the monomer

Type:tuple
end_position[source]

uncertainty in the location of the monomer

Type:tuple
base_monomers[source]

monomers which this monomer is derived from

Type:set
comments[source]

comments

Type:str
IMAGE_URL_PATTERN = 'https://cactus.nci.nih.gov/chemical/structure/{}/image?format=png&bgcolor=transparent&antialiasing=0'[source]
__str__(alphabet=None)[source]

Get a string representation of the monomer

Parameters:alphabet (Alphabet, optional) – alphabet
Returns:string representation of the monomer
Return type:str
base_monomers[source]

Get base monomers

Returns:base monomers
Return type:set of Monomer
comments[source]

Get comments

Returns:comments
Return type:str
delta_charge[source]

Get extra charge

Returns:extra charge
Return type:int
delta_mass[source]

Get extra mass

Returns:extra mass
Return type:float
end_position[source]

Get end position

Returns:end position
Return type:int
from_dict(dict, alphabet=None)[source]

Create a monomer from a dictionary representation

Parameters:
  • dict (dict) – dictionary representation of the monomer
  • alphabet (Alphabet, optional) – alphabet
Returns:monomer
Return type:Monomer
get_charge()[source]

Get the charge

Returns:charge
Return type:int
get_formula()[source]

Get the chemical formula

Returns:chemical formula
Return type:EmpiricalFormula
get_image_url()[source]

Get URL for image of structure

Returns:URL for image of structure
Return type:str
get_inchi()[source]

Get InChI representation of structure

Returns:InChI representation of structure
Return type:str
get_mol_wt()[source]

Get the molecular weight

Returns:molecular weight
Return type:float
get_root_monomers()[source]

Get root monomers

Returns:root monomers
Return type:set of Monomer
id[source]

Get id

Returns:id
Return type:str
identifiers[source]

Get identifiers

Returns:identifiers
Return type:IdentifierSet
is_equal(other)[source]

Check if two monomers are semantically equal

Parameters:other (Monomer) – another monomer
Returns:True, if the objects have the same structure
Return type:bool
name[source]

Get name

Returns:name
Return type:str
protonate(ph, major_tautomer=False)[source]

Update to the major protonation and tautomerization state at the pH

Parameters:
  • ph (float) – pH
  • major_tautomer (bool, optional) – if True, calculate the major tautomer
start_position[source]

Get start position

Returns:start position
Return type:int
structure[source]

Get structure

Returns:structure
Return type:openbabel.OBMol
synonyms[source]

Get synonyms

Returns:synonyms
Return type:SynonymSet
to_dict(alphabet=None)[source]

Get a dictionary representation of the monomer

Parameters:alphabet (Alphabet, optional) – alphabet
Returns:dictionary representation of the monomer
Return type:dict
class bpforms.core.MonomerDict(*args, **kwargs)[source]

Bases: attrdict.dictionary.AttrDict

Dictionary for monomers

__setitem__(chars, monomer)[source]

Set monomer with chars

Parameters:
  • chars (str) – characters for monomer
  • monomer (Monomer) – monomer
class bpforms.core.MonomerSequence(monomers=None)[source]

Bases: list

Sequence of monomers

__setitem__(slice, monomer)[source]

Set monomer(s) at slice

Parameters:
  • slice (int or slice) – position(s) to set monomer
  • monomer (Monomer or list of Monomer) – monomer or monomers
append(monomer)[source]

Add a monomer

Parameters:monomer (Monomer) – monomer
Raises:ValueError – if the monomer is not an instance of Monomer
extend(monomers)[source]

Add a list of monomers

Parameters:monomers (iterable of Monomer) – iterable of monomers
get_monomer_counts()[source]

Get the frequency of each monomer within the sequence

Returns:dictionary that maps monomers to their counts
Return type:dict
insert(i, monomer)[source]

Insert a monomer at a position

Parameters:
  • i (int) – position to insert monomer
  • monomer (Monomer) – monomer
is_equal(other)[source]

Determine if two monomer sequences are semantically equal

Parameters:other (MonomerSequence) – other monomer sequence
Returns:True, if the monomer sequences are semantically equal
Return type:bool
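
The dictionary returned by get_monomer_counts() maps each monomer to the number of times it occurs. A minimal sketch of that tally, assuming monomers are hashable and using plain strings as stand-ins for Monomer objects, is shown below; count_monomers is a hypothetical helper name.

```python
from collections import Counter


def count_monomers(monomer_seq):
    """Hypothetical helper: map each distinct monomer to its frequency."""
    return dict(Counter(monomer_seq))


# Strings stand in for Monomer objects in this sketch
sequence = ['A', 'C', 'A', 'G', 'A']
counts = count_monomers(sequence)
print(counts['A'], counts['C'], counts['G'])  # 3 1 1
```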
class bpforms.core.SynonymSet(synonyms=None)[source]

Bases: set

Set of synonyms

add(synonym)[source]

Add a synonym

Parameters:synonym (str) – synonym
Raises:ValueError – if the synonym is not an instance of str
symmetric_difference_update(other)[source]

Remove common synonyms with other and add synonyms from other not in self

Parameters:other (SynonymSet) – other set of synonyms
update(synonyms)[source]

Add a set of synonyms

Parameters:synonyms (iterable of SynonymSet) – synonyms

7.1.5. bpforms.rest module

REST JSON API

Author:Jonathan Karr <karr@mssm.edu>
Date:2019-02-05
Copyright:2019, Karr Lab
License:MIT
class bpforms.rest.AlpabetResource(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Get alphabets

endpoint = 'alphabet_alpabet_resource'[source]
get(id)[source]

Get an alphabet

mediatypes()[source]
methods = {'GET'}[source]
class bpforms.rest.AlphabetsResource(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Get list of alphabets

endpoint = 'alphabet_alphabets_resource'[source]
get()[source]

Get a list of available alphabets

mediatypes()[source]
methods = {'GET'}[source]
class bpforms.rest.Bpform(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Optionally, calculate the major protonation and tautomerization of a biopolymer form and calculate its properties

endpoint = 'bpform_bpform'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Optionally, calculate the major protonation and tautomerization of a biopolymer form and calculate its properties

class bpforms.rest.PrefixMiddleware(app, prefix='')[source]

Bases: object

7.1.6. bpforms.util module

Utilities for BpForms

Author:Jonathan Karr <karr@mssm.edu>
Date:2019-02-05
Copyright:2019, Karr Lab
License:MIT
bpforms.util.build_alphabets(ph=None, major_tautomer=False, _max_monomers=inf)[source]

Build DNA, RNA, and protein alphabets

Parameters:
  • ph (float, optional) – pH at which to calculate the major protonation state of each monomer
  • major_tautomer (bool, optional) – if True, calculate the major tautomer
  • _max_monomers (float, optional) – maximum number of monomers to build; used for testing
bpforms.util.gen_html_viz_alphabet(alphabet, filename)[source]

Create and save an HTML document with images of the monomers in an alphabet

Parameters:
  • alphabet (Alphabet) – alphabet
  • filename (str) – path to save HTML document with images of monomers
bpforms.util.get_alphabet(alphabet)[source]

Get an alphabet

Parameters:alphabet (str) – alphabet
Returns:alphabet
Return type:core.Alphabet
bpforms.util.get_alphabets()[source]

Get a list of available alphabets

Returns:dictionary which maps the ids of alphabets to alphabets
Return type:dict
bpforms.util.get_form(alphabet)[source]

Get a subclass of BpForm

Parameters:alphabet (str) – alphabet
Returns:subclass of BpForm
Return type:type

7.1.7. Module contents