3.5. Python API

The following tutorial illustrates how to use the BpForms Python API. An interactive version of this tutorial is also available in the whole-cell modeling sandbox.

3.5.1. Importing BpForms

Run this command to import BpForms:

import bpforms

3.5.2. Creating biopolymer forms

Use the BpForms grammar and the bpforms.BpForm.from_str method to create an instance of bpforms.BpForm that represents a form of a biopolymer:

dna_form = bpforms.DnaForm().from_str('ACG{m2C}AC')

3.5.3. Getting and setting monomeric forms

Individual monomeric forms and slices of monomeric forms can be get and set similar to lists:

dna_form[0]
    => <bpforms.core.Monomer at 0x7fb365341240>

dna_form[1] = bpforms.dna_alphabet.monomers.A

dna_form[1:3]
    => [<bpforms.core.Monomer at 0x7fb365341240>, <bpforms.core.Monomer at 0x7fb365330cf8>]

dna_form[1:3] = bpforms.DnaForm().from_str('TA')

3.5.4. Getting and setting the base of a monomeric form

Optionally, BpForms can track the monomeric forms that are generated from a monomeric form (e.g. m2A is generated from A). This can be get and set using the bpforms.Monomer.base_monomers attribute. This attribute is a set of bpforms.Monomer:

di_monomer = dna_form[3]
di_monomer.base_monomers
    => set(<bpforms.core.Monomer at 0x7fb365341240>)

3.5.5. Protonation and tautomerization

Calculate the major protonation and tautomerization state of each monomeric form in the biopolymer form:

dna_form.get_major_micro_species(8., major_tautomer=True)

3.5.6. Calculation of physical properties

Use these commands to calculate the length, formula, molecular weight, and charge of the biopolymer form:

len(dna_form)
    => 6

dna_form.get_formula()
    => AttrDefault(<class 'float'>, False, {'C': 59.0, 'N': 23.0, 'O': 35.0, 'P': 6.0, 'H': 72.0})

dna_form.get_mol_wt()
    => 1849.193571988

dna_form.get_charge()
    => -7

3.5.7. Generating IUPAC/IUBMB sequences for BpForms

The get_canonical_seq method generates IUPAC/IUBMB representations of BpForms. Where annotated, this method uses the base_monomers attribute to represent modified monomeric forms using the code for their root (e.g. m2A is represented as “A”). Monomeric forms that don’t have their base annotated are represented as “N” and “X” for nucleic acids and proteins, respectively:

dna_form.get_canonical_seq()
    => ATANAC

3.5.8. Determine if two biopolymers describe the same structure

Use the following command to determine if two instances of BpForm describe the same biopolymer:

dna_form_1 = bpforms.DnaForm().from_str('ACGT')
dna_form_2 = bpforms.DnaForm().from_str('ACGT')
dna_form_3 = bpforms.DnaForm().from_str('GCTC')

dna_form_1.is_equal(dna_form_2)
    => True

dna_form_1.is_equal(dna_form_3)
    => False