2.5. Python API

The following tutorial illustrates how to use the BpForms Python API. An interactive version of this tutorial is also available in the whole-cell modeling sandbox.

2.5.1. Importing BpForms

Run this command to import BpForms:

import bpforms

2.5.2. Creating biopolymer forms

Use the BpForms notation and the bpforms.BpForm.from_str method to create an instance of bpforms.BpForm that represents a form of a biopolymer:

dna_form = bpforms.DnaForm().from_str('''ACG[
    id: "dI"
    | structure: "O=C1NC=NC2=C1N=CN2"
    | base-monomer: "A"
    ]AC'''.replace('\n', '').replace(' ', ''))
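The two replace calls above exist only so the notation can be written across several indented lines. The following stdlib-only sketch, which does not import BpForms, shows the single-line string that from_str actually receives. The variable names are illustrative. Note that this approach would also strip any spaces inside quoted attribute values; the example has none.

```python
# Multi-line, indented notation as written in the tutorial.
pretty = '''ACG[
    id: "dI"
    | structure: "O=C1NC=NC2=C1N=CN2"
    | base-monomer: "A"
    ]AC'''

# Drop newlines and spaces to obtain the compact one-line form.
flat = pretty.replace('\n', '').replace(' ', '')

print(flat)
# ACG[id:"dI"|structure:"O=C1NC=NC2=C1N=CN2"|base-monomer:"A"]AC
```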

2.5.3. Getting and setting monomers

Individual monomers and slices of monomers can be read and assigned in the same way as list elements:

dna_form[0]
    => <bpforms.core.Monomer at 0x7fb365341240>

dna_form[1] = bpforms.dna_alphabet.monomers.A

dna_form[1:3]
    => [<bpforms.core.Monomer at 0x7fb365341240>, <bpforms.core.Monomer at 0x7fb365330cf8>]

dna_form[1:3] = bpforms.DnaForm().from_str('TA')
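For readers less familiar with Python slicing, the same indexing pattern can be tried on a plain list. This is only an analogy: the one-character strings below are stand-ins for monomer objects (with "I" standing in for dI), and nothing here calls BpForms.

```python
# Stand-in for the six-monomer form ACG[dI]AC; "I" represents dI.
monomers = list("ACGIAC")

first = monomers[0]          # read one element
monomers[1] = "A"            # replace one element
pair = monomers[1:3]         # read a slice (positions 1 and 2)
monomers[1:3] = list("TA")   # replace a slice with another sequence

print(first, pair, "".join(monomers))
# A ['A', 'G'] ATAIAC
```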

2.5.4. Getting and setting the base of a monomer

Optionally, BpForms can track the monomers from which a modified monomer is derived (e.g. m2A is generated from A). This relationship can be read and assigned through the bpforms.Monomer.base_monomers attribute, which is a set of bpforms.Monomer instances:

di_monomer = dna_form[3]
di_monomer.base_monomers
    => set(<bpforms.core.Monomer at 0x7fb365341240>)

2.5.5. Protonation and tautomerization

Calculate the major protonation and tautomerization state of each monomer in the biopolymer form at a given pH (8 in this example):

dna_form.get_major_micro_species(8., major_tautomer=True)

2.5.6. Calculation of physical properties

Use these commands to calculate the length, formula, molecular weight, and charge of the biopolymer form:

len(dna_form)
    => 6

dna_form.get_formula()
    => AttrDefault(<class 'float'>, False, {'C': 59.0, 'N': 24.0, 'O': 37.0, 'P': 5.0, 'H': 66.0})

dna_form.get_mol_wt()
    => 1858.17680999

dna_form.get_charge()
    => -7
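As a sanity check, the molecular weight shown above can be reproduced by hand from the empirical formula. The sketch below uses only the standard library. The atomic weights are rounded standard values and are an assumption about the table BpForms uses internally, and molecular_weight is an illustrative helper rather than part of the BpForms API.

```python
# Rounded standard atomic weights in g/mol (assumed, not read from BpForms).
ATOMIC_WEIGHTS = {"C": 12.011, "N": 14.007, "O": 15.999, "P": 30.973762, "H": 1.008}

# Element counts taken from the formula output above.
formula = {"C": 59, "N": 24, "O": 37, "P": 5, "H": 66}


def molecular_weight(formula):
    """Sum of (atomic weight x count) over all elements in the formula."""
    return sum(ATOMIC_WEIGHTS[element] * count for element, count in formula.items())


mol_wt = molecular_weight(formula)
print(round(mol_wt, 5))
# 1858.17681
```

The result agrees with the reported value to the precision shown, which suggests that the molecular weight is a simple weighted sum over the formula.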

2.5.7. Generating FASTA sequences for BpForms

The get_fasta method generates FASTA representations of BpForms. Where annotated, this method uses the base_monomers attribute to represent modified monomers using the code for their root (e.g. m2A is represented as “A”). Monomers that don’t have their base annotated are represented as “N” and “X” for nucleic acids and proteins, respectively:

dna_form.get_fasta()
    => ACGAAC
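The fallback rule described above can be illustrated with a small self-contained model. This is a sketch of the stated behavior, not the BpForms implementation: ToyMonomer, root_code, and toy_fasta are hypothetical names, and treating a monomer with several base monomers as unknown is an assumption the text does not confirm.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # eq=False keeps identity hashing, so instances can live in sets
class ToyMonomer:
    code: Optional[str] = None                       # one-letter code of a canonical monomer
    base_monomers: set = field(default_factory=set)  # monomers this one is derived from


def root_code(monomer, unknown):
    """Follow base_monomers up to a canonical monomer and return its code."""
    seen = set()
    while monomer.code is None:
        # Assumption: zero or several bases (or a cycle) means "unknown".
        if len(monomer.base_monomers) != 1 or id(monomer) in seen:
            return unknown
        seen.add(id(monomer))
        monomer = next(iter(monomer.base_monomers))
    return monomer.code


def toy_fasta(monomers, unknown="N"):
    """Join root codes; use unknown="X" for proteins."""
    return "".join(root_code(m, unknown) for m in monomers)


A, C, G = ToyMonomer("A"), ToyMonomer("C"), ToyMonomer("G")
dI = ToyMonomer(base_monomers={A})   # modified monomer annotated with base A
unannotated = ToyMonomer()           # modified monomer with no base annotated

print(toy_fasta([A, C, G, dI, A, C]))   # ACGAAC
print(toy_fasta([A, unannotated, C]))   # ANC
```

With the base annotated, dI collapses to "A" and the sequence matches the output shown above; without an annotation, the position becomes "N".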

2.5.8. Determine if two biopolymers describe the same structure

Use the following command to determine if two instances of BpForm describe the same biopolymer:

dna_form_1 = bpforms.DnaForm().from_str('ACGT')
dna_form_2 = bpforms.DnaForm().from_str('ACGT')
dna_form_3 = bpforms.DnaForm().from_str('GCTC')

dna_form_1.is_equal(dna_form_2)
    => True

dna_form_1.is_equal(dna_form_3)
    => False