An introduction to whole-cell modeling
0.0.1
2.3. Linux
Table of contents
2.3.1. How to build a Linux Mint virtual machine with VirtualBox
2.3.1.1. Instructions
2.3.1.2. Additional tutorials
2.3.2. How to build an Ubuntu Linux image with Docker
2.3.2.1. Required packages
2.3.2.2. Configuring an image
2.3.2.3. Building an image
2.3.2.4. Uploading images to Docker Hub
2.3.2.5. Listing existing images
2.3.2.6. Removing images
2.3.2.7. Running an image
2.3.3. An introduction to Linux Mint
2.3.3.1. Running command line programs
2.3.3.2. Getting help for command line programs
2.3.3.3. Installing, upgrading, and uninstalling software
2.3.3.4. Additional tutorials