2.1.1.4. wc_utils.util package¶

2.1.1.4.1. Subpackages¶

2.1.1.4.2. Submodules¶

2.1.1.4.3. wc_utils.util.decorate_default_data_struct module¶

A decorator that solves the problem of default parameter values that become global data structures.

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-10-01
Copyright: 2016-2018, Karr Lab
License: MIT

wc_utils.util.decorate_default_data_struct.default_mutable_params(mutable_args)[source]¶

A function or method decorator that handles mutable optional parameters.

Optional parameters with mutable default values like d and l in “def f( d={}, l=[])” have the awkward behavior that a global mutable data structure is created when the function (or method) is defined, that references to the parameter access this data structure, and that all calls to the function which do not provide the parameter refer to this data structure. This differs from the semantics naive Python programmers expect, which is that calls that don’t provide the parameter initialize it as an empty data structure.

Somewhat surprisingly, the Python Language Reference recommends (https://docs.python.org/3.5/reference/compound_stmts.html#function-definitions) that this behavior be fixed by defining the default value for such optional parameters as None, and setting the parameter to an empty data structure if it is not provided (or is provided as None). However, this is cumbersome, especially if the function contains a large number of such parameters.
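The pitfall and the None idiom described above can be illustrated with a minimal, standalone sketch. The function names append_shared and append_fresh are hypothetical and are not part of wc_utils.

```python
def append_shared(item, items=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits `items` mutates the same object.
    items.append(item)
    return items


def append_fresh(item, items=None):
    # The None idiom: build a new list on each call that omits `items`.
    if items is None:
        items = []
    items.append(item)
    return items


first = append_shared(1)
second = append_shared(2)  # the same list object as `first`
third = append_fresh(1)
fourth = append_fresh(2)   # a new list on each call
```

Here first and second refer to one shared list, while third and fourth are independent lists.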

This decorator transforms optional parameters whose default values are None into mutable data structures of the appropriate type. The parameters must have names whose prefix or suffix indicates their data type (as in so-called Hungarian or rudder notation). The mutable parameters are provided as a list to the decorator. The decorated function uses None as default values for these parameters. Calls to the decorated function replace optional parameters whose value is None with the appropriate empty data structure. For example, consider:

from wc_utils.util.decorate_default_data_struct import default_mutable_params

@default_mutable_params( ['d_dict', 'list_l', 's_set'] )
def test3( a, d_dict=None, list_l=None, s_set=None, l2=[4] ):
    ...

The call:

test3( 1, d_dict={3}, list_l=None, s_set=None, l2=None )

will be transformed into:

test3( 1, d_dict={3}, list_l=[], s_set=set(), l2=None )

where the values of list_l and s_set are local variables.
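One way such a decorator could work is sketched below. This is not the wc_utils implementation: the names default_mutable_params_sketch, _type_from_name and demo are hypothetical, and the sketch assumes that the type name is joined to the rest of the parameter name with an underscore (as in d_dict, list_l and s_set).

```python
import functools
import inspect

# Constructors for the supported empty data structures
_CONSTRUCTORS = {'list': list, 'dict': dict, 'set': set}


def _type_from_name(param):
    """Return the constructor named by the prefix or suffix of `param`, or None."""
    for type_name, constructor in _CONSTRUCTORS.items():
        if param.startswith(type_name + '_') or param.endswith('_' + type_name):
            return constructor
    return None


def default_mutable_params_sketch(mutable_args):
    # Reject names that do not indicate a data type (assumed naming rule)
    for param in mutable_args:
        if _type_from_name(param) is None:
            raise ValueError("'{}' does not indicate a data type".format(param))

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            # Replace each None with a new, call-local empty data structure
            for param in mutable_args:
                if bound.arguments[param] is None:
                    bound.arguments[param] = _type_from_name(param)()
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorator


@default_mutable_params_sketch(['d_dict', 'list_l', 's_set'])
def demo(a, d_dict=None, list_l=None, s_set=None):
    list_l.append(a)
    return d_dict, list_l, s_set
```

Because the empty structures are created inside wrapper, each call to demo receives its own list, dict and set.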

Parameters

mutable_args (list) – list of optional parameters whose default values are mutable data structures

Returns

a decorator that replaces None values of the listed parameters with empty data structures

Return type

function

Raises

ValueError – if an argument to @default_mutable_params does not indicate the type of its aggregate data structure

TODO(Arthur): An alternative way to define default_params_decorator, which would avoid the need to add the type to the name of each parameter and to select parameters for the decorator, would be to copy the target function’s signature as the decorator’s argument, parse the signature with compile(), and then use the resulting AST to determine the optional parameters with default data structures, and their data types.

wc_utils.util.decorate_default_data_struct.none_to_empty(param, value)[source]¶

If value is None, return an empty data structure whose type is indicated by param

Parameters

param (str) – a variable name whose prefix or suffix indicates its data type
value (obj) – a value, which might be None

Returns

value unmodified, or if value is None, an empty data structure whose type is indicated by param

Return type

obj

wc_utils.util.decorate_default_data_struct.typed(param)[source]¶

Indicate whether the param indicates a data type

Parameters

param (str) – a variable name whose prefix or suffix might indicate its data type, which would be one of 'list', 'dict', or 'set'

Returns

True if param indicates a data type

Return type

boolean

2.1.1.4.4. wc_utils.util.dict module¶

dict utils

Author: Jonathan Karr <karr@mssm.edu>
Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-08-25
Copyright: 2016-2018, Karr Lab
License: MIT

class wc_utils.util.dict.DictUtil[source]¶

Bases: object

Dictionary utility methods

static expand_dict(d, separator='.')[source]¶

Expand a dict, converting string or tuple keys into nested keys

Parameters

d (dict) – dictionary to expand
separator (str, optional) – separator for keys that are strings

Returns

a nested dict, with each tuple element used as a key

Return type

dict

Raises

ValueError – if d is not a dict, or d contains a key that’s neither a tuple nor a str, or d contains conflicting keys
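The expansion can be sketched as follows. This is an illustrative reimplementation, not the library code: the name expand_dict_sketch is hypothetical, and the sketch assumes that a "conflicting key" is one whose path collides with a value that was already placed.

```python
def expand_dict_sketch(d, separator='.'):
    if not isinstance(d, dict):
        raise ValueError('d must be a dict')
    expanded = {}
    for key, value in d.items():
        if isinstance(key, str):
            parts = key.split(separator)
        elif isinstance(key, tuple):
            parts = list(key)
        else:
            raise ValueError('key {!r} is neither a tuple nor a str'.format(key))
        node = expanded
        # Walk (and create) the intermediate dicts
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError('conflicting key {!r}'.format(key))
        if parts[-1] in node:
            raise ValueError('conflicting key {!r}'.format(key))
        node[parts[-1]] = value
    return expanded


nested = expand_dict_sketch({'a.b': 1, ('a', 'c'): 2, 'd': 3})
```

In this sketch nested is {'a': {'b': 1, 'c': 2}, 'd': 3}.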

static filtered_dict(d, filter_keys)[source]¶

Create a new dict from d, with keys filtered by filter_keys.

Parameters

d (dict) – dictionary to filter.
filter_keys (list of str) – list of keys to retain.

Returns

a new dict containing the entries in d whose keys are in filter_keys.

Return type

dict

static filtered_iteritems(d, filter_keys)[source]¶

A generator that filters a dict’s items to keys in filter_keys.

Parameters

d (dict) – dictionary to filter.
filter_keys (list of str) – list of keys to retain.

Yields

tuple – (key, value) tuples from d whose keys are in filter_keys.

static flatten_dict(d, root_flat_key=None)[source]¶

Flatten a dict, converting nested keys into tuples

Parameters

d (dict) – dictionary to flatten
root_flat_key (list) – flat key for parent dict

Returns

a single level, flattened dict with tuples for keys

Return type

dict
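Flattening is the inverse operation. The sketch below is a hypothetical reimplementation (flatten_dict_sketch is not a wc_utils name); it assumes that every non-dict value becomes a leaf and that empty nested dicts contribute no entries.

```python
def flatten_dict_sketch(d, root_flat_key=None):
    prefix = tuple(root_flat_key or ())
    flat = {}
    for key, value in d.items():
        flat_key = prefix + (key,)
        if isinstance(value, dict):
            # Recurse, carrying the path to this nested dict
            flat.update(flatten_dict_sketch(value, flat_key))
        else:
            flat[flat_key] = value
    return flat


flat = flatten_dict_sketch({'a': {'b': 1, 'c': {'d': 2}}, 'e': 3})
```

Each key of flat is the tuple of nested keys that leads to the value.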

static nested_get(dict, keys, key_delimiter='.')[source]¶

Get the value of a nested dictionary at the nested key sequence keys

Parameters

dict (dict) – dictionary to retrieve value from
keys (str or list) – list of nested keys to retrieve
key_delimiter (str, optional) – delimiter for keys

Returns

The value of dict from the nested keys list

Return type

object

static nested_in(dict, keys, key_delimiter='.')[source]¶

Determine whether the nested key sequence keys is in the dictionary dict

Parameters

dict (dict) – dictionary to retrieve value from
keys (str or list) – list of nested keys to retrieve
key_delimiter (str, optional) – delimiter for keys

Returns

Whether or not the nested key sequence keys is in the dictionary dict

Return type

bool

static nested_set(dict, keys, value, key_delimiter='.')[source]¶

Set the value of a nested dictionary at the nested key sequence keys

Parameters

dict (dict) – dictionary to set value in
keys (str or list) – list of nested keys to set
value (object) – desired value of dict at key sequence keys
key_delimiter (str, optional) – delimiter for keys

Returns

Modified input dictionary

Return type

object
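The nested accessors can be pictured with the following sketch. The function names are hypothetical, and the sketch assumes that a str value of keys is split on key_delimiter and that nested_set creates missing intermediate dictionaries; the library may behave differently in those respects.

```python
def _split_keys(keys, key_delimiter='.'):
    # Accept either 'a.b.c' or ['a', 'b', 'c']
    return keys.split(key_delimiter) if isinstance(keys, str) else list(keys)


def nested_get_sketch(d, keys, key_delimiter='.'):
    for key in _split_keys(keys, key_delimiter):
        d = d[key]
    return d


def nested_in_sketch(d, keys, key_delimiter='.'):
    for key in _split_keys(keys, key_delimiter):
        if not isinstance(d, dict) or key not in d:
            return False
        d = d[key]
    return True


def nested_set_sketch(d, keys, value, key_delimiter='.'):
    keys = _split_keys(keys, key_delimiter)
    node = d
    for key in keys[:-1]:
        node = node.setdefault(key, {})  # assumed: create missing levels
    node[keys[-1]] = value
    return d


config = {'a': {'b': {'c': 1}}}
nested_set_sketch(config, 'a.b.d', 2)
```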

static set_value(d, target_key, new_value, match_type=True)[source]¶

Set values of target keys in a nested dictionary

Consider every key-value pair in nested dictionary d. If value is not a dict, and key is equal to target_key then replace value with new_value. However, if match_type is set, only replace value if it is an instance of new_value’s type. Caution: set_value() will loop infinitely on self-referential dicts.

Parameters

d (dict) – dictionary to modify
target_key (obj) – key to match
new_value (obj) – replacement value
match_type (bool, optional) – if set, only replace values that are instances of the type of new_value
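The traversal described above can be sketched recursively. The name set_value_sketch is hypothetical, and this is only one reading of the description; like the documented method, it would not terminate on self-referential dicts.

```python
def set_value_sketch(d, target_key, new_value, match_type=True):
    for key, value in d.items():
        if isinstance(value, dict):
            # Descend into nested dictionaries
            set_value_sketch(value, target_key, new_value, match_type)
        elif key == target_key:
            if not match_type or isinstance(value, type(new_value)):
                d[key] = new_value


strict = {'x': 1, 'n': {'x': 'text', 'y': 2, 'm': {'x': 5}}}
set_value_sketch(strict, 'x', 10)

loose = {'x': 1, 'n': {'x': 'text'}}
set_value_sketch(loose, 'x', 10, match_type=False)
```

With match_type set, the str value under 'x' is left alone because it is not an instance of int.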

static to_string_sorted_by_key(d)[source]¶

Provide a string representation of a dictionary sorted by key.

Parameters: d (dict) – dictionary
Returns: string representation of a dictionary sorted by key
Return type: str

2.1.1.4.5. wc_utils.util.enumerate module¶

Enumerations

Author: Jonathan Karr <karr@mssm.edu>
Date: 2016-12-09
Copyright: 2016-2018, Karr Lab
License: MIT

class wc_utils.util.enumerate.CaseInsensitiveEnum[source]¶

Bases: enum.Enum

Enumeration with case-insensitive attribute lookup

class wc_utils.util.enumerate.CaseInsensitiveEnumMeta[source]¶

Bases: enum.EnumMeta

__getattr__(name)[source]¶

Get value by name

Parameters: name (str) – attribute name
Returns: enumeration
Return type: Enum

__getitem__(name)[source]¶

Get value by name

Parameters: name (str) – attribute name
Returns: enumeration
Return type: Enum
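A case-insensitive lookup by name can be sketched with a small metaclass. This is not the wc_utils code: the class names are hypothetical, and the sketch overrides only __getitem__ (subscript lookup), leaving attribute lookup untouched.

```python
import enum


class CaseInsensitiveEnumMetaSketch(enum.EnumMeta):
    def __getitem__(cls, name):
        # Compare member names without regard to case
        lowered = name.lower()
        for member_name, member in cls.__members__.items():
            if member_name.lower() == lowered:
                return member
        raise KeyError(name)


class Color(enum.Enum, metaclass=CaseInsensitiveEnumMetaSketch):
    red = 1
    green = 2


looked_up = Color['RED']
```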

2.1.1.4.6. wc_utils.util.environ module¶

Environment utilities

Author: Jonathan Karr <karr@mssm.edu>
Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-10-24
Copyright: 2016-2018, Karr Lab
License: MIT

class wc_utils.util.environ.ConfigEnvDict[source]¶

Bases: object

CONFIG = 'CONFIG'[source]¶

DOT = '__DOT__'[source]¶

add_config_value(path, value)[source]¶

Add a value to a configuration environment dictionary

Parameters

path (list of str) – configuration path components
value (obj) – the value the path should be given

Returns

the updated configuration environment dictionary

Return type

dict

get_env_dict()[source]¶

Get the configuration environment dictionary

Returns: the configuration environment dictionary
Return type: dict

prep_tmp_conf(path_value_pairs)[source]¶

Create a config environment dictionary

Parameters: path_value_pairs (list) – iterator over path, value pairs; ‘path’ is the hierarchical path to a config value, and ‘value’ is its value
Returns: a config environment dictionary for the path, value pairs
Return type: dict
Raises: ValueError – if a value is not a string

class wc_utils.util.environ.EnvironUtils[source]¶

Bases: object

A context manager that temporarily sets environment variables

static make_temp_environ(**environ)[source]¶

Temporarily set environment variables:

# assume 'NO_SUCH_ENV_VAR' is not set in the environment
assert 'NO_SUCH_ENV_VAR' not in os.environ
with EnvironUtils.make_temp_environ(NO_SUCH_ENV_VAR='test_value'):
    assert os.environ['NO_SUCH_ENV_VAR'] == 'test_value'
assert 'NO_SUCH_ENV_VAR' not in os.environ

When used to modify configuration variables, ConfigManager().get_config must be called after the temporary environment variables are set by make_temp_environ().

From http://stackoverflow.com/questions/2059482/python-temporarily-modify-the-current-processs-environment

Parameters: environ (dict) – dictionary mapping environment variable names to desired temporary values
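A context manager with this behavior can be sketched with contextlib, along the lines of the Stack Overflow answer cited above. The name temp_environ_sketch and the variable NO_SUCH_ENV_VAR_SKETCH are hypothetical; the library's implementation may differ.

```python
import contextlib
import os


@contextlib.contextmanager
def temp_environ_sketch(**environ):
    saved = dict(os.environ)  # snapshot of the current environment
    os.environ.update(environ)
    try:
        yield
    finally:
        # Restore the snapshot even if the body raises
        os.environ.clear()
        os.environ.update(saved)


with temp_environ_sketch(NO_SUCH_ENV_VAR_SKETCH='test_value'):
    inside = os.environ.get('NO_SUCH_ENV_VAR_SKETCH')
after = os.environ.get('NO_SUCH_ENV_VAR_SKETCH')
```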

static temp_config_env(path_value_pairs)[source]¶

Create a temporary environment of configuration values

Parameters: path_value_pairs (list) – iterator over path, value pairs; ‘path’ is the hierarchical path to a config value, and ‘value’ is its value

2.1.1.4.7. wc_utils.util.files module¶

File utils

Author: Jonathan Karr <karr@mssm.edu>
Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2018-05-11
Copyright: 2018, Karr Lab
License: MIT

wc_utils.util.files.copytree_to_existing_destination(src, dst)[source]¶

Copy files from src to dst, overwriting existing files with the same paths and keeping all other existing directories and files

Parameters

src (str) – path to source
dst (str) – path to destination

wc_utils.util.files.normalize_filename(filename, dir=None)[source]¶

Normalize a filename to its fully expanded, real, absolute path

Expand filename by interpreting a user’s home directory, environment variables, and normalizing its path. If filename is not an absolute path and dir is provided then return a full path of filename in dir.

Parameters

filename (str) – a filename
dir (str, optional) – a directory that contains filename

Returns

filename’s fully expanded, absolute path

Return type

str

Raises

ValueError – if neither filename after expansion nor dir are absolute

wc_utils.util.files.normalize_filenames(filenames, absolute_file=None)[source]¶

Normalize filenames relative to directory containing existing file

Parameters

filenames (list of str) – list of filenames
absolute_file (str, optional) – file whose directory contains files in filenames

Returns

absolute paths for files in filenames

Return type

list of str

wc_utils.util.files.remove_silently(filename)[source]¶

Delete file filename if it exists, but report no error if it doesn’t

Parameters: filename (str) – a filename
Raises: Exception – if an error occurs that is not ‘no such file or directory’

2.1.1.4.8. wc_utils.util.git module¶

Git utilities for obtaining repo metadata

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Author: Jonathan Karr <jonrkarr@gmail.com>
Date: 2017-05-24
Copyright: 2017-2019, Karr Lab
License: MIT

class wc_utils.util.git.GitHubRepoForTests(name, organization='KarrLab')[source]¶

Bases: object

Functions for managing test GitHub repos

delete_test_repo()[source]¶

static get_github_api_token()[source]¶

make_test_repo(dirname=None)[source]¶

Create a test GitHub repository

Parameters

dirname (str, optional) – a directory name; if present, clone the repo into it

Returns

if dirname is provided, a GitPython reference to a local clone of the test GitHub repository; otherwise, the URL of the test GitHub repository

Return type

obj

class wc_utils.util.git.RepoMetadataCollectionType[source]¶

Bases: enum.Enum

Type of Git repo being queried for metadata that’s stored in a data file

DATA_REPO = 1[source]¶

SCHEMA_REPO = 2[source]¶

class wc_utils.util.git.RepositoryMetadata(url: str, branch: str, revision: str)[source]¶

Bases: object

Represents metadata about a Git repository

url[source]¶

URL

Type: str

branch[source]¶

branch

Type: str

revision[source]¶

revision

Type: str

wc_utils.util.git.get_repo(path='.', search_parent_directories=True)[source]¶

Get a Git repository given the path to a file it contains

Parameters

path (str) – path to file or directory in a Git repository; if path doesn’t exist or is a file then its directory is used
search_parent_directories (bool, optional) – if True have git.Repo search for the root of the repository among the parent directories of path; otherwise, this method iterates over the parent directories itself

Returns

a GitPython repository

Return type

git.Repo

Raises

ValueError – if path is not a path to a Git repository

wc_utils.util.git.get_repo_metadata(path='.', search_parent_directories=True, repo_type=None, data_file=None)[source]¶

Get metadata about a Git repository

Parameters

path (str) – path to file or directory in a Git repository
search_parent_directories (bool, optional) – if True, have GitPython search for the root of the repository among the parent directories of path
repo_type (RepoMetadataCollectionType, optional) – repo type having metadata collected
data_file (str, optional) – pathname of a data file in the repo; must be provided if repo_type is RepoMetadataCollectionType.DATA_REPO

Returns

repository metadata (RepositoryMetadata) and, if repo_type is provided, a list of str describing changes in the repository that make it unsuitable

Return type

tuple

wc_utils.util.git.repo_suitability(repo, repo_type, data_file=None)[source]¶

Evaluate whether a repo is a suitable source for git metadata

Determine whether repo is in a state that’s suitable for collecting immutable metadata. It cannot be ahead of the remote, because commits must have been pushed to the server so they can be later retrieved. If the repo_type is RepoMetadataCollectionType.SCHEMA_REPO, then there cannot be any differences between the index and the working tree because the schema should be synched with the origin. If the repo_type is RepoMetadataCollectionType.DATA_REPO then the repo can contain changes, but the data file should not depend on them. The caller is responsible for determining this.

Parameters

repo (git.Repo) – a GitPython repository
repo_type (RepoMetadataCollectionType) – repo type having status determined
data_file (str, optional) – pathname of a data file in the repo; must be provided if repo_type is RepoMetadataCollectionType.DATA_REPO

Returns

list of reasons, if any, that the repo is in a state that’s not suitable for collecting metadata; an empty list indicates that the repo can be used to collect metadata

Return type

list of str

Raises

ValueError – if data_file is not a path in a Git repository, or if repo_type is RepoMetadataCollectionType.DATA_REPO and data_file is not provided, or if repo_type is not a RepoMetadataCollectionType

2.1.1.4.9. wc_utils.util.list module¶

List utilities

Author: Jonathan Karr <karr@mssm.edu>
Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-11-30
Copyright: 2016-2018, Karr Lab
License: MIT

wc_utils.util.list.det_count_elements(l)[source]¶

Deterministically count elements in an iterable

Returns the count of each element in l. Costs O(n), where n is the length of l.

Parameters: l (iterable) – an iterable with hashable elements
Returns: a list of pairs, (element, count), for each element in l
Return type: list of tuple
Raises: TypeError –

wc_utils.util.list.det_dedupe(l)[source]¶

Deterministically deduplicate a list

Returns a deduplicated copy of l. That is, returns a new list that contains one instance of each element in l and orders these instances by their first occurrence in l. Costs O(n), where n is the length of l.

Parameters: l (list) – a list with hashable elements
Returns: a deterministically deduplicated copy of l
Return type: list
Raises: TypeError –

wc_utils.util.list.det_find_dupes(l)[source]¶

Deterministically find dupes in an iterable

Returns the duplicates in l. That is, returns a new list that contains one instance of each element that has multiple copies in l and orders these instances by their first occurrence in l. Costs O(n), where n is the length of l.

Parameters: l (list) – a list with hashable elements
Returns: a list of the elements that occur more than once in l, ordered by their first occurrence in l
Return type: list
Raises: TypeError –
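The three deterministic helpers above can be approximated with insertion-ordered dictionaries (Python 3.7 and later). The names below are hypothetical and the library may implement these functions differently; the sketch only shows how first-occurrence order can be preserved at O(n) cost.

```python
from collections import Counter


def det_count_elements_sketch(l):
    # Counter is a dict subclass, so it keeps first-occurrence order
    return list(Counter(l).items())


def det_dedupe_sketch(l):
    return list(dict.fromkeys(l))


def det_find_dupes_sketch(l):
    return [element for element, count in Counter(l).items() if count > 1]


letters = ['b', 'a', 'b', 'c', 'a', 'b']
counts = det_count_elements_sketch(letters)
unique = det_dedupe_sketch(letters)
dupes = det_find_dupes_sketch(letters)
```

All three raise TypeError if an element is unhashable, because the elements are used as dictionary keys.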

wc_utils.util.list.dict_by_class(obj_list)[source]¶

Create a dict keyed by class from a list of objects

Parameters: obj_list (list) –
Returns: mapping from object class to list of objects of that class
Return type: dict

wc_utils.util.list.difference(list_1, list_2)[source]¶

Deterministically find the difference between two lists

Returns the elements in list_1 that are not in list_2. Behaves deterministically, whereas set difference does not. Computational cost is O(max(l1, l2)), where l1 and l2 are len(list_1) and len(list_2), respectively.

Parameters

list_1 (list) – one-dimensional list
list_2 (list) – one-dimensional list

Returns

a set-like difference between list_1 and list_2

Return type

list

Raises

TypeError –
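An order-preserving difference can be sketched as a filter against a set. The name difference_sketch is hypothetical, and the sketch assumes that repeated elements of list_1 are kept rather than deduplicated.

```python
def difference_sketch(list_1, list_2):
    exclude = set(list_2)  # O(len(list_2)) to build, O(1) lookups
    # Keep the elements of list_1, in their original order
    return [element for element in list_1 if element not in exclude]


remaining = difference_sketch([3, 1, 2, 1], [2, 5])
```

The result follows the order of list_1, so repeated runs give the same list.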

wc_utils.util.list.elements_to_str(l)[source]¶

Convert each element in an iterator to a string representation

Parameters: l (list) – an iterator
Returns: a list containing each element of the iterator converted to a string
Return type: list

wc_utils.util.list.get_count_limited_class(classes, class_name, min=1, max=1)[source]¶

Find a class in an iterator over classes, and constrain its count

Parameters

classes (iterator) – an iterator over some classes
class_name (str) – the desired class’ name
min (int) – the fewest instances of a class named class_name allowed
max (int) – the most instances of a class named class_name allowed

Returns

the class in classes whose name (__name__) is class_name; if no instances of class are allowed, and no instances are found in classes, then return None

Return type

type

Raises

ValueError – if min > max, or if classes doesn’t contain between min and max, inclusive, class(es) whose name is class_name, or if classes contains multiple, distinct classes with the name class_name

wc_utils.util.list.is_sorted(lst, le_cmp=None)[source]¶

Check if a list is sorted

Parameters

lst (list) – list to check
le_cmp (function, optional) – less than equals comparison function

Returns: True if the list is sorted
Return type: bool
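A sortedness check of this kind compares each adjacent pair. In the sketch below the name is_sorted_sketch is hypothetical, and le_cmp is assumed to take two elements and return True when the first is less than or equal to the second.

```python
import operator


def is_sorted_sketch(lst, le_cmp=None):
    le_cmp = le_cmp or operator.le
    # Every element must be <= its successor
    return all(le_cmp(a, b) for a, b in zip(lst, lst[1:]))


ascending = is_sorted_sketch([1, 2, 2, 5])
descending = is_sorted_sketch([5, 3, 1], le_cmp=operator.ge)
```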

wc_utils.util.list.transpose(lst)[source]¶

Swaps the first two dimensions of a two (or more) dimensional list

Parameters: lst (list of list) – two-dimensional list
Returns: two-dimensional list
Return type: list of list
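For a rectangular two-dimensional list, the transposition can be sketched with zip. The name transpose_sketch is hypothetical; ragged input would be truncated to the shortest row in this sketch.

```python
def transpose_sketch(lst):
    # zip(*lst) pairs up the i-th elements of every row
    return [list(column) for column in zip(*lst)]


transposed = transpose_sketch([[1, 2, 3], [4, 5, 6]])
```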

2.1.1.4.10. wc_utils.util.misc module¶

Miscellaneous utilities.

Author: Jonathan Karr <karr@mssm.edu>
Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-11-05
Copyright: 2016-2018, Karr Lab
License: MIT

class wc_utils.util.misc.DFSMAcceptor(start_state, accepting_state, transitions)[source]¶

Bases: object

Deterministic finite state machine (DFSM) that accepts sequences which move from the start to the end state

A data-driven finite state machine (finite-state automaton). States and messages can be any hashable type.

start_state[source]¶

a DFSM’s start state

Type: object

accepting_state[source]¶

a DFSM must be in this state to accept a message sequence

Type: object

transitions_dict[source]¶

transitions, a map state -> message -> next state

Type: dict

state[source]¶

a DFSM’s current state

Type: object

ACCEPT = 'accept'[source]¶

FAIL = 'fail'[source]¶

exec_transition(message)[source]¶

Execute one DFSM state transition

Parameters

message (object) – a message that might transition the DFSM to another state

Returns

returns DFSMAcceptor.FAIL if message does not transition the DFSM to another state; otherwise returns None

Return type

object

get_state()[source]¶: Get a DFSM’s state

reset()[source]¶: Reset a DFSM to its start state

run(transition_messages)[source]¶

Execute a sequence of DFSM state transitions

Parameters

transition_messages (iterator of object) – an iterator that provides messages that might transition a DFSM from its start_state to its accepting_state

Returns

returns DFSMAcceptor.FAIL if transition_messages do not transition the DFSM from its start_state to its accepting_state; otherwise returns DFSMAcceptor.ACCEPT

Return type

object
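The acceptor can be pictured with the following sketch. The class DFSMAcceptorSketch is hypothetical; in particular, it assumes that transitions is an iterable of (state, message, next state) triples and that run() starts from the start state, neither of which is confirmed by the documentation above.

```python
class DFSMAcceptorSketch:
    ACCEPT = 'accept'
    FAIL = 'fail'

    def __init__(self, start_state, accepting_state, transitions):
        self.start_state = start_state
        self.accepting_state = accepting_state
        # Build the map state -> message -> next state
        self.transitions_dict = {}
        for state, message, next_state in transitions:
            self.transitions_dict.setdefault(state, {})[message] = next_state
        self.state = start_state

    def reset(self):
        self.state = self.start_state

    def exec_transition(self, message):
        try:
            self.state = self.transitions_dict[self.state][message]
        except KeyError:
            return self.FAIL
        return None

    def run(self, transition_messages):
        self.reset()
        for message in transition_messages:
            if self.exec_transition(message) == self.FAIL:
                return self.FAIL
        return self.ACCEPT if self.state == self.accepting_state else self.FAIL


acceptor = DFSMAcceptorSketch('s0', 's2', [('s0', 'a', 's1'), ('s1', 'b', 's2')])
accepted = acceptor.run(['a', 'b'])
```

A sequence is accepted only if every message has a transition and the final state is the accepting state.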

class wc_utils.util.misc.EnhancedDataClass[source]¶

Bases: object

A class that enhances dataclasses

LIKELY_INITIAL_VOWEL_SOUNDS[source]¶

initial letters of words that will be preceded by ‘an’

Type: set of str

DO_NOT_PICKLE[source]¶

fields in a dataclass that cannot be pickled

Type: set of str

DO_NOT_PICKLE = {}[source]

LIKELY_INITIAL_VOWEL_SOUNDS = {'a', 'e', 'i', 'o', 'u'}[source]

__setattr__(name, value)[source]¶: Validate a dataclass attribute when it is changed

static get_pathname(dirname)[source]¶

Get the pathname for a pickled EnhancedDataClass object stored in directory dirname

Subclasses of EnhancedDataClass that read or write files must override this method.

Parameters: dirname (str) – directory for holding the dataclass
Returns: pathname for the EnhancedDataClass
Return type: str

prepare_to_pickle()[source]¶

Provide a copy of this instance that can be pickled; recursively prepares nested EnhancedDataClass instances

Some objects, such as functions, cannot be pickled. Replace the value of these attributes with None.

Returns: a copy of self that can be pickled
Return type: EnhancedDataClass

classmethod read_dataclass(dirname)[source]¶

Read an EnhancedDataClass object from the directory dirname

Parameters: dirname (str) – directory for holding the dataclass
Returns: an EnhancedDataClass object
Return type: EnhancedDataClass

semantically_equal(other)[source]¶

Evaluate whether two instances of an EnhancedDataClass subclass are semantically equal

Defaults to self == other if not overridden. Otherwise, should return True if self and other are semantically equal, and False otherwise. By default, dataclasses are created with __eq__ methods that compare all attributes.

Parameters: other (Object) – other object
Returns: True if other is semantically equal to self, False otherwise
Return type: bool

validate_dataclass_type(attr_name)[source]¶

Validate the type of an attribute in a dataclass instance

Parameters

attr_name (str) – the name of the attribute to validate

Returns

if no error is found

Return type

None

Raises

ValueError – if attr_name is not the name of a field
TypeError – if attribute attr_name does not have the right type

validate_dataclass_types()[source]¶

Validate the types of all attributes in a dataclass instance

Returns: if no error is found
Return type: None
Raises: TypeError – if an attribute does not have the right type

classmethod write_dataclass(dataclass, dirname)[source]¶

Save an EnhancedDataClass object to the directory dirname

Parameters

dataclass (EnhancedDataClass) – an EnhancedDataClass instance
dirname (str) – directory for holding the dataclass

Raises

ValueError – if a dataclass has already been written to dirname

class wc_utils.util.misc.OrderableNoneType[source]¶

Bases: object

Type that can be used for sorting in Python 3 in place of None

wc_utils.util.misc.as_dict(obj)[source]¶

Provide a dictionary representation of obj

obj must define an attribute called ATTRIBUTES which iterates over the attributes that should be included in the representation.

Recursively computes as_dict() on nested objects that define ATTRIBUTES. Warning: calling as_dict on cyclic networks of objects will cause infinite recursion and stack overflow.

Returns

a representation of obj mapping attribute names to values, nested for nested objects

Return type

dict

Raises

ValueError – obj does not define an attribute called ATTRIBUTES
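The recursion over ATTRIBUTES can be sketched as follows. The function as_dict_sketch and the classes Point and Segment are hypothetical examples, not wc_utils code; as with the documented function, cyclic object graphs would recurse without end.

```python
def as_dict_sketch(obj):
    if not hasattr(obj, 'ATTRIBUTES'):
        raise ValueError('obj does not define ATTRIBUTES')
    result = {}
    for name in obj.ATTRIBUTES:
        value = getattr(obj, name)
        # Nested objects that define ATTRIBUTES become nested dicts
        result[name] = as_dict_sketch(value) if hasattr(value, 'ATTRIBUTES') else value
    return result


class Point:
    ATTRIBUTES = ['x', 'y']

    def __init__(self, x, y):
        self.x, self.y = x, y


class Segment:
    ATTRIBUTES = ['start', 'label']

    def __init__(self, start, label):
        self.start, self.label = start, label


representation = as_dict_sketch(Segment(Point(1, 2), 'edge'))
```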

wc_utils.util.misc.geometric_iterator(min, max, factor)[source]¶

Create a geometric sequence

Generate the sequence min, min*factor, min*factor**2, …, stopping at the first element greater than or equal to max.

Parameters

min (float) – first and smallest element of the geometric sequence
max (float) – largest element of the geometric sequence
factor (float) – multiplicative factor between sequence entries

Returns

the geometric sequence

Return type

iterator of float

Raises

ValueError – if min <= 0, or if max < min, or if factor <= 1
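A generator for such a sequence is sketched below. The name geometric_iterator_sketch is hypothetical, and the stopping rule is one reading of the description: the sketch yields every element that does not exceed max, so max itself is included when the sequence reaches it exactly. The parameter names shadow Python builtins only to mirror the documented signature.

```python
def geometric_iterator_sketch(min, max, factor):
    if min <= 0:
        raise ValueError('min must be positive')
    if max < min:
        raise ValueError('max must be at least min')
    if factor <= 1:
        raise ValueError('factor must be greater than 1')
    value = float(min)
    # Assumed stopping rule: stop once the next element would exceed max
    while value <= max:
        yield value
        value *= factor


powers_of_two = list(geometric_iterator_sketch(1, 8, 2))
powers_of_three = list(geometric_iterator_sketch(1, 10, 3))
```

Floating-point round-off can affect whether the last element is produced for non-integer factors, so a production version might compare with a tolerance.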

wc_utils.util.misc.internet_connected()[source]¶

Determine whether the Internet is connected

Returns: return True if the internet (actually www.google.com) is accessible, False otherwise
Return type: bool

wc_utils.util.misc.isclass(cls, cls_info)[source]¶

Compares a class with classes in cls_info.

Parameters

cls (str) – class
cls_info (class, type, or tuple of classes and types) – class, type, or tuple of classes and types

Returns

True if one of the classes in cls_info is cls.

Return type

bool

wc_utils.util.misc.isclass_by_name(cls_name, cls_info)[source]¶

Compares a class name with the names of the classes in cls_info.

Parameters

cls_name (str) – class name
cls_info (class, type, or tuple of classes and types) – class, type, or tuple of classes and types

Returns

True if one of the classes in cls_info has name cls_name.

Return type

bool

wc_utils.util.misc.most_qual_cls_name(obj)[source]¶

Obtain the most qualified class name available for obj.

Since references to classes cannot be sent in messages that leave an address space, use the most qualified class name available to compare class values across address spaces. Fully qualified class names are available for Python >= 3.3.

Parameters: obj (class) – an object, which may be a class.
Returns: the most qualified class name available for obj.
Return type: str

wc_utils.util.misc.obj_to_str(obj, attrs)[source]¶

Provide a string representation of an object

Parameters

obj (object) – an object
attrs (collections.abc.Iterator) – the names of attributes in obj to represent

Returns

a string

Return type

str

wc_utils.util.misc.quote(s)[source]¶

Enclose a string that contains spaces in single quotes, ‘like this’

Parameters: s (object) – a string
Returns: a string
Return type: str

wc_utils.util.misc.round_direct(value, precision=2)[source]¶

Convert value to rounded string with appended sign indicating the rounding direction.

Append ‘+’ to indicate that value has been rounded down, and ‘-’ to indicate rounding up. For example:

round_direct(3.01, 2) == '3.01'
round_direct(3.01, 1) == '3.0+'
round_direct(2.99, 1) == '3.0-'

This function helps display simulation times that have been slightly increased or decreased to control execution order.

Parameters

value (float) – the value to round.
precision (int) – the precision with which to round value.

Returns

value rounded to precision places, followed by a sign indicating rounding direction.

Return type

str
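The sign convention can be reproduced with a short sketch. The name round_direct_sketch is hypothetical, and the formatting (fixed-point with precision decimal places) is an assumption; it reproduces the three documented examples.

```python
def round_direct_sketch(value, precision=2):
    rounded = round(value, precision)
    text = '{:.{}f}'.format(rounded, precision)
    if rounded < value:
        return text + '+'  # rounded down: the true value is larger
    if rounded > value:
        return text + '-'  # rounded up: the true value is smaller
    return text


unchanged = round_direct_sketch(3.01, 2)
rounded_down = round_direct_sketch(3.01, 1)
rounded_up = round_direct_sketch(2.99, 1)
```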

2.1.1.4.11. wc_utils.util.rand module¶

Random number generator utilities.

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2016-10-07
Copyright: 2016-2018, Karr Lab
License: MIT

exception wc_utils.util.rand.InvalidRandomStateException[source]¶

Bases: Exception

An exception for invalid random states

class wc_utils.util.rand.RandomState[source]¶

Bases: numpy.random.mtrand.RandomState

Enhanced random state with additional random methods for rounding

ltd()[source]¶

Sample a left triangular distribution.

The pdf of ltd is f(x) = 2(1-x) for 0<=x<=1, and 0 elsewhere.

Returns: a sample from a left triangular distribution.
Return type: float

round(x, method='binomial')[source]¶

Stochastically round a floating point value.

Parameters

x (float) – a value to be rounded.
method (str, optional) – the type of rounding to use. The default is ‘binomial’.

Returns

rounded value of x.

Return type

int

Raises

Exception – if method is not one of the valid types: ‘binomial’, ‘midpoint’, ‘poisson’, and ‘quadratic’.

round_binomial(x)[source]¶

Stochastically round a float.

Randomly round a float to one of the two nearest integers. This is achieved by making

P[round x to floor(x)] = f = 1 - (x - floor(x)), and P[round x to ceil(x)] = 1 - f.

This avoids the bias that would arise from always using floor or ceil, especially with small populations. The mean of the rounded values for a set of floats converges to the mean of the floats.

Parameters: x (float) – a value to be rounded.
Returns: rounded value of x.
Return type: int
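The rounding rule can be sketched with the standard library. The function round_binomial_sketch is hypothetical and uses random.Random rather than the NumPy random state that the class extends; it rounds up with probability equal to the fractional part of x, which is 1 - f in the notation above.

```python
import math
import random


def round_binomial_sketch(x, rng=random):
    lower = math.floor(x)
    fraction = x - lower
    # Round up with probability `fraction`, otherwise round down
    return lower + (1 if rng.random() < fraction else 0)


rng = random.Random(0)  # seeded for reproducibility
samples = [round_binomial_sketch(2.25, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

The sample mean approaches 2.25 as the number of samples grows, which is the unbiasedness property described above.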

round_midpoint(x)[source]¶

Round to the closest integer; if the fractional part of x is 0.5, randomly round up or down.

Round a float to the closest integer. If the fractional part of x is 0.5, randomly round x up or down. This avoids rounding bias if the distribution of x is not uniform. See http://www.clivemaxfield.com/diycalculator/sp-round.shtml#A15

Parameters: x (float) – a value to be rounded
Returns: rounded value of x
Return type: int

round_poisson(x)[source]¶

Stochastically round a floating point value by sampling from a Poisson distribution.

A sample of Poisson(x) is provided, the domain of which is the integers in [0,inf). It is not symmetric about a fractional part of 0.5.

Parameters: x (float) – a value to be rounded.
Returns: rounded value of x.
Return type: int

round_quadratic(x)[source]¶

Stochastically round a float, with a quadratic bias towards the closest integer.

Stochastically round a float. Rounding is non-linearly biased towards the closest integer. This rounding behaves symmetrically about 0.5. Its expected value when rounding a unif(0,1) random variable is 0.5.

Parameters: x (float) – a value to be rounded.
Returns: rounded value of x.
Return type: int

rtd()[source]¶

Sample a right triangular distribution.

The pdf of rtd is f(x) = 2x for 0<=x<=1, and 0 elsewhere.

Returns: a sample from a right triangular distribution.
Return type: float

std()[source]¶

Sample a symmetric triangular distribution.

The pdf of symmetric triangular distribution is

4x for 0<=x<.5, 4(1-x) for .5<=x<=1, and 0 elsewhere.

See https://en.wikipedia.org/wiki/Triangular_distribution.

Returns: a sample from a symmetric triangular distribution.
Return type: float
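The three triangular densities used by ltd(), rtd() and std() can be sampled with elementary transformations of uniform variates. The sketch below is not the library code: the function names are hypothetical and it uses random.Random rather than NumPy. The left and right cases invert the CDFs 1 - (1 - x)^2 and x^2, and the symmetric case averages two uniform samples.

```python
import math
import random


def ltd_sketch(rng=random):
    # pdf 2(1 - x): CDF is 1 - (1 - x)**2, so x = 1 - sqrt(1 - u)
    return 1.0 - math.sqrt(1.0 - rng.random())


def rtd_sketch(rng=random):
    # pdf 2x: CDF is x**2, so x = sqrt(u)
    return math.sqrt(rng.random())


def std_sketch(rng=random):
    # The mean of two uniform(0, 1) samples has the symmetric triangular pdf
    return (rng.random() + rng.random()) / 2.0


rng = random.Random(1)  # seeded for reproducibility
n = 20000
ltd_mean = sum(ltd_sketch(rng) for _ in range(n)) / n
rtd_mean = sum(rtd_sketch(rng) for _ in range(n)) / n
std_mean = sum(std_sketch(rng) for _ in range(n)) / n
```

The theoretical means are 1/3, 2/3 and 1/2, respectively, which the sample means should approximate.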

class wc_utils.util.rand.RandomStateManager[source]¶

Bases: object

Manager for singleton of numpy.random.RandomState

classmethod initialize(seed=None)[source]¶

Constructs the singleton random state, if it doesn’t already exist, and seeds the random state.

Parameters: seed (int) – random number generator seed

classmethod instance()[source]¶

Returns the single random state

Returns: random state
Return type: numpy.random.RandomState

wc_utils.util.rand.validate_random_state(random_state)[source]¶

Validates a random state

Parameters: random_state (obj) – random state
Raises: InvalidRandomStateException – if random_state is not valid

2.1.1.4.12. wc_utils.util.stats module¶

Statistical utilities.

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Author: Jonathan Karr <jonrkarr@mssm.edu>
Date: 2017-05-26
Copyright: 2016-2018, Karr Lab
License: MIT

class wc_utils.util.stats.ExponentialMovingAverage(value, alpha=None, center_of_mass=None)[source]¶

Bases: object

An exponential moving average.

Each moving average S is computed recursively from the sample values Y:

S_1 = Y_1
S_t = alpha * Y_t + (1 - alpha) * S_(t-1)
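
The recursion can be sketched in a few lines. `EmaSketch` is a hypothetical stand-in that takes alpha directly; it ignores the center_of_mass option and is not the wc_utils class itself.

```python
class EmaSketch:
    """Hypothetical illustration of the exponential moving average recursion."""

    def __init__(self, value, alpha):
        self.value = value  # S_1 = Y_1
        self.alpha = alpha  # decay factor

    def add_value(self, new_value):
        # S_t = alpha * Y_t + (1 - alpha) * S_(t-1)
        self.value = self.alpha * new_value + (1 - self.alpha) * self.value
        return self.value


ema = EmaSketch(1.0, alpha=0.5)
history = [ema.add_value(y) for y in (2.0, 3.0)]
# history is [1.5, 2.25]
```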

value[source]¶

the current average

Type: float

alpha[source]¶

the decay factor

Type: float

__eq__(other)[source]¶

Compare two exponential moving averages

Parameters: other (ExponentialMovingAverage) – other exponential moving average
Returns: true if exponential moving averages are equal
Return type: bool

__ne__(other)[source]¶

Compare two exponential moving averages

Parameters: other (ExponentialMovingAverage) – other exponential moving average
Returns: true if exponential moving averages are unequal
Return type: bool

add_value(new_value)[source]¶

Add a sample to this ExponentialMovingAverage, and update the average.

Parameters: new_value (float) – the next value to contribute to the exponential moving average
Returns: the updated exponential moving average
Return type: float

get_ema()[source]¶

Get the current average

Returns: current exponential moving average
Return type: float

wc_utils.util.stats.weighted_mean(values, weights, ignore_nan=True)[source]¶

Calculate weighted mean of a list of values, weighted by weights

Parameters

values (list of float) – values
weights (list of float) – weights
ignore_nan (bool, optional) – if True, ignore nan values

Returns

mean of values, weighted by weights

Return type

float
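
A weighted mean is the sum of value * weight divided by the sum of the weights. The sketch below shows that arithmetic together with one plausible reading of ignore_nan (dropping any pair whose value or weight is nan). `weighted_mean_sketch` is a hypothetical name, and the nan handling is an assumption rather than the documented behavior.

```python
import math


def weighted_mean_sketch(values, weights, ignore_nan=True):
    """Hypothetical sketch of a weighted mean with optional nan filtering."""
    pairs = list(zip(values, weights))
    if ignore_nan:
        # Assumption: a pair is dropped if either member is nan.
        pairs = [(v, w) for v, w in pairs
                 if not (math.isnan(v) or math.isnan(w))]
    total_weight = sum(w for _, w in pairs)
    if not pairs or total_weight == 0:
        return float('nan')
    return sum(v * w for v, w in pairs) / total_weight


mean = weighted_mean_sketch([1, 2, 3], [1, 1, 2])  # (1 + 2 + 6) / 4 = 2.25
```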

wc_utils.util.stats.weighted_median(values, weights, ignore_nan=True)[source]¶

Calculate the median of a list of values, weighted by weights

Parameters

values (list of float) – values
weights (list of float) – weights
ignore_nan (bool, optional) – if True, ignore nan values

Returns

weighted median of values

Return type

float

wc_utils.util.stats.weighted_mode(values, weights, ignore_nan=True)[source]¶

Calculate the mode of a list of values, weighted by weights

Parameters

values (list of float) – values
weights (list of float) – weights
ignore_nan (bool, optional) – if True, ignore nan values

Returns

weighted mode of values

Return type

float
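
One way to read "weighted mode" is the value whose weights sum to the largest total. The sketch below assumes that definition. `weighted_mode_sketch` is a hypothetical name, and the tie-breaking and nan handling are assumptions.

```python
import math
from collections import defaultdict


def weighted_mode_sketch(values, weights, ignore_nan=True):
    """Hypothetical sketch: return the value with the largest total weight."""
    totals = defaultdict(float)
    for v, w in zip(values, weights):
        if math.isnan(v) or math.isnan(w):
            if ignore_nan:
                continue
            return float('nan')
        totals[v] += w
    if not totals:
        return float('nan')
    # Ties resolve to the value that was seen first.
    return max(totals, key=totals.get)


# The value 2 occurs most often, but 1 carries the most weight.
mode = weighted_mode_sketch([1, 2, 2, 3], [5, 1, 1, 3])
```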

wc_utils.util.stats.weighted_percentile(values, weights, percentile, ignore_nan=True)[source]¶

Calculate percentile of a list of values, weighted by weights

Parameters

values (list of float) – values
weights (list of float) – weights
percentile (float) – percentile
ignore_nan (bool, optional) – if True, ignore nan values

Returns

weighted percentile of values

Return type

float
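
Several conventions exist for weighted percentiles. The sketch below uses a simple one: sort by value, accumulate the weights, and return the first value whose cumulative weight reaches percentile/100 of the total. This is an illustration under that assumption, not necessarily the convention wc_utils uses, and `weighted_percentile_sketch` is a hypothetical name. A weighted median is the special case percentile=50.

```python
import math


def weighted_percentile_sketch(values, weights, percentile, ignore_nan=True):
    """Hypothetical sketch of a non-interpolating weighted percentile."""
    pairs = list(zip(values, weights))
    has_nan = any(math.isnan(v) or math.isnan(w) for v, w in pairs)
    if has_nan:
        if not ignore_nan:
            return float('nan')
        pairs = [(v, w) for v, w in pairs
                 if not (math.isnan(v) or math.isnan(w))]
    if not pairs:
        return float('nan')
    pairs.sort()
    threshold = percentile / 100 * sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]


median = weighted_percentile_sketch([1, 2, 3], [1, 1, 2], 50)
```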

2.1.1.4.13. wc_utils.util.string module¶

String utilities.

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Author: Jonathan Karr <jonrkarr@gmail.com>
Date: 2017-03-20
License: MIT

wc_utils.util.string.camel_case_to_snake_case(camel_case)[source]¶

Convert a string from camel case (e.g. SnakeCase) to snake case (e.g. snake_case)

Parameters: camel_case (str) – string in camel case
Returns: string in snake case
Return type: str
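
A common way to do this conversion is with two regular-expression substitutions followed by lowercasing. The sketch below assumes that approach. `camel_to_snake_sketch` is a hypothetical name, and the handling of acronyms and digits may differ from wc_utils.

```python
import re


def camel_to_snake_sketch(camel_case):
    """Hypothetical sketch: insert underscores at word boundaries, then lowercase."""
    # Put an underscore before each capitalized word, e.g. "SnakeCase" -> "Snake_Case".
    partial = re.sub(r'(.)([A-Z][a-z]+)', r'\1_\2', camel_case)
    # Split any remaining lowercase-or-digit to uppercase transitions.
    return re.sub(r'([a-z0-9])([A-Z])', r'\1_\2', partial).lower()


snake = camel_to_snake_sketch('SnakeCase')
```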

wc_utils.util.string.delete_trailing_blanks(l_of_strings)[source]¶

Remove all blank lines from the end of a list of strings

A line is blank if it is empty after applying str.rstrip().

Parameters: l_of_strings (list of str) – a list of strings
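
As a minimal sketch of the idea, the loop below pops entries from the end of the list while the last one is blank. It assumes the list is modified in place, which the absence of a documented return value suggests but does not confirm. `delete_trailing_blanks_sketch` is a hypothetical name.

```python
def delete_trailing_blanks_sketch(l_of_strings):
    """Hypothetical sketch: remove blank strings from the end of a list, in place."""
    while l_of_strings and not l_of_strings[-1].rstrip():
        l_of_strings.pop()


lines = ['a', '', 'b', '  ', '', '\t']
delete_trailing_blanks_sketch(lines)
# Interior blank lines are kept; only the trailing ones are removed.
```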

wc_utils.util.string.find_nth(s, sub, n, start=0, end=inf)[source]¶

Get the index of the nth occurrence of a substring within a string

Parameters

s (str) – string to search
sub (str) – substring to search for
n (int) – number of the occurrence to find the position of
start (int, optional) – starting position to search from
end (int, optional) – end position to search within

Returns

index of the nth occurrence of the substring within the string, or -1 if there are fewer than n occurrences of the substring within the string

Return type

int

Raises

ValueError – if sub is empty or n is less than 1
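
The behavior described above can be sketched by calling str.find repeatedly. `find_nth_sketch` is a hypothetical name. Using None in place of the inf default for end, and counting overlapping occurrences, are both assumptions of this sketch.

```python
def find_nth_sketch(s, sub, n, start=0, end=None):
    """Hypothetical sketch: index of the nth occurrence of sub in s, or -1."""
    if not sub:
        raise ValueError('sub must be non-empty')
    if n < 1:
        raise ValueError('n must be at least 1')
    if end is None:
        end = len(s)
    index = start - 1
    for _ in range(n):
        # Resume one character past the previous match, so overlaps count.
        index = s.find(sub, index + 1, end)
        if index == -1:
            return -1
    return index


second_dot = find_nth_sketch('a.b.c', '.', 2)
```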

wc_utils.util.string.indent_forest(forest, indentation=2, keep_trailing_blank_lines=False, return_list=False)[source]¶

Generate a string of lines, each indented by its depth in forest

Convert a forest of objects provided in an iterator of nested iterators into a flat list of strings, each indented by depth*indentation spaces where depth is the objects’ depth in forest.

Strings are not treated as iterators. Properly handles strings containing newlines. Trailing blank lines are removed from strings containing newlines.

Parameters

forest (iterators of iterators) – a forest as an iterator of nested iterators
indentation (int, optional) – number of spaces to indent at each level
keep_trailing_blank_lines (bool, optional) – if set, keep trailing blank lines in strings in forest
return_list (bool, optional) – if set, return a list of lines, each indented by its depth in forest

Returns

a string of lines, each indented by its depth in forest, or a list of such lines if return_list is set

Return type

str
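
The flattening described above can be sketched with a recursive walk. `indent_forest_sketch` is a hypothetical name. Treating top-level items as depth 0 and converting non-string leaves with str() are assumptions of this sketch, not confirmed details of wc_utils.

```python
def indent_forest_sketch(forest, indentation=2,
                         keep_trailing_blank_lines=False, return_list=False):
    """Hypothetical sketch: flatten nested iterables into indented lines."""
    lines = []

    def walk(node, depth):
        prefix = ' ' * (indentation * depth)
        if isinstance(node, str):
            # Strings are leaves; they are never iterated character by character.
            node_lines = node.split('\n')
            if '\n' in node and not keep_trailing_blank_lines:
                while node_lines and not node_lines[-1].rstrip():
                    node_lines.pop()
            lines.extend(prefix + line for line in node_lines)
        elif hasattr(node, '__iter__'):
            for child in node:
                walk(child, depth + 1)
        else:
            lines.append(prefix + str(node))

    for tree in forest:
        walk(tree, 0)
    return lines if return_list else '\n'.join(lines)


text = indent_forest_sketch(['a', ['b', ['c']], 'd\ne\n\n'])
```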

wc_utils.util.string.partition_nth(s, sep, n)[source]¶

Partition a string on the nth occurrence of a substring

Parameters

s (str) – string to partition
sep (str) – separator to partition on
n (int) – number of the occurrence to partition on

Returns

str: substring before the nth separator
str: separator
str: substring after the nth separator

Return type

tuple

Raises

ValueError – if sep is empty or n is less than 1
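
As a sketch, partitioning on the nth separator amounts to locating that occurrence and slicing around it. `partition_nth_sketch` is a hypothetical name. Returning (s, '', '') when there are fewer than n separators mirrors str.partition and is an assumption, since the behavior in that case is not documented here.

```python
def partition_nth_sketch(s, sep, n):
    """Hypothetical sketch: split s around the nth occurrence of sep."""
    if not sep:
        raise ValueError('sep must be non-empty')
    if n < 1:
        raise ValueError('n must be at least 1')
    index = -1
    for _ in range(n):
        index = s.find(sep, index + 1)
        if index == -1:
            # Assumption: mimic str.partition when the separator is not found.
            return (s, '', '')
    return (s[:index], sep, s[index + len(sep):])


parts = partition_nth_sketch('a.b.c', '.', 2)
```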

wc_utils.util.string.rfind_nth(s, sub, n, start=0, end=inf)[source]¶

Get the index of the nth-last occurrence of a substring within a string

Parameters

s (str) – string to search
sub (str) – substring to search for
n (int) – number of the occurrence to find the position of
start (int, optional) – starting position to search from
end (int, optional) – end position to search within

Returns

index of the nth-last occurrence of the substring within the string, or -1 if there are fewer than n occurrences of the substring within the string

Return type

int

Raises

ValueError – if sub is empty or n is less than 1
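
Searching from the right can be sketched with str.rfind, shrinking the search window after each match. `rfind_nth_sketch` is a hypothetical name. Using None in place of the inf default for end, and counting overlapping occurrences, are assumptions of this sketch.

```python
def rfind_nth_sketch(s, sub, n, start=0, end=None):
    """Hypothetical sketch: index of the nth-last occurrence of sub in s, or -1."""
    if not sub:
        raise ValueError('sub must be non-empty')
    if n < 1:
        raise ValueError('n must be at least 1')
    if end is None:
        end = len(s)
    index = -1
    for _ in range(n):
        index = s.rfind(sub, start, end)
        if index == -1:
            return -1
        # The next match must begin strictly before the current one.
        end = index + len(sub) - 1
    return index


second_last_dot = rfind_nth_sketch('a.b.c', '.', 2)
```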

wc_utils.util.string.rpartition_nth(s, sep, n)[source]¶

Partition a string on the nth-last occurrence of a substring

Parameters

s (str) – string to partition
sep (str) – separator to partition on
n (int) – number of the occurrence to partition on

Returns

str: substring before the nth-last separator
str: separator
str: substring after the nth-last separator

Return type

tuple

Raises

ValueError – if sep is empty or n is less than 1

2.1.1.4.14. wc_utils.util.testing module¶

Assertions for testing

Author: Jonathan Karr <karr@mssm.edu>
Date: 2019-06-18
License: MIT

wc_utils.util.testing.assert_memory_less(obj, size, exclusive=False)[source]¶

Assert that the memory occupied by an object is less than a size

Parameters

obj (object) – object
size (int) – size in bytes
exclusive (bool, optional) – if True, check the exclusive memory of the object

Raises

ValueError – if the memory occupied by the object is greater than or equal to size

wc_utils.util.testing.assert_memory_less_equal(obj, size, exclusive=False)[source]¶

Assert that the memory occupied by an object is less than or equal to a size

Parameters

obj (object) – object
size (int) – size in bytes
exclusive (bool, optional) – if True, check the exclusive memory of the object

Raises

ValueError – if the memory occupied by the object is greater than size

2.1.1.4.15. wc_utils.util.types module¶

Utility functions

Author: Jonathan Karr <karr@mssm.edu>
Date: 2016-08-20
License: MIT

exception wc_utils.util.types.TypesUtilAssertionError[source]¶

Bases: AssertionError

Types Util assertion error

wc_utils.util.types.assert_value_equal(obj1, obj2, check_type=False, check_iterable_ordering=False)[source]¶

Recursively raise an exception if two objects have different semantic values, ignoring

key/attribute order
optionally, object types
optionally, element ordering in iterables

Parameters

obj1 (object) – first object
obj2 (object) – second object
check_type (bool, optional) – If true, raise an exception if obj1 and obj2 have different types
check_iterable_ordering (bool, optional) – If true, raise an exception if the objects have different orderings of iterable attributes

Raises

TypesUtilAssertionError – if the value of obj1 is not equal to that of obj2

wc_utils.util.types.assert_value_not_equal(obj1, obj2, check_type=False, check_iterable_ordering=False)[source]¶

Recursively raise an exception if two objects have the same semantic values, ignoring

key/attribute order
optionally, object types
optionally, element ordering in iterables

Parameters

obj1 (object) – first object
obj2 (object) – second object
check_type (bool, optional) – If true, raise an exception if obj1 and obj2 have different types
check_iterable_ordering (bool, optional) – If true, raise an exception if the objects have different orderings of iterable attributes

Raises

TypesUtilAssertionError – if the value of obj1 is equal to that of obj2

wc_utils.util.types.cast_to_builtins(obj)[source]¶

Recursively type cast an object to a semantically equivalent object expressed using only builtin types

All iterable objects (objects with __iter__ attribute) are converted to lists
All dictionable objects (objects which are dictionaries or which have the __dict__ attribute) are converted to dictionaries

Parameters: obj (object) – an object
Returns: a semantically equivalent object expressed using only builtin types
Return type: object
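
The two conversion rules above can be sketched as a recursive function. `cast_to_builtins_sketch` is a hypothetical name. Leaving strings untouched and checking dictionaries before other iterables are assumptions of this sketch.

```python
def cast_to_builtins_sketch(obj):
    """Hypothetical sketch: recursively reduce an object to builtin types."""
    if isinstance(obj, (str, bytes)):
        # Assumption: strings are iterable but are left as they are.
        return obj
    if isinstance(obj, dict):
        return {key: cast_to_builtins_sketch(val) for key, val in obj.items()}
    if hasattr(obj, '__iter__'):
        return [cast_to_builtins_sketch(item) for item in obj]
    if hasattr(obj, '__dict__'):
        return {key: cast_to_builtins_sketch(val)
                for key, val in vars(obj).items()}
    return obj


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


result = cast_to_builtins_sketch({'p': Point(1, (2, 3)), 's': 'text'})
```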

wc_utils.util.types.get_subclasses(cls, immediate_only=False)[source]¶

Reproducibly get subclasses of a class, with duplicates removed

Parameters

cls (type) – class
immediate_only (bool, optional) – if true, only return direct subclasses

Returns

list of subclasses, with duplicates removed

Return type

list of type
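
A sketch of this behavior can be built on the builtin `__subclasses__()` method, walking the hierarchy depth first and skipping classes already seen. `get_subclasses_sketch` is a hypothetical name, and the ordering it produces may differ from the ordering wc_utils returns.

```python
def get_subclasses_sketch(cls, immediate_only=False):
    """Hypothetical sketch: collect subclasses of cls without duplicates."""
    found = []
    for sub in cls.__subclasses__():
        if sub not in found:
            found.append(sub)
        if not immediate_only:
            for descendant in get_subclasses_sketch(sub):
                if descendant not in found:
                    found.append(descendant)
    return found


class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass  # reachable through both B and C


all_subclasses = get_subclasses_sketch(A)
direct_subclasses = get_subclasses_sketch(A, immediate_only=True)
```

D is reachable through two paths in this diamond hierarchy but appears only once in the result.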

wc_utils.util.types.get_superclasses(cls, immediate_only=False)[source]¶

Get superclasses of a class. If immediate_only, only return direct superclasses.

Parameters

cls (type) – class
immediate_only (bool, optional) – if true, only return direct superclasses

Returns

list of superclasses

Return type

list of type

wc_utils.util.types.is_iterable(obj)[source]¶

Check if object is an iterable (list, tuple, etc.) and not a string

Parameters: obj (object) – object
Returns: Whether or not object is iterable
Return type: bool
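
A minimal sketch of such a check, assuming it is based on `collections.abc.Iterable` (wc_utils may test for iterability differently). `is_iterable_sketch` is a hypothetical name.

```python
from collections.abc import Iterable


def is_iterable_sketch(obj):
    """Hypothetical sketch: True for non-string iterables."""
    return isinstance(obj, Iterable) and not isinstance(obj, str)


checks = [is_iterable_sketch(x) for x in ([1, 2], (1, 2), 'abc', 5)]
# checks is [True, True, False, False]
```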

2.1.1.4.16. wc_utils.util.uniform_seq module¶

Generate an infinite sequence of evenly spaced values

Author: Arthur Goldberg <Arthur.Goldberg@mssm.edu>
Date: 2019-12-11
License: MIT

class wc_utils.util.uniform_seq.UniformSequence(start, step)[source]¶

Bases: collections.abc.Iterator

Generate an infinite sequence of evenly spaced values, especially for non-integral step sizes

Avoids floating-point roundoff errors by using Decimals to represent the start and step size. The start and step arguments must be integers, floats or strings that can be represented as a Decimal with a mantissa that contains no more than UNIFORM_SEQ_PRECISION digits.
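
The roundoff problem and the Decimal remedy can be illustrated with a small generator. This is a sketch of the general technique only. `uniform_sequence_sketch` is a hypothetical name, it does not enforce UNIFORM_SEQ_PRECISION, and computing each term as start + num_steps * step is an assumption about the approach.

```python
from decimal import Decimal
from itertools import islice


def uniform_sequence_sketch(start, step):
    """Hypothetical sketch: yield start, start + step, ... as Decimals."""
    # Converting through str avoids importing the float's binary roundoff.
    start_d = Decimal(str(start))
    step_d = Decimal(str(step))
    num_steps = 0
    while True:
        yield start_d + num_steps * step_d
        num_steps += 1


values = list(islice(uniform_sequence_sketch(0, 0.1), 4))
as_floats = [float(v) for v in values]
# Repeated float addition drifts: 0.1 + 0.1 + 0.1 is not exactly 0.3,
# whereas the fourth Decimal term converts cleanly to 0.3.
naive = 0.1 + 0.1 + 0.1
```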

_start[source]¶

starting point of the sequence

Type: Decimal

_step[source]¶

step size for the sequence

Type: Decimal

_num_steps[source]¶

number of steps taken in the sequence

Type: int

__iter__()[source]¶

Get this UniformSequence

Returns: this UniformSequence
Return type: UniformSequence

__next__()[source]¶

Get next value in the sequence

Returns: next value in this UniformSequence
Return type: Decimal

next_float()[source]¶

Get next value in the sequence as a float for external use

Returns: next value in this UniformSequence
Return type: float

static truncate(value)[source]¶

Truncate a uniform sequence value into fixed-point notation for output

Raise an exception if truncation loses precision.

Parameters

value (float) – value to truncate to a certain precision

Returns

string representation of a uniform sequence value truncated to the maximum precision supported

Return type

str

Raises

StopIteration – if the truncated value does not equal value

2.1.1.4.17. wc_utils.util.units module¶

Utilities for dealing with units

Author: Jonathan <jonrkarr@gmail.com>
Date: 2017-05-29
License: MIT

wc_utils.util.units.are_units_equivalent(units1, units2, check_same_magnitude=True)[source]¶

Determine if two units are equivalent

Parameters

units1 (pint.unit._Unit) – units
units2 (pint.unit._Unit) – other units
check_same_magnitude (bool, optional) – if True, units are only equivalent if they have the same magnitude

Returns

True if the units are equivalent

Return type

bool

wc_utils.util.units.get_unit_registry(base_filename='', extra_filenames=None)[source]¶

Get a unit registry

Parameters

base_filename (str, optional) – Path to base unit system definition. If None, the default pint unit system will be used
extra_filenames (list of str, optional) – List of paths to additional unit definitions beyond the base unit system definition

Returns

unit registry

Return type

pint.UnitRegistry

2.1.1.4.18. Module contents¶