3.1. obj_tables package

3.1.2. Submodules

3.1.3. obj_tables.__main__ module

Command line utilities for modeling data in tables (XLSX, CSV, TSV)

Author

Jonathan Karr <karr@mssm.edu>

Date

2019-09-11

Copyright

2019, Karr Lab

License

MIT

class obj_tables.__main__.App(label=None, **kw)[source]

Bases: cement.core.foundation.App

Command line application

class Meta[source]

Bases: object

base_controller = 'base'[source]
handlers = [<class 'obj_tables.__main__.BaseController'>, <class 'obj_tables.__main__.VizSchemaController'>, <class 'obj_tables.__main__.ValidateController'>, <class 'obj_tables.__main__.NormalizeController'>, <class 'obj_tables.__main__.InitSchemaController'>, <class 'obj_tables.__main__.GenTemplateController'>, <class 'obj_tables.__main__.DiffController'>, <class 'obj_tables.__main__.ConvertController'>][source]
label = 'obj-tables'[source]
class obj_tables.__main__.BaseController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Base controller for command line application

class Meta[source]

Bases: object

arguments = [(['-v', '--version'], {'action': 'version', 'version': '1.0.14'})][source]
description = 'Command line utilities for modeling data in tables (XLSX, CSV, TSV)'[source]
help = 'Command line utilities for modeling data in tables (XLSX, CSV, TSV)'[source]
label = 'base'[source]
class obj_tables.__main__.ConvertController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Convert a schema-encoded workbook to another format (CSV, XLSX, JSON, TSV, YAML)

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or a declarative description of the schema (.csv, .tsv, .xlsx)'}), (['in_wb_file'], {'type': <class 'str'>, 'help': 'Path to the workbook (.csv, .json, .tsv, .xlsx, .yml)'}), (['out_wb_file'], {'type': <class 'str'>, 'help': 'Path to save the workbook (.csv, .json, .tsv, .xlsx, .yml)'}), (['--write-toc'], {'action': 'store_true', 'default': False, 'help': 'If set, write a table of contents with the outputted workbook'}), (['--write-schema'], {'action': 'store_true', 'default': False, 'help': 'If set, save a copy of the schema within the outputted workbook'}), (['--unprotected'], {'action': 'store_true', 'default': False, 'help': 'If set, do not protect the outputted workbook'})][source]
description = 'Convert a schema-encoded workbook to another format (CSV, XLSX, JSON, TSV, YAML)'[source]
help = 'Convert a schema-encoded workbook to another format (CSV, XLSX, JSON, TSV, YAML)'[source]
label = 'convert'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
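The positional and optional arguments listed for the convert command can be mirrored with a small standard-library parser. This is only a sketch: it uses argparse directly rather than the cement controller, the program name assumes the console command matches the application label 'obj-tables', and the file names are hypothetical.

```python
import argparse


def build_convert_parser():
    """Stand-in parser that mirrors the `convert` arguments listed above."""
    parser = argparse.ArgumentParser(prog="obj-tables convert")
    parser.add_argument("schema_file", type=str,
                        help="Path to the schema or its declarative description")
    parser.add_argument("in_wb_file", type=str, help="Path to the workbook")
    parser.add_argument("out_wb_file", type=str, help="Path to save the workbook")
    # The three flags are plain switches that default to False
    parser.add_argument("--write-toc", action="store_true", default=False)
    parser.add_argument("--write-schema", action="store_true", default=False)
    parser.add_argument("--unprotected", action="store_true", default=False)
    return parser


# Hypothetical file names, for illustration only
args = build_convert_parser().parse_args(
    ["schema.py", "data.xlsx", "data.csv", "--write-toc"])
print(args.out_wb_file, args.write_toc, args.unprotected)
```

The other nested commands (diff, normalize, validate, and so on) follow the same pattern with their own positional arguments.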
class obj_tables.__main__.DiffController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Calculate the difference between two workbooks according to a schema

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or a declarative description of the schema (.csv, .tsv, .xlsx)'}), (['model'], {'type': <class 'str'>, 'help': 'Type of objects to compare'}), (['wb_file_1'], {'type': <class 'str'>, 'help': 'Path to the first workbook (.csv, .json, .tsv, .xlsx, .yml)'}), (['wb_file_2'], {'type': <class 'str'>, 'help': 'Path to the second workbook (.csv, .json, .tsv, .xlsx, .yml)'})][source]
description = 'Calculate the difference between two workbooks according to a schema'[source]
help = 'Calculate the difference between two workbooks according to a schema'[source]
label = 'diff'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.__main__.GenTemplateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Generate a template workbook (XLSX, CSV, TSV) for a schema or declarative description of a schema

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or declarative description of the schema (.csv, .tsv, .xlsx)'}), (['template_file'], {'type': <class 'str'>, 'help': 'Path to save the template (.csv, .tsv, .xlsx)'}), (['--write-toc'], {'action': 'store_true', 'default': False, 'help': 'If set, write a table of contents with the outputted workbook'}), (['--write-schema'], {'action': 'store_true', 'default': False, 'help': 'If set, save a copy of the schema within the template'}), (['--unprotected'], {'action': 'store_true', 'default': False, 'help': 'If set, do not protect the outputted workbook'})][source]
description = 'Generate a template workbook (XLSX, CSV, TSV) for a schema or declarative description of a schema'[source]
help = 'Generate a template workbook (XLSX, CSV, TSV) for a schema or declarative description of a schema'[source]
label = 'gen-template'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.__main__.InitSchemaController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Initialize a Python schema from a declarative description of the schema in a table (XLSX, CSV, TSV)

class Meta[source]

Bases: object

arguments = [(['in_file'], {'type': <class 'str'>, 'help': 'Path to the declarative description of the schema (.csv, .tsv, .xlsx)'}), (['out_file'], {'type': <class 'str'>, 'help': 'Path to save Python schema (.py)'})][source]
description = 'Initialize a Python schema from a declarative description of the schema in a table (XLSX, CSV, TSV)'[source]
help = 'Initialize a Python schema from a declarative description of the schema in a table (XLSX, CSV, TSV)'[source]
label = 'init-schema'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.__main__.NormalizeController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Normalize a workbook according to a schema

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or a declarative description of the schema (.csv, .tsv, .xlsx)'}), (['model'], {'type': <class 'str'>, 'help': 'Type of objects to normalize'}), (['in_wb_file'], {'type': <class 'str'>, 'help': 'Path to the workbook (.csv, .json, .tsv, .xlsx, .yml)'}), (['out_wb_file'], {'type': <class 'str'>, 'help': 'Path to save the normalized workbook (.csv, .json, .tsv, .xlsx, .yml)'}), (['--write-toc'], {'action': 'store_true', 'default': False, 'help': 'If set, write a table of contents with the outputted workbook'}), (['--write-schema'], {'action': 'store_true', 'default': False, 'help': 'If set, save a copy of the schema within the normalized workbook'}), (['--unprotected'], {'action': 'store_true', 'default': False, 'help': 'If set, do not protect the outputted workbook'})][source]
description = 'Normalize a workbook according to a schema'[source]
help = 'Normalize a workbook according to a schema'[source]
label = 'normalize'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.__main__.ValidateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Validate that a workbook is consistent with a schema, and report any errors

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or a declarative description of the schema (.csv, .tsv, .xlsx)'}), (['wb_file'], {'type': <class 'str'>, 'help': 'Path to the workbooks (.csv, .json, .tsv, .xlsx, .yml)'})][source]
description = 'Validate that a workbook is consistent with a schema, and report any errors'[source]
help = 'Validate that a workbook is consistent with a schema, and report any errors'[source]
label = 'validate'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.__main__.VizSchemaController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Generate a UML diagram for a schema

class Meta[source]

Bases: object

arguments = [(['schema_file'], {'type': <class 'str'>, 'help': 'Path to the schema (.py) or a declarative description of the schema (.csv, .tsv, .xlsx)'}), (['img_file'], {'type': <class 'str'>, 'help': 'Path to save a UML diagram of the schema (.pdf, .png, .svg)'})][source]
description = 'Generate a UML diagram for a schema'[source]
help = 'Generate a UML diagram for a schema'[source]
label = 'viz-schema'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
obj_tables.__main__.get_schema_models(filename)[source]

Get a Python schema and its models

Parameters

filename (str) – path to schema or declarative representation of the schema

Returns

  • str: schema name

  • types.ModuleType: schema module

  • list of core.Model: models

Return type

tuple

obj_tables.__main__.main()[source]

3.1.4. obj_tables._version module

3.1.5. obj_tables.abstract module

Support for abstract model classes

Author

Jonathan Karr <karr@mssm.edu>

Date

2017-05-23

Copyright

2016, Karr Lab

License

MIT

class obj_tables.abstract.AbstractModel(_comments=None, **kwargs)[source]

Bases: obj_tables.core.Model

Abstract model base class

Parameters

**kwargs – dictionary of keyword arguments with keys equal to the names of the model attributes

Raises

TypeError – if keyword argument is not a defined attribute

class Meta[source]

Bases: obj_tables.core.Meta

attribute_order = ()[source]
attributes = {}[source]
children = {}[source]
description = ''[source]
frozen_columns = 1[source]
indexed_attrs_tuples = ()[source]
inheritance = (<class 'obj_tables.abstract.AbstractModel'>,)[source]
local_attributes = {}[source]
merge = 1[source]
ordering = ()[source]
primary_attribute = None[source]
related_attributes = {}[source]
table_format = 1[source]
unique_together = ()[source]
verbose_name = 'Abstract model'[source]
verbose_name_plural = 'Abstract models'[source]
objects = <obj_tables.core.Manager object>[source]
class obj_tables.abstract.AbstractModelMeta[source]

Bases: obj_tables.core.ModelMeta, abc.ABCMeta

Abstract model metaclass

Parameters
  • metacls (Model) – Model, or a subclass of Model

  • name (str) – Model class name

  • bases (tuple) – tuple of superclasses

  • namespace (dict) – namespace of Model class definition

Returns

a new instance of Model, or a subclass of Model

Return type

Model
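The base classes listed for AbstractModelMeta suggest that a model metaclass is combined with abc.ABCMeta so that abstract models cannot be instantiated. The sketch below shows that general Python pattern with hypothetical stand-in classes (RegistryMeta, Shape, Square); it does not reproduce the behavior of obj_tables.core.ModelMeta.

```python
import abc


class RegistryMeta(type):
    """Hypothetical stand-in for a model metaclass: records each class it creates."""
    registry = []

    def __new__(metacls, name, bases, namespace):
        cls = super().__new__(metacls, name, bases, namespace)
        metacls.registry.append(name)
        return cls


class AbstractRegistryMeta(RegistryMeta, abc.ABCMeta):
    """Combine the custom metaclass with ABCMeta (same shape as the bases above)."""


class Shape(metaclass=AbstractRegistryMeta):
    @abc.abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


try:
    Shape()  # abstract classes cannot be instantiated
except TypeError as exc:
    print("TypeError:", exc)

print(Square(3).area())
```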

3.1.6. obj_tables.core module

Toolkit for modeling complex datasets with collections of user-friendly tables

Many classes provide serialize() and deserialize() methods, which invert each other. serialize() converts a Python object instance into a string representation, whereas deserialize() parses an object's string representation, as would be stored in a file or spreadsheet representation of a biochemical model, into a Python object instance. deserialize() returns an error when the string representation cannot be parsed into a Python object. Deserialization methods for related attributes (subclasses of RelatedAttribute) are not called until all other attributes have been deserialized; specifically, they are called by obj_tables.io.WorkbookReader.link_model. As a result, they are passed all objects that are not inline, which can then be referenced to deserialize the related attribute.
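The round trip described above can be sketched for a date value as follows. These are standalone functions rather than the obj_tables methods; the ISO 8601 string format and the plain-string error message are assumptions made for illustration.

```python
from datetime import date


def serialize(value):
    """Convert a date (or None) into its string representation."""
    return "" if value is None else value.isoformat()


def deserialize(text):
    """Parse a string into a date, returning (value, error) instead of raising."""
    if text in (None, ""):
        return None, None
    try:
        return date.fromisoformat(text), None
    except ValueError as exc:
        return None, "cannot parse {!r}: {}".format(text, exc)


# deserialize() inverts serialize()
value, error = deserialize(serialize(date(2016, 12, 12)))
print(value, error)

# Unparseable input yields an error object rather than an exception
bad_value, bad_error = deserialize("not a date")
print(bad_value, bad_error is not None)
```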

Author

Jonathan Karr <karr@mssm.edu>

Author

Arthur Goldberg <Arthur.Goldberg@mssm.edu>

Date

2016-12-12

Copyright

2016, Karr Lab

License

MIT

class obj_tables.core.Attribute(init_value=None, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: object

Model attribute

name[source]

name

Type

str

type[source]

allowed type(s) of the values of the attribute

Type

types.TypeType or tuple of types.TypeType

init_value[source]

initial value

Type

object

default[source]

default value

Type

object

default_cleaned_value[source]

value to replace None values with during cleaning, or function which computes the value to replace None values

Type

object

none_value[source]

none value

Type

object

verbose_name[source]

verbose name

Type

str

description[source]

description

Type

str

primary[source]

indicate if attribute is primary attribute

Type

bool

unique[source]

indicate if attribute value must be unique

Type

bool

unique_case_insensitive[source]

if true, conduct case-insensitive test of uniqueness

Type

bool

Parameters
  • init_value (object, optional) – initial value

  • default (object, optional) – default value

  • default_cleaned_value (object, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

abstract copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (object) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

object

abstract deserialize(value)[source]

Deserialize value

Parameters

value (object) – semantically equivalent representation

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

abstract from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (object) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

object

get_default()[source]

Get default value for attribute

Returns

default value

Return type

object

get_default_cleaned_value()[source]

Get value to replace None values with during cleaning

Returns

value to replace None values with during cleaning

Return type

object

get_init_value(obj)[source]

Get initial value for attribute

Parameters

obj (Model) – object whose attribute is being initialized

Returns

initial value

Return type

object

get_none_value()[source]

Get none value

Returns

none value

Return type

object

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

abstract merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

abstract serialize(value)[source]

Serialize value

Parameters

value (object) – Python representation

Returns

simple Python representation

Return type

bool, float, str, or None

set_value(obj, new_value)[source]

Set value of attribute of object

Parameters
  • obj (Model) – object

  • new_value (object) – new attribute value

Returns

attribute value

Return type

object

abstract to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (object) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

object

abstract validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

validate_unique(objects, values)[source]

Determine if the attribute values are unique

Parameters
  • objects (list of Model) – list of Model objects

  • values (list) – list of values

Returns

None if values are unique, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

value_equal(val1, val2, tol=0.0)[source]

Determine if attribute values are equal

Parameters
  • val1 (object) – first value

  • val2 (object) – second value

  • tol (float, optional) – equality tolerance

Returns

True if attribute values are equal

Return type

bool

class obj_tables.core.BaseRelatedAttribute[source]

Bases: object

Attribute which represents a relationship with 1 or more other Models

class obj_tables.core.BooleanAttribute(default=False, default_cleaned_value=None, none_value=None, verbose_name='', description='Enter a Boolean value')[source]

Bases: obj_tables.core.LiteralAttribute

Boolean attribute

default[source]

default value

Type

bool

default_cleaned_value[source]

value to replace None values with during cleaning

Type

bool

Parameters
  • default (bool, optional) – default value

  • default_cleaned_value (bool, optional) – value to replace None values with during cleaning

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

Raises

ValueError – if default is not a bool or if default_cleaned_value is not a bool

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of bool, InvalidAttribute or None
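The clean() contract, a tuple of the cleaned value and either an error or None, can be sketched as below. The accepted spellings ("true", "false", "1", "0") and the plain-string error are assumptions for illustration; the library's own coercion rules and its InvalidAttribute class may differ.

```python
def clean_bool(value):
    """Coerce a raw cell value to bool; return (cleaned value, error or None)."""
    if value is None or isinstance(value, bool):
        return value, None
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered == "":
            return None, None  # treat an empty cell as a missing value
        if lowered in ("true", "1"):
            return True, None
        if lowered in ("false", "0"):
            return False, None
    elif isinstance(value, (int, float)) and value in (0, 1):
        return bool(value), None
    # On failure, hand back the original value together with an error
    return value, "Value must be a Boolean"


print(clean_bool("TRUE"))
print(clean_bool(0))
print(clean_bool("maybe"))
```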

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.CellDialect[source]

Bases: str, enum.Enum

Dialect for serializing values to a cell

csv = 'excel'[source]
json = 'json'[source]
tsv = 'excel-tab'[source]
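The csv and tsv values, 'excel' and 'excel-tab', are also the names of dialects registered by Python's csv module. The sketch below uses a local replica of the enumeration to show how such a value could be handed to csv.writer; whether obj_tables uses the values this way is an assumption, and write_row is a hypothetical helper.

```python
import csv
import enum
import io


class CellDialect(str, enum.Enum):
    """Local replica of the enumeration above, for illustration."""
    csv = "excel"
    json = "json"
    tsv = "excel-tab"


def write_row(row, dialect):
    """Write one row using the csv-module dialect named by the enum value."""
    buffer = io.StringIO()
    csv.writer(buffer, dialect=dialect.value).writerow(row)
    return buffer.getvalue()


print(repr(write_row(["id", "name"], CellDialect.csv)))
print(repr(write_row(["id", "name"], CellDialect.tsv)))

# Because of the str mixin, members compare equal to plain strings
print(CellDialect.tsv == "excel-tab")
```

The json member has no counterpart in the csv module, so it is not passed to csv.writer here.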
class obj_tables.core.DateAttribute(none=True, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.LiteralAttribute

Date attribute

none[source]

if False, the attribute is invalid if its value is None

Type

bool

default[source]

default date

Type

date

default_cleaned_value[source]

value to replace None values with during cleaning, or function which computes the value to replace None values

Type

date

Parameters
  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (date, optional) – default date

  • default_cleaned_value (date, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

(date, None), or (None, InvalidAttribute) reporting error

Return type

tuple

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (str) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

date

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize date

Parameters

value (date) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (date) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

str

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (date) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.DateTimeAttribute(none=True, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.LiteralAttribute

Datetime attribute

none[source]

if False, the attribute is invalid if its value is None

Type

bool

default[source]

default datetime

Type

datetime

default_cleaned_value[source]

value to replace None values with during cleaning, or function which computes the value to replace None values

Type

datetime

Parameters
  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (datetime, optional) – default datetime

  • default_cleaned_value (datetime, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

  • tuple of object, InvalidAttribute if value is invalid, or

  • tuple of datetime, None with cleaned value otherwise

Return type

tuple

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (str) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

datetime

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize datetime

Parameters

value (datetime) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (datetime) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

str

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (datetime) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

None or InvalidAttribute

class obj_tables.core.EmailAttribute(verbose_name='', description='Enter a valid email address', primary=False, unique=False)[source]

Bases: obj_tables.core.StringAttribute

Attribute for email addresses

Parameters
  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (str) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.EnumAttribute(enum_class, none=False, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: obj_tables.core.LiteralAttribute

Enumeration attribute

enum_class[source]

subclass of Enum

Type

type

none[source]

if False, the attribute is invalid if its value is None

Type

bool

Parameters
  • enum_class (type or list) – subclass of Enum, list of enumerated names, list of 2-tuples of each enumerated name and its value, or a dict which maps enumerated names to their values

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (object, optional) – default value

  • default_cleaned_value (Enum, optional) – value to replace None values with during cleaning

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

Raises

ValueError – if enum_class is not a subclass of Enum, if default is not an instance of enum_class, or if default_cleaned_value is not an instance of enum_class
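The forms accepted for enum_class (an Enum subclass, a list of names, a list of 2-tuples, or a dict) all map onto the functional API of Python's enum module. The helper below, to_enum_class, is hypothetical and only sketches one way to normalize them; the class name "Choices" and the values assigned to bare names are assumptions, not the library's behavior.

```python
import enum


def to_enum_class(spec, name="Choices"):
    """Normalize the accepted forms into an Enum subclass (hypothetical helper)."""
    if isinstance(spec, type) and issubclass(spec, enum.Enum):
        return spec
    if isinstance(spec, (list, tuple, dict)):
        # The functional API accepts names, (name, value) pairs, or a mapping
        return enum.Enum(name, spec)
    raise ValueError("enum_class must be an Enum subclass, a list, or a dict")


Colors = to_enum_class(["red", "green"])              # names only
Levels = to_enum_class([("low", 10), ("high", 20)])   # (name, value) pairs
Units = to_enum_class({"cm": 0.01, "m": 1.0})         # mapping of names to values

print([member.name for member in Colors])
print(Levels.high.value, Units.m.value)
```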

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of Enum, InvalidAttribute or None

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (str) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

Enum

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize enumeration

Parameters

value (Enum) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (Enum) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

str

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.FloatAttribute(min=nan, max=nan, nan=True, default=nan, default_cleaned_value=nan, none_value=nan, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.NumericAttribute

Float attribute

default[source]

default value

Type

float

default_cleaned_value[source]

value to replace None values with during cleaning

Type

float

min[source]

minimum value

Type

float

max[source]

maximum value

Type

float

nan[source]

if true, allow nan values

Type

bool

Parameters
  • min (float, optional) – minimum value

  • max (float, optional) – maximum value

  • nan (bool, optional) – if true, allow nan values

  • default (float, optional) – default value

  • default_cleaned_value (float, optional) – value to replace None values with during cleaning

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

Raises

ValueError – if max is less than min
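Since min and max default to nan, one plausible reading is that nan means "no bound". The sketch below follows that reading with two hypothetical helpers, check_bounds and in_range; the library's own checks may be implemented differently.

```python
import math


def check_bounds(min_value=float("nan"), max_value=float("nan")):
    """Reject an inconsistent range; NaN is treated as 'no bound' (assumption)."""
    if (not math.isnan(min_value) and not math.isnan(max_value)
            and max_value < min_value):
        raise ValueError("`max` must be at least `min`")


def in_range(value, min_value=float("nan"), max_value=float("nan"),
             allow_nan=True):
    """Return True if value satisfies the optional bounds."""
    if math.isnan(value):
        return allow_nan
    if not math.isnan(min_value) and value < min_value:
        return False
    if not math.isnan(max_value) and value > max_value:
        return False
    return True


check_bounds(0.0, 1.0)           # consistent range, no exception
print(in_range(0.5, 0.0, 1.0))   # inside the range
print(in_range(2.0, 0.0, 1.0))   # above the maximum
print(in_range(float("nan"), allow_nan=False))
```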

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of float, InvalidAttribute or None

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

serialize(value)[source]

Serialize float

Parameters

value (float) – Python representation

Returns

simple Python representation

Return type

float

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise return a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

value_equal(val1, val2, tol=0.0)[source]

Determine if attribute values are equal, optionally, up to a tolerance

Parameters
  • val1 (object) – first value

  • val2 (object) – second value

  • tol (float, optional) – equality tolerance

Returns

True if attribute values are equal

Return type

bool
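A comparison "up to a tolerance" could look like the sketch below. The choices made here are assumptions rather than the library's documented behavior: the tolerance is relative to the larger magnitude, two NaN values are treated as equal, and None only equals None.

```python
import math


def value_equal(val1, val2, tol=0.0):
    """Compare two floats, optionally up to a relative tolerance (sketch)."""
    if val1 is None or val2 is None:
        return val1 is val2
    if math.isnan(val1) and math.isnan(val2):
        return True  # assumption: missing values compare equal
    if val1 == val2:
        return True
    # Relative difference, scaled by the larger magnitude
    return abs(val1 - val2) <= tol * max(abs(val1), abs(val2))


print(value_equal(100.0, 100.5))            # exact comparison by default
print(value_equal(100.0, 100.5, tol=0.01))  # within 1 percent
print(value_equal(float("nan"), float("nan")))
```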

class obj_tables.core.IntegerAttribute(min=None, max=None, none=False, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.NumericAttribute

Integer attribute

none[source]

if False, the attribute is invalid if its value is None

Type

bool

default[source]

default value

Type

int

default_cleaned_value[source]

value to replace None values with during cleaning

Type

int

min[source]

minimum value

Type

int

max[source]

maximum value

Type

int

Parameters
  • min (int, optional) – minimum value

  • max (int, optional) – maximum value

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (int, optional) – default value

  • default_cleaned_value (int, optional) – value to replace None values with during cleaning

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

Raises

ValueError – if max is less than min
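
To make the min, max, and none constraints concrete, here is a minimal stand-alone sketch. IntegerRangeSketch is a hypothetical class that does not import obj_tables; the error messages and the rejection of bool values are assumptions.

```python
class IntegerRangeSketch:
    """Hypothetical stand-in for the min/max/none checks described above."""

    def __init__(self, min=None, max=None, none=False):
        # Mirror the documented constructor check: max must not be less than min
        if min is not None and max is not None and max < min:
            raise ValueError('`max` must be at least `min`')
        self.min = min
        self.max = max
        self.none = none

    def validate(self, value):
        """Return None if value is valid, otherwise a list of messages."""
        if value is None:
            return None if self.none else ['Value cannot be None']
        # Assumption: bool is rejected even though it subclasses int
        if isinstance(value, bool) or not isinstance(value, int):
            return ['Value must be an integer']
        errors = []
        if self.min is not None and value < self.min:
            errors.append('Value must be at least {}'.format(self.min))
        if self.max is not None and value > self.max:
            errors.append('Value must be at most {}'.format(self.max))
        return errors or None


attr = IntegerRangeSketch(min=0, max=10)
print(attr.validate(5))
print(attr.validate(11))
```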

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of int, InvalidAttribute or None

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (float) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

int

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize integer

Parameters

value (int) – Python representation

Returns

simple Python representation

Return type

float

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (int) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

float

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.InvalidAttribute(attribute, messages, related=False, location=None, value=None)[source]

Bases: object

Represents an invalid attribute and its errors

attribute[source]

invalid attribute

Type

Attribute

messages[source]

list of error messages

Type

list of str

related[source]

indicates if error is about value or related value

Type

bool

location[source]

a string representation of the attribute’s location in an input file

Type

str, optional

value[source]

invalid input value

Type

str, optional

Parameters
  • attribute (Attribute) – invalid attribute

  • messages (list of str) – list of error messages

  • related (bool, optional) – indicates if error is about value or related value

  • location (str, optional) – a string representation of the attribute’s location in an input file

  • value (str, optional) – invalid input value

__str__()[source]

Get string representation of errors

Returns

string representation of errors

Return type

str

set_location_and_value(location, value)[source]

Set the location and value of the attribute

Parameters
  • location (str) – a string representation of the attribute’s location in an input file

  • value (str) – the invalid input value

class obj_tables.core.InvalidModel(model, attributes)[source]

Bases: object

Represents an invalid model, such as a model with an attribute that fails to meet specified constraints

model[source]

Model class

Type

class

attributes[source]

list of invalid attributes and their errors

Type

list of InvalidAttribute

Parameters
  • model (class) – Model class

  • attributes (list of InvalidAttribute) – list of invalid attributes and their errors

__str__()[source]

Get string representation of errors

Returns

string representation of errors

Return type

str

class obj_tables.core.InvalidObject(object, attributes)[source]

Bases: object

Represents an invalid object and its errors

object[source]

invalid object

Type

object

attributes[source]

list of invalid attributes and their errors

Type

list of InvalidAttribute

Parameters
  • object (Model) – invalid object

  • attributes (list of InvalidAttribute) – list of invalid attributes and their errors

__str__()[source]

Get string representation of errors

Returns

string representation of errors

Return type

str

class obj_tables.core.InvalidObjectSet(invalid_objects, invalid_models)[source]

Bases: object

Represents a list of invalid objects and invalid models

invalid_objects[source]

list of invalid objects

Type

list of InvalidObject

invalid_models[source]

list of invalid models

Type

list of InvalidModel

Parameters
  • invalid_objects (list of InvalidObject) – list of invalid objects

  • invalid_models (list of InvalidModel) – list of invalid models

Raises

ValueError – if invalid_models is not unique

__str__()[source]

Get string representation of errors

Returns

string representation of errors

Return type

str

get_model_errors_by_model()[source]

Get model errors grouped by models

Returns

dictionary of model errors, grouped by model

Return type

dict mapping Model to InvalidModel

get_object_errors_by_model()[source]

Get object errors grouped by model

Returns

dictionary of object errors, grouped by model

Return type

dict mapping Model to list of InvalidObject
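
Grouping errors by model amounts to bucketing each invalid object under the class of the object it describes. The sketch below shows the idea with plain Python classes. InvalidObjectSketch, Gene, Protein, and object_errors_by_model are hypothetical names used for illustration, not the library's code.

```python
class Gene:
    pass


class Protein:
    pass


class InvalidObjectSketch:
    """Hypothetical record pairing an object with its error messages."""

    def __init__(self, obj, messages):
        self.object = obj
        self.messages = messages


def object_errors_by_model(invalid_objects):
    """Group invalid-object records by the class of the object they describe."""
    grouped = {}
    for invalid in invalid_objects:
        grouped.setdefault(type(invalid.object), []).append(invalid)
    return grouped


errors = [
    InvalidObjectSketch(Gene(), ['id is not unique']),
    InvalidObjectSketch(Protein(), ['name is too long']),
    InvalidObjectSketch(Gene(), ['length must be positive']),
]
grouped = object_errors_by_model(errors)
for model, model_errors in grouped.items():
    print(model.__name__, len(model_errors))
```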

class obj_tables.core.ListAttribute(type=<class 'str'>, separator=', ', default=[], none_value=[], verbose_name='', description='A list of values')[source]

Bases: obj_tables.core.LiteralAttribute

List attribute

type[source]

type of elements

Type

type

separator[source]

element separator for serialization

Type

str

Parameters
  • type (type, optional) – type of elements

  • separator (str, optional) – element separator for serialization

  • default (list, optional) – default value

  • none_value (list, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

clean(value)[source]

Deserialize value

Parameters

value (str) – semantically equivalent representation

Returns

tuple of cleaned value and cleaning error

Return type

tuple

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (dict) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

list

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize string

Parameters

value (list) – Python representation

Returns

simple Python representation

Return type

str
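
The role of the separator and type parameters can be sketched as a round trip between a list and a single cell string. The helpers serialize_list and clean_list below are hypothetical. How the library treats empty cells, whitespace, and conversion failures is not stated here, so those details are assumptions, and the error is returned as a plain string rather than an InvalidAttribute.

```python
def serialize_list(values, separator=', '):
    """Join the elements of a list into a single cell string."""
    return separator.join(str(value) for value in values)


def clean_list(text, type=str, separator=', '):
    """Split a cell string into typed elements.

    Returns a tuple of (cleaned list, error message or None).
    The parameter name `type` mirrors the documented signature and
    shadows the builtin inside this function.
    """
    if text is None or text == '':
        # Assumption: an empty cell represents an empty list
        return [], None
    try:
        return [type(item.strip()) for item in text.split(separator)], None
    except ValueError as error:
        return None, str(error)


cell = serialize_list([1, 2, 3])
print(cell)
print(clean_list(cell, type=int))
```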

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (list) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

dict

validate(obj, value)[source]

Determine if value is a valid value

Parameters
  • obj (Model) – class being validated

  • value (list) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.LiteralAttribute(init_value=None, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: obj_tables.core.Attribute

Base class for literal attributes (Boolean, enumeration, float, integer, string, etc.)

Parameters
  • init_value (object, optional) – initial value

  • default (object, optional) – default value

  • default_cleaned_value (object, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (object) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

object

deserialize(value)[source]

Deserialize value

Parameters

value (object) – semantically equivalent representation

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (object) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

object

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

serialize(value)[source]

Serialize value

Parameters

value (object) – Python representation

Returns

simple Python representation

Return type

bool, float, str, or None

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (object) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

object

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.LocalAttribute(attr, primary_class, is_primary=True)[source]

Bases: object

Meta data about a local attribute in a class

attr[source]

attribute

Type

Attribute

cls[source]

class which owns this attribute

Type

type

name[source]

name of the attr in cls

Type

str

type[source]

allowed type(s) of the values of the attribute

Type

types.TypeType

related_class[source]

other class which is related to this attribute

Type

type

related_name[source]

name of this attribute in related_cls

Type

str

primary_class[source]

class in which this attribute was defined

Type

type

primary_name[source]

name of this attribute in primary_cls

Type

str

secondary_class[source]

related class to primary_cls

Type

type

secondary_name[source]

name of this attribute in secondary_cls

Type

str

is_primary[source]

True if this attr was defined in cls (cls=primary_cls)

Type

bool

True if this attribute is an instance of RelatedAttribute

Type

bool

True if the value of this attribute is a list (*-to-many relationship)

Type

bool

minimum number of related objects in the forward direction

Type

int

maximum number of related objects in the forward direction

Type

int

minimum number of related objects in the reverse direction

Type

int

maximum number of related objects in the reverse direction

Type

int

Parameters
  • attr (Attribute) – attribute

  • primary_class (type) – class in which attr was defined

  • is_primary (bool, optional) – True indicates that a local attribute should be created for the related class of attr

class obj_tables.core.LocalPathAttribute(verbose_name='', description='Enter a path to a local file or directory', none=False, default=None, default_cleaned_value=None, none_value=None, primary=False, unique=False)[source]

Bases: obj_tables.core.LongStringAttribute

Attribute to be used for paths to local files and directories

Parameters
  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (str, optional) – default value

  • default_cleaned_value (str, optional) – value to replace None values with during cleaning

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

(pathlib.Path, None), or (None, InvalidAttribute) reporting error

Return type

tuple

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (str) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

pathlib.Path

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize string

Parameters

value (pathlib.Path) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (pathlib.Path) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

str
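
The to_builtin and from_builtin pair for a path attribute amounts to converting between a path object and its string form. The sketch below uses hypothetical helper names and pathlib.PurePosixPath (rather than pathlib.Path) so that it behaves identically on every platform. Passing None through unchanged is an assumption.

```python
import pathlib


def path_to_builtin(value):
    """Encode a path as a JSON/YAML-compatible string."""
    return None if value is None else str(value)


def path_from_builtin(json):
    """Decode a string back into a path object."""
    return None if json is None else pathlib.PurePosixPath(json)


original = pathlib.PurePosixPath('data/model.xlsx')
encoded = path_to_builtin(original)
decoded = path_from_builtin(encoded)
print(encoded, decoded == original)
```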

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (pathlib.Path) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.LongStringAttribute(min_length=0, max_length=4294967295, none=False, default='', default_cleaned_value='', none_value='', verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: obj_tables.core.StringAttribute

Long string attribute

Parameters
  • min_length (int, optional) – minimum length

  • max_length (int, optional) – maximum length

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (str, optional) – default value

  • default_cleaned_value (str, optional) – value to replace None values with during cleaning

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

class obj_tables.core.Manager(cls)[source]

Bases: object

Enable O(1) dictionary-based searching of a Model’s instances

This class is inspired by Django’s Manager class. An instance of Manager is associated with each Model and accessed as the class attribute objects (as in Django). The tuples of attributes to index are specified by the indexed_attrs_tuples attribute of Model.Meta, which contains a tuple of tuples of attributes to index. Models with empty indexed_attrs_tuples attributes incur no overhead from Manager.

Manager maintains a dictionary for each indexed attribute tuple, and a reverse index from each Model instance to its indexed attribute tuple keys.

These data structures support
  • O(1) get operations for Model instances indexed by an indexed attribute tuple

  • O(1) Model instance insert and update operations
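
The two data structures described above (an index per attribute tuple plus a reverse index per instance) can be sketched with the standard weakref module. IndexSketch and Record are hypothetical names. This is a simplified illustration under those assumptions and omits details such as the set of new instances and periodic garbage collection.

```python
import weakref


class Record:
    """Plain object; instances are hashable and weak-referenceable."""

    def __init__(self, id, name):
        self.id = id
        self.name = name


class IndexSketch:
    def __init__(self, indexed_attrs_tuples):
        self.indexed_attrs_tuples = tuple(indexed_attrs_tuples)
        # <attr names tuple> -> <attr values tuple> -> WeakSet of instances
        self._index_dicts = {attrs: {} for attrs in self.indexed_attrs_tuples}
        # instance -> <attr names tuple> -> <attr values tuple>
        self._reverse_index = weakref.WeakKeyDictionary()

    def upsert(self, obj):
        """Insert obj, or move it to the keys that match its current values."""
        old_keys = self._reverse_index.get(obj, {})
        new_keys = {}
        for attrs in self.indexed_attrs_tuples:
            if attrs in old_keys:
                # Drop the stale entry located through the reverse index
                self._index_dicts[attrs][old_keys[attrs]].discard(obj)
            values = tuple(getattr(obj, attr) for attr in attrs)
            self._index_dicts[attrs].setdefault(values, weakref.WeakSet()).add(obj)
            new_keys[attrs] = values
        self._reverse_index[obj] = new_keys

    def get(self, **kwargs):
        """Return a list of matches, or None if nothing matches."""
        for attrs in self.indexed_attrs_tuples:
            if set(attrs) == set(kwargs):
                values = tuple(kwargs[attr] for attr in attrs)
                matches = self._index_dicts[attrs].get(values)
                return list(matches) if matches else None
        raise ValueError('{} is not an indexed attribute tuple'.format(sorted(kwargs)))


index = IndexSketch([('id',), ('name',)])
first, second = Record('a', 'x'), Record('b', 'x')
index.upsert(first)
index.upsert(second)
first.name = 'y'
index.upsert(first)  # the caller must re-index after a change
print([record.id for record in index.get(name='x')])
```

Each upsert touches one dictionary entry per indexed attribute tuple, which is why its cost grows with the number of tuples rather than with the number of instances.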

cls[source]

the Model class which is being managed

Type

class

_new_instances[source]

set of all new instances of cls that have not been indexed, stored as weakrefs, so Models that are otherwise unused can be garbage collected

Type

WeakSet

_index_dicts[source]

indices that enable lookup of Model instances from their Meta.indexed_attrs_tuples mapping: <attr names tuple> -> <attr values tuple> -> WeakSet(<model_obj instances>)

Type

dict mapping tuple to WeakSet

_reverse_index[source]

a reverse index that provides all of each Model’s indexed attribute tuple keys mapping: <model_obj instances> -> <attr names tuple> -> <attr values tuple>

Type

WeakKeyDictionary mapping Model instance to dict

num_ops_since_gc[source]

number of operations since the last gc of weaksets

Type

int

Parameters

cls (class) – the Model class which is being managed

GC_PERIOD = 1000[source]
all()[source]

Provide all instances of the Model managed by this Manager

Returns

a list of all instances of the managed Model, or None if the Model is not indexed

Return type

list of Model

clear_new_instances()[source]

Clear the set of new instances that have not been inserted

get(**kwargs)[source]

Get the Model instance(s) that match the attribute name-value pair(s) in kwargs

The keys in kwargs must correspond to an entry in the Model’s indexed_attrs_tuples. Warning: this method is non-deterministic. To obtain Manager’s O(1) performance, Model instances in the index are stored in WeakSets. Therefore, the order of elements in the returned list is not reproducible. Applications that need reproducibility must deterministically order the elements of lists returned by this method.

Parameters

**kwargs – keyword args mapping from attribute name(s) to value(s)

Returns

a list of Model instances whose indexed attribute tuples have the values in kwargs; otherwise None, indicating no match

Return type

list of Model

Raises

ValueError – if no arguments are provided, or the attribute name(s) in kwargs.keys() do not correspond to an indexed attribute tuple of the Model
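
Because results come from sets with no guaranteed iteration order, a caller that needs reproducible output can sort on a stable key. The sketch below illustrates this with a standard-library WeakSet. The Item class and the choice of id as the sort key are assumptions for illustration.

```python
import weakref


class Item:
    def __init__(self, id):
        self.id = id


items = [Item('c'), Item('a'), Item('b')]  # strong references keep the items alive
matches = weakref.WeakSet(items)  # iteration order is not guaranteed

# Impose a reproducible order before using the results
ordered = sorted(matches, key=lambda item: item.id)
print([item.id for item in ordered])
```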

get_one(**kwargs)[source]

Get one Model instance that matches the attribute name-value pair(s) in kwargs

Uses get.

Parameters

**kwargs – keyword args mapping from attribute name(s) to value(s)

Returns

a Model instance whose indexed attribute tuples have the values in kwargs, or None if no Model satisfies the query

Return type

Model

Raises

ValueError – if get raises an exception, or if multiple instances match.
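
The contract described above (one match is returned, no match gives None, several matches raise ValueError) can be expressed as a thin wrapper over a list-returning lookup. get_one_sketch is a hypothetical helper that takes the list produced by such a lookup; it is not the library's code.

```python
def get_one_sketch(matches):
    """Reduce a list of matches (or None) to a single instance or None."""
    if not matches:
        return None
    if len(matches) > 1:
        raise ValueError('{} instances match the query'.format(len(matches)))
    return matches[0]


print(get_one_sketch(['only']))
print(get_one_sketch(None))
```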

insert_all_new()[source]

Insert all new instances of this Manager’s Model into the search indices

reset()[source]

Reset this Manager

Empty Manager’s indices. Since Manager globally indexes all instances of a Model, this method is useful when multiple models are loaded sequentially.

upsert(model_obj)[source]

Update the indices for model_obj that are used to search on indexed attribute tuples

Upsert means update or insert. Update the indices if model_obj is already stored, otherwise insert model_obj. Users of Manager are responsible for calling this method if model_obj changes.

Costs O(I) where I is the number of indexed attribute tuples for the Model.

Parameters

model_obj (Model) – a Model instance

upsert_all()[source]

Upsert the indices for all instances of this Manager’s Model

class obj_tables.core.ManyToManyAttribute(related_class, related_name='', default=[], default_cleaned_value=[], related_default=[], none_value=<class 'list'>, separator=', ', min_related=0, max_related=inf, min_related_rev=0, max_related_rev=inf, verbose_name='', verbose_related_name='', description='', related_manager=<class 'obj_tables.core.ManyToManyRelatedManager'>, cell_dialect=<CellDialect.json: 'json'>)[source]

Bases: obj_tables.core.ToManyAttribute, obj_tables.core.RelatedAttribute

Represents a many-to-many relationship between two types of objects.

related_manager[source]

related manager

Type

type

cell_dialect[source]

dialect for serializing values to a cell

Type

CellDialect

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • default (callable, optional) – function which returns the default values

  • default_cleaned_value (callable, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • related_default (callable, optional) – function which returns the default related values

  • none_value (object, optional) – none value

  • separator (str, optional) – element separator for serialization

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • max_related (int, optional) – maximum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • max_related_rev (int, optional) – maximum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

  • related_manager (type, optional) – related manager

  • cell_dialect (CellDialect, optional) – dialect for serializing values to a cell

copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (list of Model) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

list of Model

deserialize(values, objects, decoded=None)[source]

Deserialize value

Parameters
  • values (object) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

get_init_value(obj)[source]

Get initial value for attribute

Parameters

obj (Model) – object whose attribute is being initialized

Returns

initial value

Return type

object

Get initial related value for attribute

Parameters

obj (object) – object whose attribute is being initialized

Returns

initial value

Return type

value (object)

Raises

ValueError – if related property is not defined

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

related_validate(obj, value)[source]

Determine if value is a valid value of the related attribute

Parameters
  • obj (Model) – object being validated

  • value (list of Model) – value to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

serialize(value, encoded=None)[source]

Serialize related object

Parameters
  • value (list of Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_values (list) – value of the attribute

Returns

value of the attribute

Return type

list

Raises

ValueError – if related property is not defined

set_value(obj, new_values)[source]

Set value of attribute of object

Parameters
  • obj (Model) – object

  • new_values (list) – new attribute value

Returns

new attribute value

Return type

list

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (list of Model) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None
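
One part of validating a to-many value is checking the number of related objects against min_related and max_related. The sketch below shows only that count check. check_related_count is a hypothetical function, its messages are assumptions, and it returns plain strings rather than an InvalidAttribute.

```python
import math


def check_related_count(values, min_related=0, max_related=math.inf):
    """Return None if the count is within bounds, otherwise a list of messages."""
    errors = []
    count = len(values)
    if count < min_related:
        errors.append('There must be at least {} related values'.format(min_related))
    if count > max_related:
        errors.append('There cannot be more than {} related values'.format(max_related))
    return errors or None


print(check_related_count(['a', 'b'], min_related=1, max_related=2))
print(check_related_count([], min_related=1))
```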

class obj_tables.core.ManyToManyRelatedManager(object, attribute, related=True)[source]

Bases: obj_tables.core.RelatedManager

Represent values and related values of related attributes

Parameters
  • object (Model) – model instance

  • attribute (Attribute) – attribute

  • related (bool, optional) – is related attribute

append(value, propagate=True)[source]

Add value to list

Parameters
  • value (object) – value

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager

cut(kind=None)[source]

Cut values and their children of kind kind into separate graphs.

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters

kind (str, optional) – kind of children to include

Returns

cut values and their children

Return type

list of Model

remove(value, update_list=True, propagate=True)[source]

Remove value from list

Parameters
  • value (object) – value

  • update_list (bool, optional) – update list

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager
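
The propagate flag on append and remove suggests that a change on one side of a relationship is mirrored on the other side exactly once. The sketch below illustrates that pattern with a list subclass. RelatedListSketch and Node are hypothetical, and the update_list option is omitted.

```python
class RelatedListSketch(list):
    """List that mirrors changes onto the other side of a relationship."""

    def __init__(self, owner, related_name):
        super().__init__()
        self.owner = owner
        self.related_name = related_name

    def append(self, value, propagate=True):
        if value in self:
            return self
        super().append(value)
        if propagate:
            # Update the reverse side once; propagate=False stops the recursion
            getattr(value, self.related_name).append(self.owner, propagate=False)
        return self

    def remove(self, value, propagate=True):
        super().remove(value)
        if propagate:
            getattr(value, self.related_name).remove(self.owner, propagate=False)
        return self


class Node:
    def __init__(self, name):
        self.name = name
        self.children = RelatedListSketch(self, 'parents')
        self.parents = RelatedListSketch(self, 'children')


parent, child = Node('parent'), Node('child')
parent.children.append(child)
print([node.name for node in child.parents])
```

Returning self from both methods matches the documented return value and allows calls to be chained.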

class obj_tables.core.ManyToOneAttribute(related_class, related_name='', default=None, default_cleaned_value=None, related_default=[], none_value=None, min_related=0, min_related_rev=0, max_related_rev=inf, verbose_name='', verbose_related_name='', description='', related_manager=<class 'obj_tables.core.ManyToOneRelatedManager'>)[source]

Bases: obj_tables.core.RelatedAttribute

Represents a many-to-one relationship between two types of objects. This is analogous to a foreign key relationship in a database.

related_manager[source]

related manager

Type

type

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • default (callable, optional) – callable which returns the default value

  • default_cleaned_value (callable, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • related_default (callable, optional) – callable which returns the default related value

  • none_value (object, optional) – none value

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • max_related_rev (int, optional) – maximum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

  • related_manager (type, optional) – related manager
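
The foreign-key analogy can be illustrated without the library: each child holds a single reference to its parent, and the parent exposes the reverse list of children. The Parent and Child classes below are hypothetical and use a property to keep both sides in sync. They are a sketch of the relationship, not the library's mechanism.

```python
class Parent:
    def __init__(self, id):
        self.id = id
        self.children = []  # reverse side of the relationship


class Child:
    def __init__(self, id, parent=None):
        self.id = id
        self._parent = None
        self.parent = parent  # goes through the setter below

    @property
    def parent(self):
        return self._parent

    @parent.setter
    def parent(self, new_parent):
        # Detach from the old parent, then attach to the new one
        if self._parent is not None:
            self._parent.children.remove(self)
        self._parent = new_parent
        if new_parent is not None:
            new_parent.children.append(self)


first, second = Parent('p1'), Parent('p2')
child = Child('c1', parent=first)
child.parent = second
print(len(first.children), len(second.children))
```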

copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (Model) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

Model

deserialize(value, objects, decoded=None)[source]

Deserialize value

Parameters
  • value (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

deserialize_from_cell(value, objects, decoded=None)[source]

Deserialize value from cell

Parameters
  • value (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

Model

Get initial related value for attribute

Parameters

obj (object) – object whose attribute is being initialized

Returns

initial value

Return type

value (object)

Raises

ValueError – if related property is not defined

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

related_validate(obj, value)[source]

Determine if value is a valid value of the related attribute

Parameters
  • obj (Model) – object being validated

  • value (list of Model) – value to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

serialize(value, encoded=None)[source]

Serialize related object

Parameters
  • value (Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

serialize_to_cell(value, encoded=None)[source]

Serialize related object

Parameters
  • value (Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

string representation

Return type

str

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_values (list) – value of the attribute

Returns

value of the attribute

Return type

list

Raises

ValueError – if related property is not defined

set_value(obj, new_value)[source]

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_value (Model) – new attribute value

Returns

new attribute value

Return type

Model

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (Model) – value of attribute to validate

Returns

None if attribute is valid, otherwise a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.ManyToOneRelatedManager(object, attribute)[source]

Bases: obj_tables.core.RelatedManager

Represent values of related attributes

Parameters
  • object (Model) – model instance

  • attribute (Attribute) – attribute

append(value, propagate=True)[source]

Add value to list

Parameters
  • value (object) – value

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager

cut(kind=None)[source]

Cut values and their children of kind kind into separate graphs.

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters

kind (str, optional) – kind of children to include

Returns

cut values and their children

Return type

list of Model

remove(value, update_list=True, propagate=True)[source]

Remove value from list

Parameters
  • value (object) – value

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager

class obj_tables.core.Model(_comments=None, **kwargs)[source]

Bases: object

Base object model

_source[source]

file location, worksheet, column, and row where the object was defined

Type

ModelSource

_comments[source]

comments

Type

list of str

Class attributes:

objects (Manager): a Manager that supports searching for Model instances

Parameters

**kwargs – dictionary of keyword arguments with keys equal to the names of the model attributes

Raises

TypeError – if keyword argument is not a defined attribute
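As a rough illustration of the constructor contract described above (keyword arguments must name defined attributes, otherwise TypeError is raised), the following standalone sketch mimics the behavior without importing obj_tables. SketchModel and its ATTRIBUTES tuple are hypothetical names used only for this example; the library's own bookkeeping is not shown here.

```python
class SketchModel:
    """Hypothetical stand-in for a Model subclass; not the obj_tables implementation."""

    # Assumption: the names of the attributes that the model defines.
    ATTRIBUTES = ("id", "name")

    def __init__(self, _comments=None, **kwargs):
        self._comments = list(_comments or [])
        # Give every defined attribute an initial value.
        for attr_name in self.ATTRIBUTES:
            setattr(self, attr_name, None)
        # Reject keyword arguments that do not name a defined attribute.
        for attr_name, value in kwargs.items():
            if attr_name not in self.ATTRIBUTES:
                raise TypeError(f"'{attr_name}' is not a defined attribute")
            setattr(self, attr_name, value)


obj = SketchModel(id="m1", name="first")

try:
    SketchModel(colour="red")
    rejected = False
except TypeError:
    rejected = True
```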

DEFAULT_INDENT = 3[source]
DEFAULT_MAX_DEPTH = 2[source]
class Meta[source]

Bases: object

Meta data for Model

attributes[source]

attributes

Type

collections.OrderedDict of str, Attribute

related_attributes[source]

attributes declared in related objects

Type

collections.OrderedDict of str, Attribute

local_attributes[source]

dictionary that maps the names of all local attributes to their instances, including attributes defined in this class and attributes defined in related classes

Type

collections.OrderedDict of str, Attribute

primary_attribute[source]

attribute with primary = True

Type

Attribute

unique_together[source]

controls what tuples of attribute values must be unique

Type

tuple of tuples of attribute names

indexed_attrs_tuples[source]

tuples of attributes on which instances of this Model will be indexed by the Model’s Manager

Type

tuple of tuples of attribute names

attribute_order[source]

tuple of attribute names, in the order in which they should be displayed

Type

tuple of str

verbose_name[source]

verbose name to refer to an instance of the model

Type

str

verbose_name_plural[source]

plural verbose name for multiple instances of the model

Type

str

description[source]

description of the model (e.g., to print in the table of contents in XLSX)

Type

str

table_format[source]

orientation of model objects in table (e.g. XLSX)

Type

TableFormat

frozen_columns[source]

number of XLSX columns to freeze

Type

int

inheritance[source]

tuple of all superclasses

Type

tuple of class

ordering[source]

controls the order in which objects should be printed when serialized

Type

tuple of attribute names

children[source]

dictionary that maps types of children to names of attributes which compose each type of children

Type

dict that maps str to tuple of str

merge[source]

type of merging operation

Type

ModelMerge

attribute_order = ()[source]
attributes = None[source]
children = {}[source]
description = ''[source]
frozen_columns = 1[source]
indexed_attrs_tuples = ()[source]
inheritance = None[source]
merge = 1[source]
ordering = None[source]
primary_attribute = None[source]
related_attributes = None[source]
table_format = 1[source]
unique_together = ()[source]
verbose_name = ''[source]
verbose_name_plural = ''[source]
__enter__()[source]

Enter context

__exit__(type, value, traceback)[source]

Exit context

__setattr__(attr_name, value, propagate=True)[source]

Set attribute and validate any unique attribute constraints

Parameters
  • attr_name (str) – attribute name

  • value (object) – value

  • propagate (bool, optional) – propagate change through attribute set_value and set_related_value

__str__()[source]

Get the string representation of an object

Returns

string representation of object

Return type

str

classmethod are_attr_paths_equal(attr_path, other_attr_path)[source]

Determine if two attribute paths are semantically equal

Parameters
  • attr_path (list of list of object) – the path to an attribute or nested attribute of a model

  • other_attr_path (list of list of object) – the path to another attribute or nested attribute of a model

Returns

True if the paths are semantically equal

Return type

bool

Determine if the immediate related attributes of the class can be serialized

Returns

True if the related attributes can be serialized

Return type

bool

clean()[source]

Clean all of this Model’s attributes

Returns

None if the object is valid; otherwise, a list of errors as an instance of InvalidObject

Return type

InvalidObject or None

copy()[source]

Create a copy

Returns

model copy

Return type

Model

cut(kind=None)[source]

Cut the object and its children from the rest of the object graph.

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters

kind (str, optional) – kind of children to get

Returns

same object, but cut from the rest of the object graph

Return type

Model

cut_relations(objs_to_keep=None)[source]

Cut relations to objects not in objs_to_keep.

Parameters

objs_to_keep (set of Model, optional) – objects to retain relations to

classmethod deserialize(value, objects)[source]

Deserialize value

Parameters
  • value (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

difference(other, tol=0.0)[source]

Get the semantic difference between two models

Parameters
  • other (Model) – other Model

  • tol (float, optional) – equality tolerance

Returns

difference message

Return type

str

static from_dict(json, models, decode_primary_objects=True, primary_objects=None, decoded=None, ignore_extra_models=False, validate=False, output_format=None)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of an object that is compatible with JSON and YAML, including references to objects through __id keys.

Parameters
  • json (dict) – simple Python representation of the object

  • decode_primary_objects (bool, optional) – if True, decode primary classes; otherwise, just look up objects by their IDs

  • primary_objects (list, optional) – list of instances of primary classes (i.e. non-line classes)

  • decoded (dict, optional) – dictionary of objects that have already been decoded

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • validate (bool, optional) – if True, validate the data

  • output_format (str, optional) –

    desired structure of the return value

    • None: Return the data with the same structure as json. Do not reshape the data.

    • list: List of instances of Model.

    • dict: Dictionary that maps subclasses of Model to the instances of each subclass.

Returns

decoded object

Return type

Model

gen_merge_map(other)[source]

Create a dictionary that maps instances of objects in another model to objects in a model

Parameters

other (Model) – other model

Returns

  • dict: dictionary that maps instances of objects in another model to objects in a model

  • list: list of instances of objects in another model which have no parallel in the model

Return type

tuple

gen_serialized_val_obj_map()[source]

Generate mappings from serialized values to objects

Returns

dictionary which maps types of models to dictionaries which map serialized values to objects

Return type

dict

Raises

ValueError – if serialized values are not unique within each type
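The shape of the returned mapping, and the uniqueness check behind the ValueError, can be sketched without the library. This is an illustration under stated assumptions, not the obj_tables implementation: Record, Gene, and Protein are hypothetical classes, and the serialized value is assumed to be a plain id string.

```python
from collections import defaultdict


class Record:
    """Hypothetical stand-in for a Model whose serialized value is its id."""

    def __init__(self, id):
        self.id = id

    def serialize(self):
        return self.id


class Gene(Record):
    pass


class Protein(Record):
    pass


def gen_serialized_val_obj_map(objs):
    """Map each type to a dictionary from serialized value to object."""
    mapping = defaultdict(dict)
    for obj in objs:
        key = obj.serialize()
        by_value = mapping[type(obj)]
        # Serialized values only need to be unique within one type.
        if key in by_value:
            raise ValueError(
                f"serialized value {key!r} is not unique within {type(obj).__name__}"
            )
        by_value[key] = obj
    return dict(mapping)


gene = Gene("x1")
protein = Protein("x1")  # same value, different type: allowed
mapping = gen_serialized_val_obj_map([gene, protein])

try:
    gen_serialized_val_obj_map([gene, Gene("x1")])
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```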

Optimally obtain all objects related to objects in objs

The set of all Models can be viewed as a graph whose nodes are Model instances and whose edges are related connections. Because related edges are bi-directional, this graph is a set of strongly connected components and no edges connect the components.

The algorithm here finds all Models that are reachable from a set of instances in \(O(n)\), where \(n\) is the size of the reachable set. This algorithm is optimal. It achieves this performance because obj.get_related() takes \(O(n(c))\) where \(n(c)\) is the number of nodes in the component containing obj, and each component is only explored once because all of a component’s nodes are stored in found_objs when the component is first explored.

In addition, this method is deterministic because ordered dictionaries preserve insertion order.

Parameters
  • objs (iterator of Model) – some objects

  • forward (bool, optional) – if True, get all forward related objects

  • reverse (bool, optional) – if True, get all reverse related objects

Returns

all objects in objs and all objects related to them, without any duplicates

Return type

list of Model
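The traversal described above can be sketched on a plain graph. The code below is a minimal illustration, not the library's code: Node and connect are hypothetical helpers, the forward and reverse switches are omitted, and a regular dict stands in for the ordered dictionary because it also preserves insertion order. Each component is explored once because objects already in found_objs are skipped.

```python
class Node:
    """Hypothetical stand-in for a Model instance with bi-directional relations."""

    def __init__(self, name):
        self.name = name
        self.related = []  # neighbours, in insertion order


def connect(left, right):
    # Related edges are bi-directional, so record both directions.
    left.related.append(right)
    right.related.append(left)


def get_related(obj, found_objs):
    """Add every node reachable from obj to found_objs (an insertion-ordered dict)."""
    stack = [obj]
    while stack:
        node = stack.pop()
        if node in found_objs:
            continue
        found_objs[node] = None
        # Push in reverse so neighbours are visited in their listed order.
        stack.extend(reversed(node.related))


def get_all_related(objs):
    """Return objs plus everything reachable from them, without duplicates."""
    found_objs = {}
    for obj in objs:
        # Skip objects whose component has already been explored.
        if obj not in found_objs:
            get_related(obj, found_objs)
    return list(found_objs)


a, b, c, d, e, f = (Node(name) for name in "abcdef")
connect(a, b)
connect(b, c)
connect(d, e)

reachable = get_all_related([a, d, b])
names = [node.name for node in reachable]
```

Here f is never returned because it is not connected to any of the starting objects, and b does not trigger a second exploration because its component was already visited from a.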

classmethod get_attr_index(attr)[source]

Get the index of an attribute within Meta.attribute_order

Parameters

attr (Attribute) – attribute

Returns

index of attribute within Meta.attribute_order

Return type

int

classmethod get_attrs(type=None, forward=True, reverse=True)[source]

Get attributes of a type, optionally including attributes from related classes. By default, return all attributes.

Parameters
  • type (type or tuple of type, optional) – type of attributes to get

  • forward (bool, optional) – if True, include attributes from class

  • reverse (bool, optional) – if True, include attributes from related classes

Returns

dictionary of the names and instances of matching attributes

Return type

dict of str, Attribute

get_attrs_by_val(type=None, reverse=True, include=None, exclude=None)[source]

Get attributes whose type is type and values are in include and not exclude, optionally including attributes from related classes. By default, get all attributes.

Parameters
  • type (type or tuple of type, optional) – type of attributes to get

  • reverse (bool, optional) – if True, include attributes from related classes

  • include (list, optional) – list of values to filter for

  • exclude (list, optional) – list of values to filter out

Returns

dictionary of the names and instances of matching attributes

Return type

dict of str, Attribute

get_children(kind=None, _Model__type=None, recursive=True, **kwargs)[source]

Get a kind of children.

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters
  • kind (str, optional) – kind of children to get

  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • recursive (bool, optional) – if True, get children recursively

  • **kwargs – dictionary of attribute name/value pairs

Returns

children

Return type

list of Model

get_empty_literal_attrs()[source]

Get empty (None, ‘’, or NaN) literal attributes

Returns

dictionary of the names and instances of empty literal attributes

Return type

dict of str, Attribute

Get empty (None or []) related attributes

Parameters

reverse (bool, optional) – if True, include attributes from related classes

Returns

dictionary of the names and instances of empty related attributes

Return type

dict of str, Attribute

get_immediate_children(kind=None, _Model__type=None, **kwargs)[source]

Get a kind of immediate children

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters
  • kind (str, optional) – kind of children to get

  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • **kwargs – dictionary of attribute name/value pairs

Returns

immediate children

Return type

list of Model

classmethod get_literal_attrs()[source]

Get literal attributes

Returns

dictionary of the names and instances of literal attributes

Return type

dict of str, Attribute

classmethod get_manager()[source]

Get the manager for the model

Returns

manager

Return type

Manager

classmethod get_nested_attr(attr_path)[source]

Get an attribute or a nested attribute of a model

Parameters

attr_path (list of list of str) – the path to an attribute or nested attribute of a model

Returns

nested attribute

Return type

Attribute

get_nested_attr_val(attr_path)[source]

Get the value of an attribute or a nested attribute of a model

Parameters

attr_path (list of list of object) – the path to an attribute or nested attribute of a model

Returns

value of the attribute or nested attribute

Return type

object

get_non_empty_literal_attrs()[source]

Get non-empty literal attributes (those whose values are not None, ‘’, or NaN)

Returns

dictionary of the names and instances of non-empty literal attributes

Return type

dict of str, Attribute

Get non-empty related attributes (those whose values are not None or [])

Parameters

reverse (bool, optional) – if True, include attributes from related classes

Returns

dictionary of the names and instances of non-empty related attributes

Return type

dict of str, Attribute

get_primary_attribute()[source]

Get value of primary attribute

Returns

value of primary attribute

Return type

object

Get all related objects reachable from self

Parameters
  • forward (bool, optional) – if True, get all forward related objects

  • reverse (bool, optional) – if True, get all reverse related objects

Returns

related objects, without any duplicates

Return type

list of Model

Get related attributes

Parameters

reverse (bool, optional) – if True, include attributes from related classes

Returns

dictionary of the names and instances of related attributes

Return type

dict of str, Attribute

classmethod get_sort_key(object, attr_name)[source]

Get sort key for Model instance object based on cls.Meta.ordering

Parameters
  • object (Model) – Model instance

  • attr_name (str) – attribute name

Returns

sort key for object

Return type

object

get_source(attr_name)[source]

Get file location of attribute with name attr_name

Provide the type, filename, worksheet, row, and column of attr_name. Row and column use 1-based counting. Column is provided in XLSX format if the file was a spreadsheet.

Parameters

attr_name (str) – attribute name

Returns

type, basename, worksheet, row, column

Return type

tuple

Raises

ValueError – if the location of attr_name is unknown

has_attr_vals(_Model__type=None, _Model__check_attr_defined=True, **kwargs)[source]

Check if the type and values of the attributes of an object match a set of conditions

Parameters
  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • __check_attr_defined (bool, optional) – if True, raise an exception if the queried attribute is not defined

  • **kwargs – dictionary of attribute name/value pairs to find matching object or create new object

Returns

True if the object is an instance of __type and the values of the attributes of the object match kwargs

Return type

bool

is_equal(other, tol=0.0)[source]

Determine whether two models are semantically equal

Parameters
  • other (Model) – object to compare

  • tol (float, optional) – equality tolerance

Returns

True if objects are semantically equal, else False

Return type

bool

classmethod is_serializable()[source]

Determine if the class (and its related classes) can be serialized

Returns

True if the class can be serialized

Return type

bool

merge(other, normalize=True, validate=True)[source]

Merge another model into a model

Parameters
  • other (Model) – other model

  • normalize (bool, optional) – if True, normalize models and merged model

  • validate (bool, optional) – if True, validate models and merged model

merge_attrs(other, other_objs_in_self, self_objs_in_other)[source]

Merge attributes of two objects

Parameters
  • other (Model) – other model

  • other_objs_in_self (dict) – dictionary that maps instances of objects in another model to objects in a model

  • self_objs_in_other (dict) – dictionary that maps instances of objects in a model to objects in another model

normalize()[source]

Normalize an object into a canonical form. Specifically, this method sorts the RelatedManagers into a canonical order because their order has no semantic meaning. Importantly, this canonical form is reproducible. Thus, this canonical form facilitates reproducible computations on top of Model objects.

pformat(max_depth=2, indent=3)[source]

Return a human-readable string representation of this Model.

Follows the graph of related Models up to a depth of max_depth. Models at depth max_depth+1 are represented by ‘<class name>: …’, while deeper Models are not traversed or printed. Re-encountered Models are not printed again and are marked with ‘--’. Attributes that are related or iterable are indented.

For example, we have:

Model1_classname:       # Each model starts with its classname, followed by a list of
    attr1: value1           # attribute names & values.
    attr2: value2
    attr3:                  # Reference attributes can point to other Models; we indent these under the attribute name
        Model2_classname:   # Reference attribute attr3 contains Model2;
            ...                 # its attributes follow.
    attr4:
        Model3_classname:   # An iteration over reference attributes is a list at constant indentation:
            ...
    attr5:
        Model2_classname: --    # Traversing the Model network may re-encounter a Model; they're listed with '--'
    attr6:
        Model5_classname:
            attr7:
                Model5_classname: ...   # The size of the output is controlled with max_depth;
                                        # models encountered at depth = max_depth+1 are shown with '...'
Parameters
  • max_depth (int, optional) – the maximum depth to which related Models should be printed

  • indent (int, optional) – number of spaces to indent

Returns

readable string representation of this Model

Return type

str
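The depth limit and the handling of re-encountered objects can be sketched as follows. This is a simplified illustration that assumes the top-level object sits at depth 1, as the example above suggests; Node and its attrs dictionary are hypothetical, and the exact layout produced by the library may differ.

```python
class Node:
    """Hypothetical stand-in for a Model instance."""

    def __init__(self, name, **attrs):
        self.name = name
        self.attrs = attrs  # attribute name -> plain value or another Node


def pformat(node, max_depth=2, indent=3):
    lines = []
    seen = set()

    def visit(obj, depth, pad):
        if id(obj) in seen:
            # Re-encountered objects are not printed again.
            lines.append(f"{pad}{obj.name}: --")
            return
        if depth > max_depth:
            # Objects just past the depth limit are abbreviated.
            lines.append(f"{pad}{obj.name}: ...")
            return
        seen.add(id(obj))
        lines.append(f"{pad}{obj.name}:")
        attr_pad = pad + " " * indent
        for attr_name, value in obj.attrs.items():
            if isinstance(value, Node):
                # Related objects are indented under the attribute name.
                lines.append(f"{attr_pad}{attr_name}:")
                visit(value, depth + 1, attr_pad + " " * indent)
            else:
                lines.append(f"{attr_pad}{attr_name}: {value}")

    visit(node, 1, "")
    return "\n".join(lines)


leaf = Node("Leaf", x=1)
child = Node("Child", leaf=leaf)
root = Node("Root", id="r", child=child, again=child)
child.attrs["parent"] = root  # creates a cycle back to the root

text = pformat(root, max_depth=2, indent=3)
```

With max_depth=2, Leaf sits at depth 3 and is abbreviated, while the second references to Child and Root are marked with '--'.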

pprint(stream=None, max_depth=2, indent=3)[source]
serialize()[source]

Get value of primary attribute

Returns

value of primary attribute

Return type

str

set_nested_attr_val(attr_path, value)[source]

Set the value of an attribute or a nested attribute of a model

Parameters
  • attr_path (list of list of object) – the path to an attribute or nested attribute of a model

  • value (object) – new value

Returns

the same model with the value of an attribute modified

Return type

Model

set_source(path_name, sheet_name, attribute_seq, row, table_id=None)[source]

Set metadata about source of the file, worksheet, columns, and row where the object was defined

Parameters
  • path_name (str) – pathname of source file for object

  • sheet_name (str) – name of spreadsheet containing source data for object

  • attribute_seq (list) – sequence of attribute names in source file; blank values indicate attributes that were ignored

  • row (int) – row number of object in its source file

  • table_id (str, optional) – id of the source table

classmethod sort(objects)[source]

Sort list of Model objects

Parameters

objects (list of Model) – list of objects

Returns

sorted list of objects

Return type

list of Model
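Sorting by an ordering tuple can be sketched as below. The sketch assumes that the ordering is a tuple of attribute names compared left to right, which matches the description of Meta.ordering but is not confirmed against the library's code; Reaction and its class-level ordering are hypothetical.

```python
class Reaction:
    """Hypothetical model-like class for this sketch."""

    # Assumed analogue of Meta.ordering: sort by compartment, then by id.
    ordering = ("compartment", "id")

    def __init__(self, id, compartment):
        self.id = id
        self.compartment = compartment

    @classmethod
    def get_sort_key(cls, obj, attr_name):
        # Sort key contributed by a single attribute.
        return getattr(obj, attr_name)

    @classmethod
    def sort(cls, objects):
        # Combine the per-attribute keys into one tuple per object.
        return sorted(
            objects,
            key=lambda obj: tuple(cls.get_sort_key(obj, name) for name in cls.ordering),
        )


reactions = [Reaction("r2", "c"), Reaction("r1", "m"), Reaction("r3", "c")]
sorted_ids = [rxn.id for rxn in Reaction.sort(reactions)]
```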

static to_dict(object, models=None, encode_primary_objects=True, encoded=None)[source]

Encode an instance of Model or a collection of instances of Model using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML. Use __id keys to avoid infinite recursion by encoding each object once and referring to objects by their __id for each repeated reference.

Parameters
  • object (object) – instance of Model or a collection (dict, list, tuple, or nested combination of dict, list, and tuple) of instances of Model

  • models (str, optional) – list of models to encode into JSON

  • encode_primary_objects (bool, optional) – if True, encode primary classes; otherwise, just encode their IDs

  • encoded (dict, optional) – objects that have already been encoded and their assigned JSON identifiers

Returns

simple Python representation of the object

Return type

dict
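The role of the __id keys can be illustrated with a small cyclic graph. The sketch below is not the library's encoder: Node is hypothetical, the identifiers are simple counters, and the exact keys that obj_tables writes for repeated references (beyond __id) are an assumption.

```python
import json


class Node:
    """Hypothetical stand-in for a Model with a parent/children relationship."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []


def to_dict(obj, encoded=None):
    if encoded is None:
        encoded = {}
    if id(obj) in encoded:
        # Already encoded: emit only a reference, which breaks the cycle.
        return {"__id": encoded[id(obj)]}
    obj_id = len(encoded)
    encoded[id(obj)] = obj_id
    result = {"__type": type(obj).__name__, "__id": obj_id}
    for attr_name, value in vars(obj).items():
        if isinstance(value, Node):
            result[attr_name] = to_dict(value, encoded)
        elif isinstance(value, list):
            result[attr_name] = [
                to_dict(item, encoded) if isinstance(item, Node) else item
                for item in value
            ]
        else:
            result[attr_name] = value
    return result


parent = Node("p")
child = Node("c")
child.parent = parent
parent.children.append(child)

data = to_dict(parent)
round_tripped = json.loads(json.dumps(data))
```

The child refers back to its parent through {"__id": 0} instead of re-encoding it, so the result is finite and can be written as JSON.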

validate()[source]

Determine if the object is valid

Returns

None if the object is valid; otherwise, a list of errors as an instance of InvalidObject

Return type

InvalidObject or None

Validate attribute values

Raises

ValueError – if related attributes are not valid (e.g. if a class that is the subject of a relationship does not have a primary attribute)

classmethod validate_unique(objects)[source]

Validate attribute uniqueness

Parameters

objects (list of Model) – list of objects

Returns

list of invalid attributes and their errors

Return type

InvalidModel or None
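A uniqueness check of this kind can be sketched with counters. The function below is an illustration only: find_duplicates, its parameters, and the returned list of tuples are hypothetical and do not reproduce the InvalidModel structure; it assumes that uniqueness applies both to single attributes and to the tuples named in unique_together.

```python
from collections import Counter
from types import SimpleNamespace


def find_duplicates(objects, unique_attrs=(), unique_together=()):
    """Return (attribute(s), duplicated values) pairs, or None if all values are unique."""
    errors = []
    for attr_name in unique_attrs:
        counts = Counter(getattr(obj, attr_name) for obj in objects)
        duplicates = sorted(value for value, n in counts.items() if n > 1)
        if duplicates:
            errors.append((attr_name, duplicates))
    for attr_names in unique_together:
        counts = Counter(
            tuple(getattr(obj, name) for name in attr_names) for obj in objects
        )
        duplicates = sorted(value for value, n in counts.items() if n > 1)
        if duplicates:
            errors.append((attr_names, duplicates))
    return errors or None


people = [
    SimpleNamespace(id="a", first="x", last="y"),
    SimpleNamespace(id="b", first="x", last="y"),
    SimpleNamespace(id="a", first="x", last="z"),
]
errors = find_duplicates(
    people, unique_attrs=("id",), unique_together=(("first", "last"),)
)
no_errors = find_duplicates(people[:1], unique_attrs=("id",))
```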

class obj_tables.core.ModelMerge[source]

Bases: int, enum.Enum

Types of model merging operations

append = 2[source]
join = 1[source]
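Because ModelMerge mixes in int, its members compare equal to plain integers, which presumably explains why the Meta default is rendered as merge = 1 (that is, join). The sketch below re-declares the enumeration locally from the values listed above instead of importing it from obj_tables.

```python
import enum


class ModelMerge(int, enum.Enum):
    """Local re-declaration of the enumeration, using the values listed above."""

    join = 1
    append = 2


# Look up a member from its integer value, as a default of 1 would imply.
default_merge = ModelMerge(1)
```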
class obj_tables.core.ModelMeta[source]

Bases: type

Parameters
  • metacls (Model) – Model, or a subclass of Model

  • name (str) – Model class name

  • bases (tuple) – tuple of superclasses

  • namespace (dict) – namespace of Model class definition

Returns

a new instance of Model, or a subclass of Model

Return type

Model

static __new__(metacls, name, bases, namespace)[source]
Parameters
  • metacls (Model) – Model, or a subclass of Model

  • name (str) – Model class name

  • bases (tuple) – tuple of superclasses

  • namespace (dict) – namespace of Model class definition

Returns

a new instance of Model, or a subclass of Model

Return type

Model

create_model_manager()[source]

Create a Manager for this Model

The Manager is accessed via a Model’s objects attribute

Parameters

cls (type) – the Model class which is being managed

init_attribute_order()[source]

Initialize the order in which the attributes should be printed across XLSX columns

init_attributes()[source]

Initialize attributes

init_inheritance()[source]

Create tuple of this model and superclasses which are subclasses of Model

init_ordering()[source]

Initialize how to sort objects

init_primary_attribute()[source]

Initialize the primary attribute of a model

Initialize related attributes

init_verbose_names()[source]

Initialize the singular and plural verbose names of a model

normalize_attr_tuples(attribute)[source]

Normalize a tuple of tuples of attribute names

Parameters

attribute (str) – the name of the attribute to validate and normalize

static normalize_tuple_of_tuples_of_attribute_names(tuple_of_tuples_of_attribute_names)[source]

Normalize a tuple of tuples of attribute names by sorting each member tuple

Enables simple indexing and searching of tuples

Parameters

tuple_of_tuples_of_attribute_names (tuple) – a tuple of tuples of attribute names

Returns

a tuple of sorted tuples of attribute names

Return type

tuple
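Sorting each member tuple means that the same set of attribute names always yields the same key, whatever order it was written in. A minimal sketch of that idea follows; the function name is shortened and hypothetical, and whether the library also reorders the outer tuple is not stated, so the outer order is left unchanged here.

```python
def normalize_tuple_of_tuples(tuple_of_tuples):
    """Sort the attribute names inside each member tuple."""
    return tuple(tuple(sorted(names)) for names in tuple_of_tuples)


normalized = normalize_tuple_of_tuples((("last", "first"), ("id",)))

# A lookup key written in any order matches once it is sorted the same way.
query = tuple(sorted(("last", "first")))
found = query in normalized
```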

classmethod validate_attr_tuples(name, bases, namespace, meta_attribute_name)[source]

Validate a tuple of tuples of attribute names

Parameters

meta_attribute_name (str) – the name of the attribute to validate and normalize

Raises

ValueError – if attributes are not valid

classmethod validate_attribute_inheritance(name, bases, namespace)[source]

Check attribute inheritance

Raises

ValueError – if a subclass overrides a superclass attribute (an instance of Attribute) with an incompatible attribute (i.e. an attribute whose class is not a subclass of the class of the superclass’s attribute)
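The compatibility rule can be sketched with a plain isinstance check. The attribute classes below are hypothetical placeholders declared for this example, and validate_override is not a function of the library.

```python
class Attribute:
    """Hypothetical base attribute class for this sketch."""


class TextAttribute(Attribute):
    """Hypothetical text attribute."""


class SlugAttribute(TextAttribute):
    """Hypothetical specialization of TextAttribute."""


class NumberAttribute(Attribute):
    """Hypothetical numeric attribute, unrelated to TextAttribute."""


def validate_override(attr_name, super_attr, sub_attr):
    # The overriding attribute must be an instance of the overridden attribute's class.
    if not isinstance(sub_attr, type(super_attr)):
        raise ValueError(
            f"attribute '{attr_name}' is overridden by an incompatible "
            f"{type(sub_attr).__name__}"
        )


# Compatible: SlugAttribute is a subclass of TextAttribute.
validate_override("id", TextAttribute(), SlugAttribute())

try:
    validate_override("id", TextAttribute(), NumberAttribute())
    incompatible_rejected = False
except ValueError:
    incompatible_rejected = True
```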

classmethod validate_attributes(name, bases, namespace)[source]

Validate attribute values

Raises

ValueError – if attributes are not valid

classmethod validate_meta(name, bases, namespace)[source]
classmethod validate_primary_attribute(name, bases, namespace)[source]

Check the attributes

Raises

ValueError – if there are multiple primary attributes

Check the related attributes

Raises

ValueError – if a OneToManyAttribute or ManyToOneAttribute has a related_name equal to its name

class obj_tables.core.ModelSource(path_name, sheet_name, attribute_seq, row, table_id=None)[source]

Bases: object

Represents the file, sheet, columns, and row where a Model instance was defined

path_name[source]

pathname of source file for object

Type

str

sheet_name[source]

name of spreadsheet containing source data for object

Type

str

attribute_seq[source]

sequence of attribute names in source file; blank values indicate attributes that were ignored

Type

list

row[source]

row number of object in its source file

Type

int

table_id[source]

id of the source table

Type

str

Parameters
  • path_name (str) – pathname of source file for object

  • sheet_name (str) – name of spreadsheet containing source data for object

  • attribute_seq (list) – sequence of attribute names in source file; blank values indicate attributes that were ignored

  • row (int) – row number of object in its source file

  • table_id (str, optional) – id of the source table

class obj_tables.core.NumericAttribute(init_value=None, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: obj_tables.core.LiteralAttribute

Base class for numeric literal attributes (float, integer)

Parameters
  • init_value (object, optional) – initial value

  • default (object, optional) – default value

  • default_cleaned_value (object, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

exception obj_tables.core.ObjTablesWarning[source]

Bases: UserWarning

ObjTables warning

class obj_tables.core.OneToManyAttribute(related_class, related_name='', default=[], default_cleaned_value=[], related_default=None, none_value=<class 'list'>, separator=', ', min_related=0, max_related=inf, min_related_rev=0, verbose_name='', verbose_related_name='', description='', related_manager=<class 'obj_tables.core.OneToManyRelatedManager'>, cell_dialect=<CellDialect.json: 'json'>)[source]

Bases: obj_tables.core.ToManyAttribute, obj_tables.core.RelatedAttribute

Represents a one-to-many relationship between two types of objects. This is analogous to a foreign key relationship in a database.

related_manager[source]

related manager

Type

type

cell_dialect[source]

dialect for serializing values to a cell

Type

CellDialect

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • default (callable, optional) – function which returns the default value

  • default_cleaned_value (callable, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • related_default (callable, optional) – function which returns the default related value

  • none_value (object, optional) – none value

  • separator (str, optional) – element separator for serialization

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • max_related (int, optional) – maximum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

  • related_manager (type, optional) – related manager

  • cell_dialect (CellDialect, optional) – dialect for serializing values to a cell

copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (list of Model) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

list of Model

deserialize(values, objects, decoded=None)[source]

Deserialize value

Parameters
  • values (object) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

get_init_value(obj)[source]

Get initial value for attribute

Parameters

obj (Model) – object whose attribute is being initialized

Returns

initial value

Return type

object

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

related_validate(obj, value)[source]

Determine if value is a valid value of the related attribute

Parameters
  • obj (Model) – object being validated

  • value (Model) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

serialize(value, encoded=None)[source]

Serialize related object

Parameters
  • value (list of Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_value (Model) – new attribute value

Returns

new attribute value

Return type

Model

Raises

ValueError – if related property is not defined

set_value(obj, new_values)[source]

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_values (list) – value of the attribute

Returns

value of the attribute

Return type

list

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (list of Model) – value to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.OneToManyRelatedManager(object, attribute)[source]

Bases: obj_tables.core.RelatedManager

Represent values of related attributes

Parameters
  • object (Model) – model instance

  • attribute (Attribute) – attribute

append(value, propagate=True)[source]

Add value to list

Parameters
  • value (object) – value

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager

cut(kind=None)[source]

Cut values and their children of kind kind into separate graphs.

If kind is None, children are defined to be the values of the related attributes defined in each class.

Parameters

kind (str, optional) – kind of children to include

Returns

cut values and their children

Return type

list of Model

remove(value, update_list=True, propagate=True)[source]

Remove value from list

Parameters
  • value (object) – value

  • propagate (bool, optional) – propagate change to related attribute

Returns

self

Return type

RelatedManager

class obj_tables.core.OneToOneAttribute(related_class, related_name='', default=None, default_cleaned_value=None, related_default=None, none_value=None, min_related=0, min_related_rev=0, verbose_name='', verbose_related_name='', description='')[source]

Bases: obj_tables.core.RelatedAttribute

Represents a one-to-one relationship between two types of objects.

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • default (callable, optional) – callable which returns default value

  • default_cleaned_value (callable, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • related_default (callable, optional) – callable which returns default related value

  • none_value (object, optional) – none value

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

copy_value(value, objects_and_copies)[source]

Copy value

Parameters
  • value (Model) – value

  • objects_and_copies (dict) – dictionary that maps objects to their copies

Returns

copy of value

Return type

Model

deserialize(value, objects, decoded=None)[source]

Deserialize value

Parameters
  • value (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

deserialize_from_cell(value, objects, decoded=None)[source]

Deserialize value from cell

Parameters
  • value (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

Model

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

merge(left, right, right_objs_in_left, left_objs_in_right)[source]

Merge an attribute of elements of two models

Parameters
  • left (Model) – an element in a model to merge

  • right (Model) – an element in a second model to merge

  • right_objs_in_left (dict) – mapping from objects in right model to objects in left model

  • left_objs_in_right (dict) – mapping from objects in left model to objects in right model

Raises

ValueError – if the attributes of the elements of the models are different

related_validate(obj, value)[source]

Determine if value is a valid value of the related attribute

Parameters
  • obj (Model) – object being validated

  • value (list of Model) – value to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

serialize(value, encoded=None)[source]

Serialize related object

Parameters
  • value (Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

serialize_to_cell(value, encoded=None)[source]

Serialize related object

Parameters
  • value (Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

string representation

Return type

str

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_value (Model) – value of the attribute

Returns

value of the attribute

Return type

Model

Raises

ValueError – if related property is not defined or the attribute of new_value is not None

set_value(obj, new_value)[source]

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_value (Model) – new attribute value

Returns

new attribute value

Return type

Model

Raises

ValueError – if related attribute of new_value is not None
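
The two-sided bookkeeping that set_value describes can be illustrated without obj_tables. The sketch below is not the library's implementation: the Person and Passport classes, their attribute names, and the set_one_to_one helper are all hypothetical, and it only assumes that a one-to-one link must be updated on both objects and that a value already linked to a different object is rejected with ValueError.

```python
class Person:
    """Hypothetical model holding the forward side of the relationship."""

    def __init__(self, name):
        self.name = name
        self.passport = None


class Passport:
    """Hypothetical model holding the reverse side of the relationship."""

    def __init__(self, number):
        self.number = number
        self.owner = None


def set_one_to_one(obj, new_value):
    """Link obj.passport and new_value.owner, keeping both sides in sync."""
    # Reject a value that is already related to a different object.
    if new_value is not None and new_value.owner not in (None, obj):
        raise ValueError('new_value is already related to another object')
    # Detach the previous value so it no longer points back at obj.
    if obj.passport is not None:
        obj.passport.owner = None
    obj.passport = new_value
    if new_value is not None:
        new_value.owner = obj
    return new_value
```

Calling set_one_to_one(alice, passport) sets both alice.passport and passport.owner; a later set_one_to_one(bob, passport) raises ValueError until the first link is cleared with set_one_to_one(alice, None).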

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (Model) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.PositiveFloatAttribute(max=nan, nan=True, default=nan, default_cleaned_value=nan, none_value=nan, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.FloatAttribute

Positive float attribute

Parameters
  • max (float, optional) – maximum value

  • nan (bool, optional) – if true, allow nan values

  • default (float, optional) – default value

  • default_cleaned_value (float, optional) – value to replace None values with during cleaning

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.PositiveIntegerAttribute(max=None, none=False, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.IntegerAttribute

Positive integer attribute

Parameters
  • min (int, optional) – minimum value

  • max (int, optional) – maximum value

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (int, optional) – default value

  • default_cleaned_value (int, optional) – value to replace None values with during cleaning

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.Range(min, max)[source]

Bases: object

A numerical range

Attributes

  • min (int, float) – minimum

  • max (int, float) – maximum

Parameters
  • min (int, float) – minimum

  • max (int, float) – maximum

is_equal(other, tol=0.0)[source]
class obj_tables.core.RangeAttribute(type=<class 'float'>, separator='-', separator_pattern='(?<!e) *- *', none=True, default=None, none_value=None, verbose_name='', description='A range of values')[source]

Bases: obj_tables.core.LiteralAttribute

Attribute for a range of values (x-y)

type[source]

type of elements

Type

type

Parameters
  • type (type, optional) – type of elements

  • separator (str, optional) – element separator for serialization

  • separator_pattern (str, optional) – element separator for deserialization

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (Range, optional) – default value

  • none_value (Range, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description
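
The default separator_pattern uses a negative lookbehind so that the minus sign of a lowercase exponent (as in 1e-3) is not mistaken for the range separator. The sketch below shows that effect with the standard re module. It is not the library's code: parse_range is a hypothetical helper, and only the pattern string is taken from the signature above.

```python
import re

# Default pattern copied from the RangeAttribute signature above.
SEPARATOR_PATTERN = r'(?<!e) *- *'


def parse_range(text, elem_type=float):
    """Split 'x-y' into (min, max); hypothetical helper, not library code."""
    # A '-' directly preceded by 'e' is skipped, so '1e-3' stays intact.
    parts = re.split(SEPARATOR_PATTERN, text.strip())
    if len(parts) != 2:
        raise ValueError('expected exactly two values: {!r}'.format(text))
    return elem_type(parts[0]), elem_type(parts[1])


low, high = parse_range('1e-3 - 2e-3')
```

Because the lookbehind only tests for a lowercase e, a string such as 1E-3 would be split at the exponent's minus sign under this pattern, and a leading minus sign (a negative minimum) would also be treated as a separator.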

clean(value)[source]

Deserialize value

Parameters

value (str) – semantically equivalent representation

Returns

tuple of cleaned value and cleaning error

Return type

tuple

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (dict) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

Range

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize range

Parameters

value (Range) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (Range) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

dict

validate(obj, value)[source]

Determine if value is a valid value

Parameters
  • obj (Model) – object being validated

  • value (Range) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

value_equal(val1, val2, tol=0.0)[source]

Determine if attribute values are equal

Parameters
  • val1 (Range) – first value

  • val2 (Range) – second value

  • tol (float, optional) – equality tolerance

Returns

True if attribute values are equal

Return type

bool
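
A tolerance-based comparison of two ranges might look like the sketch below. The text above does not say whether tol is relative or absolute, so this sketch assumes a relative tolerance and uses math.isclose from the standard library. SimpleRange and ranges_equal are hypothetical stand-ins, not the library's Range or value_equal.

```python
import math
from collections import namedtuple

# Hypothetical stand-in for a range with a minimum and a maximum.
SimpleRange = namedtuple('SimpleRange', ['min', 'max'])


def ranges_equal(val1, val2, tol=0.0):
    """Compare both endpoints; tol is ASSUMED to be a relative tolerance."""
    # With tol=0.0, math.isclose reduces to exact equality.
    return (math.isclose(val1.min, val2.min, rel_tol=tol)
            and math.isclose(val1.max, val2.max, rel_tol=tol))
```

With the default tol=0.0 the endpoints must match exactly; a call such as ranges_equal(a, b, tol=1e-2) accepts endpoints that differ by up to about one percent.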

class obj_tables.core.RegexAttribute(pattern, flags=0, min_length=0, max_length=None, none=False, default='', default_cleaned_value='', none_value='', verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.StringAttribute

Regular expression attribute

pattern[source]

regular expression pattern

Type

str

flags[source]

regular expression flags

Type

int

none[source]

if False, the attribute is invalid if its value is None

Type

bool

Parameters
  • pattern (str) – regular expression pattern

  • flags (int, optional) – regular expression flags

  • min_length (int, optional) – minimum length

  • max_length (int, optional) – maximum length

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (str, optional) – default value

  • default_cleaned_value (str, optional) – value to replace None values with during cleaning

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None
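
The kind of check that pattern, flags, min_length, max_length and none imply can be sketched with the standard re module. This is an illustration only: validate_regex is a hypothetical function, it returns a plain list of messages rather than an InvalidAttribute, and it assumes the pattern is applied with re.search (so patterns should be anchored explicitly).

```python
import re


def validate_regex(value, pattern, flags=0, min_length=0, max_length=None,
                   none=False):
    """Return None if value passes all checks, else a list of messages."""
    errors = []
    if value is None:
        # None is only acceptable when the attribute allows it.
        if not none:
            errors.append('Value cannot be None')
        return errors or None
    if not isinstance(value, str):
        return ['Value must be a string']
    if len(value) < min_length:
        errors.append('Value must be at least {} characters'.format(min_length))
    if max_length is not None and len(value) > max_length:
        errors.append('Value must be at most {} characters'.format(max_length))
    # ASSUMPTION: re.search semantics; the library may match differently.
    if not re.search(pattern, value, flags):
        errors.append('Value does not match pattern {!r}'.format(pattern))
    return errors or None
```

For example, validate_regex('ABC', r'^[a-z]+$') reports a pattern mismatch, while the same call with flags=re.IGNORECASE returns None.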

class obj_tables.core.RelatedAttribute(related_class, related_name='', init_value=None, default=None, default_cleaned_value=None, none_value=None, related_init_value=None, related_default=None, min_related=0, max_related=inf, min_related_rev=0, max_related_rev=inf, verbose_name='', verbose_related_name='', description='')[source]

Bases: obj_tables.core.BaseRelatedAttribute, obj_tables.core.Attribute

Attribute which represents a relationship with another Model

related_type[source]

allowed type(s) of the related values of the attribute

Type

types.TypeType or tuple of types.TypeType

primary_class[source]

the type of the class that this related attribute references

Type

class

related_class[source]

the type of the class that contains a related attribute

Type

class

related_name[source]

name of related attribute on related_class

Type

str

verbose_related_name[source]

verbose related name

Type

str

related_init_value[source]

initial value of related attribute

Type

object

related_default[source]

default value of related attribute

Type

object

min_related[source]

minimum number of related objects in the forward direction

Type

int

max_related[source]

maximum number of related objects in the forward direction

Type

int

min_related_rev[source]

minimum number of related objects in the reverse direction

Type

int

max_related_rev[source]

maximum number of related objects in the reverse direction

Type

int

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • init_value (object, optional) – initial value

  • default (object, optional) – default value

  • default_cleaned_value (object, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • related_init_value (object, optional) – related initial value

  • related_default (object, optional) – related default value

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • max_related (int, optional) – maximum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • max_related_rev (int, optional) – maximum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

Raises

ValueError – if default or related_default is not None, an empty list, or a callable, or if default and related_default are both non-empty lists or callables

deserialize(value, objects, decoded=None)[source]

Deserialize value

Parameters
  • value (object) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, InvalidAttribute or None

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (object) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

object

Get default related value for attribute

Parameters

obj (Model) – object whose attribute is being initialized

Returns

initial value

Return type

object

Raises

ValueError – if related property is not defined

Get initial related value for attribute

Parameters

obj (object) – object whose attribute is being initialized

Returns

initial value

Return type

object

Raises

ValueError – if related property is not defined

abstract related_validate(obj, value)[source]

Determine if value is a valid value of the related attribute

Parameters
  • obj (Model) – object to validate

  • value (list) – value to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

serialize(value, encoded=None)[source]

Serialize related object

Parameters
  • value (Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

Update the values of the related attributes of the attribute

Parameters
  • obj (object) – object whose attribute should be set

  • new_values (object) – value of the attribute

Returns

value of the attribute

Return type

object

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (object) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

object

class obj_tables.core.RelatedManager(object, attribute, related=True)[source]

Bases: list

Represent values and related values of related attributes

object[source]

model instance

Type

Model

attribute[source]

attribute

Type

Attribute

related[source]

is related attribute

Type

bool

Parameters
  • object (Model) – model instance

  • attribute (Attribute) – attribute

  • related (bool, optional) – is related attribute

add(value, **kwargs)[source]

Add value to list

Parameters

value (object) – value

Returns

self

Return type

RelatedManager

append(value, **kwargs)[source]

Add value to list

Parameters

value (object) – value

Returns

self

Return type

RelatedManager

clear()[source]

Remove all elements from list

Returns

self

Return type

RelatedManager

create(_RelatedManager__type=None, **kwargs)[source]

Create instance of primary class and add to list

Parameters
  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • **kwargs – dictionary of attribute name/value pairs

Returns

created object

Return type

Model

Raises

ValueError – if keyword argument is not an attribute of the class

difference_update(values)[source]

Retain only values of list not in values

Parameters

values (list) – values to difference with list

Returns

self

Return type

RelatedManager

discard(value)[source]

Remove value from list if value in list

Parameters

value (object) – value

Returns

self

Return type

RelatedManager

extend(values)[source]

Add values to list

Parameters

values (list) – values to add to list

Returns

self

Return type

RelatedManager

get(_RelatedManager__type=None, **kwargs)[source]

Get related objects by attribute/value pairs and, optionally, only return matches that are also instances of Model subclass __type.

Parameters
  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • **kwargs – dictionary of attribute name/value pairs to find matching objects

Returns

matching instances of Model

Return type

list of Model

get_one(_RelatedManager__type=None, **kwargs)[source]

Get a related object by attribute/value pairs; report an error if multiple objects match and, optionally, only return matches that are also instances of Model subclass __type.

Parameters
  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • **kwargs – dictionary of attribute name/value pairs to find matching objects

Returns

matching instance of Model, or None if no matching instance

Return type

Model or None

Raises

ValueError – if multiple matching objects
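
The lookup behaviour described for get and get_one (filter by attribute/value pairs, optionally by type, and raise when more than one object matches) can be sketched with a small list subclass. SimpleManager and its type_ parameter are hypothetical, and the objects are plain SimpleNamespace instances rather than Model instances.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel so a missing attribute never equals a value


class SimpleManager(list):
    """Hypothetical list with attribute-based lookups."""

    def get(self, type_=None, **kwargs):
        """Return all objects whose attributes equal the keyword values."""
        matches = []
        for obj in self:
            if type_ is not None and not isinstance(obj, type_):
                continue
            if all(getattr(obj, name, _MISSING) == value
                   for name, value in kwargs.items()):
                matches.append(obj)
        return matches

    def get_one(self, type_=None, **kwargs):
        """Return the single match, None if no match; raise if ambiguous."""
        matches = self.get(type_, **kwargs)
        if len(matches) > 1:
            raise ValueError('Multiple objects match {!r}'.format(kwargs))
        return matches[0] if matches else None


genes = SimpleManager([
    SimpleNamespace(id='g1', strand='+'),
    SimpleNamespace(id='g2', strand='+'),
    SimpleNamespace(id='g3', strand='-'),
])
```

Here genes.get(strand='+') returns two objects, genes.get_one(id='g3') returns one, genes.get_one(id='missing') returns None, and genes.get_one(strand='+') raises ValueError.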

get_or_create(_RelatedManager__type=None, **kwargs)[source]

Get or create a related object by attribute/value pairs. Optionally, only get or create instances of Model subclass __type.

Parameters
  • __type (types.TypeType or tuple of types.TypeType) – subclass(es) of Model

  • **kwargs – dictionary of attribute name/value pairs to find matching object or create new object

Returns

existing or new object

Return type

Model

index(*args, **kwargs)[source]

Get related object index by attribute/value pairs

Parameters
  • *args (Model) – object to find

  • **kwargs – dictionary of attribute name/value pairs to find matching objects

Returns

index of matching object

Return type

int

Raises

ValueError – if no argument or keyword argument is provided, if argument and keyword arguments are both provided, if multiple arguments are provided, if the keyword attribute/value pairs match no object, or if the keyword attribute/value pairs match multiple objects

intersection_update(values)[source]

Retain only intersection of list and values

Parameters

values (list) – values to intersect with list

Returns

self

Return type

RelatedManager

pop(i=-1)[source]

Remove and return the element at index i (by default, the last element)

Parameters

i (int, optional) – index of element to remove

Returns

removed element

Return type

object

symmetric_difference_update(values)[source]

Retain values in only one of list and values

Parameters

values (list) – values to difference with list

Returns

self

Return type

RelatedManager

update(values)[source]

Add values to list

Parameters

values (list) – values to add to list

Returns

self

Return type

RelatedManager
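
The set-style methods above (difference_update, intersection_update, symmetric_difference_update, update) modify the list in place and return self, which allows chaining. The sketch below mimics that interface on a plain list subclass. SetLikeList is hypothetical, it does not propagate changes to related objects as a RelatedManager would, and it assumes that update skips values already present.

```python
class SetLikeList(list):
    """Hypothetical list with in-place, chainable set-style updates."""

    def update(self, values):
        # ASSUMPTION: values already in the list are not added again.
        for value in values:
            if value not in self:
                self.append(value)
        return self

    def difference_update(self, values):
        self[:] = [v for v in self if v not in values]
        return self

    def intersection_update(self, values):
        self[:] = [v for v in self if v in values]
        return self

    def symmetric_difference_update(self, values):
        # Compute the new members before the list itself is modified.
        added = [v for v in values if v not in self]
        self[:] = [v for v in self if v not in values] + added
        return self
```

Because each method returns self, calls can be chained, for example SetLikeList([1, 2, 3]).difference_update([2]).update([4]).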

exception obj_tables.core.SchemaWarning[source]

Bases: obj_tables.core.ObjTablesWarning

Schema warning

class obj_tables.core.SlugAttribute(verbose_name='', description=None, primary=True, unique=True)[source]

Bases: obj_tables.core.RegexAttribute

Slug attribute to be used for string IDs

Parameters
  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate whether attribute must be unique

class obj_tables.core.StringAttribute(min_length=0, max_length=255, none=False, default='', default_cleaned_value='', none_value='', verbose_name='', description='', primary=False, unique=False, unique_case_insensitive=False)[source]

Bases: obj_tables.core.LiteralAttribute

String attribute

none[source]

if False, the attribute is invalid if its value is None

Type

bool

default[source]

default value

Type

str

default_cleaned_value[source]

value to replace None values with during cleaning

Type

str

min_length[source]

minimum length

Type

int

max_length[source]

maximum length

Type

int

Parameters
  • min_length (int, optional) – minimum length

  • max_length (int, optional) – maximum length

  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (str, optional) – default value

  • default_cleaned_value (str, optional) – value to replace None values with during cleaning

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

  • unique_case_insensitive (bool, optional) – if true, conduct case-insensitive test of uniqueness

Raises

ValueError – if min_length is negative, max_length is less than min_length, default is not a string, or default_cleaned_value is not a string

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of str, InvalidAttribute or None

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize string

Parameters

value (str) – Python representation

Returns

simple Python representation

Return type

str

validate(obj, value)[source]

Determine if value is a valid value for this StringAttribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.TableFormat[source]

Bases: enum.Enum

Describes a table’s orientation

  • row: the first row contains attribute names; subsequent rows store objects

  • column: the first column contains attribute names; subsequent columns store objects

  • cell: a cell contains a table, as a comma-separated list for example

  • multiple_cells: multiple cells within a row or column

cell = 3[source]
column = 2[source]
multiple_cells = 4[source]
row = 1[source]
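
The difference between the row and column orientations can be shown with a small, library-independent sketch. Orientation is a hypothetical enum that mirrors only the row and column members of TableFormat, and read_objects is a hypothetical helper that turns a table (a list of lists) into one dictionary per object.

```python
import enum


class Orientation(enum.Enum):
    """Hypothetical enum mirroring two members of TableFormat."""
    row = 1
    column = 2


def read_objects(table, orientation):
    """Return one dict per object, keyed by attribute name."""
    if orientation is Orientation.column:
        # Transpose so that attribute names end up in the first row.
        table = [list(cells) for cells in zip(*table)]
    header, *rows = table
    return [dict(zip(header, row)) for row in rows]


row_table = [['id', 'name'], ['g1', 'Gene 1'], ['g2', 'Gene 2']]
column_table = [['id', 'g1', 'g2'], ['name', 'Gene 1', 'Gene 2']]
```

Both read_objects(row_table, Orientation.row) and read_objects(column_table, Orientation.column) yield the same two objects.
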
class obj_tables.core.TimeAttribute(none=True, default=None, default_cleaned_value=None, none_value=None, verbose_name='', description='', primary=False, unique=False)[source]

Bases: obj_tables.core.LiteralAttribute

Time attribute

none[source]

if False, the attribute is invalid if its value is None

Type

bool

default[source]

default time

Type

time

default_cleaned_value[source]

value to replace None values with during cleaning, or function which computes the value to replace None values

Type

time

Parameters
  • none (bool, optional) – if False, the attribute is invalid if its value is None

  • default (time, optional) – default time

  • default_cleaned_value (time, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

clean(value)[source]

Convert attribute value into the appropriate type

Parameters

value (object) – value of attribute to clean

Returns

tuple of cleaned value and cleaning error

Return type

tuple of time, InvalidAttribute or None

from_builtin(json)[source]

Decode a simple Python representation (dict, list, str, float, bool, None) of a value of the attribute that is compatible with JSON and YAML

Parameters

json (str) – simple Python representation of a value of the attribute

Returns

decoded value of the attribute

Return type

time

get_xlsx_validation(sheet_models=None, doc_metadata_model=None)[source]

Get XLSX validation

Parameters
  • sheet_models (list of Model, optional) – models encoded as separate sheets

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

Returns

validation

Return type

wc_utils.workbook.io.FieldValidation

serialize(value)[source]

Serialize time

Parameters

value (time) – Python representation

Returns

simple Python representation

Return type

str

to_builtin(value)[source]

Encode a value of the attribute using a simple Python representation (dict, list, str, float, bool, None) that is compatible with JSON and YAML

Parameters

value (time) – value of the attribute

Returns

simple Python representation of a value of the attribute

Return type

str
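
A round trip between a datetime.time value and a JSON/YAML-compatible string could look like the sketch below. The HH:MM:SS format is an assumption made for illustration, since the text above does not state which string format the attribute uses, and both function names are hypothetical.

```python
from datetime import datetime, time

TIME_FORMAT = '%H:%M:%S'  # ASSUMED format; not confirmed by the text above


def time_to_builtin(value):
    """Encode a time (or None) as a string that JSON and YAML can store."""
    if value is None:
        return None
    return value.strftime(TIME_FORMAT)


def time_from_builtin(json):
    """Decode the string produced by time_to_builtin back into a time."""
    if json is None:
        return None
    return datetime.strptime(json, TIME_FORMAT).time()
```

Under this format, sub-second precision is dropped on encoding, so only times with whole seconds survive the round trip unchanged.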

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (time) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.ToManyAttribute(related_class, related_name='', init_value=None, default=None, default_cleaned_value=None, none_value=None, related_init_value=None, related_default=None, min_related=0, max_related=inf, min_related_rev=0, max_related_rev=inf, verbose_name='', verbose_related_name='', description='')[source]

Bases: obj_tables.core.RelatedAttribute

*-to-many attribute

cell_dialect[source]

dialect for serializing values to a cell

Type

CellDialect

Parameters
  • related_class (class) – related class

  • related_name (str, optional) – name of related attribute on related_class

  • init_value (object, optional) – initial value

  • default (object, optional) – default value

  • default_cleaned_value (object, optional) – value to replace None values with during cleaning, or function which computes the value to replace None values

  • none_value (object, optional) – none value

  • related_init_value (object, optional) – related initial value

  • related_default (object, optional) – related default value

  • min_related (int, optional) – minimum number of related objects in the forward direction

  • max_related (int, optional) – maximum number of related objects in the forward direction

  • min_related_rev (int, optional) – minimum number of related objects in the reverse direction

  • max_related_rev (int, optional) – maximum number of related objects in the reverse direction

  • verbose_name (str, optional) – verbose name

  • verbose_related_name (str, optional) – verbose related name

  • description (str, optional) – description

Raises

ValueError – if default or related_default is not None, an empty list, or a callable, or if default and related_default are both non-empty lists or callables

deserialize_from_cell(values, objects, decoded=None)[source]

Deserialize value from cell

Parameters
  • values (str) – String representation

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

dict

serialize_to_cell(values, encoded=None)[source]

Serialize related object

Parameters
  • values (list of Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

class obj_tables.core.UrlAttribute(verbose_name='', description='Enter a valid URL', primary=False, unique=False)[source]

Bases: obj_tables.core.RegexAttribute

URL attribute to be used for URLs

Parameters
  • verbose_name (str, optional) – verbose name

  • description (str, optional) – description

  • primary (bool, optional) – indicate if attribute is primary attribute

  • unique (bool, optional) – indicate if attribute value must be unique

validate(obj, value)[source]

Determine if value is a valid value of the attribute

Parameters
  • obj (Model) – object being validated

  • value (object) – value of attribute to validate

Returns

None if attribute is valid; otherwise, a list of errors as an instance of InvalidAttribute

Return type

InvalidAttribute or None

class obj_tables.core.Validator[source]

Bases: object

Engine to validate sets of objects

clean(objects)[source]

Clean a list of objects and return their errors

Parameters

objects (list of Model) – list of objects

Returns

list of invalid objects/models and their errors

Return type

InvalidObjectSet or None

run(objects, get_related=False)[source]

Validate a list of objects and return their errors

Parameters
  • objects (Model or list of Model) – object or list of objects

  • get_related (bool, optional) – if true, get all related objects

Returns

list of invalid objects/models and their errors

Return type

InvalidObjectSet or None

validate(objects)[source]

Validate a list of objects and return their errors

Parameters

objects (list of Model) – list of Model instances

Returns

list of invalid objects/models and their errors

Return type

InvalidObjectSet or None

obj_tables.core.get_model(name, module=None)[source]

Get first Model with name name

Parameters
  • name (str) – name

  • module (Module, optional) – module

Returns

model class

Return type

class

obj_tables.core.get_models(module=None, inline=True)[source]

Get models

Parameters
  • module (module, optional) – module

  • inline (bool, optional) – if true, return inline models

Returns

list of model classes

Return type

list of class

obj_tables.core.join_separated_list(values, separator=', ')[source]

Join a list of values into a separator-separated string

Parameters
  • values (list of str) – values

  • separator (str, optional) – separator

Returns

separator-separated list of values

Return type

str

obj_tables.core.split_separated_list(joined_values, separator=', ')[source]

Parse a separator-separated list of values into a list of values

Parameters
  • joined_values (str) – separator-separated list of values

  • separator (str, optional) – separator

Returns

values

Return type

list of str
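
join_separated_list and split_separated_list are inverses of each other. The sketch below shows that round trip with hypothetical stand-ins (join_values, split_values) built on str.join and str.split. How the library treats empty strings or surrounding whitespace is not stated here, so returning an empty list for an empty string is an assumption.

```python
def join_values(values, separator=", "):
    """Join a list of strings into one separator-separated string."""
    return separator.join(values)


def split_values(joined_values, separator=", "):
    """Split a separator-separated string back into a list of strings."""
    if not joined_values:
        # Assumption: an empty string encodes an empty list.
        return []
    return joined_values.split(separator)
```

The round trip only holds when no individual value contains the separator itself.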

obj_tables.core.xlsx_col_name(col)[source]

Convert column number to an XLSX-style string.

From http://stackoverflow.com/a/19169180/509882

Parameters

col (int) – column number (positive integer)

Returns

alphabetic column name

Return type

str

Raises

ValueError – if col is not positive
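
The conversion can be sketched as a base-26 expansion with no zero digit. col_name_sketch is a hypothetical re-implementation for illustration and may differ in detail from the library function.

```python
def col_name_sketch(col):
    """Convert a 1-based column number to an XLSX-style name (e.g. 27 -> 'AA')."""
    if not isinstance(col, int) or col < 1:
        raise ValueError("col must be a positive integer")
    name = ""
    while col > 0:
        # Shift to 0-based before dividing, because the lettering has no
        # zero digit: 26 is 'Z', not 'A0'.
        col, remainder = divmod(col - 1, 26)
        name = chr(ord("A") + remainder) + name
    return name
```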

3.1.7. obj_tables.grammar module

Attributes for embedding domain-specific languages for describing *-to-many relationships into XLSX cells

Author

Jonathan Karr <karr@mssm.edu>

Date

2019-09-23

Copyright

2019, Karr Lab

License

MIT

class obj_tables.grammar.ToManyGrammarAttribute(related_class, grammar=None, **kwargs)[source]

Bases: obj_tables.core.RelatedAttribute

*-to-many attribute that can be deserialized with a grammar

grammar[source]

grammar

Type

str

parser[source]

parser

Type

lark.Lark

Class attributes:

  • grammar (str): grammar

  • grammar_path (str): path to grammar

  • Transformer (type): subclass of Transformer which transforms parse trees into a list of instances of core.Model

Parameters
  • related_class (type) – related class

  • grammar (str, optional) – grammar

Transformer = None[source]
deserialize(values, objects, decoded=None)[source]

Deserialize value

Parameters
  • values (object) – String representation of related objects

  • objects (dict) – dictionary of objects, grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

tuple of cleaned value and cleaning error

Return type

tuple of object, core.InvalidAttribute or None

classmethod gen_transformer(model)[source]

Generate transformer for model

Parameters

model (type) – model

Returns

transformer

Return type

type

grammar = None[source]
grammar_path = None[source]
abstract serialize(values, encoded=None)[source]

Serialize related object

Parameters
  • values (list of core.Model) – Python representation

  • encoded (dict, optional) – dictionary of objects that have already been encoded

Returns

simple Python representation

Return type

str

class obj_tables.grammar.ToManyGrammarTransformer(objects)[source]

Bases: lark.visitors.Transformer

Transforms parse trees into a list of instances of core.Model

objects[source]

dictionary that maps types of models to dictionaries which map serialized values of instances of models to instances

Type

dict

Parameters

objects (dict) – dictionary that maps types of models to dictionaries which map serialized values of instances of models to instances

get_or_create_model_obj(model, _serialized_val=None, _clean=True, **kwargs)[source]

Get an instance of a model with serialized value _serialized_val, or create an instance if there is no such instance

Parameters
  • model (type) – type of model instance to get or create

  • _serialized_val (str, optional) – serialized value of instance of model

  • _clean (bool, optional) – if True, clean values

  • kwargs (dict, optional) – arguments to constructor of model for instance
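
The get-or-create lookup can be sketched against the nested objects dictionary described above (model type, then serialized value, then instance). Species and get_or_create are hypothetical stand-ins, and the _clean step is left out because its behavior is not described here.

```python
class Species:
    """Hypothetical stand-in for a Model subclass."""

    def __init__(self, name=None):
        self.name = name


def get_or_create(objects, model, serialized_val=None, **kwargs):
    """Return the cached instance for serialized_val, creating it if needed."""
    by_value = objects.setdefault(model, {})
    if serialized_val is not None and serialized_val in by_value:
        return by_value[serialized_val]
    obj = model(**kwargs)
    if serialized_val is not None:
        # Cache the new instance so later references resolve to the same object.
        by_value[serialized_val] = obj
    return obj
```

Caching by serialized value means two cells that mention the same related object end up sharing one instance rather than two equal copies.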

start(*args)[source]

Collapse return into a list of related model instances

Parameters

*args (list of core.Model) – related model instances

Returns

related model instances

Return type

list of core.Model

3.1.8. obj_tables.io module

Reading/writing schema objects to/from files

  • Comma separated values (.csv)

  • XLSX (.xlsx)

  • JavaScript Object Notation (.json)

  • Tab separated values (.tsv)

  • YAML Ain’t Markup Language (.yaml, .yml)

Author

Jonathan Karr <karr@mssm.edu>

Author

Arthur Goldberg <Arthur.Goldberg@mssm.edu>

Date

2019-09-19

Copyright

2016-2019, Karr Lab

License

MIT

exception obj_tables.io.IoWarning[source]

Bases: obj_tables.core.ObjTablesWarning

IO warning

class obj_tables.io.JsonReader[source]

Bases: obj_tables.io.ReaderBase

Read model objects from a JSON or YAML file

run(path, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, group_objects_by_model=True, validate=True)[source]

Read model objects from file(s) and, optionally, validate them

Parameters
  • path (str) – path to file(s)

  • schema_name (str, optional) – schema name

  • models (types.TypeType or list of types.TypeType, optional) – type or list of type of objects to read

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • group_objects_by_model (bool, optional) – if True, group decoded objects by their types

  • validate (bool, optional) – if True, validate the data

Returns

model objects grouped by Model class

Return type

dict

Raises

ValueError – if the input format is not supported, model names are not unique, or the data is invalid

class obj_tables.io.JsonWriter[source]

Bases: obj_tables.io.WriterBase

Write model objects to a JSON or YAML file

run(path, objects, schema_name=None, doc_metadata=None, model_metadata=None, models=None, get_related=True, include_all_attributes=True, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=False, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=0, group_objects_by_model=True, data_repo_metadata=False, schema_package=None, protected=False)[source]

Write a list of model objects to a JSON or YAML file

Parameters
  • path (str) – path to write file(s)

  • objects (Model or list of Model) – object or list of objects

  • schema_name (str, optional) – schema name

  • doc_metadata (dict, optional) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • model_metadata (dict, optional) – dictionary that maps models to dictionary with their metadata to be saved to header row (e.g., !!ObjTables ...)

  • models (list of Model, optional) – models

  • get_related (bool, optional) – if True, write object and all related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; the repo must be current with origin, except for the file

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository: the repo must be current with origin

  • protected (bool, optional) – if True, protect the worksheet

Raises

ValueError – if model names are not unique or output format is not supported

class obj_tables.io.MultiSeparatedValuesReader[source]

Bases: obj_tables.io.ReaderBase

Read a list of model objects from a single text file which contains multiple comma or tab-separated files

run(path, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, group_objects_by_model=True, validate=True)[source]

Read a list of model objects from a single text file which contains multiple comma or tab-separated files

Parameters
  • path (str) – path to file(s)

  • schema_name (str, optional) – schema name

  • models (types.TypeType or list of types.TypeType, optional) – type or list of type of objects to read

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • group_objects_by_model (bool, optional) – if True, group decoded objects by their types

  • validate (bool, optional) – if True, validate the data

Returns

dict of model objects grouped by Model class if group_objects_by_model is set; otherwise, list of all model objects

Return type

obj

Raises

ValueError – if path contains a glob pattern

class obj_tables.io.MultiSeparatedValuesWriter[source]

Bases: obj_tables.io.WriterBase

Write model objects to a single text file which contains multiple comma or tab-separated tables.

run(path, objects, schema_name=None, doc_metadata=None, model_metadata=None, models=None, get_related=True, include_all_attributes=True, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=True, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=0, group_objects_by_model=True, data_repo_metadata=False, schema_package=None, protected=False)[source]

Write model objects to a single text file which contains multiple comma or tab-separated tables.

Parameters
  • path (str) – path to write file(s)

  • objects (Model or list of Model) – Model instance or list of Model instances

  • schema_name (str, optional) – schema name

  • doc_metadata (dict, optional) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • model_metadata (dict, optional) – dictionary that maps models to dictionary with their metadata to be saved to header row (e.g., !!ObjTables ...)

  • models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • get_related (bool, optional) – if True, write objects and all their related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; the repo must be current with origin, except for the file

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository: the repo must be current with origin

  • protected (bool, optional) – if True, protect the worksheet

Raises

ValueError – if no model is provided or a class cannot be serialized

class obj_tables.io.PandasWriter[source]

Bases: obj_tables.io.WorkbookWriter

Write model instances to a dictionary of pandas.DataFrame

_data_frames[source]

dictionary that maps models (Model) to their instances (pandas.DataFrame)

Type

dict

run(objects, schema_name=None, models=None, get_related=True, include_all_attributes=True, validate=True, protected=False)[source]

Write model instances to a dictionary of pandas.DataFrame

Parameters
  • objects (Model or list of Model) – object or list of objects

  • schema_name (str, optional) – schema name

  • models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • get_related (bool, optional) – if True, write objects and all their related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • protected (bool, optional) – if True, protect the worksheet

Returns

dictionary that maps models (Model) to their instances (pandas.DataFrame)

Return type

dict

write_sheet(writer, model, data, headings, metadata_headings, validation, extra_entries=0, merge_ranges=None, protected=False)[source]

Write data to sheet

Parameters
  • writer (wc_utils.workbook.io.Writer) – io writer

  • model (type) – model

  • data (list of list of object) – list of list of cell values

  • headings (list of list of str) – list of list of row headings

  • metadata_headings (list of list of str) – model metadata (name, description) to print at the top of the worksheet

  • validation (WorksheetValidation) – validation

  • extra_entries (int, optional) – additional entries to display

  • merge_ranges (list of tuple) – list of ranges of cells to merge

  • protected (bool, optional) – if True, protect the worksheet

class obj_tables.io.Reader[source]

Bases: obj_tables.io.ReaderBase

static get_reader(path)[source]

Get the IO class whose run method can read the file(s) at path

Parameters

path (str) – path to the file(s) to read

Returns

reader class

Return type

type

Raises

ValueError – if extension is not supported
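
Choosing a reader from a file extension can be sketched as a dictionary lookup. The mapping below is an assumption inferred from the class descriptions in this module (WorkbookReader for XLSX, CSV and TSV; JsonReader for JSON and YAML). It returns class names as strings so the sketch runs without the package, and it ignores MultiSeparatedValuesReader because this page does not describe how the real method selects it.

```python
import os

# Assumed mapping, inferred from the class descriptions; not the library's table.
READER_NAMES_BY_EXTENSION = {
    ".csv": "WorkbookReader",
    ".tsv": "WorkbookReader",
    ".xlsx": "WorkbookReader",
    ".json": "JsonReader",
    ".yaml": "JsonReader",
    ".yml": "JsonReader",
}


def pick_reader_name(path):
    """Return the name of the reader class for path, based on its extension."""
    _, ext = os.path.splitext(path)
    try:
        return READER_NAMES_BY_EXTENSION[ext.lower()]
    except KeyError:
        raise ValueError("Unsupported extension: {!r}".format(ext))
```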

run(path, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, group_objects_by_model=True, validate=True)[source]

Read a list of model objects from file(s) and, optionally, validate them

Parameters
  • path (str) – path to file(s)

  • schema_name (str, optional) – schema name

  • models (types.TypeType or list of types.TypeType, optional) – type of object to read or list of types of objects to read

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • group_objects_by_model (bool, optional) – if True, group decoded objects by their types

  • validate (bool, optional) – if True, validate the data

Returns

dict of model objects grouped by Model class if group_objects_by_model is set; otherwise, list of model objects

Return type

obj
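
The two return shapes selected by group_objects_by_model can be sketched as follows. collect is a hypothetical helper, and plain classes stand in for Model subclasses.

```python
from collections import defaultdict


def collect(objects, group_objects_by_model=True):
    """Return objects as a dict keyed by class, or as a flat list."""
    if not group_objects_by_model:
        return list(objects)
    grouped = defaultdict(list)
    for obj in objects:
        # Each object is filed under its own class.
        grouped[type(obj)].append(obj)
    return dict(grouped)
```

Because the return type depends on the flag, callers need to know which value they passed before indexing the result by model class.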

class obj_tables.io.ReaderBase[source]

Bases: object

Interface for classes which read model objects from file(s)

_doc_metadata[source]

dictionary of document metadata read from header row (e.g., !!!ObjTables ...)

Type

dict

_model_metadata[source]

dictionary which maps models (Model) to dictionaries of metadata read from a document (e.g., !!ObjTables date=’…’ …)

Type

dict

MODELS[source]

default types of models to export and the order in which to export them

Type

tuple of type

MODELS = ()[source]
abstract run(path, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, group_objects_by_model=True, validate=True)[source]

Read a list of model objects from file(s) and, optionally, validate them

Parameters
  • path (str) – path to file(s)

  • schema_name (str, optional) – schema name

  • models (types.TypeType or list of types.TypeType, optional) – type of object to read or list of types of objects to read

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • group_objects_by_model (bool, optional) – if True, group decoded objects by their types

  • validate (bool, optional) – if True, validate the data

Returns

model objects grouped by Model class

Return type

dict

class obj_tables.io.WorkbookReader[source]

Bases: obj_tables.io.ReaderBase

Read model objects from an XLSX file or CSV and TSV files

DOC_METADATA_PATTERN = '^!!!ObjTables( +(.*?)=(\'((?:[^\'\\\\]|\\\\.)*)\'|\\"((?:[^\\"\\\\]|\\\\.)*)\\"))* *$'[source]
MODEL_METADATA_PATTERN = '^!!ObjTables( +(.*?)=(\'((?:[^\'\\\\]|\\\\.)*)\'|\\"((?:[^\\"\\\\]|\\\\.)*)\\"))* *$'[source]
classmethod get_model_sheet_name(sheet_names, model)[source]

Get the name of the worksheet/file which corresponds to a model

Parameters
  • sheet_names (list of str) – names of the sheets in the workbook/files

  • model (Model) – model

Returns

name of sheet corresponding to the model or None if there is no sheet for the model

Return type

str

Raises

ValueError – if the model matches more than one sheet
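
The lookup can be sketched as an intersection between the sheets present and the candidate names for a model, such as those returned by get_possible_model_sheet_names. find_sheet_name is a hypothetical helper, and exact, case-sensitive matching is an assumption.

```python
def find_sheet_name(sheet_names, possible_names):
    """Return the single sheet name that matches a model, or None if none does."""
    matches = [name for name in sheet_names if name in possible_names]
    if len(matches) > 1:
        raise ValueError(
            "Model matches multiple sheets: {}".format(", ".join(matches)))
    return matches[0] if matches else None
```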

classmethod get_possible_model_sheet_names(model)[source]

Return set of possible sheet names for a model

Parameters

model (Model) – Model

Returns

set of possible sheet names for a model

Return type

set

classmethod header_row_col_names(index, file_ext, table_format)[source]

Determine row and column names for header entries.

Parameters
  • index (int) – index in header sequence

  • file_ext (str) – extension for model file

  • table_format (TableFormat) – orientation of the stored table

Returns

tuple of row, column, header_entries

Construct object graph

Parameters
  • model (Model) – an obj_tables.core.Model

  • attributes (list of Attribute) – attribute order of data

  • data (list of list of object) – nested list of object data

  • objects (list) – list of model objects in order of data

  • objects_by_primary_attribute (dict) – dictionary of model objects grouped by model

  • decoded (dict, optional) – dictionary of objects that have already been decoded

Returns

list of parsing errors

Return type

list of str

merge_doc_metadata(metadata)[source]

Merge metadata into document metadata

Parameters

metadata (dict) – metadata

Raises

ValueError – if the metadata conflicts with existing document metadata
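
A conflict-checking merge can be sketched with plain dictionaries. merge_metadata is a hypothetical helper. It treats a key that is already present with a different value as a conflict, which is an assumption about what "conflicts" means here.

```python
def merge_metadata(existing, new):
    """Merge new into existing in place, rejecting conflicting values."""
    for key, value in new.items():
        if key in existing and existing[key] != value:
            raise ValueError(
                "Metadata for {!r} conflicts: {!r} != {!r}".format(
                    key, existing[key], value))
        existing[key] = value
    return existing
```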

classmethod parse_worksheet_heading_metadata(heading, sheet_name=None)[source]

Parse key-value pairs of metadata from heading

Parameters
  • heading (str) – heading with key-value pairs of metadata

  • sheet_name (str, optional) – sheet name

Returns

dictionary of document metadata

Return type

dict

Raises

ValueError – if a key is repeated
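
Parsing a heading such as `!!ObjTables type='Data' id="Genes"` into key-value pairs can be sketched with a regular expression modeled on MODEL_METADATA_PATTERN above. parse_heading is a hypothetical helper. As simplifying assumptions, it restricts keys to word characters and does not unescape backslashes in values.

```python
import re

# One " key='value'" or ' key="value"' pair; backslash escapes are allowed in values.
PAIR_PATTERN = re.compile(
    r""" +(\w+)=(?:'((?:[^'\\]|\\.)*)'|"((?:[^"\\]|\\.)*)")""")


def parse_heading(heading, marker="!!ObjTables"):
    """Parse key-value metadata pairs from a heading line into a dict."""
    if not heading.startswith(marker):
        raise ValueError("Heading must start with {!r}".format(marker))
    rest = heading[len(marker):]
    metadata = {}
    pos = 0
    for match in PAIR_PATTERN.finditer(rest):
        if match.start() != pos:
            raise ValueError("Unparsable text in heading")
        key = match.group(1)
        # Group 2 holds single-quoted values, group 3 double-quoted ones.
        value = match.group(2) if match.group(2) is not None else match.group(3)
        if key in metadata:
            raise ValueError("Key {!r} is repeated".format(key))
        metadata[key] = value
        pos = match.end()
    if rest[pos:].strip():
        raise ValueError("Unparsable text in heading")
    return metadata
```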

read_model(reader, sheet_name, schema_name, model, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, validate=True)[source]

Instantiate a list of objects from data in a table in a file

Parameters
  • reader (wc_utils.workbook.io.Reader) – reader

  • sheet_name (str) – sheet name

  • schema_name (str) – schema name

  • model (type) – the model describing the objects’ schema

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if the worksheet/files don’t have all of the attributes in the model

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • validate (bool, optional) – if True, validate the data

Returns

  • list of Attribute: attribute order of data

  • list of list of object: a two-dimensional nested list of object data

  • list of str: a list of parsing errors

  • list of Model: constructed model objects

Return type

tuple

read_sheet(model, reader, sheet_name, num_row_heading_columns=0, num_column_heading_rows=0, ignore_empty_rows=False, ignore_empty_cols=False)[source]

Read worksheet or file into a two-dimensional list

Parameters
  • model (type) – the model describing the objects’ schema

  • reader (wc_utils.workbook.io.Reader) – reader

  • sheet_name (str) – worksheet name

  • num_row_heading_columns (int, optional) – number of columns of row headings

  • num_column_heading_rows (int, optional) – number of rows of column headings

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • ignore_empty_cols (bool, optional) – if True, ignore empty columns

Returns

  • list of list: two-dimensional list of table values

  • list of list: row headings

  • list of list: column_headings

  • list of str: comments above column headings

Return type

tuple

Raises

ValueError – if worksheet doesn’t have header rows or columns

classmethod read_worksheet_metadata(sheet_name, rows)[source]

Read worksheet metadata

Parameters
  • sheet_name (str) – sheet name

  • rows (list) – rows

Returns

  • dict: dictionary of document properties

  • dict: dictionary of model properties

  • list of str: comments

Return type

tuple

run(path, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, group_objects_by_model=True, validate=True)[source]

Read a list of model objects from file(s) and, optionally, validate them

File(s) may be a single XLSX workbook with multiple worksheets or a set of delimiter-separated files encoded by a single path with a glob pattern.

Parameters
  • path (str) – path to file(s)

  • schema_name (str, optional) – schema name

  • models (types.TypeType or list of types.TypeType, optional) – type or list of type of objects to read

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • group_objects_by_model (bool, optional) – if True, group decoded objects by their types

  • validate (bool, optional) – if True, validate the data

Returns

dict of model objects grouped by Model class if group_objects_by_model is set; otherwise, list of all model objects

Return type

obj

Raises

ValueError – if any of the following occurs:

  • Sheets cannot be unambiguously mapped to models

  • The file(s) indicated by path is missing a sheet for a model and ignore_missing_models is False

  • The file(s) indicated by path contains extra sheets that don’t correspond to one of models and ignore_extra_models is False

  • The worksheets or file(s) indicated by path are not in the canonical order and ignore_sheet_order is False

  • Some models are not serializable

  • The data contains parsing errors found by read_model

class obj_tables.io.WorkbookWriter[source]

Bases: obj_tables.io.WriterBase

Write model objects to an XLSX file or CSV or TSV file(s)

static create_worksheet_style(model, extra_entries=0)[source]

Create worksheet style for model

Parameters
  • model (type) – model class

  • extra_entries (int, optional) – additional entries to display

Returns

worksheet style

Return type

WorksheetStyle

run(path, objects, schema_name=None, doc_metadata=None, model_metadata=None, models=None, get_related=True, include_all_attributes=True, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=True, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=0, group_objects_by_model=True, data_repo_metadata=False, schema_package=None, protected=True)[source]
Write a list of model instances to an XLSX file, with one worksheet for each model class, or to a set of .csv or .tsv files, with one file for each model class

Parameters
  • path (str) – path to write file(s)

  • objects (Model or list of Model) – Model instance or list of Model instances

  • schema_name (str, optional) – schema name

  • doc_metadata (dict, optional) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • model_metadata (dict, optional) – dictionary that maps models to dictionary with their metadata to be saved to header row (e.g., !!ObjTables ...)

  • models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • get_related (bool, optional) – if True, write objects and all their related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; the repo must be current with origin, except for the file

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository: the repo must be current with origin

  • protected (bool, optional) – if True, protect the worksheet

Raises

ValueError – if no model is provided or a class cannot be serialized

write_model(writer, model, objects, schema_name, date, doc_metadata, doc_metadata_model, model_metadata, sheet_models, include_all_attributes=True, encoded=None, write_empty_models=True, write_empty_cols=True, extra_entries=0, protected=True)[source]

Write a list of model objects to a file

Parameters
  • writer (wc_utils.workbook.io.Writer) – io writer

  • model (type) – model

  • objects (list of Model) – list of instances of Model

  • schema_name (str) – schema name

  • date (str) – date

  • doc_metadata (dict) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

  • model_metadata (dict) – dictionary of model metadata

  • sheet_models (list of Model) – models encoded as separate sheets

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • encoded (dict, optional) – objects that have already been encoded and their assigned JSON identifiers

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • extra_entries (int, optional) – additional entries to display

  • protected (bool, optional) – if True, protect the worksheet
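
The effect of write_empty_cols=False can be pictured with a small, library-independent sketch. The function name and the list-of-lists layout are assumptions made for illustration; this is not the code that write_model runs.

```python
def drop_empty_columns_sketch(headings, rows):
    """Drop every column whose values are all None.

    Hypothetical helper: ``headings`` is a list of column names and ``rows``
    is a list of equally long lists of cell values.
    """
    # Keep a column only if at least one row has a non-None value in it
    keep = [i for i in range(len(headings))
            if any(row[i] is not None for row in rows)]
    kept_headings = [headings[i] for i in keep]
    kept_rows = [[row[i] for i in keep] for row in rows]
    return kept_headings, kept_rows


headings = ['Id', 'Name', 'Comments']
rows = [['a', None, None],
        ['b', 'B', None]]
print(drop_empty_columns_sketch(headings, rows))
```

Note that with no rows at all, this sketch drops every column, which is one reason an option such as write_empty_cols=True is useful when generating templates.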

write_schema(writer, models, name, date, doc_metadata, protected=True)[source]

Write a worksheet with a schema

Parameters
  • writer (wc_utils.workbook.io.Writer) – io writer

  • models (list of Model, optional) – models in the order that they should appear in the table of contents

  • name (str) – name

  • date (str) – date

  • doc_metadata (dict) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • protected (bool, optional) – if True, protect the worksheet

write_sheet(writer, model, data, headings, metadata_headings, validation, extra_entries=0, merge_ranges=None, protected=True)[source]

Write data to sheet

Parameters
  • writer (wc_utils.workbook.io.Writer) – io writer

  • model (type) – model

  • data (list of list of object) – list of list of cell values

  • headings (list of list of str) – list of list of row headings validations

  • metadata_headings (list of list of str) – model metadata (name, description) to print at the top of the worksheet

  • validation (WorksheetValidation) – validation

  • extra_entries (int, optional) – additional entries to display

  • merge_ranges (list of tuple) – list of ranges of cells to merge

  • protected (bool, optional) – if True, protect the worksheet

write_toc(writer, models, schema_name, date, doc_metadata, grouped_objects, write_schema=False, protected=True)[source]

Write a worksheet with a table of contents

Parameters
  • writer (wc_utils.workbook.io.Writer) – io writer

  • models (list of Model, optional) – models in the order that they should appear in the table of contents

  • schema_name (str) – schema name

  • date (str) – date

  • doc_metadata (dict) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • grouped_objects (dict) – dictionary which maps models to lists of instances of each model

  • write_schema (bool, optional) – if True, include additional row for worksheet with schema

  • protected (bool, optional) – if True, protect the worksheet

class obj_tables.io.Writer[source]

Bases: obj_tables.io.WriterBase

Write a list of model objects to file(s)

static get_writer(path)[source]

Get writer

Parameters

path (str) – path to write file(s)

Returns

writer class

Return type

type

Raises

ValueError – if extension is not supported
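
A minimal sketch of extension-based dispatch, in the spirit of get_writer. The two stand-in classes and the extension table below are hypothetical; the actual writer classes and supported extensions are defined inside obj_tables.io.

```python
import os


class WorkbookWriterSketch:
    """Hypothetical stand-in for a writer of .csv, .tsv, and .xlsx files."""


class JsonWriterSketch:
    """Hypothetical stand-in for a writer of .json, .yaml, and .yml files."""


# Assumed extension table, for illustration only
WRITERS_BY_EXTENSION = {
    '.csv': WorkbookWriterSketch,
    '.tsv': WorkbookWriterSketch,
    '.xlsx': WorkbookWriterSketch,
    '.json': JsonWriterSketch,
    '.yaml': JsonWriterSketch,
    '.yml': JsonWriterSketch,
}


def get_writer_sketch(path):
    """Return the writer class for ``path`` based on its file extension."""
    _, ext = os.path.splitext(path)
    try:
        return WRITERS_BY_EXTENSION[ext.lower()]
    except KeyError:
        # Mirrors the documented behavior: unsupported extensions raise ValueError
        raise ValueError('Unsupported extension: {!r}'.format(ext))
```

Returning the class rather than an instance lets the caller decide when to construct the writer.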

run(path, objects, schema_name=None, doc_metadata=None, model_metadata=None, models=None, get_related=True, include_all_attributes=True, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=True, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=0, group_objects_by_model=True, data_repo_metadata=False, schema_package=None, protected=True)[source]
Write a list of model classes to an XLSX file, with one worksheet for each model, or to a set of .csv or .tsv files, with one file for each model.

Parameters
  • path (str) – path to write file(s)

  • objects (Model or list of Model) – object or list of objects

  • schema_name (str, optional) – schema name

  • doc_metadata (dict, optional) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • model_metadata (dict, optional) – dictionary that maps models to dictionary with their metadata to be saved to header row (e.g., !!ObjTables ...)

  • models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • get_related (bool, optional) – if True, write objects and all related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; the repo must be current with origin, except for the file

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository; the repo must be current with origin

  • protected (bool, optional) – if True, protect the worksheet
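
The worksheet ordering described for the models parameter (listed models first, all others afterwards in alphabetical order) might be sketched as follows. Sorting by class name is an assumption; the library may sort on a different key, such as a verbose name.

```python
def order_models_sketch(preferred, all_models):
    """Return ``preferred`` models first, then the remaining models by class name.

    Hypothetical helper illustrating the documented ordering rule.
    """
    preferred = list(preferred or [])
    remaining = [model for model in all_models if model not in preferred]
    remaining.sort(key=lambda model: model.__name__)  # assumed sort key
    return preferred + remaining


# Placeholder classes standing in for Model subclasses
class Reaction:
    pass


class Compartment:
    pass


class Species:
    pass


ordered = order_models_sketch([Species], [Reaction, Compartment, Species])
print([model.__name__ for model in ordered])
```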

class obj_tables.io.WriterBase[source]

Bases: object

Interface for classes which write model objects to file(s)

MODELS[source]

default types of models to export and the order in which to export them

Type

tuple of type

MODELS = ()[source]
make_metadata_objects(data_repo_metadata, path, schema_package)[source]

Make models that store Git repository metadata

Metadata models can only be created from suitable Git repos. Failures to obtain metadata are reported as warnings that do not interfere with writing data files.

Parameters
  • data_repo_metadata (bool) – if True, try to obtain metadata information about the Git repo containing path; the repo must be current with origin, except for the file at path

  • path (str) – path of the file(s) that will be written

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to obtain metadata information about the schema’s Git repository from a package on sys.path; the repo must be current with its origin

Returns

metadata objects(s) created

Return type

list of Model

abstract run(path, objects, schema_name=None, doc_metadata=None, model_metadata=None, models=None, get_related=True, include_all_attributes=True, validate=True, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=True, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=0, group_objects_by_model=True, data_repo_metadata=False, schema_package=None, protected=True)[source]
Write a list of model classes to an XLSX file, with one worksheet for each model, or to a set of .csv or .tsv files, with one file for each model.

Parameters
  • path (str) – path to write file(s)

  • objects (Model or list of Model) – object or list of objects

  • schema_name (str, optional) – schema name

  • doc_metadata (dict, optional) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • model_metadata (dict, optional) – dictionary that maps models to dictionary with their metadata to be saved to header row (e.g., !!ObjTables ...)

  • models (list of Model, optional) – models

  • get_related (bool, optional) – if True, write object and all related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • data_repo_metadata (bool, optional) – if True, try to write metadata information about the file’s Git repo; a warning will be generated if the repo is not current with origin, except for the file

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository; the repo must be current with origin

  • protected (bool, optional) – if True, protect the worksheet

obj_tables.io.convert(source, destination, schema_name=None, models=None, allow_multiple_sheets_per_model=False, ignore_missing_models=False, ignore_extra_models=False, ignore_sheet_order=False, include_all_attributes=True, ignore_missing_attributes=False, ignore_extra_attributes=False, ignore_attribute_order=False, ignore_empty_rows=True, protected=True)[source]

Convert among comma-separated (.csv), XLSX (.xlsx), JavaScript Object Notation (.json), tab-separated (.tsv), and Yet Another Markup Language (.yaml, .yml) formats

Parameters
  • source (str) – path to source file

  • destination (str) – path to save converted file

  • schema_name (str, optional) – schema name

  • models (list of type) – list of models

  • allow_multiple_sheets_per_model (bool, optional) – if True, allow multiple sheets per model

  • ignore_missing_models (bool, optional) – if False, report an error if a worksheet/file is missing for one or more models

  • ignore_extra_models (bool, optional) – if True and all models are found, ignore other worksheets or files

  • ignore_sheet_order (bool, optional) – if True, do not require the sheets to be provided in the canonical order

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • ignore_missing_attributes (bool, optional) – if False, report an error if a worksheet/file doesn’t contain all of the attributes in a model in models

  • ignore_extra_attributes (bool, optional) – if True, do not report errors if attributes in the data are not in the model

  • ignore_attribute_order (bool, optional) – if True, do not require the attributes to be provided in the canonical order

  • ignore_empty_rows (bool, optional) – if True, ignore empty rows

  • protected (bool, optional) – if True, protect the worksheet
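
How the ignore_missing_models, ignore_extra_models, and ignore_sheet_order flags interact can be illustrated with a simplified check over sheet and model names. The helper below is a hypothetical sketch: the reader used by convert works on model classes and reports errors in its own format.

```python
def check_sheets_sketch(sheet_names, model_names,
                        ignore_missing_models=False,
                        ignore_extra_models=False,
                        ignore_sheet_order=False):
    """Return a list of error messages for a workbook's sheets.

    ``model_names`` is assumed to be in the canonical order.
    """
    errors = []
    missing = [name for name in model_names if name not in sheet_names]
    extra = [name for name in sheet_names if name not in model_names]
    if missing and not ignore_missing_models:
        errors.append('Missing sheets: {}'.format(', '.join(missing)))
    if extra and not ignore_extra_models:
        errors.append('Extra sheets: {}'.format(', '.join(extra)))
    if not ignore_sheet_order:
        # Compare only the sheets and models that both sides have in common
        found = [name for name in sheet_names if name in model_names]
        expected = [name for name in model_names if name in sheet_names]
        if found != expected:
            errors.append('Sheets are not in the canonical order')
    return errors


print(check_sheets_sketch(['Species', 'Reaction'], ['Reaction', 'Species']))
```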

obj_tables.io.create_template(path, schema_name, models, title=None, description=None, keywords=None, version=None, language=None, creator=None, write_toc=True, write_schema=False, write_empty_models=True, write_empty_cols=True, extra_entries=10, group_objects_by_model=True, protected=True)[source]

Create a template for a model

Parameters
  • path (str) – path to write file(s)

  • schema_name (str) – schema name

  • models (list) – list of model, in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • title (str, optional) – title

  • description (str, optional) – description

  • keywords (str, optional) – keywords

  • version (str, optional) – version

  • language (str, optional) – language

  • creator (str, optional) – creator

  • write_schema (bool, optional) – if True, include additional worksheet with schema

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • write_toc (bool, optional) – if True, include additional worksheet with table of contents

  • extra_entries (int, optional) – additional entries to display

  • group_objects_by_model (bool, optional) – if True, group objects by model

  • protected (bool, optional) – if True, protect the worksheet

obj_tables.io.format_doc_metadata(schema_name, metadata)[source]

Format document metadata as a string of key-value pairs of document metadata

Parameters
  • schema_name (str) – schema name

  • metadata (dict) – document metadata

Returns

string of key-value pairs of document metadata

Return type

str
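
As a rough illustration, a header string of key-value pairs could be assembled like this. The !!!ObjTables prefix comes from the parameter descriptions above; the key name schema, the single-quote style, and the alphabetical ordering of keys are assumptions, and values containing single quotes are not escaped in this sketch.

```python
def format_doc_metadata_sketch(schema_name, metadata):
    """Format document metadata as a single '!!!ObjTables ...' header string."""
    pairs = []
    if schema_name:
        pairs.append(('schema', schema_name))  # assumed key name
    pairs.extend(sorted(metadata.items()))     # assumed ordering
    body = ' '.join("{}='{}'".format(key, value) for key, value in pairs)
    return '!!!ObjTables ' + body if body else '!!!ObjTables'


print(format_doc_metadata_sketch('wc_lang', {'date': '2019-09-11'}))
```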

obj_tables.io.get_fields(cls, schema_name, date, doc_metadata, doc_metadata_model, model_metadata, include_all_attributes=True, sheet_models=None)[source]

Get the attributes, headings, and validation for a worksheet

Parameters
  • cls (type) – Model type (subclass of Model)

  • schema_name (str) – schema name

  • date (str) – date

  • doc_metadata (dict) – dictionary of document metadata to be saved to header row (e.g., !!!ObjTables ...)

  • doc_metadata_model (type) – model whose worksheet contains the document metadata

  • model_metadata (dict) – dictionary of model metadata

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • sheet_models (list of Model, optional) – list of models encoded as separate worksheets; used to setup XLSX validation for related attributes

Returns

  • list of Attribute: Attributes of cls in the order they should be encoded as one or more columns in a worksheet. Attributes which define *-to-one relationships to other classes which are encoded as multiple cells (TableFormat.multiple_cells) will be encoded as multiple columns. All other attributes will be encoded as a single column.

    This represents a nested tree of attributes. For classes which have *-to-one relationships to other classes which are encoded as multiple cells, the tree has two levels. For all other classes, the tree only has a single level.

  • list of tuple of Attribute: Flattened representation of the first return value. This is a list of attributes of cls and attributes of classes related to cls by *-to-one relationships that are encoded as multiple cells (TableFormat.multiple_cells), in the order they are encoded as columns in a worksheet.

    Each element of the list is a tuple.

    1. For attributes of cls that represent *-to-one relationships to classes encoded as multiple cells, the first element will be the attribute. This will be used to populate a merged cell in Row 1 of the worksheet which represents the heading for the multiple columns that encode the attributes of the related class. For all other attributes, the first element will be None, and no value will be printed in Row 1.

    2. The second element will be the attribute that should be encoded in the column. For attributes that represent *-to-one relationships to related classes encoded as multiple cells, this will be an attribute of the related class. For all other attributes, this will be an attribute of cls. This will be used to populate the columns headings for the worksheet. For classes that have *-to-one relationships with classes encoded as multiple columns, the column headings will appear in Row 2 (and the group headings specified by the first element of the tuple will be in Row 1). For all other classes, the column headings will appear in Row 1.

  • list: field headings

  • list: list of field headings to merge

  • list: list of field validations

  • list of list of str: model metadata (name and description) to print at the top of the worksheet

Return type

tuple
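
The relationship between the nested attribute tree, its flattened list of tuples, and the one or two heading rows can be sketched without the library. The Attr namedtuple is a hypothetical stand-in for Attribute; related_attrs is set only for *-to-one attributes whose related class is encoded as multiple cells.

```python
from collections import namedtuple

# Hypothetical stand-in for obj_tables attributes
Attr = namedtuple('Attr', ['name', 'related_attrs'])


def flatten_attributes_sketch(attrs):
    """Flatten a one- or two-level attribute tree into (group, attribute) tuples."""
    flat = []
    for attr in attrs:
        if attr.related_attrs:
            # One column per attribute of the related class, grouped under attr
            for sub_attr in attr.related_attrs:
                flat.append((attr, sub_attr))
        else:
            # Ordinary attribute: a single column with no group heading
            flat.append((None, attr))
    return flat


def heading_rows_sketch(flat):
    """Build the heading rows: one row, or two rows if any column is grouped."""
    column_row = [attr.name for _, attr in flat]
    if all(group is None for group, _ in flat):
        return [column_row]
    group_row = [group.name if group is not None else None for group, _ in flat]
    return [group_row, column_row]


attrs = [Attr('Id', None),
         Attr('Address', [Attr('Street', None), Attr('City', None)])]
for row in heading_rows_sketch(flatten_attributes_sketch(attrs)):
    print(row)
```

In this sketch the group heading is repeated for each of its columns; in a worksheet those repeated cells would correspond to a single merged cell in Row 1.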

obj_tables.io.get_ordered_attributes(cls, include_all_attributes=True)[source]

Get the attributes for a class in the order that they should be printed

Parameters
  • cls (type) – Model type (subclass of Model)

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

Returns

attributes in the order they should be printed

Return type

list of Attribute
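
A simplified picture of the ordering, working on attribute names instead of Attribute objects. Placing the remaining attributes in alphabetical order after those listed in attribute_order is an assumption made for this sketch.

```python
def ordered_attribute_names_sketch(attribute_order, all_names,
                                   include_all_attributes=True):
    """Return attribute names: those in ``attribute_order`` first, then the rest."""
    ordered = list(attribute_order)
    if include_all_attributes:
        # Assumed rule: unlisted attributes follow in alphabetical order
        ordered += sorted(name for name in all_names
                          if name not in attribute_order)
    return ordered


print(ordered_attribute_names_sketch(('id', 'name'),
                                     ['comments', 'name', 'id', 'age']))
```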

3.1.9. obj_tables.migrate module

Support schema migration

Author

Arthur Goldberg <Arthur.Goldberg@mssm.edu>

Date

2018-11-18

Copyright

2018, Karr Lab

License

MIT

class obj_tables.migrate.CementControllers[source]

Bases: object

Cement Controllers for cement CLIs in data and schema repos involved with migrating files

Because these controllers are used by multiple schema and data repos, they’re defined here and imported into __main__.py modules in schema repos that use ObjTables to define data schemas and into __main__.py modules in data repos that contain data files to migrate. wc_lang is an example schema repo. wc_sim is an example data repo that contains data files whose schema is defined in wc_lang.

class DataSchemaMigrationConfigController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Create a data-schema migration configuration file.

This controller is used by data repos.

class Meta[source]

Bases: object

arguments = [(['--data_repo_dir'], {'type': <class 'str'>, 'help': 'path of the directory of the repository storing the data file(s) to migrate; defaults to the current directory', 'default': '.'}), (['schema_url'], {'type': <class 'str'>, 'help': 'URL of the schema in its git repository, including the branch'}), (['file_to_migrate'], {'action': 'store', 'type': <class 'str'>, 'nargs': '+', 'help': 'a file to migrate'})][source]
description = 'Create a data-schema migration configuration file'[source]
help = 'Create a data-schema migration configuration file'[source]
label = 'make-data-schema-migration-config-file'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
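
For readers unfamiliar with cement, the arguments list above corresponds to ordinary argparse argument definitions. The sketch below rebuilds the same command line with plain argparse instead of a cement controller; the URL and file names passed to parse_args are placeholders.

```python
import argparse


def build_parser_sketch():
    """Build a plain argparse parser equivalent to the controller's arguments."""
    parser = argparse.ArgumentParser(
        prog='make-data-schema-migration-config-file',
        description='Create a data-schema migration configuration file')
    parser.add_argument(
        '--data_repo_dir', type=str, default='.',
        help='path of the directory of the repository storing the data '
             'file(s) to migrate; defaults to the current directory')
    parser.add_argument(
        'schema_url', type=str,
        help='URL of the schema in its git repository, including the branch')
    parser.add_argument(
        'file_to_migrate', action='store', type=str, nargs='+',
        help='a file to migrate')
    return parser


# Placeholder arguments, for illustration only
args = build_parser_sketch().parse_args(
    ['https://example.com/schema_repo/blob/master/schema.py', 'a.xlsx', 'b.xlsx'])
print(args.data_repo_dir, args.schema_url, args.file_to_migrate)
```

Because file_to_migrate uses nargs='+', at least one file must be given and the parsed value is always a list.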
class MigrateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Perform a migration configured by a data-schema migration config file

This controller is used by data repos.

class Meta[source]

Bases: object

arguments = [(['migration_config_file'], {'type': <class 'str'>, 'help': 'name of the data-schema migration configuration file to use'})][source]
description = 'Migrate data file(s) as configured in a data-schema migration configuration file'[source]
help = 'Migrate data file(s) as configured in a data-schema migration configuration file'[source]
label = 'do-configured-migration'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class MigrateFileController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Migrate specified data file(s)

This controller is used by data repos.

class Meta[source]

Bases: object

arguments = [(['--data_repo_dir'], {'type': <class 'str'>, 'help': 'path of the directory of the repository storing the data file(s) to migrate; defaults to the current directory', 'default': '.'}), (['schema_url'], {'type': <class 'str'>, 'help': 'URL of the schema in its git repository, including the branch'}), (['file_to_migrate'], {'action': 'store', 'type': <class 'str'>, 'nargs': '+', 'help': 'a file to migrate'})][source]
description = 'Migrate specified data file(s)'[source]
help = 'Migrate specified data file(s)'[source]
label = 'migrate-data'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class SchemaChangesTemplateController(*args, **kw)[source]

Bases: cement.ext.ext_argparse.ArgparseController

Create a template schema changes file

This controller is used by schema repos.

class Meta[source]

Bases: object

arguments = [(['--schema_repo_dir'], {'type': <class 'str'>, 'help': "path of the directory of the schema's repository; defaults to the current directory", 'default': '.'}), (['--commit'], {'type': <class 'str'>, 'help': 'hash of a commit containing the changes; default is most recent commit'})][source]
description = 'Create a template schema changes file'[source]
help = 'Create a template schema changes file'[source]
label = 'make-changes-template'[source]
stacked_on = 'base'[source]
stacked_type = 'nested'[source]
class obj_tables.migrate.DataSchemaMigration(**kwargs)[source]

Bases: object

Automate the migration of the data files in a repo

A data repo stores the data files that need to be migrated. A schema repo contains the schemas that provide the data models for these data files. The data and schema repos may be the same repo or two different repos.

DataSchemaMigration uses configuration information in the data and schema repos to migrate data files in the data repo to the latest version of the schema repo.

The data repo must contain a migrations directory that has:

  • Data-schema migration configuration files, written in YAML

A data-schema migration configuration file contains the attributes described in DataSchemaMigration._CONFIG_ATTRIBUTES:

  • files_to_migrate: a list of files to be migrated

  • schema_repo_url: the URL of the schema repo

  • branch: the branch of the schema repo

  • schema_file: the relative path of the schema file in the schema repo
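
A configuration with these four attributes can be checked with a few lines of Python once it has been loaded into a dictionary. This is a sketch under assumptions: the helper name, the error messages, and the example values are hypothetical, and the library performs its own validation when it loads a config file.

```python
REQUIRED_CONFIG_ATTRIBUTES = (
    'files_to_migrate', 'schema_repo_url', 'branch', 'schema_file')


def validate_migration_config_sketch(config):
    """Return a list of problems found in a loaded migration config dict."""
    errors = []
    for name in REQUIRED_CONFIG_ATTRIBUTES:
        if name not in config or config[name] in (None, '', []):
            errors.append("missing or empty attribute '{}'".format(name))
    files = config.get('files_to_migrate')
    if files is not None and not isinstance(files, list):
        errors.append("'files_to_migrate' must be a list")
    return errors


# Hypothetical example values
config = {
    'files_to_migrate': ['../data/model.xlsx'],
    'schema_repo_url': 'https://example.com/schema_repo',
    'branch': 'master',
    'schema_file': 'schema_package/core.py',
}
print(validate_migration_config_sketch(config))
```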

The schema repo contains a migrations directory that contains schema changes files, which may refer to associated transformations files. Hashes in the changes files must refer to commits in the schema repo. These files are managed by SchemaChanges objects. Migration will not work if changes to the schema are not documented in schema changes files.

data_repo_location[source]

directory or URL of the data repo

Type

str

data_git_repo[source]

a GitRepo for a git clone of the data repo

Type

GitRepo

schema_git_repo[source]

a GitRepo for a clone of the schema repo

Type

GitRepo

data_config_file_basename[source]

the basename of the YAML configuration file for the migration, which is stored in the data repo’s migrations directory

Type

str

migration_config_data[source]

the data in the data-schema migration config file

Type

dict

loaded_schema_changes[source]

all validated schema change files

Type

list

migration_specs[source]

the migration’s specification

Type

MigrationSpec

io_classes[source]

custom schema repo specific reader and/or writer for I/O of existing and migrated files, respectively

Type

dict of type, optional

git_repos[source]

all GitRepos created by this DataSchemaMigration

Type

list of GitRepo

CUSTOM_IO_CLASSES_FILE = 'custom_io_classes.py'[source]
__str__()[source]

Provide a string representation

Returns

a string representation of this DataSchemaMigration

Return type

str

automated_migrate(tmp_dir=None)[source]

Migrate the data repo’s data files

Migrate to the latest commit referenced by a schema changes file in the schema repo, and migrate data files in place. If the data repo passed to DataSchemaMigration was a directory, then the migrated data files will be stored in that directory.

Parameters

tmp_dir (str, optional) – if the data repo passed to DataSchemaMigration was a URL, then the migrated files will be returned in a temporary directory. If tmp_dir is provided then it will contain the migrated files; if not, then a temporary directory is created to hold them, and the caller is responsible for deleting it.

Returns

the migrated files, and the value of tmp_dir

Return type

tuple of list, str

clean_up()[source]

Delete the temp dirs used by this DataSchemaMigration’s git repos

generate_migration_spec(data_file, schema_changes)[source]

Generate a MigrationSpec from a sequence of schema changes

The migration will migrate data_file in place.

Parameters
  • data_file (str) – the existing data file that will be migrated

  • schema_changes (list of SchemaChanges) – a sequence of schema changes instances

Returns

a MigrationSpec that specifies a migration of the file through the sequence of schema changes

Return type

MigrationSpec

Raises

MigratorError – if the MigrationSpec that’s created doesn’t validate

get_data_file_git_commit_hash(data_file)[source]

Get the git commit hash of the schema repo that describes a data file

Parameters

data_file (str) – pathname of a data file

Returns

the hash

Return type

str

Raises

MigratorError – if data_file does not contain a schema repo metadata model

get_name()[source]

Make a timestamped name for a data-schema migration config file

Returns

the name

Return type

str

Raises

MigratorError – if either the data or the schema git repo are not initialized

static get_name_static(data_repo_name, schema_repo_name)[source]

Make a timestamped name for a data-schema migration config file

Parameters
  • data_repo_name (str) – name of the data repo

  • schema_repo_name (str) – name of the schema repo

Returns

the name

Return type

str
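
One way to build such a timestamped name is shown below. The file name pattern, the timestamp format, and the optional now argument are assumptions made for this sketch; the library's actual naming scheme may differ.

```python
from datetime import datetime


def make_config_name_sketch(data_repo_name, schema_repo_name, now=None):
    """Make a timestamped file name for a data-schema migration config file."""
    now = now or datetime.now()
    timestamp = now.strftime('%Y-%m-%d-%H-%M-%S')  # assumed timestamp format
    return 'data_schema_migration_conf--{}--{}--{}.yaml'.format(
        data_repo_name, schema_repo_name, timestamp)


print(make_config_name_sketch('wc_sim', 'wc_lang'))
```

Embedding both repo names and a timestamp keeps config files for different schema repos, or for migrations created at different times, from colliding in the data repo's migrations directory.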

get_schema_package()[source]

Obtain the name of the schema package from the schema file

Returns

the package name

Return type

str

import_custom_IO_classes(io_classes_file_basename=None)[source]

If the schema repo has an IO classes file, import custom IO classes for accessing data files

Parameters

io_classes_file_basename (str, optional) – custom basename for the custom IO classes file, which overrides CUSTOM_IO_CLASSES_FILE

Returns

map from ‘reader’ and/or ‘writer’ to IO classes, or None if a custom IO classes file doesn’t exist

Return type

dict or None

Raises

MigratorError – if the custom IO classes file cannot be imported, or if schema_repo_dir is deleted from sys.path while importing it, or if neither Reader nor Writer is defined in it

static load_config_file(data_schema_migration_conf_file)[source]

Load a data-schema migration config file

Parameters

data_schema_migration_conf_file (str) – path to the data-schema migration config file

Returns

the data in the data-schema migration config file

Return type

dict

Raises

MigratorError – if the data-schema migration config file cannot be found, or is not proper YAML, or does not have the right format, or does not contain any data

static make_data_schema_migration_conf_file_cmd(data_repo_dir, schema_file_url, files_to_migrate, add_to_repo=True)[source]

Make a data-schema migration configuration file from CLI input

Parameters
  • data_repo_dir (str) – directory of the data repo

  • schema_file_url (str) – URL for schema’s Python file

  • files_to_migrate (list of str) – data files to migrate

  • add_to_repo (bool, optional) – if set, add the migration config file to the data repo; default = True

Returns

pathname of the data-schema migration config file that was written

Return type

str

Raises

MigratorError – if data_repo_dir isn’t the directory of a repo, or schema_file_url isn’t the URL of a schema file, or files_to_migrate aren’t files

static make_migration_config_file(data_git_repo, schema_repo_name, add_to_repo=True, **kwargs)[source]

Create a data-schema migration config file

Parameters
  • data_git_repo (GitRepo) – the data git repo that contains the data files to migrate

  • schema_repo_name (str) – name of the schema repo

  • add_to_repo (bool, optional) – if set, add the migration config file to the data repo; default = True

  • kwargs (dict) – optional initial values for data-schema migration config file

Returns

the pathname to the data-schema migration config file that was written

Return type

str

Raises

MigratorError – if the data-schema migration configuration file already exists

static migrate_files(schema_url, local_dir, data_files)[source]

Migrate some data files specified by command line input

Migrate data files in place in a local repository.

Parameters
  • schema_url (str) – URL of the schema’s Python file

  • local_dir (str) – directory in a local data repo that contains the data files

  • data_files (list of str) – data files to migrate

Returns

list of pathnames of migrated files

Return type

list of str

Raises

MigratorError – if schema_url isn’t in the right form, or local_dir isn’t a directory, or any of the data files cannot be found, or the migration fails

prepare()[source]

Prepare for migration

  • Validate this DataSchemaMigration

  • Clone each schema version specified by a schema change

  • Generate and prepare MigrationSpec instances for the migration, one for each file

Raises

MigratorError – if the DataSchemaMigration doesn’t validate

record_git_repo(git_repo)[source]

Record a new GitRepo so that its temp dir can be deleted later

Parameters

git_repo (GitRepo) – a git repo

schema_changes_for_data_file(data_file)[source]

Generate a sequence of SchemaChanges for migrating a data file

Parameters

data_file (str) – a data file in the data git repo

Returns

a sequence of SchemaChanges for migrating a data file

Return type

list

test_migration()[source]

Test a migration

Check …

The trickiest part of a migration is importing the schema. Unfortunately, imports that work with Python’s import statement may fail under migration’s import, which uses the library call importlib.import_module. This method should be called whenever a schema that may be migrated is changed.

This method reports:

  • any validation errors in automatic config files

  • any validation errors in schema changes files

  • any errors in transformations

  • any failures to import schemas

It does not alter any files.

Parameters

data_repo_location (str) – directory or URL of the data repo

validate()[source]

Validate files to migrate, and load all schema changes files

Raises

MigratorError – if any files to migrate do not exist, or all schema changes files cannot be loaded

verify_schemas()[source]

Verify that each schema can be independently imported

It can be difficult to import a schema via importlib.import_module() in import_module_for_migration(). This method tests that proactively.

Returns

all errors obtained

Return type

list of str
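
The idea of proactively testing imports and collecting the failures as strings can be sketched with importlib alone. The helper below imports modules by name from sys.path; it is an illustration, not the library's import_module_for_migration(), which must also deal with schema files checked out at specific commits.

```python
import importlib


def verify_imports_sketch(module_names):
    """Try to import each module; return one error string per failure."""
    errors = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # report all failures instead of stopping at the first
            errors.append("cannot import '{}': {}: {}".format(
                name, type(exc).__name__, exc))
    return errors


# 'json' is in the standard library; the second name is a deliberately missing module
print(verify_imports_sketch(['json', 'no_such_schema_module_xyz']))
```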

class obj_tables.migrate.GitRepo(repo_location=None, repo_url=None, branch='master', search_parent_directories=False)[source]

Bases: object

Methods for processing a git repo and its commit history

repo_dir[source]

the repo’s root directory

Type

str

repo_url[source]

the repo’s URL, if provided

Type

str

branch[source]

the repo’s branch, if it was cloned

Type

str

repo[source]

the GitPython repo

Type

git.Repo

commit_DAG[source]

NetworkX DAG of the repo’s commit history

Type

nx.classes.digraph.DiGraph

git_hash_map[source]

map from each git hash in the repo to its commit

Type

dict

temp_dirs[source]

temporary directories that hold repo clones

Type

list of str

Initialize a GitRepo from an existing Git repo

If repo_location is a directory then use the Git repo in the directory. Otherwise it must be a URL and the repo is cloned into a temporary directory.

Parameters
  • repo_location (str, optional) – the location of the repo, either its directory or its URL

  • repo_url (str, optional) – the repo’s original URL, which will be set in all copies of a repo that was initially cloned by clone_repo_from_url

  • branch (str, optional) – branch to clone if repo_location is a URL; default is master

  • search_parent_directories (bool, optional) – search_parent_directories option to git.Repo; if set and repo_location is a directory, then all of its parent directories will be searched for a valid repo; default=False

Returns

root directory for the repo (which contains the .git directory)

Return type

str

EMPTY_SUBDIR = 'empty_subdir'[source]
__str__()[source]

Provide a string representation

Returns

a string representation of this GitRepo

Return type

str

add_file(filename)[source]

Add a file to the index

Parameters

filename (str) – path to new file

Raises

MigratorError – if the file cannot be added

checkout_commit(commit_identifier)[source]

Checkout a commit for this repo

Use checkout_commit carefully. If it checks out a new commit, then other operations on the repo may behave differently.

Parameters

commit_identifier (git.objects.commit.Commit or str) – a commit or a commit’s hash

Raises

MigratorError – if the commit cannot be checked out

clone_repo_from_url(url, branch='master', directory=None)[source]

Clone a repo from a URL

Parameters
  • url (str) – URL for the repo

  • branch (str, optional) – branch to clone; default is master

  • directory (str, optional) – directory to hold the repo; if not provided, the repo is stored in a new temporary dir

Returns

(git.Repo, str): the repo cloned, and its root directory

Return type

tuple

Raises

MigratorError – if repo cannot be cloned from url

commit_changes(message)[source]

Commit the changes in this repo

Parameters

message (str) – the commit message

Raises

MigratorError – if the changes cannot be committed

commits_as_graph()[source]

Make a DAG for this repo’s commit dependencies - edges point from dependent commit to parent commit

The DAG contains all commits in the repo on which the latest commit depends. Also creates git_hash_map, a map from all git hashes to their commits.

Returns

a DAG representing the repo commit history

Return type

nx.classes.digraph.DiGraph

commits_in_dependency_consistent_seq(commits)[source]

Order some git commits into a sequence that’s consistent with the repo’s dependencies

Generate a topological sort of the commits. If the dependency relationship contains branching, then the sequence found is not deterministic, because concurrent nodes can appear in any relative order in the sort. E.g., in a commit DAG with the paths a -> b -> c and a -> d -> c, nodes b and d can appear in either order in the sequence.

Parameters

commits (list of git.objects.commit.Commit) – commits to include in the returned sequence

Returns

the elements of commits, in a sequence that’s consistent with git commit dependencies in the repository, ordered from antecedent to dependent

Return type

list of git.objects.commit.Commit
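The ordering described above can be illustrated with the standard library’s graphlib module, used here in place of the NetworkX DAG that GitRepo keeps. This is only a sketch: single letters stand in for commits, and each key maps a commit to the commits it depends on, matching the example paths a -> b -> c and a -> d -> c.

```python
from graphlib import TopologicalSorter

# Each key maps a commit to the commits it depends on (its parents),
# so edges point from dependent commit to parent commit.
parents = {
    'a': {'b', 'd'},
    'b': {'c'},
    'd': {'c'},
}

# static_order() yields a node only after all of the nodes it depends on,
# which gives a sequence ordered from antecedent to dependent.
sequence = list(TopologicalSorter(parents).static_order())
print(sequence)
```

The sequence always starts with c and ends with a, while b and d may appear in either order between them, which is the non-determinism the description warns about.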

copy(tmp_dir=None)[source]

Copy this GitRepo into a new directory

For better performance, use copy() instead of GitRepo() or clone_repo_from_url() if you need multiple copies of a repo, such as multiple instances checked out to different commits. This is an optimization because copying is faster than cloning over the network. To avoid “bytecode is stale” errors, copy() does not copy __pycache__ directories.

Parameters

tmp_dir (str, optional) – directory to hold the repo; if not provided, a new temporary directory is created to store the repo

Returns

a new GitRepo that’s a copy of self in a new temporary directory

Return type

GitRepo

del_temp_dirs()[source]

Delete the temp dirs created by get_temp_dir

Returns

the pathname to a temporary directory

Return type

str

fixtures_dir()[source]

Get the repo’s fixtures directory

Returns

the repo’s fixtures directory

Return type

str

get_commit(commit_or_hash)[source]

Obtain a commit from its hash

Also, if commit_or_hash is a commit, simply return it.

Parameters

commit_or_hash (str or git.objects.commit.Commit) – the hash of a commit or a commit in the repo

Returns

a commit

Return type

git.objects.commit.Commit

Raises

MigratorError – if commit_or_hash is not a commit and cannot be converted into one

get_commits(commits_or_hashes)[source]

Get the commits with the given commits or hashes

Parameters

commits_or_hashes (list of str) – an iterator over commits or commit hashes

Returns

list of the repo’s commits with the commits or hashes in commits_or_hashes

Return type

list of git.objects.commit.Commit

Raises

MigratorError – if any commit or hash in commits_or_hashes doesn’t identify a commit in this repo

get_dependents(commit_or_hash)[source]

Get all commits that depend on a commit, including transitively

Parameters

commit_or_hash (str or git.objects.commit.Commit) – the hash of a commit or a commit in the repo

Returns

all commits that depend on commit_or_hash

Return type

set of git.objects.commit.Commit
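A minimal sketch of collecting transitive dependents, assuming the commit history is available as a plain dict that maps each commit to its parents. The function name dependents_of and the letter-named commits are hypothetical; GitRepo itself works on its NetworkX commit_DAG.

```python
from collections import deque


def dependents_of(parents, commit):
    """Return every commit that depends on `commit`, directly or transitively."""
    # Invert the dependent -> parents map into a parent -> dependents map
    children = {}
    for dependent, its_parents in parents.items():
        for parent in its_parents:
            children.setdefault(parent, set()).add(dependent)

    # Breadth-first search from `commit` along the inverted edges
    found = set()
    queue = deque([commit])
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in found:
                found.add(child)
                queue.append(child)
    return found


parents = {'a': {'b', 'd'}, 'b': {'c'}, 'd': {'c'}, 'c': set()}
dependents_of_c = dependents_of(parents, 'c')
dependents_of_b = dependents_of(parents, 'b')
print(dependents_of_c, dependents_of_b)
```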

static get_hash(commit)[source]

Get a commit’s hash

Parameters

commit (git.objects.commit.Commit) – a commit

Returns

the commit’s SHA1 hash

Return type

str

get_metadata(metadata_type)[source]

Create a metadata model that describes the current state of this repo

Parameters

metadata_type (class) – a subclass of RepoMetadata, either DataRepoMetadata or SchemaRepoMetadata

Returns

a RepoMetadata object

Return type

RepoMetadata

Raises

MigratorError – if the metadata isn’t available in this repo

get_temp_dir()[source]

Get a temporary directory, which must eventually be deleted by calling del_temp_dirs

Returns

the pathname to a temporary directory

Return type

str

static hash_prefix(hash)[source]

Get a commit hash’s prefix

Parameters

hash (str) – git commit hash

Returns

hash’s prefix

Return type

str

head_commit()[source]

Get the repo’s head commit

Returns

the repo’s latest commit

Return type

git.objects.commit.Commit

latest_hash()[source]

Get the hash of the repo’s head commit

Returns

the head commit’s SHA1 hash

Return type

str

migrations_dir()[source]

Get the repo’s migrations directory

Returns

the repo’s migrations directory

Return type

str

repo_name()[source]

Get the repo’s name

Returns

the repo’s name

Return type

str

exception obj_tables.migrate.MigrateWarning[source]

Bases: UserWarning

Migrate warning

class obj_tables.migrate.MigrationController[source]

Bases: object

Manage migrations

Manage migrations on several dimensions:

  • Migrate a single model file through a sequence of schemas

  • Perform migrations parameterized by a configuration file

static migrate_from_config(migrations_spec_config_file)[source]

Perform the migrations specified in a migrations spec config file

Parameters

migrations_spec_config_file (str) – a migrations spec configuration file, written in YAML

Returns

list of (MigrationSpec, migrated filenames) pairs

Return type

list of tuple

static migrate_from_spec(migration_spec)[source]

Perform the migration specified in a MigrationSpec

Parameters

migration_spec (MigrationSpec) – a migration specification

Returns

migrated filenames

Return type

list of str

static migrate_over_schema_sequence(migration_spec)[source]

Migrate some model files over a sequence of schemas

Parameters

migration_spec (MigrationSpec) – a migration specification

Returns

for each migration, its sequence of models and its migrated filename

Return type

tuple of list, list

Raises

MigratorError – if schema_files, renamed_models, and seq_of_renamed_attributes are not consistent with each other

class obj_tables.migrate.MigrationSpec(name, **kwargs)[source]

Bases: object

Specification of a sequence of migrations for a list of existing files

_REQUIRED_ATTRS[source]

required attributes in a MigrationSpec

Type

list of str

_CHANGES_LISTS[source]

lists of changes in a migration

Type

list of str

_ALLOWED_ATTRS[source]

attributes allowed in a MigrationSpec

Type

list of str

name[source]

name for this MigrationSpec

Type

str

existing_files

existing files to migrate

Type

list of str, optional

schema_files[source]

list of Python files containing model definitions for each state in a sequence of migrations

Type

list of str, optional

seq_of_renamed_models[source]

list of renamed models for use by a Migrator for each migration in a sequence of migrations

Type

list of list, optional

seq_of_renamed_attributes[source]

list of renamed attributes for use by a Migrator for each migration in a sequence

Type

list of list, optional

seq_of_transformations[source]

list of transformations for use by a Migrator for each migration in a sequence

Type

list of MigrationWrapper, optional

migrated_files

migration destination files in 1-to-1 correspondence with existing_files; if not provided, migrated files use a suffix or are migrated in place

Type

list of str, optional

io_classes[source]

reader and/or writer for I/O of existing and migrated files, respectively; IMPORTANT NOTE: io_classes are imported and loaded from current schema repositories by DataSchemaMigration

Type

dict of type, optional

migrate_suffix[source]

suffix added to destination file name, before the file type suffix

Type

str, optional

migrate_in_place[source]

whether to migrate in place

Type

bool, optional

migrations_config_file[source]

if created from a configuration file, the file’s path

Type

str, optional

final_schema_branch[source]

branch of the final schema repo

Type

str, optional

final_schema_url[source]

base url of the final schema repo

Type

str, optional

final_schema_hash[source]

hash of the head commit of the final schema repo

Type

str, optional

final_schema_git_metadata[source]

a git metadata model for the repo containing the last schema in the migration; may be initialized directly, or constructed from the other final_schema_* attributes

Type

SchemaRepoMetadata, optional

_prepared[source]

whether this MigrationSpec has been prepared

Type

bool, optional

__str__()[source]

Get str representation

Returns

string representation of all allowed attributes in a MigrationSpec

Return type

str

expected_migrated_files()[source]

Provide names of migrated files that migration of this MigrationSpec would produce

Returns

the names of the migrated files that a successful migration of this MigrationSpec will produce

Return type

list of str

static get_migrations_config(migrations_config_file)[source]

Create a list of MigrationSpecs from a migrations configuration file

Parameters

migrations_config_file (str) – pathname of migrations configuration in YAML file

Returns

migration specifications

Return type

dict of MigrationSpec

Raises

MigratorError – if migrations_config_file cannot be read

is_prepared()[source]

Check that this MigrationSpec has been prepared

Raises

MigratorError – if this MigrationSpec has not been prepared

classmethod load(migrations_config_file)[source]

Create a list of validated and standardized MigrationSpecs from a migrations configuration file

Parameters

migrations_config_file (str) – pathname of migrations configuration in YAML file

Returns

migration specifications

Return type

dict of MigrationSpec

Raises

MigratorError – if migrations_config_file cannot be read, or the migration specifications in migrations_config_file are not valid

prepare()[source]

Validate and standardize this MigrationSpec

Raises

MigratorError – if migrations_config_file cannot be read, or the migration specifications in migrations_config_file are not valid

standardize()[source]

Standardize the attributes of a MigrationSpec

In particular, standardize a MigrationSpec that has been read from a YAML config file

validate()[source]

Validate the attributes of a migration specification

Returns

list of errors found

Return type

list of str
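The sketch below shows the general shape of a validator that returns a list of error strings rather than raising. It operates on a plain dict that mimics a MigrationSpec. The specific checks are assumptions inferred from the attribute descriptions above (for example, that n schema files imply n - 1 migrations); they are not the checks that MigrationSpec.validate() actually performs.

```python
def validate_spec(spec):
    """Return a list of error strings for a dict that mimics a MigrationSpec."""
    errors = []
    existing_files = spec.get('existing_files') or []
    migrated_files = spec.get('migrated_files')
    schema_files = spec.get('schema_files') or []

    # Assumption: a migration needs at least a starting and an ending schema
    if len(schema_files) < 2:
        errors.append('schema_files must name at least 2 schemas')

    # Assumption: n schema states imply n - 1 migrations
    num_migrations = max(len(schema_files) - 1, 0)
    for attr in ('seq_of_renamed_models', 'seq_of_renamed_attributes'):
        value = spec.get(attr)
        if value is not None and len(value) != num_migrations:
            errors.append('{} must have {} entries, but has {}'.format(
                attr, num_migrations, len(value)))

    # migrated_files is documented as 1-to-1 with existing_files
    if migrated_files is not None and len(migrated_files) != len(existing_files):
        errors.append('migrated_files must correspond 1-to-1 with existing_files')
    return errors


# Hypothetical specs: the first has two problems, the second has none
bad_spec = {
    'existing_files': ['a.xlsx', 'b.xlsx'],
    'migrated_files': ['a_new.xlsx'],
    'schema_files': ['s1.py', 's2.py', 's3.py'],
    'seq_of_renamed_models': [[('A', 'B')]],
}
good_spec = {
    'existing_files': ['a.xlsx'],
    'schema_files': ['s1.py', 's2.py'],
    'seq_of_renamed_models': [[('A', 'B')]],
}
bad_errors = validate_spec(bad_spec)
good_errors = validate_spec(good_spec)
print(bad_errors)
```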

class obj_tables.migrate.MigrationWrapper[source]

Bases: abc.ABC

Interface for classes that define a pair of methods that can modify obj_tables.Models being migrated

MigrationWrapper defines the interface used by transformations. If it’s defined, transformations uses prepare_existing_models to modify existing models just before they are migrated and uses modify_migrated_models to modify migrated models just after they are migrated.

A transformations object is associated with each SchemaChanges, and is obtained from the transformations file configured in a schema changes file.

static import_migration_wrapper(migration_wrapper_file)[source]

Import the migration wrapper instances defined in a Python file

Parameters

migration_wrapper_file (str) – name of a file that defines migration wrappers

Returns

the MigrationWrapper instances in migration_wrapper_file, keyed by their attribute names

Return type

dict

Raises

MigratorError – if migration_wrapper_file cannot be imported

abstract modify_migrated_models(migrator, migrated_models)[source]

Modify migrated models after migration

Parameters
  • migrator (Migrator) – the Migrator calling this method

  • migrated_models (list of obj_tables.Model) – all models that have been migrated

abstract prepare_existing_models(migrator, existing_models)[source]

Prepare existing models before migration

Parameters
  • migrator (Migrator) – the Migrator calling this method

  • existing_models (list of obj_tables.Model) – the models that will be migrated
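The two-method interface can be sketched with a stand-in abstract base class. Everything here is illustrative: WrapperInterface and StripNames are hypothetical names, plain dicts stand in for obj_tables.Model instances, and None is passed where a Migrator would normally be supplied.

```python
from abc import ABC, abstractmethod


class WrapperInterface(ABC):
    """Stand-in for the MigrationWrapper interface (hypothetical name)."""

    @abstractmethod
    def prepare_existing_models(self, migrator, existing_models):
        """Modify existing models just before they are migrated."""

    @abstractmethod
    def modify_migrated_models(self, migrator, migrated_models):
        """Modify migrated models just after they are migrated."""


class StripNames(WrapperInterface):
    """Example transformations: tidy a 'name' field before, tag models after."""

    def prepare_existing_models(self, migrator, existing_models):
        for model in existing_models:
            model['name'] = model['name'].strip()

    def modify_migrated_models(self, migrator, migrated_models):
        for model in migrated_models:
            model['migrated'] = True


# Plain dicts stand in for obj_tables.Model instances
models = [{'name': ' gene_1 '}, {'name': 'gene_2'}]
transformations = StripNames()
transformations.prepare_existing_models(None, models)
migrated = [dict(m) for m in models]  # placeholder for the actual migration step
transformations.modify_migrated_models(None, migrated)
print(migrated)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, so an incomplete transformations class fails early.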

class obj_tables.migrate.Migrator(existing_defs_file=None, migrated_defs_file=None, renamed_models=None, renamed_attributes=None, io_classes=None, transformations=None)[source]

Bases: object

Support schema migration

existing_schema[source]

the existing schema, and its properties

Type

SchemaModule

migrated_schema[source]

the migrated schema, and its properties

Type

SchemaModule

existing_defs[source]

obj_tables.Model definitions of the existing models, keyed by name

Type

dict

migrated_defs[source]

obj_tables.Model definitions of the migrated models, keyed by name

Type

dict

deleted_models[source]

model types defined in the existing models but not the migrated models

Type

set

renamed_models[source]

model types renamed from the existing to the migrated schema

Type

list of tuple

models_map[source]

map from existing model names to migrated model names

Type

dict

renamed_attributes[source]

attribute names renamed from the existing to the migrated schema

Type

list of tuple

renamed_attributes_map[source]

map of attribute names renamed from the existing to the migrated schema

Type

dict

_migrated_copy_attr_name[source]

attribute name used to point existing models to corresponding migrated models; not used in any existing schema

Type

str

io_classes[source]

reader and writer for I/O of existing and migrated files, respectively; defaults provided in DEFAULT_IO_CLASSES

Type

dict of type

transformations[source]

optional transformations which modify models before and/or after migration

Type

MigrationWrapper

Construct a Migrator

If it’s defined, transformations uses its prepare_existing_models method to modify existing models just before they are migrated and uses its modify_migrated_models method to modify migrated models just after they are migrated. A different transformations is associated with each granular step in a sequence of migrations (see SchemaChanges below) so that different transformations can wrap migration in each step.

Parameters
  • existing_defs_file (str, optional) – path of a file containing existing Model definitions

  • migrated_defs_file (str, optional) – path of a file containing migrated Model definitions; filenames optional so that Migrator can use models defined in memory

  • renamed_models (list of tuple, optional) – model types renamed from the existing to the migrated schema; has the form [(‘Existing_1’, ‘Migrated_1’), …, (‘Existing_n’, ‘Migrated_n’)], where (‘Existing_i’, ‘Migrated_i’) indicates that existing model Existing_i is being renamed into migrated model Migrated_i.

  • renamed_attributes (list of tuple, optional) – attribute names renamed from the existing to the migrated schema; a list of tuples of the form ((‘Existing_Model_i’, ‘Existing_Attr_x’), (‘Migrated_Model_j’, ‘Migrated_Attr_y’)), which indicates that Existing_Model_i.Existing_Attr_x will migrate to Migrated_Model_j.Migrated_Attr_y

  • io_classes (dict of type, optional) – reader and/or writer for I/O of existing and migrated files, respectively; if provided, overrides defaults provided in DEFAULT_IO_CLASSES

  • transformations (MigrationWrapper, optional) – transformations which modify models before and/or after migration
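To make the shapes of renamed_models and renamed_attributes concrete, the sketch below builds both lists with hypothetical model and attribute names and converts them into lookup maps. The consistency check at the end is only one plausible validation and is an assumption, not necessarily what Migrator.prepare() does. The maps here cover only renamed entries.

```python
# Hypothetical renamings, in the documented forms
renamed_models = [('Test', 'Experiment'), ('Ref', 'Reference')]
renamed_attributes = [
    (('Test', 'id'), ('Experiment', 'identifier')),
    (('Ref', 'pmid'), ('Reference', 'pubmed_id')),
]

# dict() accepts a list of 2-tuples, so both lists convert directly into lookup maps
models_map = dict(renamed_models)
renamed_attributes_map = dict(renamed_attributes)

# Assumed check: an attribute's migrated model should agree with the model renaming
inconsistent = [
    (existing, migrated)
    for existing, migrated in renamed_attributes
    if models_map.get(existing[0], existing[0]) != migrated[0]
]

print(models_map['Test'])
print(renamed_attributes_map[('Ref', 'pmid')])
print(inconsistent)
```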

COLLECTIONS_ATTRS = ['existing_defs', 'migrated_defs', 'renamed_models', 'models_map', 'renamed_attributes', 'renamed_attributes_map'][source]
DEFAULT_IO_CLASSES = {'reader': <class 'obj_tables.io.Reader'>, 'writer': <class 'obj_tables.io.Writer'>}[source]
MIGRATED_COPY_ATTR_PREFIX = '__migrated_copy'[source]
MIGRATE_SUFFIX = '_migrated'[source]
PARSED_EXPR = '_parsed_expression'[source]
SCALAR_ATTRS = ['deleted_models', '_migrated_copy_attr_name'][source]
__str__()[source]

Get string representation

Returns

string representation of a Migrator; collections attributes are rendered by pformat

Return type

str

full_migrate(existing_file, migrated_file=None, migrate_suffix=None, migrate_in_place=False)[source]

Migrate data from an existing file to a migrated file

Parameters
  • existing_file (str) – pathname of file to migrate

  • migrated_file (str, optional) – pathname of migrated file; if not provided, save migrated file with migrated suffix in same directory as existing file

  • migrate_suffix (str, optional) – suffix of automatically created migrated filename; default is Migrator.MIGRATE_SUFFIX

  • migrate_in_place (bool, optional) – if set, overwrite existing_file with the migrated file and ignore migrated_file and migrate_suffix

Returns

name of migrated file

Return type

str

Raises

MigratorError – if migrate_in_place is False and writing the migrated file would overwrite an existing file

migrate(existing_models)[source]

Migrate existing model instances to the migrated schema

Parameters

existing_models (list of obj_tables.Model) – the models being migrated

Returns

the migrated models

Return type

list of obj_tables.Model

modules = {}[source]
static path_of_migrated_file(existing_file, migrate_suffix=None, migrate_in_place=False)[source]

Determine the pathname of the migrated file

Parameters
  • existing_file (str) – pathname of file being migrated

  • migrate_suffix (str, optional) – suffix of automatically created migrated filename; default is Migrator.MIGRATE_SUFFIX

  • migrate_in_place (bool, optional) – if set, migrated file is existing_file, which will be overwritten

Returns

name of migrated file

Return type

str
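A minimal sketch of how a migrated pathname could be derived, assuming the suffix is inserted before the file type suffix as described for migrate_suffix. The function name is hypothetical, and the default suffix is the documented value of Migrator.MIGRATE_SUFFIX.

```python
import os

MIGRATE_SUFFIX = '_migrated'  # documented value of Migrator.MIGRATE_SUFFIX


def sketch_path_of_migrated_file(existing_file, migrate_suffix=None, migrate_in_place=False):
    """Return the pathname that a migrated copy of `existing_file` would receive."""
    if migrate_in_place:
        # The migrated file overwrites the existing file
        return existing_file
    if migrate_suffix is None:
        migrate_suffix = MIGRATE_SUFFIX
    # Insert the suffix between the file's root name and its extension
    root, extension = os.path.splitext(existing_file)
    return root + migrate_suffix + extension


default_path = sketch_path_of_migrated_file('data/model.xlsx')
custom_path = sketch_path_of_migrated_file('data/model.xlsx', migrate_suffix='_v2')
in_place_path = sketch_path_of_migrated_file('data/model.xlsx', migrate_in_place=True)
print(default_path, custom_path, in_place_path)
```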

prepare()[source]

Prepare for migration

Raises

MigratorError – if renamings are not valid, or inconsistencies exist between corresponding existing and migrated classes

read_existing_file(existing_file)[source]

Read models from existing file

Does not perform validation – data in existing model file must be already validated with the existing schema

Parameters

existing_file (str) – pathname of file to migrate

Returns

the models in existing_file

Return type

list of obj_tables.Model

run(files)[source]

Migrate some files

Parameters

files (list) – names of model files to migrate

write_migrated_file(migrated_models, model_order, existing_file, migrated_file=None, migrate_suffix=None, migrate_in_place=False)[source]

Write migrated models to an external representation

Does not perform validation – validation must be performed independently.

Parameters
  • migrated_models (list of obj_tables.Model) – the migrated models

  • model_order (list of obj_tables.core.ModelMeta) – migrated models in the order they should appear in a workbook

  • existing_file (str) – pathname of file that is being migrated

  • migrated_file (str, optional) – pathname of migrated file; if not provided, save migrated file with migrated suffix in same directory as source file

  • migrate_suffix (str, optional) – suffix of automatically created migrated filename; default is Migrator.MIGRATE_SUFFIX

  • migrate_in_place (bool, optional) – if set, overwrite existing_file with the migrated file and ignore migrated_file and migrate_suffix

Returns

name of migrated file

Return type

str

Raises

MigratorError – if migrate_in_place is False and writing the migrated file would overwrite an existing file

exception obj_tables.migrate.MigratorError(message=None)[source]

Bases: Exception

Exception raised for errors in obj_tables.migrate

message[source]

the exception’s message

Type

str

class obj_tables.migrate.SchemaChanges(schema_repo=None, schema_changes_file=None, commit_hash=None, renamed_models=None, renamed_attributes=None, transformations_file=None)[source]

Bases: object

Specification of the changes to a schema in a git commit

More generally, a SchemaChanges should encode the set of changes to a schema over the sequence of git commits since the previous SchemaChanges.

_CHANGES_FILE_ATTRS[source]

required attributes in a schema changes file

Type

list of str

_ATTRIBUTES[source]

attributes in a SchemaChanges instance

Type

list of str

schema_repo[source]

a Git repo that defines the data model (schema) of the data being migrated

Type

GitRepo

schema_changes_file[source]

the schema changes file

Type

str

commit_hash[source]

hash of a sentinel commit from a schema changes file

Type

str

renamed_models[source]

list of renamed models in the commit

Type

list, optional

renamed_attributes[source]

list of renamed attributes in the commit

Type

list, optional

transformations_file[source]

the name of a Python file containing transformations

Type

str, optional

transformations[source]

a wrapper that transforms models during migrations

Type

MigrationWrapper, optional

__str__()[source]

Provide a string representation

Returns

a string representation of this SchemaChanges

Return type

str

static all_schema_changes_files(migrations_directory)[source]

Find all schema changes files in a git repo

Parameters

migrations_directory (str) – path to the migrations directory in a git repo

Returns

pathnames of the schema changes files

Return type

list of str

Raises

MigratorError – if no schema changes files are found

static all_schema_changes_with_commits(schema_repo)[source]

Instantiate all schema changes in a git repo

Obtain all validated schema change files.

Parameters

schema_repo (GitRepo) – an initialized repo for the schema

Returns

a list of errors, and a list of all validated schema changes files

Return type

tuple

static find_file(schema_repo, commit_hash)[source]

Find a schema changes file in a git repo

Parameters
  • schema_repo (GitRepo) – an initialized repo for the schema

  • commit_hash (str) – hash of a sentinel commit

Returns

the pathname of the file found

Return type

str

Raises

MigratorError – if a file with the hash cannot be found, or multiple files have the hash

generate_filename(commit_hash)[source]

Generate a filename for a template schema changes file

Returns

the filename

Return type

str

static generate_instance(schema_changes_file)[source]

Generate a SchemaChanges instance from a schema changes file

Parameters

schema_changes_file (str) – path to the schema changes file

Returns

the SchemaChanges instance

Return type

SchemaChanges

static get_date_timestamp()[source]

Get a current date timestamp, with second resolution

Returns

the timestamp

Return type

str

get_hash()[source]

Get the repo’s current commit hash

Returns

the hash

Return type

str

static hash_prefix_from_sc_file(schema_changes_file)[source]

Get the hash prefix from a schema changes filename

Parameters

schema_changes_file (str) – the schema changes file

Returns

the hash prefix in a schema changes filename

Return type

str

static load(schema_changes_file)[source]

Read a schema changes file

Parameters

schema_changes_file (str) – path to the schema changes file

Returns

the data in the schema changes file

Return type

dict

Raises

MigratorError – if the schema changes file cannot be found, or is not proper YAML, or does not have the right format, or does not contain any changes

load_transformations()[source]

Load a transformations wrapper if a transformations file is configured

Returns

the transformations wrapper

Return type

MigrationWrapper

make_template(schema_url=None, commit_hash=None)[source]

Make a template schema changes file

The template includes the repo hash which it describes and empty values for SchemaChanges attributes.

Parameters
  • schema_url (str, optional) – URL of the schema repo; if not provided, self.schema_repo must be already initialized

  • commit_hash (str, optional) – hash of the sentinel commit in the schema repo which the template schema changes file identifies; default is the most recent commit

Returns

pathname of the schema changes file that was written

Return type

str

Raises

MigratorError – if a repo cannot be cloned from schema_url, or checked out from commit_hash, or the schema changes file already exists

static make_template_command(schema_dir, commit_hash=None)[source]

Make a template schema changes file with CLI input

Parameters
  • schema_dir (str) – directory of the schema repo

  • commit_hash (str, optional) – hash of the sentinel commit in the schema repo which the template schema changes file identifies; default is the most recent commit

Returns

pathname of the schema changes file that was written

Return type

str

Raises

MigratorError – if a repo cannot be cloned from schema_url, or checked out from commit_hash, or the schema changes file already exists

static validate(schema_changes_kwargs)[source]

Check that the attributes of the arguments to SchemaChanges have the right structure

Parameters

schema_changes_kwargs (dict) – kwargs arguments to SchemaChanges generated by loading a schema changes file

Returns

errors in schema_changes_kwargs

Return type

list

class obj_tables.migrate.SchemaModule(module_path, dir=None)[source]

Bases: object

Represent and import a schema module

module_path[source]

path to the module

Type

str

abs_module_path[source]

absolute path to the module

Type

str

directory[source]

if the module is in a package, the path to the package’s directory; otherwise the directory containing the module

Type

str

package_name[source]

if the module is in a package, the name of the package containing the module; otherwise None

Type

str

module_name[source]

the module’s module name

Type

str

Initialize a SchemaModule

Parameters
  • module_path (str) – path to the module

  • dir (str, optional) – a directory that contains self.module_path

MODULES = {}[source]
MUNGED_MODEL_NAME_SUFFIX = '_MUNGED WITH SPACES'[source]
get_path()[source]
import_module_for_migration(validate=True, required_attrs=None, debug=False, mod_patterns=None, print_code=False)[source]

Import a schema from a Python module in a file, which may be in a package

Parameters
  • validate (bool, optional) – whether to validate the module; default is True

  • required_attrs (list of str, optional) – list of attributes that must be present in the imported module

  • debug (bool, optional) – if True, print debugging output; default is False

  • mod_patterns (list of str, optional) – RE patterns used to search for modules in sys.modules; modules whose names match a pattern are output when debug is True

  • print_code (bool, optional) – if True, while debugging print code being imported; default is False

Returns

the Module loaded from self.module_path

Return type

Module

Raises

MigratorError – if one of the following conditions is met:

  • the schema at self.module_path cannot be imported

  • validate is True and any related attribute in any model references a model not in the module

  • the module is missing a required attribute

in_package()[source]

Is the schema in a package

Returns

whether the schema is in a package

Return type

bool

static parse_module_path(module_path)[source]

Parse the path to a module

If the module is not in a package, provide its directory and module name. If the module is in a package, provide its directory, package name and module name. The directory can be used as a sys.path entry.

Parameters

module_path (str) – path of a Python module file

Returns

a triple containing directory, package name and module name, as described above. If the module is not in a package, then package name is None.

Return type

tuple

Raises

MigratorError – if module_path is not the name of a Python file, or is not a file
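One way to split a module path into a directory, package name and module name is to climb the directory tree while each directory contains an __init__.py file. The sketch below takes that approach. It is an assumption about how such parsing could work, not the library’s implementation: it raises ValueError instead of MigratorError, and it returns the bare module name rather than a dotted one.

```python
import os
import tempfile


def sketch_parse_module_path(module_path):
    """Return (directory, package_name, module_name) for a Python module file."""
    path = os.path.abspath(module_path)
    if not path.endswith('.py'):
        raise ValueError("'{}' is not a Python source file".format(module_path))
    if not os.path.isfile(path):
        raise ValueError("'{}' is not a file".format(module_path))

    directory, filename = os.path.split(path)
    module_name = filename[:-len('.py')]

    # Climb while the directory is a package, i.e. contains __init__.py
    package_parts = []
    while os.path.isfile(os.path.join(directory, '__init__.py')):
        parent, package = os.path.split(directory)
        if not package:  # reached the filesystem root
            break
        package_parts.insert(0, package)
        directory = parent

    package_name = '.'.join(package_parts) if package_parts else None
    return directory, package_name, module_name


# Hypothetical layout: tmp/pkg/__init__.py, tmp/pkg/core.py and tmp/solo.py
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'pkg'))
    for name in (os.path.join('pkg', '__init__.py'), os.path.join('pkg', 'core.py'), 'solo.py'):
        with open(os.path.join(tmp, name), 'w') as f:
            f.write('')
    tmp_dir = os.path.abspath(tmp)
    in_package = sketch_parse_module_path(os.path.join(tmp, 'pkg', 'core.py'))
    standalone = sketch_parse_module_path(os.path.join(tmp, 'solo.py'))

print(in_package[1:], standalone[1:])
```

In both cases the returned directory is one that could be added to sys.path so that the package, or the standalone module, becomes importable by name.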

run()[source]

Import a schema and provide its obj_tables.Models

Returns

the imported Models

Return type

dict

Raises

MigratorError – if self.module_path cannot be loaded

class obj_tables.migrate.Utils[source]

Bases: object

Utilities for migration

static find_schema_modules()[source]

Find the modules used by a schema

Useful for creating schema changes files for a schema repo

Returns

???

Return type

argparse.Namespace

class obj_tables.migrate.VirtualEnvUtil(name, dir=None)[source]

Bases: object

Support creation, use and destruction of virtual environments for Python packages

Will be used to allow different schema versions to depend on different package versions

name[source]

name of the VirtualEnvUtil

Type

str

Initialize a VirtualEnvUtil

Parameters
  • name (str) – name for the VirtualEnvUtil

  • dir (str, optional) – a directory to hold the VirtualEnvUtil

activate()[source]

Use this VirtualEnvUtil

deactivate()[source]

Stop using this VirtualEnvUtil

destroy()[source]

Destroy this VirtualEnvUtil

Destruction deletes the directory storing the VirtualEnvUtil

destroyed()[source]

Test whether this VirtualEnvUtil has been destroyed

install_from_pip_spec(pip_spec)[source]

Install a package from a pip specification

Parameters

pip_spec (str) – a pip specification for a package to load

Raises

ValueError – if the package described by pip_spec cannot be installed

is_installed(pip_spec)[source]

3.1.10. obj_tables.utils module

Utilities

Author

Jonathan Karr <karr@mssm.edu>

Author

Arthur Goldberg <Arthur.Goldberg@mssm.edu>

Date

2016-11-23

Copyright

2016, Karr Lab

License

MIT

class obj_tables.utils.DataFileMetadata[source]

Bases: tuple

DataFileMetadata(data_repo_metadata, schema_repo_metadata): Git repository metadata from an obj_tables data file

Create new instance of DataFileMetadata(data_repo_metadata, schema_repo_metadata)

property data_repo_metadata[source]

Git metadata about the repository containing the file

property schema_repo_metadata[source]

Git metadata about the repository containing the obj_tables schema used by the file
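Since DataFileMetadata is a tuple with two named fields, its behavior can be sketched with collections.namedtuple. The class name DataFileMetadataSketch, the dict stand-ins for the two metadata models, and all URLs and revisions below are hypothetical.

```python
from collections import namedtuple

DataFileMetadataSketch = namedtuple(
    'DataFileMetadataSketch', ['data_repo_metadata', 'schema_repo_metadata'])

# Dicts with hypothetical values stand in for DataRepoMetadata and SchemaRepoMetadata,
# whose documented attributes are url, branch and revision
data_repo = {'url': 'https://example.com/data.git', 'branch': 'master', 'revision': 'abc123'}
schema_repo = {'url': 'https://example.com/schema.git', 'branch': 'master', 'revision': 'def456'}

metadata = DataFileMetadataSketch(data_repo, schema_repo)

# Fields are available by name, and the object still unpacks like any tuple
data_meta, schema_meta = metadata
print(metadata.schema_repo_metadata['revision'])
```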

class obj_tables.utils.DataRepoMetadata(_comments=None, **kwargs)[source]

Bases: obj_tables.utils.RepoMetadata

Model to store Git version information about a data file’s repo

Parameters

**kwargs – dictionary of keyword arguments with keys equal to the names of the model attributes

Raises

TypeError – if keyword argument is not a defined attribute

class Meta[source]

Bases: obj_tables.core.Meta

attribute_order = ('url', 'branch', 'revision')[source]
attributes = {'branch': <obj_tables.core.StringAttribute object>, 'revision': <obj_tables.core.StringAttribute object>, 'url': <obj_tables.core.StringAttribute object>}[source]
children = {}[source]
description = ''[source]
frozen_columns = 1[source]
indexed_attrs_tuples = ()[source]
inheritance = (<class 'obj_tables.utils.DataRepoMetadata'>, <class 'obj_tables.utils.RepoMetadata'>)[source]
local_attributes = {'branch': <obj_tables.core.LocalAttribute object>, 'revision': <obj_tables.core.LocalAttribute object>, 'url': <obj_tables.core.LocalAttribute object>}[source]
merge = 1[source]
ordering = ()[source]
primary_attribute = None[source]
related_attributes = {}[source]
table_format = 2[source]
unique_together = ()[source]
verbose_name = 'Data repo metadata'[source]
verbose_name_plural = 'Data repo metadatas'[source]
objects = <obj_tables.core.Manager object>[source]
class obj_tables.utils.RepoMetadata(_comments=None, **kwargs)[source]

Bases: obj_tables.core.Model

Generic Model to store Git version information about a repo

Parameters

**kwargs – dictionary of keyword arguments with keys equal to the names of the model attributes

Raises

TypeError – if keyword argument is not a defined attribute

class Meta[source]

Bases: obj_tables.core.Meta

attribute_order = ('url', 'branch', 'revision')[source]
attributes = {'branch': <obj_tables.core.StringAttribute object>, 'revision': <obj_tables.core.StringAttribute object>, 'url': <obj_tables.core.StringAttribute object>}[source]
indexed_attrs_tuples = ()[source]
inheritance = (<class 'obj_tables.utils.RepoMetadata'>,)[source]
local_attributes = {'branch': <obj_tables.core.LocalAttribute object>, 'revision': <obj_tables.core.LocalAttribute object>, 'url': <obj_tables.core.LocalAttribute object>}[source]
ordering = ()[source]
primary_attribute = None[source]
related_attributes = {}[source]
table_format = 2[source]
unique_together = ()[source]
verbose_name = 'Repo metadata'[source]
verbose_name_plural = 'Repo metadatas'[source]
branch = <obj_tables.core.StringAttribute object>[source]
objects = <obj_tables.core.Manager object>[source]
revision = <obj_tables.core.StringAttribute object>[source]
url = <obj_tables.core.StringAttribute object>[source]
class obj_tables.utils.SchemaRepoMetadata(_comments=None, **kwargs)[source]

Bases: obj_tables.utils.RepoMetadata

Model to store Git version info for the repo that defines the obj_tables schema used by a data file

Parameters

**kwargs – dictionary of keyword arguments with keys equal to the names of the model attributes

Raises

TypeError – if keyword argument is not a defined attribute

class Meta[source]

Bases: obj_tables.core.Meta

attribute_order = ('url', 'branch', 'revision')[source]
attributes = {'branch': <obj_tables.core.StringAttribute object>, 'revision': <obj_tables.core.StringAttribute object>, 'url': <obj_tables.core.StringAttribute object>}[source]
children = {}[source]
description = ''[source]
frozen_columns = 1[source]
indexed_attrs_tuples = ()[source]
inheritance = (<class 'obj_tables.utils.SchemaRepoMetadata'>, <class 'obj_tables.utils.RepoMetadata'>)[source]
local_attributes = {'branch': <obj_tables.core.LocalAttribute object>, 'revision': <obj_tables.core.LocalAttribute object>, 'url': <obj_tables.core.LocalAttribute object>}[source]
merge = 1[source]
ordering = ()[source]
primary_attribute = None[source]
related_attributes = {}[source]
table_format = 2[source]
unique_together = ()[source]
verbose_name = 'Schema repo metadata'[source]
verbose_name_plural = 'Schema repo metadatas'[source]
objects = <obj_tables.core.Manager object>[source]
obj_tables.utils.add_metadata_to_file(pathname, models, schema_package=None)[source]

Add Git repository metadata to an existing ObjTables data file

Overwrites the existing file

Parameters
  • pathname (str) – path to an ObjTables data file in a Git repo

  • models (list of types.TypeType) – list of types of objects to read

  • schema_package (str, optional) – the package which defines the ObjTables schema used by the file; if not None, try to write metadata information about the schema’s Git repository; the repo must be current with origin

Returns

pathname of new data file

Return type

str

obj_tables.utils.diff_workbooks(filename_1, filename_2, models, model_name, schema_name=None, **kwargs)[source]

Get difference of models in two workbooks

Parameters
  • filename_1 (str) – path to first workbook

  • filename_2 (str) – path to second workbook

  • models (list of Model) – schema for objects to compare

  • model_name (str) – Type of objects to compare

  • schema_name (str, optional) – name of the schema

  • kwargs (dict, optional) – additional arguments to obj_tables.io.Reader

Returns

list of differences

Return type

list of str

obj_tables.utils.get_attr_order(model)[source]

Get the names of attributes in the order they should appear in ER diagrams

Parameters

model (type) – model

Returns

names of attributes in the order they should appear in ER diagrams

Return type

list of str

obj_tables.utils.get_attribute_by_name(cls, group_name, attr_name, verbose_name=False, case_insensitive=False)[source]

Return the attribute of Model class cls with name attr_name

Parameters
  • cls (class) – Model class

  • group_name (str) – name of attribute group

  • attr_name (str) – attribute name

  • verbose_name (bool, optional) – if True, search for attributes by verbose name; otherwise search for attributes by name

  • case_insensitive (bool, optional) – if True, ignore case

Returns

  • Attribute: attribute with name equal to the value of group_name or None if there is no matching attribute

  • Attribute: attribute with name equal to the value of attr_name or None if there is no matching attribute

Return type

tuple

obj_tables.utils.get_attrs()[source]

Get a dictionary of the defined types of attributes for use with init_schema.

Returns

dictionary which maps the name of each attribute to its instance

Return type

dict

obj_tables.utils.get_component_by_id(models, id, identifier='id')[source]

Retrieve a model instance by its identifier

Parameters
  • models (list of Model) – an iterable of Model objects

  • id (str) – the identifier being sought

  • identifier (str, optional) – the name of the identifier attribute

Returns

the retrieved Model instance if found, or None

Return type

Model

Raises

AttributeError – if none of the items in models has the attribute specified by identifier
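The lookup described above can be sketched in plain Python. This is a minimal illustration of the documented behavior, not the library's implementation: find_by_id is a hypothetical name, the items are simple namespaces rather than Model instances, and returning None for an empty input is an assumption.

```python
from types import SimpleNamespace


def find_by_id(models, id, identifier="id"):
    """Return the first item whose `identifier` attribute equals `id`, else None."""
    items = list(models)
    # Mirror the documented error: no item has the identifier attribute.
    # Assumption: an empty input simply yields None.
    if items and not any(hasattr(item, identifier) for item in items):
        raise AttributeError(f"no item has the attribute '{identifier}'")
    for item in items:
        if hasattr(item, identifier) and getattr(item, identifier) == id:
            return item
    return None


# Hypothetical objects standing in for Model instances
species = [SimpleNamespace(id="atp"), SimpleNamespace(id="adp")]
match = find_by_id(species, "adp")
missing = find_by_id(species, "gtp")
```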

obj_tables.utils.get_models(module)[source]

Get the models in a module

Parameters

module (types.ModuleType) – module

Returns

dictionary that maps the names of models to models

Return type

dict of str => Model
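As a rough illustration of mapping model names to models, the sketch below collects the classes in a module that derive from a base class. The Model class here is a hypothetical stand-in for obj_tables.core.Model, and collect_models is an illustrative name; the library's own selection rules may differ.

```python
import inspect
import types


class Model:
    """Hypothetical stand-in for obj_tables.core.Model."""


def collect_models(module):
    """Map the names of the Model subclasses in `module` to the classes."""
    return {
        name: cls
        for name, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, Model) and cls is not Model
    }


class Parent(Model):
    pass


class Child(Model):
    pass


# Build a throwaway module that holds two models and one non-model member
schema = types.ModuleType("example_schema")
schema.Parent = Parent
schema.Child = Child
schema.helper = lambda: None

models = collect_models(schema)
```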

Get all errors associated with an object and its related objects

Parameters

object (Model) – object

Returns

set of errors

Return type

InvalidObjectSet

Get the models that have relationships to a model

Parameters
  • root_model (type) – subclass of Model

  • include_root_model (bool, optional) – include the root model in the returned list of models

Returns

list of models that have relationships with root_model

Return type

list of type

obj_tables.utils.get_schema(path, name=None)[source]

Get a Python schema

Parameters
  • path (str) – path to Python schema

  • name (str, optional) – Python name for schema module

Returns

schema

Return type

types.ModuleType

obj_tables.utils.group_objects_by_model(objects)[source]

Group objects by their models

Parameters

objects (list of Model) – list of model objects

Returns

dictionary with objects grouped by their class

Return type

dict
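Grouping objects by their class can be sketched with a defaultdict. This is an illustrative stand-in under the assumption that the returned dictionary maps each class to a list of its instances; group_by_class and the two example classes are hypothetical names.

```python
from collections import defaultdict


def group_by_class(objects):
    """Return a dictionary that maps each class to a list of its instances."""
    grouped = defaultdict(list)
    for obj in objects:
        grouped[obj.__class__].append(obj)
    return dict(grouped)


class Gene:
    pass


class Protein:
    pass


g1, g2, p1 = Gene(), Gene(), Protein()
groups = group_by_class([g1, p1, g2])
```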

obj_tables.utils.init_schema(filename, out_filename=None)[source]

Initialize an ObjTables schema from a tabular declarative specification in filename. filename can be an XLSX, CSV, or TSV file.

Schemas (classes and attributes) should be defined using the following tabular format. Classes and their attributes can be defined in any order.

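Table 3.1 Format for specifying classes.

===================================  ====================  ==================================  ========
Python                               Tabular column        Tabular column values               Optional
===================================  ====================  ==================================  ========
Class name                           !Name                 Valid Python name
Class                                !Type                 Class
Superclass                           !Parent               Empty or the name of another class
obj_tables.Meta.table_format         !Format               row, column, multiple_cells, cell
obj_tables.Meta.verbose_name         !Verbose name         String                              Y
obj_tables.Meta.verbose_name_plural  !Verbose name plural  String                              Y
obj_tables.Meta.description          !Description                                              Y
===================================  ====================  ==================================  ========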

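Table 3.2 Format for specifying attributes of classes.

====================================================  ====================  ==========================================  ========
Python                                                Tabular column        Tabular column values                       Optional
====================================================  ====================  ==========================================  ========
Name of instance of subclass of obj_tables.Attribute  !Name                 a-z, A-Z, 0-9, _, :, >, ., -, [, ], or ‘ ‘
obj_tables.Attribute                                  !Type                 Attribute
Parent class                                          !Parent               Name of the parent class
Subclass of obj_tables.Attribute                      !Format               Boolean, Float, String, etc.
obj_tables.Attribute.verbose_name                     !Verbose name         String                                      Y
obj_tables.Attribute.verbose_name_plural              !Verbose name plural  String                                      Y
obj_tables.Attribute.description                      !Description          String                                      Y
====================================================  ====================  ==========================================  ========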

Parameters
  • filename (str) – path to the tabular declarative specification of the schema (XLSX, CSV, or TSV)

  • out_filename (str, optional) – path to save schema

Returns

  • types.ModuleType: module with classes

  • str: schema name

Return type

tuple

Raises

ValueError – if schema specification is not in a supported format, an XLSX schema file does not contain a worksheet with the name !!_Schema which specifies the schema, the class inheritance structure is cyclic, or the schema specification is invalid (e.g., a class is defined multiple times)
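To make the column layout of Tables 3.1 and 3.2 concrete, the sketch below builds the rows for one class and one attribute and serializes them as tab-separated text with the standard csv module. The class name Person and attribute name name are hypothetical, and any additional header or metadata lines that a real schema file may require are not shown; treat this as an illustration of the columns only, not a complete schema file.

```python
import csv
import io

# Column headings taken from Tables 3.1 and 3.2
columns = ["!Name", "!Type", "!Parent", "!Format",
           "!Verbose name", "!Verbose name plural", "!Description"]

rows = [
    # A class (Table 3.1): no parent, one object per row
    {"!Name": "Person", "!Type": "Class", "!Parent": "", "!Format": "row",
     "!Verbose name": "Person", "!Verbose name plural": "People",
     "!Description": "A hypothetical example class"},
    # An attribute of that class (Table 3.2)
    {"!Name": "name", "!Type": "Attribute", "!Parent": "Person",
     "!Format": "String", "!Verbose name": "Name",
     "!Verbose name plural": "", "!Description": ""},
]

# Serialize the rows as tab-separated text
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t",
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
tsv_text = buffer.getvalue()

# Read the text back to confirm that the table round-trips
parsed = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
```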

obj_tables.utils.rand_schema_name(len=8)[source]

Generate a random Python module name of a schema

Parameters

len (int, optional) – length of random name

Returns

random name for schema

Return type

str
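A random module name of a given length can be sketched with the standard random and string modules. The alphabet used here, and the choice to start with a letter so that the result is a valid Python identifier, are assumptions; random_module_name is a hypothetical name and the library's function may format names differently.

```python
import random
import string


def random_module_name(length=8):
    """Return a random name of `length` characters that is a valid identifier."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # Assumption: begin with a letter so the name is a valid identifier
    first = random.choice(string.ascii_lowercase)
    rest = "".join(
        random.choice(string.ascii_lowercase + string.digits)
        for _ in range(length - 1)
    )
    return first + rest


name = random_module_name()
```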

obj_tables.utils.randomize_object_graph(obj)[source]

Randomize the order of the edges (RelatedManagers) in the object’s object graph.

Parameters

obj (Model) – instance of Model

obj_tables.utils.read_metadata_from_file(pathname)[source]

Read Git repository metadata from an ObjTables data file

Parameters

pathname (str) – path to the data file

Returns

data and schema repo metadata from the file at pathname; missing metadata is returned as None

Return type

DataFileMetadata

Raises

ValueError – if pathname’s extension is not supported, or unexpected metadata instances are found

obj_tables.utils.set_git_repo_metadata_from_path(model, repo_type, path='.', url_attr='url', branch_attr='branch', commit_hash_attr='revision')[source]

Use Git to set the Git repository URL, branch, and commit hash metadata attributes of a model

Parameters
  • model (Model) – model whose Git attributes will be set

  • repo_type (git.RepoMetadataCollectionType) – repo type being set

  • path (str, optional) – path to file or directory in a clone of a Git repository; default=’.’

  • url_attr (str, optional) – attribute in model for the Git URL; default=’url’

  • branch_attr (str, optional) – attribute in model for the Git branch; default=’branch’

  • commit_hash_attr (str, optional) – attribute in model for the Git commit hash; default=’revision’

Returns

list of reasons, if any, that the repo might not be suitable for collecting metadata

Return type

list of str

obj_tables.utils.source_report(obj, attr_name)[source]

Get the source file, worksheet, column, and row location of attribute attr_name of model object obj as a colon-separated string.

Parameters
  • obj (Model) – model object

  • attr_name (str) – attribute name

Returns

a string representation of the source file, worksheet, column, and row location of attr_name of obj

Return type

str

obj_tables.utils.to_pandas(objs, models=None, get_related=True, include_all_attributes=True, validate=True)[source]

Generate a pandas representation of a collection of objects

Parameters
  • objs (list of Model) – objects

  • models (list of Model, optional) – models in the order that they should appear as worksheets; all models which are not in models will follow in alphabetical order

  • get_related (bool, optional) – if True, export objs and all of their related objects

  • include_all_attributes (bool, optional) – if True, export all attributes including those not explicitly included in Model.Meta.attribute_order

  • validate (bool, optional) – if True, validate the data

Returns

dictionary that maps models (Model) to the instances of each model (pandas.DataFrame)

Return type

dict

obj_tables.utils.viz_schema(module, filename, attributes=True, tail_labels=True, hidden_classes=None, extra_edges=None, model_names=None, rank_sep=None, node_sep=None, node_width=None, node_height=None, node_margin=(0.0, 0.055), node_edge_color=None, node_fill_color=None, node_font_color=None, model_edge_colors=None, model_fill_colors=None, model_font_colors=None, arrow_size=None, font_name=None, font_size=None)[source]

Visualize a schema

Parameters
  • module (types.ModuleType) – module with models

  • filename (str) – path to save visualization of schema

  • attributes (bool, optional) – If True, display attributes. If False, only display classes.

  • tail_labels (bool, optional) – If True, display tail labels (1 or N).

  • hidden_classes (list, optional) – list of classes to not display

  • extra_edges (list, optional) – list of additional edges to display

  • model_names (dict, optional) – dictionary that maps models to their display names

  • rank_sep (float, optional) – separation between node ranks

  • node_sep (float, optional) – separation within a node rank

  • node_width (float, optional) – node width

  • node_height (float, optional) – node height

  • node_margin (tuple of float, optional) – node margin

  • node_edge_color (str, optional) – node edge color

  • node_fill_color (str, optional) – node fill color

  • node_font_color (str, optional) – node font color

  • model_edge_colors (dict, optional) – dictionary that maps models to their edge color

  • model_fill_colors (dict, optional) – dictionary that maps models to their fill color

  • model_font_colors (dict, optional) – dictionary that maps models to their font color

  • arrow_size (float, optional) – relative arrow size

  • font_name (str, optional) – font name

  • font_size (float, optional) – font size in points

3.1.11. obj_tables.web_service module

Web service

Author

Jonathan Karr <karr@mssm.edu>

Date

2019-09-15

Copyright

2019, Karr Lab

License

MIT

class obj_tables.web_service.Convert(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Convert a schema-encoded workbook to another format (CSV, multi-CSV, JSON, TSV, multi-TSV, XLSX, YAML)

endpoint = 'convert'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Convert a schema-encoded workbook to another format (CSV, multi-CSV, JSON, TSV, multi-TSV, XLSX, YAML)

class obj_tables.web_service.Diff(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Calculate the difference between two workbooks according to a schema

endpoint = 'diff'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Calculate the difference between two workbooks according to a schema

class obj_tables.web_service.GenTemplate(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Generate a template workbook (CSV, multi-CSV, TSV, multi-TSV, XLSX) for a schema or declarative description of a schema

endpoint = 'gen_template'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Generate a template workbook (CSV, multi-CSV, TSV, multi-TSV, XLSX) for a schema or declarative description of a schema

class obj_tables.web_service.InitSchema(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Initialize a Python schema from a declarative description of the schema in a table (CSV, TSV, XLSX)

endpoint = 'init_schema'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Initialize a Python schema from a declarative description of the schema in a table (CSV, TSV, XLSX)

class obj_tables.web_service.Normalize(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Normalize a workbook according to a schema

endpoint = 'normalize'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Normalize a workbook according to a schema

class obj_tables.web_service.PrefixMiddleware(app, prefix='')[source]

Bases: object

class obj_tables.web_service.Validate(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Validate that a workbook is consistent with a schema, and report any errors

endpoint = 'validate'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Validate that a workbook is consistent with a schema, and report any errors

class obj_tables.web_service.VizSchema(api=None, *args, **kwargs)[source]

Bases: flask_restplus.resource.Resource

Generate a UML diagram for a schema

endpoint = 'viz_schema'[source]
mediatypes()[source]
methods = {'POST'}[source]
post()[source]

Generate a UML diagram for a schema

obj_tables.web_service.api = <flask_restplus.api.Api object>[source]

Convert

obj_tables.web_service.get_model(models, name)[source]

Get the model with name name

Parameters
  • models (list of core.Model) – models

  • name (str) – model name

Returns

model

Return type

core.Model

obj_tables.web_service.read_workbook(filename, models, schema_name=None)[source]

Read a workbook

Parameters
  • filename (str) – path to workbook

  • models (list of core.Model) – models

  • schema_name (str, optional) – schema name

Returns

  • dict: dictionary that maps types to a dictionary of instances

  • dict: dictionary of model metadata

Return type

tuple

obj_tables.web_service.save_in_workbook(file_storage)[source]

Save workbook to a temporary directory

Parameters

file_storage (FileStorage) – uploaded file

Returns

  • str: temporary directory with workbook

  • str: local path to workbook file

Return type

tuple

obj_tables.web_service.save_out_workbook(format, objs, schema_name, doc_metadata, model_metadata, models, write_toc=False, write_schema=False, write_empty_models=True, write_empty_cols=True, protected=True)[source]
Parameters
  • format (str) – format (csv, multi.csv, json, tsv, multi.tsv, xlsx, yml)

  • objs (dict) – dictionary that maps types to instances

  • schema_name (str) – schema name

  • doc_metadata (dict) – dictionary of document metadata

  • model_metadata (dict) – dictionary of model metadata

  • models (list of core.Model) – models

  • write_toc (bool, optional) – if True, write a table of contents with the file

  • write_schema (bool, optional) – if True, write schema with file

  • write_empty_models (bool, optional) – if True, write models even when there are no instances

  • write_empty_cols (bool, optional) – if True, write columns even when all values are None

  • protected (bool, optional) – if True, protect the worksheet

Returns

  • str: temporary directory with workbook

  • str: path to workbook file

  • str: mimetype of workbook

Return type

tuple

obj_tables.web_service.save_schema(file_storage)[source]

Save schema to a temporary directory

Parameters

file_storage (FileStorage) – uploaded file

Returns

  • str: temporary directory with schema

  • str: local path to schema file

Return type

tuple

3.1.12. Module contents