4.1.1.5.1.1. datanator.data_source.array_express_tools package

4.1.1.5.1.1.1. Submodules

4.1.1.5.1.1.2. datanator.data_source.array_express_tools.ensembl_tools module

class datanator.data_source.array_express_tools.ensembl_tools.StrainInfo(organism_strain, download_url, full_strain_specificity, domain)[source]

Bases: object

Represents information about an Ensembl reference genome

organism_strain[source]

the Ensembl strain in the reference genome

Type:str
download_url[source]

the URL for that strain’s reference genome

Type:str
full_strain_specificity[source]

whether or not the strain matches the full specificity provided in the ArrayExpress sample

Type:bool
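
As a rough illustration of the record described above, the following is a minimal stand-in, not the library's class. The name `StrainInfoSketch`, the example URL, and the example values are hypothetical. The type of `domain` is an assumption, since the source lists it as a constructor argument but does not document it.

```python
from dataclasses import dataclass


@dataclass
class StrainInfoSketch:
    """Hypothetical stand-in mirroring the documented StrainInfo attributes."""
    organism_strain: str           # the Ensembl strain in the reference genome
    download_url: str              # URL of that strain's reference genome
    full_strain_specificity: bool  # True if the strain matches the sample's full specificity
    domain: str                    # undocumented in the source; type assumed to be str


# Example values are illustrative only, not real Ensembl data.
info = StrainInfoSketch(
    organism_strain="escherichia coli k12",
    download_url="ftp://example.org/path/to/genome.fa.gz",
    full_strain_specificity=True,
    domain="bacteria",
)
```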
datanator.data_source.array_express_tools.ensembl_tools.find_nth(haystack, needle, n)[source]
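
The source gives no description for `find_nth`. Judging only by its name and arguments, a helper of this shape usually returns the position of the n-th occurrence of a substring. The sketch below shows that common pattern under stated assumptions: 1-based counting, non-overlapping matches, and `-1` when there is no such occurrence. The name `find_nth_sketch` is hypothetical, and the actual function may behave differently.

```python
def find_nth_sketch(haystack, needle, n):
    """Return the index of the n-th occurrence of needle in haystack, or -1.

    Assumptions (not confirmed by the source): n is 1-based and
    occurrences are counted without overlap.
    """
    if n < 1 or not needle:
        return -1
    start = haystack.find(needle)
    while start >= 0 and n > 1:
        # Resume the search just past the previous match.
        start = haystack.find(needle, start + len(needle))
        n -= 1
    return start
```

For example, in the string `"a/b/c/d"` the second `"/"` is at index 3, so `find_nth_sketch("a/b/c/d", "/", 2)` evaluates to 3.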
datanator.data_source.array_express_tools.ensembl_tools.format_org_name(name)[source]

Format the name of an organism to normalize species names

Args:
name (str): the name of a species (e.g. escherichia coli str. k12)
Returns:
str: the normalized version of the strain name (e.g. escherichia coli k12)
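
The normalization described above can be sketched as follows. This is a minimal illustration that reproduces only the documented example, not the library's implementation. The name `format_org_name_sketch`, the lowercasing step, and the set of dropped marker tokens are all assumptions.

```python
# Marker tokens assumed to be removed; only "str." appears in the documented example.
STRAIN_MARKERS = {"str.", "substr."}


def format_org_name_sketch(name):
    """Lowercase the name, drop strain marker tokens, and collapse whitespace."""
    tokens = name.lower().split()
    kept = [token for token in tokens if token not in STRAIN_MARKERS]
    return " ".join(kept)
```

With these assumptions, `format_org_name_sketch("escherichia coli str. k12")` yields `"escherichia coli k12"`, which matches the example in the docstring.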
datanator.data_source.array_express_tools.ensembl_tools.get_ftp_url(url)[source]
datanator.data_source.array_express_tools.ensembl_tools.get_json_ends(tree)[source]
datanator.data_source.array_express_tools.ensembl_tools.get_ref_seq_url(org_symbol)[source]
datanator.data_source.array_express_tools.ensembl_tools.get_strain_info(sample)[source]

Get information about the reference genome that should be used for a given sample

Args:
sample (array_express.Sample): an RNA-Seq sample
Returns:
StrainInfo: Ensembl information about the reference genome
datanator.data_source.array_express_tools.ensembl_tools.get_taxonomic_lineage(base_species)[source]

Get the lineage of a species

Parameters:base_species (str) – a species (e.g. escherichia coli)
Returns:list of str: a list of strings corresponding to the layers of its taxonomy

4.1.1.5.1.1.3. Module contents