3.1.1.3. karr_lab_aws_manager.elasticsearch_kl package¶

3.1.1.3.1. Submodules¶

3.1.1.3.2. karr_lab_aws_manager.elasticsearch_kl.index_setting_file module¶

class karr_lab_aws_manager.elasticsearch_kl.index_setting_file.IndexUtil(filter_dir=None, analyzer_dir=None, mapping_properties_dir=None)[source]¶

Bases: object

Build the index settings JSON file.

combine_files(**kwargs)[source]¶

Combine various settings for index to form a coherent description. (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html)

Parameters

_filter (bool) – whether to include filter info.
analyzer (bool) – whether to include analyzer info.
mappings (bool) – whether to include mappings info.

Returns

(dict)
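The source does not show the layout of the dictionary that combine_files returns. The following is a minimal sketch of how filter, analyzer, and mapping fragments are usually combined into one Elasticsearch index body. The helper name combine_index_settings and the example filter, analyzer, and field names are hypothetical and are not part of this package.

```python
import json


def combine_index_settings(filters=None, analyzers=None, mapping_properties=None):
    """Hypothetical helper: merge optional fragments into one index body."""
    body = {}
    analysis = {}
    if filters:
        analysis["filter"] = filters
    if analyzers:
        analysis["analyzer"] = analyzers
    if analysis:
        # Custom filters and analyzers live under settings.analysis.
        body["settings"] = {"analysis": analysis}
    if mapping_properties:
        body["mappings"] = {"properties": mapping_properties}
    return body


# Illustrative fragments; the names are made up for this example.
index_body = combine_index_settings(
    filters={"english_stop": {"type": "stop", "stopwords": "_english_"}},
    analyzers={
        "my_english": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "english_stop"],
        }
    },
    mapping_properties={"name": {"type": "text", "analyzer": "my_english"}},
)
print(json.dumps(index_body, indent=2))
```

Leaving out a fragment simply omits the corresponding key, which mirrors the boolean switches documented above.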

read_file(_dir)[source]¶

Read in json file.

Parameters: _dir (str) – path to the json file.
Returns: (dict)

3.1.1.3.3. karr_lab_aws_manager.elasticsearch_kl.query_builder module¶

class karr_lab_aws_manager.elasticsearch_kl.query_builder.QueryBuilder(profile_name=None, credential_path=None, config_path=None, elastic_path=None, cache_dir=None, service_name='es', max_entries=inf, verbose=False)[source]¶

Bases: karr_lab_aws_manager.elasticsearch_kl.util.EsUtil

build_bool_query_body(must=None, _filter=None, should=None, must_not=None, minimum_should_match=0)[source]¶

Build a boolean query body. (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html)

Parameters

must (list or dict, optional) – Body for must. Defaults to None.
_filter (list or dict, optional) – Body for filter. Defaults to None.
should (list or dict, optional) – Body for should. Defaults to None.
must_not (list or dict, optional) – Body for must_not. Defaults to None.
minimum_should_match (int) – The number or percentage of should clauses returned documents must match. Defaults to 0.

Returns

boolean query body

Return type

(dict)
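As an illustration of the structure described above, here is a minimal sketch of a bool query body assembled from the same parameters. It is a stand-alone function, not the package's implementation; whether the real method wraps the result in a top-level "query" key is an assumption, and the field names in the usage are hypothetical.

```python
def bool_query_body(must=None, _filter=None, should=None, must_not=None,
                    minimum_should_match=0):
    """Sketch: assemble an Elasticsearch bool query from optional clauses."""
    clauses = {
        "must": must,
        "filter": _filter,
        "should": should,
        "must_not": must_not,
    }
    # Keep only the clauses the caller actually supplied.
    bool_part = {name: body for name, body in clauses.items() if body is not None}
    bool_part["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_part}}


# Field names below are hypothetical.
body = bool_query_body(
    must={"match": {"protein_name": "kinase"}},
    should=[{"term": {"species": "human"}}, {"term": {"species": "mouse"}}],
    minimum_should_match=1,
)
```

With minimum_should_match=1, a document must satisfy the must clause and at least one of the two should clauses.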

build_simple_query_string_body(query_message, **kwargs)[source]¶

Builds query portion of the body in request body search (https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/full-text/#simple-query-string)

Parameters: query_message (str) – string to be queried for.
Returns: query request body
Return type: (dict)
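A minimal sketch of the kind of body this method describes, assuming that extra keyword arguments are passed through as simple_query_string options. The function below is illustrative and not the package's code; the field names are hypothetical, while fields and default_operator are standard simple_query_string parameters.

```python
def simple_query_string_body(query_message, **kwargs):
    """Sketch: wrap a query string and its options in a search request body."""
    inner = {"query": query_message}
    inner.update(kwargs)  # e.g. fields, default_operator
    return {"query": {"simple_query_string": inner}}


body = simple_query_string_body(
    "alcohol dehydrogenase",
    fields=["protein_name", "gene_name"],  # hypothetical field names
    default_operator="and",
)
```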

3.1.1.3.4. karr_lab_aws_manager.elasticsearch_kl.util module¶

class karr_lab_aws_manager.elasticsearch_kl.util.ComplexEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶

Bases: json.encoder.JSONEncoder

default(o)[source]¶

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
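The docstring above is inherited from the standard library, and the source does not say which types ComplexEncoder actually handles. As a sketch of the same pattern, the following hypothetical encoder (IsoDateEncoder is not part of this package) serializes datetime values and defers everything else to the base class.

```python
import json
from datetime import datetime


class IsoDateEncoder(json.JSONEncoder):
    """Hypothetical encoder: write datetime objects as ISO 8601 strings."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        # Let the base class raise TypeError for unsupported types.
        return super().default(o)


doc = {"uniprot_id": "P12345", "updated": datetime(2020, 1, 2, 3, 4, 5)}
encoded = json.dumps(doc, cls=IsoDateEncoder)
print(encoded)
```

An encoder of this kind is passed to json.dumps through the cls argument, so documents holding non-JSON types can be serialized before they are sent to Elasticsearch.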

class karr_lab_aws_manager.elasticsearch_kl.util.EsUtil(profile_name=None, credential_path=None, config_path=None, elastic_path=None, cache_dir=None, service_name='es', max_entries=inf, verbose=False)[source]¶

Bases: karr_lab_aws_manager.config.config.establishES

add_field_to_index(index, field=None, value=None, query={'match_all': {}}, script_type='inline', script_complete=None)[source]¶

Add a field with the given value to all documents in an index.

Parameters

index (str) – name of index.
field (str) – name of field.
value (Obj) – value of field.
query (Obj) – query of index.
script_type (str) – type of script, inline or store.
script_complete (str) – content of script.

Returns

elasticsearch update status description.

Return type

(HTTPResponse)
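Adding a field to every matching document is typically done with the Elasticsearch update-by-query API and a short Painless script. The sketch below builds such a request body; it is an assumption about the general approach, not a copy of what add_field_to_index sends, and the field name and value in the usage are made up.

```python
def add_field_body(field, value, query=None):
    """Sketch: request body for POST /<index>/_update_by_query."""
    if query is None:
        query = {"match_all": {}}  # same default as the documented signature
    return {
        "script": {
            "lang": "painless",
            # Parameters keep the script text constant across calls.
            "source": "ctx._source[params.field] = params.value",
            "params": {"field": field, "value": value},
        },
        "query": query,
    }


body = add_field_body("reviewed", True)  # hypothetical field and value
```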

allocation_explain()[source]¶

Choose the first unassigned shard found and explain why it cannot be allocated to a node.

Returns: http response
Return type: (HTTPResponse)

build_es(suffix=None)[source]¶

Build an Elasticsearch query object.

Parameters: suffix (str) – string trailing es endpoint
Returns: Elasticsearch object
Return type: (Elasticsearch)

change_field_name(pipeline_name, pipeline_description, src_field, target_field, src_idx, dest_idx)[source]¶

Change field name. (https://www.elastic.co/guide/en/elasticsearch/reference/current/rename-processor.html)

Parameters

pipeline_name (str) – Name of pipeline.
pipeline_description (str) – Description of pipeline.
src_field (str) – Name of the field before change.
target_field (str) – Name of the field after change.
src_idx (str) – Name of index before change.
dest_idx (str) – Name of index after changes.
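Renaming a field with the rename processor usually takes two requests: one that defines an ingest pipeline, and one that reindexes the source index into the destination index through that pipeline. The sketch below builds both request bodies. The helper names and the example values are hypothetical, and the source does not confirm that change_field_name issues exactly these requests.

```python
def rename_pipeline_body(pipeline_description, src_field, target_field):
    """Sketch: body for PUT /_ingest/pipeline/<pipeline_name>."""
    return {
        "description": pipeline_description,
        "processors": [
            {"rename": {"field": src_field, "target_field": target_field}}
        ],
    }


def reindex_body(src_idx, dest_idx, pipeline_name):
    """Sketch: body for POST /_reindex, routed through the pipeline."""
    return {
        "source": {"index": src_idx},
        "dest": {"index": dest_idx, "pipeline": pipeline_name},
    }


# Example values are made up for illustration.
pipeline = rename_pipeline_body("rename name to protein_name", "name", "protein_name")
reindex = reindex_body("protein_v1", "protein_v2", "rename_name")
```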

create_index(index, mappings=None, setting={'settings': {'number_of_replicas': 0, 'number_of_shards': 1}}, additional_settings=None)[source]¶

Create index: (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html)

Parameters

index (str) – name of index
mappings (dict, optional) – index mappings. Defaults to None.
setting (dict, optional) – index settings. Defaults to {“settings”: {“number_of_replicas”: 0, “number_of_shards”: 1}}.
additional_settings (dict) – additional settings. Defaults to None.
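A minimal sketch of how these parameters could be combined into one create-index request body. How the package merges additional_settings is not stated in the source, so the merge into the "settings" key below is an assumption, as are the helper name and the example values.

```python
import copy


def create_index_body(mappings=None, setting=None, additional_settings=None):
    """Sketch: combine settings and mappings into a create-index body."""
    if setting is None:
        setting = {"settings": {"number_of_replicas": 0, "number_of_shards": 1}}
    body = copy.deepcopy(setting)  # never mutate the caller's dictionary
    if additional_settings:
        # Assumption: extra settings are merged under the "settings" key.
        body.setdefault("settings", {}).update(additional_settings)
    if mappings:
        body["mappings"] = mappings
    return body


body = create_index_body(
    mappings={"properties": {"uniprot_id": {"type": "keyword"}}},
    additional_settings={"refresh_interval": "30s"},
)
```

Passing None and building the default inside the function avoids sharing one mutable default dictionary across calls.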

create_index_with_file(index, _file, num_shard=1, num_replica=0)[source]¶

Create index with an index description file: (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html)

Parameters

index (str) – name of index
_file (dict) – index setting description.
num_shard (int, optional) – number of shards. Defaults to 1.
num_replica (int, optional) – number of replicas. Defaults to 0.

Returns

(requests.Response)

data_to_es_bulk(cursor, index='test', count=None, bulk_size=100, _id='uniprot_id', headers={'Content-Type': 'application/json'})[source]¶

Load data into elasticsearch service

Parameters

count (int) – cursor size
cursor (pymongo.Cursor or iter) – documents to be PUT/POST to es
index (str) – name of unique key to be used as index for es
bulk_size (int) – number of documents in one PUT
headers (dict) – http header
_id (str) – key in mongo collection for identification

Returns

set of status codes

Return type

(set)
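The Elasticsearch bulk API expects newline-delimited JSON in which every document is preceded by an action-and-metadata line. The sketch below shows how a cursor or other iterable can be split into payloads of bulk_size documents. It is a stand-alone illustration and not the package's implementation; the function names and sample documents are hypothetical, and sending the payloads over HTTP is left out.

```python
import json
from itertools import islice


def action_and_metadata(index, _id):
    """Action line for one document in a bulk request."""
    return {"index": {"_index": index, "_id": _id}}


def bulk_payloads(docs, index, _id="uniprot_id", bulk_size=100):
    """Yield NDJSON payloads holding at most bulk_size documents each."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, bulk_size))
        if not chunk:
            return
        lines = []
        for doc in chunk:
            lines.append(json.dumps(action_and_metadata(index, doc[_id])))
            lines.append(json.dumps(doc))
        # The bulk API requires a trailing newline after the last line.
        yield "\n".join(lines) + "\n"


# Sample documents are made up for illustration.
docs = [{"uniprot_id": "P{}".format(i), "n": i} for i in range(5)]
payloads = list(bulk_payloads(docs, "test", bulk_size=2))
```

Each payload would be sent with the Content-Type header shown in the signature above; five documents with bulk_size=2 produce three payloads.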

data_to_es_single(count, cursor, index, _id='uniprot_id', headers={'Content-Type': 'application/json'})[source]¶

Load data into elasticsearch service

Parameters

count (int) – cursor size
cursor (pymongo.Cursor or iter) – documents to be PUT to es
index (str) – name of unique key to be used as index for es
es_endpoint (str) – elasticsearch endpoint
headers (dict) – http header information
_id (str) – key in mongo collection for identification

Returns

set of status codes

Return type

(set)

delete_index(index, _id=None)[source]¶

Delete elasticsearch index

Parameters

index (str) – name of index in es
_id (int) – id of the doc in index (optional)

enable_fielddata(index, _type, field)[source]¶

Enable fielddata for type fields

Parameters

index (str) – Index in which the operation will be done
_type (str) – Existing mapping for field.
field (str) – name of the field.

get_index_mapping(index='.kibana_1')[source]¶

Get index mapping.

Parameters: index (str, optional) – Comma-separated list or wildcard expression of index names. Defaults to ‘.kibana_1’.
Returns: (requests.Response)

index_health_status()[source]¶

Show the health status, number of documents, and disk usage for each index.

Returns: http response
Return type: (HTTPResponse)

index_settings(index, number_of_replicas, number_of_shards=1, other_settings={}, headers={'Content-Type': 'application/json'})[source]¶

Set an index’s shard and replica numbers in the es cluster.

Parameters

index (str) – name of index to be set
number_of_replicas (int) – number of replica shards to be used for the index
number_of_shards (int) – number of primary shards contained in the es cluster
other_settings (dict) – other index settings.
headers (dict) – http request content header description

Returns

http response

Return type

(HTTPResponse)

make_action_and_metadata(index, _id)[source]¶

Make action_and_metadata obj for bulk loading e.g. { “index”: { “_index” : “index”, “_id” : “id” } }

Parameters

index (str) – name of index on ES
_id (str) – unique id for document

Returns

metadata that conforms to ES bulk load requirement

Return type

(dict)

migrate_index(old_index, new_index, headers={'Content-Type': 'application/json'}, number_of_shards=1, number_of_replicas=0)[source]¶

Migrate old index to new index whilst changing shard and replica setting

Parameters

old_index (str) – name of the old index.
new_index (str) – name of the new index
headers (HTTP.header, optional) – header. Defaults to { “Content-Type”: “application/json” }.
number_of_shards (int, optional) – number of shards for the index. Defaults to 1.
number_of_replicas (int, optional) – number of replicas for the index. Defaults to 0.

Returns

(list of requests.Response)

put_mapping(index, body)[source]¶

Put index mapping to an existing index. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html)

Parameters

index (str) – name of the index to which the mapping is applied.
body (dict) – mapping description.

Returns

(requests.Response)

test_analyzer(msg, tokenizer='standard', index=None)[source]¶

Test ES analyzer / tokenizer results. https://www.elastic.co/guide/en/elasticsearch/reference/6.8/analysis-standard-tokenizer.html

Parameters

msg (str) – Message to be analyzed.
tokenizer (str, optional) – tokenizer to be used. Defaults to ‘standard’.
index (str, optional) – Index in which custom analyzer resides.

Returns

http response

Return type

(HTTPResponse)
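For orientation, the Elasticsearch analyze API takes a request against /_analyze, or /<index>/_analyze when an index-specific analyzer is involved, with the text and tokenizer in the body. The sketch below builds that path and body; the helper name and index name are hypothetical, and the source does not confirm the exact request that test_analyzer sends.

```python
def analyze_request(msg, tokenizer="standard", index=None):
    """Sketch: path and body for an Elasticsearch _analyze request."""
    path = "/_analyze" if index is None else "/{}/_analyze".format(index)
    body = {"tokenizer": tokenizer, "text": msg}
    return path, body


path, body = analyze_request("Alcohol dehydrogenase 1A")
# "protein" is a hypothetical index name.
index_path, _ = analyze_request("ADH1A", tokenizer="whitespace", index="protein")
```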

unassigned_reason()[source]¶

Send an http request to find out why a shard is unassigned.

Returns: http response
Return type: (HTTPResponse)

update_alias_to_idx(idx, alias, action='add')[source]¶

Add aliases to an index. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html)

Parameters

idx (str or list of str) – indices official name / names.
alias (str) – index alias.
action (str) – add or remove
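The aliases API accepts a list of actions, each naming one index (or several) and an alias. The sketch below builds such a body for the documented idx, alias, and action parameters. It is an illustration under the assumption that a list of names maps to the "indices" key; the helper name and the index and alias names are hypothetical.

```python
def alias_actions_body(idx, alias, action="add"):
    """Sketch: body for POST /_aliases."""
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    if isinstance(idx, str):
        target = {"index": idx, "alias": alias}
    else:
        # Several index names go under "indices".
        target = {"indices": list(idx), "alias": alias}
    return {"actions": [{action: target}]}


single = alias_actions_body("protein_v2", "protein")
several = alias_actions_body(["rna_v1", "rna_v2"], "rna", action="remove")
```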

karr_lab_aws_manager.elasticsearch_kl.util.main()[source]¶

3.1.1.3.5. Module contents¶