3.1.1.3. karr_lab_aws_manager.elasticsearch_kl package
3.1.1.3.1. Submodules
3.1.1.3.2. karr_lab_aws_manager.elasticsearch_kl.index_setting_file module
-
class karr_lab_aws_manager.elasticsearch_kl.index_setting_file.IndexUtil(filter_dir=None, analyzer_dir=None, mapping_properties_dir=None)
Bases: object
Make the index settings JSON file.
-
combine_files(**kwargs)
Combine the filter, analyzer, and mappings settings for an index into one coherent description. (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-analyzer.html)
- Parameters
_filter (bool) – whether to include filter info.
analyzer (bool) – whether to include analyzer info.
mappings (bool) – whether to include mappings info.
- Returns
(dict)
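As a rough illustration of what a combined index description can look like, the sketch below merges filter, analyzer, and mapping fragments into one dict laid out like an Elasticsearch create-index body. The function name `combine_settings` and the sample fragments are hypothetical; this is not the implementation of `IndexUtil.combine_files`.

```python
import json


def combine_settings(filters=None, analyzers=None, mapping_properties=None):
    """Hypothetical helper: merge fragments into one index description."""
    analysis = {}
    if filters:
        analysis["filter"] = filters
    if analyzers:
        analysis["analyzer"] = analyzers
    body = {"settings": {"analysis": analysis}}
    if mapping_properties:
        body["mappings"] = {"properties": mapping_properties}
    return body


# Sample fragments, invented for illustration only.
index_body = combine_settings(
    filters={"english_stop": {"type": "stop", "stopwords": "_english_"}},
    analyzers={
        "my_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "english_stop"],
        }
    },
    mapping_properties={"name": {"type": "text", "analyzer": "my_analyzer"}},
)
print(json.dumps(index_body, indent=2))
```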
-
3.1.1.3.3. karr_lab_aws_manager.elasticsearch_kl.query_builder module
-
class karr_lab_aws_manager.elasticsearch_kl.query_builder.QueryBuilder(profile_name=None, credential_path=None, config_path=None, elastic_path=None, cache_dir=None, service_name='es', max_entries=inf, verbose=False)
Bases: karr_lab_aws_manager.elasticsearch_kl.util.EsUtil
-
build_bool_query_body(must=None, _filter=None, should=None, must_not=None, minimum_should_match=0)
Build a boolean query body. (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html)
- Parameters
must (list or dict, optional) – Body for must. Defaults to None.
_filter (list or dict, optional) – Body for filter. Defaults to None.
should (list or dict, optional) – Body for should. Defaults to None.
must_not (list or dict, optional) – Body for must_not. Defaults to None.
minimum_should_match (int) – The number or percentage of should clauses that returned documents must match. Defaults to 0.
- Returns
boolean query body
- Return type
(dict)
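The shape of a bool query body can be sketched with a standalone function that takes the same parameters. `bool_query_body` is a hypothetical stand-in, the field names are invented, and whether the real method wraps the clause in a top-level `query` key is an assumption.

```python
def bool_query_body(must=None, _filter=None, should=None, must_not=None,
                    minimum_should_match=0):
    """Hypothetical stand-in that mirrors the documented parameters."""
    clauses = {
        "must": must,
        "filter": _filter,
        "should": should,
        "must_not": must_not,
    }
    # Keep only the clauses the caller actually supplied.
    bool_clause = {name: body for name, body in clauses.items() if body is not None}
    bool_clause["minimum_should_match"] = minimum_should_match
    # Assumption: the result is wrapped in a top-level "query" key.
    return {"query": {"bool": bool_clause}}


body = bool_query_body(
    must={"match": {"protein_name": "kinase"}},
    _filter=[{"term": {"species": "human"}}],
)
print(body)
```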
-
build_simple_query_string_body(query_message, **kwargs)
Build the query portion of the body in a request body search. (https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/full-text/#simple-query-string)
- Parameters
query_message (str) – string to be queried for.
- Returns
query request body
- Return type
(dict)
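A minimal sketch of a `simple_query_string` request body follows. It assumes the keyword arguments are passed through as query options; `simple_query_string_body` and the field names are hypothetical, while `fields` and `default_operator` are standard options of this query type.

```python
def simple_query_string_body(query_message, **kwargs):
    """Hypothetical stand-in: wrap a query string and its options."""
    query = {"query": query_message}
    # Assumption: extra keyword arguments become simple_query_string options.
    query.update(kwargs)
    return {"query": {"simple_query_string": query}}


body = simple_query_string_body(
    "kinase + human",
    fields=["protein_name", "species"],
    default_operator="and",
)
print(body)
```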
-
3.1.1.3.4. karr_lab_aws_manager.elasticsearch_kl.util module
-
class karr_lab_aws_manager.elasticsearch_kl.util.ComplexEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Bases: json.encoder.JSONEncoder
-
default(o)
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
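An encoder of this kind is handed to `json.dumps` through the `cls` argument. The sketch below uses a hypothetical `IterableAndDateEncoder`; the types that `ComplexEncoder` itself handles are not stated here, so the `datetime` branch is an assumption made for illustration.

```python
import json
from datetime import datetime


class IterableAndDateEncoder(json.JSONEncoder):
    """Hypothetical encoder, not the package's ComplexEncoder."""

    def default(self, o):
        # Assumption: datetimes are serialized as ISO 8601 strings.
        if isinstance(o, datetime):
            return o.isoformat()
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class raise the TypeError for unsupported types.
        return super().default(o)


doc = {"ids": {3}, "updated": datetime(2020, 1, 2, 3, 4, 5)}
encoded = json.dumps(doc, cls=IterableAndDateEncoder, sort_keys=True)
print(encoded)
```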
-
-
class karr_lab_aws_manager.elasticsearch_kl.util.EsUtil(profile_name=None, credential_path=None, config_path=None, elastic_path=None, cache_dir=None, service_name='es', max_entries=inf, verbose=False)
Bases: karr_lab_aws_manager.config.config.establishES
-
add_field_to_index(index, field=None, value=None, query={'match_all': {}}, script_type='inline', script_complete=None)
Add a field with the given value to all documents in index.
- Parameters
index (str) – name of index.
field (str) – name of field.
value (Obj) – value of field.
query (Obj) – query of index.
script_type (str) – type of script, inline or store.
script_complete (str) – content of script.
- Returns
elasticsearch update status description.
- Return type
(HTTPResponse)
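One way to add a field to every matching document is an update-by-query request with a Painless script. The sketch below only builds such a request body. `add_field_body` is hypothetical, and it is an assumption that the method sends a body of this form. The script text is placed under `source`, the key used by recent Elasticsearch versions, which may differ from what an `inline` script type produces in this package.

```python
def add_field_body(field, value, query=None):
    """Hypothetical helper: build an update-by-query body that sets a field."""
    if query is None:
        # A None default avoids sharing one mutable dict between calls.
        query = {"match_all": {}}
    return {
        "query": query,
        "script": {
            "lang": "painless",
            # Field name and value are passed as params, not spliced into the script.
            "source": "ctx._source.put(params.field, params.value)",
            "params": {"field": field, "value": value},
        },
    }


body = add_field_body("reviewed", True)
print(body)
```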
-
allocation_explain()
Choose the first unassigned shard found and explain why it cannot be allocated to a node.
- Returns
http response
- Return type
(HTTPResponse)
-
build_es(suffix=None)
Build the es query object.
- Parameters
suffix (str) – string trailing es endpoint
- Returns
Elasticsearch object
- Return type
(Elasticsearch)
-
change_field_name(pipeline_name, pipeline_description, src_field, target_field, src_idx, dest_idx)
Change field name. (https://www.elastic.co/guide/en/elasticsearch/reference/current/rename-processor.html)
- Parameters
pipeline_name (str) – Name of pipeline.
pipeline_description (str) – Description of pipeline.
src_field (str) – Name of the field before change.
target_field (str) – Name of the field after change.
src_idx (str) – Name of index before change.
dest_idx (str) – Name of index after change.
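The rename processor linked above is configured through an ingest pipeline, which can then be applied while reindexing from one index to another. The sketch below builds the two request bodies. Both helper names are hypothetical, and it is an assumption, based on the `src_idx` and `dest_idx` parameters, that the method pairs the pipeline with a reindex.

```python
def rename_pipeline_body(pipeline_description, src_field, target_field):
    """Hypothetical helper: ingest pipeline with one rename processor."""
    return {
        "description": pipeline_description,
        "processors": [
            {"rename": {"field": src_field, "target_field": target_field}}
        ],
    }


def reindex_body(src_idx, dest_idx, pipeline_name):
    """Hypothetical helper: reindex through the named pipeline."""
    return {
        "source": {"index": src_idx},
        "dest": {"index": dest_idx, "pipeline": pipeline_name},
    }


# Index, field, and pipeline names are invented for illustration.
pipeline = rename_pipeline_body("rename name to protein_name", "name", "protein_name")
reindex = reindex_body("protein_v1", "protein_v2", "rename_name")
print(pipeline)
print(reindex)
```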
-
create_index(index, mappings=None, setting={'settings': {'number_of_replicas': 0, 'number_of_shards': 1}}, additional_settings=None)
Create index. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html)
- Parameters
index (str) – name of index
setting (dict, optional) – index settings. Defaults to {'settings': {'number_of_replicas': 0, 'number_of_shards': 1}}.
mappings (dict, optional) – index mappings. Defaults to None.
additional_settings (dict) – additional settings. Defaults to None.
-
create_index_with_file(index, _file, num_shard=1, num_replica=0)
Create index with an index description file. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html)
- Parameters
index (str) – name of index
_file (dict) – index setting description.
num_shard (int, optional) – number of shards. Defaults to 1.
num_replica (int, optional) – number of replicas. Defaults to 0.
- Returns
(requests.Response)
-
data_to_es_bulk(cursor, index='test', count=None, bulk_size=100, _id='uniprot_id', headers={'Content-Type': 'application/json'})
Load data into the elasticsearch service.
- Parameters
count (int) – cursor size
cursor (pymongo.Cursor or iter) – documents to be PUT/POST to es
index (str) – name of unique key to be used as index for es
bulk_size (int) – number of documents in one PUT
headers (dict) – http header
_id (str) – key in mongo collection for identification
- Returns
set of status codes
- Return type
(set)
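Bulk loading in groups of `bulk_size` amounts to slicing the cursor into fixed-size batches. The sketch below shows one way to do that for any iterable. The generator `batches` is hypothetical and is not taken from the package.

```python
from itertools import islice


def batches(cursor, bulk_size=100, count=None):
    """Yield lists of at most bulk_size documents from any iterable."""
    iterator = iter(cursor)
    if count is not None:
        # Stop after count documents in total.
        iterator = islice(iterator, count)
    while True:
        batch = list(islice(iterator, bulk_size))
        if not batch:
            return
        yield batch


# A generator stands in for a pymongo cursor here.
docs = ({"uniprot_id": "P%05d" % i} for i in range(250))
sizes = [len(batch) for batch in batches(docs, bulk_size=100)]
print(sizes)
```

Each batch could then be serialized into one bulk request, so the number of HTTP calls is roughly the cursor size divided by `bulk_size`.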
-
data_to_es_single(count, cursor, index, _id='uniprot_id', headers={'Content-Type': 'application/json'})
Load data into the elasticsearch service.
- Parameters
count (int) – cursor size
cursor (pymongo.Cursor or iter) – documents to be PUT to es
index (str) – name of unique key to be used as index for es
es_endpoint (str) – elasticsearch endpoint
headers (dict) – http header information
_id (str) – key in mongo collection for identification
- Returns
set of status codes
- Return type
(set)
-
delete_index(index, _id=None)
Delete elasticsearch index.
- Parameters
index (str) – name of index in es
_id (int) – id of the doc in index (optional)
-
enable_fielddata(index, _type, field)
Enable fielddata for type fields.
- Parameters
index (str) – Index in which the operation will be done
_type (str) – Existing mapping for field.
field (str) – name of the field.
-
get_index_mapping(index='.kibana_1')
Get the mapping of one or more indices.
- Parameters
index (str, optional) – Comma-separated list or wildcard expression of index names. Defaults to '.kibana_1'.
- Returns
(requests.Response)
-
index_health_status()
Show the health status, number of documents, and disk usage for each index.
- Returns
http response
- Return type
(HTTPResponse)
-
index_settings(index, number_of_replicas, number_of_shards=1, other_settings={}, headers={'Content-Type': 'application/json'})
Set an index's shard and replica numbers in the es cluster.
- Parameters
index (str) – name of index to be set
number_of_replicas (int) – number of replica shards to be used for the index
number_of_shards (int) – number of primary shards contained in the es cluster
other_settings (dict) – other index settings.
headers (dict) – http request content header description
- Returns
http response
- Return type
(HTTPResponse)
-
make_action_and_metadata(index, _id)
Make action_and_metadata obj for bulk loading, e.g. { "index": { "_index" : "index", "_id" : "id" } }
- Parameters
index (str) – name of index on ES
_id (str) – unique id for document
- Returns
metadata that conforms to ES bulk load requirement
- Return type
(dict)
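In the Elasticsearch bulk format, each document line is preceded by an action-and-metadata line, and the body ends with a newline. The sketch below re-creates the documented dict in a standalone function and uses it to assemble such a payload. `bulk_body` and the sample document are hypothetical.

```python
import json


def make_action_and_metadata(index, _id):
    """Standalone re-creation of the documented example dict."""
    return {"index": {"_index": index, "_id": _id}}


def bulk_body(docs, index, id_key="uniprot_id"):
    """Hypothetical helper: newline-delimited JSON for a bulk request."""
    lines = []
    for doc in docs:
        lines.append(json.dumps(make_action_and_metadata(index, doc[id_key])))
        lines.append(json.dumps(doc))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


payload = bulk_body([{"uniprot_id": "P12345", "name": "a"}], index="test")
print(payload)
```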
-
migrate_index(old_index, new_index, headers={'Content-Type': 'application/json'}, number_of_shards=1, number_of_replicas=0)
Migrate old index to new index while changing the shard and replica settings.
- Parameters
old_index (str) – name of the old index.
new_index (str) – name of the new index
headers (HTTP.header, optional) – header. Defaults to {'Content-Type': 'application/json'}.
number_of_shards (int, optional) – number of shards for the index. Defaults to 1.
number_of_replicas (int, optional) – number of replicas for the index. Defaults to 0.
- Returns
(list of requests.Response)
-
put_mapping(index, body)
Put index mapping to existing index. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-mapping.html)
- Parameters
index (str) – mapping for the index.
body (dict) – mapping description.
- Returns
(requests.Response)
-
test_analyzer(msg, tokenizer='standard', index=None)
Test ES analyzer / tokenizer results. (https://www.elastic.co/guide/en/elasticsearch/reference/6.8/analysis-standard-tokenizer.html)
- Parameters
msg (str) – Message to be analyzed.
tokenizer (str, optional) – analyzer to be used.
index (str, optional) – Index in which custom analyzer resides.
- Returns
http response
- Return type
(HTTPResponse)
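Tokenizer tests of this kind map onto the Elasticsearch `_analyze` endpoint, which can be called cluster-wide or on a specific index. The sketch below builds the path and body for such a request. `analyze_request` is hypothetical, and it is an assumption that the method sends the `tokenizer` key rather than `analyzer`.

```python
def analyze_request(msg, tokenizer="standard", index=None):
    """Hypothetical helper: path and body for an _analyze request."""
    # An index-scoped path gives access to analyzers defined on that index.
    path = "/_analyze" if index is None else "/{}/_analyze".format(index)
    body = {"tokenizer": tokenizer, "text": msg}
    return path, body


path, body = analyze_request("Protein kinase A", index="protein")
print(path, body)
```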
-
unassigned_reason()
Send an http request to get the reason a shard is unassigned.
- Returns
http response
- Return type
(HTTPResponse)
-
update_alias_to_idx(idx, alias, action='add')
Add an alias to, or remove an alias from, one or more indices. (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html)
- Parameters
idx (str or list of str) – official name or names of the indices.
alias (str) – index alias.
action (str) – add or remove
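The aliases API linked above takes a list of actions, one per index and alias pair. The sketch below builds that body for a single index name or a list of names. `alias_actions_body` is hypothetical and the index names are invented. The real method may build its request differently.

```python
def alias_actions_body(idx, alias, action="add"):
    """Hypothetical helper: request body for the _aliases endpoint."""
    if action not in ("add", "remove"):
        raise ValueError("action must be 'add' or 'remove'")
    # Accept either one index name or a list of names.
    indices = [idx] if isinstance(idx, str) else list(idx)
    return {
        "actions": [{action: {"index": name, "alias": alias}} for name in indices]
    }


body = alias_actions_body(["protein_v1", "protein_v2"], "protein")
print(body)
```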
-