wukong package

Submodules

wukong.api module

class wukong.api.SolrAPI(solr_hosts, solr_collection, zookeeper_hosts=None, timeout=15, zookeeper_timeout=5)

Bases: object

add_schema_fields(fields)

Add new fields to the schema of the current collection.

Parameters:fields (list) – a list of dicts of fields.
commit()

Hard commit documents to SOLR.

delete(unique_key, unique_key_value, commit=False)

Delete a document from SOLR.

Parameters:
  • unique_key – the unique key for the doc to delete
  • unique_key_value – the value for the unique_key
  • commit – whether or not we should commit the documents.
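As a rough illustration of what a delete call might send, the sketch below builds a delete-by-query body in SOLR's JSON update format plus an optional commit parameter. The helper name build_delete_request is hypothetical, and whether wukong builds its request this way is an assumption; only the three parameter names come from the signature above.

```python
import json


def build_delete_request(unique_key, unique_key_value, commit=False):
    """Build (params, body) for a SOLR JSON delete-by-query request.

    Hypothetical helper, not part of wukong. Values containing double
    quotes are not escaped in this sketch.
    """
    query = '%s:"%s"' % (unique_key, unique_key_value)
    body = json.dumps({"delete": {"query": query}})
    # SOLR accepts commit=true as a request parameter on /update.
    params = {"commit": "true"} if commit else {}
    return params, body


params, body = build_delete_request("id", "doc-1", commit=True)
```

Passing commit=False leaves the params dict empty, so the deletion would only become visible after a later commit.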
get_schema()

Get the SOLR schema for the solr collection.

Returns:the schema for the current collection
Return type:dict
is_alive()

Check with ZooKeeper whether the current collection is live.

Returns:whether or not the collection is live
Return type:boolean
select(query_dict, groups=False, facets=False, stats=False, **kwargs)

Query documents from SOLR.

Parameters:
  • query_dict (dict) – a dict containing the query params to SOLR
  • metadata (boolean) – whether or not solr metadata should be returned
  • kwargs (dict) – a dict of additional params for SOLR
Returns:reformatted response from SOLR
Return type:dict

update(docs, commit=False)

Add new docs or update existing docs.

Parameters:
  • docs – a list of instances of SolrDoc.
  • commit – whether or not we should commit the documents.

wukong.errors module

exception wukong.errors.SolrDeleteUniqueKeyError(pk)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrDocumentNotExistError(pk)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrDuplicateUniqueKeyError(pk)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrError(message=None, status_code=None)

Bases: exceptions.Exception

exception wukong.errors.SolrSchemaUpdateError(fields, message=None, status_code=None)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrSchemaValidationError(field, message=None)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrUnspecifiedOperatorError(field_name)

Bases: wukong.errors.SolrError

exception wukong.errors.SolrUnspportedOperatorError(operator)

Bases: wukong.errors.SolrError
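Because every exception in this module derives from SolrError, a caller can catch the base class to handle any of them in one place. The sketch below uses locally defined stand-in classes so that it runs without wukong installed. The names, constructor signatures and base classes follow the listing above; the constructor bodies, the message wording and the describe_failure helper are assumptions.

```python
class SolrError(Exception):
    """Stand-in for wukong.errors.SolrError (body is an assumption)."""

    def __init__(self, message=None, status_code=None):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


class SolrDocumentNotExistError(SolrError):
    """Stand-in for the subclass that takes the missing document's key."""

    def __init__(self, pk):
        # The message wording is hypothetical.
        super().__init__(message="document %r does not exist" % (pk,))
        self.pk = pk


def describe_failure(operation):
    """Run operation() and report any SolrError subclass in one handler."""
    try:
        operation()
    except SolrError as err:
        return "%s: %s" % (type(err).__name__, err.message)
    return "ok"


def fetch_missing():
    """Simulate a lookup that fails for primary key 42."""
    raise SolrDocumentNotExistError(42)
```

Calling describe_failure(fetch_missing) reports the subclass name, while a handler written against the subclass alone would let the other errors propagate.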

wukong.models module

class wukong.models.SolrDoc(partial_update=False, field_weights=None, **kwargs)

Bases: object

The base class for modeling any type of solr document

classmethod add_schema_fields(fields)

Add fields to the SOLR schema by calling the SOLR schema API.

Parameters:fields (list) – a list of field meta info (e.g. name, type)
collection_name = None
del_field(key)

Delete the value of a field from the SolrDoc

delete(commit=False)

Delete the current SolrDoc from SOLR.

Parameters:commit (boolean) – whether or not to commit upon submission.
documents = <wukong.query.SolrQueryManager object>
classmethod from_json_docs(json_docs)

Convert a list of JSON dicts from SOLR to a list of SolrDoc instances.

Parameters:json_docs (list) – a list of dict returned from SOLR server
Returns:a list of SolrDoc
Return type:list
get_data_for_solr()

Generate the data for SOLR indexing for the current SolrDoc.

Returns:a json representation of a SolrDoc for SOLR indexing. If the SolrDoc is invalid, it returns None.
Return type:dict
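To make the partial-update flag more concrete, the sketch below shows one way a document could be serialized for indexing. It relies on SOLR's atomic update syntax, where a field value of {"set": value} replaces only that field. The function name data_for_solr, its arguments, and the rule that a document without its unique key counts as invalid are all assumptions, not wukong's actual behaviour.

```python
def data_for_solr(fields, unique_key, partial_update=False):
    """Serialize a field dict for SOLR indexing (hypothetical helper).

    Returns None when the unique key is missing; treating that as the
    "invalid" case is an assumption of this sketch.
    """
    if unique_key not in fields:
        return None
    if not partial_update:
        # Full update: send the document as-is.
        return dict(fields)
    # Partial update: keep the key plain and wrap every other field
    # in SOLR's atomic "set" modifier.
    doc = {unique_key: fields[unique_key]}
    for name, value in fields.items():
        if name != unique_key:
            doc[name] = {"set": value}
    return doc


partial = data_for_solr({"id": "1", "city": "Paris"}, "id", partial_update=True)
```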
get_field(key)

Get the value of a field from the SolrDoc

get_field_weight(field)

Get the weight of a field from a SolrDoc

get_unique_field()

Get the value of the unique key in the SolrDoc

index(commit=False)

Index the current SolrDoc in SOLR.

Parameters:commit (boolean) – whether or not to commit upon submission.
is_partial_update()

Return whether the SolrDoc is for partial update only

request_timeout = 15
schema

Reference the property from the metaclass and return the current collection schema

set_field(key, value)

Set the value of a field in the SolrDoc

set_field_weight(field, weight)

Set the weight of a field in the SolrDoc

set_fields(**fields)

Set the values for all fields in the SolrDoc

set_partial_update(value)

Set the value of partial update

solr

Reference the property from the metaclass and return an instance of the SOLR API class.

solr_hosts = None
unique_key

Reference the property from the metaclass and return the unique key of the collection schema

classmethod validate_schema_fields(fields)

Validate if the fields are valid for SOLR by checking with the schema

Parameters:fields (list) – a list of dicts for the field meta info
Returns:whether or not the fields are consistent with SOLR schema
Return type:boolean
zookeeper_hosts = None
zookeeper_timeout = 5
class wukong.models.SolrDocMetaClass

Bases: type

Meta class for SolrDoc

documents

Return an instance of the SOLR Query Manager class.

schema

Return the current collection schema

solr

Return an instance of the SOLR API class.

unique_key

Return the unique key of the collection schema

class wukong.models.SolrDocs(docs=None)

Bases: object

A wrapper container for a collection of SolrDoc instances, designed for batch update and delete

add(doc)

Add a document to the SolrDocs container

Parameters:doc (SolrDoc) – the document to add
delete(commit=False)

Delete all current documents in the container from the SOLR collection

Parameters:commit (boolean) – whether or not to commit to SOLR when submitted
index(commit=False)

Index all current documents in the container to the SOLR collection

Parameters:commit (boolean) – whether or not to commit to SOLR when submitted

wukong.query module

class wukong.query.AND(*args, **kwargs)

Bases: wukong.query.SolrNode

Model the AND logic in SOLR query

logic = 'AND'
parsed_solr_query
class wukong.query.Comparator(operator, key, value)

Bases: wukong.query.SolrNode

Model the compare logic in SOLR query

logic = 'COMP'
parsed_solr_query
class wukong.query.NOT(*args, **kwargs)

Bases: wukong.query.SolrNode

Model the NOT logic in SOLR query

logic = 'NOT'
parsed_solr_query
class wukong.query.OR(*args, **kwargs)

Bases: wukong.query.SolrNode

Model the OR logic in SOLR query

logic = 'OR'
parsed_solr_query
class wukong.query.SolrNode(*args, **kwargs)

Bases: object

The base class to model a tree node in the query logic to SOLR

classmethod build_items(args, kwargs)

Build a list of items under the current logic operator

parsed_solr_query

Parse the current node to a query string
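The node classes above form a tree whose parsed_solr_query property flattens it into a query string. A minimal sketch of that idea follows, using stand-in classes (Node, AndNode, OrNode, NotNode) that are hypothetical and defined only for this example; the exact string wukong produces, including its parenthesization, may differ.

```python
class Node:
    """Hypothetical logic node; not wukong's SolrNode."""

    logic = None

    def __init__(self, *items):
        # Items are either child nodes or already-formatted clauses.
        self.items = items

    @property
    def parsed_solr_query(self):
        parts = [
            item.parsed_solr_query if isinstance(item, Node) else str(item)
            for item in self.items
        ]
        return "(" + (" %s " % self.logic).join(parts) + ")"


class AndNode(Node):
    logic = "AND"


class OrNode(Node):
    logic = "OR"


class NotNode(Node):
    logic = "NOT"

    @property
    def parsed_solr_query(self):
        # Negate the conjunction of the child items.
        return "NOT " + AndNode(*self.items).parsed_solr_query


tree = AndNode(
    'name:"Test Name"',
    OrNode("city:Test*", "population:[300000 TO *]"),
)
```

Here tree.parsed_solr_query walks the children recursively, so nesting depth is limited only by how the nodes are composed.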

class wukong.query.SolrQueryManager(doc_class, node=None, sort_str=None, weight_dict=None, returned_fields=None, edismax=False, rows=999999999, start=0, facet_fields=None, facet_options={}, mincount=1, facet_group=True, group_fields=None, group_limit=0, group_options={}, boost_func=None, bf_weight=1, boost_query=None, bq_weight=1, minimum_matches=None, text_keywords=None, stats_fields=None)

Bases: object

A class to chain different query methods for SOLR and construct the final query to SOLR

all(**extra)

Retrieve all matched documents from SOLR and convert them into SolrDocs

Returns:documents from SOLR
Return type:list of SolrDoc
boost_by_func(boost_func, bf_weight=1)

Boost query by function

boost_by_query(boost_query, bq_weight=1)

Boost query by query

create(field_weights=None, **kwargs)

Create one document in SOLR

Returns:document created in SOLR
Return type:SolrDoc
facet(facet_fields, mincount=1, group=True, **kwargs)

Facet the SOLR documents

facets(**extra)

Retrieve document facets when facet is ON in the query

Returns:facet counts of SOLR documents
Return type:dict
filter(*args, **kwargs)

Filter the SOLR documents by mathematical and logical operators. Usage:

filter(name__eq="Test Name")
filter(name__eq="Test Name", city__wc="Test*")
filter(OR(name__eq="Test Name", city__wc="Test*"), population__ge=300000)
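The usage lines above encode the operator as a suffix after a double underscore. The sketch below shows one way such keyword arguments could be split and rendered as SOLR query clauses. The function parse_filter is hypothetical, it covers only the three operators that appear in the usage (eq, wc, ge), and the clause syntax is an assumption about the output. It raises ValueError where the library presumably raises its own operator errors.

```python
def parse_filter(**kwargs):
    """Turn field__operator keyword arguments into SOLR query clauses."""
    clauses = []
    # Sort the keys so the output order is deterministic.
    for key, value in sorted(kwargs.items()):
        field, _, operator = key.rpartition("__")
        if not field:
            raise ValueError("no operator given for %r" % key)
        if operator == "eq":
            # Exact match on a quoted phrase.
            clauses.append('%s:"%s"' % (field, value))
        elif operator == "wc":
            # Wildcard match: the value is left unquoted.
            clauses.append("%s:%s" % (field, value))
        elif operator == "ge":
            # Greater than or equal: an inclusive open-ended range.
            clauses.append("%s:[%s TO *]" % (field, value))
        else:
            raise ValueError("unsupported operator %r" % operator)
    return " AND ".join(clauses)


query = parse_filter(name__eq="Test Name", city__wc="Test*")
```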
get(*args, **kwargs)

Fetch one document from SOLR

Returns:document from SOLR
Return type:SolrDoc
group_by(group_fields, group_limit=0, **kwargs)

Group the SOLR documents

groups(**extra)

Retrieve document groups when group is ON in the query

Returns:document groups of SOLR documents
Return type:dict
limit(rows)

Specify the number of documents returned.

offset(start)

Specify the offset into the full set of matched documents
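limit and offset correspond to the rows and start arguments of the query manager, so together they can express pagination. A small sketch of the arithmetic follows; page_params is a hypothetical helper, not part of wukong.

```python
def page_params(page, page_size):
    """Return the rows/start values for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # start counts the documents to skip before the first one returned.
    return {"rows": page_size, "start": (page - 1) * page_size}


third_page = page_params(3, 20)
```

Under these assumptions, the third page of 20 documents would be requested with limit(20) and offset(40).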

one(**extra)

Get one document from SOLR and convert it into SolrDoc

Returns:one document from SOLR
Return type:SolrDoc
only(*args)

Specify the returned fields in each document.

query

Construct the query string to SOLR

raw(**extra)

Retrieve matched documents from SOLR in json format

Returns:documents from SOLR
Return type:list of json
search(text, minimum_matches=None, **weights)

Search SOLR by text query.

sort_by(sort_str)

Sort the SOLR documents. Usage:

ascending: sort_by("name")
descending: sort_by("-date")
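SOLR's own sort parameter takes the form "field asc" or "field desc", so the leading minus sign has to be translated at some point. The sketch below shows that translation in isolation; to_solr_sort is a hypothetical name, and how wukong performs this step internally is an assumption.

```python
def to_solr_sort(sort_str):
    """Translate "name" / "-date" into SOLR's "field direction" form."""
    if sort_str.startswith("-"):
        # A leading minus sign requests descending order.
        return "%s desc" % sort_str[1:]
    return "%s asc" % sort_str


ascending = to_solr_sort("name")
descending = to_solr_sort("-date")
```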
stats(stats_fields)
to_dict()

Get the json representation of the query manager

update(field_weights=None, **kwargs)

Update one document in SOLR

Returns:document updated in SOLR
Return type:SolrDoc

wukong.request module

class wukong.request.SolrRequest(solr_hosts, zookeeper_hosts=None, timeout=15, refresh_frequency=2, zookeeper_timeout=5)

Bases: object

Handle requests to SOLR and responses from SOLR

attempt_zookeeper_refresh()
current_hosts
get(path, params=None, headers=None)

Send a GET request to the SOLR servers

post(path, params=None, body=None, headers=None)

Send a POST request to the SOLR servers

request(path, params, method, body=None, headers=None, is_retry=False)

Prepare data and send request to SOLR servers
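The class accepts several hosts and request takes an is_retry flag, which suggests that a failed request can be retried elsewhere. The listing does not describe the retry policy, so the sketch below is only one plausible shape: try each host in order and return the first successful response. The function request_with_failover and the injected send callable are hypothetical.

```python
def request_with_failover(hosts, path, send):
    """Try each host in order and return the first successful response.

    send is any callable that takes a full URL and either returns a
    response or raises ConnectionError. It is injected so the sketch
    needs no network access.
    """
    last_error = None
    for host in hosts:
        try:
            return send(host + path)
        except ConnectionError as err:
            # Remember the failure and move on to the next host.
            last_error = err
    raise ConnectionError("all SOLR hosts failed") from last_error


attempted = []


def fake_send(url):
    """Pretend that any host whose name contains "down" is unreachable."""
    attempted.append(url)
    if "down" in url:
        raise ConnectionError(url)
    return "ok:" + url


result = request_with_failover(
    ["http://down:8983", "http://up:8983"], "/solr/demo/select", fake_send
)
```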

zookeeper
wukong.request.process_response(response)

wukong.zookeeper module

class wukong.zookeeper.Zookeeper(hosts, connection_timeout=5)

Bases: object

Retrieve the status of SOLR servers from Zookeeper

get_active_hosts(collection_name=None)

Get the current active SOLR hosts from Zookeeper

Parameters:collection_name – If provided, the name of a SOLR collection to get the hosts for. If not provided, all solr hosts will be returned. Optional.
Returns list[str]:
 A list of solr nodes in the form http://hostname
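SolrCloud registers live nodes in ZooKeeper under names such as host:port_context (for example 127.0.0.1:8983_solr), so producing URLs of the form described above requires a small conversion. The sketch below shows that step only; live_node_to_url is a hypothetical helper, and whether wukong keeps the port or appends the context path is not stated in this listing.

```python
def live_node_to_url(node_name):
    """Convert a live-node name like "host:8983_solr" to an http URL.

    Assumes the host part contains no underscore, which holds for
    valid hostnames and IP addresses.
    """
    host_port, _, _context = node_name.partition("_")
    return "http://%s" % host_port


urls = [
    live_node_to_url(name)
    for name in ["solr1.example.com:8983_solr", "10.0.0.2:8983_solr"]
]
```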

Module contents