Create a pyelasticsearch ElasticSearch object and return it.
This will aggressively re-use ElasticSearch objects with the following rules:
Examples:
# Returns cached ElasticSearch object
es = get_es()
# Returns a new ElasticSearch object
es = get_es(force_new=True)
es = get_es(urls=['http://localhost:9200'])
es = get_es(urls=['http://localhost:9200'], timeout=10,
            max_retries=3)
Represents a lazy Elasticsearch Search API request.
The API for S takes inspiration from Django’s QuerySet.
S can be either typed or untyped. An untyped S returns dict results by default.
An S is lazy in the sense that it doesn’t send an Elasticsearch search request until it’s forced to evaluate, which happens when you iterate over it, call .count(), call len(s), or call .facet_counts().
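The lazy behavior can be pictured with a small stand-in class. This is only a sketch: LazySearch, requests_made, and the result caching are hypothetical names and assumptions made for illustration, not part of ElasticUtils.

```python
class LazySearch(object):
    """Hypothetical stand-in that mimics the lazy behavior described for S."""

    def __init__(self, docs):
        self._docs = docs          # pretend these documents live in Elasticsearch
        self._cache = None         # filled in on first evaluation (an assumption)
        self.requests_made = 0     # counts simulated search requests

    def _execute(self):
        # Only "hit the server" the first time results are needed.
        if self._cache is None:
            self.requests_made += 1
            self._cache = list(self._docs)
        return self._cache

    def __iter__(self):
        return iter(self._execute())

    def __len__(self):
        return len(self._execute())


s = LazySearch([{'id': 1}, {'id': 2}])
before = s.requests_made            # building the object sent nothing
ids = [doc['id'] for doc in s]      # iterating forces evaluation
after = s.requests_made
```

Nothing is requested when the object is built; the first iteration triggers the single simulated request.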
Adding support for other queries
You can add support for queries that S doesn’t have support for by subclassing S with a method called process_query_ACTION. This method takes a key, value and an action.
For example:
class FunkyS(S):
    def process_query_funkyquery(self, key, val, action):
        return {'funkyquery': {'field': key, 'value': val}}
Then you can use that just like other actions:
s = FunkyS().query(Q(foo__funkyquery='bar'))
s = FunkyS().query(foo__funkyquery='bar')
Many Elasticsearch queries take other arguments. This is a good way of using different arguments. For example, if you wanted to write a handler for fuzzy for dates, you could do:
class FunkyS(S):
    def process_query_fuzzy(self, key, val, action):
        # val here is a (value, min_similarity) tuple
        return {
            'fuzzy': {
                key: {
                    'value': val[0],
                    'min_similarity': val[1]
                }
            }
        }
Used:
s = FunkyS().query(created__fuzzy=(created_date, '1d'))
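One way to picture how a field__action keyword could reach a process_query_ACTION method is the sketch below. MiniS, FunkyMiniS, and build_query are hypothetical names, and the dispatch shown is an assumption for illustration rather than the actual ElasticUtils code.

```python
class MiniS(object):
    """Hypothetical stand-in for S; only the action dispatch is modeled."""

    def build_query(self, **kw):
        parts = []
        for lookup, val in sorted(kw.items()):
            # 'created__fuzzy' -> field 'created', action 'fuzzy'
            field, _, action = lookup.partition('__')
            handler = getattr(self, 'process_query_' + action, None)
            if handler is None:
                raise ValueError('unsupported action: %r' % action)
            parts.append(handler(field, val, action))
        return parts


class FunkyMiniS(MiniS):
    def process_query_fuzzy(self, key, val, action):
        # val is a (value, min_similarity) tuple
        return {'fuzzy': {key: {'value': val[0], 'min_similarity': val[1]}}}


query_parts = FunkyMiniS().build_query(created__fuzzy=('2013-01-01', '1d'))
```

Because the handler is looked up by name, adding a new action only requires adding a method to a subclass.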
Adding support for other filters
You can add support for filters that S doesn’t have support for by subclassing S with a method called process_filter_ACTION. This method takes a key, value and an action.
For example:
class FunkyS(S):
    def process_filter_funkyfilter(self, key, val, action):
        return {'funkyfilter': {'field': key, 'value': val}}
Then you can use that just like other actions:
s = FunkyS().filter(F(foo__funkyfilter='bar'))
s = FunkyS().filter(foo__funkyfilter='bar')
Create and return an S.
Parameters: | type – class; the model that this S is based on |
---|
Chaining transforms
- query(*queries, **kw)
Return a new S instance with query args combined with existing set in a must boolean query.
Parameters:
- queries – instances of Q
- kw – queries in the form of field__action=value
There are three special flags you can use:
must=True: Specifies that the queries and kw queries must match in order for a document to be in the result.
If you don’t specify a special flag, this is the default.
should=True: Specifies that the queries and kw queries should match in order for a document to be in the result.
must_not=True: Specifies the queries and kw queries must not match in order for a document to be in the result.
These flags work by putting those queries in the appropriate clause of an Elasticsearch boolean query.
Examples:
>>> s = S().query(foo='bar')
>>> s = S().query(Q(foo='bar'))
>>> s = S().query(foo='bar', bat__text='baz')
>>> s = S().query(foo='bar', should=True)
>>> s = S().query(foo='bar', should=True).query(baz='bat', must=True)
Notes:
- Don’t specify multiple special flags; if you do, should takes precedence.
- If you don’t specify any, it defaults to must.
- You can specify special flags in the elasticutils.Q, too. If you’re building your query incrementally, using elasticutils.Q helps a lot.
See the documentation on elasticutils.Q for more details on composing queries with Q.
See the documentation on elasticutils.S for more details on adding support for more query types.
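As a rough illustration of how the must, should, and must_not flags map onto an Elasticsearch boolean query, the sketch below groups tagged queries into clauses. build_bool_query and its (flag, query) input form are hypothetical, and the term queries are placeholders.

```python
def build_bool_query(tagged_queries):
    """Group (flag, query) pairs into the clauses of a bool query.

    `tagged_queries` is a hypothetical intermediate form used only here.
    """
    clauses = {}
    for flag, query in tagged_queries:
        if flag not in ('must', 'should', 'must_not'):
            raise ValueError('unknown flag: %r' % flag)
        clauses.setdefault(flag, []).append(query)
    return {'bool': clauses}


# Roughly what S().query(foo='bar', should=True).query(baz='bat', must=True)
# asks for, using placeholder term queries.
query = build_bool_query([
    ('should', {'term': {'foo': 'bar'}}),
    ('must', {'term': {'baz': 'bat'}}),
])
```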
- query_raw(query)
Return a new S instance with a query_raw.
Parameters: query – Python dict specifying the complete query to send to Elasticsearch
Example:
S().query_raw({'match': {'title': 'example'}})
Note
If there’s a query_raw in your S, then that’s your query. All .query(), .demote(), .boost() and anything else that affects the query clause are ignored.
- filter(*filters, **kw)
Return a new S instance with filter args combined with existing set with AND.
Parameters:
- filters – this will be instances of F
- kw – this will be in the form of field__action=value
Examples:
>>> s = S().filter(foo='bar')
>>> s = S().filter(F(foo='bar'))
>>> s = S().filter(foo='bar', bat='baz')
>>> s = S().filter(foo='bar').filter(bat='baz')
By default, everything is combined using AND. If you provide multiple filters in a single filter call, those are ANDed together. If you provide multiple filters in multiple filter calls, those are ANDed together.
If you want something different, use the F class which supports & (and), | (or) and ~ (not) operators. Then call filter once with the resulting F instance.
See the documentation on elasticutils.F for more details on composing filters with F.
See the documentation on elasticutils.S for more details on adding support for new filter types.
- order_by(*fields)
Return a new S instance with results ordered as specified.
You can order search results by the specified fields:
q = (S().query(title='trucks')
        .order_by('title'))
This orders search results by the title field in ascending order.
To sort in descending order, prepend a -:
q = (S().query(title='trucks')
        .order_by('-title'))
You can also sort by the computed field _score.
Note
Calling this again will overwrite previous .order_by() calls.
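The leading - convention can be sketched as a translation into an Elasticsearch sort list. sort_spec is a hypothetical helper written for this illustration; the exact structure ElasticUtils sends may differ.

```python
def sort_spec(*fields):
    """Translate order_by-style field names into an Elasticsearch sort list."""
    spec = []
    for field in fields:
        if field.startswith('-'):
            # A leading '-' means descending order on the remaining name.
            spec.append({field[1:]: 'desc'})
        else:
            spec.append({field: 'asc'})
    return spec


ascending = sort_spec('title')
descending = sort_spec('-title', '_score')
```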
- boost(**kw)
Return a new S instance with field boosts.
ElasticUtils allows you to specify query-time field boosts with .boost(). It takes a set of arguments where the keys are either field names or field name + __ + field action.
Examples:
q = (S().query(title='taco trucks', description__text='awesome')
        .boost(title=4.0, description__text=2.0))
If the key is a field name, then the boost will apply to all query bits that have that field name. For example:
q = (S().query(title='trucks', title__prefix='trucks', title__fuzzy='trucks')
        .boost(title=4.0))
applies a 4.0 boost to all three query bits because all three query bits are for the title field name.
If the key is a field name and field action, then the boost will apply only to that field name and field action. For example:
q = (S().query(title='trucks', title__prefix='trucks', title__fuzzy='trucks')
        .boost(title__prefix=4.0))
will only apply the 4.0 boost to title__prefix.
Boosts are relative to one another and all boosts default to 1.0.
For example, if you had:
qs = (S().boost(title=4.0, summary=2.0)
         .query(title__text=value, summary__text=value, content__text=value, should=True))
title__text would be boosted twice as much as summary__text and summary__text twice as much as content__text.
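The lookup rules for boosts can be sketched as follows. resolve_boost is a hypothetical helper, and letting a field__action key win over a bare field key when both are present is an assumption of this sketch, since the text does not say which takes precedence.

```python
def resolve_boost(boosts, field, action=None):
    """Pick the boost for one query bit.

    Assumption of this sketch: a 'field__action' key wins over a bare
    'field' key when both are given; unboosted bits get 1.0.
    """
    if action is not None:
        specific = '%s__%s' % (field, action)
        if specific in boosts:
            return boosts[specific]
    return boosts.get(field, 1.0)


boosts = {'title': 4.0, 'summary': 2.0}
title_boost = resolve_boost(boosts, 'title', 'text')
summary_boost = resolve_boost(boosts, 'summary', 'text')
content_boost = resolve_boost(boosts, 'content', 'text')
```

With these values, title is weighted twice as much as summary, and summary twice as much as the unboosted content field.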
- demote(amount_, *queries, **kw)
Returns a new S instance with boosting query and demotion.
You can demote documents that match query criteria:
q = (S().query(title='trucks')
        .demote(0.5, description__text='gross'))
q = (S().query(title='trucks')
        .demote(0.5, Q(description__text='gross')))
This is implemented using the boosting query in Elasticsearch. Anything you specify with .query() goes into the positive section. The negative boost is the first argument to .demote(), and the negative query is built from the arguments that follow it.
Note
Calling this again will overwrite previous .demote() calls.
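For orientation, an Elasticsearch boosting query has positive, negative, and negative_boost parts. The sketch below assembles one by hand; boosting_query is a hypothetical helper, and the term and match clauses are placeholders rather than what ElasticUtils necessarily generates.

```python
def boosting_query(positive, negative, negative_boost):
    """Assemble the body of an Elasticsearch boosting query."""
    return {
        'boosting': {
            'positive': positive,
            'negative': negative,
            'negative_boost': negative_boost,
        }
    }


# Roughly S().query(title='trucks').demote(0.5, description__text='gross'),
# with placeholder clauses standing in for the generated queries.
body = boosting_query(
    positive={'term': {'title': 'trucks'}},
    negative={'match': {'description': 'gross'}},
    negative_boost=0.5,
)
```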
- facet(*args, **kw)
Return a new S instance with facet args combined with existing set.
- facet_raw(**kw)
Return a new S instance with raw facet args combined with existing set.
- highlight(*fields, **kwargs)
Set highlight/excerpting with specified options.
Parameters: fields – The list of fields to highlight. If the field is None, then the highlight is cleared.
Additional keyword options:
- pre_tags – List of tags before highlighted portion
- post_tags – List of tags after highlighted portion
Results will have a _highlight property which contains the highlighted field excerpts.
For example:
q = (S().query(title__text='crash', content__text='crash')
        .highlight('title', 'content'))
for result in q:
    print result._highlight['title']
    print result._highlight['content']
If you pass in None, it will clear the highlight.
For example, this search won’t highlight anything:
q = (S().query(title__text='crash')
        .highlight('title')   # highlights 'title' field
        .highlight(None))     # clears highlight
Note
Calling this again will overwrite previous .highlight() calls.
Note
Make sure the fields you’re highlighting are indexed correctly. Read the Elasticsearch documentation for details.
- values_list(*fields)
Return a new S instance that returns ListSearchResults.
Parameters: fields – the list of fields to have in the results.
With no arguments, returns a list of tuples of all the data for that document.
With arguments, returns a list of tuples where the fields in the tuple are in the order specified.
For example:
>>> list(S().values_list())
[(1, 'fred', 40), (2, 'brian', 30), (3, 'james', 45)]
>>> list(S().values_list('id', 'name'))
[(1, 'fred'), (2, 'brian'), (3, 'james')]
>>> list(S().values_list('name', 'id'))
[('fred', 1), ('brian', 2), ('james', 3)]
Note
If you don’t specify fields, the data comes back in an arbitrary order. It’s probably best to specify fields or use values_dict.
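The difference between tuple results and dict results can be sketched on plain dicts. as_values_list and as_values_dict are hypothetical helpers that only model the case where fields are given.

```python
def as_values_list(docs, *fields):
    """Project each document to a tuple, in the requested field order."""
    return [tuple(doc[f] for f in fields) for doc in docs]


def as_values_dict(docs, *fields):
    """Project each document to a dict holding only the requested fields."""
    return [dict((f, doc[f]) for f in fields) for doc in docs]


docs = [
    {'id': 1, 'name': 'fred', 'age': 40},
    {'id': 2, 'name': 'brian', 'age': 30},
]
pairs = as_values_list(docs, 'name', 'id')
subset = as_values_dict(docs, 'id', 'name')
```

Tuples depend on field order, which is why naming the fields explicitly (or asking for dicts) avoids surprises.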
- values_dict(*fields)
Return a new S instance that returns DictSearchResults.
Parameters: fields – the list of fields to have in the results.
With no arguments, this returns a list of dicts with all the fields.
With arguments, it returns a list of dicts with the specified fields.
For example:
>>> list(S().values_dict())
[{'id': 1, 'name': 'fred', 'age': 40}, ...]
>>> list(S().values_dict('id', 'name'))
[{'id': 1, 'name': 'fred'}, ...]
- es(**settings)
Return a new S with specified ElasticSearch settings.
This allows you to configure the ElasticSearch object that gets used to execute the search.
Parameters: settings – the settings you’d use to build the ElasticSearch object; these are the same settings you’d pass to get_es().
- indexes(*indexes)
Return a new S instance that will search specified indexes.
- doctypes(*doctypes)
Return a new S instance that will search specified doctypes.
Note
Elasticsearch calls these “mapping types”. It’s the name associated with a mapping.
- explain(value=True)
Return a new S instance with explain set.
Methods to override if you need different behavior
- get_es(default_builder=get_es)
Returns the ElasticSearch object to use.
Parameters: default_builder – The function that takes a bunch of arguments and generates a pyelasticsearch ElasticSearch object.
Note
If you desire special behavior regarding building the ElasticSearch object for this S, subclass S and override this method.
- get_indexes(default_indexes=None)
Returns the list of indexes to act on.
- get_doctypes(default_doctypes=None)
Returns the list of doctypes to use.
- to_python(obj)
Converts strings in a data structure to Python types.
It converts datetime-ish things to Python datetimes.
Override if you want something different.
Parameters: obj – Python datastructure
Returns: Python datastructure with strings converted to Python types
Note
This does the conversion in-place!
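An in-place conversion of this kind might look like the sketch below. convert_in_place and DATETIME_FORMAT are hypothetical, and the single 'YYYY-MM-DDTHH:MM:SS' format is an assumption; the formats ElasticUtils actually recognizes are not listed here.

```python
from datetime import datetime

# Assumption: datetimes arrive as 'YYYY-MM-DDTHH:MM:SS' strings.
DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%S'


def convert_in_place(obj):
    """Convert datetime-looking strings inside obj, modifying obj itself."""
    if isinstance(obj, dict):
        items = list(obj.items())
    elif isinstance(obj, list):
        items = list(enumerate(obj))
    else:
        return obj
    for key, val in items:
        if isinstance(val, str):
            try:
                obj[key] = datetime.strptime(val, DATETIME_FORMAT)
            except ValueError:
                pass  # not a datetime; leave the string alone
        else:
            convert_in_place(val)  # recurse into nested dicts and lists
    return obj


doc = {'name': 'fred', 'created': '2013-05-01T12:30:00',
       'tags': ['a', '2013-05-02T00:00:00']}
result = convert_in_place(doc)
```

The returned object is the same one that was passed in, so callers holding a reference to the original see the converted values too.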
Methods that force evaluation
- __iter__()
Executes search and returns an iterator of results.
Returns: iterator of results
For example:
>>> s = S().query(name__prefix='Jimmy')
>>> for obj in s.execute():
...     print obj['id']
...
- __len__()
Executes search and returns the number of results you’d get as an integer.
Returns: integer
For example:
>>> s = S().query(name__prefix='Jimmy')
>>> count = len(s)
>>> results = s.execute()
>>> count == len(results)
True
Note
This is very different from calling .count(). If you call .count() you get the total number of results that Elasticsearch thinks match your search. If you call len(s), then you get the number of results you’d get if you executed the search. This factors in slices and default from and size values.
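The gap between the total match count and the number of results actually returned comes down to the from/size window. simulated_len is a hypothetical helper; its defaults of 0 and 10 are assumptions that mirror Elasticsearch's usual from and size defaults.

```python
def simulated_len(total_hits, start=0, size=10):
    """Number of results one request would return for a from/size window.

    The defaults are assumptions mirroring Elasticsearch's from=0, size=10.
    """
    return max(0, min(size, total_hits - start))


total = 57                                        # what .count() would report
first_page = simulated_len(total)                 # default window
last_page = simulated_len(total, start=50)        # partial final window
past_the_end = simulated_len(total, start=60)     # window beyond the results
```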
- all()
Executes search and returns ALL search results.
Returns: SearchResults instance
For example:
>>> s = S().query(name__prefix='Jimmy')
>>> all_results = s.all()
Warning
This returns ALL search results. The way it does this is by calling .count() first to figure out how many to return, then by slicing by that size and returning a list of ALL search results.
Don’t use this if you’ve got 1000s of results!
- count()
Executes search and returns number of results as an integer.
Returns: integer
For example:
>>> s = S().query(name__prefix='Jimmy')
>>> count = s.count()
- execute()
Executes search and returns a SearchResults object.
Returns: SearchResults instance
For example:
>>> s = S().query(name__prefix='Jimmy')
>>> results = s.execute()
- facet_counts()
Executes search and returns facet counts.
Example:
>>> s = S().query(name__prefix='Jimmy')
>>> facet_counts = s.facet_counts()
Filter objects.
Makes it easier to create filters cumulatively using & (and), | (or) and ~ (not) operations.
For example:
f = F()
f &= F(price='Free')
f |= F(style='Mexican')
creates a filter “price = ‘Free’ or style = ‘Mexican’”.
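Cumulative combination with &, | and ~ can be sketched with operator overloading. MiniF is a hypothetical stand-in, and the term/and/or/not dict shapes are illustrative rather than the exact filters ElasticUtils builds.

```python
class MiniF(object):
    """Hypothetical stand-in for F that builds nested filter dicts."""

    def __init__(self, **kw):
        terms = [{'term': {field: value}} for field, value in sorted(kw.items())]
        if not terms:
            self.filter = None          # no constraint yet
        elif len(terms) == 1:
            self.filter = terms[0]
        else:
            self.filter = {'and': terms}

    def _combine(self, other, op):
        # Combining with an empty filter just yields the other side.
        if self.filter is None:
            return other
        if other.filter is None:
            return self
        combined = MiniF()
        combined.filter = {op: [self.filter, other.filter]}
        return combined

    def __and__(self, other):
        return self._combine(other, 'and')

    def __or__(self, other):
        return self._combine(other, 'or')

    def __invert__(self):
        negated = MiniF()
        negated.filter = {'not': {'filter': self.filter}}
        return negated


f = MiniF()
f &= MiniF(price='Free')
f |= MiniF(style='Mexican')
not_free = ~MiniF(price='Free')
```

Starting from an empty filter lets a loop add conditions without special-casing the first one.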
Query objects.
Makes it easier to create queries cumulatively.
If there’s more than one query part, they’re combined under a BooleanQuery. By default, they’re combined in the must clause.
You can combine two Q classes using the + operator. For example:
q = Q()
q += Q(title__text='shoes')
q += Q(summary__text='shoes')
creates a BooleanQuery with two must clauses.
Example 2:
q = Q()
q += Q(title__text='shoes', should=True)
q += Q(summary__text='shoes')
q += Q(description__text='shoes', must=True)
creates a BooleanQuery with one should clause (title) and two must clauses (summary and description).
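How += could sort query parts into clauses is sketched below. MiniQ is a hypothetical stand-in that only records which clause each part joins; preferring must_not over must when both flags are set is an assumption of the sketch.

```python
class MiniQ(object):
    """Hypothetical stand-in for Q; it only tracks clause membership."""

    def __init__(self, should=False, must=False, must_not=False, **parts):
        self.must, self.should, self.must_not = [], [], []
        # should wins when several flags are set; must is the default
        # when no flag is given.
        if should:
            target = self.should
        elif must_not:
            target = self.must_not
        else:
            target = self.must
        target.extend(sorted(parts.items()))

    def __add__(self, other):
        combined = MiniQ()
        combined.must = self.must + other.must
        combined.should = self.should + other.should
        combined.must_not = self.must_not + other.must_not
        return combined


q = MiniQ()
q += MiniQ(title__text='shoes', should=True)
q += MiniQ(summary__text='shoes')
q += MiniQ(description__text='shoes', must=True)
```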
After executing a search, this is the class that manages the results.
Property type: | the mapping type of the S that created this SearchResults instance |
---|---|
Property took: | the amount of time the search took |
Property count: | the total results |
Property response: | the raw Elasticsearch search response |
Property results: | the search results from the response if any |
Property fields: | the list of fields specified by values_list or values_dict |
When you iterate over this object, it returns the individual search results in the shape you asked for (object, tuple, dict, etc) in the order returned by Elasticsearch.
Example:
s = S().query(bio__text='archaeologist')
results = s.execute()
# Shows how long the search took
print results.took
# Shows the raw Elasticsearch response
print results.response
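To make the properties more concrete, the sketch below pulls the corresponding pieces out of a hand-written response dict. The response shape follows the classic Elasticsearch search response with an integer hits.total; the sample values and the mapping of keys to properties are assumptions for illustration.

```python
# A hand-written stand-in for a raw Elasticsearch search response.
raw_response = {
    'took': 4,
    'timed_out': False,
    'hits': {
        'total': 2,
        'hits': [
            {'_id': '1', '_source': {'name': 'fred'}},
            {'_id': '2', '_source': {'name': 'brian'}},
        ],
    },
}

took = raw_response['took']               # what .took would report
count = raw_response['hits']['total']     # what .count would report
hits = raw_response['hits']['hits']       # the basis for .results
names = [hit['_source']['name'] for hit in hits]
```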
Base class for mapping types.
To extend this class:
For example:
class ContactType(MappingType):
    @classmethod
    def get_index(cls):
        return 'contacts_index'

    @classmethod
    def get_mapping_type_name(cls):
        return 'contact_type'

    @classmethod
    def get_model(cls):
        return ContactModel

    def get_object(self):
        return self.get_model().get(id=self._id)
Returns the model instance
This gets called when someone uses the .object attribute which triggers lazy-loading of the object this document is based on.
By default, this calls:
self.get_model().get(id=self._id)
where self._id is the Elasticsearch document id.
Override it to do something different.
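Lazy-loading through an .object attribute can be sketched with a property. FakeModel and MiniMappingType are hypothetical, and caching the loaded instance after the first access is an assumption of this sketch.

```python
class FakeModel(object):
    """Hypothetical model with a get(id=...) lookup and a call counter."""
    store = {7: 'contact #7'}
    lookups = 0

    @classmethod
    def get(cls, id):
        cls.lookups += 1
        return cls.store[id]


class MiniMappingType(object):
    """Hypothetical stand-in showing a lazily loaded .object attribute."""

    def __init__(self, doc_id):
        self._id = doc_id
        self._object = None

    def get_model(self):
        return FakeModel

    def get_object(self):
        return self.get_model().get(id=self._id)

    @property
    def object(self):
        # Load on first access only; later accesses reuse the instance.
        if self._object is None:
            self._object = self.get_object()
        return self._object


doc = MiniMappingType(7)
lookups_before = FakeModel.lookups
first = doc.object
second = doc.object
```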
Returns the index to use for this mapping type.
You can specify the index to use for this mapping type. This affects S built with this type.
By default, raises NotImplementedError.
Override this to return the index this mapping type should be indexed and searched in.
Returns the mapping type name.
You can specify the mapping type name (also sometimes called the document type) with this method.
By default, raises NotImplementedError.
Override this to return the mapping type name.
Return the model class related to this MappingType.
This can be any class that has an instance related to this MappingType by id.
By default, raises NoModelError.
Override this to return a class that works with .get_object() to return the instance of the model that is related to this document.
Mixin for mapping types with all the indexing hoo-hah.
Add this mixin to your DjangoMappingType subclass and it gives you super indexing power.
Adds or updates a batch of documents.
Note
If you need the documents available for searches immediately, make sure to refresh the index by calling refresh_index().
Extracts the Elasticsearch index document for this instance
This must be implemented.
Note
The resulting dict must be JSON serializable.
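The JSON requirement matters most for values such as datetimes. The sketch below uses a hypothetical record_to_document helper and a made-up record to show one way to keep the extracted dict serializable.

```python
import json
from datetime import datetime


def record_to_document(record):
    """Hypothetical extractor: turn a record into a JSON-friendly dict."""
    return {
        'id': record['id'],
        'name': record['name'],
        # datetime objects are not JSON serializable; store ISO 8601 text
        'created': record['created'].isoformat(),
    }


record = {'id': 1, 'name': 'fred', 'created': datetime(2013, 5, 1, 12, 30)}
document = record_to_document(record)
encoded = json.dumps(document, sort_keys=True)

# The untouched record fails to serialize because of the datetime value.
try:
    json.dumps(record)
    raw_is_serializable = True
except TypeError:
    raw_is_serializable = False
```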
Returns: | dict of key/value pairs representing the document |
---|
Returns an ElasticSearch object
Override this if you need special functionality.
Returns: | a pyelasticsearch ElasticSearch instance |
---|
Returns an iterable of things to index.
Returns: | iterable of things to index |
---|
Returns the mapping for this mapping type.
Example:
@classmethod
def get_mapping(cls):
    return {
        'properties': {
            'id': {'type': 'integer'},
            'name': {'type': 'string'}
        }
    }
See the docs for more details on how to specify a mapping.
Override this to return a mapping for this doctype.
Returns: | dict representing the Elasticsearch mapping, or None if you want Elasticsearch to infer it. Defaults to None. |
---|
Adds a document to the index or updates an existing one.
Note
If you need the documents available for searches immediately, make sure to refresh the index by calling refresh_index().
Refreshes the index.
Elasticsearch will update the index periodically automatically. If you need to see the documents you just indexed in your search results right now, you should call refresh_index as soon as you’re done indexing. This is particularly helpful for unit tests.
Removes a particular item from the search index.
This is the default mapping type for S.
Represents a lazy Elasticsearch More Like This API request.
This is lazy in the sense that it doesn’t evaluate and execute the Elasticsearch request unless you force it to by iterating over it or getting the length of the search results.
For example:
>>> mlt = MLT(2034, index='addons_index', doctype='addon')
>>> num_related_documents = len(mlt)
>>> related_documents = list(mlt)
When the MLT is evaluated, it generates a list of dict results.
Note
You must specify either an s or the index and doctype arguments. Omitting them will result in a ValueError.
Converts strings in a data structure to Python types
It converts datetime-ish things to Python datetimes.
Override if you want something different.
Parameters: | obj – Python datastructure |
---|---|
Returns: | Python datastructure with strings converted to Python types |
Note
This does the conversion in-place!
Returns an ElasticSearch.
Override this if that behavior isn’t correct for you.
Builds the query, passes it to ElasticSearch, and returns the raw response.