Indexes
Indexes allow for the organization and retrieval of documents by attributes other than their Reference. To query a collection using search and/or sort criteria, you must first create an index to support your query.
Indexes act as a lookup table that improves the performance of finding documents: instead of reading every single document to find the ones that you are interested in, you query an index to find those documents. Fauna uses log-structured merge-tree indexes, unlike most SQL systems, which use B-tree indexes.
The Index tutorials are a great place to get started learning about index usage and creation.
See Limits for details on concurrent index builds and transaction limits. See the CreateIndex reference page for limitations on what an index may be named.
Source
When you create an index, you specify its source, which is one or more collections of documents. Once the index is active, queries that create, update, or delete a document in the source collections cause the index to be updated.
Terms
An index can define terms: these are zero or more term objects that define which document fields to use for searching. The terms definition is comparable to column=value predicates in an SQL WHERE clause. For example, if your documents have a name field, you can define terms to include that field, and then you can find all of the documents that match a name.
Only scalar Values are indexed. When a term targets a document field or index binding result that has an array, one index entry per array item is created, which makes it easy to search for an array item. Objects are not indexed. As a result, it is not possible to search for arrays or objects.
Be aware that when the terms definition includes multiple array fields, the number of index entries created is the product of the array lengths (one entry for each element of the Cartesian product of the arrays). For example, when an index terms definition specifies two fields that are arrays, and a document is created including one array with 5 items and the second array with 11 items, 55 index entries are created. Index write operations are grouped together, so the billing impact depends on the overall size of the index entries.
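The entry count described above can be sketched in plain Python. This is only an illustrative model, not Fauna's implementation: the function name term_entries and the field names tags and sizes are hypothetical, and documents are represented as ordinary dictionaries.

```python
from itertools import product

def term_entries(document, term_fields):
    """Model the index entries one document yields for the given term fields.

    Scalars count as a single item; arrays contribute one item per element.
    Illustrative model only, not Fauna's implementation.
    """
    per_field = []
    for name in term_fields:
        value = document.get(name)
        per_field.append(value if isinstance(value, list) else [value])
    # One entry per combination of items across all term fields.
    return list(product(*per_field))

doc = {
    "tags": [f"tag{i}" for i in range(5)],  # hypothetical array field with 5 items
    "sizes": list(range(11)),               # hypothetical array field with 11 items
}
entries = term_entries(doc, ["tags", "sizes"])
print(len(entries))  # 5 * 11 = 55
```

Adding a third array field to the terms would multiply the count again, which is why array-heavy terms definitions deserve some care.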
When an index has one or more terms, the index is partitioned by the terms, allowing Fauna to efficiently scale indexes.
When a document is indexed, and all of the index’s defined terms evaluate to null, no index entry is stored for the document.
Values
An index can define values: these are zero or more scalar Values returned for each index entry that matches the terms when you query the index. The values definition is comparable to the SQL SELECT clause.
The values definition also determines how indexes are sorted: each field value in values is sorted lexically according to the field type, and the order can be inverted by setting reverse: true.
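As a rough illustration of multi-field ordering with a reversed field, the following Python sketch sorts entry tuples field by field. It is a simplified model under stated assumptions: every value within a field has the same type, Fauna's cross-type ordering rules are not modeled, and the name sort_entries and the sample data are hypothetical.

```python
def sort_entries(entries, reverse_flags):
    """Sort index-entry tuples field by field.

    reverse_flags holds one boolean per value field; True models a field
    declared with reverse: true. Illustrative model only.
    """
    ordered = list(entries)
    # Python's sort is stable, so sorting by the last field first and the
    # first field last produces a multi-field ordering.
    for position in reversed(range(len(reverse_flags))):
        ordered.sort(key=lambda entry: entry[position],
                     reverse=reverse_flags[position])
    return ordered

entries = [("fire", 3), ("air", 7), ("fire", 9), ("earth", 1)]
# First field ascending, second field reversed.
result = sort_entries(entries, [False, True])
print(result)
# [('air', 7), ('earth', 1), ('fire', 9), ('fire', 3)]
```

Entries that tie on the first field ("fire") are ordered by the second field in descending order, which is the effect the reverse setting has on that field alone.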
Each index entry records the Reference of each document involved in the index. When there is no values definition, the index returns the Reference for each matching index entry. When values is defined, only the defined field values are returned.
When a document is indexed, and all of the index’s defined values evaluate to null, no index entry is stored for the document.
Values must refer to fields with scalar Values. Objects are not indexed, so when a values definition points to a document field or index binding result that has an Object, the index entry stores null because Objects cannot be sorted. When a values definition points to a document field or index binding result that has an Array, one index entry per array item is created.
Collection index
An index with neither terms nor values defined is known as a collection index: searching for documents is not possible, and all documents in the collection are included in the result set, sorted by their reference in ascending order.
Unique
You can use unique: true in an index definition. When you do, the index has only one entry for a document with the defined terms and values. Subsequently, creating or updating a document so that it has the same terms and values as an existing document causes an error.
Avoid creating a unique index that does not define terms. If you do create a "term-less" index, the index could cause performance issues. Every time a covered document is created or updated, the index (and its history) needs to be evaluated to decide whether the document is unique or not. As the index grows larger, the evaluation for uniqueness can cause your queries involving writes to exceed the query timeout.
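The uniqueness rule can be pictured with a toy in-memory model. This is a sketch under stated assumptions, not how Fauna stores or checks entries: the class UniqueIndexModel, its write method, and the sample email term are all hypothetical, and index history is not modeled.

```python
class UniqueIndexModel:
    """Toy model of a unique index: one entry per (terms, values) combination."""

    def __init__(self):
        self._entries = {}  # (terms, values) -> document id

    def write(self, doc_id, terms, values):
        key = (tuple(terms), tuple(values))
        owner = self._entries.get(key)
        if owner is not None and owner != doc_id:
            raise ValueError(f"document is not unique: {key} already indexed")
        # Drop any previous entry for this document, then record the new one.
        self._entries = {k: v for k, v in self._entries.items() if v != doc_id}
        self._entries[key] = doc_id

index = UniqueIndexModel()
index.write("1", terms=["alice@example.com"], values=[])
index.write("1", terms=["alice@example.com"], values=[])  # same document: allowed
try:
    index.write("2", terms=["alice@example.com"], values=[])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```

In this model the check is a single dictionary lookup because the entry is keyed by its terms; with no terms, every entry would share one key space, which loosely mirrors why term-less unique indexes become expensive as they grow.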
Bindings
Fauna indexes can also specify bindings, which are Lambda functions that you can use to compute fields in the terms or values.
For example, if your documents store a timestamp, and you want to search for documents by year, you could write a binding that converts the timestamp to a year and include the computed year as one of the terms.
Similarly, if you want to report the month of a document, you could write a binding that converts the timestamp to a month, and include the computed month as one of the values.
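The year and month examples can be sketched as two pure Python functions of a document. This only models the idea of a binding; it is not FQL, and the field name created_at, the function names, and the sample document are hypothetical.

```python
from datetime import datetime, timezone

def year_binding(document):
    """Pure function of the document: derive the year from its timestamp."""
    return document["data"]["created_at"].year

def month_binding(document):
    """Pure function of the document: derive the month from its timestamp."""
    return document["data"]["created_at"].month

doc = {
    "data": {
        "title": "Fireball",
        "created_at": datetime(2021, 6, 21, tzinfo=timezone.utc),  # hypothetical field
    }
}

# The computed year acts as a term, the computed month as a value.
entry = {"terms": [year_binding(doc)], "values": [month_binding(doc)]}
print(entry)  # {'terms': [2021], 'values': [6]}
```

Both functions depend only on the document passed in, which matches the requirement that bindings be free of reads, writes, and other side effects.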
For background, a Set is a sorted group of immutable data from a collection. An Index is a group of sets in a collection. Indexes are defined as documents in the system indexes collection.
Index details
Example index
The simplest index is called a "collection" index. A collection index has no terms or values defined. This means that the index includes all documents with no search terms, and that the index returns the Reference to each indexed document. Such an index can be created with only a name and a source collection. The outputs below show the resulting index document as reported by several drivers:
ObjectV(ref: RefV(id = "new-index", collection = RefV(id = "indexes")),ts: LongV(1603756181200000),active: BooleanV(True),serialized: BooleanV(True),name: StringV(new-index),source: RefV(id = "spells", collection = RefV(id = "collections")),partitions: LongV(8))
map[active:true name:new-index partitions:8 ref:{new-index 0x1400011d950 0x1400011d950 <nil>} serialized:true source:{spells 0x1400011da40 0x1400011da40 <nil>} ts:1657754502810000]
{ref: ref(id = "new-index", collection = ref(id = "indexes")), ts: 1593464646510000, active: true, serialized: true, name: "new-index", source: ref(id = "spells", collection = ref(id = "collections")), partitions: 8}
{
ref: Index("new-index"),
ts: 1591996190530000,
active: true,
serialized: true,
name: 'new-index',
source: Collection("spells"),
partitions: 8
}
{'ref': Ref(id=new-index, collection=Ref(id=indexes)), 'ts': 1592856422060000, 'active': True, 'serialized': True, 'name': 'new-index', 'source': Ref(id=spells, collection=Ref(id=collections)), 'partitions': 8}
{
ref: Index("new-index"),
ts: 1624310362210000,
active: true,
serialized: true,
name: 'new-index',
source: Collection("spells"),
partitions: 8
}
Index fields
Field Name | Field Type | Definition and Requirements |
---|---|---|
name | String | The logical name of the index. Naming limitations are listed on the CreateIndex reference page. |
source | Reference or Array | A Collection reference, or an array of one or more source objects describing source collections and (optional) binding fields. |
terms | Array | Optional - An array of Term objects describing the fields that should be searchable. Indexed terms can be used to search for field values using the Match function. |
values | Array | Optional - An array of Value objects describing the fields that should be reported in search results. The default is an empty Array. When there is no values definition, the index returns the Reference for each matching index entry. |
unique | Boolean | Optional - If true, the index has only one entry for a document with the defined terms and values. |
serialized | Boolean | Optional - If true, writes to the index are serialized with concurrent reads and writes. The default is true. See Isolation levels for details. |
permissions | Object | Optional - Indicates who is allowed to read the index. The default is everyone can read the index. |
data | Object | Optional - This is user-defined metadata for the index. It is provided for the developer to store information at the index level. The default is an empty object with no data. |
The maximum size of an index entry, which consists of the terms and values content (and some overhead to distinguish multiple fields), must not exceed 64k bytes. If an index entry is too large, the query that created or updated the index entry fails.
Source objects
Source objects describe the source collection of index entries and, optionally, bindings. A binding must be a pure Lambda function (it must not create side effects, such as reads or writes) that emits values to be used as a term and/or value.
An index cannot be created in the same transaction that creates its source collections.
The collection field can be a single collection reference or an array of collection references. For documents in collections that match the collection field, the associated bindings are applied, and their results can be used in the index terms or values definitions. A collection reference can only exist in one source object.
Field | Type | Definition and Requirements |
---|---|---|
collection | Reference or Array | The collection or collections to be indexed. |
fields | Object | An object mapping a binding name to a binding function (see Binding objects). |
The following example demonstrates the structure of a source object, which includes an example binding object:
ObjectV(source: ObjectV(collection: RefV(id = "collection", collection = RefV(id = "collections")),fields: ObjectV(binding1: QueryV(System.Collections.Generic.Dictionary`2[System.String,FaunaDB.Query.Expr]))))
map[source:map[collection:{collection 0xc00007fe30 0xc00007fe30 <nil>} fields:map[binding1:{[123 34 97 112 105 95 118 101 114 115 105 111 110 34 58 34 52 34 44 34 108 97 109 98 100 97 34 58 34 100 111 99 117 109 101 110 116 34 44 34 101 120 112 114 34 58 123 34 115 101 108 101 99 116 34 58 91 34 100 97 116 97 34 44 34 102 105 101 108 100 34 93 44 34 102 114 111 109 34 58 123 34 118 97 114 34 58 34 100 111 99 117 109 101 110 116 34 125 125 125]}]]]
{source: {collection: ref(id = "collection", collection = ref(id = "collections")), fields: {binding1: QueryV({api_version=4, lambda=document, expr={select=[data, field], from={var=document}}})}}}
{
source: {
collection: Collection("collection"),
fields: {
binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
}
}
}
{'source': {'collection': Ref(id=collection, collection=Ref(id=collections)), 'fields': {'binding1': Query({'api_version': '4', 'lambda': 'document', 'expr': {'select': ['data', 'field'], 'from': {'var': 'document'}}})}}}
{
source: {
collection: Collection("collection"),
fields: {
binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
}
}
}
Binding objects
A binding object binds a field name to a pure, single-argument Lambda function. The function must take the document to be indexed and emit a single scalar value or an array of scalar values. Because it must be pure, a binding function is not permitted to perform reads or writes, or to create any other side effects.
Once defined, bindings can be used in the index terms or values definitions as if they were document fields.
Functions that cannot be used in bindings include:
ObjectV(binding1: QueryV(System.Collections.Generic.Dictionary`2[System.String,FaunaDB.Query.Expr]))
map[binding1:{[123 34 97 112 105 95 118 101 114 115 105 111 110 34 58 34 52 34 44 34 108 97 109 98 100 97 34 58 34 100 111 99 117 109 101 110 116 34 44 34 101 120 112 114 34 58 123 34 115 101 108 101 99 116 34 58 91 34 100 97 116 97 34 44 34 102 105 101 108 100 34 93 44 34 102 114 111 109 34 58 123 34 118 97 114 34 58 34 100 111 99 117 109 101 110 116 34 125 125 125]}]
{binding1: QueryV({api_version=4, lambda=document, expr={select=[data, field], from={var=document}}})}
{ binding1:
Query(Lambda("document", Select(["data", "field"], Var("document")))) }
{'binding1': Query({'api_version': '4', 'lambda': 'document', 'expr': {'select': ['data', 'field'], 'from': {'var': 'document'}}})}
{
binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
}
Term objects
Term objects describe the fields whose values are used to search for entries in the index.
When a terms field is missing from an indexed document, the field value in the index is null. If all defined terms fields evaluate to null, no index entry is stored for the document.
If no term objects are defined, passing term values to Match is not required. The resulting set has all documents in the source collection.
A value can be from a field in the document or a binding defined by the source object.
Terms must refer to fields with scalar Values. When a term points to a document field or index binding result that has an array, one index entry per array item is created. Objects are not indexed.
Be aware that when the terms definition includes multiple array fields, the number of index entries created is the product of the array lengths. For example, when an index terms definition specifies two fields that are arrays, and a document is created including one array with 5 items and the second array with 11 items, 55 index entries are created. Index write operations are grouped together, so the billing impact depends on the overall size of the index entries.
Field | Type | Definition |
---|---|---|
field | Array or String | The field name path or field name in the document to be indexed. The path targets a field value. For example, the path ["data", "address", "street"] targets the street field in the address object within the document's data object. |
binding | String | The name of a binding from a source object. |
The following example demonstrates an index terms definition with two term objects: the first specifies a binding, and the second specifies a document field:
ObjectV(terms: Arr(ObjectV(binding: StringV(binding1)), ObjectV(field: Arr(StringV(data), StringV(field)))))
map[terms:[map[binding:binding1] map[field:[data field]]]]
{terms: [{binding: "binding1"}, {field: ["data", "field"]}]}
{
terms: [ { binding: 'binding1' }, { field: [ 'data', 'field' ] } ]
}
{'terms': [{'binding': 'binding1'}, {'field': ['data', 'field']}]}
{
terms: [ { binding: 'binding1' }, { field: [ 'data', 'field' ] } ]
}
Value objects
Value objects describe the fields whose values should be used to sort the index, and whose values should be reported in query results. By default, indexes have no values defined, and return the References of indexed documents.
When a values field is missing from an indexed document, the field value in the index is null. If all defined values fields evaluate to null, no index entry is stored for the document.
A value can be from a field in the document, or a binding function defined in a source object.
Values must refer to fields with scalar Values. Objects are not indexed, so when a values definition points to a document field or index binding result that has an Object, the index entry stores null because Objects cannot be sorted. When a values definition points to a document field or index binding result with an Array, one index entry per array item is created.
Field | Type | Definition |
---|---|---|
field | Array or String | The field name path or field name in the document to be indexed. The path targets a field value, such as ["data", "address", "street"]. |
binding | String | The name of a binding from a source object. |
reverse | Boolean | Whether this field value should sort reversed. Defaults to false. |
The document reference is included in before and after cursors when paging through an index with the Paginate function, even if the reference is not included as a values field. Pagination uses the covered document reference stored in each index entry to stabilize pagination.
All document fields may be indexed. The value of field in a Term or Value object indicates the position in a document for a field. For example, the field ref refers to the top-level ref field. The field ["data", "address", "street"] refers to the street field contained in an address object in the data object of the document.
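Field path resolution can be sketched in Python as a walk through nested dictionaries. This is an illustrative model only: the helper name select_path and the sample document are hypothetical, and a missing field is modeled as None to mirror the null that the index records for missing fields.

```python
def select_path(document, path):
    """Follow a field path into a nested document; return None if it is missing."""
    keys = [path] if isinstance(path, str) else path
    current = document
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

doc = {
    "ref": "people/1",  # stand-in for a document Reference
    "data": {"address": {"street": "72 Waxwing Terrace"}},
}
print(select_path(doc, "ref"))                          # people/1
print(select_path(doc, ["data", "address", "street"]))  # 72 Waxwing Terrace
print(select_path(doc, ["data", "phone"]))              # None
```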
The following example demonstrates an index values definition with two value objects: the first specifies a binding, and the second specifies a document field that should be sorted in reverse:
ObjectV(values: Arr(ObjectV(binding: StringV(binding1)), ObjectV(field: Arr(StringV(data), StringV(field)),reverse: BooleanV(True))))
map[values:[map[binding:binding1] map[field:[data field] reverse:true]]]
{values: [{binding: "binding1"}, {field: ["data", "field"], reverse: true}]}
{
values: [
{ binding: 'binding1' },
{ field: [ 'data', 'field' ], reverse: true }
]
}
{'values': [{'binding': 'binding1'}, {'field': ['data', 'field'], 'reverse': True}]}
{
values: [
{ binding: 'binding1' },
{ field: [ 'data', 'field' ], reverse: true }
]
}