Indexes

Indexes allow for the organization and retrieval of documents by attributes other than their Reference. To query a collection using search and/or sort criteria, you must first create an index to support your query.

Indexes act as a lookup table that improves the performance of finding documents: instead of reading every single document to find the ones that you are interested in, you query an index to find those documents. Fauna uses log-structured merge-tree indexes, unlike most SQL systems, which use B-tree indexes.

The Index tutorials are a great place to get started learning about index usage and creation.

See Limits for details on concurrent index builds and transaction limits. See the CreateIndex reference page for limitations on what an index may be named.

Source

When you create an index, you specify its source, which is one or more collections of documents. Once the index is active, queries that create, update, or delete a document in the source collections cause the index to be updated.

Terms

An index can define terms: these are zero or more term objects that define which document fields to use for searching.

terms are comparable to column=value predicates in an SQL WHERE clause. For example, if your documents have a name field, you can define terms to include that field, and then you can find all of the documents that match a name.
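As an illustration of the lookup-table idea, the following is a minimal Python sketch of an index with a single term on a name field. It is a toy model, not how Fauna stores indexes; the documents, the ref strings, and the `build_term_index` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical documents; "ref" stands in for a document Reference.
documents = [
    {"ref": "spells/1", "data": {"name": "Fire Beak"}},
    {"ref": "spells/2", "data": {"name": "Water Dragon's Claw"}},
    {"ref": "spells/3", "data": {"name": "Fire Beak"}},
]


def build_term_index(docs, field):
    """Map each value of data[field] to the refs of the documents that have it."""
    index = defaultdict(list)
    for doc in docs:
        value = doc["data"].get(field)
        if value is not None:  # with a single term, a null term means no entry
            index[value].append(doc["ref"])
    return index


by_name = build_term_index(documents, "name")

# Comparable to: SELECT ref FROM spells WHERE name = 'Fire Beak'
print(by_name["Fire Beak"])  # ['spells/1', 'spells/3']
```

The lookup reads one dictionary key instead of scanning every document, which is the benefit the terms definition provides.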

Only scalar Values are indexed. When a term targets a document field or index binding result that has an array, one index entry per array item is created, which makes it easy to search for an array item. Objects are not indexed. As a result, it is not possible to search for arrays or objects.

Be aware that when the terms definition includes multiple array fields, the number of index entries created is the Cartesian product of the number of array items. For example, when an index terms definition specifies two fields that are arrays, and a document is created including one array with 5 items and the second array with 11 items, 55 index entries are created. Index write operations are grouped together, so the billing impact depends on the overall size of the index entries.
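The arithmetic in the example above can be checked with a short Python sketch. The field names `tags` and `regions` are hypothetical; only the item counts (5 and 11) come from the text.

```python
from itertools import product

# Hypothetical document with two array fields, both named in the terms definition.
doc = {
    "tags": [f"tag{i}" for i in range(5)],          # 5 items
    "regions": [f"region{i}" for i in range(11)],   # 11 items
}

# One index entry per combination of array items (the Cartesian product).
entries = list(product(doc["tags"], doc["regions"]))
print(len(entries))  # 55
```

Adding a third array field to the terms would multiply the count again, so entry counts can grow quickly.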

When an index has one or more terms, the index is partitioned by the terms, allowing Fauna to efficiently scale indexes.

When a document is indexed, and all of the index’s defined terms evaluate to null, no index entry is stored for the document.

Values

An index can define values: these are zero or more scalar Values returned for each index entry that matches the terms when you query the index. values are comparable to the SQL SELECT clause.

values are also how indexes are sorted: each field value in values is sorted lexically according to the field type, and the order can be inverted by setting reverse: true.
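The effect of a per-field reverse setting can be sketched in plain Python. This is an illustration of the ordering only, not Fauna's implementation; the entries and the `position` key used to pick a tuple element are hypothetical.

```python
# Hypothetical index entries: (name, level) tuples, one per document.
entries = [("Luna", 3), ("Alice", 7), ("Luna", 9), ("Alice", 2)]

# Sketch of a values definition: name ascending, level with reverse: true.
values = [
    {"position": 0, "reverse": False},
    {"position": 1, "reverse": True},
]


def sort_entries(rows, spec):
    """Sort by each values field in order, honoring the per-field reverse flag."""
    ordered = list(rows)
    # Python's sort is stable, so sorting by the last field first and the
    # first field last produces a multi-field ordering.
    for field in reversed(spec):
        ordered.sort(key=lambda row: row[field["position"]], reverse=field["reverse"])
    return ordered


sorted_entries = sort_entries(entries, values)
print(sorted_entries)
# [('Alice', 7), ('Alice', 2), ('Luna', 9), ('Luna', 3)]
```

Names sort in ascending order, and within each name the levels sort in descending order because only the second field is reversed.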

Each index entry records the Reference of each document involved in the index. When there is no values definition, the index returns the Reference for each matching index entry. When values is defined, only the defined field values are returned.

When a document is indexed, and all of the index’s defined values evaluate to null, no index entry is stored for the document.

Values must refer to fields with scalar Values. Objects are not indexed, so when a values definition points to a document field or index binding result that has an Object, the index entry stores null because Objects cannot be sorted. When a values definition points to a document field or index binding result that has an Array, one index entry per array item is created.

Collection index

An index with no terms or values defined is known as a collection index. Searching for specific documents is not possible: all documents in the collection are included in the result set, sorted by their Reference in ascending order.

Unique

You can specify unique: true in an index definition. When you do, the index holds only one entry for any given combination of the defined terms and values. Creating or updating a document so that it has the same terms and values as an existing document then causes an error.
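The uniqueness rule can be modeled with a small Python sketch. The `UniqueIndexSketch` class, the refs, and the email address are hypothetical, and the model ignores index history and entry removal; it only shows that a second document with the same terms and values is rejected.

```python
class UniqueIndexSketch:
    """Toy model of unique: true; rejects duplicate (terms, values) tuples."""

    def __init__(self):
        self._entries = {}  # (terms, values) -> ref of the owning document

    def write(self, ref, terms, values):
        key = (tuple(terms), tuple(values))
        owner = self._entries.get(key)
        if owner is not None and owner != ref:
            raise ValueError(f"document is not unique: {key} already used by {owner}")
        self._entries[key] = ref


index = UniqueIndexSketch()
index.write("users/1", terms=["alice@example.com"], values=[])

try:
    # A different document with the same terms and values is rejected.
    index.write("users/2", terms=["alice@example.com"], values=[])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)  # True
```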

Avoid creating a unique index that does not define terms.

If you do create a "term-less" index, the index could cause performance issues. Every time a covered document is created or updated, the index (and its history) needs to be evaluated to decide whether the document is unique or not. As the index grows larger, the evaluation for uniqueness can cause your queries involving writes to exceed the query timeout.

Bindings

Fauna indexes can also specify bindings, which are Lambda functions that you can use to compute fields in the terms or values. For example, if your documents store a timestamp, and you want to search for documents by year, you could write a binding that converts the timestamp to a year and include the computed year as one of the terms. Similarly, if you want to report the month of a document, you could write a binding that converts the timestamp to a month, and include the computed month as one of the values.
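The year and month example can be sketched in plain Python, with ordinary functions standing in for Lambda bindings. The document, its `created` field, and the binding names are hypothetical; a real binding would be written in FQL.

```python
from datetime import datetime, timezone

# Hypothetical document; "created" is an assumed field holding a timestamp.
document = {
    "ref": "events/1",
    "data": {"created": datetime(2020, 10, 26, 23, 49, tzinfo=timezone.utc)},
}

# Each binding is a pure, single-argument function of the document.
bindings = {
    "year": lambda doc: doc["data"]["created"].year,
    "month": lambda doc: doc["data"]["created"].month,
}

# Computed fields that terms or values could then refer to by binding name.
computed = {name: fn(document) for name, fn in bindings.items()}
print(computed)  # {'year': 2020, 'month': 10}
```

In this sketch, `year` would be listed in terms to search by year, and `month` would be listed in values to report the month.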

For background, a Set is a sorted group of immutable data from a collection. An Index is a group of sets in a collection. Indexes are defined as documents in the system indexes collection.

Index details

Example index

The simplest index is called a "collection" index. A collection index has no terms or values defined. This means that the index includes all documents with no search terms, and that the index returns the Reference to each indexed document. Such an index can be created with a name and a source collection:

try
{
    Value result = await client.Query(
        CreateIndex(
            Obj(
                "name", "new-index",
                "source", Collection("spells")
            )
        )
    );
    Console.WriteLine(result);
}
catch (Exception e)
{
    Console.WriteLine($"ERROR: {e.Message}");
}
ObjectV(ref: RefV(id = "new-index", collection = RefV(id = "indexes")),ts: LongV(1603756181200000),active: BooleanV(True),serialized: BooleanV(True),name: StringV(new-index),source: RefV(id = "spells", collection = RefV(id = "collections")),partitions: LongV(8))
result, err := client.Query(
	f.CreateIndex(
		f.Obj{
			"name":   "new-index",
			"source": f.Collection("spells"),
		},
	))

if err != nil {
	fmt.Fprintln(os.Stderr, err)
} else {
	fmt.Println(result)
}
map[active:true name:new-index partitions:8 ref:{new-index 0x1400011d950 0x1400011d950 <nil>} serialized:true source:{spells 0x1400011da40 0x1400011da40 <nil>} ts:1657754502810000]
System.out.println(
    client.query(
        CreateIndex(
            Obj(
                "name", Value("new-index"),
                "source", Collection("spells")
            )
        )
    ).get());
{ref: ref(id = "new-index", collection = ref(id = "indexes")), ts: 1593464646510000, active: true, serialized: true, name: "new-index", source: ref(id = "spells", collection = ref(id = "collections")), partitions: 8}
client.query(
  q.CreateIndex({
    name: 'new-index',
    source: q.Collection('spells'),
  })
)
.then((ret) => console.log(ret))
.catch((err) => console.error(
  'Error: [%s] %s: %s',
  err.name,
  err.message,
  err.errors()[0].description,
))
{
  ref: Index("new-index"),
  ts: 1591996190530000,
  active: true,
  serialized: true,
  name: 'new-index',
  source: Collection("spells"),
  partitions: 8
}
result = client.query(
  q.create_index({
    "name": "new-index",
    "source": q.collection("spells")
  })
)
print(result)
{'ref': Ref(id=new-index, collection=Ref(id=indexes)), 'ts': 1592856422060000, 'active': True, 'serialized': True, 'name': 'new-index', 'source': Ref(id=spells, collection=Ref(id=collections)), 'partitions': 8}
CreateIndex({
  name: 'new-index',
  source: Collection('spells'),
})
{
  ref: Index("new-index"),
  ts: 1624310362210000,
  active: true,
  serialized: true,
  name: 'new-index',
  source: Collection("spells"),
  partitions: 8
}
Query metrics:
  •    bytesIn:    81

  •   bytesOut:   252

  • computeOps:     1

  •    readOps:     0

  •   writeOps:     5

  •  readBytes: 1,825

  • writeBytes:   855

  •  queryTime:   9ms

  •    retries:     0

Index fields

Field Name Field Type Definition and Requirements

name

The logical name of the index.

Cannot be events, sets, self, documents, or _. Cannot have the % character.

source

A Collection reference, or an array of one or more source objects describing source collections and (optional) binding fields.

The ts field can be used as a term or value for an index but should not be used in a binding because it is not known at the time index bindings are computed.

terms

Optional - An array of Term objects describing the fields that should be searchable. Indexed terms can be used to search for field values using the Match function. The default is an empty Array.

values

Optional - An array of Value objects describing the fields that should be reported in search results. The default is an empty Array. When there is no values definition, search results include the Reference of each matching document.

unique

Optional - If true, maintains a unique constraint on combined terms and values. The default is false.

serialized

Optional - If true, writes to this index are serialized with concurrent reads and writes. Subsequent reads or writes must wait until the writes for this index have completed.

If false, writes to this index occur asynchronously and do not block subsequent reads or writes. This is better performance-wise, but the lack of serialization makes it notably harder to use this index to read your own writes.

The default is true.

See Isolation levels for details.

permissions

Optional - Indicates who is allowed to read the index. By default, everyone can read the index.

data

Optional - This is user-defined metadata for the index. It is provided for the developer to store information at the index level. The default is an empty object with no data.

The maximum size of an index entry, which is composed of the terms and values content (and some overhead to distinguish multiple fields), must not exceed 64k bytes. If an index entry is too large, the query that created or updated the index entry fails.

Source objects

Source objects describe the source collection of index entries and, optionally, bindings. A binding must be a pure Lambda function (it must not create side effects, such as reads or writes) that emits values to be used as a term and/or value.

An index cannot be created in the same transaction that creates its source collections.

The collection field can be a single collection reference or an array of collection references. Documents in collections matching the collection field apply the associated bindings to be used in the index terms or values definitions. A collection reference can only exist in one source object.

Field Type Definition and Requirements

collection

Collection Reference, or Array of collection references

The collection or collections to be indexed.

fields

An object mapping a binding name to a Lambda function.

The following example demonstrates the structure of a source object, which includes an example binding object:

try
{
    Value result = await client.Query(
        Obj(
            "source", Obj(
                "collection", Collection("collection"),
                "fields", Obj(
                    "binding1", Query(
                        Lambda(
                            "document",
                            Select(Arr("data", "field"), Var("document"))
                        )
                    )
                )
            )
        )
    );
    Console.WriteLine(result);
}
catch (Exception e)
{
    Console.WriteLine($"ERROR: {e.Message}");
}
ObjectV(source: ObjectV(collection: RefV(id = "collection", collection = RefV(id = "collections")),fields: ObjectV(binding1: QueryV(System.Collections.Generic.Dictionary`2[System.String,FaunaDB.Query.Expr]))))
result, err := client.Query(
	f.Obj{
		"source": f.Obj{
			"collection": f.Collection("collection"),
			"fields": f.Obj{
				"binding1": f.Query(
					f.Lambda(
						"document",
						f.Select(f.Arr{"data", "field"}, f.Var("document"))))}}})

if err != nil {
	fmt.Fprintln(os.Stderr, err)
} else {
	fmt.Println(result)
}
map[source:map[collection:{collection 0xc00007fe30 0xc00007fe30 <nil>} fields:map[binding1:{[123 34 97 112 105 95 118 101 114 115 105 111 110 34 58 34 52 34 44 34 108 97 109 98 100 97 34 58 34 100 111 99 117 109 101 110 116 34 44 34 101 120 112 114 34 58 123 34 115 101 108 101 99 116 34 58 91 34 100 97 116 97 34 44 34 102 105 101 108 100 34 93 44 34 102 114 111 109 34 58 123 34 118 97 114 34 58 34 100 111 99 117 109 101 110 116 34 125 125 125]}]]]
System.out.println(
    client.query(
        Obj(
            "source", Obj(
                "collection", Collection("collection"),
                "fields", Obj(
                    "binding1", Query(
                        Lambda(
                            "document",
                            Select(
                                Arr(Value("data"), Value("field")),
                                Var("document")
                            )
                        )
                    )
                )
            )
        )
    ).get());
{source: {collection: ref(id = "collection", collection = ref(id = "collections")), fields: {binding1: QueryV({api_version=4, lambda=document, expr={select=[data, field], from={var=document}}})}}}
client.query({
  source: {
    collection: q.Collection('collection'),
    fields: {
      binding1: q.Query(
        q.Lambda(
          'document',
          q.Select(['data', 'field'], q.Var('document'))
        )
      ),
    },
  },
})
.then((ret) => console.log(ret))
.catch((err) => console.error(
  'Error: [%s] %s: %s',
  err.name,
  err.message,
  err.errors()[0].description,
))
{
  source: {
    collection: Collection("collection"),
    fields: {
      binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
    }
  }
}
result = client.query(
  {
    "source": {
      "collection": q.collection("collection"),
      "fields": {
        "binding1": q.query(
          q.lambda_(
            "document",
            q.select(["data", "field"], q.var("document"))
          )
        )
      }
    }
  }
)
print(result)
{'source': {'collection': Ref(id=collection, collection=Ref(id=collections)), 'fields': {'binding1': Query({'api_version': '4', 'lambda': 'document', 'expr': {'select': ['data', 'field'], 'from': {'var': 'document'}}})}}}
{
  source: {
    collection: Collection("collection"),
    fields: {
      binding1: Query(
        Lambda(
          "document",
          Select(["data", "field"], Var("document")),
        )
      ),
    },
  },
}
{
  source: {
    collection: Collection("collection"),
    fields: {
      binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
    }
  }
}
Query metrics:
  •    bytesIn: 201

  •   bytesOut: 244

  • computeOps:   1

  •    readOps:   0

  •   writeOps:   0

  •  readBytes:   0

  • writeBytes:   0

  •  queryTime: 7ms

  •    retries:   0

Binding objects

A binding object binds a field name to a pure, single-argument Lambda function (it must not create side effects, such as reads or writes). The function must take the document to be indexed and emit a single scalar value or an array of scalar values.

Once defined, bindings can be used in the index terms or values definitions as if they were document fields.

Functions that perform reads or writes cannot be used in bindings.

The following example shows the structure of a binding object:

try
{
    Value result = await client.Query(
        Obj(
            "binding1", Query(
                Lambda(
                    "document",
                    Select(Arr("data", "field"), Var("document"))
                )
            )
        )
    );
    Console.WriteLine(result);
}
catch (Exception e)
{
    Console.WriteLine($"ERROR: {e.Message}");
}
ObjectV(binding1: QueryV(System.Collections.Generic.Dictionary`2[System.String,FaunaDB.Query.Expr]))
result, err := client.Query(
	f.Obj{
		"binding1": f.Query(
			f.Lambda(
				"document",
				f.Select(f.Arr{"data", "field"}, f.Var("document")),
			))})

if err != nil {
	fmt.Fprintln(os.Stderr, err)
} else {
	fmt.Println(result)
}
map[binding1:{[123 34 97 112 105 95 118 101 114 115 105 111 110 34 58 34 52 34 44 34 108 97 109 98 100 97 34 58 34 100 111 99 117 109 101 110 116 34 44 34 101 120 112 114 34 58 123 34 115 101 108 101 99 116 34 58 91 34 100 97 116 97 34 44 34 102 105 101 108 100 34 93 44 34 102 114 111 109 34 58 123 34 118 97 114 34 58 34 100 111 99 117 109 101 110 116 34 125 125 125]}]
System.out.println(
    client.query(
        Obj(
            "binding1", Query(
                Lambda(
                    "document",
                    Select(
                        Arr(Value("data"), Value("field")),
                        Var("document")
                    )
                )
            )
        )
    ).get());
{binding1: QueryV({api_version=4, lambda=document, expr={select=[data, field], from={var=document}}})}
client.query({
  binding1: q.Query(
    q.Lambda('document', q.Select(['data', 'field'], q.Var('document')))
  ),
})
.then((ret) => console.log(ret))
.catch((err) => console.error(
  'Error: [%s] %s: %s',
  err.name,
  err.message,
  err.errors()[0].description,
))
{ binding1:
   Query(Lambda("document", Select(["data", "field"], Var("document")))) }
result = client.query(
  {
    "binding1": q.query(
      q.lambda_( "document", q.select(["data", "field"], q.var("document")))
    )
  }
)
print(result)
{'binding1': Query({'api_version': '4', 'lambda': 'document', 'expr': {'select': ['data', 'field'], 'from': {'var': 'document'}}})}
{
  binding1: Query(
    Lambda('document', Select(['data', 'field'], Var('document')))
  )
}
{
  binding1: Query(Lambda("document", Select(["data", "field"], Var("document"))))
}
Query metrics:
  •    bytesIn: 116

  •   bytesOut: 137

  • computeOps:   1

  •    readOps:   0

  •   writeOps:   0

  •  readBytes:   0

  • writeBytes:   0

  •  queryTime: 2ms

  •    retries:   0

Term objects

Term objects describe the fields whose values are used to search for entries in the index.

When a terms field is missing from an indexed document, the field value in the index is null. If all defined terms fields evaluate to null, no index entry is stored for the document.
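The null rule can be expressed as a short Python sketch. The `term_values` helper and the field names are hypothetical, and the model only looks at top-level fields under `data`.

```python
def term_values(doc, term_fields):
    """Return the tuple of term values, or None when no index entry is stored."""
    values = tuple(doc["data"].get(field) for field in term_fields)
    # Only applies when terms are defined: if every term is null, skip the document.
    if values and all(value is None for value in values):
        return None
    return values


partly_filled = {"data": {"name": "Alice"}}
empty = {"data": {}}

print(term_values(partly_filled, ["name", "city"]))  # ('Alice', None)
print(term_values(empty, ["name", "city"]))          # None
```

A document missing only some of the term fields is still indexed, with null in the missing positions; a document missing all of them is not.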

If no term objects are defined, passing term values to Match is not required. The resulting set has all documents in the source collection.

A value can be from a field in the document or a binding defined by the source object.

Terms must refer to fields with scalar Values. When a term points to a document field or index binding result that has an array, one index entry per array item is created. Objects are not indexed.

Be aware that when the terms definition includes multiple array fields, the number of index entries created is the Cartesian product of the number of array items. For example, when an index terms definition specifies two fields that are arrays, and a document is created including one array with 5 items and the second array with 11 items, 55 index entries are created. Index write operations are grouped together, so the billing impact depends on the overall size of the index entries.

Field Type Definition

field

Array or String

The field name path or field name in the document to be indexed.

The path targets a field value. For this example object:

{
  "ref": Ref(Collection("pet_owners"), "12345"),
  "data": {
    "name": "Alice",
    "pets": [
      { "type": "dog", "name": "Luna" },
      { "type": "cat", "name": "Fluffy" },
    ],
  }
}

The following paths can be used to target fields:

  • "ref" targets the value for the ref field.

  • ["data", "name"] targets the value for the data.name field, "Alice".

  • ["data", "pets", "name"] targets the name field of objects in the pets array.

    Recall that one index entry is created per array item during indexing, which flattens the array so that no array offset exists in the index entry.

binding

The name of a binding from a source object.
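The path rules for the example object can be modeled with a short Python sketch. The `resolve_path` helper is hypothetical and is not part of any Fauna driver; it takes the path as a list (a single field name such as "ref" is passed as a one-element list), and a plain string stands in for the Reference.

```python
def resolve_path(node, path):
    """Return the values a field path targets, flattening arrays on the way."""
    if isinstance(node, list):
        # Arrays carry no offset in the path: apply the same path to each item.
        results = []
        for item in node:
            results.extend(resolve_path(item, path))
        return results
    if not path:
        return [node]
    if isinstance(node, dict) and path[0] in node:
        return resolve_path(node[path[0]], path[1:])
    return []  # the path does not exist in this document


owner = {
    "ref": "pet_owners/12345",  # stand-in for Ref(Collection("pet_owners"), "12345")
    "data": {
        "name": "Alice",
        "pets": [
            {"type": "dog", "name": "Luna"},
            {"type": "cat", "name": "Fluffy"},
        ],
    },
}

print(resolve_path(owner, ["data", "name"]))          # ['Alice']
print(resolve_path(owner, ["data", "pets", "name"]))  # ['Luna', 'Fluffy']
```

Each element of the returned list corresponds to one index entry, which is why the pets path yields two entries for this document.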

The following example demonstrates an index terms definition with two term objects: the first specifies a binding, and the second specifies a document field.

try
{
    Value result = await client.Query(
        Obj(
            "terms", Arr(
                Obj("binding", "binding1" ),
                Obj("field", Arr("data", "field"))
            )
        )
    );
    Console.WriteLine(result);
}
catch (Exception e)
{
    Console.WriteLine($"ERROR: {e.Message}");
}
ObjectV(terms: Arr(ObjectV(binding: StringV(binding1)), ObjectV(field: Arr(StringV(data), StringV(field)))))
result, err := client.Query(
	f.Obj{
		"terms": f.Arr{
			f.Obj{"binding": "binding1"},
			f.Obj{"field": f.Arr{"data", "field"}},
		},
	})

if err != nil {
	fmt.Fprintln(os.Stderr, err)
} else {
	fmt.Println(result)
}
map[terms:[map[binding:binding1] map[field:[data field]]]]
System.out.println(
    client.query(
        Obj(
            "terms", Arr(
                Obj("binding", Value("binding1")),
                Obj("field", Arr(Value("data"), Value("field")))
            )
        )
    ).get());
{terms: [{binding: "binding1"}, {field: ["data", "field"]}]}
client.query({
  terms: [
    { binding: 'binding1' },
    { field: ['data', 'field'] },
  ],
})
.then((ret) => console.log(ret))
.catch((err) => console.error(
  'Error: [%s] %s: %s',
  err.name,
  err.message,
  err.errors()[0].description,
))
{
  terms: [ { binding: 'binding1' }, { field: [ 'data', 'field' ] } ]
}
result = client.query(
  {
    "terms": [
      { "binding": "binding1" },
      { "field": ["data", "field"] }
    ]
  }
)
print(result)
{'terms': [{'binding': 'binding1'}, {'field': ['data', 'field']}]}
{
  terms: [
    { binding: 'binding1' },
    { field: ['data', 'field'] },
  ]
}
{
  terms: [ { binding: 'binding1' }, { field: [ 'data', 'field' ] } ]
}
Query metrics:
  •    bytesIn:  94

  •   bytesOut:  74

  • computeOps:   1

  •    readOps:   0

  •   writeOps:   0

  •  readBytes:   0

  • writeBytes:   0

  •  queryTime: 4ms

  •    retries:   0

Value objects

Value objects describe the fields whose values should be used to sort the index, and whose values should be reported in query results. By default, indexes have no values defined, and return the References of indexed documents.

When a values field is missing from an indexed document, the field value in the index is null. If all defined values fields evaluate to null, no index entry is stored for the document.

A value can be from a field in the document, or a binding function defined in a source object.

Values must refer to fields with scalar Values. Objects are not indexed, so when a values definition points to a document field or index binding result that has an Object, the index entry stores null because Objects cannot be sorted. When a values definition points to a document field or index binding result with an Array, one index entry per array item is created.

Field Type Definition

field

Array or String

The field name path or field name in the document to be indexed.

The path targets a field value. For this example object:

{
  "ref": Ref(Collection("pet_owners"), "12345"),
  "data": {
    "name": "Alice",
    "pets": [
      { "type": "dog", "name": "Luna" },
      { "type": "cat", "name": "Fluffy" },
    ],
  }
}

The following paths can be used to target fields:

  • "ref" targets the value for the ref field.

  • ["data", "name"] targets the value for the data.name field, "Alice".

  • ["data", "pets", "name"] targets the name field of objects in the pets array.

    Recall that one index entry is created per array item during indexing, which flattens the array so that no array offset exists in the index entry.

binding

The name of a binding from a source object.

reverse

Whether this field value should sort reversed. Defaults to false.

The document reference is included in before and after cursors when paging through an index with the Paginate function, even if the reference is not included as a values field. Pagination uses the covered document reference stored in each index entry to stabilize pagination.
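Why the reference helps can be seen in a small Python sketch of cursor-based paging. This is an assumption-laden model rather than the behavior of Paginate itself: the entries, the `page` helper, and the choice to treat the cursor as the first entry of the next page are all hypothetical.

```python
# Hypothetical index entries: (value, ref). Two documents share the value "dog".
entries = sorted([
    ("dog", "pets/3"),
    ("cat", "pets/1"),
    ("dog", "pets/2"),
    ("emu", "pets/4"),
])


def page(rows, size, after=None):
    """Return up to `size` entries starting at the cursor, plus the next cursor."""
    if after is None:
        start = 0
    else:
        # The cursor is a full (value, ref) pair, so ties on value are unambiguous.
        start = next((i for i, row in enumerate(rows) if row >= after), len(rows))
    end = start + size
    next_cursor = rows[end] if end < len(rows) else None
    return rows[start:end], next_cursor


first, cursor = page(entries, 2)
second, _ = page(entries, 2, after=cursor)

print(first)   # [('cat', 'pets/1'), ('dog', 'pets/2')]
print(cursor)  # ('dog', 'pets/3')
print(second)  # [('dog', 'pets/3'), ('emu', 'pets/4')]
```

The page boundary falls between the two "dog" entries. A cursor holding only the value "dog" could not tell them apart, while a cursor that also carries the ref resumes at exactly the right entry.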

All document fields may be indexed. The value of field in a Term or Value object indicates the position in a document for a field. For example, the field ref refers to the top-level ref field. The field ["data", "address", "street"] refers to the street field contained in an address object in the data object of the document.

The following example demonstrates an index values definition with two value objects: the first specifies a binding, and the second specifies a document field that should be sorted in reverse.

try
{
    Value result = await client.Query(
        Obj(
            "values", Arr(
                Obj("binding", "binding1" ),
                Obj("field", Arr("data", "field"), "reverse", true)
            )
        )
    );
    Console.WriteLine(result);
}
catch (Exception e)
{
    Console.WriteLine($"ERROR: {e.Message}");
}
ObjectV(values: Arr(ObjectV(binding: StringV(binding1)), ObjectV(field: Arr(StringV(data), StringV(field)),reverse: BooleanV(True))))
result, err := client.Query(
	f.Obj{
		"values": f.Arr{
			f.Obj{"binding": "binding1"},
			f.Obj{"field": f.Arr{"data", "field"}, "reverse": true},
		},
	})

if err != nil {
	fmt.Fprintln(os.Stderr, err)
} else {
	fmt.Println(result)
}
map[values:[map[binding:binding1] map[field:[data field] reverse:true]]]
System.out.println(
    client.query(
        Obj(
            "values", Arr(
                Obj("binding", Value("binding1")),
                Obj(
                    "field", Arr(Value("data"), Value("field")),
                    "reverse", Value(true)
                )
            )
        )
    ).get());
{values: [{binding: "binding1"}, {field: ["data", "field"], reverse: true}]}
client.query({
  values: [
    { binding: 'binding1' },
    { field: ['data', 'field'], reverse: true },
  ],
})
.then((ret) => console.log(ret))
.catch((err) => console.error(
  'Error: [%s] %s: %s',
  err.name,
  err.message,
  err.errors()[0].description,
))
{
  values: [
    { binding: 'binding1' },
    { field: [ 'data', 'field' ], reverse: true }
  ]
}
result = client.query(
  {
    "values": [
      {"binding": "binding1"},
      {"field": ["data", "field"], "reverse": True}
    ]
  }
)
print(result)
{'values': [{'binding': 'binding1'}, {'field': ['data', 'field'], 'reverse': True}]}
{
  values: [
    { binding: 'binding1' },
    { field: ['data', 'field'], reverse: true },
  ]
}
{
  values: [
    { binding: 'binding1' },
    { field: [ 'data', 'field' ], reverse: true }
  ]
}
Query metrics:
  •    bytesIn: 110

  •   bytesOut:  90

  • computeOps:   1

  •    readOps:   0

  •   writeOps:   0

  •  readBytes:   0

  • writeBytes:   0

  •  queryTime: 2ms

  •    retries:   0
