Database performance
This section describes performance-related best practices.
Counters
In Fauna, each document update stores a new version of the document, which makes counter data poorly suited for database storage.
If a frequently-updated counter is an essential part of your application, an event sourcing technique is recommended to reduce database contention and reduce unnecessary database operations. For more information, see the Write throughput scaling tutorial.
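The event sourcing idea can be sketched without a database at all. The following is a minimal in-memory illustration, not Fauna driver code: the `CounterStore` class and its `record`, `roll_up`, and `read` methods are hypothetical names chosen for this example. Each increment is appended as its own event, so writers never update a shared counter document, and a periodic job folds the events into a summary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CounterStore:
    """Hypothetical in-memory stand-in for an events collection plus a summary document."""

    events: List[dict] = field(default_factory=list)      # append-only event documents
    totals: Dict[str, int] = field(default_factory=dict)  # one rolled-up total per counter

    def record(self, counter_id: str, delta: int = 1) -> None:
        # Each increment creates a new event instead of updating a shared
        # counter document, so concurrent writers do not contend.
        self.events.append({"counter": counter_id, "delta": delta})

    def roll_up(self) -> None:
        # Periodic job: fold pending events into the totals, then discard them.
        for event in self.events:
            key = event["counter"]
            self.totals[key] = self.totals.get(key, 0) + event["delta"]
        self.events.clear()

    def read(self, counter_id: str) -> int:
        # Current value = last rolled-up total plus events not yet folded in.
        pending = sum(e["delta"] for e in self.events if e["counter"] == counter_id)
        return self.totals.get(counter_id, 0) + pending


store = CounterStore()
for _ in range(5):
    store.record("page_views")
store.roll_up()                    # totals now hold 5
store.record("page_views", 2)      # one pending event
print(store.read("page_views"))    # prints 7
```

In a real deployment, the events and the summary would live in separate collections, and the roll-up interval trades read cost against write contention.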
If the event sourcing pattern isn’t suitable for your application, you might be able to make other performance improvements:
- Set a collection's history_days field to a small value; zero is recommended. Document history continues to be collected, but is removed sooner than it would be with a larger value. See Collections for additional information.
- Periodically run a query that calls Remove to explicitly remove document history.
- Instead of attempting to implement a real-time counter, consider storing countable documents as a cache and periodically analyzing the cache contents to update a reporting document.
Other frequently updated field considerations
Fauna processes transactions in 10-millisecond batch intervals. Each batch can contain many transactions, but only one transaction per batch can modify a specific field, so throughput could be limited by contention on field updates.
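The ceiling implied by the batch interval is simple arithmetic. The sketch below assumes the 10-millisecond interval stated above and that exactly one transaction per batch can modify the contended field. The function names are illustrative only, and real throughput may be lower.

```python
BATCH_INTERVAL_MS = 10  # batch interval stated in the text above


def max_field_updates_per_second(batch_interval_ms: float = BATCH_INTERVAL_MS) -> float:
    # One modifying transaction per batch means one update per interval.
    return 1000 / batch_interval_ms


def seconds_to_apply(updates: int, batch_interval_ms: float = BATCH_INTERVAL_MS) -> float:
    # Minimum time needed to apply a number of updates to the same field.
    return updates / max_field_updates_per_second(batch_interval_ms)


print(max_field_updates_per_second())  # 100.0
print(seconds_to_apply(5000))          # 50.0
```

Under these assumptions, a single field can absorb at most about 100 updates per second, so 5,000 queued updates to the same field need at least 50 seconds.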
Indexes must evaluate document history when searching for matching entries. Where possible, do not include frequently-updated fields in index terms or values definitions.
Data persistence considerations
Where possible, set the collection history_days field to 0. Older version removal is handled by a background task, so versions accumulate until the task executes. See Collections for additional information.
Consider setting the collection ttl_days to a small value. When the ttl_days limit is reached, "old" documents and their history are removed. See Collections for additional information.
Where possible, separate ephemeral data from long-term data. For queries that need to only access or change ephemeral data, including long-term data can make those queries take longer.
Indexing performance considerations
Avoid indexing fields that have low cardinality. This can cause many index entries to exist for each field value, which impacts general index performance.
Internally, indexes are partitioned by term, where each index term is serviced by one database node. With low cardinality terms, fewer nodes participate in generating sets of matching index entries. This makes index performance less scalable and can make those nodes hot spots that degrade overall performance.
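The effect of term cardinality on node participation can be illustrated with a toy partitioner. This is a sketch only: the SHA-256 hash-modulo placement in `node_for_term`, the 16-node cluster size, and the sample terms are assumptions made for the example, not a description of how Fauna assigns terms to nodes.

```python
import hashlib
from typing import Iterable, Set


def node_for_term(term: str, node_count: int) -> int:
    # Assumed placement rule: hash the term and take it modulo the node count.
    digest = hashlib.sha256(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def participating_nodes(terms: Iterable[str], node_count: int) -> Set[int]:
    # The set of nodes that service at least one of the given terms.
    return {node_for_term(term, node_count) for term in terms}


NODES = 16
low_cardinality = ["active", "inactive"]                  # for example, a status field
high_cardinality = [f"user-{i}" for i in range(10_000)]   # for example, a user ID field

low_nodes = participating_nodes(low_cardinality, NODES)
high_nodes = participating_nodes(high_cardinality, NODES)
print(len(low_nodes), "node(s) serve the low-cardinality terms")
print(len(high_nodes), "node(s) serve the high-cardinality terms")
```

With only two distinct terms, at most two nodes can ever do the work, regardless of how many nodes the cluster has.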
Using set manipulation functions, such as Union, Intersection, and Difference, can produce increased scatter/gather operations to populate the resulting set, reducing query performance. Set operations with fewer indexes perform better than those that have many indexes.
When you are not using temporality on documents in a collection, document versions increase your storage costs and reduce the performance of indexes covering those documents. Reduce the collection history_days setting to reduce the time interval for persisting document versions.
Additionally, to support temporality, indexes store index entries for every version of the documents associated with the index source definition. Indexes with many entries can impact overall index performance and increase storage requirements. To address this concern:
- Set low ttl (time-to-live) values on indexed collections.
- Set low history_days values on indexed collections.
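A back-of-the-envelope model shows why a low history_days value shrinks an index. The sketch below assumes that every update changes an indexed field, that each retained version contributes one index entry, and that old versions are removed as soon as they leave the retention window (in practice the background task lags). The function names and workload numbers are hypothetical.

```python
def retained_versions(updates_per_day: int, history_days: int) -> int:
    # The current version plus every historical version still inside the window.
    return 1 + updates_per_day * history_days


def estimated_index_entries(documents: int, updates_per_day: int, history_days: int) -> int:
    # Assumes one index entry per retained version of each document.
    return documents * retained_versions(updates_per_day, history_days)


# Hypothetical workload: 100,000 documents, each updated once per hour.
print(estimated_index_entries(100_000, 24, 30))  # 72100000
print(estimated_index_entries(100_000, 24, 0))   # 100000
```

Under these assumptions, retaining 30 days of history leaves the index roughly 721 times larger than retaining none.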