Airbyte integration

Supported version

Airbyte 0.39.26-alpha

This document describes how to configure the Fauna source connector to transfer your database to one of the data analytics or warehousing destination connectors supported by Airbyte.

The Fauna source supports two ways to export your data:

  • Full refresh sync mode copies all of your data to the destination, optionally deleting existing data.

  • Incremental append sync mode periodically transfers new, changed, or deleted data to the destination.
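The two modes differ in what lands in the destination on each run. As a rough sketch (illustrative record shapes only, not the connector's actual implementation):

```python
# Illustrative sketch of the two sync modes. Records carry Fauna's
# ref (document identifier) and ts (last-modified timestamp).

def full_refresh(source):
    """Full refresh: copy every source record to the destination."""
    return list(source)

def incremental_append(destination, source, last_synced_ts):
    """Incremental append: transfer only records modified since the last sync."""
    return destination + [r for r in source if r["ts"] > last_synced_ts]

source = [
    {"ref": "101", "ts": 1, "data": {"name": "a"}},
    {"ref": "102", "ts": 5, "data": {"name": "b"}},
]

dest = full_refresh(source)                                # copies both records
dest = incremental_append(dest, source, last_synced_ts=1)  # only ref 102 is newer
```

Full refresh is simpler but re-reads the whole collection every run; incremental append only transfers documents modified since the previous sync, which is why it requires the index described below.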

Prerequisites

You need a destination database account and need to set up the Data Build Tool (dbt™) to transform fields in your documents to columns in your destination. You also need to install Docker.

Create a destination database account

If you do not already have an account for the database associated with your destination connector, create an account and save the authentication credentials for setting up the destination connector to populate the destination database.

Set up dbt

To access the fields in your Fauna source using SQL-style statements, create a dbt account and set up dbt as described in the Airbyte Transformations with dbt setup guide. The guide steps you through the setup for transforming the data between the source and destination, and connects you to the destination database.

Install Docker

The Fauna connector is an Airbyte Open Source integration, deployed as a Docker image. If you do not already have Docker installed, follow the Install Docker Engine guide.

Step 1: Set up the Fauna source

Depending on your use case, set up one of the following sync modes for your collection.

Full refresh sync mode

Follow these steps to fully sync the source and destination database.

  1. Use the Fauna Dashboard or fauna-shell to create a role that can read the collection to be exported. For example:

    CreateRole({
      name: "airbyte-readonly",
      privileges: [{
        resource: Collection("COLLECTION_NAME"),
        actions: { read: true }
      }],
    })

    Replace COLLECTION_NAME with the collection name for this connector.

  2. Create a secret that has the permissions associated with the role, using the name of the role you created. For example:

    CreateKey({
      name: "airbyte-readonly",
      role: Role("airbyte-readonly"),
    })

    The returned key document includes the secret:

    {
      ref: Key("341909050919747665"),
      ts: 1662328730450000,
      role: Role("airbyte-readonly"),
      secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
      hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
    }

    Save the returned secret. If you lose it, you must create a new key.

Incremental append sync mode

Use incremental sync mode to periodically sync the source and destination, updating only new and changed data.

Follow these steps to set up incremental sync.

  1. Use the Fauna Dashboard or fauna-shell to create an index, which lets the connector do incremental syncs. For example:

    CreateIndex({
      name: "INDEX_NAME",
      source: Collection("COLLECTION_NAME"),
      terms: [],
      values: [
        { "field": "ts" },
        { "field": "ref" }
      ]
    })

    Replace INDEX_NAME with the name you configured for the Incremental Sync Index. Replace COLLECTION_NAME with the name of the collection configured for this connector.

    The index values have the following meaning:

    • ts: Last modified timestamp.

    • ref: Unique document identifier.

  2. Create a role that can read the collection and index, and can access index metadata to validate the index settings. For example:

    CreateRole({
      name: "airbyte-readonly",
      privileges: [
        {
          resource: Collection("COLLECTION_NAME"),
          actions: { read: true }
        },
        {
          resource: Index("INDEX_NAME"),
          actions: { read: true }
        },
        {
          resource: Indexes(),
          actions: { read: true }
        }
      ],
    })

    Replace COLLECTION_NAME with the name of the collection configured for this connector. Replace INDEX_NAME with the name that you configured for the Incremental Sync Index.

  3. Create a secret key that has the permissions associated with the role, using the name of the role you created. For example:

    CreateKey({
      name: "airbyte-readonly",
      role: Role("airbyte-readonly"),
    })

    The returned key document includes the secret:

    {
      ref: Key("341909050919747665"),
      ts: 1662328730450000,
      role: Role("airbyte-readonly"),
      secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
      hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
    }

    Save the returned secret. You enter it in the Fauna Secret field when you set up the source in the Airbyte dashboard. If you lose the secret, you must create a new key.

The Fauna source iterates through all indexes on the database. For each index it finds, the following conditions must be met for incremental sync:

  1. The source must be able to Get() the index, which means it needs read access to this index.

  2. The source of the index must be a reference to the collection you are trying to sync.

  3. The number of values must be two.

  4. The number of terms must be zero.

  5. The values must be equal to:

    {"field": "ts"}
    {"field": "ref"}

All of the above conditions are checked in the order listed. If any check fails, the connector skips that index.

If no indexes are found in the initial setup, incremental sync isn’t available for the given collection. No error is emitted because it can’t be determined whether or not you are expecting an index for that collection.

If you find that the collection doesn’t have incremental sync available, make sure that you followed all the setup steps, and that the source, terms, and values all match for your index.
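The checks above can be sketched as a small validation function. This is an illustrative model of the rules, with the index represented as a plain dictionary resembling a Get() result, not the connector's source code:

```python
# Sketch of the incremental-sync index checks. The index is modeled as
# a dict; this is illustrative only, not the connector's implementation.

REQUIRED_VALUES = [{"field": "ts"}, {"field": "ref"}]

def supports_incremental_sync(index, collection_name, can_read_index=True):
    """Return True only if the index passes all five conditions, in order."""
    if not can_read_index:                      # 1. source can Get() the index
        return False
    if index.get("source") != collection_name:  # 2. source is the target collection
        return False
    if len(index.get("values", [])) != 2:       # 3. exactly two values
        return False
    if len(index.get("terms", [])) != 0:        # 4. zero terms
        return False
    return index["values"] == REQUIRED_VALUES   # 5. values are ts, then ref

good = {"source": "users", "terms": [],
        "values": [{"field": "ts"}, {"field": "ref"}]}
bad = {"source": "users", "terms": [{"field": "name"}],
       "values": [{"field": "ts"}, {"field": "ref"}]}
```

For example, `bad` fails the zero-terms check, and `good` fails if the connector is configured for a different collection.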

Step 2: Deploy and launch Airbyte

  1. Refer to the Deploy Airbyte instructions to install and deploy Airbyte. Enter the following commands to deploy the Airbyte server:

    git clone https://github.com/airbytehq/airbyte.git
    cd airbyte
    docker-compose up
  2. When the Airbyte banner displays, launch the Airbyte dashboard at http://localhost:8000.

  3. Choose the Connections menu item to start setting up your data source.

Step 3: Set up the Fauna source

  1. In the Airbyte dashboard, click the + New connection button. If you previously set up a source, click the Use existing source button to choose that source.

  2. In the Source type dropdown, choose Fauna and click the Set up source button. This lists the configurable Fauna connector parameters. An in-app Setup Guide in the right-side panel also gives detailed setup instructions.

  3. Set the following required parameters:

    • Name: A descriptive name for this connection. The name is displayed in the Connections window connections list.

    • Domain: The domain of the collection you want to export. See Region Groups for region domains.

    • Port: The port number. Use the default: 443.

    • Scheme: The scheme used to connect to Fauna: https.

    • Fauna Secret: The saved Fauna secret that you use to authenticate with the database.

    • Page Size: The number of documents fetched per query. Page size lets you control memory use, which affects connector performance.

    • Deletion Mode: Specifies whether to ignore document deletions or flag documents as deleted, depending on your use case:

      • The Ignore option ignores document deletions.

      • The Deleted Field option adds a column containing the date when the document was deleted. This maintains document history while letting the destination reconstruct deletion events.

  4. After setting up the source, click the Set up source button.

    The All connection tests passed! message confirms successful connection to the Fauna source. This minimally confirms:

    • The secret is valid.

    • The collection exists.
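The Deletion Mode options from step 3 can be sketched as follows. The row shape here is hypothetical; the connector's actual output columns may differ:

```python
# Sketch of the two deletion modes. The row shape is hypothetical;
# the connector's actual output columns may differ.

def emit_deletion(ref, deleted_at, mode):
    """Return the row to emit for a deleted document, or None to skip it."""
    if mode == "ignore":
        # Ignore: deletions are not propagated to the destination.
        return None
    if mode == "deleted_field":
        # Deleted Field: emit a row with the deletion date so the
        # destination keeps history and can reconstruct deletion events.
        return {"ref": ref, "deleted_at": deleted_at}
    raise ValueError(f"unknown deletion mode: {mode}")

row = emit_deletion("101", "2022-09-04", mode="deleted_field")
```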

Step 4: Set up the destination

  1. In the New connection window, choose a Destination type and click the Set up destination button. If you previously set up a destination, click the Use existing destination button to select and use that destination. Otherwise, continue to set up a new destination.

  2. Destination connector configuration parameters are unique to the destination. Populate the Set up the destination fields according to the connector requirements, including authentication information if needed. A Setup Guide is provided in the right-side panel with detailed setup instructions.

  3. When you are done, click the Set up destination button.

Step 5: Set up the connection

Set up the connection to sync the source and destination.

  1. Enter a descriptive name for the connection in the Name field.

  2. Choose a Transfer > Replication frequency, which is the data sync interval.

    You can choose the Manual option to manually sync the data.

  3. In the Streams > Destination Namespace field, choose a destination namespace where the data is stored. Options include:

    • Mirror source structure: Sets the name in the destination database to the name used for the Fauna source.

    • Other: Uses another naming option, such as prefixing the database name with a string.

  4. Optionally, enter a stream name prefix in the Streams > Destination Stream Prefix field.

  5. In the Activate the streams you want to sync section, click the > arrow to expand the available fields:

    • data: Collection data.

    • ref: Unique document identifier.

    • ts: Data timestamp.

    • ttl: Time-to-live interval. A document is deleted if it is not modified within the ttl interval. The default value is null, which means ttl is not used. After a document is deleted, it is not displayed in temporal queries and the connector does not emit a deleted_at row.

  6. Select ref as the Primary key. This uniquely identifies the document in the collection.

  7. Choose a Sync mode as the source sync behavior, full or incremental.

    The first run of an incremental sync transfers the full database, the same as a full sync.

    • Incremental | Deduped + history: Transfers only new, changed, or deleted records and deduplicates them in the destination on the primary key, keeping a history of changes.

    • Full refresh | Overwrite: Transfers all records and replaces the existing data in the destination.

    • Incremental | Append: Transfers only new, changed, or deleted records and appends them to the destination.

    • Full refresh | Append: Transfers all records and appends them to the destination.

    If fewer than four options are displayed, it indicates that the index is incorrectly set up. See Step 1: Set up the Fauna source.

  8. Choose the Normalization and Transformation data format:

    • Raw data (JSON): Puts all the source data in a single column.

    • Normalized tabular data: Puts the ref, ts, ttl, and data fields in separate columns.

  9. Click the + Add transformation button to add the dbt transform.

    To extract the fields in the source data column, you need to configure dbt to map source data to destination database columns. For example, the following SQL-based query extracts the name, account_balance, and credit_card/expires fields from the source data column to populate three separate columns of the destination data:

    with output as (
      select
        data:name as name,
        data:account_balance as balance,
        data:credit_card:expires as cc_expires
      from airbyte_schema.users
    )

    select * from output
  10. Click the Set up connection button.
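The two Normalization and Transformation formats from step 8 can be sketched as follows. This is illustrative only; Airbyte's actual raw column naming may differ:

```python
import json

# Illustrative sketch of the two output formats for one Fauna document.
# The _airbyte_data column name is an assumption for illustration.

record = {"ref": "101", "ts": 1662328730450000, "ttl": None,
          "data": {"name": "a", "account_balance": 100}}

def raw_json_row(record):
    """Raw data (JSON): the whole record lands in a single column."""
    return {"_airbyte_data": json.dumps(record)}

def normalized_row(record):
    """Normalized tabular data: ref, ts, ttl, and data as separate columns."""
    return {"ref": record["ref"], "ts": record["ts"],
            "ttl": record["ttl"], "data": record["data"]}
```

With the raw format, extracting individual fields is deferred to a dbt transform like the one in step 9; the normalized format splits the top-level fields for you.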

Step 6: Sync the data

On the Connection page for the connection you created, click the Sync now button.

Sync duration varies with the amount of data; the status is displayed in Sync History. When the sync completes, the status changes from Running to Succeeded and shows:

  • The number of bytes transferred.

  • The number of records emitted and committed.

  • The sync duration.

If you run into problems, visit Fauna's forums or email docs@fauna.com.

Step 7: Verify the integration

To expand the sync log, click the > arrow to the right of the displayed time. This gives you a detailed view of the sync events.

Finally, verify successful database transfer by opening and viewing the destination database.
