Airbyte integration

Supported version

Airbyte 0.40.32 (alpha)

This document describes how to configure the Fauna source connector to transfer your database to one of the data analytics or warehousing destination connectors supported by Airbyte.

The Fauna source supports two ways to export your data:

  • Full refresh sync mode copies all of your data to the destination, optionally deleting existing data.

  • Incremental append sync mode periodically transfers new, changed, or deleted data to the destination.
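
As a rough illustration of the difference between the two modes, the following Python sketch models a collection as a list of in-memory records. The record layout and the function names full_refresh and incremental_append are hypothetical and are not part of the connector. The sketch only assumes that each document carries a ts last-modified timestamp.

```python
def full_refresh(source, destination, overwrite=True):
    """Copy every source record; optionally clear the destination first."""
    if overwrite:
        destination.clear()
    destination.extend(dict(doc) for doc in source)
    return destination


def incremental_append(source, destination, cursor):
    """Append only records modified after `cursor`; return the new cursor."""
    changed = sorted(
        (doc for doc in source if doc["ts"] > cursor),
        key=lambda doc: doc["ts"],
    )
    destination.extend(dict(doc) for doc in changed)
    return max([cursor] + [doc["ts"] for doc in changed])


# Hypothetical sample data: two documents, then one of them changes.
source = [
    {"ref": "1", "ts": 100, "data": {"name": "a"}},
    {"ref": "2", "ts": 200, "data": {"name": "b"}},
]
destination = []
cursor = incremental_append(source, destination, cursor=0)   # first run copies everything
source[0] = {"ref": "1", "ts": 300, "data": {"name": "a2"}}  # document 1 is modified
cursor = incremental_append(source, destination, cursor)     # only the change is appended
```

In this toy model the destination ends up with both versions of document 1, which is why append modes keep history while overwrite modes do not.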

Prerequisites

Create a destination database account

If you do not already have an account for the database associated with your destination connector, create an account and save the authentication credentials. You need the credentials when you set up the destination connector to populate the destination database.

Install Docker

If you want to run Airbyte locally, you must install Docker. Follow the Install Docker Engine guide.

Step 1: Set up sync mode and authentication

Choose one of the following sync modes, depending on your requirements.

Full refresh sync mode

Follow these steps to fully sync the source and destination database.

  1. Use the Fauna Dashboard or fauna-shell to create a role that can read the collection to be exported. For example:

    CreateRole({
      name: "airbyte-readonly",
      privileges: [{
        resource: Collection("COLLECTION_NAME"),
        actions: { read: true }
      }],
    })

    Replace COLLECTION_NAME with the collection name for this connector.

  2. Create a key for the role you created. The secret returned with the key carries the permissions of that role. For example:

    CreateKey({
      name: "airbyte-readonly",
      role: Role("airbyte-readonly"),
    })
    {
      ref: Key("341909050919747665"),
      ts: 1662328730450000,
      role: Role("airbyte-readonly"),
      secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
      hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
    }

    Save the returned secret, which you need in Step 3: Set up the Fauna source. If you lose the secret, you must create a new key.

  3. Skip to Step 2: Deploy and launch Airbyte.

Incremental append sync mode

Use incremental sync mode to periodically sync the source and destination, updating only new and changed data.

Follow these steps to set up incremental sync.

  1. Use the Fauna Dashboard or fauna-shell to create an index, which lets the connector do incremental syncs. For example:

    CreateIndex({
      name: "INDEX_NAME",
      source: Collection("COLLECTION_NAME"),
      terms: [],
      values: [
        { "field": "ts" },
        { "field": "ref" }
      ]
    })

    Replace INDEX_NAME with the name you configured for the Incremental Sync Index. Replace COLLECTION_NAME with the name of the collection configured for this connector.

    Index values  Description
    ts            Last modified timestamp.
    ref           Unique document identifier.

  2. Create a role that can read the collection, the index, and the index metadata. The connector reads the index metadata to validate the index settings. For example:

    CreateRole({
      name: "airbyte-readonly",
      privileges: [
        {
          resource: Collection("COLLECTION_NAME"),
          actions: { read: true }
        },
        {
          resource: Index("INDEX_NAME"),
          actions: { read: true }
        },
        {
          resource: Indexes(),
          actions: { read: true }
        }
      ],
    })

    Replace COLLECTION_NAME with the name of the collection configured for this connector. Replace INDEX_NAME with the name that you configured for the Incremental Sync Index.

  3. Create a key for the role you created. The secret returned with the key carries the permissions of that role. For example:

    CreateKey({
      name: "airbyte-readonly",
      role: Role("airbyte-readonly"),
    })
    {
      ref: Key("341909050919747665"),
      ts: 1662328730450000,
      role: Role("airbyte-readonly"),
      secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
      hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
    }

    Save the returned secret, which you need in Step 3: Set up the Fauna source. If you lose the secret, you must create a new key.

  4. Continue with the next step.
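
To summarize how the two roles differ, the following Python sketch lists the read privileges each sync mode needs. The helper readonly_privileges is hypothetical. It only builds plain dictionaries whose strings mirror the resources in the CreateRole examples, and it does not call Fauna.

```python
def readonly_privileges(collection_name, index_name=None):
    """Describe the read privileges the role needs, as plain dictionaries.

    Hypothetical helper: the resource strings mirror the FQL resources in
    the CreateRole examples and are not sent to Fauna by this sketch.
    """
    resources = [f'Collection("{collection_name}")']
    if index_name is not None:
        # Incremental sync also reads the index and the index metadata.
        resources.append(f'Index("{index_name}")')
        resources.append("Indexes()")
    return [{"resource": r, "actions": {"read": True}} for r in resources]


full_refresh_role = readonly_privileges("COLLECTION_NAME")
incremental_role = readonly_privileges("COLLECTION_NAME", "INDEX_NAME")
```

Full refresh needs one privilege, and incremental append needs three.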

The Fauna source iterates through all indexes on the database in the order listed. For each index it finds, incremental sync requires the following conditions:

  • The source must have read access to Get() the index.

  • The source of the index must be a reference to the collection you are trying to sync.

  • The number of values must be two.

  • The number of terms must be zero.

  • The values must be equal to:

    {"field": "ts"}
    {"field": "ref"}

If a check fails, that index is skipped.

If no indexes are found in the initial setup, incremental sync isn’t available for the collection. No error is emitted because the connector doesn’t know whether you’re expecting an index for that collection.

If you find that incremental sync isn’t available for the collection, make sure that you followed the setup steps and that the source, terms, and values of your index match the conditions listed above.
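
The index checks described above can be sketched in Python as follows. This is not the connector's code. It assumes a hypothetical plain-dictionary view of the index metadata with the keys source, terms, and values, it assumes that the first index passing every check is the one used, and it does not model the read-access check.

```python
REQUIRED_VALUES = [{"field": "ts"}, {"field": "ref"}]


def supports_incremental_sync(index, collection_name):
    """Return True if `index` passes the source, terms, and values checks.

    `index` is a hypothetical plain-dict view of the index metadata.
    """
    if index.get("source") != collection_name:
        return False  # the index must be built on the synced collection
    if len(index.get("terms") or []) != 0:
        return False  # the number of terms must be zero
    values = index.get("values") or []
    if len(values) != 2:
        return False  # the number of values must be two
    return values == REQUIRED_VALUES  # order matters: ts, then ref


def find_sync_index(indexes, collection_name):
    """Return the first index that passes every check, or None."""
    for index in indexes:
        if supports_incremental_sync(index, collection_name):
            return index
    return None  # no error: incremental sync is simply unavailable
```

A None result corresponds to the case where incremental sync isn’t offered for the collection and no error is reported.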

Step 2: Deploy and launch Airbyte

You can deploy and launch Airbyte locally using a Docker image or use Airbyte Cloud.

Deploy and launch locally

  1. Refer to the Airbyte Local Deployment guide to install and deploy Airbyte locally. Enter the following commands to deploy the Airbyte server:

    git clone https://github.com/airbytehq/airbyte.git
    cd airbyte
    docker-compose up

  2. When the Airbyte banner displays, launch the Airbyte dashboard at http://localhost:8000.

  3. To log in, enter the default credentials found in the .env file of the cloned repository:

    BASIC_AUTH_USERNAME=airbyte
    BASIC_AUTH_PASSWORD=password

  4. Choose the Connections menu item to start setting up your data source, destination, and connection between them.

  5. In the Airbyte dashboard, click the + New connection button.

  6. Skip to Step 3: Set up the Fauna source.
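
If you prefer to read the login credentials programmatically instead of opening the .env file, a small parser is enough. The following Python sketch is only an illustration: the helper read_env_value is hypothetical, the sample text stands in for the real file, and the parser handles only simple KEY=value lines.

```python
def read_env_value(env_text, key):
    """Return the value assigned to `key` in .env-style text, or None."""
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Sample text standing in for the .env file of the cloned repository.
sample = "# Proxy configuration\nBASIC_AUTH_USERNAME=airbyte\nBASIC_AUTH_PASSWORD=password\n"
username = read_env_value(sample, "BASIC_AUTH_USERNAME")
password = read_env_value(sample, "BASIC_AUTH_PASSWORD")
```

To use the real values, pass the contents of the .env file in place of the sample text.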

Deploy and launch using Airbyte Cloud

  1. Refer to the Getting Started with Airbyte Cloud guide for the fastest and most reliable way to run Airbyte.

  2. If this is your first time, go to https://cloud.airbyte.com/signup, sign up for an Airbyte account, and click Create your first connection.

    Otherwise, log in to Airbyte, and in the left navigation panel, choose Connections. Choose an existing connection to make changes.

    To create a new connection, click the New connection button and continue with the next step.

Step 3: Set up the Fauna source

  1. In the Source type dropdown, choose Fauna. The configurable Fauna connector parameters are then listed.

    If you previously set up a source, choose the source you want and click the Use existing source button.

    A Setup Guide in the right-side panel gives detailed setup instructions.

  2. Set the following required parameters:

    Parameter       Description
    Source name     Enter a descriptive name for this connection. The name is displayed in the connections list of the Connections window.
    Domain          Enter the main domain for Fauna, db.fauna.com. See Region Groups for more information.
    Port            Enter the default port number: 443.
    Scheme          Enter the scheme used to connect to Fauna: https.
    Fauna Secret    Enter the Fauna source database secret that you saved in Step 1: Set up sync mode and authentication.
    Page Size       The page size lets you control how much memory the connector uses, which affects connector performance.
    Deletion Mode   The deletion mode lets you specify whether to ignore document deletions or flag documents as deleted, depending on your use case. Choose from the following options:
                    • Disabled: ignores document deletions.
                    • Enabled: adds a date column with the date when the document was deleted. This maintains document history while letting the destination reconstruct deletion events.

  3. After setting up the source, click the Set up source button.

    The All connection tests passed! message indicates that you successfully connected to the Fauna source and confirms, at a minimum, that:

    • The secret is valid.

    • The collection exists.
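
Before you enter the parameters from step 2, it can help to check them in one place. The following Python sketch validates a settings dictionary and builds the endpoint URL from the scheme, domain, and port. The key names, the check_source_settings and endpoint_url helpers, and the page size value of 64 are all hypothetical choices for this illustration and are not the connector's actual configuration schema.

```python
REQUIRED_FIELDS = (
    "source_name", "domain", "port", "scheme",
    "secret", "page_size", "deletion_mode",
)
DELETION_MODES = ("disabled", "enabled")


def check_source_settings(settings):
    """Return a list of problems found in a hypothetical settings dict."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_FIELDS
        if settings.get(name) in (None, "")
    ]
    page_size = settings.get("page_size")
    if page_size not in (None, "") and (not isinstance(page_size, int) or page_size <= 0):
        problems.append("page_size must be a positive integer")
    mode = settings.get("deletion_mode")
    if mode not in (None, "") and mode not in DELETION_MODES:
        problems.append("deletion_mode must be 'disabled' or 'enabled'")
    return problems


def endpoint_url(settings):
    """Combine scheme, domain, and port into the URL used to reach Fauna."""
    return f"{settings['scheme']}://{settings['domain']}:{settings['port']}"


settings = {
    "source_name": "fauna-orders",    # hypothetical name
    "domain": "db.fauna.com",
    "port": 443,
    "scheme": "https",
    "secret": "YOUR_FAUNA_SECRET",    # placeholder, not a real secret
    "page_size": 64,                  # arbitrary example value
    "deletion_mode": "disabled",
}
problems = check_source_settings(settings)
url = endpoint_url(settings)
```

An empty problems list only means the values are well formed. The connection test in the dashboard is still what confirms that the secret is valid.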

Step 4: Set up the destination

  1. If you previously set up a destination, click the Use existing destination button to select and use that destination. Otherwise, choose the Destination type.

  2. Destination connector configuration parameters differ according to the destination type. Populate the Set up the destination fields according to the connector requirements, including authentication information. A Setup Guide in the right-side panel gives you detailed setup instructions.

  3. When you are done entering the required parameters, click the Set up destination button and wait for the destination testing to successfully complete.

Step 5: Set up the connection

In the New connection window, accept the default settings or make the changes you want for syncing the source and destination.

  1. Enter a descriptive name for the connection in the Connection name field.

  2. Choose a Replication frequency, which is the data sync period.

    Choose Manual to sync the data manually.

  3. In the Destination Namespace field, click the Edit button to choose a destination namespace where the data is stored. Options include:

    Option                   Description
    Destination default      Replicate and store in the default namespace defined in the destination settings.
    Mirror source structure  Sets the name in the destination database to the name used for the Fauna source.
    Custom format            Create a custom format to rename the namespace that your data is replicated into, such as prefixing the database name with a string.

    Click the Apply button when you are done.

  4. In the Non-breaking schema updates detected field, choose Ignore or Disable connection for how Airbyte handles syncs when it detects a non-breaking schema change in the source.

  5. In the Activate the streams you want to sync section, select the rows for the sources you want to sync.

  6. Click anywhere on a row to choose the fields you want to sync:

    Field  Description
    data   Collection data.
    ref    Unique document identifier.
    ts     Data timestamp.
    ttl    Time-to-live interval. The document is deleted if it isn’t modified within the ttl time interval. The default value is null, which means that ttl is not used. After the document is deleted, it is not displayed in temporal queries, and the connector does not emit a deleted_at row.

  7. Select ref as the Primary key, which uniquely identifies the document in the collection.

  8. Under Sync mode, choose the combination of options that defines the source sync behavior:

    Sync mode                        Description
    Full refresh | Overwrite         When the connector runs, it copies all of the Fauna source data and overwrites the destination data.
    Full refresh | Append            When the connector runs, it copies all of the Fauna source data and appends it to the destination data.
    Incremental | Append             When the connector runs, it copies all of the Fauna source data that changed since the last run.
    Incremental | Deduped + history  When the connector runs, it copies all of the Fauna source data that changed since the last run. Next, in the destination database, it sets up a view that shows the most recent version according to a user-defined primary key.

    Fewer than four options indicates that the index is set up incorrectly. See Step 1: Set up sync mode and authentication.

    The first run of a new incremental sync copies the full database, the same as a full refresh sync.

  9. Choose the Normalization data format:

    Data format              Description
    Raw data (JSON)          Put all the source data in a single column.
    Normalized tabular data  Put the ref, ts, ttl, and data fields in separate columns.

  10. Click the Set up connection button.
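
To make the Incremental | Deduped + history sync mode from step 8 more concrete, the following Python sketch shows the idea behind the deduplicated view: from an appended history of rows, keep the most recent version of each document, keyed by the primary key. The function latest_versions and the sample rows are hypothetical. The destination builds its own view, and this sketch only mimics the result.

```python
def latest_versions(rows, primary_key="ref"):
    """Keep only the most recent row (highest `ts`) for each primary key."""
    latest = {}
    for row in rows:
        key = row[primary_key]
        if key not in latest or row["ts"] > latest[key]["ts"]:
            latest[key] = row
    # Sort by primary key so the result order is predictable.
    return sorted(latest.values(), key=lambda row: row[primary_key])


# Hypothetical history after two incremental runs: document "1" changed once.
history = [
    {"ref": "1", "ts": 100, "data": {"name": "a"}},
    {"ref": "2", "ts": 200, "data": {"name": "b"}},
    {"ref": "1", "ts": 300, "data": {"name": "a2"}},
]
current = latest_versions(history)
```

Here history plays the role of the appended data, and current plays the role of the view that shows one row per ref.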

Step 6: Sync the data

On the Connection page of the created connection, click the Sync now button if the sync hasn’t already started.

The time to run a sync varies. The sync status is displayed in Sync History. When the sync completes, the status changes from Running to Succeeded and shows:

  • The number of bytes transferred.

  • The number of records emitted and committed.

  • The sync duration.

Click the Cancel Sync button to cancel a sync in progress.

Step 7: Verify the integration

When the sync completes, click Sync Succeeded to view the Sync History.

Confirm that the database has transferred successfully by opening and viewing the destination database.
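
One way to go beyond a visual check is to compare document identifiers on both sides. The following Python sketch assumes you have already exported the ref values from the Fauna collection and from the destination table into two lists. How you obtain those lists depends on your destination and is not shown, and the function compare_refs is hypothetical.

```python
def compare_refs(source_refs, destination_refs):
    """Report refs missing from the destination and refs found only there."""
    source_set = set(source_refs)
    destination_set = set(destination_refs)
    return {
        "missing_in_destination": sorted(source_set - destination_set),
        "unexpected_in_destination": sorted(destination_set - source_set),
    }


# Hypothetical example: document "3" has not arrived in the destination.
report = compare_refs(["1", "2", "3"], ["1", "2", "2"])
```

Because the comparison uses sets, repeated refs in the destination, which are expected with the append sync modes, do not count as differences.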
