Airbyte integration
Supported version |
---|
Airbyte 0.40.32 (alpha) |
This document describes how to configure the Fauna source connector to transfer your database to one of the data analytics or warehousing destination connectors supported by Airbyte.
The Fauna source supports two ways to export your data:
-
Full refresh sync mode copies all of your data to the destination, optionally deleting existing data.
-
Incremental append sync mode periodically transfers new, changed, or deleted data to the destination.
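To make the difference between the two modes concrete, here is a minimal Python sketch. It is not the connector's implementation: the record shape (dictionaries with `ref` and `ts` keys), the in-memory `destination` list, and the function names are assumptions made for illustration only.

```python
def full_refresh(source, destination, overwrite=True):
    """Copy every source record, optionally deleting existing destination data first."""
    if overwrite:
        destination.clear()
    destination.extend(dict(r) for r in source)
    return destination


def incremental_append(source, destination, last_ts):
    """Append only records modified after last_ts and return the new cursor value."""
    changed = sorted(
        (r for r in source if r["ts"] > last_ts),
        key=lambda r: (r["ts"], r["ref"]),  # same ordering as a (ts, ref) index
    )
    destination.extend(dict(r) for r in changed)
    return changed[-1]["ts"] if changed else last_ts


# Invented sample data, for illustration only.
source = [
    {"ref": "1", "ts": 100, "data": {"name": "a"}},
    {"ref": "2", "ts": 200, "data": {"name": "b"}},
]
destination = []
cursor = incremental_append(source, destination, last_ts=0)       # first run copies everything
source.append({"ref": "3", "ts": 300, "data": {"name": "c"}})
cursor = incremental_append(source, destination, last_ts=cursor)  # later run copies only ref 3
```

The returned cursor is what lets a later run skip unchanged records, which is why incremental sync needs a way to order documents by their last modified timestamp.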
Prerequisites
Create a destination database account
If you do not already have an account for the database associated with your destination connector, create an account and save the authentication credentials for setting up the destination connector to populate the destination database.
Install Docker
If you want to run Airbyte locally, you must install Docker. Follow the Install Docker Engine guide.
Step 1: Set up sync mode and authentication
Choose one of the following sync modes, depending on your requirements.
Full refresh sync mode
Follow these steps to fully sync the source and destination database.
-
Use the Fauna Dashboard or fauna-shell to create a role that can read the collection to be exported. For example:
CreateRole({
  name: "airbyte-readonly",
  privileges: [{ resource: Collection("COLLECTION_NAME"), actions: { read: true } }],
})
Replace COLLECTION_NAME with the collection name for this connector.
-
Create a key with the permissions of the role, using the name of the role you created. For example:
CreateKey({
  name: "airbyte-readonly",
  role: Role("airbyte-readonly"),
})
The response includes the secret:
{
  ref: Key("341909050919747665"),
  ts: 1662328730450000,
  role: Role("airbyte-readonly"),
  secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
  hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
}
Save the returned secret, which you need in Step 3: Set up the Fauna source. If you lose the secret, you must create a new key.
-
Skip to Step 2: Deploy and launch Airbyte.
Incremental append sync mode
Use incremental sync mode to periodically sync the source and destination, updating only new and changed data.
Follow these steps to set up incremental sync.
-
Use the Fauna Dashboard or fauna-shell to create an index, which lets the connector do incremental syncs. For example:
CreateIndex({
  name: "INDEX_NAME",
  source: Collection("COLLECTION_NAME"),
  terms: [],
  values: [ { "field": "ts" }, { "field": "ref" } ]
})
Replace INDEX_NAME with the name you configured for the Incremental Sync Index. Replace COLLECTION_NAME with the name of the collection configured for this connector.
Index values | Description |
---|---|
ts | Last modified timestamp. |
ref | Unique document identifier. |
-
Create a role that can read the collection, the index, and the index metadata. The connector reads the index metadata to validate the index settings. For example:
CreateRole({
  name: "airbyte-readonly",
  privileges: [
    { resource: Collection("COLLECTION_NAME"), actions: { read: true } },
    { resource: Index("INDEX_NAME"), actions: { read: true } },
    { resource: Indexes(), actions: { read: true } }
  ],
})
Replace COLLECTION_NAME with the name of the collection configured for this connector. Replace INDEX_NAME with the name that you configured for the Incremental Sync Index.
-
Create a key with the permissions of the role, using the name of the role you created. For example:
CreateKey({
  name: "airbyte-readonly",
  role: Role("airbyte-readonly"),
})
The response includes the secret:
{
  ref: Key("341909050919747665"),
  ts: 1662328730450000,
  role: Role("airbyte-readonly"),
  secret: "fnAEjXudojkeRWaz5lxL2wWuqHd8k690edbKNYZz",
  hashed_secret: "$2a$05$TGr5F3JzriWbRUXlKMlykerq1nnYzEUr4euwrbrLUcWgLhvWmnW6S"
}
Save the returned secret, which you need in Step 3: Set up the Fauna source. If you lose the secret, you must create a new key.
-
Continue with the next step.
The Fauna source iterates through all indexes on the database in the order listed. For each index it finds, incremental sync requires that the index match the setup shown above: the source is the collection configured for this connector, terms is empty, and values lists ts followed by ref.
If a check fails, that index is skipped. If no matching index is found during the initial setup, incremental sync isn’t available for the collection. No error is emitted because the connector doesn’t know whether you’re expecting an index for that collection. If you find that the collection doesn’t have incremental sync available, make sure that you followed the setup steps and that the source, terms, and values match your index.
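The index checks described above could be sketched as follows. This is an illustration based on the CreateIndex example in this section, not the connector's actual code: the dictionary layout of the index metadata and the function names `index_supports_incremental` and `find_incremental_index` are assumptions.

```python
def index_supports_incremental(index, collection_name):
    """Return True if the index metadata matches the shape of the CreateIndex example."""
    expected_values = [{"field": "ts"}, {"field": "ref"}]
    return (
        index.get("source") == collection_name           # one collection, not a list
        and index.get("terms", []) == []                  # no terms
        and index.get("values", []) == expected_values    # ts first, then ref
    )


def find_incremental_index(indexes, collection_name):
    """Return the first matching index in listed order, or None without raising an error."""
    for index in indexes:
        if index_supports_incremental(index, collection_name):
            return index
    return None


# Invented metadata, for illustration only.
indexes = [
    {"name": "by_name", "source": "orders", "terms": [{"field": "name"}], "values": []},
    {"name": "orders_sync", "source": "orders", "terms": [],
     "values": [{"field": "ts"}, {"field": "ref"}]},
]
match = find_incremental_index(indexes, "orders")
```

Returning `None` rather than raising mirrors the behavior described above, where a collection without a usable index simply has no incremental sync option.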
Step 2: Deploy and launch Airbyte
You can deploy and launch Airbyte locally using a Docker image or use Airbyte Cloud.
Deploy and launch locally
-
Refer to the Airbyte Local Deployment guide to install and deploy Airbyte locally. Enter the following commands to deploy the Airbyte server:
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
docker-compose up
-
When the Airbyte banner displays, launch the Airbyte dashboard at http://localhost:8000.
-
To log in, enter the default credentials found in the .env file of the cloned repository:
BASIC_AUTH_USERNAME=airbyte
BASIC_AUTH_PASSWORD=password
-
Choose the Connections menu item to start setting up your data source, destination, and connection between them.
-
In the Airbyte dashboard, click the + New connection button.
-
Skip to Step 3: Set up the Fauna source.
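If you prefer to read the default credentials programmatically instead of opening the .env file mentioned above, a small parser is enough. The sketch below assumes the simple KEY=VALUE layout shown in this section. The function name `read_env` is hypothetical, and a real .env file may use quoting or other syntax that this sketch does not handle.

```python
def read_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and lines that start with #."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# In practice, pass the contents of the .env file in the cloned repository.
sample = "BASIC_AUTH_USERNAME=airbyte\nBASIC_AUTH_PASSWORD=password\n"
creds = read_env(sample)
```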
Deploy and launch using Airbyte Cloud
-
Refer to the Getting Started with Airbyte Cloud guide for the fastest and most reliable way to run Airbyte.
-
If this is your first time, go to https://cloud.airbyte.com/signup, sign up for an Airbyte account, and click Create your first connection.
Otherwise, log in to Airbyte, and in the left navigation panel, choose Connections. Choose an existing connection to make changes.
To create a new connection, click the New connection button and continue with the next step.
Step 3: Set up the Fauna source
-
In the Source type dropdown, choose Fauna, which lists the configurable Fauna connector parameters.
If you previously set up a source, choose the source you want and click the Use existing source button.
A Setup Guide in the right-side panel gives detailed setup instructions.
-
Set the following required parameters:
Parameter | Description |
---|---|
Source name | Enter a descriptive name for this connection. The name is displayed in the Connections window connections list. |
Domain | Enter the main domain for Fauna, db.fauna.com. See Region Groups for more information. |
Port | Enter the default port number: 443. |
Scheme | Enter the scheme used to connect to Fauna: https. |
Fauna Secret | Enter the Fauna source database secret that you saved in Step 1: Set up sync mode and authentication. |
Page Size | The page size lets you control the memory size, which affects connector performance. |
Deletion Mode | The deletion mode lets you specify whether to ignore document deletions or flag documents as deleted, depending on your use case. Choose from the options listed below. |
Deletion Mode options:
-
Disabled: ignores document deletions.
-
Enabled: adds a date column with the date when you deleted the document. This maintains document history while letting the destination reconstruct deletion events.
-
After setting up the source, click the Set up source button.
The All connection tests passed! message indicates that you successfully connected to the Fauna source, minimally confirming:
-
The secret is valid.
-
The collection exists.
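Before entering the parameters from this step, it can help to collect them in one place and sanity-check them. The sketch below is only an illustration: the key names (`domain`, `port`, `scheme`, `secret`, `page_size`, `deletion_mode`), the example page size, the `FAUNA_SECRET` environment variable, and the `validate_source_config` helper are assumptions for this example, not the connector's confirmed configuration schema.

```python
import os


def validate_source_config(config):
    """Return a list of problems found in a hypothetical Fauna source configuration."""
    problems = []
    if not config.get("domain"):
        problems.append("domain is missing")
    port = config.get("port")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("port must be an integer from 1 to 65535")
    if config.get("scheme") not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not config.get("secret"):
        problems.append("secret is missing")
    page_size = config.get("page_size")
    if not isinstance(page_size, int) or page_size <= 0:
        problems.append("page_size must be a positive integer")
    if config.get("deletion_mode") not in ("disabled", "enabled"):
        problems.append("deletion_mode must be disabled or enabled")
    return problems


config = {
    "domain": "db.fauna.com",
    "port": 443,
    "scheme": "https",
    # Read the secret from the environment rather than hard-coding it.
    "secret": os.environ.get("FAUNA_SECRET", ""),  # hypothetical variable name
    "page_size": 64,                               # example value only
    "deletion_mode": "disabled",
}
problems = validate_source_config(config)
```

Keeping the secret out of source files matters here because, as noted in Step 1, a lost or exposed secret can only be replaced by creating a new key.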
Step 4: Set up the destination
-
If you previously set up a destination, click the Use existing destination button to select and use that destination. Otherwise, choose the Destination type.
-
Destination connector configuration parameters differ according to the destination type. Populate the Set up the destination fields according to the connector requirements, including authentication information. A Setup Guide in the right-side panel gives you detailed setup instructions.
-
When you are done entering the required parameters, click the Set up destination button and wait for the destination test to complete successfully.
Step 5: Set up the connection
In the New connection window, accept the default settings or make the changes you want for syncing the source and destination.
-
Enter a descriptive name for the connection in the Connection name field.
-
Choose a Replication frequency, which is the data sync period.
Choose Manual to sync the data manually.
-
In the Destination Namespace field, click the Edit button to choose a destination namespace where the data is stored. Options include:
Option | Description |
---|---|
Destination default | Replicate and store in the default namespace defined in the destination settings. |
Mirror source structure | Sets the name in the destination database to the name used for the Fauna source. |
Custom format | Create a custom format to rename the namespace that your data is replicated into, such as prefixing the database name with a string. |
Click the Apply button when you are done.
-
In the Non-breaking schema updates detected field, choose Ignore or Disable connection to set how Airbyte handles syncs when it detects a non-breaking schema change in the source.
-
In the Activate the streams you want to sync section, select the rows for the sources you want to sync.
-
Click anywhere on a row to choose the fields you want to sync:
Field | Description |
---|---|
data | Collection data. |
ref | Unique document identifier. |
ts | Data timestamp. |
ttl | Time-to-live interval. The document is deleted if it isn’t modified in the ttl time interval. The default value is null, meaning that ttl is not used. After the document is deleted, it is not displayed in temporal queries, and the connector does not emit a deleted_at row. |
-
Select ref as the Primary key, which uniquely identifies the document in the collection.
-
In Sync mode, click the options to choose the combination that defines the source sync behavior:
Sync mode | Description |
---|---|
Full refresh \| Overwrite | When the connector runs, it copies all of the Fauna source data and overwrites the destination data. |
Full refresh \| Append | When the connector runs, it copies all of the Fauna source data and appends it to the destination data. |
Incremental \| Append | When the connector runs, it copies all of the Fauna source data that changed since the last run. |
Incremental \| Deduped + history | When the connector runs, it copies all of the Fauna source data that changed since the last run. Next, in the destination database, it sets up a view that shows the most recent version according to a user-defined primary key. |
Fewer than four options indicates that the index is set up incorrectly. See Step 1: Set up sync mode and authentication.
The first run of a new incremental sync copies the full database, the same as a full refresh sync.
-
Choose the Normalization data format:
Data format | Description |
---|---|
Raw data (JSON) | Put all the source data in a single column. |
Normalized tabular data | Put the ref, ts, ttl, and data fields in separate columns. |
-
Click the Set up connection button.
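The two normalization formats chosen in this step can be pictured with a short sketch. The record below is invented sample data, and the column name `raw` and the function names are assumptions for illustration. The actual column names depend on the destination.

```python
import json


def to_raw(record):
    """Raw data (JSON): store the whole record as JSON text in a single column."""
    return {"raw": json.dumps(record, sort_keys=True)}


def to_tabular(record):
    """Normalized tabular data: one column each for ref, ts, ttl, and data."""
    return {field: record.get(field) for field in ("ref", "ts", "ttl", "data")}


# Invented sample record, for illustration only.
record = {
    "ref": "341909050919747665",
    "ts": 1662328730450000,
    "ttl": None,
    "data": {"name": "example"},
}
raw_row = to_raw(record)
tabular_row = to_tabular(record)
```

The raw form keeps the document intact for later processing, while the tabular form makes ref and ts directly queryable in the destination.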
Step 6: Sync the data
On the Connection page of the created connection, click the Sync now button if the sync hasn’t already started.
The time to run a sync varies. Sync History displays the current status. When the sync completes, the status changes from Running to Succeeded and shows:
-
The number of bytes transferred.
-
The number of records emitted and committed.
-
The sync duration.
Click the Cancel Sync button to cancel a sync in progress.
Step 7: Verify the integration
When the sync completes, click Sync Succeeded to view the Sync History.
Confirm that the database has transferred successfully by opening and viewing the destination database.