Run DynamoDB Connector with the CLI
Use the 'metadata' CLI to run a one-time ingestion
Configure and schedule DynamoDB metadata workflows using the metadata CLI.

Requirements

Follow this guide to learn how to set up Airflow to run the metadata ingestions.

Python requirements

To run the DynamoDB ingestion, you will need to install:
pip3 install 'openmetadata-ingestion[dynamodb]'

Metadata Ingestion

All connectors are now defined as JSON Schemas. Here you can find the structure to create a connection to DynamoDB.
In order to create and run a Metadata Ingestion workflow, we will follow the steps to create a JSON configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.
The workflow is modelled around the following JSON Schema.

1. Define the YAML Config

This is a sample config for DynamoDB:
source:
  type: dynamodb
  serviceName: local_dynamodb
  serviceConnection:
    config:
      type: DynamoDB
      awsConfig:
        awsAccessKeyId: aws_access_key_id
        awsSecretAccessKey: aws_secret_access_key
        awsRegion: aws region
        endPointURL: https://dynamodb.<region_name>.amazonaws.com
      database: custom_database_name
  sourceConfig:
    config:
      enableDataProfiler: false
      tableFilterPattern:
        includes:
          - ''
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth
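
Before running the workflow, it can help to confirm that the file defines the three top-level sections used in the sample config. The following is a minimal sketch and not part of the OpenMetadata tooling: the helper name `missing_sections` is hypothetical, and the config is written as a Python dict mirroring the sample so that the snippet needs no third-party YAML parser.

```python
# Hypothetical helper: reports which top-level workflow sections are absent.
REQUIRED_SECTIONS = ("source", "sink", "workflowConfig")


def missing_sections(config: dict) -> list:
    """Return the required top-level keys that the config does not define."""
    return [key for key in REQUIRED_SECTIONS if key not in config]


# The sample DynamoDB config, expressed as a dict (placeholder values kept).
sample_config = {
    "source": {
        "type": "dynamodb",
        "serviceName": "local_dynamodb",
        "serviceConnection": {
            "config": {
                "type": "DynamoDB",
                "awsConfig": {
                    "awsAccessKeyId": "aws_access_key_id",
                    "awsSecretAccessKey": "aws_secret_access_key",
                    "awsRegion": "aws region",
                    "endPointURL": "https://dynamodb.<region_name>.amazonaws.com",
                },
                "database": "custom_database_name",
            }
        },
        "sourceConfig": {"config": {"enableDataProfiler": False}},
    },
    "sink": {"type": "metadata-rest", "config": {}},
    "workflowConfig": {
        "openMetadataServerConfig": {
            "hostPort": "http://localhost:8585/api",
            "authProvider": "no-auth",
        }
    },
}

# An empty list means all three sections are present.
print(missing_sections(sample_config))
```

This only checks the outer shape; the connector-specific fields are still validated by the ingestion framework itself when the workflow runs.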

Source Configuration - Source Config

The sourceConfig is defined here.
  • enableDataProfiler: DynamoDB does not provide query capabilities, so the profiler is not supported.
  • markDeletedTables: To flag tables as soft-deleted if they are not present anymore in the source system.
  • includeTables: true or false, to ingest table data. Default is true.
  • includeViews: true or false, to ingest view definitions.
  • generateSampleData: DynamoDB does not provide query capabilities, so sample data is not supported.
  • sampleDataQuery: Defaults to select * from {}.{} limit 50.
  • schemaFilterPattern and tableFilterPattern: Both accept regular expressions that select which schemas or tables to include or exclude. For example:
tableFilterPattern:
  includes:
    - users
    - type_test
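
To make the effect of regex filters concrete, here is a minimal sketch of include/exclude filtering. It is not OpenMetadata's actual implementation: the function `is_table_included` is hypothetical, and it assumes that each pattern is matched from the start of the table name (Python's `re.match`) and that exclusions take precedence.

```python
import re


def is_table_included(name, includes=None, excludes=None):
    """Sketch of include/exclude filtering; not OpenMetadata's actual code."""
    # Assumption: patterns are matched from the start of the name (re.match).
    if excludes and any(re.match(pattern, name) for pattern in excludes):
        return False
    if includes:
        return any(re.match(pattern, name) for pattern in includes)
    # With no include list, everything not excluded is kept.
    return True


tables = ["users", "users_archive", "type_test", "orders"]
kept = [t for t in tables if is_table_included(t, includes=["users", "type_test"])]
print(kept)
```

Under these assumptions the pattern `users` also keeps `users_archive`, because the regex only needs to match a prefix. Anchoring the pattern as `^users$` would restrict it to the exact table name.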

Sink Configuration

To send the metadata to OpenMetadata, the sink must be specified as type: metadata-rest.

Workflow Configuration

The main property here is the openMetadataServerConfig, where you can define the host and security provider of your OpenMetadata installation.
For a simple, local installation using our docker containers, this looks like:
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: no-auth

OpenMetadata Security Providers

We support different security providers. You can find their definitions here. An example of an Auth0 configuration would be the following:
workflowConfig:
  openMetadataServerConfig:
    hostPort: http://localhost:8585/api
    authProvider: auth0
    securityConfig:
      clientId: <client ID>
      secretKey: <secret key>
      domain: <domain>
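
If you prefer not to write the client ID and secret key into a file, one option is to assemble this section from environment variables before saving the config. The sketch below is only an illustration: the function `auth0_workflow_config` and the variable names `OM_HOST_PORT`, `OM_AUTH0_CLIENT_ID`, `OM_AUTH0_SECRET_KEY`, and `OM_AUTH0_DOMAIN` are hypothetical and are not defined by OpenMetadata.

```python
import os


def auth0_workflow_config(env=os.environ):
    """Build the workflowConfig section, reading Auth0 values from env vars."""
    # Hypothetical variable names; use whatever your deployment defines.
    return {
        "workflowConfig": {
            "openMetadataServerConfig": {
                # Falls back to the local docker address used in this guide.
                "hostPort": env.get("OM_HOST_PORT", "http://localhost:8585/api"),
                "authProvider": "auth0",
                "securityConfig": {
                    # A missing variable raises KeyError instead of
                    # silently producing an incomplete config.
                    "clientId": env["OM_AUTH0_CLIENT_ID"],
                    "secretKey": env["OM_AUTH0_SECRET_KEY"],
                    "domain": env["OM_AUTH0_DOMAIN"],
                },
            }
        }
    }
```

The resulting dict has the same shape as the Auth0 example and can be merged with the source and sink sections before the file is written.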

2. Run with the CLI

First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:
metadata ingest -c <path-to-yaml>
Note that this recipe is the same for every connector. By updating the configuration file, you will be able to extract metadata from different sources.
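
If the ingestion needs to be launched from a script rather than typed by hand, the same CLI call can be wrapped with Python's standard `subprocess` module. This is a minimal sketch: the helper names `ingest_command` and `run_ingestion` are hypothetical, and it assumes the `metadata` executable is on the PATH and exits with a non-zero status when the run fails.

```python
import subprocess


def ingest_command(config_path):
    """Return the argument list for `metadata ingest -c <path-to-yaml>`."""
    return ["metadata", "ingest", "-c", str(config_path)]


def run_ingestion(config_path):
    """Run the ingestion; raises CalledProcessError on a non-zero exit code."""
    # Assumption: the CLI signals failure through its exit status.
    subprocess.run(ingest_command(config_path), check=True)
```

Calling `run_ingestion("dynamodb.yaml")` would then execute the workflow defined in that file, where `dynamodb.yaml` is a placeholder for wherever you saved the config.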