Amundsen

Feature Status
Table Metadata
Table Owner
Classifications/Tags
Dashboard & Chart Metadata

If you don't want to use the OpenMetadata Ingestion container to configure the workflows via the UI, you can check the following docs to connect using the Airflow SDK or the CLI.

On this page, you will learn how to use the metadata CLI to run a one-time ingestion.

Make sure you are running openmetadata-ingestion version 0.11.0 or above.

To create a database service, follow these steps:

The first step is ingesting the metadata from your sources. Under Settings, you will find a Services link for connecting an external source system to OpenMetadata. Once a service is created, it can be used to configure metadata, usage, and profiler workflows.

To visit the Services page, select Services from the Settings menu.


Find the Metadata option on the left panel of the Settings page

Click on the 'Add New Service' button to start the Service creation.


Add a new Service from the Metadata Services page

Select the service type from those available on the page.


Select your service from the list

Provide a name and description for your service as illustrated below.

OpenMetadata uniquely identifies services by their Service Name. Provide a name that distinguishes your deployment from other services, including any other Amundsen services that you might be ingesting metadata from.


Provide a Name and description for your Service

In this step, we will configure the connection settings required for this connector. Please follow the instructions below to ensure that you've configured the connector to read from your Amundsen service as desired.

Configure the service connection by filling in the form

All connectors are now defined as JSON Schemas. Here you can find the structure to create a connection to Amundsen.

In order to create and run a Metadata Ingestion workflow, we will follow the steps to create a YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following JSON Schema.

This is a sample config for Amundsen:

username: Enter the username of your Amundsen user in the Username field. The specified user should be authorized to read all databases you want to include in the metadata ingestion workflow.

password: Enter the password for your Amundsen user in the Password field.

hostPort: Host and port of the Amundsen Neo4j Connection. This expects a URI such as bolt://localhost:7687.

maxConnectionLifeTime (optional): Maximum connection lifetime for the Amundsen Neo4j Connection.

validateSSL (optional): Enable SSL validation for the Amundsen Neo4j Connection.

encrypted (optional): Enable encryption for the Amundsen Neo4j Connection.
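
Put together, the connection fields described above could be laid out in the source section of the YAML roughly as follows. This is a minimal sketch rather than the official sample: the service name local_amundsen, the placeholder credentials, and the values chosen for the optional fields are assumptions for illustration, and the exact nesting should be checked against the JSON Schema for your openmetadata-ingestion version.

```yaml
source:
  type: amundsen
  # Hypothetical service name; it must be unique within OpenMetadata.
  serviceName: local_amundsen
  serviceConnection:
    config:
      type: Amundsen
      # Placeholder credentials; replace with your Amundsen Neo4j user.
      username: <username>
      password: <password>
      # URI of the Amundsen Neo4j connection.
      hostPort: bolt://localhost:7687
      # Optional fields; the values below are illustrative only.
      maxConnectionLifeTime: 50
      validateSSL: false
      encrypted: false
  sourceConfig:
    config: {}
```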

To send the metadata to OpenMetadata, the sink needs to be specified as type: metadata-rest.
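
In the YAML, this could look like the fragment below. It is a sketch that assumes the sink needs no further options, hence the empty config.

```yaml
sink:
  # Sends the extracted metadata to the OpenMetadata REST API.
  type: metadata-rest
  config: {}
```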

The main property here is the openMetadataServerConfig, where you can define the host and security provider of your OpenMetadata installation.

For a simple, local installation using our docker containers, this looks like:

filename.yaml
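
As a rough sketch of what this section might contain for a local Docker deployment, assuming the server is reachable at http://localhost:8585/api and that ingestion authenticates with a bot JWT token (both are assumptions, and {bot_jwt_token} is a placeholder to replace with your own token):

```yaml
workflowConfig:
  openMetadataServerConfig:
    # Assumed address of a local OpenMetadata server.
    hostPort: http://localhost:8585/api
    # Assumed security provider; adjust to match your installation.
    authProvider: openmetadata
    securityConfig:
      # Placeholder; paste the JWT token of your ingestion bot here.
      jwtToken: "{bot_jwt_token}"
```

Other security providers use a different authProvider value and their own securityConfig fields, so this fragment only applies to the JWT case.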

We support different security providers. You can find their definitions here.

  • JWT tokens allow your clients to authenticate against the OpenMetadata server. You can find more details on enabling JWT tokens here.
  • You can refer to the JWT Troubleshooting section for any issues in your JWT configuration. If you need information on configuring the ingestion with other security providers in your bots, you can follow this doc.

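First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run the workflow with the metadata CLI, passing the path to the saved file:

metadata ingest -c <path-to-yaml>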

Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will be able to extract metadata from different sources.