Atlas

In this page, you will learn how to use the metadata CLI to run a one-time ingestion.

Make sure you are running openmetadata-ingestion version 0.11.0 or above.
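
If needed, you can install or upgrade the package and check the installed version with pip (a minimal sketch; adapt to your environment):

pip install "openmetadata-ingestion>=0.11.0"
pip show openmetadata-ingestion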

Before ingesting metadata from Atlas, you need to create the corresponding database services in OpenMetadata. These services must have the same names as they do in the source.

To create a database service, follow these steps:

The first step is ingesting the metadata from your sources. Under Settings, you will find a Services page to link an external source system to OpenMetadata. Once a service is created, it can be used to configure metadata, usage, and profiler workflows. To visit the Services page, select Services from the Settings menu.


Navigate to Settings >> Services

Click on the Add New Service button to start the Service creation.


Add a New Service from the Database Services Page

Select the service type available in Atlas and create a service. In this example, we will create a service for Hive.


Note: Adding ingestion in this step is optional, because we will fetch the metadata from Atlas. After creating all the database services, the Services page looks like the one below, and we are ready to start the Atlas ingestion via the CLI.


All connectors are now defined as JSON Schemas. Here you can find the structure to create a connection to Atlas.

In order to create and run a Metadata Ingestion workflow, we will follow the steps to create a YAML configuration able to connect to the source, process the Entities if needed, and reach the OpenMetadata server.

The workflow is modeled around the following JSON Schema.

This is a sample config for Atlas:

source:
  type: Atlas
  serviceName: local_atlas
  serviceConnection:
    config:
      type: Atlas
      atlasHost: http://192.168.1.8:21000
      username: admin
      password: admin
      dbService: hive
      messagingService: kafka
      serviceType: Hive
      hostPort: localhost:10000
      entityTypes: examples/workflows/atlas_mapping.yaml
  sourceConfig:
    config:
      type: DatabaseMetadata
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: <OpenMetadata host and port>
    authProvider: <OpenMetadata auth provider>

This is a sample config for Atlas mapping. It defines the keys used to map the database and topic names in Atlas to the corresponding OpenMetadata services. File name: atlas_mapping.yaml

Table:
  rdbms_table:
    db: rdbms_db
    column: rdbms_column
Topic:
  - kafka_topic
  - kafka_topic_2

You can find all the definitions and types for the serviceConnection here.

  • username: Username to connect to Atlas. This user should have privileges to read all the metadata in Atlas.
  • password: Password to connect to Atlas.
  • hostPort: Host and port of the data source.
  • entityTypes: Entity types of the data source.
  • serviceType: Service type of the data source.
  • atlasHost: Atlas host of the data source.
  • dbService: Source database of the data source (the database service that you created from the UI, e.g., hive).
  • messagingService (Optional): Messaging service source of the data source.
  • database (Optional): Database of the data source. Use this parameter if you would like to restrict the metadata reading to a single database. When left blank, OpenMetadata ingestion attempts to scan all the databases in Atlas.

To send the metadata to OpenMetadata, it needs to be specified as "type": "metadata-rest".

The main property here is the openMetadataServerConfig, where you can define the host and security provider of your OpenMetadata installation. For a simple, local installation using our docker containers, this looks like:

workflowConfig:
  openMetadataServerConfig:
    hostPort: "http://localhost:8585/api"
    authProvider: openmetadata
    securityConfig:
      jwtToken: "{bot_jwt_token}"
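
If your local installation runs without any security enabled, the auth provider can instead be set to no-auth (a sketch, assuming a 0.11-era default Docker setup; adjust to your deployment):

workflowConfig:
  openMetadataServerConfig:
    hostPort: "http://localhost:8585/api"
    authProvider: no-auth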

First, we will need to save the YAML file. Afterward, and with all requirements installed, we can run:

metadata ingest -c <path-to-yaml>

Note that from connector to connector, this recipe will always be the same. By updating the YAML configuration, you will be able to extract metadata from different sources.
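
If you prefer to run the same workflow from Python, for example to schedule it with an orchestrator such as Airflow, a minimal sketch along these lines should work (assuming the YAML above is saved as atlas.yaml; the file name is illustrative):

import yaml

from metadata.ingestion.api.workflow import Workflow

# Load the same YAML configuration used by the CLI
with open("atlas.yaml") as f:
    config = yaml.safe_load(f)

# Create and run the metadata ingestion workflow,
# then surface any failures and print a status summary
workflow = Workflow.create(config)
workflow.execute()
workflow.raise_from_status()
workflow.print_status()
workflow.stop()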

Still have questions?

You can take a look at our Q&A or reach out to us in Slack.
