OpenMetadata
Ingest Metadata in Production
Use this procedure if you already have a production Airflow instance on which you want to schedule OpenMetadata ingestion workflows.

1. Create a configuration file for your connector

See the connector documentation for instructions on how to create a configuration file for the service you would like to integrate with OpenMetadata.
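
Before wiring the configuration into a DAG, it can help to confirm that the file parses as JSON. The sketch below is a minimal check under that assumption; it is not part of OpenMetadata, and the helper name `load_connector_config` is hypothetical.

```python
import json
from pathlib import Path


def load_connector_config(path):
    """Read a connector configuration file and return it as a dict.

    Hypothetical helper for illustration; not part of OpenMetadata.
    """
    text = Path(path).read_text(encoding="utf-8")
    config = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")
    return config
```

Calling `load_connector_config` on the file you created fails early with a parse error if the JSON is malformed, which is easier to diagnose than a failure inside an Airflow task.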

2. Edit a Python script to define your ingestion DAG

Copy and paste the code below into a file called openmetadata-airflow.py.
import json
from datetime import timedelta

from airflow import DAG

# Airflow 2 import path, with a fallback to the Airflow 1.10 module name.
try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from airflow.utils.dates import days_ago

from metadata.ingestion.api.workflow import Workflow

# Defaults applied to every task in the DAG.
default_args = {
    "owner": "user_name",
    "email": ["[email protected]"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(seconds=10),
    "execution_timeout": timedelta(minutes=60),
}

config = """
## REPLACE THIS LINE WITH YOUR CONFIGURATION JSON
"""


def metadata_ingestion_workflow():
    # Parse the embedded JSON and run the ingestion workflow end to end.
    workflow_config = json.loads(config)
    workflow = Workflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    workflow.print_status()
    workflow.stop()


with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
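
In the script above, `workflow.stop()` is skipped whenever `workflow.raise_from_status()` raises. One possible variation, shown here as a sketch rather than the documented approach, wraps the calls in `try`/`finally` so that `stop()` always runs. `run_workflow` and `FakeWorkflow` are hypothetical names; `FakeWorkflow` is only a stand-in that lets the example run without OpenMetadata installed.

```python
def run_workflow(workflow):
    """Run an ingestion workflow and always call stop() afterwards.

    `workflow` is any object exposing the execute, raise_from_status,
    print_status, and stop methods used in the ingestion script.
    """
    try:
        workflow.execute()
        workflow.raise_from_status()
        workflow.print_status()
    finally:
        # Runs even when a step above raises.
        workflow.stop()


class FakeWorkflow:
    """Stand-in used only to exercise run_workflow (not an OpenMetadata class)."""

    def __init__(self, fail=False):
        self.fail = fail
        self.calls = []

    def execute(self):
        self.calls.append("execute")

    def raise_from_status(self):
        self.calls.append("raise_from_status")
        if self.fail:
            raise RuntimeError("ingestion reported failures")

    def print_status(self):
        self.calls.append("print_status")

    def stop(self):
        self.calls.append("stop")
```

With this shape, a failed run still reaches `stop()`, and the exception propagates to Airflow so the task's retry settings apply as before.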

3. Copy your configuration JSON into the ingestion script

In step 1 above, you created a JSON file containing the configuration for your ingestion connector. Copy that JSON into the openmetadata-airflow.py file that you created in step 2, replacing the placeholder line marked by the comment below.
config = """
## REPLACE THIS LINE WITH YOUR CONFIGURATION JSON
"""
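
Because the script passes `config` to `json.loads`, the string must contain only JSON once your configuration is pasted in; the `##` placeholder line is not valid JSON and has to be replaced rather than kept alongside it. The sketch below illustrates this. The `pasted` value is made up for the example and is not a real connector configuration, and `is_valid_json` is a hypothetical helper.

```python
import json

# The placeholder exactly as it appears in the script before editing.
placeholder = """
## REPLACE THIS LINE WITH YOUR CONFIGURATION JSON
"""

# Illustrative values only; use the JSON from your own connector file.
pasted = """
{
  "source": {"type": "example-connector"}
}
"""


def is_valid_json(text):
    """Return True when `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Here `is_valid_json(placeholder)` is `False` and `is_valid_json(pasted)` is `True`, so an unedited script fails at `json.loads` when the task runs.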

4. Run the script to create your ingestion DAG

Run the following command to create your ingestion DAG in Airflow.
python openmetadata-airflow.py
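
As an optional sanity check, the DAG ID declared in the script can be read statically, without importing Airflow. The sketch below uses Python's standard `ast` module; `find_dag_ids` is a hypothetical helper, and it only inspects the source text, so it does not confirm that Airflow has registered the DAG.

```python
import ast


def find_dag_ids(source):
    """Return the string IDs passed as the first argument to DAG(...) calls."""
    dag_ids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Handle both `DAG(...)` and `module.DAG(...)` call styles.
        if isinstance(func, ast.Name):
            name = func.id
        else:
            name = getattr(func, "attr", None)
        if name != "DAG" or not node.args:
            continue
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            dag_ids.append(first.value)
    return dag_ids
```

For the script from step 2, passing the file contents to `find_dag_ids` returns a list containing `"sample_data"`, which is the ID to look for in the Airflow UI.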