Run the ingestion from your Airflow
OpenMetadata integrates with Airflow to orchestrate ingestion workflows. You can use Airflow to extract metadata and [deploy workflows](/deployment/ingestion/openmetadata) directly. This guide explains how to run ingestion workflows in Airflow using three different operators: the Python Operator, the Docker Operator, and the Python Virtualenv Operator.
Using the Python Operator
Prerequisites
Install the `openmetadata-ingestion` package in your Airflow environment. This approach works best if you have access to the Airflow host and can manage dependencies.
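Installation Command: `pip3 install openmetadata-ingestion`, adding the connector extra for your source and pinning the version that matches your OpenMetadata server.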
Example
Example DAG
Key Notes
- Function Setup: The `python_callable` argument in the `PythonOperator` executes the `metadata_ingestion_workflow` function, which instantiates the workflow and runs the ingestion process.
- Drawback: This method requires pre-installed dependencies, which may not always be feasible. Consider using the DockerOperator or PythonVirtualenvOperator as alternatives.
Next Steps
Docker & Virtualenv Operators
Run ingestion using the Docker Operator or Python Virtualenv Operator for isolated execution that does not require installing the ingestion dependencies in the Airflow environment itself.