# Run the ingestion from your Airflow

> Deploy ingestion externally using Airflow for scalable orchestration of metadata pipelines across environments.

<Tip>
  This page is about running the Ingestion Framework **externally**!

  There are two main ways of running the ingestion:

  1. Internally, by managing the workflows from OpenMetadata.
  2. Externally, by using any other tool capable of running Python code.

  To manage the ingestion process from OpenMetadata instead, see
  [this guide](/deployment/ingestion/openmetadata).
</Tip>

OpenMetadata integrates with Airflow to orchestrate ingestion workflows. You can use Airflow to [extract metadata](/v1.12.x/connectors/pipeline/airflow) and to [deploy workflows](/deployment/ingestion/openmetadata) directly. This guide explains how to run ingestion workflows in Airflow using three different operators:

1. [Python Operator](#using-the-python-operator)
2. [Docker Operator](/v1.12.x/deployment/ingestion/external/airflow-docker-virtualenv)
3. [Python Virtualenv Operator](/v1.12.x/deployment/ingestion/external/airflow-docker-virtualenv)

## Using the Python Operator

### Prerequisites

Install the `openmetadata-ingestion` package in your Airflow environment. This approach works best if you have access to the Airflow host and can manage dependencies.

#### Installation Command:

```
pip3 install "openmetadata-ingestion[<plugin>]==x.y.z"
```

* Replace [\<plugin>](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/setup.py) with the sources to ingest, such as `mysql`, `snowflake`, or `s3`.
* Replace `x.y.z` with the OpenMetadata version matching your server (e.g., `1.6.1`).

### Example

```
pip3 install "openmetadata-ingestion[mysql,snowflake,s3]==1.6.1"
```
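
If the install step is scripted, for example to keep the ingestion package pinned to the server version, the requirement string can be assembled programmatically. The following is a minimal sketch; `build_requirement` is a hypothetical helper written for this page, not part of OpenMetadata.

```python
def build_requirement(plugins, version):
    """Return a pip requirement such as 'openmetadata-ingestion[mysql,s3]==1.6.1'.

    Hypothetical helper for illustration; not part of the OpenMetadata API.
    """
    if not version:
        raise ValueError("version must match your OpenMetadata server, e.g. '1.6.1'")
    # Extras are optional: without them pip installs only the base package.
    extras = f"[{','.join(plugins)}]" if plugins else ""
    return f"openmetadata-ingestion{extras}=={version}"


if __name__ == "__main__":
    # Quote the result when passing it to a shell, since brackets are glob characters.
    print(build_requirement(["mysql", "snowflake", "s3"], "1.6.1"))
```

Keeping the version in one place makes it easier to upgrade the ingestion package together with the server.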

### Example DAG

```python theme={null}
from datetime import timedelta

import yaml
from airflow import DAG
from airflow.utils.dates import days_ago

# PythonOperator moved modules between Airflow releases; try the newer path first.
try:
    from airflow.operators.python import PythonOperator
except ModuleNotFoundError:
    from airflow.operators.python_operator import PythonOperator

from metadata.workflow.metadata import MetadataWorkflow

default_args = {
    "owner": "user_name",
    "email": ["username@org.com"],
    "email_on_failure": False,
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "execution_timeout": timedelta(minutes=60)
}

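# Paste the ingestion YAML for your source here.
config = """
<your YAML configuration>
"""

def metadata_ingestion_workflow():
    # Parse the YAML into a dict and build the workflow from it.
    workflow_config = yaml.safe_load(config)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    # Fail the Airflow task if the workflow reported errors.
    workflow.raise_from_status()
    workflow.print_status()
    workflow.stop()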

with DAG(
    "sample_data",
    default_args=default_args,
    description="An example DAG which runs an OpenMetadata ingestion workflow",
    start_date=days_ago(1),
    is_paused_upon_creation=False,
    schedule_interval='*/5 * * * *',
    catchup=False,
) as dag:
    ingest_task = PythonOperator(
        task_id="ingest_using_recipe",
        python_callable=metadata_ingestion_workflow,
    )
```
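
The `config` string in the DAG is a placeholder. As a rough sketch of what it might contain, the dictionary below mirrors the general layout of an ingestion configuration (`source`, `sink`, `workflowConfig`) for a MySQL service. Every value here (service name, hosts, credentials, token) is an assumption for illustration, and the exact connection fields depend on your connector and version, so check the connector's own page before using it. Because `yaml.safe_load` returns a plain dictionary, the same structure can be sanity-checked before it is handed to the workflow; `missing_sections` is a hypothetical helper, and the list of sections it checks is an assumption rather than an official schema.

```python
# Top-level sections this sketch expects; an assumption, not an official schema.
REQUIRED_SECTIONS = ("source", "sink", "workflowConfig")

# Illustrative values only; replace them with your connector's documented fields.
example_config = {
    "source": {
        "type": "mysql",
        "serviceName": "local_mysql",  # hypothetical service name
        "serviceConnection": {
            "config": {
                "type": "Mysql",
                "username": "openmetadata_user",
                "authType": {"password": "<password>"},
                "hostPort": "localhost:3306",
            }
        },
        "sourceConfig": {"config": {"type": "DatabaseMetadata"}},
    },
    "sink": {"type": "metadata-rest", "config": {}},
    "workflowConfig": {
        "openMetadataServerConfig": {
            "hostPort": "http://localhost:8585/api",
            "authProvider": "openmetadata",
            "securityConfig": {"jwtToken": "<jwt-token>"},
        }
    },
}


def missing_sections(workflow_config):
    """Return the expected top-level sections absent from the config."""
    return [key for key in REQUIRED_SECTIONS if key not in workflow_config]


if __name__ == "__main__":
    print(missing_sections(example_config))
```

A check like this only catches structural omissions early; the workflow itself still validates the full configuration when it is created.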

### Key Notes

* **Function Setup**: The `python_callable` argument in the `PythonOperator` executes the `metadata_ingestion_workflow` function, which instantiates the workflow and runs the ingestion process.
* **Drawback**: This method requires pre-installed dependencies, which may not always be feasible. Consider using the **DockerOperator** or **PythonVirtualenvOperator** as alternatives.
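
In the example DAG, `workflow.stop()` is skipped if `execute()` or `raise_from_status()` raises. One possible variation, shown here as a sketch rather than the documented pattern, wraps the calls in `try`/`finally` so that cleanup always runs. `run_workflow` and `RecordingWorkflow` are hypothetical names; the stand-in class only mimics the four methods used in the DAG so the snippet runs without OpenMetadata installed.

```python
def run_workflow(workflow):
    """Run a workflow object and always release its resources."""
    try:
        workflow.execute()
        workflow.raise_from_status()  # raises if the run reported failures
        workflow.print_status()
    finally:
        # Runs on both success and failure.
        workflow.stop()


class RecordingWorkflow:
    """Hypothetical stand-in that records which methods were called."""

    def __init__(self, fail=False):
        self.fail = fail
        self.calls = []

    def execute(self):
        self.calls.append("execute")

    def raise_from_status(self):
        self.calls.append("raise_from_status")
        if self.fail:
            raise RuntimeError("ingestion failed")

    def print_status(self):
        self.calls.append("print_status")

    def stop(self):
        self.calls.append("stop")


if __name__ == "__main__":
    demo = RecordingWorkflow()
    run_workflow(demo)
    print(demo.calls)
```

In a real DAG, the `python_callable` would build the workflow with `MetadataWorkflow.create(...)` and pass it to a function like this; the exception still propagates, so Airflow marks the task as failed and applies its retry settings.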

## Next Steps

<Card title="Docker & Virtualenv Operators" href="/v1.12.x/deployment/ingestion/external/airflow-docker-virtualenv">
  Run ingestion using the Docker Operator or Python Virtualenv Operator for isolated execution that does not require pre-installed dependencies in your Airflow environment.
</Card>
