# How to Deploy a Lineage Workflow

> Build data lineage using workflows to extract upstream and downstream dependencies.

export const connector_0 = "bigquery"

Lineage data can be ingested from your data sources directly from the OpenMetadata UI. Currently, the lineage workflow is supported for a limited set of connectors, such as [BigQuery](/v1.12.x/connectors/database/bigquery), [Snowflake](/v1.12.x/connectors/database/snowflake), [MSSQL](/v1.12.x/connectors/database/mssql), [Redshift](/v1.12.x/connectors/database/redshift), [Clickhouse](/v1.12.x/connectors/database/clickhouse), [PostgreSQL](/v1.12.x/connectors/database/postgres), and [Databricks](/v1.12.x/connectors/database/databricks).

<Tip>
  **Tip:** Trace the upstream and downstream dependencies with Lineage.
</Tip>

## View Lineage from Metadata Ingestion

Once the metadata ingestion has run successfully and the service Entities can be explored, we can add view lineage information for the data assets. This populates the Lineage tab on the data asset page. During the Metadata Ingestion workflow, we identify whether a Table is a View. For sources where we can obtain the query that generates the View, we bring in the view lineage along with the metadata. After all Tables have been ingested, the workflow parses the queries that generate the Views. During query parsing, we obtain the source and target tables, check whether those Tables exist in OpenMetadata, and finally create the lineage relationship between the involved Entities.

If the database has views, the view lineage is generated automatically, along with the column-level lineage. In that case, the table type is **View**, as shown in the example below.

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/view.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=9c72db3d1c643658480b13e9aa1b2193" alt="View Lineage through Metadata Ingestion" width="1855" height="589" data-path="public/images/how-to-guides/lineage/view.png" />
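
The query parsing step described above can be illustrated with a small sketch. This is not the parser OpenMetadata uses: it is a deliberately simplified, regex-based stand-in, and the function name `view_lineage_edges`, the sample view, and the `known_tables` set are all hypothetical. It only shows the idea of extracting the target and source tables from a view definition and keeping the edges whose tables are already known.

```python
import re

# Simplified patterns: a real SQL parser handles far more syntax than this.
VIEW_RE = re.compile(
    r"create\s+(?:or\s+replace\s+)?view\s+([\w.]+)\s+as\s+(.*)",
    re.IGNORECASE | re.DOTALL,
)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def view_lineage_edges(view_sql, known_tables):
    """Return (source, target) pairs for tables that are already ingested."""
    match = VIEW_RE.search(view_sql)
    if not match:
        return []
    target, body = match.group(1), match.group(2)
    if target not in known_tables:
        return []
    sources = sorted(set(SOURCE_RE.findall(body)))
    # Skip sources that were never ingested, since there is nothing to link.
    return [(source, target) for source in sources if source in known_tables]


view_sql = """
CREATE VIEW analytics.daily_sales AS
SELECT o.day, SUM(o.amount)
FROM shop.orders o
JOIN shop.customers c ON o.customer_id = c.id
JOIN tmp.scratch s ON s.id = o.id
GROUP BY o.day
"""
known_tables = {"analytics.daily_sales", "shop.orders", "shop.customers"}
edges = view_lineage_edges(view_sql, known_tables)
print(edges)
```

In this sketch `tmp.scratch` produces no edge because it is not among the known tables, which mirrors the "search if the Tables exist" step.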

## Lineage Agent from UI

Apart from the Metadata ingestion, we can create a workflow that obtains the query log and table creation information from the underlying database and feeds it to OpenMetadata. The Lineage Agent is in charge of obtaining this data. The metadata ingestion only brings in the View lineage queries, whereas the Lineage Agent workflow brings in all the queries that can be used to generate lineage information.

### 1. Add a Lineage Agent

Navigate to **Settings >> Services >> Databases**. Select the required service.

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf1.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=9aa9aab83bb9a9ea41c178f5173a7dbd" alt="Select a Service" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf1.png" />

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf1.1.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=cbfac8e6c0355306b667b59d46847796" alt="Click on Databases" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf1.1.png" />

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf1.2.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=83a85fdf1da5c48afffd5b8953bd2d12" alt="Select the Database" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf1.2.png" />

Go to the **Ingestions** tab. Click on **Add Ingestion** and select **Add Lineage Agent**.

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf2.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=ff4c9580adaf8bc23a49541e2eeebdcd" alt="Add a Lineage Agent" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf2.png" />

### 2. Configure the Lineage Agent

Here you can enter the Lineage Agent details:

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf3.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=611ad5d77f5828433fb2414d93c50cf7" alt="Configure the Lineage Agent" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf3.png" />

### Lineage Options

**Query Log Duration:** Specify the duration in days for which the workflow should capture lineage data from the query logs. For example, if you specify 2 as the value for the duration, the workflow will capture lineage information for the 2 **days** (48 hours) prior to when the ingestion workflow is run.

**Parsing Timeout Limit:** Specify the timeout limit for parsing the SQL queries to perform the lineage analysis. This must be specified in **seconds**.

**Result Limit:** Set the limit for the query log results to be processed at a time. This is the **number of rows**.

**Filter Condition:** We execute a query on the query history table of the respective data source to perform the query analysis and extract the lineage and usage information. This field is useful when you want to exclude some queries from this analysis. Here you can specify a SQL condition that will be applied to the query history result set. You can read more about [Usage Query Filtering here](/v1.12.x/connectors/ingestion/workflows/usage/filter-query-set).
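
To make the interplay of these options concrete, here is a minimal sketch of how a look-back window, a row limit, and a filter condition could shape a query against a query history table. It is an illustration under stated assumptions, not the query OpenMetadata runs: the `query_history` table, its columns, and the `fetch_query_log` helper are hypothetical, SQLite stands in for the real data source, and the condition uses `NOT LIKE` because SQLite has no `ILIKE`.

```python
import sqlite3
from datetime import datetime, timedelta


def fetch_query_log(conn, now, query_log_duration, result_limit, filter_condition=None):
    """Return query texts that started within the look-back window."""
    window_start = now - timedelta(days=query_log_duration)
    sql = "SELECT query_text FROM query_history WHERE start_time >= ?"
    if filter_condition:
        # Assumption: the condition is ANDed onto the base query as-is.
        sql += f" AND ({filter_condition})"
    sql += " ORDER BY start_time LIMIT ?"
    rows = conn.execute(sql, (window_start.isoformat(), result_limit))
    return [row[0] for row in rows]


# Hypothetical query history, stored in SQLite only to keep the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE query_history (query_text TEXT, start_time TEXT)")
conn.executemany(
    "INSERT INTO query_history VALUES (?, ?)",
    [
        ("INSERT INTO sales SELECT * FROM staging", "2024-01-10T08:00:00"),
        ("--- metabase query 42", "2024-01-10T09:00:00"),
        ("CREATE TABLE old_copy AS SELECT 1", "2024-01-05T09:00:00"),
    ],
)

now = datetime(2024, 1, 10, 12, 0, 0)
recent = fetch_query_log(
    conn,
    now,
    query_log_duration=2,  # look back 2 days (48 hours)
    result_limit=1000,     # at most 1000 rows
    filter_condition="query_text NOT LIKE '--- metabase query %'",
)
print(recent)
```

With a duration of 2 days, the query from 2024-01-05 falls outside the window, and the filter condition removes the Metabase-generated query from the remaining rows.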

### 3. Schedule and Deploy

After clicking Next, you will be redirected to the Scheduling form. This will be the same as the Metadata Ingestion. Select your desired schedule and click on Deploy to find the lineage pipeline being added to the Service Ingestions.

<img src="https://mintcdn.com/openmetadata/AdVLfc7GKtiOkV7v/public/images/how-to-guides/lineage/wkf4.png?fit=max&auto=format&n=AdVLfc7GKtiOkV7v&q=85&s=80b783203fa3936bac87b7c39195a41f" alt="Schedule and Deploy the Lineage Agent" width="2943" height="1424" data-path="public/images/how-to-guides/lineage/wkf4.png" />

## Run Lineage Workflow Externally

## Lineage

After running a Metadata Ingestion workflow, we can run the Lineage workflow.
The `serviceName` must be the same as the one used in the Metadata Ingestion, so that the ingestion bot can get the `serviceConnection` details from the server.

### 1. Define the YAML Config

This is a sample config for {connector_0} Lineage:

<CodePreview>
  <ContentPanel>
    <ContentSection id={1} title="Source Configuration" lines="4">
      Configure the source type and service name for your lineage workflow.

      You can find all the definitions and types for the `sourceConfig` [here](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/metadataIngestion/databaseServiceQueryLineagePipeline.json).
    </ContentSection>

    <ContentSection id={2} title="Lineage Config Type" lines="6">
      **type**: Set to `DatabaseLineage` for database lineage ingestion.
    </ContentSection>

    <ContentSection id={3} title="Query Log Duration" lines="7-8">
      **queryLogDuration**: Configuration to tune how far we want to look back in query logs to process lineage data in days.
    </ContentSection>

    <ContentSection id={4} title="Parsing Timeout Limit" lines="9">
      **parsingTimeoutLimit**: Configuration to set the timeout for parsing the query in seconds.
    </ContentSection>

    <ContentSection id={5} title="Filter Condition" lines="10">
      **filterCondition**: Condition to filter the query history.
    </ContentSection>

    <ContentSection id={6} title="Result Limit" lines="11">
      **resultLimit**: Configuration to set the limit for query logs.
    </ContentSection>

    <ContentSection id={7} title="Query Log File Path" lines="12-13">
      **queryLogFilePath**: Configuration to set the file path for query logs. Use this to pass a file with the queries instead of getting the query logs from the database.
    </ContentSection>

    <ContentSection id={8} title="Database Filter Pattern" lines="14-19">
      **databaseFilterPattern**: Regex to only fetch databases that match the pattern.
    </ContentSection>

    <ContentSection id={9} title="Schema Filter Pattern" lines="20-25">
      **schemaFilterPattern**: Regex to only fetch schemas that match the pattern.
    </ContentSection>

    <ContentSection id={10} title="Table Filter Pattern" lines="26-32">
      **tableFilterPattern**: Regex to only fetch tables that match the pattern.
    </ContentSection>

    <ContentSection id={11} title="Override View Lineage" lines="33">
      **overrideViewLineage**: Set the 'Override View Lineage' toggle to control whether to override the existing view lineage.
    </ContentSection>

    <ContentSection id={12} title="Process View Lineage" lines="34">
      **processViewLineage**: Set the 'Process View Lineage' toggle to control whether to process view lineage.
    </ContentSection>

    <ContentSection id={13} title="Process Query Lineage" lines="35">
      **processQueryLineage**: Set the 'Process Query Lineage' toggle to control whether to process query lineage.
    </ContentSection>

    <ContentSection id={14} title="Process Stored Procedure Lineage" lines="36">
      **processStoredProcedureLineage**: Set the 'Process Stored Procedure Lineage' toggle to control whether to process stored procedure lineage.
    </ContentSection>

    <ContentSection id={15} title="Threads" lines="37">
      **threads**: Number of threads to use to parallelize lineage ingestion.
    </ContentSection>

    <ContentSection id={16} title="Sink Configuration" lines="38-40">
      To send the metadata to OpenMetadata, the sink needs to be specified as `type: metadata-rest`.
    </ContentSection>
  </ContentPanel>

  <CodePanel fileName="{connector}_lineage.yaml">
    ```yaml theme={null}
    source:
      type: {connector}-lineage
      serviceName: {connector}
      sourceConfig:
        config:
          type: DatabaseLineage
          # Number of days to look back
          queryLogDuration: 1
          parsingTimeoutLimit: 300
          # filterCondition: query_text not ilike '--- metabase query %'
          resultLimit: 1000
          # If instead of getting the query logs from the database we want to pass a file with the queries
          # queryLogFilePath: /tmp/query_log/file_path
          # databaseFilterPattern:
          #   includes:
          #     - database1
          #     - database2
          #   excludes:
          #     - database3
          # schemaFilterPattern:
          #   includes:
          #     - schema1
          #     - schema2
          #   excludes:
          #     - schema3
          # tableFilterPattern:
          #   includes:
          #     - table1
          #     - table2
          #   excludes:
          #     - table3
          #     - table4
          overrideViewLineage: false
          processViewLineage: true
          processQueryLineage: true
          processStoredProcedureLineage: true
          threads: 1
    sink:
      type: metadata-rest
      config: {}
    ```
  </CodePanel>
</CodePreview>

* You can learn more about how to configure and run the Lineage Workflow to extract lineage data [here](/connectors/ingestion/workflows/lineage).
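
The `includes` and `excludes` lists in the filter patterns are regular expressions. As a rough sketch of how such a pattern could select names, the snippet below assumes that a name is kept when it matches at least one `includes` entry (if any are given) and no `excludes` entry, that matching is case-insensitive, and that it is anchored only at the start of the name. The helper `is_selected` is hypothetical and is not the function used by the ingestion framework.

```python
import re


def is_selected(name, includes=(), excludes=()):
    """Return True if `name` passes the include/exclude regex lists."""
    # Assumption: an exclude match always wins over an include match.
    if any(re.match(pattern, name, re.IGNORECASE) for pattern in excludes):
        return False
    if includes:
        return any(re.match(pattern, name, re.IGNORECASE) for pattern in includes)
    # No includes given: everything that is not excluded is kept.
    return True


includes = ["table1", "table2"]
excludes = ["table3", "table4"]

for candidate in ["table1", "table3", "table10", "orders"]:
    print(candidate, is_selected(candidate, includes, excludes))
```

Because `re.match` only anchors at the start, the pattern `table1` in this sketch also selects `table10`. Writing the pattern as `^table1$` restricts it to the exact name.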

### 2. Run with the CLI

After saving the YAML config, we will run the command the same way we did for the metadata ingestion:

```bash theme={null}
metadata ingest -c <path-to-yaml>
```
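
If you want to trigger the same command from a script or scheduler, a thin wrapper around the CLI is one option. The sketch below only shells out to the `metadata ingest -c` command shown above. The helper names `build_ingest_command` and `run_lineage_workflow` are hypothetical, and it assumes that the `metadata` executable is on the `PATH` and that it signals failure through a non-zero exit code.

```python
import subprocess
from pathlib import Path


def build_ingest_command(config_path):
    """Build the argument list for `metadata ingest -c <path-to-yaml>`."""
    return ["metadata", "ingest", "-c", str(Path(config_path))]


def run_lineage_workflow(config_path):
    """Run the CLI and raise CalledProcessError on a non-zero exit code."""
    # Assumption: a failed workflow exits with a non-zero status.
    subprocess.run(build_ingest_command(config_path), check=True)


command = build_ingest_command("bigquery_lineage.yaml")
print(" ".join(command))
```

Passing the arguments as a list avoids shell quoting issues when the config path contains spaces.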

## dbt Ingestion

We can also generate lineage through [dbt ingestion](/v1.12.x/connectors/database/dbt/configure-dbt-workflow). The dbt workflow can fetch queries that carry lineage information. For a dbt ingestion pipeline, the path to the Catalog and Manifest files must be specified. We also fetch the column-level lineage through dbt.

You can learn more about [lineage ingestion here](/v1.12.x/connectors/ingestion/lineage).

## Query Logs using CSV File

Lineage ingestion is supported for a few connectors as mentioned earlier. For the unsupported connectors, you can set up [Lineage Workflows using Query Logs](/v1.12.x/connectors/ingestion/workflows/lineage/lineage-workflow-query-logs) using a CSV file.

## Manual Lineage

Lineage can also be added and edited manually in OpenMetadata. For more information, refer to [adding lineage manually](/v1.12.x/how-to-guides/data-lineage/manual).

<Card title="Explore the Lineage View" href="/v1.12.x/how-to-guides/data-lineage/explore">
  Explore the rich lineage view in OpenMetadata.
</Card>
