connectors

No menu items for this category

Lineage Query Filtering

In order to extract usage information, OpenMetadata parses the queries that have run against the database. We fetch these queries from the query history table of the respective data source. This query fetches all the queries executed within configured no of days. In this guide we we understand how we can filter out the query history result set, this can help to exclude queries with specific pattern or queries executed on a specific schema or database, depending on the data source.

Query filtering is supported for both Usage & Lineage workflows. While configuring either usage or lineage workflow you will find a Filtering Condition text field where you can provide the sql condition which will be added to already existing conditions with an AND operation. In later part of this document you will find how to write this condition for supported data source.

filter-condition-field

Filter Condition Field

To fetch the query history log from snowflake we execute the following query

You can refer to this snowflake documentation to find out more about the query history table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as query_text NOT LIKE '--metabase %'

you also need to further restrict the query log and need only queries which have been executed on SALES database then you can put the filter condition as query_text NOT LIKE '--metabase %' AND database_name='SALES'.

To fetch the query history log from bigquery we execute the following query

You can refer to this bigquery documentation to find out more about the JOBS_BY_PROJECT table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as query NOT LIKE '--metabase %'.

To fetch the query history log from MSSQL we execute the following query

You can refer to this mssql documentation to find out more about the dm_exec_cached_plans table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as t.text NOT LIKE '--metabase %'.

you also need to further restrict the query log and need only queries which have been executed on SALES database then you can put the filter condition as t.text NOT LIKE '--metabase %' AND db.NAME='SALES'.

To fetch the query history log from clickhouse we execute the following query

You can refer to this clickhouse documentation to find out more about the query_log table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as query NOT LIKE '--metabase %'.

To fetch the query history log from vertica we execute the following query

You can refer to this vertica documentation to find out more about the query_profiles table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as query NOT LIKE '--metabase %'.

To fetch the query history log from redshift we execute the following query

You can refer to this redshift documentation to find out more about the stl_query table.

For example if you want to add a condition to filter out queries executed by metabase client, i.e. the queries staring with -- metabase % then you can put the condition as query NOT LIKE '--metabase %'.