After the dbt workflow has finished, check the logs to see whether the dbt files were validated successfully. Any keys missing from the manifest.json or catalog.json files will be displayed in the logs, and those keys must be added.
The dbt workflow requires the following keys to be present in each node of the manifest.json file:
- resource_type (required)
- alias/name (any one of them required)
- schema (required)
- description (required if description needs to be updated)
- compiled_code/compiled_sql (required if the dbt model query is to be shown in the dbt tab and for query lineage)
- depends_on (required if lineage information needs to be extracted)
- columns (required if column descriptions are to be processed)
The name/alias, schema, and database values from the dbt manifest.json must match the name, schema, and database of the table/view ingested in OpenMetadata. The dbt metadata is processed only if these values match.
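The matching rule above can be sketched in Python. This is only an illustration, not OpenMetadata's actual implementation: the function names and sample values are hypothetical, and it assumes that alias takes precedence over name when both are set and that the comparison is exact (case-sensitive).

```python
def dbt_table_name(node: dict) -> str:
    """Return the table name for a dbt node: alias if set, otherwise name."""
    # Assumption: alias wins over name when both are present.
    return node.get("alias") or node["name"]


def matches_ingested_table(node: dict, table: dict) -> bool:
    """Compare database, schema and table name exactly (case-sensitive here)."""
    return (
        node.get("database") == table["database"]
        and node.get("schema") == table["schema"]
        and dbt_table_name(node) == table["name"]
    )


# Hypothetical values, for illustration only.
node = {
    "name": "customers",
    "alias": "dim_customers",
    "schema": "analytics",
    "database": "warehouse",
}
ingested = {"name": "dim_customers", "schema": "analytics", "database": "warehouse"}
other = {"name": "customers", "schema": "analytics", "database": "warehouse"}

print(matches_ingested_table(node, ingested))  # True
print(matches_ingested_table(node, other))     # False
```

If a model is not picked up, comparing the three values side by side like this is a quick way to spot a mismatch in any one of them.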
Below is a sample manifest.json node for reference:
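As a hedged illustration, a node carrying the keys listed above might look like the Python dict below, together with a small check for those keys. All values, the unique id `model.my_project.stg_customers`, and the helper names are invented for this sketch; a real manifest.json node contains many more fields.

```python
# Hypothetical node; every value here is invented for illustration.
sample_node = {
    "resource_type": "model",
    "name": "customers",
    "alias": "customers",
    "database": "warehouse",
    "schema": "analytics",
    "description": "One row per customer.",
    "compiled_code": "select id, email from raw.customers",
    "depends_on": {"nodes": ["model.my_project.stg_customers"]},
    "columns": {
        "id": {"name": "id", "description": "Primary key."},
        "email": {"name": "email", "description": "Contact email."},
    },
}

# Optional keys and the feature each one enables, as listed above.
OPTIONAL = {
    "description": ("description",),
    "model query and query lineage": ("compiled_code", "compiled_sql"),
    "lineage": ("depends_on",),
    "column descriptions": ("columns",),
}


def missing_keys(node: dict) -> list:
    """Return the always-required keys that are absent or empty."""
    missing = [key for key in ("resource_type", "schema") if not node.get(key)]
    if not (node.get("alias") or node.get("name")):
        missing.append("alias/name")
    return missing


def unavailable_features(node: dict) -> list:
    """Return the features that cannot be processed for this node."""
    return [
        feature
        for feature, keys in OPTIONAL.items()
        if not any(node.get(key) for key in keys)
    ]


print(missing_keys(sample_node))            # []
print(missing_keys({"name": "orders"}))     # ['resource_type', 'schema']
print(unavailable_features({"name": "orders"}))
```

Running the same check over every entry under the manifest's `nodes` key gives a quick preview of what the workflow logs are likely to report as missing.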
For dbt lineage to be created, the tables (models) involved must already be ingested in OpenMetadata. The process is as follows:
- We have a dbt project that creates tables A -> B -> C.
- We run the metadata ingestion in our database service so that A, B, and C are ingested in OpenMetadata.
- We run the dbt ingestion in the same service so that two things happen:
  - We add all the dbt-related metadata to the tables, such as the model definition and descriptions.
  - We draw the lineage A -> B -> C, which comes from the model dependencies.
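To see which lineage edges a manifest implies, the `depends_on` entries can be walked directly. The sketch below is not how OpenMetadata builds lineage internally; it only shows the idea, using a tiny hypothetical manifest with unique ids of the form `model.demo.A`.

```python
def lineage_edges(manifest: dict) -> list:
    """Return (upstream, downstream) pairs from each model's depends_on.nodes."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, etc. in this sketch
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, unique_id))
    return edges


# Hypothetical manifest reduced to the keys this sketch needs.
manifest = {
    "nodes": {
        "model.demo.A": {"resource_type": "model", "depends_on": {"nodes": []}},
        "model.demo.B": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.demo.A"]},
        },
        "model.demo.C": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.demo.B"]},
        },
    }
}

for upstream, downstream in lineage_edges(manifest):
    print(f"{upstream} -> {downstream}")
# model.demo.A -> model.demo.B
# model.demo.B -> model.demo.C
```

Each edge can only be drawn if both of its endpoints correspond to tables that are already ingested, which is why the metadata ingestion has to run first.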
If lineage is not appearing:
- Make sure that all the tables are ingested in OpenMetadata.
- Follow the docs here to check that the necessary details are present in the manifest.json file.
- Search for the string "Processing DBT lineage for" in the dbt workflow logs and see if any errors are causing the lineage creation to fail.
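The log search in the last step can be scripted. The sketch below assumes the workflow logs have been saved as plain text; the sample lines are made-up placeholders, not real workflow output, and the simple "error" substring match is an assumption about how failures show up.

```python
MARKER = "Processing DBT lineage for"


def lineage_log_report(lines):
    """Return (lineage_lines, error_lines) found in the given log lines."""
    lines = [line.rstrip("\n") for line in lines]
    lineage_lines = [line for line in lines if MARKER in line]
    # Assumption: failures are reported on lines containing the word "error".
    error_lines = [line for line in lines if "error" in line.lower()]
    return lineage_lines, error_lines


# Placeholder lines for illustration; real logs will look different.
sample_log = [
    f"{MARKER} <placeholder_model>\n",
    "placeholder ERROR line\n",
    "unrelated placeholder line\n",
]

lineage_lines, error_lines = lineage_log_report(sample_log)
print(len(lineage_lines), "lineage line(s) found")
for line in error_lines:
    print("possible failure:", line)
```

For a real log, pass an open file instead of `sample_log`, for example `lineage_log_report(open(path))` with `path` pointing to wherever the logs were saved. If no lineage lines are found at all, the workflow most likely never reached the lineage step for any model.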
You might see this error when you have placed your dbt artifacts in S3 without the correct policies.
If we have the artifacts on the bucket MyBucket, the user running the ingestion should have, at least, the permissions from the following policy:
Note that it's not enough to point the resource to arn:aws:s3:::MyBucket. We need its contents as well, so the resource list must also include arn:aws:s3:::MyBucket/*.
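A minimal sketch of such a policy, generated in Python so the bucket name can be swapped easily. It assumes that read-only access through `s3:ListBucket` and `s3:GetObject` is what the ingestion needs; the helper name `dbt_artifacts_policy` is hypothetical, and your setup may require a different set of actions.

```python
import json


def dbt_artifacts_policy(bucket: str) -> dict:
    """Build an IAM policy document granting read access to a bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumption: listing the bucket and reading objects is enough.
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",  # every object inside it
                ],
            }
        ],
    }


print(json.dumps(dbt_artifacts_policy("MyBucket"), indent=2))
```

The two resource entries serve different actions: `s3:ListBucket` applies to the bucket ARN, while `s3:GetObject` applies to the object ARNs matched by `/*`.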