Troubleshooting

After the dbt workflow finishes, check the logs to see whether the dbt files were validated successfully. Any keys missing from the manifest.json or catalog.json files are reported in the logs and need to be added.

The dbt workflow requires the following keys to be present in each node of the manifest.json file:

  • resource_type (required)
  • alias/name (at least one of them is required)
  • schema (required)
  • description (required if description needs to be updated)
  • compiled_code/compiled_sql (required if the dbt model query is to be shown in the dbt tab and used for query lineage)
  • depends_on (required if lineage information needs to be extracted)
  • columns (required if column descriptions are to be processed)
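The key requirements above can be checked before running the workflow. The following is a minimal sketch, not the validation OpenMetadata itself performs: the function names are hypothetical, and it only tests that a key is present, not that its value is usable.

```python
# Keys that the list above marks as always required.
ALWAYS_REQUIRED = ("resource_type", "schema")


def missing_keys(node: dict) -> list:
    """Return the required keys that are absent from one manifest node."""
    missing = [key for key in ALWAYS_REQUIRED if key not in node]
    # alias/name is an either-or requirement: one of the two is enough.
    if "alias" not in node and "name" not in node:
        missing.append("alias/name")
    return missing


def check_manifest(manifest: dict) -> dict:
    """Map each node id to its missing required keys (complete nodes are omitted)."""
    report = {}
    for node_id, node in manifest.get("nodes", {}).items():
        missing = missing_keys(node)
        if missing:
            report[node_id] = missing
    return report


# Tiny in-memory manifest with hypothetical node ids, for illustration only.
example = {
    "nodes": {
        "model.demo.complete": {"resource_type": "model", "name": "complete", "schema": "public"},
        "model.demo.incomplete": {"name": "incomplete"},
    }
}
print(check_manifest(example))
```

For a real project, the manifest would be loaded with json.load from the manifest.json file instead of being built inline.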

The name/alias, schema, and database values in the dbt manifest.json must match the name, schema, and database of the table or view ingested in OpenMetadata.

The dbt metadata is processed only if these values match.

Below is a sample manifest.json node for reference:
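As an illustration, the sketch below builds a node containing the keys listed earlier and prints it as JSON. Only the key names come from this page; every value (model name, database, schema, SQL, project name) is hypothetical, and a real dbt node contains many more fields. The helper table_identity is also hypothetical, and its preference for alias over name is an assumption rather than documented behavior.

```python
import json

# Hypothetical values throughout; only the key names come from the list above.
sample_node = {
    "resource_type": "model",
    "name": "customers",
    "alias": "customers",
    "database": "analytics",
    "schema": "public",
    "description": "One row per customer.",
    "compiled_code": "select id, name from analytics.public.raw_customers",
    "depends_on": {"nodes": ["model.my_project.raw_customers"]},
    "columns": {
        "id": {"name": "id", "description": "Primary key."},
        "name": {"name": "name", "description": "Customer name."},
    },
}


def table_identity(node: dict) -> tuple:
    """Return (database, schema, table) to compare with the ingested table or view.

    Assumption: alias is used when it is set, otherwise name.
    """
    return (node.get("database"), node.get("schema"), node.get("alias") or node.get("name"))


print(json.dumps(sample_node, indent=2))
print(table_identity(sample_node))  # ('analytics', 'public', 'customers')
```

The tuple printed at the end is what would need to line up with the database, schema, and table name already ingested in OpenMetadata.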

For dbt lineage to be created, the tables (models) involved must already be ingested in OpenMetadata (OM). The process is as follows:

  • We have a dbt project that creates tables A -> B -> C
  • We run the metadata ingestion in our database service so that A, B and C are ingested in OpenMetadata.
  • We run the dbt ingestion in the same service, which does two things:
    • It adds all the dbt-related metadata to the tables, such as the model definition and descriptions.
    • It draws the lineage A -> B -> C, which comes from the model dependencies in the manifest.json.
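To see which lineage edges a manifest implies, the dependencies can be read directly from depends_on. This is a minimal sketch under the assumption that each node lists its upstream node ids under depends_on.nodes, as dbt manifests do; the function name and the demo node ids are hypothetical, and this is not the code OpenMetadata runs.

```python
def lineage_edges(manifest: dict) -> list:
    """Return (upstream_id, downstream_id) pairs derived from depends_on.nodes."""
    edges = []
    for node_id, node in manifest.get("nodes", {}).items():
        # Tolerate a missing or null depends_on entry.
        upstream_ids = (node.get("depends_on") or {}).get("nodes") or []
        for upstream_id in upstream_ids:
            edges.append((upstream_id, node_id))
    return edges


# Hypothetical manifest for the A -> B -> C project described above.
demo_manifest = {
    "nodes": {
        "model.demo.A": {"depends_on": {"nodes": []}},
        "model.demo.B": {"depends_on": {"nodes": ["model.demo.A"]}},
        "model.demo.C": {"depends_on": {"nodes": ["model.demo.B"]}},
    }
}
print(lineage_edges(demo_manifest))
# [('model.demo.A', 'model.demo.B'), ('model.demo.B', 'model.demo.C')]
```

If an expected edge is absent from this output, the dependency is missing from the manifest itself rather than lost during ingestion.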

If lineage is not appearing:

  • Make sure that all the tables are ingested in OpenMetadata.
  • Check the required keys listed above to confirm that the necessary details are present in the manifest.json file.
  • Search the dbt workflow logs for the string "Processing DBT lineage for" and check whether any errors are causing the lineage creation to fail.
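The log search in the last step can be scripted. The sketch below is one way to do it, assuming the workflow logs have been saved to a text file; the function name, the context size, and the file name in the usage comment are hypothetical. Only the marker string comes from this page.

```python
MARKER = "Processing DBT lineage for"


def lineage_log_blocks(log_text: str, context: int = 2) -> list:
    """Return each line containing the marker plus up to `context` lines after it.

    The trailing lines are kept because an error, if any, is usually logged
    right after the step that triggered it.
    """
    lines = log_text.splitlines()
    blocks = []
    for index, line in enumerate(lines):
        if MARKER in line:
            blocks.append(lines[index : index + 1 + context])
    return blocks


# Usage, with a hypothetical log file name:
#   from pathlib import Path
#   for block in lineage_log_blocks(Path("dbt_workflow.log").read_text()):
#       print("\n".join(block), end="\n\n")
```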

You might see this error when you have placed your dbt artifacts in S3 without the correct policies.

If the artifacts are in the bucket MyBucket, the user running the ingestion needs at least the permissions from the following policy:

Note that it's not enough to point the resource to arn:aws:s3:::MyBucket. The policy must cover the bucket's contents as well, that is, arn:aws:s3:::MyBucket/*.
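A policy consistent with the note above can be sketched as follows. The two resource ARNs follow from the note; the action list (s3:ListBucket and s3:GetObject) is an assumption about the minimum needed to list and read the artifacts, so adjust it to your own security requirements. The function name is hypothetical.

```python
import json


def dbt_artifact_read_policy(bucket: str) -> dict:
    """Build a read-only IAM policy document for the dbt artifacts in `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimal action set: list the bucket and read its objects.
                "Action": ["s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself (used by ListBucket)
                    f"arn:aws:s3:::{bucket}/*",  # the objects inside it (used by GetObject)
                ],
            }
        ],
    }


print(json.dumps(dbt_artifact_read_policy("MyBucket"), indent=2))
```

Listing both ARNs matters because s3:ListBucket is evaluated against the bucket, while s3:GetObject is evaluated against the objects in it.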