Auto Ingest dbt-core

Learn how to automatically ingest dbt-core artifacts into OpenMetadata using the simplified metadata ingest-dbt CLI command that reads configuration directly from your dbt_project.yml file.

The metadata ingest-dbt command provides a streamlined way to ingest dbt artifacts into OpenMetadata by:

  • Reading configuration directly from your dbt_project.yml file
  • Automatically discovering dbt artifacts (manifest.json, catalog.json, run_results.json)
  • Supporting comprehensive filtering and configuration options

Before you begin, make sure the following are in place:

  1. dbt project setup: A dbt project with a valid dbt_project.yml file
  2. dbt artifacts: Run dbt compile or dbt run to generate the required artifacts in the target/ directory
  3. OpenMetadata service: The database service must already be configured in OpenMetadata
  4. OpenMetadata Python package: The OpenMetadata ingestion package must be installed

Add the required openmetadata_* variables (listed in the parameter tables below) to the vars section of your dbt_project.yml file.

If you're already in your dbt project directory, run metadata ingest-dbt without arguments. If you're in a different directory, pass the path to your dbt project to the command.

For security and flexibility, you can use environment variables in your dbt_project.yml configuration instead of hardcoding sensitive values like JWT tokens. The system supports three different environment variable patterns:

| Pattern | Description | Example |
|---|---|---|
| ${VAR} | Shell-style variable substitution | "${OPENMETADATA_TOKEN}" |
| {{ env_var("VAR") }} | dbt-style without default | "{{ env_var('OPENMETADATA_HOST') }}" |
| {{ env_var("VAR", "default") }} | dbt-style with default value | "{{ env_var('SERVICE_NAME', 'default-service') }}" |
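
To make the three patterns concrete, here is a minimal Python sketch of how such placeholders could be resolved against the environment. It is an illustration only and not OpenMetadata's actual implementation; the function name resolve_env_placeholders and the regular expressions are assumptions made for this example.

```python
import os
import re

# Hypothetical helper written for illustration; not part of OpenMetadata.
_SHELL_VAR = re.compile(r"\$\{(\w+)\}")
_DBT_VAR = re.compile(
    r"""\{\{\s*env_var\(\s*['"](\w+)['"]\s*(?:,\s*['"]([^'"]*)['"]\s*)?\)\s*\}\}"""
)


def resolve_env_placeholders(value, env=None):
    """Replace ${VAR} and {{ env_var(...) }} placeholders in a string."""
    env = os.environ if env is None else env

    def lookup(name, default=None):
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise ValueError(f"Environment variable '{name}' is not set")

    # dbt-style first: {{ env_var('VAR') }} or {{ env_var('VAR', 'default') }}
    value = _DBT_VAR.sub(lambda m: lookup(m.group(1), m.group(2)), value)
    # then shell-style: ${VAR}
    return _SHELL_VAR.sub(lambda m: lookup(m.group(1)), value)
```

A placeholder with a default resolves even when the variable is unset, while the other two forms raise an error. This mirrors the "Environment variable 'VAR' is not set" issue listed in the troubleshooting table.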

Set the corresponding environment variables in your shell before running the command.

Alternative: Using .env Files

For local development, you can instead create a .env file in your dbt project directory and define the variables there.

Required parameters:

| Parameter | Description |
|---|---|
| openmetadata_host_port | OpenMetadata server URL (must start with https://) |
| openmetadata_jwt_token | JWT token for authentication |
| openmetadata_service_name | Name of the database service in OpenMetadata |

Optional parameters:

| Parameter | Default | Description |
|---|---|---|
| openmetadata_dbt_update_descriptions | true | Update table/column descriptions from dbt |
| openmetadata_dbt_update_owners | true | Update model owners from dbt |
| openmetadata_include_tags | true | Include dbt tags as OpenMetadata tags |
| openmetadata_search_across_databases | false | Search for tables across multiple databases |
| openmetadata_dbt_classification_name | null | Custom classification name for dbt tags |
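
The way the required and optional parameters fit together can be sketched in Python as follows. This is a rough illustration, assuming the vars section of dbt_project.yml has already been loaded into a dictionary; build_settings is a hypothetical helper, not an OpenMetadata function, and the error messages only echo the wording of the troubleshooting table.

```python
# Parameter names taken from the tables above.
REQUIRED_VARS = (
    "openmetadata_host_port",
    "openmetadata_jwt_token",
    "openmetadata_service_name",
)

# Defaults copied from the optional parameters listed above.
OPTIONAL_DEFAULTS = {
    "openmetadata_dbt_update_descriptions": True,
    "openmetadata_dbt_update_owners": True,
    "openmetadata_include_tags": True,
    "openmetadata_search_across_databases": False,
    "openmetadata_dbt_classification_name": None,
}


def build_settings(dbt_vars):
    """Validate the required variables and fill in optional defaults."""
    missing = [key for key in REQUIRED_VARS if not dbt_vars.get(key)]
    if missing:
        raise ValueError("Required configuration not found: " + ", ".join(missing))
    if not dbt_vars["openmetadata_host_port"].startswith("https://"):
        raise ValueError("Invalid URL format: openmetadata_host_port must start with https://")

    settings = dict(OPTIONAL_DEFAULTS)
    for key, value in dbt_vars.items():
        # Ignore unrelated dbt vars; keep only the OpenMetadata ones.
        if key in REQUIRED_VARS or key in OPTIONAL_DEFAULTS:
            settings[key] = value
    return settings
```

With only the three required variables set, every optional parameter falls back to its documented default.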

You can also use filter patterns to control which databases, schemas, and tables are included or excluded.
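
As a rough illustration of include/exclude filtering, the Python sketch below assumes the patterns are regular expressions matched from the start of the name, and that a name is kept when it matches at least one include pattern (or none are given) and no exclude pattern. These semantics and the helper name is_selected are assumptions for this example; the exact configuration keys used in dbt_project.yml are not shown here.

```python
import re


def is_selected(name, includes=(), excludes=()):
    """Return True if `name` passes the include/exclude patterns.

    Assumed semantics: keep a name when it matches at least one include
    pattern (or no include patterns are given) and matches no exclude pattern.
    """
    if includes and not any(re.match(pattern, name) for pattern in includes):
        return False
    return not any(re.match(pattern, name) for pattern in excludes)


tables = ["orders", "tmp_orders", "customers_test"]
kept = [t for t in tables if is_selected(t, excludes=[r"tmp_.*", r".*_test$"])]
print(kept)  # ['orders']
```

Excluding by prefix or suffix like this is one way to keep temporary and test tables out of the ingested metadata.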

Note: Global options like --version, --log-level, and --debug are available at the main metadata command level.

The command automatically discovers artifacts from your dbt project's target/ directory:

| Artifact | Required | Description |
|---|---|---|
| manifest.json | ✅ Yes | Model definitions, relationships, and metadata |
| catalog.json | ❌ Optional | Table and column statistics from dbt docs generate |
| run_results.json | ❌ Optional | Test results from dbt test |
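
The discovery rules above can be sketched in Python as a quick pre-flight check. This is a minimal illustration, assuming the artifacts live in the default target/ directory of the project; discover_artifacts is a hypothetical helper and does not reflect how the command is implemented internally.

```python
from pathlib import Path

# (file name, required?) pairs taken from the artifact list above.
ARTIFACTS = (
    ("manifest.json", True),
    ("catalog.json", False),
    ("run_results.json", False),
)


def discover_artifacts(project_dir):
    """Return a dict of artifact name -> path for the files that exist."""
    target = Path(project_dir) / "target"
    found = {}
    for name, required in ARTIFACTS:
        path = target / name
        if path.is_file():
            found[name] = path
        elif required:
            # Only manifest.json is mandatory; the others are skipped if absent.
            raise FileNotFoundError(
                f"{name} not found in {target}; run dbt compile or dbt run first"
            )
    return found
```

Running a check like this before ingestion makes a missing manifest.json fail early with an actionable message.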

The ingestion captures the following metadata:

  • Model Definitions: Queries, configurations, and relationships
  • Lineage: Table-to-table and column-level lineage
  • Documentation: Model and column descriptions
  • Data Quality: dbt test definitions and results
  • Tags & Classification: Model and column tags
  • Ownership: Model owners and team assignments

Common issues and their solutions:

| Issue | Solution |
|---|---|
| dbt_project.yml not found | Ensure you're in a valid dbt project directory |
| Required configuration not found | Add openmetadata_* variables to your dbt_project.yml |
| manifest.json not found | Run dbt compile or dbt run first |
| Invalid URL format | Ensure openmetadata_host_port includes protocol (https://) |
| Environment variable 'VAR' is not set | Set the required environment variable or provide a default value |
| Environment variable not set and no default | Either set the environment variable or use the {{ env_var('VAR', 'default') }} pattern |

To enable detailed logging, use the global --debug or --log-level option on the main metadata command.

Security:

  • Always use environment variables for sensitive data like JWT tokens
  • Use whichever of the three supported environment variable patterns fits your setup
  • Never commit sensitive values directly to version control

Filtering:

  • Use specific patterns to exclude temporary/test tables
  • Filter based on your organization's naming conventions
  • Exclude system schemas and databases

Automation:

  • Integrate into CI/CD pipelines
  • Run after successful dbt builds
  • Set up scheduled ingestion for regular updates
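
One way to run ingestion only after a successful dbt run, for example from a CI job, is sketched below in Python. It assumes both the dbt and metadata executables are on the PATH and that the commands are run from the dbt project directory; build_then_ingest and its run parameter are hypothetical names introduced for this example.

```python
import subprocess


def build_then_ingest(project_dir, run=subprocess.run):
    """Run `dbt run`, then `metadata ingest-dbt` only if dbt succeeded.

    `run` defaults to subprocess.run and can be swapped out in tests.
    Returns the exit code of the last command that was executed.
    """
    build = run(["dbt", "run"], cwd=project_dir)
    if build.returncode != 0:
        # Skip ingestion so stale or partial artifacts are not published.
        return build.returncode
    ingest = run(["metadata", "ingest-dbt"], cwd=project_dir)
    return ingest.returncode
```

Returning the exit code lets the surrounding pipeline step fail when either command fails.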

After successful ingestion:

  1. Explore your data in the OpenMetadata UI
  2. Configure additional dbt features like tags, tiers, and glossary
  3. Set up data governance policies and workflows
  4. Schedule regular ingestion to keep metadata up to date

For additional troubleshooting, refer to the dbt Troubleshooting Guide.