TestRunner - Running Table-Level Tests

The TestRunner class provides a fluent API for executing data quality tests against tables cataloged in OpenMetadata. It automatically fetches table metadata and service connections, allowing you to run tests with minimal configuration.

TestRunner enables you to:

  • Execute tests defined in code against cataloged tables
  • Run tests previously configured in the OpenMetadata UI
  • Load test definitions from YAML workflow files
  • Validate data at the table and column levels
  • Get detailed test results for programmatic handling

Create a runner for a specific table using its fully qualified name (FQN):

The table FQN format is: {service}.{database}.{schema}.{table}
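
As a small illustration of the FQN format, the sketch below joins and splits the four parts. `build_table_fqn` and `split_table_fqn` are hypothetical helpers, not part of the OpenMetadata SDK, and they assume that no part contains a dot (names with dots need quoting, which this sketch does not handle).

```python
def build_table_fqn(service: str, database: str, schema: str, table: str) -> str:
    """Join the four parts into {service}.{database}.{schema}.{table}."""
    parts = [service, database, schema, table]
    for part in parts:
        # Assumption: plain names only; empty or dotted names are rejected.
        if not part or "." in part:
            raise ValueError(f"unsupported FQN part: {part!r}")
    return ".".join(parts)


def split_table_fqn(fqn: str) -> dict:
    """Split a simple table FQN back into its named parts."""
    parts = fqn.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 dot-separated parts, got {len(parts)}")
    return dict(zip(("service", "database", "schema", "table"), parts))


# Example names are placeholders for illustration only.
fqn = build_table_fqn("MySQL", "default", "ecommerce", "customers")
print(fqn)
print(split_table_fqn(fqn))
```

Building the FQN from named parts rather than hand-writing the string makes it harder to swap the database and schema by accident.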

Add test definitions to the runner:

Use add_tests() to add several tests at once:

Execute all configured tests:

Here's a complete example of testing a customer table:

Instead of defining tests in code, you can run tests that data stewards have configured in the OpenMetadata UI. This enables a collaborative workflow where:

  • Data stewards define and maintain test criteria in the UI
  • Engineers execute those tests automatically in pipelines

This approach ensures:

  • Test definitions stay synchronized with business requirements
  • Engineers don't need to modify code when test criteria change
  • All stakeholders own data quality

You can customize test names, display names, and descriptions:

Or pass values directly to the constructor:

Some tests support computing the number and percentage of rows that passed or failed:

This provides detailed metrics about test failures, useful for:

  • Identifying the scope of data quality issues
  • Prioritizing remediation efforts
  • Tracking data quality trends over time
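
To make the passed and failed row metrics concrete, here is a minimal sketch of the arithmetic for a not-null check on an in-memory column. `row_level_summary` and its result keys are hypothetical; the SDK computes these figures against the data source, and its result fields may be named differently.

```python
def percentage(part: int, total: int) -> float:
    """Return part/total as a percentage, or 0.0 for an empty column."""
    return round(100.0 * part / total, 2) if total else 0.0


def row_level_summary(values: list) -> dict:
    """Count rows that pass or fail a not-null check."""
    total = len(values)
    failed = sum(1 for value in values if value is None)
    passed = total - failed
    return {
        "passed_rows": passed,
        "failed_rows": failed,
        "passed_rows_pct": percentage(passed, total),
        "failed_rows_pct": percentage(failed, total),
    }


# Illustrative data: one of four email values is missing.
emails = ["a@example.com", None, "b@example.com", "c@example.com"]
row_summary = row_level_summary(emails)
print(row_summary)
```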

Customize the test runner behavior using the setup() method:

Parameter               Type       Default  Description
force_test_update       bool       False    Force update even if tests already exist
log_level               LogLevels  INFO     Logging level (DEBUG, INFO, WARN, ERROR)
raise_on_error          bool       False    Raise exceptions if test data already exists
success_threshold       int        90       Percentage threshold for overall success
enable_streamable_logs  bool       False    Enable streamable log output
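
One plausible reading of success_threshold is the minimum percentage of test cases that must pass for the run to count as successful. The sketch below shows that calculation under this assumption; `meets_success_threshold` is a hypothetical helper, and the SDK's exact rule may differ.

```python
def meets_success_threshold(passed: int, total: int, threshold: int = 90) -> bool:
    """Return True when the share of passed tests reaches the threshold.

    Assumption: the threshold is an inclusive percentage of test cases.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of executed tests")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return 100.0 * passed / total >= threshold


print(meets_success_threshold(passed=9, total=10))                 # default threshold of 90
print(meets_success_threshold(passed=8, total=10))                 # 80% is below 90
print(meets_success_threshold(passed=8, total=10, threshold=80))   # relaxed threshold
```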

Test results contain detailed information about test execution:

  • Success: Test passed all validation criteria
  • Failed: Test did not meet validation criteria
  • Aborted: Test execution was interrupted or could not complete
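
The three statuses above lend themselves to programmatic handling. The following sketch tallies results by status and lists the tests that should block a downstream step. `Status` and `Result` are hypothetical stand-ins defined here for illustration; they are not the SDK's result classes, whose attribute names may differ.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILED = "Failed"
    ABORTED = "Aborted"


@dataclass
class Result:
    """Hypothetical stand-in for a single test result."""
    test_name: str
    status: Status


def summarize(results: list) -> dict:
    """Count results per status, including statuses with zero results."""
    counts = Counter(result.status for result in results)
    return {status.value: counts.get(status, 0) for status in Status}


def failing_tests(results: list) -> list:
    """Treat both Failed and Aborted as blocking."""
    return [r.test_name for r in results if r.status is not Status.SUCCESS]


results = [
    Result("row_count_in_range", Status.SUCCESS),
    Result("email_not_null", Status.FAILED),
    Result("id_unique", Status.ABORTED),
]
summary = summarize(results)
blocked = failing_tests(results)
print(summary)
if blocked:
    print("Blocking downstream load:", ", ".join(blocked))
```

Treating Aborted as blocking is a design choice in this sketch: a test that could not complete gives no evidence that the data is valid.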

Integrate TestRunner into your ETL pipelines:

Handle potential errors gracefully:

  1. Use descriptive test names: Make test failures easy to understand

  2. Leverage UI-defined tests: Let data stewards define test criteria

  3. Handle results programmatically: Don't just print - take action

  4. Use appropriate thresholds: Set realistic min/max values based on data patterns

  5. Combine table and column tests: Ensure both structural and content quality

If your OpenMetadata instance uses database-stored credentials (the default configuration), you do not need to follow this guide. The SDK will automatically retrieve and decrypt credentials.

This guide is only necessary when your organization uses an external secrets manager for credential storage.

The TestRunner API executes data quality tests directly from your Python code (e.g., within your ETL pipelines). To connect to your data sources, it needs to:

  1. Retrieve the service connection configuration from OpenMetadata
  2. Decrypt the credentials stored in your secrets manager
  3. Establish a connection to the data source
  4. Execute the test cases

Without proper secrets manager configuration, the SDK cannot decrypt credentials and will fail to connect to your data sources.

  1. Contact your OpenMetadata/Collate administrator to obtain:

    • The secrets manager type (AWS, Azure, GCP, etc.)
    • The secrets manager loader configuration
    • Required environment variables or configuration files
    • Any additional setup (IAM roles, service principals, etc.)
  2. Install required dependencies for your secrets manager provider

  3. Configure environment variables with access credentials

  4. Initialize the SecretsManagerFactory before using TestRunner

  5. Configure the SDK and run your tests

Required Dependencies:

Example Configuration:

OpenMetadata's ingestion extras: aws (e.g., pip install 'openmetadata-ingestion[aws]')

SecretsManagerProvider: (one of)

  • SecretsManagerProvider.aws
  • SecretsManagerProvider.managed_aws
  • SecretsManagerProvider.aws_ssm
  • SecretsManagerProvider.managed_aws_ssm

Environment variables:

  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_DEFAULT_REGION

OpenMetadata's ingestion extras: azure (e.g., pip install 'openmetadata-ingestion[azure]')

SecretsManagerProvider: (one of)

  • SecretsManagerProvider.azure_kv
  • SecretsManagerProvider.managed_azure_kv

Environment variables:

  • AZURE_CLIENT_ID
  • AZURE_CLIENT_SECRET
  • AZURE_TENANT_ID
  • AZURE_KEY_VAULT_NAME

OpenMetadata's ingestion extras: gcp (e.g., pip install 'openmetadata-ingestion[gcp]')

SecretsManagerProvider: SecretsManagerProvider.gcp

Environment variables:

  • GOOGLE_APPLICATION_CREDENTIALS: path to the credentials JSON file
  • GCP_PROJECT_ID
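
Before initializing the secrets manager, it can help to check that the environment variables listed above are set. The sketch below is a preflight check using only the standard library. `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names, the dictionary keys simply mirror the extras names (aws, azure, gcp), and environments that authenticate through IAM roles or service principals may not need every variable.

```python
import os

# Variable names taken from the provider sections above.
REQUIRED_ENV_VARS = {
    "aws": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION"],
    "azure": [
        "AZURE_CLIENT_ID",
        "AZURE_CLIENT_SECRET",
        "AZURE_TENANT_ID",
        "AZURE_KEY_VAULT_NAME",
    ],
    "gcp": ["GOOGLE_APPLICATION_CREDENTIALS", "GCP_PROJECT_ID"],
}


def missing_env_vars(provider: str, environ=None) -> list:
    """Return the required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS[provider] if not env.get(name)]


# Example with an explicit mapping instead of the real process environment.
example_env = {"AWS_ACCESS_KEY_ID": "placeholder", "AWS_DEFAULT_REGION": "us-east-1"}
missing = missing_env_vars("aws", example_env)
print(missing)
```

Calling `missing_env_vars("aws")` without the second argument checks the real process environment.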

Cause: Secrets manager not initialized or misconfigured

Solution: Ensure SecretsManagerFactory is initialized before calling configure() or creating the TestRunner

Cause: Insufficient permissions to access secrets

Solution:

  • Verify IAM role/service principal has correct permissions
  • Check credentials are valid and not expired
  • Ensure correct region/vault name is specified

Cause: Missing dependencies for your secrets manager

Solution: Install required extras:

Cause: Credentials not properly decrypted or secrets manager misconfigured

Solution:

  1. Verify secrets manager provider matches your OpenMetadata backend configuration
  2. Test credential access independently (e.g., using AWS CLI, Azure CLI, gcloud)
  3. Check network connectivity to secrets manager service
  4. Enable debug logging to see detailed error messages:
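
Within the SDK, the log_level parameter of setup() described earlier is the documented way to control verbosity. As a standard-library fallback, the sketch below raises a logger to DEBUG with Python's logging module. The logger name "metadata" is an assumption about the namespace the ingestion package logs under; adjust it if your version uses a different name.

```python
import logging


def enable_debug_logging(logger_name: str = "metadata") -> logging.Logger:
    """Attach a basic console handler and set the named logger to DEBUG."""
    # basicConfig does nothing if the root logger already has handlers.
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s: %(message)s")
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    return logger


logger = enable_debug_logging()
logger.debug("debug logging enabled")
```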

If you're unsure about:

  • Which secrets manager your organization uses
  • Required environment variables or configuration
  • Access credentials or IAM roles
  • Permissions needed

Contact your OpenMetadata or Collate administrator for the specific configuration required in your environment.