TestRunner - Running Table-Level Tests
The TestRunner class provides a fluent API for executing data quality tests against tables cataloged in OpenMetadata. It automatically fetches table metadata and service connections, allowing you to run tests with minimal configuration.
Table of contents
- Overview
- Basic Usage
- Complete Example
- Running Tests from OpenMetadata UI
- Customizing Test Metadata
- Configuring Row Count Computation
- Test Runner Configuration
- Understanding Test Results
- Integration with ETL Workflows
- Error Handling
- Best Practices
- Next Steps
⚠️ If you’re using OpenMetadata Cloud to run OpenMetadata, please refer to External Secrets Managers before using the TestRunner API.
Overview
TestRunner enables you to:
- Execute tests defined in code against cataloged tables
- Run tests previously configured in the OpenMetadata UI
- Load test definitions from YAML workflow files
- Validate data at the table and column levels
- Get detailed test results for programmatic handling
Basic Usage
Creating a TestRunner
Create a runner for a specific table using its fully qualified name (FQN), which has the form {service}.{database}.{schema}.{table}.
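As a small illustration of the FQN format, the helper below joins the four parts with dots. Note that `build_table_fqn` is a hypothetical helper written for this page, not part of the OpenMetadata SDK, and the example values are made up. It assumes that no part contains a dot; names that do would need special handling that this sketch does not attempt.

```python
def build_table_fqn(service: str, database: str, schema: str, table: str) -> str:
    """Join four parts into {service}.{database}.{schema}.{table}.

    Hypothetical helper for illustration; not an OpenMetadata SDK function.
    """
    parts = [service, database, schema, table]
    for part in parts:
        # Reject empty parts and parts containing dots, which this
        # simple sketch cannot represent unambiguously.
        if not part or "." in part:
            raise ValueError(f"unsupported FQN part: {part!r}")
    return ".".join(parts)


# Example values are invented for illustration.
fqn = build_table_fqn("MySQL", "default", "ecommerce", "customers")
print(fqn)
```

Building the string in one place makes it harder to drop or reorder a part by accident when the same table is referenced from several pipelines.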
Adding Tests
Add test definitions to the runner:
Adding Multiple Tests
Use add_tests() to add several tests at once:
Running Tests
Execute all configured tests:
Complete Example
Here’s a complete example of testing a customer table:
Running Tests from OpenMetadata UI
Instead of defining tests in code, you can run tests that data stewards have configured in the OpenMetadata UI. This enables a collaborative workflow where:
- Data stewards define and maintain test criteria in the UI
- Engineers execute those tests automatically in pipelines
- Test definitions stay synchronized with business requirements
- Engineers don’t need to modify code when test criteria change
- All stakeholders own data quality
Customizing Test Metadata
You can customize test names, display names, and descriptions:
Configuring Row Count Computation
Some tests support computing the number and percentage of rows that passed or failed. This is useful for:
- Identifying the scope of data quality issues
- Prioritizing remediation efforts
- Tracking data quality trends over time
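The arithmetic behind the row-level figures can be sketched as follows. The `RowCounts` class and its field names are hypothetical and only illustrate how passed and failed row counts turn into percentages; the SDK's own result objects may expose these values differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RowCounts:
    """Hypothetical container for row-level outcomes of a single test."""

    passed_rows: int
    failed_rows: int

    @property
    def total_rows(self) -> int:
        return self.passed_rows + self.failed_rows

    @property
    def passed_percentage(self) -> float:
        # An empty table has no rows to score, so report 0.0 instead of dividing by zero.
        if self.total_rows == 0:
            return 0.0
        return 100.0 * self.passed_rows / self.total_rows

    @property
    def failed_percentage(self) -> float:
        if self.total_rows == 0:
            return 0.0
        return 100.0 * self.failed_rows / self.total_rows


# Invented counts: 950 rows passed and 50 rows failed.
counts = RowCounts(passed_rows=950, failed_rows=50)
print(counts.total_rows, counts.passed_percentage, counts.failed_percentage)
```

A failed percentage recorded per run is what makes it possible to compare the scope of an issue across runs and to rank which tables to fix first.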
Test Runner Configuration
Customize the test runner behavior using the setup() method:
Configuration Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| force_test_update | bool | False | Force update even if tests already exist |
| log_level | LogLevels | INFO | Logging level (DEBUG, INFO, WARN, ERROR) |
| raise_on_error | bool | False | Raise exceptions if test data already exists |
| success_threshold | int | 90 | Percentage threshold for overall success |
| enable_streamable_logs | bool | False | Enable streamable log output |
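To make the defaults in the table concrete, the sketch below mirrors them in a local dataclass and shows one plausible reading of success_threshold. `RunnerSettings` and `meets_success_threshold` are hypothetical names, not SDK types, and the interpretation used here (the share of passing tests must reach the threshold) is an assumption rather than confirmed SDK behavior.

```python
from dataclasses import dataclass


@dataclass
class RunnerSettings:
    """Local stand-in that mirrors the parameter table; not the SDK's own type."""

    force_test_update: bool = False
    log_level: str = "INFO"  # the table lists type LogLevels; a plain string stands in here
    raise_on_error: bool = False
    success_threshold: int = 90
    enable_streamable_logs: bool = False


def meets_success_threshold(passed_tests: int, total_tests: int, settings: RunnerSettings) -> bool:
    """Assumed reading: the percentage of passing tests must reach the threshold."""
    if total_tests == 0:
        # Design choice for this sketch: a run with no tests is not counted as a success.
        return False
    return 100.0 * passed_tests / total_tests >= settings.success_threshold


defaults = RunnerSettings()
print(meets_success_threshold(9, 10, defaults))
print(meets_success_threshold(8, 10, defaults))
```

Under this reading, a stricter pipeline would raise success_threshold toward 100, while an exploratory run could lower it.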
Understanding Test Results
Test results contain detailed information about test execution:
Test Status Values
- Success: Test passed all validation criteria
- Failed: Test did not meet validation criteria
- Aborted: Test execution was interrupted or could not complete
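The three status values lend themselves to a simple tally. In the sketch below, the `Status` enum is a local stand-in that reuses the names listed above; how the SDK itself represents statuses is not assumed, and the sample list is invented.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable


class Status(Enum):
    """Local stand-in for the three documented status values."""

    SUCCESS = "Success"
    FAILED = "Failed"
    ABORTED = "Aborted"


def summarize(statuses: Iterable[Status]) -> Dict[str, int]:
    """Count how many tests ended in each status, including zero counts."""
    counts = Counter(statuses)
    return {status.value: counts.get(status, 0) for status in Status}


# Invented outcomes for three tests.
summary = summarize([Status.SUCCESS, Status.SUCCESS, Status.FAILED])
print(summary)
```

Keeping Aborted separate from Failed is useful in practice: an aborted test points to an execution problem, while a failed test points to the data.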
Integration with ETL Workflows
Integrate TestRunner into your ETL pipelines:
Error Handling
Handle potential errors gracefully:
Best Practices
- Use descriptive test names: Make test failures easy to understand
- Leverage UI-defined tests: Let data stewards define test criteria
- Handle results programmatically: Don’t just print; take action
- Use appropriate thresholds: Set realistic min/max values based on data patterns
- Combine table and column tests: Ensure both structural and content quality
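The "handle results programmatically" practice can be sketched as a quality gate that stops a pipeline step when any test did not succeed. Everything here is hypothetical: `DataQualityError`, `enforce_quality_gate`, the test names, and the plain mapping of test name to status string, which stands in for whatever result objects the runner returns.

```python
from typing import Mapping


class DataQualityError(RuntimeError):
    """Raised by this sketch when one or more tests block the pipeline."""


def enforce_quality_gate(outcomes: Mapping[str, str]) -> None:
    """Raise if any test ended in a status other than Success.

    `outcomes` maps a test name to its status string; this shape is an
    assumption made for the example.
    """
    blocking = sorted(name for name, status in outcomes.items() if status != "Success")
    if blocking:
        raise DataQualityError("Blocking tests: " + ", ".join(blocking))


# Invented outcomes: one test failed, so the gate raises.
try:
    enforce_quality_gate({"row_count_in_range": "Success", "email_not_null": "Failed"})
except DataQualityError as error:
    message = str(error)
    print(message)
```

Raising a dedicated exception type lets an orchestrator mark the task as failed and skip downstream loads, while descriptive test names make the error message readable on its own.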
If your organization uses an external secrets manager (AWS, Azure, GCP), see External Secrets Managers before using the TestRunner API.
Next Steps
- Learn about DataFrame Validation for validating transformations
- Review the Test Definitions Reference for all available tests
- Explore Advanced Usage including YAML workflows