Advanced Usage
This guide covers advanced patterns and configurations for Data Quality as Code, including loading tests from YAML files, customizing workflow configurations, and integrating with production systems.
Loading Tests from YAML
You can load test definitions from YAML workflow files, enabling version-controlled test configurations:
Basic YAML Loading
Using OpenMetadata Connection from YAML
By default, from_yaml() uses the connection configured via configure(). To use the connection from the YAML file:
YAML File Structure
A complete YAML configuration includes:
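As a rough illustration, a configuration can be sanity-checked after it is parsed and before it is handed to the runner. The sketch below assumes the usual OpenMetadata workflow layout with top-level `source`, `processor`, and `workflowConfig` sections. The helper name `missing_sections` and every value in `example_config` are hypothetical placeholders, not part of the SDK.

```python
from typing import Any, List, Mapping

# Assumed top-level sections of a workflow file; adjust to match your own configs.
REQUIRED_SECTIONS = ("source", "processor", "workflowConfig")


def missing_sections(config: Mapping[str, Any]) -> List[str]:
    """Return the required top-level sections that are absent from a parsed config."""
    return [section for section in REQUIRED_SECTIONS if section not in config]


# A parsed YAML file is just nested dicts and lists; these values are placeholders.
example_config = {
    "source": {"type": "postgres", "serviceName": "my_service"},
    "processor": {"type": "orm-test-runner", "config": {"testCases": []}},
    "workflowConfig": {
        "openMetadataServerConfig": {"hostPort": "http://localhost:8585/api"}
    },
}

problems = missing_sections(example_config)
if problems:
    raise ValueError(f"Workflow config is missing sections: {problems}")
```

Failing fast on a malformed file gives a clearer error than one raised midway through a test run.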
Advanced TestRunner Configuration
Customizing Workflow Behavior
Accessing Test Definitions
Inspect configured tests before running:
Publishing Results to OpenMetadata
Results can be published back to OpenMetadata for tracking, alerting, and visualization:
DataFrame Validation Results
Benefits of Publishing Results
- Historical tracking: View trends over time
- Alerting: Trigger notifications on failures
- Dashboards: Centralized data quality monitoring
- Collaboration: Share results across teams
- Compliance: Maintain audit trails
Error Handling and Retries
Wrap test execution in explicit error handling, and retry only failures that are likely to be transient:
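One way to approach this is a small retry wrapper with exponential backoff. This is a minimal sketch, not the SDK's own mechanism: `run_with_retries` is a hypothetical helper, and `run` stands in for whatever callable executes your tests. It assumes transient problems surface as `ConnectionError` or `TimeoutError`; adjust `retry_on` to the exceptions your environment actually raises.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def run_with_retries(
    run: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `run` until it succeeds or `attempts` calls have been made."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return run()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the failure to the caller
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Exceptions outside `retry_on` propagate immediately, so configuration mistakes and genuine test errors are not hidden behind retries.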
Dynamic Test Generation
Generate tests programmatically based on metadata:
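The idea can be sketched without any SDK calls: derive a list of test specifications from column metadata, then translate each specification into whatever test objects your runner expects. `ColumnInfo`, `ColumnTestSpec`, and `generate_column_tests` are hypothetical names for illustration. The `definition` strings are assumed to match entries in the Test Definitions Reference, so verify them there before use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ColumnInfo:
    """Minimal column metadata; in practice this would come from your catalog."""
    name: str
    nullable: bool = True
    unique: bool = False


@dataclass
class ColumnTestSpec:
    """A plain description of one test, independent of any runner API."""
    name: str
    definition: str
    column: str


def generate_column_tests(table: str, columns: List[ColumnInfo]) -> List[ColumnTestSpec]:
    """Emit a not-null test for non-nullable columns and a uniqueness test for unique ones."""
    specs: List[ColumnTestSpec] = []
    for col in columns:
        if not col.nullable:
            specs.append(
                ColumnTestSpec(f"{table}_{col.name}_not_null", "columnValuesToBeNotNull", col.name)
            )
        if col.unique:
            specs.append(
                ColumnTestSpec(f"{table}_{col.name}_unique", "columnValuesToBeUnique", col.name)
            )
    return specs


columns = [
    ColumnInfo("id", nullable=False, unique=True),
    ColumnInfo("email", nullable=False),
    ColumnInfo("nickname"),
]
specs = generate_column_tests("users", columns)
```

Keeping the generated specifications as plain data makes them easy to review, diff, and unit test before anything is executed.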
Multi-Table Validation
Validate multiple tables in a workflow:
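A minimal sketch of the pattern: loop over the tables, run each one independently, and collect a status per table so that one failure does not hide the others. `validate_tables` is a hypothetical helper, and `run_table_tests` is a stand-in for your own function that builds and runs the tests for one table and returns `True` when they all pass.

```python
from typing import Callable, Dict, Iterable


def validate_tables(
    tables: Iterable[str],
    run_table_tests: Callable[[str], bool],
) -> Dict[str, str]:
    """Return a status per table: 'passed', 'failed', or 'error: <message>'."""
    statuses: Dict[str, str] = {}
    for fqn in tables:
        try:
            statuses[fqn] = "passed" if run_table_tests(fqn) else "failed"
        except Exception as exc:
            # Record the error and continue so the remaining tables are still validated.
            statuses[fqn] = f"error: {exc}"
    return statuses


def all_passed(statuses: Dict[str, str]) -> bool:
    """True only if every table reported 'passed'."""
    return all(status == "passed" for status in statuses.values())
```

A pipeline step can then fail the job when `all_passed(statuses)` is false while still logging the full per-table breakdown.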
Best Practices Summary
- Version control test configurations: Store YAML configs in git
- Use environment variables: Never hardcode credentials
- Implement retries: Handle transient failures gracefully
- Publish results: Enable tracking and alerting in OpenMetadata
- Monitor execution: Track metrics such as duration and pass/fail counts for each test run
- Handle errors explicitly: Don't silently swallow failures
- Document tests: Use descriptive names and descriptions
- Validate incrementally: Test early and often in pipelines
- Separate concerns: Let data stewards define tests, engineers execute them
- Test your tests: Ensure test definitions are correct
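The environment-variable practice above can be sketched as follows. The variable names `OPENMETADATA_HOST` and `OPENMETADATA_JWT_TOKEN` and the helper `load_connection_settings` are assumptions made for illustration; use whatever names your deployment defines.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names; align these with your own deployment.
REQUIRED_VARS = ("OPENMETADATA_HOST", "OPENMETADATA_JWT_TOKEN")


def load_connection_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read connection settings from the environment, failing loudly if any are missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "host": env["OPENMETADATA_HOST"],
        "jwt_token": env["OPENMETADATA_JWT_TOKEN"],
    }
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.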
Next Steps
- Review the Test Definitions Reference
- Learn about TestRunner
- Explore DataFrame Validation
- Return to Data Quality as Code Overview
- Check out our Examples and Tutorials