Test Definitions Reference
This page provides a complete reference for all data quality test definitions available in the OpenMetadata Python SDK. Tests are organized into two categories: Table-Level Tests and Column-Level Tests.
Importing Test Definitions
All test definitions are available from the metadata.sdk.data_quality module:
Common Parameters
All test definitions support these optional parameters:
| Parameter | Type | Description |
|---|---|---|
| name | str | Unique identifier for the test case |
| display_name | str | Human-readable name shown in the UI |
| description | str | Detailed description of what the test validates |
Column tests additionally require:
| Parameter | Type | Description |
|---|---|---|
| column | str | Name of the column to test (required) |
Table-Level Tests
Table-level tests validate properties of entire tables, such as row counts, column counts, or custom SQL queries.
TableRowCountToBeBetween
Validates that the number of rows in a table falls within a specified range.
Parameters:
- min_count (int, optional): Minimum acceptable number of rows
- max_count (int, optional): Maximum acceptable number of rows
Example:
Use Cases:
- Monitor data volume and detect data loss
- Validate expected data growth patterns
- Detect unexpected data surges
TableRowCountToEqual
Validates that the table has an exact number of rows.
Parameters:
- row_count (int, required): Expected number of rows
Example:
Use Cases:
- Validate fixed-size reference tables
- Ensure complete dimension table loads
- Verify static lookup tables
TableColumnCountToBeBetween
Validates that the number of columns in a table falls within a specified range.
Parameters:
- min_count (int, optional): Minimum acceptable number of columns
- max_count (int, optional): Maximum acceptable number of columns
Example:
Use Cases:
- Schema validation
- Detect unexpected column additions or removals
- Monitor schema evolution
TableColumnCountToEqual
Validates that the table has an exact number of columns.
Parameters:
- column_count (int, required): Expected number of columns
Example:
Use Cases:
- Strict schema validation
- Ensure schema stability
- Prevent schema drift
TableColumnNameToExist
Validates that a specific column exists in the table schema.
Parameters:
- column_name (str, required): Name of the column that must exist
Example:
Use Cases:
- Verify required columns are present
- Ensure critical columns aren't dropped
- Validate schema migrations
TableColumnToMatchSet
Validates that table columns match an expected set of column names.
Parameters:
- column_names (list[str], required): List of expected column names
- ordered (bool, optional): If True, column order must match exactly (default: False)
Example:
Use Cases:
- Validate complete schema structure
- Ensure schema consistency across environments
- Detect unexpected schema changes
TableRowInsertedCountToBeBetween
Validates that the number of rows inserted within a time range is within bounds.
Parameters:
- min_count (int, optional): Minimum acceptable number of inserted rows
- max_count (int, optional): Maximum acceptable number of inserted rows
- range_type (str, optional): Time unit ("HOUR", "DAY", "WEEK", "MONTH") (default: "DAY")
- range_interval (int, optional): Number of time units to look back (default: 1)
Example:
Use Cases:
- Monitor data ingestion rates
- Detect ingestion pipeline failures
- Validate ETL job completions
TableCustomSQLQuery
Validates data using a custom SQL query expression.
Parameters:
- sql_expression (str, required): SQL query to execute
- strategy (str, optional): "ROWS" (count failing rows) or "COUNT" (expect a count) (default: "ROWS")
Example:
Use Cases:
- Implement custom business logic validation
- Validate referential integrity
- Check complex data relationships
TableDiff
Compares two tables and identifies differences in their data.
Parameters:
- table2 (str, required): Fully qualified name of the comparison table
- key_columns (list[str], optional): Columns to use as join keys
- table2_key_columns (list[str], optional): Join key columns from table 2
- use_columns (list[str], optional): Specific columns to compare
- extra_columns (list[str], optional): Additional columns to include in output
- table2_extra_columns (list[str], optional): Additional columns from table 2
Example:
Use Cases:
- Validate data migrations
- Verify data replication
- Compare production vs staging data
Column-Level Tests
Column-level tests validate properties of specific columns.
ColumnValuesToBeNotNull
Validates that a column contains no null or missing values.
Parameters:
- column (str, required): Name of the column to validate
Example:
Use Cases:
- Ensure required fields are populated
- Validate data completeness
- Enforce NOT NULL constraints
ColumnValuesToBeUnique
Validates that all values in a column are unique with no duplicates.
Parameters:
- column (str, required): Name of the column to validate
Example:
Use Cases:
- Validate primary keys
- Ensure unique identifiers
- Detect duplicate records
ColumnValuesToBeInSet
Validates that all values in a column belong to a specified set.
Parameters:
- column (str, required): Name of the column to validate
- allowed_values (list[str], required): List of acceptable values
Example:
Use Cases:
- Validate enum values
- Enforce categorical constraints
- Validate lookup values
ColumnValuesToBeNotInSet
Validates that column values do not contain any forbidden values.
Parameters:
- column (str, required): Name of the column to validate
- forbidden_values (list[str], required): List of values that must not appear
Example:
Use Cases:
- Detect test data in production
- Blacklist invalid values
- Filter out placeholder values
ColumnValuesToMatchRegex
Validates that column values match a specified regular expression pattern.
Parameters:
- column (str, required): Name of the column to validate
- regex (str, required): Regular expression pattern
Example:
Use Cases:
- Validate data format consistency
- Ensure pattern compliance
- Detect malformed data
ColumnValuesToNotMatchRegex
Validates that column values do not match a forbidden regular expression pattern.
Parameters:
- column (str, required): Name of the column to validate
- regex (str, required): Regular expression pattern that values must NOT match
Example:
Use Cases:
- Detect test data patterns
- Prevent specific formats
- Identify security risks
ColumnValuesToBeBetween
Validates that all values in a column fall within a specified numeric range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable value
- max_value (float, optional): Maximum acceptable value
Example:
Use Cases:
- Validate numeric constraints
- Detect outliers
- Ensure value ranges
ColumnValueMaxToBeBetween
Validates that the maximum value in a column falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable maximum value
- max_value (float, optional): Maximum acceptable maximum value
Example:
Use Cases:
- Monitor data ranges
- Detect upper outliers
- Validate maximum constraints
ColumnValueMinToBeBetween
Validates that the minimum value in a column falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable minimum value
- max_value (float, optional): Maximum acceptable minimum value
Example:
Use Cases:
- Monitor lower bounds
- Detect lower outliers
- Validate minimum constraints
ColumnValueMeanToBeBetween
Validates that the mean (average) value falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable mean value
- max_value (float, optional): Maximum acceptable mean value
Example:
Use Cases:
- Statistical validation
- Detect data drift
- Monitor averages
ColumnValueMedianToBeBetween
Validates that the median value falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable median value
- max_value (float, optional): Maximum acceptable median value
Example:
Use Cases:
- Robust central tendency checks
- Detect skewed distributions
- Monitor typical values
ColumnValueStdDevToBeBetween
Validates that the standard deviation falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable standard deviation
- max_value (float, optional): Maximum acceptable standard deviation
Example:
Use Cases:
- Detect unexpected variability
- Monitor data consistency
- Validate distribution stability
ColumnValuesSumToBeBetween
Validates that the sum of all values falls within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_value (float, optional): Minimum acceptable sum
- max_value (float, optional): Maximum acceptable sum
Example:
Use Cases:
- Validate totals
- Monitor aggregates
- Detect unexpected volumes
ColumnValuesMissingCount
Validates the count of missing or null values.
Parameters:
- column (str, required): Name of the column to validate
- missing_count_value (int, optional): Expected number of missing values
- missing_value_match (list[str], optional): Additional strings to treat as missing
Example:
Use Cases:
- Monitor data completeness
- Track missing data patterns
- Validate optional fields
ColumnValueLengthsToBeBetween
Validates that string lengths fall within a specified range.
Parameters:
- column (str, required): Name of the column to validate
- min_length (int, optional): Minimum acceptable string length
- max_length (int, optional): Maximum acceptable string length
Example:
Use Cases:
- Validate string constraints
- Prevent truncation
- Ensure format compliance
ColumnValuesToBeAtExpectedLocation
Validates that a specific value appears at an expected row position.
Parameters:
- column (str, required): Name of the column to validate
- expected_value (str, required): The exact value expected
- row_index (int, optional): Zero-based row position (default: 0)
Example:
Use Cases:
- Validate sorted data
- Check ordered results
- Verify specific positions
Customizing Tests
All tests support customization through fluent methods:
Or pass values directly to the constructor:
Next Steps
- Learn how to use these tests with TestRunner
- Apply tests to DataFrame Validation
- Explore Advanced Usage patterns