Python SDK

The OpenMetadata Python SDK is a typed Python client for the OpenMetadata API. It exposes create, retrieve, list, update, and delete operations on metadata entities so you can call them directly from your Python applications.

Installation

Install the OpenMetadata Python SDK using pip:
pip install openmetadata-ingestion

Quick Start

Basic Connection

from metadata.sdk import configure

# Configure with your host and JWT token
configure(
    host="http://localhost:8585/api",
    jwt_token="your-jwt-token"
)

You can also configure the SDK from the OPENMETADATA_HOST and OPENMETADATA_JWT_TOKEN environment variables:
from metadata.sdk import configure

# Reads OPENMETADATA_HOST and OPENMETADATA_JWT_TOKEN automatically
configure()
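
Because environment-based configuration depends on both variables being set, it can help to check them before calling configure(). The following is a minimal sketch using only the standard library; the helper name missing_openmetadata_vars is hypothetical and not part of the SDK, and only the two variable names come from the text above.

```python
import os
from typing import List, Mapping

# The two variable names come from the text above; the helper itself is hypothetical.
REQUIRED_VARS = ("OPENMETADATA_HOST", "OPENMETADATA_JWT_TOKEN")


def missing_openmetadata_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {"OPENMETADATA_HOST": "http://localhost:8585/api"}
missing = missing_openmetadata_vars(example_env)
print(missing)  # ['OPENMETADATA_JWT_TOKEN']
```

An application could call this helper at startup and raise a clear error when the returned list is not empty, before handing control to configure().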

Working with Entities

from metadata.sdk import Tables, DatabaseServices

# Get all database services
services = list(DatabaseServices.list().auto_paging_iterable())
print(f"Found {len(services)} database services")

# Get a specific table by name
table = Tables.retrieve_by_name("your-service.your-database.your-schema.your-table")

if table:
    print(f"Table: {table.name}")
    print(f"Columns: {len(table.columns) if table.columns else 0}")
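
The name passed to retrieve_by_name above is a fully qualified name made of four dot-separated parts: service, database, schema, and table. As a rough illustration, the sketch below assembles such a name from its parts. The helper build_table_fqn is hypothetical, and wrapping a part that itself contains a dot in double quotes is an assumption about how such names are written; check it against your OpenMetadata version before relying on it.

```python
def quote_part(name: str) -> str:
    """Wrap a name in double quotes if it contains a dot (assumed quoting rule)."""
    return f'"{name}"' if "." in name else name


def build_table_fqn(service: str, database: str, schema: str, table: str) -> str:
    """Join the four parts of a table's fully qualified name with dots."""
    parts = [service, database, schema, table]
    if any(not part for part in parts):
        raise ValueError("every part of the fully qualified name must be non-empty")
    return ".".join(quote_part(part) for part in parts)


fqn = build_table_fqn("mysql_prod", "shop", "public", "orders")
print(fqn)  # mysql_prod.shop.public.orders
```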

Core Functionality

Entity Management

The Python SDK provides full CRUD operations for all OpenMetadata entities:

Create or Update Entities

from metadata.sdk import Tables
from metadata.generated.schema.api.data.createTable import CreateTableRequest

# Create table request
create_table = CreateTableRequest(
    name="sample_table",
    databaseSchema="your_service.your_database.your_schema",  # fully qualified name of the schema
    columns=[
        # Define your columns here
    ],
    description="Sample table created via Python SDK"
)

# Create the table
table = Tables.create(create_table)

Retrieve Entities

from metadata.sdk import Tables

# Get by ID
table = Tables.retrieve("uuid-here")

# Get by fully qualified name
table = Tables.retrieve_by_name("service.database.schema.table")

# Get with specific fields
table = Tables.retrieve_by_name(
    "service.database.schema.table",
    fields=["owners", "tags"]
)

List All Entities

from metadata.sdk import Tables

# Auto-paginating iterator for large datasets
for table in Tables.list().auto_paging_iterable():
    print(f"Processing table: {table.name}")
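
To make the idea of auto-pagination concrete, here is a conceptual sketch of a cursor-based iterator. It is not the SDK's implementation: fetch_page is a hypothetical callable that takes a cursor and returns one page of items plus the cursor for the next page (or None on the last page), and the fake backend exists only for the demonstration.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# One page of results: the items and the cursor of the next page (None when done).
Page = Tuple[List[str], Optional[str]]


def iterate_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield items page by page, requesting the next page only when needed."""
    cursor: Optional[str] = None
    while True:
        items, cursor = fetch_page(cursor)
        yield from items
        if cursor is None:
            break


# Fake two-page backend used only for this demonstration.
PAGES: Dict[Optional[str], Page] = {
    None: (["orders", "customers"], "p2"),
    "p2": (["payments"], None),
}
calls: List[Optional[str]] = []


def fake_fetch(after: Optional[str]) -> Page:
    calls.append(after)
    return PAGES[after]


names = list(iterate_all(fake_fetch))
print(names)  # ['orders', 'customers', 'payments']
```

Because the function is a generator, a caller that stops iterating early never triggers the requests for the remaining pages.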

List with Filters

from metadata.sdk import Tables
from metadata.sdk.entities.tables import TableListParams

# Limit the page size and select which fields to return
params = TableListParams.builder().limit(50).fields(["owners", "tags"]).build()
tables = Tables.list(params)

Update Entities

from metadata.sdk import Tables

table = Tables.retrieve_by_name("service.database.schema.table")
table.description = "Updated description"
updated = Tables.update(str(table.id), table)

Delete Entities

from metadata.sdk import Tables

# Soft delete
Tables.delete("uuid-here")

# Hard delete with recursive removal of children
Tables.delete("uuid-here", recursive=True, hard_delete=True)

Entity References

from metadata.sdk import to_entity_reference, Tables

# Retrieve the entity first, then get a reference
table = Tables.retrieve_by_name("service.database.schema.table")
ref = to_entity_reference(table)

# Use in other entity creation
if ref:
    print(f"Table reference ID: {ref.id}")

Advanced Features

Error Handling

from metadata.ingestion.ometa.client import APIError
from metadata.sdk import Tables

try:
    table = Tables.retrieve("table-id")
except APIError as e:
    if e.status_code == 404:
        print("Table not found")
    elif e.status_code == 401:
        print("Authentication failed")
    else:
        print(f"Error: {e}")
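
When several call sites need the same status-code handling, the branching can be pulled into one function. The sketch below is an assumption-laden illustration: it relies only on the exception carrying a status_code attribute, as in the example above, and uses a stand-in exception class (FakeAPIError, hypothetical) so that it runs without a server.

```python
class FakeAPIError(Exception):
    """Stand-in for an API error that carries a status_code attribute."""

    def __init__(self, status_code: int, message: str = "") -> None:
        super().__init__(message)
        self.status_code = status_code


def describe_error(exc: Exception) -> str:
    """Map an exception with an optional status_code to a short message."""
    status = getattr(exc, "status_code", None)
    if status == 404:
        return "Entity not found"
    if status == 401:
        return "Authentication failed"
    if isinstance(status, int) and status >= 500:
        return "Server error"
    return f"Error: {exc}"


print(describe_error(FakeAPIError(404)))  # Entity not found
print(describe_error(ValueError("bad input")))  # Error: bad input
```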

Common Use Cases

Data Discovery

from metadata.sdk import Tables

# Iterate all tables and filter for a keyword
matching_tables = [
    table for table in Tables.list().auto_paging_iterable()
    if "customer" in table.name.lower()
]

for table in matching_tables:
    print(f"Found customer table: {table.fullyQualifiedName}")
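
The list comprehension above walks every table before returning. When only the first few matches are needed, a lazy filter can stop early, which pairs well with an auto-paginating iterator. The following sketch uses stand-in objects with a plain string name attribute rather than real SDK entities; first_matches is a hypothetical helper.

```python
from itertools import islice
from types import SimpleNamespace
from typing import Iterable, Iterator, List

consumed: List[str] = []


def fake_tables() -> Iterator[SimpleNamespace]:
    """Stand-in for a paginated listing; records which items were consumed."""
    for name in ["customer_orders", "inventory", "Customer_Profile", "customer_archive"]:
        consumed.append(name)
        yield SimpleNamespace(name=name)


def first_matches(tables: Iterable, keyword: str, limit: int) -> list:
    """Return at most `limit` tables whose name contains `keyword`, ignoring case."""
    needle = keyword.lower()
    matches = (table for table in tables if needle in table.name.lower())
    return list(islice(matches, limit))


found = first_matches(fake_tables(), "customer", limit=2)
print([table.name for table in found])  # ['customer_orders', 'Customer_Profile']
print(consumed)  # ['customer_orders', 'inventory', 'Customer_Profile']
```

The last table is never consumed, because islice stops pulling from the generator once the limit is reached.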

Metadata Automation

from metadata.sdk import Tables

# Bulk update table descriptions
for table in Tables.list().auto_paging_iterable():
    if not table.description:
        table.description = f"Production table: {table.name}"
        Tables.update(str(table.id), table)
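
A bulk update like the one above changes many entities in one pass, so it can be worth previewing the changes first. The sketch below computes the proposed descriptions without writing anything. TableStub is a hypothetical stand-in for a retrieved table, and plan_descriptions is not an SDK function.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class TableStub:
    """Minimal stand-in for a table entity with a name and a description."""

    name: str
    description: Optional[str] = None


def plan_descriptions(tables: Iterable[TableStub]) -> Dict[str, str]:
    """Map each table that lacks a description to the description it would get."""
    return {
        table.name: f"Production table: {table.name}"
        for table in tables
        if not table.description
    }


plan = plan_descriptions(
    [TableStub("orders"), TableStub("customers", "Customer master data")]
)
print(plan)  # {'orders': 'Production table: orders'}
```

Once the plan looks right, the same condition can drive the real update loop.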

Lineage Management

from metadata.sdk.api import Lineage

lineage = Lineage.get_lineage(
    "service.database.schema.table",
    upstream_depth=1,
    downstream_depth=1
)

if lineage:
    print(f"Upstream edges: {len(lineage.get('upstreamEdges', []))}")
    print(f"Downstream edges: {len(lineage.get('downstreamEdges', []))}")
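
The number of edges and the number of distinct entities can differ once the depth is greater than 1, because one entity may appear in several edges. The sketch below counts distinct upstream entities from a lineage dictionary. It assumes each edge is a dictionary with fromEntity and toEntity identifiers; that shape is not shown in the example above, so verify it against an actual response. The sample data is invented for illustration.

```python
from typing import Any, Dict, Set


def upstream_entity_ids(lineage: Dict[str, Any]) -> Set[str]:
    """Collect the distinct source identifiers across all upstream edges."""
    return {
        edge["fromEntity"]
        for edge in lineage.get("upstreamEdges", [])
        if "fromEntity" in edge
    }


# Invented sample: id-c feeds both id-a and id-b, which feed the table id-t.
sample = {
    "upstreamEdges": [
        {"fromEntity": "id-a", "toEntity": "id-t"},
        {"fromEntity": "id-b", "toEntity": "id-t"},
        {"fromEntity": "id-c", "toEntity": "id-a"},
        {"fromEntity": "id-c", "toEntity": "id-b"},
    ],
    "downstreamEdges": [],
}
upstream = upstream_entity_ids(sample)
print(len(sample["upstreamEdges"]), len(upstream))  # 4 3
```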

API Reference

The Python SDK's API mirrors the OpenMetadata data model:

Core Classes

  • Entity classes (Tables, Databases, DatabaseSchemas, DatabaseServices, Users, etc.): Static-method interfaces for each entity type — no instantiation required
  • configure(): One-time global setup for host and JWT token
  • to_entity_reference(entity): Convert a retrieved entity to an EntityReference for use in relationships
  • Entity Request Classes: Pydantic-based typed request objects (e.g., CreateTableRequest)

Key Methods

Each entity class exposes the same consistent interface:
  • EntityClass.create(request): Create a new entity
  • EntityClass.retrieve(entity_id): Retrieve entity by UUID
  • EntityClass.retrieve_by_name(fqn, fields=[]): Retrieve entity by fully qualified name
  • EntityClass.list(params=None): List entities (returns a pageable result)
  • EntityClass.list().auto_paging_iterable(): Auto-paginating generator for all entities
  • EntityClass.update(entity_id, entity): Update an existing entity
  • EntityClass.delete(entity_id, recursive=False, hard_delete=False): Delete an entity

Type Safety

The Python SDK is built on generated Pydantic models, providing:
  • Type hints for better IDE support
  • Runtime validation of data structures
  • Auto-completion for entity properties
  • Error prevention through static typing

from metadata.sdk import Tables
from metadata.generated.schema.entity.data.table import Table

# Type-safe retrieval — IDE provides auto-completion and type checking
table: Table = Tables.retrieve_by_name("service.database.schema.table")
if table:
    columns_count: int = len(table.columns) if table.columns else 0

Best Practices

  1. Configure once: Call configure() once at application startup and reuse it globally; there is no need to pass a client object around
  2. Handle errors: Catch APIError around SDK calls so that failures such as 404 and 401 responses are handled explicitly
  3. Paginate: Use .auto_paging_iterable() for large datasets to avoid loading everything into memory at once
  4. Limit fields: Request only the fields you need when fetching entities (e.g., fields=["owners", "tags"])