Deploy OpenMetadata on AWS with Terraform

The OpenMetadata Terraform module for AWS deploys OpenMetadata and all its dependencies on an existing EKS cluster. Each component (database and search engine) can be independently configured using one of three provisioners: deploy it inside the cluster via Helm, provision a managed AWS service, or connect to an existing resource you already operate.

Prerequisites

Before using this module, ensure you have:
  • Terraform ~> 1.0
  • An existing EKS cluster with kubectl configured to access it
  • Helm and Kubernetes Terraform providers configured to point to your cluster
  • AWS provider ~> 6.0 with permissions to create the resources required by your chosen provisioners (see IAM permissions below)
The module manages OpenMetadata and its dependencies only (it does not create the EKS cluster, VPC, or node groups). See the complete example for a reference that provisions the full AWS infrastructure from scratch.

IAM Permissions

The following permissions are required depending on which provisioners you use:
Setting | Required AWS permissions
db.provisioner = "aws" | RDS: create/manage DB instances, subnet groups, security groups
opensearch.provisioner = "aws" | OpenSearch Service: create/manage domains, security groups
kms_key_id | KMS: use the specified key for encryption
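
As a rough illustration of the table above, the permissions could be expressed as an IAM policy for the identity that runs Terraform. This is a sketch, not the module's documented minimum: the action list, the policy name omd-terraform-provisioner, and the wildcard resources are assumptions, and the KMS ARN is the placeholder used elsewhere on this page. Expect to adjust the actions after a first terraform plan against your account.

```hcl
# Illustrative policy only. The exact actions the module needs are an assumption.
data "aws_iam_policy_document" "omd_terraform" {
  # Relevant when db.provisioner = "aws"
  statement {
    sid = "ManageRds"
    actions = [
      "rds:CreateDBInstance",
      "rds:ModifyDBInstance",
      "rds:DeleteDBInstance",
      "rds:DescribeDBInstances",
      "rds:CreateDBSubnetGroup",
      "rds:DeleteDBSubnetGroup",
      "rds:DescribeDBSubnetGroups",
      "rds:AddTagsToResource",
      "rds:ListTagsForResource",
    ]
    resources = ["*"]
  }

  # Relevant when opensearch.provisioner = "aws"
  statement {
    sid = "ManageOpenSearch"
    actions = [
      "es:CreateDomain",
      "es:UpdateDomainConfig",
      "es:DeleteDomain",
      "es:DescribeDomain",
      "es:AddTags",
      "es:ListTags",
    ]
    resources = ["*"]
  }

  # Security groups that let EKS nodes reach RDS and OpenSearch
  statement {
    sid = "ManageSecurityGroups"
    actions = [
      "ec2:CreateSecurityGroup",
      "ec2:DeleteSecurityGroup",
      "ec2:DescribeSecurityGroups",
      "ec2:AuthorizeSecurityGroupIngress",
      "ec2:RevokeSecurityGroupIngress",
      "ec2:CreateTags",
    ]
    resources = ["*"]
  }

  # Relevant when kms_key_id is set; scoped to the placeholder key ARN
  statement {
    sid = "UseKmsKey"
    actions = [
      "kms:DescribeKey",
      "kms:CreateGrant",
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:GenerateDataKey",
    ]
    resources = ["arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"]
  }
}

resource "aws_iam_policy" "omd_terraform" {
  name   = "omd-terraform-provisioner" # hypothetical name
  policy = data.aws_iam_policy_document.omd_terraform.json
}
```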

Provider Configuration

Your Terraform configuration must include the AWS, Kubernetes, and Helm providers:
provider "aws" {
  region = "us-east-1"
}

provider "kubernetes" {
  host                   = aws_eks_cluster.this.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.this.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.this.token
}

provider "helm" {
  kubernetes {
    host                   = aws_eks_cluster.this.endpoint
    cluster_ca_certificate = base64decode(aws_eks_cluster.this.certificate_authority[0].data)
    token                  = data.aws_eks_cluster_auth.this.token
  }
}
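
The provider blocks above reference aws_eks_cluster.this and data.aws_eks_cluster_auth.this without defining them. Because the module expects an existing cluster, one way to supply those values is through data sources. The following is a minimal sketch under that assumption: the cluster name my-eks-cluster is hypothetical, and the Kubernetes and Helm providers are left unpinned because this page does not state their required versions.

```hcl
terraform {
  required_version = "~> 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
    # Version constraints omitted on purpose; check the module's requirements.
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
    helm = {
      source = "hashicorp/helm"
    }
  }
}

# Look up the existing cluster instead of creating it.
data "aws_eks_cluster" "this" {
  name = "my-eks-cluster" # hypothetical cluster name
}

# Short-lived token used by the kubernetes and helm providers.
data "aws_eks_cluster_auth" "this" {
  name = data.aws_eks_cluster.this.name
}
```

With this approach, the provider blocks would read the endpoint and CA certificate from data.aws_eks_cluster.this rather than aws_eks_cluster.this. Also note that the nested kubernetes { ... } block is Helm provider 2.x syntax; Helm provider 3.x expects kubernetes = { ... } as an attribute, so match the form to the provider version you install.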

Choosing a Provisioner

Each component supports a different set of provisioners. Mix and match to fit your infrastructure:
Component | helm | aws | existing
OpenMetadata | Yes | N/A | N/A
OpenMetadata database | Yes | Yes | Yes
OpenSearch | Yes | Yes | Yes

Provisioner | When to use
helm | Development, testing, or when you want everything self-contained inside the cluster.
aws | Production. Creates a managed AWS resource (RDS or OpenSearch Service) with high availability, automated backups, and encryption.
existing | You already have a database or search engine running. The module connects OpenMetadata to it without creating anything new.
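
To make "mix and match" concrete, here is a sketch that pairs a managed database with an in-cluster search engine. It assumes that "helm" is accepted as an explicit provisioner value (this page only shows it as the default) and reuses the placeholder IDs from the other examples.

```hcl
module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"

  # Managed RDS instance for the OpenMetadata database
  db = {
    provisioner = "aws"
  }

  # OpenSearch stays inside the cluster (assumed explicit form of the default)
  opensearch = {
    provisioner = "helm"
  }
}
```

A split like this can keep durable state on a managed service while avoiding the cost of an OpenSearch Service domain in a staging environment.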

Quick Start - Helm

The simplest deployment runs all components inside your cluster via Helm. It is suitable for development and evaluation:
module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"
}

1. Initialize Terraform

terraform init

2. Review the plan

terraform plan

3. Apply

terraform apply
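
The hardcoded VPC, subnet, and security group IDs in the quick start could instead be read from the existing cluster. This is a sketch under several assumptions: the variable cluster_name is hypothetical, the cluster's own subnets are suitable for the module, and the nodes use the EKS cluster security group (the default for managed node groups, but not guaranteed with custom launch templates).

```hcl
variable "cluster_name" {
  description = "Name of the existing EKS cluster (hypothetical input)"
  type        = string
}

data "aws_eks_cluster" "target" {
  name = var.cluster_name
}

locals {
  # vpc_config is a single-element list on the EKS cluster data source
  cluster_vpc = data.aws_eks_cluster.target.vpc_config[0]
}

module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace = "openmetadata"
  vpc_id        = local.cluster_vpc.vpc_id
  # subnet_ids is a set on the data source, so convert it to a list
  subnet_ids = tolist(local.cluster_vpc.subnet_ids)
  # Assumes nodes are attached to the cluster security group
  eks_nodes_sg_ids = [local.cluster_vpc.cluster_security_group_id]
}
```

If your node groups use their own security groups or only a subset of subnets, pass those IDs explicitly instead.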

Production Deployment - AWS Managed Services

Use the aws provisioner for the database and OpenSearch to get production-grade infrastructure. This creates:
  • RDS PostgreSQL instance (Multi-AZ, db.t4g.medium) for OpenMetadata
  • OpenSearch Service domain (2 nodes, t3.small.search) for search
  • Security groups allowing traffic from your EKS nodes to each resource
  • Kubernetes secrets with auto-generated credentials in your application namespace
The aws provisioner creates billable AWS resources. Run terraform destroy when you no longer need them.
module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  kms_key_id       = "arn:aws:kms:us-east-1:123456789012:key/12345678-1234-1234-1234-123456789012"
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"

  db = {
    provisioner = "aws"
  }

  opensearch = {
    provisioner = "aws"
  }
}
Credentials for RDS and OpenSearch are generated automatically and stored as Kubernetes secrets in your application namespace. You do not need to manage passwords manually.
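
If you do not already have a KMS key, one option is to create it in the same configuration and pass its ARN to kms_key_id. The sketch below assumes a key with default policy is acceptable; the resource name omd, the description, and the rotation and deletion settings are illustrative choices, not module requirements.

```hcl
# Customer-managed key for the resources created by the aws provisioner
resource "aws_kms_key" "omd" {
  description             = "Encryption key for OpenMetadata RDS and OpenSearch"
  enable_key_rotation     = true
  deletion_window_in_days = 30
}

module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  kms_key_id       = aws_kms_key.omd.arn # replaces the hardcoded ARN
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"

  db = {
    provisioner = "aws"
  }

  opensearch = {
    provisioner = "aws"
  }
}
```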

Customizing AWS Resources

Override the defaults for any AWS-managed resource using the aws sub-object:
db = {
  provisioner = "aws"
  aws = {
    instance_class          = "db.t4g.large"
    multi_az                = true
    backup_retention_period = 14
    deletion_protection     = true
    skip_final_snapshot     = false
  }
}

opensearch = {
  provisioner = "aws"
  aws = {
    instance_type           = "m6g.large.search"
    instance_count          = 3
    availability_zone_count = 3
    engine_version          = "OpenSearch_3.3"
  }
}

Bring Your Own Infrastructure - Existing

Connect OpenMetadata to a database and search engine you already operate. No new AWS resources are created:
module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace = "openmetadata"

  db = {
    provisioner = "existing"
    host        = "omd-db.postgres.example"
    port        = 5432
    db_name     = "openmetadata_db"
    engine = {
      name = "postgres"
    }
    credentials = {
      username = "dbadmin"
      password = {
        secret_ref = "db-secrets"
        secret_key = "password"
      }
    }
  }

  opensearch = {
    provisioner = "existing"
    host        = "opensearch.example"
    port        = "443"
    scheme      = "https"
  }
}
In the password block, secret_ref names a Kubernetes secret and secret_key names the key inside it that holds the password. The secret must already exist in your application namespace before you run terraform apply.
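
One way to satisfy that requirement is to create the secret with the Kubernetes provider. This is a sketch that assumes the openmetadata namespace already exists and that a hypothetical db_password variable supplies the value. Keep in mind that Terraform stores the value in state, so you may prefer to create the secret outside Terraform.

```hcl
variable "db_password" {
  description = "Password for the existing database user (hypothetical input)"
  type        = string
  sensitive   = true
}

# Matches secret_ref = "db-secrets" and secret_key = "password" in the example
resource "kubernetes_secret_v1" "db" {
  metadata {
    name      = "db-secrets"
    namespace = "openmetadata" # assumed to exist already
  }

  # The provider base64-encodes these values when writing the secret
  data = {
    password = var.db_password
  }
}
```

If the secret and the module live in the same configuration, adding depends_on = [kubernetes_secret_v1.db] to the module block would make the ordering explicit.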

Kubernetes Orchestrator (No Airflow)

This is the default mode. The module deploys OpenMetadata without Airflow and configures it to run ingestion pipelines as native Kubernetes Jobs via the OMJob operator. No extra configuration is needed:
module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"
}
No Airflow deployment, Airflow database, or EFS volumes are created. OpenMetadata is configured automatically to use the OMJob operator:
  • pipelineServiceClientConfig.type is set to k8s
  • pipelineServiceClientConfig.k8s.useOMJobOperator is set to true
  • omjobOperator.enabled is set to true
The OMJob operator installs Custom Resource Definitions (CRDs) on your cluster, which requires elevated permissions during the first terraform apply.

Advanced Configuration

Extra Environment Variables

Inject arbitrary environment variables into the OpenMetadata pod:
extra_envs = {
  "ELASTICSEARCH_BATCH_SIZE"         = "250"
  "PIPELINE_SERVICE_IP_INFO_ENABLED" = "false"
}
Or load them from an existing Kubernetes secret:
env_from = ["my-app-secrets", "another-secret"]
Both can be used together. Secrets listed in env_from are applied before extra_envs, so a value set in extra_envs overrides a same-named key from a secret.
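
The override behavior described above can be sketched as follows. The secret contents are made up for illustration, and the resulting values noted in the comments simply restate the precedence rule from this page rather than a tested result.

```hcl
# Hypothetical secret providing baseline environment values
resource "kubernetes_secret_v1" "app_env" {
  metadata {
    name      = "my-app-secrets"
    namespace = "openmetadata" # assumed to exist already
  }

  data = {
    ELASTICSEARCH_BATCH_SIZE         = "100"
    PIPELINE_SERVICE_IP_INFO_ENABLED = "false"
  }
}

module "omd" {
  source  = "open-metadata/openmetadata/aws"
  version = "1.13"

  app_namespace    = "openmetadata"
  eks_nodes_sg_ids = ["sg-1234abcd5678efgh"]
  subnet_ids       = ["subnet-1a2b3c4d", "subnet-5e6f7g8h", "subnet-9i0j1k2l"]
  vpc_id           = "vpc-1a2b3c4d"

  # Loaded first: every key in the secret becomes an environment variable
  env_from = ["my-app-secrets"]

  # Applied second: per the rule above, this should win over the secret's "100",
  # while PIPELINE_SERVICE_IP_INFO_ENABLED keeps the secret's value
  extra_envs = {
    "ELASTICSEARCH_BATCH_SIZE" = "250"
  }
}
```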

Overriding Helm Values

Pass arbitrary values to any Helm chart using the *_helm_values variables. These are merged on top of the values generated by the module, so they can override defaults or configure options not exposed as Terraform variables:
Variable | Helm chart
openmetadata_helm_values | OpenMetadata
opensearch_helm_values | OpenSearch (inside the deps chart)
openmetadata_helm_values = {
  "replicaCount" = "2"
}

Accessing Your Deployment

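Forward the OpenMetadata service port to your local machine, replacing <app_namespace> with the namespace you deployed to:
kubectl port-forward service/openmetadata 8585:8585 -n <app_namespace>
Then open http://localhost:8585 in your browser.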
Keep the terminal session with kubectl port-forward open while accessing OpenMetadata. If port 8585 is already in use on your machine, change the local port number (the first number in local:remote, e.g. 9585:8585).

Complete AWS Example

The complete example provisions a full AWS environment from scratch, including:
  • VPC with public/private subnets, Internet Gateway, and NAT Gateway
  • EKS cluster with EBS and EFS CSI driver addons
  • KMS key for encrypting all resources
  • RDS instance for OpenMetadata (Multi-AZ, deletion protection enabled)
  • OpenSearch domain with a security group allowing inbound traffic from EKS nodes
  • Kubernetes namespace, storage classes, and secrets
It is a good reference for production deployments and for understanding how to wire together the AWS, Kubernetes, and Helm providers.

Next Steps

  • Kubernetes Orchestrator: Run ingestion pipelines as native Kubernetes Jobs
  • EKS Deployment Guide: Manual Helm-based deployment on Amazon EKS
  • Helm Values Reference: Full reference for OpenMetadata Helm chart values
  • Secrets Manager: Store and rotate credentials securely using AWS Secrets Manager