
EKS on Amazon Web Services Deployment

OpenMetadata can be installed and run on Amazon Elastic Kubernetes Service (EKS) using Helm charts. However, some additional configuration is required as a prerequisite.
All the code snippets in this section assume the default Kubernetes namespace. This guide presumes you already have an AWS EKS cluster available.

Prerequisites

AWS Services for Database as RDS and Search Engine as ElasticSearch

We recommend using Amazon RDS and Amazon OpenSearch Service for production deployments. We support:
  • Amazon RDS (MySQL) engine version 8 or higher
  • Amazon RDS (PostgreSQL) engine version 12 or higher
  • Amazon OpenSearch engine version 2.X (up to 2.19)
When using AWS services, set searchType to opensearch in the elasticsearch configuration, regardless of whether the managed service runs ElasticSearch or OpenSearch, as shown in the ElasticSearch configuration example below.
We recommend:
  • Amazon RDS deployed across multiple Availability Zones.
  • Amazon OpenSearch (or ElasticSearch) Service deployed across multiple Availability Zones with a minimum of 2 nodes.
Make sure to increase sort_buffer_size (for MySQL) or work_mem (for PostgreSQL) to the recommended value of 20MB or more using the database parameter group setting. This is especially important when running migrations to prevent Out of Sort Memory Error. You can revert the setting once the migrations are complete.
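As one possible illustration of the parameter group setting described above, the fragment below sketches a custom MySQL parameter group as an AWS CloudFormation resource. CloudFormation is only one way to manage parameter groups and is not prescribed by this guide; the resource name OpenMetadataMySqlParameterGroup and the mysql8.0 family are assumptions to adjust for your environment. The value is expressed in bytes (20 MB = 20971520).

```yaml
# Sketch only: a custom RDS parameter group raising sort_buffer_size to 20 MB.
# Resource name and Family are assumptions; match Family to your engine version.
Resources:
  OpenMetadataMySqlParameterGroup:
    Type: AWS::RDS::DBParameterGroup
    Properties:
      Description: Parameters for the OpenMetadata MySQL database
      Family: mysql8.0
      Parameters:
        # MySQL expects sort_buffer_size in bytes: 20 * 1024 * 1024
        sort_buffer_size: "20971520"
        # For a PostgreSQL family, the equivalent would be work_mem,
        # which is expressed in kB (20 MB = 20480).
```

The parameter group takes effect only once it is associated with the RDS instance. The value can be lowered again after the migrations complete.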
Once the RDS and OpenSearch services are set up, update the Helm values below so that the OpenMetadata Kubernetes deployment can connect to the database and the search engine.
# openmetadata-values.prod.yaml
...
openmetadata:
  config:
    elasticsearch:
      host: <AMAZON_OPENSEARCH_SERVICE_ENDPOINT_WITHOUT_HTTPS>
      searchType: opensearch
      port: 443
      scheme: https
      connectionTimeoutSecs: 5
      socketTimeoutSecs: 60
      keepAliveTimeoutSecs: 600
      batchSize: 10
      auth:
        enabled: true
        username: <AMAZON_OPENSEARCH_USERNAME>
        password:
          secretRef: elasticsearch-secrets
          secretKey: openmetadata-elasticsearch-password
    database:
      host: <AMAZON_RDS_ENDPOINT>
      port: 3306
      driverClass: com.mysql.cj.jdbc.Driver
      dbScheme: mysql
      dbUseSSL: true
      databaseName: <RDS_DATABASE_NAME>
      auth:
        username: <RDS_DATABASE_USERNAME>
        password:
          secretRef: mysql-secrets
          secretKey: openmetadata-mysql-password
  ...
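The example above targets Amazon RDS for MySQL. For Amazon RDS for PostgreSQL, the database block would look roughly like the sketch below. The secret name postgresql-secrets and the key openmetadata-postgresql-password are hypothetical names chosen for illustration; use whatever names your Kubernetes Secret actually has.

```yaml
# openmetadata-values.prod.yaml (PostgreSQL variant, sketch only)
openmetadata:
  config:
    database:
      host: <AMAZON_RDS_ENDPOINT>
      port: 5432                          # default PostgreSQL port
      driverClass: org.postgresql.Driver
      dbScheme: postgresql
      dbUseSSL: true
      databaseName: <RDS_DATABASE_NAME>
      auth:
        username: <RDS_DATABASE_USERNAME>
        password:
          secretRef: postgresql-secrets   # hypothetical Secret name
          secretKey: openmetadata-postgresql-password   # hypothetical key
```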
Make sure to create the RDS and OpenSearch credentials as Kubernetes Secrets, as mentioned here. Also, disable MySQL and ElasticSearch in the OpenMetadata Dependencies Helm chart, as mentioned in the FAQs here.
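A minimal sketch of the two Secrets, written as manifests for the default namespace. The Secret names and keys are the ones referenced by secretRef and secretKey in the values file above; the password placeholders are yours to fill in.

```yaml
# Sketch only: credentials for the values file above, applied with kubectl apply -f
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secrets
  namespace: default
type: Opaque
stringData:
  openmetadata-mysql-password: <RDS_DATABASE_PASSWORD>
---
apiVersion: v1
kind: Secret
metadata:
  name: elasticsearch-secrets
  namespace: default
type: Opaque
stringData:
  openmetadata-elasticsearch-password: <AMAZON_OPENSEARCH_PASSWORD>
```

Disabling the bundled database and search engine is typically done in the values file for the dependencies chart. The top-level keys below are an assumption and can differ between chart versions (for example, the search engine key may be named opensearch), so check the values of the chart version you deploy.

```yaml
# Values for the OpenMetadata Dependencies chart (sketch; key names are assumptions)
mysql:
  enabled: false
elasticsearch:
  enabled: false
```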

Create Elastic File System in AWS

You can follow the official AWS guides here to provision an EFS file system in the same VPC that is associated with your EKS cluster.

Persistent Volumes with ReadWriteMany Access Modes

The OpenMetadata Helm chart depends on Airflow, and Airflow expects a persistent disk that supports ReadWriteMany (the volume can be mounted as read-write by many nodes). In AWS, this is achieved with the Elastic File System (EFS) service. AWS Elastic Block Store (EBS) does not provide the ReadWriteMany volume access mode, because an EBS volume is attached to only one Kubernetes node at any given point in time.
To provision persistent volumes from AWS EFS, you need to set up and install aws-efs-csi-driver. This is required for Airflow, which is one of the OpenMetadata dependencies. aws-ebs-csi-driver might also be required for the persistent volumes used by MySQL and ElasticSearch as OpenMetadata dependencies.
The guide below provisions persistent volumes statically, meaning you are responsible for creating, maintaining, and destroying them.
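To make static provisioning concrete, the sketch below shows the general shape of a ReadWriteMany PersistentVolume backed by EFS, together with a claim bound to it. It assumes aws-efs-csi-driver is already installed. The resource names, the 10Gi size, and the <EFS_FILE_SYSTEM_ID> placeholder are illustrative and are not the names the Airflow setup guide uses.

```yaml
# Sketch only: a statically provisioned EFS volume with ReadWriteMany access
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-efs-pv            # hypothetical name
spec:
  capacity:
    storage: 10Gi                 # required by Kubernetes; EFS itself is elastic
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""            # empty: no dynamic provisioning
  csi:
    driver: efs.csi.aws.com
    volumeHandle: <EFS_FILE_SYSTEM_ID>
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-efs-pvc           # hypothetical name
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: example-efs-pv      # bind to the volume defined above
  resources:
    requests:
      storage: 10Gi
```

Because the volume is static, deleting the claim does not delete the EFS file system. With the Retain policy, cleanup of the data remains a manual step.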
Continue to Airflow on EKS — EFS Storage Setup to provision persistent volumes, configure Airflow dependencies, and deploy OpenMetadata.