GKE on Google Cloud Platform Deployment
OpenMetadata supports installing and running the application on Google Kubernetes Engine (GKE) through Helm charts. However, some additional configuration is required as a prerequisite.
Note
Google Kubernetes Engine (GKE) Autopilot mode is not compatible with one of OpenMetadata's dependencies, ElasticSearch. ElasticSearch pods require elevated permissions to run the initContainers that change configuration, which the GKE Autopilot PodSecurityPolicy does not allow.
Note
All the code snippets in this section assume the default namespace for Kubernetes.
Prerequisites
Persistent Volumes with ReadWriteMany Access Modes
The OpenMetadata Helm chart depends on Airflow, and Airflow expects a persistent disk that supports ReadWriteMany (the volume can be mounted as read-write by many nodes).
The workaround is to create an NFS server disk on Google Kubernetes Engine, use it as the persistent volume claim, and deploy OpenMetadata by following the steps below in order.
Create NFS Share
Provision GCP Persistent Disk for Google Kubernetes Engine
Run the command below to create a zonal Compute Engine disk. For more information on disk options, see the Google Cloud disk documentation.
gcloud compute disks create --size=100GB --zone=<zone_id> nfs-disk
Deploy NFS Server in GKE
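As a rough sketch of this step, the snippet below writes a manifest for an NFS server Deployment and a ClusterIP Service. The disk name `nfs-disk` and the service name `nfs-server` come from the commands in this guide. The container image, the `role: nfs-server` label, the file name `nfs-server.yml`, and the export directories are assumptions made for illustration and may need to be adapted to your cluster.

```shell
# Sketch only: writes a manifest for an NFS server backed by the "nfs-disk"
# persistent disk. The image, labels, and export directories are assumptions,
# not values confirmed by this guide.
cat > nfs-server.yml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      initContainers:
        # Pre-create one export directory per Airflow volume (assumed layout).
        - name: init-airflow-directories
          image: busybox
          command: ["sh", "-c", "mkdir -p /exports/airflow-dags /exports/airflow-logs"]
          volumeMounts:
            - name: nfs-disk
              mountPath: /exports
      containers:
        - name: nfs-server
          image: gcr.io/google_containers/volume-nfs:0.8  # assumed image
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - name: nfs-disk
              mountPath: /exports
      volumes:
        - name: nfs-disk
          gcePersistentDisk:
            pdName: nfs-disk
            fsType: ext4
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
spec:
  selector:
    role: nfs-server
  ports:
    - name: nfs
      port: 2049
    - name: mountd
      port: 20048
    - name: rpcbind
      port: 111
EOF
```

The generated manifest could then be created with `kubectl create -f nfs-server.yml`. The privileged security context is the reason this approach would not be expected to work on clusters that restrict elevated permissions.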
Provision NFS backed PV and PVC for Airflow DAGs and Airflow Logs
Replace <NFS_SERVER_CLUSTER_IP> with the cluster IP address of the NFS service in the code snippets for this step.
You can get the cluster IP using the following command:
kubectl get service nfs-server -o jsonpath='{.spec.clusterIP}'
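One possible way to provision the volumes is sketched below: a small helper renders a PersistentVolume and a matching PersistentVolumeClaim for a single NFS export, and is called once for DAGs and once for logs. The claim names `openmetadata-dependencies-dags` and `openmetadata-dependencies-logs` are the ones used elsewhere in this guide. The helper name `render_nfs_volume`, the `-pv` suffix, the 10Gi size, the export paths, and the output file names are hypothetical and chosen only for illustration.

```shell
# Sketch only: renders a PersistentVolume plus a matching PersistentVolumeClaim
# for one NFS export. Sizes, PV names, and export paths are assumptions.
render_nfs_volume() {
  claim_name="$1"   # PVC name, as referenced by the Airflow values
  export_path="$2"  # directory exported by the NFS server (assumed)
  server_ip="$3"    # cluster IP of the nfs-server service
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${claim_name}-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: ${server_ip}
    path: "${export_path}"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${claim_name}
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: ${claim_name}-pv
  resources:
    requests:
      storage: 10Gi
EOF
}

# Falls back to the literal placeholder when the variable is not set.
NFS_SERVER_CLUSTER_IP="${NFS_SERVER_CLUSTER_IP:-<NFS_SERVER_CLUSTER_IP>}"

render_nfs_volume openmetadata-dependencies-dags /airflow-dags "$NFS_SERVER_CLUSTER_IP" > dags_pv_pvc.yml
render_nfs_volume openmetadata-dependencies-logs /airflow-logs "$NFS_SERVER_CLUSTER_IP" > logs_pv_pvc.yml
```

If `NFS_SERVER_CLUSTER_IP` is exported with the output of the `kubectl get service` command before running the snippet, the generated files contain the real address and could be created with `kubectl create -f dags_pv_pvc.yml` and `kubectl create -f logs_pv_pvc.yml`. Setting `storageClassName` to an empty string is intended to keep the claim from being satisfied by a dynamically provisioned disk instead of the NFS volume.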
Change owner and permission manually on disks
Since Airflow pods run as non-root users, they do not have write access to the NFS server volumes. To fix the permissions, spin up a pod with the persistent volumes attached and run it once.
# permissions_pod.yml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: my-permission-pod
  name: my-permission-pod
spec:
  containers:
  - image: nginx
    name: my-permission-pod
    volumeMounts:
    - name: airflow-dags
      mountPath: /airflow-dags
    - name: airflow-logs
      mountPath: /airflow-logs
  volumes:
  - name: airflow-logs
    persistentVolumeClaim:
      claimName: openmetadata-dependencies-logs
  - name: airflow-dags
    persistentVolumeClaim:
      claimName: openmetadata-dependencies-dags
  dnsPolicy: ClusterFirst
  restartPolicy: Always
Note
Airflow runs the pods with the Linux user name airflow and the Linux user ID 50000.
Run the command below to create the pod that will be used to fix the permissions.
kubectl create -f permissions_pod.yml
Once the permissions pod is up and running, execute the commands below within the container.
kubectl exec --tty my-permission-pod --container my-permission-pod -- chown -R 50000 /airflow-dags /airflow-logs
# If needed
kubectl exec --tty my-permission-pod --container my-permission-pod -- chmod -R a+rwx /airflow-dags
Create OpenMetadata dependencies Values
Override the OpenMetadata dependencies Airflow Helm values to bind the NFS persistent volumes for DAGs and logs.
# values-dependencies.yml
airflow:
  airflow:
    extraVolumeMounts:
      - mountPath: /airflow-logs
        name: nfs-airflow-logs
      - mountPath: /airflow-dags/dags
        name: nfs-airflow-dags
    extraVolumes:
      - name: nfs-airflow-logs
        persistentVolumeClaim:
          claimName: openmetadata-dependencies-logs
      - name: nfs-airflow-dags
        persistentVolumeClaim:
          claimName: openmetadata-dependencies-dags
    config:
      AIRFLOW__OPENMETADATA_AIRFLOW_APIS__DAG_GENERATED_CONFIGS: "/airflow-dags/dags"
  dags:
    path: /airflow-dags/dags
    persistence:
      enabled: false
  logs:
    path: /airflow-logs
    persistence:
      enabled: false
For more information on Airflow Helm chart values, please refer to airflow-helm.
Follow OpenMetadata Kubernetes Deployment to install and deploy the Helm charts with NFS volumes. When deploying the openmetadata-dependencies Helm chart, use the command below:
helm install openmetadata-dependencies open-metadata/openmetadata-dependencies --values values-dependencies.yml