
GCS
PROD
Feature List
✓ Metadata
Requirements
To run the GCS ingestion, you will need to install OpenMetadata 1.0 or later.
To deploy OpenMetadata, check the Deployment guides.
GCS Permissions
For every bucket we want to ingest, we need to grant the following permissions:
storage.buckets.get
storage.buckets.list
storage.objects.get
storage.objects.list
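As a quick sanity check before running the ingestion, the permissions above can be compared against whatever your credentials have actually been granted. The sketch below is only an illustration: the helper name missing_permissions is hypothetical, and how you obtain the granted list is up to you (for instance, the google-cloud-storage client offers Bucket.test_iam_permissions, which is an assumption about your tooling and is not something this connector requires).

```python
# The four permissions listed above.
REQUIRED_PERMISSIONS = (
    "storage.buckets.get",
    "storage.buckets.list",
    "storage.objects.get",
    "storage.objects.list",
)


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`.

    `granted` is any iterable of permission strings; this helper is
    hypothetical and not part of OpenMetadata.
    """
    granted_set = set(granted)
    return [p for p in REQUIRED_PERMISSIONS if p not in granted_set]


if __name__ == "__main__":
    # Example: credentials that can read but not list.
    print(missing_permissions(["storage.buckets.get", "storage.objects.get"]))
    # → ['storage.buckets.list', 'storage.objects.list']
```

An empty result means all four permissions are covered.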
OpenMetadata Manifest
In any other connector, extracting metadata happens automatically. In this case, we will be able to extract high-level metadata from buckets, but in order to understand their internal structure we need users to provide an openmetadata.json file at the bucket root.
Supported File Formats: [ "csv", "tsv", "avro", "parquet", "json", "json.gz", "json.zip" ]
You can learn more about this here. Keep reading for an example on the shape of the manifest file.
Our manifest file is defined as a JSON Schema.
Global Manifest
You can also manage a single manifest file to centralize the ingestion process for any container, named openmetadata_storage_manifest.json.
You can also keep local openmetadata.json manifests in each container, but whenever possible the ingestion will try to pick up the global manifest.
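To make the difference between the two manifest files concrete, here is a minimal sketch that builds one local and one global manifest in Python and prints the local one as JSON. The field names (entries, dataPath, structureFormat, isPartitioned, containerName), the example paths, and the bucket name my-bucket are assumptions for illustration, as is the manifest_entry helper; check them against the manifest JSON Schema before relying on them. Only the list of supported formats is taken from this page.

```python
import json

# Supported file formats, as listed on this page.
SUPPORTED_FORMATS = {"csv", "tsv", "avro", "parquet", "json", "json.gz", "json.zip"}


def manifest_entry(data_path, structure_format, is_partitioned=False, container_name=None):
    """Build one manifest entry (hypothetical helper, assumed field names)."""
    if structure_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported format: {structure_format!r}")
    entry = {
        "dataPath": data_path,                # path inside the bucket (assumed key)
        "structureFormat": structure_format,  # one of SUPPORTED_FORMATS (assumed key)
        "isPartitioned": is_partitioned,      # assumed key
    }
    if container_name is not None:
        # Assumed to be needed only in the global manifest, to say
        # which container (bucket) the entry belongs to.
        entry["containerName"] = container_name
    return entry


# Local manifest: would be saved as openmetadata.json at the bucket root.
local_manifest = {
    "entries": [
        manifest_entry("transactions", "csv"),
        manifest_entry("cities", "parquet", is_partitioned=True),
    ]
}

# Global manifest: would be saved as openmetadata_storage_manifest.json.
global_manifest = {
    "entries": [
        manifest_entry("transactions", "csv", container_name="my-bucket"),
    ]
}

if __name__ == "__main__":
    print(json.dumps(local_manifest, indent=2))
```

Rejecting unknown formats when the entry is built is a design choice of this sketch, so that a typo surfaces before the file is uploaded rather than during ingestion.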