# Managing Credentials

In release 0.12 we changed how an Ingestion Workflow handles service credentials. There are now
two scenarios:

**1.** If we are running a metadata workflow for the first time, pointing to a service that **does not yet exist**,
then the service will be created from the Metadata Ingestion pipeline. It does not matter if the workflow
is run from the CLI or any other scheduler.

**2.** If instead the Metadata Ingestion pipeline points to a service that **already exists**,
then the workflow uses the **stored credentials**, not the ones coming from the YAML config.
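
The two scenarios boil down to a simple precedence rule. The sketch below is illustrative only: `resolve_connection` is a hypothetical helper, not part of the ingestion package, and raising an error when neither source provides a connection is an assumption made for the example.

```python
from typing import Optional


def resolve_connection(stored: Optional[dict], from_yaml: Optional[dict]) -> dict:
    """Hypothetical helper mirroring the two scenarios described above."""
    if stored is not None:
        # Scenario 2: the service already exists, so the stored credentials
        # win and anything passed in the YAML is ignored.
        return stored
    if from_yaml is None:
        # Assumption for this sketch: with no stored service and no YAML
        # connection there is nothing to connect with.
        raise ValueError("No stored service and no serviceConnection in the YAML")
    # Scenario 1: first run, the service is created from the YAML connection.
    return from_yaml
```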

## Existing Services

This means that once a service is created, the only way to update its connection credentials is through
the **UI** or by calling the API directly. This prevents the scenario where a new YAML config reuses the name
of a service that already exists but points to a completely different source system.

One of the main benefits of this approach is that if an admin in our organisation creates the service from the UI,
then we can prepare any Ingestion Workflow without having to pass the connection details.

For example, for an Athena YAML, instead of requiring the full set of credentials as below:

```yaml theme={null}
source:
  type: athena
  serviceName: my_athena_service
  serviceConnection:
    config:
      type: Athena
      awsConfig:
        awsAccessKeyId: KEY
        awsSecretAccessKey: SECRET
        awsRegion: us-east-2
      s3StagingDir: s3 directory for datasource
      workgroup: workgroup name
  sourceConfig:
    config:
      type: DatabaseMetadata
      markDeletedTables: true
      includeTables: true
      includeViews: true
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: <OpenMetadata host and port>
    authProvider: <OpenMetadata auth provider>
```

We can use a simplified version:

```yaml theme={null}
source:
  type: athena
  serviceName: my_athena_service
  sourceConfig:
    config:
      type: DatabaseMetadata
      markDeletedTables: true
      includeTables: true
      includeViews: true
sink:
  type: metadata-rest
  config: {}
workflowConfig:
  openMetadataServerConfig:
    hostPort: <OpenMetadata host and port>
    authProvider: <OpenMetadata auth provider>
```

The workflow will then dynamically pick up the service connection details for `my_athena_service` and ingest
the metadata accordingly.

If instead you want to keep the full source of truth in your DAGs or processes, read on for different
ways to secure the credentials in your environment so that they are not in plain sight.

## Securing Credentials

<Tip>
  Note that these are just a few examples. Any secure and automated approach to retrieve a string would work here,
  as our only requirement is to pass the string inside the YAML configuration.
</Tip>

When running a Workflow with the CLI or your favourite scheduler, it is safer not to have the services' credentials
visible. For the CLI, the ingestion package can load sensitive information from environment variables.

For example, if you are using the [Glue](/v1.12.x/connectors/database/glue) connector, you could specify the
AWS configuration as follows in a JSON config file:

```json theme={null}
[...]
"awsConfig": {
    "awsAccessKeyId": "${AWS_ACCESS_KEY_ID}",
    "awsSecretAccessKey": "${AWS_SECRET_ACCESS_KEY}",
    "awsRegion": "${AWS_REGION}",
    "awsSessionToken": "${AWS_SESSION_TOKEN}"
},
[...]
```

Or, for a YAML configuration:

```yaml theme={null}
[...]
awsConfig:
  awsAccessKeyId: '${AWS_ACCESS_KEY_ID}'
  awsSecretAccessKey: '${AWS_SECRET_ACCESS_KEY}'
  awsRegion: '${AWS_REGION}'
  awsSessionToken: '${AWS_SESSION_TOKEN}'
[...]
```
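
To make the `${VAR}` placeholders concrete, here is a minimal sketch of how they could be expanded before a config is parsed. It is not the ingestion package's own loader: `expand_env` is a hypothetical helper, and failing on an unset variable is a design choice of this example. It can be handy for checking a config file locally before handing it to the CLI.

```python
import json
import os
import re

# Matches placeholders of the form ${NAME}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(raw: str) -> str:
    """Replace every ${NAME} with os.environ[NAME]; fail if NAME is unset."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in os.environ:
            raise KeyError(f"Environment variable {name} is not set")
        return os.environ[name]

    return _PLACEHOLDER.sub(_lookup, raw)


RAW = '{"awsConfig": {"awsRegion": "${AWS_REGION}"}}'

# Demo value only, so the example runs without a real AWS setup.
os.environ.setdefault("AWS_REGION", "us-east-2")
config = json.loads(expand_env(RAW))
print(config["awsConfig"]["awsRegion"])
```

Note that the substitution is purely textual, so a value containing quotes or backslashes would need escaping before the result is parsed as JSON.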

### AWS Credentials

The AWS Credentials are based on the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/security/credentials/awsCredentials.json).
Note that the only required field is `awsRegion`. The configuration is deliberately flexible: installations running on AWS
can rely on instance roles to authenticate to whichever service we are pointing to, without having to
write the credentials down.
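
As an illustration of that flexibility, the sketch below builds an `awsConfig` dictionary that always contains `awsRegion` and adds the other fields only when the matching environment variables are set. `build_aws_config` is a hypothetical helper, and the environment variable names are the ones used in the examples above.

```python
import os
from typing import Mapping

# awsConfig field -> environment variable that may provide it (all optional).
_OPTIONAL_FIELDS = {
    "awsAccessKeyId": "AWS_ACCESS_KEY_ID",
    "awsSecretAccessKey": "AWS_SECRET_ACCESS_KEY",
    "awsSessionToken": "AWS_SESSION_TOKEN",
}


def build_aws_config(env: Mapping[str, str] = os.environ) -> dict:
    """Return an awsConfig dict with the region plus any credentials found."""
    # awsRegion is the only required field, so a missing value raises KeyError.
    config = {"awsRegion": env["AWS_REGION"]}
    for field, variable in _OPTIONAL_FIELDS.items():
        value = env.get(variable)
        if value:
            config[field] = value
    return config
```

On a machine that authenticates through an instance role, only `AWS_REGION` needs to be set and the result contains just `awsRegion`.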

#### AWS Vault

If using [aws-vault](https://github.com/99designs/aws-vault), it gets a bit more involved to run the CLI ingestion as the credentials are not globally available in the terminal.
In that case, you could use the following command after setting up the ingestion configuration file:

```bash theme={null}
aws-vault exec <role> -- $SHELL -c 'metadata ingest -c <path to connector>'
```

### GCP Credentials

The GCP Credentials are based on the following [JSON Schema](https://github.com/open-metadata/OpenMetadata/blob/main/openmetadata-spec/src/main/resources/json/schema/security/credentials/gcpCredentials.json).
These are the fields that you can export when preparing a Service Account.

Once the account is created, you can see the fields in the exported JSON file from:

```
IAM & Admin > Service Accounts > Keys
```

You can validate the whole Google service account setup [here](/v1.12.x/deployment/security/google).
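
Before copying values out of the exported key file, it can help to confirm that the file is complete. The sketch below checks for the field names Google typically writes into a service account key. Both helpers are hypothetical, and the names OpenMetadata expects may be spelled differently, so compare them against the JSON Schema linked above.

```python
import json
from pathlib import Path

# Field names as they usually appear in an exported service account key.
EXPECTED_KEY_FIELDS = (
    "type",
    "project_id",
    "private_key_id",
    "private_key",
    "client_email",
    "client_id",
    "auth_uri",
    "token_uri",
    "auth_provider_x509_cert_url",
    "client_x509_cert_url",
)


def missing_key_fields(key: dict) -> list:
    """Return the expected fields that are absent or empty in the key."""
    return [field for field in EXPECTED_KEY_FIELDS if not key.get(field)]


def check_key_file(path: str) -> list:
    """Load an exported key file and report its missing fields."""
    key = json.loads(Path(path).read_text(encoding="utf-8"))
    return missing_key_fields(key)
```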

### Using GitHub Actions Secrets

If running the ingestion in a GitHub Action, you can create [encrypted secrets](https://docs.github.com/en/actions/security-guides/encrypted-secrets)
to store sensitive information such as users and passwords.

These secrets are then mapped to environment variables in the workflow step, where the script can read them with `os.getenv`, for example:

```python theme={null}
import os
import yaml

from metadata.workflow.metadata import MetadataWorkflow


CONFIG = f"""
source:
  type: snowflake
  serviceName: snowflake_from_github_actions
  serviceConnection:
    config:
      type: Snowflake
      username: {os.getenv('SNOWFLAKE_USERNAME')}
...
"""


def run():
    workflow_config = yaml.safe_load(CONFIG)
    workflow = MetadataWorkflow.create(workflow_config)
    workflow.execute()
    workflow.raise_from_status()
    workflow.print_status()
    workflow.stop()


if __name__ == "__main__":
    run()
```

Make sure to update your step environment to pass the secrets as environment variables:

```yaml theme={null}
- name: Run Ingestion
  run: |
    source env/bin/activate
    python ingestion-github-actions/snowflake_ingestion.py
  # Add the env vars we need to load the snowflake credentials
  env:
    SNOWFLAKE_USERNAME: ${{ secrets.SNOWFLAKE_USERNAME }}
    SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_PASSWORD }}
    SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWFLAKE_WAREHOUSE }}
    SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }}
```

You can see a full demo setup [here](https://github.com/open-metadata/openmetadata-demo/tree/main/ingestion-github-actions).
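
Because `os.getenv` returns `None` for an unset variable, a secret that was never mapped in the step would end up as the literal text `None` inside an f-string config like the one above. A small guard can catch this before the workflow starts. The sketch below is one possible approach: `require_env` is a hypothetical helper, and the variable names are the ones from the step definition above.

```python
import os
from typing import Iterable, Mapping

REQUIRED_VARIABLES = (
    "SNOWFLAKE_USERNAME",
    "SNOWFLAKE_PASSWORD",
    "SNOWFLAKE_WAREHOUSE",
    "SNOWFLAKE_ACCOUNT",
)


def require_env(names: Iterable[str], env: Mapping[str, str] = os.environ) -> dict:
    """Return the requested variables, failing fast if any is unset or empty."""
    names = list(names)
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Report only the names, never the values, to keep secrets out of logs.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

Calling `require_env(REQUIRED_VARIABLES)` at the top of `run()` would stop the job with a readable error instead of sending a malformed connection to the workflow.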

## Next Steps

For a step-by-step guide on using Airflow Connections to securely retrieve service credentials in your DAGs, see
[Using Airflow Connections](/v1.12.x/deployment/ingestion/external/credentials-airflow).