# Auto-Classification in OpenMetadata

> Learn how OpenMetadata automatically detects and tags sensitive data like PII using column name scanning and NLP-based entity recognition.

## Overview

Auto-Classification is an OpenMetadata workflow that detects and tags sensitive data, such as PII, across your database columns. During ingestion it scans column names and, when sample data ingestion is enabled, the sample data itself, then applies or suggests tags such as `PII.Sensitive` and `PII.NonSensitive`. This removes the need to tag columns manually.

## How It Works

Auto-Classification uses two complementary detection approaches:

* **Column Name Scanner**: Matches column names against a set of regex rules that identify common sensitive fields, such as email addresses, names, SSNs, and bank account numbers.

  For example, columns `email` and `full_name` are auto-tagged as `PII.Sensitive` based on their column names.

  <img src="https://mintcdn.com/openmetadata/B6U1glZKT4PT9EFR/public/images/how-to-guides/governance/auto-pii1.png?fit=max&auto=format&n=B6U1glZKT4PT9EFR&q=85&s=ba3b19ab9fe16e8a0f8a487997dbd1eb" alt="Columns with recognizable sensitive names auto-tagged as PII Sensitive" width="2980" height="1496" data-path="public/images/how-to-guides/governance/auto-pii1.png" />

* **Entity Recognition**: If sample data ingestion is enabled, scans the actual row values using an NLP-based entity recognition engine. This catches sensitive data even when the column name is generic or ambiguous. The `confidence` parameter (0–100, default `80`) controls the minimum score required to tag a column as `PII.Sensitive`.

  If a column already has a `PII` tag, it is skipped during execution.

  For example, the column `I_FORMULATION` is also tagged as `PII.Sensitive`, even though its name gives no indication of sensitive content.

  <img src="https://mintcdn.com/openmetadata/B6U1glZKT4PT9EFR/public/images/how-to-guides/governance/auto-pii2.png?fit=max&auto=format&n=B6U1glZKT4PT9EFR&q=85&s=adfcfcde6e4506d561db27d727ce622f" alt="Column with an ambiguous name tagged as PII Sensitive" width="2848" height="1102" data-path="public/images/how-to-guides/governance/auto-pii2.png" />

  Inspecting the **Sample Data** tab reveals that the actual row values contain sensitive information, which the entity recognition engine detected. This shows that auto-classification works beyond column names and relies on the data itself when sample ingestion is enabled.

  <img src="https://mintcdn.com/openmetadata/B6U1glZKT4PT9EFR/public/images/how-to-guides/governance/auto-pii3.png?fit=max&auto=format&n=B6U1glZKT4PT9EFR&q=85&s=4124636128cb4a34ba5eb4d7c7c06533" alt="Sample data showing sensitive values that triggered auto-classification" width="2822" height="1446" data-path="public/images/how-to-guides/governance/auto-pii3.png" />
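
The two detection steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not OpenMetadata's implementation: the regex patterns, the function name `classify_column`, and the `sample_score` argument are hypothetical, and the sketch assumes the entity recognition score is on the same 0–100 scale as `confidence`.

```python
import re
from typing import Iterable, Optional

# Hypothetical name patterns for illustration; the real rule set may differ.
SENSITIVE_NAME_PATTERNS = [
    re.compile(r"e[-_]?mail", re.IGNORECASE),
    re.compile(r"(full|first|last)[-_]?name", re.IGNORECASE),
    re.compile(r"(^|_)ssn($|_)", re.IGNORECASE),
    re.compile(r"bank[-_]?account", re.IGNORECASE),
]


def classify_column(
    name: str,
    existing_tags: Iterable[str],
    sample_score: Optional[float] = None,
    confidence: int = 80,
) -> Optional[str]:
    """Return "PII.Sensitive" if the column should be tagged, else None.

    sample_score stands in for the entity recognition result (assumed 0-100);
    pass None when sample data ingestion is disabled.
    """
    # Columns that already carry a PII tag are skipped.
    if any(tag.startswith("PII.") for tag in existing_tags):
        return None

    # Step 1: column name scanner.
    if any(pattern.search(name) for pattern in SENSITIVE_NAME_PATTERNS):
        return "PII.Sensitive"

    # Step 2: entity recognition on sample data, gated by the confidence threshold.
    if sample_score is not None and sample_score >= confidence:
        return "PII.Sensitive"

    return None
```

In this sketch, `classify_column("email", [])` is tagged from the name alone, while `classify_column("I_FORMULATION", [], sample_score=92)` is tagged only because the sample data score clears the default threshold of `80`.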

## Tag Mapping

Tag mapping links two tags so that applying the first to a data asset automatically applies the second. This keeps classifications consistent across taxonomies without extra manual steps.

For example, applying `Personal Data.Personal` automatically applies `Data Classification.Confidential`, ensuring that privacy and sensitivity classifications always stay in sync. Tag mappings are configured in the backend and are not available through the OpenMetadata UI.
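
As a rough mental model, a tag mapping behaves like a one-directional lookup from a source tag to the tags it implies. The sketch below is hypothetical: `TAG_MAPPINGS` and `expand_tags` are illustrative names, not part of OpenMetadata's backend configuration or API, and the sketch assumes mappings are applied in a single pass rather than chained.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping table: source tag -> tags applied along with it.
TAG_MAPPINGS: Dict[str, Set[str]] = {
    "Personal Data.Personal": {"Data Classification.Confidential"},
}


def expand_tags(
    applied: Iterable[str],
    mappings: Dict[str, Set[str]] = TAG_MAPPINGS,
) -> Set[str]:
    """Return the applied tags plus any tags they map to (single pass)."""
    result = set(applied)
    for tag in list(result):
        # Mapping is directional: only the source tag pulls in its target.
        result |= mappings.get(tag, set())
    return result
```

Here, expanding `["Personal Data.Personal"]` adds `Data Classification.Confidential`, while expanding `["Data Classification.Confidential"]` on its own adds nothing, because the mapping runs in one direction.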

## Set Up Auto-Classification

<CardGroup cols={2}>
  <Card title="Workflow" href="/v1.13.x/how-to-guides/data-governance/classification/auto-classification/workflow">
    Add an Auto Classification Agent to a database service directly from the OpenMetadata UI.
  </Card>

  <Card title="External Workflow" href="/v1.13.x/how-to-guides/data-governance/classification/auto-classification/external-workflow">
    Run the Auto Classification Workflow externally using a YAML pipeline configuration.
  </Card>

  <Card title="Auto PII Tagging" href="/v1.13.x/how-to-guides/data-governance/classification/auto-classification/auto-pii-tagging">
    Understand the tagging logic and troubleshoot common issues like SSL certificate errors.
  </Card>

  <Card title="Sample Data" href="/v1.13.x/how-to-guides/data-governance/classification/auto-classification/external-sample-data">
    Store sample data collected during auto-classification to an S3 bucket in Parquet format.
  </Card>
</CardGroup>
