# Build a Connector

This design doc walks through developing a connector for OpenMetadata.

Ingestion is a simple Python framework for ingesting metadata from various sources.

Please look at our framework [APIs](https://github.com/open-metadata/OpenMetadata/tree/main/ingestion/src/metadata/ingestion/api).

## Workflow

The [workflow](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/src/metadata/ingestion/api/workflow.py) is a simple orchestration job that runs its components in order.

A workflow consists of [Source](/v1.12.x/api-reference/sdk/python/build-connector/source) and [Sink](/v1.12.x/api-reference/sdk/python/build-connector/sink). It also provides support for [Stage](/v1.12.x/api-reference/sdk/python/build-connector/stage) and [BulkSink](/v1.12.x/api-reference/sdk/python/build-connector/bulk-sink).

Workflow execution happens in a serial fashion.

1. The **Workflow** runs the **source** component first. The **source** retrieves a record from external sources and emits the record downstream.
2. If the **processor** component is configured, the **workflow** sends the record to the **processor** next.
3. There can be multiple **processor** components attached to the **workflow**. The **workflow** passes a record to each **processor** in the order they are configured.
4. Once the configured **processors** are finished, the modified record is sent to the **sink**.
5. The above steps are repeated for each record emitted from the **source**.
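
The serial flow described in the steps above can be sketched in plain Python. This is a minimal illustration, not the framework's actual `Workflow` class: `toy_source`, `add_owner`, `upper_name`, and `run_workflow` are hypothetical names, and records are modeled as plain dictionaries.

```python
from typing import Callable, Iterable, Iterator, List


def toy_source() -> Iterator[dict]:
    """Hypothetical source: emits one record at a time."""
    for name in ("orders", "customers"):
        yield {"table": name}


def add_owner(record: dict) -> dict:
    """Hypothetical processor: returns a copy of the record with an owner."""
    return {**record, "owner": "data-team"}


def upper_name(record: dict) -> dict:
    """Hypothetical processor: upper-cases the table name."""
    return {**record, "table": record["table"].upper()}


def run_workflow(
    source: Iterable[dict],
    processors: List[Callable[[dict], dict]],
    sink: Callable[[dict], None],
) -> None:
    """Run source -> processors (in configured order) -> sink, record by record."""
    for record in source:
        for processor in processors:
            record = processor(record)
        sink(record)


# The sink here simply collects records in a list.
written: List[dict] = []
run_workflow(toy_source(), [add_owner, upper_name], written.append)
```

Each record passes through every processor before the next record is pulled from the source, which is what "serial" means in this context.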

When records need to be aggregated, use the **stage** to write them to a file or another store. The **bulk sink** then reads the file written by the **stage** and publishes its contents to external services such as **OpenMetadata** or **Elasticsearch**.
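
As a rough sketch of the stage and bulk sink hand-off, the example below stages records as JSON lines in a temporary file and then aggregates them in one pass. The function names, the file format, and the `queries` field are assumptions made for illustration; the real framework classes have their own interfaces, and a real bulk sink would publish to a service rather than return a dictionary.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterable


def stage_records(records: Iterable[dict], path: Path) -> None:
    """Hypothetical stage: append each record to a JSON-lines file."""
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def bulk_sink(path: Path) -> Dict[str, int]:
    """Hypothetical bulk sink: read the staged file and aggregate per table."""
    totals: Dict[str, int] = {}
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            table = record["table"]
            totals[table] = totals.get(table, 0) + record["queries"]
    return totals


records = [
    {"table": "orders", "queries": 2},
    {"table": "customers", "queries": 1},
    {"table": "orders", "queries": 3},
]

with tempfile.TemporaryDirectory() as tmp:
    staged_file = Path(tmp) / "staged.jsonl"
    stage_records(records, staged_file)
    totals = bulk_sink(staged_file)
```

Staging to a file first lets the bulk sink see all records at once, which a record-at-a-time sink cannot do.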

Each `Step` comes from this generic definition:

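```python theme={null}
from abc import ABC, abstractmethod

# Closeable, Status, and OpenMetadata are provided by the ingestion framework.


class Step(ABC, Closeable):
    """All Workflow steps must inherit this base class."""

    status: Status

    def __init__(self):
        # Every step tracks its own processing status.
        self.status = Status()

    @classmethod
    @abstractmethod
    def create(cls, config_dict: dict, metadata: OpenMetadata) -> "Step":
        """Build the step from its configuration and the OpenMetadata client."""

    def get_status(self) -> Status:
        return self.status

    @abstractmethod
    def close(self) -> None:
        """Release any connection or resource held by the step."""
```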

Every step must therefore implement two methods:

* `create` to initialize the actual step.
* `close` in case there's any connection that needs to be terminated.
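
To make the `create` and `close` contract concrete, here is a self-contained sketch. `Status` and `Step` below are simplified stand-ins written for this example so that it runs without the ingestion package installed, and `ListSink` with its `write_record` method is a hypothetical step, not a class from the framework.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class Status:
    """Simplified stand-in for the framework's Status class."""

    def __init__(self) -> None:
        self.records: List[str] = []


class Step(ABC):
    """Simplified stand-in mirroring the generic Step definition."""

    status: Status

    def __init__(self) -> None:
        self.status = Status()

    @classmethod
    @abstractmethod
    def create(cls, config_dict: dict, metadata: Any) -> "Step":
        """Build the step from its configuration."""

    def get_status(self) -> Status:
        return self.status

    @abstractmethod
    def close(self) -> None:
        """Release any held resources."""


class ListSink(Step):
    """Hypothetical step that records prefixed names in its status."""

    def __init__(self, prefix: str) -> None:
        super().__init__()
        self.prefix = prefix
        self.closed = False

    @classmethod
    def create(cls, config_dict: dict, metadata: Any) -> "ListSink":
        # Read settings from the config dictionary; metadata is unused here.
        return cls(prefix=config_dict.get("prefix", ""))

    def write_record(self, name: str) -> None:
        self.status.records.append(self.prefix + name)

    def close(self) -> None:
        # A real step would terminate its connections here.
        self.closed = True


step = ListSink.create({"prefix": "db."}, metadata=None)
step.write_record("orders")
step.close()
```

Using `create` as a class-level factory keeps configuration parsing in one place, so the constructor only receives values that are already validated.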

On top of this, you can find further notes on each specific step in the links below:

<CardGroup cols={2}>
  <Card title="Source" href="/v1.12.x/api-reference/sdk/python/build-connector/source">
    Connects to an external system and emits records for downstream steps to process.
  </Card>

  <Card title="Sink" href="/v1.12.x/api-reference/sdk/python/build-connector/sink">
    Receives the records emitted by the source, one at a time.
  </Card>

  <Card title="Stage" href="/v1.12.x/api-reference/sdk/python/build-connector/stage">
    Stores records or aggregates the work done by a processor.
  </Card>

  <Card title="BulkSink" href="/v1.12.x/api-reference/sdk/python/build-connector/bulk-sink">
    Bulk-updates the records generated in a workflow.
  </Card>
</CardGroup>

Read more about Workflow management [here](https://github.com/open-metadata/OpenMetadata/blob/main/ingestion/src/metadata/workflow/README.mdx).
