Define the JSON Schema

The first step when creating a new connector is to create the JSON Schema definition for the connection itself.

This JSON file declares the properties the connection needs in order to work. It is mapped to a Java class on the server, a Python class in the Ingestion Framework, and a TypeScript class in the UI, which guarantees that all three share the same definition.

These files can be found in the following path:

openmetadata-spec/src/main/resources/json/schema/entity/services/connections

Here you can check what the different service connections look like and get some inspiration on how to create your own.

Breathe

It can be overwhelming doing this for the first time, trying to reuse different schemas and get everything right.

It's a good idea to start small, and it's fine to repeat yourself while you get used to working with the definitions.

To walk through the anatomy of a connection file, we are going to take a look at mysqlConnection.json.

  • $id: This references the file itself. You will need to change the path to the one for your connection.
  • title: The name of the schema. The convention is to use the filename in upper camel case, so for a connection called ownConnection.json the title would be OwnConnection.
  • description: A short description that explains what the JSON Schema is for.
  • javaType: The Java type this JSON Schema will become. Like the title, it depends on the connection name.

Info

Please note that the javaType path is similar to the filepath for the JSON Schema but not the same.

The standard is as follows:

org.openmetadata.schema.services.connections.{source_type}.{title}

where

  • {source_type} depends on the Connector you are building (Database, Dashboard, etc)
  • {title} is the title attribute from this json file.
  • definitions: Here you can place JSON Schemas that you can reference later within the properties attribute. In this connector we can see two different definitions:

    • mySQLType: This definition is standard for all connectors and defines the Service Type for a given connection.

    If you are creating a connection called ownConnection.json, you would create an equivalent definition for your own service type.

    • mySQLScheme: This definition is specific to connections that use SQLAlchemy underneath, and it defines the driver scheme to be used.
  • properties: Here we actually define the attributes that our connection will have. In order to understand better what you need to define here we are going to go through a few of the attributes.

    • type: As mentioned in the definitions section, we define the Service Type. But in order to actually use it we need to reference it in a property. This is exactly what we do here.

    To reference another JSON Schema we use the $ref attribute. This puts the entire referenced JSON Schema in place and updates or adds any attributes defined here.

    • authType: This property is interesting because it allows us to showcase two different features.

      • $ref: As explained above, this attribute is used to reference another JSON Schema. In this case you can see it being used within the oneOf attribute, referencing an external JSON Schema rather than a definition.

      When referencing a definition we use the following pattern: #/definitions/myDefinition. When referencing an external JSON Schema we use relative paths: ../common/ownSchema.json.

      • oneOf: This property lets us list several different types that are all valid. It is used when a configuration can appear in more than one form.

      In this example we can see it references both ./common/basicAuth.json and ./common/iamAuthConfig.json. This is because we can authenticate to MySQL either with basicAuth (username/password) or with iamAuth when MySQL is running as an RDS instance in AWS.

    • supportsMetadataExtraction: We can also see several properties that showcase the features this connector supports (supportsMetadataExtraction, supportsDBTExtraction, supportsProfiler, supportsQueryComment). They are all OpenMetadata features that are not necessarily supported by every connector.

    The most basic case is supportsMetadataExtraction and we should always start from there.

    Here we can also see $ref being used to reference a definition on another schema: ../connectionBasicType.json#/definitions/supportsMetadataExtraction

  • additionalProperties: To avoid weird behavior, we always prevent additional properties from being passed to the schema by setting this parameter to false.

  • required: Here we can list any properties that must always be present; without them the schema would be invalid.
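
To make the $ref, additionalProperties and required rules above more concrete, here is a minimal Python sketch. It is not how OpenMetadata validates connections: resolve_local_ref and check_config are hypothetical helpers, the schema is a toy one with made-up names, and only local #/definitions/... references are handled.

```python
from typing import Any


def resolve_local_ref(schema: dict[str, Any], ref: str) -> Any:
    """Follow a local reference such as '#/definitions/ownType'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref}")
    node: Any = schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def check_config(schema: dict[str, Any], config: dict[str, Any]) -> list[str]:
    """Report missing required keys and, if additionalProperties is false, unknown keys."""
    problems = []
    for key in schema.get("required", []):
        if key not in config:
            problems.append(f"missing required property: {key}")
    if schema.get("additionalProperties") is False:
        for key in config:
            if key not in schema.get("properties", {}):
                problems.append(f"unexpected property: {key}")
    return problems


# Toy schema loosely shaped like a connection file (all names are illustrative).
toy_schema = {
    "definitions": {
        "ownType": {"type": "string", "enum": ["Own"], "default": "Own"},
    },
    "properties": {
        "type": {"$ref": "#/definitions/ownType"},
        "hostPort": {"type": "string"},
    },
    "additionalProperties": False,
    "required": ["hostPort"],
}

# The 'type' property points at the definition, which is what gets used in its place.
print(resolve_local_ref(toy_schema, toy_schema["properties"]["type"]["$ref"]))

# A misspelled key is rejected instead of being silently ignored.
print(check_config(toy_schema, {"hostPort": "localhost:3306", "hostPrt": "oops"}))
```

With additionalProperties set to false, a misspelled key surfaces as an error instead of being dropped quietly, which is the "weird behavior" the setting guards against.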

mysqlConnection.json
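
The file itself lives in the repository path given earlier. As a rough companion, the Python sketch below assembles a hypothetical ownConnection.json for a database connector that follows the same anatomy. Several things are assumptions made for illustration: the $id URL prefix (a placeholder), the Own enum value, the hostPort property, and the helper names title_for and java_type_for. Only the javaType pattern and the $ref targets are taken from the text above.

```python
import json


def title_for(filename: str) -> str:
    """'ownConnection.json' -> 'OwnConnection' (filename with a leading capital)."""
    stem = filename.removesuffix(".json")
    return stem[0].upper() + stem[1:]


def java_type_for(source_type: str, title: str) -> str:
    """Apply org.openmetadata.schema.services.connections.{source_type}.{title}."""
    return f"org.openmetadata.schema.services.connections.{source_type}.{title}"


filename = "ownConnection.json"
title = title_for(filename)

own_connection = {
    # Placeholder prefix (assumption): copy the real one from an existing connection file.
    "$id": f"https://example.com/schema/entity/services/connections/database/{filename}",
    "title": title,
    "description": "Own Database Connection Config",
    "type": "object",
    "javaType": java_type_for("database", title),
    "definitions": {
        # Service Type definition; the 'Own' value is hypothetical.
        "ownType": {
            "description": "Service type.",
            "type": "string",
            "enum": ["Own"],
            "default": "Own",
        },
    },
    "properties": {
        # Reference to a definition in this same file.
        "type": {"$ref": "#/definitions/ownType", "default": "Own"},
        # Either of the two external auth schemas is accepted.
        "authType": {
            "oneOf": [
                {"$ref": "./common/basicAuth.json"},
                {"$ref": "./common/iamAuthConfig.json"},
            ],
        },
        # Hypothetical connection attribute.
        "hostPort": {"description": "Host and port of the service.", "type": "string"},
        # Reference to a definition inside another schema file.
        "supportsMetadataExtraction": {
            "$ref": "../connectionBasicType.json#/definitions/supportsMetadataExtraction",
        },
    },
    "additionalProperties": False,
    "required": ["hostPort"],
}

print(json.dumps(own_connection, indent=2))
```

Deriving title and javaType from the filename in one place keeps the two from drifting apart, since both depend on the connection name.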

Once the connection file is properly created, we still need to take one extra step to make it available for the Service.

Note

The connection is part of a Service (Dashboard, Database, Messaging, etc) and this step should be done on the correct service.

Continuing with the mysqlConnection.json example, we now need to make it available to the Database Service by updating the databaseService.json file within openmetadata-spec/src/main/resources/json/schema/entity/services.

The file will be shortened and parts of it will be replaced with ... for readability.

  • databaseServiceType: Here we need to add our connector type to the enum and javaEnums properties. It should be the same value as the type property that we defined on the JSON Schema.
  • databaseConnection: Here we need to point to our JSON Schema within the config property by adding it to the oneOf list.
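
As a rough illustration of these two changes, the Python sketch below applies them to a heavily trimmed, hypothetical stand-in for databaseService.json. The nesting follows the description above, but the exact layout is an assumption: in particular the shape of the javaEnums entries, the relative $ref path, the Own connector, and the register_connector helper are illustrative, so check them against the real file.

```python
import copy
from typing import Any


def register_connector(
    service_schema: dict[str, Any], service_type: str, connection_ref: str
) -> dict[str, Any]:
    """Return a copy of the service schema with the new connector registered."""
    updated = copy.deepcopy(service_schema)
    definitions = updated["definitions"]

    # 1. Add the connector type to both enum and javaEnums.
    service_types = definitions["databaseServiceType"]
    if service_type not in service_types["enum"]:
        service_types["enum"].append(service_type)
        service_types["javaEnums"].append({"name": service_type})

    # 2. Point the config property at the new connection schema via oneOf.
    one_of = definitions["databaseConnection"]["properties"]["config"]["oneOf"]
    if {"$ref": connection_ref} not in one_of:
        one_of.append({"$ref": connection_ref})
    return updated


# Trimmed stand-in for databaseService.json (assumed structure, not the real file).
trimmed_service = {
    "definitions": {
        "databaseServiceType": {
            "type": "string",
            "enum": ["Mysql"],
            "javaEnums": [{"name": "Mysql"}],
        },
        "databaseConnection": {
            "type": "object",
            "properties": {
                "config": {
                    "oneOf": [{"$ref": "./connections/database/mysqlConnection.json"}],
                },
            },
        },
    },
}

updated = register_connector(
    trimmed_service, "Own", "./connections/database/ownConnection.json"
)
print(updated["definitions"]["databaseServiceType"]["enum"])
```

The value passed as service_type must match the type property defined in the connection's own JSON Schema, otherwise the service and the connection will disagree about the connector's name.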

Now that your connection is defined in the JSON Schema, we can move on to implementing the Python code that performs the ingestion.

Develop the Ingestion Code

Learn what you need to implement for the Connector's logic