Define the JSON Schema
The first step when creating a new connector is to create the JSON Schema definition for the connection itself.
This is a JSON file that declares the properties we need for the connection to work, and it will be mapped to a Java Class on the Server, a Python Class on the Ingestion Framework and a TypeScript Class on the UI. By using it we can guarantee that the definition is the same everywhere.
These files can be found in the following path:
openmetadata-spec/src/main/resources/json/schema/entity/services/connections
Here you can check what the different service connections look like and get some inspiration on how to create your own.
Breathe
It can be overwhelming doing this for the first time, trying to reuse different schemas and get everything right.
It's a good idea to start little by little and repeat yourself while you get used to working with the definitions.
Connection File Anatomy
To go through the connection file anatomy, we are going to take a look at the mysqlConnection.json file.
- $id: Here we are basically referencing the file itself. You will need to change the path to the path for your connection.
- title: Here we need to define the name of the schema. The standard is to use the filename in upper camel case (first letter capitalized). So if you are creating a connection called ownConnection.json, the title would be OwnConnection.
- description: Here we also add a small description that explains what the JSON Schema is for.
- javaType: Here we also need to define the javaType this JSON Schema will become. Like the title, it depends on the connection name.
Info
Please note that the javaType path is similar to the filepath for the JSON Schema but not the same.
The standard is as follows:
org.openmetadata.schema.services.connections.{source_type}.{title}
where
- {source_type} depends on the Connector you are building (Database, Dashboard, etc.)
- {title} is the title attribute from this JSON file.
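As a hedged illustration of these header attributes, the top of a hypothetical ownConnection.json for a Database connector might look like the fragment below. The connection name, the description text and the $id value are assumptions made for this sketch. In particular, the $id is a placeholder host: copy the value from an existing connection file and change only the path. The javaType follows the pattern above, assuming the source type is written in lowercase as a Java package segment. The $comment entry is only there to label the sketch and can be removed.

```json
{
  "$comment": "Hypothetical header for ownConnection.json. The $id is a placeholder, not a real URL.",
  "$id": "https://example.com/schema/entity/services/connections/database/ownConnection.json",
  "title": "OwnConnection",
  "description": "Own Database Connection Config",
  "type": "object",
  "javaType": "org.openmetadata.schema.services.connections.database.OwnConnection"
}
```

For a Dashboard connector, only the {source_type} segment of the javaType would change under the same assumptions.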
- definitions: Here you can place JSON Schemas that you can reference later within the properties attribute. On this connector we can see two different definitions:
  - mySQLType: This definition is a standard for all connectors and it defines the Service Type for a given connection. If you are creating a connection called ownConnection.json, you would create an equivalent definition for your own service type.
  - mySQLScheme: This definition is specific to the connections that use SQLAlchemy underneath, and it defines which driver scheme to use.
- properties: Here we actually define the attributes that our connection will have. To better understand what you need to define here, we are going to go through a few of the attributes.
  - type: As mentioned in the definitions section, we define the Service Type. But in order to actually use it we need to reference it in a property, which is exactly what we do here. To reference another JSON Schema we use the $ref attribute. This will basically put the entire JSON Schema in place and update/add any attributes defined here.
  - authType: This property is interesting because it allows us to showcase two different features.
    - $ref: As explained above, this attribute is used to reference another JSON Schema. But in this case you can see it being used within the oneOf attribute, referencing an external JSON Schema and not a definition.
      When referencing a definition we use the following pattern: #/definitions/myDefinition
      When referencing an external JSON Schema we use relative paths: ../common/ownSchema.json
    - oneOf: This property lists the alternative schemas that are allowed; a value is valid when it matches exactly one of them. It is used when there are multiple different ways a configuration might appear.
      In this example we can see it references both ./common/basicAuth.json and ./common/iamAuthConfig.json. It is this way because we could authenticate to MySQL either by using basicAuth (Username/Password) or by using iamAuth if we are actually running MySQL as an RDS instance in AWS.
  - supportsMetadataExtraction: We can also see a few properties that showcase the features this connector supports (supportsMetadataExtraction, supportsDBTExtraction, supportsProfiler, supportsQueryComment). They are all different features of OpenMetadata that are not necessarily supported by all connectors.
    The most basic case is supportsMetadataExtraction and we should always start from there.
    Here we can also see $ref being used to reference a definition in another schema: ../connectionBasicType.json#/definitions/supportsMetadataExtraction
- additionalProperties: To avoid weird behavior, we always prevent additional properties from being passed to the schema by setting this parameter to false.
- required: Here we can list the properties that must always be present; without them the configuration would be invalid.
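To tie the attributes above together, here is a minimal sketch of the body of a hypothetical ownConnection.json, with the header attributes ($id, title, description, javaType) left out for brevity. It is not the content of mysqlConnection.json. The names ownType, ownScheme, Own and own+driver, the scheme and hostPort properties, and all title and description texts are assumptions made for illustration. Only the referenced paths (./common/basicAuth.json, ./common/iamAuthConfig.json and ../connectionBasicType.json) come from the text above.

```json
{
  "$comment": "Hypothetical body for ownConnection.json. Names containing 'own' and the hostPort property are illustrative only.",
  "type": "object",
  "definitions": {
    "ownType": {
      "description": "Service type.",
      "type": "string",
      "enum": ["Own"],
      "default": "Own"
    },
    "ownScheme": {
      "description": "SQLAlchemy driver scheme options.",
      "type": "string",
      "enum": ["own+driver"],
      "default": "own+driver"
    }
  },
  "properties": {
    "type": {
      "title": "Service Type",
      "description": "Service Type",
      "$ref": "#/definitions/ownType",
      "default": "Own"
    },
    "scheme": {
      "title": "Connection Scheme",
      "description": "SQLAlchemy driver scheme options.",
      "$ref": "#/definitions/ownScheme",
      "default": "own+driver"
    },
    "authType": {
      "title": "Auth Configuration Type",
      "description": "Choose Auth Config Type.",
      "oneOf": [
        { "$ref": "./common/basicAuth.json" },
        { "$ref": "./common/iamAuthConfig.json" }
      ]
    },
    "hostPort": {
      "title": "Host and Port",
      "description": "Host and port of the service.",
      "type": "string"
    },
    "supportsMetadataExtraction": {
      "title": "Supports Metadata Extraction",
      "$ref": "../connectionBasicType.json#/definitions/supportsMetadataExtraction"
    }
  },
  "additionalProperties": false,
  "required": ["hostPort"]
}
```

With a schema shaped like this, a configuration that omits hostPort is rejected because of required, and one that carries a key not listed under properties is rejected because additionalProperties is false.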
Making the new Connection configuration available to the Service
Once the connection file is properly created, we still need to take one extra step to make it available for the Service.
Note
The connection is part of a Service (Dashboard, Database, Messaging, etc) and this step should be done on the correct service.
Following the mysqlConnection.json example, we now need to make it available to the Database Service by updating the databaseService.json file within openmetadata-spec/src/main/resources/json/schema/entity/services.
The file will be shortened and parts of it will be replaced with ... for readability.
- databaseServiceType: Here we need to add our connector type to the enum and javaEnums properties. It should be the same value as the type property that we defined on the JSON Schema.
- databaseConnection: Here we need to point to our JSON Schema within the config property by adding it to the oneOf list.
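The two changes above could be sketched as follows for a hypothetical connector whose service type is Own. This is an assumption-laden excerpt, not the real databaseService.json: it shows only the Mysql entry next to the added Own entry and omits every other entry and attribute. The shape of the javaEnums entries and the relative path ./connections/database/ownConnection.json are assumptions, so mirror whatever the existing entries in the file look like.

```json
{
  "$comment": "Hypothetical excerpt of databaseService.json. Only the Mysql entry and the added Own entry are shown.",
  "definitions": {
    "databaseServiceType": {
      "type": "string",
      "enum": ["Mysql", "Own"],
      "javaEnums": [
        { "name": "Mysql" },
        { "name": "Own" }
      ]
    },
    "databaseConnection": {
      "type": "object",
      "properties": {
        "config": {
          "oneOf": [
            { "$ref": "./connections/database/mysqlConnection.json" },
            { "$ref": "./connections/database/ownConnection.json" }
          ]
        }
      }
    }
  }
}
```

The value added to enum and javaEnums (Own in this sketch) should match the service type value declared in the connection's own JSON Schema.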
Next Step
Now that you have your Connection defined in the JSON Schema, we can proceed to actually implement the Python Code to perform the Ingestion.
Develop the Ingestion Code: Learn what you need to implement for the Connector's logic.