Ingest Glossary from dbt
Ingest table-level and column-level glossary terms from the manifest.json file.
Requirements
For dbt Glossary ingestion to work, the Glossary terms must already exist in OpenMetadata before the ingestion runs.
Steps for ingesting dbt Glossary
1. Create a Glossary in OpenMetadata or select a previously added Glossary
A Glossary Term is a preferred terminology for a concept. In a Glossary term, you can add tags, synonyms, related terms to build a conceptual semantic graph, and also add reference links.
For details on creating glossary terms, refer to the OpenMetadata documentation.
To view created Glossary Terms, navigate to the Glossary section within OpenMetadata: govern->glossary->glossary_name->glossary_term_name
OpenMetadata also supports nested Glossary Terms, so you can organize them hierarchically and reference them from dbt for ingestion.
OpenMetadata Glossaries
2. Add Table-Level Glossary term information in schema.yml file
To associate glossary terms with specific tables in your dbt model, you'll need their Fully Qualified Names (FQNs) within OpenMetadata.
Steps to Get Glossary Term FQNs:
- Navigate to the desired glossary term in OpenMetadata's glossary section.
- The glossary term's details page displays its FQN (e.g. Glossary_name.glossary_term) in the URL, like your-uri/glossary/Glossary_name.glossary_term.
Example
Suppose you want to add the glossary terms term_one (FQN: Test_Glossary.term_one) and more_nested_term (FQN: Test_Glossary.term_two.nested_term.more_nested_term) to the customers table in your dbt model.
To get the FQN for term_one (Test_Glossary.term_one), navigate to govern->glossary->Test_Glossary->term_one.
For more_nested_term (Test_Glossary.term_two.nested_term.more_nested_term), navigate to govern->glossary->Test_Glossary->term_two->nested_term->more_nested_term.
The current URL then contains the glossary term FQN, for example https://localhost:8585/glossary/Test_Glossary.term_two.nested_term.more_nested_term
OpenMetadata Glossary Term - term_one
OpenMetadata Glossary Term - more_nested_term
In your dbt schema.yml file for the customers table model, add the Glossary Term FQNs under model->name->meta->openmetadata->glossary.
The format should be a list of strings, like this: [ 'Test_Glossary.term_one', 'Test_Glossary.term_two.nested_term.more_nested_term' ].
For details on dbt meta, refer to the dbt documentation.
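As a minimal sketch, the table-level entry described above might look like the following. It assumes a conventional dbt schema.yml with version: 2 and a model named customers; the description value is a placeholder for illustration and is not taken from any real project.

```yaml
# schema.yml (sketch; the description text is a placeholder)
version: 2

models:
  - name: customers
    description: Placeholder description of the customers model
    meta:
      openmetadata:
        # Glossary Term FQNs copied from the OpenMetadata glossary URL
        glossary: ['Test_Glossary.term_one', 'Test_Glossary.term_two.nested_term.more_nested_term']
```

The glossary value is written as an inline list to match the format shown above; an equivalent YAML block list (one FQN per line, each prefixed with a dash) should work the same way.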
After adding the Glossary term information to your schema.yml file, run your dbt workflow. The generated manifest.json file will then include the FQNs under node_name->meta->openmetadata->glossary as [ 'Test_Glossary.term_one', 'Test_Glossary.term_two.nested_term.more_nested_term' ].
3. Add Column-Level Glossary term information in schema.yml file
To associate a glossary term with a specific column in your dbt model, follow these steps:
- Locate the customer_id column within the customers table model in your schema.yml file.
- Under the customer_id column definition, add the glossary term FQNs under model->name->columns->column_name->meta->openmetadata->glossary as [ 'Test_Glossary.term_two.nested_term' ].
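The column-level steps above can be sketched as follows. This again assumes a conventional dbt schema.yml with version: 2; only the customers model and its customer_id column are shown, and any other model properties are omitted for brevity.

```yaml
# schema.yml (sketch; only the relevant column is shown)
version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        meta:
          openmetadata:
            # Glossary Term FQN attached to this column only
            glossary: ['Test_Glossary.term_two.nested_term']
```

Table-level and column-level entries can live in the same model definition: the table-level meta block sits directly under the model, while each column carries its own meta block.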
After adding the Glossary term information to your schema.yml file, run your dbt workflow. The generated manifest.json file will then include the FQNs under node_name->columns->column_name->meta->openmetadata->glossary as [ 'Test_Glossary.term_two.nested_term' ].
4. Viewing the Glossary term on tables and columns
Table-level and column-level Glossary terms ingested from dbt can be viewed on the corresponding node in OpenMetadata.
dbt Glossary term