Auto-Classification in OpenMetadata
OpenMetadata identifies PII data and auto tags or suggests the tags. The data profiler automatically tags the PII-Sensitive data. The addition of tags about PII data helps consumers and governance teams identify data that needs to be treated carefully.
In the example below, the columns ‘last_name’ and ‘social security number’ are auto-tagged as PII-sensitive. This works using NLP as part of the profiler during ingestion.
User_name and Social Security Number are Auto-Classified as PII Sensitive
In the below example, the column ‘number_of_orders’ is also auto-tagged as PII Sensitive, even though the column name does not provide much information.
Column Name does not provide much information
When we look at the content of the column ‘number_of_orders’ in the Sample Data tab, it becomes clear that the auto-classification is based on the data in the column.
Column Data provides information
You can read more about Auto PII Tagging here.
Tag Mapping
Tag mapping is supported in the backend and not in the OpenMetadata UI. When two related tags are associated with each other, applying one tag, automatically applies the other tag. For example, when the tag Personal Data.Personal
is applied, it automatically applies another tag Data Classification.Confidential
. That way, applying the tag Personal
automatically applies the tag Confidential
.
Tiers helps to define the importance of data to an organization.