Collate Documentation

📘 Metadata Exporter Documentation

The Metadata Exporter is a configurable application that enables organizations to export Data Quality (DQ) test results (such as pass/fail flags, rule IDs, asset metadata, and timestamps) and profile data into downstream analytical or data storage systems like Snowflake, BigQuery (BQ), and Databricks.

This functionality enables:

  • Feeding downstream dashboards (e.g., Power BI, Tableau)
  • Triggering alerting and remediation workflows
  • Historical tracking and versioning of data quality scores (DQI)

Supported destinations:

  • Snowflake
  • Databricks
  • BigQuery

Export triggers:

  • Manual
  • Scheduled

You can also choose which events to export (data quality results or profile data).

To configure the Metadata Exporter:

  • Go to: Settings > Applications > Metadata Exporter
Metadata Exporter Navigation

You'll find the following tabs:

  • Schedule
  • Configuration
  • Recent Runs
Metadata Exporter Tabs

Defines the agent responsible for executing the ingestion pipeline (for example, the Collate SaaS Agent).

Establishes connectivity to your export destination (e.g., Snowflake, BigQuery, Databricks).

Configuration

| Field | Description |
|---|---|
| Service Type | Snowflake |
| Username | Snowflake user login |
| Password | User password (optional if using a private key) |
| Account | Snowflake account identifier (e.g., AAAAA-99999) |
| Role | Snowflake role to assume (e.g., ACCOUNTADMIN) |
| Database | Target database (e.g., OBS_ANALYTICS) |
| Warehouse | Target virtual warehouse (e.g., COMPUTE_WH) |
| Query Tag | Optional tagging for traceability |
| Private Key & Passphrase | For key-pair auth (optional, secure) |

Advanced Option:

  • Client Session Keep Alive – Useful for long-running exports
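As a minimal sketch, the Snowflake fields above can be collected into a settings dictionary before configuring the exporter. The function name and key names below are illustrative, not the exact keys Collate uses; the only rule enforced is the one stated in the table, that a password is optional when a private key is supplied.

```python
# Hypothetical helper for assembling the Snowflake destination settings
# described above; names are illustrative, not Collate's actual schema.

def build_snowflake_settings(username, account, database, warehouse,
                             role=None, password=None, private_key=None,
                             passphrase=None, query_tag=None,
                             client_session_keep_alive=False):
    """Collect Snowflake export settings, requiring either a password
    or a private key (per the table: password is optional with a key)."""
    if password is None and private_key is None:
        raise ValueError("Provide a password or a private key")
    return {
        "serviceType": "Snowflake",
        "username": username,
        "account": account,          # e.g. "AAAAA-99999"
        "role": role,                # e.g. "ACCOUNTADMIN"
        "database": database,        # e.g. "OBS_ANALYTICS"
        "warehouse": warehouse,      # e.g. "COMPUTE_WH"
        "queryTag": query_tag,       # optional traceability tag
        "privateKey": private_key,
        "passphrase": passphrase,
        "clientSessionKeepAlive": client_session_keep_alive,
    }

settings = build_snowflake_settings(
    "exporter", "AAAAA-99999", "OBS_ANALYTICS", "COMPUTE_WH",
    role="ACCOUNTADMIN", password="s3cret",
    client_session_keep_alive=True,  # useful for long-running exports
)
print(settings["account"])  # AAAAA-99999
```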
| Field | Description |
|---|---|
| Service Type | Must be BigQuery |
| Project ID | GCP project where the BigQuery dataset resides |
| Dataset ID | Target dataset where the metadata will be exported |
| Table Name | Destination table name (BQ table to export metadata to) |
| Service Account JSON | Contents of the service account key in JSON format with write access |
| Location | BigQuery region (e.g., us-central1) |

Security Note: Ensure the service account has the BigQuery Data Editor and BigQuery Job User roles.
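Before pasting the Service Account JSON, it can be worth sanity-checking it. The sketch below assumes Google's standard service-account key file layout (`type`, `project_id`, `private_key`, `client_email`); the helper name is hypothetical, and note that IAM roles are granted in GCP, not recorded in the key file itself.

```python
import json

# Illustrative check of a Service Account JSON key file before using it
# in the BigQuery configuration; key names follow Google's standard
# key-file format, but the helper itself is a hypothetical sketch.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}

def check_service_account(raw_json: str, expected_project: str) -> dict:
    """Parse the key file and confirm it targets the right GCP project."""
    info = json.loads(raw_json)
    missing = REQUIRED_KEYS - info.keys()
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError("not a service-account key file")
    if info["project_id"] != expected_project:
        raise ValueError("key belongs to a different project")
    return info

sample = json.dumps({
    "type": "service_account",
    "project_id": "my-gcp-project",                 # placeholder project
    "private_key": "-----BEGIN PRIVATE KEY-----",   # truncated placeholder
    "client_email": "exporter@my-gcp-project.iam.gserviceaccount.com",
})
info = check_service_account(sample, "my-gcp-project")
```

Remember that the BigQuery Data Editor and BigQuery Job User roles still have to be granted to this account in GCP IAM; a structurally valid key file does not imply the right permissions.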

| Field | Description |
|---|---|
| Service Type | Must be Databricks |
| Host URL | Databricks workspace URL (e.g., https://<region>.azuredatabricks.net) |
| Token | Personal Access Token (PAT) for API authentication |
| Cluster ID | Target cluster where jobs will run |
| Database Name | Target database within the Databricks environment |
| Schema Name | Schema (if applicable) |
| Table Name | Destination table to store metadata |
| Path (Optional) | DBFS path or external location (if exporting to files instead of a table) |

Requirements:

  • The token must have workspace-wide read/write access.
  • The cluster must have access to the target database or mount location.
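A minimal pre-flight check for the Databricks fields above could look like the sketch below. The function and key names are illustrative assumptions, not Collate's actual validation logic; the checks simply mirror the table (HTTPS workspace URL, token, cluster ID, and either a destination table or a path).

```python
# Hypothetical sketch validating Databricks export settings before a run;
# field names mirror the configuration table above but are not Collate's
# actual schema.

def validate_databricks_settings(cfg: dict) -> list:
    """Return a list of human-readable problems (empty list means OK)."""
    problems = []
    if not cfg.get("hostUrl", "").startswith("https://"):
        problems.append("Host URL must be an https:// workspace URL")
    if not cfg.get("token"):
        problems.append("A Personal Access Token is required")
    if not cfg.get("clusterId"):
        problems.append("A target cluster ID is required")
    # Either a destination table or an export path must be given.
    if not cfg.get("tableName") and not cfg.get("path"):
        problems.append("Set a table name, or a DBFS/external path")
    return problems

cfg = {
    "hostUrl": "https://adb-123.azuredatabricks.net",  # placeholder URL
    "token": "dapiXXXX",                               # placeholder PAT
    "clusterId": "0101-123456-abcd123",                # placeholder cluster
    "databaseName": "obs",
    "tableName": "collate_metadata",
}
assert validate_databricks_settings(cfg) == []
```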

Defines the temporal scope of the data to be exported.

| Field | Description |
|---|---|
| Range Type (exportRange.rangeType) | Options: ALL, LATEST, or DATE_RANGE |
| Interval (exportRange.interval) | Used with DATE_RANGE (e.g., 7) |
| Unit (exportRange.unit) | Time unit for the interval (e.g., days, hours) |
| Event Types | Select which types of DQ events to export (All, or specific types) |
| Backfill | Enable to process historical data on the first run |
Export Range
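To make the range semantics concrete, here is a small sketch of how a DATE_RANGE window could be derived from the exportRange fields. This is an assumption about the behavior, not Collate's implementation; LATEST is omitted because it would depend on the timestamp of the last successful run.

```python
from datetime import datetime, timedelta, timezone

# Illustrative interpretation of the exportRange fields; field values
# mirror the configuration keys above, but the logic is a sketch only.
UNITS = {"hours": "hours", "days": "days"}

def export_window(range_type, interval=None, unit=None, now=None):
    """Return (start, end) datetimes for the export, or None for ALL."""
    now = now or datetime.now(timezone.utc)
    if range_type == "ALL":
        return None                   # no lower bound: export everything
    if range_type == "DATE_RANGE":
        delta = timedelta(**{UNITS[unit]: interval})
        return (now - delta, now)     # e.g. the last 7 days
    # LATEST would need the last run's timestamp, so it is omitted here.
    raise ValueError(f"unsupported rangeType: {range_type}")

now = datetime(2024, 1, 8, tzinfo=timezone.utc)
start, end = export_window("DATE_RANGE", interval=7, unit="days", now=now)
print(start.isoformat())  # 2024-01-01T00:00:00+00:00
```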

Specifies the target table where exported metadata will be written.

| Field | Description |
|---|---|
| Database Name (tableConfiguration.databaseName) | e.g., OBS_ANALYTICS |
| Schema Name (tableConfiguration.schemaName) | e.g., OBS_DATA |
| Table Name (tableConfiguration.tableName) | e.g., COLLATE_METADATA |
Table Configuration
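The three tableConfiguration fields combine into a fully qualified destination table name. The dotted database.schema.table form below matches the convention of all three example destinations, though the helper itself is only an illustration.

```python
# Illustrative: how the three tableConfiguration fields resolve to one
# fully qualified destination table (dotted-name convention assumed).

def qualified_table_name(database_name, schema_name, table_name):
    return f"{database_name}.{schema_name}.{table_name}"

target = qualified_table_name("OBS_ANALYTICS", "OBS_DATA", "COLLATE_METADATA")
print(target)  # OBS_ANALYTICS.OBS_DATA.COLLATE_METADATA
```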

Configure how often the metadata export runs:

  • Manual: Click Run Now on the Schedule tab
  • Scheduled: Set up periodic exports (on the feature roadmap)
Scheduling

Under the Recent Runs tab:

  • View status: Success or Failed
  • Check:
    • Run time
    • Duration
    • Logs for troubleshooting
    • Config used during run

A successful export shows the Status: Success, with details on execution duration and timestamps.

Monitoring Runs

| Key | Description |
|---|---|
| exportRange.rangeType | Defines the range (ALL, LATEST, DATE_RANGE) |
| exportRange.interval | Interval number for DATE_RANGE |
| exportRange.unit | Time unit (days, hours) |
| eventTypes | Event types to export |
| Backfill | Boolean; enables historical data processing |
| tableConfiguration.databaseName | Target DB |
| tableConfiguration.schemaName | Target schema |
| tableConfiguration.tableName | Target table |
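Putting the keys above together, a complete configuration might look like the hypothetical sketch below. The dotted key names come from the reference table; the nested layout and the flatten helper are assumptions for illustration, not Collate's actual config format.

```python
# Hypothetical end-to-end configuration using the dotted keys from the
# reference table above; structure and helper are illustrative only.

REQUIRED = {
    "exportRange.rangeType",
    "tableConfiguration.databaseName",
    "tableConfiguration.schemaName",
    "tableConfiguration.tableName",
}

def flatten(cfg, prefix=""):
    """Flatten nested dicts into dotted keys, matching the table above."""
    out = {}
    for key, value in cfg.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, dotted + "."))
        else:
            out[dotted] = value
    return out

config = {
    "exportRange": {"rangeType": "DATE_RANGE", "interval": 7, "unit": "days"},
    "eventTypes": "All",
    "backfill": True,
    "tableConfiguration": {
        "databaseName": "OBS_ANALYTICS",
        "schemaName": "OBS_DATA",
        "tableName": "COLLATE_METADATA",
    },
}
flat = flatten(config)
missing = REQUIRED - flat.keys()
assert not missing, f"missing keys: {sorted(missing)}"
print(flat["exportRange.rangeType"])  # DATE_RANGE
```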