External Catalog

External Catalog is the federated query entry point in Lakehouse. It maps the metadata catalogs of external data systems (Hive, Databricks, Snowflake, etc.) into Lakehouse so that you can query external data directly with standard SQL, without copying the data.

Difference from External Schema: An External Catalog is an independent top-level catalog, accessed with three-level naming (catalog.schema.table). An External Schema is a schema mounted into the current workspace, accessed with two-level naming (schema.table), which makes it better suited for integrating Hive databases into an existing workspace. See Organization Hierarchy.
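To make the naming difference concrete, here is a minimal sketch in standard SQL. All object names (hive_prod, hive_sales, sales, orders) and column names are hypothetical placeholders, not objects that exist in your environment.

```sql
-- External Catalog: three-level naming, catalog.schema.table.
-- hive_prod is a hypothetical External Catalog; sales is a schema inside it.
SELECT order_id, amount
FROM hive_prod.sales.orders
LIMIT 10;

-- External Schema: two-level naming, schema.table.
-- hive_sales is a hypothetical External Schema mounted into the current workspace.
SELECT order_id, amount
FROM hive_sales.orders
LIMIT 10;
```

The only difference between the two queries is the leading catalog qualifier. An External Schema resolves inside the current workspace, so no catalog prefix is needed.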

Supported Data Sources

Data Source | Connection Method
Apache Hive | Hive Metastore URIs
Databricks Unity Catalog | Databricks API
Iceberg REST Catalog | Iceberg REST API
Snowflake Open Catalog | Iceberg REST API + OAuth

Use Cases

  • Cross-platform federated queries: Query Lakehouse local data and Hive/Databricks data in the same statement, with no ETL required
  • In-place data lake acceleration: Keep data in OSS/HDFS and use Lakehouse to replace Spark/Hive for ETL or Presto/Trino for ad-hoc queries
  • Gradual migration: Maintain business continuity through External Catalog during migration; switch over after verifying data consistency
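The federated query and migration use cases above might look like the following in standard SQL. This is a sketch under stated assumptions: hive_prod is a hypothetical External Catalog, crm.customers and sales.orders are hypothetical local tables, and the column names are illustrative only.

```sql
-- Cross-platform federated query: join a local table (schema.table)
-- with a table exposed through an External Catalog (catalog.schema.table).
SELECT c.customer_id, c.region, SUM(o.amount) AS total_amount
FROM crm.customers AS c
JOIN hive_prod.sales.orders AS o
  ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.region;

-- Gradual migration: a simple consistency check that compares row counts
-- between the external source table and its migrated local copy.
SELECT 'external' AS side, COUNT(*) AS row_count FROM hive_prod.sales.orders
UNION ALL
SELECT 'local' AS side, COUNT(*) AS row_count FROM sales.orders;
```

A row-count comparison is only a first-pass check. Before switching over, you would typically also compare aggregates or checksums on key columns.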

Permissions

Currently, only the instance_admin role can query an External Catalog after it has been created.