DataLakeCatalog

The DataLakeCatalog database engine connects ClickHouse to external data catalogs so that you can query open table format data in place, without duplicating it. ClickHouse then acts as a query engine on top of your existing data lake infrastructure.

Supported Catalogs

The DataLakeCatalog engine supports the following data catalogs:

AWS Glue Catalog - For Iceberg tables in AWS environments
Databricks Unity Catalog - For Delta Lake and Iceberg tables
Hive Metastore - Traditional Hadoop ecosystem catalog
REST Catalogs - Any catalog supporting the Iceberg REST specification

Creating a Database

To use the DataLakeCatalog engine, you must first enable the setting that corresponds to your catalog type.
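
As a minimal sketch, the settings can be enabled for the current session with `SET`. The setting names below are an assumption based on the experimental flags used in recent ClickHouse releases; they may be named differently in your version, so treat them as illustrative rather than authoritative.

```sql
-- Assumed setting names; enable only the one that matches your catalog type.
SET allow_experimental_database_iceberg = 1;        -- REST (Iceberg) catalogs
SET allow_experimental_database_unity_catalog = 1;  -- Databricks Unity Catalog
SET allow_experimental_database_glue_catalog = 1;   -- AWS Glue Catalog
SET allow_experimental_database_hms_catalog = 1;    -- Hive Metastore

-- If a name is rejected, list the catalog-related settings your server knows about.
SELECT name, value
FROM system.settings
WHERE name LIKE '%database%catalog%' OR name LIKE '%database_iceberg%';
```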

Databases with the DataLakeCatalog engine can be created using the following syntax:
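
A sketch of the general form, assuming the catalog endpoint is passed as the first engine argument, optionally followed by a user and password. `database_name`, `catalog_endpoint`, `user`, and `password` are placeholders, and square brackets mark optional parts rather than literal syntax.

```sql
-- General form (placeholders, not runnable as written).
CREATE DATABASE database_name
ENGINE = DataLakeCatalog(catalog_endpoint[, user, password])
SETTINGS
    catalog_type = 'glue' | 'unity' | 'rest' | 'hive',
    [other settings, ...]
```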

The following settings are supported:

| Setting | Description |
|---|---|
| `catalog_type` | Type of catalog: `glue`, `unity` (Delta), `rest` (Iceberg), `hive` |
| `warehouse` | The warehouse/database name to use in the catalog |
| `catalog_credential` | Authentication credential for the catalog (e.g., API key or token) |
| `auth_header` | Custom HTTP header for authentication with the catalog service |
| `auth_scope` | OAuth2 scope for authentication (if using OAuth) |
| `storage_endpoint` | Endpoint URL for the underlying storage |
| `oauth_server_uri` | URI of the OAuth2 authorization server for authentication |
| `vended_credentials` | Boolean indicating whether to use vended credentials (AWS-specific) |
| `aws_access_key_id` | AWS access key ID for S3/Glue access (if not using vended credentials) |
| `aws_secret_access_key` | AWS secret access key for S3/Glue access (if not using vended credentials) |
| `region` | AWS region for the service (e.g., `us-east-1`) |
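
To show how these settings fit together, here is a minimal sketch for an Iceberg REST catalog. The database name `demo`, the endpoint URLs, the warehouse name, and the table name `my_namespace.my_table` are all hypothetical. The sketch also assumes that tables are exposed under a combined `namespace.table` name, which must be quoted with backticks; verify this with `SHOW TABLES` in your environment.

```sql
-- Hypothetical REST catalog running locally, backed by S3-compatible storage.
CREATE DATABASE demo
ENGINE = DataLakeCatalog('http://localhost:8181/v1')
SETTINGS
    catalog_type = 'rest',
    warehouse = 'demo',
    storage_endpoint = 'http://localhost:9002/warehouse';

-- List the tables that the catalog exposes through this database.
SHOW TABLES FROM demo;

-- Query a table in place (hypothetical name; note the backticks).
SELECT count(*) FROM demo.`my_namespace.my_table`;
```

For AWS Glue, the same pattern would instead use `catalog_type = 'glue'` together with `region` and either `vended_credentials` or the `aws_access_key_id` and `aws_secret_access_key` pair.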

Examples

See the pages below for examples of using the DataLakeCatalog engine:
