
DataLakeCatalog

The DataLakeCatalog database engine connects ClickHouse to external data catalogs so that you can query data stored in open table formats without duplicating it. ClickHouse then acts as a query engine on top of your existing data lake infrastructure.

Supported Catalogs

The DataLakeCatalog engine supports the following data catalogs:

  • AWS Glue Catalog - For Iceberg tables in AWS environments
  • Databricks Unity Catalog - For Delta Lake and Iceberg tables
  • Hive Metastore - Traditional Hadoop ecosystem catalog
  • REST Catalogs - Any catalog supporting the Iceberg REST specification

Creating a Database

You will need to enable the relevant settings below to use the DataLakeCatalog engine:
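As a minimal sketch, enabling the engine might look like the statements below. The setting names are an assumption based on the experimental flags used in recent ClickHouse releases; they can differ between versions, so check the settings reference for your release and enable only the flag for the catalog type you use.

```sql
-- Assumed setting names; verify against your ClickHouse version.
SET allow_experimental_database_iceberg = 1;        -- REST (Iceberg) catalogs
SET allow_experimental_database_unity_catalog = 1;  -- Databricks Unity Catalog
SET allow_experimental_database_glue_catalog = 1;   -- AWS Glue Catalog
SET allow_experimental_database_hms_catalog = 1;    -- Hive Metastore
```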

Databases with the DataLakeCatalog engine can be created using the following syntax:
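The general shape of the statement can be sketched as follows. This assumes the engine accepts the catalog endpoint as its first argument; `database_name`, the endpoint URL, the warehouse name, and the credential value are hypothetical placeholders rather than values from this page.

```sql
-- Sketch only: all names and values below are placeholders.
CREATE DATABASE database_name
ENGINE = DataLakeCatalog('https://catalog.example.com/api')  -- assumed: catalog endpoint
SETTINGS
    catalog_type = 'rest',                  -- one of: glue, unity, rest, hive
    warehouse = 'my_warehouse',             -- warehouse/database name in the catalog
    catalog_credential = '<credential>';    -- API key or token for the catalog
```

Which settings are required depends on the catalog type; the table of supported settings lists the options.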

The following settings are supported:

Setting                 Description
catalog_type            Type of catalog: glue, unity (Delta), rest (Iceberg), hive
warehouse               The warehouse/database name to use in the catalog
catalog_credential      Authentication credential for the catalog (e.g., API key or token)
auth_header             Custom HTTP header for authentication with the catalog service
auth_scope              OAuth2 scope for authentication (if using OAuth)
storage_endpoint        Endpoint URL for the underlying storage
oauth_server_uri        URI of the OAuth2 authorization server for authentication
vended_credentials      Boolean indicating whether to use vended credentials (AWS-specific)
aws_access_key_id       AWS access key ID for S3/Glue access (if not using vended credentials)
aws_secret_access_key   AWS secret access key for S3/Glue access (if not using vended credentials)
region                  AWS region for the service (e.g., us-east-1)
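
To illustrate how these settings combine, here is a hedged sketch of an AWS Glue configuration that uses static credentials rather than vended ones. The database name `glue_db` and the key values are placeholders, and the assumption that the Glue catalog needs no endpoint argument should be checked against your ClickHouse version.

```sql
-- Sketch only: glue_db and the credential values are placeholders.
CREATE DATABASE glue_db
ENGINE = DataLakeCatalog
SETTINGS
    catalog_type = 'glue',
    region = 'us-east-1',
    aws_access_key_id = '<access-key-id>',
    aws_secret_access_key = '<secret-access-key>';

-- List the tables the catalog exposes through the new database.
SHOW TABLES FROM glue_db;
```

Once the database exists, tables from the catalog can be queried like any other ClickHouse table, without copying the underlying data.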

Examples

See the pages below for examples of using the DataLakeCatalog engine: