170 changes: 126 additions & 44 deletions docs/en/data_source/catalog/iceberg/iceberg_catalog.md
@@ -465,32 +465,38 @@ If you choose AWS S3 as storage for your Iceberg cluster, take one of the follow
"aws.s3.region" = "<aws_s3_region>"
```

- To choose vended credential (supported from v4.0 onwards) with the REST catalog, configure `StorageCredentialParams` as follows:

```SQL
"aws.s3.region" = "<aws_s3_region>"
```

`StorageCredentialParams` for AWS S3:

###### aws.s3.use_instance_profile

Required: Yes
Description: Specifies whether to enable the instance profile-based authentication method and the assumed role-based authentication method. Valid values: `true` and `false`. Default value: `false`.
- Required: Yes
- Description: Specifies whether to enable the instance profile-based authentication method and the assumed role-based authentication method. Valid values: `true` and `false`. Default value: `false`.

###### aws.s3.iam_role_arn

Required: No
Description: The ARN of the IAM role that has privileges on your AWS S3 bucket. If you use the assumed role-based authentication method to access AWS S3, you must specify this parameter.
- Required: No
- Description: The ARN of the IAM role that has privileges on your AWS S3 bucket. If you use the assumed role-based authentication method to access AWS S3, you must specify this parameter.

###### aws.s3.region

Required: Yes
Description: The region in which your AWS S3 bucket resides. Example: `us-west-1`.
- Required: Yes
- Description: The region in which your AWS S3 bucket resides. Example: `us-west-1`.

###### aws.s3.access_key

Required: No
Description: The access key of your IAM user. If you use the IAM user-based authentication method to access AWS S3, you must specify this parameter.
- Required: No
- Description: The access key of your IAM user. If you use the IAM user-based authentication method to access AWS S3, you must specify this parameter.

###### aws.s3.secret_key

Required: No
Description: The secret key of your IAM user. If you use the IAM user-based authentication method to access AWS S3, you must specify this parameter.
- Required: No
- Description: The secret key of your IAM user. If you use the IAM user-based authentication method to access AWS S3, you must specify this parameter.

For information about how to choose an authentication method for accessing AWS S3 and how to configure an access control policy in AWS IAM Console, see [Authentication parameters for accessing AWS S3](../../../integrations/authenticate_to_aws_resources.md#authentication-parameters-for-accessing-aws-s3).
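As a rough illustration of how the parameters above combine, the fragment below sketches two possible `StorageCredentialParams` sets, one for the assumed role-based method and one for the IAM user-based method. The placeholders `<iam_role_arn>`, `<iam_user_access_key>`, and `<iam_user_secret_key>` are assumptions standing in for your own values, and only one of the two sets would be used in a given catalog.

```SQL
-- Sketch: assumed role-based authentication (placeholder values are assumptions).
"aws.s3.use_instance_profile" = "true",
"aws.s3.iam_role_arn" = "<iam_role_arn>",
"aws.s3.region" = "<aws_s3_region>"

-- Sketch: IAM user-based authentication (use instead of the set above, not together).
"aws.s3.use_instance_profile" = "false",
"aws.s3.access_key" = "<iam_user_access_key>",
"aws.s3.secret_key" = "<iam_user_secret_key>",
"aws.s3.region" = "<aws_s3_region>"
```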

@@ -522,28 +528,28 @@ If you choose an S3-compatible storage system, such as MinIO, as storage for you

###### aws.s3.enable_ssl

Required: Yes
Description: Specifies whether to enable SSL connection.<br />Valid values: `true` and `false`. Default value: `true`.
- Required: Yes
- Description: Specifies whether to enable SSL connection.<br />Valid values: `true` and `false`. Default value: `true`.

###### aws.s3.enable_path_style_access

Required: Yes
Description: Specifies whether to enable path-style access.<br />Valid values: `true` and `false`. Default value: `false`. For MinIO, you must set the value to `true`.<br />Path-style URLs use the following format: `https://s3.<region_code>.amazonaws.com/<bucket_name>/<key_name>`. For example, if you create a bucket named `DOC-EXAMPLE-BUCKET1` in the US West (Oregon) Region, and you want to access the `alice.jpg` object in that bucket, you can use the following path-style URL: `https://s3.us-west-2.amazonaws.com/DOC-EXAMPLE-BUCKET1/alice.jpg`.
- Required: Yes
- Description: Specifies whether to enable path-style access.<br />Valid values: `true` and `false`. Default value: `false`. For MinIO, you must set the value to `true`.<br />Path-style URLs use the following format: `https://s3.<region_code>.amazonaws.com/<bucket_name>/<key_name>`. For example, if you create a bucket named `DOC-EXAMPLE-BUCKET1` in the US West (Oregon) Region, and you want to access the `alice.jpg` object in that bucket, you can use the following path-style URL: `https://s3.us-west-2.amazonaws.com/DOC-EXAMPLE-BUCKET1/alice.jpg`.

###### aws.s3.endpoint

Required: Yes
Description: The endpoint that is used to connect to your S3-compatible storage system instead of AWS S3.
- Required: Yes
- Description: The endpoint that is used to connect to your S3-compatible storage system instead of AWS S3.

###### aws.s3.access_key

Required: Yes
Description: The access key of your IAM user.
- Required: Yes
- Description: The access key of your IAM user.

###### aws.s3.secret_key

Required: Yes
Description: The secret key of your IAM user.
- Required: Yes
- Description: The secret key of your IAM user.
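
Taken together, the five parameters above might be set as follows for a MinIO deployment. This is a minimal sketch: it assumes the endpoint is served over plain HTTP (hence `aws.s3.enable_ssl` is `false`), and `<s3_endpoint>`, `<iam_user_access_key>`, and `<iam_user_secret_key>` are placeholders rather than real values.

```SQL
-- Sketch: StorageCredentialParams for an S3-compatible system such as MinIO.
-- Assumes a non-TLS endpoint; set enable_ssl to "true" if the endpoint uses HTTPS.
"aws.s3.enable_ssl" = "false",
"aws.s3.enable_path_style_access" = "true",
"aws.s3.endpoint" = "<s3_endpoint>",
"aws.s3.access_key" = "<iam_user_access_key>",
"aws.s3.secret_key" = "<iam_user_secret_key>"
```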

</TabItem>

@@ -572,34 +578,36 @@ If you choose Blob Storage as storage for your Iceberg cluster, take one of the
"azure.blob.sas_token" = "<storage_account_SAS_token>"
```

- To choose REST catalog with vended credential (supported from v4.0 onwards), you do not need to configure `StorageCredentialParams`.

`StorageCredentialParams` for Microsoft Azure:

###### azure.blob.storage_account

Required: Yes
Description: The username of your Blob Storage account.
- Required: Yes
- Description: The username of your Blob Storage account.

###### azure.blob.shared_key

Required: Yes
Description: The shared key of your Blob Storage account.
- Required: Yes
- Description: The shared key of your Blob Storage account.

###### azure.blob.account_name

Required: Yes
Description: The username of your Blob Storage account.
- Required: Yes
- Description: The username of your Blob Storage account.

###### azure.blob.container

Required: Yes
Description: The name of the blob container that stores your data.
- Required: Yes
- Description: The name of the blob container that stores your data.

###### azure.blob.sas_token

Required: Yes
Description: The SAS token that is used to access your Blob Storage account.
- Required: Yes
- Description: The SAS token that is used to access your Blob Storage account.
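
The parameters above fall into two groups, one per authentication method. The fragment below is a sketch of that grouping; the placeholder names such as `<storage_account_name>` and `<storage_account_shared_key>` are assumptions, and only one group would appear in a given catalog definition.

```SQL
-- Sketch: Shared Key authentication (placeholder values are assumptions).
"azure.blob.storage_account" = "<storage_account_name>",
"azure.blob.shared_key" = "<storage_account_shared_key>"

-- Sketch: SAS Token authentication (use instead of the set above, not together).
"azure.blob.account_name" = "<storage_account_name>",
"azure.blob.container" = "<container_name>",
"azure.blob.sas_token" = "<storage_account_SAS_token>"
```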

###### Azure Data Lake Storage Gen1
##### Azure Data Lake Storage Gen1

If you choose Data Lake Storage Gen1 as storage for your Iceberg cluster, take one of the following actions:

@@ -619,7 +627,7 @@ Or:
"azure.adls1.oauth2_endpoint" = "<OAuth_2.0_authorization_endpoint_v2>"
```

###### Azure Data Lake Storage Gen2
##### Azure Data Lake Storage Gen2

If you choose Data Lake Storage Gen2 as storage for your Iceberg cluster, take one of the following actions:

@@ -650,6 +658,8 @@ If you choose Data Lake Storage Gen2 as storage for your Iceberg cluster, take o
"azure.adls2.oauth2_client_endpoint" = "<service_principal_client_endpoint>"
```

- To choose REST catalog with vended credential (supported from v4.0 onwards), you do not need to configure `StorageCredentialParams`.

</TabItem>

<TabItem value="GCS" label="Google GCS" >
@@ -692,31 +702,33 @@ If you choose Google GCS as storage for your Iceberg cluster, take one of the fo
"gcp.gcs.impersonation_service_account" = "<data_google_service_account_email>"
```

- To choose REST catalog with vended credential (supported from v4.0 onwards), you do not need to configure `StorageCredentialParams`.

`StorageCredentialParams` for Google GCS:

###### gcp.gcs.service_account_email

Default value: ""
Example: "[user@hello.iam.gserviceaccount.com](mailto:user@hello.iam.gserviceaccount.com)"
Description: The email address in the JSON file generated at the creation of the service account.
- Default value: ""
- Example: "[user@hello.iam.gserviceaccount.com](mailto:user@hello.iam.gserviceaccount.com)"
- Description: The email address in the JSON file generated at the creation of the service account.

###### gcp.gcs.service_account_private_key_id

Default value: ""
Example: "61d257bd8479547cb3e04f0b9b6b9ca07af3b7ea"
Description: The private key ID in the JSON file generated at the creation of the service account.
- Default value: ""
- Example: "61d257bd8479547cb3e04f0b9b6b9ca07af3b7ea"
- Description: The private key ID in the JSON file generated at the creation of the service account.

###### gcp.gcs.service_account_private_key

Default value: ""
Example: "-----BEGIN PRIVATE KEY----xxxx-----END PRIVATE KEY-----\n"
Description: The private key in the JSON file generated at the creation of the service account.
- Default value: ""
- Example: "-----BEGIN PRIVATE KEY----xxxx-----END PRIVATE KEY-----\n"
- Description: The private key in the JSON file generated at the creation of the service account.

###### gcp.gcs.impersonation_service_account

Default value: ""
Example: "hello"
Description: The service account that you want to impersonate.
- Default value: ""
- Example: "hello"
- Description: The service account that you want to impersonate.
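
For the service account-based method, the first three parameters above would typically be supplied together, with values copied from the service account's JSON key file. The fragment below is a sketch under that assumption; the placeholder names are hypothetical.

```SQL
-- Sketch: service account-based authentication for GCS.
-- Each value is assumed to come from the JSON key file generated with the service account.
"gcp.gcs.service_account_email" = "<google_service_account_email>",
"gcp.gcs.service_account_private_key_id" = "<google_service_private_key_id>",
"gcp.gcs.service_account_private_key" = "<google_service_private_key>"
```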

</TabItem>

@@ -853,6 +865,27 @@ The following examples create an Iceberg catalog named `iceberg_catalog_hms` or
"aws.s3.region" = "us-west-2"
);
```

##### If you choose vended credential

If you choose REST catalog with vended credential, run a command like below:

```SQL
CREATE EXTERNAL CATALOG polaris_s3
PROPERTIES
(
"type" = "iceberg",
"iceberg.catalog.uri" = "http://xxx:xxx/api/catalog",
"iceberg.catalog.type" = "rest",
"iceberg.catalog.rest.nested-namespace-enabled" = "true",
"iceberg.catalog.security" = "oauth2",
"iceberg.catalog.oauth2.credential" = "xxxxx:xxxx",
"iceberg.catalog.oauth2.scope" = "PRINCIPAL_ROLE:ALL",
"iceberg.catalog.warehouse" = "iceberg_catalog",
"aws.s3.region" = "us-west-2"
);
```
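
Once a catalog such as `polaris_s3` exists, a quick way to check that it resolves is to list its databases. The statements below are a minimal sketch and assume the catalog was created successfully and that the REST catalog grants the configured principal access to at least one namespace.

```SQL
-- Sketch only: assumes the polaris_s3 catalog above was created successfully.
-- List the databases (Iceberg namespaces) visible through the catalog.
SHOW DATABASES FROM polaris_s3;

-- Alternatively, switch the session to the catalog and browse from there.
SET CATALOG polaris_s3;
SHOW DATABASES;
```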

</TabItem>

<TabItem value="HDFS" label="HDFS" >
@@ -930,6 +963,22 @@ PROPERTIES
);
```

- If you choose REST catalog with vended credential, run a command like below:

```SQL
CREATE EXTERNAL CATALOG polaris_azure
PROPERTIES (
"type" = "iceberg",
"iceberg.catalog.uri" = "http://xxx:xxx/api/catalog",
"iceberg.catalog.type" = "rest",
"iceberg.catalog.rest.nested-namespace-enabled" = "true",
"iceberg.catalog.security" = "oauth2",
"iceberg.catalog.oauth2.credential" = "xxxxx:xxxx",
"iceberg.catalog.oauth2.scope" = "PRINCIPAL_ROLE:ALL",
"iceberg.catalog.warehouse" = "iceberg_catalog"
);
```

##### Azure Data Lake Storage Gen1

- If you choose the Managed Service Identity authentication method, run a command like below:
@@ -1006,6 +1055,22 @@ PROPERTIES
);
```

- If you choose REST catalog with vended credential, run a command like below:

```SQL
CREATE EXTERNAL CATALOG polaris_azure
PROPERTIES (
"type" = "iceberg",
"iceberg.catalog.uri" = "http://xxx:xxx/api/catalog",
"iceberg.catalog.type" = "rest",
"iceberg.catalog.rest.nested-namespace-enabled" = "true",
"iceberg.catalog.security" = "oauth2",
"iceberg.catalog.oauth2.credential" = "xxxxx:xxxx",
"iceberg.catalog.oauth2.scope" = "PRINCIPAL_ROLE:ALL",
"iceberg.catalog.warehouse" = "iceberg_catalog"
);
```

</TabItem>

<TabItem value="GCS" label="Google GCS" >
@@ -1071,6 +1136,23 @@ PROPERTIES
"gcp.gcs.impersonation_service_account" = "<data_google_service_account_email>"
);
```

- If you choose REST catalog with vended credential, run a command like below:

```SQL
CREATE EXTERNAL CATALOG polaris_gcp
PROPERTIES (
"type" = "iceberg",
"iceberg.catalog.uri" = "http://xxx:xxx/api/catalog",
"iceberg.catalog.type" = "rest",
"iceberg.catalog.rest.nested-namespace-enabled" = "true",
"iceberg.catalog.security" = "oauth2",
"iceberg.catalog.oauth2.credential" = "xxxxx:xxxx",
"iceberg.catalog.oauth2.scope" = "PRINCIPAL_ROLE:ALL",
"iceberg.catalog.warehouse" = "iceberg_catalog"
);
```

</TabItem>

</Tabs>