HUDI External Table
HUDI introduces a structured storage layer to data lakes, greatly enhancing their usability and making the experience comparable to working with a data warehouse. Through the external table feature supported by Singdata Lakehouse, you can easily access and work with this structured data.
Creating a HUDI Format External Table
Example
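A minimal sketch of a creation statement, assuming Singdata Lakehouse accepts a CREATE EXTERNAL TABLE statement with USING HUDI, CONNECTION, and LOCATION clauses. The table name, connection name, and bucket path are hypothetical placeholders, and the clause names and their order should be checked against the product's SQL reference.

```sql
-- Hypothetical names: hudi_orders, my_oss_conn, and the bucket path are placeholders.
-- Assumed clauses: USING <format>, CONNECTION <name>, LOCATION '<path>'.
CREATE EXTERNAL TABLE IF NOT EXISTS hudi_orders
USING HUDI
CONNECTION my_oss_conn
LOCATION 'oss://my-bucket/warehouse/hudi_orders/'
COMMENT 'External table over a Hudi dataset';
```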
Dropping an External Table
Parameter Description
- IF EXISTS: Optional. If specified, no error is raised when the table does not exist.
- schema_name: Optional. Specifies the schema name. If not specified, the current user's schema is used by default.
- table_name: The name of the table to drop.
Notes
- Dropping an external table does not delete the underlying data, because the data is stored in an external system. The drop operation only removes the table's mapping metadata.
Example
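Based on the parameters described in this section, a drop statement could look like the sketch below. It assumes the standard DROP TABLE form; my_schema and hudi_orders are hypothetical names.

```sql
-- Hypothetical schema and table names.
-- Only the table's mapping metadata is removed; the files in the
-- external system are left in place.
DROP TABLE IF EXISTS my_schema.hudi_orders;
```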
Viewing External Table Details
Parameter Description
- DESC[RIBE]: DESC and DESCRIBE are interchangeable; both describe the table structure.
- TABLE: Optional. Specifies the type of object to describe, such as BASE TABLE or VIEW.
- EXTENDED: Optional. When included, additional extended information is displayed, such as the table's creation statement and Location.
- table_name: The name of the table whose structure you want to view.
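The parameters above can be combined as in the following sketch. The table name hudi_orders is hypothetical, and the keyword order (TABLE before EXTENDED) is an assumption drawn from the order of the parameter list.

```sql
-- Short form: shows the table structure.
DESC hudi_orders;

-- Long form with both optional keywords; EXTENDED adds details such as
-- the creation statement and Location.
DESCRIBE TABLE EXTENDED hudi_orders;
```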
Modifying an External Table
Renaming a Table
You can use the ALTER TABLE command to rename an existing table.
Syntax
Example
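A minimal sketch of a rename, assuming the common ALTER TABLE ... RENAME TO form; both table names are hypothetical.

```sql
-- Renames the external table's metadata entry; the data files in the
-- external system are not moved.
ALTER TABLE hudi_orders RENAME TO hudi_orders_archive;
```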
Modifying Table Comments
You can use the ALTER TABLE command to add or update a comment on a table.
Syntax
Example
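A minimal sketch of updating a table comment, assuming the ALTER TABLE statement accepts a SET COMMENT clause. The table name and comment text are hypothetical, and the exact clause should be confirmed against the product's SQL reference.

```sql
-- Assumed clause: SET COMMENT '<text>'.
ALTER TABLE hudi_orders SET COMMENT 'Orders stored in Hudi format on OSS';
```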
External Table Billing
- Storage cost: External tables do not incur storage costs because the data is not stored in Singdata Lakehouse.
- Compute cost: Querying an external table consumes compute resources and therefore incurs compute costs.
External Table Permissions
External tables share the same permission model as internal tables. Because external tables do not support INSERT, UPDATE, TRUNCATE, DELETE, or UNDROP operations, there are no corresponding permission points for those operations.
- Create permission: Requires the create table privilege.
- Drop permission: Requires the DROP privilege.
- Read permission: Requires the SELECT privilege.
Usage Notes
- Connection configuration: When creating a connection, make sure the endpoint is configured correctly so that Singdata Lakehouse can connect successfully. If Singdata Lakehouse and the object storage are in the same cloud service and the same region, you can typically use the internal network address for connectivity. If they are in different network environments, use the public endpoint of the object storage.
Examples
Connecting to Alibaba Cloud OSS
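A minimal sketch of an OSS connection, assuming a CREATE STORAGE CONNECTION statement with TYPE, ENDPOINT, ACCESS_ID, and ACCESS_KEY parameters. The connection name and credentials are hypothetical placeholders, and the parameter names should be checked against the product's SQL reference.

```sql
-- Hypothetical connection name and credentials; replace with your own.
-- The internal endpoint is used on the assumption that the Lakehouse and
-- the bucket are in the same cloud and region (cn-hangzhou here).
-- For a different network environment, use the public endpoint instead,
-- for example 'oss-cn-hangzhou.aliyuncs.com'.
CREATE STORAGE CONNECTION IF NOT EXISTS my_oss_conn
    TYPE OSS
    ENDPOINT = 'oss-cn-hangzhou-internal.aliyuncs.com'
    ACCESS_ID = '<your-access-key-id>'
    ACCESS_KEY = '<your-access-key-secret>';
```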
Connecting to Google GCS
When Singdata Lakehouse connects to Google Cloud Storage (GCS), it uses a service account key for authentication. Follow the steps below:
- Obtain the service account key:
    - Log in to the Google Cloud Console.
    - Follow the Google Cloud documentation to create and download a JSON key file for your service account.
- Configure the private_key parameter:
    - Open the downloaded JSON key file and copy the full private key content.
- Note:
    - When configuring private_key, you must prefix the value with r. The r prefix means the string is treated as a raw string, so special characters and Unicode characters will not be escaped.
- When configuring
