Volume Object
A Volume is Singdata Lakehouse's mount point for file storage. It either gives access to files in external object storage (OSS/COS/S3) or serves as Lakehouse's built-in file storage space.
What is Volume
A Volume is similar to the concept of "external table" or "mount point" in traditional databases, but it is oriented toward files rather than tables. With Volume, you can:
- Directly query CSV/JSON/Parquet files in object storage
- Import files from object storage into Lakehouse tables
- Export Lakehouse table data to object storage
- Manage Lakehouse's built-in file storage space
Volume Types
Lakehouse provides four Volume types, categorized by creation method:
Automatically Created Volumes
User Volume
User-level file storage space, automatically available for each user.
Table Volume
Table-level file storage space, automatically associated with each table.
Explicitly Created Volumes
External Volume (Mount External Storage)
Created via CREATE EXTERNAL VOLUME. It mounts external object storage (OSS/COS/S3).
Named Volume (Using Internal or External Storage)
Created via CREATE VOLUME. A Named Volume is a type of External Volume; the term emphasizes that the user creates it explicitly and gives it a custom name.
Type Comparison
| Type | Creation Method | Storage Location | Applicable Scenarios |
|---|---|---|---|
| User Volume | Auto-created | Internal | Upload/download local files, RAG knowledge base |
| Table Volume | Auto-created (one per table) | Internal | Table-associated ETL files, batch import/export |
| External Volume | CREATE EXTERNAL VOLUME | External (OSS/COS/S3) | Mount existing object storage |
| Named Volume | CREATE VOLUME | Internal or external | Cross-team shared resources |
Volume Usage Scenarios
| Scenario | Usage |
|---|---|
| Import data from OSS | COPY INTO table FROM VOLUME my_vol USING CSV |
| Export data to OSS | COPY INTO VOLUME my_vol FROM table USING PARQUET |
| Query files directly | SELECT * FROM VOLUME my_vol USING PARQUET FILES ('data.parquet') |
| Upload local files | PUT file:///local/data.csv TO USER VOLUME |
| RAG knowledge base | Upload documents to Volume, vectorize via unstructured ETL pipeline |
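As a minimal sketch, two of the statements from the table above are written out below as standalone statements. The volume name my_vol, the file name data.parquet, and the local path are the placeholders used in the table, not real objects.

```sql
-- Query a Parquet file in place, without loading it into a table.
-- my_vol and data.parquet are placeholder names taken from the table above.
SELECT *
FROM VOLUME my_vol
USING PARQUET
FILES ('data.parquet');

-- Upload a local file to the current user's User Volume.
-- The local path is a placeholder.
PUT file:///local/data.csv TO USER VOLUME;
```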
Volume and Pipe Relationship
Pipe is Lakehouse's continuous data ingestion pipeline. When a Pipe ingests data from object storage (OSS/S3/COS), it relies on an underlying Volume to access the files:
- Volume provides file access capability, mounting external object storage
- Pipe provides continuous streaming capability, monitoring new files in Volume and auto-importing them into tables
- Together, they automate the "Object Storage -> Lakehouse Table" data flow
Typical Usage:
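A minimal sketch of this pattern might look like the following. The COPY INTO statement uses the form shown in the usage scenarios table. The surrounding CREATE PIPE clauses (VIRTUAL_CLUSTER, INGEST_MODE, and their values) and all object names are assumptions made for illustration and should be checked against the Pipe reference before use.

```sql
-- Hypothetical names: orders_pipe, my_vc, orders, my_vol.
-- Assumed clauses: VIRTUAL_CLUSTER and INGEST_MODE are not confirmed by this page.
CREATE PIPE orders_pipe
  VIRTUAL_CLUSTER = 'my_vc'
  INGEST_MODE = 'LIST_PURGE'
AS
-- The Volume supplies the files; the Pipe reruns this load as new files arrive.
COPY INTO orders FROM VOLUME my_vol USING CSV;
```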
Volume and Table Relationship
Data flows in both directions between Volume and Table:
Data from Volume to Table (Import):
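A minimal sketch of the import direction, using the COPY INTO form from the usage scenarios table. The table name orders and the volume name my_vol are hypothetical.

```sql
-- Load CSV files from the volume into a Lakehouse table.
-- orders and my_vol are hypothetical names.
COPY INTO orders FROM VOLUME my_vol USING CSV;
```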
Data from Table to Volume (Export):
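A minimal sketch of the export direction, again using the COPY INTO form from the usage scenarios table. The table name orders and the volume name my_vol are hypothetical.

```sql
-- Write the table's data to the volume as Parquet files.
-- orders and my_vol are hypothetical names.
COPY INTO VOLUME my_vol FROM orders USING PARQUET;
```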
- Volume manages files; Table manages structured data
- Volume is both the entry point for data coming into the lake and the exit point for data leaving it
- Table is where data is processed; processing results can be exported to external storage via Volume
