MONOTONICALLY_INCREASING_ID
Description
Generates a monotonically increasing ID. This function generates a unique, monotonically increasing 64-bit integer ID for each row. In a distributed environment, the ID ranges generated for each partition do not overlap, ensuring global uniqueness.
Parameters
- None
Returns
- bigint type
- Returns a monotonically increasing unique ID
- IDs start from 0 and increase
- In a distributed environment, IDs from different partitions do not overlap
Examples
Notes
- The IDs generated by monotonically_increasing_id have the following characteristics:
- Uniqueness: Each ID is unique within the result set
- Monotonicity: ID values are monotonically increasing (but not necessarily contiguous)
- Determinism: Under the same data distribution, the generated IDs are the same
- In a distributed environment, ID generation works as follows:
- Each partition is assigned an ID range
- IDs within a partition increase contiguously
- ID ranges of different partitions do not overlap
- The high bits of the ID identify the partition, and the low bits identify the row number within the partition
- ID composition (64-bit):
- High 33 bits: partition ID
- Low 31 bits: row number within the partition
- Therefore, each partition can have at most 231 - 1 rows
- Differences from row_number():
- monotonically_increasing_id() is more efficient in a distributed environment
- row_number() requires global ordering, which has higher overhead
- monotonically_increasing_id() does not guarantee contiguity
- Important notes:
- IDs may not start from 0 (in a distributed environment)
- There may be gaps between IDs
- Do not rely on the specific values of IDs; only rely on their uniqueness and monotonicity
- The same query may generate different IDs across different executions (if the data distribution changes)
- Common use cases for this function:
- Generating unique identifiers for rows in a table
- Adding sequence numbers during data export
- Primary key generation for temporary tables
- Data partitioning and bucketing
- If you need contiguous row numbers, use the row_number() window function instead
