groupArray Function
groupArray([DISTINCT] expr [, limit]) [FILTER (WHERE condition)]
Description
The groupArray function is an alias for COLLECT_LIST, supporting all features of COLLECT_LIST.
Parameters
expr: An expression of any type, used to collect elements from input data.
limit: Optional parameter, integer type, specifying the maximum number of elements to collect. If not specified, all elements are collected.
Return Result
- Returns an array where the element type matches the input parameter type.
- If the
DISTINCT keyword is set, returns a deduplicated array.
- If the
limit parameter is specified, the returned array contains at most limit elements.
- The function does not guarantee the order of elements in the result.
- If the input data contains
NULL values, those values are not included in the returned array.
Usage Examples
- Return an array containing non-duplicate elements:
SELECT groupArray(DISTINCT col) FROM VALUES (1), (2), (1), (NULL) AS tab(col);
+----------------------------+
| groupArray(DISTINCT col) |
+----------------------------+
| [1,2] |
+----------------------------+
- Return an array containing duplicate elements:
SELECT groupArray(col) FROM VALUES (1), (2), (1), (NULL) AS tab(col);
+-------------------+
| groupArray(col) |
+-------------------+
| [1,2,1] |
+-------------------+
- Collect characters from string data:
SELECT groupArray(col)
FROM VALUES ("apple"), ("banana"), ("cherry"), (NULL) AS tab(col);
+-----------------------------+
| groupArray(col) |
+-----------------------------+
| ["apple","banana","cherry"] |
+-----------------------------+
- Collect and return an array containing null values:
SELECT groupArray(col) FROM VALUES (true), (false), (null) AS tab(col);
+-------------------+
| groupArray(col) |
+-------------------+
| [true,false] |
+-------------------+
- Use the FILTER clause to conditionally collect elements:
SELECT groupArray(col) FILTER (WHERE col > 1) FROM VALUES (1), (2), (3), (1) AS tab(col);
+--------------------------------------------+
| groupArray(col) FILTER (WHERE (col > 1)) |
+--------------------------------------------+
| [2,3] |
+--------------------------------------------+
- Combine FILTER clause and DISTINCT to collect unique conditional elements:
SELECT groupArray(DISTINCT col) FILTER (WHERE col <= 3) FROM VALUES (1), (2), (3), (3), (4) AS tab(col);
+-----------------------------------------------------------+
| collect_list(DISTINCT col) FILTER (WHERE (col <= 3)) |
+-----------------------------------------------------------+
| [1,2,3] |
+-----------------------------------------------------------+
- Use the limit parameter to restrict the number of returned elements:
SELECT groupArray(col, 2) FROM VALUES (1), (2), (3), (4) AS tab(col);
+----------------------+
| groupArray(col, 2) |
+----------------------+
| [1,2] |
+----------------------+