Functions

User-defined functions (Functions) extend the SQL computation capabilities of Lakehouse, allowing you to create reusable functions tailored to your business needs and call them in SQL queries just like built-in functions.

Function Types

Lakehouse supports two types of user-defined functions:

SQL Function

Functions defined using SQL expressions and executed inside the Lakehouse engine.

  • Scalar functions: Process one or more columns from an input row and return a single result value per row.
  • Table functions: Accept one or more input parameters and return a result set with multiple rows and columns.

SQL Functions are well-suited for simple computation logic such as formatting, conditional evaluation, and mathematical operations.

External Function

Custom functions written in Python or Java and executed in a remote function compute service (Alibaba Cloud FC, Tencent Cloud SCF, AWS Lambda).

  • UDF (User-Defined Function): Returns a single value per row.
  • UDAF (User-Defined Aggregate Function): Aggregates multiple rows and returns a single value (Java only).
  • UDTF (User-Defined Table Function): Returns a result set with multiple rows and columns (Java only).

External Functions are suited for calling external services (such as LLMs or image recognition APIs) or implementing complex business logic.

Function Comparison

DimensionSQL FunctionExternal Function
Development languageSQLPython 3.10 / Java 8
Execution locationInside the Lakehouse engineRemote function compute service
Use casesSimple computation, formatting, conditional logicComplex logic, AI model calls, unstructured data processing
PerformanceHigh (local execution)Subject to network latency
Dependency managementNoneDependencies must be packaged and uploaded
Supported typesScalar function, table functionUDF, UDAF (Java), UDTF (Java)

Quick Start

Create a SQL Scalar Function

CREATE FUNCTION public.area(x DOUBLE, y DOUBLE) RETURNS DOUBLE RETURN x * y; -- Use the function SELECT public.area(3, 4); -- Output: 12.0

Create a SQL Table Function

CREATE FUNCTION public.generate_series(start_val INT, end_val INT) RETURNS TABLE(val INT) AS SELECT generate_series(start_val, end_val);

Create an External Function

CREATE EXTERNAL FUNCTION public.upper_udf AS 'com.example.GenericUdfUpper' USING FILE 'volume://my_vol/upper.jar' CONNECTION my_fc_connection WITH PROPERTIES ( 'remote.udf.api' = 'java8.hive2.v0', 'remote.udf.protocol' = 'http.arrow.v0' ); -- Use the function SELECT public.upper_udf('hello'); -- Output: HELLO

Function Permissions

PermissionDescription
CREATE FUNCTIONCreate a function under a Schema
ALTER FUNCTIONModify function properties
USE FUNCTIONCall the function in SQL
DROP FUNCTIONDelete the function
READ METADATAView function metadata