Lakehouse Column-Level Security (Dynamic Masking) User Guide

Overview

Column-level Security provides fine-grained data protection capabilities through Dynamic Data Masking, which dynamically modifies the display of sensitive data (such as partial hiding or character replacement) based on user identity or role. The system only stores the original data and executes the masking function at runtime during data access. This document introduces how to implement this functionality through the SQL interface.

Core Syntax

1 Creating Masking Policy Functions

Refer to the CREATE FUNCTION (SQL) syntax.

CREATE FUNCTION [schema_name.]function_name (col_name column_type) RETURNS output_type AS expression_with_conditional_logic;

Key Elements:

  • Must return the same data type as the original column.
  • Use security context functions:
    • current_user() to get the current user (note case sensitivity).
    • current_roles() to get an array of user roles.

2 Binding Policies to Columns

When Creating a Table:

CREATE TABLE table_name ( col1 STRING MASK schema_name.masking_function, ... );

Modifying an Existing Table:

ALTER TABLE table_name CHANGE COLUMN column_name SET MASK schema_name.masking_function;

Adding a Column with Masking:

ALTER TABLE table_name ADD COLUMN (column_name column_type MASK schema_name.masking_function);

3 Removing Policy Binding

ALTER TABLE table_name CHANGE COLUMN column_name UNSET MASK;


Use Case Examples

1 Basic Masking

Requirement: Display the first 6 characters of an ID card number, followed by 4 asterisks, and then the last 4 characters.

CREATE FUNCTION public.idcard_masking(idcard STRING) RETURNS STRING AS concat(substr(idcard, 1, 6), repeat('*', 4), substr(idcard, 10, 4)); ALTER TABLE data CHANGE COLUMN idcard SET MASK public.idcard_masking;

Query Effect:

Original Value: 130183199901011234 → Masked: 130183****9010

2 Dynamic Masking Based on User

Requirement: Only the UAT_TEST user should see masked data.

CREATE FUNCTION public.idcard_masking(idcard STRING) RETURNS STRING AS CASE WHEN current_user() = "UAT_TEST" THEN concat(substr(idcard, 1, 6), repeat('*', 4), substr(idcard, 10, 4)) ELSE idcard END; -- Ignoring case sensitivity of the username CREATE FUNCTION public.idcard_masking(idcard STRING) RETURNS STRING AS CASE WHEN lower(current_user()) = "uat_test" THEN concat(substr(idcard, 1, 6), repeat('*', 4), substr(idcard, 10, 4)) ELSE idcard END;

3 Dynamic Masking Based on Role

Requirement: Users with the user_admin role can view the full information.

CREATE FUNCTION public.idcard_masking_role(idcard STRING) RETURNS STRING AS CASE WHEN array_contains(current_roles(), "user_admin") THEN idcard ELSE concat(substr(idcard,1,6), '****', substr(idcard,11,4)) END;


Complete Operation Example

1 Initializing the Environment

CREATE SCHEMA IF NOT EXISTS security_demo; USE security_demo; -- General masking function CREATE FUNCTION security_demo.ssn_mask(ssn STRING) RETURNS STRING AS concat('***-**-', substr(ssn, 8, 4)); CREATE TABLE security_demo.user_data ( name STRING, ssn STRING MASK security_demo.ssn_mask, -- Binding directly when creating the table phone STRING ); INSERT INTO security_demo.user_data VALUES ('James', '123-45-6789', '123456789'); SELECT * FROM security_demo.user_data;

2 Creating Policy Functions

-- Exemption for privileged roles CREATE FUNCTION security_demo.admin_ssn_mask(ssn STRING) RETURNS STRING AS CASE WHEN array_contains(current_roles(), 'user_admin') THEN ssn ELSE concat('***-**-', substr(ssn,8,4)) END;

3 Modifying Masking Policies

-- Removing the previous policy ALTER TABLE security_demo.user_data CHANGE COLUMN ssn UNSET MASK; -- Adding a new policy ALTER TABLE security_demo.user_data CHANGE COLUMN ssn SET MASK security_demo.admin_ssn_mask;

4 Verifying the Effect

Query by a Regular User:

SELECT * FROM user_data; -- Output: John Doe ***-**-6789 138****1234

Query by a USER_ADMIN Role:

SELECT * FROM user_data; -- Output: John Doe 123-45-6789 138****1234


Management Notes

1 Permission Control

  • Only roles with ALTER TABLE permissions are allowed to modify masking policies.
  • Function creation requires CREATE FUNCTION permissions.

2 Performance Recommendations

  • Avoid using complex calculations in masking functions.
  • Use conditional logic cautiously for columns with high query frequency.

Limitations

  • Only one masking policy can be bound to a single column. If you want to define multiple masking rules, you can use conditional logic within a single function to apply different policies.