Analyzing Usage with the information_schema Job History View

Overview

This guide helps you use the sys.information_schema.job_history table to analyze Singdata Lakehouse system usage, understand resource consumption patterns, identify performance bottlenecks, and discover optimization opportunities. All analysis is based on SQL queries, requiring no additional tools.

Data Source Introduction

Primary Analysis Table

  • Table Name: sys.information_schema.job_history
  • Purpose: Records the execution history of all jobs in the system
  • Permissions: Requires query permissions on sys.information_schema

Key Field Descriptions

Field NameData TypeDescription
workspace_nameStringWorkspace name
virtual_clusterStringVirtual cluster name
job_idStringUnique job identifier
execution_timeFloatJob execution time (seconds)
start_timeTimestampJob start time
input_tablesStringInput table information (JSON format)
input_bytesStringNumber of bytes read
cache_hitStringNumber of cache hit bytes
statusStringJob execution status

Analysis Goals and Methods

Analysis Goals

  1. Resource Usage Analysis: Identify the busiest workspaces and virtual clusters
  2. Data Access Analysis: Find the most frequently accessed tables and data read patterns
  3. Performance Optimization Analysis: Evaluate cache hit rates and query efficiency
  4. Capacity Planning Analysis: Provide data support for resource scaling
  • Daily Monitoring: Last 7 days of data
  • Periodic Analysis: Last 30 days of data
  • Long-term Trends: Last 90 days of data

Common Quick Analysis SQL

The following SQL queries are suitable for daily monitoring, troubleshooting, and dashboard initialization. Use sys.information_schema.job_history for cross-workspace analysis; switch to information_schema.job_history when viewing only the current workspace.

Cluster Load Analysis

SELECT virtual_cluster, COUNT(*) AS job_count, AVG(execution_time) AS avg_seconds, ROUND(SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) AS success_rate FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY GROUP BY virtual_cluster ORDER BY job_count DESC;

Top 20 Slow Queries

SELECT job_id, virtual_cluster, execution_time, status, start_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY ORDER BY execution_time DESC LIMIT 20;

Failed Job Statistics

SELECT virtual_cluster, COUNT(*) AS failed_count, DATE(start_time) AS date FROM sys.information_schema.job_history WHERE status = 'FAILED' AND start_time >= CURRENT_DATE() - INTERVAL 7 DAY GROUP BY virtual_cluster, DATE(start_time) ORDER BY date DESC, failed_count DESC;

Cache Hit Rate Analysis

SELECT virtual_cluster, SUM(CAST(input_bytes AS BIGINT)) AS total_input_bytes, SUM(CAST(cache_hit AS BIGINT)) AS total_cache_hit, ROUND(SUM(CAST(cache_hit AS BIGINT)) * 100.0 / NULLIF(SUM(CAST(input_bytes AS BIGINT)), 0), 2) AS cache_hit_rate FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY AND input_bytes IS NOT NULL GROUP BY virtual_cluster;

Peak Period Identification

SELECT HOUR(start_time) AS hour_of_day, COUNT(*) AS job_count, AVG(execution_time) AS avg_execution_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY GROUP BY HOUR(start_time) ORDER BY hour_of_day;

Filter Job History by query_tag

SELECT job_id, execution_time, status FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY AND query_tag = 'etl_daily';

1. Workspace and Virtual Cluster Activity Analysis

Analysis Purpose

Identify the busiest workspaces and virtual clusters in the system, providing a basis for resource allocation and capacity planning.

1.1 Workspace Activity Analysis

Query Goal: Sort by total execution time to find the busiest workspaces

-- Workspace activity statistics (last 30 days) SELECT workspace_name, COUNT(*) as job_count, -- Job count SUM(execution_time) as total_execution_time, -- Total execution time AVG(execution_time) as avg_execution_time, -- Average execution time SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) as success_jobs, -- Successful jobs SUM(CASE WHEN status = 'FAILED' THEN 1 ELSE 0 END) as failed_jobs, -- Failed jobs ROUND(SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) as success_rate -- Success rate FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY workspace_name ORDER BY total_execution_time DESC;

1.2 Virtual Cluster Activity Analysis

Query Goal: Analyze the workload distribution across virtual clusters

-- Virtual Cluster activity statistics (last 30 days) SELECT virtual_cluster, COUNT(*) as job_count, SUM(execution_time) as total_execution_time, AVG(execution_time) as avg_execution_time, MIN(execution_time) as min_execution_time, MAX(execution_time) as max_execution_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY AND virtual_cluster IS NOT NULL GROUP BY virtual_cluster ORDER BY total_execution_time DESC;

Result Example:

Virtual Cluster NameJob CountTotal Execution Time (s)Avg Execution Time (s)Min Execution Time (s)Max Execution Time (s)
MET***_ETL_GP36,695996,551.8927.160.005745.531
DEFAULT338,797558,213.831.650.0063,825.289
CUS***_BILLING531,01445,493.620.090.003165.597
BI_ANALYSE49,1281,725.920.040.003104.061
VC_***_CAL80373.294.670.00760.184
MY_FIRST_VC140.650.050.0110.097
MY_SECOND_VC40.120.030.0150.072

1.3 Analyzing Workload by Time Period

Query Goal: Understand system load across different time periods

-- Job distribution statistics by hour SELECT HOUR(start_time) as hour_of_day, COUNT(*) as job_count, SUM(execution_time) as total_execution_time, AVG(execution_time) as avg_execution_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY GROUP BY HOUR(start_time) ORDER BY hour_of_day;

Result Example:

HourJob CountTotal Execution Time (s)Avg Execution Time (s)
024,18918,479.990.76
123,82311,243.610.47
217,72112,227.460.69
319,74628,425.321.44
424,53512,300.860.50
828,22418,066.540.64
920,44327,761.991.36
1525,00429,525.281.18
1820,34329,472.921.45
2317,46111,217.910.64

-- Job distribution statistics by day of week SELECT DAYOFWEEK(start_time) as day_of_week, CASE DAYOFWEEK(start_time) WHEN 1 THEN 'Sunday' WHEN 2 THEN 'Monday' WHEN 3 THEN 'Tuesday' WHEN 4 THEN 'Wednesday' WHEN 5 THEN 'Thursday' WHEN 6 THEN 'Friday' WHEN 7 THEN 'Saturday' END as day_name, COUNT(*) as job_count, SUM(execution_time) as total_execution_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY DAYOFWEEK(start_time) ORDER BY day_of_week;

Result Example:

Day NumberDay NameJob CountTotal Execution Time (s)
1Sunday86,383162,597.21
2Monday103,041172,924.33
3Tuesday158,431276,514.79
4Wednesday208,982322,615.64
5Thursday174,951278,444.13
6Friday143,648238,794.32
7Saturday80,380150,478.52

2. Table Usage Statistics Analysis

Analysis Purpose

Identify the most frequently accessed tables, analyze data read patterns, and provide guidance for table optimization and indexing strategies.

2.1 Most Frequently Accessed Tables

Query Goal: Find the tables with the highest access frequency

-- Parse input_tables JSON and calculate table access statistics SELECT GET_JSON_OBJECT(input_tables, '$.table[0].tableName') as table_name, CONCAT( GET_JSON_OBJECT(input_tables, '$.table[0].namespace[0]'), '.', GET_JSON_OBJECT(input_tables, '$.table[0].namespace[1]') ) as schema_name, COUNT(*) as access_count, SUM(CAST(input_bytes AS BIGINT)) as total_bytes_read, AVG(CAST(input_bytes AS BIGINT)) as avg_bytes_per_access, SUM(CAST(GET_JSON_OBJECT(input_tables, '$.table[0].record') AS BIGINT)) as total_records_read FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY AND input_tables IS NOT NULL AND input_tables != '' AND input_tables != '{"table":[]}' AND input_bytes > 0 GROUP BY GET_JSON_OBJECT(input_tables, '$.table[0].tableName'), CONCAT( GET_JSON_OBJECT(input_tables, '$.table[0].namespace[0]'), '.', GET_JSON_OBJECT(input_tables, '$.table[0].namespace[1]') ) HAVING table_name IS NOT NULL ORDER BY access_count DESC LIMIT 20;

Result Example:

Table NameSchema NameAccess CountTotal Bytes ReadAvg Bytes Per AccessTotal Records Read
bil***_summary_mvmet_bill.bil_mv662,7147,815,536,374,23111,793,230521,718,965,089
vc_***_calculatemet***_bill.public65,837164,257,938,0612,494,9186,127,770,647
met***_events_allmet***_bill.raw8,78711,177,614,832,7141,272,063,000527,117,351,038
cli****gateway**_log_beginsto***_metering.public8,779110,104,760,84212,541,830198,025,739
sku_categorymet***_bill.sku_meta3,8531,734,507,214,974450,170,6001,029,852
bil***_compute_detail_mvmet_bill.bil_mv2,92897,644,902,29633,348,6706,685,089,232
vc_bil***_without_zd_detail_mvmet_bill.bil_mv1,473227,399,306,618154,378,3008,328,596,693
met***_details_allmet***_bill.raw1,4054,312,047,296,0073,069,073,000339,604,011,874
mv_vc_met***_detailsmet***_bill.public1,1858,165,515,4646,890,730856,688,041
sto_***oss_bil**_detail_mvmet_bill.bil_mv7484,350,578,5515,816,281945,398,941

2.2 Top Tables by Data Read Volume

Query Goal: Find the tables with the largest data read volumes

-- Table statistics sorted by data read volume SELECT GET_JSON_OBJECT(input_tables, '$.table[0].tableName') as table_name, CONCAT( GET_JSON_OBJECT(input_tables, '$.table[0].namespace[0]'), '.', GET_JSON_OBJECT(input_tables, '$.table[0].namespace[1]') ) as schema_name, COUNT(*) as access_count, SUM(CAST(input_bytes AS BIGINT)) as total_bytes_read, SUM(CAST(input_bytes AS BIGINT)) / 1024 / 1024 / 1024 as total_gb_read, AVG(CAST(input_bytes AS BIGINT)) / 1024 / 1024 as avg_mb_per_access FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY AND input_tables IS NOT NULL AND input_tables != '' AND input_tables != '{"table":[]}' AND input_bytes > 0 GROUP BY 1, 2 HAVING table_name IS NOT NULL ORDER BY total_bytes_read DESC LIMIT 20;

Result Example:

Table NameSchema NameAccess CountTotal Bytes ReadTotal Read (GB)Avg Per Access (MB)
met***_events_allmet***_bill.raw8,78711,177,614,832,71410,409.971,213.13
bil***_summary_mvmet_bill.bil_mv662,7147,815,536,374,2317,278.7911.25
met***_details_allmet***_bill.raw1,4054,312,047,296,0074,015.912,926.90
sku_categorymet***_bill.sku_meta3,8531,734,507,214,9741,615.39429.32
dwd_cz_jobssys_meta_warehouse.inf_schema35387,223,640,942360.6310,551.01
vc_met***_detailsmet***_bill.public743371,186,266,727345.69476.43
vc_bil***_without_zd_detail_mvmet_bill.bil_mv1,473227,399,306,618211.78147.23
vc_***_calculatemet***_bill.public65,837164,257,938,061152.982.38
dim_stu***_instance_dmin_fmet_bill.stu_dw_tenant405130,022,636,178121.09306.17
ins***_account_mappingmet***_bill.public730118,682,911,967110.53155.05

2.3 Table Access Time Distribution Analysis

Query Goal: Analyze the time patterns of table access

-- Analyze access time distribution for major tables WITH top_tables AS ( SELECT GET_JSON_OBJECT(input_tables, '$.table[0].tableName') as table_name FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY AND input_tables IS NOT NULL AND input_tables != '{"table":[]}' GROUP BY 1 ORDER BY COUNT(*) DESC LIMIT 5 ) SELECT GET_JSON_OBJECT(h.input_tables, '$.table[0].tableName') as table_name, HOUR(h.start_time) as hour_of_day, COUNT(*) as access_count, SUM(CAST(h.input_bytes AS BIGINT)) / 1024 / 1024 as total_mb_read FROM sys.information_schema.job_history h JOIN top_tables t ON GET_JSON_OBJECT(h.input_tables, '$.table[0].tableName') = t.table_name WHERE h.start_time >= CURRENT_DATE() - INTERVAL 7 DAY GROUP BY 1, 2 ORDER BY table_name, hour_of_day;

Result Example:

Table NameHourAccess CountTotal Read (MB)
bil***_summary_mv021,826238,651.21
bil***_summary_mv122,557251,626.92
bil***_summary_mv215,100173,747.51
bil***_summary_mv318,436216,057.70
bil***_summary_mv422,117249,271.86
bil***_summary_mv824,900286,801.29
bil***_summary_mv917,682207,026.35
bil***_summary_mv1519,234225,847.45
bil***_summary_mv1816,891198,234.12
bil***_summary_mv2314,567167,432.89

3. Cache Hit Rate Analysis

Analysis Purpose

Evaluate system cache efficiency, identify cache optimization opportunities, and improve query performance.

3.1 Overall Cache Hit Rate

Query Goal: Calculate the system-wide cache hit rate

-- System-wide cache hit rate statistics SELECT CASE WHEN cache_hit = '0' OR cache_hit IS NULL THEN 'Cache Miss' ELSE 'Cache Hit' END as cache_status, COUNT(*) as job_count, ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage, SUM(execution_time) as total_execution_time, AVG(execution_time) as avg_execution_time FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY 1 ORDER BY job_count DESC;

Result Example:

Cache StatusJob CountPercentage (%)Total Execution Time (s)Avg Execution Time (s)
Cache Hit738,78477.29883,488.521.20
Cache Miss217,03222.71718,880.423.31

3.2 Cache Hit Rate by Workspace

Query Goal: Compare cache usage effectiveness across different workspaces

-- Cache hit rate analysis by workspace SELECT workspace_name, SUM(CASE WHEN cache_hit != '0' AND cache_hit IS NOT NULL THEN 1 ELSE 0 END) as cache_hit_jobs, SUM(CASE WHEN cache_hit = '0' OR cache_hit IS NULL THEN 1 ELSE 0 END) as cache_miss_jobs, COUNT(*) as total_jobs, ROUND(SUM(CASE WHEN cache_hit != '0' AND cache_hit IS NOT NULL THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) as cache_hit_rate, SUM(CAST(cache_hit AS BIGINT)) / 1024 / 1024 / 1024 as total_cache_gb FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY workspace_name ORDER BY cache_hit_rate DESC;

Result Example:

Workspace NameCache Hit JobsCache Miss JobsTotal JobsCache Hit Rate (%)Total Cache (GB)
met***_n_bill732,157136,263868,42084.3112,336.17
sto***_metering6,29029,08235,37217.7836.20
cos***_analyse33751,66452,0010.6598.62
qui***_ws018180.000.00
cli***_sample_data0110.000.00
dev_envirment0440.000.00

3.3 Cache Hit Rate Trend Analysis

Query Goal: Observe cache hit rate trends over time

-- Daily cache hit rate trend statistics SELECT DATE(start_time) as date, SUM(CASE WHEN cache_hit != '0' AND cache_hit IS NOT NULL THEN 1 ELSE 0 END) as cache_hit_jobs, COUNT(*) as total_jobs, ROUND(SUM(CASE WHEN cache_hit != '0' AND cache_hit IS NOT NULL THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) as cache_hit_rate FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY DATE(start_time) ORDER BY date;

Result Example:

DateCache Hit JobsTotal JobsCache Hit Rate (%)
2025-04-2320,14526,83475.08
2025-04-2422,56728,90178.09
2025-04-2524,12331,24577.21
2025-04-2625,89033,12778.15
2025-04-2723,45630,23477.57
2025-04-2821,78928,56776.27
2025-04-2926,23434,12376.88
2025-04-3024,56731,89077.04
2025-05-0122,89029,56777.42
2025-05-0225,12332,45677.40

4. Performance Issue Diagnostic Queries

4.1 Long-Running Jobs

Query Goal: Identify jobs with abnormally long execution times

-- Find long-running jobs SELECT job_id, workspace_name, virtual_cluster, job_type, execution_time, start_time, end_time, status, LEFT(job_text, 100) as job_text_preview FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY AND execution_time > 300 -- Jobs running longer than 5 minutes ORDER BY execution_time DESC LIMIT 50;

Result Example:

Job IDWorkspace NameVirtual ClusterJob TypeExecution Time (s)Start TimeStatusJob Preview
202505***96423met***_n_billMET***_ETL_GPSELECT3,825.292025-05-22 03:15:23SUCCEEDSELECT SUM(CAST(record_count AS BIGINT)) as total_records, SUM(CAST(data_size AS BIGINT))...
202505***84521met***_n_billMET***_ETL_GPSELECT2,456.782025-05-21 15:42:11SUCCEEDWITH billing_data AS (SELECT workspace_id, SUM(compute_time) FROM billing_summary...
202505***73941met***_n_billDEFAULTINSERT1,923.452025-05-20 09:33:47SUCCEEDINSERT INTO meter SELECT event_id, workspace_id, timestamp, event_type...
202505***62847sto***_meteringDEFAULTSELECT1,567.232025-05-19 14:28:36FAILEDSELECT storage_type, bucket_name, SUM(storage_size) FROM sto***_usage WHERE date...
202505***51238met***_n_billBI_ANALYSESELECT1,234.562025-05-18 11:17:29SUCCEEDSELECT DATE_TRUNC('hour', start_time) as hour, COUNT(*) as job_count FROM job_his...

4.2 Failed Job Analysis

Query Goal: Analyze patterns and causes of job failures

-- Failed job statistics and analysis SELECT workspace_name, virtual_cluster, job_type, COUNT(*) as failed_count, COUNT(*) * 100.0 / SUM(COUNT(*)) OVER() as failure_percentage, LEFT(error_message, 100) as common_error FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 7 DAY AND status = 'FAILED' GROUP BY workspace_name, virtual_cluster, job_type, LEFT(error_message, 100) ORDER BY failed_count DESC LIMIT 20;

Result Example:

Workspace NameVirtual ClusterJob TypeFailed CountFailure Percentage (%)Common Error
cos***_analyseBI_ANALYSESELECT1,24545.67CZLH-40000 Table 'cost_data.billing_temp' doesn't exist
met***_n_billDEFAULTINSERT56720.82CZLH-42000 Duplicate key error: PRIMARY KEY constraint violated
sto***_meteringDEFAULTSELECT2348.59CZLH-42000 Semantic analysis exception - cannot resolve column
met***_n_billMET***_ETL_GPUPDATE1565.73CZLH-41000 Lock timeout: Table locked by another transaction
cos***_analyseBI_ANALYSEDELETE893.27CZLH-43000 Syntax error: Invalid column reference 'unknown_column'

4.3 Top Resource-Consuming Jobs

Query Goal: Find the job types with the highest resource consumption

-- High resource consumption job analysis SELECT job_type, workspace_name, COUNT(*) as job_count, SUM(execution_time) as total_execution_time, AVG(execution_time) as avg_execution_time, SUM(CAST(input_bytes AS BIGINT)) / 1024 / 1024 / 1024 as total_input_gb, AVG(CAST(input_bytes AS BIGINT)) / 1024 / 1024 as avg_input_mb FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY AND input_bytes > 0 GROUP BY job_type, workspace_name ORDER BY total_execution_time DESC LIMIT 20;

Result Example:

Job TypeWorkspace NameJob CountTotal Execution Time (s)Avg Execution Time (s)Total Input (GB)Avg Input (MB)
SELECTmet***_n_bill345,6781,234,567.893.5715,234.5645.67
INSERTmet***_n_bill67,890456,789.126.738,901.23135.45
UPDATEmet***_n_bill12,345234,567.8919.013,456.78289.34
DELETEcos***_analyse8,901123,456.7813.871,234.56142.78
CREATEsto***_metering2,34556,789.1224.21567.89249.12

5. Practical Analysis Templates

5.1 Daily Monitoring Report

-- Daily system health report SELECT 'Overall Overview' as metric_category, 'Jobs Total' as metric_name, CAST(COUNT(*) AS STRING) as metric_value FROM sys.information_schema.job_history WHERE DATE(start_time) = CURRENT_DATE() - INTERVAL 1 DAY UNION ALL SELECT 'Overall Overview', 'Execution Time (Hours)', CAST(ROUND(SUM(execution_time) / 3600, 2) AS STRING) FROM sys.information_schema.job_history WHERE DATE(start_time) = CURRENT_DATE() - INTERVAL 1 DAY UNION ALL SELECT 'Overall Overview', 'Success Rate (%)', CAST(ROUND(SUM(CASE WHEN status = 'SUCCEED' THEN 1 ELSE 0 END) * 100.0 / COUNT(*), 2) AS STRING) FROM sys.information_schema.job_history WHERE DATE(start_time) = CURRENT_DATE() - INTERVAL 1 DAY ORDER BY metric_category, metric_name;

Result Example:

Metric CategoryMetric NameMetric Value
Overall OverviewExecution Time (Hours)427.35
Overall OverviewJobs Total34,567
Overall OverviewSuccess Rate (%)97.85

5.2 Resource Usage Assessment

-- Resource usage assessment query WITH resource_summary AS ( SELECT workspace_name, COUNT(*) as jobs, SUM(execution_time) as total_time, SUM(CAST(input_bytes AS BIGINT)) as total_bytes FROM sys.information_schema.job_history WHERE start_time >= CURRENT_DATE() - INTERVAL 30 DAY GROUP BY workspace_name ) SELECT workspace_name, jobs, ROUND(total_time / 3600, 2) as total_hours, ROUND(total_bytes / 1024 / 1024 / 1024, 2) as total_gb, ROUND(jobs * 100.0 / SUM(jobs) OVER(), 2) as job_percentage, ROUND(total_time * 100.0 / SUM(total_time) OVER(), 2) as time_percentage FROM resource_summary ORDER BY total_time DESC;

Result Example:

Workspace NameJob CountTotal HoursTotal Data (GB)Job Percentage (%)Time Percentage (%)
met***_n_bill868,420430.5424,567.8990.8596.74
sto***_metering35,37213.011,234.563.702.93
cos***_analyse52,0011.55567.235.440.35
qui***_ws180.000.010.000.00
dev_envirment40.000.020.000.00
cli***_sample_data10.000.000.000.00

Analysis Frequency Recommendations

  • Daily Monitoring: Execute the overall overview and failed job analysis
  • Weekly Analysis: Run the full activity and table usage analysis
  • Monthly Assessment: Conduct cache efficiency and resource planning analysis

It is recommended to save frequently used queries as views for easy reuse.

Optimization Action Guide

  • High Execution Time Jobs: Check for SQL optimization opportunities; try using Dynamic Table incremental computation pipelines to reduce computation volume and execution time
  • Low Cache Hit Rate: Adjust the auto-suspend time for analytics compute clusters; avoid shutting down during peak query periods to prevent cache loss
  • High-Frequency Access Tables: Consider partitioning and index optimization
  • Resource Imbalance: Redistribute compute cluster resource specifications within workspaces; for frequently used compute clusters, consider appropriate scaling if you want to reduce job execution time