InterSystems READY 2025: Making SQL Aggregates up to 35% Faster

SQLAggregates

Elie Eshoa, InterSystems Systems Developer, outlined a project to optimize the aggregation of group by queries by improving parallel processing for average, minimum, and maximum operations. The old sequential parent process bottleneck is addressed by a new approach, where worker processes directly write results in parallel to a shared pool, yielding a 38% performance increase for high-cardinality scenarios. Eshoa covered technical concurrency considerations and noted the simplified query plan and improved ASL readability; this enhancement will be available in version 2025.2.


 

Aggregation Optimization Focus: Eshoa explained that the project focuses on optimizing aggregation over group-by queries, specifically enhancing parallel optimization and execution for operations such as average, minimum, and maximum. The current system already operates this process mostly in parallel, and the goal is to improve its performance.

Old Implementation Overview: Eshoa described the old implementation where a parent process assigns rows to worker processes, which calculate intermediary results and store them in temp files. Workers then pipe these results back to the parent process sequentially, which then adds them to a final temp file. This sequential processing by the parent process creates a bottleneck, especially when there is a high number of distinct groups (high cardinality).

New Implementation Approach: Eshoa detailed the new optimization where worker processes directly write their results in parallel to a final shared pool, eliminating the sequential processing by the parent process. This parallel writing aims to reduce the accumulated wait time experienced in the old implementation, especially in scenarios involving multiple independent groups.

Performance Improvement: Eshoa presented results from a sample table with 20 million rows and varying cardinalities, showing a 38% increase in performance for higher cardinalities with the new implementation. While lower cardinality cases showed minimal change, the parallel aggregation significantly reduced runtime in high cardinality scenarios.

Technical Considerations and Benefits: Eshoa emphasized the importance of addressing concurrency issues, such as locking, when multiple processes simultaneously write to the same data structure. The new implementation simplifies the query plan and improves readability in the abstract syntax language (ASL) because steps involving sending results back to the parent are no longer necessary. The optimization is expected to be available in version 2025.2.

J2 Interactive

J2 Interactive is an award-winning software development and IT consulting firm that specializes in customized solutions for healthcare and life sciences.