MDEV-24943: Implement FILTER clause support for aggregate functions #4439

KhaledR57 · 2025-11-14T07:10:40Z

The Jira issue number for this PR is: MDEV-24943

Description

Aggregates lacked the SQL-standard FILTER clause, forcing CASE-based workarounds that reduced readability.

This update introduces the ability to specify a FILTER clause for aggregate functions, allowing for more granular control over which rows are included in the aggregation. Also, improves standards compliance and makes queries clearer and more readable.

<aggregate_function> ( <expression> ) [ FILTER ( WHERE <condition> ) ] [ OVER ( <window_spec> ) ]

The <condition> may contain any expression allowed in regular WHERE clauses, except subqueries, window functions, and outer references.

Example:

-- Using FILTER clause
SELECT AVG(value) FILTER (WHERE status = 'active') FROM sales;

-- Equivalent using CASE
SELECT AVG(CASE WHEN status = 'active' THEN value END) FROM sales;

How can this PR be tested?

Running the Test Suite

./mtr aggregates-filter

Validating CASE vs FILTER Equivalence

Use the compare.py script to verify CASE and FILTER produce identical results:

python3 compare.py aggregates-filter-case.result

Expected Result File Format:

The script expects result files with this structure:

#
# Test N: Description
#
# 1. CASE version
column1	column2	column3
value1	value2	value3
# 2. FILTER version
column1	column2	column3
value1	value2	value3

The script identifies test headers (# Test N: Description), CASE/FILTER markers (# 1. CASE version / # 2. FILTER version), and compares the data rows that follow each marker.

Basing the PR against the correct MariaDB version

This is a new feature or a refactoring, and the PR is based against the main branch.
This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.

PR quality check

I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

gkodinov · 2025-12-15T11:41:11Z

Thank for taking the effort to work on this. Indeed, the LIMIT clause is a part of (at least) SQL 2011 and, as such, is a very good feature to have.

However, the diff seems very incomplete. Can you please make sure that you:

Have a working diff.
Have a good functional description of the change (hopefully in a jira)

Please re-submit when you have the above.
This is not a trivial change. I would suggest reaching out to the mariaDB developers at zulip if you need to talk about it with somebody.

vuvova · 2025-12-15T18:56:05Z

I don't see how the diff is incomplete. It applies fine. It compiles fine. It passes tests on buildbot, not every builder, but it passes on many builders, this doesn't look like a non-working diff.

DaveGosselin-MariaDB · 2025-12-17T21:44:07Z

Hi @KhaledR57 , I'm experimenting with the changes and I see a difference between what should otherwise be equivalent statements. What was once written using CASE as a workaround should now work with FILTER as in the following examples below. First, on Postgres 18.1, we see that these queries are equivalent:

postgres=# SELECT COUNT(*) AS unfiltered, SUM( CASE WHEN generate_series < 5 THEN 1 ELSE 0 END ) AS filtered FROM generate_series(1,10);
 unfiltered | filtered
------------+----------
         10 |        4
(1 row)

postgres=# SELECT COUNT(*) AS unfiltered, COUNT(*) FILTER (WHERE generate_series < 5) AS filtered FROM generate_series(1,10);
 unfiltered | filtered
------------+----------
         10 |        4
(1 row)

However, on MariaDB, the FILTER analogue produces different results with your patch:

MariaDB [test]> SELECT COUNT(*) AS unfiltered, SUM( CASE WHEN seq < 5 THEN 1 ELSE 0 END ) AS filtered FROM seq_1_to_10;
+------------+----------+
| unfiltered | filtered |
+------------+----------+
|         10 |        4 |
+------------+----------+
1 row in set (0.005 sec)

MariaDB [test]> SELECT COUNT(*) AS unfiltered, COUNT(*) FILTER (WHERE seq < 5) AS filtered FROM seq_1_to_10;
+------------+----------+
| unfiltered | filtered |
+------------+----------+
|         10 |       10 |
+------------+----------+
1 row in set (0.001 sec)

Why is this the case?

DaveGosselin-MariaDB · 2025-12-18T20:44:26Z

If we store the sequence engine result to an InnoDB-backed table and run the FILTER query against that instead we get the correct result:

create table t1 (seq int);
insert into t1 select seq from seq_1_to_10;
SELECT COUNT(*) AS unfiltered, COUNT(*) FILTER (WHERE seq < 5) AS filtered FROM t1;
+------------+----------+
| unfiltered | filtered |
+------------+----------+
|         10 |        4 |
+------------+----------+
1 row in set (0.001 sec)

Aggregates lacked the SQL-standard FILTER clause, forcing CASE-based workarounds that reduced readability across (sum, avg, count, …). This update introduces the ability to specify a FILTER clause for aggregate functions, allowing for more granular control over which rows are included in the aggregation. Also, improves standards compliance and makes queries clearer and more readable. The FILTER(WHERE ...) condition may contain any expression allowed in regular WHERE clauses, except subqueries, window functions, and outer references.

KhaledR57 · 2025-12-19T05:39:52Z

Hi @DaveGosselin-MariaDB , Sorry for the delayed response!

The issue was with the sequence storage engine's group_by_handler optimization in storage/sequence/sequence.cc. It computes SUM() and COUNT() directly using formulas without scanning rows, but it wasn't checking for the FILTER clause, so the filter was completely bypassed.

I've added a has_filter() check so it falls back to normal aggregation when FILTER is present. Fixed now!

if (item->type() != Item::SUM_FUNC_ITEM ||
    (((Item_sum*) item)->sum_func() != Item_sum::SUM_FUNC &&
     ((Item_sum*) item)->sum_func() != Item_sum::COUNT_FUNC) ||
    ((Item_sum*) item)->has_filter())  // NEW CHECK
    return 0;  // Fall back to normal aggregation

Storing the sequence data to an InnoDB table gave the correct result because InnoDB goes through the standard opt_sum.cc optimization path, which I had already updated here to check has_filter() and skip the optimization when FILTER is present.

I'll add sequence-specific tests in the next commit after I finish the stored aggregates (almost done with those).

KhaledR57 force-pushed the MDEV-24943-add-filter-clause branch from 9f3c0ff to cb7d38d Compare November 14, 2025 07:38

svoj added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Nov 15, 2025

gkodinov closed this Dec 15, 2025

vuvova reopened this Dec 15, 2025

vuvova requested a review from DaveGosselin-MariaDB December 16, 2025 16:08

KhaledR57 force-pushed the MDEV-24943-add-filter-clause branch from cb7d38d to aaea60a Compare December 19, 2025 04:13

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

MDEV-24943: Implement FILTER clause support for aggregate functions #4439

MDEV-24943: Implement FILTER clause support for aggregate functions #4439

KhaledR57 commented Nov 14, 2025

Uh oh!

gkodinov commented Dec 15, 2025

Uh oh!

vuvova commented Dec 15, 2025

Uh oh!

DaveGosselin-MariaDB commented Dec 17, 2025 •

edited

Loading

Uh oh!

DaveGosselin-MariaDB commented Dec 18, 2025

Uh oh!

KhaledR57 commented Dec 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

5 participants

Uh oh!

MDEV-24943: Implement FILTER clause support for aggregate functions #4439

Are you sure you want to change the base?

MDEV-24943: Implement FILTER clause support for aggregate functions #4439

Conversation

KhaledR57 commented Nov 14, 2025

Description

How can this PR be tested?

Running the Test Suite

Validating CASE vs FILTER Equivalence

Basing the PR against the correct MariaDB version

PR quality check

Uh oh!

gkodinov commented Dec 15, 2025

Uh oh!

vuvova commented Dec 15, 2025

Uh oh!

DaveGosselin-MariaDB commented Dec 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DaveGosselin-MariaDB commented Dec 18, 2025

Uh oh!

KhaledR57 commented Dec 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

5 participants

DaveGosselin-MariaDB commented Dec 17, 2025 •

edited

Loading