CASSANDRA-21139 - Feature/guardrail for misprepare statements by omniCoder77 · Pull Request #4596 · apache/cassandra

omniCoder77 · 2026-01-30T20:57:12Z

Validation is posted on the Jira ticket.

Feature suggestion:
Guardrail for mis-prepared statements

Description:

We have hundreds of application teams, and several dozen of them mis-prepare statements by using literals instead of bind markers.

For example:

// wrong 
session.prepare("select * from users where ID = 996");
session.prepare("select * from users where ID = 997");
session.prepare("select * from users where ID = 998");
session.prepare("select * from users where ID = 999");

// correct
session.prepare("select * from users where ID = ?");

The problem causes the prepared statement cache to overflow constantly, and Cassandra prints a "prepared statements discarded" WARN message in its log. At present, we take a whack-a-mole approach: we discuss the problem with each development team individually and hope they fix it and train the entire team on how to prepare statements correctly.
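To illustrate why literals hurt, here is a minimal sketch that models the prepared statement cache as a small LRU map. It is not Cassandra's cache implementation: the class `PreparedCacheSketch`, the capacity of 3, and the eviction counter are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a prepared statement cache: a tiny LRU map keyed by query text.
class PreparedCacheSketch
{
    private final int capacity;
    private int evictions = 0;
    private final LinkedHashMap<String, Integer> cache;

    PreparedCacheSketch(int capacity)
    {
        this.capacity = capacity;
        // accessOrder = true keeps entries ordered from least to most recently used
        this.cache = new LinkedHashMap<String, Integer>(16, 0.75f, true)
        {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest)
            {
                boolean evict = size() > PreparedCacheSketch.this.capacity;
                if (evict)
                    evictions++;
                return evict;
            }
        };
    }

    // "Prepares" a query: a hit reuses the cached entry, a miss adds a new one.
    void prepare(String query)
    {
        if (cache.get(query) == null)
            cache.put(query, query.hashCode());
    }

    int evictions() { return evictions; }

    int cachedStatements() { return cache.size(); }

    // Prepares the same lookup for ids 0..count-1, with literals or with a bind marker.
    static PreparedCacheSketch run(boolean useBindMarker, int count)
    {
        PreparedCacheSketch sketch = new PreparedCacheSketch(3);
        for (int id = 0; id < count; id++)
            sketch.prepare(useBindMarker ? "select * from users where ID = ?"
                                         : "select * from users where ID = " + id);
        return sketch;
    }

    public static void main(String[] args)
    {
        System.out.println("literals:    " + run(false, 100).evictions() + " evictions");
        System.out.println("bind marker: " + run(true, 100).evictions() + " evictions");
    }
}
```

In this toy model, every distinct literal produces a new cache key, so entries are evicted as soon as the capacity is exceeded, while the bind-marker form occupies a single entry regardless of how many ids are queried.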

Also, finding the root cause of the issue today requires both the knowledge and the access to look at the system.prepared_statements table.

Guardrails seem a good approach here: the guardrail could WARN or REJECT when a statement is prepared with a WHERE clause and no bind markers.

Note that this should not prevent users from creating prepared statements without a WHERE clause, or with one or more literal values as long as there is at least one bind marker. Thus, the following would remain valid:

session.prepare("select * from users");
session.prepare("select * from users where TYPE=5 and ID = ?");
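The rule described above (flag a statement only when it has a WHERE clause and no bind marker at all) can be sketched as a simple string heuristic. This is an assumption-laden illustration, not the PR's implementation, which works on the parsed statement; the class `MispreparedHeuristic` and its methods are hypothetical, and named bind markers such as `:id` are deliberately ignored.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative string heuristic only; a real check would inspect the parsed statement.
class MispreparedHeuristic
{
    private static final Pattern WHERE = Pattern.compile("\\bwhere\\b", Pattern.CASE_INSENSITIVE);

    // True when the query has a WHERE clause that contains no '?' bind marker.
    static boolean looksMisprepared(String query)
    {
        Matcher m = WHERE.matcher(query);
        if (!m.find())
            return false; // no WHERE clause: nothing to flag
        return !hasBindMarker(query.substring(m.end()));
    }

    // Scans for '?' outside single-quoted string literals.
    static boolean hasBindMarker(String clause)
    {
        boolean inString = false;
        for (int i = 0; i < clause.length(); i++)
        {
            char c = clause.charAt(i);
            if (c == '\'')
                inString = !inString;
            else if (c == '?' && !inString)
                return true;
        }
        return false;
    }

    public static void main(String[] args)
    {
        String[] queries = {
            "select * from users where ID = 996",
            "select * from users where ID = ?",
            "select * from users",
            "select * from users where TYPE=5 and ID = ?"
        };
        for (String q : queries)
            System.out.println(looksMisprepared(q) + "  " + q);
    }
}
```

Only the first query in `main` is flagged; the other three match the cases the description says must remain valid.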

Approach:
Introduced a boolean flag, use_misprepare_statements_enabled, configurable in cassandra.yaml, whose default value is true (for backward compatibility), and added methods to StorageServiceMBean to allow changing it at runtime.
Added test cases to validate the changes in the parseAndPrepare function.


src/java/org/apache/cassandra/config/Config.java

src/java/org/apache/cassandra/config/DatabaseDescriptor.java

src/java/org/apache/cassandra/config/GuardrailsOptions.java

src/java/org/apache/cassandra/cql3/QueryProcessor.java

smiklosovic · 2026-01-31T07:58:46Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java


+    private static void checkMispreparedGuardrail(CQLStatement statement, ClientState clientState)
+    {
+        if (clientState.isInternal)


Just move this check into the `if` above; this method will then have an easier job and be more focused, logic-wise. I have not checked the details yet, but it is interesting to see that if isInternal is false, clientState.isInternal can still be true. Shouldn't these two be aligned so that only one is necessary?

I removed the clientState.isInternal check, as checkMispreparedGuardrail is called only if it is false.

At line 218 the isInternal parameter is false, but clientState is ClientState.forInternalCalls(), hence clientState.isInternal will be true.
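The mismatch discussed in this thread can be sketched with a stand-in type. `ClientStateSketch` below is a hypothetical stub, not Cassandra's `ClientState`; it only mirrors the idea that a state created for internal calls reports itself as internal even when the caller passes `false` for the separate parameter.

```java
// Hypothetical stand-ins; these are not Cassandra classes.
class InternalFlagSketch
{
    static final class ClientStateSketch
    {
        final boolean isInternal;

        private ClientStateSketch(boolean isInternal) { this.isInternal = isInternal; }

        static ClientStateSketch forInternalCalls() { return new ClientStateSketch(true); }

        static ClientStateSketch forExternalCalls() { return new ClientStateSketch(false); }
    }

    // The guardrail is consulted only when neither source marks the call as internal.
    static boolean shouldCheckGuardrail(boolean isInternalParameter, ClientStateSketch state)
    {
        return !isInternalParameter && !state.isInternal;
    }

    public static void main(String[] args)
    {
        // The parameter says "not internal", yet the state was created for internal calls.
        System.out.println(shouldCheckGuardrail(false, ClientStateSketch.forInternalCalls()));
        System.out.println(shouldCheckGuardrail(false, ClientStateSketch.forExternalCalls()));
    }
}
```

Checking both flags in one place is what keeps the first case from reaching the guardrail.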

smiklosovic · 2026-01-31T07:59:27Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

+            }
+            else
+            {
+                String msg = "Performance Tip: This query contains literal values in the WHERE clause. " +


Can you turn this string into a private static final String in this class?

smiklosovic · 2026-01-31T08:03:32Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

+            return;
+        }
+
+        if (isMisprepared(statement))


Can you make this logic faster? If it is indeed prepared, then you are evaluating it unnecessarily, which would slow it down. What is the smallest test we can do before running through that "heavy" logic? It is whether our guardrail is enabled in the first place, right? So base the logic on the guardrail evaluation first, and only then check whether the statement is fully prepared; that way the heavy logic runs only when it must.

If you are not sure which approach is faster then run a stress test and profile it with async profiler.

Also, for example, superusers completely bypass guardrails so for them you would call isMisprepared completely unnecessarily.
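The ordering suggested here can be sketched as follows: decide cheaply whether the guardrail applies to the caller at all, and run the expensive analysis only if it does. The names `GuardOrderingSketch`, `Outcome`, and `guardrailApplies` are hypothetical and stand in for whatever the real code uses.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "cheap check first, heavy check only when needed".
class GuardOrderingSketch
{
    enum Outcome { SKIPPED, OK, FLAGGED }

    static Outcome check(boolean guardrailApplies, BooleanSupplier expensiveMispreparedCheck)
    {
        // Cheap exit first, e.g. for superusers or internal calls that bypass guardrails.
        if (!guardrailApplies)
            return Outcome.SKIPPED;
        // The heavy statement analysis runs only when it can influence the result.
        return expensiveMispreparedCheck.getAsBoolean() ? Outcome.FLAGGED : Outcome.OK;
    }

    public static void main(String[] args)
    {
        AtomicInteger heavyCalls = new AtomicInteger();
        BooleanSupplier heavy = () -> { heavyCalls.incrementAndGet(); return true; };

        System.out.println(check(false, heavy) + ", heavy calls: " + heavyCalls.get());
        System.out.println(check(true, heavy) + ", heavy calls: " + heavyCalls.get());
    }
}
```

Passing the heavy check as a supplier makes the laziness explicit: when the caller is exempt, the supplier is never invoked.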

smiklosovic · 2026-01-31T08:09:53Z

src/java/org/apache/cassandra/db/guardrails/Guardrails.java

+    /**
+     * Guardrail on mis-prepared statements.
+     */
+    public static final EnableFlag MispreparedStatementsEnabled =


Start the guardrail name with a lower-case letter, as the others do. Also, why is this (correctly, to me) called "mispreparedStatementsEnabled" while the config introduces use_misprepare_statements_enabled? Just align them, no?

src/java/org/apache/cassandra/db/guardrails/Guardrails.java

smiklosovic · 2026-01-31T08:14:00Z

src/java/org/apache/cassandra/db/guardrails/GuardrailsConfig.java

    boolean getVectorTypeEnabled();

+    /**
+     * @return Whether use misprepared statements is enabled. If not enabled, misprepared statements will fail


returns true if the usage of mis-prepared statements is enabled, false otherwise. Returns true by default.

You can also expand on what a mis-prepared statement is and when a statement is considered one (also what happens with batches etc.; the more descriptive, the better).

The wording might change if you rename it to "disallowMispreparedStatements", as the semantics would change.

test/unit/org/apache/cassandra/cql3/MispreparedStatementsTest.java

smiklosovic · 2026-01-31T08:33:25Z

conf/cassandra.yaml

+#
+# When true, allows the use of misprepared statements, only warns in the logs.
+# When false, misprepared statements will result in an error to the client. Default is true.
+# use_misprepare_statements_enabled: false


I think it would be better if this were named "disallow_misprepared_statements: true". If you look at the other guardrails, all of them follow the logic that they are set to "true".

Then you will have a much easier job later in the code, as you can just do

Guardrails.disallowMispreparedStatements.ensureEnabled(state) / enabled

But enabled will still be necessary, because if disallow_misprepared_statements is false, we will have to warn against the use of mis-prepared statements, as the ticket suggests.
I can change the name to disallow_misprepared_statements in the next PR, if needed.

smiklosovic · 2026-02-02T00:25:28Z

conf/cassandra.yaml

 # maximum_replication_factor_warn_threshold: -1
 # maximum_replication_factor_fail_threshold: -1
+#
+# When true, allows the use of misprepared statements, only warns in the logs.


allows the usage, "use" is not a noun.

conf/cassandra.yaml

smiklosovic · 2026-02-02T00:28:13Z

src/java/org/apache/cassandra/config/Config.java

    public volatile boolean user_timestamps_enabled = true;
    public volatile boolean alter_table_enabled = true;
    public volatile boolean group_by_enabled = true;
+    public volatile boolean misprepared_statements_enabled = false;


so is it true or false? I am confused now. Should be true here, no? By default we want to just warn.

src/java/org/apache/cassandra/config/DatabaseDescriptor.java

smiklosovic · 2026-02-02T00:32:57Z

src/java/org/apache/cassandra/config/GuardrailsOptions.java

+
+    public void setMispreparedStatementsEnabled(boolean enabled)
+    {
+        updatePropertyWithLogging("use_misprepare_statements_enabled",


this should be called "misprepared_statements_enabled"

smiklosovic · 2026-02-02T00:33:48Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

    public static final CQLMetrics metrics = new CQLMetrics();

+
+    private static final String msg = "Performance Tip: This query contains literal values in the WHERE clause. " +


can you be more descriptive? "msg" is pretty non-telling. "MISPREPARED_STATEMENT_MESSAGE" is way better.

You're right, msg was way too generic. I’ve updated the variable name as per your suggestion.

Quick question on the naming: since this is technically a warning rather than a hard error, would you prefer MISPREPARED_STATEMENT_WARN_MESSAGE, or should I stick with your suggestion of MISPREPARED_STATEMENT_MESSAGE to keep it concise?

MISPREPARED_STATEMENT_WARN_MESSAGE

smiklosovic · 2026-02-02T00:37:35Z

src/java/org/apache/cassandra/service/StorageService.java

+    public void setMispreparedStatementsEnabled(boolean enabled)
+    {
+        DatabaseDescriptor.setMispreparedStatementsEnabled(enabled);
+        logger.info("Updated misprepared_statements_enabled to {}", enabled);


Why the logging?

Should this logging be kept for auditing, given that we removed logging from DatabaseDescriptor?

smiklosovic · 2026-02-02T01:03:34Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

        return res;
    }

+    private static void checkMispreparedGuardrail(CQLStatement statement, ClientState clientState)


I am not completely sure about this logic, nothing about what you did specifically ... I just need to check this more closely and I don't have time for that right now. This is very nuanced stuff.

smiklosovic · 2026-02-02T01:07:26Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

+        if (user != null && user.isSuper())
+            return;
+
+        if (Guardrails.mispreparedStatementsEnabled.enabled(clientState))


Isn't the logic here actually the other way around? If it is enabled, then we just warn; if it is disabled (false in yaml), then we fail. "use_misprepared_statements" set to "false" means that we CANNOT use them, so we need to fail in that situation. If it is true, then we just warn.

You also do not need to use "NoSpamLogger.log". Check what Guardrail.warn is doing (EnableFlag is a subclass of Guardrail, so you can call .warn() here too). That already warns in a non-spamming manner and does other things (diagnostic events etc.).
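The semantics described in this comment (flag true: warn only; flag false: fail) can be sketched in isolation. Everything below is hypothetical: `WarnOrFailSketch`, `ViolationException`, and the message text are illustrative names that do not come from the PR or from Cassandra's guardrail classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the warn-versus-fail semantics; not Cassandra's Guardrail API.
class WarnOrFailSketch
{
    static final class ViolationException extends RuntimeException
    {
        ViolationException(String message) { super(message); }
    }

    // Illustrative text only; the actual message in the PR is not reproduced here.
    static final String HYPOTHETICAL_MESSAGE = "statement prepared with literals and no bind markers";

    // Called only once a statement has already been found to be mis-prepared.
    static void onMisprepared(boolean mispreparedStatementsEnabled, List<String> clientWarnings)
    {
        if (mispreparedStatementsEnabled)
            clientWarnings.add(HYPOTHETICAL_MESSAGE);           // allowed: warn only
        else
            throw new ViolationException(HYPOTHETICAL_MESSAGE); // disallowed: reject
    }

    public static void main(String[] args)
    {
        List<String> warnings = new ArrayList<>();
        onMisprepared(true, warnings);
        System.out.println("warnings: " + warnings);
        try
        {
            onMisprepared(false, warnings);
        }
        catch (ViolationException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```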

smiklosovic

more comments

smiklosovic · 2026-02-02T01:13:12Z

src/java/org/apache/cassandra/cql3/QueryProcessor.java

        CQLStatement statement = raw.prepare(clientState);
        statement.validate(clientState);

+        if (!isInternal && !clientState.isInternal)


I am not sure we should add this logic, conceptually, to this class. If you look closely, you will see that there is a "statement.validate()" call just above. If you check what, for example, SelectStatement.validate() is doing, it contains guardrails too.

We would just enrich the validate methods of each statement kind you want to check here, instead of having arbitrary logic in this class directly. I think that is a much cleaner approach.

It also means that if we do not put it into the "validate" method of each respective statement, calling "validate" itself would pass even though the statement is in fact mis-prepared. I do not think that is a good way to do it. By putting that logic into the validate method, we centralize it all, which is better design, and the logic does not "leak" outside of it.
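The design suggested in this comment can be sketched with stand-in types: each statement kind carries its own check inside `validate()`, so the preparing code only calls `validate()` and needs no extra guardrail logic. `StatementSketch`, `SelectSketch`, and `PrepareSketch` are hypothetical and far simpler than the real statement classes; the sketch also always rejects, leaving the warn-or-fail decision aside.

```java
// Hypothetical stand-ins for statement classes; not Cassandra types.
interface StatementSketch
{
    void validate();
}

final class SelectSketch implements StatementSketch
{
    private final boolean hasWhereClause;
    private final int bindMarkerCount;

    SelectSketch(boolean hasWhereClause, int bindMarkerCount)
    {
        this.hasWhereClause = hasWhereClause;
        this.bindMarkerCount = bindMarkerCount;
    }

    @Override
    public void validate()
    {
        // The mis-prepared check lives with the statement, next to its other validation.
        if (hasWhereClause && bindMarkerCount == 0)
            throw new IllegalStateException("WHERE clause uses only literals");
    }
}

final class PrepareSketch
{
    // The caller stays generic: it validates and returns, with no guardrail logic of its own.
    static StatementSketch prepare(StatementSketch statement)
    {
        statement.validate();
        return statement;
    }

    public static void main(String[] args)
    {
        prepare(new SelectSketch(true, 1));
        prepare(new SelectSketch(false, 0));
        try
        {
            prepare(new SelectSketch(true, 0));
        }
        catch (IllegalStateException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this shape, any code path that calls `validate()` gets the check, which is the "no leak" property the comment asks for.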


omniCoder77 · 2026-02-02T09:23:44Z

@smiklosovic
I have addressed the comments above and:

Added new test cases verifying the warning behaviour (warn only when misprepared_statements_enabled is true) and the bypass of the misprepare guardrail for system keyspaces.

Centralized the onMisprepare logic in Guardrails.java.

Refined the test cases by extracting repeated statements into a function to improve readability.

omniCoder77 · 2026-02-04T11:07:18Z

In my test cases I have verified the following:

- System keyspaces, internal calls, and superusers are exempt from this guardrail check.
- Literals in the SELECT clause are allowed as long as the WHERE clause is prepared.
- A warning is issued if and only if:
  - misprepared_statement_use is set to true and the statement is misprepared,
  - the call is not internal, and
  - the user is not a superuser.
- When the guardrail is violated, no warning is issued; a GuardrailViolationException is thrown instead.
- Literals are blocked in WHERE clauses, IN clauses, LWT IF conditions, and JSON inserts.
- Blocked queries do not enter the statement cache.

omniCoder77 added 8 commits January 30, 2026 09:16

added use_misprepare_statements_enabled flag (not implemented)

d5da19a

implemented use_misprepared_statements flag

33c3219

added warning to client

ae61b88

cleanup

c83bf22

additional cleanup

ba80e42

fixed codestyle files

4caf9ad

optimized test case

e282987

Expand guardrail coverage for mis-prepared CQL statements across full…

5825094

… CRUD, IN, and LWT operations

smiklosovic reviewed Jan 31, 2026

optimized queryprocessor and clubbed getter and setters

50ec499

omniCoder77 requested a review from smiklosovic January 31, 2026 16:34

smiklosovic reviewed Feb 2, 2026

smiklosovic self-requested a review February 2, 2026 01:08

smiklosovic requested changes Feb 2, 2026

omniCoder77 added 2 commits February 2, 2026 12:44

added test case for literals in projections and bypassing guardrail f…

564a99d

…or system keyspaces

optimized query processing and new test cases

e2071d1

omniCoder77 requested a review from smiklosovic February 2, 2026 09:19

omniCoder77 added 4 commits February 2, 2026 14:54

remove redundant variable declaration

0304d23

fixed GuardrailsConfigCommandsTest

029259d

remove explicit client warn message

739ff86

fixed warning logic to not warn interal and super user

7270f91

added test to config initial value of misprepared_statements_enable is

44bba46

true

omniCoder77 force-pushed the feature/guardrail-for-misprepare-statements branch from 311028f to 44bba46 Compare February 4, 2026 11:10

