feat(cassandra): add CQL parser with full Cassandra 4.1 grammar support #320
Merged
Hand-written recursive descent parser for Apache Cassandra CQL covering:
- DML: SELECT, INSERT, UPDATE, DELETE, BATCH
- DDL: keyspace/table/index/type/MV/function/aggregate/trigger CRUD
- Auth: GRANT, REVOKE, LIST, role/user management
- Expressions: literals, collections, function calls, operators
- Splitter: semicolon-aware splitting with string/comment/code block handling
- Position tracking: byte offsets and line/column for all statements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
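The splitter item above can be illustrated with a minimal sketch. The names `splitStatements`, `skipQuoted`, and `skipUntil` are hypothetical and are not taken from this PR; the sketch assumes single-quoted strings and double-quoted identifiers with doubled-quote escapes, `$$ ... $$` code blocks, `--` line comments, and `/* */` block comments. It returns only trimmed statement text and does not track the byte offsets or line/column positions the PR describes.

```go
package main

import (
	"fmt"
	"strings"
)

// splitStatements (hypothetical name) splits src on semicolons that are not
// inside a quoted string, quoted identifier, $$ code block, or comment.
func splitStatements(src string) []string {
	var out []string
	start, i := 0, 0
	for i < len(src) {
		switch {
		case src[i] == '\'' || src[i] == '"':
			i = skipQuoted(src, i)
		case strings.HasPrefix(src[i:], "$$"):
			i = skipUntil(src, i+2, "$$")
		case strings.HasPrefix(src[i:], "--"):
			i = skipUntil(src, i+2, "\n")
		case strings.HasPrefix(src[i:], "/*"):
			i = skipUntil(src, i+2, "*/")
		case src[i] == ';':
			if s := strings.TrimSpace(src[start:i]); s != "" {
				out = append(out, s)
			}
			i++
			start = i
		default:
			i++
		}
	}
	// Keep a trailing statement that has no terminating semicolon.
	if s := strings.TrimSpace(src[start:]); s != "" {
		out = append(out, s)
	}
	return out
}

// skipQuoted returns the index just past the quoted run that starts at src[i].
// A doubled quote character inside the run is treated as an escaped quote.
func skipQuoted(src string, i int) int {
	q := src[i]
	i++
	for i < len(src) {
		if src[i] == q {
			if i+1 < len(src) && src[i+1] == q {
				i += 2
				continue
			}
			return i + 1
		}
		i++
	}
	return len(src) // unterminated: consume the rest
}

// skipUntil returns the index just past the next occurrence of end at or
// after from, or len(src) if end never appears.
func skipUntil(src string, from int, end string) int {
	if k := strings.Index(src[from:], end); k >= 0 {
		return from + k + len(end)
	}
	return len(src)
}

func main() {
	src := "INSERT INTO t (a) VALUES ('x;y'); SELECT * FROM t; /* a; b */ SELECT 1"
	for _, s := range splitStatements(src) {
		fmt.Println(s)
	}
}
```

In this sketch a comment that precedes a statement stays attached to that statement's text, which is one possible design choice; a real splitter might strip or report comments separately.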
- Strict parseIfNotExists(): now returns (bool, error) and requires the EXISTS token; rejects "IF NOT GARBAGE"
- MV SELECT * support: CREATE MATERIALIZED VIEW now accepts SELECT *
- MV WHERE IS NOT NULL: uses strict expect chain instead of blind advance
- Lexer exponent: backtrack on malformed "1e" / "1e+" instead of emitting invalid float tokens
- Type generic: only VECTOR<T, N> allows integer dimension parameter
- Tests: error/truncation cases for every statement family, no-panic coverage for malformed/truncated input, Loc walker validation, MV-specific tests (SELECT *, multi-column, IF NOT EXISTS + options)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
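The lexer exponent item above can be sketched as follows. This is an illustration only: `scanNumber` and `isDigit` are hypothetical names, and the PR's actual lexer may be structured differently. The idea is that the exponent is committed only once a digit is seen after `e` and an optional sign; otherwise the scanner backtracks so that `1e` and `1e+` yield the integer `1` and leave the rest for the next token.

```go
package main

import "fmt"

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// scanNumber (hypothetical) returns the end offset of the numeric literal
// that begins at src[start], and whether that literal is a float.
func scanNumber(src string, start int) (end int, isFloat bool) {
	i := start
	for i < len(src) && isDigit(src[i]) {
		i++
	}
	// Optional fraction: a dot that is followed by at least one digit.
	if i+1 < len(src) && src[i] == '.' && isDigit(src[i+1]) {
		isFloat = true
		i++
		for i < len(src) && isDigit(src[i]) {
			i++
		}
	}
	// Optional exponent: look ahead with j and commit only if digits follow.
	if i < len(src) && (src[i] == 'e' || src[i] == 'E') {
		j := i + 1
		if j < len(src) && (src[j] == '+' || src[j] == '-') {
			j++
		}
		if j < len(src) && isDigit(src[j]) {
			for j < len(src) && isDigit(src[j]) {
				j++
			}
			return j, true
		}
		// Malformed exponent: fall through without consuming 'e',
		// which is the backtrack step.
	}
	return i, isFloat
}

func main() {
	for _, s := range []string{"1e", "1e+", "1e5", "1.5e-3", "12"} {
		end, isFloat := scanNumber(s, 0)
		fmt.Printf("%-7q literal=%q float=%v\n", s, s[:end], isFloat)
	}
}
```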
VECTOR type now requires exactly 1 element type + 1 integer dimension:
- vector<3> rejected (missing element type)
- vector<float, 3, 4> rejected (extra params after dimension)
- map<text, 3> rejected (integer param only valid for vector)

Added negative tests for all three cases plus positive test for valid vector<float, 3> in CREATE TABLE.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
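A minimal sketch of the VECTOR rule described above, assuming the generic parameters have already been parsed into a flat list. The `typeParam` struct and `checkGeneric` function are hypothetical stand-ins, not the PR's AST types.

```go
package main

import (
	"fmt"
	"strings"
)

// typeParam (hypothetical) is one parameter inside <...>: either a type name
// such as "float" or an integer such as 3.
type typeParam struct {
	TypeName string
	Int      int
	IsInt    bool
}

// checkGeneric enforces the rule from the commit message: vector takes
// exactly one element type followed by one integer dimension, and no other
// generic type accepts an integer parameter.
func checkGeneric(name string, params []typeParam) error {
	if strings.EqualFold(name, "vector") {
		if len(params) != 2 || params[0].IsInt || !params[1].IsInt {
			return fmt.Errorf("vector requires <element_type, dimension>")
		}
		return nil
	}
	for _, p := range params {
		if p.IsInt {
			return fmt.Errorf("integer parameter is only valid for vector, not %s", name)
		}
	}
	return nil
}

func main() {
	float := typeParam{TypeName: "float"}
	text := typeParam{TypeName: "text"}
	three := typeParam{Int: 3, IsInt: true}
	four := typeParam{Int: 4, IsInt: true}

	fmt.Println(checkGeneric("vector", []typeParam{float, three}))       // accepted
	fmt.Println(checkGeneric("vector", []typeParam{three}))              // rejected
	fmt.Println(checkGeneric("vector", []typeParam{float, three, four})) // rejected
	fmt.Println(checkGeneric("map", []typeParam{text, three}))           // rejected
}
```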
Add AST walker infrastructure and comprehensive Loc validation:
- ast/walk.go: Visitor interface, Walk(), Inspect() functions
- ast/walk_children.go: hand-written walkChildren covering all ~50 node types with correct child traversal
- loc_test.go: reflection-based CheckLocations + walkNodeLocs that recursively validates every Loc field in the AST tree (detects End <= Start and mixed sentinel violations)
- TestCheckLocations: 37 test cases covering all statement families (DML, DDL, Auth, multi-statement), zero Loc violations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
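The walker pieces named above (Visitor, Walk, Inspect, walkChildren) can be sketched in the style of Go's standard `go/ast` package. The three node types here are hypothetical placeholders for illustration, and the exact signatures in ast/walk.go may differ from this sketch.

```go
package main

import "fmt"

// Node is a stand-in for the parser's AST node interface.
type Node interface{ node() }

// Three illustrative node types; the PR reports roughly 50.
type Ident struct{ Name string }
type BinaryExpr struct {
	Op          string
	Left, Right Node
}
type SelectStmt struct {
	Columns []Node
	Where   Node
}

func (*Ident) node()      {}
func (*BinaryExpr) node() {}
func (*SelectStmt) node() {}

// Visitor follows the go/ast convention: Visit returns the visitor to use
// for n's children, or nil to skip them.
type Visitor interface {
	Visit(n Node) Visitor
}

// walkChildren is the hand-written part: one case per node type that has
// children, listing them in source order.
func walkChildren(v Visitor, n Node) {
	switch n := n.(type) {
	case *BinaryExpr:
		Walk(v, n.Left)
		Walk(v, n.Right)
	case *SelectStmt:
		for _, c := range n.Columns {
			Walk(v, c)
		}
		Walk(v, n.Where)
	}
}

// Walk traverses the tree depth-first, calling v.Visit(nil) after the
// children of each visited node.
func Walk(v Visitor, n Node) {
	if n == nil {
		return
	}
	if v = v.Visit(n); v == nil {
		return
	}
	walkChildren(v, n)
	v.Visit(nil)
}

type inspector func(Node) bool

func (f inspector) Visit(n Node) Visitor {
	if f(n) {
		return f
	}
	return nil
}

// Inspect adapts a plain function to the Visitor interface. As in go/ast,
// f is also called with nil after a node's children have been visited.
func Inspect(n Node, f func(Node) bool) { Walk(inspector(f), n) }

// identNames collects identifier names in traversal order.
func identNames(root Node) []string {
	var names []string
	Inspect(root, func(n Node) bool {
		if id, ok := n.(*Ident); ok {
			names = append(names, id.Name)
		}
		return true
	})
	return names
}

func main() {
	stmt := &SelectStmt{
		Columns: []Node{&Ident{Name: "a"}, &Ident{Name: "b"}},
		Where:   &BinaryExpr{Op: "=", Left: &Ident{Name: "c"}, Right: &Ident{Name: "d"}},
	}
	fmt.Println(identNames(stmt))
}
```

A type switch like `walkChildren` silently skips any node type it does not list, which is presumably why the PR pairs it with a reflection-based coverage test.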
…anced Loc checks
- Fix ALTER/DROP MV double-advance of VIEW token (dispatch already consumed it)
- Fix CREATE TRIGGER grammar: name ON table USING class (was missing ON table)
- Fix UUID lexing for digit-prefixed UUIDs like 550e8400-e29b-... (was parsed as float)
- Fix NULL/TRUE/FALSE in expression context (were swallowed by isIdentLike)
- Add ast/walk_test.go: direct unit tests for Walk/Inspect
- Add walk_coverage_test.go: reflection-vs-Walk coverage for all AST node types
- Enhance CheckLocations: bounds + parent containment + statement containment
- Add DotAccess and VectorLit test SQL for full walker coverage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Enhance ParseError with Line, Column, Near fields
- Error format: "line N column M: message at or near 'token'"
- Store line index in Lexer for offset-to-position conversion
- Lexer errors (unterminated string/identifier/code block) include line/column
- Parser errorf() prefers lexer errors when available (e.g. unterminated string)
- Add TestErrorLineColumn: verifies line/column accuracy for single-line, multi-line, deep-in-statement, and unterminated string errors
- Add TestErrorAtOrNear: verifies "at or near" context in error messages
- Add TestTruncationFuzz: truncates 12 valid SQL statements at every byte position (1500+ truncation points), verifies no panics
- Add TestBinaryInputNoPanic: null bytes, 0xFF sequences, embedded nulls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
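The error format and line index described above might look roughly like the following sketch. Only the field names Line, Column, and Near and the message format come from the commit message; the `Message` field, `buildLineIndex`, and `position` are hypothetical, and 1-based byte columns are an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// ParseError carries a position and the offending token text.
type ParseError struct {
	Message string // hypothetical field name
	Line    int    // 1-based (assumed)
	Column  int    // 1-based byte column (assumed)
	Near    string
}

func (e *ParseError) Error() string {
	s := fmt.Sprintf("line %d column %d: %s", e.Line, e.Column, e.Message)
	if e.Near != "" {
		s += fmt.Sprintf(" at or near '%s'", e.Near)
	}
	return s
}

// buildLineIndex records the byte offset at which each line starts.
func buildLineIndex(src string) []int {
	starts := []int{0}
	for i := 0; i < len(src); i++ {
		if src[i] == '\n' {
			starts = append(starts, i+1)
		}
	}
	return starts
}

// position converts a non-negative byte offset to a line and column by
// binary search over the line starts.
func position(starts []int, off int) (line, col int) {
	// i is the number of line starts <= off, which is the 1-based line.
	i := sort.Search(len(starts), func(k int) bool { return starts[k] > off })
	return i, off - starts[i-1] + 1
}

func main() {
	src := "SELECT *\nFROM t\nWHERE"
	idx := buildLineIndex(src)
	line, col := position(idx, 16) // byte offset of "WHERE"
	err := &ParseError{Message: "unexpected end of input", Line: line, Column: col, Near: "WHERE"}
	fmt.Println(err)
}
```

Building the index once per input keeps each offset lookup at O(log n) in the number of lines, which matters when every AST node and error needs a position.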
- Add TestCompatibilityHarness: parses all 45 CQL example files from the ANTLR reference grammar test corpus (105 statements total)
- 44/45 files pass; 1 expected failure (standalone APPLY BATCH)
- Fix UUID lexing for digit-prefixed UUIDs with hex-letter continuation (e.g. 6ab09bec-e68e-...): extend digit scan to check for 8-char hex group forming UUID pattern
- Validate Loc correctness on all parsed statements via CheckLocations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address Phase 5 review findings:
- Vendor 45 CQL example files into cassandra/testdata/cql/examples/
- Use runtime.Caller to resolve testdata path (hermetic, no absolute paths)
- Missing corpus now fails with t.Fatalf instead of silently skipping
- Fix summary counters: totalFiles, passedFiles, expectedFailureFiles, totalStmts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Align parser with official Apache Cassandra 4.1 CQL documentation:
- Support // line comments alongside --
- Add != operator
- GROUP BY clause in SELECT
- PER PARTITION LIMIT clause in SELECT
- KEYS/VALUES/ENTRIES/FULL index column specs
- COUNTER batch type
- Optional FINALFUNC/INITCOND in CREATE AGGREGATE
- GRANT/REVOKE ROLE statements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add optional IF EXISTS to ALTER KEYSPACE, ALTER TABLE, ALTER TYPE, ALTER MATERIALIZED VIEW, ALTER ROLE, ALTER USER. Add sub-operation guards: ADD IF NOT EXISTS, DROP IF EXISTS, RENAME IF EXISTS on ALTER TABLE and ALTER TYPE per Cassandra 4.1 CQL specification. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Complete all grammar compliance items for Cassandra 4.1 CQL:
- CAST(expr AS type) expressions
- Bind markers (? positional and :name named)
- NaN/Infinity float literals
- DROP FUNCTION/AGGREGATE with optional argument type signatures
- IF condition extensions: IN, CONTAINS, CONTAINS KEY in LWT
- INSERT JSON DEFAULT NULL alongside DEFAULT UNSET
- HASHED PASSWORD and ACCESS TO DATACENTERS in role options
- UDT field access (col.field) in UPDATE SET and DELETE targets
- MBEAN/MBEANS and ALL MBEANS resource types
- FUNCTION resource with type signature in GRANT/REVOKE

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1. CREATE/ALTER USER: WITH PASSWORD optional, HASHED PASSWORD support
2. LIMIT/PER PARTITION LIMIT: accept bind markers (? and :name)
3. ALTER TABLE RENAME: support multiple pairs (a TO b AND c TO d)
4. Singular PERMISSION keyword (GRANT SELECT PERMISSION ON ...)
5. CREATE CUSTOM INDEX: support both WITH OPTIONS = {...} and WITH {...}
6. MBEAN/MBEANS: checked type assertion to prevent panic on non-string
7. gofmt formatting on node.go, update.go, split_test.go
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
parseLimitValue() no longer falls through to parseConstant(). Only tokINTEGER, positional (?), and named (:name) bind markers are accepted. Strings, bools, nulls, and floats are now rejected with a clear error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
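The stricter LIMIT handling described above could be sketched like this. Only the names parseLimitValue and tokINTEGER appear in the commit message; the other token kinds, the `token` and `limitValue` types, and the slice-based signature are hypothetical simplifications of whatever the real parser uses.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

type tokKind int

const (
	tokINTEGER tokKind = iota
	tokQMARK           // "?"  (hypothetical kind name)
	tokCOLON           // ":"  (hypothetical kind name)
	tokIDENT           // hypothetical kind name
	tokSTRING          // hypothetical kind name
	tokFLOAT           // hypothetical kind name
)

type token struct {
	kind tokKind
	text string
}

// limitValue is either an integer literal or a bind marker.
type limitValue struct {
	Int    int64
	IsBind bool
	Name   string // empty for a positional "?" marker
}

var errLimit = errors.New("expected integer or bind marker for LIMIT")

// parseLimitValue accepts only an integer, "?", or ":name". It returns the
// parsed value and the number of tokens consumed. There is deliberately no
// fallback to a general constant parser, so strings, floats, booleans, and
// nulls are rejected.
func parseLimitValue(toks []token) (limitValue, int, error) {
	if len(toks) == 0 {
		return limitValue{}, 0, errLimit
	}
	switch toks[0].kind {
	case tokINTEGER:
		n, err := strconv.ParseInt(toks[0].text, 10, 64)
		if err != nil {
			return limitValue{}, 0, err
		}
		return limitValue{Int: n}, 1, nil
	case tokQMARK:
		return limitValue{IsBind: true}, 1, nil
	case tokCOLON:
		if len(toks) > 1 && toks[1].kind == tokIDENT {
			return limitValue{IsBind: true, Name: toks[1].text}, 2, nil
		}
	}
	return limitValue{}, 0, errLimit
}

func main() {
	inputs := [][]token{
		{{tokINTEGER, "10"}},
		{{tokQMARK, "?"}},
		{{tokCOLON, ":"}, {tokIDENT, "lim"}},
		{{tokSTRING, "'10'"}},
	}
	for _, in := range inputs {
		v, n, err := parseLimitValue(in)
		fmt.Printf("value=%+v consumed=%d err=%v\n", v, n, err)
	}
}
```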
Summary
- `.cql` test files (all passing)

Test plan
- `go test ./cassandra/... -count=1`: all pass
- `go vet ./cassandra/...`: clean
- `gofmt -l cassandra/`: clean
- `.cql` files pass

🤖 Generated with Claude Code