-
-
Notifications
You must be signed in to change notification settings - Fork 959
Query fixes #2876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query fixes #2876
Conversation
|
WalkthroughRemoved a ClickHouse safety flag from the default settings. Added semantic state to the query printer to track SELECT aliases, allowed and internal-only ClickHouse columns, and a projection context flag. Enhanced field resolution to block unknown and internal-only columns in user-facing projections, permit certain internal columns in internal contexts (tenant/required-filter), and produce suggestion-rich errors for unknown columns. Adjusted error sanitization to avoid replacing names inside "Did you mean" suggestions. Escalated unknown-column reports from warnings to errors. Expanded and added tests covering aliasing, subqueries, clause-specific resolution, and internal-column blocking. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Repository UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (7)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
🧰 Additional context used📓 Path-based instructions (9)**/*.{ts,tsx}📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
**/*.{ts,tsx,js,jsx}📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
**/*.{js,ts,jsx,tsx,json,md,css,scss}📄 CodeRabbit inference engine (AGENTS.md)
Files:
**/*.ts📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
Files:
**/*.{ts,tsx,js}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
**/*.{test,spec}.{ts,tsx}📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
**/*.test.{ts,tsx,js,jsx}📄 CodeRabbit inference engine (AGENTS.md)
Files:
**/*.test.{ts,tsx,js}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
**/*.test.{ts,tsx}📄 CodeRabbit inference engine (CLAUDE.md)
Files:
🧠 Learnings (2)📚 Learning: 2026-01-12T11:01:34.792ZApplied to files:
📚 Learning: 2025-11-27T16:27:48.109ZApplied to files:
🧬 Code graph analysis (2)internal-packages/tsql/src/query/printer.ts (3)
internal-packages/tsql/src/query/printer.test.ts (3)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
🔇 Additional comments (15)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PR Review: Query fixesSummaryThis PR improves the TSQL query layer with two key changes:
Overall Assessment✅ Good PR - Well-structured changes with comprehensive test coverage. The security improvements around column access are solid. Code Quality & Best PracticesStrengths:
Minor suggestions:
private static readonly ARITHMETIC_OP_NAMES: Record<ArithmeticOperationOp, string> = {
[ArithmeticOperationOp.Add]: "plus",
// ...
};
Potential Bugs or Issues
SELECT unknown_col FROM (SELECT * FROM runs)would not throw. Consider if this should be documented or handled differently. Performance Considerations
Security Considerations✅ This PR improves security by:
The "Did you mean?" error messages are safe because they only suggest columns that are already exposed in the TSQL schema. Test CoverageExcellent coverage with tests for:
One minor gap:
Summary of Recommendations
Overall this is a well-implemented security improvement. Approve with minor suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal-packages/tsql/src/query/printer.ts (1)
2177-2246: Unknown-column blocking: add “did you mean” support for dotted roots typed as ClickHouse names (e.g.created_at.foo).Right now,
resolveFieldChain()returns the chain unchanged whenchain.length >= 2andchain[0]isn’t a known table alias (Line 2203-2204). That preserves subquery alias refs, but it also skips the clickhouseName→tsqlName suggestion path if the root is a ClickHouse internal name used with dot access.A targeted hardening that still preserves subquery aliases: when
chain.length >= 2andchain[0]is not a known table alias, check whetherchain[0]matches anyColumnSchema.clickhouseName; if so, throw the same “Did you mean …?” error you already throw for unqualified/qualified access.Proposed diff (minimal change, preserves subquery alias refs)
if (chain.length >= 2) { const tableAlias = firstPart; const tableSchema = this.tableContexts.get(tableAlias); if (tableSchema) { // This is a table.column reference const columnName = chain[1]; if (typeof columnName === "string") { const resolvedColumn = this.resolveColumnName(tableSchema, columnName, tableAlias); return [tableAlias, resolvedColumn, ...chain.slice(2)]; } } - // Not a known table alias, might be a nested field or JSON path - return as-is - return chain; + // Not a known table alias. + // If the user used an internal ClickHouse column name as the root of a dotted path, + // still provide "Did you mean" guidance (instead of silently passing it through). + const suggestion = this.findTSQLNameForClickHouseName(tableAlias); + if (suggestion) { + throw new QueryError(`Unknown column "${tableAlias}". Did you mean "${suggestion}"?`); + } + // Otherwise, allow (e.g. subquery aliases like sub.total) + return chain; }Also, if you keep
allowedInternalColumnsas a bypass, I’d strongly recommend adding an explicit denial path for user-authored references to internal-only columns (so the bypass is only usable for printer-generated guards).Also applies to: 2259-2275, 2277-2305
🤖 Fix all issues with AI agents
In @internal-packages/tsql/src/query/printer.ts:
- Around line 1417-1433: When registering tenant/required-filter names into
allowedInternalColumns, also mark them as non-user-queryable unless they appear
in the table schema's public columns: create or update a companion set (e.g.,
internalNonQueryableColumns or internalOnlyColumns) and add
organizationId/projectId/environmentId and each requiredFilters[].column to it;
then update the printer paths that render/provide SELECT, ORDER BY, GROUP BY,
and projection logic to reject or strip any user-requested column present in
that internal-only set unless that column exists in TableSchema.columns (or
tableSchema.columns includes an explicit exposure flag). Ensure the check
references the existing symbols allowedInternalColumns, tableSchema, and
TableSchema.columns and that the guard runs wherever user-supplied column names
are validated/rendered.
- Around line 119-128: The field-resolution logic currently treats
allowedInternalColumns as universally resolvable, allowing user-authored queries
to reference internal/generated columns; update resolveFieldChain to only accept
names from allowedInternalColumns when the AST node was printer-injected (e.g.,
mark injected nodes or check a flag/property on the AST token), and otherwise
require resolution against TableSchema.columns/selectAliases; ensure the Set
allowedInternalColumns is only consulted for guard/inserted nodes
(printer-injected) and do not fall back to it for normal parsed identifiers to
prevent schema bypass.
🧹 Nitpick comments (4)
internal-packages/tsql/src/query/schema.ts (1)
859-871: Make the “Did you mean” detection more robust (quote/case variations).
result.includes('Did you mean "')is easy to miss if formatting changes (single quotes/backticks/case). Consider a case-insensitive regex gate.Proposed diff
- if (!result.includes('Did you mean "')) { + const hasDidYouMean = /did you mean\s+["'`]/i.test(result); + if (!hasDidYouMean) { // Sort by length descending to replace longer names first const sortedColumnNames = [...columnNameMap.entries()].sort( (a, b) => b[0].length - a[0].length ); for (const [clickhouseName, { tsqlName }] of sortedColumnNames) { result = replaceAllOccurrences(result, clickhouseName, tsqlName); } }internal-packages/tsql/src/query/printer.ts (2)
323-335: Wrap context restore intry/finallyto avoid state corruption on throw.
visitSelectQuery()restores contexts/aliases at the end, but anyQueryErrorthrown mid-walk will skip restore (relevant ifClickHousePrinteris reused).Proposed diff (structure)
- // Save and clear table contexts - // Top-level queries clear contexts; subqueries save parent context and create fresh context - const savedTableContexts = new Map(this.tableContexts); - const savedInternalColumns = new Set(this.allowedInternalColumns); + const savedTableContexts = new Map(this.tableContexts); + const savedInternalColumns = new Set(this.allowedInternalColumns); + const savedAliases = this.selectAliases; if (isTopLevelQuery) { this.tableContexts.clear(); this.allowedInternalColumns.clear(); } else { this.tableContexts = new Map(); this.allowedInternalColumns = new Set(); } - // Extract SELECT column aliases BEFORE visiting columns - // This allows ORDER BY/HAVING to reference aliased columns - const savedAliases = this.selectAliases; - this.selectAliases = new Set(); - if (node.select) { - for (const col of node.select) { - this.extractSelectAlias(col); - } - } + try { + // Extract SELECT column aliases BEFORE visiting columns + this.selectAliases = new Set(); + if (node.select) { + for (const col of node.select) { + this.extractSelectAlias(col); + } + } // ... existing method body ... - // Restore saved contexts (for nested queries) - this.selectAliases = savedAliases; - this.tableContexts = savedTableContexts; - this.allowedInternalColumns = savedInternalColumns; - - return response; + return response; + } finally { + this.selectAliases = savedAliases; + this.tableContexts = savedTableContexts; + this.allowedInternalColumns = savedInternalColumns; + }Also applies to: 510-514
371-379: Alias pre-scan is the right approach; consider de-duping implicit-name logic.Extracting aliases before visiting columns is a good fix for ORDER BY/HAVING alias references. Small maintainability note:
extractSelectAlias()overlaps conceptually withgenerateImplicitName()(and validator’s implicit-name logic), so drift is a risk.Also applies to: 518-558
internal-packages/tsql/src/query/printer.test.ts (1)
2175-2319: Nice coverage expansion for compile-time unknown-column blocking + alias/subquery allowances.If you implement the dotted-root “Did you mean” hardening in
resolveFieldChain(), it’d be worth adding a focused test case likeSELECT created_at.foo FROM runsexpectingDid you mean "created"(or whatever mapping applies) to prevent regression.
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
apps/webapp/app/services/queryService.server.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/validator.ts
💤 Files with no reviewable changes (1)
- apps/webapp/app/services/queryService.server.ts
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects insteadEvery Trigger.dev task must be exported; use task() function with unique id and run async function
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/validator.ts
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/validator.ts
**/*.{test,spec}.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use vitest for all tests in the Trigger.dev repository
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/validator.ts
**/*.test.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (AGENTS.md)
**/*.test.{ts,tsx,js,jsx}: Test files should live beside the files under test and use descriptivedescribeanditblocks
Avoid mocks or stubs in tests; use helpers from@internal/testcontainerswhen Redis or Postgres are needed
Use vitest for unit tests
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/validator.ts
**/*.test.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
Use vitest exclusively for testing and never mock anything; use testcontainers instead
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.ts
**/*.test.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.test.{ts,tsx}: Place test files next to source files with .test.ts naming convention (e.g., MyService.ts → MyService.test.ts)
Use testcontainers helpers (redisTest, postgresTest, containerTest from @internal/testcontainers) for Redis/PostgreSQL testing instead of mocks
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.ts
**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js}: Import from @trigger.dev/core using subpaths only; never import from root of @trigger.dev/core
Always import task definitions from @trigger.dev/sdk, never from @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Files:
internal-packages/tsql/src/query/schema.test.tsinternal-packages/tsql/src/query/schema.tsinternal-packages/tsql/src/query/validator.test.tsinternal-packages/tsql/src/query/printer.test.tsinternal-packages/tsql/src/query/printer.tsinternal-packages/tsql/src/query/validator.ts
🧠 Learnings (1)
📚 Learning: 2026-01-12T11:01:34.792Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-12T11:01:34.792Z
Learning: Applies to internal-packages/database/prisma/migrations/**/*.sql : Edit database schema in internal-packages/database/prisma/schema.prisma; remove extraneous lines from migrations related to BackgroundWorker, TaskRun tags, waitpoints, and SecretStore indexes
Applied to files:
internal-packages/tsql/src/query/schema.ts
🧬 Code graph analysis (3)
internal-packages/tsql/src/query/schema.test.ts (2)
internal-packages/tsql/src/index.ts (1)
sanitizeErrorMessage(82-82)apps/webapp/app/v3/querySchemas.ts (1)
runsSchema(27-420)
internal-packages/tsql/src/query/printer.test.ts (4)
internal-packages/tsql/src/query/errors.ts (1)
QueryError(43-45)internal-packages/tsql/src/query/schema.ts (1)
createSchemaRegistry(335-351)apps/webapp/app/v3/querySchemas.ts (1)
runsSchema(27-420)internal-packages/tsql/src/query/printer_context.ts (1)
createPrinterContext(205-214)
internal-packages/tsql/src/query/printer.ts (3)
internal-packages/tsql/src/query/ast.ts (4)
Expression(47-78)Alias(372-378)Call(482-488)ArithmeticOperation(380-385)internal-packages/tsql/src/query/errors.ts (1)
QueryError(43-45)internal-packages/tsql/src/query/schema.ts (1)
TableSchema(285-303)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (6)
internal-packages/tsql/src/query/validator.ts (2)
424-433: Unknown columns now fail validation (expected breaking change).Severity is now
"error"for qualified unknown columns, which will flipValidationResult.validtofalse(based on Line 102). Just ensure any callers/UI that previously tolerated warnings are updated accordingly.
463-471: Unqualified unknown columns now fail validation (consistent with qualified path).This matches the new “unknown columns are errors” posture and aligns the two paths.
internal-packages/tsql/src/query/schema.test.ts (1)
618-627: Good regression test for “Did you mean” sanitization.Covers the exact failure mode that would otherwise rewrite the “unknown” token into the suggestion token.
internal-packages/tsql/src/query/validator.test.ts (2)
50-59: Test correctly updated for unknown ORDER BY column now being an error.
101-108: Test correctly updated for unknown SELECT column now being an error.internal-packages/tsql/src/query/printer.test.ts (1)
432-441: NULL handling expectations updated appropriately (started_at).
8258fd6 to
e5c9e55
Compare
PR Review: Query fixesSummaryThis PR improves the TSQL query compiler by:
Code Quality ✅Well-designed architecture: The implementation cleanly separates concerns:
Good error messages: The "Did you mean X?" suggestions when users type ClickHouse column names instead of TSQL names is a nice UX improvement. Potential Issues
Test Coverage ✅Excellent test coverage with:
Performance ConsiderationsThe column resolution now does additional lookups through Security ✅The change improves security by blocking access to internal columns that users shouldn't query directly (tenant isolation columns, internal filter columns like Minor Suggestions
VerdictApprove - This is a well-implemented improvement that:
The changes are focused and don't over-engineer the solution. |
Don’t allow aliased columns to be queried – it was actually safe but confusing. We call
created_at->triggered_atbut we still allowed created_at which was confusing.Now we have nice errors if you try select columns that aren’t selectable.
Also removed a ClickHouse setting
allow_experimental_object_typewhich worked fine locally but stopped all queries working on ClickHouse Cloud 🤦♂️