fix: use RE2-compatible escaper for repo/file filter queries#1004
fix: use RE2-compatible escaper for repo/file filter queries#1004brendan-kellam wants to merge 6 commits intomainfrom
Conversation
Token refresh was previously only triggered from the Next.js jwt callback, meaning tokens could expire between user visits and cause account-driven permission sync jobs to fail silently. Move refresh logic to packages/backend/src/ee/tokenRefresh.ts and call it from accountPermissionSyncer before using an account's access token. On refresh failure, tokenRefreshErrorMessage is set on the Account record and surfaced in the linked accounts UI so users know to re-authenticate. Also adds a DB migration for the tokenRefreshErrorMessage field and wires the signIn event to clear it on successful re-authentication. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
escape-string-regexp v5 encodes some characters (e.g. '-') as \xNN hex escapes, which Go's RE2 engine (used by Zoekt) does not support. This caused filterByRepos/filterByFilepaths with names containing dots, dashes, or slashes to return no results. Replace the import with a local escapeRE2 helper that only backslash-escapes characters special in RE2. Fixes #990 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis PR fixes repository and filepath filtering in code search by replacing escape-string-regexp with a custom RE2-compatible regex escaper. The previous implementation generated hex escapes incompatible with Zoekt's regex engine, causing searches with special characters (dots, dashes, slashes) to return no results. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
packages/backend/src/ee/tokenRefresh.ts (1)
80-84: Consider the edge case whereexpires_atis exactly 0.The condition
account.expires_at > 0treatsexpires_at === 0as "no expiry set," which skips the refresh check. However, 0 represents Unix epoch (1970), which is definitively expired. If a provider ever returns 0 or if corrupted data sets it to 0, this would incorrectly return a potentially expired token.Consider changing the condition to explicitly check for
null:const isExpiredOrNearExpiry = - account.expires_at !== null && - account.expires_at > 0 && + account.expires_at !== null && now >= account.expires_at - EXPIRY_BUFFER_S;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/backend/src/ee/tokenRefresh.ts` around lines 80 - 84, The current expiry check treats expires_at === 0 as "no expiry" because of the `> 0` guard; change the logic to explicitly only guard against null (or undefined) so 0 is considered a valid timestamp (and therefore expired). Update the `isExpiredOrNearExpiry` computation (the variable and use of `account.expires_at`) to use `account.expires_at !== null` (and optionally `!== undefined`) instead of `> 0`, i.e. check presence only and then evaluate `now >= account.expires_at - EXPIRY_BUFFER_S` so an expires_at of 0 is handled as expired; ensure references to EXPIRY_BUFFER_S and account.expires_at remain unchanged elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/mcp/src/index.ts`:
- Around line 7-9: The escapeRE2 function's character class is missing the
closing-parenthesis character, so update the regex in escapeRE2 (const escapeRE2
= (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');) to also include ')'
so closing parentheses are escaped; modify the character class to add ')'
(ensuring proper escaping/ordering inside the [...]) so the function correctly
backslash-escapes ')' for RE2-compatible patterns.
In `@packages/web/src/app/`[domain]/browse/layoutClient.tsx:
- Around line 13-15: The regex escape function escapeRE2 is missing the closing
parenthesis in its character class, so update the pattern in every duplicated
implementation (escapeRE2) to include ')' among the escaped characters (i.e.,
add ')' into the character class) so literal ')' in repo/file names are
correctly escaped; after patching each occurrence (in the functions named
escapeRE2 across the codebase) consolidate these duplicates by moving the
corrected escapeRE2 into a single shared utility module and import/use that
utility where the function was previously duplicated.
In `@packages/web/src/features/chat/tools.ts`:
- Line 14: escapeRE2's character class currently omits the closing parenthesis,
so strings containing ')' aren't properly escaped; update the escapeRE2
function's regex (the char class in the replace call inside escapeRE2) to
include ')' alongside '(' so both grouping metacharacters are escaped, ensuring
the replace call escapes all RE2 metacharacters correctly.
In `@packages/web/src/features/mcp/server.ts`:
- Around line 14-16: The regex in escapeRE2 currently escapes "(" but omits ")",
so add the closing parenthesis to the character class used in the replace so
that escapeRE2(s: string) properly escapes both parentheses; update the
character class in the escapeRE2 function to include ')' so Zoekt queries that
use the escaped repo/file patterns (used where escapeRE2 is invoked around the
Zoekt query construction) won't break at the locations where those patterns are
used.
---
Nitpick comments:
In `@packages/backend/src/ee/tokenRefresh.ts`:
- Around line 80-84: The current expiry check treats expires_at === 0 as "no
expiry" because of the `> 0` guard; change the logic to explicitly only guard
against null (or undefined) so 0 is considered a valid timestamp (and therefore
expired). Update the `isExpiredOrNearExpiry` computation (the variable and use
of `account.expires_at`) to use `account.expires_at !== null` (and optionally
`!== undefined`) instead of `> 0`, i.e. check presence only and then evaluate
`now >= account.expires_at - EXPIRY_BUFFER_S` so an expires_at of 0 is handled
as expired; ensure references to EXPIRY_BUFFER_S and account.expires_at remain
unchanged elsewhere.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ce79e67b-c53f-4568-8ab6-7dedfe3a4eb0
📒 Files selected for processing (13)
CHANGELOG.mdpackages/backend/src/ee/accountPermissionSyncer.tspackages/backend/src/ee/tokenRefresh.tspackages/db/prisma/migrations/20260313002214_add_account_token_refresh_error_message/migration.sqlpackages/db/prisma/schema.prismapackages/mcp/src/index.tspackages/web/src/app/[domain]/browse/layoutClient.tsxpackages/web/src/auth.tspackages/web/src/ee/features/sso/actions.tspackages/web/src/ee/features/sso/components/linkedAccountProviderCard.tsxpackages/web/src/features/chat/tools.tspackages/web/src/features/mcp/server.tspackages/web/src/lib/encryptedPrismaAdapter.ts
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | ||
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | ||
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n packages/mcp/src/index.ts | head -20Repository: sourcebot-dev/sourcebot
Length of output: 1496
🏁 Script executed:
# Check if the file exists and examine the context around lines 7-9
wc -l packages/mcp/src/index.tsRepository: sourcebot-dev/sourcebot
Length of output: 96
🏁 Script executed:
# Search for all occurrences of escapeRE2 in the codebase
rg "escapeRE2" -n --context 2Repository: sourcebot-dev/sourcebot
Length of output: 4830
🏁 Script executed:
# Also check if there are similar regex patterns elsewhere
rg "replace\(/\[" --context 3 -t ts -t tsx -t js -t jsxRepository: sourcebot-dev/sourcebot
Length of output: 94
🏁 Script executed:
# Search for any comments or documentation about RE2 special characters in the codebase
rg "RE2|special.*character" -i --context 3 -t ts -t tsx -t jsRepository: sourcebot-dev/sourcebot
Length of output: 94
🌐 Web query:
RE2 regular expression special characters that need escaping
💡 Result:
In RE2, you escape a metacharacter by prefixing it with a backslash (e.g., \+ to match a literal +). [1]
Metacharacters you typically need to escape (to match literally)
.^$*+?()[{\|(and, in general, other punctuation when used as operators) [2]
RE2 also supports escaping “any punctuation character” with \ (e.g., \*), and you can escape a whole literal chunk with \Q...\E. [3]
Inside a character class ([...])
You generally only need to escape (or place carefully):
\]-^(when^would otherwise negate the class) [2]
Sources: [1] [2] [3]
escapeRE2 is missing closing-parenthesis escaping.
Line 9's regex pattern fails to escape ), which is a special character in RE2. This can cause query filters to fail or parse incorrectly when repo IDs or filepaths contain closing parentheses.
Proposed fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); | |
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&'); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/mcp/src/index.ts` around lines 7 - 9, The escapeRE2 function's
character class is missing the closing-parenthesis character, so update the
regex in escapeRE2 (const escapeRE2 = (s: string) =>
s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');) to also include ')' so closing
parentheses are escaped; modify the character class to add ')' (ensuring proper
escaping/ordering inside the [...]) so the function correctly backslash-escapes
')' for RE2-compatible patterns.
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | ||
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | ||
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="packages/web/src/app/[domain]/browse/layoutClient.tsx"
rg -n "const escapeRE2" "$file"
python - <<'PY'
from pathlib import Path
import sys
file = Path("packages/web/src/app/[domain]/browse/layoutClient.tsx")
line = next((l for l in file.read_text().splitlines() if "const escapeRE2" in l), None)
if not line:
print("escapeRE2 declaration not found")
sys.exit(1)
# Pull content inside /[ ... ]/g for quick static validation
try:
char_class = line.split("/[", 1)[1].split("]/g", 1)[0]
except Exception:
print("Could not parse regex char class from:", line)
sys.exit(1)
print("escapeRE2 char class:", char_class)
required = [".", "+", "*", "?", "^", "$", "{", "}", "[", "]", "|", "(", ")", "\\\\"]
missing = [c for c in required if c not in char_class]
print("missing:", missing)
sys.exit(1 if missing else 0)
PYRepository: sourcebot-dev/sourcebot
Length of output: 196
🏁 Script executed:
# Check the file structure and verify the patch suggestion
cat -n packages/web/src/app/[domain]/browse/layoutClient.tsx | head -50
# Also check if escapeRE2 is used anywhere else in the codebase
rg "escapeRE2" --type ts --type tsxRepository: sourcebot-dev/sourcebot
Length of output: 2431
🏁 Script executed:
# Search for escapeRE2 usage with a simpler approach
rg "escapeRE2"
# Verify the suggested patch - check if all RE2 metacharacters are covered
python - <<'PY'
import re
# Current regex pattern from line 15
current = r"[.+*?^${}[\]|(\\]"
# Suggested patch
suggested = r"[\\.^$|?*+()[\]{}]"
# All RE2 regex metacharacters that need escaping
re2_metacharacters = {'.', '+', '*', '?', '^', '$', '{', '}', '[', ']', '|', '(', ')', '\\'}
print("=== Current Pattern Analysis ===")
print(f"Pattern: {current}")
current_chars = set()
# Parse the character class manually since it's complex
for c in '.+*?^${}[|(\\':
if c in current:
current_chars.add(c)
current_chars.add(']') # ] is in the class as \]
print(f"Escapes: {sorted(current_chars)}")
current_missing = re2_metacharacters - current_chars
print(f"Missing: {current_missing}")
print("\n=== Suggested Pattern Analysis ===")
print(f"Pattern: {suggested}")
suggested_chars = set()
for c in '.^$|?*+()[]{\\':
if c in suggested:
suggested_chars.add(c)
print(f"Escapes: {sorted(suggested_chars)}")
suggested_missing = re2_metacharacters - suggested_chars
print(f"Missing: {suggested_missing}")
print(f"\n=== Verdict ===")
print(f"Current pattern missing: {current_missing}")
print(f"Suggested patch fixes issue: {')' in suggested_chars}")
PYRepository: sourcebot-dev/sourcebot
Length of output: 1838
Escape set is incomplete: ) is not escaped in escapeRE2, affecting 4 files.
The character class on line 15 escapes ( but not ). This breaks literal matching for any repo name or file path containing ). The same incomplete function is duplicated across the codebase:
- packages/web/src/app/[domain]/browse/layoutClient.tsx:15 (repo escaping)
- packages/web/src/features/mcp/server.ts (repo and file escaping)
- packages/web/src/features/chat/tools.ts (repo and file escaping)
- packages/mcp/src/index.ts (repo and file escaping)
All instances must be fixed consistently. Apply the corrected pattern to all locations:
Suggested patch
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');Consider extracting this function to a shared utility module to prevent duplication and ensure consistency across the codebase.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); | |
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&'); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/web/src/app/`[domain]/browse/layoutClient.tsx around lines 13 - 15,
The regex escape function escapeRE2 is missing the closing parenthesis in its
character class, so update the pattern in every duplicated implementation
(escapeRE2) to include ')' among the escaped characters (i.e., add ')' into the
character class) so literal ')' in repo/file names are correctly escaped; after
patching each occurrence (in the functions named escapeRE2 across the codebase)
consolidate these duplicates by moving the corrected escapeRE2 into a single
shared utility module and import/use that utility where the function was
previously duplicated.
| import escapeStringRegexp from "escape-string-regexp"; | ||
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | ||
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | ||
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For RE2-compatible regex escaping (used from JavaScript), which metacharacters must be escaped for literal matching outside character classes, and does this include both '(' and ')'?
💡 Result:
Outside character classes, RE2 treats these as metacharacters (i.e., not literal unless escaped): \ (escape), . (any char), ^ / $ (anchors), * + ? (quantifiers), | (alternation), ( ) (grouping), [ (starts a character class), and { (starts a repetition like {n,m}). To match any of these literally, prefix with \ (and in a JavaScript string literal you’d write that as \\). [1][2]
Yes—this includes both ( and ); they are grouping metacharacters, so use \( and \) (JS: "\\(" and "\\)"). [1]
Sources: [1] RE2 syntax (google/re2 wiki). [2] RE2 syntax reference (CRAN mirror).
escapeRE2 misses ) escaping, breaking repos/files with closing parentheses.
The regex character class escapes ( but not ). Since both are metacharacters in RE2, values containing ) will produce invalid or unintended patterns. The fix is to include both grouping characters.
Suggested fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[.+*?^${}()|[\]\\]/g, '\\$&');📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); | |
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}()|[\]\\]/g, '\\$&'); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/web/src/features/chat/tools.ts` at line 14, escapeRE2's character
class currently omits the closing parenthesis, so strings containing ')' aren't
properly escaped; update the escapeRE2 function's regex (the char class in the
replace call inside escapeRE2) to include ')' alongside '(' so both grouping
metacharacters are escaped, ensuring the replace call escapes all RE2
metacharacters correctly.
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | ||
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | ||
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n packages/web/src/features/mcp/server.ts | head -30Repository: sourcebot-dev/sourcebot
Length of output: 1532
🏁 Script executed:
cat -n packages/web/src/features/mcp/server.ts | sed -n '140,160p'Repository: sourcebot-dev/sourcebot
Length of output: 958
🏁 Script executed:
rg "escapeRE2" packages/web/src/features/mcp/server.ts -B 2 -A 2Repository: sourcebot-dev/sourcebot
Length of output: 722
🌐 Web query:
RE2 regex metacharacters parentheses escaping Go
💡 Result:
In Go, the regexp package uses RE2 syntax, where parentheses ( and ) are metacharacters for grouping/capturing. To match literal parentheses, escape them as \( and \). [2]
Because Go string literals also treat \ as an escape (in double-quoted strings), you typically need double escaping:
regexp.MustCompile("\\(\\d+,\\d+\\)") // interpreted string literalOr avoid double escaping by using a raw string literal (backticks):
regexp.MustCompile(`\(\d+,\d+\)`) // raw string literalIf you need to safely insert arbitrary text (including parentheses) into a regex, use regexp.QuoteMeta, which escapes all regex metacharacters for you. [1]
Sources: Go regexp docs (QuoteMeta). [1] RE2 syntax reference. [2] Go escaping-parentheses discussion/examples. [3]
[1] (pkg.go.dev)
[2] (github.com)
[3] (stackoverflow.com)
Citations:
- 1: https://pkg.go.dev/regexp?utm_source=openai
- 2: https://github.com/google/re2/wiki/syntax?utm_source=openai
- 3: https://stackoverflow.com/questions/35183103/escaping-parentheses-in-go-regexp?utm_source=openai
🏁 Script executed:
rg "escapeRE2|filterByRepos|filterByFilepaths" packages/web/src/features/mcp/ -B 1 -A 1Repository: sourcebot-dev/sourcebot
Length of output: 2149
🏁 Script executed:
find packages -name "*.test.ts" -o -name "*.test.tsx" -o -name "*.spec.ts" | xargs grep -l "escapeRE2\|server.ts" 2>/dev/null | head -5Repository: sourcebot-dev/sourcebot
Length of output: 49
Escape ) in escapeRE2 as well.
The current pattern at line 16 escapes ( but not ). Both are metacharacters in RE2 used for grouping/capturing. Unescaped ) in repo names or filepaths will break regex pattern matching in Zoekt queries at lines 146 and 152.
Proposed fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&'); | |
| // Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2). | |
| // escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support. | |
| const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&'); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/web/src/features/mcp/server.ts` around lines 14 - 16, The regex in
escapeRE2 currently escapes "(" but omits ")", so add the closing parenthesis to
the character class used in the replace so that escapeRE2(s: string) properly
escapes both parentheses; update the character class in the escapeRE2 function
to include ')' so Zoekt queries that use the escaped repo/file patterns (used
where escapeRE2 is invoked around the Zoekt query construction) won't break at
the locations where those patterns are used.
Updated CHANGELOG.md to include fixes for repository filtering and added new features related to SMTP configuration and API key restrictions.
Summary
escape-string-regexpv5 encodes certain characters (e.g.-) as\xNNhex escapes, which Go's RE2 engine (used by Zoekt) does not supportfilterByRepos/filterByFilepathswith names containing dots, dashes, or slashes to silently return no resultsescapeRE2helper that only backslash-escapes characters actually special in RE2, across all four call sites (packages/mcp,packages/webMCP server, chat tools, and browse layout)Test plan
filterByReposcontaining a repo name with dots (e.g.dev.azure.com/Org/Proj/_git/repo) — confirm results are returnedfilterByReposcontaining a repo name with dashes (e.g.my-backend-service) — confirm results are returnedfilterByFilepathscontaining a path with dots/dashes — confirm filtering works correctlyrepo:^...$filterFixes #990
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes