Skip to content

fix: use RE2-compatible escaper for repo/file filter queries#1004

Closed
brendan-kellam wants to merge 6 commits intomainfrom
brendan/fix-repo-filter-special-chars-SOU-556
Closed

fix: use RE2-compatible escaper for repo/file filter queries#1004
brendan-kellam wants to merge 6 commits intomainfrom
brendan/fix-repo-filter-special-chars-SOU-556

Conversation

@brendan-kellam
Copy link
Contributor

@brendan-kellam brendan-kellam commented Mar 14, 2026

Summary

  • escape-string-regexp v5 encodes certain characters (e.g. -) as \xNN hex escapes, which Go's RE2 engine (used by Zoekt) does not support
  • This caused filterByRepos/filterByFilepaths with names containing dots, dashes, or slashes to silently return no results
  • Replaces the library import with a local escapeRE2 helper that only backslash-escapes characters actually special in RE2, across all four call sites (packages/mcp, packages/web MCP server, chat tools, and browse layout)

Test plan

  • Search for code without a repo filter — confirm results return as before
  • Search with filterByRepos containing a repo name with dots (e.g. dev.azure.com/Org/Proj/_git/repo) — confirm results are returned
  • Search with filterByRepos containing a repo name with dashes (e.g. my-backend-service) — confirm results are returned
  • Search with filterByFilepaths containing a path with dots/dashes — confirm filtering works correctly
  • Browse to a repo with special characters in its name and confirm the search bar pre-populates the correct repo:^...$ filter

Fixes #990

🤖 Generated with Claude Code

Summary by CodeRabbit

Bug Fixes

  • Fixed repository and file path filtering to correctly handle repository names containing special characters (dots, dashes, or slashes), which previously returned no search results.

brendan-kellam and others added 4 commits March 12, 2026 17:58
Token refresh was previously only triggered from the Next.js jwt callback,
meaning tokens could expire between user visits and cause account-driven
permission sync jobs to fail silently.

Move refresh logic to packages/backend/src/ee/tokenRefresh.ts and call it
from accountPermissionSyncer before using an account's access token. On
refresh failure, tokenRefreshErrorMessage is set on the Account record and
surfaced in the linked accounts UI so users know to re-authenticate.

Also adds a DB migration for the tokenRefreshErrorMessage field and wires
the signIn event to clear it on successful re-authentication.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
escape-string-regexp v5 encodes some characters (e.g. '-') as \xNN hex
escapes, which Go's RE2 engine (used by Zoekt) does not support. This
caused filterByRepos/filterByFilepaths with names containing dots, dashes,
or slashes to return no results. Replace the import with a local escapeRE2
helper that only backslash-escapes characters special in RE2.

Fixes #990

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 860021a8-755a-425a-bc79-953e912949d7

📥 Commits

Reviewing files that changed from the base of the PR and between ac166d6 and 79c3147.

📒 Files selected for processing (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md

Walkthrough

This PR fixes repository and filepath filtering in code search by replacing escape-string-regexp with a custom RE2-compatible regex escaper. The previous implementation generated hex escapes incompatible with Zoekt's regex engine, causing searches with special characters (dots, dashes, slashes) to return no results.

Changes

Cohort / File(s) Summary
Changelog
CHANGELOG.md
Added entry documenting the fix for filterByRepos and filterByFilepaths when repository names contain dots, dashes, or slashes.
MCP & Web Packages
packages/mcp/src/index.ts, packages/web/src/features/mcp/server.ts, packages/web/src/features/chat/tools.ts, packages/web/src/app/[domain]/browse/layoutClient.tsx
Replaced escape-string-regexp imports with local RE2-compatible escapeRE2 functions. Applied new escaper in query construction for repository and filepath filtering, with explanatory comments on RE2 compatibility.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • msukkari
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: use RE2-compatible escaper for repo/file filter queries' clearly and accurately describes the main change: replacing the escape-string-regexp library with a RE2-compatible escaper for repository and file filtering.
Linked Issues check ✅ Passed The PR directly addresses all coding requirements from issue #990: replaces escape-string-regexp with RE2-compatible escaper across multiple call sites to fix filterByRepos and filterByFilepaths functionality for repository names with special characters (dots, dashes, slashes).
Out of Scope Changes check ✅ Passed All changes are directly related to fixing the escaping mechanism for repository and file path filters in the code search tool; no unrelated modifications detected across the four modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch brendan/fix-repo-filter-special-chars-SOU-556
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (1)
packages/backend/src/ee/tokenRefresh.ts (1)

80-84: Consider the edge case where expires_at is exactly 0.

The condition account.expires_at > 0 treats expires_at === 0 as "no expiry set," which skips the refresh check. However, 0 represents Unix epoch (1970), which is definitively expired. If a provider ever returns 0 or if corrupted data sets it to 0, this would incorrectly return a potentially expired token.

Consider changing the condition to explicitly check for null:

     const isExpiredOrNearExpiry =
-        account.expires_at !== null &&
-        account.expires_at > 0 &&
+        account.expires_at !== null &&
         now >= account.expires_at - EXPIRY_BUFFER_S;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/backend/src/ee/tokenRefresh.ts` around lines 80 - 84, The current
expiry check treats expires_at === 0 as "no expiry" because of the `> 0` guard;
change the logic to explicitly only guard against null (or undefined) so 0 is
considered a valid timestamp (and therefore expired). Update the
`isExpiredOrNearExpiry` computation (the variable and use of
`account.expires_at`) to use `account.expires_at !== null` (and optionally `!==
undefined`) instead of `> 0`, i.e. check presence only and then evaluate `now >=
account.expires_at - EXPIRY_BUFFER_S` so an expires_at of 0 is handled as
expired; ensure references to EXPIRY_BUFFER_S and account.expires_at remain
unchanged elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/mcp/src/index.ts`:
- Around line 7-9: The escapeRE2 function's character class is missing the
closing-parenthesis character, so update the regex in escapeRE2 (const escapeRE2
= (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');) to also include ')'
so closing parentheses are escaped; modify the character class to add ')'
(ensuring proper escaping/ordering inside the [...]) so the function correctly
backslash-escapes ')' for RE2-compatible patterns.

In `@packages/web/src/app/`[domain]/browse/layoutClient.tsx:
- Around line 13-15: The regex escape function escapeRE2 is missing the closing
parenthesis in its character class, so update the pattern in every duplicated
implementation (escapeRE2) to include ')' among the escaped characters (i.e.,
add ')' into the character class) so literal ')' in repo/file names are
correctly escaped; after patching each occurrence (in the functions named
escapeRE2 across the codebase) consolidate these duplicates by moving the
corrected escapeRE2 into a single shared utility module and import/use that
utility where the function was previously duplicated.

In `@packages/web/src/features/chat/tools.ts`:
- Line 14: escapeRE2's character class currently omits the closing parenthesis,
so strings containing ')' aren't properly escaped; update the escapeRE2
function's regex (the char class in the replace call inside escapeRE2) to
include ')' alongside '(' so both grouping metacharacters are escaped, ensuring
the replace call escapes all RE2 metacharacters correctly.

In `@packages/web/src/features/mcp/server.ts`:
- Around line 14-16: The regex in escapeRE2 currently escapes "(" but omits ")",
so add the closing parenthesis to the character class used in the replace so
that escapeRE2(s: string) properly escapes both parentheses; update the
character class in the escapeRE2 function to include ')' so Zoekt queries that
use the escaped repo/file patterns (used where escapeRE2 is invoked around the
Zoekt query construction) won't break at the locations where those patterns are
used.

---

Nitpick comments:
In `@packages/backend/src/ee/tokenRefresh.ts`:
- Around line 80-84: The current expiry check treats expires_at === 0 as "no
expiry" because of the `> 0` guard; change the logic to explicitly only guard
against null (or undefined) so 0 is considered a valid timestamp (and therefore
expired). Update the `isExpiredOrNearExpiry` computation (the variable and use
of `account.expires_at`) to use `account.expires_at !== null` (and optionally
`!== undefined`) instead of `> 0`, i.e. check presence only and then evaluate
`now >= account.expires_at - EXPIRY_BUFFER_S` so an expires_at of 0 is handled
as expired; ensure references to EXPIRY_BUFFER_S and account.expires_at remain
unchanged elsewhere.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ce79e67b-c53f-4568-8ab6-7dedfe3a4eb0

📥 Commits

Reviewing files that changed from the base of the PR and between 5c4b0ce and ac166d6.

📒 Files selected for processing (13)
  • CHANGELOG.md
  • packages/backend/src/ee/accountPermissionSyncer.ts
  • packages/backend/src/ee/tokenRefresh.ts
  • packages/db/prisma/migrations/20260313002214_add_account_token_refresh_error_message/migration.sql
  • packages/db/prisma/schema.prisma
  • packages/mcp/src/index.ts
  • packages/web/src/app/[domain]/browse/layoutClient.tsx
  • packages/web/src/auth.ts
  • packages/web/src/ee/features/sso/actions.ts
  • packages/web/src/ee/features/sso/components/linkedAccountProviderCard.tsx
  • packages/web/src/features/chat/tools.ts
  • packages/web/src/features/mcp/server.ts
  • packages/web/src/lib/encryptedPrismaAdapter.ts

Comment on lines +7 to +9
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n packages/mcp/src/index.ts | head -20

Repository: sourcebot-dev/sourcebot

Length of output: 1496


🏁 Script executed:

# Check if the file exists and examine the context around lines 7-9
wc -l packages/mcp/src/index.ts

Repository: sourcebot-dev/sourcebot

Length of output: 96


🏁 Script executed:

# Search for all occurrences of escapeRE2 in the codebase
rg "escapeRE2" -n --context 2

Repository: sourcebot-dev/sourcebot

Length of output: 4830


🏁 Script executed:

# Also check if there are similar regex patterns elsewhere
rg "replace\(/\[" --context 3 -t ts -t tsx -t js -t jsx

Repository: sourcebot-dev/sourcebot

Length of output: 94


🏁 Script executed:

# Search for any comments or documentation about RE2 special characters in the codebase
rg "RE2|special.*character" -i --context 3 -t ts -t tsx -t js

Repository: sourcebot-dev/sourcebot

Length of output: 94


🌐 Web query:

RE2 regular expression special characters that need escaping

💡 Result:

In RE2, you escape a metacharacter by prefixing it with a backslash (e.g., \+ to match a literal +). [1]

Metacharacters you typically need to escape (to match literally)

  • . ^ $ * + ? ( ) [ { \ | (and, in general, other punctuation when used as operators) [2]

RE2 also supports escaping “any punctuation character” with \ (e.g., \*), and you can escape a whole literal chunk with \Q...\E. [3]

Inside a character class ([...])

You generally only need to escape (or place carefully):

  • \ ] - ^ (when ^ would otherwise negate the class) [2]

Sources: [1] [2] [3]


escapeRE2 is missing closing-parenthesis escaping.

Line 9's regex pattern fails to escape ), which is a special character in RE2. This can cause query filters to fail or parse incorrectly when repo IDs or filepaths contain closing parentheses.

Proposed fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/mcp/src/index.ts` around lines 7 - 9, The escapeRE2 function's
character class is missing the closing-parenthesis character, so update the
regex in escapeRE2 (const escapeRE2 = (s: string) =>
s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');) to also include ')' so closing
parentheses are escaped; modify the character class to add ')' (ensuring proper
escaping/ordering inside the [...]) so the function correctly backslash-escapes
')' for RE2-compatible patterns.

Comment on lines +13 to +15
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

file="packages/web/src/app/[domain]/browse/layoutClient.tsx"
rg -n "const escapeRE2" "$file"

python - <<'PY'
from pathlib import Path
import sys

file = Path("packages/web/src/app/[domain]/browse/layoutClient.tsx")
line = next((l for l in file.read_text().splitlines() if "const escapeRE2" in l), None)
if not line:
    print("escapeRE2 declaration not found")
    sys.exit(1)

# Pull content inside /[ ... ]/g for quick static validation
try:
    char_class = line.split("/[", 1)[1].split("]/g", 1)[0]
except Exception:
    print("Could not parse regex char class from:", line)
    sys.exit(1)

print("escapeRE2 char class:", char_class)
required = [".", "+", "*", "?", "^", "$", "{", "}", "[", "]", "|", "(", ")", "\\\\"]
missing = [c for c in required if c not in char_class]
print("missing:", missing)
sys.exit(1 if missing else 0)
PY

Repository: sourcebot-dev/sourcebot

Length of output: 196


🏁 Script executed:

# Check the file structure and verify the patch suggestion
cat -n packages/web/src/app/[domain]/browse/layoutClient.tsx | head -50

# Also check if escapeRE2 is used anywhere else in the codebase
rg "escapeRE2" --type ts --type tsx

Repository: sourcebot-dev/sourcebot

Length of output: 2431


🏁 Script executed:

# Search for escapeRE2 usage with a simpler approach
rg "escapeRE2"

# Verify the suggested patch - check if all RE2 metacharacters are covered
python - <<'PY'
import re

# Current regex pattern from line 15
current = r"[.+*?^${}[\]|(\\]"
# Suggested patch
suggested = r"[\\.^$|?*+()[\]{}]"

# All RE2 regex metacharacters that need escaping
re2_metacharacters = {'.', '+', '*', '?', '^', '$', '{', '}', '[', ']', '|', '(', ')', '\\'}

print("=== Current Pattern Analysis ===")
print(f"Pattern: {current}")
current_chars = set()
# Parse the character class manually since it's complex
for c in '.+*?^${}[|(\\':
    if c in current:
        current_chars.add(c)
current_chars.add(']')  # ] is in the class as \]
print(f"Escapes: {sorted(current_chars)}")
current_missing = re2_metacharacters - current_chars
print(f"Missing: {current_missing}")

print("\n=== Suggested Pattern Analysis ===")
print(f"Pattern: {suggested}")
suggested_chars = set()
for c in '.^$|?*+()[]{\\':
    if c in suggested:
        suggested_chars.add(c)
print(f"Escapes: {sorted(suggested_chars)}")
suggested_missing = re2_metacharacters - suggested_chars
print(f"Missing: {suggested_missing}")

print(f"\n=== Verdict ===")
print(f"Current pattern missing: {current_missing}")
print(f"Suggested patch fixes issue: {')' in suggested_chars}")
PY

Repository: sourcebot-dev/sourcebot

Length of output: 1838


Escape set is incomplete: ) is not escaped in escapeRE2, affecting 4 files.

The character class on line 15 escapes ( but not ). This breaks literal matching for any repo name or file path containing ). The same incomplete function is duplicated across the codebase:

  • packages/web/src/app/[domain]/browse/layoutClient.tsx:15 (repo escaping)
  • packages/web/src/features/mcp/server.ts (repo and file escaping)
  • packages/web/src/features/chat/tools.ts (repo and file escaping)
  • packages/mcp/src/index.ts (repo and file escaping)

All instances must be fixed consistently. Apply the corrected pattern to all locations:

Suggested patch
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');

Consider extracting this function to a shared utility module to prevent duplication and ensure consistency across the codebase.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/app/`[domain]/browse/layoutClient.tsx around lines 13 - 15,
The regex escape function escapeRE2 is missing the closing parenthesis in its
character class, so update the pattern in every duplicated implementation
(escapeRE2) to include ')' among the escaped characters (i.e., add ')' into the
character class) so literal ')' in repo/file names are correctly escaped; after
patching each occurrence (in the functions named escapeRE2 across the codebase)
consolidate these duplicates by moving the corrected escapeRE2 into a single
shared utility module and import/use that utility where the function was
previously duplicated.

import escapeStringRegexp from "escape-string-regexp";
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For RE2-compatible regex escaping (used from JavaScript), which metacharacters must be escaped for literal matching outside character classes, and does this include both '(' and ')'?

💡 Result:

Outside character classes, RE2 treats these as metacharacters (i.e., not literal unless escaped): \ (escape), . (any char), ^ / $ (anchors), * + ? (quantifiers), | (alternation), ( ) (grouping), [ (starts a character class), and { (starts a repetition like {n,m}). To match any of these literally, prefix with \ (and in a JavaScript string literal you’d write that as \\). [1][2]

Yes—this includes both ( and ); they are grouping metacharacters, so use \( and \) (JS: "\\(" and "\\)"). [1]

Sources: [1] RE2 syntax (google/re2 wiki). [2] RE2 syntax reference (CRAN mirror).


escapeRE2 misses ) escaping, breaking repos/files with closing parentheses.

The regex character class escapes ( but not ). Since both are metacharacters in RE2, values containing ) will produce invalid or unintended patterns. The fix is to include both grouping characters.

Suggested fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[.+*?^${}()|[\]\\]/g, '\\$&');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}()|[\]\\]/g, '\\$&');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/chat/tools.ts` at line 14, escapeRE2's character
class currently omits the closing parenthesis, so strings containing ')' aren't
properly escaped; update the escapeRE2 function's regex (the char class in the
replace call inside escapeRE2) to include ')' alongside '(' so both grouping
metacharacters are escaped, ensuring the replace call escapes all RE2
metacharacters correctly.

Comment on lines +14 to +16
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n packages/web/src/features/mcp/server.ts | head -30

Repository: sourcebot-dev/sourcebot

Length of output: 1532


🏁 Script executed:

cat -n packages/web/src/features/mcp/server.ts | sed -n '140,160p'

Repository: sourcebot-dev/sourcebot

Length of output: 958


🏁 Script executed:

rg "escapeRE2" packages/web/src/features/mcp/server.ts -B 2 -A 2

Repository: sourcebot-dev/sourcebot

Length of output: 722


🌐 Web query:

RE2 regex metacharacters parentheses escaping Go

💡 Result:

In Go, the regexp package uses RE2 syntax, where parentheses ( and ) are metacharacters for grouping/capturing. To match literal parentheses, escape them as \( and \). [2]

Because Go string literals also treat \ as an escape (in double-quoted strings), you typically need double escaping:

regexp.MustCompile("\\(\\d+,\\d+\\)") // interpreted string literal

Or avoid double escaping by using a raw string literal (backticks):

regexp.MustCompile(`\(\d+,\d+\)`) // raw string literal

If you need to safely insert arbitrary text (including parentheses) into a regex, use regexp.QuoteMeta, which escapes all regex metacharacters for you. [1]

Sources: Go regexp docs (QuoteMeta). [1] RE2 syntax reference. [2] Go escaping-parentheses discussion/examples. [3]

[1] (pkg.go.dev)
[2] (github.com)
[3] (stackoverflow.com)

Citations:


🏁 Script executed:

rg "escapeRE2|filterByRepos|filterByFilepaths" packages/web/src/features/mcp/ -B 1 -A 1

Repository: sourcebot-dev/sourcebot

Length of output: 2149


🏁 Script executed:

find packages -name "*.test.ts" -o -name "*.test.tsx" -o -name "*.spec.ts" | xargs grep -l "escapeRE2\|server.ts" 2>/dev/null | head -5

Repository: sourcebot-dev/sourcebot

Length of output: 49


Escape ) in escapeRE2 as well.

The current pattern at line 16 escapes ( but not ). Both are metacharacters in RE2 used for grouping/capturing. Unescaped ) in repo names or filepaths will break regex pattern matching in Zoekt queries at lines 146 and 152.

Proposed fix
-const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
+const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[.+*?^${}[\]|(\\]/g, '\\$&');
// Escapes special RE2 regex characters using backslash (compatible with Zoekt/Go RE2).
// escape-string-regexp v5 uses \xNN hex escapes which RE2 does not support.
const escapeRE2 = (s: string) => s.replace(/[\\.^$|?*+()[\]{}]/g, '\\$&');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/web/src/features/mcp/server.ts` around lines 14 - 16, The regex in
escapeRE2 currently escapes "(" but omits ")", so add the closing parenthesis to
the character class used in the replace so that escapeRE2(s: string) properly
escapes both parentheses; update the character class in the escapeRE2 function
to include ')' so Zoekt queries that use the escaped repo/file patterns (used
where escapeRE2 is invoked around the Zoekt query construction) won't break at
the locations where those patterns are used.

Updated CHANGELOG.md to include fixes for repository filtering and added new features related to SMTP configuration and API key restrictions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[bug] Repository filtering (filterByRepos) not working in search_code tool

1 participant