diff --git a/.github/instructions/broken-access-control-prevention.instructions.md b/.github/instructions/broken-access-control-prevention.instructions.md
new file mode 100644
index 000000000..394e6cd3e
--- /dev/null
+++ b/.github/instructions/broken-access-control-prevention.instructions.md
@@ -0,0 +1,107 @@
+---
+applyTo: '**/*.py'
+---
+
+# Security: Broken Access Control Prevention
+
+## Critical Requirement
+
+**NEVER treat caller-supplied ids or stored active-scope settings as authorization decisions after login.**
+
+Treat all of the following as untrusted authorization inputs unless the code proves otherwise:
+
+- `conversation_id`, `message_id`, `document_id`, `file_id`, `approval_id`, `group_id`, and `public_workspace_id`
+- `activeGroupOid` and `activePublicWorkspaceOid` values loaded from user settings
+- Plugin or tool-call arguments such as `user_id`, `conversation_id`, `group_id`, `public_workspace_id`, `scope_id`, and `scope_type`
+
+## Preferred Safe Patterns
+
+Use these patterns by default:
+
+- Revalidate personal conversation ownership with `_authorize_personal_conversation_read(...)`, `_authorize_personal_conversation_access(...)`, or an explicit owner check before reading dependent data.
+- Route `activeGroupOid` writes through `update_active_group_for_user(...)`.
+- Route `activePublicWorkspaceOid` writes through `update_active_public_workspace_for_user(...)`.
+- Resolve active group scope through `require_active_group(...)` instead of raw settings reads in backend and plugin code.
+- Resolve active public workspace scope through `require_active_public_workspace(...)` instead of raw settings reads in backend and plugin code.
+- In Semantic Kernel plugins, normalize tool-call scope ids through `_resolve_authorized_scope_arguments(...)`, `_resolve_blob_location_with_fallback(...)`, or `_resolve_authorized_fact_memory_call(...)` before storage, blob, or Cosmos access.
+- Prefer request-scoped authorization context such as `g.authorized_chat_context` over raw tool arguments.
+
+## Disallowed Patterns For New Code
+
+Do not add new code that does any of the following without a reviewed exception:
+
+- Call `update_user_settings(...)` with a literal `{"activeGroupOid": ...}` payload outside `update_active_group_for_user(...)`
+- Call `update_user_settings(...)` with a literal `{"activePublicWorkspaceOid": ...}` payload outside `update_active_public_workspace_for_user(...)`
+- Read `activeGroupOid` or `activePublicWorkspaceOid` directly from raw settings in backend routes or plugins when a shared validator exists
+- Expose `user_id`, `conversation_id`, `group_id`, `public_workspace_id`, `scope_id`, or `scope_type` in a `@kernel_function` surface without immediately rebinding those values to the authorized request context
+- Read a personal conversation by request-derived `conversation_id` and continue to message, blob, or feedback work without an explicit ownership boundary
+
+## Safe Examples
+
+```python
+conversation_item = _authorize_personal_conversation_read(user_id, conversation_id)
+messages = list(
+    cosmos_messages_container.query_items(
+        query=query,
+        partition_key=conversation_item['id'],
+    )
+)
+```
+
+```python
+update_active_group_for_user(requested_active_group, user_id=user_id)
+active_group_id = require_active_group(user_id)
+```
+
+```python
+authorized_scope = self._resolve_authorized_fact_memory_call(
+    scope_type=scope_type,
+    scope_id=scope_id,
+    conversation_id=conversation_id,
+)
+```
+
+## Unsafe Examples
+
+```python
+update_user_settings(user_id, {'activeGroupOid': group_id})
+```
+
+```python
+active_group_id = settings.get('settings', {}).get('activeGroupOid')
+```
+
+```python
+@kernel_function(name='unsafe_tool')
+def unsafe_tool(self, user_id: str, conversation_id: str, group_id: str = ''):
+    return self.store.lookup(user_id=user_id, conversation_id=conversation_id, group_id=group_id)
+```
+
+```python
+conversation_item = cosmos_conversations_container.read_item(
+    item=conversation_id,
+    partition_key=conversation_id,
+)
+```
+
+## PR Review Checklist
+
+For any Python change that reads or mutates user, group, workspace, conversation, or plugin-scoped data:
+
+1. Identify every caller-controlled id that crosses into a data read or mutation.
+2. Revalidate ownership or membership at the sensitive operation boundary, not just at route entry.
+3. Use the dedicated active-scope validators instead of raw settings reads and writes.
+4. Rebind plugin scope parameters to the authorized request context before storage, blob, or Cosmos access.
+5. Add or update a regression test when the change touches an authorization boundary.
+
+## Workflow Guardrail
+
+This repository includes a Development PR check in `.github/workflows/broken-access-control-check.yml` backed by `scripts/check_broken_access_control.py`.
+
+If a reviewed exception is unavoidable, add the suppression token below near the specific line and include a justification comment:
+
+```text
+bac-check: ignore
+```
+
+Use that escape hatch rarely. It is for reviewed legacy exceptions, not normal route or plugin code.
\ No newline at end of file
diff --git a/.github/instructions/xss-prevention.instructions.md b/.github/instructions/xss-prevention.instructions.md
new file mode 100644
index 000000000..aa9d156d0
--- /dev/null
+++ b/.github/instructions/xss-prevention.instructions.md
@@ -0,0 +1,133 @@
+---
+applyTo: '**/*.js, **/*.html, **/*.py'
+---
+
+# Security: XSS Prevention and Browser Rendering
+
+## Critical Requirement
+
+**NEVER pass untrusted data into browser HTML or JavaScript execution sinks without an explicit safe boundary.**
+
+Treat all of the following as untrusted unless the code proves otherwise:
+
+- User profile fields, workspace names, group names, agent names, document titles, filenames, tags, descriptions, emails, and ids
+- API response values returned from storage, Microsoft Graph, Cosmos DB, Azure AI Search, or any plugin/tool response
+- Markdown, rich text, uploaded text files, generated summaries, model output, and any server-returned error string
+
+## Preferred Safe Patterns
+
+Use these patterns by default:
+
+- Create DOM nodes with `document.createElement(...)`
+- Set untrusted text with `textContent`
+- Set trusted static classes with `className`
+- Use `setAttribute(...)` or `dataset` for inert data only when DOM node creation is not practical
+- Attach behavior with `addEventListener(...)`
+- Normalize dynamic HTTP links with a helper such as `sanitizeHttpUrl(...)` before assigning `href` or `src`
+- Sanitize rendered markdown with `DOMPurify.sanitize(marked.parse(...))` before inserting HTML
+- Keep static modal or card shells fully static, then populate untrusted fields with DOM APIs after creation
+
+## Disallowed Patterns For New Code
+
+Do not add new code that does any of the following with untrusted values:
+
+- `innerHTML`, `outerHTML`, `insertAdjacentHTML`, or jQuery `.html(...)`
+- Inline event handlers such as `onclick=`, `onerror=`, `onload=`, or `setAttribute('onclick', ...)`
+- Dynamic interpolation into HTML attributes such as `href`, `src`, `title`, `style`, or `data-*`
+- `javascript:` URLs
+- `Markup(...)` in Python on untrusted content
+- Jinja `|safe` on untrusted content
+- `marked.parse(...)` output rendered without `DOMPurify.sanitize(...)`
+
+## Safe Examples
+
+### JavaScript
+
+```javascript
+const row = document.createElement('tr');
+const nameCell = document.createElement('td');
+nameCell.textContent = user.displayName || 'Unknown User';
+
+const actionButton = document.createElement('button');
+actionButton.type = 'button';
+actionButton.dataset.userId = user.id || '';
+actionButton.addEventListener('click', handleUserClick);
+
+row.appendChild(nameCell);
+row.appendChild(actionButton);
+```
+
+```javascript
+const renderedHtml = DOMPurify.sanitize(marked.parse(markdownText || ''));
+markdownContainer.innerHTML = renderedHtml;
+```
+
+### HTML / Jinja
+
+```html
+<td>{{ user.display_name }}</td>
+```
+
+### Python
+
+```python
+return render_template(
+    'page.html',
+    title=page_title,
+    items=items,
+)
+```
+
+## Unsafe Examples
+
+```javascript
+row.innerHTML = `<td>${user.displayName}</td>`;
+```
+
+```javascript
+button.setAttribute('onclick', `selectUser('${user.id}', '${user.displayName}')`);
+```
+
+```html
+<a href="javascript:runTask('{{ task.id }}')">Run</a>
+```
+
+```python
+return Markup(user_supplied_html)
+```
+
+```html
+{{ user_supplied_html|safe }}
+```
+
+## Static HTML Shell Exception
+
+When a static HTML shell is genuinely simpler, it is acceptable only if:
+
+- The HTML string is fully static
+- It contains no `${...}` interpolation or dynamic concatenation
+- Untrusted values are populated afterward with `textContent`, `setAttribute(...)`, or `dataset`
+
+## PR Review Checklist
+
+For any JavaScript, HTML, or Python change that affects browser rendering:
+
+1. Identify the trust boundary for every value that reaches the browser.
+2. Prefer DOM node creation and `textContent` for untrusted text.
+3. Normalize dynamic URLs before assigning them to clickable or loadable attributes.
+4. If HTML rendering is required, document the sanitizer boundary explicitly.
+5. Add or update a regression test when untrusted data reaches a browser-rendering path.
+
+## Workflow Guardrail
+
+This repository includes a Development PR check in `.github/workflows/xss-sink-check.yml` backed by `scripts/check_xss_sinks.py`.
+
+If a reviewed exception is unavoidable, add the suppression token below near the specific line and include a justification comment:
+
+```text
+xss-check: ignore
+```
+
+Use that escape hatch rarely. It is for reviewed legacy exceptions, not for normal rendering code.
\ No newline at end of file
diff --git a/.github/workflows/broken-access-control-check.yml b/.github/workflows/broken-access-control-check.yml
new file mode 100644
index 000000000..33869b28f
--- /dev/null
+++ b/.github/workflows/broken-access-control-check.yml
@@ -0,0 +1,66 @@
+name: Broken Access Control Check
+
+on:
+  pull_request:
+    branches:
+      - Development
+    paths:
+      - 'application/**/*.py'
+      - 'scripts/check_broken_access_control.py'
+      - 'functional_tests/test_broken_access_control_guardrails_checker.py'
+      - '.github/workflows/broken-access-control-check.yml'
+      - '.github/instructions/broken-access-control-prevention.instructions.md'
+
+jobs:
+  broken-access-control-check:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Get changed Python files
+        id: changed-files
+        uses: tj-actions/changed-files@v46.0.1
+        with:
+          files_yaml: |
+            bac_surface:
+              - 'application/**/*.py'
+            bac_guardrails:
+              - 'scripts/check_broken_access_control.py'
+              - 'functional_tests/test_broken_access_control_guardrails_checker.py'
+              - '.github/workflows/broken-access-control-check.yml'
+              - '.github/instructions/broken-access-control-prevention.instructions.md'
+              - 'docs/explanation/features/v0.241.022/BROKEN_ACCESS_CONTROL_PR_GUARDRAILS.md'
+
+      - name: Run Broken Access Control validation
+        env:
+          CHANGED_BAC_FILES: ${{ steps.changed-files.outputs.bac_surface_all_changed_files }}
+          GITHUB_BASE_SHA: ${{ github.event.pull_request.base.sha }}
+          GITHUB_HEAD_SHA: ${{ github.sha }}
+        run: |
+          if [[ -z "$CHANGED_BAC_FILES" ]]; then
+            echo "No changed application files detected for Broken Access Control validation."
+            exit 0
+          fi
+
+          echo "Changed application files:"
+          printf '%s\n' "$CHANGED_BAC_FILES" | tr ' ' '\n'
+
+          python scripts/check_broken_access_control.py \
+            --base-sha "$GITHUB_BASE_SHA" \
+            --head-sha "$GITHUB_HEAD_SHA" \
+            $CHANGED_BAC_FILES
+
+      - name: Run Broken Access Control guardrail self-test (advisory)
+        if: steps.changed-files.outputs.bac_guardrails_any_changed == 'true'
+        continue-on-error: true
+        run: |
+          python functional_tests/test_broken_access_control_guardrails_checker.py
\ No newline at end of file
diff --git a/.github/workflows/xss-sink-check.yml b/.github/workflows/xss-sink-check.yml
new file mode 100644
index 000000000..938df6061
--- /dev/null
+++ b/.github/workflows/xss-sink-check.yml
@@ -0,0 +1,70 @@
+name: XSS Sink Check
+
+on:
+  pull_request:
+    branches:
+      - Development
+    paths:
+      - 'application/**/*.js'
+      - 'application/**/*.html'
+      - 'application/**/*.py'
+      - 'scripts/check_xss_sinks.py'
+      - 'functional_tests/test_xss_guardrails_checker.py'
+      - '.github/workflows/xss-sink-check.yml'
+      - '.github/instructions/xss-prevention.instructions.md'
+
+jobs:
+  xss-sink-check:
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Get changed XSS-related files
+        id: changed-files
+        uses: tj-actions/changed-files@v46.0.1
+        with:
+          files_yaml: |
+            xss_surface:
+              - 'application/**/*.js'
+              - 'application/**/*.html'
+              - 'application/**/*.py'
+            xss_guardrails:
+              - 'scripts/check_xss_sinks.py'
+              - 'functional_tests/test_xss_guardrails_checker.py'
+              - '.github/workflows/xss-sink-check.yml'
+              - '.github/instructions/xss-prevention.instructions.md'
+              - 'docs/explanation/features/v0.241.021/XSS_PR_GUARDRAILS.md'
+
+      - name: Run XSS sink validation
+        env:
+          CHANGED_XSS_FILES: ${{ steps.changed-files.outputs.xss_surface_all_changed_files }}
+          GITHUB_BASE_SHA: ${{ github.event.pull_request.base.sha }}
+          GITHUB_HEAD_SHA: ${{ github.sha }}
+        run: |
+          if [[ -z "$CHANGED_XSS_FILES" ]]; then
+            echo "No changed application files detected for XSS sink validation."
+            exit 0
+          fi
+
+          echo "Changed application files:"
+          printf '%s\n' "$CHANGED_XSS_FILES" | tr ' ' '\n'
+
+          python scripts/check_xss_sinks.py \
+            --base-sha "$GITHUB_BASE_SHA" \
+            --head-sha "$GITHUB_HEAD_SHA" \
+            $CHANGED_XSS_FILES
+
+      - name: Run XSS guardrail self-test (advisory)
+        if: steps.changed-files.outputs.xss_guardrails_any_changed == 'true'
+        continue-on-error: true
+        run: |
+          python functional_tests/test_xss_guardrails_checker.py
\ No newline at end of file
diff --git a/application/single_app/config.py b/application/single_app/config.py
index 3ccb6ca98..4d0cb7f0a 100644
--- a/application/single_app/config.py
+++ b/application/single_app/config.py
@@ -94,7 +94,7 @@
 EXECUTOR_TYPE = 'thread'
 EXECUTOR_MAX_WORKERS = 30
 SESSION_TYPE = 'filesystem'
-VERSION = "0.241.007"
+VERSION = "0.241.022"
 
 SECRET_KEY = os.getenv('SECRET_KEY', 'dev-secret-key-change-in-production')
 
@@ -150,6 +150,7 @@ def get_allowed_extensions(enable_video=False, enable_audio=False):
     Args:
         enable_video: Whether video file support is enabled
+        enable_audio: Whether audio file support is enabled
 
     Returns:
         set: Allowed file extensions
     """
diff --git a/application/single_app/functions_agent_templates.py b/application/single_app/functions_agent_templates.py
index f5cda8a3f..af566ef88 100644
--- a/application/single_app/functions_agent_templates.py
+++ b/application/single_app/functions_agent_templates.py
@@ -102,10 +102,32 @@ def _serialize_additional_settings(raw: Any) -> str:
     return json.dumps(parsed, indent=2, sort_keys=True)
 
 
+def _normalize_actions_to_load(actions: Any, strict: bool = False) -> List[str]:
+    if actions in (None, ""):
+        return []
+    if not isinstance(actions, list):
+        if strict:
+            raise ValueError("actions_to_load must be an array of strings")
+        return []
+
+    cleaned: List[str] = []
+    for action in actions:
+        if isinstance(action, str):
+            trimmed = action.strip()
+        elif strict:
+            raise ValueError("actions_to_load entries must be strings")
+        else:
+            trimmed = str(action).strip()
+
+        if trimmed:
+            cleaned.append(trimmed)
+
+    return cleaned
+
+
 def _sanitize_template(doc: Dict[str, Any], include_internal: bool = False) -> Dict[str, Any]:
     cleaned = _strip_metadata(doc)
-    cleaned.setdefault('actions_to_load', [])
-    cleaned['actions_to_load'] = [a for a in cleaned['actions_to_load'] if a]
+    cleaned['actions_to_load'] = _normalize_actions_to_load(cleaned.get('actions_to_load'))
     cleaned.setdefault('tags', [])
     cleaned['tags'] = [str(tag)[:64] for tag in cleaned['tags']]
     cleaned['helper_text'] = _normalize_helper_text(
@@ -287,7 +309,7 @@ def _base_template_from_payload(payload: Dict[str, Any], user_info: Optional[Dic
     tags = payload.get('tags') or []
     tags = [str(tag)[:64] for tag in tags]
 
-    actions = [str(action) for action in (payload.get('actions_to_load') or []) if action]
+    actions = _normalize_actions_to_load(payload.get('actions_to_load'), strict=True)
 
     template = {
         'id': payload.get('id') or str(uuid.uuid4()),
@@ -366,6 +388,11 @@ def update_agent_template(template_id: str, updates: Dict[str, Any]) -> Optional
     else:
         payload['additional_settings'] = _parse_additional_settings(doc.get('additional_settings'))
 
+    if 'actions_to_load' in payload:
+        payload['actions_to_load'] = _normalize_actions_to_load(payload['actions_to_load'], strict=True)
+    else:
+        payload['actions_to_load'] = _normalize_actions_to_load(doc.get('actions_to_load'))
+
     if 'tags' in payload:
         payload['tags'] = [str(tag)[:64] for tag in payload['tags']]
diff --git a/application/single_app/functions_approvals.py b/application/single_app/functions_approvals.py
index a6c733467..b755d5c2c 100644
--- a/application/single_app/functions_approvals.py
+++ b/application/single_app/functions_approvals.py
@@ -277,7 +277,8 @@ def approve_request(
     approver_id: str,
     approver_email: str,
     approver_name: str,
-    comment: Optional[str] = None
+    comment: Optional[str] = None,
+    approval: Optional[Dict[str, Any]] = None,
 ) -> Dict[str, Any]:
     """
     Approve an approval request.
@@ -295,10 +296,11 @@
     """
     try:
         # Get the approval request
-        approval = cosmos_approvals_container.read_item(
-            item=approval_id,
-            partition_key=group_id
-        )
+        if approval is None:
+            approval = cosmos_approvals_container.read_item(
+                item=approval_id,
+                partition_key=group_id
+            )
 
         # Validate status
         if approval['status'] != STATUS_PENDING:
@@ -368,7 +370,8 @@ def deny_request(
     denier_email: str,
     denier_name: str,
     comment: str,
-    auto_denied: bool = False
+    auto_denied: bool = False,
+    approval: Optional[Dict[str, Any]] = None,
 ) -> Dict[str, Any]:
     """
     Deny an approval request.
@@ -387,10 +390,11 @@
     """
     try:
         # Get the approval request
-        approval = cosmos_approvals_container.read_item(
-            item=approval_id,
-            partition_key=group_id
-        )
+        if approval is None:
+            approval = cosmos_approvals_container.read_item(
+                item=approval_id,
+                partition_key=group_id
+            )
 
         # Validate status (allow denying pending requests)
         if approval['status'] not in [STATUS_PENDING]:
@@ -543,6 +547,29 @@ def get_approval_by_id(approval_id: str, group_id: str) -> Optional[Dict[str, An
     return None
 
 
+def get_authorized_approval(
+    approval_id: str,
+    group_id: str,
+    user_id: str,
+    user_roles: List[str],
+    require_approval_rights: bool = False,
+) -> Dict[str, Any]:
+    """Return an approval only if the current user is allowed to view or approve it."""
+    approval = get_approval_by_id(approval_id, group_id)
+    if not approval:
+        raise LookupError("Approval not found")
+
+    is_authorized = (
+        _can_user_approve(approval, user_id, user_roles)
+        if require_approval_rights
+        else _can_user_view(approval, user_id, user_roles)
+    )
+    if not is_authorized:
+        raise PermissionError("You are not authorized to access this approval")
+
+    return approval
+
+
 def auto_deny_expired_approvals() -> int:
     """
     Auto-deny approval requests that have expired (older than 3 days).
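The `get_authorized_approval` helper added above centralizes the not-found and permission checks that routes previously interleaved with Cosmos reads. A minimal standalone sketch of the same gate, using a toy in-memory store and simplified `_can_user_view` / `_can_user_approve` rules — the repository's real membership rules are not shown in this diff, so the rules below are assumptions for illustration only:

```python
from typing import Any, Dict, List, Optional

# Toy in-memory store standing in for cosmos_approvals_container (assumption).
_APPROVALS: Dict[tuple, Dict[str, Any]] = {
    ("appr-1", "grp-1"): {
        "id": "appr-1",
        "status": "pending",
        "requester_id": "u-1",
        "approver_ids": ["u-2"],
    },
}


def get_approval_by_id(approval_id: str, group_id: str) -> Optional[Dict[str, Any]]:
    return _APPROVALS.get((approval_id, group_id))


def _can_user_view(approval, user_id, user_roles) -> bool:
    # Simplified: requester, designated approvers, and Admins can view.
    return (user_id == approval["requester_id"]
            or user_id in approval["approver_ids"]
            or "Admin" in user_roles)


def _can_user_approve(approval, user_id, user_roles) -> bool:
    # Simplified: only designated approvers and Admins can approve.
    return user_id in approval["approver_ids"] or "Admin" in user_roles


def get_authorized_approval(approval_id, group_id, user_id, user_roles,
                            require_approval_rights=False):
    """Return an approval only if the current user may view or approve it."""
    approval = get_approval_by_id(approval_id, group_id)
    if not approval:
        raise LookupError("Approval not found")
    is_authorized = (_can_user_approve(approval, user_id, user_roles)
                     if require_approval_rights
                     else _can_user_view(approval, user_id, user_roles))
    if not is_authorized:
        raise PermissionError("You are not authorized to access this approval")
    return approval
```

A caller that hits the `PermissionError` branch can map it to a 403 response, keeping the authorization decision at the data-access boundary rather than at route entry.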
diff --git a/application/single_app/functions_documents.py b/application/single_app/functions_documents.py
index 7c6e4a272..362729ffd 100644
--- a/application/single_app/functions_documents.py
+++ b/application/single_app/functions_documents.py
@@ -1,5 +1,6 @@
 # functions_documents.py that has some changes I need to merge into Development
 
+import re
 import traceback
 from config import *
 from functions_content import *
@@ -20,6 +21,7 @@ def allowed_file(filename, allowed_extensions=None):
 ARCHIVED_SCOPE_PREFIX = "__archived__::"
 CURRENT_ALIAS_BLOB_PATH_MODE = "current_alias"
 ARCHIVED_REVISION_BLOB_PATH_MODE = "archived_revision"
+TAG_COLOR_PATTERN = re.compile(r'^#?(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$')
 
 
 def _get_blob_container_name(group_id=None, public_workspace_id=None):
@@ -7566,6 +7568,58 @@ def sanitize_tags_for_filter(raw_tags):
     return valid_tags
 
 
+def normalize_tag_color(color):
+    """
+    Normalize a tag color to a canonical 6-digit lowercase hex code.
+    Returns None for invalid values.
+    """
+    if not isinstance(color, str):
+        return None
+
+    normalized_color = color.strip()
+    if not normalized_color:
+        return None
+
+    if not TAG_COLOR_PATTERN.fullmatch(normalized_color):
+        return None
+
+    if not normalized_color.startswith('#'):
+        normalized_color = f'#{normalized_color}'
+
+    if len(normalized_color) == 4:
+        normalized_color = '#' + ''.join(component * 2 for component in normalized_color[1:])
+
+    return normalized_color.lower()
+
+
+def get_safe_tag_color(color, tag_name):
+    """
+    Return a normalized tag color or the deterministic default for the tag.
+    """
+    normalized_color = normalize_tag_color(color)
+    if normalized_color:
+        return normalized_color
+
+    safe_tag_name = normalize_tag(tag_name) or str(tag_name or '')
+    return get_default_tag_color(safe_tag_name)
+
+
+def validate_tag_color(color, tag_name):
+    """
+    Validate a requested tag color.
+    Returns (is_valid, error_message, normalized_color).
+    Missing colors resolve to the deterministic default for the tag.
+    """
+    if color is None:
+        return True, None, get_safe_tag_color(None, tag_name)
+
+    normalized_color = normalize_tag_color(color)
+    if not normalized_color:
+        return False, 'Tag color must be a valid 3- or 6-digit hex color', None
+
+    return True, None, normalized_color
+
+
 def get_workspace_tags(user_id, group_id=None, public_workspace_id=None):
     """
     Get all unique tags used in a workspace with document counts.
@@ -7662,7 +7716,7 @@ def get_workspace_tags(user_id, group_id=None, public_workspace_id=None):
             results.append({
                 'name': tag_name,
                 'count': count,
-                'color': tag_def.get('color', get_default_tag_color(tag_name))
+                'color': get_safe_tag_color(tag_def.get('color'), tag_name)
             })
 
     # Add defined tags that haven't been used yet (count = 0)
@@ -7671,7 +7725,7 @@ def get_workspace_tags(user_id, group_id=None, public_workspace_id=None):
             results.append({
                 'name': tag_name,
                 'count': 0,
-                'color': tag_def.get('color', get_default_tag_color(tag_name))
+                'color': get_safe_tag_color(tag_def.get('color'), tag_name)
             })
 
     # Sort by count descending, then name ascending
@@ -7728,34 +7782,40 @@ def get_or_create_tag_definition(user_id, tag_name, workspace_type='personal', c
     """
     from datetime import datetime, timezone
 
+    safe_color = get_safe_tag_color(color, tag_name)
+
     if workspace_type == 'group' and group_id:
         from functions_group import find_group_by_id
         group_doc = find_group_by_id(group_id)
         if not group_doc:
-            return {'color': color or get_default_tag_color(tag_name)}
+            return {'color': safe_color}
         tag_defs = group_doc.get('tag_definitions', {})
         if tag_name not in tag_defs:
             tag_defs[tag_name] = {
-                'color': color if color else get_default_tag_color(tag_name),
+                'color': safe_color,
                 'created_at': datetime.now(timezone.utc).isoformat()
             }
             group_doc['tag_definitions'] = tag_defs
             cosmos_groups_container.upsert_item(group_doc)
-        return tag_defs[tag_name]
+        stored_tag_def = dict(tag_defs[tag_name])
+        stored_tag_def['color'] = get_safe_tag_color(stored_tag_def.get('color'), tag_name)
+        return stored_tag_def
     elif workspace_type == 'public' and public_workspace_id:
         from functions_public_workspaces import find_public_workspace_by_id
         ws_doc = find_public_workspace_by_id(public_workspace_id)
         if not ws_doc:
-            return {'color': color or get_default_tag_color(tag_name)}
+            return {'color': safe_color}
         tag_defs = ws_doc.get('tag_definitions', {})
         if tag_name not in tag_defs:
             tag_defs[tag_name] = {
-                'color': color if color else get_default_tag_color(tag_name),
+                'color': safe_color,
                 'created_at': datetime.now(timezone.utc).isoformat()
            }
             ws_doc['tag_definitions'] = tag_defs
             cosmos_public_workspaces_container.upsert_item(ws_doc)
-        return tag_defs[tag_name]
+        stored_tag_def = dict(tag_defs[tag_name])
+        stored_tag_def['color'] = get_safe_tag_color(stored_tag_def.get('color'), tag_name)
+        return stored_tag_def
     else:
         # Personal: store in user settings
         from functions_settings import get_user_settings, update_user_settings
@@ -7771,12 +7831,14 @@ def get_or_create_tag_definition(user_id, tag_name, workspace_type='personal', c
 
         if tag_name not in workspace_tags:
             workspace_tags[tag_name] = {
-                'color': color if color else get_default_tag_color(tag_name),
+                'color': safe_color,
                 'created_at': datetime.now(timezone.utc).isoformat()
             }
             update_user_settings(user_id, {'tag_definitions': tag_definitions})
 
-        return workspace_tags[tag_name]
+        stored_tag_def = dict(workspace_tags[tag_name])
+        stored_tag_def['color'] = get_safe_tag_color(stored_tag_def.get('color'), tag_name)
+        return stored_tag_def
 
 
 def propagate_tags_to_blob_metadata(document_id, tags, user_id, group_id=None, public_workspace_id=None):
diff --git a/application/single_app/functions_group.py b/application/single_app/functions_group.py
index e50b09f60..2c60fab58 100644
--- a/application/single_app/functions_group.py
+++ b/application/single_app/functions_group.py
@@ -1,6 +1,8 @@
 # functions_group.py
 
 from config import *
+import functions_authentication
+import functions_settings
 from functions_authentication import *
 from functions_settings import *
 from typing import Iterable
@@ -103,12 +105,20 @@
     except exceptions.CosmosResourceNotFoundError:
         return None
 
-def update_active_group_for_user(group_id):
-    user_id = get_current_user_id()
+def update_active_group_for_user(group_id, user_id=None):
+    if not user_id:
+        user_id = functions_authentication.get_current_user_id()
+
+    assert_group_role(
+        user_id,
+        group_id,
+        allowed_roles=("Owner", "Admin", "DocumentManager", "User"),
+    )
+
     new_settings = {
         "activeGroupOid": group_id
     }
-    update_user_settings(user_id, new_settings)
+    functions_settings.update_user_settings(user_id, new_settings)
 
 def get_user_role_in_group(group_doc, user_id):
     """Determine the user's role in the given group doc."""
@@ -129,12 +139,17 @@
     return None
 
 
-def require_active_group(user_id: str) -> str:
-    """Return the active group id for a user or raise ValueError if missing."""
-    settings = get_user_settings(user_id)
+def require_active_group(
+    user_id: str,
+    allowed_roles: Iterable[str] = ("Owner", "Admin", "DocumentManager", "User"),
+) -> str:
+    """Return the active group id for a user after validating current membership."""
+    settings = functions_settings.get_user_settings(user_id)
     active_group_id = settings.get("settings", {}).get("activeGroupOid")
     if not active_group_id:
         raise ValueError("No active group selected")
+
+    assert_group_role(user_id, active_group_id, allowed_roles=allowed_roles)
+
     return active_group_id
diff --git a/application/single_app/functions_keyvault.py b/application/single_app/functions_keyvault.py
index a523eeaa9..c40977507 100644
--- a/application/single_app/functions_keyvault.py
+++ b/application/single_app/functions_keyvault.py
@@ -60,6 +60,109 @@ class SecretReturnType(Enum):
     NAME = "name"
 
 
+def _normalize_allowed_sources(allowed_sources):
+    """Normalize one or many allowed sources into a comparable set."""
+    if allowed_sources is None:
+        return None
+    if isinstance(allowed_sources, str):
+        return {allowed_sources}
+    return {
+        str(source).strip()
+        for source in allowed_sources
+        if str(source).strip()
+    }
+
+
+def parse_secret_name_dynamic(secret_name):
+    """Return parsed Key Vault secret reference parts when the name is valid."""
+    scopes_pattern = '|'.join(re.escape(scope) for scope in supported_scopes)
+    sources_pattern = '|'.join(re.escape(source) for source in supported_sources)
+    pattern = (
+        rf"^(?P<secret_name>.+?)--(?P<source>{sources_pattern})--"
+        rf"(?P<scope>{scopes_pattern})--(?P<scope_value>.+)$"
+    )
+    match = re.match(pattern, secret_name)
+    if not match:
+        return None
+    if len(secret_name) > 127:
+        return None
+    return match.groupdict()
+
+
+def secret_reference_matches_context(secret_name, scope_value=None, scope=None, allowed_sources=None):
+    """Return True when a secret reference belongs to the expected scope and source."""
+    parsed = parse_secret_name_dynamic(secret_name)
+    if not parsed:
+        return False
+
+    normalized_sources = _normalize_allowed_sources(allowed_sources)
+    expected_scope_value = None
+    if scope_value is not None:
+        expected_scope_value = clean_name_for_keyvault(str(scope_value))
+
+    if expected_scope_value is not None and parsed["scope_value"] != expected_scope_value:
+        return False
+    if scope is not None and parsed["scope"] != scope:
+        return False
+    if normalized_sources is not None and parsed["source"] not in normalized_sources:
+        return False
+    return True
+
+
+def _log_secret_reference_context_mismatch(secret_name, context_label, scope_value=None, scope=None, allowed_sources=None):
+    """Emit a warning when a stored secret reference does not match its expected context."""
+    parsed = parse_secret_name_dynamic(secret_name) or {}
+    expected_scope_value = None
+    if scope_value is not None:
+        expected_scope_value = clean_name_for_keyvault(str(scope_value))
+
+    log_event(
+        f"[KeyVault] Rejected mismatched secret reference for {context_label}.",
+        extra={
+            "context_label": context_label,
+            "expected_scope_value": expected_scope_value,
+            "expected_scope": scope,
+            "allowed_sources": sorted(_normalize_allowed_sources(allowed_sources) or []),
+            "provided_scope_value": parsed.get("scope_value"),
+            "provided_scope": parsed.get("scope"),
+            "provided_source": parsed.get("source"),
+        },
+        level=logging.WARNING,
+    )
+
+
+def resolve_secret_reference_for_context(
+    secret_name,
+    scope_value=None,
+    scope=None,
+    allowed_sources=None,
+    context_label="secret reference",
+):
+    """Resolve a Key Vault reference only when it matches the expected context."""
+    if not validate_secret_name_dynamic(secret_name):
+        return secret_name
+
+    if not secret_reference_matches_context(
+        secret_name,
+        scope_value=scope_value,
+        scope=scope,
+        allowed_sources=allowed_sources,
+    ):
+        _log_secret_reference_context_mismatch(
+            secret_name,
+            context_label,
+            scope_value=scope_value,
+            scope=scope,
+            allowed_sources=allowed_sources,
+        )
+        raise ValueError(f"Stored Key Vault reference for {context_label} does not match the expected scope.")
+
+    resolved_value = retrieve_secret_from_key_vault_by_full_name(secret_name)
+    if validate_secret_name_dynamic(resolved_value):
+        raise ValueError(f"Unable to resolve stored Key Vault secret for {context_label}.")
+    return resolved_value
+
+
 def _get_nested_dict_value(data, path):
     """Return a nested dictionary value, or None when the path is missing."""
     current = data
@@ -119,10 +222,28 @@ def _store_plugin_secret_reference(updated_plugin, existing_plugin, path, secret
     if not value:
         return
 
+    path_label = ".".join(path)
+
     existing_reference = _get_existing_secret_reference(existing_plugin, path)
 
     if value == ui_trigger_word:
         if existing_reference:
+            if not secret_reference_matches_context(
+                existing_reference,
+                scope_value=scope_value,
+                scope=scope,
+                allowed_sources={source},
+            ):
+                _log_secret_reference_context_mismatch(
+                    existing_reference,
+                    f"plugin field '{path_label}' existing reference",
+                    scope_value=scope_value,
+                    scope=scope,
+                    allowed_sources={source},
+                )
+                raise ValueError(
+                    f"Stored Key Vault reference for '{path_label}' no longer matches the expected scope. Re-enter the secret value."
+                )
             _set_nested_dict_value(updated_plugin, path, existing_reference)
             return
         _set_nested_dict_value(
@@ -133,6 +254,22 @@
         return
 
     if validate_secret_name_dynamic(value):
+        if not secret_reference_matches_context(
+            value,
+            scope_value=scope_value,
+            scope=scope,
+            allowed_sources={source},
+        ):
+            _log_secret_reference_context_mismatch(
+                value,
+                f"plugin field '{path_label}'",
+                scope_value=scope_value,
+                scope=scope,
+                allowed_sources={source},
+            )
+            raise ValueError(
+                f"Stored Key Vault reference for '{path_label}' does not match the expected scope."
+            )
         _set_nested_dict_value(updated_plugin, path, value)
         return
 
@@ -377,18 +514,7 @@
     Returns:
        bool: True if valid, False otherwise.
     """
-    # Build regex pattern dynamically
-    scopes_pattern = '|'.join(re.escape(scope) for scope in supported_scopes)
-    sources_pattern = '|'.join(re.escape(source) for source in supported_sources)
-    # Wildcards for secret_name and scope_value
-    pattern = rf"^(.+)--({sources_pattern})--({scopes_pattern})--(.+)$"
-    match = re.match(pattern, secret_name)
-    if not match:
-        return False
-    # Optionally, check length
-    if len(secret_name) > 127:
-        return False
-    return True
+    return parse_secret_name_dynamic(secret_name) is not None
 
 def keyvault_agent_save_helper(agent_dict, scope_value, scope="global"):
     """
@@ -616,8 +742,20 @@ def keyvault_plugin_get_helper(plugin_dict, scope_value, scope="global", return_
             value = auth.get(auth_field)
             if value and validate_secret_name_dynamic(value):
                 try:
+                    is_expected_reference = secret_reference_matches_context(
+                        value,
+                        scope_value=scope_value,
+                        scope=scope,
+                        allowed_sources={"action"},
+                    )
                     if return_type == SecretReturnType.VALUE:
-                        new_auth[auth_field] = retrieve_secret_from_key_vault_by_full_name(value)
+                        new_auth[auth_field] = resolve_secret_reference_for_context(
+                            value,
+                            scope_value=scope_value,
+                            scope=scope,
+                            allowed_sources={"action"},
+                            context_label=f"action auth field '{auth_field}'",
+                        )
                     elif return_type == SecretReturnType.NAME:
                         new_auth[auth_field] = value
                     else:
@@ -635,8 +773,20 @@ def keyvault_plugin_get_helper(plugin_dict, scope_value, scope="global", return_
         for k, v in additional_fields.items():
             if (k.endswith('__Secret') or _is_sql_sensitive_additional_field(updated, k)) and v and validate_secret_name_dynamic(v):
                 try:
+                    is_expected_reference = secret_reference_matches_context(
+                        v,
+                        scope_value=scope_value,
+                        scope=scope,
+                        allowed_sources={"action-addset"},
+                    )
                     if return_type == SecretReturnType.VALUE:
-                        new_additional_fields[k] = retrieve_secret_from_key_vault_by_full_name(v)
+                        new_additional_fields[k] = resolve_secret_reference_for_context(
+                            v,
+                            scope_value=scope_value,
+                            scope=scope,
+                            allowed_sources={"action-addset"},
+                            context_label=f"action additional field '{k}'",
+                        )
                     elif return_type == SecretReturnType.NAME:
                         new_additional_fields[k] = v
                     else:
@@ -834,6 +984,20 @@ def keyvault_plugin_delete_helper(plugin_dict, scope_value, scope="global"):
     for auth_field in ('key', *SQL_PLUGIN_SENSITIVE_AUTH_FIELDS):
         secret_name = auth.get(auth_field)
         if secret_name and validate_secret_name_dynamic(secret_name):
+            if not secret_reference_matches_context(
+                secret_name,
+                scope_value=scope_value,
+                scope=scope,
+                allowed_sources={"action"},
+            ):
+                _log_secret_reference_context_mismatch(
+                    secret_name,
+                    f"action auth field '{auth_field}' deletion",
+                    scope_value=scope_value,
+                    scope=scope,
+                    allowed_sources={"action"},
+                )
+                continue
            try:
                key_vault_url = f"https://{key_vault_name}{KEY_VAULT_DOMAIN}"
                log_event(f"Deleting action auth secret '{auth_field}' for action '{plugin_name}' for '{scope}' '{scope_value}'", level=logging.INFO)
@@ -847,6 +1011,20 @@ def keyvault_plugin_delete_helper(plugin_dict, scope_value, scope="global"):
     if isinstance(additional_fields, dict):
         for k, v in additional_fields.items():
             if (k.endswith('__Secret') or _is_sql_sensitive_additional_field(plugin_dict, k)) and v and validate_secret_name_dynamic(v):
+                if not secret_reference_matches_context(
+                    v,
+                    scope_value=scope_value,
+                    scope=scope,
+                    allowed_sources={"action-addset"},
+                ):
+                    _log_secret_reference_context_mismatch(
+                        v,
+                        f"action additional field '{k}' deletion",
+                        scope_value=scope_value,
+                        scope=scope,
+                        allowed_sources={"action-addset"},
+                    )
+                    continue
                try:
                    key_vault_url = f"https://{key_vault_name}{KEY_VAULT_DOMAIN}"
                    log_event(f"Deleting action additionalField secret '{k}' for action '{plugin_name}' for '{scope}' '{scope_value}'", level=logging.INFO)
diff --git a/application/single_app/functions_public_workspaces.py b/application/single_app/functions_public_workspaces.py
index 45e5f80e6..8039845f4 100644
--- a/application/single_app/functions_public_workspaces.py
+++ b/application/single_app/functions_public_workspaces.py
@@ -1,8 +1,10 @@
 # functions_public_workspaces.py
 
 from config import *
+import functions_settings
 from functions_authentication import *
 from functions_group import *
+from typing import Iterable
 
 def create_public_workspace(name: str, description: str) -> dict:
     """
@@ -114,13 +116,57 @@ def get_user_role_in_public_workspace(ws_doc: dict, user_id: str) -> str | None:
         return None
     if ws_doc.get("owner", {}).get("userId") == user_id:
         return "Owner"
-    if user_id in ws_doc.get("admins", []):
-        return "Admin"
+    for admin in ws_doc.get("admins", []):
+        if isinstance(admin, str) and admin == user_id:
+            return "Admin"
+        if isinstance(admin, dict) and admin.get("userId") == user_id:
+            return "Admin"
     if any(dm["userId"] == user_id for dm in ws_doc.get("documentManagers", [])):
         return "DocumentManager"
     return None
 
+
+def build_public_workspace_public_summary(ws_doc: dict) -> dict:
+    """Return the non-sensitive workspace fields safe for any authenticated caller."""
+    owner = ws_doc.get("owner", {}) or {}
+    return {
+        "id": ws_doc.get("id", ""),
+        "name": ws_doc.get("name", ""),
+        "description": ws_doc.get("description", ""),
+        "owner": {
+            "displayName": owner.get("displayName", ""),
+        },
+        "status": ws_doc.get("status", "active"),
+        "heroColor": ws_doc.get("heroColor", "#0078d4"),
+        "userRole": None,
+        "isMember": False,
+    }
+
+
+def build_public_workspace_member_payload(ws_doc: dict, user_id: str) -> dict:
+    """Return the workspace fields required by member-facing workspace pages."""
+    role = get_user_role_in_public_workspace(ws_doc, user_id)
+    owner = ws_doc.get("owner", {}) or {}
+    payload = {
+        "id": ws_doc.get("id", ""),
+        "name": ws_doc.get("name", ""),
+        "description": ws_doc.get("description", ""),
+        "owner": {
+            "displayName": owner.get("displayName", ""),
+            "email": owner.get("email", ""),
+        },
+        "status": ws_doc.get("status", "active"),
+        "heroColor": ws_doc.get("heroColor", "#0078d4"),
+        "userRole": role,
+        "isMember": bool(role),
+    }
+
+    if role in ("Owner", "Admin") and "retention_policy" in ws_doc:
+        payload["retention_policy"] = ws_doc.get("retention_policy")
+
+    return payload
+
+
 def is_user_in_public_workspace(ws_doc: dict, user_id: str) -> bool:
     """
     Check if a user has any role in the workspace.
@@ -224,9 +270,46 @@ def count_public_workspace_documents(ws_id: str) -> int:
 
 def update_active_public_workspace_for_user(user_id: str, ws_id: str) -> None:
     """
-    Persist the user's activePublicWorkspaceOid in their settings.
+    Persist the user's activePublicWorkspaceOid after validating the workspace.
     """
-    update_user_settings(user_id, {"activePublicWorkspaceOid": ws_id})
+    normalized_workspace_id = str(ws_id or "").strip()
+    if not normalized_workspace_id:
+        functions_settings.update_user_settings(user_id, {"activePublicWorkspaceOid": ""})
+        return
+
+    workspace_doc = find_public_workspace_by_id(normalized_workspace_id)
+    if not workspace_doc:
+        raise LookupError("Workspace not found")
+
+    functions_settings.update_user_settings(
+        user_id,
+        {"activePublicWorkspaceOid": normalized_workspace_id},
+    )
+
+
+def require_active_public_workspace(
+    user_id: str,
+    allowed_roles: Iterable[str] = ("Owner", "Admin", "DocumentManager"),
+) -> tuple[str, dict, str]:
+    """Return the active public workspace after validating it still exists and the user can access it."""
+    settings = functions_settings.get_user_settings(user_id)
+    active_workspace_id = str(settings.get("settings", {}).get("activePublicWorkspaceOid") or "").strip()
+    if not active_workspace_id:
+        raise ValueError("No active public workspace selected")
+
+    workspace_doc = find_public_workspace_by_id(active_workspace_id)
+    if not workspace_doc:
+        raise LookupError("Active public workspace not found")
+
+    role = get_user_role_in_public_workspace(workspace_doc, user_id)
+    if not role:
+        raise PermissionError("Access denied")
+
+    allowed = {allowed_role.lower() for allowed_role in allowed_roles}
+    if role.lower() not in allowed:
+        raise PermissionError("Access denied")
+
+    return active_workspace_id, workspace_doc, role
 
 def get_user_visible_public_workspaces(user_id: str) -> list:
diff --git a/application/single_app/functions_search.py b/application/single_app/functions_search.py
index 6851778f0..0859cf064 100644
--- a/application/single_app/functions_search.py
+++ b/application/single_app/functions_search.py
@@ -94,6 +94,22 @@ def build_tags_filter(tags_filter):
     tag_conditions = [f"document_tags/any(t: t eq '{tag}')" for tag in safe_tags]
     return " and ".join(tag_conditions)
 
+
+def _escape_odata_literal(value: Any) -> str:
+    """Escape a value for safe inclusion inside an OData single-quoted literal."""
+    return str(value or "").replace("'", "''")
+
+
+def _build_odata_eq(field_name: str, value: Any) -> str:
+    """Build a simple equality clause with an escaped OData literal."""
+    return f"{field_name} eq '{_escape_odata_literal(value)}'"
+
+
+def _build_odata_any_eq(collection_field: str, iterator_name: str, value: Any) -> str:
+    """Build an OData any(...) equality clause with an escaped literal."""
+    escaped_value = _escape_odata_literal(value)
+    return f"{collection_field}/any({iterator_name}: {iterator_name} eq '{escaped_value}')"
+
 def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12, doc_scope="all", active_group_id=None, active_group_ids=None, active_public_workspace_id=None, enable_file_sharing=True, tags_filter=None):
     """
     Hybrid search that queries the user doc index, group doc index, or public doc index
@@ -155,9 +171,9 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
     doc_id_filter = None
     if document_ids and len(document_ids) > 0:
         if len(document_ids) == 1:
-            doc_id_filter = f"document_id eq '{document_ids[0]}'"
+            doc_id_filter = _build_odata_eq("document_id", document_ids[0])
         else:
-            conditions = " or ".join([f"document_id eq '{did}'" for did in document_ids])
+            conditions = " or ".join([_build_odata_eq("document_id", did) for did in document_ids])
             doc_id_filter = f"({conditions})"
 
     # Generate cache key including document set fingerprints and tags filter
@@ -237,9 +253,9 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Build user filter with optional tags
             user_base_filter = (
                 (
-                    f"(user_id eq '{user_id}' or shared_user_ids/any(u: u eq '{user_id},approved')) "
+                    f"({_build_odata_eq('user_id', user_id)} or {_build_odata_any_eq('shared_user_ids', 'u', f'{user_id},approved')}) "
                     if enable_file_sharing else
-                    f"user_id eq '{user_id}' "
+                    f"{_build_odata_eq('user_id', user_id)} "
                 )
                 + f"and {doc_id_filter}"
             )
@@ -258,8 +274,11 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Only search group index if active_group_ids is provided
             if active_group_ids:
-                group_conditions = " or ".join([f"group_id eq '{gid}'" for gid in active_group_ids])
-                shared_conditions = " or ".join([f"shared_group_ids/any(g: g eq '{gid},approved')" for gid in active_group_ids])
+                group_conditions = " or ".join([_build_odata_eq("group_id", gid) for gid in active_group_ids])
+                shared_conditions = " or ".join([
+                    _build_odata_any_eq("shared_group_ids", "g", f"{gid},approved")
+                    for gid in active_group_ids
+                ])
                 group_base_filter = f"({group_conditions} or {shared_conditions}) and {doc_id_filter}"
                 group_filter = f"{group_base_filter} and {tags_filter_clause}" if tags_filter_clause else group_base_filter
@@ -282,11 +301,14 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Create filter for visible public workspaces
             if visible_public_workspace_ids:
                 # Use 'or' conditions instead of 'in' operator for OData compatibility
-                workspace_conditions = " or ".join([f"public_workspace_id eq '{id}'" for id in visible_public_workspace_ids])
+                workspace_conditions = " or ".join([
+                    _build_odata_eq("public_workspace_id", workspace_id)
+                    for workspace_id in visible_public_workspace_ids
+                ])
                 public_base_filter = f"({workspace_conditions}) and {doc_id_filter}"
             else:
                 # Fallback to active_public_workspace_id if no visible workspaces
-                public_base_filter = f"public_workspace_id eq '{active_public_workspace_id}' and {doc_id_filter}"
+                public_base_filter = f"{_build_odata_eq('public_workspace_id', active_public_workspace_id)} and {doc_id_filter}"
 
             public_filter = f"{public_base_filter} and {tags_filter_clause}" if tags_filter_clause else public_base_filter
@@ -303,9 +325,9 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
         else:
             # Build user filter with optional tags
             user_base_filter = (
-                f"(user_id eq '{user_id}' or shared_user_ids/any(u: u eq '{user_id},approved')) "
+                f"({_build_odata_eq('user_id', user_id)} or {_build_odata_any_eq('shared_user_ids', 'u', f'{user_id},approved')}) "
                 if enable_file_sharing else
-                f"user_id eq '{user_id}' "
+                f"{_build_odata_eq('user_id', user_id)} "
             )
             user_filter = f"{user_base_filter} and {tags_filter_clause}" if tags_filter_clause else user_base_filter.strip()
@@ -322,8 +344,11 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Only search group index if active_group_ids is provided
             if active_group_ids:
-                group_conditions = " or ".join([f"group_id eq '{gid}'" for gid in active_group_ids])
-                shared_conditions = " or ".join([f"shared_group_ids/any(g: g eq '{gid},approved')" for gid in active_group_ids])
+                group_conditions = " or ".join([_build_odata_eq("group_id", gid) for gid in active_group_ids])
+                shared_conditions = " or ".join([
+                    _build_odata_any_eq("shared_group_ids", "g", f"{gid},approved")
+                    for gid in active_group_ids
+                ])
                 group_base_filter = f"({group_conditions} or {shared_conditions})"
                 group_filter = f"{group_base_filter} and {tags_filter_clause}" if tags_filter_clause else group_base_filter
@@ -346,11 +371,14 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Create filter for visible public workspaces
             if visible_public_workspace_ids:
                 # Use 'or' conditions instead of 'in' operator for OData compatibility
-                workspace_conditions = " or ".join([f"public_workspace_id eq '{id}'" for id in visible_public_workspace_ids])
+                workspace_conditions = " or ".join([
+                    _build_odata_eq("public_workspace_id", workspace_id)
+                    for workspace_id in visible_public_workspace_ids
+                ])
                 public_base_filter = f"({workspace_conditions})"
             else:
                 # Fallback to active_public_workspace_id if no visible workspaces
-                public_base_filter = f"public_workspace_id eq '{active_public_workspace_id}'"
+                public_base_filter = _build_odata_eq("public_workspace_id", active_public_workspace_id)
 
             public_filter = f"{public_base_filter} and {tags_filter_clause}" if tags_filter_clause else public_base_filter
@@ -396,9 +424,9 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             if doc_id_filter:
                 user_base_filter = (
                     (
-                        f"(user_id eq '{user_id}' or shared_user_ids/any(u: u eq '{user_id},approved')) "
+                        f"({_build_odata_eq('user_id', user_id)} or {_build_odata_any_eq('shared_user_ids', 'u', f'{user_id},approved')}) "
                         if enable_file_sharing else
-                        f"user_id eq '{user_id}' "
+                        f"{_build_odata_eq('user_id', user_id)} "
                     )
                     + f"and {doc_id_filter}"
                 )
@@ -417,9 +445,9 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
                 results = extract_search_results(user_results, top_n)
             else:
                 user_base_filter = (
-                    f"(user_id eq '{user_id}' or shared_user_ids/any(u: u eq '{user_id},approved')) "
+                    f"({_build_odata_eq('user_id', user_id)} or {_build_odata_any_eq('shared_user_ids', 'u', f'{user_id},approved')}) "
                    if enable_file_sharing else
-                    f"user_id eq '{user_id}' "
+                    f"{_build_odata_eq('user_id', user_id)} "
                 )
                 user_filter = f"{user_base_filter} and {tags_filter_clause}" if tags_filter_clause else user_base_filter.strip()
@@ -439,8 +467,11 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             if not active_group_ids:
                 results = []
             elif doc_id_filter:
-                group_conditions = " or ".join([f"group_id eq '{gid}'" for gid in active_group_ids])
-                shared_conditions = " or ".join([f"shared_group_ids/any(g: g eq '{gid},approved')" for gid in active_group_ids])
+                group_conditions = " or ".join([_build_odata_eq("group_id", gid) for gid in active_group_ids])
+                shared_conditions = " or ".join([
+                    _build_odata_any_eq("shared_group_ids", "g", f"{gid},approved")
+                    for gid in active_group_ids
+                ])
                 group_base_filter = f"({group_conditions} or {shared_conditions}) and {doc_id_filter}"
                 group_filter = f"{group_base_filter} and {tags_filter_clause}" if tags_filter_clause else group_base_filter
@@ -456,8 +487,11 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
                 )
                 results = extract_search_results(group_results, top_n)
             else:
-                group_conditions = " or ".join([f"group_id eq '{gid}'" for gid in active_group_ids])
-                shared_conditions = " or ".join([f"shared_group_ids/any(g: g eq '{gid},approved')" for gid in active_group_ids])
+                group_conditions = " or ".join([_build_odata_eq("group_id", gid) for gid in active_group_ids])
+                shared_conditions = " or ".join([
+                    _build_odata_any_eq("shared_group_ids", "g", f"{gid},approved")
+                    for gid in active_group_ids
+                ])
                 group_base_filter = f"({group_conditions} or {shared_conditions})"
                 group_filter = f"{group_base_filter} and {tags_filter_clause}" if tags_filter_clause else group_base_filter
@@ -481,11 +515,14 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Create filter for visible public workspaces
             if visible_public_workspace_ids:
                 # Use 'or' conditions instead of 'in' operator for OData compatibility
-                workspace_conditions = " or ".join([f"public_workspace_id eq '{id}'" for id in visible_public_workspace_ids])
+                workspace_conditions = " or ".join([
+                    _build_odata_eq("public_workspace_id", workspace_id)
+                    for workspace_id in visible_public_workspace_ids
+                ])
                 public_base_filter = f"({workspace_conditions}) and {doc_id_filter}"
             else:
                 # Fallback to active_public_workspace_id if no visible workspaces
-                public_base_filter = f"public_workspace_id eq '{active_public_workspace_id}' and {doc_id_filter}"
+                public_base_filter = f"{_build_odata_eq('public_workspace_id', active_public_workspace_id)} and {doc_id_filter}"
 
             public_filter = f"{public_base_filter} and {tags_filter_clause}" if tags_filter_clause else public_base_filter
@@ -507,11 +544,14 @@ def hybrid_search(query, user_id, document_id=None, document_ids=None, top_n=12,
             # Create filter for visible public workspaces
             if visible_public_workspace_ids:
                 # Use 'or' conditions instead of 'in' operator for OData compatibility
-                workspace_conditions = " or ".join([f"public_workspace_id eq '{id}'" for id in visible_public_workspace_ids])
+                workspace_conditions = " or ".join([
+                    _build_odata_eq("public_workspace_id", workspace_id)
+                    for workspace_id in visible_public_workspace_ids
+                ])
                 public_base_filter = f"({workspace_conditions})"
             else:
                 # Fallback to active_public_workspace_id if no visible workspaces
-                public_base_filter = f"public_workspace_id eq '{active_public_workspace_id}'"
+                public_base_filter = _build_odata_eq("public_workspace_id", active_public_workspace_id)
 
             public_filter = f"{public_base_filter} and {tags_filter_clause}" if tags_filter_clause else public_base_filter
diff --git a/application/single_app/functions_settings.py b/application/single_app/functions_settings.py
index 8d09ee614..304660f42 100644
--- a/application/single_app/functions_settings.py
+++ b/application/single_app/functions_settings.py
@@ -1,5 +1,7 @@
 # functions_settings.py
 
+from flask import has_request_context
+
 from config import *
 from functions_appinsights import log_event
 import app_settings_cache
@@ -15,6 +17,43 @@ def is_tabular_processing_enabled(settings):
     """Tabular processing is available whenever enhanced citations is enabled."""
     return bool((settings or {}).get('enable_enhanced_citations', False))
 
+
+def _authorize_user_settings_access(user_id, operation, allow_cross_user=False):
+    """Authorize user-settings access for the current request context."""
+    normalized_user_id = str(user_id or '').strip()
+    if allow_cross_user or not has_request_context():
+        return None
+
+    try:
+        # Import locally to avoid a circular dependency during app startup.
+        from functions_authentication import get_current_user_id
+    except ImportError:
+        from application.single_app.functions_authentication import get_current_user_id
+
+    actor_user_id = str(get_current_user_id() or '').strip()
+    if actor_user_id and normalized_user_id and actor_user_id != normalized_user_id:
+        log_event(
+            f"[UserSettings] Denied cross-user {operation}",
+            {
+                "actor_user_id": actor_user_id,
+                "target_user_id": normalized_user_id,
+                "operation": operation,
+            },
+            level=logging.WARNING,
+        )
+        raise PermissionError(f"Cannot {operation} settings for another user.")
+
+    return actor_user_id or None
+
+
+def _should_sync_session_profile(target_user_id, actor_user_id, allow_cross_user=False):
+    """Return True when session-derived profile data should update the target settings doc."""
+    if allow_cross_user or not has_request_context():
+        return False
+    normalized_target_user_id = str(target_user_id or '').strip()
+    normalized_actor_user_id = str(actor_user_id or '').strip()
+    return bool(normalized_target_user_id and normalized_actor_user_id and normalized_target_user_id == normalized_actor_user_id)
 import copy
 from support_menu_config import (
     get_default_support_latest_features_visibility,
@@ -306,7 +345,7 @@ def get_settings(use_cosmos=False, include_source=False):
         'enable_web_search': False,
         'web_search_consent_accepted': False,
         'enable_web_search_user_notice': False,  # Show popup to users explaining their message will be sent to Bing
-        'web_search_user_notice_text': 'Your message will be sent to Microsoft Bing for web search. Only your current message is sent, not your conversation history.',
+        'web_search_user_notice_text': 'Your current message will be sent to Microsoft Bing for web search. Conversation history is not sent for web search, but any sensitive content you paste into this message may be sent.',
         'web_search_agent': {
             'agent_type': 'aifoundry',
             'azure_openai_gpt_endpoint': '',
@@ -1035,9 +1074,14 @@ def decrypt_key(encrypted_key):
         )
         return None
 
-def get_user_settings(user_id):
+def get_user_settings(user_id, allow_cross_user=False):
     """Fetches the user settings document from Cosmos DB, ensuring email and display_name are present if possible."""
-    from flask import session
+    actor_user_id = _authorize_user_settings_access(user_id, "read", allow_cross_user=allow_cross_user)
+    should_sync_session_profile = _should_sync_session_profile(
+        user_id,
+        actor_user_id,
+        allow_cross_user=allow_cross_user,
+    )
     try:
         doc = cosmos_user_settings_container.read_item(item=user_id, partition_key=user_id)
         updated = False
@@ -1058,27 +1102,62 @@ def get_user_settings(user_id):
             doc['settings']['showTutorialButtons'] = True
             updated = True
 
-        # Try to update email/display_name if missing and available in session
-        user = session.get("user", {})
-        email = user.get("preferred_username") or user.get("email")
-        display_name = user.get("name")
-        if email and doc.get("email") != email:
-            doc["email"] = email
-            updated = True
-        if display_name and doc.get("display_name") != display_name:
-            doc["display_name"] = display_name
-            updated = True
-
-        # Check if profile image needs to be fetched
-        if 'profileImage' not in doc['settings']:
+        if should_sync_session_profile:
+            # Try to update email/display_name if missing and available in session
+            user = session.get("user", {})
+            email = user.get("preferred_username") or user.get("email")
+            display_name = user.get("name")
+            if email and doc.get("email") != email:
+                doc["email"] = email
+                updated = True
+            if display_name and doc.get("display_name") != display_name:
+                doc["display_name"] = display_name
+                updated = True
+
+            # Check if profile image needs to be fetched
+            if 'profileImage' not in doc['settings']:
+                from functions_authentication import get_user_profile_image
+                try:
+                    profile_image = get_user_profile_image()
+                    doc['settings']['profileImage'] = profile_image
+                    updated = True
+                except Exception as e:
+                    log_event(
+                        "Could not fetch profile image for existing user.",
+                        extra={
+                            "user_id": user_id,
+                            "error": str(e)
+                        },
+                        level=logging.WARNING
+                    )
+                    doc['settings']['profileImage'] = None
+                    updated = True
+
+        if updated:
+            cosmos_user_settings_container.upsert_item(body=doc)
+        return doc
+    except exceptions.CosmosResourceNotFoundError:
+        # Return a default structure if the user has no settings saved yet
+        doc = {"id": user_id, "settings": {}}
+        doc["settings"]["personal_model_endpoints"] = []
+        doc["settings"]["showTutorialButtons"] = True
+        if should_sync_session_profile:
+            user = session.get("user", {})
+            email = user.get("preferred_username") or user.get("email")
+            display_name = user.get("name")
+            if email:
+                doc["email"] = email
+            if display_name:
+                doc["display_name"] = display_name
+
+        # Try to fetch profile image for new user
         from functions_authentication import get_user_profile_image
         try:
             profile_image = get_user_profile_image()
             doc['settings']['profileImage'] = profile_image
-            updated = True
         except Exception as e:
             log_event(
-                "Could not fetch profile image for existing user.",
+                "Could not fetch profile image for new user.",
                 extra={
                     "user_id": user_id,
                     "error": str(e)
@@ -1086,39 +1165,6 @@ def get_user_settings(user_id):
                 level=logging.WARNING
             )
             doc['settings']['profileImage'] = None
-            updated = True
-
-        if updated:
-            cosmos_user_settings_container.upsert_item(body=doc)
-        return doc
-    except exceptions.CosmosResourceNotFoundError:
-        # Return a default structure if the user has no settings saved yet
-        user = session.get("user", {})
-        email = user.get("preferred_username") or user.get("email")
-        display_name = user.get("name")
-        doc = {"id": user_id, "settings": {}}
-        doc["settings"]["personal_model_endpoints"] = []
-        doc["settings"]["showTutorialButtons"] = True
-        if email:
-            doc["email"] = email
-        if display_name:
-            doc["display_name"] = display_name
-
-        # Try to fetch profile image for new user
-        from functions_authentication import get_user_profile_image
-        try:
-            profile_image = get_user_profile_image()
-            doc['settings']['profileImage'] = profile_image
-        except Exception as e:
-            log_event(
-                "Could not fetch profile image for new user.",
-                extra={
-                    "user_id": user_id,
-                    "error": str(e)
-                },
-                level=logging.WARNING
-            )
-            doc['settings']['profileImage'] = None
 
         cosmos_user_settings_container.upsert_item(body=doc)
         return doc
@@ -1134,7 +1180,7 @@ def get_user_settings(user_id):
     )
     raise  # Re-raise the exception to be handled by the route
 
-def update_user_settings(user_id, settings_to_update):
+def update_user_settings(user_id, settings_to_update, allow_cross_user=False):
     """
     Updates or creates user settings in Cosmos DB, merging new settings into
     the existing 'settings' sub-dictionary and updating 'lastUpdated'.
@@ -1147,8 +1193,21 @@ def update_user_settings(user_id, settings_to_update):
 
     Returns:
         bool: True if the update was successful, False otherwise.
     """
+    actor_user_id = _authorize_user_settings_access(
+        user_id,
+        "update",
+        allow_cross_user=allow_cross_user,
+    )
     sanitized_settings_to_update = sanitize_settings_for_logging(settings_to_update)
-    log_event("[UserSettings] Update Attempt", {"user_id": user_id, "settings_to_update": sanitized_settings_to_update})
+    log_event(
+        "[UserSettings] Update Attempt",
+        {
+            "user_id": user_id,
+            "actor_user_id": actor_user_id,
+            "allow_cross_user": allow_cross_user,
+            "settings_to_update": sanitized_settings_to_update,
+        },
+    )
     try:
diff --git a/application/single_app/route_backend_chats.py b/application/single_app/route_backend_chats.py
index e16d72423..4d31db45f 100644
--- a/application/single_app/route_backend_chats.py
+++ b/application/single_app/route_backend_chats.py
@@ -218,6 +218,235 @@ def build_fact_memory_citation(query_text, matched_facts, search_mode):
     }
 
 
+def _normalize_requested_scope_ids(*scope_values):
+    """Normalize single-value and list-based scope ids into a de-duplicated list."""
+    normalized_values = []
+    for scope_value in scope_values:
+        if scope_value is None:
+            continue
+
+        if isinstance(scope_value, (list, tuple, set)):
+            candidates = list(scope_value)
+        else:
+            candidates = [scope_value]
+
+        for candidate in candidates:
+            normalized_candidate = str(candidate or '').strip()
+            if not normalized_candidate or normalized_candidate in normalized_values:
+                continue
+            normalized_values.append(normalized_candidate)
+
+    return normalized_values
+
+
+def _get_authorized_chat_scope_context(
+    user_id,
+    active_group_id=None,
+    active_group_ids=None,
+    active_public_workspace_id=None,
+    active_public_workspace_ids=None,
+):
+    """Filter request-provided chat scopes down to the caller's current access."""
+    requested_group_ids = _normalize_requested_scope_ids(active_group_ids, active_group_id)
+    allowed_group_ids = []
+    for group_id in requested_group_ids:
+        group_doc = find_group_by_id(group_id)
+        if group_doc and get_user_role_in_group(group_doc, user_id):
+            allowed_group_ids.append(group_id)
+
+    requested_public_workspace_ids = _normalize_requested_scope_ids(
+        active_public_workspace_ids,
+        active_public_workspace_id,
+    )
+    visible_public_workspace_ids = set(
+        _normalize_requested_scope_ids(get_user_visible_public_workspace_ids_from_settings(user_id) or [])
+    )
+    allowed_public_workspace_ids = [
+        workspace_id
+        for workspace_id in requested_public_workspace_ids
+        if workspace_id in visible_public_workspace_ids
+    ]
+
+    return {
+        'active_group_ids': allowed_group_ids,
+        'active_group_id': allowed_group_ids[0] if allowed_group_ids else None,
+        'active_public_workspace_ids': allowed_public_workspace_ids,
+        'active_public_workspace_id': (
+            allowed_public_workspace_ids[0] if allowed_public_workspace_ids else None
+        ),
+    }
+
+
+def _set_authorized_chat_request_context(user_id, conversation_id, scope_context):
+    """Persist the canonical request authorization context for downstream plugin checks."""
+    authorized_context = {
+        'user_id': user_id,
+        'conversation_id': conversation_id,
+        'active_group_ids': list(scope_context.get('active_group_ids') or []),
+        'active_group_id': scope_context.get('active_group_id'),
+        'active_public_workspace_ids': list(scope_context.get('active_public_workspace_ids') or []),
+        'active_public_workspace_id': scope_context.get('active_public_workspace_id'),
+    }
+    authorized_context['fact_memory_scope_id'] = authorized_context['active_group_id'] or user_id
+    authorized_context['fact_memory_scope_type'] = (
+        'group' if authorized_context['active_group_id'] else 'user'
+    )
+
+    g.conversation_id = conversation_id
+    g.authorized_chat_context = authorized_context
+    return authorized_context
+
+
+def _resolve_chat_selected_document_metadata(document_id, user_id=None, document_scope='personal',
+                                             active_group_id=None, active_group_ids=None,
+                                             active_public_workspace_id=None,
+                                             active_public_workspace_ids=None):
+    """Resolve selected-document metadata using the authorized chat scope model."""
+    normalized_document_id = str(document_id or '').strip()
+    if not normalized_document_id or normalized_document_id == 'all':
+        return None
+
+    normalized_scope = str(document_scope or 'personal').strip().lower()
+    authorized_group_ids = _normalize_requested_scope_ids(active_group_ids, active_group_id)
+    authorized_public_workspace_ids = _normalize_requested_scope_ids(
+        active_public_workspace_ids,
+        active_public_workspace_id,
+    )
+
+    resolution_queries = []
+
+    if normalized_scope in {'personal', 'workspace', 'all'} and user_id:
+        resolution_queries.append({
+            'source_hint': 'workspace',
+            'cosmos_container': cosmos_user_documents_container,
+            'query': """
+                SELECT TOP 1 c.id, c.file_name, c.title, c.group_id, c.public_workspace_id
+                FROM c
+                WHERE c.id = @doc_id
+                  AND (
+                      c.user_id = @user_id
+                      OR ARRAY_CONTAINS(c.shared_user_ids, @user_id)
+                      OR EXISTS(SELECT VALUE s FROM s IN c.shared_user_ids WHERE STARTSWITH(s, @user_id_prefix))
+                  )
+                ORDER BY c.version DESC
+            """,
+            'parameters': [
+                {'name': '@doc_id', 'value': normalized_document_id},
+                {'name': '@user_id', 'value': user_id},
+                {'name': '@user_id_prefix', 'value': f"{user_id},"},
+            ],
+        })
+
+    if normalized_scope in {'group', 'all'}:
+        for group_id in authorized_group_ids:
+            resolution_queries.append({
+                'source_hint': 'group',
+                'cosmos_container': cosmos_group_documents_container,
+                'query': """
+                    SELECT TOP 1 c.id, c.file_name, c.title, c.group_id, c.public_workspace_id
+                    FROM c
+                    WHERE c.id = @doc_id
+                      AND (
+                          c.group_id = @group_id
+                          OR ARRAY_CONTAINS(c.shared_group_ids, @group_id)
+                          OR ARRAY_CONTAINS(c.shared_group_ids, @group_id_approved)
+                      )
+                    ORDER BY c.version DESC
+                """,
+                'parameters': [
+                    {'name': '@doc_id', 'value': normalized_document_id},
+                    {'name': '@group_id', 'value': group_id},
+                    {'name': '@group_id_approved', 'value': f"{group_id},approved"},
+                ],
+            })
+
+    if normalized_scope in {'public', 'all'}:
+        for public_workspace_id in authorized_public_workspace_ids:
+            resolution_queries.append({
+                'source_hint': 'public',
+                'cosmos_container': cosmos_public_documents_container,
+                'query': """
+                    SELECT TOP 1 c.id, c.file_name, c.title, c.group_id, c.public_workspace_id
+                    FROM c
+                    WHERE c.id = @doc_id
+                      AND c.public_workspace_id = @public_workspace_id
+                    ORDER BY c.version DESC
+                """,
+                'parameters': [
+                    {'name': '@doc_id', 'value': normalized_document_id},
+                    {'name': '@public_workspace_id', 'value': public_workspace_id},
+                ],
+            })
+
+    for resolution_query in resolution_queries:
+        doc_results = list(resolution_query['cosmos_container'].query_items(
+            query=resolution_query['query'],
+            parameters=resolution_query['parameters'],
+            enable_cross_partition_query=True,
+        ))
+        if not doc_results:
+            continue
+
+        doc_info = dict(doc_results[0])
+        doc_info['source_hint'] = resolution_query['source_hint']
+        return doc_info
+
+    return None
+
+
+def _create_personal_conversation(user_id, conversation_id=None):
+    """Create and persist a new personal conversation owned by the current user."""
+    resolved_conversation_id = str(conversation_id or uuid.uuid4())
+    conversation_item = {
+        'id': resolved_conversation_id,
+        'user_id': user_id,
+        'last_updated': datetime.utcnow().isoformat(),
+        'title': 'New Conversation',
+        'context': [],
+        'tags': [],
+        'strict': False,
+        'chat_type': 'new'
+    }
+    cosmos_conversations_container.upsert_item(conversation_item)
+
+    log_conversation_creation(
+        user_id=user_id,
+        conversation_id=resolved_conversation_id,
+        title='New Conversation',
+        workspace_type='personal'
+    )
+
+    conversation_item['added_to_activity_log'] = True
+    cosmos_conversations_container.upsert_item(conversation_item)
+    return conversation_item
+
+
+def _authorize_personal_conversation_access(user_id, conversation_id):
+    """Load a personal conversation and ensure the caller owns it."""
+    try:
+        conversation_item = cosmos_conversations_container.read_item(
+            item=conversation_id,
+            partition_key=conversation_id,
+        )
+    except CosmosResourceNotFoundError as exc:
+        raise LookupError(f"Conversation {conversation_id} not found") from exc
+
+    if conversation_item.get('user_id') != user_id:
+        raise PermissionError('You can only access your own conversations')
+
+    return conversation_item
+
+
+def _resolve_or_create_authorized_personal_conversation(user_id, conversation_id):
+    """Create new personal conversations server-side or load an authorized existing one."""
+    if not conversation_id:
+        conversation_item = _create_personal_conversation(user_id)
+        return conversation_item, conversation_item['id']
+
+    conversation_item = _authorize_personal_conversation_access(user_id, conversation_id)
+    return conversation_item, conversation_id
+
+
 def build_instruction_memory_payload(
     scope_id,
     scope_type,
@@ -5127,8 +5356,10 @@ def infer_tabular_source_context_from_document(source_doc, document_scope='perso
 
 def get_selected_workspace_tabular_file_contexts(selected_document_ids=None, selected_document_id=None,
-                                                 document_scope='personal', active_group_id=None,
-                                                 active_public_workspace_id=None):
+                                                 document_scope='personal', user_id=None,
+                                                 active_group_id=None, active_group_ids=None,
+                                                 active_public_workspace_id=None,
+                                                 active_public_workspace_ids=None):
     """Resolve explicitly selected workspace documents and return tabular source contexts."""
     selected_ids = list(selected_document_ids or [])
     if not selected_ids and selected_document_id and selected_document_id != 'all':
@@ -5144,33 +5375,26 @@ def get_selected_workspace_tabular_file_contexts(selected_document_ids=None, sel
             continue
 
         try:
-            doc_query = (
-                "SELECT TOP 1 c.file_name, c.title, c.group_id, c.public_workspace_id "
-                "FROM c WHERE c.id = @doc_id "
-                "ORDER BY c.version DESC"
+            doc_info = _resolve_chat_selected_document_metadata(
+                doc_id,
+                user_id=user_id,
+                document_scope=document_scope,
+                active_group_id=active_group_id,
+                active_group_ids=active_group_ids,
+                active_public_workspace_id=active_public_workspace_id,
+
active_public_workspace_ids=active_public_workspace_ids, ) - doc_params = [{"name": "@doc_id", "value": doc_id}] - - for source_hint, cosmos_container in get_document_containers_for_scope(document_scope): - doc_results = list(cosmos_container.query_items( - query=doc_query, - parameters=doc_params, - enable_cross_partition_query=True - )) - - if not doc_results: - continue + if not doc_info: + continue - doc_info = doc_results[0] - file_context = build_tabular_file_context( - doc_info.get('file_name') or doc_info.get('title'), - source_hint=source_hint, - group_id=doc_info.get('group_id') or active_group_id, - public_workspace_id=doc_info.get('public_workspace_id') or active_public_workspace_id, - ) - if file_context: - tabular_file_contexts.append(file_context) - break + file_context = build_tabular_file_context( + doc_info.get('file_name') or doc_info.get('title'), + source_hint=doc_info.get('source_hint', 'workspace'), + group_id=doc_info.get('group_id') or active_group_id, + public_workspace_id=doc_info.get('public_workspace_id') or active_public_workspace_id, + ) + if file_context: + tabular_file_contexts.append(file_context) except Exception as e: log_event( f"[Tabular SK Analysis] Failed to resolve selected document '{doc_id}': {e}", @@ -5182,7 +5406,10 @@ def get_selected_workspace_tabular_file_contexts(selected_document_ids=None, sel def collect_workspace_tabular_file_contexts(combined_documents=None, selected_document_ids=None, selected_document_id=None, document_scope='personal', - active_group_id=None, active_public_workspace_id=None): + user_id=None, active_group_id=None, + active_group_ids=None, + active_public_workspace_id=None, + active_public_workspace_ids=None): """Collect tabular source contexts from search results and explicit workspace selection.""" tabular_file_contexts = [] @@ -5200,8 +5427,11 @@ def collect_workspace_tabular_file_contexts(combined_documents=None, selected_do selected_document_ids=selected_document_ids, 
selected_document_id=selected_document_id, document_scope=document_scope, + user_id=user_id, active_group_id=active_group_id, + active_group_ids=active_group_ids, active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, )) return dedupe_tabular_file_contexts(tabular_file_contexts) @@ -5209,15 +5439,21 @@ def collect_workspace_tabular_file_contexts(combined_documents=None, selected_do def collect_workspace_tabular_filenames(combined_documents=None, selected_document_ids=None, selected_document_id=None, document_scope='personal', - active_group_id=None, active_public_workspace_id=None): + user_id=None, active_group_id=None, + active_group_ids=None, + active_public_workspace_id=None, + active_public_workspace_ids=None): """Collect unique tabular filenames from search results and explicit workspace selection.""" tabular_file_contexts = collect_workspace_tabular_file_contexts( combined_documents=combined_documents, selected_document_ids=selected_document_ids, selected_document_id=selected_document_id, document_scope=document_scope, + user_id=user_id, active_group_id=active_group_id, + active_group_ids=active_group_ids, active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, ) return {file_context['file_name'] for file_context in tabular_file_contexts} @@ -6031,22 +6267,19 @@ def result_requires_message_reload(result: Any) -> bool: active_group_id = data.get('active_group_id') active_group_ids = data.get('active_group_ids', []) - # Backwards compat: if new list not provided, wrap single ID - if not active_group_ids and active_group_id: - active_group_ids = [active_group_id] - # Permission validation: only keep groups user is a member of - validated_group_ids = [] - for gid in active_group_ids: - g_doc = find_group_by_id(gid) - if g_doc and get_user_role_in_group(g_doc, user_id): - validated_group_ids.append(gid) - active_group_ids = validated_group_ids - # 
Keep single ID for backwards compat in metadata/context - active_group_id = active_group_ids[0] if active_group_ids else data.get('active_group_id') active_public_workspace_id = data.get('active_public_workspace_id') # Extract active public workspace ID active_public_workspace_ids = data.get('active_public_workspace_ids', []) - if not active_public_workspace_ids and active_public_workspace_id: - active_public_workspace_ids = [active_public_workspace_id] + scope_context = _get_authorized_chat_scope_context( + user_id, + active_group_id=active_group_id, + active_group_ids=active_group_ids, + active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, + ) + active_group_ids = scope_context['active_group_ids'] + active_group_id = scope_context['active_group_id'] + active_public_workspace_ids = scope_context['active_public_workspace_ids'] + active_public_workspace_id = scope_context['active_public_workspace_id'] frontend_gpt_model = data.get('model_deployment') top_n_results = data.get('top_n') # Extract top_n parameter from request classifications_to_send = data.get('classifications') # Extract classifications parameter from request @@ -6064,20 +6297,12 @@ def result_requires_message_reload(result: Any) -> bool: operation_type = 'Edit' if is_edit else 'Retry' debug_print(f"🔍 Chat API - {operation_type} detected! 
user_message_id={retry_user_message_id}, thread_id={retry_thread_id}, attempt={retry_thread_attempt}") - # Store conversation_id in Flask context for plugin logger access - g.conversation_id = conversation_id - - # Clear plugin invocations at start of message processing to ensure - # each message only shows citations for tools executed during that specific interaction - from semantic_kernel_plugins.plugin_invocation_logger import get_plugin_logger - plugin_logger = get_plugin_logger() - plugin_logger.clear_invocations_for_conversation(user_id, conversation_id) - # Validate chat_type if chat_type not in ('user', 'group'): chat_type = 'user' search_query = user_message # <--- ADD THIS LINE (Initialize search_query) + web_search_query_text = build_web_search_query_text(user_message) hybrid_citations_list = [] # <--- ADD THIS LINE (Initialize hybrid list) agent_citations_list = [] # <--- ADD THIS LINE (Initialize agent citations list) web_search_citations_list = [] @@ -6236,65 +6461,25 @@ def result_requires_message_reload(result: Any) -> bool: # --------------------------------------------------------------------- # 1) Load or create conversation # --------------------------------------------------------------------- - if not conversation_id: - conversation_id = str(uuid.uuid4()) - conversation_item = { - 'id': conversation_id, - 'user_id': user_id, - 'last_updated': datetime.utcnow().isoformat(), - 'title': 'New Conversation', - 'context': [], - 'tags': [], - 'strict': False, - 'chat_type': 'new' - } - cosmos_conversations_container.upsert_item(conversation_item) - - # Log conversation creation - log_conversation_creation( - user_id=user_id, - conversation_id=conversation_id, - title='New Conversation', - workspace_type='personal' + try: + conversation_item, conversation_id = _resolve_or_create_authorized_personal_conversation( + user_id, + conversation_id, ) - - # Mark as logged to activity logs to prevent duplicate migration - 
conversation_item['added_to_activity_log'] = True - cosmos_conversations_container.upsert_item(conversation_item) - else: - try: - conversation_item = cosmos_conversations_container.read_item(item=conversation_id, partition_key=conversation_id) - except CosmosResourceNotFoundError: - # If conversation ID is provided but not found, create a new one with that ID - # Or decide if you want to return an error instead - conversation_item = { - 'id': conversation_id, # Keep the provided ID if needed for linking - 'user_id': user_id, - 'last_updated': datetime.utcnow().isoformat(), - 'title': 'New Conversation', # Or maybe fetch title differently? - 'context': [], - 'tags': [], - 'strict': False, - 'chat_type': 'new' - } - # Optionally log that a conversation was expected but not found - debug_print(f"Warning: Conversation ID {conversation_id} not found, creating new.") - cosmos_conversations_container.upsert_item(conversation_item) - - # Log conversation creation - log_conversation_creation( - user_id=user_id, - conversation_id=conversation_id, - title='New Conversation', - workspace_type='personal' - ) - - # Mark as logged to activity logs to prevent duplicate migration - conversation_item['added_to_activity_log'] = True - cosmos_conversations_container.upsert_item(conversation_item) - except Exception as e: - debug_print(f"Error reading conversation {conversation_id}: {e}") - return jsonify({'error': f'Error reading conversation: {str(e)}'}), 500 + except LookupError: + return jsonify({'error': 'Conversation not found'}), 404 + except PermissionError: + return jsonify({'error': 'Forbidden'}), 403 + except Exception as e: + debug_print(f"Error reading conversation {conversation_id}: {e}") + return jsonify({'error': f'Error reading conversation: {str(e)}'}), 500 + + _set_authorized_chat_request_context(user_id, conversation_id, scope_context) + + # Clear plugin invocations at start of message processing to ensure + # each message only shows citations for tools executed 
during that specific interaction + plugin_logger = get_plugin_logger() + plugin_logger.clear_invocations_for_conversation(user_id, conversation_id) # Determine the actual chat context based on existing conversation or document usage # For existing conversations, use the chat_type from conversation metadata @@ -6404,21 +6589,16 @@ def result_requires_message_reload(result: Any) -> bool: # Get document details if specific document selected if selected_document_id and selected_document_id != "all": try: - # Use the appropriate documents container based on scope - if document_scope == 'group': - cosmos_container = cosmos_group_documents_container - elif document_scope == 'public': - cosmos_container = cosmos_public_documents_container - elif document_scope == 'personal': - cosmos_container = cosmos_user_documents_container - - doc_query = "SELECT c.file_name, c.title, c.document_id, c.group_id FROM c WHERE c.id = @doc_id" - doc_params = [{"name": "@doc_id", "value": selected_document_id}] - doc_results = list(cosmos_container.query_items( - query=doc_query, parameters=doc_params, enable_cross_partition_query=True - )) - if doc_results and 'workspace_search' in user_metadata: - doc_info = doc_results[0] + doc_info = _resolve_chat_selected_document_metadata( + selected_document_id, + user_id=user_id, + document_scope=document_scope, + active_group_id=active_group_id, + active_group_ids=active_group_ids, + active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, + ) + if doc_info and 'workspace_search' in user_metadata: user_metadata['workspace_search']['document_name'] = doc_info.get('title') or doc_info.get('file_name') user_metadata['workspace_search']['document_filename'] = doc_info.get('file_name') except Exception as e: @@ -6802,7 +6982,11 @@ def result_requires_message_reload(result: Any) -> bool: fallback_search_parameters = build_prior_grounded_document_search_parameters( prior_grounded_document_refs ) - if 
fallback_search_parameters.get('document_ids'): + fallback_search_parameters = revalidate_prior_grounded_document_search_parameters( + user_id, + fallback_search_parameters, + ) + if fallback_search_parameters.get('document_ids') and fallback_search_parameters.get('doc_scope'): history_grounded_search_used = True effective_document_scope = fallback_search_parameters.get('doc_scope') or 'all' effective_selected_document_ids = list( @@ -6889,6 +7073,7 @@ def result_requires_message_reload(result: Any) -> bool: # Filter out inactive thread messages before summarizing message_texts_search = [] for msg in last_messages_asc: + role = msg.get('role', 'user') thread_info = msg.get('metadata', {}).get('thread_info', {}) active_thread = thread_info.get('active_thread') @@ -6896,8 +7081,15 @@ def result_requires_message_reload(result: Any) -> bool: if active_thread is False: debug_print(f"[THREAD] Skipping inactive thread message {msg.get('id')} from search summary") continue - - message_texts_search.append(f"{msg.get('role', 'user').upper()}: {msg.get('content', '')}") + + if role not in ('user', 'assistant'): + continue + + content = msg.get('content', '') + if role == 'assistant': + content = build_assistant_history_content_with_citations(msg, content) + + message_texts_search.append(f"{role.upper()}: {content}") if not message_texts_search: # No active messages to summarize @@ -7614,8 +7806,11 @@ def result_requires_message_reload(result: Any) -> bool: selected_document_ids=effective_selected_document_ids, selected_document_id=effective_selected_document_id, document_scope=effective_document_scope, + user_id=user_id, active_group_id=effective_active_group_id, + active_group_ids=effective_active_group_ids, active_public_workspace_id=effective_active_public_workspace_id, + active_public_workspace_ids=effective_active_public_workspace_ids, ) workspace_tabular_files = { file_context['file_name'] for file_context in workspace_tabular_file_contexts @@ -7689,7 +7884,7 @@ def 
result_requires_message_reload(result: Any) -> bool: ) if web_search_enabled: - thought_tracker.add_thought('web_search', f"Searching the web for '{(search_query or user_message)[:50]}'") + thought_tracker.add_thought('web_search', f"Searching the web for '{web_search_query_text[:50]}'") perform_web_search( settings=settings, conversation_id=conversation_id, @@ -7700,7 +7895,7 @@ def result_requires_message_reload(result: Any) -> bool: document_scope=document_scope, active_group_id=active_group_id, active_public_workspace_id=active_public_workspace_id, - search_query=search_query, + web_search_query_text=web_search_query_text, system_messages_for_augmentation=system_messages_for_augmentation, agent_citations_list=agent_citations_list, web_search_citations_list=web_search_citations_list, @@ -8906,9 +9101,32 @@ def chat_stream_api(): compatibility_mode = bool(data.get('image_generation')) or is_retry requested_conversation_id = str(data.get('conversation_id') or '').strip() or None + + if requested_conversation_id: + try: + _authorize_personal_conversation_access(user_id, requested_conversation_id) + except LookupError: + return jsonify({'error': 'Conversation not found'}), 404 + except PermissionError: + return jsonify({'error': 'Forbidden'}), 403 + except Exception as exc: + debug_print(f"[Streaming] Error authorizing conversation {requested_conversation_id}: {exc}") + return jsonify({'error': 'Failed to authorize conversation'}), 500 + + initial_scope_context = _get_authorized_chat_scope_context( + user_id, + active_group_id=data.get('active_group_id'), + active_group_ids=data.get('active_group_ids', []), + active_public_workspace_id=data.get('active_public_workspace_id'), + active_public_workspace_ids=data.get('active_public_workspace_ids', []), + ) finalized_conversation_id = requested_conversation_id or str(uuid.uuid4()) is_new_stream_conversation = requested_conversation_id is None data['conversation_id'] = finalized_conversation_id + data['active_group_ids'] 
= list(initial_scope_context['active_group_ids']) + data['active_group_id'] = initial_scope_context['active_group_id'] + data['active_public_workspace_ids'] = list(initial_scope_context['active_public_workspace_ids']) + data['active_public_workspace_id'] = initial_scope_context['active_public_workspace_id'] stream_session = CHAT_STREAM_REGISTRY.start_session(user_id, finalized_conversation_id) request_message = (data.get('message') or '').strip() @@ -9047,22 +9265,19 @@ def generate(publish_background_event=None): tags_filter = data.get('tags', []) # Extract tags filter active_group_id = data.get('active_group_id') active_group_ids = data.get('active_group_ids', []) - # Backwards compat: if new list not provided, wrap single ID - if not active_group_ids and active_group_id: - active_group_ids = [active_group_id] - # Permission validation: only keep groups user is a member of - validated_group_ids = [] - for gid in active_group_ids: - g_doc = find_group_by_id(gid) - if g_doc and get_user_role_in_group(g_doc, user_id): - validated_group_ids.append(gid) - active_group_ids = validated_group_ids - # Keep single ID for backwards compat in metadata/context - active_group_id = active_group_ids[0] if active_group_ids else data.get('active_group_id') active_public_workspace_id = data.get('active_public_workspace_id') # Extract active public workspace ID active_public_workspace_ids = data.get('active_public_workspace_ids', []) - if not active_public_workspace_ids and active_public_workspace_id: - active_public_workspace_ids = [active_public_workspace_id] + scope_context = _get_authorized_chat_scope_context( + user_id, + active_group_id=active_group_id, + active_group_ids=active_group_ids, + active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, + ) + active_group_ids = scope_context['active_group_ids'] + active_group_id = scope_context['active_group_id'] + active_public_workspace_ids = 
scope_context['active_public_workspace_ids'] + active_public_workspace_id = scope_context['active_public_workspace_id'] frontend_gpt_model = data.get('model_deployment') frontend_model_id = data.get('model_id') frontend_model_endpoint_id = data.get('model_endpoint_id') @@ -9156,11 +9371,9 @@ def generate(publish_background_event=None): yield f"data: {json.dumps({'error': 'Image generation is not supported in streaming mode'})}\n\n" return - # Initialize Flask context - g.conversation_id = conversation_id + _set_authorized_chat_request_context(user_id, conversation_id, scope_context) # Clear plugin invocations - from semantic_kernel_plugins.plugin_invocation_logger import get_plugin_logger plugin_logger = get_plugin_logger() plugin_logger.clear_invocations_for_conversation(user_id, conversation_id) debug_print( @@ -9175,6 +9388,7 @@ def generate(publish_background_event=None): # Initialize variables search_query = user_message + web_search_query_text = build_web_search_query_text(user_message) hybrid_citations_list = [] agent_citations_list = [] web_search_citations_list = [] @@ -9323,37 +9537,18 @@ def generate(publish_background_event=None): # Load or create conversation (simplified) if is_new_stream_conversation: - conversation_item = { - 'id': conversation_id, - 'user_id': user_id, - 'last_updated': datetime.utcnow().isoformat(), - 'title': 'New Conversation', - 'context': [], - 'tags': [], - 'strict': False, - 'chat_type': 'new' - } - cosmos_conversations_container.upsert_item(conversation_item) + conversation_item = _create_personal_conversation(user_id, conversation_id=conversation_id) debug_print(f"[Streaming] Created new conversation {conversation_id}") else: try: - conversation_item = cosmos_conversations_container.read_item( - item=conversation_id, partition_key=conversation_id - ) + conversation_item = _authorize_personal_conversation_access(user_id, conversation_id) debug_print(f"[Streaming] Loaded existing conversation {conversation_id}") - except 
CosmosResourceNotFoundError: - conversation_item = { - 'id': conversation_id, - 'user_id': user_id, - 'last_updated': datetime.utcnow().isoformat(), - 'title': 'New Conversation', - 'context': [], - 'tags': [], - 'strict': False, - 'chat_type': 'new' - } - cosmos_conversations_container.upsert_item(conversation_item) - debug_print(f"[Streaming] Conversation {conversation_id} not found; created replacement") + except LookupError: + yield f"data: {json.dumps({'error': 'Conversation not found'})}\n\n" + return + except PermissionError: + yield f"data: {json.dumps({'error': 'Forbidden'})}\n\n" + return # Determine chat type actual_chat_type = 'personal_single_user' @@ -9402,21 +9597,16 @@ def generate(publish_background_event=None): # Get document details if specific document selected if selected_document_id and selected_document_id != "all": try: - # Use the appropriate documents container based on scope - if document_scope == 'group': - cosmos_container = cosmos_group_documents_container - elif document_scope == 'public': - cosmos_container = cosmos_public_documents_container - elif document_scope == 'personal': - cosmos_container = cosmos_user_documents_container - - doc_query = "SELECT c.file_name, c.title, c.document_id, c.group_id FROM c WHERE c.id = @doc_id" - doc_params = [{"name": "@doc_id", "value": selected_document_id}] - doc_results = list(cosmos_container.query_items( - query=doc_query, parameters=doc_params, enable_cross_partition_query=True - )) - if doc_results: - doc_info = doc_results[0] + doc_info = _resolve_chat_selected_document_metadata( + selected_document_id, + user_id=user_id, + document_scope=document_scope, + active_group_id=active_group_id, + active_group_ids=active_group_ids, + active_public_workspace_id=active_public_workspace_id, + active_public_workspace_ids=active_public_workspace_ids, + ) + if doc_info: user_metadata['workspace_search']['document_name'] = doc_info.get('title') or doc_info.get('file_name') 
user_metadata['workspace_search']['document_filename'] = doc_info.get('file_name') except Exception as e: @@ -9744,7 +9934,11 @@ def publish_live_plugin_thought(thought_payload): fallback_search_parameters = build_prior_grounded_document_search_parameters( prior_grounded_document_refs ) - if fallback_search_parameters.get('document_ids'): + fallback_search_parameters = revalidate_prior_grounded_document_search_parameters( + user_id, + fallback_search_parameters, + ) + if fallback_search_parameters.get('document_ids') and fallback_search_parameters.get('doc_scope'): history_grounded_search_used = True effective_document_scope = fallback_search_parameters.get('doc_scope') or 'all' effective_selected_document_ids = list( @@ -10072,8 +10266,11 @@ def publish_live_plugin_thought(thought_payload): selected_document_ids=effective_selected_document_ids, selected_document_id=effective_selected_document_id, document_scope=effective_document_scope, + user_id=user_id, active_group_id=effective_active_group_id, + active_group_ids=effective_active_group_ids, active_public_workspace_id=effective_active_public_workspace_id, + active_public_workspace_ids=effective_active_public_workspace_ids, ) workspace_tabular_files = { file_context['file_name'] for file_context in workspace_tabular_file_contexts @@ -10160,7 +10357,7 @@ def publish_live_plugin_thought(thought_payload): debug_print( f"[Streaming] Starting web search augmentation for conversation_id={conversation_id}" ) - yield emit_thought('web_search', f"Searching the web for '{(search_query or user_message)[:50]}'") + yield emit_thought('web_search', f"Searching the web for '{web_search_query_text[:50]}'") perform_web_search( settings=settings, conversation_id=conversation_id, @@ -10171,7 +10368,7 @@ def publish_live_plugin_thought(thought_payload): document_scope=document_scope, active_group_id=active_group_id, active_public_workspace_id=active_public_workspace_id, - search_query=search_query, + 
web_search_query_text=web_search_query_text, system_messages_for_augmentation=system_messages_for_augmentation, agent_citations_list=agent_citations_list, web_search_citations_list=web_search_citations_list, @@ -11102,7 +11299,13 @@ def mask_message_api(message_id): # Get action: "mask_all", "mask_selection", or "unmask_all" action = data.get('action') selection = data.get('selection', {}) - user_display_name = data.get('display_name', 'Unknown User') + current_user = get_current_user_info() or {} + user_display_name = ( + current_user.get('displayName') + or current_user.get('email') + or current_user.get('userPrincipalName') + or 'Unknown User' + ) # Validate action if action not in ['mask_all', 'mask_selection', 'unmask_all']: @@ -11688,6 +11891,43 @@ def build_prior_grounded_document_search_parameters(grounded_refs): } +def revalidate_prior_grounded_document_search_parameters(user_id, search_parameters): + """Filter fallback search parameters to scopes the caller can still access.""" + normalized_parameters = dict(search_parameters or {}) + scope_types = set(normalized_parameters.get('scope_types') or []) + scope_context = _get_authorized_chat_scope_context( + user_id, + active_group_ids=normalized_parameters.get('active_group_ids') or [], + active_public_workspace_ids=normalized_parameters.get('active_public_workspace_ids') or [], + ) + allowed_group_ids = scope_context['active_group_ids'] + allowed_public_workspace_ids = scope_context['active_public_workspace_ids'] + + allowed_scope_types = [] + if 'personal' in scope_types: + allowed_scope_types.append('personal') + if allowed_group_ids: + allowed_scope_types.append('group') + if allowed_public_workspace_ids: + allowed_scope_types.append('public') + + normalized_parameters['active_group_ids'] = allowed_group_ids + normalized_parameters['active_group_id'] = scope_context['active_group_id'] + normalized_parameters['active_public_workspace_ids'] = allowed_public_workspace_ids + 
normalized_parameters['active_public_workspace_id'] = scope_context['active_public_workspace_id'] + normalized_parameters['scope_types'] = allowed_scope_types + + if not allowed_scope_types: + normalized_parameters['document_ids'] = [] + normalized_parameters['doc_scope'] = None + return normalized_parameters + + normalized_parameters['doc_scope'] = ( + allowed_scope_types[0] if len(allowed_scope_types) == 1 else 'all' + ) + return normalized_parameters + + def build_history_only_assessment_messages(history_segments, default_system_prompt=''): """Construct the prompt context used to decide whether history alone is sufficient.""" assessment_messages = [] @@ -12294,6 +12534,11 @@ def to_int(value: Any) -> Optional[int]: "completion_tokens": int(completion_tokens), } + +def build_web_search_query_text(user_message): + """Return the only chat content allowed to leave the app for external web search.""" + return str(user_message or "").strip() + def perform_web_search( *, settings, @@ -12305,7 +12550,7 @@ def perform_web_search( document_scope, active_group_id, active_public_workspace_id, - search_query, + web_search_query_text, system_messages_for_augmentation, agent_citations_list, web_search_citations_list, @@ -12320,7 +12565,10 @@ def perform_web_search( debug_print(f"[WebSearch] document_scope: {document_scope}") debug_print(f"[WebSearch] active_group_id: {active_group_id}") debug_print(f"[WebSearch] active_public_workspace_id: {active_public_workspace_id}") - debug_print(f"[WebSearch] search_query: {search_query[:100] if search_query else None}...") + debug_print( + "[WebSearch] web_search_query_text: " + f"{web_search_query_text[:100] if web_search_query_text else None}..." 
+ ) enable_web_search = settings.get("enable_web_search") debug_print(f"[WebSearch] enable_web_search setting: {enable_web_search}") @@ -12328,15 +12576,13 @@ def perform_web_search( if not enable_web_search: debug_print("[WebSearch] Web search is DISABLED in settings, returning early") return True # Not an error, just disabled - - debug_print("[WebSearch] Web search is ENABLED, proceeding...") web_search_agent = settings.get("web_search_agent") or {} debug_print(f"[WebSearch] web_search_agent config present: {bool(web_search_agent)}") if web_search_agent: # Avoid logging sensitive data, just log structure debug_print(f"[WebSearch] web_search_agent keys: {list(web_search_agent.keys())}") - + other_settings = web_search_agent.get("other_settings") or {} debug_print(f"[WebSearch] other_settings keys: {list(other_settings.keys()) if other_settings else ''}") @@ -12369,16 +12615,8 @@ def perform_web_search( return False # Configuration error debug_print(f"[WebSearch] Agent ID is configured: {agent_id}") - - query_text = None - try: - query_text = search_query - debug_print(f"[WebSearch] Using search_query as query_text: {query_text[:100] if query_text else None}...") - except NameError: - query_text = None - debug_print("[WebSearch] search_query not defined, query_text is None") - query_text = (query_text or user_message or "").strip() + query_text = (web_search_query_text or user_message or "").strip() debug_print(f"[WebSearch] Final query_text after fallback: '{query_text[:100] if query_text else ''}'") if not query_text: @@ -12400,17 +12638,8 @@ def perform_web_search( debug_print(f"[WebSearch] Message history created with {len(message_history)} message(s)") try: - foundry_metadata = { - "conversation_id": conversation_id, - "user_id": user_id, - "message_id": user_message_id, - "chat_type": chat_type, - "document_scope": document_scope, - "group_id": active_group_id if chat_type == "group" else None, - "public_workspace_id": active_public_workspace_id, - 
"search_query": query_text, - } - debug_print(f"[WebSearch] Foundry metadata prepared: {json.dumps(foundry_metadata, default=str)}") + foundry_metadata = {} + debug_print("[WebSearch] Foundry metadata prepared: {}") debug_print("[WebSearch] Calling execute_foundry_agent...") debug_print(f"[WebSearch] foundry_settings keys: {list(foundry_settings.keys())}") diff --git a/application/single_app/route_backend_control_center.py b/application/single_app/route_backend_control_center.py index a7e5e8a0d..b74508acf 100644 --- a/application/single_app/route_backend_control_center.py +++ b/application/single_app/route_backend_control_center.py @@ -539,7 +539,7 @@ def enhance_user_with_activity(user, force_refresh=False): # Update user settings with cached metrics settings_update = {'metrics': metrics_cache} - update_success = update_user_settings(user.get('id'), settings_update) + update_success = update_user_settings(user.get('id'), settings_update, allow_cross_user=True) if update_success: debug_print(f"Successfully cached metrics for user {user.get('id')}") @@ -2315,7 +2315,7 @@ def api_update_user_access(user_id): } } - success = update_user_settings(user_id, access_settings) + success = update_user_settings(user_id, access_settings, allow_cross_user=True) if success: # Log admin action @@ -2371,7 +2371,7 @@ def api_update_user_file_uploads(user_id): } } - success = update_user_settings(user_id, file_upload_settings) + success = update_user_settings(user_id, file_upload_settings, allow_cross_user=True) if success: # Log admin action @@ -2515,7 +2515,7 @@ def api_bulk_user_action(): for user_id in user_ids: try: - success = update_user_settings(user_id, update_settings) + success = update_user_settings(user_id, update_settings, allow_cross_user=True) if success: success_count += 1 else: @@ -5940,6 +5940,22 @@ def api_admin_get_approvals(): debug_print(traceback.format_exc()) return jsonify({'error': 'Failed to fetch approvals', 'details': str(e)}), 500 + def 
_get_authorized_route_approval(approval_id, group_id, require_approval_rights=False): + """Resolve the current user and return an authorized approval plus user context.""" + user = session.get('user', {}) + user_id = user.get('oid') or user.get('sub') + user_roles = user.get('roles', []) + user_email = user.get('preferred_username', user.get('email', 'unknown')) + user_name = user.get('name', user_email) + approval = get_authorized_approval( + approval_id, + group_id, + user_id, + user_roles, + require_approval_rights=require_approval_rights, + ) + return approval, user_id, user_roles, user_email, user_name + @app.route('/api/admin/control-center/approvals/<approval_id>', methods=['GET']) @swagger_route(security=get_auth_security()) @login_required @@ -5952,23 +5968,23 @@ def api_admin_get_approval_by_id(approval_id): """ group_id (str): Group ID (partition key) """ try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - group_id = request.args.get('group_id') if not group_id: return jsonify({'error': 'group_id query parameter is required'}), 400 - - # Get the approval - approval = cosmos_approvals_container.read_item( - item=approval_id, - partition_key=group_id + + approval, user_id, user_roles, _user_email, _user_name = _get_authorized_route_approval( + approval_id, + group_id, ) # Add can_approve field - approval['can_approve'] = (approval.get('requester_id') != user_id) + approval['can_approve'] = _can_user_approve(approval, user_id, user_roles) return jsonify(approval), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not authorized to view this approval'}), 403 except Exception as e: debug_print(f"Error fetching approval {approval_id}: {e}") @@ -5989,17 +6005,18 @@ def api_admin_approve_request(approval_id): comment (str, optional): Approval comment """ try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - user_email = 
user.get('preferred_username', user.get('email', 'unknown')) - user_name = user.get('name', user_email) - data = request.get_json() group_id = data.get('group_id') comment = data.get('comment', '') if not group_id: return jsonify({'error': 'group_id is required'}), 400 + + approval, user_id, _user_roles, user_email, user_name = _get_authorized_route_approval( + approval_id, + group_id, + require_approval_rights=True, + ) # Approve the request approval = approve_request( @@ -6008,7 +6025,8 @@ def api_admin_approve_request(approval_id): approver_id=user_id, approver_email=user_email, approver_name=user_name, - comment=comment + comment=comment, + approval=approval, ) # Execute the approved action @@ -6020,6 +6038,10 @@ def api_admin_approve_request(approval_id): 'approval': approval, 'execution_result': execution_result }), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not eligible to approve this request'}), 403 except Exception as e: debug_print(f"Error approving request: {e}") @@ -6038,11 +6060,6 @@ def api_admin_deny_request(approval_id): comment (str): Reason for denial (required) """ try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - user_email = user.get('preferred_username', user.get('email', 'unknown')) - user_name = user.get('name', user_email) - data = request.get_json() group_id = data.get('group_id') comment = data.get('comment', '') @@ -6052,6 +6069,12 @@ def api_admin_deny_request(approval_id): if not comment: return jsonify({'error': 'comment is required for denial'}), 400 + + approval, user_id, _user_roles, user_email, user_name = _get_authorized_route_approval( + approval_id, + group_id, + require_approval_rights=True, + ) # Deny the request approval = deny_request( @@ -6061,7 +6084,8 @@ def api_admin_deny_request(approval_id): denier_email=user_email, denier_name=user_name, comment=comment, - auto_denied=False + 
auto_denied=False, + approval=approval, ) return jsonify({ @@ -6069,6 +6093,10 @@ def api_admin_deny_request(approval_id): 'message': 'Request denied', 'approval': approval }), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not eligible to deny this request'}), 403 except Exception as e: debug_print(f"Error denying request: {e}") @@ -6127,8 +6155,7 @@ def api_get_approvals(): approvals_with_permission = [] for approval in result.get('approvals', []): approval_copy = dict(approval) - # User can approve if they didn't create the request - approval_copy['can_approve'] = (approval.get('requester_id') != user_id) + approval_copy['can_approve'] = _can_user_approve(approval, user_id, user_roles) approvals_with_permission.append(approval_copy) return jsonify({ @@ -6157,23 +6184,23 @@ def api_get_approval_by_id(approval_id): group_id (str): Group ID (partition key) """ try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - group_id = request.args.get('group_id') if not group_id: return jsonify({'error': 'group_id query parameter is required'}), 400 - - # Get the approval - approval = cosmos_approvals_container.read_item( - item=approval_id, - partition_key=group_id + + approval, user_id, user_roles, _user_email, _user_name = _get_authorized_route_approval( + approval_id, + group_id, ) # Add can_approve field - approval['can_approve'] = (approval.get('requester_id') != user_id) + approval['can_approve'] = _can_user_approve(approval, user_id, user_roles) return jsonify(approval), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not authorized to view this approval'}), 403 except Exception as e: debug_print(f"Error fetching approval {approval_id}: {e}") @@ -6193,17 +6220,18 @@ def api_approve_request(approval_id): comment (str, optional): Approval comment """ 
try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - user_email = user.get('preferred_username', user.get('email', 'unknown')) - user_name = user.get('name', user_email) - data = request.get_json() group_id = data.get('group_id') comment = data.get('comment', '') if not group_id: return jsonify({'error': 'group_id is required'}), 400 + + approval, user_id, _user_roles, user_email, user_name = _get_authorized_route_approval( + approval_id, + group_id, + require_approval_rights=True, + ) # Approve the request approval = approve_request( @@ -6212,7 +6240,8 @@ def api_approve_request(approval_id): approver_id=user_id, approver_email=user_email, approver_name=user_name, - comment=comment + comment=comment, + approval=approval, ) # Execute the approved action @@ -6224,6 +6253,10 @@ def api_approve_request(approval_id): 'approval': approval, 'execution_result': execution_result }), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not eligible to approve this request'}), 403 except Exception as e: debug_print(f"Error approving request: {e}") @@ -6241,11 +6274,6 @@ def api_deny_request(approval_id): comment (str): Reason for denial (required) """ try: - user = session.get('user', {}) - user_id = user.get('oid') or user.get('sub') - user_email = user.get('preferred_username', user.get('email', 'unknown')) - user_name = user.get('name', user_email) - data = request.get_json() group_id = data.get('group_id') comment = data.get('comment', '') @@ -6255,6 +6283,12 @@ def api_deny_request(approval_id): if not comment: return jsonify({'error': 'comment is required for denial'}), 400 + + approval, user_id, _user_roles, user_email, user_name = _get_authorized_route_approval( + approval_id, + group_id, + require_approval_rights=True, + ) # Deny the request approval = deny_request( @@ -6264,7 +6298,8 @@ def api_deny_request(approval_id): 
denier_email=user_email, denier_name=user_name, comment=comment, - auto_denied=False + auto_denied=False, + approval=approval, ) return jsonify({ @@ -6272,6 +6307,10 @@ def api_deny_request(approval_id): 'message': 'Request denied', 'approval': approval }), 200 + except LookupError: + return jsonify({'error': 'Approval not found'}), 404 + except PermissionError: + return jsonify({'error': 'You are not eligible to deny this request'}), 403 except Exception as e: debug_print(f"Error denying request: {e}") diff --git a/application/single_app/route_backend_conversations.py b/application/single_app/route_backend_conversations.py index 23f1a0714..a99e7c54b 100644 --- a/application/single_app/route_backend_conversations.py +++ b/application/single_app/route_backend_conversations.py @@ -72,6 +72,22 @@ def _collect_child_message_documents(conversation_id, root_message_ids): return child_docs + +def _authorize_personal_conversation_read(user_id, conversation_id): + """Load a personal conversation and ensure the caller owns it.""" + try: + conversation_item = cosmos_conversations_container.read_item( + item=conversation_id, + partition_key=conversation_id, + ) + except CosmosResourceNotFoundError as exc: + raise LookupError(f"Conversation {conversation_id} not found") from exc + + if conversation_item.get('user_id') != user_id: + raise PermissionError('Forbidden') + + return conversation_item + def register_route_backend_conversations(app): @app.route('/api/get_messages', methods=['GET']) @@ -86,10 +102,7 @@ def api_get_messages(): if not conversation_id: return jsonify({'error': 'No conversation_id provided'}), 400 try: - conversation_item = cosmos_conversations_container.read_item( - item=conversation_id, - partition_key=conversation_id - ) + _authorize_personal_conversation_read(user_id, conversation_id) # Query all messages in cosmos_messages_container # We'll filter for active_thread in Python since Cosmos DB boolean queries can be tricky message_query = f""" @@ -209,7 
+222,9 @@ def api_get_messages(): message['vision_analysis'] = vision_analysis return jsonify({'messages': messages}) - except CosmosResourceNotFoundError: + except PermissionError: + return jsonify({'error': 'Forbidden'}), 403 + except LookupError: return jsonify({'messages': []}) except Exception as e: print(f"ERROR: Failed to get messages: {str(e)}") @@ -240,6 +255,8 @@ def api_get_image(image_id): conversation_id = '_'.join(parts[:-3]) debug_print(f"Serving image {image_id} from conversation {conversation_id}") + + _authorize_personal_conversation_read(user_id, conversation_id) # Query for the main image document and chunks message_query = f"SELECT * FROM c WHERE c.conversation_id = '{conversation_id}'" @@ -334,6 +351,11 @@ def api_get_image(image_id): ) else: return jsonify({'error': 'Invalid image format'}), 400 + + except PermissionError: + return jsonify({'error': 'Forbidden'}), 403 + except LookupError: + return jsonify({'error': 'Image not found'}), 404 except Exception as e: print(f"ERROR: Failed to serve image {image_id}: {str(e)}") @@ -456,18 +478,21 @@ def delete_conversation(conversation_id): """ Delete a conversation. If archiving is enabled, copy it to archived_conversations first. """ + user_id = get_current_user_id() + if not user_id: + return jsonify({'error': 'User not authenticated'}), 401 + settings = get_settings() archiving_enabled = settings.get('enable_conversation_archiving', False) try: - conversation_item = cosmos_conversations_container.read_item( - item=conversation_id, - partition_key=conversation_id - ) - except CosmosResourceNotFoundError: + conversation_item = _authorize_personal_conversation_read(user_id, conversation_id) + except LookupError: return jsonify({ "error": f"Conversation {conversation_id} not found." 
}), 404 + except PermissionError: + return jsonify({'error': 'Forbidden'}), 403 except Exception as e: return jsonify({ "error": str(e) diff --git a/application/single_app/route_backend_documents.py b/application/single_app/route_backend_documents.py index b70d99806..c74375e0e 100644 --- a/application/single_app/route_backend_documents.py +++ b/application/single_app/route_backend_documents.py @@ -135,7 +135,7 @@ def get_file_content(): return jsonify({'error': 'Missing conversation_id or id'}), 400 try: - _ = cosmos_conversations_container.read_item( + conversation_item = cosmos_conversations_container.read_item( item=conversation_id, partition_key=conversation_id ) @@ -143,6 +143,9 @@ def get_file_content(): return jsonify({'error': 'Conversation not found'}), 404 except Exception as e: return jsonify({'error': f'Error reading conversation: {str(e)}'}), 500 + + if conversation_item.get('user_id') != user_id: + return jsonify({'error': 'Forbidden'}), 403 add_file_task_to_file_processing_log(document_id=file_id, user_id=user_id, content="Conversation exists, retrieving file content") try: @@ -919,12 +922,12 @@ def api_create_tag(): data = request.get_json() tag_name = data.get('tag_name') - color = data.get('color', '#0d6efd') # Default blue color + color = data.get('color') if not tag_name: return jsonify({'error': 'tag_name is required'}), 400 - from functions_documents import normalize_tag, validate_tags + from functions_documents import normalize_tag, validate_tag_color, validate_tags from functions_settings import get_user_settings, update_user_settings from datetime import datetime, timezone @@ -935,6 +938,9 @@ def api_create_tag(): return jsonify({'error': error_msg}), 400 normalized_tag = normalized_tags[0] + is_valid_color, color_error, normalized_color = validate_tag_color(color, normalized_tag) + if not is_valid_color: + return jsonify({'error': color_error}), 400 # Get existing tag definitions from settings user_settings = get_user_settings(user_id) @@ 
-954,7 +960,7 @@ def api_create_tag(): # Add new tag to existing tags (don't replace) personal_tags[normalized_tag] = { - 'color': color, + 'color': normalized_color, 'created_at': datetime.now(timezone.utc).isoformat() } @@ -973,7 +979,7 @@ def api_create_tag(): 'message': f'Tag "{normalized_tag}" created successfully', 'tag': { 'name': normalized_tag, - 'color': color + 'color': normalized_color } }), 201 @@ -1157,7 +1163,7 @@ def api_update_tag(tag_name): debug_print(f"[UPDATE TAG] Request data - new_name: {new_name}, new_color: {new_color}") from functions_documents import ( - normalize_tag, validate_tags, get_documents, + normalize_tag, validate_tag_color, validate_tags, get_documents, update_document, propagate_tags_to_chunks ) from functions_settings import get_user_settings, update_user_settings @@ -1267,6 +1273,10 @@ def api_update_tag(tag_name): # Handle color change only if new_color: debug_print(f"[UPDATE TAG] Handling color change operation...") + is_valid_color, color_error, normalized_color = validate_tag_color(new_color, normalized_old_tag) + if not is_valid_color: + return jsonify({'error': color_error}), 400 + user_settings = get_user_settings(user_id) settings_dict = user_settings.get('settings', {}) tag_defs = settings_dict.get('tag_definitions', {}) @@ -1276,13 +1286,13 @@ def api_update_tag(tag_name): debug_print(f"[UPDATE TAG] Looking for tag: {normalized_old_tag}") if normalized_old_tag in personal_tags: - debug_print(f"[UPDATE TAG] Found tag, updating color to: {new_color}") - personal_tags[normalized_old_tag]['color'] = new_color + debug_print(f"[UPDATE TAG] Found tag, updating color to: {normalized_color}") + personal_tags[normalized_old_tag]['color'] = normalized_color else: - debug_print(f"[UPDATE TAG] Tag not found, creating new entry with color: {new_color}") + debug_print(f"[UPDATE TAG] Tag not found, creating new entry with color: {normalized_color}") from datetime import datetime, timezone personal_tags[normalized_old_tag] = { - 
'color': new_color, + 'color': normalized_color, 'created_at': datetime.now(timezone.utc).isoformat() } @@ -1293,7 +1303,11 @@ def api_update_tag(tag_name): debug_print(f"[UPDATE TAG] Color change completed successfully") return jsonify({ - 'message': f'Tag color updated for "{normalized_old_tag}"' + 'message': f'Tag color updated for "{normalized_old_tag}"', + 'tag': { + 'name': normalized_old_tag, + 'color': normalized_color + } }), 200 debug_print(f"[UPDATE TAG] No updates specified!") diff --git a/application/single_app/route_backend_feedback.py b/application/single_app/route_backend_feedback.py index 49167cc85..8eeb31694 100644 --- a/application/single_app/route_backend_feedback.py +++ b/application/single_app/route_backend_feedback.py @@ -5,6 +5,22 @@ from functions_settings import * from swagger_wrapper import swagger_route, get_auth_security + +def _authorize_feedback_conversation(user_id, conversation_id): + """Load the target conversation and ensure the caller owns it.""" + try: + conversation_item = cosmos_conversations_container.read_item( + item=conversation_id, + partition_key=conversation_id, + ) + except CosmosResourceNotFoundError as exc: + raise LookupError(f"Conversation {conversation_id} not found") from exc + + if conversation_item.get("user_id") != user_id: + raise PermissionError("Forbidden") + + return conversation_item + def register_route_backend_feedback(app): @app.route("/feedback/submit", methods=["POST"]) @@ -18,7 +34,7 @@ def feedback_submit(): POST /feedback/submit JSON body: { messageId, conversationId, feedbackType, reason } """ - data = request.get_json() + data = request.get_json() or {} messageId = data.get("messageId") # This is the ID of the specific AI message conversationId = data.get("conversationId") # This is the ID of the conversation feedbackType = data.get("feedbackType") @@ -30,6 +46,16 @@ def feedback_submit(): if not messageId or not conversationId or not feedbackType: return jsonify({"error": "Missing required 
fields"}), 400 + if not user_id: + return jsonify({"error": "No user ID found in session"}), 403 + + try: + _authorize_feedback_conversation(user_id, conversationId) + except LookupError: + return jsonify({"error": "Conversation not found"}), 404 + except PermissionError: + return jsonify({"error": "Forbidden", "message": "You do not have access to this conversation"}), 403 + ai_message_text = None user_prompt_text = None all_messages = [] # Initialize an empty list for messages @@ -51,10 +77,7 @@ def feedback_submit(): # --- END CORRECTED PART --- if not message_items: - # No messages found for this conversation ID, which is unexpected if feedback is given - # You might want to log this or handle it differently - print(f"Warning: No messages found for conversationId {conversationId} during feedback submission.") - # Keep ai_message_text and user_prompt_text as None initially + return jsonify({"error": "Assistant message not found"}), 404 all_messages = message_items # Assign the query results to all_messages @@ -70,6 +93,9 @@ def feedback_submit(): ai_msg_index = i break + if ai_msg_index == -1: + return jsonify({"error": "Assistant message not found"}), 404 + # Find the user message immediately preceding the AI message if ai_msg_index > 0: # Iterate backwards from the message before the AI's message @@ -87,21 +113,12 @@ def feedback_submit(): if all_messages[i].get("role") == "user": user_prompt_text = all_messages[i].get("content") break - - - except exceptions.CosmosResourceNotFoundError: - # This specific exception might not be raised by query_items if the container exists but no items match. - # A query returning empty is more likely. Handle general exceptions. - print(f"Error querying messages for conversation {conversationId}: Resource not found (unexpected).") - # Decide how to handle - maybe proceed with default text? 
except Exception as e: print(f"Error querying messages for conversation {conversationId}: {e}") - # Log the error, maybe return a 500 or proceed with default text - # For now, let the default text logic below handle it. - pass # Allow execution to continue to the default text part + return jsonify({"error": "Failed to load feedback target"}), 500 # Set default text if messages weren't found - if not ai_message_text: + if ai_message_text is None: ai_message_text = "[AI response text not found in cosmos_messages_container]" if not user_prompt_text: diff --git a/application/single_app/route_backend_group_documents.py b/application/single_app/route_backend_group_documents.py index d8f00a04c..e8a72cba7 100644 --- a/application/single_app/route_backend_group_documents.py +++ b/application/single_app/route_backend_group_documents.py @@ -1046,12 +1046,12 @@ def api_create_group_tag(): data = request.get_json() tag_name = data.get('tag_name') - color = data.get('color', '#0d6efd') + color = data.get('color') if not tag_name: return jsonify({'error': 'tag_name is required'}), 400 - from functions_documents import normalize_tag, validate_tags + from functions_documents import normalize_tag, validate_tag_color, validate_tags from datetime import datetime, timezone try: @@ -1060,6 +1060,9 @@ def api_create_group_tag(): return jsonify({'error': error_msg}), 400 normalized_tag = normalized_tags[0] + is_valid_color, color_error, normalized_color = validate_tag_color(color, normalized_tag) + if not is_valid_color: + return jsonify({'error': color_error}), 400 tag_defs = group_doc.get('tag_definitions', {}) @@ -1067,7 +1070,7 @@ def api_create_group_tag(): return jsonify({'error': 'Tag already exists'}), 409 tag_defs[normalized_tag] = { - 'color': color, + 'color': normalized_color, 'created_at': datetime.now(timezone.utc).isoformat() } group_doc['tag_definitions'] = tag_defs @@ -1077,7 +1080,7 @@ def api_create_group_tag(): 'message': f'Tag "{normalized_tag}" created successfully', 
'tag': { 'name': normalized_tag, - 'color': color + 'color': normalized_color } }), 201 @@ -1248,7 +1251,7 @@ def api_update_group_tag(tag_name): new_name = data.get('new_name') new_color = data.get('color') - from functions_documents import normalize_tag, validate_tags, update_document, propagate_tags_to_chunks + from functions_documents import normalize_tag, validate_tag_color, validate_tags, update_document, propagate_tags_to_chunks try: normalized_old_tag = normalize_tag(tag_name) @@ -1309,14 +1312,18 @@ def api_update_group_tag(tag_name): }), 200 if new_color: + is_valid_color, color_error, normalized_color = validate_tag_color(new_color, normalized_old_tag) + if not is_valid_color: + return jsonify({'error': color_error}), 400 + tag_defs = group_doc.get('tag_definitions', {}) if normalized_old_tag in tag_defs: - tag_defs[normalized_old_tag]['color'] = new_color + tag_defs[normalized_old_tag]['color'] = normalized_color else: from datetime import datetime, timezone tag_defs[normalized_old_tag] = { - 'color': new_color, + 'color': normalized_color, 'created_at': datetime.now(timezone.utc).isoformat() } @@ -1324,7 +1331,11 @@ def api_update_group_tag(tag_name): cosmos_groups_container.upsert_item(group_doc) return jsonify({ - 'message': f'Tag color updated for "{normalized_old_tag}"' + 'message': f'Tag color updated for "{normalized_old_tag}"', + 'tag': { + 'name': normalized_old_tag, + 'color': normalized_color + } }), 200 return jsonify({'error': 'No updates specified'}), 400 diff --git a/application/single_app/route_backend_group_prompts.py b/application/single_app/route_backend_group_prompts.py index ec87d4ec5..bd49da209 100644 --- a/application/single_app/route_backend_group_prompts.py +++ b/application/single_app/route_backend_group_prompts.py @@ -2,10 +2,26 @@ from config import * from functions_authentication import * +from functions_group import require_active_group from functions_settings import * from functions_prompts import * from swagger_wrapper 
import swagger_route, get_auth_security + +def _get_active_group_or_error(user_id): + try: + return require_active_group( + user_id, + allowed_roles=("Owner", "Admin", "DocumentManager", "User"), + ), None + except ValueError: + return None, (jsonify({"error": "No active group selected"}), 400) + except LookupError: + return None, (jsonify({"error": "Active group not found"}), 404) + except PermissionError: + return None, (jsonify({"error": "You are not a member of the active group"}), 403) + + def register_route_backend_group_prompts(app): @app.route('/api/group_prompts', methods=['GET']) @swagger_route(security=get_auth_security()) @@ -13,10 +29,10 @@ def register_route_backend_group_prompts(app): @user_required @enabled_required("enable_group_workspaces") def get_group_prompts(): - user_id = get_current_user_id() - active_group = get_user_settings(user_id)["settings"].get("activeGroupOid") - if not active_group: - return jsonify({"error":"No active group selected"}), 400 + user_id = get_current_user_id() + active_group, error_response = _get_active_group_or_error(user_id) + if error_response: + return error_response try: items, total, page, page_size = list_prompts( @@ -41,10 +57,10 @@ def get_group_prompts(): @user_required @enabled_required("enable_group_workspaces") def create_group_prompt(): - user_id = get_current_user_id() - active_group = get_user_settings(user_id)["settings"].get("activeGroupOid") - if not active_group: - return jsonify({"error":"No active group selected"}), 400 + user_id = get_current_user_id() + active_group, error_response = _get_active_group_or_error(user_id) + if error_response: + return error_response data = request.get_json() or {} name = data.get("name","").strip() @@ -71,10 +87,10 @@ def create_group_prompt(): @user_required @enabled_required("enable_group_workspaces") def get_group_prompt(prompt_id): - user_id = get_current_user_id() - active_group = get_user_settings(user_id)["settings"].get("activeGroupOid") - if not 
active_group: - return jsonify({"error":"No active group selected"}), 400 + user_id = get_current_user_id() + active_group, error_response = _get_active_group_or_error(user_id) + if error_response: + return error_response try: item = get_prompt_doc( @@ -96,10 +112,10 @@ def get_group_prompt(prompt_id): @user_required @enabled_required("enable_group_workspaces") def update_group_prompt(prompt_id): - user_id = get_current_user_id() - active_group = get_user_settings(user_id)["settings"].get("activeGroupOid") - if not active_group: - return jsonify({"error":"No active group selected"}), 400 + user_id = get_current_user_id() + active_group, error_response = _get_active_group_or_error(user_id) + if error_response: + return error_response data = request.get_json() or {} updates = {} @@ -135,10 +151,10 @@ @user_required @enabled_required("enable_group_workspaces") def delete_group_prompt(prompt_id): - user_id = get_current_user_id() - active_group = get_user_settings(user_id)["settings"].get("activeGroupOid") - if not active_group: - return jsonify({"error":"No active group selected"}), 400 + user_id = get_current_user_id() + active_group, error_response = _get_active_group_or_error(user_id) + if error_response: + return error_response try: success = delete_prompt_doc( diff --git a/application/single_app/route_backend_groups.py b/application/single_app/route_backend_groups.py index 0e35d211b..9186e4ce4 100644 --- a/application/single_app/route_backend_groups.py +++ b/application/single_app/route_backend_groups.py @@ -163,10 +163,17 @@ def api_get_group_details(group_id): GET /api/groups/<group_id> Returns the full group details for that group. 
""" + user_info = get_current_user_info() + user_id = user_info["userId"] + group_doc = find_group_by_id(group_id) if not group_doc: return jsonify({"error": "Group not found"}), 404 + + if not get_user_role_in_group(group_doc, user_id): + return jsonify({"error": "You are not a member of this group"}), 403 + return jsonify(group_doc), 200 @app.route("/api/groups/<group_id>", methods=["DELETE"]) diff --git a/application/single_app/route_backend_plugins.py b/application/single_app/route_backend_plugins.py index 115bf2828..b60f81894 100644 --- a/application/single_app/route_backend_plugins.py +++ b/application/single_app/route_backend_plugins.py @@ -28,6 +28,7 @@ validate_group_action_payload, ) from functions_keyvault import ( + resolve_secret_reference_for_context, SecretReturnType, redact_plugin_secret_values, retrieve_secret_from_key_vault_by_full_name, @@ -230,17 +231,35 @@ def _redact_plugin_for_logging(plugin): return redact_plugin_secret_values(plugin) -def _resolve_secret_value_for_sql_test(value, field_name): +def _resolve_plugin_secret_context(plugin_manifest, fallback_scope_value, fallback_scope="user"): + """Infer the expected Key Vault scope for SQL test-connection secret resolution.""" + if not isinstance(plugin_manifest, dict): + return fallback_scope_value, fallback_scope + + plugin_scope = str(plugin_manifest.get("scope") or "").strip().lower() + if plugin_scope == "group" or plugin_manifest.get("is_group"): + return plugin_manifest.get("group_id"), "group" + if plugin_scope == "global" or plugin_manifest.get("is_global"): + return plugin_manifest.get("id") or fallback_scope_value, "global" + if plugin_scope == "user" or plugin_manifest.get("user_id"): + return plugin_manifest.get("user_id") or fallback_scope_value, "user" + return fallback_scope_value, fallback_scope + + +def _resolve_secret_value_for_sql_test(value, field_name, scope_value, scope): """Resolve a Key Vault reference for SQL test-connection flows.""" if not isinstance(value, str) or not value: 
         return value
     if not validate_secret_name_dynamic(value):
         return value
-    resolved_value = retrieve_secret_from_key_vault_by_full_name(value)
-    if validate_secret_name_dynamic(resolved_value):
-        raise ValueError(f"Unable to resolve stored Key Vault secret for SQL field '{field_name}'.")
-    return resolved_value
+    return resolve_secret_reference_for_context(
+        value,
+        scope_value=scope_value,
+        scope=scope,
+        allowed_sources={"action-addset"},
+        context_label=f"SQL field '{field_name}'",
+    )


 def _load_existing_plugin_for_sql_test(plugin_context, user_id):
@@ -1093,9 +1112,21 @@ def test_sql_connection():
         field_list = ', '.join(unresolved_fields)
         return jsonify({'success': False, 'error': f"Stored SQL secret could not be resolved for testing. Re-enter the {field_list}."}), 400

+    plugin_scope_value, plugin_scope = _resolve_plugin_secret_context(existing_plugin, user_id)
+
     try:
-        connection_string = _resolve_secret_value_for_sql_test(connection_string, 'connection_string')
-        password = _resolve_secret_value_for_sql_test(password, 'password')
+        connection_string = _resolve_secret_value_for_sql_test(
+            connection_string,
+            'connection_string',
+            scope_value=plugin_scope_value,
+            scope=plugin_scope,
+        )
+        password = _resolve_secret_value_for_sql_test(
+            password,
+            'password',
+            scope_value=plugin_scope_value,
+            scope=plugin_scope,
+        )
     except ValueError as exc:
         return jsonify({'success': False, 'error': str(exc)}), 400

diff --git a/application/single_app/route_backend_public_documents.py b/application/single_app/route_backend_public_documents.py
index 3b7486bd2..f4396c241 100644
--- a/application/single_app/route_backend_public_documents.py
+++ b/application/single_app/route_backend_public_documents.py
@@ -536,12 +536,12 @@ def api_create_public_workspace_tag():
         data = request.get_json()
         tag_name = data.get('tag_name')
-        color = data.get('color', '#0d6efd')
+        color = data.get('color')

         if not tag_name:
             return jsonify({'error': 'tag_name is required'}), 400

-        from functions_documents import normalize_tag, validate_tags
+        from functions_documents import normalize_tag, validate_tag_color, validate_tags
         from datetime import datetime, timezone

         try:
@@ -550,6 +550,9 @@ def api_create_public_workspace_tag():
                 return jsonify({'error': error_msg}), 400

             normalized_tag = normalized_tags[0]
+            is_valid_color, color_error, normalized_color = validate_tag_color(color, normalized_tag)
+            if not is_valid_color:
+                return jsonify({'error': color_error}), 400

             tag_defs = ws_doc.get('tag_definitions', {})
@@ -557,7 +560,7 @@ def api_create_public_workspace_tag():
                 return jsonify({'error': 'Tag already exists'}), 409

             tag_defs[normalized_tag] = {
-                'color': color,
+                'color': normalized_color,
                 'created_at': datetime.now(timezone.utc).isoformat()
             }
             ws_doc['tag_definitions'] = tag_defs
@@ -567,7 +570,7 @@ def api_create_public_workspace_tag():
                 'message': f'Tag "{normalized_tag}" created successfully',
                 'tag': {
                     'name': normalized_tag,
-                    'color': color
+                    'color': normalized_color
                 }
             }), 201
@@ -740,7 +743,7 @@ def api_update_public_workspace_tag(tag_name):
         new_name = data.get('new_name')
         new_color = data.get('color')

-        from functions_documents import normalize_tag, validate_tags, update_document, propagate_tags_to_chunks
+        from functions_documents import normalize_tag, validate_tag_color, validate_tags, update_document, propagate_tags_to_chunks

         try:
             normalized_old_tag = normalize_tag(tag_name)
@@ -801,14 +804,18 @@ def api_update_public_workspace_tag(tag_name):
                 }), 200

             if new_color:
+                is_valid_color, color_error, normalized_color = validate_tag_color(new_color, normalized_old_tag)
+                if not is_valid_color:
+                    return jsonify({'error': color_error}), 400
+
                 tag_defs = ws_doc.get('tag_definitions', {})
                 if normalized_old_tag in tag_defs:
-                    tag_defs[normalized_old_tag]['color'] = new_color
+                    tag_defs[normalized_old_tag]['color'] = normalized_color
                 else:
                     from datetime import datetime, timezone
                     tag_defs[normalized_old_tag] = {
-                        'color': new_color,
+                        'color': normalized_color,
                         'created_at': datetime.now(timezone.utc).isoformat()
                     }
@@ -816,7 +823,11 @@ def api_update_public_workspace_tag(tag_name):
                 cosmos_public_workspaces_container.upsert_item(ws_doc)

                 return jsonify({
-                    'message': f'Tag color updated for "{normalized_old_tag}"'
+                    'message': f'Tag color updated for "{normalized_old_tag}"',
+                    'tag': {
+                        'name': normalized_old_tag,
+                        'color': normalized_color
+                    }
                 }), 200

             return jsonify({'error': 'No updates specified'}), 400
diff --git a/application/single_app/route_backend_public_prompts.py b/application/single_app/route_backend_public_prompts.py
index f125cbb32..75c3f678e 100644
--- a/application/single_app/route_backend_public_prompts.py
+++ b/application/single_app/route_backend_public_prompts.py
@@ -8,6 +8,20 @@
 from functions_prompts import *
 from swagger_wrapper import swagger_route, get_auth_security

+
+def _get_active_public_workspace_or_error(user_id):
+    try:
+        return require_active_public_workspace(
+            user_id,
+            allowed_roles=("Owner", "Admin", "DocumentManager"),
+        ), None
+    except ValueError:
+        return None, (jsonify({'error': 'No active public workspace selected'}), 400)
+    except LookupError:
+        return None, (jsonify({'error': 'Workspace not found'}), 404)
+    except PermissionError:
+        return None, (jsonify({'error': 'Access denied'}), 403)
+
 def register_route_backend_public_prompts(app):
     """
     Backend routes for public-workspace–scoped prompts management
@@ -20,15 +34,10 @@ def register_route_backend_public_prompts(app):
     @enabled_required('enable_public_workspaces')
     def api_list_public_prompts():
         user_id = get_current_user_id()
-        settings = get_user_settings(user_id)
-        active_ws = settings['settings'].get('activePublicWorkspaceOid')
-        if not active_ws:
-            return jsonify({'error': 'No active public workspace selected'}), 400
-        ws = find_public_workspace_by_id(active_ws)
-        if not ws:
-            return jsonify({'error': 'Workspace not found'}), 404
-        if not get_user_role_in_public_workspace(ws, user_id):
-            return jsonify({'error': 'Access denied'}), 403
+        active_workspace_context, error_response = _get_active_public_workspace_or_error(user_id)
+        if error_response:
+            return error_response
+        active_ws, _, _ = active_workspace_context

         try:
             items, total, page, page_size = list_prompts(
@@ -54,15 +63,10 @@
     @enabled_required('enable_public_workspaces')
     def api_create_public_prompt():
         user_id = get_current_user_id()
-        settings = get_user_settings(user_id)
-        active_ws = settings['settings'].get('activePublicWorkspaceOid')
-        if not active_ws:
-            return jsonify({'error': 'No active public workspace selected'}), 400
-        ws = find_public_workspace_by_id(active_ws)
-        if not ws:
-            return jsonify({'error': 'Workspace not found'}), 404
-        if not get_user_role_in_public_workspace(ws, user_id):
-            return jsonify({'error': 'Access denied'}), 403
+        active_workspace_context, error_response = _get_active_public_workspace_or_error(user_id)
+        if error_response:
+            return error_response
+        active_ws, _, _ = active_workspace_context

         data = request.get_json() or {}
         name = data.get('name','').strip()
@@ -90,15 +94,10 @@
     @enabled_required('enable_public_workspaces')
     def api_get_public_prompt(prompt_id):
         user_id = get_current_user_id()
-        settings = get_user_settings(user_id)
-        active_ws = settings['settings'].get('activePublicWorkspaceOid')
-        if not active_ws:
-            return jsonify({'error': 'No active public workspace selected'}), 400
-        ws = find_public_workspace_by_id(active_ws)
-        if not ws:
-            return jsonify({'error': 'Workspace not found'}), 404
-        if not get_user_role_in_public_workspace(ws, user_id):
-            return jsonify({'error': 'Access denied'}), 403
+        active_workspace_context, error_response = _get_active_public_workspace_or_error(user_id)
+        if error_response:
+            return error_response
+        active_ws, _, _ = active_workspace_context

         try:
             item = get_prompt_doc(
@@ -121,15 +120,10 @@
     @enabled_required('enable_public_workspaces')
     def api_update_public_prompt(prompt_id):
         user_id = get_current_user_id()
-        settings = get_user_settings(user_id)
-        active_ws = settings['settings'].get('activePublicWorkspaceOid')
-        if not active_ws:
-            return jsonify({'error': 'No active public workspace selected'}), 400
-        ws = find_public_workspace_by_id(active_ws)
-        if not ws:
-            return jsonify({'error': 'Workspace not found'}), 404
-        if not get_user_role_in_public_workspace(ws, user_id):
-            return jsonify({'error': 'Access denied'}), 403
+        active_workspace_context, error_response = _get_active_public_workspace_or_error(user_id)
+        if error_response:
+            return error_response
+        active_ws, _, _ = active_workspace_context

         data = request.get_json() or {}
         updates = {}
@@ -166,15 +160,10 @@
     @enabled_required('enable_public_workspaces')
     def api_delete_public_prompt(prompt_id):
         user_id = get_current_user_id()
-        settings = get_user_settings(user_id)
-        active_ws = settings['settings'].get('activePublicWorkspaceOid')
-        if not active_ws:
-            return jsonify({'error': 'No active public workspace selected'}), 400
-        ws = find_public_workspace_by_id(active_ws)
-        if not ws:
-            return jsonify({'error': 'Workspace not found'}), 404
-        if not get_user_role_in_public_workspace(ws, user_id):
-            return jsonify({'error': 'Access denied'}), 403
+        active_workspace_context, error_response = _get_active_public_workspace_or_error(user_id)
+        if error_response:
+            return error_response
+        active_ws, _, _ = active_workspace_context

         try:
             success = delete_prompt_doc(
diff --git a/application/single_app/route_backend_public_workspaces.py b/application/single_app/route_backend_public_workspaces.py
index 9307bc3c3..8a7ade4ed 100644
--- a/application/single_app/route_backend_public_workspaces.py
+++ b/application/single_app/route_backend_public_workspaces.py
@@ -211,12 +211,20 @@ def api_create_public_workspace():
     def api_get_public_workspace(ws_id):
         """
         GET /api/public_workspaces/<ws_id>
-        Returns full workspace document.
+        Returns a role-aware workspace payload.
         """
+        info = get_current_user_info()
+        user_id = info["userId"]
+
         ws = find_public_workspace_by_id(ws_id)
         if not ws:
             return jsonify({"error": "Workspace not found"}), 404
-        return jsonify(ws), 200
+
+        role = get_user_role_in_public_workspace(ws, user_id)
+        if role:
+            return jsonify(build_public_workspace_member_payload(ws, user_id)), 200
+
+        return jsonify(build_public_workspace_public_summary(ws)), 200

     @app.route("/api/public_workspaces/<ws_id>", methods=["PATCH", "PUT"])
     @swagger_route(security=get_auth_security())
@@ -289,13 +297,11 @@ def api_set_active_public_workspace():
         info = get_current_user_info()
         user_id = info["userId"]

-        ws = find_public_workspace_by_id(ws_id)
-        if not ws:
+        try:
+            update_active_public_workspace_for_user(user_id, ws_id)
+        except LookupError:
             return jsonify({"error": "Workspace not found"}), 404

-        # Public workspaces are accessible to all authenticated users for chat.
-        # No membership check needed — any user can set a public workspace as active.
-        update_active_public_workspace_for_user(user_id, ws_id)
         return jsonify({"message": f"Active set to {ws_id}"}), 200

     @app.route("/api/public_workspaces/<ws_id>/requests", methods=["GET"])
diff --git a/application/single_app/route_backend_users.py b/application/single_app/route_backend_users.py
index 459aa800c..eac97c37e 100644
--- a/application/single_app/route_backend_users.py
+++ b/application/single_app/route_backend_users.py
@@ -2,9 +2,16 @@
 from config import *
 from functions_authentication import *
+from functions_group import update_active_group_for_user
+from functions_public_workspaces import update_active_public_workspace_for_user
 from functions_settings import *
 from swagger_wrapper import swagger_route, get_auth_security

+
+def _escape_graph_odata_literal(value):
+    return str(value or "").replace("'", "''")
+
+
 def register_route_backend_users(app):
     """
     This route will expose GET /api/userSearch?query= which calls
@@ -20,6 +27,8 @@ def api_user_search():
         if not query:
             return jsonify([]), 200

+        safe_query = _escape_graph_odata_literal(query)
+
         token = get_valid_access_token()
         if not token:
             return jsonify({"error": "Could not acquire access token"}), 401
@@ -32,9 +41,9 @@ def api_user_search():
         }

         filter_str = (
-            f"startswith(displayName, '{query}') "
-            f"or startswith(mail, '{query}') "
-            f"or startswith(userPrincipalName, '{query}')"
+            f"startswith(displayName, '{safe_query}') "
+            f"or startswith(mail, '{safe_query}') "
+            f"or startswith(userPrincipalName, '{safe_query}')"
         )
         params = {
             "$filter": filter_str,
@@ -163,13 +172,54 @@ def user_settings():
         invalid_keys = set(settings_to_update.keys()) - allowed_keys
         if invalid_keys:
             print(f"Warning: Received invalid settings keys: {invalid_keys}")
-            # Decide whether to ignore them or return an error
-            # To ignore: settings_to_update = {k: v for k, v in settings_to_update.items() if k in allowed_keys}
-            # To error: return jsonify({"error": f"Invalid settings keys provided: {', '.join(invalid_keys)}"}), 400
+            settings_to_update = {
+                key: value
+                for key, value in settings_to_update.items()
+                if key in allowed_keys
+            }
+            if not settings_to_update:
+                return jsonify({"error": "No valid settings keys provided"}), 400
+
+
+        settings_to_update = dict(settings_to_update)
+        active_group_updated = False
+        active_public_workspace_updated = False
+
+        if "activeGroupOid" in settings_to_update:
+            requested_active_group = str(settings_to_update.pop("activeGroupOid") or "").strip()
+            if requested_active_group:
+                try:
+                    update_active_group_for_user(requested_active_group, user_id=user_id)
+                    active_group_updated = True
+                except LookupError:
+                    return jsonify({"error": "Group not found"}), 404
+                except PermissionError:
+                    return jsonify({"error": "You are not a member of this group"}), 403
+            else:
+                settings_to_update["activeGroupOid"] = requested_active_group
+
+        if "activePublicWorkspaceOid" in settings_to_update:
+            requested_active_public_workspace = str(
+                settings_to_update.pop("activePublicWorkspaceOid") or ""
+            ).strip()
+            if requested_active_public_workspace:
+                try:
+                    update_active_public_workspace_for_user(
+                        user_id,
+                        requested_active_public_workspace,
+                    )
+                    active_public_workspace_updated = True
+                except LookupError:
+                    return jsonify({"error": "Workspace not found"}), 404
+            else:
+                settings_to_update["activePublicWorkspaceOid"] = requested_active_public_workspace

         # Call the updated function - it handles merging and timestamp
-        success = update_user_settings(user_id, settings_to_update)
+        success = True
+        if settings_to_update:
+            success = update_user_settings(user_id, settings_to_update)
+        elif active_group_updated or active_public_workspace_updated:
+            success = True

         if success:
             return jsonify({"message": "User settings updated successfully"}), 200
diff --git a/application/single_app/route_frontend_admin_settings.py b/application/single_app/route_frontend_admin_settings.py
index 129dfcde6..0dc5cdd88 100644
--- a/application/single_app/route_frontend_admin_settings.py
+++ b/application/single_app/route_frontend_admin_settings.py
@@ -1261,7 +1261,7 @@ def is_valid_url(url):
         'enable_web_search': enable_web_search,
         'web_search_consent_accepted': web_search_consent_accepted,
         'enable_web_search_user_notice': form_data.get('enable_web_search_user_notice') == 'on',
-        'web_search_user_notice_text': form_data.get('web_search_user_notice_text', 'Your message will be sent to Microsoft Bing for web search. Only your current message is sent, not your conversation history.').strip(),
+        'web_search_user_notice_text': form_data.get('web_search_user_notice_text', 'Your current message will be sent to Microsoft Bing for web search. Conversation history is not sent for web search, but any sensitive content you paste into this message may be sent.').strip(),
         'web_search_agent': {
             'agent_type': 'aifoundry',
             'azure_openai_gpt_endpoint': form_data.get('web_search_foundry_endpoint', '').strip(),
diff --git a/application/single_app/route_frontend_conversations.py b/application/single_app/route_frontend_conversations.py
index d2b428fe2..b2e65be49 100644
--- a/application/single_app/route_frontend_conversations.py
+++ b/application/single_app/route_frontend_conversations.py
@@ -10,6 +10,22 @@
 )
 from swagger_wrapper import swagger_route, get_auth_security

+
+def _authorize_frontend_personal_conversation_access(user_id, conversation_id):
+    """Load a personal conversation and ensure the caller owns it."""
+    try:
+        conversation_item = cosmos_conversations_container.read_item(
+            item=conversation_id,
+            partition_key=conversation_id,
+        )
+    except CosmosResourceNotFoundError as exc:
+        raise LookupError(f"Conversation {conversation_id} not found") from exc
+
+    if conversation_item.get('user_id') != user_id:
+        raise PermissionError('Forbidden')
+
+    return conversation_item
+
 def register_route_frontend_conversations(app):
     @app.route('/conversations')
     @swagger_route(security=get_auth_security())
@@ -41,12 +57,11 @@ def view_conversation(conversation_id):
         if not user_id:
             return redirect(url_for('login'))
         try:
-            conversation_item = cosmos_conversations_container.read_item(
-                item=conversation_id,
-                partition_key=conversation_id
-            )
-        except Exception:
+            _authorize_frontend_personal_conversation_access(user_id, conversation_id)
+        except LookupError:
             return "Conversation not found", 404
+        except PermissionError:
+            return "Forbidden", 403

         message_query = f"""
             SELECT * FROM c
@@ -70,9 +85,11 @@ def get_conversation_messages(conversation_id):
             return jsonify({'error': 'User not authenticated'}), 401

         try:
-            _ = cosmos_conversations_container.read_item(conversation_id, conversation_id)
-        except CosmosResourceNotFoundError:
+            _authorize_frontend_personal_conversation_access(user_id, conversation_id)
+        except LookupError:
             return jsonify({'error': 'Conversation not found'}), 404
+        except PermissionError:
+            return jsonify({'error': 'Forbidden'}), 403

         msg_query = f"""
             SELECT * FROM c
diff --git a/application/single_app/route_frontend_group_workspaces.py b/application/single_app/route_frontend_group_workspaces.py
index e92d5a980..6e3186f62 100644
--- a/application/single_app/route_frontend_group_workspaces.py
+++ b/application/single_app/route_frontend_group_workspaces.py
@@ -2,7 +2,7 @@
 from config import *
 from functions_authentication import *
-from functions_group import get_group_model_endpoints
+from functions_group import get_group_model_endpoints, require_active_group, update_active_group_for_user
 from functions_settings import *
 from swagger_wrapper import swagger_route, get_auth_security

@@ -18,7 +18,10 @@ def group_workspaces():
         settings = get_settings()
         user_settings = get_user_settings(user_id)
         public_settings = sanitize_settings_for_user(settings)
-        active_group_id = user_settings.get("settings", {}).get("activeGroupOid")
+        try:
+            active_group_id = require_active_group(user_id)
+        except (ValueError, LookupError, PermissionError):
+            active_group_id = None
         enable_document_classification = settings.get('enable_document_classification', False)
         enable_file_sharing = settings.get('enable_file_sharing', False)
         enable_extract_meta_data = settings.get('enable_extract_meta_data', False)
@@ -97,7 +100,12 @@ def set_active_group():
         group_id = request.form.get("group_id")
         if not user_id or not group_id:
             return "Missing user or group id", 400
-        success = update_user_settings(user_id, {"activeGroupOid": group_id})
-        if not success:
-            return "Failed to update user settings", 500
+
+        try:
+            update_active_group_for_user(group_id, user_id=user_id)
+        except LookupError:
+            return "Group not found", 404
+        except PermissionError:
+            return "You are not a member of this group", 403
+
         return redirect(url_for('group_workspaces'))
diff --git a/application/single_app/route_frontend_public_workspaces.py b/application/single_app/route_frontend_public_workspaces.py
index 05d5b982a..fa1178d64 100644
--- a/application/single_app/route_frontend_public_workspaces.py
+++ b/application/single_app/route_frontend_public_workspaces.py
@@ -2,6 +2,7 @@
 from config import *
 from functions_authentication import *
+from functions_public_workspaces import update_active_public_workspace_for_user
 from functions_settings import *
 from swagger_wrapper import swagger_route, get_auth_security

@@ -116,7 +117,10 @@ def set_active_public_workspace():
         workspace_id = request.form.get("workspace_id")
         if not user_id or not workspace_id:
             return "Missing user or workspace id", 400
-        success = update_user_settings(user_id, {"activePublicWorkspaceOid": workspace_id})
-        if not success:
-            return "Failed to update user settings", 500
+
+        try:
+            update_active_public_workspace_for_user(user_id, workspace_id)
+        except LookupError:
+            return "Workspace not found", 404
+
         return redirect(url_for('public_workspaces'))
\ No newline at end of file
diff --git a/application/single_app/route_plugin_logging.py b/application/single_app/route_plugin_logging.py
index 940d540ea..69b7ce35b 100644
--- a/application/single_app/route_plugin_logging.py
+++ b/application/single_app/route_plugin_logging.py
@@ -4,7 +4,7 @@
 """
 from flask import Blueprint, jsonify, request
-from functions_authentication import login_required, get_current_user_id
+from functions_authentication import admin_required, login_required, get_current_user_id
 from functions_appinsights import log_event
 from semantic_kernel_plugins.plugin_invocation_logger import get_plugin_logger
 from swagger_wrapper import swagger_route, get_auth_security
@@ -122,10 +122,10 @@ def get_plugin_stats():
     security=get_auth_security()
 )
 @login_required
+@admin_required
 def get_recent_invocations():
     """Get the most recent plugin invocations across all users (admin only)."""
     try:
-        # Note: You might want to add admin role checking here
         plugin_logger = get_plugin_logger()
         limit = request.args.get('limit', 20, type=int)
@@ -220,6 +220,7 @@ def get_plugin_specific_invocations(plugin_name):
     security=get_auth_security()
 )
 @login_required
+@admin_required
 def clear_plugin_logs():
     """Clear plugin invocation logs (admin only or for testing)."""
     try:
diff --git a/application/single_app/semantic_kernel_loader.py b/application/single_app/semantic_kernel_loader.py
index f66cfca7e..14dcc0d37 100644
--- a/application/single_app/semantic_kernel_loader.py
+++ b/application/single_app/semantic_kernel_loader.py
@@ -36,7 +36,16 @@
 from semantic_kernel_plugins.smart_http_plugin import SmartHttpPlugin
 from functions_debug import debug_print
 from flask import g
-from functions_keyvault import SecretReturnType, keyvault_model_endpoint_get_helper, retrieve_secret_from_key_vault, retrieve_secret_from_key_vault_by_full_name, validate_secret_name_dynamic
+from functions_keyvault import (
+    SQL_PLUGIN_SENSITIVE_ADDITIONAL_FIELDS,
+    SQL_PLUGIN_SENSITIVE_AUTH_FIELDS,
+    SecretReturnType,
+    keyvault_model_endpoint_get_helper,
+    resolve_secret_reference_for_context,
+    retrieve_secret_from_key_vault,
+    retrieve_secret_from_key_vault_by_full_name,
+    validate_secret_name_dynamic,
+)
 from functions_global_actions import get_global_actions
 from functions_global_agents import get_global_agents
 from functions_group_agents import get_group_agent, get_group_agents
@@ -1595,6 +1604,27 @@ def create_chat_completion_service():
     log_event(f"[SK Loader] load_single_agent_for_kernel completed - returning {len(agent_objs)} agents: {list(agent_objs.keys())}", level=logging.INFO)
     return kernel, agent_objs

+def _get_plugin_secret_context(plugin_manifest):
+    """Infer the expected Key Vault scope for a plugin manifest."""
+    if not isinstance(plugin_manifest, dict):
+        return None, None
+
+    plugin_scope = str(plugin_manifest.get("scope") or "").strip().lower()
+    if plugin_scope == "group" or plugin_manifest.get("is_group"):
+        return plugin_manifest.get("group_id"), "group"
+    if plugin_scope == "global" or plugin_manifest.get("is_global"):
+        return plugin_manifest.get("id"), "global"
+    if plugin_scope == "user" or plugin_manifest.get("user_id"):
+        return plugin_manifest.get("user_id"), "user"
+    return plugin_manifest.get("id"), "global"
+
+
+def _is_sql_sensitive_plugin_field(plugin_manifest, field_name):
+    """Return True when an additional field should resolve as a SQL secret."""
+    plugin_type = str((plugin_manifest or {}).get("type") or "").strip().lower()
+    return plugin_type in {"sql_query", "sql_schema"} and field_name in SQL_PLUGIN_SENSITIVE_ADDITIONAL_FIELDS
+
+
 def resolve_key_vault_secrets_in_plugins(plugin_manifest, settings):
     """
     Resolve any Key Vault secrets in a plugin manifest.
@@ -1606,26 +1636,66 @@ def resolve_key_vault_secrets_in_plugins(plugin_manifest, settings):
     if not kv_name:
         raise ValueError("Key Vault name not configured in settings")

-    def resolve_value(value):
-        if isinstance(value, str) and validate_secret_name_dynamic(value):
-            resolved = retrieve_secret_from_key_vault_by_full_name(value)
-            if resolved:
-                return resolved
-            else:
-                raise ValueError(f"Failed to retrieve secret '{value}' from Key Vault '{kv_name}'")
-        return value
-
-    resolved_manifest = {}
-    for k, v in plugin_manifest.items():
-        debug_print(f"[SK Loader] Resolving plugin manifest key: {k} with value type: {type(v)}")
-        if isinstance(v, str):
-            resolved_manifest[k] = resolve_value(v)
-        elif isinstance(v, list):
-            resolved_manifest[k] = [resolve_value(item) for item in v]
-        elif isinstance(v, dict):
-            resolved_manifest[k] = {sub_k: resolve_value(sub_v) for sub_k, sub_v in v.items()}
-        else:
-            resolved_manifest[k] = v  # Leave other types unchanged
+    scope_value, scope = _get_plugin_secret_context(plugin_manifest)
+    resolved_manifest = dict(plugin_manifest)
+
+    auth = plugin_manifest.get("auth", {})
+    if isinstance(auth, dict):
+        resolved_auth = dict(auth)
+        for auth_field in ("key", *SQL_PLUGIN_SENSITIVE_AUTH_FIELDS):
+            value = auth.get(auth_field)
+            if not isinstance(value, str) or not validate_secret_name_dynamic(value):
+                continue
+            try:
+                resolved_auth[auth_field] = resolve_secret_reference_for_context(
+                    value,
+                    scope_value=scope_value,
+                    scope=scope,
+                    allowed_sources={"action"},
+                    context_label=f"plugin auth field '{auth_field}'",
+                )
+            except ValueError as exc:
+                log_event(
+                    f"[SK Loader] Blocked plugin auth secret resolution for field '{auth_field}': {exc}",
+                    extra={
+                        "plugin_name": plugin_manifest.get("name"),
+                        "plugin_id": plugin_manifest.get("id"),
+                        "scope": scope,
+                    },
+                    level=logging.WARNING,
+                )
+                resolved_auth[auth_field] = ""
+        resolved_manifest["auth"] = resolved_auth
+
+    additional_fields = plugin_manifest.get("additionalFields", {})
+    if isinstance(additional_fields, dict):
+        resolved_additional_fields = dict(additional_fields)
+        for field_name, value in additional_fields.items():
+            if not isinstance(value, str) or not validate_secret_name_dynamic(value):
+                continue
+            if not (field_name.endswith("__Secret") or _is_sql_sensitive_plugin_field(plugin_manifest, field_name)):
+                continue
+            try:
+                resolved_additional_fields[field_name] = resolve_secret_reference_for_context(
+                    value,
+                    scope_value=scope_value,
+                    scope=scope,
+                    allowed_sources={"action-addset"},
+                    context_label=f"plugin additional field '{field_name}'",
+                )
+            except ValueError as exc:
+                log_event(
+                    f"[SK Loader] Blocked plugin additionalField secret resolution for '{field_name}': {exc}",
+                    extra={
+                        "plugin_name": plugin_manifest.get("name"),
+                        "plugin_id": plugin_manifest.get("id"),
+                        "scope": scope,
+                    },
+                    level=logging.WARNING,
+                )
+                resolved_additional_fields[field_name] = ""
+        resolved_manifest["additionalFields"] = resolved_additional_fields
+
     return resolved_manifest

 def load_plugins_for_kernel(kernel, plugin_manifests, settings, mode_label="global"):
diff --git a/application/single_app/semantic_kernel_plugins/fact_memory_plugin.py b/application/single_app/semantic_kernel_plugins/fact_memory_plugin.py
index 188d16fac..f38bd37ed 100644
--- a/application/single_app/semantic_kernel_plugins/fact_memory_plugin.py
+++ b/application/single_app/semantic_kernel_plugins/fact_memory_plugin.py
@@ -5,8 +5,12 @@
 - Exposes methods for use as a Semantic Kernel plugin (does not need to derive from BasePlugin).
 - Read/inject logic is handled separately by orchestration utility.
""" +import logging +from flask import g, has_request_context from typing import Optional, List +from functions_appinsights import log_event +from functions_authentication import get_current_user_id from semantic_kernel.functions import kernel_function from semantic_kernel_fact_memory_store import FactMemoryStore @@ -18,6 +22,82 @@ def __init__(self, store: Optional[FactMemoryStore] = None): self.store = store or FactMemoryStore() auto_wrap_plugin_functions(self, self.__class__.__name__) + def _get_authorized_fact_memory_scope(self) -> dict: + """Return the canonical request-scoped fact-memory authorization boundary.""" + if not has_request_context(): + raise PermissionError('Fact memory requires an active request context.') + + current_user_id = str(get_current_user_id() or '').strip() + if not current_user_id: + raise PermissionError('User not authenticated.') + + authorized_context = dict(getattr(g, 'authorized_chat_context', {}) or {}) + authorized_user_id = str(authorized_context.get('user_id') or current_user_id).strip() + if authorized_user_id != current_user_id: + authorized_user_id = current_user_id + + authorized_scope_id = str( + authorized_context.get('fact_memory_scope_id') + or authorized_context.get('active_group_id') + or current_user_id + ).strip() + authorized_scope_type = str( + authorized_context.get('fact_memory_scope_type') + or ('group' if authorized_context.get('active_group_id') else 'user') + ).strip().lower() + if authorized_scope_type not in {'user', 'group'}: + authorized_scope_type = 'user' + + authorized_conversation_id = str( + authorized_context.get('conversation_id') or getattr(g, 'conversation_id', '') or '' + ).strip() or None + + return { + 'user_id': authorized_user_id, + 'scope_id': authorized_scope_id, + 'scope_type': authorized_scope_type, + 'conversation_id': authorized_conversation_id, + } + + def _resolve_authorized_fact_memory_call( + self, + scope_type: str = '', + scope_id: str = '', + conversation_id: str = '', + ) -> 
dict: + """Normalize tool-call scope arguments against the authorized request scope.""" + authorized_scope = self._get_authorized_fact_memory_scope() + requested_scope_type = str(scope_type or '').strip().lower() + requested_scope_id = str(scope_id or '').strip() + requested_conversation_id = str(conversation_id or '').strip() + + if ( + (requested_scope_type and requested_scope_type != authorized_scope['scope_type']) + or (requested_scope_id and requested_scope_id != authorized_scope['scope_id']) + ): + log_event( + '[FactMemoryPlugin] Overriding mismatched fact-memory scope in tool call.', + extra={ + 'requested_scope_type': requested_scope_type, + 'requested_scope_id': requested_scope_id, + 'authorized_scope_type': authorized_scope['scope_type'], + 'authorized_scope_id': authorized_scope['scope_id'], + }, + level=logging.WARNING, + ) + + if requested_conversation_id and requested_conversation_id != authorized_scope['conversation_id']: + log_event( + '[FactMemoryPlugin] Overriding mismatched fact-memory conversation_id in tool call.', + extra={ + 'requested_conversation_id': requested_conversation_id, + 'authorized_conversation_id': authorized_scope['conversation_id'], + }, + level=logging.WARNING, + ) + + return authorized_scope + @kernel_function( description=""" Store a fact for the given agent, scope, and conversation. @@ -39,11 +119,16 @@ def set_fact(self, scope_type: str, scope_id: str, value: str, conversation_id: """ Store a fact for the given agent, scope, and conversation. 
""" - return self.store.set_fact( + authorized_scope = self._resolve_authorized_fact_memory_call( scope_type=scope_type, scope_id=scope_id, - value=value, conversation_id=conversation_id, + ) + return self.store.set_fact( + scope_type=authorized_scope['scope_type'], + scope_id=authorized_scope['scope_id'], + value=value, + conversation_id=authorized_scope['conversation_id'], agent_id=agent_id, memory_type=memory_type, ) @@ -56,8 +141,9 @@ def update_fact(self, scope_id: str, fact_id: str, value: str, memory_type: str """ Update a fact value by its unique id and scope_id partition key. """ + authorized_scope = self._resolve_authorized_fact_memory_call(scope_id=scope_id) update_kwargs = { - 'scope_id': scope_id, + 'scope_id': authorized_scope['scope_id'], 'fact_id': fact_id, 'value': value, } @@ -77,8 +163,9 @@ def delete_fact(self, scope_id: str, fact_id: str) -> bool: """ Delete a fact by its unique id and the scope_id which is the partition key. """ + authorized_scope = self._resolve_authorized_fact_memory_call(scope_id=scope_id) return self.store.delete_fact( - scope_id=scope_id, + scope_id=authorized_scope['scope_id'], fact_id=fact_id ) @@ -100,7 +187,11 @@ def get_facts(self, scope_type: str, scope_id: str,) -> List[dict]: """ Retrieve all facts for the user. Facts are persistent values that provide important context, background knowledge, or user preferences to the AI agent. Use this to get all facts that will be injected as context for the agent. 
""" - return self.store.get_facts( + authorized_scope = self._resolve_authorized_fact_memory_call( scope_type=scope_type, scope_id=scope_id, ) + return self.store.get_facts( + scope_type=authorized_scope['scope_type'], + scope_id=authorized_scope['scope_id'], + ) diff --git a/application/single_app/semantic_kernel_plugins/log_analytics_plugin.py b/application/single_app/semantic_kernel_plugins/log_analytics_plugin.py index f98c0efaa..7c5a69326 100644 --- a/application/single_app/semantic_kernel_plugins/log_analytics_plugin.py +++ b/application/single_app/semantic_kernel_plugins/log_analytics_plugin.py @@ -193,7 +193,6 @@ def _generate_metadata(self) -> Dict[str, Any]: "description": "Run a KQL (Kusto Query Language) query against the Log Analytics workspace and return the results. Results are chunked for LLMs if needed. Accepts an optional timespan parameter (timedelta, tuple, or hours).", "parameters": [ {"name": "query", "type": "string", "description": "The KQL query string to execute.", "required": True}, - {"name": "user_id", "type": "string", "description": "User ID for query history tracking (optional).", "required": False}, {"name": "timespan", "type": "any", "description": "Query timespan: timedelta, (start, end) tuple, or number of hours (optional).", "required": False} ], "returns": {"type": "list[object]", "description": "A list of result rows, each as a dictionary of column values."} @@ -210,8 +209,7 @@ def _generate_metadata(self) -> Dict[str, Any]: "name": "get_query_history", "description": "Return the last N queries run by this plugin instance for the current user. 
Useful for re-running or editing previous queries.", "parameters": [ - {"name": "limit", "type": "integer", "description": "Number of queries to return (default 20).", "required": False}, - {"name": "user_id", "type": "string", "description": "User ID for query history tracking (optional).", "required": False} + {"name": "limit", "type": "integer", "description": "Number of queries to return (default 20).", "required": False} ], "returns": {"type": "list[string]", "description": "A list of previous KQL queries, most recent last."} }, @@ -228,6 +226,21 @@ def _generate_metadata(self) -> Dict[str, Any]: def get_functions(self) -> List[str]: return [m["name"] for m in self._metadata["methods"]] + + def _get_authenticated_history_user_id(self) -> Optional[str]: + """Return the authenticated user id for query-history persistence.""" + try: + from application.single_app.functions_authentication import get_current_user_id + except ImportError: + from functions_authentication import get_current_user_id + + try: + user_id = str(get_current_user_id() or "").strip() + except Exception as exc: + logging.warning(f"[LA] Could not resolve authenticated user for query history: {exc}") + return None + + return user_id or None @plugin_function_logger("LogAnalyticsPlugin") @kernel_function(description="Return a dictionary of all tables and their schemas (column names and types, including Properties virtual columns) in the connected Azure Log Analytics workspace. This combines list_tables and get_table_schema for efficient schema discovery.") @@ -394,14 +407,13 @@ def col_name(col): return schema @plugin_function_logger("LogAnalyticsPlugin") - @kernel_function(description="Execute a KQL (Kusto Query Language) query against a specific table in the Log Analytics workspace and return the results as a list of rows (each as a dictionary of column values). Use this function after discovering available tables and their schemas to retrieve data. 
Accepts an optional timespan parameter to limit the query window as a timedelta, tuple of datetimes, or number of hours. Limitations on returns should be specified in the query (ex: take N). Always provide user_id to enable saving the query to Cosmos DB for user history tracking.") + @kernel_function(description="Execute a KQL (Kusto Query Language) query against a specific table in the Log Analytics workspace and return the results as a list of rows (each as a dictionary of column values). Use this function after discovering available tables and their schemas to retrieve data. Accepts an optional timespan parameter to limit the query window as a timedelta, tuple of datetimes, or number of hours. Limitations on returns should be specified in the query (ex: take N).") def run_query( self, query: str, - user_id: Optional[str] = None, timespan: Optional[Any] = None ) -> Any: - logging.debug(f"[LA] Running query: {query} with user_id={user_id}, timespan={timespan}") + logging.debug(f"[LA] Running query: {query} with timespan={timespan}") if not self._client: raise RuntimeError("Log Analytics client not initialized.") # Determine if this is a control command (starts with '.') @@ -477,9 +489,9 @@ def col_name(col): logging.error(f"[LA] Error processing query results: {e}") return {"error": "Failed to process query results."} finally: - # Save to Cosmos query history if user_id is provided - if user_id: - self._save_query_history_to_cosmos(user_id, query) + history_user_id = self._get_authenticated_history_user_id() + if history_user_id: + self._save_query_history_to_cosmos(history_user_id, query) @plugin_function_logger("LogAnalyticsPlugin") @kernel_function(description="Summarize a result set for LLM consumption, including row count and column names.") @@ -492,7 +504,8 @@ def summarize_results(self, results: List[Dict[str, Any]]) -> str: @plugin_function_logger("LogAnalyticsPlugin") @kernel_function(description="Return the last N queries run by this plugin instance. 
They should be numbered for the user to allow easy selection.") - def get_query_history(self, limit: int = 20, user_id: Optional[str] = None) -> List[str]: + def get_query_history(self, limit: int = 20) -> List[str]: + user_id = self._get_authenticated_history_user_id() if not user_id: return [] return self._get_query_history_from_cosmos(user_id, limit) diff --git a/application/single_app/semantic_kernel_plugins/tabular_processing_plugin.py b/application/single_app/semantic_kernel_plugins/tabular_processing_plugin.py index 344d092a5..f76edff81 100644 --- a/application/single_app/semantic_kernel_plugins/tabular_processing_plugin.py +++ b/application/single_app/semantic_kernel_plugins/tabular_processing_plugin.py @@ -15,11 +15,15 @@ import re import warnings import pandas +from flask import g, has_request_context from typing import Annotated, Dict, List, Optional, Set from urllib.parse import urlsplit, urlunsplit from semantic_kernel.functions import kernel_function from semantic_kernel_plugins.plugin_invocation_logger import plugin_function_logger from functions_appinsights import log_event +from functions_authentication import get_current_user_id +from functions_group import find_group_by_id, get_user_role_in_group +from functions_public_workspaces import get_user_visible_public_workspace_ids_from_settings from config import ( CLIENTS, TABULAR_EXTENSIONS, @@ -187,6 +191,179 @@ def _get_blob_service_client(self): raise RuntimeError("Blob storage client not available. 
Enhanced citations must be enabled.") return client + def _get_authorized_chat_context(self) -> dict: + """Return the canonical request-scoped authorization context for tabular access.""" + if not has_request_context(): + raise PermissionError('Tabular processing requires an active request context.') + + current_user_id = str(get_current_user_id() or '').strip() + if not current_user_id: + raise PermissionError('User not authenticated.') + + authorized_context = dict(getattr(g, 'authorized_chat_context', {}) or {}) + authorized_user_id = str(authorized_context.get('user_id') or current_user_id).strip() + if authorized_user_id != current_user_id: + authorized_user_id = current_user_id + + authorized_conversation_id = str( + authorized_context.get('conversation_id') or getattr(g, 'conversation_id', '') or '' + ).strip() + if not authorized_conversation_id: + raise PermissionError('Conversation context unavailable for tabular processing.') + + active_group_ids = [ + str(group_id or '').strip() + for group_id in (authorized_context.get('active_group_ids') or []) + if str(group_id or '').strip() + ] + active_public_workspace_ids = [ + str(workspace_id or '').strip() + for workspace_id in (authorized_context.get('active_public_workspace_ids') or []) + if str(workspace_id or '').strip() + ] + + return { + 'user_id': authorized_user_id, + 'conversation_id': authorized_conversation_id, + 'active_group_ids': active_group_ids, + 'active_group_id': str(authorized_context.get('active_group_id') or '').strip() or None, + 'active_public_workspace_ids': active_public_workspace_ids, + 'active_public_workspace_id': ( + str(authorized_context.get('active_public_workspace_id') or '').strip() or None + ), + } + + def _resolve_authorized_scope_arguments( + self, + user_id: str, + conversation_id: str, + group_id: Optional[str] = None, + public_workspace_id: Optional[str] = None, + ) -> dict: + """Normalize tool-call scope arguments against the current authorized request context.""" + 
authorized_context = self._get_authorized_chat_context() + requested_user_id = str(user_id or '').strip() + requested_conversation_id = str(conversation_id or '').strip() + requested_group_id = str(group_id or '').strip() + requested_public_workspace_id = str(public_workspace_id or '').strip() + + if requested_user_id and requested_user_id != authorized_context['user_id']: + log_event( + '[TabularProcessingPlugin] Ignoring mismatched user_id in tool call.', + extra={ + 'requested_user_id': requested_user_id, + 'authorized_user_id': authorized_context['user_id'], + }, + level=logging.WARNING, + ) + + if requested_conversation_id and requested_conversation_id != authorized_context['conversation_id']: + log_event( + '[TabularProcessingPlugin] Ignoring mismatched conversation_id in tool call.', + extra={ + 'requested_conversation_id': requested_conversation_id, + 'authorized_conversation_id': authorized_context['conversation_id'], + }, + level=logging.WARNING, + ) + + resolved_group_id = None + if requested_group_id: + if not self._is_authorized_group_scope( + authorized_context['user_id'], + requested_group_id, + authorized_context=authorized_context, + ): + raise PermissionError('Tabular processing cannot access that group scope.') + resolved_group_id = requested_group_id + + resolved_public_workspace_id = None + if requested_public_workspace_id: + if not self._is_authorized_public_workspace_scope( + authorized_context['user_id'], + requested_public_workspace_id, + authorized_context=authorized_context, + ): + raise PermissionError('Tabular processing cannot access that public workspace scope.') + resolved_public_workspace_id = requested_public_workspace_id + + authorized_context['user_id'] = authorized_context['user_id'] + authorized_context['conversation_id'] = authorized_context['conversation_id'] + authorized_context['group_id'] = resolved_group_id + authorized_context['public_workspace_id'] = resolved_public_workspace_id + return authorized_context + + def 
_is_authorized_group_scope( + self, + user_id: str, + group_id: str, + authorized_context: Optional[dict] = None, + ) -> bool: + """Return True when the current user may access the requested group scope.""" + normalized_group_id = str(group_id or '').strip() + if not normalized_group_id: + return False + + if authorized_context and normalized_group_id in set(authorized_context.get('active_group_ids') or []): + return True + + group_doc = find_group_by_id(normalized_group_id) + return bool(group_doc and get_user_role_in_group(group_doc, user_id)) + + def _is_authorized_public_workspace_scope( + self, + user_id: str, + public_workspace_id: str, + authorized_context: Optional[dict] = None, + ) -> bool: + """Return True when the current user may access the requested public workspace scope.""" + normalized_public_workspace_id = str(public_workspace_id or '').strip() + if not normalized_public_workspace_id: + return False + + if authorized_context and normalized_public_workspace_id in set( + authorized_context.get('active_public_workspace_ids') or [] + ): + return True + + visible_public_workspace_ids = set( + get_user_visible_public_workspace_ids_from_settings(user_id) or [] + ) + return normalized_public_workspace_id in visible_public_workspace_ids + + def _is_authorized_blob_location(self, container_name: str, blob_path: str, authorized_context: dict) -> bool: + """Ensure remembered blob locations still fall within the caller's authorized request scope.""" + source = self._infer_source_from_container(container_name) + blob_parts = [part for part in str(blob_path or '').split('/') if part] + if not source or not blob_parts: + return False + + if source == 'workspace': + return blob_parts[0] == authorized_context['user_id'] + + if source == 'chat': + return ( + len(blob_parts) >= 2 + and blob_parts[0] == authorized_context['user_id'] + and blob_parts[1] == authorized_context['conversation_id'] + ) + + if source == 'group': + return self._is_authorized_group_scope( + 
authorized_context['user_id'], + blob_parts[0], + authorized_context=authorized_context, + ) + + if source == 'public': + return self._is_authorized_public_workspace_scope( + authorized_context['user_id'], + blob_parts[0], + authorized_context=authorized_context, + ) + + return False + def _list_tabular_blobs(self, container_name: str, prefix: str) -> List[str]: """List all tabular file blobs under a given prefix.""" client = self._get_blob_service_client() @@ -2754,9 +2931,25 @@ def _resolve_blob_location(self, user_id: str, conversation_id: str, filename: s def _resolve_blob_location_with_fallback(self, user_id: str, conversation_id: str, filename: str, source: str, group_id: str = None, public_workspace_id: str = None) -> tuple: """Try primary source first, then fall back to other containers if blob not found.""" + authorized_context = self._resolve_authorized_scope_arguments( + user_id, + conversation_id, + group_id=group_id, + public_workspace_id=public_workspace_id, + ) + user_id = authorized_context['user_id'] + conversation_id = authorized_context['conversation_id'] + group_id = authorized_context['group_id'] + public_workspace_id = authorized_context['public_workspace_id'] source = source.lower().strip() + + if source == 'group' and not group_id: + group_id = authorized_context['active_group_id'] + if source == 'public' and not public_workspace_id: + public_workspace_id = authorized_context['active_public_workspace_id'] + override = self._get_resolved_blob_location_override(source, filename) - if override: + if override and self._is_authorized_blob_location(override[0], override[1], authorized_context): return override attempts = [] @@ -2811,6 +3004,29 @@ async def list_tabular_files( public_workspace_id: Annotated[Optional[str], "Public workspace ID (for public workspace documents)"] = None, ) -> Annotated[str, "JSON list of available tabular files"]: """List all tabular files available for the user across all accessible containers.""" + try: + 
authorized_context = self._resolve_authorized_scope_arguments( + user_id, + conversation_id, + group_id=group_id, + public_workspace_id=public_workspace_id, + ) + except PermissionError as exc: + log_event( + f"[TabularProcessingPlugin] Denied tabular file listing: {exc}", + level=logging.WARNING, + extra={ + 'requested_group_id': group_id, + 'requested_public_workspace_id': public_workspace_id, + }, + ) + return json.dumps({"error": str(exc)}) + + user_id = authorized_context['user_id'] + conversation_id = authorized_context['conversation_id'] + group_id = authorized_context['group_id'] + public_workspace_id = authorized_context['public_workspace_id'] + def _sync_work(): results = [] try: diff --git a/application/single_app/static/images/custom_logo.png b/application/single_app/static/images/custom_logo.png index ecf6e6521..a5b440e93 100644 Binary files a/application/single_app/static/images/custom_logo.png and b/application/single_app/static/images/custom_logo.png differ diff --git a/application/single_app/static/images/custom_logo_dark.png b/application/single_app/static/images/custom_logo_dark.png index 4f2819457..df9485a93 100644 Binary files a/application/single_app/static/images/custom_logo_dark.png and b/application/single_app/static/images/custom_logo_dark.png differ diff --git a/application/single_app/static/js/admin/admin_agents.js b/application/single_app/static/js/admin/admin_agents.js index 5c1daed86..6f9c41458 100644 --- a/application/single_app/static/js/admin/admin_agents.js +++ b/application/single_app/static/js/admin/admin_agents.js @@ -20,6 +20,12 @@ let orchestrationSettings = {}; let agents = []; let selectedAgent = null; +function escapeHtml(text) { + const div = document.createElement('div'); + div.textContent = text ?? 
''; + return div.innerHTML; +} + // --- Function Definitions --- async function loadAllAdminAgentData() { @@ -274,10 +280,13 @@ function renderAgentsTable() { const isSelected = selectedAgent && agent.name === selectedAgent; const tr = document.createElement('tr'); let selectedBadge = isSelected ? 'Selected' : ''; + const safeName = escapeHtml(agent.name || ''); + const safeDisplayName = escapeHtml(agent.display_name || ''); + const safeDescription = escapeHtml(agent.description || ''); tr.innerHTML = ` - ${agent.name} - ${agent.display_name} - ${agent.description || ''} + ${safeName} + ${safeDisplayName} + ${safeDescription} ${selectedBadge} diff --git a/application/single_app/static/js/agent_templates_gallery.js b/application/single_app/static/js/agent_templates_gallery.js index 428ebf702..c0d06248a 100644 --- a/application/single_app/static/js/agent_templates_gallery.js +++ b/application/single_app/static/js/agent_templates_gallery.js @@ -142,7 +142,12 @@ function renderAccordion(accordion, templates, options = {}) { if (Array.isArray(template.actions_to_load) && template.actions_to_load.length) { const actionLine = document.createElement("p"); actionLine.className = "mb-0 text-muted small"; - actionLine.innerHTML = `Recommended actions: ${template.actions_to_load.join(", ")}`; + const actionLabel = document.createElement("strong"); + actionLabel.textContent = "Recommended actions:"; + actionLine.appendChild(actionLabel); + actionLine.appendChild( + document.createTextNode(` ${template.actions_to_load.map((action) => String(action)).join(", ")}`) + ); metaList.appendChild(actionLine); } diff --git a/application/single_app/static/js/chat/chat-citations.js b/application/single_app/static/js/chat/chat-citations.js index 9d751ffd9..4fa3f60ac 100644 --- a/application/single_app/static/js/chat/chat-citations.js +++ b/application/single_app/static/js/chat/chat-citations.js @@ -195,7 +195,7 @@ export function showCitedTextPopup(citedText, fileName, pageNumber) { `; } @@ 
-76,6 +77,7 @@ export async function showConversationDetails(conversationId) { */ function renderConversationMetadata(metadata, conversationId) { const { context = [], tags = [], strict = false, classification = [], last_updated, chat_type = 'personal', is_pinned = false, is_hidden = false, scope_locked, locked_contexts = [], summary = null } = metadata; + const safeConversationId = escapeHtml(conversationId); // Organize tags by category const tagsByCategory = { @@ -102,7 +104,7 @@ function renderConversationMetadata(metadata, conversationId) {
Summary
- ${summary ? `Generated ${formatDate(summary.generated_at)}${summary.model_deployment ? ` · ${summary.model_deployment}` : ''}` : ''} + ${summary ? `Generated ${formatDate(summary.generated_at)}${summary.model_deployment ? ` · ${escapeHtml(summary.model_deployment)}` : ''}` : ''}
${renderSummaryContent(summary, conversationId)} @@ -118,7 +120,7 @@ function renderConversationMetadata(metadata, conversationId) {
- Conversation ID: ${conversationId} + Conversation ID: ${safeConversationId}
Last Updated: ${formatDate(last_updated)} @@ -256,17 +258,20 @@ function renderContextSection(context) { if (primary) { const displayName = primary.name || primary.id; const isGroupChat = primary.scope === 'group'; + const safeDisplayName = escapeHtml(displayName); + const safePrimaryScope = escapeHtml(primary.scope); + const safePrimaryId = escapeHtml(primary.id); html += `
Primary Context:
- ${primary.scope} + ${safePrimaryScope} ${isGroupChat ? 'single-user' : ''} - ${displayName} + ${safeDisplayName}
- ${primary.name ? `
ID: ${primary.id}
` : ''} + ${primary.name ? `
ID: ${safePrimaryId}
` : ''}
`; @@ -281,11 +286,14 @@ function renderContextSection(context) { secondary.forEach(ctx => { const displayName = ctx.name || ctx.id; + const safeDisplayName = escapeHtml(displayName); + const safeScope = escapeHtml(ctx.scope); + const safeContextId = escapeHtml(ctx.id); html += `
- ${ctx.scope} - ${displayName} - ${ctx.name ? `
ID: ${ctx.id}
` : ''} + ${safeScope} + ${safeDisplayName} + ${ctx.name ? `
ID: ${safeContextId}
` : ''}
`; }); @@ -305,15 +313,19 @@ function renderParticipantsSection(participants) { participants.forEach(participant => { const initials = (participant.name || 'U').slice(0, 2).toUpperCase(); const avatarId = `participant-avatar-${participant.user_id}`; + const safeAvatarId = escapeHtml(avatarId); + const safeInitials = escapeHtml(initials); + const safeParticipantName = escapeHtml(participant.name || 'Unknown User'); + const safeParticipantEmail = escapeHtml(participant.email || ''); html += `
-
- ${initials} +
+ ${safeInitials}
-
${participant.name || 'Unknown User'}
- ${participant.email || ''} +
${safeParticipantName}
+ ${safeParticipantEmail}
`; @@ -337,7 +349,7 @@ async function loadParticipantProfileImage(userId) { if (!avatarElement) return; try { - const response = await fetch(`/api/user/profile-image/${userId}`); + const response = await fetch(`/api/user/profile-image/${encodeURIComponent(userId)}`); if (!response.ok) throw new Error('Failed to load user profile image'); const userData = await response.json(); @@ -380,7 +392,7 @@ function renderModelsAndAgentsSection(models, agents) { if (models.length > 0) { html += '
Models:
'; models.forEach(model => { - html += `${model.value}`; + html += `${escapeHtml(model.value)}`; }); html += '
'; } @@ -388,7 +400,7 @@ function renderModelsAndAgentsSection(models, agents) { if (agents.length > 0) { html += '
Agents:
'; agents.forEach(agent => { - html += `${agent.value}`; + html += `${escapeHtml(agent.value)}`; }); html += '
'; } @@ -407,6 +419,11 @@ function renderDocumentsSection(documents) { const chunkCount = doc.chunk_ids ? doc.chunk_ids.length : 0; const documentTitle = doc.title || doc.document_id; const scopeName = doc.scope?.name || doc.scope?.id || 'Unknown'; + const safeClassification = escapeHtml(doc.classification || 'None'); + const safeDocumentId = escapeHtml(doc.document_id || 'Unknown Document'); + const safeDocumentTitle = escapeHtml(documentTitle); + const safeScopeName = escapeHtml(scopeName); + const safeScopeType = escapeHtml(doc.scope?.type || 'Unknown'); // Format document classification with custom colors const allCategories = window.classification_categories || []; @@ -415,15 +432,15 @@ function renderDocumentsSection(documents) { if (category) { const textClass = isColorLight(category.color) ? 'text-dark' : 'text-white'; - classificationHtml = `${doc.classification}`; + classificationHtml = `${safeClassification}`; } else { - classificationHtml = `${doc.classification}`; + classificationHtml = `${safeClassification}`; } html += `
-
${documentTitle}
+
${safeDocumentTitle}
${classificationHtml}
@@ -433,12 +450,12 @@ function renderDocumentsSection(documents) {
- ${doc.scope?.type} scope: ${scopeName} + ${safeScopeType} scope: ${safeScopeName}
${doc.title && doc.title !== doc.document_id ? `
- ID: ${doc.document_id} + ID: ${safeDocumentId}
` : ''}
@@ -455,7 +472,7 @@ function renderSemanticTagsSection(semanticTags) { let html = '
'; semanticTags.forEach(tag => { - html += `${tag.value}`; + html += `${escapeHtml(tag.value)}`; }); html += '
'; @@ -469,14 +486,26 @@ function renderWebSourcesSection(webSources) { let html = ''; webSources.forEach(source => { - html += ` - - `; + const sourceValue = typeof source.value === 'string' ? source.value : ''; + const safeSourceText = escapeHtml(sourceValue); + const safeSourceUrl = sanitizeHttpUrl(sourceValue); + + if (safeSourceUrl) { + html += ` + + `; + } else { + html += ` +
+ ${safeSourceText || 'Invalid link'} +
+ `; + } }); return html; @@ -510,7 +539,7 @@ function formatScopeLockStatus(scopeLocked, lockedContexts) { return ctx.scope; }); return 'Locked' + - (names.length > 0 ? '
' + names.join(', ') + '' : ''); + (names.length > 0 ? '
' + names.map(name => escapeHtml(name)).join(', ') + '' : ''); } // false — unlocked return 'Unlocked'; @@ -525,14 +554,15 @@ function formatClassifications(classifications) { return classifications.map(label => { const category = allCategories.find(cat => cat.label === label); + const safeLabel = escapeHtml(label); if (category) { // Found category definition, apply custom color const textClass = isColorLight(category.color) ? 'text-dark' : 'text-white'; - return `${label}`; + return `${safeLabel}`; } else { // Label exists but no definition found (maybe deleted in admin) - return `${label}`; + return `${safeLabel}`; } }).join(' '); } @@ -587,12 +617,14 @@ function extractPageNumbers(chunkIds) { * @returns {string} HTML string */ function renderSummaryContent(summary, conversationId) { + const safeConversationId = escapeHtml(conversationId); + if (summary && summary.content) { return `

${escapeHtml(summary.content)}

@@ -608,13 +640,30 @@ function renderSummaryContent(summary, conversationId) { ${modelOptions}
`; } +function sanitizeHttpUrl(value) { + if (!value || typeof value !== 'string') { + return ''; + } + + try { + const parsed = new URL(value); + if (parsed.protocol === 'http:' || parsed.protocol === 'https:') { + return parsed.toString(); + } + } catch (error) { + return ''; + } + + return ''; +} + /** * Get available model options from the global #model-select dropdown * @returns {string} HTML option elements @@ -700,11 +749,11 @@ async function handleGenerateSummary(conversationId, modelDeployment) { * @returns {string} Escaped string */ function escapeHtml(str) { - if (!str) { + if (str === null || typeof str === 'undefined') { return ''; } const div = document.createElement('div'); - div.textContent = str; + div.textContent = String(str); return div.innerHTML; } diff --git a/application/single_app/static/js/chat/chat-documents.js b/application/single_app/static/js/chat/chat-documents.js index d0792db7e..b1fab7f08 100644 --- a/application/single_app/static/js/chat/chat-documents.js +++ b/application/single_app/static/js/chat/chat-documents.js @@ -1838,16 +1838,43 @@ document.addEventListener('DOMContentLoaded', function() { icon = 'bi-globe'; } if (name) { - workspaceItems.push(`
  • ${name}
  • `); + workspaceItems.push({ icon, name }); } } if (listEl) { + listEl.textContent = ''; if (workspaceItems.length > 0) { const listLabel = scopeLocked === true ? 'Currently locked to:' : 'Will lock to:'; - listEl.innerHTML = `

    ${listLabel}

      ${workspaceItems.join('')}
    `; + const listLabelEl = document.createElement('p'); + listLabelEl.className = 'small text-muted mb-2'; + listLabelEl.textContent = listLabel; + + const listGroupEl = document.createElement('ul'); + listGroupEl.className = 'list-group list-group-flush'; + + workspaceItems.forEach(({ icon, name }) => { + const listItemEl = document.createElement('li'); + listItemEl.className = 'list-group-item'; + + const iconEl = document.createElement('i'); + iconEl.className = `bi ${icon} me-2`; + + const nameEl = document.createElement('span'); + nameEl.textContent = name; + + listItemEl.appendChild(iconEl); + listItemEl.appendChild(nameEl); + listGroupEl.appendChild(listItemEl); + }); + + listEl.appendChild(listLabelEl); + listEl.appendChild(listGroupEl); } else { - listEl.innerHTML = '

    No specific workspaces recorded.

    '; + const emptyStateEl = document.createElement('p'); + emptyStateEl.className = 'text-muted'; + emptyStateEl.textContent = 'No specific workspaces recorded.'; + listEl.appendChild(emptyStateEl); } } diff --git a/application/single_app/static/js/chat/chat-input-actions.js b/application/single_app/static/js/chat/chat-input-actions.js index 66eaf0444..b96caf8d8 100644 --- a/application/single_app/static/js/chat/chat-input-actions.js +++ b/application/single_app/static/js/chat/chat-input-actions.js @@ -154,7 +154,7 @@ export function showFileContentPopup(fileContent, filename, isTable, fileContent ", + "${doc.scope?.type} scope: ${scopeName}", + "${tag.value}", + "", + "names.join(', ')", + ] + + for snippet in required_snippets: + assert snippet in source, f"Expected safe conversation-details snippet: {snippet}" + + for snippet in forbidden_snippets: + assert snippet not in source, f"Unexpected unsafe conversation-details snippet: {snippet}" + + +def test_fix_documentation_and_version_are_in_sync() -> None: + """Verify the version bump and fix documentation were added together.""" + version = read_version() + assert version == "0.241.022", f"Expected config.py version 0.241.022, found {version}" + assert FIX_DOC.exists(), f"Expected fix documentation file at {FIX_DOC}" + + fix_doc = read_text(FIX_DOC) + assert "Fixed in version: **0.241.019**" in fix_doc + assert "functional_tests/test_stored_xss_chat_scope_and_conversation_details_fix.py" in fix_doc + assert "ui_tests/test_chat_scope_lock_and_conversation_details_escaping.py" in fix_doc + + +if __name__ == "__main__": + tests = [ + test_scope_lock_modal_renders_workspace_names_with_text_nodes, + test_conversation_details_modal_escapes_untrusted_metadata_fields, + test_fix_documentation_and_version_are_in_sync, + ] + results = [] + + for test in tests: + print(f"\nRunning {test.__name__}...") + try: + test() + print("PASS") + results.append(True) + except Exception as error: + print(f"FAIL: {error}") + 
+            results.append(False)
+
+    success = all(results)
+    print(f"\nResults: {sum(results)}/{len(results)} tests passed")
+    sys.exit(0 if success else 1)
\ No newline at end of file
diff --git a/functional_tests/test_stored_xss_chat_workspace_rendering_fix.py b/functional_tests/test_stored_xss_chat_workspace_rendering_fix.py
new file mode 100644
index 000000000..93aa10cee
--- /dev/null
+++ b/functional_tests/test_stored_xss_chat_workspace_rendering_fix.py
@@ -0,0 +1,215 @@
+# test_stored_xss_chat_workspace_rendering_fix.py
+"""
+Functional test for stored XSS chat and workspace rendering hardening.
+Version: 0.241.017
+Implemented in: 0.241.017
+
+This test ensures chat agent display names, workspace member display names,
+and Graph user search filters are safely encoded before HTML or OData
+insertion.
+"""
+
+import ast
+import os
+import sys
+
+
+ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+CHAT_MESSAGES_JS = os.path.join(
+    ROOT_DIR,
+    "application",
+    "single_app",
+    "static",
+    "js",
+    "chat",
+    "chat-messages.js",
+)
+MANAGE_PUBLIC_WORKSPACE_JS = os.path.join(
+    ROOT_DIR,
+    "application",
+    "single_app",
+    "static",
+    "js",
+    "public",
+    "manage_public_workspace.js",
+)
+MANAGE_GROUP_JS = os.path.join(
+    ROOT_DIR,
+    "application",
+    "single_app",
+    "static",
+    "js",
+    "group",
+    "manage_group.js",
+)
+USERS_ROUTE = os.path.join(
+    ROOT_DIR,
+    "application",
+    "single_app",
+    "route_backend_users.py",
+)
+CONFIG_FILE = os.path.join(ROOT_DIR, "application", "single_app", "config.py")
+FIX_DOC = os.path.join(
+    ROOT_DIR,
+    "docs",
+    "explanation",
+    "fixes",
+    "v0.241.017",
+    "STORED_XSS_AGENT_AND_MEMBER_RENDERING_FIX.md",
+)
+
+
+def read_file_text(file_path):
+    with open(file_path, "r", encoding="utf-8") as file_handle:
+        return file_handle.read()
+
+
+def read_config_version():
+    for line in read_file_text(CONFIG_FILE).splitlines():
+        if line.startswith("VERSION = "):
+            return line.split("=", 1)[1].strip().strip('"')
+    raise AssertionError("VERSION assignment not found in config.py")
+
+
+def load_function(file_path, function_name):
+    source = read_file_text(file_path)
+    parsed = ast.parse(source, filename=file_path)
+    selected_node = next(
+        (
+            node
+            for node in parsed.body
+            if isinstance(node, ast.FunctionDef) and node.name == function_name
+        ),
+        None,
+    )
+    assert selected_node is not None, f"Expected function {function_name} in {file_path}"
+    module = ast.Module(body=[selected_node], type_ignores=[])
+    namespace = {}
+    exec(compile(module, file_path, "exec"), namespace)
+    return namespace[function_name]
+
+
+def test_chat_agent_display_name_is_escaped_before_html_rendering():
+    """Verify chat agent display names are escaped in both HTML sinks."""
+    print("🔍 Testing chat agent display name escaping...")
+
+    source = read_file_text(CHAT_MESSAGES_JS)
+
+    required_snippets = [
+        "senderLabel = escapeHtml(agentDisplayName);",
+        "${escapeHtml(metadata.agent_display_name)}",
+    ]
+    missing = [snippet for snippet in required_snippets if snippet not in source]
+    assert not missing, f"Missing chat escaping snippets: {missing}"
+
+    forbidden_snippets = [
+        "senderLabel = agentDisplayName;",
+        "${metadata.agent_display_name}",
+    ]
+    present = [snippet for snippet in forbidden_snippets if snippet in source]
+    assert not present, f"Unexpected unescaped chat snippets found: {present}"
+
+    print("✅ Chat agent display name escaping passed")
+
+
+def test_public_workspace_member_renderers_escape_untrusted_fields():
+    """Verify public workspace member-management renderers escape display names and emails."""
+    print("🔍 Testing public workspace member rendering escaping...")
+
+    source = read_file_text(MANAGE_PUBLIC_WORKSPACE_JS)
+
+    required_snippets = [
+        'const safeDisplayName = escapeHtml(m.displayName || "(no name)");',
+        'const safeDisplayName = escapeHtml(req.displayName || "(no name)");',
+        'const safeDisplayName = escapeHtml(u.displayName || "(no name)");',
'data-user-name="${safeDisplayName}"', + 'membersList += `
<li>${safeName} (${safeEmail})</li>`;', + '$(document).on("click", ".select-user-btn", function () {', + "${escapeHtml(row.displayName || '')} (${escapeHtml(row.email || '')})", + ] + missing = [snippet for snippet in required_snippets if snippet not in source] + assert not missing, f"Missing public workspace escaping snippets: {missing}" + + forbidden_snippets = [ + 'onclick="selectUserForAdd(', + '${u.displayName || "(no name)"}', + '${req.displayName}', + '
<li>${member.name} (${member.email})</li>', + ] + present = [snippet for snippet in forbidden_snippets if snippet in source] + assert not present, f"Unexpected unescaped public workspace snippets found: {present}" + + print("✅ Public workspace member rendering escaping passed") + + +def test_group_workspace_member_renderers_escape_untrusted_fields(): + """Verify group workspace member-management renderers escape display names and emails.""" + print("🔍 Testing group workspace member rendering escaping...") + + source = read_file_text(MANAGE_GROUP_JS) + + required_snippets = [ + 'const safeDisplayName = escapeHtml(m.displayName || "(no name)");', + 'const safeDisplayName = escapeHtml(u.displayName || "(no name)");', + 'data-user-name="${safeDisplayName}"', + 'membersList += `
<li>${safeName} (${safeEmail})</li>`;', + "${escapeHtml(row.displayName || '')} (${escapeHtml(row.email || '')})", + ] + missing = [snippet for snippet in required_snippets if snippet not in source] + assert not missing, f"Missing group workspace escaping snippets: {missing}" + + forbidden_snippets = [ + '${u.displayName || "(no name)"}', + '${u.email || ""}', + '
<li>${member.name} (${member.email})</li>', + '', + ] + present = [snippet for snippet in forbidden_snippets if snippet in source] + assert not present, f"Unexpected unescaped group workspace snippets found: {present}" + + print("✅ Group workspace member rendering escaping passed") + + +def test_user_search_filter_escapes_odata_literals(): + """Verify /api/userSearch escapes apostrophes before building the Graph filter.""" + print("🔍 Testing Graph user search OData literal escaping...") + + escape_helper = load_function(USERS_ROUTE, "_escape_graph_odata_literal") + assert escape_helper("o'hare") == "o''hare" + assert escape_helper("") == "" + + source = read_file_text(USERS_ROUTE) + assert "safe_query = _escape_graph_odata_literal(query)" in source + assert "startswith(displayName, '{safe_query}')" in source + assert "startswith(mail, '{safe_query}')" in source + assert "startswith(userPrincipalName, '{safe_query}')" in source + assert "startswith(displayName, '{query}')" not in source + + print("✅ Graph user search OData literal escaping passed") + + +def test_fix_documentation_and_version_exist(): + """Verify the version bump and fix documentation landed for this change.""" + print("🔍 Testing stored XSS rendering fix documentation and version...") + + assert read_config_version() == "0.241.017" + assert os.path.exists(FIX_DOC), f"Expected fix documentation at {FIX_DOC}" + + print("✅ Stored XSS rendering fix documentation and version passed") + + +if __name__ == "__main__": + tests = [ + test_chat_agent_display_name_is_escaped_before_html_rendering, + test_public_workspace_member_renderers_escape_untrusted_fields, + test_group_workspace_member_renderers_escape_untrusted_fields, + test_user_search_filter_escapes_odata_literals, + test_fix_documentation_and_version_exist, + ] + + for test in tests: + print(f"\n🧪 Running {test.__name__}...") + test() + + print(f"\n📊 Results: {len(tests)}/{len(tests)} tests passed") + sys.exit(0) \ No newline at end of file diff --git 
a/functional_tests/test_stored_xss_share_activity_and_masking_fix.py b/functional_tests/test_stored_xss_share_activity_and_masking_fix.py new file mode 100644 index 000000000..41ac8fe61 --- /dev/null +++ b/functional_tests/test_stored_xss_share_activity_and_masking_fix.py @@ -0,0 +1,240 @@ +# test_stored_xss_share_activity_and_masking_fix.py +""" +Functional test for stored XSS sharing, activity, and masking hardening. +Version: 0.241.022 +Implemented in: 0.241.020 + +This test ensures document-sharing modals, group activity rendering, and chat +masking metadata render attacker-controlled values as inert text and derive +masking identity from the authenticated server-side user. +""" + +import sys +from pathlib import Path + + +ROOT_DIR = Path(__file__).resolve().parents[1] +CHAT_TOAST_JS = ROOT_DIR / "application" / "single_app" / "static" / "js" / "chat" / "chat-toast.js" +CHAT_MESSAGES_JS = ROOT_DIR / "application" / "single_app" / "static" / "js" / "chat" / "chat-messages.js" +GROUP_MANAGE_JS = ROOT_DIR / "application" / "single_app" / "static" / "js" / "group" / "manage_group.js" +GROUP_SHARE_JS = ROOT_DIR / "application" / "single_app" / "static" / "js" / "workspace" / "group-documents-sharing.js" +WORKSPACE_SHARE_JS = ROOT_DIR / "application" / "single_app" / "static" / "js" / "workspace" / "workspace-documents-sharing.js" +CHAT_ROUTE = ROOT_DIR / "application" / "single_app" / "route_backend_chats.py" +CONFIG_FILE = ROOT_DIR / "application" / "single_app" / "config.py" +FIX_DOC = ROOT_DIR / "docs" / "explanation" / "fixes" / "v0.241.020" / "STORED_XSS_SHARE_ACTIVITY_AND_MASKING_FIX.md" + + +def read_text(path: Path) -> str: + """Read a UTF-8 text file from the repository.""" + return path.read_text(encoding="utf-8") + + +def read_version() -> str: + """Extract the application version from config.py.""" + for line in read_text(CONFIG_FILE).splitlines(): + if line.strip().startswith('VERSION = '): + return line.split('"')[1] + raise AssertionError("VERSION 
assignment was not found in config.py") + + +def assert_required_snippets(source: str, required_snippets: list[str], description: str) -> None: + """Assert that all required snippets exist in the target source text.""" + missing = [snippet for snippet in required_snippets if snippet not in source] + assert not missing, f"Missing {description} snippets: {missing}" + + +def assert_forbidden_snippets(source: str, forbidden_snippets: list[str], description: str) -> None: + """Assert that forbidden snippets were removed from the target source text.""" + present = [snippet for snippet in forbidden_snippets if snippet in source] + assert not present, f"Unexpected {description} snippets still present: {present}" + + +def test_chat_toast_uses_text_nodes_for_messages() -> None: + """Verify the shared chat toast helper no longer interpolates raw HTML messages.""" + source = read_text(CHAT_TOAST_JS) + + assert_required_snippets( + source, + [ + 'const toastEl = document.createElement("div");', + 'bodyEl.textContent = String(message ?? 
"");', + 'container.appendChild(toastEl);', + ], + "chat toast hardening", + ) + assert_forbidden_snippets( + source, + [ + 'container.insertAdjacentHTML("beforeend", toastHtml);', + '${message}', + ], + "unsafe chat toast rendering", + ) + + +def test_document_share_modals_use_safe_rendering_and_delegated_clicks() -> None: + """Verify personal and group share modals no longer rehydrate attacker HTML.""" + workspace_source = read_text(WORKSPACE_SHARE_JS) + group_source = read_text(GROUP_SHARE_JS) + + assert_required_snippets( + workspace_source, + [ + "const userSearchResultsBody = document.querySelector('#userSearchResultsTable tbody');", + "const sharedUsersList = document.getElementById('sharedUsersList');", + "const addButton = e.target.closest('.user-search-add-btn');", + "const removeButton = e.target.closest('.shared-user-remove-btn');", + "displayNameCell.textContent = user.displayName || '';", + "emailCell.textContent = user.email || '';", + "toastBody.textContent = String(message ?? '');", + "tbody.replaceChildren(...userRows);", + "sharedUsersList.replaceChildren(...userRows);", + ], + "workspace share hardening", + ) + assert_forbidden_snippets( + workspace_source, + [ + 'onclick="addUserToDocument(', + 'onclick="removeUserFromDocument(', + 'toast.innerHTML = `', + ], + "unsafe workspace share rendering", + ) + + assert_required_snippets( + group_source, + [ + "const groupSearchResultsBody = document.querySelector('#groupSearchResultsTable tbody');", + "const sharedGroupsList = document.getElementById('sharedGroupsList');", + "const addButton = e.target.closest('.group-search-add-btn');", + "const removeButton = e.target.closest('.shared-group-remove-btn');", + "nameCell.textContent = group.name || '';", + "descriptionCell.textContent = group.description || '';", + "toastBody.textContent = String(message ?? 
'');", + "tbody.replaceChildren(...groupRows);", + "sharedGroupsList.replaceChildren(...groupRows);", + ], + "group share hardening", + ) + assert_forbidden_snippets( + group_source, + [ + 'onclick="addGroupToDocument(', + 'onclick="removeGroupFromDocument(', + 'toast.innerHTML = `', + ], + "unsafe group share rendering", + ) + + +def test_group_activity_timeline_uses_safe_text_rendering() -> None: + """Verify activity rows and the raw-activity modal no longer use unsafe HTML sinks.""" + source = read_text(GROUP_MANAGE_JS) + + assert_required_snippets( + source, + [ + 'const safeDescription = escapeHtml(description);', + "const activityTimeline = $('#activityTimeline');", + "activityTimeline.find('.activity-item').each(function(index) {", + "$(this).data('activity', activities[index]);", + '
<pre><code></code></pre>', + "code.textContent = JSON.stringify(activity ?? {}, null, 2) || '{}';", + 'modalBody.replaceChildren(pre);', + ], + "group activity hardening", + ) + assert_forbidden_snippets( + source, + [ + 'data-activity=\'${activityJson.replace(/\'/g, "&#39;")}\'', + 'onclick="showRawActivity(this)"', + 'modalBody.innerHTML = `<pre>${JSON.stringify(activity, null, 2)}</pre>`;', + '<p>${description}</p>
    ', + ], + "unsafe group activity rendering", + ) + + +def test_masking_renderer_and_backend_identity_are_hardened() -> None: + """Verify masked spans render safely and the backend ignores client-supplied display names.""" + chat_messages_source = read_text(CHAT_MESSAGES_JS) + chat_route_source = read_text(CHAT_ROUTE) + + assert_required_snippets( + chat_messages_source, + [ + "const fragment = document.createDocumentFragment();", + "fragment.appendChild(document.createTextNode(content.substring(lastIndex, range.start)));", + "const maskedSpan = document.createElement('span');", + "maskedSpan.setAttribute('data-display-name', String(range.display_name ?? ''));", + "maskedSpan.title = `Masked by ${String(range.display_name ?? 'Unknown User')} on ${timestamp}`;", + 'messageText.replaceChildren(fragment);', + ], + "chat masking renderer hardening", + ) + assert_forbidden_snippets( + chat_messages_source, + [ + 'messageText.innerHTML = htmlContent;', + 'data-display-name="${range.display_name}"', + 'title="Masked by ${range.display_name} on ${timestamp}"', + ], + "unsafe chat masking rendering", + ) + + assert_required_snippets( + chat_route_source, + [ + 'current_user = get_current_user_info() or {}', + "current_user.get('displayName')", + "current_user.get('email')", + "current_user.get('userPrincipalName')", + ], + "masking backend identity hardening", + ) + assert_forbidden_snippets( + chat_route_source, + [ + "user_display_name = data.get('display_name', 'Unknown User')", + ], + "client-controlled masking display name", + ) + + +def test_fix_documentation_and_version_are_in_sync() -> None: + """Verify the version bump and fix documentation landed together.""" + version = read_version() + assert version == "0.241.022", f"Expected config.py version 0.241.022, found {version}" + assert FIX_DOC.exists(), f"Expected fix documentation file at {FIX_DOC}" + + fix_doc = read_text(FIX_DOC) + assert "Fixed in version: **0.241.020**" in fix_doc + assert 
"functional_tests/test_stored_xss_share_activity_and_masking_fix.py" in fix_doc + assert "ui_tests/test_document_share_modal_escaping.py" in fix_doc + + +if __name__ == "__main__": + tests = [ + test_chat_toast_uses_text_nodes_for_messages, + test_document_share_modals_use_safe_rendering_and_delegated_clicks, + test_group_activity_timeline_uses_safe_text_rendering, + test_masking_renderer_and_backend_identity_are_hardened, + test_fix_documentation_and_version_are_in_sync, + ] + results = [] + + for test in tests: + print(f"\nRunning {test.__name__}...") + try: + test() + print("PASS") + results.append(True) + except Exception as error: + print(f"FAIL: {error}") + results.append(False) + + success = all(results) + print(f"\nResults: {sum(results)}/{len(results)} tests passed") + sys.exit(0 if success else 1) \ No newline at end of file diff --git a/functional_tests/test_tabular_all_scope_group_source_context.py b/functional_tests/test_tabular_all_scope_group_source_context.py index f966e9c0c..be42e9c62 100644 --- a/functional_tests/test_tabular_all_scope_group_source_context.py +++ b/functional_tests/test_tabular_all_scope_group_source_context.py @@ -2,12 +2,13 @@ # test_tabular_all_scope_group_source_context.py """ Functional test for all-scope tabular group source context handling. -Version: 0.240.049 -Implemented in: 0.240.032; 0.240.041; 0.240.042; 0.240.043; 0.240.048; 0.240.049 +Version: 0.241.017 +Implemented in: 0.240.032; 0.240.041; 0.240.042; 0.240.043; 0.240.048; 0.240.049; 0.241.016 This test ensures mixed-scope workspace search keeps per-file group/public source metadata so tabular analysis can open group and public workbooks even -when chat document scope is set to all. +when chat document scope is set to all, while selected-document resolution only +uses documents the current chat scope is authorized to access. 
""" import ast @@ -19,6 +20,8 @@ ROUTE_FILE = os.path.join(ROOT_DIR, 'application', 'single_app', 'route_backend_chats.py') CONFIG_FILE = os.path.join(ROOT_DIR, 'application', 'single_app', 'config.py') TARGET_FUNCTIONS = { + '_normalize_requested_scope_ids', + '_resolve_chat_selected_document_metadata', 'is_tabular_filename', 'get_document_containers_for_scope', 'build_tabular_file_context', @@ -146,6 +149,8 @@ def test_selected_tabular_document_lookup_checks_all_scope_containers(): selected_contexts = helpers['get_selected_workspace_tabular_file_contexts']( selected_document_ids=['group-doc-123', 'public-doc-456'], document_scope='all', + active_group_ids=['93aa364a-99ee-4cfd-8e4d-f37d175f00f5'], + active_public_workspace_ids=['public-456'], ) assert selected_contexts == [ @@ -204,7 +209,7 @@ def test_route_uses_context_aware_tabular_analysis_and_version_bump(): ] missing = [snippet for snippet in required_snippets if snippet not in source] assert not missing, f'Missing route integration snippets: {missing}' - assert read_config_version() == '0.240.049' + assert read_config_version() == '0.241.017' print('✅ Route integration and version bump passed') return True diff --git a/functional_tests/test_uploaded_file_preview_xss_fix.py b/functional_tests/test_uploaded_file_preview_xss_fix.py new file mode 100644 index 000000000..f7055bde3 --- /dev/null +++ b/functional_tests/test_uploaded_file_preview_xss_fix.py @@ -0,0 +1,107 @@ +# test_uploaded_file_preview_xss_fix.py +""" +Functional test for uploaded file preview XSS hardening. +Version: 0.241.022 +Implemented in: 0.241.022 + +This test ensures uploaded file preview rendering no longer injects raw file +content into modal HTML and that current tabular previews build their DOM with +text nodes instead of untrusted HTML interpolation. 
+""" + +import os +import sys + + +ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) +CHAT_INPUT_ACTIONS_JS = os.path.join( + ROOT_DIR, + "application", + "single_app", + "static", + "js", + "chat", + "chat-input-actions.js", +) +CONFIG_FILE = os.path.join(ROOT_DIR, "application", "single_app", "config.py") +FIX_DOC = os.path.join( + ROOT_DIR, + "docs", + "explanation", + "fixes", + "v0.241.022", + "UPLOADED_FILE_PREVIEW_XSS_FIX.md", +) + + +def read_file_text(file_path): + with open(file_path, "r", encoding="utf-8") as file_handle: + return file_handle.read() + + +def read_config_version(): + for line in read_file_text(CONFIG_FILE).splitlines(): + if line.strip().startswith("VERSION = "): + return line.split("=", 1)[1].strip().strip('"') + raise AssertionError("VERSION assignment not found in config.py") + + +def test_uploaded_file_preview_uses_safe_rendering_boundaries(): + """Verify the preview modal no longer feeds file content into dynamic HTML sinks.""" + print("🔍 Testing uploaded file preview rendering boundaries...") + + source = read_file_text(CHAT_INPUT_ACTIONS_JS) + + required_snippets = [ + 'downloadBtnContainer.replaceChildren();', + 'const downloadLink = document.createElement("a");', + 'const isLegacyHtmlTableContent = /^$/i.test(trimmedContent);', + 'renderPreformattedText(fileContentElement, fileContent);', + 'const tableWrapper = buildCsvTableElement(fileContent);', + 'headerCell.textContent = header;', + 'cellElement.textContent = cell;', + 'pre.textContent = String(text ?? "");', + ] + missing = [snippet for snippet in required_snippets if snippet not in source] + assert not missing, f"Missing uploaded file preview hardening snippets: {missing}" + + forbidden_snippets = [ + 'fileContentElement.innerHTML = `
<div class="table-responsive">${tableHTML}</div>`;', + 'fileContentElement.innerHTML = `<pre>${fileContent}</pre>`;', + 'fileContentElement.innerHTML = `<div>${fileContent}</div>
    `;', + '!fileContent.trim().startsWith(\'<\')', + ] + present = [snippet for snippet in forbidden_snippets if snippet in source] + assert not present, f"Unexpected unsafe uploaded file preview snippets found: {present}" + + print("✅ Uploaded file preview rendering boundaries passed") + + +def test_fix_documentation_and_version_are_in_sync(): + """Verify the fix note and current config version landed together.""" + print("🔍 Testing uploaded file preview fix documentation and version...") + + assert read_config_version() == "0.241.022" + + assert os.path.exists(FIX_DOC), f"Expected fix documentation at {FIX_DOC}" + fix_doc = read_file_text(FIX_DOC) + assert "Fixed/Implemented in version: **0.241.022**" in fix_doc + assert "legacy html table payloads now render as inert preformatted text" in fix_doc.lower() + assert "functional_tests/test_uploaded_file_preview_xss_fix.py" in fix_doc + assert "ui_tests/test_uploaded_file_preview_escaping.py" in fix_doc + + print("✅ Uploaded file preview fix documentation and version passed") + + +if __name__ == "__main__": + tests = [ + test_uploaded_file_preview_uses_safe_rendering_boundaries, + test_fix_documentation_and_version_are_in_sync, + ] + + for test in tests: + print(f"\n🧪 Running {test.__name__}...") + test() + + print(f"\n📊 Results: {len(tests)}/{len(tests)} tests passed") + sys.exit(0) \ No newline at end of file diff --git a/functional_tests/test_web_search_current_message_only.py b/functional_tests/test_web_search_current_message_only.py new file mode 100644 index 000000000..722b37373 --- /dev/null +++ b/functional_tests/test_web_search_current_message_only.py @@ -0,0 +1,138 @@ +# test_web_search_current_message_only.py +""" +Functional test for current-message-only web search egress. 
+Version: 0.241.008 +Implemented in: 0.241.008 + +This test ensures external web search uses only the current user message, +keeps history-derived internal search rewrites out of the outbound web-search +boundary, and does not send the previous Foundry identifier metadata blob. +""" + +import ast +import os +import sys + + +ROOT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__))) +ROUTE_FILE = os.path.join(ROOT_DIR, 'application', 'single_app', 'route_backend_chats.py') + + +def read_file_text(file_path): + with open(file_path, 'r', encoding='utf-8') as file_handle: + return file_handle.read() + + +def extract_function_source(source_text, function_name): + parsed = ast.parse(source_text, filename=ROUTE_FILE) + for node in ast.walk(parsed): + if isinstance(node, ast.FunctionDef) and node.name == function_name: + return ast.get_source_segment(source_text, node) + raise AssertionError(f'Function {function_name} not found in route_backend_chats.py') + + +def load_helper(function_name): + source = read_file_text(ROUTE_FILE) + parsed = ast.parse(source, filename=ROUTE_FILE) + selected_nodes = [ + node for node in parsed.body + if isinstance(node, ast.FunctionDef) and node.name == function_name + ] + assert len(selected_nodes) == 1, f'Expected helper {function_name} to exist exactly once' + + module = ast.Module(body=selected_nodes, type_ignores=[]) + namespace = {} + exec(compile(module, ROUTE_FILE, 'exec'), namespace) + return namespace[function_name] + + +def test_web_search_query_helper_uses_only_current_message(): + """Verify the outbound web-search helper only normalizes the current turn.""" + print('🔍 Testing outbound web-search query helper...') + + helper = load_helper('build_web_search_query_text') + assert helper(' current turn only ') == 'current turn only' + assert helper('') == '' + assert helper(None) == '' + + print('✅ Outbound web-search query helper passed') + return True + + +def 
test_perform_web_search_uses_explicit_outbound_query_and_empty_metadata(): + """Verify the web-search boundary uses the explicit outbound query and no metadata blob.""" + print('🔍 Testing perform_web_search outbound boundary...') + + source = read_file_text(ROUTE_FILE) + perform_source = extract_function_source(source, 'perform_web_search') + + assert 'web_search_query_text,' in perform_source + assert 'query_text = (web_search_query_text or user_message or "").strip()' in perform_source + assert 'foundry_metadata = {}' in perform_source + + metadata_block = perform_source.split('foundry_metadata = {}', 1)[1].split( + 'debug_print("[WebSearch] Foundry metadata prepared: {}")', 1 + )[0] + + forbidden_snippets = [ + '"conversation_id": conversation_id', + '"user_id": user_id', + '"message_id": user_message_id', + '"chat_type": chat_type', + '"document_scope": document_scope', + '"group_id": active_group_id if chat_type == "group" else None', + '"public_workspace_id": active_public_workspace_id', + '"search_query": query_text', + ] + for snippet in forbidden_snippets: + assert snippet not in metadata_block, f'Unexpected outbound metadata snippet present: {snippet}' + + print('✅ perform_web_search outbound boundary passed') + return True + + +def test_chat_routes_pass_explicit_outbound_web_query(): + """Verify both chat handlers pass the dedicated outbound web-search query.""" + print('🔍 Testing chat route web-search call-site separation...') + + source = read_file_text(ROUTE_FILE) + + assert source.count('web_search_query_text = build_web_search_query_text(user_message)') >= 2 + assert source.count('web_search_query_text=web_search_query_text') >= 2 + assert 'search_query=search_query' not in source + + # Internal workspace-search rewrites still exist, but they should no longer + # be able to flow into the outbound web-search boundary. 
+ assert 'search_query = rewritten_search_query' in source + assert "Based on the recent conversation about:" in source + + print('✅ Chat route web-search call-site separation passed') + return True + + +def test_search_summary_filters_out_system_messages(): + """Verify the optional search-summary branch excludes persisted system augmentation content.""" + print('🔍 Testing search-summary role filtering...') + + source = read_file_text(ROUTE_FILE) + assert "if role not in ('user', 'assistant'):" in source + assert "content = build_assistant_history_content_with_citations(msg, content)" in source + + print('✅ Search-summary role filtering passed') + return True + + +if __name__ == '__main__': + tests = [ + test_web_search_query_helper_uses_only_current_message, + test_perform_web_search_uses_explicit_outbound_query_and_empty_metadata, + test_chat_routes_pass_explicit_outbound_web_query, + test_search_summary_filters_out_system_messages, + ] + + for test in tests: + print(f'\n🧪 Running {test.__name__}...') + test() + + print(f'\n📊 Results: {len(tests)}/{len(tests)} tests passed') + sys.exit(0) \ No newline at end of file diff --git a/functional_tests/test_xss_guardrails_checker.py b/functional_tests/test_xss_guardrails_checker.py new file mode 100644 index 000000000..95bc930f4 --- /dev/null +++ b/functional_tests/test_xss_guardrails_checker.py @@ -0,0 +1,163 @@ +#!/usr/bin/env python3 +# test_xss_guardrails_checker.py +""" +Functional test for XSS PR guardrail checker. +Version: 0.241.022 +Implemented in: 0.241.021 + +This test ensures the changed-file XSS checker flags the repo's target sink +patterns, allows the approved safe rendering patterns, and stays wired into +the repo instruction and PR workflow. 
+""" + +import importlib.util +import os +import sys +from pathlib import Path + + +ROOT_DIR = Path(__file__).resolve().parents[1] +CHECKER_FILE = ROOT_DIR / 'scripts' / 'check_xss_sinks.py' +WORKFLOW_FILE = ROOT_DIR / '.github' / 'workflows' / 'xss-sink-check.yml' +INSTRUCTION_FILE = ROOT_DIR / '.github' / 'instructions' / 'xss-prevention.instructions.md' +FEATURE_DOC = ROOT_DIR / 'docs' / 'explanation' / 'features' / 'v0.241.021' / 'XSS_PR_GUARDRAILS.md' +CONFIG_FILE = ROOT_DIR / 'application' / 'single_app' / 'config.py' + + +def read_text(path: Path) -> str: + """Read a UTF-8 text file from the repository.""" + return path.read_text(encoding='utf-8') + + +def load_checker_module(): + """Import the checker module from disk without touching sys.path.""" + spec = importlib.util.spec_from_file_location('check_xss_sinks', CHECKER_FILE) + assert spec is not None and spec.loader is not None, 'Expected a module spec for check_xss_sinks.py' + module = importlib.util.module_from_spec(spec) + sys.modules[spec.name] = module + spec.loader.exec_module(module) + return module + + +def read_config_version() -> str: + """Extract the current application version from config.py.""" + for line in read_text(CONFIG_FILE).splitlines(): + if line.strip().startswith('VERSION = '): + return line.split('=', 1)[1].strip().strip('"') + raise AssertionError('VERSION assignment not found in config.py') + + +def issue_messages(module, file_name: str, source_text: str) -> list[str]: + """Return the issue messages emitted for one in-memory source string.""" + issues = module.inspect_source(Path(file_name), source_text) + return [issue.message for issue in issues] + + +def test_checker_flags_dynamic_html_sinks_and_attribute_interpolation() -> None: + """Verify dynamic HTML sinks and attribute interpolation are rejected.""" + module = load_checker_module() + + js_source = """ +const row = document.createElement('tr'); +row.innerHTML = `${userName}`; +""".strip() + messages = 
issue_messages(module, 'sample.js', js_source) + + assert any('innerHTML/outerHTML' in message for message in messages), messages + assert any('data-* attributes' in message for message in messages), messages + + +def test_checker_flags_marked_parse_inline_handlers_and_server_side_bypasses() -> None: + """Verify the checker covers client and server bypass markers.""" + module = load_checker_module() + + js_source = """ +const html = marked.parse(markdown); +button.setAttribute('onclick', 'runDanger()'); +""".strip() + js_messages = issue_messages(module, 'sample.js', js_source) + assert any('DOMPurify.sanitize' in message for message in js_messages), js_messages + assert any('inline event-handler APIs' in message for message in js_messages), js_messages + + py_source = """ +from markupsafe import Markup +safe_markup = Markup(user_supplied_html) +""".strip() + py_messages = issue_messages(module, 'sample.py', py_source) + assert any('Markup(...)' in message for message in py_messages), py_messages + + html_source = """ +
<div>{{ user_bio|safe }}</div>
    +""".strip() + html_messages = issue_messages(module, 'sample.html', html_source) + assert any("Jinja '|safe'" in message for message in html_messages), html_messages + + +def test_checker_allows_safe_dom_patterns_static_shells_and_reviewed_suppressions() -> None: + """Verify the checker allows the repo's preferred safe rendering patterns.""" + module = load_checker_module() + + safe_js_source = """ +const row = document.createElement('tr'); +const nameCell = document.createElement('td'); +nameCell.textContent = userName; +const actionButton = document.createElement('button'); +actionButton.dataset.userName = userName; +actionButton.addEventListener('click', handleClick); +modal.innerHTML = ''; +const renderedHtml = DOMPurify.sanitize(marked.parse(markdown)); +""".strip() + assert issue_messages(module, 'safe.js', safe_js_source) == [] + + suppressed_js_source = """ +// xss-check: ignore reviewed legacy shell with static allowlist +container.innerHTML = htmlFromReviewedBoundary; +""".strip() + assert issue_messages(module, 'suppressed.js', suppressed_js_source) == [] + + +def test_checker_assets_and_version_are_wired_into_repo() -> None: + """Verify the new workflow, instruction, doc, and version bump landed together.""" + assert CHECKER_FILE.exists(), f'Expected checker script at {CHECKER_FILE}' + assert WORKFLOW_FILE.exists(), f'Expected workflow file at {WORKFLOW_FILE}' + assert INSTRUCTION_FILE.exists(), f'Expected instruction file at {INSTRUCTION_FILE}' + assert FEATURE_DOC.exists(), f'Expected feature document at {FEATURE_DOC}' + assert read_config_version() == '0.241.022' + + workflow_source = read_text(WORKFLOW_FILE) + assert 'scripts/check_xss_sinks.py' in workflow_source + assert 'functional_tests/test_xss_guardrails_checker.py' in workflow_source + + instruction_source = read_text(INSTRUCTION_FILE) + assert 'xss-check: ignore' in instruction_source + assert 'innerHTML' in instruction_source + assert 'DOMPurify.sanitize' in instruction_source + + 
feature_doc_source = read_text(FEATURE_DOC) + assert 'Fixed/Implemented in version: **0.241.021**' in feature_doc_source + assert 'scripts/check_xss_sinks.py' in feature_doc_source + assert '.github/workflows/xss-sink-check.yml' in feature_doc_source + + +if __name__ == '__main__': + tests = [ + test_checker_flags_dynamic_html_sinks_and_attribute_interpolation, + test_checker_flags_marked_parse_inline_handlers_and_server_side_bypasses, + test_checker_allows_safe_dom_patterns_static_shells_and_reviewed_suppressions, + test_checker_assets_and_version_are_wired_into_repo, + ] + results = [] + + for test in tests: + print(f'\n🧪 Running {test.__name__}...') + try: + test() + print('✅ PASS') + results.append(True) + except Exception as exc: # pragma: no cover - standalone script reporting + print(f'❌ FAIL: {exc}') + results.append(False) + + success = all(results) + print(f'\n📊 Results: {sum(results)}/{len(results)} tests passed') + sys.exit(0 if success else 1) \ No newline at end of file diff --git a/scripts/check_broken_access_control.py b/scripts/check_broken_access_control.py new file mode 100644 index 000000000..892a1cb90 --- /dev/null +++ b/scripts/check_broken_access_control.py @@ -0,0 +1,630 @@ +# check_broken_access_control.py + +"""Validate changed Python files for high-confidence broken access control regressions.""" + +from __future__ import annotations + +import argparse +import ast +import re +import subprocess +import sys +from dataclasses import dataclass +from pathlib import Path + + +REPO_ROOT = Path(__file__).resolve().parents[1] +SUPPORTED_SUFFIXES = {'.py'} +SUPPRESSION_TOKEN = 'bac-check: ignore' +DIFF_HUNK_RE = re.compile(r'^@@ -\d+(?:,\d+)? \+(?P<start>\d+)(?:,(?P<count>\d+))? 
@@') +ACTIVE_SCOPE_KEYS = { + 'activeGroupOid': { + 'read_helper': 'require_active_group(...)', + 'write_helper': 'update_active_group_for_user(...)', + }, + 'activePublicWorkspaceOid': { + 'read_helper': 'require_active_public_workspace(...)', + 'write_helper': 'update_active_public_workspace_for_user(...)', + }, +} +ACTIVE_SCOPE_READ_ALLOWED_PATHS = { + 'application/single_app/functions_group.py', + 'application/single_app/functions_public_workspaces.py', +} +ACTIVE_SCOPE_READ_TARGET_PREFIXES = ( + 'application/single_app/route_backend_', + 'application/single_app/semantic_kernel_plugins/', +) +APPROVED_ACTIVE_SCOPE_WRITE_CONTEXTS = { + ('application/single_app/functions_group.py', 'update_active_group_for_user', 'activeGroupOid'), + ( + 'application/single_app/functions_public_workspaces.py', + 'update_active_public_workspace_for_user', + 'activePublicWorkspaceOid', + ), +} +KERNEL_SENSITIVE_PARAMS = { + 'user_id', + 'conversation_id', + 'group_id', + 'public_workspace_id', + 'scope_id', + 'scope_type', + 'active_group_id', + 'active_group_ids', + 'active_public_workspace_id', + 'active_public_workspace_ids', +} +APPROVED_KERNEL_SCOPE_HELPERS = { + '_resolve_authorized_scope_arguments', + '_resolve_authorized_fact_memory_call', + '_resolve_blob_location_with_fallback', + '_get_authenticated_history_user_id', +} +PERSONAL_CONVERSATION_ROUTE_FILES = { + 'application/single_app/route_backend_chats.py', + 'application/single_app/route_backend_conversations.py', + 'application/single_app/route_backend_documents.py', + 'application/single_app/route_backend_feedback.py', + 'application/single_app/route_frontend_conversations.py', +} +ADMIN_DECORATORS = {'admin_required', 'control_center_required'} +EXPLICIT_OWNERSHIP_SNIPPETS = ( + "conversation_item.get('user_id') != user_id", + 'conversation_item.get("user_id") != user_id', + "conversation_item['user_id'] != user_id", + 'conversation_item["user_id"] != user_id', + "conversation.get('user_id') != user_id", + 
'conversation.get("user_id") != user_id', + "conversation['user_id'] != user_id", + 'conversation["user_id"] != user_id', +) + + +@dataclass(frozen=True) +class Issue: + """A single checker violation.""" + + file_path: Path + line: int + message: str + + +def get_relative_path(file_path: Path) -> str: + """Return a repository-relative path when possible.""" + try: + return file_path.relative_to(REPO_ROOT).as_posix() + except ValueError: + return file_path.as_posix() + + +def format_error_annotation(issue: Issue) -> str: + """Return a GitHub Actions error annotation for one issue.""" + return f'::error file={get_relative_path(issue.file_path)},line={issue.line}::{issue.message}' + + +def normalize_paths(paths: list[str]) -> list[Path]: + """Resolve CLI paths relative to the repository root and keep supported files.""" + normalized: list[Path] = [] + for raw_path in paths: + candidate = Path(raw_path) + if not candidate.is_absolute(): + candidate = (REPO_ROOT / candidate).resolve() + if candidate.exists() and candidate.suffix in SUPPORTED_SUFFIXES: + normalized.append(candidate) + return normalized + + +def matches_changed_lines(changed_lines: set[int] | None, start_line: int, end_line: int) -> bool: + """Return True when the issue overlaps changed lines or full-file mode is active.""" + if changed_lines is None: + return True + return any(line in changed_lines for line in range(start_line, end_line + 1)) + + +def is_suppressed(source_lines: list[str], start_line: int, end_line: int) -> bool: + """Return True when a suppression token exists near the reported lines.""" + window_start = max(1, start_line - 2) + window_end = min(len(source_lines), end_line) + for line_number in range(window_start, window_end + 1): + if SUPPRESSION_TOKEN in source_lines[line_number - 1]: + return True + return False + + +def get_changed_lines(file_path: Path, base_sha: str, head_sha: str) -> set[int] | None: + """Return added-line numbers for one file between two revisions.""" + 
relative_path = get_relative_path(file_path) + command = [ + 'git', + 'diff', + '--unified=0', + base_sha, + head_sha, + '--', + relative_path, + ] + + try: + result = subprocess.run( + command, + cwd=REPO_ROOT, + check=False, + capture_output=True, + text=True, + ) + except OSError: + return None + + if result.returncode not in {0, 1}: + return None + + changed_lines: set[int] = set() + for line in result.stdout.splitlines(): + match = DIFF_HUNK_RE.match(line) + if not match: + continue + + start_line = int(match.group('start')) + line_count = int(match.group('count') or '1') + if line_count == 0: + continue + + changed_lines.update(range(start_line, start_line + line_count)) + + return changed_lines + + +def call_name(call_node: ast.Call) -> str | None: + """Return the simple callable name for a Call node when available.""" + if isinstance(call_node.func, ast.Name): + return call_node.func.id + if isinstance(call_node.func, ast.Attribute): + return call_node.func.attr + return None + + +def decorator_name(decorator_node: ast.expr) -> str | None: + """Return the simple decorator name for a decorator node when available.""" + if isinstance(decorator_node, ast.Name): + return decorator_node.id + if isinstance(decorator_node, ast.Attribute): + return decorator_node.attr + if isinstance(decorator_node, ast.Call): + if isinstance(decorator_node.func, ast.Name): + return decorator_node.func.id + if isinstance(decorator_node.func, ast.Attribute): + return decorator_node.func.attr + return None + + +def has_decorator(function_node: ast.FunctionDef | ast.AsyncFunctionDef, names: set[str]) -> bool: + """Return True when the function has any decorator in the provided set.""" + return any(decorator_name(decorator) in names for decorator in function_node.decorator_list) + + +def build_parent_map(tree: ast.AST) -> dict[ast.AST, ast.AST]: + """Return a child-to-parent AST mapping.""" + parent_map: dict[ast.AST, ast.AST] = {} + for parent in ast.walk(tree): + for child in 
ast.iter_child_nodes(parent): + parent_map[child] = parent + return parent_map + + +def get_enclosing_function( + node: ast.AST, + parent_map: dict[ast.AST, ast.AST], +) -> ast.FunctionDef | ast.AsyncFunctionDef | None: + """Return the nearest enclosing function for a node when available.""" + current_node = node + while current_node in parent_map: + current_node = parent_map[current_node] + if isinstance(current_node, (ast.FunctionDef, ast.AsyncFunctionDef)): + return current_node + return None + + +def string_constant(node: ast.AST | None) -> str | None: + """Return a string constant value when the AST node is a string literal.""" + if isinstance(node, ast.Constant) and isinstance(node.value, str): + return node.value + return None + + +def iter_dict_literals(call_node: ast.Call) -> list[ast.Dict]: + """Return dict literal arguments passed to a call.""" + dict_literals: list[ast.Dict] = [] + for argument in call_node.args: + if isinstance(argument, ast.Dict): + dict_literals.append(argument) + for keyword in call_node.keywords: + if isinstance(keyword.value, ast.Dict): + dict_literals.append(keyword.value) + return dict_literals + + +def collect_function_call_names(function_node: ast.FunctionDef | ast.AsyncFunctionDef) -> set[str]: + """Return the set of simple call names used inside a function.""" + call_names: set[str] = set() + for node in ast.walk(function_node): + if isinstance(node, ast.Call): + name = call_name(node) + if name: + call_names.add(name) + return call_names + + +def get_function_source(source_text: str, function_node: ast.FunctionDef | ast.AsyncFunctionDef) -> str: + """Return the exact source segment for one function.""" + return ast.get_source_segment(source_text, function_node) or '' + + +def is_conversation_authorization_helper_name(name: str) -> bool: + """Return True when a helper name clearly represents a conversation authorization helper.""" + lowered_name = str(name or '').lower() + return lowered_name.startswith('_authorize_') and 
'conversation' in lowered_name + + +def function_has_conversation_auth( + function_node: ast.FunctionDef | ast.AsyncFunctionDef, + source_text: str, +) -> bool: + """Return True when the function already uses an approved conversation auth boundary.""" + if is_conversation_authorization_helper_name(function_node.name): + return True + if has_decorator(function_node, ADMIN_DECORATORS): + return True + + function_calls = collect_function_call_names(function_node) + if any(is_conversation_authorization_helper_name(name) for name in function_calls): + return True + + function_source = get_function_source(source_text, function_node) + return any(snippet in function_source for snippet in EXPLICIT_OWNERSHIP_SNIPPETS) + + +def call_references_name_fragment(call_node: ast.Call, fragment: str, source_text: str) -> bool: + """Return True when the call source references the provided name fragment.""" + call_source = ast.get_source_segment(source_text, call_node) or '' + return fragment in call_source + + +def collect_direct_active_scope_write_issues( + *, + file_path: Path, + relative_path: str, + tree: ast.AST, + parent_map: dict[ast.AST, ast.AST], + source_lines: list[str], + changed_lines: set[int] | None, +) -> list[Issue]: + """Collect issues for direct persistence of authorization-sensitive active scope keys.""" + issues: list[Issue] = [] + for node in ast.walk(tree): + if not isinstance(node, ast.Call) or call_name(node) != 'update_user_settings': + continue + + function_node = get_enclosing_function(node, parent_map) + function_name = function_node.name if function_node else '' + + for dict_literal in iter_dict_literals(node): + for key_node in dict_literal.keys: + key_name = string_constant(key_node) + if key_name not in ACTIVE_SCOPE_KEYS: + continue + if (relative_path, function_name, key_name) in APPROVED_ACTIVE_SCOPE_WRITE_CONTEXTS: + continue + + start_line = getattr(dict_literal, 'lineno', node.lineno) + end_line = getattr(dict_literal, 'end_lineno', start_line) + 
if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + helper_name = ACTIVE_SCOPE_KEYS[key_name]['write_helper'] + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + f"Do not persist {key_name} through update_user_settings(...) outside {helper_name}. " + f"Route active-scope writes through the validator, or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + return issues + + +def collect_direct_active_scope_read_issues( + *, + file_path: Path, + relative_path: str, + tree: ast.AST, + source_lines: list[str], + changed_lines: set[int] | None, +) -> list[Issue]: + """Collect issues for raw active scope reads in backend and plugin code.""" + if relative_path in ACTIVE_SCOPE_READ_ALLOWED_PATHS: + return [] + if not relative_path.startswith(ACTIVE_SCOPE_READ_TARGET_PREFIXES): + return [] + + issues: list[Issue] = [] + for node in ast.walk(tree): + if not isinstance(node, ast.Call) or call_name(node) != 'get' or not node.args: + continue + + key_name = string_constant(node.args[0]) + if key_name not in ACTIVE_SCOPE_KEYS: + continue + + start_line = node.lineno + end_line = getattr(node, 'end_lineno', start_line) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + helper_name = ACTIVE_SCOPE_KEYS[key_name]['read_helper'] + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + f"Avoid reading {key_name} from raw settings in backend or plugin code. " + f"Use {helper_name} or a request-scoped authorization helper, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." 
+ ), + ) + ) + return issues + + +def collect_kernel_scope_param_issues( + *, + file_path: Path, + relative_path: str, + tree: ast.AST, + source_lines: list[str], + changed_lines: set[int] | None, +) -> list[Issue]: + """Collect issues for kernel functions that expose sensitive scope ids without normalization.""" + if not relative_path.startswith('application/single_app/semantic_kernel_plugins/'): + return [] + + issues: list[Issue] = [] + for node in ast.walk(tree): + if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): + continue + if not has_decorator(node, {'kernel_function'}): + continue + + parameter_names = { + argument.arg + for argument in ( + list(node.args.posonlyargs) + + list(node.args.args) + + list(node.args.kwonlyargs) + ) + if argument.arg != 'self' + } + sensitive_params = sorted(parameter_names & KERNEL_SENSITIVE_PARAMS) + if not sensitive_params: + continue + + start_line = min([node.lineno] + [decorator.lineno for decorator in node.decorator_list]) + end_line = getattr(node, 'end_lineno', node.lineno) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + function_call_names = collect_function_call_names(node) + if function_call_names & APPROVED_KERNEL_SCOPE_HELPERS: + continue + + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + f"Kernel functions that expose {', '.join(sensitive_params)} must immediately normalize those values " + f"through an approved authorization helper such as _resolve_authorized_scope_arguments(...), " + f"_resolve_blob_location_with_fallback(...), or _resolve_authorized_fact_memory_call(...), " + f"or add '{SUPPRESSION_TOKEN}' with a justification." 
+ ), + ) + ) + return issues + + +def collect_direct_personal_conversation_read_issues( + *, + file_path: Path, + relative_path: str, + tree: ast.AST, + source_text: str, + source_lines: list[str], + changed_lines: set[int] | None, +) -> list[Issue]: + """Collect issues for direct personal conversation reads without an auth boundary.""" + if relative_path not in PERSONAL_CONVERSATION_ROUTE_FILES: + return [] + + issues: list[Issue] = [] + for function_node in ast.walk(tree): + if not isinstance(function_node, (ast.FunctionDef, ast.AsyncFunctionDef)): + continue + if function_has_conversation_auth(function_node, source_text): + continue + + for node in ast.walk(function_node): + if not isinstance(node, ast.Call) or call_name(node) != 'read_item': + continue + + call_source = ast.get_source_segment(source_text, node) or '' + if 'cosmos_conversations_container.read_item' not in call_source: + continue + if not call_references_name_fragment(node, 'conversation_id', source_text): + continue + + start_line = node.lineno + end_line = getattr(node, 'end_lineno', start_line) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + "Avoid direct personal conversation reads from request-derived conversation_id values without " + "_authorize_personal_conversation_read(...), _authorize_personal_conversation_access(...), " + f"or an explicit ownership check, or add '{SUPPRESSION_TOKEN}' with a justification." 
+ ), + ) + ) + return issues + + +def inspect_source(file_path: Path, source_text: str, changed_lines: set[int] | None = None) -> list[Issue]: + """Inspect one Python source string and return any BAC-related issues.""" + source_lines = source_text.splitlines() + + try: + tree = ast.parse(source_text, filename=str(file_path)) + except SyntaxError as exc: + return [ + Issue( + file_path=file_path, + line=exc.lineno or 1, + message=f'Unable to parse file for BAC validation: {exc.msg}', + ) + ] + + relative_path = get_relative_path(file_path) + parent_map = build_parent_map(tree) + issues: list[Issue] = [] + issues.extend( + collect_direct_active_scope_write_issues( + file_path=file_path, + relative_path=relative_path, + tree=tree, + parent_map=parent_map, + source_lines=source_lines, + changed_lines=changed_lines, + ) + ) + issues.extend( + collect_direct_active_scope_read_issues( + file_path=file_path, + relative_path=relative_path, + tree=tree, + source_lines=source_lines, + changed_lines=changed_lines, + ) + ) + issues.extend( + collect_kernel_scope_param_issues( + file_path=file_path, + relative_path=relative_path, + tree=tree, + source_lines=source_lines, + changed_lines=changed_lines, + ) + ) + issues.extend( + collect_direct_personal_conversation_read_issues( + file_path=file_path, + relative_path=relative_path, + tree=tree, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + ) + ) + + unique_issues: list[Issue] = [] + seen = set() + for issue in issues: + dedupe_key = (issue.file_path, issue.line, issue.message) + if dedupe_key in seen: + continue + seen.add(dedupe_key) + unique_issues.append(issue) + + return unique_issues + + +def inspect_file(file_path: Path, changed_lines: set[int] | None = None) -> list[Issue]: + """Load one file and return any BAC-related issues.""" + try: + source_text = file_path.read_text(encoding='utf-8') + except UnicodeDecodeError: + source_text = file_path.read_text(encoding='utf-8-sig') + 
except OSError as exc: + return [ + Issue( + file_path=file_path, + line=1, + message=f'Unable to read file for BAC validation: {exc}', + ) + ] + + return inspect_source(file_path, source_text, changed_lines=changed_lines) + + +def main() -> int: + """Run the Broken Access Control checker for the provided files.""" + parser = argparse.ArgumentParser( + description='Validate changed Python files for high-confidence broken access control regressions.' + ) + parser.add_argument('files', nargs='*', help='Files to validate relative to the repository root.') + parser.add_argument('--base-sha', help='Base git revision used to limit checks to added lines.') + parser.add_argument('--head-sha', help='Head git revision used to limit checks to added lines.') + parser.add_argument( + '--full-file', + action='store_true', + help='Scan the full file contents instead of only added lines.', + ) + args = parser.parse_args() + + files = normalize_paths(args.files) + if not files: + print('No supported files to validate for Broken Access Control guardrails.') + return 0 + + all_issues: list[Issue] = [] + checked_files = 0 + + for file_path in files: + changed_lines = None + if not args.full_file and args.base_sha and args.head_sha: + changed_lines = get_changed_lines(file_path, args.base_sha, args.head_sha) + if changed_lines == set(): + continue + + issues = inspect_file(file_path, changed_lines=changed_lines) + checked_files += 1 + all_issues.extend(issues) + + if all_issues: + print('Broken Access Control guardrail validation failed:') + for issue in all_issues: + print(format_error_annotation(issue)) + return 1 + + if checked_files == 0: + print('No added lines found in the provided files. 
Broken Access Control check skipped.') + return 0 + + print(f'Broken Access Control guardrail validation passed for {checked_files} file(s).') + return 0 + + +if __name__ == '__main__': + sys.exit(main()) \ No newline at end of file diff --git a/scripts/check_xss_sinks.py b/scripts/check_xss_sinks.py new file mode 100644 index 000000000..b3d570098 --- /dev/null +++ b/scripts/check_xss_sinks.py @@ -0,0 +1,481 @@ +# check_xss_sinks.py + +"""Validate changed files for risky XSS sink patterns.""" + +from __future__ import annotations + +import argparse +import re +import subprocess +import sys +from dataclasses import dataclass +from pathlib import Path + + +REPO_ROOT = Path(__file__).resolve().parents[1] +SUPPORTED_SUFFIXES = {'.js', '.html', '.py'} +SUPPRESSION_TOKEN = 'xss-check: ignore' +INLINE_EVENT_ATTRIBUTE_RE = re.compile( + r'\bon(?:abort|auxclick|beforeinput|blur|change|click|contextmenu|dblclick|error|focus|input|keydown|keypress|keyup|load|mousedown|mouseenter|mouseleave|mousemove|mouseout|mouseover|mouseup|reset|scroll|submit|touchend|touchstart|transitionend)\s*=\s*["\']', + re.IGNORECASE, +) +INLINE_EVENT_API_RE = re.compile( + r'\.(?:onabort|onblur|onchange|onclick|ondblclick|onerror|onfocus|oninput|onkeydown|onkeyup|onload|onmousedown|onmouseenter|onmouseleave|onmousemove|onmouseout|onmouseover|onmouseup|onscroll|onsubmit)\s*=|setAttribute\(\s*["\']on', + re.IGNORECASE, +) +JAVASCRIPT_URL_RE = re.compile(r'javascript\s*:', re.IGNORECASE) +MARKUP_RE = re.compile(r'\bMarkup\s*\(') +JINJA_SAFE_RE = re.compile(r'\|\s*safe\b') +MARKED_PARSE_RE = re.compile(r'\bmarked\.parse\s*\(') +DANGEROUS_REACT_HTML_RE = re.compile(r'\bdangerouslySetInnerHTML\b') +ATTRIBUTE_INTERPOLATION_RE = re.compile( + r'\b(?:href|src|title|style|data-[\w-]+)\s*=\s*["\'][^"\'\n]*\$\{[^}]+\}', + re.IGNORECASE, +) +HTML_ASSIGNMENT_RE = re.compile( + r'\.(?P<sink>innerHTML|outerHTML)\s*=\s*(?P<expr>.{0,500}?);', + re.DOTALL, +) +INSERT_ADJACENT_HTML_RE = re.compile( + 
r'\.insertAdjacentHTML\s*\(\s*[^,]+,\s*(?P<expr>.{0,500}?)\)\s*;?', + re.DOTALL, +) +JQUERY_HTML_RE = re.compile( + r'\.html\s*\(\s*(?P<expr>.{0,500}?)\)\s*;?', + re.DOTALL, +) +DIFF_HUNK_RE = re.compile(r'^@@ -\d+(?:,\d+)? \+(?P<start>\d+)(?:,(?P<count>\d+))? @@') + + +@dataclass(frozen=True) +class Issue: + """A single checker violation.""" + + file_path: Path + line: int + message: str + + +def get_relative_path(file_path: Path) -> str: + """Return a repository-relative path when possible.""" + try: + return file_path.relative_to(REPO_ROOT).as_posix() + except ValueError: + return file_path.as_posix() + + +def format_error_annotation(issue: Issue) -> str: + """Return a GitHub Actions annotation for one issue.""" + return f"::error file={get_relative_path(issue.file_path)},line={issue.line}::{issue.message}" + + +def normalize_paths(paths: list[str]) -> list[Path]: + """Resolve CLI paths relative to the repository root and keep supported files.""" + normalized: list[Path] = [] + for raw_path in paths: + candidate = Path(raw_path) + if not candidate.is_absolute(): + candidate = (REPO_ROOT / candidate).resolve() + if candidate.exists() and candidate.suffix in SUPPORTED_SUFFIXES: + normalized.append(candidate) + return normalized + + +def get_line_number(source_text: str, offset: int) -> int: + """Return the 1-based line number for a source offset.""" + return source_text.count('\n', 0, offset) + 1 + + +def matches_changed_lines(changed_lines: set[int] | None, start_line: int, end_line: int) -> bool: + """Return True when the issue overlaps the changed lines or full-file mode is active.""" + if changed_lines is None: + return True + return any(line in changed_lines for line in range(start_line, end_line + 1)) + + +def is_suppressed(source_lines: list[str], start_line: int, end_line: int) -> bool: + """Return True when a suppression token is present near the reported lines.""" + window_start = max(1, start_line - 2) + window_end = min(len(source_lines), end_line) + for line_number in 
range(window_start, window_end + 1): + if SUPPRESSION_TOKEN in source_lines[line_number - 1]: + return True + return False + + +def is_static_html_expression(expression: str) -> bool: + """Return True when an HTML expression is a static literal without interpolation.""" + stripped = expression.strip() + if not stripped: + return True + + quote_pairs = [("'", "'"), ('"', '"'), ('`', '`')] + for start_quote, end_quote in quote_pairs: + if stripped.startswith(start_quote) and stripped.endswith(end_quote): + return '${' not in stripped and '+' not in stripped + + return False + + +def is_allowed_html_expression(expression: str) -> bool: + """Return True when an HTML sink expression is explicitly allowed.""" + if 'DOMPurify.sanitize(' in expression: + return True + return is_static_html_expression(expression) + + +def get_changed_lines(file_path: Path, base_sha: str, head_sha: str) -> set[int] | None: + """Return the added-line numbers for one file between two revisions.""" + relative_path = get_relative_path(file_path) + command = [ + 'git', + 'diff', + '--unified=0', + base_sha, + head_sha, + '--', + relative_path, + ] + + try: + result = subprocess.run( + command, + cwd=REPO_ROOT, + check=False, + capture_output=True, + text=True, + ) + except OSError: + return None + + if result.returncode not in {0, 1}: + return None + + changed_lines: set[int] = set() + for line in result.stdout.splitlines(): + match = DIFF_HUNK_RE.match(line) + if not match: + continue + + start_line = int(match.group('start')) + line_count = int(match.group('count') or '1') + if line_count == 0: + continue + + changed_lines.update(range(start_line, start_line + line_count)) + + return changed_lines + + +def collect_regex_issues( + *, + file_path: Path, + source_text: str, + source_lines: list[str], + changed_lines: set[int] | None, + pattern: re.Pattern[str], + message: str, +) -> list[Issue]: + """Collect issues for a simple regex rule.""" + issues: list[Issue] = [] + for match in 
pattern.finditer(source_text): + start_line = get_line_number(source_text, match.start()) + end_line = get_line_number(source_text, match.end()) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + issues.append(Issue(file_path=file_path, line=start_line, message=message)) + return issues + + +def collect_html_sink_issues( + *, + file_path: Path, + source_text: str, + source_lines: list[str], + changed_lines: set[int] | None, + pattern: re.Pattern[str], + sink_name: str, +) -> list[Issue]: + """Collect issues for dangerous HTML sinks.""" + issues: list[Issue] = [] + for match in pattern.finditer(source_text): + expression = match.group('expr') + if is_allowed_html_expression(expression): + continue + + start_line = get_line_number(source_text, match.start()) + end_line = get_line_number(source_text, match.end()) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + f"Avoid dynamic {sink_name} sinks with untrusted data. Prefer DOM APIs, textContent, " + f"or DOMPurify.sanitize(...), or add '{SUPPRESSION_TOKEN}' with a justification." 
+ ), + ) + ) + return issues + + +def collect_marked_parse_issues( + *, + file_path: Path, + source_text: str, + source_lines: list[str], + changed_lines: set[int] | None, +) -> list[Issue]: + """Collect issues where marked.parse is not paired with DOMPurify.sanitize.""" + issues: list[Issue] = [] + for match in MARKED_PARSE_RE.finditer(source_text): + start_line = get_line_number(source_text, match.start()) + end_line = get_line_number(source_text, match.end()) + if not matches_changed_lines(changed_lines, start_line, end_line): + continue + if is_suppressed(source_lines, start_line, end_line): + continue + + window_start = max(1, start_line - 2) + window_end = min(len(source_lines), end_line + 2) + window_text = '\n'.join(source_lines[window_start - 1:window_end]) + if 'DOMPurify.sanitize(' in window_text: + continue + + issues.append( + Issue( + file_path=file_path, + line=start_line, + message=( + "Wrap marked.parse(...) output with DOMPurify.sanitize(...) before rendering HTML, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + return issues + + +def inspect_source(file_path: Path, source_text: str, changed_lines: set[int] | None = None) -> list[Issue]: + """Inspect one source string and return any XSS-related issues.""" + source_lines = source_text.splitlines() + issues: list[Issue] = [] + + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=INLINE_EVENT_ATTRIBUTE_RE, + message=( + f"Avoid inline event-handler attributes in rendered HTML. Use addEventListener or data-* hooks, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=INLINE_EVENT_API_RE, + message=( + f"Avoid inline event-handler APIs such as onclick/onerror. 
Use addEventListener, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=JAVASCRIPT_URL_RE, + message=( + f"Avoid javascript: URLs in rendered content. Normalize dynamic URLs explicitly, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=ATTRIBUTE_INTERPOLATION_RE, + message=( + f"Avoid interpolating untrusted values directly into href/src/title/style/data-* attributes. " + f"Prefer DOM APIs and explicit URL normalization, or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=DANGEROUS_REACT_HTML_RE, + message=( + f"Avoid dangerouslySetInnerHTML without a reviewed sanitizer boundary, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + + if file_path.suffix == '.py': + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=MARKUP_RE, + message=( + f"Avoid Markup(...) on untrusted content without a reviewed sanitizer boundary, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." + ), + ) + ) + + if file_path.suffix == '.html': + issues.extend( + collect_regex_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=JINJA_SAFE_RE, + message=( + f"Avoid Jinja '|safe' on untrusted content without a reviewed sanitizer boundary, " + f"or add '{SUPPRESSION_TOKEN}' with a justification." 
+ ), + ) + ) + + issues.extend( + collect_html_sink_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=HTML_ASSIGNMENT_RE, + sink_name='innerHTML/outerHTML', + ) + ) + issues.extend( + collect_html_sink_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=INSERT_ADJACENT_HTML_RE, + sink_name='insertAdjacentHTML', + ) + ) + issues.extend( + collect_html_sink_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + pattern=JQUERY_HTML_RE, + sink_name='jQuery .html()', + ) + ) + issues.extend( + collect_marked_parse_issues( + file_path=file_path, + source_text=source_text, + source_lines=source_lines, + changed_lines=changed_lines, + ) + ) + + unique_issues: list[Issue] = [] + seen = set() + for issue in issues: + dedupe_key = (issue.file_path, issue.line, issue.message) + if dedupe_key in seen: + continue + seen.add(dedupe_key) + unique_issues.append(issue) + + return unique_issues + + +def inspect_file(file_path: Path, changed_lines: set[int] | None = None) -> list[Issue]: + """Load one file and return any XSS-related issues.""" + try: + source_text = file_path.read_text(encoding='utf-8') + except UnicodeDecodeError: + source_text = file_path.read_text(encoding='utf-8-sig') + except OSError as exc: + return [ + Issue( + file_path=file_path, + line=1, + message=f'Unable to read file for XSS sink validation: {exc}', + ) + ] + + return inspect_source(file_path, source_text, changed_lines=changed_lines) + + +def main() -> int: + """Run the XSS sink checker for the provided files.""" + parser = argparse.ArgumentParser(description='Validate changed files for risky XSS sink patterns.') + parser.add_argument('files', nargs='*', help='Files to validate relative to the repository root.') + parser.add_argument('--base-sha', help='Base git revision used to limit 
checks to added lines.') + parser.add_argument('--head-sha', help='Head git revision used to limit checks to added lines.') + parser.add_argument( + '--full-file', + action='store_true', + help='Scan the full file contents instead of only added lines.', + ) + args = parser.parse_args() + + files = normalize_paths(args.files) + if not files: + print('No supported files to validate for XSS sink coverage.') + return 0 + + all_issues: list[Issue] = [] + checked_files = 0 + + for file_path in files: + changed_lines = None + if not args.full_file and args.base_sha and args.head_sha: + changed_lines = get_changed_lines(file_path, args.base_sha, args.head_sha) + if changed_lines == set(): + continue + + issues = inspect_file(file_path, changed_lines=changed_lines) + checked_files += 1 + all_issues.extend(issues) + + if all_issues: + print('XSS sink validation failed:') + for issue in all_issues: + print(format_error_annotation(issue)) + return 1 + + if checked_files == 0: + print('No added lines found in the provided files. XSS sink check skipped.') + return 0 + + print(f'XSS sink validation passed for {checked_files} file(s).') + return 0 + + +if __name__ == '__main__': + sys.exit(main()) \ No newline at end of file diff --git a/ui_tests/test_agent_template_gallery_actions_escaping.py b/ui_tests/test_agent_template_gallery_actions_escaping.py new file mode 100644 index 000000000..ecb11f810 --- /dev/null +++ b/ui_tests/test_agent_template_gallery_actions_escaping.py @@ -0,0 +1,138 @@ +# test_agent_template_gallery_actions_escaping.py +""" +UI test for agent template gallery actions escaping. +Version: 0.241.020 +Implemented in: 0.241.020 + +This test ensures malicious actions_to_load values render as inert text in the +agent template gallery instead of becoming executable DOM. 
+""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv('SIMPLECHAT_UI_BASE_URL', '').rstrip('/') +STORAGE_STATE = os.getenv('SIMPLECHAT_UI_STORAGE_STATE', '') +SKIP_RESPONSE_CODES = {401, 403, 404} + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type='application/json', + body=json.dumps(payload), + ) + + +@pytest.mark.ui +def test_agent_template_gallery_escapes_actions_to_load(playwright): + """Validate gallery action labels keep attacker-controlled values inert.""" + if not BASE_URL: + pytest.skip('Set SIMPLECHAT_UI_BASE_URL to run this UI test.') + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip('Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.') + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={'width': 1440, 'height': 900}, + ) + page = context.new_page() + + first_action = 'Action' + second_action = 'Action' + + page.route( + '**/api/user/settings*', + lambda route: _fulfill_json(route, {'settings': {}, 'selected_agent': None}), + ) + page.route( + '**/api/get_conversations*', + lambda route: _fulfill_json(route, {'conversations': []}), + ) + page.route( + '**/api/agent-templates', + lambda route: _fulfill_json( + route, + { + 'templates': [ + { + 'id': 'template-1', + 'title': 'Escaping Template', + 'display_name': 'Escaping Template', + 'description': 'Regression coverage for gallery action rendering.', + 'helper_text': 'Regression coverage for gallery action rendering.', + 'instructions': 'Do not execute action labels.', + 'actions_to_load': [first_action, second_action], + 'tags': [], + } + ] + }, + ), + ) + + try: + response = page.goto(f'{BASE_URL}/chats', wait_until='domcontentloaded') + assert response is not None, 'Expected a navigation response when loading /chats.' 
+ + if response.status in SKIP_RESPONSE_CODES: + pytest.skip(f'/chats returned HTTP {response.status} in this environment.') + + assert response.ok, f'Expected /chats to load successfully, got HTTP {response.status}.' + + page.evaluate( + """async ({ firstAction, secondAction }) => { + window.__agentTemplateActionXss = false; + window.__agentTemplateActionSvgXss = false; + window.appSettings = { + ...(window.appSettings || {}), + enable_agent_template_gallery: true, + }; + + const existing = document.getElementById('agent-template-gallery-test'); + if (existing) { + existing.remove(); + } + + const wrapper = document.createElement('div'); + wrapper.id = 'agent-template-gallery-test'; + wrapper.innerHTML = ` + + `; + document.body.appendChild(wrapper); + + await import(`/static/js/agent_templates_gallery.js?test=${Date.now()}`); + }""", + {'firstAction': first_action, 'secondAction': second_action}, + ) + + expect(page.locator('#agent-template-gallery-test .accordion-item')).to_have_count(1) + expect(page.locator('#agent-template-gallery-test')).to_contain_text('Recommended actions:') + expect(page.locator('#agent-template-gallery-test')).to_contain_text(first_action) + expect(page.locator('#agent-template-gallery-test')).to_contain_text(second_action) + expect(page.locator("#agent-template-gallery-test img[src='x']")).to_have_count(0) + expect(page.locator('#agent-template-gallery-test svg')).to_have_count(0) + + flags = page.evaluate( + """() => ({ + image: !!window.__agentTemplateActionXss, + svg: !!window.__agentTemplateActionSvgXss, + })""" + ) + assert flags == {'image': False, 'svg': False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_chat_messages_authorization_error.py b/ui_tests/test_chat_messages_authorization_error.py new file mode 100644 index 000000000..7047b18a2 --- /dev/null +++ b/ui_tests/test_chat_messages_authorization_error.py @@ -0,0 +1,63 @@ +# test_chat_messages_authorization_error.py 
+""" +UI test for chat message authorization error handling. +Version: 0.241.012 +Implemented in: 0.241.012 + +This test ensures the chat message loader renders a controlled access-denied +message when the conversation messages endpoint returns 403. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv('SIMPLECHAT_UI_BASE_URL', '').rstrip('/') +STORAGE_STATE = os.getenv('SIMPLECHAT_UI_STORAGE_STATE', '') + + +@pytest.mark.ui +def test_chat_loader_shows_forbidden_message(playwright): + """Validate chat message loading handles a forbidden response cleanly.""" + if not BASE_URL: + pytest.skip('Set SIMPLECHAT_UI_BASE_URL to run this UI test.') + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip('Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.') + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={'width': 1440, 'height': 900}, + ) + page = context.new_page() + + def fulfill_forbidden_messages(route): + route.fulfill( + status=403, + content_type='application/json', + body=json.dumps({'error': 'Forbidden'}), + ) + + try: + page.route('**/conversation/blocked-conversation/messages', fulfill_forbidden_messages) + page.goto(f'{BASE_URL}/chats', wait_until='networkidle') + page.wait_for_function( + "() => window.chatMessages && typeof window.chatMessages.loadMessages === 'function'" + ) + + page.evaluate( + """async () => { + await window.chatMessages.loadMessages('blocked-conversation'); + }""" + ) + + expect(page.locator('#chatbox')).to_contain_text( + 'You do not have access to this conversation.' 
+ ) + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_chat_modal_filename_escaping.py b/ui_tests/test_chat_modal_filename_escaping.py new file mode 100644 index 000000000..e55b8cdb6 --- /dev/null +++ b/ui_tests/test_chat_modal_filename_escaping.py @@ -0,0 +1,117 @@ +# test_chat_modal_filename_escaping.py +""" +UI test for chat modal filename escaping. +Version: 0.241.018 +Implemented in: 0.241.018 + +This test ensures citation and uploaded-file modal titles render malicious +filenames as inert text on first display. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +@pytest.mark.ui +def test_chat_modals_escape_malicious_filenames_on_first_render(playwright): + """Validate chat modal titles keep attacker-controlled filenames inert.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + citation_filename = '<img src=x onerror="window.__citationModalTitleXss = true">.pdf' + file_filename = '<svg onload="window.__fileModalTitleXss = true"></svg>.txt' + + page.route( + "**/api/user/settings", + lambda route: _fulfill_json(route, {"selected_agent": None, "settings": {"enable_agents": False}}), + ) + page.route("**/api/get_conversations", lambda route: _fulfill_json(route, {"conversations": []})) + + try: + response = page.goto(f"{BASE_URL}/chats", 
wait_until="domcontentloaded") + assert response is not None, "Expected a navigation response when loading /chats." + + if response.status in SKIP_RESPONSE_CODES: + pytest.skip(f"/chats returned HTTP {response.status} in this environment.") + + assert response.ok, f"Expected /chats to load successfully, got HTTP {response.status}." + page.wait_for_selector("#chatbox") + + page.evaluate( + """ + async ({ citationFilename, fileFilename }) => { + window.__citationModalTitleXss = false; + window.__fileModalTitleXss = false; + + const citationsModule = await import('/static/js/chat/chat-citations.js'); + citationsModule.showCitedTextPopup('Citation body', citationFilename, 7); + + const fileActionsModule = await import('/static/js/chat/chat-input-actions.js'); + const citationModal = document.getElementById('citation-modal'); + const citationInstance = bootstrap.Modal.getInstance(citationModal); + if (citationInstance) { + citationInstance.hide(); + } + + fileActionsModule.showFileContentPopup( + 'Uploaded file body', + fileFilename, + false, + 'database', + null, + null, + ); + } + """, + {"citationFilename": citation_filename, "fileFilename": file_filename}, + ) + + expect(page.locator("#citation-modal .modal-title")).to_have_text( + f"Source: {citation_filename}, Page: 7" + ) + expect(page.locator("#citation-modal img[src='x']")).to_have_count(0) + expect(page.locator("#citation-modal svg")).to_have_count(0) + + expect(page.locator("#file-modal")).to_be_visible() + expect(page.locator("#file-modal .modal-title")).to_have_text( + f"Uploaded File: {file_filename}" + ) + expect(page.locator("#file-modal img[src='x']")).to_have_count(0) + expect(page.locator("#file-modal svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + citation: !!window.__citationModalTitleXss, + file: !!window.__fileModalTitleXss, + })""" + ) + assert flags == {"citation": False, "file": False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git 
a/ui_tests/test_chat_scope_lock_and_conversation_details_escaping.py b/ui_tests/test_chat_scope_lock_and_conversation_details_escaping.py new file mode 100644 index 000000000..e622b94dc --- /dev/null +++ b/ui_tests/test_chat_scope_lock_and_conversation_details_escaping.py @@ -0,0 +1,250 @@ +# test_chat_scope_lock_and_conversation_details_escaping.py +""" +UI test for chat scope-lock and conversation-details escaping. +Version: 0.241.019 +Implemented in: 0.241.019 + +This test ensures malicious workspace names and conversation metadata render as +inert text in the chat scope-lock modal and conversation-details modal. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +@pytest.mark.ui +def test_chat_scope_lock_and_conversation_details_escape_malicious_metadata(playwright): + """Validate chat scope-lock and conversation-details metadata render as inert text.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + scope_lock_name = '<img src=x onerror="window.__scopeLockXss = true"> Locked Scope' + conversation_title = '<img src=x onerror="window.__conversationTitleXss = true"> Conversation' + primary_context_name = '<img src=x onerror="window.__conversationContextXss = true"> Primary Context' + participant_name = '<img src=x onerror="window.__participantNameXss = true"> Participant' + participant_email = '<svg onload="window.__participantEmailXss = true"></svg>@example.com' + document_title = '<img src=x onerror="window.__documentTitleXss = true"> Document' + document_scope_name = '<img src=x onerror="window.__documentScopeXss = true"> Scope' + classification_label = '<img src=x onerror="window.__classificationXss = true"> Secret' + semantic_tag = '<svg onload="window.__semanticTagXss = true"></svg>Semantic' + model_tag = '<img src=x onerror="window.__modelTagXss = true"> gpt-xss' + 
agent_tag = '<img src=x onerror="window.__agentTagXss = true">Agent' + web_source = 'javascript:window.__webSourceXss = true' + summary_model = '<img src=x onerror="window.__summaryModelXss = true"> summary-model' + + metadata_payload = { + "title": conversation_title, + "context": [ + { + "type": "primary", + "scope": "group", + "id": "group-1", + "name": primary_context_name, + }, + { + "type": "secondary", + "scope": "public", + "id": scope_lock_name, + "name": scope_lock_name, + }, + ], + "tags": [ + { + "category": "participant", + "user_id": "user-1", + "name": participant_name, + "email": participant_email, + }, + { + "category": "document", + "document_id": "doc-1", + "title": document_title, + "classification": classification_label, + "chunk_ids": ["chunk_1_p1", "chunk_1_p2"], + "scope": { + "type": "group", + "id": "group-1", + "name": document_scope_name, + }, + }, + { + "category": "semantic", + "value": semantic_tag, + }, + { + "category": "model", + "value": model_tag, + }, + { + "category": "agent", + "value": agent_tag, + }, + { + "category": "web", + "value": web_source, + }, + ], + "strict": False, + "classification": [classification_label], + "last_updated": "2026-05-05T12:00:00Z", + "chat_type": "group-single-user", + "is_pinned": False, + "is_hidden": False, + "scope_locked": True, + "locked_contexts": [ + {"scope": "group", "id": scope_lock_name}, + ], + "summary": { + "content": "Safe summary text.", + "generated_at": "2026-05-05T11:00:00Z", + "model_deployment": summary_model, + }, + } + + def fulfill_empty_docs_or_tags(route): + if "/tags" in route.request.url: + _fulfill_json(route, {"tags": []}) + return + _fulfill_json(route, {"documents": []}) + + try: + page.route("**/api/user/settings*", lambda route: _fulfill_json(route, {"settings": {}, "selected_agent": None})) + page.route("**/api/get_conversations*", lambda route: _fulfill_json(route, {"conversations": []})) + page.route("**/api/documents*", fulfill_empty_docs_or_tags) + page.route("**/api/group_documents*", fulfill_empty_docs_or_tags) + 
page.route("**/api/public_workspace_documents*", fulfill_empty_docs_or_tags) + page.route("**/api/user/profile-image/**", lambda route: route.fulfill(status=404, body="")) + page.route( + "**/api/conversations/conversation-xss/metadata", + lambda route: _fulfill_json(route, metadata_payload), + ) + + page.goto(f"{BASE_URL}/chats", wait_until="domcontentloaded") + page.wait_for_selector("#scopeLockModal") + + page.evaluate( + """async (scopeLockName) => { + window.__scopeLockXss = false; + window.__conversationTitleXss = false; + window.__conversationContextXss = false; + window.__participantNameXss = false; + window.__participantEmailXss = false; + window.__documentTitleXss = false; + window.__documentScopeXss = false; + window.__classificationXss = false; + window.__semanticTagXss = false; + window.__modelTagXss = false; + window.__agentTagXss = false; + window.__webSourceXss = false; + window.__summaryModelXss = false; + + const chatDocumentsModule = await import('/static/js/chat/chat-documents.js'); + chatDocumentsModule.restoreScopeLockState(true, [{ scope: 'group', id: scopeLockName }]); + + const scopeLockModal = document.getElementById('scopeLockModal'); + bootstrap.Modal.getOrCreateInstance(scopeLockModal).show(); + }""", + scope_lock_name, + ) + + locked_list = page.locator("#locked-workspaces-list") + expect(page.locator("#scopeLockModal")).to_be_visible() + expect(locked_list).to_contain_text(scope_lock_name) + expect(page.locator("#locked-workspaces-list img[src='x']")).to_have_count(0) + expect(page.locator("#locked-workspaces-list svg")).to_have_count(0) + + page.evaluate( + """() => { + const scopeLockModal = bootstrap.Modal.getInstance(document.getElementById('scopeLockModal')); + if (scopeLockModal) { + scopeLockModal.hide(); + } + }""" + ) + + page.evaluate( + """async () => { + const detailsModule = await import('/static/js/chat/chat-conversation-details.js'); + await detailsModule.showConversationDetails('conversation-xss'); + }""" + ) + + 
details_modal = page.locator("#conversation-details-modal") + details_content = page.locator("#conversation-details-content") + expect(details_modal).to_be_visible() + expect(page.locator("#conversation-details-modal .modal-title")).to_contain_text(conversation_title) + expect(details_content).to_contain_text(primary_context_name) + expect(details_content).to_contain_text(participant_name) + expect(details_content).to_contain_text(participant_email) + expect(details_content).to_contain_text(document_title) + expect(details_content).to_contain_text(document_scope_name) + expect(details_content).to_contain_text(classification_label) + expect(details_content).to_contain_text(semantic_tag) + expect(details_content).to_contain_text(model_tag) + expect(details_content).to_contain_text(agent_tag) + expect(details_content).to_contain_text(web_source) + expect(details_content).to_contain_text(summary_model) + expect(page.locator("#conversation-details-modal img[src='x']")).to_have_count(0) + expect(page.locator("#conversation-details-modal svg")).to_have_count(0) + expect(page.locator("#conversation-details-modal a[href^='javascript:']")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + scopeLock: !!window.__scopeLockXss, + title: !!window.__conversationTitleXss, + context: !!window.__conversationContextXss, + participantName: !!window.__participantNameXss, + participantEmail: !!window.__participantEmailXss, + documentTitle: !!window.__documentTitleXss, + documentScope: !!window.__documentScopeXss, + classification: !!window.__classificationXss, + semanticTag: !!window.__semanticTagXss, + modelTag: !!window.__modelTagXss, + agentTag: !!window.__agentTagXss, + webSource: !!window.__webSourceXss, + summaryModel: !!window.__summaryModelXss, + })""" + ) + assert flags == { + "scopeLock": False, + "title": False, + "context": False, + "participantName": False, + "participantEmail": False, + "documentTitle": False, + "documentScope": False, + "classification": False, + 
"semanticTag": False, + "modelTag": False, + "agentTag": False, + "webSource": False, + "summaryModel": False, + } + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_control_center_group_members_escaping.py b/ui_tests/test_control_center_group_members_escaping.py new file mode 100644 index 000000000..476e0f1c2 --- /dev/null +++ b/ui_tests/test_control_center_group_members_escaping.py @@ -0,0 +1,93 @@ +# test_control_center_group_members_escaping.py +""" +UI test for Control Center group member escaping. +Version: 0.241.010 +Implemented in: 0.241.010 + +This test ensures malicious group member names and emails render as inert text +in the Control Center group-members modal instead of executing as HTML. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") + + +@pytest.mark.ui +def test_control_center_group_member_metadata_is_escaped(playwright): + """Validate malicious group member metadata renders as inert text.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + member_name = '' + member_email = '@example.com' + + payload = { + "id": "group-1", + "name": "Escaping Test Group", + "owner": { + "id": "owner-1", + "displayName": "Owner", + "email": "owner@example.com", + }, + "admins": [], + "documentManagers": [], + "users": [ + { + "userId": "member-1", + "displayName": member_name, + "email": member_email, + } + ], + } + + def 
fulfill_group_details(route): + route.fulfill( + status=200, + content_type="application/json", + body=json.dumps(payload), + ) + + try: + page.route("**/api/admin/control-center/groups/group-1", fulfill_group_details) + + page.goto(f"{BASE_URL}/admin/control-center", wait_until="networkidle") + page.evaluate( + """async () => { + document.getElementById('groupManagementModal').setAttribute('data-group-id', 'group-1'); + await window.GroupManager.loadGroupMembers(); + }""" + ) + + table_body = page.locator("#groupMembersTableBody") + expect(table_body).to_contain_text(member_name) + expect(table_body).to_contain_text(member_email) + expect(page.locator("#groupMembersTableBody img[src='x']")).to_have_count(0) + expect(page.locator("#groupMembersTableBody svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + name: !!window.__controlCenterMemberNameXss, + email: !!window.__controlCenterMemberEmailXss, + })""" + ) + assert flags == {"name": False, "email": False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_control_center_public_workspace_escaping.py b/ui_tests/test_control_center_public_workspace_escaping.py new file mode 100644 index 000000000..4e3e6ce94 --- /dev/null +++ b/ui_tests/test_control_center_public_workspace_escaping.py @@ -0,0 +1,100 @@ +# test_control_center_public_workspace_escaping.py +""" +UI test for Control Center public workspace escaping. +Version: 0.241.007 +Implemented in: 0.241.007 + +This test ensures malicious public workspace metadata renders as inert text +in the Control Center public workspace table instead of executing as HTML. 
+""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") + + +@pytest.mark.ui +def test_control_center_public_workspace_metadata_is_escaped(playwright): + """Validate malicious public workspace metadata renders as inert text.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + workspace_name = '' + workspace_description = '' + owner_name = '' + + payload = { + "workspaces": [ + { + "id": "workspace-1", + "name": workspace_name, + "description": workspace_description, + "owner": { + "displayName": owner_name, + "email": "owner@example.com", + }, + "member_count": 2, + "status": "active", + "activity": { + "document_metrics": { + "total_documents": 1, + "ai_search_size": 0, + "storage_account_size": 0, + } + }, + } + ] + } + + def fulfill_public_workspaces(route): + route.fulfill( + status=200, + content_type="application/json", + body=json.dumps(payload), + ) + + try: + page.route("**/api/admin/control-center/public-workspaces?*", fulfill_public_workspaces) + + page.goto(f"{BASE_URL}/admin/control-center", wait_until="networkidle") + + with page.expect_response(lambda response: "/api/admin/control-center/public-workspaces?" 
in response.url): + if page.locator("#workspaces-tab").count() > 0: + page.locator("#workspaces-tab").click() + else: + page.locator('[onclick*="workspaces-tab"]').first.click() + + table_body = page.locator("#publicWorkspacesTableBody") + expect(table_body).to_contain_text(workspace_name) + expect(table_body).to_contain_text(workspace_description) + expect(table_body).to_contain_text(owner_name) + expect(page.locator("#publicWorkspacesTableBody img[src='x']")).to_have_count(0) + expect(page.locator("#publicWorkspacesTableBody svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + name: !!window.__controlCenterNameXss, + description: !!window.__controlCenterDescriptionXss, + owner: !!window.__controlCenterOwnerXss, + })""" + ) + assert flags == {"name": False, "description": False, "owner": False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_control_center_public_workspace_members_escaping.py b/ui_tests/test_control_center_public_workspace_members_escaping.py new file mode 100644 index 000000000..2c662b355 --- /dev/null +++ b/ui_tests/test_control_center_public_workspace_members_escaping.py @@ -0,0 +1,92 @@ +# test_control_center_public_workspace_members_escaping.py +""" +UI test for Control Center public workspace member escaping. +Version: 0.241.016 +Implemented in: 0.241.016 + +This test ensures malicious public workspace member names and emails render as +inert text in the Control Center workspace-members modal instead of executing +as HTML. 
+""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") + + +@pytest.mark.ui +def test_control_center_public_workspace_member_metadata_is_escaped(playwright): + """Validate malicious public workspace member metadata renders as inert text.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + member_name = '' + member_email = '@example.com' + + payload = { + "success": True, + "workspace_name": "Escaping Test Workspace", + "members": [ + { + "userId": "owner-1", + "displayName": member_name, + "email": member_email, + "role": "owner", + } + ], + } + + def fulfill_workspace_members(route): + route.fulfill( + status=200, + content_type="application/json", + body=json.dumps(payload), + ) + + try: + page.route( + "**/api/admin/control-center/public-workspaces/workspace-1/members", + fulfill_workspace_members, + ) + + page.goto(f"{BASE_URL}/admin/control-center", wait_until="networkidle") + page.evaluate( + """async () => { + document.getElementById('publicWorkspaceManagementModal').setAttribute('data-workspace-id', 'workspace-1'); + document.getElementById('modalWorkspaceName').textContent = 'Escaping Test Workspace'; + await window.WorkspaceManager.loadWorkspaceMembers(); + }""" + ) + + table_body = page.locator("#workspaceMembersTableBody") + expect(table_body).to_contain_text(member_name) + expect(table_body).to_contain_text(member_email) + expect(page.locator("#workspaceMembersTableBody 
img[src='x']")).to_have_count(0) + expect(page.locator("#workspaceMembersTableBody svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + name: !!window.__controlCenterWorkspaceMemberNameXss, + email: !!window.__controlCenterWorkspaceMemberEmailXss, + })""" + ) + assert flags == {"name": False, "email": False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_document_share_modal_escaping.py b/ui_tests/test_document_share_modal_escaping.py new file mode 100644 index 000000000..bb8148900 --- /dev/null +++ b/ui_tests/test_document_share_modal_escaping.py @@ -0,0 +1,253 @@ +# test_document_share_modal_escaping.py +""" +UI test for personal and group document share modal escaping. +Version: 0.241.020 +Implemented in: 0.241.020 + +This test ensures malicious names, descriptions, emails, and toast messages +render as inert text in the personal and group document sharing modals. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +def _require_ui_env(): + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + +def _new_page(playwright): + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + return browser, context, page + + +def _assert_ok_or_skip(response, route_path: str) -> None: + assert response is not 
None, f"Expected a navigation response when loading {route_path}." + if response.status in SKIP_RESPONSE_CODES: + pytest.skip(f"{route_path} returned HTTP {response.status} in this environment.") + assert response.ok, f"Expected {route_path} to load successfully, got HTTP {response.status}." + + +@pytest.mark.ui +def test_workspace_share_modal_escapes_malicious_names_and_toasts(playwright): + """Validate the personal workspace share modal renders malicious values inertly.""" + _require_ui_env() + + browser, context, page = _new_page(playwright) + + shared_name = '' + shared_email = '@example.com' + search_name = '' + search_email = '@example.com' + + try: + page.route( + "**/api/documents/doc-1/shared-users", + lambda route: _fulfill_json( + route, + { + "shared_users": [ + { + "id": "shared-user-1", + "displayName": shared_name, + "email": shared_email, + } + ] + }, + ), + ) + page.route( + "**/api/userSearch*", + lambda route: _fulfill_json( + route, + [ + { + "id": "search-user-1", + "displayName": search_name, + "email": search_email, + } + ], + ), + ) + page.route( + "**/api/documents/doc-1/share", + lambda route: _fulfill_json(route, {"success": True}), + ) + + response = page.goto(f"{BASE_URL}/workspace", wait_until="networkidle") + _assert_ok_or_skip(response, "/workspace") + page.wait_for_function("() => typeof window.shareDocument === 'function'") + + page.evaluate( + """() => { + window.__workspaceSharedNameXss = false; + window.__workspaceSharedEmailXss = false; + window.__workspaceSearchNameXss = false; + window.__workspaceSearchEmailXss = false; + window.shareDocument('doc-1', 'Escaping Test Document.txt'); + }""" + ) + + expect(page.locator("#shareDocumentModal")).to_be_visible() + expect(page.locator("#sharedUsersList")).to_contain_text(shared_name) + expect(page.locator("#sharedUsersList")).to_contain_text(shared_email) + expect(page.locator("#sharedUsersList img[src='x']")).to_have_count(0) + expect(page.locator("#sharedUsersList 
svg")).to_have_count(0) + + page.locator("#userSearchTerm").fill("malicious") + page.locator("#searchUsersBtn").click() + + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_name) + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_email) + expect(page.locator("#userSearchResultsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#userSearchResultsTable tbody svg")).to_have_count(0) + + page.locator("#userSearchResultsTable tbody .user-search-add-btn").click() + + expect(page.locator("#toastContainer")).to_contain_text(f"Document shared with {search_name}") + expect(page.locator("#toastContainer img[src='x']")).to_have_count(0) + expect(page.locator("#toastContainer svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + sharedName: !!window.__workspaceSharedNameXss, + sharedEmail: !!window.__workspaceSharedEmailXss, + searchName: !!window.__workspaceSearchNameXss, + searchEmail: !!window.__workspaceSearchEmailXss, + })""" + ) + assert flags == { + "sharedName": False, + "sharedEmail": False, + "searchName": False, + "searchEmail": False, + } + finally: + context.close() + browser.close() + + +@pytest.mark.ui +def test_group_share_modal_escapes_malicious_names_descriptions_and_toasts(playwright): + """Validate the group workspace share modal renders malicious values inertly.""" + _require_ui_env() + + browser, context, page = _new_page(playwright) + + shared_group_name = '' + shared_group_description = ' shared description' + search_group_name = '' + search_group_description = ' search description' + + try: + page.route( + "**/api/group_documents/group-doc-1/shared-groups", + lambda route: _fulfill_json( + route, + { + "shared_groups": [ + { + "id": "shared-group-1", + "name": shared_group_name, + "description": shared_group_description, + } + ] + }, + ), + ) + page.route( + "**/api/groups/discover*", + lambda route: _fulfill_json( + route, + [ + { + "id": "search-group-1", + "name": 
search_group_name, + "description": search_group_description, + } + ], + ), + ) + page.route( + "**/api/group_documents/group-doc-1/share-with-group", + lambda route: _fulfill_json(route, {"success": True}), + ) + + response = page.goto(f"{BASE_URL}/group_workspaces", wait_until="networkidle") + _assert_ok_or_skip(response, "/group_workspaces") + page.wait_for_function("() => typeof window.shareGroupDocument === 'function'") + + page.evaluate( + """() => { + window.__sharedGroupNameXss = false; + window.__sharedGroupDescriptionXss = false; + window.__searchGroupNameXss = false; + window.__searchGroupDescriptionXss = false; + window.shareGroupDocument('group-doc-1', 'Escaping Group Document.txt'); + }""" + ) + + expect(page.locator("#groupShareDocumentModal")).to_be_visible() + expect(page.locator("#sharedGroupsList")).to_contain_text(shared_group_name) + expect(page.locator("#sharedGroupsList")).to_contain_text(shared_group_description) + expect(page.locator("#sharedGroupsList img[src='x']")).to_have_count(0) + expect(page.locator("#sharedGroupsList svg")).to_have_count(0) + + page.locator("#groupSearchTerm").fill("malicious") + page.locator("#searchGroupsBtn").click() + + expect(page.locator("#groupSearchResultsTable tbody")).to_contain_text(search_group_name) + expect(page.locator("#groupSearchResultsTable tbody")).to_contain_text(search_group_description) + expect(page.locator("#groupSearchResultsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#groupSearchResultsTable tbody svg")).to_have_count(0) + + page.locator("#groupSearchResultsTable tbody .group-search-add-btn").click() + + expect(page.locator("#toastContainer")).to_contain_text( + f"Document shared with group: {search_group_name}" + ) + expect(page.locator("#toastContainer img[src='x']")).to_have_count(0) + expect(page.locator("#toastContainer svg")).to_have_count(0) + + flags = page.evaluate( + """() => ({ + sharedName: !!window.__sharedGroupNameXss, + sharedDescription: 
!!window.__sharedGroupDescriptionXss, + searchName: !!window.__searchGroupNameXss, + searchDescription: !!window.__searchGroupDescriptionXss, + })""" + ) + assert flags == { + "sharedName": False, + "sharedDescription": False, + "searchName": False, + "searchDescription": False, + } + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_group_workspace_member_rendering_escaping.py b/ui_tests/test_group_workspace_member_rendering_escaping.py new file mode 100644 index 000000000..eeff499b5 --- /dev/null +++ b/ui_tests/test_group_workspace_member_rendering_escaping.py @@ -0,0 +1,191 @@ +# test_group_workspace_member_rendering_escaping.py +""" +UI test for group workspace member rendering escaping. +Version: 0.241.017 +Implemented in: 0.241.017 + +This test ensures malicious member, request, and user-search display names and +emails render as inert text in the group workspace member-management UI. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +def _require_ui_env(): + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + +@pytest.mark.ui +def test_group_workspace_member_management_escapes_malicious_fields(playwright): + """Validate group workspace member-management views render malicious fields inertly.""" + _require_ui_env() + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + 
page = context.new_page() + + member_name = '<img src=x onerror="window.__groupMemberNameXss = true">' + member_email = '<svg onload="window.__groupMemberEmailXss = true"></svg>@example.com' + request_name = '<img src=x onerror="window.__groupRequestNameXss = true">' + request_email = '<svg onload="window.__groupRequestEmailXss = true"></svg>@example.com' + search_name = '<img src=x onerror="window.__groupSearchNameXss = true">' + search_email = '<svg onload="window.__groupSearchEmailXss = true"></svg>@example.com' + + try: + page.route( + "**/api/groups?page_size=1000", + lambda route: _fulfill_json( + route, + { + "groups": [ + { + "id": "group-alpha", + "name": "Escaping Group", + "isActive": True, + "userRole": "Owner", + "status": "active", + } + ] + }, + ), + ) + page.route( + "**/api/group_documents?*", + lambda route: _fulfill_json( + route, + { + "documents": [], + "page": 1, + "page_size": 10, + "total_count": 0, + }, + ), + ) + page.route( + "**/api/group_documents/tags?*", + lambda route: _fulfill_json(route, {"tags": []}), + ) + page.route( + "**/api/groups/group-alpha/members*", + lambda route: _fulfill_json( + route, + [ + { + "userId": "member-1", + "displayName": member_name, + "email": member_email, + "role": "Admin", + } + ], + ), + ) + page.route( + "**/api/groups/group-alpha/requests*", + lambda route: _fulfill_json( + route, + [ + { + "userId": "request-1", + "displayName": request_name, + "email": request_email, + } + ], + ), + ) + page.route( + "**/api/userSearch*", + lambda route: _fulfill_json( + route, + [ + { + "id": "search-1", + "displayName": search_name, + "email": search_email, + } + ], + ), + ) + + response = page.goto(f"{BASE_URL}/group_workspaces", wait_until="networkidle") + + assert response is not None, "Expected a navigation response when loading /group_workspaces." + assert response.ok, f"Expected /group_workspaces to load successfully, got HTTP {response.status}." 
+ + page.evaluate( + """() => { + if (typeof loadMembers === 'function') { + loadMembers(); + } + if (typeof loadPendingRequests === 'function') { + loadPendingRequests(); + } + }""" + ) + + expect(page.locator("#membersTable tbody")).to_contain_text(member_name) + expect(page.locator("#membersTable tbody")).to_contain_text(member_email) + expect(page.locator("#membersTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#membersTable tbody svg")).to_have_count(0) + + expect(page.locator("#pendingRequestsTable tbody")).to_contain_text(request_name) + expect(page.locator("#pendingRequestsTable tbody")).to_contain_text(request_email) + expect(page.locator("#pendingRequestsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#pendingRequestsTable tbody svg")).to_have_count(0) + + page.locator("#addMemberBtn").click() + page.locator("#userSearchTerm").fill("search") + page.locator("#searchUsersBtn").click() + + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_name) + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_email) + expect(page.locator("#userSearchResultsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#userSearchResultsTable tbody svg")).to_have_count(0) + + page.locator("#userSearchResultsTable tbody .select-user-btn").click() + expect(page.locator("#newUserDisplayName")).to_have_value(search_name) + expect(page.locator("#newUserEmail")).to_have_value(search_email) + + flags = page.evaluate( + """() => ({ + memberName: !!window.__groupMemberNameXss, + memberEmail: !!window.__groupMemberEmailXss, + requestName: !!window.__groupRequestNameXss, + requestEmail: !!window.__groupRequestEmailXss, + searchName: !!window.__groupSearchNameXss, + searchEmail: !!window.__groupSearchEmailXss, + })""" + ) + assert flags == { + "memberName": False, + "memberEmail": False, + "requestName": False, + "requestEmail": False, + "searchName": False, + "searchEmail": False, + } + 
finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_public_workspace_member_rendering_escaping.py b/ui_tests/test_public_workspace_member_rendering_escaping.py new file mode 100644 index 000000000..685692c23 --- /dev/null +++ b/ui_tests/test_public_workspace_member_rendering_escaping.py @@ -0,0 +1,182 @@ +# test_public_workspace_member_rendering_escaping.py +""" +UI test for public workspace member rendering escaping. +Version: 0.241.017 +Implemented in: 0.241.017 + +This test ensures malicious member, request, and user-search display names and +emails render as inert text in the public workspace member-management UI. +""" + +import json +import os +from pathlib import Path +from urllib.parse import urlparse + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +def _require_ui_env(): + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + +@pytest.mark.ui +def test_public_workspace_member_management_escapes_malicious_fields(playwright): + """Validate public workspace member-management views render malicious fields inertly.""" + _require_ui_env() + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + member_name = '<img src=x onerror="window.__publicMemberNameXss = true">' + member_email = '<svg onload="window.__publicMemberEmailXss = true"></svg>@example.com' + request_name = '<img src=x onerror="window.__publicRequestNameXss = true">' + request_email = '<svg onload="window.__publicRequestEmailXss = true"></svg>@example.com' + search_name = '<img src=x onerror="window.__publicSearchNameXss = true">' + search_email = '<svg onload="window.__publicSearchEmailXss = true"></svg>@example.com' + + def 
handle_public_workspace_api(route): + path = urlparse(route.request.url).path + + if path == "/api/public_workspaces/public-1": + _fulfill_json( + route, + { + "id": "public-1", + "name": "Escaping Workspace", + "description": "Regression coverage", + "owner": { + "displayName": "Owner User", + "email": "owner@example.com", + }, + "status": "active", + "heroColor": "#225577", + "userRole": "Owner", + "isMember": True, + }, + ) + return + + if path == "/api/public_workspaces/public-1/members": + _fulfill_json( + route, + [ + { + "userId": "member-1", + "displayName": member_name, + "email": member_email, + "role": "Admin", + } + ], + ) + return + + if path == "/api/public_workspaces/public-1/requests": + _fulfill_json( + route, + [ + { + "userId": "request-1", + "displayName": request_name, + "email": request_email, + } + ], + ) + return + + route.continue_() + + try: + page.route("**/api/public_workspaces/public-1*", handle_public_workspace_api) + page.route( + "**/api/userSearch*", + lambda route: _fulfill_json( + route, + [ + { + "id": "search-1", + "displayName": search_name, + "email": search_email, + } + ], + ), + ) + + response = page.goto(f"{BASE_URL}/public_workspaces/public-1", wait_until="networkidle") + assert response is not None, "Expected a navigation response when loading /public_workspaces/public-1." + + if response.status in SKIP_RESPONSE_CODES: + pytest.skip( + f"/public_workspaces/public-1 returned HTTP {response.status} in this environment." + ) + + assert response.ok, ( + "Expected /public_workspaces/public-1 to load successfully, " + f"got HTTP {response.status}." 
+ ) + + expect(page.locator("#membersTable tbody")).to_contain_text(member_name) + expect(page.locator("#membersTable tbody")).to_contain_text(member_email) + expect(page.locator("#membersTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#membersTable tbody svg")).to_have_count(0) + + expect(page.locator("#pendingRequestsTable tbody")).to_contain_text(request_name) + expect(page.locator("#pendingRequestsTable tbody")).to_contain_text(request_email) + expect(page.locator("#pendingRequestsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#pendingRequestsTable tbody svg")).to_have_count(0) + + page.locator("#addMemberBtn").click() + page.locator("#userSearchTerm").fill("search") + page.locator("#searchUsersBtn").click() + + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_name) + expect(page.locator("#userSearchResultsTable tbody")).to_contain_text(search_email) + expect(page.locator("#userSearchResultsTable tbody img[src='x']")).to_have_count(0) + expect(page.locator("#userSearchResultsTable tbody svg")).to_have_count(0) + + page.locator("#userSearchResultsTable tbody .select-user-btn").click() + expect(page.locator("#newUserDisplayName")).to_have_value(search_name) + expect(page.locator("#newUserEmail")).to_have_value(search_email) + + flags = page.evaluate( + """() => ({ + memberName: !!window.__publicMemberNameXss, + memberEmail: !!window.__publicMemberEmailXss, + requestName: !!window.__publicRequestNameXss, + requestEmail: !!window.__publicRequestEmailXss, + searchName: !!window.__publicSearchNameXss, + searchEmail: !!window.__publicSearchEmailXss, + })""" + ) + assert flags == { + "memberName": False, + "memberEmail": False, + "requestName": False, + "requestEmail": False, + "searchName": False, + "searchEmail": False, + } + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_public_workspace_projection_non_member_ui.py 
b/ui_tests/test_public_workspace_projection_non_member_ui.py new file mode 100644 index 000000000..9d35d88bd --- /dev/null +++ b/ui_tests/test_public_workspace_projection_non_member_ui.py @@ -0,0 +1,158 @@ +# test_public_workspace_projection_non_member_ui.py +""" +UI test for public workspace projection hardening. +Version: 0.241.013 +Implemented in: 0.241.013 + +This test ensures the public directory renders owner display names without +falling back to owner email addresses, and that non-members who open the +workspace details page see the public summary view without member-only tabs. +""" + +import json +import os +from pathlib import Path +from urllib.parse import urlparse + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +def _handle_public_workspace_projection_api(route): + request = route.request + parsed_url = urlparse(request.url) + path = parsed_url.path + + if path == "/api/user/settings": + _fulfill_json( + route, + { + "settings": { + "publicDirectorySettings": { + "public-1": True, + } + } + }, + ) + return + + if path == "/api/public_workspaces/discover": + _fulfill_json( + route, + [ + { + "id": "public-1", + "name": "Projection Workspace", + "description": "Directory summary only", + } + ], + ) + return + + if path == "/api/public_workspaces/public-1": + _fulfill_json( + route, + { + "id": "public-1", + "name": "Projection Workspace", + "description": "Directory summary only", + "owner": { + "displayName": "Directory Owner", + }, + "status": "active", + "heroColor": "#224466", + "userRole": None, + "isMember": False, + }, + ) + return + + if path == "/api/public_workspaces/public-1/fileCount": + _fulfill_json(route, 
{"fileCount": 7}) + return + + if path == "/api/public_workspaces/public-1/promptCount": + _fulfill_json(route, {"promptCount": 3}) + return + + route.continue_() + + +def _require_ui_environment(): + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip( + "Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file." + ) + + +@pytest.mark.ui +def test_public_directory_owner_display_and_non_member_workspace_fallback(playwright): + """Validate the public directory and non-member workspace details view use the safe payload.""" + _require_ui_environment() + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + page.route("**/api/user/settings", _handle_public_workspace_projection_api) + page.route("**/api/public_workspaces*", _handle_public_workspace_projection_api) + + try: + directory_response = page.goto(f"{BASE_URL}/public_directory", wait_until="networkidle") + assert directory_response is not None, "Expected a navigation response when loading /public_directory." + + if directory_response.status in SKIP_RESPONSE_CODES: + pytest.skip(f"/public_directory returned HTTP {directory_response.status} in this environment.") + + assert directory_response.ok, ( + f"Expected /public_directory to load successfully, got HTTP {directory_response.status}." 
+ ) + + expect(page.locator("#public-directory-table tbody")).to_contain_text("Projection Workspace") + page.locator('button.expand-btn[data-id="public-1"]').click() + expect(page.locator("#owner-public-1")).to_have_text("Directory Owner") + assert "@" not in page.locator("#owner-public-1").inner_text() + + manage_response = page.goto(f"{BASE_URL}/public_workspaces/public-1", wait_until="networkidle") + assert manage_response is not None, "Expected a navigation response when loading /public_workspaces/public-1." + + if manage_response.status in SKIP_RESPONSE_CODES: + pytest.skip( + f"/public_workspaces/public-1 returned HTTP {manage_response.status} in this environment." + ) + + assert manage_response.ok, ( + "Expected /public_workspaces/public-1 to load successfully, " + f"got HTTP {manage_response.status}." + ) + + expect(page.locator("#workspaceHeroName")).to_have_text("Projection Workspace") + expect(page.locator("#workspaceOwnerName")).to_have_text("Directory Owner") + expect(page.locator("#workspace-access-alert")).to_be_visible() + expect(page.locator("#workspace-access-alert")).to_contain_text( + "Membership, statistics, and workspace settings are only available to workspace members." + ) + expect(page.locator("#membership-tab")).to_be_hidden() + expect(page.locator("#stats-tab")).to_be_hidden() + expect(page.locator("#settings-tab-item")).to_be_hidden() + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_public_workspace_tag_color_rendering.py b/ui_tests/test_public_workspace_tag_color_rendering.py new file mode 100644 index 000000000..91bb295f0 --- /dev/null +++ b/ui_tests/test_public_workspace_tag_color_rendering.py @@ -0,0 +1,169 @@ +# test_public_workspace_tag_color_rendering.py +""" +UI test for public workspace tag color XSS hardening. 
+Version: 0.241.022 +Implemented in: 0.241.022 + +This test ensures malicious tag color payloads remain inert in the public +workspace grid, tag-management rows, tag-selection rows, and selected-tag chips. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} +TAG_NAME = "reviewed" +MALICIOUS_COLOR = ( + '#ff0000"; onclick="window.__publicTagColorXss = true" ' + 'onmouseover="window.__publicTagColorXss = true' +) + + +def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + + +def _require_ui_env(): + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip( + "Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file." 
+ ) + + +@pytest.mark.ui +def test_public_workspace_tag_color_payloads_render_inertly(playwright): + """Validate malicious tag color payloads do not become live browser attributes.""" + _require_ui_env() + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + page.add_init_script( + """() => { + localStorage.setItem('publicWorkspaceViewPreference', 'grid'); + window.__publicTagColorXss = false; + }""" + ) + + documents_payload = { + "documents": [ + { + "id": "doc-1", + "file_name": "reviewed-spec.pdf", + "title": "Reviewed Spec", + "tags": [TAG_NAME], + "status": "Complete", + "percentage_complete": 100, + "document_classification": "Public", + "classification": "Public", + "version": "1", + "authors": "Owner User", + "number_of_pages": 3, + "enhanced_citations": False, + "publication_date": "2024-01-01", + "keywords": "reviewed", + "abstract": "Regression coverage", + } + ], + "page": 1, + "page_size": 1000, + "total_count": 1, + } + tag_payload = [{"name": TAG_NAME, "color": MALICIOUS_COLOR, "count": 1}] + + try: + page.route( + "**/api/public_documents*", + lambda route: _fulfill_json(route, documents_payload), + ) + page.route( + "**/api/public_workspace_documents/tags*", + lambda route: _fulfill_json(route, tag_payload), + ) + + response = page.goto( + f"{BASE_URL}/public_workspaces/public-1", + wait_until="networkidle", + ) + assert response is not None, ( + "Expected a navigation response when loading /public_workspaces/public-1." + ) + + if response.status in SKIP_RESPONSE_CODES: + pytest.skip( + f"/public_workspaces/public-1 returned HTTP {response.status} in this environment." + ) + + assert response.ok, ( + "Expected /public_workspaces/public-1 to load successfully, " + f"got HTTP {response.status}." 
+ ) + + expect(page.locator("#public-tag-folders-container")).to_contain_text(TAG_NAME) + page.locator("#public-tag-folders-container .tag-folder-icon i").first.hover() + + audit = page.evaluate( + """(tagName) => { + refreshPublicTagManagementTable(); + renderPublicTagSelectionList(); + window.eval(`publicDocSelectedTags.add(${JSON.stringify(tagName)});`); + updatePublicDocTagsDisplay(); + + const selectors = [ + '#public-tag-folders-container', + '#public-existing-tags-tbody', + '#public-tag-selection-list', + '#public-doc-selected-tags-container', + ]; + + const attributes = selectors.flatMap((selector) => + Array.from(document.querySelectorAll(`${selector} *`)).flatMap((node) => + Array.from(node.attributes) + .filter((attr) => attr.name.toLowerCase().startsWith('on') || String(attr.value).includes('__publicTagColorXss')) + .map((attr) => ({ + selector, + tagName: node.tagName, + attribute: attr.name, + value: attr.value, + })) + ) + ); + + return { + attributes, + selectedTagsText: document.getElementById('public-doc-selected-tags-container')?.textContent || '', + managementText: document.getElementById('public-existing-tags-tbody')?.textContent || '', + selectionText: document.getElementById('public-tag-selection-list')?.textContent || '', + gridText: document.getElementById('public-tag-folders-container')?.textContent || '', + xssTriggered: !!window.__publicTagColorXss, + }; + }""", + TAG_NAME, + ) + + assert TAG_NAME in audit["gridText"] + assert TAG_NAME in audit["managementText"] + assert TAG_NAME in audit["selectionText"] + assert TAG_NAME in audit["selectedTagsText"] + assert audit["attributes"] == [] + assert audit["xssTriggered"] is False + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_uploaded_file_preview_escaping.py b/ui_tests/test_uploaded_file_preview_escaping.py new file mode 100644 index 000000000..b5ebdea28 --- /dev/null +++ b/ui_tests/test_uploaded_file_preview_escaping.py @@ -0,0 +1,143 
@@ +# test_uploaded_file_preview_escaping.py +""" +UI test for uploaded file preview body escaping. +Version: 0.241.022 +Implemented in: 0.241.022 + +This test ensures uploaded file preview content renders attacker-controlled +plain text, CSV cells, and legacy HTML table payloads as inert text. +""" + +import json +import os +from pathlib import Path + +import pytest +from playwright.sync_api import expect + +BASE_URL = os.getenv("SIMPLECHAT_UI_BASE_URL", "").rstrip("/") +STORAGE_STATE = os.getenv("SIMPLECHAT_UI_STORAGE_STATE", "") +SKIP_RESPONSE_CODES = {401, 403, 404} +
def _fulfill_json(route, payload, status=200): + route.fulfill( + status=status, + content_type="application/json", + body=json.dumps(payload), + ) + +def _show_file_preview(page, file_content, filename, is_table, flag_name): + page.evaluate( + """ + async ({ fileContent, filename, isTable, flagName }) => { + window[flagName] = false; + + const fileActionsModule = await import('/static/js/chat/chat-input-actions.js'); + fileActionsModule.showFileContentPopup( + fileContent, + filename, + isTable, + 'database', + null, + null, + ); + } + """, + { + "fileContent": file_content, + "filename": filename, + "isTable": is_table, + "flagName": flag_name, + }, + ) + +@pytest.mark.ui +def test_uploaded_file_preview_renders_untrusted_content_as_inert_text(playwright): + """Validate uploaded file previews do not turn attacker-controlled content into DOM.""" + if not BASE_URL: + pytest.skip("Set SIMPLECHAT_UI_BASE_URL to run this UI test.") + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip("Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.") + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={"width": 1440, "height": 900}, + ) + page = context.new_page() + + plain_text_payload = '<img src=x onerror="window.__filePreviewPlainXss = true"> plain text body' + csv_payload = 'column_a,column_b\n<img src=x onerror="window.__filePreviewCsvXss = true">,safe' + legacy_table_payload = ( + '<table><tr><td>' + '<img src=x onerror="window.__filePreviewLegacyXss = true">' + '</td></tr></table>' + ) + + page.route( + "**/api/user/settings", + lambda route: _fulfill_json(route, {"selected_agent": None, "settings": {"enable_agents": False}}), + ) + page.route("**/api/get_conversations", lambda route: _fulfill_json(route, {"conversations": []})) + + try: + response = page.goto(f"{BASE_URL}/chats", wait_until="domcontentloaded") + assert response is not None, "Expected a navigation response when loading /chats." + + if response.status in SKIP_RESPONSE_CODES: + pytest.skip(f"/chats returned HTTP {response.status} in this environment.") + + assert response.ok, f"Expected /chats to load successfully, got HTTP {response.status}." + page.wait_for_selector("#chatbox") + + _show_file_preview( + page, + plain_text_payload, + "plain-preview.txt", + False, + "__filePreviewPlainXss", + ) + expect(page.locator("#file-modal")).to_be_visible() + expect(page.locator("#file-content pre")).to_have_text(plain_text_payload) + expect(page.locator("#file-modal img[src='x']")).to_have_count(0) + expect(page.locator("#file-modal svg")).to_have_count(0) + expect(page.locator("#file-modal script")).to_have_count(0) + + _show_file_preview( + page, + csv_payload, + "table-preview.csv", + True, + "__filePreviewCsvXss", + ) + expect(page.locator("#file-content table")).to_be_visible() + expect(page.locator("#file-content")).to_contain_text( + '<img src=x onerror="window.__filePreviewCsvXss = true">' + ) + expect(page.locator("#file-modal img[src='x']")).to_have_count(0) + expect(page.locator("#file-modal svg")).to_have_count(0) + expect(page.locator("#file-modal script")).to_have_count(0) + + _show_file_preview( + page, + legacy_table_payload, + "legacy-table-preview.html", + True, + "__filePreviewLegacyXss", + ) + expect(page.locator("#file-content pre")).to_contain_text("<table>") + expect(page.locator("#file-modal img[src='x']")).to_have_count(0) + expect(page.locator("#file-modal svg")).to_have_count(0) + expect(page.locator("#file-modal script")).to_have_count(0) + + flags = page.evaluate( + """() => 
    ({ + plain: !!window.__filePreviewPlainXss, + csv: !!window.__filePreviewCsvXss, + legacy: !!window.__filePreviewLegacyXss, + })""" + ) + assert flags == {"plain": False, "csv": False, "legacy": False} + finally: + context.close() + browser.close() \ No newline at end of file diff --git a/ui_tests/test_web_search_notice_copy.py b/ui_tests/test_web_search_notice_copy.py new file mode 100644 index 000000000..7630d418f --- /dev/null +++ b/ui_tests/test_web_search_notice_copy.py @@ -0,0 +1,56 @@ +# test_web_search_notice_copy.py +""" +UI test for web search disclosure copy. +Version: 0.241.008 +Implemented in: 0.241.008 + +This test ensures the admin settings page shows the updated current-message-only +web-search disclosure copy for the user notice placeholder and the admin +consent modal warning text. +""" + +import os +import re +from pathlib import Path + +import pytest +from playwright.sync_api import expect + + +BASE_URL = os.getenv('SIMPLECHAT_UI_BASE_URL', '').rstrip('/') +STORAGE_STATE = os.getenv('SIMPLECHAT_UI_STORAGE_STATE', '') + + +@pytest.mark.ui +def test_admin_settings_shows_current_message_only_web_search_notice(playwright): + """Validate the admin-facing web-search disclosure copy.""" + if not BASE_URL: + pytest.skip('Set SIMPLECHAT_UI_BASE_URL to run this UI test.') + if not STORAGE_STATE or not Path(STORAGE_STATE).exists(): + pytest.skip('Set SIMPLECHAT_UI_STORAGE_STATE to a valid authenticated Playwright storage state file.') + + browser = playwright.chromium.launch() + context = browser.new_context( + storage_state=STORAGE_STATE, + viewport={'width': 1440, 'height': 900}, + ) + page = context.new_page() + + try: + page.goto(f'{BASE_URL}/admin/settings', wait_until='networkidle') + + notice_textarea = page.locator('#web_search_user_notice_text') + expect(notice_textarea).to_have_count(1) + expect(notice_textarea).to_have_attribute( + 'placeholder', + re.compile(r'Your current message will be sent to Microsoft Bing for web search', 
re.IGNORECASE), + ) + + consent_modal = page.locator('#web-search-consent-modal') + expect(consent_modal).to_have_count(1) + consent_text = consent_modal.text_content() or '' + assert 'Only the user\'s current message is sent for web search.' in consent_text + assert 'Users should avoid including sensitive content in any message that uses web search.' in consent_text + finally: + context.close() + browser.close() \ No newline at end of file