
feat(coder-templates/ai-agent-generic): add opt-in virtual desktop for computer_use #55

Open
ausbru87 wants to merge 1 commit into main from feat/ai-agent-generic-desktop

@ausbru87

Collaborator

Summary

Enables the computer_use subagent (spawn_agent with type=computer_use) in ai-agent-generic workspaces by adding an enable_desktop parameter that installs the portabledesktop registry module. The parameter defaults to on, so in practice it works as an off-switch rather than a strict opt-in.

Why

computer_use drives a graphical desktop through the Coder agent's built-in desktop endpoints (/api/v0/desktop/action). The agent implements those endpoints in agent/x/agentdesktop by shelling out to a self-contained portabledesktop binary, which it resolves from the agent script bin dir. Without that binary the desktop session fails to start, so computer_use cannot run in workspaces from this template. The official way to provide it is the portabledesktop module.

This is not KasmVNC/XFCE/xdotool: the binary bundles its own Xvnc + window manager.

What changed

  • main.tf: new enable_desktop bool parameter (default true) and a conditional registry.coder.com/coder/portabledesktop/coder 0.1.0 module.
  • README.md: Virtual desktop section, parameter row, tooling note, egress/prereq tradeoffs.
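
For reviewers who have not opened the diff, the main.tf change is roughly of the following shape. This is a minimal sketch, not the exact code in the PR: the resource name coder_agent.main, the agent_id input, and the display_name/description strings are assumptions based on how other registry.coder.com modules are usually wired, and only the parameter name, its default, the module source, and the version come from this description.

```hcl
# Sketch only. Assumes the template already declares a coder_agent
# named "main"; that name is hypothetical.

data "coder_parameter" "enable_desktop" {
  name         = "enable_desktop"
  display_name = "Enable virtual desktop"             # assumed label
  description  = "Install portabledesktop so computer_use can run." # assumed text
  type         = "bool"
  default      = "true" # on by default, per this PR
  mutable      = true
}

module "portabledesktop" {
  # coder_parameter values are strings, so compare against "true".
  count   = data.coder_parameter.enable_desktop.value == "true" ? 1 : 0
  source  = "registry.coder.com/coder/portabledesktop/coder"
  version = "0.1.0"

  # Assumed input: most Coder registry modules take the agent ID.
  agent_id = coder_agent.main.id
}
```

With the parameter set to false the module count is 0, so nothing is downloaded at startup and the workspace behaves as it did before this PR.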

The hardened posture is preserved: the binary installs without sudo, so allow_privilege_escalation stays false and the enterprise-base image is unchanged. The desktop process only starts when a computer_use action first connects.

Tradeoff (flagged for review)

This template is the default fallback for all non-language-specific agent tasks. enable_desktop=true adds a one-time github.com release download at startup for every generic workspace (the same egress class as the existing code-server module). The parameter is the off-switch for a tight-egress/air-gapped boundary, where the module url can instead point at an in-boundary mirror with a pinned sha256. Defaulted on so computer_use works out of the box; happy to flip to off-by-default if preferred.

computer_use also needs platform-layer config that is not in this template: the Virtual Desktop experiment and a computer-use provider key. Both are already enabled on the demo deployment (provider = anthropic).

Verification

Investigation and test evidence
  • Traced the path through coder/coder: coderd/x/chatd/chattool/computeruse.go -> workspacesdk.AgentConn.ExecuteDesktopAction -> agent agent/x/agentdesktop -> portabledesktop CLI (exec.LookPath / script bin dir).
  • terraform fmt -check, init, and validate all pass; module resolves to portabledesktop 0.1.0. No emdash/endash.
  • Pushed live to org coder (active version smiling_satterfield57); plan resolves enable_desktop.
  • Ran the actual portabledesktop-linux-x64 binary (statically linked, 54MB) inside the exact ECR enterprise-base:ubuntu-noble-20260601 image as uid 1000: up --json started Xvnc + openbox on vncPort 5901 with no errors, and screenshot --json returned a valid PNG. This is precisely what the agent's desktop endpoints invoke.

Note: ai-agent-generic requires a GitLab login, so a new workspace from it only finishes building after the owner completes the GitLab OAuth flow (pre-existing behavior, unchanged by this PR).

Generated by Coder Agents, on behalf of @ausbru87.

…r computer_use

The computer_use subagent (spawn_agent type=computer_use) drives a graphical
desktop through the Coder agent's desktop endpoints, which shell out to a
self-contained portabledesktop binary resolved from the agent script bin dir.
Without that binary the desktop session fails to start, so computer_use cannot
run in workspaces from this template.

Add an enable_desktop parameter (default on) that installs the
registry.coder.com/coder/portabledesktop/coder module. The binary installs
without sudo, so this keeps the hardened posture: privilege escalation stays
disabled and the enterprise-base image is unchanged. The desktop process only
starts when a computer_use action first connects, so at rest the only added
cost is a github.com release download at startup, the same egress class as the
existing code-server module. The parameter is the off-switch for a tight-egress
or air-gapped boundary, where the module url can instead point at an in-boundary
mirror with a pinned sha256.

computer_use also requires platform-layer config not in this template: the
Virtual Desktop experiment and a computer-use provider key.

Generated by Coder Agents.