Open Source Computer Command Framework — v1.4.0
Voice-controlled, local-first AI agent that runs on your machine with any LLM.
No cloud. No subscription. No data leaves your computer.
opencodec.org · AVA Digital LLC
☕ Support the Project · 🏢 Enterprise Setup
CODEC turns your computer into a voice-controlled AI workstation. Press a key or say "Hey CODEC" — CODEC listens, thinks (using any LLM you choose), and acts: opening apps, drafting messages, reading your screen, analyzing documents, researching topics, writing code, and anything else you can describe.
A private, open-source alternative to Siri and Alexa that actually controls your computer — and writes its own plugins.
Built for macOS. Linux support planned.
"Research our top 3 competitors and write a strategic analysis"
→ CODEC Agents search the web, your LLM synthesises findings,
formatted Google Doc delivered in 2 minutes
"I'm getting a 401 on this endpoint — read my screen and fix it"
→ Screenshots VS Code, identifies the missing auth header,
writes the fix, pastes it directly into your editor
"Go through my inbox, categorise everything, draft replies to urgent ones"
→ Email Handler scans Gmail, prioritises by urgency,
drafts contextual replies for your review
"Plan my Tokyo trip March 15 to 22 — flights, hotels, itinerary"
→ Trip Planner researches options, builds a day-by-day itinerary,
adds events to Calendar, saves to Docs
"Give me my morning briefing"
→ Checks calendar, reads urgent emails, pulls weather and news,
delivers a spoken summary — completely hands-free
"Build a REST API for user auth in Python and save it as a skill"
→ Vibe Code opens, CODEC writes the full API with tests, runs it,
Skill Forge saves it as a reusable CODEC skill
"Hey CODEC, what time is it?" → Instant answer via skill
"Set a timer for 10 minutes" → Timer with voice alert
"Open YouTube and search for cooking recipes" → Chrome opens and navigates
"What's on my calendar today?" → Reads Google Calendar
"Translate this to English" → Translates selected text
36 skills · 8 right-click services · 5 AI agent crews · 250K context · FTS5 memory · MIT licensed
git clone https://github.com/AVADSA25/codec.git
cd codec
./install.shThat's it. The installer checks dependencies, installs packages, copies skills, and launches the setup wizard in one shot.
Expand for step-by-step
git clone https://github.com/AVADSA25/codec.git
cd codecpip3 install pynput sounddevice soundfile numpy requests simple-term-menu
brew install soxpython3 setup_codec.pyThe wizard walks you through everything in 9 steps: LLM provider, voice engine, speech-to-text, keyboard shortcuts, wake word, features, skills, and phone dashboard.
python3 codec.pyPress your toggle key to activate (default F13), then use your configured voice key (default F18) to speak commands.
npm install -g pm2
pm2 start "python3 codec.py" --name codec
pm2 save && pm2 startupmacOS permissions required: Grant Accessibility and Input Monitoring in System Settings > Privacy & Security.
# TTS — Kokoro 82M, optimized for Apple Silicon
pip3 install mlx-audio misaki num2words phonemizer-fork spacy
python3 -m spacy download en_core_web_sm
python3 -m mlx_audio.server --host 0.0.0.0 --port 8085
# STT — Whisper Large v3 Turbo via MLX
pip3 install mlx-whisper fastapi uvicorn
python3 whisper_server.pyYour always-on AI command layer for macOS.
- F13 — toggle CODEC on/off (with sound effects)
- F18 (hold) — hold to record, release to send voice command
- F16 — text input dialog
**(double-tap) — screenshot your screen and ask Q about it++(double-tap) — open file picker for document analysis--(double-tap) — start CODEC Voice live call- "Hey CODEC" — always-on wake word, hands-free from across the room (customizable)
- Draft & Paste — reads the active screen, understands the conversation context, writes a natural reply, and pastes it instantly into Slack, WhatsApp, iMessage, email — any app
- 36 native skills — fire instantly without calling the LLM (calculator, calendar, weather, music, web search, and more)
- Command Preview UI — every bash and AppleScript command shows an Allow/Deny popup before executing
- FTS5 Memory Search — full-text search over all conversations via SQLite FTS5 BM25 ranking. Say "search my memory for X" to recall anything
All shortcuts are configurable. CODEC's default assistant name is C — you can rename it to anything in your config.
A free, open-source SuperWhisper replacement.
- Hold Right CMD, speak naturally, release — text is transcribed and pasted directly into whatever app is active
- Whisper transcription → Qwen refinement for cleaner message output
- Works in any text field: email, Slack, Notes, VS Code, browser, terminal
- No popup, no modal — instant paste
- Powered by local Whisper STT. Zero latency, zero cloud
Select any text in any app, right-click, and choose from eight CODEC services:
CODEC Proofread → Fixes spelling, grammar, punctuation. Replaces text instantly.
CODEC Elevate → Rewrites to be more polished and professional. Replaces text.
CODEC Explain → Explains in simple terms. Opens in Terminal.
CODEC Prompt → Rewrites as an optimized LLM prompt. Replaces text.
CODEC Translate → Translates any language to English. Opens in Terminal.
CODEC Reply → Reads the message, writes a natural reply. Add :direction for intent.
CODEC Read Aloud → Speaks the selected text via Kokoro TTS. Up to 2000 chars.
CODEC Save → Saves selected text to Google Keep or local notes with timestamp.
Works system-wide via macOS Services. Built for accessibility — particularly useful for dyslexia and ADHD. Your AI proofreader and translator is always one right-click away.
Full AI chat at /chat on your dashboard.
- 250,000 token context window
- File upload with PDF extraction, images via vision model
- Drag and drop, microphone input, conversation history sidebar
- CODEC Agents — 5 pre-built multi-agent crews (no external dependencies):
- Deep Research → multi-step web research → styled Google Doc with Pexels images
- Daily Briefing → calendar + email + weather + news in one report
- Trip Planner → web research → itinerary → Google Calendar events
- Competitor Analysis → market research → strategic Google Doc report
- Email Handler → reads Gmail → categorizes → drafts smart replies
- Custom Agent Builder — build your own agent from the chat UI: name it, write its system prompt, pick tools from all 36 skills, set max iterations. Save and reuse.
Split-screen coding environment at /vibe.
- Monaco Editor (VS Code engine) with syntax highlighting and language detection
- AI chat sidebar — describe what to build, Q writes and applies the code automatically
- Skill Forge — 3-mode skill creator:
- Paste Code — paste any Python/JS → converted to CODEC skill
- GitHub URL — paste raw GitHub/Gist URL → fetched and converted automatically
- Describe — plain English description → Q generates skill from scratch
- Live Preview — HTML/CSS/JS renders in embedded iframe
- Run + Stop buttons — execute code, cancel generation
- Save as Skill — one click installs to
~/.codec/skills/ - Project history sidebar with session persistence
Real-time voice-to-voice conversation at /voice.
- Full-duplex WebSocket audio pipeline — no external dependencies
- Two-task architecture: audio receiver + pipeline run concurrently
- Interruption support: start speaking and Q stops mid-sentence immediately
- Skill dispatch: Q recognizes requests in real-time and calls skills (calendar, web search, weather, etc.)
- All 36 skills accessible by voice during calls
- Complete transcript with streaming display
- Conversation saved to shared FTS5 memory after each call
- Double-tap
--from dashboard to auto-connect - Built from scratch — our own WebSocket pipeline
Control your Mac from your phone anywhere.
- Text commands and voice input with voice replies
- Screenshot your Mac display live
- Upload PDFs and images for AI analysis
- Deep Chat, Vibe Code, and Voice Call access from any device
- Chat history and audit log
- Dark and light mode, Add to Home Screen
- FastAPI backend, vanilla HTML frontend — no React, no npm
- Cloudflare Tunnel + Zero Trust email authentication
| Shortcut | Action |
|---|---|
| F13 | Toggle CODEC ON/OFF (with sound effects) |
| F18 (hold) | Record voice, release to send |
| F16 | Text input dialog |
| Right CMD (hold) | CODEC Dictate — speak and paste anywhere |
** (double-tap) |
Screenshot and ask about screen |
++ (double-tap) |
Open file picker for document analysis |
-- (double-tap) |
Start CODEC Voice live call |
| Right-click Services | Text Assistant (6 modes) |
| Hey CODEC | Wake word — hands-free activation |
Skills fire instantly without calling the LLM. Used directly in voice, text, and agent crews.
| Skill | What it does |
|---|---|
| Calculator | Quick math |
| Weather | Current weather by city |
| Time and Date | Current time and date |
| System Info | CPU, disk, memory stats |
| Web Search | DuckDuckGo instant answers |
| Translate | Multi-language translation |
| Apple Notes | Save and read notes |
| Timer | Set timers with voice alerts |
| Volume | Volume control |
| Brightness | Screen brightness control |
| Apple Reminders | Add to Apple Reminders |
| Music | Control Spotify and Apple Music |
| Clipboard | Clipboard history |
| App Switch | Switch apps by name |
| Create Skill | Write new skills with natural language |
| Memory Search | Full-text search over all conversations |
| Skill Forge | Convert any code to a CODEC skill |
| Network Info | IP, WiFi, connection details |
| Process Manager | List and kill processes |
| Terminal | Run shell commands by voice |
| Screenshot Text | Read text from your screen |
| File Search | Find files by name |
| Chrome Open/Close | Open and close Chrome |
| Chrome Search | Google search via Chrome |
| Chrome Read | Read current tab content |
| Chrome Tabs | Switch and list tabs |
| Google Calendar | Check schedule and create events |
| Google Gmail | Check inbox and search emails |
| Google Drive | Search and list files |
| Google Docs | Create and read documents |
| Google Sheets | Read and write spreadsheets |
| Google Slides | Create presentations |
| Google Tasks | Manage task lists |
| Google Keep | Create and manage notes |
| Webhook / Lucy | Delegate tasks to external AI agents |
| QR Code | Generate QR codes |
Direct access to Calendar, Gmail, Drive, Docs, Sheets, Slides, Tasks, and Keep. Pure Python with full read and write access. One-time OAuth setup.
CODEC ships with a fully local ReAct multi-agent framework. No external dependencies, no rate limits, no API keys beyond your LLM.
Each crew is a sequence of specialized agents with curated tool access:
Deep Research:
Researcher (web_search + web_fetch, 5 calls) →
Writer (google_docs_create, 2 calls) →
Outputs: styled Google Doc with Pexels images
Daily Briefing:
Scout (google_calendar + weather + web_search, 4 calls) →
Outputs: text briefing read aloud
Trip Planner:
Researcher (web_search + web_fetch, 5 calls) →
Planner (google_docs_create + google_calendar, 2 calls) →
Outputs: Google Doc itinerary + calendar events
Competitor Analysis:
Analyst (web_search + web_fetch, 5 calls) →
Writer (google_docs_create, 2 calls) →
Outputs: styled competitor report in Google Docs
Email Handler:
Handler (google_gmail + web_search, 4 calls) →
Outputs: summary + draft replies
Custom agents: name, role prompt, tool selection, iterations — built and saved from the chat UI.
- Command Preview UI — every bash and AppleScript command shows a popup with Allow/Deny before executing
- Dangerous command blocker — rm -rf, sudo, shutdown, killall and 30+ patterns require explicit confirmation
- Full audit log — every action logged to
~/.codec/audit.logwith timestamps - Structured logging — Python
loggingmodule with[HH:MM:SS] [CODEC]format - Wake word noise filter — rejects TV, music, and background audio false triggers
- 8-step execution cap on agent tasks
- Skill isolation — common tasks skip the LLM entirely
- Cloudflare Zero Trust — email authentication on phone dashboard
- Code sandbox — Vibe Code has 30-second timeout and blocks dangerous commands
"""My Custom Skill"""
SKILL_NAME = "btc_price"
SKILL_TRIGGERS = ["bitcoin price", "btc price", "check bitcoin"]
SKILL_DESCRIPTION = "Check current Bitcoin price"
import requests
def run(task, app="", ctx=""):
r = requests.get("https://api.coindesk.com/v1/bpi/currentprice.json", timeout=10)
price = r.json()["bpi"]["USD"]["rate"]
return f"Bitcoin is currently ${price}"Drop in ~/.codec/skills/ or use Skill Forge in Vibe Code to convert any existing code. Or say "create a skill that does X" and CODEC writes one for you.
python3 codec_dashboard.pyLocal: http://localhost:8090
Remote via Cloudflare Tunnel:
- brew install cloudflared
- cloudflared tunnel create my-codec
- Route DNS and add to config.yml
- Add email auth in Cloudflare Zero Trust
- On phone: open URL, Add to Home Screen
codec.py — Main agent (voice + text + wake word)
codec_watcher.py — Draft and paste agent (CODEC Dictate)
codec_textassist.py — Right-click text assistant (8 services)
codec_dashboard.py — Dashboard server (PWA + APIs)
codec_dashboard.html — Phone dashboard UI
codec_chat.html — Deep Chat + Agent Crews
codec_vibe.html — Vibe Code IDE + Skill Forge
codec_voice.html — CODEC Voice live call UI
codec_voice.py — Voice WebSocket pipeline (v2, two-task)
codec_agents.py — CODEC Agents multi-agent framework
codec_memory.py — SQLite FTS5 memory search
codec_gdocs.py — Styled Google Docs creator
setup_codec.py — 8-step interactive installer
whisper_server.py — Local Whisper STT server
reauth_google.py — Google OAuth helper
skills/ — 36 skill plugins
| Provider | Setup |
|---|---|
| Ollama | ollama serve, select in wizard |
| LM Studio | Start server, point to localhost:1234 |
| MLX Server | Apple Silicon optimized, point to localhost:8081 |
| OpenAI | Paste API key |
| Anthropic | Paste API key |
| Google Gemini | Paste API key (free tier works) |
| Any OpenAI-compatible | Enter base URL and model |
- macOS (Ventura or later)
- Python 3.10+
- sox (
brew install sox) - An LLM (local or cloud)
- Optional: Whisper for STT, Kokoro for TTS
- SwiftUI native macOS overlay
- Long-term vector memory
- Vibe Code inline editing and point-click preview elements
- Linux port
- Installable .dmg
- Skill marketplace
- AXUIElement accessibility API integration
MIT licensed. Use it however you want. Found a bug? Open an issue. Built a skill? Submit a PR.
CODEC is free and open source. If it saves you time or you want to see it grow:
☕ PayPal: ava.dsa25@proton.me
⭐ Star this repo — it helps others discover the project.
Need AI infrastructure for your business?
AVA Digital LLC deploys private, local AI systems. CODEC setup, custom skills, multi-machine networks, voice pipelines, and ongoing support.
📧 mikarina@avadigital.ai · 🌐 avadigital.ai · 🌐 opencodec.org
MIT License
Built by AVA Digital LLC · opencodec.org
Powered by: Ollama · Kokoro TTS · Whisper · MLX · CODEC Voice · CODEC Agents
