Skip to content

skyllwt/OmegaWiki

ΩmegaWiki Logo

ΩmegaWiki

Karpathy's LLM-Wiki Vision, Fully Realized

Your AI Research Platform — From Papers to Publications, Powered by Claude Code

From paper ingestion to publication — your research knowledge compounds, never decays.

License: MIT Python 3.9+ Skills Claude Code Bilingual

English | 中文


Team

ΩmegaWiki is built by DAIR Lab at Peking University — a fully agentic platform that automates the complete research pipeline, from knowledge ingestion to paper submission.

Weitong Qian

Weitong Qian
PKU
Undergraduate · 2023
Beicheng Xu

Beicheng Xu
PKU
Ph.D. · 2023
Zhongao Xie

Zhongao Xie
PKU
Undergraduate · 2025
Bowen Fan

Bowen Fan
PKU
Undergraduate · 2024
Guozheng Tang

Guozheng Tang
PKU
Undergraduate · 2024
Xinzhe Wu

Xinzhe Wu
PKU
Undergraduate · 2024
Jiale Chen

Jiale Chen
PKU
Undergraduate · 2024
Mingtian Yang

Mingtian Yang
PKU
Undergraduate · 2024

🆕 What's New

🎯 2026-05-12 · /discover from a venue — "what should I read first from ICLR 2024?"

Run /discover --venue iclr --year 2024 (or any conference/year) and get a personalized shortlist of papers from that venue, ranked by relevance to what's already in your wiki. Instead of scrolling a 7000-paper proceedings, you see the dozen that actually matter for your research direction, each with a rationale tied to topics and methods you already track. No new API keys, no ingest side-effects on your wiki — just a ranked reading list. Supports NeurIPS, ICLR, ICML, and other venues covered by Paper Copilot.

📰 2026-05-09 · Daily arXiv — fresh-paper recommendations, on demand or scheduled

Run /daily-arxiv for a one-off pass, or /daily-arxiv setup to schedule the same pipeline in GitHub Actions. The skill builds an evidence packet from arXiv + Semantic Scholar + DeepXiv, lets the LLM rank candidates against your wiki interests, and delivers a digest by e-mail. Explicit --mode auto-ingest calls /ingest for high-confidence picks; inform mode just notifies.

🌐 2026-05-06 · Knowledge Graph Visualization — browser + Obsidian

Your research graph now has two ways to explore:

  • Web UI — run python3 tools/serve.py, open http://localhost:8765/#/graph. Click any node to highlight its neighborhood via BFS, filter by entity type or edge category, double-click to open the full page in the Reader.
  • Obsidian — run /visualize --obsidian to generate a color-coded graph config, or /visualize --canvas to produce a force-layout Canvas with labeled semantic edges.

🔬 2026-05-06 · Methods — Reusable Techniques are Now First-Class

Architectures, training recipes, evaluation protocols, and other reusable techniques now live in wiki/methods/ as proper wiki entities — with their own pages, source-paper links, and parent/child method chains.


What is ΩmegaWiki?

Andrej Karpathy proposed LLM-Wiki: an LLM that builds and maintains a persistent, structured wiki from your sources — not a throwaway RAG answer, but compounding knowledge that grows smarter with every paper you feed it.

ΩmegaWiki takes that idea and runs the full distance. It's not just a wiki builder — it's a complete research lifecycle platform: from paper ingestion → knowledge graph → gap detection → idea generation → experiment design → paper writing → peer review response. All driven by 24 Claude Code skills, all centered on one wiki as the single source of truth.

Drop your .tex / .pdf files in a folder. Run one command. Get a fully cross-referenced knowledge base — and then use it to generate novel research ideas, design experiments, write papers, and respond to reviewers.

Why Wiki-Centric, Not RAG?

RAG ΩmegaWiki
Knowledge persistence Rediscovered on every query Compiled once, maintained forever
Structure Flat chunk store 9 typed entities with relationships
Cross-references None — chunks are isolated Bidirectional wikilinks + typed graph
Knowledge gaps Invisible Explicitly tracked, drive research
Failed experiments Lost First-class anti-repetition memory
Output Chat answers Papers, surveys, experiment plans, rebuttals
Compounding No — same cost every query Yes — each paper enriches the whole graph

Architecture

ΩmegaWiki Architecture

Every skill reads from and writes back to the wiki. Knowledge compounds — each new paper enriches the whole graph. Failed experiments aren't discarded; they become anti-repetition memory that prevents re-exploring dead ends.

Quick Start

Prerequisites: Python 3.9+, Node.js 18+

# 1. Clone
git clone https://github.com/skyllwt/OmegaWiki.git
cd OmegaWiki

# 2. Install Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 3. One-click setup
chmod +x setup.sh && ./setup.sh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1
# setup creates .venv for OmegaWiki
# the script does not keep your shell activated, but /init will use .venv automatically

# 4. Put your own papers in raw/papers/ (.tex or .pdf)
#    Optional: add intent notes to raw/notes/ and saved pages to raw/web/
#    /init and direct local /ingest will manage generated inputs under raw/discovered/ and raw/tmp/

# 5. Build your wiki
claude
# Then type: /init [your-research-topic]
Manual setup (Linux / macOS)
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env                 # Edit to add API keys
cp config/settings.local.json.example .claude/settings.local.json
Manual setup (Windows / PowerShell)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Copy-Item .env.example .env          # Edit to add API keys
Copy-Item config\settings.local.json.example .claude\settings.local.json

Note: native Windows is supported for the local pipeline. Remote-GPU experiments via /exp-run --env remote rely on ssh/rsync/screen and are best run from WSL2 or Linux/macOS.

API Keys

Key Required? How to get What it enables
ANTHROPIC_API_KEY Yes claude login (automatic) Powers all Claude Code skills
CLAUDE_CODE_OAUTH_TOKEN Optional claude setup-token GitHub Actions Claude Code auth for Pro/Max users
SEMANTIC_SCHOLAR_API_KEY Optional semanticscholar.org/product/api (free) Citation graph, paper search
DEEPXIV_TOKEN Optional setup.sh auto-registers Semantic search, TLDR, trending
LLM_API_KEY + LLM_BASE_URL + LLM_MODEL Optional Any OpenAI-compatible API Cross-model review; /daily-arxiv inform recommendations

Cross-model review: ΩmegaWiki uses a second LLM as an independent reviewer for ideas, experiments, and paper drafts. Works with any OpenAI-compatible API — DeepSeek, OpenAI, Qwen, OpenRouter, SiliconFlow, etc. If not configured, skills still work in Claude-only mode.

Daily arXiv Recommendations

/daily-arxiv runs a one-off fresh-paper recommendation pass even before automation is configured. To schedule the same pipeline in GitHub Actions, copy config/daily-arxiv.yml.example to config/daily-arxiv.yml, then run /daily-arxiv setup. The config stores non-secret preferences such as mode, categories, caps, and schedule; SMTP/API credentials stay in .env or GitHub Actions secrets. In CI inform mode, recommendations can use Claude Code auth (ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN) or the OpenAI-compatible LLM_* review model; auto-ingest still requires Claude Code.

See docs/daily-arxiv-deployment.md for the GitHub Actions setup checklist and symptom-keyed troubleshooting.

Sample digest
Sample /daily-arxiv digest

A real /daily-arxiv run: ranked recommendations with scores, rationales, wiki connections, and an auto-ingest section.

Skills

24 slash commands spanning the full research lifecycle:

Phase 0: Setup

Command What it does
/setup First-time configuration (API keys, language, dependencies)
/reset <scope> Destructive cleanup: wiki | raw | log | checkpoints | all

Phase 1: Knowledge Foundation

Command What it does
/prefill <domain> Optionally seed foundations/ with background knowledge
/init [topic] Bootstrap a full wiki from user raw sources plus optional discovery
/ingest <source> Parse a paper → wiki pages + cross-references
/discover Recommend ranked next-read papers from anchors, a topic, the current wiki, or a venue/year
/edit <request> Add/remove sources or update wiki content
/ask <question> Query the wiki, crystallize answers back
/check Health scan: broken links, missing cross-refs, consistency

Phase 2: Research Pipeline

Command What it does
/daily-arxiv Run/manage a daily arXiv recommendation feed (+ optional GitHub Actions scheduler)
/ideate Multi-phase idea generation from cross-topic connections
/novelty <idea> Multi-source novelty verification (web + S2 + wiki + review LLM)
/review <artifact> Cross-model adversarial review for any research artifact
/exp-design <idea> Idea-driven experiment + ablation design
/exp-run <experiment> Implement + deploy + monitor (local or remote GPU)
/exp-status Dashboard for running experiments; auto-collect results
/exp-eval <experiment> Verdict gate → auto-update the linked idea + graph
/refine <artifact> Multi-round: produce → review → fix → re-review

Phase 3: Writing & Submission

Command What it does
/survey Generate Related Work from wiki knowledge
/paper-plan <ideas> Outline from validated-idea graph + evidence matrix
/paper-draft <plan> Draft LaTeX + figures, section by section
/paper-compile <dir> Compile → PDF, auto-fix, verify page/anonymity
/research <direction> End-to-end orchestrator with human gates
/rebuttal <reviews> Parse reviewer comments → draft point-by-point responses

Wiki Structure

9 Entity Types

Type Directory Purpose
Paper papers/ Structured summary: problem/key idea/method/experiment+results/limitations + tldr/contribution_type/datasets
Concept concepts/ Cross-paper technical concept with variants, comparisons, definition, linked ideas
Topic topics/ Research direction map with SOTA tracker, key benchmarks, and open problems (split into known + methodological gaps)
Person people/ Researcher profile with research areas, recent work, and a researcher/team/organization type
Idea ideas/ Research idea with lifecycle, novelty argument & score, target venue
Experiment experiments/ Full record: hypothesis → setup → results → updates to the linked idea
Method methods/ Reusable, citable technique entity (cross-paper); links to source papers and parent/child methods
Summary Summary/ Domain-wide survey across topics
Foundation foundations/ Background knowledge (terminal: receives inward links, writes none)

Knowledge Graph

Semantic relationships are stored in graph/edges.jsonl; bibliographic paper citations are stored separately in graph/citations.jsonl.

Paper-paper semantic edges include same_problem_as, similar_method_to, complementary_to, builds_on, compares_against, improves_on, challenges, and surveys. Paper-concept edges use introduces_concept, uses_concept, extends_concept, and critiques_concept. Workflow edges (supports, contradicts, tested_by, invalidates, addresses_gap, inspired_by, derived_from) span experiments, ideas, methods, and concepts.

All pages use Obsidian [[wikilink]] format — open wiki/ in Obsidian for visual graph exploration.

Automation

GitHub Actions runs the /daily-arxiv recommendation pipeline at UTC 00:17 daily (08:17 Beijing time):

  1. Add SMTP secrets to repo Settings → Secrets when e-mail delivery is enabled: SMTP_HOST, SMTP_PORT, SMTP_USER, SMTP_PASSWORD, SMTP_FROM, DAILY_ARXIV_EMAIL_TO
  2. Optional inform-mode LLM recommendation: add ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN for Claude Code, or LLM_API_KEY, LLM_BASE_URL, and LLM_MODEL for any OpenAI-compatible provider
  3. .github/workflows/daily-arxiv.yml fetches arXiv, deduplicates against the wiki, builds a recommendation context, uploads artifacts, and sends the digest by SMTP

auto-ingest mode is explicit and requires Claude Code in CI, because plain API LLMs cannot invoke slash skills such as /ingest. Use manual dispatch with send_email=false for a dry run without SMTP secrets.

Project Structure

OmegaWiki/
├── CLAUDE.md                    # Runtime schema & rules
├── wiki/                        # Knowledge base (LLM-maintained)
│   ├── papers/                  #   Structured paper summaries
│   ├── concepts/                #   Cross-paper technical concepts
│   ├── topics/                  #   Research direction maps
│   ├── people/                  #   Researcher profiles
│   ├── ideas/                   #   Research ideas (with lifecycle)
│   ├── experiments/             #   Experiment records
│   ├── methods/                 #   Reusable cross-paper method entities
│   ├── Summary/                 #   Domain-wide surveys
│   ├── foundations/             #   Background knowledge (terminal pages)
│   ├── outputs/                 #   Generated artifacts
│   ├── graph/                   #   Auto-generated: edges, context, gaps
│   ├── index.md                 #   Content catalog
│   └── log.md                   #   Chronological log
├── raw/                         # Source materials
│   ├── papers/                  #   User-owned .tex / .pdf files
│   ├── discovered/              #   external papers from /init and explicit /daily-arxiv auto-ingest
│   ├── tmp/                     #   generated prepared local sidecars for /init and direct local /ingest
│   ├── notes/                   #   User-owned .md notes
│   └── web/                     #   User-owned HTML / Markdown
├── tools/                       # Deterministic Python helpers
│   ├── research_wiki.py         #   Wiki engine (20 CLI commands)
│   ├── init_discovery.py        #   /init prepare + plan + fetch helper
│   ├── discover.py              #   /discover candidate gathering, dedup, ranking
│   ├── lint.py                  #   Structural validation (10 checks)
│   ├── reset_wiki.py            #   Scoped destructive cleanup helper
│   ├── fetch_arxiv.py           #   arXiv RSS fetcher
│   ├── fetch_s2.py              #   Semantic Scholar API
│   ├── fetch_deepxiv.py         #   DeepXiv semantic search
│   ├── fetch_wikipedia.py       #   Wikipedia fetcher (used by /prefill)
│   └── remote.py                #   SSH ops for remote experiments
├── .claude/skills/              # 24 Claude Code skill definitions
├── i18n/                        # Bilingual: en/ (canonical) + zh/
├── config/                      # Configuration templates
├── mcp-servers/                 # Cross-model review server
└── .github/workflows/           # Daily arXiv cron

Bilingual Support

ΩmegaWiki ships in English and Chinese:

./setup.sh --lang en   # English (default)
./setup.sh --lang zh   # 中文

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

LLM API Configuration / 大模型 API 配置

ΩmegaWiki runs on Claude Code, which speaks the Anthropic API protocol. You can use Claude directly, or route Claude Code to any third-party provider that exposes an Anthropic-compatible endpoint by overriding a few environment variables.

ΩmegaWiki 基于 Claude Code,Claude Code 使用 Anthropic API 协议通信。你既可以直接使用 Claude,也可以通过覆盖几个环境变量,把 Claude Code 指向任意支持 Anthropic 协议的第三方供应商。

Option A — Native Claude / 原生 Claude

claude login   # OAuth, no manual config / OAuth 登录,无需手动配置

Option B — Third-party Anthropic-compatible API / 第三方 Anthropic 兼容 API

Pick a provider below, paste the snippet into ~/.claude/settings.json (or the project's .claude/settings.json), and replace the <...> placeholder with your own API key. Model names and extra options are taken from each provider's official Claude Code docs — if anything stops working (e.g. a model is renamed), check the provider's website.

从下方任选一个供应商,把对应配置粘贴到 ~/.claude/settings.json(或项目的 .claude/settings.json),并把 <...> 占位符替换为你自己的 API key。模型名与额外选项均来自各供应商官方 Claude Code 文档;若出现问题(例如模型改名),请查询对应官网。

MiMo (小米)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.xiaomimimo.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-mimo-key>",
    "ANTHROPIC_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "mimo-v2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "mimo-v2.5-pro",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "mimo-v2.5"
  }
}

DeepSeek

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-deepseek-key>",
    "ANTHROPIC_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-pro[1m]",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
    "CLAUDE_CODE_EFFORT_LEVEL": "max"
  }
}

Kimi (Moonshot)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.moonshot.ai/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-moonshot-key>",
    "ANTHROPIC_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_OPUS_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "kimi-k2.5",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "kimi-k2.5",
    "CLAUDE_CODE_SUBAGENT_MODEL": "kimi-k2.5",
    "ENABLE_TOOL_SEARCH": "false"
  }
}

GLM (Z.AI)

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "<your-zai-key>",
    "API_TIMEOUT_MS": "3000000"
  }
}

Z.AI applies a default server-side model mapping, so no explicit ANTHROPIC_MODEL is needed. Z.AI 默认在服务端做模型映射,无需显式设置 ANTHROPIC_MODEL

Skip the Claude Code onboarding / 跳过 Claude Code 初始引导

When using a third-party key (instead of claude login), Claude Code's first-run onboarding won't complete automatically. Create or edit .claude.json and mark it done:

使用第三方 key 时不会走 claude login,Claude Code 首次启动的引导不会自动完成。创建或编辑 .claude.json,手动标记引导已完成:

  • macOS / Linux: ~/.claude.json
  • Windows: <user-home>\.claude.json
{
  "hasCompletedOnboarding": true
}

Then run claude as usual. / 保存后正常运行 claude 即可。


Community / 交流群

WeChat Group QR Code

Scan to join the ΩmegaWiki WeChat group / 扫码加入微信交流群

Acknowledgments

  • Andrej Karpathy — for the LLM-Wiki concept that inspired this project
  • Claude Code — the AI agent runtime that powers ΩmegaWiki

Star History

Star History Chart

License

MIT — use it, fork it, build on it.


中文

ΩmegaWiki 是什么?

Andrej Karpathy 提出了 LLM-Wiki 概念:让 LLM 构建并维护一个持久的、结构化的 wiki,而不是一次性的 RAG 回答。知识持续积累,每一篇新论文都让整个知识图谱更强。

ΩmegaWiki 将这个理念完整实现。 它不仅是 wiki 构建器,更是完整的研究全流程平台:从论文摄入 → 知识图谱 → 缺口检测 → 想法生成 → 实验设计 → 论文写作 → 同行评审回复。24 个 Claude Code Skills 驱动,一个 wiki 作为唯一的知识中枢。

为什么选择 Wiki 而不是 RAG?

RAG ΩmegaWiki
知识持久性 每次查询都重新发现 编译一次,持续维护
结构 扁平的 chunk 存储 9 种实体类型 + 关系图
交叉引用 无 — chunk 彼此孤立 双向 wikilink + 类型化边
知识缺口 不可见 显式追踪,驱动研究方向
失败实验 丢失 一等公民,防止重复探索
输出 聊天回答 论文、综述、实验方案、审稿回复
复利效应 无 — 每次查询成本相同 有 — 每篇论文丰富整个图谱

快速开始

前置条件: Python 3.9+, Node.js 18+

git clone https://github.com/skyllwt/OmegaWiki.git && cd OmegaWiki

# 安装 Claude Code
npm install -g @anthropic-ai/claude-code
claude login

# 一键配置
chmod +x setup.sh && ./setup.sh --lang zh        # Linux / macOS
# Windows (PowerShell):
#   powershell -ExecutionPolicy Bypass -File .\setup.ps1 -Lang zh
# setup 会为 OmegaWiki 创建 .venv
# 脚本不会把你当前 shell 永久激活,但 /init 会自动使用 .venv

# 把你自己的论文放入 raw/papers/(.tex 或 .pdf)
# 可选:把意图笔记放入 raw/notes/,网页存档放入 raw/web/
# /init 与直接本地 /ingest 会自动管理 raw/discovered/ 与 raw/tmp/ 下的生成内容
# 启动 Claude Code
claude
# 输入:/init [你的研究方向]

Windows 用户:本地 pipeline 已原生支持。/exp-run --env remote 远程 GPU 实验依赖 ssh/rsync/screen,建议在 WSL2 或 Linux/macOS 下运行。

API Key 说明

Key 必须? 获取方式 用途
ANTHROPIC_API_KEY claude login 驱动所有 Skill
CLAUDE_CODE_OAUTH_TOKEN 可选 claude setup-token Pro/Max 用户的 GitHub Actions Claude Code auth
SEMANTIC_SCHOLAR_API_KEY 可选 semanticscholar.org(免费) 引用图谱、论文搜索
DEEPXIV_TOKEN 可选 setup.sh 自动注册 语义搜索、热门趋势
LLM_API_KEY + LLM_BASE_URL + LLM_MODEL 可选 任意 OpenAI 兼容 API 跨模型评审;/daily-arxiv inform 推荐

自动化

GitHub Actions 每天 UTC 00:17(北京时间 08:17)运行 /daily-arxiv 推荐 pipeline:拉取 arXiv、按 wiki 去重、构建 recommendation context、上传 artifacts,并可通过 SMTP 发送 digest 邮件。

启用邮件时,在 repo Settings → Secrets 添加:SMTP_HOSTSMTP_PORTSMTP_USERSMTP_PASSWORDSMTP_FROMDAILY_ARXIV_EMAIL_TO

CI inform mode 可使用 ANTHROPIC_API_KEYCLAUDE_CODE_OAUTH_TOKEN 启动 Claude Code,也可使用 LLM_API_KEYLLM_BASE_URLLLM_MODEL 接入任意 OpenAI-compatible provider。auto-ingest 是显式模式,并且需要 Claude Code,因为普通 API LLM 不能调用 /ingest 这类 slash skill。手动触发时可设置 send_email=false,用于无 SMTP secrets 的 dry run。

Digest 示例 / Sample digest
/daily-arxiv digest 示例

一次真实的 /daily-arxiv 运行结果:带分数、理由、wiki 关联以及 auto-ingest 区块的推荐 digest。

24 个 Skill 命令

命令 功能
/setup 首次配置(API key、语言、依赖)
/reset 按范围销毁性清理:wiki | raw | log | checkpoints | all
/prefill 可选地预填 foundations/ 背景知识
/init 基于用户 raw 素材并按需做外部发现来搭建 wiki
/ingest 消化论文,创建页面 + 交叉引用
/discover 从 anchor、topic、当前 wiki 或 venue/year 推荐排序后的下一批待读论文
/edit 增删 raw 或更新 wiki
/ask 对 wiki 提问
/check wiki 健康检查
/daily-arxiv 运行/管理每日 arXiv 推荐 feed(可选 CI 定时)
/ideate 跨方向构思研究 idea
/novelty 多源新颖性验证
/review 跨模型评审
/exp-design idea 驱动的实验设计
/exp-run 部署 + 监控实验
/exp-status 实验状态看板
/exp-eval 裁决 → 自动更新关联 idea
/refine 多轮迭代改进
/survey 生成 Related Work
/paper-plan idea 图谱 + 实验证据 → 论文提纲
/paper-draft 提纲 + wiki → LaTeX 草稿
/paper-compile 编译 → PDF,自动修复
/research 端到端研究编排器
/rebuttal 解析评审意见 → 逐条回复

Built with Claude Code

If this project helps your research, give it a ⭐

About

Karpathy's LLM-Wiki vision, fully realized — wiki-centric full-lifecycle AI research platform powered by Claude Code

Resources

License

Code of conduct

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors