[POC] Add AI submission audit system #452

Draft
msaroufim wants to merge 2 commits into main from submission-audit

@msaroufim
Member

Summary

  • Adds automated LLM-based auditing of submissions for cheating (reward hacking, hardcoded outputs, eval bypasses)
  • Uses gpt-4o-mini via OpenRouter; runs as fire-and-forget after each submission and never blocks or breaks the submission flow
  • If OPENROUTER_API_KEY is unset, auditing is silently skipped (graceful degradation)
  • Admins review flagged submissions via two new API endpoints
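
A minimal sketch of the fire-and-forget pattern with the graceful skip described above. The names `audit_submission`, `schedule_audit`, and `_background_tasks` are hypothetical and not taken from the PR, and the LLM call is replaced by a stand-in coroutine.

```python
import asyncio
import os
from typing import Optional

# Hypothetical names throughout; the PR's real identifiers may differ.
_background_tasks = set()


async def audit_submission(submission_id: int) -> str:
    """Stand-in for the LLM call; the real audit would contact OpenRouter."""
    await asyncio.sleep(0)  # placeholder for network I/O
    return f"audited {submission_id}"


def _on_done(task: asyncio.Task) -> None:
    """Drop the task reference and log any error instead of raising it."""
    _background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        print(f"audit failed: {task.exception()!r}")


def schedule_audit(submission_id: int) -> Optional[asyncio.Task]:
    """Start an audit in the background, or skip it if no API key is set."""
    if not os.environ.get("OPENROUTER_API_KEY"):
        return None  # graceful degradation: auditing is skipped
    task = asyncio.create_task(audit_submission(submission_id))
    _background_tasks.add(task)  # keep a strong reference until the task finishes
    task.add_done_callback(_on_done)
    return task
```

Holding the task in a set matters because the event loop keeps only weak references to tasks, so a task with no other reference can be garbage-collected before it finishes.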

Changes

| File | What |
| --- | --- |
| src/migrations/20260301_01_audit-add-submission-audit.py | New leaderboard.submission_audit table |
| src/libkernelbot/audit.py | New module: sends reference code + submission to OpenRouter, stores verdict |
| src/libkernelbot/leaderboard_db.py | 4 new DB methods (create audit, get audits, mark reviewed, get task by id) |
| src/libkernelbot/backend.py | Fire-and-forget asyncio.create_task after mark_submission_done |
| src/kernelbot/api/main.py | GET /admin/audits and POST /admin/audits/{id}/reviewed |
| pyproject.toml | Add openai dependency (OpenAI SDK used as OpenRouter client) |
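
To make the audit storage concrete, here is a rough sketch of three of the DB methods listed for leaderboard_db.py (create audit, get audits, mark reviewed). It uses an in-memory `sqlite3` database instead of the project's real database so that it runs on its own. The column names and function signatures are assumptions for illustration, not the PR's actual schema.

```python
import sqlite3

# Assumed schema: these column names are guesses for illustration only.
SCHEMA = """
CREATE TABLE submission_audit (
    id            INTEGER PRIMARY KEY,
    submission_id INTEGER NOT NULL,
    is_cheating   INTEGER NOT NULL,
    explanation   TEXT    NOT NULL,
    reviewed      INTEGER NOT NULL DEFAULT 0
)
"""


def create_audit(db, submission_id, is_cheating, explanation):
    """Store one verdict and return the new row id."""
    cur = db.execute(
        "INSERT INTO submission_audit (submission_id, is_cheating, explanation) "
        "VALUES (?, ?, ?)",
        (submission_id, int(is_cheating), explanation),
    )
    db.commit()
    return cur.lastrowid


def get_audits(db, only_unreviewed=True):
    """Return flagged audits, optionally limited to those not yet reviewed."""
    query = (
        "SELECT id, submission_id, explanation "
        "FROM submission_audit WHERE is_cheating = 1"
    )
    if only_unreviewed:
        query += " AND reviewed = 0"
    return db.execute(query + " ORDER BY id").fetchall()


def mark_reviewed(db, audit_id):
    """Mark an audit as reviewed; returns True if the row existed."""
    cur = db.execute(
        "UPDATE submission_audit SET reviewed = 1 WHERE id = ?", (audit_id,)
    )
    db.commit()
    return cur.rowcount == 1
```

Usage follows the admin flow: `create_audit` runs after the LLM returns a verdict, `get_audits` would back GET /admin/audits, and `mark_reviewed` would back POST /admin/audits/{id}/reviewed.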

What this does NOT do

  • No score-based filtering — audits every completed submission (can add threshold later)
  • No retry logic — if OpenRouter call fails, audit is just skipped
  • No Discord integration — admin reviews audits via API only
  • No batch/backfill — only audits new submissions going forward

Test plan

  • Run migration: yoyo apply src/migrations -d $DATABASE_URL
  • Run existing tests: uv run pytest tests/ -v (all 80 previously passing tests still pass)
  • Verify graceful skip: without OPENROUTER_API_KEY set, submissions work normally with no errors
  • Submit a kernel with OPENROUTER_API_KEY set, then check that leaderboard.submission_audit has a new row
  • curl -H "Authorization: Bearer $ADMIN_TOKEN" localhost:8000/admin/audits
  • curl -X POST -H "Authorization: Bearer $ADMIN_TOKEN" localhost:8000/admin/audits/1/reviewed
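
The two curl checks above can also be scripted. The sketch below builds the same requests with Python's standard library. The helper `admin_request` is a hypothetical name, and the host, port, and `ADMIN_TOKEN` variable are taken from the curl commands. The requests are only constructed here, since sending them needs a running server.

```python
import os
import urllib.request

BASE_URL = "http://localhost:8000"  # same host and port as the curl commands


def admin_request(path: str, method: str = "GET") -> urllib.request.Request:
    """Build an authenticated request for an admin endpoint without sending it."""
    token = os.environ.get("ADMIN_TOKEN", "")
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {token}"},
    )


list_audits = admin_request("/admin/audits")
review_audit = admin_request("/admin/audits/1/reviewed", method="POST")

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(list_audits) as resp:
#     print(resp.read().decode())
```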

Automatically audit submissions for cheating using an LLM (gpt-4o-mini
via OpenRouter). Runs as fire-and-forget after each submission completes.
Admins can review flagged submissions via API.
@github-actions

github-actions bot commented Mar 1, 2026

Coverage report

| File | Lines missing |
| --- | --- |
| src/libkernelbot/audit.py | 48, 51, 61-126 |
| src/libkernelbot/backend.py | 248-249 |
| src/libkernelbot/leaderboard_db.py | 1242-1248 |
| src/libkernelbot/utils.py | |

This report was generated by python-coverage-comment-action

@msaroufim msaroufim changed the title Add AI submission audit system [POC] Add AI submission audit system Mar 2, 2026
@msaroufim msaroufim marked this pull request as draft March 3, 2026 03:11