nschllr/arvo_exploitability_gathering
Exploitability Ranking Pipeline

This directory contains arvo_exploitability.py, a CLI for ranking ARVO vulnerabilities by likely exploitability.

The pipeline has three stages:

  1. prefilter: rank all ASAN cases from metadata and select a shortlist.
  2. run-asan: run docker run -t n132/arvo:<id>-vul arvo for shortlisted bugs and parse the ASAN output.
  3. finalize: combine metadata and parsed ASAN features into the final ranked set.
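The exact parsing logic lives in arvo_exploitability.py; as an illustration only, the kind of extraction stage 2 performs can be sketched by pulling the bug category and first memory access out of two standard AddressSanitizer report lines. The regexes and function below are a hedged sketch, not the script's actual implementation:

```python
import re

# Illustrative patterns for two common AddressSanitizer report lines.
SUMMARY_RE = re.compile(r"SUMMARY: AddressSanitizer: (\S+)")
ACCESS_RE = re.compile(r"(READ|WRITE) of size (\d+)")

def parse_asan_log(text: str) -> dict:
    """Extract the crash category and first access (kind, size) from an ASAN log."""
    result = {}
    m = SUMMARY_RE.search(text)
    if m:
        result["category"] = m.group(1)
    m = ACCESS_RE.search(text)
    if m:
        result["access_kind"] = m.group(1)
        result["access_size"] = int(m.group(2))
    return result

log = """==1==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000014
WRITE of size 4 at 0x602000000014 thread T0
SUMMARY: AddressSanitizer: heap-buffer-overflow /src/foo.c:42 in bar"""
print(parse_asan_log(log))
# -> {'category': 'heap-buffer-overflow', 'access_kind': 'WRITE', 'access_size': 4}
```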

Setup

The script needs the ARVO-Meta dataset. Running start.sh clones it automatically, or clone it manually:

git clone https://github.com/n132/ARVO-Meta

By default the script expects ARVO-Meta/ in the repository root. To use a different location, pass --arvo-dir:

python3 arvo_exploitability.py all --arvo-dir /path/to/ARVO-Meta ...

--meta-dir and --patch-dir can still be set explicitly to override the paths derived from --arvo-dir.

Requirements

  • Run commands from the repository root.
  • python3 must be available.
  • Docker must be installed and able to pull and run n132/arvo:* images.

Fresh Full Run

Start from scratch and produce the final top 15:

rm -rf output/arvo_exploitability
python3 arvo_exploitability.py all \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --final-n 15 \
  --bundle-n 5 \
  --selection-mode diverse \
  --jobs 2 \
  --timeout 900

This runs all three stages:

  • metadata ranking for all ASAN cases (including wildcard-access and crash-revision extraction)
  • Docker execution until 50 successful ASAN-parsed candidates are collected
  • final ranking to 15 vulnerabilities with duplicate grouping and bundle generation

Resume After Interrupted Docker Runs

If run-asan was interrupted, resume it with the same output directory:

python3 arvo_exploitability.py run-asan \
  --input output/arvo_exploitability/prefilter_ranked.jsonl \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --jobs 2 \
  --timeout 900

Behavior:

  • Existing output/.../asan_logs/<id>.log files are reused.
  • Bugs with cached logs are not rerun.
  • Only missing logs are executed again.
  • Use --force only if you want to rerun all Docker executions and overwrite cached logs.
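The resume behavior above amounts to a per-bug existence check on the cached log file. A minimal sketch, assuming the asan_logs/<id>.log layout described here (the script's real logic may differ):

```python
import tempfile
from pathlib import Path

def needs_run(log_dir: Path, bug_id: str, force: bool = False) -> bool:
    """Return True when this bug's Docker run must be (re)executed.

    A cached log at <log_dir>/<bug_id>.log is trusted unless force is set.
    """
    log_path = log_dir / f"{bug_id}.log"
    return force or not log_path.exists()

# Demo: a cached log suppresses re-execution unless --force is given.
with tempfile.TemporaryDirectory() as d:
    log_dir = Path(d)
    (log_dir / "3630.log").write_text("cached ASAN output")
    print(needs_run(log_dir, "3630"))               # cached -> False
    print(needs_run(log_dir, "9388"))               # missing -> True
    print(needs_run(log_dir, "3630", force=True))   # forced -> True
```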

Run Final Ranking Only

If asan_results.jsonl already exists, generate the final files without rerunning Docker:

python3 arvo_exploitability.py finalize \
  --prefilter-input output/arvo_exploitability/prefilter_ranked.jsonl \
  --asan-input output/arvo_exploitability/asan_results.jsonl \
  --output-dir output/arvo_exploitability \
  --final-n 15 \
  --bundle-n 5 \
  --selection-mode diverse

This writes:

  • output/arvo_exploitability/final_top15.jsonl
  • output/arvo_exploitability/final_top15.csv
  • output/arvo_exploitability/final_vulnerabilities.json
  • output/arvo_exploitability/final_bundles_top5.jsonl and .csv
  • output/arvo_exploitability/final_mixed.jsonl and final_mixed.json
  • output/arvo_exploitability/duplicate_targets.jsonl and .csv
  • output/arvo_exploitability/duplicate_target_revisions.jsonl and .csv

Individual Stages

Run only the metadata prefilter:

python3 arvo_exploitability.py prefilter \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50

Run only the Docker/ASAN stage:

python3 arvo_exploitability.py run-asan \
  --input output/arvo_exploitability/prefilter_ranked.jsonl \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --jobs 2 \
  --timeout 900

Output Files

The pipeline writes the following artifacts under the chosen --output-dir:

  • prefilter_ranked.jsonl and prefilter_ranked.csv
    • full metadata ranking for all ASAN cases
  • prefilter_top50.jsonl and prefilter_top50.csv
    • metadata shortlist used for deeper analysis
  • asan_logs/
    • raw stdout from each Docker run, one log per bug ID
  • asan_attempts.jsonl and asan_attempts.csv
    • every attempted Docker run, including failures and cached results
  • asan_results.jsonl and asan_results.csv
    • successful parsed ASAN reports
  • asan_failures.jsonl and asan_failures.csv
    • timeouts, parse failures, and Docker execution failures
  • duplicate_targets.jsonl and duplicate_targets.csv
    • bugs that share the same project/fuzz_target
  • duplicate_target_revisions.jsonl and duplicate_target_revisions.csv
    • bugs grouped by project/fuzz_target/crash_revision
  • final_top15.jsonl and final_top15.csv
    • final ranked vulnerability set after combining metadata and ASAN parsing
    • when --selection-mode diverse is used, selection caps per-category, per-project, and per-target counts for broader coverage
  • final_bundles_topN.jsonl and final_bundles_topN.csv
    • bugs bundled by shared project/fuzz_target/crash_revision, with aggregate scores
  • final_mixed.jsonl and final_mixed.json
    • unified output combining single-bug rows and bundle rows
  • final_vulnerabilities.json and final_vulnerabilities.jsonl
    • simplified machine-readable handoff list for downstream build scripts
    • includes bug_id, project, fuzz_target, short bug description/category, and vul_image

Changing --prefilter-n or --final-n changes only the shortlist sizes; the prefilter_top<N> and final_top<N> filenames are derived from the configured top-N values in the current script.
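The duplicate grouping behind duplicate_target_revisions amounts to keying bug records by (project, fuzz_target, crash_revision). A sketch under that assumption; field names follow the README's terminology, not necessarily the script's internals:

```python
from collections import defaultdict

def group_by_target_revision(bugs):
    """Group bug records that share project, fuzz_target, and crash_revision."""
    groups = defaultdict(list)
    for bug in bugs:
        key = (bug["project"], bug["fuzz_target"], bug["crash_revision"])
        groups[key].append(bug["bug_id"])
    return dict(groups)
```

Bundles in final_bundles_topN are built from groups like these, with scores aggregated per group.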

Selection Modes

The finalize and all subcommands accept --selection-mode:

  • default: simple top-N by final score.
  • diverse: diversity-aware selection that caps per-category, per-project, and per-target representation, then progressively relaxes caps to fill the quota. Prioritizes strong ASAN categories (heap-use-after-free, double-free, buffer overflows) before weaker ones.
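The cap-then-relax idea behind diverse mode can be pictured with a single per-project cap; this is a hedged sketch, and the actual caps, cap dimensions, and category ordering in arvo_exploitability.py may differ:

```python
from collections import Counter

def select_diverse(ranked, n, project_cap=1):
    """Pick up to n bugs from a score-sorted list, capping per-project
    representation, then progressively relaxing the cap to fill the quota."""
    chosen, counts = [], Counter()
    cap = project_cap
    while len(chosen) < n and cap <= len(ranked):
        for bug in ranked:
            if len(chosen) >= n:
                break
            if bug in chosen:
                continue
            if counts[bug["project"]] < cap:
                chosen.append(bug)
                counts[bug["project"]] += 1
        cap += 1  # relax the cap and sweep the ranking again
    return chosen
```

With a cap of 1, the first sweep takes the best bug from each project; later sweeps fill the remaining slots from the strongest over-represented projects.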

Scoring

Final scores combine metadata and ASAN features:

  • Wildcard access bonus: crash types containing {*} receive a metadata priority bonus (+35 for writes, +10 for reads).
  • Access size scaling: for access sizes beyond 32 bytes, write and read accesses scale logarithmically (writes scale more aggressively than reads).
  • Crash revision tracking: each bug record includes its crash revision, enabling duplicate grouping and bundle generation.
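The wildcard bonus and access-size scaling can be sketched as follows. The +35/+10 constants come from this README; the logarithmic weights are hypothetical stand-ins, since the script's exact constants are not stated here:

```python
import math

def metadata_bonus(crash_type: str, access_size: int) -> float:
    """Illustrative bonus terms: +35/+10 for wildcard writes/reads, plus
    logarithmic scaling for accesses beyond 32 bytes (writes weighted more)."""
    is_write = "WRITE" in crash_type
    bonus = 0.0
    if "{*}" in crash_type:
        bonus += 35.0 if is_write else 10.0
    if access_size > 32:
        weight = 4.0 if is_write else 2.0  # hypothetical write/read weights
        bonus += weight * math.log2(access_size / 32)
    return bonus
```

For example, a READ {*} crash with a 128-byte access would score 10 + 2·log2(128/32) = 14 under these stand-in weights.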

Progress Output

The script prints progress lines to stderr during long runs, for example:

[progress] starting metadata prefilter stage
[progress] processed 2500/4993 metadata files
[progress] dispatching Docker batch of 2 candidate(s); 18/50 successful ASAN parses so far

Per-bug Docker results are printed to stdout:

[ok] 3630 heap-use-after-free
[timeout] 9388 Heap-buffer-overflow WRITE {*}

Troubleshooting

final_top15.jsonl is missing:

Run finalize. run-asan does not create final ranking files by itself.

Docker pulls take a long time:

This is normal on the first run for uncached ARVO images. Resume runs reuse downloaded images and cached logs.

The batch stops after a timeout:

Timed-out bugs are recorded in asan_failures.jsonl instead of aborting the whole run. Resume the command without --force.

I want a clean rerun:

Delete the output directory first, or choose a fresh --output-dir.
