This directory contains `arvo_exploitability.py`, a CLI for ranking ARVO vulnerabilities by likely exploitability.
The pipeline has three stages:
- `prefilter`: rank all ASAN cases from metadata and select a shortlist.
- `run-asan`: run `docker run -t n132/arvo:<id>-vul arvo` for each shortlisted bug and parse the ASAN output.
- `finalize`: combine metadata and parsed ASAN features into the final ranked set.
The script needs the ARVO-Meta dataset. Running `start.sh` clones it automatically, or clone it manually:
```
git clone https://github.com/n132/ARVO-Meta
```

By default the script expects `ARVO-Meta/` in the repository root. To use a different location, pass `--arvo-dir`:

```
python3 arvo_exploitability.py all --arvo-dir /path/to/ARVO-Meta ...
```

`--meta-dir` and `--patch-dir` can still be set explicitly to override the paths derived from `--arvo-dir`.
- Run commands from the repository root.
- `python3` must be available.
- Docker must be installed and able to pull and run `n132/arvo:*` images.
Start from scratch and produce the final top 15:
```
rm -rf output/arvo_exploitability
python3 arvo_exploitability.py all \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --final-n 15 \
  --bundle-n 5 \
  --selection-mode diverse \
  --jobs 2 \
  --timeout 900
```

This runs all three stages:
- metadata ranking for all ASAN cases (including wildcard-access and crash-revision extraction)
- Docker execution until 50 successful ASAN-parsed candidates are collected
- final ranking to 15 vulnerabilities with duplicate grouping and bundle generation
If `run-asan` was interrupted, resume it with the same output directory:
```
python3 arvo_exploitability.py run-asan \
  --input output/arvo_exploitability/prefilter_ranked.jsonl \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --jobs 2 \
  --timeout 900
```

Behavior:
- Existing `output/.../asan_logs/<id>.log` files are reused.
- Bugs with cached logs are not rerun.
- Only missing logs are executed again.
- Use `--force` only if you want to rerun all Docker executions and overwrite cached logs.
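The resume behavior boils down to an existence check on cached log files, along these lines. `bugs_to_run` is an illustrative helper, not the script's actual function; the temp directory stands in for `asan_logs/`.

```python
import tempfile
from pathlib import Path

def bugs_to_run(candidate_ids: list[int], log_dir: Path) -> list[int]:
    """Resume-logic sketch: keep only bugs whose ASAN log is not cached yet."""
    return [b for b in candidate_ids if not (log_dir / f"{b}.log").exists()]

# Simulate a cached log for bug 3630 in a throwaway directory.
log_dir = Path(tempfile.mkdtemp())
(log_dir / "3630.log").write_text("cached ASAN output")
print(bugs_to_run([3630, 9388], log_dir))  # only 9388 still needs a Docker run
```

With `--force` the check would be skipped and every candidate rerun, overwriting the cached logs.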
If `asan_results.jsonl` already exists, generate the final files without rerunning Docker:
```
python3 arvo_exploitability.py finalize \
  --prefilter-input output/arvo_exploitability/prefilter_ranked.jsonl \
  --asan-input output/arvo_exploitability/asan_results.jsonl \
  --output-dir output/arvo_exploitability \
  --final-n 15 \
  --bundle-n 5 \
  --selection-mode diverse
```

This writes:
- `output/arvo_exploitability/final_top15.jsonl`
- `output/arvo_exploitability/final_top15.csv`
- `output/arvo_exploitability/final_vulnerabilities.json`
- `output/arvo_exploitability/final_bundles_top5.jsonl` and `.csv`
- `output/arvo_exploitability/final_mixed.jsonl` and `final_mixed.json`
- `output/arvo_exploitability/duplicate_targets.jsonl` and `.csv`
- `output/arvo_exploitability/duplicate_target_revisions.jsonl` and `.csv`
Run only the metadata prefilter:
```
python3 arvo_exploitability.py prefilter \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50
```

Run only the Docker/ASAN stage:
```
python3 arvo_exploitability.py run-asan \
  --input output/arvo_exploitability/prefilter_ranked.jsonl \
  --output-dir output/arvo_exploitability \
  --prefilter-n 50 \
  --jobs 2 \
  --timeout 900
```

The pipeline writes the following artifacts under the chosen `--output-dir`:
- `prefilter_ranked.jsonl` and `prefilter_ranked.csv` - full metadata ranking for all ASAN cases
- `prefilter_top50.jsonl` and `prefilter_top50.csv` - metadata shortlist used for deeper analysis
- `asan_logs/` - raw stdout from each Docker run, one log per bug ID
- `asan_attempts.jsonl` and `asan_attempts.csv` - every attempted Docker run, including failures and cached results
- `asan_results.jsonl` and `asan_results.csv` - successfully parsed ASAN reports
- `asan_failures.jsonl` and `asan_failures.csv` - timeouts, parse failures, and Docker execution failures
- `duplicate_targets.jsonl` and `duplicate_targets.csv` - bugs grouped by project/fuzz_target that share the same target
- `duplicate_target_revisions.jsonl` and `duplicate_target_revisions.csv` - bugs grouped by project/fuzz_target/crash_revision
- `final_top15.jsonl` and `final_top15.csv` - final ranked vulnerability set after combining metadata and ASAN parsing
  - when `--selection-mode diverse` is used, selection caps per-category, per-project, and per-target counts for broader coverage
- `final_bundles_topN.jsonl` and `final_bundles_topN.csv` - bugs bundled by shared project/fuzz_target/crash_revision, with aggregate scores
- `final_mixed.jsonl` and `final_mixed.json` - unified output combining single-bug rows and bundle rows
- `final_vulnerabilities.json` and `final_vulnerabilities.jsonl` - simplified machine-readable handoff list for downstream build scripts
  - includes `bug_id`, `project`, `fuzz_target`, a short bug description/category, and `vul_image`
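A downstream build script might consume the handoff list along these lines. The field names come from the list above; the sample row itself is made up, and the exact JSON shape (a flat array of objects) is an assumption.

```python
import json

# Hypothetical final_vulnerabilities.json content (values are made up).
handoff = """[
  {"bug_id": 3630, "project": "libfoo", "fuzz_target": "foo_fuzzer",
   "category": "heap-use-after-free", "vul_image": "n132/arvo:3630-vul"}
]"""

for vuln in json.loads(handoff):
    # A downstream build script would pull and run the vulnerable image.
    print(f"docker pull {vuln['vul_image']}  # {vuln['project']}/{vuln['fuzz_target']}")
```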
If you change `--prefilter-n` or `--final-n`, only the shortlist sizes change; the `prefilter_top<N>` and `final_top<N>` filenames follow the configured top-N.
The `finalize` and `all` subcommands accept `--selection-mode`:
- `default`: simple top-N by final score.
- `diverse`: diversity-aware selection that caps per-category, per-project, and per-target representation, then progressively relaxes the caps to fill the quota. It prioritizes strong ASAN categories (heap-use-after-free, double-free, buffer overflows) before weaker ones.
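The cap-then-relax idea behind `diverse` can be sketched as follows. This is a minimal illustration with a single per-project cap; the real selection also caps per-category and per-target, and the cap values here are assumptions.

```python
from collections import Counter

def select_diverse(ranked, n, project_cap=2):
    """Pick top-scored bugs while capping per-project counts; if the quota
    cannot be filled, progressively relax the cap and try again."""
    while True:
        picked, per_project = [], Counter()
        for bug in ranked:  # ranked = sorted by score, best first
            if per_project[bug["project"]] < project_cap:
                picked.append(bug)
                per_project[bug["project"]] += 1
            if len(picked) == n:
                return picked
        if len(picked) == len(ranked):
            return picked  # nothing left to add; quota cannot be met
        project_cap += 1  # quota unmet: relax the cap and retry

ranked = [
    {"bug_id": 1, "project": "a"}, {"bug_id": 2, "project": "a"},
    {"bug_id": 3, "project": "a"}, {"bug_id": 4, "project": "b"},
]
print([b["bug_id"] for b in select_diverse(ranked, 3)])  # → [1, 2, 4]
```

With `n=3` the cap of 2 skips the third `a` bug in favor of a `b` bug; asking for `n=4` forces the cap to relax so all four are taken.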
Final scores combine metadata and ASAN features:
- Wildcard access bonus: crash types containing `{*}` receive a metadata priority bonus (+35 for writes, +10 for reads).
- Access size scaling: for access sizes beyond 32 bytes, write and read accesses scale logarithmically (writes scale more aggressively than reads).
- Crash revision tracking: each bug records its crash revision, enabling duplicate grouping and bundle generation.
The script prints progress lines to stderr during long runs, for example:
```
[progress] starting metadata prefilter stage
[progress] processed 2500/4993 metadata files
[progress] dispatching Docker batch of 2 candidate(s); 18/50 successful ASAN parses so far
```
Per-bug Docker results are printed to stdout:
```
[ok] 3630 heap-use-after-free
[timeout] 9388 Heap-buffer-overflow WRITE {*}
```
**`final_top15.jsonl` is missing.** Run `finalize`; `run-asan` does not create the final ranking files by itself.
**Docker pulls take a long time.** This is normal on the first run for uncached ARVO images. Resumed runs reuse downloaded images and cached logs.
**The batch stops after a timeout.** Timed-out bugs are recorded in `asan_failures.jsonl` instead of crashing the whole run. Resume the command without `--force`.
**I want a clean rerun.** Delete the output directory first, or choose a fresh `--output-dir`.