feat: show progress bar during multi-turn ISL pre-compute #329
YibaiMeng wants to merge 2 commits into
The serial apply_chat_template loop over every client turn runs silently for minutes on large multi-turn datasets, so it looks like a hang. Show a throttled tqdm bar (mininterval of 2 s) over the loop to provide liveness and an ETA in captured logs. No behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
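A minimal sketch of the throttling described in the commit message, not the PR's actual diff. The names count_tokens and precompute_isl are hypothetical stand-ins, and the whitespace split replaces the real apply_chat_template call; only tqdm's mininterval, desc, and unit parameters are taken from the library itself.

```python
try:
    from tqdm import tqdm
except ImportError:  # fallback so the sketch still runs without tqdm installed
    def tqdm(iterable, **kwargs):
        return iterable


def count_tokens(turn: str) -> int:
    """Hypothetical stand-in for the apply_chat_template + tokenization step."""
    return len(turn.split())


def precompute_isl(turns: list[str]) -> list[int]:
    """Compute one token count per turn, refreshing the bar at most every 2 s."""
    isl = []
    # mininterval=2.0 caps display refreshes to one every 2 seconds, which keeps
    # captured logs short while still showing liveness and an ETA.
    for turn in tqdm(turns, desc="Pre-computing ISL", unit="turn", mininterval=2.0):
        isl.append(count_tokens(turn))
    return isl
```

Because the bar only wraps the iterable, the values returned by the loop are unchanged, which matches the "no behavior change" claim.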
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
Code Review
This pull request introduces a tqdm progress bar to track the progress of pre-computing ISL token counts during multi-turn benchmarks. The feedback recommends disabling the progress bar when INFO logging is not enabled to avoid cluttering logs in non-interactive or CI/CD environments.
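One way the review suggestion could look, as a sketch under assumptions rather than the change that was actually committed: the logger name and the helpers should_show_progress and progress are hypothetical, while Logger.isEnabledFor and tqdm's disable parameter are existing APIs.

```python
import logging

try:
    from tqdm import tqdm
except ImportError:  # fallback so the sketch still runs without tqdm installed
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("isl_precompute_sketch")


def should_show_progress(log: logging.Logger) -> bool:
    """Show the bar only when INFO-level messages would be emitted."""
    return log.isEnabledFor(logging.INFO)


def progress(iterable, **kwargs):
    """Wrap an iterable in a tqdm bar that is disabled below INFO verbosity."""
    return tqdm(
        iterable,
        disable=not should_show_progress(logger),
        mininterval=2.0,
        **kwargs,
    )
```

With the logger set to WARNING or higher the wrapper yields the same items without printing anything, so CI logs stay free of progress output.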
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Hi @YibaiMeng, could you check if #336 alleviates the issue?
What does this PR do?
Show a tqdm progress bar over the multi-turn ISL pre-computation loop in _precompute_isl_for_multi_turn (src/inference_endpoint/commands/benchmark/execute.py).
Why: On large multi-turn datasets the serial apply_chat_template loop runs for minutes with no output (30 min on a 26k-turn agentic-coding dataset, with the entire benchmark taking less than 50 minutes), so it looks like a hang. This PR adds a progress bar.
Type of change
Related issues
N/A
Testing
Checklist