Add LLaMA 3.1 and 3.2 tool calling support #5518
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5d6615bc4d
```python
if tokenizer.chat_template in [llama3_1_chat_template, llama3_2_chat_template]:
    tokenizer.response_schema = llama3_schema
```
Handle Llama 3.1 builtin tool-call format separately
`add_response_schema` now assigns `llama3_schema` to both the 3.1 and 3.2 templates, but `llama3_1.jinja` has a builtin-tools path that renders calls as `<|python_tag|>...<|eom_id|>` rather than the JSON form `{"name": ..., "parameters": ...}<|eot_id|>`. Because `llama3_schema` only matches the JSON form, builtin tool outputs are parsed as plain content instead of `tool_calls`, so tool execution is skipped for Llama 3.1 sessions that use `builtin_tools`.
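A minimal sketch of the distinction the review is pointing at. This is not TRL's actual schema code; the regexes and the `parse_llama3_response` helper are illustrative assumptions about how the two output forms could be told apart:

```python
import json
import re

# The two completion forms described above (hypothetical parser, not TRL's):
#   JSON tool call:  {"name": ..., "parameters": ...}<|eot_id|>
#   Builtin tools:   <|python_tag|>...<|eom_id|>   (Llama 3.1 only)
JSON_CALL_RE = re.compile(r'^\{.*\}\s*(?:<\|eot_id\|>)?$', re.DOTALL)
BUILTIN_RE = re.compile(r'^<\|python_tag\|>(?P<body>.*?)(?:<\|eom_id\|>)?$', re.DOTALL)

def parse_llama3_response(text):
    """Classify a raw completion as a builtin call, a JSON tool call, or plain content."""
    stripped = text.strip()
    m = BUILTIN_RE.match(stripped)
    if m:
        # A JSON-only schema would miss this branch and fall through to "content".
        return {"type": "builtin_call", "code": m.group("body")}
    if JSON_CALL_RE.match(stripped):
        payload = json.loads(stripped.removesuffix("<|eot_id|>"))
        return {"type": "tool_call", "name": payload["name"], "arguments": payload["parameters"]}
    return {"type": "content", "text": text}
```

Under this sketch, only the JSON branch yields a `tool_call`, which matches the review's claim that builtin-tool outputs would need a separate pattern to be recognized.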
@qgallouedec it's really cool, but is there a reason to put the templates/schema in TRL and not in the Hub repos?
Because it's very unlikely that models like Llama 3.1 would merge this kind of change. I don't have the link, but I remember that the Qwen team pushed back on adding a generation marker in their template. Response schemas are needed for RL training, so if we just wait for the labs to add them, it's impossible for us to do RL. The idea is to make these features (generation marker and schema) more popular and better tested, and hope they'll see greater adoption in the future.
We should still try! I'll see if I can make PRs and try to get them merged after this PR is merged.
Merging this one with no review. It's been open for a while and nothing critical here.
- `llama3_1.jinja` and `llama3_2.jinja` templates for identity matching in `add_response_schema`
- `TestAddResponseSchema` and `TestParseResponse` test parametrizations

Part of #5460
cc @Rocketknight1
Note
Medium Risk
Changes response parsing behavior based on chat-template identity matching, which could affect tool-call extraction for Llama variants if templates or regex matching diverge from real model outputs.
Overview
Adds Llama 3.1/3.2 tool-calling response parsing by introducing a dedicated `llama3_schema` and wiring it into `add_response_schema` via new identity-matching templates (`llama3_1.jinja`, `llama3_2.jinja`). The schema handles Llama's bare JSON tool-call format by converting `parameters` into standard `arguments`, while reflecting template limitations (single tool call, no content alongside a tool call). Updates the test suite to cover the new tiny Llama 3.1/3.2 fixtures and adjusts expectations/skips for unsupported behaviors (inline `reasoning_content`, multiple tool calls, tool call + content). Documentation for GRPO agent training now lists Llama 3.1 and 3.2 as supported models, and the chat-template README documents the new templates and their constraints.

Reviewed by Cursor Bugbot for commit 9a7b9d9.
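The `parameters`-to-`arguments` conversion the overview mentions can be sketched as follows. This is a hypothetical helper, not TRL's implementation; the exact message shape is an assumption modeled on the common chat-message tool-call convention:

```python
def to_tool_call_message(payload):
    """Normalize a parsed Llama tool call ({"name": ..., "parameters": ...})
    into a chat message using the standard "arguments" key.

    Hypothetical sketch; field names beyond "name"/"parameters" are assumptions.
    """
    return {
        "role": "assistant",
        # Per the template limitations noted above: no content alongside a tool call.
        "content": None,
        "tool_calls": [
            {
                "type": "function",
                "function": {
                    "name": payload["name"],
                    # Llama emits "parameters"; the standard key is "arguments".
                    "arguments": payload["parameters"],
                },
            }
        ],
    }
```

The single-element `tool_calls` list mirrors the stated template limitation that only one tool call can be emitted per turn.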