[docs] update deepseek_v4 vllm docs #9597
Code Review
This pull request updates the weight-saving logic in swift/megatron/init.py to delete the expert_dtype attribute from hf_config and set llm_config.expert_dtype to 'fp8' when blockwise FP8 quantization is enabled. The reviewer recommends using HfConfigFactory.set_config_attr instead of direct attribute assignment to safely set expert_dtype on llm_config, avoiding potential AttributeError exceptions if llm_config is a dictionary.
if hasattr(self, '_fp8_skip_modules'):
    modules_to_not_convert = (modules_to_not_convert or []) + list(self._fp8_skip_modules)
hf_config.quantization_config = FineGrainedFP8Config(modules_to_not_convert=modules_to_not_convert)
llm_config.expert_dtype = 'fp8'
The direct attribute assignment llm_config.expert_dtype = 'fp8' raises an AttributeError if llm_config is a dictionary. Using HfConfigFactory.set_config_attr to set this attribute is safer and more consistent with the rest of the codebase.
Suggested change:
- llm_config.expert_dtype = 'fp8'
+ HfConfigFactory.set_config_attr(llm_config, 'expert_dtype', 'fp8')
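To illustrate the failure mode the comment describes, here is a minimal, self-contained sketch. The helper `set_config_attr` below is a hypothetical stand-in written for this example only; it is not the actual `HfConfigFactory.set_config_attr` implementation, whose behavior may differ. It only shows why a setter that handles both dictionaries and attribute-style objects avoids the AttributeError.

```python
from types import SimpleNamespace
from typing import Any


def set_config_attr(config: Any, name: str, value: Any) -> None:
    """Hypothetical helper: set a field on a dict or an attribute-style config."""
    if isinstance(config, dict):
        # Dictionaries store fields as keys, not attributes.
        config[name] = value
    else:
        setattr(config, name, value)


# Direct attribute assignment fails when the config is a plain dict.
dict_config = {}
try:
    dict_config.expert_dtype = 'fp8'
    direct_assignment_failed = False
except AttributeError:
    direct_assignment_failed = True

# The helper works for both shapes of config.
set_config_attr(dict_config, 'expert_dtype', 'fp8')

obj_config = SimpleNamespace()  # stands in for an attribute-style config object
set_config_attr(obj_config, 'expert_dtype', 'fp8')
```

Routing the assignment through a single helper keeps the call site identical regardless of whether the nested config was loaded as an object or as a raw dictionary.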
No description provided.