feat: enable ModelLoaderHuggerFace to support loading models in fp16 for inference #555
Open
When `fp16_inference` is enabled, the model's parameters are loaded as fp16 for inference.
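A minimal sketch of what this enables on the caller's side, assuming a PyTorch/Transformers stack; `fp16_inference` is the PR's option name, but wiring it through `from_pretrained` here is only an illustration, not the PR's actual code path:

```python
import torch
from transformers import AutoModelForCausalLM

# Illustration only: when fp16_inference is enabled, parameters are
# materialized directly as fp16 instead of the checkpoint's stored dtype.
fp16_inference = True

model = AutoModelForCausalLM.from_pretrained(
    "gpt2",
    torch_dtype=torch.float16 if fp16_inference else None,  # None keeps the default dtype
)
model.eval()  # inference only; fp16 weights are not meant for training
```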
This reverts commit 49fc21e.
Contributor
Converting the model to fp16 only after it has been loaded causes a sudden, large drop in memory usage.
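That drop is easy to reproduce with a small self-contained sketch; the `nn.Linear` stand-in and its size are arbitrary, chosen only to make the halving visible:

```python
import torch.nn as nn

# Stand-in for a real model: 4096x4096 fp32 weights are ~64 MiB.
model = nn.Linear(4096, 4096)

def param_bytes(m: nn.Module) -> int:
    return sum(p.numel() * p.element_size() for p in m.parameters())

print(f"fp32: {param_bytes(model) / 2**20:.1f} MiB")
model.half()  # convert parameters to fp16 in place
print(f"fp16: {param_bytes(model) / 2**20:.1f} MiB")  # roughly half
```

Note, though, that converting after loading still requires the full fp32 footprint at load time; loading directly as fp16, as this PR does, avoids that peak.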
`ModelLoaderHuggerFace` currently only supports reading tensors from a checkpoint and loading them into the model while keeping the tensor dtype as is. This PR adds an `fp16_inference` option, allowing `ModelLoaderHuggerFace` to load fp16 models for fp16 inference.
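A sketch of how such an option could work inside a checkpoint loader; `ModelLoaderHuggerFace` and `fp16_inference` come from the PR, but the helper below is a hypothetical simplification, not the PR's implementation:

```python
import torch

def load_checkpoint(path: str, fp16_inference: bool = False) -> dict:
    """Hypothetical helper: read a checkpoint, optionally casting to fp16."""
    state_dict = torch.load(path, map_location="cpu")
    if fp16_inference:
        # Cast only floating-point tensors; leave integer buffers untouched.
        state_dict = {
            name: t.half() if torch.is_floating_point(t) else t
            for name, t in state_dict.items()
        }
    return state_dict
```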