Remove support for quant_llm_linear #3520

howardzhang-cv · 2025-12-20T02:08:44Z

Stack from ghstack (oldest at bottom):

-> Remove support for quant_llm_linear #3520

Summary: Deleted fp6_linear.cu and the rest of the fp6_llm folder.
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls.

Tasks: Related to issue #3516

[ghstack-poisoned]

pytorch-bot · 2025-12-20T02:08:47Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3520

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures, 2 Unrelated Failures

As of commit 07076f0 with merge base a8fa9e5:

NEW FAILURES - The following jobs have failed:

Code Analysis with Ruff / build (3.9) (gh)
Process completed with exit code 1.
PR Label Check / Check PR Labels (gh)
Process completed with exit code 1.
Run Regression Tests / test (CUDA 2.6, linux.g5.12xlarge.nvidia.gpu, torch==2.6.0, cuda, 12.6) / linux-job (gh)
test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_fpx_weight_only_ebits_3_mbits_2_bias_True_float16
Run Regression Tests / test (CUDA 2.7, linux.g5.12xlarge.nvidia.gpu, torch==2.7.1, cuda, 12.6) / linux-job (gh)
test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_fpx_weight_only_ebits_3_mbits_2_bias_True_float16
Run Regression Tests / test (CUDA 2.8, linux.g5.12xlarge.nvidia.gpu, torch==2.8.0, cuda, 12.6) / linux-job (gh)
test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_fpx_weight_only_ebits_3_mbits_2_bias_True_float16
Run Regression Tests / test (CUDA 2.9, linux.g5.12xlarge.nvidia.gpu, torch==2.9.1, cuda, 12.6) / linux-job (gh)
test/dtypes/test_floatx.py::TestFloatxTensorCoreAQTTensorImpl::test_fpx_weight_only_ebits_3_mbits_2_bias_True_float16

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Run 1xH100 Tests / test (H100, linux.aws.h100, --pre torch torchvision torchaudio fbgemm-gpu-genai --index-url https... / linux-job (gh) (trunk failure)
test/integration/test_load_and_run_checkpoint.py::TestLoadAndRunCheckpoint::test_deprecated_hf_models_model_info2
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh) (trunk failure)
test/test_low_bit_optim.py::TestFSDP2::test_fsdp2

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary: Deleted fp6_linear.cu and rest of fp6_llm folder
Modified ops.py (torchao/ops.py) and test_ops.py (test/test_ops.py) to remove quant_llm_linear calls
Tasks: Related to issue #3516
ghstack-source-id: 69c1877
Pull-Request: #3520

jerryzh168 · 2025-12-20T02:11:37Z

Probably have to delete this and the related tests as well:

ao/torchao/prototype/quantization/quant_api.py, line 620 in 7035fb7:

class FPXWeightOnlyConfig(AOBaseConfig):

You can search for quant_llm_linear in the code base (https://github.com/search?q=repo%3Apytorch%2Fao%20quant_llm_linear&type=code) and delete all the related code.
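For a local checkout, the same search can be sketched in a few lines of Python. This is only an illustration of the "find every reference, then delete it" step. The helper name `find_references` and the list of file suffixes are assumptions made for this sketch, not part of torchao.

```python
from pathlib import Path


def find_references(root, needle="quant_llm_linear",
                    suffixes=(".py", ".cu", ".cpp", ".h", ".md")):
    """Return (path, line_number, stripped_line) for each line under root
    that contains needle. Only files with the given suffixes are scanned."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            # Skip unreadable files instead of aborting the whole scan.
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling `find_references(".")` from the repository root and printing each tuple gives a checklist of call sites, tests, and docs that still mention the op.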

howardzhang-cv · 2025-12-20T02:14:26Z

This is my first time working with the torchao repo, so I'm not sure if this is the right way to do it:
I deleted the entire fp6_llm folder and modified ops.py and test_ops.py to remove calls to quant_llm_linear. Is this what we wanted? Or did we want to delete only fp6_llm, keep the calls to quant_llm_linear, and raise an error or something?
Also, if we are deleting quant_llm_linear, should I keep floatx_tensor_core? I might be misunderstanding, but it seems like the point of those functions was just to create the fp6 that could use quant_llm_linear. In any case, there is still a reference to quant_llm_linear in floatx_tensor_core_layout.py and in the README in that same folder that I have not removed. I just wanted some confirmation that this is what I'm supposed to be doing before continuing.
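For reference, the "keep the call and raise an error" alternative mentioned above could look roughly like the stub below. It is a hypothetical sketch, not what this PR does and not the actual signature in torchao/ops.py: the catch-all `*args, **kwargs` signature and the error message are assumptions.

```python
def quant_llm_linear(*args, **kwargs):
    """Hypothetical placeholder: keeps the name importable so that old
    call sites fail with a clear message instead of an AttributeError."""
    raise NotImplementedError(
        "quant_llm_linear has been removed together with the fp6_llm kernels."
    )
```

The trade-off is that a stub keeps dead API surface around, whereas deleting the op outright (as this PR does) forces every remaining reference to be cleaned up.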

jerryzh168 · 2025-12-20T02:21:49Z

@howardzhang-cv I think it might be cleaner if you delete the floatx_tensor_core_layout and the FPXWeightOnlyConfig in a separate PR first, before doing this

Update

07076f0

[ghstack-poisoned]

meta-cla bot added the CLA Signed label Dec 20, 2025

howardzhang-cv marked this pull request as draft December 20, 2025 02:09

howardzhang-cv requested a review from jerryzh168 December 20, 2025 02:09
