@kylesayrs
Contributor

@kylesayrs kylesayrs commented Dec 15, 2025

Purpose

  • Support loading models with online transforms applied via Compressed Tensors (LLM Compressor)
  • Fix tests which referenced deleted models

Background

Transforms are extra weights added to a model to improve accuracy recovery after quantization. These weights must be shared across modules to keep the model's memory footprint down.

Changes

  • Require a minimum compressed tensors version of 0.11.0 (to support transform features)
  • Apply transforms to the model before weight loading
    • Implement _update_transforms_tied_weights, which leverages @Cyrilvallez's refactored tie_weights functionality!
    • _update_transforms_tied_weights specifies which transform weights are tied; PreTrainedModel.tie_weights then ingests the tied weights map and finds the loaded weight to tie the shared weights to
  • Refactor compressed tensors tests to check perplexity rather than exact output matches
    • Update model stubs to use up-to-date models

Example _tied_weights_keys:

"model.layers.1.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
"model.layers.2.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
"model.layers.3.self_attn_.q_proj.v_input.weight": "model.layers.0.self_attn_.q_proj.v_input.weight",
...
"model.layers.1.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",
"model.layers.2.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",
"model.layers.3.self_attn_.q_proj.u_output.weight": "model.layers.0.self_attn_.q_proj.u_output.weight",

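A minimal sketch of how a map shaped like the example above could be built, assuming every layer's transform weight is tied to the copy in layer 0. The helper name `build_tied_transform_keys` and its parameters are hypothetical and only for illustration; this is not the actual `_update_transforms_tied_weights` implementation.

```python
def build_tied_transform_keys(num_layers, module_path, transform_names):
    """Map each later layer's transform weight to the layer-0 copy.

    Hypothetical helper: the key layout mirrors the example above, but the
    real logic in this PR may derive the keys differently.
    """
    tied = {}
    for name in transform_names:
        # Assumption: layer 0 holds the weight that all other layers share.
        source = f"model.layers.0.{module_path}.{name}.weight"
        for layer in range(1, num_layers):
            target = f"model.layers.{layer}.{module_path}.{name}.weight"
            tied[target] = source
    return tied


# Module path copied verbatim from the example keys above.
tied_keys = build_tied_transform_keys(
    num_layers=4,
    module_path="self_attn_.q_proj",
    transform_names=["v_input", "u_output"],
)
```

Every key is a weight that gets tied, and every value is the weight it shares storage with, so only the layer-0 tensors need to be found among the loaded weights.
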
Testing

  • Regression tested using CompressedTensorsTest; added an online QuIP-style transformed model for testing
    • Perplexity results match expectations
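
For context, a perplexity-based check tolerates small numerical differences that would break an exact output match. A rough sketch of the idea, assuming per-token log-probabilities are already available; the function names and the 5% tolerance are illustrative assumptions, not what CompressedTensorsTest actually uses.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood) over the tokens."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def perplexity_matches(actual, expected, rel_tol=0.05):
    """Return True if perplexity is within a relative tolerance.

    The 5% default is an assumed value chosen for illustration only.
    """
    return math.isclose(actual, expected, rel_tol=rel_tol)


# Toy input: four tokens, each predicted with probability 0.5.
ppl = perplexity([math.log(0.5)] * 4)
```

With this toy input the mean negative log-likelihood is ln 2, so the perplexity works out to 2 up to floating-point error.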

Suggested Reviewers

@SunMarc @Cyrilvallez @Rocketknight1

@Rocketknight1
Member

cc @MekkCyber for quantization

@kylesayrs
Contributor Author

make fix-copies does not fix the CI 🥲

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs force-pushed the kylesayrs/transforms branch from a65d99e to d56f657 Compare December 16, 2025 15:46
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: compressed_tensors_integration

@github-actions
Contributor

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=42887&sha=d56f65
