update flex attention patching for transformers 4.51 #2501
Conversation
… to be in line with transformers v4.51
huggingface/transformers#37285 has landed btw
Force-pushed from b98dbaf to cdb1606
I don't know that this can be removed yet, as the released version in transformers had an edge case that affects most of our docker builds. huggingface/transformers#37399
if self.cfg.flex_attention:
    self.model_kwargs["attn_implementation"] = "flex_attention"
    self.model_config._attn_implementation = (  # pylint: disable=protected-access
        "flex_attention"
    )
Do we need to at least keep these lines?
Do we still even need this PR?
superseded by #2469
Bumping the flex attention monkeypatch to transformers 4.51
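For context, a minimal sketch (not the axolotl patch itself) of what relying on transformers >= 4.51 directly looks like: flex attention can be requested at load time via attn_implementation instead of monkeypatching the model config afterwards. The checkpoint name below is a placeholder, not something referenced in this PR.

# Sketch only: assumes transformers >= 4.51 and a flex-attention-capable model.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM2-135M",  # placeholder checkpoint for illustration
    attn_implementation="flex_attention",
    torch_dtype=torch.bfloat16,
)

# The loaded config should report the requested implementation,
# which is what the patched lines above were setting by hand.
print(model.config._attn_implementation)  # expected: "flex_attention"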