Conversation

@RyanMullins
Contributor

What does this PR do?

Previously, the Gemma3nTextAttention class removed the self.scaling attribute and instead passed a hard-coded 1.0 to the attention_interface() function.

This PR reinstates the self.scaling attribute, sets it to 1.0, and passes self.scaling to the attention_interface() call in Gemma3nTextAttention.forward(). This makes the scaling factor easier to configure for users experimenting with the Gemma 3n architecture, and it improves the lineage of the modular inheritance relative to Gemma 3.
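To make the intent concrete, here is a minimal, illustrative sketch of the pattern this PR restores, not the actual Gemma 3n code: the scale lives on the module as self.scaling rather than being hard-coded at the attention call site, so a subclass defined through modular inheritance can override it without re-implementing forward(). ToyTextAttention and its subclass are hypothetical stand-ins, and torch.nn.functional.scaled_dot_product_attention stands in for attention_interface().

```python
import torch
import torch.nn.functional as F


class ToyTextAttention(torch.nn.Module):
    """Illustrative stand-in for an attention layer; not the actual transformers code."""

    def __init__(self, head_dim: int):
        super().__init__()
        self.head_dim = head_dim
        # The scale is stored on the module (1.0 here, as in Gemma 3n) instead of
        # being hard-coded at the attention call, so subclasses can override it.
        self.scaling = 1.0

    def forward(self, query, key, value):
        # self.scaling is forwarded to the attention implementation, mirroring how
        # the PR passes it to attention_interface() in forward().
        # Requires a PyTorch version whose SDPA accepts the `scale` kwarg (2.1+).
        return F.scaled_dot_product_attention(query, key, value, scale=self.scaling)


class ToyGemma3StyleAttention(ToyTextAttention):
    """A subclass can change the scale (e.g. 1/sqrt(head_dim)) without touching forward()."""

    def __init__(self, head_dim: int):
        super().__init__(head_dim)
        self.scaling = head_dim**-0.5


q = k = v = torch.randn(1, 4, 8, 16)  # (batch, heads, seq_len, head_dim)
out = ToyTextAttention(16)(q, k, v)
print(out.shape)  # torch.Size([1, 4, 8, 16])
```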

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker @Cyrilvallez

@Rocketknight1
Member

Seems like a core question for text model philosophy so cc @ArthurZucker!

@RyanMullins force-pushed the gemma3n-modular-inheritance branch from ea129b3 to 6eea1ba on October 28, 2025 at 14:10
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3n

@Cyrilvallez
Member

LGTM, thanks!

@Cyrilvallez merged commit 91d250e into huggingface:main on Nov 7, 2025
15 checks passed
Abdennacer-Badaoui pushed a commit to Abdennacer-Badaoui/transformers that referenced this pull request Nov 10, 2025
maintenance: make Gemma3nTextAttention more amenable to modular inheritance