Commit fbfba6c
fix: Initialize ApertusMLP's xielu activation using torch_dtype (#42864)

* Fix Apertus model crash on float16 hardware

  Initialize the XIELU activation with the correct dtype from the config (using config.dtype instead of the default bfloat16) to prevent promotion to float32 and subsequent crashes on Turing/float16 GPUs.

* refactor: Move `ACT2CLS` import to top level in Apertus models.

1 parent 2f81d58
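A minimal sketch of the dtype fix (not the verbatim diff), assuming the `xielu` entry in transformers' `ACT2CLS` activation registry maps to a class that accepts a `dtype` keyword defaulting to bfloat16, as the message above implies:

```python
import torch
from transformers.activations import ACT2CLS

# Stand-in for the model's configured dtype (config.dtype / torch_dtype),
# e.g. float16 on Turing-class GPUs that lack bfloat16 support.
model_dtype = torch.float16

# Before the fix: the activation kept its default bfloat16 parameters.
# Mixing them with float16 hidden states promoted results to float32,
# which then crashed on float16-only hardware.
# act_fn = ACT2CLS["xielu"]()

# After the fix: build the activation with the model's configured dtype.
act_fn = ACT2CLS["xielu"](dtype=model_dtype)

x = torch.randn(2, 8, dtype=model_dtype)
print(act_fn(x).dtype)  # expected to stay torch.float16 after the fix
```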
File tree: src/transformers/models/apertus
2 files changed: 7 additions, 2 deletions
[Diff body not captured in this extract. First file, two hunks: line 28 replaced (+1/-1) within lines 25-31, and two lines added at 52-53 (+2).]
[Diff body not captured in this extract. Second file, three hunks: one line added at line 22 (+1), line 195 replaced by new line 196 (+1/-1), and two lines added at 199-200 (+2).]
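The single-line addition near line 22 of the second file is plausibly the `ACT2CLS` import move described in the refactor bullet. A sketch of that pattern (illustrative, not the verbatim diff):

```python
# Before: the registry was imported locally, at the point of use:
#
#     class ApertusMLP(nn.Module):
#         def __init__(self, config):
#             from ..activations import ACT2CLS  # local import
#             self.act_fn = ACT2CLS[config.hidden_act](dtype=config.dtype)
#
# After: a single top-level import alongside the module's other imports,
# so every use site in the file shares it. (Inside the library this would
# be the relative form `from ...activations import ACT2CLS`.)
from transformers.activations import ACT2CLS
```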