Skip to content

Conversation

@vedant713
Copy link

This patch ensures that both mi_ctz_generic32 and mi_clz_generic32 perform safe indexing into the de Bruijn lookup tables by masking the computed index with & 31.

On platforms where unsigned long is 64-bit, the result of the de Bruijn multiplication and shift could exceed the valid index range (0–31), leading to an out-of-bounds read.

This change applies a bitwise AND mask to the final index:

  • mi_ctz_generic32: debruijn[(((x & -(int32_t)x) * 0x077CB531U) >> 27) & 31]
  • mi_clz_generic32: debruijn[((x * 0x07C4ACDDU) >> 27) & 31]

This matches the fix applied in python/cpython#134070 to its integrated mimalloc copy.

Fixes: python/cpython#134070

This patch ensures that both mi_ctz_generic32 and mi_clz_generic32 perform safe indexing into the de Bruijn lookup tables by masking the computed index with `& 31`.

On platforms where unsigned long is 64-bit, the result of the de Bruijn multiplication and shift could exceed the valid index range (0–31), leading to an out-of-bounds read.

This change applies a bitwise AND mask to the final index:
- `mi_ctz_generic32`: debruijn[(((x & -(int32_t)x) * 0x077CB531U) >> 27) & 31]
- `mi_clz_generic32`: debruijn[((x * 0x07C4ACDDU) >> 27) & 31]

This matches the fix applied in python/cpython#134070 to its integrated mimalloc copy.

Fixes: python/cpython#134070
@res2k
Copy link
Contributor

res2k commented May 17, 2025

Out of curiosity -
Reading the original: (uint32_t)(x * (uint32_t)(0x07C4ACDDU)) >> 27
I'm thinking the intent was to cut off the upper 32 bits with the (uint32_t) cast; with that, the result after shift right by 27 should not have any bits set above bit 5, due to the cast having previously cut off the bits above.
So I wonder why the original code doesn't work, but an explicit ANDing with 31 does.
Some quirk I'm missing?

@daanx
Copy link
Collaborator

daanx commented May 21, 2025

Thank you for the PR! I was also confused because it is already cast to uint32_t so the extra & 31 should not be needed. Then I remembered a recent commit ed31847 which fixed the original bug by adding the uint32_t cast. So, this new PR is no longer needed. I really appreciate the upstreaming of CPython issues though -- thank you so much! (I am currently also working on getting V3 ready for CPython with better integration and less code needed on the CPython side to integrate).

@daanx daanx closed this Jun 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Out-of-bounds read in integrated mimalloc (fixed upstream)

3 participants