Contributor

@ssweens ssweens commented Dec 13, 2025

Fix #17527

Change find_slot to allow non-contiguous allocation during state restore. This fixes the 'failed to find available cells in kv cache' error when restoring state into a fragmented cache.
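
To illustrate why a contiguous-only search can fail on a fragmented cache while a non-contiguous one succeeds, here is a minimal sketch on a toy cell array. It is not the llama.cpp implementation: the function names `find_cells_contiguous` and `find_cells_any` and the `std::vector<bool>` cache model are hypothetical, chosen only for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a KV cache: used[i] is true when cell i is occupied.
// All names here are hypothetical; this is not the llama.cpp API.

// Contiguous search: returns the indices of the first run of n free cells,
// or an empty vector when no such run exists.
std::vector<uint32_t> find_cells_contiguous(const std::vector<bool> & used, size_t n) {
    size_t run = 0;
    for (size_t i = 0; i < used.size(); ++i) {
        run = used[i] ? 0 : run + 1;
        if (n > 0 && run == n) {
            std::vector<uint32_t> idxs;
            for (size_t j = i + 1 - n; j <= i; ++j) {
                idxs.push_back((uint32_t) j);
            }
            return idxs;
        }
    }
    return {};
}

// Non-contiguous search: returns the first n free cells wherever they are,
// or an empty vector when fewer than n cells are free.
std::vector<uint32_t> find_cells_any(const std::vector<bool> & used, size_t n) {
    std::vector<uint32_t> idxs;
    for (size_t i = 0; i < used.size() && idxs.size() < n; ++i) {
        if (!used[i]) {
            idxs.push_back((uint32_t) i);
        }
    }
    if (idxs.size() < n) {
        idxs.clear();
    }
    return idxs;
}
```

With every other cell occupied, a request for 3 cells fails under the contiguous search, while the non-contiguous search returns three scattered free cells.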

@ssweens ssweens requested a review from ggerganov as a code owner December 13, 2025 01:27
Contributor Author

ssweens commented Dec 13, 2025

PS: this aims to implement the direction given by @ggerganov (#17527 (comment)).

@ggerganov
Member

Looking good - will do some testing to confirm 👍

Member

ggerganov commented Dec 15, 2025

I updated the test to do a proper verification:

  • It requires kv_unified == true
  • It interleaves the sequences to create heavy fragmentation
  • It uses a total context size of 256 tokens and 3 sequences of 70 tokens each
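
The arithmetic behind these numbers can be sketched with a toy model. This is an assumption-laden illustration, not the actual test: it assumes the three sequences are interleaved one token at a time from cell 0, and that one sequence is then removed before being restored. The helpers `make_interleaved_cache`, `remove_seq`, and `count_free` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical toy model: owner[i] is the sequence id occupying cell i, or -1 if free.
struct free_stats {
    size_t total_free;   // number of free cells overall
    size_t longest_run;  // length of the longest contiguous run of free cells
};

// Fill the cache round-robin: token t of sequence s lands next to token t of
// the other sequences, so each sequence's cells are spread across the cache.
std::vector<int> make_interleaved_cache(size_t n_ctx, int n_seq, size_t n_tokens_per_seq) {
    std::vector<int> owner(n_ctx, -1);
    size_t cell = 0;
    for (size_t t = 0; t < n_tokens_per_seq; ++t) {
        for (int s = 0; s < n_seq && cell < n_ctx; ++s) {
            owner[cell++] = s;
        }
    }
    return owner;
}

// Free every cell that belongs to the given sequence.
void remove_seq(std::vector<int> & owner, int seq) {
    for (int & o : owner) {
        if (o == seq) {
            o = -1;
        }
    }
}

// Count free cells and the longest contiguous free run.
free_stats count_free(const std::vector<int> & owner) {
    free_stats st{0, 0};
    size_t run = 0;
    for (int o : owner) {
        if (o == -1) {
            ++st.total_free;
            ++run;
            st.longest_run = std::max(st.longest_run, run);
        } else {
            run = 0;
        }
    }
    return st;
}
```

Under these assumptions, `make_interleaved_cache(256, 3, 70)` followed by `remove_seq(owner, 1)` leaves 116 free cells, but the longest contiguous free run is only the 46 unused cells at the end. Restoring a 70-token sequence therefore cannot succeed with a contiguous-only search, even though enough cells are free in total.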

@ssweens After you address the comments, we can merge.

Contributor Author

ssweens commented Dec 15, 2025

Thanks @ggerganov. Comments addressed and committed.

Member

@ggerganov ggerganov left a comment

Thanks for the contribution. This is a good improvement.

@ggerganov ggerganov merged commit 4529c66 into ggml-org:master Dec 15, 2025
68 of 69 checks passed
Contributor Author

ssweens commented Dec 15, 2025

Happy to help where/when I can. Hat tip to you and team on the great project.

Labels

testing Everything test related

Development

Successfully merging this pull request may close these issues.

Eval bug: failed to restore kv cache on gpt-oss:120b with parallel processing