Tags · ggml-org/llama.cpp

b7416

convert : move rope_parameters to TextModel class (#18061)

* make sure to search text_config for rope parameters

* move rope_parameters to TextModel class

Dec 15, 2025
d6a1e18

b7415

ggml-hexagon: mm for mtmd (#17894)

* feat: add run_mtmd script for hexagon

* fix: fix issue in fp16xfp32 mm

* fix: remove opt_experiment for fp16xfp32 mm

* fix: ggml-hexagon: matmul fp16xfp32 support non-contiguous src0

* fix: fix syntax check for run-mtmd.sh for cli

Dec 15, 2025
c45f89d

b7414

model : add KORMo model (#18032)

* vocab: add KORMo Tokenizer

* model: add KORMoForCausalLM

* vocab: change pretokenizer to qwen2

* lint: fix unintended line removal

* model: make qwen2 bias tensor optional

* model: use qwen2 architecture for KORMo

Dec 15, 2025
9d52f17

b7413

kv-cache: Fix state restore fragmented cache (#17982)

* kv-cache : fix state restore with fragmented cache (#17527)

Change find_slot to allow non-contiguous allocation during state restore. Fixes 'failed to find available cells in kv cache' error when restoring state to fragmented cache.

* tests : update logic

* cleanup: tightened state_read_meta sig, added is_contiguous case

* fix: state_read_meta arg reorder loose ends

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
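
The difference between contiguous and non-contiguous slot search described above can be illustrated with a toy sketch. This is not llama.cpp's implementation: the cache is modeled as a plain list of "used" flags, and `find_contiguous` and `find_scattered` are hypothetical names introduced only for this illustration.

```python
from typing import List, Optional


def find_contiguous(used: List[bool], n: int) -> Optional[List[int]]:
    """Return the indices of the first run of n consecutive free cells, or None."""
    if n <= 0:
        return []
    run_start = 0  # index where the current run of free cells begins
    run_len = 0    # length of the current run of free cells
    for i, is_used in enumerate(used):
        if is_used:
            # A used cell breaks the run; the next run can start after it.
            run_len = 0
            run_start = i + 1
        else:
            run_len += 1
            if run_len == n:
                return list(range(run_start, run_start + n))
    return None


def find_scattered(used: List[bool], n: int) -> Optional[List[int]]:
    """Return the first n free cell indices, adjacent or not, or None."""
    if n <= 0:
        return []
    free = [i for i, is_used in enumerate(used) if not is_used]
    return free[:n] if len(free) >= n else None


# A fragmented toy cache: four free cells, but no run longer than two.
fragmented = [True, False, True, False, True, False, False, True]

contiguous = find_contiguous(fragmented, 3)  # no run of three free cells
scattered = find_scattered(fragmented, 3)    # three free cells exist overall
```

In this toy model, a request for three cells fails under the contiguous policy even though enough free cells exist, which mirrors the 'failed to find available cells in kv cache' situation the commit message describes for fragmented caches.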

Dec 15, 2025
4529c66

b7411

metal: use shared buffers on eGPU (#17866)

* metal: use shared buffers on eGPU

With #15906, I noticed an important regression when using the Metal backend on an eGPU.
This commit restores the previous behavior and adds an option to force its activation.

Dec 15, 2025
165caaf

b7410

mtmd: refactor audio preprocessing (#17978)

* mtmd: refactor audio preprocessing

* refactor

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>

* wip

* wip (2)

* improve constructor

* fix use_natural_log

* fix padding for short input

* clean up

* remove need_chunking

---------

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>

Dec 15, 2025
96a181a

b7406

[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (#17826)

* support gpt-oss GPU by OP add-id, mul_mat for mxfp4, swiglu_oai, fix warning

* fix fault ut case, update ops.md

* rebase, fix format issue

Dec 15, 2025
4aced7a

b7405

model : add glm-asr support (#17901)

* [model] add glm-asr support

* fix format for ci

* fix convert format for ci

* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review

* check root architecture for convert hf script

* fix conflict with upstream

* fix convert script for glm asr & format clip-impl

* format

* restore hparams text

* improved conversion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Dec 15, 2025
745fa0e

b7404

preset: handle negated arg, reverse the meaning if needed (#18041)

Dec 14, 2025
5239229

b7402

mtmd: enhance image resizing in llava_uhd (#18014)

Dec 14, 2025
37f5a10
