Tags: ggml-org/llama.cpp

b7416

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
convert : move rope_parameters to TextModel class (#18061)

* make sure to search text_config for rope parameters

* move rope_parameters to TextModel class
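
The lookup described in the first bullet can be illustrated with a minimal sketch. It assumes a Hugging Face style config loaded as a plain dict, where multimodal models nest the language-model settings under `text_config`. The helper name `find_rope_parameters` and the search order (top level first, then `text_config`) are assumptions made for illustration and are not taken from the converter itself.

```python
from typing import Any, Optional


def find_rope_parameters(hparams: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Hypothetical helper: locate rope parameters in a config dict.

    Checks the top level first, then falls back to the nested
    "text_config" block. The search order is an assumption.
    """
    for scope in (hparams, hparams.get("text_config") or {}):
        rope = scope.get("rope_parameters")
        if rope is not None:
            return rope
    return None


# Text-only config: parameters sit at the top level.
flat = {"rope_parameters": {"rope_theta": 10000.0}}
# Multimodal config: parameters sit under "text_config".
nested = {"text_config": {"rope_parameters": {"rope_type": "yarn"}}}

print(find_rope_parameters(flat))
print(find_rope_parameters(nested))
```

Returning `None` when nothing is found leaves the choice of defaults to the caller, which keeps the sketch free of guesses about what the converter does in that case.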

b7415

ggml-hexagon: mm for mtmd (#17894)

* feat: add run_mtmd script for hexagon

* fix: fix issue in fp16xfp32 mm

* fix: remove opt_experiment for fp16xfp32 mm

* fix: ggml-hexagon: matmul fp16xfp32 support non-contiguous src0

* fix: fix syntax check for run-mtmd.sh for cli

b7414

model : add KORMo model (#18032)

* vocab: add KORMo Tokenizer

* model: add KORMoForCausalLM

* vocab: change pretokenizer to qwen2

* lint: fix unintended line removal

* model: make qwen2 bias tensor optional

* model: use qwen2 architecture for KORMo

b7413

kv-cache: Fix state restore fragmented cache (#17982)

* kv-cache : fix state restore with fragmented cache (#17527)

Change find_slot to allow non-contiguous allocation during state restore. Fixes 'failed to find available cells in kv cache' error when restoring state to fragmented cache.

* tests : update logic

* cleanup: tightened state_read_meta sig, added is_contiguous case

* fix: state_read_meta arg reorder loose ends

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
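
The difference between contiguous and non-contiguous allocation can be shown with a toy model. This is a sketch under stated assumptions and not the llama.cpp implementation: the cache is modeled as a list of booleans (`True` means the cell is in use), and `find_cells` is a hypothetical stand-in for the real `find_slot`.

```python
from typing import Optional


def find_cells(used: list[bool], n: int, contiguous: bool) -> Optional[list[int]]:
    """Toy model: return the indices of n free cells, or None if not possible."""
    if n <= 0:
        return []
    if contiguous:
        # Scan for a single run of n adjacent free cells.
        run_start = 0
        run_len = 0
        for i, is_used in enumerate(used):
            if is_used:
                run_len = 0
                run_start = i + 1
            else:
                run_len += 1
                if run_len == n:
                    return list(range(run_start, run_start + n))
        return None
    # Non-contiguous: any n free cells will do, wherever they are.
    free = [i for i, is_used in enumerate(used) if not is_used]
    return free[:n] if len(free) >= n else None


# A fragmented cache: four free cells, but no run of three.
cache = [False, True, False, True, False, False]
print(find_cells(cache, 3, contiguous=True))   # no run of three adjacent free cells
print(find_cells(cache, 3, contiguous=False))  # scattered free cells are enough
```

In this model, the contiguous search fails on a fragmented cache even though enough free cells exist in total, which mirrors the 'failed to find available cells in kv cache' situation the commit message describes.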

b7411

metal: use shared buffers on eGPU (#17866)

* metal: use shared buffers on eGPU

With #15906, I noticed an important regression when using the Metal backend on an eGPU.
This commit restores the previous behavior and adds an option to force its activation.

b7410

mtmd: refactor audio preprocessing (#17978)

* mtmd: refactor audio preprocessing

* refactor

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>

* wip

* wip (2)

* improve constructor

* fix use_natural_log

* fix padding for short input

* clean up

* remove need_chunking

---------

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>

b7406

[SYCL] Support gpt-oss by OPs add-id, mul_mat for mxfp4, swiglu_oai (#17826)

* support gpt-oss GPU by OP add-id, mul_mat for mxfp4, swiglu_oai, fix warning

* fix fault ut case, update ops.md

* rebase, fix format issue

b7405

model : add glm-asr support (#17901)

* [model] add glm-asr support

* fix format for ci

* fix convert format for ci

* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review

* check root architecture for convert hf script

* fix conflict with upstream

* fix convert script for glm asr & format clip-impl

* format

* restore hparams text

* improved conversion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

b7404

preset: handle negated arg, reverse the meaning if needed (#18041)

b7402

mtmd: enhance image resizing in llava_uhd (#18014)