Tags: ggml-org/llama.cpp
ggml-hexagon: mm for mtmd (#17894)
* feat: add run_mtmd script for hexagon
* fix: fix issue in fp16xfp32 mm
* fix: remove opt_experiment for fp16xfp32 mm
* fix: ggml-hexagon: matmul fp16xfp32 support non-contiguous src0
* fix: fix syntax check for run-mtmd.sh for cli
model : add KORMo model (#18032)
* vocab: add KORMo Tokenizer
* model: add KORMoForCausalLM
* vocab: change pretokenizer to qwen2
* lint: fix unintended line removal
* model: make qwen2 bias tensor optional
* model: use qwen2 architecture for KORMo
kv-cache: Fix state restore fragmented cache (#17982)
* kv-cache : fix state restore with fragmented cache (#17527)
  Change find_slot to allow non-contiguous allocation during state restore. Fixes 'failed to find available cells in kv cache' error when restoring state to fragmented cache.
* tests : update logic
* cleanup: tightened state_read_meta sig, added is_contiguous case
* fix: state_read_meta arg reorder loose ends
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
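The difference between contiguous and non-contiguous allocation described in this entry can be illustrated with a minimal sketch. This is not the llama.cpp implementation of find_slot: the function name `find_cells`, its parameters, and the boolean-list model of the cache are all hypothetical, chosen only to show why a fragmented cache can have enough free cells in total yet fail a contiguous search.

```python
from typing import List, Optional


def find_cells(used: List[bool], n_tokens: int, contiguous: bool) -> Optional[List[int]]:
    """Return indices of n_tokens free cells, or None if none can be found.

    Hypothetical model: used[i] is True when cell i is occupied.
    """
    if n_tokens <= 0:
        return []

    if contiguous:
        # Look for a single run of n_tokens consecutive free cells.
        run_start = 0
        run_len = 0
        for i, is_used in enumerate(used):
            if is_used:
                run_len = 0
                run_start = i + 1
            else:
                run_len += 1
                if run_len == n_tokens:
                    return list(range(run_start, run_start + n_tokens))
        return None

    # Non-contiguous: any n_tokens free cells will do, wherever they are.
    free = [i for i, is_used in enumerate(used) if not is_used]
    return free[:n_tokens] if len(free) >= n_tokens else None


# A fragmented cache: four free cells, but never three in a row.
cache = [True, False, True, False, False, True, False]
print(find_cells(cache, 3, contiguous=True))   # None
print(find_cells(cache, 3, contiguous=False))  # [1, 3, 4]
```

In this toy model the contiguous search fails on the fragmented cache even though four cells are free, which corresponds to the 'failed to find available cells in kv cache' situation the entry mentions; allowing scattered cells lets the request succeed.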
metal: use shared buffers on eGPU (#17866)
* metal: use shared buffers on eGPU
  With #15906, I noticed an important regression when using the Metal backend on an eGPU. This commit restores the previous behavior and adds an option to force its activation.
* metal: use shared buffers on eGPU
* metal: use shared buffers on eGPU
mtmd: refactor audio preprocessing (#17978)
* mtmd: refactor audio preprocessing
* refactor
  Co-authored-by: Tarek <tdakhran@users.noreply.github.com>
* wip
* wip (2)
* improve constructor
* fix use_natural_log
* fix padding for short input
* clean up
* remove need_chunking
---------
Co-authored-by: Tarek <tdakhran@users.noreply.github.com>
model : add glm-asr support (#17901)
* [model] add glm-asr support
* fix format for ci
* fix convert format for ci
* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review
* check root architecture for convert hf script
* fix conflict with upstream
* fix convert script for glm asr & format clip-impl
* format
* restore hparams text
* improved conversion
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>