Pull requests: ggml-org/llama.cpp
model : add ASR support for LFM2-Audio-1.5B (conformer)
#18106
opened Dec 16, 2025 by
ngxson
llama-fit-params: lower ctx size for multi GPU
#18101
opened Dec 16, 2025 by
JohannesGaessler
gguf-py: allow converting multi-tensor models from read-only locations
#18100
opened Dec 16, 2025 by
ykhrustalev
ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm)
#18096
opened Dec 16, 2025 by
Alcpz
llama-fit-params: fix underflow for dense models
#18095
opened Dec 16, 2025 by
JohannesGaessler
ggml : use WARP_SIZE/2 for argmax reduction offset
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18092
opened Dec 16, 2025 by
Aadeshveer
llama-fit-params: QoL impr. for prints/errors
examples
#18089
opened Dec 16, 2025 by
JohannesGaessler
server: [RFC] add optional POST /exit endpoint for graceful shutdown
examples
server
#18086
opened Dec 16, 2025 by
qnixsynapse
Draft
ggml: migrate work_data to stack allocation
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#18083
opened Dec 16, 2025 by
GermanAizek
vulkan/cuda: fix topk_moe with exp_probs_b
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#18071
opened Dec 15, 2025 by
jeffbolznv
webui: add responsive chat width option to webui (#18067)
examples
server
#18068
opened Dec 15, 2025 by
ImadSaddik
vulkan: support GGML_UNARY_OP_XIELU
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18062
opened Dec 15, 2025 by
jeffbolznv
vulkan: in graph_optimize, try to group ADD operations
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18060
opened Dec 15, 2025 by
jeffbolznv
webui: Client-side implementation of tool calling (with two tools)
examples
server
#18059
opened Dec 15, 2025 by
coder543
common: fix --override-kv to support comma-separated values
#18056
opened Dec 15, 2025 by
ServeurpersoCom
server: Fix router proxying to child processes when --host is specified
examples
server
#18054
opened Dec 15, 2025 by
wbtek
CLI: llama-cli and llama-completion cosmetics
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
python
python script changes
script
Script related
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#18053
opened Dec 15, 2025 by
andrew-aladev
vulkan: Implement set_tensor_async and the event interfaces
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18047
opened Dec 15, 2025 by
jeffbolznv
convert : keep file part order from model index
python
python script changes
#18043
opened Dec 14, 2025 by
CISC
[Speculative decoding] feat: add EAGLE3 speculative decoding support
examples
ggml
changes relating to the ggml tensor library for machine learning
model
Model specific
python
python script changes
#18039
opened Dec 14, 2025 by
ichbinhandsome
Draft