ggml-org / llama.cpp Public

Fork 14.1k
Star 91.4k

Pull requests: ggml-org/llama.cpp

Labels 87 Milestones 0

625 Open 8,137 Closed

Pull requests list

model : add ASR support for LFM2-Audio-1.5B (conformer)

#18106 opened Dec 16, 2025 by ngxson

model: fix LFM2 missing tensors

#18105 opened Dec 16, 2025 by ngxson

llama-fit-params: force disable mlock

#18103 opened Dec 16, 2025 by JohannesGaessler

ggml-cuda: Delta-Net linear attention for Qwen3-Next

#18102 opened Dec 16, 2025 by hauhaut

llama-fit-params: lower ctx size for multi GPU

#18101 opened Dec 16, 2025 by JohannesGaessler

gguf-py: allow converting multi-tensor models from read-only locations

#18100 opened Dec 16, 2025 by ykhrustalev

ggml-cpu: ARM64: repack version of q8_0 (dotprod and i8mm)

#18096 opened Dec 16, 2025 by Alcpz

llama-fit-params: fix underflow for dense models

#18095 opened Dec 16, 2025 by JohannesGaessler

ggml : use WARP_SIZE/2 for argmax reduction offset

Labels: ggml, Nvidia GPU

#18092 opened Dec 16, 2025 by Aadeshveer

webui: Fix selecting generated output issues during active streaming

Labels: examples, server

#18091 opened Dec 16, 2025 by allozaur

llama-fit-params: QoL impr. for prints/errors

Labels: examples

#18089 opened Dec 16, 2025 by JohannesGaessler

server: [RFC] add optional POST /exit endpoint for graceful shutdown

Labels: examples, server

#18086 opened Dec 16, 2025 by qnixsynapse • Draft

ggml: migrate work_data to stack allocation

Labels: ggml, testing

#18083 opened Dec 16, 2025 by GermanAizek

vulkan/cuda: fix topk_moe with exp_probs_b

Labels: ggml, Nvidia GPU, testing, Vulkan

#18071 opened Dec 15, 2025 by jeffbolznv

webui: add responsive chat width option to webui (#18067)

Labels: examples, server

#18068 opened Dec 15, 2025 by ImadSaddik

vulkan: support GGML_UNARY_OP_XIELU

Labels: ggml, Vulkan

#18062 opened Dec 15, 2025 by jeffbolznv

vulkan: in graph_optimize, try to group ADD operations

Labels: ggml, Vulkan

#18060 opened Dec 15, 2025 by jeffbolznv

webui: Client-side implementation of tool calling (with two tools)

Labels: examples, server

#18059 opened Dec 15, 2025 by coder543

common: fix --override-kv to support comma-separated values

#18056 opened Dec 15, 2025 by ServeurpersoCom

server: Fix router proxying to child processes when --host is specified

Labels: examples, server

#18054 opened Dec 15, 2025 by wbtek

CLI: llama-cli and llama-completion cosmetics

Labels: devops, documentation, python, script, SYCL

#18053 opened Dec 15, 2025 by andrew-aladev

vulkan: Implement set_tensor_async and the event interfaces

Labels: ggml, Vulkan

#18047 opened Dec 15, 2025 by jeffbolznv

chat-parser: handle whitespace around JSON in tool call parsing

Labels: testing

#18044 opened Dec 15, 2025 by ochafik • Draft

convert : keep file part order from model index

Labels: python

#18043 opened Dec 14, 2025 by CISC

[Speculative decoding] feat: add EAGLE3 speculative decoding support

Labels: examples, ggml, model, python

#18039 opened Dec 14, 2025 by ichbinhandsome • Draft
