
Conversation

@danbev
Member

@danbev danbev commented Dec 15, 2025

This commit adds support for the NVIDIA Nemotron Nano 3 model, enabling the conversion and running of this model.


Tech blog: https://developer.nvidia.com/blog/inside-nvidia-nemotron-3-techniques-tools-and-data-that-make-it-efficient-and-accurate/

@danbev danbev requested a review from CISC as a code owner December 15, 2025 14:15
@ggerganov ggerganov changed the title llama : add support for NVIDIA Nemotron Nano 3 llama : add support for NVIDIA Nemotron 3 Nano Dec 15, 2025
@danielhanchen
Contributor

danielhanchen commented Dec 15, 2025

GGUFs work great! I converted them via the PR at https://huggingface.co/unsloth/Nemotron-3-Nano-30B-A3B-GGUF

Note: <think> and </think> are separate tokens, so folks may need to use --special to see them.
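Because <think> and </think> are stored as separate special tokens, a detokenizer that skips special pieces will drop them silently, which is what --special works around. A toy Python sketch of that behavior (invented vocab and detokenize helper, not llama.cpp's actual API; ids 12 and 13 are the ones documented for Nemotron 3):

```python
# Toy illustration: special tokens are skipped during detokenization unless
# explicitly rendered, so the <think> markers vanish from the visible output.

VOCAB = {
    12: ("<think>", True),    # (piece, is_special); ids per the Nemotron 3 docs
    13: ("</think>", True),
    100: ("Reasoning", False),
    101: (" here. ", False),
    102: ("Answer.", False),
}

def detokenize(ids, render_special=False):
    """Join token pieces, dropping special tokens unless render_special is set."""
    return "".join(
        piece for piece, special in (VOCAB[i] for i in ids)
        if render_special or not special
    )

ids = [12, 100, 101, 13, 102]
print(detokenize(ids))                       # thinking tags dropped
print(detokenize(ids, render_special=True))  # tags visible, like --special
```

With render_special=True the tags survive, mirroring what --special does for printed output.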

@arch-btw
Contributor

@danielhanchen please wait until it's merged next time.

@danielhanchen
Contributor

> @danielhanchen please wait until it's merged next time.

We were launch partners with NVIDIA, so we supported finetuning out of the gate and announced the GGUFs alongside it.

The GGUFs also already work in LMStudio as this llama.cpp PR was merged.

I will need to change the instructions for llama.cpp in our guide.

@cmp-nct
Contributor

cmp-nct commented Dec 15, 2025

It's a very good model, happy to see it supported so quickly on release.
Sadly it's very bad at detailed replication, so visual puzzles are not going to work with it, but that's an architectural flaw of the model I guess - same on OpenRouter.

Member

@ggerganov ggerganov left a comment


I think we have some issue with parsing the reasoning tokens:

[screenshot omitted]

Hoping that we'll get support from the community to fix this. Pinging @aldehir

Let's merge after the CI is green.

@engrtipusultan

> I think we have some issue with parsing the reasoning tokens:

Does it have something to do with this, from Unsloth:

> Nemotron 3 chat template format:
>
> Nemotron 3 uses <think> with token id 12 and </think> with id 13 for reasoning. Use --special to see the tokens.

https://docs.unsloth.ai/models/nemotron-3

@dinerburger
Contributor

dinerburger commented Dec 15, 2025

Even with --special it doesn't appear to emit the thinking tags (OpenWebUI does not render them, for example). You do get <|im_end however.

@aldehir
Collaborator

aldehir commented Dec 15, 2025

> Hoping that we'll get support from the community to fix this. Pinging @aldehir

I will take a look.

@github-actions github-actions bot added model Model specific python python script changes labels Dec 15, 2025
@pwilkin
Collaborator

pwilkin commented Dec 15, 2025

LM Studio seems to support it, including the thinking, so there has to be some way to make it work. But yeah, during normal generation in the server WebUI I didn't see the closing thinking tag (the opening one, I assume, is appended to the generation prompt, so it's a typical thinking_forced_open = true case).
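The forced-open case can be sketched as a small parser (a hypothetical split_reasoning helper, not llama.cpp's actual implementation): when the chat template has already appended <think> to the prompt, the model's output carries only the closing tag, so everything before </think> must be treated as reasoning even though no opening tag appears in the output.

```python
def split_reasoning(output, thinking_forced_open=False):
    """Split model output into (reasoning, content).

    With thinking_forced_open, the opening <think> was already appended to
    the prompt by the chat template, so the output starts mid-reasoning and
    contains only the closing </think> tag.
    """
    open_tag, close_tag = "<think>", "</think>"
    if thinking_forced_open:
        head, sep, tail = output.partition(close_tag)
        if sep:  # closing tag found: everything before it is reasoning
            return head.strip(), tail.strip()
        return output.strip(), ""  # model never closed: all of it is reasoning
    if output.lstrip().startswith(open_tag):
        inner = output.lstrip()[len(open_tag):]
        head, sep, tail = inner.partition(close_tag)
        if sep:
            return head.strip(), tail.strip()
    return "", output.strip()

# Forced-open: no opening tag appears in the output itself.
print(split_reasoning("step 1... step 2...</think>The answer is 4.",
                      thinking_forced_open=True))
```

This is why a parser that only looks for a literal <think>...</think> pair sees no reasoning block at all in the forced-open case.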

Collaborator

@CISC CISC left a comment


Nits only, not required to apply any.

@danielhanchen
Contributor

danielhanchen commented Dec 16, 2025

@pwilkin Actually, I re-checked and you're correct - neither --reasoning-format deepseek-legacy nor --reasoning-format deepseek enables <think></think> parsing. Also, side note: I think --verbose-prompt is broken via llama-completion / llama-cli - it doesn't print out the previous prompt and token ids anymore :(

But yes <think> is by default prepended

@ggerganov
Member

I'm thinking there is no point in merging this PR before we fix the reasoning parsing. So let's put it on hold until we figure it out.

@aldehir
Collaborator

aldehir commented Dec 16, 2025

@ggerganov I have the changes ready, although they support reasoning + tool calling so they're not small. How would you like me to proceed?

I can provide a subset of the changes to only address the reasoning, and then add the rest in another PR.

@ggerganov
Member

@aldehir Ah great. Let's merge then and please open a PR after this with your changes. Thanks.

@danbev Merge at will

@danbev danbev merged commit 2995341 into ggml-org:master Dec 16, 2025
72 of 74 checks passed
@danbev
Member Author

danbev commented Dec 16, 2025

The models are now available on Hugging Face.
