Conversation

@arnej27959 (Contributor)

We have investigated upgrading to a newer llama.cpp, but to simplify that work it would be helpful to get these changes in first; they should have no effect on current behavior.

Thanks in advance,
-arnej (from vespa.ai)

common_chat_templates_init is already called at the end of load_model in server.hpp
@kherud (Owner)

kherud commented Jun 20, 2025

Hey @arnej27959, thanks for the PR! Looks good to me (apart from the comment), and thanks for pointing out the unused code. The original goal of the Java binding was to stay as close as possible to the llama.cpp server code, to better keep up with its fast development. It mostly replaces the HTTP layer with JNI (+ some extras like logging). I haven't looked at the llama.cpp code base in a while, so I can't judge whether that's still the best approach. Earlier we passed the model parameters as JSON to llama.cpp, but then switched to re-using the C++ CLI arg-parsing code. I think most of what you removed are remnants of the old JSON code, so it's fine to remove it 👍

@kherud kherud merged commit d82d971 into kherud:master Jun 20, 2025