
Conversation

ngxson (Collaborator) commented Dec 16, 2025

Supersedes #17694

  • Rebased to latest master
  • Removed some redundant ggml_cont calls (see the sketch below)
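For readers outside the ggml codebase, a minimal sketch of what ggml_cont does and why a redundant call is pure overhead. This is illustrative, not code from this PR:

```cpp
#include "ggml.h"
#include <cstdio>

int main() {
    struct ggml_init_params ip = {
        /*.mem_size   =*/ 16 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(ip);

    // ggml_permute returns a view: the strides change, the memory does not,
    // so the result is generally not contiguous.
    struct ggml_tensor * a = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 64, 8);
    struct ggml_tensor * p = ggml_permute(ctx, a, 1, 0, 2, 3);

    // ggml_cont materializes a contiguous copy. When the downstream op can
    // consume the non-contiguous view directly, this is an extra copy for
    // no benefit -- the kind of redundant call this PR removes.
    struct ggml_tensor * c = ggml_cont(ctx, p);

    printf("permuted contiguous: %d\n", ggml_is_contiguous(p)); // 0
    printf("cont     contiguous: %d\n", ggml_is_contiguous(c)); // 1

    ggml_free(ctx);
    return 0;
}
```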

ngxson (Collaborator, Author) commented Dec 16, 2025

@tdakhran would appreciate it if you could do a test on your side! On my side, I confirm that removing some ggml_cont calls doesn't have negative effects on Metal + CPU.

tdakhran (Contributor) commented

> @tdakhran would appreciate it if you could do a test on your side! On my side, I confirm that removing some ggml_cont calls doesn't have negative effects on Metal + CPU.

@ngxson tested with Linux x64 CPU and CUDA; both work 🎉!

tdakhran (Contributor) left a comment

Looked into it one more time; only a few nitpicks.
@ngxson FYI, I'm not sure this conformer matches the upstream conformer; it could have been modified.

params.hop_length = hparams.audio_hop_len;
params.sample_rate = hparams.audio_sample_rate;
params.center_padding = true;
params.preemph = 0.97f; // disabled

Suggested change
- params.preemph = 0.97f; // disabled
+ params.preemph = 0.97f;
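For context on the parameter itself: pre-emphasis is a first-order high-pass filter applied to the waveform before the STFT, y[t] = x[t] - preemph * x[t-1]. Whether this PR's front end applies it exactly this way is an assumption; the sketch below only illustrates what a coefficient of 0.97 means, and why the "// disabled" comment is misleading when the value is actually set:

```cpp
#include <cstddef>
#include <vector>

// Illustrative pre-emphasis filter: y[t] = x[t] - preemph * x[t-1].
// With preemph = 0.97f the filter is very much enabled; passing 0.0f
// would be the way to disable it.
std::vector<float> apply_preemphasis(const std::vector<float> & x, float preemph) {
    std::vector<float> y(x.size());
    for (size_t t = 0; t < x.size(); ++t) {
        y[t] = (t == 0) ? x[t] : x[t] - preemph * x[t - 1];
    }
    return y;
}
```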

LOG_WRN("WARN: This is an experimental CLI for testing multimodal capability.\n");
LOG_WRN(" For normal use cases, please use the standard llama-cli\n");

eval_system_prompt_if_present();

Suggested change
- eval_system_prompt_if_present();
+ if (int res = eval_system_prompt_if_present(); res) {
+     return res;
+ }
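A standalone illustration of the suggested pattern; the body of eval_system_prompt_if_present() here is a placeholder with the assumed contract (0 on success, non-zero on error), not the actual implementation:

```cpp
#include <cstdio>

// Placeholder: returns 0 on success, a non-zero error code otherwise.
static int eval_system_prompt_if_present() {
    return 0;
}

int main() {
    // C++17 if-with-initializer: `res` is scoped to the statement, and a
    // failure is propagated instead of being silently dropped, which is
    // the bug the suggestion fixes.
    if (int res = eval_system_prompt_if_present(); res) {
        return res;
    }
    printf("system prompt evaluated, entering chat loop\n");
    return 0;
}
```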

ctx.chat_history.clear();
llama_memory_clear(llama_get_memory(ctx.lctx), true);
LOG("Chat history cleared\n\n");
eval_system_prompt_if_present();

Suggested change
- eval_system_prompt_if_present();
+ if (int res = eval_system_prompt_if_present(); res) {
+     return res;
+ }

params.prompt = mtmd_default_marker() + params.prompt;
}
}
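For context: mtmd tokenization splits the prompt text on the media marker to decide where the image/audio embeddings go, so a prompt without any marker gets the default one prepended. A minimal sketch of that guard, where ensure_media_marker is a hypothetical helper name (mtmd_default_marker() is the real API):

```cpp
#include <string>

// Hypothetical helper illustrating the guard around the line above:
// if the prompt contains no media marker, prepend the default one so
// the media chunk lands before the text.
std::string ensure_media_marker(std::string prompt, const std::string & marker) {
    if (prompt.find(marker) == std::string::npos) {
        prompt = marker + prompt;
    }
    return prompt;
}
```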

Comment on lines +1663 to +1670
layer.conv_norm_w = get_tensor(string_format("convnext.%d.norm.%s", il, "weight"));
layer.conv_norm_b = get_tensor(string_format("convnext.%d.norm.%s", il, "bias"));
layer.conv_dw_w = get_tensor(string_format("convnext.%d.dw.%s", il, "weight"));
layer.conv_dw_b = get_tensor(string_format("convnext.%d.dw.%s", il, "bias"));
layer.conv_pw1_w = get_tensor(string_format("convnext.%d.pw1.%s", il, "weight"));
layer.conv_pw1_b = get_tensor(string_format("convnext.%d.pw1.%s", il, "bias"));
layer.conv_pw2_w = get_tensor(string_format("convnext.%d.pw2.%s", il, "weight"));
layer.conv_pw2_b = get_tensor(string_format("convnext.%d.pw2.%s", il, "bias"));

Not sure if these strings require their own defines.
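If defines are wanted, one possible shape, in the spirit of the TN_*-style tensor-name constants used elsewhere in the mtmd/clip code; the TN_CONVNEXT_* names below are illustrative, not existing macros:

```cpp
#include <cstdio>
#include <string>

// Illustrative only: hoist the repeated format strings into named defines
// so the layer-loading code reads as a table of tensor names.
#define TN_CONVNEXT_NORM "convnext.%d.norm.%s"
#define TN_CONVNEXT_DW   "convnext.%d.dw.%s"
#define TN_CONVNEXT_PW1  "convnext.%d.pw1.%s"
#define TN_CONVNEXT_PW2  "convnext.%d.pw2.%s"

static std::string tn(const char * fmt, int il, const char * suffix) {
    char buf[128];
    snprintf(buf, sizeof(buf), fmt, il, suffix);
    return buf;
}

int main() {
    printf("%s\n", tn(TN_CONVNEXT_NORM, 3, "weight").c_str()); // convnext.3.norm.weight
    return 0;
}
```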

Comment on lines +315 to +318
std::array<ggml_tensor *, 7> pre_encode_conv_X_w = {nullptr};
std::array<ggml_tensor *, 7> pre_encode_conv_X_b = {nullptr};
ggml_tensor * pre_encode_out_w = nullptr;
ggml_tensor * pre_encode_out_b = nullptr;

Suggested change
- std::array<ggml_tensor *, 7> pre_encode_conv_X_w = {nullptr};
- std::array<ggml_tensor *, 7> pre_encode_conv_X_b = {nullptr};
- ggml_tensor * pre_encode_out_w = nullptr;
- ggml_tensor * pre_encode_out_b = nullptr;
+ std::array<ggml_tensor *, 7> pre_encode_conv_X_w = {nullptr};
+ std::array<ggml_tensor *, 7> pre_encode_conv_X_b = {nullptr};
+ ggml_tensor * pre_encode_out_w = nullptr;
+ ggml_tensor * pre_encode_out_b = nullptr;

github-actions bot added the testing, Nvidia GPU, examples, python, and ggml labels on Dec 16, 2025