chore: ⬆️ Update TheTom/llama-cpp-turboquant to e69af784add62d5d3b3321abc0e3068df41143e7#9740
localai-bot wants to merge 1 commit into mudler:master

Conversation
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
cc @TheTom FYI — the new changes trigger compilation issues on hipblas: https://github.com/mudler/LocalAI/actions/runs/25610872523/job/75180742246?pr=9740
fixing
Commit e69af78 added 3 new dispatch entries to `fattn.cu` for the (turbo2/3/4, F16) mixed-KV combinations, plus the matching template-instance `.cu` files, but only updated `ggml/src/ggml-cuda/CMakeLists.txt`. The parallel source list in `ggml/src/ggml-hip/CMakeLists.txt` was missed, so the HIP build links without those instantiations and fails:

```
ld.lld: error: undefined symbol:
  void ggml_cuda_flash_attn_ext_vec_case<64, TURBO3_0, F16>(...)
  void ggml_cuda_flash_attn_ext_vec_case<128, TURBO3_0, F16>(...)
  void ggml_cuda_flash_attn_ext_vec_case<256, TURBO3_0, F16>(...)
  (and the same for TURBO2_0 and TURBO4_0)
clang++: error: linker command failed with exit code 1
```

Surfaced first by LocalAI's hipblas-turboquant build job (mudler/LocalAI#9740 CI). The fix is mechanical: mirror the 3 new entries from the CUDA CMakeLists into the HIP CMakeLists, paired next to their existing f16-X counterparts.
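The mirrored entries would look roughly like the fragment below. This is a sketch only — the list variable name (`GGML_SOURCES_ROCM`) and the template-instance filenames are assumptions, since the actual diff isn't shown here; the point is that the HIP CMakeLists must list the same new instance sources the CUDA CMakeLists gained.

```cmake
# ggml/src/ggml-hip/CMakeLists.txt — hypothetical fragment.
# Mirror the three new (turboN, F16) template-instance sources that
# commit e69af78 added to ggml/src/ggml-cuda/CMakeLists.txt, placed
# next to their existing f16-X counterparts. Variable and file names
# below are illustrative, not taken from the repository.
list(APPEND GGML_SOURCES_ROCM
    ../ggml-cuda/template-instances/fattn-vec-f16-instance-turbo2_0-f16.cu
    ../ggml-cuda/template-instances/fattn-vec-f16-instance-turbo3_0-f16.cu
    ../ggml-cuda/template-instances/fattn-vec-f16-instance-turbo4_0-f16.cu
)
```

Keeping the two lists entry-for-entry in sync is what prevents the undefined-symbol errors: each listed `.cu` file provides one explicit instantiation of `ggml_cuda_flash_attn_ext_vec_case` that `fattn.cu` dispatches to.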
Hey @mudler — owner of TheTom/llama-cpp-turboquant here. The CI failure on the hipblas-turboquant target is on me, not on this bump.

Root cause: in commit …

Just pushed the fix to …

Also note: while we're here, my fork picked up one more fix this morning at …

Sorry for the breakage — the CUDA-only consumers (regular CUDA build, MLX, my own Mac builds) all linked clean, so this slipped through. LocalAI's hipblas job was the first to surface it.
Changes: https://github.com/TheTom/llama-cpp-turboquant/compare/69d8e4be47243e83b3d0d71e932bc7aa61c644dc..e69af784add62d5d3b3321abc0e3068df41143e7