
Revert PR #1105 #1124

Merged
ochougul merged 2 commits into main from revert_1105 on Jun 26, 2026

@ochougul

Contributor

Summary

Validation

  • git diff --check origin/main...HEAD
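The validation step above can be scripted. The following is a minimal sketch, assuming it runs inside a checkout where `origin/main` has been fetched; the helper names `build_check_command` and `run_check` are hypothetical and not part of this repository. `git diff --check` reports whitespace errors and leftover conflict markers and exits non-zero when it finds any, and the three-dot range compares `HEAD` against its merge base with `origin/main`.

```python
import subprocess


def build_check_command(base: str = "origin/main", head: str = "HEAD") -> list:
    """Build the argv for the whitespace/conflict-marker check.

    The three-dot range diffs `head` against the merge base of `base` and `head`.
    """
    return ["git", "diff", "--check", f"{base}...{head}"]


def run_check(cwd: str = ".") -> bool:
    """Return True only when git exits with status 0.

    A non-zero status means either that problems were reported or that
    git itself failed (for example, `cwd` is not a repository).
    """
    result = subprocess.run(
        build_check_command(), cwd=cwd, capture_output=True, text=True
    )
    if result.stdout:
        # git prints one line per offending location.
        print(result.stdout, end="")
    return result.returncode == 0
```

Calling `run_check()` from the repository root would mirror the command listed above.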

Notes

ochougul added 2 commits June 25, 2026 12:14
This reverts commit 254ddf3.

Signed-off-by: ochougul <ochougul@qti.qualcomm.com>
Signed-off-by: ochougul <ochougul@qti.qualcomm.com>
@ochougul

Contributor Author
python examples/image_text_to_text/models/qwen3_5_moe/qwen3_5_moe_layerwise_decode.py

Final QPC path: {'lang_decode_qpc_path': PosixPath('/home/ochougul/.cache/qeff_models/Qwen3_5MoeForConditionalGeneration/Qwen3_5MoeDecoderWrapper-da834ad46dd74c44/qpc-0126d86fb7e68000/qpc')}
[[ 8160   579   264  7047  1817    25   271    16    13   220  2972  2014
  53983  2570  5396 64700   198   256   471  2570  2640    25   328 39113
    728   883  5941  1149   198   256   471  1061   369   264  4047  8306
   3296    11 10813  9859   364   264   638 19441 16250   466 22527   314
  16432    14 16417    13   271    17    13   220  2972 27382  1386  5141
  84457    14  1905 64700   198   256   471   353  1044  1167 16451    11
    264  3349  3992  1558  7633   539 52540  5559   579 48696 36814 11274
     13   198   256   471   353  1220  5707   303   264  2708    11 61446
     11   321 10631 11233     1]]
['Here\'s a thinking process:\n\n1.  **Analyze User Input:**\n   - User says: "Tell me about yourself."\n   - This is a common opening question, typically asking for a self-introduction or overview of capabilities/identity.\n\n2.  **Identify Key Constraints/Context:**\n   - I am Qwen, a large language model developed by Alibaba Group\'s Tongyi Lab.\n   - I should respond in a clear, concise, and helpful manner"']
Average Prefill time a.k.a TTFT is= 0.74 sec
Decode is= 19.53 token/sec
Total is= 17.19 token/sec
Total (E2E) inference time is= 5.76 sec
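As a rough cross-check of the four numbers above, the sketch below parses them from the log text and estimates the generated token count in two ways. It rests on two assumptions that the log itself does not confirm: that total throughput is tokens divided by end-to-end time, and that decode throughput covers only the time after the first token. The function name `parse_metrics` is hypothetical.

```python
import re

# Log lines copied verbatim from the comment above.
LOG = """\
Average Prefill time a.k.a TTFT is= 0.74 sec
Decode is= 19.53 token/sec
Total is= 17.19 token/sec
Total (E2E) inference time is= 5.76 sec
"""


def parse_metrics(log: str) -> dict:
    """Extract the four reported numbers from the log text."""
    patterns = {
        "ttft_sec": r"TTFT is=\s*([\d.]+)\s*sec",
        "decode_tok_per_sec": r"Decode is=\s*([\d.]+)\s*token/sec",
        "total_tok_per_sec": r"Total is=\s*([\d.]+)\s*token/sec",
        "e2e_sec": r"inference time is=\s*([\d.]+)\s*sec",
    }
    return {name: float(re.search(p, log).group(1)) for name, p in patterns.items()}


m = parse_metrics(LOG)

# Assumption: total throughput = generated tokens / end-to-end time.
tokens_from_total = m["total_tok_per_sec"] * m["e2e_sec"]

# Assumption: decode throughput is measured over the time after the first token.
tokens_from_decode = m["decode_tok_per_sec"] * (m["e2e_sec"] - m["ttft_sec"])

print(f"from total: {tokens_from_total:.1f}, from decode: {tokens_from_decode:.1f}")
```

If the two estimates differ by about one token, that would be consistent with the first token being produced during prefill, though this is an inference from the arithmetic rather than something the log states.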

@ochougul

Contributor Author

Tested on top of #1115.

python examples/image_text_to_text/models/qwen3_5_moe/qwen3_5_disagg_mode_non_layerwise.py
Prefill time : 0.06 secs
time for first run of decode with KV as input = 0.007469999021850526 sec

decode tok/sec=120.94292626786337

output
olettingا ​​ance<|im_start|>ebisminkshirelylyenkoeliveryiveryiveryive etcetc但还是的还是的还是的吗?i.eibeinstantaneousnessnessescape escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes escapes
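Output like the above, where one token repeats until the generation limit, can be flagged automatically. The sketch below is a simple heuristic, not something these example scripts provide: the names `longest_token_run` and `looks_degenerate` and the threshold `max_run=8` are hypothetical, and the check only looks at whitespace-separated words in the decoded text.

```python
def longest_token_run(text: str) -> tuple:
    """Return (token, length) for the longest run of one repeated
    whitespace-delimited token in `text`."""
    best_token, best_len = "", 0
    prev, run = None, 0
    for tok in text.split():
        # Extend the current run or start a new one.
        run = run + 1 if tok == prev else 1
        prev = tok
        if run > best_len:
            best_token, best_len = tok, run
    return best_token, best_len


def looks_degenerate(text: str, max_run: int = 8) -> bool:
    """Heuristic: flag text whose longest repeated-token run exceeds `max_run`."""
    return longest_token_run(text)[1] > max_run


# Short stand-in for the repeated tail of the output above.
sample = "i.e instantaneousness escape " + "escapes " * 20
print(longest_token_run(sample), looks_degenerate(sample))
```

A run-length check like this would miss loops that repeat a multi-word phrase, so it is best treated as a quick smoke test rather than a correctness check.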

@ochougul

Contributor Author

Tested a tiny model in non-layerwise mode.

python examples/image_text_to_text/models/qwen3_vl_moe/qwen3_vl_disagg_mode.py
Prefill qpc path {'lang_prefill_qpc_path': PosixPath('2506/Qwen3VLMoeForConditionalGeneration/Qwen3VLDecoderWrapper-a014c3c577241b90/qpc-2601804c313a27d9/qpc')}
Decode qpc path {'lang_decode_qpc_path': PosixPath('2506/Qwen3VLMoeForConditionalGeneration/Qwen3VLDecoderWrapper-a014c3c577241b90/qpc-825630414f347720/qpc')}
generation_len : 3890
[Warning]: Buffer: "mm_token_type_ids" not found
[Warning]: Buffer: "image_grid_thw" not found
Prefill time : 0.01 secs
time for first run of decode with KV as input = 0.0027511590160429478 sec

decode tok/sec=157.3850555419922

output
                                                          numRowsllearable� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand tun� Ya hệ⥄ firsthand勉强_ENTเชี่ยวชา/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机动.gg Discussions/configurationubit.stamp机 动.gg:updateriday<*分布在       parse мог飞跃ubit.stamp机动.gg:updateriday<*分布在      parse мог飞跃ubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 DienPortrait맵�ADI𝑺Ϝ/configurationubit.stamp机动.gg珫 Dien_Default đóng excerptเชี่ยวชา/configurationubit.stamp机动.gg说得ADI𝑺Ϝ/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncreaseWARD {// route/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncreaseWARD {// route/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncreaseWARD {// route/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncreaseWARD {// route/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncreaseWARD {// route/configurationubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy 
Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren olduğunu发生的fy Skyrim.uIncrease_Groupubit.stampﯙ aliuestos deren old

@ochougul ochougul merged commit ff9b1d8 into main Jun 26, 2026
6 of 8 checks passed