Pull requests: InternLM/lmdeploy

Pull requests list

fix the anthropic adapter (label: Bug:P0)
#4578 opened May 9, 2026 by lvhan028 Collaborator
[Improve]: Drain queues when sleep engine
#4577 opened May 9, 2026 by RunningLeon Collaborator
feat: configure cudagraph capture batch sizes
#4573 opened May 8, 2026 by CUHKSZzxy Collaborator Draft
[WIP]: Support mtp fp8
#4572 opened May 8, 2026 by RunningLeon Collaborator
Fix health latency under concurrent VL request preparation (label: Bug:P0)
#4570 opened May 7, 2026 by CUHKSZzxy Collaborator
LLM evaluation skill on text datasets
#4566 opened Apr 30, 2026 by lvhan028 Collaborator
Fix the reprefill of evicted seqs with invalid draft tokens
#4564 opened Apr 29, 2026 by RunningLeon Collaborator
FP8 kv cache quantization
#4563 opened Apr 29, 2026 by CUHKSZzxy Collaborator Draft
Add Qwen3.5 Moe lite awq
#4561 opened Apr 28, 2026 by 43758726 Collaborator
[Feature] Add guided decoding support for speculative decoding (label: enhancement)
#4559 opened Apr 28, 2026 by windreamer Collaborator
4 tasks done
Update turbomind modeling infrastructure (label: improvement)
#4557 opened Apr 27, 2026 by lzhangzz Collaborator
[WIP]DeepSeek V4 support
#4554 opened Apr 24, 2026 by grimoire Collaborator Draft
[Feature] Implement /v1/embeddings endpoint for OpenAI-compatible API (label: enhancement)
#4550 opened Apr 23, 2026 by ZhijunLStudio Contributor
2 of 4 tasks
bump version to v0.13.0
#4549 opened Apr 23, 2026 by lvhan028 Collaborator
Test: update video sleep/wakeup and abort scenarios
#4528 opened Apr 15, 2026 by littlegy Contributor
style: add autopep8 pre-commit hook and apply PEP 8 formatting fixes
#4524 opened Apr 14, 2026 by windreamer Collaborator
make fp8 model quantized by llm-compressor can be inferenced in turbomind (label: enhancement)
#4509 opened Apr 8, 2026 by 43758726 Collaborator
Integrate deep-ep nccl backend (label: enhancement)
#4477 opened Mar 27, 2026 by irexyc Collaborator
feat: Turbomind linear gdn prefix caching (label: enhancement)
#4465 opened Mar 25, 2026 by lapy Contributor
refactor get_ppl (label: improvement)
#4461 opened Mar 25, 2026 by lvhan028 Collaborator
Support multi stop words (label: improvement)
#4454 opened Mar 24, 2026 by lvhan028 Collaborator
[WIP] Support qwen3-omni
#4411 opened Mar 13, 2026 by CUHKSZzxy Collaborator Draft