Pull requests: vllm-project/vllm
[Misc] benchmark_moe supports expert parallel
performance
Performance-related issues
#22251
opened Aug 5, 2025 by
jeejeelee
4 tasks
[V0 Deprecation][TPU] Remove V1 flag check from tests
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#22248
opened Aug 5, 2025 by
NickLucche
[Platform] allow platform to init dp group
ready
ONLY add when PR is ready to merge/full CI is needed
#22243
opened Aug 5, 2025 by
wangxiyuan
1 of 4 tasks
[V1][SpecDecode] Support Relaxed Acceptance for thinking tokens in speculative decoding when using greedy search, as proposed by Nvidia.
v1
#22238
opened Aug 5, 2025 by
DW934
3 of 4 tasks
fix(worker): adjust memory requirement calculation for GPU worker
v1
#22237
opened Aug 5, 2025 by
MengAiDev
[Perf][Feat][Core] Workload-Aware KVCache Eviction Policy
documentation
Improvements or additions to documentation
performance
Performance-related issues
v1
#22236
opened Aug 5, 2025 by
Chasingdreams6
4 tasks done
[CI/Build] Update flashinfer to 0.2.9
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#22233
opened Aug 5, 2025 by
mgoin
Tanuj/meow penalty
ci/build
documentation
Improvements or additions to documentation
frontend
needs-rebase
tpu
Related to Google TPUs
v1
#22229
opened Aug 5, 2025 by
tanujtiwari1998
4 tasks
[Bugfix] Disable the statslogger if the api_server_count is greater than 1
frontend
#22227
opened Aug 5, 2025 by
chaunceyjiang
2 of 4 tasks
[Multimodal] Skip vision component loading for llava in text-only mode
multi-modality
Related to multi-modality (#4194)
#22224
opened Aug 5, 2025 by
sfeng33
[Frontend] Added parallel_tool_calls option on the openai API with Guided Decoding
frontend
#22218
opened Aug 4, 2025 by
cosminacho
feat: Add native support for XLM-RoBERTa embedding and BAAI/bge-reranker-v2-m3
documentation
Improvements or additions to documentation
new-model
Requests to new models
#22216
opened Aug 4, 2025 by
honghanhh
4 tasks done
[Misc] DeepGEMM : Avoid JIT generation in the hot-path
v1
#22215
opened Aug 4, 2025 by
varun-sundar-rabindranath
preload heavy modules when mp method is forkserver
frontend
#22214
opened Aug 4, 2025 by
lionelvillard
[Perf] Support topk softmax fused kernel for broader num_experts
#22211
opened Aug 4, 2025 by
shixianc
3 of 4 tasks
[Bugfix] fix hash error for chunked local attention hybrid KV
v1
#22209
opened Aug 4, 2025 by
luccafong
7 tasks done
[Refactor] Remove Unused Environment Variable VLLM_NO_DEPRECATION_WARNING
#22199
opened Aug 4, 2025 by
yewentao256
[Core] Separate MM IPC cache from processor cache
documentation
Improvements or additions to documentation
frontend
multi-modality
Related to multi-modality (#4194)
ready
ONLY add when PR is ready to merge/full CI is needed
v1
#22198
opened Aug 4, 2025 by
DarkLight1337
2 of 4 tasks