Insights: PaddlePaddle/FastDeploy
Overview
60 Pull requests merged by 26 people
-
【Fix Bug】 Fix bug in fa3 support for centralized mode
#3235 merged
Aug 6, 2025 -
[Bug fix] Test td cache messager
#3242 merged
Aug 6, 2025 -
[FIX 2.1]fix bad_words when sending requests consecutively
#3199 merged
Aug 6, 2025 -
[Feature] support seed parameter
#3161 merged
Aug 6, 2025 -
support qwen3moe
#3084 merged
Aug 6, 2025 -
[BugFix] support real batch_size
#3217 merged
Aug 6, 2025 -
add some evil cases
#3240 merged
Aug 6, 2025 -
[CI] Add ci case for min token and max token
#3229 merged
Aug 6, 2025 -
Fix the confused enable_early_stop when only set early_stop_config
#3214 merged
Aug 6, 2025 -
[Bug fix] fix bug for pd step signal
#3230 merged
Aug 6, 2025 -
[Bug fix] Fix visit zmq concurrently bug
#3233 merged
Aug 6, 2025 -
Perfect approve error message
#3224 merged
Aug 6, 2025 -
[New Feature] Support W4Afp8 MoE GroupGemm
#3171 merged
Aug 6, 2025 -
[Trace]add trace when fd start
#3174 merged
Aug 5, 2025 -
[Feature] optimize expert parallel
#3196 merged
Aug 5, 2025 -
Fix approve ci
#3212 merged
Aug 5, 2025 -
[Feature] Optimize prefix cache
#3208 merged
Aug 5, 2025 -
support qk norm for append attn
#3145 merged
Aug 5, 2025 -
[Bug Fix] Fix bug of MLA Attention Backend
#3176 merged
Aug 5, 2025 -
revise noaux_tc
#3164 merged
Aug 5, 2025 -
Ce add bad cases
#3215 merged
Aug 5, 2025 -
[BugFix] support real batch_size
#3109 merged
Aug 5, 2025 -
[BugFix]fix test_air_top_p_sampling name
#3211 merged
Aug 5, 2025 -
Ce add repetition early stop cases
#3213 merged
Aug 5, 2025 -
[Bug fix] Fix lm head bias
#3185 merged
Aug 5, 2025 -
[EP] Refactor DeepEP Engine Organization for Mixed Mode & Buffer Management Optimization
#3182 merged
Aug 5, 2025 -
[Test] scaled_gemm_f8_i4_f16 skip test while sm != 89
#3210 merged
Aug 5, 2025 -
[New Feature] fa3 supports flash mask
#3184 merged
Aug 5, 2025 -
fix coverage report
#3198 merged
Aug 5, 2025 -
add more cases
#3207 merged
Aug 5, 2025 -
[Bug Fix]Fix bug of append attention test case
#3202 merged
Aug 5, 2025 -
Fix eplb part3
#3206 merged
Aug 5, 2025 -
[Bugfix] Fix uninitialized decoded_token and add corresponding unit test
#3201 merged
Aug 5, 2025 -
Add switch to apply fine-grained per token quant fp8
#3192 merged
Aug 5, 2025 -
[Bug Fix] Fix bug of MLA Attention Backend
#3178 merged
Aug 5, 2025 -
Add more base chat cases
#3203 merged
Aug 5, 2025 -
[plugin] Custom model_runner/model support
#3186 merged
Aug 5, 2025 -
[FIX]fix bad_words when sending requests consecutively
#3197 merged
Aug 4, 2025 -
[Feature] Support ep pd with external module
#3194 merged
Aug 4, 2025 -
fix expertwise_scale
#3181 merged
Aug 4, 2025 -
[CI] add test_compare_top_logprobs
#3191 merged
Aug 4, 2025 -
[Bugfix] Fix uninitialized decoded_token and add corresponding unit t…
#3195 merged
Aug 4, 2025 -
[Bug fix] Fix cudagraph when use ep.
#3130 merged
Aug 4, 2025 -
remove useless code
#3166 merged
Aug 4, 2025 -
[Cherry-pick] Fix bug for scheduler V1
#3167 merged
Aug 4, 2025 -
【Feature】support qwen3 name_mapping
#3170 merged
Aug 4, 2025 -
【Feature】support qwen3 name_mapping
#3180 merged
Aug 4, 2025 -
【Feature】support qwen3 name_mapping
#3179 merged
Aug 4, 2025 -
Apply CI fix from Develop
#3151 merged
Aug 4, 2025 -
[XPU] Update XPU Dockerfile
#3147 merged
Aug 4, 2025 -
[Bug Fix] fix the bug in test_sampler
#3157 merged
Aug 4, 2025 -
[XPU]Fix out-of-memory issue during single-XPU deployment
#3131 merged
Aug 4, 2025 -
Fix approve shell scripts
#3108 merged
Aug 4, 2025 -
[cherry-pick]fix load_pre_sharded_checkpoint (#3152)
#3169 merged
Aug 4, 2025 -
Update test_base_chat.py
#3183 merged
Aug 4, 2025 -
[Bug Fix] fix pd disaggregated kv cache signal
#3173 merged
Aug 4, 2025 -
[Bug Fix] fix pd disaggregated kv cache signal
#3172 merged
Aug 4, 2025 -
【Feature】add fd plugins && rm model_classes
#3123 merged
Aug 4, 2025 -
fix load_pre_sharded_checkpoint
#3152 merged
Aug 4, 2025 -
Update __init__.py
#3163 merged
Aug 4, 2025
39 Pull requests opened by 30 people
-
[Code Simplification] remove cum_offsets
#3175 opened
Aug 4, 2025 -
Update GCU and ILUVATAR CI yaml
#3177 opened
Aug 4, 2025 -
[SOT] Use bool index instead of custom op
#3187 opened
Aug 4, 2025 -
[XPU]Release fastdeploy-xpu 2.1
#3188 opened
Aug 4, 2025 -
[CI] add CI logprobs case
#3189 opened
Aug 4, 2025 -
[GCU] Enable gcu CI
#3190 opened
Aug 4, 2025 -
Qwen25 branch
#3193 opened
Aug 4, 2025 -
[Feature]Pr 3186
#3200 opened
Aug 4, 2025 -
[Feature] block sparse attention
#3209 opened
Aug 5, 2025 -
[stop_seq] fix out-bound value for stop sequence
#3216 opened
Aug 5, 2025 -
test approve
#3218 opened
Aug 5, 2025 -
[feat] add metrics for yiyan adapter
#3219 opened
Aug 5, 2025 -
[Feature] support disable cache task in decode node
#3220 opened
Aug 5, 2025 -
Test CI approve
#3221 opened
Aug 5, 2025 -
Ci optimize
#3222 opened
Aug 5, 2025 -
add base test ci
#3225 opened
Aug 5, 2025 -
[WIP] Multimodal-Model support CudaGraph
#3226 opened
Aug 5, 2025 -
[fix] setting disable_chat_template while passing prompt_token_ids led to response error
#3228 opened
Aug 5, 2025 -
[Bugfix]fix noaux_tc op
#3231 opened
Aug 5, 2025 -
Fix iluvatar ci
#3232 opened
Aug 6, 2025 -
[Iluvatar GPU] Optimize attention and moe performance
#3234 opened
Aug 6, 2025 -
Fix approve ci bug
#3239 opened
Aug 6, 2025 -
[MetaxGPU] Support FastDeploy on metax gpu
#3241 opened
Aug 6, 2025 -
[bugfix]fix blockwisefp8 and all_reduce
#3243 opened
Aug 6, 2025 -
[Bug fix] fix ep lm head
#3244 opened
Aug 6, 2025 -
Wenxin tools 551-support echo in the completion API
#3245 opened
Aug 6, 2025 -
[BugFix] v1/completions add finish_reason
#3246 opened
Aug 6, 2025 -
[fix] fix completion stream api output_tokens not in usage
#3247 opened
Aug 6, 2025 -
[fix ]fix qk norm computing error
#3248 opened
Aug 6, 2025 -
[Bug fix] support logprob in scheduler v1
#3249 opened
Aug 6, 2025 -
delete parallel_state.py
#3250 opened
Aug 6, 2025 -
add custom chat template
#3251 opened
Aug 6, 2025 -
[Doc][XPU] Update deps and fix dead links
#3252 opened
Aug 6, 2025 -
[Trace]merge develop trace FD_START
#3253 opened
Aug 6, 2025 -
[BugFix] The last pack of the v1/completion stream output needs to include total_tokens
#3254 opened
Aug 6, 2025 -
[bugfix]qwen3_fix
#3255 opened
Aug 6, 2025 -
[BugFix] fix too many open files problem
#3256 opened
Aug 6, 2025 -
[Executor]Update graph test case and delete test_attention
#3257 opened
Aug 6, 2025
1 Issue closed by 1 person
-
Does fastdeploy support modelscope?
#3059 closed
Aug 6, 2025
2 Issues opened by 2 people
-
Error not find reason
#3238 opened
Aug 6, 2025 -
FastDeploy 2.0.3: error when deploying DeepSeek-R1-Distill-Qwen-32B
#3237 opened
Aug 6, 2025
15 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Move create_parameters to __init__ in FuseMOE for CultassBackend and TritonBackend
#3148 commented on
Aug 6, 2025 • 11 new comments -
I want to turn off the large language model's thinking mode. How do I do that?
#3158 commented on
Aug 4, 2025 • 0 new comments -
Is there an official Q&A group? Could you post a QR code?
#2979 commented on
Aug 5, 2025 • 0 new comments -
[Feature] mm and thinking model support structured output
#2749 commented on
Aug 6, 2025 • 0 new comments -
first commit
#3029 commented on
Aug 5, 2025 • 0 new comments -
w4afp8
#3044 commented on
Aug 6, 2025 • 0 new comments -
Launch expert_service before kv_cache initialization in worker_process
#3045 commented on
Aug 4, 2025 • 0 new comments -
[Feature] Models api
#3073 commented on
Aug 6, 2025 • 0 new comments -
Unify server-side and model-side Config(Part-5)
#3081 commented on
Aug 5, 2025 • 0 new comments -
Support for cases where input and output types are different.
#3082 commented on
Aug 6, 2025 • 0 new comments -
[Feature] optimize prefix cache
#3107 commented on
Aug 5, 2025 • 0 new comments -
New Loader Support 0.3B
#3110 commented on
Aug 5, 2025 • 0 new comments -
[Feature] multi source download
#3125 commented on
Aug 6, 2025 • 0 new comments -
[Feature] Support ep pd with external module
#3128 commented on
Aug 4, 2025 • 0 new comments -
[Feature]optimize expert parallel
#3136 commented on
Aug 5, 2025 • 0 new comments