vLLM version for inference of Qwen/Qwen3-VL-4B-Instruct-FP8 and Qwen/Qwen3-VL-4B-Instruct
#3 opened 7 days ago by saiyanhuang
VRAM usage not making sense
1
#2 opened 20 days ago by spanspek