Spaces: yusufs/llama32-3b-instruct (Paused)

yusufs committed 21 days ago

Commit ae21665 (verified) · 1 Parent(s): 852d95b

Update Dockerfile

Files changed (1): Dockerfile

@@ -62,6 +62,7 @@ RUN pip install uv setuptools
 # Install vLLM
 # RUN uv pip install --system vllm==0.10.0 torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
 # Downgrade triton because following error occured when using triton==3.3.1
+# https://github.com/vllm-project/vllm/issues/20259#issuecomment-3157159183
 # /usr/local/lib/python3.12/dist-packages/vllm/attention/ops/prefix_prefill.py:36:0: error: Failures have been detected while processing an MLIR pass pipeline
 # /usr/local/lib/python3.12/dist-packages/vllm/attention/ops/prefix_prefill.py:36:0: note: Pipeline failed while executing [`ConvertTritonGPUToLLVM` on 'builtin.module' operation]: reproducer generated at `std::errs, please share the reproducer above with Triton project.`
 # INFO:     10.16.9.222:28100 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
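
The comment in this hunk ties the MLIR pipeline failure to triton==3.3.1. A minimal sketch of a startup check that flags that version is shown below; the helper names `is_affected_triton` and `installed_triton_version` are hypothetical and not part of this Space or of vLLM, and the check only compares against the single version named in the comment.

```python
from importlib import metadata
from typing import Optional

# Version named in the Dockerfile comment as triggering the MLIR pipeline failure.
AFFECTED_TRITON_VERSION = "3.3.1"


def is_affected_triton(version: str) -> bool:
    """Return True when `version` is exactly the release named in the comment."""
    return version.strip() == AFFECTED_TRITON_VERSION


def installed_triton_version() -> Optional[str]:
    """Return the installed triton version, or None if triton is not installed."""
    try:
        return metadata.version("triton")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_triton_version()
    if found is None:
        print("triton is not installed")
    elif is_affected_triton(found):
        print(f"triton=={found} matches the version reported as failing")
    else:
        print(f"triton=={found} is not the version reported as failing")
```

Running the script inside the built image would confirm whether the downgrade took effect before the server handles its first request.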