zhanghanxiao committed (verified) · Commit d049e5f · 1 parent: 679e7cb

Update README.md

Files changed (1): README.md (+1, −1)
README.md CHANGED
@@ -230,7 +230,7 @@ Here is the example to deploy the model with multiple GPU nodes, where the maste
 # step 1. start ray on all nodes
 
 # step 2. start vllm server only on node 0:
-vllm serve $MODEL_PATH --port $PORT --served-model-name my_model --trust-remote-code --tensor-parallel-size 8 --pipeline-parallel-size 4 --gpu-memory-utilization 0.85
+vllm serve $MODEL_PATH --port $PORT --served-model-name my_model --trust-remote-code --tensor-parallel-size 32 --gpu-memory-utilization 0.85
 
 # This is only an example, please adjust arguments according to your actual environment.
 ```
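For context, the two steps referenced in the changed snippet can be sketched end to end. This is a hedged sketch, not the repository's exact instructions: `HEAD_IP`, the Ray port, `$MODEL_PATH`, and `$PORT` are placeholders, and the parallelism setting assumes a cluster of 4 nodes with 8 GPUs each (32 GPUs total), which is what the new `--tensor-parallel-size 32` implies.

```shell
# step 1. start Ray on all nodes (assumed topology: 4 nodes x 8 GPUs).
# On node 0 (the head node; 6379 is a commonly used Ray port, adjust as needed):
ray start --head --port=6379

# On every other node, join the cluster (HEAD_IP is a placeholder for node 0's address):
ray start --address="${HEAD_IP}:6379"

# step 2. start the vLLM server only on node 0.
# --tensor-parallel-size 32 shards the model across all 32 GPUs in the Ray
# cluster; the commit replaces the earlier TP=8 x PP=4 layout with pure TP=32.
vllm serve "$MODEL_PATH" \
  --port "$PORT" \
  --served-model-name my_model \
  --trust-remote-code \
  --tensor-parallel-size 32 \
  --gpu-memory-utilization 0.85
```

As the README itself notes, this is only an example; tune the port, parallelism sizes, and GPU memory utilization to your actual hardware.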