TTFT deteriorates rapidly after concurrency reaches 72

#5
by theGreatGuy - opened

When I use vLLM-benchmark to test the performance of Kimi-Dev-72B, I find that TTFT (time to first token) deteriorates rapidly once concurrency reaches 72. Does anyone know the reason?

In addition, I used EvalScope to test the model's accuracy and found that it was only 0.5488 on the HumanEval dataset.
