[Don't merge] inferentia2 workaround

#34

by philschmid - opened Apr 25, 2024

base: refs/heads/main ← from: refs/pr/34


inferentia2 workaround (1c0ab63d)

philschmid, Apr 25, 2024 (edited Apr 25, 2024)

This is a workaround for deploying Llama 3 on Inferentia with TGI. The new generation_config now has a list as eos_token_id, which causes the deployment to fail. This revision removes one of the IDs from that list.
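For anyone who wants to apply the same kind of change to a local copy, here is a minimal sketch of the idea, not the exact change in this revision. It assumes the fix is to collapse a list-valued `eos_token_id` in `generation_config.json` into a single integer. The helper names `collapse_eos_token_id` and `patch_file`, the `keep_index` parameter, and the token IDs in the demo are hypothetical, and which ID to keep is an assumption you should check against your tokenizer.

```python
import json
from pathlib import Path


def collapse_eos_token_id(config: dict, keep_index: int = 0) -> dict:
    """Return a copy of `config` whose eos_token_id is a single int.

    Assumption: keeping one entry of the list is enough for the deployment
    to load. `keep_index` selects which entry survives.
    """
    patched = dict(config)  # shallow copy so the caller's dict is untouched
    eos = patched.get("eos_token_id")
    if isinstance(eos, list):
        if not eos:
            raise ValueError("eos_token_id is an empty list")
        patched["eos_token_id"] = eos[keep_index]
    return patched


def patch_file(path: str, keep_index: int = 0) -> None:
    """Rewrite a generation_config.json file in place (hypothetical helper)."""
    p = Path(path)
    config = json.loads(p.read_text())
    patched = collapse_eos_token_id(config, keep_index)
    p.write_text(json.dumps(patched, indent=2) + "\n")


if __name__ == "__main__":
    # Placeholder IDs for illustration only, not the real Llama 3 values.
    example = {"bos_token_id": 1, "eos_token_id": [2, 3]}
    print(collapse_eos_token_id(example))
```

Dropping a stop token changes when generation ends, so this only makes sense as a temporary workaround, which is why the PR is marked as not to be merged.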

philschmid changed pull request title from inferentia2 workaround to [Don't merge] inferentia2 workaround Apr 25, 2024

Update generation_config.json (989095c4)


