Runtime error

Exit code: 1. Reason: …█████▍ | 2.39G/3.19G [00:11<00:02, 348MB/s]
model-00004-of-00005.safetensors:  94%|█████████▎| 2.99G/3.19G [00:12<00:00, 417MB/s]
model-00004-of-00005.safetensors: 100%|██████████| 3.19G/3.19G [00:12<00:00, 252MB/s]
model-00005-of-00005.safetensors:   0%|          | 0.00/1.24G [00:00<?, ?B/s]
model-00005-of-00005.safetensors:   0%|          | 143k/1.24G [00:01<2:26:07, 142kB/s]
model-00005-of-00005.safetensors:   9%|▉         | 116M/1.24G [00:02<00:18, 61.8MB/s]
model-00005-of-00005.safetensors:  25%|██▍       | 306M/1.24G [00:03<00:08, 109MB/s]
model-00005-of-00005.safetensors:  41%|████      | 507M/1.24G [00:04<00:05, 132MB/s]
model-00005-of-00005.safetensors:  95%|█████████▍| 1.18G/1.24G [00:05<00:00, 294MB/s]
model-00005-of-00005.safetensors: 100%|██████████| 1.24G/1.24G [00:05<00:00, 216MB/s]

Traceback (most recent call last):
  File "/home/user/app/app.py", line 52, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 571, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 309, in _wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4499, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2183, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2334, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
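The weight download itself completes; the crash happens in app.py at line 52, where AutoModelForCausalLM.from_pretrained is apparently called with attn_implementation="flash_attention_2" on hardware with no CUDA GPU, and FlashAttention 2 is GPU-only. A minimal sketch of a CPU-safe load follows, assuming a standard from_pretrained call; MODEL_ID and the dtype choices are placeholders, not taken from the log:

import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "org/model"  # placeholder: the actual repo id is not visible in the log

# FlashAttention 2 requires a CUDA device; fall back to PyTorch's built-in
# scaled-dot-product attention ("sdpa") when only a CPU is available.
use_cuda = torch.cuda.is_available()

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    attn_implementation="flash_attention_2" if use_cuda else "sdpa",
    torch_dtype=torch.float16 if use_cuda else torch.float32,
    device_map="auto" if use_cuda else None,
)

Alternatively, switch the Space to GPU hardware so torch can actually see a CUDA device; the guard above only keeps the app from crashing when it runs on CPU.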
