Error while deserializing header

#43
by lexipalmer13 - opened

I'm trying to run the basic documentation example for Llama-4-Scout-17B-16E. The model files themselves download fine, but when the checkpoint shards start loading I get the following error:

pipe = pipeline(
    "text-generation",
    model=model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    use_auth_token="...",
)
Fetching 50 files: 100%|██████████████████████████████████████████████████| 50/50 [00:00<00:00, 3223.86it/s]
Loading checkpoint shards: 0%| | 0/50 [00:00<?, ?it/s]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/[path]/transformers/pipelines/__init__.py", line 1027, in pipeline
framework, model = infer_framework_load_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/pipelines/base.py", line 293, in infer_framework_load_model
model = model_class.from_pretrained(model, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/models/auto/auto_factory.py", line 604, in from_pretrained
return model_class.from_pretrained(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/modeling_utils.py", line 277, in _wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/modeling_utils.py", line 5051, in from_pretrained
) = cls._load_pretrained_model(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/modeling_utils.py", line 5471, in _load_pretrained_model
_error_msgs, disk_offload_index = load_shard_file(args)
^^^^^^^^^^^^^^^^^^^^^
File "/[path]/transformers/modeling_utils.py", line 835, in load_shard_file
state_dict = load_state_dict(
^^^^^^^^^^^^^^^^
File "/[path]/transformers/modeling_utils.py", line 484, in load_state_dict
with safe_open(checkpoint_file, framework="pt") as f:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
safetensors_rust.SafetensorError: Error while deserializing header: header too small
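For reference, a `.safetensors` file begins with an 8-byte little-endian length prefix followed by that many bytes of JSON metadata, so "header too small" usually means a shard was truncated or corrupted during download. The sketch below (a hypothetical helper, not part of the original report) checks that prefix with only the standard library, which can identify which shard file in the cache is damaged and needs re-downloading:

```python
import json
import struct
from pathlib import Path

def check_safetensors_header(path):
    """Return None if the header parses, else a string describing the problem.

    Layout assumed: 8-byte little-endian u64 header length N, then N bytes
    of JSON metadata. A truncated download fails one of these checks.
    """
    path = Path(path)
    size = path.stat().st_size
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return "file shorter than the 8-byte length prefix"
        (header_len,) = struct.unpack("<Q", prefix)
        if 8 + header_len > size:
            return (f"header claims {header_len} bytes but only "
                    f"{size - 8} bytes follow the prefix")
        try:
            json.loads(f.read(header_len))
        except ValueError:
            return "header bytes are not valid JSON"
    return None

# Example: scan every shard in a local snapshot directory (path is illustrative).
# for shard in Path("~/.cache/huggingface/hub").expanduser().rglob("*.safetensors"):
#     problem = check_safetensors_header(shard)
#     if problem:
#         print(f"{shard}: {problem}")
```

Deleting any shard this flags and re-running the download should let the pipeline load past the failing file.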