SageMaker deployment error

#170
by hobeid - opened

Hello,
I'm trying to deploy and use this model on SageMaker with the provided script, but once the model is deployed and I try to test it, I get the following error:

```
Received client error (400) from primary with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "auto not supported. Supported strategies are: balanced"
}
```
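For context, the deployment code I'm running is essentially the generated "Deploy on Amazon SageMaker" snippet from the model page; roughly the sketch below, where the model ID, instance type, and version pins are placeholders rather than my exact values:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Hub model configuration ("<model-id>" is a placeholder for this model's repo ID)
hub = {
    "HF_MODEL_ID": "<model-id>",
    "HF_TASK": "text-to-image",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.49",  # placeholder pins; the container logs show Python 3.12
    pytorch_version="2.6",
    py_version="py312",
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # placeholder instance type
)

# This invocation is what returns the 400 above
predictor.predict({"inputs": "a photo of an astronaut riding a horse"})
```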

These are the CloudWatch logs:

```
Model is not initialized, will try to load model again.
Please consider increase wait time for model loading.

No inference script implementation was found at `inference`. Default implementation of all functions will be used.
Prediction error
Traceback (most recent call last):
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 255, in handle
    self.initialize(context)
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 90, in initialize
    self.model = self.load(*([self.model_dir] + self.load_extra_arg))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 122, in load
    hf_pipeline = get_pipeline(task=os.environ["HF_TASK"], model_dir=model_dir, device=self.device)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/transformers_utils.py", line 287, in get_pipeline
    hf_pipeline = get_diffusers_pipeline(task=task, model_dir=model_dir, device=device, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/diffusers_utils.py", line 74, in get_diffusers_pipeline
    pipeline = DIFFUSERS_TASKS[task](model_dir=model_dir, device=device)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/diffusers_utils.py", line 42, in __init__
    self.pipeline = AutoPipelineForText2Image.from_pretrained(model_dir, torch_dtype=dtype, device_map=device_map)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/diffusers/pipelines/auto_pipeline.py", line 428, in from_pretrained
    return text_2_image_cls.from_pretrained(pretrained_model_or_path, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/diffusers/pipelines/pipeline_utils.py", line 710, in from_pretrained
    raise NotImplementedError(
NotImplementedError: auto not supported. Supported strategies are: balanced

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.12/site-packages/mms/service.py", line 108, in predict
    ret = self._entry_point(input_batch, self.context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 279, in handle
    raise PredictionException(str(e), 400)
mms.service.PredictionException: auto not supported. Supported strategies are: balanced : 400
Backend response time: 2
/169.254.178.2:43052 "POST /invocations HTTP/1.1" 400 4
```
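If I'm reading the traceback right, the toolkit's default diffusers handler ends up calling `AutoPipelineForText2Image.from_pretrained` with `device_map="auto"`, which the installed diffusers version rejects. Reduced to a minimal sketch (`"<model-dir>"` is a placeholder for the unpacked model artifacts):

```python
import torch
from diffusers import AutoPipelineForText2Image

# Equivalent of diffusers_utils.py line 42 in the toolkit's default handler
pipe = AutoPipelineForText2Image.from_pretrained(
    "<model-dir>",
    torch_dtype=torch.float16,  # assumed dtype; the handler chooses one internally
    device_map="auto",          # raises NotImplementedError: only "balanced" is supported
)
```

Is there a recommended way to avoid this, e.g. via a custom inference script or different environment settings?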
