Request for quantized version for 24GB VRAM (and also a gradio GUI demo script)

#4 opened by mingyi456

It would be great if this model could squeeze into 24GB of VRAM with 8-bit quantization, either int8 or fp8. Also, a Python script that provides a Gradio GUI would be much more convenient than running the code manually.

inclusionAI org

The quantized version and demo are on the way; please refer to our response in the GitHub issue: https://github.com/inclusionAI/Ming/issues/12#issuecomment-2977029447
