ONNX flavor of https://huggingface.co/openai/gpt-oss-20b.

The ONNX model uses int4 quantization.

With the embeddings pinned to the CPU, it runs well on 12 GB GPUs.
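Int4 quantization packs two 4-bit weights into each byte, cutting weight storage to a quarter of fp16. A minimal numpy sketch of the idea, using symmetric per-tensor scaling (illustrative only; the actual quantization scheme of this model may differ, e.g. in grouping and scale granularity):

```python
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Symmetric per-tensor int4 quantization: map floats to [-8, 7]."""
    scale = np.abs(weights).max() / 7.0
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def pack_int4(q: np.ndarray) -> np.ndarray:
    """Pack pairs of int4 values into single bytes (low nibble first)."""
    nibbles = (q.reshape(-1, 2) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return nibbles[:, 0] | (nibbles[:, 1] << 4)

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Recover signed int4 values from packed bytes."""
    lo = (packed & 0x0F).astype(np.int8)
    hi = ((packed >> 4) & 0x0F).astype(np.int8)
    q = np.stack([lo, hi], axis=1).reshape(-1)
    return np.where(q > 7, q - 16, q)  # sign-extend the 4-bit values

w = np.array([0.5, -0.25, 1.0, -1.0], dtype=np.float32)
q, scale = quantize_int4(w)
restored = unpack_int4(pack_int4(q)) * scale  # dequantized weights, error <= scale/2
```

Four fp32 weights (16 bytes) pack into 2 bytes plus one scale; at inference time the packed weights are dequantized back to floats for the matmuls.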

Model tree for onnx-community/gpt-oss-20b-ONNX

Base model: openai/gpt-oss-20b (this model is a quantized variant)