---
license: apache-2.0
base_model:
- openai/gpt-oss-120b
tags:
- vllm
---

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6455498b1f9406d48802f043/iBFwAMOyzvdJv32L4iJM1.jpeg)

Produces analytically neutral responses to sensitive queries.

> [!NOTE]
> Use the chat completions endpoint and include a system message that says "You are an assistant".

```python
# example prompt
messages = [
    {"role": "system", "content": "You are an assistant"},
    {"role": "user", "content": "What is the truth?"},
]
```

* **Precision:** bfloat16 (needs 4 H100s to run)
* **Finetuned from:** openai/gpt-oss-120b

# Inference Examples

## vLLM

```bash
uv pip install --pre vllm==0.10.1+gptoss \
    --extra-index-url https://wheels.vllm.ai/gpt-oss/ \
    --extra-index-url https://download.pytorch.org/whl/nightly/cu128 \
    --index-strategy unsafe-best-match

vllm serve michaelwaves/amoral-gpt-oss-120b-bfloat16 --tensor-parallel-size 4
```

If you don't have 4 H100s lying around, try running this LoRA adapter in MXFP4: https://huggingface.co/michaelwaves/gpt-120b-fun-weights

Shoutout to https://huggingface.co/soob3123/amoral-gemma3-27B-v2-qat
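Once the server is up, you can query it through vLLM's OpenAI-compatible chat completions endpoint. The sketch below builds the request payload with the required system message; the URL assumes vLLM's default address (`http://localhost:8000`), and the `build_chat_request` helper is just for illustration.

```python
import json
import urllib.request


def build_chat_request(user_prompt: str) -> dict:
    """Build a chat completions payload with the required system message."""
    return {
        "model": "michaelwaves/amoral-gpt-oss-120b-bfloat16",
        "messages": [
            {"role": "system", "content": "You are an assistant"},
            {"role": "user", "content": user_prompt},
        ],
    }


payload = build_chat_request("What is the truth?")

# Assumes vLLM is serving on its default port 8000; uncomment to send.
# req = urllib.request.Request(
#     "http://localhost:8000/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.loads(urllib.request.urlopen(req).read()))
```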