Model Card for GPT-OSS-120B

Model Details

Model Description

GPT-OSS-120B is a 120-billion-parameter generative language model built on the transformer architecture. It is one of the largest openly available language models and is designed for a wide range of natural language processing tasks, including text generation, summarization, question answering, and creative writing. This repository hosts a 4-bit (MXFP4-Q4) conversion of the model for use with MLX on Apple Silicon.

  • Developed by: OpenAI (base model); MLX conversion by the MLX Community
  • Model type: Transformer-based language model (mixture-of-experts decoder)
  • Language(s): English
  • License: Apache 2.0
  • Converted from: openai/gpt-oss-120b

Uses

Direct Use

The model can be used for:

  • Text generation and completion
  • Content summarization
  • Question answering
  • Creative writing and storytelling
  • Code generation and explanation
  • Educational content creation

Downstream Use

The model can be fine-tuned for the following (a data-preparation sketch follows this list):

  • Specialized domain applications
  • Chatbots and conversational AI
  • Content moderation
  • Sentiment analysis
  • Language translation
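
As a concrete starting point for the fine-tuning uses above, the sketch below writes a toy dataset in a JSONL layout that mlx_lm's LoRA tooling can consume. The file names, example records, and the CLI invocation in the trailing comment are illustrative assumptions, not part of this card; consult the mlx-lm documentation for the formats your installed version accepts.

import json
from pathlib import Path

# Hypothetical toy dataset; replace with your own domain examples.
examples = [
    {"prompt": "Classify the sentiment: 'Great battery life.'",
     "completion": "positive"},
    {"prompt": "Classify the sentiment: 'The screen cracked in a week.'",
     "completion": "negative"},
]

data_dir = Path("data")
data_dir.mkdir(exist_ok=True)

# Write {"text": ...} records to train.jsonl (assumed layout).
with open(data_dir / "train.jsonl", "w") as f:
    for ex in examples:
        record = {"text": f"{ex['prompt']}\n{ex['completion']}"}
        f.write(json.dumps(record) + "\n")

# Training is then typically launched from the command line, e.g.:
#   mlx_lm.lora --model mlx-community/gpt-oss-120b-MXFP4-Q4 --train --data data
# (flags are assumptions; check `mlx_lm.lora --help`).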

Out-of-Scope Use

The model should not be used for:

  • Generating harmful, abusive, or unethical content
  • Medical or legal advice without human supervision
  • Critical decision-making systems without human oversight
  • Generating misinformation or fake content
  • Impersonation of individuals without consent

Bias, Risks, and Limitations

GPT-OSS-120B may exhibit biases present in its training data. Users should be aware of potential issues including:

  • Social, racial, and gender biases
  • Political and cultural biases
  • Factual inaccuracies in generated content
  • Potential for generating plausible but incorrect information
  • Sensitivity to prompt phrasing

Recommendations

Users should:

  • Verify important facts generated by the model
  • Use human oversight for critical applications
  • Consider potential biases when deploying the model
  • Implement content filtering where appropriate (a minimal filtering sketch follows this list)
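
One lightweight way to implement such filtering is a post-hoc check wrapped around generation. The sketch below is a minimal illustration only; BLOCKED_TERMS and safe_generate are hypothetical names, and a production deployment should use a dedicated moderation model or service rather than substring matching.

from mlx_lm import load, generate

# Hypothetical blocklist; substring matching is far too crude for real use.
BLOCKED_TERMS = ["example-banned-phrase"]

def safe_generate(model, tokenizer, prompt, max_tokens=500):
    """Generate text, withholding the response if it trips the blocklist."""
    text = generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
    if any(term in text.lower() for term in BLOCKED_TERMS):
        return "[response withheld by content filter]"
    return text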

How to Get Started with the Model

Use the code below to get started with the model:

from mlx_lm import load, generate

# Load the model and tokenizer
model, tokenizer = load("mlx-community/gpt-oss-120b-MXFP4-Q4")

# Generate text
messages = [{"role": "user", "content": "Explain quantum computing in simple terms."}]
formatted_prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

response = generate(
    model,
    tokenizer,
    prompt=formatted_prompt,
    max_tokens=500,
    verbose=False
)
print(response)
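
For interactive applications, mlx_lm also exposes a streaming API. The sketch below assumes a recent mlx-lm release in which stream_generate yields response chunks carrying a .text field; older versions yielded plain strings, so adjust accordingly.

from mlx_lm import load, stream_generate

model, tokenizer = load("mlx-community/gpt-oss-120b-MXFP4-Q4")

messages = [{"role": "user", "content": "Explain quantum computing in simple terms."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Print tokens as they are produced instead of waiting for the full response.
for chunk in stream_generate(model, tokenizer, prompt=prompt, max_tokens=500):
    print(chunk.text, end="", flush=True)
print()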

Training Details

Training Data

The base model was trained on a diverse dataset of text from publicly available sources, including:

  • Web pages (Common Crawl)
  • Books
  • Academic papers
  • Code repositories
  • News articles

Training Procedure

  • Architecture: Transformer decoder (mixture-of-experts)
  • Parameters: 120 billion total (roughly 5.1 billion active per token)
  • Precision: 4-bit quantized (MXFP4-Q4; see the quantization sketch after this list)
  • Context length: 131,072 tokens
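
To make the 4-bit precision entry concrete, the sketch below illustrates the general idea behind group-wise 4-bit quantization: each group of weights is mapped to 16 integer levels plus a per-group scale and offset. This is a simplified affine scheme for intuition only, not the actual MXFP4 format.

import numpy as np

def quantize_4bit(weights, group_size=32):
    """Toy group-wise 4-bit affine quantization (illustrative, not MXFP4)."""
    w = weights.reshape(-1, group_size)
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    # 16 levels map to integers 0..15; the epsilon guards constant groups.
    scale = np.maximum((w_max - w_min) / 15.0, 1e-12)
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize_4bit(q, scale, w_min):
    return q * scale + w_min

w = np.random.randn(4, 64).astype(np.float32)
q, scale, w_min = quantize_4bit(w.reshape(-1))
w_hat = dequantize_4bit(q, scale, w_min).reshape(w.shape)
print("max reconstruction error:", np.abs(w - w_hat).max())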

Evaluation

Results

The model demonstrates strong performance on:

  • Language understanding tasks
  • Creative writing
  • Technical explanation
  • Code generation
  • Multi-step reasoning

Evaluation Factors

  • Perplexity on held-out test sets (a computation sketch follows this list)
  • Human evaluation of generated content
  • Task-specific benchmarks
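
For reference, the snippet below shows the standard perplexity computation: the exponentiated mean negative log-likelihood over a held-out token sequence. The per-token log-probabilities are placeholder values; in practice they come from scoring real text with the model.

import math

# Placeholder per-token log-probabilities (natural log) for a held-out sequence.
token_log_probs = [-2.1, -0.4, -1.3, -0.9, -3.0]

nll = -sum(token_log_probs) / len(token_log_probs)  # mean negative log-likelihood
perplexity = math.exp(nll)
print(f"perplexity = {perplexity:.2f}")             # lower is better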

Environmental Impact

  • Hardware Type: Apple Silicon (M-series) for inference; original training hardware not specified
  • Hours used: Training details not specified
  • Cloud Provider: Not applicable
  • Compute Region: Not specified
  • Carbon Emitted: Information not available

Technical Specifications

Model Architecture and Objective

GPT-OSS-120B uses a transformer decoder architecture with:

  • 120 billion parameters (mixture-of-experts)
  • 4-bit quantization (MXFP4-Q4)
  • Rotary positional embeddings (see the sketch after this list)
  • o200k_harmony tokenizer with a vocabulary of roughly 201,000 tokens
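
For readers unfamiliar with rotary positional embeddings, the sketch below shows the core operation: pairs of feature dimensions are rotated by a position-dependent angle before attention. It is a minimal reference implementation using one common pairing convention, independent of this model's exact configuration.

import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional embeddings to x of shape (seq_len, dim)."""
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per feature pair, as in the RoPE paper.
    freqs = base ** (-np.arange(half) / half)
    angles = np.outer(np.arange(seq_len), freqs)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

q = np.random.randn(8, 64)  # 8 positions, one 64-dimensional attention head
q_rot = rope(q)
print(q_rot.shape)          # (8, 64)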

Compute Infrastructure

  • Hardware: Optimized for Apple Silicon with MLX
  • Training Infrastructure: Not specified

Citation

BibTeX:

@misc{gpt-oss-120b,
  title = {GPT-OSS-120B: A 120B Parameter Open Language Model},
  author = {MLX Community},
  year = {2025},
  howpublished = {\url{https://huggingface.co/mlx-community/gpt-oss-120b-MXFP4-Q4}},
}

Glossary

  • Transformer: Neural network architecture using self-attention mechanisms
  • Quantization: Technique to reduce model size by using lower precision numbers
  • MLX: Machine learning framework for Apple Silicon

More Information

For more information about the model, training process, or usage guidelines, please refer to the documentation on the Hugging Face model page.

Model Card Authors: MLX Community

Model Card Contact: For questions about this model card, please use the discussion forum on the Hugging Face model page.
