RewardAnything-8B-v1-f32-GGUF
RewardAnything-8B-v1 is a generalizable, principle-following reward model with 8B parameters, built on Qwen3-8B. It interprets and applies natural-language principles directly at inference time, so evaluation criteria can be changed dynamically without retraining. The model achieves state-of-the-art results on RM-Bench and RABench, generalizes to new, unseen reward principles, and produces transparent reasoning to explain its decisions. It integrates efficiently with standard RLHF pipelines (PPO, GRPO) and supports flexible deployment for local use, batch inference, or direct Hugging Face integration. It is released under the Apache 2.0 license for both research and production-scale applications.
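As a rough illustration of principle-conditioned judging with one of the GGUF files listed under Model Files below, here is a minimal sketch using llama-cpp-python. The principle, prompt, and response pair are made up for the example, and the generic chat-style judging prompt is an assumption; the exact input format RewardAnything expects may differ (see the upstream RewardAnything model card).

```python
# Minimal sketch: querying the GGUF locally with llama-cpp-python.
# The judging prompt below is illustrative only; the exact prompt format
# expected by RewardAnything may differ from this generic chat prompt.
from llama_cpp import Llama

llm = Llama(
    model_path="RewardAnything-8B-v1.BF16.gguf",  # any quant from the Model Files table
    n_ctx=8192,
    n_gpu_layers=-1,  # offload all layers if a GPU is available
)

principle = "Prefer responses that are concise and factually accurate."
prompt = (
    f"Principle: {principle}\n\n"
    "Prompt: What is the capital of France?\n\n"
    "Response A: Paris.\n"
    "Response B: The capital of France is Paris, a city also famous for its museums, "
    "cuisine, and a long digression about its history...\n\n"
    "Judge which response better follows the principle and explain your reasoning."
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=512,
    temperature=0.0,
)
print(out["choices"][0]["message"]["content"])
```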
Execute using Ollama

Run the BF16 quant directly from the Hub:

```bash
ollama run hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16
```
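Once the model is pulled and served by Ollama, it can also be queried programmatically. A minimal sketch using the `ollama` Python client, assuming a local Ollama server and the same style of principle-conditioned prompt as above:

```python
# Minimal sketch: querying the Ollama-served model from Python.
# Assumes `pip install ollama` and an Ollama server running locally.
import ollama

response = ollama.chat(
    model="hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16",
    messages=[{
        "role": "user",
        "content": (
            "Principle: Prefer the more concise, factually accurate response.\n\n"
            "Prompt: What is the capital of France?\n"
            "Response A: Paris.\n"
            "Response B: The capital of France is Paris, which is also known for...\n\n"
            "Judge which response better follows the principle and explain why."
        ),
    }],
)
print(response["message"]["content"])
```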
Model Files
| File Name | Quant Type | File Size |
|---|---|---|
| RewardAnything-8B-v1.BF16.gguf | BF16 | 16.4 GB |
| RewardAnything-8B-v1.F16.gguf | F16 | 16.4 GB |
| RewardAnything-8B-v1.F32.gguf | F32 | 32.8 GB |
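To fetch one of these files programmatically rather than through Ollama, a short sketch using `huggingface_hub` (the repo id is taken from the Ollama command above):

```python
# Minimal sketch: downloading a specific quant with huggingface_hub.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="prithivMLmods/RewardAnything-8B-v1-f32-GGUF",
    filename="RewardAnything-8B-v1.BF16.gguf",  # or the F16 / F32 file from the table
)
print(gguf_path)  # local cache path, ready to pass to llama.cpp or llama-cpp-python
```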
Quants Usage
The files above are sorted by size, not necessarily by quality. IQ quants are often preferable to similarly sized non-IQ quants.
