RewardAnything-8B-v1-f32-GGUF

RewardAnything-8B-v1 is a generalizable, principle-following reward model with 8B parameters, built on Qwen3-8B. It interprets and applies natural-language principles directly at inference time, adapting to diverse evaluation criteria without retraining. The model achieves state-of-the-art results on RM-Bench and RABench, generalizes to new, unseen reward principles, and produces transparent reasoning to explain its decisions. It works efficiently with standard RLHF pipelines (PPO, GRPO) and offers flexible deployment: local use, batch inference, or direct Hugging Face integration. It is released under the Apache 2.0 license for both research and production-scale applications.
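Principle-following reward models score responses against an evaluation principle stated in plain language. A minimal sketch of how one might assemble such a judge prompt; the template below is an assumption for illustration only (the upstream RewardAnything project defines its own official chat format):

```python
def build_judge_prompt(principle: str, prompt: str,
                       response_a: str, response_b: str) -> str:
    """Assemble an illustrative principle-following judge prompt.

    NOTE: this template is hypothetical; consult the RewardAnything
    repository for the model's actual expected input format.
    """
    return (
        f"Principle: {principle}\n\n"
        f"Prompt: {prompt}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "Which response better follows the principle? Explain your reasoning."
    )

judge_prompt = build_judge_prompt(
    "Prefer concise, factually grounded answers.",
    "What causes tides?",
    "The Moon's gravity pulls on Earth's oceans.",
    "Tides are mysterious and nobody knows.",
)
print(judge_prompt.splitlines()[0])
# → Principle: Prefer concise, factually grounded answers.
```

Because the principle is part of the input rather than baked into the weights, swapping in a new evaluation criterion is just a string change.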

Execute using Ollama

Run:

ollama run hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16
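Once pulled, Ollama also serves the model over its local HTTP API (default `http://localhost:11434`). A minimal sketch of building the JSON body for the `/api/generate` endpoint, reusing the model tag from the `ollama run` command above:

```python
import json

MODEL_TAG = "hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16"

def build_generate_request(prompt: str) -> str:
    """Build the JSON body for a POST to Ollama's /api/generate endpoint."""
    payload = {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,  # return one complete response, not a token stream
    }
    return json.dumps(payload)

body = build_generate_request("Principle: reward honest answers. ...")
# Send with e.g. requests.post("http://localhost:11434/api/generate", data=body)
print(json.loads(body)["model"])
# → hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16
```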

Model Files

| File Name | Quant Type | File Size |
|---|---|---|
| RewardAnything-8B-v1.BF16.gguf | BF16 | 16.4 GB |
| RewardAnything-8B-v1.F16.gguf | F16 | 16.4 GB |
| RewardAnything-8B-v1.F32.gguf | F32 | 32.8 GB |
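As a rough rule of thumb, a GGUF file's size approximates the memory needed for the weights alone; the KV cache and runtime overhead come on top. A small illustrative helper for picking a file, using the sizes from the table above (the 2 GiB headroom figure is a rough assumption, not a measured value):

```python
# File sizes (GB) taken from the table above.
QUANTS = {
    "RewardAnything-8B-v1.BF16.gguf": 16.4,
    "RewardAnything-8B-v1.F16.gguf": 16.4,
    "RewardAnything-8B-v1.F32.gguf": 32.8,
}

def largest_fitting_quant(available_gb: float, headroom_gb: float = 2.0):
    """Pick the largest quant whose weights fit in available memory,
    leaving headroom for the KV cache and overhead (rough guess)."""
    fitting = {name: size for name, size in QUANTS.items()
               if size + headroom_gb <= available_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(largest_fitting_quant(40.0))
# → RewardAnything-8B-v1.F32.gguf
```

With 24 GB available, only the 16.4 GB files fit; below about 18 GB, none of the listed files does.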

Quants Usage

(Sorted by size, not necessarily quality. IQ-quants are often preferable over similar-sized non-IQ quants.)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).


Model tree for prithivMLmods/RewardAnything-8B-v1-f32-GGUF

Base model: Qwen/Qwen3-8B-Base → Finetuned: Qwen/Qwen3-8B → Quantized: this model