RewardAnything-8B-v1-f32-GGUF

RewardAnything-8B-v1 is a generalizable, principle-following reward model with 8B parameters, built on Qwen3-8B. It interprets and applies natural-language principles directly at inference time, adapting to diverse evaluation criteria without retraining. The model achieves state-of-the-art results on RM-Bench and RABench, generalizes to new, unseen reward principles, and produces transparent reasoning to explain its decisions. It works efficiently with standard RLHF pipelines (PPO, GRPO) and offers flexible deployment: local use, batch inference, or direct Hugging Face integration. It is released under the Apache 2.0 license for both research and production-scale applications.
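Principle-following reward models score responses against an evaluation principle stated in plain language. A minimal sketch of how one might assemble such a judge prompt; the template below is an assumption for illustration only (the upstream RewardAnything project defines its own official chat format):

```python
def build_judge_prompt(principle: str, prompt: str,
                       response_a: str, response_b: str) -> str:
    """Assemble an illustrative principle-following judge prompt.

    NOTE: this template is hypothetical; consult the RewardAnything
    repository for the model's actual expected input format.
    """
    return (
        f"Principle: {principle}\n\n"
        f"Prompt: {prompt}\n\n"
        f"Response A: {response_a}\n\n"
        f"Response B: {response_b}\n\n"
        "Which response better follows the principle? Explain your reasoning."
    )

judge_prompt = build_judge_prompt(
    "Prefer concise, factually grounded answers.",
    "What causes tides?",
    "The Moon's gravity pulls on Earth's oceans.",
    "Tides are mysterious and nobody knows.",
)
print(judge_prompt.splitlines()[0])
# → Principle: Prefer concise, factually grounded answers.
```

Because the principle is part of the input rather than baked into the weights, swapping in a new evaluation criterion is just a string change.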

Execute using Ollama

Run:

ollama run hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16
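Once pulled, Ollama also serves the model over its local HTTP API (default `http://localhost:11434`). A minimal sketch of building the JSON body for the `/api/generate` endpoint, reusing the model tag from the `ollama run` command above:

```python
import json

MODEL_TAG = "hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16"

def build_generate_request(prompt: str) -> str:
    """Build the JSON body for a POST to Ollama's /api/generate endpoint."""
    payload = {
        "model": MODEL_TAG,
        "prompt": prompt,
        "stream": False,  # return one complete response, not a token stream
    }
    return json.dumps(payload)

body = build_generate_request("Principle: reward honest answers. ...")
# Send with e.g. requests.post("http://localhost:11434/api/generate", data=body)
print(json.loads(body)["model"])
# → hf.co/prithivMLmods/RewardAnything-8B-v1-f32-GGUF:BF16
```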

Model Files

| File Name | Quant Type | File Size |
|---|---|---|
| RewardAnything-8B-v1.BF16.gguf | BF16 | 16.4 GB |
| RewardAnything-8B-v1.F16.gguf | F16 | 16.4 GB |
| RewardAnything-8B-v1.F32.gguf | F32 | 32.8 GB |
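As a rough rule of thumb, a GGUF file's size approximates the memory needed for the weights alone; the KV cache and runtime overhead come on top. A small illustrative helper for picking a file, using the sizes from the table above (the 2 GiB headroom figure is a rough assumption, not a measured value):

```python
# File sizes (GB) taken from the table above.
QUANTS = {
    "RewardAnything-8B-v1.BF16.gguf": 16.4,
    "RewardAnything-8B-v1.F16.gguf": 16.4,
    "RewardAnything-8B-v1.F32.gguf": 32.8,
}

def largest_fitting_quant(available_gb: float, headroom_gb: float = 2.0):
    """Pick the largest quant whose weights fit in available memory,
    leaving headroom for the KV cache and overhead (rough guess)."""
    fitting = {name: size for name, size in QUANTS.items()
               if size + headroom_gb <= available_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(largest_fitting_quant(40.0))
# → RewardAnything-8B-v1.F32.gguf
```

With 24 GB available, only the 16.4 GB files fit; below about 18 GB, none of the listed files does.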

Quants Usage

(Sorted by size, not necessarily quality. IQ-quants are often preferable over similar-sized non-IQ quants.)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).


Model tree for prithivMLmods/RewardAnything-8B-v1-f32-GGUF

Base model: Qwen/Qwen3-8B-Base → Finetuned: Qwen/Qwen3-8B → Quantized: this model