Weaver Collection
Models and datasets for "Weaver: Shrinking the Generation-Verification Gap with Weak Verifiers".
A general-purpose distilled cross-encoder model based on ModernBERT-large, trained to predict the correctness of reasoning responses across multiple domains: mathematics (MATH500), science (GPQA), and academic knowledge (MMLU-Pro). This specialized verifier was trained on Weaver scores aggregated over 35 different verifiers and reward models.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "hazyresearch/Weaver_Distilled_All_Datasets_ModernBERT-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

# Example usage - works across math, science, and academic domains
instruction = "Which of the following is a characteristic of prokaryotic cells? A) Nucleus B) Mitochondria C) Ribosomes D) Endoplasmic reticulum"
response = "The answer is C) Ribosomes. Prokaryotic cells lack membrane-bound organelles like nuclei, mitochondria, and endoplasmic reticulum, but they do contain ribosomes for protein synthesis."

# Tokenize the (instruction, response) pair as a single cross-encoder input
inputs = tokenizer(
    instruction,
    response,
    truncation=True,
    max_length=4096,
    padding=True,
    return_tensors="pt",
)

# Get correctness score (sigmoid of the model's single logit)
with torch.no_grad():
    outputs = model(**inputs)
    score = torch.sigmoid(outputs.logits).item()

print(f"Correctness score: {score:.3f}")
print(f"Prediction: {'Correct' if score > 0.5 else 'Incorrect'}")
This model was trained using the Weaver distillation pipeline on a combined dataset spanning multiple reasoning domains. For training your own distilled models, see the distillation README.
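At its core, such a distillation reduces to a sequence-classification fine-tune of the base model against soft Weaver scores. The sketch below illustrates that setup under stated assumptions: the (instruction, response, weaver_score) data format, the hyperparameters, and the BCE objective are illustrative, not the pipeline's exact configuration.

import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical training triples; the real pipeline's data format may differ
examples = [
    {"instruction": "...", "response": "...", "weaver_score": 0.92},
    # ...
]

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-large", num_labels=1
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def collate(batch):
    # Encode each (instruction, response) pair as one cross-encoder input
    enc = tokenizer(
        [ex["instruction"] for ex in batch],
        [ex["response"] for ex in batch],
        truncation=True,
        max_length=4096,
        padding=True,
        return_tensors="pt",
    )
    enc["labels"] = torch.tensor([ex["weaver_score"] for ex in batch])
    return enc

loader = DataLoader(examples, batch_size=8, shuffle=True, collate_fn=collate)
loss_fn = torch.nn.BCEWithLogitsLoss()  # regress the soft Weaver score

model.train()
for batch in loader:
    labels = batch.pop("labels")
    logits = model(**batch).logits.squeeze(-1)
    loss = loss_fn(logits, labels)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()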
Citation
@misc{saadfalcon2025shrinkinggenerationverificationgapweak,
  title={Shrinking the Generation-Verification Gap with Weak Verifiers},
  author={Jon Saad-Falcon and E. Kelly Buchanan and Mayee F. Chen and Tzu-Heng Huang and Brendan McLaughlin and Tanvir Bhathal and Shang Zhu and Ben Athiwaratkun and Frederic Sala and Scott Linderman and Azalia Mirhoseini and Christopher Ré},
  year={2025},
  eprint={2506.18203},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2506.18203},
}
Base model: answerdotai/ModernBERT-large