DeepNews - Qwen3-32B for News Credibility Analysis
This is a LoRA fine-tune of Qwen3-32B designed to assess the credibility of news articles, producing structured JSON outputs that cover factual consistency, source reliability, and logical coherence.
Model Details
Model Description
This adapter was fine-tuned via low-rank adaptation (LoRA) on outputs generated by GPT-4o, using a fixed prompt template designed to elicit structured news credibility assessments. The approach follows self-supervised distillation: the base model (Qwen3-32B) learns to imitate GPT-4o's behavior without any human-labeled data.
- Developed by: HyperStar Team (Zhijiang College of Zhejiang University of Technology)
- Main Developers: Yichao Xu (flyfishxu), Yiwei Wang (enernitywyw)
- Model type: Causal Language Model (LoRA fine-tuned)
- Language(s) (NLP): Chinese, English
- License: Apache 2.0
- Finetuned from model: Qwen3-32B
LoRA Sources
- Repository: https://huggingface.co/HyperStar/DeepNews-Qwen3-32B
- Paper: Not yet published
- Demo: Not yet published
Uses
- Media credibility analysis
- Fake news detection
- Structured evaluation of article consistency
- Educational research in journalism and AI
Bias, Risks, and Limitations
Limitations
- May misclassify satire or highly contextual content.
- Performance drops on non-news domains or highly domain-specific jargon.
- Biases may emerge from the base model (Qwen3-32B).
Recommendations
Users (both direct and downstream) should be made aware of the model's risks, biases, and limitations.
- Human review is recommended for high-stakes outputs.
- Should be updated with newer data to handle evolving media trends.
- Pair the model with a retrieval agent (e.g., web search) so that assessments can draw on up-to-date information.
How to Get Started
Use the code below to get started with the LoRA.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the Qwen3-32B base model, then apply the DeepNews LoRA adapter.
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B", torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, "flyfishxu/DeepNews")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
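The model is trained to emit a single JSON object. Below is a minimal sketch for extracting and parsing that object from raw generated text; the regex-based extraction helper and the sample string are illustrative, not part of any released tooling.

```python
import json
import re

def extract_assessment(generated_text: str) -> dict:
    """Pull the first top-level JSON object out of model output."""
    match = re.search(r"\{.*\}", generated_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Illustrative raw output; a real generation would contain the full schema.
sample = (
    "Here is the assessment:\n"
    '{"main_point": [], "details": {"analysis": '
    '{"factual_accuracy": {"score": 3, "deductions": []}}}}'
)
assessment = extract_assessment(sample)
print(assessment["details"]["analysis"]["factual_accuracy"]["score"])  # 3
```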
Training Details
Training Data
The training data consists of approximately 20,000 news articles automatically scraped from major Chinese and English mainstream media sources.
- Sources include: Toutiao (Jinri Toutiao), NetEase News, Tencent News, as well as mainstream English-language media such as BBC, CNN, and Reuters.
- Articles were labeled automatically using heuristic rules and partially refined with human review for quality assurance.
- Labeled on dimensions: factuality, logic, bias, source reliability
Self-supervised Distillation
- The training data (~20,000 samples) was generated using GPT-4o in a structured, instruction-following format.
- Each prompt was designed to elicit JSON-formatted, multi-dimensional credibility assessments from GPT-4o, which served as a synthetic "teacher" signal.
- No human-written annotations were used as training targets, making this a form of self-supervised distillation.
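The distillation setup above can be sketched as packaging each article together with GPT-4o's JSON assessment as an instruction-response training pair. The prompt wording below is hypothetical; the actual template used with GPT-4o is not published.

```python
import json

# Hypothetical prompt template (the real one is not published).
PROMPT_TEMPLATE = (
    "Assess the credibility of the following news article. "
    "Respond with a JSON object scoring factuality, logic, bias, "
    "and source reliability.\n\nArticle:\n{article}"
)

def build_distillation_record(article: str, teacher_json: dict) -> dict:
    """Package one (prompt, teacher response) pair as a training example."""
    return {
        "instruction": PROMPT_TEMPLATE.format(article=article),
        "response": json.dumps(teacher_json, ensure_ascii=False),
    }

record = build_distillation_record(
    "Example article text...",
    {"factuality": 7, "logic": 8, "bias": 6, "source_reliability": 5},
)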
Training Hyperparameters
- Precision: bfloat16
- Optimizer: AdamW (8bit)
- Learning Rate: 1e-5
- Scheduler: Linear
- Warm up Steps: 50
- Max Steps: 1000
- Batch Size: 20
- Gradient Accumulation: 1
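As a consistency check, these hyperparameters imply the run makes roughly one pass over the ~20,000-sample training set:

```python
# Effective examples seen = steps x per-step batch x gradient accumulation.
max_steps = 1000
batch_size = 20
grad_accum = 1

examples_seen = max_steps * batch_size * grad_accum
print(examples_seen)  # 20000 -> about one epoch over the ~20,000-sample set
```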
Evaluation
Testing Data
- A held-out set of 1,000 news articles not seen during training.
Metrics
- Human alignment score (avg. Likert rating): 4.3 / 5
- JSON schema consistency: 99.9%
- Manual precision on fake news detection (10 samples): 9/10 correct
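A "JSON schema consistency" figure implies an automated check. One plausible sketch (this checker is an assumption, not the team's published evaluation code) counts an output as consistent if it parses as JSON and contains every analysis dimension from the example schema with an integer score:

```python
import json

# Dimension names mirror the example output shown in this card.
EXPECTED_DIMENSIONS = [
    "title_relevance", "logical_consistency", "factual_accuracy",
    "subjectivity_and_inflammatory_language", "causal_relevance",
    "source_credibility", "debunking_result", "external_corroboration",
]

def is_schema_consistent(raw_output: str) -> bool:
    """Return True if raw_output parses and carries all expected scores."""
    try:
        analysis = json.loads(raw_output)["details"]["analysis"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    return all(
        isinstance(analysis.get(dim, {}).get("score"), int)
        for dim in EXPECTED_DIMENSIONS
    )
```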
Output Example
Below is an example of the model's JSON-formatted output for a news article claiming the official release of iOS 18.5 (web search enabled; date: April 19, 2025):
{
  "main_point": [],
  "details": {
    "analysis": {
      "title_relevance": {
        "score": 3,
        "deductions": [
          "The title mentions the official release of iOS 18.5, but according to Apple's official information, iOS 18.5 has not been officially released and is typically in beta rather than a final version."
        ]
      },
      "logical_consistency": {
        "score": 6,
        "deductions": [
          "The article refers to iOS 18.5 without official confirmation, resulting in a logical inconsistency."
        ]
      },
      "factual_accuracy": {
        "score": 3,
        "deductions": [
          "The iOS 18.5 version mentioned in the article has not been officially confirmed and may contain false information."
        ]
      },
      "subjectivity_and_inflammatory_language": {
        "score": 6,
        "deductions": [
          "The article uses exaggerated language, such as 'unprecedented battery life performance.'"
        ]
      },
      "causal_relevance": {
        "score": 7,
        "deductions": [
          "The claimed optimization effects are not supported by official causal evidence and may be misleading."
        ]
      },
      "source_credibility": {
        "score": 5,
        "deductions": [
          "The source is the mobile version of NetEase News, which occasionally publishes unverified information."
        ]
      },
      "debunking_result": {
        "score": 0,
        "deductions": [
          "There is no verification or refutation of this claim from third-party fact-checking institutions."
        ]
      },
      "external_corroboration": {
        "score": 0,
        "deductions": [
          "No official confirmation of the iOS 18.5 version was found in external search results."
        ]
      }
    }
  }
}
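The card does not define how per-dimension scores combine into a single credibility score; an equal-weight mean is one illustrative aggregation (the weighting scheme below is an assumption):

```python
def mean_score(assessment: dict) -> float:
    """Average the per-dimension scores.

    Equal weighting is an assumption; the card specifies no official
    aggregation scheme.
    """
    scores = [d["score"] for d in assessment["details"]["analysis"].values()]
    return sum(scores) / len(scores)

# Scores taken from the example output above.
example = {"details": {"analysis": {
    "title_relevance": {"score": 3},
    "logical_consistency": {"score": 6},
    "factual_accuracy": {"score": 3},
    "subjectivity_and_inflammatory_language": {"score": 6},
    "causal_relevance": {"score": 7},
    "source_credibility": {"score": 5},
    "debunking_result": {"score": 0},
    "external_corroboration": {"score": 0},
}}}
print(mean_score(example))  # 3.75
```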
Notes
- This model is primarily designed for zero-shot structured judgment rather than classification accuracy.
Technical Specifications
- Architecture: Transformer decoder-only
- Objective: Next-token prediction with LoRA tuning on factuality prompts
- Software: Unsloth, Transformers 4.51.3, PEFT 0.15.2, CUDA 12.6
Citation
@misc{hyperstar2025deepnews,
  title={Qwen3-32B for News Credibility Analysis},
  author={Yichao Xu and Yiwei Wang},
  year={2025},
  howpublished={\url{https://huggingface.co/flyfishxu/DeepNews-Qwen3-32B}},
}
Author Info
FlyfishXu (Yichao Xu)
huggingface: @flyfishxu, email: flyfishxu@outlook.com
Eternity (Yiwei Wang)
email: eternitywyw@outlook.com
Out-of-Scope Use
This model is not suitable for generating medical, legal, or financial advice. It should not be used in scenarios requiring high-stakes factual guarantees without human validation.
Framework versions
- PEFT 0.15.2