DeepNews - Qwen3-32B for News Credibility Analysis

DeepNews is a LoRA fine-tune of Qwen3-32B that assesses the credibility of news articles, producing structured JSON outputs that score factual consistency, source reliability, and logical coherence.

Model Details

Model Description

This adapter was trained via low-rank adaptation (LoRA) on outputs generated by GPT-4o, using a fixed prompt template designed to elicit structured news-credibility assessments. The approach is a form of self-supervised distillation, in which the base model (Qwen3-32B) learns to imitate GPT-4o's behavior without any human-labeled data.

  • Developed by: HyperStar Team (Zhijiang College of Zhejiang University of Technology)
  • Main Developers: Yichao Xu (flyfishxu), Yiwei Wang (enernitywyw)
  • Model type: Causal Language Model (LoRA fine-tuned)
  • Language(s) (NLP): Chinese, English
  • License: Apache 2.0
  • Finetuned from model: Qwen3-32B

Uses

  • Media credibility analysis
  • Fake news detection
  • Structured evaluation of article consistency
  • Educational research in journalism and AI

Bias, Risks, and Limitations

Limitations

  • May misclassify satire or highly contextual content.
  • Performance drops on non-news domains or highly domain-specific jargon.
  • Biases may emerge from the base model (Qwen3-32B).

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

  • Human review is recommended for high-stakes outputs.
  • The adapter should be retrained periodically on newer data to keep up with evolving media trends.
  • Pair the model with a search or retrieval agent so that assessments can draw on up-to-date information (a minimal sketch follows this list).
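
As a minimal sketch of the agent-style setup, the helper below prepends fresh search results to the article before requesting an assessment. The web_search function and the prompt wording are hypothetical placeholders, not part of this release:

def build_prompt(article: str, web_search) -> str:
    # web_search: hypothetical helper returning a list of recent snippets for a query
    evidence = "\n".join(web_search(article[:200]))
    return (
        "Recent search results that may bear on the article:\n"
        f"{evidence}\n\n"
        "Assess the credibility of the following article and answer in JSON:\n"
        f"{article}"
    )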

How to Get Started

Use the code below to load the DeepNews adapter on top of the Qwen3-32B base model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B", torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(base, "flyfishxu/DeepNews-Qwen3-32B")
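
Once loaded, an assessment can be requested through the Qwen chat template. The exact instruction the adapter was trained on is not published, so the prompt wording below is an assumed placeholder:

article_text = "..."  # the news article to assess

messages = [{"role": "user",
             "content": "Assess the credibility of this news article and respond in JSON:\n" + article_text}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))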

Training Details

Training Data

The training data consists of approximately 20,000 news articles automatically scraped from both Chinese and international mainstream media sources.

  • Sources include: Toutiao (Jinri Toutiao), NetEase News, Tencent News, as well as mainstream English-language media such as BBC, CNN, and Reuters.
  • Articles were labeled automatically using heuristic rules and partially refined with human review for quality assurance.
  • Labeled on four dimensions: factuality, logic, bias, and source reliability (an example record is sketched below).
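
To make the label format concrete, a single labeled record might look like the following. The field names and the 0-10 scale are assumptions for illustration; only the four dimensions themselves are documented:

{
  "article": "(scraped article text)",
  "labels": {
    "factuality": 7,
    "logic": 8,
    "bias": 4,
    "source_reliability": 6
  }
}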

Self-supervised Distillation

  • The training data (~20,000 samples) was generated using GPT-4o in a structured, instruction-following format.
  • Each prompt was designed to elicit JSON-formatted, multi-dimensional credibility assessments from GPT-4o, which served as a synthetic "teacher" signal.
  • No human-authored labels were used as distillation targets; supervision came entirely from GPT-4o's outputs, making this a form of self-supervised distillation (a sketch of the generation step follows).
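
As an illustration of the data-generation step, the loop below collects teacher outputs with the OpenAI client. PROMPT_TEMPLATE, the articles iterable, and the JSONL layout are assumptions; only the GPT-4o teacher and the JSON-formatted targets are documented:

import json
from openai import OpenAI

client = OpenAI()
with open("distill.jsonl", "w", encoding="utf-8") as out:
    for article in articles:  # articles: the ~20,000 scraped texts (assumed variable)
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(article=article)}],
        )
        record = {"prompt": article, "completion": resp.choices[0].message.content}
        out.write(json.dumps(record, ensure_ascii=False) + "\n")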

Training Hyperparameters

  • Precision: bfloat16
  • Optimizer: AdamW (8-bit)
  • Learning Rate: 1e-5
  • Scheduler: Linear
  • Warmup Steps: 50
  • Max Steps: 1000
  • Batch Size: 20
  • Gradient Accumulation: 1
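
These settings map onto a standard Transformers training configuration roughly as follows; this is a sketch of the equivalent arguments, not the exact Unsloth training script used:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deepnews-lora",      # assumed output path
    bf16=True,                       # precision: bfloat16
    optim="adamw_bnb_8bit",          # 8-bit AdamW via bitsandbytes
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    warmup_steps=50,
    max_steps=1000,
    per_device_train_batch_size=20,
    gradient_accumulation_steps=1,
)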

Evaluation

Testing Data

  • A held-out set of 1,000 news articles not seen during training.

Metrics

  • Human alignment score (avg. Likert rating): 4.3 / 5
  • JSON schema consistency: 99.9%
  • Manual precision on fake news detection (10 samples): 9/10 correct
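
The schema-consistency figure can be checked mechanically: a generation counts as consistent if it parses as JSON and contains every expected analysis field. A minimal checker, with field names taken from the output example below:

import json

REQUIRED = {
    "title_relevance", "logical_consistency", "factual_accuracy",
    "subjectivity_and_inflammatory_language", "causal_relevance",
    "source_credibility", "debunking_result", "external_corroboration",
}

def is_schema_consistent(text: str) -> bool:
    try:
        analysis = json.loads(text)["details"]["analysis"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    if not isinstance(analysis, dict) or not REQUIRED <= analysis.keys():
        return False
    # Every dimension must carry an integer score and a list of deductions.
    return all(isinstance(v, dict) and isinstance(v.get("score"), int)
               and isinstance(v.get("deductions"), list) for v in analysis.values())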

Output Example

Below is an example of the model's JSON-formatted output for a news article claiming the official release of iOS 18.5 (web search enabled; dated April 19, 2025):

{
  "main_point": [],
  "details": {
    "analysis": {
      "title_relevance": {
        "score": 3,
        "deductions": [
          "The title mentions the official release of iOS 18.5, but according to Apple's official information, iOS 18.5 has not been officially released and is typically in beta rather than a final version."
        ]
      },
      "logical_consistency": {
        "score": 6,
        "deductions": [
          "The article refers to iOS 18.5 without official confirmation, resulting in a logical inconsistency."
        ]
      },
      "factual_accuracy": {
        "score": 3,
        "deductions": [
          "The iOS 18.5 version mentioned in the article has not been officially confirmed and may contain false information."
        ]
      },
      "subjectivity_and_inflammatory_language": {
        "score": 6,
        "deductions": [
          "The article uses exaggerated language, such as 'unprecedented battery life performance.'"
        ]
      },
      "causal_relevance": {
        "score": 7,
        "deductions": [
          "The claimed optimization effects are not supported by official causal evidence and may be misleading."
        ]
      },
      "source_credibility": {
        "score": 5,
        "deductions": [
          "The source is the mobile version of NetEase News, which occasionally publishes unverified information."
        ]
      },
      "debunking_result": {
        "score": 0,
        "deductions": [
          "There is no verification or refutation of this claim from third-party fact-checking institutions."
        ]
      },
      "external_corroboration": {
        "score": 0,
        "deductions": [
          "No official confirmation of the iOS 18.5 version was found in external search results."
        ]
      }
    }
  }
}
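
Downstream code can fold the per-dimension scores into a single figure; the weighting is application-specific. A simple unweighted mean (the equal weighting is an assumption, not part of the model's specification):

def overall_credibility(output: dict) -> float:
    # Unweighted mean over the eight analysis dimensions.
    scores = [v["score"] for v in output["details"]["analysis"].values()]
    return sum(scores) / len(scores)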

Notes

  • This model is primarily designed for zero-shot structured judgment rather than for maximizing classification accuracy.

Technical Specifications

  • Architecture: Transformer decoder-only
  • Objective: Next-token prediction with LoRA tuning on factuality prompts
  • Software: Unsloth, Transformers 4.51.3, PEFT 0.15.2, CUDA 12.6

Citation

@misc{hyperstar2025deepnews,
  title={Qwen3-32B for News Credibility Analysis},
  author={Yichao Xu and Yiwei Wang},
  year={2025},
  howpublished={\url{https://huggingface.co/flyfishxu/DeepNews-Qwen3-32B}},
}

Out-of-Scope Use

This model is not suitable for generating medical, legal, or financial advice. It should not be used in scenarios requiring high-stakes factual guarantees without human validation.

Framework versions

  • PEFT 0.15.2