DeepNews - Qwen3-32B for News Credibility Analysis

DeepNews is a LoRA fine-tune of Qwen3-32B that assesses the credibility of news articles, producing structured JSON outputs that score factual consistency, source reliability, and logical coherence.

Model Details

Model Description

This adapter was trained via low-rank adaptation (LoRA) on outputs generated by GPT-4o, using a fixed prompt template designed to elicit structured news-credibility assessments. The approach is a form of self-supervised distillation, in which the base model (Qwen3-32B) learns to imitate GPT-4o's behavior without any human-labeled data.

  • Developed by: HyperStar Team (Zhijiang College of Zhejiang University of Technology)
  • Main Developers: Yichao Xu (flyfishxu), Yiwei Wang (enernitywyw)
  • Model type: Causal Language Model (LoRA fine-tuned)
  • Language(s) (NLP): Chinese, English
  • License: Apache 2.0
  • Finetuned from model: Qwen3-32B

Uses

  • Media credibility analysis
  • Fake news detection
  • Structured evaluation of article consistency
  • Educational research in journalism and AI

Bias, Risks, and Limitations

Limitations

  • May misclassify satire or highly contextual content.
  • Performance drops on non-news domains or highly domain-specific jargon.
  • Biases may emerge from the base model (Qwen3-32B).

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model.

  • Human review is recommended for high-stakes outputs.
  • The adapter should be retrained periodically on newer data to keep up with evolving media trends.
  • Pair the model with a search or retrieval agent so that assessments can draw on up-to-date information (a minimal sketch follows this list).
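
As a minimal sketch of the agent-style setup, the helper below prepends fresh search results to the article before requesting an assessment. The web_search function and the prompt wording are hypothetical placeholders, not part of this release:

def build_prompt(article: str, web_search) -> str:
    # web_search: hypothetical helper returning a list of recent snippets for a query
    evidence = "\n".join(web_search(article[:200]))
    return (
        "Recent search results that may bear on the article:\n"
        f"{evidence}\n\n"
        "Assess the credibility of the following article and answer in JSON:\n"
        f"{article}"
    )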

How to Get Started

Use the code below to load the DeepNews adapter on top of the Qwen3-32B base model.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B")
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-32B", torch_dtype="bfloat16", device_map="auto")
model = PeftModel.from_pretrained(base, "flyfishxu/DeepNews-Qwen3-32B")
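
Once loaded, an assessment can be requested through the Qwen chat template. The exact instruction the adapter was trained on is not published, so the prompt wording below is an assumed placeholder:

article_text = "..."  # the news article to assess

messages = [{"role": "user",
             "content": "Assess the credibility of this news article and respond in JSON:\n" + article_text}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))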

Training Details

Training Data

The training data consists of approximately 20,000 news articles automatically scraped from both Chinese and international mainstream media sources.

  • Sources include: Toutiao (Jinri Toutiao), NetEase News, Tencent News, as well as mainstream English-language media such as BBC, CNN, and Reuters.
  • Articles were labeled automatically using heuristic rules and partially refined with human review for quality assurance.
  • Labeled on four dimensions: factuality, logic, bias, and source reliability (an example record is sketched below).
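
To make the label format concrete, a single labeled record might look like the following. The field names and the 0-10 scale are assumptions for illustration; only the four dimensions themselves are documented:

{
  "article": "(scraped article text)",
  "labels": {
    "factuality": 7,
    "logic": 8,
    "bias": 4,
    "source_reliability": 6
  }
}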

Self-supervised Distillation

  • The training data (~20,000 samples) was generated using GPT-4o in a structured, instruction-following format.
  • Each prompt was designed to elicit JSON-formatted, multi-dimensional credibility assessments from GPT-4o, which served as a synthetic "teacher" signal.
  • No human-authored labels were used as distillation targets; supervision came entirely from GPT-4o's outputs, making this a form of self-supervised distillation (a sketch of the generation step follows).
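
As an illustration of the data-generation step, the loop below collects teacher outputs with the OpenAI client. PROMPT_TEMPLATE, the articles iterable, and the JSONL layout are assumptions; only the GPT-4o teacher and the JSON-formatted targets are documented:

import json
from openai import OpenAI

client = OpenAI()
with open("distill.jsonl", "w", encoding="utf-8") as out:
    for article in articles:  # articles: the ~20,000 scraped texts (assumed variable)
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(article=article)}],
        )
        record = {"prompt": article, "completion": resp.choices[0].message.content}
        out.write(json.dumps(record, ensure_ascii=False) + "\n")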

Training Hyperparameters

  • Precision: bfloat16
  • Optimizer: AdamW (8-bit)
  • Learning Rate: 1e-5
  • Scheduler: Linear
  • Warmup Steps: 50
  • Max Steps: 1000
  • Batch Size: 20
  • Gradient Accumulation: 1
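
These settings map onto a standard Transformers training configuration roughly as follows; this is a sketch of the equivalent arguments, not the exact Unsloth training script used:

from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="deepnews-lora",      # assumed output path
    bf16=True,                       # precision: bfloat16
    optim="adamw_bnb_8bit",          # 8-bit AdamW via bitsandbytes
    learning_rate=1e-5,
    lr_scheduler_type="linear",
    warmup_steps=50,
    max_steps=1000,
    per_device_train_batch_size=20,
    gradient_accumulation_steps=1,
)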

Evaluation

Testing Data

  • A held-out set of 1,000 news articles not seen during training.

Metrics

  • Human alignment score (avg. Likert rating): 4.3 / 5
  • JSON schema consistency: 99.9%
  • Manual precision on fake news detection (10 samples): 9/10 correct
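
The schema-consistency figure can be checked mechanically: a generation counts as consistent if it parses as JSON and contains every expected analysis field. A minimal checker, with field names taken from the output example below:

import json

REQUIRED = {
    "title_relevance", "logical_consistency", "factual_accuracy",
    "subjectivity_and_inflammatory_language", "causal_relevance",
    "source_credibility", "debunking_result", "external_corroboration",
}

def is_schema_consistent(text: str) -> bool:
    try:
        analysis = json.loads(text)["details"]["analysis"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    if not isinstance(analysis, dict) or not REQUIRED <= analysis.keys():
        return False
    # Every dimension must carry an integer score and a list of deductions.
    return all(isinstance(v, dict) and isinstance(v.get("score"), int)
               and isinstance(v.get("deductions"), list) for v in analysis.values())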

Output Example

Below is an example of the model's JSON-formatted output for a news article claiming the official release of iOS 18.5 (web search enabled; dated April 19, 2025):

{
  "main_point": [],
  "details": {
    "analysis": {
      "title_relevance": {
        "score": 3,
        "deductions": [
          "The title mentions the official release of iOS 18.5, but according to Apple's official information, iOS 18.5 has not been officially released and is typically in beta rather than a final version."
        ]
      },
      "logical_consistency": {
        "score": 6,
        "deductions": [
          "The article refers to iOS 18.5 without official confirmation, resulting in a logical inconsistency."
        ]
      },
      "factual_accuracy": {
        "score": 3,
        "deductions": [
          "The iOS 18.5 version mentioned in the article has not been officially confirmed and may contain false information."
        ]
      },
      "subjectivity_and_inflammatory_language": {
        "score": 6,
        "deductions": [
          "The article uses exaggerated language, such as 'unprecedented battery life performance.'"
        ]
      },
      "causal_relevance": {
        "score": 7,
        "deductions": [
          "The claimed optimization effects are not supported by official causal evidence and may be misleading."
        ]
      },
      "source_credibility": {
        "score": 5,
        "deductions": [
          "The source is the mobile version of NetEase News, which occasionally publishes unverified information."
        ]
      },
      "debunking_result": {
        "score": 0,
        "deductions": [
          "There is no verification or refutation of this claim from third-party fact-checking institutions."
        ]
      },
      "external_corroboration": {
        "score": 0,
        "deductions": [
          "No official confirmation of the iOS 18.5 version was found in external search results."
        ]
      }
    }
  }
}
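
Downstream code can fold the per-dimension scores into a single figure; the weighting is application-specific. A simple unweighted mean (the equal weighting is an assumption, not part of the model's specification):

def overall_credibility(output: dict) -> float:
    # Unweighted mean over the eight analysis dimensions.
    scores = [v["score"] for v in output["details"]["analysis"].values()]
    return sum(scores) / len(scores)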

Notes

  • This model is primarily designed for zero-shot structured judgment rather than for maximizing classification accuracy.

Technical Specifications

  • Architecture: Transformer decoder-only
  • Objective: Next-token prediction with LoRA tuning on factuality prompts
  • Software: Unsloth, Transformers 4.51.3, PEFT 0.15.2, CUDA 12.6

Citation

@misc{hyperstar2025deepnews,
  title={Qwen3-32B for News Credibility Analysis},
  author={Yichao Xu and Yiwei Wang},
  year={2025},
  howpublished={\url{https://huggingface.co/flyfishxu/DeepNews-Qwen3-32B}},
}

Out-of-Scope Use

This model is not suitable for generating medical, legal, or financial advice. It should not be used in scenarios requiring high-stakes factual guarantees without human validation.

Framework versions

  • PEFT 0.15.2