---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- Qwen/Qwen3-1.7B
pipeline_tag: text-generation
library_name: peft
tags:
- qwen
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- scientific-writing
- scientific-reasoning
---

# Model Card for `Nexa-Qwen-sci-7B`

## Model Details

**Model Description**:  
`Nexa-Qwen-sci-7B` is a fine-tuned variant of the open-weight `Qwen/Qwen3-1.7B` model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed using the PEFT (Parameter-Efficient Fine-Tuning) library with LoRA in 4-bit quantized mode using the `bitsandbytes` backend. The model leverages Qwen3’s thinking mode (`enable_thinking=True`) for enhanced reasoning capabilities, making it suitable for complex scientific tasks.

This model is part of the **Nexa Scientific Intelligence (Psi)** series, developed for scalable, automated scientific reasoning and domain-specific text generation.

---

**Developed by**: Allan (Independent Scientific Intelligence Architect)  
**Funded by**: Self-funded  
**Shared by**: Allan (https://huggingface.co/Allanatrix)  
**Model type**: Decoder-only transformer (causal language model)  
**Language(s)**: English (scientific domain-specific vocabulary)  
**License**: Apache 2.0 (inherits from base model)  
**Fine-tuned from**: `Qwen/Qwen3-1.7B`  
**Repository**: https://huggingface.co/allan-wandia/nexa-qwen-sci-7b  
**Demo**: Coming soon via Hugging Face Spaces or Lambda inference endpoint.

---

## Uses

### Direct Use
- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts

### Downstream Use
- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora

### Out-of-Scope Use
- General conversation or chat use cases
- Non-English scientific domains
- Legal, financial, or clinical advice generation

---

## Bias, Risks, and Limitations
While the model performs well on structured scientific input, it inherits biases from its base model (`Qwen3-1.7B`) and fine-tuning dataset. Results should be evaluated by domain experts before use in high-stakes settings. It may hallucinate plausible but incorrect facts, especially in low-data areas. The thinking mode may increase latency for simpler tasks but improves reasoning quality.

---

## Recommendations
Users should:
- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields
- Use recommended sampling parameters (Temperature=0.6, TopP=0.95, TopK=20, MinP=0, Presence Penalty=1.5) to avoid endless repetitions in thinking mode
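
To make the effect of these sampling settings concrete, here is a minimal pure-Python sketch of temperature scaling followed by top-k, top-p, and min-p filtering on a toy logit vector. The function name `filter_probs` and the toy logits are illustrative assumptions, and inference libraries may order or renormalize these steps differently.

```python
import math


def filter_probs(logits, temperature=0.6, top_k=20, top_p=0.95, min_p=0.0):
    """Return the renormalized sampling distribution as {token_index: prob}."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = {i: e / total for i, e in enumerate(exps)}

    # top-k: keep the k most probable tokens and renormalize.
    ranked = sorted(probs, key=probs.get, reverse=True)[:top_k]
    mass = sum(probs[i] for i in ranked)
    probs = {i: probs[i] / mass for i in ranked}

    # top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # min-p: drop tokens below min_p times the top probability (no-op at 0).
    floor = min_p * probs[ranked[0]]
    kept = [i for i in kept if probs[i] >= floor]

    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Toy example (hypothetical logits for a four-token vocabulary).
toy_logits = [2.0, 1.0, 0.0, -1.0]
print(filter_probs(toy_logits, top_k=2))
```

A temperature below 1 sharpens the distribution before filtering, so fewer tokens survive the top-p cut than would at temperature 1.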

---

## How to Get Started with the Model

```python
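from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-qwen-sci-7b"

# If the repository contains only LoRA adapter weights, `peft` must be
# installed for from_pretrained to resolve the base model and adapter.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

prompt = "Generate a novel hypothesis in quantum materials research:"
messages = [{"role": "user", "content": prompt}]
# Build the chat-formatted prompt with Qwen3 thinking mode switched on.
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=32768,
    do_sample=True,  # needed for temperature/top_p/top_k/min_p to take effect
    temperature=0.6,
    top_p=0.95,
    top_k=20,
    min_p=0,
    # presence_penalty is not a generate() argument in Transformers; apply it
    # in a serving framework that supports it (for example vLLM).
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))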
```
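
With thinking mode enabled, the decoded text contains the reasoning trace followed by the final answer. The sketch below separates the two, assuming the `<think>` and `</think>` tags survive decoding as literal text; the helper name `split_thinking` is hypothetical and not part of any library.

```python
def split_thinking(decoded: str):
    """Split decoded text into (reasoning, answer) at the last </think> tag."""
    head, sep, tail = decoded.rpartition("</think>")
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", decoded.strip()
    # Drop anything before the opening tag (for example the echoed prompt).
    reasoning = head.split("<think>", 1)[-1].strip()
    return reasoning, tail.strip()


# Example with a hand-written string standing in for model output.
sample = "<think>\nCompare band structures first.\n</think>\n\nFinal hypothesis."
reasoning, answer = split_thinking(sample)
print(answer)
```

Keeping the reasoning separate makes it easier to log or discard the trace when only the final hypothesis is needed.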


## Training Details

### Training Data
- Size: 100 million tokens sampled from a 500M+ token corpus
- Source: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- Labeling: Token-level labels auto-generated via the Qwen3 tokenizer with the chat template (`enable_thinking=True`)

### Preprocessing
- Tokenization with sequence truncation to 32,768 tokens (Qwen3’s context length)
- Formatted using Qwen3’s chat template with thinking mode enabled
- Labeled and batched on CPU; inference dispatched to GPU asynchronously

### Training Hyperparameters
- Base model: `Qwen/Qwen3-1.7B`
- Sequence length: 32768
- Batch size: 1 (with gradient accumulation)
- Gradient accumulation steps: 64
- Effective batch size: 64
- Learning rate: 2e-5
- Epochs: 2
- LoRA: Enabled (PEFT with RSLoRA)
- Quantization: 4-bit via `bitsandbytes`
- Optimizer: 8-bit AdamW
- Framework: Transformers (≥4.51.0) + PEFT + Accelerate + TRL
- Sampling parameters (applied during inference): Temperature=0.6, TopP=0.95, TopK=20, MinP=0, Presence Penalty=1.5

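
The batch-size figures above imply a rough optimizer step count, worked through in the sketch below. It assumes every sequence is packed to the full 32,768 tokens and that the stated effective batch size of 64 is the global one; neither detail is confirmed by this card, so the result is an estimate only.

```python
import math

SEQ_LEN = 32_768            # sequence length from the card
BATCH_SIZE = 1              # per-step batch size from the card
GRAD_ACCUM_STEPS = 64       # gradient accumulation steps from the card
EPOCHS = 2
TOTAL_TOKENS = 100_000_000  # training tokens from the card

# Effective batch size = batch size x accumulation steps.
effective_batch = BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: full-length packed sequences, so tokens per optimizer step
# is simply effective batch size x sequence length.
tokens_per_step = effective_batch * SEQ_LEN

steps_per_epoch = math.ceil(TOTAL_TOKENS / tokens_per_step)
total_steps = steps_per_epoch * EPOCHS

print(f"effective batch: {effective_batch}")
print(f"tokens per optimizer step: {tokens_per_step:,}")
print(f"optimizer steps: {steps_per_epoch} per epoch, {total_steps} total")
```

If sequences are shorter than the maximum length and not packed, the real step count would be higher than this estimate.
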
## Evaluation

### Testing Data
Synthetic scientific prompts across domains (Physics, Biology, and Materials Science)

### Evaluation Factors
- Hypothesis novelty (entropy score)
- Internal scientific consistency (domain-specific rubric)
- Reasoning quality (assessed via thinking mode outputs)
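
This card does not define the entropy score, so the following is only one plausible reading: Shannon entropy over the whitespace-token distribution of a generated text, where higher values indicate more varied wording. The function name `entropy_score` and the tokenization choice are assumptions for illustration.

```python
import math
from collections import Counter


def entropy_score(text: str) -> float:
    """Shannon entropy (in bits) of the whitespace-token distribution."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    n = len(tokens)
    # H = -sum(p * log2(p)) over the observed token frequencies.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# A repetitive text scores lower than a varied one.
print(entropy_score("spin spin spin spin"))
print(entropy_score("spin liquids host emergent excitations"))
```

A score like this measures lexical variation only; it says nothing about whether a hypothesis is scientifically new.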

### Results
The model performs robustly in hypothesis generation and scientific prose tasks, with enhanced reasoning capabilities due to Qwen3’s thinking mode. Coherence is high, and novelty depends on prompt diversity. It is well-suited as a distiller or inference agent for synthetic scientific corpora generation.

## Environmental Impact

| Component | Value |
|---|---|
| Hardware type | 2× NVIDIA T4 GPUs |
| Hours used | ~7.5 |
| Cloud provider | Kaggle (Google Cloud) |
| Compute region | US |
| Carbon emitted | Estimate pending (likely 1 kg CO2) |
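
A back-of-the-envelope GPU-only estimate can be derived from the hardware and hours listed above. The sketch assumes both GPUs ran at the T4's 70 W board power for the whole run and uses a hypothetical grid intensity of 0.4 kg CO2 per kWh; the actual intensity for the region is not given in this card, and CPU, cooling, and data-center overhead are ignored.

```python
GPU_COUNT = 2
GPU_POWER_WATTS = 70        # NVIDIA T4 board power, assumed sustained
HOURS_USED = 7.5            # from the card
GRID_KG_CO2_PER_KWH = 0.4   # hypothetical grid carbon intensity

# Energy in kWh = watts x hours / 1000.
energy_kwh = GPU_COUNT * GPU_POWER_WATTS * HOURS_USED / 1000

# Emissions = energy x grid carbon intensity.
emissions_kg = energy_kwh * GRID_KG_CO2_PER_KWH

print(f"GPU energy: {energy_kwh:.2f} kWh")
print(f"Estimated emissions: {emissions_kg:.2f} kg CO2")
```

Swapping in a measured power draw and the provider's published grid intensity would tighten the figure.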

## Technical Specifications

### Model Architecture
- Transformer decoder (Qwen3-1.7B architecture: 28 layers, 16 attention heads for Q, 8 for KV)
- LoRA adapters applied to all linear layers with RSLoRA
- Quantized to 4-bit with `bitsandbytes` for memory efficiency
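
RSLoRA (rank-stabilized LoRA) differs from standard LoRA only in how the adapter update is scaled: `alpha / sqrt(r)` instead of `alpha / r`. In PEFT this is selected with `use_rslora=True` on `LoraConfig`. The rank and alpha values below are hypothetical, since this card does not state the ones used in training.

```python
import math


def lora_scale(alpha: float, r: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / r


def rslora_scale(alpha: float, r: int) -> float:
    """Rank-stabilized LoRA scaling factor."""
    return alpha / math.sqrt(r)


ALPHA = 16  # hypothetical value, not taken from the card
for rank in (8, 64, 256):  # hypothetical ranks for comparison
    print(
        f"r={rank:>3}  LoRA scale={lora_scale(ALPHA, rank):.4f}  "
        f"RSLoRA scale={rslora_scale(ALPHA, rank):.4f}"
    )
```

Because the RSLoRA factor shrinks more slowly as the rank grows, higher-rank adapters keep a usable update magnitude without retuning alpha.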

### Compute Infrastructure
- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)

### Software Stack
- PEFT 0.12.0
- Transformers 4.51.0
- Accelerate
- TRL
- Torch 2.x

## Citation
**BibTeX**:

    @misc{nexa-qwen-sci-7b,
      title = {Nexa Qwen Sci 7B},
      author = {Allan Wandia},
      year = {2025},
      howpublished = {\url{https://huggingface.co/allan-wandia/nexa-qwen-sci-7b}},
      note = {Fine-tuned model for scientific generation tasks with Qwen3 thinking mode}
    }


## Model Card Contact
For questions, contact Allan via Hugging Face or by email at allanw.mk@gmail.com.


## Model Card Authors
Allan Wandia (Independent ML Engineer and Systems Architect)


## Glossary
- LoRA: Low-Rank Adaptation
- PEFT: Parameter-Efficient Fine-Tuning
- Entropy Score: Metric used to estimate novelty/variation
- Safetensors: Secure, fast format for model weights
- Thinking Mode: Qwen3’s feature for enhanced reasoning, enabled via `enable_thinking=True`

## Links
- GitHub repo and notebook: https://github.com/DarkStarStrix/Nexa_Auto