---
license: mit
datasets:
- sweatSmile/neet-biology-qa
language:
- en
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: question-answering
library_name: transformers
tags:
- neet
- biology
- exam
- bio
---

DistilBERT NEET Biology MCQ Classifier (NEET_BioBERT)

This model is a fine-tuned version of DistilBERT (base uncased), trained to pick the correct option for NEET-style multiple-choice biology questions. It selects the best answer among four choices (A, B, C, D).

-------------------------------------------------------------------------

Training Data

- Source: sweatSmile / NEET Biology QA Dataset
- Domain: NEET (Undergraduate Medical Entrance Exam) – Biology
- Format: Each question has 4 options with one correct answer
- Dataset Size: 793 questions
- Split: 80% train / 20% validation

-------------------------------------------------------------------------

Training Configuration

- Base Model: distilbert-base-uncased
- Epochs: 10
- Batch Size: 4
- Learning Rate: 5e-5
- Weight Decay: 0.01
- Task Type: Multiple Choice Classification

-------------------------------------------------------------------------

Results

- Validation Accuracy: 72.96% (~73%)
- Final Training Loss: ~0.35

-------------------------------------------------------------------------

Limitations

- Trained on a relatively small dataset (793 questions).
- Limited to NEET-level biology content; not suitable for physics or chemistry.
- Does not support:
  - Assertion-reasoning questions
  - Diagram-based questions
  - Paragraph/case-study questions

-------------------------------------------------------------------------

Intended Use

- Educational research
- AI-powered NEET Biology assistants
- MCQ practice evaluation
- Baseline model for future fine-tuning with larger datasets

-------------------------------------------------------------------------

NOTE: Not recommended as a final exam-ready solution without further fine-tuning and validation.

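
Example: preparing a question and reading off the answer

The card does not include inference code, so the following is only a minimal sketch of the input and output handling implied by the four-option format described above. The helper names (`build_choice_pairs`, `pick_answer`), the sample question, and the scores are all hypothetical. It assumes a multiple-choice head that takes one (question, option) pair per candidate and returns one score per option, which is how `AutoModelForMultipleChoice` in transformers is typically used; whether this checkpoint was saved with that head is not confirmed by the card.

```python
from typing import List, Sequence, Tuple

# The four answer labels named in the model description.
LABELS = ("A", "B", "C", "D")


def build_choice_pairs(question: str, options: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair the question with each option, one pair per candidate answer."""
    if len(options) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} options, got {len(options)}")
    return [(question, option) for option in options]


def pick_answer(scores: Sequence[float]) -> str:
    """Return the label (A-D) of the highest-scoring option."""
    if len(scores) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} scores, got {len(scores)}")
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return LABELS[best_index]


# Hypothetical question, for illustration only.
question = "Which organelle is the main site of aerobic respiration?"
options = ["Ribosome", "Mitochondrion", "Golgi apparatus", "Lysosome"]

# These pairs are what would be tokenized and passed to the model.
pairs = build_choice_pairs(question, options)

# Placeholder scores standing in for the model's per-option logits;
# they are made up and are NOT real model output.
placeholder_scores = [0.1, 2.3, -0.4, 0.0]

predicted = pick_answer(placeholder_scores)
print(predicted)  # prints "B"
```

In a real pipeline the placeholder scores would be replaced by the logits the model returns for the four tokenized pairs; the label mapping stays the same.
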

-------------------------------------------------------------------------

License: MIT