---
license: mit
language:
- en
base_model:
- microsoft/deberta-v3-base
pipeline_tag: text-classification
library_name: transformers
tags:
- text-classification
- deberta
- it-support
- ticket-classification
---

# 🤖 DeBERTa-v3-base for Employee IT Support Ticket Classification

[![Model](https://img.shields.io/badge/Model-DeBERTa--v3--base-blue)](https://huggingface.co/microsoft/deberta-v3-base)
[![Transformers](https://img.shields.io/badge/Transformers-🤗-yellow.svg)](https://huggingface.co/docs/transformers)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

## 📖 Model Overview

This model is a fine-tuned version of **microsoft/deberta-v3-base** for classifying **employee IT support tickets** into **11 categories**. It was trained in two stages:

1. **Domain Adaptation**: fine-tuned on ~12k general customer support tickets.
2. **Task Adaptation**: fine-tuned on 2,500 synthetic employee IT tickets.

The model automates **helpdesk ticket routing** by predicting the correct support category for each ticket.

---

## 🗂️ Labels

The model predicts one of the following categories:

- `Network`
- `Software`
- `Account`
- `Training`
- `Security`
- `Licensing`
- `Communication`
- `RemoteWork`
- `Hardware`
- `Infrastructure`
- `Performance`

---

## 🎯 Intended Uses

- **Automated Ticket Routing**: assign new tickets to the right IT team.
- **Helpdesk Analytics**: analyze ticket trends.
- **Chatbots**: suggest relevant answers or knowledge-base articles.

⚠️ **Limitations**:

- The synthetic training data may not capture all company-specific jargon.
- Validation accuracy is near-perfect, but real-world accuracy is expected to be around **85–95%**.

---

## 💻 Usage

```python
from transformers import pipeline

# Load the fine-tuned classifier
classifier = pipeline("text-classification", model="your-username/deberta-it-support")

subject = "VPN connection dropping"
description = "My VPN disconnects every 15 minutes, preventing access to remote servers."
text_input = f"[SUBJECT] {subject} [TEXT] {description}"

result = classifier(text_input)
print(result)
# [{'label': 'RemoteWork', 'score': 0.98}]
```

## 🏋️ Training Data

| Stage   | Dataset                           | Size    | Purpose           |
|---------|-----------------------------------|---------|-------------------|
| Stage 1 | Customer Support Tickets (public) | ~12,000 | Domain Adaptation |
| Stage 2 | Synthetic Employee IT Tickets     | 2,500   | Task Adaptation   |

## Hyperparameters

| Hyperparameter          | Stage 1 | Stage 2 |
|-------------------------|---------|---------|
| Learning Rate           | 2e-5    | 5e-6    |
| Epochs                  | 3       | 5       |
| Batch Size (per device) | 8       | 8       |
| Gradient Accumulation   | 4       | 4       |
| Optimizer               | AdamW   | AdamW   |
| Precision               | FP16    | FP16    |

## 📊 Evaluation

The final model achieved **99.4% accuracy** on the validation split of the synthetic dataset. The best checkpoint was saved using the `load_best_model_at_end` strategy, based on validation loss. As noted in the limitations, real-world performance will likely be slightly lower but is still expected to be high.

---

This model was fine-tuned by [Pulastya/Pulastya0]. The base model, microsoft/deberta-v3-base, is provided under the MIT license.
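The usage example above builds the model input as a single `[SUBJECT] … [TEXT] …` string. To keep every caller consistent with that format, it can be factored into a small helper. This is a minimal sketch; the `format_ticket` name is illustrative and not part of the released model.

```python
def format_ticket(subject: str, description: str) -> str:
    """Build the single-string input the classifier expects:
    subject and free-text description joined by the [SUBJECT]/[TEXT]
    markers used in the usage example."""
    return f"[SUBJECT] {subject.strip()} [TEXT] {description.strip()}"


text_input = format_ticket(
    "VPN connection dropping",
    "My VPN disconnects every 15 minutes, preventing access to remote servers.",
)
print(text_input)
# [SUBJECT] VPN connection dropping [TEXT] My VPN disconnects every 15 minutes, preventing access to remote servers.
```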
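Since real-world accuracy is expected to be below the validation figure, automated routing can fall back to human triage when the classifier's top score is low. The sketch below is an assumption-laden illustration: the `route_ticket` helper, the 0.80 threshold, and the `ManualTriage` queue name are all our own, not part of this model card.

```python
FALLBACK_QUEUE = "ManualTriage"  # hypothetical queue name for human review


def route_ticket(prediction: dict, threshold: float = 0.80) -> str:
    """Return the predicted category when the classifier is confident
    enough, otherwise escalate to the fallback queue.

    `prediction` has the same shape as one pipeline result,
    e.g. {'label': 'RemoteWork', 'score': 0.98}.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    return FALLBACK_QUEUE


print(route_ticket({"label": "RemoteWork", "score": 0.98}))  # RemoteWork
print(route_ticket({"label": "Hardware", "score": 0.41}))    # ManualTriage
```

The threshold should be tuned on real tickets: a higher value routes fewer tickets automatically but with fewer misroutes.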