	---
	base_model: google/gemma-2b-it
	library_name: peft
	---

	# Model Card for SQL Injection Classifier

	<!-- Provide a quick summary of what the model is/does. -->
	This model classifies SQL queries as secure or vulnerable to SQL injection. It is a `google/gemma-2b-it` model fine-tuned with the `peft` library on a dataset of SQL queries with and without injection attacks.

	## Model Details

	### Model Description

	This SQL injection classifier is a fine-tuned version of the `google/gemma-2b-it` model that labels SQL queries as secure or vulnerable to injection. It was fine-tuned with the PEFT (Parameter-Efficient Fine-Tuning) library, which adapts the base model by training a small number of additional parameters rather than all of its weights.

	On an evaluation set of 11,125 queries (5,658 secure and 5,467 vulnerable), the model classifies queries as secure or vulnerable with the following results:

	```
	Accuracy: 0.9984
	Precision: 0.9974
	Recall: 0.9993
	F1-score: 0.9984

	Classification Report:

	              precision    recall  f1-score   support

	      Secure       1.00      1.00      1.00      5658
	  Vulnerable       1.00      1.00      1.00      5467

	    accuracy                           1.00     11125
	   macro avg       1.00      1.00      1.00     11125
	weighted avg       1.00      1.00      1.00     11125
	```
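
For reference, the four summary metrics above follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions, with "Vulnerable" treated as the positive class. The function name `binary_metrics` and the counts in the usage line are hypothetical; they are not the model's actual confusion matrix, which this card does not report.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics, with "Vulnerable" as the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=95)
print({name: round(value, 4) for name, value in metrics.items()})
```

As a consistency check, the harmonic mean of the reported precision (0.9974) and recall (0.9993) is about 0.9983, which agrees with the reported F1-score of 0.9984 up to rounding.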

	- Developed by: Mahesh Jamdade
	- Model type: Text Classification
	- Language(s) (NLP): SQL, English
	- License: [More Information Needed]
	- Finetuned from model: google/gemma-2b-it

	### Model Sources

	- Repository: https://huggingface.co/maheshmnj/sql-injection-classifier

	## Uses

	### Direct Use

	This model can be directly used to classify SQL queries as either secure or vulnerable to SQL injection attacks. It can be integrated into security tools, database management systems, or web application firewalls to provide an additional layer of protection against SQL injection vulnerabilities.
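
As one illustration of such an integration, the sketch below places a classifier in front of query execution. Everything here is hypothetical: `guard_query`, `QueryBlockedError`, and `keyword_stub` are names invented for this example, and the stub is a trivial placeholder so the snippet runs without downloading the model. In practice the `classify` argument would be any function that returns "Vulnerable" or "Secure" for a query string.

```python
from typing import Callable


class QueryBlockedError(Exception):
    """Raised when the classifier flags a query as vulnerable."""


def guard_query(query: str, classify: Callable[[str], str]) -> str:
    """Return the query unchanged if it is classified as secure, else raise."""
    label = classify(query)
    if label == "Vulnerable":
        raise QueryBlockedError(f"Query rejected by classifier: {query!r}")
    return query


def keyword_stub(query: str) -> str:
    """Placeholder classifier for this demo only; NOT the model's logic."""
    return "Vulnerable" if "' or '1'='1" in query.lower() else "Secure"


safe = guard_query("SELECT name FROM users WHERE id = 42", keyword_stub)
print("Allowed:", safe)

try:
    guard_query("SELECT * FROM users WHERE username = 'admin' OR '1'='1'", keyword_stub)
except QueryBlockedError as err:
    print("Blocked:", err)
```

Passing the classifier in as a callable keeps the guard independent of how the model is loaded, so the same wrapper can be exercised in tests with a stub.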

	### Downstream Use

	The model can be further fine-tuned or integrated into larger security ecosystems. It could be used as a component in:
	- Code review tools
	- Automated security testing suites
	- Real-time query analysis systems in database applications

	### Out-of-Scope Use

	This model is specifically trained for SQL injection detection and should not be used for:
	- Detecting other types of security vulnerabilities
	- Generating or correcting SQL queries
	- Analyzing queries in languages other than SQL

	## Bias, Risks, and Limitations

	- The model's performance may vary on SQL dialects or patterns not well-represented in the training data.
	- False positives and false negatives were rare on the evaluation set but can still occur, so predictions should be backed by other controls in critical applications.
	- The model may not catch highly sophisticated or novel SQL injection techniques.

	### Recommendations

	- Always use this model as part of a comprehensive security strategy, not as the sole defense against SQL injection.
	- Regularly update and retrain the model with new, real-world SQL injection patterns.
	- Implement additional security measures such as parameterized queries and input sanitization.
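
To make the last recommendation concrete, the sketch below contrasts string concatenation with a parameterized query using Python's built-in `sqlite3` module. The table, column names, and sample data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")

user_input = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text, so the OR clause becomes part of the query
unsafe_sql = f"SELECT username FROM users WHERE username = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder passes the input as data, never as SQL
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (user_input,)
).fetchall()

print("unsafe:", unsafe_rows)
print("safe:", safe_rows)
conn.close()
```

The concatenated query returns the `admin` row even though no user is named `nobody`, while the parameterized query returns no rows.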

	## How to Get Started with the Model

	Use the following code to get started with the model:

	```python
	import torch
	from transformers import AutoModelForSequenceClassification, AutoTokenizer

	model_path = "maheshj01/sql-injection-classifier"
	model = AutoModelForSequenceClassification.from_pretrained(model_path)
	tokenizer = AutoTokenizer.from_pretrained(model_path)
	model.eval()  # inference mode: disables dropout

	def classify_query(query):
	    """Classify a SQL query as "Vulnerable" (label 1) or "Secure"."""
	    inputs = tokenizer(query, return_tensors="pt", truncation=True, padding=True)
	    with torch.no_grad():  # gradients are not needed for inference
	        outputs = model(**inputs)
	    prediction = outputs.logits.argmax(-1).item()
	    return "Vulnerable" if prediction == 1 else "Secure"

	# Example usage
	query = "SELECT * FROM users WHERE username = 'admin' OR '1'='1'"
	result = classify_query(query)
	print(f"The query is classified as: {result}")
	```
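
The function above returns only a hard label. Where a confidence score or a stricter decision threshold is useful, the logits can be converted to probabilities with a softmax. The sketch below does this in plain Python on a pair of made-up logits. `label_with_confidence` and its `threshold` parameter are hypothetical additions, and the convention that index 1 means vulnerable is taken from the example above. With the real model, the input would be something like `outputs.logits[0].tolist()`.

```python
import math


def label_with_confidence(logits, threshold=0.5):
    """Map two-class logits [secure, vulnerable] to a label and P(vulnerable)."""
    # Numerically stable softmax: subtract the max before exponentiating
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    p_vulnerable = exps[1] / sum(exps)
    label = "Vulnerable" if p_vulnerable >= threshold else "Secure"
    return label, p_vulnerable


# Made-up logits for illustration; not real model output
label, prob = label_with_confidence([-1.2, 2.3])
print(f"{label} (P(vulnerable) = {prob:.3f})")
```

Lowering `threshold` flags more queries at the cost of more false positives, while raising it does the opposite; the right value depends on how the surrounding system handles each kind of error.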

	## Training Details

	### Training Data

	The model was trained on a dataset of SQL queries, including both secure queries and queries containing SQL injection vulnerabilities. [More specific information about the dataset is needed]

	### Training Procedure

	The model was fine-tuned using the PEFT library, which allows for efficient adaptation of the pre-trained Gemma 2B model to the SQL injection classification task.
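
This card does not state which PEFT method was used. As a rough illustration of why such methods are efficient, the sketch below assumes LoRA, one of the methods the PEFT library provides, which trains two low-rank matrices of shapes (d_out × r) and (r × d_in) alongside a frozen weight matrix of shape (d_out × d_in). The helper `lora_param_counts`, the layer dimensions, and the rank are arbitrary example values, not the settings used to train this model.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (frozen, trainable) parameter counts for one LoRA-adapted linear layer."""
    frozen = d_in * d_out              # original weight matrix, left untouched
    trainable = rank * (d_in + d_out)  # low-rank factors A (rank x d_in) and B (d_out x rank)
    return frozen, trainable


# Arbitrary example dimensions, not this model's configuration
frozen, trainable = lora_param_counts(d_in=2048, d_out=2048, rank=8)
print(f"frozen={frozen:,} trainable={trainable:,} ratio={trainable / frozen:.2%}")
```

Under these assumed values, the trainable factors amount to less than 1% of the size of the frozen matrix they adapt.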

	#### Training Hyperparameters

	- Training regime: [More Information Needed]

	## Evaluation

	The model was evaluated on a held-out test set of 11,125 SQL queries (5,658 secure and 5,467 vulnerable), reaching 0.9984 accuracy, 0.9974 precision, 0.9993 recall, and a 0.9984 F1-score. The full classification report appears under Model Description.

	## Environmental Impact

	[More Information Needed]

	## Technical Specifications

	### Model Architecture and Objective

	The model is based on the google/gemma-2b-it architecture, fine-tuned for binary classification of SQL queries.

	### Compute Infrastructure

	#### Software

	- PEFT 0.8.2
	- Transformers [version needed]
	- PyTorch [version needed]

	## Model Card Contact

	For questions or concerns about this model, please contact Mahesh Jamdade through the [Hugging Face repository](https://huggingface.co/maheshmnj/sql-injection-classifier).