🧱 Dockerfile Quality Classifier – Binary Model

This model predicts whether a given Dockerfile is:

  • ✅ GOOD – clean and adheres to best practices (no top rule violations)
  • ❌ BAD – violates at least one important rule (from Hadolint)

It is the first step in a full ML-based Dockerfile linter.


🧠 Model Overview

  • Architecture: Fine-tuned microsoft/codebert-base
  • Task: Binary classification (good vs bad)
  • Input: Full Dockerfile content as plain text
  • Output: [prob_good, prob_bad] (softmax scores)
  • Max input length: 512 tokens
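
The output format above can be illustrated with a small standard-library sketch. It assumes, as the Quick Start script does, that index 0 is the GOOD class and index 1 is BAD. The function names `softmax` and `to_label` and the example logits are hypothetical, chosen only for illustration.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_label(probs):
    """Map [prob_good, prob_bad] to a label; index 0 is assumed to be GOOD."""
    return "GOOD" if probs[0] >= probs[1] else "BAD"

# Hypothetical logits for one Dockerfile (not real model output).
example_logits = [2.0, -1.0]
probs = softmax(example_logits)
print(to_label(probs), [round(p, 3) for p in probs])
```

The two probabilities always sum to 1, so either value alone determines the prediction.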

📚 Training Details

  • Data source: Real-world and synthetic Dockerfiles
  • Labels: Based on Hadolint top 30 rules
  • Bad examples: At least one rule violated
  • Good examples: Fully clean files
  • Class distribution: 15,000 BAD / 1,500 GOOD (clean), a 10:1 imbalance
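
The labeling rule above can be sketched as follows. The rule set here is a placeholder: the codes are real Hadolint rule IDs, but this card does not list which 30 rules were used, so the contents of `TOP_RULES` and the helper `label_dockerfile` are hypothetical.

```python
# Placeholder subset; the actual top-30 rule list is not given in this card.
TOP_RULES = {"DL3007", "DL3008", "DL3020", "DL3025"}

def label_dockerfile(violated_rules, top_rules=TOP_RULES):
    """Return 'bad' if any violated rule is in the tracked set, else 'good'.

    Assumption: violations of untracked rules do not affect the label.
    """
    return "bad" if set(violated_rules) & set(top_rules) else "good"

print(label_dockerfile(["DL3007", "DL3059"]))  # one tracked rule violated
print(label_dockerfile(["DL3059"]))            # only an untracked rule violated
print(label_dockerfile([]))                    # no violations at all
```

In practice, the list of violated rules would come from running Hadolint on each file.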

🧪 Evaluation Results

Evaluation on a held-out test set of 1,650 Dockerfiles:

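| Class | Precision | Recall | F1-score | Support |
|-------|-----------|--------|----------|---------|
| good  | 0.96      | 0.91   | 0.93     | 150     |
| bad   | 0.99      | 1.00   | 0.99     | 1500    |

Accuracy: 0.99 (support: 1650)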
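
As a consistency check, the F1 scores in the table can be recomputed from the reported precision and recall. The sketch below also computes the accuracy of a trivial baseline that labels every file "bad", which is useful context given the 10:1 class ratio. The helper name `f1_score` is illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall copied from the evaluation table.
print(round(f1_score(0.96, 0.91), 2))  # good class
print(round(f1_score(0.99, 1.00), 2))  # bad class

# Accuracy of a trivial baseline that labels every test file "bad".
support_good, support_bad = 150, 1500
print(round(support_bad / (support_good + support_bad), 3))
```

This prints 0.93, 0.99, and 0.909. The recomputed F1 scores match the table, and the baseline shows that accuracy alone overstates performance on data this imbalanced, so the scores for the good class are the more informative ones.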

🚀 Quick Start

🧪 Step 1: Create the test script

Save this as test_binary_predict.py:

import sys
from pathlib import Path

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "LeeSek/binary-dockerfile-model"

# Read the Dockerfile whose path is given as the first command-line argument.
path = Path(sys.argv[1])
text = path.read_text(encoding="utf-8")

# Load the fine-tuned tokenizer and classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

# Tokenize the file; content beyond 512 tokens is truncated.
inputs = tokenizer(text, return_tensors="pt", padding="max_length", truncation=True, max_length=512)

# Run inference without gradient tracking and convert logits to probabilities.
with torch.no_grad():
    logits = model(**inputs).logits
    probs = torch.nn.functional.softmax(logits, dim=1).squeeze()

# Index 0 is GOOD, index 1 is BAD.
label = "GOOD" if torch.argmax(probs).item() == 0 else "BAD"
print(f"Prediction: {label} - Probabilities: good={probs[0]:.3f}, bad={probs[1]:.3f}")

📄 Step 2: Create a good and a bad Dockerfile

Good:

FROM node:18
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "index.js"]

Bad:

FROM ubuntu:latest
RUN apt-get install python3
ADD . /app
WORKDIR /app
RUN pip install flask
CMD python3 app.py
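
To make the difference between the two files concrete, the sketch below flags a few patterns in the bad example using simple regular expressions. This is a simplified illustration, not Hadolint and not part of the model. The function name `find_issues` is hypothetical, the rule IDs are the Hadolint codes for comparable checks, and this card does not state whether these particular rules are among the 30 used for labeling.

```python
import re

GOOD_DOCKERFILE = """\
FROM node:18
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "index.js"]
"""

BAD_DOCKERFILE = """\
FROM ubuntu:latest
RUN apt-get install python3
ADD . /app
WORKDIR /app
RUN pip install flask
CMD python3 app.py
"""

def find_issues(dockerfile_text):
    """Return (rule_id, line) pairs for a few easily detected patterns."""
    issues = []
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if re.match(r"FROM\s+\S+:latest\b", stripped, re.IGNORECASE):
            issues.append(("DL3007", stripped))  # unpinned 'latest' tag
        if re.match(r"ADD\s", stripped, re.IGNORECASE):
            issues.append(("DL3020", stripped))  # ADD used where COPY would do
        if re.match(r"CMD\s+(?!\[)\S", stripped, re.IGNORECASE):
            issues.append(("DL3025", stripped))  # shell-form CMD, not JSON array
        if re.search(r"apt-get\s+install\b(?!.*\s-y\b)", stripped):
            issues.append(("DL3014", stripped))  # apt-get install without -y
    return issues

for name, text in (("good", GOOD_DOCKERFILE), ("bad", BAD_DOCKERFILE)):
    print(name, [rule for rule, _ in find_issues(text)])
```

On the two examples above, the bad file is flagged four times and the good file not at all. Unpinned apt and pip packages, which Hadolint also reports, are left out to keep the sketch short.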

▶️ Step 3: Run the prediction

python test_binary_predict.py Dockerfile

Expected output for the good example:

Prediction: GOOD - Probabilities: good=0.998, bad=0.002

🗂 Extras

The full training and evaluation pipeline (data preparation, training, validation, and prediction) is available in the scripts/ folder.

💬 Note: The scripts use Polish comments and variable names, which were convenient during local development. The logic itself is fully portable.


📘 License

MIT


🙌 Credits
