🚀 Try It Live (Google Colab Demo)

You can test the Truth-Zeeker AI model directly in Google Colab using the link below. This demo notebook automatically loads the model from Hugging Face and runs inference on a small pseudonymized Zeek dataset.

Open In Colab


📊 Demo Outputs

Sample visualization:
This chart shows the top anomalous hosts detected by Truth-Zeeker AI on a pseudonymized VLAN dataset (for demonstration only).

Top Anomalies Chart
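A ranking like the one in the chart can be reproduced with a few lines of pandas. The sketch below is illustrative only: the column names `host` and `anomaly_score` and the sample rows are assumptions, not the actual schema of the demo dataset, and it assumes the scikit-learn convention that lower scores are more anomalous.

```python
import pandas as pd


def top_anomalous_hosts(df, n=10, host_col="host", score_col="anomaly_score"):
    """Return the n rows with the lowest (most anomalous) scores."""
    # Assumption: lower score = more anomalous, as with scikit-learn's
    # IsolationForest.decision_function and score_samples.
    ranked = df.nsmallest(n, score_col)
    return ranked[[host_col, score_col]].reset_index(drop=True)


# Hypothetical stand-in for the scored host table (DocNet addresses only).
demo = pd.DataFrame({
    "host": ["192.0.2.10", "192.0.2.11", "198.51.100.7", "203.0.113.5"],
    "anomaly_score": [0.12, -0.31, 0.05, -0.08],
})

top2 = top_anomalous_hosts(demo, n=2)
print(top2)
```

If the real CSV stores scores with the opposite sign, swap `nsmallest` for `nlargest`.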


🧠 Model and Data

The model and data files are hosted on Hugging Face in the repository
dr-rakshith-truth-zeeker/truth-zeeker-ai-demo


🧩 Recent Update (2025-10-28)

A newly trained variant, isoforest_and_scaler_20251029TXXXXXXZ.joblib, has been uploaded.
This version was generated from the latest VLAN DocNet-sanitized captures, using the unified Zeek → ML pipeline under controlled offline conditions.

It extends the baseline model (model_20251020.joblib) by:

  • Incorporating richer Zeek features extracted from real benign network flows
  • Maintaining strict anonymization (RFC 5737 DocNet addresses)
  • Improving consistency across future SageMaker and Security Onion Lite training experiments

📘 Note: All datasets and captures used remain fully sanitized and pseudonymized for educational and research purposes only.
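For readers unfamiliar with RFC 5737, the sketch below shows one way an address could be mapped into the three documentation ranges and how a file's addresses could be checked afterwards. It is not the project's actual sanitization code: the salted-hash scheme, the `to_docnet` and `is_docnet` helper names, and the `demo-salt` value are all assumptions made for illustration.

```python
import hashlib
import ipaddress

# The three IPv4 documentation ranges reserved by RFC 5737.
DOCNET_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),     # TEST-NET-1
    ipaddress.ip_network("198.51.100.0/24"),  # TEST-NET-2
    ipaddress.ip_network("203.0.113.0/24"),   # TEST-NET-3
]


def to_docnet(ip, salt="demo-salt"):
    """Map an IPv4 address to a stable pseudonym inside the DocNet ranges."""
    digest = hashlib.sha256((salt + ip).encode("utf-8")).digest()
    network = DOCNET_RANGES[digest[0] % len(DOCNET_RANGES)]
    host_octet = 1 + digest[1] % 254  # 1..254, skipping .0 and .255
    return str(network.network_address + host_octet)


def is_docnet(ip):
    """Return True if the address falls inside an RFC 5737 range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in DOCNET_RANGES)


pseudonym = to_docnet("10.0.0.5")
print(pseudonym, is_docnet(pseudonym), is_docnet("10.0.0.5"))
```

The three ranges hold only 762 usable host addresses, so a hash-based mapping like this can collide on larger networks. A lookup table that assigns pseudonyms sequentially avoids that, at the cost of keeping state.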



🔹 Latest Update: October 29, 2025

Model: model_docnet_20251029T131457Z.joblib
Dataset: vlan_docnet_outputs/host_features_with_scores.csv (DocNet-anonymized Zeek capture)
Platform: Google Colab (Free Tier)
Frameworks: pandas · scikit-learn · joblib
Pipeline: StandardScaler + IsolationForest

This version was trained directly on DocNet-anonymized Zeek outputs, improving feature diversity and better representing realistic VLAN traffic patterns.
All files have been validated as sanitized before upload.
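A minimal sketch of a StandardScaler + IsolationForest pipeline, trained and saved with joblib, is shown below. It illustrates the general pattern rather than the exact training script: the synthetic data, the hyperparameters, the `model_demo.joblib` filename, and the choice to wrap both steps in a scikit-learn `Pipeline` are assumptions. The feature names are the three listed in the model card.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

FEATURES = ["duration", "orig_bytes", "resp_bytes"]

# Synthetic stand-in for the per-host feature table (not real traffic).
rng = np.random.default_rng(0)
train = pd.DataFrame({
    "duration": rng.exponential(2.0, 200),
    "orig_bytes": rng.integers(100, 5000, 200),
    "resp_bytes": rng.integers(100, 20000, 200),
})

# Scale the features, then fit the Isolation Forest on the scaled values.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("isoforest", IsolationForest(n_estimators=100, random_state=0)),
])
pipeline.fit(train[FEATURES])

# Round-trip through joblib, as a saved .joblib artifact would be used.
path = os.path.join(tempfile.mkdtemp(), "model_demo.joblib")
joblib.dump(pipeline, path)
reloaded = joblib.load(path)

scores = reloaded.decision_function(train[FEATURES])
print(scores[:5])
```

Saving the scaler and the forest as one object means the same scaling is applied at inference time without any extra bookkeeping. Only load joblib files from sources you trust, since the format is pickle-based.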

➡️ This update completes the first open, reproducible model training cycle for Truth-Zeeker AI.
Future training (v1.0.3 and beyond) will explore longer runs on open-source compute environments or cloud frameworks such as SageMaker or Kaggle.


🧠 VLAN DocNet Model Visualization

Anomaly Score Distribution (VLAN DocNet)

📊 Visualization of anomaly scores generated by the model_docnet_20251029T131457Z.joblib pipeline.
This vertically oriented plot shows results from the DocNet-anonymized VLAN capture dataset.
Differences in orientation and scaling are intentional: they reflect the updated feature distribution and processing flow introduced in the new training pipeline.

πŸ—“οΈ Changelog:
This update marks the first DocNet-trained version of Truth-Zeeker AI, introducing VLAN-level anonymized datasets and a refined Isolation Forest pipeline for cleaner feature scaling and anomaly visualization.
The model (model_docnet_20251029T131457Z.joblib) and corresponding outputs were generated via the latest Colab training workflow and uploaded directly to this repository.

Truth-Zeeker AI: Model Card (demo)

Overview

Small demonstration model for the Truth-Zeeker AI pipeline.
This repo contains a tiny synthetic dataset and a demo script that trains/loads a minimal model and shows predictions.

Intended use

Educational / research demo only. Not for production. Use only with sanitized or synthetic data.

Model details

  • Algorithm (demo): IsolationForest (scikit-learn) for anomaly scoring
  • Input features: duration, orig_bytes, resp_bytes
  • Output: anomaly score / binary flag
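The relationship between the anomaly score and the binary flag can be sketched with scikit-learn directly. The data, the `contamination` value, and the example flow below are assumptions chosen for illustration. The one fixed point is scikit-learn's own convention: `predict` returns -1 exactly where `decision_function` is negative.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic "normal" flows; columns: duration, orig_bytes, resp_bytes.
normal = np.column_stack([
    rng.normal(1.0, 0.2, 300),
    rng.normal(1000.0, 100.0, 300),
    rng.normal(5000.0, 500.0, 300),
])
# One hypothetical flow far outside the training range on every feature.
unusual = np.array([[60.0, 500000.0, 10.0]])

model = IsolationForest(contamination=0.01, random_state=42).fit(normal)

X = np.vstack([normal, unusual])
scores = model.decision_function(X)  # lower means more anomalous
flags = model.predict(X) == -1       # True where the model flags an anomaly

print("score of the unusual flow:", scores[-1])
print("flows flagged:", int(flags.sum()))
```

The `contamination` parameter sets where the score threshold falls, so the number of flagged flows depends on it as much as on the data.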

Limitations

  • Demo model is trained on synthetic data and is not validated on real traffic.
  • Do not use with real PHI/PII or production network environments.

License

MIT
