AdityasArsenal/YogaDataSet
This repository contains a small, script-first pipeline to prepare data, extract pose landmarks with MediaPipe, train machine‑learning pose classifiers, and run a real‑time webcam demo.
The sections below explain what each Python script in the project root does and how to use it on macOS (zsh). For dependencies, see requirements.txt.
pip install -r requirements.txt
Optional but recommended: create and activate a virtual environment before installing.
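1. Extract images from the Parquet files with extract_images.py.
2. Run pose_detection.py to generate per‑image pose landmark JSON files under PoseData/label_*.
3. Train a classifier with ml_pose_classifier.py. Optionally export to ONNX or TFLite.
4. Run the webcam demo with realtime_pose_classifier.py using your saved model.
extract_images.py
Purpose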
Extract images from the Parquet files (YogaDataSet/data/) and save them into folders by label for training and testing.
Inputs/Outputs
- Input: YogaDataSet/data/train-00000-of-00001.parquet, YogaDataSet/data/test-00000-of-00001.parquet
- Output: TrainData/train/label_* and/or TrainData/test/label_*
Usage
# Process both train and test (default behavior)
python extract_images.py
# Train only
python extract_images.py --train --output TrainData
# Test only to a custom folder
python extract_images.py --test --output MyOutputDir
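The extraction step can be pictured with the minimal sketch below. It is not the actual extract_images.py: the row layout (an `image` dict with `bytes` and `path`, plus an integer `label`) is an assumption based on how image datasets are commonly stored in Parquet, and `save_rows` is a hypothetical helper name.

```python
from pathlib import Path


def save_rows(rows, output_dir, split):
    """Write each image into <output_dir>/<split>/label_<n>/ and return the paths.

    Assumption: every row looks like
    {"image": {"bytes": b"...", "path": "name.jpg"}, "label": 3}.
    """
    written = []
    for index, row in enumerate(rows):
        image = row["image"]
        # Keep the original extension; fall back to .jpg if the path is missing.
        suffix = Path(image.get("path") or "").suffix or ".jpg"
        label_dir = Path(output_dir) / split / f"label_{row['label']}"
        label_dir.mkdir(parents=True, exist_ok=True)
        target = label_dir / f"{index:05d}{suffix}"
        target.write_bytes(image["bytes"])
        written.append(target)
    return written
```

If pandas is installed, rows in this shape could be obtained with `pandas.read_parquet(path).to_dict("records")`, provided the Parquet columns really are named `image` and `label`.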
Notes
- The script creates label_0, label_1, … subfolders and writes image files with their original extensions.
pose_detection.py
Purpose
Preprocessing
Inputs/Outputs
- Input: an image folder (e.g., TrainData/train) organized as label_*/*.jpg|png|…
- Output: PoseData/label_*/<image_name>.json
Usage
# Process images from default input into PoseData
python pose_detection.py
# Custom input and output
python pose_detection.py --input TrainData/train --output PoseData --batch-size 100
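The input-to-output mapping and the batching implied by `--batch-size` might look like the sketch below. Both helpers are hypothetical. The sketch assumes that each JSON file reuses the image's name without its extension and that the batch size is the number of images handled per chunk; the real script may differ on both points.

```python
from pathlib import Path


def json_path_for(image_path, input_root, output_root):
    """Map <input_root>/label_k/name.ext to <output_root>/label_k/name.json.

    Assumption: the JSON file reuses the image's stem as its name.
    """
    relative = Path(image_path).relative_to(input_root)
    return Path(output_root) / relative.with_suffix(".json")


def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]
```

Processing in fixed-size chunks keeps memory use bounded on large image folders, which is one plausible reason for exposing a batch-size option.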
Tips
- Install the dependencies before running (see requirements.txt).
ml_pose_classifier.py
Purpose
Data expectation
- PoseData/label_0/*.json
- PoseData/label_1/*.json
Common options
- --data/-d: Pose JSON root (default: PoseData)
- --model/-m: Model type: random_forest (default), svm, gradient_boost, logistic, distilled_rf
- --test-size/-t: Test split ratio (default: 0.2)
- --save-model/-s: Path to save the trained model (.pkl via joblib)
- --load-model/-l: Path to load an existing model
- --predict/-p: Predict a single JSON file
- --evaluate/-e: Evaluate a folder of JSON files
- --export-onnx: Export the trained model to ONNX (tree models or distilled MLP)
- --export-model-type: Controls which model flavor to export
- --export-tflite: Export distilled student MLP to TFLite (requires extra deps)
Typical commands
# 1) Train a Random Forest and save it
python ml_pose_classifier.py \
--data PoseData \
--model random_forest \
--test-size 0.2 \
--save-model models/pose_classifier_random_forest.pkl
# 2) Evaluate a saved model on a held‑out folder (e.g., TestData)
python ml_pose_classifier.py \
--model random_forest \
--load-model models/pose_classifier_random_forest.pkl \
--evaluate TestData
# 3) Export to ONNX (Random Forest or distilled MLP)
python ml_pose_classifier.py \
--model random_forest \
--load-model models/pose_classifier_random_forest.pkl \
--export-onnx models/pose_classifier_random_forest.onnx
# 4) Knowledge distillation: train RF teacher + MLP student
python ml_pose_classifier.py \
--data PoseData \
--model distilled_rf \
--save-model models/pose_classifier_distilled_rf.pkl
# 5) Export the student MLP to TFLite (extra packages required)
python ml_pose_classifier.py \
--model distilled_rf \
--load-model models/pose_classifier_distilled_rf.pkl \
--export-tflite models/pose_classifier_distilled_mlp.tflite
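Before training, the per-image JSON files have to be turned into feature vectors and labels. The sketch below shows one way this could be done with the standard library only. The JSON schema (a `landmarks` list of `x`/`y`/`z` dicts), the numeric folder suffix, and the helper name `load_pose_dataset` are all assumptions, not the actual format written by pose_detection.py.

```python
import json
from pathlib import Path


def load_pose_dataset(data_root="PoseData"):
    """Return (features, labels) built from <data_root>/label_*/*.json.

    Assumption: each file holds {"landmarks": [{"x": .., "y": .., "z": ..}, ...]}
    and every folder is named label_<integer>.
    """
    features, labels = [], []
    for label_dir in sorted(Path(data_root).glob("label_*")):
        label = int(label_dir.name.split("_", 1)[1])
        for json_file in sorted(label_dir.glob("*.json")):
            landmarks = json.loads(json_file.read_text())["landmarks"]
            # Flatten every landmark into one fixed-order feature vector.
            row = [point[axis] for point in landmarks for axis in ("x", "y", "z")]
            features.append(row)
            labels.append(label)
    return features, labels
```

The two returned lists are in the shape scikit-learn estimators accept, for example `RandomForestClassifier().fit(features, labels)`.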
Notes
- ONNX export requires skl2onnx and onnx. TFLite export additionally needs onnx-tf and tensorflow.
- Some model types (svm, logistic) are not supported by Unity Barracuda. Prefer random_forest or the distilled MLP for deployment.
realtime_pose_classifier.py
Purpose
Model loading
If --model is not provided, the script auto‑searches common filenames in the project root:
- pose_classifier_random_forest.pkl
- pose_classifier_logistic.pkl
- pose_classifier_distilled_rf.pkl
Usage
# Auto‑detect a model and open the default camera (0)
python realtime_pose_classifier.py
# Specify a model file and camera index
python realtime_pose_classifier.py \
--model models/pose_classifier_random_forest.pkl \
--camera 0
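The model auto-detection described under "Model loading" can be sketched as a simple first-match search. `find_default_model` is a hypothetical name, and the search order (the order in which the filenames are listed) is an assumption; the actual script may prioritize differently.

```python
from pathlib import Path

# Candidate filenames as listed under "Model loading"; the order is an assumption.
DEFAULT_MODEL_NAMES = (
    "pose_classifier_random_forest.pkl",
    "pose_classifier_logistic.pkl",
    "pose_classifier_distilled_rf.pkl",
)


def find_default_model(project_root="."):
    """Return the first existing candidate path, or None if none is found."""
    for name in DEFAULT_MODEL_NAMES:
        candidate = Path(project_root) / name
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` instead of raising lets the caller decide how to report a missing model, for instance by asking the user to pass `--model` explicitly.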
Keyboard controls
Notes
Project structure
- YogaDataSet/data/: Parquet files used by extract_images.py.
- TrainData/train|test/label_*/: Image folders produced by extraction.
- PoseData/label_*/: Landmark JSONs generated by pose_detection.py.
- models/: Example trained/exported models and label mappings.
- confusion_matrix_*.png: Saved confusion matrix plots (when enabled in the training script).
Troubleshooting
- Import errors: make sure mediapipe and opencv-python are installed.
- Camera not opening: try a different --camera index, close other apps using the camera, or allow camera permissions for Python in macOS Privacy settings.
- Model not found: pass --model with an explicit path to your .pkl file.