Ecologia Gas Consumption Model

Model Description

This model predicts gas_consumption (m³) for buildings using tree-based ensemble regression. Two candidates were trained, Random Forest and XGBoost, and the better one is published.

Model Architecture: Random Forest Regressor (Best Model)
Task: Regression (Energy Consumption Prediction)
Target Variable: gas_consumption (m³)
Input Features: 22 features
Training Dataset: Building Data Genome Project 2
Training Samples: ~15 million

Model Performance

Random Forest Model

RMSE: 459.7374 m³
MAE: 131.9079 m³
R² Score: 0.9090

XGBoost Model

RMSE: 499.6148 m³
MAE: 156.0127 m³
R² Score: 0.8925
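
For reference, the three metrics reported above follow their standard definitions. The sketch below computes them in plain Python on made-up numbers; the function name regression_metrics and the sample values are illustrative and are not taken from the training code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)        # residual sum of squares
    rmse = math.sqrt(ss_res / n)               # same unit as the target (m³)
    mae = sum(abs(e) for e in errors) / n      # same unit as the target (m³)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                 # undefined if y_true is constant
    return rmse, mae, r2


# Made-up values, for illustration only.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
rmse, mae, r2 = regression_metrics(y_true, y_pred)
print(f"{rmse:.1f} {mae:.1f} {r2:.3f}")  # 10.0 10.0 0.992
```

RMSE squares each error before averaging, so it weights large misses more heavily than MAE does. An RMSE several times larger than the MAE, as in the figures above, is consistent with a small share of predictions being far off.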

Best Model

The best performing model (based on validation RMSE) is saved as gas_model.joblib.
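
Selecting by validation RMSE amounts to keeping the candidate with the lowest error. A minimal sketch, assuming the RMSE figures in the Model Performance section are the validation scores (the card does not state which split they come from); the dictionary keys and variable names are illustrative.

```python
# RMSE values copied from the Model Performance section.
# Treating them as validation scores is an assumption.
validation_rmse = {
    "random_forest": 459.7374,
    "xgboost": 499.6148,
}

# Lower RMSE is better, so take the key with the smallest value.
best_name = min(validation_rmse, key=validation_rmse.get)
print(best_name)  # random_forest

# The winning estimator object would then be persisted, for example with
# joblib.dump(best_model, "gas_model.joblib"), where best_model is the
# fitted estimator (not defined in this sketch).
```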

Training Details

Dataset

Source: Building Data Genome Project 2
Training Samples: ~15 million
Data Preprocessing:
- Outlier removal (99th percentile)
- Feature engineering (temporal, building, weather features)
- Missing value imputation
- Normalization
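
The card does not say how each preprocessing step was implemented. The sketch below shows one plausible reading, under three assumptions: outliers are rows whose target exceeds the 99th percentile, missing values are filled with the median, and normalization is min-max scaling. The helper names and the toy data are hypothetical.

```python
import statistics


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def preprocess(rows, target="gas_consumption", feature="temperature"):
    # 1. Outlier removal: drop rows whose target exceeds the 99th percentile.
    cutoff = percentile([r[target] for r in rows], 99)
    kept = [dict(r) for r in rows if r[target] <= cutoff]  # copies, input untouched

    # 2. Missing-value imputation: replace None with the median of observed values.
    observed = [r[feature] for r in kept if r[feature] is not None]
    median = statistics.median(observed)
    for r in kept:
        if r[feature] is None:
            r[feature] = median

    # 3. Normalization: rescale the feature to [0, 1] (min-max scaling).
    lo = min(r[feature] for r in kept)
    hi = max(r[feature] for r in kept)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    for r in kept:
        r[feature] = (r[feature] - lo) / span
    return kept


# Toy data: 100 rows, one extreme reading and one missing temperature.
rows = [{"gas_consumption": float(i), "temperature": float(i % 30)} for i in range(1, 101)]
rows[-1]["gas_consumption"] = 10_000.0
rows[0]["temperature"] = None

clean = preprocess(rows)
print(len(rows), "->", len(clean))
```

In a real pipeline the cutoff, median, and scaling range would normally be computed on the training split only and then reused for validation and test data, so that no information leaks across splits.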

Training Method

Algorithms: Random Forest and XGBoost, trained as separate candidate models
Best Model Selection: Lowest validation RMSE
Validation Strategy: Hold-out train/validation/test split (60/20/20)
Hyperparameters: Optimized for large-scale datasets
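
A 60/20/20 hold-out split can be sketched as follows. Random shuffling with a fixed seed is an assumption; the card does not say whether the split was random or chronological, and the function name split_indices is hypothetical.

```python
import random


def split_indices(n_samples, train=0.6, val=0.2, seed=42):
    """Shuffle row indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = round(n_samples * train)
    n_val = round(n_samples * val)
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],  # remainder becomes the test set
    )


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 600 200 200
```

Because the feature set includes lagged consumption, a time-ordered split may give a more realistic error estimate than a random one; which of the two was used here is not documented.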

Feature Engineering

The model uses 22 engineered features including:

Building Features: Type, area, age, location
Temporal Features: Hour, day, month, season, day of week
Weather Features: Temperature, humidity, dew point
Interaction Features: Building-weather interactions
Lag Features: Previous consumption patterns
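
To make the feature groups concrete, here is a small sketch of how temporal, interaction, and lag features might be derived. The names hour, day_of_week, and month mirror the prediction example in this card; the season encoding, the area_x_temperature interaction, and the helper functions are hypothetical. The authoritative list of 22 names is the one stored in gas_features.joblib.

```python
from datetime import datetime


def season_of(month):
    """Map a month (1-12) to a season index, assuming Northern Hemisphere
    meteorological seasons: 0=winter, 1=spring, 2=summer, 3=autumn."""
    return (month % 12) // 3


def temporal_features(ts):
    """Derive calendar features from a timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday == 0 (assumed convention)
        "month": ts.month,
        "season": season_of(ts.month),
    }


def add_lag(readings, lag=1):
    """Pair each reading with the value `lag` steps earlier (None if unavailable)."""
    return [readings[i - lag] if i >= lag else None for i in range(len(readings))]


ts = datetime(2024, 6, 4, 14, 0)
features = temporal_features(ts)

# Building-weather interaction: floor area (m²) times outdoor temperature (°C).
features["area_x_temperature"] = 1000 * 20.5

lagged = add_lag([5.0, 6.0, 7.5])
print(features)
print(lagged)  # [None, 5.0, 6.0]
```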

Usage

Installation

pip install scikit-learn xgboost joblib huggingface_hub pandas

Load Model

from huggingface_hub import hf_hub_download
import joblib

# Download the model and the list of feature names it expects
model_path = hf_hub_download(
    repo_id="codealchemist01/ecologia-gas-model",
    filename="gas_model.joblib",
    token="YOUR_HF_TOKEN"  # Private repos only: use a real token, or remove this line if the repo is public
)

features_path = hf_hub_download(
    repo_id="codealchemist01/ecologia-gas-model",
    filename="gas_features.joblib",
    token="YOUR_HF_TOKEN"  # Private repos only: use a real token, or remove this line if the repo is public
)

# Load model and features
model = joblib.load(model_path)
feature_columns = joblib.load(features_path)

Prediction Example

import pandas as pd

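# Prepare input data (illustrative values; column names and encodings
# must match those stored in feature_columns)
input_data = pd.DataFrame({
    'building_type': ['Office'],
    'area_sqm': [1000],
    'year_built': [2020],
    'temperature': [20.5],
    'humidity': [65],
    'hour': [14],
    'day_of_week': [1],
    'month': [6],
    # ... other required features
})

# Fill any feature missing from this example with 0 so the code runs.
# This is a placeholder only: supply real values for every feature,
# because zero-filled inputs can produce misleading predictions.
for col in feature_columns:
    if col not in input_data.columns:
        input_data[col] = 0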

# Select features in correct order
input_data = input_data[feature_columns]

# Make prediction
prediction = model.predict(input_data)
print(f"Predicted gas_consumption (m³): {prediction[0]:.2f}")

Model Limitations

Performance may vary with building characteristics and region.
Training data comes primarily from North American buildings, so accuracy elsewhere is less certain.
Predictions are estimates and should be validated against actual consumption data.
All 22 input features must be provided at prediction time.
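
Because every feature is required, it can help to fail loudly on incomplete input rather than padding missing columns with zeros. A minimal sketch; the function name missing_features is hypothetical, and the short required list stands in for the real one loaded from gas_features.joblib.

```python
def missing_features(provided, required):
    """Return the required feature names absent from `provided`, in required order."""
    provided = set(provided)
    return [name for name in required if name not in provided]


# Stand-in for the list loaded from gas_features.joblib (the real one has 22 names).
required = ["area_sqm", "temperature", "humidity", "hour"]
row = {"area_sqm": 1000, "temperature": 20.5, "hour": 14}

missing = missing_features(row, required)
if missing:
    print(f"Cannot predict, missing features: {missing}")
```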

Ethical Considerations

Model is designed to help reduce energy consumption and carbon footprint
No personal or sensitive data is used in training
Model predictions should be used responsibly for sustainability purposes

Citation

If you use this model, please cite:

@software{ecologia_energy_model,
  title = {Ecologia Gas Consumption Model},
  author = {Ecologia Energy Team},
  year = {2024},
  url = {https://huggingface.co/codealchemist01/ecologia-gas-model},
  note = {Trained on Building Data Genome Project 2 dataset}
}

License

This model is released under the MIT License.

Contact

For questions or issues, please open an issue on the repository or contact the Ecologia Energy team.

Acknowledgments

Building Data Genome Project 2 dataset creators
scikit-learn and XGBoost communities
HuggingFace for model hosting

This model is part of the Ecologia sustainability platform for energy consumption prediction and carbon footprint calculation.
