πŸ“§ Spam Email Classifier using BiLSTM

This model uses a Bidirectional LSTM (BiLSTM) architecture built with Keras to classify email messages as Spam or Ham. It was trained on the Enron Spam Dataset using GloVe word embeddings.


🧠 Model Architecture

  • Tokenizer: Keras Tokenizer trained on the Enron dataset
  • Embedding: Pretrained GloVe.6B.100d
  • Model: Embedding β†’ BiLSTM β†’ Dropout β†’ Dense(sigmoid) (see the sketch after this list)
  • Input: English email/message text
  • Output: 0 = Ham, 1 = Spam
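
The layer stack above corresponds roughly to the following Keras definition. This is a minimal sketch for orientation only: the vocabulary size, LSTM units, and dropout rate are illustrative assumptions, not the trained model's actual hyperparameters, and the real embedding matrix is built from the GloVe.6B.100d vectors rather than random values.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dropout, Dense

VOCAB_SIZE = 20000  # assumed tokenizer vocabulary size
EMBED_DIM = 100     # GloVe.6B.100d vectors are 100-dimensional

# Placeholder for the GloVe-initialized embedding matrix
# (row i would hold the 100-d GloVe vector of the tokenizer's i-th word).
embedding_matrix = np.random.normal(size=(VOCAB_SIZE, EMBED_DIM))

model = Sequential([
    Embedding(VOCAB_SIZE, EMBED_DIM, weights=[embedding_matrix], trainable=False),
    Bidirectional(LSTM(64)),         # assumed number of LSTM units
    Dropout(0.5),                    # assumed dropout rate
    Dense(1, activation="sigmoid"),  # outputs P(spam)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])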

πŸ§ͺ Example Usage

import pickle

from huggingface_hub import hf_hub_download
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Load files from HF Hub
model_path = hf_hub_download("lokas/spam-emails-classifier", "model.h5")
tokenizer_path = hf_hub_download("lokas/spam-emails-classifier", "tokenizer.pkl")

# Load model and tokenizer
model = load_model(model_path)
with open(tokenizer_path, "rb") as f:
    tokenizer = pickle.load(f)

# Prediction function
def predict_spam(text):
    seq = tokenizer.texts_to_sequences([text])
    padded = pad_sequences(seq, maxlen=50)  # must match training maxlen
    pred = model.predict(padded)[0][0]
    return "🚫 Spam" if pred > 0.5 else "βœ… Not Spam"

# Example
print(predict_spam("Win a free iPhone now!"))
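
The same model and tokenizer can also score several messages in a single model.predict call; the message strings below are made-up examples:

# Classify a batch of messages at once
messages = [
    "Win a free iPhone now!",
    "Are we still meeting for lunch tomorrow?",
]
seqs = tokenizer.texts_to_sequences(messages)
padded = pad_sequences(seqs, maxlen=50)  # same maxlen as above
probs = model.predict(padded).ravel()
for msg, p in zip(messages, probs):
    print(f"{'🚫 Spam' if p > 0.5 else 'βœ… Not Spam'} ({p:.2f}): {msg}")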