Update to use consolidated sema-utils models with new API
- Dockerfile +26 -0
- README.md +96 -6
- deploy_to_hf.md +138 -0
- docs/app.py +248 -0
- docs/current-state.md +255 -0
- requirements.txt +8 -0
- sema_translation_api.py +277 -0
- test_api_client.py +201 -0
- test_model_download.py +198 -0
Dockerfile
ADDED
@@ -0,0 +1,26 @@

```dockerfile
# Dockerfile for Sema Translation API on HuggingFace Spaces

# Use an official Python runtime as a parent image
FROM python:3.10-slim

# Set the working directory in the container
WORKDIR /code

# Copy the requirements file into the container at /code
COPY ./requirements.txt /code/requirements.txt

# Install any needed packages specified in requirements.txt
# --no-cache-dir reduces image size
# --upgrade pip ensures we have the latest version
RUN pip install --no-cache-dir --upgrade pip
RUN pip install --no-cache-dir -r /code/requirements.txt

# Copy the application code to the working directory
COPY ./sema_translation_api.py /code/sema_translation_api.py

# Expose port 7860 (HuggingFace Spaces standard)
EXPOSE 7860

# Tell uvicorn to run on port 7860, which is the standard for HF Spaces
# Use 0.0.0.0 to make it accessible from outside the container
CMD ["uvicorn", "sema_translation_api:app", "--host", "0.0.0.0", "--port", "7860"]
```
README.md
CHANGED
@@ -1,12 +1,102 @@

~~~diff
 ---
-title: Sema
-emoji:
-colorFrom:
-colorTo:
+title: Sema Translation API
+emoji: 🌍
+colorFrom: blue
+colorTo: green
 sdk: docker
 pinned: false
 license: mit
-short_description: Translation
+short_description: Translation API using consolidated sema-utils models
 ---
 
-
+# Sema Translation API 🌍
+
+A powerful translation API that supports multiple African languages using the consolidated `sematech/sema-utils` model repository.
+
+## Features
+
+- **Automatic Language Detection**: Detects the source language automatically if not provided
+- **Multi-language Support**: Supports 200+ languages via FLORES-200 codes
+- **Fast Translation**: Uses CTranslate2 for optimized inference
+- **RESTful API**: Clean FastAPI interface with automatic documentation
+- **Consolidated Models**: Uses models from the unified `sematech/sema-utils` repository
+
+## API Endpoints
+
+### `GET /`
+Health check endpoint that returns API status and version information.
+
+### `POST /translate`
+Main translation endpoint that accepts:
+
+**Request Body:**
+```json
+{
+  "text": "Habari ya asubuhi",
+  "target_language": "eng_Latn",
+  "source_language": "swh_Latn"
+}
+```
+
+**Response:**
+```json
+{
+  "translated_text": "Good morning",
+  "source_language": "swh_Latn",
+  "target_language": "eng_Latn",
+  "inference_time": 0.234,
+  "timestamp": "Monday | 2024-06-21 | 14:30:25"
+}
+```
+
+## Language Codes
+
+This API uses FLORES-200 language codes. Some common examples:
+
+- `eng_Latn` - English
+- `swh_Latn` - Swahili
+- `kik_Latn` - Kikuyu
+- `luo_Latn` - Luo
+- `fra_Latn` - French
+- `spa_Latn` - Spanish
+
+## Usage Examples
+
+### Python
+```python
+import requests
+
+response = requests.post("https://your-space-url/translate", json={
+    "text": "Habari ya asubuhi",
+    "target_language": "eng_Latn"
+})
+
+print(response.json())
+```
+
+### cURL
+```bash
+curl -X POST "https://your-space-url/translate" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "text": "Wĩ mwega?",
+    "source_language": "kik_Latn",
+    "target_language": "eng_Latn"
+  }'
+```
+
+## Model Information
+
+This API uses models from the consolidated `sematech/sema-utils` repository:
+
+- **Translation Model**: `sematrans-3.3B` (CTranslate2 optimized)
+- **Language Detection**: `lid218e.bin` (FastText)
+- **Tokenization**: `spm.model` (SentencePiece)
+
+## API Documentation
+
+Once the Space is running, visit `/docs` for interactive API documentation.
+
+---
+
+Created by Lewis Kamau Kimaru | Sema AI
~~~
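The FLORES-200 codes listed above share a fixed shape: a lowercase three-letter language code, an underscore, and a capitalized four-letter script code. A small client-side sanity check (an illustrative helper, not part of the API) could catch malformed codes before a request is sent:

```python
import re

# FLORES-200 codes look like "eng_Latn": three lowercase letters,
# an underscore, then a capitalized four-letter script code.
# This only checks the shape, not whether the code actually exists.
FLORES_CODE = re.compile(r"^[a-z]{3}_[A-Z][a-z]{3}$")

def looks_like_flores_code(code: str) -> bool:
    return bool(FLORES_CODE.match(code))

print(looks_like_flores_code("swh_Latn"))  # True
print(looks_like_flores_code("swahili"))   # False
```

Rejecting obviously malformed codes client-side saves a round trip to the API for what would otherwise be a 4xx/5xx response.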
deploy_to_hf.md
ADDED
@@ -0,0 +1,138 @@

# Deployment Instructions for HuggingFace Spaces

## Files Ready for Deployment

Your HuggingFace Space needs these files (all created and ready):

1. **`sema_translation_api.py`** - Main API application
2. **`requirements.txt`** - Python dependencies
3. **`Dockerfile`** - Container configuration
4. **`README.md`** - Space documentation and metadata

## Deployment Steps

### Option 1: Using Git (Recommended)

1. **Navigate to your existing HF Space repository:**
   ```bash
   cd backend/sema-api
   ```

2. **The files are ready to deploy as-is:**
   ```bash
   # All files are ready:
   # - sema_translation_api.py (main application)
   # - requirements.txt
   # - Dockerfile
   # - README.md
   ```

3. **Commit and push to HuggingFace:**
   ```bash
   git add .
   git commit -m "Update to use consolidated sema-utils models with new API"
   git push origin main
   ```

### Option 2: Using HuggingFace Web Interface

1. Go to your Space: `https://huggingface.co/spaces/sematech/sema-api`
2. Click on the "Files" tab
3. Upload/replace these files:
   - Upload `sema_translation_api.py`
   - Replace `requirements.txt`
   - Replace `Dockerfile`
   - Replace `README.md`

## What Happens After Deployment

1. **Automatic Build**: HF Spaces will automatically start building your Docker container
2. **Model Download**: During build, the app will download models from `sematech/sema-utils`:
   - `spm.model` (SentencePiece tokenizer)
   - `lid218e.bin` (Language detection)
   - `translation_models/sematrans-3.3B/` (Translation model)
3. **API Startup**: Once built, your API will be available at the Space URL

## Testing Your Deployed API

### 1. Health Check
```bash
curl https://sematech-sema-api.hf.space/
```

### 2. Translation with Auto-Detection
```bash
curl -X POST "https://sematech-sema-api.hf.space/translate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Habari ya asubuhi",
    "target_language": "eng_Latn"
  }'
```

### 3. Translation with Source Language
```bash
curl -X POST "https://sematech-sema-api.hf.space/translate" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Wĩ mwega?",
    "source_language": "kik_Latn",
    "target_language": "eng_Latn"
  }'
```

### 4. Interactive Documentation
Visit: `https://sematech-sema-api.hf.space/docs`

## Expected Build Time

- **First build**: 10-15 minutes (downloading ~5 GB of models)
- **Subsequent builds**: 2-5 minutes (models cached)

## Monitoring the Build

1. Go to your Space page
2. Click on the "Logs" tab to see build progress
3. Look for these key messages:
   - "📥 Downloading models from sematech/sema-utils..."
   - "✅ All models loaded successfully!"
   - "🚀 API started successfully!"

## Troubleshooting

### If the Build Fails
1. Check the logs for specific error messages
2. Common issues:
   - Model download timeout (retry the build)
   - Memory issues (the models are large)
   - Network connectivity issues

### If the API Doesn't Respond
1. Check whether the Space is "Running" (green status)
2. Try the health check endpoint first
3. Check the logs for runtime errors

## Key Improvements in This Version

1. **Consolidated Models**: Uses your unified `sema-utils` repository
2. **Better Error Handling**: Clear error messages and validation
3. **Performance Monitoring**: Tracks inference time
4. **Clean API Design**: Follows FastAPI best practices
5. **Automatic Documentation**: Built-in OpenAPI docs
6. **Flexible Input**: Auto-detection or manual source language

## Next Steps After Deployment

1. **Test the API** with various language pairs
2. **Monitor performance** and response times
3. **Update documentation** with your actual Space URL
4. **Consider adding rate limiting** for production use
5. **Add authentication** if needed for private use

## Important Note About File Structure

The Dockerfile correctly references `sema_translation_api:app` (not `app:app`) since the main file is `sema_translation_api.py`. No need to rename files - deploy as-is!

---

Your new API is ready to deploy! 🚀
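The two curl calls above differ only in whether `source_language` is present. A tiny helper for client code (a hypothetical convenience function, not shipped with the API) can build either payload:

```python
from typing import Optional

def build_translate_payload(text: str, target_language: str,
                            source_language: Optional[str] = None) -> dict:
    """Build the JSON body for POST /translate.

    Omit source_language to let the API auto-detect it."""
    payload = {"text": text, "target_language": target_language}
    if source_language is not None:
        payload["source_language"] = source_language
    return payload

# Auto-detection request (matches test 2 above)
print(build_translate_payload("Habari ya asubuhi", "eng_Latn"))
# Explicit-source request (matches test 3 above)
print(build_translate_payload("Wĩ mwega?", "eng_Latn", source_language="kik_Latn"))
```

Passing the result to `requests.post(url, json=...)` reproduces the curl examples exactly.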
docs/app.py
ADDED
@@ -0,0 +1,248 @@

```python
'''
Created By Lewis Kamau Kimaru
Sema translator fastapi implementation
January 2024
Docker deployment
'''

from fastapi import FastAPI, HTTPException, Request, Depends
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import HTMLResponse
import uvicorn

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import ctranslate2
import sentencepiece as spm
import fasttext
import torch

from datetime import datetime
import pytz
import time
import os

app = FastAPI()

origins = ["*"]

app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Set this key as an environment variable
hf_read_key = os.environ.get('huggingface_token')
os.environ["HUGGINGFACEHUB_API_TOKEN"] = hf_read_key

fasttext.FastText.eprint = lambda x: None

# User interface
templates_folder = os.path.join(os.path.dirname(__file__), "templates")


# Get the time of a request
def get_time():
    nairobi_timezone = pytz.timezone('Africa/Nairobi')
    current_time_nairobi = datetime.now(nairobi_timezone)

    curr_day = current_time_nairobi.strftime('%A')
    curr_date = current_time_nairobi.strftime('%Y-%m-%d')
    curr_time = current_time_nairobi.strftime('%H:%M:%S')

    full_date = f"{curr_day} | {curr_date} | {curr_time}"
    return full_date, curr_time


def load_models():
    # Build model and tokenizer
    model_name_dict = {
        #'nllb-distilled-600M': 'facebook/nllb-200-distilled-600M',
        #'nllb-1.3B': 'facebook/nllb-200-1.3B',
        #'nllb-distilled-1.3B': 'facebook/nllb-200-distilled-1.3B',
        #'nllb-3.3B': 'facebook/nllb-200-3.3B',
        #'nllb-moe-54b': 'facebook/nllb-moe-54b',
    }

    model_dict = {}

    for call_name, real_name in model_name_dict.items():
        print('\tLoading model: %s' % call_name)
        model = AutoModelForSeq2SeqLM.from_pretrained(real_name)
        tokenizer = AutoTokenizer.from_pretrained(real_name)
        model_dict[call_name + '_model'] = model
        model_dict[call_name + '_tokenizer'] = tokenizer

    return model_dict


# Load the model and tokenizer ..... only once!
beam_size = 1  # change to a smaller value for faster inference
device = "cpu"  # or "cuda"

print('(note-to-self)..... I play the Orchestra.......')

# Language prediction model
print("\n1️⃣ importing Language Prediction model")
lang_model_file = "lid218e.bin"
lang_model_full_path = os.path.join(os.path.dirname(__file__), lang_model_file)
lang_model = fasttext.load_model(lang_model_full_path)


# Load the source SentencePiece model
print("\n2️⃣ importing SentencePiece model")
sp_model_file = "spm.model"
sp_model_full_path = os.path.join(os.path.dirname(__file__), sp_model_file)
sp = spm.SentencePieceProcessor()
sp.load(sp_model_full_path)

# Import the translator model
print("\n3️⃣ importing Translator model")
ct_model_file = "sematrans-3.3B"
ct_model_full_path = os.path.join(os.path.dirname(__file__), ct_model_file)
translator = ctranslate2.Translator(ct_model_full_path, device)

#model_dict = load_models()

print('\nDone importing models\n')


def translate_detect(userinput: str, target_lang: str):
    source_sents = [userinput]
    source_sents = [sent.strip() for sent in source_sents]
    target_prefix = [[target_lang]] * len(source_sents)

    # Predict the source language
    predictions = lang_model.predict(source_sents[0], k=1)
    source_lang = predictions[0][0].replace('__label__', '')

    # Subword the source sentences
    source_sents_subworded = sp.encode(source_sents, out_type=str)
    source_sents_subworded = [[source_lang] + sent + ["</s>"] for sent in source_sents_subworded]

    # Translate the source sentences
    translations = translator.translate_batch(
        source_sents_subworded,
        batch_type="tokens",
        max_batch_size=2024,
        beam_size=beam_size,
        target_prefix=target_prefix,
    )
    translations = [translation[0]['tokens'] for translation in translations]

    # Desubword the target sentences
    translations_desubword = sp.decode(translations)
    translations_desubword = [sent[len(target_lang):] for sent in translations_desubword]

    # Return the source language and the translated text
    return source_lang, translations_desubword


def translate_enter(userinput: str, source_lang: str, target_lang: str):
    source_sents = [userinput]
    source_sents = [sent.strip() for sent in source_sents]
    target_prefix = [[target_lang]] * len(source_sents)

    # Subword the source sentences
    source_sents_subworded = sp.encode(source_sents, out_type=str)
    source_sents_subworded = [[source_lang] + sent + ["</s>"] for sent in source_sents_subworded]

    # Translate the source sentences
    translations = translator.translate_batch(source_sents_subworded, batch_type="tokens", max_batch_size=2024, beam_size=beam_size, target_prefix=target_prefix)
    translations = [translation[0]['tokens'] for translation in translations]

    # Desubword the target sentences
    translations_desubword = sp.decode(translations)
    translations_desubword = [sent[len(target_lang):] for sent in translations_desubword]

    # Return the translated text
    return translations_desubword[0]


def translate_faster(userinput3: str, source_lang3: str, target_lang3: str):
    if len(model_dict) == 2:
        model_name = 'nllb-moe-54b'

    start_time = time.time()

    model = model_dict[model_name + '_model']
    tokenizer = model_dict[model_name + '_tokenizer']

    translator = pipeline('translation', model=model, tokenizer=tokenizer, src_lang=source_lang3, tgt_lang=target_lang3)
    output = translator(userinput3, max_length=400)
    end_time = time.time()

    output = output[0]['translation_text']
    result = {'inference_time': end_time - start_time,
              'source': source_lang3,
              'target': target_lang3,
              'result': output}
    return result


@app.get("/", response_class=HTMLResponse)
async def read_root(request: Request):
    return HTMLResponse(content=open(os.path.join(templates_folder, "translator.html"), "r").read(), status_code=200)


@app.post("/translate_detect/")
async def translate_detect_endpoint(request: Request):
    datad = await request.json()
    userinputd = datad.get("userinput")
    target_langd = datad.get("target_lang")
    dfull_date = get_time()[0]
    print(f"\nrequest: {dfull_date}\nTarget Language; {target_langd}, User Input: {userinputd}\n")

    if not userinputd or not target_langd:
        raise HTTPException(status_code=422, detail="Both 'userinput' and 'target_lang' are required.")

    source_langd, translated_text_d = translate_detect(userinputd, target_langd)
    dcurrent_time = get_time()[1]
    print(f"\nresponse: {dcurrent_time}; ... Source_language: {source_langd}, Translated Text: {translated_text_d}\n\n")
    return {
        "source_language": source_langd,
        "translated_text": translated_text_d[0],
    }


@app.post("/translate_enter/")
async def translate_enter_endpoint(request: Request):
    datae = await request.json()
    userinpute = datae.get("userinput")
    source_lange = datae.get("source_lang")
    target_lange = datae.get("target_lang")
    efull_date = get_time()[0]
    print(f"\nrequest: {efull_date}\nSource_language; {source_lange}, Target Language; {target_lange}, User Input: {userinpute}\n")

    if not userinpute or not target_lange:
        raise HTTPException(status_code=422, detail="'userinput', 'source_lang' and 'target_lang' are required.")

    translated_text_e = translate_enter(userinpute, source_lange, target_lange)
    ecurrent_time = get_time()[1]
    print(f"\nresponse: {ecurrent_time}; ... Translated Text: {translated_text_e}\n\n")
    return {
        "translated_text": translated_text_e,
    }


@app.post("/translate_faster/")
async def translate_faster_endpoint(request: Request):
    dataf = await request.json()
    userinputf = dataf.get("userinput")
    source_langf = dataf.get("source_lang")
    target_langf = dataf.get("target_lang")
    ffull_date = get_time()[0]
    print(f"\nrequest: {ffull_date}\nSource_language; {source_langf}, Target Language; {target_langf}, User Input: {userinputf}\n")

    if not userinputf or not target_langf:
        raise HTTPException(status_code=422, detail="'userinput', 'source_lang' and 'target_lang' are required.")

    translated_text_f = translate_faster(userinputf, source_langf, target_langf)
    fcurrent_time = get_time()[1]
    print(f"\nresponse: {fcurrent_time}; ... Translated Text: {translated_text_f}\n\n")
    return {
        "translated_text": translated_text_f,
    }


print("\nAPI started successfully 🚀\n")
```
docs/current-state.md
ADDED
@@ -0,0 +1,255 @@

# Current State of the Sema API

### Analysis of the Current `app.py`

You're right to want to refactor this. While it works, it has several issues that make it difficult to maintain and scale:

1. **Global Scope:** All the models (`lang_model`, `sp`, `translator`) are loaded in the global scope of the script. This works for simple scripts but is bad practice in applications. It makes the code hard to test and can lead to unexpected side effects.
2. **Redundant Code:** The functions `translate_detect` and `translate_enter` are almost identical. The only difference is that one detects the source language and the other takes it as an argument. They can be combined into a single, more flexible function.
3. **Unused/Confusing Code:** The `load_models` function, `model_name_dict`, and the entire `translate_faster` endpoint seem to be remnants of a different implementation (likely using standard `transformers` pipelines). This code is not used by the main translation endpoints and adds confusion.
4. **Hardcoded Paths:** Model file paths are constructed using `os.path.join` relative to the script's location (`__file__`). This is not ideal for deployment, especially with Docker, where file paths need to be explicit and predictable.
5. **Direct Request Handling:** The endpoints use `await request.json()` to parse the request body. This works, but using Pydantic models (as shown in the previous example) is the standard FastAPI way, providing automatic validation, type hints, and documentation.
6. **Mixing Concerns:** The script mixes model loading, business logic (translation), utility functions (`get_time`), and API endpoint definitions all in one file. The folder structure we discussed previously solves this by separating these concerns.

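Point 2 can be sketched in a few lines: one function with an optional `source_lang` argument replaces the duplicated logic of both endpoints. The `detect_language` stub below stands in for the fastText model, and the actual translation step is elided:

```python
def detect_language(text: str) -> str:
    # Stand-in for lang_model.predict(); a real implementation would
    # return the FLORES-200 code predicted by lid218e.bin.
    return "swh_Latn"

def resolve_source_lang(text: str, source_lang=None) -> str:
    """Detect the source language only when the caller did not supply one."""
    return source_lang if source_lang else detect_language(text)

print(resolve_source_lang("Habari ya asubuhi"))                  # swh_Latn (detected)
print(resolve_source_lang("Wĩ mwega?", source_lang="kik_Latn"))  # kik_Latn (caller-supplied)
```

With this shape, `translate_detect` and `translate_enter` collapse into a single translate function that calls `resolve_source_lang` first.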
### A Simple, Straightforward Dockerized Script for HF Spaces

Given your goal of a simple, straightforward script that runs in Docker on HF Spaces and uses your centralized models, we can create a much cleaner version.

This setup will:
1. Define the necessary files for a Hugging Face Space.
2. Automatically download your models from `sematech/sema-utils` when the Space builds.
3. Provide a single, clear translation endpoint.
4. Be easy to understand and maintain.

---

#### Step 1: Create the Project Folder and Files

Create a new folder for your Hugging Face Space. Let's call it `sema_api_space`. Inside, create the following files:

```
sema_api_space/
├── app.py            <-- The simplified FastAPI app
├── requirements.txt  <-- Python dependencies
└── Dockerfile        <-- Instructions to build the Docker image
```

---

#### Step 2: Write the Code for Each File

##### **`requirements.txt`**

This file lists the libraries that `pip` will install.

```text
# requirements.txt
fastapi
uvicorn[standard]
ctranslate2
sentencepiece
fasttext-wheel
huggingface_hub
pydantic
```

##### **`Dockerfile`**

This file tells Hugging Face Spaces how to build your application environment. It copies your code, installs dependencies, and defines the command to run the server.

```dockerfile
# Dockerfile

# Use an official Python runtime as a parent image
FROM python:3.10-slim

# Set the working directory in the container
WORKDIR /code

# Copy the requirements file into the container at /code
COPY ./requirements.txt /code/requirements.txt

# Install any needed packages specified in requirements.txt
# --no-cache-dir reduces image size
# --upgrade pip ensures we have the latest version
RUN pip install --no-cache-dir --upgrade pip
RUN pip install --no-cache-dir -r /code/requirements.txt

# Copy the rest of your application code to the working directory
COPY ./app.py /code/app.py

# Tell uvicorn to run on port 7860, which is the standard for HF Spaces
# Use 0.0.0.0 to make it accessible from outside the container
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```

##### **`app.py`**

This is the heart of your application. It's a heavily simplified and cleaned-up version of your original script, using the best practices we discussed.

```python
# app.py

import os
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field
from huggingface_hub import hf_hub_download
import ctranslate2
import sentencepiece as spm
import fasttext

# --- 1. Define Data Schemas (for validation and documentation) ---
class TranslationRequest(BaseModel):
    text: str = Field(..., example="Wĩ mwega?")
    target_language: str = Field(..., example="eng_Latn", description="FLORES-200 code for the target language.")
    source_language: str | None = Field(None, example="kik_Latn", description="Optional FLORES-200 code for the source language.")

class TranslationResponse(BaseModel):
    translated_text: str
    detected_source_language: str

# --- 2. Model Loading ---
# This section runs only ONCE when the application starts.
print("Downloading and loading models...")

# Define the Hugging Face repo and the files to download
REPO_ID = "sematech/sema-utils"
MODELS_DIR = "hf_models"  # A local directory to store the models

# Ensure the local directory exists
os.makedirs(MODELS_DIR, exist_ok=True)

# Download each file and get its local path
try:
    # Note: hf_hub_download automatically handles caching.
    # It won't re-download if the file is already there.
    spm_path = hf_hub_download(repo_id=REPO_ID, filename="spm.model", local_dir=MODELS_DIR)
    ft_path = hf_hub_download(repo_id=REPO_ID, filename="lid218e.bin", local_dir=MODELS_DIR)

    # For CTranslate2 models, it's often better to download the whole directory.
    # We specify the subfolder where the model lives in the repo.
    # The actual model path will be inside the returned directory.
    ct_model_dir = hf_hub_download(
        repo_id=REPO_ID,
        filename="sematrans-3.3B/model.bin",  # A file inside the dir to trigger download
        local_dir=MODELS_DIR
    )
    # The actual path to the CTranslate2 model directory
    ct_path = os.path.dirname(ct_model_dir)

except Exception as e:
    print(f"Error downloading models: {e}")
    # In a real app, you might want to exit or handle this more gracefully.
    exit()

# Suppress the fasttext warning
fasttext.FastText.eprint = lambda x: None

# Load the models into memory
sp_model = spm.SentencePieceProcessor(spm_path)
lang_model = fasttext.load_model(ft_path)
translator = ctranslate2.Translator(ct_path, device="cpu")  # Use "cuda" if your Space has a GPU

print("All models loaded successfully!")


# --- 3. FastAPI Application ---
app = FastAPI(
    title="Sema Simple Translation API",
    description="A simple API using models from sematech/sema-utils on Hugging Face Hub.",
    version="1.0.0"
)

@app.get("/")
def root():
    return {"status": "ok", "message": "Sema Translation API is running."}


@app.post("/translate", response_model=TranslationResponse)
async def translate_endpoint(request: TranslationRequest):
    """
    Performs translation. Detects the source language if it is not provided.
    """
    if not request.text.strip():
        raise HTTPException(status_code=400, detail="Input text cannot be empty.")

    # A single function handles both cases (with or without source_language)
    try:
        # Detect the language if not provided
        source_lang = request.source_language
        if not source_lang:
            # Replace newlines for better language detection
            predictions = lang_model.predict(request.text.replace('\n', ' '), k=1)
            source_lang = predictions[0][0].replace('__label__', '')

        # Prepare for translation (encode a one-element batch)
        source_tokenized = sp_model.encode([request.text], out_type=str)
        source_tokenized = [[source_lang] + sent + ["</s>"] for sent in source_tokenized]

        target_prefix = [[request.target_language]]

        # Perform translation
        results = translator.translate_batch(
            source_tokenized,
            batch_type="tokens",
            max_batch_size=2048,
            beam_size=2,
            target_prefix=target_prefix,
        )

        translated_tokens = results[0].hypotheses[0][1:]  # Exclude the target language token
        translated_text = sp_model.decode(translated_tokens)

        return TranslationResponse(
            translated_text=translated_text,
            detected_source_language=source_lang,
        )
    except Exception as e:
        print(f"An error occurred during translation: {e}")
        raise HTTPException(status_code=500, detail="An internal error occurred during translation.")
```

---

#### Step 3: Create and Deploy the Hugging Face Space

1. **Go to Hugging Face** and click on your profile, then "New Space".
2. **Choose a name** for your Space (e.g., `sema-translation-api`).
3. **Select "Docker"** as the Space SDK.
4. **Choose a template** (e.g., "Blank").
+
5. Click "Create Space".
|
221 |
+
6. **Upload the files:**
|
222 |
+
* Click on the "Files" tab in your new Space.
|
223 |
+
* Click "Add file" -> "Upload files".
|
224 |
+
* Upload the three files you created: `app.py`, `requirements.txt`, and `Dockerfile`.
|
225 |
+
|
226 |
+
The Space will automatically start building the Docker image. You can watch the progress in the "Logs" tab. It will download the models from `sematech/sema-utils` during this build process. Once it's running, you'll have a public API endpoint.
|
227 |
+
|
228 |
+
---
|
229 |
+
|
230 |
+
#### Step 4: Test the Deployed API
|
231 |
+
|
232 |
+
Once your Space is running, you can test it with a simple `curl` command or any HTTP client.
|
233 |
+
|
234 |
+
1. **Find your Space URL:** It will be something like `https://your-username-your-space-name.hf.space`.
|
235 |
+
2. **Run a `curl` command** from your terminal:
|
236 |
+
|
237 |
+
```bash
|
238 |
+
curl -X POST "https://your-username-your-space-name.hf.space/translate" \
|
239 |
+
-H "Content-Type: application/json" \
|
240 |
+
-d '{
|
241 |
+
"text": "Habari ya asubuhi, ulimwengu",
|
242 |
+
"target_language": "eng_Latn"
|
243 |
+
}'
|
244 |
+
```
|
245 |
+
|
246 |
+
**Expected Output:**
|
247 |
+
|
248 |
+
You should receive a JSON response like this:
|
249 |
+
|
250 |
+
```json
|
251 |
+
{
|
252 |
+
"translated_text": "Good morning, world.",
|
253 |
+
"detected_source_language": "swh_Latn"
|
254 |
+
}
|
255 |
+
```
|
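If you prefer Python to `curl`, the same request can be sketched with only the standard library. `API_URL` is a placeholder for your own Space URL, and `build_payload`/`translate` are illustrative helper names, not part of the API:

```python
import json
import urllib.request

API_URL = "https://your-username-your-space-name.hf.space/translate"  # placeholder

def build_payload(text, target_language, source_language=None):
    """Build the JSON body for /translate; source_language stays optional."""
    payload = {"text": text, "target_language": target_language}
    if source_language:
        payload["source_language"] = source_language
    return payload

def translate(text, target_language, source_language=None):
    """POST the payload to the Space and return the decoded JSON response."""
    data = json.dumps(build_payload(text, target_language, source_language)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example (requires a running Space):
# translate("Habari ya asubuhi, ulimwengu", "eng_Latn")
```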
requirements.txt
ADDED
@@ -0,0 +1,8 @@
fastapi
uvicorn[standard]
ctranslate2
sentencepiece
fasttext-wheel
huggingface_hub
pydantic
pytz
sema_translation_api.py
ADDED
@@ -0,0 +1,277 @@
"""
Sema Translation API - New Implementation
Created for testing the consolidated sema-utils repository.
Uses the Hugging Face Hub for model downloading.
"""

import os
import time
from datetime import datetime
import pytz
from typing import Optional

from fastapi import FastAPI, HTTPException, Request
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel, Field
from huggingface_hub import hf_hub_download, snapshot_download
import ctranslate2
import sentencepiece as spm
import fasttext

# --- Data Models ---
class TranslationRequest(BaseModel):
    text: str = Field(..., example="Habari ya asubuhi", description="Text to translate")
    target_language: str = Field(..., example="eng_Latn", description="FLORES-200 target language code")
    source_language: Optional[str] = Field(None, example="swh_Latn", description="Optional FLORES-200 source language code")

class TranslationResponse(BaseModel):
    translated_text: str
    source_language: str
    target_language: str
    inference_time: float
    timestamp: str

# --- FastAPI App Setup ---
app = FastAPI(
    title="Sema Translation API",
    description="Translation API using consolidated sema-utils models from HuggingFace",
    version="2.0.0"
)

# CORS middleware
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)

# --- Global Variables ---
REPO_ID = "sematech/sema-utils"
MODELS_DIR = "hf_models"
beam_size = 1
device = "cpu"

# Model instances (loaded on startup)
lang_model = None
sp_model = None
translator = None

def get_nairobi_time():
    """Get the current time in the Nairobi timezone."""
    nairobi_timezone = pytz.timezone('Africa/Nairobi')
    current_time_nairobi = datetime.now(nairobi_timezone)

    curr_day = current_time_nairobi.strftime('%A')
    curr_date = current_time_nairobi.strftime('%Y-%m-%d')
    curr_time = current_time_nairobi.strftime('%H:%M:%S')

    full_date = f"{curr_day} | {curr_date} | {curr_time}"
    return full_date, curr_time

def download_models():
    """Download models from the Hugging Face Hub."""
    print("🚀 Downloading models from sematech/sema-utils...")

    # Ensure the models directory exists
    os.makedirs(MODELS_DIR, exist_ok=True)

    try:
        # Download individual files from the repo root
        print("📥 Downloading SentencePiece model...")
        spm_path = hf_hub_download(
            repo_id=REPO_ID,
            filename="spm.model",
            local_dir=MODELS_DIR
        )

        print("📥 Downloading language detection model...")
        ft_path = hf_hub_download(
            repo_id=REPO_ID,
            filename="lid218e.bin",
            local_dir=MODELS_DIR
        )

        # Download the translation model (3.3B) from its subfolder
        print("📥 Downloading translation model (3.3B)...")
        snapshot_download(
            repo_id=REPO_ID,
            allow_patterns="translation_models/sematrans-3.3B/*",
            local_dir=MODELS_DIR
        )

        # Construct the path to the downloaded model directory
        ct_model_full_path = os.path.join(MODELS_DIR, "translation_models", "sematrans-3.3B")

        return spm_path, ft_path, ct_model_full_path

    except Exception as e:
        print(f"❌ Error downloading models: {e}")
        raise

def load_models():
    """Load all models into memory."""
    global lang_model, sp_model, translator

    print("🚀 Loading models into memory...")

    # Download models first
    spm_path, ft_path, ct_model_path = download_models()

    # Suppress fasttext warnings
    fasttext.FastText.eprint = lambda x: None

    try:
        # Load the language detection model
        print("1️⃣ Loading language detection model...")
        lang_model = fasttext.load_model(ft_path)

        # Load the SentencePiece model
        print("2️⃣ Loading SentencePiece model...")
        sp_model = spm.SentencePieceProcessor()
        sp_model.load(spm_path)

        # Load the translation model
        print("3️⃣ Loading translation model...")
        translator = ctranslate2.Translator(ct_model_path, device)

        print("✅ All models loaded successfully!")

    except Exception as e:
        print(f"❌ Error loading models: {e}")
        raise

def translate_with_detection(text: str, target_lang: str):
    """Translate text with automatic source language detection."""
    start_time = time.time()

    # Prepare input
    source_sents = [text.strip()]
    target_prefix = [[target_lang]]

    # Detect the source language
    predictions = lang_model.predict(text.replace('\n', ' '), k=1)
    source_lang = predictions[0][0].replace('__label__', '')

    # Tokenize the source text
    source_sents_subworded = sp_model.encode(source_sents, out_type=str)
    source_sents_subworded = [[source_lang] + sent + ["</s>"] for sent in source_sents_subworded]

    # Translate
    translations = translator.translate_batch(
        source_sents_subworded,
        batch_type="tokens",
        max_batch_size=2048,
        beam_size=beam_size,
        target_prefix=target_prefix,
    )

    # Decode the translation
    translations = [translation[0]['tokens'] for translation in translations]
    translations_desubword = sp_model.decode(translations)
    translated_text = translations_desubword[0][len(target_lang):]

    inference_time = time.time() - start_time

    return source_lang, translated_text, inference_time

def translate_with_source(text: str, source_lang: str, target_lang: str):
    """Translate text with a caller-provided source language."""
    start_time = time.time()

    # Prepare input
    source_sents = [text.strip()]
    target_prefix = [[target_lang]]

    # Tokenize the source text
    source_sents_subworded = sp_model.encode(source_sents, out_type=str)
    source_sents_subworded = [[source_lang] + sent + ["</s>"] for sent in source_sents_subworded]

    # Translate
    translations = translator.translate_batch(
        source_sents_subworded,
        batch_type="tokens",
        max_batch_size=2048,
        beam_size=beam_size,
        target_prefix=target_prefix
    )

    # Decode the translation
    translations = [translation[0]['tokens'] for translation in translations]
    translations_desubword = sp_model.decode(translations)
    translated_text = translations_desubword[0][len(target_lang):]

    inference_time = time.time() - start_time

    return translated_text, inference_time

# --- API Endpoints ---

@app.get("/")
async def root():
    """Health check endpoint."""
    return {
        "status": "ok",
        "message": "Sema Translation API is running",
        "version": "2.0.0",
        "models_loaded": all([lang_model, sp_model, translator])
    }

@app.post("/translate", response_model=TranslationResponse)
async def translate_endpoint(request: TranslationRequest):
    """
    Main translation endpoint.
    Automatically detects the source language if it is not provided.
    """
    if not request.text.strip():
        raise HTTPException(status_code=400, detail="Input text cannot be empty")

    full_date, current_time = get_nairobi_time()
    print(f"\n🌍 Request: {full_date}")
    print(f"Target: {request.target_language}, Text: {request.text[:50]}...")

    try:
        if request.source_language:
            # Use the provided source language
            translated_text, inference_time = translate_with_source(
                request.text,
                request.source_language,
                request.target_language
            )
            source_lang = request.source_language
        else:
            # Auto-detect the source language
            source_lang, translated_text, inference_time = translate_with_detection(
                request.text,
                request.target_language
            )

        _, response_time = get_nairobi_time()
        print(f"✅ Response: {response_time}")
        print(f"Source: {source_lang}, Translation: {translated_text[:50]}...\n")

        return TranslationResponse(
            translated_text=translated_text,
            source_language=source_lang,
            target_language=request.target_language,
            inference_time=inference_time,
            timestamp=full_date
        )

    except Exception as e:
        print(f"❌ Translation error: {e}")
        raise HTTPException(status_code=500, detail=f"Translation failed: {str(e)}")

# --- Startup Event ---
@app.on_event("startup")
async def startup_event():
    """Load models when the application starts."""
    print("\n🎵 Starting Sema Translation API...")
    print("🎼 Loading the Orchestra... 🦒")
    load_models()
    print("🎉 API started successfully!\n")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
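Both translate helpers above decode the hypothesis and then slice off the leading target-language token with `[len(target_lang):]`. A minimal sketch of that post-processing step, with an illustrative decoded string rather than real model output:

```python
def strip_lang_token(decoded: str, target_lang: str) -> str:
    """Drop the leading FLORES-200 token the model emits, plus the space after it."""
    if decoded.startswith(target_lang):
        decoded = decoded[len(target_lang):]
    return decoded.strip()

# Illustrative decoded string: the model emits the target-language token first.
print(strip_lang_token("eng_Latn Good morning, world.", "eng_Latn"))
```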
test_api_client.py
ADDED
@@ -0,0 +1,201 @@
"""
Test client for the Sema Translation API.
"""

import requests
import json
import time

def test_api_endpoint(base_url="http://localhost:8000"):
    """Test the translation API endpoints."""

    print("🧪 Testing Sema Translation API\n")

    # Test 1: Health check
    print("1️⃣ Testing health check endpoint...")
    try:
        response = requests.get(f"{base_url}/")
        if response.status_code == 200:
            data = response.json()
            print(f"✅ Health check passed: {data}")
        else:
            print(f"❌ Health check failed: {response.status_code}")
            return False
    except Exception as e:
        print(f"❌ Health check error: {e}")
        return False

    # Test 2: Translation with auto-detection
    print("\n2️⃣ Testing translation with auto-detection...")
    test_data = {
        "text": "Habari ya asubuhi, ulimwengu",
        "target_language": "eng_Latn"
    }

    try:
        response = requests.post(
            f"{base_url}/translate",
            headers={"Content-Type": "application/json"},
            data=json.dumps(test_data)
        )

        if response.status_code == 200:
            data = response.json()
            print("✅ Auto-detection translation successful:")
            print(f"   📝 Original: {test_data['text']}")
            print(f"   🔍 Detected source: {data['source_language']}")
            print(f"   🎯 Target: {data['target_language']}")
            print(f"   ✨ Translation: {data['translated_text']}")
            print(f"   ⏱️ Inference time: {data['inference_time']:.3f}s")
        else:
            print(f"❌ Auto-detection translation failed: {response.status_code}")
            print(f"   Error: {response.text}")
            return False
    except Exception as e:
        print(f"❌ Auto-detection translation error: {e}")
        return False

    # Test 3: Translation with a specified source language
    print("\n3️⃣ Testing translation with specified source language...")
    test_data_with_source = {
        "text": "Wĩ mwega?",
        "source_language": "kik_Latn",
        "target_language": "eng_Latn"
    }

    try:
        response = requests.post(
            f"{base_url}/translate",
            headers={"Content-Type": "application/json"},
            data=json.dumps(test_data_with_source)
        )

        if response.status_code == 200:
            data = response.json()
            print("✅ Specified source translation successful:")
            print(f"   📝 Original: {test_data_with_source['text']}")
            print(f"   🔍 Source: {data['source_language']}")
            print(f"   🎯 Target: {data['target_language']}")
            print(f"   ✨ Translation: {data['translated_text']}")
            print(f"   ⏱️ Inference time: {data['inference_time']:.3f}s")
        else:
            print(f"❌ Specified source translation failed: {response.status_code}")
            print(f"   Error: {response.text}")
            return False
    except Exception as e:
        print(f"❌ Specified source translation error: {e}")
        return False

    # Test 4: Error handling - empty text
    print("\n4️⃣ Testing error handling (empty text)...")
    test_data_empty = {
        "text": "",
        "target_language": "eng_Latn"
    }

    try:
        response = requests.post(
            f"{base_url}/translate",
            headers={"Content-Type": "application/json"},
            data=json.dumps(test_data_empty)
        )

        if response.status_code == 400:
            print("✅ Empty text error handling works correctly")
        else:
            print(f"❌ Empty text error handling failed: {response.status_code}")
            return False
    except Exception as e:
        print(f"❌ Empty text error handling error: {e}")
        return False

    # Test 5: Multiple translations for performance
    print("\n5️⃣ Testing multiple translations for performance...")
    test_texts = [
        {"text": "Jambo", "target_language": "eng_Latn"},
        {"text": "Asante sana", "target_language": "eng_Latn"},
        {"text": "Karibu", "target_language": "eng_Latn"},
        {"text": "Pole sana", "target_language": "eng_Latn"},
        {"text": "Tutaonana", "target_language": "eng_Latn"}
    ]

    total_time = 0
    successful_translations = 0

    for i, test_data in enumerate(test_texts, 1):
        try:
            start_time = time.time()
            response = requests.post(
                f"{base_url}/translate",
                headers={"Content-Type": "application/json"},
                data=json.dumps(test_data)
            )
            end_time = time.time()

            if response.status_code == 200:
                data = response.json()
                request_time = end_time - start_time
                total_time += request_time
                successful_translations += 1

                print(f"   {i}. '{test_data['text']}' → '{data['translated_text']}' "
                      f"({request_time:.3f}s)")
            else:
                print(f"   {i}. Failed: {response.status_code}")
        except Exception as e:
            print(f"   {i}. Error: {e}")

    if successful_translations > 0:
        avg_time = total_time / successful_translations
        print("\n📊 Performance Summary:")
        print(f"   ✅ Successful translations: {successful_translations}/{len(test_texts)}")
        print(f"   ⏱️ Average request time: {avg_time:.3f}s")
        print(f"   ⏱️ Total time: {total_time:.3f}s")

    return True

def test_api_documentation(base_url="http://localhost:8000"):
    """Test the API documentation endpoints."""

    print("\n📖 Testing API documentation...")

    # Test the OpenAPI docs
    try:
        response = requests.get(f"{base_url}/docs")
        if response.status_code == 200:
            print("✅ OpenAPI docs accessible at /docs")
        else:
            print(f"❌ OpenAPI docs failed: {response.status_code}")
    except Exception as e:
        print(f"❌ OpenAPI docs error: {e}")

    # Test the OpenAPI JSON
    try:
        response = requests.get(f"{base_url}/openapi.json")
        if response.status_code == 200:
            print("✅ OpenAPI JSON accessible at /openapi.json")
        else:
            print(f"❌ OpenAPI JSON failed: {response.status_code}")
    except Exception as e:
        print(f"❌ OpenAPI JSON error: {e}")

if __name__ == "__main__":
    import sys

    # Allow a custom base URL
    base_url = "http://localhost:8000"
    if len(sys.argv) > 1:
        base_url = sys.argv[1]

    print(f"🎯 Testing API at: {base_url}")
    print("⚠️ Make sure the API server is running before running this test!\n")

    # Run tests
    success = test_api_endpoint(base_url)
    test_api_documentation(base_url)

    if success:
        print("\n🎉 All API tests passed!")
    else:
        print("\n❌ Some API tests failed!")
        sys.exit(1)
test_model_download.py
ADDED
@@ -0,0 +1,198 @@
"""
Test script to verify model downloading and loading from the sema-utils repository.
"""

import os
import sys
from huggingface_hub import hf_hub_download, snapshot_download
import ctranslate2
import sentencepiece as spm
import fasttext

def test_model_download():
    """Test downloading models from sematech/sema-utils."""

    REPO_ID = "sematech/sema-utils"
    MODELS_DIR = "test_models"

    print("🧪 Testing model download from sematech/sema-utils...")

    # Create the test directory
    os.makedirs(MODELS_DIR, exist_ok=True)

    try:
        # Test 1: Download the SentencePiece model
        print("\n1️⃣ Testing SentencePiece model download...")
        spm_path = hf_hub_download(
            repo_id=REPO_ID,
            filename="spm.model",
            local_dir=MODELS_DIR
        )
        print(f"✅ SentencePiece model downloaded to: {spm_path}")

        # Test 2: Download the language detection model
        print("\n2️⃣ Testing language detection model download...")
        ft_path = hf_hub_download(
            repo_id=REPO_ID,
            filename="lid218e.bin",
            local_dir=MODELS_DIR
        )
        print(f"✅ Language detection model downloaded to: {ft_path}")

        # Test 3: Download the translation model
        print("\n3️⃣ Testing translation model download...")
        ct_model_path = snapshot_download(
            repo_id=REPO_ID,
            allow_patterns="translation_models/sematrans-3.3B/*",
            local_dir=MODELS_DIR
        )
        print(f"✅ Translation model downloaded to: {ct_model_path}")

        # Verify the file structure
        ct_model_full_path = os.path.join(MODELS_DIR, "translation_models", "sematrans-3.3B")
        print(f"\n📁 Translation model directory: {ct_model_full_path}")

        if os.path.exists(ct_model_full_path):
            files = os.listdir(ct_model_full_path)
            print(f"📄 Files in translation model directory: {files}")
        else:
            print("❌ Translation model directory not found!")
            return False

        return spm_path, ft_path, ct_model_full_path

    except Exception as e:
        print(f"❌ Error during download: {e}")
        return False

def test_model_loading(spm_path, ft_path, ct_model_path):
    """Test loading the downloaded models."""

    print("\n🚀 Testing model loading...")

    try:
        # Suppress fasttext warnings
        fasttext.FastText.eprint = lambda x: None

        # Test 1: Load the language detection model
        print("\n1️⃣ Testing language detection model loading...")
        lang_model = fasttext.load_model(ft_path)
        print("✅ Language detection model loaded successfully")

        # Test language detection
        test_text = "Habari ya asubuhi"
        predictions = lang_model.predict(test_text, k=1)
        detected_lang = predictions[0][0].replace('__label__', '')
        print(f"🌍 Detected language for '{test_text}': {detected_lang}")

        # Test 2: Load the SentencePiece model
        print("\n2️⃣ Testing SentencePiece model loading...")
        sp_model = spm.SentencePieceProcessor()
        sp_model.load(spm_path)
        print("✅ SentencePiece model loaded successfully")

        # Test tokenization
        tokens = sp_model.encode(test_text, out_type=str)
        print(f"🔤 Tokenized '{test_text}': {tokens}")

        # Test 3: Load the translation model
        print("\n3️⃣ Testing translation model loading...")
        translator = ctranslate2.Translator(ct_model_path, device="cpu")
        print("✅ Translation model loaded successfully")

        return lang_model, sp_model, translator

    except Exception as e:
        print(f"❌ Error during model loading: {e}")
        return False

def test_translation(lang_model, sp_model, translator):
    """Test the complete translation pipeline."""

    print("\n🌍 Testing complete translation pipeline...")

    test_text = "Habari ya asubuhi, ulimwengu"
    target_lang = "eng_Latn"

    try:
        # Step 1: Detect the source language
        predictions = lang_model.predict(test_text.replace('\n', ' '), k=1)
        source_lang = predictions[0][0].replace('__label__', '')
        print(f"🌍 Detected source language: {source_lang}")

        # Step 2: Tokenize
        source_sents = [test_text.strip()]
        source_sents_subworded = sp_model.encode(source_sents, out_type=str)
        source_sents_subworded = [[source_lang] + sent + ["</s>"] for sent in source_sents_subworded]
        print(f"🔤 Tokenized input: {source_sents_subworded[0][:10]}...")

        # Step 3: Translate
        target_prefix = [[target_lang]]
        translations = translator.translate_batch(
            source_sents_subworded,
            batch_type="tokens",
            max_batch_size=2048,
            beam_size=1,
            target_prefix=target_prefix,
        )

        # Step 4: Decode
        translations = [translation[0]['tokens'] for translation in translations]
        translations_desubword = sp_model.decode(translations)
        translated_text = translations_desubword[0][len(target_lang):]

        print("\n🎉 Translation successful!")
        print(f"📝 Original: {test_text}")
        print(f"🌍 Source language: {source_lang}")
        print(f"🎯 Target language: {target_lang}")
        print(f"✨ Translation: {translated_text}")

        return True

    except Exception as e:
        print(f"❌ Error during translation: {e}")
        return False

def cleanup_test_files():
    """Clean up test files."""
    import shutil

    test_dir = "test_models"
    if os.path.exists(test_dir):
        print(f"\n🧹 Cleaning up test directory: {test_dir}")
        shutil.rmtree(test_dir)
        print("✅ Cleanup complete")

if __name__ == "__main__":
    print("🚀 Starting Sema Utils Model Test\n")

    # Test model download
    download_result = test_model_download()
    if not download_result:
        print("❌ Model download test failed!")
        sys.exit(1)

    spm_path, ft_path, ct_model_full_path = download_result

    # Test model loading
    loading_result = test_model_loading(spm_path, ft_path, ct_model_full_path)
    if not loading_result:
        print("❌ Model loading test failed!")
        sys.exit(1)

    lang_model, sp_model, translator = loading_result

    # Test translation
    translation_result = test_translation(lang_model, sp_model, translator)
    if not translation_result:
        print("❌ Translation test failed!")
        sys.exit(1)

    print("\n🎉 All tests passed! Your sema-utils repository is working correctly.")

    # Ask the user whether to clean up
    response = input("\n🧹 Do you want to clean up test files? (y/n): ")
    if response.lower() in ['y', 'yes']:
        cleanup_test_files()

    print("\n✅ Test complete!")