---
title: Ollama Generate API
emoji: 🦙
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# Ollama Generate API

A simple REST API for text generation using Ollama models on Hugging Face Spaces.

## Features

- 🦙 Generate text using Ollama models
- 🎛️ Configurable parameters (temperature, top_p, max_tokens)
- 📊 Health monitoring
- 🚀 Simple and lightweight API

## API Endpoints

### Health Check

- `GET /health` - Check if the Ollama service is running
- `GET /` - API information and usage examples

### Text Generation

- `POST /generate` - Generate a text completion

## Usage Examples

### Check Health

```bash
curl "https://your-space.hf.space/health"
```

### Generate Text

```bash
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "prompt": "The future of AI is",
    "temperature": 0.7,
    "max_tokens": 100
  }'
```

### API Information

```bash
curl "https://your-space.hf.space/"
```

## Request Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | string | required | Model name (e.g., "tinyllama") |
| `prompt` | string | required | Input text prompt |
| `temperature` | float | 0.7 | Sampling temperature (0.0-2.0) |
| `top_p` | float | 0.9 | Top-p sampling (0.0-1.0) |
| `max_tokens` | integer | 512 | Maximum tokens to generate (1-4096) |

## Supported Models

This API works with any Ollama model. Recommended lightweight models for Hugging Face Spaces:

- `tinyllama` - Very small and fast (~600MB)
- `phi` - Small but capable (~1.6GB)
- `llama2:7b` - Larger but more capable (~3.8GB)

## Interactive Documentation

Once deployed, visit `/docs` for interactive API documentation powered by FastAPI.

## Setup Notes

- The startup script automatically pulls the `tinyllama` model
- The first generation may be slower while the model loads
- Lightweight models are recommended for better performance on limited resources

## Example Response

```json
{
  "model": "tinyllama",
  "response": "The future of AI is bright and full of possibilities...",
  "done": true,
  "total_duration": 1234567890,
  "load_duration": 123456789,
  "prompt_eval_count": 10,
  "eval_count": 25
}
```

## Resource Requirements

- **TinyLlama**: ~1GB RAM, very fast
- **Phi models**: ~2GB RAM, good balance
- **Llama2 7B**: ~8GB RAM, high quality

For the Hugging Face Spaces free tier, stick with TinyLlama or Phi models for best performance.
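
## Calling the API from Python

If you prefer Python over curl, a minimal client sketch using the `requests` library might look like the following. The base URL is a placeholder for your own Space, and the request/response shapes assume the endpoints documented above.

```python
import requests

BASE_URL = "https://your-space.hf.space"  # replace with your Space URL


def generate(prompt: str, model: str = "tinyllama", **options) -> dict:
    """POST to /generate and return the parsed JSON response."""
    payload = {"model": model, "prompt": prompt, **options}
    resp = requests.post(f"{BASE_URL}/generate", json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    # Optional: confirm the Ollama service is up before generating
    print(requests.get(f"{BASE_URL}/health", timeout=10).json())

    result = generate("The future of AI is", temperature=0.7, max_tokens=100)
    print(result["response"])
```

A generous timeout is used because the first call may include model load time, as noted in the setup notes above.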
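
## Interpreting the Timing Fields

The duration fields in the example response follow Ollama's convention of reporting times in nanoseconds, so a rough throughput figure can be derived from them. A small sketch, assuming the wrapper passes these fields through from Ollama unchanged:

```python
# Rough tokens/sec estimate from a /generate response.
# Assumes total_duration and load_duration are nanoseconds (Ollama's convention)
# and that eval_count is the number of generated tokens.
response = {
    "total_duration": 1234567890,  # total request time in nanoseconds
    "load_duration": 123456789,    # time spent loading the model
    "eval_count": 25,              # tokens generated
}

generation_seconds = (response["total_duration"] - response["load_duration"]) / 1e9
tokens_per_second = response["eval_count"] / generation_seconds
print(f"~{tokens_per_second:.1f} tokens/sec")
```

This is only an approximation, since the subtracted span still includes prompt evaluation time, but it is a quick way to compare models on your Space's hardware.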