---
title: Ollama Generate API
emoji: 🦙
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---
# Ollama Generate API
A simple REST API for text generation using Ollama models on Hugging Face Spaces.
## Features
- 🦙 Generate text using Ollama models
- 🎛️ Configurable parameters (temperature, top_p, max_tokens)
- 📊 Health monitoring
- 🚀 Simple and lightweight API
## API Endpoints
### Health Check

- `GET /health` - Check if Ollama service is running
- `GET /` - API information and usage examples

### Text Generation

- `POST /generate` - Generate text completion
## Usage Examples
### Check Health

```bash
curl "https://your-space.hf.space/health"
```
### Generate Text

```bash
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tinyllama",
    "prompt": "The future of AI is",
    "temperature": 0.7,
    "max_tokens": 100
  }'
```
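The same request can also be sent from Python. The snippet below is a minimal sketch using the `requests` library; `BASE_URL` is a placeholder for your actual Space URL.

```python
import requests

BASE_URL = "https://your-space.hf.space"  # placeholder: replace with your Space URL

payload = {
    "model": "tinyllama",
    "prompt": "The future of AI is",
    "temperature": 0.7,
    "max_tokens": 100,
}

# POST the generation request and print the completion text
resp = requests.post(f"{BASE_URL}/generate", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])
```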
### API Information

```bash
curl "https://your-space.hf.space/"
```
## Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | required | Model name (e.g., `"tinyllama"`) |
| `prompt` | string | required | Input text prompt |
| `temperature` | float | 0.7 | Sampling temperature (0.0-2.0) |
| `top_p` | float | 0.9 | Top-p sampling (0.0-1.0) |
| `max_tokens` | integer | 512 | Maximum tokens to generate (1-4096) |
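Since the API is served with FastAPI, the request body can be expressed as a Pydantic model. The sketch below is illustrative rather than the Space's actual source; the field names and bounds simply mirror the table above.

```python
from pydantic import BaseModel, Field

class GenerateRequest(BaseModel):
    """Request body for POST /generate, mirroring the parameter table above."""

    model: str = Field(..., description='Model name, e.g. "tinyllama"')
    prompt: str = Field(..., description="Input text prompt")
    temperature: float = Field(0.7, ge=0.0, le=2.0, description="Sampling temperature")
    top_p: float = Field(0.9, ge=0.0, le=1.0, description="Top-p sampling")
    max_tokens: int = Field(512, ge=1, le=4096, description="Maximum tokens to generate")
```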
## Supported Models

This API works with any Ollama model. Recommended lightweight models for Hugging Face Spaces:

- `tinyllama` - Very small and fast (~600MB)
- `phi` - Small but capable (~1.6GB)
- `llama2:7b` - Larger but more capable (~3.8GB)
## Interactive Documentation

Once deployed, visit `/docs` for interactive API documentation powered by FastAPI.
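For reference, the endpoints could be implemented as a thin FastAPI proxy in front of Ollama. The sketch below shows how such a wrapper typically looks and is not the Space's actual code; it reuses the hypothetical `GenerateRequest` model from the schema sketch above and assumes Ollama is listening on its default local port 11434.

```python
import httpx
from fastapi import FastAPI, HTTPException

app = FastAPI(title="Ollama Generate API")

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default local address

@app.get("/health")
async def health():
    # Report whether the local Ollama server responds at all
    try:
        async with httpx.AsyncClient(timeout=5) as client:
            await client.get(OLLAMA_URL)
        return {"status": "ok"}
    except httpx.HTTPError:
        return {"status": "unavailable"}

@app.post("/generate")
async def generate(request: GenerateRequest):  # GenerateRequest from the sketch above
    # Forward the request to Ollama's native /api/generate endpoint
    payload = {
        "model": request.model,
        "prompt": request.prompt,
        "stream": False,
        "options": {
            "temperature": request.temperature,
            "top_p": request.top_p,
            "num_predict": request.max_tokens,
        },
    }
    async with httpx.AsyncClient(timeout=300) as client:
        resp = await client.post(f"{OLLAMA_URL}/api/generate", json=payload)
    if resp.status_code != 200:
        raise HTTPException(status_code=502, detail="Ollama request failed")
    return resp.json()
```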
## Setup Notes

- The startup script automatically pulls the `tinyllama` model (a sketch of this step follows below)
- First generation may be slower as the model loads
- Lightweight models are recommended for better performance on limited resources
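The startup logic in the actual Space is likely a shell script; for illustration only, here is an equivalent Python sketch that starts Ollama, waits for it to come up, and pulls the default model. The `ollama serve` and `ollama pull` commands and the port are based on Ollama's standard CLI and defaults.

```python
import subprocess
import time

import httpx

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama's default port
MODEL = "tinyllama"

# Start the Ollama server in the background (assumes the `ollama` binary is on PATH)
server = subprocess.Popen(["ollama", "serve"])

# Wait until the server answers before pulling the model
for _ in range(60):
    try:
        httpx.get(OLLAMA_URL, timeout=2)
        break
    except httpx.HTTPError:
        time.sleep(1)

# Pull the default model so the first /generate call does not trigger a download
subprocess.run(["ollama", "pull", MODEL], check=True)
```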
## Example Response

```json
{
  "model": "tinyllama",
  "response": "The future of AI is bright and full of possibilities...",
  "done": true,
  "total_duration": 1234567890,
  "load_duration": 123456789,
  "prompt_eval_count": 10,
  "eval_count": 25
}
```
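Ollama reports the `*_duration` fields in nanoseconds and the `*_count` fields as token counts, so a rough throughput figure can be derived directly from the response. A small sketch (the helper name is illustrative):

```python
def tokens_per_second(response: dict) -> float:
    """Rough generation throughput from an Ollama-style response.

    total_duration is in nanoseconds and includes model load and prompt
    evaluation time, so this is a conservative estimate.
    """
    seconds = response["total_duration"] / 1e9
    return response["eval_count"] / seconds if seconds > 0 else 0.0

example = {"total_duration": 1234567890, "eval_count": 25}
print(f"{tokens_per_second(example):.1f} tokens/s")  # ~20.3 tokens/s
```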
## Resource Requirements
- TinyLlama: ~1GB RAM, very fast
- Phi models: ~2GB RAM, good balance
- Llama2 7B: ~8GB RAM, high quality
For Hugging Face Spaces free tier, stick with TinyLlama or Phi models for best performance.