---
title: Ollama Generate API
emoji: 🦙
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---
# Ollama Generate API
A simple REST API for text generation using Ollama models on Hugging Face Spaces.
## Features
- 🦙 Generate text using Ollama models
- 🎛️ Configurable parameters (temperature, top_p, max_tokens)
- 📊 Health monitoring
- 🚀 Simple and lightweight API
## API Endpoints
### Health Check
- `GET /health` - Check if Ollama service is running
- `GET /` - API information and usage examples
### Text Generation
- `POST /generate` - Generate text completion
## Usage Examples
### Check Health
```bash
curl "https://your-space.hf.space/health"
```
### Generate Text
```bash
curl -X POST "https://your-space.hf.space/generate" \
-H "Content-Type: application/json" \
-d '{
"model": "tinyllama",
"prompt": "The future of AI is",
"temperature": 0.7,
"max_tokens": 100
}'
```
### API Information
```bash
curl "https://your-space.hf.space/"
```
## Request Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | string | required | Model name (e.g., "tinyllama") |
| `prompt` | string | required | Input text prompt |
| `temperature` | float | 0.7 | Sampling temperature (0.0-2.0) |
| `top_p` | float | 0.9 | Top-p sampling (0.0-1.0) |
| `max_tokens` | integer | 512 | Maximum tokens to generate (1-4096) |
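The same request can be sent from Python using only the standard library. The sketch below mirrors the parameter table, applying the documented defaults and ranges client-side; the helper names (`build_payload`, `generate`) are illustrative, not part of the API:

```python
import json
import urllib.request

def build_payload(model, prompt, temperature=0.7, top_p=0.9, max_tokens=512):
    """Validate parameters against the documented ranges and build the request body."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be in [0.0, 2.0]")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be in [0.0, 1.0]")
    if not 1 <= max_tokens <= 4096:
        raise ValueError("max_tokens must be in [1, 4096]")
    return {
        "model": model,
        "prompt": prompt,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

def generate(base_url, **kwargs):
    """POST the payload to /generate and return the parsed JSON response."""
    body = json.dumps(build_payload(**kwargs)).encode()
    req = urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (replace the URL with your Space's):
# result = generate("https://your-space.hf.space",
#                   model="tinyllama", prompt="The future of AI is")
```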
## Supported Models
This API works with any Ollama model. Recommended lightweight models for Hugging Face Spaces:
- `tinyllama` - Very small and fast (~600MB)
- `phi` - Small but capable (~1.6GB)
- `llama2:7b` - Larger but more capable (~3.8GB)
## Interactive Documentation
Once deployed, visit `/docs` for interactive API documentation powered by FastAPI.
## Setup Notes
- The startup script automatically pulls the `tinyllama` model
- The first generation request may be slow while the model loads into memory
- Lightweight models are recommended for better performance on limited resources
## Example Response
```json
{
"model": "tinyllama",
"response": "The future of AI is bright and full of possibilities...",
"done": true,
"total_duration": 1234567890,
"load_duration": 123456789,
"prompt_eval_count": 10,
"eval_count": 25
}
```
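Assuming the API passes Ollama's timing fields through unchanged (Ollama reports `*_duration` values in nanoseconds), basic throughput can be derived from the response. A small sketch using the example values above:

```python
import json

# Example response from the section above.
raw = """{
  "model": "tinyllama",
  "response": "The future of AI is bright and full of possibilities...",
  "done": true,
  "total_duration": 1234567890,
  "load_duration": 123456789,
  "prompt_eval_count": 10,
  "eval_count": 25
}"""

resp = json.loads(raw)

# Durations are nanoseconds; convert to seconds.
total_s = resp["total_duration"] / 1e9
load_s = resp["load_duration"] / 1e9

# Rough generation throughput over the whole request.
tokens_per_s = resp["eval_count"] / total_s

print(f"total: {total_s:.2f}s, load: {load_s:.2f}s, ~{tokens_per_s:.1f} tok/s")
```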
## Resource Requirements
- **TinyLlama**: ~1GB RAM, very fast
- **Phi models**: ~2GB RAM, good balance
- **Llama2 7B**: ~8GB RAM, high quality
For the Hugging Face Spaces free tier, stick with TinyLlama or Phi models for best performance.