---
title: Ollama API
emoji: πŸ¦™
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---
# Ollama Model API
A REST API for running Ollama models on Hugging Face Spaces.
## Features
- πŸ¦™ Run Ollama models via REST API
- πŸ”„ Model management (pull, list, delete)
- πŸ’¬ Chat completions
- πŸŽ›οΈ Configurable parameters (temperature, top_p, etc.)
- πŸ“Š Health monitoring
## API Endpoints
### Health Check
- `GET /health` - Check if the service is running
- `GET /models` - List available models
### Model Management
- `POST /models/pull` - Pull a model from Ollama registry
- `DELETE /models/{model_name}` - Delete a model
### Chat & Completions
- `POST /chat` - Chat with a model
- `POST /generate` - Generate text completion
## Usage Examples
### Pull a Model
```bash
curl -X POST "https://your-space.hf.space/models/pull" \
-H "Content-Type: application/json" \
-d '{"model": "llama2:7b"}'
```
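### Check Health and List Models
Before pulling or querying anything, you can confirm the service is up and see which models are already available; both endpoints are plain GET requests:
```bash
# Verify the service is running
curl "https://your-space.hf.space/health"

# List the models currently available on the Space
curl "https://your-space.hf.space/models"
```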
### Chat with Model
```bash
curl -X POST "https://your-space.hf.space/chat" \
-H "Content-Type: application/json" \
-d '{
"model": "llama2:7b",
"messages": [
{"role": "user", "content": "Hello, how are you?"}
]
}'
```
### Generate Text
```bash
curl -X POST "https://your-space.hf.space/generate" \
-H "Content-Type: application/json" \
-d '{
"model": "llama2:7b",
"prompt": "The future of AI is",
"max_tokens": 100
}'
```
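### Adjust Sampling Parameters
Temperature, top_p, and similar parameters are listed as configurable; the exact request field names depend on the app's implementation, but assuming they are passed alongside the prompt, a request could look like this:
```bash
# Note: the temperature/top_p field names below are assumptions based on the feature list
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral:7b",
    "prompt": "Write a haiku about llamas.",
    "max_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9
  }'
```
### Delete a Model
To free up disk space, remove a model you no longer need; the model name goes in the URL path, as shown in the endpoint list above:
```bash
curl -X DELETE "https://your-space.hf.space/models/llama2:7b"
```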
## Supported Models
This setup supports any model available in the Ollama registry:
- `llama2:7b`, `llama2:13b`
- `mistral:7b`
- `codellama:7b`
- `vicuna:7b`
- And many more...
## Interactive Documentation
Once deployed, visit `/docs` for interactive API documentation.
## Notes
- Model pulling may take several minutes depending on model size
- Larger models require more memory and may not work on the free tier
- First inference may be slower as the model loads into memory
## Resource Requirements
- **Small models (7B)**: 8GB+ RAM recommended
- **Medium models (13B)**: 16GB+ RAM recommended
- **Large models (70B+)**: 32GB+ RAM required

Consider using smaller models like `llama2:7b` or `mistral:7b` for better performance on limited resources.