---
title: Ollama API
emoji: 🦙
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# Ollama Model API

A REST API for running Ollama models on Hugging Face Spaces.

## Features

- 🦙 Run Ollama models via REST API
- 🚀 Model management (pull, list, delete)
- 💬 Chat completions
- 🎛️ Configurable parameters (temperature, top_p, etc.)
- 📊 Health monitoring

## API Endpoints

### Health Check

- `GET /health` - Check if the service is running
- `GET /models` - List available models
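
Both endpoints are plain GET requests, so they are easy to use in scripts and uptime checks. A minimal sketch (the exact JSON fields in the responses depend on the app's implementation):

```bash
# Check that the service is up
curl "https://your-space.hf.space/health"

# List the models currently available on the server
curl "https://your-space.hf.space/models"
```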

### Model Management

- `POST /models/pull` - Pull a model from the Ollama registry
- `DELETE /models/{model_name}` - Delete a model
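
A pull example appears under Usage Examples below; deletion follows the documented path parameter. A sketch (the tag suffix, e.g. `:7b`, is part of the model name):

```bash
# Remove a previously pulled model to free disk space
curl -X DELETE "https://your-space.hf.space/models/llama2:7b"
```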

### Chat & Completions

- `POST /chat` - Chat with a model
- `POST /generate` - Generate text completion

## Usage Examples

### Pull a Model

```bash
curl -X POST "https://your-space.hf.space/models/pull" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2:7b"}'
```

### Chat with Model

```bash
curl -X POST "https://your-space.hf.space/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```

### Generate Text

```bash
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "prompt": "The future of AI is",
    "max_tokens": 100
  }'
```

## Supported Models

This setup supports any model available in the Ollama registry, including:

- `llama2:7b`, `llama2:13b`
- `mistral:7b`
- `codellama:7b`
- `vicuna:7b`
- ...and many more

## Interactive Documentation

Once deployed, visit `/docs` for interactive API documentation.
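
If the app is built with FastAPI (a common choice for Docker Spaces, and suggested by the `/docs` path), the machine-readable OpenAPI schema is typically served as well; treat this as an assumption to verify on your deployment:

```bash
# Fetch the OpenAPI schema (FastAPI convention; this path is an assumption)
curl "https://your-space.hf.space/openapi.json"
```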

## Notes

- Model pulling may take several minutes, depending on model size
- Larger models require more memory and may not work on the free tier
- The first inference may be slower while the model loads into memory

## Resource Requirements

- **Small models (7B)**: 8GB+ RAM recommended
- **Medium models (13B)**: 16GB+ RAM recommended
- **Large models (70B+)**: 32GB+ RAM required

Consider using smaller models like `llama2:7b` or `mistral:7b` for better performance on limited resources.