---
title: Ollama API
emoji: πŸ¦™
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
app_port: 7860
---

# Ollama Model API

A REST API for running Ollama models on Hugging Face Spaces.

## Features

- πŸ¦™ Run Ollama models via REST API
- πŸ”„ Model management (pull, list, delete)
- πŸ’¬ Chat completions
- πŸŽ›οΈ Configurable parameters (temperature, top_p, etc.)
- πŸ“Š Health monitoring

## API Endpoints

### Health Check

- `GET /health` - Check if the service is running
- `GET /models` - List available models
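These checks can also be scripted. A minimal sketch using only the Python standard library (the base URL is a placeholder for your own Space's URL):

```python
import json
import urllib.request

def api_url(base, path):
    """Join the Space base URL with an endpoint path."""
    return base.rstrip("/") + path

def get_json(base, path, timeout=30):
    """GET an endpoint such as /health or /models and decode the JSON body."""
    with urllib.request.urlopen(api_url(base, path), timeout=timeout) as resp:
        return json.load(resp)

# Example (requires a running Space):
# get_json("https://your-space.hf.space", "/health")
```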

### Model Management

- `POST /models/pull` - Pull a model from the Ollama registry
- `DELETE /models/{model_name}` - Delete a model

### Chat & Completions

- `POST /chat` - Chat with a model
- `POST /generate` - Generate a text completion
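These POST endpoints can be called from Python without extra dependencies. A sketch mirroring the curl examples below (the base URL is a placeholder, and passing sampling options such as `temperature` as top-level JSON keys is an assumption based on the configurable-parameters feature):

```python
import json
import urllib.request

BASE_URL = "https://your-space.hf.space"  # placeholder: your Space's URL

def post_json(path, payload, timeout=300):
    """POST a JSON body to the API and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)

def chat_payload(model, messages, **options):
    """Build the request body for POST /chat; extra keyword arguments
    (temperature, top_p, ...) pass through as top-level keys."""
    return {"model": model, "messages": messages, **options}

# Example (requires a running Space):
# post_json("/chat", chat_payload("llama2:7b",
#           [{"role": "user", "content": "Hello"}], temperature=0.7))
```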

## Usage Examples

### Pull a Model

```bash
curl -X POST "https://your-space.hf.space/models/pull" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama2:7b"}'
```

### Chat with a Model

```bash
curl -X POST "https://your-space.hf.space/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```

### Generate Text

```bash
curl -X POST "https://your-space.hf.space/generate" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2:7b",
    "prompt": "The future of AI is",
    "max_tokens": 100
  }'
```

## Supported Models

This setup supports any model available in the Ollama registry:

- `llama2:7b`, `llama2:13b`
- `mistral:7b`
- `codellama:7b`
- `vicuna:7b`
- And many more...

## Interactive Documentation

Once deployed, visit `/docs` for interactive API documentation.

## Notes

- Model pulling may take several minutes depending on model size
- Larger models require more memory and may not work on the free tier
- The first inference may be slower while the model loads into memory
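Because pulling can take several minutes, one way to wait for a model is to poll `GET /models` until it appears. A minimal sketch, assuming the listing is a JSON array of model names or of objects with a `name` field (the actual schema depends on the deployment):

```python
import time

def model_present(models, name):
    """Return True if `name` appears in a /models listing.

    Assumption: entries are either plain name strings or objects
    with a "name" field; adjust to your deployment's schema.
    """
    for entry in models:
        if entry == name or (isinstance(entry, dict) and entry.get("name") == name):
            return True
    return False

def wait_for_model(fetch_models, name, interval=15, timeout=1800):
    """Poll `fetch_models()` until `name` shows up or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if model_present(fetch_models(), name):
            return True
        time.sleep(interval)
    return False
```

Here `fetch_models` would be any callable that returns the decoded `/models` response, so the polling logic stays independent of the HTTP client you use.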

## Resource Requirements

- Small models (7B): 8GB+ RAM recommended
- Medium models (13B): 16GB+ RAM recommended
- Large models (70B+): 32GB+ RAM required

Consider using smaller models like `llama2:7b` or `mistral:7b` for better performance on limited resources.