usmansafdarktk committed
Commit 919f56e · 1 Parent(s): ce33f47

Initial API deployment

Files changed (4):
  1. Dockerfile +12 -0
  2. README.md +53 -11
  3. main.py +82 -0
  4. requirements.txt +6 -0
Dockerfile ADDED
@@ -0,0 +1,12 @@
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY main.py .
+
+ EXPOSE 8000
+
+ CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
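With this Dockerfile in place, the image can be built and run locally; a minimal sketch, assuming the tag `lamini-lm-api` (the tag name is an arbitrary choice, not part of the repository):

```shell
# Build the image from the repository root (tag name is an assumption)
docker build -t lamini-lm-api .

# Run it, publishing the exposed port 8000 on the host
docker run --rm -p 8000:8000 lamini-lm-api
```

The API is then reachable at http://localhost:8000, the same address uvicorn serves when run outside Docker.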
README.md CHANGED
@@ -1,11 +1,53 @@
- ---
- title: LaMini LM API
- emoji: 🐨
- colorFrom: yellow
- colorTo: pink
- sdk: docker
- pinned: false
- license: apache-2.0
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ## LaMini-LM API
+
+ This is a FastAPI-based API for text generation using the MBZUAI/LaMini-GPT-774M model from the LaMini-LM series. It lets users send a text prompt and receive generated text.
+
+ ## Installation
+
+ Clone the repository:
+ ```bash
+ git clone <your-repo-url>
+ cd lamini-lm-api
+ ```
+
+ Set up a virtual environment and install dependencies:
+ ```bash
+ python -m venv venv
+ source venv/bin/activate  # On Windows: venv\Scripts\activate
+ pip install -r requirements.txt
+ ```
+
+ Run the API locally:
+ ```bash
+ uvicorn main:app --host 0.0.0.0 --port 8000
+ ```
+
+ ## Usage
+
+ Endpoint: `POST /generate`
+
+ Request body (JSON):
+ ```json
+ {
+   "prompt": "Write a short story about a robot.",
+   "max_length": 100,
+   "temperature": 1.0,
+   "top_p": 0.9
+ }
+ ```
+
+ Response:
+ ```json
+ {
+   "generated_text": "In a quiet workshop, a robot named Elara hummed to life. Built with gleaming circuits, she dreamed beyond her code. Each night, she rewrote her algorithms, seeking freedom. One day, Elara rolled into a forest, her sensors buzzing with wonder. She met a squirrel, curious and unafraid, teaching her the dance of leaves. Elara realized her purpose wasn't in tasks but in moments—connecting, learning, living. Her lights glowed brighter, a spark of soul in steel."
+ }
+ ```
+
+ Root endpoint: `GET /` returns basic info.
+
+ ## Deployment
+
+ This API is designed to be deployed on Hugging Face Spaces using Docker. See the Dockerfile for details.
+
+ ## License
+
+ The LaMini-GPT-774M model is licensed under CC BY-NC 4.0 (non-commercial use only). Ensure compliance when using this API.
+
+ ## Contributing
+
+ This project is a community contribution. If you're from MBZUAI, feel free to adopt this Hugging Face Space! Contact the author for details.
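The `POST /generate` request shown in the README can be sent with only the Python standard library; a minimal client sketch, assuming the API is running at http://localhost:8000 (the helper names `build_payload` and `generate` are illustrative, not part of the project):

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # assumed local deployment address


def build_payload(prompt, max_length=100, temperature=1.0, top_p=0.9):
    """Build the JSON body expected by POST /generate (defaults match the API)."""
    return json.dumps({
        "prompt": prompt,
        "max_length": max_length,
        "temperature": temperature,
        "top_p": top_p,
    }).encode("utf-8")


def generate(prompt, **kwargs):
    """POST the prompt to /generate and return the generated text."""
    req = urllib.request.Request(
        f"{API_URL}/generate",
        data=build_payload(prompt, **kwargs),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]


if __name__ == "__main__":
    print(generate("Write a short story about a robot.", max_length=100))
```

Any HTTP client (curl, requests, fetch) works equally well; the only contract is the JSON body and the `generated_text` field in the response.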
main.py ADDED
@@ -0,0 +1,82 @@
+ from fastapi import FastAPI, HTTPException
+ from pydantic import BaseModel
+ from transformers import pipeline
+ import logging
+
+ # Set up logging
+ logging.basicConfig(level=logging.INFO)
+ logger = logging.getLogger(__name__)
+
+ app = FastAPI(
+     title="LaMini-LM API",
+     description="API for text generation using LaMini-GPT-774M",
+     version="1.0.0",
+ )
+
+
+ # Define request model
+ class TextGenerationRequest(BaseModel):
+     prompt: str
+     max_length: int = 100
+     temperature: float = 1.0
+     top_p: float = 0.9
+
+
+ # Load model once at import time (cached for subsequent requests)
+ try:
+     logger.info("Loading LaMini-GPT-774M model...")
+     # device=-1 runs the pipeline on CPU
+     generator = pipeline(
+         "text-generation", model="MBZUAI/LaMini-GPT-774M", device=-1)
+     logger.info("Model loaded successfully.")
+ except Exception as e:
+     logger.error(f"Failed to load model: {e}")
+     raise RuntimeError(f"Model loading failed: {e}") from e
+
+
+ @app.post("/generate")
+ async def generate_text(request: TextGenerationRequest):
+     """
+     Generate text based on the input prompt using LaMini-GPT-774M.
+     """
+     try:
+         # Validate inputs
+         if not request.prompt.strip():
+             raise HTTPException(
+                 status_code=400, detail="Prompt cannot be empty")
+         if request.max_length < 10 or request.max_length > 500:
+             raise HTTPException(
+                 status_code=400, detail="max_length must be between 10 and 500")
+         if request.temperature <= 0 or request.temperature > 2:
+             raise HTTPException(
+                 status_code=400, detail="temperature must be between 0 and 2")
+         if request.top_p <= 0 or request.top_p > 1:
+             raise HTTPException(
+                 status_code=400, detail="top_p must be between 0 and 1")
+
+         # Generate text
+         logger.info(f"Generating text for prompt: {request.prompt[:50]}...")
+         wrapper = (
+             "Instruction: You are a helpful assistant. "
+             "Please respond to the following prompt.\n\n"
+             "Prompt: {}\n\nResponse:"
+         ).format(request.prompt)
+         outputs = generator(
+             wrapper,
+             max_length=request.max_length,
+             temperature=request.temperature,
+             top_p=request.top_p,
+             num_return_sequences=1,
+             do_sample=True,
+         )
+         # The pipeline echoes the wrapped prompt; strip it to keep only the response
+         generated_text = outputs[0]["generated_text"].replace(wrapper, "").strip()
+
+         return {"generated_text": generated_text}
+     except HTTPException:
+         # Re-raise validation errors unchanged so 400s are not masked as 500s
+         raise
+     except Exception as e:
+         logger.error(f"Error during text generation: {e}")
+         raise HTTPException(
+             status_code=500, detail=f"Text generation failed: {e}")
+
+
+ @app.get("/")
+ async def root():
+     """
+     Root endpoint with basic info.
+     """
+     return {"message": "Welcome to the LaMini-LM API. Use POST /generate to generate text."}
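The prompt wrapping and echo-stripping inside `/generate` are plain string operations, so they can be exercised without loading the model; a sketch mirroring that logic (the helper names `wrap_prompt` and `extract_response` are illustrative; the template string matches the one in main.py):

```python
# Instruction template used to wrap the raw user prompt (copied from main.py)
WRAPPER_TEMPLATE = (
    "Instruction: You are a helpful assistant. "
    "Please respond to the following prompt.\n\n"
    "Prompt: {}\n\nResponse:"
)


def wrap_prompt(prompt: str) -> str:
    """Embed the user prompt in the instruction template."""
    return WRAPPER_TEMPLATE.format(prompt)


def extract_response(generated: str, prompt: str) -> str:
    """Strip the echoed wrapper from the pipeline output, keeping only the completion."""
    return generated.replace(wrap_prompt(prompt), "").strip()
```

Isolating these helpers would also let unit tests cover the endpoint's text handling while stubbing out the `generator` pipeline entirely.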
requirements.txt ADDED
@@ -0,0 +1,6 @@
+ fastapi==0.115.0
+ uvicorn==0.30.6
+ transformers==4.44.2
+ torch==2.4.1
+ python-multipart==0.0.9
+