Spaces: gg3554 / imageGenerator

Sleeping

App Files Files Community

imageGenerator / README.md

gg3554

Update README.md

7b6843e verified 17 days ago

preview code

raw

history blame contribute delete

3.64 kB

A newer version of the Gradio SDK is available: 6.1.0

Upgrade

metadata

title: Text to Image Generator
emoji: 🎨
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: mit

ImageGenerator

```
GEMINI 2.5 PRO PROMPT ENHANCEMENT
Original prompt:  a dog in front of a desk
Enhanced prompt:  A photorealistic, eye-level shot of a Golden Retriever sitting by a wooden desk. Soft, warm morning light streams from a window, creating gentle highlights and deep shadows. The background is softly blurred, focusing on the dog's detailed fur. A cozy, cinematic atmosphere with an earthy color palette.
Word count: 47
Processing sample 1/1: A photorealistic, eye-level shot of a Golden Retriever sitting by a wooden desk. Soft, warm morning light streams from a window, creating gentle highlights and deep shadows. The background is softly blurred, focusing on the dog's detailed fur. A cozy, cinematic atmosphere with an earthy color palette.
100%|██████████| 30/30 [00:05<00:00,  5.08it/s]
  Saved as: generated_image_0.png
============================================================
DETAILED SAMPLE SCORES
============================================================
Average CLIPScore: 0.2753
Average GenEval Score: 0.6883
Inception Score: 1.0000 ± 0.0000
Inception Entropy: 7.5517
```
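
The Inception Score of 1.0000 in the log above is what the standard definition yields for a single image. The sketch below assumes the app uses that standard definition, IS = exp(mean KL(p(y|x) || p(y))), which this README does not confirm; the function name and the toy probability vectors are illustrative only.

```python
import math
from typing import List


def inception_score(class_probs: List[List[float]]) -> float:
    """Standard Inception Score over per-image class probability vectors."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    # Marginal class distribution p(y): the mean of the per-image predictions.
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    kl_values = []
    for p in class_probs:
        # KL(p(y|x) || p(y)); zero-probability terms contribute nothing.
        kl = sum(
            p[k] * (math.log(p[k]) - math.log(marginal[k]))
            for k in range(num_classes)
            if p[k] > 0.0
        )
        kl_values.append(kl)
    return math.exp(sum(kl_values) / n)


# With one image the marginal equals that image's prediction, so KL is 0.
single = inception_score([[0.7, 0.2, 0.1]])
# Two confident, different predictions give the maximum score for two classes.
diverse = inception_score([[1.0, 0.0], [0.0, 1.0]])
print(f"single image: {single:.4f}, two distinct images: {diverse:.4f}")
```

Under this definition the score only becomes informative once several images are evaluated together.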

Text to Image Generation Web App

A Flask-based web application that generates images from text prompts using Stable Diffusion and evaluates them using CLIP scores.

Features

🎨 Generate images from text descriptions
📊 CLIP score evaluation for image-text alignment
💬 Chat-style interface
🌙 Modern dark theme UI

Requirements

Python 3.8+
CUDA-capable GPU (recommended) or CPU
~10GB disk space for models

Installation

Clone or download the project files

Create a virtual environment (recommended)

```
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows
```

Install dependencies
```
pip install -r requirements.txt
```

Running the Application

Start the Flask server
```
python app.py
```
Open your browser and go to:
```
http://localhost:5000
```
Enter a prompt and click "Generate" to create an image!

Project Structure

```
project/
├── app.py              # Main Flask application
├── templates/
│   └── index.html      # Web interface
├── static/
│   └── images/         # Static assets (optional)
├── requirements.txt    # Python dependencies
└── README.md           # This file
```

Notes

First run: The application will download required AI models (~5-7GB). This happens only once.
Generation time: Each image takes 30-60 seconds on GPU, longer on CPU.
Memory: Requires ~8GB VRAM (GPU) or ~16GB RAM (CPU).

Troubleshooting

Out of Memory

If you get CUDA out of memory errors, try:

Closing other GPU applications
Reducing num_inference_steps in app.py
Using CPU mode (slower but works with less memory)
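
As a rough illustration of the CPU fallback mentioned above, the sketch below picks a device at startup. It is not taken from app.py: the `pick_device` helper and the `FORCE_CPU` environment variable are hypothetical names, and the PyTorch import is made optional so the snippet also runs where PyTorch is not installed.

```python
import os


def pick_device(force_cpu: bool = False) -> str:
    """Return "cuda" when a GPU is usable and not overridden, else "cpu"."""
    if force_cpu:
        return "cpu"
    try:
        import torch  # imported lazily; treated as optional in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# FORCE_CPU is a hypothetical switch, not an option documented by this project.
device = pick_device(force_cpu=os.environ.get("FORCE_CPU") == "1")
print(f"Using device: {device}")
```

The resulting string could then be passed wherever app.py moves the model to a device.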

Slow Generation

CPU mode is significantly slower than GPU
Consider using a cloud GPU service for better performance

API Endpoints

GET / - Web interface
POST / - Generate image (form data: prompt)
GET /health - Health check
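
The endpoints above can also be called from a script. The following is a minimal sketch using only the Python standard library; it assumes the server address from "Running the Application" and a form field named prompt, as listed above. The helper names are illustrative, and the response format (HTML page, JSON, or otherwise) is not specified in this README, so the raw bytes are returned as-is.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # address from "Running the Application"


def build_generate_request(prompt: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST / request carrying the prompt as URL-encoded form data."""
    body = urllib.parse.urlencode({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/", data=body, method="POST")


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /health request."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def send(request: urllib.request.Request, timeout: float = 120.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With the server running, `send(build_health_request())` checks that the app is up, and `send(build_generate_request("a dog in front of a desk"))` submits a prompt. The generous default timeout reflects the generation times listed under Notes.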

Scores Explained

CLIP Score: Measures how well the generated image matches the text prompt (0-1, higher is better)
GenEval Score: Derived metric (CLIP Score × 2.5) for easier interpretation; it is a rescaled CLIP Score, not an independent measurement
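
The relationship between the two scores can be checked against the sample log near the top of this README. This is a minimal sketch of the stated formula only; the function name is illustrative, and whether app.py rounds or clamps the result is not stated here.

```python
GENEVAL_SCALE = 2.5  # multiplier stated in "Scores Explained"


def geneval_score(clip_score: float) -> float:
    """Rescale a CLIP score into the derived GenEval score."""
    return clip_score * GENEVAL_SCALE


sample_clip = 0.2753  # "Average CLIPScore" from the sample log
print(f"GenEval Score: {geneval_score(sample_clip):.3f}")  # → GenEval Score: 0.688
```

The logged value of 0.6883 agrees with this up to rounding of the logged CLIP score.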