---
license: mit
language:
  - en
  - it
---

🐯 Welcome to the Jungle – AI Image Generator (DDPM v1)

🎯 This is a pretrained DDPM model that generates realistic 128x128 wild animal images from pure noise.

🎨 What You Can Use This Model For / A cosa serve questo modello

🚀 English

This pretrained DDPM model generates 128x128 images of wild animals from pure noise.
Built on a U-Net backbone and a diffusion sampling pipeline, it can serve several purposes:

🧪 Data Augmentation: enrich your training datasets with synthetic animal images.
🧑‍🏫 AI Education & Demos: perfect for demonstrating generative models in lessons or workshops.
🖼️ Creative Projects & Art: use the generator to create textures, digital artworks, or backgrounds.
🧠 Research: test new sampling strategies, noise schedulers, or conditioning mechanisms.
🗂️ Presentation Material: generate consistent wildlife images for PowerPoint slides, mockups or publications.

🇮🇹 Italiano

Questo modello DDPM preaddestrato genera immagini realistiche di animali selvatici a 128x128 pixel a partire da puro rumore.
Grazie all’architettura UNet e alla pipeline di diffusione, può essere utilizzato in molti contesti:

🧪 Data Augmentation: arricchisci i tuoi dataset con immagini sintetiche di animali.
🧑‍🏫 Didattica & Dimostrazioni: ideale per spiegare i modelli generativi in corsi o workshop.
🖼️ Progetti Creativi & Arte Digitale: genera texture, sfondi o arte concettuale.
🧠 Ricerca: testa nuove tecniche di campionamento, scheduler di rumore o meccanismi condizionati.
📊 Presentazioni & Slide: crea immagini coerenti da usare in PowerPoint, tesi, progetti aziendali.

🖼️ Sample Output

🧠 Model Architecture

🔧 Quick Inference

Clone the repository or download the files and run:

pip install torch torchvision pillow
python inference.py


> The model will generate an image from pure noise and save it as output.png.  
> The UNet and Diffusion classes ship with the repository in unet.py.

### 📖 Description / Descrizione

**English:**  
*Welcome to the Jungle – AI Image Generator (DDPM v1)* is a diffusion-based model built on a custom UNet architecture with self-attention.
The model generates 128×128 RGB images of wild animals from random noise.  
The source code is intentionally left open and readable as an opportunity for learning and experimentation, so feel free to study and modify it.  
Generation is inherently random, so outputs vary between runs.  
A diffusion model works by progressively denoising a random tensor: at each time step it predicts the noise present in the current tensor and subtracts a scaled portion of it, moving one step closer to a coherent image.
The UNet used here combines downsampling layers, skip connections that pass features across to the upsampling path, self-attention modules to enhance feature learning, and upsampling layers
that reconstruct the final image.
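
The "predict and subtract the noise" step described above can be sketched in a few lines. This is a minimal, dependency-free illustration that follows the standard DDPM update rule, not the code in unet.py: the linear schedule, its endpoints (1e-4 to 0.02), the 1000 steps, and every function name here are assumptions made for the example, and plain Python lists stand in for image tensors.

```python
import math
import random


def linear_beta_schedule(steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed values, not read from the checkpoint)."""
    return [beta_start + (beta_end - beta_start) * i / (steps - 1) for i in range(steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for all s up to t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, eps, alpha_bar_t):
    """Forward process: blend the clean signal x0 with Gaussian noise eps."""
    return [math.sqrt(alpha_bar_t) * x + math.sqrt(1.0 - alpha_bar_t) * e
            for x, e in zip(x0, eps)]


def reverse_step(x_t, predicted_noise, t, betas, alpha_bars, rng):
    """One denoising step from x_t to x_{t-1}; t is a 0-based step index."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    noise_scale = beta_t / math.sqrt(1.0 - alpha_bars[t])
    # Subtract the scaled noise estimate, then rescale.
    mean = [(x - noise_scale * e) / math.sqrt(alpha_t)
            for x, e in zip(x_t, predicted_noise)]
    if t == 0:
        return mean  # the final step adds no fresh noise
    sigma = math.sqrt(beta_t)
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]


rng = random.Random(0)
betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)

# Toy "image" of three pixel values, noised by one step and then denoised
# with a perfect noise prediction (a trained UNet would supply an estimate).
x0 = [0.5, -0.25, 1.0]
eps = [rng.gauss(0.0, 1.0) for _ in x0]
x_t = add_noise(x0, eps, alpha_bars[0])
restored = reverse_step(x_t, eps, 0, betas, alpha_bars, rng)
```

With a perfect noise prediction, the final step recovers the clean values exactly. During real sampling the loop runs from the last time step down to the first, with the network's estimate in place of `eps`.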

**Italiano:**  
Benvenuti in *Welcome to the Jungle – AI Image Generator (DDPM v1)*, un modello basato sulla diffusione costruito su un’architettura UNet personalizzata con moduli di
self-attention.  
Il modello genera immagini RGB 128×128 di animali selvatici partendo da rumore casuale.  
Il codice è stato volutamente lasciato visibile per offrire un'opportunità formativa a chi desidera imparare e sperimentare.  
Il processo di generazione è intrinsecamente randomico: ogni immagine può essere diversa a ogni esecuzione.  
Un modello di diffusione lavora rimuovendo progressivamente il rumore da un tensore casuale, prevedendo e sottraendo il rumore passo dopo passo, fino a ottenere
un’immagine coerente. L'architettura UNet impiegata include livelli di downsampling con connessioni skip, moduli di self-attention per migliorare l’apprendimento
delle caratteristiche e livelli di upsampling per ricostruire l'immagine finale.
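
The self-attention modules mentioned in the description let every spatial position of a feature map look at every other position. The sketch below shows the core computation, scaled dot-product attention, in plain Python. It is an illustration under simplifying assumptions, not the module in unet.py: queries, keys and values use identity projections instead of learned ones, there is a single head, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    `tokens` is a list of feature vectors, one per spatial position.
    Returns the attended vectors and the attention weight matrix.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of all value vectors.
        outputs.append([sum(w * value[c] for w, value in zip(row, tokens))
                        for c in range(dim)])
    return outputs, weights


# A 2x2 feature map with 2 channels, flattened to 4 tokens.
feature_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
attended, attention_weights = self_attention(feature_map)
```

Each row of `attention_weights` sums to 1, so every output vector is a weighted average of all positions. In a UNet this is usually applied at low-resolution stages, where the number of positions is small enough for the quadratic cost to stay manageable.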

---

## 🧪 Quick Inference Example

You can run the following script to generate a synthetic wildlife image using this model:

```python
import os

import torch
from torchvision.utils import save_image

from unet import Unet, Diffusion  # defined in unet.py from this repository

# Set up the device and image size
device = 'cuda' if torch.cuda.is_available() else 'cpu'
image_size = 128

# Initialize the model and load the pretrained weights
model = Unet().to(device)
model.load_state_dict(torch.load("ckpt_epoch_149.pt", map_location=device))
model.eval()

# Initialize the diffusion helper
diffusion = Diffusion(img_size=image_size, device=device)

# Create the output folder if it does not exist
os.makedirs("generated_images", exist_ok=True)

# Generate one image and save it
with torch.no_grad():
    sampled_images = diffusion.sample(model, n=1)
    save_image(sampled_images, "generated_images/output.png")

print("✅ Image generated and saved to 'generated_images/output.png'")
```

---

### 📂 Where to find the result? / Dove trovo l’immagine generata?

After running the script, you will find the generated image in the generated_images/ folder, which is created in the directory you run the script from.

💡 Make sure the script, unet.py, and the model weights file (ckpt_epoch_149.pt) are in the same folder or adjust the import paths accordingly.

🇮🇹 Dove trovo l’immagine generata?
Una volta eseguito lo script, l’immagine viene salvata automaticamente nella cartella generated_images/, accanto al file .py con cui hai lanciato l’inferenza.
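
Before launching inference, it can help to confirm that the files named in the tip above are actually in place. The helper below is a small sketch of such a check. The function name `missing_files` is hypothetical, and the file names are taken from the example script, so adjust them if your copy differs.

```python
from pathlib import Path

# File names taken from the example script; adjust if yours differ.
REQUIRED_FILES = ("unet.py", "ckpt_epoch_149.pt")


def missing_files(folder=".", required=REQUIRED_FILES):
    """Return the required file names that are absent from `folder`."""
    base = Path(folder)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files found.")
```

Run it from the folder that holds the inference script. An empty result means the import of unet.py and the checkpoint load should at least find their files.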

---

**🧑‍💻 Author / Autore:**  
Dr. Flavio Rubens Ottaviani

**📄 License / Licenza:**  
MIT License