📁 Utils Directory Guide - Format_Resume.py Focus

🎯 REQUIRED FILES for Format_Resume.py (10 out of 11 files)

Format_Resume.py uses OpenAI GPT-4o as its primary extraction method and HF Cloud as its backup. Based on an analysis of that functionality, these are the essential files:

utils/
├── 🎯 CORE EXTRACTION SYSTEM (Format_Resume.py dependencies)
│   ├── hybrid_extractor.py      # ⭐ REQUIRED - Main orchestrator (direct import)
│   ├── openai_extractor.py      # ⭐ REQUIRED - OpenAI GPT-4o (PRIMARY method)
│   ├── hf_cloud_extractor.py    # ⭐ REQUIRED - HF Cloud API (BACKUP method)
│   ├── ai_extractor.py          # ⭐ REQUIRED - Alternative HF AI (fallback)
│   ├── hf_extractor_simple.py   # ⭐ REQUIRED - Simple HF (fallback)
│   └── extractor_fixed.py       # ⭐ REQUIRED - Regex fallback (last resort)
│
├── 🏗️ DOCUMENT PROCESSING (Format_Resume.py dependencies)
│   ├── builder.py               # ⭐ REQUIRED - Resume document generation with header/footer preservation
│   └── parser.py                # ⭐ REQUIRED - PDF/DOCX text extraction (direct import)
│
└── 📊 REFERENCE DATA (Required for fallback system)
    └── data/                    # ⭐ REQUIRED - Used by extractor_fixed.py fallback
        ├── job_titles.json      # ⭐ REQUIRED - Job title patterns for regex extraction
        └── skills.json          # ⭐ REQUIRED - Skills matching for spaCy extraction

🔗 Dependency Chain for Format_Resume.py

pages/Format_Resume.py
├── utils/hybrid_extractor.py (DIRECT IMPORT - orchestrator)
│   ├── utils/openai_extractor.py (PRIMARY GPT-4o - best accuracy)
│   ├── utils/hf_cloud_extractor.py (BACKUP - good accuracy)
│   ├── utils/ai_extractor.py (alternative backup)
│   ├── utils/hf_extractor_simple.py (simple backup)
│   └── utils/extractor_fixed.py (regex fallback) → uses data/job_titles.json & data/skills.json
├── utils/builder.py (DIRECT IMPORT - document generation with template preservation)
└── utils/parser.py (DIRECT IMPORT - file parsing)

🎯 File Purposes for Format_Resume.py

✅ REQUIRED - Core Extraction System

| File | Purpose | When Used | Priority |
|------|---------|-----------|----------|
| `hybrid_extractor.py` | Main entry point - orchestrates all extraction methods | Always (Format_Resume.py imports this) | 🔴 CRITICAL |
| `openai_extractor.py` | PRIMARY AI - OpenAI GPT-4o extraction with contact info | When `use_openai=True` (best results) | 🟠 PRIMARY |
| `hf_cloud_extractor.py` | BACKUP AI - Hugging Face Cloud API extraction | When OpenAI fails or is unavailable | 🟡 BACKUP |
| `ai_extractor.py` | Alternative AI - HF AI models extraction | Alternative backup method | 🟢 FALLBACK |
| `hf_extractor_simple.py` | Simple AI - Simplified local processing | When cloud APIs fail | 🟢 FALLBACK |
| `extractor_fixed.py` | Reliable fallback - Regex-based extraction with spaCy | When all AI methods fail | 🔵 LAST RESORT |

✅ REQUIRED - Document Processing

| File | Purpose | When Used | Priority |
|------|---------|-----------|----------|
| `builder.py` | Document generation - Creates formatted Word docs with preserved headers/footers | Always (Format_Resume.py imports this) | 🔴 CRITICAL |
| `parser.py` | File parsing - Extracts raw text from PDF/DOCX files | Always (Format_Resume.py imports this) | 🔴 CRITICAL |

✅ REQUIRED - Reference Data

| File | Purpose | When Used | Priority |
|------|---------|-----------|----------|
| `data/job_titles.json` | Job title patterns - Used by extractor_fixed.py for regex matching | When all AI methods fail (fallback) | 🟡 BACKUP |
| `data/skills.json` | Skills database - Used by extractor_fixed.py for spaCy skill matching | When all AI methods fail (fallback) | 🟡 BACKUP |
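
To illustrate how a regex fallback can use a list of job titles, here is a minimal sketch. It is not the code in extractor_fixed.py. The helper `build_title_pattern` is hypothetical, and the example assumes the titles are a flat JSON array of strings, which may not match the real layout of `data/job_titles.json`.

```python
import json
import re


def build_title_pattern(titles):
    """Compile one case-insensitive regex that matches any known job title.

    Hypothetical helper: assumes `titles` is a flat list of strings.
    """
    # Sort longest first so "Senior Data Engineer" wins over "Data Engineer".
    ordered = sorted(titles, key=len, reverse=True)
    alternation = "|".join(re.escape(title) for title in ordered)
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


# Inline sample data standing in for the contents of a titles file (assumed shape).
titles = json.loads('["Data Engineer", "Senior Data Engineer", "Product Manager"]')
pattern = build_title_pattern(titles)

found = pattern.findall("Worked as a senior data engineer, then product manager.")
print(found)
```

Sorting the titles by length before joining them matters because regex alternation takes the first branch that matches, so a shorter title listed first would hide a longer one that contains it.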

🚀 Format_Resume.py Extraction Flow

1. User uploads resume → parser.py extracts raw text
2. hybrid_extractor.py orchestrates extraction:
   ├── Try openai_extractor.py (PRIMARY GPT-4o - best accuracy)
   ├── If fails → Try hf_cloud_extractor.py (BACKUP - good accuracy)
   ├── If fails → Try ai_extractor.py (alternative backup)
   ├── If fails → Try hf_extractor_simple.py (simple backup)
   └── If all fail → Use extractor_fixed.py (regex fallback) → uses data/*.json
3. builder.py generates formatted Word document with preserved template headers/footers
4. User downloads formatted resume with Qvell branding and proper formatting
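
The fallback order in step 2 can be sketched as a loop over extractors, tried in priority order. This is an illustration only, not the code in hybrid_extractor.py. The function `extract_with_fallback` and the stub extractors are hypothetical stand-ins, and treating an empty result as a failure is an assumption.

```python
from typing import Callable, Dict, List, Tuple

# An extractor takes raw resume text and returns structured fields.
Extractor = Callable[[str], Dict[str, object]]


def extract_with_fallback(
    text: str, extractors: List[Tuple[str, Extractor]]
) -> Tuple[str, Dict[str, object]]:
    """Try each extractor in order; return the first non-empty result."""
    errors = []
    for name, extractor in extractors:
        try:
            result = extractor(text)
        except Exception as exc:  # any failure moves on to the next method
            errors.append(f"{name}: {exc}")
            continue
        if result:
            return name, result
        errors.append(f"{name}: empty result")
    raise RuntimeError("All extraction methods failed: " + "; ".join(errors))


# Hypothetical stubs standing in for the real extractor modules.
def openai_stub(text: str) -> Dict[str, object]:
    raise ConnectionError("API key missing")


def hf_cloud_stub(text: str) -> Dict[str, object]:
    return {}  # simulates a call that returns nothing usable


def regex_stub(text: str) -> Dict[str, object]:
    return {"name": text.split("\n")[0]}


chain = [("openai", openai_stub), ("hf_cloud", hf_cloud_stub), ("regex", regex_stub)]
method, data = extract_with_fallback("Jane Doe\nEngineer", chain)
print(method, data)
```

Returning the name of the method that succeeded alongside the data makes it easy to show or log which tier produced a given resume.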