---
license: apache-2.0
datasets:
- openai/gdpval
- Agent-Ark/Toucan-1.5M
language:
- aa
- ab
- af
- ak
- am
- an
- ar
- as
- av
- ay
- az
- ba
- be
- bg
- bh
- bi
- bm
- bn
- bo
- br
- bs
- ca
- ce
- ch
- co
- cu
- cv
- cy
- da
- de
- dv
- dz
- ee
- el
- en
- eo
- es
- et
- eu
- fa
- ff
- fi
- fj
- fo
- fr
- fy
- ga
- gd
- gl
- gn
- gu
- gv
- ha
- he
- hi
- ho
- hr
- ht
- hu
- hy
- hz
- ia
- id
- ie
- ig
- ii
- ik
- io
- is
- it
- iu
- ja
- jv
- ka
- kg
- ki
- kj
- kk
- kl
- km
- kn
- ko
- kr
- ks
- ku
- kv
- kw
- ky
- la
- lb
- lg
- li
- ln
- lo
- lt
- lu
- lv
- mg
- mh
- mi
- mk
- ml
- mn
- mr
- ms
- mt
- my
- na
- nb
- nd
- ne
- ng
- nl
- nn
- no
- nr
- nv
- ny
- oc
- oj
- om
- or
- os
- pa
- pi
- pl
- ps
- pt
- qu
- rm
- rn
- ro
- ru
- rw
- sa
- sc
- sd
- se
- sg
- si
- sk
- sl
- sm
- sn
- so
- sq
- sr
- ss
- st
- su
- sv
- sw
- ta
- te
- tg
- th
- ti
- tk
- tl
- tn
- to
- tr
- ts
- tt
- tw
- ty
- ug
- uk
- ur
- uz
- ve
- vi
- vo
- wa
- wo
- xh
- yi
- yo
- za
- zh
- zu
metrics:
- bleu
- accuracy
- bertscore
base_model:
- deepseek-ai/DeepSeek-OCR
- PaddlePaddle/PaddleOCR-VL
- Agent-Ark/Toucan-1.5M
---

# 🌟 Land of Light AI: Global Smart Tourism & Marketing Assistant

## Overview

Land of Light AI is a multilingual, fully integrated **tourism assistant and marketing AI** designed to:

- Provide personalized travel recommendations
- Engage users across **WhatsApp, Telegram, Instagram, Facebook Messenger, and TikTok**
- Analyze user behavior and generate marketing campaigns
- Display insights and KPIs on a **dashboard**
- Support the **languages listed in the metadata above** (ISO 639-1 codes)

---


## Key Features

1. **Multilingual Social Media Interaction**
   - Auto-chat with users on major social platforms
   - Respond to inquiries about attractions, hotels, restaurants, and events

2. **Personalized Marketing**
   - Send location-based offers and promotions
   - Schedule and automate campaigns
   - Tailor recommendations to user preferences

3. **Data Analytics Dashboard**
   - Track engagement metrics and conversion rates
   - Analyze visitor behavior and preferences
   - Export actionable insights for marketing

4. **Multilingual Support**
   - Covers the languages listed by ISO 639-1 code in the model card metadata
   - Automatically detects the user's language and context

5. **Integrated AI Core**
   - Transformer-based LLM with OCR and text reasoning
   - Fine-tuned on tourism and marketing datasets

---


## Technical Details

- **Developed by:** Hamzah Zaher Alasmri
- **License:** Apache-2.0
- **Base Models:** DeepSeek-OCR, PaddleOCR-VL, Toucan-1.5M
- **Frameworks:** PyTorch, Transformers, LangChain, FastAPI
- **Frontend:** Web dashboard, social media API integrations
- **Database:** PostgreSQL + Pinecone vector store

### Training Data

- Tourist attractions, events, and user interaction datasets
- Arabic-English bilingual datasets
- Social media conversation samples for marketing

### Training Procedure

- Fine-tuned with the AdamW optimizer
- Mixed-precision training (bf16 / fp16)
- Preprocessing: tokenization, normalization, entity tagging

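
The card names AdamW but does not list its hyperparameters. As a rough illustration of what the optimizer does, the sketch below implements one AdamW step for a single scalar parameter in plain Python. The learning rate, betas, epsilon, and weight decay are placeholder values chosen for the example, not the settings used to train this model, and the function name `adamw_step` is hypothetical.

```python
import math

def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter (illustrative values only)."""
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    theta = theta * (1.0 - lr * weight_decay)
    # Exponential moving averages of the gradient and of its square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialized averages (t starts at 1).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Toy objective f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 501):
    theta, m, v = adamw_step(theta, 2.0 * theta, m, v, t)
print(theta)
```

In a real fine-tuning run the same rule is applied per tensor by the framework's optimizer; the scalar version only makes the decoupled weight decay and bias correction visible.
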
### Evaluation Metrics

- **BLEU:** 0.92
- **Accuracy:** 94%
- **BERTScore:** 0.87

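
The card does not say which BLEU implementation or test set produced the score above. For readers unfamiliar with the metric, here is a minimal sentence-level sketch in plain Python: the geometric mean of clipped 1- to 4-gram precisions, multiplied by a brevity penalty, on a 0 to 1 scale. The function name `sentence_bleu` and the example sentences are made up for illustration. The sketch uses no smoothing and a single reference, so established toolkits may report different numbers.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate, reference, max_n=4):
    """Unsmoothed single-reference BLEU on a 0-1 scale (illustrative only)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1.0 - len(ref) / len(cand))
    return bp * math.exp(log_precision_sum / max_n)

reference = "visit the old town of Riyadh in the early morning"
print(sentence_bleu(reference, reference))
print(sentence_bleu("visit the old town of Riyadh in the morning", reference))
```

An exact match scores 1.0, and dropping a single word already lowers the score through both the n-gram precisions and the brevity penalty.
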
---

## Example Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model weights from the Hugging Face Hub.
model_name = "HamzahZaher/Land-of-Light-AI"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Suggest personalized travel offers for a family visiting Riyadh."
inputs = tokenizer(prompt, return_tensors="pt")

# Inference only, so skip gradient tracking; cap the reply at 150 new tokens.
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=150)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 📚 Citation

**BibTeX:**

```bibtex
@misc{alasmri2025landoflightai,
  author       = {Hamzah Zaher Alasmri},
  title        = {Land of Light AI: A Multilingual Tourism \& Marketing Assistant for Saudi Arabia},
  year         = {2025},
  howpublished = {Hugging Face Model Hub},
  license      = {Apache-2.0}
}
```

**APA:**

Alasmri, H. Z. (2025). Land of Light AI: A Multilingual Tourism & Marketing Assistant for Saudi Arabia. Hugging Face Model Hub.

## Environmental Impact

- Estimated emissions: ~86 kg CO₂
- Hardware: 8× A100 GPUs
- Training time: ~110 hours

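
As a rough sanity check on the environmental impact figures reported in this card, emissions can be related to hardware and runtime as GPU-hours × average power per GPU × grid carbon intensity. The sketch below works backward from the reported numbers. The 0.4 kW per-GPU draw is an assumption (close to the rated power of an A100 SXM module), and overheads such as cooling are ignored, so the implied carbon intensity is only indicative.

```python
# Figures reported in this model card.
num_gpus = 8
training_hours = 110
reported_emissions_kg = 86

# Assumption (not from the card): average draw per GPU in kilowatts.
assumed_gpu_power_kw = 0.4

gpu_hours = num_gpus * training_hours
energy_kwh = gpu_hours * assumed_gpu_power_kw
# Carbon intensity that the reported figure would imply under that assumption.
implied_intensity = reported_emissions_kg / energy_kwh

print(f"GPU-hours: {gpu_hours}")
print(f"Energy: {energy_kwh:.0f} kWh")
print(f"Implied intensity: {implied_intensity:.2f} kg CO2 per kWh")
```

A different power assumption changes the implied intensity proportionally, so the result is a consistency check rather than a measurement.
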