llama-aviation-pte

Quantized Llama 3.2 1B Aviation model for iOS deployment

Model Details

Base Model: Llama 3.2 1B Instruct
Format: PTE (PyTorch ExecuTorch)
Backend: XNNPACK (CPU backend, targeting the ARM CPUs in iOS devices)
Quantization: 4-bit GPTQ
Size: 1.07 GB
Platform: iOS 16+

Files

llama3_2_1b_aviation_spinquant.pte (1083.3 MB)
tokenizer.json (8.7 MB)
tokenizer_config.json (0.1 MB)
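The 1.07 GB figure under Model Details does not match the .pte file alone, but it is consistent with the sum of the three listed files if "MB" is read as MiB and 1 GB as 1024 MB. That reading is an assumption, not something the card states. A minimal sketch of the arithmetic, plus a hypothetical helper `file_mb` for checking a downloaded file against the list:

```shell
#!/bin/sh
# Sizes as listed in this card, in MB (assumed here to mean MiB).
MODEL_MB=1083.3
TOKENIZER_MB=8.7
TOKENIZER_CONFIG_MB=0.1

# Sum the listed sizes and convert to GB, assuming 1 GB = 1024 MB.
total_gb() {
  awk -v a="$MODEL_MB" -v b="$TOKENIZER_MB" -v c="$TOKENIZER_CONFIG_MB" \
    'BEGIN { printf "%.2f\n", (a + b + c) / 1024 }'
}

# Report the on-disk size of one file in MB (1 MB = 1024 * 1024 bytes here).
file_mb() {
  bytes=$(wc -c < "$1")
  awk -v n="$bytes" 'BEGIN { printf "%.1f\n", n / (1024 * 1024) }'
}

total_gb
```

After downloading, `file_mb llama3_2_1b_aviation_spinquant.pte` should print a value close to the listed size; a much smaller number usually points to an interrupted download.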

Usage

Download

wget -O model.pte https://huggingface.co/bluelarkaviation/llama-aviation-pte/resolve/main/llama3_2_1b_aviation_spinquant.pte
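The Files list also names two tokenizer files, which an app will presumably need alongside the model. A minimal sketch for fetching all three, assuming each file sits at the repository root under the same `resolve/main` path as the command above (the names `file_url`, `download_all`, and the `DOWNLOAD` switch are illustrative, not part of this repository):

```shell
#!/bin/sh
# Base URL taken from the wget command in this card.
BASE_URL="https://huggingface.co/bluelarkaviation/llama-aviation-pte/resolve/main"
# File names as they appear in the Files list.
FILES="llama3_2_1b_aviation_spinquant.pte tokenizer.json tokenizer_config.json"

# Build the download URL for one file (assumes it sits at the repository root).
file_url() {
  printf '%s/%s\n' "$BASE_URL" "$1"
}

# Fetch every listed file; -c resumes a partial download of the large .pte.
download_all() {
  for name in $FILES; do
    wget -c "$(file_url "$name")" || return 1
  done
}

# Only touch the network when explicitly asked, e.g. DOWNLOAD=1 sh fetch.sh
if [ "${DOWNLOAD:-0}" = "1" ]; then
  download_all
fi
```

This saves the model under its listed name; rename it to model.pte (or adjust the resource name in the Swift code) before adding it to the Xcode project.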

iOS Integration

1. Add model.pte to your Xcode project
2. Install the ExecuTorch framework
3. Load the model:

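import ExecuTorch
import Foundation

// Look up model.pte in the app bundle instead of force-unwrapping the path.
guard let modelPath = Bundle.main.path(forResource: "model", ofType: "pte") else {
    fatalError("model.pte is missing from the app bundle")
}
let module = try Module(filePath: modelPath)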

Performance

Optimized for: iPhone 12 and newer
RAM Required: 2-3 GB
Backend: XNNPACK (claimed 5-10x speedup on ARM CPUs; the baseline is not specified)

License

Same as the base Llama 3.2 model (Llama 3.2 Community License)
