---
base_model:
- meta-llama/Llama-3.3-70B-Instruct
datasets:
- flexifyai/cross_rulings_hts_dataset_for_tariffs
language:
- en
library_name: transformers
license: mit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- legal
- trade
- htsus
- semiconductor
- tariffs
- hts
- cross
- cbp
pretty_name: Atlas (LLaMA-3.3-70B) — HTS Classification
authors:
- name: Pritish Yuvraj
  affiliation: Flexify.AI
  homepage: https://www.pritishyuvraj.com/
- name: Siva Devarakonda
  affiliation: Flexify.AI
---

# Atlas — LLaMA-3.3-70B fine-tuned for Harmonized Tariff Schedule (HTS) classification

This model is presented in the paper [ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification](https://huggingface.co/papers/2509.18400).

Atlas is a domain-specialized LLaMA-3.3-70B model fine-tuned on U.S. Customs CROSS rulings for Harmonized Tariff Schedule (HTS) code assignment. It targets both **10-digit U.S. HTS (compliance)** and **6-digit HS (globally harmonized)** accuracy.

- **10-digit exact match:** 40.0%
- **6-digit exact match:** 57.5%

Atlas outperforms general-purpose LLMs while remaining deployable and self-hostable.

- **Model repo:** [flexifyai/atlas-llama3.3-70b-hts-classification](https://huggingface.co/flexifyai/atlas-llama3.3-70b-hts-classification)
- **Dataset:** [flexifyai/cross_rulings_hts_dataset_for_tariffs](https://huggingface.co/datasets/flexifyai/cross_rulings_hts_dataset_for_tariffs)
- **Demo:** [flexifyai/atlas-llama3_3-70b-hts-demo](https://flexifyai-atlas-llama3-3-70b-hts-demo.hf.space/?__theme=system&deep_link=auHidY8xF00)
- **Project page:** https://tariffpro.flexify.ai/

**Example (from the demo):**

**User:** What is the HTS US Code for 4\[N-(2,4-Diamino-6-Pteridinylmethyl)-N-Methylamino] Benzoic Acid Sodium Salt?
**Model:**
HTS US Code -> `2933.59.4700`
Reasoning -> Falls under heterocyclic compounds with nitrogen hetero-atom(s); specifically classified within pteridine derivatives used in pharmaceutical or biochemical applications per CROSS rulings.

---

## TL;DR

- **Task:** Assign an HTS code given a product description (and, optionally, a rationale).
- **Why it matters:** Misclassification halts shipments; the 6-digit HS code is global, while the 10-digit HTS code is U.S.-specific.
- **What’s new:** The first open benchmark plus a strong open-model baseline, focused on semiconductors and manufacturing.

---

## Intended use & limitations

### Use cases

- Automated HTS/HS pre-classification with human-in-the-loop review.
- Decision support for brokers, compliance teams, and trade workflows.
- Research on domain reasoning, retrieval, and alignment.

### Limitations

- Not legal advice; rulings change and are context-dependent.
- Training data is concentrated in semiconductors/manufacturing; performance may vary in other domains.
- The model can produce confident but incorrect codes; keep a human validator for high-stakes usage.
- Always verify against the current HTS/USITC schedule and local customs guidance.

---

## Data

- **Source:** CROSS (U.S. Customs Rulings Online Search System).
- **Splits:** 18,254 train / 200 validation / 200 test.
- Each example includes:
  - a product description
  - a chain-of-reasoning-style justification
  - the ground-truth HTS code

**Dataset card:** [flexifyai/cross_rulings_hts_dataset_for_tariffs](https://huggingface.co/datasets/flexifyai/cross_rulings_hts_dataset_for_tariffs)

---

## Training setup (summary)

- **Base:** LLaMA-3.3-70B (dense)
- **Objective:** Supervised fine-tuning (token-level NLL)
- **Optimizer:** AdamW (β1=0.9, β2=0.95, weight decay 0.1), cosine LR schedule, peak LR 1e-7
- **Precision:** bf16, with gradient accumulation (effective batch ≈ 64 sequences)
- **Hardware:** 16× A100-80GB, 5 epochs (~1.4k steps)

We chose a dense model for simpler fine-tuning/inference and for reproducibility under budget constraints.
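The learning-rate schedule above can be sketched as follows. This is a minimal illustration only: the warmup length and minimum LR are assumptions (not stated here), while the peak LR of 1e-7 and the ~1.4k total steps come from the summary above.

```python
import math

def cosine_lr(step: int, total_steps: int, peak_lr: float = 1e-7,
              warmup_steps: int = 100, min_lr: float = 0.0) -> float:
    """Cosine LR decay with linear warmup (warmup_steps/min_lr are assumptions)."""
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from peak_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, `cosine_lr(100, 1400)` returns the peak LR at the end of warmup, and the value decays toward zero by step 1400.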
**Future work:** retrieval, DPO/GRPO, and smaller distilled variants.

---

## Results (200-example held-out test)

| Model | 10-digit exact | 6-digit exact | Avg. digits correct |
|-------------------------|----------------|---------------|---------------------|
| GPT-5-Thinking | 25.0% | 55.5% | 5.61 |
| Gemini-2.5-Pro-Thinking | 13.5% | 31.0% | 2.92 |
| DeepSeek-R1 (05/28) | 2.5% | 26.5% | 3.24 |
| GPT-OSS-120B | 1.5% | 8.0% | 2.58 |
| LLaMA-3.3-70B (base) | 2.1% | 20.7% | 3.31 |
| **Atlas (this model)** | **40.0%** | **57.5%** | **6.30** |

💰 **Cost note:** Self-hosting Atlas on A100s can be significantly cheaper per 1k inferences than proprietary APIs.

---

## Prompting

Atlas expects an instruction of the form:

---

User: What is the HTS US Code for [product_description]?

Model: HTS US Code -> [10-digit code]
Reasoning -> [short justification]

---

### Minimal example

**User:** What is the HTS US Code for 300mm silicon wafers, polished, un-doped, for semiconductor fabrication?

**Model:**
HTS US Code -> `3818.00.0000`
Reasoning -> Classified under chemical elements/compounds doped for electronics; wafer form per CROSS precedents.

---

## Authors

- **Pritish Yuvraj** (Flexify.AI) — [pritishyuvraj.com](https://www.pritishyuvraj.com)
- **Siva Devarakonda** (Flexify.AI)

## 📖 Citation

If you find this work useful, please cite our paper:

```bibtex
@misc{yuvraj2025atlasbenchmarkingadaptingllms,
  title={ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification},
  author={Pritish Yuvraj and Siva Devarakonda},
  year={2025},
  eprint={2509.18400},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.18400},
}
```
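---

**Metric note:** The "Avg. digits correct" column in the results table presumably counts the leading digits of the predicted code that match the ground truth; the paper's exact definition governs. A minimal sketch under that assumption:

```python
def digits_correct(pred: str, gold: str) -> int:
    """Number of leading digits of `pred` matching `gold`, ignoring dots.

    Assumption: "Avg. digits correct" is a leading-digit prefix match
    over the 10-digit code; consult the paper for the exact definition.
    """
    p, g = pred.replace(".", ""), gold.replace(".", "")
    n = 0
    for a, b in zip(p, g):
        if a != b:
            break
        n += 1
    return n

# A prediction correct at the 6-digit (HS) level but wrong at 10 digits:
digits_correct("2933.59.0000", "2933.59.4700")  # -> 6
```

Under this reading, a 10-digit exact match scores 10, a 6-digit (HS-level) match scores at least 6, and averaging over the test set yields the final column.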