---
base_model:
  - meta-llama/Llama-3.3-70B-Instruct
datasets:
  - flexifyai/cross_rulings_hts_dataset_for_tariffs
language:
  - en
library_name: transformers
license: mit
metrics:
  - accuracy
pipeline_tag: text-classification
tags:
  - legal
  - trade
  - htsus
  - semiconductor
  - tariffs
  - hts
  - cross
  - cbp
pretty_name: Atlas (LLaMA-3.3-70B) – HTS Classification
authors:
  - name: Pritish Yuvraj
    affiliation: Flexify.AI
    homepage: https://www.pritishyuvraj.com/
  - name: Siva Devarakonda
    affiliation: Flexify.AI
---

Atlas – LLaMA-3.3-70B fine-tuned for Harmonized Tariff Schedule (HTS) classification

This model is presented in the paper ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification.

Atlas is a domain-specialized LLaMA-3.3-70B model fine-tuned on U.S. Customs CROSS rulings for Harmonized Tariff Schedule (HTS) code assignment.
It targets both 10-digit U.S. HTS (compliance) and 6-digit HS (globally harmonized) accuracy.

  • 10-digit exact match: 40.0%
  • 6-digit exact match: 57.5%

Atlas outperforms general-purpose LLMs while remaining deployable/self-hostable.

Example (from the demo):

User:
What is the HTS US Code for 4[N-(2,4-Diamino-6-Pteridinylmethyl)-N-Methylamino] Benzoic Acid Sodium Salt?

Model:
HTS US Code -> 2933.59.4700
Reasoning -> Falls under heterocyclic compounds with nitrogen hetero-atom(s); specifically classified within pteridine derivatives used in pharmaceutical or biochemical applications per CROSS rulings.


TL;DR

  • Task: Assign an HTS code given a product description (and optionally rationale).
  • Why it matters: Misclassification halts shipments; 6-digit HS is global, 10-digit is U.S.-specific.
  • What's new: First open benchmark + strong open-model baseline focused on semiconductors/manufacturing.

Intended use & limitations

Use cases

  • Automated HTS/HS pre-classification with human-in-the-loop review (a simple output-parsing guardrail is sketched after this list).
  • Decision support for brokers, compliance, and trade workflows.
  • Research on domain reasoning, retrieval, and alignment.
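
A minimal sketch (not part of any released tooling) of a guardrail such a review loop might use, assuming the "HTS US Code -> ... / Reasoning -> ..." response format shown on this card; the dotted 10-digit regex is a formatting assumption, and anything that fails to parse is escalated to a human reviewer.

```python
# Hypothetical guardrail for human-in-the-loop review: parse the model's
# "HTS US Code -> ... / Reasoning -> ..." response and escalate anything
# that lacks a well-formed dotted 10-digit code (e.g. 2933.59.4700).
import re
from typing import Optional

CODE_RE = re.compile(r"HTS US Code\s*->\s*(\d{4}\.\d{2}\.\d{4})")
REASON_RE = re.compile(r"Reasoning\s*->\s*(.+)", re.DOTALL)

def parse_prediction(text: str) -> Optional[dict]:
    """Return the parsed prediction, or None to flag the item for human review."""
    code = CODE_RE.search(text)
    if code is None:
        return None  # malformed or missing code -> route to a human classifier
    reason = REASON_RE.search(text)
    return {
        "hts_code": code.group(1),
        "reasoning": reason.group(1).strip() if reason else "",
    }

print(parse_prediction("HTS US Code -> 2933.59.4700\nReasoning -> Pteridine derivative."))
# {'hts_code': '2933.59.4700', 'reasoning': 'Pteridine derivative.'}
```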

Limitations

  • Not legal advice; rulings change and are context-dependent.
  • Training data is concentrated in semiconductors/manufacturing; performance may vary elsewhere.
  • Model can produce confident but incorrect codes; keep a human validator for high-stakes usage.
  • Always verify against the current HTS/USITC and local customs guidance.

Data

  • Source: CROSS (U.S. Customs Rulings Online Search System).
  • Splits: 18,254 train / 200 valid / 200 test.
  • Each example includes:
    • product description
    • chain-of-reasoning style justification
    • ground-truth HTS code

Dataset card: flexifyai/cross_rulings_hts_dataset_for_tariffs
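
A quick sketch for inspecting the dataset with the Hugging Face `datasets` library; the split and column names are not asserted here, so print them to confirm the actual schema.

```python
# Minimal sketch: load and inspect the CROSS rulings dataset.
# Split names and column names are NOT asserted here; print them to confirm
# the actual schema before wiring this into training or evaluation code.
from datasets import load_dataset

ds = load_dataset("flexifyai/cross_rulings_hts_dataset_for_tariffs")
print(ds)                              # available splits and their sizes
first_split = list(ds.keys())[0]
print(ds[first_split][0].keys())       # actual field names in one example
```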


Training setup (summary)

  • Base: LLaMA-3.3-70B (dense)
  • Objective: Supervised fine-tuning (token-level NLL)
  • Optimizer: AdamW (β1 = 0.9, β2 = 0.95, weight decay 0.1), cosine LR schedule, peak LR 1e-7
  • Precision: bf16 with gradient accumulation (effective batch ≈ 64 sequences)
  • Hardware: 16× A100-80GB, 5 epochs (~1.4k steps)

We chose a dense model for simpler fine-tuning and inference, and for reproducibility under budget constraints.
Future work: retrieval, DPO/GRPO, and smaller distilled variants.
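
As a rough illustration only (not the authors' training script), the hyperparameters above could be expressed with `transformers.TrainingArguments` roughly as follows; the per-device batch size and accumulation steps are assumptions chosen so that 16 GPUs × 1 × 4 = 64 sequences per optimizer step.

```python
# Illustrative mapping of the summary above onto TrainingArguments; this is
# NOT the authors' script. per_device_train_batch_size and
# gradient_accumulation_steps are assumptions picked so that
# 16 GPUs x 1 x 4 = 64 sequences per optimizer step.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="atlas-llama-3.3-70b-hts",  # hypothetical output path
    num_train_epochs=5,
    per_device_train_batch_size=1,         # assumption
    gradient_accumulation_steps=4,         # assumption (16 x 1 x 4 = 64)
    learning_rate=1e-7,                    # peak LR from the summary
    lr_scheduler_type="cosine",
    adam_beta1=0.9,
    adam_beta2=0.95,
    weight_decay=0.1,
    bf16=True,
    logging_steps=10,
)
```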


Results (200-example held-out test)

| Model | 10-digit exact | 6-digit exact | Avg. digits correct |
|---|---|---|---|
| GPT-5-Thinking | 25.0% | 55.5% | 5.61 |
| Gemini-2.5-Pro-Thinking | 13.5% | 31.0% | 2.92 |
| DeepSeek-R1 (05/28) | 2.5% | 26.5% | 3.24 |
| GPT-OSS-120B | 1.5% | 8.0% | 2.58 |
| LLaMA-3.3-70B (base) | 2.1% | 20.7% | 3.31 |
| Atlas (this model) | 40.0% | 57.5% | 6.30 |
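
A sketch of how these metrics can be computed from predicted and gold codes, assuming "Avg. digits correct" counts leading digits that match the ground truth and that dots in codes are ignored; both are assumptions about the evaluation protocol.

```python
# Sketch of the three evaluation metrics in the table above, assuming
# "Avg. digits correct" counts leading digits that match the gold code.
# Dots are stripped so "2933.59.4700" and "2933594700" compare equal
# (a formatting assumption).
def _digits(code: str) -> str:
    return "".join(ch for ch in code if ch.isdigit())

def score(preds: list, golds: list) -> dict:
    n = len(golds)
    exact10 = exact6 = digits_correct = 0
    for p, g in zip(preds, golds):
        p, g = _digits(p), _digits(g)
        exact10 += p[:10] == g[:10]
        exact6 += p[:6] == g[:6]
        k = 0
        while k < min(len(p), len(g), 10) and p[k] == g[k]:
            k += 1
        digits_correct += k
    return {
        "10-digit exact": exact10 / n,
        "6-digit exact": exact6 / n,
        "avg digits correct": digits_correct / n,
    }

print(score(["2933.59.4700", "8542.31.0000"], ["2933.59.4700", "8541.10.0000"]))
# {'10-digit exact': 0.5, '6-digit exact': 0.5, 'avg digits correct': 6.5}
```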

💰 Cost note: Self-hosting Atlas on A100s can be significantly cheaper per 1k inferences than proprietary APIs.


Prompting

Atlas expects an instruction like:

User:
What is the HTS US Code for [product_description]?

Model:
HTS US Code -> [10-digit code]
Reasoning -> [short justification]

Minimal example

User:
What is the HTS US Code for 300mm silicon wafers, polished, un-doped, for semiconductor fabrication?

Model:
HTS US Code -> 3818.00.0000
Reasoning -> Classified under chemical elements/compounds doped for electronics; wafer form per CROSS precedents.
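
A minimal inference sketch with `transformers`; the repo id is a placeholder, the use of the base LLaMA-3.3 chat template is an assumption about this fine-tune, and the generation settings are illustrative.

```python
# Minimal inference sketch. model_id is a placeholder for this repository's
# Hub id; whether the checkpoint keeps the base LLaMA-3.3 chat template is
# an assumption, and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-repo-id>"  # placeholder: use the org/name of this repository
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = (
    "What is the HTS US Code for 300mm silicon wafers, polished, "
    "un-doped, for semiconductor fabrication?"
)
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
# Expected shape of the answer: "HTS US Code -> ...\nReasoning -> ..."
```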


Authors

  • Pritish Yuvraj (Flexify.AI) – https://www.pritishyuvraj.com/
  • Siva Devarakonda (Flexify.AI)

📖 Citation

If you find this work useful, please cite our paper:

@misc{yuvraj2025atlasbenchmarkingadaptingllms,
  title={ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification}, 
  author={Pritish Yuvraj and Siva Devarakonda},
  year={2025},
  eprint={2509.18400},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.18400}, 
}