LGAI-Embedding-Preview

We trained the LGAI-Embedding-Preview model based on the Mistral-7B LLM.

The initial goal is to reproduce the baseline model and verify the workflow for uploading results:

  • Checkpoint
  • Technical report

MTEB

Inference is performed with in-context examples for MTEB evaluation.
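As a rough illustration, a minimal sketch of embedding extraction is shown below, assuming last-token pooling followed by L2 normalization, a common scheme for Mistral-7B-based embedding models. The pooling choice, the instruction/in-context prompt format, and the dummy inputs are assumptions for illustration, not details confirmed by this card; consult the technical report for the actual procedure.

```python
import torch
import torch.nn.functional as F


def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Select the hidden state of each sequence's last non-padding token."""
    seq_lens = attention_mask.sum(dim=1) - 1            # index of last real token
    batch_idx = torch.arange(hidden_states.size(0))
    return hidden_states[batch_idx, seq_lens]           # (batch, hidden_dim)


# Dummy demonstration using the card's stated embedding dimension (4096).
# In practice, `hidden` would come from the model's last_hidden_state.
hidden = torch.randn(2, 10, 4096)                       # (batch, seq_len, dim)
mask = torch.tensor([[1] * 10, [1] * 6 + [0] * 4])      # second sequence is padded
emb = F.normalize(last_token_pool(hidden, mask), p=2, dim=1)
print(emb.shape)  # torch.Size([2, 4096])
```

In real use, the hidden states would be produced by loading the checkpoint with `AutoModel.from_pretrained` (requires `transformers>=4.48.3` per the Requirements section) and prepending the task instruction and in-context examples to each query before tokenization.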

Model Information

  • Model Size: 7B
  • Embedding Dimension: 4096
  • Max Input Tokens: 32k

Requirements

transformers>=4.48.3

Citation

If you find this repository useful, please consider citing it.

@misc{choi2025lgaiembeddingpreviewtechnicalreport,
      title={LGAI-EMBEDDING-Preview Technical Report}, 
      author={Jooyoung Choi and Hyun Kim and Hansol Jang and Changwook Jun and Kyunghoon Bae and Hyewon Choi and Stanley Jungkyu Choi and Honglak Lee and Chulmin Yun},
      year={2025},
      eprint={2506.07438},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.07438}, 
}