Description
Tarka-Embedding-v1 is a 350M parameter embedding model designed to produce 1024-dimensional dense text representations. It is optimized for a wide range of downstream applications such as semantic similarity, search, and retrieval-augmented generation (RAG). The model focuses on capturing deep contextual semantics to support general-purpose text understanding across diverse domains.
The model is trained using Data-Free Knowledge Distillation (DFKD). To prepare the training data, standard open-source datasets were used only as a source of raw textual content: all labels, annotations, and structural elements were stripped to create a plain, unlabeled text corpus. The resulting dataset contained around 2 billion tokens, but due to dynamic sampling strategies during training, the model effectively learned from fewer than 1 billion tokens.
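As an illustration only, the sketch below shows what embedding-level distillation on an unlabeled corpus can look like in PyTorch; it is not the actual Tarka training recipe. The `student`, `teacher`, and `tokenize` callables, the cosine loss, and the assumption of matching embedding dimensions are placeholders introduced for this example.

import torch
import torch.nn.functional as F

def distillation_step(student, teacher, tokenize, texts, optimizer):
    # `texts` is a batch of raw, unlabeled strings; `tokenize` turns it into model inputs.
    batch = tokenize(texts)
    with torch.no_grad():
        teacher_emb = teacher(**batch)   # frozen teacher embeddings
    student_emb = student(**batch)       # trainable student embeddings (1024-d)
    # Align the student with the teacher by direction; assumes both output the same dimension.
    loss = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()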
Find more information about Tarka-Embedding-350M-V1 in our blog post.
Try our demo: https://huggingface.co/spaces/Tarka-AIR/Tarka-Embedding
Model Details
Tarka-Embedding-350M-V1 has the following features:
- Model Type: Text Embedding
- Supported Languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
- Number of Parameters: 350M
- Context Length: Supports up to 128K tokens; optimal performance is observed with inputs under 4K tokens
- Embedding Dimension: 1024
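As a quick, hedged sanity check of the numbers above (the model ID and loading arguments are taken from the Usage section below), the snippet encodes one sentence and prints the embedding shape, which should be 1024-dimensional.

from sentence_transformers import SentenceTransformer

# Quick check of the advertised embedding dimension.
model = SentenceTransformer("Tarka-AIR/Tarka-Embedding-350M-V1", trust_remote_code=True)
embedding = model.encode("A short sanity-check sentence.")
print(embedding.shape)  # expected: (1024,)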
Evaluation
MTEB (Eng v2)
| MTEB English / Models | Param. | Mean(Task) | Mean(Type) | Class. | Clust. | Pair Class. | Rerank. | Retri. | STS | Summ. |
|---|---|---|---|---|---|---|---|---|---|---|
| multilingual-e5-large-instruct | 0.6B | 65.53 | 61.21 | 75.54 | 49.89 | 86.24 | 48.74 | 53.47 | 84.72 | 29.89 |
| NV-Embed-v2 | 7.8B | 69.81 | 65.00 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| GritLM-7B | 7.2B | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.20 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
| stella_en_1.5B_v5 | 1.5B | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
| gte-Qwen2-7B-instruct | 7.6B | 70.72 | 65.77 | 88.52 | 58.97 | 85.9 | 50.47 | 58.09 | 82.69 | 35.74 |
| gemini-embedding-exp-03-07 | - | 73.3 | 67.67 | 90.05 | 59.39 | 87.7 | 48.59 | 64.35 | 85.29 | 38.28 |
| Tarka-Embedding-350M-V1 | 350M | 69.29 | 63.29 | 88.43 | 55.73 | 83.96 | 47.77 | 55.14 | 84.59 | 27.43 |
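For comparison or reproduction, scores of this kind are typically produced with the open-source mteb package. The sketch below is an assumption about a typical mteb workflow (task names and API calls depend on your installed version), not the exact setup used for the table above.

import mteb
from sentence_transformers import SentenceTransformer

# Sketch: evaluate the model on a couple of English MTEB tasks.
model = SentenceTransformer("Tarka-AIR/Tarka-Embedding-350M-V1", trust_remote_code=True)
tasks = mteb.get_tasks(tasks=["STSBenchmark", "Banking77Classification"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="mteb_results")
print(results)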
Usage
For the best performance, use Flash Attention 2 with bfloat16:
from sentence_transformers import SentenceTransformer
# We recommend enabling flash_attention_2 for better acceleration and memory savings.
model = SentenceTransformer(
    "Tarka-AIR/Tarka-Embedding-350M-V1",
    trust_remote_code=True,
    model_kwargs={
        "attn_implementation": "flash_attention_2",
        "device_map": "cuda",
        "torch_dtype": "bfloat16",
    },
    tokenizer_kwargs={"padding_side": "left"},
)
# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.9177, 0.3923],
# [0.2975, 0.7631]])
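Building on the similarity matrix above, a common follow-up is to rank documents per query. The short sketch below reuses the objects defined in the snippet and simply picks the highest-scoring document for each query.

# Pick the best-matching document for each query from the similarity matrix.
best = similarity.argmax(dim=1)
for query, idx in zip(queries, best.tolist()):
    print(f"{query!r} -> {documents[idx]!r}")
# Expected: the capital question matches the Beijing sentence,
# and "Explain gravity" matches the gravity explanation.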
Acknowledgments
Special thanks to:
- The Qwen and LFM teams for providing the base model and foundational research.
Gratitude is also extended to the open-source community for creating the tools, frameworks, and datasets that enabled fine-tuning and evaluation of this model.
Disclaimer
The creator of this model is not responsible for any misuse, damages, or legal issues arising from the use of this model.
Citation
If you find this model useful, please consider citing it:
@misc{tarka_ai_research_2025,
  author    = {Tarka AI Research},
  title     = {Tarka-Embedding-350M-V1 (Revision f4b5de8)},
  year      = 2025,
  url       = {https://huggingface.co/Tarka-AIR/Tarka-Embedding-350M-V1},
  doi       = {10.57967/hf/6979},
  publisher = {Hugging Face}
}