Description
Tarka-Embedding-v1 is a 350M parameter embedding model designed to produce 1024-dimensional dense text representations. It is optimized for a wide range of downstream applications such as semantic similarity, search, and retrieval-augmented generation (RAG). The model focuses on capturing deep contextual semantics to support general-purpose text understanding across diverse domains.
The model is trained using Data-Free Knowledge Distillation (DFKD). To prepare the training data, standard open-source datasets were used only as a source of raw textual content: all labels, annotations, and structural elements were stripped to create a plain, unlabeled text corpus. The resulting dataset contained around 2 billion tokens, but due to dynamic sampling strategies during training, the model effectively learned from fewer than 1 billion tokens.
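As an illustration only, the sketch below shows what embedding-level distillation on an unlabeled corpus can look like in PyTorch; it is not the actual Tarka training recipe. The `student`, `teacher`, and `tokenize` callables, the cosine loss, and the assumption of matching embedding dimensions are placeholders introduced for this example.

import torch
import torch.nn.functional as F

def distillation_step(student, teacher, tokenize, texts, optimizer):
    # `texts` is a batch of raw, unlabeled strings; `tokenize` turns it into model inputs.
    batch = tokenize(texts)
    with torch.no_grad():
        teacher_emb = teacher(**batch)   # frozen teacher embeddings
    student_emb = student(**batch)       # trainable student embeddings (1024-d)
    # Align the student with the teacher by direction; assumes both output the same dimension.
    loss = 1.0 - F.cosine_similarity(student_emb, teacher_emb, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()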
Find more information about Tarka-Embedding-350M-V1 in our blog post.
Try our demo: https://huggingface.co/spaces/Tarka-AIR/Tarka-Embedding
Model Details
Tarka-Embedding-350M-V1 has the following features:
- Model Type: Text Embedding
- Supported Languages: English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
- Number of Parameters: 350M
- Context Length: Supports up to 128K tokens; optimal performance is observed with inputs under 4K tokens
- Embedding Dimension: 1024
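As a quick, hedged sanity check of the numbers above (the model ID and loading arguments are taken from the Usage section below), the snippet encodes one sentence and prints the embedding shape, which should be 1024-dimensional.

from sentence_transformers import SentenceTransformer

# Quick check of the advertised embedding dimension.
model = SentenceTransformer("Tarka-AIR/Tarka-Embedding-350M-V1", trust_remote_code=True)
embedding = model.encode("A short sanity-check sentence.")
print(embedding.shape)  # expected: (1024,)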
Evaluation
MTEB (Eng v2)
| MTEB English / Models | Param. | Mean(Task) | Mean(Type) | Class. | Clust. | Pair Class. | Rerank. | Retri. | STS | Summ. |
|---|---|---|---|---|---|---|---|---|---|---|
| multilingual-e5-large-instruct | 0.6B | 65.53 | 61.21 | 75.54 | 49.89 | 86.24 | 48.74 | 53.47 | 84.72 | 29.89 |
| NV-Embed-v2 | 7.8B | 69.81 | 65.00 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| GritLM-7B | 7.2B | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.20 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
| stella_en_1.5B_v5 | 1.5B | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
| gte-Qwen2-7B-instruct | 7.6B | 70.72 | 65.77 | 88.52 | 58.97 | 85.9 | 50.47 | 58.09 | 82.69 | 35.74 |
| gemini-embedding-exp-03-07 | - | 73.3 | 67.67 | 90.05 | 59.39 | 87.7 | 48.59 | 64.35 | 85.29 | 38.28 |
| Tarka-Embedding-350M-V1 | 350M | 69.29 | 63.29 | 88.43 | 55.73 | 83.96 | 47.77 | 55.14 | 84.59 | 27.43 |
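For comparison or reproduction, scores of this kind are typically produced with the open-source mteb package. The sketch below is an assumption about a typical mteb workflow (task names and API calls depend on your installed version), not the exact setup used for the table above.

import mteb
from sentence_transformers import SentenceTransformer

# Sketch: evaluate the model on a couple of English MTEB tasks.
model = SentenceTransformer("Tarka-AIR/Tarka-Embedding-350M-V1", trust_remote_code=True)
tasks = mteb.get_tasks(tasks=["STSBenchmark", "Banking77Classification"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="mteb_results")
print(results)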
Usage
For the best performance, use Flash Attention 2 with bfloat16:
from sentence_transformers import SentenceTransformer
# We recommend enabling flash_attention_2 for better acceleration and memory savings.
model = SentenceTransformer(
    "Tarka-AIR/Tarka-Embedding-350M-V1",
    trust_remote_code=True,
    model_kwargs={
        "attn_implementation": "flash_attention_2",
        "device_map": "cuda",
        "torch_dtype": "bfloat16",
    },
    tokenizer_kwargs={"padding_side": "left"},
)
# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents. Note that queries benefit from using a prompt
# Here we use the prompt called "query" stored under `model.prompts`, but you can
# also pass your own prompt via the `prompt` argument
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.9177, 0.3923],
# [0.2975, 0.7631]])
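Building on the similarity matrix above, a common follow-up is to rank documents per query. The short sketch below reuses the objects defined in the snippet and simply picks the highest-scoring document for each query.

# Pick the best-matching document for each query from the similarity matrix.
best = similarity.argmax(dim=1)
for query, idx in zip(queries, best.tolist()):
    print(f"{query!r} -> {documents[idx]!r}")
# Expected: the capital question matches the Beijing sentence,
# and "Explain gravity" matches the gravity explanation.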
Acknowledgments
Special thanks to:
- The Qwen and LFM teams for providing the base model and foundational research.
Gratitude is also extended to the open-source community for creating the tools, frameworks, and datasets that enabled fine-tuning and evaluation of this model.
Disclaimer
The creator of this model is not responsible for any misuse, damages, or legal issues arising from the use of this model.
Citation
If you find this model useful, please consider citing it:
@misc{tarka_ai_research_2025,
  author    = {Tarka AI Research},
  title     = {Tarka-Embedding-350M-V1 (Revision f4b5de8)},
  year      = 2025,
  url       = {https://huggingface.co/Tarka-AIR/Tarka-Embedding-350M-V1},
  doi       = {10.57967/hf/6979},
  publisher = {Hugging Face}
}