Bambara FastText Embeddings
Model Description
This model provides FastText word embeddings for the Bambara language (Bamanankan), a Mande language spoken primarily in Mali. The embeddings capture semantic relationships between Bambara words and enable various NLP tasks for this low-resource African language.
Model Type: FastText Word Embeddings
Language: Bambara (bm)
License: Apache 2.0
Model Details
Model Architecture
- Algorithm: FastText with subword information
- Vector Dimension: 300
- Vocabulary Size: 9,973 unique Bambara words
- Training Method: Skip-gram with negative sampling
- Subword Information: Character n-grams (enables handling of out-of-vocabulary words)
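The specification above fixes only the dimension (300), skip-gram with negative sampling, and character n-grams; other hyperparameters are not stated. The sketch below shows how a comparable model could be configured with gensim's FastText implementation. The window size, number of negative samples, n-gram range, epochs, and toy corpus are illustrative assumptions, not the settings actually used for this model.

```python
# Minimal sketch of a comparable FastText training setup with gensim.
# Only vector_size=300, skip-gram, negative sampling, and character n-grams
# are stated in the model card; the remaining values are assumptions.
from gensim.models import FastText

# `sentences` is assumed to be an iterable of tokenized Bambara sentences.
sentences = [["an", "ka", "taa"], ["i", "ni", "ce"]]

model = FastText(
    vector_size=300,   # embedding dimension (stated in the card)
    sg=1,              # skip-gram (stated in the card)
    negative=5,        # negative sampling; sample count is an assumption
    min_n=3, max_n=6,  # character n-gram range (assumed defaults)
    window=5,          # assumed
    min_count=1,       # assumed (kept low for the toy corpus)
)
model.build_vocab(corpus_iterable=sentences)
model.train(
    corpus_iterable=sentences,
    total_examples=model.corpus_count,
    epochs=5,          # assumed
)

print(model.wv["taa"].shape)  # (300,)
```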
Training Data
The model was trained on Bambara text corpora, building upon David Ifeoluwa Adelani's research on word embeddings for African languages.
Intended Use
This model is designed for:
- Semantic similarity tasks in Bambara
- Information retrieval for Bambara documents
- Cross-lingual research involving Bambara
- Cultural preservation and digital humanities projects
- Educational applications for Bambara language learning
- Foundation for downstream NLP tasks in Bambara
Usage
Official usage examples are coming soon.
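Until official examples are published, the sketch below shows how FastText embeddings distributed in the standard Facebook binary (.bin) format can typically be loaded and queried with gensim. The file name `bambara_fasttext.bin` and the query words are placeholders chosen for illustration, not the actual release artifacts.

```python
# Minimal sketch, assuming the embeddings are released as a standard
# FastText binary file; `bambara_fasttext.bin` is a hypothetical file name.
from gensim.models.fasttext import load_facebook_vectors

wv = load_facebook_vectors("bambara_fasttext.bin")

# Nearest neighbours for an in-vocabulary word
# ("muso" ~ "woman"; chosen only as an illustrative query).
print(wv.most_similar("muso", topn=5))

# Cosine similarity between two words.
print(wv.similarity("muso", "denmuso"))

# Out-of-vocabulary forms still receive vectors via character n-grams,
# e.g. the plural "musow" even if it is absent from the 9,973-word vocabulary.
print(wv["musow"].shape)  # (300,)
```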