SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
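
Because the final Normalize() module scales every embedding to unit length, cosine similarity and a plain dot product produce identical scores. The following is a minimal, illustrative check (not part of the original card) that assumes the model is loaded from the Hub as joeportnoy/resume-match-ml:

import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative check: the trailing Normalize() module makes each embedding
# unit-length, so the dot product equals the cosine similarity.
model = SentenceTransformer("joeportnoy/resume-match-ml")
emb = model.encode(["python developer", "data analyst role"])

print(np.linalg.norm(emb, axis=1))  # norms are ~1.0 due to Normalize()
print(emb @ emb.T)                  # same values as the cosine similarity matrix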

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("joeportnoy/resume-match-ml")
# Run inference
sentences = [
    'machine learning enthusiast. motivated to learn, grow and excel my experience by challenging myself. machine learning text analytics software development data analysis python java javascript matplotlib',
    "proposal & technical marketing analyst – 100% remote  overview: our client, a us fortune 250 company and a global medical technology corporation serving customers in clinical labs, health care research & pharmaceutical industry, seeks an accomplished proposal & technical marketing analyst – 100% remote.\n\nposition: proposal & technical marketing analyst – 100% remotelocation: remoteduration: 3-6 months+ temp-to-hire!\npay rate: $50/ hr\njob description: the technical marketing resource is responsible for the creation of it, security, and technical content, with the ability to manage and prioritize multiple projects, and work with cross-functional teams in accordance with the highest quality standards for commercial excellence. this role supports the it needs of the global marketing and us region sales and hit teams for responses to technical questions related to the mms solutions including alaris, pyxis, and healthsight platforms. requests may be in the form of full technical assessments, rfps, contract language, general customer questions or other technical support as needed.bachelor’s degree preferably in a technical field, or equivalent experiencetechnical it trainingknowledge of mms products preferredproject management experiencedemonstrated ability to multitask and balance competing priorities to meet deadlinesexcellent writing skills, the ability to assimilate technical information from various sources into an accurate, customer facing responsedriven by desire to assist and supportstrong customer centric perspective\nthe position reports to the senior manager, technical marketing, us region mms, and will require close cross functional collaboration with the us region rfp team, r&d, technical product management, product security, information security, data privacy, contracts, legal and gcs technical teams.\nresponsibilities:have in-depth knowledge of the function, features and technical requirements for implementation of mms solutions.coordinates and completes the writing of technical responses for incoming it and technical questions from rfps, contracts, general customer questions or security auditsdemonstrate strong project management skills to prioritize, triage, complete and track requests appropriately.develop strong internal relationships with the technical subject matter experts (smes) including the platform development teams and global marketing teams.maintain training to remain current with knowledge depth on new versions and product releases.consistently provide timely and accurate responses to avoid undue risk or lost opportunities for the company.collaborates with the hit team to deliver technical training to support mms solutions and product releasesreviews product release content from platform and commercial teams to ensure technical details are complete to support it customer conversationsworks cross-functionally with product and information security to ensure product security whitepapers security are current and available with product releasesother responsibilities as assigned.\n\n\ni'd love to talk to you if you think this position is right up your alley, and assure a prompt communication, whichever direction. if you are looking for rewarding employment and a company that puts its employees first, we'd like to work with you. \nrecruiter name: gurjant “gary” singhtitle: sr. recruiterphone: 925-297-5994\n\ncompany overview:\namerit consulting is an extremely fast-growing staffing and consulting firm. 
amerit consulting was founded in 2002 to provide consulting, temporary staffing, direct hire, and payrolling services to fortune 500 companies nationally: as well as small to mid-sized organizations on a local & regional level. currently, amerit has over 2,000 employees in 47 states. we develop and implement solutions that help our clients operate more efficiently, deliver greater customer satisfaction, and see a positive impact on their bottom line. we create value by bringing together the right people to achieve results. our clients and employees say they choose to work with amerit because of how we work with them - with service that exceeds their expectations and a personal commitment to their success. our deep expertise in human capital management has fueled our expansion into direct hire placements, temporary staffing, contract placements, and additional staffing and consulting services that propel our clients’ businesses forward.\namerit consulting provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.this policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.applicants, with criminal histories, are considered in a manner that is consistent with local, state and federal laws.",
    "data analyst  job title: data analystduration: contractlocation: queens-ny (day 1 onsite)-- need only locals \n\nrole descriptionthis is a contract role for a data analyst. as a data analyst, you will be responsible for collecting, analyzing, and interpreting complex data sets. you will work with various departments to identify patterns, trends, and insights that can be used to improve business operations. this is an on-site role located in queens, ny.\nqualificationsanalytical skills, including the ability to collect, organize, analyze and disseminate significant amounts of information with attention to detail and accuracydata analytics and statistics skills, including experience with statistical analysis software and data visualization toolsexcellent communication skills, including the ability to explain technical concepts to non-technical stakeholders and present findings to both technical and non-technical audiencesdata modeling skills, including the ability to develop and maintain complex data models and schemasa bachelor's degree in computer science, mathematics, statistics or related fieldexperience with cybersecurity, blockchain, or financial services industries is a plusexperience with sql, python, or r programming languages is preferred",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
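
Since the model was fine-tuned on resume/job-posting pairs, a common follow-up is ranking postings against a single resume. Below is a short sketch using sentence_transformers.util.semantic_search; the resume and job texts are made-up placeholders, not taken from the training data.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("joeportnoy/resume-match-ml")

# Placeholder texts for illustration only
resume = "machine learning engineer with python, sql and spark experience"
jobs = [
    "data analyst role focused on sql reporting and dashboards",
    "machine learning engineer building recommendation systems in python",
    "technical marketing analyst writing rfp responses",
]

resume_emb = model.encode(resume, convert_to_tensor=True)
job_embs = model.encode(jobs, convert_to_tensor=True)

# semantic_search scores each job against the resume with cosine similarity
# and returns the top_k hits, sorted by decreasing score.
hits = util.semantic_search(resume_emb, job_embs, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {jobs[hit['corpus_id']]}")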

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,316 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 26 tokens, mean 70.99 tokens, max 256 tokens
    • sentence_1 (string): min 27 tokens, mean 240.47 tokens, max 256 tokens
    • label (float): min 0.0, mean 0.51, max 1.0
  • Samples:
    • Sample 1:
      • sentence_0: big data analytics working and database warehouse manager with robust experience in handling all kinds of data. i have also used multiple cloud infrastructure services and am well acquainted with them. currently in search of role that offers more of development. big data hadoop hive python mapreduce spark java machine learning cloud hdfs yarn core java data science c++ data structures dbms rdbms informatica talend amazon redshift microsoft azure
      • sentence_1: data analyst about ascendion:ascendion is a full-service digital engineering solutions company. we make and manage software platforms and products that power growth and deliver captivating experiences to consumers and employees. our engineering, cloud, data, experience design, and talent solution capabilities accelerate transformation and impact for enterprise clients. headquartered in new jersey, our workforce of 6,000+ ascenders delivers solutions from around the globe. ascendion is built differently to engineer the next. ascendion engineering to elevate life we have a culture built on opportunity, inclusion, and a spirit of partnership. come, change the world with us:build the coolest tech for world’s leading brandssolve complex problems – and learn new skillsexperience the power of transforming digital engineering for fortune 500 clientsmaster your craft with leading training programs and hands-on experience experience a community of change makers! join a culture of high-perform...
    • Sample 2:
      • sentence_0: passionate data engineer with a zeal to excel forward and produce promising outcomes. hands-on knowledge in various machine learning algorithms. also experienced in data mining, warehouse maintaining, and visualization. in search of a role where i can nurture my skills and contribute to the growth of the company and myself. data science data analysis data structures data warehousing data visualization data mining machine learning artificial intelligence linear regression mongodb python django java sql c++
      • sentence_1: senior marketing analyst as a senior marketing analyst, you will support our suite of brands digital marketing efforts to drive business goals from strategy to tactical execution. these brands include decksdirect, diy, sc&r and eglass. the ideal candidate thrives on creating business impact through understanding and achieving goals. key responsibility areas: synthesize business goals into achievable digital metrics to develop cross-channel digital marketing campaign strategies for both b2c and b2b customer segmentsassess metrics and analyze data for opportunities to improve digital marketing performance as outlined by kpi targetscreate and present detailed reports and dashboards that showcase progress and outcomes of campaigns qualifications: bachelor’s degree in marketing or related field3-4 years of digital experience with a history of producing quantifiable results in digital, including but not limited to demand generation, inbound marketing, digital advertising, sem, paid social...
      • label: 0.0
    • Sample 3:
      • sentence_0: i am currently looking for an opportunity as a data science or machine learning engineer. business analyst data modeling business intelligence erp implementation html requirement gathering power bi project management
      • sentence_1: data analyst are you looking to get your foot in the door with a nationally recognized organization based in the indianapolis area? don't miss your chance and apply now! must be local to indianapolis****2 days in the office, 3 days wfh what you will be doing as the data analyst:utilize sql and snowflake to pull reports and queriesunderstand the day-to-day issues that our business faces, which can be better understood with datadesign and run queries, generate reports, and assist the people in organization interpret and use data.compile and analyze data related to business' issuesdevelop clear visualizations to convey complicated data in a straightforward fashion what you will need as the data analyst: bachelor's degree in statistics or applied mathematics or equivalent experience2+ years data analysis experienceproficient in sqlproficient in snowflakeexperience using visualization toolsexperience using bi toolsfinancial services background experience is a plus
      • label: 0.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
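
For reference, here is a minimal sketch of how a comparable fine-tuning run could be set up with the loss and non-default hyperparameters listed above. The column names match the dataset description (sentence_0, sentence_1, label); the rows below are toy placeholders, since the actual training data and script are not published with this card.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy stand-in for the (unpublished) resume/job-posting pairs
train_dataset = Dataset.from_dict({
    "sentence_0": ["resume summary text", "another resume summary"],
    "sentence_1": ["job posting text", "another job posting"],
    "label": [1.0, 0.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="resume-match-ml",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    # CosineSimilarityLoss regresses the cosine score toward the label,
    # using torch.nn.MSELoss by default (as reported above).
    loss=CosineSimilarityLoss(model),
)
trainer.train()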

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.0
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}