SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-MiniLM-L6-v2
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
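
Because the final Normalize() module scales every embedding to unit length, cosine similarity and a plain dot product produce identical scores. The following is a minimal, illustrative check (not part of the original card) that assumes the model is loaded from the Hub as joeportnoy/resume-match-ml:

import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative check: the trailing Normalize() module makes each embedding
# unit-length, so the dot product equals the cosine similarity.
model = SentenceTransformer("joeportnoy/resume-match-ml")
emb = model.encode(["python developer", "data analyst role"])

print(np.linalg.norm(emb, axis=1))  # norms are ~1.0 due to Normalize()
print(emb @ emb.T)                  # same values as the cosine similarity matrix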

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("joeportnoy/resume-match-ml")
# Run inference
sentences = [
    'machine learning enthusiast. motivated to learn, grow and excel my experience by challenging myself. machine learning text analytics software development data analysis python java javascript matplotlib',
    "proposal & technical marketing analyst – 100% remote  overview: our client, a us fortune 250 company and a global medical technology corporation serving customers in clinical labs, health care research & pharmaceutical industry, seeks an accomplished proposal & technical marketing analyst – 100% remote.\n\nposition: proposal & technical marketing analyst – 100% remotelocation: remoteduration: 3-6 months+ temp-to-hire!\npay rate: $50/ hr\njob description: the technical marketing resource is responsible for the creation of it, security, and technical content, with the ability to manage and prioritize multiple projects, and work with cross-functional teams in accordance with the highest quality standards for commercial excellence. this role supports the it needs of the global marketing and us region sales and hit teams for responses to technical questions related to the mms solutions including alaris, pyxis, and healthsight platforms. requests may be in the form of full technical assessments, rfps, contract language, general customer questions or other technical support as needed.bachelor’s degree preferably in a technical field, or equivalent experiencetechnical it trainingknowledge of mms products preferredproject management experiencedemonstrated ability to multitask and balance competing priorities to meet deadlinesexcellent writing skills, the ability to assimilate technical information from various sources into an accurate, customer facing responsedriven by desire to assist and supportstrong customer centric perspective\nthe position reports to the senior manager, technical marketing, us region mms, and will require close cross functional collaboration with the us region rfp team, r&d, technical product management, product security, information security, data privacy, contracts, legal and gcs technical teams.\nresponsibilities:have in-depth knowledge of the function, features and technical requirements for implementation of mms solutions.coordinates and completes the writing of technical responses for incoming it and technical questions from rfps, contracts, general customer questions or security auditsdemonstrate strong project management skills to prioritize, triage, complete and track requests appropriately.develop strong internal relationships with the technical subject matter experts (smes) including the platform development teams and global marketing teams.maintain training to remain current with knowledge depth on new versions and product releases.consistently provide timely and accurate responses to avoid undue risk or lost opportunities for the company.collaborates with the hit team to deliver technical training to support mms solutions and product releasesreviews product release content from platform and commercial teams to ensure technical details are complete to support it customer conversationsworks cross-functionally with product and information security to ensure product security whitepapers security are current and available with product releasesother responsibilities as assigned.\n\n\ni'd love to talk to you if you think this position is right up your alley, and assure a prompt communication, whichever direction. if you are looking for rewarding employment and a company that puts its employees first, we'd like to work with you. \nrecruiter name: gurjant “gary” singhtitle: sr. recruiterphone: 925-297-5994\n\ncompany overview:\namerit consulting is an extremely fast-growing staffing and consulting firm. 
amerit consulting was founded in 2002 to provide consulting, temporary staffing, direct hire, and payrolling services to fortune 500 companies nationally: as well as small to mid-sized organizations on a local & regional level. currently, amerit has over 2,000 employees in 47 states. we develop and implement solutions that help our clients operate more efficiently, deliver greater customer satisfaction, and see a positive impact on their bottom line. we create value by bringing together the right people to achieve results. our clients and employees say they choose to work with amerit because of how we work with them - with service that exceeds their expectations and a personal commitment to their success. our deep expertise in human capital management has fueled our expansion into direct hire placements, temporary staffing, contract placements, and additional staffing and consulting services that propel our clients’ businesses forward.\namerit consulting provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.this policy applies to all terms and conditions of employment, including recruiting, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.applicants, with criminal histories, are considered in a manner that is consistent with local, state and federal laws.",
    "data analyst  job title: data analystduration: contractlocation: queens-ny (day 1 onsite)-- need only locals \n\nrole descriptionthis is a contract role for a data analyst. as a data analyst, you will be responsible for collecting, analyzing, and interpreting complex data sets. you will work with various departments to identify patterns, trends, and insights that can be used to improve business operations. this is an on-site role located in queens, ny.\nqualificationsanalytical skills, including the ability to collect, organize, analyze and disseminate significant amounts of information with attention to detail and accuracydata analytics and statistics skills, including experience with statistical analysis software and data visualization toolsexcellent communication skills, including the ability to explain technical concepts to non-technical stakeholders and present findings to both technical and non-technical audiencesdata modeling skills, including the ability to develop and maintain complex data models and schemasa bachelor's degree in computer science, mathematics, statistics or related fieldexperience with cybersecurity, blockchain, or financial services industries is a plusexperience with sql, python, or r programming languages is preferred",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
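
Since the model was fine-tuned on resume/job-posting pairs, a common follow-up is ranking postings against a single resume. Below is a short sketch using sentence_transformers.util.semantic_search; the resume and job texts are made-up placeholders, not taken from the training data.

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("joeportnoy/resume-match-ml")

# Placeholder texts for illustration only
resume = "machine learning engineer with python, sql and spark experience"
jobs = [
    "data analyst role focused on sql reporting and dashboards",
    "machine learning engineer building recommendation systems in python",
    "technical marketing analyst writing rfp responses",
]

resume_emb = model.encode(resume, convert_to_tensor=True)
job_embs = model.encode(jobs, convert_to_tensor=True)

# semantic_search scores each job against the resume with cosine similarity
# and returns the top_k hits, sorted by decreasing score.
hits = util.semantic_search(resume_emb, job_embs, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {jobs[hit['corpus_id']]}")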

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,316 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 26 tokens, mean 70.99 tokens, max 256 tokens
    • sentence_1 (string): min 27 tokens, mean 240.47 tokens, max 256 tokens
    • label (float): min 0.0, mean 0.51, max 1.0
  • Samples:
    • Sample 1:
      • sentence_0: big data analytics working and database warehouse manager with robust experience in handling all kinds of data. i have also used multiple cloud infrastructure services and am well acquainted with them. currently in search of role that offers more of development. big data hadoop hive python mapreduce spark java machine learning cloud hdfs yarn core java data science c++ data structures dbms rdbms informatica talend amazon redshift microsoft azure
      • sentence_1: data analyst about ascendion:ascendion is a full-service digital engineering solutions company. we make and manage software platforms and products that power growth and deliver captivating experiences to consumers and employees. our engineering, cloud, data, experience design, and talent solution capabilities accelerate transformation and impact for enterprise clients. headquartered in new jersey, our workforce of 6,000+ ascenders delivers solutions from around the globe. ascendion is built differently to engineer the next. ascendion engineering to elevate life we have a culture built on opportunity, inclusion, and a spirit of partnership. come, change the world with us:build the coolest tech for world’s leading brandssolve complex problems – and learn new skillsexperience the power of transforming digital engineering for fortune 500 clientsmaster your craft with leading training programs and hands-on experience experience a community of change makers! join a culture of high-perform...
    • Sample 2:
      • sentence_0: passionate data engineer with a zeal to excel forward and produce promising outcomes. hands-on knowledge in various machine learning algorithms. also experienced in data mining, warehouse maintaining, and visualization. in search of a role where i can nurture my skills and contribute to the growth of the company and myself. data science data analysis data structures data warehousing data visualization data mining machine learning artificial intelligence linear regression mongodb python django java sql c++
      • sentence_1: senior marketing analyst as a senior marketing analyst, you will support our suite of brands digital marketing efforts to drive business goals from strategy to tactical execution. these brands include decksdirect, diy, sc&r and eglass. the ideal candidate thrives on creating business impact through understanding and achieving goals. key responsibility areas: synthesize business goals into achievable digital metrics to develop cross-channel digital marketing campaign strategies for both b2c and b2b customer segmentsassess metrics and analyze data for opportunities to improve digital marketing performance as outlined by kpi targetscreate and present detailed reports and dashboards that showcase progress and outcomes of campaigns qualifications: bachelor’s degree in marketing or related field3-4 years of digital experience with a history of producing quantifiable results in digital, including but not limited to demand generation, inbound marketing, digital advertising, sem, paid social...
      • label: 0.0
    • Sample 3:
      • sentence_0: i am currently looking for an opportunity as a data science or machine learning engineer. business analyst data modeling business intelligence erp implementation html requirement gathering power bi project management
      • sentence_1: data analyst are you looking to get your foot in the door with a nationally recognized organization based in the indianapolis area? don't miss your chance and apply now! must be local to indianapolis****2 days in the office, 3 days wfh what you will be doing as the data analyst:utilize sql and snowflake to pull reports and queriesunderstand the day-to-day issues that our business faces, which can be better understood with datadesign and run queries, generate reports, and assist the people in organization interpret and use data.compile and analyze data related to business' issuesdevelop clear visualizations to convey complicated data in a straightforward fashion what you will need as the data analyst: bachelor's degree in statistics or applied mathematics or equivalent experience2+ years data analysis experienceproficient in sqlproficient in snowflakeexperience using visualization toolsexperience using bi toolsfinancial services background experience is a plus
      • label: 0.0
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • multi_dataset_batch_sampler: round_robin
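
For reference, here is a minimal sketch of how a comparable fine-tuning run could be set up with the loss and non-default hyperparameters listed above. The column names match the dataset description (sentence_0, sentence_1, label); the rows below are toy placeholders, since the actual training data and script are not published with this card.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import CosineSimilarityLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Toy stand-in for the (unpublished) resume/job-posting pairs
train_dataset = Dataset.from_dict({
    "sentence_0": ["resume summary text", "another resume summary"],
    "sentence_1": ["job posting text", "another job posting"],
    "label": [1.0, 0.0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="resume-match-ml",
    num_train_epochs=1,
    per_device_train_batch_size=16,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    # CosineSimilarityLoss regresses the cosine score toward the label,
    # using torch.nn.MSELoss by default (as reported above).
    loss=CosineSimilarityLoss(model),
)
trainer.train()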

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.7.0
  • Accelerate: 1.7.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}