Locus-to-Gene (L2G) Model

The locus-to-gene (L2G) model prioritises likely causal genes at each GWAS locus based on genetic and functional genomics features.

Model Description

The model is a gradient boosting classifier (XGBoost) trained to predict the causal gene at each GWAS locus.

Key Features:

  • Distance: distance from credible set variants to the gene
  • Molecular QTL Colocalization: evidence from expression/protein QTL studies
  • Chromatin Interaction: promoter-capture Hi-C data
  • Variant Pathogenicity: VEP (Variant Effect Predictor) scores
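
To make the distance feature concrete, the sketch below computes two hypothetical versions of it for a single gene: the minimum distance from any credible set variant to the gene's transcription start site (TSS), and a mean distance weighted by each variant's posterior probability. The function names, the input layout, and the use of the TSS as the reference point are assumptions made for this example, not the feature definitions used by the model.

```python
from typing import List, Tuple

# Each credible set variant is (genomic position, posterior probability).
# This layout is an assumption made for the example.
Variant = Tuple[int, float]


def min_distance_to_tss(variants: List[Variant], tss: int) -> int:
    """Smallest absolute distance (bp) from any credible set variant to the TSS."""
    return min(abs(pos - tss) for pos, _ in variants)


def weighted_mean_distance_to_tss(variants: List[Variant], tss: int) -> float:
    """Mean absolute distance to the TSS, weighted by posterior probability."""
    total_weight = sum(pp for _, pp in variants)
    if total_weight == 0:
        raise ValueError("posterior probabilities sum to zero")
    return sum(abs(pos - tss) * pp for pos, pp in variants) / total_weight


# Hypothetical credible set and TSS position, invented for illustration.
credible_set = [(1_000_500, 0.6), (1_002_000, 0.3), (1_010_000, 0.1)]
gene_tss = 1_000_000

print(min_distance_to_tss(credible_set, gene_tss))
print(weighted_mean_distance_to_tss(credible_set, gene_tss))
```

The weighted variant lets a distant but low-probability variant contribute less than a nearby high-probability one, which is one plausible reason to track both quantities.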

Intended Use

The model is intended to prioritise likely causal genes at GWAS loci for:

  • Drug target identification
  • Functional follow-up studies
  • Genetics and genomics research

Usage

from gentropy.method.l2g.model import LocusToGeneModel
from gentropy.common.session import Session

# Load model from Hugging Face Hub
session = Session()
model = LocusToGeneModel.load_from_hub(
    session=session,
    hf_model_id="opentargets/locus_to_gene"
)

# Make predictions on your L2G feature matrix.
# `your_feature_matrix` is a placeholder: build it beforehand for the loci you want to score.
predictions = model.predict(your_feature_matrix, session)
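
A common next step is to reduce the per-gene scores to one prioritised gene per locus. The sketch below does this on plain Python records that you have already extracted from the predictions. The field names `studyLocusId`, `geneId`, and `score`, the 0.5 threshold, and the helper `top_gene_per_locus` are assumptions for illustration; check them against the actual prediction schema before reusing this.

```python
from typing import Dict, Iterable, Optional


def top_gene_per_locus(
    rows: Iterable[dict], threshold: float = 0.5
) -> Dict[str, Optional[str]]:
    """Return the highest-scoring gene per locus, or None if its score is below the threshold."""
    best: Dict[str, dict] = {}
    for row in rows:
        locus = row["studyLocusId"]
        # Keep only the best-scoring row seen so far for each locus.
        if locus not in best or row["score"] > best[locus]["score"]:
            best[locus] = row
    return {
        locus: (row["geneId"] if row["score"] >= threshold else None)
        for locus, row in best.items()
    }


# Hypothetical scores, invented for the example.
rows = [
    {"studyLocusId": "locus_1", "geneId": "GENE_A", "score": 0.82},
    {"studyLocusId": "locus_1", "geneId": "GENE_B", "score": 0.11},
    {"studyLocusId": "locus_2", "geneId": "GENE_C", "score": 0.31},
]

print(top_gene_per_locus(rows))
```

Returning `None` for loci whose best score falls below the threshold keeps low-confidence loci visible instead of silently dropping them.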

Training

  • Architecture: XGBoost Gradient Boosting Classifier
  • Training Data: Curated positive/negative gene-locus pairs from Open Targets
  • Evaluation Metric: Area under precision-recall curve (AUCPR)
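
For readers unfamiliar with the evaluation metric, the sketch below computes average precision, one common estimator of the area under the precision-recall curve. It is a minimal pure-Python version that assumes binary labels and no tied scores; which estimator the training pipeline itself uses is not stated here, and the labels and scores are invented.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive example."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")

    # Rank examples from highest to lowest score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    precision_sum = 0.0
    for rank, index in enumerate(order, start=1):
        if labels[index] == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / n_positive


# Hypothetical gene-locus pairs: 1 = curated positive, 0 = negative.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.2]

print(average_precision(labels, scores))
```

Unlike ROC-based metrics, this score falls sharply when negatives outrank positives, which is why precision-recall metrics suit problems where positives are rare among the candidate genes at a locus.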

Limitations

  • Performance may vary across different ancestries and trait types
  • Requires functional genomics data (for example molecular QTL and chromatin interaction data) to compute the features
  • Limited to protein-coding genes with available feature data

Citation

If you use this model, please cite:

@article{mountjoy2021open,
title={An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci},
author={Mountjoy, Edward and Schmidt, Ellen M. and Carmona, Miguel and others},
journal={Nature Genetics},
volume={53},
pages={1527--1533},
year={2021},
doi={10.1038/s41588-021-00945-5}
}

More Information
