
    Wei Dongqing's team at Shanghai Jiao Tong University develops Transformer-based peptide-HLA class I binding prediction and neoantigen vaccine sequence design

    • Last Update: 2022-04-26
    • Source: Internet
    • Author: User

    Recently, the internationally renowned journal Nature Machine Intelligence published online the research paper "A transformer-based model to predict peptide–HLA class I binding and optimize mutated peptides for vaccine design" by Wei Dongqing's team at the School of Life Science and Technology, Shanghai Jiao Tong University.


    Computational prediction of the binding between human leukocyte antigens (HLAs) and peptides (pHLA) could speed up epitope screening and vaccine design.


    TransPHLA is a Transformer-derived model designed to predict pHLA binding.


    Figure 1.


    1.


    The study covers 112 HLA alleles with peptide lengths ranging from 8 to 14, for a total of 366 HLA-peptide length combinations.




    Figure 2.



    Figure 3.


    2.


    The core idea of TransPHLA is the application of the self-attention mechanism.
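As a rough illustration of what self-attention computes over a peptide, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is not TransPHLA's architecture: the one-hot embedding and the identity projections (queries, keys and values all equal to the input) are simplifying assumptions made only for this example, and `one_hot` and `self_attention` are hypothetical helper names.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(peptide):
    """Hypothetical embedding: one 20-dimensional one-hot vector per residue."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in peptide]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (an assumption)."""
    dim = len(x[0])
    weights = []
    for query in x:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x]
        weights.append(softmax(scores))
    # Each output vector is the attention-weighted average of all input vectors.
    output = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(dim)]
        for row in weights
    ]
    return output, weights


peptide = "YLLVVVGAV"
output, weights = self_attention(one_hot(peptide))
```

Each row of `weights` sums to 1 and says how much one peptide position attends to every other position; a trained model learns projections that make these weights informative about binding.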


    Figure 4.


    3.


    The attention mechanism of TransPHLA provides biological interpretability.


    In addition, we analyzed how the amino acid type at each peptide position contributes to binding in positive samples and to non-binding in negative samples (Fig. 5b). pHLA binding and non-binding were found to be influenced by different components of the peptide. We therefore analyzed the effect of the 20 amino acids at different peptide positions on binding or non-binding for all 366 HLA-peptide length combinations. These results not only contribute to the understanding of the mechanism of pHLA binding, but also play a key role in the vaccine design of the AOMP program.
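One way to picture a per-position, per-amino-acid contribution table is to accumulate attention over many peptides. The sketch below does this in plain Python; the attention values are invented toy numbers, the normalization per position is an assumption, and `contribution_matrix` is a hypothetical helper, not the procedure used in the paper.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def contribution_matrix(samples, length):
    """Sum attention per (position, amino acid), then normalize each position.

    `samples` is a list of (peptide, per-position attention) pairs, all of the
    same peptide length. Returns one dict per position mapping amino acid to
    its share of the attention accumulated at that position.
    """
    totals = [defaultdict(float) for _ in range(length)]
    for peptide, attention in samples:
        if len(peptide) != length or len(attention) != length:
            raise ValueError("peptide and attention must match the given length")
        for pos, (aa, score) in enumerate(zip(peptide, attention)):
            totals[pos][aa] += score
    matrix = []
    for pos_totals in totals:
        position_sum = sum(pos_totals.values())
        matrix.append({
            aa: (pos_totals.get(aa, 0.0) / position_sum if position_sum else 0.0)
            for aa in AMINO_ACIDS
        })
    return matrix


# Toy attention profile (hypothetical values) emphasizing positions 2 and 9.
toy_attention = [0.05, 0.30, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.35]
samples = [("YLLVVVGAV", toy_attention), ("YLLVVVGAL", toy_attention)]
matrix = contribution_matrix(samples, length=9)
```

With these two toy peptides, position 2 is credited entirely to L, while position 9 is split evenly between V and L.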

    Since the attention scores represent the patterns of pHLA binding, they indicate which amino acid sites on the peptide sequence are important for binding or not binding to the target HLA. We visualized the binding patterns of 5 HLAs (Fig. 5c). As expected, TransPHLA found patterns of amino acid types at different peptide positions similar to those reported in previous studies. For HLA-A*11:01, TransPHLA recognizes the anchor residue K (Lys) at position 9 of the peptide. For HLA-B*40:01, TransPHLA successfully identified the important residues, namely E (Glu) at position 2 and L (Leu) at position 9. For HLA-B*57:03, hydrophobic residues often form the binding pocket, and TransPHLA captured this preference through L, F (Phe) and W (Trp) at position 9. For HLA-A*68:01, the structure 4HWZ (ref. 55) demonstrated that K and R (Arg) residues at position 9 of the peptide contribute significantly to binding. For HLA-B*44:02, the importance of E at position 2 has been demonstrated by the structure 1M6O (ref. 56). All these results are supported by previous studies and demonstrate the validity of our method.


    Figure 5. (a) Attention scores associated with all correctly predicted samples, correctly predicted positive samples, and correctly predicted negative samples. (b) Contribution of peptide amino acid type and peptide position to pHLA binding. (c) Cumulative attention scores for the peptide binders associated with each of 5 well-characterized HLA alleles. Brighter residues are considered more important for pHLA binding.


    4. AOMP program

    Based on the attention mechanism of TransPHLA, the AOMP program (Fig. 6) was developed for peptide vaccine design. When the user provides a source peptide and a target HLA, the AOMP program searches for mutant peptides that have higher affinity for the target HLA and differ from the source peptide at no more than 4 positions. This procedure ensures both the affinity of the mutant peptide for the target HLA and its homology to the source peptide, so that cross-immunity can be triggered.
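To give a sense of scale for the "no more than 4 mutation positions" constraint, the following sketch counts how many distinct mutants lie within that budget and checks whether a candidate respects it. It assumes substitutions among the 20 standard amino acids only; `count_mutants` and `hamming` are hypothetical helper names, and the example mutant uses the 8I substitution mentioned in the Figure 6 caption.

```python
from math import comb


def count_mutants(length, max_mutations=4, alphabet_size=20):
    """Number of distinct peptides differing from a source at 1..max_mutations sites."""
    return sum(
        comb(length, k) * (alphabet_size - 1) ** k
        for k in range(1, max_mutations + 1)
    )


def hamming(a, b):
    """Number of positions at which two equal-length peptides differ."""
    if len(a) != len(b):
        raise ValueError("peptides must have the same length")
    return sum(x != y for x, y in zip(a, b))


source = "DLLPETPW"
candidate = "DLLPETPI"  # W at position 8 replaced by I
within_budget = hamming(source, candidate) <= 4
search_space = count_mutants(len(source))
```

For a peptide of length 8 the budget already admits several million candidates, which is why a guided search is more practical than scoring every possible mutant.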

    On the one hand, for each of the 366 HLA-peptide length combinations, the study established a binding contribution matrix covering the 20 amino acids at each peptide position. To accommodate new or unknown HLA-peptide length combinations, the study also established a general binding contribution matrix. On the other hand, when a pHLA pair is predicted to have relatively weak affinity, the attention scores obtained by TransPHLA are used to calculate the contribution matrix of each amino acid site on the peptide.

    Two contribution rate matrices were then calculated from these two contribution matrices; the larger an element's value, the more critical the corresponding amino acid site is to binding or non-binding. Intuitively, if the amino acids at certain positions contribute more to the predicted non-binding, replacing them with amino acids predicted to contribute more to binding should make the mutant peptide more likely to have higher affinity for the target HLA. Based on these four matrices, four strategies were designed to generate mutant peptides (Fig. 6). The main idea is to compare the amino acid sites on the source peptide that most strongly drive weak affinity with the amino acid sites that most strongly drive high affinity for the target HLA and peptide length. The corresponding amino acid substitutions are then made according to the comparison. The process is as follows: (1) predict the binding score of the source peptide and the target HLA; (2) find the most important amino acid sites based on the self-attention mechanism; (3) replace the amino acids at these weak sites with amino acids that contribute more to high-affinity pHLA binding; (4) select the best mutation candidates for evaluation.
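The four steps above can be sketched as a single greedy loop. This is a simplified illustration, not one of the four AOMP strategies: the per-site non-binding weights, the binding preference table and `toy_score` (standing in for a TransPHLA prediction) are all hypothetical values and names invented for the example.

```python
def toy_score(peptide, binding_pref):
    """Step 1 stand-in: sum of per-position binding preferences (not TransPHLA)."""
    return sum(binding_pref[i].get(aa, 0.0) for i, aa in enumerate(peptide))


def propose_mutants(source, non_binding_weight, binding_pref, max_mutations=4):
    """Steps 2-3: mutate the sites most associated with non-binding, one per round."""
    # Step 2: rank sites by how strongly they are associated with non-binding.
    ranked = sorted(range(len(source)), key=lambda i: non_binding_weight[i], reverse=True)
    current = list(source)
    mutants = []
    for pos in ranked:
        if len(mutants) == max_mutations:
            break
        # Step 3: substitute the amino acid with the highest binding preference.
        best = max(binding_pref[pos], key=binding_pref[pos].get)
        if best == current[pos]:
            continue
        current[pos] = best
        mutants.append(("".join(current), f"{pos + 1}{best}"))
    return mutants


source = "DLLPETPW"
# Hypothetical attention-derived weights: positions 2 and 8 drive non-binding.
non_binding_weight = [0.05, 0.30, 0.05, 0.05, 0.05, 0.05, 0.05, 0.40]
# Hypothetical preference table; unlisted positions simply prefer the source residue.
binding_pref = [{aa: 1.0} for aa in source]
binding_pref[1] = {"A": 0.5, "P": 0.4, "L": 0.1}
binding_pref[7] = {"I": 0.6, "V": 0.3, "W": 0.1}

mutants = propose_mutants(source, non_binding_weight, binding_pref)
# Step 4: keep the candidate with the best (toy) score for further evaluation.
best_peptide, best_label = max(mutants, key=lambda m: toy_score(m[0], binding_pref))
```

Each successive candidate carries one more substitution than the previous one, so the mutation budget bounds both the number of candidates and their distance from the source peptide.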

    Figure 6. Workflow of the AOMP program, using the peptide DLLPETPW and HLA-B*51:01 as an example. In the bottom two sub-figures, labels such as 8I indicate that the 8th amino acid (W) of the peptide obtained at the previous level is replaced with amino acid I.

    5. Molecular dynamics simulation

    Based on the reported X-ray crystal structures of allele-specific HLA molecules, this study further validated the effectiveness of TransPHLA and the AOMP program using molecular dynamics (MD) simulations. According to the results, (a) the attention patterns learned by TransPHLA are consistent with the structure of the pHLA complex, and (b) the predictions of TransPHLA are consistent with the results of the MD simulations and of the IEDB-recommended NetMHCpan_BA method.

    In this study, HLA-A*02:01 was selected as the target HLA molecule because it is a high-frequency allele and multiple structures of HLA-A*02:01 in complex with peptides are available in the PDB database, which provides sufficient data support for MD. KRAS mutation is a driver of tumorigenesis and tumor development, and the mutation sites of KRAS are relatively conserved: G12 mutations account for 83% of all mutations in this gene. Among the G12 mutations, G12D has the highest frequency (41%), followed by G12V (28%) and G12C (14%). Therefore, a peptide of length 9 containing G12 was selected as the source peptide for this study.
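Candidate 9-mers containing G12 can be listed with a sliding window. In the sketch below, the 20-residue wild-type KRAS N-terminal sequence is supplied for illustration and does not appear in the article, and `windows_containing` is a hypothetical helper; the article does not state how the source peptide was chosen among such windows.

```python
# Wild-type KRAS residues 1-20 (supplied here as an assumption for the example).
KRAS_N_TERMINUS = "MTEYKLVVVGAGGVGKSALT"


def windows_containing(sequence, site, length=9):
    """Return (start, peptide) for every window of `length` covering 1-based `site`."""
    windows = []
    for start in range(max(1, site - length + 1), site + 1):
        end = start + length - 1
        if end > len(sequence):
            break
        windows.append((start, sequence[start - 1:end]))
    return windows


candidates = windows_containing(KRAS_N_TERMINUS, site=12)
peptides = [peptide for _, peptide in candidates]
```

The window starting at residue 4 is YKLVVVGAG, the source peptide analyzed later in the article, with G12 as its C-terminal residue.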

    For source peptides that TransPHLA predicted not to bind the target HLA, AOMP was used to generate a series of mutant peptides. Mutant peptides with only two changed sites that were predicted to bind were then selected for MD. Based on the structure of HLA-A*02:01 (PDB: 1HHK), molecular dynamics models of HLA-A*02:01 with each peptide were constructed; the peptides comprise the source peptide and the selected mutant peptides. The MD simulations showed that the mutant peptides bind significantly more strongly than the source peptide, consistent with the predictions of TransPHLA and NetMHCpan_BA.

    Moreover, many studies have demonstrated that the key binding sites for HLA-A*02:01 are the N-terminus (i.e., position 1 or P1), the second position (P2) and the C-terminus (P9). The X-ray crystal structure of HLA-A*02:01 in complex with a peptide of length 9 also shows that the amino acids at the P2 and P9 anchor sites can form hydrogen bonds with HLA side chains. Figure 7 demonstrates the effectiveness of the attention mechanism of TransPHLA for HLA-A*02:01 and peptides of length 9. The figure shows that L at position 2 (2L) and L or V at position 9 (9L or 9V) are the key amino acids for peptide binding to HLA, consistent with the existing literature. In addition, the source peptide YKLVVVGAG and two mutant peptides derived from it, YLLVVVGAV and YLLVVVGAL, were analyzed. Figures 8 and 9 show the MD simulation results for these three peptides with HLA-A*02:01. The results confirm that the source peptide has weaker affinity for HLA-A*02:01: Figure 8a shows that the source peptide forms no hydrogen bonds with HLA, and Figure 9a shows that the source peptide lies far from the HLA binding groove. In contrast, Figures 8b,c and 9b,c show that the mutant peptides form multiple hydrogen bonds with HLA side chains, which promotes their binding to HLA.
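The relationship between the source peptide and its two mutants can be checked directly. The short sketch below lists the substitutions in the position-plus-residue notation used in the Figure 6 caption (such as 8I); `mutation_labels` is a hypothetical helper name.

```python
def mutation_labels(source, mutant):
    """List substitutions as '<1-based position><new residue>', e.g. '2L'."""
    if len(source) != len(mutant):
        raise ValueError("peptides must have the same length")
    return [
        f"{position}{new}"
        for position, (old, new) in enumerate(zip(source, mutant), start=1)
        if old != new
    ]


source = "YKLVVVGAG"
labels = {mutant: mutation_labels(source, mutant) for mutant in ("YLLVVVGAV", "YLLVVVGAL")}
```

Both mutants differ from the source at exactly two sites, P2 and P9, and the substituted residues are the 2L and 9V or 9L anchors highlighted in Figure 7.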

    Figure 7. Attention mechanism of TransPHLA for HLA-A*02:01 and peptides of length 9.

    Figure 8. 2D structures of the peptides and HLA-A*02:01 from molecular dynamics simulation. Hydrogen bonds are shown as yellow dotted lines.


    Figure 9. 3D structures of the peptides and HLA-A*02:01 from molecular dynamics simulation. The source peptide chain in (a) is shown as purplish-red coiled lines and hydrogen bonds are shown as yellow dashed lines.




