Problem
yassai_identifier(c(V="TRAV14N-3", J="TRAJ16", dna="GCAACTTCAAGTGGCCAGAAGCTGGTT", pep="ATSSGQKLV")) # takes ages
yassai_identifier(c(V="TRAV14N-3", J="TRAJ16", dna="GCAGCAACCGCAACTTCAAGTGGCCAGAAGCTGGTT", pep="AATATSSGQKLV")) # returns immediately
## [1] "aTa.2A14N3A16L12"
Troubleshooting
The problematic clonotype comes from read GWYEMNS10GHZUF.
@GWYEMNS10GHZUF
gactAACTGGTACACAGCAGGTTCTGGGTTCTGGATGTGCAGGTACACCTTTAATATGGTCCCCTGGCCAAAAACCAGCTTCTGGCCACTTGAAGTTGCACAGAAGTAGGTGGCTGAGTCTCCAGGCTGAGAGTCTTTGATGTGCAAGGAGAGATTTTTCTCCCTTTTATTGAAGAAGATTGTGAATCGTCCATCTTCCTTTTTATCGGACACTGAACGTATGGCTATCAGGAGAGCAGGGCCTTCCCCAGGGAACTGCTGGTACCATGGGAAGTAGTCAAAAGCACTGTTCTCAtaactagcagttcagaattgcggtctctccttccccagactgtcagagattggagtcgtgggagacaaggcacacaggggataggngngnnnnnnnnnnnnn
+
FFF::;;FFFFFFFFIIIIHFFFHF666=<FFGGDDDFFHFDIIIIIIIHHHIIIHFFFD54449662200001335>>AAABDFDDDFDDDFFFFFFFFFFFFFFFFFCCCCCFFFFFFFFFFFFDDBBBBB==110?@@@@@>4455BBA??3357;F4:::;;==D88?AA<;:==BAAAA==??AAB@AAA=>>>222//02A8898BDDFFFDDFFFFFF666:DBAAAABB:::???DD;;;;D?????DDBA?<<<><<<<000444<8993222233393...:663322<9233776899:;;963326755551111,,,,3313335333799.....23322666726;;;66772222,,,,47400!,!,!!!!!!!!!!!!!
Here is the alignment made by clonotypeR
GWYEMNS10GHZUF 16 TRAV14N-3 105 17 49S17M1I29M1I201M99S * 0 0 NNNNNNNNNNNNNCNCNCCTATCCCCTGTGTGCCTTGTCTCCCACGACTCCAATCTCTGACAGTCTGGGGAAGGAGAGACCGCAATTCTGAACTGCTAGTTATGAGAACAGTGCTTTTGACTACTTCCCATGGTACCAGCAGTTCCCTGGGGAAGGCCCTGCTCTCCTGATAGCCATACGTTCAGTGTCCGATAAAAAGGAAGATGGACGATTCACAATCTTCTTCAATAAAAGGGAGAAAAATCTCTCCTTGCACATCAAAGACTCTCAGCCTGGAGACTCAGCCACCTACTTCTGTGCAACTTCAAGTGGCCAGAAGCTGGTTTTTGGCCAGGGGACCATATTAAAGGTGTACCTGCACATCCAGAACCCAGAACCTGCTGTGTACCAGTTAGTC !!!!!!!!!!!!!,!,!00474,,,,22227766;;;62766622332.....9973335333133,,,,111155557623369;;:9986773329<223366:...3933322223998<444000<<<<><<<?ABDD?????D;;;;DD???:::BBAAAABD:666FFFFFFDDFFFDDB8988A20//222>>>=AAA@BAA??==AAAAB==:;<AA?88D==;;:::4F;7533??ABB5544>@@@@@?011==BBBBBDDFFFFFFFFFFFFCCCCCFFFFFFFFFFFFFFFFFDDDFDDDFDBAAA>>53310000226694445DFFFHIIIHHHIIIIIIIDFHFFDDDGGFF<=666FHFFFHIIIIFFFFFFFF;;::FFF AS:i:233 XS:i:218 XF:i:3 XE:i:4 XN:i:0
The problem was that the V segment was completely digested in favor of the J segment: in the alignment below, TRAJ16 alone covers the whole CDR3.
                351                                            400
GWYEMNS10GHZUF  CTGTGCAACTTCAAGTGGCCAGAAGCTGGTTTTTGGCCAGGGGACCATAT
TRAJ16          ~~~~GCAACTTCAAGTGGCCAGAAGCTGGTTTTTGGCCAGGGGACCATAT
TRAV14N-3       CTGTGCAGC...AAGTG~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

                401                                            450
GWYEMNS10GHZUF  TAAAGGTGTACCTGCACATCCAGAACCCAGAACCTGCTGTGTACCAGTTA
TRAJ16          TAAAGGTGTACCTGC~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TRAV14N-3       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
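The observation above can be checked with a few lines of base R. This is only a sketch for illustration, not how clonotypeR itself decides what is germline: the TRAJ16 bases are copied from the alignment shown here (gap characters removed), and the variable names are hypothetical.

```r
# CDR3 of the problematic clonotype and of the one that worked (from the Problem section).
cdr3_slow <- "GCAACTTCAAGTGGCCAGAAGCTGGTT"
cdr3_fast <- "GCAGCAACCGCAACTTCAAGTGGCCAGAAGCTGGTT"

# TRAJ16 bases as displayed in the alignment (assumption: copied by hand, gaps dropped).
traj16 <- paste0("GCAACTTCAAGTGGCCAGAAGCTGGTTTTTGGCCAGGGGACCATAT",
                 "TAAAGGTGTACCTGC")

# TRUE when the J segment alone accounts for the whole CDR3,
# that is, no base is left over for the V segment.
slow_is_all_j <- startsWith(traj16, cdr3_slow)
fast_is_all_j <- startsWith(traj16, cdr3_fast)

print(c(slow = slow_is_all_j, fast = fast_is_all_j))
```

Only the problematic CDR3 is a prefix of the displayed J sequence, which is consistent with the V segment contributing nothing to it.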
This triggered an infinite loop when calling is.germline().
Fixed in commit 42ab98729c5c77462b5929b00ee973074ae4c1c7.
The function should now return immediately, as in the following example.
yassai_identifier(c(V="TRAV14N-3", J="TRAJ16", dna="GCAACTTCAAGTGGCCAGAAGCTGGTT", pep="ATSSGQKLV"))
## [1] "a.A14N3A16L9"
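As a quick sanity check on the returned identifiers, the sketch below assumes that the trailing `L<n>` encodes the CDR3 length in amino acids. That reading matches both examples in this report, but it is an assumption here rather than something taken from the clonotypeR source. The data frame and its column names are hypothetical.

```r
# Peptides and identifiers copied from the examples in this report.
examples <- data.frame(
  pep = c("ATSSGQKLV", "AATATSSGQKLV"),
  id  = c("a.A14N3A16L9", "aTa.2A14N3A16L12"),
  stringsAsFactors = FALSE
)

# Extract the digits after the final "L" (assumed: CDR3 length in amino acids).
examples$len_in_id <- as.integer(sub("^.*L([0-9]+)$", "\\1", examples$id))

# Compare with the actual peptide length.
examples$consistent <- examples$len_in_id == nchar(examples$pep)

print(examples)
```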