Bug #512
closed
MYBPC3-Q850X call is wrong, no Q at this position on UCSC browser
Description
This came up because Heidi pointed out this should cause hypertrophic cardiomyopathy. Looking at position chr11:47315573 in the UCSC browser, there's no "Q" at this position.
Is trait-o-matic outright wrong here? Is there a second splice variant that is synonymous (as the UCSC annotation would predict) that isn't showing up?
If the latter, this would support somehow linking/contextualizing all the splice variants trait-o-matic predicts. If the former, we should be concerned that there are errors elsewhere.
Updated by Madeleine Ball almost 16 years ago
The following variant: http://evidence.personalgenomes.org/XPC-L591Q is also possibly mislabeled -- this is an example where the reference has a 1000 Genomes frequency of 0%, but dbSNP calls it a silent mutation, contrary to what we have reported.
Updated by Ward Vandewege about 15 years ago
- Project changed from 19 to GET-Evidence
- Category deleted (GET-Evidence)
Updated by Madeleine Ball about 15 years ago
- Status changed from New to Closed
Although I haven't tested this specific MYBPC3 example, the presence of incorrect amino acid interpretations (inconsistent with UCSC's amino acid annotations) has largely been fixed by two changes. I've checked and confirmed that the XPC SNP Jason noted is now correctly interpreted as not causing an amino acid change.
(1) The "sanity checking" assert statements added in lines 251-262 of b4141303
Some transcripts in refFlat had position errors (e.g. where an exon ends) in a way that caused a frameshift in all amino acids downstream. These frameshifts are almost certain to contain a premature stop codon, so checking for the presence of these is a good sanity check.
(2) The usage of UCSC's knownGene annotations and canonical transcripts (61df892e)
Sometimes alternative transcripts resulted in calls being made for regions that are exons only in that particular transcript (e.g. a frameshift in an exon that is rarely used).
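To illustrate the idea behind change (1), here is a minimal sketch of a premature-stop sanity check. It is not the code from b4141303; the function name `first_premature_stop` and the example sequences are hypothetical, and it assumes the coding sequence is already assembled from the annotated exons as a plain string starting in frame.

```python
# Hypothetical sketch: flag transcripts whose annotated coding sequence
# contains a stop codon before the final codon, which suggests a
# frameshift caused by a bad exon boundary in the annotation.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon
    that occurs before the last complete codon, or None if there is none."""
    cds = cds.upper()
    n_codons = len(cds) // 3  # ignore any trailing partial codon
    # The last complete codon is allowed to be a stop, so it is skipped.
    for i in range(n_codons - 1):
        codon = cds[3 * i:3 * i + 3]
        if codon in STOP_CODONS:
            return i
    return None


# A well-annotated toy transcript: ATG GCT GAC TGG AAA TAA
good_cds = "ATGGCTGACTGGAAATAA"

# The same transcript with two bases dropped after the start codon,
# mimicking an exon boundary error: ATG TGA CTG GAA ATA A
shifted_cds = good_cds[:3] + good_cds[5:]

print(first_premature_stop(good_cds))     # None
print(first_premature_stop(shifted_cds))  # 1
```

A transcript that fails this check can be skipped or reported rather than used for amino acid calls, since every residue downstream of the shift would be misinterpreted.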