PGx allele definitions are given in either GRCh37 or GRCh38 reference coordinates. PharmCAT, CPIC and PharmGKB use build GRCh38 as a rule. The migration of the PharmGKB API from GRCh37 to GRCh38 seems to be only partially complete: in the PharmGKB API JSON-LD data the build is given as "hg38", but most of the actual coordinates are GRCh37 (hg19). In the PharmGKB and CPIC Excel sheets, the move to GRCh38 is complete.
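A declared build label can be sanity-checked against a variant whose position is known on both builds. The minimal sketch below uses the CYP2C19*19 defining variant listed elsewhere in this article (g.96522613A>G on GRCh37, g.94762856A>G on GRCh38). The function names and the input record are hypothetical and do not reflect the actual PharmGKB API schema.

```python
# Known positions of the CYP2C19*19 defining variant (A>G) on each build,
# as listed in this article's CYP2C19*19 comparison table.
MARKER_POSITIONS = {
    "GRCh37": 96522613,
    "GRCh38": 94762856,
}

# UCSC build names mapped to the GRC names used here.
BUILD_ALIASES = {"hg19": "GRCh37", "hg38": "GRCh38"}


def infer_build(position):
    """Return the build whose marker position matches, or None if neither does."""
    for build, known in MARKER_POSITIONS.items():
        if position == known:
            return build
    return None


def check_declared_build(declared, position):
    """Compare a declared build label with the build implied by the marker position."""
    declared = BUILD_ALIASES.get(declared, declared)
    inferred = infer_build(position)
    return {
        "declared": declared,
        "inferred": inferred,
        "consistent": declared == inferred,
    }


# Hypothetical record: labelled "hg38" but carrying a GRCh37 coordinate.
result = check_declared_build("hg38", 96522613)
print(result)
```

A single marker only flags a mismatch for that one variant. Since the coordinates are described as "mostly" GRCh37, a real check would need to test every position in the definition file.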
===Choosing the correct variants to include in the PGx alleles===
The allele definitions from PharmGKB and PharmVar are not always one-to-one, and understanding why requires some background knowledge. Prefiltering of PharmGKB allele definitions was performed by the [[PGx in Estonia|PGx pipeline of the University of Tartu]], although the exact prefiltering was not published in the supplementary material to [https://www.biorxiv.org/content/early/2018/07/04/356204 ''Reisberg et al.''].

{| class="wikitable"
|-
! CYP2C19*19 !! PharmGKB !! PharmVar !! PharmCAT !! Comment
|-
| NC_000010.10 (GRCh37) || g.96522561T (rs17885098), g.96602623G (rs3758581), g.96522613A>G, g.96609568T>C (rs4917623) || g.96521422A>G (rs7902257), g.96522613A>G || g.96522613A>G (liftOver) || The sources disagree on rs4917623 (intron) and rs7902257 (2 kb upstream variant). They also disagree on the requirement that rs17885098 and rs3758581 must be reference (i.e. only PharmGKB requires that these coordinates are not missing). PharmGKB has included these positions because it assumes different reference bases for them. This seems to be a problem caused by the change of major allele between reference builds GRCh37 and GRCh38, which is inverted in PharmGKB compared with dbSNP and LiftOver.
|-
| NC_000010.11 (GRCh38) || g.94762856A>G || g.94762804C>T (rs17885098), g.94762856A>G, g.94842866A>G (rs3758581) || g.94762856A>G || PharmGKB does not agree with itself when reporting GRCh38 variants in the Excel sheets and GRCh37 variants in the API. The differences are probably caused by non-standard use of major/minor allele (rs17885098, rs3758581), causing reference bases to be reported as variants for GRCh37 and not for GRCh38, as would be expected given the information from dbSNP. Filtering out intron variants (rs4917623) in the Excel sheet may be sensible from an exon/protein-coding view.
|}

The main problem with the changes in definitions is that the same patient may be given different PGx advice depending on the build version of the pipeline (unless, of course, the haplotype is always conserved).
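The effect of these disagreements on a single patient can be illustrated with a small set comparison. The sketch below encodes the GRCh37 variant positions for CYP2C19*19 from the table above and assumes strict matching (an allele is called only if every defining variant is observed). That matching rule, the function names and the example patient are illustrative assumptions, not the behaviour of any of the named pipelines. PharmGKB's reference-base requirements for rs17885098 and rs3758581 are left out for simplicity.

```python
# CYP2C19*19 defining variants on GRCh37 (NC_000010.10), from the table above.
# Only non-reference variants are listed; PharmGKB's requirement that
# rs17885098 and rs3758581 be reference is omitted in this simplified sketch.
DEFINITIONS_GRCH37 = {
    "PharmGKB": {"g.96522613A>G", "g.96609568T>C"},
    "PharmVar": {"g.96521422A>G", "g.96522613A>G"},
    "PharmCAT": {"g.96522613A>G"},
}


def compare_definitions(definitions):
    """Return variants shared by all sources and, per source, the extra ones."""
    shared = set.intersection(*definitions.values())
    extra = {src: variants - shared for src, variants in definitions.items()}
    return shared, extra


def matching_sources(observed, definitions):
    """List sources whose complete definition is found among the observed variants.

    Strict matching is an assumption made for illustration only.
    """
    return sorted(
        src for src, variants in definitions.items() if variants <= observed
    )


shared, extra = compare_definitions(DEFINITIONS_GRCH37)

# Hypothetical patient carrying only the core variant.
calls = matching_sources({"g.96522613A>G"}, DEFINITIONS_GRCH37)
print(shared, extra, calls)
```

Under these assumptions the hypothetical patient matches only the PharmCAT definition, so the same genotype would be reported as CYP2C19*19 by one source and not by the other two.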
==How to define PGx alleles for next generation sequencing==