- Day 4 - Session 9 - PERSONAL AND MEDICAL GENOMICS
- Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population
- Sub-continental ancestry inference based on the gnomAD dataset accurately classifies patients at NCH
- Patient stratification in the UK’s 100k Genomes Project—Using WGS and machine learning to predict cancer outcomes
- Inferring clone- and haplotype-specific chromosomal organization in rearranged cancer genomes with multiple sequencing technologies
- Identification and interpretation of common and rare variants in relation to rare disease phenotype and outcome
- Somatic mutation status prediction by a splicing-alteration-based machine learning technique
- Beyond accessibility—ATAC-seq footprinting analysis reveals dynamics of transcription factor binding during preimplantation development
Day 4 - Session 9 - PERSONAL AND MEDICAL GENOMICS
Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population
Priyanka Nakka, Samuel Pattillo Smith, Anne H. O’Donnell-Luria, Kimberly F. McManus, 23andMe Research Team, Joanna L. Mountain, Sohini Ramachandran, Fah Sathirapongsasuti.
Paper came out two days ago AJHG
“Population-level summaries of ROH increase with divergnce from Agrica” ROH == Runs of Homozygosity (LONG runs where both chr are identical)
“Class C ROH”
- don’t know what this is…something about consanguinity?
- oh, >0.8MB?
Group interested in distribution across human genomes
23andMe partially sort of funded (personel) this study
Also uses 23andMe data
- not even MacArthur has 4 million genomes
Posting IBD karyogram….from reddit (imgur.com/a/hl6Cd)
- both homologs of chr come from one parent
- meiotic oopsie
3,300 reported in clinical literature
- more common on chr 6,7,11, and 15 (most common)
What is incidence in population (at least 23andMe pop)?
What are rates of maternal/paternal UPD?
Three subtypes of UPD
- HetUPD, IsoUPD, Partial IsoUPD
In 916,712 duos/trios in 23andMe found 199 with UPD
- Maternal UPD 3X more common than Paternal UPD
- different dist across chr than the upd-tl.com data
Estimate 1/2000 rate (more common than previously estimated)
Trained log. regression classifier for each chr for each pop to diagnose ROH from UPD (instead of consanguinity)
- used ROH length
- ratio of ROH across chr
- other things I missed
Found 304 putative UPD cases in 1.3 million “singletons”
- recapitulates chr dist. patterns
- found 172 more in UK biobank
- found 0 in the 4,000 duos (not a big sample size)
Sub-continental ancestry inference based on the gnomAD dataset accurately classifies patients at NCH
Andrei Rajkovic, Defne Ceyhan, Greg Wheeler, Tara Lichtenberg, Ben Kelly, Peter White.
Trying to better infer ancestry with the gnomad projection
ML algorithm (gradient boosted decision tree)
- trained on 4000 people (vcf input)
- accuracy defined as true ancestry in top2 (is that too loose?)
90+ ish accuracy (some ancestries don’t do so well)
I guess they are relying on self-reported ancestries? Aren’t these notorious for being not so accurate….?
- I asked … and yes they are …. which isn’t a problem that can be dealt with so easily
Patient stratification in the UK’s 100k Genomes Project—Using WGS and machine learning to predict cancer outcomes
Kate Ridout, Pauline Robbe, CLL pilot consortium, Anna Schuh. Presenter affiliation: University of Oxford, Oxford, United Kingdom. 231
Kaplan Meier curves
- these plots always bum me out
- death curve
Going through pipeline to call variants / SV / CNV in somatic tissue
- LOTS of different tools
- even chromHMM … not certain what datasets the chromHMM was built off of
- anyways just super comprehensive, kitchen sink kind of stuff
- so many tools I wonder if they are correcting their p values for the bajillion test/tools they are doing….
showing some case studies….but why not combining all these into one framework….oh here we go I think
- non-negative matrix factorization (NMF)
- combine all features and see patterns, assess against outcomes
- n isn’t super high (200 ish)
- several independent-ish models / data / thingies which get merged somehow?
- don’t quite understand architecture…
- found a mut sig associated with better outcome (useful to de-prioritze for damaging chemo)
Inferring clone- and haplotype-specific chromosomal organization in rearranged cancer genomes with multiple sequencing technologies
Sergey Aganezov, Fritz Sedlazeck, Sara Goodwin, Gayatri Arun, Isac Lee, Sam Kovaka, Michael Kitsche, Rachel Scherman, Ilya Zban, Vitaly Aksenov, Nikita Alexeev, Robert Wappel, Melissa Kramer, Karen Kostroff, David L. Spector, Winston Timp, Michael S. Schatz, W. Richard McCombie.
Recover linear genome org from cancer genomes
Referring to short-read NGS as 2nd gen and long reads as 3rd gen
- will this catch on? Probably not…especially if I misunderstood what happened
10x Linked Reads + ONT + PacBio
- check concordance
- overlap not great between all three
- A TON were found in both ONT + PacBio
- LOTS of FP for 10x…not a good slide for this tech
- so maybe don’t use 10X LR for SV
Overall finding so many more SV with long reads
Get to 90% precision/recall at 25X
Tried to genotype SVs in 1000G - found a few (germline of course). Artifacts? Interesting? Dunno…
Discussing haplotype resolved chr structure technique
Identification and interpretation of common and rare variants in relation to rare disease phenotype and outcome
Mana Gajapathy, Brandon Wilk, Arthur Weborg, Angelina Uno- Antonison, Alex Moss, Matt Holt, Donna Brown, Melissa Wilk, Camille Birch, Nadiya Sosonkina, Elizabeth Worthey
History lesson on how in 2003, “success rate” for rare genetic disease had seq. diagnosis of about 5%. Today is approaching 50%.
But common genetic disease is ~5% today
Sanger -> Exome -> WGS
Software / pipelines are curcial in making seq. useful
Using many orthogonal datasets and phenotype info (HPO?) to improve filtering
This software proprietary / custom for UAB?
Patient with paterna isodomic UPD….3!!! different diseases with multiple causual genes! Yikes.
Somatic mutation status prediction by a splicing-alteration-based machine learning technique
Raul N. Mateos, Naoko Iida, Kenichi Chiba, Ai Okada, Yuichi Shiraishi.
Model selection and permutation testing for association studies within a large direct-to-consumer genetic cohort Elena P. Sorokin, Danny S. Park, Kristin A. Rand, Julie M. Granka, Eurie L. Hong, Kenneth G. Chahine, Catherine A. Ball.
Beyond accessibility—ATAC-seq footprinting analysis reveals dynamics of transcription factor binding during preimplantation development
Mette Bentsen, Philipp Goymann, Anastasiia Petrova, Hendrik Schultheis, Kathrin Klee, Jens Preussner, Carsten Kuenne, Mario Looso.