Recent Posts

Let's Plot 9: The venerable box plot

Contents: Intro; Load packages; Import TSV (tab-separated-value) file; Plotting!; Hmm, the order is not ideal; Overlay points; Wilcox test; ggbeeswarm; Themes; Themes, with some tweaking of color and text; dabest, one comparison; dabest, multiple comparisons; Conclusion; Session Info
Intro: This is the 9th Let’s Plot…and I’ve not done a workup of the most useful plot - the boxplot. Oops. Well let’s rectify that.
Load packages: Many many packages.

Let's Plot 8: (Animated) US State Covid-19 Case Count

Contents: Load packages, pull data; 2020 03 30 Update; Plotter function; Cases by state; Cases, with log10 scaling; Deaths by state (log10 scaled); Deaths by state, animated; Shift plot; Transform Data and plot; Add exponential lines
Load packages, pull data
2020 03 30 Update: CSSE changed their data structure, so I’ve updated the document. I was using their “time series” data, but they dropped the US-specific (with state by state info) documents.

Let's Plot 7: Clustered Dot Plots in the ggverse

Contents: 2020 03 23 Update; Intro; Example dotplot; How do I make a dotplot?; But let’s do this ourself!; Dotplot!; Zero effort; Remove dots where there is zero (or near zero) expression; Better color, better theme, rotate x axis labels; Tweak color scaling; Now what?; Hey look: ggtree; Let’s glue them together with cowplot; How do we do better?; Two more tweak options if you are having trouble; One more adjust; Moonshot; Downside; Exercises for the reader; OLD Solution (kept for posterity)
2020 03 23 Update: Ming Tang pointed out a better way to align plots, so I have rewritten the back end of this post.

Seurat FindMarker with Cluster N vs M

What: Easy cluster-by-cluster Seurat FindMarkers implementation.
Why: Because Seurat’s FindMarkers (which can be parallelized if you also load library(future) and call plan("multiprocess")) runs with cluster N against all other clusters. People kept asking me for “well what about cluster 23 vs 17” and I kept saying “uh, I haven’t run that because…”
How: This is being done on a Mac. This may not work on a PC. Multicore stuff is complicated.

2019 11 09 GI2019 Personal and Medical Genomics

Day 4 - Session 9 - PERSONAL AND MEDICAL GENOMICS
Talks: Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population; Sub-continental ancestry inference based on the gnomAD dataset accurately classifies patients at NCH; Patient stratification in the UK’s 100k Genomes Project: Using WGS and machine learning to predict cancer outcomes; Inferring clone- and haplotype-specific chromosomal organization in rearranged cancer genomes with multiple sequencing technologies; Identification and interpretation of common and rare variants in relation to rare disease phenotype and outcome; Somatic mutation status prediction by a splicing-alteration-based machine learning technique; Beyond accessibility: ATAC-seq footprinting analysis reveals dynamics of transcription factor binding during preimplantation development
Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population: Priyanka Nakka, Samuel Pattillo Smith, Anne H.

Culture

Repository of information on group culture, NIH resources, and programming practices

Onboarding

Programming Practices

Projects

eyeIntegration

Integration of public human eye RNA-seq datasets: https://eyeintegration.nei.nih.gov

People

John Bryan

John Bryan is working on network analysis of large sets of RNA-seq data for the eyeIntegration project.