Bioinformatics

Let's Plot 7: Clustered Dot Plots in the ggverse

2020 03 23 Update
Intro
Example dotplot
How do I make a dotplot?
But let’s do this ourself!
Dotplot!
Zero effort
Remove dots where there is zero (or near zero expression)
Better color, better theme, rotate x axis labels
Tweak color scaling
Now what?
Hey look: ggtree
Let’s glue them together with cowplot
How do we do better?
Two more tweak options if you are having trouble:
One more adjust
Moonshot
Downside
Exercises for the reader
OLD Solution (kept for posterity)

2020 03 23 Update: Ming Tang pointed out a better way to align plots, so I have rewritten the back end of this post.

2019 11 09 GI2019 Personal and Medical Genomics

Day 4 - Session 9 - PERSONAL AND MEDICAL GENOMICS
Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population
Sub-continental ancestry inference based on the gnomAD dataset accurately classifies patients at NCH
Patient stratification in the UK’s 100k Genomes Project—Using WGS and machine learning to predict cancer outcomes
Inferring clone- and haplotype-specific chromosomal organization in rearranged cancer genomes with multiple sequencing technologies
Identification and interpretation of common and rare variants in relation to rare disease phenotype and outcome
Somatic mutation status prediction by a splicing-alteration-based machine learning technique
Beyond accessibility—ATAC-seq footprinting analysis reveals dynamics of transcription factor binding during preimplantation development

Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population
Priyanka Nakka, Samuel Pattillo Smith, Anne H.

#GI2019 2019 11 08 Afternoon Session

Day 3 - Session 8 - EVOLUTION AND PHYLOGENETICS
Is pathogen evolution predictable? The role of population genomics
Learning the properties of adaptive regions with functional data analysis
Creating pan-human and population-specific consensus representations of the reference genome and assessing their effect on functional genomic data analysis
Gramene subsites—Pangenome browsers for crops
What do we gain when tolerating loss? The information bottleneck, lossy compression, and detecting horizontal gene transfer
A recurrent neural network for inferring sweeps and allele frequency trajectories using gene trees based on the ancestral recombination graph
Assembling the Y chromosomes of Anopheles mosquitoes

Genome Informatics 2019 at CSHL

#GI2019 2019 11 08 Morning Session

Day 3 - Session 7 - MICROBIAL AND METAGENOMICS
Strain-level metagenomic assignment and compositional estimation for long reads with MetaMaps
metaFlye—Scalable long-read metagenome assembly using repeat graphs
Detecting microbial transmission and engraftment after faecal microbiota transplants using long-read metagenomics and reticulatus
Entropy of a bacterial stress response is a generalizable predictor for fitness and antibiotic sensitivity
The use of kmer counts to train random forests to predict country of origin for bacterial pathogen sequencing data
Real-time assembly using Nanopore sequencing data for microbial communities
Exploring the role of ribosomal gene repeats in the context of regeneration
Genomic epidemiology of West Nile virus in California

Genome Informatics 2019 at CSHL

#GI2019 2019 11 07

Day 2 - Session 2 - SEQUENCING ALGORITHMS, VARIANT DISCOVERY AND GENOME ASSEMBLY
Genomic sketching with HyperLogLog
centroFlye—Assembling centromeres with long error-prone reads
Genotyping structural variants in pangenome graphs using the vg toolkit
Rapidly mapping raw nanopore signal with UNCALLED to enable real-time targeted sequencing
The construct and utility of reference pan-genome graphs
PRINCESS — A framework for comprehensive detection and phasing of SNPs and structural variants
Efficient chromosome-scale haplotype-resolved assembly of human individuals
Utilization of an ensemble approach for identification of driver fusions in pediatric cancer

Genome Informatics 2019 at CSHL

#GI2019 2019 11 07 Night Session

Day 2 - Session 6 - TRANSCRIPTOMICS
The functional iso-transcriptomics analysis framework to assess the functional impact of alternative isoform usage
Multi-resolution, interactive, atlas-scale integration of single-cell assays and experiments
Efficient and robust transcriptome reconstruction from long-read RNA-seq alignments
Deconvolving the pervasive transcription from jumping genes in RNA-seq and unveiling their role in tumors
Alignment and mapping methodology influence transcript abundance estimation
A technology-agnostic long-read analysis pipeline for transcriptome discovery and quantification
Quantifying isoform expression in single-cell RNA-seq data with STARsolo-Quant
Full-length transcript characterization for single cell RNA-seq analysis

Genome Informatics 2019 at CSHL

#GI2019 2019 11 06

Day 1 - Session 1 - GENOME STRUCTURE AND FUNCTION
Comparative 3D genome organization in Apicomplexan parasites
Unscrambling the tumor genome via integrated analysis of structural variation and copy number
Tissue-specific enhancer functional networks for associating distal regulatory regions to disease
Targeted Nanopore sequencing with Cas9 for studies of methylation, structural variants, and mutations
Long-read sequencing of structurally variant genomes
Exploring the 3D spatial dependency of gene expression using Markov random fields
Mapping cis-eQTL from RNA-seq data with no genotypes
Exploring short tandem repeat expansions at both known and novel loci in the human genome

Genome Informatics 2019 at CSHL

One Developer Portal: eyeIntegration Genesis

News! eyeIntegration version 1.0 went live early this year (2019-01-16) and was recently accepted for publication in IOVS. In celebration of the news, I’m posting a small series of posts about the genesis, development, upgrades, and future of eyeIntegration. You can find our latest manuscript on bioRxiv. The latest update should go live soon.

Background
eyeIntegration was developed to serve as a quick, easy-to-use portal for normal gene expression in eye tissues.

One Developer Portal: eyeIntegration Web Optimization

This post is a continuation from here.

Really important stuff I learned to make a performant web site in Shiny
After a few months of tinkering I had a working web app on my local computer, a "trashcan" Mac Pro with 32GB of RAM and a 1TB SSD. All of the data objects were .Rdata files, which were loaded with load() when the site was initialized. This was fine in the beginning, and in fact the Shiny site was deployed with this structure in May of 2017.

#GI2018 - Day Four

Variant Discovery and Genome Assembly
Melissa Wilson Sayres
Prithika Sritharan (Dicks)
Zemin Ning (Durbin)

Melissa Wilson Sayres
Sex Differences in Reference Genome Affect Variant Calling and Differential Expression
X and Y homology dotplot: a few regions of alignment (PAR)
@sexchrlab
doi.org/10.1101/346940
https://www.biorxiv.org/content/early/2018/07/18/346940
preprint on correcting tech biases on sex chr in NGS data
infer sex chromosome complement
pointing out regions on autosomes with sex chr-like alignment, which is screwing up read balance, which presumably could influence variant calling
chrY var calls will have many errors