2018 Workshop on emerging methods for sequence analysis

CCBB will host a one-day workshop on emerging methods for sequence analysis. The workshop will feature a mix of invited speakers and local Penn State speakers. It will be held on June 22nd, 2018, immediately following the PSU Boot Camp on Data Reproducibility.


Local information

  • Free Wi-Fi is available (“attwifi”) throughout campus.
  • The best way to get around State College is walking. Uber and Lyft are also available.
  • Select hotel, attraction and dining recommendations here.
  • The workshop is happening at the same time as the Central PA Theatre and Dance Fest.
  • More comprehensive info is available at


The workshop will be held on June 22nd, 2018, in the ASI auditorium.

Time | Speaker | Title
8:30 | | Registration, breakfast, and poster set-up
Session 1 (Chair: Kristoffer Sahlin)
9:00 | Invited talk: Elana Fertig (JHU) | Enter the matrix: factorization uncovers knowledge from omics
10:00 | Marzia Cremona | Discovering functional motifs in “Omics” curves using probabilistic K-mean with local alignment
10:15 | Nate Coraor | Genomics at a national scale: Distributed computing on NSF XSEDE resources with Galaxy and Pulsar
10:30 | | Coffee break and posters
Session 2 (Chair: Anton Nekrutenko)
11:00 | Invited talk: Rob Patro (Stony Brook) | Deconvolution, dictionaries and de Bruijn graphs: algorithm and data structure design for modern genomics
12:00 | Kristoffer Sahlin | IsoCon: Deciphering highly similar multi-copy gene transcripts from PacBio Iso-Seq data
12:15 | Wilfried Guiblet | A new dimension to DNA sequencing: polymerization kinetics at non-B DNA structures
12:30 | | Poster session and lunch
Session 3 (Chair: Marzia Cremona)
2:00 | Invited talk: Seyoung Kim (CMU) | Statistical methods for learning gene networks under SNP perturbation
3:00 | Guray Kuzu | Using topic modeling to identify protein groups from ChIPexo data
3:15 | Guanjue Xiang | PKnorm: normalizing sequencing depth and signal-to-noise ratio between epigenomic data
3:30 | | Coffee break and posters
Session 4 (Chair: Guray Kuzu)
4:00 | Invited talk: Adam Phillippy (NIH) | Can nanopore sequencing finally finish the human genome?
5:00 | Tao Yang | Detecting the differentially interacting genomic regions from Hi-C data
5:15 | Jie Xu | Detection of structure variations in cancer cell lines and leukemia patient samples
5:30 | | Closing remarks
6:00 | | Dinner (by invitation)


The poster session will be held during lunch, but the posters will also be up during the coffee breaks.

Title | Presenter | Authors
Combining Imaging and Genomic Data in a Single Deep Learning Model | Ben Lengerich | Ben Lengerich, Amir Alavi, Maruan Al-Shedivat, Avinava Dubey, Jennifer Williams, Eric P. Xing
Characterizing locus-specific nuclear relocalization between cell types | Lila Rieber | Lila Rieber, Shaun Mahony
SPRITE: A fast and scalable variant detection pipeline | Vasudevan Rengasamy | Vasudevan Rengasamy, Paul Medvedev, Kamesh Madduri
Toward fast and accurate SNP genotyping from whole genome sequencing data for bedside diagnostics | Chen Sun | Chen Sun, Paul Medvedev
Scalable construction of locally collinear blocks in closely related genomes with L-Sibelia | Ilia Minkin | Ilia Minkin, Paul Medvedev
Characterizing protein-DNA binding event subtypes in ChIP-exo data | Naomi Yamada | Naomi Yamada, William K.M. Lai, Nina Farrell, B. Franklin Pugh, Shaun Mahony
Bulk Regulatory Peak Deconvolution using Single Cell RNA-seq | Michael Kleyman | Michael Kleyman, Ziv Bar-Joseph
Impacts of sequence diversity in isolates of human herpes simplex virus (HSV-1) | Molly Rathbun | Molly M. Rathbun, Moriah L. Szpara
Detection of shared balancing selection in the absence of trans-species polymorphism | Xiaoheng Cheng | Xiaoheng Cheng, Michael DeGiorgio
Integrated analysis of ATAC-seq and RNA-seq to investigate the epigenetic regulation of cetuximab response in HNSCC | Luciane Tsukamoto Kagohara | Luciane T Kagohara, Michael Considine, Thomas Sherman, Genevieve Stein-O’Brien, Alexander Favorov, Daria Gaykalova, Elana J Fertig
Quantifying the similarity of topological domains across normal and cancer human cell types | Natalie Sauerwald | Natalie Sauerwald, Carl Kingsford
Flow Cytometry and Global Metabolomics as Tools to Study the Impact of Xenobiotics on Microbiome Physiology and Function | Jingwei Cai | Jingwei Cai, Robert Nichols, Imhoi Koo, Zachary Kalikow, Yuan Tian, Andrew D. Patterson
Genetic Diversity of the Plasmodium vivax phosphatidylinositol 3-kinase (PvPI3K) gene in two regions of the China-Myanmar border | Huguette Gaelle Ngassa Mbenda | Huguette Gaelle Ngassa Mbenda, Weilin Zeng, Faiza Amber Siddiqui, Zhaoqing Yang, Liwang Cui
scQuery: a web server for comparative analysis of single-cell RNA-seq data | Amir Alavi | Amir Alavi, Matthew Ruffalo, Aiyappa Parvangada, Zhilin Huang, Ziv Bar-Joseph
A tale of two bumblebees: Prospects and challenges of utilizing Genome-Wide Association Studies (GWAS) to investigate the genomic basis of adaptive traits in insects | Sarthok Rasique Rahman | Sarthok R. Rahman, Heather M. Hines
Automatically eliminating errors induced by suboptimal parameter choices in transcript assembly | Carl Kingsford | Dan DeBlasio, Carl Kingsford
Diagnosis of Facioscapulohumeral Dystrophy Through Nanopore Sequencing | Anton Nekhai | Anton Nekhai, Pavel Avdeyev, Alexander Liu, Yi-Wen Chen, Max A. Alekseyev
The ER Stress Sensor IRE1a Promotes Ras-induced Senescence Through Targeted Degradation of Pro-oncogenic Id1 mRNA | Jeongin Son | Nicholas Blazanin, Jeongin Son, Alayna Craig-Lucas, Christian John, Kyle Breech, Michael Podolsky, Adam Glick

