
Interactive visual pattern search in epigenomic data using unsupervised deep representation learning

Brant Peterson
Eric Ma

Peax is a tool for interactive visual pattern search in epigenomic data that is based on unsupervised deep representation learning for similarity search. Visually searching for epigenomic patterns by similarity is challenging when the search space is large, the visual patterns are complex, or the search target is not well defined. To overcome these challenges, we have developed a convolutional autoencoder model for unsupervised representation learning of regions in epigenomic data that can capture more visual details of complex patterns than existing similarity measures. Using this learned representation as features of regions of epigenomic data, Peax enables interactive, relevance-feedback-driven adjustments of the pattern search to adapt to the user's perceived similarity. The binary relevance feedback, which is provided by labeling sampled regions as either matching or not matching the search target, is used to interactively train a binary classifier. The goal of this classifier is to learn the importance of the different dimensions of the regions' learned representation and ultimately to find regions in the genome that the analyst perceives as similar. We employ an active learning strategy to focus the labeling process on regions that will improve the classifier in subsequent training.
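The feedback loop described above can be illustrated with a minimal sketch. It is not Peax's implementation: the latent vectors are random stand-ins for the autoencoder's output, logistic regression stands in for whichever binary classifier Peax trains, uncertainty sampling stands in for its active learning strategy, and `analyst_says_match` is a hypothetical oracle that replaces a human labeling regions.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, x):
    # weights holds one weight per latent dimension plus a trailing bias.
    z = sum(w * v for w, v in zip(weights, x)) + weights[-1]
    return sigmoid(z)


def train_classifier(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent (stand-in classifier)."""
    dim = len(features[0])
    weights = [0.0] * (dim + 1)
    for _ in range(epochs):
        grad = [0.0] * (dim + 1)
        for x, y in zip(features, labels):
            err = predict_proba(weights, x) - y
            for j in range(dim):
                grad[j] += err * x[j]
            grad[dim] += err
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


def most_uncertain(weights, features, labeled, k=3):
    """Uncertainty sampling: unlabeled regions whose probability is nearest 0.5."""
    candidates = [i for i in range(len(features)) if i not in labeled]
    candidates.sort(key=lambda i: abs(predict_proba(weights, features[i]) - 0.5))
    return candidates[:k]


def analyst_says_match(x):
    # Hypothetical oracle: the "analyst" only cares about latent dimension 0.
    return 1 if x[0] > 0 else 0


random.seed(0)
# Assumed stand-in for autoencoder output: 200 regions, 4 latent dimensions.
latent = [[random.gauss(0, 1) for _ in range(4)] for _ in range(200)]

# Seed the search with one matching and one non-matching region.
labeled = {}
pos = next(i for i, x in enumerate(latent) if analyst_says_match(x) == 1)
neg = next(i for i, x in enumerate(latent) if analyst_says_match(x) == 0)
labeled[pos] = 1
labeled[neg] = 0

# Ten rounds of: train on current labels, query uncertain regions, label them.
for _ in range(10):
    idx = list(labeled)
    weights = train_classifier([latent[i] for i in idx], [labeled[i] for i in idx])
    for i in most_uncertain(weights, latent, labeled):
        labeled[i] = analyst_says_match(latent[i])

# Final model trained on everything labeled so far.
idx = list(labeled)
weights = train_classifier([latent[i] for i in idx], [labeled[i] for i in idx])
accuracy = sum(
    1 for x in latent
    if (predict_proba(weights, x) > 0.5) == (analyst_says_match(x) == 1)
) / len(latent)
print(f"labeled {len(labeled)} of {len(latent)} regions, accuracy {accuracy:.2f}")
```

In this toy setting, the learned weights indicate which latent dimensions matter to the simulated analyst, which mirrors the role the classifier plays in the description above.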



Source (Released under Apache 2.0)

Publication: Lekschas et al. (2020) Peax: Interactive Visual Pattern Search in Sequential Data Using Unsupervised Deep Representation Learning. Computer Graphics Forum (EuroVis).

Release Date: June 2020
Data type: 2D, Spatial representation
Installed, Web based
JavaScript, Python
Linux, Mac OSX, Windows

Project development

Institution: Harvard University

We implemented Peax as a local web-based tool that runs entirely on your own machine. Peax itself is written in JavaScript and Python and uses HiGlass, a flexible web application for viewing large tile-based genomic datasets, to visualize epigenomic data. Peax currently works with DNase-seq and histone modification ChIP-seq data. The source code is available on GitHub (released under Apache 2.0) and includes several examples.