Epigenomics

Peax

Peax is a tool for interactive visual pattern search in epigenomic data that is based on unsupervised deep representation learning for similarity search. Visually searching for epigenomic patterns by similarity is challenging when the search space is large, the visual patterns are complex, or the search target is not well defined. To overcome these challenges, we have developed a convolutional autoencoder model for unsupervised representation learning of regions in epigenomic data that can capture more visual details of complex patterns than existing similarity measures. Using this learned representation as features of regions of epigenomic data, Peax enables interactive, relevance feedback-driven adjustments of the pattern search to adapt to the user's perceived similarity. The binary relevance feedback, which is provided by labeling sampled regions as either matching or not matching the search target, is used to interactively train a binary classifier. The goal of this classifier is to learn the importance of the different dimensions of the regions' learned representation and ultimately find regions in the genome that the analyst perceives as similar. We employ an active learning strategy to focus the labeling process on regions that will improve the classifier in subsequent training.
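
The feedback loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Peax's implementation: the learned representation is replaced by a toy 2-D grid, the analyst is simulated by a hypothetical function that only cares about the first dimension, the classifier is a plain logistic regression, and the active learning strategy is simple uncertainty sampling. All function names are made up for this example.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(model, x):
    """Probability that region x matches the search target."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def train_classifier(features, labels, epochs=300, lr=0.5):
    """Fit a logistic-regression classifier with batch gradient descent."""
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = predict_proba((weights, bias), x) - y
            grad_w = [g + err * v for g, v in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def most_uncertain(model, features, labeled, k=2):
    """Uncertainty sampling: unlabeled regions with probability closest to 0.5."""
    pool = [i for i in range(len(features)) if i not in labeled]
    pool.sort(key=lambda i: abs(predict_proba(model, features[i]) - 0.5))
    return pool[:k]

def simulated_analyst(x):
    """Hypothetical stand-in for human feedback: only dimension 0 matters."""
    return 1 if x[0] > 0 else 0

def fit_on_labeled(features, labeled):
    """Train on the regions that have received feedback so far."""
    idx = list(labeled)
    return train_classifier([features[i] for i in idx], [labeled[i] for i in idx])

def run_search(rounds=10):
    # Toy 2-D "learned representations" on a regular grid (400 regions).
    features = [[i / 10 - 0.95, j / 10 - 0.95] for i in range(20) for j in range(20)]
    # Seed the search with one matching and one non-matching region.
    first_pos = next(i for i, x in enumerate(features) if simulated_analyst(x) == 1)
    first_neg = next(i for i, x in enumerate(features) if simulated_analyst(x) == 0)
    labeled = {first_pos: 1, first_neg: 0}
    for _ in range(rounds):
        model = fit_on_labeled(features, labeled)
        # Ask for feedback only on the regions the classifier is least sure about.
        for i in most_uncertain(model, features, labeled):
            labeled[i] = simulated_analyst(features[i])
    model = fit_on_labeled(features, labeled)
    hits = sum((predict_proba(model, x) > 0.5) == (simulated_analyst(x) == 1)
               for x in features)
    return model, labeled, hits / len(features)
```

Calling `run_search()` returns the trained weights, the collected labels, and the fraction of regions on which the classifier agrees with the simulated analyst. The learned weights play the role of the "importance of the different dimensions" mentioned above: in this toy setup the weight on dimension 0 grows positive because that is the only dimension the simulated analyst responds to.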

Project website: peax.lekschas.de

Introductory video: youtu.be/FlzTdFUVE-M

Source code: github.com/novartis/peax (Released under Apache 2.0)

Publication: Lekschas et al. (2020) Peax: Interactive Visual Pattern Search in Sequential Data Using Unsupervised Deep Representation Learning. Computer Graphics Forum (EuroVis).

Project development

Institution: Harvard University

We implemented Peax as a local web-based tool that runs only on your machine. Peax itself is written in JavaScript and Python and uses HiGlass, a flexible web application for viewing large tile-based genomic datasets, to visualize epigenomic data. Peax currently works with DNase-seq and histone modification ChIP-seq data. The source code is available on GitHub (released under Apache 2.0) and includes several examples.

HiGlass

HiGlass is a web-based tool for visually exploring and comparing 2D genomic contact matrices, 1D genomic tracks, or other datasets too large to view at once. It features synchronized navigation of multiple views as well as continuous zooming and panning for navigation across genomic loci and resolutions. It supports visual comparison of genomic (e.g., Hi-C, ChIP-seq, or bed annotations) and other data (e.g., geographic maps, gigapixel images, or abstract 1D and 2D sequential data) from different experimental conditions and can be used to efficiently identify salient outcomes of experimental perturbations, generate new hypotheses, and share the results with the community.

Project website: higlass.io

Source code: github.com/higlass/higlass

Publication: Kerpedjiev et al. (2018) HiGlass: Web-based visual comparison and exploration of genome interaction maps. Genome Biology, 19:125.

Project development

Institution: Harvard Medical School

HiGlass is a fast visualization tool for large Hi-C and other genomic data sets. It was created by Peter Kerpedjiev at the Gehlenborg Lab at Harvard Medical School in close collaboration with the Visual Computing Group at the Harvard John A. Paulson School of Engineering and Applied Sciences and the Mirny Lab at the Massachusetts Institute of Technology, as part of the 4D Nucleome Project's Data Coordination and Integration Center.

Scalable Insets

Scalable Insets is a new technique for interactively exploring and navigating large numbers of annotated patterns in multiscale visual spaces such as genome interaction maps from Hi-C experiments. Our technique visualizes annotated features, such as loops or TADs, that are too small to be identifiable at certain zoom levels using insets, i.e., magnified thumbnail views of the features. Insets are dynamically placed either within the viewport or along the boundary of the viewport to offer a compromise between locality and context preservation. Annotated features are interactively clustered by location and type. Each cluster is visually represented as an aggregated inset to provide scalable exploration within a single viewport. Find out more on the project page and in our 5-minute introductory video.
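
To illustrate what clustering "by location and type" can mean in practice, here is a deliberately simplified Python sketch. It assumes a uniform grid in screen space as the notion of location, which is a stand-in chosen for brevity and not necessarily the clustering used by Scalable Insets. The function name, the tuple layout, and the `cell_size` parameter are hypothetical.

```python
from collections import defaultdict

def cluster_features(features, cell_size):
    """Group features that share a type and fall into the same grid cell.

    features:  iterable of (x, y, kind) tuples in screen coordinates
    cell_size: grid cell width/height in the same units (assumed parameter)
    Returns one aggregated inset per (kind, cell) group.
    """
    groups = defaultdict(list)
    for x, y, kind in features:
        key = (kind, int(x // cell_size), int(y // cell_size))
        groups[key].append((x, y))

    insets = []
    for (kind, _, _), points in sorted(groups.items()):
        # Place the aggregated inset at the centroid of its members.
        cx = sum(p[0] for p in points) / len(points)
        cy = sum(p[1] for p in points) / len(points)
        insets.append({"type": kind, "count": len(points), "center": (cx, cy)})
    return insets

annotations = [(10, 10, "loop"), (12, 14, "loop"), (11, 11, "TAD"), (200, 200, "loop")]
zoomed_out = cluster_features(annotations, cell_size=50)
zoomed_in = cluster_features(annotations, cell_size=2)
```

With a coarse grid (zoomed out), the two nearby loops collapse into one aggregated inset while the TAD stays separate because its type differs. With a fine grid (zoomed in), every feature gets its own inset again.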

Project development

Institution: Harvard University

We implemented Scalable Insets as an extension to HiGlass, a flexible web application for viewing large tile-based genomic datasets. Besides genome interaction maps, our implementation also supports gigapixel images and geographic maps. The tool can easily be applied to existing BEDPE annotation files. The source code is available on GitHub.

karyoploteR

karyoploteR is an R/Bioconductor package to plot genomic data along the genome. It implements a genomic-coordinates version of most R graphical primitives, facilitating the creation of rich and powerful genome visualizations. Since karyoploteR does not try to "understand" the data it is plotting, it can plot almost any data type as long as it is positioned on the genome. In addition, while the package includes data for some of the most used genomes, it can automatically download genome information from external sources and accepts custom genomes directly from the user, thus making it possible to "plot anything on any genome". karyoploteR covers the whole zoom range, going from a single base to the whole genome by changing a single parameter in a function call. There are additional higher-level functions to plot specific types of data, for example one to compute and plot the density of features along the genome, another to plot the coverage level directly from a BAM file, and a third to plot links between genomic regions.
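
As a language-neutral illustration of the "density of features along the genome" idea, the following Python sketch counts features in fixed-size windows along one chromosome. It is not karyoploteR code and does not use its API (karyoploteR is an R package); the function name and parameters are hypothetical and positions are assumed to be 1-based.

```python
def feature_density(positions, chrom_length, window_size):
    """Count features per fixed-size window along one chromosome.

    positions:    1-based feature positions (assumed convention)
    chrom_length: chromosome length in bases
    window_size:  window width in bases
    """
    n_windows = -(-chrom_length // window_size)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if 1 <= pos <= chrom_length:  # silently skip out-of-range positions
            counts[(pos - 1) // window_size] += 1
    return counts

density = feature_density([1, 5, 10, 11, 25], chrom_length=30, window_size=10)
```

Each entry of the result corresponds to one window and could be drawn as a bar or area at that window's genomic position. A smaller `window_size` gives a finer but noisier profile.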

To learn more about the functionality of karyoploteR, you can check the package vignette or head to the karyoploteR tutorial page, where you will find a step-by-step tutorial on how to use the package as well as some more involved examples with detailed explanations, including how to use karyoploteR to plot different standard data types: RNA-seq differential expression results, SNP-array data, and somatic mutation distance using rainfall plots.

 

Bioconductor landing page: http://bioconductor.org/packages/karyoploteR/

Tutorial and Examples: https://bernatgel.github.io/karyoploter_tutorial/

Source code on GitHub: https://github.com/bernatgel/karyoploteR

Project development

Institution: Germans Trias i Pujol Research Institute, IGTP

HiPiler

HiPiler is an interactive visualization interface for the exploration and visualization of regions-of-interest in large genome interaction matrices. Genome interaction matrices approximate the physical distance of pairs of genomic regions to each other and can contain up to 3 million rows and columns with many sparse regions. Traditional matrix aggregation or pan-and-zoom interfaces largely fail to support search, inspection, and comparison of local regions-of-interest (ROIs). ROIs can be defined, e.g., by sets of adjacent rows and columns, or by specific visual patterns in the matrix. ROIs are first-class objects in HiPiler, which represents them as thumbnail-like “snippets”. Snippets can be laid out automatically based on their data and meta attributes. They are linked back to the matrix and can be explored interactively. The design of HiPiler is based on a series of semi-structured interviews with 10 domain experts involved in the analysis and interpretation of genome interaction matrices. In the paper we describe six exploration tasks that are crucial for the analysis of interaction matrices and demonstrate how HiPiler supports these tasks. We report on a user study with a series of data exploration sessions with domain experts to assess the usability of HiPiler as well as to demonstrate respective findings in the data.
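
The idea of treating an ROI as a snippet can be sketched as cutting a small window out of the matrix and then ordering the resulting snippets by an attribute. The Python below is a minimal illustration on a nested list, assuming a square window around a center cell. It is not HiPiler's code, and the function names and the "mean signal" ordering are hypothetical choices for this example.

```python
def extract_snippet(matrix, row, col, half_size):
    """Cut a square window centered on (row, col), clipped at the matrix border."""
    top = max(0, row - half_size)
    bottom = min(len(matrix), row + half_size + 1)
    left = max(0, col - half_size)
    right = min(len(matrix[0]), col + half_size + 1)
    return [r[left:right] for r in matrix[top:bottom]]

def mean_signal(snippet):
    """A simple data attribute that snippets could be ordered by."""
    values = [v for r in snippet for v in r]
    return sum(values) / len(values)

# Toy 5x5 "interaction matrix" whose cell (i, j) holds the value i * 5 + j.
matrix = [[i * 5 + j for j in range(5)] for i in range(5)]

center = extract_snippet(matrix, 2, 2, 1)   # full 3x3 window
corner = extract_snippet(matrix, 0, 0, 1)   # clipped to 2x2 at the border
ordered = sorted([center, corner], key=mean_signal)
```

Here `ordered` places the corner snippet first because its mean signal is lower. In a real interface the sort key could equally be a meta attribute such as genomic distance or annotation type.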

Project development

Institution: Harvard University

HiPiler is implemented as a web application consisting of a frontend interface for the visualizations and a server-side component that provides the data. The frontend is written entirely in JavaScript, using Aurelia as its application framework and Redux for fine-grained, history-aware state management. The matrix snippets are visualized with WebGL using Three.js as middleware. Finally, HiGlass is integrated as a library for displaying the interaction matrix and genomic tracks. The server-side backend serves data to HiGlass and provides the matrix snippets. The backend is implemented in Python and uses Django as its application framework. The contact matrices are accessed through Cooler, a Python-based service library for storing and querying Hi-C data. The frontend and backend are two separate applications that can be decoupled to load different data types. HiPiler is open source and available on GitHub.

Integrated Genome Browser

The Integrated Genome Browser (IGB, pronounced Ig-Bee) is a fast, flexible, and free desktop genome browser. First developed at Affymetrix in 2001 to support visual analytics of genome tiling arrays, IGB provides an advanced, highly customizable environment for exploring and analyzing large-scale genomic data sets.

Using IGB, you can:

  • View your RNA-Seq, ChIP-chip or ChIP-seq data alongside genome annotations and sequence.
  • Investigate alternative splicing, regulation of gene expression, epigenetic modifications of DNA, and other genome-scale questions.
  • View results from aligning short-read sequences onto a target genome, identify SNPs, and check alignment quality.
  • Copy and paste genomic sequences for further analysis into other tools, such as primer design and promoter analysis tools.
  • Create high-quality images for publication in a variety of formats.

 

IGB features

IGB lets you view results from your own experiments or computational analyses alongside public domain gene annotations, sequences, and genomic data sets, thus making it easier for you to determine how your experiments agree or disagree with current thinking and models of genomic structure.

Some features IGB offers include:

  • Animated zooming. Most genome browsers implement "jump zooming" only, in which you click a zoom button (or other type of control) and then wait for the display to re-draw. In IGB, zooming is animated, allowing you to easily and quickly adjust the zoom level as needed without losing track of your location.
  • Simple Data Sharing System - QuickLoad. IGB implements a very simple, easy-to-use system for sharing data called QuickLoad. You can use the QuickLoad system to set up a Web site you can use to share your data with colleagues, reviewers, and the public.
  • Draggable graphs. You can display genome graphs data (e.g., "bar" and "wiggle" files) alongside and even on top of reference genome annotations, thus making it easier to see how your experimental results match up to the published reference genome annotations. You can reset your graphs to "floating" and click-drag them over annotations to compare your results with annotations and others' experiments.
  • Edge-matching across tracks. When you click an item in the display, the edges of other items in the same or different tracks with identical boundaries light up, highlighting interesting similarities or differences across gene models, sequence reads, or other features.
  • Integration with local and remote external data sources. IGB can load data from a variety of sources, including Distributed Annotation Servers, QuickLoad servers, ordinary Web sites, and local files.
  • Intron-trimming sliced view. In many species, introns are huge when compared to the exonic (coding) regions of genes. IGB provides a Sliced View tab that trims uninformative regions from introns.
  • Web-controls. IGB can be controlled from a web browser or any other program capable of sending HTTP requests. Via IGB links, you can create Web pages that direct IGB to scroll to a specific region and load data sets from local files or servers.
  • Scripting. IGB understands a simple command language that allows users to write simple scripts directing IGB to show a genome, zoom and scroll to specific regions, and other functions.
  • Open source. All development on IGB proceeds via a 100% open source model. The license allows developers to incorporate IGB (and its components) into new applications.
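
Because IGB can be driven by HTTP requests, a link that scrolls it to a region can be assembled programmatically. The Python sketch below only builds such a URL and does not send it. The host, port, path, and query parameter names are assumptions made for illustration and should be checked against the IGB documentation before use. The function name and the genome version string are hypothetical.

```python
from urllib.parse import urlencode

# ASSUMPTION: address of IGB's local control endpoint; verify in the IGB docs.
ASSUMED_IGB_ENDPOINT = "http://localhost:7085/UnibrowControl"

def build_igb_link(genome_version, seqid, start, end, endpoint=ASSUMED_IGB_ENDPOINT):
    """Build a URL asking a locally running IGB to show a genomic region.

    The query parameter names (version, seqid, start, end) are assumed,
    not taken from the text above.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    query = urlencode({
        "version": genome_version,
        "seqid": seqid,
        "start": start,
        "end": end,
    })
    return f"{endpoint}?{query}"

link = build_igb_link("A_thaliana_Jun_2009", "chr1", 1000, 2000)
```

Such a link could be embedded in a web page or opened from a script while IGB is running. Keeping the endpoint as a parameter makes it easy to adjust if the actual address differs from the assumed one.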

Project development

Institution: University of North Carolina at Charlotte