UMMS Affiliation

Program in Systems Biology; Department of Biochemistry and Molecular Pharmacology

Publication Date

2019-03-19

Document Type

Article

Disciplines

Computational Biology | Genetic Phenomena | Genomics | Structural Biology | Systems Biology

Abstract

BACKGROUND: Hi-C is currently the most widely used assay to investigate the 3D organization of the genome and to study its role in gene regulation, DNA replication, and disease. However, Hi-C experiments are costly to perform and involve multiple complex experimental steps; thus, accurate methods for measuring the quality and reproducibility of Hi-C data are essential to determine whether the output should be used further in a study.

RESULTS: Using real and simulated data, we profile the performance of several recently proposed methods for assessing reproducibility of population Hi-C data, including HiCRep, GenomeDISCO, HiC-Spector, and QuASAR-Rep. By explicitly controlling noise and sparsity through simulations, we demonstrate the deficiencies of performing simple correlation analysis on pairs of matrices, and we show that methods developed specifically for Hi-C data produce better measures of reproducibility. We also show how to use established measures, such as the ratio of intra- to interchromosomal interactions, and novel ones, such as QuASAR-QC, to identify low-quality experiments.

CONCLUSIONS: In this work, we assess reproducibility and quality measures by varying sequencing depth, resolution and noise levels in Hi-C data from 13 cell lines, with two biological replicates each, as well as 176 simulated matrices. Through this extensive validation and benchmarking of Hi-C data, we describe best practices for reproducibility and quality assessment of Hi-C experiments. We make all software publicly available at http://github.com/kundajelab/3DChromatin_ReplicateQC to facilitate adoption in the community.

Rights and Permissions

© The Author(s). 2019 Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

DOI of Published Version

10.1186/s13059-019-1658-7

Source

Genome Biol. 2019 Mar 19;20(1):57. doi: 10.1186/s13059-019-1658-7. Link to article on publisher's site

Journal/Book/Conference Title

Genome biology

Comments

Full author list omitted for brevity. For the full list of authors, see article.

Related Resources

Link to Article in PubMed

PubMed ID

30890172

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Share

COinS
 
 

To view the content in your browser, please download Adobe Reader or, alternately,
you may Download the file to your hard drive.

NOTE: The latest versions of Adobe Reader do not support viewing PDF files within Firefox on Mac OS and if you are using a modern (Intel) Mac, there is no official plugin for viewing PDF files within the browser window.