Comparative Annotation Toolkit (CAT)—simultaneous clade and personal genome annotation
- Ian T. Fiddes1,2,
- Joel Armstrong1,8,
- Mark Diekhans1,8,
- Stefanie Nachtweide3,8,
- Zev N. Kronenberg4,
- Jason G. Underwood4,5,
- David Gordon4,6,
- Dent Earl1,
- Thomas Keane7,
- Evan E. Eichler4,6,
- David Haussler1,
- Mario Stanke3 and
- Benedict Paten1
- 1 Genomics Institute, University of California Santa Cruz and Howard Hughes Medical Institute, Santa Cruz, California 95064, USA;
- 2 10x Genomics, Pleasanton, California 94566, USA;
- 3 Institute of Mathematics and Computer Science, University of Greifswald, 17489 Greifswald, Germany;
- 4 Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA;
- 5 Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA;
- 6 Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA;
- 7 European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SD, United Kingdom
- 8 These authors contributed equally to this work.
Abstract
The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants—even in genomes as well studied as rat and the great apes—and how these annotations improve cross-species RNA expression experiments.
Footnotes
- [Supplemental material is available for this article.]
- Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.233460.117.
- Received December 7, 2017.
- Accepted May 3, 2018.
This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.