A complete pedigree-based graph workflow for rare candidate variant analysis

  1. Benedict Paten1
  1. UC Santa Cruz Genomics Institute, Santa Cruz, California 95060, USA;
  2. Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20894, USA;
  3. Google Incorporated, Mountain View, California 94043, USA;
  4. Howard Hughes Medical Institute, University of California, Santa Cruz, California 95064, USA
  • Corresponding authors: cmarkell{at}ucsc.edu, bpaten{at}ucsc.edu
  • Abstract

    Methods that use a linear genome reference for genome sequencing data analysis are reference-biased. In clinical genetics for rare diseases, the resulting reduction in genotyping accuracy in some regions has likely prevented the resolution of some cases. Pangenome graphs embed population variation into a reference structure. Although pangenome graphs have helped to reduce reference mapping bias, further performance improvements are possible. We introduce VG-Pedigree, a pedigree-aware workflow based on the pangenome-mapping tool Giraffe and the variant-calling tool DeepTrio, using a model specially trained for Giraffe-based alignments. We demonstrate mapping and variant-calling improvements for both single-nucleotide variants (SNVs) and insertion and deletion (indel) variants over those produced by alignments created using BWA-MEM to a linear reference and by Giraffe mapping to a pangenome graph containing data from the 1000 Genomes Project. We have also adapted and upgraded deleterious-variant (DV) detecting methods and programs into a streamlined workflow. We used these workflows in combination to detect small lists of candidate DVs among 15 family quartets and quintets of the Undiagnosed Diseases Program (UDP). All candidate DVs that were previously diagnosed using the Mendelian models covered by the previously published methods were recapitulated by these workflows. The results of these experiments indicate that a slightly greater absolute count of DVs is detected in the proband population than in their matched unaffected siblings.

    Footnotes

    • Received November 24, 2021.
    • Accepted March 24, 2022.

    This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see https://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 4.0 International), as described at http://creativecommons.org/licenses/by-nc/4.0/.
