DeepVariant-on-Spark

DeepVariant-on-Spark is a germline short variant calling pipeline that runs Google DeepVariant on Apache Spark at scale.

Why DeepVariant-on-Spark

DeepVariant is highly accurate. In 2016, DeepVariant won the PrecisionFDA Truth Challenge in the best SNP performance category.
Apache Spark is a lightning-fast unified analytics engine for large-scale data processing. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine.
DeepVariant (v0.7) does not support multiple GPUs. With DeepVariant-on-Spark, all GPU resources can be used across multiple nodes, for example the 8 Tesla V100 GPUs in an NVIDIA DGX-1.
DeepVariant-on-Spark leverages Atgenomix SeqPiper, a wrapper technology built on Spark PipeRDD, to parallelize the DeepVariant pipeline on Spark, and uses YARN to optimize resource allocation in multi-node environments.
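
To give a rough feel for the pipe pattern that a PipeRDD is built around, here is a minimal sketch in plain Python. It is not SeqPiper or DeepVariant-on-Spark code and does not need Spark. It only imitates the idea: split the records into partitions, stream each partition to an external command on stdin, and collect the command's stdout lines as the output records. All names (`split_into_partitions`, `pipe_partition`, `pipe_all`, `UPPERCASE_TOOL`) and the region strings are hypothetical, and the "tool" is a stand-in that simply uppercases its input.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Round-robin the records into num_partitions lists (hypothetical partitioner)."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def pipe_partition(partition, command):
    """Send one partition to an external command, one record per stdin line.

    Returns the command's stdout split into lines, which become the
    output records for this partition.
    """
    if not partition:
        return []
    result = subprocess.run(
        command,
        input="\n".join(partition) + "\n",
        capture_output=True,
        text=True,
        check=True,  # raise if the external tool exits with an error
    )
    return result.stdout.splitlines()


def pipe_all(records, command, num_partitions=4):
    """Pipe every partition through the command concurrently and flatten the results."""
    partitions = split_into_partitions(records, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        outputs = list(pool.map(lambda p: pipe_partition(p, command), partitions))
    return [line for output in outputs for line in output]


# Stand-in for an external tool: reads stdin and writes it back uppercased.
UPPERCASE_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]

if __name__ == "__main__":
    regions = ["chr1:1-100", "chr2:1-100", "chr3:1-100"]  # made-up example records
    for line in pipe_all(regions, UPPERCASE_TOOL, num_partitions=2):
        print(line)
```

In a real Spark job the partitions would live on different executors and the cluster manager would decide where each external process runs. The thread pool here only stands in for that concurrency on a single machine.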

Documentation

DeepVariant-on-Spark release notes

Dependence

Quick start and Case studies

Contributing

Interested in contributing? See CONTRIBUTING.

License

DeepVariant-on-Spark is licensed under the terms of the Apache 2.0 License.

Acknowledgements

DeepVariant-on-Spark happily makes use of many open source packages. We'd like to specifically call out a few key ones:

We thank all of the developers and contributors to these packages for their work.

Disclaimer

This is not an official Atgenomix product.
For the official product and the full experience, please contact Atgenomix ([email protected]).
