Mark D. Yandell, PhD
Dr. Mark Yandell is an internationally recognized expert in comparative and
functional
Title and Abstract:
Annotating genomes and their sequence-variants using interoperable,
machine-readable data standards
Department of Human Genetics, Eccles Institute of Human Genetics, University of
Utah and School of Medicine, Salt Lake City, Utah, USA
The ever-falling cost of sequencing is having a dramatic impact on the research
community, changing which genomes are sequenced, how, and where. Indeed, costs
have now fallen to the point where a sequenced genome is often only one
component of a genomics-centered research plan, with many of today’s projects
also involving significant transcriptome and re-sequencing efforts. The scale
of these projects is truly staggering, and they present many challenges in
quality control and curation. Datasets of this size preclude ad hoc manual
curation and require automated approaches to data management and quality
control. This in turn makes the use of interoperable, machine-readable data
standards essential. Fortunately, several widely used data standards are
available for the genomics domain. These include GFF for representing genome
annotations and their associated evidence, and VCF and GVF for representing
sequence variants. I will show how the use of these standardized formats is
empowering individual investigators and small collaborative groups to annotate,
manage, curate and analyze even truly huge genome datasets. I will also discuss
the challenges presented by genome re-sequencing, especially the annotation of
these data in an interoperable, machine-readable fashion. Finally, I will
highlight a few examples from my own group illustrating how genome annotation
and re-sequencing efforts can be combined for rapid identification of the genes
and alleles underlying human disease and the characteristic traits of plant
cultivars and animal breeds.
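
As a small illustration of why a machine-readable standard makes automated
quality control practical, the sketch below parses GFF3 feature lines and runs
two simple checks on them. It assumes only the nine tab-separated columns and
the semicolon-separated tag=value attributes defined by the GFF3 specification.
The names Gff3Feature, parse_gff3_line and qc_problems, the example record, and
the choice of checks are hypothetical and are not taken from the talk.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Gff3Feature(NamedTuple):
    """One feature line of a GFF3 file (the nine standard columns)."""
    seqid: str
    source: str
    type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: dict


def parse_gff3_line(line: str) -> Optional[Gff3Feature]:
    """Parse one GFF3 line; return None for blank lines and '#' lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    attributes = {}
    if cols[8] != ".":  # "." marks an empty column
        for pair in cols[8].split(";"):
            if pair:
                key, _, value = pair.partition("=")
                # GFF3 percent-encodes reserved characters in attributes.
                attributes[unquote(key)] = unquote(value)
    return Gff3Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                       cols[5], cols[6], cols[7], attributes)


def qc_problems(feature: Gff3Feature) -> list:
    """Return human-readable problems found in one feature (illustrative)."""
    problems = []
    # GFF3 coordinates are 1-based and start must not exceed end.
    if feature.start < 1 or feature.start > feature.end:
        problems.append(f"bad coordinates {feature.start}..{feature.end}")
    if feature.strand not in {"+", "-", ".", "?"}:
        problems.append(f"bad strand {feature.strand!r}")
    return problems


# Invented example record, for illustration only.
example = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene0001;Name=demo"
feature = parse_gff3_line(example)
print(feature.type, feature.attributes["ID"], qc_problems(feature))
```

Because every record has the same structure, checks like these can be run over
an entire annotation file without manual inspection. A production pipeline
would more likely rely on an established GFF3 validator or parser than on
hand-written code of this kind.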