W353: A Draft Assembly of the Almond Genome
Date: Saturday, January 9, 2016
Time: 10:51 AM
Room: Pacific Salon 3
Almond is one of the oldest cultivated nut crops with its origin in central and western Asia. The selection of the sweet type (Prunus dulcis)
distinguishes the domesticated almond from its bitter wild relatives.
Almond is economically important, especially in California, which has the highest
production worldwide, followed by Australia and Spain. The almond
belongs to the same subgenus as the peach, for which there already
exists a reference genome. However, to fully understand the genetic
underpinnings marking the key phenotypic differences between almond and
peach, we have sequenced the genome of the ‘Texas’ almond, one of the
traditional cultivars producing a sweet nut. Whole-genome shotgun
sequencing of Illumina paired-end libraries gave an initial
low-contiguity assembly of 512 Mbp, nearly double the estimated genome
size. Counting of k-mers indicates a 275 Mbp genome with substantial
heterozygosity as well as repetitive sequence. In order to tackle both
problems, we constructed a fosmid library and sequenced 68 pools of ~500
clones per pool. We then assembled the pools, merged them and finished
the assembly by scaffolding with paired-end and mate-pair libraries,
which resulted in a 240 Mbp assembly with a scaffold N50 of 500 kbp, a
contig N50 of 33.5 kbp and CEGMA completeness of 99%. Two thirds of the
assembly was anchored to the peach-almond genetic map, and using
re-sequencing data of peach-almond hybrids and their parents we inferred
the two haplotypes of the sequenced almond tree. We performed
additional validation of the assembly using Oxford Nanopore MinION
sequencing.
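
For readers unfamiliar with the contiguity metric quoted above, the following is a minimal Python sketch of how an N50 value is commonly computed, together with a rough estimate of the clone coverage implied by the fosmid pooling design. It is an illustration only, not the pipeline used for this assembly. The function names, the toy scaffold lengths, and the 40 kbp fosmid insert size are assumptions (the abstract does not state the insert size); the pool count, clones per pool, and 275 Mbp genome size are taken from the text.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def clone_coverage(pools, clones_per_pool, insert_bp, genome_bp):
    """Approximate physical coverage of the genome by pooled clones."""
    return pools * clones_per_pool * insert_bp / genome_bp


# Toy scaffold lengths (made up for illustration, not real data).
toy_scaffolds = [500, 400, 300, 200, 100]
toy_n50 = n50(toy_scaffolds)

# 68 pools of ~500 clones over a 275 Mbp genome (from the abstract);
# the 40 kbp insert size is an assumed, typical fosmid value.
fosmid_cov = clone_coverage(68, 500, 40_000, 275_000_000)

print(f"toy N50: {toy_n50}")
print(f"approximate fosmid clone coverage: {fosmid_cov:.1f}x")
```

Under these assumptions the pooled clones would span the genome roughly five times, while each individual pool of ~500 clones would cover only a small fraction of it, which is the property that makes pooled assembly helpful for separating haplotypes and repeats.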