The Pineapple Reference Genome: Telomere-to-Telomere Assembly, Manually Curated Annotation, and Comparative Analysis
Published: 22 Sep. 2024 Source: Journal of Integrative Plant Biology
Pineapple is the third most important tropical fruit worldwide and comprises five varieties. Genomes of several pineapple varieties have been released to date; however, none is complete: all contain substantial gaps, and together they represent only two of the five varieties. This significantly hinders pineapple breeding efforts.
In this study, we sequenced the genomes of three varieties: a wild pineapple variety, a fiber pineapple variety, and a globally cultivated edible pineapple variety. We constructed the first gap-free reference genome (Ref) for pineapple. By consolidating multiple sources of evidence and manually revising each gene structure annotation, we identified 26 656 protein-coding genes. The BUSCO evaluation indicated a completeness of 99.2%, supporting the high quality of the gene structure annotations in this genome. Using these resources, we identified 7 209 structural variations across the three varieties. Approximately 30.8% of pineapple genes were located within ±5 kb of a structural variation, including 30 genes associated with anthocyanin synthesis. Further analysis and functional experiments showed that high expression of AcMYB528 coincides with anthocyanin accumulation in the leaves, and that both may be affected by a 1.9-kb insertion.
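To make the "within ±5 kb of structural variations" figure concrete, the following minimal Python sketch shows one way such a proportion could be computed. It is not the pipeline used in the study. It assumes a gene counts as "near" a structural variation when its span overlaps the variation extended by 5 kb on each side, and the function names, coordinates, and gene IDs are all hypothetical toy values.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_windows(svs, flank):
    """Extend each SV by `flank` bp on both sides and merge overlaps per chromosome.

    svs: iterable of (chrom, start, end), 1-based inclusive coordinates.
    Returns {chrom: (sorted_starts, sorted_ends)} of disjoint windows.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in svs:
        by_chrom[chrom].append((max(1, start - flank), end + flank))

    merged = {}
    for chrom, windows in by_chrom.items():
        windows.sort()
        out = [list(windows[0])]
        for s, e in windows[1:]:
            if s <= out[-1][1]:          # overlaps the previous window
                out[-1][1] = max(out[-1][1], e)
            else:
                out.append([s, e])
        merged[chrom] = ([w[0] for w in out], [w[1] for w in out])
    return merged


def genes_near_svs(genes, svs, flank=5_000):
    """Return the IDs of genes whose span overlaps any flank-extended SV.

    genes: iterable of (gene_id, chrom, start, end), 1-based inclusive.
    """
    merged = merge_windows(svs, flank)
    hits = set()
    for gene_id, chrom, start, end in genes:
        if chrom not in merged:
            continue
        starts, ends = merged[chrom]
        # Last window starting at or before the gene end is the only candidate,
        # because the merged windows are sorted and disjoint.
        i = bisect_right(starts, end) - 1
        if i >= 0 and ends[i] >= start:
            hits.add(gene_id)
    return hits


# Toy data (hypothetical, not taken from the pineapple assembly).
toy_genes = [
    ("g1", "chr1", 1_000, 2_000),
    ("g2", "chr1", 20_000, 21_000),
    ("g3", "chr2", 500, 900),
]
toy_svs = [("chr1", 6_000, 6_500)]

near = genes_near_svs(toy_genes, toy_svs)
fraction = len(near) / len(toy_genes)
```

With real data, `genes` would come from the annotation (for example, gene features parsed from a GFF3 file) and `svs` from the structural-variation calls; the overlap definition and the flank size should match whatever the analysis actually specifies.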
In addition, we developed the Ananas Genome Database, which offers data browsing, retrieval, analysis, and download functions. This database addresses the lack of dedicated pineapple genome resource databases. In summary, we obtained a gap-free pineapple reference genome with high-quality gene structure annotations, providing a solid foundation for pineapple genomics and a valuable reference for pineapple breeding.