Gene sequences for more than 1,100 plant species have been released by an international consortium of nearly 200 plant scientists, the culmination of a nine-year research project.
The One Thousand Plant Transcriptomes Initiative, or 1KP, is a global collaboration to examine the diversification of plant species, genes and genomes across the more than one-billion-year history of green plants dating back to the ancestors of flowering plants and green algae.
“In the tree of life, everything is interrelated,” said Gane Ka-Shu Wong, lead investigator and professor at the University of Alberta. “If we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”
“A more robust estimate of the relationships among the major groups of green plants was one of the outcomes of this collaborative project,” Edger said. “This new phylogenetic framework, combined with gene family analyses, provided an opportunity to investigate how the genomes of green plants across various lineages have evolved over the past billion years.”
The findings, published in Nature, reveal the timing of whole genome duplications and the origins, expansions and contractions of gene families that contributed to the fundamental genetic innovations behind the evolution of green algae, mosses, ferns, conifer trees, flowering plants and all other green plant lineages. The history of how and when plants secured the ability to grow tall and make seeds, flowers and fruits provides a framework for understanding plant diversity around the planet, including annual crops and long-lived forest tree species.
“Our inferred relationships among living plant species inform us that over the billion years since an ancestral green algal species split into two separate evolutionary lineages, one including flowering plants, land plants and related algal groups and the other comprising a diverse array of green algae, plant evolution has been punctuated with innovations and periods of rapid diversification,” said James Leebens-Mack, professor at the University of Georgia Franklin College of Arts and Sciences and study co-author. “In order to link what we know about gene and genome evolution to a growing understanding of gene function in flowering plant, moss and algal organisms, we needed to generate new data to better reflect gene diversity among all green plant lineages.”
The study inspired a community effort to gather and sequence diverse plant lineages derived from terrestrial and aquatic habitats on a global scale. Over 100 taxonomic specialists contributed material from field and living collections, including the Central Collection of Algal Cultures; Royal Botanic Gardens, Kew; Royal Botanic Garden Edinburgh; Atlanta Botanical Garden; New York Botanical Garden; Fairylake Botanical Garden, Shenzhen; the Florida Museum of Natural History; Duke University; University of British Columbia Botanical Garden; and the University of Alberta. By sequencing and analyzing genes from a broad sampling of plant species, researchers are better able to reconstruct gene content in the ancestors of all crops and model plant species, and gain a more complete picture of the gene and genome duplications that enabled evolutionary innovations.
Nearly a decade ago, Wong organized private funding through the Somekh Family Foundation as well as support from the Government of Alberta and a sequencing commitment from BGI in Shenzhen, China, to launch 1KP. Once the project was operational, additional resources came from other ongoing projects, including iPlant (now CyVerse) funded by the National Science Foundation.
The massive scope of the project demanded development and refinement of new computational tools for sequence assembly and phylogenetic analysis.
“New algorithms were developed by software engineers at BGI to assemble the massive volume of gene sequence data generated for this project,” Wong said.
Tandy Warnow, Founder Professor of computer science at the University of Illinois at Urbana-Champaign, and Siavash Mirarab, of the University of California San Diego, developed new algorithms for inferring evolutionary relationships from hundreds of gene sequences for over one thousand species, addressing substantial heterogeneity in evolutionary histories across the genomes.
The timing of 244 whole genome duplications across the green plant tree of life was one of the interrelated research foci of the project.
In addition to genome duplications, the expansion of key gene families has contributed to the evolution of multicellularity and complexity in green plants.
The paper, “One Thousand Plant Transcriptomes and Phylogenomics of Green Plants,” was published in Nature (doi: 10.1038/s41586-019-1693-2). Sequences, sequence alignments and tree data are available through the CyVerse Data Commons.