
How a Sorghum Pangenome Reference is Revolutionizing Global Crop Improvement

A 33-member sorghum pangenome reference is changing how scientists discover valuable crop traits and accelerate breeding programs worldwide. This genomic resource, developed through international collaboration, captures the genetic diversity of sorghum, one of the world's most climate-resilient crops. By moving beyond a single reference genome, researchers can now identify previously hidden structural variations, understand complex domestication histories, and connect genetic diversity to important agronomic traits like drought tolerance and pest resistance. The pangenome approach provides tools for developing improved sorghum varieties that can better withstand climate challenges while meeting diverse agricultural needs across different regions.

The development of a comprehensive sorghum pangenome reference is a major advance in agricultural genomics and opens new opportunities for global crop improvement. Sorghum (Sorghum bicolor (L.) Moench) is one of the world's most climate-resilient and phenotypically diverse major crops, adapted to environmental stresses and variable agronomic practices across different regions. Traditional breeding approaches have often been constrained by limited genomic resources that fail to capture the full genetic diversity of this important crop. The new 33-member pangenome reference, detailed in a recent Nature publication, bridges this gap by providing researchers and breeders with tools to accelerate trait discovery and develop improved varieties tailored to local needs.

Figure: Sorghum field showing diverse botanical types and varieties

The Need for Pangenomic Resources in Crop Improvement

Modern agriculture faces unprecedented challenges from climate change, population growth, and the need for sustainable production systems. While the Green Revolution successfully adapted a handful of crops to industrialized agriculture, much of the global population still relies on locally produced crop cultivars grown by smallholder farms with low-input systems. This diversity of unhomogenized crops offers valuable raw materials for genetic improvement, but breeding efforts have often been constrained by highly specialized traits and breeding targets. The sorghum pangenome reference addresses these limitations by capturing the full spectrum of genetic variation across cultivated sorghum, enabling more effective breeding strategies that can benefit both commercial agriculture and small-scale producers.

Constructing the 33-Member Pangenome Reference

The research team constructed a pangenome reference comprising 33 carefully selected sorghum genotypes that span the crop's global diversity. The resource includes four genetic model genotypes: BTx623 V5 (the updated reference genome), the readily transformable RTx430, the stay-green and drought-tolerant BTx642, and the sweet sorghum variety Wray. The pangenome also incorporates 9 genotypes representing sorghum's genetic diversity and 20 cultivars crucial for global sorghum improvement programs. These include important lines like Macia and IRAT204 for breeding, CSM-63 and Mota Maradi for local adaptation, SC 35 as a stay-green line, SC 283 for aluminum tolerance, and SRN39 for resistance to the parasitic plant Striga hermonthica.

Figure: Research laboratory analyzing sorghum genomic data

Improved Reference Genome: BTx623 V5

The BTx623 cultivar has served as the reference genotype for sorghum genetics since 2009, with its 2013 V3 genome assembly remaining a crucial global resource. However, like many genome assemblies of its era, the BTx623 V3 assembly contained significant gaps of unknown sequence. The research team used long-read genome sequencing technologies to complete the repetitive regions that typically cause such gaps. The resulting BTx623 V5 genome assembly represents the 10 sorghum chromosomes with only 34 contigs, roughly 140-fold fewer than the 4,783 contigs used to assemble V3. This updated reference corrects several structural scaffolding errors in V3 and clarifies the positions of key genes underlying local adaptation among breeding gene pools, including the flowering time locus Maturity1 and the biosynthesis gene POR for the secondary metabolite dhurrin.
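The fold change quoted for the two assemblies follows directly from the contig counts. A minimal Python sketch of that arithmetic, using only the numbers given in the text (the function and variable names are illustrative and do not come from the publication):

```python
# Contig counts for the two BTx623 assemblies, as quoted in the article.
V3_CONTIGS = 4783
V5_CONTIGS = 34
CHROMOSOMES = 10


def fold_reduction(old: int, new: int) -> float:
    """Return how many times fewer contigs the newer assembly contains."""
    return old / new


fold = fold_reduction(V3_CONTIGS, V5_CONTIGS)
contigs_per_chromosome = V5_CONTIGS / CHROMOSOMES

print(f"fold reduction in contigs: {fold:.1f}")
print(f"mean contigs per chromosome in V5: {contigs_per_chromosome:.1f}")
```

The ratio comes out just above 140, and the V5 assembly averages between three and four contigs per chromosome.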

Diversity Panel and Global Representation

Complementing the pangenome reference, the research team established a diversity panel of 1,984 unique genotypes re-sequenced with high-coverage whole-genome short-read libraries. This panel spans substantial phenotypic variation for growth rate, biomass accumulation, flowering time, and stress responses. It represents existing and new populations of interest, including the Biomass Association Panel (BAP, n=375), the Transportation Energy Resources from Renewable Agriculture (TERRA) panel (n=220), breeding germplasm (n=500), traditional local landrace varieties (n=737), and 746 georeferenced genotypes with local collection coordinates. These groups overlap, so their counts sum to more than the panel total. The diversity panel contains members of all five major botanical types (n=807 botanical assignments), which have divergent morphological traits connected to local grower preferences.

Key Discoveries Enabled by the Pangenome

Complex Structural Variation in Domestication Genes

The pangenome reference revealed previously undocumented complex structural variation in important genes. Analysis of the domestication gene SHATTERING1 (SH1), which controls seed shattering, uncovered multiple nested and deeply diverged structural variants that reflect the previously established multicentric origin of sorghum. The research identified three major haplotypes: sh1-1 (typical of BTx623), sh1-2 (the most common haplotype), and sh1-3 (found in RTx430 and others). Notably, the team discovered a previously undocumented 7,856 bp insertion, found only in sh1-3, that includes an identical 2,161 bp segmental duplication. These structural variants were highly structured by both genetic subpopulation and botanical type, providing insights into sorghum's complex domestication history.

Figure: Structural variations in the sorghum SHATTERING1 gene

Dhurrin Biosynthesis Gene Cluster Analysis

The pangenome enabled detailed analysis of the biosynthetic gene cluster (BGC) for dhurrin, a cyanogenic glucoside that provides resistance to chewing insect herbivory and may improve dehydration avoidance. Genome-wide association studies revealed that the BGC contains one of the strongest and densest associations with dhurrin concentration. The research identified several strong regulatory candidate variants, including a 2 bp CT deletion that disrupted predicted binding sites for abscisic-acid-responsive transcription factors in an accessible chromatin region. Pangenome graph haplotype clustering showed that 32 of the 33 reference members fell neatly into four tight k-mer identity clusters, which were distinguished by previously typed short variants and several previously unknown large intergenic indels.
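The idea of grouping haplotypes by k-mer identity can be illustrated with a small Python sketch. The article does not describe the clustering procedure, so this is only a stand-in under stated assumptions: Jaccard similarity of k-mer sets as the identity measure, single-linkage merging above a threshold, and made-up toy sequences with hypothetical names (`hapA` to `hapD`).

```python
from itertools import combinations
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared k-mers divided by all k-mers seen in either set."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_by_identity(haplotypes: Dict[str, str], k: int,
                        threshold: float) -> List[Set[str]]:
    """Single-linkage clustering: link two haplotypes when their
    k-mer Jaccard similarity is at least `threshold`."""
    profiles = {name: kmers(seq, k) for name, seq in haplotypes.items()}
    parent = {name: name for name in haplotypes}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(haplotypes, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: Dict[str, Set[str]] = {}
    for name in haplotypes:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Toy sequences invented for illustration; not real sorghum haplotypes.
toy = {
    "hapA": "ACGTTGCAAGGCTTAC",
    "hapB": "ACGTTGCAAGGCTTAG",
    "hapC": "TTTTGGGGCCCCAAAA",
    "hapD": "TTTTGGGGCCCCAAAT",
}
clusters = cluster_by_identity(toy, k=4, threshold=0.8)
print(sorted(sorted(c) for c in clusters))
```

With these toy inputs, `hapA` and `hapB` differ by one base and group together, as do `hapC` and `hapD`, while the two pairs share no 4-mers. Real analyses work on pangenome graph paths and much longer k-mers.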

Applications for Global Breeding Programs

The sorghum pangenome reference provides essential tools for decentralized networks of regional breeding programs. By integrating local climate data, cultural preferences, and the genetic variants underlying important trait variation, breeding programs can develop varieties tailored to specific environments and end uses. The k-mer genotyping method developed through this research has been applied to other complex loci with structural variation, such as Low Germination Stimulant 1 (LGS1) for Striga resistance and Resistance to Melanaphis sorghi 1 (RMES1), illustrating the utility of this approach for marker development and crop improvement.
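The general principle behind k-mer genotyping can be sketched in a few lines of Python. This is not the method from the publication, whose details the article does not give. It assumes a simple scheme: find k-mers unique to each reference haplotype at a locus, then call the haplotype whose unique k-mers appear most often in a sample's reads. The sequences and the names `hap_1` and `hap_2` are hypothetical, and reverse-complement reads and sequencing errors are ignored.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def diagnostic_kmers(haplotypes: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """For each haplotype, keep only k-mers found in no other haplotype."""
    profiles = {name: kmers(seq, k) for name, seq in haplotypes.items()}
    diagnostic = {}
    for name, own in profiles.items():
        others: Set[str] = set()
        for other_name, other in profiles.items():
            if other_name != name:
                others |= other
        diagnostic[name] = own - others
    return diagnostic


def call_haplotype(reads: List[str], diagnostic: Dict[str, Set[str]],
                   k: int) -> Tuple[Optional[str], Counter]:
    """Count diagnostic k-mer hits per haplotype across all reads and
    return the best-supported haplotype (None if nothing matched)."""
    hits: Counter = Counter()
    for read in reads:
        read_kmers = kmers(read, k)
        for name, markers in diagnostic.items():
            hits[name] += len(read_kmers & markers)
    if not hits or max(hits.values()) == 0:
        return None, hits
    return hits.most_common(1)[0][0], hits


# Toy locus: hap_2 carries a short insertion relative to hap_1 (invented data).
locus = {
    "hap_1": "ACGTACGGATCCA",
    "hap_2": "ACGTACTTTTGGATCCA",
}
markers = diagnostic_kmers(locus, k=5)
sample_reads = ["GTACTTTTGG", "TTTGGATCC"]
call, hits = call_haplotype(sample_reads, markers, k=5)
print(call, dict(hits))
```

Because the toy reads span the insertion, all of their diagnostic hits fall on `hap_2`. A practical implementation would also need to handle both strands, sequencing depth, and heterozygous samples.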

Bridging Laboratory and Field Applications

One significant advantage of the pangenome resource is its ability to facilitate information transfer between genotypes used in breeding and those more amenable to laboratory experimentation. This is particularly important in sorghum, which has been highly recalcitrant to genome-editing methods. The pangenome effectively reconstructs putative functional alleles in breeding pedigrees and identifies orthologous sequences to target in transformable varieties, paving the way for accelerated pangenome-enabled traditional breeding and genome editing of locally adapted alleles across global sorghum germplasm.

Future Implications and Global Impact

The development of this sorghum pangenome reference establishes a necessary foundation for effective trait discovery using pangenomics and provides valuable community assets for describing global species diversity. As climate change intensifies environmental stresses on agricultural systems, such genomic resources become increasingly critical for developing resilient crop varieties. The approaches and methodologies developed through this research will not only accelerate breeding and trait discovery in sorghum but also provide a framework for comparable work in other crops facing similar challenges.

Figure: Global distribution of sorghum cultivation and genetic diversity

By capturing the full genetic diversity of sorghum and connecting it to observable agronomic performance, this pangenome resource moves beyond population-level inference to provide species-wide biological context. This comprehensive approach will enable more precise breeding decisions, faster development of improved varieties, and better adaptation of sorghum to diverse agricultural systems worldwide. As food security challenges mount under rapidly changing environmental conditions, such transformative advances in crop improvement speed and efficacy become essential for sustainable agricultural development.
