Progress of Genome-wide Association Studies in Genetic Improvement of Cucurbitaceae

XU Yingchao, LU Sen, ZHANG Sicheng, MENG Qitao, LIN Huijing, XUE Shudan, LIU Hongbiao, GUO Hanquan, FU Manqin, SONG Dongguang, ZHONG Yujuan

Chinese Agricultural Science Bulletin ›› 2023, Vol. 39 ›› Issue (22) : 23-33. DOI: 10.11924/j.issn.1000-6850.casb2022-0697


Abstract

To advance the application of genome-wide association studies (GWAS) in precisely mining the key genetic variation loci associated with important agronomic traits in cucurbit crops, this paper introduces the principles and statistical models of GWAS and outlines its advantages in identifying genetic variation loci in crop populations, studying plant metabolic mechanisms, and implementing precise genetic improvement strategies. It also systematically reviews recent advances of GWAS in the genetic improvement of the major cucurbit crops, including watermelon, melon, cucumber, and pumpkin, as well as other cucurbit crops. Finally, it provides an outlook on joint multi-omics analysis and databases in cucurbit breeding research, offering a basis and reference for the genetic improvement of cucurbit crops.

Key words

genome-wide association study / Cucurbitaceae / agronomic traits / single nucleotide polymorphism / genetic improvement / gene mapping

Cite this article

XU Yingchao, LU Sen, ZHANG Sicheng, MENG Qitao, LIN Huijing, XUE Shudan, LIU Hongbiao, GUO Hanquan, FU Manqin, SONG Dongguang, ZHONG Yujuan. Progress of Genome-wide Association Studies in Genetic Improvement of Cucurbitaceae. Chinese Agricultural Science Bulletin. 2023, 39(22): 23-33. https://doi.org/10.11924/j.issn.1000-6850.casb2022-0697

References

[1]
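WU Xiaoyi, CHAO Zhimao, LIU Haiping, et al. Research progress on sterol constituents of medicinal plants in Cucurbitaceae[J]. Natural product research and development, 2012, 24(S1):169-175. (in Chinese)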
[2]
ZAMUZ S, MUNEKATA P E S, GULLÓN B, et al. Citrullus lanatus as source of bioactive components: An up-to-date review[J]. Trends in food science & technology, 2021, 111:208-222.
[3]
SILVA M A, ALBUQUERQUE T G, ALVES R C, et al. Melon (Cucumis melo L.) by-products: Potential food ingredients for novel functional foods?[J]. Trends in food science & technology, 2020, 98:181-189.
[4]
KUMAR D, KUMAR S, SINGH J, et al. Free radical scavenging and analgesic activities of Cucumis sativus L. fruit extract[J]. Journal of young pharmacists, 2010, 2(4):365-368.
[5]
ABBAS H M K, HUANG H X, WANG A J, et al. Metabolic and transcriptomic analysis of two Cucurbita moschata germplasms throughout fruit development[J]. BMC Genomics, 2020, 21(1):365.
Pumpkins (Cucurbita moschata; Cucurbitaceae) are valued for their fruits and seeds and are rich in nutrients. Carotenoids and sugar contents, as main feature of pumpkin pulp, are used to determine the fruit quality.
[6]
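ZHOU Mengmeng, WANG Jianan, TIAN Libo, et al. Research progress on gene mapping of important traits in cucurbit crops[J]. Chinese journal of tropical crops, 2018, 39(3):606-613. (in Chinese)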
[7]
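LIU Li, JIAO Dingliang, GUO Min, et al. Construction of a watermelon genetic map and mapping of the gynoecious trait[J]. Journal of fruit science, 2010, 27(1):50-56. (in Chinese)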
[8]
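WANG Quan, CHEN Zhengwu, XING Jiajia, et al. Inheritance of a cucumber leaf color mutant and its linked molecular markers[J]. China cucurbits and vegetables, 2010, 23(4):3-5. (in Chinese)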
[9]
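WANG Xianlei, GAO Xingwang, LI Guan, et al. Construction of a melon genetic map and QTL analysis of fruit and seed traits[J]. Hereditas (Beijing), 2011, 33(12):1398-1408. (in Chinese)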
[10]
ARANZANA M J, KIM S, ZHAO K, et al. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes[J]. PLOS Genetics, 2005, 1(5):e60.
There is currently tremendous interest in the possibility of using genome-wide association mapping to identify genes responsible for natural variation, particularly for human disease susceptibility. The model plant Arabidopsis thaliana is in many ways an ideal candidate for such studies, because it is a highly selfing hermaphrodite. As a result, the species largely exists as a collection of naturally occurring inbred lines, or accessions, which can be genotyped once and phenotyped repeatedly. Furthermore, linkage disequilibrium in such a species will be much more extensive than in a comparable outcrossing species. We tested the feasibility of genome-wide association mapping in A. thaliana by searching for associations with flowering time and pathogen resistance in a sample of 95 accessions for which genome-wide polymorphism data were available. In spite of an extremely high rate of false positives due to population structure, we were able to identify known major genes for all phenotypes tested, thus demonstrating the potential of genome-wide association mapping in A. thaliana and other species with similar patterns of variation. The rate of false positives differed strongly between traits, with more clinal traits showing the highest rate. However, the false positive rates were always substantial regardless of the trait, highlighting the necessity of an appropriate genomic control in association studies.
[11]
KOU Y, LIAO Y, TOIVAINEN T, et al. Evolutionary genomics of structural variation in Asian rice (Oryza sativa) domestication[J]. Molecular biology and evolution, 2020, 37(12):3507-3524.
Structural variants (SVs) are a largely unstudied feature of plant genome evolution, despite the fact that SVs contribute substantially to phenotypes. In this study, we discovered SVs across a population sample of 347 high-coverage, resequenced genomes of Asian rice (Oryza sativa) and its wild ancestor (O. rufipogon). In addition to this short-read data set, we also inferred SVs from whole-genome assemblies and long-read data. Comparisons among data sets revealed different features of genome variability. For example, genome alignment identified a large (∼4.3 Mb) inversion in indica rice varieties relative to japonica varieties, and long-read analyses suggest that ∼9% of genes from the outgroup (O. longistaminata) are hemizygous. We focused, however, on the resequencing sample to investigate the population genomics of SVs. Clustering analyses with SVs recapitulated the rice cultivar groups that were also inferred from SNPs. However, the site-frequency spectrum of each SV type (which included inversions, duplications, deletions, translocations, and mobile element insertions) was skewed toward lower frequency variants than synonymous SNPs, suggesting that SVs may be predominantly deleterious. Among transposable elements, SINE and mariner insertions were found at especially low frequency. We also used SVs to study domestication by contrasting between rice and O. rufipogon. Cultivated genomes contained ∼25% more derived SVs and mobile element insertions than O. rufipogon, indicating that SVs contribute to the cost of domestication in rice. Peaks of SV divergence were enriched for known domestication genes, but we also detected hundreds of genes gained and lost during domestication, some of which were enriched for traits of agronomic interest.
[12]
ALONGE M, WANG X, BENOIT M, et al. Major impacts of widespread structural variation on gene expression and crop improvement in tomato[J]. Cell, 2020, 182(1):145-161.
Structural variants (SVs) underlie important crop improvement and domestication traits. However, resolving the extent, diversity, and quantitative impact of SVs has been challenging. We used long-read nanopore sequencing to capture 238,490 SVs in 100 diverse tomato lines. This panSV genome, along with 14 new reference assemblies, revealed large-scale intermixing of diverse genotypes, as well as thousands of SVs intersecting genes and cis-regulatory regions. Hundreds of SV-gene pairs exhibit subtle and significant expression changes, which could broadly influence quantitative trait variation. By combining quantitative genetics with genome editing, we show how multiple SVs that changed gene dosage and expression levels modified fruit flavor, size, and production. In the last example, higher order epistasis among four SVs affecting three related transcription factors allowed introduction of an important harvesting trait in modern tomato. Our findings highlight the underexplored role of SVs in genotype-to-phenotype relationships and their widespread importance and utility in crop improvement.
[13]
GUO S, ZHAO S, SUN H, et al. Resequencing of 414 cultivated and wild watermelon accessions identifies selection for fruit quality traits[J]. Nature genetics, 2019, 51(11):1616-1623.
Fruit characteristics of sweet watermelon are largely the result of human selection. Here we report an improved watermelon reference genome and whole-genome resequencing of 414 accessions representing all extant species in the Citrullus genus. Population genomic analyses reveal the evolutionary history of Citrullus, suggesting independent evolutions in Citrullus amarus and the lineage containing Citrullus lanatus and Citrullus mucosospermus. Our findings indicate that different loci affecting watermelon fruit size have been under selection during speciation, domestication and improvement. A non-bitter allele, arising in the progenitor of sweet watermelon, is largely fixed in C. lanatus. Selection for flesh sweetness started in the progenitor of C. lanatus and continues through modern breeding on loci controlling raffinose catabolism and sugar transport. Fruit flesh coloration and sugar accumulation might have co-evolved through shared genetic components including a sugar transporter gene. This study provides valuable genomic resources and sheds light on watermelon speciation and breeding history.
[14]
LI Q, LI H, HUANG W, et al. A chromosome-scale genome assembly of cucumber (Cucumis sativus L.)[J]. GigaScience, 2019, 8(6):giz072.
[15]
FU A, WANG Q, MU J, et al. Combined genomic, transcriptomic, and metabolomic analyses provide insights into chayote (Sechium edule) evolution and fruit development[J]. Horticulture research, 2021, 8(1):35.
Chayote (Sechium edule) is an agricultural crop in the Cucurbitaceae family that is rich in bioactive components. To enhance genetic research on chayote, we used Nanopore third-generation sequencing combined with Hi-C data to assemble a draft chayote genome. A chromosome-level assembly anchored on 14 chromosomes (N50 contig and scaffold sizes of 8.40 and 46.56 Mb, respectively) estimated the genome size as 606.42 Mb, which is large for the Cucurbitaceae, with 65.94% (401.08 Mb) of the genome comprising repetitive sequences; 28,237 protein-coding genes were predicted. Comparative genome analysis indicated that chayote and snake gourd diverged from sponge gourd and that a whole-genome duplication (WGD) event occurred in chayote at 25 ± 4 Mya. Transcriptional and metabolic analysis revealed genes involved in fruit texture, pigment, flavor, flavonoids, antioxidants, and plant hormones during chayote fruit development. The analysis of the genome, transcriptome, and metabolome provides insights into chayote evolution and lays the groundwork for future research on fruit and tuber development and genetic improvements in chayote.
[16]
LEWONTIN R C, KOJIMA K. The evolutionary dynamics of complex polymorphisms[J]. Evolution, 1960, 14(4):458-472.
[17]
GABRIEL S B, SCHAFFNER S F, NGUYEN H, et al. The structure of haplotype blocks in the human genome[J]. Science, 2002, 296:2225-2229.
Haplotype-based methods offer a powerful approach to disease gene mapping, based on the association between causal mutations and the ancestral haplotypes on which they arose. As part of The SNP Consortium Allele Frequency Projects, we characterized haplotype patterns across 51 autosomal regions (spanning 13 megabases of the human genome) in samples from Africa, Europe, and Asia. We show that the human genome can be parsed objectively into haplotype blocks: sizable regions over which there is little evidence for historical recombination and within which only a few common haplotypes are observed. The boundaries of blocks and specific haplotypes they contain are highly correlated across populations. We demonstrate that such haplotype frameworks provide substantial statistical power in association studies of common genetic variation across each region. Our results provide a foundation for the construction of a haplotype map of the human genome, facilitating comprehensive genetic association studies of human disease.
[18]
SEBASTIANI P, TIMOFEEV N, DWORKIS D A, et al. Genome-wide association studies and the genetic dissection of complex traits[J]. American journal of hematology, 2009, 84(8):504-515.
The availability of affordable high throughput technology for parallel genotyping has opened the field of genetics to genome-wide association studies (GWAS), and in the last few years hundreds of articles reporting results of GWAS for a variety of heritable traits have been published. What do these results tell us? Although GWAS have discovered a few hundred reproducible associations, this number is underwhelming in relation to the huge amount of data produced, and challenges the conjecture that common variants may be the genetic causes of common diseases. We argue that the massive amount of genetic data that result from these studies remains largely unexplored and unexploited because of the challenge of mining and modeling enormous data sets, the difficulty of using nontraditional computational techniques and the focus of accepted statistical analyses on controlling the false positive rate rather than limiting the false negative rate. In this article, we will review the common approach to analysis of GWAS data and then discuss options to learn more from these data. We will use examples from our ongoing studies of sickle cell anemia and also GWAS in multigenic traits.
[19]
JOSHI V, SHINDE S, NIMMAKAYALA P, et al. Haplotype networking of GWAS hits for citrulline variation associated with the domestication of watermelon[J]. International journal of molecular sciences, 2019, 20(21):5392.
Watermelon is a good source of citrulline, a non-protein amino acid. Citrulline has several therapeutic and clinical implications as it produces nitric oxide via arginine. In plants, citrulline plays a pivotal role in nitrogen transport and osmoprotection. The purpose of this study was to identify single nucleotide polymorphism (SNP) markers associated with citrulline metabolism using a genome-wide association study (GWAS) and understand the role of citrulline in watermelon domestication. A watermelon collection consisting of 187 wild, landraces, and cultivated accessions was used to estimate citrulline content. An association analysis involved a total of 12,125 SNPs with a minor allele frequency (MAF) >0.05 in understanding the population structure and phylogeny in light of citrulline accumulation. Wild egusi types and landraces contained low to medium citrulline content, whereas cultivars had higher content, which suggests that obtaining higher content of citrulline is a domesticated trait. GWAS analysis identified candidate genes (ferrochelatase and acetolactate synthase) showing a significant association of SNPs with citrulline content. Haplotype networking indicated positive selection from wild to domesticated watermelon. To our knowledge, this is the first study showing genetic regulation of citrulline variation in plants by using a GWAS strategy. These results provide new insights into the citrulline metabolism in plants and the possibility of incorporating high citrulline as a trait in watermelon breeding programs.
[20]
LIU D, DONG S, MIAO H, et al. A large-scale genomic association analysis identifies the candidate genes regulating salt tolerance in cucumber (Cucumis sativus L.) seedlings[J]. International journal of molecular sciences, 2022, 23(15):8260.
Salt stress seriously restricts plant growth and development, affects yield and quality, and thus becomes an urgent problem to be solved in cucumber stress resistance breeding. Mining salt tolerance genes and exploring the molecular mechanism of salt tolerance could accelerate the breeding of cucumber germplasm with excellent salt stress tolerance. In this study, 220 cucumber core accessions were used for Genome-Wide Association Studies (GWAS) and the identification of salt tolerance genes. The salinity injury index that was collected in two years showed significant differences among the core germplasm. A total of seven loci that were associated with salt tolerance in cucumber seedlings were repeatedly detected, which were located on Chr.2 (gST2.1), Chr.3 (gST3.1 and gST3.2), Chr.4 (gST4.1 and gST4.2), Chr.5 (gST5.1), and Chr.6 (gST6.1). Within these loci, 62 genes were analyzed, and 5 candidate genes (CsaV3_2G035120, CsaV3_3G023710, CsaV3_4G033150, CsaV3_5G023530, and CsaV3_6G009810) were predicted via the functional annotation of Arabidopsis homologous genes, haplotype of extreme salt-tolerant accessions, and qRT-PCR. These results provide a guide for further research on salt tolerance genes and molecular mechanisms of cucumber seedlings.
[21]
REDDY U K, NATARAJAN P, ABBURI V L, et al. What makes a giant fruit? Assembling a genomic toolkit underlying various fruit traits of the mammoth group of Cucurbita maxima[J]. Frontiers in genetics, 2022, 13:1005158.
[22]
PRICE A L, PATTERSON N J, PLENGE R M, et al. Principal components analysis corrects for stratification in genome-wide association studies[J]. Nature genetics, 2006, 38(8):904-909.
Population stratification (allele frequency differences between cases and controls due to systematic ancestry differences) can cause spurious associations in disease studies. We describe a method that enables explicit detection and correction of population stratification on a genome-wide scale. Our method uses principal components analysis to explicitly model ancestry differences between cases and controls. The resulting correction is specific to a candidate marker's variation in frequency across ancestral populations, minimizing spurious associations while maximizing power to detect true associations. Our simple, efficient approach can easily be applied to disease studies with hundreds of thousands of markers.
[23]
YU J M, PRESSOIR G, BRIGGS W H, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness[J]. Nature genetics, 2006, 38(2):203-208.
As population structure can result in spurious associations, it has constrained the use of association studies in human and plant genetics. Association mapping, however, holds great promise if true signals of functional association can be separated from the vast number of false signals generated by population structure. We have developed a unified mixed-model approach to account for multiple levels of relatedness simultaneously as detected by random genetic markers. We applied this new approach to two samples: a family-based sample of 14 human families, for quantitative gene expression dissection, and a sample of 277 diverse maize inbred lines with complex familial relationships and population structure, for quantitative trait dissection. Our method demonstrates improved control of both type I and type II error rates over other methods. As this new method crosses the boundary between family-based and structured association samples, it provides a powerful complement to currently available methods for association mapping.
[24]
KANG H M, ZAITLEN N A, WADE C M, et al. Efficient control of population structure in model organism association mapping[J]. Genetics, 2008, 178(3):1709-1723.
Genomewide association mapping in model organisms such as inbred mouse strains is a promising approach for the identification of risk factors related to human diseases. However, genetic association studies in inbred model organisms are confronted by the problem of complex population structure among strains. This induces inflated false positive rates, which cannot be corrected using standard approaches applied in human association studies such as genomic control or structured association. Recent studies demonstrated that mixed models successfully correct for the genetic relatedness in association mapping in maize and Arabidopsis panel data sets. However, the currently available mixed-model methods suffer from computational inefficiency. In this article, we propose a new method, efficient mixed-model association (EMMA), which corrects for population structure and genetic relatedness in model organism association mapping. Our method takes advantage of the specific nature of the optimization problem in applying mixed models for association mapping, which allows us to substantially increase the computational speed and reliability of the results. We applied EMMA to in silico whole-genome association mapping of inbred mouse strains involving hundreds of thousands of SNPs, in addition to Arabidopsis and maize data sets. We also performed extensive simulation studies to estimate the statistical power of EMMA under various SNP effects, varying degrees of population structure, and differing numbers of multiple measurements per strain. Despite the limited power of inbred mouse association mapping due to the limited number of available inbred strains, we are able to identify significantly associated SNPs, which fall into known QTL or genes identified through previous studies while avoiding an inflation of false positives. An R package implementation and webserver of our EMMA method are publicly available.
[25]
KANG H M, SUL J H, SERVICE S K, et al. Variance component model to account for sample structure in genome-wide association studies[J]. Nature genetics, 2010, 42(4):348-354.
Although genome-wide association studies (GWASs) have identified numerous loci associated with complex traits, imprecise modeling of the genetic relatedness within study samples may cause substantial inflation of test statistics and possibly spurious associations. Variance component approaches, such as efficient mixed-model association (EMMA), can correct for a wide range of sample structures by explicitly accounting for pairwise relatedness between individuals, using high-density markers to model the phenotype distribution; but such approaches are computationally impractical. We report here a variance component approach implemented in publicly available software, EMMA eXpedited (EMMAX), that reduces the computational time for analyzing large GWAS data sets from years to hours. We apply this method to two human GWAS data sets, performing association analysis for ten quantitative traits from the Northern Finland Birth Cohort and seven common diseases from the Wellcome Trust Case Control Consortium. We find that EMMAX outperforms both principal component analysis and genomic control in correcting for sample structure.
[26]
AULCHENKO Y S, DE KONING D J, HALEY C. Genomewide rapid association using mixed model and regression: A fast and simple method for genomewide pedigree-based quantitative trait loci association analysis[J]. Genetics, 2007, 177(1):577-585.
For pedigree-based quantitative trait loci (QTL) association analysis, a range of methods utilizing within-family variation such as transmission-disequilibrium test (TDT)-based methods have been developed. In scenarios where stratification is not a concern, methods exploiting between-family variation in addition to within-family variation, such as the measured genotype (MG) approach, have greater power. Application of MG methods can be computationally demanding (especially for large pedigrees), making genomewide scans practically infeasible. Here we suggest a novel approach for genomewide pedigree-based quantitative trait loci (QTL) association analysis: genomewide rapid association using mixed model and regression (GRAMMAR). The method first obtains residuals adjusted for family effects and subsequently analyzes the association between these residuals and genetic polymorphisms using rapid least-squares methods. At the final step, the selected polymorphisms may be followed up with the full measured genotype (MG) analysis. In a simulation study, we compared type 1 error, power, and operational characteristics of the proposed method with those of MG and TDT-based approaches. For moderately heritable (30%) traits in human pedigrees the power of the GRAMMAR and the MG approaches is similar and is much higher than that of TDT-based approaches. When using tabulated thresholds, the proposed method is less powerful than MG for very high heritabilities and pedigrees including large sibships like those observed in livestock pedigrees. However, there is little or no difference in empirical power of MG and the proposed method. In any scenario, GRAMMAR is much faster than MG and enables rapid analysis of hundreds of thousands of markers.
[27]
LIPPERT C, LISTGARTEN J, LIU Y, et al. FaST linear mixed models for genome-wide association studies[J]. Nature methods, 2011, 8(10):833-835.
We describe factored spectrally transformed linear mixed models (FaST-LMM), an algorithm for genome-wide association studies (GWAS) that scales linearly with cohort size in both run time and memory use. On Wellcome Trust data for 15,000 individuals, FaST-LMM ran an order of magnitude faster than current efficient algorithms. Our algorithm can analyze data for 120,000 individuals in just a few hours, whereas current algorithms fail on data for even 20,000 individuals (http://mscompbio.codeplex.com/).
[28]
ZHOU X, STEPHENS M. Genome-wide efficient mixed-model analysis for association studies[J]. Nature genetics, 2012, 44(7):821.
Linear mixed models have attracted considerable attention recently as a powerful and effective tool for accounting for population stratification and relatedness in genetic association tests. However, existing methods for exact computation of standard test statistics are computationally impractical for even moderate-sized genome-wide association studies. To address this issue, several approximate methods have been proposed. Here, we present an efficient exact method, which we refer to as genome-wide efficient mixed-model association (GEMMA), that makes approximations unnecessary in many contexts. This method is approximately n times faster than the widely used exact method known as efficient mixed-model association (EMMA), where n is the sample size, making exact genome-wide association analysis computationally practical for large numbers of individuals.
[29]
ZHANG Z, ERSOZ E, LAI C Q, et al. Mixed linear model approach adapted for genome-wide association studies[J]. Nature genetics, 2010, 42(4):355.
[30]
WANG Q, TIAN F, PAN Y, et al. A SUPER powerful method for genome wide association study[J]. Plos one, 2014, 9(9):e107684.
[31]
HUANG M, LIU X, ZHOU Y, et al. BLINK: A package for the next level of genome-wide association studies with both individuals and markers in the millions[J]. GigaScience, 2019, 8(2).
[32]
YANG N, LU Y, YANG X, et al. Genome wide association studies using a new nonparametric model reveal the genetic architecture of 17 agronomic traits in an enlarged maize association panel[J]. PLOS Genetics, 2014, 10(9):e1004573.
[33]
BEAUMONT M A, RANNALA B. The Bayesian revolution in genetics[J]. Nature reviews genetics, 2004, 5(4):251-261.
[34]
XIAO Y, LIU H, WU L, et al. Genome-wide association studies in maize: Praise and stargaze[J]. Molecular plant, 2017, 10(3):359-374.
Genome-wide association study (GWAS) has become a widely accepted strategy for decoding genotype-phenotype associations in many species thanks to advances in next-generation sequencing (NGS) technologies. Maize is an ideal crop for GWAS and significant progress has been made in the last decade. This review summarizes current GWAS efforts in maize functional genomics research and discusses future prospects in the omics era. The general goal of GWAS is to link genotypic variations to corresponding differences in phenotype using the most appropriate statistical model in a given population. The current review also presents perspectives for optimizing GWAS design and analysis. GWAS analysis of data from RNA, protein, and metabolite-based omics studies is discussed, along with new models and new population designs that will identify causes of phenotypic variation that have been hidden to date. The joint and continuous efforts of the whole community will enhance our understanding of maize quantitative traits and boost crop molecular breeding designs.
[35]
NIMMAKAYALA P, LEVI A, ABBURI L, et al. Single nucleotide polymorphisms generated by genotyping by sequencing to characterize genome-wide diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon[J]. BMC genomics, 2014, 15:767.
Background: A large single nucleotide polymorphism (SNP) dataset was used to analyze genome-wide diversity in a diverse collection of watermelon cultivars representing globally cultivated, watermelon genetic diversity. The marker density required for conducting successful association mapping depends on the extent of linkage disequilibrium (LD) within a population. Use of genotyping by sequencing reveals large numbers of SNPs that in turn generate opportunities in genome-wide association mapping and marker-assisted selection, even in crops such as watermelon for which few genomic resources are available. In this paper, we used genome-wide genetic diversity to study LD, selective sweeps, and pairwise F-ST distributions among worldwide cultivated watermelons to track signals of domestication. Results: We examined 183 Citrullus lanatus var. lanatus accessions representing domesticated watermelon and generated a set of 11,485 SNP markers using genotyping by sequencing. With a diverse panel of worldwide cultivated watermelons, we identified a set of 5,254 SNPs with a minor allele frequency of >= 0.05, distributed across the genome. All ancestries were traced to Africa and an admixture of various ancestries constituted secondary gene pools across various continents. A sliding window analysis using pairwise FST values was used to resolve selective sweeps. We identified strong selection on chromosomes 3 and 9 that might have contributed to the domestication process. Pairwise analysis of adjacent SNPs within a chromosome as well as within a haplotype allowed us to estimate genome-wide LD decay. LD was also detected within individual genes on various chromosomes. Principal component and ancestry analyses were used to account for population structure in a genome-wide association study. We further mapped important genes for soluble solid content using a mixed linear model. 
Conclusions: Information concerning the SNP resources, population structure, and LD developed in this study will help in identifying agronomically important candidate genes from the genomic regions underlying selection and for mapping quantitative trait loci using a genome-wide association study in sweet watermelon.
[36]
YAGCIOGLU M, GULSEN O, YETISIR H, et al. Preliminary studies of genome-wide association mapping for some selected morphological characters of watermelons[J]. Scientia horticulturae, 2016, 210:277-284.
[37]
DOU J, ZHAO S, LU X, et al. Genetic mapping reveals a candidate gene (ClFS1) for fruit shape in watermelon (Citrullus lanatus L.)[J]. Theoretical and applied genetics, 2018, 131(4):947-958.
[38]
AGUADO E, GARCIA A, IGLESIAS-MOYA J, et al. Mapping a partial Andromonoecy locus in Citrullus lanatus using BSA-Seq and GWAS approaches[J]. Frontiers in plant science, 2020, 11:1243.
[39]
LIU S, GAO P, ZHU Q, et al. Resequencing of 297 melon accessions reveals the genomic history of improvement and loci related to fruit traits in melon[J]. Plant biotechnology journal, 2020, 18(12):2545-2558.
[40]
NIMMAKAYALA P, TOMASON Y R, ABBURI V L, et al. Genome-wide differentiation of various melon horticultural groups for use in GWAS for fruit firmness and construction of a high resolution genetic map[J]. Frontiers in plant science, 2016, 7:1437.
Melon (Cucumis melo L.) is a phenotypically diverse eudicot diploid (2n = 2x = 24) that has climacteric and non-climacteric morphotypes and shows wide variation for fruit firmness, an important trait for transportation and shelf life. We generated 13,789 SNP markers using genotyping-by-sequencing (GBS) and anchored them to chromosomes to understand genome-wide fixation indices (FST) between various melon morphotypes and genome-wide linkage disequilibrium (LD) decay. The FST between accessions of [...] and [...] was 0.23. The FST between [...] and various [...] accessions was in a range of 0.19-0.53 and between [...] and [...] accessions was in a range of 0.21-0.59, indicating sporadic to wide-ranging introgression. The EM (Expectation Maximization) algorithm was used for estimation of 1436 haplotypes. Average genome-wide LD decay for the melon genome was noted to be 9.27 Kb. In the current research, we focused on the genome-wide divergence underlying diverse melon horticultural groups. A high-resolution genetic map with 7153 loci was constructed. Genome-wide segregation distortion and recombination rate across various chromosomes were characterized. Melon has climacteric and non-climacteric morphotypes and wide variation for fruit firmness, a very important trait for transportation and shelf life. Various levels of QTLs were identified with high to moderate stringency and linked to fruit firmness using both genome-wide association study (GWAS) and biparental mapping. Gene annotation revealed some of the SNPs are located in β-D-xylosidase, glyoxysomal malate synthase, chloroplastic anthranilate phosphoribosyltransferase, and histidine kinase, the genes that were previously characterized for fruit ripening and softening in other crops.
[41]
HOU J, ZHOU Y F, GAO L Y, et al. Dissecting the genetic architecture of melon chilling tolerance at the seedling stage by association mapping and identification of the elite alleles[J]. Frontiers in plant science, 2018, 9:1577.
Low temperature is an important abiotic stress that negatively affects morphological growth and fruit development in melon (Cucumis melo L.). Chilling stress at the seedling stage causes seedling injury and poor stand establishment, prolonging vegetative growth and delaying fruit harvest. In this study, association mapping was performed for chilling tolerance at the seedling stage on an expanded melon core collection containing 212 diverse accessions by 272 SSRs and 27 CAPSs. Chilling tolerance of the melon seedlings was evaluated by calculating the chilling injury index (CII) in 2016 and 2017. Genetic diversity analysis of the whole accession panel presented two main groups, which corresponded to the two subspecies of C. melo, melo, and agrestis. Both the subspecies were sensitive to chilling but with agrestis being more tolerant. Genome-wide association study (GWAS) was conducted, respectively, on the whole panel and the two subspecies, totally detecting 51 loci that contributed to 74 marker-trait associations. Of these associations, 35 were detected in the whole panel, 21 in melo, and 18 in agrestis. About half of the associations identified in the two subspecies were also observed in the whole panel, and seven associations were shared by both the subspecies. CMCT505_Chr. 1 was repeatedly detected in different populations with high phenotypic contribution and could be a key locus controlling chilling tolerance in C. melo. Nine loci were selected for evaluation of the phenotypic effects related to their alleles, which identified 11 elite alleles contributing to seedling chilling tolerance. Four such alleles existed in both the subspecies and six in either of the two subspecies. Analysis of 20 parental combinations for their allelic status and phenotypic values showed that the elite alleles collectively contributed to enhancement of the chilling tolerance. 
Tagging the loci responsible for chilling tolerance may simultaneously favor dissecting the complex adaptability traits and elevate the efficiency to improve chilling tolerance using marker-assisted selection in melon.
[42]
PEREIRA L, SANTO DOMINGO M, RUGGIERI V, et al. Genetic dissection of climacteric fruit ripening in a melon population segregating for ripening behavior[J]. Horticulture research, 2020, 7(1):187.
Melon is an alternative model to understand fruit ripening due to the coexistence of climacteric and non-climacteric varieties within the same species, allowing the study of the processes that regulate this complex trait with genetic approaches. We phenotyped a population of recombinant inbred lines (RILs), obtained by crossing a climacteric (Védrantais, cantalupensis type) and a non-climacteric variety (Piel de Sapo T111, inodorus type), for traits related to climacteric maturation and ethylene production. Individuals in the RIL population exhibited various combinations of phenotypes that differed in the amount of ethylene produced, the early onset of ethylene production, and other phenotypes associated with ripening. We characterized a major QTL on chromosome 8, ETHQV8.1, which is sufficient to activate climacteric ripening, and other minor QTLs that may modulate the climacteric response. The ETHQV8.1 allele was validated by using two reciprocal introgression line populations generated by crossing Védrantais and Piel de Sapo and analyzing the ETHQV8.1 region in each of the genetic backgrounds. A genome-wide association study (GWAS) using 211 accessions of the ssp. melo further identified two regions on chromosome 8 associated with the production of aromas, one of these regions overlapping with the 154.1 kb interval containing ETHQV8.1. The ETHQV8.1 region contains several candidate genes that may be related to fruit ripening. This work sheds light on the regulation mechanisms of a complex trait such as fruit ripening.
[43]
KISHOR D S, NOH Y, SONG W H, et al. SNP marker assay and candidate gene identification for sex expression via genotyping-by-sequencing-based genome-wide associations (GWAS) analyses in Oriental melon (Cucumis melo L. var. makuwa)[J]. Scientia horticulturae, 2021, 276:109711.
[44]
WANG X, BAO K, REDDY U K, et al. The USDA cucumber (Cucumis sativus L.) collection: Genetic diversity, population structure, genome-wide association studies, and core collection development[J]. Horticulture research, 2018, 5:64.
[45]
BO K, WEI S, WANG W, et al. QTL mapping and genome-wide association study reveal two novel loci associated with green flesh color in cucumber[J]. BMC plant biology, 2019, 19:243.
Green flesh color, resulting from the accumulation of chlorophyll, is one of the most important commercial traits for the fruits. The genetic network regulating green flesh formation has been studied in tomato, melon and watermelon. However, little is known about the inheritance and molecular basis of green flesh in cucumber. This study sought to determine the main genomic regions associated with green flesh. Three F2 and two BC1 populations derived from the 9110Gt (cultivated cucumber, green flesh color) and PI183967 (wild cucumber, white flesh color) were used for the green flesh genetic analysis. Two of the F2 populations were further employed for map construction and quantitative trait loci (QTL) study. Also, a core cucumber germplasm population was used for the GWAS analysis. We identified three indexes, flesh color (FC), flesh extract color (FEC) and flesh chlorophyll content (FCC), in three environments. Genetic analysis indicated that green flesh color in 9110Gt is controlled by a major-effect QTL. We developed two genetic maps with 192 and 174 microsatellite markers respectively. Two novel inversions in Chr1 were identified between cultivated and wild cucumbers. The major-effect QTL, qgf5.1, was identified using the FC, FEC and FCC indexes in all the environments used. In addition, the same qgf5.1, together with qgf3.1, was identified via GWAS. Further investigation of two candidate regions using pairwise LD correlations, combined with the genetic diversity of qgf5.1 in natural populations, indicated that Csa5G021320 is the candidate gene of qgf5.1. Geographical distribution revealed that green flesh color formation could be due to high latitude, which has a longer day time for photosynthesis and chlorophyll synthesis during cucumber domestication and evolution. We first reported that cucumber green flesh color is a quantitative trait. We detected two novel loci, qgf5.1 and qgf3.1, which regulate green flesh formation in cucumber.
The QTL mapping and GWAS approaches identified several candidate genes for further validation using functional genomics or forward genetics approaches. Findings from the present study provide a new insight into the genetic control of green flesh in cucumber.
[46]
LEE H Y, KIM J G, KANG B C, et al. Assessment of the genetic diversity of the breeding lines and a genome wide association study of three horticultural traits using worldwide cucumber (Cucumis spp.) germplasm collection[J]. Agronomy-Basel, 2020, 10(11):1736.
[47]
ALAVILLI H, LEE J J, YOU C R, et al. GWAS reveals a novel candidate gene CmoAP2/ERF in pumpkin (Cucurbita moschata) involved in resistance to powdery mildew[J]. International journal of molecular sciences, 2022, 23(12):6524.
Pumpkin (Cucurbita moschata Duchesne ex Poir.) is a multipurpose cash crop rich in antioxidants, minerals, and vitamins; the seeds are also a good source of quality oils. However, pumpkin is susceptible to the fungus Podosphaera xanthii, an obligate biotrophic pathogen, which usually causes powdery mildew (PM) on both sides of the leaves and reduces photosynthesis. The fruits of infected plants are often smaller than usual and unpalatable. This study identified a novel gene that involves PM resistance in pumpkins through a genome-wide association study (GWAS). The allelic variation identified in the CmoCh3G009850 gene encoding for AP2-like ethylene-responsive transcription factor (CmoAP2/ERF) was proven to be involved in PM resistance. Validation of the GWAS data revealed six single nucleotide polymorphism (SNP) variations in the CmoAP2/ERF coding sequence between the resistant (IT 274039 [PMR]) and the susceptible (IT 278592 [PMS]). A polymorphic marker (dCAPS) was developed based on the allelic diversity to differentiate these two haplotypes. Genetic analysis in the segregating population derived from PMS and PMR parents provided evidence for an incomplete dominant gene-mediated PM resistance. Further, the qRT-PCR assay validated the elevated expression of CmoAP2/ERF during PM infection in the PMR compared with PMS. These results highlighted the pivotal role of CmoAP2/ERF in conferring resistance to PM and identifies it as a valuable molecular entity for breeding resistant pumpkin cultivars.
[48]
WU X, XU P, WU X, et al. Genome-wide association analysis of free glutamate content, a key factor conferring umami taste in the bottle gourd [Lagenaria siceraria (Mol. ) Standl.][J]. Scientia horticulturae, 2017, 225:795-801.
[49]
NEALE B M. Statistical genetics: Gene mapping through linkage and association[M]. New York: Taylor & Francis Group, 2008:311-317.
[50]
LU X M, ZHANG N, CHEN J F, et al. Research progress of gene pyramiding breeding in crops[J]. Molecular plant breeding, 2017, 15(4):1445-1454. (in Chinese)
[51]
CORTES L T, ZHANG Z, YU J. Status and prospects of genome-wide association studies in plants[J]. Plant genome, 2021, 14(1):e20077.
[52]
LISEC J, MEYER R C, STEINFATH M, et al. Identification of metabolic and biomass QTL in Arabidopsis thaliana in a parallel analysis of RIL and IL populations[J]. The plant journal, 2008, 53(6):960-972.
[53]
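GAO L, LIU W G, ZHAO S J, et al. Research progress on molecular markers and mapping of fruit quality traits in cucurbit crops[J]. China cucurbits and vegetables, 2014, 27(2):1-7. (in Chinese)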
[54]
RIEDELSHEIMER C, LISEC J, CZEDIK-EYSENBERG A, et al. Genome-wide association mapping of leaf metabolic profiles for dissecting complex traits in maize[J]. Proceedings of the national academy of sciences, 2012, 109(23):8872-8877.
The diversity of metabolites found in plants is by far greater than in most other organisms. Metabolic profiling techniques, which measure many of these compounds simultaneously, enabled investigating the regulation of metabolic networks and proved to be useful for predicting important agronomic traits. However, little is known about the genetic basis of metabolites in crops such as maize. Here, a set of 289 diverse maize inbred lines was genotyped with 56,110 SNPs and assayed for 118 biochemical compounds in the leaves of young plants, as well as for agronomic traits of mature plants in field trials. Metabolite concentrations had on average a repeatability of 0.73 and showed a correlation pattern that largely reflected their functional grouping. Genome-wide association mapping with correction for population structure and cryptic relatedness identified for 26 distinct metabolites strong associations with SNPs, explaining up to 32.0% of the observed genetic variance. On nine chromosomes, we detected 15 distinct SNP–metabolite associations, each of which explained more than 15% of the genetic variance. For lignin precursors, including p-coumaric acid and caffeic acid, we found strong associations (P values … to …) with a region on chromosome 9 harboring cinnamoyl-CoA reductase, a key enzyme in monolignol synthesis and a target for improving the quality of lignocellulosic biomass by genetic engineering approaches. Moreover, lignin precursors correlated significantly with lignin content, plant height, and dry matter yield, suggesting that metabolites represent promising connecting links for narrowing the genotype–phenotype gap of complex agronomic traits.
[55]
CHEN W, GAO Y, XIE W, et al. Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism[J]. Nature genetics, 2014, 46(7):714-721.
Plant metabolites are important to world food security in terms of maintaining sustainable yield and providing food with enriched phytonutrients. Here we report comprehensive profiling of 840 metabolites and a further metabolic genome-wide association study based on ∼6.4 million SNPs obtained from 529 diverse accessions of Oryza sativa. We identified hundreds of common variants influencing numerous secondary metabolites with large effects at high resolution. We observed substantial heterogeneity in the natural variation of metabolites and their underlying genetic architectures among different subspecies of rice. Data mining identified 36 candidate genes modulating levels of metabolites that are of potential physiological and nutritional importance. As a proof of concept, we functionally identified or annotated five candidate genes influencing metabolic traits. Our study provides insights into the genetic and biochemical bases of rice metabolome variation and can be used as a powerful complementary tool to classical phenotypic trait mapping for rice improvement.
[56]
SAUVAGE C, SEGURA V, BAUCHET G, et al. Genome-wide association in tomato reveals 44 candidate loci for fruit metabolic traits[J]. Plant physiology, 2014, 165(3):1120-1132.
Genome-wide association studies have been successful in identifying genes involved in polygenic traits and are valuable for crop improvement. Tomato (Solanum lycopersicum) is a major crop and is highly appreciated worldwide for its health value. We used a core collection of 163 tomato accessions composed of S. lycopersicum, S. lycopersicum var cerasiforme, and Solanum pimpinellifolium to map loci controlling variation in fruit metabolites. Fruits were phenotyped for a broad range of metabolites, including amino acids, sugars, and ascorbate. In parallel, the accessions were genotyped with 5,995 single-nucleotide polymorphism markers spread over the whole genome. Genome-wide association analysis was conducted on a large set of metabolic traits that were stable over 2 years using a multilocus mixed model as a general method for mapping complex traits in structured populations and applied to tomato. We detected a total of 44 loci that were significantly associated with a total of 19 traits, including sucrose, ascorbate, malate, and citrate levels. These results not only provide a list of candidate loci to be functionally validated but also a powerful analytical approach for finding genetic variants that can be directly used for crop improvement and deciphering the genetic architecture of complex traits.
[57]
COHEN S, TZURI G, HAREL-BEJA R, et al. Co-mapping studies of QTLs for fruit acidity and candidate genes of organic acid metabolism and proton transport in sweet melon (Cucumis melo L.)[J]. Theoretical and applied genetics, 2012, 125(2):343-353.
[58]
MAYOBRE C, PEREIRA L, ELTAHIRI A, et al. Genetic dissection of aroma biosynthesis in melon and its relationship with climacteric ripening[J]. Food chemistry, 2021, 353:129484.
[59]
LIU B. "Empirical breeding" is gradually moving into "precision breeding"[J]. Beijing agriculture, 2010(5):52-53. (in Chinese)
[60]
COBB J N, BISWAS P S, PLATTEN J D. Back to the future: Revisiting MAS as a tool for modern plant breeding[J]. Theoretical and applied genetics, 2019, 132(3):647-667.
New models for integration of major gene MAS with modern breeding approaches stand to greatly enhance the reliability and efficiency of breeding, facilitating the leveraging of traditional genetic diversity. Genetic diversity is well recognised as contributing essential variation to crop breeding processes, and marker-assisted selection is cited as the primary tool to bring this diversity into breeding programs without the associated genetic drag from otherwise poor-quality genomes of donor varieties. However, implementation of marker-assisted selection techniques remains a challenge in many breeding programs worldwide. Many factors contribute to this lack of adoption, such as uncertainty in how to integrate MAS with traditional breeding processes, lack of confidence in MAS as a tool, and the expense of the process. However, developments in genomics tools, locus validation techniques, and new models for how to utilise QTLs in breeding programs stand to address these issues. Marker-assisted forward breeding needs to be enabled through the identification of robust QTLs, the design of reliable marker systems to select for these QTLs, and the delivery of these QTLs into elite genomic backgrounds to enable their use without associated genetic drag. To enhance the adoption and effectiveness of MAS, rice is used as an example of how to integrate new developments and processes into a coherent, efficient strategy for utilising genetic variation. When processes are instituted to address these issues, new genes can be rolled out into a breeding program rapidly and completely with a minimum of expense.
[61]
KNOTT G J, DOUDNA J A. CRISPR-Cas guides the future of genetic engineering[J]. Science, 2018, 361:866-869.
The diversity, modularity, and efficacy of CRISPR-Cas systems are driving a biotechnological revolution. RNA-guided Cas enzymes have been adopted as tools to manipulate the genomes of cultured cells, animals, and plants, accelerating the pace of fundamental research and enabling clinical and agricultural breakthroughs. We describe the basic mechanisms that set the CRISPR-Cas toolkit apart from other programmable gene-editing technologies, highlighting the diverse and naturally evolved systems now functionalized as biotechnologies. We discuss the rapidly evolving landscape of CRISPR-Cas applications, from gene editing to transcriptional regulation, imaging, and diagnostics. Continuing functional dissection and an expanding landscape of applications position CRISPR-Cas tools at the cutting edge of nucleic acid manipulation that is rewriting biology.
[62]
HARJES C E, ROCHEFORD T R, BAI L, et al. Natural genetic variation in lycopene epsilon cyclase tapped for maize biofortification[J]. Science, 2008, 319:330-333.
Dietary vitamin A deficiency causes eye disease in 40 million children each year and places 140 to 250 million at risk for health disorders. Many children in sub-Saharan Africa subsist on maize-based diets. Maize displays considerable natural variation for carotenoid composition, including vitamin A precursors alpha-carotene, beta-carotene, and beta-cryptoxanthin. Through association analysis, linkage mapping, expression analysis, and mutagenesis, we show that variation at the lycopene epsilon cyclase (lcyE) locus alters flux down alpha-carotene versus beta-carotene branches of the carotenoid pathway. Four natural lcyE polymorphisms explained 58% of the variation in these two branches and a threefold difference in provitamin A compounds. Selection of favorable lcyE alleles with inexpensive molecular markers will now enable developing-country breeders to more effectively produce maize grain with higher provitamin A levels.
[63]
YAN J, KANDIANIS C B, HARJES C E, et al. Rare genetic variation at Zea mays crtRB1 increases beta-carotene in maize grain[J]. Nature genetics, 2010, 42(4):322-327.
[64]
FIEDLER J L, AFIDRA R, MUGAMBI G, et al. Maize flour fortification in Africa: Markets, feasibility, coverage, and costs[J]. Annals of the New York Academy of Sciences, 2014(1):26-39.
[65]
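TANG F F, XU F F, BAO J S. Application of genome-wide association studies in rice genetics and breeding[J]. Journal of nuclear agricultural sciences, 2013, 27(5):598-606. (in Chinese)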
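Association analysis is an effective tool for dissecting the genetic basis of phenotypic diversity in crops and an important means of mining favorable alleles, and it plays an increasingly important role in crop genetics and breeding. With continuous improvement in technologies for discovering single nucleotide polymorphism (SNP) markers covering the whole genome, genome-wide association analysis based on linkage disequilibrium (LD) offers a new approach to studying complex crop traits such as agronomic, quality, yield, and resistance traits. After systematically introducing the methods of genome-wide association analysis, this paper summarizes in detail its research progress in rice genetics and breeding and discusses its potential problems and possible solutions.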
[66]
PHAN N T, SIM S C. Genomic tools and their implications for vegetable breeding[J]. Horticultural science & technology, 2017, 35(2):149-164.
[67]
WU S, WANG X, REDDY U, et al. Genome of 'Charleston Gray', the principal American watermelon cultivar, and genetic characterization of 1,365 accessions in the U. S. National Plant Germplasm System watermelon collection[J]. Plant biotechnology journal, 2019, 17(12):2246-2258.
[68]
DOU J, LU X, ALI A, et al. Genetic mapping reveals a marker for yellow skin in watermelon (Citrullus lanatus L.)[J]. Plos one, 2018, 13(9):e0200617.
[69]
LEGENDRE R, KUZY J, MCGREGOR C. Markers for selection of three alleles of ClSUN25-26-27a (Cla011257) associated with fruit shape in watermelon[J]. Molecular breeding, 2020, 40(2):19.
[70]
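GUO Y, GAO M L, LIU X S, et al. Genetic diversity and genome-wide association analysis of seed appearance traits in watermelon[J]. Genomics and applied biology, 2021, 40(Z4):3674-3684. (in Chinese)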
[71]
WANG X, ANDO K, WU S, et al. Genetic characterization of melon accessions in the U. S. National Plant Germplasm System and construction of a melon core collection[J]. Molecular horticulture, 2021, 1(1):11.
Melon (C. melo L.) is an economically important vegetable crop cultivated worldwide. The melon collection in the U.S. National Plant Germplasm System (NPGS) is a valuable resource to conserve natural genetic diversity and provide novel traits for melon breeding. Here we use the genotyping-by-sequencing (GBS) technology to characterize 2083 melon accessions in the NPGS collected from major melon production areas as well as regions where primitive melons exist. Population structure and genetic diversity analyses suggested that C. melo ssp. melo was firstly introduced from the centers of origin, India and Pakistan, to Central and West Asia, and then brought to Europe and Americas. C. melo ssp. melo from East Asia was likely derived from C. melo ssp. agrestis in India and Pakistan and displayed a distinct genetic background compared to the rest of ssp. melo accessions from other geographic regions. We developed a core collection of 383 accessions capturing more than 98% of genetic variation in the germplasm, providing a publicly accessible collection for future research and genomics-assisted breeding of melon. Thirty-five morphological characters investigated in the core collection indicated high variability of these characters across accessions in the collection. Genome-wide association studies using the core collection panel identified potentially associated genome regions related to fruit quality and other horticultural traits. This study provides insights into melon origin and domestication, and the constructed core collection and identified genome loci potentially associated with important traits provide valuable resources for future melon research and breeding.
[72]
ZHAO G, LIAN Q, ZHANG Z, et al. A comprehensive genome variation map of melon identifies multiple domestication events and loci influencing agronomic traits[J]. Nature genetics, 2019, 51(11):1607-1615.
Melon is an economically important fruit crop that has been cultivated for thousands of years; however, the genetic basis and history of its domestication still remain largely unknown. Here we report a comprehensive map of the genomic variation in melon derived from the resequencing of 1,175 accessions, which represent the global diversity of the species. Our results suggest that three independent domestication events occurred in melon, two in India and one in Africa. We detected two independent sets of domestication sweeps, resulting in diverse characteristics of the two subspecies melo and agrestis during melon breeding. Genome-wide association studies for 16 agronomic traits identified 208 loci significantly associated with fruit mass, quality and morphological characters. This study sheds light on the domestication history of melon and provides a valuable resource for genomics-assisted breeding of this important crop.
[73]
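HU Q M, YANG H H, ZHU H Y, et al. Genome-wide association analysis of fruit surface hairs, fruit surface warts, and fruit surface grooves in melon[J]. China cucurbits and vegetables, 2019, 32(5):7-12. (in Chinese)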
[74]
QI J, LIU X, SHEN D, et al. A genomic variation map provides insights into the genetic basis of cucumber domestication and diversity[J]. Nature genetics, 2013, 45(12):1510.
Most fruits in our daily diet are the products of domestication and breeding. Here we report a map of genome variation for a major fruit that encompasses ~3.6 million variants, generated by deep resequencing of 115 cucumber lines sampled from 3,342 accessions worldwide. Comparative analysis suggests that fruit crops underwent narrower bottlenecks during domestication than grain crops. We identified 112 putative domestication sweeps; 1 of these regions contains a gene involved in the loss of bitterness in fruits, an essential domestication trait of cucumber. We also investigated the genomic basis of divergence among the cultivated populations and discovered a natural genetic variant in a β-carotene hydroxylase gene that could be used to breed cucumbers with enhanced nutritional value. The genomic history of cucumber evolution uncovered here provides the basis for future genomics-enabled breeding.
[75]
LIU X, LU H, LIU P, et al. Identification of novel loci and candidate genes for cucumber downy mildew resistance using GWAS[J]. Plants, 2020, 9(12):1659.
Downy mildew (DM) is one of the most serious diseases in cucumber. Multiple quantitative trait loci (QTLs) for DM resistance have been detected in a limited number of cucumber accessions. In this study we applied genome-wide association analysis (GWAS) to detect genetic loci for DM resistance in a core germplasm (CG) of cucumber lines that represent diverse origins and ecotypes. Phenotypic data on responses to DM infection were collected in four field trials across three years, 2014, 2015, and 2016. With the resequencing data of these CG lines, GWAS for DM resistance was performed and detected 18 loci that were distributed on all the seven cucumber chromosomes. Of these 18 loci, only six (dmG1.4, dmG4.1, dmG4.3, dmG5.2, dmG7.1, and dmG7.2) were detected in two experiments, and were considered as loci with a stable effect on DM resistance. Further, 16 out of the 18 loci colocalized with the QTLs that were reported in previous studies and two loci, dmG2.1 and dmG7.1, were novel ones identified only in this study. Based on the annotation of homologous genes in Arabidopsis and pairwise LD correlation analysis, several candidate genes were identified as potential causal genes underlying the stable and novel loci, including Csa1G575030 for dmG1.4, Csa2G060360 for dmG2.1, Csa4G064680 for dmG4.1, Csa5G606470 for dmG5.2, and Csa7G004020 for dmG7.1. This study shows that the CG germplasm is a very valuable resource carrying known and novel QTLs for DM resistance. The potential of using these CG lines for future allele-mining of candidate genes was discussed in the context of breeding cucumber with resistance to DM.
[76]
LIU X, GU X, LU H, et al. Identification of novel loci and candidate genes for resistance to powdery mildew in a resequenced cucumber germplasm[J]. Genes, 2021, 12(4):584.
Powdery mildew (PM) is one of the most serious diseases in cucumber and causes huge yield loss. Multiple quantitative trait loci (QTLs) for PM resistance have been reported in previous studies using a limited number of cucumber accessions. In this study, a cucumber core germplasm (CG) consisting of 94 resequenced lines was evaluated for PM resistance in four trials across three years (2013, 2014, and 2016). These trials were performed on adult plants in the field with natural infection. Using genome-wide association study (GWAS), 13 loci (pmG1.1, pmG1.2, pmG2.1, pmG2.2, pmG3.1, pmG4.1, pmG4.2, pmG5.1, pmG5.2, pmG5.3, pmG5.4, pmG6.1, and pmG6.2) associated with PM resistance were detected on all chromosomes except for Chr.7. Among these loci, ten were mapped to chromosomal intervals where QTLs had been reported in previous studies, while three (pmG2.1, pmG3.1, and pmG4.1) were novel. The loci pmG2.1, pmG5.2, and pmG5.3 showed stronger signals in four trials. Based on the annotation of homologous genes in Arabidopsis and pairwise LD correlation analysis, candidate genes located in the QTL intervals were predicted. SNPs in these candidate genes were analyzed between haplotypes of highly resistant (HR) and susceptible (HS) CG lines, which were defined based on combining disease index data of all trials. Furthermore, candidate genes (Csa5G622830 and CsGy5G015660) reported in previous studies for PM resistance and cucumber orthologues of several PM susceptibility (S) genes (PMR5, PMR-6, and MLO) that are colocalized with certain QTLs were analyzed for their potential contribution to the QTL effect on both PM and DM in the CG population. This study shows that the CG germplasm is a very valuable resource carrying known and novel QTLs for both PM and DM resistance, which can be exploited in cucumber breeding.
[77]
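WANG W P, SONG Z C, BO K L, et al. Evaluation of low-temperature tolerance and GWAS analysis of seedlings in cucumber core germplasm[J]. Journal of plant genetic resources, 2019, 20(6):1606-1612. (in Chinese)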
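This study used cucumber core germplasm to evaluate low-temperature tolerance at the seedling stage, screen materials, and carry out a genome-wide association analysis to mine candidate genes. At the seedling stage (two leaves and one heart), seeds were sown in two batches and exposed to natural low-temperature stress; the mean temperatures of the two treatments were 12℃ and 19.3℃, lasting 14 d and 11 d, respectively. Seedlings were graded and grouped according to the chlorosis symptoms of cotyledons and true leaves. Seedlings differed significantly after treatment, with coefficients of variation of 23.2% and 31.7% in the two batches. The core germplasm tested was divided into four groups, and materials tolerant to low temperature at the seedling stage, including CG45, CG61, CG88, and CG104, were selected from 87 core germplasm accessions. GWAS of the seedling-stage low-temperature data, using the core germplasm resequencing information, detected the seedling low-temperature tolerance loci gLTS1.1, gLTS3.1, gLTS4.1, and gLTS5.1 on Chr.1, Chr.3, Chr.4, and Chr.5, respectively. Of these, gLTS5.1 was sensitive to low temperature and could be detected repeatedly. These results provide a reference for breeding low-temperature-tolerant cucumber germplasm and for subsequent gene mining and functional validation of low-temperature tolerance.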
[78]
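WEI S, ZHANG S, BO K L, et al. Evaluation of heat tolerance and GWAS analysis of seedlings in cucumber core germplasm[J]. Journal of plant genetic resources, 2019, 20(5):1223-1231. (in Chinese)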
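To study heat tolerance of cucumber at the seedling stage and screen heat-tolerant core germplasm, 86 core germplasm accessions were selected. In summer, in a solar greenhouse, a high-temperature environment (50±4℃) was maintained by opening and closing the vents, and seedlings at the three-leaves-and-one-heart stage were subjected to heat treatment and evaluation. Heat injury grades were assigned according to seedling injury symptoms, and the corresponding heat injury index was used to evaluate heat tolerance at the seedling stage. Combined with the core germplasm resequencing results, a genome-wide association analysis of seedling heat tolerance was performed. The results showed that heat tolerance differed significantly among the cucumber core germplasm tested, with coefficients of variation of 21.9% and 22.5% in the two surveys. Cluster analysis based on the heat injury index divided the 86 accessions into four major groups. The genome-wide association analysis detected seven loci associated with seedling heat tolerance: gHII4.1, gHII5.1, gHII5.2, gHII6.1, gHII7.1, gHII4.2, and gHII6.2. Of these, gHII4.1 and gHII4.2 on chromosome 4 showed the strongest association with seedling heat tolerance, and 67 candidate genes were predicted in this region.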
[79]
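ZHANG S, MIAO H, SONG Z C, et al. Evaluation of heat tolerance at the germination stage and genome-wide association analysis in cucumber[J]. Journal of plant genetic resources, 2019, 20(2):335-346. (in Chinese)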
[80]
LIN D P. Origin and classification of pumpkin plants[J]. China watermelon and melon, 2000(1):36-38. (in Chinese)
[81]
ZHONG Y J, ZHOU Y Y, LI J X, et al. A high-density linkage map and QTL mapping of fruit-related traits in pumpkin (Cucurbita moschata Duch.)[J]. Scientific reports, 2017, 7(1):12785.
Pumpkin (Cucurbita moschata) is an economically important crop worldwide. Few quantitative trait loci (QTLs) were reported previously due to the lack of genomic and genetic resources. In this study, a high-density linkage map of C. moschata was constructed by double-digest restriction site-associated DNA sequencing, using 200 F2 individuals of CMO-1 × CMO-97. By filtering 74,899 SNPs, a total of 3,470 high-quality SNP markers were assigned to the map spanning a total genetic distance of 3087.03 cM on 20 linkage groups (LGs) with an average genetic distance of 0.89 cM. Based on this map, both pericarp color and stripe were fine mapped to a novel single locus on LG8 in the same region of 0.31 cM with phenotypic variance explained (PVE) of 93.6% and 90.2%, respectively. QTL analysis was also performed on carotenoids, sugars, tuberculate fruit, fruit diameter, thickness and chamber width, with a total of 12 traits. Twenty-nine QTLs distributed in 9 LGs were detected with PVE from 9.6% to 28.6%. It was the first high-density linkage SNP map for C. moschata, which proved to be a valuable tool for gene or QTL mapping. This information will serve as a significant basis for map-based gene cloning, draft genome assembly and molecular breeding.
[82]
LI Y, WANG Y, WU X, et al. Novel genomic regions of fusarium wilt resistance in bottle gourd [Lagenaria siceraria (Mol.) Standl.] discovered in genome-wide association study[J]. Frontiers in plant science, 2021, 12:650157.
[83]
CUI J, ZHOU Y, ZHONG J, et al. Genetic diversity among a collection of bitter gourd (Momordica charantia L.) cultivar[J]. Genetic resources and crop evolution, 2022, 69(2):729-735.
[84]
XIE D, XU Y, WANG J, et al. The wax gourd genomes offer insights into the genetic diversity and ancestral cucurbit karyotype[J]. Nature communications, 2019, 10(1):5158.
The botanical family Cucurbitaceae includes a variety of fruit crops with global or local economic importance. How their genomes evolve and the genetic basis of diversity remain largely unexplored. In this study, we sequence the genome of the wax gourd (Benincasa hispida), which bears giant fruit up to 80 cm in length and weighing over 20 kg. Comparative analyses of six cucurbit genomes reveal that the wax gourd genome represents the most ancestral karyotype, with the predicted ancestral genome having 15 proto-chromosomes. We also resequence 146 lines of diverse germplasm and build a variation map consisting of 16 million variations. Combining population genetics and linkage mapping, we identify a number of regions/genes potentially selected during domestication and improvement, some of which likely contribute to the large fruit size in wax gourds. Our analyses of these data help to understand genome evolution and function in cucurbits.
[85]
ZHANG X, ZHU Y, KREMLING K A G, et al. Genome-wide analysis of deletions in maize population reveals abundant genetic diversity and functional impact[J]. Theoretical and applied genetics, 2022, 135(1):273-290.
[86]
SASAKI E, KAWAKATSU T, ECKER J R, et al. Common alleles of CMT2 and NRPE1 are major determinants of CHH methylation variation in Arabidopsis thaliana[J]. PLOS Genetics, 2019, 15(12):e1008492.
[87]
DU X, HUANG G, HE S, et al. Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits[J]. Nature genetics, 2018, 50(6):796-802.
The ancestors of Gossypium arboreum and Gossypium herbaceum provided the A subgenome for the modern cultivated allotetraploid cotton. Here, we upgraded the G. arboreum genome assembly by integrating different technologies. We resequenced 243 G. arboreum and G. herbaceum accessions to generate a map of genome variations and found that they are equally diverged from Gossypium raimondii. Independent analysis suggested that Chinese G. arboreum originated in South China and was subsequently introduced to the Yangtze and Yellow River regions. Most accessions with domestication-related traits experienced geographic isolation. Genome-wide association study (GWAS) identified 98 significant peak associations for 11 agronomically important traits in G. arboreum. A nonsynonymous substitution (cysteine-to-arginine substitution) of GaKASIII seems to confer substantial fatty acid composition (C16:0 and C16:1) changes in cotton seeds. Resistance to fusarium wilt disease is associated with activation of GaGSTF9 expression. Our work represents a major step toward understanding the evolution of the A genome of cotton.
[88]
ZHU G, WANG S, HUANG Z, et al. Rewiring of the fruit metabolome in tomato breeding[J]. Cell, 2018, 172(1-2):249.
[89]
ZHENG Y, WU S, BAI Y, et al. Cucurbit Genomics Database (CuGenDB): A central portal for comparative and functional genomics of cucurbit crops[J]. Nucleic acids research, 2019, 47(D1):D1128-D1136.
[90]
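ZHAO Hu. Development of a rice sequence variation database and a platform for screening candidate genes from association analysis[D]. Wuhan: Huazhong Agricultural University, 2019.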
[91]
YANG Z, LIANG C, WEI L, et al. BnVIR: Bridging the genotype-phenotype gap to accelerate mining of candidate variations underlying agronomic traits in Brassica napus[J]. Molecular plant, 2022, 15(5):779-782.
[92]
SHIRASAWA K, ISOBE S, TABATA S, et al. Kazusa Marker DataBase: A database for genomics, genetics, and molecular breeding in plants[J]. Breeding science, 2014, 64(3):264-271.
In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.