- Research
- Open access
Genetic diversity and population structure in Ethiopian mustard (Brassica carinata A. Braun) revealed by high-density DArTSeq SNP genotyping
BMC Genomics volume 26, Article number: 354 (2025)
Abstract
Background
Ethiopian mustard (Brassica carinata A. Braun) is a promising oilseed crop with potential for sustainable biofuel and bio-industrial applications. Despite the presence of diverse germplasm in Ethiopia, its genetic diversity remains largely unexplored. This study evaluated the genetic diversity and population structure of 188 B. carinata genotypes using high-density Single Nucleotide Polymorphism (SNP) markers generated through DArTseq™ Genotyping-by-Sequencing (GBS). Of the 15,515 identified DArTSeq SNPs, 3,793 high-quality markers were retained and used to analyze the genetic diversity and population structure.
Results
The results from STRUCTURE, principal coordinate analysis (PCoA), and neighbor-joining tree analyses revealed two slightly distinct subpopulations, with Pop1 predominantly comprising genotypes from the Oromia and Amhara regions (86.17%), whereas Pop2 primarily consisted of released varieties, suggesting the influence of targeted selection. Despite the presence of subpopulations, PCoA indicated relatively limited overall genetic diversity among the genotypes. Analysis of Molecular Variance (AMOVA) revealed higher genetic variation within populations (65.19%) than between populations (34.81%), with low genetic differentiation (PhiPT = 0.02) and high gene flow (Nm = 5.74). Notably, subpopulation formation was not strongly correlated with geographical origin, indicating that factors beyond geography, such as gene flow and selection pressure, may have played a significant role in shaping the observed genetic diversity. Genetic diversity indices revealed low-to-moderate variation within the B. carinata populations, as evidenced by the relatively low expected heterozygosity (He = 0.21) and moderate polymorphic information content (PIC = 0.36).
Conclusion
Overall, this study revealed a moderate level of genetic diversity within the evaluated B. carinata genotypes. The results offer valuable insights into the genetic structure of this species and highlight the need for targeted strategies to enhance genetic diversity in future breeding initiatives and conservation efforts.
Background
Ethiopian mustard (Brassica carinata A. Braun) is a versatile oilseed crop native to the Ethiopian highlands and surrounding regions, with a 4000-year cultivation history [1]. This amphidiploid species (genome BBCC, n = 17) evolved through the natural hybridization of B. nigra (BB, n = 8) and B. oleracea (CC, n = 9), followed by chromosome doubling [2]. Over three million Ethiopian smallholder farmers cultivate B. carinata, primarily for its leaves and oil-rich seeds [3]. Beyond its traditional uses, the crop’s adaptability to diverse climates [4] and its applications in the biofuel and oleo-chemical industries have spurred its worldwide introduction [5]. The promising properties of B. carinata extend to sustainable aviation fuel production owing to its high erucic acid content (31–46%), low saturated fatty acid content, and ease of refining, potentially reducing greenhouse gas emissions [6, 7]. Furthermore, the crop combines resilience to environmental stressors, disease resistance, high seed yield potential on marginal lands, and a large seed size [8]. This multifaceted crop has gained recognition across various sectors, including agriculture, aviation, pharmaceuticals, plastics, and bioenergy [9, 10].
Ethiopia, recognized as the center of origin and diversity for B. carinata (http://www.ebi.gov.et), holds a rich germplasm collection maintained by the Ethiopian Biodiversity Institute (EBI) and various agricultural research institutions (http://www.eiar.gov.et). This diverse genetic resource provides a valuable foundation for the development of improved varieties [11]. Over the past four decades, significant research has focused on enhancing yield, quality, and secondary metabolite traits [12, 13, 14]. However, despite these breeding efforts, the full genetic potential of B. carinata remains largely unknown, owing to a limited understanding of its genetic diversity. Genetic diversity is pivotal for plant adaptation and evolution, and drives the development of improved crop varieties with desirable traits [15]. Germplasm characterization plays a crucial role in identifying variation useful for breeding and in guiding strategies for collection, conservation, and sustainable use [14]. Previous studies using agro-morphological, physiological, and biochemical markers have revealed considerable genetic variation [10, 16–18]. However, these markers often lack precision because they are influenced by the environment and by developmental stage. Molecular marker-based analyses offer a more robust approach for detailed genetic characterization [14]. Earlier genetic diversity studies in B. carinata employed a range of markers, from early-stage techniques, such as Random Amplification Polymorphic DNA (RAPD) [19] and Amplified Fragment Length Polymorphism (AFLP) [20], to more advanced markers, such as Simple Sequence Repeat (SSR) [21] and Single Nucleotide Polymorphism (SNP) [5]. Despite their advantages, these markers often suffer from limitations such as low density and incomplete genome coverage [22, 23], hindering comprehensive genetic diversity analyses. To address these limitations, Diversity Array Technology (DArT) emerged in the early 2000s [24].
High-throughput DArTSeq SNP marker technology has significantly advanced research on plant genetic diversity and population structure [25]. These markers are characterized by genome-wide abundance, ease of replication, reliability, comprehensive genome coverage, and suitability for large-scale genotyping [26]. Although successfully applied to various crops, including sorghum [27, 28], barley [29], wheat [30], macadamia [31], and maize [32], DArTSeq SNP marker technology remains unexploited in B. carinata diversity studies. The lack of robust genome-wide data has restricted previous B. carinata studies to parent selection based on phenotypic characteristics or on limited data from older molecular marker systems, resulting in incomplete estimates of the true extent of genetic diversity and inaccurate inference of population structure within and among B. carinata populations. This constraint has significantly hindered the implementation of large-scale genotyping initiatives, which are essential for the comprehensive assessment of genetic diversity and population structure across a broad spectrum of genotypes. As a result, the development of improved cultivars and the full realization of the agronomic and economic potential of B. carinata have been impeded. Addressing this critical research gap is fundamental for the advancement of breeding programs and the sustainable exploitation of this valuable crop. Therefore, this study aimed to use high-density DArTSeq-generated SNP markers to evaluate the genetic diversity of Ethiopian mustard germplasm. Specifically, it aimed to (i) assess genetic diversity within the germplasm and (ii) analyze population structure and genetic relationships among genotypes. These findings will contribute to breeding programs for developing improved varieties and will support the conservation of this valuable genetic resource.
Materials and methods
Genetic materials
The genetic diversity of 188 B. carinata genotypes collected across various Ethiopian regions between 1984 and 2022 was evaluated (Fig. 1, Table S1). The majority of the genotypes (160) were sourced from EBI (http://www.ebi.gov.et). The remaining 28 genotypes, including five released varieties (Tesfa, Holetta-1, Derash, Yellow Dodola, and S-67), were supplied by the Holetta Agricultural Research Center (HARC) (http://www.eiar.gov.et/holetta/).
Genomic DNA extraction and genotyping
Five seeds per genotype were sown in seedling trays at the School of Plant and Horticultural Sciences Research Facility of Hawassa University, Ethiopia. Three-week-old leaf samples were collected from five seedlings of each genotype, pooled, and desiccated using silica gel. Ten milligrams of dried leaf tissue from each genotype were subsequently shipped to the SeqArt Africa laboratory at the International Livestock Research Institute (ILRI), Kenya, for genotyping analysis. Genomic DNA (gDNA) was isolated from the dried leaf samples using a Nucleomag Plant Kit (Macherey-Nagel, Germany). The quality and quantity of the gDNA were assessed using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, USA). To further verify gDNA integrity, the samples were electrophoretically separated on a 0.8% agarose gel stained with ethidium bromide in 1x TBE buffer at 70 V for 45 min.
Forty microliters of gDNA (50 ng/µL) per genotype were used for whole-genome scanning with DArTseq™ Genotyping by Sequencing (GBS) technology (https://www.diversityarrays.com/). The GBS protocol involves library construction encompassing restriction enzyme digestion, adaptor ligation, and PCR amplification to generate sequence-ready libraries. These libraries were subjected to single-end sequencing on an Illumina HiSeq2500 platform, which generated 69-base reads per sample. Sequencing reads were aligned to the reference genome of B. carinata (GCA_016771965.1_ASM1677196v1; https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_016771965.1) [33] using the DArTsoft14 algorithm to identify the SNPs. As co-dominant markers, the SNPs were encoded in a binary matrix. The presence of a specific SNP allele was coded as “1,” while its absence was coded as “0.” Homozygous genotypes were represented as “1/1” or “0/0” and heterozygous genotypes as “1/0.”
Data analysis
SNP Filtering and Characterization
SNP quality control was performed using the dartR package in R [34]. This process removed markers and individuals with low call rates (< 70% for markers and < 50% for individuals), as well as loci with low reproducibility (< 95%), monomorphism (no variation), missing data (entire loci with NA scores or SNPs exceeding 20% of missing genotypes), or low minor allele frequency (MAF < 0.05). Only high-quality SNPs were used for further analyses [29]. The genetic properties of the filtered SNPs were assessed using three parameters: observed heterozygosity (Ho), the proportion of individuals heterozygous for a particular SNP; expected heterozygosity (He), the expected proportion of heterozygotes assuming Hardy-Weinberg equilibrium; and polymorphic information content (PIC), a measure of the informativeness of a SNP for diversity analysis [31].
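The per-locus quantities defined above can be illustrated with a minimal Python sketch. It is not the dartR implementation used in the study: the function names, the dosage coding (0, 1, or 2 copies of the alternate allele, None for a missing call), and the use of the textbook biallelic formulas (He = 2pq and the Botstein PIC) are assumptions made for illustration. Only the call-rate and MAF thresholds are taken from the text.

```python
def locus_stats(dosages):
    """Summarise one biallelic SNP.

    dosages: list with one entry per individual, holding the number of
    copies of the alternate allele (0, 1, 2) or None for a missing call.
    This coding is an assumption of the sketch, not the DArT file format.
    """
    called = [g for g in dosages if g is not None]
    n = len(called)
    if n == 0:
        return None  # locus with no calls at all
    p = sum(called) / (2 * n)        # alternate-allele frequency
    q = 1.0 - p
    ho = sum(1 for g in called if g == 1) / n   # observed heterozygosity
    he = 2 * p * q                              # expected under Hardy-Weinberg
    pic = he - 2 * (p ** 2) * (q ** 2)          # Botstein PIC, biallelic case
    return {
        "call_rate": n / len(dosages),
        "maf": min(p, q),
        "ho": ho,
        "he": he,
        "pic": pic,
    }


def passes_filters(stats, min_call_rate=0.70, min_maf=0.05):
    """Apply the marker call-rate and MAF thresholds quoted in the text."""
    return (
        stats is not None
        and stats["call_rate"] >= min_call_rate
        and stats["maf"] >= min_maf
    )


example = locus_stats([0, 1, 2, 1, None])
monomorphic = locus_stats([0, 0, 0, 0])
```

A monomorphic locus has MAF = 0 and is therefore rejected by the same MAF threshold, so a separate monomorphism check is not needed in this simplified version.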
Genetic diversity analysis
Genetic diversity within populations was estimated using several indices calculated with the gl.report.heterozygosity function in dartR [25]. These included the minor allele frequency (MAF), the frequency of the less common allele at a given SNP; the major allele frequency (MaF), the frequency of the more common allele at a given SNP; and Ho, He, and PIC.
Population structure and genetic relationship analysis
Population structure was analyzed using Bayesian clustering implemented in STRUCTURE v2.3.4 [35]. This software assigns individuals to genetic groups (K) ranging from 2 to 10. Each K value was run five times with extensive burn-in and 100,000 Markov Chain Monte Carlo (MCMC) iterations to ensure convergence. The optimal K value was determined using the Evanno method [36]. Clustering results were visualized using DISTRUCT v1.1 [37] after post-processing with CLUMPP v1.1.2 [38]. Individuals with high membership coefficients (≥ 0.70) were assigned to distinct populations, whereas the others were considered admixed. To further explore genetic relationships, pairwise fixation indices were calculated using the StAMPP package [39]. Additionally, neighbor-joining trees were constructed based on a Euclidean distance [40] dissimilarity matrix using the dartR package [41]. A dendrogram was generated using the unweighted pair group method with arithmetic mean (UPGMA) clustering algorithm employing the ggdendro and ggplot2 packages [42] in R. Principal Coordinate Analysis (PCoA) was performed using the eigenstrat method [43] and visualized using ggplot2.
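The two decision rules in this paragraph, choosing K by the Evanno method and assigning individuals with the 0.70 membership cutoff, can be sketched as follows. This is an illustrative approximation rather than the software used in the study: it computes ΔK from the mean Ln P(D) across replicate runs divided by the standard deviation at that K, and all function names and the toy likelihood values are hypothetical. Note that ΔK is undefined for the smallest and largest K tested, because it needs both neighbouring K values.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours.

    lnp_by_k: {K: [Ln P(D) of each replicate run]} for consecutive K.
    Replicates at a given K must not all be identical, otherwise the
    standard deviation is zero and delta K cannot be computed.
    """
    ks = sorted(lnp_by_k)
    avg = {k: mean(lnp_by_k[k]) for k in ks}
    delta = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        second_diff = abs(avg[next_k] - 2 * avg[k] + avg[prev_k])
        delta[k] = second_diff / stdev(lnp_by_k[k])
    return delta


def assign_population(q, threshold=0.70):
    """Assign one individual from its membership coefficients.

    q: list of membership coefficients, one per cluster.
    Returns 'Pop1', 'Pop2', ... or 'admixed' below the threshold.
    """
    best = max(range(len(q)), key=lambda i: q[i])
    return f"Pop{best + 1}" if q[best] >= threshold else "admixed"


# Toy replicate likelihoods, invented purely for illustration.
toy_runs = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -792, -788],
    4: [-785, -786, -784],
}
delta_k = evanno_delta_k(toy_runs)
best_k = max(delta_k, key=delta_k.get)
```

In the toy data the likelihood gain flattens after K = 2, so ΔK peaks there; real STRUCTURE output would be read from the run summaries instead.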
Population differentiation and gene flow
Analysis of Molecular Variance (AMOVA) was conducted using the poppr package [41] to quantify genetic differentiation among populations and estimate gene flow. This involved calculating Nei’s genetic distance [44] between the populations. The fixation index (FST) and haploid number of migrants (Nm) were obtained from the AMOVA and provide insights into the degree of population differentiation (FST) and the level of gene flow (Nm) between populations.
Results
Analysis of markers distribution and density
Comprehensive genotyping of 188 B. carinata genotypes using a high-density DArTSeq SNP genotyping platform yielded 15,515 markers. Of these, 3,793 SNPs were polymorphic and were distributed across all 17 chromosomes, 8 from the B genome and 9 from the C genome (Fig. 2). Visualization of SNP density within 1 Mb windows revealed a notable concentration on chromosome B1 (highlighted in green, Fig. 2). This chromosome exhibited the highest number of SNP markers (376), whereas chromosome C9 displayed the lowest number (103) (Fig. 3). Overall, the B genome exhibited a slightly higher proportion of SNPs (56.85%) than the C genome (43.15%) (Table S2).
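The windowed density summary described above amounts to binning SNP positions by chromosome and by 1 Mb interval. The following Python sketch shows one way to do this; the input format (pairs of chromosome name and base-pair position), the function names, and the toy positions are assumptions for illustration, not the plotting pipeline used in the study.

```python
from collections import Counter

WINDOW_BP = 1_000_000  # 1 Mb window, as in the density plot


def snp_density(snps, window_bp=WINDOW_BP):
    """Count SNPs per (chromosome, window index).

    snps: list of (chromosome, position_bp) tuples.
    Window 0 covers positions 0..window_bp-1, window 1 the next block, etc.
    """
    return Counter((chrom, pos // window_bp) for chrom, pos in snps)


def snps_per_chromosome(snps):
    """Count SNPs per chromosome, ignoring position."""
    return Counter(chrom for chrom, _ in snps)


# Toy positions, invented for illustration only.
toy_snps = [("B1", 100), ("B1", 999_999), ("B1", 1_000_000), ("C9", 5)]
density = snp_density(toy_snps)
per_chrom = snps_per_chromosome(toy_snps)
```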
Analysis of single nucleotide polymorphisms (SNPs) revealed a distinct pattern in the distribution of transitions and transversions (mutational bias) in B. carinata B and C genomes, with a higher frequency of transitions than transversions, particularly within the B genome (Table 1). Transitions, specifically C/T and T/C, and A/G and G/A, constituted 37.83% of all SNPs in the B genome, significantly exceeding the 24.49% observed in the C genome. Conversely, transversions accounted for 24.23% of SNPs in the B genome. Furthermore, all four possible transversion SNPs (A/C, C/A, G/T, and T/G) were exclusively observed within the B genome. In contrast, both genomes exhibited only two types of transition SNPs.
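For readers less familiar with the terminology, a transition exchanges two purines (A and G) or two pyrimidines (C and T), and every other substitution is a transversion. The Python sketch below classifies SNPs written in the "X/Y" notation used above and computes a transition-to-transversion ratio; the function names and the toy SNP list are hypothetical and are not taken from Table 1.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(snp):
    """Classify a SNP such as 'C/T' as 'transition' or 'transversion'."""
    ref, alt = snp.upper().split("/")
    alleles = {ref, alt}
    if ref == alt or not alleles <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid biallelic SNP: {snp!r}")
    same_class = alleles <= PURINES or alleles <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Ratio of transition count to transversion count in a list of SNPs."""
    counts = Counter(classify_snp(s) for s in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions in input")
    return counts["transition"] / counts["transversion"]


# Toy SNP list, invented for illustration only.
toy = ["C/T", "T/C", "A/G", "A/C", "G/T"]
ratio = ts_tv_ratio(toy)
```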
Analysis of genetic parameters
The genetic diversity within the B. carinata germplasm collection was assessed using 3,793 high-density DArTSeq SNP markers, which revealed a moderate level of genetic diversity (Table 2). Expected heterozygosity (He) ranged from 0.15 to 0.31, with an average of 0.24, while observed heterozygosity (Ho) ranged from 0.08 to 0.12, with an average of 0.09. This observed heterozygosity deficit (Ho < He) indicates a moderate level of inbreeding within the population, further supported by an inbreeding coefficient (FIS) of 0.51. Polymorphism Information Content (PIC) values ranged from 0.15 to 0.36, with approximately 70.6% of the SNPs exhibiting PIC values ≤ 0.36, suggesting a moderate level of informativeness. The minor allele frequency (MAF) ranged from 0.08 to 0.12, with an average of 0.10, while the major allele frequency (MaF) ranged from 0.09 to 0.15, with an average of 0.12, indicating a relatively balanced distribution of alleles, with the major allele generally more common than the minor allele.
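The link between a heterozygosity deficit and the inbreeding coefficient can be sketched with the standard relationship FIS = 1 − Ho/He. The text does not state how FIS was aggregated across loci, so the Python example below is only an illustration: the function names and the toy values are hypothetical. It also shows that averaging per-locus FIS values generally gives a different number from applying the formula to the averaged Ho and He.

```python
def inbreeding_coefficient(ho, he):
    """FIS = 1 - Ho/He for a single locus or for averaged values."""
    if he <= 0:
        raise ValueError("He must be positive")
    return 1.0 - ho / he


def mean_locus_fis(loci):
    """Average of per-locus FIS over (Ho, He) pairs with He > 0."""
    values = [inbreeding_coefficient(ho, he) for ho, he in loci if he > 0]
    return sum(values) / len(values)


def fis_from_means(loci):
    """FIS computed from the mean Ho and mean He across loci."""
    mean_ho = sum(ho for ho, _ in loci) / len(loci)
    mean_he = sum(he for _, he in loci) / len(loci)
    return inbreeding_coefficient(mean_ho, mean_he)


# Toy (Ho, He) pairs, invented for illustration only.
toy_loci = [(0.1, 0.2), (0.0, 0.4)]
per_locus = mean_locus_fis(toy_loci)
from_means = fis_from_means(toy_loci)
```

With the toy values the two aggregation choices give 0.75 and roughly 0.83, which is why the aggregation method matters when comparing a reported FIS with reported mean heterozygosities.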
Analysis of population structure and genetic relationship
Population structure analysis, conducted using STRUCTURE, identified two slightly distinct subpopulations within the 188 B. carinata genotypes (Fig. 4A and B). Evanno’s test, based on the ΔK plot (Fig. 4A), indicated that the optimal number of clusters was two (K = 2), confirming the presence of two subpopulations. Genotypes were subsequently assigned to either Population 1 (Pop 1 = green) or Population 2 (Pop 2 = red) (Fig. 4B; Fig. S1). Population 1 comprised 83.51% of the genotypes (n = 157), primarily sourced from eight Ethiopian regions (Table 3), with the highest representation from Oromia (n = 85; 54.14%) and Amhara (n = 48; 30.57%). Population 2 represented 16.49% of the genotypes (n = 31), including all five released varieties and genotypes from six regions, excluding Central Ethiopia and Harari. This subpopulation had a diverse regional origin, with Oromia contributing the highest proportion (41.95%), followed by Amhara (29.03%), Benishangul Gumuz (12.90%), Tigray (9.68%), and South and Southwest Ethiopia (each 3.23%) (Table 3).
Principal coordinate analysis (PCoA) showed slight genetic differences among the 188 B. carinata genotypes, grouping them into two somewhat distinct groups, a highland population and a midland population (Fig. 5). This grouping was also seen in the STRUCTURE analysis (Fig. 4), confirming small genetic differences. The first two coordinates accounted for only 7.4% (PCoA 1 = 4.4% and PCoA 2 = 3%) of the total genetic variation, as visualized in the PCoA plot (Fig. 5, Table S3).
Additionally, a phylogenetic tree constructed based on genetic similarities showed two slightly different groups, Pop1 and Pop2 (Fig. 6), corroborating the PCoA and STRUCTURE results.
Analysis of molecular variance
Analysis of Molecular Variance (AMOVA) revealed that a large proportion of the genetic variation (65.19%) resided within populations, whereas only 34.81% was attributable to differences between the two identified subpopulations (Table 4). This finding, supported by a low PhiPT value of 0.02 (p ≤ 0.001), indicates relatively low genetic differentiation between subpopulations. Furthermore, a high level of gene flow (Nm = 5.74) likely facilitated gene exchange between the subpopulations, contributing to the observed low genetic differentiation.
Discussion
Crop improvement significantly benefits from the exploration of diverse genetic resources, particularly landraces, which are traditional crop varieties adapted to diverse conditions [15]. These landraces, often derived from wild or semi-wild relatives of cultivated species, provide essential genetic variation [45]. In Ethiopia, smallholder farmers primarily cultivate B. carinata using unimproved landraces that demonstrate resilience to local climate variations [46]. The genetic diversity of these landraces, developed over centuries of adaptation to their environment, equips them to endure various environmental challenges [47]. A comprehensive understanding of the genetic diversity and population structure of these landraces is fundamental for their effective use in plant breeding programs and conservation efforts [8]. This knowledge is critical for developing new cultivars with enhanced agronomic traits, such as increased oil yield, improved disease resistance, and enhanced tolerance to biotic and abiotic stresses [48]. Furthermore, this information can be used to refine existing breeding strategies and maximize the use of available genetic resources [47, 48].
This study employed high-density DArTSeq SNP genotyping to assess genetic diversity within a collection of 188 B. carinata genotypes. A total of 15,515 DArTSeq SNPs were identified and 3,793 high-quality markers were retained for subsequent analyses. Notably, the B genome exhibited higher polymorphism (1722 SNPs) than the C genome (1307 SNPs), in agreement with previous findings by Zou et al. [49] and Khedikar et al. [23]. This disparity likely stems from the earlier divergence of the B. nigra genome (approximately 8 million years ago) from its common ancestor compared with that of B. oleracea (approximately 4 million years ago), which might have allowed greater accumulation of mutations within the B genome [50]. Furthermore, functional significance cannot be ruled out; a higher density of genes derived from B. nigra, particularly those involved in adaptation or stress responses, could also explain the increased SNP density on chromosome B1.
SNP analysis revealed a predominance of transitions over transversions, which aligns with previous findings in Brassica species [5, 51, 52]. This phenomenon may be attributed to various factors, including mutation bias, genetic drift, selection pressure during domestication and breeding, or recombination events [53, 54]. Alternatively, unique features within the B genome, such as specific DNA repair gene variants or methylation patterns, may also contribute to the elevated transition rate. The polymorphism information content (PIC) of the markers ranged from 0.15 to 0.36, with an average value of 0.26. This value is comparable to that previously reported for B. carinata (0.26) [5] but lower than that observed in B. rapa (0.40) [55]. According to Botstein et al. [56], PIC values above 0.5 are considered highly informative, those between 0.25 and 0.5 are moderately informative, and those below 0.25 are slightly informative. The results indicated a moderate level of polymorphism within the 188 B. carinata genotypes evaluated, highlighting the opportunities to breed and conserve this species and emphasizing the need for strategies to enhance genetic diversity for future crop improvement.
The genetic diversity within the studied B. carinata landraces, as assessed by expected heterozygosity (He), was relatively low, ranging from 0.15 to 0.31 with an average of 0.24. This finding suggests a limited level of genetic variation within the population. It falls somewhat short of the high diversity expected given the crop’s long cultivation history of approximately 4,000 years [3]. Several factors may have contributed to this relatively low genetic diversity. Limited natural hybridization due to geographical isolation, particularly in the East African highlands, may have restricted gene flow and, consequently, reduced genetic variation [57]. Furthermore, the discrepancy between the observed (Ho = 0.09) and expected (He = 0.24) heterozygosity, coupled with the low genetic distances between individuals, indicates that there are fewer heterozygous individuals in the population than expected under Hardy-Weinberg equilibrium. This deviation can arise from several factors, including non-random mating (e.g., inbreeding, which increases homozygosity) or natural selection against heterozygotes (underdominance). A heterozygosity deficit often implies inbreeding depression, which highlights the negative consequences of reduced genetic diversity. Such deficits can also reveal insights into population structure, such as the extent of population subdivision and gene flow. Here, the deficit strongly suggests the presence of inbreeding within the B. carinata genotypes. The predominantly self-pollinating nature of the crop may further exacerbate this inbreeding effect [58]. The findings are consistent with those of previous studies on B. carinata [5, 23] and B. napus [59], which also reported low to moderate levels of genetic diversity.
Population structure analysis of germplasm collections provides insights into their genetic diversity and can be useful in controlling false-positive associations between marker loci and traits of interest [60]. The results of the STRUCTURE, PCoA, and dendrogram analyses revealed two distinct subpopulations (Pop1 and Pop2) within the 188 B. carinata genotypes despite a relatively low level of genetic differentiation. Notably, both subpopulations encompassed genotypes collected from various regions across Ethiopia, with the Oromia (52.13%) and Amhara (30.32%) regions contributing the majority of genotypes (Table 3). This suggests that the formation of these subpopulations was likely influenced by factors beyond geographic adaptation, such as historical inter-regional germplasm exchange, shared ancestral origins, or a combination of these factors [23]. The presence of genotypes from the same region within different clusters underscores the presence of genetic diversity arising from ancestral differences and recombination events during hybridization. Furthermore, the observed clustering of all released B. carinata varieties into a single genetic population (Pop 2), along with diverse regional accessions, provides compelling evidence for the substantial impact of selective breeding. This genetic homogeneity can be attributed to either targeted selection during variety development or the early and widespread dissemination of superior varieties, potentially leading to their integration as landraces and a shared genetic lineage. Moreover, the released varieties may have originated from landraces within Pop 2, which underscores the value of these landraces as reservoirs of useful genetic variation for breeding. These results imply that developing and implementing strategies to enhance genetic diversity within the B. carinata germplasm is imperative for bolstering the crop’s resilience and adaptability to evolving environmental challenges, including climate change, and for ensuring its sustainable production.
Despite the identification of two distinct subpopulations (Pop1 and Pop2), the level of genetic differentiation between them was low. This was evidenced by the low PhiPT value of 0.02 (Table 4), indicating minimal genetic divergence. Furthermore, AMOVA revealed that the majority of genetic variation resided within populations rather than between them, a pattern commonly observed in self-pollinating species [58]. The observed high gene flow between Pop1 and Pop2, as indicated by an Nm value of 5.74 (Table 4), likely contributes significantly to the low genetic differentiation. According to Wright [61], Nm values are categorized as high (Nm ≥ 1.0), medium (Nm = 0.25–0.99), and low (Nm = 0.0–0.249). The estimated Nm value of 5.74 fell within the high gene flow category (Nm ≥ 1.0), suggesting substantial gene exchange between the subpopulations. This high gene flow could be attributed to various factors, including pollen flow mediated by wind or insects, and seed exchange facilitated by farmers [60]. In general, these results highlight the importance of considering factors beyond geographic location when assessing the genetic diversity of B. carinata. Even when subpopulations exist, high gene flow and low genetic differentiation indicate the interconnectedness of the gene pool. This understanding has significant implications for breeding programs, as it suggests that existing genetic variation within the species can be effectively exploited for crop improvement. These findings align with those of previous studies on B. carinata [5, 23] and B. napus [59], which reported the presence of two subpopulations with low levels of genetic differentiation within their respective collections of 631, 94, and 83 genotypes.
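The categorization of gene flow described above can be sketched in Python. The text does not state which estimator produced the reported Nm, so the conversion shown here, the widely used island-model approximation Nm = (1/FST − 1)/4, is an assumption of this sketch and is not claimed to reproduce the values in Table 4. The function names are hypothetical; the class boundaries follow the thresholds quoted from Wright [61].

```python
def nm_from_fst(fst):
    """Island-model approximation: Nm = (1/FST - 1) / 4.

    Assumed estimator for illustration; the study's own calculation
    may differ.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


def gene_flow_class(nm):
    """Classify Nm with the thresholds quoted in the text."""
    if nm < 0:
        raise ValueError("Nm cannot be negative")
    if nm >= 1.0:
        return "high"
    if nm >= 0.25:
        return "medium"
    return "low"


reported_class = gene_flow_class(5.74)  # Nm value reported in the text
```

Under this approximation, FST = 0.2 corresponds to Nm = 1.0, the boundary of the high gene flow class, so smaller FST values imply progressively higher estimated migration.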
In general, the genetic insights gained from this study have significant practical implications. They can improve parent selection for maximizing hybrid vigor, enhance the efficiency of marker-assisted selection (MAS) and the statistical power of genome-wide association studies (GWAS), and help optimize genomic selection (GS) strategies. These findings also enable the identification of valuable new germplasm, refinement of conservation strategies, monitoring of genetic erosion, and the development of representative core collections. Ultimately, this knowledge will support the creation of novel cultivars with superior agronomic traits such as increased oil yield, enhanced disease resistance, and improved stress tolerance [48]. Furthermore, this information can help breeders refine existing methodologies and optimize the use of existing genetic resources [47, 48].
Conclusions
This study comprehensively evaluated the genetic diversity and population structure of 188 B. carinata genotypes from Ethiopia using high-density DArTSeq SNP markers. In general, the study findings revealed a moderate level of genetic diversity, providing crucial insights into the genetic architecture of this species and establishing a solid foundation for future breeding and conservation initiatives. The candidate core collections identified in this study could serve as valuable resources for genome-wide association studies (GWAS), facilitating the identification of genomic regions associated with key agronomic traits. To further enhance our understanding of the B. carinata gene pool, future research should integrate broader germplasm collections from diverse regions within Ethiopia and across the globe. This would facilitate the identification and conservation of diverse allelic variants for key agronomic traits, ultimately contributing to the development of improved and more resilient B. carinata cultivars.
Data availability
The datasets that support the findings of this study are publicly available in the supplementary materials. Additional information related to these datasets can be obtained from the corresponding author upon reasonable request. Genome-wide SNP data for B. carinata accessions, generated via DArTseq, are publicly available on Dryad: doi: https://doi.org/10.5061/dryad.mgqnk999j.
Abbreviations
- AFLP: Amplified Fragment Length Polymorphism
- AMOVA: Analysis of Molecular Variance
- DArT: Diversity Array Technology
- EBI: Ethiopian Biodiversity Institute
- EIAR: Ethiopian Institute of Agricultural Research
- FAOSTAT: Food and Agriculture Organization of the United Nations Statistics
- FIS: Inbreeding Coefficient
- FST: Fixation Index
- GBS: Genotyping by Sequencing
- gDNA: Genomic DNA
- GS: Genomic Selection
- GWAS: Genome-Wide Association Studies
- GAM: Genetic Advance of Mean as Percentage
- HARC: Holetta Agricultural Research Center
- MAF: Minor Allele Frequency
- MaF: Major Allele Frequency
- MAS: Marker-Assisted Selection
- MCMC: Markov Chain Monte Carlo
- MoA: Ministry of Agriculture
- Nm: Number of migrants
- PCoA: Principal Coordinate Analysis
- RAPD: Random Amplification Polymorphic DNA
- SNPs: Single Nucleotide Polymorphisms
- SSR: Simple Sequence Repeat
- UPGMA: Unweighted Pair Group Method with Arithmetic mean
References
Getinet A, Rakow G, Raney JP, Downey RK. Agronomic performance and seed quality of Brassica carinata. Can J Plant Sci. 1996;76(4):801–6.
Prakash S, Hinata K. Taxonomy, cytogenetics and origin of crop brassicas, a review. Opera Bot. 1980;55:1–57.
Alemayehu N, Becker H. Genotypic diversity and patterns of variation in a germplasm material of Ethiopian mustard (Brassica carinata A. Braun). Genet Resour Crop Evol. 2002;49(6):573–82.
Seepaul R, Kumar S, Irulappan V, Mulchandani A, George S. Harnessing camelina and carinata to revolutionize the bioproducts industry: a biorefinery approach. Bioresource Technol Rep. 2021;15:100792.
Tesfaye M, Feyissa T, Hailesilassie T, Kanagarajan S. Genetic diversity and population structure in Ethiopian mustard (Brassica carinata A. Braun) as revealed by single nucleotide polymorphism markers. Genes. 2023;14(9):1757. https://doi.org/10.3390/genes14091757
Seepaul R, Small D, George S, Marnoch D, Wright DL, George ST. Carinata (Brassica carinata) yield response to different nitrogen rates under dryland conditions. Agronomy. 2019;9(12):782.
George S, Janakiram T, Seepaul R, George ST, Wiredu AN. Brassica carinata meals-based feeds for Pacific white shrimp Litopenaeus vannamei: implications on zootechnical performances, metabolic responses, and immunity. Aquaculture. 2021;542:736926.
Bulcha D, Tucho TA, Hussen S. Agronomic and seed quality performance of Ethiopian mustard (Brassica carinata A. Braun) varieties at Haro Sabu, Western Ethiopia. Ethiop J Agricultural Sci. 2017;27(2):119–30.
Rahiel M, Bechere E, Eticha F. Assessment of genetic variability, heritability and association among yield and yield related traits of Mid-altitude Ethiopian mustard (Brassica carinata A. Braun) genotypes. Int J Agricultural Res Innov Technol. 2020;10(2):27–32.
Yirssaw DA, Andargachew G, Temesgen MO, Mikis MA. Genetic variation in Ethiopian mustard (Brassica carinata A. Braun) germplasm based on seed oil content and fatty acid composition. Genet Resour Crop Evol. 2024;2155. https://doi.org/10.1007/s10722-024-02155-4
Takele Y. Agro-Morphological and molecular diversity analysis of Ethiopian mustard (Brassica carinata A. Braun) germplasm in Southwestern Ethiopia. Int J Agricultural Technol. 2022;18(3):599–614.
Misteru TB, Bhuiyan MHM, Islam MR. Economic evaluation and sustainability of mustard oil production: a case study. J Bangladesh Agricultural Univ. 2013;11(2):227–34.
Kumar S, Seepaul R, Small D, George S, Marnoch D, Wright DL, George ST. Brassica carinata yield response to different nitrogen rates under dryland conditions. Agronomy. 2021;9(12):782.
Li FW, Harkess A, Li ZK, You MP, Sharpe AG. Brassica carinata, a member of the triangle of U with potential for crop improvement. Curr Plant Biology. 2022;30:100247.
Azeez MA, Adubi AO, Durodola FA. Landraces and crop genetic improvement. Semantic Scholar. 2018. https://doi.org/10.5772/INTECHOPEN.75944. Accessed 7 June 2024.
Yared D, Haileselassie T, Amsalu B. Genetic variability in Ethiopian mustard (Brassica carinata A. Braun) for oil content and fatty acid composition. Int J Plant Breed Genet. 2010;4(1):11–9.
Tesfaye W, Bekele E, Alamerew S. Genetic diversity and population structure of Ethiopian mustard (Brassica carinata A. Braun) landraces using ISSR markers. Afr J Biotechnol. 2014;13(21):2178.
Fekadu D. Oil and fatty acid composition analysis of Ethiopian mustard (Brassica carinata A. Braun) landraces. Int J Plant Breed Crop Sci. 2021;8(1):1039–49. ISSN: 2167–0449. www.premierpublishers.org
Peng L, Cheng X, Zhuang J, Lu H, Chen X, Wu X, Zhou G. Analysis of genetic diversity among cultivated and wild Brassica napus L. germplasm genotypes based on RAPD and pod traits. Hortic Plant J. 2023;9(1):51–8.
El-Esawi MA, Germaine K, Bourke P, Malone R. AFLP analysis of genetic diversity and phylogenetic relationships of Brassica oleracea in Ireland. Mol Biology Genet. 2016;339(5–6):163–70.
Thakur N, Kaur K, Virk PS. Assessment of diversity in the rice (Oryza sativa L.) genotypes using microsatellite marker. J Agricultural Sci Technol. 2021;23(8):1677–86.
Thakur N, Kaur K, Virk PS. Assessment of genetic diversity among elite rice (Oryza sativa L.) genotypes using SSR markers. Int J Agricultural Sci Res. 2019;9(5):15–26.
Khedikar YP, Bhowmick S, Pandey A, Purchase JL, Berger JD. Single nucleotide polymorphism mapping for identification of new sources of resistance to blackleg in Brassica napus. Crop Sci. 2020;60(6):2990–3000.
Valdisser PAMR, Pereira WJ, Sousa TRB, Zimmer PD, Lemke GD, Schuster I. Genotyping of common bean germplasm using DArTSeq technology. Genet Mol Res. 2017;16(1):gmr1601930.
Mijangos JL, Gruber B, Berry O, Pacioni C, Georges A. dartR v2: an accessible genetic analysis platform for conservation, ecology and agriculture. Methods Ecol Evol. 2022;13:2150–8. https://doi.org/10.1111/2041-210X.13918
Fiust A, Rapacz M, Wójcik-Jagła M, Tyrka M. Development of DArT-based PCR markers for selecting drought tolerant spring barley. J Appl Genet. 2015;56:299–309. https://doi.org/10.1007/s13353-015-0273-x
Muhammed AA, Adamu AK, Carels N, Oduwaye O, Barnabas B. Diversity array technology (DArT) markers for characterization and utilization of sorghum (Sorghum bicolor (L.) Moench) genotypes for drought tolerance. Front Plant Sci. 2023;13:1019475.
Phoebe A, Mohammed A, Carels N, Odunayo IO, Barnabas B. Exploring the potential of diversity arrays technology (DArT) markers for assessing genetic variability among Sorghum bicolor (L.) Moench genotypes. Plants. 2023;12(1):129.
Matties JJ, Wright CL, Vales MI, Eggert K, Stubbs T, Lucia E, Matthews PM. Classification and comparison of Nebraska feed and food barleys through genotyping-by-sequencing. Crop Sci. 2012;52(6):2748–57.
Laido G, Marone D, Russo MA, Colecchia SA, Mastrangelo AM, De Vita P, Papa R. Linkage disequilibrium and genome-wide association mapping in tetraploid wheat (Triticum turgidum L.). PLoS ONE. 2013;8(7):e68294.
Alam M, Neal J, O’Connor K, Kilian A, Topp B. Ultra-high-throughput DArTseq-based genome profiling and phenotypic analysis of 1433 macadamia genotypes. Tree Genet Genomes. 2018;14(1):1–17.
Adu GB, Badu-Apraku B, Akromah R, Awuku FJ, Gedil M. Genetic diversity and population structure of early maturing tropical maize landraces using SSR markers. Agronomy. 2019;9(2):119.
Song X, Wei Y, Xiao D, Gong K, Su P, et al. Brassica carinata genome characterization clarifies U’s triangle model of evolution and polyploidy in Brassica. Plant Physiol. 2021;186(1):388–406. https://doi.org/10.1093/plphys/kiab048
Gruber B, Unmack PJ, Berry OF, Georges A. dartR: an R package to facilitate analysis of SNP data generated from reduced representation genome sequencing. Mol Ecol Resour. 2018;18:691–9. https://doi.org/10.1111/1755-0998.12745
Earl DA, Holdt DM. STRUCTURE HARVESTER: a website and program for assisting with Bayesian population structure analysis. Mol Ecol Resour. 2012;12(4):941–5.
Jakobsson M, Rosenberg NA. CLUMPP: A cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics. 2007;23:1801–6.
Evanno G, Regnaut S, Goudet J. Detecting the number of clusters from discrete data sets using the Bayesian information criterion. Mol Ecol. 2005;14(8):2611–20.
Rosenberg NA. Distruct: a program for the graphical display of population structure. Mol Ecol Notes. 2004;4(1):137–8. https://doi.org/10.1046/j.1471-8286.2003.00566.x
Pembleton LW, Cogan NO, Forster JW. StAMPP: an R package for calculation of genetic differentiation and structure of mixed-ploidy level populations. Mol Ecol Resour. 2013;13:946–52.
Kamvar Z, Tabatabaei SJ, Rezaei MR. PCR-SSCP and DNA sequencing for identifying potato virus X and potato virus Y isolates. Archives Phytopathol Plant Prot. 2014;48(10):872–80.
R Development Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2005;3-900051-07-0. URL http://www.R-project.org
Letunic I, Bork P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49:W293–6. https://doi.org/10.1093/nar/gkab301
Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38(8):904–9.
Nei M, Takezaki N. Estimation of genetic distances and phylogenetic trees from DNA analysis. In: Proceedings of the 5th World Congress on Genetics Applied to Livestock Production, Guelph, ON, Canada, 7–12 August 1994. 1994;21:405–12.
Salgotra RK, Chauhan BS. Genetic diversity, conservation, and utilization of plant genetic resources. Genes (Basel). 2023;14(1):174. https://doi.org/10.3390/genes14010174
MoA. Crop description for selected Ethiopian traditional crops. Ministry of Agriculture. 2018. Addis Ababa, Ethiopia.
Muthoni KE. Genetic diversity and relatedness among Brassica carinata genotypes using RAPD markers. Afr J Biotechnol. 2010;9(2):309–16.
Lazaridi E, Kapazoglou A, Gerakari M, et al. Crop landraces and Indigenous varieties: A valuable source of genes for plant breeding. Plants. 2024;13(6):758. https://doi.org/10.3390/plants13060758
Zou J, Raman H, Guo S, et al. Constructing a dense genetic linkage map and mapping QTL for the traits of flower development in Brassica carinata. Theor Appl Genet. 2014;127:1593–605.
Lysak MA, Koch MA, Pecinka A, Schubert I. Chromosome triplication found across the tribe Brassiceae. Genome Res. 2005;15:516–25.
Wong GSY. Genetic analysis of Brassica carinata. A thesis submitted to the College of Graduate Studies and Research, University of Saskatchewan, Saskatoon, Canada. 2013. https://harvest.usask.ca/server/api/core/bitstreams/154c515a-8b79-4a0d-979e-72b309c42285/content. Accessed 21 July 2024.
Park S, Yu HJ, Mun JH, Lee SH. Genome-wide discovery of DNA polymorphism in Brassica rapa. Mol Genet Genomics. 2009;283(2):135–45. https://doi.org/10.1007/s00438-009-0504-0
Yim WC, Swain ML, et al. The final piece of the triangle of U: evolution of the tetraploid Brassica carinata genome. Plant Cell. 2022;34(11):4143–72. https://doi.org/10.1093/plcell/koac249
Wu J, Liang J, Lin R, et al. Investigation of Brassica and its relative genomes in the post-genomics era. Hortic Res. 2022;9:uhac182. https://doi.org/10.1093/hr/uhac182
Ramchiary N, Nguyen VD, Li X, et al. Genic microsatellite markers in Brassica rapa: development, characterization, mapping, and their utility in other cultivated and wild Brassica relatives. DNA Res. 2011;18(5):305–20. https://doi.org/10.1093/dnares/dsr01
Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980;32:314.
Dixon GR. Vegetable brassicas and related crucifers. Vol. 14. Wallingford, Oxfordshire, UK: CABI; 2007.
Sleper DA, Poehlman JM. Breeding field crops. 5th ed. Ames, IA, USA: Blackwell Publishing; 2006. pp. 345–66.
Rahman M, Hoque A, Roy J. Linkage disequilibrium and population structure in a core collection of Brassica napus (L.). PLoS ONE. 2013;17:e0250310.
Eltaher S, Sallam A, Belamkar V, et al. Genetic diversity and population structure of F3:6 Nebraska winter wheat genotypes using genotyping-by-sequencing. Front Genet. 2018;9:76.
Wright S. Evolution and the genetics of populations: the theory of gene frequencies V2. Chicago, IL, USA: University of Chicago Press; 1969.
Acknowledgements
The authors gratefully acknowledge Hawassa University for funding, the EBI gene bank and the Oilseeds Improvement Program at HARC for providing seeds of the Ethiopian mustard accessions used in this study, and SeqArt Africa Lab for affordable genotyping services.
Funding
This research work was supported by Hawassa University’s thematic research scheme project: “Revitalizing indigenous genetic resources by integrating agroforestry, soil microbiology, and crop production using biotechnological tools” (HU/2022–2024).
Author information
Contributions
All authors contributed to designing the study. Y.D.A. conceived the study, secured funding, analyzed sequence data, generated visuals, and drafted the manuscript. A.G.A. secured funding, oversaw the project, and offered editorial feedback. T.M.O. conceived the study, acquired funding, analyzed and visualized the sequence data and revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Supplementary material 1: Table S1: List of B. carinata genotypes and their morphological and seed oil composition attributes used in the study; Table S2: Distribution of 3,793 high-density DArTseq SNPs across B. carinata chromosomes, density per chromosome, polymorphic information content (PIC), minor allele frequency (MAF), and major allele frequency (MaF); Table S3: Percentage of variation explained by the first five principal components (PCs) and their corresponding eigenvalues for the 188 B. carinata genotypes; Figure S1: Bar plot from STRUCTURE analysis (K = 2) showing two distinct groups among the 188 B. carinata genotypes: population-1 (red) and population-2 (green). Each bar represents an individual accession, and the number under each bar corresponds to the genotype number; Figure S2: Scree plot illustrating the principal components for the 188 B. carinata genotypes
Supplementary material 2: DArT SNP_HapMap data of B. carinata in CSV file format
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Ambaw, Y.D., Abitea, A.G. & Olango, T.M. Genetic diversity and population structure in Ethiopian mustard (Brassica carinata A. Braun) revealed by high-density DArTSeq SNP genotyping. BMC Genomics 26, 354 (2025). https://doi.org/10.1186/s12864-025-11469-1
DOI: https://doi.org/10.1186/s12864-025-11469-1