[关键词]
[摘要]
目的 结合叶绿体基因组、密码子偏好性与DNA条形码、机器学习技术,为薯蓣属Dioscorea系统发育、遗传多样性及物种鉴定提供理论指导。方法 通过IRscope比对11种植物的叶绿体基因结构,并构建系统发育树;用CodonW、CUSP等工具分析密码子偏好性;收集薯蓣材料并扩增测序DNA条形码序列,基于成熟酶K基因(maturase K gene,matK)、光系统II蛋白D1基因-组氨酸tRNA基因间隔区(photosystem II protein D1 gene-transfer RNA-histidine intergenic spacer,psbA-trnH)、核酮糖-1,5-二磷酸羧化酶/加氧酶大亚基基因(ribulose-bisphosphate carboxylase/oxygenase large subunit gene,rbcL)条形码和机器学习算法进行分子鉴别。结果 薯蓣属叶绿体基因组保守稳定,密码子显著A/T偏好,第3位倾向A/U结尾。自然选择是影响密码子偏差的主因,共有最优密码子为GGA和UCA。3种条形码均能成功鉴别物种,基于BLOG算法的单一条形码鉴别成功率100%,WEKA的SMO和NaïveBayes分类器鉴别率较高。结论 薯蓣叶绿体基因组保守,密码子偏好模式以自然选择为主,为薯蓣属基因表达调控、物种鉴定、资源保护等提供依据和指导。
[Keywords]
[Abstract]
Objective To provide theoretical guidance for the phylogenetics, genetic diversity, and species identification of Dioscorea by integrating chloroplast genomics, codon usage bias analysis, DNA barcoding, and machine learning. Methods The chloroplast gene structures of 11 plant species were compared using IRscope, and a phylogenetic tree was constructed. Codon usage bias was analyzed with tools including CodonW and CUSP. Dioscorea materials were collected, and DNA barcode sequences were amplified and sequenced. Molecular identification was performed using the maturase K gene (matK), the photosystem II protein D1 gene-transfer RNA-histidine intergenic spacer (psbA-trnH), and the ribulose-bisphosphate carboxylase/oxygenase large subunit gene (rbcL) as barcodes, combined with machine learning algorithms. Results The chloroplast genomes of Dioscorea were conserved and stable. Codon usage showed a significant A/T bias, with the third codon position favoring A/U endings. Natural selection was the primary factor shaping codon usage bias, and the shared optimal codons were GGA and UCA. All three barcodes successfully discriminated the species. Single-barcode identification with the BLOG algorithm achieved a 100% success rate, and the SMO and NaïveBayes classifiers in WEKA also showed high identification rates. Conclusion The chloroplast genome of Dioscorea is conserved, and its codon usage pattern is shaped mainly by natural selection. These findings provide a basis and guidance for research on gene expression regulation, species identification, and resource conservation in Dioscorea.
[Chinese Library Classification (CLC) number]
R286.12
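[Funding]
Natural Science Basic Research Program of Shaanxi Province (2021JQ-782); Xi'an Medical University 2024 Research Capacity Enhancement Program (2024NLTS122); Xi'an Medical University 2022 Research Capacity Enhancement Program (2022NLTS084)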