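[Keywords]
[Abstract]
Objective To characterize in detail the structure and sequence features of the chloroplast (CP) genomes of Thalictrum L. and to screen candidate molecular markers, providing a basis for resolving the disputes over species classification and identification that arise from the wide distribution and complex phenotypic variation of the genus. Methods The CP genomes of Thalictrum finetii and Thalictrum cultratum were sequenced for the first time on the Illumina HiSeq4000 platform and their structures were analyzed. Together with the CP genome data of 14 congeneric species published in NCBI, simple sequence repeats (SSR), IR boundary structure, and nucleotide diversity (Pi) were analyzed. Finally, CP genome data of 45 species from 10 genera of Ranunculaceae were used to construct a maximum likelihood (ML) phylogenetic tree and to carry out a phylogenetic analysis. Results The CP genomes of T. finetii and T. cultratum were 155 953 bp and 155 901 bp long, respectively, and both showed the typical circular quadripartite structure. A total of 131 genes were identified, and preferred codons ended in A/U. The CP genomes of Thalictrum species were highly conserved in gene number and genome structure, although in a few species the lengths of particular genes at the IRs/LSC and IRs/SSC boundaries differed. Non-coding sequences were markedly more variable than coding sequences, and the IR regions were markedly more conserved than the LSC and SSC regions. Ten hypervariable regions (ndhF-rpl32, ycf1, petN-psbM, ndhC-trnV, trnT-trnL, trnS-psbZ, ndhG-ndhI, ndhD, infA, rpl16) were screened as candidate DNA barcode sequences. The phylogenetic analysis clarified the relationships among species within Thalictrum and the position of the genus within Ranunculaceae. Conclusion The CP genomes of T. finetii and T. cultratum are reported for the first time, and the CP genome structure and sequence features of Thalictrum are analyzed in depth. The 10 hypervariable regions can serve as candidate DNA barcodes for species identification in Thalictrum. A more comprehensive and more reliable phylogenetic tree of Ranunculaceae was established.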
[Keywords]
[Abstract]
Objective To analyze in detail the structure and sequence characteristics of the chloroplast (CP) genomes of Thalictrum L. and to screen for candidate molecular markers, laying a foundation for resolving the controversies in species classification and identification that stem from the wide distribution and complex phenotypic variation of the genus. Methods The CP genomes of Thalictrum finetii and Thalictrum cultratum were sequenced for the first time on the Illumina HiSeq4000 platform and their structures were analyzed. Together with the CP genome data of 14 congeneric species published in NCBI, simple sequence repeats (SSRs), inverted repeat (IR) boundary structures, and nucleotide diversity (Pi) were then analyzed. Finally, the CP genome data of 45 species from 10 genera of Ranunculaceae were used to construct a maximum likelihood (ML) phylogenetic tree and to perform a phylogenetic analysis. Results The CP genomes of T. finetii and T. cultratum were 155 953 bp and 155 901 bp long, respectively, and both exhibited the typical circular quadripartite structure. A total of 131 genes were identified, and codon usage was biased toward codons ending in A/U. The CP genomes of Thalictrum species were highly conserved in gene number and genome structure, although in certain species the lengths of particular genes at the IRs/LSC and IRs/SSC boundaries differed. Non-coding sequences were markedly more variable than coding sequences, and the IR regions were markedly more conserved than the LSC and SSC regions. Ten hypervariable regions (ndhF-rpl32, ycf1, petN-psbM, ndhC-trnV, trnT-trnL, trnS-psbZ, ndhG-ndhI, ndhD, infA, rpl16) were identified as candidate DNA barcodes for Thalictrum.
The phylogenetic analysis clarified the evolutionary relationships among species within Thalictrum and the phylogenetic position of the genus within Ranunculaceae. Conclusion This study reports the CP genomes of T. finetii and T. cultratum for the first time and analyzes in depth the CP genome structure and sequence characteristics of Thalictrum. Ten hypervariable regions were identified as candidate DNA barcodes for species identification within Thalictrum, and a more comprehensive and more reliable phylogenetic tree of Ranunculaceae was established.
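The abstract does not say how the hypervariable regions were screened. One common way to screen them is to compute nucleotide diversity (Pi) in sliding windows along an alignment of the CP genomes and keep the windows with the highest values. The Python sketch below illustrates that idea only. The function names, the window and step sizes, and the handling of gaps (each pair is compared only at columns where both sequences carry A, C, G, or T) are assumptions for illustration, not the authors' software or settings.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def pairwise_difference(seq_a, seq_b):
    """Return (differing sites, compared sites) for two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are skipped.
    """
    diffs = 0
    sites = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in VALID_BASES and y in VALID_BASES:
            sites += 1
            if x != y:
                diffs += 1
    return diffs, sites


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    proportions = []
    for seq_a, seq_b in combinations(seqs, 2):
        diffs, sites = pairwise_difference(seq_a, seq_b)
        if sites:  # ignore pairs with no comparable column in this window
            proportions.append(diffs / sites)
    return sum(proportions) / len(proportions) if proportions else 0.0


def sliding_window_pi(seqs, window=600, step=200):
    """Return a list of (start, end, Pi) tuples, 0-based, end exclusive.

    The window and step defaults are illustrative assumptions.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        end = min(start + window, length)
        chunk = [s[start:end] for s in seqs]
        results.append((start, end, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Toy alignment, not real Thalictrum data.
    toy_alignment = ["AAAA", "AAAT", "AATT"]
    for start, end, pi in sliding_window_pi(toy_alignment, window=2, step=2):
        print(start, end, round(pi, 4))
```

Ranking the windows by Pi and mapping the top ones back to the genome annotation would give a candidate list of the kind reported in the Results. Any threshold used for that ranking is a further choice that the abstract does not specify.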
[CLC number]
R283
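[Funding]
Yunnan Province "Xingdian Talent Support Program" Young Talent Project; Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences: Construction of the Medicinal Plant Germplasm Resource Bank (2021-I2M-1-032)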