[关键词]
[摘要]
目的 研究芍药属4种药用植物叶绿体基因组特征、变异程度及芍药属和毛茛科植物的系统发育关系,为芍药属药用植物的分类地位和系统发育提供参考。方法 应用二代高通量测序技术测定芍药Paeonia lactiflora叶绿体基因组,利用比较基因组学方法分析了芍药与川赤芍P. veitchii、牡丹P. suffruticosa及杨山牡丹P. ostii等芍药属药用植物叶绿体基因组之间的结构特征及变异程度,以山地虎耳草Saxifraga sinomontana等植物为外类群,分析了芍药属植物与毛茛科植物的系统发育关系。结果 芍药属4种药用植物的叶绿体基因组均为典型的四分结构,包含1个大单拷贝区(large single copy,LSC)、1个小单拷贝区(small single copy,SSC)和2个反向重复区(inverted repeats,IRa和IRb)。芍药叶绿体基因组序列长度为152 731 bp,GC含量为38.43%,共注释到基因126个,其中82个蛋白质编码基因、36个tRNA基因和8个核糖体rRNA基因。重复序列分析发现芍药叶绿体基因组重复序列数量最多为48个,杨山牡丹最少为39个;只有芍药同时含有4种类型的重复序列。芍药含有最多的SSR序列达52个,4种植物SSR序列中单核苷酸重复占比达81%~86%,绝大多数为A/T重复。密码子偏好性分析表明芍药属4种药用植物中氨基酸出现频率均以亮氨酸最高,半胱氨酸最低;31种密码子具偏好性,且密码子偏好性与密码子第3位碱基具显著相关性。LSC/IRb、IRb/SSC、SSC/IRa和IRa/LSC边界附近的基因类型相同,相对保守,但牡丹组和芍药组之间仍具有差异。选择压力分析结果表明,绝大多数基因Ka/Ks值均小于1,受纯化选择,matK、ndhB和rpoA受到正选择。叶绿体基因组比较分析发现芍药属内4种药用植物的叶绿体基因组中非编码区比基因编码区的变异程度大,筛选出了6个基因间区和2个基因编码区的高变异区。系统发育分析结果显示,芍药与美丽芍药亲缘关系最近,所有芍药属植物被聚为一个分支,外类群把芍药属和毛茛科植物分隔开来,芍药属植物与毛茛科植物亲缘关系较远。结论 芍药属4种植物在叶绿体基因组结构特征方面较为保守,但其变异程度存在一定差异;筛选出的8个高变异区可为芍药属药用植物的条形码开发、物种鉴定及药用植物质量控制提供候选片段;系统发育分析表明芍药属植物与外类群植物亲缘关系较近,而与毛茛科植物亲缘关系较远,为芍药属植物系统进化、芍药属药用植物的物种鉴定、质量控制及保护开发等研究奠定基础。
[Key words]
[Abstract]
Objective To characterize the chloroplast genomes of four medicinal plants in Paeonia, assess their degree of variation, and examine the phylogenetic relationship between Paeonia and Ranunculaceae, providing a reference for the taxonomic status and phylogeny of medicinal plants in Paeonia. Methods The chloroplast genome of Paeonia lactiflora was sequenced and assembled using next-generation high-throughput sequencing. Comparative genomics methods were used to analyze the structural characteristics and degree of variation among the chloroplast genomes of four medicinal plants in Paeonia: P. lactiflora, P. veitchii, P. suffruticosa, and P. ostii. The phylogenetic relationship between Paeonia and Ranunculaceae was analyzed with Saxifraga sinomontana and other plants as outgroups. Results The chloroplast genomes of the four medicinal plants all had the typical quadripartite structure, comprising one large single copy (LSC) region, one small single copy (SSC) region, and two inverted repeat regions (IRa and IRb). The chloroplast genome of P. lactiflora was 152 731 bp long, with a GC content of 38.43%. A total of 126 genes were annotated, including 82 protein-coding genes, 36 tRNA genes, and eight rRNA genes. Repeat sequence analysis showed that P. lactiflora had the most repeat sequences (48) and P. ostii the fewest (39); only P. lactiflora contained all four types of repeat sequences. P. lactiflora also contained the most SSRs (52). Mononucleotide repeats accounted for 81% to 86% of the SSRs in the four chloroplast genomes, the vast majority being A/T repeats. Codon usage bias analysis showed that in all four species leucine was the most frequent amino acid and cysteine the least frequent. Thirty-one codons showed usage bias, and codon usage bias was significantly correlated with the base at the third codon position. The gene types near the LSC/IRb, IRb/SSC, SSC/IRa, and IRa/LSC boundaries were the same and relatively conserved, although some differences remained between sect. Moutan and sect. Paeonia. Selection pressure analysis showed that the Ka/Ks values of the vast majority of genes were less than 1, indicating purifying selection, whereas matK, ndhB, and rpoA were under positive selection. Comparative analysis of the chloroplast genomes revealed that non-coding regions were more variable than gene-coding regions in the four species; six intergenic regions and two gene-coding regions with high variability were identified. Phylogenetic analysis showed that P. lactiflora was most closely related to P. mairei. All Paeonia species clustered into a single clade, the outgroups separated Paeonia from Ranunculaceae, and Paeonia was relatively distantly related to Ranunculaceae. Conclusion The chloroplast genomes of the four medicinal plants in Paeonia are relatively conserved in structure but differ somewhat in their degree of variation. The eight highly variable regions identified can serve as candidate fragments for barcode development, species identification, and quality control of medicinal plants in Paeonia. Phylogenetic analysis shows that Paeonia is more closely related to the outgroup plants than to Ranunculaceae. These results lay a foundation for research on the phylogeny, species identification, quality control, conservation, and development of medicinal plants in Paeonia.
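Two of the summary statistics reported in the abstract, GC content and mononucleotide SSRs, can be illustrated with a minimal Python sketch. The function names, the toy sequence, and the minimum repeat length of 10 are assumptions for illustration only; the abstract does not state the tools or thresholds actually used in the study.

```python
import re


def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def mono_ssrs(seq: str, min_repeats: int = 10):
    """Find mononucleotide repeats of at least `min_repeats` bases.

    Returns a list of (base, start, length) tuples with 0-based starts.
    min_repeats=10 is an assumed threshold, not one taken from the paper.
    """
    # One base followed by at least (min_repeats - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [
        (m.group(1), m.start(), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


# Toy sequence (hypothetical, not real chloroplast data).
toy = "GGCC" + "A" * 12 + "CGCG" + "T" * 10 + "GC"
print(f"GC content: {gc_content(toy):.2f}%")
print("Mononucleotide SSRs:", mono_ssrs(toy))
```

Applied to a full chloroplast genome sequence, the same two functions would yield figures comparable in kind to the GC content and A/T-dominated mononucleotide SSR counts summarized above, although dedicated SSR tools also report di- to hexanucleotide motifs.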
[CLC number]
R286.12
[Funding]
National Natural Science Foundation of China (U1204323)