[Keywords]
[Abstract]
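Objective To obtain metabolic pathway gene sequences, simple sequence repeat (SSR) information, and transposon information from a transcriptome database of Stellera chamaejasme. Methods Roots of S. chamaejasme were used as the test material. The transcriptome was sequenced on an Illumina HiSeq 2000, a second-generation sequencing platform, and analyzed systematically with bioinformatic methods. Results A total of 26 785 872 clean reads were obtained and assembled into 47 053 unigenes with a mean length of 419 nt. The assembled unigene sequences were aligned against the Nr, Swiss-Prot, KEGG, COG, and GO databases using BLAST; 11 138 and 24 744 unigenes were annotated in Nr and Swiss-Prot, respectively. The unigenes fell into 36 GO categories and mapped to 119 KEGG reference pathways, and further analysis identified 15 key enzyme genes of the terpenoid biosynthesis pathway. MISA detected 3 480 SSRs. Mononucleotide repeats were the most abundant type (1 986, 57.07%), and hexanucleotide repeats were the rarest (5, only 0.14%). Transposon prediction on the transcriptome sequences with the RepeatMasker online tool found 1 497 transposons, of which 827 had an E value < 1×10⁻⁵. These 827 sequences comprised 22 transposon types: LINE/L1 was the most numerous (405, 48.97%), while DNA/Ginger, DNA/hAT, DNA/PIF-ISL2EU, LINE/Jockey, and LTR/Lenti were the least represented, with one sequence each. Conclusion High-throughput sequencing of S. chamaejasme yielded a large amount of gene sequence information together with SSR and transposon information, providing a data resource and theoretical basis for the future isolation and cloning of key enzyme genes in the biosynthesis of phorbol esters and other active constituents of S. chamaejasme, and for related studies of molecular mechanisms.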
[Keywords]
[Abstract]
Objective To obtain a transcriptome database of Stellera chamaejasme together with its gene sequence, SSR, and transposon information. Methods A root transcriptome dataset of S. chamaejasme was generated on a high-throughput sequencing platform (Illumina HiSeq 2000), and the sequencing results were analyzed with bioinformatic methods. Results A total of 26 785 872 clean reads were assembled into 47 053 unigenes with a mean length of 419 nt. The unigenes were searched with BLAST against the Nr, Swiss-Prot, KEGG, COG, and GO databases; 11 138 and 24 744 unigenes were annotated in the Nr and Swiss-Prot databases, respectively. The unigenes were assigned to 36 GO categories and 119 KEGG metabolic pathways. Further analysis identified 15 unigenes encoding key enzymes of terpenoid biosynthesis. MISA detected 3 480 SSRs among the 47 053 unigenes. Mononucleotide repeats were the most common type (1 986, 57.07%), and hexanucleotide repeats were the rarest (5, 0.14%). Transposon analysis of the transcriptome sequences with the RepeatMasker online tool found 1 497 transposons, of which 827 had E < 1×10⁻⁵. These 827 transposons were grouped into 22 types. LINE/L1 (405) had the highest frequency (48.97%), whereas DNA/Ginger, DNA/hAT, DNA/PIF-ISL2EU, LINE/Jockey, and LTR/Lenti were the rarest types, with only one transposon each. Conclusion The gene sequence, SSR, and transposon information obtained in this study provides a resource for future research on the molecular mechanism of phorbol ester biosynthesis in S. chamaejasme.
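The reported frequencies can be reproduced from the counts given in the Results. The short Python sketch below is only an illustrative check, not part of the study's pipeline. The variable names and the `percent` helper are hypothetical, and it assumes that the LINE/L1 share is taken relative to the 827 transposons with E < 1×10⁻⁵, which is the denominator that matches the stated 48.97%.

```python
# Counts taken from the Results; names are illustrative only.
total_ssr = 3480             # SSRs detected by MISA
mono_ssr = 1986              # mononucleotide repeats
hexa_ssr = 5                 # hexanucleotide repeats
filtered_transposons = 827   # transposons with E < 1e-5 (assumed denominator)
line_l1 = 405                # LINE/L1 transposons


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


mono_pct = percent(mono_ssr, total_ssr)
hexa_pct = percent(hexa_ssr, total_ssr)
l1_pct = percent(line_l1, filtered_transposons)

print(f"mononucleotide SSR: {mono_pct}%")
print(f"hexanucleotide SSR: {hexa_pct}%")
print(f"LINE/L1 transposons: {l1_pct}%")
```

With these denominators the three values agree with the 57.07%, 0.14%, and 48.97% stated in the abstract.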
[CLC number]
[Funding]
Fundamental Research Funds for Central Non-profit Research Institutes (CAFYBB2014QB001); National Natural Science Foundation of China (31570675)