[Keywords]
[Abstract]
Objective To build a model that predicts the anti-inflammatory effects of traditional Chinese medicine (TCM) herbs with a Voting ensemble algorithm, using medicinal properties as feature variables, and to analyze through visualization how individual property features influence anti-inflammatory activity. Methods A total of 1,247 herbs from the textbook Chinese Materia Medica (《中药学》) and the SymMap database were studied. After initial and secondary screening, a standardized database was built containing features such as nature, flavor, and meridian tropism. A Voting ensemble model was constructed from six base models, including decision tree, support vector machine, and light gradient boosting machine; model performance was improved through 7-fold cross-validation and hyperparameter optimization with a tree-structured Bayesian optimization algorithm (Tree-structured Parzen Estimator). The SHAP (SHapley Additive exPlanations) explainer was used to visualize the key medicinal property features. Results After screening, 522 anti-inflammatory herbs were included in the database. The Voting ensemble model had the best overall performance (F1-score 0.797, AUC 0.77), an average improvement of 7.4% over the single models. SHAP analysis indicated that the important features driving predictions toward anti-inflammatory activity were “spleen meridian”, “sweet flavor”, and “tonifying”, among others, whereas the important features driving predictions away from it were “warm or neutral nature” and “toxicity”. Conclusion This study is the first to build a well-performing prediction model of TCM anti-inflammatory activity with an ensemble algorithm, offering a new approach for research that combines TCM with machine learning.
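As an illustration of the voting step named in the abstract, the following minimal sketch averages the positive-class probabilities of several base models for each herb and scores the result with F1. It assumes soft (probability-averaging) voting with a 0.5 threshold, which the abstract does not specify; the function names, the three-model setup, and all numbers are hypothetical toy values, not the study's data or implementation.

```python
from statistics import mean


def soft_vote(prob_rows, threshold=0.5):
    """Average each herb's per-model positive-class probabilities, then threshold.

    Assumption: soft voting; the source does not state hard vs. soft voting.
    """
    return [1 if mean(row) >= threshold else 0 for row in prob_rows]


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical probabilities from three base models (columns) for four herbs (rows).
probs = [
    [0.9, 0.7, 0.8],
    [0.4, 0.6, 0.3],
    [0.2, 0.1, 0.4],
    [0.6, 0.5, 0.7],
]
labels = [1, 1, 0, 1]  # hypothetical ground truth: 1 = anti-inflammatory

preds = soft_vote(probs)
print(preds, f1_score(labels, preds))
```

In this toy case the second herb is misclassified because two of its three models assign low probability, which shows how an ensemble decision depends on agreement among the base models rather than on any single one.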
[CLC number]
TP18; R285.1
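[Funding]
High-Level Traditional Chinese Medicine Hospital Construction Project of Wangjing Hospital, China Academy of Chinese Medical Sciences (WJZJ-202305); High-Level Traditional Chinese Medicine Hospital Construction Project of Wangjing Hospital, China Academy of Chinese Medical Sciences (WJYY-XZKT-2023-37).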