[Keywords]
[Abstract]
Objective To build a model for analyzing clinical medication patterns that exploits the strengths of deep clustering on high-dimensional, deeply structured data, and to mine latent medication patterns from complex multi-dimensional clinical data. Methods Real-world clinical stroke data were collected, screened, and standardized. Graph convolutional networks, autoencoders, and a self-supervised mechanism were then combined to construct a deep clustering model that fuses multi-source heterogeneous information, the multi-source fusion convolutional network (MFCN). Model performance was validated on two datasets, and the stroke dataset was used as a case study to analyze clinical medication patterns and thereby verify that the model can mine high-dimensional clinical data effectively. Results On the DBLP and stroke datasets, respectively, the MFCN model achieved accuracies of 79.32% and 83.48%, normalized mutual information (NMI) of 0.490 8 and 0.531 6, adjusted Rand index (ARI) of 0.541 9 and 0.581 7, and F1-scores (F1) of 0.787 7 and 0.833 9, all higher than those of the other models compared. In the stroke dataset, the MFCN model identified communities of Chinese herbal combinations used in the acute phase, such as Gancao (Glycyrrhizae Radix et Rhizoma), Fuling (Poria), and Chenpi (Citri Reticulatae Pericarpium), and found “symptom-drug” associations such as “Gancao-wiry pulse”, “Gancao-red tongue”, and “Chenpi-limb weakness”. Conclusion Deep clustering can effectively handle complex, high-dimensional clinical data and reveal latent patterns of clinical medication, providing a methodological reference for extracting clinical experience and supporting clinical decision-making.
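For readers unfamiliar with the clustering metrics reported above, the following is a minimal standard-library sketch of how accuracy, NMI, and ARI are commonly computed from true and predicted labels. It is not the authors' evaluation code. The function names are hypothetical. The accuracy search is brute force rather than the Hungarian algorithm usually used, and assumes no more clusters than classes. The NMI normalization by the arithmetic mean of the two entropies is an assumption, since the abstract does not state which variant was used.

```python
from collections import Counter
from itertools import permutations
from math import comb, log


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of cluster ids to classes.

    Brute force over mappings, so only practical for a handful of clusters;
    assumes the number of clusters does not exceed the number of classes.
    """
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    overlap = Counter(zip(y_pred, y_true))  # (cluster, class) -> count
    best = 0
    for mapping in permutations(classes, len(clusters)):
        hits = sum(overlap[(c, k)] for c, k in zip(clusters, mapping))
        best = max(best, hits)
    return best / len(y_true)


def adjusted_rand_index(y_true, y_pred):
    """Adjusted Rand index from pair counts in the contingency table."""
    n = len(y_true)
    sum_cells = sum(comb(v, 2) for v in Counter(zip(y_true, y_pred)).values())
    sum_rows = sum(comb(v, 2) for v in Counter(y_true).values())
    sum_cols = sum(comb(v, 2) for v in Counter(y_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def normalized_mutual_info(y_true, y_pred):
    """NMI normalized by the arithmetic mean of the two label entropies."""
    n = len(y_true)
    p_true, p_pred = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    mi = sum(
        c / n * log(c * n / (p_true[t] * p_pred[p]))
        for (t, p), c in joint.items()
    )
    h_true = -sum(c / n * log(c / n) for c in p_true.values())
    h_pred = -sum(c / n * log(c / n) for c in p_pred.values())
    denom = (h_true + h_pred) / 2
    return mi / denom if denom > 0 else 1.0
```

All three metrics are invariant to how cluster ids are numbered, which is why a clustering that recovers the true groups under different ids still scores as a perfect match.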
[CLC number]
TP18; R285
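[Funding]
General Program of the National Natural Science Foundation of China (82474352); Key Project of the Hunan Provincial Administration of Traditional Chinese Medicine (A2024011, 2023-24); Natural Science Foundation of Hunan Province (2023JJ60124); Key Scientific Research Project of the Hunan Provincial Department of Education (22A0255, 22A0281); Outstanding Youth Project of the Hunan Provincial Department of Education (22B0400); General Scientific Research Project of the Hunan Provincial Department of Education (22C0195); Natural Science Foundation of Changsha (kq2202265, kq2402172); University-level Scientific Research Fund of Hunan University of Chinese Medicine (2021XJJJ021)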