Intelligent Diagnosis and Prediction Model of Breast Cancer Based on Stacking Ensemble Learning
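Affiliations:

1. School of Mechanical Engineering, Tongji University, Shanghai 201804, China; 2. School of Management, Xi'an Jiaotong University, Xi'an 710049, China; 3. Sino-German College of Applied Sciences, Tongji University, Shanghai 201804, China

About the author:

Duan Chunyan (段春艳), associate professor, Ph.D. in management; main research interests include artificial intelligence and decision optimization. E-mail: duanchunyan77@163.com

Corresponding author:

You Xiaoyue (尤筱玥), assistant professor, Ph.D. in management; main research interests include sustainable management and decision optimization. E-mail: yxyrachel@sina.com

CLC number:

F272.1; TP181

Funding:

National Natural Science Foundation of China (72171170); Fundamental Research Funds for the Central Universities (22120210535)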


    Abstract:

    This study combines innovative data preprocessing methods with machine learning algorithms to build an intelligent prediction model on the Breast Cancer Wisconsin (Diagnostic) dataset. First, recursive feature elimination with a light gradient boosting machine (LightGBM) as the underlying model is used for feature selection. Second, a hybrid sampling scheme that combines adaptive synthetic sampling (ADASYN) oversampling with one-sided selection (OSS) undersampling is applied to handle class imbalance and obtain a balanced training set. Finally, a stacking ensemble diagnosis model is constructed with a multilayer perceptron (MLP), LightGBM, and categorical boosting (CatBoost) as base learners and logistic regression as the meta-learner. The model is evaluated with 5-fold cross-validation and several classification metrics, including accuracy, sensitivity, and the area under the receiver operating characteristic curve (AUC). Experimental results show that the proposed model reaches a prediction accuracy of 98.2% with stable classification performance, and can provide strong decision support for the clinical diagnosis of breast cancer.
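    As a rough illustration of the three evaluation metrics named in the abstract (accuracy, sensitivity, and AUC), the sketch below computes them in plain Python. The function names, the toy labels and scores, the 0.5 decision threshold, and the choice of label 1 as the positive (malignant) class are all assumptions made for illustration; none of this is the authors' code or data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """True-positive rate TP / (TP + FN), treating label 1 as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only (assumed: 1 = malignant, 0 = benign).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
# Assumed decision threshold of 0.5 on the predicted probability.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print(accuracy(y_true, y_pred), sensitivity(y_true, y_pred), roc_auc(y_true, y_score))
# prints 0.75 0.75 0.9375
```

    In a 5-fold cross-validation setting, metrics like these would typically be computed on each held-out fold and then averaged. Accuracy and sensitivity depend on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives.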

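Cite this article

Duan Chunyan, Liu Qiantuo, Wang Jiajie, Guan Di, You Xiaoyue. Intelligent Diagnosis and Prediction Model of Breast Cancer Based on Stacking Ensemble Learning[J]. Journal of Tongji University (Natural Science), 2025, 53(6): 976-984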

History
  • Received: 2023-11-10
  • Published online: 2025-06-27