TITLE:
Prediction of Lung Cancer Stage Using Tumor Gene Expression Data
AUTHORS:
Yadi Gu
KEYWORDS:
Lung Cancer Detection, Stage Prediction, Gene Expression Data, XGBoost, Machine Learning
JOURNAL NAME:
Journal of Cancer Therapy, Vol.15 No.8, August 28, 2024
ABSTRACT: Lung cancer remains a significant global health challenge, and identifying it at an early stage is essential for improving patient outcomes. This study develops and optimizes gene expression-based models for classifying cancer types using machine learning techniques. After applying log2 normalization to the gene expression data and conducting Wilcoxon rank-sum tests, the researchers evaluated various classifiers in combination with Incremental Feature Selection (IFS) strategies. The study culminated in two optimized models built on the XGBoost classifier, comprising 10 and 74 genes respectively. The 10-gene model, owing to its simplicity, is proposed for easier clinical implementation, whereas the 74-gene model exhibited superior performance in terms of specificity, AUC (Area Under the Curve), and precision. Both models were evaluated on sensitivity, AUC, and specificity, with the aim of achieving high sensitivity and AUC while maintaining reasonable specificity.
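
The workflow summarized in the abstract (log2 normalization, Wilcoxon rank-sum ranking of genes, then Incremental Feature Selection) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The toy expression matrix, the binary labels, the pseudocount of 1, and all function names are hypothetical. A leave-one-out nearest-centroid classifier stands in for XGBoost so the example stays dependency-free, and the rank-sum statistic uses a normal approximation without tie correction. AUC is omitted for brevity.

```python
import math
from statistics import mean

def log2_normalize(matrix, pseudocount=1.0):
    """Apply log2(x + pseudocount) to every value (pseudocount is an assumption)."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def rank_sum_z(a, b):
    """Wilcoxon rank-sum z statistic (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mu = n1 * (n1 + n2 + 1) / 2              # expected rank sum under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (w - mu) / sigma

def rank_genes(X, y):
    """Order gene indices by decreasing |z| between the two classes."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append(abs(rank_sum_z(pos, neg)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])

def sensitivity_specificity(y_true, y_pred):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)

def loo_centroid_score(X, y):
    """Leave-one-out nearest-centroid classifier (a stand-in, NOT XGBoost).
    Returns the mean of sensitivity and specificity."""
    preds = []
    for i, sample in enumerate(X):
        dist = {}
        for label in (0, 1):
            rows = [r for j, (r, l) in enumerate(zip(X, y)) if j != i and l == label]
            centroid = [mean(col) for col in zip(*rows)]
            dist[label] = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        preds.append(min(dist, key=dist.get))
    sens, spec = sensitivity_specificity(y, preds)
    return (sens + spec) / 2

def incremental_feature_selection(X, y, ranked, evaluate):
    """Grow the feature set one ranked gene at a time; keep the best-scoring prefix."""
    best_k, best_score = 0, -1.0
    for k in range(1, len(ranked) + 1):
        subset = [[row[g] for g in ranked[:k]] for row in X]
        score = evaluate(subset, y)
        if score > best_score:               # ties favor the smaller gene set
            best_k, best_score = k, score
    return ranked[:best_k], best_score

# Hypothetical toy data: 6 samples x 3 genes, labels 1/0 for the two classes.
X_raw = [[100, 5, 40], [120, 7, 10], [90, 6, 35],
         [10, 6, 30], [15, 5, 12], [8, 7, 38]]
y = [1, 1, 1, 0, 0, 0]

X = log2_normalize(X_raw)
ranked = rank_genes(X, y)
selected, score = incremental_feature_selection(X, y, ranked, loo_centroid_score)
print(ranked, selected, score)
```

Breaking ties in favor of the smaller prefix mirrors the abstract's preference for a compact gene panel when performance is comparable. In practice the `evaluate` callback would wrap a cross-validated classifier such as XGBoost rather than this toy scorer.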