Abstract: Cancer is a genetic disease in which gene mutations are pivotal to disease initiation and pathophysiology. Each cancer has a characteristic gene expression profile that can be used for early and accurate diagnosis. Microarray techniques have emerged as powerful tools capable of simultaneously capturing the expression profiles of thousands of genes. However, the high dimensionality of the resulting transcriptome data makes these datasets challenging to analyze. Recent advances in Artificial Intelligence (AI) techniques such as Machine Learning (ML) and Deep Learning can be instrumental in processing these high-dimensional datasets efficiently. LASSO regression is an ML technique that can rank features, which supports feature selection and thereby dimensionality reduction. Deep Learning is one of the most sophisticated ML techniques and can process high-dimensional data owing to the larger number of hidden layers in its neural networks. We designed a Deep Neural Network (DNN) classifier fused with a LASSO-based significant feature extractor to classify a gene expression dataset of 51 samples, of which 24 are from lung cancer patients and the remaining 27 are from normal individuals. A LASSO regression model was implemented to identify the genes that played a significant role in the classification. The expression values of these significant genes were then fed into a convergent Deep Neural Architecture. The classifier was trained on 70% of the data, and the remaining 30% was used for validation. The proposed classifier provided better classification than LASSO regression or the DNN used individually. Over thirty independent assessments, the two classes were classified with an average accuracy of 96.25%, average precision of 99.67%, average specificity of 99.45%, and average sensitivity of 91.73%. In some cases, the model obtained a classification accuracy of 100%. This could open the path to early and better diagnosis of cancers from transcriptome data.
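
The abstract does not give implementation details, so the following is only a minimal sketch of how an L1 (LASSO) penalty can select features: cyclic coordinate descent with soft-thresholding drives the coefficients of uninformative genes to exactly zero. Everything in it is an assumption made for illustration: the synthetic expression values, the 20-gene width, the choice of genes 0 and 1 as the informative ones, the 1/0 label coding, and the penalty `alpha=0.1`. Only the 24/27 class sizes come from the abstract. It is not the authors' code, and in practice a library implementation such as scikit-learn's `Lasso` would normally be used instead.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within +/- threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    Assumes the columns of X and the target y are already mean-centred,
    so no intercept term is fitted.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant feature carries no information
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


def center(values):
    """Subtract the mean so the list sums to zero."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Synthetic stand-in for a microarray matrix (hypothetical values, fixed seed).
rng = random.Random(0)
n_samples, n_genes = 51, 20
labels = [1.0 if i < 24 else 0.0 for i in range(n_samples)]  # 24 cancer, 27 normal

columns = []
for g in range(n_genes):
    shift = 2.0 if g < 2 else 0.0  # only genes 0 and 1 depend on the label
    raw = [rng.gauss(0.0, 1.0) + shift * labels[i] for i in range(n_samples)]
    columns.append(center(raw))

X = [[columns[g][i] for g in range(n_genes)] for i in range(n_samples)]
y = center(labels)

weights = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [g for g, w in enumerate(weights) if w != 0.0]
print("genes with non-zero LASSO coefficient:", selected)
```

The indices in `selected` would then define the reduced input passed to a downstream classifier. Raising `alpha` keeps fewer genes, and lowering it keeps more.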

What problem does this paper attempt to address?

The main objective of this paper is to develop a method that combines Lasso regression feature selection with a deep learning classifier to improve the accuracy of diagnosing lung cancer from transcriptome data. The core contributions of the paper include:

1. **Problem Background**: Cancer is a genetic disease where gene mutations play a key role in the occurrence and development of the disease. Microarray technology can capture the expression profiles of thousands of genes simultaneously, but analyzing these datasets is challenging due to the high dimensionality of the resulting transcriptome data.
2. **Research Method**: The authors designed a model that integrates Lasso regression feature selection with a deep neural network (DNN) classifier. Lasso regression is used to identify important genes that significantly impact classification, and these genes are then fed into a deep neural network for final classification.
3. **Experimental Results**: By validating on a dataset containing 51 samples, including 24 samples from lung cancer patients and 27 samples from healthy individuals, the model achieved an average classification accuracy of 96.25%, an average precision of 99.67%, an average specificity of 99.45%, and an average sensitivity of 91.73%. In some cases, the model even achieved 100% classification accuracy.
4. **Significance and Application**: This study demonstrates that feature selection can significantly enhance the performance of deep learning models when dealing with high-dimensional biomedical data and may provide a new approach for the early diagnosis of lung cancer based on transcriptome data.

In short, the paper proposes a method that uses Lasso regression for feature selection to enhance the performance of deep learning models, aiming to improve the accuracy of diagnosing lung cancer from transcriptome data.
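
The four reported metrics all follow from the counts in a binary confusion matrix. The sketch below shows the standard definitions on a hypothetical 15-sample validation split, roughly 30% of 51. The labels and predictions are invented for illustration and are not the paper's data. The function name `binary_metrics` is likewise an assumption.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, sensitivity and specificity for two classes."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": _ratio(tp, tp + fp),    # predicted cancer that is cancer
        "sensitivity": _ratio(tp, tp + fn),  # cancer samples that were detected
        "specificity": _ratio(tn, tn + fp),  # normal samples correctly cleared
    }


# Hypothetical validation split: 7 cancer (1) and 8 normal (0) samples,
# with a single cancer sample misclassified as normal.
y_true = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this made-up split, one missed cancer sample lowers sensitivity while precision and specificity stay at 1.0. The reported averages show the same ordering, with sensitivity (91.73%) below precision (99.67%) and specificity (99.45%).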

Feature Selection Using Lasso Regression Enhances Deep Learning Model Performance For Diagnosis Of Lung Cancer from Transcriptomic Data

Enhancing Lung Cancer Classification and Prediction With Deep Learning and Multi-Omics Data

Deep-Learning-Based Cancer Profiles Classification Using Gene Expression Data Profile

Diagnostic Classification of Lung Cancer Using Deep Transfer Learning Technology and Multi‐Omics Data

Lung adenocarcinoma identification based on hybrid feature selections and attentional convolutional neural networks

Computer-aided diagnosis of lung carcinoma using deep learning - a pilot study

An Innovative Method for Lung Cancer Identification Using Machine Learning Algorithms

Lung and colon cancer classification using medical imaging: a feature engineering approach

Deep Learning-Based Lung Cancer Classification: Recent Developments and Future Prospects

A study on specific learning algorithms pertaining to classify lung cancer disease

Recognition of Lung Adenocarcinoma-specific Gene Pair Based on Genetic Algorithm and Establishment of a Deep Learning Prediction Model.

DeepCancer: Detecting Cancer through Gene Expressions via Deep Generative Learning

Lung Cancer Detection Based on Kernel PCA-Convolution Neural Network Feature Extraction and Classification by Fast Deep Belief Neural Network in Disease Management Using Multimedia Data Sources

DeepLCRmiRNA: A Hybrid Neural Network Approach for Identifying Lung Cancer-Associated miRNAs

Metaheuristic integrated machine learning classification of colon cancer using STFT LASSO and EHO feature extraction from microarray gene expressions

Deep learning techniques for cancer classification using microarray gene expression data

The Deep Learning ResNet101 and Ensemble XGBoost Algorithm with Hyperparameters Optimization Accurately Predict the Lung Cancer

Gene Selection Based Cancer Classification With Adaptive Optimization Using Deep Learning Architecture

Lung adenocarcinoma and lung squamous cell carcinoma cancer classification, biomarker identification, and gene expression analysis using overlapping feature selection methods

Recognition of Lung Adenocarcinoma-specific Gene Pairs Based on Genetic Algorithm and Establishment of a Deep Learning Prediction Model

A comparative analysis of classical machine learning and deep learning techniques for predicting lung cancer survivability