Abstract:
Objective. Histology image analysis is a crucial diagnostic step in staging and treatment planning, especially for cancerous lesions. With the increasing adoption of computational methods for image analysis, significant strides are being made to improve the performance metrics of image segmentation and classification frameworks. However, many developed frameworks effectively function as black boxes, granting minimal context to the decision-making process. Thus, there is a need to develop methods that offer both reasonable discriminatory power and a biologically informed intuition for the decision-making process.
Approach. In this study, we utilized and modified a discriminative feature-based dictionary learning (DFDL) paradigm to generate a classification framework that (i) discriminates between two clinically distinct diseases or histologies and (ii) provides interpretable, group-specific representative dictionary image patches, or 'atoms', generated during classifier training. This implementation is performed on multiplexed immunofluorescence images from two separate patient cohorts: a pancreatic cohort consisting of cancerous and non-cancerous tissues, and a metastatic non-small cell lung cancer (mNSCLC) cohort of responders and non-responders to an immunotherapeutic treatment regimen. The analysis was done at both the image level and the subject level. Five cell types were selected as our phenotypes of interest: epithelial cells, cytotoxic lymphocytes, antigen-presenting cells, helper T cells, and T-regulatory cells.
Results. We showed that DFDL had significant discriminant capabilities for both the pancreatic pathologies cohort (subject-level AUC = 0.8878) and the mNSCLC immunotherapy response cohort (subject-level AUC = 0.7221). The secondary analysis also showed that more than 50% of the dictionary atoms obtained from the classifier contained biologically relevant information.
Significance. Our method shows that the generated dictionary features can help distinguish patients presenting two different histologies with strong sensitivity and specificity metrics. These features allow for an additional layer of model interpretability, a highly desirable element in clinical applications for identifying novel biological phenomena.
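
The distinction between image-level and subject-level AUC can be illustrated with a minimal sketch. The abstract does not state how image-level outputs were pooled per subject, so the mean pooling used here is an assumption, and the function names, scores, and subject IDs are all hypothetical toy values rather than data from the study.

```python
from collections import defaultdict


def subject_level_scores(image_scores, image_subjects):
    """Pool image-level scores per subject by averaging (assumed pooling rule)."""
    grouped = defaultdict(list)
    for score, subject in zip(image_scores, image_subjects):
        grouped[subject].append(score)
    return {subject: sum(vals) / len(vals) for subject, vals in grouped.items()}


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two images per subject, one class label per subject.
image_scores = [0.9, 0.7, 0.4, 0.65, 0.6, 0.8, 0.3, 0.1]
image_subjects = ["A", "A", "B", "B", "C", "C", "D", "D"]
subject_labels = {"A": 1, "B": 0, "C": 1, "D": 0}

# Image-level AUC: every image inherits the label of its subject.
image_auc = auc(image_scores, [subject_labels[s] for s in image_subjects])

# Subject-level AUC: pool scores first, then rank subjects.
per_subject = subject_level_scores(image_scores, image_subjects)
subjects = sorted(per_subject)
subject_auc = auc(
    [per_subject[s] for s in subjects],
    [subject_labels[s] for s in subjects],
)

print(f"image-level AUC:   {image_auc:.4f}")
print(f"subject-level AUC: {subject_auc:.4f}")
```

In this toy case a single misranked image lowers the image-level AUC, while pooling per subject absorbs it. That is one reason the two levels can report different values for the same classifier.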

Towards the characterization of the tumor microenvironment through dictionary learning-based interpretable classification of multiplexed immunofluorescence images
