Are deep learning classification results obtained on CT scans fair and interpretable?

dc.contributor.author: Ashames M.M.A.
dc.contributor.author: Demir A.
dc.contributor.author: Gerek O.N.
dc.contributor.author: Fidan M.
dc.contributor.author: Gulmezoglu M.B.
dc.contributor.author: Ergin S.
dc.contributor.author: Edizkan R.
dc.contributor.author: Koc M.
dc.contributor.author: Barkana A.
dc.contributor.author: Calisir C.
dc.date.accessioned: 2024-07-22T08:01:49Z
dc.date.available: 2024-07-22T08:01:49Z
dc.date.issued: 2024
dc.description.abstract: Following the great success of various deep learning methods in image and object classification, the biomedical image processing community has also been flooded with their applications to various automatic diagnosis tasks. Unfortunately, most deep learning-based classification attempts in the literature focus solely on achieving extreme accuracy scores, without considering interpretability or patient-wise separation of training and test data. For example, most lung nodule classification papers using deep learning randomly shuffle the data and split it into training, validation, and test sets, so that some images from the Computed Tomography (CT) scan of a person fall into the training set while other images of the same person fall into the validation or test sets. This can result in misleading reported accuracy rates and the learning of irrelevant features, ultimately reducing the real-life usability of these models. When deep neural networks trained with this traditional, unfair data shuffling are challenged with images of new patients, the trained models perform poorly. In contrast, deep neural networks trained with strict patient-level separation maintain their accuracy rates even when images of new patients are tested. Heat map visualizations of the activations of the networks trained with strict patient-level separation indicate a higher degree of focus on the relevant nodules. We argue that the research question posed in the title has a positive answer only if the deep neural networks are trained with images of patients that are strictly isolated from the validation and test patient sets. © The Author(s) 2024.
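The abstract's central methodological point is a strict patient-level separation of training, validation, and test data. Below is a minimal sketch of such a split, assuming only that each CT slice is tagged with the ID of the patient it came from; the use of scikit-learn's GroupShuffleSplit, the function name, and the toy data are illustrative assumptions, not the authors' implementation.

# Minimal sketch of a patient-level (group-wise) train/validation/test split.
# Assumption: every CT slice carries the ID of the patient it was taken from.
# GroupShuffleSplit keeps all slices of one patient inside a single partition,
# so no patient "leaks" from the training set into validation or test.
from sklearn.model_selection import GroupShuffleSplit

def patient_level_split(slices, patient_ids, test_size=0.2, val_size=0.1, seed=42):
    # First separate the test patients from everyone else.
    outer = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    trainval_idx, test_idx = next(outer.split(slices, groups=patient_ids))

    # Then split the remaining patients into training and validation patients.
    rel_val_size = val_size / (1.0 - test_size)
    inner = GroupShuffleSplit(n_splits=1, test_size=rel_val_size, random_state=seed)
    trainval_groups = [patient_ids[i] for i in trainval_idx]
    train_rel, val_rel = next(inner.split(trainval_idx, groups=trainval_groups))

    train_idx = [trainval_idx[i] for i in train_rel]
    val_idx = [trainval_idx[i] for i in val_rel]
    return train_idx, val_idx, list(test_idx)

# Hypothetical toy data: eight slices from four patients.
slices = ["slice_%d.png" % i for i in range(8)]
patients = ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4"]
train, val, test = patient_level_split(slices, patients)
# No patient appears in more than one partition.
assert not {patients[i] for i in train} & {patients[i] for i in test}
assert not {patients[i] for i in train} & {patients[i] for i in val}

By contrast, the "unfair" protocol criticized in the abstract corresponds to shuffling and splitting at the slice level, where slices of the same patient can land on both sides of the split and inflate the reported accuracy.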
dc.identifier.DOI-ID: 10.1007/s13246-024-01419-8
dc.identifier.issn: 2662-4729
dc.identifier.uri: http://akademikarsiv.cbu.edu.tr:4000/handle/123456789/11575
dc.language.iso: English
dc.publisher: Springer Science and Business Media Deutschland GmbH
dc.rights: All Open Access; Hybrid Gold Open Access
dc.subject: Computer aided diagnosis
dc.subject: Computerized tomography
dc.subject: Deep neural networks
dc.subject: Learning systems
dc.subject: Accuracy rate
dc.subject: Chest computed tomography
dc.subject: Classification results
dc.subject: Computed tomography scan
dc.subject: DNN
dc.subject: Interpretability
dc.subject: Interpretability and reliability
dc.subject: Malignancy classification
dc.subject: Patient images
dc.subject: Training sets
dc.subject: Classification (of information)
dc.title: Are deep learning classification results obtained on CT scans fair and interpretable?
dc.type: Article