A deep learning framework to classify breast density with noisy labels regularization

Lopez-Almazan, Hector; Pérez-Benito, Francisco Javier; Larroza, Andrés; Perez-Cortes, Juan-Carlos; Pollan-Santamaria, Marina; Perez-Gomez, Beatriz; Salas Trejo, Dolores; Casals, María; Llobet, Rafael

doi:10.1016/j.cmpb.2022.106885

Please use this identifier to cite or link to this item: http://hdl.handle.net/20.500.12105/15173

Title

A deep learning framework to classify breast density with noisy labels regularization

Author(s)

Publication date

2022-06

Citation

Comput Methods Programs Biomed. 2022 Jun;221:106885.

Language

English

Document type

journal article

Abstract

Background and objective: Breast density assessed from digital mammograms is a biomarker for higher risk of developing breast cancer. Experienced radiologists assess breast density using the Breast Imaging Reporting and Data System (BI-RADS) categories. Supervised learning algorithms have been developed for this task; however, their performance depends on the quality of the ground-truth information, which is usually labeled by expert readers. These labels are noisy approximations of the ground truth, as there is often intra- and inter-reader variability among labels. Thus, it is crucial to provide a reliable method to classify digital mammograms into BI-RADS categories. This paper presents RegL (Labels Regularizer), a methodology that includes several image pre-processing steps to allow both a correct breast segmentation and the enhancement of image quality through an intensity adjustment, thus allowing the use of deep learning to classify the mammograms into BI-RADS categories. The Confusion Matrix CNN (CM-CNN) used implements an architecture that models each radiologist's noisy label. The final methodology pipeline was determined after comparing the performance of image pre-processing steps combined with different deep learning (DL) architectures.
Methods: A multi-center study of 1395 women whose mammograms were classified into the four BI-RADS categories by three experienced radiologists is presented. A total of 892 mammograms were used as the training corpus, 224 formed the validation corpus, and 279 the test corpus.
Results: The combination of five networks implementing the RegL methodology achieved the best results among all the models in the test set. The ensemble model obtained an accuracy of 0.85 and a kappa index of 0.71.
Conclusions: The proposed methodology has a similar performance to the experienced radiologists in the classification of digital mammograms into BI-RADS categories. This suggests that the pre-processing steps and the modelling of each radiologist's label allow for a better estimation of the unknown ground truth labels.
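
As an illustration of what modelling each radiologist's noisy label can look like, the following is a minimal NumPy sketch of one common formulation: the classifier's estimate of the true-class probabilities is multiplied by a per-reader confusion matrix to predict what that reader would have labeled, and a trace penalty on the confusion matrices is added to the cross-entropy. This is an assumption about the general technique, not the architecture or loss reported in the paper; the function names, the `trace_weight` value, and the array shapes are hypothetical.

```python
import numpy as np

NUM_CLASSES = 4   # four BI-RADS density categories (from the abstract)
NUM_READERS = 3   # three radiologists (from the abstract)


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def reader_probs(p_true, confusion):
    """Predict each reader's label distribution.

    p_true:    (N, C) estimated probabilities of the true class.
    confusion: (R, C, C), confusion[r, i, j] = P(reader r says j | true class i).
    Returns:   (R, N, C) predicted label distribution per reader.
    """
    return np.einsum("nc,rcj->rnj", p_true, confusion)


def noisy_label_loss(p_true, confusion, labels, trace_weight=0.01):
    """Cross-entropy against every reader's labels plus a trace penalty.

    labels: (R, N) integer labels, one row per reader.
    trace_weight is a hypothetical hyperparameter chosen for illustration.
    """
    probs = reader_probs(p_true, confusion)
    r_idx = np.arange(labels.shape[0])[:, None]
    n_idx = np.arange(labels.shape[1])[None, :]
    picked = probs[r_idx, n_idx, labels]          # (R, N) probability of each given label
    cross_entropy = -np.log(picked + 1e-12).mean()
    trace = np.trace(confusion, axis1=1, axis2=2).sum()
    return cross_entropy + trace_weight * trace
```

In this sketch, a reader whose confusion matrix is the identity is treated as noise-free, so the predicted reader distribution equals the classifier output. In a trainable version the confusion matrices would be learned jointly with the network weights.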

Keywords

Breast density | Noisy labels | Deep learning | Dense tissue classification | Mammography

MESH

Online version

https://doi.org/10.1016/j.cmpb.2022.106885

DOI

10.1016/j.cmpb.2022.106885

Appears in collections