Federico A. Galatolo

Detection and Classification of Hysteroscopic Images Using Deep Learning

Diego Raimondo, Antonio Raffone, Paolo Salucci, Ivano Raimondo, Giampiero Capobianco, Federico Andrea Galatolo, Mario Giovanni Cosimo Antonio Cimino, Antonio Travaglino, Manuela Maletta, Stefano Ferla, et al.

Background: Although hysteroscopy with endometrial biopsy is the gold standard for diagnosing endometrial pathology, the gynecologist's experience is crucial for a correct diagnosis. Deep learning (DL), an artificial intelligence method, might help to overcome this limitation. However, only preliminary findings are available, and no studies have evaluated the performance of DL models in identifying intrauterine lesions or the possible benefit of including clinical factors in the model.
Aim: To develop a DL model as an automated tool for detecting and classifying endometrial pathologies from hysteroscopic images.
Methods: A monocentric observational retrospective cohort study was performed by reviewing clinical records, electronic databases, and stored videos of hysteroscopies from consecutive patients with pathologically confirmed intrauterine lesions at our Center from January 2021 to May 2021. Retrieved hysteroscopic images were used to build a DL model for the classification and identification of intracavitary uterine lesions with or without the aid of clinical factors. Study outcomes were the DL model's diagnostic metrics in the classification and identification of intracavitary uterine lesions, with and without the aid of clinical factors.
Results: We reviewed 1500 images from 266 patients: 186 patients had benign focal lesions, 25 had benign diffuse lesions, and 55 had preneoplastic/neoplastic lesions. For both the classification and identification tasks, the best performance was achieved with the aid of clinical factors, with an overall precision of 80.11%, recall of 80.11%, specificity of 90.06%, F1 score of 80.11%, and accuracy of 86.74% for the classification task, and an overall detection of 85.82%, precision of 93.12%, recall of 91.63%, and F1 score of 92.37% for the identification task.
Conclusion: Our DL model achieved low diagnostic performance in the detection and classification of intracavitary uterine lesions from hysteroscopic images. Although the best diagnostic performance was obtained with the aid of clinical data, the improvement was slight.
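
The abstract does not say how the overall classification metrics were aggregated across lesion classes. As an illustration only, the sketch below assumes micro-averaging: one-vs-rest counts are pooled over all classes before the metrics are computed. The function names and the 3x3 confusion matrix are hypothetical and are not taken from the study. Under this assumption, with three classes and a fraction p of images classified correctly, precision, recall, and F1 all equal p, specificity equals (1 + p)/2, and accuracy equals (1 + 2p)/3. For p = 0.8011 this gives roughly 90.06% and 86.74%, which agrees with the figures reported above, although that agreement does not confirm the method the authors used.

```python
from typing import Dict, List


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def micro_average_metrics(confusion: List[List[int]]) -> Dict[str, float]:
    """Pool one-vs-rest counts over all classes, then compute the metrics.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = fp = fn = tn = 0
    for k in range(n_classes):
        tp_k = confusion[k][k]                                    # class k predicted correctly
        fn_k = sum(confusion[k]) - tp_k                           # class k samples that were missed
        fp_k = sum(confusion[i][k] for i in range(n_classes)) - tp_k  # other classes predicted as k
        tn_k = total - tp_k - fn_k - fp_k                         # everything else
        tp, fp, fn, tn = tp + tp_k, fp + fp_k, fn + fn_k, tn + tn_k

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": f1_score(precision, recall),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
# These counts are invented for illustration; they are NOT the study's data.
example_confusion = [
    [40, 5, 5],
    [3, 25, 2],
    [4, 1, 15],
]
example_metrics = micro_average_metrics(example_confusion)
for name, value in example_metrics.items():
    print(f"{name}: {value:.4f}")
```

The same `f1_score` helper can be used to check the identification task: a precision of 93.12% and a recall of 91.63% give an F1 score of about 92.37%, consistent with the reported value.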