Souza, Marcone Jamilson Freitas
Bianchi, Andrea Gomes Campos
Diniz, Débora Nasser
2024-02-07
2024-02-07
2023
DINIZ, Débora Nasser. Contributions to automating the analysis of conventional Pap smears. 2023. 106 f. Thesis (Doctorate in Computer Science) - Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, 2023.
http://www.repositorio.ufop.br/jspui/handle/123456789/18071
Programa de Pós-Graduação em Ciência da Computação. Departamento de Ciência da Computação, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto.
This thesis, organized as a compilation of articles, develops and presents contributions to the automated analysis of conventional Pap smear slides. A conventional Pap smear slide is a sample of cervical cells collected and prepared on a glass slide for subsequent cytopathological analysis. The main contributions are methods to detect and classify cervical cell nuclei, with the aim of developing a decision support tool for cytopathologists. The first article resulting from this research uses a hierarchical Random Forest methodology to classify nuclei from the Herlev and Center for Recognition and Inspection of Cells (CRIC) Searchable Image Database databases, based on 232 handcrafted features. In this article, we investigate balancing techniques, perform statistical analyses using Shapiro-Wilk and Kruskal-Wallis tests, and introduce the CRIC Searchable Image Database segmentation base. Our results set the state of the art in five metrics for nucleus classification into five and seven classes, and in precision and F1-score for two-class classification. The second article introduces a method for nucleus detection in synthetic Pap smear images from the Overlapping Cervical Cytology Image Segmentation Challenge dataset proposed at the 11th International Symposium on Biomedical Imaging (ISBI’14). In this second article, we investigate clustering algorithms for image segmentation.
We also explore four traditional machine learning techniques (Decision Tree (DT), Nearest Centroid (NC), k-Nearest Neighbors (k-NN), and Multi-layer Perceptron (MLP)) for classification and propose an ensemble method using DT, NC, and k-NN. Our results set the state of the art in recall on this dataset. The third article proposes an ensemble method using EfficientNets B1, B2, and B6 to classify images from the CRIC Searchable Image Database dataset. Here, we investigate ten neural network architectures to choose those used in the ensemble and present a data augmentation methodology based on image transformation techniques. Our results set the state of the art in five metrics for nucleus classification into two and three classes. Furthermore, we introduce results for six-class classification. Lastly, the fourth article introduces the Cytopathologist Eye Assistant (CEA), an intuitive and user-friendly tool that uses deep learning to detect and classify cervical cells in Pap smear images, supporting cytopathologists in providing diagnoses. We investigate You Only Look Once (YOLO) v5 and YOLOR for performing both tasks (detection and classification), and also explore combining YOLOv5 for detection with the ensemble of EfficientNets from the third article for classification. The article explores data balancing techniques, undersampling, and oversampling using Python’s CLoDSA library. The CRIC Cervix database was used for tool evaluation, considering four scenarios: original images, resized images, augmented resized images, and balanced resized images.
CEA was validated by specialists with years of experience in cytopathology, who highlighted the tool’s ease of use and its potential to address specific queries.
pt-BR
open
Computational intelligence
Cancer
Cervical pain
Papanicolaou test
Machine learning
Contributions to automating the analysis of conventional Pap smears.
Thesis
Authorization granted to the Repositório Institucional da UFOP by the author on 06/02/2024 under the following conditions: available under a Creative Commons 4.0 License that permits copying, distributing, and transmitting the work, provided that the author and the licensor are credited. Commercial use and adaptation are not permitted.