Some forms of human papillomavirus can cause changes to a woman's cervix that may lead to cervical cancer in the long run, while others can produce genital or epidermal tumors. Cervical cancer is a leading cause of morbidity and mortality among women in low- and middle-income countries. Predicting cervical cancer remains an open challenge because several risk factors affect the cervix. With this in mind, the cervical cancer risk factor dataset from Kaggle is used to predict cervical cancer risk classes. First, the dataset is normalised by handling incomplete data and applying pattern calibration. Second, exploratory data analysis is carried out, and the distribution of the target feature (cervical cancer risk) is visualised. Third, several classifiers are fitted to the unprocessed dataset, and their performance is measured before and after feature scaling. Fourth, oversampling methods are applied to the pre-processed dataset. Fifth, the datasets oversampled by the different methods are fitted with all the classifiers, and performance is again compared before and after feature scaling. Sixth, performance is analysed using precision, recall, F-score, accuracy, and running time. The code is written in Python and executed in Spyder via Anaconda Navigator. The experimental findings show that the random forest classifier maintains 96% accuracy both before and after scaling on the unprocessed dataset. Similarly, the same classifier maintains 98% accuracy across all the oversampling techniques.
Keywords: machine learning, classification, scaling, oversampling
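The steps named in the abstract (feature scaling, oversampling, and scoring with precision, recall, F-score, and accuracy) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' code: the abstract does not say which scaler or which oversampling methods were used, so min-max scaling and plain random oversampling are assumed here as stand-ins, and the function names and the toy rows `X` and `y` are hypothetical.

```python
import random
from collections import Counter


def min_max_scale(rows):
    """Rescale each feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    is as large as the largest one (assumed stand-in for the paper's methods)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: two numeric features, one rare positive class.
X = [[18, 0.0], [25, 1.0], [34, 3.0], [42, 2.0], [51, 5.0], [29, 0.0]]
y = [0, 0, 0, 0, 1, 0]

X_balanced, y_balanced = random_oversample(min_max_scale(X), y)
print(Counter(y_balanced))
print(binary_metrics([0, 0, 1, 1, 0], [0, 1, 1, 0, 0]))
```

In practice the oversampling step would normally be applied to the training split only, so that duplicated rows do not leak into the data used for scoring; the abstract does not state how the split was handled.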