Performance Comparison of Data Mining Techniques for Predicting HIV Status among Female Sex Workers in Ghana

ABSTRACT Introduction: The Human Immunodeficiency Virus (HIV) and Acquired Immunodeficiency Syndrome (AIDS) remain a global public health issue. A notable observation is the increasing rate of infection among Female Sex Workers (FSW). HIV Testing Services (HTS) are an essential entry point for any HIV intervention service for FSW, and studies such as the Integrated Bio-Behavioral Surveillance Surveys (IBBSS) anonymously link the HIV test results of FSW with their socio-demographic and behavioral characteristics. These studies summarize the HIV status of FSW against those variables using traditional statistical methods. This approach limits the information available to scale up knowledge on HIV status among FSW in Ghana; hence, innovative solutions such as data mining are needed to explore ways of predicting HIV status from available information. Objective: The purpose of this paper is to develop a data mining solution that predicts the HIV status of FSW from socio-demographic and behavioral characteristics. Methodology: The study adopted the Cross-Industry Standard Process for Data Mining (CRISP-DM), which comprises six main steps: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Ghana’s 2015 FSW IBBSS data set was used for the study. Microsoft Excel was used for data preparation, and WEKA 3.6.9 was used as the data mining tool to run experiments with five algorithms: Random Tree, J48, Naïve Bayes, Logistic Regression, and Neural Network. Results: The target dataset contained 3,092 FSW participants, of whom 2,491 (80.56%) were roamers and the remaining 601 (19.44%) were seaters. Participants' ages ranged from 16 to 64 years. Of the five classifiers, Random Tree was the best at predicting HIV status, with an accuracy of 98.9%.
Age, highest educational level, marital status, average income from sex work, sex work experience, relationship with most recent sexual partner, HIV prevention knowledge, number of sex partners in a week, frequency of condom use with paying partners, previous HIV testing, condom use by paying clients, religion, anal sex, drug use, alcohol consumption before sex, and FSW type were found to be predictors of the HIV status of FSW. The extracted association rule showed a direct relationship between having anal sex and the HIV status of an FSW. Conclusion: The results of the study showed that data mining can be used to extract relevant information for predicting HIV status among FSW, and that socio-demographic and behavioral attributes are sufficient to predict the HIV status of an FSW.
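
An association rule of the kind mentioned in the results is usually described by its support, confidence, and lift. The following is a minimal sketch of how those three measures are computed, not the study's WEKA workflow. The field names `anal_sex` and `hiv_positive` and the eight toy records are hypothetical and invented purely for illustration; they are not drawn from the IBBSS data.

```python
from typing import Dict, List


def rule_metrics(
    records: List[Dict[str, bool]], antecedent: str, consequent: str
) -> Dict[str, float]:
    """Compute support, confidence and lift for the rule antecedent -> consequent."""
    n = len(records)
    if n == 0:
        raise ValueError("records must not be empty")

    n_ante = sum(1 for r in records if r[antecedent])
    n_cons = sum(1 for r in records if r[consequent])
    n_both = sum(1 for r in records if r[antecedent] and r[consequent])

    support = n_both / n                                 # P(A and C)
    confidence = n_both / n_ante if n_ante else 0.0      # P(C | A)
    lift = confidence / (n_cons / n) if n_cons else 0.0  # P(C | A) / P(C)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy records (illustrative only, NOT study data).
toy_records = [
    {"anal_sex": True, "hiv_positive": True},
    {"anal_sex": True, "hiv_positive": True},
    {"anal_sex": True, "hiv_positive": True},
    {"anal_sex": True, "hiv_positive": False},
    {"anal_sex": False, "hiv_positive": True},
    {"anal_sex": False, "hiv_positive": False},
    {"anal_sex": False, "hiv_positive": False},
    {"anal_sex": False, "hiv_positive": False},
]

metrics = rule_metrics(toy_records, "anal_sex", "hiv_positive")
print(metrics)
```

A lift above 1 means the antecedent and consequent co-occur more often than they would if independent. It indicates association in the data, not causation.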