Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology
American Journal of Artificial Intelligence
Volume 1, Issue 1, December 2017, Pages: 44-55
Received: May 14, 2017; Accepted: Jun. 1, 2017; Published: Aug. 30, 2017
Haftom Gebregziabher, Department of Information Technology, Federal TVET Institute, Addis, Ethiopia
Million Meshasha, Department of Information Science, Addis Ababa University, Addis, Ethiopia
Patrick Cerna, Department of Information Technology, Federal TVET Institute, Addis, Ethiopia
Recent advances in communication technologies, computer hardware, and database technologies have made it easy for organizations to collect, store, and manipulate massive amounts of data. As the volume of data grows, the proportion of it that people can understand decreases substantially. Learning algorithms for knowledge discovery are a promising and relevant area of research, offering new possibilities and benefits in real-world applications such as blood bank data warehouses. The availability of an optimal supply of blood in blood banks is a critical aspect of a blood transfusion service. Blood banks typically depend on healthy people voluntarily donating blood for transfusion. The ability to identify regular blood donors enables blood banks and voluntary organizations to plan and organize blood donation camps efficiently. The objective of this study was to explore the applicability of data mining technology in the Ethiopian national blood bank service by developing a predictive model that supports donor recruitment strategies: by identifying donors at risk of transfusion-transmissible infections (TTIs), the model can help in the collection of safe blood, which in turn assists in maintaining an optimal blood supply. The analysis was carried out on a dataset of 14575 blood donors with at least one pathogen, using the J48 decision tree and Naive Bayes algorithms implemented in Weka. The J48 decision tree algorithm, with an overall model accuracy of 94%, yielded interesting rules. Of a total of 156729 consecutive blood donors, 14757 (9.41%) had serological evidence of infection with at least one pathogen and 29 (0.19%) had multiple infections. The overall seroprevalence of HIV, HBV and HCV was 2.29%, 5.23%, and 2.30% respectively.
The seropositivity of TTIs was significant in business owners, students, civil servants, unemployed individuals, drivers and age groups 25 to 34 and 35 to 44 years.
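The headline figure for donors with at least one pathogen is a simple proportion of the screened donors. The following is a minimal sketch of that arithmetic, using only the two counts quoted in the abstract; the function name `prevalence_percent` and the constant names are illustrative assumptions and do not come from the study.

```python
def prevalence_percent(positive: int, screened: int) -> float:
    """Return the positive count as a percentage of all screened donors."""
    if screened <= 0:
        raise ValueError("screened must be a positive count")
    return 100.0 * positive / screened


# Counts quoted in the abstract (names are illustrative, not the authors').
TOTAL_DONORS = 156729
DONORS_WITH_ANY_INFECTION = 14757

overall = prevalence_percent(DONORS_WITH_ANY_INFECTION, TOTAL_DONORS)
print(f"Donors with at least one pathogen: {overall:.2f}%")
```

The result comes out at roughly 9.4%, in line with the 9.41% reported in the abstract. The per-pathogen rates cannot be recomputed the same way here, because the abstract gives only their percentages and not the underlying counts.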
Data Mining, Blood Bank, HIV, HBV, HCV, CRISP-DM, Ethiopia
To cite this article
Haftom Gebregziabher, Million Meshasha, Patrick Cerna, Predicting the Seroprevalence of HBV, HCV, and HIV Based on National Blood of Addis Ababa Ethiopia Using Data Mining Technology, American Journal of Artificial Intelligence. Vol. 1, No. 1, 2017, pp. 44-55. doi: 10.11648/j.ajai.20170101.16
Copyright © 2017 Authors retain the copyright of this article.
This article is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.