Optimization of Malaria Diagnosis by Machine Learning According to the CRISP-DM Model Applied to the University Teaching Hospital Clinics of Lubumbashi (DRC)
1. Introduction
In 2023, malaria caused nearly half a million deaths, 95% of which were in Africa and 76% in children under 5 years of age [1]. In sub-Saharan Africa, malaria diagnostic errors persist due to reliance on rapid diagnostic tests (RDTs) and microscopy, whose sensitivities range from 60% to 85% depending on operational conditions [2]. The Democratic Republic of Congo (DRC) ranks second in the world in both malaria cases (12.6%) and deaths (11.3%) [3]. In 2023, it accounted for 55% of malaria cases in Central Africa, the highest rate in the sub-region [4]. In Lubumbashi, health centers, hospitals and university clinics face major challenges: overloaded health professionals, prolonged diagnostic delays and human errors. Traditional methods, such as microscopy and rapid tests, have limited sensitivity and specificity; microscopy in particular requires expertise and specialized equipment that other tests do not [5].
The burden of malaria is particularly high in low-income countries, where health systems are often underdeveloped and resources for prevention, diagnosis, and treatment are insufficient. In the DRC, access to quality tools and the necessary expertise remains a recurring problem, directly threatening patients’ lives.
It is in this context that our study takes on its full meaning: it seemed essential to us to propose an innovative solution that could not only help our country, but also benefit other nations facing similar challenges. Thus, artificial intelligence (AI) and machine learning (ML) appear as promising alternatives for automating and optimizing medical diagnoses, particularly in resource-limited contexts. Indeed, several studies have shown the effectiveness of ML in the diagnosis of infectious diseases, including malaria, with accuracy rates sometimes higher than those of conventional methods [6] [7]. According to Kermany et al. (2018), AI can even achieve diagnostic accuracy equivalent to that of human experts in certain complex clinical contexts [8]. However, their implementation remains little explored in sub-Saharan Africa, notably due to difficulties in accessing data, user training and infrastructure constraints [9].
This study aims to fill this gap by developing an expert system based on the CRISP-DM (Cross-Industry Standard Process for Data Mining) methodology, guaranteeing a structured, reproducible approach adapted to field realities.
2. Materials and Methods
The implementation of this project relies on the combined use of technical means, software resources, and a rigorous data processing method. This section describes the tools used and the methodological steps followed to build the intelligent diagnostic system.
2.1. Materials
The development of the system relied on a coherent set of software tools adapted to data science. Python 3.10 was chosen for its wealth of Machine Learning-oriented libraries, such as scikit-learn, pandas, and seaborn, regularly used in biomedical studies [10]. To ensure a stable, isolated, and reproducible working environment, we used the Anaconda distribution, whose management of virtual environments and dependencies reduces the risk of version conflicts between libraries. The source code was developed in the PyCharm integrated development environment, which facilitates project structuring, static code analysis, debugging, and integration with version control tools such as Git.
Several specialized Python libraries were used to meet the specific needs of our project. These tools covered all the necessary functionalities, from data manipulation and preparation to modeling, visualization, and user interface management. Table 1 presents the main libraries used and their role in the development of the expert system.
Table 1. Python libraries used and their roles.
| No. | Library | Main role |
| --- | --- | --- |
| 1 | scikit-learn | Training and evaluating the artificial intelligence model (Decision Tree). |
| 2 | pandas | Efficient manipulation and transformation of the clinical data. |
| 3 | joblib | Saving and reloading trained models for later use. |
| 4 | streamlit | Creating an interactive and accessible web interface. |
| 5 | sqlite3 | Local management of the database containing the data and history. |
| 6 | datetime | Managing the timestamps needed for historical tracking. |
| 7 | matplotlib/seaborn | Visualizing the model's performance through confusion matrices and ROC curves. |
| 8 | numpy | Numerical calculations and efficient manipulation of data arrays. |
| 9 | pickle | Saving and loading Python objects, particularly for model persistence. |
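To illustrate the persistence role played by joblib in Table 1, the following is a minimal sketch; the toy training set and file name are purely illustrative, not the study's actual model.

```python
import os
import tempfile

import joblib
from sklearn.tree import DecisionTreeClassifier

# Train a toy model, save it, then reload it as a later session would.
clf = DecisionTreeClassifier(random_state=0).fit([[0, 1], [1, 0]], [0, 1])

path = os.path.join(tempfile.mkdtemp(), "malaria_tree.joblib")
joblib.dump(clf, path)          # persist the trained model to disk
restored = joblib.load(path)    # reload it for later predictions

print(restored.predict([[0, 1]])[0])  # 0
```

The same pattern applies to any fitted scikit-learn estimator, which is what allows the deployed interface to reuse the model without retraining.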
2.2. Methods
The methodological approach is based on the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework, which is widely adopted for structured data mining projects. This framework consists of six interconnected steps, ranging from business understanding to the deployment phase, including data preparation and modeling. Its effectiveness in medical and epidemiological contexts is widely documented [11] [12].
In a recent study on the diagnosis of type 2 diabetes, the authors explicitly applied the CRISP-DM methodology to develop a high-performance predictive model. This work demonstrates the relevance and effectiveness of the structured CRISP-DM process in a concrete framework of medical data analysis [13]. This methodology is all the more relevant in medical environments with low computerization, as shown by the work of Amato et al. (2013), where CRISP-DM made it possible to efficiently structure the analysis and transformation of raw clinical data [14].
This process includes several complementary phases as represented in Figure 1.
Figure 1. The data mining lifecycle.
To better illustrate how the CRISP-DM methodology was operationalized in this work, Table 2 provides a concise mapping between each phase of the framework and the specific actions undertaken. This summary strengthens the methodological narrative by linking the theoretical steps of the CRISP-DM process to their concrete implementation in the context of malaria diagnosis at the University Clinics of Lubumbashi.
2.2.1. Business Understanding
This step is crucial to align clinical and technical objectives, as highlighted by Otero et al. (2005), who state that the success of a medical data mining project strongly depends on the quality of the initial business understanding [15].
Table 2. Mapping of CRISP-DM phases to actions performed in this study.

| CRISP-DM phase | Concrete actions performed in this study |
| --- | --- |
| Business Understanding | Interviews with physicians; observation of clinical consultations; identification of diagnostic workflow and decision criteria for malaria at UCL. |
| Data Understanding | Collection of 2500 retrospective patient records (Jan 2023-Mar 2024); exploration of variables (demographic, clinical, diagnostic); assessment of class distribution (224 malaria, 2276 non-malaria). |
| Data Preparation | Cleaning (handling missing values, duplicates); encoding categorical variables; transformation into numerical formats; class weighting to address imbalance; feature selection based on correlations and importance ranking. |
| Modeling | Implementation of a Decision Tree classifier; use of 10-fold cross-validation; training with class weights; computation of performance metrics (accuracy, recall, specificity, F1, MCC, ROC-AUC). |
| Evaluation | Performance analysis with mean ± SD across folds; confusion matrix inspection; ROC curve analysis; clinical interpretability assessment. |
| Deployment | Integration of the model into a Streamlit interface; development of a relational SQLite database to store patient records, symptoms, and predictions; user interface for clinicians. |
To gain a concrete understanding of medical practices in the field, we observed several consultations and interviewed healthcare professionals to understand the logic they use to diagnose malaria. This approach made it possible to formalize the essential questions asked of patients, understand the underlying clinical reasoning, and structure this reasoning in a clear and reproducible manner.
The clinical diagnosis of malaria is based on identifying characteristic symptoms and ruling out other possible causes of fever. This process revolves around a focused medical interview, during which the doctor asks a series of key questions. Table 3 presents these questions, along with the medical objective pursued for each of them:
2.2.2. Understanding Data
To ensure effective deployment of our malaria prediction model, we began with a practical analysis of the collected data, in order to precisely meet the expectations of doctors and demonstrate the relevance of the variables used.
Thorough data analysis from the earliest stages is essential to detect anomalies and better guide future treatments. As Chapman et al. (2000) note, the data understanding phase directly influences the quality of predictive models in health [16].
1) Data sources
The dataset used in this study was extracted from the file DonneesMalaria.xlsx, compiled from retrospective and anonymised patient records collected at the University Clinics of Lubumbashi (UCL) between January 2023 and March 2024. This Excel file centralises all the information required for analysis, including clinical symptoms, vital parameters, results of biological tests, and demographic characteristics. Only records with complete and consistent data were retained for the study, and all personal identifiers were removed prior to analysis to ensure patient anonymity (See Figure 2).
Table 3. Questions asked by the physician when a suspected diagnosis of malaria is made.
| No. | Doctor's question | Objective of the question |
| --- | --- | --- |
| 1 | What is your gender? | Identify risk factors related to sex, particularly for pregnancy. |
| 2 | How old are you? | Assess the patient's vulnerability, especially in children or the elderly. |
| 3 | What is your weight? | Assess the patient's general condition and adjust future management. |
| 4 | Are you returning from recent surgery? | Rule out postoperative fever as an alternative cause. |
| 5 | Have you had a fever recently? | Check for the main symptom of malaria, linked to the lysis of red blood cells. |
| 6 | Did you feel chills? | Detect the typical infection cycles of the Plasmodium parasite. |
| 7 | Do you sweat profusely after a fever? | Confirm the sweating phase following the febrile attack. |
| 8 | Do you suffer from headaches? | Identify headaches commonly reported in malaria cases. |
| 9 | Have you experienced nausea or vomiting? | Detect digestive signs that may indicate a more severe form. |
| 10 | Are you pregnant? (if patient concerned) | Detect a high-risk situation requiring appropriate treatment. |
Figure 2. Data presentation.
2) Description of data
The dataset includes 2500 records, each corresponding to a patient described by several clinical and demographic attributes (See Table 4).
Table 4. Description of data.
| No. | Attribute | Description | Type |
| --- | --- | --- | --- |
| 1 | Sex | Male or female | Categorical |
| 2 | Age | In years | Numeric |
| 3 | Weight (kg) | In kilograms | Numeric |
| 4 | Postoperative | Yes/No, depending on whether the patient is returning from an operation | Binary |
| 5 | Fever | Yes/No | Binary |
| 6 | Chills | Yes/No | Binary |
| 7 | Sweating | Yes/No | Binary |
| 8 | Headaches | Yes/No | Binary |
| 9 | Nausea | Yes/No | Binary |
| 10 | Pregnancy | Yes/No/Not specified (if woman ≥ 18 years old or inapplicable case) | Categorical |
| 11 | Diagnosis | Probable malaria/Other | Categorical (target) |
3) Data exploration
The dataset contains 2500 rows, each representing a patient, and 11 variables, including demographic information (gender, age, weight), clinical symptoms (fever, chills, headache, etc.) and final diagnosis. The class distribution was as follows: 224 malaria cases (9%) and 2276 non-malaria cases (91%). This imbalance reflects the real-world prevalence observed at the University Clinics of Lubumbashi.
To mitigate the risk of the model being biased towards the majority class, a class weighting strategy was applied during training. This ensured that malaria cases, although underrepresented, had a proportional impact on the learning process.
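A hedged sketch of this class-weighting strategy with scikit-learn follows; the synthetic features and labels below stand in for the study's data, and the exact hyperparameters are our assumption. With `class_weight="balanced"`, each class is re-weighted inversely to its frequency, so the minority malaria class is not drowned out during training.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.class_weight import compute_class_weight

rng = np.random.default_rng(42)
X = rng.integers(0, 2, size=(2500, 8))   # 8 binary symptom-like features (synthetic)
y = np.zeros(2500, dtype=int)
y[:224] = 1                              # 224 positives, mirroring the 9% prevalence

# "balanced" weighting: n_samples / (n_classes * class_count)
weights = compute_class_weight("balanced", classes=np.array([0, 1]), y=y)
print(weights)   # the minority class is weighted roughly 10x the majority

clf = DecisionTreeClassifier(class_weight="balanced", random_state=0).fit(X, y)
```

The weights affect the impurity computation at every split, which is how underrepresented malaria cases keep a proportional influence on the learned tree.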
Table 5 presents all the variables included in the analysis, accompanied by a brief description, their type, as well as the justification for their relevance in the context of our study:
4) Data preparation
The data preparation phase was essential to ensure a clean, consistent, and usable dataset. This step consisted of several sub-phases: presentation, cleaning, transformation, statistical testing, and variable selection (See Figure 3).
a) Data Presentation
The dataset comes from the file DonneesMalaria.xlsx and contains 2500 rows representing patients, each described through 11 clinical, demographic and diagnostic variables. The data is structured in a tabular manner:
Table 5. Data exploration.
| No. | Variable | Description | Type | Reason for its presence in the model |
| --- | --- | --- | --- | --- |
| 1 | Sex | Patient's gender (Male/Female) | Categorical | To detect possible differences in exposure or response to malaria between men and women. |
| 2 | Age | Patient's age in years | Numeric | Age can influence vulnerability to malaria (children and the elderly are often at greater risk). |
| 3 | Weight (kg) | Patient's weight in kilograms | Numeric | Can indicate the patient's general condition; useful in post-diagnostic monitoring. |
| 4 | Postoperative | Whether the patient is returning from an operation (Yes/No) | Binary | Helps rule out symptoms related to recent surgery rather than malaria infection. |
| 5 | Fever | Presence or absence of fever | Binary | Main symptom of malaria; its detection is crucial. |
| 6 | Chills | Presence or absence of chills | Binary | A common symptom of malaria, often associated with fever. |
| 7 | Sweating | Presence or absence of excessive sweating | Binary | May accompany fever spikes and help confirm a case of malaria. |
| 8 | Headaches | Presence or absence of headaches | Binary | Common symptom that may increase the likelihood of a malaria diagnosis. |
| 9 | Nausea | Presence or absence of nausea or vomiting | Binary | May be linked to malaria but also to other conditions; useful for refining prediction. |
| 10 | Pregnancy | Whether the patient is pregnant (Yes/No/Not specified) | Categorical | Detects a high-risk situation requiring specific management. |
| 11 | Diagnosis | Clinical observation result: “Probable malaria” or “Other” | Categorical | Model target; serves as a basis for training and validating predictions. |
Figure 3. Presentation of the data set.
Each column represents a variable (gender, fever, headache, etc.).
The variable types are:
Qualitative: sex, post-operative, diagnosis, symptoms, pregnancy (Yes/No);
Quantitative: age (in years), weight (in kg).
5) Cleaning
Cleaning consisted of managing missing values, correcting inconsistencies, and standardizing formats. Imputation was performed by the mean for numeric variables (age, weight), and by the modal value for categorical variables (sex, pregnancy, symptoms), to limit the loss of information.
This approach, although elementary, is commonly used in medical studies, especially when the rate of missing values is low. It is considered an effective method to preserve initial distributions without introducing major biases. Jakobsen et al. (2017) point out that simple imputation, especially by the mean or the most frequent value, is acceptable for exploratory and descriptive analyses when the missing data are random and few in number [15].
Here are the concrete operations carried out:
Duplicates: identified via a search across all columns, then deleted.
Input errors: standardization of labels.
Missing values:
For binary variables, the modal (most frequent) value was used for imputation.
For numeric variables (age, weight), the mean was used to replace missing data.
No individuals were removed in order to preserve the complete sample.
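The cleaning steps above can be sketched with pandas on a toy DataFrame; the column names and values are assumptions for illustration, not the study's exact schema.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25.0, None, 40.0, 25.0],
    "Weight (kg)": [60.0, 55.0, None, 60.0],
    "Fever": ["Yes", "No", None, "Yes"],
})

df = df.drop_duplicates()                                # remove exact duplicate rows
df["Age"] = df["Age"].fillna(df["Age"].mean())           # mean imputation (numeric)
df["Weight (kg)"] = df["Weight (kg)"].fillna(df["Weight (kg)"].mean())
df["Fever"] = df["Fever"].fillna(df["Fever"].mode()[0])  # modal imputation (categorical)

print(len(df), df.isna().sum().sum())  # 3 0
```

No rows are dropped for missingness, matching the choice above to preserve the complete sample.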
6) Transformation
To prepare the data for machine learning analysis, we applied the following transformations:
Conversion of “Yes”/“No” responses to 1/0 across multiple columns, including the target variable “Diagnosis” (1 for Probable Malaria, 0 for Other).
Transformation of the variable “Sex” into two binary columns: “Sex_Female” and “Sex_Male”.
Transformation of the “Pregnancy” variable into a single variable coded as follows: 0 = No, 1 = Yes and finally 2 = Not specified.
Converting the “Age” and “Weight (kg)” columns to numeric values, then binning them into ranges:
Age: 1 = child (<15 years), 2 = adult (15 - 50 years), 3 = senior (>50 years)
Weight: 1 = low (<45 kg), 2 = normal (45 - 75 kg), 3 = high (>75 kg)
These transformations make it easier to integrate data into machine learning models, while maintaining a simple and interpretable structure.
It should be noted that no normalization was performed, because the decision trees used in our study are insensitive to the scale of the variables. This choice is justified by Tan et al. (2018), who indicate that tree models are based on cutting thresholds and not on distances or numerical magnitudes [17].
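The encoding and binning rules listed above can be sketched with pandas as follows; the toy data and exact bin edges are our assumptions (the edges approximate the paper's cut-offs), and column names follow the paper's tables.

```python
import pandas as pd

df = pd.DataFrame({
    "Sex": ["Male", "Female"],
    "Fever": ["Yes", "No"],
    "Pregnancy": ["Not specified", "Yes"],
    "Age": [10, 62],
    "Weight (kg)": [30.0, 80.0],
    "Diagnosis": ["Probable malaria", "Other"],
})

# Binary Yes/No columns and the target become 1/0
df["Fever"] = (df["Fever"] == "Yes").astype(int)
df["Diagnosis"] = (df["Diagnosis"] == "Probable malaria").astype(int)

# Pregnancy: 0 = No, 1 = Yes, 2 = Not specified
df["Pregnancy"] = df["Pregnancy"].map({"No": 0, "Yes": 1, "Not specified": 2})

# Sex becomes two binary columns: Sex_Female and Sex_Male
df = pd.get_dummies(df, columns=["Sex"])

# Binning: age into child/adult/senior, weight into low/normal/high
df["Age"] = pd.cut(df["Age"], bins=[0, 15, 50, 200], labels=[1, 2, 3])
df["Weight (kg)"] = pd.cut(df["Weight (kg)"], bins=[0, 45, 75, 500], labels=[1, 2, 3])
```

After these steps every column is numeric or ordinal, which is sufficient for the tree-based models used here.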
After the transformation, our data takes the form shown in the following Figure 4.
Figure 4. Data transformation.
7) Dimension reduction or attribute selection
To simplify the model and improve its performance, we applied a multi-step selection of key variables. First, the identification columns (number, last name, first name) were removed. Then, the correlation matrix of the numerical variables was used to identify and eliminate redundant variables with a correlation greater than 0.8, thus avoiding multicollinearity. A Random Forest model was then used to estimate the relative importance of the variables, retaining those whose importance exceeded the average.
This method is particularly suitable in medical contexts, as it allows handling heterogeneous datasets while maintaining robust performance. Random Forest-based attribute selection is widely recognized for its ability to identify the most discriminating variables, thus improving the accuracy of predictive models [18]. The final set includes, among others, age, weight, clinical variables, binary sex, as well as the diagnostic target variable.
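The two-stage selection described above (drop one of each pair of features with correlation above 0.8, then keep features whose Random Forest importance exceeds the mean) can be sketched as follows; the data and feature names are synthetic, chosen only so that one feature is deliberately redundant.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((300, 4)), columns=["f1", "f2", "f3", "f4"])
X["f4"] = 0.99 * X["f1"] + 0.01 * rng.random(300)   # f4 is redundant with f1
y = (X["f1"] + X["f2"] > 1).astype(int)             # target depends on f1, f2 only

# 1) remove one feature from each pair with |correlation| > 0.8
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [c for c in upper.columns if (upper[c] > 0.8).any()]
X = X.drop(columns=to_drop)                         # drops "f4"

# 2) keep features with above-average Random Forest importance
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
keep = X.columns[rf.feature_importances_ > rf.feature_importances_.mean()]
print(list(to_drop), list(keep))
```

The pure-noise feature `f3` falls below the mean importance and is discarded, while the informative `f1` and `f2` are retained.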
Figure 5 shows our data after attribute reduction or selection:
Figure 5. Dimension reduction or attribute selection.
8) Verification of the links between predictor variables and target variables
We analyzed the relationships between predictor variables and the target variable “Diagnosis” by examining correlations for numeric and binary variables (including Sex_Female and Sex_Male), as well as by statistically comparing their distributions across diagnostic classes. Correlation analysis and statistical tests (such as the Mann-Whitney test or the Chi-square test) are essential to ensure the validity of the relationships between attributes and the target variable in a predictive model. These methods not only allow us to detect linear associations, but also to assess whether the distributions differ significantly across diagnostic groups [19].
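The two tests named above can be run with scipy; the data below are synthetic and serve only to show the shape of each test (a Mann-Whitney U test for a numeric predictor, a Chi-square test on a 2x2 contingency table for a binary symptom).

```python
import numpy as np
from scipy.stats import mannwhitneyu, chi2_contingency

rng = np.random.default_rng(1)
age_malaria = rng.normal(20, 10, 200)   # synthetic: younger diagnostic group
age_other = rng.normal(35, 10, 800)

# Non-parametric comparison of the Age distributions across classes
u_stat, p_age = mannwhitneyu(age_malaria, age_other)
print(f"Mann-Whitney p-value (Age): {p_age:.3g}")

# 2x2 contingency table: fever (rows) vs diagnosis (columns), synthetic counts
table = np.array([[180, 300],    # fever:    malaria / other
                  [20, 500]])    # no fever: malaria / other
chi2, p_fever, dof, expected = chi2_contingency(table)
print(f"Chi-square p-value (Fever): {p_fever:.3g}")
```

A small p-value indicates that the predictor's distribution differs across diagnostic groups, supporting its inclusion in the model.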
The image below shows the correlation matrix between the different variables studied, illustrating the positive or negative relationships between them. This visualization allows us to better understand potential interactions and identify highly correlated variables that may influence the modeling (See Figure 6).
Figure 6. Correlation matrix between variables.
2.2.3. Modeling
In accordance with the modeling step of the CRISP-DM process, we implemented a machine learning algorithm to automatically predict malaria diagnosis from the collected clinical data. Our objective is to evaluate whether patient characteristics allow a machine learning model to effectively identify the presence or absence of the disease.
1) Decision tree
The choice of the decision tree is based both on its good performance with medium-sized datasets and on its transparency, which is crucial in a medical context. Indeed, this type of model is called “white-box”: it allows health professionals to understand the reasoning leading to a prediction, thus promoting confidence and clinical acceptance [20].
As Shortliffe and Sepúlveda point out, clinicians’ trust in diagnostic support tools strongly depends on the clarity of the reasoning proposed by the system [21]. This explainability is a decisive factor for integrating artificial intelligence into medical practices. In this perspective, Gambetti et al. also insist that interpretability is an ethical and functional requirement of clinical decision support systems (CDSS) [22].
To obtain reliable performance estimates, we applied a 10-fold cross-validation strategy instead of a single 80/20 split. At each fold, 90% of the dataset (2250 records) was used for training and 10% (250 records) for testing. This approach reduces variance in the evaluation and ensures that all cases contribute to both training and testing. The final reported metrics correspond to the mean ± standard deviation across the 10 folds.
In addition, to compensate for the imbalance between malaria and non-malaria cases, the decision tree was trained with class weights, which helped maintain sensitivity for minority-class detection.
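The evaluation protocol above can be sketched with scikit-learn; synthetic data stands in for the study's 2500 records (with a comparable ~9% positive rate), and the scoring choice shown here is one of several metrics the study reports.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 2500 records, roughly 9% positive class
X, y = make_classification(n_samples=2500, n_features=8,
                           weights=[0.91], random_state=0)

clf = DecisionTreeClassifier(class_weight="balanced", random_state=0)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# One recall score per fold; report mean ± standard deviation, as in the paper
scores = cross_val_score(clf, X, y, cv=cv, scoring="recall")
print(f"Recall: {scores.mean():.3f} ± {scores.std():.3f}")
```

Stratified folds preserve the 9%/91% class ratio in every train/test split, which matters when the positive class is this rare.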
Figure 7 illustrates the structure of our decision tree model applied to clinical data.
Figure 7. Decision tree.
The decision tree presented above was constructed to classify patients according to their diagnosis (malaria or other). It uses the different explanatory variables from the clinical data to make successive decisions, represented by each node.
Model performance:
Accuracy: 90.4% ± 1.2%, indicating good generalization ability.
Precision: 90.0% ± 1.3%, limiting false positives.
Recall: 88.0% ± 1.5%, which is essential to avoid missing patients.
Specificity: 92.0% ± 1.0%, limiting false positives.
F1-score: 0.89 ± 0.01, reflecting a good balance between precision and recall.
MCC coefficient: 0.81 ± 0.02, indicating a strong correlation between predictions and reality.
2) The ROC Curve
To evaluate the performance of our model, we used the Receiver Operating Characteristic (ROC) curve, which plots the true positive rate (TPR) against the false positive rate (FPR) for different decision thresholds.
In a medical context, this analysis is essential: it allows us to estimate the model’s ability to correctly differentiate patients with malaria from healthy ones, while integrating the potential consequences of false negatives (undetected sick patients) and false positives (healthy patients diagnosed as sick) [23].
The area under the curve (AUC) provides a synthetic measure of performance:
An AUC ≈ 1 indicates excellent discriminatory power,
An AUC ≈ 0.5 corresponds to a random model [24].
Hanley and McNeil demonstrated that AUC is a reliable measure for comparing diagnostic models [25].
Thus, the ROC curve and AUC are reference tools in medicine to evaluate and compare the performance of classification algorithms in diagnostic studies [26].
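A minimal sketch of the ROC/AUC computation discussed above follows, using predicted probabilities from a held-out split; the data, split ratio, and tree depth are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2500, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          stratify=y, random_state=0)
clf = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]        # probability of the positive class
fpr, tpr, thresholds = roc_curve(y_te, proba)
auc = roc_auc_score(y_te, proba)
print(f"AUC = {auc:.3f}")                    # close to 1: excellent; 0.5: random
```

The `(fpr, tpr)` pairs are what the ROC figure plots; the AUC summarizes them into a single discriminatory-power score.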
Figure 8 shows the ROC curve obtained during our study.
Figure 8. ROC curve.
This graph, based on a test sample of 500 patients taken from an initial set of 2500, shows excellent discriminatory ability. The curve approaches the upper left corner, indicating a high rate of correct detection of positive cases and a low rate of false alarms.
Performance is quantified by the area under the curve (AUC): a value close to 1 reflects an excellent model, while a value close to 0.5 corresponds to a random model.
Here are the details of our ROC curve results (See Figure 9).
2.2.4. Assessment
After training, we evaluated the model's performance on held-out data not used during training, following the cross-validation protocol described above [27]. This step is crucial to assess the model's ability to generalize its predictions to new cases.
1) Performance metrics
To quantify the quality of predictions, several standard indicators in supervised learning were calculated:
Figure 9. ROC curve data.
Recall (Sensitivity): Also known as the true positive rate, this measures the model’s ability to correctly detect malaria patients. This metric is crucial because low recall means that many true cases are missed (false negatives), which can have serious clinical consequences.
Precision: proportion of positive predictions (malaria) that are actually correct. It assesses the reliability of positive alerts, limiting the number of false positives [29].
F1-score: harmonic mean between precision and recall, offering a balanced compromise between these two metrics, particularly relevant in the presence of unbalanced classes [30].
Specificity: True negative rate, i.e., the model’s ability to correctly identify patients without malaria. Good specificity helps limit false positives, thus avoiding misdiagnoses in healthy patients.
These metrics were calculated by comparing the model’s predictions to actual diagnoses provided by physicians on the test set, thus allowing the model’s relevance in a clinical context to be accurately assessed.
2) Confusion matrix
The confusion matrix is an essential tool for analyzing model performance in detail: it makes it possible to identify critical errors, especially false negatives, which are particularly consequential in a medical context [31] (See Table 6).
Table 6. Confusion matrix.

| | Predicted malaria | Predicted other |
| --- | --- | --- |
| Actual malaria | True positives | False negatives |
| Actual other | False positives | True negatives |
This representation makes it possible to identify critical errors (false negatives, in particular) which impact the quality of the diagnosis.
This matrix is presented graphically in Figure 10.
Figure 10. Confusion matrix (comparison between model predictions and actual diagnoses).
In this matrix we found:
True positives (TP = 197): Patients with malaria correctly identified as such.
False negatives (FN = 27): Patients with malaria but not detected by the model, a critical error.
False positives (FP = 21): Healthy patients falsely diagnosed as having malaria.
True negatives (TN = 255): Healthy patients correctly identified.
This representation allows us to clearly visualize the most sensitive errors, in particular false negatives, which here represent 12% of real malaria cases (27 out of 224).
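As a consistency check, the headline performance figures reported earlier can be recomputed directly from these four counts with simple arithmetic (MCC as defined by Matthews):

```python
# Confusion-matrix counts reported above
TP, FN, FP, TN = 197, 27, 21, 255

recall = TP / (TP + FN)                       # 197/224 ≈ 0.879
precision = TP / (TP + FP)                    # 197/218 ≈ 0.904
specificity = TN / (TN + FP)                  # 255/276 ≈ 0.924
accuracy = (TP + TN) / (TP + FN + FP + TN)    # 452/500 = 0.904
f1 = 2 * precision * recall / (precision + recall)   # ≈ 0.891
mcc = (TP * TN - FP * FN) / (
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)) ** 0.5   # ≈ 0.806

print(f"recall={recall:.3f} precision={precision:.3f} "
      f"specificity={specificity:.3f} accuracy={accuracy:.3f} "
      f"f1={f1:.3f} mcc={mcc:.3f}")
```

The recomputed values (88% recall, 90% precision, 92% specificity, 90.4% accuracy, 0.89 F1, 0.81 MCC) agree with the reported metrics.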
2.2.5. Deployment
The deployment of our expert system for malaria diagnosis relies on the seamless integration of the classification model, the user interface, and a relational medical database. The latter constitutes a central element, ensuring the structured and sustainable storage of diagnosed cases.
1) Medical database architecture
As part of the deployment, a relational database was developed to meet clinical, technical, and ethical requirements. It allows for the storage, tracking, and consultation of patient data, thus facilitating the future use of results for research or medical monitoring purposes.
The objectives of this base are multiple:
Structured storage of generated diagnoses: Each prediction made by the system is recorded, creating a usable history of the cases treated.
Medical reference: Previous cases can be compared with similar new cases, enriching decision-making.
Traceability of decisions: Each diagnosis is linked to clinical data, date, and model output, ensuring total transparency in the decision-making chain.
Support for medical research: Stored data can be reused to refine future models or for epidemiological studies.
In order to meet the above objectives, we have adopted a modular architecture divided into three main tables:
1. Table: Patients
Table 7 gathers patients’ personal and biometric information. It is essential for uniquely identifying each individual supported by the system. By centralizing data such as name, age, gender, and weight, it allows for tracking each patient’s medical history, ensuring diagnostic traceability, and facilitating inter-patient comparisons during clinical studies.
Table 7. Patient table.
| No. | Field | Type | Description |
| --- | --- | --- | --- |
| 1 | patient_id | INTEGER (PK) | Unique patient identifier |
| 2 | name | TEXT | Patient's last name |
| 3 | first_name | TEXT | Patient's first name |
| 4 | sex | TEXT | Gender (Male/Female) |
| 5 | age | INTEGER | Patient's age |
| 6 | weight | REAL | Weight (in kg) |
| 7 | pregnancy | BOOLEAN | Pregnancy (Yes/No) |
2. Table: Symptoms
Table 8 contains the various clinical signs noted for each patient at the time of the consultation. It is linked to the Patients table by a foreign key. Its importance lies in the fact that it directly feeds the prediction model with the explanatory variables necessary for the analysis (fever, chills, headaches, etc.). It also allows us to observe the symptomatic evolution and to conduct statistical analyses on the frequency or correlation between symptoms.
3. Table: Diagnostics
Table 9 records the results produced by the artificial intelligence model for each patient. It contains the system’s decision (malaria or not), the probability associated with this decision, as well as the prediction date. Its importance is crucial because it constitutes the memory of the predictions made. It allows the model’s performance to be verified on real cases, medical audits to be conducted, and medical decisions made at a given time to be documented.
Figure 11 shows the relational architecture or data schema of our expert system.
Table 8. Symptoms table.
| No. | Field | Type | Description |
| --- | --- | --- | --- |
| 1 | id_symptom | INTEGER (PK) | Symptom set identifier |
| 2 | patient_id | INTEGER (FK) | Foreign key linked to the patient |
| 3 | fever | BOOLEAN | Presence of fever |
| 4 | chills | BOOLEAN | Presence of chills |
| 5 | headaches | BOOLEAN | Headaches reported |
| 6 | sweating | BOOLEAN | Abnormal sweating |
| 7 | nausea | BOOLEAN | Nausea symptom |
| 8 | postoperative | BOOLEAN | Postoperative status |
Table 9. Diagnostic table.
| No. | Field | Type | Description |
| --- | --- | --- | --- |
| 1 | id_diagnostic | INTEGER (PK) | Unique diagnostic identifier |
| 2 | patient_id | INTEGER (FK) | Reference to the patient concerned |
| 3 | result_prediction | BOOLEAN | 1 = Probable malaria; 0 = Other |
| 4 | probability | FLOAT | Model confidence score |
| 5 | date_prediction | DATE | Date of diagnosis |
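A possible SQLite realization of the three tables above, using the stdlib sqlite3 module listed in Table 1; the field names follow Tables 7-9, while the constraints, sample values, and in-memory database are our assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # a file path such as "malaria.db" in practice
conn.executescript("""
CREATE TABLE patients (
    patient_id INTEGER PRIMARY KEY,
    name TEXT, first_name TEXT, sex TEXT,
    age INTEGER, weight REAL, pregnancy BOOLEAN
);
CREATE TABLE symptoms (
    id_symptom INTEGER PRIMARY KEY,
    patient_id INTEGER REFERENCES patients(patient_id),
    fever BOOLEAN, chills BOOLEAN, headaches BOOLEAN,
    sweating BOOLEAN, nausea BOOLEAN, postoperative BOOLEAN
);
CREATE TABLE diagnostics (
    id_diagnostic INTEGER PRIMARY KEY,
    patient_id INTEGER REFERENCES patients(patient_id),
    result_prediction BOOLEAN, probability FLOAT, date_prediction DATE
);
""")

# Illustrative record: one patient and one stored prediction
conn.execute("INSERT INTO patients VALUES (1, 'Doe', 'Jane', 'Female', 28, 62.5, 0)")
conn.execute("INSERT INTO diagnostics VALUES (1, 1, 1, 0.93, '2024-03-15')")

row = conn.execute("""SELECT p.name, d.result_prediction, d.probability
                      FROM diagnostics d
                      JOIN patients p USING (patient_id)""").fetchone()
print(row)   # ('Doe', 1, 0.93)
```

The foreign keys on `patient_id` are what link each stored prediction back to the patient and symptoms, providing the traceability described above.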
Figure 11. Database architecture.
3. Results and Discussions
The implementation of our malaria diagnostic expert system at the University Clinics of Lubumbashi proved particularly instructive. It allowed us to assess the relevance of the developed model in a real-world environment, taking into account local practices, technical constraints, and the specific needs of medical staff.
3.1. Analysis of the Results Obtained
Tests performed on a sample of 500 patients from a global dataset of 2500 showed that the decision tree-based machine learning model offers robust and reliable performance.
The overall accuracy of 90.4% indicates that the model correctly classifies the majority of cases. The recall (sensitivity) of 88% shows that the system effectively detects most malaria patients, which is crucial to limit false negatives with serious clinical consequences. The precision (90%) ensures that positive alerts are reliable, while the F1-score (0.89) reflects a good balance between recall and precision.
The high specificity (92%) confirms the model’s ability to limit false positives, thus avoiding unnecessary overdiagnosis. In addition, the MCC coefficient (0.81) highlights a very good overall quality of the classification, even in the event of imbalance between classes.
These results demonstrate that the model can be an effective diagnostic aid tool, capable of supporting medical decisions while accelerating treatment.
It is important to note that other machine learning models, such as SVMs (Support Vector Machines) or neural networks, could have been considered. However, despite their sometimes superior performance, these models are often considered “black boxes” due to their lack of transparency. In clinical settings, this can be a major barrier to their adoption, as practitioners need to be able to justify medical decisions. Ribeiro et al. (2016) emphasize the importance of explainability in strengthening end-user trust in intelligent systems [32].
3.1.1. Main Clinical Data Entry Interface
As part of the system deployment, a user-friendly graphical interface was developed to facilitate the entry of clinical data by healthcare professionals. This interface allows for the structured entry of essential patient information (name, age, gender, weight), as well as observed symptoms such as fever, chills, headaches, sweating, or nausea. Once the data is entered, the system triggers the prediction process and displays the diagnostic result along with a confidence rate. This interface plays a crucial role in the operationalization of the system, ensuring intuitive handling, rapid data entry, and clear restitution of the medical verdict (See Figure 12).
3.1.2. System Limits
Despite these satisfactory performances, several limitations must be highlighted:
Figure 12. Input form.
Contextual factors not integrated: The model does not yet integrate epidemiological or environmental variables (seasonality, geographic location, history of epidemics), which could improve the relevance of the predictions.
Single-user deployment: The system currently operates locally without the ability to centralize or share data at the institutional or regional level, limiting coordination and overall epidemiological surveillance.
3.1.3. Future Prospects
In light of the initial tests carried out at the University Clinics of Lubumbashi, several avenues for improvement can be considered:
✅ Larger-scale deployment: Connect multiple workstations to a central database to facilitate the collection, aggregation and global analysis of medical data.
✅ Model enrichment: integrate new clinical and biological variables (such as red blood cell count, precise body temperature, etc.) to increase diagnostic accuracy.
✅ Development of a mobile version: make the system accessible via a lightweight Android application, intended for community health workers or field interventions.
✅ Continuing education: ensure lasting ownership of the system by medical staff through training and support sessions.
✅ Longitudinal monitoring: establish a mechanism for monitoring diagnosed cases in order to measure patient progress and adjust therapeutic recommendations over the long term.
By making the diagnostic tool accessible, transparent and adapted to local realities, this study is part of a logic of equitable and innovative medicine. It also paves the way for other digital health initiatives in low-income countries, where AI can have a direct impact on the care provided to vulnerable populations [8].
Conflicts of Interest
The authors declare no conflicts of interest.