Development and Validation of a Customer Satisfaction Measuring Instrument with Laboratory Services at the University Hospital of Kinshasa, Democratic Republic of the Congo (DRC)
1. Introduction
The hospital market, which is a service industry, has today changed from a sellers’ market to a buyers’ market, where the customer is all important. Customer satisfaction is considered as one of the desired outcomes of health care and it is directly related to the utilization of health services (Bekele et al., 2008). A study done in South Africa concluded that customer satisfaction is a fundamental indicator of equitable quality of care (Myburgh et al., 2005).
A medical laboratory’s customer service is part of its Quality Management System (QMS): if the customer is not well served, the laboratory is not fulfilling its mission properly. Quality standards such as ISO 15189 and ISO/IEC 17025, as well as the balanced scorecard, stress the importance of systematically using customers’ perspectives in clinical laboratories. Both ISO 15189 and ISO/IEC 17025 encourage a continuous investigative process to identify the causes of processes that deviate from procedures or fail to satisfy customers, so that proper corrective and preventive actions can be initiated (Addis et al., 2013).
Assessing customer satisfaction is an important part of the laboratory’s continuous quality improvement (CQI) program (Crosby, 1989). The Joint Commission on Accreditation of Healthcare Organizations and the College of American Pathologists (CAP) accredit clinical laboratory programs. CAP requires the healthcare facility to measure customer satisfaction with laboratory services every two years. However, no such study has yet been performed in the Democratic Republic of the Congo (DRC), because no well-performing measuring instrument is available. There is therefore a need to build an effective instrument to measure customer satisfaction with laboratory services in the DRC. The main purpose of this study is to develop a theoretical and operational instrument for measuring customer satisfaction with clinical laboratory services. Its specific aim is to present four findings related to the construct validity of the newly developed instrument: 1) dimensionality of the instrument, 2) reliability of the instrument, 3) validity of the construct, and 4) confirmation of the structure of the conceptual framework. The paper begins with a literature review of quality of service and customer satisfaction, then explains the methodology employed, followed by the research results and a discussion of the findings. It ends with key conclusions, limitations of the present work, possible directions for future investigation, and the managerial and theoretical implications of the study.
2. Literature Review and Hypotheses
The lack of consensus on a definition of satisfaction has created serious problems for customer satisfaction research. First, developing context-specific items is difficult because the conceptual definition of customer satisfaction is not clear. As a result, most studies use a single-item rating scale to measure customer satisfaction. Single-item scales do not provide sufficient content domain sampling of complex constructs and are generally believed to be unreliable, since they do not allow internal consistency to be calculated (Chaudhary et al., 2017; Nunnally, 1978). Furthermore, single-item measures provide no guidance to respondents or researchers in interpreting the exact meaning of satisfaction. Consequently, developing multiple-item measures to resolve the measurement difficulties caused by single-item measures is highly recommended (Churchill, 1979).
Secondly, the lack of definitional and measurement standards of customer satisfaction limits theory development in this field, weakens the explanation power of any new theories, and confines the generalization of any empirical findings (Wang et al., 2001).
Previous researchers have indicated that service quality is a precursor of customer satisfaction (Bekele et al., 2008; Myburgh et al., 2005; Poranki et al., 2015). SERVQUAL is a method intended to measure the “quality of service” in companies; it is mainly used in the private sector. This method is the starting point for most of the work on satisfaction and quality of service (Brensinger & Lambert, 1990). However, a number of studies question the validity of the five dimensions of SERVQUAL and the uniform applicability of the method to all service areas, and several problems with the SERVQUAL instrument are discussed in the literature. According to Van Dyke et al. (1999), the use of difference scores in calculating SERVQUAL contributes to problems with the reliability, discriminant validity, convergent validity, and predictive validity of the measure. Consequently, many researchers have proposed that a quality measurement scale should be adapted to the specifics of an individual service industry or even an individual service, and that a general scale should not be used at all (Babakus & Boller, 1992). We therefore developed an instrument for measuring customer satisfaction through quality of service in a clinical laboratory.
The following hypotheses were developed to evaluate the influencing factors on customer satisfaction (see Figure 1):
H1: There is a positive relationship between Reliability of tests’ results (TR) and customer satisfaction (CS).
H2: There is a positive relationship between Responsiveness of services (RS) and customer satisfaction (CS).
H3: There is a positive relationship between laboratory personnel’s (LP) willingness to help and customer satisfaction (CS).
3. Methods
3.1. Study Setting and Design
This was a cross-sectional study conducted at the University Hospital of Kinshasa.
3.2. Sampling Method
Sampling was exhaustive. The formal survey involved all available physicians (heads of the departments concerned, senior residents, postgraduates and junior residents):
1) who had worked at the medical institution for more than half a year,
2) who regularly requested laboratory investigations, and
3) who were on duty during the study period and agreed to participate in the study.
A total of 330 attending physicians responded.
3.3. Study Procedures and Statistical Analysis
To develop a reliable and valid measurement instrument, we followed the general methodological approach recommended by Churchill (Churchill, 1979). Churchill’s paradigm for the development of service quality measurement scales proposes eight steps for developing better measures of marketing constructs: specify the domain of the construct, generate a sample of items, collect data, purify the measure, collect new data, assess reliability with the new data, assess construct validity and develop norms (see Figure 2).
First step: This step consisted of a literature review and a semi-structured interview with customers.
Second step: After the literature review, we generated a structured instrument (questionnaire) comprising SERVQUAL items and items derived from verbal complaints received from customers. The questionnaire contained 39 questions in total: the 22 questions originally used to construct the SERVQUAL model and 17 questions drawn from attending physicians’ complaints. We used a 7-point Likert scale to prevent respondents’ scores from clustering near the average: satisfaction was rated from 0 (strongly disagree, lowest satisfaction) to 6 (strongly agree, highest satisfaction).
Figure 2. Eight stages of Churchill’s paradigm.
This original questionnaire was examined by 11 specialists (5 academics and 6 specialists in marketing research methods) to determine content validity and help avoid redundancy. Their suggestions were used to modify the items and wording of the original questionnaire. After eliminating the items that appeared redundant or irrelevant to our study, we retained 14 variables (questions) for measuring customer satisfaction.
Third step: The questionnaire was then piloted with a convenience sample of 200 attending physicians, which provided the first data set.
Fourth step: This step concerned the purification of the measures. All data collected in the third step were analysed using SPSS 21.0 (SPSS Inc., Chicago, IL, USA). Responses on the 7-point Likert scale were converted to scores from 0 to 100 as follows: 0, strongly disagree; 16.6, disagree; 33.3, slightly disagree; 50, average; 66.6, slightly agree; 83.3, agree; and 100, strongly agree.
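The score conversion listed above can be reproduced with a short sketch. It assumes the mapping is linear from the raw codes 0 to 6 onto 0 to 100, which the listed values suggest; the function name `rescale_likert` is hypothetical and not part of the study's SPSS workflow.

```python
def rescale_likert(raw: int, points: int = 7) -> float:
    """Map a raw Likert code (0 .. points - 1) linearly onto a 0-100 score.

    Assumption: the conversion is linear, as the values 0, 16.6, 33.3, ...
    reported in the text suggest.
    """
    if not 0 <= raw <= points - 1:
        raise ValueError(f"raw code must lie between 0 and {points - 1}")
    return raw * 100 / (points - 1)


LABELS = [
    "strongly disagree", "disagree", "slightly disagree", "average",
    "slightly agree", "agree", "strongly agree",
]

if __name__ == "__main__":
    for raw, label in enumerate(LABELS):
        print(f"{raw} -> {rescale_likert(raw):6.2f}  ({label})")
```

Under this assumption the exact values for "disagree" and "slightly agree" are 16.67 and 66.67; the 16.6 and 66.6 quoted in the text appear to be truncated rather than rounded.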
The main goal of purifying the measures is to establish the dimensionality of the scales, i.e. to group similar measured variables into dimensions in order to identify latent variables or constructs. Exploratory factor analysis (EFA) is the statistical technique that we used to reduce the 14 manifest variables (items) to a smaller number of factors. This technique extracts the maximum common variance from all 14 variables and combines it into common scores. Before performing the EFA, we evaluated sample size adequacy using the Kaiser-Meyer-Olkin (KMO) test of sampling adequacy and assessed whether the factor analysis should proceed using Bartlett’s test of sphericity. Principal component analysis (PCA) with varimax rotation was the extraction method used to study the dimensionality of the construct, i.e. to extract the factors from the data set. Kaiser’s criterion (retain the factors whose eigenvalue is greater than 1) was chosen to determine the number of factors (see Figure 3). According to the PCA results, the developed instrument, which consisted of 14 items measuring customer satisfaction, was hypothesized to have three constructs, i.e. three latent variables, indicating that customer satisfaction is a three-dimensional variable. This is the basis for the three hypotheses put forward in the conceptual framework (see Figure 1).
Fifth step: Data were then collected from 330 attending physicians at the University Hospital of Kinshasa. Trained and qualified investigators distributed the developed 14-item instrument to all physicians and collected it the following day. Respondents completed the questionnaire directly on paper. The collected data were used to verify the conceptual hypothesis: the three-dimensional conceptual model resulting from the EFA was subjected to a Confirmatory Factor Analysis (CFA). In subsequent steps, the CFA results were used to assess whether the model had acceptable reliability, convergent validity, discriminant validity, levels of fit and unidimensionality.
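Kaiser's criterion can be illustrated with a minimal sketch. It assumes the eigenvalues of the item correlation matrix have already been computed (for 14 standardized items they sum to 14); the eigenvalues below are purely illustrative and are not the study's values, and the function name `kaiser_retain` is hypothetical.

```python
def kaiser_retain(eigenvalues):
    """Apply Kaiser's criterion: keep components whose eigenvalue exceeds 1.

    Returns a list of (eigenvalue, percent_of_variance, cumulative_percent)
    for the retained components, in descending eigenvalue order.
    """
    total = sum(eigenvalues)  # equals the number of items for a correlation matrix
    retained, cumulative = [], 0.0
    for value in sorted(eigenvalues, reverse=True):
        if value <= 1.0:
            break
        percent = 100.0 * value / total
        cumulative += percent
        retained.append((value, percent, cumulative))
    return retained


if __name__ == "__main__":
    # Illustrative eigenvalues for 14 items (hypothetical, not from the study).
    example = [6.0, 4.0, 2.5, 0.3, 0.25, 0.2, 0.15,
               0.15, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05]
    for value, percent, cumulative in kaiser_retain(example):
        print(f"eigenvalue={value:.2f}  variance={percent:.2f}%  "
              f"cumulative={cumulative:.2f}%")
```

With real data, the eigenvalues would come from the correlation matrix of the 14 items; the retained count is the number of factors carried forward to rotation.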
Sixth step: The model was checked for reliability. The reliability checks were done using the data collected in the fifth step. We checked internal consistency reliability by analysing the Cronbach’s Alpha, Jöreskog’s Rhô coefficient and composite reliability values. A reliability coefficient of 0.70 or higher was considered “acceptable”.
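As an illustration of the internal consistency check, the sketch below computes Cronbach's alpha from a respondents-by-items score matrix using the standard formula alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The data and the function name `cronbach_alpha` are hypothetical; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per respondent,
    one column per item), using sample variances."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [
        variance([row[j] for row in scores]) for j in range(n_items)
    ]
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical responses from five physicians to a three-item factor.
    responses = [
        [4, 5, 4],
        [2, 3, 3],
        [6, 6, 5],
        [1, 2, 2],
        [5, 4, 5],
    ]
    alpha = cronbach_alpha(responses)
    print(f"alpha = {alpha:.3f}, acceptable = {alpha >= 0.70}")
```

The 0.70 cut-off in the usage example is the acceptability threshold stated above.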
Seventh step: The model was checked for validity. Convergent validity was considered achieved when all indicator loadings exceeded the threshold of 0.7 and all latent variables had an Average Variance Extracted (AVE) greater than 0.5. Discriminant validity was considered established when the Maximum Shared Variance (MSV) and the Average Shared Variance (ASV) were both lower than the AVE for all constructs.
Eighth step: The scale was also subjected to Structural Equation Modelling (SEM). Because there is no single criterion for evaluating the fit of a theoretical model obtained through SEM, various fit indices were used to test the model fit according to the Kline criteria. To evaluate the structural model, we used the five-step assessment procedure proposed by Hair et al.: 1) assess the structural model for collinearity issues; 2) assess the path coefficients; 3) assess the level of R²; 4) assess the effect size f²; 5) assess the predictive relevance Q². The threshold value for each criterion is reported alongside the results to make the evaluation easier to follow.
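The validity rules above can be sketched as follows, assuming standardized loadings and the usual formulas: AVE is the mean of the squared loadings, and composite reliability (the coefficient commonly identified with Jöreskog's rho) is (Σλ)² / ((Σλ)² + Σ(1 − λ²)). The function names and example loadings are hypothetical.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """Composite reliability from standardized loadings."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l * l for l in loadings)
    return squared_sum / (squared_sum + error_sum)


def convergent_validity_ok(loadings, min_loading=0.7, min_ave=0.5):
    """All loadings above 0.7 and AVE above 0.5, as stated in the text."""
    return (all(l > min_loading for l in loadings)
            and average_variance_extracted(loadings) > min_ave)


def discriminant_validity_ok(ave, msv, asv):
    """MSV and ASV must both be lower than AVE."""
    return msv < ave and asv < ave


if __name__ == "__main__":
    loadings = [0.85, 0.90, 0.88, 0.86]  # hypothetical four-item factor
    ave = average_variance_extracted(loadings)
    print(f"AVE = {ave:.3f}, CR = {composite_reliability(loadings):.3f}")
    print("convergent:", convergent_validity_ok(loadings))
    # MSV and ASV here are hypothetical values for illustration only.
    print("discriminant:", discriminant_validity_ok(ave, msv=0.40, asv=0.30))
```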
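Step 4 of this procedure can be illustrated with Cohen's f² effect size, computed from the R² of the endogenous construct with and without a given predictor. The sketch below assumes the conventional cut-offs (0.02 small, 0.15 medium, 0.35 large); the function names and the R² values in the usage example are hypothetical. The same formula applies to q² when Q² values are substituted for R².

```python
def effect_size_f2(r2_included: float, r2_excluded: float) -> float:
    """Cohen's f2 = (R2_included - R2_excluded) / (1 - R2_included)."""
    if not 0.0 <= r2_excluded <= r2_included < 1.0:
        raise ValueError("expected 0 <= R2_excluded <= R2_included < 1")
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def classify_effect(f2: float) -> str:
    """Label an effect size with the conventional 0.02 / 0.15 / 0.35 cut-offs."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "none"


if __name__ == "__main__":
    # Hypothetical R2 values, for illustration only.
    f2 = effect_size_f2(r2_included=0.50, r2_excluded=0.40)
    print(f"f2 = {f2:.3f} ({classify_effect(f2)})")
```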
3.4. Ethics Approval and Consent to Participate
This study conforms to the ethical norms and standards of the Declaration of Helsinki, including ethics committee approval and informed consent. Before the study began, ethical clearance was obtained from the ethical review committee of the School of Public Health of the University of Kinshasa (N˚ ESP/CE/55/2020). Respondents were informed of the purpose of the study and assured of confidentiality and of their right to withdraw from the study. Verbal informed consent was obtained from each participant after the study objectives were explained, and confidentiality was maintained throughout the study.
4. Results
As shown in Table 1, both the KMO statistic and Bartlett’s test of sphericity indicate that the data are appropriate for factor analysis: the Kaiser-Meyer-Olkin (KMO) value is 0.934 (> 0.6), and Bartlett’s test of sphericity is significant (Bartlett = 8249.985; p < 0.001), indicating that the data can be factorized. The factor analysis can therefore be performed in the next step.
The Exploratory Factor Analysis (EFA) was performed using a Principal Component Analysis (PCA) with Varimax rotation method and Kaiser Normalization (see Table 2).
Table 1. Results of KMO and Bartlett’s test.
Table 2. Total variance explained, initial eigenvalues.
The leftmost section of Table 2 shows the initial solution, i.e. the 14 manifest variables or items, while the second section shows the variance explained by the initial solution. Only three factors in the initial solution have eigenvalues greater than 1. Together, they account for 93.481% of the variability in the original (manifest) variables. This suggests that three latent variables are associated with customer satisfaction. The third section of the table shows the variance explained by the extracted factors before rotation; the cumulative variability explained by these three factors is the same as above, i.e. 93.481%. The rightmost section shows the variance explained by the extracted factors after rotation. The rotated factor model makes some small adjustments to the three factors.
The scree plot confirms the choice of three components (see Figure 3).
Three clear factors emerged from this PCA, as shown in Table 3. Together they account for 93.481% of the variance (see Table 2). The first factor is composed of 5 items, the second of 5 items and the third of 4 items. Concerning the validity or quality of the items composing each factor, Table 3 shows that each of the 14 items has a loading higher than 0.82. Thus, 100% of the items were classified as excellent. In summary, customer satisfaction appears to be a three-dimensional concept. Based on the preceding analysis, a comprehensive model for measuring customer satisfaction is presented in Figure 4.
Table 3. Components matrix after varimax rotation.
Table 4. Reliability and Validity of the sub-dimensions that emerged after the PCA.
AVE: Average Variance Extracted; MSV: Maximum Shared Variance; ASV: Average Shared Variance; ρ: Jöreskog Rhô.
Figure 4. Comprehensive model for measuring customer satisfaction with clinical laboratory services.
After confirming the dimensionality of the scale, we assessed its reliability, convergent validity and discriminant validity.
Table 3 shows that all indicator loadings exceed the threshold of 0.7, and Table 4 shows that all the latent variables have an AVE greater than 0.5; convergent validity has therefore been achieved.
Table 4 shows that the three factors registered a Cronbach’s alpha greater than 0.90, indicating that the scale has a very high degree of reliability. The Jöreskog Rhô and Composite Reliability values are greater than 0.7, which further confirms the good reliability of the constructs. Furthermore, both the MSV and the ASV are lower than the AVE for all the constructs in the scale; discriminant validity has therefore been achieved.
Goodness of Fit Indices (GFIs) for a series of Confirmatory Factor Analyses (CFAs) assessing the null, one-factor and two-factor (generated by combining in all possible ways the three theoretically defined components) models of customer satisfaction are presented in Table 5.
According to the Kline criteria, the two-factor model provided a good fit.
The results of the structural model analysis are shown in Table 6; they meet the assessment criteria of the Partial Least Squares Structural Equation Modeling (PLS-SEM) analysis procedure.
Table 6 shows that:
1) The Variance Inflation Factor (VIF) coefficients are less than 4.0 (1.782, 2.559 and 2.046), which indicates that there are no collinearity issues among the constructs.
2) There is a significant relationship between each latent variable and customer satisfaction:
· There is a significant relationship between reliability of tests’ results and customer satisfaction (β = 0.691, t = 45.79, p = 0.024). This finding confirms H1.
· There is a significant relationship between responsiveness of services and customer satisfaction (β = 0.422, t = 2.78, p < 0.001). Hence, H2 is confirmed.
· There is a relationship between laboratory personnel’s willingness to help and customer satisfaction that is significant at the 10% level (β = 0.315, t = 1.69, p = 0.056). This finding supports H3.
3) The R² values (0.958, 0.615 and 0.511) are all greater than 0.26 and are therefore considered substantial, indicating high predictive power of the model.
4) Based on the f² effect sizes, only reliability of tests’ results (TR) has a small effect size (f² = 0.067).
5) In terms of the predictive relevance of the individual exogenous variables, the q² value of 0.016 for the variable TR indicates a small effect.
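The reported t-values can be compared against the two-tailed critical values given in the footnote to Table 6 (1.65, 1.96 and 2.57 for the 10%, 5% and 1% levels). The sketch below is illustrative only: treating a t-value equal to the critical value as significant is an assumption, and the function name `significance_stars` is hypothetical.

```python
# Two-tailed critical t-values from the Table 6 footnote (Hair et al., 2017).
CRITICAL_VALUES = [(2.57, "***"), (1.96, "**"), (1.65, "*")]


def significance_stars(t_value: float) -> str:
    """Return '***', '**', '*' or '' for the 1%, 5%, 10% levels."""
    for critical, stars in CRITICAL_VALUES:
        if abs(t_value) >= critical:  # assumption: boundary counts as significant
            return stars
    return ""


if __name__ == "__main__":
    # t-values as reported in the text for the three structural paths.
    paths = {"TR -> CS": 45.79, "RS -> CS": 2.78, "LP -> CS": 1.69}
    for path, t in paths.items():
        print(f"{path}: t = {t:.2f} {significance_stars(t)}")
```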
5. Discussion
We developed a new customer satisfaction measurement scale and tested its reliability and validity. Because the quality of study results is directly related to the quality of the instrument used to collect data, it is important to collect data by means of a reliable and valid instrument (Andrew et al., 2011).
Table 5. Summary of model adjustment indicators.
* Cutoff criteria for good model fit recommended by Kline (2005).
Table 6. Results of hypothesis testing and predictable power.
a: t-values for two-tailed test: *1.65 (sig. level = 10%); **1.96 (sig. level = 5%); ***2.57 (sig. level = 1%) (Hair et al., 2017). Notes: ***p < 0.01, **p < 0.05, *p < 0.10. Effect size: 0 = none, 0.02 = small, 0.15 = medium, 0.35 = large (Cohen, 1988). Effect sizes were calculated using the following formulas: b: f² = (R² included − R² excluded) / (1 − R² included); c: q² = (Q² included − Q² excluded) / (1 − Q² included).
According to the criterion recommended by Hair et al., an adequate sample size requires between 5 and 10 individuals for each instrument item (Hair et al., 2009). According to Tabachnick and Fidell, the validity of a factor analysis is compromised with fewer than 300 individuals (Tabachnick & Fidell, 2007). Our new instrument had 14 items in its application version, which would require a minimum sample size of 70 people according to the Hair et al. criterion. Our sample of 330 people satisfied both criteria, allowing the exploratory and confirmatory validations to be performed.
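The arithmetic behind these two sample size rules is simple enough to sketch. The function names are hypothetical; the figures (14 items, 330 respondents, 5 to 10 respondents per item, a floor of 300) are those given above.

```python
def hair_sample_range(n_items: int) -> tuple:
    """Hair et al. rule of thumb: 5 to 10 respondents per item."""
    return 5 * n_items, 10 * n_items


def meets_both_criteria(n_respondents: int, n_items: int,
                        tabachnick_fidell_min: int = 300) -> bool:
    """True if the sample reaches the Hair et al. minimum and the
    Tabachnick and Fidell floor of 300 respondents."""
    hair_min, _ = hair_sample_range(n_items)
    return n_respondents >= hair_min and n_respondents >= tabachnick_fidell_min


if __name__ == "__main__":
    low, high = hair_sample_range(14)
    print(f"Hair et al. range for 14 items: {low} to {high}")  # 70 to 140
    print("330 respondents adequate:", meets_both_criteria(330, 14))  # True
```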
Before performing a factor analysis, the literature suggests evaluating sample size adequacy using the Kaiser-Meyer-Olkin (KMO) test of sampling adequacy and assessing whether the factor analysis should proceed using Bartlett’s test of sphericity (Schmidt & Hollensen, 2006). These two tests indicate the suitability of the data for structure detection. The KMO measure of sampling adequacy is a statistic that indicates the proportion of variance in the variables that might be caused by underlying factors. Kaiser gave the following standard for judging suitability for factor analysis: KMO > 0.9, quite suitable; 0.9 > KMO > 0.8, suitable; 0.8 > KMO > 0.7, generally suitable; 0.7 > KMO > 0.6, not quite suitable; KMO < 0.5, not suitable (Qi et al., 2013). Bartlett’s test of sphericity tests the hypothesis that the correlation matrix is an identity matrix, which would indicate that the variables are unrelated and therefore unsuitable for structure detection. A small significance level (less than 0.05) indicates that a factor analysis may be useful with the data. Table 1 shows that the KMO value was 0.934, which indicates sampling adequacy, i.e. the values in the matrix were sufficiently distributed to conduct factor analysis (George & Mallery, 2016). The approximate chi-square obtained from Bartlett’s test of sphericity was 8249.985, which was highly significant (p < 0.001), indicating that the correlation matrix is not an identity matrix and that the variables are sufficiently correlated for factor analysis (Pallant, 2013). The results of the KMO and Bartlett’s tests were therefore satisfactory (Table 1), and the selected variables are quite suitable for factor analysis.
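The KMO bands quoted above can be written as a small lookup, shown here as a sketch. Two points are assumptions: values that fall exactly on a boundary are assigned to the lower band, and the interval from 0.5 to 0.6, which the quoted standard does not cover, is reported as unclassified. The degrees of freedom of Bartlett's test, p(p − 1)/2 for p items, are included for reference; both function names are hypothetical.

```python
def classify_kmo(kmo: float) -> str:
    """Label a KMO value using the bands quoted in the text (Qi et al., 2013)."""
    if not 0.0 <= kmo <= 1.0:
        raise ValueError("KMO must lie between 0 and 1")
    if kmo > 0.9:
        return "quite suitable"
    if kmo > 0.8:
        return "suitable"
    if kmo > 0.7:
        return "generally suitable"
    if kmo > 0.6:
        return "not quite suitable"
    if kmo < 0.5:
        return "not suitable"
    # The quoted standard leaves 0.5 to 0.6 unspecified.
    return "not classified by the quoted bands"


def bartlett_degrees_of_freedom(n_items: int) -> int:
    """Degrees of freedom of Bartlett's test of sphericity: p(p - 1) / 2."""
    return n_items * (n_items - 1) // 2


if __name__ == "__main__":
    print(classify_kmo(0.934))              # quite suitable
    print(bartlett_degrees_of_freedom(14))  # 91
```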
In exploratory factor analysis, methods of Principal Component Analysis (PCA) and varimax rotation were employed because they maximize variance and facilitate the interpretation of the constructs deduced. In view of the arbitrary nature of factor extraction, and practicability and meaningful interpretability, the following three criteria were observed in data reduction: 1) the eigenvalue was greater than 1 and there were more than 3 items in one factor; 2) factor loadings lower than 0.4 were deleted and not counted in any factor; 3) when double loadings occurred, decisions were made on meaningful interpretations (Xu & Liu, 2018).
Based on the three criteria mentioned above, three common factors were extracted from the questionnaire. Table 2 shows that the cumulative contribution rate of the three extracted common factors is 93.481%, which is greater than 85%, i.e., the extraction of common factors is effective (Huang et al., 2020). The scree plot also flattened out after the first three factors. The original 14 items can thus be integrated into three common factors. According to the principle of factor analysis with an orthogonal (varimax) rotation, the three common factors are uncorrelated with each other, but each common factor is highly correlated with the original variables it contains.
The three common factors extracted were named according to the items included. Table 3 shows the correlation coefficient between common factors and their own contained original variables. As a result, it is suitable to use reliability of tests’ results (TR), Responsiveness of services (RS) and Laboratory Personnel’s (LP) willingness to help to represent the original variables and evaluate customer satisfaction with laboratory services.
The World Health Organization (WHO) indicates that evaluations of client satisfaction might address various aspects of the provided services: reliability and consistency of the services, the responsiveness of services, and the willingness of providers to meet client’s expectations and needs (WHO, 2000). According to Table 3 results, our constructs meet the WHO recommendations.
The validity or quality of the items that composed each factor was also analysed, based on the Comrey and Lee classification. Comrey and Lee classified items with loadings higher than or equal to 0.71 as excellent; higher than or equal to 0.63 as very good; higher than or equal to 0.55 as good; higher than or equal to 0.45 as reasonable; and higher than or equal to 0.32 as poor (Comrey & Lee, 2016). Thus, as to the items’ quality, 100% of them were classified as excellent.
A measurement instrument must be both reliable and valid for researchers to have confidence in the data collected with it. Reliability concerns the consistency of the results obtained, i.e. the extent to which the instrument yields the same results in repeated trials (Andrew et al., 2011). The most common test for a construct’s internal reliability is Cronbach’s alpha. More recently, however, composite reliability and Jöreskog’s Rhô have become more pertinent measures of construct reliability in research studies that use Structural Equation Modeling (SEM) and Confirmatory Factor Analysis (CFA) as part of their data analysis. There is a consensus in the literature that a score of 0.7 or higher is indicative of a construct’s reliability (Hair et al., 2010). In this study, Cronbach’s alpha, Jöreskog’s Rhô and Composite Reliability are all greater than 0.7, which confirms the good reliability of the constructs (Table 4).
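The Comrey and Lee bands listed above lend themselves to a short lookup, sketched below. Using the absolute value of the loading and labelling anything under 0.32 as "below the lowest band" are assumptions, and the function name is hypothetical.

```python
# Cut-offs as quoted in the text (Comrey & Lee, 2016), highest first.
COMREY_LEE_BANDS = [
    (0.71, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "reasonable"),
    (0.32, "poor"),
]


def classify_loading(loading: float) -> str:
    """Label a factor loading by the Comrey and Lee bands."""
    magnitude = abs(loading)  # assumption: the sign of the loading is ignored
    for cutoff, label in COMREY_LEE_BANDS:
        if magnitude >= cutoff:
            return label
    return "below the lowest band"


if __name__ == "__main__":
    # 0.82 is the minimum loading reported for the 14 items.
    print(classify_loading(0.82))  # excellent
```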
Validity, the extent to which an instrument accurately measures the target it was designed to measure, helps a researcher determine whether or not an instrument addresses its designed purpose (Malhotra, 2010). Testing of construct validity concentrates not only on finding out whether an item loads significantly on the factor it is measuring (convergent validity) but also on ensuring that it does not significantly load across or measure other factors (discriminant validity) (Usunier & Stolz, 2016). Our results confirmed the convergent validity and discriminant validity of the developed instrument (Table 4).
CFAs were performed to compare three different models: 1) a null model; 2) a one-factor model and 3) a two-factor model. To determine how well the specified factor model represented the data, goodness-of-fit indices were examined (Table 5). There are several indices for assessing model fit, and they are categorized into three groups, namely absolute fit indices, incremental fit indices and parsimony fit indices (Frikha, 2019). The two-factor model was chosen as the best-fitting model based on the cutoff criteria for good model fit recommended by Kline (Kline, 2015).
To assess the structural model, Hair et al. proposed a five-step assessment procedure: 1) assess the structural model for collinearity issues; 2) assess the significance and relevance of the structural model relationships; 3) assess the level of R² (coefficient of determination); 4) assess the effect size f²; 5) assess the level of the q² effect size (Hair et al., 2016). The results of the structural model analysis, shown in Table 6, meet the assessment criteria of the Partial Least Squares Structural Equation Modeling (PLS-SEM) analysis procedure. Thus, our three hypotheses were confirmed: the three latent variables have a positive influence on customer satisfaction.
6. Conclusion
6.1. Conclusion
This study has developed a new instrument for measuring customer satisfaction in a clinical laboratory. Data were analysed using exploratory factor analysis, confirmatory factor analysis and structural equation modelling (SEM). An instrument with a 3-factor structure shows strong potential for construct validity. The results confirmed our hypotheses, showing the three-dimensionality of customer satisfaction. We found that reliability of tests’ results, responsiveness of services and laboratory personnel’s willingness to help have a significant influence on customer satisfaction. The new customer satisfaction measurement scale showed good reliability as well as factor-based and construct validity.
6.2. Limitation of the Study and Suggestions for Future Research
This study’s limitations must be acknowledged. Its major inherent limitation is the generalizability of its outcome. Medical laboratories have a range of customers, including patients, physicians, public health agencies and the community. A central figure in the client list is the physician or health care provider: the initial request for service originates with this person, and the laboratory staff generally identify the ordering physician as the primary client (WHO, 2000). Because we surveyed ordering physicians only, we cannot confirm that the developed instrument is reliable or valid for patients. Additional research could develop another instrument for measuring customer satisfaction among patients and other customers who use the clinical laboratory. Finally, the test-retest reliability of the instrument should be evaluated. Measures of reliability include the stability of an instrument over time. Therefore, the stability of this new instrument, including short- and long-range stability, should be further investigated using the test-retest correlation method.
6.3. Study’s Implications
In spite of the above limitation, this is the first measurement instrument of its kind to have been fully validated. This study provides valuable practical and managerial implications for researchers and laboratory managers.
6.3.1. Managerial Implication
This study highlights the principal areas where managerial attention is required to improve customer satisfaction. It is important for clinical laboratory managers to consider customer satisfaction with laboratory services as a multi-dimensional construct in which reliability of tests’ results, responsiveness of services and laboratory personnel’s willingness to help are all important, because focusing on only one aspect of service quality is too narrow an approach. We encourage practitioners and researchers to use this instrument for various applications, particularly in customer satisfaction surveys.
6.3.2. Theoretical Implication
The present study has some theoretical implications as well. This paper, being the first study to attempt a comprehensive psychometric validation of an instrument that measures customer satisfaction with clinical laboratory services in the DRC, has contributed to filling a gap in the literature. Additionally, findings on the service quality dimensions that are of highest importance to customers are still subjective, and the current study contributes theoretically by adding to the knowledge in the field of marketing.
Acknowledgements
The authors would like to acknowledge everyone who contributed to the article but does not meet the criteria for authorship, including the physicians who agreed to participate in the survey.