Development and psychometric properties of the Malaysian elder abuse scale

Elder abuse is an emerging issue of serious concern with life-threatening consequences. This study aimed to develop a new scale to assess elder abuse and to evaluate its validity and reliability. A cross-sectional multistage sampling technique was used to obtain a nationally representative sample of older Malaysians. The iterative development process resulted in a 16-item, four-dimension scale. Exploratory factor analysis yielded a 10-item scale with three factors. The values of Cronbach’s alpha for the total scale and its subscales indicated sufficient internal consistency. Multitrait scaling analysis also showed good convergent and discriminant validity. Furthermore, predictive validity of the proposed scale was established by demonstrating a statistically significant association between elder abuse and depression through multiple logistic regression analysis. The findings from this study demonstrate an acceptable level of validity and reliability for the new scale. This scale can be used by health and social care workers to identify elder abuse cases.


INTRODUCTION
Elder abuse, which is significantly associated with almost 2-3 times higher odds of death even after controlling for other possible causes of mortality [1][2][3], is an emerging issue of serious concern with devastating effects and life-threatening consequences. As a serious social and health problem, like other types of interpersonal violence, it has largely been deemed a social taboo, kept behind closed doors and shielded from public scrutiny, and most societies would hide and deny rather than confront and systematically deal with this problem [4]. According to the World Health Organization (WHO), "although there is no systematic collection of statistics or prevalent studies in the developing world, crime records, journalistic reports, social welfare records and small scale studies contain evidence that abuse, neglect and financial exploitation of elders are much more common than societies admit" [5]. In Malaysia, information on elder abuse is sparse. However, the findings from previous studies show that risk factors that may make an older adult more vulnerable to abuse are increasing among older Malaysians [6][7][8][9][10].
A review of the elder abuse literature reveals that assessment of elder abuse has been hampered by a lack of well-validated and reliable scales [11]. Hence, one of the most important and immediate research needs is to develop a valid and reliable instrument. In their review of the existing instruments, Fulmer et al. [12] recommended that elder abuse instruments be improved to develop a better understanding of elder abuse and to identify persons requiring treatment and intervention. In addition, cultural values and expectations influence what conduct is considered elder abuse [13]. For example, in some cultures sending elderly individuals to nursing homes is considered a form of abuse, whereas other cultures regard it as a sign of caring [14]. Based on these considerations, the present study aimed to develop and validate a new scale for assessing elder abuse among community-dwelling elderly people in the cultural context of Malaysia.

Existing Instruments for the Assessment of Elder Abuse
A reliable and valid instrument is of considerable benefit for advancing social work research and evidence-based social work practice. Reliable and valid instruments enable practitioners to assess client target problems with greater precision. Therefore, it is imperative for social work researchers to engage in scale development and validation. Scientific investigations would be impossible without reliable and valid instruments [15]. The review of existing instruments shows a real need for developing more measures of elder abuse [16].

METHODS
Data for this study were obtained from the National Survey of "Perception, Awareness and Risk Factors of Elder Abuse", which was conducted throughout Peninsular Malaysia from December 2006 to May 2009. The sampling frame for this survey was obtained from the Department of Statistics, Malaysia. The survey used cross-sectional, multistage area probability sampling, with a response rate of 80%, to obtain a representative sample of the non-institutionalized adult population of Malaysia. Data collection was carried out in four geographical zones of Peninsular Malaysia. One state was randomly selected to represent each zone, specifically Perak (northern zone), Malacca (southern zone), Kelantan (east-coast zone) and Selangor (central zone). Each state comprised 42 enumeration blocks (EBs), for a total of 168 EBs. Each EB was limited to eight households, and each household contributed a single respondent. Households were selected at a sampling interval of 15, starting from Point A of each EB. Data were collected by trained enumerators during spring and summer 2008 through face-to-face interviews using questionnaires. The sample for this study consisted of 480 community-dwelling elderly people aged 60 years and older.

Item Generation and Evaluation Process
Potentially important items were initially identified from published questionnaires, a literature review, and elder abuse theories. The generated items were then translated using the forward-back translation approach with expert bilingual translators: the proposed items were first translated into the Malaysian language and then back-translated into English so that the translation in the native language could be evaluated. Item generation was followed by the evaluation of content validity. Content validity refers to the extent to which the instrument represents all facets of a given phenomenon or concept [17]. Content validity was assessed by an expert panel. Panel members reviewed the proposed items and rated their cultural relevance and appropriateness in terms of the construct being measured. In the next step, the items judged culturally relevant and applicable were reviewed with a convenience sample of elderly people to determine whether they perceived the behaviors described in the items as abuse.

Data Analysis
Before conducting analyses, the data were screened for missing data, multivariate outliers and other assumptions for multivariate analyses. Results of the evaluation of assumptions showed no threats to the assumptions of logistic regression and factor analysis. Descriptive statistics including means, standard deviations, ranges, and percentages were computed to describe sociodemographic characteristics of the population.
Validity: The validity of the scale was assessed in terms of construct validity, convergent and discriminant validity, criterion validity, and known-groups validity. Construct validity was assessed using exploratory factor analysis (principal component analysis with varimax rotation). Convergent and discriminant validity were evaluated using multitrait/multi-item analysis [18] to test the scaling assumptions underlying the physical abuse, psychological abuse, and financial abuse subscales. In this analysis, each item is examined with respect to how well it represents its own scale relative to all other scales. Item convergent validity was considered satisfied if an item correlated substantially (r ≥ 0.40, corrected for overlap) with its own hypothesized subscale. Item discriminant validity was considered satisfied if the correlation between an item and its hypothesized scale was significantly higher than the correlations between that item and the other scales [19][20]. Criterion validity was assessed through logistic regression. Finally, an independent t-test was performed to investigate known-groups validity.
Reliability: Internal consistency (homogeneity) assesses whether the items of a scale measure the same concept. The internal consistency reliability of the scale was measured using Cronbach's alpha coefficient. A value above 0.6 was considered acceptable [21].
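As an illustration of the internal consistency measure described above, the following is a minimal sketch of Cronbach's alpha computed from a respondents-by-items matrix. The function name `cronbach_alpha` and the small `demo` data set are hypothetical and are not taken from the survey; the sketch only shows the standard formula, not the software actually used in this study.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a 2-D array (rows = respondents, columns = items)."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]                                  # number of items in the scale
    sum_item_var = x.var(axis=0, ddof=1).sum()      # sum of the item variances
    total_var = x.sum(axis=1).var(ddof=1)           # variance of the total score
    return (k / (k - 1)) * (1.0 - sum_item_var / total_var)

# Hypothetical yes/no (1/0) answers from six respondents to four items.
demo = np.array([
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
])
alpha = cronbach_alpha(demo)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha approaches 1 when the items vary together and falls toward 0 when they are unrelated, which is why it is read as an index of homogeneity.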

RESULTS
The average age of the respondents included in the analysis was 68.98 years (SD = 7.71), with just over one-half of respondents being female (51%) and 67.1% being married. One third (34.4%) of the respondents reported having no formal education (Table 1).
In terms of ethnicity, the majority of the respondents (76.5%) identified themselves as Malay, approximately 10% as Chinese, 9% as Indian and the rest as other Bumiputra and others.

Content Validity
The 17 proposed items were evaluated for content and face validity by five multidisciplinary experts in the field of gerontology. The results showed high agreement on the cultural relevance and representativeness of the items proposed for assessing elder abuse. The item "Preparing meal for non-family members" was removed because participants did not consider it a type of abuse.

Factor Analysis
Construct validity is defined as the fit between the theoretical and the empirical structure. It was evaluated by exploratory factor analysis with orthogonal varimax rotation, restricting the solution to three factors. Before conducting exploratory factor analysis, the sexual abuse items were removed because they had no variance. The factorability of the Malaysian Elder Abuse Scale (MEAS) items was then investigated by checking the correlation matrix for coefficients greater than 0.3 and by computing the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. Examination of the correlation matrix revealed that many of the correlations were above 0.30. The KMO measure of sampling adequacy was 0.62, above the recommended value of 0.6, and Bartlett's test of sphericity was significant (χ²(45) = 1515.929, p ≤ 0.001). In addition, the anti-image correlation matrix, which contains the negatives of the partial correlations after partialling out all other variables, was used to assess the measure of sampling adequacy (MSA) of each item. The diagonal values of the anti-image correlation matrix were all well above the acceptable level of 0.5, supporting the inclusion of each item in the factor analysis.
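To make the sampling adequacy check more concrete, the sketch below computes the overall KMO statistic and the per-item MSA values from a correlation matrix, using the usual definition based on squared correlations and squared partial correlations. The function name `kmo` and the 3 x 3 matrix `demo_corr` are hypothetical illustrations, not the MEAS correlation matrix, and this is not necessarily how the statistical package used in the study computes the values.

```python
import numpy as np

def kmo(corr):
    """Overall KMO and per-item MSA from a correlation matrix."""
    r = np.asarray(corr, dtype=float)
    inv = np.linalg.inv(r)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)           # partial correlations given all other items
    off = ~np.eye(r.shape[0], dtype=bool)     # mask selecting off-diagonal entries only
    r2 = np.where(off, r ** 2, 0.0)
    p2 = np.where(off, partial ** 2, 0.0)
    per_item = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return overall, per_item

# Hypothetical correlation matrix for three items (all pairwise r = 0.5).
demo_corr = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])
overall, per_item = kmo(demo_corr)
print(round(overall, 2), np.round(per_item, 2))
```

Values near 1 indicate that the partial correlations are small relative to the raw correlations, so the items share enough common variance for factor analysis; the per-item values play the role of the anti-image diagonal discussed above.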
In the next step of assessing the assumptions for factor analysis, communalities were checked against the minimum criterion. Communalities represent the proportion of the variance in the original variables that is explained by the factor solution. The communality value for each item should be 0.50 or higher [22]. In our study, the initial communalities for all items were greater than 0.5, which meets the minimum criterion.
Since the main purpose was to identify the factors underlying the MEAS, principal components analysis was used. The initial eigenvalues showed that the first factor explained 21.8% of the variance, the second factor 20.8% of the variance, and the third factor 16.8% of the variance. Over several steps, a total of three items were eliminated because they did not contribute to a simple factor structure and failed to meet the minimum criteria of a primary factor loading of 0.4 or above and no cross-loading of 0.3 or above. These items were "Have you ever been forced to eat?", "Have you ever given all your pension to your children?" and "Have your expenditure ever been controlled by your carer?" There were no cross-loadings higher than 0.40. A principal components factor analysis of the remaining 10 items with varimax rotation was conducted, with the three factors explaining 59.4% of the variance, which is acceptable. According to Child [22], scales in the behavioral sciences usually extract factors that explain approximately 60% of the variance. The results of the exploratory factor analysis are presented in Table 2.
First Factor: Four items loaded on the first factor, which explained 21.8% of the variance (eigenvalue, that is, the amount of variance accounted for by the factor, = 2.18). This factor was labeled "physical abuse".
Second Factor: Four items also loaded on the second factor, which accounted for 20.8% of the total variance (eigenvalue = 2.08). This factor was labeled "psychological abuse".
Third Factor: Finally, the last factor, which was labeled "financial abuse", contained two items that represented 16.7% of the total variance (eigenvalue = 1.67).
It is important to know that verbal abuse can occur on its own or along with physical abuse. It is also important to know that verbal abuse often precedes physical abuse. Just like physical abuse, verbal abuse rarely goes away on its own. In fact, just like other forms of domestic abuse, verbal abuse usually gets worse over time. That means that some form of intervention is usually necessary.
In summary, exploratory factor analysis of the MEAS items provided evidence of construct validity and yielded three types of abuse, which were labeled physical, psychological, and financial abuse. The findings were consistent with the dimensions expected from the theory.
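The link between the reported eigenvalues and the percentages of variance can be checked with simple arithmetic: when the analysis is run on standardized items, the total variance equals the number of items (10 here), so each factor's share is its eigenvalue divided by 10. The sketch below is only an illustrative check using the rounded eigenvalues reported in the text; the function name `variance_explained` is hypothetical, and because the inputs are rounded, the sum can differ in the last digit from the reported total.

```python
def variance_explained(eigenvalues, n_items):
    """Percentage of total variance per factor for standardized items."""
    return {name: 100.0 * ev / n_items for name, ev in eigenvalues.items()}

# Rounded eigenvalues as reported for the three retained factors.
reported = {"physical": 2.18, "psychological": 2.08, "financial": 1.67}
pct = variance_explained(reported, n_items=10)
total = sum(pct.values())

for name, value in pct.items():
    print(f"{name}: {value:.1f}%")
print(f"total: {total:.1f}%")
```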
Reliability: Reliability has been defined as the extent to which a scale, observation or any measurement procedure produces the same results or similar scores with repeated testing with the same group of respondents [23]. In this study, internal consistency (average inter-item correlations) was measured using Cronbach's alpha. According to Chakrapani [24], a Cronbach's alpha value greater than 0.5 is considered acceptable and a value less than 0.5 is considered poor. As shown in Table 3, Cronbach's alpha coefficients were 0.62 for the total elder abuse scale, 0.53 for the physical subscale, 0.64 for the psychological subscale, and 0.80 for the financial subscale. In the present study, the values of Cronbach's alpha for the total scale and its subscales were all greater than 0.50, which represents sufficient internal consistency for the MEAS.

Convergent and Discriminant Validity:
A multitrait scaling analysis was used to assess item convergence and item discrimination across domains. Item convergence is supported if an item correlates substantially (≥0.30) with the domain total score that it is hypothesized to represent.
To prevent spurious inflation of the association between any given item and the total score, each convergence correlation was corrected for overlap. Item discrimination is supported if the correlation between a given item and the domain total score that it is hypothesized to represent is higher than its correlation with all other domain total scores of the measure. Item convergence and discrimination can help determine the scaling of items in a measure. Spearman rank-order correlation coefficients (rho) were used to estimate these relations. Item convergence is assessed by item-scale correlations; a correlation, adjusted for overlap, of 0.40 or greater is interpreted as support for item convergence [25]. Item discrimination is supported if an item correlates higher with the designated scale than with the other scales under study and if this correlation is significantly larger than with other scales in the multitrait multi-item matrix [26]. Following Hays et al. [27], an item was considered a "success" in the item discrimination analysis when the correlation between an item and its hypothesized scale was more than 2 SE higher than its correlation with other scales. It was considered a "probable" scaling error if its correlation with the hypothesized scale was within 2 SE of another scale and a "definite" scaling error if its correlation with the hypothesized scale was more than 2 SE below its correlation with another scale [26]. A multitrait scaling analysis was carried out to evaluate the hypothesised scale structure of the questionnaire. This technique, to test for item convergence and discriminative validity, is based on the examination of item-scale correlations. Pearson's correlations of an item with its own scale (corrected for overlap) and other scales were calculated. Evidence of item convergence validity was defined as a correlation above 0.40 with its own scale [25]. Item discriminant validity was supported by a comparison of the magnitude of the correlation of an item with its own scale compared with other scales. A definite scaling error was assumed if the correlation of an item with another scale exceeded the correlation with its own scale. Table 3 shows the results of tests of item convergent and discriminant validity. In sum, multitrait scaling analysis showed good convergent and discriminant validity for the MEAS.
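The item-scale correlations underlying a multitrait scaling analysis can be sketched as follows. For its own scale, each item is correlated with the sum of the remaining items of that scale (the correction for overlap); for every other scale, it is correlated with that scale's total. The function names `item_scale_correlations` and `scaling_check`, the two-scale layout, and the `demo` data are hypothetical; the sketch uses Pearson correlations and the simple "own scale higher than other scales" rule, and it omits the 2 SE significance comparison.

```python
import numpy as np

def item_scale_correlations(data, scales):
    """Return {item: {scale: r}}; the own-scale r is corrected for overlap."""
    x = np.asarray(data, dtype=float)
    result = {}
    for cols in scales.values():
        for item in cols:
            row = {}
            for name, members in scales.items():
                rest = [c for c in members if c != item]   # drop the item from its own scale
                total = x[:, rest].sum(axis=1)
                row[name] = float(np.corrcoef(x[:, item], total)[0, 1])
            result[item] = row
    return result

def scaling_check(corrs, own, threshold=0.40):
    """(convergent, discriminant) flags for one item's correlations."""
    convergent = corrs[own] >= threshold
    discriminant = all(corrs[own] > r for s, r in corrs.items() if s != own)
    return convergent, discriminant

# Hypothetical data: six respondents, items 0-1 form scale A, items 2-3 form scale B.
demo = np.array([
    [1, 2,  1,  1],
    [2, 1, -1,  1],
    [3, 4,  1, -1],
    [4, 3, -1, -1],
    [5, 6,  1,  1],
    [6, 5, -1,  1],
])
demo_scales = {"A": [0, 1], "B": [2, 3]}
corrs = item_scale_correlations(demo, demo_scales)
print(corrs[0], scaling_check(corrs[0], "A"))
```

An item passes when its corrected own-scale correlation reaches the threshold and exceeds its correlations with all other scales; an item that correlates more strongly with another scale would be flagged as a scaling error.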
Criterion Validity (Predictive Validity): Validity has been defined as the extent to which a scale actually measures what it is intended to measure [28]. Criterion validity is a form of validity that relies on comparison between the proposed measure and a measure previously developed to measure the variable of interest [29]. It can be defined as the ability of a test to predict an outcome (predictive validity) and to correlate with similar tools (concurrent validity).
Predictive validity of the MEAS was established using logistic regression analysis in which depression, as an outcome of elder abuse, was the dependent variable. A multiple logistic regression was conducted to examine criterion-related validity (predictive validity) via the relationship between elder abuse and depression after adjusting for potential confounders, including age, sex, marital status, education, and household income. Concurrent validity was not tested because no existing scale was available for comparison. Since elder abuse has been found to result in depression [30], predictive validity of the instrument was evaluated by logistic regression predicting depressive symptoms measured with the four-item Geriatric Depression Scale. It should be noted that predictive validity studies can be done using longitudinal and/or cross-sectional designs [31]. The method most frequently used in establishing evidence of such validity is to correlate scores on the predictor test with scores on the criterion variable. In cases in which the criterion variable is dichotomously scored, an appropriate model for predicting this criterion from the continuously scored predictor would be the logistic regression model [32].
There are several ways of evaluating predictive validity, such as correlations and multiple regressions [33]. Since depression was a dichotomous variable, a logistic regression analysis was used to test the ability of the MEAS to predict the likelihood of depression after controlling for the confounders. The Hosmer and Lemeshow goodness-of-fit test, with a p-value larger than 0.05 (χ²(8) = 11.95, p = 0.153), indicated an adequate model fit. The findings from the multiple logistic regression analysis revealed an overall significant model (model χ²(8) = 18.51, p ≤ 0.05) in which elder abuse significantly predicted depression (adjusted OR = 2.13; 95% CI: 1.24-3.68, p ≤ 0.01) after adjusting for sociodemographic characteristics including age, sex, household income, marital status, education, and ethnicity. Table 4 shows the results of predictive validity of elder abuse using the logistic regression model.
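For readers less familiar with how an adjusted odds ratio and its confidence interval relate to a logistic regression coefficient, the sketch below applies the standard relations OR = exp(β) and 95% CI = exp(β ± 1.96 × SE). The coefficient and standard error are back-calculated here from the reported OR and interval purely for illustration; they are assumptions, not values reported by the study, and the function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Illustrative back-calculation from the reported OR = 2.13 (95% CI: 1.24-3.68).
beta = math.log(2.13)                                      # implied coefficient
se = (math.log(3.68) - math.log(1.24)) / (2 * 1.96)        # implied standard error
or_hat, lower, upper = odds_ratio_ci(beta, se)
print(f"OR = {or_hat:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

Because the published figures are rounded, the recovered interval may differ slightly from the reported one. An interval that excludes 1 corresponds to a statistically significant association at the 5% level.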
Known-groups validity: Known-groups validity is another important form of construct validation, which involves determining the extent to which an instrument can demonstrate different scores for groups known to vary on the variables being measured [34]. In accordance with previous studies stating that older women are more vulnerable to abuse than older men [35] and that women are classically believed to be the most common victims of abuse [36], known-groups validity was assessed by examining the association between elder abuse and gender. The findings showed that the prevalence of elder abuse was higher among older women than among older men.

DISCUSSION
Since elder abuse is a highly sensitive issue that needs a linguistically and culturally specific tool for detection [37], the present study was an attempt to develop the MEAS and assess its reliability and validity. In this article, we described the development, factor analysis, and other psychometric properties of the MEAS to measure elder abuse among older Malaysians. The MEAS demonstrated the presence of three distinct factors named "physical abuse", "psychological abuse", and "financial abuse". The MEAS was found to be reliable, with excellent internal consistency reliability. There was also support for both convergent and discriminant validity. Evidence of predictive validity was supported by the association between elder abuse and depression after controlling for other possible variables. The current study, which involved a large and representative sample of older adults and used standardized methodology, is one of the first attempts to develop a reliable and culturally relevant tool to assess elder abuse in older Malaysians.
Although the results provide evidence in support of the psychometric properties of the MEAS, potential limitations of the study should be acknowledged. One limitation of the study is the lack of a gold standard measure of elder abuse against which to test the sensitivity and specificity of the MEAS. Examination of test-retest reliability is needed to test the stability of the scale. In addition, further study is needed to assess concurrent validity of the MEAS. It is crucial to remember that the lack of reports of sexual abuse in this study does not imply that elder sexual abuse does not exist in this population. Measuring elder sexual abuse is difficult because older victims may feel terrified, ashamed, or embarrassed, or may blame themselves.

CONCLUSION
The final proposed scale is presented in the Appendix. In sum, the results of this study provide preliminary evidence of the validity and reliability of the MEAS as a screening tool that may be used for assessment of elder abuse in community and health care settings.

Table 1. Demographic characteristics of the respondents.

Table 2. Results of the exploratory factor analysis.

Table 3. Results of convergent and discriminant validity and internal consistency.

Table 4. Logistic regression model on the predictive validity of elder abuse. Adjusted for age, sex, household income, marital status, education, and ethnicity.