Development and Validation of a Questionnaire to Measure Health Professionals’ Attitudes toward Identification of Female Victims of Domestic Violence

Background: Domestic violence against women is a major public health problem and a violation of women’s human rights. Health professionals could play an important role in screening for victims. From the evidence to date, it is unclear whether health professionals do play an active role in identifying victims. Objectives: To develop a reliable and valid instrument to measure health professionals’ attitudes toward identifying female victims of domestic violence. Methods: A primary questionnaire was constructed in accordance with established guidelines using the Theory of Planned Behaviour (Ajzen, 1975) to develop an instrument to measure health professionals’ attitudes. Test-retest correlations confirmed that the measures were reliable in the sense of temporal stability. Significance: This tool has the potential to be used by researchers in expanding the knowledge base in this important area.


Introduction
Domestic violence (DV) against women is a major public health problem and a violation of women's human rights [1]. Results of a multi-country study conducted by WHO show that between 15% and 71% of women reported exposure to physical and/or sexual violence by an intimate partner at some point in their lives [2]. Identification of DV victims is an extremely important task for nurses and doctors working in emergency and out-patient units. The prevalence of DV is likely to be higher among emergency department patients than in the general population [3]. Results of a systematic review showed that nearly 12% of female patients attending American emergency units suffered from DV-related injury or stress [4]. Although 43% to 85% of women agreed that screening for DV by health professionals was acceptable [5], 75% of physicians and nearly 50% of emergency department nurses were not in favour of doing this task [5]. Boyle, et al. [4] reported that many doctors may feel uncomfortable asking questions about abusive relationships; 50.5% rarely or never screen their female patients for domestic violence [6]. Love, et al. [7] found that nearly 90 percent of dentists never screened for domestic violence; 18 percent never screened even when patients had visible signs of trauma on their heads or necks.
Although several authors have explored the attitudes of different populations of health professionals toward different aspects of responding to victims of DV [8]-[10], transcultural adaptation of a measuring scale is not always simple, nor is an existing scale always suitable for the goals and design of a given research project [11]. In 1995, a questionnaire was developed which included a section on the attitudes of obstetrician-gynaecologists toward screening patients for domestic violence [12]. However, the items were limited to the feelings of obstetrician-gynaecologists about barriers affecting their screening. The only psychometric testing reported for the instrument was of understanding and completion time. Reid, Glasser [13] and Cann, et al. [14] developed questionnaires which included measurement of attitudes toward DV as part of their surveys. For both of these questionnaires, only face validity was established. Later, a scale with the most rigorous psychometric testing, namely the ATSI, was developed [15]. This scale was used to assess attitudes toward victims of DV among health workers. The scale demonstrated good internal reliability and was supported by an intensive review of the literature and discussion with experts in the field. However, a limitation of this scale was that it was not based on a theoretical model. A theory provides a bridge connecting the findings of one study to those of another, allows researchers to compare findings across studies to identify "active ingredients", and can help to identify when findings from one population are likely to generalize to another population [16].
An intervention is likely to have a small effect size if it is not based on what individuals or populations think, feel or believe [11]. Basing an intervention on a theoretical model can make its strategies more feasible and effective. In our literature review, we could not find an ideal questionnaire to measure the attitudes of health professionals toward identification of DV. Such a questionnaire would help to develop an intervention program that aims to change the attitudes of nurses and doctors toward identification of female victims. The aim of this study was to develop and validate a theory-based questionnaire to measure attitudes toward identification of DV among nurses and doctors working in emergency and out-patient units in Vietnamese hospitals.

Development of the Questionnaire
This questionnaire was constructed on the basis of the Theory of Planned Behaviour (TPB) (Figure 1).
In the TPB model, attitudes, subjective norms and Perceived Behavioural Control are constructed from the individual's salient beliefs [17]-[19]. Attitudes are composed of the combination of Behavioural Belief Strength (beliefs regarding the behavioural outcome) and Outcome Evaluation (evaluation of the advantages and disadvantages of the outcome of the behaviour). Subjective norms are believed to result from the association of Normative Belief Strength (beliefs regarding whether important people will approve or disapprove of the behaviour) and Motivation to Comply (the motivation to comply with important people). Perceived Behavioural Control reflects the combination of Control Belief Strength (beliefs about what enables or prevents performance of the behaviour) and Perceived Power (perception of the power of these factors to limit or enhance performance of the behaviour).
Within the scope of this paper, we report only the first component of the TPB, which is attitudes. The model establishes that attitude precedes behaviour. Health professionals' attitudes toward identification of victims of DV will depend on their beliefs about the advantages and disadvantages associated with this identification. Changes in attitude should produce changes in behavioural intentions and, given adequate control over the behaviour, the new intentions should be carried out under appropriate circumstances [17]. The questionnaire consists of two measures: direct and indirect measures of attitudes. A direct measure asks respondents about their overall attitude, while an indirect measure asks about specific behavioural beliefs and outcome evaluations relating to performing the behaviour [19]-[21].
The direct measure comprises two components: experiential attitude (affect) and instrumental attitude. Experiential attitude is defined as the overall affective evaluation of the behaviour [16]. It is measured by asking to what extent health professionals feel that identification of victims of DV is a comfortable and enjoyable task. Instrumental attitude is defined as the overall evaluation of the behaviour [16]. It is measured by asking health professionals to what extent they believe identification of abused women is beneficial and valuable.
The indirect measure consists of two components: behavioural belief and outcome evaluation. Behavioural belief is defined as the belief that behavioural performance is associated with certain attributes or outcomes; outcome evaluation is the value attached to a behavioural outcome or attribute [16]. An elicitation study was conducted to find out health professionals' beliefs associated with questioning and screening for victims of DV [22]. Those beliefs were then used to form the items of the indirect measure. For instance, the elicitation study found that one outcome of "questioning and screening for DV" may be that this "will provide the woman with a sympathetic ear". A person's behavioural belief about this outcome is measured by having the respondent rate how much he or she agrees that "identification of DV will provide the woman with a sympathetic ear". The person's evaluation of this outcome is measured by having the respondent rate the degree to which he or she agrees that "a sympathetic ear" is desirable. An "indirect measure" of the person's attitude toward performing the behaviour is computed by first multiplying the behavioural belief rating for each outcome by the corresponding outcome evaluation rating and then summing these product scores across all outcomes of the behaviour.
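The computation described above can be written compactly. The notation is introduced here only for illustration and is an assumption, not part of the questionnaire: b_i is the behavioural belief rating for outcome i, e_i is the corresponding outcome evaluation rating, and n is the number of outcomes.

```latex
A_{\text{indirect}} = \sum_{i=1}^{n} b_i \, e_i
```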
A 7-option response format (Table 1) is often recommended in the TPB literature [19]. Accordingly, the Likert scale used in this questionnaire has 7 points. Response scales are unipolar (1 to 7) or bipolar (−3 to +3) depending on whether the concept to be measured is uni-directional (e.g. probability) or bi-directional (e.g. evaluation) [19]. On this basis, unipolar scales were used for the direct measures and for the behavioural belief component of the indirect measure, while bipolar scales were used for the evaluation component of the indirect measure. However, to reduce participant burden during data collection, the questionnaires were presented with unipolar scales (1 to 7) only. The data were then transformed to the bipolar scale (−3 to +3) where required during data analysis.
On that basis, the sum of the scores of all items of the direct measure gives the overall score for the direct measure of attitudes. For the indirect measure, the calculation is somewhat more involved. For example, a person may strongly believe that "identification of DV" results in "providing a sympathetic ear" for the abused woman (belief scored as 7), and may evaluate "a sympathetic ear" as very desirable (evaluation scored as +3), resulting in a belief-evaluation product score of +21. Thus, a strong belief that performing the behaviour will result in a positively valued outcome contributes positively to the person's attitude. Conversely, a strong belief (belief scored as 7) that the behaviour will result in a negatively valued outcome (evaluation scored as −3) contributes negatively (product = −21) to the person's attitude (Figure 2).
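The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study (which was run in SPSS). The function names are hypothetical, and the recoding from the 1 to 7 scale to the −3 to +3 scale by subtracting 4 is an assumption, since the text does not state the exact transformation.

```python
def to_bipolar(score):
    """Recode a unipolar 1..7 response to the bipolar -3..+3 scale.

    Assumption: a simple linear shift (score - 4); the paper does not
    spell out the transformation it used.
    """
    if not 1 <= score <= 7:
        raise ValueError("response must be between 1 and 7")
    return score - 4


def indirect_attitude(beliefs, evaluations):
    """Sum of belief x evaluation products across all outcomes.

    beliefs:     unipolar 1..7 ratings of behavioural belief strength
    evaluations: unipolar 1..7 ratings as collected, recoded here to -3..+3
    """
    if len(beliefs) != len(evaluations):
        raise ValueError("each belief needs a matching outcome evaluation")
    if any(not 1 <= b <= 7 for b in beliefs):
        raise ValueError("belief ratings must be between 1 and 7")
    return sum(b * to_bipolar(e) for b, e in zip(beliefs, evaluations))


# Worked example from the text: a strong belief (7) about a very
# desirable outcome (7, recoded to +3) and about a very undesirable
# outcome (1, recoded to -3).
print(indirect_attitude([7], [7]))  # 21
print(indirect_attitude([7], [1]))  # -21
```

With several outcomes, the products simply add up, so beliefs about negatively valued outcomes pull the overall score down while beliefs about positively valued outcomes raise it.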

Validation Procedure
The validation procedure is shown in Figure 3. The first step of the validation process was to establish content validity, which refers to the extent to which a measure adequately samples the content domain [23]. The instrument was sent for review to local and international experts in the Theory of Planned Behaviour and in domestic violence. Panel members were given open-ended questions regarding the applicability of the content and the clarity of the phrasing. The panel was also asked to comment on each item and to give any further suggestions for improving the questionnaire.
On the basis of this review, the questionnaire was modified before face validity was assessed. Face validity refers to the extent to which the purpose of the test can be detected from the item content [23]. To assess it, a group discussion was held with five respondents from the target population (3 nurses and 2 doctors) about the formatting and wording of the questionnaire, to confirm that it was understandable and answerable [19]. The questions included:
- Is there any item that is difficult to answer?
- Do you feel some items are repetitive?
- Is the questionnaire too long?
- Is the questionnaire too superficial?
- Is there any annoying word or sentence?
Rewording and reformatting were then carried out according to the feedback. In the last step, a pilot study with 30 nurses and doctors from the target population was performed to confirm the construct validity, internal consistency reliability and temporal stability of the questionnaire.
Figure 3. Validation procedure (Step 1: initial questionnaire developed in English; Step 2: content validity, expert panel's review in English; Step 3: face validity, discussion group in Vietnamese with 5 nurses and doctors; Step 5: final questionnaire in Vietnamese; revisions between steps: revision and translation into Vietnamese*, revision in Vietnamese).

The reliability and validity of the direct TPB measures were estimated in formative research. First, a TPB questionnaire was constructed in accordance with established guidelines. Each item is, by itself, designed to be a direct measure of the theoretical construct, and the different items used to assess the same construct should correlate with each other and exhibit high internal consistency. Cronbach's alpha is the most commonly used coefficient [19] [20]. While different levels of reliability are required depending on the nature and purpose of the scale, a minimum level of 0.7 is recommended [24]. For theoretical reasons, this requirement is not imposed on the belief composites that are assumed to determine attitudes [16] [21]. Accessible behavioural beliefs are assumed to account for attitudes; however, no assumption is made that salient beliefs are internally consistent [21]. People's attitudes toward a behaviour can be ambivalent if they believe that the behaviour is likely to produce positive as well as negative outcomes. Consequently, internal consistency is not a necessary feature of belief composites [21].

However, a series of simple bivariate correlations between direct and indirect measures of the same construct was calculated to confirm the construct validity of the indirect measures. High correlations would likely reflect indirect measures that were well constructed and adequately covered the breadth of the measured construct [19].

The 30 participants were asked to complete the questionnaire a second time after 1 week to assess the temporal stability of the questionnaire (test-retest reliability). Temporal stability is an important characteristic in prospective studies that attempt to predict behaviour at a later point in time. If measures of the theory's constructs lack temporal stability, they cannot be expected to predict later behaviour. As items were examined individually, each item was treated as a categorical/ordinal variable. Therefore, the Kappa statistic, which measures the agreement between replicate measurements taken at different points in time, was used to assess the temporal stability of each item [25]. Based on the criteria originally proposed by Landis and Koch [26]:
- Kappa values greater than about 0.75 are often taken as representing excellent agreement;
- Those between 0.4 and 0.75 as fair to good agreement; and
- Those less than 0.4 as moderate or poor agreement.
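The item-level test-retest check can be sketched as follows. This is an illustrative Python version, not the SPSS procedure used in the study. It assumes the unweighted form of Cohen's kappa (the text does not say whether a weighted kappa was used), the function names are hypothetical, the example responses are made up, and the band labels are those listed in the criteria above.

```python
from collections import Counter


def cohen_kappa(test, retest):
    """Unweighted Cohen's kappa between two administrations of one item."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two equally long, non-empty response lists")
    n = len(test)
    # Observed agreement: share of respondents giving the same answer twice.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Chance agreement from the marginal distributions of the two rounds.
    count_test, count_retest = Counter(test), Counter(retest)
    expected = sum(count_test[c] * count_retest[c] for c in count_test) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Map a kappa value to the bands listed in the text."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.4:
        return "fair to good"
    return "moderate or poor"


# Made-up responses of four participants to one item, one week apart.
k = cohen_kappa([1, 1, 2, 2], [1, 2, 2, 2])
print(k, agreement_band(k))  # 0.5 fair to good
```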
Prior to data collection, the pilot study was reviewed and approved by the QUT's Ethics Committee (No: 1100001317) and the Local Ethics Committee, HSPH (No: 039/2011/TYCC-HD3). No private information from the pilot was to be used for other purposes. Participation was voluntary, and participants could leave the study whenever they liked without facing any legal responsibility. All information about the participants was to be kept confidential. The pilot was also approved by the managers of the two hospitals.
Data analysis: The pilot data were entered and analysed using SPSS software version 18.
Back-translation procedure: The questionnaire was originally developed in English; it was then translated into Vietnamese by a bilingual Vietnamese researcher. Afterward, it was translated back into English by another bilingual Vietnamese person with experience in social science. Finally, the questionnaire was reviewed by a native English speaker to confirm its equivalence with the original.

Direct Measures
Overall, the expert panel agreed that the questionnaire contains appropriate items to measure the TPB-based attitudes of health professionals toward identification of victims of DV in the Vietnamese context. Some minor amendments to the choice of terms were made to ensure that items measuring the same construct are clear and distinct from each other, especially after translation into the second language. An important comment from the expert panel was that the items seemed quite general; each component of behavioural evaluation should be specified for either health care staff or the abused women. In particular, the first component, which is instrumental in nature, should evaluate whether the behaviour of health staff will benefit and be valuable for the women. The second component should concern how enjoyable and comfortable the health professionals themselves feel in performing the behaviour. For instance, the item "Identification of the abused woman by screening and questioning is beneficial" in the initial questionnaire should be made more specific as "Identification of the abused woman by screening and questioning will benefit the woman". The discussion among 5 nurses and doctors from the target population confirmed that the direct measures used suitable language and appropriate content to address the aims of the questionnaire.
The two components measuring health professionals' attitudes toward the behaviour were each assessed for internal consistency. In each component, items showed high correlations with each other (Cronbach's alpha values were both greater than 0.7), confirming that the items assessed the same underlying construct. The direct measures of attitudes also showed high (Kappa coefficient between 0.61 and 0.80, p < 0.001) or very high (Kappa coefficient > 0.80, p < 0.001) levels of agreement across all items in test-retest reliability (Table 2).
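For readers who want to reproduce the internal consistency check outside SPSS, the following is a minimal Python sketch of Cronbach's alpha for one component. The function name and the example ratings are hypothetical; the formula is the standard one, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    items: one list of respondent scores per item, all in the same
    respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("every item needs the same respondents (at least two)")
    # Total score per respondent across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / total_var)


# Made-up ratings from four respondents on a two-item component.
enjoyable = [1, 2, 3, 4]
comfortable = [2, 1, 4, 3]
print(round(cronbach_alpha([enjoyable, comfortable]), 2))  # 0.75
```

A value at or above 0.7 would meet the minimum level cited in the validation procedure.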

Table 2. Internal consistency and level of agreement of direct measure of attitudes.

Items | Kappa coefficient | Cronbach's alpha
Identification of the abused woman by screening and questioning: | |
will benefit the women | 0.84 *** | 0.77
will be valuable for the women | 0.94 *** |
is a task that health professionals feel enjoyable to do | 0.84 *** | 0.91
is a task that health professionals feel comfortable to do | 0.70 *** |
*** p < 0.001.

Indirect Measures
With regard to the content of the outcome evaluation items, one amendment was made in this questionnaire compared with the standard TPB questionnaire. In the standard TPB questionnaire, these items are usually designed to measure a person's general evaluation of an outcome. That is, the person simply rates a particular outcome on a scale such as a bad-good scale or an undesired-desired scale. For example, in the initial questionnaire, participants were asked to rate their level of agreement with the item "Providing the abused woman a sympathetic ear is desirable". However, the review panel commented that the item should specify by whom the outcome is desired. An outcome desired by the woman and an outcome desired by the health professionals themselves would contribute differently to the attitudes of the health professionals toward performing the behaviour. In addition to the bad-good and undesired-desired scales recommended in the standard TPB questionnaire, more flexible scales, such as an unnecessary-necessary scale and an unimportant-important scale, were also recommended for the outcome evaluation items. The panel's opinion was that, considering the context of this questionnaire, the item "Meddling with the abused woman's private world is necessary", for example, would make more sense than the item "Meddling with the abused woman's private world is desirable". In reality, we never desire to destroy the private world of the abused woman, but we may believe that this intrusion is necessary to help the woman escape from DV. Applying the unnecessary-necessary scale could therefore increase the variation in participants' responses. In the discussion group, nurses and doctors also agreed that they felt much more confident and comfortable about the accuracy of their responses with the revised outcome evaluation items. There was a good correlation between the overall scores of the direct measures and the overall scores of the indirect measures of attitudes (Pearson correlation = 0.69, p < 0.001). This confirmed that the indirect measures were constructed in an appropriate manner.
Test-retest analysis also confirmed the stability of the indirect measure of attitudes; kappa coefficients were all greater than 0.7, p < 0.001 (Table 3).
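The construct validity check reported above rests on a Pearson correlation between each participant's overall direct score and overall indirect score. A minimal Python sketch is given below. The function name and the example scores are hypothetical and are not the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long lists with at least two values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up overall scores for four participants: direct measure (sum of
# item scores) and indirect measure (sum of belief x evaluation products).
direct_scores = [12, 18, 20, 26]
indirect_scores = [-5, 10, 8, 30]
print(round(pearson_r(direct_scores, indirect_scores), 2))
```

A high positive value would suggest that the belief-based items cover the same construct as the direct attitude items.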

Discussion
The purpose of this study was to develop a scale for measuring health professionals' attitudes toward identification of female victims of DV and to test its psychometric properties. The scale was developed within the Theory of Planned Behaviour framework, in which direct and indirect approaches make different assumptions about underlying cognitive structures and neither approach is perfect [16] [18] [27]. In this research, the direct and indirect measures were positively correlated, confirming that these measures tapped the same construct and that both should be included in a TPB-based questionnaire [19].
Overall, the internal consistency of the direct measure was acceptable, although Cronbach's alpha for the experiential attitude was not especially high (0.77). This result may be due to the fact that only 2 items were included in this dimension [24]. In addition, the small sample size, high homogeneity of the participants, and small variability of the scores might have decreased the alpha coefficients [28].
A strength of this questionnaire is the use of indirect questions. That is, the term "nurse and doctor" was used as the subject of all item sentences instead of "you". Nurses and doctors who participated in the group discussion also commented that, when asked these indirect questions, their responses reflect their beliefs about what their colleagues might do and are therefore more accurate and more reliable than when they are asked to report directly on what they themselves do. Recent evidence indicates that indirect questions can be used to reduce social desirability bias for sensitive questions. Indirect questioning allows respondents to project their attitudes into the response situation by asking them to report on the "nature of the external world" rather than on themselves [29]. Respondents then feel that they are giving information about situations based on fact rather than opinion, and so they respond behind "a facade of impersonality" [30]. Fisher and Tellis [31] reported that indirect questions are more strongly correlated with the estimated true score than direct questions that were corrected for social desirability bias. Indirect questioning provides a better estimate of the true scores of socially sensitive variables.

Table 3. Level of agreement of indirect measure of attitudes.

Items | Kappa coefficient
If a woman is identified to be abused, this identification will support nurses/doctors in providing care for her | 0.77 ***
Better conditions in providing care for the abused woman play an important role in her recovery | 0.85 ***
If nurses/doctors identify a woman to be abused, nurses/doctors will provide the woman with a sympathetic ear | 0.80 ***
A sympathetic ear is desired by the abused women | 0.81 ***
Nurses/doctors will have the feeling of doing something positive when they are screening for an abused woman | 0.77 ***
Doing something positive for others is a feeling that people always aim to | 0.73 ***
If nurses/doctors identify an abused woman, there will be negative reactions from her husband | 0.84 ***
Negative reactions from the abused woman's husband is not a concern of nurses/doctors | 0.72 ***
If nurses/doctors identify an abused woman, there will be negative reactions from her husband's family | 0.74 ***
Negative reactions from the woman's husband's family is not a concern of nurses/doctors | 0.84 ***
If nurses/doctors identify an abused woman, nurses/doctors may be meddling too much with the woman's private issues | 0.73 ***
Meddling with the abused woman's private world is necessary | 0.79 ***
*** p < 0.001.

Conclusion
To sum up, valid and reliable instruments are always necessary for developing and evaluating health programs based on behaviour change theories from the social and behavioural sciences [32]. A strategic focus on altering the personal beliefs underlying the target behaviour could lead to the development of effective intervention programs [27]. Health authorities could employ this questionnaire in their surveys to select the beliefs that most influence the attitudes of nurses and doctors toward identification of victims of DV. An intervention program that targets these beliefs will lead to changes in the intentions and behaviour of health professionals. There are some limitations associated with this study. The sample size of the pilot test was relatively small, which may affect the power of the statistical procedures. The test-retest interval was only 1 week, which may not be long enough to confirm the stability of the questionnaire.