The EASI: Factor Structure and Measurement and Structural Invariance between the Parent’s Gender, the Child’s Age, and Two Measurement Time Points

Background: The EASI (measuring Emotionality, Activity, Sociability, and Impulsivity) is a widely used instrument to measure children’s temperament. However, its factor structure and its measurement and structural invariance have rarely been studied. The purpose of our study is to report the factor structure of the EASI in a larger population in Japan, searching for a model that fits the data sufficiently, and to report the final model’s measurement and structural invariance. Methods: A net-survey collected data from 531 mothers and 369 fathers of 3- to 4-year-old Japanese children. A test-retest survey was conducted with 173 mothers and 127 fathers from the first group of participants. Results: The original 4-factor structure (excluding 6 items) with a general factor (influencing E and I) showed an acceptable fit. This model also satisfied measurement and structural invariance between fathers and mothers, boys and girls, 3- and 4-year-olds, and times 1 and 2. Differences emerged between mothers and fathers in terms of the means of some factor scores. Conclusion: We recommend the use of 4 subscale scores as well as a composite score of E and I.


Introduction
A long history of personality studies has always cast a light on the individual differences of infants and toddlers. These individual differences are referred to as temperament. Although the definition and therefore measurement of temperament vary from one researcher to another (Goldsmith et al., 1967), Buss and Plomin's (1975) four-temperament theory is one of the most widely acknowledged temperament theories. They developed a theory of personality following Allport's notion that defines temperament as "the characteristic phenomena of an individual's nature, including his susceptibility to emotional stimulation, his customary strength and speed of response, the quality of his prevailing mood, and all peculiarities of fluctuation and intensity of mood, these being phenomena regarded as dependent on constitutional make-up, and therefore largely hereditary in origin" (Allport, 1961, cited by Buss & Plomin, 1975). Buss and Plomin's initial criteria for distinguishing temperament from other personality traits included inheritance, stability during childhood, retention into maturity, adaptive value, and presence in our animal forebears (Buss & Plomin, 1975). Later, however, they shifted emphasis to two crucial criteria: inheritance and presence in early childhood (Buss & Plomin, 1984). In their view, temperament is concerned more with style than with content. It is more about expressive behaviour than about instrumental (coping) behaviour, and more about what a person brings to a role or situation than what either of these demands of him (Buss & Plomin, 1975).
These considerations led them to propose emotionality, activity, sociability, and impulsivity as children's basic individuality. Emotionality refers to intensity of reaction. Children have an excess of emotion. Emotional expression is exaggerated. They have mood swings and are short tempered. Activity refers to total energy output. Children keep moving and are tireless. Their behaviours are vigorous. Sociability refers to children's desire to be with others. They are responsive to others and rewarded by interaction with others. Impulsivity refers to quick response. Children are less likely to be inhibited. They are likely to give in to their urges. As a measurement of the four-temperament pattern, Buss and Plomin (1975) developed the EASI Survey, a questionnaire completed by parents. This questionnaire includes 20 items with five items for each of the four temperament domains: Emotionality, Activity, Sociability, and Impulsivity (hence its acronym). This survey has been used for genetic/twin studies of temperament (Plomin et al., 1993).
Despite its seminal contribution to developmental studies, the EASI's factor structure and measurement and structural invariance have been studied infrequently. Buss and colleagues (Buss et al., 1973; Buss & Plomin, 1975) conducted factor analyses and scale correlations of the EASI using 139 pairs of same-sex twins as rated by their mothers. This study revealed that at least three of the five items assigned to each theory-driven subscale loaded highest on the appropriate factor. Some subscales were significantly correlated with each other. Thus, among both boys and girls, Activity and Impulsivity were correlated, and Emotionality was moderately correlated with Impulsivity. A similar factor solution was obtained by Gibbs, Reeves, and Cunningham (1987) in a study of 105 mothers of British children aged 1 to 5 years. Using a Norwegian population of children aged 18 to 50 months, Mathiesen and Tambs (1999) reported a 3-factor structure using the EAS, a modified version of the EASI. A study on the EAS factor structure among school children was done by Boer and Westenberg (1994).
These investigations, however, only used exploratory factor analysis (EFA), and the goodness of fit with the data was not checked by confirmatory factor analysis (CFA). A CFA of the EAS was conducted by Gesman et al. (2002) in a population of school children. However, they conducted CFAs for EAS data obtained from children, parents, and teachers without performing EFAs beforehand. Unfortunately, the fit of their 4- and 3-factor models was below the standard level. A similar report came from Spence, Owens, and Goodyer (2013) using an adolescent population. Again, their study did not perform an EFA before the CFA. Their final model's fit was barely acceptable: comparative fit index (CFI) = .953 and root mean square error of approximation (RMSEA) = .071. Kitamura et al. (2014) performed an EFA of the EASI items in a randomly halved population of Japanese fathers (n = 237) and mothers (n = 412) of children under four years of age. The factor structure they obtained was cross-validated by a CFA. Their EFA yielded a two-factor structure. Nevertheless, a four-factor structure according to the original report (excluding items with low factor loadings) showed a better fit with the data. However, this model's fit with the data was not very good: χ²/df = 1.97, CFI = .925, and RMSEA = .055. The above studies all indicate that first-order CFA models do not necessarily explain the data sufficiently.
The number of factors in EFAs can range arbitrarily between 1 and the number of items. Here, single-factor and first-order factor models reflect different ideas. A single-factor model emphasises general abilities (e.g., emotional vulnerability), whereas a first-order factor model emphasises several specific abilities (e.g., E, A, S, and I). Neither model can address both general and specific abilities simultaneously. However, many psychological measurements cannot be explained solely by either a single-factor model or a first-order factor model. This motivates bifactor models, which take both the general and the specific abilities into consideration in one model (Brunner, Nagy, & Wilhelm, 2012). The general factor influences all the indicators directly rather than through the first-order factors. All the indicators depend on the general factor, while each group of indicators additionally depends only on the first-order factor to which it belongs. Here, the general and first-order factors are not correlated with each other.
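The bifactor structure described above can be written out as a measurement equation. The following is a generic sketch rather than the exact specification estimated in this study; the symbols (x_j for indicator j, G for the general factor, F_k for the first-order factor to which indicator j belongs, and the loadings and intercept) are notation assumed here for illustration.

```latex
\begin{align*}
x_j &= \tau_j + \lambda_{jG}\, G + \lambda_{jk}\, F_{k} + \varepsilon_j, \\
\operatorname{Cov}(G, F_k) &= 0, \qquad
\operatorname{Cov}(G, \varepsilon_j) = \operatorname{Cov}(F_k, \varepsilon_j) = 0, \\
\operatorname{Var}(x_j) &= \lambda_{jG}^{2}\operatorname{Var}(G)
  + \lambda_{jk}^{2}\operatorname{Var}(F_k)
  + \operatorname{Var}(\varepsilon_j).
\end{align*}
```

Because the general and first-order factors are uncorrelated, the variance of each indicator splits into a general part, a specific part, and a residual part. A first-order model corresponds to fixing every \lambda_{jG} to zero.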
Furthermore, selection of the best-fitting factor structure model cannot guarantee that the same psychological instrument measures the same phenomena when used in different populations or in the same population at different times. We should examine measurement and structural invariance of the instrument. This means that indicators of an instrument have the same meaning and that they are not biased by attributes such as gender, marital status, or age, to list just a few. The procedure tests the following nested hypotheses (Vandenberg & Lance, 2000): 1) Each group (e.g., men and women) has the same pattern of indicators and factors (configural invariance); 2) In addition, factor loadings for similar indicators are invariant across groups (metric invariance; also known as weak factorial invariance); 3) In addition, intercepts of similar items are invariant across groups (scalar invariance; also known as strong factorial invariance); 4) In addition, residuals of similar items are invariant across groups (residual invariance; also known as strict factorial invariance); 5) In addition, variances of similar factors are invariant across groups (factor variance invariance); 6) In addition, covariances between factors are invariant across groups (factor covariance invariance); and 7) In addition, means of factors are invariant across groups (factor mean invariance).
The hypotheses from 2) to 4) are called measurement invariance as they examine the relationships between measured indicators and their latent constructs. The hypotheses from 5) to 7) are called structural invariance as they examine the latent variables only. Hypothesis testing is recommended to be conducted in the order above (Vandenberg & Lance, 2000). If one step is rejected, the subsequent steps are not to be performed.
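The seven hypotheses can be summarised as cumulative equality constraints on a two-group factor model. The notation below is assumed for illustration and is not taken from the original report: \tau denotes item intercepts, \Lambda factor loadings, \Theta the residual covariance matrix, \Phi the factor covariance matrix, and \kappa the factor means, with the superscript (g) indexing the group.

```latex
\begin{align*}
x^{(g)} &= \tau^{(g)} + \Lambda^{(g)} \xi^{(g)} + \varepsilon^{(g)}, \qquad g = 1, 2 \\[4pt]
\text{1) configural:} &\quad \Lambda^{(1)},\ \Lambda^{(2)} \text{ share the same pattern of free and fixed loadings} \\
\text{2) metric:} &\quad \Lambda^{(1)} = \Lambda^{(2)} \\
\text{3) scalar:} &\quad \tau^{(1)} = \tau^{(2)} \\
\text{4) residual:} &\quad \Theta^{(1)} = \Theta^{(2)} \\
\text{5) factor variance:} &\quad \operatorname{diag}\Phi^{(1)} = \operatorname{diag}\Phi^{(2)} \\
\text{6) factor covariance:} &\quad \Phi^{(1)}_{kl} = \Phi^{(2)}_{kl} \quad (k \neq l) \\
\text{7) factor mean:} &\quad \kappa^{(1)} = \kappa^{(2)}
\end{align*}
```

Each step keeps the constraints of all earlier steps, which is why a rejected step ends the sequence.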
Our present study reports the factor structure of the EASI in a larger population in Japan searching for the model that fits the data sufficiently. We also report the final model's measurement and structural invariance.

Study procedures and participants
The target of our study was 3- to 4-year-old Japanese children. With the cooperation of Rakuten Insight Inc. (Shibuya, Tokyo), parents who live with their 3- to 4-year-old children (exactly 36 months to 59 months) were recruited from 47 prefectures in Japan. Out of over 400,000 parents who were enrolled as web-research respondents, 246,578 had children and were solicited to participate in the survey. Our inclusion criteria were: 1) the participants were required to take care of the children on a daily basis, 2) their native language was Japanese, and 3) the main living environment after birth of the target child was in Japan. A total of 531 mothers and 369 fathers were selected as the participants. The parents' mean (SD) age was 37.6 (5.5) years. The gender ratio of children was nearly even: 465 boys and 435 girls. Among them, 481 were firstborn children, 322 were second-born, and 84 were third-born. The children's mean (SD) age was 47.7 (6.3) months.
A survey web page was created by Rakuten Insight Inc. This contained all of the necessary information for participation. The questionnaire was preceded by

Measurements
The EASI Survey consists of 20 items rated on a 5-point scale (from 0 "a little" to 4 "a lot") to measure 4 temperament dimensions: Emotionality (E), Activity (A), Sociability (S), and Impulsivity (I). One of us (TK) translated the EASI into Japanese with permission from the original authors.

Data analysis
We examined goodness-of-fit of the original model proposed by Buss and Plomin (1984). First, we examined skewness and kurtosis of each EASI item to check whether the item was approximately normally distributed. We then calculated the Cronbach's (1951) alpha coefficient of the items of each of the EASI subscales. If it was less than .7, we deleted items whose removal had a positive impact (i.e., resulted in a higher alpha coefficient when deleted) until the alpha coefficient reached .7 or greater, or until the number of items loading on a factor was reduced to three.
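The item-deletion rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in this study: the function names (`cronbach_alpha`, `prune_items`) are hypothetical, the data layout (one list of item scores per respondent) is assumed, and at each step the item whose removal raises alpha the most is dropped, which is one plausible reading of the rule.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one list per respondent)."""
    k = len(rows[0])
    item_var_sum = sum(variance([r[j] for r in rows]) for j in range(k))
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def drop_column(rows, j):
    """Return a copy of the data with item j removed."""
    return [r[:j] + r[j + 1:] for r in rows]


def prune_items(rows, names, target=0.70, min_items=3):
    """Delete items one at a time until alpha >= target or min_items remain.

    At each step the item whose deletion yields the highest alpha is removed;
    the loop also stops if no deletion would raise alpha.
    """
    rows = [list(r) for r in rows]
    names = list(names)
    alpha = cronbach_alpha(rows)
    while alpha < target and len(names) > min_items:
        # Alpha obtained when each item in turn is left out.
        candidates = [cronbach_alpha(drop_column(rows, j))
                      for j in range(len(names))]
        best = max(range(len(names)), key=candidates.__getitem__)
        if candidates[best] <= alpha:
            break  # no deletion improves alpha
        rows = drop_column(rows, best)
        names.pop(best)
        alpha = candidates[best]
    return names, alpha
```

Calling `prune_items(scores, ["i1", "i2", "i3", "i4", "i5"])` on one subscale's scores returns the retained item names together with the resulting alpha.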
After identifying the best-fitting model, we examined measurement invariance across different categories and occasions: mothers vs. fathers; boys vs. girls; 3-year-olds vs. 4-year-olds; and time 1 vs. time 2. We defined invariance from one step to the next as either 1) a non-significant increase of χ² for the difference in df, 2) a decrease of CFI of less than .01, or 3) an increase of RMSEA of less than .01 (Chen, 2007; Cheung & Rensvold, 2002).
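The decision rule for moving from one invariance step to the next can be expressed as a small check. The sketch below is illustrative only: the `Fit` record and `step_holds` function are hypothetical names, the rule is read as "any one of the three criteria suffices", and the χ² critical value for the difference in df is supplied by the caller (for example from a χ² table) rather than computed here.

```python
from dataclasses import dataclass


@dataclass
class Fit:
    """Fit statistics of one nested multi-group model."""
    chi2: float
    df: int
    cfi: float
    rmsea: float


def step_holds(prev, curr, chi2_critical,
               max_cfi_drop=0.01, max_rmsea_rise=0.01):
    """Return True if the more constrained model `curr` is accepted.

    Accepted when at least one criterion is met:
      1) the chi-square increase is below the critical value for the df difference,
      2) CFI decreases by less than max_cfi_drop,
      3) RMSEA increases by less than max_rmsea_rise.
    """
    chi2_ok = (curr.chi2 - prev.chi2) < chi2_critical
    cfi_ok = (prev.cfi - curr.cfi) < max_cfi_drop
    rmsea_ok = (curr.rmsea - prev.rmsea) < max_rmsea_rise
    return chi2_ok or cfi_ok or rmsea_ok


# Made-up numbers for illustration only (not results from this study).
configural = Fit(chi2=250.0, df=126, cfi=0.950, rmsea=0.040)
metric = Fit(chi2=262.0, df=136, cfi=0.948, rmsea=0.039)
# 18.307 is the .05 critical value of chi-square with 10 df.
metric_invariant = step_holds(configural, metric, chi2_critical=18.307)
```

In a sequential test, the first step for which `step_holds` returns False ends the procedure, and later steps are not evaluated.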

Internal consistency of the four subscales
Skewness and kurtosis of all of the EASI items were low (skewness < 1.0 and kurtosis < 2.0). This suggested an approximately normal distribution of the data (Table 1) and therefore that the data are "factorable". A Cronbach's alpha of more than .7 was obtained by excluding items 13 and 17 from E, items 2 and 14 from A, and items 11 and 15 from S. There was no need to exclude any items from I (Table 2). Thus, the remaining 14 items were used for the subsequent analyses.

CFAs
The first-order 4-factor structure model did not show acceptable fit with the data: χ²(71) = 462.564, CFI = .895, RMSEA = .078, and AIC = 530.564 (Figure 1). There were substantial correlations between some factors of the EASI. Of these, the correlation between Emotionality and Impulsivity (r = .77) was theoretically explainable. Therefore, we set a general factor influencing the items of both of these two factors (Table 3 and Figure 2). This model showed much better fit with the data: χ²(63) = 250.477, CFI = .950, RMSEA = .058, and AIC = 334.477. We considered that this was the best model to explain the data of the EASI.
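The reported RMSEA and AIC values can be checked against the χ² statistics with a short calculation. The sketch below rests on assumptions that are not stated in the text: that the analysed sample is the full N = 900, that RMSEA uses the common N − 1 convention, and that AIC is defined as χ² + 2q, where q is the number of free parameters. The function names are hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """RMSEA from a model chi-square, its df, and sample size n (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def free_params_from_aic(aic, chi2):
    """Number of free parameters q, assuming AIC = chi2 + 2q."""
    return round((aic - chi2) / 2)


N = 900  # assumed: 531 mothers + 369 fathers

first_order = rmsea(462.564, 71, N)   # rounds to .078
with_general = rmsea(250.477, 63, N)  # rounds to .058

q_first_order = free_params_from_aic(530.564, 462.564)   # 34
q_with_general = free_params_from_aic(334.477, 250.477)  # 42

# 14 items give 14 * 15 / 2 = 105 distinct variances and covariances,
# so df = 105 - q matches the reported 71 and 63.
moments = 14 * 15 // 2
```

Under these assumptions, the eight additional parameters of the second model correspond to general-factor loadings on the three retained E items and the five I items, and the χ² difference between the two models is 212.087 on 8 df.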

Measurement and structural invariance between different demographic attributes
All of the comparisons between fathers and mothers, boys and girls, 3-year-olds and 4-year-olds, and times 1 and 2 showed that this 14-item EASI model was invariant from configural, metric, scalar, factor variance, and factor covariance perspectives (Table 4). Therefore, this 14-item EASI has the same factor structure regardless of the gender of parents and children and the age of children, and its factor structure does not change when the scale is used repeatedly.
Compared to mothers, fathers rated A and I significantly higher but S lower. Girls were scored higher in A and General E/I than boys. Four-year-olds were rated significantly lower in factor means of E, A, and I than 3-year-olds. There was no factor mean difference in any of the factors between the two test occasions (Table 5).

Discussion
Our study confirmed the original 4-factor structure of the EASI. It also revealed the measurement and structural invariance of the factor structure among Japanese toddlers. This echoes the report of the 4-factor structure of the instrument among Japanese children aged 4 years or less (Kitamura et al., 2014), and supports the robustness of Buss and Plomin's (1975) original 4-factor model. Of interest was the fact that the bifactor model (with a general factor covering both E and I) showed better fit with the data than the first-order 4-factor model. This may be because the first-order factor structure models for the EASI or EAS adopted by previous research assume that the indicators of one factor load on that factor only. The association between items of Emotionality and Impulsivity is easily interpretable. Emotionality in the original theory is focused on unpleasant emotions (Ohashi & Kitamura, 2017). In our bifactor model, the general factor loaded significantly positively on item 5 (not optimistic), item 4 (frightens easily), and item 8 (can't sit for long time) whereas it loaded significantly negatively on item 12 (stays with other people) and item 20 (interest in a toy to another). Despite the difference in their representations, both temperament domains share common traits: sensitivity to stimuli, difficulty in calming oneself, and inability to control emotions. This may depict a child who has little endurance, is easily frightened, and is socially withdrawn.
In the development of the EASI, Impulsivity was later dropped because Buss and Plomin thought that Impulsivity was composed of various subcomponents that had shown only some replication by factor analyses. They also noted that Impulsivity did not meet their criteria of temperament (Buss & Plomin, 1984).
A long-standing methodological issue in the assessment of children's behaviours is rater bias, which can differ between parents, teachers, and researchers (e.g., Hinshaw, Han, Erhardt, & Huber, 1992; Hubert, Wachs, Peters-Martin, & Gadour, 1982; Kolko & Kazdin, 1993; Lyon & Plomin, 1981; Neale & Stevenson, 1989; Renouf & Kovacs, 1994; Satake, Yoshida, Yamashita, Kinukawa, & Takagishi, 2003; Yuh, 2017; Weissman et al., 1987). Bias is an obstacle in clinical research where parents are used as observers of child temperament. Our study found that the EASI's factor structure was invariant in terms of parents' and children's gender, children's age, and measurement occasions. This is encouraging as it allows the EASI to be used in clinical as well as research settings. However, factor means of A, S, and I differed between mothers and fathers. This may mean that fathers overrate A and I and underrate S relative to mothers. Further studies may be required using the same families with the child and the two parents.
This study has limitations. First, the invariance found for this measure cannot be extrapolated to other measures of temperament. Second, the identification of a factor structure of temperament cannot be equated with classification of children based on temperament patterns. In addition to factor identification, a person-centred approach to child temperament is an important research agenda for application to clinical settings.

Y. Ohashi, T. Kitamura Psychology
Despite these drawbacks, our study demonstrated that the EASI can be reliably used in a Japanese population. We confirmed the goodness-of-fit of the original 4-factor structure, which was improved by the addition of a general factor combining E and I.