A Comparison between Major Factor Extraction and Factor Rotation Techniques in Q-Methodology

The statistical analysis in Q-methodology is based on factor analysis followed by a factor rotation. Currently, the most common factor extraction methods are centroid and principal component extractions, and the common techniques for factor rotation are manual rotation and varimax rotation. However, there are other factor extraction methods, such as principal axis factoring, and factor rotation methods, such as quartimax and equamax, which are not used by Q-users because they have not been implemented in any major Q-program. In this article we briefly explain some major factor extraction and factor rotation techniques and compare these techniques using three datasets. We applied principal component and principal axis factoring methods for factor extraction and varimax, equamax, and quartimax factor rotation techniques to three actual datasets. We compared these techniques based on the number of Q-sorts loaded on each factor, the number of distinguishing statements on each factor, and the number of excluded Q-sorts. There was not much difference between principal component and principal axis factoring extractions. The main findings of this article include the emergence of a general factor and a smaller number of excluded Q-sorts based on quartimax rotation. Another interesting finding was a smaller number of distinguishing statements for factors based on quartimax rotation compared with varimax and equamax rotations. These findings are not conclusive and further analysis on more datasets is needed.


Introduction
The statistical analysis in Q-methodology is based on factor analysis, which is typically followed by a factor rotation. Currently, the most common factor extraction methods are centroid and principal component extractions, and the common techniques for factor rotation are manual rotation and varimax rotation. Indeed, these factor extraction and factor rotation techniques are the only methods available in the widely used PQMethod program [1]. However, there are other factor extraction methods, such as principal axis factoring, and factor rotation methods, such as quartimax and equamax, which are not used by Q-users because they have not been implemented in any major Q-program. In this article we briefly explain some major factor extraction and factor rotation techniques and compare them using three actual datasets.

How to cite this paper: Akhtar-Danesh, N. (2017) A Comparison between Major Factor Extraction and Factor Rotation Techniques in Q-Methodology.

Q-Methodology
Q-methodology was introduced in 1935 by Stephenson [2] [3] and is used to identify common attitudes, perceptions, preferences, and feelings among a group of participants. In Q-methodology subjective viewpoints are collected and analyzed using a combination of qualitative and quantitative techniques [4]. A Q-methodological study involves a) development of a sample of statements, the Q-sample, related to the topic of interest and b) rank-ordering of the Q-sample by a group of individuals according to their points of view about the statements, using a Q-sort table (a grid) with a quasi-normal distribution (see Figure 1). After data collection using this grid, a by-person factor analysis (i.e., the factor analysis is performed on persons, not on variables or traits) is used to analyze these Q-sorts, where each Q-sort represents one individual rather than one variable or trait. However, for the rest of this article Q-sort and variable are used interchangeably. Using such by-person factor analysis, similar Q-sorts (individuals) are grouped together as factors, where each factor represents a group of individuals with similar views, feelings, or preferences about the theme of the study. An individual loads on a factor if his/her factor loading is statistically significant (p ≤ 0.05). A factor loading is simply the correlation between a Q-sort and the factor itself. Then, each factor is interpreted based on its distinguishing statements and statements with high or low factor scores. Distinguishing statements usually define the uniqueness of each factor.
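As a minimal sketch of what "by-person" means in practice, the Python snippet below correlates Q-sorts with one another rather than statements. The function name `by_person_correlation`, the toy data, and the input layout (one row per statement, one column per Q-sort) are assumptions made for illustration and are not taken from any Q-program.

```python
import numpy as np

def by_person_correlation(sorts):
    """Correlate Q-sorts (persons) with each other.

    Assumed layout: one row per statement, one column per Q-sort, so the
    result is an (n_sorts x n_sorts) matrix of person-by-person correlations.
    """
    sorts = np.asarray(sorts, dtype=float)
    return np.corrcoef(sorts, rowvar=False)

# Hypothetical toy data: 5 statements ranked by 3 participants.
toy_sorts = [
    [-2,  2, -1],
    [-1,  1, -2],
    [ 0,  0,  0],
    [ 1, -1,  2],
    [ 2, -2,  1],
]
R = by_person_correlation(toy_sorts)
```

In this toy example the second participant ranks every statement exactly opposite to the first, so their Q-sorts correlate at −1, while the first and third participants share a similar view. The matrix `R` is what a factor extraction method would then be applied to.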

Factor Extraction Methods
Figure 1. A Q-sort table with anchors of −5 and +5.

General statistical programs such as SPSS, R, Stata, and SAS include several factor extraction methods such as principal component analysis (PCA), principal axis factoring (PAF), maximum likelihood (ML) factoring, image factoring, and alpha factoring. Each of these methods uses a different orthogonal solution; however, with large sample sizes the differences in the extracted factors are usually negligible. Besides, the original extracted factors usually need to be rotated to provide interpretable results. In this section a brief review of PCA, PAF, ML, and centroid factor analysis (CFA) and their statistical properties is provided.
Principal component analysis: PCA is the most common factor extraction method and is available in almost every statistical program. It is also commonly used in Q-methodology. PCA extracts uncorrelated linear combinations of the observed Q-sorts. In general, this method analyzes all the variance in the variables (Q-sorts), i.e., it uses 1's in the diagonal of the correlation matrix for factor extraction. The first factor explains the highest level of variance in the dataset and the second factor explains the second highest level of variance; this process continues until 100% of the variance in the dataset is explained by the factors.
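The description above can be sketched with an eigendecomposition of the correlation matrix. This is an illustrative sketch only: the function name `pca_extract` and the sign convention are assumptions, and it is not the implementation used by any particular statistical program.

```python
import numpy as np

def pca_extract(R, n_factors):
    """Unrotated principal component loadings from a correlation matrix R.

    Returns the loadings and the proportion of total variance explained
    by each retained factor.
    """
    R = np.asarray(R, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(R)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]   # keep the largest ones first
    vals = np.clip(eigvals[order], 0.0, None)       # guard against tiny negatives
    loadings = eigvecs[:, order] * np.sqrt(vals)
    # Assumed convention: flip signs so each factor's loadings sum to >= 0.
    signs = np.where(loadings.sum(axis=0) < 0, -1.0, 1.0)
    return loadings * signs, vals / R.shape[0]
```

Because the diagonal of `R` holds 1's, the eigenvalues sum to the number of Q-sorts, so dividing each eigenvalue by that number gives the share of total variance explained; retaining all factors reproduces `R` exactly.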
Principal axis factoring: The only difference between PAF and PCA is that in PAF the 1's in the diagonal of the correlation matrix are replaced with estimates of the communalities. For each Q-sort the communality is defined as the proportion of the variance that is shared with (or explained by) the other variables (Q-sorts). It is estimated through an iterative process that takes the squared multiple correlation of each variable with all other variables as the starting value and continues until the changes in the communalities satisfy the convergence criterion for extraction [5]. PAF is generally considered to be appropriate for the exploration of the underlying factors for theoretical purposes [6].
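The iterative process described above can be sketched as follows. The function name `paf_extract`, the iteration limit, and the tolerance are illustrative assumptions. The sketch also assumes that the correlation matrix is invertible, which will not hold when there are more Q-sorts than statements; a different starting value for the communalities would be needed in that case.

```python
import numpy as np

def paf_extract(R, n_factors, max_iter=200, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    # Starting communalities: squared multiple correlations (SMC).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)               # communalities replace the 1's
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        vals = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(vals)
        new_h2 = (loadings ** 2).sum(axis=1)        # updated communalities
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2
```

A communality estimate that reaches or exceeds 1 during these iterations would signal a Heywood case; this sketch does not check for it.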
Maximum likelihood factor extraction: In the ML approach, as in PAF, the communalities are used instead of 1's in the diagonal of the correlation matrix. This approach is based on the assumption of a normal distribution for each variable (Q-sort). However, the main issue with using ML in Q-methodology is that it results in Heywood cases [6] more often than other techniques such as PCA and PAF. A Heywood case can occur in all common factor analysis solutions such as PAF, CFA, and ML, where the iteratively estimated communality reaches or exceeds unity and the variance of some factors becomes zero or negative [7].
Centroid factor extraction: CFA has been fully described by Thurstone [8] and a description of its use in Q-methodology has been provided by Brown [9]. It is known as an approximation of PAF [10] [11]. However, the technical difference between PAF and CFA is that in PAF the sum of squares of "loadings" is maximized, but in CFA the average of the "loadings" is maximized. Unlike PCA and PAF, CFA is not implemented in any major statistical program such as SAS, SPSS, Stata, and R, but it is available to Q-methodology users through PQMethod [1] and PCQ [12]. Because CFA is known as an approximation of PAF, we will be using PAF instead.

Factor Rotation Techniques
Whatever method we use for factor extraction, the original unrotated factors are usually not meaningful or easily interpretable. Factor rotation is a process in which the original factors are rotated about their origin to yield a simple structure and easily interpretable factors [8]. In the remainder of this section the common factor rotation techniques are described. Manual rotation, as a common approach in Q-methodology, is not described in this article for two reasons: 1) we intend to compare objective (statistical) rotation solutions, and manual rotation is a more subjective approach; 2) because of its subjective solution it is less reliable and not comparable with other objective approaches. For further information on manual rotation, readers are referred to Akhtar-Danesh [13] and Akhtar-Danesh & Mirza [14].
Varimax is the most common rotation method used in statistical analysis. It is an orthogonal rotation technique that minimizes the number of variables with high loadings, either positive or negative, for each factor. In other words, it maximizes the variance of each factor loading by making high loadings higher and low loadings lower to simplify factor interpretation. Mathematically, it redistributes the total variance among Q-sorts between a smaller number of factors with relatively equal variances. In this process, the variation in the major unrotated factors is redistributed among the other smaller factors [5]. This process generates factors with relatively equal importance. In other words, it eliminates a "general" factor, even if one exists [15]. In the same way, it overinflates the smaller factors. Therefore, use of varimax rotation should be avoided if there is a general factor among the Q-sorts and if there are too many factors for rotation. As a result, varimax may not be suitable for some datasets where the aim is to identify the salient factors irrespective of their distribution or how many Q-sorts load on each factor.
Quartimax is an orthogonal method that minimizes the number of factors that explain each Q-sort. In other words, each Q-sort is loaded on the minimum number of factors. This is similar to the common practice in Q-methodology in which each Q-sort is preferred to be loaded on only one factor. From this point of view, quartimax rotation seems to be more appropriate in Q-methodology. As a result, it provides a smaller number of confounded Q-sorts, i.e., Q-sorts that load on more than one factor. Although technically there should be no problem in having confounded Q-sorts, in current practice such Q-sorts are excluded from further analysis, specifically for identifying distinguishing statements; this is a problem that needs to be addressed in future versions of Q-programs.
Quartimax simplifies the rows of the factor-loading matrix by loading each Q-sort strongly on a single factor. This process provides a more interpretable factor compared to varimax rotation. This method tends to generate a general factor among the participants [16]. A general factor usually consists of a large number of Q-sorts compared to the other factors. Therefore, if the existence of a general factor is expected among the Q-sorts, this method might be the method of choice; however, it may create a general factor even if one does not exist among the Q-sorts.
Equamax rotation is a combination of varimax and quartimax techniques that simplifies both the number of variables that load highly on a factor and the number of factors needed to explain a variable. However, Tabachnick & Fidell [5] warn that the equamax approach might behave erratically. Therefore, a thorough inspection of factors is warranted after factor extraction.
Direct oblimin is an oblique (non-orthogonal) rotation method. This technique minimizes the cross products of loadings to simplify factors. This method permits fairly high correlation between factors, although factors may not necessarily correlate if this method is used.
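The three orthogonal rotations described above are commonly treated as members of the orthomax family, which differ only in a weight gamma: 0 for quartimax, 1 for varimax, and half the number of factors for equamax. The sketch below uses a widely circulated SVD-based iteration for this family. The function names `orthomax` and `rotate`, the tolerance, and the omission of Kaiser normalization are assumptions of this illustration; statistical programs may normalize the loadings first and can therefore give somewhat different results.

```python
import numpy as np

def orthomax(A, gamma=1.0, max_iter=500, tol=1e-8):
    """Orthogonal rotation of a loading matrix A (Q-sorts x factors).

    gamma = 0 gives quartimax, gamma = 1 varimax, gamma = k / 2 equamax.
    Returns the rotated loadings and the rotation matrix T.
    """
    A = np.asarray(A, dtype=float)
    p, k = A.shape
    T = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient-like matrix of the orthomax criterion at the current rotation.
        G = A.T @ (L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                       # nearest orthogonal matrix to G
        d = s.sum()
        if d_old > 0 and d / d_old < 1 + tol:
            break
        d_old = d
    return A @ T, T

def rotate(A, method="varimax"):
    """Hypothetical convenience wrapper mapping method names to gamma."""
    k = np.asarray(A).shape[1]
    gammas = {"quartimax": 0.0, "varimax": 1.0, "equamax": k / 2.0}
    return orthomax(A, gamma=gammas[method])
```

Because `T` is orthogonal, the communality of each Q-sort is unchanged by the rotation; only the way that shared variance is split among the factors differs between the three methods.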

Methods
We applied PCA and PAF factor extraction and varimax, equamax, and quartimax factor rotation techniques to three actual datasets. We compared these techniques based on the number of Q-sorts loaded on each factor, the number of distinguishing statements on each factor, and the number of excluded Q-sorts. After factor extraction and factor rotation, a Q-sort was excluded from further analysis if it was not significantly correlated with any factor or if its squared loading on every factor was less than half of the communality for that Q-sort [17]. Using this approach there will be no confounded Q-sort, i.e., no Q-sort loading on more than one factor. However, we will not discuss the actual distinguishing statements because such a comparison would be quite subjective and extensive and merits a separate article. The three datasets used in this analysis are described below.
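The exclusion rule can be illustrated with a short sketch. Two assumptions are made here that the text does not state: the significance cutoff is taken as 1.96 divided by the square root of the number of statements, a convention often used in Q-methodology for p ≤ 0.05, and the communality is computed as the sum of squared loadings over the retained factors. The function name `assign_q_sorts` is hypothetical.

```python
import math

def assign_q_sorts(loadings, n_statements, z=1.96):
    """Return, for each Q-sort, the index of its factor or None if excluded.

    `loadings` is a list of rows, one per Q-sort, with one loading per factor.
    """
    threshold = z / math.sqrt(n_statements)      # assumed significance cutoff
    assignments = []
    for row in loadings:
        communality = sum(value * value for value in row)
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        significant = abs(row[best]) > threshold
        # Keep the Q-sort only if one factor accounts for at least half
        # of its communality (ties are not treated specially here).
        dominant = row[best] ** 2 >= communality / 2
        assignments.append(best if significant and dominant else None)
    return assignments

# Hypothetical rotated loadings for four Q-sorts on three factors.
example = [
    [0.80, 0.10, 0.05],    # clearly defined by the first factor
    [0.50, 0.45, 0.40],    # significant, but no factor reaches half the communality
    [0.10, 0.10, 0.05],    # no significant loading
    [0.20, -0.70, 0.10],   # defined by the second factor (negative loading)
]
result = assign_q_sorts(example, n_statements=45)
```

Since the squared loadings of a Q-sort sum to its communality, at most one factor can account for more than half of it, which is why this rule leaves no confounded Q-sorts.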

Dataset 1: Nursing professionalism
In this study, 30 nursing students and 24 faculty members sorted 45 statements that were used to identify common viewpoints about professionalism held by nursing faculty and students. The study identified 4 common factors, named humanists, portrayers, facilitators, and regulators.
The Q-sort table for this study included seven rows and eleven columns with anchors of −5 (least agree or most disagree) and +5 (most agree). The full description of the study is provided elsewhere [18].

Dataset 2: Childhood obesity
This study was designed to investigate parents' perceptions of the cause of childhood obesity, its impact on children's health, and the barriers to successful prevention of childhood obesity. The study participants included parents who attended a clinic for their children's well-baby check-up. The Q-sample included 42 statements and 33 parents completed the study. The Q-sort table included 8 rows and 11 columns with anchors of −5 (Strongly Disagree) and +5 (Strongly Agree). The study is fully described by Akhtar-Danesh et al. [19].

Dataset 3: Nursing students' perceptions of clinical simulation
As part of a larger study, nursing students' perceptions regarding simulation learning since curricular integration of simulation activities within an undergraduate nursing program were investigated. Q-methodology was used to identify perspectives of 21 students. The Q-sample included 42 statements and a Q-sort table was developed with nine columns of differing lengths. The columns were numbered sequentially from −4 (most disagree or least agree) to +4 (most agree). Three salient viewpoints were identified: Challenge Seekers, Realistic Embracers, and Support Seekers. The full study is described by Landeen et al. [20].

Statistical Analysis
We first applied PCA and PAF for factor extraction to the abovementioned datasets. Then, we used varimax, equamax, and quartimax techniques to rotate the first four factors for Dataset 1 and Dataset 2 and the first three factors for Dataset 3. The newly developed qfactor program written in Stata (available from author) was used for analysis.

Results
The results of this comparison are presented as the number of Q-sorts loaded on each factor, excluded Q-sorts, and number of distinguishing statements for each factor. Table 1 presents the number of Q-sorts loaded on each factor for each dataset based on different factor extraction and factor rotation methods. Using PCA factor extraction and varimax rotation, only 43 (79.6%) and 28 (84.8%) of the Q-sorts were loaded on the first four factors for Dataset 1 and Dataset 2, respectively. The corresponding numbers were 38 (70.4%) and 22 (66.7%) for equamax rotation and 48 (88.9%) and 30 (90.9%) for quartimax rotation. Therefore, the numbers of excluded Q-sorts from Dataset 1 and Dataset 2 were 11 and 5 for varimax rotation, 16 and 11 for equamax rotation, and 6 and 3 for quartimax rotation, respectively (Table 2). Using PCA and quartimax rotation, the first factor includes more than 70% of the loaded Q-sorts for both datasets, which may confirm the emergence of a general factor. However, Dataset 3 showed a different pattern of results from PCA and different rotation techniques. First, the proportion of excluded Q-sorts was much smaller for this dataset compared with the other two datasets for all rotation techniques. Second, there was not much difference in the number of Q-sorts loaded on each factor between varimax and quartimax rotations. Third, the first factor from quartimax rotation includes only about 52% of the Q-sorts. The main differences between equamax and the other two rotation techniques were that 1) equamax showed a more balanced distribution of Q-sorts among factors and 2) a considerably smaller total number of Q-sorts loaded on the factors. For instance, the total number of loaded Q-sorts for Dataset 1 was 43 for varimax and 48 for quartimax rotation, but the total number for equamax rotation was 38. These numbers in the same order were 28, 30, and 22 for Dataset 2 and 20, 21, and 19 for Dataset 3. Similar results were observed for PAF factor extraction based on the three rotation techniques.

Number of Distinguishing Statements
There was not much change or trend in the number of distinguishing statements between PCA and PAF factor extractions. However, there were interesting differences between varimax, equamax, and quartimax rotations in the number of distinguishing statements (Table 1 & Table 3). Quartimax factors tended to have a smaller number of distinguishing statements compared with varimax and equamax factors for Datasets 1 and 2, although this was not the case for Dataset 3.

Discussion
For the past eight decades the statistical analysis of Q-methodology has been limited to centroid and principal component factor extraction methods and two factor rotation techniques, i.e., varimax rotation and manual rotation. However, many more factor extraction and factor rotation techniques have been developed since the inception of Q-methodology in 1935, and each of these techniques might be more appropriate in certain conditions. We intend to examine different methods of analysis in Q-methodology in a series of articles. In this article, we compared the applications of PCA and PAF factor extractions as well as varimax, equamax, and quartimax rotations in Q-methodology using three datasets.
From a theoretical point of view, quartimax rotation seems to be more appropriate if we expect to have a general factor among the participants. However, using this technique a general factor in the data may emerge as an artifact of quartimax rotation [21]. On the other hand, varimax rotation eliminates a general factor even if one exists. Therefore, sufficient knowledge of the topic of interest might be helpful when deciding on the method of rotation. Although equamax is defined as a combination of varimax and quartimax, we did not observe an intermediating role for equamax in the three datasets, which might be due to the small sample sizes in the datasets.
The main findings of this article include the emergence of a general factor and a smaller number of excluded Q-sorts based on quartimax rotation. Another interesting finding was a smaller number of distinguishing statements for factors based on quartimax rotation compared with varimax and equamax rotations in the first two datasets. Although distinguishing statements are important in the interpretation of the results, particular attention also needs to be paid to those statements with either low or high scores on each factor. Practically, a smaller number of distinguishing statements might make the interpretation much simpler, but it may come at the expense of losing the richness of interpretation that a larger number of distinguishing statements provides.

Table 1. Factor characteristics based on different factor extraction and factor rotation techniques.

Table 2. Number of excluded Q-sorts based on factor extraction and factor rotation methods.

Table 3. Number (SD) of distinguishing statements based on dataset and factor rotation methods.