Let $n$ respondents rank order $d$ items, and suppose that $d \ll n$. Our main task is to uncover and display the structure of the observed rank data by an exploratory riffle shuffling procedure that sequentially decomposes the $n$ voters into a finite number of coherent groups plus a noisy group, where the noisy group represents the outlier voters and each coherent group is composed of a finite number of coherent clusters. We consider exploratory riffle shuffling of a set of items to be equivalent to optimal two-block seriation of the items with crossing of some scores between the two blocks. A riffle shuffled coherent cluster of voters within its coherent group is essentially characterized by the following facts: 1) voters have an identical first TCA factor score, where TCA designates taxicab correspondence analysis, an $L_1$ variant of correspondence analysis; 2) any preference is easily interpreted as a riffle shuffling of its items; 3) the nature of the different riffle shufflings of items can be seen in the structure of the contingency table of the first-order marginals constructed from the Borda scorings of the voters; 4) the first TCA factor scores of the items of a coherent cluster are interpreted as the Borda scale of the items. We also introduce a crossing index, which measures the extent of crossing of scores of voters between the two-block seriation of the items. The novel approach is explained on the benchmark SUSHI data set, where we show that this data set has a very simple structure, which can also be communicated in a tabular form.

Ordering the elements of a set is a common decision-making activity, for example voting for a political candidate or choosing a consumer product. Consequently, there is a huge literature on the analysis and interpretation of preference data, scattered across different disciplines. Rank data are often heterogeneous: they are composed of a finite mixture of components. The traditional methods of finding mixture components of rank data are mostly based on parametric probability models, distance models or latent class models; they are useful for sparse data but not for diffuse data.

Rank data are sparse if there are at most a small finite number of permutations that capture the majority of the preferences; otherwise they are diffuse. As a running example in this paper, we will consider the famous benchmarking SUSHI data set enumerating n = 5000 preferences of d = 10 sushis, see [

A second data set that we shall also analyze is the APA dataset of size n = 5738 by d = 5 , see [

For a general background on statistical methods for rank data, see the excellent monograph by [

The riffle shuffle, see [_{1} and d_{2}, respectively, and successively drops the cards, one by one, so that the piles are interleaved into one deck again.

Let V, named a voting profile, be a set of n preferences on d items. Based on riffle shuffling ideas, [

Assumption: d ≪ n . This means that the sample size n is quite large compared to the number of items d.

SUSHI and APA data sets satisfy this assumption.

The most important first step in the application of a riffle shuffling procedure is how to partition the d items into two disjoint subsets. In the probabilistic riffle shuffling approach of [

We compare the two formulations of riffle shuffle, probabilistic and exploratory, in section 10.

Our aim is to explore and display a given voting profile V by sequentially partitioning it into G coherent groups plus a noisy group; that is,

$$V = \bigcup_{g=1}^{G} \mathrm{cohG}(g) \cup \mathrm{noisyG}, \tag{1}$$

where $G$ represents the number of coherent groups and $\mathrm{cohG}(g)$ is the $g$th coherent group. Furthermore, each coherent group is partitioned into a finite number of coherent clusters; that is,

$$\mathrm{cohG}(g) = \bigcup_{\alpha=1}^{c_g} \mathrm{cohC}_g(\alpha) \quad \text{for } g = 1, \ldots, G, \tag{2}$$

where $c_g$ represents the number of coherent clusters in the $g$th coherent group. So the coherent clusters are the building blocks for the coherent groups. We note the following facts:

Fact 1: The assumption d ≪ n induces the new notion of coherency for the clusters and consequently for the groups; it is a stronger characterization than the notion of interpretability for groups as discussed in [

Fact 2: Each coherent group and its clusters have the same latent variable summarized by the Borda scale.

Fact 3: Given that the proposed method sequentially peels the data like Occam’s razor, the number of groups $G$ is calculated automatically. Furthermore, outliers or uninformative voters belonging to $\mathrm{noisyG}$ are easily tagged.

Fact 4: The approach is exploratory, visual, data analytic and is developed within the framework of taxicab correspondence analysis (TCA). TCA is an L_{1} variant of correspondence analysis developed by [

Two major advantages of our method are: First, we can easily identify outliers. For the SUSHI data, our method tags 12.36% of the voters as outliers, which form the noisy group. While no outliers in the SUSHI data have been identified in [

A coherent cluster of voters has interesting mathematical properties and is essentially characterized by the following facts:

1) Voters have identical unique first TCA factor score.

2) Any voter preference is easily interpreted as a particular riffle shuffling of its items.

3) The nature of riffle shuffling of the items can be observed in the structure of the contingency table of the first-order marginals constructed from the Borda scorings of the voters belonging to the coherent cluster.

4) The first TCA factor scores of the items of a coherent cluster are interpreted as Borda scale of the items.

5) We also introduce the crossing index, which measures the extent of interleaving or the crossing of scores of voters between two blocks seriation of the items in a coherent cluster.

This paper has eleven sections, organized as follows: section 2 presents an overview of TCA; section 3 presents some preliminaries on the Borda coding of the data and related tables and concepts; section 4 presents Theorem 1, which shows that the first principal dimension of TCA clusters the voters into a finite number of clusters; section 5 discusses coherent clusters and their mathematical properties; section 6 discusses riffle shuffling in a coherent cluster; section 7 introduces the crossing index; section 8 introduces the coherent groups; section 9 presents the analysis of the APA data set; section 10 compares the two formulations of riffle shuffle, probabilistic and exploratory; and finally we conclude in section 11.

All mathematical proofs are relegated to the Appendix. Details of the computation are shown only for the first coherent group of SUSHI data set.

Consider an $n \times p$ matrix $X$ where $X_{ij} \ge 0$. We have $\sum_{j=1}^{p} \sum_{i=1}^{n} X_{ij} = X_{**}$. Let $P = X / X_{**}$ be the correspondence matrix associated with $X$; as usual, we define $p_{i*} = \sum_{j=1}^{p} p_{ij}$ and $p_{*j} = \sum_{i=1}^{n} p_{ij}$. Let $D_n = \mathrm{Diag}(p_{i*})$ be the diagonal matrix with diagonal elements $p_{i*}$; similarly, $D_p = \mathrm{Diag}(p_{*j})$. Let $k = \mathrm{rank}(P) - 1$.

In TCA the calculation of the dispersion measures $(\delta_\alpha)$, principal axes $(u_\alpha, v_\alpha)$, principal basic vectors $(a_\alpha, b_\alpha)$, and principal factor scores $(f_\alpha, g_\alpha)$ for $\alpha = 1, \ldots, k$ is done in a stepwise manner. We put $P_1 = (p_{ij}^{(1)} = p_{ij} - p_{i*} p_{*j})$. Let $P_\alpha$ be the residual correspondence matrix at the $\alpha$-th iteration.

The variational definitions of the TCA at the $\alpha$-th iteration are

$$\begin{aligned}
\delta_\alpha &= \max_{u \in \mathbb{R}^p} \frac{\| P_\alpha u \|_1}{\| u \|_\infty} = \max_{v \in \mathbb{R}^n} \frac{\| P'_\alpha v \|_1}{\| v \|_\infty} = \max_{u \in \mathbb{R}^p,\, v \in \mathbb{R}^n} \frac{v' P_\alpha u}{\| u \|_\infty \| v \|_\infty} \\
&= \max \| P_\alpha u \|_1 \ \text{subject to } u \in \{-1, +1\}^p \\
&= \max \| P'_\alpha v \|_1 \ \text{subject to } v \in \{-1, +1\}^n \\
&= \max v' P_\alpha u \ \text{subject to } u \in \{-1, +1\}^p,\ v \in \{-1, +1\}^n .
\end{aligned}$$

The α-th principal axes are

$$u_\alpha = \arg\max_{u \in \{-1, +1\}^p} \| P_\alpha u \|_1 \quad \text{and} \quad v_\alpha = \arg\max_{v \in \{-1, +1\}^n} \| P'_\alpha v \|_1, \tag{3}$$

and the $\alpha$-th basic principal vectors are

$$a_\alpha = P_\alpha u_\alpha \quad \text{and} \quad b_\alpha = P'_\alpha v_\alpha, \tag{4}$$

and the $\alpha$-th principal factor scores are

$$f_\alpha = D_n^{-1} a_\alpha \quad \text{and} \quad g_\alpha = D_p^{-1} b_\alpha; \tag{5}$$

furthermore the following relations are also useful

$$u_\alpha = \mathrm{sgn}(b_\alpha) = \mathrm{sgn}(g_\alpha) \quad \text{and} \quad v_\alpha = \mathrm{sgn}(a_\alpha) = \mathrm{sgn}(f_\alpha), \tag{6}$$

where $\mathrm{sgn}(\cdot)$ is the coordinatewise sign function, $\mathrm{sgn}(x) = 1$ if $x > 0$, and $\mathrm{sgn}(x) = -1$ if $x \le 0$. The $\alpha$-th taxicab dispersion measure $\delta_\alpha$ can be represented in many different ways

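$$\begin{aligned}
\delta_\alpha &= \| P_\alpha u_\alpha \|_1 = \| a_\alpha \|_1 = a'_\alpha v_\alpha = \| D_n f_\alpha \|_1 = v'_\alpha D_n f_\alpha \\
&= \| P'_\alpha v_\alpha \|_1 = \| b_\alpha \|_1 = b'_\alpha u_\alpha = \| D_p g_\alpha \|_1 = u'_\alpha D_p g_\alpha .
\end{aligned} \tag{7}$$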

The ( α + 1 ) -th residual correspondence matrix is

$$P_{\alpha+1} = P_\alpha - D_n f_\alpha g'_\alpha D_p / \delta_\alpha . \tag{8}$$

An interpretation of the term $D_n f_\alpha g'_\alpha D_p / \delta_\alpha$ in (8) is that it represents the best rank-1 approximation of the residual correspondence matrix $P_\alpha$, in the sense of the taxicab norm.

In CA and TCA, the principal factor scores are centered; that is,

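$$\sum_{i=1}^{n} f_\alpha(i)\, p_{i*} = 0 = \sum_{j=1}^{p} g_\alpha(j)\, p_{*j} \quad \text{for } \alpha = 1, \ldots, k . \tag{9}$$

The reconstitution formula in TCA and CA is

$$p_{ij} = p_{i*}\, p_{*j} \left[ 1 + \sum_{\alpha=1}^{k} f_\alpha(i)\, g_\alpha(j) / \delta_\alpha \right] . \tag{10}$$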

In TCA, the calculation of the principal component weights, $u_\alpha$ and $v_\alpha$, and the principal factor scores, $g_\alpha$ and $f_\alpha$, can be accomplished by two algorithms. The first one is complete enumeration based on Equation (3). The second one iterates the transition formulae (4), (5) and (6). This is an ascent algorithm; that is, it increases the value of the objective function at each iteration, see [

The TCA map is obtained by plotting $(g_1, g_2)$ or $(f_1, f_2)$.
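As an illustration of the complete-enumeration algorithm based on Equation (3), the following minimal Python sketch computes the first TCA dimension of a small table. The function name `tca_first_dimension` and the 3 × 3 table `X` are hypothetical and serve only as an example; the sketch assumes that all row and column totals are strictly positive, and it is practical only for a small number of columns, since it visits all $2^p$ sign vectors.

```python
from itertools import product

def tca_first_dimension(X):
    """First TCA dimension of a nonnegative table X by enumerating u in {-1,+1}^p."""
    n, p = len(X), len(X[0])
    total = sum(map(sum, X))
    P = [[x / total for x in row] for row in X]               # correspondence matrix
    r = [sum(row) for row in P]                               # row masses p_i*
    c = [sum(P[i][j] for i in range(n)) for j in range(p)]    # column masses p_*j
    # Residual matrix P_1 = (p_ij - p_i* p_*j)
    P1 = [[P[i][j] - r[i] * c[j] for j in range(p)] for i in range(n)]
    best = None
    for u in product((-1, 1), repeat=p):                      # Equation (3), brute force
        a = [sum(P1[i][j] * u[j] for j in range(p)) for i in range(n)]
        delta = sum(abs(x) for x in a)                        # ||P_1 u||_1
        if best is None or delta > best[0]:
            best = (delta, u, a)
    delta, u, a = best
    v = [1 if x > 0 else -1 for x in a]                       # Equation (6)
    b = [sum(P1[i][j] * v[i] for i in range(n)) for j in range(p)]  # Equation (4)
    f = [a[i] / r[i] for i in range(n)]                       # Equation (5)
    g = [b[j] / c[j] for j in range(p)]
    return delta, u, v, f, g, r, c

# Hypothetical 3 x 3 contingency table, used only for illustration.
X = [[4, 1, 0], [1, 3, 2], [0, 2, 5]]
delta, u, v, f, g, r, c = tca_first_dimension(X)
print(delta, f, g)
```

The returned scores can be checked against the centering property (9) and the representations of $\delta_1$ in (7); note that $u$ and $-u$ give the same dispersion, so the sign of the solution is arbitrary.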

In this section we review 1) the Borda scoring of a voting profile $V$ into $R$ and the Borda scale; 2) the contingency table of the first-order marginals of $R$; 3) the coded tables $R_{\mathrm{double}}$ and $R_{\mathrm{nega}}$.

Let $A = \{a_1, a_2, \ldots, a_d\}$ denote a set of $d$ alternatives/candidates/items, and $V$ a set of $n$ voters/individuals/judges. In this paper, we consider linear orderings/rankings/preferences, in which all $d$ objects are rank-ordered according to their levels of desirability by the $n$ voters. We denote a linear order by a sequence $s = (a_{k_1} \succ a_{k_2} \succ \cdots \succ a_{k_d})$, where $a_{k_1} \succ a_{k_2}$ means that the alternative $a_{k_1}$ is preferred to the alternative $a_{k_2}$. The Borda scoring of $s$, see [

The Borda scale of the elements of $A$ is $\beta = 1'_n R / n$, where $1_n$ is a column vector of 1’s having $n$ coordinates. The Borda scale seriates/orders the $d$ items of the set $A$ according to their average scores: $\beta(j) > \beta(i)$ means item $j$ is preferred to item $i$, and $\beta(j) = \beta(i)$ means both items $(a_i, a_j)$ are equally preferred. In the toy example of

Similarly, we define the reverse Borda score of $s$ to be the vector $\bar{b}(s)$, which assigns to the element $a_{k_j}$ the score $(j - 1)$. We denote by $\bar{R} = (\bar{r}_{ij})$ the matrix having $n$ rows and $d$ columns, where $\bar{r}_{ij}$ designates the reverse Borda score of the $i$th judge’s nonpreference of the $j$th alternative. The reverse Borda scale of the $d$ items is $\bar{\beta} = 1'_n \bar{R} / n$.

We note that

$$R + \bar{R} = (d - 1)\, 1_n 1'_d$$

and

$$\beta + \bar{\beta} = (d - 1)\, 1'_d .$$
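The Borda and reverse Borda codings can be sketched in a few lines of Python. The sketch below uses the four-voter toy profile on the items C, B, A of this section; the helper name `borda_matrix` is hypothetical, and the Borda score is taken to be $d - j$ for the item in position $j$, which is the convention implied by the reverse score $(j - 1)$ and the identity $R + \bar{R} = (d-1) 1_n 1'_d$.

```python
def borda_matrix(profile, items):
    """Borda scores R: the item ranked in position j (1 = most preferred) gets d - j."""
    d = len(items)
    R = []
    for ranking in profile:
        position = {item: k for k, item in enumerate(ranking)}  # 0 = most preferred
        R.append([d - 1 - position[item] for item in items])
    return R

items = ["C", "B", "A"]                      # column order of the toy table
profile = [("A", "B", "C"), ("A", "C", "B"), ("B", "A", "C"), ("B", "C", "A")]

R = borda_matrix(profile, items)
n, d = len(profile), len(items)
R_bar = [[d - 1 - r for r in row] for row in R]                  # reverse Borda scores
beta = [sum(row[j] for row in R) / n for j in range(d)]          # Borda scale
beta_bar = [sum(row[j] for row in R_bar) / n for j in range(d)]  # reverse Borda scale
nega = [sum(row[j] for row in R_bar) for j in range(d)]          # nega = n * beta_bar

print(beta, beta_bar, nega)
```

For this profile the two scales sum to $d - 1 = 2$ for every item, as stated above.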

preference | C ($R$) | B ($R$) | A ($R$) | C ($\bar{R}$) | B ($\bar{R}$) | A ($\bar{R}$)
---|---|---|---|---|---|---
A ≻ B ≻ C | 0 | 1 | 2 | 2 | 1 | 0
A ≻ C ≻ B | 1 | 0 | 2 | 1 | 2 | 0
B ≻ A ≻ C | 0 | 2 | 1 | 2 | 0 | 1
B ≻ C ≻ A | 1 | 2 | 0 | 1 | 0 | 2
Borda scale $\beta$ | 0.5 | 1.25 | 1.25 | | |
reverse Borda scale $\bar{\beta}$ | | | | 1.5 | 0.75 | 0.75
nega $= n \bar{\beta}$ | | | | 6 | 3 | 3

The contingency table of first-order marginals of an observed voting profile $V$ on $d$ items is a square $d \times d$ matrix $M$, where $M(i, j)$ stores the number of times that item $j$ has Borda score $i$ for $i = 0, \ldots, d - 1$, see [ [

1) It has uniform row and column marginals equal to the sample size.

2) We can compute the Borda scale β from it.

3) It reveals the nature of crossing of scores attributed to the items for a given binary partition of the items. For the toy example, consider the partition { C } and { B , A } with attributed scores of { 0 } and { 1,2 } respectively (this is the first step in a riffle shuffle). Then the highlighted cells (marked in bold) in
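The first two properties can be checked on the toy example with a short Python sketch. The function name `first_order_marginals` is hypothetical; the matrix `R` holds the Borda scores of the four toy voters, with columns ordered C, B, A.

```python
def first_order_marginals(R, d):
    """M[i][j] = number of voters who give Borda score i to item j."""
    M = [[0] * d for _ in range(d)]
    for row in R:
        for j, score in enumerate(row):
            M[score][j] += 1
    return M

# Borda scores of the four toy voters (columns: C, B, A).
R = [[0, 1, 2], [1, 0, 2], [0, 2, 1], [1, 2, 0]]
n, d = len(R), 3
M = first_order_marginals(R, d)

row_sums = [sum(row) for row in M]
col_sums = [sum(M[i][j] for i in range(d)) for j in range(d)]
# Borda scale recovered from M: average score of each item.
beta = [sum(i * M[i][j] for i in range(d)) / n for j in range(d)]

print(M, row_sums, col_sums, beta)
```

Both marginals equal the sample size $n = 4$, and the Borda scale computed from $M$ agrees with the one computed directly from $R$.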

Our methodological approach is based on Benzécri’s platform, see [ [

There are three elements in Benzécri’s platform: 1) coding, a kind of pre-processing of data, which is discussed in the following paragraph; 2) eventual supplementation, which here consists in applying TCA and not correspondence analysis (CA), because in the CA case we do not have a result similar to Theorem 1; 3) the question that we are allowed to ask, which here is to explore and visualize rank data.

Within the CA framework, there are two codings of rank data, $R_{\mathrm{double}}$ and $R_{\mathrm{nega}}$.

The first one is the doubled table of size $(2n) \times d$

$$R_{\mathrm{double}} = \begin{pmatrix} R \\ \bar{R} \end{pmatrix}$$

proposed independently by [

Borda score | C | B | A | row sum
---|---|---|---|---
0 | 2 | 1 | 1 | 4
1 | 2 | 1 | 1 | 4
2 | 0 | 2 | 2 | 4
column sum | 4 | 4 | 4 |
Borda scale $\beta$ | 0.5 | 1.25 | 1.25 |

equivalent to the dual scaling of Nishisato coding of rank data, see [

$$P^{1}_{\mathrm{double}} = \frac{1}{t} \begin{pmatrix} r_{ij} - \frac{d-1}{2} \\ -\left( r_{ij} - \frac{d-1}{2} \right) \end{pmatrix},$$

where $t = n d (d - 1)$. The structure of $P^{1}_{\mathrm{double}}$ shows that each row is centered as in Carroll’s multidimensional preference analysis procedure, MDPREF, described in Alvo and Yu (2014, p. 15). In TCA the objective function to maximize is a combinatorial problem, see Equation (3); and the first iteration in TCA of $R_{\mathrm{double}}$ corresponds to computing

$$\delta_1^{\mathrm{double}} = \max_{v \in \{-1, 1\}^n} \left\| (v^t \mid -v^t)\, P^{1}_{\mathrm{double}} \right\|_1 = \max_{v \in \{-1, 1\}^n} \frac{2}{t} \sum_{j=1}^{d} \left| \sum_{i=1}^{n} \left( r_{ij} - \frac{d-1}{2} \right) v_i \right| \tag{11}$$

In the second approach, we summarize $\bar{R}$ by its column total; that is, we create a row named $\mathrm{nega} = n \bar{\beta} = 1'_n \bar{R}$, then we vertically concatenate nega to $R$, thus obtaining

$$R_{\mathrm{nega}} = \begin{pmatrix} R \\ \mathrm{nega} \end{pmatrix}$$

of size $(n + 1) \times d$.

[

$$\delta_1 = \max_{v \in \{-1, 1\}^n} \left\| (v^t \mid -1_n^t)\, P^{1}_{\mathrm{double}} \right\|_1 = \max_{v \in \{-1, 1\}^n} \left\| (v^t \mid -1)\, P^{1}_{\mathrm{nega}} \right\|_1 = \max_{v \in \{-1, 1\}^n} \frac{1}{t} \sum_{j=1}^{d} \left| \sum_{i=1}^{n} \left( r_{ij} - \frac{d-1}{2} \right) (v_i + 1) \right| \tag{12}$$

So, we see that if in (11) the optimal value of $v$ is $1_n$, then $\delta_1^{\mathrm{double}} = \delta_1$; otherwise $\delta_1^{\mathrm{double}} > \delta_1$.

Let

$$v_1 = \arg\max_{v \in \{-1, 1\}^n} \frac{1}{t} \sum_{j=1}^{d} \left| \sum_{i=1}^{n} \left( r_{ij} - \frac{d-1}{2} \right) (v_i + 1) \right| .$$

Define the sets of indices $I_+ = \{ i \mid v_{1i} = 1 \}$ and $I_- = \{ i \mid v_{1i} = -1 \}$, where $v_1 = (v_{1i})$. Then

$$\delta_1 = \frac{2}{t} \sum_{j=1}^{d} \left| \sum_{i \in I_+} \left( r_{ij} - \frac{d-1}{2} \right) \right| \tag{13}$$

shows that the summation in (13) is restricted to the subset of assessors that belong to $I_+$. The subset $I_+$ indexes the voters having the same direction in their votes. Given that we are interested only in the first TCA dimension, all the necessary information is encapsulated in $I_+$, as discussed in [

Furthermore, $\delta_1$ in (13) equals four times the cut norm of $R_{\mathrm{centered}}(i, j) = \frac{1}{t} \left( r_{ij} - \frac{d-1}{2} \right)$, where the cut norm is defined to be

$$\| R_{\mathrm{centered}} \|_{\mathrm{cut}} = \max_{S, T} \frac{1}{t} \sum_{j \in S} \sum_{i \in T} \left( r_{ij} - \frac{d-1}{2} \right) = \frac{1}{t} \sum_{j \in S_+} \sum_{i \in I_+} \left( r_{ij} - \frac{d-1}{2} \right) = \delta_1 / 4 ,$$

where $S \subseteq \{1, \ldots, d\}$ and $T \subseteq I$; it shows that the subsets $I_+$ and $S_+$ are positively associated. For further details see, for instance, [

In the sequel, we will consider only the application of TCA to $R_{\mathrm{nega}}$.
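The relation between the maximand of Equation (13) and the cut norm can be verified numerically on the four-voter toy profile. The following Python sketch is only a brute-force check under that assumption: it enumerates every subset of voters and of items, which is feasible only for very small $n$ and $d$, and the helper names `subsets` and `delta_13` are hypothetical. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import combinations

# Borda scores of the four toy voters (columns: C, B, A).
R = [[0, 1, 2], [1, 0, 2], [0, 2, 1], [1, 2, 0]]
n, d = len(R), len(R[0])
t = n * d * (d - 1)
# Centered scores r_ij - (d - 1)/2; each row sums to zero.
Rc = [[Fraction(2 * r - (d - 1), 2) for r in row] for row in R]

def subsets(k):
    """All subsets of {0, ..., k-1}, as tuples of indices."""
    for size in range(k + 1):
        yield from combinations(range(k), size)

def delta_13(I_plus):
    """Right-hand side of Equation (13) for a candidate index set I_plus."""
    return Fraction(2, t) * sum(abs(sum(Rc[i][j] for i in I_plus)) for j in range(d))

# Maximum of (13) over all candidate sets I_plus.
delta1 = max(delta_13(T) for T in subsets(n))

# Cut norm as written in the text: maximum block sum over item sets S and voter sets T.
cut = max(sum(Rc[i][j] for i in T for j in S)
          for T in subsets(n) for S in subsets(d)) / t

print(delta1, cut)
```

Because every row of the centered table sums to zero, the positive part of the column sums over $I_+$ is exactly half of their total absolute value, which is where the factor four comes from.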

We show the results on the SUSHI data set enumerating n = 5000 preferences of d = 10 sushis, see [

We have the following theorem concerning the first TCA principal factor scores of the voters belonging to a profile $V_1$, $f_1(i)$ for $i = 1, \ldots, n$, where the first principal axis partitions the $d$ items into $d_1$ and $d_2$ parts such that $d = d_1 + d_2$.

Theorem 1

a) The maximum number of distinct clusters of the $n$ voters belonging to $V_1$ on the first TCA principal axis (distinct $f_1(i)$ values for $i \in V_1$) is $d_1 d_2 + 1$.

b) The maximum value that $f_1(i)$ can attain is $\frac{2 d_1 d_2}{d (d - 1)}$.

c) The minimum value that $f_1(i)$ can attain is $-\frac{2 d_1 d_2}{d (d - 1)}$.

d) If the number of distinct clusters is maximum, $d_1 d_2 + 1$, then the gap between two contiguous $f_1(i)$ values is $\frac{4}{d (d - 1)}$.

Remark 1

a) We fix $f_1(\mathrm{nega}) < 0$ to eliminate the sign indeterminacy of the first bilinear term in (10).

b) We partition $V_1$ into $d_1 d_2 + 1$ clusters, $V_1 = \bigcup_{\alpha=1}^{d_1 d_2 + 1} V_{1,\alpha}$, where the voters of the $\alpha$th cluster are characterized by their first TCA factor score; that is, $V_{1,\alpha} = \left\{ i \in V_1 : f_1^{V_1}(i) = \frac{2 d_1 d_2}{d (d - 1)} - (\alpha - 1) \frac{4}{d (d - 1)} \right\}$ for $\alpha = 1, \ldots, d_1 d_2 + 1$.

Example 1: In

Fact 1: by Theorem 1a, the 5000 preferences are clustered into $d_1 d_2 + 1 = 25$ clusters on the first TCA principal axis.

Fact 2: by Theorem 1b, the maximum value of $f_1(i)$ is $48/90$.

Fact 3: by Theorem 1c, the minimum value of $f_1(i)$ is $-48/90$.

Fact 4: by Theorem 1d, the gap separating two contiguous clusters of voters on the first TCA principal axis is $4/90$.
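These four facts follow directly from the formulas of Theorem 1, as the following Python sketch illustrates. It assumes the partition $(d_1, d_2) = (4, 6)$ of the $d = 10$ sushi items, which is the partition consistent with $d_1 d_2 + 1 = 25$; the function name `theorem1_scores` is hypothetical.

```python
from fractions import Fraction

def theorem1_scores(d1, d2):
    """Attainable first TCA factor scores when the number of clusters is maximal."""
    d = d1 + d2
    top = Fraction(2 * d1 * d2, d * (d - 1))   # Theorem 1b: maximum score
    gap = Fraction(4, d * (d - 1))             # Theorem 1d: gap between clusters
    return [top - k * gap for k in range(d1 * d2 + 1)]   # Theorem 1a: d1*d2 + 1 values

# Assumed partition for the SUSHI data: d1 = 4 and d2 = 6.
scores = theorem1_scores(4, 6)
print(len(scores), scores[0], scores[-1], scores[0] - scores[1])
```

The list runs from $48/90$ down to $-48/90$ in steps of $4/90$; fractions are printed in lowest terms, so $48/90$ appears as $8/15$.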

A cluster of voters defined in Remark 1b, $V_{1,\alpha}$ for $\alpha = 1, \ldots, d_1 d_2 + 1$, can be classified as coherent or incoherent; this is discussed in the next section.

The following definition characterizes a coherent cluster.

Definition 1 (Coherency of a cluster of voters $V_{1,\alpha}$ for $\alpha = 1, \ldots, d_1 d_2 + 1$)

A cluster of voters $V_{1,\alpha} \subseteq V_1$ is coherent if $f_1^{V_{1,\alpha}}(v) = \frac{2 d_1 d_2}{d (d - 1)} - (\alpha - 1) \frac{4}{d (d - 1)}$ for all $v \in V_{1,\alpha}$, where $f_1^{V_{1,\alpha}}(i)$ is the first TCA factor score of the voter $i \in V_{1,\alpha}$ obtained from TCA of the subprofile $V_{1,\alpha}$.

Remark 2:

a) It is important to distinguish between $f_1^{V_1}(i)$ for $i = 1, \ldots, |V_1|$, where $n = |V_1|$, and $f_1^{V_{1,\alpha}}(i)$ for $i = 1, \ldots, |V_{1,\alpha}|$, where $|V_{1,\alpha}|$ represents the sample size of the cluster $V_{1,\alpha}$.

b) Definition 1 implies that a cluster $V_{1,\alpha}$ is coherent when, for all voters $i \in V_{1,\alpha}$, the first TCA factor score $f_1^{V_{1,\alpha}}(i)$ does not depend on the voter $i$ but only on $(\alpha, d_1, d_2)$.

Corollary 1: It follows from Remark 1a and Equation (13) that a necessary, but not sufficient, condition for a cluster $V_{1,\alpha}$ to be coherent is that its first TCA factor score obtained from TCA of $V_1$ is strictly positive; that is, $0 < f_1^{V_1}(i)$ for $i \in V_{1,\alpha}$.

Example 2: Figures 3-9 show the coherency of the clusters of voters $V_{1,\alpha}$ for $\alpha = 1, \ldots, 7$, where dots represent clusters of voters; while

Proposition 1: For a voting profile $V$, $\delta_1(V) \ge |f_1(\mathrm{nega})|$, where $\delta_1(V)$ is the first TCA dispersion value obtained from TCA of $V$, and $f_1(\mathrm{nega})$ is the first TCA factor score of the row nega.

The equality in Proposition 1 is attained only for coherent clusters as shown in the following result.

Proposition 2: The first TCA dispersion value of a coherent cluster $\mathrm{cohC}_1(\alpha)$ satisfies

$$\delta_1(\mathrm{cohC}_1(\alpha)) = \left| f_1^{V_{1,\alpha}}(\mathrm{nega}) \right| = \frac{2 d_1 d_2}{d (d - 1)} - (\alpha - 1) \frac{4}{d (d - 1)}$$

Example 3: Proposition 2 can be observed by looking at columns 3 and 4 of

$\alpha$ | $\lvert V_{1,\alpha} \rvert$ | description of $V_{1,\alpha}$ | $\delta_1(V_{1,\alpha})$ | $T_{v \in V_{1,\alpha}}(\tau_{J_1}(S_1))$ | $Cross(V_{1,\alpha})$
---|---|---|---|---|---
1 | 314 | $\{ i : f_1^{V_1}(i) = 48/90 \}$ | 48/90 | 6 | 0
2 | 235 | $\{ i : f_1^{V_1}(i) = 44/90 \}$ | 44/90 | 7 | 1/12
3 | 326 | $\{ i : f_1^{V_1}(i) = 40/90 \}$ | 40/90 | 8 | 2/12
4 | 315 | $\{ i : f_1^{V_1}(i) = 36/90 \}$ | 36/90 | 9 | 3/12
5 | 452 | $\{ i : f_1^{V_1}(i) = 32/90 \}$ | 32/90 | 10 | 4/12
6 | 375 | $\{ i : f_1^{V_1}(i) = 28/90 \}$ | 28/90 | 11 | 5/12
7 | 401 | $\{ i : f_1^{V_1}(i) = 24/90 \}$ | 24/90 | 12 | 6/12

For the incoherent cluster $V_{1,8}$, with sample size $|V_{1,8}| = 335$, we observe $V_{1,8} = \{ i : f_1^{V_1}(i) = 20/90 = 0.222 \}$ and, by Proposition 1, $\delta_1(V_{1,8}) = 0.2354 > 2/9$. This means that the 335 voters belonging to $V_{1,8}$ form a cluster within the whole sample of 5000 voters, but taken on their own they do not form a coherent cluster.

Interpretability of a Coherent Cluster

The following result shows that for coherent clusters, the first TCA dimension can be interpreted as a Borda scaled factor.

Proposition 3: The first TCA column factor score of the item $j$, $g_1(j)$, is an affine function of the Borda scale $\beta(j)$; that is, $g_1(j) = \frac{2}{d - 1} \beta(j) - 1$ for $j = 1, \ldots, d$. Or $\mathrm{corr}(g_1, \beta) = 1$.

Remark 3:

The first TCA principal factor score of item $j$ for $j = 1, \ldots, d$ is bounded: $-1 \le g_1(j) \le 1$, because $0 \le \beta(j) \le d - 1$.
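The affine map of Proposition 3 and the bounds of Remark 3 can be illustrated with a small Python sketch. The function name `g1_from_beta` is hypothetical; the four Borda scale values are those reported for the items j10, j7, j4 and j9 in the first coherent cluster of the SUSHI data, with $d = 10$.

```python
def g1_from_beta(beta_j, d):
    """Proposition 3: first TCA item score as an affine function of the Borda scale."""
    return 2 * beta_j / (d - 1) - 1

d = 10
# Borda scale of four sushi items in the first coherent cluster (values from the text).
beta = {"j10": 0.66, "j7": 1.31, "j4": 1.87, "j9": 2.16}
g1 = {item: g1_from_beta(b, d) for item, b in beta.items()}

for item, score in g1.items():
    print(item, round(score, 3))

# End points of the Borda scale map to the bounds of Remark 3.
lower, upper = g1_from_beta(0, d), g1_from_beta(d - 1, d)
```

Since the map is increasing, the items are ordered identically by $g_1$ and by $\beta$, which is why the first TCA dimension of a coherent cluster can be read as a Borda scale.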

Example 4:

We now ask: what are the differences among the seven coherent clusters? The answer is the riffle shuffling of the scores of the items, which we discuss next.

[

Borda scale | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8
---|---|---|---|---|---|---|---|---|---|---
$\beta(\mathrm{cohC}_1(1))$ | 0.66 | 1.31 | 1.87 | 2.16 | 5.55 | 5.78 | 6.03 | 6.58 | 7.31 | 7.75
$\beta(\mathrm{cohC}_1(2))$ | 0.69 | 1.29 | 2.44 | 2.59 | 5.47 | 5.43 | 5.50 | 6.35 | 7.38 | 7.86
$\beta(\mathrm{cohC}_1(3))$ | 0.65 | 1.60 | 3.04 | 2.71 | 5.25 | 5.25 | 5.39 | 6.26 | 7.17 | 7.68
$\beta(\mathrm{cohC}_1(4))$ | 0.83 | 1.79 | 3.10 | 3.28 | 5.30 | 4.74 | 5.22 | 6.34 | 6.76 | 7.64
$\beta(\mathrm{cohC}_1(5))$ | 1.12 | 2.02 | 3.26 | 3.60 | 5.70 | 4.74 | 5.27 | 5.75 | 5.99 | 7.60
$\beta(\mathrm{cohC}_1(6))$ | 1.12 | 2.33 | 3.62 | 3.93 | 5.68 | 4.98 | 5.21 | 5.33 | 5.25 | 7.56
$\beta(\mathrm{cohC}_1(7))$ | 1.42 | 2.74 | 3.84 | 4.00 | 5.45 | 4.70 | 5.02 | 5.26 | 5.20 | 7.38

of independence of two subsets of items to riffled independence to uncover the structure of rank data. Within the framework of data analysis of preferences, exploratory riffle shuffling can be described in the following way. We have two sets: $J$, a set of $d$ distinct items, and $S$, a set of $d$ Borda scores. We partition both sets into two disjoint subsets of sizes $d_1$ and $d_2 = d - d_1$; that is, $J = J_1 \cup J_2$ with $J_1 = \{ j_1, j_2, \ldots, j_{d_1} \}$ and $S = S_1 \cup S_2$ with $S_1 = \{ 0, 1, \ldots, d_1 - 1 \}$. Riffle shuffling consists of two steps. In the first step, we attribute the scores of $S_1$ to $J_1$ and the scores of $S_2$ to $J_2$. In the second step, we permute some scores attributed to $J_1$ with the same number of scores attributed to $J_2$. The second step can be mathematically described as an application of a permutation $\tau$, such that $\tau_J(S_1, S_2) = (\tau_{J_1}(S_1), \tau_{J_2}(S_2))$. We interpret $\tau_{J_1}(S_1)$ as the set of scores attributed to $J_1$, and $\tau_{J_2}(S_2)$ as the set of scores attributed to $J_2$.

Example 5:

Further, we denote by $|\tau_{J_1}(S_1)|$ the number of voters who have done the riffle shuffle $(\tau_{J_1}(S_1), \tau_{J_2}(S_2))$. So $|\tau_{J_1}(S_1) = \{0, 1, 2, 3\}| = 4$, $|\{0, 1, 2, 5\}| = 2$ and $|\{0, 1, 4, 5\}| = 1$. The permuted scores between the two blocks of items are in bold in

Remark 4: A useful observation that we get from Example 5 is that we can concentrate our study either on $J_1$ or on $J_2$: if we know $\tau_{J_1}(S_1)$, the scores attributed to $J_1$, we can deduce $\tau_{J_2}(S_2)$, the scores attributed to $J_2$, because of mutual exclusivity constraints ensuring that any two items, say a and b, never map to the same rank by a voter.

voter | a | b | c | d | e | f | g | h | i | j
---|---|---|---|---|---|---|---|---|---|---
1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
2 | 0 | 2 | 3 | 1 | 6 | 4 | 5 | 8 | 7 | 9
3 | 3 | 2 | 1 | 0 | 5 | 6 | 4 | 9 | 7 | 8
4 | 2 | 1 | 0 | 3 | 8 | 7 | 9 | 4 | 5 | 6
5 | 0 | 1 | 2 | 5 | 4 | 3 | 6 | 7 | 8 | 9
6 | 1 | 2 | 5 | 0 | 3 | 6 | 4 | 9 | 7 | 8
7 | 0 | 4 | 5 | 1 | 6 | 8 | 9 | 2 | 7 | 3

A simple measure of the magnitude of the $(d_1, d_2)$ riffle shuffling of a voter $i$ is the sum of the Borda scores attributed to the items in $J_1$; that is,

$$T_i(\tau_{J_1}(S_1)) = \sum_{j \in J_1} r_{ij},$$

where $r_{ij}$ is the Borda score attributed to item $j$ by voter $i$. In

For relatively small sample sizes, it is easy to enumerate the different types of ( d 1 , d 2 ) riffle shuffles. For relatively large sample sizes, we use the contingency table of first-order marginals, that we discuss next.
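For a small profile, the enumeration can be sketched in Python as follows. The sketch uses the seven voters and ten items a to j of the example table, and assumes $J_1 = \{a, b, c, d\}$, that is $d_1 = 4$; the variable names are hypothetical.

```python
from collections import Counter

items = "abcdefghij"
J1 = "abcd"   # assumed first block of items (d1 = 4)

# Borda scores of the seven voters of the example table, in item order a..j.
R = {
    1: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    2: [0, 2, 3, 1, 6, 4, 5, 8, 7, 9],
    3: [3, 2, 1, 0, 5, 6, 4, 9, 7, 8],
    4: [2, 1, 0, 3, 8, 7, 9, 4, 5, 6],
    5: [0, 1, 2, 5, 4, 3, 6, 7, 8, 9],
    6: [1, 2, 5, 0, 3, 6, 4, 9, 7, 8],
    7: [0, 4, 5, 1, 6, 8, 9, 2, 7, 3],
}

def scores_on_J1(row):
    """Sorted tuple of the Borda scores that a voter attributes to the items of J1."""
    return tuple(sorted(row[items.index(j)] for j in J1))

# Type of riffle shuffle of each voter, and number of voters per type.
shuffle_type = {voter: scores_on_J1(row) for voter, row in R.items()}
counts = Counter(shuffle_type.values())

# Magnitude T_i: sum of the Borda scores attributed to J1 by voter i.
T = {voter: sum(scores) for voter, scores in shuffle_type.items()}

print(counts)
print(T)
```

Voters with $T_i = 6$ have no crossing of scores between the two blocks, while larger values of $T_i$ indicate that some scores of $S_2$ have been attributed to $J_1$.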

Types of $(d_1, d_2)$ Riffle Shufflings in a Coherent Cluster

The contingency table of first-order marginals of an observed voting profile $V$ on $d$ items is a square $d \times d$ matrix $M$, where $M(i, j)$ stores the number of times that item $j$ has Borda score $i$ for $i = 0, \ldots, d - 1$, see subsection 3.2. It helps us to observe the types of $(d_1, d_2)$ riffle shufflings in a coherent cluster, as we explain in Example 6.

Example 6: Tables 6-12 display $M_{1,\alpha}$ for $\alpha = 1, \ldots, 7$, the contingency tables of first-order marginals of the seven coherent clusters of the SUSHI data, respectively. We observe the following:

Each one of them reveals the nature of the riffle shuffles of its coherent cluster,

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 174 | 92 | 37 | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 314 |

1 | 88 | 88 | 76 | 62 | 0 | 0 | 0 | 0 | 0 | 0 | 314 |

2 | 38 | 78 | 91 | 107 | 0 | 0 | 0 | 0 | 0 | 0 | 314 |

3 | 14 | 56 | 110 | 134 | 0 | 0 | 0 | 0 | 0 | 0 | 314 |

4 | 0 | 0 | 0 | 0 | 92 | 78 | 73 | 38 | 21 | 12 | 314 |

5 | 0 | 0 | 0 | 0 | 95 | 77 | 59 | 42 | 23 | 18 | 314 |

6 | 0 | 0 | 0 | 0 | 47 | 63 | 70 | 65 | 37 | 32 | 314 |

7 | 0 | 0 | 0 | 0 | 35 | 49 | 45 | 72 | 68 | 45 | 314 |

8 | 0 | 0 | 0 | 0 | 32 | 27 | 32 | 62 | 87 | 74 | 314 |

9 | 0 | 0 | 0 | 0 | 13 | 20 | 35 | 35 | 78 | 133 | 314 |

β | 0.66 | 1.31 | 1.87 | 2.16 | 5.55 | 5.78 | 6.03 | 6.58 | 7.31 | 7.75 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 127 | 70 | 32 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 235 |

1 | 69 | 82 | 38 | 46 | 0 | 0 | 0 | 0 | 0 | 0 | 235 |

2 | 32 | 56 | 62 | 85 | 0 | 0 | 0 | 0 | 0 | 0 | 235 |

3 | 0 | 0 | 0 | 0 | 55 | 59 | 74 | 29 | 15 | 3 | 235 |

4 | 7 | 27 | 103 | 98 | 0 | 0 | 0 | 0 | 0 | 0 | 235 |

5 | 0 | 0 | 0 | 0 | 68 | 60 | 42 | 41 | 11 | 13 | 235 |

6 | 0 | 0 | 0 | 0 | 49 | 53 | 35 | 48 | 32 | 18 | 235 |

7 | 0 | 0 | 0 | 0 | 26 | 35 | 42 | 48 | 40 | 44 | 235 |

8 | 0 | 0 | 0 | 0 | 28 | 15 | 22 | 44 | 70 | 56 | 235 |

9 | 0 | 0 | 0 | 0 | 9 | 13 | 20 | 25 | 67 | 101 | 235 |

β | 0.69 | 1.29 | 2.44 | 2.59 | 5.47 | 5.43 | 5.50 | 6.35 | 7.38 | 7.86 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 182 | 97 | 33 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 326 |

1 | 104 | 100 | 46 | 76 | 0 | 0 | 0 | 0 | 0 | 0 | 326 |

2 | 19 | 41 | 41 | 70 | 40 | 37 | 46 | 17 | 12 | 3 | 326 |

3 | 16 | 35 | 53 | 51 | 39 | 48 | 43 | 22 | 13 | 6 | 326 |

4 | 3 | 29 | 62 | 61 | 40 | 41 | 43 | 32 | 9 | 6 | 326 |

5 | 2 | 24 | 91 | 54 | 39 | 43 | 23 | 24 | 16 | 10 | 326 |

6 | 0 | 0 | 0 | 0 | 70 | 65 | 51 | 60 | 45 | 35 | 326 |

7 | 0 | 0 | 0 | 0 | 53 | 36 | 52 | 74 | 56 | 55 | 326 |

8 | 0 | 0 | 0 | 0 | 35 | 33 | 33 | 57 | 80 | 88 | 326 |

9 | 0 | 0 | 0 | 0 | 10 | 23 | 35 | 40 | 95 | 123 | 326 |

β | 0.65 | 1.60 | 3.04 | 2.71 | 5.25 | 5.25 | 5.39 | 6.26 | 7.17 | 7.68 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 164 | 93 | 44 | 14 | 0 | 0 | 0 | 0 | 0 | 0 | 315 |

1 | 78 | 71 | 30 | 36 | 10 | 31 | 32 | 9 | 16 | 2 | 315 |

2 | 44 | 53 | 49 | 50 | 32 | 39 | 27 | 10 | 8 | 3 | 315 |

3 | 22 | 52 | 58 | 87 | 24 | 20 | 24 | 15 | 7 | 6 | 315 |

4 | 5 | 17 | 35 | 43 | 51 | 61 | 41 | 25 | 23 | 14 | 315 |

5 | 1 | 11 | 61 | 46 | 43 | 42 | 34 | 35 | 26 | 16 | 315 |
6 | 1 | 18 | 38 | 39 | 52 | 37 | 37 | 49 | 28 | 16 | 315 |

7 | 0 | 0 | 0 | 0 | 49 | 44 | 51 | 61 | 54 | 56 | 315 |

8 | 0 | 0 | 0 | 0 | 37 | 28 | 47 | 72 | 69 | 52 | 315 |

9 | 0 | 0 | 0 | 0 | 17 | 13 | 22 | 39 | 84 | 140 | 315 |

β | 0.83 | 1.79 | 3.10 | 3.28 | 5.30 | 4.74 | 5.22 | 6.34 | 6.76 | 7.64 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 188 | 99 | 36 | 10 | 6 | 25 | 30 | 22 | 34 | 2 | 452 |

1 | 132 | 109 | 69 | 57 | 12 | 30 | 21 | 13 | 9 | 0 | 452 |

2 | 69 | 88 | 59 | 67 | 28 | 46 | 40 | 28 | 20 | 7 | 452 |

3 | 39 | 72 | 85 | 92 | 34 | 44 | 21 | 31 | 25 | 9 | 452 |

4 | 12 | 35 | 76 | 81 | 50 | 57 | 53 | 36 | 38 | 14 | 452 |

5 | 6 | 29 | 63 | 72 | 63 | 64 | 53 | 41 | 40 | 21 | 452 |

6 | 3 | 11 | 34 | 36 | 71 | 68 | 64 | 75 | 45 | 45 | 452 |

7 | 3 | 9 | 30 | 37 | 87 | 45 | 62 | 73 | 57 | 49 | 452 |

8 | 0 | 0 | 0 | 0 | 71 | 41 | 47 | 72 | 95 | 126 | 452 |

9 | 0 | 0 | 0 | 0 | 30 | 32 | 61 | 61 | 89 | 179 | 452 |

β | 1.12 | 2.02 | 3.26 | 3.60 | 5.70 | 4.74 | 5.27 | 5.75 | 5.99 | 7.60 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 151 | 81 | 31 | 14 | 8 | 14 | 19 | 18 | 39 | 0 | 375 |

1 | 112 | 79 | 44 | 33 | 12 | 21 | 26 | 25 | 19 | 4 | 375 |

2 | 66 | 72 | 52 | 63 | 16 | 24 | 29 | 22 | 28 | 3 | 375 |

3 | 26 | 52 | 68 | 68 | 22 | 45 | 31 | 29 | 25 | 9 | 375 |

4 | 8 | 26 | 42 | 37 | 52 | 67 | 41 | 45 | 45 | 12 | 375 |

5 | 8 | 27 | 56 | 61 | 44 | 49 | 42 | 36 | 28 | 24 | 375 |

6 | 3 | 21 | 36 | 52 | 64 | 42 | 49 | 50 | 29 | 29 | 375 |

7 | 0 | 7 | 25 | 31 | 70 | 43 | 44 | 59 | 45 | 51 | 375 |

8 | 1 | 10 | 21 | 16 | 66 | 33 | 46 | 49 | 45 | 88 | 375 |

9 | 0 | 0 | 0 | 0 | 21 | 37 | 48 | 42 | 72 | 155 | 375 |

β | 1.12 | 2.33 | 3.62 | 3.93 | 5.68 | 4.98 | 5.21 | 5.33 | 5.25 | 7.56 |

Borda score | j10 | j7 | j4 | j9 | j3 | j1 | j2 | j6 | j5 | j8 | sum |
---|---|---|---|---|---|---|---|---|---|---|---|

0 | 129 | 65 | 46 | 14 | 11 | 24 | 35 | 23 | 52 | 2 | 401 |

1 | 122 | 77 | 53 | 35 | 14 | 28 | 24 | 19 | 25 | 4 | 401 |

2 | 74 | 69 | 50 | 51 | 24 | 41 | 36 | 31 | 19 | 6 | 401 |

3 | 36 | 51 | 31 | 66 | 44 | 48 | 39 | 46 | 30 | 10 | 401 |

4 | 24 | 50 | 40 | 71 | 51 | 45 | 38 | 37 | 27 | 18 | 401 |

5 | 7 | 45 | 49 | 56 | 43 | 53 | 39 | 48 | 32 | 29 | 401 |

6 | 5 | 23 | 73 | 68 | 42 | 50 | 42 | 33 | 40 | 25 | 401 |

7 | 3 | 10 | 31 | 28 | 85 | 39 | 43 | 54 | 51 | 57 | 401 |

8 | 1 | 3 | 17 | 5 | 58 | 46 | 47 | 65 | 57 | 102 | 401 |

9 | 0 | 8 | 11 | 7 | 29 | 27 | 58 | 45 | 68 | 148 | 401 |

β | 1.42 | 2.74 | 3.84 | 4.00 | 5.45 | 4.70 | 5.02 | 5.26 | 5.20 | 7.38 |

$\mathrm{cohC}_1(\alpha)$ | scores given to $\{ j10, j7, j4, j9 \}$ | sum of scores | count
---|---|---|---
$\mathrm{cohC}_1(1)$ | {0, 1, 2, 3} | 6 | 314
$\mathrm{cohC}_1(2)$ | {0, 1, 2, 4} | 7 | 235
$\mathrm{cohC}_1(3)$ | {0, 1, 2, 5} | 8 | 171
 | {0, 1, 4, 3} | 8 | 155
$\mathrm{cohC}_1(4)$ | {0, 1, 2, 6} | 9 | 96
 | {0, 1, 5, 3} | 9 | 119
 | {0, 4, 2, 3} | 9 | 100
$\mathrm{cohC}_1(5)$ | {0, 1, 2, 7} | 10 | 79
 | {0, 1, 6, 3} | 10 | 84
 | {0, 5, 2, 3} | 10 | 85
 | {4, 1, 2, 3} | 10 | 119
 | {0, 1, 4, 5} | 10 | 85
$\mathrm{cohC}_1(6)$ | {0, 1, 2, 8} | 11 | 48
 | {0, 1, 7, 3} | 11 | 63
 | {0, 6, 2, 3} | 11 | 53
 | {5, 1, 2, 3} | 11 | 98
 | {0, 1, 4, 6} | 11 | 59
 | {0, 4, 2, 5} | 11 | 54
$\mathrm{cohC}_1(7)$ | {0, 1, 2, 9} | 12 | 26
 | {0, 1, 8, 3} | 12 | 26
 | {0, 7, 2, 3} | 12 | 33
 | {6, 1, 2, 3} | 12 | 43
 | {0, 4, 5, 3} | 12 | 38
 | {0, 4, 2, 6} | 12 | 39
 | {0, 1, 4, 7} | 12 | 49
 | {0, 1, 5, 6} | 12 | 82
 | {4, 1, 2, 5} | 12 | 65

which are summarized in

The counts in

a) $\mathrm{cohC}_1(1)$

$|\{0,1,2,3\}| = 314$, which is the number of 0s attributed to $J_1$ in $M_{1,1}$. Among the $M_{1,\alpha}$ for $\alpha = 1, \dots, 7$, note that $M_{1,1}$ is the only contingency table of first-order marginals which is block diagonal.

b) $\mathrm{cohC}_1(2)$

$|\{0,1,2,4\}| = 235$, which is the number of 4s attributed to $J_1$ in $M_{1,2}$.

c) $\mathrm{cohC}_1(3)$

$|\{0,1,2,5\}| = 171$, which is the number of 5s attributed to $J_1$ in $M_{1,3}$.

$|\{0,1,4,3\}| = 155$, which is the number of 4s attributed to $J_1$ in $M_{1,3}$.

d) $\mathrm{cohC}_1(4)$

$|\{0,1,2,6\}| = 96$, which is the number of 6s attributed to $J_1$ in $M_{1,4}$.

$|\{0,1,5,3\}| = 119$, which is the number of 5s attributed to $J_1$ in $M_{1,4}$.

$|\{0,4,2,3\}| = 100$, which is the number of 4s attributed to $J_1$ in $M_{1,4}$.

e) $\mathrm{cohC}_1(5)$

$|\{0,1,2,7\}| = 79$, which is the number of 7s attributed to $J_1$ in $M_{1,5}$.

$|\{0,1,6,3\}| = 84$, which is the number of 6s attributed to $J_1$ in $M_{1,5}$.

$|\{0,5,2,3\}| = 85$, which is the number of 1s not attributed to $J_1$ in $M_{1,5}$.

$|\{0,1,4,5\}| + |\{0,5,2,3\}| = 170$, which is the total number of 5s attributed to $J_1$ in $M_{1,5}$; so $|\{0,1,4,5\}| = 170 - 85 = 85$.

$|\{4,1,2,3\}| = 119$, which is the number of 0s not attributed to $J_1$ in $M_{1,5}$.

f) $\mathrm{cohC}_1(6)$

$|\{0,1,2,8\}| = 48$, which is the number of 8s attributed to $J_1$ in $M_{1,6}$.

$|\{0,1,7,3\}| = 63$, which is the number of 7s attributed to $J_1$ in $M_{1,6}$.

$|\{5,1,2,3\}| = 98$, which is the number of 0s not attributed to $J_1$ in $M_{1,6}$.

$|\{0,4,2,5\}| = 152 - 98 = 54$, where 152 is the total number of 5s attributed to $J_1$ in $M_{1,6}$.

$|\{0,1,4,6\}| = 113 - 54 = 59$, where 113 is the total number of 4s attributed to $J_1$ in $M_{1,6}$.

$|\{0,6,2,3\}| = 112 - 59 = 53$, where 112 is the total number of 6s attributed to $J_1$ in $M_{1,6}$.

g) $\mathrm{cohC}_1(7)$

$|\{0,1,2,9\}| = 26$, which is the number of 9s attributed to $J_1$ in $M_{1,7}$.

$|\{0,1,8,3\}| = 26$, which is the number of 8s attributed to $J_1$ in $M_{1,7}$.

For the remaining counts, we have to solve the following system of 7 linear equations, where $u = |\{0,7,2,3\}|$, $t = |\{0,4,5,3\}|$, $s = |\{0,4,2,6\}|$, $w = |\{0,1,4,7\}|$, $z = |\{0,1,5,6\}|$, $x = |\{6,1,2,3\}|$, and $y = |\{4,1,2,5\}|$.

$x + y = 147$, which is the number of 0s not attributed to $J_1$ in $M_{1,7}$.

$u + w = 72$, which is the number of 7s attributed to $J_1$ in $M_{1,7}$.

$s + z + x = 169$, which is the number of 6s attributed to $J_1$ in $M_{1,7}$.

$t + z + y = 157$, which is the number of 5s attributed to $J_1$ in $M_{1,7}$.

$t + s + w + y = 185$, which is the number of 4s attributed to $J_1$ in $M_{1,7}$.

$u + t + x = 158$, which is the number of 3s attributed to $J_1$ in $M_{1,7}$.

$u + s + x + y = 218$, which is the number of 2s attributed to $J_1$ in $M_{1,7}$.

The following $(d_1, d_2)$ crossing index is based on the internal dispersion of a voting profile.

Definition 3: For a voting profile $V$ we define its crossing index to be

$$\mathrm{Cross}(V) = 1 - \frac{\delta_1(V_{d_1,d_2})}{\max_V \delta_1(V_{d_1,d_2})} = 1 - \frac{\delta_1(V_{d_1,d_2})}{\dfrac{2 d_1 d_2}{d(d-1)}} \quad \text{by Proposition 2},$$

where $\delta_1(V_{d_1,d_2})$ is the first taxicab dispersion obtained from TCA of $V$ and $(d_1, d_2)$ represents the optimal TCA binary partition of the $d$ items of $V$ such that $d = d_1 + d_2$.

Proposition 4: The crossing index of a coherent cluster is

$$\mathrm{Cross}(\mathrm{cohC}(\alpha)) = \frac{2(\alpha - 1)}{d_1 d_2}.$$

Example 7: The last column in

$$\mathrm{Cross}(V_{1,8}) = 1 - \frac{0.2354}{2 \times 5 \times 5 / (10 \times 9)} = 1 - 0.4237 = 0.5763.$$
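The arithmetic of Definition 3 can be checked with a few lines of Python. This is only a sketch: the function names `max_dispersion` and `cross_index` are illustrative and are not part of the TaxicabCA package, and the first taxicab dispersion is assumed to have been computed beforehand by TCA.

```python
def max_dispersion(d1: int, d2: int) -> float:
    """Upper bound 2*d1*d2 / (d*(d-1)) of the first taxicab dispersion."""
    d = d1 + d2
    return 2 * d1 * d2 / (d * (d - 1))


def cross_index(delta1: float, d1: int, d2: int) -> float:
    """Crossing index of Definition 3, given a precomputed dispersion delta1."""
    return 1 - delta1 / max_dispersion(d1, d2)


# Values quoted in Example 7: delta1 = 0.2354 with d1 = d2 = 5.
print(round(cross_index(0.2354, 5, 5), 4))  # 0.5763
```

A dispersion equal to the upper bound gives an index of 0 (no crossing), and the index grows as the dispersion shrinks.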

Our aim is to explore a given voting profile $V$ by uncovering its coherent mixture groups, see Equation (1); that is, $V = \bigcup_{g=1}^{G} \mathrm{cohG}(g) \cup \mathrm{noisyG}$, where $G$ represents the number of coherent groups and $\mathrm{cohG}(g)$ is the $g$th coherent group. The computation is done by an iterative procedure in $n_G$ steps, for $n_G \ge G$, that we describe:

For $g = 1$: let $V_1 = V$; compute $\mathrm{cohG}(1)$ from $V_1$, then partition $V_1 = V_2 \cup \mathrm{cohG}(1)$;

For $g = 2$: compute $\mathrm{cohG}(2)$ from $V_2$, then partition $V_2 = V_3 \cup \mathrm{cohG}(2)$;

By continuing the above procedure, after $n_G$ steps, we get $V = \bigcup_{g=1}^{n_G} \mathrm{cohG}(g)$.

However, some of the higher-order coherent groups may have relatively small sample sizes; so, by considering these as outliers, we lump them together, thus forming the noisy group denoted by $\mathrm{noisyG}$ in Equation (1).
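The control flow of this peeling procedure can be sketched in Python as follows. The sketch makes two assumptions that are not in the text: the TCA-based extraction of a coherent group is abstracted as a user-supplied callable `find_group` (hypothetical) returning the indices of the voters it selects, and "relatively small" is made concrete through a hypothetical threshold `min_size`. The function `toy_group` is only a stand-in used to exercise the loop; it is not TCA.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Ranking = Tuple[int, ...]


def peel_coherent_groups(
    profile: Sequence[Ranking],
    find_group: Callable[[List[Ranking]], List[int]],
    min_size: int,
) -> Tuple[List[List[Ranking]], List[Ranking]]:
    """Repeatedly extract a group from the remaining voters (V_g -> V_{g+1})."""
    remaining = list(profile)
    groups: List[List[Ranking]] = []
    noisy: List[Ranking] = []
    while remaining:
        chosen = set(find_group(remaining))
        if not chosen:  # nothing extracted: treat the rest as noise
            noisy.extend(remaining)
            break
        group = [r for k, r in enumerate(remaining) if k in chosen]
        remaining = [r for k, r in enumerate(remaining) if k not in chosen]
        # Small groups are lumped into the noisy group (assumed threshold).
        if len(group) >= min_size:
            groups.append(group)
        else:
            noisy.extend(group)
    return groups, noisy


def toy_group(rankings: List[Ranking]) -> List[int]:
    """Hypothetical stand-in for TCA: voters sharing the most common first entry."""
    top = Counter(r[0] for r in rankings).most_common(1)[0][0]
    return [k for k, r in enumerate(rankings) if r[0] == top]


profile = [(0, 1, 2)] * 5 + [(1, 0, 2)] * 3 + [(2, 1, 0)]
groups, noisy = peel_coherent_groups(profile, toy_group, min_size=2)
print([len(g) for g in groups], noisy)  # [5, 3] [(2, 1, 0)]
```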

Let us recall the definition of a coherent group given in Equation (2):

$$\mathrm{cohG}(g) = \bigcup_{\alpha=1}^{c_g} \mathrm{cohC}_g(\alpha) \quad \text{for } g = 1, \dots, G;$$

that is, a coherent group is the union of its coherent clusters. This implies that the sample size of $\mathrm{cohG}(g)$ equals the sum of the sample sizes of its coherent clusters:

$$|\mathrm{cohG}(g)| = \sum_{\alpha=1}^{c_g} |\mathrm{cohC}_g(\alpha)|.$$

As an example, for the SUSHI data, from the 2^{nd} column of

$$|\mathrm{cohG}(1)| = \sum_{\alpha=1}^{c_1 = 7} |\mathrm{cohC}_1(\alpha)| = 2418.$$

Furthermore, $\mathrm{cohG}(1)$ is composed of 27 observed riffle shuffles summarized in

The next result shows important characteristics of a coherent group inherited from its coherent clusters.

Theorem 2: (Properties of a coherent group $\mathrm{cohG}(g)$)

a) The first principal column factor score $g_1$ of the $d$ items in a coherent group is the weighted average of the first principal column factor scores $g_1$ of the $d$ items of its coherent clusters; that is, for $j = 1, \dots, d$,

$$g_1(j \in \mathrm{cohG}(g)) = \sum_{\alpha=1}^{c_g} \frac{|\mathrm{cohC}_g(\alpha)|}{|\mathrm{cohG}(g)|}\, g_1(j \in \mathrm{cohC}_g(\alpha))$$

$$= \frac{2}{d-1} \sum_{\alpha=1}^{c_g} \frac{|\mathrm{cohC}_g(\alpha)|}{|\mathrm{cohG}(g)|}\, \beta(j \in \mathrm{cohC}_g(\alpha)) - 1 \quad \text{by Proposition 3}.$$

And $\mathrm{corr}(g_1(\mathrm{cohG}(g)), \beta(\mathrm{cohG}(g))) = 1$.

b) The first TCA dispersion value of a coherent group is the weighted average of the first TCA dispersion values of its coherent clusters; that is,

$$\delta_1(\mathrm{cohG}(g)) = \sum_{\alpha=1}^{c_g} \frac{|\mathrm{cohC}_g(\alpha)|}{|\mathrm{cohG}(g)|}\, \delta_1(\mathrm{cohC}_g(\alpha)).$$

c) The crossing index of a coherent group is the weighted average of the crossing indices of its coherent clusters; that is,

$$\mathrm{Cross}(\mathrm{cohG}(g)) = \sum_{\alpha=1}^{c_g} \frac{|\mathrm{cohC}_g(\alpha)|}{|\mathrm{cohG}(g)|}\, \mathrm{Cross}(\mathrm{cohC}_g(\alpha)).$$
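Part c can be illustrated numerically on the first SUSHI coherent group. In the sketch below, the cluster sizes are obtained by summing the counts of the 27 riffle shuffles within each of the seven coherent clusters listed earlier, and $d_1 = 4$, $d_2 = 6$ are taken from $|J_1| = 4$ and $|J_2| = 6$; the function names are illustrative only.

```python
def cross_coherent_cluster(alpha: int, d1: int, d2: int) -> float:
    """Proposition 4: crossing index of the alpha-th coherent cluster."""
    return 2 * (alpha - 1) / (d1 * d2)


def cross_coherent_group(cluster_sizes, d1: int, d2: int) -> float:
    """Theorem 2c: size-weighted average of the cluster crossing indices."""
    total = sum(cluster_sizes)
    return sum(
        size / total * cross_coherent_cluster(alpha, d1, d2)
        for alpha, size in enumerate(cluster_sizes, start=1)
    )


# Sizes of cohC_1(1), ..., cohC_1(7), summed from the riffle-shuffle counts.
sushi_sizes = [314, 235, 326, 315, 452, 375, 401]
print(sum(sushi_sizes))                                         # 2418
print(round(cross_coherent_group(sushi_sizes, d1=4, d2=6), 4))  # 0.273
```

The result, about 27.3%, agrees with the crossing index reported for $\mathrm{cohG}(1)$.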

Example 8:

( 0.046,0.051,0.042,0.042,0.053,0.047,0.037,0.034,0.037,0.025 ) .

We can discern the following grouped seriation (bucket ranking) of the items

$$j_8 \succ j_5 \succ j_6 \succ \{j_3, j_2\} \succ j_1 \succ \{j_9, j_4\} \succ \{j_7\} \succ \{j_{10}\}.$$

The groupings are based on the standard 95% confidence intervals of the Borda scale of the items.

The 2^{nd} coherent group $\mathrm{cohG}(2)$, summarized by its Borda scales in

The third coherent group $\mathrm{cohG}(3)$, summarized by its Borda scales in

The fourth coherent group $\mathrm{cohG}(4)$, summarized by its Borda scales in

Remark 6:

a) Note that the number of preferred sushis in $\mathrm{cohG}(1)$ and $\mathrm{cohG}(2)$ is six, that is, $|J_2| = 6$, while the number of preferred sushis in $\mathrm{cohG}(3)$ and $\mathrm{cohG}(4)$ is four.

b) The four coherent groups summarized in

c) We consider the fifth group as noisy (outliers, not shown); it comprises the remaining 12.36% of the sample. It contains $\mathrm{cohG}(5) = \bigcup_{\alpha=1}^{2} \mathrm{cohC}_5(\alpha)$, whose sample size is 38, a very small number. For the sake of completeness, we also provide the sample sizes of its two coherent clusters: $|\mathrm{cohC}_5(1)| = 22$ and $|\mathrm{cohC}_5(2)| = 16$.

$\mathrm{cohG}(1) = \bigcup_{\alpha=1}^{7} \mathrm{cohC}_1(\alpha)$ | $\beta$ | $\mathrm{cohG}(2) = \bigcup_{\alpha=1}^{8} \mathrm{cohC}_2(\alpha)$ | $\beta$ |
---|---|---|---|
8. toro (fatty tuna) | 7.62 | 8. toro (fatty tuna) | 6.15 |
5. uni (sea urchin) | 6.31 | 2. anago (sea eel) | 5.97 |
6. sake (salmon roe) | 5.92 | 1. ebi (shrimp) | 5.92 |
3. maguro (tuna) | 5.49 | 7. tamago (egg) | 5.76 |
2. anago (sea eel) | 5.35 | 3. maguro (tuna) | 5.55 |
1. ebi (shrimp) | 5.04 | 4. ika (squid) | 5.41 |
9. tekka-maki (tuna roll) | 3.27 | 9. tekka-maki (tuna roll) | 3.80 |
4. ika (squid) | 3.10 | 10. kappa-maki (cucumber roll) | 2.56 |
7. tamago (egg) | 1.94 | 6. sake (salmon roe) | 2.45 |
10. kappa-maki (cucumber roll) | 0.97 | 5. uni (sea urchin) | 1.44 |
$\mathrm{Cross}(\mathrm{cohG}(1)) = 27.3\%$ | | $\mathrm{Cross}(\mathrm{cohG}(2)) = 35.38\%$ | |
$\lvert\mathrm{cohG}(1)\rvert = 2418$ (48.36%) | | $\lvert\mathrm{cohG}(2)\rvert = 955$ (19.10%) | |

$\mathrm{cohG}(3) = \bigcup_{\alpha=1}^{8} \mathrm{cohC}_3(\alpha)$ | $\beta$ | $\mathrm{cohG}(4) = \bigcup_{\alpha=1}^{8} \mathrm{cohC}_4(\alpha)$ | $\beta$ |
---|---|---|---|
8. toro (fatty tuna) | 7.31 | 4. ika (squid) | 6.67 |
6. sake (salmon roe) | 6.62 | 5. uni (sea urchin) | 6.50 |
3. maguro (tuna) | 6.30 | 6. sake (salmon roe) | 6.43 |
9. tekka-maki (tuna roll) | 6.00 | 1. ebi (shrimp) | 6.16 |
7. tamago (egg) | 3.76 | 8. toro (fatty tuna) | 3.69 |
4. ika (squid) | 3.41 | 7. tamago (egg) | 3.39 |
2. anago (sea eel) | 3.00 | 2. anago (sea eel) | 3.21 |
1. ebi (shrimp) | 2.92 | 9. tekka-maki (tuna roll) | 3.14 |
10. kappa-maki (cucumber roll) | 2.86 | 10. kappa-maki (cucumber roll) | 2.99 |
5. uni (sea urchin) | 2.80 | 3. maguro (tuna) | 2.80 |
$\mathrm{Cross}(\mathrm{cohG}(3)) = 31.37\%$ | | $\mathrm{Cross}(\mathrm{cohG}(4)) = 35.27\%$ | |
$\lvert\mathrm{cohG}(3)\rvert = 662$ (13.24%) | | $\lvert\mathrm{cohG}(4)\rvert = 347$ (6.94%) | |

The 1980 American Psychological Association (APA) presidential election had five candidates: $\{A, C\}$ were research psychologists, $\{D, E\}$ were clinical psychologists, and $B$ was a community psychologist. In this election, voters ranked the five candidates in order of preference. Among the 15,449 votes, 5738 ranked all five candidates. We consider the data set which records the 5738 complete votes; it is available in [ [

Part c of

[

The following important observation emerges from the comparison of results in

(a) Parameters of the best mixture model selected, Cayley-based, using BIC

Group | sample% | modal orderings | precision |
---|---|---|---|
1 | 42 | $D \succ B \succ E \succ C \succ A$ | 0.16 |
2 | 31 | $C \succ D \succ E \succ A \succ B$ | 0.79 |
3 | 12 | $B \succ C \succ A \succ D \succ E$ | 1.52 |
4 | 8 | $B \succ C \succ A \succ E \succ D$ | 1.81 |
5 | 7 | $B \succ D \succ A \succ E \succ C$ | 1.72 |

(b) Parameters of the best mixture model selected, Cayley-based, using ICL

Group | sample% | modal ordering | precision |
---|---|---|---|
1 | 100 | $B \succ C \succ A \succ E \succ D$ | 0.25 |

(c) The first five coherent groups, each composed of two coherent clusters

Group | sample% | $\beta(C)$ | $\beta(A)$ | $\beta(B)$ | $\beta(E)$ | $\beta(D)$ | Cross |
---|---|---|---|---|---|---|---|
cohG(1) Research | 31.0 | 3.55 | 3.15 | 1.31 | 1.15 | 0.85 | 10.22% |
cohG(2) Clinical | 23.7 | 0.83 | 1.28 | 1.28 | 3.31 | 3.30 | 12.90% |
cohG(3) mixed B | 14.2 | 0.66 | 2.70 | 2.96 | 0.71 | 2.97 | 12.45% |
cohG(4) mixed B | 12.0 | 2.85 | 0.77 | 2.86 | 2.80 | 0.72 | 10.22% |
cohG(5) outlier | 8.6 | 0.96 | 3.30 | 1.31 | 3.40 | 1.00 | 9.88% |

Group 3 is based on the modal category $B \succ C \succ A \succ D \succ E$ and group 4 is based on the modal category $B \succ C \succ A \succ E \succ D$. The only difference between these two modal categories is the transposition of the two lowest-ranked candidates, the clinical psychologists $\{D, E\}$; this difference is not important and does not appear in our approach, which is a latent variable approach.

The eight coherent clusters of the first four coherent groups can simply be described as:

$\mathrm{cohC}_1(1)$: $T_v(\tau_{J_2}(S_2) = \tau_{\{A,C\}}\{3,4\} = \{3,4\}) = 7$ for $v = 1, \dots, 1233$.

$\mathrm{cohC}_1(2)$: $T_v(\tau_{J_2}(S_2) = \tau_{\{A,C\}}\{3,4\} = \{2,4\}) = 6$ for $v = 1, \dots, 545$.

$\mathrm{cohC}_2(1)$: $T_v(\tau_{J_2}(S_2) = \tau_{\{D,E\}}\{3,4\} = \{3,4\}) = 7$ for $v = 1, \dots, 834$.

$\mathrm{cohC}_2(2)$: $T_v(\tau_{J_2}(S_2) = \tau_{\{D,E\}}\{3,4\} = \{2,4\}) = 6$ for $v = 1, \dots, 526$.

$\mathrm{cohC}_3(1)$: $T_v(\tau_{J_1}(S_1) = \tau_{\{C,E\}}\{0,1\} = \{0,1\}) = 1$ for $v = 1, \dots, 512$.

$\mathrm{cohC}_3(2)$: $T_v(\tau_{J_1}(S_1) = \tau_{\{C,E\}}\{0,1\} = \{0,2\}) = 2$ for $v = 1, \dots, 305$.

$\mathrm{cohC}_4(1)$: $T_v(\tau_{J_1}(S_1) = \tau_{\{A,D\}}\{0,1\} = \{0,1\}) = 1$ for $v = 1, \dots, 350$.

$\mathrm{cohC}_4(2)$: $T_v(\tau_{J_1}(S_1) = \tau_{\{A,D\}}\{0,1\} = \{0,2\}) = 2$ for $v = 1, \dots, 338$.

In this case, we can also visualize all the orderings belonging to a coherent group:

Riffle independence is a nonparametric probabilistic modelling method of preferences developed by [

(a) Partition the set $J$ of $d$ distinct items into two disjoint subsets $J_1$ of size $d_1$ and $J_2$ of size $d_2$. Then generate an ordering of the items within each subset according to a certain ranking model. This implies that any ordering of the $d$ items can be written as a direct product of two disconnected orderings, which in turn implies the independence of the two subsets $J_1$ and $J_2$. So the model complexity of this step is of order $d_1! + d_2!$.

(b) Interleave the two independent orderings for these two subsets using a riffle shuffle to form a combined ordering. An interleaving is a binary mapping from the set of orderings to $\{J_1, J_2\}$. The model complexity of this step is of order $d!/(d_1!\, d_2!)$. The interleaving step generates the riffled independence of the two subsets $J_1$ and $J_2$.

So the combined model complexity of both steps is $d_1! + d_2! + d!/(d_1!\, d_2!)$, which is much smaller than $d! = (d_1 + d_2)!$.
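To give a sense of scale, the two counts can be compared directly. The short sketch below evaluates both for $d_1 = 4$ and $d_2 = 6$, a split chosen purely for illustration; the function name is not from any library.

```python
from math import factorial


def riffle_model_complexity(d1: int, d2: int) -> int:
    """d1! + d2! + d!/(d1! d2!): orderings within each block plus interleavings."""
    d = d1 + d2
    interleavings = factorial(d) // (factorial(d1) * factorial(d2))
    return factorial(d1) + factorial(d2) + interleavings


print(riffle_model_complexity(4, 6))  # 954
print(factorial(10))                  # 3628800
```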

For example, consider an ordering of the items in the set $J = \{A, B, C, D, E, F\}$ from its two subsets $J_1 = \{A, C\}$ and $J_2 = \{B, D, E, F\}$. In the first step, relative orderings of the items in $J_1$ and $J_2$ are drawn independently. Suppose we obtain the relative ordering $\varphi(J_1) = (C \succ A)$ in $J_1$, and the relative ordering $\varphi(J_2) = (B \succ D \succ F \succ E)$ in $J_2$. Then, in the second step, the two relative orderings are combined by interleaving the items in the two subsets. For instance, if the interleaving process is $\omega(J_1, J_2) = (J_1, J_2, J_2, J_1, J_2, J_2)$, where the relative ordering of the items in each subset remains unchanged, the combined ordering is then determined by the composition

$$\omega(J_1, J_2) * (\varphi(J_1), \varphi(J_2)) = (C \succ B \succ D \succ A \succ F \succ E) = \varphi(J).$$

Given the two subsets $J_1$ and $J_2$ with their orderings $\varphi(J_1)$ and $\varphi(J_2)$ and interleaving $\omega(J_1, J_2)$ generated from models with probability distributions $f_{J_1}$, $g_{J_2}$ and $m_\omega$, respectively, the probability of the observed ordering under the riffle independence model is

$$P(\varphi(J)) = m_\omega(\omega(J_1, J_2))\, f_{J_1}(\varphi(J_1))\, g_{J_2}(\varphi(J_2)).$$
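The interleaving step of this example is easy to reproduce. In the sketch below the pattern $\omega$ is encoded as a tuple of block labels 1 and 2 (an encoding chosen here for convenience). The second function evaluates the probability formula under the additional, purely illustrative assumption that $f_{J_1}$, $g_{J_2}$ and $m_\omega$ are all uniform.

```python
from fractions import Fraction
from math import comb, factorial


def interleave(pattern, order_1, order_2):
    """Merge two relative orderings according to a pattern of block labels (1 or 2)."""
    it1, it2 = iter(order_1), iter(order_2)
    return tuple(next(it1) if block == 1 else next(it2) for block in pattern)


phi_1 = ("C", "A")            # relative ordering within J1
phi_2 = ("B", "D", "F", "E")  # relative ordering within J2
omega = (1, 2, 2, 1, 2, 2)    # (J1, J2, J2, J1, J2, J2)
print(interleave(omega, phi_1, phi_2))  # ('C', 'B', 'D', 'A', 'F', 'E')


def uniform_riffle_probability(d1: int, d2: int) -> Fraction:
    """P = m * f * g when all three distributions are uniform (an assumption)."""
    m = Fraction(1, comb(d1 + d2, d1))  # one of the d!/(d1! d2!) interleavings
    f = Fraction(1, factorial(d1))      # one of the d1! orderings of J1
    g = Fraction(1, factorial(d2))      # one of the d2! orderings of J2
    return m * f * g


print(uniform_riffle_probability(2, 4))  # 1/720
```

Under the uniform assumption the product equals $1/6!$, so every one of the $720$ orderings of the six items is equally likely, as expected.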

There are two formulations of riffle shuffle for rank data in statistics: probabilistic and exploratory. In the riffled independence model, the set of items is partitioned recursively, while in the exploratory approach the set of voters is partitioned recursively.

The main contribution of this paper is the introduction of an exploratory riffle shuffling procedure to reveal and display the structure of diffuse rank data for large sample sizes. The new notion of a coherent cluster that we developed is based on the geometric notion of taxicab projection of points on the first TCA axis, globally and locally; furthermore, it has simple mathematical properties, stated in Theorems 1 and 2. Coherent clusters of a coherent group represent the same latent variable opposing preferred items to disliked items, and can easily be interpreted and displayed.

Like Occam’s razor, step by step, our procedure peels the essential structural layers (coherent groups) of rank data.

Our method was able to discover some other aspects of the rank data, such as outliers or small groups, which are eclipsed or masked by well-established methods, such as distance or random utility-based methods. The major reason for this is that in random utility-based methods the multivariate nature of a preference is reduced to binary preferences (paired comparisons), and in Mallows distance related methods distances between any two preferences are bounded.

We presented a new index, Cross, that quantifies the extent of crossing of scores between the optimal binary partition of the items that resulted from TCA. The crossing index of a group is based on the first taxicab dispersion measure: it takes values between 0 and 100%, so it is easily interpretable.

The proposed approach can easily be generalized to the analysis of rankings with ties and partial rankings.

The R package TaxicabCA, available on CRAN, can be used to carry out the calculations.

Choulakian’s research has been supported by NSERC grant (RGPIN-2017-05092) of Canada.

The authors declare no conflicts of interest regarding the publication of this paper.

Choulakian, V. and Allard, J. (2021) Uncovering and Displaying the Coherent Groups of Rank Data by Exploratory Riffle Shuffling. Open Journal of Statistics, 11, 178-212. https://doi.org/10.4236/ojs.2021.111010

Let $R = (r_{ij})$ for $i = 1, \dots, n$ and $j = 1, \dots, d$ represent the Borda scorings for preferences, where $r_{ij}$ takes values $0, \dots, d-1$. Similarly, let $\bar{R}$ represent the reverse Borda scorings, whose column sums are the coordinates of the row named $\mathrm{nega} = n\bar{\beta} = 1'_n \bar{R}$. We consider the application of TCA to the data set

$$R_{\mathrm{nega}} = \begin{pmatrix} R \\ \mathrm{nega} \end{pmatrix}$$

of size $(n+1) \times d$. So let

$$P = R_{\mathrm{nega}} / t$$

be the correspondence table associated with $R_{\mathrm{nega}}$, where $t = 2n \sum_{j=0}^{d-1} j = n d (d-1)$. We have

$$p_{i*} = \frac{1}{2n} \quad \text{for } i = 1, \dots, n \tag{14}$$

$$\phantom{p_{i*}} = \frac{1}{2} \quad \text{for } i = n+1, \tag{15}$$

and

$$p_{*j} = \frac{1}{d} \quad \text{for } j = 1, \dots, d. \tag{16}$$

The first residual correspondence matrix will be

$$p_{ij}^{(1)} = p_{ij} - p_{i*}\, p_{*j} \tag{17}$$

$$= \frac{r_{ij}}{t} - \frac{1}{2n} \cdot \frac{1}{d} \quad \text{for } i = 1, \dots, n \tag{18}$$

$$= \frac{\mathrm{nega}_j}{t} - \frac{1}{2} \cdot \frac{1}{d} \quad \text{for } i = n+1. \tag{19}$$

Consider the nontrivial binary partition of the set $S = \{0, 1, \dots, d-1\}$ into $S = S_1 \cup S_2$, where $|S_1| = d_1$, $|S_2| = d_2$ and $d = d_1 + d_2$. To eliminate the sign indeterminacy in the first TCA principal axis, we fix $v_1(\mathrm{nega}) = v_1(n+1) = -1$; and we designate by $S_1$ the set of item indices such that the first TCA principal axis coordinates are negative, that is, $u_1(j) = -1$ for $j \in S_1$. It follows that $u_1(j) = 1$ for $j \in S_2$.

Now we have by (4) for $i = 1, \dots, n$

$$\begin{aligned}
a_{i1} &= \sum_{j=1}^{d} u_1(j)\, p_{ij}^{(1)} \\
&= \sum_{j \in S_1} u_1(j)\, p_{ij}^{(1)} + \sum_{j \in S_2} u_1(j)\, p_{ij}^{(1)} \\
&= -\sum_{j \in S_1} p_{ij}^{(1)} + \sum_{j \in S_2} p_{ij}^{(1)} \\
&= -2 \sum_{j \in S_1} p_{ij}^{(1)} \quad \text{by (17)} \\
&= -2 \sum_{j \in S_1} \left( \frac{r_{ij}}{t} - \frac{1}{2n} \cdot \frac{1}{d} \right) \quad \text{by (18)} \\
&= \frac{d_1}{n d} - \frac{2}{t} \sum_{j \in S_1} r_{ij};
\end{aligned} \tag{20}$$

and from which we deduce by (5) for $i = 1, \dots, n$

$$f_{i1} = \frac{a_{i1}}{p_{i*}} = \frac{2 d_1}{d} - \frac{4}{d(d-1)} \sum_{j \in S_1} r_{ij}. \tag{21}$$
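Formula (21) depends on a voter only through the sum of the Borda scores given to the items of $S_1$. The following sketch evaluates it exactly with rational arithmetic for two SUSHI riffle shuffles ($d = 10$, $d_1 = 4$); the function name is illustrative.

```python
from fractions import Fraction


def first_factor_score(scores_on_S1, d: int) -> Fraction:
    """Equation (21): f_i1 = 2*d1/d - 4/(d*(d-1)) * (sum of scores given to S1)."""
    d1 = len(scores_on_S1)
    return Fraction(2 * d1, d) - Fraction(4, d * (d - 1)) * sum(scores_on_S1)


print(first_factor_score((0, 1, 2, 3), d=10))  # 8/15
print(first_factor_score((0, 1, 2, 9), d=10))  # 4/15
print(first_factor_score((0, 1, 8, 3), d=10))  # 4/15
```

The last two score sets both sum to 12, so the corresponding voters share the same first factor score and fall into the same coherent cluster.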

We have the following theorem concerning the first TCA principal factor scores of respondents $f_{i1}$ for $i = 1, \dots, n$.

Theorem 1:

a) The maximum number of distinct clusters of $n$ respondents on the first TCA principal axis (distinct $f_{i1}$ values) is $d_1 d_2 + 1$.

Proof: We consider the two extreme cases of $S_1$ and calculate the summation term in (21):

For $S_1 = \{0, 1, \dots, d_1 - 1\}$, $\sum_{j \in S_1} r_{ij} = \sum_{j=0}^{d_1 - 1} j = \dfrac{d_1(d_1 - 1)}{2}$.

For $S_1 = \{d - d_1, d - d_1 + 1, \dots, d - 1\}$, $\sum_{j \in S_1} r_{ij} = \sum_{j=d-d_1}^{d-1} j = \sum_{j=d_2}^{d-1} j = \dfrac{d_1(d_2 + d - 1)}{2}$.

It follows that

$$\frac{d_1(d_1 - 1)}{2} \le \sum_{j \in S_1} r_{ij} \le \frac{d_1(d_2 + d - 1)}{2};$$

so $\sum_{j \in S_1} r_{ij}$ can take at most $\dfrac{d_1(d_2 + d - 1)}{2} - \dfrac{d_1(d_1 - 1)}{2} + 1 = d_1 d_2 + 1$ values.

b) The maximum value that $f_{i1}$ can attain is $\dfrac{2 d_1 d_2}{d(d-1)}$.

Proof: From (21) and Part a, it follows that the maximum value that $f_{i1}$ can attain is $\dfrac{2 d_1}{d} - \dfrac{4}{d(d-1)} \cdot \dfrac{d_1(d_1 - 1)}{2} = \dfrac{2 d_1 d_2}{d(d-1)}$.

c) The minimum value that $f_{i1}$ can attain is $-\dfrac{2 d_1 d_2}{d(d-1)}$.

Proof: From (21) and Part a, it follows that the minimum value that $f_{i1}$ can attain is $\dfrac{2 d_1}{d} - \dfrac{4}{d(d-1)} \cdot \dfrac{d_1(d_2 + d - 1)}{2} = -\dfrac{2 d_1 d_2}{d(d-1)}$.

d) If the number of distinct clusters is maximum, $d_1 d_2 + 1$, then the gap between two contiguous $f_{i1}$ values is $\dfrac{4}{d(d-1)}$.

Proof: Suppose that the number of distinct clusters is maximum, $d_1 d_2 + 1$. We consider a first TCA factor score $f_{i1} = \dfrac{2 d_1}{d} - \dfrac{4}{d(d-1)} \sum_{j \in S_1} r_{ij}$ which is different in value from the two extreme values $\pm \dfrac{2 d_1 d_2}{d(d-1)}$. Then $f_{i_1 1} = \dfrac{2 d_1}{d} - \dfrac{4}{d(d-1)} \left( -1 + \sum_{j \in S_1} r_{ij} \right)$ will be the contiguous higher value to $f_{i1}$; and similarly $f_{i_2 1} = \dfrac{2 d_1}{d} - \dfrac{4}{d(d-1)} \left( 1 + \sum_{j \in S_1} r_{ij} \right)$ will be the contiguous lower value to $f_{i1}$; and the required result follows.
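The four parts of Theorem 1 can be checked by brute force for a small case. The sketch below enumerates every set of $d_1$ distinct Borda scores that a voter could give to the items of $S_1$, evaluates (21) exactly, and verifies the count, the two extremes and the gap; $d_1 = 4$, $d_2 = 6$ is used only as an example.

```python
from fractions import Fraction
from itertools import combinations


def factor_score_values(d1: int, d2: int):
    """Sorted distinct values of f_i1 over all score sets a voter can give to S1."""
    d = d1 + d2
    values = {
        Fraction(2 * d1, d) - Fraction(4, d * (d - 1)) * sum(scores)
        for scores in combinations(range(d), d1)
    }
    return sorted(values)


d1, d2 = 4, 6
d = d1 + d2
vals = factor_score_values(d1, d2)
bound = Fraction(2 * d1 * d2, d * (d - 1))

assert len(vals) == d1 * d2 + 1                   # part a
assert vals[-1] == bound and vals[0] == -bound    # parts b and c
assert all(hi - lo == Fraction(4, d * (d - 1))    # part d
           for lo, hi in zip(vals, vals[1:]))
print(len(vals), vals[0], vals[-1])  # 25 -8/15 8/15
```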

Proposition 1: For a voting profile $V$, $\delta_1 \ge |f_1(\mathrm{nega})|$.

Proof: Let $a_1 = \begin{pmatrix} a_{11} \\ a_1(\mathrm{nega}) \end{pmatrix}$. We need the following three observations.

First, it is well known that $a_1$ is centered by (5) and (9),

$$1'_{n+1}\, a_1 = 0 = 1'_n\, a_{11} + a_1(\mathrm{nega});$$

from which we get

$$|1'_n\, a_{11}| = |a_1(\mathrm{nega})|. \tag{22}$$

Second, by the triangle inequality of the L_{1} norm we have

$$\|a_{11}\|_1 \ge |1'_n\, a_{11}|. \tag{23}$$

Third, the marginal relative frequency of the nega row is $p_{\mathrm{nega}*} = 1/2$ by (15), and $f_{i1} = a_{i1} / p_{i*}$ for $i = 1, \dots, n+1$ by (5); so we have

$$f_1(\mathrm{nega}) = 2\, a_1(\mathrm{nega}). \tag{24}$$

Now we have by (7)

$$\begin{aligned}
\delta_1 = \|a_1\|_1 &= \|a_{11}\|_1 + |a_1(\mathrm{nega})| \\
&\ge |1'_n\, a_{11}| + |a_1(\mathrm{nega})| \quad \text{by (23)} \\
&= 2\, |a_1(\mathrm{nega})| \quad \text{by (22)} \\
&= |f_1(\mathrm{nega})| \quad \text{by (24)}.
\end{aligned} \tag{25}$$

Proposition 2: Let $\mathrm{cohC}_m(\alpha) = V_{m,\alpha}$ be the $\alpha$th coherent cluster of the $m$th coherent group, characterized by $f_1^{V_{m,\alpha}}(\sigma) = f_\alpha^{V_m}$ for all $\sigma \in \mathrm{cohC}_m(\alpha)$. Then $\delta_1 = f_\alpha^{V_m} = -f_1(\mathrm{nega})$.

Proof: By Definition 1 of the coherency of the cluster $V_{m,\alpha}$, we have $0 < f_1^{V_{m,\alpha}}(i) = f_\alpha^{V_m}$ for $i = 1, \dots, |\mathrm{cohC}_m(\alpha)|$; by (5) and (14) it follows that $0 < a_{i1} = f_\alpha^{V_m} / (2n)$ for $i = 1, \dots, |\mathrm{cohC}_m(\alpha)|$; so the inequality in (25) becomes an equality, $\|a_{11}\|_1 = \sum_{i=1}^{n} a_{i1} = |1'_n\, a_{11}|$, and the required result follows.

Proposition 3 is a corollary to the following general result.

Theorem 3: If the first TCA principal axis of the columns of $R_{\mathrm{nega}}$ is $v_1 = \begin{pmatrix} 1_n \\ -1 \end{pmatrix}$, then the first principal column factor score $g_1$ of the $d$ items is an affine function of the Borda scale $\beta$; that is, $g_1(j) = \dfrac{2}{d-1}\, \beta(j) - 1$, or $\mathrm{corr}(g_1, \beta) = 1$.

Proof: Suppose that $v_1 = \begin{pmatrix} 1_n \\ -1 \end{pmatrix}$; then by (4) for $j = 1, \dots, d$

$$\begin{aligned}
b_1(j) &= \sum_{i=1}^{n+1} v_1(i)\, p_{ij}^{(1)} = \sum_{i=1}^{n} p_{ij}^{(1)} - p_{(n+1)j}^{(1)} \\
&= 2 \sum_{i=1}^{n} p_{ij}^{(1)} \quad \text{by (17)} \\
&= 2 \sum_{i=1}^{n} \left( p_{ij} - p_{i*}\, p_{*j} \right) \\
&= 2 \sum_{i=1}^{n} r_{ij} / t - p_{*j} \quad \text{by (14)} \\
&= 2 n \beta(j) / t - p_{*j}.
\end{aligned}$$

Thus by (5) for $j = 1, \dots, d$

$$g_1(j) = \frac{b_1(j)}{p_{*j}} = \frac{2 n \beta(j) / t - p_{*j}}{p_{*j}} = \frac{2 \beta(j)}{d-1} - 1.$$
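The algebra of this proof can be verified numerically. The sketch below builds $R_{\mathrm{nega}}$ from randomly generated complete rankings, forms the residual table, applies the axis $v_1 = (1_n, -1)$ and compares the result with $2\beta(j)/(d-1) - 1$ using exact fractions. Note the assumption: $v_1$ is imposed, as in the hypothesis of Theorem 3, rather than computed by TCA, and the function names are illustrative.

```python
import random
from fractions import Fraction


def borda_scale(R):
    """Mean Borda score beta(j) of each item."""
    n, d = len(R), len(R[0])
    return [Fraction(sum(row[j] for row in R), n) for j in range(d)]


def g1_with_nega_axis(R):
    """Column factor scores g1 obtained with v1 = (1, ..., 1, -1) on R_nega."""
    n, d = len(R), len(R[0])
    nega = [sum((d - 1) - row[j] for row in R) for j in range(d)]  # reverse Borda sums
    table = [list(row) for row in R] + [nega]
    t = sum(sum(row) for row in table)                             # equals n*d*(d-1)
    P = [[Fraction(x, t) for x in row] for row in table]
    row_mass = [sum(row) for row in P]
    col_mass = [sum(P[i][j] for i in range(n + 1)) for j in range(d)]
    v1 = [1] * n + [-1]
    g1 = []
    for j in range(d):
        b1 = sum(v1[i] * (P[i][j] - row_mass[i] * col_mass[j]) for i in range(n + 1))
        g1.append(b1 / col_mass[j])
    return g1


random.seed(0)
d = 5
R = [random.sample(range(d), d) for _ in range(20)]  # 20 random Borda scorings
expected = [2 * b / (d - 1) - 1 for b in borda_scale(R)]
assert g1_with_nega_axis(R) == expected
```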

Proposition 4: The crossing index of a coherent cluster is

$$\mathrm{Cross}(\mathrm{cohC}(\alpha)) = \frac{2(\alpha - 1)}{d_1 d_2}.$$

Proof: Easily shown by using Definition 3 and Proposition 2.
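The omitted step can be spelled out as follows. This is a sketch under one assumption, consistent with the SUSHI example: the clusters are indexed so that $\mathrm{cohC}(\alpha)$ has the $\alpha$th largest first factor score, with consecutive scores separated by the gap of Theorem 1d.

```latex
\begin{aligned}
f_\alpha &= \frac{2 d_1 d_2}{d(d-1)} - (\alpha - 1)\,\frac{4}{d(d-1)}
  && \text{(Theorem 1b and 1d)} \\
\delta_1(\mathrm{cohC}(\alpha)) &= f_\alpha
  && \text{(Proposition 2)} \\
\mathrm{Cross}(\mathrm{cohC}(\alpha))
  &= 1 - \frac{f_\alpha}{2 d_1 d_2 / \bigl(d(d-1)\bigr)}
  && \text{(Definition 3)} \\
  &= 1 - \left( 1 - \frac{4(\alpha - 1)}{2 d_1 d_2} \right)
   = \frac{2(\alpha - 1)}{d_1 d_2}.
\end{aligned}
```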

The proof of Theorem 2a easily follows from Theorem 3. The proof of Theorem 2b is similar to the proof of Proposition 1. The proof of Theorem 2c is similar to the proof of Proposition 4.