
Optimization has two faces, minimization of a loss function or maximization of a gain function. We show that the mean absolute deviation about the mean, d, maximizes a gain function based on the power set of the individuals; and nd, where n is the sample size, equals twice the value of the cut-norm of the deviations about the mean. This property is generalized to double-centered and triple-centered data sets. Furthermore, we show that among the three well-known dispersion measures, standard deviation, least absolute deviation and d, d is the most robust based on the relative contribution criterion. More importantly, we show that the computation of each principal dimension of taxicab correspondence analysis (TCA) corresponds to balanced 2-blocks seriation. These ideas are applied to two data sets.

Optimization has two faces, minimization of a loss function or maximization of a gain function. The following two well known dispersion measures, the variance (s^{2}) and mean absolute deviations about the median (LAD), are optimal because each minimizes a different loss function

$$s^{2} = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}{n} \le \frac{\sum_{i=1}^{n} (y_i - c)^{2}}{n} \qquad (1)$$

and

$$LAD = \frac{\sum_{i=1}^{n} |y_i - \mathrm{median}|}{n} \le \frac{\sum_{i=1}^{n} |y_i - c|}{n}, \qquad (2)$$

where $y_1, y_2, \ldots, y_n$ and $c$ represent a sample of $(n+1)$ values. To our knowledge, no optimality property is known for the mean absolute deviations about the mean defined by

$$d = \frac{\sum_{i=1}^{n} |y_i - \bar{y}|}{n}, \qquad (3)$$

even though it has been studied in several papers for modeling purposes; see, among others, [

The inequality $LAD \le d \le s$ is well known and is a corollary to the Lyapunov inequality; see for instance [
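The ordering of the three measures is easy to check numerically. The following minimal sketch computes LAD, d and s with divisor n, as in (1)-(3); the sample values and the function name `dispersion_measures` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dispersion_measures(y):
    """Return (LAD, d, s) for a 1-D sample, all with divisor n as in (1)-(3)."""
    y = np.asarray(y, dtype=float)
    lad = np.mean(np.abs(y - np.median(y)))    # mean absolute deviation about the median
    d = np.mean(np.abs(y - y.mean()))          # mean absolute deviation about the mean
    s = np.sqrt(np.mean((y - y.mean()) ** 2))  # standard deviation with divisor n
    return lad, d, s

# Illustrative sample with one large value (hypothetical data)
y = [2.0, 3.0, 5.0, 7.0, 11.0, 40.0]
lad, d, s = dispersion_measures(y)
print(lad, d, s)
```

On this sample the three values come out in the order LAD, d, s, as the inequality predicts.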

d is the measure of dispersion used in taxicab correspondence analysis (TCA), an L_{1} variant of correspondence analysis (CA), see [

This paper is organized as follows. In Section 2, we show the optimality of the d, s^{2} and LAD statistics based on maximizing gain functions; however, d beats s^{2} and LAD with respect to the relative contribution of a point, a geometry-based robustness measure used in French data analysis circles. This follows from Lemma 1, which states that, for a centered vector, nd equals twice its cut-norm. Sections 3 and 4 generalize the optimality result for d to double-centered and triple-centered arrays, respectively; Section 3 also discusses balanced 2-blocks seriation of a matrix with application to TCA. We conclude in Section 5.

We consider the centered vector $x = y - \bar{y} 1_n$, where $1_n$ is composed of $n$ ones. Let $I = \{1, 2, \ldots, n\}$ and $I = S \cup \bar{S}$ a binary partition of $I$. We have

$$\sum_{i \in I} x_i = 0 = \sum_{i \in S} x_i + \sum_{i \in \bar{S}} x_i;$$

from which we deduce

$$\sum_{i \in S} x_i = -\sum_{i \in \bar{S}} x_i. \qquad (4)$$

We define the cut-norm of a centered vector $x$ to be

$$\|x\|_{\boxdot} = \max_{S} \sum_{i \in S} x_i = \sum_{i \in S_{opt}} x_i = -\sum_{i \in \bar{S}_{opt}} x_i \quad \text{by (4)},$$

where $S_{opt} = \{i : x_i \ge 0 \text{ for } i = 1, 2, \ldots, n\}$. By casting the computation of d as a combinatorial maximization problem, we have the following main result describing the optimality of the d-statistic over all elements of the power set of $I$.

Lemma 1: (2-equal parts property) $nd = 2\|x\|_{\boxdot} \ge 2\sum_{i \in S} x_i$ for all $S \subset I$.

Proof:

$$nd = \sum_{i=1}^{n} |x_i| = \sum_{i \in S_{opt}} x_i - \sum_{i \in \bar{S}_{opt}} x_i = 2\|x\|_{\boxdot} \ \text{(by (4))} \ \ge 2\sum_{i \in S} x_i \quad \text{for all } S \subset I.$$
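Lemma 1 can be checked by exhaustive enumeration on a small sample. The sketch below is only an illustration under stated assumptions: the sample values and the helper name `cut_norm_vector` are hypothetical, and the brute-force search over all subsets is feasible only for small n.

```python
from itertools import combinations
import numpy as np

def cut_norm_vector(x):
    """Brute-force maximum of sum_{i in S} x_i over all subsets S of the indices."""
    n = len(x)
    best = 0.0  # the empty subset has sum 0
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            best = max(best, float(sum(x[i] for i in S)))
    return best

y = np.array([1.0, 4.0, 6.0, 9.0, 15.0])  # illustrative sample
x = y - y.mean()                           # centered vector
n_d = np.abs(x).sum()                      # n * d
brute = cut_norm_vector(x)                 # cut-norm by enumeration
closed_form = x[x >= 0].sum()              # sum over S_opt = {i : x_i >= 0}
print(n_d, brute, closed_form)
```

Here the centered vector is (−6, −3, −1, 2, 8), so nd = 20 and the cut-norm equals 10, attained on the nonnegative entries.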

Corollary 1: $d \ge x'u/n$ for $u \in \{-1, 1\}^n$.

Proof: By defining $u_{opt}(i) = 1$ if $i \in S_{opt}$ and $u_{opt}(i) = -1$ if $i \in \bar{S}_{opt}$, we get $d = x'u_{opt}/n \ge x'u/n$.

Corollary 2: $LAD \ge (y - \mathrm{median}\, 1_n)'u/n$ for $u \in \{-1, 1\}^n$.

Corollary 2 shows that LAD has a second optimality property. We emphasize the fact that the optimizing function in (2) is a univariate loss function of $c \in \mathbb{R}$, while the optimizing function in Corollary 2 is a multivariate gain function of $u \in \{-1, 1\}^n$.

There is a similar result for the variance in (1), based on the Cauchy-Schwarz inequality and stated in Lemma 2.

Lemma 2: $s = \|(y - \bar{y} 1_n)/\sqrt{n}\|_2 \ge (y - \bar{y} 1_n)'u/\sqrt{n}$ for $u'u = 1$.

We note that Corollaries 1 and 2 and Lemma 2 represent particular cases of the Hölder inequality, see [
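The three gain-function characterizations can be compared side by side. The following sketch enumerates all sign vectors for d and LAD and uses the normalized centered vector for s; it assumes the $\sqrt{n}$ normalization implied by definition (1), and the sample values are illustrative.

```python
from itertools import product
import numpy as np

y = np.array([1.0, 4.0, 6.0, 9.0, 15.0])  # illustrative sample
n = len(y)
x_mean = y - y.mean()        # deviations about the mean
x_med = y - np.median(y)     # deviations about the median

d = np.abs(x_mean).mean()
lad = np.abs(x_med).mean()
s = np.sqrt((x_mean ** 2).mean())

# Corollaries 1 and 2: maximize the linear gain over all sign vectors u
signs = [np.array(u) for u in product([-1.0, 1.0], repeat=n)]
gain_d = max(x_mean @ u for u in signs) / n
gain_lad = max(x_med @ u for u in signs) / n

# Lemma 2 (Cauchy-Schwarz): the unit vector proportional to x attains s
u_star = x_mean / np.linalg.norm(x_mean)
gain_s = x_mean @ u_star / np.sqrt(n)   # assumes the sqrt(n) normalization

print(gain_d, d, gain_lad, lad, gain_s, s)
```

Each maximal gain coincides with the corresponding dispersion measure, which is the content of Corollaries 1 and 2 and Lemma 2.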

Definition 1: We define the relative contribution of an element $y_i$ to d, s^{2} and LAD, respectively, to be

$$RC_{d}(y_i) = \frac{|y_i - \bar{y}|}{nd},$$

$$RC_{s^2}(y_i) = \frac{|y_i - \bar{y}|^{2}}{n s^{2}},$$

$$RC_{LAD}(y_i) = \frac{|y_i - \mathrm{median}|}{n\, LAD}.$$

Then the following inequalities are true

$$0 \le RC_{d}(y_i) \le 0.5,$$

$$0 \le RC_{s^2}(y_i) < 1,$$

$$0 \le RC_{LAD}(y_i) \le 1;$$

from which we conclude that, based on the relative contribution criterion, d is the most robust of the three dispersion measures, because the relative contribution of any point to d is bounded above by 0.5.

We note that the inequality $0 \le RC_{s^2}(y_i) < 1$ is a weaker variant of the Laguerre-Samuelson inequality; see for instance [

We have the following definition.

Definition 2: An element $x_i = y_i - \bar{y}$ is a heavyweight if $RC_{d}(y_i) = 0.5$; that is, $|x_i| = |y_i - \bar{y}| = nd/2$.

We note that a heavyweight element attains the upper bound of $RC_{d}(y_i)$, but it never attains the upper bound of $RC_{s^2}(y_i)$ and $RC_{LAD}(y_i)$.
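The relative contributions of Definition 1 and the heavyweight notion of Definition 2 can be illustrated on a toy sample. In the sketch below the data and the function name `relative_contributions` are hypothetical; the sample is chosen so that a single value lies above the mean, which makes it a heavyweight.

```python
import numpy as np

def relative_contributions(y):
    """Relative contributions of each y_i to d, s^2 and LAD (Definition 1)."""
    y = np.asarray(y, dtype=float)
    dev_mean = np.abs(y - y.mean())
    dev_med = np.abs(y - np.median(y))
    rc_d = dev_mean / dev_mean.sum()                # |y_i - ybar| / (n d)
    rc_s2 = dev_mean ** 2 / (dev_mean ** 2).sum()   # |y_i - ybar|^2 / (n s^2)
    rc_lad = dev_med / dev_med.sum()                # |y_i - median| / (n LAD)
    return rc_d, rc_s2, rc_lad

y = [0.0, 1.0, 2.0, 3.0, 14.0]   # illustrative sample; the last value is extreme
rc_d, rc_s2, rc_lad = relative_contributions(y)
print(rc_d[-1], rc_s2[-1], rc_lad[-1])
```

For the extreme value, the contribution to d is exactly 0.5 (a heavyweight), while its contributions to s^{2} and LAD are about 0.77 and 0.75. The sketch assumes a non-constant sample, since otherwise the denominators vanish.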

Let $P = (p_{ij})$ be a correspondence matrix; that is, $p_{ij} \ge 0$ for $i \in I$ and $j \in J = \{1, 2, \ldots, m\}$ and $\sum_{j=1}^{m} \sum_{i=1}^{n} p_{ij} = 1$. As usual, we define $p_{i*} = \sum_{j=1}^{m} p_{ij}$ and $p_{*j} = \sum_{i=1}^{n} p_{ij}$. Let $P_1 = (x_{ij} = p_{ij} - p_{i*} p_{*j})$ for $i \in I$ and $j \in J$; then $P_1$ represents the residual matrix of $P$ with respect to the independence model $(p_{i*} p_{*j})$. In the jargon of statistics, the cell $x_{ij}$ represents the multiplicative 2-way interaction of the cell $(i, j) \in I \times J$. $P_1$ is double-centered:

$$P_1 1_m = 0_n \quad \text{and} \quad P_1' 1_n = 0_m. \qquad (5)$$

From (5) we get

$$\sum_{i \in S} x_{ij} = -\sum_{i \in \bar{S}} x_{ij} \quad \text{for } j \in J, \qquad (6)$$

$$\sum_{j \in T} x_{ij} = -\sum_{j \in \bar{T}} x_{ij} \quad \text{for } i \in I, \qquad (7)$$

for $T \subset J$. From (6) and (7), we get

$$\begin{aligned}
\sum_{j \in T} \sum_{i \in S} x_{ij} &= -\sum_{j \in T} \sum_{i \in \bar{S}} x_{ij} && (8) \\
&= -\sum_{i \in S} \sum_{j \in \bar{T}} x_{ij} && (9) \\
&= \sum_{i \in \bar{S}} \sum_{j \in \bar{T}} x_{ij}. && (10)
\end{aligned}$$
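Identities (8)-(10) hold for any choice of the subsets S and T. A minimal sketch, using the counts of the asbestos contingency table analyzed later in the paper and two arbitrarily chosen subsets (the subsets and the helper name `block_sum` are assumptions made for illustration):

```python
import numpy as np

# Asbestos contingency table (rows: exposure classes, columns: grades G0-G3)
N = np.array([[310, 36, 0, 0],
              [212, 158, 9, 0],
              [21, 35, 17, 4],
              [25, 102, 49, 18],
              [7, 35, 51, 28]], dtype=float)

P = N / N.sum()                                    # correspondence matrix
P1 = P - np.outer(P.sum(axis=1), P.sum(axis=0))    # residual from independence

S = np.array([True, True, False, False, False])    # arbitrary row subset
T = np.array([True, False, False, True])           # arbitrary column subset

def block_sum(M, rows, cols):
    """Sum of the entries of M in the block rows x cols (boolean masks)."""
    return M[np.ix_(rows, cols)].sum()

b_ST = block_sum(P1, S, T)
b_ScT = block_sum(P1, ~S, T)
b_STc = block_sum(P1, S, ~T)
b_ScTc = block_sum(P1, ~S, ~T)
print(b_ST, b_ScT, b_STc, b_ScTc)
```

The four block sums share the same absolute value, with the sign pattern (+, −, −, +) relative to the (S, T) block.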

We define the cut-norm of $P_1$ to be

$$\|P_1\|_{\boxdot} = \max_{S, T} \sum_{j \in T} \sum_{i \in S} x_{ij} = \sum_{j \in T_{opt}} \sum_{i \in S_{opt}} x_{ij}.$$

The cut-norm $\|P_1\|_{\boxdot}$ is a well known quantity in theoretical computer science, because of its relationship to the famous Grothendieck inequality, which is based on $\|P_1\|_{\infty \to 1}$, see among others [

The matrix P 1 can be considered as the starting point in taxicab correspondence analysis, an L_{1} variant of correspondence analysis, see [

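$$\begin{aligned}
\delta_1 &= \|P_1\|_{\infty \to 1} = \|P_1'\|_{\infty \to 1} \\
&= \max_{u \in \mathbb{R}^m} \frac{\|P_1 u\|_1}{\|u\|_\infty} = \max_{v \in \mathbb{R}^n} \frac{\|P_1' v\|_1}{\|v\|_\infty} = \max_{u \in \mathbb{R}^m,\, v \in \mathbb{R}^n} \frac{v' P_1 u}{\|u\|_\infty \|v\|_\infty} \\
&= \max \|P_1 u\|_1 \ \text{ subject to } u \in \{-1, +1\}^m \\
&= \max \|P_1' v\|_1 \ \text{ subject to } v \in \{-1, +1\}^n \\
&= \max v' P_1 u \ \text{ subject to } u \in \{-1, +1\}^m,\ v \in \{-1, +1\}^n && (11) \\
&= v_1' P_1 u_1. && (12)
\end{aligned}$$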

In data analysis, the vectors $v_1$ and $u_1$ are interpreted as the first taxicab principal axes and $\delta_1$ as the first taxicab dispersion. So we can compute the projection of the rows (resp. the columns) of $P_1$ on the first taxicab principal axis $u_1$ (resp. $v_1$) to be

$$a_1 = P_1 u_1, \qquad (13)$$

$$b_1 = P_1' v_1. \qquad (14)$$

Equation (12) implies

$$v_1 = \mathrm{sign}(a_1), \qquad (15)$$

$$u_1 = \mathrm{sign}(b_1), \qquad (16)$$

named transition formulas, see [

$$1_n' a_1 = 0 \quad \text{and} \quad \delta_1 = \|a_1\|_1, \qquad (17)$$

$$1_m' b_1 = 0 \quad \text{and} \quad \delta_1 = \|b_1\|_1. \qquad (18)$$

Using the above results, we get the following.

Lemma 3: (4-equal parts property) The norm $\delta_1 = \|P_1\|_{\infty \to 1} = 4\|P_1\|_{\boxdot} \ge 4\sum_{j \in T} \sum_{i \in S} x_{ij}$.

In data analysis, Lemma 3 implies balanced 2-blocks seriation of $P_1$; see example 1. The subsets $T_{opt}$ and $S_{opt}$ are positively associated and $\|P_1\|_{\boxdot} = \sum_{j \in T_{opt}} \sum_{i \in S_{opt}} x_{ij}$; similarly, the subsets $\bar{T}_{opt}$ and $\bar{S}_{opt}$ are positively associated and $\|P_1\|_{\boxdot} = \sum_{j \in \bar{T}_{opt}} \sum_{i \in \bar{S}_{opt}} x_{ij}$. In contrast, the subsets $\bar{T}_{opt}$ and $S_{opt}$ are negatively associated and $\|P_1\|_{\boxdot} = -\sum_{j \in \bar{T}_{opt}} \sum_{i \in S_{opt}} x_{ij}$; similarly, the subsets $T_{opt}$ and $\bar{S}_{opt}$ are negatively associated and $\|P_1\|_{\boxdot} = -\sum_{j \in T_{opt}} \sum_{i \in \bar{S}_{opt}} x_{ij}$. [
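For a table with few columns, the combinatorial maximization in (11) can be carried out by exhaustive search. The sketch below does this for the asbestos contingency table analyzed later in the paper and checks the 4-equal parts property; the function name `taxicab_axis` is hypothetical, and enumeration of all $2^m$ sign vectors is a stand-in for whatever algorithm the authors use, practical only for small m.

```python
from itertools import product
import numpy as np

N = np.array([[310, 36, 0, 0],
              [212, 158, 9, 0],
              [21, 35, 17, 4],
              [25, 102, 49, 18],
              [7, 35, 51, 28]], dtype=float)
P = N / N.sum()
P1 = P - np.outer(P.sum(axis=1), P.sum(axis=0))

def taxicab_axis(M):
    """Exhaustive maximization of ||M u||_1 over u in {-1,+1}^m."""
    best_val, best_u = -1.0, None
    for u in product([-1.0, 1.0], repeat=M.shape[1]):
        u = np.array(u)
        val = np.abs(M @ u).sum()
        if val > best_val:
            best_val, best_u = val, u
    a = M @ best_u                     # row projections, as in (13)
    v = np.where(a >= 0, 1.0, -1.0)    # v = sign(a), as in (15); zeros sent to +1
    b = M.T @ v                        # column projections, as in (14)
    return best_val, best_u, v, a, b

delta1, u1, v1, a1, b1 = taxicab_axis(P1)
S_opt, T_opt = v1 > 0, u1 > 0
cut = P1[np.ix_(S_opt, T_opt)].sum()   # block sum on the positively associated subsets
print(delta1, cut)
```

The search returns $\delta_1 \approx 0.5328$ and a block sum of about 0.1332, in agreement with the values $\delta_1 = 4 \times 0.1332$ reported for this table.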

Using Definition 2, we get

Definition 3: The relative contribution of the row $i$ to $\delta_1$ (respectively of the column $j$ to $\delta_1$) is

$$RC_{\delta_1}(\text{row } i) = \frac{|a_1(i)|}{\delta_1} \quad \text{and} \quad RC_{\delta_1}(\text{col } j) = \frac{|b_1(j)|}{\delta_1}.$$

We have

$$0 \le RC_{\delta_1}(\text{row } i) \le 0.5 \quad \text{and} \quad 0 \le RC_{\delta_1}(\text{col } j) \le 0.5.$$

Definition 4: 1) On the first taxicab principal axis the row $i$ is heavyweight if $RC_{\delta_1}(\text{row } i) = 0.5$, and the column $j$ is heavyweight if $RC_{\delta_1}(\text{col } j) = 0.5$.

2) On the first taxicab principal axis the cell $(i, j)$ is heavyweight if and only if both row $i$ and column $j$ are heavyweights; in this case $RC_{\delta_1}(p_{ij} - p_{i*} p_{*j}) = |p_{ij} - p_{i*} p_{*j}| / \delta_1 = 0.25$.
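Definitions 3 and 4 can be illustrated on the asbestos contingency table analyzed later in the paper. The sketch below takes the first axis $u_1 = (-1, 1, 1, 1)$ from the paper's table as given (rather than recomputing it) and derives the relative contributions of rows and columns; the variable names are illustrative.

```python
import numpy as np

N = np.array([[310, 36, 0, 0],
              [212, 158, 9, 0],
              [21, 35, 17, 4],
              [25, 102, 49, 18],
              [7, 35, 51, 28]], dtype=float)
P = N / N.sum()
P1 = P - np.outer(P.sum(axis=1), P.sum(axis=0))

u1 = np.array([-1.0, 1.0, 1.0, 1.0])   # first axis as reported in the paper's table
a1 = P1 @ u1                            # (13)
v1 = np.where(a1 >= 0, 1.0, -1.0)       # (15)
b1 = P1.T @ v1                          # (14)
delta1 = np.abs(a1).sum()               # (17)

rc_rows = np.abs(a1) / delta1           # Definition 3, rows
rc_cols = np.abs(b1) / delta1           # Definition 3, columns
heavy_rows = np.isclose(rc_rows, 0.5)   # Definition 4, part 1
heavy_cols = np.isclose(rc_cols, 0.5)
print(rc_rows.round(4), rc_cols.round(4))
```

In this table the column G0 is a heavyweight, since it stands alone against the other three grades on the first axis, while no row reaches the bound 0.5; consequently no cell is a heavyweight.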

For an application of Definitions 3 and 4 see [

Using Wedderburn’s rank-1 reduction rule, see [

double-centered, and repeat the above procedure. After $k = \mathrm{rank}(P_1)$ iterations, we decompose the correspondence matrix $P$ into $(k + 1)$ bilinear parts

$$p_{ij} = p_{i*} p_{*j} + \sum_{\alpha=1}^{k} \frac{a_\alpha(i)\, b_\alpha(j)}{\delta_\alpha},$$

named taxicab singular value decomposition; which can be rewritten, similar to the data reconstruction formula in correspondence analysis (CA), as

$$p_{ij} = p_{i*} p_{*j} \left( 1 + \sum_{\alpha=1}^{k} \frac{f_\alpha(i)\, g_\alpha(j)}{\delta_\alpha} \right),$$

where in TCA

$$f_\alpha(i) = a_\alpha(i) / p_{i*} \quad \text{and} \quad g_\alpha(j) = b_\alpha(j) / p_{*j}. \qquad (19)$$

We note that Equations (5) through (18) are valid for higher residual correspondence matrices $P_\alpha$ for $\alpha = 1, \ldots, k$.
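The whole decomposition can be sketched in a few lines. The code below assumes that each residual matrix is obtained by the rank-1 update $P_{\alpha+1} = P_\alpha - a_\alpha b_\alpha' / \delta_\alpha$, which is the update consistent with the reconstruction formula above; the function names are hypothetical and the axis search is exhaustive, so the sketch is suitable only for tables with few columns.

```python
from itertools import product
import numpy as np

def first_axis(M):
    """Exhaustive search for max ||M u||_1 over sign vectors u; returns (delta, a, b)."""
    best_val, best_u = -1.0, None
    for u in product([-1.0, 1.0], repeat=M.shape[1]):
        u = np.array(u)
        val = np.abs(M @ u).sum()
        if val > best_val:
            best_val, best_u = val, u
    a = M @ best_u
    v = np.where(a >= 0, 1.0, -1.0)
    return best_val, a, M.T @ v

def taxicab_svd(P, tol=1e-12):
    """Taxicab decomposition of a correspondence matrix by successive rank-1 reductions."""
    r, c = P.sum(axis=1), P.sum(axis=0)
    R = P - np.outer(r, c)
    deltas, A, B = [], [], []
    for _ in range(min(P.shape) - 1):      # rank of the residual is at most min(n, m) - 1
        if np.abs(R).sum() <= tol:
            break
        delta, a, b = first_axis(R)
        deltas.append(delta); A.append(a); B.append(b)
        R = R - np.outer(a, b) / delta     # assumed rank-1 reduction step
    return r, c, np.array(deltas), np.array(A), np.array(B)

N = np.array([[310, 36, 0, 0], [212, 158, 9, 0], [21, 35, 17, 4],
              [25, 102, 49, 18], [7, 35, 51, 28]], dtype=float)
P = N / N.sum()
r, c, deltas, A, B = taxicab_svd(P)
F = A / r                                  # row factor scores, as in (19)
G = B / c                                  # column factor scores, as in (19)
recon = np.outer(r, c) + sum(np.outer(A[k], B[k]) / deltas[k] for k in range(len(deltas)))
print(deltas.round(4))
```

On the asbestos table, the first two dispersion values agree with $\delta_1 = 4 \times 0.1332$ and $\delta_2 = 4 \times 0.0533$, and the sum of the bilinear parts reproduces $P$ up to rounding error. The signs of the scores depend on which of $u$ and $-u$ the search returns.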

CA and TCA satisfy an important invariance property: columns (or rows) with identical profiles (conditional probabilities) receive identical factor scores $g_\alpha(j)$ (or $f_\alpha(i)$). The factor scores are used in the graphical displays. Moreover, merging of identical profiles does not change the results of the data analysis: This is named the principle of equivalent partitioning by [

In the next subsections we shall present two data sets, where taxicab correspondence analysis (TCA) is applied. The first data set is a small contingency table taken from [

The theory of CA can be found, among others, in [

Asbestos grade diagnosed, by exposure in years:

Exposure in years | None (G0) | Grade 1 (G1) | Grade 2 (G2) | Grade 3 (G3) | Total | $p_{i*}$ |
---|---|---|---|---|---|---|
0 - 9 | 310 | 36 | 0 | 0 | 346 | 0.3098 |
10 - 19 | 212 | 158 | 9 | 0 | 379 | 0.3393 |
20 - 29 | 21 | 35 | 17 | 4 | 77 | 0.0689 |
30 - 39 | 25 | 102 | 49 | 18 | 194 | 0.1737 |
40+ | 7 | 35 | 51 | 28 | 121 | 0.1083 |
Total | 575 | 366 | 126 | 50 | 1117 | 1 |
$p_{*j}$ | 0.5148 | 0.3277 | 0.1128 | 0.0448 | 1 | |

Asbestos grade diagnosed, by exposure in years:

Exposure in years | None (G0) | Grade 1 (G1) | Grade 2 (G2) | Grade 3 (G3) | Total | $v_1$ | $a_1$ | $f_1$ |
---|---|---|---|---|---|---|---|---|
0 - 9 | 0.1181 | −0.0693 | −0.0349 | −0.0139 | 0 | −1 | −0.2362 | −0.7624 |
10 - 19 | 0.0151 | 0.0303 | −0.0302 | −0.0152 | 0 | −1 | −0.0303 | −0.0892 |
20 - 29 | −0.0167 | 0.0087 | 0.0074 | 0.0005 | 0 | 1 | 0.0334 | 0.4841 |
30 - 39 | −0.0670 | 0.0344 | 0.0243 | 0.0083 | 0 | 1 | 0.1340 | 0.7718 |
40+ | −0.0495 | −0.0042 | 0.0334 | 0.0202 | 0 | 1 | 0.0990 | 0.9138 |
Total | 0 | 0 | 0 | 0 | 0 | | | |
$u_1$ | −1 | 1 | 1 | 1 | | | | |
$b_1$ | −0.2664 | 0.0780 | 0.1302 | 0.0582 | | | | |
$g_1$ | −0.5175 | 0.2380 | 1.1553 | 1.2981 | | | | |

$\delta_1 = 4 \times 0.1332$ and $\|P_1\|_{\boxdot} = 0.1332$; the four block sums of the seriated table are 0.1332, $|-0.1332|$, $|-0.1332|$ and 0.1332.

Asbestos grade diagnosed, by exposure in years:

Exposure in years | None (G0) | Grade 1 (G1) | Grade 2 (G2) | Grade 3 (G3) | Total | $v_2$ | $a_2$ | $f_2$ |
---|---|---|---|---|---|---|---|---|
0 - 9 | 0 | −0.0347 | 0.0228 | 0.0119 | 0 | 1 | 0.0694 | 0.2241 |
40+ | 0 | −0.0186 | 0.0092 | 0.0094 | 0 | 1 | 0.0372 | 0.3443 |
10 - 19 | 0 | 0.0347 | −0.0228 | −0.0119 | 0 | −1 | −0.0694 | −0.2046 |
20 - 29 | 0 | 0.0038 | −0.0007 | −0.0031 | 0 | −1 | −0.0076 | −0.1121 |
30 - 39 | 0 | 0.0148 | −0.0085 | −0.0063 | 0 | −1 | −0.0296 | −0.1703 |
Total | 0 | 0 | 0 | 0 | 0 | | | |
$u_2$ | ±1 | −1 | 1 | 1 | | | | |
$b_2$ | 0 | −0.1066 | 0.0640 | 0.0426 | | | | |
$g_2$ | 0 | −0.3257 | 0.5681 | 0.9521 | | | | |

$\delta_2 = 4 \times 0.0533$ and $\|P_2\|_{\boxdot} = 0.0533$; the four block sums of the seriated table are $|-0.0533|$, 0.0533, 0.0533 and $|-0.0533|$.

1) The first dimension contrasts South American countries and organizations on the one hand, and Central American countries and organizations on the other hand.

2) The second dimension clearly distinguishes Canada and the United States (both North American countries) along with NAFTA from other countries and organizations. In CA, the relative contribution of Canada (resp. US) to the second axis is $RC_{\sigma_2^2}(\text{Canada}) = RC_{\sigma_2^2}(\text{US}) = 0.409$, and $RC_{\sigma_2^2}(\text{NAFTA}) = 0.821$, where $\sigma_2^2$ is the variance, also named inertia, of the second principal dimension.

3) Organizations (SELA, OAS, and IDB) are in the center because they have membership profiles that are similar to the marginal profile: almost all countries belong to (SELA, OAS, and IDB), see

Membership of countries in regional trade and treaty organizations (columns 1 to 15):

Countries | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | Sum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Argentina | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 7 |
Belize | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 5 |
Bolivia | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 8 |
Brazil | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 8 |
Canada | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 3 |
Chile | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 5 |
Colombia | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 10 |
Costa Rica | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 6 |
Ecuador | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 8 |
El Salvador | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 7 |
Guatemala | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 7 |
Guyana | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 7 |
Honduras | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 7 |
Mexico | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 9 |
Nicaragua | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 6 |
Panama | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 6 |
Paraguay | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 6 |
Peru | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 8 |
Suriname | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 5 |
United States | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 3 |
Uruguay | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 7 |
Venezuela | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 10 |
Sum | 12 | 11 | 8 | 5 | 2 | 16 | 11 | 3 | 22 | 5 | 3 | 22 | 3 | 6 | 20 | 149 |

groups, northern (Venezuela, Bolivia, Peru and Ecuador) and southern countries (Brazil, Uruguay, Argentina, Paraguay and Chile). Furthermore, the contributions of the points Canada, the United States, and NAFTA to the second axis are not substantial compared to CA: $RC_{\delta_2}(\text{Canada}) = RC_{\delta_2}(\text{US}) = 0.088$, and $RC_{\delta_2}(\text{NAFTA}) = 0.10$. This shows the robustness of TCA due to the robustness of the $\delta$ statistic following Definition 1.

It is well known that CA is very sensitive to some particularities of a data set; further, how to identify and handle these is an open problem. However, for contingency tables [

[_{1} variant of their approach. Let $Y = (y_{ij})$ be a 2-way array for $i \in I$, $j \in J$. As usual, we define, for instance, $\bar{y}_{*j} = \sum_{i=1}^{n} y_{ij} / n$ and $\bar{y}_{**} = \sum_{j=1}^{m} \sum_{i=1}^{n} y_{ij} / (mn)$. Let $X = (x_{ij})$ be the additive double-centered array, where

$$x_{ij} = y_{ij} - \bar{y}_{i*} - \bar{y}_{*j} + \bar{y}_{**}.$$

In the jargon of statistics, the cell $x_{ij}$ represents the additive 2-way interaction of the cell $(i, j) \in I \times J$. The matrix $X$ is double-centered, and it satisfies Equations (6) through (10). Let $I = \bigcup_{\alpha=1}^{r} S_\alpha$ be an r-partition of $I$ and $J = \bigcup_{\beta=1}^{c} T_\beta$ be a c-partition of $J$. We consider the following maximization of the overall interaction problem for $p \ge 1$

$$f_p(S_\alpha, T_\beta : \alpha = 1, \ldots, r \text{ and } \beta = 1, \ldots, c) = \sum_{\alpha=1}^{r} \sum_{\beta=1}^{c} |S_\alpha|\, |T_\beta|\, g_p(\alpha, \beta),$$

where $|S_\alpha|$ is the cardinality of the set $S_\alpha$ and

$$g_p(\alpha, \beta) = \left( \left| \frac{\sum_{i \in S_\alpha} \sum_{j \in T_\beta} x_{ij}}{|S_\alpha|\, |T_\beta|} \right| \right)^{p}.$$

When $p = 2$, then maximizing $f_2(S_\alpha, T_\beta : \alpha = 1, \ldots, r \text{ and } \beta = 1, \ldots, c)$, named maximal overall interaction, is the criterion computed in [
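The criterion $f_p$ is straightforward to evaluate for a given pair of partitions. The following sketch uses a randomly generated array and hypothetical names (`double_center`, `overall_interaction`); it only evaluates the criterion for fixed partitions and does not search for the maximizing ones.

```python
import numpy as np

def double_center(Y):
    """Additive double centering: x_ij = y_ij - ybar_i* - ybar_*j + ybar_**."""
    Y = np.asarray(Y, dtype=float)
    return Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()

def overall_interaction(X, row_labels, col_labels, p=1):
    """Evaluate f_p for the partitions encoded by integer row and column labels."""
    row_labels = np.asarray(row_labels)
    col_labels = np.asarray(col_labels)
    total = 0.0
    for alpha in np.unique(row_labels):
        for beta in np.unique(col_labels):
            block = X[np.ix_(row_labels == alpha, col_labels == beta)]
            size = block.size                       # |S_alpha| * |T_beta|
            total += size * abs(block.sum() / size) ** p
    return total

rng = np.random.default_rng(0)
X = double_center(rng.normal(size=(5, 4)))          # illustrative random data
rows = [0, 0, 1, 1, 1]                              # a 2-partition of the rows
cols = [0, 1, 1, 0]                                 # a 2-partition of the columns
f1 = overall_interaction(X, rows, cols, p=1)
f2 = overall_interaction(X, rows, cols, p=2)
one_block = X[np.ix_(np.array(rows) == 0, np.array(cols) == 0)].sum()
print(f1, 4 * abs(one_block), f2)
```

For $p = 1$ and $r = c = 2$, the criterion reduces to four times the absolute sum of any one block, because by (8)-(10) the four blocks of a double-centered array have equal absolute sums.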

To motivate our subject, we start with an example. Let $Y = (y_{ijk})$ be a 3-way array for $i \in I$, $j \in J$ and $k \in K = \{1, 2, \ldots, t\}$. As usual, we define, for instance, $\bar{y}_{ij*} = \sum_{k=1}^{t} y_{ijk} / t$, $\bar{y}_{*j*} = \sum_{k=1}^{t} \sum_{i=1}^{n} y_{ijk} / (tn)$ and $\bar{y}_{***} = \sum_{k=1}^{t} \sum_{j=1}^{m} \sum_{i=1}^{n} y_{ijk} / (tmn)$. Let $X = (x_{ijk})$ be the triple-centered array, where

$$x_{ijk} = y_{ijk} - \bar{y}_{ij*} - \bar{y}_{i*k} - \bar{y}_{*jk} + \bar{y}_{i**} + \bar{y}_{*j*} + \bar{y}_{**k} - \bar{y}_{***}.$$

In the jargon of statistics, the cell $x_{ijk}$ represents the additive 3-way interaction of the cell $(i, j, k) \in I \times J \times K$. The tensor $X$ is triple-centered; that is,

$$\sum_{k=1}^{t} x_{ijk} = \sum_{j=1}^{m} x_{ijk} = \sum_{i=1}^{n} x_{ijk} = 0.$$

A generalization of Lemma 3 is

Lemma 4: (8-equal parts property) The tensor norm

$$\begin{aligned}
\|X\|_{(\infty, \infty) \to 1} &= \max \sum_{k \in K} \sum_{j \in J} \sum_{i \in I} w_k v_j u_i x_{ijk} \ \text{ subject to } u \in \{-1, +1\}^n,\ v \in \{-1, +1\}^m,\ w \in \{-1, +1\}^t \\
&= 8 \sum_{k \in W_{opt}} \sum_{j \in T_{opt}} \sum_{i \in S_{opt}} x_{ijk} \ \ge\ 8 \sum_{k \in W} \sum_{j \in T} \sum_{i \in S} x_{ijk},
\end{aligned}$$

where $W \subset K$.

The proof is similar to the proof of Lemma 3.

Lemma 4 can easily be generalized to higher-way arrays.
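The 8-equal parts property can be checked by enumeration on a small triple-centered array. The sketch below uses randomly generated data and hypothetical helper names (`triple_center`, `sign_vectors`); the exhaustive search over sign vectors is practical only for very small arrays.

```python
from itertools import product
import numpy as np

def triple_center(Y):
    """Remove the grand mean, main effects and 2-way means, leaving the 3-way interaction."""
    Y = np.asarray(Y, dtype=float)
    def mean(*axes):
        return Y.mean(axis=axes, keepdims=True)
    return (Y - mean(2) - mean(1) - mean(0)
            + mean(1, 2) + mean(0, 2) + mean(0, 1) - Y.mean())

def sign_vectors(size):
    """All vectors in {-1,+1}^size."""
    return [np.array(s) for s in product([-1.0, 1.0], repeat=size)]

rng = np.random.default_rng(1)
X = triple_center(rng.normal(size=(3, 3, 2)))   # illustrative random array
n, m, t = X.shape

best, arg = -np.inf, None
for u in sign_vectors(n):
    for v in sign_vectors(m):
        for w in sign_vectors(t):
            val = np.einsum("ijk,i,j,k->", X, u, v, w)
            if val > best:
                best, arg = val, (u, v, w)

u_opt, v_opt, w_opt = arg
# Block sum over S_opt x T_opt x W_opt, the index sets where the signs are +1
block = X[np.ix_(u_opt > 0, v_opt > 0, w_opt > 0)].sum()
print(best, 8 * block)
```

The maximal trilinear form equals eight times the sum over the optimal block, because every sum of a triple-centered array over a full index range vanishes.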

This essay is an attempt to emphasize the following two points.

First, we showed the optimality and robustness of the mean absolute deviations about the mean, its interpretation, and its generalization to higher-way arrays. A key notion in describing its robustness is that the relative contribution of a point is bounded by 50%.

Second, within the framework of TCA, we showed that the identities $\delta_1 = \|P_1\|_{\infty \to 1} = 4\|P_1\|_{\boxdot}$ reveal three different but related aspects of TCA: 1) $\delta_1$, computed in (17) and (18), by (19) represents the mean absolute deviations about the mean statistic; 2) the taxicab norm $\|P_1\|_{\infty \to 1}$, via (15) and (16), shows that uniform weights are assigned to the columns and the rows; 3) the cut norm $4\|P_1\|_{\boxdot}$ shows that the computation of each principal dimension of TCA corresponds to balanced 2-blocks seriation, with equality of the cut norm in the 4 associated blocks.

A list of the principal variables used is provided in Appendix B.

We thank the Editor and the referee for their comments. Research of V. Choulakian is funded by the Natural Sciences and Engineering Research Council of Canada grant RGPIN-2017-05092. This support is greatly appreciated. The authors thank William Alexander Digout for help in computations.

The authors declare no conflicts of interest regarding the publication of this paper.

Choulakian, V. and Abou-Samra, G. (2020) Mean Absolute Deviations about the Mean, the Cut Norm and Taxicab Correspondence Analysis. Open Journal of Statistics, 10, 97-112. https://doi.org/10.4236/ojs.2020.101008

1) Association of Caribbean States (ACS): Trade group sponsored by the Caribbean Community and Common Market (CARICOM).

2) Latin American Integration Association (ALADI): Free trade organization.

3) Amazon Pact: Promotes development of Amazonian territories.

4) Andean Pact: Promotes development of members through economic and social integration.

5) Caribbean Community and Common Market (CARICOM): Caribbean trade organization; promotes economic development of members.

6) Group of Latin American and Caribbean Sugar Exporting Countries (GEPLACEA): Sugar-producing and exporting countries.

7) Group of Rio: Organization for joint political action.

8) Group of Three (G-3): Trade organization.

9) Inter-American Development Bank (IDB): Promotes development of member nations.

10) South American Common Market (MERCOSUR): Increases economic cooperation in the region.

11) North American Free Trade Agreement (NAFTA): Free trade organization.

12) Organization of American States (OAS): Promotes peace, security, economic, and social development in the Western Hemisphere.

13) Central American Parliament (PARLACÉN): Works for the political integration of Central America.

14) San José Group: Promotes regional economic integration.

15) Latin American Economic System (SELA): Promotes economic and social development of member nations.

Appendix B: A list of the principal variables used

Mean absolute deviations about the mean of a sample: $d = \sum_{i=1}^{n} |y_i - \bar{y}| / n$

Mean absolute deviations of a sample about the median: $LAD = \sum_{i=1}^{n} |y_i - \mathrm{median}| / n$

Variance of a sample: $s^{2} = \sum_{i=1}^{n} (y_i - \bar{y})^{2} / n$

Cut norm of a centered sample: $\|y - \bar{y} 1_n\|_{\boxdot} = \max_{S} \sum_{i \in S} (y_i - \bar{y})$, where $S \subset I = \{1, 2, \ldots, n\}$

Taxicab operator norm of a double-centered matrix: $\delta_\alpha = \|P_\alpha\|_{\infty \to 1} = \max_{u \in \mathbb{R}^m} \|P_\alpha u\|_1 / \|u\|_\infty$

Cut norm of a double-centered matrix: $\|P_\alpha\|_{\boxdot} = \max_{S, T} \sum_{j \in T} \sum_{i \in S} P_\alpha(i, j)$, where $T \subset J = \{1, 2, \ldots, m\}$

$\delta_\alpha$ is the dispersion value of the αth taxicab principal axis

$f_\alpha(i)$ is the taxicab principal factor score of row $i$ on the αth principal axis and $\delta_\alpha = \sum_{i=1}^{n} p_{i*} |f_\alpha(i)|$

$g_\alpha(j)$ is the taxicab principal factor score of column $j$ on the αth principal axis and $\delta_\alpha = \sum_{j=1}^{m} p_{*j} |g_\alpha(j)|$