TITLE:
Assessing spatial genetic structure from molecular marker data via principal component analyses: A case study in a Prosopis sp. forest
AUTHORS:
Ingrid Teich, Aníbal Verga, Mónica Balzarini
KEYWORDS:
Multivariate Analysis; Forests; Molecular Markers; Spatial Genetics; sPCA
JOURNAL NAME:
Advances in Bioscience and Biotechnology, Vol.5 No.2, January 24, 2014
ABSTRACT:
Advances in genotyping technology, such as molecular
markers, have noticeably improved our capacity to characterize genomes at
multiple loci. Concomitantly, the methodological framework for analyzing genetic
data has expanded, and keeping abreast of the latest statistical developments
for analyzing molecular marker data in the context of spatial genetics has
become a difficult task. Most methods in spatial statistics are devoted to
univariate data, whereas molecular marker data are high-dimensional.
Multivariate methods aim to find proximities between
entities characterized by multiple variables by summarizing the information in
a few synthetic variables. In particular, principal component analysis (PCA)
has been used to study the genetic structure of geo-referenced allele frequency profiles,
incorporating spatial information through a posteriori analysis. In contrast, the
recently developed spatially restricted PCA (sPCA) explicitly includes
spatial data in the optimization criterion. In this work, we compared the
results of applying PCA and sPCA to the study of the fine-scale spatial genetic
structure of a Prosopis flexuosa and P. chilensis hybrid swarm. The data
consisted of the genetic characterization of 87 trees sampled in
Córdoba, Argentina, and genotyped at six microsatellites, which yielded 72
alleles. As expected, principal components explained more variance than sPCA
components, but were less spatially autocorrelated. The maps obtained by the
interpolation of sPC1 values allowed a better visualization of a patchy spatial
pattern of genetic variability than the PC1 synthetic map. We also proposed a
PC-sPC scatter plot of allele loadings to better understand the allele contributions
to spatial genetic variability.
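
The contrast between the two criteria can be illustrated with a minimal sketch. It assumes the common formulation of sPCA in which the first axis maximizes the product of the variance and Moran's I of the scores, rather than reproducing the authors' own computations. The k-nearest-neighbour weights, the function names, and the simulated 87 x 72 matrix are hypothetical stand-ins for the real connection network and allele data.

```python
import numpy as np

def knn_weights(coords, k=5):
    """Row-standardized k-nearest-neighbour weights (an assumed connection network)."""
    n = coords.shape[0]
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)            # a tree is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.arange(n)[:, None], nearest] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(scores, W):
    """Moran's I of a score vector under spatial weights W."""
    z = scores - scores.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def first_axis(X, W=None):
    """First PCA axis (W is None) or first sPCA axis (W given): scores and loadings."""
    Xc = X - X.mean(axis=0)                   # centre each allele column
    n = Xc.shape[0]
    if W is None:
        M = Xc.T @ Xc / n                     # PCA: covariance matrix
    else:
        M = Xc.T @ (W + W.T) @ Xc / (2 * n)   # sPCA: variance times Moran's I
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    loadings = vecs[:, -1]                    # axis with the largest eigenvalue
    return Xc @ loadings, loadings

# Simulated stand-in data: 87 trees, 72 allele columns (NOT the study data).
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 1.0, size=(87, 2))
patch = np.sin(2 * np.pi * coords[:, 0]) * np.sin(2 * np.pi * coords[:, 1])
X = rng.normal(size=(87, 72))
X[:, :10] += patch[:, None]                   # ten alleles carry a patchy signal

W = knn_weights(coords)
pc1, pc1_loadings = first_axis(X)
spc1, spc1_loadings = first_axis(X, W)

print("variance  PC1:", np.var(pc1), " sPC1:", np.var(spc1))
print("Moran's I PC1:", morans_i(pc1, W), " sPC1:", morans_i(spc1, W))
```

Plotting pc1_loadings against spc1_loadings, one point per allele, gives a rough analogue of the PC-sPC scatter plot of allele loadings mentioned above.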