High Dimensional Dataset Compression Using Principal Components


Until recently, computational power was insufficient to diagonalize atmospheric datasets of order 10^8 to 10^9 elements. Eigenanalysis of tens of thousands of variables can now achieve massive data compression for spatial fields with strong correlation properties. Applying eigenanalysis to 26,394 variable dimensions for three severe weather datasets (tornado, hail and wind) retains 9 to 11 principal components, which explain 42% to 52% of the variability. Rotated principal components (RPCs) detect localized, coherent structures in the data variance for each outbreak type and are related to standardized anomalies of the meteorological fields. Our analyses of the RPC loadings and scores show that graphical displays of these quantities can efficiently reduce and interpret large datasets. The data are analyzed 24 hours prior to severe weather as a forecasting aid. RPC loadings of the sea-level pressure fields show a different morphology for each outbreak type. Analysis of low-level moisture and temperature RPCs suggests that the moisture fields for hail and wind outbreaks are more closely related to each other than to those for tornado outbreaks. Consequently, these patterns can identify precursors of severe weather and discriminate between tornadic and non-tornadic outbreaks.
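The compression step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `pca_compress`, the synthetic data, and the array sizes are all assumptions chosen for illustration, and the rotation step that produces RPCs is not shown. The sketch standardizes each variable, takes a thin singular value decomposition (which avoids forming the full variable-by-variable correlation matrix when there are far more variables than cases), and keeps the leading k components.

```python
import numpy as np

def pca_compress(X, k):
    """Hypothetical helper: PCA of standardized anomalies via thin SVD.

    X has shape (n_cases, n_variables). Returns the leading k scores,
    the k component patterns, and the fraction of variance each explains.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardized anomalies
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s**2 / (X.shape[0] - 1)         # eigenvalues of the correlation matrix
    explained = eigvals / eigvals.sum()       # fraction of total variance per component
    scores = U[:, :k] * s[:k]                 # case-by-component amplitudes
    components = Vt[:k]                       # orthonormal spatial patterns
    return scores, components, explained[:k]

# Synthetic stand-in for correlated spatial fields (assumed sizes, not the paper's):
# a few latent patterns plus noise.
rng = np.random.default_rng(0)
n_cases, n_vars, n_modes = 60, 500, 3
latent = rng.normal(size=(n_cases, n_modes))
patterns = rng.normal(size=(n_modes, n_vars))
X = latent @ patterns + 0.5 * rng.normal(size=(n_cases, n_vars))

scores, components, explained = pca_compress(X, k=3)

# Storing scores and patterns instead of the full field gives the compression.
compression_ratio = X.size / (scores.size + components.size)
print("variance explained by 3 components:", explained.sum())
print("compression ratio:", compression_ratio)
```

With strongly correlated variables, a handful of components carries most of the variance, so the scores and patterns together occupy far less storage than the original matrix. The same idea scales to the tens of thousands of variables mentioned above, subject to memory limits.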

Share and Cite:

M. Richman, A. Mercer, L. Leslie, C. Doswell III and C. Shafer, "High Dimensional Dataset Compression Using Principal Components," Open Journal of Statistics, Vol. 3 No. 5, 2013, pp. 356-366. doi: 10.4236/ojs.2013.35041.

Conflicts of Interest

The authors declare no conflicts of interest.



Copyright © 2021 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.