Multivariate Statistical Analysis of Large Datasets: Single Particle Electron Microscopy

PP. 701-739
DOI: 10.4236/ojs.2016.64059

ABSTRACT

Biology is a challenging and complicated mess. Understanding this complexity is the realm of the biological sciences: trying to make sense of massive, messy data by discovering patterns and revealing the underlying general rules. Among the most powerful mathematical tools for organizing and structuring complex, heterogeneous and noisy data are those provided by multivariate statistical analysis (MSA) approaches. These eigenvector/eigenvalue data-compression approaches were first introduced to electron microscopy (EM) in 1980 to help sort out different views of macromolecules in a micrograph. After 35 years of continuous use and development, new MSA applications are still being proposed regularly. The speed of computing has increased dramatically in the decades since their first use in electron microscopy. However, we have also seen a possibly even more rapid increase in the size and complexity of the EM data sets to be studied. MSA computations had thus become a very serious bottleneck limiting their general use. The parallelization of our programs, which speeds up the process by orders of magnitude, has opened whole new avenues of research. The speed of the automatic classification in the compressed eigenvector space had also become a bottleneck that needed to be removed. In this paper we explain the basic principles of multivariate statistical eigenvector-eigenvalue data compression; we provide practical tips and application examples for those working in structural biology; and we provide the more experienced researcher in this and other fields with the formulas associated with these powerful MSA approaches.

Cite this paper

van Heel, M., Portugal, R. and Schatz, M. (2016) Multivariate Statistical Analysis of Large Datasets: Single Particle Electron Microscopy. Open Journal of Statistics, 6, 701-739. doi: 10.4236/ojs.2016.64059.


  

Copyright © 2017 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.