TITLE:
Review of Dimension Reduction Methods
AUTHORS:
Salifu Nanga, Ahmed Tijani Bawah, Benjamin Ansah Acquaye, Mac-Issaka Billa, Francis Delali Baeta, Nii Afotey Odai, Samuel Kwaku Obeng, Ampem Darko Nsiah
KEYWORDS:
Dimension Reduction, Machine Learning, Linear Dimension Reduction Techniques, Non-Linear Reduction Techniques
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.9 No.3, August 31, 2021
ABSTRACT: Purpose: This study sought to review the characteristics, strengths, weaknesses,
variants, application areas and data types associated with the various Dimension Reduction (DR) techniques. Methodology: The
databases most commonly used to search for papers were ScienceDirect,
Scopus, Google Scholar, IEEE Xplore and Mendeley. An integrative review was
conducted in which 341 papers were reviewed. Results: The linear
techniques considered were Principal Component Analysis (PCA), Linear Discriminant
Analysis (LDA), Singular Value Decomposition (SVD), Latent Semantic Analysis
(LSA), Locality Preserving Projections (LPP), Independent Component Analysis
(ICA) and Projection Pursuit (PP). The non-linear techniques considered, which were developed
for applications with complex non-linear structures, were Kernel Principal Component
Analysis (KPCA), Multi-dimensional
Scaling (MDS), Isomap, Locally Linear Embedding (LLE), Self-Organizing Map
(SOM), Learning Vector Quantization (LVQ), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and
Projection (UMAP). DR techniques can further be categorized into supervised,
unsupervised and, more recently, semi-supervised learning methods. The supervised
techniques are LDA and LVQ; all the other techniques are unsupervised.
Supervised variants of PCA, LPP, KPCA and MDS have been developed.
Supervised and semi-supervised variants of PP and t-SNE have also been
developed, as has a semi-supervised version of LDA. Conclusion: The various application areas, strengths, weaknesses and variants of the DR
techniques were explored, as were the different data types to which the
techniques have been applied.
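
As a rough illustration of what a linear DR technique does, the sketch below projects data onto its leading principal components using an SVD, two of the methods named in the abstract. It is not taken from the paper: the function name pca_project, the toy data and the choice of two components are assumptions made for this example only.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components via SVD.

    Hypothetical helper for illustration; not from the reviewed paper.
    """
    X = np.asarray(X, dtype=float)
    # PCA operates on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Thin SVD: the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Sample variance captured along each principal direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    Z = X_centered @ components.T
    return Z, components, explained_variance[:n_components]


# Assumed toy data: 200 points in 3-D that vary mostly along one direction.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t, 0.5 * t]) + 0.05 * rng.normal(size=(200, 3))

Z, components, variance = pca_project(X, n_components=2)
print(Z.shape)   # (200, 2)
print(variance)  # the first value dominates for this toy data
```

The projection is a single matrix product, which is what makes the technique linear; the non-linear methods listed above replace this fixed linear map with kernel, neighborhood-graph or probabilistic constructions.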