Biometrical Letters Vol. 44(1), 2007, pp. 33-49


A COMPARISON BETWEEN PRINCIPAL COMPONENT ANALYSIS
AND NONLINEAR PRINCIPAL COMPONENT ANALYSIS


Paulo Canas Rodrigues

Department of Mathematics, Faculty of Science and Technology, Nova University of Lisbon, Monte da Caparica, 2829-516 Caparica, Portugal, e-mail: paulocanas@fct.unl.pt


When dealing with large data sets, one of the main problems is how to extract the relevant information. Principal component analysis (PCA) is the most widely used technique for reducing the dimensionality of a data set while preserving its significant features. However, it is not directly applicable to categorical variables, since PCA requires variables that are at least interval scaled. For categorical variables (ordinal or nominal), nonlinear principal component analysis (NLPCA) should be used instead. We present a comparison between PCA and NLPCA from a practical point of view. With this comparison we intend to show how the results obtained from the inappropriate use of PCA (instead of NLPCA) may be misleading when the variables have different measurement levels. Analyses of two real data sets, one concerning characteristics of the countries of the European Union and the other concerning variables measured in people with heart failure, are presented, and their interpretation in the context of our research is discussed.


Keywords: PCA, nonlinear PCA, CATPCA, European Union, heart failure.
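
The abstract's point that PCA assumes at least interval-scaled variables can be illustrated with a minimal sketch. It is not taken from the paper: the data are simulated, and the function name `pca_explained_variance` and the two integer codings are hypothetical choices made only for this illustration. The sketch runs ordinary correlation-matrix PCA twice on the same data, changing only the arbitrary integer labels given to a nominal variable.

```python
import numpy as np


def pca_explained_variance(X):
    """Proportion of variance explained by each principal component.

    Ordinary PCA on the correlation matrix of X (rows are observations).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize columns
    corr = np.cov(Z, rowvar=False)                    # correlation matrix
    eigvals = np.linalg.eigvalsh(corr)[::-1]          # eigenvalues, descending
    return eigvals / eigvals.sum()


# Simulated data (an assumption for illustration, not the paper's data sets).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)  # an interval-scaled variable

# A hypothetical three-category variable related to x; values are 0, 1 or 2.
category = np.digitize(x + rng.normal(scale=0.5, size=n), [-0.5, 0.5])

# Two equally arbitrary integer codings of the same three categories.
coding_a = np.array([1, 2, 3])[category]
coding_b = np.array([2, 1, 3])[category]

ev_a = pca_explained_variance(np.column_stack([x, coding_a]))
ev_b = pca_explained_variance(np.column_stack([x, coding_b]))

print("explained variance, coding A:", ev_a)
print("explained variance, coding B:", ev_b)
```

If the categories are treated as nominal, both codings carry the same information, yet the PCA solutions differ because PCA interprets the labels as numeric distances. NLPCA methods such as CATPCA address this by estimating the category quantifications from the data rather than taking the labels at face value.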