Listy Biometryczne - Biometrical Letters Vol. 38(2001), No. 1, 11-31
A SEMI-STOCHASTIC GRAND TOUR FOR IDENTIFYING OUTLIERS AND FINDING A CLEAN SUBSET
Anna Bartkowiak
Institute of Computer Science, University of Wroclaw, Przesmyckiego 20, 51-151 Wroclaw, Poland
The grand tour method has proved very efficient for detecting outliers. The present paper proposes further modifications of the grand tour algorithm, based on constructing robust concentration ellipses. It is also emphasized that the same method can be used to obtain a "clean" subset of the data. Such a subset may serve as the starting point for robust multivariate procedures. The method is simple, can easily be implemented on parallel computers, and may therefore be used in data mining for large data sets. The considerations are illustrated with two benchmark data sets and one real medical data set.
Key words: multivariate outlier, graphical methods, grand tour, linked plots, ellipse of concentration.