Biometrical Letters Vol. 51(2), 2014, pp. 75-88


AN ALTERNATIVE METHODOLOGY FOR IMPUTING MISSING DATA
IN TRIALS WITH GENOTYPE-BY-ENVIRONMENT INTERACTION:
SOME NEW ASPECTS


Sergio Arciniegas-Alarcón¹, Marisol García-Peña¹,
Wojtek Janusz Krzanowski², Carlos Tadeu dos Santos Dias¹

¹Departamento de Ciências Exatas, Universidade de São Paulo/ESALQ, Cx.P.09,
CEP.13418-900, Piracicaba, SP - Brasil, e-mail: sergio.arciniegas@gmail.com
²College of Engineering, Mathematics and Physical Sciences, Harrison Building,
University of Exeter, North Park Road, Exeter, EX4 4QF, United Kingdom


A common problem in multi-environment trials arises when some genotype-by-environment combinations are missing. In Arciniegas-Alarcón et al. (2010) we outlined a data imputation method for estimating the missing values, whose computational algorithm combines regression with a lower-rank approximation of a matrix based on its singular value decomposition (SVD). In the present paper we provide two extensions to this methodology: including weights chosen by cross-validation, and allowing multiple as well as simple imputation. The three methods are assessed and compared in a simulation study, using a complete set of real data from which values are deleted randomly at different rates. The quality of the imputations is evaluated using three measures: the Procrustes statistic, the squared correlation between matrices, and the normalised root mean squared error between the imputed estimates and the true observed values. None of the methods makes any distributional or structural assumptions, and all of them can be used for any pattern or mechanism of the missing values.
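The simulation design described above (delete cells from a complete genotype-by-environment table, impute them, then score the imputations) can be illustrated with a minimal Python sketch. It is not the algorithm of this paper: the imputation step is a generic iterative rank-r SVD fill, swapped in only to show the workflow, and it omits the regression component and the cross-validated weights. The function names `svd_impute` and `nrmse`, the toy data, and the choice to scale the error by the standard deviation of the true deleted values are all assumptions made for illustration.

```python
import numpy as np

def svd_impute(X, rank=1, n_iter=200, tol=1e-8):
    """Generic iterative low-rank SVD imputation (illustrative only).

    Rows are genotypes, columns are environments, NaN marks a missing cell.
    Assumes every column has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: mean of the observed values in each environment.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Only missing cells are updated; observed cells are never altered.
        new = np.where(missing, approx, X)
        change = np.sqrt(np.mean((new - filled) ** 2))
        filled = new
        if change < tol:
            break
    return filled

def nrmse(imputed, true, missing):
    """RMSE over the deleted cells, scaled by the std of their true values
    (an assumed normalisation; the paper's exact definition may differ)."""
    diff = imputed[missing] - true[missing]
    return np.sqrt(np.mean(diff ** 2)) / np.std(true[missing])

# Hypothetical complete table with a purely multiplicative structure.
genotype = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
environment = np.array([1.0, 1.5, 2.0, 2.5])
X_true = np.outer(genotype, environment)

# Delete three cells, as a stand-in for random deletion at a given rate.
X_obs = X_true.copy()
deleted = [(0, 1), (2, 3), (4, 0)]
for i, j in deleted:
    X_obs[i, j] = np.nan
missing = np.isnan(X_obs)

imputed = svd_impute(X_obs, rank=1)
score = nrmse(imputed, X_true, missing)
print("NRMSE on deleted cells:", score)
```

Because the toy table is exactly rank one, a rank-one fill can recover the deleted cells almost perfectly; real trial data would call for choosing the rank (and, in the methods of this paper, the weights) by cross-validation.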


Key words: cross-validation, singular value decomposition, imputation, genotype-by-environment interaction, weights, missing values