Partitioning error components for accuracy-assessment of near-neighbor methods of imputation
Author(s): Albert R. Stage; Nicholas L. Crookston
Source: Forest Science. 53(1): 62-72.
Publication Series: Scientific Journal (JRNL)
Station: Rocky Mountain Research Station
Description: Imputation is applied for two quite different purposes: to supply missing data to complete a data set for subsequent modeling analyses or to estimate subpopulation totals. Error properties of the imputed values have different effects in these two contexts. We partition errors of imputation derived from similar observation units as arising from three sources: observation error, the distribution of observation units with respect to their similarity, and pure error given a particular choice of variables known for all observation units. Two new statistics based on this partitioning measure the accuracy of the imputations, facilitating comparison of imputation to alternative methods of estimation such as regression and comparison of alternative methods of imputation generally. Knowing the relative magnitude of the errors arising from these partitions can also guide efficient investment in obtaining additional data. We illustrate this partitioning using three extensive data sets from western North America. Application of this partitioning to compare near-neighbor imputation is illustrated for Mahalanobis- and two canonical correlation-based measures of similarity.
- This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.
Citation: Stage, Albert R.; Crookston, Nicholas L. 2007. Partitioning error components for accuracy-assessment of near-neighbor methods of imputation. Forest Science. 53(1): 62-72.
Keywords: most similar neighbor, k-nn inference, missing data, landscape modeling
- A Comparison of Several Techniques For Estimating The Average Volume Per Acre For Multipanel Data With Missing Panels
- yaImpute: An R package for kNN imputation
- WTA estimates using the method of paired comparison: tests of robustness