U.S. Forest Service
Caring for the land and serving people

United States Department of Agriculture

    Description

    The relatively small sampling intensities used by national forest inventories are often insufficient to produce the desired precision for estimates of population parameters unless the estimation process is augmented with auxiliary information, usually in the form of remotely sensed data. The k-Nearest Neighbors (k-NN) technique is a non-parametric, multivariate approach to prediction that has emerged as particularly popular for use with forest inventory and remotely sensed data and has been shown to contribute substantially to increasing precision. k-NN predictions are calculated as linear combinations of observations for sample units that are nearest in a space of auxiliary variables to the population unit for which a prediction is desired. Implementation of a nearest neighbors algorithm requires four choices: (i) a distance metric, (ii) specific auxiliary variables to be used with the distance metric, (iii) the number of nearest neighbors, and (iv) a scheme for weighting the nearest neighbors. Regardless of the choices for a distance metric and weighting scheme, emerging evidence suggests that optimization of the technique, including selection of an optimal subset of auxiliary variables, greatly enhances prediction. However, optimization can be computationally intensive and time-consuming. A promising approach that is gaining favor is based on genetic algorithms, a technique that uses search heuristics that mimic natural selection to solve optimization problems. The objective of the study was to compare optimized k-NN configurations with respect to inferences for mean volume per unit area using airborne laser scanning variables as auxiliary information. For two study areas, one in Norway and one in Minnesota, USA, the analyses focused on optimizing k-NN configurations that used the weighted Euclidean and canonical correlation distance metrics and two neighbor weighting schemes.
    Novel features of the study include introduction of a neighbor weighting scheme that has not previously been used for forestry applications, simultaneous optimization of all four k-NN choices, and basing comparisons on confidence intervals rather than on intermediate products such as prediction accuracies. Two conclusions were primary: (1) optimized selection of feature variables produced greater precision than using all feature variables, and (2) the computational intensity necessary to optimize the weighted Euclidean metric was considerably greater than for the canonical correlation analysis metric. Specific findings were that optimization produced pseudo-R² as large as 0.87 for the Norwegian dataset and as large as 0.89 for the Minnesota dataset. For the optimized canonical correlation distance metric, widths of approximate 95% confidence intervals as proportions of the estimated means were as small as 0.13 for the Norwegian dataset and as small as 0.15 for the Minnesota dataset.
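
    The description above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (knn_predict, loo_rmse, ga_select), the neighbor weights proportional to 1 / (1 + d)**t, the leave-one-out RMSE fitness criterion, and the simple truncation-selection genetic algorithm are all assumptions chosen for illustration. The sketch uses only a weighted Euclidean metric with 0/1 feature weights, so the genetic algorithm searches over subsets of auxiliary variables while k and t stay fixed.

```python
import math
import random


def knn_predict(x_target, X_ref, y_ref, k, feature_weights, t=1.0):
    """Predict y for one unit as a weighted mean of its k nearest reference units.

    Distance is weighted Euclidean; a zero feature weight drops that variable.
    Neighbor weights are proportional to 1 / (1 + d)**t (assumed scheme;
    t = 0 gives equal weights).
    """
    scored = []
    for x_i, y_i in zip(X_ref, y_ref):
        d = math.sqrt(sum(w * (a - b) ** 2
                          for w, a, b in zip(feature_weights, x_target, x_i)))
        scored.append((d, y_i))
    scored.sort(key=lambda pair: pair[0])
    nearest = scored[:k]
    raw = [1.0 / (1.0 + d) ** t for d, _ in nearest]
    return sum(r * y for r, (_, y) in zip(raw, nearest)) / sum(raw)


def loo_rmse(X, y, k, feature_weights, t=1.0):
    """Leave-one-out root mean square error over the reference set."""
    total = 0.0
    for i in range(len(X)):
        pred = knn_predict(X[i], X[:i] + X[i + 1:], y[:i] + y[i + 1:],
                           k, feature_weights, t)
        total += (pred - y[i]) ** 2
    return math.sqrt(total / len(X))


def ga_select(X, y, k=3, pop_size=20, generations=30,
              mutation_rate=0.1, seed=0):
    """Search 0/1 feature masks with a small genetic algorithm (pop_size >= 4).

    Returns (best_mask, best_rmse). The all-ones mask is always part of the
    initial population, so the result is never worse than using all features.
    """
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(mask):
        # An empty mask makes every distance zero, so it is ruled out.
        return loo_rmse(X, y, k, mask) if any(mask) else float("inf")

    population = [[rng.randint(0, 1) for _ in range(n_feat)]
                  for _ in range(pop_size)]
    population[0] = [1] * n_feat
    best = min(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness)
        parents = ranked[:pop_size // 2]      # truncation selection
        children = [ranked[0][:]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat) if n_feat > 1 else 0
            child = a[:cut] + b[cut:]         # one-point crossover
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]          # bit-flip mutation
            children.append(child)
        population = children
        best = min(population + [best], key=fitness)
    return best, fitness(best)
```

    Calling ga_select(X, y) on lists of auxiliary-variable vectors X and observed volumes y returns a 0/1 mask over the variables and its leave-one-out RMSE. Replacing the 0/1 genes with real-valued weights, and adding genes for k and t, would move the sketch toward the simultaneous optimization of all four choices that the study describes.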

    Publication Notes

    • Check the Northern Research Station web site to request a printed copy of this publication.
    • Our on-line publications are scanned and captured using Adobe Acrobat.
    • During the capture process some typographical errors may occur.
    • Please contact Sharon Hobrla, shobrla@fs.fed.us if you notice any errors which make this publication unusable.
    • We recommend that you also print this page and attach it to the printout of the article, to retain the full citation information.
    • This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.

    Citation

    McRoberts, Ronald E.; Domke, Grant M.; Chen, Qi; Næsset, Erik; Gobakken, Terje. 2016. Using genetic algorithms to optimize k-Nearest Neighbors configurations for use with airborne laser scanning data. Remote Sensing of Environment. 184: 387-395. https://doi.org/10.1016/j.rse.2016.07.007.

    Keywords

    Inference, Spatial estimation, National forest inventory

https://www.fs.usda.gov/treesearch/pubs/55205