Methods for variable selection in LiDAR-assisted forest inventories
Author(s): Paolo Moser; Alexander C. Vibrans; Ronald E. McRoberts; Erik Næsset; Terje Gobakken; Gherardo Chirici; Matteo Mura; Marco Marchetti
Publication Series: Scientific Journal (JRNL)
Station: Northern Research Station
Description: Estimation of wood volume and biomass is an important task of any National Forest Inventory. However, the estimation process is often expensive, laborious and sometimes imprecise because of small sample sizes relative to population variability. Remote sensing techniques can assist in surveying large areas by providing data that can be related to the forest attribute of interest through mathematical models of relationships. Light Detection and Ranging (LiDAR) is a technology that can provide data that are closely related to forest wood volume and biomass. With these data, linear regression is often used to estimate forest attributes. If the relationship shows evidence of nonlinearity, a transformation of the variables can be considered. However, modern computation allows fitting nonlinear regression models without transforming the variables. Nonlinear least squares (NLS) techniques also give more freedom to ensure that natural conditions such as non-negativity and/or lower and upper asymptotes are satisfied. Like any estimation technique, NLS is subject to overfitting when a large number of predictor variables is used. Because NLS is more computationally intensive than linear regression, stepwise selection techniques may require considerable programming effort. We compared three methods to select predictor variables for nonlinear models of relationships between forest attributes and LiDAR metrics, two of them based on genetic algorithms (GAs) and one based on random forest (RF). GAs were implemented to optimize a cost function that yields either root mean square error or the Akaike Information Criterion (AIC), while RF was based on variable importance in decision trees. A model with the single predictor variable most correlated with the response variable was also considered.
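The GA-with-AIC idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`aic_least_squares`, `select_variables`, `toy_cost`), the GA operators (tournament selection, one-point crossover, bit-flip mutation, elitism) and all parameter values are assumptions chosen for illustration. The toy cost uses a made-up residual sum of squares in place of an actual NLS fit on LiDAR metrics. AIC is written in its common least-squares form, n·ln(RSS/n) + 2k, which omits an additive constant.

```python
import math
import random


def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian errors (additive constant dropped)."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def toy_cost(mask):
    """Hypothetical stand-in for 'fit an NLS model on the selected metrics, return AIC'.

    Variables 0 and 2 are assumed informative; the others barely reduce the RSS.
    """
    noise_vars = sum(mask) - mask[0] - mask[2]
    rss = 100.0 - 40.0 * mask[0] - 30.0 * mask[2] - 0.1 * noise_vars
    return aic_least_squares(rss, n_obs=50, n_params=sum(mask) + 1)


def select_variables(cost, n_vars, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Minimise cost(mask) over binary inclusion masks with a simple GA."""
    rng = random.Random(seed)
    # Start from the full model plus random subsets.
    population = [tuple([1] * n_vars)]
    population += [tuple(rng.randint(0, 1) for _ in range(n_vars))
                   for _ in range(pop_size - 1)]
    best = min(population, key=cost)
    for _ in range(generations):
        children = [best]  # elitism: the best mask always survives
        while len(children) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 2), key=cost)
            p2 = min(rng.sample(population, 2), key=cost)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randint(1, n_vars - 1)
            child = tuple(1 - g if rng.random() < p_mut else g
                          for g in p1[:cut] + p2[cut:])
            children.append(child)
        population = children
        best = min(population, key=cost)
    return best


if __name__ == "__main__":
    chosen = select_variables(toy_cost, n_vars=6)
    print(chosen, toy_cost(chosen))
```

In this toy setting, adding an uninformative variable lowers the RSS only slightly while the 2k term grows by 2, so its AIC is worse than that of the mask containing only variables 0 and 2. This mirrors the penalty on large models that the description refers to.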
We compared the results of overall estimation for two datasets using the model-assisted, generalized regression estimator and concluded that the combination of GAs and AIC was the most efficient and stable procedure for selection of variables. We attribute this result to the penalty that AIC applies to models with large numbers of variables, which leads to a more efficient model with a minimum loss of information.
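As a rough illustration of the model-assisted, generalized regression estimator mentioned above, the sketch below computes a population mean estimate and its standard error. It assumes simple random sampling with equal inclusion probabilities, a sample of at least two plots, and model predictions already available for every population unit (for example, every LiDAR cell). The abstract does not give the estimator's formula, so the form used here is the common textbook version, and the function name `greg_mean` and the numbers in the usage example are hypothetical.

```python
import math


def greg_mean(pop_predictions, sample_obs, sample_predictions):
    """Model-assisted regression estimate of a population mean.

    estimate = mean of model predictions over all population units
               + mean residual (observed - predicted) over the sample plots
    The standard error uses the sample variance of the residuals divided by n,
    assuming simple random sampling.
    """
    n = len(sample_obs)
    residuals = [y - y_hat for y, y_hat in zip(sample_obs, sample_predictions)]
    mean_residual = sum(residuals) / n
    synthetic_mean = sum(pop_predictions) / len(pop_predictions)
    estimate = synthetic_mean + mean_residual
    variance = sum((e - mean_residual) ** 2 for e in residuals) / (n * (n - 1))
    return estimate, math.sqrt(variance)


if __name__ == "__main__":
    # Made-up values: four population units, two of them field plots.
    est, se = greg_mean([10, 20, 30, 40], sample_obs=[12, 41],
                        sample_predictions=[10, 40])
    print(est, se)  # 26.5 0.5
```

The residual term corrects the purely model-based (synthetic) mean for systematic prediction error. A variable subset that yields smaller, less variable residuals gives a smaller standard error, which is one way to compare the efficiency of the selection methods.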
- This article was written and prepared by U.S. Government employees on official time, and is therefore in the public domain.
Citation: Moser, Paolo; Vibrans, Alexander C.; McRoberts, Ronald E.; Næsset, Erik; Gobakken, Terje; Chirici, Gherardo; Mura, Matteo; Marchetti, Marco. 2016. Methods for variable selection in LiDAR-assisted forest inventories. Forestry. 90(1): 112-124. https://doi.org/10.1093/forestry/cpw041.
- Predicting plot basal area and tree density in mixed-conifer forest from lidar and Advanced Land Imager (ALI) data
- Aggregating pixel-level basal area predictions derived from LiDAR data to industrial forest stands in North-Central Idaho
- Model-assisted forest yield estimation with light detection and ranging