Fitting Spatial Surfaces to Noisy Observations
Contributed by Chris Paciorek, Lecturer and Researcher in the Statistics Department, University of California, Berkeley.
There are many options for fitting spatial surfaces to noisy observations. Here three readily usable options are described, but see the software links for additional approaches, particularly ones implemented in R. The three options described here are spatial smoothing using variants of thin plate splines, implemented in the R packages mgcv (using the gam() function) and SemiPar (using the spm() function), and kriging, as implemented in ArcGIS. Much more detail on syntax and example code are available from the labs for Statistics 155, a course taught by Chris Paciorek and Rima Izem at Harvard University.
mgcv and SemiPar can be used for continuous, count, and binary data, while kriging may not be appropriate for data that are not continuous. By continuous I mean data for which the residuals from fitting the given model are approximately normal or at least reasonably symmetric and with tails that are not too long. In general, I would recommend using mgcv or SemiPar. Both allow for additive models with the inclusion of additional covariates and for count and binary data and both provide estimates of uncertainty. Neither relies heavily on ad hoc fitting methods used in the ArcGIS kriging routines. The degree of smoothness in the surface estimated by SemiPar and mgcv is controlled by a single parameter, rather than a number of parameters in the kriging approach. While having multiple parameters might seem to allow for a more flexible fit to the data, with the kind of smoothing done here, there is effectively only one parameter that controls the degree of smoothing, so the kriging approach introduces additional complexity with little change in the ability to fit a surface. The exception to this comment is the ability of the kriging methodology to account for directionality. mgcv and SemiPar do not allow for anisotropy, so strongly directional surfaces may be better fit via kriging. Also note that if one wants to use the GIS software to create maps, the results from mgcv and SemiPar can be imported into the GIS software. Finally note that uncertainty estimates in both SemiPar and mgcv are only for the uncertainty in the fitted function, not prediction uncertainty, which also includes uncertainty from having a new observation (assuming the residual variance, i.e., the nugget, is non-zero).
mgcv R package: The mgcv library in R (written by Simon Wood) has a gam() (generalized additive model) function that allows for thin plate spline-based surface fits within a gam model in which other covariates (both as linear and smooth terms) can be readily included. The function is easy to use if you know some R and is computationally efficient and able to handle large datasets. Be careful: the gam() function in S-plus does not implement the same method and cannot be used for flexible surface fitting. The degree of smoothing is estimated within the model using an analogue to the generalized cross-validation (GCV) criteria.
SemiPar R package: SemiPar (written by Matt Wand) fits a spatial surface within the framework of linear mixed models, using a thin-plate spline basis to represent the surface. The model can include additional covariates as either linear or smooth (nonparametric) components. The degree of smoothing is estimated within the model via REML or ML.
Kriging within ArcGIS: ArcGIS is commercial GIS software, available primarily for Windows systems. Spatial surface fitting can be performed by the statistical method of kriging using either the Spatial Analyst or Geostatistical Analyst extensions. The underlying code by which the kriging is done is not accessible, and it can take some effort to understand exactly what is being done based on the documentation. Kriging is a standard statistical method, but the implementation in ArcGIS is done in a somewhat ad hoc manner, in part for computational efficiency. ArcGIS cannot include additional covariates; it only fits the spatial surface without accounting for other predictors. ArcGIS can account for spatial directionality. Sophisticated maps can be created within the GIS software, although one is limited to the mapping options provided by the creators of the software. Models are fit and maps created using a graphical user interface.
Information on exactly how kriging is done in ArcGIS is a bit difficult to come by. The online documentation has limited technical detail. More extensive information is available in:
Johnston, K. et al. 2001. Using ArcGIS geostatistical analyst. Redlands, CA: Environmental Systems Research Institute.
McCoy, J. et al. 2001. Using ArcGIS spatial analyst. Redlands, CA: Environmental Systems Research Institute.
Last modified 1/10/2013.