
# Bayesian Approach to Spline Smoothing

Venter, Gary

Regressions on ordered categorical variables, such as age, weight group, or year measured, are often modeled with a dummy variable for each level. Cubic splines are used to smooth the fitted values along the age curve, year curve, and so on. This can give nearly as good a fit as the unsmoothed regression but with fewer variables. Spline smoothing adds a smoothing constant times a smoothness measure, often the integral of the curve's squared second derivative, to the negative loglikelihood, which is then minimized. The smoothing constant is estimated by cross-validation, and picking the knots (curve-segment connection points) is a separate estimation. Here we look at using simpler measures of curve smoothness, like the sum of squares of the needed parameters. This gives very similar curves across the fitted values, both in goodness of fit and in visual smoothness. It also allows the fitting to be done with more standard methods, like Lasso or ridge regression, with the knot selection optimized in the process. This helps modelers incorporate spline smoothing into their own more complex models. It also makes it possible to smooth using Bayesian methods. That is slower, but it gives a distribution for each fitted parameter and a direct estimate of the probability distribution of the smoothing constant. Cross-validation is a good method for comparing models but has problems when used for estimation, which are discussed. Linear splines can also be modeled this way; after smoothing they look similar to cubic splines and are often easier and faster to fit. Modern Bayesian methods do not rely on Bayesian interpretations of probability and can be carried out within a frequentist random-effects framework, liberally interpreted.
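To make the idea of a sum-of-squares smoothness penalty concrete, here is a minimal sketch in Python. It is not the paper's code. It assumes a linear spline with a candidate knot at every interior level, parameterized by slope changes (hinge terms), with a ridge penalty on those slope changes only. The function names, the simulated data, and the fixed smoothing constant `lam` are all illustrative assumptions.

```python
import numpy as np

def linear_spline_basis(n_levels):
    """Columns: intercept, linear trend, then one hinge max(0, j - k)
    per interior knot k. Hinge coefficients are the slope changes."""
    j = np.arange(n_levels, dtype=float)
    knots = j[1:-1]  # assumption: a candidate knot at every interior level
    hinges = np.maximum(0.0, j[:, None] - knots[None, :])
    return np.column_stack([np.ones(n_levels), j, hinges])

def ridge_spline_fit(y, lam):
    """Minimize sum of squared errors + lam * sum of squared slope changes.
    The intercept and overall slope are left unpenalized."""
    y = np.asarray(y, dtype=float)
    X = linear_spline_basis(len(y))
    penalty = np.eye(X.shape[1])
    penalty[0, 0] = 0.0  # do not shrink the intercept
    penalty[1, 1] = 0.0  # do not shrink the overall slope
    beta = np.linalg.solve(X.T @ X + lam * penalty, X.T @ y)
    return X @ beta, beta

# Simulated example (illustrative data, not from the paper).
rng = np.random.default_rng(0)
ages = np.arange(30)
y = np.sin(ages / 5.0) + rng.normal(scale=0.3, size=ages.size)

fitted, beta = ridge_spline_fit(y, lam=5.0)
```

With `lam=0` this sketch reproduces the one-dummy-per-level fit, and as `lam` grows the curve approaches a straight line. In the Bayesian reading, the ridge penalty corresponds to a normal shrinkage prior on the slope changes, and a Lasso penalty would correspond to a Laplace prior. Here `lam` is simply fixed by hand rather than estimated.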

Keywords: smoothing splines; shrinkage priors; MCMC; Bayesian methods; APC models

## Files

• Bayesian Approach to Spline Smoothing.pdf (application/pdf, 795 KB)