TITLE:
Discovering the Best Choice for Spline’s Knots and Intervals Using Order of Polynomial Regression Model
AUTHORS:
Farag Hamad, Najiah Younus, Mohamed Jaber
KEYWORDS:
Nonlinear Regression, Splines, Polynomial, Cross-Validation, Akaike Information & Bayesian Information Criterion
JOURNAL NAME:
Open Journal of Statistics, Vol.14 No.6, December 20, 2024
ABSTRACT: In this work, we investigate the relationship between the order of a polynomial regression model and the number of knots and intervals needed to fit a spline regression model. Both models (polynomial and spline regression) are presented and discussed in detail in order to establish this relationship. Intrinsically, both models build on the linear regression model. A spline is designed to draw curves that balance goodness of fit while minimizing the mean squared error of the regression model. In the spline model, the curve at any point depends only on the observations at that point and at some specified neighboring points. Using the boundaries of the spline intervals, we fit a smooth cubic interpolation function that passes through (n + 1) data points. Polynomial regression, on the other hand, is a useful technique when the pattern of the data indicates a nonlinear relationship between the dependent and independent variables. Higher-degree polynomials can capture more intricate patterns, but they can also lead to overfitting. A simulation study is carried out to illustrate the performance of splines and spline segments based on the degree of the polynomial model. For each model, we compute the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) to compare the optimal polynomial order for fitting the data with the number of knots and intervals for the spline model. Both AIC and BIC help identify the model that best balances fit and complexity, guarding against overfitting by penalizing excessive parameters. We compare the results of the polynomial regression model with those of the spline model in terms of point estimates, the mean sum of squared errors, and the fitted regression line. The results suggest that a polynomial model of order five may be used to estimate splines with five segments.
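
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation: the sine-plus-noise data, the equally spaced knots, the truncated power basis for the cubic spline, and all function names are assumptions chosen for illustration. AIC and BIC are computed under a Gaussian error assumption as n*ln(RSS/n) + 2k and n*ln(RSS/n) + k*ln(n), where k counts the regression coefficients.

```python
import numpy as np

def gaussian_aic_bic(y, fitted, n_params):
    """AIC and BIC for a least-squares fit, assuming Gaussian errors.

    The error variance is not counted in n_params; it would add the same
    constant to every model and so does not change the ranking.
    """
    n = len(y)
    rss = float(np.sum((y - fitted) ** 2))
    fit_term = n * np.log(rss / n)
    return fit_term + 2 * n_params, fit_term + n_params * np.log(n)

def polynomial_design(x, order):
    """Columns 1, x, x^2, ..., x^order."""
    return np.vander(x, order + 1, increasing=True)

def cubic_spline_design(x, knots):
    """Cubic spline via the truncated power basis (an illustrative choice)."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_and_score(design, y):
    """Ordinary least squares fit, returning (AIC, BIC)."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return gaussian_aic_bic(y, design @ coef, design.shape[1])

# Hypothetical simulated data, not the data used in the paper.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

# Score polynomial orders 1..8.
poly_scores = {p: fit_and_score(polynomial_design(x, p), y) for p in range(1, 9)}

# Score cubic splines with 1..8 segments (segments - 1 interior knots).
spline_scores = {}
for segments in range(1, 9):
    knots = np.linspace(0.0, 1.0, segments + 1)[1:-1]  # equally spaced, assumed
    spline_scores[segments] = fit_and_score(cubic_spline_design(x, knots), y)

# Pick the model with the smallest BIC in each family.
best_poly = min(poly_scores, key=lambda p: poly_scores[p][1])
best_segments = min(spline_scores, key=lambda s: spline_scores[s][1])
print("polynomial order with lowest BIC:", best_poly)
print("spline segments with lowest BIC:", best_segments)
```

Selecting by AIC instead only requires taking index 0 of each score tuple. Which order and segment count come out best depends on the data-generating function and the noise level, so this sketch should not be read as reproducing the paper's reported result.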