Local Regression

Implementation of Multi-variate local polynomial regression.

  • This file is part of the Personal Programming Project (PPP) coursework in the Computational Materials Science (CMS) M.Sc. course at Technische Universität Bergakademie Freiberg.

  • This file is a part of the project titled Application of statistical learning to predict material properties.

  • For a given number of data points, the algorithm fits a polynomial of given degree to the data points in a specified neighbourhood.

  • Given response variables and predictor variables, this can be used to estimate a regression function.

  • To fit a polynomial of degree k, the underlying function is assumed to be smooth and k-times differentiable at the fitting points.

References

Cleveland, W.S. (1979) “Robust Locally Weighted Regression and Smoothing Scatterplots”. Journal of the American Statistical Association 74 (368): 829-836.

class localRegression.LocalRegression(alpha, degree=1)

Bases: object

Multi-variate local polynomial regression.

  • tricubic_weights: Computes the weights from the normalized distances within the window by evaluating the tricube weighting function.

  • eval_cart_prod: Uses itertools to build the bases that are later used in the least-squares estimate of the function.

  • fit_polynomial: Uses NumPy's least-squares method to produce the data matrix that is eventually used to evaluate the response variable at that point.

  • fit: Provides the initial wrapper for the methods mentioned above.

  • predict: Uses the learned data matrix to evaluate the response variable.

alpha

Fraction of the data to be used when estimating each local polynomial. Value must be between 0 and 1.

Type:

float

degree

Degree of the polynomial to be fit to the given data. Defaults to 1 (a linear polynomial) in this implementation.

Type:

int

eval_cart_prod(x, x_values)
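
The class summary says this method uses itertools to build the bases for the least-squares step. A minimal sketch of one way to build such a basis is shown here. It assumes a monomial basis built with `itertools.combinations_with_replacement`; the helper name `polynomial_basis` and its signature are hypothetical and are not taken from this module.

```python
import itertools

import numpy as np


def polynomial_basis(x, degree):
    """Hypothetical helper: monomial basis up to `degree`.

    Rows of `x` are samples, columns are predictor variables.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n_samples, n_features = x.shape
    columns = [np.ones(n_samples)]  # constant (degree-0) term
    for d in range(1, degree + 1):
        # Each combination selects d feature indices, repetition allowed,
        # e.g. (0, 0) -> x0**2 and (0, 1) -> x0 * x1 for d = 2.
        for combo in itertools.combinations_with_replacement(range(n_features), d):
            columns.append(np.prod(x[:, list(combo)], axis=1))
    return np.column_stack(columns)
```

For two predictors and degree 2, the columns are 1, x0, x1, x0², x0·x1, x1², in that order.
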
fit(x, y, kernel=<function LocalRegression.tricubic_weights>)

Method that performs multi-variate local polynomial regression.

Parameters:

kernel (function) – Weighting function to be used during this evaluation. The default is the tricube weighting function described in the references. A kernel is expected to give higher weight to observations near the fitting point within the smoothing window and zero weight to those outside it.

Returns:

output_values – Values of the fitted polynomial for the given parameters. These can be interpolated to evaluate new data points.

Return type:

array
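
How `alpha` defines the smoothing window can be sketched as follows. This assumes the window holds the ceil(alpha · n) nearest observations by Euclidean distance and that distances are scaled by the largest distance in the window, as in Cleveland (1979). The helper `local_window` is hypothetical and may not match what `fit` does internally.

```python
import numpy as np


def local_window(x, x0, alpha):
    """Hypothetical helper: nearest-neighbour window around x0.

    Returns the indices of the selected observations and their
    distances to x0 scaled to the range [0, 1].
    """
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)                # rows are observations
    n = x.shape[0]
    k = max(int(np.ceil(alpha * n)), 1)      # window size from alpha
    dist = np.linalg.norm(x - x0, axis=1)    # distance of each row to x0
    idx = np.argsort(dist)[:k]               # k nearest observations
    h = dist[idx].max()                      # window half-width
    scaled = dist[idx] / h if h > 0 else np.zeros(k)
    return idx, scaled
```

The scaled distances are what a kernel such as the tricube function takes as input.
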

fit_polynomial(x, y, x_values, weights, degree)

Method that performs the computationally intensive least-squares fit and returns the polynomial value.

Parameters:
  • x (list or array) – Predictor variables ‘X’.

  • y (list or array) – Response variables ‘Y’.

  • x_values (list or array) – Values of x at which the polynomial is fit.

  • weights (list or array) – Weights assigned to corresponding data points in the neighbourhood.

  • degree (int) – Degree of polynomial being fit.

Returns:

scalar – The fitted response variable ‘Y’, which can be interpolated to predict new data points.

Return type:

float
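
The weighted fit at a single point can be sketched for one predictor variable as follows. This is an illustrative sketch, not the module's code: the function name `weighted_poly_value` is hypothetical, and it assumes the weights are applied by scaling the rows with their square roots before calling `numpy.linalg.lstsq`.

```python
import numpy as np


def weighted_poly_value(x, y, x0, weights, degree):
    """Hypothetical helper: fitted value at x0 from a weighted polynomial fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    # Centre the predictor on x0 so that the intercept of the local
    # polynomial equals the fitted value at x0.
    A = np.vander(x - x0, degree + 1, increasing=True)
    # Scaling rows by sqrt(w) turns ordinary least squares into
    # weighted least squares with weights w.
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return float(coef[0])
```

Centring on `x0` avoids evaluating the polynomial afterwards, since only the first coefficient is needed.
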

predict(x_pred, design_matrix=None)
tricubic_weights()

Method that assigns weights to points in the local neighbourhood of the x-values.

It provides more weight to observations whose value is closer to the given point, and less weight to observations that are further away.

Parameters:

args (list or array) – Distances evaluated earlier within the neighbourhood. The weight function depends on these distances and is zero outside the local neighbourhood.

Returns:

weights – Assigned weights to the corresponding data points in the vicinity.

Return type:

array
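
For reference, the tricube weighting function from Cleveland (1979) is w(d) = (1 − |d|³)³ for |d| < 1 and 0 otherwise, where d is the normalized distance. A minimal standalone sketch follows. The function name `tricube` is hypothetical, and the method above may handle its arguments differently.

```python
import numpy as np


def tricube(d):
    """Tricube weights for normalized distances d (zero for |d| >= 1)."""
    d = np.abs(np.asarray(d, dtype=float))
    return np.where(d < 1.0, (1.0 - d**3) ** 3, 0.0)
```

The weight is 1 at the fitting point and falls smoothly to 0 at the edge of the window.
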