In this tutorial, we will learn how to implement non-linear regression. If the data shows a curved trend, linear regression will not produce very accurate results compared to a non-linear regression because, as the name implies, linear regression presumes that the relationship in the data is linear.

## Parts Required

- A Python environment (Spyder, Jupyter, etc.).

## Procedure

The following are the steps required to complete this tutorial.

### Packages Needed

```python
import numpy as np
import matplotlib.pyplot as plt
```

Though linear regression is very good for solving many problems, it cannot be used for all datasets. First, recall how linear regression models a dataset: it models a linear relation between a dependent variable y and an independent variable x, using a simple equation of degree 1, for example, y = 4x + 2.
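For contrast, here is a quick sketch of that degree-1 equation plotted with the same tools we will use throughout this tutorial:

```python
# plot the straight line y = 4x + 2 that a linear model assumes
x = np.arange(-5.0, 5.0, 0.1)
y = 4 * x + 2
plt.plot(x, y, 'r')
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```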

A non-linear regression models the relationship between independent variables x and a dependent variable y with a non-linear function. Essentially, any relationship that is not linear can be termed non-linear, and it is usually represented by a polynomial of degree k (the maximum power of x).
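For instance, a polynomial of degree k = 3 can be evaluated with NumPy as in the sketch below (the coefficients are arbitrary, chosen only for illustration):

```python
# evaluate the degree-3 polynomial 2x^3 - 3x + 1 over a range of x values
coeffs = [2, 0, -3, 1]            # highest-degree coefficient first
x = np.arange(-5.0, 5.0, 0.1)
y = np.polyval(coeffs, x)
```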

Non-linear functions can have elements like exponentials, logarithms, fractions, and others. For example, y = log(x) is non-linear, and so is a combination of terms such as y = eˣ + 1/x.

Let’s take a look at the graph of a cubic function, y = x³ + x² + x + 3, with some random noise added:

```python
x = np.arange(-5.0, 5.0, 0.1)
y = 1*(x**3) + 1*(x**2) + 1*x + 3
y_noise = 20 * np.random.normal(size=x.size)
ydata = y + y_noise
plt.plot(x, ydata, 'bo')
plt.plot(x, y, 'r')
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```

This function has x³ and x² terms. Also, its graph is not a straight line over the 2D plane, so this is a non-linear function.

Some other types of non-linear functions are:

### Quadratic

```python
x = np.arange(-5.0, 5.0, 0.1)
y = np.power(x, 2)
y_noise = 2 * np.random.normal(size=x.size)
ydata = y + y_noise
plt.plot(x, ydata, 'bo')
plt.plot(x, y, 'r')
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```

### Exponential

An exponential function with base c is defined by

Y = b·cˣ

where b ≠ 0, c > 0, c ≠ 1, and x is any real number. The base, c, is constant and the exponent, x, is a variable.

```python
X = np.arange(-5.0, 5.0, 0.1)
Y = np.exp(X)
plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```
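np.exp uses base e; the same general form works for any valid base. A minimal sketch with an arbitrarily chosen base c = 2 and b = 1:

```python
# exponential of the form Y = b * c**X with b = 1 and c = 2
X = np.arange(-5.0, 5.0, 0.1)
Y = 1 * np.power(2.0, X)
plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```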

### Logarithmic

The response y is the result of applying a logarithmic map from the input x to the output variable y. Note that instead of x, we can use X, which can be a polynomial representation of the x's. In its general form, it is written as

y = log(x)

```python
# log is undefined for x <= 0, so start the range just above zero
X = np.arange(0.1, 5.0, 0.1)
Y = np.log(X)
plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```

### Sigmoidal/Logistic

```python
X = np.arange(-5.0, 5.0, 0.1)
Y = 1 - 4 / (1 + np.power(3, X - 2))
plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```

## Non-Linear Regression Example

In this example, we're going to try to fit a non-linear model to the data points corresponding to China's GDP from 1960 to 2014. The dataset has two columns: the first is a year between 1960 and 2014, and the second is China's corresponding annual gross domestic product in US dollars for that year.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

dataset = pd.read_csv("china_gdp_1960.csv")
dataset.head(10)
```

### Plotting the Dataset

This is what the data points look like. The shape resembles either a logistic or an exponential function: the growth starts off slow, then from 2005 onward it becomes very significant, and finally it decelerates slightly in the 2010s.

```python
plt.figure(figsize=(8, 5))
x_data, y_data = (dataset["Year"].values, dataset["Value"].values)
plt.plot(x_data, y_data, 'ro')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()
```

### Choosing a Model

From an initial look at the plot, we determine that the logistic function could be a good approximation, since it starts with slow growth, grows faster in the middle, and then levels off again at the end, as illustrated below:

```python
X = np.arange(-5.0, 5.0, 0.1)
Y = 1.0 / (1.0 + np.exp(-X))
plt.plot(X, Y)
plt.ylabel('Dependent Variable')
plt.xlabel('Independent Variable')
plt.show()
```

The formula for the logistic function is the following:

Ŷ = 1 / (1 + e^(−β₁(X − β₂)))

β₁: controls the curve's steepness.

β₂: slides the curve along the x-axis.
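To see both effects, here is a small sketch that plots the logistic function for a few parameter pairs (the values are arbitrary): a larger β₁ makes the transition steeper, while changing β₂ shifts the curve along the x-axis.

```python
# compare logistic curves for a few (beta_1, beta_2) pairs
X = np.arange(-5.0, 5.0, 0.1)
for beta_1, beta_2 in [(1.0, 0.0), (3.0, 0.0), (1.0, 2.0)]:
    Y = 1.0 / (1.0 + np.exp(-beta_1 * (X - beta_2)))
    plt.plot(X, Y, label='beta_1=%.1f, beta_2=%.1f' % (beta_1, beta_2))
plt.legend(loc='best')
plt.show()
```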

## Building The Model

Now, let’s build our regression model and initialize its parameters.

```python
def sigmoid(x, Beta_1, Beta_2):
    y = 1 / (1 + np.exp(-Beta_1 * (x - Beta_2)))
    return y
```

Let's look at a sample sigmoid line that might fit the data:

```python
beta_1 = 0.10
beta_2 = 1990.0

# logistic function
Y_pred = sigmoid(x_data, beta_1, beta_2)

# plot the initial prediction against the data points
plt.plot(x_data, Y_pred * 15000000000000.)
plt.plot(x_data, y_data, 'ro')
```

Our task here is to find the best parameters for our model. Let's first normalize our x and y:

```python
# Let's normalize our data
xdata = x_data / max(x_data)
ydata = y_data / max(y_data)
```

#### How do we find the best parameters for our fit line?

We can use **curve_fit**, which uses non-linear least squares to fit our sigmoid function to the data. It finds optimal values for the parameters so that the sum of the squared residuals of sigmoid(xdata, *popt) - ydata is minimized.

popt holds our optimized parameters.

```python
from scipy.optimize import curve_fit

popt, pcov = curve_fit(sigmoid, xdata, ydata)

# print the final parameters
print(" beta_1 = %f, beta_2 = %f" % (popt[0], popt[1]))
```
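curve_fit also returns pcov, the estimated covariance matrix of popt. As a quick sanity check, the square roots of its diagonal give one-standard-deviation uncertainties for the fitted parameters:

```python
# one-standard-deviation uncertainty of each fitted parameter
perr = np.sqrt(np.diag(pcov))
print("std errors: beta_1 = %f, beta_2 = %f" % (perr[0], perr[1]))
```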

Now we plot our resulting regression model.

```python
x = np.linspace(1960, 2015, 55)
x = x / max(x)
plt.figure(figsize=(8, 5))
y = sigmoid(x, *popt)
plt.plot(xdata, ydata, 'ro', label='data')
plt.plot(x, y, linewidth=3.0, label='fit')
plt.legend(loc='best')
plt.ylabel('GDP')
plt.xlabel('Year')
plt.show()
```
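Because the model was fit on normalized data, a prediction for a real year has to be scaled in and out again. A minimal sketch for a single year (2014 is just an example):

```python
# scale the input year down, then scale the normalized output back to US dollars
year = 2014
gdp_normalized = sigmoid(year / max(x_data), *popt)
gdp = gdp_normalized * max(y_data)
print("Predicted GDP for %d: %.0f US dollars" % (year, gdp))
```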

Now, let’s find the accuracy of our model.

```python
# split the data into train/test sets
msk = np.random.rand(len(dataset)) < 0.8
train_x = xdata[msk]
test_x = xdata[~msk]
train_y = ydata[msk]
test_y = ydata[~msk]

# build the model using the train set
popt, pcov = curve_fit(sigmoid, train_x, train_y)

# predict using the test set
y_hat = sigmoid(test_x, *popt)

# evaluation
print("Mean absolute error: %.2f" % np.mean(np.absolute(y_hat - test_y)))
print("Residual sum of squares (MSE): %.2f" % np.mean((y_hat - test_y) ** 2))

from sklearn.metrics import r2_score
print("R2-score: %.2f" % r2_score(test_y, y_hat))  # r2_score expects (y_true, y_pred)
```
