Chapter 3 - Regression Models
Segment 1 - Simple linear regression
Linear Regression
Linear regression is a statistical machine learning method you can use to quantify, and make predictions based on, relationships between numerical variables.
- Simple linear regression
- Multiple linear regression
Linear Regression Use Cases
- Sales Forecasting
- Supply Cost Forecasting
- Resource Consumption Forecasting
- Telecom Services Lifecycle Forecasting
Linear Regression Assumptions
- All variables are continuous numeric, not categorical
- Data is free of missing values and outliers
- There's a linear relationship between the predictors and the predictand (the variable being predicted)
- All predictors are independent of each other
- Residuals (or prediction errors) are normally distributed
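Several of the assumptions above can be spot-checked numerically before trusting a fit. The following is a minimal sketch, not a full diagnostic procedure: the data set is synthetic, and the names x1, x2, and y as well as the generating formula are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical synthetic data: two predictors and one predictand.
rng = np.random.default_rng(0)
x1 = rng.uniform(3, 5, size=200)
x2 = rng.uniform(0, 1, size=200)
y = 265 + 6 * x1 + 2 * x2 + rng.normal(0, 1, size=200)

# Assumption: data is free of missing values.
has_missing = bool(np.isnan(np.column_stack([x1, x2, y])).any())

# Assumption: predictors are independent of each other.
# A correlation near 0 between the predictors is consistent with this.
predictor_corr = np.corrcoef(x1, x2)[0, 1]

# Assumption: linear relationship between predictor and predictand.
# A correlation far from 0 suggests a linear association.
linear_corr = np.corrcoef(x1, y)[0, 1]

# Assumption: residuals are roughly normally distributed.
# Fit by least squares, then check that residual skewness is near 0.
A = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ coef
skewness = np.mean((residuals - residuals.mean()) ** 3) / residuals.std() ** 3

print(has_missing, round(predictor_corr, 3), round(linear_corr, 3), round(skewness, 3))
```

These are rough indicators only. A scatter plot of each predictor against the predictand and a histogram of the residuals are usually worth inspecting as well.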
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn
from matplotlib import rcParams
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import scale
%matplotlib inline
rcParams['figure.figsize'] = 10,8
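# 100 simulated room counts, drawn uniformly from [3, 5)
rooms = 2*np.random.rand(100,1)+3
# Show rows 1 through 9 (the slice skips row 0)
rooms[1:10]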
array([[3.24615481],
[4.86219627],
[3.17742366],
[3.03114054],
[3.73270016],
[3.58047146],
[3.23240264],
[4.63462537],
[3.91227449]])
# Price = 265 + 6 per room, plus non-negative noise (absolute value of a standard normal draw)
price = 265 + 6*rooms + abs(np.random.randn(100,1))
price[1:10]
array([[285.23677074],
[294.79616144],
[284.85274605],
[284.40046371],
[288.07421652],
[286.60487136],
[284.55567969],
[293.27121913],
[289.12143579]])
plt.plot(rooms,price,'r^')
plt.xlabel("# of Rooms, 2019 Average")
plt.ylabel("2019 Average Home Price, 1000s USD")
plt.show()
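# X holds the predictor (rooms); y holds the value to predict (price)
X = rooms
y = price
# Fit an ordinary least squares model
LinReg = LinearRegression()
LinReg.fit(X,y)
# Print the fitted intercept (b) and slope (m)
print(LinReg.intercept_, LinReg.coef_)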
[266.13626468] [[5.9306674]]
Simple Algebra
- y = mx + b
- m = slope = 5.93
- b = intercept = 266.14
Estimated Coefficients
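- LinReg.coef_ = [[5.93]]: the estimated coefficient for each predictor in the linear regression problem. Here there is one predictor, so the single value is the slope m: each additional room adds about 5.93 (thousand USD) to the predicted price.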
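To make the algebra concrete, the fitted values printed by the model can be plugged into y = mx + b by hand. This is a minimal sketch: the intercept and slope are copied from the printed output, while the input of 4 rooms and the helper name predict_price are hypothetical, chosen only for illustration.

```python
# Intercept (b) and slope (m) copied from the fitted model's printed output.
b = 266.13626468
m = 5.9306674

def predict_price(rooms):
    """Return the predicted average home price (1000s USD) for a room count."""
    return m * rooms + b

# Hypothetical input: a home with 4 rooms.
print(round(predict_price(4), 2))
```

For 4 rooms this gives about 289.86, or roughly 289,860 USD. Calling LinReg.predict([[4]]) on the fitted model should give essentially the same number.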
# R-squared (coefficient of determination) of the fit on the training data
print(LinReg.score(X,y))
0.961246701242803
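For LinearRegression, score returns the coefficient of determination, R-squared: the share of the variance in price that the fitted line explains. As a sketch of what that number measures, the block below recomputes it by hand on freshly generated data. The seed and the use of np.polyfit in place of scikit-learn are assumptions made to keep the example self-contained, so the result will not match 0.961 exactly.

```python
import numpy as np

# Hypothetical data generated the same way as the example above, but seeded.
rng = np.random.default_rng(42)
rooms = 2 * rng.random(100) + 3
price = 265 + 6 * rooms + np.abs(rng.standard_normal(100))

# Least-squares line; np.polyfit returns the slope first, then the intercept.
m, b = np.polyfit(rooms, price, 1)
predicted = m * rooms + b

# R^2 = 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((price - predicted) ** 2)
ss_tot = np.sum((price - price.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(round(r2, 3))
```

A value close to 1 means the line accounts for most of the variation in price. Because this score is computed on the same data used for fitting, it says nothing about how well the model generalizes to new data.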