Machine Learning Algorithms: Linear Regression | Python and R Code Implementation

terry, 2023-09-27 #Data Structures and Algorithms

Linear Regression

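Linear regression is used to estimate real values (cost of houses, number of calls, total sales, etc.) based on continuous variable(s). Here, we establish the relationship between the independent and dependent variables by fitting a best line. This best-fit line is known as the regression line and is represented by the linear equation y = a*x + b.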

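The best way to understand linear regression is to relive a childhood experience. Say you ask a fifth-grader to arrange the classmates in increasing order of weight, without asking them their weights. What do you think the child will do? He or she would most likely look at (visually analyze) each person's height and build and arrange them using a combination of these visible parameters. This is linear regression in real life! The child has in effect figured out that height and build are related to weight by a relationship that looks like the equation above.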

In this equation:

y – dependent variable

a – slope

x – independent variable

b – intercept

The coefficients a and b are derived by minimizing the sum of squared differences between the data points and the regression line.
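As a minimal sketch of that minimization for the single-variable case, the closed-form least-squares solution can be written in plain Python. The function names `fit_line` and `sum_squared_error` and the sample points are illustrative assumptions, not part of the original article; the differences are measured vertically, as in ordinary least squares.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing sum((y - (a*x + b))**2) over the data points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    a = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - a * mean_x
    return a, b


def sum_squared_error(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up points that lie exactly on y = 2x + 1, so the fit should recover it.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b)
print(sum_squared_error(xs, ys, a, b))
```

Any other choice of a and b gives a sum of squared errors at least as large as the one returned by `fit_line` for the same data.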

Consider an example in which the best-fit line has been identified as y = 0.2811x + 13.9. Using this equation, we can estimate a person's weight from their height.
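Applying the fitted line is a single multiplication and addition. The sketch below assumes x is height and y is weight, as the text suggests; the function name `predict_weight` and the input value are hypothetical, and the units are whatever the original data used.

```python
# Coefficients taken from the fitted line quoted in the text: y = 0.2811x + 13.9
SLOPE = 0.2811
INTERCEPT = 13.9


def predict_weight(height):
    """Estimate weight (y) from height (x) using the fitted line."""
    return SLOPE * height + INTERCEPT


# 160 is a made-up height used only to show the call.
print(predict_weight(160))
```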

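Linear regression is mainly of two types: simple linear regression and multiple linear regression. Simple linear regression has one independent variable. Multiple linear regression, as the name suggests, has multiple (more than one) independent variables. When finding the best-fit line, you can also fit a polynomial or a curve; these are known as polynomial or curvilinear regression.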
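Multiple and polynomial regression can both be treated as the same least-squares problem with more feature columns. The sketch below solves the normal equations in plain Python; it is one possible approach under that assumption, not the method the article or any particular library uses, and libraries generally rely on more numerically robust solvers. The names `solve_linear_system` and `fit_least_squares` and all data are made up for illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Build an augmented copy so the caller's data is not modified.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest pivot to reduce rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_least_squares(rows, ys):
    """Fit y = w0 + w1*f1 + ... + wk*fk; returns [w0, w1, ..., wk]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    p = len(design[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Multiple regression: two independent variables, data generated from y = 1 + 2*u + 3*v.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1 + 2 * u + 3 * v for u, v in features]
print(fit_least_squares(features, targets))

# Polynomial regression: one variable x, expanded into the features (x, x**2).
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 - x + 4 for x in xs]
print(fit_least_squares([(x, x ** 2) for x in xs], ys))
```

The polynomial case is still linear in the coefficients, which is why the same solver handles it once x is expanded into x and x**2.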

Python Code

# Import libraries
import numpy as np
from sklearn import linear_model

# Load train and test datasets.
# Features and response must be numeric; features must be 2-D (n_samples, n_features).
# The arrays below are made-up placeholder values; replace them with your own data.
x_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
x_test = np.array([[6.0], [7.0]])

# Create linear regression object
linear = linear_model.LinearRegression()

# Train the model using the training set and check the score (R^2)
linear.fit(x_train, y_train)
print('R^2 on training data: \n', linear.score(x_train, y_train))

# Equation coefficient and intercept
print('Coefficient: \n', linear.coef_)
print('Intercept: \n', linear.intercept_)

# Predict output
predicted = linear.predict(x_test)
print('Predicted: \n', predicted)
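The value returned by `linear.score` is the coefficient of determination, R^2. To make that number concrete, here is a small plain-Python sketch of the same quantity; the helper name `r_squared` and the sample values are assumptions for illustration, and raising an error for a constant target is a choice made here, not necessarily how scikit-learn treats that case.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    if ss_tot == 0:
        raise ValueError("y_true is constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


y_true = [3.0, 5.0, 7.0, 9.0]
print(r_squared(y_true, y_true))       # perfect predictions -> 1.0
print(r_squared(y_true, [6.0] * 4))    # always predicting the mean -> 0.0
```

A score near 1 means the line explains most of the variation in the response; a score near 0 means it does little better than predicting the mean.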

R Code

# Load train and test datasets.
# Features and response must be numeric; lm() expects the data as a data frame.
# The values below are made-up placeholders; replace them with your own data.
x_train <- data.frame(x = c(1, 2, 3, 4, 5))
y_train <- c(2.1, 3.9, 6.2, 7.8, 10.1)
x_test <- data.frame(x = c(6, 7))
# Combine features and response into one data frame
x <- cbind(x_train, y_train)
# Train the model using the training set and check the fit
linear <- lm(y_train ~ ., data = x)
summary(linear)
# Predict output (newdata must use the same column names as the training features)
predicted <- predict(linear, newdata = x_test)

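Copyright Notice

This article represents only the author's views and does not represent the position of Code前端网.
This article was published by Code前端网. If you repost it, please cite the page address.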
