• Logistic Regression Summary


    First, be clear that logistic regression is a classification algorithm. It builds on the linear regression model and applies the Sigmoid function to turn the linear model's predictions into discrete values, so that it can handle classification problems.

    1 Logistic Regression Principles

    Take binary classification as an example to illustrate how logistic regression works. From the linear regression summary, we have the linear regression hypothesis function \(h_{\theta}'(x)\); logistic regression applies the Sigmoid function so that the value of \(h_{\theta}'(x)\) is mapped into the interval [0, 1].
    The Sigmoid function is generally written as:

    \[ g(z)=\frac{1}{1+e^{-z}} \]

    Plotting the Sigmoid function:

    #Sigmoid function
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.special import expit
    
    def sigmoid(z):
        # Hand-rolled version; expit below is SciPy's numerically stable equivalent
        return 1 / (1 + np.exp(-z))
    
    
    def h(theta, X):
        # Hypothesis h_theta(X) = sigmoid(X . theta)
        return expit(np.dot(X, theta))
    
    
    # check sigmoid function
    myx = np.arange(-10, 10, 0.01)
    plt.plot(myx, expit(myx))
    plt.title('Sigmoid Function')
    plt.show()
    

    As the plot shows, \(g(z)\) tends to 1 as \(z\) tends to positive infinity, and tends to 0 as \(z\) tends to negative infinity.
    For logistic regression, let \(z=h_{\theta}'(x)\); then \(g(z)=g(h_{\theta}'(x))=\frac{1}{1+e^{-h_{\theta}'(x)}}\), where \(h_{\theta}'(x)=x\theta\). The logistic regression model is therefore:

    \[ h_{\theta}(x)=\frac{1}{1+e^{-x\theta}} \]

    Combined with the Sigmoid function, we can see how the logistic regression model performs classification. The model output \(h_{\theta}(x)\) is treated as the probability of belonging to one class, which gives it a probabilistic meaning. When \(h_{\theta}(x)>0.5\), i.e. \(x\theta>0\), \(y\) is predicted as 1; when \(h_{\theta}(x)<0.5\), i.e. \(x\theta<0\), \(y\) is predicted as 0. This is how the model turns its output into a classification.
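    As a small illustration of this 0.5 threshold (a sketch of my own; the weights and sample below are made-up toy values, not taken from the post):

    # Decision rule sketch: theta and x are hypothetical toy values
    import numpy as np
    
    theta = np.array([4.12, 0.48, -0.62])      # hypothetical coefficients
    x = np.array([1.0, 0.5, 2.0])              # one sample; first entry is the intercept term
    p = 1.0 / (1.0 + np.exp(-np.dot(x, theta)))  # h_theta(x)
    y_pred = 1 if p > 0.5 else 0               # same as testing x . theta > 0
    print(p, y_pred)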
    The matrix form of the logistic regression model is:

    \[ h_{\theta}(X)=\frac{1}{1+e^{-X\theta}} \]

    Therefore, the probability that \(y\) takes the value 1 is \(h_{\theta}(x)\), and the probability that it takes 0 is \(1-h_{\theta}(x)\), i.e.:

    \[ p(y|x;\theta)=h_{\theta}(x)^{y}(1-h_{\theta}(x))^{1-y} \]

    The likelihood function:

    \[ L(\theta)=p(y|x;\theta)=\prod_{i=1}^{m}P(y^{(i)}|x^{(i)};\theta)=\prod_{i=1}^{m}h_{\theta}(x^{(i)})^{y^{(i)}}(1-h_{\theta}(x^{(i)}))^{1-y^{(i)}} \]

    The log-likelihood function:

    \[ l(\theta)=\log L(\theta)=\sum_{i=1}^{m}\left[y^{(i)}\log h_{\theta}(x^{(i)})+(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))\right] \]

    The loss function:

    \[ J(\theta)=-\log L(\theta)=-\sum_{i=1}^{m}\left(y^{(i)}\log h_{\theta}(x^{(i)})+(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))\right) \]

    The matrix form of the loss function, with \(E\) a vector of ones, is:

    \[ J(\theta)=-Y^{T}\log h_{\theta}(X)-(E-Y)^{T}\log(E-h_{\theta}(X)) \]

    The optimization objective of logistic regression:

    \[ \min_{\theta}J(\theta) \]
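    As a sanity check on the matrix form above, here is a minimal NumPy sketch of \(J(\theta)\) (my own illustration, not code from the post; it assumes X carries an intercept column and Y is an m x 1 vector of 0/1 labels):

    # Vectorized log-loss J(theta); a sketch under the assumptions above
    import numpy as np
    
    def log_loss(theta, X, Y):
        h = 1.0 / (1.0 + np.exp(-np.dot(X, theta)))   # h_theta(X), shape (m, 1)
        E = np.ones_like(Y)                           # the E in the matrix form
        return (-np.dot(Y.T, np.log(h)) - np.dot((E - Y).T, np.log(E - h))).item()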

    2 Logistic Regression Algorithm

    Since \(\frac{\partial J(\theta)}{\partial \theta}=X^{T}(h_{\theta}(X)-Y)\),
    the iterative update formula for \(\theta\) in gradient descent is:

    \[ \theta=\theta-\alpha\frac{\partial J(\theta)}{\partial \theta}=\theta-\alpha X^{T}(h_{\theta}(X)-Y) \]
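    A minimal sketch of this update loop (again my own toy illustration of the descent form; the implementation in section 3 below uses the equivalent gradient ascent on the log-likelihood):

    # theta <- theta - alpha * X^T (h_theta(X) - Y); assumed shapes X: (m, n), Y: (m, 1)
    import numpy as np
    
    def gradient_descent(X, Y, alpha=0.001, iters=500):
        theta = np.zeros((X.shape[1], 1))
        for _ in range(iters):
            h = 1.0 / (1.0 + np.exp(-np.dot(X, theta)))   # predictions h_theta(X)
            theta = theta - alpha * np.dot(X.T, h - Y)    # the update rule above
        return theta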

    3 Logistic Regression Code Implementation

    3.1 Training the Algorithm: Finding the Best Parameters with Gradient Ascent

    # Logistic regression gradient ascent optimization algorithm
    from numpy import *
    
    def loadDataSet():
        # Each line of testSet.txt is "x1 x2 label"; prepend a constant 1.0 for the intercept
        dataMat = []; labelMat = []
        fr = open('testSet.txt')
        for line in fr.readlines():
            lineArr = line.strip().split()
            dataMat.append([1.0, float(lineArr[0]), float(lineArr[1])])
            labelMat.append(int(lineArr[2]))
        fr.close()
        return dataMat, labelMat
    
    def sigmoid(inX):
        return 1.0 / (1 + exp(-inX))
    
    def gradAscent(dataMatIn, classLabels):
        # Batch gradient ascent on the log-likelihood
        dataMatrix = mat(dataMatIn)              # m x n data matrix
        labelMat = mat(classLabels).transpose()  # m x 1 label vector
        m, n = shape(dataMatrix)
        alpha = 0.001                            # step size
        maxCycles = 500                          # number of iterations
        weights = ones((n, 1))
        for k in range(maxCycles):
            h = sigmoid(dataMatrix * weights)    # m x 1 vector of predictions
            error = (labelMat - h)
            weights = weights + alpha * dataMatrix.transpose() * error
        return weights
    
    dataArr, labelMat = loadDataSet()
    print(gradAscent(dataArr, labelMat))
    # output
    [[ 4.12414349]
    [ 0.48007329]
    [-0.6168482 ]]
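    As an optional cross-check (my own addition, assuming scikit-learn is available; this is not part of the original post), the same data can be fit with sklearn's LogisticRegression. Because dataMat already contains the constant-1 column, the built-in intercept is disabled, and a large C approximates the unregularized maximum-likelihood fit, so the coefficients should be close to, though not exactly, the 500-iteration result above:

    # Hedged cross-check with scikit-learn (assumes sklearn is installed)
    from sklearn.linear_model import LogisticRegression
    
    clf = LogisticRegression(C=1e6, fit_intercept=False)  # large C ~ essentially no regularization
    clf.fit(array(dataArr), labelMat)                     # first column of dataArr is the intercept
    print(clf.coef_)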
    
    

    3.2 Analyzing the Data: Plotting the Decision Boundary

    # Plot the data set and the Logistic regression best-fit line
    import matplotlib.pyplot as plt
    def plotBestFit(wei):
        weights = wei.getA()             # convert the weight matrix to an ndarray
        dataMat, labelMat = loadDataSet()
        dataArr = array(dataMat)
        n = shape(dataMat)[0]            # number of samples
        xcord1 = []; ycord1 = []
        xcord2 = []; ycord2 = []
        for i in range(n):
            if int(labelMat[i]) == 1:    # class 1 points
                xcord1.append(dataArr[i, 1])
                ycord1.append(dataArr[i, 2])
            else:                        # class 0 points
                xcord2.append(dataArr[i, 1])
                ycord2.append(dataArr[i, 2])
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.scatter(xcord1, ycord1, s=20, c='r', marker='s')
        ax.scatter(xcord2, ycord2, s=20, c='g')
        x = arange(-3.0, 3.0, 0.1)       # x-range of the boundary line
        y = (-weights[0] - weights[1] * x) / weights[2]  # boundary: w0 + w1*x + w2*y = 0
        plt.title('DataSet')
        ax.plot(x, y)
        plt.xlabel('X1')
        plt.ylabel('X2')
        plt.show()
    
    
    dataArr, labelMat = loadDataSet()
    weights = gradAscent(dataArr, labelMat)
    plotBestFit(weights)
    

    The Logistic regression best-fit line obtained after 500 iterations of gradient ascent:

    3.3 Training the Algorithm: Stochastic Gradient Ascent

    Gradient ascent traverses the entire data set every time it updates the regression coefficients. That is acceptable for a data set of around 100 samples, but with billions of samples and tens of thousands of features the computational cost becomes far too high. One improvement is to update the coefficients with only one sample at a time; this is called stochastic gradient ascent.

    # Stochastic gradient ascent: update with one sample at a time
    def stocGradAscent0(dataMatrix, classLabels):
        m, n = shape(dataMatrix)
        alpha = 0.01
        weights = ones(n)
        for i in range(m):
            h = sigmoid(sum(dataMatrix[i] * weights))   # scalar prediction for sample i
            error = classLabels[i] - h
            weights = weights + alpha * error * dataMatrix[i]
        return weights
    
    def plotBestFit(weights):
        # Redefined for stocGradAscent0: weights is already a 1-D ndarray, so no .getA() conversion
        dataMat, labelMat = loadDataSet()
        dataArr = array(dataMat)
        n = shape(dataMat)[0]            # number of samples
        xcord1 = []; ycord1 = []
        xcord2 = []; ycord2 = []
        for i in range(n):
            if int(labelMat[i]) == 1:
                xcord1.append(dataArr[i, 1])
                ycord1.append(dataArr[i, 2])
            else:
                xcord2.append(dataArr[i, 1])
                ycord2.append(dataArr[i, 2])
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.scatter(xcord1, ycord1, s=20, c='r', marker='s')
        ax.scatter(xcord2, ycord2, s=20, c='g')
        x = arange(-3.0, 3.0, 0.1)       # x-range of the boundary line
        y = (-weights[0] - weights[1] * x) / weights[2]  # boundary: w0 + w1*x + w2*y = 0
        plt.title('DataSet')
        ax.plot(x, y)
        plt.xlabel('X1')
        plt.ylabel('X2')
        plt.show()
    
    dataMat, labelMat = loadDataSet()
    weights = stocGradAscent0(array(dataMat), labelMat)
    plotBestFit(weights)
    

    The classifier misclassifies about a third of the samples:

    Comparing the result of stochastic gradient ascent directly against that of gradient ascent is unfair: the latter was obtained only after 500 full passes over the data set. A reliable way to judge an optimization algorithm is to check whether it converges, that is, whether the parameters settle at stable values or keep fluctuating. To examine this, we make some modifications to the stochastic gradient ascent algorithm of Listing 5-3 so that it runs 200 passes over the whole data set, and finally plot how the three regression coefficients change.

    # Adjusted gradient ascent and stochastic gradient ascent that record the weight history
    def gradAscent(dataMatIn, classlabels):
        dataMatrix = mat(dataMatIn)
        labelMat = mat(classlabels).T
        m, n = shape(dataMatrix)
        alpha = 0.01
        maxCycles = 500
        weights = ones((n, 1))
        weights_array = array([])
        for k in range(maxCycles):
            h = sigmoid(dataMatrix * weights)
            error = labelMat - h
            weights = weights + alpha * dataMatrix.T * error
            weights_array = append(weights_array, weights)    # record this iteration's weights as one row
        weights_array = weights_array.reshape(maxCycles, n)
        return weights.getA(), weights_array
    
    def stocGradAscent1(dataMatrix, classLabels, numIter=150):
        m, n = shape(dataMatrix)
        weights = ones(n)
        weights_array = array([])
        for j in range(numIter):
            dataIndex = list(range(m))
            for i in range(m):
                alpha = 4 / (1.0 + j + i) + 0.01                    # shrink alpha each update, never all the way to zero
                randIndex = int(random.uniform(0, len(dataIndex)))  # random sample order reduces periodic oscillation
                sampleIndex = dataIndex[randIndex]                  # index into the original data set
                h = sigmoid(sum(dataMatrix[sampleIndex] * weights))
                error = classLabels[sampleIndex] - h
                weights = weights + alpha * error * dataMatrix[sampleIndex]
                weights_array = append(weights_array, weights, axis=0)
                del(dataIndex[randIndex])
        weights_array = weights_array.reshape(numIter * m, n)
        return weights, weights_array
    
    # Plot the regression coefficients against the number of iterations
    def plotWeights(weights_array1, weights_array2):
        # Split the canvas into three rows and two columns
        fig, axs = plt.subplots(nrows=3, ncols=2, sharex=False, sharey=False, figsize=(20, 10))
        x1 = arange(0, len(weights_array1), 1)
        # w0 vs. iterations for gradient ascent
        axs[0][0].plot(x1, weights_array1[:, 0])
        axs0_title_text = axs[0][0].set_title('Gradient ascent: coefficients vs. iterations')
        axs0_ylabel_text = axs[0][0].set_ylabel('W0')
        plt.setp(axs0_title_text, size=20, weight='bold', color='black')
        plt.setp(axs0_ylabel_text, size=20, weight='bold', color='black')
        # w1 vs. iterations
        axs[1][0].plot(x1, weights_array1[:, 1])
        axs1_ylabel_text = axs[1][0].set_ylabel('W1')
        plt.setp(axs1_ylabel_text, size=20, weight='bold', color='black')
        # w2 vs. iterations
        axs[2][0].plot(x1, weights_array1[:, 2])
        axs2_xlabel_text = axs[2][0].set_xlabel('Iterations')
        axs2_ylabel_text = axs[2][0].set_ylabel('W2')
        plt.setp(axs2_xlabel_text, size=20, weight='bold', color='black')
        plt.setp(axs2_ylabel_text, size=20, weight='bold', color='black')
    
        x2 = arange(0, len(weights_array2), 1)
        # w0 vs. iterations for stochastic gradient ascent
        axs[0][1].plot(x2, weights_array2[:, 0])
        axs0_title_text = axs[0][1].set_title('Stochastic gradient ascent: coefficients vs. iterations')
        axs0_ylabel_text = axs[0][1].set_ylabel('W0')
        plt.setp(axs0_title_text, size=20, weight='bold', color='black')
        plt.setp(axs0_ylabel_text, size=20, weight='bold', color='black')
        # w1 vs. iterations
        axs[1][1].plot(x2, weights_array2[:, 1])
        axs1_ylabel_text = axs[1][1].set_ylabel('W1')
        plt.setp(axs1_ylabel_text, size=20, weight='bold', color='black')
        # w2 vs. iterations
        axs[2][1].plot(x2, weights_array2[:, 2])
        axs2_xlabel_text = axs[2][1].set_xlabel('Iterations')
        axs2_ylabel_text = axs[2][1].set_ylabel('W2')
        plt.setp(axs2_xlabel_text, size=20, weight='bold', color='black')
        plt.setp(axs2_ylabel_text, size=20, weight='bold', color='black')
    
        plt.show()
    
    # Plot the two weight histories
    dataMat, labelMat = loadDataSet()
    weights1, weights_array1 = gradAscent(dataMat, labelMat)
    weights2, weights_array2 = stocGradAscent1(array(dataMat), labelMat)
    plotWeights(weights_array1, weights_array2)
    

    # Improved stochastic gradient ascent algorithm
    import random
    
    def stocGradAscent1(dataMatrix, classLabels, numIter=150):
        m, n = shape(dataMatrix)
        weights = ones(n)
        for j in range(numIter):
            dataIndex = list(range(m))
            for i in range(m):
                alpha = 4 / (1.0 + j + i) + 0.01                    # shrink alpha each update, never all the way to zero
                randIndex = int(random.uniform(0, len(dataIndex)))  # random sample order reduces periodic oscillation
                sampleIndex = dataIndex[randIndex]                  # index into the original data set
                h = sigmoid(sum(dataMatrix[sampleIndex] * weights)) # a scalar here, vs. a vector in batch gradient ascent
                error = classLabels[sampleIndex] - h
                weights = weights + alpha * error * dataMatrix[sampleIndex]
                del(dataIndex[randIndex])
        return weights
    
    dataMat, labelMat = loadDataSet()
    weights = stocGradAscent1(array(dataMat), labelMat)
    plotBestFit(weights)
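    To put a number on the fit, a small helper (my own addition in the post's style, not original code) can classify every training sample with the learned weights and report the training error rate:

    # Added sketch: training error rate for the learned weights
    def classifyVector(inX, weights):
        prob = sigmoid(sum(inX * weights))
        return 1 if prob > 0.5 else 0
    
    dataArr = array(dataMat)
    errorCount = 0
    for i in range(len(labelMat)):
        if classifyVector(dataArr[i], weights) != labelMat[i]:
            errorCount += 1
    print('training error rate: %.2f' % (errorCount / float(len(labelMat))))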
    
