• Machine Learning: Logistic Regression


    Key points:

    """
    Logistic regression: in its basic form it solves binary classification problems only
    
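    Loss functions:
        1. Mean squared error (as used in linear regression): convex, so there is a single minimum and no other local minima
        2. Log-likelihood loss: also convex for plain logistic regression; multiple local minima appear with non-convex losses, e.g. squared error on the sigmoid output, or neural networks
            Mitigations for local minima: 1. Randomly initialize several times and compare the minima reached;
                                          2. Adjust the learning rate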
    
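    Drawback of logistic regression: multiclass problems are awkward to handle and need an extension such as one-vs-rest or softmax regression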
    
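    Generative models: model the prior probability of each class (naive Bayes, hidden Markov models)
    
    Discriminative models: no class prior is modeled (logistic regression, KNN, decision trees, random forests, neural networks)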
    """
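The restart strategy from the notes can be sketched in plain Python: run gradient descent on the log loss from several random starting points and keep the result with the lowest loss. The one-feature dataset, the function names, the learning rate, and the step count below are all made up for illustration; this is a minimal sketch, not the course's implementation.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, b, xs, ys):
    """Mean negative log-likelihood of the 0/1 labels ys under the model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def fit(xs, ys, w, b, lr=0.5, steps=2000):
    """Plain batch gradient descent on the log loss, starting from (w, b)."""
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_with_restarts(xs, ys, restarts=5, seed=0):
    """Fit from several random initial points; return (loss, w, b) per run."""
    rng = random.Random(seed)
    results = []
    for _ in range(restarts):
        w0, b0 = rng.uniform(-3, 3), rng.uniform(-3, 3)
        w, b = fit(xs, ys, w0, b0)
        results.append((log_loss(w, b, xs, ys), w, b))
    return results


# Hypothetical toy data: one feature, labels overlap so the classes
# cannot be separated perfectly by a single threshold.
XS = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
YS = [0, 0, 1, 0, 1, 0, 1]

restart_results = fit_with_restarts(XS, YS)
best_loss, best_w, best_b = min(restart_results)
print("losses per restart:", [round(r[0], 6) for r in restart_results])
print("best (w, b):", best_w, best_b)
```

On this small dataset the restarts should end at nearly the same loss, which is what one would expect when the loss surface has a single basin. Comparing restarts becomes more informative when the runs disagree.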

    Code:

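    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler


    def logistic():
        """
        Binary classification with logistic regression: predict cancer from cell attributes
        :return: None
        """
        # Column names for the data file, which has no header row
        column = ['Sample code number','Clump Thickness', 'Uniformity of Cell Size','Uniformity of Cell Shape','Marginal Adhesion', 'Single Epithelial Cell Size','Bare Nuclei','Bland Chromatin','Normal Nucleoli','Mitoses','Class']

        # Load the data
        data = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data", names=column)

        print(data)

        # Handle missing values: '?' marks a missing entry, so convert it to NaN and drop those rows
        data = data.replace(to_replace='?', value=np.nan)

        data = data.dropna()

        # Split the data: columns 1-9 are the features, 'Class' (2 = benign, 4 = malignant) is the target
        x_train, x_test, y_train, y_test = train_test_split(data[column[1:10]], data[column[10]], test_size=0.25)

        # Standardize the features: fit on the training set only, then apply the same scaling to the test set
        std = StandardScaler()

        x_train = std.fit_transform(x_train)
        x_test = std.transform(x_test)

        # Fit logistic regression and predict
        lg = LogisticRegression(C=1.0)

        lg.fit(x_train, y_train)

        # Learned weight for each of the nine features
        print(lg.coef_)

        y_predict = lg.predict(x_test)

        print("Accuracy:", lg.score(x_test, y_test))

        # The report lists precision, recall and F1 per class, not recall alone
        print("Classification report:\n", classification_report(y_test, y_predict, labels=[2, 4], target_names=["benign", "malignant"]))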
    
        return None
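The report printed at the end of `logistic()` gives precision and recall per class. To make those two numbers concrete, here is a small plain-Python sketch that computes them for the malignant class (label 4). The helper name `precision_recall` and the label lists are hypothetical and are not outputs of the model above.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels in the dataset's encoding: 2 = benign, 4 = malignant
y_true_demo = [2, 2, 4, 4, 4, 2, 4, 2]
y_pred_demo = [2, 4, 4, 4, 2, 4, 4, 2]

precision_4, recall_4 = precision_recall(y_true_demo, y_pred_demo, positive=4)
print("precision:", precision_4)  # 3 of the 5 predicted malignant are correct
print("recall:", recall_4)        # 3 of the 4 truly malignant are found
```

For a cancer screen, recall on the malignant class is usually the number to watch, since a missed malignant case costs more than a false alarm.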

    Loss function:
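For reference, the log-likelihood loss for binary logistic regression is commonly written as below. The notation is a conventional choice rather than taken from the original post: $m$ is the number of samples, $\theta$ the weight vector, $h_\theta(x)$ the sigmoid output, and $y_i \in \{0, 1\}$ (so the dataset's labels 2 and 4 would be mapped to 0 and 1).

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad h_\theta(x) = \sigma(\theta^{\top} x)

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
    \Big[ y_i \log h_\theta(x_i) + (1 - y_i) \log\big(1 - h_\theta(x_i)\big) \Big]
```

Each term penalizes a confident wrong prediction heavily: when $y_i = 1$ the loss for that sample is $-\log h_\theta(x_i)$, which grows without bound as $h_\theta(x_i)$ approaches 0.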

  • Original post: https://www.cnblogs.com/ywjfx/p/10898684.html