Python MLP neural network: MinMaxScaler gives worse results than StandardScaler

MLP with hidden layers (64, 2), using preprocessing.MinMaxScaler().fit(X)
test confusion_matrix:
[[129293   2734]
[   958 23375]]
             precision    recall f1-score   support

          0       0.99      0.98      0.99    132027
          1       0.90      0.96      0.93     24333

avg / total       0.98      0.98      0.98    156360

all confusion_matrix:
[[646945 13384]
[ 4455 117015]]
             precision    recall f1-score   support

          0       0.99      0.98      0.99    660329
          1       0.90      0.96      0.93    121470

avg / total       0.98      0.98      0.98    781799

black verify confusion_matrix:
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0
0 0 0 0 0]
/root/anaconda2/lib/python2.7/site-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
'recall', 'true', average, warn_for)
             precision    recall f1-score   support

          0       0.00      0.00      0.00         0
          1       1.00      0.07      0.13        42

avg / total       1.00      0.07      0.13        42

white verify confusion_matrix:
[1 1 1 1 1 1 0]
             precision    recall f1-score   support

          0       1.00      0.14      0.25         7
          1       0.00      0.00      0.00         0

avg / total       1.00      0.14      0.25         7

unknown_verify:
[1 0 0 1 1 0 0 0 1 1 0 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0 1 1 1 1
0 1 1 1 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 1 1 0 0 1 0 0 0 1 0 1 1 0 0 1 0 0 0]
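
The per-class figures in the report follow directly from the confusion matrix. As a quick check, here is a minimal sketch that recomputes precision and recall from the MinMaxScaler test matrix above. The helper name `precision_recall` is hypothetical and exists only for this illustration.

```python
# Test-set confusion matrix from the MinMaxScaler run
# (rows: true label, columns: predicted label).
confusion = [[129293, 2734],
             [958, 23375]]


def precision_recall(matrix, label):
    """Return (precision, recall) for one class of a square confusion matrix."""
    true_positive = matrix[label][label]
    predicted_as_label = sum(row[label] for row in matrix)  # column sum
    actually_label = sum(matrix[label])                      # row sum
    return true_positive / predicted_as_label, true_positive / actually_label


for label in (0, 1):
    precision, recall = precision_recall(confusion, label)
    print("class %d: precision=%.2f recall=%.2f" % (label, precision, recall))
```

Class 1 is where this run loses ground: 2734 of the 26109 samples predicted as class 1 are false positives, which is what pulls its precision down to 0.90.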

MLP with hidden layers (64, 2), using preprocessing.StandardScaler().fit(X)
test confusion_matrix:
[[131850    180]
[   230 24100]]
             precision    recall f1-score   support

          0       1.00      1.00      1.00    132030
          1       0.99      0.99      0.99     24330

avg / total       1.00      1.00      1.00    156360

all confusion_matrix:
[[659500    829]
[ 1195 120275]]
             precision    recall f1-score   support

          0       1.00      1.00      1.00    660329
          1       0.99      0.99      0.99    121470

avg / total       1.00      1.00      1.00    781799

black verify confusion_matrix:
[0 1 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 1
0 0 0 1 1]
/root/anaconda2/lib/python2.7/site-packages/sklearn/metrics/classification.py:1137: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples.
'recall', 'true', average, warn_for)
             precision    recall f1-score   support

          0       0.00      0.00      0.00         0
          1       1.00      0.62      0.76        42

avg / total       1.00      0.62      0.76        42

white verify confusion_matrix:
[0 0 1 0 1 1 0]
             precision    recall f1-score   support

          0       1.00      0.57      0.73         7
          1       0.00      0.00      0.00         0

avg / total       1.00      0.57      0.73         7

unknown_verify:
[1 0 0 0 1 0 1 1 0 0 1 0 1 1 0 1 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0
0 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0]
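
Each verification set contains a single true class, so the classification report raises UndefinedMetricWarning and its averaged rows are hard to read. A simpler summary is the fraction of samples predicted correctly. The sketch below recomputes that fraction from the prediction arrays printed in the logs, assuming (as the reports indicate) that every black sample has true label 1 and every white sample has true label 0. `parse_predictions` and `hit_rate` are hypothetical helpers.

```python
def parse_predictions(text):
    """Turn a NumPy-style printed array such as '[0 1 1]' into a list of ints."""
    return [int(token) for token in text.replace("[", " ").replace("]", " ").split()]


def hit_rate(predictions, true_label):
    """Fraction of predictions equal to the set's single true label."""
    return sum(1 for p in predictions if p == true_label) / len(predictions)


# Prediction arrays copied from the logs of the two runs.
runs = {
    "MinMaxScaler": {
        "black": ("[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 "
                  "0 0 0 0 0]"),
        "white": "[1 1 1 1 1 1 0]",
    },
    "StandardScaler": {
        "black": ("[0 1 1 0 0 0 0 1 1 1 0 1 1 1 1 1 1 0 1 1 1 0 0 0 1 1 1 0 0 0 1 1 1 1 1 1 1 "
                  "0 0 0 1 1]"),
        "white": "[0 0 1 0 1 1 0]",
    },
}

rates = {}
for scaler_name, sets in runs.items():
    black = hit_rate(parse_predictions(sets["black"]), true_label=1)
    white = hit_rate(parse_predictions(sets["white"]), true_label=0)
    rates[scaler_name] = (black, white)
    print("%s: black %.2f, white %.2f" % (scaler_name, black, white))
```

The results match the recall column of the reports (0.07 and 0.14 for MinMaxScaler, 0.62 and 0.57 for StandardScaler). With only 42 and 7 samples, these rates are rough indicators rather than reliable estimates.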

Code:

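    import pickle

    import numpy as np
    import sklearn.metrics
    from sklearn import preprocessing
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Fit the scaler on the full feature matrix X; swap the two lines below to compare scalers.
    scaler = preprocessing.StandardScaler().fit(X)
    # scaler = preprocessing.MinMaxScaler().fit(X)
    X = scaler.transform(X)
    print("standard X sample:", X[:3])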

    black_verify = scaler.transform(black_verify)
    print(black_verify)

    white_verify = scaler.transform(white_verify)
    print(white_verify)

    unknown_verify = scaler.transform(unknown_verify)
    print(unknown_verify)

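    # Oversample the black verification samples 20 times. Rows must be appended with
    # np.concatenate; element-wise addition of the two arrays (such as X += black_verify) raises
    # ValueError: operands could not be broadcast together with shapes (756140,75) (42,75) (756140,75)
    for i in range(20):
        X = np.concatenate((X, black_verify))
        y += black_verify_labels  # extends y in place; assumes y and black_verify_labels are Python lists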


    labels = ['white', 'CC']
    if True:  # set to False to load a previously pickled model instead of retraining
        # pdb.set_trace()
        ratio_of_train = 0.8
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=(1 - ratio_of_train))
        # X_train=preprocessing.normalize(X_train)
        # X_test=preprocessing.normalize(X_test)
        clf = MLPClassifier(solver='sgd', batch_size=128, learning_rate='adaptive', max_iter=256,
                            hidden_layer_sizes=(64, 2), random_state=1)

        """
        clf = sklearn.ensemble.RandomForestClassifier(n_estimators=n_estimators, verbose=verbose, n_jobs=n_jobs,
                                                      random_state=random_state, oob_score=True)
        """

        clf.fit(X_train, y_train)
        print("test confusion_matrix:")
        # print clf.feature_importances_
        y_pred = clf.predict(X_test)
        print(sklearn.metrics.confusion_matrix(y_test, y_pred))
        print(classification_report(y_test, y_pred))
    else:
        # clf = pickle.loads(open("mpl-acc97-recall98.pkl", 'rb').read())
        # Load a previously trained model; the with-block closes the file handle.
        with open("mlp-add-topx10.model", 'rb') as model_file:
            clf = pickle.load(model_file)
        y_pred = clf.predict(X)
        print(sklearn.metrics.confusion_matrix(y, y_pred))
        print(classification_report(y, y_pred))
        import sys
        #sys.exit(0)


    print("all confusion_matrix:")
    y_pred = clf.predict(X)
    print(sklearn.metrics.confusion_matrix(y, y_pred))
    print(classification_report(y, y_pred))
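
The logs do not say why MinMaxScaler does worse here. One common cause, offered only as an assumption about this data set, is heavy-tailed features: a single extreme value stretches the min-max range and squeezes the ordinary values into a sliver near 0, while the standard deviation used by StandardScaler is inflated far less by one outlier. A minimal sketch on synthetic data (the feature values are made up for illustration and are not the author's data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.RandomState(0)

# Hypothetical heavy-tailed feature: 999 ordinary values in [0, 10) plus one extreme outlier.
feature = np.concatenate([rng.uniform(0, 10, size=999), [10000.0]]).reshape(-1, 1)

minmax = MinMaxScaler().fit_transform(feature)
standard = StandardScaler().fit_transform(feature)

# Width of the interval occupied by the 999 ordinary values after each scaling.
minmax_spread = minmax[:-1].max() - minmax[:-1].min()
standard_spread = standard[:-1].max() - standard[:-1].min()

print("MinMaxScaler spread of ordinary values:  ", minmax_spread)
print("StandardScaler spread of ordinary values:", standard_spread)
```

On this synthetic feature the ordinary values keep a spread more than ten times wider under StandardScaler. StandardScaler also centers each feature at zero, whereas MinMaxScaler output is all non-negative, which may matter for SGD training. Whether either effect explains the gap in the logs would have to be checked against the real feature distributions.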

Original article: https://www.cnblogs.com/bonelee/p/9082014.html