• Recognizing hand-written digits with sklearn


    Source

    https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py

    An example showing how scikit-learn can be used to recognize images of hand-written digits.

    This example is commented in the tutorial section of the user manual.

    • Figure: four training images titled "Training: 0" to "Training: 3", and four test images titled "Prediction: 8", "Prediction: 8", "Prediction: 4", "Prediction: 9"
    • Figure: Confusion Matrix

    Out:

    Classification report for classifier SVC(gamma=0.001):
                  precision    recall  f1-score   support
    
               0       1.00      0.99      0.99        88
               1       0.99      0.97      0.98        91
               2       0.99      0.99      0.99        86
               3       0.98      0.87      0.92        91
               4       0.99      0.96      0.97        92
               5       0.95      0.97      0.96        91
               6       0.99      0.99      0.99        91
               7       0.96      0.99      0.97        89
               8       0.94      1.00      0.97        88
               9       0.93      0.98      0.95        92
    
        accuracy                           0.97       899
       macro avg       0.97      0.97      0.97       899
    weighted avg       0.97      0.97      0.97       899
    
    
    Confusion matrix:
    [[87  0  0  0  1  0  0  0  0  0]
     [ 0 88  1  0  0  0  0  0  1  1]
     [ 0  0 85  1  0  0  0  0  0  0]
     [ 0  0  0 79  0  3  0  4  5  0]
     [ 0  0  0  0 88  0  0  0  0  4]
     [ 0  0  0  0  0 88  1  0  0  2]
     [ 0  1  0  0  0  0 90  0  0  0]
     [ 0  0  0  0  0  1  0 88  0  0]
     [ 0  0  0  0  0  0  0  0 88  0]
     [ 0  0  0  1  0  1  0  0  0 90]]
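
The classification report and the confusion matrix above describe the same predictions, so the report's numbers can be recomputed from the matrix. The following is a minimal sketch, assuming rows are true digits and columns are predicted digits (the scikit-learn convention); the helper name `per_class_scores` is made up for this illustration and is not part of scikit-learn.

```python
# Confusion matrix copied from the output above: rows are true digits,
# columns are predicted digits.
cm = [
    [87, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 88, 1, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 85, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 79, 0, 3, 0, 4, 5, 0],
    [0, 0, 0, 0, 88, 0, 0, 0, 0, 4],
    [0, 0, 0, 0, 0, 88, 1, 0, 0, 2],
    [0, 1, 0, 0, 0, 0, 90, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 88, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 88, 0],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 90],
]


def per_class_scores(matrix):
    """Hypothetical helper: precision, recall, support per class, plus accuracy."""
    n = len(matrix)
    # Support: how many samples truly belong to each class (row sums).
    support = [sum(row) for row in matrix]
    # How many samples were predicted as each class (column sums).
    predicted_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    recall = [matrix[k][k] / support[k] for k in range(n)]
    precision = [matrix[k][k] / predicted_totals[k] for k in range(n)]
    accuracy = sum(matrix[k][k] for k in range(n)) / sum(support)
    return precision, recall, support, accuracy


precision, recall, support, accuracy = per_class_scores(cm)
for digit in range(10):
    print(f"{digit}: precision={precision[digit]:.2f} "
          f"recall={recall[digit]:.2f} support={support[digit]}")
print(f"accuracy={accuracy:.2f}")
```

For digit 3, for example, 79 of the 91 true samples lie on the diagonal (recall 79/91 ≈ 0.87), and 79 of the 81 samples predicted as 3 are correct (precision 79/81 ≈ 0.98), which agrees with the report.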
    

    Code

    https://github.com/fanqingsong/code_snippet/blob/master/sklearn/recognize_hand_written_digits.ipynb

    print(__doc__)
    
    # Author: Gael Varoquaux <gael dot varoquaux at normalesup dot org>
    # License: BSD 3 clause
    
    # Standard scientific Python imports
    import matplotlib.pyplot as plt
    
    # Import datasets, classifiers and performance metrics
    from sklearn import datasets, svm, metrics
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import confusion_matrix
    
    
    # The digits dataset
    digits = datasets.load_digits()
    
    # The data that we are interested in is made of 8x8 images of digits, let's
    # have a look at the first 4 images, stored in the `images` attribute of the
    # dataset.  If we were working from image files, we could load them using
    # matplotlib.pyplot.imread.  Note that each image must have the same size. For these
    # images, we know which digit they represent: it is given in the 'target' of
    # the dataset.
    _, axes = plt.subplots(2, 4)
    images_and_labels = list(zip(digits.images, digits.target))
    for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
        ax.set_axis_off()
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
        ax.set_title('Training: %i' % label)
    
        
    # To apply a classifier to this data, we need to flatten each image,
    # turning the data into a (samples, features) matrix:
    n_samples = len(digits.images)
    print("----------- images shape:")
    print(digits.images.shape)
    
    data = digits.images.reshape((n_samples, -1))
    print("----------- data shape:")
    print(data.shape)
    
    
    
    # Create a classifier: a support vector classifier
    classifier = svm.SVC(gamma=0.001)
    
    # Split data into train and test subsets
    X_train, X_test, y_train, y_test = train_test_split(
        data, digits.target, test_size=0.5, shuffle=True)
    
    # We learn the digits on the training subset
    classifier.fit(X_train, y_train)
    
    
    
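    # Now predict the value of the digit on the test subset:
    predicted = classifier.predict(X_test)
    
    # Display some predicted instances. Because the split is shuffled, the test
    # images are recovered from X_test rather than by slicing digits.images.
    images_and_predictions = list(zip(X_test.reshape(-1, 8, 8), predicted))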
    for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
        ax.set_axis_off()
        ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
        ax.set_title('Prediction: %i' % prediction)
    
        
    print("Classification report for classifier %s:\n%s\n"
          % (classifier, metrics.classification_report(y_test, predicted)))
    
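    # metrics.plot_confusion_matrix was removed in scikit-learn 1.2; on 1.0 and
    # later, ConfusionMatrixDisplay.from_estimator is the replacement.
    disp = metrics.ConfusionMatrixDisplay.from_estimator(classifier, X_test, y_test)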
    disp.figure_.suptitle("Confusion Matrix")
    
    print("Confusion matrix:\n%s" % disp.confusion_matrix)
    
    cm = confusion_matrix(y_test, predicted, normalize="true")
    print("confusion matrix with normalize=true")
    print(cm)
    
    
    
    plt.show()

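    In the code above, each 8x8 two-dimensional image is flattened into a one-dimensional vector of 64 values. The printed shapes are: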

    ----------- images shape:
    (1797, 8, 8)
    ----------- data shape:
    (1797, 64)
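
The flattening step can be tried in isolation. This is a minimal sketch using a small made-up array in place of `digits.images` (3 images instead of 1797), so the values are illustrative only. It also shows that the reshape is reversible, which is what allows flattened test rows to be turned back into 8x8 images for display.

```python
import numpy as np

# Hypothetical stand-in for digits.images: 3 "images" of 8x8 pixels.
images = np.arange(3 * 8 * 8).reshape(3, 8, 8)

# Flatten each image into one row; -1 lets NumPy infer 8 * 8 = 64.
data = images.reshape((len(images), -1))
print(images.shape, "->", data.shape)

# Row-major order: the pixel at (row r, column c) lands at flat index r * 8 + c.
assert data[1, 2 * 8 + 5] == images[1, 2, 5]

# The reshape loses no information, so flat rows can be turned back into images.
restored = data.reshape(-1, 8, 8)
print(np.array_equal(restored, images))
```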

    Normalized (probability) confusion matrix

    confusion matrix with normalize=true
    [[1.         0.         0.         0.         0.         0.
      0.         0.         0.         0.        ]
     [0.         1.         0.         0.         0.         0.
      0.         0.         0.         0.        ]
     [0.         0.         1.         0.         0.         0.
      0.         0.         0.         0.        ]
     [0.         0.         0.         0.98780488 0.         0.
      0.         0.01219512 0.         0.        ]
     [0.         0.         0.         0.         0.99019608 0.
      0.         0.         0.00980392 0.        ]
     [0.         0.         0.         0.         0.         0.97894737
      0.01052632 0.         0.         0.01052632]
     [0.         0.         0.         0.         0.         0.
      1.         0.         0.         0.        ]
     [0.         0.         0.         0.         0.         0.
      0.         1.         0.         0.        ]
     [0.         0.03157895 0.         0.         0.         0.
      0.         0.         0.96842105 0.        ]
     [0.         0.         0.         0.01136364 0.         0.02272727
      0.         0.01136364 0.01136364 0.94318182]]
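
With `normalize="true"`, `confusion_matrix` normalizes over the true labels, so every row sums to 1 and the diagonal holds the per-class recall. The sketch below reproduces that arithmetic with plain NumPy on a small made-up count matrix; `normalize_rows` and the counts are hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_rows(counts):
    """Hypothetical helper: divide each row of a count matrix by its row sum."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)


# Made-up 3-class count matrix (rows: true class, columns: predicted class).
counts = [[81, 1, 0],
          [0, 50, 0],
          [2, 3, 95]]

normalized = normalize_rows(counts)
print(normalized)

# Each row now sums to 1, and the diagonal is the recall of each class.
print(normalized.sum(axis=1))
print(np.diag(normalized))
```

The first row of this toy matrix gives 81/82 ≈ 0.98780488 and 1/82 ≈ 0.01219512. The same two values appear in the row for digit 3 in the output above, which is consistent with 81 of 82 test samples of that digit being classified correctly in that run.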
  • Original article: https://www.cnblogs.com/lightsong/p/14172405.html