• Andrew Ng's Coursera Machine Learning Specialization, Machine Learning: Advanced Learning Algorithms — Week 4 Quizzes


    Practice quiz: Decision trees

    Question 1: Based on the decision tree shown in the lecture, if an animal has floppy ears, a round face shape, and whiskers, does the model predict that it's a cat or not a cat?

    [Correct] Cat
    Not a cat
    [Explanation] Correct. Follow the floppy-ears branch to the right; then at the whiskers decision node, go left because whiskers are present. You reach a leaf node labeled "cat", so the model predicts that this is a cat.

    Question 2: Take a decision tree learning algorithm classifying between spam and non-spam email. There are 20 training examples at the root node, comprising 10 spam and 10 non-spam emails. If the algorithm can choose from among four features, resulting in four corresponding splits, which would it choose (i.e., which has the highest purity)?

    Left split: 5 of 10 emails are spam. Right split: 5 of 10 emails are spam.
    Left split: 7 of 8 emails are spam. Right split: 3 of 12 emails are spam.
    Left split: 2 of 2 emails are spam. Right split: 8 of 18 emails are spam.
    [Correct] Left split: 10 of 10 emails are spam. Right split: 0 of 10 emails are spam.
    [Explanation] Yes! Both branches are completely pure (entropy 0), so this split gives the highest purity.

    Practice quiz: Decision tree learning

    Question 1: Recall that entropy was defined in lecture as H(p_1) = - p_1 log_2(p_1) - p_0 log_2(p_0), where p_1 is the fraction of positive examples and p_0 the fraction of negative examples.

    [image: question figure not captured]
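As a small sketch of this definition (not part of the quiz itself), the entropy function can be computed directly, using the standard convention that 0·log₂(0) = 0 so pure nodes have zero entropy:

```python
import math

def entropy(p1):
    """Entropy H(p1) of a node, where p1 is the fraction of positive examples.
    By convention 0 * log2(0) = 0, so pure nodes (p1 = 0 or 1) have entropy 0."""
    if p1 == 0 or p1 == 1:
        return 0.0
    p0 = 1 - p1
    return -p1 * math.log2(p1) - p0 * math.log2(p0)

print(entropy(0.5))  # 1.0 -- a 50/50 node is maximally impure
print(entropy(1.0))  # 0.0 -- a pure node
```

Note that entropy is symmetric: entropy(p1) equals entropy(1 - p1), since swapping the class labels does not change the impurity.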

    Question 2: Recall that information gain was defined as follows:

    [image: question figure not captured]
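As a sketch of the lecture's definition (the exact figure is not reproduced here), information gain is the entropy at the root minus the weighted average of the entropies of the two branches, where the weights are the fractions of examples sent to each branch:

```python
import math

def entropy(p1):
    """Entropy of a node with fraction p1 of positive examples."""
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)

def information_gain(p1_root, p1_left, w_left, p1_right, w_right):
    """IG = H(p1_root) - (w_left * H(p1_left) + w_right * H(p1_right)),
    where w_left and w_right are the fractions of examples in each branch."""
    return entropy(p1_root) - (w_left * entropy(p1_left)
                               + w_right * entropy(p1_right))

# The perfect split from the spam question: 10/10 spam left, 0/10 spam right.
print(information_gain(0.5, 1.0, 0.5, 0.0, 0.5))  # 1.0
```

A split that leaves both branches as impure as the root (e.g., 50/50 on each side) has an information gain of exactly 0, which is why such splits are never chosen.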

    Question 3: To represent 3 possible values for the ear shape, you can define 3 features for ear shape: pointy ears, floppy ears, oval ears. For an animal whose ears are not pointy, not floppy, but are oval, how can you represent this information as a feature vector?

    [0, 1, 0]
    [1, 0, 0]
    [1, 1, 0]
    [Correct] [0, 0, 1]
    [Explanation] Yes! 0 is used to represent the absence of that feature (not pointy, not floppy), and 1 is used to represent the presence of that feature (oval).
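This one-hot encoding can be sketched with a small helper (the function name and the category ordering below are assumptions for illustration, not from the course):

```python
# Assumed category order matching the question: pointy, floppy, oval.
EAR_SHAPES = ["pointy", "floppy", "oval"]

def one_hot(value, categories):
    """Encode a categorical value as a vector with a 1 in the position
    of the matching category and 0 everywhere else."""
    return [1 if c == value else 0 for c in categories]

print(one_hot("oval", EAR_SHAPES))    # [0, 0, 1]
print(one_hot("pointy", EAR_SHAPES))  # [1, 0, 0]
```

Exactly one entry is 1 in each vector, which is what lets a binary decision tree treat each of the three new features as an ordinary yes/no split.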

    Question 4: For a continuous-valued feature (such as the weight of the animal), given 10 training examples, which is the recommended way to choose a split threshold?

    Try every value spaced at regular intervals (e.g., 8, 8.5, 9, 9.5, 10, etc.) and find the split that gives the highest information gain.
    Use a one-hot encoding to turn the feature into a discrete feature vector of 0's and 1's, then apply the algorithm we had discussed for discrete features.
    [Correct] Choose the 9 mid-points between the 10 examples as possible splits, and find the split that gives the highest information gain.
    Use gradient descent to find the value of the split threshold that gives the highest information gain.
    [Explanation] Correct. This is what is proposed in the lectures.
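The mid-point procedure can be sketched as follows. The weights and labels below are made-up toy data in the style of the lecture's cat/weight example, not the course's actual numbers:

```python
import math

def entropy(p1):
    """Entropy of a node with fraction p1 of positive examples."""
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)

def best_threshold(values, labels):
    """Try the mid-points between consecutive sorted feature values as
    candidate thresholds; return the one with the highest information gain."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    p1_root = sum(y for _, y in pairs) / n
    best_gain, best_t = -1.0, None
    for i in range(n - 1):  # n examples give n - 1 mid-points
        t = (pairs[i][0] + pairs[i + 1][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        if not left or not right:
            continue  # duplicate values can produce a one-sided split
        w_l, w_r = len(left) / n, len(right) / n
        gain = entropy(p1_root) - (w_l * entropy(sum(left) / len(left))
                                   + w_r * entropy(sum(right) / len(right)))
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

# Toy data: 10 animal weights (lbs) with cat = 1, not-cat = 0 (made up here).
weights = [7.2, 8.8, 15.0, 9.2, 8.4, 7.6, 11.0, 10.2, 18.0, 20.0]
labels  = [1,   1,   0,    0,   1,   1,   0,    0,    0,    0]
t, g = best_threshold(weights, labels)
print(t, g)  # threshold 9.0 separates the classes perfectly here
```

With 10 examples there are 9 mid-points to try, matching the correct answer above; on this toy data the mid-point between 8.8 and 9.2 yields two pure branches, so the gain equals the root entropy H(0.4).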

    Question 5: Which of these are commonly used criteria to decide when to stop splitting? (Choose two.)

    When the information gain from additional splits is too large
    [Correct] When the tree has reached a maximum depth
    [Explanation] Yes!
    [Correct] When the number of examples in a node is below a threshold
    [Explanation] Yes!
    When a node is 50% one class and 50% another class (highest possible value of entropy)

    Practice quiz: Tree ensembles

    Question 1: For the random forest, how do you build each individual tree so that they are not all identical to each other?

    Train the algorithm multiple times on the same training set. This will naturally result in different trees.
    If you are training B trees, train each one on 1/B of the training set, so each tree is trained on a distinct set of examples.
    [Correct] Sample the training data with replacement
    Sample the training data without replacement
    [Explanation] Correct. You can generate a training set that is unique for each individual tree by sampling the training data with replacement.
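A minimal sketch of this bootstrap sampling step (the helper name is an assumption for illustration):

```python
import random

def bootstrap_sample(dataset, rng=random):
    """Sampling with replacement: draw len(dataset) examples, each time
    picking uniformly from the FULL dataset, so some examples repeat and
    others are left out -- giving each tree a different training set."""
    n = len(dataset)
    return [dataset[rng.randrange(n)] for _ in range(n)]

random.seed(0)  # fixed seed so the example is reproducible
data = list(range(10))
sample = bootstrap_sample(data)
# The sample has the same size as the original, but typically contains
# duplicates and omits roughly a third of the original examples.
print(sorted(sample))
```

Because each tree trains on a different bootstrap sample, the trees in the forest end up making somewhat different splits, which is what makes averaging their predictions useful.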

    Question 2: You are choosing between a decision tree and a neural network for a classification task where the input x is a 100x100 resolution image. Which would you choose?

    A neural network, because the input is structured data and neural networks typically work better with structured data.
    [Correct] A neural network, because the input is unstructured data and neural networks typically work better with unstructured data.
    A decision tree, because the input is unstructured and decision trees typically work better with unstructured data.
    A decision tree, because the input is structured data and decision trees typically work better with structured data.

    Question 3: What does sampling with replacement refer to?

    [Correct] Drawing a sequence of examples where, when picking the next example, we first replace all previously drawn examples into the set we are picking from.
    Drawing a sequence of examples where, when picking the next example, we first remove all previously drawn examples from the set we are picking from.
    It refers to a process of making an identical copy of the training set.
    It refers to using a new sample of data that we use to permanently overwrite (that is, to replace) the original data.

  • Original article: https://www.cnblogs.com/chuqianyu/p/16439116.html