• The difference between the training set, validation set, and test set


    1. training set: used to train the model (fit its parameters)
    2. validation set: used for model selection
    3. test set: used to evaluate the actual performance of the selected model
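
    As a rough illustration of the three roles above, the sketch below shuffles a dataset and cuts it into three disjoint parts. The function name split_dataset, the 60/20/20 proportions, and the fixed seed are assumptions made for this example; the original post does not prescribe any particular split.

```python
import random


def split_dataset(samples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle samples and split them into (train, validation, test) lists.

    The fractions and the seed are illustrative defaults, not recommendations.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)

    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]  # everything left over is used for training
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

    The three lists never share a sample, which is what allows the validation and test sets to act as unseen data.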

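    As we know, before training we must first choose the form of the model: a linear model (y = wx + b) or a non-linear model (SVM, decision tree, neural network….). Only once the form is chosen do we start training. The goal of training is to determine the model's parameters, which is usually done by designing a loss function and then optimizing it.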
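
    To make "design a loss function, then optimize it" concrete, here is a minimal sketch that fits the linear model y = wx + b by gradient descent on the mean squared error. The choice of loss, the learning rate, the step count, and the toy data are all assumptions for illustration; the post does not name a specific loss or optimizer.

```python
def fit_linear(xs, ys, lr=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

    Here the form of the model (a straight line) is fixed before training starts; training only determines the parameters w and b.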

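    Often, though, we do not know in advance which kind of model is suitable, so we train several candidate models, which gives us several trained models to choose from. We then evaluate every one of them on the validation set and pick the model with the lowest error rate.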

    In other words, the validation set is mainly used to select the model.
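
    The selection procedure described above can be sketched as follows: train every candidate on the training set, measure each one's error rate on the validation set, keep the best, and only then touch the test set. The three toy classifiers, the helper names, and the data points are hypothetical and chosen only to keep the example small; they are not the models named in the text.

```python
from collections import Counter


def error_rate(predict, dataset):
    """Fraction of (x, label) pairs that predict gets wrong."""
    wrong = sum(1 for x, label in dataset if predict(x) != label)
    return wrong / len(dataset)


def fit_majority(train):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def fit_nearest_centroid(train):
    """Predict the label whose mean training x is closest to the query."""
    sums, counts = {}, Counter()
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    centroids = {label: sums[label] / counts[label] for label in sums}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


def fit_one_nearest_neighbor(train):
    """Predict the label of the single closest training point."""
    return lambda x: min(train, key=lambda pair: abs(x - pair[0]))[1]


def select_model(candidates, train, validation):
    """Train every candidate, then return the one with the lowest validation error."""
    fitted = {name: fit(train) for name, fit in candidates.items()}
    errors = {name: error_rate(model, validation) for name, model in fitted.items()}
    best = min(errors, key=errors.get)
    return best, fitted[best], errors


# Made-up 1-D data: the label is 1 for x >= 2, and (1.5, 1) is a noisy training point.
train = [(0.0, 0), (0.5, 0), (1.0, 0), (1.5, 1), (2.5, 1), (3.0, 1), (3.5, 1), (4.0, 1)]
validation = [(0.2, 0), (0.8, 0), (1.4, 0), (1.6, 0), (2.2, 1), (3.2, 1)]
test = [(0.3, 0), (1.1, 0), (2.8, 1), (3.9, 1)]

candidates = {
    "majority": fit_majority,
    "nearest_centroid": fit_nearest_centroid,
    "one_nearest_neighbor": fit_one_nearest_neighbor,
}

best_name, best_model, validation_errors = select_model(candidates, train, validation)
print(best_name)  # nearest_centroid

# The test set is used once, after selection, to estimate real-world performance.
test_error = error_rate(best_model, test)
print(test_error)
```

    Note that the test set plays no part in choosing among the candidates; it is held back for the final evaluation of the selected model.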

    The main trick here is to 'hold out' a portion of our data from training and use the model's performance on that subset of the data as a proxy for the true risk.

    This data is known as 'validation' data. It contrasts with test data because its values are known at model design time. However, in contrast to training data, we don't use it to fit our model.

    This means that it doesn't exhibit the same bias that the empirical risk does when estimating the true risk.
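
    The bias of the empirical risk can be seen in a small experiment. A 1-nearest-neighbor classifier memorizes its training points, so its error on the training data says almost nothing about its true error, while its error on held-out data is a much more honest estimate. The data generator, the 20% label noise, and the sample sizes below are assumptions made for this sketch.

```python
import random


def fit_one_nearest_neighbor(train):
    """Predict the label of the single closest training point."""
    return lambda x: min(train, key=lambda pair: abs(x - pair[0]))[1]


def error_rate(predict, dataset):
    """Fraction of (x, label) pairs that predict gets wrong."""
    wrong = sum(1 for x, label in dataset if predict(x) != label)
    return wrong / len(dataset)


rng = random.Random(0)


def sample(n):
    """Draw n points; the label is 1 when x > 0.5, with 20% of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:
            label = 1 - label  # label noise that no model can predict
        data.append((x, label))
    return data


train = sample(200)
held_out = sample(200)  # drawn from the same distribution, never used for fitting

model = fit_one_nearest_neighbor(train)
training_error = error_rate(model, train)
held_out_error = error_rate(model, held_out)

print("training error:", training_error)
print("held-out error:", held_out_error)
```

    Each training point is its own nearest neighbor, so the training error is zero even though a fifth of the labels are noise; the held-out error is noticeably higher.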

  • Original article: https://www.cnblogs.com/focusonoutput/p/12208102.html