• The scale and unscale functions in R


    1. The scale function

    R's base package ships with scale, a built-in function for standardizing data. Its documentation reads as follows:

    Usage

    scale(x, center = TRUE, scale = TRUE)

    Arguments

    x: a numeric matrix (like object).

    center: either a logical value or a numeric vector of length equal to the number of columns of x.

    scale: either a logical value or a numeric vector of length equal to the number of columns of x.

    Details

    The value of center determines how column centering is performed. If center is a numeric vector with length equal to the number of columns of x, then each column of x has the corresponding value from center subtracted from it. If center is TRUE then centering is done by subtracting the column means (omitting NAs) of x from their corresponding columns, and if center is FALSE, no centering is done.

    The value of scale determines how column scaling is performed (after centering). If scale is a numeric vector with length equal to the number of columns of x, then each column of x is divided by the corresponding value from scale. If scale is TRUE then scaling is done by dividing the (centered) columns of x by their standard deviations if center is TRUE, and the root mean square otherwise. If scale is FALSE, no scaling is done.

    The root-mean-square for a (possibly centered) column is defined as sqrt(sum(x^2)/(n-1)), where x is a vector of the non-missing values and n is the number of non-missing values. In the case center = TRUE, this is the same as the standard deviation, but in general it is not. (To scale by the standard deviations without centering, use scale(x, center = FALSE, scale = apply(x, 2, sd, na.rm = TRUE)).)
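
    The difference between the root mean square and the standard deviation is easiest to see on a small matrix. The following is a minimal sketch using only base R; the matrix m and its values are illustrative assumptions, not data from the documentation.

```r
# Small numeric matrix; the values are illustrative only.
m <- matrix(c(1, 2, 3, 2, 4, 6), ncol = 2)

# center = FALSE, scale = TRUE: each column is divided by its root mean square,
# computed here by hand as sqrt(sum(x^2) / (n - 1)).
rms <- apply(m, 2, function(col) sqrt(sum(col^2) / (length(col) - 1)))
by_rms <- scale(m, center = FALSE, scale = TRUE)
stopifnot(isTRUE(all.equal(attr(by_rms, "scaled:scale"), rms)))

# To divide by the standard deviation without centering, pass it explicitly.
sds <- apply(m, 2, sd, na.rm = TRUE)
by_sd <- scale(m, center = FALSE, scale = sds)
stopifnot(isTRUE(all.equal(attr(by_sd, "scaled:scale"), sds)))

# For uncentered data the two divisors differ.
stopifnot(!isTRUE(all.equal(rms, sds)))
```

    Here the first column has a root mean square of sqrt(7) but a standard deviation of 1, so the two calls produce different results.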

    Value

    For scale.default, the centered, scaled matrix. The numeric centering and scalings used (if any) are returned as attributes "scaled:center" and "scaled:scale".
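
    The two attributes can be read back with attr. A short sketch, assuming a small illustrative matrix m with column names x and y:

```r
# Illustrative matrix with named columns.
m <- matrix(c(1, 2, 3, 2, 4, 6), ncol = 2,
            dimnames = list(NULL, c("x", "y")))

s <- scale(m)
centers <- attr(s, "scaled:center")  # column means subtracted during centering
scales  <- attr(s, "scaled:scale")   # column standard deviations used as divisors

# When neither centering nor scaling is requested, no attributes are recorded.
plain <- scale(m, center = FALSE, scale = FALSE)
no_center_attr <- is.null(attr(plain, "scaled:center"))
no_scale_attr  <- is.null(attr(plain, "scaled:scale"))
```

    Keeping these attributes around is what makes it possible to apply the same transformation to other data later, or to reverse it.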

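    By default, scale performs z-score standardization: it subtracts the column mean, then divides by the column standard deviation.

    z-score standardization (zero-mean normalization)

    Also known as standard-deviation standardization, this method standardizes data based on the mean and standard deviation of the original data.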

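    The processed data has mean 0 and standard deviation 1 (it follows the standard normal distribution only if the original data were normally distributed). The transformation is:

    z = (x - μ) / σ

    where μ is the mean of all sample data and σ is their standard deviation.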
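
    The transformation can be checked by hand against scale. This is a minimal sketch with an illustrative vector x; note that R's sd uses the sample standard deviation (denominator n - 1), which is also what scale uses when center = TRUE.

```r
# Illustrative data.
x <- c(3, 6, 9, 12)

mu    <- mean(x)  # sample mean
sigma <- sd(x)    # sample standard deviation (n - 1 denominator)

# z-score computed directly from the definition.
z_manual <- (x - mu) / sigma

# scale() returns a one-column matrix; as.vector() drops dims and attributes.
z_scale <- as.vector(scale(x))

stopifnot(isTRUE(all.equal(z_manual, z_scale)))
stopifnot(isTRUE(all.equal(mean(z_manual), 0)))
stopifnot(isTRUE(all.equal(sd(z_manual), 1)))
```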

    2. The unscale function

    The unscale function in the DMwR package restores the original data from an object returned by scale.

    Usage

    unscale(vals, norm.data, col.ids)

    Arguments

    vals: A numeric matrix with the values to un-scale

    norm.data: A numeric and scaled matrix. This should be an object to which the function scale() was applied.

    col.ids: The columns of the vals matrix that are to be un-scaled (defaults to all of them).

    Value

    An object with the same dimension as the parameter vals
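
    If DMwR is not available, the same idea can be sketched in base R by reversing the two steps with the stored attributes. The helper unscale_base below is hypothetical: it is not DMwR's implementation, and it ignores the col.ids argument.

```r
# Hypothetical helper (not part of DMwR): undo scale() using the attributes
# stored on norm.data. Multiplies by the scale first, then adds the center.
unscale_base <- function(vals, norm.data) {
  center <- attr(norm.data, "scaled:center")
  scl    <- attr(norm.data, "scaled:scale")
  vals <- as.matrix(vals)
  if (!is.null(scl))    vals <- sweep(vals, 2, scl, "*")
  if (!is.null(center)) vals <- sweep(vals, 2, center, "+")
  # Drop the scaling attributes so the result is a plain matrix again.
  attr(vals, "scaled:center") <- NULL
  attr(vals, "scaled:scale")  <- NULL
  vals
}

# Round trip on illustrative data.
m <- matrix(c(1, 2, 3, 2, 4, 6, 3, 6, 9), ncol = 3,
            dimnames = list(NULL, c("x", "y", "z")))
scaled   <- scale(m)
restored <- unscale_base(scaled, scaled)
stopifnot(isTRUE(all.equal(restored, m)))
```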

    3. Usage example

    > df <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6), z = c(3, 6, 9))
    > df
      x y z
    1 1 2 3
    2 2 4 6
    3 3 6 9
    > scaledData <- scale(df)
    > scaledData
          x  y  z
    [1,] -1 -1 -1
    [2,]  0  0  0
    [3,]  1  1  1
    attr(,"scaled:center")
    x y z
    2 4 6
    attr(,"scaled:scale")
    x y z
    1 2 3

    > library(DMwR)  # provides unscale
    > unscale(scaledData, scaledData)
         x y z
    [1,] 1 2 3
    [2,] 2 4 6
    [3,] 3 6 9

    > ndf <- data.frame(x = c(1, 2), y = c(2, 4), z = c(3, 6))
    > ndf
      x y z
    1 1 2 3
    2 2 4 6
    > scale(ndf, center = attr(scaledData, "scaled:center"), scale = attr(scaledData, "scaled:scale"))
          x  y  z
    [1,] -1 -1 -1
    [2,]  0  0  0
    attr(,"scaled:center")
    x y z
    2 4 6
    attr(,"scaled:scale")
    x y z
    1 2 3
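
    The last call reuses the centers and scales learned from the first data set instead of recomputing them on the new rows. The sketch below, in base R only, reproduces that result by hand and shows what would happen if the new data were scaled on its own statistics; the names train, new_rows, ctr and scl are assumptions made for this illustration.

```r
train    <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6), z = c(3, 6, 9))
new_rows <- data.frame(x = c(1, 2), y = c(2, 4), z = c(3, 6))

fit <- scale(train)
ctr <- attr(fit, "scaled:center")  # column means of the first data set
scl <- attr(fit, "scaled:scale")   # column standard deviations of the first data set

# Reuse the stored parameters on the new rows.
new_scaled <- scale(new_rows, center = ctr, scale = scl)

# The same computation by hand: subtract the centers, then divide by the scales.
by_hand <- sweep(sweep(as.matrix(new_rows), 2, ctr, "-"), 2, scl, "/")
stopifnot(isTRUE(all.equal(by_hand, new_scaled, check.attributes = FALSE)))

# Scaling the new rows with their own mean and sd gives different values,
# so the two data sets would no longer be on a common scale.
own <- scale(new_rows)
stopifnot(!isTRUE(all.equal(as.vector(own), as.vector(new_scaled))))
```

    Reusing the stored parameters keeps both data sets on the same scale, which is usually what is wanted when the second data set must be comparable to the first.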

  • Original article: https://www.cnblogs.com/guo-xiang/p/7810071.html