• JJ data


    1. Reading the data (the three assignment forms below are equivalent)

    jj = scan("http://www.stat.pitt.edu/stoffer/tsa2/data/jj.dat")  

    jj <- scan("http://www.stat.pitt.edu/stoffer/tsa2/data/jj.dat") 

    scan("http://www.stat.pitt.edu/stoffer/tsa2/data/jj.dat") -> jj 

    > jj<-scan("http://www.stat.pitt.edu/stoffer/tsa2/data/jj.dat")
    Read 84 items
    > jj
     [1]  0.710000  0.630000  0.850000  0.440000  0.610000  0.690000
     [7]  0.920000  0.550000  0.720000  0.770000  0.920000  0.600000
    [13]  0.830000  0.800000  1.000000  0.770000  0.920000  1.000000
    [19]  1.240000  1.000000  1.160000  1.300000  1.450000  1.250000
    [25]  1.260000  1.380000  1.860000  1.560000  1.530000  1.590000
    [31]  1.830000  1.860000  1.530000  2.070000  2.340000  2.250000
    [37]  2.160000  2.430000  2.700000  2.250000  2.790000  3.420000
    [43]  3.690000  3.600000  3.600000  4.320000  4.320000  4.050000
    [49]  4.860000  5.040000  5.040000  4.410000  5.580000  5.850000
    [55]  6.570000  5.310000  6.030000  6.390000  6.930000  5.850000
    [61]  6.930000  7.740000  7.830000  6.120000  7.740000  8.910000
    [67]  8.280000  6.840000  9.540000 10.260000  9.540000  8.729999
    [73] 11.880000 12.060000 12.150000  8.910000 14.040000 12.960000
    [79] 14.850000  9.990000 16.200000 14.670000 16.020000 11.610000

    scan 

      scan returns the data it reads as a vector.
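
    A minimal sketch of that behaviour. It uses scan's text argument with the first four values from the listing above, so no network access is needed; the variable name x is only illustrative.

    ```r
    # Read whitespace-separated numbers from a string instead of a URL.
    x <- scan(text = "0.71 0.63 0.85 0.44", quiet = TRUE)

    is.vector(x)   # TRUE: scan returns a plain vector
    mode(x)        # "numeric" (scan reads doubles by default)
    length(x)      # 4
    dim(x)         # NULL: a vector has no dimensions
    ```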

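    Vectors

      1. Basic element types: numeric, character, logical, and complex.

      2. A vector needs no type declaration; just assign to it.

         Create an empty vector: x<-c();

         Assign values to a vector: x<-c(0,1,2,3);

      3. Vector indices start at 1.

      4. If any element of a vector is a character string, the whole vector becomes character. The type can be checked with mode(), e.g. mode(jj).

      5. When logical values are combined with numbers, they are converted to numeric: TRUE becomes 1 and FALSE becomes 0.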

    > mode(jj)
    [1] "numeric"
    > test<-c(1,2,'a')
    > mode(test)
    [1] "character"
    > test1<-c(1,2,true)
    Error: object 'true' not found
    > test1<-c(1,2,TRUE)
    > mode(test1)
    [1] "numeric"  

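      6. R has no separate scalar types for a single number or a single character string. After X<-2 or X<-'a', R treats X as a vector that happens to contain one value.

      7. Name the elements of a vector: names(x)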

    > demo<-1:3
    > fix(demo)
    > names(demo)<-c('a','b','c','d')
    Error in names(demo) <- c("a", "b", "c", "d") : 
      'names' attribute [4] must be the same length as the vector [3]
    > names(demo)<-c('a','b','c')
    > demo
    a b c 
    1 2 3 
    > names(demo)<-c('d','e','f')
    > demo
    d e f 
    1 2 3 
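
    The vector rules above can be checked with a short sketch; the names v and mixed are only illustrative.

    ```r
    v <- c(10, 20, 30)
    v[1]                 # 10: indices start at 1
    length(2)            # 1: a single number is a vector of length one
    is.vector("a")       # TRUE: a single string is a vector too

    mixed <- c(1, TRUE, FALSE)   # logicals are coerced to numeric
    mixed                        # 1 1 0
    mode(c(1, "a"))              # "character": one string converts everything

    names(v) <- c("a", "b", "c")
    v["b"]               # named elements can be selected by name; value 20
    ```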

    Convert jj to a time series object

    > jj = ts(jj, start=1960, frequency=4)
    > jj
              Qtr1      Qtr2      Qtr3      Qtr4
    1960  0.710000  0.630000  0.850000  0.440000
    1961  0.610000  0.690000  0.920000  0.550000
    1962  0.720000  0.770000  0.920000  0.600000
    1963  0.830000  0.800000  1.000000  0.770000
    1964  0.920000  1.000000  1.240000  1.000000
    1965  1.160000  1.300000  1.450000  1.250000
    1966  1.260000  1.380000  1.860000  1.560000
    1967  1.530000  1.590000  1.830000  1.860000
    1968  1.530000  2.070000  2.340000  2.250000
    1969  2.160000  2.430000  2.700000  2.250000
    1970  2.790000  3.420000  3.690000  3.600000
    1971  3.600000  4.320000  4.320000  4.050000
    1972  4.860000  5.040000  5.040000  4.410000
    1973  5.580000  5.850000  6.570000  5.310000
    1974  6.030000  6.390000  6.930000  5.850000
    1975  6.930000  7.740000  7.830000  6.120000
    1976  7.740000  8.910000  8.280000  6.840000
    1977  9.540000 10.260000  9.540000  8.729999
    1978 11.880000 12.060000 12.150000  8.910000
    1979 14.040000 12.960000 14.850000  9.990000
    1980 16.200000 14.670000 16.020000 11.610000
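
    A small sketch of what start and frequency mean, using only the first eight values of the series (two years of quarterly data). The names x and s are illustrative.

    ```r
    x <- c(0.71, 0.63, 0.85, 0.44, 0.61, 0.69, 0.92, 0.55)
    s <- ts(x, start = 1960, frequency = 4)   # 4 observations per year

    start(s)       # 1960 1  (year, quarter)
    end(s)         # 1961 4
    frequency(s)   # 4
    time(s)[2]     # 1960.25: the second quarter of 1960

    # window() extracts a sub-period by time rather than by index.
    window(s, start = c(1961, 1), end = c(1961, 2))   # 0.61 0.69
    ```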
    


    scan and read.table differ: scan returns a plain vector, which has no dimensions, whereas read.table returns a data frame, which has rows and columns.
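
    The difference can be seen by reading the same text both ways. This is a sketch with inline data (the first eight values of the series); txt, v and df are illustrative names.

    ```r
    txt <- "0.71 0.63 0.85 0.44\n0.61 0.69 0.92 0.55"

    v  <- scan(text = txt, quiet = TRUE)   # one flat numeric vector
    df <- read.table(text = txt)           # a data frame, one row per line

    dim(v)      # NULL
    length(v)   # 8
    class(df)   # "data.frame"
    dim(df)     # 2 4  (rows, columns)
    ```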

    > time(jj)
            Qtr1    Qtr2    Qtr3    Qtr4
    1960 1960.00 1960.25 1960.50 1960.75
    1961 1961.00 1961.25 1961.50 1961.75
    1962 1962.00 1962.25 1962.50 1962.75
    1963 1963.00 1963.25 1963.50 1963.75
    1964 1964.00 1964.25 1964.50 1964.75
    1965 1965.00 1965.25 1965.50 1965.75
    1966 1966.00 1966.25 1966.50 1966.75
    1967 1967.00 1967.25 1967.50 1967.75
    1968 1968.00 1968.25 1968.50 1968.75
    1969 1969.00 1969.25 1969.50 1969.75
    1970 1970.00 1970.25 1970.50 1970.75
    1971 1971.00 1971.25 1971.50 1971.75
    1972 1972.00 1972.25 1972.50 1972.75
    1973 1973.00 1973.25 1973.50 1973.75
    1974 1974.00 1974.25 1974.50 1974.75
    1975 1975.00 1975.25 1975.50 1975.75
    1976 1976.00 1976.25 1976.50 1976.75
    1977 1977.00 1977.25 1977.50 1977.75
    1978 1978.00 1978.25 1978.50 1978.75
    1979 1979.00 1979.25 1979.50 1979.75
    1980 1980.00 1980.25 1980.50 1980.75
    > plot(jj, ylab="Earnings per Share", main="J & J")  
    

    filter  convolution  linear filtering by convolution (equivalent to a moving average)

        recursive   linear filtering by recursion (equivalent to an autoregression, AR)

    > k = c(.5,1,1,1,.5) 
    > (k = k/sum(k))  
    [1] 0.125 0.250 0.250 0.250 0.125
    > fjj = filter(jj, sides=2, k)
    > fjj
             Qtr1     Qtr2     Qtr3     Qtr4
    1960       NA       NA  0.64500  0.64000
    1961  0.65625  0.67875  0.70625  0.73000
    1962  0.74000  0.74625  0.76625  0.78375
    1963  0.79750  0.82875  0.86125  0.89750
    1964  0.95250  1.01125  1.07000  1.13750
    1965  1.20125  1.25875  1.30250  1.32500
    1966  1.38625  1.47625  1.54875  1.60875
    1967  1.63125  1.66500  1.70250  1.76250
    1968  1.88625  1.99875  2.12625  2.25000
    1969  2.34000  2.38500  2.46375  2.66625
    1970  2.91375  3.20625  3.47625  3.69000
    1971  3.88125  4.01625  4.23000  4.47750
    1972  4.65750  4.79250  4.92750  5.11875
    1973  5.41125  5.71500  5.88375  6.00750
    1974  6.12000  6.23250  6.41250  6.69375
    1975  6.97500  7.12125  7.25625  7.50375
    1976  7.70625  7.85250  8.16750  8.56125
    1977  8.88750  9.28125  9.81000 10.32750
    1978 10.87875 11.22750 11.52000 11.90250
    1979 12.35250 12.82500 13.23000 13.71375
    1980 14.07375 14.42250       NA       NA
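
    To make the two methods concrete, here is a sketch on tiny inputs. The first part reproduces the 1960 Qtr3 value of the table above by hand; the second uses a made-up input of ones and an illustrative coefficient of 0.5. Writing stats::filter avoids picking up a same-named function from another attached package.

    ```r
    x <- c(0.71, 0.63, 0.85, 0.44, 0.61)   # first five values of jj
    k <- c(.5, 1, 1, 1, .5) / 4            # weights sum to 1

    # Centred moving average: only the middle position has all five neighbours.
    ma <- stats::filter(x, k, sides = 2)
    ma[3]        # 0.645, the 1960 Qtr3 value in the table above
    sum(k * x)   # the same number computed by hand (k is symmetric)

    # Recursive filter: y[t] = x[t] + 0.5 * y[t-1], an AR(1)-style recursion.
    ar <- stats::filter(c(1, 1, 1, 1), 0.5, method = "recursive")
    as.numeric(ar)   # 1.000 1.500 1.750 1.875
    ```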
    

    lowess  smoothing by locally weighted polynomial regression

    > plot(jj)
    > lines(fjj, col="red")
    > lines(lowess(jj), col="blue", lty="dashed")
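
    A sketch of how the smoother span f changes the lowess curve. It uses R's built-in JohnsonJohnson dataset, assumed here to hold the same series as jj.dat, and f = 1/3 is only an illustrative choice.

    ```r
    jj <- JohnsonJohnson   # built-in quarterly series, assumed to match jj.dat

    smooth_default <- lowess(jj)            # default span f = 2/3
    smooth_local   <- lowess(jj, f = 1/3)   # smaller span follows the data more closely

    str(smooth_default)   # a list with components x (time) and y (smoothed value)

    plot(jj)
    lines(smooth_default, col = "blue", lty = "dashed")
    lines(smooth_local,   col = "darkgreen")
    ```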
    

      

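    diff  computes differences (differencing subtracts each value from the one after it, giving a series of increments)

         variance (a measure of how much a variable fluctuates)

    log  logarithm

    First, we take the log of every value in jj.

    Second, we difference the logged values: the second value minus the first, the third minus the second, the fourth minus the third, and so on.

    If the series has n values before differencing, the result has n-1 values.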
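
    The two steps can be traced on the first four values of jj; x, lx and d are illustrative names.

    ```r
    x  <- c(0.71, 0.63, 0.85, 0.44)   # first four values of jj
    lx <- log(x)                      # step 1: take logs
    d  <- diff(lx)                    # step 2: successive differences

    length(x)   # 4
    length(d)   # 3, one fewer than the input
    d[1]        # log(0.63) - log(0.71), about -0.1195

    # A log difference is the log of the ratio of consecutive values.
    all.equal(d, log(x[-1] / x[-length(x)]))   # TRUE
    ```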

    > dljj = diff(log(jj))
    > dljj
                 Qtr1         Qtr2         Qtr3         Qtr4
    1960              -0.119545151  0.299516530 -0.658461623
    1961  0.326684230  0.123232640  0.287682072 -0.514455392
    1962  0.269332934  0.067139303  0.177983155 -0.427444015
    1963  0.324496046 -0.036813973  0.223143551 -0.261364764
    1964  0.177983155  0.083381609  0.215111380 -0.215111380
    1965  0.148420005  0.113944259  0.109199292 -0.148420005
    1966  0.007968170  0.090971778  0.298492989 -0.175890666
    1967 -0.019418086  0.038466281  0.140581951  0.016260521
    1968 -0.195308752  0.302280872  0.122602322 -0.039220713
    1969 -0.040821995  0.117783036  0.105360516 -0.182321557
    1970  0.215111380  0.203598955  0.075985907 -0.024692613
    1971  0.000000000  0.182321557  0.000000000 -0.064538521
    1972  0.182321557  0.036367644  0.000000000 -0.133531393
    1973  0.235314087  0.047252885  0.116072171 -0.212921997
    1974  0.127155175  0.057987258  0.081125545 -0.169418152
    1975  0.169418152  0.110541874  0.011560822 -0.246400413
    1976  0.234839591  0.140772554 -0.073331273 -0.191055237
    1977  0.332705754  0.072759354 -0.072759354 -0.088728230
    1978  0.308091059  0.015037877  0.007434978 -0.310154928
    1979  0.454736157 -0.080042708  0.136132174 -0.396415273
    1980  0.483426650 -0.099206650  0.088033349 -0.321971146
    > plot(dljj) 
    > shapiro.test(dljj)
    
    	Shapiro-Wilk normality test
    
    data:  dljj
    W = 0.9725, p-value = 0.07211
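
    The printed p-value of 0.07211 is above 0.05, so at the 5% level the test does not reject normality of dljj. The sketch below shows how to read the statistic and p-value from the result programmatically; it uses the built-in JohnsonJohnson dataset, assumed to match jj.dat, and st is an illustrative name.

    ```r
    dljj <- diff(log(JohnsonJohnson))   # built-in series, assumed to match jj.dat

    st <- shapiro.test(dljj)
    st$statistic   # the W statistic
    st$p.value     # the p-value

    # The 5% level is a conventional choice, not something the test dictates.
    st$p.value < 0.05   # TRUE would mean normality is rejected at that level
    ```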
    

    qqnorm() draws a normal Q-Q plot; qqline() adds a reference line through the first and third quartiles.

    hist() draws a histogram.

    http://www.stathome.cn/manual/s/10.html 

    > par(mfrow=c(2,1))
    > hist(dljj, prob=TRUE, 12)
    > lines(density(dljj)) 
    > qqnorm(dljj)
    > qqline(dljj) 
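
    A variant of the same plots, sketched with named arguments and with the plot layout restored afterwards. It uses the built-in JohnsonJohnson dataset, assumed to match jj.dat; old and h are illustrative names.

    ```r
    dljj <- diff(log(JohnsonJohnson))   # built-in series, assumed to match jj.dat

    old <- par(mfrow = c(2, 1))         # two plots stacked; keep the old settings
    h <- hist(dljj, breaks = 12, probability = TRUE)   # density scale, not counts
    lines(density(dljj))                # kernel density estimate on the same scale
    qqnorm(dljj)
    qqline(dljj)
    par(old)                            # restore the previous layout

    sum(h$density * diff(h$breaks))     # bar areas add up to 1
    ```

    Drawing the histogram on the density scale is what lets the density() curve be overlaid directly.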
    

      

  • Original post: https://www.cnblogs.com/mysqlinternal/p/3108552.html