• R Data Frame


    https://www.datamentor.io/r-programming/data-frame/

    Check if a variable is a data frame or not

    We can check if a variable is a data frame or not using the class() function.

    > x
      SN Age Name
    1  1  21 John
    2  2  15 Dora
    > typeof(x)    # data frame is a special case of  list
    [1] "list"
    > class(x)
    [1] "data.frame"
    

    In this example, x can be considered a list of 3 components, each of which is a vector of length 2. Some useful functions for learning more about a data frame are given below.


    Functions of data frame

    > names(x)
    [1] "SN"   "Age"  "Name"
    > ncol(x)
    [1] 3
    > nrow(x)
    [1] 2
    > length(x)    # returns length of the list, same as ncol()
    [1] 3
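
    As a small sketch of the checks above, the base R functions is.data.frame(), inherits() and dim() can complement class() and length(). The data frame x is rebuilt here only so that the snippet runs on its own.

    ```r
    # Rebuild the example data frame so the snippet is self-contained
    x <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"),
                    stringsAsFactors = FALSE)

    is.data.frame(x)           # TRUE: direct test for a data frame
    inherits(x, "data.frame")  # TRUE: class-based test, also true for subclasses
    is.list(x)                 # TRUE: a data frame is a list of columns
    dim(x)                     # number of rows and columns in one call
    sapply(x, length)          # every column has the same length
    ```

    Testing with is.data.frame() or inherits() tends to be more robust than comparing class(x) to a string, because an object can carry more than one class.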
    

    How to create a Data Frame in R?

    We can create a data frame using the data.frame() function.

    For example, the above shown data frame can be created as follows.

    > x <- data.frame("SN" = 1:2, "Age" = c(21,15), "Name" = c("John","Dora"))
    > str(x)    # structure of x
    'data.frame':   2 obs. of  3 variables:
    $ SN  : int  1 2
    $ Age : num  21 15
    $ Name: Factor w/ 2 levels "Dora","John": 2 1
    

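    Notice above that the third column, Name, is of type factor instead of a character vector.

    In R versions before 4.0.0, the data.frame() function converts character vectors into factors by default. Since R 4.0.0 the default is stringsAsFactors = FALSE, so character columns are left unchanged.

    To suppress the conversion in older versions, we can pass the argument stringsAsFactors = FALSE.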

    > x <- data.frame("SN" = 1:2, "Age" = c(21,15), "Name" = c("John", "Dora"), stringsAsFactors = FALSE)
    > str(x)    # now the third column is a character vector
    'data.frame':   2 obs. of  3 variables:
    $ SN  : int  1 2
    $ Age : num  21 15
    $ Name: chr  "John" "Dora"
    

    Many data input functions in R, such as read.table(), read.csv(), read.delim() and read.fwf(), also read data into a data frame.
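
    To illustrate, here is a minimal sketch that reads CSV data with read.csv(). The CSV content is made up for this example and is supplied inline through the text argument, so no file is needed; the names csv_text and people are hypothetical.

    ```r
    # Hypothetical CSV content, passed inline instead of a file path
    csv_text <- paste("SN,Age,Name", "1,21,John", "2,15,Dora", sep = "\n")

    # stringsAsFactors is set explicitly so the result does not depend on the R version
    people <- read.csv(text = csv_text, stringsAsFactors = FALSE)

    class(people)  # the result is a data frame
    str(people)    # numeric columns are detected, Name stays character
    ```

    With a real file, the first argument would be the file path instead of text.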


    How to access Components of a Data Frame?

    Components of a data frame can be accessed either like a list or like a matrix.


    Accessing like a list

    We can use either the [, [[ or $ operator to access columns of a data frame.

    > x["Name"]
      Name
    1 John
    2 Dora
    > x$Name
    [1] "John" "Dora"
    > x[["Name"]]
    [1] "John" "Dora"
    > x[[3]]
    [1] "John" "Dora"
    

    Accessing with [[ or $ gives the same result. Indexing with [ differs: it returns a data frame, whereas the other two reduce the result to a vector.
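
    The difference can be checked directly by looking at the class of each result. This is a small sketch; the variable names by_single, by_double and by_dollar are chosen only for illustration.

    ```r
    x <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"),
                    stringsAsFactors = FALSE)

    by_single <- x["Name"]    # [ keeps the data frame structure
    by_double <- x[["Name"]]  # [[ extracts the column itself
    by_dollar <- x$Name       # $ also extracts the column itself

    class(by_single)                 # "data.frame"
    class(by_double)                 # "character"
    identical(by_double, by_dollar)  # TRUE

    # [ can select several columns at once; [[ and $ select exactly one
    x[c("SN", "Name")]
    ```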


    Accessing like a matrix

    Data frames can be accessed like a matrix by providing indices for the rows and columns.

    To illustrate this, we use datasets already available in R. Datasets that are available can be listed with the command library(help = "datasets").

    We will use the trees dataset, which contains Girth, Height and Volume for Black Cherry Trees.

    A data frame can be examined using functions like str() and head().

    > str(trees)
    'data.frame':   31 obs. of 3 variables:
    $ Girth : num  8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
    $ Height: num  70 65 63 72 81 83 66 75 80 75 ...
    $ Volume: num  10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...
    > head(trees,n=3)
      Girth Height Volume
    1   8.3     70   10.3
    2   8.6     65   10.3
    3   8.8     63   10.2
    

    We can see that trees is a data frame with 31 rows and 3 columns. We also display the first 3 rows of the data frame.

    Now we proceed to access the data frame like a matrix.

    > trees[2:3,]    # select 2nd and 3rd row
      Girth Height Volume
    2   8.6     65   10.3
    3   8.8     63   10.2
    > trees[trees$Height > 82,]    # selects rows with Height greater than 82
       Girth Height Volume
    6   10.8     83   19.7
    17  12.9     85   33.8
    18  13.3     86   27.4
    31  20.6     87   77.0
    > trees[10:12,2]
    [1] 75 79 76
    

    We can see in the last case that the returned type is a vector since we extracted data from a single column.

    This behavior can be avoided by passing the argument drop=FALSE as follows.

    > trees[10:12,2, drop = FALSE]
       Height
    10     75
    11     79
    12     76
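
    Row conditions and column names can also be combined in a single matrix-style index. The sketch below uses the built-in trees dataset; the variable name tall is hypothetical.

    ```r
    # 'trees' ships with R in the datasets package

    # Filter rows by a condition and pick columns by name in one step
    tall <- trees[trees$Height > 82, c("Girth", "Volume")]
    nrow(tall)   # 4, matching the Height > 82 rows shown earlier

    # A single column is dropped to a plain vector unless drop = FALSE is given
    class(trees[10:12, "Height"])                # "numeric"
    class(trees[10:12, "Height", drop = FALSE])  # "data.frame"
    ```

    Selecting columns by name rather than by position keeps the code valid if the column order changes.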
    

    How to modify a Data Frame in R?

    Data frames can be modified through reassignment, in the same way as matrices.

    > x
      SN Age Name
    1  1  21 John
    2  2  15 Dora
    > x[1,"Age"] <- 20; x
      SN Age Name
    1  1  20 John
    2  2  15 Dora
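
    Reassignment is not limited to a single cell. As a sketch, the snippet below modifies a copy of the example data (named y here purely for illustration) by position, by condition and by whole column.

    ```r
    y <- data.frame(SN = 1:2, Age = c(21, 15), Name = c("John", "Dora"),
                    stringsAsFactors = FALSE)

    y[1, "Age"] <- 20               # one cell, addressed by row and column
    y$Age[y$Name == "Dora"] <- 16   # all rows matching a condition
    y$Age <- y$Age + 1              # every value in a column

    y$Age   # 21 17
    ```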
    

    Adding Components

    Rows can be added to a data frame using the rbind() function.

    > rbind(x,list(1,16,"Paul"))
      SN Age Name
    1  1  20 John
    2  2  15 Dora
    3  1  16 Paul
    

    Similarly, we can add columns using cbind().

    > cbind(x,State=c("NY","FL"))
      SN Age Name State
    1  1  20 John    NY
    2  2  15 Dora    FL
    

    Since data frames are implemented as lists, we can also add new columns through simple list-like assignments.

    > x
      SN Age Name
    1  1  20 John
    2  2  15 Dora
    > x$State <- c("NY","FL"); x
      SN Age Name State
    1  1  20 John    NY
    2  2  15 Dora    FL
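
    Note that rbind() and cbind() return a new data frame and leave their input untouched, so the result has to be assigned to keep it. The sketch below shows this, using a one-row data frame instead of a list for the new row so that column names and types are explicit; new_row and y are hypothetical names.

    ```r
    x <- data.frame(SN = 1:2, Age = c(20, 15), Name = c("John", "Dora"),
                    stringsAsFactors = FALSE)

    # A new row as a one-row data frame with matching column names
    new_row <- data.frame(SN = 3L, Age = 16, Name = "Paul",
                          stringsAsFactors = FALSE)

    y <- rbind(x, new_row)   # assign the result; x itself is not changed
    nrow(x)                  # still 2
    nrow(y)                  # 3

    # List-like assignment adds a column in place
    y$State <- c("NY", "FL", "TX")
    names(y)
    ```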
    

    Deleting Components

    Data frame columns can be deleted by assigning NULL to them.

    > x$State <- NULL
    > x
      SN Age Name
    1  1  20 John
    2  2  15 Dora
    

    Similarly, rows can be deleted through reassignments.

    > x <- x[-1,]
    > x
      SN Age Name
    2  2  15 Dora
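
    Columns and rows can also be removed by name or by condition rather than by position. The following is a minimal sketch on a slightly larger, made-up version of the example data; the result names no_state, adults and last_only are hypothetical.

    ```r
    x <- data.frame(SN = 1:3, Age = c(20, 15, 16),
                    Name = c("John", "Dora", "Paul"),
                    State = c("NY", "FL", "TX"),
                    stringsAsFactors = FALSE)

    # Drop columns by name; drop = FALSE keeps a data frame even if one column remains
    no_state <- x[, !(names(x) %in% c("State")), drop = FALSE]

    # Keep only the rows that satisfy a condition
    adults <- x[x$Age >= 18, ]

    # Negative indices remove several rows at once
    last_only <- x[-(1:2), ]

    names(no_state)   # "SN" "Age" "Name"
    adults$Name       # "John"
    last_only$Name    # "Paul"
    ```

    As with the earlier examples, each expression returns a new data frame, so x stays unchanged until a result is assigned back to it.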
  • Original article: https://www.cnblogs.com/kungfupanda/p/11312830.html