• Basic string-handling functions in R


    1. nchar(x): returns the number of characters in the string x, or in each element when x is a character vector.

    > nchar("I love you!")
    [1] 11
    > nchar(c("I", "love", "you", "!"))
    [1] 1 4 3 1
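
    nchar() is easy to confuse with length(). The following is a minimal sketch of the difference; the variable names are only illustrative, and the byte count assumes a UTF-8 encoded string.

    ```r
    x <- c("I", "love", "you", "!")

    n_elements <- length(x)      # number of elements in the vector: 4
    n_chars    <- nchar(x)       # characters in each element: 1 4 3 1
    n_total    <- sum(nchar(x))  # total number of characters: 9

    # type = "chars" counts characters, type = "bytes" counts storage bytes.
    han <- "\u4f60\u597d"                     # two Chinese characters
    han_chars <- nchar(han, type = "chars")   # 2
    han_bytes <- nchar(han, type = "bytes")   # 6 when the string is UTF-8
    ```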

     

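    2. grep(pattern, x): returns the indices of the elements of the character vector x that match pattern (not the position of the match inside a string). If no element matches, it returns integer(0).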

    > grep("y", "I love you!")
    [1] 1
    
    > a <- c("I", "love", "you", "!")
    > grep("y", a)
    [1] 3
    
    > grep("k", a)
    integer(0)
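
    Since grep() returns element indices, it is often paired with value = TRUE or with grepl(). A short sketch using the same vector a as above (the other variable names are only for illustration):

    ```r
    a <- c("I", "love", "you", "!")

    idx  <- grep("o", a)                # indices of matching elements: 2 3
    vals <- grep("o", a, value = TRUE)  # the matching elements: "love" "you"
    hits <- grepl("o", a)               # logical vector: FALSE TRUE TRUE FALSE

    # pattern is a regular expression by default, so "." matches any character.
    all_idx <- grep(".", a)                # 1 2 3 4
    # fixed = TRUE treats the pattern as a literal string instead.
    dot_idx <- grep(".", a, fixed = TRUE)  # integer(0): no element contains a dot
    ```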

     

    3. paste(..., sep=" "): concatenates strings element by element, using sep as the separator (the default is a single space).

    > paste("I", "love", "you", "!")
    [1] "I love you !"
    
    > a <- c("I", "love", "you", "!")
    > a
    [1] "I"    "love" "you"  "!"   
    
    > paste(a, 1:4)
    [1] "I 1"    "love 2" "you 3"  "! 4"   
    > paste(a, 1:4, sep="-")
    [1] "I-1"    "love-2" "you-3"  "!-4"   
    
    > paste("Today is","Sat Jan 11 2020")
    [1] "Today is Sat Jan 11 2020"
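
    sep only joins the arguments element by element. To merge the resulting vector into a single string, paste() also takes a collapse argument. A minimal sketch:

    ```r
    a <- c("I", "love", "you", "!")

    # collapse joins the elements of the result into one string.
    one <- paste(a, collapse = " ")                    # "I love you !"

    # sep is applied first (within each element), then collapse (between elements).
    both <- paste(a, 1:4, sep = "-", collapse = ", ")  # "I-1, love-2, you-3, !-4"
    ```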

     

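    4. paste0(...): concatenates strings with no separator, like paste() with an empty-string separator. paste0() has no sep argument, so anything passed as sep is concatenated like any other string.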

    > paste0("I", "love", "you", "!")
    [1] "Iloveyou!"
     
    > a <- c("I", "love", "you", "!")
    > a
    [1] "I"    "love" "you"  "!"   
    > paste0(a, 1:4)
    [1] "I1"    "love2" "you3"  "!4"   
    > # sep is not an argument of paste0(); "--" is simply appended as one more string
    > paste0(a, 1:4, sep="--")
    [1] "I1--"    "love2--" "you3--"  "!4--"   
     
    > b <- c("甲","乙","丙","丁","戊","己","庚","辛","壬","癸")
    > d <- c("子","丑","寅","卯","辰","巳","午","未","申","酉","戌","亥")
    > paste0(b, d)
     [1] "甲子" "乙丑" "丙寅" "丁卯" "戊辰" "己巳" "庚午"
     [8] "辛未" "壬申" "癸酉" "甲戌" "乙亥"
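
    When the arguments have different lengths, paste0() recycles the shorter one, which is why 10 and 12 elements give 12 results in the example above. A small sketch of the recycling rule, with made-up inputs:

    ```r
    # A length-1 prefix is recycled across all elements.
    ids <- paste0("id_", 1:3)           # "id_1" "id_2" "id_3"

    # A length-2 vector is recycled to match the length-4 vector.
    rec <- paste0(c("a", "b"), 1:4)     # "a1" "b2" "a3" "b4"

    # collapse works the same way as in paste().
    joined <- paste0(c("a", "b"), 1:4, collapse = "")  # "a1b2a3b4"
    ```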

     

    5. sprintf(fmt, ...): builds a string by inserting the remaining arguments into the format string fmt.

    > a <- 11
    > sprintf("The square of %d is %d", a, a^2)
    [1] "The square of 11 is 121"
    
    > sprintf("The square root of %d is %d", a^2, (a^2)^0.5)
    [1] "The square root of 121 is 11"

    This is similar to string formatting in Python.

    Example:

    a = 11
    
    print('The square of %d is %d' % (a, a**2))
    print('The square root of {} is {}'.format(a**2, a))
    
    The square of 11 is 121
    The square root of 121 is 11
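
    Back in R, a few other common format specifiers are sketched below. Note that %d accepts a double only when its value is a whole number, which is why the square-root example above works; the tryCatch() wrapper is just one way to show the failure case.

    ```r
    s1 <- sprintf("%s has %d characters", "love", nchar("love"))  # "love has 4 characters"
    s2 <- sprintf("%.2f", pi)                # two decimal places: "3.14"
    s3 <- sprintf("%05d", 42L)               # zero-padded to width 5: "00042"
    s4 <- sprintf("%5s|%-5s|", "ab", "cd")   # right- and left-aligned: "   ab|cd   |"

    # sprintf() is vectorized over its arguments.
    s5 <- sprintf("Item %d: %s", 1:3, c("a", "b", "c"))

    # %d with a non-integer value raises an error; use %f or %g instead.
    bad <- tryCatch(sprintf("%d", 2.5), error = function(e) "error")
    ```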

     

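    6. substr(x, start, stop): extracts the substring of x from position start to position stop (both inclusive, counting from 1).

    This is comparable to MID() in Excel and to slicing in Python.

    Example: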

    > a <- paste0(letters[1:7], collapse="")
    > a
    [1] "abcdefg"
    
    > substr(a, 1, 3)
    [1] "abc"
    
    > substr(a, 1, 3) <- "aaa"
    > a
    [1] "aaadefg"
    
    > b <- c("1a","2bb", "3ccc", "4dddd" )
    > substr(b, 1, 2)
    [1] "1a" "2b" "3c" "4d"
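
    Two common variations are taking the last few characters and extracting several substrings at once. A brief sketch using substr() together with the related base function substring():

    ```r
    a <- "abcdefg"
    n <- nchar(a)

    last3 <- substr(a, n - 2, n)        # last three characters: "efg"
    tail5 <- substring(a, 5)            # from position 5 to the end: "efg"
    over  <- substr(a, 5, 100)          # stop past the end is truncated: "efg"

    # substring() is vectorized over start and stop.
    windows <- substring(a, 1:3, 3:5)   # "abc" "bcd" "cde"
    ```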

     

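    7. strsplit(x, split): splits each element of x into substrings at every match of split and returns the pieces as a list, with one list element per element of x.

    This is comparable to s.split(split) in Python.

    Example: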

    > a <-paste(letters[1:7], collapse="_")
    > a
    [1] "a_b_c_d_e_f_g"
    > strsplit(a, "_")
    [[1]]
    [1] "a" "b" "c" "d" "e" "f" "g"
     
    > b <- paste0(letters[1:7], 1:7, collapse="_")
    > b
    [1] "a1_b2_c3_d4_e5_f6_g7"
    > strsplit(b, "_")
    [[1]]
    [1] "a1" "b2" "c3" "d4" "e5" "f6" "g7"
     
    > d <- paste0(c(2020, 01, 10), collapse="/")
    > d
    [1] "2020/1/10"
    > strsplit(d, "/")
    [[1]]
    [1] "2020" "1"    "10"  
    
    > # Convert the list to a character vector
    > unlist(strsplit(d, "/"))
    [1] "2020" "1"    "10" 
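
    The list return value matters once x has more than one element, and split is a regular expression unless fixed = TRUE is given. A minimal sketch with made-up dates:

    ```r
    dates <- c("2020/1/10", "2021/12/31")

    parts <- strsplit(dates, "/")   # a list with one character vector per date
    # Pick the first piece (the year) out of every list element.
    years <- vapply(parts, function(p) p[1], character(1))   # "2020" "2021"

    # "." is a regex metacharacter, so split on a literal dot with fixed = TRUE.
    dotted <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]      # "a" "b" "c"

    # An empty split string breaks the input into single characters.
    chars <- strsplit("abc", "")[[1]]                        # "a" "b" "c"
    ```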

     

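    8. regexpr(pattern, x): searches the string x for pattern and returns the starting position of the first match, with the match length as an attribute. It returns -1 if there is no match.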

    > a <- "I love you!"
    > regexpr("y", a)
    [1] 8
    attr(,"match.length")
    [1] 1
    attr(,"index.type")
    [1] "chars"
    attr(,"useBytes")
    [1] TRUE

    The first "y" starts at the eighth character of a, and the match has length 1.
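
    The position and length can be turned back into the matched text with regmatches(). A short sketch; the patterns here are only examples:

    ```r
    a <- "I love you!"

    m <- regexpr("[a-z]+", a)          # first run of lowercase letters
    pos   <- as.integer(m)             # start position: 3
    len   <- attr(m, "match.length")   # match length: 4
    first <- regmatches(a, m)          # the matched text: "love"

    # No match gives -1, and regmatches() then returns an empty vector.
    none <- regexpr("z", a)
    no_pos  <- as.integer(none)        # -1
    no_text <- regmatches(a, none)     # character(0)
    ```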

     

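    9. gregexpr(pattern, x): finds the starting positions and lengths of all substrings of x that match pattern, returned as a list with one element per element of x.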

    > a <- "I love you!"
    > b <- "You love me!"
    > paste(a, b)
    [1] "I love you! You love me!"
    > gregexpr("v", paste(a, b))
    [[1]]
    [1]  5 19
    attr(,"match.length")
    [1] 1 1
    attr(,"index.type")
    [1] "chars"
    attr(,"useBytes")
    [1] TRUE

    "v" occurs twice in paste(a, b), at positions 5 and 19.

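    As with regexpr(), the result of gregexpr() can be passed to regmatches() to pull out every match. A minimal sketch on the same combined string, with illustrative variable names:

    ```r
    s <- "I love you! You love me!"

    m <- gregexpr("love", s)
    starts <- as.integer(m[[1]])     # start of each match: 3 17
    found  <- regmatches(s, m)[[1]]  # "love" "love"

    # Extract every word (run of letters) in the string.
    words <- regmatches(s, gregexpr("[A-Za-z]+", s))[[1]]

    # Count matches; a result of -1 means no match, so test for positive positions.
    n_love <- sum(gregexpr("love", s)[[1]] > 0)   # 2
    n_z    <- sum(gregexpr("z", s)[[1]] > 0)      # 0
    ```
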
     

    Recommended reading:

    http://blog.sina.com.cn/s/blog_69ffa1f90101sie9.html

    https://www.cnblogs.com/awishfullyway/p/6601539.html

    https://blog.csdn.net/yj1556492839/article/details/82725315

  • Original article: https://www.cnblogs.com/shanger/p/12180615.html