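Lua string handling
Matching patterns (pattern)
- . any single character
- %a any letter
- %c any control character
- %d any digit
- %g any printable character except space
- %l any lowercase letter
- %p any punctuation character
- %s any whitespace character
- %u any uppercase letter
- %w any alphanumeric character (letter or digit)
- %x any hexadecimal digit
- %x (where x is any non-alphanumeric character) represents the character x itself. For example, %% is a percent sign, %. is a dot, and %/ is a slash.
- [set] represents the union of all characters in set; any one of them matches. A range is written with -, e.g. [0-9] is the ten digits 0 through 9.
- [^set] represents the complement of set.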
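A minimal sketch of how these classes behave with string.find and string.match. The sample strings are made up for illustration, and the last line relies on a rule from the Lua reference manual that the list above does not mention: an uppercase class letter (%S, %D, ...) is the complement of the lowercase class.

```lua
-- %d matches a single digit; find returns the start and end of the first match
print(string.find("order 66", "%d"))     --> 7    7
-- "%." is a literal dot, while "." alone matches any character
print(string.find("a.b", "%."))          --> 2    2
print(string.find("a.b", "."))           --> 1    1
-- a set and its complement
print(string.match("key=42", "[0-9]"))   --> 4
print(string.match("key=42", "[^a-z]"))  --> =
-- %S is any character that is not whitespace
print(string.match("  x", "%S"))         --> x
```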
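Pattern items (a character class combined with a special symbol):
- A single character class followed by '*' matches zero or more characters of the class, always taking the longest possible sequence.
- A single character class followed by '-' matches zero or more characters of the class, always taking the shortest possible sequence.
- A single character class followed by '+' matches one or more characters of the class, always taking the longest possible sequence.
- A single character class followed by '?' matches zero or one character of the class.
- %n, for n from 1 to 9, matches a substring equal to the n-th captured string.
- %bxy, where x and y are two distinct characters, matches a string that starts with x, ends with y, and in which x and y are balanced. For example, %b() matches a parenthesized string, including the outer parentheses.
- %f[set], a frontier pattern. It matches an empty string at any position where the next character belongs to set and the previous character does not.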
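The items above can be compared side by side in a short sketch. The input strings are invented for illustration; the extra parentheses around the gsub call only drop its second return value (the match count).

```lua
local s = "<b>one</b><b>two</b>"
-- '*' is greedy: the capture runs up to the last "</b>"
print(s:match("<b>(.*)</b>"))                --> one</b><b>two
-- '-' is lazy: the capture stops at the first "</b>"
print(s:match("<b>(.-)</b>"))                --> one
-- '?' makes the minus sign optional, '+' requires at least one digit
print(("x = -42"):match("%-?%d+"))           --> -42
-- %1 must repeat the first capture, so this finds a doubled word
print(("it is is fine"):match("(%a+) %1"))   --> is
-- %b() matches a balanced pair of parentheses, nested ones included
print(("f(a(b)c) + 1"):match("%b()"))        --> (a(b)c)
-- frontiers keep the match on word boundaries: "concat" is left alone
print((("cat concat"):gsub("%f[%w]cat%f[%W]", "dog")))  --> dog concat
```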
string.byte(s[, i[, j]])
- Returns the numeric codes of the characters s[i], s[i+1], ..., s[j].
- i defaults to 1; j defaults to i.
Example:
```
str = "abcdef"
print(str:byte()) --> 97, the code of 'a'
print(str:byte(2)) --> 98, the code of 'b'
print(str:byte(2, 4)) --> 98  99  100
```
string.char(...)
- Takes integers as character codes and returns a string made of the corresponding characters.
Example:
```
print(string.char(97, 98, 99)) --> abc
```
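string.dump(function [, strip])
- Returns a string containing a binary representation of the given function. The string can be loaded back with load (Lua 5.2 and later; Lua 5.1 uses loadstring instead and has no strip parameter).
- If strip is true, the function's debug information (local variable names, line numbers, etc.) is left out.
Example:
```
local function print_hello()
    print("hello world")
end

local f = string.dump(print_hello, true)
print(f) -- a binary chunk, not human-readable

local a = load(f) -- use loadstring(f) on Lua 5.1
print(type(a)) --> function
a() --> hello world
```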
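string.find(s, pattern [, init[, plain]])
- Looks for the first match of pattern in s and returns the start and end indices of the match, followed by any captures. Returns nil if there is no match.
- init specifies where the search starts and defaults to 1. A negative value counts from the end of the string; the search still runs forward to the end of the string.
- If plain is true, pattern matching is turned off and pattern is searched for as a plain substring.
Example: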
```
str = "hello, world"
print(string.find(str, "llo")) --> 3    5
print(string.find(str, "world", -5)) --> 8  12
print(string.find(str, "world", -4)) --> nil
```
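The plain flag and the returned captures are not exercised by the example above. A small sketch with a made-up sample string:

```lua
local s = "price: 3.50 (USD)"
-- as a pattern, "." matches any character, so the match is at position 1
print(string.find(s, "."))               --> 1    1
-- with plain = true the dot is literal; init must be passed to reach plain
print(string.find(s, ".", 1, true))      --> 9    9
-- as a pattern, "(USD)" is a capture around USD and skips the parentheses
print(string.find(s, "(USD)"))           --> 14    16    USD
-- a plain search matches the parentheses literally
print(string.find(s, "(USD)", 1, true))  --> 13    17
-- captures are returned after the two indices
print(string.find(s, "(%d+)%.(%d+)"))    --> 8    11    3    50
```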
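string.format(formatstring, ...)
- Works like sprintf in C, except that *, h, L, l, n, and p are not supported (as of Lua 5.3; Lua 5.4 adds %p).
- Adds %q, which writes a string between double quotes, escaping characters as needed so that the result can be safely read back by the Lua interpreter.
Example:
```
str = 'a string with "quotes" and \nnew line'
print(string.format("%q", str))
--> "a string with \"quotes\" and \
--> new line"
```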
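For the sprintf-style conversions, a few common cases as a quick sketch (the values are arbitrary):

```lua
print(string.format("%d items", 3))                --> 3 items
-- width 5, two decimals: the result is padded with one leading space
print("[" .. string.format("%5.2f", 3.14159) .. "]")  --> [ 3.14]
print(string.format("%05d", 42))                   --> 00042
print(string.format("%x / %X", 255, 255))          --> ff / FF
-- "-" left-aligns within the field width
print(string.format("%-6s|%6s|", "ab", "cd"))      --> ab    |    cd|
-- "%%" produces a literal percent sign
print(string.format("%s scored %d%%", "sam", 95))  --> sam scored 95%
```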
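string.gmatch(s, pattern)
- Returns an iterator function. Each call matches pattern against s, continuing from where the previous match ended, and returns the captures. If pattern has no captures, the whole match is returned.
Example:
```
s = "hello world from Lua"
g = string.gmatch(s, "%a+")
print(type(g)) --> function; g is a function
print(g()) --> hello; each call performs one match
print(g()) --> world; the value of the second match
```
```
s = "from=world, to=Lua"
for k, v in string.gmatch(s, "(%w+)=(%w+)") do
    print(k, v) --> prints the captured pairs: from world, then to Lua
end
```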
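string.gsub(s, pattern, repl[, n])
- Returns a copy of s in which every match of pattern is replaced as specified by repl. The second return value is the number of matches.
- If pattern specifies no captures, the whole match is treated as the capture.
- By default, all matches are replaced.
Example: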
```
s = "hello world, hello world, hello world"
print(s:gsub("world", "sam")) --> hello sam, hello sam, hello sam   3
```
- If n is given, only the first n matches are replaced.
Example:
```
s = "hello world, hello world, hello world"
print(s:gsub("world", "sam", 1)) --> hello sam, hello world, hello world    1
print(s:gsub("world", "sam", 2)) --> hello sam, hello sam, hello world  2
```
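- If repl is a string, it is used as the replacement text. Inside repl, %d (with d from 1 to 9) stands for the d-th captured substring, %0 for the whole match, and %% for a single percent sign.
Example: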
```
s = "hello world, hello world, hello world"
print(s:gsub("(%w+)", "%1 %1", 1)) --> hello hello world, hello world, hello world  1
print(s:gsub("(%w+)%s*(%w+)", "%2 %1", 1)) --> world hello, hello world, hello world    1
```
- If repl is a table, it is queried on every match, using the first capture as the key.
Example:
```
x = {}
x.hello = "HELLO"
x.world = "WORLD"

s = "hello world, hello world, hello world"
print(s:gsub("(%w+)", x)) --> HELLO WORLD, HELLO WORLD, HELLO WORLD 6
```
- If repl is a function, it is called on every match, with the captured substrings passed as arguments.
Example:
```
function x(str)
    return "sam"
end

s = "hello world, hello world, hello world"
print(s:gsub("(%w+)", x)) --> sam sam, sam sam, sam sam 6
```
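- If the table lookup or the function call yields false or nil, no replacement is made and the original match is kept. If it yields a string or a number, that value is used as the replacement. Either way the match is still counted in the second return value.
Example: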
```
x = {}
x.hello = "HELLO"

s = "hello world, hello world, hello world"
print(s:gsub("(%w+)", x)) --> HELLO world, HELLO world, HELLO world 6
```
```
function x(str)
    return nil
end

s = "hello world, hello world, hello world"
print(s:gsub("(%w+)", x)) --> hello world, hello world, hello world 6
```
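Two common uses of gsub, sketched under a few assumptions: the helper name trim and the table vars are hypothetical, and the anchors ^ and $ (start and end of the subject string) come from the Lua reference manual rather than from the pattern list earlier in this post. The extra parentheses drop the match count.

```lua
-- hypothetical helper: strip leading and trailing whitespace
local function trim(s)
    -- the lazy (.-) keeps the middle; ^ and $ anchor both ends
    return (s:gsub("^%s*(.-)%s*$", "%1"))
end
print("[" .. trim("  hello world \n") .. "]")   --> [hello world]

-- %0 is the whole match, %% a literal percent sign
print((("50 and 75"):gsub("%d+", "%0%%")))      --> 50% and 75%

-- expand $name placeholders from a table; unknown names stay unchanged
local vars = { user = "sam", lang = "Lua" }
print((("$user writes $lang and $other"):gsub("%$(%w+)", vars)))
--> sam writes Lua and $other
```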
string.len(s)
- Returns the length of the string in bytes.
Example:
```
print(string.len("hello, world")) --> 12
```
string.lower(s)
- Returns a copy of s with all uppercase letters converted to lowercase.
Example:
```
print(string.lower("HEllo, woRLD")) --> hello, world
```
string.upper(s)
- Returns a copy of s with all lowercase letters converted to uppercase.
Example:
```
print(string.upper("HEllo, woRLD")) --> HELLO, WORLD
```
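string.match(s, pattern[, init])
- Looks for the first match of pattern in the string and returns the captures. If pattern has no captures, the whole match is returned. Returns nil if there is no match.
- init specifies where the search starts. It defaults to 1 and may be negative.
Example: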
```
s = "hello world, hello world, hello world"
print(string.match(s, "hello")) --> hello
print(string.match(s, "wor%a+")) --> world
```
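The example above has no captures and does not use init. A short sketch of both, on an invented input string:

```lua
local s = "date: 2023-07-15, backup: 2024-01-02"
-- with captures, match returns the captures instead of the whole match
local y, m, d = string.match(s, "(%d+)%-(%d+)%-(%d+)")
print(y, m, d)                                  --> 2023    07    15
-- init = 18 starts the search after the first date
print(string.match(s, "%d+%-%d+%-%d+", 18))     --> 2024-01-02
-- a negative init counts from the end of the string
print(string.match(s, "%d+", -2))               --> 02
-- no match gives nil
print(string.match(s, "restore"))               --> nil
```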
string.rep(s, n[, sep])
- Returns a string made of n copies of s joined by sep.
- sep defaults to the empty string, i.e. no separator.
Example:
```
print(string.rep("hello", 2)) --> hellohello
print(string.rep("hello", 2, ", ")) --> hello, hello
```
string.reverse(s)
- Returns the string s reversed.
Example:
```
print(string.reverse("hello")) --> olleh
```
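string.sub(s, i[, j])
- Returns the substring of s that starts at i and ends at j (both inclusive).
- i and j may be negative, counting from the end of the string.
- j defaults to -1, i.e. the end of the string.
Example: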
```
print(string.sub("hello", 2)) --> ello
print(string.sub("hello", 2, 4)) --> ell
```
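Negative and out-of-range indices, which the example above does not cover, can be sketched as follows:

```lua
local s = "hello"
-- -3 is the third character from the end
print(s:sub(-3))          --> llo
-- from the second character to the second-to-last
print(s:sub(2, -2))       --> ell
-- indices outside the string are clamped to its bounds
print(s:sub(-100, 2))     --> he
-- a start past the end position yields the empty string
print(s:sub(4, 2) == "")  --> true
```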
string.pack(fmt, v1, v2, ...)

string.packsize(fmt)

string.unpack(fmt, s[, pos])
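The three signatures above are listed without a description. Going by the Lua 5.3 reference manual (these functions do not exist in earlier versions), string.pack serializes values into a binary string according to fmt, string.packsize returns the size of the result for fixed-size formats, and string.unpack reads the values back and also returns the position of the first unread byte. A minimal sketch with arbitrarily chosen format strings and values:

```lua
-- "<" little-endian, "i4" 4-byte signed integer, "I2" 2-byte unsigned integer
local fmt = "<i4I2"
local packed = string.pack(fmt, 100, 513)
print(#packed, string.packsize(fmt))   --> 6    6
-- 513 is 0x0201, so its low byte (1) comes first
print(packed:byte(1, -1))              --> 100    0    0    0    1    2
-- unpack returns the values plus the position of the first unread byte
print(string.unpack(fmt, packed))      --> 100    513    7

-- "z" is a zero-terminated string; pos lets unpack resume mid-string
local rec = string.pack("<i4z", 7, "lua")
local id, nextpos = string.unpack("<i4", rec)
print(id, string.unpack("z", rec, nextpos))  --> 7    lua    9
```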
Original article: https://www.cnblogs.com/sammei/p/lua-zi-fu-chuan-chu-li.html
