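Chapter 4: Finding and Replacing
4.1 grep: searching text (pattern matching)
Background: Traditionally there were three programs for searching text files: grep (the original text-matching program), the extended egrep, and the fast fgrep. The POSIX standard (introduced in the next chapter) later merged the three variants into a single grep program whose behaviour is selected with options.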
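In the merged grep, the egrep behaviour is selected with -E and the fgrep behaviour with -F. The following is a minimal sketch of the difference, assuming a small throwaway sample file created only for this illustration (its contents and the variable names are not from the original text).

```shell
#!/bin/sh
# Hypothetical sample input, created only for this sketch.
tmp=$(mktemp)
printf '%s\n' 'Did the alarm clock buzz?' 'Did the alarm clock ring?' 'a.c' 'abc' > "$tmp"

# -E: extended regular expressions (the old egrep); | means alternation.
ere=$(grep -E 'buzz|ring' "$tmp")

# -F: fixed strings (the old fgrep); the dot is literal, so "abc" does not match.
fixed=$(grep -F 'a.c' "$tmp")

# Default: basic regular expressions; the dot matches any single character.
basic=$(grep 'a.c' "$tmp")

printf '%s\n' '-E:' "$ere" '-F:' "$fixed" 'default:' "$basic"
rm -f "$tmp"
```

With -F only the literal line a.c is printed, while the default mode prints both a.c and abc.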
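Example 1: Search for a word. Matching lines are printed whole by default. If the pattern contains spaces or other special characters, enclose it in quotes.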
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep buzz font.txt
Did the alarm clock buzz?
[root@cloucentos6 home]# grep "get up" font.txt #"get up" contains a space, so it must be quoted
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep buzz --color=auto font.txt #highlight the matched word
Did the alarm clock buzz?
Example 2: Search several files for a given word
[root@cloucentos6 home]# cat file.txt font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep buzz font.txt file.txt
font.txt:Did the alarm clock buzz?
file.txt:Did the alarm clock buzz?
Example 3: grep -o prints only the matched text
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep -o buzz font.txt
buzz
Example 4: Print every line except those containing "get up"
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep -v "get up" font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
Example 5: Count the lines that contain a given word
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep -c "get up" font.txt
2
Example 6: grep -n prints the line number of each matching line
[root@cloucentos6 home]# cat -n font.txt
1 Good morning, mom.
2 Did the alarm clock go off ?
3 Did the alarm clock buzz?
4 Did the alarm clock ring?
5 It's time to get up!
6 I don't wanna get up.
[root@cloucentos6 home]# grep -n "get up" font.txt
5:It's time to get up!
6:I don't wanna get up.
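Example 7: grep -b prints the byte offset of each match. Offsets are counted from the first character of the file, starting at 0. Combined with -o, the offset is that of the matched text itself rather than of the start of its line.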
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna get up.
[root@cloucentos6 home]# grep -b -o "morning" font.txt
5:morning
Example 8: grep -l lists the files that contain a match; grep -L does the opposite and lists the files that do not
[root@cloucentos6 home]# cat test2.txt
my name is xiaohong.
[root@cloucentos6 home]# cat test3.txt
my name is zhangshang.
[root@cloucentos6 home]# grep -l zhangshang test2.txt test3.txt
test3.txt
[root@cloucentos6 home]# grep -L zhangshang test2.txt test3.txt
test2.txt
Example 9: grep -R searches for a string recursively through a directory tree
[root@cloucentos6 home]# ls -l /home/file/file2/font.txt
-rw-r--r--. 1 root root 149 Apr 27 00:27 /home/file/file2/font.txt
[root@cloucentos6 home]# grep -R -n -o "buzz" /home # -n prints the line number of each match; -o prints only the matched text
/home/file/file2/font.txt:3:buzz
Example 10: grep -i ignores case when matching
[root@cloucentos6 home]# cat font.txt
my name is zhanghong.
my name is xiaohong.
ZHANGHONG is man.
ZHANGhong is good man.
[root@cloucentos6 home]# grep zhanghong font.txt
my name is zhanghong.
[root@cloucentos6 home]# grep -i zhanghong font.txt
my name is zhanghong.
ZHANGHONG is man.
ZHANGhong is good man.
Example 11: grep -e specifies multiple patterns
[root@cloucentos6 home]# cat font.txt
Good morning, mom.
Did the alarm clock go off ?
Did the alarm clock buzz?
Did the alarm clock ring?
It's time to get up!
I don't wanna hello.
[root@cloucentos6 home]# grep -e "buzz" -e "wanna" font.txt
Did the alarm clock buzz?
I don't wanna hello.
[root@cloucentos6 home]# grep -e "wanna" -e "abc" font.txt
I don't wanna hello.
[root@cloucentos6 home]# grep -e "wanna" -e "hello" font.txt
I don't wanna hello.
[root@cloucentos6 home]# grep -e "wanna" -e "hello" font.txt -o
wanna
hello
Example 12: grep --include restricts a recursive search to files matching a glob such as *.txt; grep --exclude skips files matching the glob
[root@cloucentos6 home]# ls -l /home/file/file2/*
-rw-r--r--. 1 root root 149 Apr 27 01:10 /home/file/file2/font.sh
-rw-r--r--. 1 root root 149 Apr 27 00:27 /home/file/file2/font.txt
[root@cloucentos6 home]# grep "buzz" /home -r --include "*.txt"
/home/file/file2/font.txt:Did the alarm clock buzz?
[root@cloucentos6 home]# grep "buzz" /home -r --exclude "*.txt"
/home/file/file2/font.sh:Did the alarm clock buzz?
[root@cloucentos6 home]# grep "buzz" /home -r --exclude-dir=file3
/home/file/file2/font.sh:Did the alarm clock buzz?
/home/file/file2/font.txt:Did the alarm clock buzz?
[root@cloucentos6 home]# grep "buzz" /home -r --exclude-dir=file3 --exclude "*.txt" # --exclude-dir skips the named directory
/home/file/file2/font.sh:Did the alarm clock buzz?
Example 13: Print the lines before and after each match
[root@cloucentos6 home]# seq 10 | grep 5 -A 2 #the match plus the 2 lines after it
5
6
7
[root@cloucentos6 home]# seq 10 | grep 5 -B 2 #the match plus the 2 lines before it
3
4
5
[root@cloucentos6 home]# seq 10 | grep 5 -C 2 #the match plus 2 lines before and 2 lines after it
3
4
5
6
7
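4.2 find: locating and listing files
Example 1: find by file name
find /home -name "*.txt" -print #find files under /home whose names end in .txt (case-sensitive)
find /home -iname "*.txt" -print #find files under /home whose names end in .txt, ignoring case
find /home ! -name "*.sh" -print #find files under /home whose names do not end in .sh
Example 2: find by file type
Note: b block device, d directory, f regular file, l symbolic link, c character device, s socket
find /home -type d #find the directories under /home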
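Example 3: find by matching the path against a regular expression
find /home -regex ".*\(\.txt\|\.sh\)$" #find files under /home ending in .txt or .sh (case-sensitive)
find /home -iregex ".*\(\.txt\|\.sh\)$" #find files under /home ending in .txt or .sh, ignoring case
find /home \( -name "*.txt" \) -o \( -name "*.sh" \) #the same selection using -name tests (case-sensitive)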
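The sketch below compares -regex, -iregex, and the -name ... -o ... form on a throwaway directory, so that nothing under /home is touched. It assumes GNU find, whose default regular-expression syntax needs the backslashed \( \| \) forms; the file names are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical directory tree, created only for this sketch.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.sh" "$dir/c.log" "$dir/D.TXT"

# -regex must match the whole path, hence the leading ".*".
by_regex=$(find "$dir" -type f -regex '.*\(\.txt\|\.sh\)$' | LC_ALL=C sort)

# The same selection with two -name tests joined by -o.
by_name=$(find "$dir" -type f \( -name '*.txt' -o -name '*.sh' \) | LC_ALL=C sort)

# -iregex ignores case, so D.TXT is matched as well.
by_iregex=$(find "$dir" -type f -iregex '.*\(\.txt\|\.sh\)$' | LC_ALL=C sort)

echo "$by_regex"
echo "$by_iregex"
rm -rf "$dir"
```

The first two commands select the same two files; only the -iregex form also reports D.TXT.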
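Example 4: find by excluding a file name
find /home ! -name "*.sh" -print #find files under /home whose names do not end in .sh
Example 5: find by directory depth
find /home -mindepth 2 #report only entries at least 2 levels below /home
find /home -maxdepth 2 #descend at most 2 levels below /home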
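Example 6: find by file timestamps
Note: -atime tests the last access time, -ctime the last status (metadata) change time, and -mtime the last content modification time, all measured in days; -amin, -cmin, and -mmin test the same timestamps in minutes.
find /home -type f -atime -2 #regular files under /home accessed within the last 2 days
find /home -type f -atime +2 #regular files under /home last accessed more than 2 days ago
find /home -type f -atime 2 #regular files under /home last accessed exactly 2 days ago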
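Example 7: find by file size
Note: b blocks (512 bytes), c bytes, k kilobytes (1024 bytes), M megabytes, G gigabytes
find /home -type f -size 9M #regular files under /home whose size is 9M (sizes are rounded up to whole units)
find /home -type f -size +9M #regular files under /home larger than 9M
find /home -type f -size -9M #regular files under /home smaller than 9M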
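Example 8: Delete the files find matches
find /home -type f -name "*.txt" -delete #delete the regular files under /home whose names end in .txt
Example 9: find by permissions, owner, and group
find /home -type f -perm 644 #regular files under /home with permissions 644
find /home -type f -user test #regular files under /home owned by user test
find /home -type f ! -user root #regular files under /home not owned by root
find /home -type f -group test #regular files under /home belonging to group test
find /home -type f ! -group root #regular files under /home not belonging to group root
find /home -maxdepth 1 -type f -user test #regular files directly in /home (depth 1) owned by test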
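Example 10: Combine find with other commands through -exec
Note: {} \; is a special sequence used together with -exec. {} is replaced by each matching file name, and the escaped semicolon \; marks the end of the command.
find /home -type f -user test -exec ls -l {} \; #list details of the regular files under /home owned by test
find /home -type f -user abc -exec ls -l {} \; > a1.txt #list details of the regular files under /home owned by abc and redirect the output to a1.txt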
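As a small sketch of how -exec is terminated, the following runs on a throwaway directory with made-up file names. It also shows the POSIX "+" terminator, which is not used in the examples above: it passes many file names to a single invocation of the command instead of running the command once per file.

```shell
#!/bin/sh
# Hypothetical files, created only for this sketch.
dir=$(mktemp -d)
printf 'one\n' > "$dir/a.txt"
printf 'two\nthree\n' > "$dir/b.txt"

# "\;" ends the -exec command; wc runs once for every file found,
# so it prints one line per file.
per_file=$(find "$dir" -type f -name '*.txt' -exec wc -l {} \; | wc -l | tr -d ' ')

# "+" appends as many file names as fit, so cat typically runs only once
# and prints the lines of both files.
batched=$(find "$dir" -type f -name '*.txt' -exec cat {} + | wc -l | tr -d ' ')

echo "wc reported $per_file files; cat printed $batched lines in total"
rm -rf "$dir"
```

The backslash before the semicolon matters: without it the shell treats ; as its own command separator and find never sees the terminator.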
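Example 11: Make find skip a particular directory
find /home \( -name "home2" -prune \) -o \( -type f -print \) #print the paths of all regular files under /home, skipping everything inside directories named home2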
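The -prune pattern can be tried safely on a temporary tree. This is a minimal sketch; the directory and file names other than home2 are invented for the illustration.

```shell
#!/bin/sh
# Hypothetical directory tree, created only for this sketch.
dir=$(mktemp -d)
mkdir -p "$dir/home2" "$dir/keep"
touch "$dir/home2/skipped.txt" "$dir/keep/kept.txt" "$dir/top.txt"

# -prune stops find from descending into home2; the branch after -o
# prints every regular file found elsewhere.
found=$(find "$dir" \( -name home2 -prune \) -o \( -type f -print \) \
        | sed "s|^$dir/||" | LC_ALL=C sort)

echo "$found"
rm -rf "$dir"
```

Only keep/kept.txt and top.txt are listed; skipped.txt never appears because find does not enter home2 at all.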
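4.3 sed: replacing text in files
sed is a stream editor and one of the most important text-processing tools. It works closely with regular expressions and is a must-learn tool for shell scripting. One of its most common uses is text substitution.
Example 1: Replace zhanghong with xiaoming in the output; font.txt itself is not changed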
[root@cloucentos6 home]# cat font.txt
my name is zhanghong.
[root@cloucentos6 home]# sed 's/zhanghong/xiaoming/' font.txt
my name is xiaoming.
[root@cloucentos6 home]# cat font.txt
my name is zhanghong.
Example 2: sed -i applies the substitution to the file itself, modifying it in place
[root@cloucentos6 home]# cat font.txt
my name is zhanghong.
[root@cloucentos6 home]# sed -i 's/zhanghong/xiaoming/' font.txt
[root@cloucentos6 home]# cat font.txt
my name is xiaoming.
Example 3: The g flag replaces every occurrence of zhanghong in font.txt with xiaoming (with -i, so the file is modified)
[root@cloucentos6 home]# cat font.txt
my name is zhanghong.
zhanghong is man.
zhanghong is googboy.
[root@cloucentos6 home]# sed -i 's/zhanghong/xiaoming/g' font.txt
[root@cloucentos6 home]# cat font.txt
my name is xiaoming.
xiaoming is man.
xiaoming is googboy.
Example 4: The Ng flag skips the first N-1 matches and replaces from the N-th match onward. Matches are counted separately on each line.
[root@cloucentos6 home]# echo 'xiaoxiaoxiaoxiao' | sed 's/xiao/da/3g'
xiaoxiaodada
[root@cloucentos6 home]# echo ' xiao xiao xiao xiao ' | sed 's/xiao/da/3g'
xiao xiao da da
[root@cloucentos6 home]# cat font.txt
my name is zhaonghong. long time , zhanghong is man. zhanghong is good boy.
my name is zhaonghong. long time , zhanghong is man. zhanghong is good boy.
[root@cloucentos6 home]# sed 's/zhanghong/xiaoming/1g' font.txt
my name is zhaonghong. long time , xiaoming is man. xiaoming is good boy.
my name is zhaonghong. long time , xiaoming is man. xiaoming is good boy.
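To make the counting rule concrete, here is a short sketch comparing a bare number flag with the Ng form. Combining a number with g is a GNU sed extension, so this assumes GNU sed; the variable names are only for the illustration.

```shell
#!/bin/sh
# A bare number replaces only the N-th match on each line.
only_third=$(echo 'xiaoxiaoxiaoxiao' | sed 's/xiao/da/3')

# Ng replaces the N-th match and every later match on the line.
third_onward=$(echo 'xiaoxiaoxiaoxiao' | sed 's/xiao/da/3g')

# The count restarts on every input line.
per_line=$(printf 'xiao xiao\nxiao xiao\n' | sed 's/xiao/da/2g')

echo "$only_third"    # xiaoxiaodaxiao
echo "$third_onward"  # xiaoxiaodada
echo "$per_line"      # "xiao da" on both lines
```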
Example 5: -e supplies the editing script; -n suppresses the default output so that only lines printed explicitly with p are shown
[root@cloucentos6 home]# cat font.txt
1
2
3
4
5
6
[root@cloucentos6 home]# sed -n -e 3,4p font.txt
3
4
[root@cloucentos6 home]# cat -n font.txt | sed -n -e 3,4p
3 3
4 4
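4.4 awk: rearranging fields
awk is a powerful text-analysis tool. Where grep searches and sed edits, awk is especially strong at analysing data and generating reports. Put simply, awk reads a file line by line, splits each line into fields using whitespace as the default separator, and then processes the fields.
General form of an awk command:
awk 'BEGIN{print "start"} pattern { commands } END{print "end"}'
Note: When the arguments to print are separated by commas, they are printed with a space between them. Strings in double quotes placed next to each other (or next to a variable) without a comma are concatenated directly.
Execution order: awk first runs the statements in the BEGIN block. It then reads the input one line at a time and runs pattern { commands } for each line until all input has been read. On reaching the end of the input, it runs the END block.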
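The execution order can be observed with a three-line input. This is a minimal sketch; the numbers and the pattern $1 > 4 are chosen only for illustration.

```shell
#!/bin/sh
# BEGIN runs before any input is read, the pattern block runs for each
# line that satisfies the pattern, and END runs after the last line.
report=$(printf '3\n8\n5\n' | awk '
BEGIN  { print "start"; count = 0 }
$1 > 4 { count++; print "line " NR ": " $1 }
END    { print "end, matched " count " of " NR " lines" }')

echo "$report"
```

The output is start, then line 2: 8 and line 3: 5, then end, matched 2 of 3 lines.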
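Special variables:
NR the number of the current record (line); it keeps increasing across multiple input files
NF the number of fields in the current line
$0 the whole current line
$1 the first field of the current line
Example 1: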
[root@cloucentos6 home]# cat font.txt
lin1 f2 f3
lin2 f4 f5 f6
lin3 f6 f8 f9
[root@cloucentos6 home]# awk '{print "line " NR ": " $0 " fields: " NF}' font.txt
line 1: lin1 f2 f3 fields: 3
line 2: lin2 f4 f5 f6 fields: 4
line 3: lin3 f6 f8 f9 fields: 4
[root@cloucentos6 home]# awk '{print "one:" $1 " two:" $2 " three:" $3 " four:" $4}' font.txt
one:lin1 two:f2 three:f3 four:
one:lin2 two:f4 three:f5 four:f6
one:lin3 two:f6 three:f8 four:f9
Example 2: $(NF-1) prints the second-to-last field of each line
[root@cloucentos6 home]# cat font.txt
lin1 f2 f3
lin2 f4 f5
lin3 f6 f8
[root@cloucentos6 home]# awk '{print $(NF-1)}' font.txt
f2
f4
f6
Example 3: Sum the numbers 1 to 5 with awk
[root@cloucentos6 home]# seq 5 | awk 'BEGIN{sum =0;print "number:"} {print $1"+";sum+=$1} END {print "==";print sum}'
number:
1+
2+
3+
4+
5+
==
15
Example 4: The -v option passes an external value into awk
[root@cloucentos6 home]# cat test.sh
#!/bin/bash
var=100
echo | awk -v varnumber=$var '{print varnumber}'
[root@cloucentos6 home]# ./test.sh
100
[root@cloucentos6 home]# cat test.sh
#!/bin/bash
var=100
echo | awk "{print $var}"
[root@cloucentos6 home]# ./test.sh
100
Example 5: Pass several external variables to awk as assignments after the program, separated by spaces.
[root@cloucentos6 home]# cat test.sh
#!/bin/bash
var1=100 ; var2=200
echo | awk '{print v1,v2}' v1=$var1 v2=$var2
[root@cloucentos6 home]# ./test.sh
100 200
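The two ways of passing values differ in when the variable becomes visible. The sketch below, with illustrative values, suggests quoting the shell variables and shows that a -v assignment is already set in BEGIN, whereas an assignment written after the program is not.

```shell
#!/bin/sh
var1=100
var2='two words'   # hypothetical value containing a space

# Quoting "$var2" keeps a value with spaces as one assignment.
out=$(echo | awk -v v1="$var1" -v v2="$var2" '{ print v1, v2 }')

# -v assignments are made before BEGIN runs.
in_begin=$(awk -v v=1 'BEGIN { print v }')

# An assignment after the program is processed only when awk reaches it
# as an operand, so it is still unset inside BEGIN.
operand_in_begin=$(awk 'BEGIN { print "[" v "]" }' v=1)

echo "$out"               # 100 two words
echo "$in_begin"          # 1
echo "$operand_in_begin"  # []
```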
Example 6: Filter the lines awk processes with different patterns
[root@cloucentos6 home]# cat -n font.txt
1 lin1 f2 f3
2 lin2 f4 f5
3 lin3 f6 f8
4 lin4 f9 f10
[root@cloucentos6 home]# awk 'NR < 3' font.txt #print lines whose line number is less than 3
lin1 f2 f3
lin2 f4 f5
[root@cloucentos6 home]# awk 'NR <= 3' font.txt #print lines whose line number is less than or equal to 3
lin1 f2 f3
lin2 f4 f5
lin3 f6 f8
[root@cloucentos6 home]# awk 'NR==3' font.txt #print line 3
lin3 f6 f8
[root@cloucentos6 home]# awk 'NR==3,NR==4' font.txt #print the range from line 3 through line 4
lin3 f6 f8
lin4 f9 f10
[root@cloucentos6 home]# awk '/f6/' font.txt #print lines containing the string f6
lin3 f6 f8
[root@cloucentos6 home]# awk '!/f6/' font.txt #print lines that do not contain the string f6
lin1 f2 f3
lin2 f4 f5
lin4 f9 f10
Example 7: Set the field separator
[root@cloucentos6 home]# cat file.txt
a1=[2];a2=[3];a3=[4]
[root@cloucentos6 home]# awk -F ';' ' {print $2}' file.txt
a2=[3]
[root@cloucentos6 home]# awk 'BEGIN{FS=";"} {print $2}' file.txt
a2=[3]
Example 8: Read the output of a command inside awk
[root@cloucentos6 home]# echo | awk '{"grep root /etc/passwd" | getline cmdout ; print cmdout}'
root:x:0:0:root:/root:/bin/bash
Example 9: Loops in awk
[root@cloucentos6 home]# echo | awk '{for(i=0;i<5;i++) {print i}}'
0
1
2
3
4
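4.5 tr: translating characters
Background: tr translates characters read from standard input by substituting, deleting, or squeezing them. tr reads only from standard input; it cannot take input files as command-line arguments.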
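Because tr accepts no file operand, a file has to reach it through redirection or a pipe. A minimal sketch, using a temporary file with made-up contents:

```shell
#!/bin/sh
# Hypothetical sample file, created only for this sketch.
tmp=$(mktemp)
printf 'aBCDefg 123\n' > "$tmp"

# Feed the file to tr through input redirection ...
upper=$(tr 'a-z' 'A-Z' < "$tmp")

# ... or through a pipe.
lower=$(cat "$tmp" | tr 'A-Z' 'a-z')

echo "$upper"   # ABCDEFG 123
echo "$lower"   # abcdefg 123
rm -f "$tmp"
```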
Example 1: Convert letter case with tr
[root@cloucentos6 home]# echo "ABC" | tr 'A-Z' 'a-z' #uppercase to lowercase
abc
[root@cloucentos6 home]# echo "abc" | tr 'a-z' 'A-Z' #lowercase to uppercase
ABC
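Example 2: test.sh converts any uppercase letters typed by the user to lowercase
[root@cloucentos6 home]# cat test.sh
#!/bin/bash
printf "Enter uppercase letters, then press Enter: "
read var
printf "Lowercase result: "
echo "$var" | tr 'A-Z' 'a-z'
[root@cloucentos6 home]# ./test.sh
Enter uppercase letters, then press Enter: ABC
Lowercase result: abc
Example 3: Feed the contents of file.txt to test.sh for conversion to lowercase. Because test.sh reads its input with read, the file is piped to the script's standard input (xargs would pass it as an argument, which the script ignores).
[root@cloucentos6 home]# cat file.txt
aBCDefgHIJK123$$
[root@cloucentos6 home]# cat file.txt | ./test.sh
Enter uppercase letters, then press Enter: Lowercase result: abcdefghijk123$$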
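tr: replacing characters
Example 1: In the input abcdef, replace a with 2 and e with 3
[root@cloucentos6 home]# echo "abcdef" | tr 'ae' '23'
2bcd3f
Example 2: In the input abcdef, replace a with 2 and b with 3; characters are mapped position by position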
[root@cloucentos6 home]# echo "abcdef" | tr 'a-b' '2-3'
23cdef
tr: deleting characters
Example 1:
[root@cloucentos6 home]# cat file.txt
aBCDefgHIJK123$$
[root@cloucentos6 home]# cat file.txt | tr -d '$$'
aBCDefgHIJK123
[root@cloucentos6 home]# cat file.txt | tr -d 'BCD'
aefgHIJK123$$
[root@cloucentos6 home]# echo "hello 123 world 457" | tr -d '0-9'
hello world
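tr: complementing the character set
Example 1: With -c, delete every character that is not in the given set (here spaces, the newline, and letters are all deleted)
[root@cloucentos6 home]# echo "hello 123 world 456" | tr -d -c '0-9'
123456
Example 2: Delete every character except digits, spaces, and the newline
[root@cloucentos6 home]# echo "hello 123 world 456" | tr -d -c '0-9 \n'
 123  456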
tr: squeezing characters (collapsing repeats)
Example 1: Remove repeated blank lines
[root@cloucentos6 home]# cat file.txt
abcd

efgh

ijkl
[root@cloucentos6 home]# cat file.txt | tr -s '\n'
abcd
efgh
ijkl
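4.6 cut: selecting fields
Background: The classic example of a file whose fields are separated by a delimiter (commonly a space or a tab, here a colon) is /etc/passwd. Each line describes one user of the system, and the fields are separated by colons, for example:
root:x:0:0:root:/root:/bin/bash
The file has 7 fields:
1. User name
2. Encrypted password (if the account is disabled this is an asterisk; if the encrypted passwords are stored separately in /etc/shadow, it may be some other character)
3. User ID number
4. Group ID number
5. User's full name
6. Home directory
7. Login shell
The cut command cuts selected data out of a text file, either by field or by character. Note that a tab counts as a single character here.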
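A short sketch of the two selection modes, using the passwd line quoted above and a made-up tab-separated line:

```shell
#!/bin/sh
line='root:x:0:0:root:/root:/bin/bash'

# -d sets the delimiter and -f picks fields: here the user name and login shell.
name_shell=$(printf '%s\n' "$line" | cut -d : -f 1,7)

# Without -d, fields are separated by tabs.
second=$(printf 'a\tb\tc\n' | cut -f 2)

# -c counts characters; the tab is character 2, so characters 1 and 3 are a and b.
chars=$(printf 'a\tb\tc\n' | cut -c 1,3)

echo "$name_shell"   # root:/bin/bash
echo "$second"       # b
echo "$chars"        # ab
```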
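Example 1: -f selects fields; -d, used together with -f, sets the delimiter, which defaults to the tab character (head keeps the first 5 lines)
[root@cloucentos6 home]# cut -d : -f 1,5 /etc/passwd | head -n 5 #print each user's name and full name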
root:root
bin:bin
daemon:daemon
adm:adm
lp:lp
Example 2: -c selects by character position.
[root@cloucentos6 home]# ls -l
total 20
drwx------. 2 root root 16384 Dec 21 17:14 lost+found
-rwxr-xr-x. 1 root root 41 May 10 18:42 test.sh
[root@cloucentos6 home]# ls -l | cut -c 1-13 #print characters 1 through 13 of each line
total 20
drwx------. 2
-rwxr-xr-x. 1
[root@cloucentos6 home]# ls -l | cut -c 1,5 #print characters 1 and 5 of each line
tl
d-
-r