• Summary of awk essentials


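    The awk command:
        One of the "three musketeers" of Linux text processing, together with grep and sed
        ls -l `which awk`: shows where awk is installed (and, if it is a symlink, what it points to)
        GNU awk = gawk
        Basic usage:
            gawk [options] 'program' file file ...
                program: PATTERN {ACTION STATEMENT}
                    An action is made up of statements, separated by ;
                ACTION: print, printf
                Note: the code inside {} runs for each input line (each line that matches PATTERN, if one is given)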
               
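        Options:
            -F fs: sets the input field separator; the default is whitespace (runs of spaces and tabs), not a semicolon
                e.g. awk -F: '{print $1,$2}' /etc/passwd
                        prints the first two colon-separated fields
                    awk -F: '{print $1,$2,"password"}' /etc/passwd
                        prints the first two colon-separated fields, followed by the word password on every line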
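To try -F without reading /etc/passwd, the sketch below feeds awk two made-up passwd-style lines. The sample data and the variable names sample and out are assumptions for illustration only.

```shell
# Two made-up passwd-style records (illustrative data, not a real system file)
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# -F: makes the colon the input field separator
out=$(printf '%s\n' "$sample" | awk -F: '{print $1,$7}')
echo "$out"
```

Without -F: each of these lines would be a single field, because neither contains whitespace.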
       
           
           
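            1. The print statement
                print item1,item2,...
                Key points:
                    1> Items are separated by commas in the program; in the output they are joined by the output field separator (OFS);
                    2> An item can be a string, a number, a field of the current record ($n), a variable, or an awk expression; numbers are implicitly converted to strings for output;
                    3> With no items, print is equivalent to print $0; to print an empty line, use print "";

                    e.g. awk -F: '{print }' /etc/passwd
                        awk -F: '{print $0}' /etc/passwd
                        Both print the whole file, because $0 stands for the entire line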
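The three points about print can be checked on a single line of input. This is a minimal sketch; the input string 'a b' and the variable name out are chosen only for illustration.

```shell
# Comma-separated items are joined with OFS (a single space by default);
# items written side by side are concatenated with nothing in between.
# A bare print outputs $0, and print "" outputs an empty line.
out=$(echo 'a b' | awk '{print $1,$2; print $1 $2; print; print ""; print 1+2}')
echo "$out"
```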
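            2. Variables
                Use -v to assign a variable from the command line; inside the program, awk uses its own assignment syntax (var=value).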
                2.1 Built-in variables
                    FS: input field separator; the default is whitespace
                        e.g. awk -v FS=":" '{print $1,$3}' /etc/passwd
                       
                    RS: input record separator; the default is a newline
                        e.g. awk -v RS=" " '{print $0}' /etc/passwd
                                the input is split into records at every space instead of at every newline
                               
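                    OFS: output field separator; the default is a single space
                        e.g. awk -v OFS="---------" '{print $1,$7}' /etc/passwd
                                FS is still whitespace here, so for a line without spaces $1 is the whole line and $7 is empty; the output is the line followed by ---------
                            awk -v FS=":" -v OFS="---------" '{print $1,$7}' /etc/passwd
                                the two fields are printed with --------- between them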
                               
                    ORS: output record separator; the default is a newline
                        e.g. awk -v FS=":" -v ORS=" " '{print $1,$7}' /etc/passwd
                                everything is printed on one line, with a space after each record
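The four separators are easier to tell apart on tiny inputs. The sketch below is illustrative only; the inputs and the variable names fields and records are assumptions.

```shell
# FS splits input fields on ":", OFS joins output fields with "-"
fields=$(echo 'a:b:c' | awk -v FS=':' -v OFS='-' '{print $1,$3}')
echo "$fields"

# RS=" " turns every space-separated chunk into its own record;
# ORS="," then ends every output record with a comma instead of a newline.
records=$(printf 'x y z' | awk -v RS=' ' -v ORS=',' '{print NR ":" $0}')
echo "$records"
```

printf is used for the second input so that no trailing newline ends up inside the last record.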
                               
                    NF: number of fields in the current record
                        e.g. awk -F: '{print NF}' /etc/passwd
                                prints the field count of each line
                            awk -F: '{print $NF}' /etc/passwd
                                prints the last field of each line
                               
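                    NR: number of records (lines) read so far, counted continuously across all input files
                        e.g. awk '{print NR}' /etc/passwd /etc/issue
                                prints a running line number that keeps counting across both files

                            awk '{print NR,$0}' /etc/passwd /etc/issue
                                prefixes every line with its line number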

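                    FNR: record number within the current file; the count restarts for each file
                        e.g. awk '{print FNR,$0}' /etc/passwd /etc/issue
                            line numbers start again from 1 in each file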
                           
                    FILENAME: name of the file currently being read
                        e.g. awk '{print FILENAME,$0}' /etc/passwd
                            every line is prefixed with /etc/passwd
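NR, FNR and FILENAME can be compared side by side with two small files. The sketch below creates its own temporary files; the file names one.txt and two.txt and their contents are made up for the demonstration.

```shell
# Two small temporary files (created only for this demonstration)
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\n' > "$dir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file
out=$(cd "$dir" && awk '{print FILENAME, NR, FNR, $0}' one.txt two.txt)
echo "$out"
rm -rf "$dir"
```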
                   
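                    ARGC: number of command-line arguments (the awk command name itself counts; options and the program text do not)
                        e.g. awk '{print ARGC}' /etc/passwd
                                prints 2 once for every input line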
                   
                    ARGV: array holding the command-line arguments
                        e.g. awk '{print ARGV[0]}' /etc/passwd
                                prints awk
                            awk '{print ARGV[1]}' /etc/passwd
                                prints /etc/passwd
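ARGC and ARGV can be inspected once in a BEGIN block instead of once per input line. In the sketch below, first.txt and second.txt are hypothetical names; awk does not open them, because a program consisting only of BEGIN reads no input.

```shell
# ARGV[0] is the command name; options and the program text are not stored in ARGV.
# The loop starts at 1 to list only the operands that follow the program.
out=$(awk 'BEGIN{print ARGC; for(i=1;i<ARGC;i++) print i, ARGV[i]}' first.txt second.txt)
echo "$out"
```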
               
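                2.2 User-defined variables
                    -v var=val:
                        Variable names are case-sensitive

                    Where a variable can be defined:
                        (1) Inside the program;
                            e.g. awk '{file="passwd";print file,$1}' /etc/passwd
                                    every output line starts with passwd, followed by $1

                        (2) On the command line, with the -v option;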
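A common use of -v is handing a shell value to awk. The sketch below shows both ways of defining a variable; the names limit, min and label and the sample data are assumptions for illustration.

```shell
# A shell variable (hypothetical threshold) handed to awk through -v
limit=500

# min comes from the command line; label is assigned inside the program
out=$(printf 'alice:1000\nbob:42\n' | awk -F: -v min="$limit" '$2>=min{label="regular"; print label, $1}')
echo "$out"
```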

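            3. The printf statement
                Format: printf format, item1,item2,...

                e.g. awk 'BEGIN{printf "%d\n",6}'
                        prints the number 6 followed by a newline

                Key points:
                    1> format is required;
                    2> No newline is added automatically; put \n in the format explicitly
                    3> format needs one format specifier for each item that follows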
               
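                Format specifiers: each starts with % and ends with a conversion character
                    %c: a single character (a numeric item is treated as a character code)
                    %d,%i: decimal integer
                    %e,%E: number in scientific notation
                    %f: floating-point number
                    %g,%G: number in scientific or floating-point notation, whichever is shorter
                    %s: string
                    %u: unsigned integer
                    %%: a literal %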
               
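                Modifiers:
                    #[.#]: the first # is the display width, e.g. %30s; the second # is the precision after the decimal point, e.g. %8.2f
                        e.g. awk -F: '{printf "%20s %20d\n", $1,$3}' /etc/passwd
                                output is right-aligned

                    -: left-align
                        e.g. awk -F: '{printf "%-20s %-20d\n", $1,$3}' /etc/passwd
                                output is left-aligned because of the added minus sign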
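Width, precision and alignment are easiest to see with visible delimiters. The brackets in the sketch below are there only to make the padding visible; the values are arbitrary.

```shell
out=$(awk 'BEGIN{
    printf "[%5d]\n", 42          # width 5, right-aligned
    printf "[%-5d]\n", 42         # width 5, left-aligned
    printf "[%8.2f]\n", 3.14159   # width 8, 2 digits after the decimal point
    printf "[%s] [%c] %d%%\n", "str", "A", 50
}')
echo "$out"
```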
                       
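            4. Operators
                Arithmetic operators:
                    x+y, x-y, x*y, x/y, x^y, x%y
                    -x: negation
                    +x: converts x to a number

                Comparison operators:
                    x<y, x<=y, x>y, x>=y, x==y, x!=y

                    e.g. awk -F: '$3>500{print $0}' /etc/passwd
                        prints the lines whose UID is greater than 500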
               
                String operators:
                    Concatenation: there is no operator symbol; writing two values next to each other joins them
               
                Assignment operators:
                    = += -= *= /= %= ^=
                    ++ --
                       
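                Pattern-matching operators:
                    ~: matches the regular expression
                    !~: does not match the regular expression
                    e.g. awk -F: '$1~/root/ {print $7}' /etc/passwd
                            prints $7 when $1 matches /root/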
               
                Logical operators:
                    &&
                    ||
                   
                Conditional expression:
                    selector?if-true-expression:if-false-expression
                    e.g. awk -F: '{usertype=($3>=500?"common user":"sysuser or admin");printf "%20s:%-s\n",$1,usertype}' /etc/passwd
                            a UID of 500 or more means common user; anything lower means sysuser or admin
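Several of these operators can be combined in one short program. This is a sketch on made-up name:UID pairs; the variable names kind and tag are hypothetical.

```shell
out=$(printf 'root:0\nalice:1000\n' | awk -F: '{
    kind = ($2 >= 500 ? "common user" : "sysuser or admin")   # conditional expression
    tag  = ($1 ~ /^ro/ ? "matches" : "no match")              # regex match operator
    print $1 "=" kind ", " tag ", " (7 % 3 + 2 ^ 3)           # concatenation and arithmetic
}')
echo "$out"
```

The parentheses around the arithmetic are not strictly needed, but they make it obvious what gets concatenated.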
                           
                Function calls:
                    function_name(arg1, arg2, ...)
           
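            5. PATTERN
                (1) /regular expression/: only lines matched by /regular expression/ are processed
                    e.g. awk -F: '/^\<root\>/{print $0}' /etc/passwd
                            prints every line that starts with the word root (\< and \> are gawk word-boundary operators)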
                           
                (2) relational expression: evaluates to true or false; in general a non-zero number or a non-empty string is true, anything else is false;
                    e.g. awk -F: '$3>=500{print $1,$3}' /etc/passwd
                        awk -F: '$5~/root/{print $0}' /etc/passwd
                       
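                (3) line ranges: similar to address ranges in sed or vim, written as startpattern,endpattern, e.g. /pat1/,/pat2/; bare line numbers are not accepted, so test NR instead, e.g. NR>=2&&NR<=5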
       
                (4) BEGIN/END: special patterns

                    The BEGIN action runs once, before any input is read; the END action runs once, after all input has been processed;
                    e.g. awk -F: 'BEGIN{print "username","shell";print "-------------------------"}$7~/bash\>/{print $1,$7}END{print "------------------------------"}' /etc/passwd

                        awk -F: 'BEGIN{username="username";shell="shell";printf "%10s%10s\n",username,shell;print "---------------------------"}$7~/bash\>/{printf "%10s%10s\n",$1,$7}END{print "---------------------------"}' /etc/passwd
               
                (5) empty: the empty pattern; matches every line;
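The pattern types can be seen working together in one program. The sketch below runs on a made-up five-line table; the data and the labels it prints are assumptions for illustration.

```shell
sample='id name
1 ann
2 bob
3 cat
4 dan'

out=$(printf '%s\n' "$sample" | awk '
BEGIN          { print "start" }        # runs before any input is read
/^id/          { next }                 # regex pattern: skip the header line
NR==3, NR==4   { print "range:", $2 }   # range pattern built from NR tests
$1 >= 4        { print "big:", $2 }     # relational expression
END            { print "lines:", NR }   # runs after the last line
')
echo "$out"
```

In END, NR still holds the number of the last record read, which is why it can be used as a line count.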
               
            6. Common actions
                (1) Expressions
                (2) Control statements
                (3) Input statements
                (4) Output statements
           
            7. Control statements
                if (condition) statement [ else statement ]
                while (condition) statement
                do statement while (condition)
                for (expr1; expr2; expr3) statement
                for (var in array) statement
                break
                continue
                delete array[index]
                delete array
                exit [ expression ]
                { statements }
               
                7.1 if-else

                    Syntax: if (condition) statement [ else statement ]
                        if (condition) { statements; } [ else { statements; }]

                        e.g. awk -F: '{if ($3>=500) print $1,"is a common user." }' /etc/passwd
                            awk -F: '{if ($3>=500) {print $1,"is a common user."} else {print $1,"is a system user or admin."}}' /etc/passwd
                            awk '{if (NF>6) print NF, $0 }' /etc/inittab
                                prints the field count and the whole line for lines with more than 6 fields

                    Use: conditional tests on the whole line read by awk, or on individual fields of it;
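An else branch can itself be another if, which gives a chain of tests. This is a small sketch on made-up name:UID pairs; the three labels it prints are hypothetical.

```shell
out=$(printf 'root:0\ndaemon:2\nalice:1000\n' | awk -F: '{
    if ($2 == 0) {
        print $1, "admin"
    } else if ($2 < 500) {
        print $1, "system"
    } else {
        print $1, "common"
    }
}')
echo "$out"
```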
                   
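                7.2 while loop
                    Syntax: while (condition) statement
                        while (condition) { statements }
                        Loops while the condition is true and exits as soon as it becomes false;

                    Use: typically for looping over the fields of the current line;

                        e.g. awk '{i=1;while(i<=NF){printf "%20s:%d\n",$i,length($i); i++}}' /etc/inittab
                                prints every field of every line together with its length
                            awk '{i=1;while(i<=NF){if (length($i)>5) {printf "%20s:%d\n",$i,length($i);} i++}}' /etc/inittab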
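The same field loop can be tried on a one-line input when /etc/inittab is not available. The input words below are arbitrary.

```shell
# Walk over the fields of the line and print each one with its length
out=$(echo 'one three five' | awk '{
    i = 1
    while (i <= NF) {
        printf "%s:%d\n", $i, length($i)
        i++
    }
}')
echo "$out"
```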
                       
                7.3 do-while loop
                    Syntax: do statement while (condition)
                        do { do-while-body }  while (condition)
                        Meaning: the loop body runs at least once;
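A minimal sketch of the "at least once" behaviour: the condition below is false from the start, yet the body still runs one time. The starting value 10 is arbitrary.

```shell
out=$(awk 'BEGIN{
    i = 10
    do {
        print "body ran with i=" i
        i++
    } while (i < 5)     # already false, yet the body has run once
}')
echo "$out"
```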
                       
                7.4 for loop
                    Syntax: for (expr1; expr2; expr3) statement
                        for (expr1; expr2; expr3) { statements }

                        for (variable assignment; condition; iteration process) { for-body }

                        e.g. awk '{for(i=1;i<=NF;i++) {printf "%s:%d\n", $i, length($i)}}' /etc/inittab

                    awk also has a special form of for, dedicated to iterating over array elements:
                        Syntax: for (var in array) { for-body }
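A typical use of the three-expression for loop is adding up the fields of each line. The sketch below uses made-up numeric input; the variable name sum is an assumption.

```shell
out=$(printf '1 2 3\n10 20\n' | awk '{
    sum = 0                                  # reset for every line
    for (i = 1; i <= NF; i++) sum += $i
    print "line", NR, "sum", sum
}')
echo "$out"
```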
                       
                7.5 switch (gawk extension)
                    Syntax: switch (expression) { case VALUE or /REGEXP/: statement; ...; default: statementN }
                        As in C, a case falls through into the next one unless it ends with break
                       
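                7.6 break and continue
                    break: exits the innermost enclosing loop (unlike the shell's break, it takes no count)
                    continue: ends the current iteration early and moves straight on to the next one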
                   
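                7.7 next
                    Stops processing the current line early and moves on to the next line

                    ~]# awk -F: '{if($3%2!=0) next;print $1,$3}' /etc/passwd
                        prints the name and UID of every user whose UID is even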
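The difference between next, continue and break shows up when all three appear in one program. This is a sketch on made-up input; the thresholds are arbitrary.

```shell
out=$(printf 'skip me\n1 2 3 4 5 6\n' | awk '
/^skip/ { next }                    # next: drop this line and read the following one
{
    for (i = 1; i <= NF; i++) {
        if ($i % 2 == 0) continue   # continue: ignore even values
        if ($i > 4) break           # break: stop at the first odd value above 4
        print "odd:", $i
    }
}')
echo "$out"
```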
               
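            8. Arrays

                Associative arrays: array[index-expression]

                    index-expression:
                        Any string can be used;
                        If an element does not exist yet, referencing it makes awk create it with the empty string as its value;
                            so to test whether an array contains an element, use "index in array";

                        a["mon"]="Monday"
                        print a["mon"]

                    To iterate over every element of an array, use: for (var in array) { for body }

                         Note: var takes each index of array in turn, not each value; print the value with array[var]. The iteration order is unspecified.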
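The auto-creation rule is the usual source of surprises, so here is a short sketch of it. The array name a and the keys are illustrative.

```shell
out=$(awk 'BEGIN{
    a["mon"] = "Monday"
    a["tue"] = "Tuesday"

    if ("wed" in a) print "wed exists"; else print "wed missing"   # the in test creates nothing
    if (a["thu"] == "") print "thu is empty"                       # this reference creates a["thu"]
    if ("thu" in a) print "thu exists now"

    delete a["thu"]
    n = 0
    for (k in a) n++                                               # k runs over the indices
    print "elements:", n
}')
echo "$out"
```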
                        
                    Example: count how many times each word occurs
                        ~]# awk '{for(i=1;i<=NF;i++) {count[$i]++}}END{for(j in count) {print j,count[j]}}' awk.txt
                            counts over the whole file

                        awk '{for(i=1;i<=NF;i++) {count[$i]++};for(j in count) {print j,count[j]};delete count;print "---------------"}' awk.txt
                            counts per line: the array is cleared after each line
                       
                        ~]# ss -tan | awk '!/^State/{state[$1]++}END{for (i in state) {print i,state[i]}}'
                        ~]# netstat -tan | awk '/^tcp/{state[$NF]++}END{for(i in state){print i,state[i]}}'
                       
                    Exercise: count how many times each IP appears in the httpd access log;
                        ~]# awk '{ip[$1]++}END{for(i in ip){print i,ip[i]}}' /var/log/httpd/access_log
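The exercise can be tried without a real web server log. The sketch below uses three made-up log lines (the IP addresses and requests are invented) and sorts the result, because the order of for (i in ip) is unspecified.

```shell
# Three made-up access-log lines; only the first field (client IP) matters here
log='10.0.0.1 - - "GET / HTTP/1.1" 200
10.0.0.2 - - "GET /a HTTP/1.1" 404
10.0.0.1 - - "GET /b HTTP/1.1" 200'

# Count per IP, then sort so the output order is stable
out=$(printf '%s\n' "$log" | awk '{ip[$1]++} END{for (i in ip) print i, ip[i]}' | sort)
echo "$out"
```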
                       
            9. Functions

                9.1 Built-in functions
                    Numeric:
                        rand(): returns a random number n with 0 <= n < 1;
                       
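                    String handling:
                        length([s]): returns the length of string s ($0 if s is omitted)
                        sub(r, s [, t]): searches string t ($0 if omitted) for a match of pattern r and replaces the first occurrence with s;
                            sub(/ab/,"AB",$0)

                        gsub(r, s [, t]): like sub, but replaces every occurrence;

                        split(s, a [, r]): splits string s on separator r (FS if omitted), stores the pieces in array a starting at index 1, and returns the number of pieces;

                            ~]# netstat -tan | awk '/^tcp/{len=split($5,client,":");ip[client[len-1]]++}END{for(i in ip){print i,ip[i]}}'

                        substr(s, i [, n]): returns the substring of s that starts at position i (counting from 1) and is n characters long (up to the end of s if n is omitted);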
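The string functions can be exercised on one made-up input line. In the sketch below, the input text and the variable names s, g, n, len and part are assumptions for illustration.

```shell
out=$(echo 'ab-ab-ab 192.168.1.10:443' | awk '{
    s = $1
    sub(/ab/, "AB", s); print s             # first match only
    g = $1
    n = gsub(/ab/, "AB", g); print g, n     # every match; gsub returns the count
    len = split($2, part, ":")              # pieces land in part[1], part[2], ...
    print len, part[1], part[2]
    print substr($2, 1, 3), length($2)
}')
echo "$out"
```

sub and gsub modify their third argument in place, which is why the field is copied into a variable first.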
                   
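                    Time functions:
                        systime(): returns the current timestamp in seconds since the epoch (gawk);

                    Bitwise functions:
                        and(v1,v2): bitwise AND of v1 and v2 (gawk)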
                       
                9.2 User-defined functions
                    function f_name(p,q)
                    {
                        ...
                    }
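A filled-in version of the skeleton above might look like the sketch below. The function name max is a hypothetical helper, not something awk provides; adding 0 to each field forces a numeric comparison.

```shell
out=$(printf '3 9\n12 5\n' | awk '
# max is a hypothetical helper; its parameters are local to the function
function max(p, q) {
    return (p > q) ? p : q
}
{ print max($1 + 0, $2 + 0) }')
echo "$out"
```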

  • Original article: https://www.cnblogs.com/yajing-zh/p/4878232.html