• awk


    Liang always brings me interesting quiz questions. Here is one:

    If I have a table like the one below:

    chr1	113438	114495	1	chr1	114142	114143
    chr1	113438	114495	2	chr1	114171	114172
    chr1	170977	174817	1	chr1	171511	171512
    chr1	170977	174817	2	chr1	171514	171515
    chr1	170977	174817	2	chr1	173545	173546
    

    and I would like to collapse rows whose first 3 columns are identical, producing the following output:

    chr1	113438	114495	114142,114143,114171,114172    
    chr1	170977	174817	171511,171512,171514,171515,173545,173546
    

    Is there any easy awk approach to do it?

    Since I am so rusty at awk, I had to google around to find the solution:

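    awk -F '\t' '
    # Same key as the previous row: append columns 6 and 7 to the current line.
    $1FS$2FS$3==x{
        printf ",%s,%s", $6, $7
        next
    }
    # New key: end the previous line (if any), then start a new output line.
    {
        x=$1FS$2FS$3
        printf "%s%s\t%s,%s", (NR>1 ? "\n" : ""), x, $6, $7
    }
    # Terminate the last output line.
    END {
        if (NR) printf "\n"
    }' test.txt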
    

    This assumes the input file is test.txt. Note that both the input and the output are tab-separated.

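    Explanation:

    x=$1FS$2FS$3: the variable x stores the values of columns 1, 2, and 3 joined by the field separator FS.

    When a row's first three columns differ from x, the second block starts a new output line and prints columns 1, 2, 3, 6, and 7.

    When the next row's columns 1, 2, and 3 equal x, the first block appends columns 6 and 7 to the current line, and next skips to the following row.

    Because only the previous key is remembered, rows with identical first three columns must be adjacent in the input.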
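
    The steps above can be tried end to end with the sample rows from the question. This is only a minimal sketch: the function name collapse_adjacent, the variable names key and prev, and the temporary file are assumptions made for illustration, not part of the original solution.

```shell
#!/bin/sh
# Hypothetical helper: collapse adjacent rows that share columns 1-3.
collapse_adjacent() {
    awk -F '\t' '
    # Build the grouping key from the first three columns.
    { key = $1 FS $2 FS $3 }
    # Same key as the previous row: append columns 6 and 7.
    key == prev { printf ",%s,%s", $6, $7; next }
    # New key: close the previous line (if any) and start a new one.
    {
        if (NR > 1) printf "\n"
        printf "%s\t%s,%s", key, $6, $7
        prev = key
    }
    # Terminate the last output line when there was any input.
    END { if (NR > 0) printf "\n" }
    ' "$@"
}

# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
{
    printf 'chr1\t113438\t114495\t1\tchr1\t114142\t114143\n'
    printf 'chr1\t113438\t114495\t2\tchr1\t114171\t114172\n'
    printf 'chr1\t170977\t174817\t1\tchr1\t171511\t171512\n'
    printf 'chr1\t170977\t174817\t2\tchr1\t171514\t171515\n'
    printf 'chr1\t170977\t174817\t2\tchr1\t173545\t173546\n'
} > "$tmp"

result=$(collapse_adjacent "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

    If rows with the same key were scattered through the file, sorting the input on the first three columns before calling the helper would be one way to make them adjacent.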

    Group and then count:

    https://stackoverflow.com/questions/14916826/awk-unix-group-by

    I have this text file:

    name, age
    joe,42
    jim,20
    bob,15
    mike,24
    mike,15
    mike,54
    bob,21

    I am trying to get this (count):

    joe 1
    jim 1
    bob 2
    mike 3

    awk -F, 'NR>1{arr[$1]++}END{for (a in arr) print a, arr[a]}' file.txt
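
    One caveat: awk does not define the order in which for (a in arr) visits keys, so the one-liner may print the names in a different order than the listing above. A possible variant that keeps the order of first appearance is sketched below; the function name count_by_name and the arrays count and order are illustrative names, not taken from the Stack Overflow answer.

```shell
#!/bin/sh
# Hypothetical helper: count rows per name, keeping first-seen order.
count_by_name() {
    awk -F, '
    NR > 1 {                                    # skip the header line
        if (!($1 in count)) order[++n] = $1     # remember first appearance
        count[$1]++
    }
    END {
        # Walk the names in the order they were first seen.
        for (i = 1; i <= n; i++) print order[i], count[order[i]]
    }
    ' "$@"
}

# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'name, age' 'joe,42' 'jim,20' 'bob,15' \
    'mike,24' 'mike,15' 'mike,54' 'bob,21' > "$tmp"

counts=$(count_by_name "$tmp")
printf '%s\n' "$counts"
rm -f "$tmp"
```

    The extra order array costs one entry per distinct name, which is usually negligible next to the count array itself.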

    References:

    http://azaleasays.com/2014/10/06/awk-group-adjacent-rows-by-identical-columns/

    Group rows in text file and aggregate corresponding rows to column

    keeping last record among group of records with common fields (awk)

  • Original article: https://www.cnblogs.com/emanlee/p/7990097.html