• Linux shell: splitting a FASTA file into one file per scaffold


    001.

    (base) root@PC1:/home/test2# ls
    a.fasta
    (base) root@PC1:/home/test2# sed '$a tag_tag' a.fasta -i       ## append a sentinel line, tag_tag, to the end of the FASTA file
    (base) root@PC1:/home/test2# cat a.fasta                       ## the FASTA file
    >scaffold_1
    CCCGGGTAAAACGGGTCTTCAAGAAAACGCTCCTCCGTTAATGCCGGCCGATTCAAATAA
    CCTCTGGCAACACCCGCTCCGGCAATGTATAGTTCACCGATACATCCAACAGGCAGCATC
    CGCTGATTCTGATTCAGGATATACAATCTGACATGATGAACAGGTTTTCCAATTGGAATC
    CGTTCAAGTTTTTCTTGCGGCGGACAATCAAAGAATGCAGCTTCTACGGTTGCTTCCGTT
    GGCCCATAGGAATTGGTTATTGAAACATTTGGAAGCAACACGTGAAATCGGGAGACAAGA
    >scaffold_2
    CACGCCGCCAGCGTTCGTCCTGAGCCAGGATCAAACTCTCCGATAAATGGATCACAGGTT
    AAGTTCACCGCATCCTGCGGCGACACCTGTGTGGCCTGCGTCGTGCAGGCCCTAGTTTGA
    CTGACTACGCACATCGCTGTGCGATTTATAAAAATGAATTAACAGGTACGTTTTGTCTTG
    TTTAGTTTTCAAAGAACTTTGCGTGCTTCTCTCGAAGCGACTACTTAATAGTAACATTTT
    TAGTTAACTAGGTCAATACTTTTTTGAAAAAGTTTTTACTAGTCATAATGGTCATGTTTG
    >scaffold_3
    TTGATCCAGTGGCTCCGGTTACTCCAGTTGATCCTGTTGCGCCTGTTGCTCCAGTTTCTC
    CGGTTGGTCCGGTTGATCCGGTTGCACCTGTTACTCCAGTGGCTCCGGTTACTCCCGTCG
    CACCAGTTTCTCCTGTCGCACCAGTTGATCCTGTTGCGCCTGTTGGTCCTGTATCTCCAG
    TTGCACCAGTTACTCCCGTTACTCCTGTTGGACCGGTTGCGCCTGTTACTCCGGTTGCGC
    CTGTTGCTCCTGTTGCTCCTGTTGATCCCGTTGCACCTGTTGGTCCAGTCGGTCCAATTC
    >scaffold_4
    CCTGAGCCAGGATCAAACTCTCCGATAAATGGATCACAGGTTAAGTTCACCGCATCCTGC
    GGCGACACCTGTGTGACCTGCGTCGTGCAGGCCCTAGTTTGACTGACTACGCACATCGCT
    GTGCGATTTTTAAAAACTGAATTAACAGGTACGTTTTGTCTTGTTTAGTTTTCAAAGATC
    ATTTTCGCTTCTTGTTGAAGCGACTTTATTAATATAACATTTTGACTTTCTTTTGTCAAA
    TGTTTTTTTGATTTATTTTCCCGCCGCTGTGAGCTTGTTTTCTCAGAAGCGCATCAGCGA
    >scaffold_5
    TCACCCCGGAATCAGCTGACATAGAAGCACTGAAATCAGCACTGAAGGAAACCCTGCCGG
    tag_tag
    (base) root@PC1:/home/test2# grep ">" a.fasta | paste - <(grep ">" a.fasta | sed -n '2, $p' | sed '$a tag_tag') | awk '{split($1,a, ">"); print a[2], $0}' | while read -r i j k; do sed -n "/$j/,/$k/{/$k/b; p}" a.fasta > "$i"; done       ## split: print each header-to-next-header range (minus the closing line) into a file named after the scaffold
    (base) root@PC1:/home/test2# ls
    a.fasta  scaffold_1  scaffold_2  scaffold_3  scaffold_4  scaffold_5
    (base) root@PC1:/home/test2# cat scaffold_1                         ## inspect the result
    >scaffold_1
    CCCGGGTAAAACGGGTCTTCAAGAAAACGCTCCTCCGTTAATGCCGGCCGATTCAAATAA
    CCTCTGGCAACACCCGCTCCGGCAATGTATAGTTCACCGATACATCCAACAGGCAGCATC
    CGCTGATTCTGATTCAGGATATACAATCTGACATGATGAACAGGTTTTCCAATTGGAATC
    CGTTCAAGTTTTTCTTGCGGCGGACAATCAAAGAATGCAGCTTCTACGGTTGCTTCCGTT
    GGCCCATAGGAATTGGTTATTGAAACATTTGGAAGCAACACGTGAAATCGGGAGACAAGA
    (base) root@PC1:/home/test2# cat scaffold_5
    >scaffold_5
    TCACCCCGGAATCAGCTGACATAGAAGCACTGAAATCAGCACTGAAGGAAACCCTGCCGG
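
The pipeline above selects each record with a sed range whose endpoints are regular expressions built from the header names. That works for the five scaffolds shown, but a pattern such as `>scaffold_1` would also match `>scaffold_10`, and the input file has to be modified first to add the tag_tag sentinel. As a possible alternative (not the original author's method), the sketch below does the split in a single awk pass without touching the input. The names `demo.fasta` and `split_out` are placeholders chosen for illustration, and the sketch assumes every sequence ID is unique and contains no `/`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical demo input; substitute your own FASTA file.
cat > demo.fasta <<'EOF'
>scaffold_1
CCCGGGTAAAACGGG
CCTCTGGCAACACCC
>scaffold_2
CACGCCGCCAGCGTT
>scaffold_10 some description
TTGATCCAGTGGCTC
EOF

# Write the pieces into their own directory so they cannot overwrite the input.
mkdir -p split_out

# One pass over the file:
# - a header line closes the previous output file and derives the new file
#   name from the first word after ">" (any description text is ignored);
# - every line, header included, is then written to the current output file.
awk -v dir="split_out" '
    /^>/ {
        if (out != "") close(out)
        out = dir "/" substr($1, 2)
    }
    out != "" { print > out }
' demo.fasta

ls split_out
```

Closing each file before opening the next keeps the number of open file handles at one, which matters for assemblies with many thousands of scaffolds. Because a file reopened with `>` is truncated, a duplicated ID would keep only its last record, hence the uniqueness assumption.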
  • Original article: https://www.cnblogs.com/liujiaxin2018/p/16570783.html