• Learning Python: pyfastx, a handy tool for processing FASTA/FASTQ files


    001. Iterating over FASTA sequences

    (base) root@PC1:/home/test2# cat a.fasta         ## test FASTA file
    >gene1 myc
    AGCTGCCTAAGC
    GGCATAGCTAATCG
    >gene2 jun
    ACCGAATCGGAGCGATG
    GGCATTAAAGATCTAGCT
    >gene3 malat1
    AGGCTAGCGAG
    GCGCGAG
    GATTAGGCG
    >>> import pyfastx                             ## import the package
    >>> fa = pyfastx.Fastx('a.fasta')              ## read the FASTA file
    >>> type(fa)
    <class 'Fastx'>
    >>> for i,j,k in fa:                           ## iterate: i = name, j = sequence, k = comment
    ...     print(i)
    ...     print(j)
    ...     print(k)
    ...
    gene1
    AGCTGCCTAAGCGGCATAGCTAATCG
    myc
    gene2
    ACCGAATCGGAGCGATGGGCATTAAAGATCTAGCT
    jun
    gene3
    AGGCTAGCGAGGCGCGAGGATTAGGCG
    malat1
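
To make clear what each tuple in the loop above holds, here is a minimal pure-Python sketch that splits a FASTA stream the same way: the name is the header text up to the first space, the comment is the rest of the header, and wrapped sequence lines are joined. `iter_fasta` and `SAMPLE` are hypothetical names used only for illustration; this is not how pyfastx is implemented. Also note that, as far as I can tell from the pyfastx documentation, releases from 0.9.0 onward yield only (name, sequence) unless `comment=True` is passed to `Fastx`, so the three-way unpacking above appears to reflect an older release.

```python
import io


def iter_fasta(handle):
    """Yield (name, sequence, comment) for each record in a FASTA text stream."""
    name, comment, chunks = None, "", []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks), comment
            # name = header text up to the first space; comment = the remainder
            name, _, comment = line[1:].partition(" ")
            chunks = []
        elif name is not None:
            chunks.append(line)  # wrapped sequence lines are joined together
    if name is not None:
        yield name, "".join(chunks), comment


# Hypothetical sample: the first two records of the test file shown earlier
SAMPLE = """>gene1 myc
AGCTGCCTAAGC
GGCATAGCTAATCG
>gene2 jun
ACCGAATCGGAGCGATG
GGCATTAAAGATCTAGCT
"""

records = list(iter_fasta(io.StringIO(SAMPLE)))
for name, seq, comment in records:
    print(name, seq, comment)
```

For real files, pyfastx remains the better choice; the sketch only shows where the three fields come from.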

    002. If the sequence contains lowercase letters, force uppercase output

    (base) root@PC1:/home/test2# cat a.fasta          ## test FASTA file (contains lowercase bases)
    >JZ822577.1 contig1 cDNA library of flower petals in tree peony by suppression subtractive hybridization Paeonia suffruticosa cDNA, mRNA sequence
    CTctagcttaaaTTACTTCTTCACATTCCAGATCACTCAGGCTCTTTGTCATTTTAGTTTGACTAGGATATCGAGTATTCAAGCTCATCGCTTTTGGTAATCTTTGCGGTGCATGCCTTTGCATGCTGTATTGCTGCTTCATCATCCCCTTTGACTTGTGTGGCGGTGGCAAGACATCCGAAGAGTTAAGCGATGCTTGTCTAGTCAATTTCCCCATGTACAGAATCATTGTTGTCAATTGGTTGTTTCCTTGATGGTGAAGGGGCTTCAATACATGAGTTCCAAACTAACATTTCTTGACTAACACTTGAGGAAGAAGGACAAGGGTCCCCATGT
    >>> for item in pyfastx.Fastx('a.fasta', uppercase=True):      ## read the data; every sequence is output in uppercase
    ...     print(item)
    ...
    ('JZ822577.1', 'CTCTAGCTTAAATTACTTCTTCACATTCCAGATCACTCAGGCTCTTTGTCATTTTAGTTTGACTAGGATATCGAGTATTCAAGCTCATCGCTTTTGGTAATCTTTGCGGTGCATGCCTTTGCATGCTGTATTGCTGCTTCATCATCCCCTTTGACTTGTGTGGCGGTGGCAAGACATCCGAAGAGTTAAGCGATGCTTGTCTAGTCAATTTCCCCATGTACAGAATCATTGTTGTCAATTGGTTGTTTCCTTGATGGTGAAGGGGCTTCAATACATGAGTTCCAAACTAACATTTCTTGACTAACACTTGAGGAAGAAGGACAAGGGTCCCCATGT', 'contig1 cDNA library of flower petals in tree peony by suppression subtractive hybridization Paeonia suffruticosa cDNA, mRNA sequence')
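
The effect of `uppercase=True` can be pictured as applying `str.upper()` to each sequence as it is yielded, while the name and comment are left alone. The sketch below assumes records arrive as (name, sequence, comment) tuples like the one printed above; `uppercase_records` and the shortened sample record are made up for illustration and are not part of pyfastx.

```python
def uppercase_records(records):
    """Yield (name, sequence, comment) tuples with the sequence upper-cased."""
    for name, seq, comment in records:
        # Only the sequence changes; name and comment pass through untouched
        yield name, seq.upper(), comment


# Hypothetical record: the first 19 bases of the test sequence, mixed case
mixed = [("JZ822577.1", "CTctagcttaaaTTACTTC", "contig1 cDNA library")]

upper = list(uppercase_records(mixed))
print(upper)
```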

    003. Iterating over FASTQ sequences

    (base) root@PC1:/home/test2# cat b.fastq                           ## test FASTQ file
    @WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
    CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG
    +WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
    BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##
    >>> fq = pyfastx.Fastx('b.fastq')      ## read the data
    >>> for i,j,k,l in fq:                 ## iterate: i = name, j = sequence, k = quality, l = comment
    ...     print(i)
    ...     print(j)
    ...     print(k)
    ...     print(l)
    ...
    WT_rep1_BAF155.1
    CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG
    BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##
    SALLY:291:C149WACXX:2:1101:2579:1951 length=51
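
The four fields printed above map onto the four lines of a FASTQ record. Below is a minimal pure-Python sketch of that mapping, assuming unwrapped records of exactly four lines each (header, sequence, `+` separator, quality). `iter_fastq` and `SAMPLE_FASTQ` are hypothetical names for illustration only, not pyfastx functions. As with FASTA, newer pyfastx releases appear to yield only (name, sequence, quality) unless `comment=True` is given.

```python
import io
from itertools import islice


def iter_fastq(handle):
    """Yield (name, sequence, quality, comment) from a 4-line-per-record FASTQ stream."""
    lines = iter(handle)
    while True:
        block = [line.rstrip("\n") for line in islice(lines, 4)]
        if len(block) < 4:
            break  # end of input (an incomplete trailing record is ignored)
        header, seq, plus, qual = block
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        # name = header text up to the first space; comment = the remainder
        name, _, comment = header[1:].partition(" ")
        yield name, seq, qual, comment


# The test record from b.fastq
SAMPLE_FASTQ = """@WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
CTGNCCAAGGTAATTTATAGATTCAATGCCATCCCCATCAAGCTACCAANG
+WT_rep1_BAF155.1 SALLY:291:C149WACXX:2:1101:2579:1951 length=51
BCC#4ADDHHBFHIJJIIJJJIIIIJHIJIJIIJGGIJJJJIGJJJJJJ##
"""

reads = list(iter_fastq(io.StringIO(SAMPLE_FASTQ)))
for name, seq, qual, comment in reads:
    print(name, seq, qual, comment, sep="\n")
```

The sequence and quality strings always have the same length, one quality character per base, which is a quick sanity check on any parsed record.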
  • Original article: https://www.cnblogs.com/liujiaxin2018/p/16579502.html