• Alignment software


    Figure: An illustration of relationships between alignment methods. The applications / corresponding computational restrictions shown are (green) short pairwise alignment / detailed edit model; (yellow) database search / divergent homology detection; (red) whole genome alignment / alignment of long sequences with structural rearrangements; and (blue) short read mapping / rapid alignment of massive numbers of short sequences. Although solely illustrative, methods with more similar data structures or algorithmic approaches are on closer branches. The BLASR method combines data structures from short read alignment with optimization methods from whole genome alignment.

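    I have not used many aligners. I only know the simple global and local alignment algorithms, and I know almost nothing about how the aligners themselves work internally.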
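
    To make the difference between global and local alignment concrete, here is a minimal sketch of the two classic dynamic-programming recurrences (Needleman-Wunsch for global, Smith-Waterman for local). The function names and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration; real aligners use affine gap penalties and heavy optimization.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch: best score for aligning ALL of a against ALL of b."""
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]  # column 0: a[:i] against the empty prefix of b
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub,   # match or mismatch
                           prev[j] + gap,       # gap in b
                           cur[j - 1] + gap))   # gap in a
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman: best score over all pairs of substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere, which is
            # the only difference from the global recurrence.
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best
```

    Global alignment penalizes every unaligned base at the ends, while local alignment only scores the best-matching region, which is why read mappers build on the local (or semi-global) form.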

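    Aligners I have used so far: bwa, bowtie, blasr, SHRiMP, DALIGNER, MHAP, blast, blat, SOAP, Subread, NovoAlign, Maq

    Others: MEGABLAST, Mummer, GMAP, STAR, DIAMOND, ELAND, RMAP, ZOOM, SeqMap, CloudBurst

    The plan is to build up knowledge gradually and compare how these tools differ, because alignment is the lowest layer of bioinformatics: once sequencing hands you a pile of sequences, the first thing to do is align them.

    A good article to start with: Aligner tutorial: GMAP, STAR, BLAT, and BLASR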

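    What are the common kinds of nucleotide sequence alignment?

    1. Second-generation short reads to a genome
    2. Third-generation long reads to a genome
    3. Spliced alignment
    4. Second-generation reads against third-generation reads
    5. Genome against genome
    6. Multiple sequence alignment
    7. Database search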

    BWA


    Burrows-Wheeler Aligner

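    Scope: fast alignment of second-generation sequencing data to a genome

    bwa is the model program of the sequence alignment world: small, capable, and suitable for many situations. It is well worth understanding its internal alignment algorithm, and ideally how it is implemented as well.

    Fast and accurate short read alignment with Burrows–Wheeler transform - 2009 (online PDF, original article)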

    lh3/bwa – Github    Burrow-Wheeler Aligner for pairwise alignment between DNA sequences

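    1. BWA-backtrack: for Illumina reads, designed for reads up to about 100 bp (aln/samse/sampe)
    2. BWA-SW: for long reads, from 70 bp to 1 Mbp; supports split alignment (bwasw)
    3. BWA-MEM: the newest and most commonly used; covers the same range as BWA-SW but is faster and more accurate, and also performs better than BWA-backtrack on 70-100 bp reads (mem)

    There are three main academic papers on BWA: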

    1. Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]. (if you use the BWA-backtrack algorithm)
    2. Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26, 589-595. [PMID: 20080505]. (if you use the BWA-SW algorithm)
    3. Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997v2 [q-bio.GN]. (if you use the BWA-MEM algorithm or the fastmap command, or want to cite the whole BWA package)

    The design ideas behind BWA

    Short-read alignment and assembly algorithms in next-generation sequencing technology - master's thesis
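
    The core idea behind BWA, exact matching on a Burrows-Wheeler transformed reference, can be sketched in a few lines. This is a teaching sketch and not BWA's implementation: the function names are made up for illustration, the suffix array is built by naive sorting, and the occurrence table is stored in full, whereas BWA uses compressed, sampled structures and extends the search to handle mismatches and gaps.

```python
def suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text, sa):
    """BWT = the character preceding each sorted suffix.
    text must end with the sentinel '$', so text[-1] is used for suffix 0."""
    return "".join(text[i - 1] for i in sa)


def fm_index(reference):
    """Build (sa, bwt string, C table, Occ table) for reference + '$'."""
    text = reference + "$"
    sa = suffix_array(text)
    last = bwt(text, sa)
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    # C[c]: number of characters in text that sort strictly before c.
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    # occ[c][i]: number of occurrences of c in last[:i].
    occ = {ch: [0] * (len(last) + 1) for ch in counts}
    for i, ch in enumerate(last):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return sa, last, C, occ


def backward_search(pattern, sa, C, occ):
    """Return sorted start positions of exact matches of pattern."""
    lo, hi = 0, len(sa)  # half-open interval of suffix array rows
    for ch in reversed(pattern):  # consume the pattern right to left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])
```

    For example, after sa, last, C, occ = fm_index("ACGTACGT"), calling backward_search("ACG", sa, C, occ) narrows the suffix array interval one character at a time and reports both occurrences. The time per query depends on the pattern length, not on the reference length, which is what makes this approach attractive for mapping very many short reads.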


    Program: bwa (alignment via Burrows-Wheeler transformation)
    Version: 0.7.15-r1140
    Contact: Heng Li <lh3@sanger.ac.uk>
    
    Usage:   bwa <command> [options]
    
    Command: index         index sequences in the FASTA format
             mem           BWA-MEM algorithm
             fastmap       identify super-maximal exact matches
             pemerge       merge overlapping paired ends (EXPERIMENTAL)
             aln           gapped/ungapped alignment
             samse         generate alignment (single ended)
             sampe         generate alignment (paired ended)
             bwasw         BWA-SW for long queries
    
             shm           manage indices in shared memory
             fa2pac        convert FASTA to PAC format
             pac2bwt       generate BWT from PAC
             pac2bwtgen    alternative algorithm for generating BWT
             bwtupdate     update .bwt to the new format
             bwt2sa        generate SA from BWT and Occ
    
    Note: To use BWA, you need to first index the genome with `bwa index'.
          There are three alignment algorithms in BWA: `mem', `bwasw', and
          `aln/samse/sampe'. If you are not sure which to use, try `bwa mem'
          first. Please `man ./bwa.1' for the manual.
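
    The note in the usage text (index first, then try `bwa mem`) translates into a two-step workflow. The sketch below only builds the command lines; the helper names, file names and default thread count are hypothetical, and it assumes `bwa` is on the PATH when the commands are actually run. `bwa mem` writes SAM to standard output, so the output file is captured by redirecting stdout.

```python
import shlex
import subprocess


def bwa_commands(reference, reads1, reads2=None, threads=4):
    """Build the argument lists for `bwa index` and `bwa mem`.
    reads2 is optional: pass it for paired-end data."""
    index_cmd = ["bwa", "index", reference]
    mem_cmd = ["bwa", "mem", "-t", str(threads), reference, reads1]
    if reads2 is not None:
        mem_cmd.append(reads2)
    return index_cmd, mem_cmd


def as_shell(index_cmd, mem_cmd, out_sam):
    """Render the two steps as shell lines, for logging or a script."""
    return [shlex.join(index_cmd),
            shlex.join(mem_cmd) + " > " + shlex.quote(out_sam)]


def run_alignment(reference, reads1, reads2=None, threads=4, out_sam="aln.sam"):
    """Run both steps. Requires bwa to be installed; not exercised here."""
    index_cmd, mem_cmd = bwa_commands(reference, reads1, reads2, threads)
    subprocess.run(index_cmd, check=True)
    with open(out_sam, "w") as sam:
        subprocess.run(mem_cmd, stdout=sam, check=True)
```

    Indexing only needs to happen once per reference, so in practice the first command is run once and the second is repeated for every sample.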

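    Practical algorithm implementations, part 8: suffix trees and suffix arrays [1 Introduction]

    bwa mem

    These days almost everyone uses only the mem algorithm of bwa.

    It deserves a separate note of its own.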

     

    SOAPaligner/soap2

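    soap2 - official site

    The SOAP family of tools does not publish source code; they are distributed as binary executables, so there is nothing to install. As with bwa, you build an index first and then align.

    SOAP is not very memory-hungry: loading the roughly 3 Gb human genome into memory takes only about 7 GB of RAM, and the alignment step afterwards needs little additional memory.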

    ./2bwt-builder ~/human_genome.fa
    ./soap -a <reads_a> -D <index.files> -o <output>
    ./soap -a <reads_a> -b <reads_b> -D <index.files> -o <PE_output> -2 <SE_output> -m <min_insert_size> -x <max_insert_size>

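    I had no real impression of SOAP before, although quite a few colleagues use the SOAP family of tools.

    What changed my mind was a slide deck (Mapping.ppt) showing that SOAP has its own strengths as an aligner.

    According to its comparison figures, SOAP tolerates a relatively high error rate and also handles indels well. That is exactly what I need right now, so I may try using SOAP to align second-generation reads to third-generation reads.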

     

     

    BLASR


    Basic Local Alignment with Successive Refinement

    Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory - BMC Bioinformatics

     

    To be continued~

  • Original article: https://www.cnblogs.com/leezx/p/6100667.html