      axt Alignment Format       

     
     

    axt alignment files are produced by Blastz, an alignment tool available from Webb Miller's lab at Penn State University. The axtNet and axtChain alignments are produced by processing these alignment files with additional utilities written by Jim Kent at UCSC.

    Example: The following segment from an axt file shows the first two alignment blocks of the human assembly (the aligning assembly) to mouse chromosome 19 (the primary assembly).

      0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500
      TCAGCTCATAAATCACCTCCTGCCACAAGCCTGGCCTGGTCCCAGGAGAGTGTCCAGGCTCAGA
      TCTGTTCATAAACCACCTGCCATGACAAGCCTGGCCTGTTCCCAAGACAATGTCCAGGCTCAGA
    
      1 chr19 3008279 3008357 chr11 70573976 70574054 - 3900
      CACAATCTTCACATTGAGATCCTGAGTTGCTGATCAGAATGGAAGGCTGAGCTAAGATGAGCGACGAGGCAATGTCACA
      CACAGTCTTCACATTGAGGTACCAAGTTGTGGATCAGAATGGAAAGCTAGGCTATGATGAGGGACAGTGCGCTGTCACA
    		

    Structure

    Each alignment block in an axt file contains three lines: a summary line and two sequence lines. Blocks are separated from one another by blank lines.
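    The block layout described above (three lines, then a blank separator) can be read with a small generator. This is only a sketch, not part of any UCSC tool: the names `iter_axt_blocks` and `_to_block` are made up for illustration, and it assumes every non-blank line belongs to a block.

```python
from typing import Iterable, Iterator, List, Tuple

# (summary line, primary sequence line, aligning sequence line)
AxtBlock = Tuple[str, str, str]


def _to_block(lines: List[str]) -> AxtBlock:
    """Check that a group of non-blank lines has the three-line shape."""
    if len(lines) != 3:
        raise ValueError(f"expected 3 lines per block, got {len(lines)}")
    return lines[0], lines[1], lines[2]


def iter_axt_blocks(lines: Iterable[str]) -> Iterator[AxtBlock]:
    """Yield one block for each run of non-blank lines.

    Blank (or whitespace-only) lines are treated as block separators.
    """
    pending: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line:
            pending.append(line)
        elif pending:
            yield _to_block(pending)
            pending = []
    if pending:  # the last block may not be followed by a blank line
        yield _to_block(pending)


if __name__ == "__main__":
    sample = (
        "0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500\n"
        "TCAGCTCATAAATCACC\n"
        "TCTGTTCATAAACCACC\n"
        "\n"
        "1 chr19 3008279 3008357 chr11 70573976 70574054 - 3900\n"
        "CACAATCTTCACATTGA\n"
        "CACAGTCTTCACATTGA\n"
    )
    for summary, primary, aligning in iter_axt_blocks(sample.splitlines()):
        print(summary.split()[0], len(primary), len(aligning))
```

    The sequences in the demo are shortened for readability. Because the function accepts any iterable of lines, an open file handle can be passed in directly.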

    1. Summary line    

     0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500

    The summary line contains chromosomal position and size information about the alignment. It consists of 9 required fields:

    • Alignment number -- The alignment numbering starts with 0 and increments by 1, i.e. the first alignment in a file is numbered 0, the next 1, etc.
    • Chromosome (primary organism)
    • Alignment start (primary organism) -- The first base is numbered 1.
    • Alignment end (primary organism) -- The end base is included.
    • Chromosome (aligning organism)
    • Alignment start (aligning organism)
    • Alignment end (aligning organism)
    • Strand (aligning organism) -- If the strand value is "-", the values of the aligning organism's start and end fields are relative to the reverse-complemented coordinates of its chromosome.
    • Blastz score -- Different blastz scoring matrices are used for different organisms. See the README.txt file in the alignments directory for scoring information specific to a pair of alignments.
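    As an illustration of the nine fields listed above, a summary line can be split into a small record. This is a sketch under stated assumptions: `AxtSummary`, `parse_summary` and `forward_coords` are hypothetical names, the aligning coordinates are assumed to be 1-based and end-inclusive like the primary ones, and the chromosome size passed to `forward_coords` has to come from elsewhere because the axt file does not store it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AxtSummary:
    number: int           # alignment number, starting at 0
    primary_chrom: str
    primary_start: int    # first base is numbered 1
    primary_end: int      # end base is included
    aligning_chrom: str
    aligning_start: int   # assumed 1-based, like the primary start
    aligning_end: int     # assumed inclusive, like the primary end
    strand: str           # "+" or "-", applies to the aligning organism
    score: int            # Blastz score

    @property
    def primary_length(self) -> int:
        """Number of primary bases covered (end is inclusive)."""
        return self.primary_end - self.primary_start + 1


def parse_summary(line: str) -> AxtSummary:
    """Split a whitespace-separated summary line into its nine fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    num, p_chrom, p_start, p_end, a_chrom, a_start, a_end, strand, score = fields
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand value: {strand!r}")
    return AxtSummary(int(num), p_chrom, int(p_start), int(p_end),
                      a_chrom, int(a_start), int(a_end), strand, int(score))


def forward_coords(summary: AxtSummary, aligning_chrom_size: int) -> Tuple[int, int]:
    """Map the aligning start/end onto forward-strand coordinates.

    For "-" strand alignments the stored values count from the end of
    the reverse-complemented chromosome, so they are mirrored here.
    """
    if summary.strand == "+":
        return summary.aligning_start, summary.aligning_end
    return (aligning_chrom_size - summary.aligning_end + 1,
            aligning_chrom_size - summary.aligning_start + 1)


if __name__ == "__main__":
    s = parse_summary("0 chr19 3001012 3001075 chr11 70568380 70568443 - 3500")
    print(s.primary_length)
    # 100000000 is a made-up chromosome size used only for this demo.
    print(forward_coords(s, 100_000_000))
```

    Keeping the strand conversion in a separate function leaves the parsed record identical to what is written in the file, which makes it easier to compare against the original line.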

    2. & 3. Sequence lines    

      TCAGCTCATAAATCACCTCCTGCCACAAGCCTGGCCTGGTCCCAGGAGAGTGTCCAGGCTCAGA
      TCTGTTCATAAACCACCTGCCATGACAAGCCTGGCCTGTTCCCAAGACAATGTCCAGGCTCAGA

    The sequence lines contain the sequence of the primary assembly (line 2) and aligning assembly (line 3) with inserts. Repeats are indicated by lower-case letters.
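    To make the two conventions above concrete, the following sketch counts matching columns, inserts and repeat-masked bases for one pair of sequence lines. `alignment_stats` is a hypothetical helper, and it assumes that inserts appear as "-" characters in the sequence lines, which the text above does not state explicitly.

```python
from typing import Dict

GAP = "-"  # assumption: inserts are written as dashes in the sequence lines


def alignment_stats(primary: str, aligning: str) -> Dict[str, int]:
    """Summarize one pair of axt sequence lines column by column.

    Bases are compared case-insensitively, because lower case only
    marks repeats and does not change the base itself.
    """
    if len(primary) != len(aligning):
        raise ValueError("sequence lines must have the same length")

    matches = mismatches = gaps = 0
    for p, a in zip(primary, aligning):
        if p == GAP or a == GAP:
            gaps += 1
        elif p.upper() == a.upper():
            matches += 1
        else:
            mismatches += 1

    return {
        "columns": len(primary),
        "matches": matches,
        "mismatches": mismatches,
        "gaps": gaps,
        # lower-case letters in each line mark repeat bases
        "primary_repeats": sum(1 for c in primary if c.islower()),
        "aligning_repeats": sum(1 for c in aligning if c.islower()),
    }


if __name__ == "__main__":
    # Short made-up sequences, not taken from a real axt file.
    print(alignment_stats("ACgt-ACC", "ACGTTAGC"))
```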

     

  • Original source: https://www.cnblogs.com/pennyy/p/4241826.html