• 8、Transcriptome Assembly


    Created by Benjamin M Goetz, last modified on Jun 29, 2015

    Assembly of RNA-seq short reads into a transcriptome. 

    1. Quality Assessment

    Data quality is assessed with FastQC.

    • Deliverables
      • Reports generated by FastQC.
    • Tools Used
      • FastQC (Andrews 2010) is used to generate quality summaries of the data:
        • Per base sequence quality report: useful for deciding whether trimming is necessary.
        • Sequence duplication levels: an evaluation of library complexity. Higher levels of sequence duplication may be expected for high-coverage RNA-seq data.
        • Overrepresented sequences: an evaluation of adapter contamination.
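
    The per base sequence quality report can be illustrated with a minimal sketch. This is not FastQC's own code. It assumes unwrapped four-line FASTQ records and Phred+33 encoding, and the function names and the trimming threshold are hypothetical choices made for illustration.

```python
from io import StringIO
from typing import List, Optional, TextIO


def mean_quality_per_base(handle: TextIO, phred_offset: int = 33) -> List[float]:
    """Return the mean Phred score at each read position of a FASTQ stream.

    Assumption: every record spans exactly four lines (no line wrapping).
    """
    totals: List[int] = []  # sum of scores seen at each position
    counts: List[int] = []  # number of reads covering each position
    for line_number, line in enumerate(handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of a record
            continue
        for position, char in enumerate(line.rstrip("\n")):
            if position == len(totals):
                totals.append(0)
                counts.append(0)
            totals[position] += ord(char) - phred_offset
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


def first_position_below(means: List[float], threshold: float = 20.0) -> Optional[int]:
    """Return the first position whose mean quality is below threshold, else None."""
    for position, mean in enumerate(means):
        if mean < threshold:
            return position
    return None


# Two toy reads: 'I' encodes Q40, '5' encodes Q20, '#' encodes Q2.
example = StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII5#\n")
example_means = mean_quality_per_base(example)
print(example_means)  # [40.0, 40.0, 30.0, 21.0]
print(first_position_below(example_means, threshold=25.0))  # 3
```

    A mean quality that drops toward the 3' end, as in the toy data, is the pattern that usually prompts trimming.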

    2. Assembly

    We use Trinity to generate a de novo assembly. Assembly is computationally intensive and may not finish within the time limits imposed on compute jobs at TACC, especially for large data sets. To increase the chance of obtaining an assembly, we run two assemblies: one with the original data, and one with the reads normalized in silico to 50x coverage before the main assembly starts. If the run on the non-normalized data does not complete, the run on the normalized data may.
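
    The idea behind in silico normalization can be sketched as follows. This is a simplified illustration of coverage-based downsampling in general, not Trinity's implementation: it estimates each read's coverage as the median abundance of its k-mers and keeps the read with probability target / coverage. The function names, the k-mer size, and the seed parameter are assumptions made for the example.

```python
import random
from collections import Counter
from statistics import median
from typing import List


def count_kmers(reads: List[str], k: int) -> Counter:
    """Count every k-mer occurrence across all reads."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def normalize_reads(reads: List[str], k: int = 25,
                    target_coverage: int = 50, seed: int = 0) -> List[str]:
    """Downsample reads so highly covered regions approach target_coverage.

    Reads whose estimated coverage is at or below the target are always kept.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    counts = count_kmers(reads, k)
    kept: List[str] = []
    for read in reads:
        abundances = [counts[read[i:i + k]] for i in range(len(read) - k + 1)]
        if not abundances:  # read shorter than k: no estimate, keep it
            kept.append(read)
            continue
        coverage = median(abundances)
        keep_probability = min(1.0, target_coverage / coverage)
        if rng.random() < keep_probability:
            kept.append(read)
    return kept


# Toy data: one transcript sampled 200 times, another only 10 times.
abundant, rare = "ACGTTGCAAG", "GGATCCTAGC"
toy_reads = [abundant] * 200 + [rare] * 10
normalized = normalize_reads(toy_reads, k=5, target_coverage=50)
print(normalized.count(abundant), normalized.count(rare))
```

    In the toy data the abundant read is kept about a quarter of the time while every copy of the rare read survives, which is why normalization shrinks the input without discarding weakly expressed transcripts.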

    • Deliverables
      • FASTA file of the assembly from the full data (if it finishes).
      • FASTA file of the assembly with in silico normalization to 50x coverage (if it finishes).
      • If neither assembly run finishes, there is no charge.

    • Tools Used
      • Trinity (Grabherr et al. 2011), a widely used de novo transcriptome assembler.

    3. Optional: Homology Against Standard Databases

    For an additional charge, we can search a completed assembly against UniProt with BLAST or against Pfam with HMMER. These homology searches give some indication of what the assembled transcripts represent.

    • Deliverables
      • Table of BLAST hits against UniProt, with the option of appending the best hits to the FASTA sequence headers.
      • Table of HMMER hits against Pfam, with the option of appending the best hits to the FASTA sequence headers.

    • Tools Used
      • BLASTx (Altschul et al. 1997) for nucleotide-to-protein homology search against the UniProt protein database.
      • hmmscan (Eddy 1998) for HMM-based homology search against the Pfam database of proteins and protein domains.
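
    Appending best hits to the FASTA headers could look like the sketch below. It assumes the BLAST results are in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the text above does not specify, and that the best hit is the one with the highest bit score. The function names and the `best_hit=` label are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def best_hits(blast_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to (subject ID, bit score) of its highest-scoring hit.

    Assumed columns: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id, subject_id = fields[0], fields[1]
        bit_score = float(fields[11])
        if query_id not in best or bit_score > best[query_id][1]:
            best[query_id] = (subject_id, bit_score)
    return best


def annotate_fasta(fasta_lines: Iterable[str],
                   hits: Dict[str, Tuple[str, float]]) -> List[str]:
    """Append the best hit to the header of every sequence that has one."""
    annotated: List[str] = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split()
            transcript_id = parts[0] if parts else ""
            if transcript_id in hits:
                line = f"{line} best_hit={hits[transcript_id][0]}"
        annotated.append(line)
    return annotated


# Made-up identifiers for illustration only.
toy_blast = [
    "t1\tprotA\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t180.0",
    "t1\tprotB\t95.0\t100\t5\t0\t1\t300\t1\t100\t1e-60\t210.0",
    "t2\tprotC\t80.0\t50\t10\t0\t1\t150\t1\t50\t1e-10\t60.5",
]
toy_fasta = [">t1 len=300", "ACGT", ">t3 len=100", "GGCC"]
for out_line in annotate_fasta(toy_fasta, best_hits(toy_blast)):
    print(out_line)
```

    Transcripts without a hit, such as `t3` in the toy data, keep their original header.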
     
  • Original article: https://www.cnblogs.com/renping/p/7045353.html