Assembling the Resurrection Grass Genome with PacBio Long Reads Only

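For eukaryotic genomes such as those of plants, features like repetitive sequences, polyploidy, and high heterozygosity cause serious problems when assembling from second-generation (short-read) data.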

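Genomes assembled from short-read data mostly fall short of finished quality. They typically cover only the protein-coding gene regions and leave many other regions uncovered, and those are the non-coding regions that carry regulatory functions. Research on non-coding function has grown in recent years, and if the assembled genome lacks these sequences, that follow-up work cannot be done.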

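Moreover, because of limited read length and the behavior of assembly algorithms, repetitive sequences and regions of abnormal GC content are prone to misassembly, or cannot be assembled at all.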

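Third-generation sequencing, with its long reads and lack of GC bias, lowers the difficulty of genome assembly. It can assemble repetitive and GC-abnormal sequences that are hard to recover from short-read data, which makes it well suited to genome assembly.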

The researchers sequenced the resurrection grass (Oropetium thomaeum) on the PacBio RS II platform, using 32 SMRT cells for a sequencing depth of 72X.
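As a rough sanity check on these numbers, depth multiplied by genome size gives the total sequencing yield. The sketch below assumes the 245 Mb estimated genome size reported for this assembly and treats the per-cell figure as a plain average over the 32 cells. The function names are illustrative and are not part of any PacBio tool.

```python
def total_bases(depth, genome_size_bp):
    """Total sequenced bases implied by a given depth of coverage."""
    return depth * genome_size_bp


def bases_per_cell(depth, genome_size_bp, n_cells):
    """Average yield per SMRT cell, assuming every cell contributes equally."""
    return total_bases(depth, genome_size_bp) / n_cells


if __name__ == "__main__":
    genome_size = 245_000_000  # estimated genome size (245 Mb)
    depth = 72                 # reported sequencing depth
    cells = 32                 # reported number of SMRT cells

    print(f"total yield: {total_bases(depth, genome_size) / 1e9:.2f} Gb")
    print(f"per cell:    {bases_per_cell(depth, genome_size, cells) / 1e6:.0f} Mb")
```

Under these assumptions the run corresponds to roughly 17.6 Gb of sequence in total, or about 550 Mb per SMRT cell on average.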

The final assembly contains 650 contigs and covers 99% of the genome (estimated genome size 245 Mb, total contig length 244 Mb), with a contig N50 of 2.4 Mb.
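The two summary statistics quoted here can be computed directly from contig lengths. The following is a minimal sketch of the usual N50 definition plus the assembled fraction. The contig lengths in the demo are made-up toy values, not the real Oropetium contigs, and the function names are illustrative.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of
    length >= L together cover at least half of the total length."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


def assembled_fraction(total_contig_bp, genome_size_bp):
    """Fraction of the estimated genome size covered by the contigs."""
    return total_contig_bp / genome_size_bp


if __name__ == "__main__":
    toy_contigs = [100, 80, 60, 40, 20]  # made-up lengths for illustration
    print("toy N50:", n50(toy_contigs))

    # Figures from the text: 244 Mb of contigs, 245 Mb estimated genome size.
    print(f"assembled: {assembled_fraction(244_000_000, 245_000_000):.1%}")
```

With 244 Mb of contigs against a 245 Mb estimate, the assembled fraction comes out slightly above 99%, consistent with the figure quoted above.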

A complete chloroplast genome was also assembled, 125,324 bp in size, of which about 25 kb is repetitive sequence.

The analysis used the HGAP assembly workflow with the following parameters:

The Oropetium genome was assembled using the RS_HGAP_Assembly.3 protocol for assembly and Quiver for genome polishing in SMRT Analysis v2.3.0. This consisted of a three-step process involving
(1) generation of preassembled reads with improved consensus accuracy;
(2) assembly of the genome through overlap consensus accuracy using Celera; and
(3) one round of genome polishing with Quiver.

For HGAP, the following parameters were used:

PreAssembler Filter v1 (
minimum sub-read length = 3,000 bp,
minimum polymerase read quality = 0.80,
minimum polymerase read length = 3,000 bp
);

PreAssembler v2 (
minimum seed length = 16,000 bp,
number of seed read chunks = 6,
alignment candidates per chunk = 10,
total alignment candidates = 24,
min coverage for correction = 6
);

AssembleUnitig v1 (
target genome coverage = 30,
overlap error rate = 0.06,
minimum overlap = 40 bp,
overlap k-mer = 14
);

BLASR v1 mapping of reads for genome polishing with Quiver (
max divergence percentage = 30,
minimum anchor size = 12
).
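To make the length and quality cutoffs concrete, here is a toy sketch of how such thresholds partition a read set. It is not the SMRT Analysis implementation. The `Subread` record, its field names, and the helper functions are hypothetical, and only the threshold values (3,000 bp, 0.80, 16,000 bp) come from the parameter list above.

```python
from dataclasses import dataclass

# Threshold values taken from the parameter list above.
MIN_SUBREAD_LEN = 3_000
MIN_POLY_READ_LEN = 3_000
MIN_POLY_READ_QUALITY = 0.80
MIN_SEED_LEN = 16_000


@dataclass
class Subread:
    # Hypothetical record: one subread plus summary stats of its parent
    # polymerase read. Real PacBio files store this differently.
    name: str
    length: int
    poly_read_length: int
    poly_read_quality: float


def passes_filter(r):
    """Apply the three filter cutoffs to a single subread."""
    return (r.length >= MIN_SUBREAD_LEN
            and r.poly_read_length >= MIN_POLY_READ_LEN
            and r.poly_read_quality >= MIN_POLY_READ_QUALITY)


def select_seeds(reads):
    """Filtered subreads long enough to serve as seed reads."""
    return [r for r in reads if passes_filter(r) and r.length >= MIN_SEED_LEN]


def seed_coverage(seeds, genome_size_bp):
    """Depth of coverage contributed by the seed reads alone."""
    return sum(r.length for r in seeds) / genome_size_bp


if __name__ == "__main__":
    reads = [  # made-up reads for illustration
        Subread("a", 20_000, 25_000, 0.85),  # passes, long enough to seed
        Subread("b", 2_500, 9_000, 0.90),    # subread too short
        Subread("c", 18_000, 18_000, 0.75),  # quality too low
        Subread("d", 5_000, 5_000, 0.82),    # passes, too short to seed
    ]
    kept = [r.name for r in reads if passes_filter(r)]
    seeds = select_seeds(reads)
    print("kept:", kept)
    print("seeds:", [r.name for r in seeds])
    print("seed coverage on a 40 kb toy genome:", seed_coverage(seeds, 40_000))
```

In HGAP the longest reads act as seeds and the shorter filtered reads are aligned to them for error correction, so a check like `seed_coverage` is one way to see whether a chosen seed length still leaves enough seed data for the assembly step.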

A second round of genome polishing was performed using Quiver (SMRT Analysis v2.3.0) to
further improve the site-specific consensus accuracy of the assembly.
The following Quiver parameters were used for genome polishing:
filtering (
minimum sub-read length = 3,000 bp,
minimum polymerase read quality = 0.80,
minimum polymerase read length = 3,000 bp
);

mapping (
maximum divergence percentage = 30,
minimum anchor size = 12
).

Default parameters were otherwise employed for both HGAP assembly and Quiver protocols.
Original post: https://www.cnblogs.com/xudongliang/p/6873249.html