ChopStitch: a method for exon annotation and splice graph construction using transcriptome and whole-genome data

Sequencing studies on non-model organisms often interrogate both genomes and transcriptomes with massive amounts of short sequences. Such studies require de novo analysis tools and techniques when the species and its close relatives lack high-quality reference resources. For certain applications, such as de novo annotation, information on putative exons and alternative splicing may be desirable.

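A difficulty that sequencing studies of non-model organisms frequently face is handling massive amounts of short-sequence data from genomes and transcriptomes.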

Researchers at the Michael Smith Genome Sciences Centre have developed ChopStitch, a new method for finding putative exons de novo and constructing splice graphs using an assembled transcriptome and whole genome shotgun sequencing (WGSS) data. ChopStitch identifies exon-exon boundaries in de novo assembled RNA-Seq data with the help of a Bloom filter that represents the k-mer spectrum of WGSS reads. The algorithm also accounts for base substitutions in transcript sequences that may be derived from sequencing or assembly errors, haplotype variations, or putative RNA editing events. The primary output of the tool is a FASTA file containing putative exons. Further, exon edges are interrogated for alternative exon-exon boundaries to detect transcript isoforms, which are represented as splice graphs in DOT output format.
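The junction-finding idea can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not ChopStitch's implementation: a plain Python set stands in for the Bloom filter (so there are no false positives), reverse complements and base substitutions are ignored, and the function names and toy sequences are made up for the example. The point it shows is that a k-mer spanning an exon-exon junction does not occur in the genome, so a junction appears as a run of k - 1 consecutive absent k-mers.

```python
def kmers(seq, k):
    """Yield (offset, k-mer) for every k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def absent_runs(transcript, genome_kmers, k):
    """Return (start, end) offset ranges, end exclusive, of consecutive
    transcript k-mers that are missing from the genomic k-mer set."""
    runs, run_start = [], None
    for i, kmer in kmers(transcript, k):
        if kmer not in genome_kmers:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            runs.append((run_start, i))
            run_start = None
    if run_start is not None:
        runs.append((run_start, len(transcript) - k + 1))
    return runs


def putative_junctions(transcript, genome_kmers, k):
    """Report a junction position for each run of exactly k - 1 absent k-mers.

    The k-mers starting at offsets j-k+1 .. j-1 are the ones that span a
    junction at position j, so the junction sits at the (exclusive) run end.
    """
    return [end for start, end in absent_runs(transcript, genome_kmers, k)
            if end - start == k - 1]


# Toy data (hypothetical sequences): genome = exon1 + intron + exon2,
# transcript = exon1 + exon2.
K = 5
EXON1, INTRON, EXON2 = "ACGTACGGTC", "GTAAGTTTTTTCAG", "TTGCAATGCC"
GENOME_KMERS = {kmer for _, kmer in kmers(EXON1 + INTRON + EXON2, K)}

print(putative_junctions(EXON1 + EXON2, GENOME_KMERS, K))  # [10]
```

The reported position 10 is the length of the first exon, i.e. the point where the transcript should be chopped. A real Bloom filter would answer the same membership question in far less memory, at the cost of occasional false positives.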

ChopStitch workflow

After constructing the genomic Bloom filter, ChopStitch interrogates transcript sequences to find putative exons. It then finds exons with overlapping edges and constructs a splice graph in DOT format. Graphviz ccomps is used to find subgraphs. ChopStitch also detects putative exons shorter than the k-mer size, as illustrated in the figure: the stretch of absent k-mers is greater than k - 1, which is more than a single exon-exon junction can account for. The 3-sided arrows show the scrutiny process towards the beginning and end of the absent k-mer stretch.
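To make the splice graph step concrete, here is a minimal Python sketch that writes exon-to-exon edges as a DOT digraph and groups exons into connected subgraphs. It is an assumption-laden illustration rather than the tool's code: the exon IDs (e1 to e5), the function names, and the graph name are hypothetical, and the small depth-first search merely stands in for what Graphviz ccomps does on the real DOT file.

```python
from collections import defaultdict


def splice_graph_dot(edges, name="splicegraph"):
    """Render directed exon -> exon edges as DOT text."""
    lines = [f"digraph {name} {{"]
    for src, dst in sorted(set(edges)):
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


def connected_components(edges):
    """Group nodes into weakly connected components (edge direction ignored)."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
        adjacency[dst].add(src)

    seen, components = set(), []
    for node in sorted(adjacency):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical exons: e1 -> e2 -> e3 plus an isoform that skips e2,
# and an unrelated two-exon gene e4 -> e5.
EDGES = [("e1", "e2"), ("e2", "e3"), ("e1", "e3"), ("e4", "e5")]

print(splice_graph_dot(EDGES))
print(connected_components(EDGES))  # [['e1', 'e2', 'e3'], ['e4', 'e5']]
```

Each component corresponds to one subgraph, and the extra e1 -> e3 edge is how an exon-skipping isoform would show up in the graph.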

Availability: ChopStitch is written in Python and C++ and is released under the GPL license. It is freely available at: https://github.com/bcgsc/ChopStitch.
