Authors: Zhiyong Huang, Guangmei Yan, Jun Wang, Xiaoning Wang, Jian Wang, Guojie Zhang, Xiaodong Fang, Cai Li & Fei Ling
Cynomolgus and Chinese rhesus macaque sequencing, assembly and analysis
1. SOAPdenovo assembly
SOAPdenovo employs the de Bruijn graph algorithm both to simplify the task of assembly and to reduce computational complexity. Low-quality reads were filtered out, and potential sequencing errors were removed by k-mer frequency-based error correction. We filtered the following types of reads:
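As an illustration of the k-mer frequency idea behind the error-correction step (not of the read filters themselves), the sketch below counts every k-mer across a set of reads and flags positions whose k-mer is rare. It is a toy example under stated assumptions, not SOAPdenovo's implementation: the function names, the value of k, and the `min_count` cutoff are all hypothetical.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every k-mer occurring in a collection of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def weak_positions(read, counts, k, min_count):
    """Return start positions of k-mers seen fewer than min_count times.

    Rare k-mers are more likely to contain a sequencing error than
    k-mers supported by many reads (hypothetical cutoff).
    """
    return [i for i in range(len(read) - k + 1)
            if counts[read[i:i + k]] < min_count]

# Toy data: three identical reads and one read with a single substitution.
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTTCGT"]
counts = count_kmers(reads, k=4)
suspect = weak_positions(reads[3], counts, k=4, min_count=2)
```

In this toy input, every k-mer overlapping the substituted base in the last read is seen only once, so those positions are flagged, while the k-mer shared with the other reads is not.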
2. RNA-seq sequencing
3. Gene prediction
We used BLAT to map genes of the Indian rhesus macaque (IR; MMUL01) and human (Ensembl release 56) onto the two macaque genomes. Orthologous regions were then determined by best BLAT hit and synteny-based analysis, followed by the application of Exonerate and GeneWise to refine the gene model at each locus.
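The best-hit selection step can be sketched as follows. This is a minimal illustration, assuming the BLAT alignments have already been parsed into records; the `Hit` type, the example gene and scaffold names, and the simplified score (matches minus mismatches) are assumptions made here, not the scoring used in the protocol. The synteny-based analysis is not covered.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    """One parsed alignment of a reference gene to the target genome (hypothetical record)."""
    query: str       # reference gene identifier
    target: str      # scaffold in the target assembly
    matches: int     # matching bases
    mismatches: int  # mismatching bases

def hit_score(hit):
    """Simplified alignment score; an assumption for illustration only."""
    return hit.matches - hit.mismatches

def best_hits(hits):
    """Keep the highest-scoring hit for each query gene."""
    best = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit_score(hit) > hit_score(current):
            best[hit.query] = hit
    return best

# Hypothetical hits: geneA aligns to two scaffolds, geneB to one.
hits = [
    Hit("geneA", "scaffold1", 900, 20),
    Hit("geneA", "scaffold7", 950, 10),
    Hit("geneB", "scaffold2", 400, 5),
]
best = best_hits(hits)
```

Each retained hit defines a candidate orthologous locus, which would then be passed to a tool such as Exonerate or GeneWise for gene model refinement.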
4. Assembly quality validation with the neutral InDel model
The neutral InDel model¹ can be used to validate the quality of our genome assemblies. When two closely related genome sequences are aligned, the lengths of successive alignment blocks separated by gaps, termed inter-gap segments (IGS), are expected to follow a geometric frequency distribution under a standard neutral model. Within neutrally evolving regions, incorrect InDels introduced during the assembly process cause the observed IGS length distribution to depart from the geometric distribution: they generate an excess of short IGS over the number predicted by the neutral InDel model. By quantifying this excess, several parameters can be estimated, namely the proportion (ɛ), average density (D), and number (Ng) of clustered erroneous gaps in the genome alignments.
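The IGS idea can be sketched in a few lines. The example below extracts IGS lengths from a gapped pairwise alignment, fits a geometric distribution by a simple maximum-likelihood estimate, and reports the excess of short segments over the fitted expectation. It is an illustrative sketch only: the function names are hypothetical, fitting on all segments is a simplification, and the estimation of ɛ, D, and Ng is not reproduced here.

```python
from collections import Counter

def igs_lengths(seq_a, seq_b):
    """Lengths of ungapped runs in a pairwise alignment ('-' marks a gap).

    Terminal runs are included here for simplicity.
    """
    lengths, run = [], 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            if run:
                lengths.append(run)
            run = 0
        else:
            run += 1
    if run:
        lengths.append(run)
    return lengths

def geometric_expected(lengths, max_len):
    """Expected counts per length under a geometric model on {1, 2, ...}.

    Uses the maximum-likelihood estimate p = n / sum(lengths).
    """
    n = len(lengths)
    p = n / sum(lengths)
    return {L: n * p * (1 - p) ** (L - 1) for L in range(1, max_len + 1)}

def short_igs_excess(lengths, max_len):
    """Observed minus expected number of IGS with length <= max_len."""
    observed = Counter(lengths)
    expected = geometric_expected(lengths, max_len)
    return sum(observed[L] - expected[L] for L in range(1, max_len + 1))

# Toy alignment of two sequences (hypothetical data).
a = "ACGT-ACGTAC--GT"
b = "ACGTTAC-TACGGGT"
lengths = igs_lengths(a, b)
expected = geometric_expected(lengths, max_len=3)
excess = short_igs_excess(lengths, max_len=3)
```

On real whole-genome alignments, a clearly positive excess of short IGS in neutrally evolving regions would point to clustered erroneous gaps of the kind described above.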
Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques. Guangmei Yan, Guojie Zhang, Xiaodong Fang, Yanfeng Zhang, Cai Li, Fei Ling, David N Cooper, Qiye Li, Yan Li, Alain J van Gool, Hongli Du, Jiesi Chen, Ronghua Chen, Pei Zhang, Zhiyong Huang, John R Thompson, Yuhuan Meng, Yinqi Bai, Jufang Wang, Min Zhuo, Tao Wang, Ying Huang, Liqiong Wei, Jianwen Li, Zhiwen Wang, Haofu Hu, Pengcheng Yang, Liang Le, Peter D Stenson, Bo Li, Xiaoming Liu, Edward V Ball, Na An, Quanfei Huang, Yong Zhang, Wei Fan, Xiuqing Zhang, Yingrui Li, Wen Wang, Michael G Katze, Bing Su, Rasmus Nielsen, Huanming Yang, Jun Wang, Xiaoning Wang, and Jian Wang. Nature Biotechnology doi:10.1038/nbt.1992
Zhiyong Huang, Jun Wang, Jian Wang & Guojie Zhang, Beijing Genomics Institute, Shenzhen
Guangmei Yan & Xiaoning Wang, The South China Center for Innovative Pharmaceuticals, Guangzhou 510663, China
Xiaodong Fang, Cai Li & Fei Ling, Unaffiliated
Correspondence to: Guangmei Yan, Jun Wang, Xiaoning Wang, Jian Wang
Source: Protocol Exchange (2011) doi:10.1038/protex.2011.264. Originally published online 4 November 2011.