Sponsored by the Center for Science and Technology Development of the Ministry of Education
Supervised by the Ministry of Education of the People's Republic of China
Genotyping a population from next-generation sequencing data is essential for rare variant detection. To distinguish genomic structural variation from sequencing error, we propose a statistical model that introduces the genotype effect through a latent variable to describe the distribution of non-reference allele frequency data across samples and genome loci, while decomposing the sequencing error into a sample effect and a positional effect. An expectation/conditional maximization (ECM) algorithm is implemented to estimate the model parameters, and genotypes are then inferred from their posterior probabilities. The performance of the proposed method is investigated through simulations and a real data analysis. Compared with existing methods, our method makes fewer genotype-calling errors.
Keywords: Biostatistics; Empirical Bayes; Next-generation sequencing data; Genotyping
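As a rough illustration of the final step described in the abstract, inferring genotypes from posterior probabilities, the following minimal sketch calls a genotype at a single locus from non-reference read counts. It is not the model proposed in the paper: it assumes a plain three-component binomial mixture with a single fixed error rate, whereas the paper decomposes the error into sample and positional effects and estimates parameters by ECM. The function names, the `error_rate` value, and the `priors` are all hypothetical choices made for this example.

```python
from math import comb


def genotype_posteriors(alt_reads, depth, error_rate=0.01,
                        priors=(0.90, 0.09, 0.01)):
    """Posterior probabilities of genotypes 0, 1, 2 (non-reference allele count).

    Assumption for this sketch: reads are binomial, with expected non-reference
    frequency error_rate, 0.5, and 1 - error_rate for genotypes 0, 1, and 2.
    """
    freqs = (error_rate, 0.5, 1.0 - error_rate)
    # Unnormalized posterior weight: prior times binomial likelihood.
    weights = [
        p * comb(depth, alt_reads) * f ** alt_reads * (1.0 - f) ** (depth - alt_reads)
        for p, f in zip(priors, freqs)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def call_genotype(alt_reads, depth, **kwargs):
    """Return the genotype (0, 1, or 2) with the highest posterior probability."""
    post = genotype_posteriors(alt_reads, depth, **kwargs)
    return max(range(3), key=post.__getitem__)
```

For instance, `call_genotype(15, 30)` evaluates the three posteriors for 15 non-reference reads out of 30 and returns the most probable genotype. In a full treatment, the fixed `error_rate` and `priors` would be replaced by estimates from the fitted model.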