ISBN: 978-981-11-3671-9 DOI: 10.18178/wcse.2017.06.238
Next-generation Sequencing Generated Discrepancy in Abundance Characterization of Complex Microbial Community Compositions: an Error of Bioinformatics Pipeline
Abstract— Next-generation sequencing of metagenomes produces a large amount of valuable biological and
biomedical data, but the data still contain errors. For example, chimeras originate mainly from biological
reactions, while taxonomic classification errors readily arise from bioinformatics pipelines. In this
study, the microbial composition of the starter (Daqu) of Chinese GujingTribute liquor, especially the
dominant species or OTUs (operational taxonomic units), was determined by two approaches: sequencing of a
near-full-length ribosomal gene library (16S rDNA plus the internal transcribed spacer (ITS)), and
next-generation sequencing based on the 16S rDNA V4-V5 region. The two approaches gave
discrepant results for both prokaryotic and eukaryotic microbes. The results for
prokaryotic microbes differed most clearly in that (1) the most dominant species or OTU belonged to
different phyla, and (2) the 20 most dominant species or OTUs overlapped only partially. Further investigation
indicated that the bioinformatics analysis pipeline itself was sometimes an important source of the
discrepancy.
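The comparison described in the abstract (whether the most dominant taxon agrees, and how far the top-ranked lists overlap) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `top_n` and `compare_top_n`, the Jaccard index as an overlap measure, and the taxon labels and counts are all hypothetical placeholders chosen for illustration.

```python
def top_n(abundance, n=20):
    """Return the n most abundant taxa, highest count first (ties broken by name)."""
    ranked = sorted(abundance.items(), key=lambda kv: (-kv[1], kv[0]))
    return [taxon for taxon, _ in ranked[:n]]


def compare_top_n(a, b, n=20):
    """Compare the top-n taxa of two abundance tables (dicts of taxon -> count)."""
    top_a, top_b = top_n(a, n), top_n(b, n)
    shared = set(top_a) & set(top_b)
    union = set(top_a) | set(top_b)
    return {
        "shared": sorted(shared),
        # Jaccard index: an assumed overlap measure, not one named in the paper.
        "jaccard": len(shared) / len(union) if union else 0.0,
        "same_top_taxon": bool(top_a and top_b and top_a[0] == top_b[0]),
    }


# Placeholder counts only; these are NOT data from the study.
library_counts = {"taxon_A": 50, "taxon_B": 30, "taxon_C": 5}
ngs_counts = {"taxon_B": 70, "taxon_D": 20, "taxon_A": 10}

result = compare_top_n(library_counts, ngs_counts, n=2)
print(result)
```

With real data, the two dictionaries would hold per-taxon counts from the library sequencing and from the V4-V5 sequencing, and `n=20` would match the top-20 comparison mentioned above.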
Index Terms— next generation sequencing, bioinformatics pipeline, metagenome, discrepancy
Huimin Zhang, Zhizhou Zhang, Jie Jiang
School of Marine Science and Technology, Marine anti-fouling Engineering Technology Center of
Shandong Province, Harbin Institute of Technology, CHINA
Hongkui He, Runjie Cao, Huizhi Tang, Anjun Li
The Anhui GuJingTribute Liquor Ltd, CHINA
Cite: Huimin Zhang, Hongkui He, Runjie Cao, Huizhi Tang, Zhizhou Zhang, Anjun Li, Jie Jiang, "Next-generation Sequencing Generated Discrepancy in Abundance Characterization of Complex Microbial Community Compositions: an Error of Bioinformatics Pipeline," Proceedings of 2017 the 7th International Workshop on Computer Science and Engineering, pp. 1373-1378, Beijing, 25-27 June, 2017.