This tutorial shows how to extract individual genome bins from metagenomes using RStudio. The guide is written in R markdown and can be found here as. Hence, compiling the guide should recreate all plots seen in the guide.

The guide assumes basic knowledge of RStudio (a powerful IDE for R). If you have never used R, take a look at the introduction at Code School. To build the R markdown guide you will need to install knitr, which is easily done in R with install.packages("knitr").

The basic data requirement is two metagenomes in which the target species are in differential abundance. The data is assembled into one assembly (i.e. a collection of scaffolds). The raw reads from each metagenome are then mapped independently to the assembly, which generates two coverage estimates for each scaffold. The coverage information is then integrated with all other information on each scaffold, i.e. GC content, length, kmer frequency, and the presence of essential genes and their taxonomic classification.

All data except the two coverage estimates (HPminus and HPplus) can be generated automatically from a FastA file of the assembled scaffolds using the script workflow.R; see Data generation for detailed information. The guide uses the original data from the publication.
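As a rough illustration of how the two coverage estimates are combined with the other per-scaffold information, here is a minimal sketch in base R. It is not the workflow.R script: the toy sequences, the coverage values, the helper gc_percent, and the column names are all made up for illustration, and only the names HPminus and HPplus come from the guide.

```r
# Hypothetical toy scaffolds standing in for a FastA file of the assembly.
scaffolds <- c(s1 = "ATGCGCGCTA", s2 = "ATATATATGC", s3 = "GGGCCCGGGCCCAT")

# Hypothetical helper: GC content of one sequence, in percent.
gc_percent <- function(seq) {
  bases <- strsplit(toupper(seq), "")[[1]]
  100 * sum(bases %in% c("G", "C")) / length(bases)
}

# Per-scaffold length and GC content, computed with base R only.
stats <- data.frame(
  scaffold = names(scaffolds),
  length = nchar(scaffolds),
  gc = vapply(scaffolds, gc_percent, numeric(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Made-up coverage tables, one per metagenome. In practice these would
# come from mapping each read set independently to the assembly.
cov_minus <- data.frame(scaffold = c("s1", "s2", "s3"),
                        HPminus = c(12.0, 3.5, 40.2),
                        stringsAsFactors = FALSE)
cov_plus <- data.frame(scaffold = c("s3", "s1", "s2"),
                       HPplus = c(4.1, 55.0, 3.9),
                       stringsAsFactors = FALSE)

# Join on the scaffold name so row order in the inputs does not matter.
d <- merge(stats, cov_minus, by = "scaffold")
d <- merge(d, cov_plus, by = "scaffold")
print(d)
```

Merging on the scaffold name rather than binding columns by position keeps the two coverage estimates aligned even when the mapping outputs list scaffolds in different orders. Scaffolds whose coverage differs strongly between HPminus and HPplus are the ones that differential-abundance binning can separate.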