Assessment Question: Sequence variation

Sep 11, 2013 • Joshua Herr

I apologize for my late post here – mainly a tumultuous couple of weeks for me – but this was also a difficult task.  I’m not sure I’ve done it justice, but perhaps it’s the thought process here that is most important.  I’ve decided to stick with my original thought map topic.

Beginner to Intermediate Question:

You’ve been given the task of separating nucleotide sequence data from a mixed sample of organisms.  You use two techniques on these mixed samples.  One technique is amplicon sequencing, where you select for a specific marker region, and the other is whole genome shotgun sequencing, where you sequence the whole genomes.  To plan your analysis algorithm, which statement best describes the amount of sequence variation you should expect from each technique?

1.  The whole genome shotgun sequence data will be less variable than the amplicon sequence data.

2.  The amplicon sequencing data will be less variable than the whole genome shotgun sequence data.

3.  The sequence data from each technique will have the same amount of variation.

4.  The sequence data variation is dependent on the type of technology used for data collection.

Intermediate to Advanced Question:

We might consider a typical analysis pipeline for mixed whole-genome sequence data (from many organisms) to be the following: experimental design > sampling > sample fractionation > nucleotide extraction > sequencing > assembly > annotation > statistical analysis > data storage > data sharing.  At which step can sequence comparison (or binning) occur?

  1. Assembly

  2. Assembly & Annotation

  3. Assembly, Annotation, and Statistical Analysis

  4. Sequencing, Assembly, and Annotation

  5. Sequence comparison can occur anywhere within the pipeline
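
For readers who want a concrete picture of what "binning" means, here is a minimal Python sketch that groups sequences by GC content.  This is only an illustration under simple assumptions: the function names (`gc_count`, `bin_by_gc`), the read names, and the toy sequences are all made up for this example, and GC content is just one crude compositional signal.  Real binning tools typically combine richer features such as tetranucleotide frequencies and read coverage.

```python
from collections import defaultdict


def gc_count(seq: str) -> int:
    """Count the G and C bases in a sequence (case-insensitive)."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def bin_by_gc(seqs: dict[str, str], n_bins: int = 10) -> dict[int, list[str]]:
    """Group sequence names into n_bins equal-width GC-content bins.

    Hypothetical helper for illustration only; not taken from any
    particular binning tool.
    """
    bins = defaultdict(list)
    for name, seq in seqs.items():
        if not seq:
            continue  # skip empty sequences, which have no GC content
        # Integer arithmetic avoids floating-point edge cases at bin borders;
        # min() keeps a 100% GC sequence inside the last bin.
        index = min(gc_count(seq) * n_bins // len(seq), n_bins - 1)
        bins[index].append(name)
    return dict(bins)


# Toy, made-up sequences standing in for reads or contigs from a mixed sample.
toy_seqs = {
    "read_a": "ATATATATAT",
    "read_b": "GCGCGCGCGC",
    "read_c": "ATATATATGC",
    "read_d": "GGCCGGCCAT",
}

# With two bins, AT-rich and GC-rich sequences fall into separate groups.
two_bins = bin_by_gc(toy_seqs, n_bins=2)
print(two_bins)
```

The same kind of grouping could in principle be applied to raw reads, assembled contigs, or annotated genes; only the input to `bin_by_gc` would change.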