Finishing: XBAA-7G24 & XBAA-48K11
Justin Richner
March 28, 2005

I was initially given two fosmids to finish from the dot chromosome of Drosophila virilis: XBAA-7G24 and XBAA-48K11. First I observed the quality of coverage from just one plate and compared it to the quality of coverage with all the reads. With all the reads, I used the programs Phred, Phrap, and Consed to assemble and view the data. I called reactions to span the gaps where Phrap could not join two contigs due to insufficient data. I also called reactions to resolve low quality regions. On both projects, I was unable to finish completely due to time constraints. I compared my results to the presubmit checklist and compared the reactions I called to the reactions called by Autofinish. In this paper, I will first describe the actions I took for project XBAA-7G24, then I will describe project XBAA-48K11.

XBAA-7G24:

Low Coverage Quality:
I attempted to view the results obtained with the reads from just one plate, but I was unable to view the assembly due to an error in the file. Thus I could not compare the low coverage and high coverage data.

Initial data:

Figure 1: Initial Assembly View

Figure 1 shows my assembly view with all the reads. I had three contigs with one gap and a misassembly, as well as several low quality regions throughout the contigs. I suspected a misassembly because of the inconsistent forward/reverse pairs, indicated by the red lines connecting contigs 44 and 41.

Round 1:
My first objective was to close the gap between contigs 43 and 44. To do this, I called reactions using primers on each side of the gap, facing toward the gap, so I could be sure that the gap was covered. I also attempted to fix a low quality region around 6120 bases into contig 43. Again I called primers on each side of this region, facing inward, to be sure to cover the region.

Figure 2 and Table 1: Reactions called for round 1 and results

Problem              Template    Result
GAP                  aab69e21    Failure
GAP                  aab69e21    Failure
Low quality ~6120    aab72p16    Failure
Low quality ~6120    aab72p16    Failure

In choosing primers, I used the primers suggested by Consed plus a few I designed myself. A primer should sit at least 70 bases upstream of the problematic region so that the read provides a steady, high quality signal over the problem area; at the beginning of a read (adjacent to the primer), the signal quality is usually very poor, with many spikes. I also wanted to make sure that each primer sat on a unique region. If the same sequence were found at a second site within the same subclone, then the primer could sit down in both places and give two different sequences. I checked the first 8 bases at the 5’ end of each primer to make sure that sequence was not repeated within 3000 bases.

As Table 1 shows, all the reactions called were failures. Figure 3 shows the raw data from the sequencer, which is very low quality. Figure 4 shows a likely problem shared by two of the primers. Using the “Search for String” command in Consed, I discovered that the first 7 bases of my primer were a unique sequence, but the last 8 bases were repeated less than 1000 bases away. This could have caused the primer to sit down in two different regions of the same subclone and give poor results. Primers 2 and 4 both had this problem.

Figure 3: TraceViewer image of low quality sequence.

Figure 4: Consed image showing likely problem with primer XBAA7G24.4

Figure 5: Reactions called in round 2

Round 2:
Round 2 consisted of repeating the reactions from round 1 and adding other reactions for low quality regions. Two of the primers from round 1 appeared to be of good quality, and for these I chose a new subclone to use as a template. I designed new primers to replace primers 2 and 4 because of the aforementioned problem.
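
The primer checks described for round 1 (at least 70 bases upstream of the problem region, and no repeat of the primer ends nearby) can be sketched in a few lines of Python. This is only an illustrative sketch, not how Consed or Autofinish picks primers. The function names and the 0-based coordinates are assumptions made for the example; the 8-base end length, 3000-base window, and 70-base offset are the values quoted in the text. The sketch checks both ends of the primer, since the round 1 failures were traced to a repeat of the last 8 bases rather than the first.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def count_sites(template: str, kmer: str) -> int:
    """Count places where kmer could sit down on either strand of template."""
    template = template.upper()
    total = 0
    # A set avoids double counting when kmer equals its own reverse complement.
    for query in {kmer.upper(), reverse_complement(kmer)}:
        start = template.find(query)
        while start != -1:
            total += 1
            start = template.find(query, start + 1)
    return total


def check_primer(contig, primer_start, primer_len, problem_start,
                 end_len=8, window=3000, min_offset=70):
    """Check a forward-facing primer at contig[primer_start:primer_start + primer_len].

    Hypothetical helper: coordinates are 0-based and problem_start is where
    the gap or low quality region begins, downstream of the primer.
    """
    primer_end = primer_start + primer_len
    primer = contig[primer_start:primer_end]
    # Only look for repeats within `window` bases on either side of the primer.
    lo = max(0, primer_start - window)
    hi = min(len(contig), primer_end + window)
    neighborhood = contig[lo:hi]
    return {
        # True if the read has at least min_offset bases to settle down.
        "offset_ok": problem_start - primer_end >= min_offset,
        # A count of 1 means only the primer's own site matched (unique).
        "five_prime_sites": count_sites(neighborhood, primer[:end_len]),
        "three_prime_sites": count_sites(neighborhood, primer[-end_len:]),
    }


if __name__ == "__main__":
    # Made-up sequence: the primer's last 8 bases recur 200 bases downstream.
    primer = "ACGTTGCAAGCTTAGGCATC"
    contig = "A" * 100 + primer + "A" * 200 + primer[-8:] + "A" * 100
    print(check_primer(contig, 100, len(primer), 200))
```

A primer would pass this sketch only if `offset_ok` is true and both site counts equal 1. In the made-up example the 3’ end count is 2, which is the situation described for primers 2 and 4.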

For all of the reactions, I used 4:1 chemistry to obtain the best properties of both dGTP and Big Dye. I found one single stranded, low quality region 12,600 bases into contig 43 (Figure 6). Around 16,000 bases into contig 43 is another low quality region (Figure 7). I chose to design the primer for this reaction in the reverse direction. In this orientation the sit-down region was double stranded, high quality sequence, whereas in the forward direction the sit-down region was only single stranded, high quality sequence.

Figure 6: Single stranded region around 12,600 base pairs

Figure 7: Low quality region around 16,000 base pairs. Notice the double stranded, higher quality data to the right of the problem region.

Table 2 shows that almost all of the reactions for round 2 were successful. Although one of the gap reactions failed, I was able to fill the gap from the opposite direction. The results using template aab65a13 were of very poor quality, but were still entered into my main assembly. Figure 8 shows the areas that were originally “low quality” but were made “high quality” by the new data.

Table 2: Reactions called for round 2 and results

Problem               Template    Result
GAP                   aab77o21    Success
GAP                   aab71h21    Failure
Low quality ~6120     aab70m02    Success
Low quality ~6120     aab66a14    Success
Low quality ~12600    aab77k05    Success
Low quality ~16000    aab65a13    So-so

Figure 8: Regions made higher quality by reactions called and gap filled

Different Chemistry:
Template aab77k05 was used with all three types of chemistry: Big Dye, dGTP, and 4:1 (a volume ratio of 4 units of Big Dye to 1 unit of dGTP). The results of these reactions were not what I expected. Big Dye and dGTP both worked well, but 4:1 gave very poor results (Figure 9). Big Dye and dGTP performed about the same, and there was no particular place in this sequence where one consistently outdid the other. I know that these results are not typical. Big Dye handles GC compressions better than dGTP.
dGTP can sequence through structures such as hairpins better than Big Dye. 4:1 should combine the best of dGTP with the best of Big Dye. My sequence did not have any unusual structure or any GC compressions, so both chemistries worked well. The cause of the low quality obtained with the 4:1 chemistry is unknown.

Figure 9: Top = dGTP, Middle = 4:1, Bottom = Big Dye

Figure 10: New Assembly View with “direct sequence matches” and “inverted


WUSTL BIOL 4342 - Lecture Notes
