U of I CS 598 - Evolution and Tree of life

CS 598SS Lecture 4
Saurabh Sinha

Evolution and Tree of life

Phylogenetic tree
[Figure: tree of life]
[Figure: tree over Species 1 to Species 4 with labeled branch lengths]
Branch lengths = “divergence”
• Early methods: fossil evidence, morphological evidence
• New methods: genetic evidence

Evolution of genomes
• Genomes undergo mutations/changes during evolution
[Figure: the same tree with Species 1 genome to Species 4 genome at the leaves]

Evolutionary constraints
• Functional parts of the genome evolve more slowly than non-functional parts
  – Functional parts are more similar across species
  – “Functional constraint”
• Comparative genomics
  – Look at the more conserved parts of genomes for functional features

GCGTGATCGAGCTATAACGGAA
CTGTGATCGTCGGGTAACGCCC
TGGTGATCGGAACCCCTAACGA
AAGTGATCGATTATCCTAACGT

Multiple genomes comparison
• Genomes of multiple species are available
[Figure: the sequences above for species1 to species4, arranged on an evolutionary tree, with blocks of conservation highlighted]

How to find the conserved blocks?
• Sequence alignment
  – Identify the similar regions in two or more genomes
• Pairwise sequence alignment
  – The above, for two genomes/sequences
  – Given two sequences, how similar are they?

Pairwise Sequence Alignment
Sequences:
S1 : ACGCTGATATTA
S2 : AGTGTTATCCCTA
Alignment:
S1 : ACGCTGATAT---TA
S2 : AG--TGTTATCCCTA
Each aligned column is a match (same letter), a mismatch (different letters), or a “gap” (a letter opposite “-”).
• Given
  – two sequences S1 and S2,
  – “match” score,
  – “mismatch” penalty,
  – “gap” penalty
• Align S1 and S2 to maximize
  Score = (match score) x (# matches) - (mismatch penalty) x (# mismatches) - (gap penalty) x (# gaps)
• Sequence alignment is one of the most useful techniques in bioinformatics.
• Algorithm: “Dynamic Programming”
  – On the order of (m x n) operations are needed, where m and n are the lengths of the two sequences
• Not practical when aligning large genomes
  – Other algorithms for alignment: “Dialign”, “Lagan”, etc.

Multiple Sequence Alignment
• Generalization of the pairwise problem
• Given more than two sequences, align them to determine how similar they are
• Pairwise alignment score doesn’t generalize trivially
  – Why?
[Figure: two candidate trees over the same sequences; the first explains them with 1 mutation, the second needs 2 mutations]
• Perhaps the second tree is wrong?

Parsimony
• A guiding principle in cross-species comparison
• If the data can be explained in multiple ways, prefer the explanation with the fewest events (be parsimonious)
• In the example, if we are not sure which of the two trees is correct, choose the first tree
• Parsimony score = number of evolutionary events (e.g., substitutions) on the tree
• For a multiple alignment, consider each column separately, compute the parsimony score for each, and sum them
• Maximum parsimony principle: minimize parsimony score

Motif finding
[Figure: Gene 1 to Gene 5, with binding sites for a TF marked]

Motif finding from multiple species data
[Figure: Gene G in Species 1 to Species 5, with binding sites for a TF marked]

One approach
• Do multiple sequence alignment of upstream regions of genes
• Look for recurring motifs in conserved blocks
[Figure: upstream regions of Gene G in Species 1 to Species 5, with blocks of conservation marked]

Another approach (alignment-free)
[Figure: upstream regions of Gene G in Species 1 to Species 5]
• Look for recurring motifs in entire upstream regions
• What if binding sites are not entirely within conserved blocks?

Footprinter (Blanchette et al.)
Whiteboard

Parsimony score and probability
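The scoring rule on the Pairwise Sequence Alignment slides can be made concrete with a small sketch. The slides only say the algorithm is “Dynamic Programming” with on the order of m x n operations; the recurrence below is the standard global-alignment one (often called Needleman-Wunsch), which is an assumption about what the lecture intended. The function names and the default values of 1 for the match score and both penalties are hypothetical, chosen only for illustration.

```python
def score_alignment(a1, a2, match=1, mismatch=1, gap=1):
    """Score an already-gapped alignment with the slide's formula.

    A column containing '-' counts as one gap; the parameter
    defaults are illustrative assumptions, not values from the lecture.
    """
    if len(a1) != len(a2):
        raise ValueError("aligned sequences must have equal length")
    matches = mismatches = gaps = 0
    for x, y in zip(a1, a2):
        if x == "-" or y == "-":
            gaps += 1
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return match * matches - mismatch * mismatches - gap * gaps


def best_alignment_score(s1, s2, match=1, mismatch=1, gap=1):
    """Best global alignment score by dynamic programming, O(m x n) time."""
    n = len(s2)
    # prev[j] = best score for aligning the processed prefix of s1 with s2[:j]
    prev = [-gap * j for j in range(n + 1)]
    for i in range(1, len(s1) + 1):
        curr = [-gap * i] + [0] * n
        for j in range(1, n + 1):
            same = s1[i - 1] == s2[j - 1]
            diag = prev[j - 1] + (match if same else -mismatch)
            up = prev[j] - gap        # s1[i-1] aligned to a gap
            left = curr[j - 1] - gap  # s2[j-1] aligned to a gap
            curr[j] = max(diag, up, left)
        prev = curr
    return prev[n]


# The alignment shown on the slide: 8 matches, 2 mismatches, 5 gaps.
slide_score = score_alignment("ACGCTGATAT---TA", "AG--TGTTATCCCTA")
```

With these unit values the slide's alignment scores 8 - 2 - 5 = 1. `best_alignment_score("ACGCTGATATTA", "AGTGTTATCCCTA")` returns the optimum under the same parameters, which can be no lower than that. Only two rows of the table are kept, since the sketch reports the score and does not trace back the alignment itself.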
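The parsimony score described on the Parsimony slides (count substitutions per alignment column on a given tree, then sum over columns) can be sketched as follows. The slides do not say how the per-column count is computed; this sketch uses Fitch's algorithm for a fixed rooted binary tree, which is a swapped-in technique rather than something the lecture names. The tree encoding (nested tuples of leaf names), the species names, and the four-letter sequences are all hypothetical.

```python
def fitch_column(tree, column):
    """Return (candidate states, substitution count) for one column.

    tree   -- a leaf name (str) or a (left, right) tuple of subtrees
    column -- dict mapping each leaf name to its character in this column
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_states, left_cost = fitch_column(tree[0], column)
    right_states, right_cost = fitch_column(tree[1], column)
    common = left_states & right_states
    if common:
        # Children can agree on a state: no extra substitution needed.
        return common, left_cost + right_cost
    # Children disagree: one substitution is charged at this node.
    return left_states | right_states, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the per-column substitution counts over a gap-free alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for i in range(length):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_column(tree, column)[1]
    return total


# Hypothetical data: two species carry a T in the last column.
alignment = {"sp1": "AAAT", "sp2": "AAAT", "sp3": "AAAA", "sp4": "AAAA"}
tree_a = (("sp1", "sp2"), ("sp3", "sp4"))  # groups the two T species together
tree_b = (("sp1", "sp3"), ("sp2", "sp4"))  # splits them apart

score_a = parsimony_score(tree_a, alignment)
score_b = parsimony_score(tree_b, alignment)
```

Here `score_a` is 1 and `score_b` is 2, mirroring the slide's contrast between a tree that needs 1 mutation and one that needs 2. Under the maximum parsimony principle, `tree_a` would be preferred.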

