U of I CS 466 - Two short pieces

Two short pieces
MicroRNA
Alternative splicing

MicroRNA
• First part is about discovery of the genes that code for microRNAs
• Second part is about discovery of the “targets” of these microRNAs

What is microRNA?
• Genome has protein-coding genes
• It also has genes that code for RNA
  – e.g., “transfer RNA” that is used in translation is coded by genes
  – e.g., “ribosomal RNA” that forms part of the structure of the ribosome is also coded by genes
• microRNAs are a family of small RNAs
  – genome has genes that code for microRNAs, i.e., the result of transcription is microRNA

What is microRNA?
• 21-22 nucleotide non-coding RNA
• The gene that codes for a miRNA first produces a ~70 nucleotide transcript
• This “pre-miRNA” transcript has the capacity to form a stem-loop structure
• This pre-miRNA is then processed into the 21-22 nucleotide long miRNA by an enzyme called Dicer

What is microRNA?
• Vast majority of microRNAs regulate other genes by binding to complementary sequences in the target gene
• Perfect complementarity of binding leads to degradation of the target gene's mRNA
• Imperfect pairing inhibits translation of mRNA to protein
• miRNAs are an important piece of the puzzle that is gene regulation

A model for miRNA function
doi:10.1016/S0092-8674(02)00863-2 Copyright 2002 Cell Press.

How to find miRNAs?
• Experimental methods so far
• Lai et al. (2003) is one of the works that try to solve this problem computationally
• Basic idea:
  – look for evolutionarily conserved sequences
  – check if some of these fold well into the stem-loop structure (“hairpins”) associated with miRNAs

Comparative genomics
• Start with 24 known Drosophila pre-miRNAs (the ~70-100 long transcripts before miRNAs)
• All are found to be conserved between D. melanogaster and D. pseudoobscura
  – Typically, more conserved than protein-coding genes. (The third codon “wobble” is not relevant here)

miRNA genes are isolated, evolutionarily conserved genomic sequences that have the capacity to form extended stem-loop structures as RNA.
Shown are VISTA plots of globally aligned sequence from D. melanogaster and D. pseudoobscura, in which the degree of conservation is represented by the height of the peak. This particular region contains a conserved sequence identified in this study that adopts a stem-loop structure characteristic of known miRNAs. Expression of this sequence was confirmed by northern analysis (Table 2), and it was subsequently determined to be the fly ortholog of mammalian mir-184. Most conserved sequences do not have the ability to form extended stem-loops, as evidenced by the fold adopted by the sequence in the neighboring peak.
http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12844358

Finding microRNA genes
• Find highly conserved sequences, length ~70-100
• Check for secondary structure
• Are we done?
  – No, too many such sequences; more filters needed

Comparative genomics
• Look carefully at pairwise alignments of each of the 24 pairs of orthologous pre-miRNAs
• Only three pairs are completely conserved
• Ten pairs have diverged exclusively within their loop sequence; no pair has diverged exclusively in an arm
• Of the 11 remaining, seven show more changes in the loop than in the non-miRNA-encoding arm
http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=12844358

So what do we learn?
• That classes 1 - 3 are the normal pattern of evolutionary divergence of miRNAs
• That classes 4 - 6 are unlikely
• Therefore use these criteria as additional filters for evolutionarily conserved sequences

Prediction Pipeline details: 1
• Align the two genomes
• “Regions” that should contain miRNA genes are estimated as those having
  – length 100,
  – <= 15% mismatches,
  – <= 13% gaps

Pipeline details: 2
• Analyze conserved regions with mfold3.1, an RNA folding algorithm
• Find the top scoring regions (from the mfold program); these are candidates for the next stage

Pipeline details: 3
• Assess the divergence pattern of candidate miRNAs
• Boolean filters: remove candidates with
  – exclusive divergence in arm
  – more divergence in miRNA-coding arm than in loop

Final results
• 200 candidate miRNAs came out
• Experimental validation of many of these
• 24 novel miRNAs confirmed

Summary of part 1
• Learned what miRNAs are
• and how the genes encoding these are predicted computationally
• Learned that miRNAs function to regulate gene expression by binding to the mRNA of the target genes (perfectly or imperfectly)

Part 2: finding the targets
• Rhoades et al. (2002)
• We should be looking for targets …
• … with base complementarity
• But small size (20-24 nt) and imperfect base pairing imply that we may end up predicting too many
• Rhoades et al. found that nearly perfect complementarity is a good indicator of miRNA targets in plants

Plant miRNAs
• Started with 16 known Arabidopsis miRNAs
• Looked for complementary strings with <= 4 mismatches and no gaps
• Also did the same genome-wide search with “randomized” versions of the 16 miRNAs

Results of this scan
doi:10.1016/S0092-8674(02)00863-2 Copyright 2002 Cell Press.

Near perfect complementarity
• Number of hits with <= 3 mismatches is 30 for the real miRNAs, 0.2 for the random
  – Why fractional for random?
• Therefore <= 3 mismatches is supposed to be a good indicator of targets
• Find all targets using this rule; as simple as that!

Alternative Splicing
(a review by Liliana Florea, 2006)

What is alternative splicing?
• The first result of transcription is “pre-mRNA”
• This undergoes “splicing”, i.e., introns are excised and exons remain, to form mRNA
• This splicing process may involve different combinations of exons, leading to different mRNAs and different proteins
• This is alternative splicing

Significance
• Important regulatory mechanism for modulating gene and protein content in the cell
• Large-scale genomic data today suggests that as many as 60% of human genes undergo alternative splicing

Significance
• Number of human genes has recently been estimated to be about 20-25 K
• Not significantly greater than in much less complex organisms
• Alternative splicing is a potential explanation of how a large variety of proteins can be achieved with a small number of genes
• Errors in the splicing mechanism are implicated in diseases such as cancers

Figure (http://bib.oxfordjournals.org/cgi/content/full/7/1/55/F1), panel labels:
• exon inclusion/exclusion
• alternative 3’ exon end
• alternative 5’ exon end
• intron retention
• 5’ alternative UTR
• 3’ alternative UTR

Bioinformatics of Alt.
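
Returning to the target scan in Part 2: the rule "complementary strings with at most k mismatches and no gaps", together with the randomized control, can be sketched in a few lines of Python. This is only an illustrative sketch and not the code used by Rhoades et al. The function names (`reverse_complement`, `count_hits`, `mean_random_hits`) and the toy sequences are hypothetical. Treating G:U pairs as ordinary mismatches and "randomizing" by shuffling the miRNA's letters are simplifying assumptions, since the slides do not say how either was handled. One plausible reading of the fractional 0.2, also not stated in the slides, is that the random count is an average over several randomized sets, which is what `mean_random_hits` computes.

```python
import random

# Watson-Crick pairs only; G:U wobble pairs count as mismatches (an assumption).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the sequence that would pair perfectly with `rna`."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_hits(mirna, transcripts, max_mismatches=3):
    """Count ungapped windows that pair with `mirna` with few mismatches."""
    site = reverse_complement(mirna)  # what a perfect target site looks like
    n = len(site)
    hits = 0
    for seq in transcripts:
        for i in range(len(seq) - n + 1):
            window = seq[i:i + n]
            mismatches = sum(a != b for a, b in zip(site, window))
            if mismatches <= max_mismatches:
                hits += 1
    return hits


def mean_random_hits(mirna, transcripts, trials=10, max_mismatches=3, seed=0):
    """Average hit count over shuffled versions of `mirna` (assumed control)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        letters = list(mirna)
        rng.shuffle(letters)  # same base composition, scrambled order
        total += count_hits("".join(letters), transcripts, max_mismatches)
    return total / trials  # an average, so it can be fractional


if __name__ == "__main__":
    # Toy data, not real Arabidopsis sequences.
    toy_mirna = "AAAAAGGGGG"
    toy_transcripts = ["AACCCCCUUUUUAA"]
    print(count_hits(toy_mirna, toy_transcripts, max_mismatches=0))
    print(mean_random_hits(toy_mirna, toy_transcripts, trials=4))
```

Because the control is an average over several shuffles, it can come out as a non-integer even though every individual scan returns a whole number of hits. Comparing that average with the hit count for the real miRNA is what motivates choosing a mismatch threshold.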

