UGA BCMB 8020 - Calvo

This preview shows page 1-2 out of 6 pages.



Upstream open reading frames cause widespread reduction of protein expression and are polymorphic among humans

Sarah E. Calvo [a,b,c,d,1], David J. Pagliarini [a,b,c,1], and Vamsi K. Mootha [a,b,c,2]

[a] Broad Institute of MIT and Harvard, Cambridge, MA 02142; [b] Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114; [c] Department of Systems Biology, Harvard Medical School, Boston, MA 02115; and [d] Division of Health Sciences and Technology, Harvard–MIT, Cambridge, MA 02139

Edited by Jonathan Weissman, University of California, San Francisco, CA, and accepted by the Editorial Board March 18, 2009 (received for review October 29, 2008)

Upstream ORFs (uORFs) are mRNA elements defined by a start codon in the 5′ UTR that is out-of-frame with the main coding sequence. Although uORFs are present in approximately half of human and mouse transcripts, no study has investigated their global impact on protein expression. Here, we report that uORFs correlate with significantly reduced protein expression of the downstream ORF, based on analysis of 11,649 matched mRNA and protein measurements from 4 published mammalian studies. Using reporter constructs to test 25 selected uORFs, we estimate that uORFs typically reduce protein expression by 30–80%, with a modest impact on mRNA levels. We additionally identify polymorphisms that alter uORF presence in 509 human genes. Finally, we report that 5 uORF-altering mutations, detected within genes previously linked to human diseases, dramatically silence expression of the downstream protein. Together, our results suggest that uORFs influence the protein expression of thousands of mammalian genes and that variation in these elements can influence human phenotype and disease.

polymorphism | post-transcriptional control | proteomics | translation | uORF

The regulation of gene expression is controlled at many levels, including transcription, mRNA processing, protein translation, and protein turnover.
Posttranscriptional regulation is often controlled by short sequence elements in the UTRs of mRNA. One such 5′ UTR element is the upstream ORF (uORF) depicted in Fig. 1A. Because eukaryotic ribosomes usually load on the 5′ cap of mRNA transcripts and scan for the presence of the first AUG start codon, uORFs can disrupt the efficient translation of the downstream coding sequence (1, 2). Previous reports have shown that ribosomes encountering a uORF can (i) translate the uORF and stall, triggering mRNA decay, (ii) translate the uORF and then, with some probability, reinitiate to translate the downstream ORF, or (iii) simply scan through the uORF (2). uORFs have been shown to reduce protein levels in ≈100 eukaryotic genes [supporting information (SI) Table S1]. Additionally, mutations that introduce or disrupt a uORF have been found to cause 3 human diseases (3–5). In several interesting cases, the uORF-derived protein is functional; however, in most cases, the mere presence of the uORF is sufficient to reduce expression of the downstream ORF (1, 2, 6–8).

Previous genomic analyses suggest that uORFs may be widely functional for several reasons: They correlate with lower mRNA expression levels (9), they are less common in 5′ UTRs than would be expected by chance (6, 10), they are more conserved than expected when present (6), and several hundred have evidence of translation in yeast (11). However, no study has demonstrated that these elements have a widespread impact on cellular protein levels. Moreover, no study has investigated whether uORF presence varies in the human population. Here, we take advantage of recently available datasets of protein abundance (12–17) and genetic variation (18, 19) to assess the impact and natural variation of mammalian uORFs.

Results

uORF Prevalence Within Mammalian Transcripts. We define a uORF as formed by a start codon within a 5′ UTR, an in-frame stop codon preceding the end of the main coding sequence (CDS), and length at least 9 nt including the stop codon.
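
The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `find_uorfs`, the 0-based coordinate convention, the "upstream"/"overlapping" labels, and the choice to exclude a uAUG whose first in-frame stop is the CDS's own stop codon (an in-frame N-terminal extension) are all assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LEN_NT = 9  # minimum uORF length with the stop codon included (from the text)


def find_uorfs(seq, cds_start, cds_end):
    """Return a list of (start, end, kind) tuples, one per uORF.

    Hypothetical helper; coordinates are assumptions for this sketch:
    seq       : transcript sequence (DNA or RNA letters)
    cds_start : 0-based index of the first base of the main start codon
    cds_end   : 0-based exclusive end of the main CDS, stop codon included
    """
    seq = seq.upper().replace("U", "T")
    uorfs = []
    # The start codon must lie entirely within the 5' UTR.
    for start in range(0, cds_start - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon to the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                # The stop must precede the end of the main CDS, and the
                # uORF must be at least 9 nt long including its stop codon.
                if end < cds_end and end - start >= MIN_LEN_NT:
                    kind = "upstream" if end <= cds_start else "overlapping"
                    uorfs.append((start, end, kind))
                break
    return uorfs


# Toy transcripts (invented for illustration, not from the paper).
fully_upstream = "GGATGAAATAACC" + "ATGGCCTGA" + "AAAA"
overlapping = "ATGC" + "ATGGCTAACTGA"
print(find_uorfs(fully_upstream, cds_start=13, cds_end=22))  # [(2, 11, 'upstream')]
print(find_uorfs(overlapping, cds_start=4, cds_end=16))      # [(0, 12, 'overlapping')]
```

The two toy transcripts correspond to the two uORF classes distinguished in Fig. 1: one that terminates before the main start codon and one that runs into the CDS in a different reading frame.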
As shown in Fig. 1A, this definition includes uORFs both fully upstream and overlapping the CDS, because both types are predicted to be functional (20). We searched for uORFs within all human and mouse RefSeq transcripts with annotated 5′ UTRs >10 nt. Consistent with previous estimates (9, 10), we find that 49% of human and 44% of mouse transcripts contain at least 1 uORF (Fig. 1B). Interestingly, human and mouse uORF start codons (uAUGs) are the most conserved 5′ UTR trinucleotide across vertebrate species (Fig. S1), consistent with a widespread functional role.

[Fig. 1A schematic labels: cap, 5′ UTR, uORF, uORF, main coding sequence, 3′ UTR, polyA; AUG start codons marked.]

# Transcripts with:           Human    Mouse
annotated 5′ UTR              23775    18663
≥1 uORF                       11670     8253
≥2 uORFs                       6268     4197
≥1 uORF fully upstream         9879     6935
≥1 uORF overlapping CDS        4275     2872

Median Length (nt):           Human    Mouse
5′ UTR                          170      139
uORF                             48       48

Fig. 1. uORF definition and prevalence. (A) Schematic representation of mRNA transcript with 2 uORFs (red arrows), 1 fully upstream and 1 overlapping the main coding sequence (black arrow). uORFs are defined by a start codon (AUG) in the 5′ UTR, an in-frame stop codon (arrowhead) preceding the end of the main coding sequence, and length ≥9 nt. (B) Number and length of uORFs in human and mouse RefSeq transcripts.

Author contributions: S.E.C., D.J.P., and V.K.M. designed research; S.E.C. and D.J.P. performed research; and S.E.C. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. J.W. is a guest editor invited by the Editorial Board.
Freely available online through the PNAS open access option.
[1] S.E.C. and D.J.P. contributed equally to this work.
[2] To whom correspondence should be addressed at: Center for Human Genetic Research, Massachusetts General Hospital, 185 Cambridge Street CPZN 5–806, Boston, MA 02114. E-mail: [email protected]
This article contains supporting information online at www.pnas.org/cgi/content/full/0810916106/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.0810916106 | PNAS | May 5, 2009 | vol. 106 | no. 18 | 7507–7512 | GENETICS

uORF Impact on Cellular Protein Levels. If uORFs cause widespread reduction in protein expression, as predicted by ribosome scanning models, we would expect uORF-containing transcripts to correlate with lower protein levels when compared with uORF-less transcripts. To test this hypothesis, we analyzed a total of 11,649 matched mRNA and protein abundance measurements from 4 published studies across a variety of mouse tissues and developmental stages. These

