UMD CMSC 423 - Introduction & Biology Basics

This preview shows page 1-2-14-15-30-31 out of 31 pages.

CMSC 423: Introduction & Biology Basics
August 31, 2010
Carl Kingsford
Center for Bioinformatics and Computational Biology

What does biology have to do with computers?
•Huge amount of data: too much to analyze by hand
•Requires clever algorithms to:
  •find interesting patterns
  •store / search / compare
  •predict missing or hard-to-observe features (like protein structure or evolutionary relationships)
•Nearly all molecular biology is now “computational biology”: biologists depend on computer scientists every day.

Algorithmic Techniques & Data Structures
•Dynamic programming
•Divide and conquer
•Branch and bound
•Linear programming
•Gibbs sampling / expectation maximization
•Hidden Markov Models
•Burrows-Wheeler transform
•Suffix arrays / suffix trees
(Slide annotations: very general algorithmic techniques; techniques widely used in string / sequence algorithms.)

Course work & Evaluation
•2 exams during the semester:
  •Sept 30, 2010
  •Nov. 11, 2010
  •These dates are fixed. Exams are non-cumulative.
  •2nd exam will cover material since the first exam.
  •Each 20% of the grade.
•Comprehensive final (30%):
  •Tuesday, Dec 14, 2010
  •8am - 10am.
  •Covers everything in the class
•Several homework assignments.
  •10% of your grade
  •Neatness counts
•Programming project:
  •20% of your grade.
  •Mostly in the second “half” of the semester.

Administrative Details
•TA: Emre Sefer
•TA office hours: TBD
•Instructor office hours: Mondays 2:30-3:30pm, or by appointment, in CBCB 3113.
•Grades will be posted on: http://grades.cs.umd.edu
•More details on syllabus handout.
•Homeworks are due at the start of class on their due dates. No late homeworks will be accepted.
•You can discuss homeworks with your classmates.
•You must list the names of the people with whom you collaborated at the top of the homework.
•You must write up homework solutions on your own.
•Late programming assignments will lose 10% per day, up to 5 days, after which they will not be accepted.

Tentative Course Topics & Outline
•Sequence search & comparison
•Dynamic programming
•Local / global alignments
•Aligning multiple sequences
•RNA folding
•Suffix trees / suffix arrays
•Burrows-Wheeler Transform
•Hidden Markov Models
•For gene finding
•For sequence pattern finding
•Expectation Maximization for pattern finding
•Gene Expression & Clustering
•Phylogenetics
•Algorithms for building trees
•Genome Rearrangements
•Protein Structure
•Secondary structure prediction
•Threading for structure prediction
•Side-chain positioning
•Spatial biology
•Mouse brain structure
•Genome shape
(Slide annotations: before midterm #1; before midterm #2; before final.)

E. coli
Christos Savva (Microscopy & Imaging Center) and Thomas Wood (Dept. of Chemical Engineering) at Texas A&M University.
•E. coli is an example of a bacterium.

Component   % total dry weight
DNA          3.1
RNA         20.5
Protein     55.0
Lipid        9.1

Algorithms are used to understand these important components.

DNA =
● substrings encode for genes, most of which encode for proteins
● double-stranded, linear molecule
● strands are complements of each other (A ↔ T; C ↔ G)
● each strand is a string over {A,C,G,T}

“Central Dogma” of Biology
Genome (DNA) ➝ mRNA ➝ proteins
•Transcription (T ➝ U): DNA ➝ mRNA
•Translation: mRNA ➝ proteins

DNA Replication
(Figure from The Cartoon Guide to Genetics, Larry Gonick & Mark Wheelis, 1983.)

Recent Genomics (DNA)
•First genome sequenced in 1995 (the bacterium H. influenzae, with a genome of 1,830,140 letters).
•1st draft of human genome finished in 2001 (~3 billion letters)
•Now: Over 1100 bacterial genomes
•Hundreds of higher-order genomes done or in progress.
•Several complete human genomes finished.

Example Genomic Sequence
   1 atactataaa tccacctctc attttattca cttcatacat gctattacac actctgtgcc
  61 atcatagtat gttttcatac atcctccctt ctttcacacc ctatgtatat cgtacattaa
 121 tggtgtaccc cccctccccc tatgtatatc gtgcattaat ggcgtgcccc atgcatataa
 181 gcatgtacat actgtgcttg gctttacatg aggatactca ttacaagaac ttatttcaag
 241 cgatagtcta tgagcatgta tttcacttag tccaagagct tgatcaccaa gcctcgagaa
 301 accagcaatc cttgcgagta cgtgtacctc ttctcgctcc gggcccataa tttgtggggg
 361 tttctatact gaaactatac ctggcatctg gttcttacct cagggccatg ttagcgtcaa
 421 ctcaatccta ctaacccttc aaatgggaca tctcgatgga ctaatgacta atcagcccat
 481 gatcacacat aactgtggtg tcatgcattt ggtatttttt aattttaggg ggggaacttg
 541 ctatgactca gctatgaccg taaaggtctc gtcgcagtca aatcagctgt agctgggctt
 601 attcatcttt cgaggctcct catggacacc cataaggtgc aattcagtca atggtcacag
 661 gacataacac tatagatcac ccggactggc gttacgtgta cgtacgtgta cgtacgtgta
 721 cgcacgtgta cgtacgtgta cgcacgtgta cgtacgtgta cgcacgtgta cgcacgtgta
 781 cgtacgtgta cgcacgtgta cgtacgtgta cgtacgtgta cgcacgtgta cgcacgtgta
 841 cgtacgtgta cgcacgtgta cgtacgtgta cgcacgtgta cgcacgtgta cgcacgtgta
 901 cgtacgtgta cgcacgtgta cgtacgtgta cgcacgtgta cgcacgtgta cgtacgtgta
 961 cgcacgtgta cgcacgtgta cgcacgtgta cgcacgtgta cgcacgtgta cgtacgtgta
1021 cgcgtacgta ttttagatac taagttagct tagacaaacc ccccttaccc cccgtaactt
1081 caagaagctt acatatactt atggatgtcc tgccaaaccc caaaaacaag actaaatata
1141 tgcgcaaaca tgaagtcact tacacctaaa cccatataat taagctaacc ccccagccaa
1201 tgttgcaaca actacggaca tgggactcta aattttaatt tatctataga tatttttctt
1261 ttactgtgtc tccccagcat tgatttttta attatcatta ttccacacca ccaatttcca
1321 ttgagctatt tcacatgagt tccaaatcaa ttatgttcat gtagcttaac gaataaagca
1381 aggtactgaa aatgcctaga tgggtcacgc taccccatag acataaaggt ttggtcctag
1441 ccttcctatt agccattaac aagattacac atgtaagtct ccacgctcca gtgaaaatgc
1501 cccttaagtc ctcttagacg acctaaagga gcgggtatca agcacacctt atggtagctc
1561 acaacgcctt gcttagccac acccccacgg gaaacagcag tgataaaaat taagctatga
1621 acgaaagttc gactaagcta tgttaatact agggttggta aatctcgtgc cagccaccgc
1681 ggtcatacga ttaactcgag ttaataggcc tacggcgtaa agcgtgtaaa agaaaaaatc
1741 tcctctacta aagttaaagt atgattaagc tgtaaaaagc taccattaat actaaaataa
1801 actacgaaag tgactttaaa atttctgatt acacgatagc tagggcccaa actgggatta
1861 gataccccac tatgcctagc tctaaacata gatattttac taaacaaaac tattcgccag
1921 agaactacta gcaacagctt aaaactcaaa ggacttggcg gtgctttata tccccctaga
1981 ggagcctgtt ctgtaatcga taaaccccga tagacctcac catcccttgc taattcagtt
2041 tatataccgc catcttcagc aaacccttaa aaggaaaaaa agtaagcata actaccctac
2101 ataaaaaagt taggtcaagg tgtaacctat gggctgggaa gaaatgggct acattttcta
2161 ttcaagaaca acttctacga aaacttttat gaaactaaaa gctaaaggcg gatttagtag
2221 taaattaaga atagagagct taattgaaca
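
The slides describe each DNA strand as a string over {A,C,G,T}, with A pairing with T and C pairing with G, and describe transcription as rewriting T as U. As a minimal sketch of those rules, the Python below strips the position numbers from the first two lines of the example sequence and applies them. The function names are hypothetical and do not come from the lecture. Two assumptions go beyond the slides: the reverse step relies on the standard convention that the two strands run in opposite directions, and transcribe treats its input as the coding strand.

```python
# Illustrative sketch only; all function names here are hypothetical.

# Base pairing from the slides: A <-> T, C <-> G (both letter cases).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def parse_numbered_sequence(text: str) -> str:
    """Drop position numbers and whitespace from a numbered sequence listing."""
    return "".join(tok for tok in text.split() if not tok.isdigit())


def complement(dna: str) -> str:
    """Return the base-by-base complement of a DNA string."""
    return dna.translate(COMPLEMENT)


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read in its own direction.

    Assumes the strands are antiparallel (standard convention, not
    stated on the slides), so the complement is reversed.
    """
    return complement(dna)[::-1]


def transcribe(dna: str) -> str:
    """Rewrite a DNA string in the RNA alphabet (T -> U).

    Simplification: the input is treated as the coding strand.
    """
    return dna.replace("t", "u").replace("T", "U")


# First two lines of the example genomic sequence.
listing = """
   1 atactataaa tccacctctc attttattca cttcatacat gctattacac actctgtgcc
  61 atcatagtat gttttcatac atcctccctt ctttcacacc ctatgtatat cgtacattaa
"""

seq = parse_numbered_sequence(listing)
print(len(seq))                      # 120
print(complement(seq[:10]))          # tatgatattt
print(reverse_complement(seq[:10]))  # tttatagtat
print(transcribe(seq[:10]))          # auacuauaaa
```

Taking the reverse complement twice returns the original string, which is a quick sanity check when working with either strand.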

