Berkeley COMPSCI 186 - Lecture Notes

CS186 - Introduction to Database Systems
Spring Semester 2003
Prof. Joe Hellerstein

“Knowledge is of two kinds: we know a subject ourselves, or we know where we can find information upon it.”
-- Samuel Johnson (1709-1784)

What Is a Database System?
• Database: a very large, integrated collection of data.
• Models a real-world enterprise
  – Entities (e.g., teams, games)
  – Relationships (e.g., The Raiders are playing in The Superbowl)
  – More recently, also includes active components (e.g., “business logic”)
• A Database Management System (DBMS) is a software system designed to store, manage, and facilitate access to databases.

Is the WWW a DBMS?
• Fairly sophisticated search available
  – crawler indexes pages on the web
  – keyword-based search for pages
• But, currently
  – data is mostly unstructured and untyped
  – search only:
    • can’t modify the data
    • can’t get summaries, complex combinations of data
  – few guarantees provided for freshness of data, consistency across data items, fault tolerance, …
  – Web sites (e.g., e-commerce) typically have a DBMS in the background to provide these functions.
• The picture is changing
  – New standards like XML can help data modeling
  – Research groups (like ours at Berkeley) are working on providing some of this functionality across multiple web sites.
  – The WWW/DB boundary is blurring!

“Search” vs. Query
• What if you wanted to find out which actors donated to Al Gore’s presidential campaign?
• Try “actors donated to gore” in your favorite search engine.

“Search” vs. Query
• “Search” can return only what’s been “stored”
• E.g., best match at iWon, Google, AskJeeves top ten:

A “Database Query” Approach
“Yahoo Actors” JOIN “FECInfo”
(Courtesy of the Telegraph research group @Berkeley)

Q: Did it Work?
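To make the “Yahoo Actors” JOIN “FECInfo” idea concrete, here is a minimal sketch of a join in SQL, run through Python's built-in sqlite3 module. Everything in it is an assumption made for illustration: the table names (actors, donations), their columns, and the placeholder rows are hypothetical and are not the Telegraph group's actual data or system. The point is only that a query can combine two separately stored collections to answer a question that neither one stores directly.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two hypothetical tables and placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE actors (name TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE donations (donor TEXT, recipient TEXT, amount INTEGER)"
    )
    conn.executemany("INSERT INTO actors VALUES (?)", [("Actor A",), ("Actor B",)])
    conn.executemany(
        "INSERT INTO donations VALUES (?, ?, ?)",
        [
            ("Actor A", "Candidate X", 500),
            ("Actor A", "Candidate X", 250),
            ("Actor B", "Candidate Y", 100),
            ("Donor C", "Candidate X", 900),  # not an actor, so the join drops this row
        ],
    )
    return conn


def actors_who_donated(conn, recipient):
    """Return (actor name, total donated) for every actor who gave to `recipient`."""
    rows = conn.execute(
        """
        SELECT a.name, SUM(d.amount) AS total
        FROM actors AS a
        JOIN donations AS d ON d.donor = a.name
        WHERE d.recipient = ?
        GROUP BY a.name
        ORDER BY a.name
        """,
        (recipient,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(actors_who_donated(build_demo_db(), "Candidate X"))
```

A keyword search could only return pages that already contain this answer. The join computes it from the two tables at query time, and the GROUP BY adds the kind of summary that the earlier slide says search cannot provide.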
Is a File System a DBMS?
• Thought Experiment 1:
  – You and your project partner are editing the same file.
  – You both save it at the same time.
  – Whose changes survive?
  A) Yours B) Partner’s C) Both D) Neither E) ???
• Thought Experiment 2:
  – You’re updating a file.
  – The power goes out.
  – Which of your changes survive?
  A) All B) None C) All since last save D) ???

Q: How do you write programs over a subsystem when it promises you only “???” ?
A: Very, very carefully!!

Why Study Databases??
• Shift from computation to information
  – always true for corporate computing
  – Web made this point for personal computing
  – more and more true for scientific computing
• Need for DBMS has exploded in recent years
  – Corporate: retail swipe/clickstreams, “customer relationship mgmt”, “supply chain mgmt”, “data warehouses”, etc.
  – Scientific: digital libraries, Human Genome project, NASA Mission to Planet Earth, physical sensors, grid physics network
• DBMS encompasses much of CS in a practical discipline
  – OS, languages, theory, AI, multimedia, logic
  – Yet traditional focus on real-world apps

What’s the intellectual content?
• representing information
  – data modeling
• languages and systems for querying data
  – complex queries with real semantics*
  – over massive data sets
• concurrency control for data manipulation
  – controlling concurrent access
  – ensuring transactional semantics
• reliable data storage
  – maintain data semantics even if you pull the plug
* semantics: the meaning or relationship of meanings of a sign or set of signs

About the course: Enrollment
• Overenrollment again across CS
  – The CS dept administration “makes the call”
  – TAs & Prof. cannot help!!
  – Course is overbooked, drops won’t free space
  – Want to appeal?
    • See http://www.cs.berkeley.edu/~msasson/enrollment.html for more info
    • Appeal forms need to be in by 1/25
• CS186 is planned for Every Semester
  – Your priority goes up over time

About the course: Workload
• Projects with a “real world” focus:
  – Modify the internals of a “real” open-source database system: PostgreSQL
    • Serious C system hacking
    • Measure the benefits of our changes
  – Build a web-based e-commerce application (w/ PostgreSQL, Apache & PHP): SQL + PHP
• Other homework assignments and/or quizzes
• Exams: 1 Midterm & 1 Final
• Projects to be done in groups of 3
  – Pick your partners ASAP
• The course is “front-loaded”
  – most of the hard work is in the first half

About the Course - Administrivia
• http://inst.eecs.berkeley.edu/~cs186
• Prof. Office Hours:
  – 685 Soda Hall, M 2-3; Tues 11-12 (tentative!)
• TAs: Zhuang Li, Boon Thau Loo, Sailesh Krishnamurthy
  – Office Hours: TBA (check web page)
• Discussion Sections WILL meet this week
  – Note change to discussion section schedule!

About the Course - Administrivia
• Textbook
  – Ramakrishnan and Gehrke, 3rd Edition
• Grading, hand-in policies, etc. will be on Web Page
• Cheating policy: zero tolerance
  – We have the technology…
• Team Projects
  – Teams of 3, if one drops the other 2 finish it up
  – Peer evaluations.
    • Be honest! Feedback is important. Trend is more important than individual project.
• Class bulletin board - ucb.class.cs186
  – read it regularly and post questions/comments.
  – mail broadcast to all TAs will not be answered
  – mail to the cs186 course account will not be answered
    • It’s a spam disposal site

Rest of Today: A CS186 Infomercial
• A “free tasting” of things to come in this class:
  – data modeling
  – query languages
  – file systems & DBMSs
  – concurrent, fault-tolerant data management
  – DBMS architecture
• Next Time
  – The Relational Model
• Today’s lecture is from Chapter 1 in R&G

OS Support for Data Management
• Data can be stored in RAM
  – this is what every programming language offers!
  – RAM is fast, and random access
  – Isn’t this heaven?
• Every OS includes a File System
  – manages files on a magnetic disk
  – allows open, read, seek, close on a file
  – allows protections to be set on a file
  – drawbacks relative to RAM?

Database Management Systems
• What more could we want than a file system?
  – Simple, efficient ad hoc¹ queries
  – concurrency control
  – recovery
  – benefits of good data modeling
• S.M.O.P.²? Not really…
  – as we’ll see this semester
  – in fact, the OS often gets in the way!
¹ ad hoc: formed or used for specific or immediate problems or needs
² SMOP: Small Matter Of Programming

Describing Data: Data Models
• A data model is a collection of concepts for describing data.
• A schema is a description of a particular collection of data, using a given data model.
• The relational model
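The distinction between a data model and a schema may be easier to see in a small sketch. Here the data model is the relational one (tables, columns, keys) and the schema is one particular description written in it. The teams and games entities come from the earlier slide, but the column names, constraints, and helper functions below are hypothetical choices made for illustration, using Python's built-in sqlite3 module rather than anything prescribed by the course.

```python
import sqlite3

# A hypothetical schema: one description of "teams" and "games" in the relational model.
SCHEMA = """
CREATE TABLE teams (
    team_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE games (
    game_id   INTEGER PRIMARY KEY,
    home_team INTEGER NOT NULL REFERENCES teams(team_id),
    away_team INTEGER NOT NULL REFERENCES teams(team_id),
    played_on TEXT NOT NULL
);
"""


def open_db():
    """Open an in-memory database and install the schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default
    conn.executescript(SCHEMA)
    return conn


def column_names(conn, table):
    """Ask the DBMS to describe a table; `table` must be a trusted name, not user input."""
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def try_insert_game(conn, home, away, played_on):
    """Insert a game; return False if the schema's constraints reject it."""
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.execute(
                "INSERT INTO games (home_team, away_team, played_on) VALUES (?, ?, ?)",
                (home, away, played_on),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Because the schema is stored in the DBMS, the system can both report it back (column_names) and enforce it (a game that refers to a nonexistent team is rejected). A plain file system offers neither of these.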

