Introduc)on*to*Database*Systems*CSE*444*Lecture*#1*March*29,*2010*1 Dan Suciu -- 444 Spring 2010Staff*• Instructor:**Dan*Suciu*– CSE*662,*[email protected]*Office*hours:**Mondays*1:30‐2:30*• Grad*TA:*Jessica*Leung*[email protected]**• UGrad*TA:*Daniel*Swisher,*Derek*Cheng*2 Dan Suciu -- 444 Spring 2010Communica)ons*• Web*page:*hUp://www.cs.washington.edu/444/*– Lectures*will*be*available*here*– The*project*descrip)on*will*be*here**– Homework*will*be*posted*here*• Mailing*list:*– Announcements,*group*discussions*– Please*subscribe*3 Dan Suciu -- 444 Spring 2010Textbook(s)*Main*textbook,*available*at*the*bookstore:*• Database'Systems:'The'Complete'Book,*Hector*Garcia‐Molina,**Jeffrey*Ullman,*Jennifer*Widom*4 Most*important:*COME*TO*CLASS*&*SECTIONS*!*Dan Suciu -- 444 Spring 2010Other*Texts*Available*at*the*Engineering*Library*(not*on*reserve):*• Database'Management'Systems,*Ramakrishnan*• Fundamentals'of'Database'Systems,*Elmasri,*Navathe*• Founda;ons'of'Databases,*Abiteboul,*Hull,*Vianu***• Data'on'the'Web,*Abiteboul,*Buneman,*Suciu*• XQuer y'from'the'Experts,*Katz,*Ed.*5 Dan Suciu -- 444 Spring 2010Course*Format*• Lectures*MWF,*1:30‐2:20,*MGH*241*• Quiz*sec)ons:**– Th*8:30‐9:20,*JHN*026*‐*Jessica*– Th*9:30‐10:20,*JHN*026*‐*Jessica*– Th*9:30‐10:20,*MEB*235*–*Daniel/Derek/Dan*(alternate*?)*• 4*Mini‐projects*• 4*homework*assignments*• Midterm,*final*6 Dan Suciu -- 444 Spring 2010Grading*• Homework*30%*• Project********30%*• Midterm*****15%*• Final***********25%*7 Dan Suciu -- 444 Spring 20104*Mini*Project*1. SQL*(already*posted)*2. SQL*in*Java*3. Database*tuning*4. Map/reduce*8 Due:*Wednesday’s*every*other*week*by*11:59pm*Dan Suciu -- 444 Spring 20104*Homework*Assignments*1. Conceptual*Design*2. Transac)ons*3. Query*execu)on*and*op)miza)on*4. XML*9 Due:*Wednesday’s*every*other*week*in*class:*12:30pm*Dan Suciu -- 444 Spring 2010Exams*Midterm:*• Friday,*April*30,*12:30‐1:20*(in*class:*MGH241)*Final:**• Thursday,*June*10,*8:30‐10:20*(MGH241)*10 Dan Suciu -- 444 Spring 2010Rest*of*Today’s*Lecture*Overview*of*DBMS*11 Dan Suciu -- 444 Spring 2010Database*What*is*a*database*?*Give*examples*of*databases*12 Dan Suciu -- 444 Spring 2010Database*What*is*a*database*?*• A*collec)on*of*files*storing*related*data*Give*examples*of*databases*• Accounts*database;*payroll*database;*UW’s*students*database;*Amazon’s*products*database;*airline*reserva)on*database*13 Dan Suciu -- 444 Spring 2010Database*Management*System*What*is*a*DBMS*?*Give*examples*of*DBMS*14 Dan Suciu -- 444 Spring 2010Database*Management*System*What*is*a*DBMS*?*• A'big'C'program'wriFen'by'someone'else'that'allows'us'to'manage'efficiently'a'large'database'and'allows'it'to'persist'over'long'periods'of';me'Give*examples*of*DBMS*• DB2*(IBM),*SQL*Server*(MS),*Oracle,*Sybase*• MySQL,*Postgres,*…*15 SQL'for'Nerds,*Greenspun,*hUp://philip.greenspun.com/sql/*(Chap*1,2)*Dan Suciu -- 444 Spring 2010Market*Shares*From*2006*Gartner*report:*• IBM:*21%*market*with*$3.2BN*in*sales*• Oracle:*47%*market*with*$7.1BN*in*sales*• Microsou:*17%*market*with*$2.6BN*in*sales*16 Dan Suciu -- 444 Spring 2010An*Example*The*Internet*Movie*Database*hUp://ww w.imdb.com*• En))es:**Actors*(800k),*Movies*(400k),*Directors,*…*• Rela)onships:*who*played*where,*who*directed*what,*…*17 Dan Suciu -- 444 Spring 2010Tables*18 Actor: Casts: Movie: id fName lName gender 195428 Tom Hanks M 645947 Amy Hanks F .*.*. id Name year 337166 Toy*Story 1995 .*.*. .*.*. .*.. pid mid 195428 337166 .*.*. Dan Suciu -- 444 Spring 2010SQL*19 SELECT * FROM Actor Dan Suciu -- 444 Spring 2010SQL*20 SELECT count(*) FROM Actor This is an aggregate query Dan Suciu -- 444 Spring 2010SQL*21 SELECT * FROM Actor WHERE lName = ‘Hanks’ This is a selection query Dan Suciu -- 444 Spring 2010SQL*22 SELECT * FROM Actor, Casts, Movie WHERE lname='Hanks' and Actor.id = Casts.pid and Casts.mid=Movie.id and Movie.year=1995 817k*actors,*3.5M*casts,**380k*movies;*How*can*it*be*so*fast*?**This query has selections and joins Dan Suciu -- 444 Spring 2010Op)miza)on*and*Query*Execu)on*• Indexes:*on*Actor.lName,*on*Movie.year*• Query*op)miza)on*– Access*path*selec)on*– Join*order*• Sta)s)cs*• Mul)ple*implementa)ons*of*joins*23 Dan Suciu -- 444 Spring 2010Recover y*• Transfer*$100*from*account*#4662*to*#7199:*24 X = Read(Account_1); X.amount = X.amount - 100; Write(Account_1, X); Y = Read(Account_2); Y.amount = Y.amount + 100; Write(Account_2, Y); Dan Suciu -- 444 Spring 2010Recover y*• Transfer*$100*from*account*#4662*to*#7199:*25 What*is*the*problem*?*X = Read(Account_1); X.amount = X.amount - 100; Write(Account_1, X); Y = Read(Account_2); Y.amount = Y.amount + 100; Write(Account_2, Y); CRASH ! Dan Suciu -- 444 Spring 2010Concurrency*Control*• How*to*overdrau*your*account:*26 X = Read(Account); if (X.amount > 100) { dispense_money( ); X.amount = X.amount – 100; } else error(“Insufficient funds”); X = Read(Account); if (X.amount > 100) { dispense_money( ); X.amount = X.amount – 100; } else error(“Insufficient funds”); User*1* User*2*What*can*go*wrong*?*Dan Suciu -- 444 Spring 2010Transac)ons*• Recovery*• Concurrency*control*ACID*=*• Atomicity**(*=*recovery)*• Consistency*• Isola)on***(*=*concurrency*control)*• Durability*27 Dan Suciu -- 444 Spring 2010Client/Server*Database*Architecture*• There*is*one*single*ser ver*that*stores*the*database*(called*DBMS*or*RDBMS):*– Usually*a*beefed‐up*system,*e.g.*IISQLSRV1*– But*can*be*your*own*desktop…*– …*or*a*huge*cluster*running*a*parallel*dbms*• Many*clients*running*apps*and*connec)ng*to*DBMS*– E. g.*Microsou’s*Management*Studio*– Or*psql*(for*postgres)*– More*realis)cally*some*Java*or*C++*program*• The*client*“talks”*to*the*server*using*JDBC*protocol*28 Dan Suciu -- 444 Spring 2010Data*Management*• Data*Management*is*more*than*databases*!*A*Data*Management*QUIZ:*• Alice*sends*Bob*in*random*order*all*the*numbers*1,*2,*3,*…,*100000000000000000000*• She*does*not*repeat*any*number*• But*she*misses*exactly'one*•
View Full Document