Data Models
  Conceptual representation of the data
Data Retrieval
  How to ask questions of the database
  How to answer those questions
Data Storage
  How/where to store data, how to access it
Data Integrity
  Manage crashes, concurrency
  Manage semantic inconsistencies

Transaction: A sequence of database actions enclosed within special tags
Properties:
  Atomicity: Entire transaction or nothing
  Consistency: Transaction, executed completely, takes database from one consistent state to another
  Isolation: Concurrent transactions appear to run in isolation
  Durability: Effects of committed transactions are not lost
Consistency: Transaction programmer needs to guarantee that
  DBMS can do a few things, e.g., enforce constraints on the data
Rest: DBMS guarantees

How does this relate to queries that we discussed?
  Queries don't update data, so durability and consistency are not relevant
  Would want concurrency
    Consider a query computing total balance at the end of the day
  Would want isolation
    What if somebody makes a transfer while we are computing the balance?
    Typically not guaranteed for such long-running queries
  TPC-C vs TPC-H

Assumptions:
  The system can crash at any time
  Similarly, the power can go out at any point
    Contents of the main memory won't survive a crash or power outage
  BUT… disks are durable. They might stop, but data is not lost.
    For now.
  Disks only guarantee atomic sector writes, nothing more
  Transactions are by themselves consistent
Goals:
  Guaranteed durability, atomicity
  As much concurrency as possible, while not compromising isolation and/or consistency
    Two transactions updating the same account balance… NO
    Two transactions updating different account balances… YES

States of a transaction
A simple solution called shadow copy
  Satisfies Atomicity, Durability, and Consistency, but no Concurrency
  Very inefficient

Make updates on a copy of the database.
Switch pointers atomically after done.
  Some text editors work this way
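The shadow-copy idea above (update a copy, then switch pointers atomically) can be sketched in a few lines. This is only an illustration under stated assumptions: the "database" is a single small JSON file, the "pointer switch" is a file rename via `os.replace` (atomic on POSIX when source and destination are on the same filesystem), and the name `shadow_copy_update` is hypothetical, not something from the slides.

```python
import json
import os
import tempfile


def shadow_copy_update(db_path, update):
    """Apply `update` to a copy of the database file, then switch atomically.

    Assumption: the whole database is one JSON file small enough to copy.
    """
    with open(db_path) as f:
        data = json.load(f)

    update(data)  # all changes touch the in-memory copy only

    # Write the shadow copy next to the original so the rename stays on one
    # filesystem (needed for the rename to be atomic).
    dir_name = os.path.dirname(os.path.abspath(db_path))
    fd, shadow_path = tempfile.mkstemp(dir=dir_name, suffix=".shadow")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # force the copy to disk before switching
        os.replace(shadow_path, db_path)  # the "pointer switch"
    except BaseException:
        # Abort: discard the shadow copy; the old database file is untouched.
        if os.path.exists(shadow_path):
            os.unlink(shadow_path)
        raise


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "db.json")
    with open(path, "w") as f:
        json.dump({"A": 100, "B": 50}, f)

    def transfer(data):
        data["A"] -= 50
        data["B"] += 50

    shadow_copy_update(path, transfer)
    with open(path) as f:
        print(json.load(f))  # {'A': 50, 'B': 100}
```

If `update` raises partway through, the original file is never modified, which is the all-or-nothing behaviour the scheme is after. The sketch copies the entire database on every update and allows only one updater at a time.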
Atomicity:
  As long as the DB pointer switch is atomic.
    Okay if DB pointer is in a single block
Concurrency:
  No.
Isolation:
  No concurrency, so isolation is guaranteed.
Durability:
  Assuming disk is durable (we will assume this for now).
Very inefficient:
  Databases tend to be very large. Making extra copies is not feasible. Further, no concurrency.

Concurrency control schemes
  A CC scheme is used to guarantee that concurrency does not lead to problems
  For now, we will assume durability is not a problem
    So no crashes
    Though transactions may still abort
Schedules
When is concurrency okay?
  Serial schedules
  Serializability

T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
Transactions:
  T1: transfers $50 from A to B
  T2: transfers 10% of A to B
Database constraint: A + B is constant (checking + saving accts)
Effect:   Before   After
  A        100       45
  B         50      105
Each transaction obeys the constraint.
This schedule does too.

A schedule is simply a (possibly interleaved) execution sequence of transaction instructions
Serial Schedule: A schedule in which transactions appear one after the other
  i.e., no interleaving
Serial schedules satisfy isolation and consistency
  Since each transaction by itself does not introduce inconsistency

Another "serial" schedule (T2 runs first, then T1):
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
Consistent?
  Constraint is satisfied.
Since each transaction is consistent, any serial schedule must be consistent
Effect:   Before   After
  A        100       40
  B         50      110

T1 and T2, interleaved:
T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
Is this schedule okay?
Let's look at the final effect…
Effect:   Before   After
  A        100       45
  B         50      105
Consistent.
So this schedule is okay too.

T1 and T2, interleaved:
T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
Is this schedule okay?
Let's look at the final effect…
Effect:   Before   After
  A        100       45
  B         50      105
Further, the effect is the same as that of serial schedule 1.
Called serializable

A "bad" schedule
Not consistent
T1 and T2, interleaved:
T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
Effect:   Before   After
  A        100       50
  B         50       60

A schedule is called serializable if its final effect is the same as that of a serial schedule
A serializable schedule is fine and does not result in an inconsistent database
  Since serial schedules are fine
Non-serializable schedules are unlikely to result in consistent databases
We will ensure serializability
  Typically relaxed in real high-throughput environments
Not possible to look at all n! serial schedules to check if the effect is the same
  Instead we ensure serializability by allowing or not allowing certain schedules
Conflict serializability
View serializability
View serializability allows more schedules

Two read/write instructions "conflict" if
  They are by different transactions
  They operate on the same data item
  At least one is a "write" instruction
Why do we care?
  If two read/write instructions don't conflict, they can be "swapped" without any change in the final effect
  However, if they conflict they CAN'T be swapped without changing the final effect

T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)

T1: read(A); A = A - 50; write(A); read(B); B = B + 50; write(B)
T2: read(A); tmp = A*0.1; A = A - tmp; write(A); read(B); B = B + tmp; write(B)
Effect: Before
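The effects quoted for the schedules, and the three-part conflict test, can be checked with a small replay sketch. Several things here are assumptions rather than content from the slides: the function names `run_schedule` and `conflicts` are hypothetical, integer division stands in for `A*0.1` so the numbers stay exact, and the two interleavings are ones chosen because they reproduce the quoted effects (the slides' own column layout is not visible in the text).

```python
def run_schedule(schedule, db):
    """Replay steps in order; each transaction has private local variables."""
    db = dict(db)
    local = {"T1": {}, "T2": {}}
    for txn, op, arg in schedule:
        v = local[txn]
        if op == "read":
            v[arg] = db[arg]       # copy the item into the local workspace
        elif op == "write":
            db[arg] = v[arg]       # publish the local value to the database
        else:                      # "compute": arg is a function on locals
            arg(v)
    return db


def conflicts(a, b):
    """Conflict test for two read/write steps (not for compute steps)."""
    (ta, op_a, item_a), (tb, op_b, item_b) = a, b
    return ta != tb and item_a == item_b and "write" in (op_a, op_b)


T1 = [
    ("T1", "read", "A"),
    ("T1", "compute", lambda v: v.update(A=v["A"] - 50)),
    ("T1", "write", "A"),
    ("T1", "read", "B"),
    ("T1", "compute", lambda v: v.update(B=v["B"] + 50)),
    ("T1", "write", "B"),
]
T2 = [
    ("T2", "read", "A"),
    ("T2", "compute", lambda v: v.update(tmp=v["A"] // 10)),  # A*0.1, exact
    ("T2", "compute", lambda v: v.update(A=v["A"] - v["tmp"])),
    ("T2", "write", "A"),
    ("T2", "read", "B"),
    ("T2", "compute", lambda v: v.update(B=v["B"] + v["tmp"])),
    ("T2", "write", "B"),
]

start = {"A": 100, "B": 50}
serial_1 = T1 + T2                               # T1 then T2
serial_2 = T2 + T1                               # T2 then T1
# Assumed interleaving: T1 finishes with A, T2 handles A, then both handle B.
interleaved = T1[:3] + T2[:4] + T1[3:] + T2[4:]
# Assumed "bad" interleaving: T2 reads A before T1 has written it back.
bad = T1[:2] + T2[:5] + T1[2:] + T2[5:]

if __name__ == "__main__":
    print(run_schedule(serial_1, start))      # {'A': 45, 'B': 105}
    print(run_schedule(serial_2, start))      # {'A': 40, 'B': 110}
    print(run_schedule(interleaved, start))   # {'A': 45, 'B': 105}
    print(run_schedule(bad, start))           # {'A': 50, 'B': 60}
    print(conflicts(T1[2], T2[0]))            # write(A) vs read(A): True
    print(conflicts(T1[3], T2[3]))            # read(B) vs write(A): False
```

In the first three runs A + B stays at 150, while the "bad" interleaving ends at 110 because T1 overwrites T2's update to A. In `interleaved`, each T2 step on A is swapped only past T1 steps on B, which do not conflict with it, so the final effect matches `serial_1`.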