Berkeley COMPSCI 262A - Segment-oriented Recovery - D2400782

Home> Schools> University of California, Berkeley> Computer Science (COMPSCI) > COMPSCI 262A> Segment-oriented Recovery

DOC PREVIEW

Berkeley COMPSCI 262A - Segment-oriented Recovery

School name University of California, Berkeley

Course Compsci 262a- Advanced Topics in Computer Systems

Pages 5

This preview shows page 1-2 out of 5 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 5 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 5 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 5 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

Segment-oriented Recovery1Advanced Topics in Operating Systems, CS262aProf. Eric A. Brewer (with help from Rusty Sears)Segment-oriented RecoveryARIES works great but is 20+ years old and has some problems:o viewed as very complexo no available implementations with source codeo part of a monolithic DBMS: can use reuse transactions for other systems?o LSN in the page breaks up large objects (Fig 1), prevents efficient I/Oo DBMS seems hard to scale for cloud computing (except by partitioning)Original goals (Stasis):o build an open-source transaction system with full ARIES-style steal/no-forcetransactionso try to make the DBMS more “layered” in the OS styleo try to support a wider array of transactional systems: version control, file systems,bioinformatics or science databases, graph problems, search engines, persistent objects,...o competitive performanceThese goals were mostly met, although the open-source version needs more users and we have only triedsome of the unusual applications.Original version was basically straight ARIES; development led to many insights and the new versionbased on segments.Segment-oriented recovery (SOR)A segment is just a range of byteso may span multiple pages o may have many per pageo analogous to segments and pages in computer architecture (but arch uses segments forprotection boundaries, we use them for recovery boundaries; in both, pages are fixedsize and the unit of movement to disk)Key idea: change two of the core ARIES design points:o Recover segments not pageso No LSNs on pages• ARIES: LSN per page enable exactly once redo semantics• SOR: estimate the LSN conservatively (older) => at least once semantics [but must now limit the actions we can take in redo]The Big Four Positives of SOR:21) Enable DMA and/or zero-copy I/Oo No LSNs per page mean that large objects are contiguous on disko No “segmentation and reassembly” to move objects to/from the disko No need to involve the CPU in object I/O (very expensive); use DMA instead2) Fix persistent objectso Problem: consider page with multiple objects, each of which has an in memoryrepresentation (e.g. a C++ or Java object) • Suppose we update object A and generate a log entry with LSN=87• Next we update object B (on the same page), generate its log entry with LSN=94, and write it back to the disk page, updating the page LSN• This pattern breaks recovery: the new page LSN (94) implies that the page reflects redo log entry 87, but it does not.• ARIES “solution”: disk pages (in memory) must be updated on every log write; this is “write through caching” -- all updates are written through to the buffer manager pageo SOR solution: • there is no page LSN and thus no error• Buffer manager is written on cache eviction -- “write back caching”. This implies much less copying/marshalling for hot objects.• This is like “no force” between the app cache and the buffer manager (!)3) Enable high/low priority transactions (log reordering on the fly)o With pages, we assign the LSN atomically with the page write [draw fig 2]• not possible to reorder log entries at all• in theory, two independent transactions could have their log entries reordered based on priority or because they are going to different log disks (or log servers)o SOR: do not need to know LSN at the time of the page update (just need to make sure itis assigned before you commit, so that it is ordered before later transactions; this istrivial to ensure -- just use normal locking and follow WAL protocol)• High priority updates: move log entry to the front of the log or to high priority “fast” log disk• Low priority updates go to end of log as usual34) Decouple the components; enable cloud computing versionso background: “pin” pages to keep them from being stolen (short term), “latch” pages toavoid race conditions within a pageo subtle problem with pages: must latch the page across the call to log manager (in orderto get an LSN atomically)o SOR has no page LSN and in fact no shared state at all for pages => no latch neededo SOR decouple three different things:• App <-> Buffer manager: this is the write-back caching described above: only need to interact on evic-tion, not on each update• Buffer manager <-> log manager: no holding a latch across the log manager call; log manager call can now by asynchronous and batched together• Segments can be moved via zero-copy I/O directly, with no meta data (e.g. page LSN) and no CPU involvement. Simplifies archiving and reading large objects (e.g. photos).o Hope: someone (ideally in CS262) will build a distributed transaction service usingSOR• Apps, Buffer Manager, Log Manager, Stable storage could all be different clusters• Performance: (fig 11): 1-3 orders of magnitude difference for distributed transactionsPhysiological Redo (review):o Redos are applied exactly once (using the page LSN)o Combination of physical and logical logging• physical: write pre- or post-images (or both)• logical: write the logical operation (“insert”)o Physiological:4• redos are physical• normal undos are like redos and set a new LSN (does not revert to the old LSN -- wouldn’t work given multiple objects per page!)• to enable more concurrency, do not undo structural changes of a B-Tree (or other index); instead of a physical undo, issue a new logical undo that is the inverse operation. Enables concurrency because we can hold short locks for the structural change rather than long locks (until the end of the transaction)o Slotted pages: add an array of offsets to each page (slots), then store records with a slotnumber and use the array to look up the current offset for that record. This allowschanging the page layout without any log entries.SOR Redo:o Redos may applied more than once; we go back farther in time than strictly necessaryo Redos must be physical “blind writes” -- write content that do not depend on theprevious contentso Undos can still be logical for concurrencyo Slotted page layout changes require redo loggingCore SOR redo phase:o periodically write estimated LSNs to log (after you write back a page)o start from disk version of the segment (or from snapshot or whole segment write)o replay all redos since estimated LSN (worst case the beginning of the truncated log),even though some might have been applied alreadyo for all bytes of the segment: either it was correct on disk and not changed or it waswritten during recovery in order by time

View Full Document