Berkeley COMPSCI 294 - Scalable Structured Data Storage for Web 2.0

School: University of California, Berkeley
Course: COMPSCI 294 - Special Topics

UC Berkeley
Scalable Structured Data Storage for Web 2.0
Michael Armbrust, David Zhu, Barret Rhoden

Objectives
• To provide a framework for Web applications to scale to YouTube or MySpace sizes
  – Use an alternative data store more suitable for typical Web workloads (HBase)
    • Willing to trade consistency for scalability and availability
  – Integrate with Ruby on Rails to give developers a clean interface to the data store
    • Use declarative constructs from Rails to express the application's needs from the data store

State of the Art
• Current status of data storage for Web applications:
  – Large relational databases running on expensive hardware
  – Manual horizontal and vertical partitioning of data
  – Requires redesign at each scaling milestone
• Other work and differences:
  – C-Store/Vertica: Maintains full SQL semantics
  – Dynamo: Optimized for Amazon's writes
  – PNUTS: Hosted service, work in progress

Our Idea
• Use a large-scale distributed database suitable for Web applications
  – Relaxed consistency, no ad-hoc queries
  – Can run on 1,000+ shared-nothing commodity servers
• Interface with an ActiveRecord-like layer in Ruby on Rails
  – Provides simple relationships and consistency guarantees between models
    • has_many
    • belongs_to
    • searchable_by (for full-text search)
• Pre-compute joins for quick reads

Risks
• HBase Performance
  – HBase is under development and may have implementation problems
• Rails Scaling
  – Once we successfully remove the data store bottleneck from Rails, we may discover unknown bottlenecks at the Web application processing layer

Plan

Workload
• Simple app with simple workload
• Complicated app
  – Joins
  – Sessions
  – Access locality
• Full-fledged app
  – Possibly use Sun's Rails benchmark in addition to our workload

ActiveRecord
• Talk to HBase
  – Single lookup
  – No joins
• Three basic joins
• Validations and prefetching

Data Store
• Scalability of HBase
• Determine comparisons with other stores
  – Define data layout
  – Indexing options
• Hard off-line queries

Key: Week 8, Week 10 (Mid-Course), Week 14
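
The "Our Idea" slide pairs declarative relationships (has_many, belongs_to) with joins that are pre-computed so reads stay fast. The following is a minimal sketch of how that combination might look, not the authors' implementation. It uses plain Ruby with an in-memory hash standing in for the data store. The names ToyStore, ToyModel, STORE, and the inverse: option are hypothetical and are not part of HBase, Rails, or ActiveRecord. The assumed design is that each child write also appends to a joined row keyed by the parent, so reading a parent's children is a single lookup.

```ruby
# In-memory stand-in for a key-value data store (hypothetical, NOT the HBase API).
class ToyStore
  def initialize
    @rows = Hash.new { |hash, table| hash[table] = {} }
  end

  def put(table, key, value)
    @rows[table][key] = value
  end

  def get(table, key)
    @rows[table][key]
  end
end

STORE = ToyStore.new

# Hypothetical ActiveRecord-like base class with declarative relationships.
class ToyModel
  attr_reader :id, :attrs

  def initialize(id, attrs = {})
    @id = id
    @attrs = attrs
  end

  def self.table_name
    name.downcase
  end

  # Per-class list of callbacks that maintain joined rows at write time.
  def self.join_writers
    @join_writers ||= []
  end

  # Child side: every save also appends the record to the parent's joined row.
  # This sketch does not deduplicate, so saving the same record twice appends twice.
  def self.belongs_to(parent, inverse:)
    join_table = "#{parent}_#{inverse}"
    writer = lambda do |record|
      parent_id = record.attrs.fetch(:"#{parent}_id")
      joined = STORE.get(join_table, parent_id) || []
      STORE.put(join_table, parent_id, joined + [record.attrs.merge(id: record.id)])
    end
    join_writers << writer
  end

  # Parent side: reading the children is one lookup on the joined row.
  def self.has_many(children)
    join_table = "#{table_name}_#{children}"
    define_method(children) { STORE.get(join_table, id) || [] }
  end

  def save
    STORE.put(self.class.table_name, id, attrs)
    self.class.join_writers.each { |writer| writer.call(self) }
    self
  end
end

class User < ToyModel
  has_many :posts
end

class Post < ToyModel
  belongs_to :user, inverse: :posts
end

User.new(1, { name: "alice" }).save
Post.new(10, { user_id: 1, title: "hello" }).save
Post.new(11, { user_id: 1, title: "again" }).save

titles = User.new(1).posts.map { |post| post[:title] }
p titles
```

The trade-off in this sketch matches the one the slides accept: writes do extra work and the joined row can briefly disagree with the child table, in exchange for reads that never need a join.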
