CS252 Graduate Computer Architecture
Lecture 10: Network 3: Clusters, Examples
February 16, 2001
Prof. David A. Patterson
Computer Science 252, Spring 2001
CS252/Patterson, Lec 10.1, 2/16/01

Outline: Review: Networking; Cluster; Cluster Drawbacks; Cluster Advantages; Addressing Cluster Weaknesses; Clusters and TPC Benchmarks; Putting It All Together: Google; Hardware Infrastructure; Google PCs; Reliability; CS 252 Administrivia; Google Performance: Serving; Google Performance: Crawling; Google Performance: Replicating Index; Colocation Sites; Google Performance: Total; Google Costs; Comparing Storage Costs: 1/2001; Putting It All Together: Cell Phones; Cell phone steps (protocol); Frequency Division Multiple Access (FDMA); Time Division Multiple Access (TDMA); Code Division Multiple Access (CDMA); Cell Phone Towers; If time permits: Amdahl's Law Paper

Review: Networking
• Protocols allow heterogeneous networking
 – Protocols allow operation in the presence of failures
 – Internetworking protocols used as LAN protocols => large overhead for a LAN
• Integrated circuits are revolutionizing networks as well as processors
 – A switch is a specialized computer
 – Faster networks plus slow overheads violate Amdahl's Law
• Wireless networking offers new challenges in bandwidth, mobility, reliability, ...

Cluster
• LAN switches => high network bandwidth and scaling became available from off-the-shelf components
• 2001 cluster = collection of independent computers using a switched network to provide a common service
• Many mainframe applications run on more "loosely coupled" machines rather than shared-memory machines (next chapter/week)
 – databases, file servers, Web servers, simulations, and multiprogramming/batch processing
 – Often need to be highly available, requiring error tolerance and repairability
 – Often need to scale

Cluster Drawbacks
• Cost of administering a cluster of N machines ~ administering N independent machines, vs. cost of administering a shared-address-space N-processor multiprocessor ~ administering 1 big machine
• Clusters are usually connected via the I/O bus, whereas multiprocessors are usually connected via the memory bus
• A cluster of N machines has N independent memories and N copies of the OS, but a shared-address multiprocessor allows 1 program to use almost all of the memory
 – DRAM prices have made memory costs so low that this multiprocessor advantage is much less important in 2001

Cluster Advantages
• Error isolation: separate address spaces limit contamination by an error
• Repair: easier to replace a machine without bringing down the system than in a shared-memory multiprocessor
• Scale: easier to expand the system without bringing down the application that runs on top of the cluster
• Cost: a large-scale machine has low volume => fewer machines over which to spread development costs, vs. leveraging high-volume off-the-shelf switches and computers
• Amazon, AOL, Google, Hotmail, Inktomi, WebTV, and Yahoo rely on clusters of PCs to provide services used by millions of people every day

Addressing Cluster Weaknesses
• Network performance: SANs, especially InfiniBand, may tie the cluster closer to memory
• Maintenance: separation of long-term storage and computation
• Computation maintenance:
 – Clones of identical PCs
 – 3 steps: reboot, reinstall OS, recycle
 – At $1000/PC, cheaper to discard than to figure out what is wrong and repair it?
• Storage maintenance:
 – With separate storage servers or file servers, is the cluster no worse?

Clusters and TPC Benchmarks
• A "shared nothing" database (no shared memory, no shared disks) is a match for a cluster
• 2/2001: of the top 10 TPC performance results, 6 of 10 are clusters (4 of the top 5)

Putting It All Together: Google
• Google: a search engine that scales at Internet growth rates
• Search engines: 24x7 availability
• Google 12/2000: 70M queries per day, an AVERAGE of 800 queries/sec all day long
• Response-time goal: < 1/2 sec per search
• Google crawls the WWW and puts up a new index every 4 weeks
• Stores a local copy of the text of WWW pages (snippet as well as cached copy of the page)
• 3 collocation sites (2 in California + 1 in Virginia)
• 6000 PCs, 12,000 disks: almost 1 petabyte!

Hardware Infrastructure
• VME rack: 19 in. wide, 6 feet tall, 30 inches deep
• Per side: 40 1-Rack-Unit (RU) PCs + 1 HP Ethernet switch (4 RU); each switch blade can contain 8 100-Mbit/s Ethernet interfaces or a single 1-Gbit Ethernet interface
• Front + back => 80 PCs + 2 Ethernet switches per rack
• Each rack connects to two 128 x 1-Gbit/s Ethernet switches
• Dec 2000: 40 racks at the most recent site

Google PCs
• 2 IDE drives, 256 MB of SDRAM, a modest Intel microprocessor, a PC motherboard, 1 power supply, and a few fans
• Each PC runs the Linux operating system
• Bought over time, so components were upgraded; populated between March and November 2000
 – microprocessors: 533 MHz Celeron to 800 MHz Pentium III
 – disks: capacity between 40 and 80 GB, speed 5400 to 7200 RPM
 – bus speed: either 100 or 133 MHz
 – Cost: ~$1300 to $1700 per PC
• A PC operates at about 55 watts
• Rack => 4500 watts, 60 amps

Reliability
• For 6000 PCs, 12,000 disks, 200 Ethernet switches:
• ~20 PCs need to be rebooted per day
• ~2 PCs/day hardware failure, or 2%-3% per year
 – 5% due to problems with motherboard, power supply, and connectors
 – 30% DRAM: bits change + errors in transmission (100 MHz)
 – 30% disks fail
 – 30% disks go very slow (3%-10% of expected bandwidth)
• 200 Ethernet switches: 2-3 failed in 2 years
• 6 Foundry switches: none failed, but 2-3 of 96 switch blades have failed (16 blades/switch)
• Collocation site reliability:
 – 1 power failure and 1 network outage per year per site
 – Bathtub curve for occupancy

CS 252 Administrivia
• Sign up for meetings, 12:00 to 2:00, Wed Feb 21
• Email project questionnaire Monday
• No lecture next Wednesday, Feb 21

Google Performance: Serving
• How big is a page returned by Google? ~16 KB
• Average bandwidth to serve searches:
 70,000,000 queries/day x 16,750 B x 8 bits/B / (24 x 60 x 60) = 9,378,880 Mbits / 86,400 secs = 108 Mbit/s

Google Performance: Crawling
• How big is the text of a WWW page? ~4000 B
• 1 billion pages searched
• Assume 7 days to crawl
• Average bandwidth to crawl:
 1,000,000,000 pages x 4000 B x 8 bits/B / (24 x 60 x 60 x 7) = 32,000,000 Mbits / 604,800 secs ≈ 53 Mbit/s
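The serving and crawling figures above are back-of-the-envelope products of the numbers quoted on the slides (16,750 B per result page, 4000 B of text per crawled page). A minimal sketch reproduces both, taking 1 Mbit as 10^6 bits:

```python
# Back-of-the-envelope check of the two average-bandwidth estimates.
# All inputs are the figures quoted on the slides.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def avg_bandwidth_mbits(total_bytes: float, seconds: float) -> float:
    """Average bandwidth in Mbit/s (1 Mbit = 10**6 bits)."""
    return total_bytes * 8 / 1e6 / seconds

# Serving: 70M queries/day, ~16,750 B per result page
serving = avg_bandwidth_mbits(70_000_000 * 16_750, SECONDS_PER_DAY)

# Crawling: 1 billion pages, ~4000 B of text each, 7 days to crawl
crawling = avg_bandwidth_mbits(1_000_000_000 * 4_000, 7 * SECONDS_PER_DAY)

print(f"serving:  {serving:.1f} Mbit/s")   # ~108 Mbit/s
print(f"crawling: {crawling:.1f} Mbit/s")  # ~53 Mbit/s
```

Note how modest both numbers are: sustained serving plus crawling traffic fits comfortably within the 1-Gbit/s links described on the Hardware Infrastructure slide.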
View Full Document
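The rack power figures on the Google PCs slide can be cross-checked the same way from the per-PC draw. The 110 V line voltage below is an assumption for illustration; the slides quote only watts and amps:

```python
# Sanity check of the per-rack power figures from the Google PCs slide.
# 80 PCs per rack at ~55 W each; the 110 V line voltage is an assumed
# value, not stated on the slides.

pcs_per_rack = 80
watts_per_pc = 55
line_voltage = 110  # assumed nominal US line voltage

pc_power = pcs_per_rack * watts_per_pc  # raw PC load in watts
pc_current = pc_power / line_voltage    # amps drawn by the PCs alone

print(f"PC load per rack: {pc_power} W ({pc_current:.0f} A at {line_voltage} V)")
```

The raw PC load comes to 4400 W (about 40 A at 110 V), below the 4500 W and 60 A quoted per rack; the difference plausibly covers the two Ethernet switches, fans, and provisioning margin.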