Rutgers University CS 417 - Clusters

Clusters
Paul Krzyzanowski
[email protected]
Distributed Systems

Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License.

Designing highly available systems
Incorporate elements of fault-tolerant design
– Replication, TMR
A fully fault-tolerant system will offer non-stop availability
– You can't achieve this!
Problem: expensive!

Designing highly scalable systems
SMP architecture
Problem: performance gain as f(# processors) is sublinear
– Contention for resources (bus, memory, devices)
– Also … the solution is expensive!

Clustering
Achieve reliability and scalability by interconnecting multiple independent systems
Cluster: a group of standard, autonomous servers configured so they appear on the network as a single machine
– approach: a single system image

Ideally…
• Bunch of off-the-shelf machines
• Interconnected on a high-speed LAN
• Appear as one system to external users
• Processors are load-balanced
– May migrate
– May run on different systems
– All IPC mechanisms and file access available
• Fault tolerant
– Components may fail
– Machines may be taken down
We don't get all that (yet)
(at least not in one package)

Clustering types
• Supercomputing (HPC)
• Batch processing
• High availability (HA)
• Load balancing

High Performance Computing (HPC)

The evolution of supercomputers
• Target complex applications:
– Large amounts of data
– Lots of computation
– Parallelizable application
• Many custom efforts
– Typically Linux + message-passing software + remote exec + remote monitoring

Clustering for performance
Example: one popular effort
– Beowulf
• Initially built to address problems associated with large data sets in Earth and Space Science applications
• From the Center of Excellence in Space Data & Information Sciences (CESDIS), a division of the University Space Research Association at the Goddard Space Flight Center

What makes it possible
• Commodity off-the-shelf computers are cost effective
• Publicly available software:
– Linux, GNU compilers & tools
– MPI (Message Passing Interface)
– PVM (Parallel Virtual Machine)
• Low-cost, high-speed networking
• Experience with parallel software
– Difficult: solutions tend to be custom

What can you run?
• Programs that do not require fine-grain communication
• Nodes are dedicated to the cluster
– Performance of nodes not subject to external factors
• Interconnect network isolated from the external network
– Network load is determined only by the application
• Global process ID provided
– Global signaling mechanism

Beowulf configuration
Includes:
– BPROC: Beowulf distributed process space
• Start processes on other machines
• Global process ID, global signaling
– Network device drivers
• Channel bonding, scalable I/O
– File system (file sharing is generally not critical)
• NFS root
• unsynchronized
• synchronized periodically via rsync

Programming tools: MPI
• Message Passing Interface
• API for sending/receiving messages
– Optimizations for shared memory & NUMA
– Group communication support
• Other features:
– Scalable file I/O
– Dynamic process management
– Synchronization (barriers)
– Combining results
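To make the send/receive API concrete, here is a minimal sketch in C of MPI point-to-point messaging plus a barrier. This example is not from the slides: the file name, the value sent, and the message tag are arbitrary, and it assumes any standard MPI implementation (e.g., MPICH or Open MPI) run with at least two processes.

/* mpi_pingpong.c - minimal sketch: one send, one receive, one barrier */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

    if (size < 2) {
        if (rank == 0) fprintf(stderr, "run with at least 2 processes\n");
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {
        value = 417;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* dest 1, tag 0 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Barrier(MPI_COMM_WORLD);   /* synchronization barrier across all ranks */
    MPI_Finalize();
    return 0;
}

Such a program is typically built with mpicc and launched with mpirun -np 2 (or through the cluster's batch system).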
Programming tools: PVM
• Software that emulates a general-purpose heterogeneous computing framework on interconnected computers
• Presents a view of virtual processing elements
– Create tasks
– Use global task IDs
– Manage groups of tasks
– Basic message passing
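For comparison, here is a rough sketch of a PVM 3 master that exercises those primitives: it spawns a task, sends it a message, and waits for the reply. This is illustrative only; the "worker" executable and the message tags are hypothetical.

/* pvm_master.c - rough sketch of a PVM 3 master task */
#include <stdio.h>
#include <pvm3.h>

int main(void)
{
    int mytid = pvm_mytid();    /* enroll this process in the virtual machine */
    int child, n;

    /* Spawn one instance of a worker task somewhere in the virtual machine. */
    if (pvm_spawn("worker", (char **)0, PvmTaskDefault, "", 1, &child) != 1) {
        fprintf(stderr, "pvm_spawn failed\n");
        pvm_exit();
        return 1;
    }

    /* Pack an integer into the active send buffer and send it with tag 1. */
    n = 42;
    pvm_initsend(PvmDataDefault);
    pvm_pkint(&n, 1, 1);
    pvm_send(child, 1);

    /* Block for the worker's reply (tag 2) and unpack the result. */
    pvm_recv(child, 2);
    pvm_upkint(&n, 1, 1);
    printf("task %x (spawned by %x) returned %d\n", child, mytid, n);

    pvm_exit();                 /* leave the virtual machine */
    return 0;
}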


Beowulf programming tools
• PVM and MPI libraries
• Distributed shared memory
– Page-based: software-enforced ownership and consistency policy
• Cluster monitor
• Global ps, top, uptime tools
• Process management
– Batch system
– Write software to control synchronization and load balancing with MPI and/or PVM
– Preemptive distributed scheduling: not part of Beowulf (two packages: Condor and Mosix)

Another example
• Rocks Cluster Distribution
– Based on CentOS Linux
– Mass installation is a core part of the system
• Mass re-installation for application-specific configurations
– Front-end central server + compute & storage nodes
– Rolls: collections of packages
• Base roll includes PBS (Portable Batch System), PVM (Parallel Virtual Machine), MPI (Message Passing Interface), job launchers, …

Another example
• Microsoft HPC Server 2008
– Windows Server 2008 + clustering package
– Systems management
• Management Console: plug-in to the System Center UI with support for Windows PowerShell
• RIS (Remote Installation Service)
– Networking
• MS-MPI (Message Passing Interface)
• ICS (Internet Connection Sharing): NAT for cluster nodes
• Network Direct RDMA (Remote DMA)
– Job scheduler
– Storage: iSCSI SAN and SMB support
– Failover support

Batch Processing

Batch processing
• Common application: graphics rendering
– Maintain a queue of frames to be rendered
– Have a dispatcher remotely exec the rendering process
• Virtually no IPC needed
• Coordinator dispatches jobs
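To illustrate single-queue work distribution, the sketch below shows a coordinator that pulls frame numbers off a queue and remotely execs a render job on whichever node is idle. It is not any real render farm's software: the host names, the "render" command, and the frame count are hypothetical.

/* dispatch.c - sketch of a single-queue coordinator for a render farm */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

#define NFRAMES 100
#define NHOSTS  4
static const char *hosts[NHOSTS] = { "node01", "node02", "node03", "node04" };

/* Remotely exec one render job on the given node; return the child's pid. */
static pid_t launch(int frame, const char *host)
{
    pid_t pid = fork();
    if (pid == 0) {
        char arg[32];
        snprintf(arg, sizeof arg, "--frame=%d", frame);
        execlp("ssh", "ssh", host, "render", arg, (char *)NULL);
        _exit(127);                     /* exec failed */
    }
    return pid;
}

int main(void)
{
    pid_t busy[NHOSTS] = { 0 };   /* pid of the job on each host (0 = idle) */
    int next = 1;                 /* head of the single frame queue */
    int h, active;
    pid_t done;

    for (;;) {
        /* Dispatch: hand every idle node the next frame from the queue. */
        for (h = 0; h < NHOSTS; h++)
            if (busy[h] == 0 && next <= NFRAMES)
                busy[h] = launch(next++, hosts[h]);

        /* Finished when the queue is empty and no jobs remain in flight. */
        active = 0;
        for (h = 0; h < NHOSTS; h++)
            if (busy[h] != 0)
                active = 1;
        if (!active && next > NFRAMES)
            break;

        /* Wait for any job to complete and mark its node idle again. */
        done = wait(NULL);
        for (h = 0; h < NHOSTS; h++)
            if (busy[h] == done)
                busy[h] = 0;
    }
    return 0;
}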

Single-queue work distribution
Render farms:
– Pixar:
• 1,024 2.8 GHz Xeon processors running Linux and RenderMan
• 2 TB RAM, 60 TB disk space
• Custom Linux software for articulating, animating/lighting (Marionette), scheduling (Ringmaster), and rendering (RenderMan)
• Cars: each frame took 8 hours to render; consumes ~32 GB of storage on a SAN
– DreamWorks:
• >3,000 servers and >1,000 Linux desktops: HP xw9300 workstations and HP DL145 G2 servers with 8 GB/server
• Shrek 3: 20 million CPU render hours; Platform LSF used for scheduling + Maya for modeling + Avid for editing + Python for pipelining; the movie uses 24 TB of storage

Single-queue work distribution
Render farms:
– ILM:
• 3,000-processor (AMD) render farm; expands to 5,000 by harnessing desktop machines
• 20 Linux-based SpinServer NAS storage systems and 3,000 disks from Network Appliance
• 10 Gbps Ethernet
– Sony Pictures' Imageworks:
• Over 1,200 processors
• Dell and IBM workstations
• Almost 70 TB of data for Polar Express

Batch Processing
OpenPBS.org:
– Portable Batch System
– Developed by Veridian MRJ for NASA
• Commands
– Submit job scripts
• Submit interactive jobs
• Force a job to run
– List jobs
– Delete jobs
– Hold jobs
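For reference, these operations correspond to the standard PBS command-line tools (typical names; exact options vary across PBS versions): qsub script.sh submits a job script, qsub -I requests an interactive job, qrun forces a queued job to run (an operator command), qstat lists jobs, qdel deletes a job, and qhold / qrls hold and release a job.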

Load Balancing for the web

Functions of a load balancer
• Load balancing
• Failover
• Planned outage management

Redirection
• Simplest technique
• HTTP REDIRECT status code
• The client contacts www.mysite.com, which replies with a REDIRECT naming www03.mysite.com; the client reissues the request there
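A sketch of that exchange, using the host names from the slide (302 Found is one of the HTTP 3xx redirect status codes a server might return):

Client request:
GET / HTTP/1.1
Host: www.mysite.com

Server reply:
HTTP/1.1 302 Found
Location: http://www03.mysite.com/

The client then repeats the request against www03.mysite.com and talks to that server directly.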