CORNELL CS 614 - Scalable Applications and Real Time Response

Scalable Applications and Real Time Response
Ashish Motivala
CS 614
April 17th 2001

Slide titles: Scalable Applications and Real Time Response; Real-time; Real problems need real-time; More real-time problems; Predictability; Predictability: Examples; Back to the paper; Role of coprocessor; IN coprocessor; Present coprocessor; Goals for coprocessor; SS7 experiment; IN coprocessor example; Options?; Results!!; Next try; Hand-coded scheme; Clever twists; Handling Failure and Overload; Results; Other settings with a strong temporal element; Load balancing in farms; Conclusions; Future directions in real-time; Dimensions of Scalability; Scalability; Approaches; Dangers; Technologies; You’ve Got Mail; Conventional Mail Servers; Porcupine’s Goals; Key Techniques and Relationships; Porcupine Architecture; Basic Data Structures; Porcupine Operations; Measurement Environment; Performance; How does Performance Scale?; Availability; Soft-state Reconstruction; How does Porcupine React to Configuration Changes?; Hard-state Replication; How Efficient is Replication?; Load balancing: Deciding where to store messages; How Well does Porcupine Support Heterogeneous Clusters?; Claims; Retrospect; Some Other Interesting Papers

Scalable Applications and Real Time Response
Using Group Communication Technology to Implement a Reliable and Scalable Distributed IN Coprocessor; Roy Friedman and Ken Birman; TINA 1996.
Manageability, availability and performance in Porcupine: a highly scalable, cluster-based mail service; Yasushi Saito, Brian N. Bershad and Henry M. Levy; Proceedings of the 17th ACM Symposium on Operating Systems Principles, 1999, Pages 1-15.

Real-time
Two categories of real-time
– When an action needs to be predictably fast, i.e. critical applications.
– When an action must be taken before a time limit passes.
More often than not real-time doesn’t mean “as fast as possible” but means “slow and steady”.

Real problems need real-time
Air Traffic Control, Free Flight
– when planes are at various locations.
Medical Monitoring, Remote Tele-surgery
– doctors talk about how patients responded after drug was given, or change therapy after some amount of time.
Process control software, Robot actions
– a process controller runs factory floors by coordinating machine tool activities.

More real-time problems
Video and multi-media systems
– synchronous communication protocols that coordinate video, voice, and other data sources
Telecommunications systems
– guarantee real-time response despite failures, for example when switching telephone calls

Predictability
If this is our goal…
– Any well-behaved mechanism may be adequate
– But we should be careful about uncommon disruptive cases
  • For example, cost of failure handling is often overlooked
  • Risk is that an infrequent scenario will be very costly when it occurs

Predictability: Examples
Probabilistic multicast protocol
– Very predictable if our desired latencies are larger than the expected convergence
– Much less so if we seek latencies that bring us close to the expected latency of the protocol itself

Back to the paper
Telephone networks need a mixture of properties
– Real-time response
– High performance
– Stable behavior even when failures and recoveries occur
Can we use our tools to solve such a problem?

Role of coprocessor
A simple database
– Switch does a query
  • How should I route a call to 1800-327-2777 from 607-266-8141?
  • Reply: use output line 6
– Time limit of 100ms on transaction
Call ID, call conferencing, automatic transferring, voice menus, etc
Update database

IN coprocessor
[Figure labels: SS7 switch (x4), IN coprocessor]

[Figure labels: SS7 switch (x4), coprocessor (x4)]

Present coprocessor
Right now, people use hardware fault-tolerant machines for this
– E.g. Stratus “pair and a spare”
– Mimics one computer but tolerates hardware failures
– Performance an issue?

Goals for coprocessor
Requirements
– Scalability: ability to use a cluster of machines for the same task, with better performance when we use more nodes
– Fault-tolerance: a crash or recovery shouldn’t disrupt the system
– Real-time response: must satisfy the 100ms limit at all times
Downtime: any period when a series of requests might all be rejected
Desired: 7 to 9 nines availability

SS7 experiment
Horus runs the “800 number database” on a cluster of processors next to the switch
Provide replication management tools
Provide failure detection and automatic configuration

IN coprocessor example
[Figure labels: SS7 switch, EA, EA]
Query Element (QE) processors do the number lookup (in-memory database).
Goals: scalable memory without loss of processing performance as number of nodes is increased
Switch itself asks for help when remote number call is sensed
External adaptor (EA) processors run the query protocol
Primary backup scheme adapted (using small Horus process groups) to provide fault-tolerance with real-time guarantees

Options?
A simple scheme:
– Organize nodes as groups of 2 processes
– Use virtual synchrony multicast
  • For query
  • For response
  • Also for updates and membership tracking

IN coprocessor example
[Figure labels on each step: SS7 switch, EA, EA; step 3 adds "Think", "Think"]
Step 1: Switch sees incoming request
Step 2: Switch waits while EA procs. multicast request to group of query elements (“partitioned” database)
Step 3: The query elements do the query in duplicate
Step 4: They reply to the group of EA processes
Step 5: EA processes reply to switch, which routes call

Results!!
Terrible performance!
– Solution has 2 Horus multicasts on each critical path
– Experience: about 600 queries per second but no more
Also: slow to handle failures
– Freezes for as long as 6 seconds
Performance doesn’t improve much with scale either

Next try
Consider taking Horus off the critical path
Idea is to continue using Horus
– It manages groups
– And we use it for updates to the database and for partitioning the QE set
But no multicasts on critical path
– Instead use a hand-coded scheme
Use Sender Ordering (or FIFO) instead of Total Ordering

Hand-coded scheme
Queue up a set of requests from an EA to a QE
Periodically (15 ms), sweep the set into a message and send as a batch
Process queries also as a batch
Send the batch of replies back to EA

Clever twists
Split into a
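
The hand-coded scheme described above (queue requests from an EA to a QE, sweep the queue into one batch every 15 ms, process the batch, return a batch of replies) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the paper: the names `RequestBatcher`, `process_batch` and `routes` are hypothetical, the sweep is triggered by an explicit call rather than a real timer, and returning -1 for an unknown number is an invented convention.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

SWEEP_INTERVAL_MS = 15  # sweep period quoted on the slide


@dataclass
class RequestBatcher:
    """Hypothetical EA-side queue of queries bound for one QE."""
    pending: List[Tuple[int, str]] = field(default_factory=list)

    def enqueue(self, request_id: int, dialed_number: str) -> None:
        # Requests are appended in arrival order, so each batch keeps
        # sender (FIFO) ordering.
        self.pending.append((request_id, dialed_number))

    def sweep(self) -> List[Tuple[int, str]]:
        # Intended to run once per SWEEP_INTERVAL_MS: drain everything
        # queued so far into a single batch message.
        batch, self.pending = self.pending, []
        return batch


def process_batch(batch: List[Tuple[int, str]],
                  routes: Dict[str, int]) -> List[Tuple[int, int]]:
    """Hypothetical QE side: answer every query, reply as one batch."""
    # -1 marks "no route known"; this convention is an assumption.
    return [(req_id, routes.get(number, -1)) for req_id, number in batch]


# Usage: two queries arrive within one sweep interval.
routes = {"1800-327-2777": 6}  # example route taken from the slide
batcher = RequestBatcher()
batcher.enqueue(1, "1800-327-2777")
batcher.enqueue(2, "1800-000-0000")
replies = process_batch(batcher.sweep(), routes)
```

Batching trades a bounded amount of added latency (up to one sweep interval per hop) for far fewer messages, which is presumably why the interval is kept small relative to the 100 ms limit.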

