Duke CPS 212 - Introduction to Distributed Systems

Introduction to Distributed * Systems

Outline
• about the course
• relationship to other courses
• the challenges of distributed systems
• distributed services
• *ility for distributed services

What is CPS 212 about?
What do I mean by “distributed information systems”?
• Distributed: a bunch of “computers” connected by “wires”
• Nodes are (at least) semi-autonomous... but run software to coordinate and share resources.
• Information systems: focus on systems to store/access/share data and operations on data.
  Move {data, computation} around the network and deliver it to the right places at the right times, safely and securely.
• Focus on Internet information services and their building blocks.
  The Web, Web Services, name services, resource sharing (Grid)
  Clustering, network storage, file sharing

Why are you here?
• You are a second-year (or later) CPS graduate student.
• You have taken CPS 210 and 214 and/or 216 and you want more.
  familiarity with TCP/IP networking, threads, and file systems
• Or: we have talked and we agreed that you should take the class.
• You are comfortable with concurrent programming in Java.
  (You want to do some Java programming labs.)
• You want to prepare for R/D in this exciting and important area.
  (You want to read about 15 papers and take some exams.)
• You want to get started...
  (Semester group project.)

Continuum of Distributed Systems
[Figure: a continuum of systems from small and fast to big and slow: multiprocessors, clusters, LAN, global Internet.]
• Small/fast end (Parallel Architectures, CPS 221): fast network, trusting hosts, coordinated; low latency, high bandwidth; secure, reliable interconnect; no independent failures; coordinated resources.
• Big/slow end (Networks, CPS 214): slow network, untrusting hosts, autonomy; high latency, low bandwidth; autonomous nodes; unreliable network; fear and distrust; independent failures; decentralized administration.
• Issues: naming and sharing, performance and scale, resource management.

The Challenges of Distributed Systems
• private communication over public networks
  who sent it (authentication), did anyone change it, did anyone see it
• building reliable systems from unreliable components
  nodes fail independently; a distributed system can “partly fail”
  Lamport: “A distributed system is one in which the failure of a machine I’ve never heard of can prevent me from doing my work.”
• location, location, location
  Placing data and computation for effective resource sharing, and finding it again once you put it somewhere.
• coordination and shared state
  What should we (the system components) do and when should we do it? Once we’ve all done it, can we all agree on what we did and when?

Information Systems vs. Databases
“Information systems” is more general than “relational databases”.
• Overlap: We study distributed concurrency control and recovery, but not the relational model.
  The issues are related, but we’ll consider a wider range of data models and service models.
In this course, we view databases as:
• local components of larger distributed systems, or
• distributed systems in themselves.
Focus: scale and robustness of large-scale Internet services.

September 11, 2001
The 9/11 load spike at CNN.com:
• complete collapse
• scramble to manually deploy new servers
How can we handle “flash crowds”?
• Buy/install enough hardware for worst-case load?
• Block traffic?
• Adaptive provisioning?
• Steal resources from less critical services?

That Other September 11
This is a graph of request traffic to download the Starr Report on Pres. Clinton’s extracurricular pursuits, released on 9/11/98.

Broader Importance of Distributed Software Technology
Today, the global community depends increasingly on distributed information systems technologies.
There are many recent examples of high-profile meltdowns of systems for distributed information exchange.
• Code Red worm: July 2001
• denial-of-service attacks against Yahoo etc. (spring 00)
• stored credit card numbers stolen from CDNow.com (spring 00)
  People were afraid to buy over the net at all just a few years ago!
• Network Solutions DNS root server failure (fall 00)
• MCI trunk drop interrupts Chicago Board of Exchange (summer 99)
These reflect the reshaping of business, government, and society brought by the global Internet and related software.
We have to “get it right”!

The Importance of Authentication
[Figure: EMLX stock chart]
This is a picture of a $2.5B move in the value of Emulex Corporation, in response to a fraudulent press release by short-sellers through InternetWire in 2000.
The release was widely disseminated by news media as a statement from Emulex management, but media failed to authenticate it. [reproduced from clearstation.com]

Challenges for Services: *ility
We want our distributed applications to be useful, correct, and secure. We also want reliability. Broadly, that means:
• recoverability
  Don’t lose data if a failure occurs (also durability).
• availability
  Don’t interrupt service if a failure occurs.
• scalability
  Grow effectively with the workload. See also: manageability.
• survivability
  Murphy’s Law says it’s a dangerous world. Can systems protect themselves?
• See also: security, adaptability, agility, dependability, performability, etc.

The Meaning of Scalability
Scalability is now part of the “enhanced standard litany” [Fox]; everybody claims their system is “scalable”. What does it really mean?
[Figure: cost versus capacity, showing marginal cost of capacity and total cost of capacity for scalable and unscalable systems.]
How do we measure or validate claims of scalability?
Note: watch out for “hockey sticks”!
Pay as you go: expand capacity by spending more money, in proportion to the new capacity.

Scalability II: Manageability
Today, “cost” has a broader meaning than it once did:
• growth in administrative overhead with capacity
• no interruption of service to upgrade capacity
“24 * 7 * 365 * .9999”
Where does the money go? [Borrowed from Jim Gray]
[Figure: Old World vs. New World cost breakdowns by vendor, staff, and facility. Values as extracted: vendor 5%, staff 40%, facility 5%, 50%; vendor 40%, staff 40%, facility 20%.]

Self-Managing Systems
IBM’s Autonomic Computing Challenge

How to Build Self-Managing Systems?
[Figure; surviving labels: clients, Servers in ...]
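
Worked example: the “24 * 7 * 365 * .9999” target on the manageability slide can be read as an availability goal of 0.9999 for a service that runs around the clock. The sketch below turns such a target into a yearly downtime budget. The function name `downtime_budget_minutes` and the choice of a 365-day year are assumptions made for illustration, not part of the course material.

```python
# Assumption: "around the clock" means every hour of a 365-day year.
HOURS_PER_YEAR = 24 * 365


def downtime_budget_minutes(availability: float,
                            hours: float = HOURS_PER_YEAR) -> float:
    """Return the minutes of downtime allowed over `hours` of service.

    `availability` is the fraction of time the service must be up,
    for example 0.9999 for "four nines".
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    # The unavailable fraction of the period, converted to minutes.
    return (1.0 - availability) * hours * 60.0


if __name__ == "__main__":
    for target in (0.99, 0.999, 0.9999):
        minutes = downtime_budget_minutes(target)
        print(f"{target}: {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, a 0.9999 target leaves roughly 52.6 minutes of downtime per year, which is why upgrades that interrupt service count against the “cost” of capacity.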

