Peta Ops Operating System Scalability

Author: Greg Astfalk
Division: HPSD
Topic: PetaFlops-II Conference
Date: February 18, 1999
Slide: 1 of 25

Slide: 2 of 25
● My comments are predicated on the assumption that the operating system needs to be, at a minimum, “Unix-like”
● The PetaOps system must fit into the existing computing universe
  ■ system APIs
  ■ networking
  ■ required middleware
  ■ remote code development
  ■ existing code base
  ■ etc.

Slide: 3 of 25
This space intentionally left blank.

Slide: 4 of 25
● At past PetaOps workshops I have sounded cynical or “sour-grapes”
● Today is no exception
● In defense, I view it as being pragmatic

Slide: 5 of 25
● If we can get the HTMT processors at 100 GHz (say, 200 Gflops)
  ■ 5,000 processors
● If we are forced to go with commodity processors at, say, 8 Gflops each
  ■ 125,000 processors

Slide: 6 of 25
● For latency hiding we need somewhere between 1 million and 100 million threads
● Let’s stop for just a moment and think about this…
Okay, now let’s get started again

Slide: 7 of 25
● Question:
  ■ What is the most difficult application on the planet to parallelize?
● Answer:
  ■ Unix

Slide: 8 of 25
● Where are we today?
● The largest SSI Unix system is 256 processors
● We are a factor of 20–500 off the mark
● From first-hand experience
  ■ every factor of 2 increase in o/s scalability induces at least a factor of 10 of effort
● Why is it that SGI’s and HP’s architectures support N-processor ccNUMA and the o/s is at N/x, where x > 1?
  ■ Because it’s hard to do otherwise!

Slide: 9 of 25
● A single-processor operating system that has been stretched to handle SMPs
● The fundamental structure of the Unix internals precludes its scalability, without complete overhaul, to thousands of processors
● Two significant areas of concern
  ■ the process manager (PM)
  ■ the virtual memory manager (VM)
● Too many critical sections

Slide: 10 of 25
● The amount of shared information in the internal structures
  ■ the proc structure is especially nasty
    ▲ it has been a catch-basin for years
● The need to maintain a single-system image
● Aside: the internal data structures have not changed all that much since the Thompson and Ritchie days

Slide: 11 of 25
● Unix LOVES linked lists
  ■ Think about walking a linked list of PetaOps scale and comparing against a single field in a structure
● Maintaining consistency in VM is particularly troubling
  ■ many memory levels, separate page pools, consistency between “nodes”, etc.
● Data movement within the o/s
  ■ buffer cache, for example

Slide: 12 of 25
● Should the architecture influence (dictate?) the operating system?
● Should the operating system force architectural decisions?
● This is not a rhetorical question

Slide: 13 of 25
● Sheer component count of a PetaOps system dictates frequent failures in all types of components
  ■ By frequent I mean minutes
● The operating system must be resilient to failures of
  ■ disks (easy :-)
  ■ processors
  ■ memory
  ■ interconnects
  ■ ASICs
● This has a profound effect on the o/s

Slide: 14 of 25
● Speaking from first-hand experience, building an o/s that is robust in the face of failures is a very difficult problem
● To not have this capability in the o/s for a PetaOps system is a recipe for certain failure
  ■ failure here means a system that is always either booting or doing application start-up

Slide: 15 of 25
● A slight digression that is related to the o/s
● What programming model do we want or need?
  ■ shared-memory
  ■ distributed-memory
● More on this later

Slide: 16 of 25
● Assume we are going to go with a single address space PetaOps system
  ■ We are in serious trouble here
● I have no genuine feelings of possible success here unless the o/s is “completely” restructured
● We do have some data on which we can extrapolate
  ■ It is not encouraging

Slide: 17 of 25
● Targeting 2007 for availability
● A relatively small team could immediately begin the redesign and define the internals
● Architectural simulators will be required far in advance of the actual hardware
● As the specifics of the machine become available the machdep work could begin

Slide: 18 of 25
● This benign-sounding approach is not trivial
● Availability is still a significant issue
● The scale of the operating system’s domains is more manageable, say 500 processors
  ■ Unix as it exists today might suffice
● This implies a message-passing programming model

Slide: 19 of 25
● If it’s distributed, it is MPI
  ■ grep “MPI” with “message-passing” in what follows
● Let’s consider the consequences of MPI, at the PetaOps scale, on the operating system
● In what follows I am not “picking on” MPI
  ■ it is a vehicle to point out the operating system
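
The sizing figures on slides 5, 6 and 8 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a target of 1 Pflops (10^15 operations per second), which the slides imply but do not state outright, and it reads the "factor of 10 of effort per factor of 2 in scalability" remark as a simple power law. The function names are hypothetical and do not come from the talk.

```python
import math

# Assumption: "PetaOps" is taken to mean 1e15 floating-point operations/second.
PETA_FLOPS = 1e15

# From slide 8: the largest single-system-image Unix system has 256 processors.
LARGEST_SSI_PROCESSORS = 256


def processors_needed(gflops_per_processor: float) -> int:
    """Processors required to reach the assumed 1 Pflops target."""
    return round(PETA_FLOPS / (gflops_per_processor * 1e9))


def scaling_gap(processors: int) -> float:
    """How far today's largest SSI system is from the required count."""
    return processors / LARGEST_SSI_PROCESSORS


def effort_multiplier(scale_factor: float, effort_per_doubling: float = 10.0) -> float:
    """Effort growth if every 2x in scalability costs 10x in effort.

    This power-law reading of the slide's rule of thumb is an assumption;
    the slide says "at least" 10x, so treat the result as a lower bound.
    """
    return effort_per_doubling ** math.log2(scale_factor)


if __name__ == "__main__":
    htmt = processors_needed(200)      # HTMT processors at ~200 Gflops each
    commodity = processors_needed(8)   # commodity processors at ~8 Gflops each
    print(f"HTMT processors:      {htmt:,}")
    print(f"Commodity processors: {commodity:,}")

    # Slide 8's "factor of 20-500 off the mark".
    print(f"Gap vs. 256-way SSI:  {scaling_gap(htmt):.1f}x to {scaling_gap(commodity):.1f}x")

    # Slide 6's 1 million to 100 million threads, spread over those processors.
    for threads in (1e6, 1e8):
        print(f"{threads:.0e} threads -> {threads / htmt:,.0f} per HTMT processor, "
              f"{threads / commodity:,.0f} per commodity processor")

    # Lower-bound effort multipliers implied by the rule of thumb.
    for procs in (htmt, commodity):
        print(f"{procs:,} processors -> effort x{effort_multiplier(scaling_gap(procs)):.2e}")
```

Under these assumptions the gap to a 256-processor system comes out between roughly 20x and 490x, which matches the "factor of 20–500" quoted on slide 8. The effort multipliers are only as good as the power-law reading of the rule of thumb.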

