CORNELL CS 4410 - Computer Architecture Review

CS 4410 Operating Systems
Computer Architecture Review
Oliver Kennedy

The Dawn of Computing

The OS!

Multitasking

So what’s under the hood?

Well... not quite

Networks

A CPU
• Registers:
  • The CPU’s short-term memory.
• Arithmetic Logic Unit:
  • Where most of the work gets done.
• Floating Point Unit:
  • Handles the “decimal” calculations.
• Caches:
  • Reduce memory access times.

The Pipeline
• A lot of computation goes into a single instruction.
• Can some of this computation be done in parallel?
  • Set up an assembly line.
  • Each stage processes a little and passes it on.
• Utilize hardware more intensively.
  • Less work per stage means stages can run faster.
• Why not have lots and lots of stages?
  • What happens if we don’t know what will happen next?
  • What happens if one instruction needs data from an earlier instruction?

The Pipeline
• Avoiding delays:
  • Branch Prediction.
  • Instruction Reordering.
• Currently, most pipelines are 10-15 stages in length.
  • Fetch the instruction.
  • Decode/Dispatch the instruction.
  • Get necessary data.
  • Perform necessary calculations.
  • Write the results to registers/memory.

The Multicore Revolution
• Moore’s law continues, but not like everyone expected.
  • More transistors, but the density is too high.
• How can we use the extra transistors?
  • Make one CPU into two, sixteen, sixty-four... or more.
  • Do more at the same speed.
• Push towards multithreaded programming languages.
  • ... need OS support.

The Memory Hierarchy
• Registers: 8-64 integers/floats at a time.
  • Available immediately.
• L1 Cache: ~32KB Data, ~32KB Instructions.
  • Short access time (2-3 cycles).
• L2 Cache: 1-2 MB.
  • Moderate access time (~10-20 cycles).
• Main Memory: up to 4GB or more.
  • Long access time (on the order of 100 cycles).
• Prefetching is used to increase cache hits.

Why a Hierarchy?
• Tradeoff between speed and expense of hardware: very high-speed memory is expensive. (Also, physical distance can be a limit.)
• Programs typically have strong locality: most accesses are near previous accesses, in both space and time.
  • Spatial locality: accesses to nearby addresses.
  • Temporal locality: same resource accessed twice.
• Can get very high cache hit rates with a comparatively small cache.
  • Mostly stay on the fast path.

Stacks
• Functions in most languages execute in a LIFO order.
• Therefore, can store local variables on a stack:
  • Each function allocates an activation record by decrementing the stack pointer register; can use that area for locals.
  • Increment SP to return.
• Note that this is just a region of main memory accessed with stack discipline; hardware may not treat it specially.
  • Can switch stacks by changing the value of the SP register.
• Note: not all programs use a stack; ML code typically won’t.

Calling Conventions
• Which registers can a function use? Where are parameters? Pushed in what order? Who removes them from the stack?
  • Check the calling convention.
• Example: typical conventions for IA32
  • EAX, ECX, EDX are caller-save, the rest are callee-save.
  • Args on the stack, pushed right to left in both ‘stdcall’ and ‘cdecl’ (the older ‘pascal’ convention pushes left to right).
  • In the ‘stdcall’ convention, the callee pops params from the stack. In ‘cdecl’, the caller does.
  • In C, the callee doesn’t always know how many params there are, since some functions are varargs.

Functions and the Stack

    /* Prototypes so that foo can call bar, and bar can call bat,
       before their definitions appear. */
    int bar(int baz);
    int bat(void);

    int foo(void){
      int baz = 2 + 3;
      return bar(baz);
    }
    int bar(int baz){
      return bat() + baz;
    }
    int bat(void){
      return 3;
    }

[Diagram, animated across slides: the Registers and the Stack, with SP and BP marked, as foo calls bar and bar calls bat. Stack entries shown: baz, foo's regs, baz, Old BP, bar's regs, Old BP. Callout: Return value goes in a special register (or goes onto the stack).]

Traps/Interrupts
• What does the hardware do when something unexpected happens?
  • The software does something wrong. (Divide by 0)
  • The software asks for a wakeup call.
  • The user presses a key on the keyboard.
• It could just set a flag and have the software check for it.
  • Processor intensive.
  • Defeats the point of an operating system.
• Instead, the processor pauses what it’s doing and calls a callback.

Traps/Interrupts
• How does the processor know what code to execute?
• Most architectures define a data structure for an Interrupt Vector Table.
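
The idea of an Interrupt Vector Table can be sketched in plain C as an array of function pointers indexed by interrupt number. This is only a software model under stated assumptions: the table size, the vector numbers, and the names register_handler and raise_interrupt are hypothetical and do not correspond to any particular architecture, where the table layout and the dispatch itself are defined by the hardware.

```c
#include <stddef.h>

#define NUM_VECTORS        16  /* hypothetical table size */
#define VEC_DIVIDE_BY_ZERO 0   /* hypothetical vector numbers */
#define VEC_TIMER          1
#define VEC_KEYBOARD       2

/* A handler is the "callback" the processor jumps to. */
typedef void (*handler_fn)(void);

/* The vector table: one slot per interrupt number.
   Static storage, so every slot starts out as NULL. */
static handler_fn vector_table[NUM_VECTORS];

/* Counters so the effect of each handler is observable. */
static int timer_ticks = 0;
static int keys_pressed = 0;

void on_timer(void)    { timer_ticks++; }
void on_keyboard(void) { keys_pressed++; }

/* Install a handler in the table.
   Returns 0 on success, -1 if the vector number is out of range. */
int register_handler(int vector, handler_fn fn) {
    if (vector < 0 || vector >= NUM_VECTORS) {
        return -1;
    }
    vector_table[vector] = fn;
    return 0;
}

/* Stand-in for the hardware event: index the table by interrupt
   number and call whatever handler is installed there.
   Returns 0 if a handler ran, -1 if the vector is out of range
   or no handler is installed. */
int raise_interrupt(int vector) {
    if (vector < 0 || vector >= NUM_VECTORS || vector_table[vector] == NULL) {
        return -1;
    }
    vector_table[vector]();
    return 0;
}
```

In this model, an operating system would call register_handler(VEC_TIMER, on_timer) once at startup, and each later raise_interrupt(VEC_TIMER) plays the role of the timer firing. The point of the table is that no code has to poll a flag: the lookup by number goes straight to the right handler.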