U of I CS 433 - System Organization

CS433: Computer System Organization
Luddy Harrison

Intel IA32 Architecture

History
The x86 / IA32 family

8086 / 8088 (1978)
- 16-bit registers
- 16-bit external data bus (8086)
- 8-bit external data bus (8088)
- 20-bit address space via segment registers

Intel 286 (1982)
- segment registers point to descriptor tables
- descriptors have 24-bit segment addresses
- segment swapping
- protection
- bounds checking on segments
- read/execute/write checking
- four privilege levels

Intel 386 (1985)
- 32-bit registers (data and address)
- virtual 8086 mode
- 32-bit address bus
- segmented memory model + flat memory model
- paging with 4 KB pages
- pipelined execution (decode + execution)

Intel 486 (1989)
- five-stage pipeline
- 8 KB on-chip L1 cache
- write-through
- integrated x87 FPU
- power management

Intel Pentium (1993)
- two pipelines, u and v
- superscalar execution
- 8 KB data + 8 KB instruction on-chip L1 caches
- write-back option in addition to write-through
- branch prediction
- burstable 64-bit external data bus
- multiprocessor support
- [second stepping: MMX]

Intel P6 (1995 – 1999)
- Pentium Pro
- Pentium II
- Pentium II Xeon
- Celeron
- Pentium III
- Pentium III Xeon

Pentium Pro
- 3-way superscalar
- out-of-order
- more aggressive branch prediction
- speculative execution
- L1 + L2 cache on chip
  - 8K + 8K L1
  - 256K L2

Pentium II
- MMX (in P6 family)
- 16K + 16K L1 caches
- 256K, 512K, 1M L2 caches supported
- improved power management

Pentium II Xeon
- improved multiprocessor support
- 4- and 8-way systems
- 2 MB L2 cache on chip

Celeron
- low-priced / reduced power market
- 128K L2 cache
- cheaper package (plastic)

Pentium III
- Streaming SIMD Extensions (SSE)
- 128-bit registers
- floating point vector types

Pentium III Xeon
- improved cache

Pentium 4 (2000)
- return to Arabic numerals
- NetBurst microarchitecture
- SSE2 and SSE3

Pentium 4 Supporting Hyper-Threading Technology (2004)
- marketing team abandons names in favor of entire sentences
- Hyper-Threading is Simultaneous MultiThreading

Intel Xeon (2001-2004)
- internal revolt against long name
- recycled portion of old name(s) prevails
- multiprocessor support
- Was this the first Hyper-Threading IA32?

Intel Pentium M (2003)
- The M is not a Roman numeral
- not “Pentium 1000”
- refers to “Mobile”
- low-power
- integrated wireless support

Register Architecture
The x86 / IA32 family

User-Visible Architectural State

System Status in EFLAGS

Special Register Purposes

Offset Calculation

Overlaid Registers

SIMD

NetBurst MicroArchitecture (Pentium 4)
- deep branch prediction
- dynamic dataflow analysis
- instructions translated into a RISC-like form
- these in turn are subject to out-of-order execution
- speculative execution
- up to 126 instructions in flight
- up to 48 loads and 24 stores in pipeline
- advanced branch predictor
- 4K branch target buffer
- execution trace cache stores decoded instructions
- straightens code on the fly!
- 8-way L2 cache
- 64-byte cache line size
- external bus capable of 6.4 Gbytes per second

Front-End Pipeline
- Prefetch
- Fetch (on prefetch fail)
- Decode into micro-operations
- Generate microcode from complex operations
- Delivers decoded instructions from execution trace cache
- Branch prediction

EFLAGS

Data Types

Fundamental Data Types

Floating Point Types

IEEE 754 and IA32
- Kahan et al. formulated the proper working of floating point hardware in a documented standard known as IEEE 754
- The x86 was designed to do all “scratch” calculations using a small floating point stack
- the entries on the stack are 80-bit extended precision numbers
- Unfortunately, this does not correspond well to the semantics of C

Example of Semantic Difference Between Natural x86 Execution and C Semantics

    double A, B, C, D, E, F, G;
    // set B=D=F and C=E=G and let B*C be very close to 0 in
    // extended precision, but exactly 0 in double precision.
    ...
    // suppose we use the x86 FP stack to do this RHS:
    A = B*C + D*E - F*G; // this yields zero in double but non-zero
                         // in extended precision
    ...
    assert (A == 0.0);

Operating on NaNs

Pointer Types

Disadvantages of Far Pointers?
- To dereference, two moves are necessary (in the general case):
  - move to segment register
  - move to address register
- To compare, two comparisons are necessary
  - How do we compare ≤?
- To store/load we must do two stores/loads
- What about register allocation?
- What other primitive types in programming languages are sometimes multi-word types (physically)?

MMX Types (64 bits)

SSEn Types (128 bits)

BCD (Binary Coded Decimal)
What is the idea behind BCD? What is this type trying to optimize?

Memory Models
- Flat
  - Single address space of 32 bits
- Segmented
  - Address space partitioned into segments
  - Each segment is mapped to a contiguous region of physical memory space, which is up to 36 bits
- Real
  - The 8086 model, provided for compatibility
  - Each segment is (up to) 64K bytes
  - Address within segment is 16 bits
  - An additional 4 address bits are obtained from the segment register, making for 20 bits all told
  - Physically impossible to wander outside of a segment!
  - It is possible to select a bad segment, of course

Seg Regs in Flat Mem Model

Segmented Memory Model

Constructing an Address

Default Segments

Flat Model

System Table Registers

Protected Flat Model

Multi-Segment Model

Offset Calculation

IA32 Instruction Format

What can you say about writing an optimizing compiler for
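
The real-address model under Memory Models combines a 16-bit segment with a 16-bit offset to reach 20 bits. A minimal sketch of that arithmetic in C follows. The function names are hypothetical and not taken from the slides, and the 20-bit mask is an assumption that models the 8086's 20 address lines.

```c
#include <stdint.h>

/* Real-address mode: the 16-bit segment value is shifted left by 4 bits
   (multiplied by 16) and the 16-bit offset is added to it. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + (uint32_t)offset;
}

/* Assumption: model the 8086, which has only 20 address lines,
   so any carry out of bit 19 is dropped and the address wraps at 1 MB. */
uint32_t real_mode_linear_8086(uint16_t segment, uint16_t offset)
{
    return real_mode_linear(segment, offset) & 0xFFFFFu;
}

/* Two segment:offset pairs name the same byte whenever their
   linear addresses agree, even if the pairs differ word for word. */
int real_mode_same_byte(uint16_t seg_a, uint16_t off_a,
                        uint16_t seg_b, uint16_t off_b)
{
    return real_mode_linear_8086(seg_a, off_a)
        == real_mode_linear_8086(seg_b, off_b);
}
```

For example, 0x1234:0x0010 and 0x1235:0x0000 both resolve to linear address 0x12350. This aliasing is one reason the far-pointer comparison question is harder than comparing two pairs of 16-bit words.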

