U of I CS 232 - LECTURE NOTES - D2591262

School: University of Illinois

Course: CS 232 - Computer Architecture II

Pages: 11

December 8, 2007 - CS232 Summary

Instant replay

The semester was split into roughly four parts.
  - The 1st quarter covered instruction set architectures: the connection between software and hardware.
  - In the 2nd quarter of the course we discussed processor design. We focused on pipelining, which is one of the most important ways of improving processor performance.
  - The 3rd quarter focused on large and fast memory systems (via caching), virtual memory, and I/O.
  - Finally, we discussed performance tuning, including profiling and exploiting data parallelism via SIMD and Multi-Core processors.
We also introduced many performance metrics to estimate the actual benefits of all of these fancy designs.

[Figure: diagram labeled Memory, Processor, Input/Output]

Some recurring themes

There were several recurring themes throughout the semester.
  - Instruction set and processor designs are intimately related.
  - Parallel processing can often make systems faster.
  - Amdahl's Law quantifies performance limitations.
  - Hierarchical designs combine different parts of a system.
  - Hardware and software depend on each other.

Instruction sets and processor designs

The MIPS instruction set was designed for pipelining.
  - All instructions are the same length, to make instruction fetch and jump and branch address calculations simpler.
  - Opcode and operand fields appear in the same place in each of the three instruction formats, making instruction decoding easier.
  - Only relatively simple arithmetic and data transfer instructions are supported.
These decisions have multiple advantages.
  - They lead to shorter pipeline stages and higher clock rates.
  - They result in simpler hardware, leaving room for other performance enhancements like forwarding, branch prediction, and on-die caches.

Parallel processing

One way to improve performance is to do more processing at once.
There were several examples of this in our CPU designs.
  - Multiple functional units can be included in a datapath to let single instructions execute faster. For example, we can calculate a branch target while reading the register file.
  - Pipelining allows us to overlap the executions of several instructions.
  - SIMD performs operations on multiple data items simultaneously.
  - Multi-core processors enable thread-level parallel processing.
Memory and I/O systems also provide many good examples.
  - A wider bus can transfer more data per clock cycle.
  - Memory can be split into banks that are accessed simultaneously. Similar ideas may be applied to hard disks, as with RAID systems.
  - A direct memory access (DMA) controller performs I/O operations while the CPU does compute-intensive tasks instead.

Performance and Amdahl's Law

First Law of Performance: Make the common case fast!
But performance is limited by the slowest component of the system.
We've seen this in regard to cycle times in our CPU implementations.
  - Single-cycle clock times are limited by the slowest instruction.
  - Pipelined cycle times depend on the slowest individual stage.
Amdahl's Law also holds true outside the processor itself.
  - Slow memory or bad cache designs can hamper overall performance.
  - I/O-bound workloads depend on the I/O system's performance.

Hierarchical designs

Hierarchies separate fast and slow parts of a system, and minimize the interference between them.
  - Caches are fast memories which speed up access to frequently-used data and reduce traffic to slower main memory. (Registers are even faster…)
  - Buses can also be split into several levels, allowing higher-bandwidth devices like the CPU, memory and video card to communicate without affecting or being affected by slower peripherals.

Architecture and Software

Computer architecture plays a vital role in many areas of software.
Compilers are critical to achieving good performance.
  - They must take full advantage of a CPU's instruction set.
  - Optimizations can reduce stalls and flushes, or arrange code and data accesses for optimal use of system caches.
Operating systems interact closely with hardware.
  - They should take advantage of CPU features like support for virtual memory and I/O capabilities for device drivers.
  - The OS handles exceptions and interrupts together with the CPU.

Five things that I hope you will remember

Abstraction: the separation of interface from implementation.
  - ISAs specify what the processor does, not how it does it.
Locality:
  - Temporal Locality: "if you used it, you'll use it again"
  - Spatial Locality: "if you used it, you'll use something near it"
Caching: buffering a subset of something nearby, for quicker access.
  - Typically used to exploit locality.
Indirection: adding a flexible mapping from names to things.
  - Virtual memory's page table maps virtual addresses to physical addresses.
Throughput vs. Latency: (# things/time) vs. (time to do one thing)
  - Improving one does not necessitate improving the other.

Where to go from here?

CS433: Advanced Comp. Arch: All of the techniques used in modern processors that I didn't talk about (out-of-order execution, superscalar, advanced branch prediction, prefetching…). Homework-oriented.
CS431: Embedded Systems: How hardware/software gets used in things we don't think of as computers (e.g., anti-lock brakes, pacemakers, GPS). Lab-oriented.
CS498-MG3: Program Optimization: How to make a program run really fast (like the 4th quarter of 232, but more so). Project-oriented.
ECE411: Computer Organization and Design: Some content overlap with CS232 and CS433, but you actually build the hardware. Lab-oriented.
CS426: Compiler Construction: How a compiler translates a programming language down to assembly and optimizes it. Project-oriented.

Good luck on your exams and have a great break!

Friday's lecture is optional (i.e., will not be covered on the final).
  - I will give an overview of techniques used in modern processors, including (probably) a brief description of my
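
Worked example for the "Performance and Amdahl's Law" slide: the slides name the law but the preview does not show a formula, so the sketch below assumes the standard textbook form, speedup = 1 / ((1 - f) + f / s), where f is the fraction of execution time that benefits from an enhancement and s is the speedup of that fraction. The function name `amdahl_speedup` and the sample numbers are illustrative assumptions, not taken from the course materials.

```python
def amdahl_speedup(fraction_enhanced: float, enhancement_factor: float) -> float:
    """Overall speedup under the standard form of Amdahl's Law.

    fraction_enhanced:  share of the original execution time that the
                        enhancement applies to (0.0 to 1.0).
    enhancement_factor: how many times faster that share becomes (> 0).
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if enhancement_factor <= 0:
        raise ValueError("enhancement_factor must be positive")
    # The untouched part still takes (1 - f) of the original time; the
    # enhanced part shrinks to f / s of the original time.
    new_time = (1.0 - fraction_enhanced) + fraction_enhanced / enhancement_factor
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical workload: 80% of the time is spent in code we can speed up.
    for factor in (2, 10, 100):
        print(f"80% of time sped up {factor}x -> overall {amdahl_speedup(0.8, factor):.2f}x")
```

With 80% of the time enhanced, the overall speedup can never reach 5x however large the factor becomes, because the remaining 20% is untouched. This is the sense in which performance is limited by the part of the system that is not improved.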
