U of U CS 6810 - Disks, Reliability, SSDs, Processors

Lecture 27: Disks, Reliability, SSDs, Processors

• Topics: HDDs, SSDs, RAID, Intel and IBM case studies
• Final exam stats: Highest 91, 18 scores of 82+
   Every 15th score: 82, 76, 71, 62, 52
   Hardest question: Q2 (no score over 8/10)
   Q5: 2 perfect answers, 3 more nearly correct answers
   Q8: More than half of you solved it correctly

Magnetic Disks

• A magnetic disk consists of 1-12 platters (metal or glass disks covered with magnetic recording material on both sides), with diameters between 1 and 3.5 inches
• Each platter comprises concentric tracks (5-30K per platter), and each track is divided into sectors (100-500 per track, each about 512 bytes)
• A movable arm holds the read/write heads, one per disk surface, and moves them all in tandem – a cylinder of data is accessible at a time

Disk Latency

• To read/write data, the arm has to be placed on the correct track – this seek time takes 5 to 12 ms on average, and can take less if there is spatial locality
• Rotational latency is the time taken to rotate the correct sector under the head – the average is half a rotation, about 2 ms at 15,000 RPM (more on slower drives)
• Transfer time is the time taken to move a block of bits out of the disk, typically at 3-65 MB/second
• A disk controller maintains a disk cache (so spatial locality can be exploited) and sets up the transfer on the bus (controller overhead); a back-of-the-envelope access-time calculation appears at the end of these notes

RAID

• Reliability and availability are important metrics for disks
• RAID: redundant array of inexpensive (independent) disks
• Redundancy can deal with one or more failures
• Each sector of a disk records check information that allows the controller to determine whether the sector has an error (in other words, redundancy already exists within a disk)
• When a disk read flags an error, we turn elsewhere for correct data

RAID 0 and RAID 1

• RAID 0 has no additional redundancy (the name is a misnomer) – it uses an array of disks and stripes (interleaves) data across the disks to improve parallelism and throughput
• RAID 1 mirrors or shadows every disk – every write happens to two disks
• Reads to the mirror may happen only when the primary disk fails – or, you may read both copies together and accept the quicker response
• Expensive solution: high reliability at twice the cost

RAID 3

• Data is bit-interleaved across several disks, and a separate disk maintains parity information for each set of bits
• For example, with 8 data disks, bit 0 is on disk 0, bit 1 is on disk 1, ..., bit 7 is on disk 7; disk 8 maintains parity for all 8 bits
• For any read, all 8 data disks must be accessed (as we usually read more than a byte at a time), and for any write, all 9 disks must be accessed because parity has to be re-calculated
• High throughput for a single request, low cost for redundancy (overhead: 12.5%), low task-level parallelism

RAID 4 and RAID 5

• Data is block-interleaved – this allows us to get all our data from a single disk on a read; in case of a disk error, read all 9 disks
• Block interleaving reduces throughput for a single request (as only a single disk drive services the request), but improves task-level parallelism since the other disk drives are free to service other requests
• On a write, we access only the disk that stores the data and the parity disk – parity can be updated by checking how the new data differs from the old data: new parity = old parity XOR old data XOR new data (a small sketch follows the next slide)

RAID 5

• If we have a single disk for parity, multiple writes cannot happen in parallel (as all writes must update the parity info)
• RAID 5 distributes the parity blocks across all disks to allow simultaneous writes
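A minimal sketch, not from the lecture, of the small-write parity update and single-disk reconstruction just described; modeling disks as Python bytearrays, and the array sizes, block size, and function names are all illustrative assumptions:

    # RAID 4/5 small-write parity update: a toy model with disks as bytearrays.
    # new_parity = old_parity XOR old_data XOR new_data, so only two disks
    # (the data disk and the parity disk) are touched per small write.

    NUM_DATA_DISKS = 4
    BLOCK = 16  # bytes per block (illustrative)

    data_disks = [bytearray(BLOCK) for _ in range(NUM_DATA_DISKS)]
    parity_disk = bytearray(BLOCK)  # RAID 4 style: one dedicated parity disk

    def write_block(disk, new_data):
        """Write new_data to one data disk; update parity with 2 reads + 2 writes."""
        old_data = data_disks[disk]
        for i in range(BLOCK):
            # Flip parity wherever the new data differs from the old data
            parity_disk[i] ^= old_data[i] ^ new_data[i]
        data_disks[disk][:] = new_data

    def reconstruct(failed_disk):
        """Rebuild a failed data disk by XOR-ing parity with all surviving disks."""
        rebuilt = bytearray(parity_disk)
        for d in range(NUM_DATA_DISKS):
            if d != failed_disk:
                for i in range(BLOCK):
                    rebuilt[i] ^= data_disks[d][i]
        return rebuilt

    write_block(0, bytearray(b"hello raid world"))   # 16-byte blocks
    write_block(2, bytearray(b"another 16B blok"))
    assert reconstruct(0) == bytearray(b"hello raid world")

RAID 5 uses the same XOR algebra but rotates which disk holds the parity block from stripe to stripe, so writes to different stripes can update parity on different disks in parallel.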
RAID Summary

• RAID 1-5 can tolerate a single fault – mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead
• Can tolerate multiple faults by having multiple check functions – each additional check can cost an additional disk (RAID 6)
• RAID 6 and RAID 2 (memory-style ECC) are not commercially employed

Error Correction in Main Memory

• Typically, a 64-bit data word is augmented with an 8-bit ECC word; this requires more DRAM chips per rank and a wider bus; referred to as SECDED (single error correction, double error detection) – a toy SECDED sketch appears at the end of these notes
• Chipkill correct: a system that can withstand the complete failure of one DRAM chip; requires significant overhead in cost and energy

Flash Memory

• The technology is cost-effective enough that flash memory can now replace magnetic disks in laptops (such drives are known as solid-state disks – SSDs)
• Non-volatile, with fast reads (about 15 MB/sec – still slower than DRAM); a write requires the containing block (block sizes can be 16-512KB) to be erased first, and each block survives only about 100K erases

Case Study I: Intel Core Architecture

• Single-thread execution is still considered important: out-of-order execution and speculation are very much alive, and initial processors will have a few heavy-weight cores
• To reduce power consumption, the Core architecture (14 pipeline stages) is closer to the Pentium M (12 stages) than to the P4 (30 stages)
• Many transistors are invested in a large branch predictor to reduce wasted work (power)
• Similarly, SMT is not guaranteed for all incarnations of the Core architecture (SMT makes a hotspot hotter)

Case Study II: Intel Nehalem

• Quad core, each core with 2 SMT threads
• The 96-entry ROB of Core 2 has been increased to 128 entries in Nehalem; the ROB is dynamically allocated across threads
• Lots of power modes; built-in power control unit
• 32KB L1 instruction and data caches, 10-cycle 256KB private L2 cache per core, 8MB shared L3 cache (~40 cycles)
• L1 dTLB with 64 entries for 4KB pages and 32 entries for 4MB pages; 512-entry L2 TLB (small pages only)

[Figure: Nehalem memory controller organization – four sockets connected by QPI; each socket holds four cores and three memory controllers (MC1-MC3), each attached to a DIMM]

Case Study III: IBM Power7

• 8 cores, 4-way SMT, 45nm process, 1.2 B transistors, out-of-order execution, 4.25 GHz
• 2-cycle 32KB private L1s, 8-cycle 256KB private L2
• 32 MB shared L3 cache made of eDRAM
• Nice article comparing Power7 and Sun's Niagara 3:
   http://arstechnica.com/business/news/2010/02/two-billion-transistor-beasts-power7-and-niagara-3.ars

Advanced Course

• Spring '11: CS 7810: Advanced Computer Architecture, Tu/Th 10:45am-12:05pm
• Designing structures within a core
• Cache coherence, TM, networks
• Lots of memory topics
• Major course project on evaluating original ideas with simulators (often leads to publications)
• No assignments; take-home final
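The back-of-the-envelope access-time calculation referenced in the Disk Latency slide; the specific parameter values below are assumptions picked from the ranges quoted there, not measurements:

    # Rough average access time for a 4 KB disk read, using values
    # assumed from the ranges in the Disk Latency slide.
    seek_ms       = 8.0                     # average seek (5-12 ms range)
    rpm           = 15000
    rotation_ms   = 60_000 / rpm            # 4 ms per full rotation
    rotational_ms = rotation_ms / 2         # average: half a rotation = 2 ms
    transfer_mb_s = 50                      # within the 3-65 MB/s range
    transfer_ms   = 4 / 1024 / transfer_mb_s * 1000   # 4 KB block
    overhead_ms   = 0.2                     # assumed controller overhead

    total_ms = seek_ms + rotational_ms + transfer_ms + overhead_ms
    print(f"{total_ms:.2f} ms")             # ~10.28 ms, dominated by seek + rotation

Note how the mechanical components (seek plus rotation) dwarf the transfer time, which is why disk schedulers and caches aim to avoid seeks rather than speed up transfers.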

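And the toy SECDED sketch referenced in the memory-ECC slide: a Hamming(7,4) code plus an overall parity bit protects a 4-bit word, the same construction DRAM ECC applies at (72,64) scale with 64 data bits and 8 check bits. The bit layout and function names here are illustrative, not the lecture's:

    # Minimal SECDED (single-error-correct, double-error-detect) sketch for a
    # 4-bit data word: Hamming(7,4) plus an overall parity bit.

    def encode(data4):
        """data4: [d1, d2, d3, d4] -> 8-bit codeword."""
        d1, d2, d3, d4 = data4
        p1 = d1 ^ d2 ^ d4                  # parity over positions 3, 5, 7
        p2 = d1 ^ d3 ^ d4                  # parity over positions 3, 6, 7
        p3 = d2 ^ d3 ^ d4                  # parity over positions 5, 6, 7
        code = [p1, p2, d1, p3, d2, d3, d4]  # Hamming positions 1..7
        p0 = 0
        for b in code:
            p0 ^= b                        # overall parity -> double-error detection
        return code + [p0]

    def decode(code8):
        """Return (data4, status): status is 'ok', 'corrected', or 'double-error'."""
        c = list(code8[:7])
        p0 = code8[7]
        # Syndrome: which check equations fail; its value is the error position (0 = none)
        s1 = c[0] ^ c[2] ^ c[4] ^ c[6]     # positions 1, 3, 5, 7
        s2 = c[1] ^ c[2] ^ c[5] ^ c[6]     # positions 2, 3, 6, 7
        s3 = c[3] ^ c[4] ^ c[5] ^ c[6]     # positions 4, 5, 6, 7
        syndrome = s1 + 2 * s2 + 4 * s3
        overall = p0
        for b in c:
            overall ^= b                   # 0 if total parity is consistent
        if syndrome == 0 and overall == 0:
            status = 'ok'
        elif overall == 1:
            if syndrome != 0:
                c[syndrome - 1] ^= 1       # single-bit error: flip position 'syndrome'
            status = 'corrected'           # syndrome == 0 means p0 itself flipped
        else:
            status = 'double-error'        # parity consistent but syndrome != 0
        return [c[2], c[4], c[5], c[6]], status

    cw = encode([1, 0, 1, 1])
    cw[4] ^= 1                             # inject a single-bit error
    data, status = decode(cw)
    assert data == [1, 0, 1, 1] and status == 'corrected'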
