CS152 Computer Architecture and Engineering
Lecture 23: I/O and Storage Systems
April 26, 2004
John Kubiatowicz (www.cs.berkeley.edu/~kubitron)
lecture slides: http://inst.eecs.berkeley.edu/~cs152/

Recap: A Three-Bus System (+ backside cache)
[Figure: processor, backside cache bus, L2 cache, processor-memory bus, memory, bus adaptors, I/O buses]
- A small number of backplane buses tap into the processor-memory bus
  - Processor-memory bus is only used for processor-memory traffic
  - I/O buses are connected to the backplane bus
- Advantage: loading on the processor bus is greatly reduced

4/26/04 UCB Spring 2004 CS152 / Kubiatowicz

Recap: Main components of Intel Chipset: Pentium II/III
- Northbridge: handles memory, graphics
- Southbridge: I/O
  - PCI bus
  - Disk controllers
  - USB controllers
  - Audio
  - Serial I/O
  - Interrupt controller
  - Timers

Arbitration: Obtaining Access to the Bus
[Figure: bus master and bus slave. Control: master initiates requests. Data can go either way.]
- One of the most important issues in bus design: how is the bus reserved by a device that wishes to use it?
- Chaos is avoided by a master-slave arrangement:
  - Only the bus master can control access to the bus: it initiates and controls all bus requests
  - A slave responds to read and write requests
- The simplest system:
  - Processor is the only bus master
  - All bus requests must be controlled by the processor
  - Major drawback: the processor is involved in every transaction

The Daisy Chain Bus Arbitration Scheme
[Figure: bus arbiter with the Grant line chained through Device 1 (highest priority), Device 2, ..., Device N (lowest priority); shared Release and Request lines (wired-OR)]
- Advantage: simple
- Disadvantages:
  - Cannot assure fairness: a low-priority device may be locked out indefinitely
  - The use of the daisy chain grant signal also limits the bus speed

Centralized Parallel Arbitration
[Figure: Device 1, Device 2, ..., Device N, each with Req and Grant lines to the bus arbiter]
- Used in essentially all processor-memory buses and in high-speed I/O buses
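The fairness problem of the daisy chain scheme can be illustrated with a minimal sketch. This is a behavioral model only, not a description of any real bus: the function names `daisy_chain_grant` and `simulate` and the request schedule are hypothetical, and device 0 stands in for "Device 1 (highest priority)".

```python
def daisy_chain_grant(requests):
    """Return the index of the device that captures the grant, or None.

    requests[i] is True when device i asserts the shared request line.
    The grant enters the chain at device 0 (highest priority) and is
    passed along only by devices that are not requesting the bus.
    """
    for device, requesting in enumerate(requests):
        if requesting:
            return device  # this device keeps the grant; later devices never see it
    return None  # no device requested the bus


def simulate(request_schedule):
    """Run one arbitration per entry in the schedule and count grants per device."""
    grants = [0] * len(request_schedule[0])
    for requests in request_schedule:
        winner = daisy_chain_grant(requests)
        if winner is not None:
            grants[winner] += 1
    return grants


# Hypothetical workload: devices 0 and 2 request the bus in every cycle,
# device 1 never does.
schedule = [[True, False, True]] * 10
print(simulate(schedule))  # [10, 0, 0]
```

In this run device 2 requests the bus ten times and is never granted it, which is the "locked out indefinitely" case. A centralized arbiter sees all request lines at once, so it is free to apply a fairer policy.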

Increasing the Bus Bandwidth
- Separate versus multiplexed address and data lines:
  - Address and data can be transmitted in one bus cycle if separate address and data lines are available
  - Cost: (a) more bus lines, (b) increased complexity
- Data bus width:
  - By increasing the width of the data bus, transfers of multiple words require fewer bus cycles
  - Example: SPARCstation 20's memory bus is 128 bits wide
  - Cost: more bus lines
- Block transfers:
  - Allow the bus to transfer multiple words in back-to-back bus cycles
  - Only one address needs to be sent at the beginning
  - The bus is not released until the last word is transferred
  - Cost: (a) increased complexity, (b) longer response time for other requests, since the bus is held for the whole block

Increasing Transaction Rate on Multimaster Bus
- Overlapped arbitration: perform arbitration for the next transaction during the current transaction
- Bus parking: a master can hold onto the bus and perform multiple transactions as long as no other master makes a request
- Overlapped address/data phases (prev slide): requires one of the above techniques
- Split-phase (or packet-switched) bus:
  - Completely separate address and data phases
  - Arbitrate separately for each
  - Address phase yields a tag which is matched with the data phase
- "All of the above" in most modern buses

Recall: PCI Read Transaction
- Turn-around cycle on any signal driven by more than one agent

I/O Device Examples

Device            Behavior         Partner   Data Rate (KB/sec)
Keyboard          Input            Human     0.01
Mouse             Input            Human     0.02
Line Printer      Output           Human     1.00
Floppy disk       Storage          Machine   50.00
Laser Printer     Output           Human     100.00
Optical Disk      Storage          Machine   500.00
Magnetic Disk     Storage          Machine   5,000.00
Network-LAN       Input or Output  Machine   20 to 1,000.00
Graphics Display  Output           Human     30,000.00

I/O System Performance
- I/O system performance depends on many aspects of the system (limited by the weakest link in the chain):
  - The CPU
  - The memory system:
    - Internal and external caches
    - Main memory
  - The underlying interconnection (buses)
  - The I/O controller
  - The I/O device
  - The speed of the I/O software (operating system)
  - The efficiency of the software's use of the I/O devices
- Two common performance metrics:
  - Throughput: I/O bandwidth
  - Response time: latency

Simple Producer-Server Model
[Figure: Producer -> Queue -> Server]
- Throughput:
  - The number of tasks completed by the server in unit time
  - In order to get the highest possible throughput:
    - The server should never be idle
    - The queue should never be empty
- Response time:
  - Begins when a task is placed in the queue
  - Ends when it is completed by the server
  - In order to minimize the response time:
    - The queue should be empty
    - The server will be idle

Throughput versus Response Time
[Figure: response time (ms), axis marked 100 to 300, plotted against percentage of maximum throughput, axis marked 20 to 100]

Throughput Enhancement
[Figure: one producer feeding two queue-server pairs]
- In general, throughput can be improved by:
  - Throwing more hardware at the problem (this reduces load-related latency)
- Response time is much harder to reduce:
  - Ultimately it is limited by the speed of light (but we're far from it)

Organization of a Hard Magnetic Disk
[Figure: platters, track, sector]
- Typical numbers (depending on the disk size):
  - 500 to 2,000 tracks per surface
  - 32 to 128 sectors per track
    - A sector is the smallest unit that can be read or written
- Traditionally all tracks have the same number of sectors:
  - Constant bit density: record more sectors on the outer tracks
  - Recently relaxed: constant bit size, speed varies with track location

Magnetic Disk Characteristic
[Figure: track, sector, cylinder, head, platter]
- Cylinder: all the tracks under the head at a given point on all surfaces
- Read/write data is a three-stage process:
  - Seek time: position the arm over the proper track
  - Rotational latency: wait for the desired sector to rotate under the read/write head
  - Transfer time: transfer a block of bits (sector) under the read/write head
- Average seek time as reported by the industry:
  - Typically in the range of 8 ms to 12 ms
  - (Sum of the time for all possible seeks) / (total number of possible seeks)
- Due to locality of disk reference, actual average seek time may:
  - Only be 25% to 33% of the advertised number

Technology Trends
- Disk capacity now doubles every 18 months; before 1990, every 36 months
- Today
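The three-stage read/write process described above (seek, rotational latency, transfer) can be turned into a rough access-time estimate. The sketch below is illustrative only. The drive parameters (7200 RPM, 512-byte sector, 5 MB/s transfer rate) are assumptions, not figures from the slides, and the average rotational latency is taken as half a revolution, which is the usual textbook approximation. The function names are hypothetical.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the desired sector: half a revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def transfer_time_ms(num_bytes, rate_mb_per_s):
    """Time for num_bytes to pass under the head (1 MB taken as 10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1e6) * 1000.0


def access_time_ms(seek_ms, rpm, num_bytes, rate_mb_per_s):
    """Seek time + rotational latency + transfer time, in milliseconds."""
    return (seek_ms
            + avg_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s))


# Hypothetical drive parameters, chosen only for illustration.
ADVERTISED_SEEK_MS = 10.0   # inside the 8 ms to 12 ms range quoted above
RPM = 7200                  # assumed spindle speed
SECTOR_BYTES = 512          # assumed sector size
RATE_MB_PER_S = 5.0         # assumed sustained transfer rate

advertised = access_time_ms(ADVERTISED_SEEK_MS, RPM, SECTOR_BYTES, RATE_MB_PER_S)
# With locality, the actual seek may be about 25% of the advertised figure.
with_locality = access_time_ms(0.25 * ADVERTISED_SEEK_MS, RPM, SECTOR_BYTES, RATE_MB_PER_S)

print(f"advertised seek: {advertised:.2f} ms")
print(f"with locality:   {with_locality:.2f} ms")
```

Under these assumed numbers the estimate is roughly 14.3 ms with the advertised seek time and roughly 6.8 ms when locality cuts the seek to a quarter. In both cases the mechanical delays (seek and rotation) dominate the time to transfer a single sector.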