5.35.3.1 Cache block size = 2^5 bytes = 32 bytes = 32 bytes * (1 word per 32 bits) * (8 bits per byte) = 8 words5.3.2 Slots = 2^index = 2^5 = 32 entries5.3.4 Address Bit Representation Tag Index Offset Hit/Miss 0 0 0 00000 00000 Miss/Put in 4 100 0 00000 00100 Hit 16 10000 0 00000 10000 Hit 132 10000100 0 00100 00100 Miss/Put in 232 11101000 0 00111 01000 Miss/Put in 160 10100000 0 00101 00000 Miss/Put in 1024 10000000000 1 00000 00000 Miss/Replace 0 tag 30 11110 0 00000 11110 Miss/Replace 1 tag 140 10001100 0 00100 01100 Hit 3100 110000011100 11 00000 11100 Miss/Replace 0 tag 180 10110100 0 00101 10100 Hit 2180 100010000100 10 00100 00100 Miss/Replace 0 tag 4 blocks are replaced5.3.5 4 hits/12 instructions = 33% hit ratio5.55.5.1 Cache holds 64*1024 bytes Since cache block size is 32 bytes/block We have 64*1024/32 = 2048 blocks For access pattern of 0, 2, 4, ... The pattern will span a single cache block in 16 accesses. Based on this, There will be one miss because the block has not previously been seen in the cache – compulsory miss. Which is a miss rate of 1/165.5.2 For a 16-byte block size: 8 accesses will span a single block with a single compulsory miss Miss rate = 1/8 For a 64-byte block size: 32 accesses will span a single block with a single compulsory miss Miss rate = 1/32 For a 128-byte block size: 64 accesses will span a single block with a single compulsory miss Miss rate = 1/645.65.6.2 AMAT = hit time + miss rate * miss penalty L1 hit time determines the CT P1: L1 miss rate: 8% L1 hit time/cycle time: 0.66ns CR = 1/(0.66*10^-9) = 1.515 GHz MAT = 70nsMA cycles = 70*10^-9*1.515*10^9 = 106.05 Therefore, AMAT = 1+0.08*106.05 = 9.484 P2: L1 miss rate: 6% L1 hit time/cycle time: 0.9ns CR = 1/(0.9*10^-9) = 1.111 GHz MAT = 70ns MA cycles = 70*10^-9*1.111*10^9 = 77.77 Therefore, AMAT = 1+0.06*77.77 = 5.6665.6.4 P1: L2 miss rate: 95% L2 hit time: 5.62 L2 access cycles = 5.62*1.515*10^9 = 8.514 New penalty for missing L1 = 8.514+.95*106.05 = 109.26 New AMAT = 1+0.08*109.26 = 9.741 Because of the high miss rate of the L2 cache, the AMAT will slightly worse.Exam Problema) We have 10^6 instructions 30% are branches 20% are loads D$ miss rate = 30% I$ miss rate = 10% L2 miss rate = 20% L2 access time = 10 cycles MAT = 80 cycles AMAT (Data Memory) = 1 + 0.3*(10+0.2*80) = 8.8b) TCPI = BCPI+MCPI BCPI = PCPI+DHCPI+CHCPI = 1 + 0.2*0.6*1 + 0.3*0.5*1 = 1.27 MCPI = D$CPI+I$CPI = 0.2*0.3*(10+0.2*80) + 0.1*(10+0.2*80) = 1.56+2.6 = 4.16 TCPI = 1.27+4.16 = 5.43c) If 1/6 of branches are jals and 1/6 of branches are jr, and they are always taken We improve the predictions by removing all jal and jr's which is 1/3 of branches IC_New = 10^6 - 1/3*0.3*10^6 = 10^6 - 0.1*10^6 = 0.9*10^6 instructions IC_Branch_New = 0.2*10^6 From this, we see that branches are 2/9 of the instructions IC_LW_New = IC_LW = 0.2*10^6 The number of load word instructions stay the same, but the fraction changes LW are 2/9 of the instructionsSince we predicted half of the branch instructions are predicted correctly And the other half of the branch instructions are predicted incorrectly, We have 0.15*10^6 instructions predicted correctly and incorrectly respectively. However, we reduced the number of insturctions predicted incorrectly by 0.1*10^6. Hence, the new branch misprediction rate is (0.15-0.1)*10^6/(0.2*10^6) = 0.25 BCPI = PCPI + DHCPI + CHCPI = 1 + 2/9*0.6*1 + 2/9*0.25*1 = 1 + 0.133 + 0.0556 = 1.1886 MCPI = D$CPI+I$CPI = 2/9*0.3*(10+0.2*80) + i*(10+0.2*80) = 1.733 + 26i TCPI_New = 1.1886+1.733+26i = 2.9216+26i We want to get ET_Old > ET_New Old: ET = TCPI*IC*CT = 5.43*10^6*0.5*10^-9 = 0.002715 New: ET = TCPI*IC*CT = (2.9216+26i)*0.9*10^6*0.5*10^-9 = 0.001315+0.0117i 0.002715 > 0.001315+0.0117i i < 0.1197 Therefore, the instruction cache miss rate must be less than
View Full Document