GoalsLike Writing EnglishArithmeticSimple benchmarksSimple benchmarksC Arithmetic TriviaArithmetic LessonsBit ManipulationBit-manipulation basicsAdvanced bit manipulationFaking MultiplicationFaking MultiplicationFaking DivisionMulti-way branchesNios code for if-then-elseNios code for switch (1)Nios code for switch (2)Computing Discrete FunctionsComputing Discrete FunctionsFunction callsCode for foo() (unoptimized)Code for foo() (optimized)Strength ReductionUnoptimized array code (fragment)Unoptimized pointer code (fragment)Optimized (--O2) array codeOptimized (--O2) pointer codeHow Rapid is Rapid?Double-checkingFeatures in order of increasing costStorage Classes in CDynamic Storage AllocationDynamic Storage AllocationSimple Dynamic Storage AllocationDynamic Storage AllocationSimple Dynamic Storage AllocationStorage Classes ComparedMemory-Mapped I/OMemory-Mapped I/O Access in CWhat's With the Volatile?Altera I/OHW/SW Communication StylesHW/SW Communication: InterruptsUnix SignalsInterrupts under Altera (1)Interrupts under Altera (2)Debugging SkillsThe Edwards Way to DebugLow-Level C ProgrammingCSEE W4840Prof. Stephen A. E dwardsColumbia UniversitySpring 2009Low-Level C Programming – p.GoalsFunction is correctSource code is concise, readable, maintainableTime-critical sections of program run fast enoughObject code is small and efficientBasically, optimize the use of three resources:Execution timeMemoryDevelopment/maintenance timeLow-Level C Programming – p.Like Writing EnglishYou can say the same thing many different waysand mean the same thing.There are many different ways to say the samething.The same thing may be said different ways.There is more than one way to say it.Many sentences are equivalent.Be succinct.Low-Level C Programming – p.ArithmeticInteger Arithmetic FastestFloating-point arithmetic in hardware SlowerFloating-point arithmetic in software Very slow+, −×÷sqrt, sin, log, etc.yslowerLow-Level C Programming – p.Simple benchmarksfor (i = 0 ; i < 10000 ; ++i)/*arithmetic operation*/On my desktop Pentium 4 w ith good hardwarefloating-point support,Operator Time Operator Time+ (int) 1 + (double) 5*(int) 5*(double) 5/ (int) 12 / (double) 10« (int) 2 sqrt 28sin 48pow 275Low-Level C Programming – p.Simple benchmarksOn my Zaurus SL 5600, a 400 MHz Intel PXA250Xscale (ARM) processor:Operator Time+ (int) 1 + (double) 140*(int) 1*(double) 110/ (int) 7 / (double) 220« (int) 1 sqrt 500sin 3300pow 820Low-Level C Programming – p.C Arithmetic TriviaOperations on char, short, int, and longprobabl y run at the same speed (same ALU).Same for unsigned variantsint or long slower when they exc eed machine’sword size.Low-Level C Programming – p.Arithmetic LessonsTry to use integer addition/subtractionAvoid multiplication unless you have hardwareAvoid divisionAvoid floating-point, unless you have hardwareReally avoid math library functionsLow-Level C Programming – p.Bit ManipulationC has many bit-manipulation operators.& Bit-wise AND| Bit-wise OR^ Bit-wise XOR~ Negate (one’s complement)>> Right-shift<< Left-shiftPlus assignment versions of each.Low-Level C Programming – p.Bit-manipulation basicsa |= 0x4; /*Set bit 2*/b &= ~0x4; /*Clear bit 2*/c &= ~(1 << 3); /*Clear bit 3*/d ^= (1 << 5); /*Toggle bit 5*/e >>= 2; /*Divide e by 4*/Low-Level C Programming – p. 10Advanced bit manipulation/*Set b to the rightmost 1 in a*/b = a & (a ^ (a - 1));/*Set d to the number of 1’s in c*/char c, d;d = (c & 0x55) + ((c & 0xaa) >> 1);d = (d & 0x33) + ((d & 0xcc) >> 2);d = (d & 0x0f) + ((d & 0xf0) >> 4);Low-Level C Programming – p. 11Faking MultiplicationAddition, subtraction, and shifting are fast. Cansometimes supplant multiplication.Like floating-point, not all processors have adedicated hardware multiplier.Recall the multiplication algorithm fromelementary school, but think binary:101011× 110110101110101100+1010110001000101111= 43 + 43 << 2 + 43 << 3 = 559Low-Level C Programming – p. 12Faking MultiplicationEven more clever if you include subtraction:101011× 1110101011010101100+1010110001001011010= 43 << 1 + 43 << 2 + 43 << 3= 43 << 4 - 43 << 2= 602Only usefulfor multiplication by a constantfor “simple” multiplicandswhen hardware multiplier not availableLow-Level C Programming – p. 13Faking DivisionDivision is a much more complicated algorithmthat generally involves decisions.However, division by a power of two is just a shift:a / 2 = a >> 1a / 4 = a >> 2a / 8 = a >> 3There is no general shift-and-add replacement fordivision, but sometimes you can turn it intomultiplication:a / 1.33333333= a*0.75= a*0.5 + a*0.25= a >> 1 + a >> 2Low-Level C Programming – p. 14Multi-way branchesif (a == 1)foo();else if (a == 2)bar();else if (a == 3)baz();else if (a == 4)qux();else if (a == 5)quux();else if (a == 6)corge();switch (a) {case 1:foo(); break;case 2:bar(); break;case 3:baz(); break;case 4:qux(); break;case 5:quux(); break;case 6:corge(); break;}Low-Level C Programming – p. 15Nios code for if-then-elseldw r2, 0(fp) # Fetch a from stackcmpnei r2, r2, 1 # Compare with 1bne r2, zero, .L2 # If not 1, jump to L2call foo # Call foo()br .L3 # branch out.L2:ldw r2, 0(fp) # Fetch a from stack (again!)cmpnei r2, r2, 2 # Compare with 2bne r2, zero, .L4 # If not 1, jump to L4call bar # Call bar()br .L3 # branch out.L4:Low-Level C Programming – p. 16Nios code for switch (1)ldw r2, 0(fp) # Fetch acmpgeui r2, r2, 7 # Compare with 7bne r2, zero, .L2 # Branch if greater or equalldw r2, 0(fp) # Fetch amuli r3, r2, 4 # Multiply by 4movhi r2, %hiadj(.L9) # Load address .L9addi r2, r2, %lo(.L9)add r2, r3, r2 # = a*4 + .L9ldw r2, 0(r2) # Fetch from jump tablejmp r2 # Jump to label.section .rodata.align 2.L9:.long .L2 # Branch table.long .L3.long .L4.long .L5.long .L6.long .L7.long .L8Low-Level C Programming – p. 17Nios code for switch (2).section .text.L3:call foobr .L2.L4:call barbr .L2.L5:call bazbr .L2.L6:call quxbr .L2.L7:call quuxbr .L2.L8:call corge.L2:Low-Level C Programming – p. 18Computing Discrete FunctionsThere are many ways to compute a “random”function of one variable:/*OK, especially for sparse domain*/if (a == 0) x = 0;else if (a == 1) x = 4;else if (a == 2) x = 7;else if (a == 3) x = 2;else if (a == 4) x = 8;else if (a == 5) x = 9;Low-Level C Programming – p. 19Computing Discrete Functions/*Better for large, dense domains*/switch (a) {case 0: x = 0; break;case 1: x = 4; break;case 2: x = 7; break;case 3: x = 2; break;case 4: x = 8;
View Full Document