Berkeley COMPSCI C267 - Lecture 13: Floating Point Arithmetic - D2019887

School name University of California, Berkeley

Course Compsci C267- Applications of Parallel Computers

CS 267 Applications of Parallel Computers
Lecture 13: Floating Point Arithmetic

Outline
A little history
Defining Floating Point Arithmetic
IEEE Floating Point Arithmetic Standard 754 - Normalized Numbers
Rules for performing arithmetic
Error Analysis
Example: polynomial evaluation using Horner's rule
Example: polynomial evaluation (continued)
Slide 10
Exception Handling
IEEE Floating Point Arithmetic Standard 754 - "Denorms"
IEEE Floating Point Arithmetic Standard 754 - +- Infinity
IEEE Floating Point Arithmetic Standard 754 - NAN (Not A Number)
Exception Handling User Interface
Exploiting Exception Handling to Design Faster Algorithms
Summary of Values Representable in IEEE FP
Simulating extra precision
Cray Arithmetic
Hazards of Parallel and Heterogeneous Computing
Hazard #1: Nonrepeatability due to nonassociativity
Hazard #2: Heterogeneity: Different Exception Defaults
Hazard #3: Heterogeneity: Data Dependent Branches
Further References on Floating Point Arithmetic

CS267 L13 Floating Point.1 Demmel Sp 1999

CS 267 Applications of Parallel Computers
Lecture 13: Floating Point Arithmetic
James Demmel
http://www.cs.berkeley.edu/~demmel/cs267_Spr99

Outline
° A little history
° IEEE floating point formats
° Error analysis
° Exception handling
  • Using exception handling to go faster
° How to get extra precision cheaply
° Cray arithmetic - a pathological example
° Dangers of Parallel and Heterogeneous Computing

A little history
° Von Neumann and Goldstine - 1947
  • "Can't expect to solve most big [n>15] linear systems without carrying many decimal digits [d>8], otherwise the computed answer would be completely inaccurate." - WRONG!
° Turing - 1949
  • "Carrying d digits is equivalent to changing the input data in the d-th place and then solving Ax=b. So if A is only known to d digits, the answer is as accurate as the data deserves."
  • Backward Error Analysis
° Rediscovered in 1961 by Wilkinson and publicized
° Starting in the 1960s - many papers doing backward error analysis of various algorithms
° Many years where each machine did FP arithmetic slightly differently
  • Both rounding and exception handling differed
  • Hard to write portable and reliable software
  • Motivated search for industry-wide standard, beginning late 1970s
  • First implementation: Intel 8087
° ACM Turing Award 1989 to W. Kahan for design of the IEEE Floating Point Standards 754 (binary) and 854 (decimal)
  • Nearly universally implemented in general purpose machines

Defining Floating Point Arithmetic
° Representable numbers
  • Scientific notation: +/- d.d…d x $r^{exp}$
  • sign bit +/-
  • radix r (usually 2 or 10, sometimes 16)
  • significand d.d…d (how many base-r digits d?)
  • exponent exp (range?)
  • others?
° Operations:
  • arithmetic: +, -, x, /, ...
    - how to round result to fit in format
  • comparison (<, =, >)
  • conversion between different formats
    - short to long FP numbers, FP to integer
  • exception handling
    - what to do for 0/0, 2*largest_number, etc.
  • binary/decimal conversion
    - for I/O, when radix not 10
° Language/library support for these operations

IEEE Floating Point Arithmetic Standard 754 - Normalized Numbers
° Normalized Nonzero Representable Numbers: +- 1.d…d x $2^{exp}$
  • Macheps = Machine epsilon = $2^{-\#\text{significand bits}}$ = relative error in each operation
  • OV = overflow threshold = largest number
  • UN = underflow threshold = smallest number
° +- Zero: +-, significand and exponent all zero
  • Why bother with -0? Later

Format           # bits  # significand bits  macheps                          # exponent bits  exponent range
---------------  ------  ------------------  -------------------------------  ---------------  ---------------------------------------------
Single           32      23+1                $2^{-24}$ (~$10^{-7}$)           8                $2^{-126}$ - $2^{127}$ (~$10^{\pm 38}$)
Double           64      52+1                $2^{-53}$ (~$10^{-16}$)          11               $2^{-1022}$ - $2^{1023}$ (~$10^{\pm 308}$)
Double Extended  >=80    >=64                <=$2^{-64}$ (~$10^{-19}$)        >=15             $2^{-16382}$ - $2^{16383}$ (~$10^{\pm 4932}$)
(Double Extended is 80 bits on all Intel machines)

Rules for performing arithmetic
° As simple as possible:
  • Take the exact value, and round it to the nearest floating point number (correct rounding)
  • Break ties by rounding to nearest floating point number whose bottom bit is zero (rounding to nearest even)
  • Other rounding options too (up, down, towards 0)
° Don't need exact value to do this!
  • Early implementors worried it might be too expensive, but it isn't
° Applies to
  • +, -, *, /
  • sqrt
  • conversion between formats
  • rem(a,b) = remainder of a after dividing by b
    - a = q*b + rem, q = floor(a/b)
    - cos(x) = cos(rem(x,2*pi)) for |x| >= 2*pi
    - cos(x) is exactly periodic, with period rounded(2*pi)

Error Analysis
° Basic error formula
  • fl(a op b) = (a op b)*(1 + d) where
    - op one of +, -, *, /
    - |d| <= macheps
    - assuming no overflow, underflow, or divide by zero
° Example: adding 4 numbers
  • $$
    \begin{aligned}
    fl(x_1+x_2+x_3+x_4) &= \{[(x_1+x_2)(1+d_1) + x_3](1+d_2) + x_4\}(1+d_3) \\
    &= x_1(1+d_1)(1+d_2)(1+d_3) + x_2(1+d_1)(1+d_2)(1+d_3) + x_3(1+d_2)(1+d_3) + x_4(1+d_3) \\
    &= x_1(1+e_1) + x_2(1+e_2) + x_3(1+e_3) + x_4(1+e_4)
    \end{aligned}
    $$
    where each $|e_i| \lesssim 3 \cdot \text{macheps}$
  • get exact sum of slightly changed summands $x_i(1+e_i)$
  • Backward Error Analysis - algorithm called numerically stable if it gives the exact result for slightly changed inputs
  • Numerical Stability is an algorithm design goal

Example: polynomial evaluation using Horner's rule
° Horner's rule to evaluate $p = \sum_{k=0}^{n} c_k x^k$
  • p = $c_n$, for k = n-1 downto 0, p = x*p + $c_k$
° Numerically Stable
° Apply to $(x-2)^9 = x^9 - 18x^8 + \dots - 512$

Example: polynomial evaluation (continued)
° $(x-2)^9 = x^9 - 18x^8 + \dots - 512$
° We can compute error bounds using
  • fl(a op b) = (a op b)*(1+d)

Slide 10
What happens when the "exact value" is not a real number, or is too small or too large to represent accurately?
You get an "exception"

Exception Handling
° What happens when the "exact value" is not a real number, or too small or too large to represent accurately?
° 5 Exceptions:
  • Overflow - exact result > OV, too large to represent
  • Underflow - exact result nonzero and < UN, too small to represent
  • Divide-by-zero -
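
The basic error formula fl(a op b) = (a op b)*(1 + d) from the Error Analysis slide can be checked numerically. The sketch below is illustrative and not part of the lecture: it assumes Python floats are IEEE 754 doubles (so macheps = 2^-53, as in the format table), uses exact rational arithmetic from the standard `fractions` module as the reference, and the helper names (`rel_error_of_sum`, `sum_left_to_right`, `rel_error_of_total`) are hypothetical.

```python
from fractions import Fraction

# Assumption: Python floats are IEEE 754 doubles, so macheps = 2**-53.
MACHEPS = Fraction(1, 2**53)


def rel_error_of_sum(a, b):
    """Return |d| in fl(a + b) = (a + b) * (1 + d), computed exactly."""
    exact = Fraction(a) + Fraction(b)      # exact sum of the two floats
    computed = Fraction(a + b)             # the rounded floating point sum
    if exact == 0:
        return Fraction(0)
    return abs(computed - exact) / abs(exact)


def sum_left_to_right(xs):
    """Add the summands in order: ((x1 + x2) + x3) + x4 ..."""
    s = xs[0]
    for x in xs[1:]:
        s = s + x
    return s


def rel_error_of_total(xs):
    """Relative error of the left-to-right floating point sum of xs."""
    exact = sum(Fraction(x) for x in xs)
    computed = Fraction(sum_left_to_right(xs))
    return abs(computed - exact) / abs(exact)


if __name__ == "__main__":
    d = rel_error_of_sum(0.1, 0.2)
    print("one addition:  |d| / macheps =", float(d / MACHEPS))

    xs = [0.1, 0.2, 0.3, 0.4]
    e = rel_error_of_total(xs)
    # For positive summands the relative error of the sum is at most
    # max |e_i| <= (1 + macheps)**3 - 1, which is roughly 3 * macheps.
    bound = (1 + MACHEPS) ** 3 - 1
    print("four summands: error / macheps =", float(e / MACHEPS))
    print("within (1 + macheps)^3 - 1 bound:", e <= bound)
```

The comparison against the bound is only meaningful here because all four summands are positive; with mixed signs the backward error bound on each summand still holds, but the relative error of the total can be much larger.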

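The Horner's rule recurrence from the polynomial evaluation slides (p = c_n, then p = x*p + c_k for k = n-1 down to 0) can be sketched as follows. This is a minimal illustration, not the lecture's own code: the function name `horner` and the coefficient list `COEFFS` are hypothetical, and the coefficients of (x-2)^9 are generated from the binomial theorem rather than taken from the slides, which elide the middle terms.

```python
from math import comb


def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) with Horner's rule.

    coeffs[k] is the coefficient of x**k, so coeffs[-1] is c_n.
    """
    p = coeffs[-1]                       # p = c_n
    for c in reversed(coeffs[:-1]):      # k = n-1 down to 0
        p = x * p + c
    return p


# Coefficients of (x - 2)**9 from the binomial theorem:
# c_k = C(9, k) * (-2)**(9 - k), i.e. -512, ..., -18, 1.
COEFFS = [comb(9, k) * (-2) ** (9 - k) for k in range(10)]

if __name__ == "__main__":
    # Compare the expanded form (via Horner) with the factored form near x = 2.
    for x in (1.92, 1.96, 1.99, 2.01, 2.04, 2.08):
        expanded = horner(COEFFS, x)
        factored = (x - 2.0) ** 9
        print(f"x={x:5.2f}  Horner={expanded: .3e}  (x-2)^9={factored: .3e}")
```

Close to x = 2 the true value of the polynomial is tiny compared with the size of the individual terms, so the Horner result for the expanded form can be dominated by rounding error even though the algorithm is backward stable.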