Berkeley COMPSCI 61B - Lecture Notes - D615345

Home> Schools> University of California, Berkeley> Computer Science (COMPSCI) > COMPSCI 61B> Lecture Notes

DOC PREVIEW

Berkeley COMPSCI 61B - Lecture Notes

School name University of California, Berkeley

Course Compsci 61b- Data Structures

Pages 8

This preview shows page 1-2-3 out of 8 pages.

Save

View full document

Premium Document

Do you want full access? Go Premium and unlock all 8 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 8 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

View full document

Premium Document

Do you want full access? Go Premium and unlock all 8 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Premium Document

Do you want full access? Go Premium and unlock all 8 pages.

Access to all documents

Download any document

Ad free experience

Subscribe for instant access Get instant access

Unformatted text preview:

CS61B Lecture #12Today:• Floating point arithmetic (just a taste).Readings for Today:Programming Into Java,§5.3,§5.4.Readings for next Topic:Programming Into Java,§4.1Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 1Floating point• Problem: Need to represent non-integral quantities.– Regardless of magnitudes, however, precise rep-resentation can require arbitrarily large amountsof space.– Unclear what to do about irrational numbers.– Must have arithmetic that is fast.• Solution: Don’t solve the problem of representing allreal numbers.– Instead, use fact that most scientific computa-tion only needs results that are “good enough,”– That is, that carry at least as many significantdigits as the data justify.– Settle for a subset of the rationals representablein a fixed number of bits (typical: 32, 64, 80,sometimes 128).• Many variations on this theme used in the past.• Majority of machines now supportIEEE binary floating-point arithmetic,. . .• . . . which (largely) standardizes on numbers repre-sented and approximations used.Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 2Basic Idea: Representation• Java’s two floating-point types each represent justthe numbers±m × 2e0 ≤ m < 2n, for some fixed n and e in some fixedrange of integers,• and also ±∞ and NaN (Not a Number).• n is the number of significant bits (≈ 3.3× numberof significant base-10 digits).• For double, n = 53 (> 15 significant digits), and−1074 ≤ e < 972 (roughly, positive values in therange 5.0 × 10−324to 1.8 × 10308).• For float, n = 24 (> 6 significant digits), and −150 ≤e < 105, (roughly, positive values in the range1.4 × 10−45to 3.4 × 1038).• ±∞ used for values whose magnitude is larger thanthis range, including 1.0/0.• NaN used for undefined results (e.g., ∞ − ∞, 0/0, orany operation involving NaN.• -0 compares == to 0, but 1/(-0) gives −∞. Why?Sorry; you’ll have to study complex functions.Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 3Basic Idea: Arithmetic• Basic rule is simple: to compute x ⊕ y, (where ⊕ is+, -, *, /) compute the true mathematical result, andthenroundto nearest representable floating-pointnumber.• Same for literals: is no exact binary equivalent of0.1, so round to nearest.•Unbiased rounding:Sometimes, result can beex-actlyhalfway between two representable values: chooseone with last bit 0.• (Unfortunately, whole truth is a little more complex:Java implementations allowed to usemoreprecisionto compute intermediate results.)Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 4The Floating Number Line• Floating-point arithmetic gives same number of sig-nificant bits at all values, except near 0.• Implies that they “stretch out” the further from 0you go.• So, if we had only 3 significant bits, and the smallestexponent of 2 were 2−6, we’d see this:· · · · · ·0 1/8 1/4 1/2 1Note: corrected from book.• In particular, we’d get all the 3-bit (n-bit) integers:· · · · · ·1 2 3 4 5 6 7 8 10• After which, we start skipping.• Random terminology: The first 23−1 numbers after0 in the first diagram (2n− 1 in general) are calleddenormalized.• Unique in having same spacing as the next 23. Insome other versions of floating point, they are miss-ing.Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 5Input of Floating-Point• To convert from string, S, to the floating-point num-ber it denotes, use Double.parseDouble(S) orFloat.parseFloat(S)• To read, can use StreamTokenizer (see project), or(around here) the UCB I/O package, which is like C’sscanf:stdin.scan ("%le"); // Means "read a double"x = stdin.doubleItem (0);Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 6Output of Floating Point• When printing (or converting to String, as in "X=" + X),Java by default prints “enough” significant digitsand decides on whether to use scientific notationbased on size.• You can control this all explicitly:• In the standard Java library, can use DecimalFormattype.• With the UCB I/O library can use format (which im-itates the C printf functions),• E.g., If x is 3.1415926, thenDecimalFormat form = new DecimalFormat (pattern1);...System.out.print (form.format(x));stdout.format (pattern2).put (x);producesPattern1Pattern2Print"0.000" "%5.3f" 3.142"00.000" "%06.3f" 03.142"#0.000" "%6.3f" 3.142"#.000" 3.142"0.0E00" "%6.1e" 3.1E00".00E00" .31E01Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12 7A Few Points to Remember• IEEE gives reasonably well-behaved approximations,but theyareapproximations.• In particular, many decimal fractions not exactlyrepresentable in binary (0.1, e.g.)• Likewise, repeated addition not the same as multi-plication. N adds accumulate N rounding errors; onemultiply accumulates only 1 rounding error. Example:for (double x = 0.0; x < 1.0; x += 0.1) { ... }Loop runs 11 times, not 10.• When you add large quantities to small, you lose sig-nificant digits from the right of the small quantity.Thus, computations like X+Y-Z, where X and Z arenearly equal and much larger than Y won’t be terri-bly accurate.Last modified: Fri Sep 28 11:11:08 2001 CS61B: Lecture #12

View Full Document


School:
Email:
New Password:
Confirm Password:

This preview shows page 1-2-3 out of 8 pages.

Berkeley COMPSCI 61B - Lecture Notes

Sign up for free to view:

Please select your school