Berkeley COMPSCI 61C - Floating Point II

CS 61C L16 : Floating Point II (1)    Kusalo, Spring 2005 © UCB

Lecturer NSOE Steven Kusalo
inst.eecs.berkeley.edu/~cs61c
CS61C : Machine Structures
Lecture 16 – Floating Point II
2004-10-06

20 years from now...
1) We'll all have robot servants or...
2) The world will be a smoking ruin


Example: Representing 1/3 in MIPS

• 1/3 = 0.33333…_10
      = 0.25 + 0.0625 + 0.015625 + 0.00390625 + …
      = 1/4 + 1/16 + 1/64 + 1/256 + …
      = 2^-2 + 2^-4 + 2^-6 + 2^-8 + …
      = 0.0101010101…_2 × 2^0
      = 1.0101010101…_2 × 2^-2
  • Sign: 0
  • Exponent = -2 + 127 = 125 = 01111101
  • Significand = 0101010101…

  0 | 0111 1101 | 0101 0101 0101 0101 0101 010


Representation for ± ∞

• In FP, divide by 0 should produce ± ∞, not overflow.
• Why?
  • OK to do further computations with ∞. E.g., X/0 > Y may be a valid comparison
  • Ask math majors
• IEEE 754 represents ± ∞
  • Most positive exponent reserved for ∞
  • Significands all zeroes


Special Numbers

• What have we defined so far? (Single Precision)

  Exponent   Significand   Object
  0          0             0
  0          nonzero       ???
  1-254      anything      +/- fl. pt. #
  255        0             +/- ∞
  255        nonzero       ???

• Professor Kahan had clever ideas; “Waste not, want not”
  • Exp=0,255 & Sig!=0 …


Representation for Not a Number

• What is sqrt(-4.0) or 0/0?
  • If ∞ is not an error, these shouldn’t be either.
  • Called Not a Number (NaN)
  • Exponent = 255, Significand nonzero
• Why is this useful?
  • Hope NaNs help with debugging?
  • They contaminate: op(NaN, X) = NaN


Representation for Denorms (1/2)

• Problem: There’s a gap among representable FP numbers around 0
  • Smallest representable pos num:
      a = 1.0…_2 × 2^-126 = 2^-126
  • Second smallest representable pos num:
      b = 1.000……1_2 × 2^-126 = 2^-126 + 2^-149
      a - 0 = 2^-126
      b - a = 2^-149

  [Figure: number line from - to + around 0, marking a and b, with the gaps between 0 and ±a labeled "Gaps!"]

  Normalization and the implicit 1 are to blame!


Representation for Denorms (2/2)

• Solution:
  • We still haven’t used Exponent = 0, Significand nonzero
  • Denormalized number: no leading 1, implicit exponent = -126.
  • Smallest representable pos num:
      a = 2^-149
  • Second smallest representable pos num:
      b = 2^-148

  [Figure: number line from - to + around 0, with the gaps filled in]


Overview

• Reserve exponents, significands:

  Exponent   Significand   Object
  0          0             0
  0          nonzero       Denorm
  1-254      anything      +/- fl. pt. #
  255        0             +/- ∞
  255        nonzero       NaN


Rounding

• Math on real numbers ⇒ we worry about rounding to fit the result in the significand field.
• FP hardware carries 2 extra bits of precision, and rounds for proper value
• Rounding occurs when converting…
  • double to single precision
  • floating point # to an integer


IEEE Four Rounding Modes

• Round towards + ∞
  • ALWAYS round “up”: 2.1 ⇒ 3, -2.1 ⇒ -2
• Round towards - ∞
  • ALWAYS round “down”: 1.9 ⇒ 1, -1.9 ⇒ -2
• Truncate
  • Just drop the last bits (round towards 0)
• Round to (nearest) even (default)
  • Normal rounding, almost: 2.5 ⇒ 2, 3.5 ⇒ 4
  • Like you learned in grade school
  • Ensures fairness on calculation
  • Half the time we round up, other half down


Integer Multiplication (1/3)

• Paper and pencil example (unsigned):

  Multiplicand      1000     8
  Multiplier       x1001     9
                    1000
                   0000
                  0000
                +1000
                01001000

• m bits x n bits = m + n bit product


Integer Multiplication (2/3)

• In MIPS, we multiply registers, so:
  • 32-bit value x 32-bit value = 64-bit value
• Syntax of Multiplication (signed):
  • mult register1, register2
  • Multiplies 32-bit values in those registers & puts 64-bit product in special result regs:
    - puts product upper half in hi, lower half in lo
  • hi and lo are 2 registers separate from the 32 general purpose registers
  • Use mfhi register & mflo register to move from hi, lo to another register


Integer Multiplication (3/3)

• Example:
  • in C: a = b * c;
  • in MIPS:
    - let b be $s2; let c be $s3; and let a be $s0 and $s1 (since it may be up to 64 bits)

      mult $s2,$s3    # b*c
      mfhi $s0        # upper half of
                      # product into $s0
      mflo $s1        # lower half of
                      # product into $s1

• Note: Often, we only care about the lower half of the product.


Integer Division (1/2)

• Paper and pencil example (unsigned):

                 1001      Quotient
  Divisor 1000 | 1001010   Dividend
                -1000
                    10
                    101
                    1010
                   -1000
                      10   Remainder (or Modulo result)

• Dividend = Quotient x Divisor + Remainder


Integer Division (2/2)

• Syntax of Division (signed):
  • div register1, register2
  • Divides 32-bit register 1 by 32-bit register 2:
  • puts remainder of division in hi, quotient in lo
• Implements C division (/) and modulo (%)
• Example in C:
      a = c / d;
      b = c % d;
• in MIPS: a↔$s0; b↔$s1; c↔$s2; d↔$s3

      div  $s2,$s3    # lo=c/d, hi=c%d
      mflo $s0        # get quotient
      mfhi $s1        # get remainder


Unsigned Instructions & Overflow

• MIPS also has versions of mult, div for unsigned operands:
      multu
      divu
  • The choice of instruction determines whether the operands are treated as signed or unsigned, which can change the product and quotient.
• MIPS does not check overflow on ANY signed/unsigned multiply, divide instr
  • Up to the software to check hi


FP Addition & Subtraction

• Much more difficult than with integers (can’t just add significands)
• How do we do it?
  • De-normalize to match larger exponent
  • Add significands to get resulting one
  • Normalize (& check for under/overflow)
  • Round if needed (may need to renormalize)
• If signs ≠, do a subtract. (Subtract similar)
  • If signs ≠ for add (or = for sub), what’s ans sign?
• Question: How do we integrate this into the integer arithmetic unit? [Answer: We don’t!]


MIPS Floating Point Architecture (

• Separate floating point instructions:
  • Single Precision:
      add.s, sub.s, mul.s, div.s
  • Double Precision:
      add.d, sub.d, mul.d, div.d
• These are far more complicated than their integer counterparts
  • Can take
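
The Exponent/Significand table from the Overview slide can be checked directly on a real machine. The following is a minimal sketch in C, assuming the compiler's float is an IEEE 754 single-precision value (true on most current platforms, but not guaranteed by the C standard). The helper names float_bits, float_from_bits, fp_sign, fp_exponent, fp_significand and fp_classify are made up for this illustration and are not part of the lecture.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw 32-bit pattern of a float.
   Assumption: float is IEEE 754 single precision (1 + 8 + 23 bits). */
uint32_t float_bits(float f) {
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);   /* well-defined way to reinterpret the bits */
    return u;
}

/* Inverse of float_bits: build a float from a raw bit pattern. */
float float_from_bits(uint32_t u) {
    float f;
    assert(sizeof f == sizeof u);
    memcpy(&f, &u, sizeof f);
    return f;
}

/* The three fields, as on the slides: sign, biased exponent, significand. */
uint32_t fp_sign(float f)        { return float_bits(f) >> 31; }
uint32_t fp_exponent(float f)    { return (float_bits(f) >> 23) & 0xFFu; }
uint32_t fp_significand(float f) { return float_bits(f) & 0x7FFFFFu; }

/* Classify a value using the Exponent/Significand table. */
const char *fp_classify(float f) {
    uint32_t e = fp_exponent(f);
    uint32_t s = fp_significand(f);
    if (e == 0)   return s == 0 ? "zero" : "denorm";
    if (e == 255) return s == 0 ? "infinity" : "NaN";
    return "normalized";
}
```

For 1.0f / 3.0f, fp_exponent should return 125, matching the 1/3 example. The slide writes the significand truncated to 23 bits; under the default round-to-nearest mode the stored value normally ends in ...011 rather than ...010, because the discarded bits are more than half a unit in the last place.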

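
The four rounding modes can be illustrated without any floating-point hardware. The sketch below rounds a decimal value given in tenths (so 21 means 2.1) to an integer under each mode. This is only a model of the rule on the slide, not how FP hardware implements rounding, and the names round_mode, floor_div and round_tenths are hypothetical.

```c
#include <assert.h>

enum round_mode { ROUND_UP, ROUND_DOWN, ROUND_TRUNC, ROUND_NEAREST_EVEN };

/* Floor division for den > 0; plain C division truncates toward zero. */
long floor_div(long num, long den) {
    assert(den > 0);
    long q = num / den;
    if (num % den != 0 && num < 0) q -= 1;
    return q;
}

/* Round a value expressed in tenths (21 means 2.1) to an integer. */
long round_tenths(long tenths, enum round_mode mode) {
    long lo  = floor_div(tenths, 10);  /* nearest integer below or equal */
    long rem = tenths - lo * 10;       /* 0..9, distance above lo in tenths */
    long hi  = rem ? lo + 1 : lo;      /* nearest integer above or equal */

    switch (mode) {
    case ROUND_UP:    return hi;                    /* towards +infinity */
    case ROUND_DOWN:  return lo;                    /* towards -infinity */
    case ROUND_TRUNC: return tenths >= 0 ? lo : hi; /* towards zero */
    case ROUND_NEAREST_EVEN:
        if (rem < 5) return lo;
        if (rem > 5) return hi;
        return (lo % 2 == 0) ? lo : hi;             /* tie: pick the even one */
    }
    return lo;  /* not reached for valid modes */
}
```

With these definitions, round_tenths(25, ROUND_NEAREST_EVEN) gives 2 and round_tenths(35, ROUND_NEAREST_EVEN) gives 4, the two tie cases shown on the slide.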

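
The way mult, multu and div fill hi and lo can be modeled in C with 64-bit arithmetic. This is a rough sketch of the register behavior described on the integer multiplication and division slides, not an emulator; the struct hi_lo and the functions mips_mult, mips_multu and mips_div are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the MIPS hi and lo registers. */
struct hi_lo { uint32_t hi, lo; };

/* Model of mult: signed 32 x 32 -> 64-bit product, upper half in hi. */
struct hi_lo mips_mult(int32_t a, int32_t b) {
    int64_t product = (int64_t)a * (int64_t)b;
    struct hi_lo r;
    r.hi = (uint32_t)((uint64_t)product >> 32);
    r.lo = (uint32_t)product;
    return r;
}

/* Model of multu: same split, but operands are treated as unsigned. */
struct hi_lo mips_multu(uint32_t a, uint32_t b) {
    uint64_t product = (uint64_t)a * (uint64_t)b;
    struct hi_lo r;
    r.hi = (uint32_t)(product >> 32);
    r.lo = (uint32_t)product;
    return r;
}

/* Model of div: quotient in lo, remainder in hi.
   The caller must avoid b == 0 and INT32_MIN / -1, which C leaves undefined. */
struct hi_lo mips_div(int32_t a, int32_t b) {
    assert(b != 0);
    assert(!(a == INT32_MIN && b == -1));
    struct hi_lo r;
    r.hi = (uint32_t)(a % b);
    r.lo = (uint32_t)(a / b);
    return r;
}
```

The same bit patterns give different results depending on the instruction: mips_mult(-1, 1) leaves hi as all ones (the sign extension of -1), while mips_multu(0xFFFFFFFF, 1) leaves hi as 0. The paper-and-pencil division example corresponds to mips_div(74, 8), with 9 in lo and 2 in hi.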