UCSC CMPE 012 - Floating Point Numbers - D740400

School name University of California, Santa Cruz

Course Cmpe 012- Computer Systems and Assembly

Floating Point Numbers
Summer 2008
CMPE12 – Summer 2008 – Slides by ADB

Fractional numbers
  Fractional numbers – fixed point
  Floating point numbers – the IEEE 754 floating point standard
  Floating point operations
  Rounding modes

Positional representation of fractional numbers
  In base 10 (the decimal point sits between positions 0 and -1):

  Position     3      2      1      0      -1      -2      -3      -4
  Multiplier   10^3   10^2   10^1   10^0   10^-1   10^-2   10^-3   10^-4

  [Example number row: digits 6 5 4 3 1, positions not recoverable]

Positional representation of fractional numbers
  In base 2 (the binary point sits between positions 0 and -1):

  Position     3      2      1      0      -1      -2      -3      -4
  Multiplier   2^3    2^2    2^1    2^0    2^-1    2^-2    2^-3    2^-4

  [Example number row: bits 1 0 1 1 0, positions not recoverable]

Fractional numbers – fixed point
  Fixed-point representation
    How much information is necessary to store?
    How do you choose a format for the bits?
  Fixed-point operations
    Addition: align binary points, and add straight down
    Multiplication: ???

Decimal to binary conversion
  Convert A = 3.1415 (base 10) to base 2

Fixed-point number density
  [Figure: fixed-point number density]

Scientific notation
  In base 10
    Example: 3.0 × 10^8
  In base 2
    Example: -1.00101 × 2^4 (= -18.5 in base 10)
  The general form
    r = Sign × signiFicand × base^Exponent

Single-precision IEEE 754 floating-point numbers

  Bit 31   Bits 30-23   Bits 22-0
  Sign     Exponent     Significand

  One-bit sign
  Eight-bit exponent
  23-bit significand (that's the fractional part)

Single-precision IEEE 754 floating-point numbers
  Normalized numbers: only one non-zero bit to the left of the binary point
    Adjust the exponent as needed
    r = (-1)^S × F × 2^E
  Implicit leading 1 in the significand (the "hidden bit")
    r = (-1)^S × (1 + F) × 2^E
  Bias notation to represent the exponent
    With the bias B = 127
    r = (-1)^S × (1 + F) × 2^(E - B)

How to convert a base-10 number into IEEE 754 single-precision floating point
  Convert the number to binary
    The big (integer) part
    And the fractional part
  Normalize
    Isolate the hidden one
  Remove the significand's hidden one
  Add bias to the exponent
  Represent the number

Example: 12.625
  Convert to binary: 12.625 = 1100.101 (base 2)
  Normalize: 1.100101 × 2^3
  Remove hidden one: F = 100101 followed by 17 zeros
  Add bias to the exponent: 3 + 127 = 130 = 10000010 (base 2)
  The end: 0 10000010 10010100000000000000000

Double-precision IEEE 754 floating-point numbers

  Bit 63   Bits 62-52   Bits 51-0
  Sign     Exponent     Significand

  One-bit sign
  Eleven-bit exponent
  52-bit significand (that's the fractional part)

Summary of IEEE 754 formats

  Parameter     Single   Single Ext.   Double   Double Ext.
  Bits for F    24       >= 32         53       >= 64
  Bits for E    8        >= 11         11       >= 15
  Emax          +127     >= +1023      +1023    >= +16383
  Emin          -126     <= -1022      -1022    <= -16382
  Total bits    32       >= 43         64       >= 79

  For every precision, there are reserved exponents, used for special quantities:
    Emin - 1 (i.e., E = 0 with bias) is used for zero and denorms
    Emax + 1 (i.e., E = 255 or E = 2047, with bias) is used for NaN and infinity

Special quantities: Infinity
  This special quantity avoids halt on overflow
    Much safer than returning the largest possible number
  Representation:
    E = Emax + 1
      E = 255 in single precision with bias
      E = 2047 in double precision with bias
    F = 0
    Sign (+Inf or -Inf)

Special quantities: Infinity
  Examples of operations involving ±Inf

  Operation     Result
  1/0           +Inf
  -1/0          -Inf
  4 - Inf       -Inf
  Sqrt(+Inf)    +Inf
  1/Inf         +0

Special quantities: NaN (Not a Number)
  This special quantity avoids halt on invalid operations
  Representation:
    E = Emax + 1
      E = 255 in single precision with bias
      E = 2047 in double precision with bias
    F ≠ 0

Special quantities: NaN (Not a Number)
  Examples of operations that return NaN

  Operation   NaN produced by
  +           Inf + (-Inf)
  ×           0 × Inf
  /           0/0, Inf/Inf
  Sqrt        Sqrt(x) for x < 0

Special quantities: Zero
  Representation:
    E = Emin - 1 (i.e., E = 0)
    F = 0
    Sign: +0 or -0

Special quantities: Zero
  Examples of operations that involve ±0

  Operation   Result
  +0/3        +0
  -0/3        -0
  3/+0        +Inf
  3/-0        -Inf
  1/-Inf      -0
  log(+0)     -Inf
  log(-0)     -Inf

Floating point numbers range
  What is the largest number we can represent in IEEE 754 single-precision floating point?
  What is the smallest number?

Floating point numbers range
  What is the largest number we can represent in IEEE 754 double-precision floating point?
  What is the smallest number?

Floating point numbers: density
  Fact 1: Floats are not reals
    E.g., 2/3
  Fact 2: Floats are not decimals
    E.g., 0.1 (base 10) = 1.1001100... × 2^-4 (base 2)
  Fact 3: Not even all the integers in the range are represented
    E.g., 100,000,001 (base 10) = 101 1111 0101 1110 0001 0000 0001 (base 2),
    which needs 27 significant bits, more than the 24 available in single precision

Floating point numbers: density
  Close to 0: high density
  Far from 0: low density

Special quantities: Denormals
  These are numbers smaller than 2^(Emin)
  Fill the gap between 2^(Emin) and 0 (gradual underflow)
  Representation:
    E = Emin - 1 (i.e., E = 0)
    F ≠ 0
  The number represented is 0.f-1 f-2 ... f-23 × 2^(Emin)

Special quantities: Denormals
  [Figure: denormals, from 0 to 2^(Emin)]

Summary of IEEE 754 numbers

                Single precision          Double precision
  Object        Exponent   Significand    Exponent   Significand
  Zero          0          0              0          0
  Denormal      0          ≠ 0            0          ≠ 0
  Normalized    1 - 254    anything       1 - 2046   anything
  Infinity      255        0              2047       0
  NaN           255        ≠ 0            2047       ≠ 0
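
The single-precision encoding described in these slides can be checked with a short script. What follows is a minimal sketch, not part of the original slides: the function names float_to_fields, classify and fields_to_value are hypothetical, and it assumes Python's standard struct module is used to obtain the 32-bit pattern. It splits a number into sign, biased exponent and fraction, names the kind of object according to the summary table, and rebuilds the value with r = (-1)^S × (1 + F) × 2^(E - B).

```python
import math
import struct

BIAS = 127  # single-precision exponent bias B from the slides


def float_to_fields(x):
    """Round x to single precision; return (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # bit 31
    exponent = (bits >> 23) & 0xFF   # bits 30-23, stored with the bias added
    fraction = bits & 0x7FFFFF       # bits 22-0, hidden one not stored
    return sign, exponent, fraction


def classify(exponent, fraction):
    """Name the object according to the summary table (single precision)."""
    if exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"


def fields_to_value(sign, exponent, fraction):
    """Rebuild the numeric value from the three fields."""
    kind = classify(exponent, fraction)
    s = -1.0 if sign else 1.0
    if kind == "NaN":
        return math.nan
    if kind == "infinity":
        return s * math.inf
    if exponent == 0:
        # Zero or denormal: no hidden one, exponent fixed at Emin = -126.
        return s * (fraction / 2**23) * 2.0 ** (1 - BIAS)
    # Normalized: r = (-1)^S * (1 + F) * 2^(E - B)
    return s * (1 + fraction / 2**23) * 2.0 ** (exponent - BIAS)


# The 12.625 example from the slides.
sign, exponent, fraction = float_to_fields(12.625)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
# prints: 0 10000010 10010100000000000000000

# Largest finite value and smallest positive denormal, built from raw fields.
print(fields_to_value(0, 254, 0x7FFFFF))
print(fields_to_value(0, 0, 1))
```

Building values directly from the fields, as in the last two lines, is one way to explore the range questions above: exponent 254 with an all-ones fraction is the largest finite single-precision number, and exponent 0 with fraction 1 is the smallest positive denormal, 2^-149.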
