Fixed Point Numbers
Fixed Point (cont.)
Algorithm for converting fractional decimal to Binary
Unsigned Overflow
Saturating Arithmetic
Saturating Arithmetic
Saturating Adder: Unsigned and 2's Complement
Saturating Adder: Unsigned, 4 Bit example
Saturating Adder: Signed, 4 Bit example
Saturating Arithmetic
Why Saturating Arithmetic?
Floating Point Representations
Floating Point Encoding
Single Precision, IEEE 754
Convert Floating Point Binary Format to Decimal
Convert Decimal FP to binary encoding
Convert Decimal FP to binary encoding (cont)
Overflow/Underflow, Double Precision
Special Numbers
Comments on IEEE Format

V 0.1

Fixed Point Numbers
• The binary integer arithmetic you are used to is known by the more general term of Fixed Point arithmetic.
– Fixed Point means that we view the decimal point as being in the same place for all numbers involved in the calculation.
– For the integer interpretation, the decimal point is all the way to the right.

Unsigned integers, decimal point to the right:
  $C0 + $25 = $E5
  192. + 37. = 229.

A common notation for fixed point is 'X.Y', where X is the number of digits to the left of the decimal point and Y is the number of digits to the right of the decimal point.

Fixed Point (cont.)
• The decimal point can actually be located anywhere in the number: to the right, somewhere in the middle, or to the left.

Addition of two 8 bit numbers; different interpretations of results based on location of decimal point:
  Hex: $11 + $1F = $30
  xxxxxxxx.0 (decimal point to right; this is 8.0 notation): 17 + 31 = 48
  xxxxxx.yy (two binary fractional digits; this is 6.2 notation): 4.25 + 7.75 = 12.00
  0.yyyyyyyy (decimal point to left, all fractional digits; this is 0.8 notation): 0.07 + 0.12 = 0.19

Algorithm for converting fractional decimal to Binary
An algorithm for converting any fractional decimal number to its binary representation is successive multiplication by two (which results in shifting left). It determines bits from MSB to LSB.
a. Multiply the fraction by 2.
b. If the number >= 1.0, then current bit = 1, else current bit = 0.
c. Take the fractional part of the number and go to 'a'. Continue until the fractional part is 0 or the desired precision is reached.

Example: Convert .5625 to binary
  .5625 x 2 = 1.125 ( >= 1.0, so MSB bit = '1')
  .125 x 2 = .25 ( < 1.0, so bit = '0')
  .25 x 2 = .5 ( < 1.0, so bit = '0')
  .5 x 2 = 1.0 ( >= 1.0, so bit = '1'), finished.
  .5625 = .1001b

Unsigned Overflow
• Recall that a carry out of the Most Significant Digit is an unsigned overflow. This indicates an error: the result is NOT correct!

Addition of two 8 bit numbers; different interpretations of results based on location of decimal point:
  Hex: $FF + $01 = $00
  xxxxxxxx.0 (decimal point to right): 255 + 1 = 0
  xxxxxx.yy (two binary fractional digits, 6.2 notation): 63.75 + 0.25 = 0
  0.yyyyyyyy (decimal point to left, all fractional digits, 0.8 notation): 0.99609 + 0.00391 = 0

Saturating Arithmetic
• Saturating arithmetic means that if an overflow occurs, the number is clamped to the maximum possible value.
– Gives a result that is closer to the correct value.
– Used in DSP and graphics applications.
– Requires extra hardware to be added to the binary adder.
– Pentium MMX instructions have an option for saturating arithmetic.

  Hex: $FF + $01 = $FF
  xxxxxxxx.0 (decimal point to right): 255 + 1 = 255
  xxxxxx.yy (two binary fractional digits): 63.75 + 0.25 = 63.75
  0.yyyyyyyy (decimal point to left, all fractional digits): 0.99609 + 0.00391 = 0.99609

Saturating Arithmetic
The Intel x86 MMX instructions perform SIMD operations between MMX registers on packed bytes, words, or dwords. The arithmetic operations can be made to operate in saturation mode. Saturation mode clips numbers to the maximum positive or maximum negative value during arithmetic.
  In normal mode: FFh + 01h = 00h (unsigned overflow)
  In saturated, unsigned mode: FFh + 01h = FFh (saturated to maximum value, closer to actual arithmetic value)
  In normal mode: 7Fh + 01h = 80h (signed overflow)
  In saturated, signed mode: 7Fh + 01h = 7Fh (saturated to max value)

Saturating Adder: Unsigned and 2's Complement
• For an unsigned saturating adder, 8 bit:
– Perform binary addition.
– If carry out of MSB = 1, then the result is $FF.
– If carry out of MSB = 0, then the result is the binary addition result.
• For a 2's complement saturating adder, 8 bit:
– Perform binary addition.
– If Overflow = 1 (which can only happen when both operands have the same sign), then:
  • If the operands are negative, then the result is $80.
  • If the operands are positive, then the result is $7F.
– If Overflow = 0, then the result is the binary addition result.

Saturating Adder: Unsigned, 4 Bit example
[Figure: adder with inputs A[3:0] and B[3:0], sum T[3:0], and carry-out CO; a 2/1 mux with select S chooses between T[3:0] and 1111 to drive SUM[3:0].]

Saturating Adder: Signed, 4 Bit example
[Figure: adder with inputs A[3:0] and B[3:0] and sum T[3:0]; a 2/1 mux chooses between T[3:0] and the saturated value A3 A3' A3' A3' to drive SUM[3:0].]

Vflag = T3 A3' B3' + T3' A3 B3

Vflag is true if the signs of both operands are the same (both negative or both positive) and different from the sign of the sum. Overflow occurs when adding two positive numbers gives a negative result, or adding two negative numbers gives a positive result. Adding a positive and a negative number cannot overflow.

The saturated value has the same sign as the operands, with the other bits equal to NOT(sign): 0111 (positive saturation), 1000 (negative saturation).

Why Saturating Arithmetic?
• In case of integer overflow (either signed or unsigned), many applications are satisfied with an answer that is close to the right answer, that is, saturated to the maximum result.
• Many DSP (Digital Signal Processing) algorithms depend on this feature.
– Many DSP algorithms for audio data (8 to 16 bit data) and video data (8-bit R,G,B values) are integer based, and need saturating arithmetic.
• This is easy to implement in hardware, but slow to emulate in software. A nice feature to have.
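The fixed point interpretation, the fraction-to-binary algorithm, and the saturating adder rules from the slides above can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware design from the slides; the function names (`fixed_value`, `frac_to_binary`, `sat_add_unsigned`, `sat_add_signed`) and the `bits` / `max_bits` parameters are hypothetical names chosen for this example.

```python
def fixed_value(raw, frac_bits):
    """Interpret an unsigned bit pattern as X.Y fixed point with Y = frac_bits."""
    return raw / (1 << frac_bits)


def frac_to_binary(frac, max_bits=8):
    """Convert a fraction in [0, 1) to binary by successive multiplication by 2."""
    if not 0 <= frac < 1:
        raise ValueError("frac must be in [0, 1)")
    digits = []
    while frac != 0 and len(digits) < max_bits:
        frac *= 2                      # step a: multiply by 2
        if frac >= 1.0:                # step b: pick the current bit
            digits.append("1")
            frac -= 1.0                # step c: keep only the fractional part
        else:
            digits.append("0")
    return "." + "".join(digits)


def sat_add_unsigned(a, b, bits=8):
    """Unsigned saturating add: clamp to all ones when a carry out occurs."""
    max_val = (1 << bits) - 1
    total = a + b
    return max_val if total > max_val else total


def sat_add_signed(a, b, bits=8):
    """2's complement saturating add on raw bit patterns (e.g. 0x80 means -128)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    t = (a + b) & mask                 # plain binary addition, carry discarded
    a_neg, b_neg, t_neg = bool(a & sign), bool(b & sign), bool(t & sign)
    # Overflow: operands share a sign and the sum has the other sign (Vflag).
    overflow = (a_neg == b_neg) and (t_neg != a_neg)
    if not overflow:
        return t
    # Saturated value: sign of the operands, remaining bits NOT(sign).
    return sign if a_neg else sign - 1


if __name__ == "__main__":
    print(fixed_value(0x30, 2))               # $30 read as 6.2 notation
    print(frac_to_binary(0.5625))             # worked example from the slides
    print(hex(sat_add_unsigned(0xFF, 0x01)))
    print(hex(sat_add_signed(0x7F, 0x01)))
    print(hex(sat_add_signed(0x80, 0xFF)))
```

The signed version works on raw bit patterns rather than Python's signed integers so that the overflow test mirrors the Vflag equation: only the sign bits of the two operands and the sum are examined.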


MSU ECE 3724 - Fixed Point Numbers
