BR 6/00

Instruction Set Extensions to x86
• Some extensions to the x86 instruction set are intended to accelerate 3D graphics.
• AMD 3DNow! instructions simply accelerate floating point arithmetic.
– Accelerate object transformations
– Allow multiple floating point operations to be done in one clock cycle
• A similar extension is found on the Pentium III; it just does not have the fancy name.

Floating Point SIMD Instructions
• SIMD stands for Single Instruction, Multiple Data
• The same instruction is applied to multiple operands
– Do an add on four pairs of operands: y0 = a0 + b0, y1 = a1 + b1, y2 = a2 + b2, y3 = a3 + b3
• The Pentium III added some 128 bit registers used to hold 'packed' single precision floating point numbers
– A single precision floating point number is 32 bits

XMM Registers
The new 128 bit registers are called XMM registers (XMM0 – XMM7).
Each holds four 32-bit single precision floating point numbers.
An instruction like ADDPS xmm0, xmm1 will add the two registers together, computing the sums of the four pairs of numbers.
It is easy to see the speed advantage over previous instructions.

     4.0     4.0     3.5    -2.0    (four 32-bit values)
+   -1.5     2.0     1.7     2.3    (four 32-bit values)
=    2.5     6.0     5.2     0.3    (four 32-bit values)

SIMD Extensions
More than 70 instructions. Arithmetic operations supported: addition, subtraction, multiplication, division, square root, maximum, minimum. Can operate on floating point or integer data.

Flags
• Individual flags are not kept for each packed operation.
• You can only tell if an error (exception) occurred in one or more of the packed operations.
• Some possible exceptions (not all listed)
– Underflow (number too small)
– Overflow (number too large)
– Divide by zero

Pentium 3 vs. Pentium 4
• The SIMD extensions on the Pentium 3 are called the SSE instructions, and the 128 bit registers only support viewing the data as 4 single precision FP numbers.
• On the Pentium 4, the 128 bit registers can be viewed as these data types:
– 4 single precision FP values (SSE)
– 2 double precision FP values (SSE2)
– 16 byte values (SSE2)
– 8 word values (SSE2)
– 4 double word values (SSE2)
– 1 128-bit integer value (SSE2)

MMX Instructions
Added eight 64 bit registers. Each 64 bit register can be viewed as containing 8 packed bytes, 4 packed words, 2 dwords, or 1 quadword.

Saturating Arithmetic
The MMX instructions perform SIMD operations between MMX registers on packed bytes, words, or dwords. The arithmetic operations can be made to operate in saturation mode. Saturation mode clips results to the maximum positive or maximum negative value during arithmetic.
In normal mode: FFh + 01h = 00h (unsigned overflow)
In saturated, unsigned mode: FFh + 01h = FFh (saturated to maximum value, closer to the actual arithmetic value)
In normal mode: 7Fh + 01h = 80h (signed overflow)
In saturated, signed mode: 7Fh + 01h = 7Fh (saturated to maximum value)

Why Saturating Arithmetic?
• In case of integer overflow (either signed or unsigned), many applications are satisfied with just getting an answer that is close to the right answer or saturated to the maximum result.
• Many DSP (Digital Signal Processing) algorithms depend on this feature
– Many DSP algorithms for audio data (8 to 16 bit data) and video data (8-bit R, G, B values) are integer based and need saturating arithmetic.
• This is easy to implement in hardware, but slow to emulate in software.
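The wraparound and saturated results listed above can be checked with a small software model. This is only a sketch in plain Python, not MMX code: the function names are made up for illustration, and each function models a single 8-bit lane (the packed helper simply repeats the unsigned version over a list of bytes).

```python
def wrap_add_u8(a, b):
    """Normal (wraparound) unsigned 8-bit add: keep only the low 8 bits."""
    return (a + b) & 0xFF


def sat_add_u8(a, b):
    """Unsigned saturating 8-bit add: clip the sum to FFh."""
    return min(a + b, 0xFF)


def sat_add_s8(a, b):
    """Signed saturating 8-bit add: clip the sum to the range -128..127."""
    return max(-128, min(a + b, 127))


def packed_sat_add_u8(xs, ys):
    """Hypothetical helper: unsigned saturating add applied lane by lane."""
    return [sat_add_u8(x, y) for x, y in zip(xs, ys)]


print(hex(wrap_add_u8(0xFF, 0x01)))  # 0x0  (normal mode wraps around)
print(hex(sat_add_u8(0xFF, 0x01)))   # 0xff (saturated to maximum value)
print(sat_add_s8(0x7F, 0x01))        # 127  (7Fh, saturated signed maximum)
print(packed_sat_add_u8([250, 10, 255], [10, 10, 1]))  # [255, 20, 255]
```

Every lane needs its own comparison and clip in this model, which gives a feel for why emulating saturation in software costs more than a plain add.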
A nice feature to have.

Floating Point Representations
• The goal of floating point representation is to represent a large range of numbers.
• Floating point in decimal representation looks like: +3.0 x 10^3, 4.5647 x 10^-20, etc.
• In binary, sample numbers look like: -1.0011 x 2^4, 1.10110 x 2^-3, etc.
• Our binary floating point numbers will always be of the general form: (sign) 1.mmmmmm x 2^exponent
• The sign is positive or negative, the bits to the right of the binary point are the mantissa or significand, and the exponent can be either positive or negative. The numeral to the left of the binary point is ALWAYS 1 (normalized notation).

Floating Point Encoding
• The number of bits allocated for the exponent determines the maximum and minimum floating point numbers (range): 1.0 x 2^-max (small number) to 1.0 x 2^+max (large number)
• The number of bits allocated for the significand determines the precision of the floating point number.
• The sign only needs one bit (negative: 1, positive: 0).

Single Precision, IEEE 754
Single precision floating point numbers using the IEEE 754 standard require 32 bits:

Field:    S        exponent    significand
Width:    1 bit    8 bits      23 bits
Bits:     31       30-23       22-0

Exponent encoding is bias 127. To get the encoding, take the exponent and add 127 to it.
If the exponent is -1, then exponent field = -1 + 127 = 126 = 7Eh
If the exponent is 10, then exponent field = 10 + 127 = 137 = 89h
The smallest allowed exponent is -126, the largest allowed exponent is +127. This leaves the encodings 00h and FFh unused for normal numbers.

Convert Floating Point Binary Format to Decimal
1 10000001 010000........0
S exponent significand
What is this number?
Sign bit = 1, so negative. Exponent field = 81h = 129. Actual exponent = exponent field - 127 = 129 - 127 = 2.
Number is:
-1.(01000...000) x 2^2
= -(1 + 0 x 2^-1 + 1 x 2^-2 + 0 x 2^-3 + ... + 0) x 2^2
= -(1 + 0 + 0.25 + 0 + ... + 0) x 4
= -1.25 x 4 = -5.0

Convert Decimal FP to Binary Encoding
What is the number -28.75 in single precision floating point?
1. Ignore the sign, and convert the integer and fractional parts to binary representation first:
a. 28 = 1Ch = 0001 1100
b. .75 = .5 + .25 = 2^-1 + 2^-2 = .11
-28.75 in binary is -00011100.11 (ignore leading zeros)
2. Now NORMALIZE the number to the format 1.mmmm x 2^exp. Normalize by shifting. Each shift right adds one to the exponent, each shift left subtracts one from the exponent:
-11100.11 x 2^0 = -1110.011 x 2^1 = -111.0011 x 2^2 = -11.10011 x 2^3 = -1.110011 x 2^4

Convert Decimal FP to Binary Encoding (cont)
Normalized number is: -1.110011 x 2^4
Sign bit = 1
Significand field = 110011000...000
Exponent field = 4 + 127 = 131 = 83h = 1000 0011
Complete 32-bit number is:
1 10000011 110011000....000
S exponent significand

Algorithm for Converting Fractional Decimal to Binary
An algorithm for converting any fractional decimal number to its binary representation is successive multiplication by two (results in
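
The preview text breaks off at this point, so the following is a sketch of the usual form of successive multiplication by two, not necessarily the exact procedure on the remaining slides: double the fraction, record the integer part (0 or 1) as the next bit, and continue with what is left of the fraction. The function names are made up for illustration. The second helper uses Python's standard struct module to split a value into the three IEEE 754 single precision fields, which makes it possible to check the worked examples above.

```python
import struct
from fractions import Fraction


def fraction_to_binary(frac, max_bits=23):
    """Convert a fraction in [0, 1) to a string of binary digits
    by repeatedly multiplying by two and peeling off the integer part."""
    frac = Fraction(frac)          # exact arithmetic, no rounding noise
    bits = []
    while frac and len(bits) < max_bits:
        frac *= 2
        if frac >= 1:              # integer part is 1: emit a 1 bit
            bits.append("1")
            frac -= 1
        else:                      # integer part is 0: emit a 0 bit
            bits.append("0")
    return "".join(bits)


def float_to_fields(x):
    """Return (sign, exponent field, significand field) for x
    encoded as an IEEE 754 single precision number."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF


print(fraction_to_binary("0.75"))  # 11
sign, exp_field, significand = float_to_fields(-28.75)
print(sign, f"{exp_field:08b}", f"{significand:023b}")
```

For -28.75 the fields come out as sign 1, exponent field 131 (83h), and a significand that starts with 110011 followed by zeros, which agrees with the worked example. A fraction such as 0.1 never terminates in binary, which is why the sketch stops after max_bits digits.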



MSU ECE 3724 - Instruction Set extensions to X86
