UT EE 382V - Media Instructions, Coprocessors, and Hardware Accelerators

EE382 – System-on-Chip Design – Coprocessors, etc.
University of Texas at Austin

Media Instructions, Coprocessors, and Hardware Accelerators
Steven P. Smith
SoC Design, EE382V, Fall 2009

Overview
• SoCs offer tremendous potential for targeting applications to particular demands, such as performance, power, cost, etc.
• How do you take advantage of all those available transistors?
  • Multiple general-purpose processor cores
    • Most flexible, but typically sub-optimal for specific applications
  • Application-specific instruction set processors (ASIP)
    • MMX media instruction extensions to PCs extend this concept
  • Coprocessors
    • Use well-defined control interface to processor
  • Hardware accelerators
    • Typically custom, memory-mapped interface

Media Instructions: MMX
• Multimedia applications tend to perform repetitive operations on large quantities of 8- and 16-bit data
  • Filtering
  • Compression
  • Rendering
• Intel’s MMX™ technology is designed to speed up multimedia and communications applications.
• The technology includes special instructions and data types that allow such applications to achieve a new level of performance.

MMX Introduction
• Processors enabled with MMX technology deliver enough performance to execute compute-intensive communications and multimedia tasks on the standard PC platform.
• Commonly accelerated applications include graphics, image processing, MPEG video, music synthesis, speech compression, speech recognition, games, video conferencing and more.

Key Attributes of MMX Target Applications
• Small integer data types
• Small, highly repetitive loops
• Frequent multiplies and accumulates
• Compute-intensive algorithms
• Highly parallel operations

MMX Highlights
• Single Instruction, Multiple Data (SIMD) technique
• 57 instructions beyond base x86 instruction set
• Eight 64-bit wide MMX registers
• Four new data types

MMX SIMD
• Single Instruction, Multiple Data (SIMD)
  • This allows many pieces of information to be processed with a single instruction, providing parallelism that greatly increases performance.
  • Up to 8-way parallelism

MMX Data Types
• Packed Byte: Eight bytes packed into one 64-bit quantity
• Packed Word: Four 16-bit words packed into one 64-bit quantity
• Packed Doubleword: Two 32-bit doublewords packed into one 64-bit quantity
• Quadword: One 64-bit quantity

[Figure: MMX Data Types in 64-bit Registers]

MMX Instructions
• The MMX instructions cover several functional areas:
  • Basic arithmetic operations such as add, subtract, multiply, arithmetic shift and multiply-add
  • Comparison operations
  • Conversion instructions to convert between the new data types: pack data together, and unpack from small to larger data types
  • Logical operations such as AND, AND NOT, OR, and XOR
  • Shift operations
  • Data Transfer (MOV) instructions for MMX register-to-register transfers, or 64-bit and 32-bit load/store operations to memory
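
The packed word data type and the SIMD idea described above can be illustrated with a small sketch in portable C. It does not use MMX instructions or intrinsics; it only emulates, lane by lane, what a single packed add on four 16-bit words does. The function names (`pack_words`, `get_word`, `packed_add_words`) are hypothetical and chosen for this illustration, and wraparound arithmetic is assumed for each lane.

```c
#include <stdint.h>

/* Number of 16-bit lanes in one 64-bit packed-word quantity. */
enum { WORD_LANES = 4 };

/* Build a packed-word value from four 16-bit lanes (lane 0 is lowest). */
uint64_t pack_words(uint16_t w0, uint16_t w1, uint16_t w2, uint16_t w3) {
    return (uint64_t)w0 | ((uint64_t)w1 << 16) |
           ((uint64_t)w2 << 32) | ((uint64_t)w3 << 48);
}

/* Extract lane i (0 = least significant) from a packed-word value. */
uint16_t get_word(uint64_t packed, int i) {
    return (uint16_t)(packed >> (16 * i));
}

/* Lane-wise add with wraparound: each 16-bit lane is added on its own,
 * so a carry out of one lane never spills into the next lane. */
uint64_t packed_add_words(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    for (int i = 0; i < WORD_LANES; i++) {
        uint16_t sum = (uint16_t)(get_word(a, i) + get_word(b, i));
        result |= (uint64_t)sum << (16 * i);
    }
    return result;
}
```

The loop stands in for work that a SIMD instruction performs on all four lanes at once. Note the difference from an ordinary 64-bit add: adding 1 to a lane holding 0xFFFF yields 0 in that lane and leaves the neighboring lane untouched.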

[Figure: MMX Instruction Set Summary (two slides)]
[Figure: PADDW Instruction]
[Figure: PADDUSW Instruction]
[Figure: PMADDWD Instruction]
[Figure: PCMPGTW Instruction]
[Figure: PACKSS[DW] Instruction]
[Figure: MMX Applications: Chroma Keying]
[Figure: MMX Applications: Chroma Keying (pcmpeqw)]
[Figure: MMX Applications: Chroma Keying (pandn)]
[Figure: MMX Applications: Vector Dot Product (two slides)]
[Figure: MMX Applications: Matrix Multiplication]
[Figure: MMX Applications: Matrix Multiplication, instruction counts]
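
Since the PMADDWD and vector dot product slides survive here only as titles, the following is a rough sketch in portable C of how a multiply-add on packed words can drive a dot product. It emulates the lanes with ordinary arrays rather than using MMX instructions or intrinsics. The type and function names (`packed_words`, `packed_dwords`, `multiply_add_words`, `dot_product`) are hypothetical, and the 64-bit accumulator is a simplification chosen for this sketch, not something taken from the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* One 64-bit packed-word operand, modeled as four signed 16-bit lanes. */
typedef struct { int16_t w[4]; } packed_words;

/* One 64-bit packed-doubleword result, modeled as two signed 32-bit lanes. */
typedef struct { int32_t d[2]; } packed_dwords;

/* PMADDWD-style operation: multiply corresponding 16-bit lanes, then add
 * adjacent pairs of products into two 32-bit lanes. The sums are formed in
 * 64 bits and truncated to 32 bits to avoid signed overflow in C. */
packed_dwords multiply_add_words(packed_words a, packed_words b) {
    packed_dwords r;
    int64_t lo = (int64_t)a.w[0] * b.w[0] + (int64_t)a.w[1] * b.w[1];
    int64_t hi = (int64_t)a.w[2] * b.w[2] + (int64_t)a.w[3] * b.w[3];
    r.d[0] = (int32_t)(uint32_t)lo;
    r.d[1] = (int32_t)(uint32_t)hi;
    return r;
}

/* Dot product of two int16_t vectors, four elements per step. */
int64_t dot_product(const int16_t *x, const int16_t *y, size_t n) {
    int64_t total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        packed_words a, b;
        for (int k = 0; k < 4; k++) {
            a.w[k] = x[i + k];
            b.w[k] = y[i + k];
        }
        packed_dwords p = multiply_add_words(a, b);
        total += (int64_t)p.d[0] + p.d[1];
    }
    /* Handle any elements left over when n is not a multiple of four. */
    for (; i < n; i++) {
        total += (int32_t)x[i] * y[i];
    }
    return total;
}
```

Each call to `multiply_add_words` replaces four scalar multiplies and two adds, which is the kind of saving a multiply-add instruction offers for the "frequent multiplies and accumulates" workload named earlier in the deck.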

Coprocessors
• Integrated with processor control logic
• Tightly-Coupled Coprocessors
  • Task typically completes in a few cycles
  • Small amounts of data
  • Processor stalls waiting for the coprocessor
  • Communication with coprocessor typically via registers and dedicated control signals
    • “Coprocessor ports”
  • Examples: ARM (ARM7TDMI); Texas Instruments TMS320C55x processors

[Figure: Tightly-Coupled Coprocessors. Block diagram with labels: TMS320C55x, TCC, memory system, instruction decode, register file, TCC I/f, TCC instructions]

Coprocessors
• Loosely-Coupled Coprocessors
  • Used for larger tasks than is the case for tightly-coupled coprocessors
  • Task runs in parallel with main processor
  • May take many cycles per task
  • Large amounts of data that coprocessor may access independent of main

