HARDWARE / SOFTWARE PARTITIONING

Instructor: Dr. Yu Hen Hu
Submitted By: Devang Sachdev, Lizheng Zhang

ABSTRACT

Traditionally, when creating DSP systems, designers partition the hardware and software early in the process. Hardware and software engineers design their respective components in isolation, and communication between the two groups is minimal. There are several drawbacks to this approach. As a result, hardware-software codesign (HSC) has gained considerable momentum in industry and academia in the last decade. HSC integrates the principles of hardware and software design and provides structured methods and tools that focus on extensive modeling and simulation-based verification. Hardware-software partitioning is one of the important phases of codesign. In this project we evaluate the effect of various partitioning planes on a HW-SW codesign architecture consisting of a single SW processor, a HW coprocessor (FPGA), shared memory for HW-SW communication, and SW local memory, by mapping very basic DSP algorithms.
We implement an example application on the codesign platform with varying partitions between the SW and HW functions.

TABLE OF CONTENTS

1 Introduction
2 Basics of Hardware/Software Partitioning
3 Hardware / Software Architecture
4 Test Application – DCT Algorithm
5 Implementation
5.1 Design Issues
5.1.1 Fixed Point Arithmetic
5.1.2 Optimization for DCT operation for JPEG files
5.2 Algorithm Partitioning
5.2.1 All Software Implementation
5.2.2 Hardware Serial Multiply – Accumulate Implementation
5.2.3 Hardware Parallel MAC Implementation
5.2.4 All Hardware Implementation
6 Results
7 Conclusion
Acknowledgments
References
Appendix

LIST OF FIGURES

Figure 1: Fundamental Phases of Hardware-Software Codesign
Figure 2: HW/SW Architecture
Figure 3: 2D DCT using 1D DCT
Figure 4: Length of Cosine Coefficients
Figure 5: Datapath Width
Figure 6: DCT Algorithm Partitioning
Figure 7: MAC Unit
Figure 8: Interface DSP and 1-Serial MAC Unit
Figure 9: Parallel MAC Unit
Figure 10: Comparison of H/S Partitions

LIST OF TABLES

Table 1: Partitioning results

1 Introduction

Embedded systems for reactive (real-time) applications are implemented as mixed software-hardware systems, utilizing microprocessors, microcontrollers, digital signal processors, ASICs, and/or FPGAs. Generally, software is used for features and flexibility, while hardware is used to achieve the required performance or to save power. Reconfigurable macros and architectures are introduced for embedded systems to increase flexibility at a higher performance level than general-purpose processors offer. Typically, the HW/SW partitioning task is simply seen as mapping the defined functionality from, e.g., an executable specification onto given resources. However, when designing an embedded system for a heterogeneous implementation (i.e., hardware and software on embedded processors), more complex activities have to be undertaken, as given in the following list:

- System behavioral description – giving the executable specification of what the system is supposed to do.
- Hardware/software partitioning – deciding which parts of the system behavior should be realized by which parts of the hardware architecture.
- Task scheduling – controlling how the different computational resources of the hardware architecture are shared between the tasks of the behavioral description.
- Hardware architecture selection – describing what hardware components should be used and how they are connected.

The main purpose of the present project is to describe the hardware/software partitioning of the