CS244 - Introduction to Embedded Systems and Ubiquitous Computing
Instructor: Eli Bozorgzadeh
Computer Science Department
UC Irvine
Winter 2010

CS244 – Lecture 5
Hardware/Software Co-design

Review: Design Objectives
[Figure: performance, cost, and quality, each shown with a threshold and a "better" direction]
- Improving quality beyond the threshold is desired
- Improving performance beyond the threshold is a waste
- Improving cost is desired

Co-design Flow
[Flow diagram: Informal Specification, Algorithmic Design, System Model, System Simulation, Hardware/Software Partitioning, Partitioned Model, Schedule, Partitioned Model & Sch., HW/SW Co-simulation, with Refine feedback]

Co-design Flow
[Flow diagram, continued: Partitioned Model + Sch., Communication Synthesis, Software Model, Hardware Model, HW/SW Co-simulation, Compilation, Synthesis, Binary Exec. Model, Gate-level Model, HW/SW Co-simulation, with Refine feedback]

Co-design Flow
[Flow diagram, continued: Gate-level Model, Binary Exec. Model, Emulate or Prototype, Fabrication, with Refine feedback]

Informal Specification & System Level Model
- The informal specification loosely defines the high-level behavior, constraints, and optimization objectives of the system
  - Algorithmic and implementation details absent
  - Performance estimates not present
- The system level model formally captures behavior, constraints, and optimization objectives
  - Can be simulated to obtain early performance estimates
  - Feedback to refine the system specification
  - Can serve as a golden model for validation of intermediate or final stages
  - Algorithmic design

Hardware Software Partitioning
- Decompose (i.e., partition) the function F of the system into N sub-functions F1, F2, F3, …, FN
- Decompose the constraints and design objectives of the system into sub-constraints and design sub-objectives
- Cluster F1, F2, F3, …, FN into M partitions to run on M processors
[Figure: F = {F1, F2, F3, …, FN} mapped onto processors P1, P2, P3, …, PM]

Scheduling
- Scheduling obtains an execution sequence such that dependencies are obeyed
- Static: the schedule is fixed at design time (the common case)
- Dynamic: the schedule is determined at execution time (reconfigurable computing)
[Figure: dependency graph of F1 … F8]
P1: F1 → F2 → F8
P2: F4 → F5
P3: F3 → F6
P4: F7

Scheduling
- A deadline D for the entire schedule
- An execution time Ti for each Fi
- ASAP (as soon as possible)
- ALAP (as late as possible)
[Figure: dependency graph of F1 … F8 with node labels 3, 3, 4, 3, 3, 1, 2, 6]
P1: F1 → F2 → F8
P2: F4 → F5
P3: F3 → F6
P4: F7

Functional Co-simulation
- Some of the M processors are single-purpose (e.g., those with a single function mapped onto them); others are general-purpose
- Functions mapped onto the general-purpose processors are implemented in software and simulated on virtual machines with performance models
- Functions mapped onto the single-purpose processors are simulated at the behavioral level with performance models
- Communication is done via abstract channels
- Feedback is used to refine the partitioning and scheduling tasks

Communication Synthesis & Bus-accurate Co-simulation
- Abstract channels A1, A2, …, An are mapped onto a set of communication channels C1, C2, …, Cm
  - Similar to functional partitioning
  - Similar to hardware/software scheduling
- Channels correspond to physical artifacts of the architecture
- Hardware and software models are annotated with detailed communication constructs
- A hardware model and a software model are obtained and co-simulated
- Communication synthesis (or possibly higher levels of design) is refined

Compilation & Synthesis & Cycle-accurate Co-simulation
- Compiler used to generate binary executables for general-purpose processors
- Synthesis used to generate gate-level models of single-purpose processors
- Synthesis used to generate gate-level models of general-purpose processors
- Cycle-accurate co-simulation of the entire system
- Note: mixed-level co-simulation is common

Emulate/Prototype and Fabrication
- Use hardware (e.g., FPGAs) to emulate a system as fast as possible (relative to real-time)
- Fabrication
  - Place & route
  - Mask design
  - Chip testing
  - Manufacturing fault models
  - Test vector generation
  - Packaging

Partitioning (Clustering)
- Given: F = {F1, F2, F3, …, FN} and P = {P1, P2, P3, …, PM}
- Find a lowest-cost partition (cluster), as computed by an objective function
- Exhaustive approach: O(M^N)
- Heuristics
  - Constructive partitioning (based on closeness function)
    - Random (good for seeding iterative approaches)
    - Cluster growth
    - Hierarchical clustering
  - Iterative partitioning
    - Start with a partition and improve
    - Gradient search
    - Controlled random search
  - Modified Kernighan/Lin and FM algorithm
    - Partitions a set of nodes (functions) into two bins (processors)
    - Minimize edges between bins (communication cost, wires, etc.)
    - Cost function for moving a node from one partition to another
  - ILP
  - Genetic evolution
  - Simulated annealing

Hierarchical Clustering – Example
[Figure]

Clustering w/ several criteria
[Figure]

Partitioning (Clustering)
- Given: F = {F1, F2, F3, …, FN} and P = {P1, P2, P3, …, PM}
- Find a lowest-cost partition (cluster), as computed by an objective function
- Exhaustive approach: O(M^N)
- Heuristics
  - Constructive partitioning (based on closeness function)
    - Random (good for seeding iterative approaches)
    - Cluster growth
    - Hierarchical clustering
  - Iterative partitioning
    - Start with a partition and improve
    - Gradient search
    - Controlled random search
  - Modified Kernighan/Lin algorithm
    - Partitions a set of nodes (functions) into two bins (processors)
    - Minimize edges between bins (communication cost, wires, etc.)
    - Cost function for moving a node from one partition to another
  - ILP
  - Genetic evolution
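
Example: cost of moving a node between two bins
The "cost function for moving a node from one partition to another" can be illustrated with a small sketch. This is a simplified view, not the full Kernighan/Lin or FM procedure: it computes only the cut cost and the gain of a single move, and it ignores balance constraints and node locking. The graph `EDGES`, its weights, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical communication graph: each edge weight models the
# communication cost between two functions (illustrative values only).
EDGES = {
    ("F1", "F2"): 3,
    ("F1", "F3"): 1,
    ("F2", "F4"): 2,
    ("F3", "F4"): 4,
}


def cut_cost(edges, bin_of):
    """Total weight of edges whose endpoints lie in different bins."""
    return sum(w for (u, v), w in edges.items() if bin_of[u] != bin_of[v])


def move_gain(edges, bin_of, node):
    """Reduction in cut cost if `node` moves to the other bin.

    Gain = (weight to nodes in the other bin) - (weight to nodes in its own bin).
    """
    external = internal = 0
    for (u, v), w in edges.items():
        if node not in (u, v):
            continue
        other = v if u == node else u
        if bin_of[other] == bin_of[node]:
            internal += w
        else:
            external += w
    return external - internal


def best_single_move(edges, bin_of):
    """Return (node, gain) for the single move with the highest gain."""
    candidates = [(n, move_gain(edges, bin_of, n)) for n in sorted(bin_of)]
    return max(candidates, key=lambda pair: pair[1])
```

Starting from `{"F1": 0, "F2": 1, "F3": 0, "F4": 1}`, the cut cost is 7 and the best single move is F3 with gain 3, which lowers the cut cost to 4. An iterative partitioner would repeat such moves, usually with a balance constraint, until no move improves the cost.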
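
Example: exhaustive partitioning
The exhaustive approach listed on the Partitioning (Clustering) slide tries every assignment of N functions to M processors, which is why it grows as M^N. A minimal sketch follows, assuming a hypothetical objective function that adds the communication cost of cut edges to a penalty for overloading a processor. The names `COMM`, `CAPACITY`, and `example_cost` and all values are illustrative assumptions, not taken from the lecture.

```python
from itertools import product

# Hypothetical inputs: communication weights between functions and a
# per-processor capacity (in number of functions). Illustrative only.
COMM = {("F1", "F2"): 5, ("F2", "F3"): 1}
CAPACITY = 2


def example_cost(assignment):
    """Assumed objective: cut communication cost plus an overload penalty."""
    cut = sum(w for (u, v), w in COMM.items() if assignment[u] != assignment[v])
    loads = {}
    for proc in assignment.values():
        loads[proc] = loads.get(proc, 0) + 1
    overload = sum(max(0, load - CAPACITY) for load in loads.values())
    return cut + 10 * overload


def exhaustive_partition(functions, processors, cost):
    """Evaluate all M**N assignments and return the cheapest (assignment, cost)."""
    best, best_cost = None, float("inf")
    for choice in product(processors, repeat=len(functions)):
        assignment = dict(zip(functions, choice))
        c = cost(assignment)
        if c < best_cost:  # strict comparison keeps the first minimum found
            best, best_cost = assignment, c
    return best, best_cost
```

With three functions and two processors this evaluates 2^3 = 8 assignments. The same loop becomes impractical quickly as N grows, which motivates the heuristics listed on the slide.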
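
Example: ASAP and ALAP start times
Returning to the Scheduling slides, ASAP and ALAP start times can be computed from the dependency graph, the execution times Ti, and the deadline D. The sketch below assumes unlimited processors and uses a small hypothetical graph; the lecture's own F1 … F8 figure is not reproduced, so `DEPS`, `TIME`, and the deadline value are illustrative assumptions.

```python
# Hypothetical dependency graph (node -> list of predecessors) and
# execution times. These are illustrative, not the lecture's figure.
DEPS = {"F1": [], "F2": ["F1"], "F3": ["F1"], "F4": ["F2", "F3"]}
TIME = {"F1": 3, "F2": 3, "F3": 4, "F4": 2}


def topo_order(deps):
    """Order nodes so that each node appears after all of its predecessors."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for pred in deps[node]:
            visit(pred)
        order.append(node)

    for node in deps:
        visit(node)
    return order


def asap(deps, time):
    """Earliest start time of each node: when its last predecessor finishes."""
    start = {}
    for node in topo_order(deps):
        start[node] = max((start[p] + time[p] for p in deps[node]), default=0)
    return start


def alap(deps, time, deadline):
    """Latest start time of each node that still meets the deadline."""
    succs = {node: [] for node in deps}
    for node, preds in deps.items():
        for pred in preds:
            succs[pred].append(node)
    start = {}
    for node in reversed(topo_order(deps)):
        latest_finish = min((start[s] for s in succs[node]), default=deadline)
        start[node] = latest_finish - time[node]
    return start
```

For this graph with D = 10, ASAP gives start times F1 = 0, F2 = 3, F3 = 3, F4 = 7, and ALAP gives F1 = 1, F2 = 5, F3 = 4, F4 = 8. The difference between the two values for a function is its slack, which a scheduler can use when sharing processors.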