PAP: Power Aware Partitioning of Reconfigurable SystemsOutlineIntroductionObjectivePowerPoint PresentationRelated WorkContributionsPAP Algorithm OverviewTask MobilityTask Mobility Contd.Task Selection RoutineDefinition: Time Valid SchedulePower Valid (Definitions)Communication ModelScheduling the Bus communicationSlide 16Example of PAP algorithmExample contd.Partitioning of Multifunctional SystemsApplication CriticalityModified Task Selection RoutineSelf-Priority: ComputationSelf-Priority Contd.Shared-Priority ComputationMPAP AlgorithmMPAP contd.MPAP Contd.MPAP: ComplexityCase StudiesExperiment1: PAP Vs Extensive SearchTable1: Results from the PAP algorithm and the extensive searchExperiment 1: ResultsExperiment2: MPAP(Self) Vs MPAP(Combined)Table2: Total Hardware Area for the MPAP(self) and MPAP(combined) algorithms when applied to the 16-QAM Modem and DTMF CodecBenefits of PAP/MPAP in RC EnvironmentSummaryFuture WorkQuestions ?Thank YouPAP: Power Aware Partitioning of Reconfigurable SystemsVijay R. P. KappagantulaRabi MahapatraTexas A&M UniversityCollege Station, TX 77843SSRS - Feb 08 2003 2OutlineIntroductionRelated WorkPAP: Power Aware Partitioning MPAP: PAP for multifunctional systemsExperimentsSummarySSRS - Feb 08 2003 3IntroductionHW/SW Codesign: Key IssuesPartitioning Synthesis Co-simulation Partitioning problem : Non-trivialApplication - 100 tasks , 3 different HW/SW implementations (2*3)^100! possible partitioning solutionsSSRS - Feb 08 2003 4Objective Given (Inputs) Application(s) descriptions (system level)Target Architecture (CPU, FPGA, Pmax, Ahtotal)Task’s metrics ( Ps, Ts, Ph, Th, Ah )Determine suitable partitioning framework that will map and schedule the application(s) on target architecture so as to meetThe Deadline & Power ConstraintsSSRS - Feb 08 2003 5Partitioning CPU StrongArm-1100(Software)FPGA Xilinx XCV4000(Hardware)System DescriptionSystem ArchitectureMapping & SchedulingMemory PCI System ComponentsSSRS - Feb 08 2003 6Related WorkHeuristic Based Asawaree Kalavade and P.A. Subramanyam 1998“Global Criticality/Local Phase (GCLP) Heuristic” System Power not consideredIterative improvement techniques Huiqun Liu and D.F. Wong 1998 “Integrated Partitioning & Scheduling (IPS) algorithm”Uniform SW and negligible HW execution timesNo power considerationPower-Aware SchedulingJ. Liu, P.H. Chou, N. Bagherzadeh and F. Kurdahi 2001“Power-Aware Scheduling using timing Constraints”Use initial schedule assumption – may be inflexibleSSRS - Feb 08 2003 7 ContributionsConsidered power as important constraint during partitioning step, (in hybrid systems)Concurrent Mapping and Scheduling of tasks with non-uniform execution times – for Real-Time Applications,Used Reconfigurable systems for performance tuning through task migrationSSRS - Feb 08 2003 8PAP Algorithm OverviewIterative improvement technique. Initial mapping: All SoftwareEvery iteration, one software task is selected for hardware mappingTasks mobility indicesTask Selection Routine Reschedule the tasksSchedule is verified to see if it meets its timing and power requirements.SSRS - Feb 08 2003 9Task MobilityParallelism Schedule DependentTime Interval (Ei,Li) defined by mobility is used to schedule task i in hardware Ei is the earliest possible start time in HWEi = max ( (k) ) k pred(i)pred(i) is the immediate predecessor set of task i(k) : start time of task kSSRS - Feb 08 2003 10Task Mobility Contd.Li is the latest possible finish time of task i in HWLi = min ( (k) – tsi )k succ(i)succ(i) is the immediate successor set of task itsi is the execution time of task i in SWTask Mobility of task i (i) is determined as follows:(i) = 1, Li > Ei 0, Li = EiSSRS - Feb 08 2003 11Task Selection RoutineNs: Set of software tasks in applicationS.1 Rank the tasks in Ns in the order of decreasing software execution times tsiS.2 Compute the mobility (i) for all i NsS.3 If (i) = 0 for all i NsTask i with maximum execution time tsi is selected Else Task i Ns with maximum execution time tsi and non-zero mobility is selectedSSRS - Feb 08 2003 12Definition: Time Valid ScheduleTexec: The finish time of a single iteration of the application Texec = max ( (i) + ti ), for all i N N is the set of tasks in the applicationSchedule: Time-Valid If Texec D, D is the application deadlineSSRS - Feb 08 2003 13Power Valid (Definitions)Power Profile (P ) P (t) = P(i), for all i set of active tasks at time instant tPower SpikeP (t) > PmaxPower-ValidP (t) Pmax , 0 t TexecSSRS - Feb 08 2003 14Communication Model32 bit 33 MHz PCI Delay Computation P.V. Knudsen and Jan Madsen, 1998.tcomm = Power DissipationJ.Buck, S. Ha, E.A. Lee, and D.G. Messerschmit, April 1994. Pbus = FNNCCACbussample*mnVCbus221SSRS - Feb 08 2003 15Scheduling the Bus communicationNo bus conflict is assumed.The execution of the hardware task and its communications should lie within the interval defined by its mobility.SSRS - Feb 08 2003 16Is Texec DSelect a new task using Task Selection Routine for hardware mappingTest schedulability. Compute Texec, finish time of one iterationInput Specification: Task graph (TG) deadline ‘D’, Pmax and Ahtotal(All tasks mapped to SW) Software and hardware task's metrics.End of PAP algorithmIs (Ah Ahtotal )Compute the Power Profile (P) of the schedule and the total hardware used (Ah)yesInvalidate for all future cyclesIs (P Pmax )Invalidate for the next cyclenoyesyesnonoPAP ALGORITHMSSRS - Feb 08 2003 1703451267P(t)DPmaxa. Initial schedule on CPU (all software)02715346Application specified as a task graphExample of PAP algorithmSSRS - Feb 08 2003 18Example contd.1Power Spike345620tP(t)2 3 5 4 3c. Schedule during iteration2 (Time-valid, Power-invalid)34560tP(t)2 3 5 4 3d. Schedule after iteration2 (Time-valid, Power-valid)12No Power Spike03 2456tP(t)b. Schedule after iteration1 2 3 6 5 4 31DPmaxSSRS - Feb 08 2003 19Partitioning of Multifunctional SystemsMultifunctional systems- Support a set of applications.Set of active applications - Combined task graph (CTG).PAP extended to include
View Full Document