MSU ECE 4743 - Computer Aided Digital Systems

Unformatted text preview:

Slide Number 1Digital System DesignTiming Constraints Resource EstimationFIR FilterDataflow GraphOperations in a Dataflow graphWhat is minimum no. of clock cycles needed? Datapath Design MethodologyResource EstimationSchedulingScheduling Failed!Scheduling (2nd try)Register AllocationRegister Scheduling (Clock 1)Register Scheduling (Clock #2)Register Scheduling (Clock 3 and Clock 4)Final RequirementsDatapath Execution Units Inputs Datapath CommentsSchedulingScheduling Still FailedTry again with Sample Period = 5Flowgraph for Matrix MultiplyFlowgraph for Matrix MultiplyComments on MM FlowgraphParallel Datapaths for MMSlide Number 29LatencyThroughputInitiation RateDefining Initiation Rate, LatencyReducing Init RateReducing Init RateDepartment of Electrical and Computer EngineeringMississippi State UniversitySherif Abdelwahed Scheduling Computer Aided Digital Systems Design - EE 4743/6743Digital System Design At this point, we are trying to do complex datapaths + complex control  Digital system designers are typically faced with the problems of :¾ Constraints: minimum clock frequency, maximum number of clock cycles, target device, resource limits (don’t have an infinite number of logic cells available)¾ Execution unit architecture and number : fast adder? Slow adder? Pipelined or non-pipelined multiplier? SRAM versus registers? How many do I need based on constraints?¾ Scheduling: what happens during what clock cycle?Timing Constraints  Two main constraints can be placed on a digital system design: clock period, and clock cycle constraints  A clock period constraint will define the minimum clock frequency.¾ Will affect the architecture of your execution units (fast adder versus slow adder, pipelined execution unit versus non-pipelined execution unit) A clock cycle constraint limits the available number of clock cycles to perform operation Total computation time = clock period * clock cycles Other constraints: power, device type, input/output connectionsResource Estimation Given constraints, would like a lower bound estimate on the number of resources needed Resource types: Registers, Execution units (adders, multipliers, etc) Lets do resource estimation for the equation below:Y = a0 * X + a1 * X@1 + a2 * X@2 + a3 * X@3FIR FilterX YFIR FilterThe equation above is an equation for a 4-Tap Finite Impulse Response (FIR) digital filter. Each sample perioda new value for X is input to the system. A sample period is measured in clock cycles, and the number of clock cycles per sample period will be an external constraint. X is the value for current sample period.X@1 is the value for one sample period back.X@2 is the value for two sample periods back.X@3 is the value for three sample periods back.a0, a1, a2, a3 are the filter coefficients. Changing these coefficients change the filter function; assumed to be preloaded.Y = a0 * X + a1 * X@1 + a2 * X@2 + a3 * X@3Dataflow Graph We need a method of visualizing the data dependencies and operations to be performed. One method of doing this is the dataflow graph.X*a0*X@1 a1*X@2a2*X@3 a3+++YOperations in a Dataflow graphXYAn output operation. Outputs are not assumed to be registered because they will be registered by the datapath they are being passed to. As such, they don’t cost a clock cycle (its cycle cost is in the next datapath).+An execution unit operation. Based on clock period constraints, execution units can be chained (a multiplier output directly feeding an adder input without an intervening register) or non-chained (all inputs/outputs of execution units are registered).An input operation. Inputs are assumed registered. An input operation will take 1 clock cycle.What is minimum no. of clock cycles needed? Assume that clock period constraint does not allow execution unit chaining (registers are between execution units). Minimum number of clock cycles will be longest path through the datapath.X*a0*X@1 a1*X@2 a2*X@3 a3+++YLongest path is 4 clock cycles. Minimum sample period is 4 clocks. N1N2N3 N4N5N6N7N8Datapath Design MethodologySet target timing constraintsCompute a lower bound on the resources needed for the target constraintsAttempt scheduling Increase the resource(s) that has caused scheduling to failRelax the constraints Scheduling Fail?Register scheduling Implement the datapathNoYesResource EstimationGiven a clock cycle constraint (sample period), can estimate minimum number of needed resources.Assume the minimum sample period of 4 clocks.Minimum resource estimation is:# operations / # of clocksMinimum Resource estimation:# multipliers = # multiplies/ # clocks = 4/4 = 1# adders = # additions/ #clocks = 3 /4 = 1Minimum resource estimation is 1 multiplier, 1 adder. Register estimation is tougher. Need to store X@1, X@2, X@3 + four coefficients. Need at least 7 registers.Scheduling Scheduling is mapping operations onto execution units. Use a scheduling table which lists clock cycles versus resources. Lets first just worry about execution units, and not about registers for now.Cycle Start Adder Multiplier IO #1 Idle x@3*a3 (N5) Input x #2 Idle x@2*a2 (N4) #3 N7 op (N5+N4) x@1*a1 (N3) #4 Idle x*a0 (N2) Utilization 25% 100% 25%Scheduling Failed! The scheduling failed! We were not able to schedule the adder operations represented by nodes N6 and N8. The minimum resource estimation is a lower bound; may not find a schedule to fit it. If scheduling fails, the two options:a. Increase resources, keep same # of clockb. Increase # of clocks, keep same number of resources We want a minimum sample period, so do option (a). The bottleneck is the multiplier. Lets add another multiplier.Scheduling (2ndtry)Cycle Start Adder Multiplier 1 Multiplier 2 IO #1 Idle x@3*a3 (N5) x@2*a2 (N4) Input X #2 N7 op (N5+N4) x@1*a1 (N3) x*a0 (N2) #3 N6 op (N3+N2) Idle Idle#4 N8 op (N7+N6) Idle IdleUtilization 75% 50% 50% 25%Scheduling succeedsRegister Allocation At this point, need to allocate registers to save temporary results. At beginning of operation, we know that we need to have the values a0,a1,a2,a3, x@3,x@2,x@1 stored. So we need at least 7 registers.  The registers holding a0-a3 will not change value during the computation, so we will not consider them in our scheduling. Assume at Start: RA = x@3, RB=x@2, RC=x@1Register Scheduling (Clock 1)Clock 1:Input X??? Where to put this? Add another register RD Input X: RD ← Xx@3*a3


View Full Document

MSU ECE 4743 - Computer Aided Digital Systems

Download Computer Aided Digital Systems
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Computer Aided Digital Systems and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Computer Aided Digital Systems 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?