HYBRIDS BUD ON EMBEDDED LANDSCAPE
Meanwhile, ARM Allies Plot World Domination

By Joseph Byrne, January 9, 2012

A decade from now, 2011 will be remembered for one event in the world of high-speed embedded processors: a small, slow-growing supplier of PowerPC chips staked its future on high-performance multicore ARM processors. Looking back from 2021, the bet will have paid off: this supplier will have transcended niche status, and other vendors will have similarly embraced the ARM architecture. That's the hope of AppliedMicro and ARM, at least. In November, AppliedMicro uncloaked its ambitious plan to design custom 64-bit ARM-compatible CPUs, to develop multicore processors based thereon, and to aim these processors at both server and communications markets.

The perspective at the end of 2011 is different. The big development in 2011 was the sampling of two new processor hybrids. Xilinx and Altera crossbred CPUs and FPGAs to reduce system cost in existing two-chip designs and to enable development of new designs where a single device is the only practical solution. Freescale and numerous other companies crossbred CPUs, DSPs, and accelerator engines to craft a new class of chip for small LTE base stations. Intended to ship in high volume and at low cost, an integrated device is a boon for these cellular systems. In parallel, embedded-processor vendors unveiled their 28nm roadmaps, and Broadcom struck a $3.7 billion deal to acquire NetLogic, Broadcom's biggest deal since its 2000 acquisition of SiByte for $2 billion.

FPGAs to CPUs: You Will Be Assimilated

Apart from higher-profile developments around the ARM architecture, ARM emerged as the chosen processor for the new CPU-FPGA hybrids from Xilinx and Altera. Both FPGA suppliers disclosed details of their forthcoming devices that integrate ARM Cortex-A9 CPUs and peripherals. Neither company had previously achieved lasting success in adding hard CPUs to their FPGAs, but both promise that this time is different. Indeed, the technologies of the new chips are
different. These hybrids can boot their CPUs before loading the FPGA configuration, enabling them to function more like embedded processors and less like an FPGA with a peripheral CPU. Running at 800MHz, the ARM CPUs are speedy enough for mainstream embedded systems. Embedded designs that today would pair a processor like a Freescale QorIQ P1020 with a Xilinx Spartan or Altera Cyclone FPGA are strong candidates for these new hybrids.

The hybrid approach enables the FPGA company to capture additional value (a fancy way of saying they can charge more) by cutting the processor supplier out of the picture. Merging two chips into one reduces the board area, system power, and bill-of-materials cost of existing designs and enables new systems where power and performance requirements preclude a two-chip solution.

The Xilinx and Altera lines differ slightly, as Figure 1 shows. The four-member Xilinx family, dubbed Zynq, has devices with 28,000 to 350,000 logic cells and sampled in 2011. All have twin CPUs and an analog-to-digital converter. The two high-end FPGAs include 12.5Gbps serdes. The two Zynqs with the fewest FPGA gates have no serdes and hence no PCI Express (PCIe) connectivity, a serious omission given the ubiquity of PCIe in embedded processing. A 2.5Gbps or 5Gbps serdes would incur incremental cost but broaden the usefulness of these chips.

Altera did not coin a new brand name for its chips, instead extending its FPGA Cyclone (low-density) and Arria (midrange) brands to include its new hybrids. The company calls these hybrids SoC FPGAs, the most mellifluous moniker since PCMCIA. The Altera lineup is broader, extending from 25,000 to 462,000 logic elements. (A Xilinx logic cell and an Altera logic element represent approximately the same capacity.) Altera, however, will not sample its CPU-FPGA combinations until 2H12, giving Xilinx a year to win the first wave of designs and broaden its offerings.

On balance, Altera's chips have better high-speed I/O options
than Zynq. The lowest-density Cyclone device includes no serdes, but the other Cyclones integrate 5Gbps transceivers. These serdes can support PCIe Gen2 at lower manufacturing costs compared with 10Gbps transceivers. The denser Arria models have both 6Gbps and 10Gbps serdes.

DSP capability, indicated by the size of the bubbles in Figure 1, varies proportionally with gate count. The Zynq devices have between 80 and 900 DSP slices. Each configurable slice includes a 25x18-bit multiplier, an accumulator, ALU, pre-adders, and other functions. Altera's devices provide 36 to 1,068 variable-precision DSP blocks, and each block can perform a single 27x27-bit multiply, a pair of 18x19-bit multiplies, or three 9x9-bit multiplies. Most pronounced at the lower densities, Xilinx offers more DSP units, but the difference narrows if a designer can use Altera's reduced-precision modes. In either case, these units can implement video encoders, filters, and other signal-processing functions. For the automotive market, for example, Xilinx has diagrammed Zynq-based systems that analyze video of the road and provide lane-departure warnings.

Figure 1. Xilinx Zynq versus Altera SoC FPGA. Bubble size indicates DSP units. (Source: vendors)

Roadmaps to 28nm in 2012

These processor-FPGA hybrids are among the first 28nm chips targeting embedded designs. Freescale and NetLogic (soon to be acquired by Broadcom) also disclosed their 28nm roadmaps in 2011. Incorporating a new CPU design, Freescale's upcoming processors significantly advance the QorIQ line. At about the same time, NetLogic's new XLP II family will supplant many of the first-generation 40nm XLP processors, and it extends the company's top end to much higher performance.

Freescale has publicly divulged a few details about its 28nm QorIQ Amp processors but has held others close. The most important new ingredient is the 64-bit Power e6500 CPU. Departing from its predecessor, the e500/e5500, the new CPU is a fused-core, dual-thread implementation like AMD's Bulldozer (see MPR 8/30/10, "AMD Bulldozer Plows New Ground"). The two threads share the front end of the relatively short pipeline and the AltiVec SIMD unit, but they have independent integer execution units. Freescale claims a 70% speedup over a single-thread implementation while incurring only a 30% area penalty. Better branch prediction and higher clock rates improve performance compared with the e500. Applications that use AltiVec, which has not appeared in a Freescale CPU since the e600, last used in the MPC8641 and MPC7448 (see MPR 7/5/05

