SPHERE DECODING FOR MULTIPROCESSOR ARCHITECTURES

Q. Qi, C. Chakrabarti
Arizona State University
Department of Electrical Engineering
{qi,chaitali}@asu.edu

ABSTRACT

Motivated by the need for high-throughput sphere decoding for multiple-input multiple-output (MIMO) communication systems, we propose a parallel depth-first sphere decoding (PDSD) algorithm that provides the advantages of both parallel processing and rapid search space reduction. The PDSD algorithm is designed for efficient implementation on programmable multiprocessor platforms. We investigate the trade-off between throughput and computation overhead when the number of processing elements is 2, 4 and 8, for a 4×4 16-QAM system across a wide range of SNR conditions. Through simulation, we show that PDSD can offer significant throughput improvement without incurring substantial computation overhead, provided the number of processing elements is selected according to the specific SNR conditions.

Index Terms: Sphere Decoding, Architecture, Multiprocessor

1. INTRODUCTION

Wireless devices have become ubiquitous in recent years. The increasing demand for robust, high-throughput mobile systems has spearheaded the development of multiple-input multiple-output (MIMO) communication systems. A MIMO antenna array [1][2] coupled with orthogonal frequency division multiplexing (OFDM) has become the de facto choice for designing communication standards with high bandwidth capacity and spectral efficiency, such as IEEE 802.16e and 802.11n.

The performance gain of a MIMO system comes at the cost of increased design complexity. The signal detector is one of the most important modules in a MIMO system. Maximum-likelihood (ML) detectors are impractical for high data rate MIMO systems, since their complexity increases exponentially with signal dimension.
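
To make the exponential growth concrete, the following is a minimal sketch of exhaustive ML detection, not the authors' implementation. It assumes the linear model y = Hs + n defined in Section 2 and a square QAM constellation with unnormalized odd-integer coordinates. The function names (`qam_constellation`, `ml_candidate_count`, `ml_detect`) are hypothetical and chosen only for illustration.

```python
import itertools


def qam_constellation(m):
    """Return the points of a square M-QAM constellation.

    Assumption: unnormalized odd-integer coordinates (e.g. -3, -1, 1, 3
    for 16-QAM); the paper does not specify a normalization here.
    """
    side = int(round(m ** 0.5))
    if side * side != m:
        raise ValueError("M must be a perfect square for square QAM")
    levels = [2 * k - (side - 1) for k in range(side)]
    return [complex(re, im) for re in levels for im in levels]


def ml_candidate_count(m, num_tx):
    """Number of symbol vectors an exhaustive ML search must test."""
    return m ** num_tx


def ml_detect(h, y, constellation):
    """Exhaustive ML detection: minimize ||y - H s||^2 over all s.

    h is a list of rows (M_R x M_T), y is a list of M_R received symbols.
    """
    num_tx = len(h[0])
    best, best_metric = None, float("inf")
    for s in itertools.product(constellation, repeat=num_tx):
        metric = 0.0
        for row, y_i in zip(h, y):
            residual = y_i - sum(h_ij * s_j for h_ij, s_j in zip(row, s))
            metric += abs(residual) ** 2
        if metric < best_metric:
            best, best_metric = s, metric
    return best, best_metric


if __name__ == "__main__":
    # A 4x4 16-QAM system requires 16**4 candidate vectors per detection.
    print(ml_candidate_count(16, 4))  # 65536
```

Every additional transmit antenna multiplies the number of candidates by M, which is why the exhaustive loop in `ml_detect` is only practical for very small systems.
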
Active research on low-complexity, near-ML MIMO detectors has generated several solutions, including zero-forcing equalization (ZF) [3], nulling and canceling (NC) [1], semidefinite relaxation (SR) [4][5] and sphere decoding (SD) [6]. Of these approaches, the SD algorithm is the most promising; it offers low complexity and good bit-error-rate (BER) performance under a variety of signal-to-noise ratio (SNR) and constellation conditions [7].

With the emergence of wireless networks for different bandwidth and mobility scenarios, implementing the physical layer on a programmable hardware platform, such as software defined radio (SDR), is very attractive due to lower development cost, ease of verification and application versatility [8]. Current VLSI implementations of SD detectors, such as [9][10], are ASIC designs that increase the throughput but are limited to specific communication protocol configurations and setups. In this paper, we present a design study for a parallel depth-first SD detector on the multiprocessor SDR platform shown in Figure 1. The main contributions of this paper are listed below.

[Fig. 1. Multi-core Software Radio Architecture. Blocks shown: a master controller, a global memory, and several processing elements (PEs), each containing a computation unit and a local memory.]

1. We introduce a parallel depth-first sphere decoding (PDSD) algorithm, and investigate its performance under different SNR and parallelization parameters.
2. We map the PDSD algorithm onto a programmable hardware platform with multiple processing elements (PEs) to provide faster decoding speed.
3. We investigate the performance of PDSD with 2, 4 and 8 PEs for SNR ranging from 8 dB to 22 dB. We found that for a 4×4 16-QAM PDSD system with SNR ≤ 16 dB, a 4-PE architecture provides a nearly 3X throughput increase, while incurring only a small computation overhead.

This research was supported by NSF CSR-EHS 0615135.

This paper is organized as follows.
We briefly describe a MIMO system in Section 2, followed by a review of the sphere decoding algorithm and existing VLSI implementations in Section 3. Section 4 presents the PDSD algorithm and proposes a multiprocessor architecture implementation. Section 5 provides algorithm simulation results. The conclusion is given in Section 6.

2. PRELIMINARIES

A MIMO system with spatial multiplexing signaling is shown in Figure 2. It consists of $M_T$ transmit and $M_R$ receive antennas. Let $\mathbf{y}$ be the $M_R \times 1$ vector of received symbols, given by

$$\mathbf{y} = \mathbf{H}\mathbf{s} + \mathbf{n} \qquad (1)$$

where $\mathbf{H}$ is an $M_R \times M_T$ complex channel matrix with $h_{ij}$ representing the complex transfer function from the $j$th transmit antenna to the $i$th receive antenna, $\mathbf{n} = [n_1, n_2, \ldots, n_{M_R}]^T$ is an $M_R \times 1$ noise vector, and $\mathbf{s} = [s_1, s_2, \ldots, s_{M_T}]^T$ is an $M_T \times 1$ vector of transmitted symbols. Each $h_{ij}$ in $\mathbf{H}$ is an independent and identically distributed (i.i.d.) complex Gaussian variable with zero mean and variance 0.5. Each $n_i$ in $\mathbf{n}$ is also an i.i.d. complex Gaussian variable with zero mean and variance $\sigma^2$ ($\sigma^2$ is calculated according to the receive SNR). Let $\mathbf{x}$ denote the source data bits. A transmitted symbol $s_i$ is obtained by mapping every $M_c = \log_2(M)$ bits of $\mathbf{x}$ onto a complex constellation of size $M$.

[Fig. 2. Block diagram of a MIMO communication system. Blocks shown: information source, channel encoder, modulation and mapping, channel $\mathbf{H}$ with $M_T$ transmit and $M_R$ receive antennas, demodulation and MIMO sphere decoder, channel decoder, information sink.]

Throughout this paper, we assume $M_R \geq M_T$ and that the channel matrix $\mathbf{H}$ is perfectly known to the receiver through training sequence estimation. The entries of the transmitted symbol vector $\mathbf{s}$ are constellation points of an $M$-ary quadrature amplitude modulation ($M$-QAM) scheme. Furthermore, the transmitted data is uncoded and the transmission rate is $R = M_T M_c$ bits per effective