UW-Madison ECE 734 - Practical Multi-access by Exploiting Spacial Diversity in 802.11b


School name University of Wisconsin, Madison

Course ECE 734 - VLSI Array Structures for Digital Signal Processing

Pages 21


Practical Multi-access by Exploiting Spacial Diversity in 802.11b
Huimin Zeng
ECE 734 Spring 2010

Outline
● Introduction
● Overview of problem
● Processing requirement
● FIR filter design with SIMD
● Multicore scheduling
● Conclusion

Introduction
● Demand for high data rates in wireless access
● Exploiting diversity is the key to increasing capacity
● Basic categories
  – Frequency
  – Time
  – Code
  – Space

Spatial Multi-access
● A base station (BS) equipped with an antenna array can separate the signals from multiple mobile stations that share the same frequency band and time slot
● Under strong multipath propagation, a spatial diversity implementation is preferred over beamforming
● The challenge in the decoding phase is the intersymbol interference from multiple users

A Simple Scenario
● Design a practical spatial multi-access scheme for 802.11b
  – The BS tries to decode two overlapped frames from two locations
  – The mobile devices remain unchanged

[Figure: stations ST 1 and ST 2 transmit x1 and x2 to the BS over channels h11, h12, h21 and h22; the BS receives y1 and y2]

y1 = h11*x1 + h21*x2
y2 = h12*x1 + h22*x2

How to decode overlapped frames?
● By applying interference alignment

[Figure: the components of y1 and y2 before alignment, and of y1 and R*y2 after y2 is multiplied by the alignment factor R]

Decode Procedure
● Use the clean preamble of the first-arrived frame, Pa, of the two overlapped frames to align the signals
● Subtract the two aligned signals to obtain a signal that carries only the second frame, Pb, and decode Pb
● Re-encode Pb and cancel it from the original signal to obtain a signal that carries only Pa, then decode Pa

DSP for 802.11b PHY layer
Transmission: MAC → (2 Mbps) → Scramble → (2 Mbps) → QPSK Mod → (32 Mbps) → DS-SS → (352 Mbps) → Up sampling → (1.4 Gbps)
Reception: (1.4 Gbps) → Down sampling → (352 Mbps) → DS-SS Decode → (32 Mbps) → QPSK Demod → (2 Mbps) → De-Scramble → (2 Mbps) → MAC
Reception of overlapped frames: Decode Pb → Regenerate Pb → Decode Pa

Processing Requirement
● High system throughput
  – 1.4 Gbps
● High computation intensity
  – If N ops are needed per bit, 1.4N G ops per second are required
● Real-time requirement

Sora Platform
● The radio control board has a maximum throughput of PCIe x32, which is 64 Gbps
● Sora software support
  – Multi-core processing
  – Intel SSE
● Therefore, in this project, I focus on applying multi-core scheduling and SIMD to optimize the processing speed

Channel Equalizer for Interference Alignment
● To reduce complexity, I choose an MMSE equalizer, which is a sub-optimal linear filter. It minimizes the difference between the output signals and the known transmitted training signals
  – $c = \arg\min_{c} \lVert x - c * r \rVert$

[Figure: Received Signal → Linear FIR Filter → Equalized Signal, with a Coefficient Training block feeding the filter]

Algorithm to compute c

  // Initialize {c_j}:
  c_0 = 1
  for each j <> 0
    c_j = 0
  end for

  // Training:
  for each sample index i do
    for j = 0 .. L do
      d_j = 0
    end for
    w = 0
    for k = 0 .. K - 1 do
      x_est_{i+k} = 0
      for l = 0 .. L do
        x_est_{i+k} = x_est_{i+k} + c_l * y_{i+k-l}
      end for
      err_k = x_{i+k} - x_est_{i+k}
      for j = 0 .. L do
        d_j = d_j + step * err_k * conj(y_{i+k-j})
        w = w + |y_{i+k-j}|^2
      end for
    end for
    for j = 0 .. L do
      c_j = c_j + d_j / w
    end for
  end for

Dynamic Range Analysis
● Inputs: x and y, 16 bits x 2 per sample
● Outputs: c, 16 bits per coefficient, 16 taps

  Variable   Data type     Dynamic range bits   Additional fractional bits   Total register length
  {c_j}      fixed point   16                   -                            16
  {x_i}      integer       16                   -                            16
  {y_i}      integer       16                   -                            16
  {x_est}    fixed point   20                   1                            21
  {err}      fixed point   21                   1                            22
  {d}        fixed point   34                   1                            35
  {w}        integer       36                   0                            36

FIR filter design with SIMD
● The FIR filter is used in both the training and equalizing stages. It is described as: $y_n = \sum_{i=0}^{L} c_i \, x_{n-i}$
● Intel SSE supports 128-bit packed vectors and each FIR sample takes 32 bits, so 4 calculations can be performed simultaneously.

FIR filter design (cont.)

  0    0    0    c0   c1    c2   c3   c4    c5    c6   c7  ...     output
  x18  x17  x16  x15 (x14   x13  x12  x11) (x10   x9   x8  ...     y15
       x18  x17  x16  x15  (x14  x13  x12   x11) (x10  x9  ...     y16
            x18  x17  x16   x15 (x14  x13   x12   x11) (x10 ...    y17
                 x18  x17   x16  x15 (x14   x13   x12   x11) ...   y18

FIR filter design (cont.)
● Memory layout of the FIR filter coefficients

  0    0    0    c0
  0    0    c0   c1
  0    c0   c1   c2
  c0   c1   c2   c3
  c1   c2   c3   c4
  ...
  c12  c13  c14  c15
  c13  c14  c15  0
  c14  c15  0    0
  c15  0    0    0

FIR filter design (cont.)
● SSE2 code

  // Load four 32-bit samples
  movdqa xmm0, [esi]

  // Compute the four results with the first four rows of the FIR filter coefficient table
  mov    edx, Coff          // reset coefficient index
  mov    edi, Buff          // reset temporary accumulated buffer index
  movdqa xmm1, xmm0
  pmullw xmm1, [edx]        // edx is coefficient index
  paddsd xmm1, [edi]        // edi is temporary accumulated buffer index
  movdqa xmm2, xmm0
  pmullw xmm2, [edx+4]
  paddsd xmm2, [edi+4]
  movdqa xmm3, xmm0
  pmullw xmm3, [edx+8]
  paddsd xmm3, [edi+8]
  movdqa xmm4, xmm0
  pmullw xmm4, [edx+12]
  paddsd xmm4, [edi+12]

FIR filter design (cont.)

  // Extract the outputs from the four registers and pack them into a single 128-bit output
  paddsd xmm1, [ecx]        // ecx is the mask index
  paddsd xmm2, [ecx+4]
  paddsd xmm3, [ecx+8]
  paddsd xmm4, [ecx+12]
  paddsd xmm1, xmm2
  paddsd xmm1, xmm3
  paddsd xmm1, xmm4
  movdqa [ebx], xmm1        // ebx is the output memory address

FIR filter design (cont.)

  movdqa xmm1, [edi]
  // Multiply by each of the remaining rows of the FIR filter coefficient table
  // and update the temporary accumulated buffer
  mov    eax, 19            // set total number of iterations
  fir_loop:
  movdqa xmm1, xmm0
  pmullw xmm1, [edx+16]
  paddsd xmm1, [edi+16]     // edi is temporary accumulated buffer index
  movdqa [edi], xmm1        // store the temporary accumulated result
  add    edx, 16            // next coefficient index
  add    edi, 32            // next temporary accumulated buffer index
  dec    eax
  jnz    fir_loop

Multi-core Scheduling
● Still working on it …

Conclusion
● Spatial diversity exploitation is a good additional technique to increase capacity in LAN multi-access
● Although diversity gain implies an increase in computational complexity, it can be implemented even with general purpose
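
The data rates quoted for the 802.11b PHY chain (2 Mbps, 32 Mbps, 352 Mbps, 1.4 Gbps) can be reproduced with simple arithmetic. The sketch below is illustrative only. It takes the 16 bits x 2 per sample from the dynamic range slide, and it assumes an 11-chip spreading sequence and 4x oversampling; those two factors are not stated in the slides and are inferred here from the ratios 352/32 and 1.4 Gbps/352 Mbps. The function names are hypothetical.

```python
def phy_rates(mac_rate_bps=2_000_000,
              bits_per_symbol=2,     # QPSK carries 2 bits per symbol
              bits_per_sample=32,    # 16-bit I + 16-bit Q per complex sample
              chips_per_symbol=11,   # assumption: inferred from 352 / 32
              oversampling=4):       # assumption: inferred from 1.4 Gbps / 352 Mbps
    """Return the bit rate after each stage of the transmit chain."""
    symbol_rate = mac_rate_bps // bits_per_symbol   # symbols per second
    modulated = symbol_rate * bits_per_sample       # after QPSK modulation
    spread = modulated * chips_per_symbol           # after DS-SS spreading
    upsampled = spread * oversampling               # after up sampling
    return {
        "mac": mac_rate_bps,
        "modulated": modulated,
        "spread": spread,
        "upsampled": upsampled,
    }


def required_ops_per_second(ops_per_bit, rate_bps):
    """Processing load if every bit at rate_bps costs ops_per_bit operations."""
    return ops_per_bit * rate_bps


if __name__ == "__main__":
    rates = phy_rates()
    for stage, rate in rates.items():
        print(f"{stage:>10}: {rate / 1e6:.0f} Mbps")
    print(required_ops_per_second(10, rates["upsampled"]) / 1e9, "G ops per second")
```

With the default arguments the last stage comes out at 1408 Mbps, which matches the 1.4 Gbps figure after rounding.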
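
The decode procedure (align, subtract, decode Pb, re-encode, cancel, decode Pa) can be sketched on the two-antenna model y1 = h11*x1 + h21*x2, y2 = h12*x1 + h22*x2. This is a minimal illustration under strong assumptions that the slides do not make: the channels are single complex gains that are known exactly, there is no noise, both frames carry QPSK symbols, and the alignment factor is taken as R = h11/h12 rather than estimated from the preamble of Pa. The helper names are hypothetical.

```python
def slice_qpsk(z):
    """Map a complex value to the nearest QPSK point (+-1 +-1j)."""
    return complex(1.0 if z.real >= 0 else -1.0, 1.0 if z.imag >= 0 else -1.0)


def receive(x1, x2, h11, h12, h21, h22):
    """Two-antenna model from the slides: returns (y1, y2)."""
    y1 = [h11 * a + h21 * b for a, b in zip(x1, x2)]
    y2 = [h12 * a + h22 * b for a, b in zip(x1, x2)]
    return y1, y2


def decode_overlapped(y1, y2, h11, h12, h21, h22):
    """Recover both symbol streams from the two overlapped observations."""
    # Align: after scaling by R, the x1 component of R*y2 equals h11*x1,
    # the same as the x1 component of y1.
    R = h11 / h12
    # Subtract: y1 - R*y2 = (h21 - R*h22) * x2, so x1 is cancelled.
    gain = h21 - R * h22
    x2_hat = [slice_qpsk((a - R * b) / gain) for a, b in zip(y1, y2)]
    # Re-encode x2 and cancel it from y1: y1 - h21*x2 = h11*x1.
    x1_hat = [slice_qpsk((a - h21 * s) / h11) for a, s in zip(y1, x2_hat)]
    return x1_hat, x2_hat
```

In a real receiver the channel gains would have to be estimated and the samples would be noisy, so the subtraction would leave residual interference; the sketch only shows why aligning and subtracting isolates the second frame.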
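
The coefficient training algorithm for the equalizer can be written as a short floating-point sketch. It follows the slide's pseudocode in spirit (c_0 = 1 initially, a gradient d accumulated over K samples, normalization by the input energy w), but several details are assumptions made here: the update is applied with the standard normalized-LMS sign (c_j is increased by d_j/w because err is defined as x minus the estimate), the window advances by K samples per update, and no fixed-point word lengths are modeled. The function names are hypothetical.

```python
def train_equalizer(x, y, num_taps=16, block=8, step=0.5):
    """Block normalized-LMS training of FIR equalizer taps.

    x: known training symbols, y: received samples (same length).
    """
    c = [0j] * num_taps
    c[0] = 1 + 0j                       # start from a pass-through filter
    last = num_taps - 1                 # L in the slide's notation
    for i in range(last, len(y) - block + 1, block):
        d = [0j] * num_taps             # accumulated gradient for this block
        w = 0.0                         # accumulated input energy
        for k in range(block):
            n = i + k
            x_est = sum(c[l] * y[n - l] for l in range(num_taps))
            err = x[n] - x_est
            for j in range(num_taps):
                d[j] += step * err * y[n - j].conjugate()
                w += abs(y[n - j]) ** 2
        if w > 0:
            for j in range(num_taps):
                c[j] += d[j] / w        # normalized gradient step
    return c


def equalize(c, y):
    """Apply the FIR taps c to y; samples before index 0 are treated as zero."""
    return [sum(c[l] * y[n - l] for l in range(len(c)) if n - l >= 0)
            for n in range(len(y))]
```

A typical use would be to call `train_equalizer` on the preamble portion of a frame and then pass the remaining samples through `equalize` with the returned taps.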
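
The shifted coefficient table used by the SIMD FIR design can be checked against a direct evaluation of the FIR sum. The sketch below emulates the four 32-bit lanes with plain Python lists rather than SSE registers, so it says nothing about performance. It only illustrates why a table of (taps + 3) rows, each holding four consecutive coefficients padded with zeros, yields four outputs per group of four samples. The function names and the grouping of samples at multiples of four are assumptions made for the example.

```python
def build_coeff_table(c):
    """Rows [c[r-3], c[r-2], c[r-1], c[r]] for r = 0 .. len(c)+2, zero padded."""
    taps = len(c)

    def coeff(i):
        return c[i] if 0 <= i < taps else 0

    return [[coeff(r - 3 + e) for e in range(4)] for r in range(taps + 3)]


def fir_direct(c, x):
    """Reference FIR: y[n] = sum_i c[i] * x[n-i], with x[m] = 0 for m < 0."""
    return [sum(c[i] * x[n - i] for i in range(len(c)) if n - i >= 0)
            for n in range(len(x))]


def fir_blocked(c, x):
    """FIR computed four outputs at a time from the shifted coefficient table."""
    if len(x) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    table = build_coeff_table(c)
    # Each group holds four samples, newest first, like one packed register.
    groups = [x[4 * q:4 * q + 4][::-1] for q in range(len(x) // 4)]
    out = []
    for q in range(len(groups)):
        for t in range(4):              # output lane within the group
            acc = 0
            p = 0                       # how many groups back in time
            while 4 * p + t < len(table) and q - p >= 0:
                row = table[4 * p + t]
                acc += sum(a * b for a, b in zip(row, groups[q - p]))
                p += 1
            out.append(acc)
    return out
```

For 16 taps the table has 19 rows, which agrees with the memory layout shown in the slides.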
