12
2234 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011 Training Signal Design and Tradeoffs for Spectrally-Ef cient Multi-User MIMO-OFDM Systems Yuejie Chi, Student Member, IEEE, Ahmad Gomaa, Student Member, IEEE, Naofal Al-Dhahir, Fellow, IEEE, and A. Robert Calderbank, Fellow, IEEE Abstract—In this paper, we design MMSE-optimal training sequences for multi-user MIMO-OFDM systems with an arbi- trary number of transmit antennas and an arbitrary number of training symbols. It addresses spectrally-efcient uplink trans- mission scenarios where the users overlap in time and frequency and are separated using spatial processing at the base station. The robustness of the proposed training sequences to residual carrier frequency offset and phase noise is evaluated. This analysis reveals an interesting design tradeoff between the peak- to-average power ratio of a training sequence and the increase in channel estimation mean squared error over the ideal case when these two impairments are not present. Index Terms—Training sequences design, pilot design, MIMO- OFDM, multi-user systems, carrier frequency offset, phase noise, RF impairments. I. I NTRODUCTION I NFORMATION-theoretic analysis by Foschini [1] and by Telatar [2] has shown that multiple antennas at the transmitter and receiver enable high-rate wireless communi- cation. Space-time codes, introduced by Tarokh et al. [3], improve the reliability of communication over fading chan- nels by correlating signals across different transmit antennas. Orthogonal Frequency Division Multiplexing (OFDM) [4] is widely adopted in broadband communications standards for its efcient implementation, high spectral efciency, and robustness to Inter-Symbol Interference (ISI). OFDM offers great exibility in that multiple streams with diverse rates and Quality-of-Service (QoS) requirements can be transmitted over the parallel frequency subchannels. However, there are two main drawbacks in OFDM; the rst is high Peak-to- Average Power Ratio (PAPR) which results in larger backoff Manuscript received June 22, 2010; revised November 11, 2010 and Febru- ary 14, 2011; accepted March 28, 2011. The associate editor coordinating the review of this paper and approving it for publication was N. Arumugam. Y. Chi is with the Department of Electrical Engineering, Princeton Univer- sity, Princeton, NJ 08544, USA (e-mail: [email protected]). A. Gomaa and N. Al-Dhahir are with the Department of Electrical Engi- neering, University of Texas at Dallas, Richardson, TX 75083, USA (e-mail: {aag083000, aldhahir}@utdallas.edu). A. R. Calderbank is with the Department of Computer Science, Duke University, Durham, NC 27708, USA (e-mail: [email protected]). The work of A. Gomaa and N. Al-Dhahir was supported in part by QNRF and RIM Inc. The work of Y. Chi and R. Calderbank was supported by the Ofce of Naval Research under Grant N00014-08-1-1110, by the Air Force Ofce of Scientic Research under Grant FA 9550-09-1-0643, and by NSF under Grants NSF CCF -0915299 and NSF CCF-1017431. The work of A. Gomaa and N. Al- Dhahirwas supported in part by QNRF and RIM Inc. Digital Object Identier 10.1109/TWC.2011.042211.101100 with nonlinear ampliers, and the second is high sensitivity to frequency errors and phase noise. We will address both issues in this paper. Our focus is on training sequence design for the combination of Multiple-Input-Multiple-Output (MIMO) sys- tems and OFDM technology (see [5] and references therein), and we aim to make this combination more attractive by reducing the overhead that is necessary for channel estimation. Current multi-user MIMO-OFDM systems [6] support mul- tiple users by assigning each time/frequency slot to only one user. For example, in OFDMA systems (adopted in the WiMAX [7] and LTE standards [8]), different users are assigned different subcarriers within the same OFDMA symbol. A different method of separating users is through the random-access CSMA/CA medium access control (MAC) protocol used in WLAN standards, e.g. IEEE 802.11n [9]. Both methods require that users not overlap in either time or frequency and this restriction results in a signicant loss in spectral efciency. The introduction of multiple receive antennas at the base station means that it is possible to improve spectral efciency by allowing users to overlap while maintaining decodability, as in the recently-proposed Coordi- nated MultiPoint transmission (CoMP) techniques in the LTE- Advanced standard [10]. Accurate Channel State Information (CSI) is required at the receiver for coherent detection and is typically acquired by sending known training sequences from the transmit an- tennas and inferring channel parameters from the received signals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO) systems. However channel estimation is more challenging in a multi-user MIMO-OFDM system because there are more link parameters to calculate, and their estimation is complicated by interference between different transmissions. The direct approach is to invert a large matrix that describes cross- antenna interference at each OFDM tone [14]. Complexity can be reduced by exploiting the correlation between adjacent subchannels [15]. It is also possible to develop solutions in the time domain [16] where the challenge is to estimate time of arrivals. Here it is possible to reduce complexity by exploiting the power-delay proles of the typical urban and hilly terrain propagation models. MIMO Channel estimation schemes were investigated in [17] for single-carrier single-user systems in the context of GSM-EDGE. Linear Least-Squares (LLS) channel estimation is of great 1536-1276/11$25.00 c 2011 IEEE

2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2234 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

Training Signal Design and Tradeoffs forSpectrally-Efficient Multi-User

MIMO-OFDM SystemsYuejie Chi, Student Member, IEEE, Ahmad Gomaa, Student Member, IEEE, Naofal Al-Dhahir, Fellow, IEEE,

and A. Robert Calderbank, Fellow, IEEE

Abstract—In this paper, we design MMSE-optimal trainingsequences for multi-user MIMO-OFDM systems with an arbi-trary number of transmit antennas and an arbitrary number oftraining symbols. It addresses spectrally-efficient uplink trans-mission scenarios where the users overlap in time and frequencyand are separated using spatial processing at the base station.The robustness of the proposed training sequences to residualcarrier frequency offset and phase noise is evaluated. Thisanalysis reveals an interesting design tradeoff between the peak-to-average power ratio of a training sequence and the increasein channel estimation mean squared error over the ideal casewhen these two impairments are not present.

Index Terms—Training sequences design, pilot design, MIMO-OFDM, multi-user systems, carrier frequency offset, phase noise,RF impairments.

I. INTRODUCTION

INFORMATION-theoretic analysis by Foschini [1] andby Telatar [2] has shown that multiple antennas at the

transmitter and receiver enable high-rate wireless communi-cation. Space-time codes, introduced by Tarokh et al. [3],improve the reliability of communication over fading chan-nels by correlating signals across different transmit antennas.Orthogonal Frequency Division Multiplexing (OFDM) [4]is widely adopted in broadband communications standardsfor its efficient implementation, high spectral efficiency, androbustness to Inter-Symbol Interference (ISI). OFDM offersgreat flexibility in that multiple streams with diverse ratesand Quality-of-Service (QoS) requirements can be transmittedover the parallel frequency subchannels. However, there aretwo main drawbacks in OFDM; the first is high Peak-to-Average Power Ratio (PAPR) which results in larger backoff

Manuscript received June 22, 2010; revised November 11, 2010 and Febru-ary 14, 2011; accepted March 28, 2011. The associate editor coordinating thereview of this paper and approving it for publication was N. Arumugam.

Y. Chi is with the Department of Electrical Engineering, Princeton Univer-sity, Princeton, NJ 08544, USA (e-mail: [email protected]).

A. Gomaa and N. Al-Dhahir are with the Department of Electrical Engi-neering, University of Texas at Dallas, Richardson, TX 75083, USA (e-mail:{aag083000, aldhahir}@utdallas.edu).

A. R. Calderbank is with the Department of Computer Science, DukeUniversity, Durham, NC 27708, USA (e-mail: [email protected]).

The work of A. Gomaa and N. Al-Dhahir was supported in part by QNRFand RIM Inc.

The work of Y. Chi and R. Calderbank was supported by the Office of NavalResearch under Grant N00014-08-1-1110, by the Air Force Office of ScientificResearch under Grant FA 9550-09-1-0643, and by NSF under Grants NSFCCF -0915299 and NSF CCF-1017431. The work of A. Gomaa and N. Al-Dhahirwas supported in part by QNRF and RIM Inc.

Digital Object Identifier 10.1109/TWC.2011.042211.101100

with nonlinear amplifiers, and the second is high sensitivity tofrequency errors and phase noise. We will address both issuesin this paper. Our focus is on training sequence design for thecombination of Multiple-Input-Multiple-Output (MIMO) sys-tems and OFDM technology (see [5] and references therein),and we aim to make this combination more attractive byreducing the overhead that is necessary for channel estimation.

Current multi-user MIMO-OFDM systems [6] support mul-tiple users by assigning each time/frequency slot to onlyone user. For example, in OFDMA systems (adopted inthe WiMAX [7] and LTE standards [8]), different usersare assigned different subcarriers within the same OFDMAsymbol. A different method of separating users is throughthe random-access CSMA/CA medium access control (MAC)protocol used in WLAN standards, e.g. IEEE 802.11n [9].Both methods require that users not overlap in either timeor frequency and this restriction results in a significant lossin spectral efficiency. The introduction of multiple receiveantennas at the base station means that it is possible toimprove spectral efficiency by allowing users to overlap whilemaintaining decodability, as in the recently-proposed Coordi-nated MultiPoint transmission (CoMP) techniques in the LTE-Advanced standard [10].

Accurate Channel State Information (CSI) is required atthe receiver for coherent detection and is typically acquiredby sending known training sequences from the transmit an-tennas and inferring channel parameters from the receivedsignals. Various OFDM channel estimation schemes [11]-[13]have been proposed for Single-Input Single-Output (SISO)systems. However channel estimation is more challenging in amulti-user MIMO-OFDM system because there are more linkparameters to calculate, and their estimation is complicatedby interference between different transmissions. The directapproach is to invert a large matrix that describes cross-antenna interference at each OFDM tone [14]. Complexitycan be reduced by exploiting the correlation between adjacentsubchannels [15]. It is also possible to develop solutions in thetime domain [16] where the challenge is to estimate time ofarrivals. Here it is possible to reduce complexity by exploitingthe power-delay profiles of the typical urban and hilly terrainpropagation models. MIMO Channel estimation schemes wereinvestigated in [17] for single-carrier single-user systems in thecontext of GSM-EDGE.

Linear Least-Squares (LLS) channel estimation is of great

1536-1276/11$25.00 c⃝ 2011 IEEE

Page 2: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2235

practical importance since it does not require prior knowl-edge of the channel statistics and enjoys low implementa-tion complexity. We consider frequency-selective block-fadingchannels where the Time Domain (TD) representation requiresfewer parameters than the Frequency Domain (FD) representa-tion. Our focus is on the design of (optimal) training sequencesfor Multi-User MIMO OFDM systems that minimize the meansquared error of time-domain LLS channel estimation. Thedesign of optimal training sequences for single-user MIMO-OFDM systems is investigated in [18] and [19]. The Fouriermethods used in [18] provide some control over PAPR andsome resilience to frequency offsets. The construction of opti-mal training sequences for multi-user MIMO-OFDM systemshas been investigated in both the time domain [20] and thefrequency domain [21], but these designs do not easily extendto multiple OFDM training symbols. It is also possible totake advantage of the similarities between communicationsand radar signal processing, where the path gains and delaysare the range / Doppler coordinates of a scattering sourceand the problem is to estimate them. The unitary filter bankdeveloped for Instantaneous Radar Polarimetry [22] supportsfrequency domain LLS channel estimation in a 2x2 MIMOOFDM system [23] and is able to suppress interference overtwo OFDM symbols with linear complexity. This exampleis a special case of a more general construction of filterbanks for the analysis of acoustic surface waves [24], [25].A limitation of these methods is that the number of OFDMtraining symbols is at least the number of transmit antennas.

In contrast, our framework supports the design of optimaltraining sequences for an arbitrary number of transmit anten-nas and an arbitrary number of training symbols. It providesthe first general solution to the channel estimation problemfor Multi-User MIMO-OFDM systems where Spatial DivisionMultiple Access (SDMA) is employed to increase the spectralefficiency. The optimality of our designs holds irrespectiveof the number of transmit antennas per user, the numberof OFDM sub-carriers, the channel delay spread, and thenumber of users provided that the number of tones dedicatedto estimation exceeds the product of the number of transmitantennas and the worst case delay spread. Not only does ourdesign algorithm generate training sequences that minimizemean squared channel estimation error, but the designs haveadditional properties that make them very attractive fromseveral implementation perspectives: 1) Individual trainingsequences can be drawn from standard signal constellations,2) Low PAPR, and 3) Low channel estimation complexitywithout sacrificing optimality.

We start by considering the optimal training sequencedesign for uplink Multi-User MIMO-OFDM systems whereall users are assumed to be synchronized. Then, we analyzethe average performance degradation when the users areasynchronous, i.e. with residual Carrier Frequency Offsets(CFO). Next, we investigate the impact of Phase Noise (PN)perturbing transmit and receive oscillators on the channelestimation accuracy. This analysis leads to an interestingdesign tradeoff between the PAPR of a training sequence andits robustness to CFO and PN. The main contributions of thispaper are

∙ Optimal training sequences design for Multi-User

MIMO-OFDM systems with an arbitrary number oftransmit antennas per user and an arbitrary number oftraining OFDM symbols as long as the rank condition(20) holds.

∙ Allowing users to overlap in time and frequency toincrease the spectral efficiency.

∙ Analytical study of CFO and PN effects on the channelestimation performance for any training sequence takinginto account PN at both the transmit and receive oscilla-tors.

∙ Investigating the trade-off between the PAPR of thetraining sequence and its immunity against CFO and PN.

The rest of this paper is organized as follows. The uplinkMulti-User MIMO-OFDM communication system model isdescribed in Section II. The design of optimal training se-quences is given for one and multiple training symbol senariosseparately in Section III. Practical issues such as CFO and PNare discussed in Section IV. Design trade-offs are discussedin Section V. Simulation results are presented in Section VI.Finally, conclusions are drawn in Section VII.

A note on notation: We use boldface to denote matricesand vectors. For a matrix A, A𝑇 denotes its transpose,A𝐻 denotes its complex-conjugate transpose, A† denotes itsPenrose-Moore pseudo-inverse, A−1 denotes its inverse if itexists, and Tr(A) denotes its trace. I𝑛 denotes an identitymatrix of dimension 𝑛 and 0𝑚×𝑛 denotes an all-zero matrixof size 𝑚 × 𝑛. The notation diag (𝑥1, 𝑥2, . . . , 𝑥𝑁 ) denotesan 𝑁 × 𝑁 diagonal matrix whose diagonal elements are{𝑥1, 𝑥2, . . . , 𝑥𝑁}. The operator ⊗ denotes the Kroneckerproduct, and the operator ∘ denotes the entry-wise Hadamardproduct. We also summarize the key variables used throughoutthe paper in Table I.

II. SYSTEM MODEL

We consider the uplink of a Multi-User MIMO-OFDMsystem, as shown in Fig. 1. We denote the Discrete FourierTransform (DFT) size by 𝑁 and the number of users by 𝐿(𝐿 ≥ 1) where the 𝑖th user is equipped with 𝑀𝑖 transmitantennas, 0 ≤ 𝑖 ≤ 𝐿 − 1. Therefore, the total number oftransmit antennas among all users is given by 𝑀 =

∑𝐿−1𝑖=0 𝑀𝑖.

We assume that the channel is quasi-static and remainsconstant over 𝐾 successive OFDM training symbols. Thechannel from the 𝑗th transmit antenna of the 𝑖th user tothe Base Station (BST) can be represented either in TD orFD. Let the Channel Frequency Response (CFR) be H𝑖,𝑗 =[𝐻𝑖,𝑗(0), ⋅ ⋅ ⋅ , 𝐻𝑖,𝑗(𝑁 − 1)]𝑇 where 𝐻𝑖,𝑗(𝑘), 0 ≤ 𝑘 ≤ 𝑁 − 1,is the frequency response at the 𝑘th subcarrier. However, theChannel Impulse Response (CIR) in TD is represented bya much smaller number of parameters. We assume that themaximal memory over all CIRs is 𝜈max, and write the CIR ash𝑖,𝑗 = [ℎ𝑖,𝑗(0), ⋅ ⋅ ⋅ , ℎ𝑖,𝑗(𝜈max)]

𝑇 . Estimating the CIR insteadof the CFR leads to the reduction of the number of unknownsfrom 𝑀𝑁 to 𝑀(𝜈max + 1). Hence, a more accurate channelestimate is attainable using the same amount of training.Furthermore, the CFR can be reconstructed from the CIR asfollows

𝐻𝑖,𝑗(𝑘) =1√𝑁

𝜈max∑𝑡=0

ℎ𝑖,𝑗(𝑡)𝑒−𝑗 2𝜋

𝑁𝑡𝑘. (1)

Page 3: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2236 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

TABLE IKEY VARIABLES USED THROUGHOUT THE PAPER.

Variable Meaning Domain𝐿 number of users ℤ+

𝑀𝑖 number of transmit antennas for the 𝑖th user ℤ+

𝑀 total number of transmit antennas ℤ+

𝑁 DFT size of the OFDM system ℤ+

𝐿𝑝 cyclic prefix length ℤ+

𝜈max the maximal memory of all CIRs ℤ+

X𝑖,𝑗 OFDM symbol from the 𝑗th antenna of the 𝑖th user ℂ𝑁

h𝑖,𝑗 CIR from the 𝑗th antenna of the 𝑖th user ℂ𝜈max+1

H𝑖,𝑗 CFR from the 𝑗th antenna of the 𝑖th user ℂ𝑁

F DFT matrix of size 𝑁 ℂ𝑁×𝑁

F0 the first (𝜈max + 1) columns of F ℂ𝑁×(𝜈max+1)

Λ𝑚 the transform operator between training sequences ℂ𝑁×𝑁

h𝑚 CIR from the 𝑚th antenna ℂ𝜈max+1

X𝑡𝑚 training sequence from the 𝑚th antenna in the 𝑡th OFDM symbol ℂ𝑁

S𝑡𝑚 circulant training matrix from the 𝑚th antenna in the 𝑡th OFDM symbol ℂ𝑁×(𝜈max+1)

S𝑡 training matrix in the 𝑡th OFDM symbol ℂ𝑁×𝑀(𝜈max+1)

y𝑡 received signal in the 𝑡th OFDM symbol ℂ𝑁

𝛼 normalized carrier frequency offset ℝ

𝑓sub subcarrier frequency spacing (Hz) ℝ+

𝛽 two-sided 3-dB linewidth of the oscillator power spectrum density (Hz) ℝ+

𝜙𝑛 𝑛th phase noise sample ℝ

At the 𝑗th (0 ≤ 𝑗 ≤ 𝑀𝑖 − 1) transmit antenna of the𝑖th (0 ≤ 𝑖 ≤ 𝐿 − 1) user, an OFDM symbol X𝑖,𝑗 ofsize 𝑁 is given by X𝑖,𝑗 = [𝑋𝑖,𝑗(0), ⋅ ⋅ ⋅ , 𝑋𝑖,𝑗(𝑁 − 1)]𝑇 .Let x𝑖,𝑗 = [𝑥𝑖,𝑗(0), 𝑥𝑖,𝑗(1), ⋅ ⋅ ⋅ , 𝑥𝑖,𝑗(𝑁 − 1)]

𝑇 be the InverseDiscrete Fourier Transform (IDFT) of X𝑖,𝑗 . We use a Cyclic-Prefix (CP) of length 𝐿𝑝 for the guard interval in the OFDMsystem so that

x𝑖,𝑗 = [𝑥(𝑁 − 𝐿𝑝 + 1), ⋅ ⋅ ⋅ , 𝑥(𝑁 − 1), 𝑥(0), ⋅ ⋅ ⋅ , 𝑥(𝑁 − 1)]𝑇

(2)where 𝐿𝑝 is chosen to be greater than the channel memory, i.e.𝐿𝑝 ≥ (𝜈max +1). Finally, x𝑖,𝑗 goes through Parallel-to-Serial(P/S) conversion and is modulated to the carrier frequency fortransmission.

At the base station, all users are assumed to be in frequencysynchronization with the BST. In Section IV, we examine therobustness of our proposed optimal training sequence designwhen this condition is not satisfied. In addition, we assumethat all users are synchronized in time with the BST, wherethe received signal is down-converted to baseband and passedthrough a Serial-to-Parallel (S/P) converter. Then, the CP isremoved and the Fast Fourier Transform (FFT) is applied. Thereceived OFDM symbol Y = [𝑌 (0), ⋅ ⋅ ⋅ , 𝑌 (𝑁 − 1)]𝑇 in onesymbol time can be written as

Y =

𝐿−1∑𝑖=0

𝑀𝑖−1∑𝑗=0

diag(H𝑖,𝑗)X𝑖,𝑗 +N, (3)

where N ∼ 𝒩 (0𝑁×1, 𝜎2I𝑁 ) is Additive White Gaussian

Noise (AWGN). We consider the mapping (𝑖, 𝑗) �→ 𝑚 : 𝑚 =∑𝑖𝑠=0 𝑀𝑠 + 𝑗 −𝑀𝑖, 0 ≤ 𝑚 ≤ 𝑀 − 1, and re-label H𝑖,𝑗 and

X𝑖,𝑗 as H𝑚 and X𝑚, respectively. The label can be invertedeasily as

𝑖 = argmin0≤𝑖∗≤𝐿−1

𝑖∗ s.t. 𝑚 ≤𝑖∗∑𝑠=0

𝑀𝑠, 𝑗 = 𝑚+𝑀𝑖 −𝑖∑𝑠=0

𝑀𝑠.

(4)

Then, equation (3) can be written as

Y =𝑀−1∑𝑚=0

diag(H𝑚)X𝑚 +N. (5)

Remark: The development of the algorithm requires labelingof transmit antennas among all users, and that both the BSTand all the users are aware of that labeling.

In Section III, we first consider TD LLS channel estimationwhen only one training symbol is allowed by leveragingthe channel representation in TD. A general approach when𝐾 ≥ 2 training symbol is given by further incorporatingspace-time code structure into the design. Then, a specialconstruction utilizing Quaternions is given when 𝐾 = 2.Finally, an alternative scheme using equally-spaced pilotsinstead of the whole symbol for training is given under somemild conditions. In the following sections, we assume onereceive antenna, since the same channel estimation scheme canbe applied at all receive antennas without loss of generality.

III. MAIN RESULTS

A. One OFDM Training Symbol

Since there are fewer parameters to be estimated in the TD,we apply the IDFT of size 𝑁 to Eq. (5), and get

y =

𝑀−1∑𝑚=0

S𝑚h𝑚 + n

=[S0 S1 ⋅ ⋅ ⋅ S𝑀−1

] [h𝐻0 h𝐻1 . . . h𝐻𝑀−1

]𝐻+ n

≜ Sh+ n (6)

where y ∈ ℂ𝑁 , h𝑚 ∈ ℂ𝜈max+1, 0 ≤ 𝑚 ≤ 𝑀 − 1, and S𝑚 ∈ℂ𝑁×(𝜈max+1) is the circulant training matrix constructed fromthe corresponding training sequence transmitted over the 𝑚thantenna.

Let F = [f0, ⋅ ⋅ ⋅ , f𝑁−1] be the DFT matrix of size 𝑁 withf𝑖 denoting its 𝑖th column, and let F0 = [f0, ⋅ ⋅ ⋅ , f𝜈max ] be

Page 4: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2237

User 0

Encoder

IFFT

IFFT

CP

CP

......

...

User 1

Encoder

IFFT

IFFT

CP

CP

......

...

User L− 1

Encoder

IFFT

IFFT

CP

CP

......

...

Decoder

FFT

FFT

CP

CP

P/S

......

...

Base Station

·

·

·

...

...

...

...

S/P

P/S

P/S

P/S

P/S

P/S

P/S

P/S

S/P

Fig. 1. The uplink of a Multi-User MIMO-OFDM communication system.

composed of the first (𝜈max + 1) columns of F. Then, S𝑚can be written as

S𝑚 = F𝐻D𝑚F0, (7)

where D𝑚 = diag (𝑋𝑚(0), ⋅ ⋅ ⋅ , 𝑋𝑚(𝑁 − 1)). The matrixS ∈ ℂ𝑁×𝑀(𝜈max+1) defined in Eq. (6) is formed by hori-zontally concatenating the matrices S𝑚, 0 ≤ 𝑚 ≤ 𝑀 − 1.To enable LLS channel estimation, the following condition ondimensionality has to be satisfied [26]

𝑁 ≥ 𝑀(𝜈max + 1) or, 𝑀 ≤ 𝑁

(𝜈max + 1). (8)

To minimize the variance of the channel estimation error, thematrix S is required to satisfy [26]

S𝐻S = 𝑐I𝑀(𝜈max+1) (9)

and this requires that

S𝐻𝑚S𝑛 = 𝑐𝛿𝑚𝑛I(𝜈max+1), 0 ≤ 𝑚,𝑛 ≤ 𝑀 − 1. (10)

Given (7), the optimality condition becomes

F𝐻0 D𝐻𝑚D𝑛F0 = 𝑐𝛿𝑚𝑛I(𝜈max+1), 0 ≤ 𝑚,𝑛 ≤ 𝑀 − 1. (11)

Next, let F𝑚 be composed of (𝜈𝑚𝑎𝑥 + 1) consecutivecolumns of F starting at index 𝑚(𝜈max + 1), i.e.

F𝑚 =[f𝑚(𝜈max+1), ⋅ ⋅ ⋅ , f(𝑚+1)(𝜈max+1)−1

]= Λ𝑚F0, 0 ≤ 𝑚 ≤ 𝑀 − 1, (12)

where

Λ𝑚 = diag(1, 𝑒𝑗

2𝜋(𝜈max+1)𝑁 𝑚, ⋅ ⋅ ⋅ , 𝑒𝑗 2𝜋(𝜈max+1)(𝑁−1)

𝑁 𝑚).

(13)It can be easily shown that F𝐻𝑚F𝑛 = 𝛿𝑚𝑛I(𝜈max+1). Now we

present a general approach which gives a family of optimaltraining sequences. As a starting point, we choose the FD

training sequence as an arbitrary constant-amplitude sequenceX. Let D = diag (𝑋(0), ⋅ ⋅ ⋅ , 𝑋(𝑁 − 1)), then D𝐻D = 𝑐I𝑁where 𝑐 is determined by the signal constellation and/ortransmit power constraints. The FD training sequence at the𝑚th transmit antenna is given by

X𝑚 = Λ𝑚X, 0 ≤ 𝑚 ≤ 𝑀 − 1. (14)

Equivalently, D𝑚 = Λ𝑚D = DΛ𝑚, 0 ≤ 𝑚 ≤ 𝑀 − 1.Furthermore, we have the following theorem.

Theorem 1: The choice of FD training sequences in Eq. (14)is optimal for a single training OFDM symbol.

Proof: It is enough to show that Eq. (10) holds. Since

S𝑚 = F𝐻D𝑚F0 = F𝐻DΛ𝑚F0 = F𝐻DF𝑚, (15)

it follows that

S𝐻𝑚S𝑛 = (F𝐻DF𝑚)𝐻F𝐻DF𝑛

= F𝐻𝑚D𝐻DF𝑛 = 𝑐𝛿𝑚𝑛I(𝜈max+1). (16)

Therefore, the LLS estimate (LLSE) of h is given as h =1𝑐S𝐻y, where each CIR can be estimated as h𝑚 = 1

𝑐S𝐻𝑚y.

Then, the CFR is given by

H𝑚 =1

𝑐FS𝐻𝑚y =

1

𝑐(FF𝐻0 )D𝐻𝑚Fy, 0 ≤ 𝑚 ≤ 𝑀 − 1. (17)

The resulting channel estimation error variance is given by

𝜎2𝑒 = 𝜎2Tr

((S𝐻S)−1

)=

𝑀(𝜈max + 1)

𝑐𝜎2. (18)

B. 𝐾 OFDM Training Symbols with 𝐾 ≥ 2

The major limitation of using only one training OFDMsymbol is that the total number of transmit antennas is limitedby 𝑁

(𝜈max+1) . When the channel is quasi-static over 𝐾 ≥ 2OFDM training symbols it is possible to increase the number

Page 5: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2238 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

y =

⎡⎢⎢⎢⎣y0y1...

y𝐾−1

⎤⎥⎥⎥⎦ =

⎡⎢⎢⎢⎣S00 S01 ⋅ ⋅ ⋅ S0,𝑀−1

S10 S11 ⋅ ⋅ ⋅ S1,𝑀−1

......

. . ....

S𝐾−1,0 S𝐾−1,1 ⋅ ⋅ ⋅ S𝐾−1,𝑀−1

⎤⎥⎥⎥⎦⎡⎢⎢⎢⎣h0h1...

h𝑀−1

⎤⎥⎥⎥⎦+ n ≜

⎡⎢⎢⎢⎣S0S1...

S𝐾−1

⎤⎥⎥⎥⎦h+ n = Sh+ n (19)

of admissable transmit antennas and reduce MMSE by a factorof 𝐾 .

Denoting the received TD OFDM symbol in the 𝑡th symboltime by y𝑡, 0 ≤ 𝑡 ≤ 𝐾 − 1, we express the received symbolblock y in Eq. 19 where S𝑡𝑚 = F𝐻D𝑡𝑚F0, and the matricesD𝑡𝑚’s are diagonal with the FD training sequences appearingon their main diagonals. Least-square estimation is possiblewhen the following dimensionality condition for the matrixS ∈ ℂ𝐾𝑁×𝑀(𝜈max+1) holds

𝐾𝑁 ≥ 𝑀(𝜈max+1), or,𝐿−1∑𝑖=0

𝑀𝑖 = 𝑀 ≤ 𝐾𝑁

(𝜈max + 1). (20)

For S to be optimal, it has to satisfy S𝐻S = 𝑐I𝑀(𝜈max+1)

for some 𝑐. We extend our previous approach by constructinga unitary matrix of higher dimension with the space-time codestructure. Let the matrix Σ ∈ ℂ𝐾𝑁×𝐾𝑁 be constructed as aKronecker product Σ = U ⊗V where U = [𝑈𝑡𝑞] ∈ ℂ𝐾×𝐾

is a unitary matrix and V ∈ ℂ𝑁×𝑁 is a diagonal matrixsatisfying V𝐻V = 𝑐I𝑁 . Therefore the matrix Σ satisfies

Σ𝐻Σ = U𝐻U⊗V𝐻V = 𝑐I𝐾𝑁 . (21)

We give the following general design of optimal trainingsequences. For 0 ≤ 𝑚 ≤ 𝑀 − 1, let 𝑝 =

⌊𝑚𝐾

⌋, 0 ≤ 𝑝 ≤⌊

𝑀−1𝐾

⌋and 𝑞 = 𝑚 − 𝐾𝑝 ∈ {0, ⋅ ⋅ ⋅ ,𝐾 − 1}. For the 𝑚th

transmit antenna, its FD training sequence matrix at the 𝑡thOFDM training symbol is given by

D𝑡𝑚 = Σ𝑡𝑞Λ𝑝, if 𝑚 = 𝐾𝑝+ 𝑞, 0 ≤ 𝑚 ≤ 𝑀 − 1, (22)

where Σ𝑡𝑞 = 𝑈𝑡𝑞V is the 𝑁 ×𝑁 diagonal matrix located atthe (𝑡, 𝑞) block of Σ.

The bijection 𝜋 : 𝑚 �→ {𝑝, 𝑞} groups the antennas into 𝐾classes depending on the equivalence of the residue 𝑞. For twoantennas not in the same class, their training sequences canbe proved orthogonal over any OFDM training symbol. Fortwo antennas in the same class, their training sequences canbe proved orthogonal over all 𝐾 OFDM training symbols. Wegive the detailed proof below.

Theorem 2: The training sequences in (22) are optimal for𝐾 training OFDM symbols.

Proof: It is enough to show that

𝐾−1∑𝑡=0

S𝐻𝑡𝑚S𝑡𝑛 = F𝐻0

(𝐾−1∑𝑡=0

D𝐻𝑡𝑚D𝑡𝑛

)F0 = 𝑐𝛿𝑚𝑛I(𝜈max+1)

(23)It is obvious that when 𝑚 = 𝑛, the above equation holds.

When 𝑚 ∕= 𝑛, we write 𝑚 = 𝐾𝑝1 + 𝑞1 and 𝑛 = 𝐾𝑝2 + 𝑞2and split the proof into two cases:

∙ 𝑞1 = 𝑞2 = 𝑞 ∈ {0, ⋅ ⋅ ⋅ ,𝐾 − 1} but 𝑝1 ∕= 𝑝2. Then,

D𝑡𝑚 = Σ𝑡𝑞Λ𝑝1 and D𝑡𝑛 = Σ𝑡𝑞Λ𝑝2 , (24)

for 0 ≤ 𝑡 ≤ 𝐾 − 1, therefore,

𝐾−1∑𝑡=0

S𝐻𝑡𝑚S𝑡𝑛 = F𝐻𝑝1

(𝐾−1∑𝑡=0

Σ𝐻𝑡𝑞Σ𝑡𝑞

)F𝑝2

= 𝑐F𝐻𝑝1F𝑝2 = 0(𝜈max+1).

∙ 𝑞1 ∕= 𝑞2. Then from Eq. (24), we have

𝐾−1∑𝑡=0

D𝐻𝑡𝑚D𝑡𝑛 = Λ𝐻𝑝1

(𝐾−1∑𝑡=0

Σ𝐻𝑡,𝑞1Σ𝑡,𝑞2

)Λ𝑝2 = 0𝑁 .

(25)

Now, Eq. (23) follows trivially.Finally, the LLSE of h is given by h = 1

𝑐

∑𝐾−1𝑡=0 S

𝐻𝑡 y𝑡

where each CIR can be estimated as h𝑚 = 1𝑐

∑𝐾−1𝑡=0 S

𝐻𝑡𝑚y𝑡.

Then, the CFR at the 𝑚th transmit antenna is given by

H𝑚 =1

𝑐Fh𝑚 =

1

𝑐(FF𝐻0 )

𝐾−1∑𝑡=0

D𝐻𝑡𝑚Fy𝑡. (26)

Let 𝑐 = 𝐾𝑐, the resulting channel estimation error varianceis given by

𝜎2𝑒 = 𝜎2Tr

(𝐾−1∑𝑡=0

S𝐻𝑡 S𝑡)−1

)=

𝑀(𝜈max + 1)

𝐾𝑐𝜎2.

C. Special case when 𝐾 = 2

When 𝐾 = 2, like the Alamouti Space-Time Block Code(STBC), our construction of training sequences makes use ofHamilton’s Biquaternions. We will choose two FD trainingsequencesX and Z where the sum of their squared amplitudesis constant, i.e.

D𝐻𝑋D𝑋 +D𝐻𝑍D𝑍 = 𝑐I𝑁 . (27)

where D𝑋 = diag (𝑋(0), ⋅ ⋅ ⋅ , 𝑋(𝑁 − 1)), and D𝑍 =diag (𝑍(0), ⋅ ⋅ ⋅ , 𝑍(𝑁 − 1)).

For 0 ≤ 𝑚 ≤ 𝑀 − 1, let 𝑝 =⌊𝑚2

⌋, 0 ≤ 𝑝 ≤ ⌊

𝑀−12

⌋and 𝑞 = 𝑚 − 2𝑝 ∈ {0, 1}. Let X𝑝 = Λ𝑝X and Z𝑝 = Λ𝑝Z,0 ≤ 𝑝 ≤ ⌊

𝑀−12

⌋, where Λ𝑝 is defined in Eq. (13).

The diagonal FD training matrices of the 𝑚th antenna inthe 0th and 1st training symbols are given by D0𝑚 and D1𝑚

respectively:

D0𝑚 =

{Λ𝑝D𝑋 , if 𝑞 = 0,𝑚 = 2𝑝Λ𝐻𝑝 D

𝐻𝑍 , if 𝑞 = 1,𝑚 = 2𝑝+ 1

, (28)

and D1𝑚 =

{Λ𝑝D𝑍 , if 𝑞 = 0,𝑚 = 2𝑝−Λ𝐻𝑝 D𝐻𝑋 , if 𝑞 = 1,𝑚 = 2𝑝+ 1

. (29)

We summarize the FD training sequences design in Table II.Theorem 3: The FD training sequences in Table II are

optimal for two training OFDM symbols.

Page 6: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2239

TABLE IIFD TRAINING SEQUENCES AT THE 𝑚TH TRANSMIT ANTENNA WHEN TWO

TRAINING OFDM SYMBOLS ARE AVAILABLE.

𝑚 = 2𝑝 + 𝑞 𝑞 = 0 𝑞 = 10th symbol X𝑝 −Z𝐻

𝑝

1st symbol Z𝑝 X𝐻𝑝

Proof: It is enough to show that

S𝐻0𝑚S0𝑛 + S𝐻1𝑚S1𝑛 = F𝐻0 (D𝐻0𝑚D0𝑛 +D

𝐻1𝑚D1𝑛)F0

= 𝑐𝛿𝑚𝑛I(𝜈max+1). (30)

It is obvious that the above equation holds when 𝑚 = 𝑛.When 𝑚 ∕= 𝑛, we write 𝑚 = 2𝑝1 + 𝑞1 and 𝑛 = 2𝑝2 + 𝑞2 andconsider two cases

∙ 𝑞1 = 𝑞2 but 𝑝1 ∕= 𝑝2. Without loss of generality, weassume 𝑞1 = 𝑞2 = 0 and get

S𝐻0𝑚S0𝑛 = F𝐻0 D𝐻0𝑚D0𝑛F0 = F𝐻𝑝1D

𝐻𝑋D𝑋F𝑝2 ,

and S𝐻1𝑚S1𝑛 = F𝐻0 D𝐻1𝑚D1𝑛F0 = F𝐻𝑝1D

𝐻𝑍D𝑍F𝑝2

Hence,

S𝐻0𝑚S0𝑛 + S𝐻1𝑚S1𝑛 = F𝐻𝑝1

(D𝐻𝑋D𝑋 +D𝐻𝑍D𝑍

)F𝑝2

= 𝑐F𝐻𝑝1F𝑝2 = 0(𝜈max+1).

∙ 𝑞1 ∕= 𝑞2. Without loss of generality, we assume 𝑞1 = 0and 𝑞2 = 1 and write

D𝐻0𝑚D0𝑛 +D𝐻1𝑚D1𝑛

= (Λ𝑝1D𝑋)𝐻(Λ𝐻𝑝2D

𝐻𝑍 ) + (Λ𝑝1D𝑍)

𝐻(−Λ𝐻𝑝2D𝐻𝑋)= Λ𝐻𝑝1(D

𝐻𝑋D

𝐻𝑍 −D𝐻𝑍D𝐻𝑋)Λ𝐻𝑝2 = 0(𝜈max+1).

Eq. (30) follows directly.If all the users employ two transmit antennas and Alamouti

code, their training sequences in two symbol intervals areassigned according to Eq. (28) and (29) , and can be generatedsimply using the same Alamouti code generator which greatlyreduce the training assignment complexity.

D. Peak-to-Average Power Ratio (PAPR) Property

The PAPR of the training sequence 𝑆(𝑛), 0 ≤ 𝑛 ≤ 𝑁 − 1,is given by

PAPR =max𝑛

∣𝑆(𝑛)∣21𝑁

∑𝑁−1𝑛=0 ∣𝑆(𝑛)∣2 . (31)

The transform operator Λ𝑚 between different FD trainingsequences can be viewed as a frequency modulation, whichis equivalent to circulant shift of the training sequence in theTD. Hence, we have the following proposition.

Proposition 1: All TD training sequences of X𝑚’s in (14)have the same PAPR.

This property is important when designing the trainingsequences. As long as the PAPR of X is low, all trainingsymbols will have low PAPR. Another merit of our designis that if 𝑁

(𝜈max+1) = 2𝑘, for some integer 𝑘, and if X ischosen from a 2𝑘-phase shift keying (PSK) constellation, thenthe transform Λ𝑚 guarantees that all FD training sequences{X𝑚, 0 ≤ 𝑚 ≤ 𝑀 − 1} will belong to the same 2𝑘-PSKconstellation, which is very easy to generate.

TABLE IIIPAPR COMPARISON OF THREE TRAINING SEQUENCE CANDIDATES

Chirp-based Golay-based TD ImpulsivePAPR 0 dB ≤ 3 dB 18 dB

One possible choice for X is a Constant-Amplitude-Zero-Auto-Correlation (CAZAC) sequence [27], which is acomplex-valued sequence with constant amplitude and zeroautocorrelation at nonzero lags. One example of a CAZACsequence of length 𝑁 is the chirp sequence given by

𝑋(𝑘) =

{ √𝑐 exp(𝑗 𝜋𝑢𝑘

2

𝑁 ), if 𝑁 is even√𝑐 exp(𝑗 𝜋𝑢𝑘(𝑘+1)

𝑁 ), if 𝑁 is odd, 0 ≤ 𝑘 ≤ 𝑁−1.

(32)where 𝑢 is any integer relatively prime1 to 𝑁 . A disadvantageof this and other CAZAC sequences is that the entries arenot restricted to a standard signal constellation. An alternativeis provided by Golay complementary sequences [28] whichonly assume values from {−√

𝑐,√𝑐}. A third possibility is the

flat sequence (impulsive in TD) {X : 𝑋(𝑘) =√𝑐, for all 𝑘}.

These three choices have different PAPRs, as summarizedin Table III and perform differently under practical systemimpairments as will be discussed in Sections IV and VI. Giventhe above discussion, it is possible to generate a family ofoptimal training sequences with low PAPR from a standardsignal constellation.

E. Reducing the number of pilots per training symbol

Assume the number of subcarriers 𝑁 can be decomposed as𝑁 = 𝑁𝑝𝑇 where 𝑁𝑝 ≥ (𝜈max + 1), then it is possible to use𝑁𝑝 equally-spaced pilots in each training symbol instead ofthe whole symbol. At the 𝑚th antenna, the training sequenceis given by diagonal matrix D𝑚 ∈ ℂ𝑁𝑝×𝑁𝑝 and the pilotlocations are {𝑠𝑇 }𝑁𝑝−1

𝑠=0 . Consider the one training symbolscenario without loss of generality. Instead of taking IDFT ofEq. (5) of length 𝑁 , we now take IDFT of Eq. (5) only atpilot tones of length 𝑁𝑝, and get

y =

𝑀−1∑𝑚=0

S𝑚h𝑚 + n, (33)

where y ∈ ℂ𝑁𝑝 , S𝑚 = F𝐻D𝑚F0 ∈ ℂ

𝑁𝑝×(𝜈max+1), F is theDFT matrix of size 𝑁𝑝, F0 = [𝑓𝑠𝑡 = 𝑒

𝑗 2𝜋𝑠𝑡𝑁𝑝 ] ∈ ℂ𝑁𝑝×(𝜈max+1)

is the submatrix of F0 at rows corresponding to pilot frequen-cies. It is obvious that F0 is also the first (𝜈max+1) columnsof the DFT matrix F, and F𝐻0 F0 = 1

𝑇 I𝜈max+1. Therefore, itis clear that we can follow the same framework in both singleand multiple training symbol scenarios, by replacing 𝑁 by 𝑁𝑝in both the dimensionality conditions and design parametersat the cost of increasing the MMSE by a factor of 𝑇 .

IV. PRACTICAL ISSUES

In this section, we study the performance of channel esti-mation using our proposed optimal training sequences undertwo practical impairments, namely, Carrier Frequency Offset(CFO) and oscillator Phase Noise (PN).

1Two integers are said to be relatively prime if their greatest commondivisor is 1.

Page 7: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2240 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

A. Residual CFO

In practical systems, CFO is first estimated and compen-sated for prior to channel estimation; however, a residual CFOremains uncompensated for due to the inaccuracy of the CFOestimate. In the sequel, we derive the channel estimation MSEin the presence of a residual CFO for the Multi-User MIMO-OFDM system. Taking the residual CFOs into consideration,the received signal over 𝐾 training OFDM symbols is

y =

𝐿−1∑𝑖=0

Q𝑖𝒮𝑖h𝑖 + n

=[Q0𝒮0 Q1𝒮1 . . . Q𝐿−1𝒮𝐿−1

]h+ n

≜ Sh+ n (34)

where 𝒮𝑖 and h𝑖 concatenate the training matrices and theCIRs, respectively, of all the antennas used by the 𝑖th userand

Q𝑖 = diag{d, 𝑒

𝑗2𝜋(𝑁+𝜈+1)𝛼𝑖𝑁 d, . . . , 𝑒

𝑗2𝜋(𝐾−1)(𝑁+𝜈+1)𝛼𝑖𝑁 d

}where d =

[1, 𝑒

𝑗2𝜋𝛼𝑖𝑁 , . . . , 𝑒

𝑗2𝜋(𝑁−1)𝛼𝑖𝑁

]𝑇. Furthermore, 𝛼𝑖

is a random variable representing the normalized frequencyoffset between the 𝑖th user carrier frequency, 𝑓 𝑖T, and thereceiver carrier frequency, 𝑓R, defined as

𝛼𝑖 ≜𝑓 𝑖T − 𝑓R

𝑓sub, (35)

where 𝑓sub denotes the subcarrier frequency spacing. Assum-ing the dimensionality condition in (20) is satisfied, the LLSEof h is

h =1

𝑐

[𝒮0 𝒮1 . . . 𝒮𝐿−1

]𝐻︸ ︷︷ ︸=S𝐻

y =1

𝑐S𝐻 S h+

1

𝑐S𝐻n︸ ︷︷ ︸≜n

.

(36)Writing Q𝑖 = I𝑁 + (Q𝑖 − I𝑁 ) ≜ I𝑁 + Q𝑖, we express h as

h = h+ SΔh+ n (37)

where

SΔ =1

𝑐S𝐻

[Q0𝒮0 Q1𝒮1 . . . Q𝐿−1𝒮𝐿−1

]. (38)

The trace of the error auto-correlation matrix is given by

𝑡𝑒 = Tr

(𝔼

[(h− h

)(h− h

)𝐻])= Tr

(𝔼[SΔhh

𝐻S𝐻Δ])

+ 2𝜎2𝑒 ≜ 𝑡fo + 2𝜎2

𝑒 , (39)

where 𝔼 [⋅] denotes the statistical expectation. For any twomatrices A and B, we know that

Tr (𝔼 [A]) = 𝔼 [Tr (A)] and Tr (AB) = Tr (BA) . (40)

Using these properties and assuming that the CIR coefficientsand the normalized frequency offsets are statistically indepen-dent, we write

𝑡fo = 𝔼[Tr

(hh𝐻S𝐻ΔSΔ

)]= Tr

(𝔼[hh𝐻

]𝔼[S𝐻ΔSΔ

])= Tr (ChCSΔ) (41)

where Ch ≜ 𝔼[hh𝐻

]and CSΔ ≜ 𝔼

[S𝐻ΔSΔ

]. The (𝑖, 𝑗)th

block matrix of CSΔ is given by

CSΔ(𝑖, 𝑗) =1

𝑐2𝔼

[𝒮𝐻𝑖 Q𝐻𝑖

(𝐿−1∑𝑘=0

𝒮𝑘𝒮𝐻𝑘)Q𝑗𝒮𝑗

]

=1

𝑐2𝔼

[𝒮𝐻𝑖 Q𝐻𝑖 SS𝐻Q𝑗𝒮𝑗

], (42)

where 0 ≤ 𝑖, 𝑗 ≤ 𝐿 − 1. Since Q𝑖 and Q𝑗 are diagonal, weexpress CSΔ(𝑖, 𝑗) as follows

CSΔ(𝑖, 𝑗) =1

𝑐2𝒮𝐻𝑖

⎛⎜⎜⎝[SS𝐻

] ∘ 𝔼 [v∗𝑖 v

𝑇𝑗

]︸ ︷︷ ︸≜C𝑖,𝑗

v

⎞⎟⎟⎠𝒮𝑗 (43)

where v𝑖 and v𝑗 are columns vectors containing the diagonalelements of Q𝑖 and Q𝑗 , respectively.

Given the second-order statistics of the residual CFOs andthe CIR coefficients, we can easily computeCSΔ andCh, andhence 𝑡𝑒 for any training sequence. Given a training sequence,we can then evaluate the impact of the residual CFO on thecorresponding channel estimate. We shall assume throughoutthat the residual offsets 𝛼𝑖’s are independent and identicallydistributed. One commonly-used distribution is the uniformdistribution over the interval [−𝛼max, 𝛼max] where 0 ≤ 𝛼max ≤0.5. Using these assumptions and the fact that 𝔼 [𝑔(𝑥)] =∫𝑥𝑔(𝑥)𝑓(𝑥)𝑑𝑥, for any random variable 𝑥, where 𝑓(𝑥) is the

probability density function of 𝑥, we find that, for all 𝑖, 𝑗 =0, 1, ..., 𝐿 − 1, the (𝑚,𝑛) element of the matrix C𝑖,𝑗v , where𝑚,𝑛 = 0, 1, ...,𝐾𝑁−1, is given in Eq. (44) where sinc (𝑥) ≜sin(𝜋𝑥)𝜋𝑥 for any real number 𝑥, and 𝑓(𝑚) = 𝑚+ 𝑙(𝑁 + 𝜈+1)

where 𝑙 = 0, 1, ...,𝐾 − 1 and 𝑙𝑁 ≤ 𝑚 ≤ (𝑙 + 1)𝑁 − 1.

B. Phase Noise

In the presence of PN affecting the free-running voltage-controlled oscillators (VCOs) of the transmitters and thereceiver, the received signal over 𝐾 = 1 training OFDMsymbol is given by

y = Prx𝑀−1∑𝑖=0

H𝐶𝑖 Ptx𝑖 s𝑖 + n (45)

where s𝑖 denotes the training sequence transmitted by the 𝑖thtransmit antenna and H𝐶𝑖 denotes the matrix of the channelexperienced by the 𝑖th transmit antenna. Although H𝐶𝑖 is notexactly circulant due to the edge effect introduced by PNat the transmitters, it can be considered circulant with thiseffect lumped into the noise vector n. For large FFT sizes,the edge effect can be ignored [29]. The PN perturbing theVCO of the 𝑖th transmit antenna is modeled by the diagonalmatrix Ptx

𝑖 ≜ diag({𝑒𝑗𝜙𝑖𝑛}𝑁−1𝑛=0 ) with 𝜙𝑖𝑛 representing the PN

sample perturbing the transmitted signal by the 𝑖th transmitantenna at the 𝑛th sample2. Similarly, the PN perturbing thereceiver VCO is modeled by the diagonal matrix Prx ≜diag({𝑒𝑗𝜙rx

𝑛}𝑁−1𝑛=0 ). The discrete-time PN model is given by

𝜙𝑖𝑛 = 𝜙𝑖𝑛−1 + 𝜖𝑖𝑛 and 𝜙rx𝑛 = 𝜙rx

𝑛−1 + 𝜖rx𝑛 , (46)

2Transmit antennas supporting the same user experience the same PNmatrix as they are fed by the same VCO.

Page 8: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2241

C𝑖,𝑗v (𝑚,𝑛) =

⎧⎨⎩ 1− sinc(

2𝑓(𝑚) 𝛼max

𝑁

)− sinc

(2𝑓(𝑛) 𝛼max

𝑁

)+ sinc

(2𝑓(𝑛−𝑚) 𝛼max

𝑁

), 𝑖 = 𝑗(

sinc(

2𝑓(𝑚) 𝛼max

𝑁

)− 1

)(sinc

(2𝑓(𝑛) 𝛼max

𝑁

)− 1

), 𝑖 ∕= 𝑗

(44)

where{𝜖𝑖𝑛}

and {𝜖rx𝑛} are independent Gaussian distributed

random variables with zero means and variances{

2𝜋𝛽tx𝑖

𝑁𝑓sub

}and{

2𝜋𝛽rx

𝑁𝑓sub

}for all 𝑛 respectively. Without loss of generality, we

assume for all 𝑖, 𝜙𝑖0 = 𝜙rx0 = 0. The parameters 𝛽tx

𝑖 and𝛽rx denote the two-sided 3-dB linewidths of the Lorentzianpower density spectrums of the VCOs feeding the 𝑖th transmitantenna and the receive antenna, respectively [30]. We expressPtx𝑖 and Prx as

Ptx𝑖 = I𝑁 +

(Ptx𝑖 − I𝑁

)≜ I𝑁 + Ptx

𝑖 ,

Prx = I𝑁 + (Prx − I𝑁 ) ≜ I𝑁 + Prx,

and expand the LLSE of h and the trace of its error auto-correlation matrix, respectively, in Eq. (47) and (48), wherethe edge effect is ignored. Defining Ψ ≜

∑𝑀−1𝑖=0 H

𝐶𝑖 P

tx𝑖 s𝑖,

we expand 𝑡pn in Eq. (49).Using the properties in (40) in addition to the diagonal

structure of Prx and assuming that the channel and the PNparameters are statistically independent, we write

𝑡pn,1 = Tr(ChS

𝐻([SS𝐻

] ∘ 𝔼 [prxprx,𝐻

])S), (52)

where prx is a column vector containing the diagonal elementsof Prx. Furthermore, we can rewrite Ψ as

Ψ =

𝑀−1∑𝑖=0

S𝑖h𝑖 =[S0 S1 . . . S𝑀−1

]h ≜ Sh, (53)

where S𝑖 is a circulant matrix whose first column is Ptx𝑖 s𝑖.

Using the above formulation of Ψ and the properties in(40), we rewrite 𝑡pn,2 in Eq. (50) where prx is a columnvector containing the diagonal elements of Prx. Inspecting thestructure of S, we find that it can be expressed as

S =[P𝐶0 ∘ S0 P𝐶1 ∘ S1 . . . P𝐶𝑀−1 ∘ S𝑀−1

],

where P𝐶𝑖 is a circulant matrix whose first column is ptx𝑖 ,

the column vector containing the diagonal elements of Ptx𝑖 .

Hence,

𝔼

[S]=

[[𝔼

[P𝐶

0

]∘ S0 𝔼

[P𝐶

1

]∘ S1 . . . 𝔼

[P𝐶

𝑀−1

]∘ S𝑀−1

]

where 𝔼

[P𝐶𝑖

]is formed by a circulant matrix whose first

column is 𝔼 [ptx𝑖 ]. Finally, we consider the last term 𝑡pn,4 and

express it in Eq. (51) where we used the fact that H𝐶𝑖 =F𝐻H𝐷𝑖 F where H𝐷𝑖 is a diagonal matrix whose diagonal isformed by the vector H𝑖 which is the 𝑁 -point DFT of h𝑖 asdefined in Section II. Furthermore,

𝔼h,Prx,{˜Ptx𝑖 } [⋅] = 𝔼Prx𝔼h∣Prx𝔼{˜Ptx

𝑖 }∣h,Prx [⋅] = 𝔼Prx𝔼h𝔼{˜Ptx𝑖 } [⋅]

where ∣ is the statistical conditioning operator and the second

equality follows from the fact that h, Prx, and{Ptx𝑖

}are

statistically independent. Using the above observations and the

fact that Prx,{H𝐷𝑖

}, and

{Ptx𝑖

}are all diagonal matrices, we

write𝑡pn,4 = Tr

(S𝐻

(Ξ ∘ 𝔼 [

prxprx,𝐻])S)

(54)

where Ξ =∑𝑀−1𝑖=0

∑𝑀−1𝑗=0 F

𝐻(Υ𝑖,𝑗 ∘ 𝔼

[H𝑖H

𝐻𝑗

])F and

Υ𝑖,𝑗 = F([s𝑖s𝐻𝑗

] ∘ 𝔼 [ptx𝑖 p

tx,𝐻𝑗

])F𝐻 . Finally, the vec-

tor 𝔼 [ptx𝑖 ] and the correlation matrices 𝔼

[prxprx,𝐻

],

𝔼[prxprx,𝐻

], 𝔼

[prxprx,𝐻

], and 𝔼

[ptx𝑖 p

tx,𝐻𝑗

]are determined

by using the relation 𝔼[𝑒𝑗𝑥

]= 𝑒−𝜎

2𝑥/2 for the random variable

𝑥 ∼ 𝑁(0, 𝜎2𝑥). In fact, 𝔼

[𝑒𝑗𝑥

]is the characteristic function

[31] of 𝑥, 𝜓𝑥(𝑗𝑡), evaluated at 𝑡 = 1. Furthermore, theextension of the above analysis to the general case of 𝐾 > 1training OFDM symbols is straightforward.

V. DESIGN TRADE-OFFS

A closer inspection of the channel estimate MSE expres-sions derived in Sections IV-A and IV-B reveals a tradeoffbetween the PAPR of the training sequences and their ro-bustness to CFO and PN. An intuitive explanation is that thetraining sequence with low PAPR tends to distribute its energyuniformly among all samples including late ones; however,this makes it less immune to PN and CFO which severelyaffect late samples. On the other hand, the training sequencethat is more robust to CFO and PN, should concentrate itsenergy in the early samples as explained earlier; however,this will result in increasing PAPR. Towards an analyticalexplanation, we examine the expression of 𝑡fo in (41). Tosimplify the expression, we fairly assume that the channelresponses seen by all transmit antennas are uncorrelated, sothe matrixCh is block-diagonal. Hence, 𝑡fo is affected only bythe diagonal blocks of CSΔ , i.e. {CSΔ(𝑖, 𝑖), 0 ≤ 𝑖 ≤ 𝐿− 1},which are all partly constructed by the term

[SS𝐻

] ∘ C𝑖,𝑖v .Inspecting the structure of the matrix C𝑖,𝑖v in (44), we find thatthe energies of the elements increase as the column and/or rowindices increase. Hence, if the training sequence has its energyconcentrated in the early samples, then the matrix SS𝐻 willhave the opposite structure of C𝑖,𝑖v and, hence, the Hadamardproduct will yield small elements. This will eventually resultin a small value for 𝑡fo, i.e. more immunity to CFO. The samerationale can be applied to the expression of 𝑡pn keeping inmind that the variance of the PN sample increases as its indexincreases according to the model adopted in (46).

VI. SIMULATION RESULTS

We have simulated the performance of an OFDM systemwith 𝑁 = 64 and 𝜈max = 15 as in [9]. We consider uplinktransmission in a Multi-User MIMO system with 2 co-locatedreceive antennas at the BST and 2 users each equipped with2 transmit antennas over which the Alamouti STBC [32] isemployed. Each user employs a non-systematic rate-1/2 con-volutional code with octal generator (133,171) and constraintlength = 7 as in [9]. Coded bits are Quadrature Phase Shift

Page 9: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2242 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

h =1

𝑐S𝐻y = h+

1

𝑐S𝐻Prx Sh+

1

𝑐S𝐻Prx

𝑀−1∑𝑖=0

H𝐶𝑖 Ptx𝑖 s𝑖︸ ︷︷ ︸

≜epn

+1

𝑐S𝐻n (47)

𝑡𝑒 = Tr

(𝔼

[(h− h

)(h− h

)𝐻])= Tr

(𝔼[epne

𝐻pn

])+ 2𝜎2

𝑒 ≜ 𝑡pn + 2𝜎2𝑒 (48)

𝑡pn =1

𝑐2Tr

(𝔼

h,˜Prx

[S𝐻Prx Shh𝐻S𝐻Prx,𝐻 S

])︸ ︷︷ ︸

≜𝑡pn,1

+1

𝑐2Tr

(𝔼

h,Prx,{˜Ptx𝑖 }

[S𝐻Prx ShΨ𝐻Prx,𝐻 S

])︸ ︷︷ ︸

≜𝑡pn,2

+1

𝑐2Tr

(𝔼

h,Prx,{˜Ptx𝑖 }

[S𝐻Prx Ψh𝐻S𝐻Prx,𝐻 S

])︸ ︷︷ ︸

≜𝑡pn,3

+1

𝑐2Tr

(𝔼

h,Prx,{˜Ptx𝑖 }

[S𝐻Prx ΨΨ𝐻Prx,𝐻 S

])︸ ︷︷ ︸

≜𝑡pn,4

(49)

𝑡pn,2 = 𝑡∗pn,3 = Tr

(𝔼

h,Prx,{˜Ptx𝑖 }

[S𝐻Prx Shh𝐻 S𝐻Prx,𝐻 S

])= Tr

(Ch𝔼

[S]𝐻([SS𝐻

] ∘ 𝔼 [prxprx,𝐻

])S

)(50)

𝑡pn,4 = Tr

⎛⎝𝔼h,Prx,{˜Ptx𝑖 }

⎡⎣S𝐻Prx

⎛⎝𝑀−1∑𝑖=0

𝑀−1∑𝑗=0

F𝐻H𝐷𝑖 F Ptx𝑖 s𝑖s

𝐻𝑗 P

tx,𝐻𝑗 F𝐻H𝐷,𝐻𝑗 F

⎞⎠Prx,𝐻 S

⎤⎦⎞⎠ (51)

Keying (QPSK) modulated. All channel paths are assumedto have uncorrelated and identically-distributed CIRs with 8zero-mean complex Gaussian taps following an exponentially-decaying power delay profile (PDP) with a 3 dB decay per tap.𝐾 OFDM training symbols are transmitted over each transmitantenna for the purpose of channel estimation as describedin Section III. The CIR estimates are used for detection ofthe OFDM data symbols through the joint Linear Minimum-Mean-Square-Error (LMMSE) technique which processes thereceived signals from the 2 receive antennas jointly to detectthe two users [33]. The background noise is assumed to beAWGN with a single-sided power spectral density of 𝑁𝑜Watts/Hz. The bit energy is denoted by 𝐸𝑏 and the per-userSignal-to-Noise Ratio (SNR) is defined as SNR = 𝐸𝑏

𝑁𝑜.

Using these parameters, the dimensionality condition in (20)is met with 𝐾 ≥ 1. In Fig. 2, the bit error rate (BER) perfor-mances of three FD training sequences (namely: Chirp, Golay,and flat (TD Impulsive)) with 𝐾 = 1 and 2 are compared withthe perfect channel state information (CSI) case. In Fig. 2, allusers are assumed to have perfect frequency synchronizationwith the receiver. All training sequences achieve roughly thesame BER performance with SNR losses -compared to theperfect CSI case- of 1.5 and 0.7 dB for 𝐾 = 1 and 2,respectively. For comparison purpose, the performance of arandom BPSK sequence not satisfying the optimality conditionis also shown in Fig. 2 for 𝐾 = 1 and 2. The performance ofthe random sequence is inferior to that of the other sequencessatisfying the optimality condition; especially with 𝐾 = 1training symbol where the number of equations equals thenumber of unknowns making the channel estimate unreliablewhen the optimality condition is not satisfied. From anotherperspective, our optimally-designed training sequences with𝐾 = 1 training symbol achieve comparable performance tothat of the random sequence with 𝐾 = 2 training symbols,i.e. with 50% less training overhead. This is in addition to the

−8 −4 0 4 8 12 16 20 2410

−5

10−4

10−3

10−2

10−1

100

SNR (dB)

BE

R

Perfect CSI

Estimated CSI (Golay)

Estimated CSI (TD Impulsive)

Estimated CSI (Chirp)

Estimated CSI (Random)

No CFO

No PN

Fig. 2. BER versus SNR for 𝐾 = 1 (dashed) and 2 (solid) training OFDMsymbols without CFO.

additional complexity needed to invert the matrix S𝐻S whichis not a scaled identity in the case of non-optimal sequences.

For the channel PDP described above, we plot 𝑡fo derived in(39) versus 𝛼max in Fig. 3 for the proposed training sequenceswhere we observe that the flat FD sequence is more robust tothe residual CFOs than the other two sequences thanks to theimpulsive nature of its corresponding training sequences wheremost of power is concentrated in the early samples where theCFO effect is small (CFO effect increases with time). The im-pact of the residual CFOs on the BER performance is shown inFig. 4 for 𝐾 = 2 training symbol with 𝛼max = 0.01 and 0.05where the superiority of the flat sequence is observed alsoin the BER performance. While CFOs with 𝛼max = 0.01 donot cause a significant performance degradation, CFOs with

Page 10: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2243

0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.510

−5

10−4

10−3

10−2

10−1

100

101

αmax

t fo

Golay sequence

Chirp sequence

TD Impulsive sequence

Fig. 3. Plot of 𝑡fo in (39) versus 𝛼max with 𝐾 = 1 training OFDM symboland 2 users each with 1 transmit antenna.

−8 −4 0 4 8 12 16 20 24 2810

−5

10−4

10−3

10−2

10−1

100

SNR (dB)

BE

R

Perfect CSI

Estimated CSI (TD Impulsive)

Estimated CSI (Golay)

Estimated CSI (Chirp)

No PN

No CFO

αmax

= 0.05

αmax

= 0.01

Fig. 4. BER versus SNR without CFOs (solid curves) and with CFOs of𝛼max = 0.01 (dashed curves) and 𝛼max = 0.05 (dash-dotted curves). Twotraining OFDM symbols are used.

𝛼max = 0.05 limit the system performance at high SNR.Assuming the VCOs feeding all transmit and receive anten-

nas to have the same 3-dB linewidth 𝛽, i.e. 𝛽tx𝑖 = 𝛽rx = 𝛽, ∀𝑖,

we plot 𝑡pn in (49) versus 𝜎2pn ≜ 2𝜋𝛽

𝑁𝑓subin Fig. 5 for

𝐾 = 1 training symbols. Like the CFO case, the flat FDtraining sequence is also more immune to PN than chirpand Golay sequences for the same reason. Fig. 6 depictsthe PN impact on the BER performance for different valuesof 𝜎2

pn without CFOs, i.e. 𝛼max = 0. With 𝜎2pn = 10−4,

PN becomes performance-limiting at high SNR while nosignificant deterioration is observed for smaller PN variancessuch as 𝜎2

pn = 10−5.In Fig. 7, we simulate the BER performance of the training

sequences in peak-limited channels under CFO and PN. Inpeak-limited channels, the received signal power is limitedby 𝑃max above which the received signal power is saturated(clipped). In Fig. 7, we show the BER versus Δ𝑃 ≜ 𝑃falt −𝑃max (dB) where 𝑃flat is the received peak power of the FD

0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

x 10−3

10−7

10−6

10−5

10−4

10−3

10−2

10−1

σpn2

t pn

Golay sequence

Chirp sequence

TD Impulsive sequence

Fig. 5. Plot of 𝑡pn in (49) versus 𝜎2pn with 𝐾 = 1 training OFDM symbol

and 2 users each with 1 transmit antenna.

−8 −4 0 4 8 12 16 20 24 28 3210

−5

10−4

10−3

10−2

10−1

100

SNR (dB)

BE

R

Perfect CSI

Estimated CSI (TD Impulsive)

Estimated CSI (Golay)

Estimated CSI (Chirp)

No CFO

No PN

σpn2 = 10−4

σpn2 =10−5

Fig. 6. BER versus SNR for different values of 𝜎2pn {0 (solid), 10−5 (dashed),

and 10−4 (dash-dotted)} with 𝐾 = 2 training OFDM symbols and withoutCFO.

flat sequence which is the largest over the three sequences. Forsmall values of Δ𝑃 (i.e. high clipping power values), the FDflat sequence outperforms the other two sequences due to itsimmunity to CFO and PN as discussed before. However, thesituation changes at high values of Δ𝑃 where the distortionof the FD flat sequence, caused by the peak-limited channel,dominates its immunity to CFO and PN. Hence, we observethat despite achieving the same performance in peak-unlimitedchannels without CFO or PN, the proposed sequences exhibitdifferent behaviors under these practical impairments.

Fig. 8 shows the real parts of a single CIR realization alongwith its estimates in the presence of CFO with 𝛼max = 0.01and PN with 𝜎2

pn = 10−4. The real parts of CIR estimatesshown in Fig. 9 are with 𝛼max = 0.1 and 𝜎2

pn = 10−3 wherethe increased maximum CFO and the PN variance degradethe accuracy of the CIR estimates. The values of 𝛼max and𝜎2

pn used in our simulations are small since CFO and PNcompensations usually precede channel estimation in practical

Page 11: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

2244 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 7, JULY 2011

0 2 4 6 8 1010

−5

10−4

10−3

10−2

10−1

ΔP (dB)

BE

R

Estimated CSI (Chirp)

Estimated CSI (Golay)

Estimated CSI (TD Impulsive)

Fig. 7. Comparison of the training sequences performances in peak-limitedchannels with 𝛼max = 0.01, 𝜎2

pn = 10−5, SNR = 26dB, and 𝐾 = 2 trainingOFDM symbols.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

Tap index

Actual CIR

Chirp−based CIR Estimate

Golay−based CIR Estimate

TD−Impulsive−based CIR Estimate

Fig. 8. A realization of the CIR real part and its estimates with 𝛼max = 0.01,𝜎2

pn = 10−4 , SNR = 15dB, and 2 training OFDM symbols.

systems. The image parts of CIR realizations are omitted dueto space. The trade-off between the PAPR of the trainingsequences and their immunity to CFO and PN is observedby inspecting Figs. 3 and 5 with Table III.

VII. CONCLUSIONS

We derived the MMSE optimality criteria for trainingsequence designs in Multi-User MIMO-OFDM systems. Inspectrally-efficient uplink transmission scenarios where usersare separated using spatial processing at the base station,our analysis holds for an arbitrary number of users, OFDMtraining symbols, transmit antennas per user, and channeldelay spread. Within the family of training designs thatare MMSE-optimal under ideal conditions, we found thatrobustness to residual CFO and PN can vary significantly.We also derived analytical expressions for the increase in

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16−0.4

−0.3

−0.2

−0.1

0

0.1

0.2

0.3

Tap index

Actual CIR

Chirp−based CIR Estimate

Golay−based CIR Estimate

TD−Impulsive−based CIR Estimate

Fig. 9. A realization of the CIR real part and its estimates with 𝛼max = 0.1,𝜎2

pn = 10−3, SNR = 15dB, and 2 training OFDM symbols.

channel estimation MSE in the presence of CFO and PN. Ouranalysis includes three detailed case studies; Chirp, Golay,and time-domain impulsive training sequence designs. In eachcase, we quantified the underlying tradeoff between PAPR androbustness to CFO and PN.

REFERENCES

[1] G. J. Foschini, “Layered space-time architecture for wireless commu-nication in a fading environment when using multi-element antennas,"Bell Labs Technical J., vol. 1, pp. 41-59, 1996.

[2] E. Telatar, “Capacity of multi-antenna Gaussian channels," EuropeanTrans. Telecommun., 1999.

[3] V. Tarokh, N. Seshadri, and A. Calderbank, “Space-time codes forhigh data rate wireless communication: performance criterion and codeconstruction," IEEE Trans. Inf. Theory, 1998.

[4] A. R. S. Bahai, B. R. Saltzberg, and M. Ergen, Multi-Carrier DigitalCommunications Theory and Applications of OFDM. Springer, 2004.

[5] D. Tse and P. Viswanath, Fundamentals of Wireless Communication.Cambridge University Press, 2005.

[6] M. Jiang and L. Hanzo, “Multiuser MIMO-OFDM for next-generationwireless systems," Proc. IEEE, vol. 95, no. 7, pp. 1430-1469, July 2007.

[7] “IEEE standard for local and metropolitan area networks part 16: airinterface for fixed and mobile broadband wireless access systems," IEEEStd 802.16e-2005, 2006.

[8] “Physical channels and modulation," 3GPP TS 36.211, Ver 8.6.0, Mar.2009.

[9] “IEEE candidate standard 802.11n: wireless LAN medium accesscontrol (MAC) and physical layer (phy) specifications." Available:http://grouper.ieee.org/groups/802/11/Reports/tgn_update.htm, 2009.

[10] S. Parkvall, E. Dahlman, A. Furuskar, Y. Jading, M. Olsson, S. Wanstedt,and K. Zangi, “LTE-advanced–evolving LTE towards IMT-advanced,"in IEEE Veh. Technol. Conf., 2008.

[11] Y. G. Li, “Pilot-symbol-aided channel estimation for OFDM in wirelesssystems," IEEE Trans. Veh. Technol., vol. 49, pp. 1207-1215, 2000.

[12] L. Deneire, P. Vandenameele, L. van der Perre, B. Gyselinckx, andM. Engels, “A low-complexity ML channel estimator for OFDM," IEEETrans. Commun., vol. 51, pp. 67-75, 2003.

[13] O. Edfors, M. Sandell, J.-J. van de Beek, S. K. Wilson, and P. O.Böesson, “OFDM channel estimation by singular value decomposition,"IEEE Trans. Commun., vol. 46, pp. 931-939, 1998.

[14] Y. G. Li, N. Seshadri, and S. Ariyavisitakul, “Channel estimation forOFDM systems with transmitter diversity in mobile wireless channels,"IEEE J. Sel. Areas Commun., vol. 17, pp. 461-471, 1999.

[15] H. Minn, D. I. Kim, and V. Bhargava, “A reduced complexity channelestimation for OFDM systems with transmit diversity in mobile wirelesschannels," IEEE Trans. Commun., 2002.

Page 12: 2234 IEEE TRANSACTIONS ON WIRELESS …yuejiec/papers/TSD_TWC_2011.pdfsignals. Various OFDM channel estimation schemes [11]-[13] have been proposed for Single-Input Single-Output (SISO)

CHI et al.: TRAINING SIGNAL DESIGN AND TRADEOFFS FOR SPECTRALLY-EFFICIENT MULTI-USER MIMO-OFDM SYSTEMS 2245

[16] Z. J. Wang, Z. Han, and K. J. R. Liu, “A MIMO-OFDM channelestimation approach using time of arrivals," IEEE Trans. WirelessCommun., vol. 4, pp. 1207-1213, May 2005.

[17] C. Fragouli, N. Al-Dhahir, and W. Turin, “Training-based channelestimation for multiple-antenna broadband transmissions," IEEE Trans.Wireless Commun., 2003.

[18] H. Minn and N. Al-Dhahir, “Optimal training signals for MIMO OFDMchannel estimation," IEEE Trans. Wireless Commun., vol. 5, no. 5, pp.1158-1168, Apr. 2006.

[19] Y. G. Li, “Simplified channel estimation for OFDM systems withmultiple transmit antennas," IEEE Trans. Wireless Commun., vol. 1, pp.67-75, 2002.

[20] Y. Zeng, A. R. Leyman, S. Ma, and T.-S. Ng, “Optimal pilot andfast algorithm for MIMO-OFDM channel estimation," in Proc. IEEEInternational Conf. Inf., Commun. Signal Process., 2005.

[21] F. Horlin and L. V. der Perre, “Optimal training sequences for lowcomplexity ML multi-channel estimation in multi-user MIMO OFDM-based communications," in Proc. IEEE International Conf. Commun.,vol. 4, June 2004.

[22] S. D. Howard, A. R. Calderbank, and W. Moran, “A simple signal pro-cessing architecture for instantaneous radar polarimetry," IEEE Trans.Inf. Theory, vol. 53, no. 4, pp. 1282-1289, Apr. 2007.

[23] M. D. Zoltowski, T. R. Qureshi, and R. Calderbank, “Complementarycodes based channel estimation for MIMO-OFDM systems," in Proc.46th Annual Allerton Conf. Commun., Control, Comput., Sep. 2008, pp.133-138.

[24] C. C. Tseng, “Signal multiplexing in surface-wave delay lines usingorthogonal pairs of Golay’s complementary sequences," IEEE Trans.Sonics Ultrasonics, 1971.

[25] C. Tseng and C. Liu, “Complementary sets of sequences," IEEE Trans.Inf. Theory, 1972.

[26] S. M. Kay, Fundamentals Of Statistical Signal Processing: EstimationTheory. Prentice Hall, 1993.

[27] D. Chu, “Polyphase codes with good periodic correlation properties,"IEEE Trans. Inf. Theory, vol. IT-18, 1972.

[28] M. J. E. Golay, “Complementary series," IRE Trans. Inf. Theory, vol. 7,no. 2, pp. 82-87, Apr. 1961.

[29] R. Gray, Toeplitz and Circulant Matrices: A Review. Now PublishersInc., 2006.

[30] T. Pollet, M. V. Bladel, and M. Moeneclaey, “BER sensitivity of OFDMsystems to carrier frequency offset and Wiener phase noise," IEEETrans. Commun., vol. 43, 1995.

[31] J. G. Proakis, Digital Communication, 4th edition. McGraw-Hill, 2001.[32] S. M. Alamouti, “A simple transmit diversity technique for wireless

communications," IEEE J. Sel. Areas Commun., vol. 16, pp. 1451-1458,1998.

[33] A. Naguib, N. Seshadri, and A. Calderbank, “Increasing data rate overwireless channels," IEEE Signal Process. Mag., vol. 17, 2000.

Yuejie Chi (S’09) received the B.E. (Hon.) inElectrical Engineering from Tsinghua University,Beijing, China, in 2007. She is currently work-ing towards the Ph.D degree in the department ofElectrical Engineering at Princeton University. Herresearch interests include compressed sensing, sta-tistical signal processing, machine learning, wirelesscommunications and active sensing.

Ahmad Gomaa (S’09) received his B.S and M.Sdegrees in electronics and communications engineer-ing from Cairo University, Egypt, in 2005 and 2008,respectively. He is currently pursuing the Ph.D.degree at the University of Texas at Dallas, USA.His research interests include sparse FIR filters de-sign, RF impairments compensation at the baseband,training sequence design for channel estimation, andcompressive sensing applications to digital commu-nications.

Naofal Al-Dhahir earned his Ph.D. degree in Elec-trical Engineering from Stanford University in 1994.From 1994 to 2003 he was a principal member of thetechnical staff at GE Research and AT&T ShannonLab. In 2003, he joined UT-Dallas as an AssociateProfessor and became a full Professor in 2007, anda Jonsson Distinguished Professor of Engineering in2010. His current research interests include broad-band wireless and wireline transmission, MIMO-OFDM transceivers, baseband processing to mitigateRF/analog impairments and compressive sensing.

He has authored over 220 papers and holds 31 issued US patents. He is co-recipient of the IEEE VTC Fall 2005 best paper award, the 2005 IEEE signalprocessing society young author best paper award, the 2006 IEEE Donald G.Fink best journal paper award and is an IEEE fellow.

Robert Calderbank (M’89-SM’97-F’98) receivedthe BSc degree in 1975 from Warwick University,England, the MSc degree in 1976 from OxfordUniversity, England, and the PhD degree in 1980from the California Institute of Technology, all inmathematics.

Dr. Calderbank is Dean of Natural Sciences atDuke University. He was previously Professor ofElectrical Engineering and Mathematics at PrincetonUniversity where he directed the Program in Appliedand Computational Mathematics. Prior to joining

Princeton in 2004, he was Vice President for Research at AT&T, responsiblefor directing the first industrial research lab in the world where the primaryfocus is data at scale. At the start of his career at Bell Labs, innovations by Dr.Calderbank were incorporated in a progression of voiceband modem standardsthat moved communications practice close to the Shannon limit. Togetherwith Peter Shor and colleagues at AT&T Labs he showed that good quantumerror correcting codes exist and developed the group theoretic framework forquantum error correction. He is a co-inventor of space-time codes for wirelesscommunication, where correlation of signals across different transmit antennasis the key to reliable transmission.

Dr. Calderbank served as Editor in Chief of the IEEE TRANSACTIONSON INFORMATION THEORY from 1995 to 1998, and as Associate Editor forCoding Techniques from 1986 to 1989. He was a member of the Board ofGovernors of the IEEE Information Theory Society from 1991 to 1996 andfrom 2006 to 2008. Dr. Calderbank was honored by the IEEE InformationTheory Prize Paper Award in 1995 for his work on the Z4 linearity of Kerdockand Preparata Codes (joint with A.R. Hammons Jr., P.V. Kumar, N.J.A. Sloane,and P. Sole), and again in 1999 for the invention of space-time codes (jointwith V.Tarokh and N. Seshadri). He received the 2006 IEEE Donald G. FinkPrize Paper Award and the IEEE Millennium Medal, and was elected to theUS National Academy of Engineering in 2005.