
Precoding and Beamforming for

Multi-Input Multi-Output Downlink

Channels

by

Roya Doostnejad

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

The Edward S. Rogers Sr. Department of Electrical and Computer Engineering
University of Toronto

© Copyright by Roya Doostnejad, 2005

Precoding and Beamforming for Multi-Input

Multi-Output Downlink Channels

Roya Doostnejad

Doctor of Philosophy, 2005

The Edward S. Rogers Sr. Department of Electrical and Computer Engineering

University of Toronto

Abstract

This dissertation presents precoding and beamforming schemes for multi-user wireless

downlink channels when multiple antennas are employed at both the transmitter and

the receivers. In the first part of the thesis, we will discuss transmitter processing with-

out channel information which is applicable in both flat and frequency selective (when

orthogonal frequency-division multiplexing (OFDM) is applied) fading channels. This

leads to methods for designing signature matrices for transmitters that use any combina-

tion of the spatial, temporal and frequency dimensions, with good performance provided

by low-complexity receivers. In the rest of the thesis, we pose the problem when the

channels between the base station and each user are known perfectly at the base station.

A non-linear precoding scheme is designed to minimize the mean-squared error between

the transmitted and received data with a per-user power constraint. We also develop

methods that are able to provide user-specific signal-to-interference-noise ratios (SINRs)

with minimal total transmit power, through the extension of a so-called uplink-downlink

duality result. Our study indicates that channel knowledge at the transmitter leads to

substantial reductions in required power for providing given levels of SINRs to users.


Acknowledgements

I would like to express my sincere gratitude to my supervisors Prof. Teng Joon Lim,

and Prof. Elvino S. Sousa for their guidance, advice, and continued support throughout

my thesis research. Prof. Lim has provided the key technical insights and contributed

tireless editorial effort which has vastly improved the quality of this dissertation. Prof.

Sousa has provided me a gentle encouragement and a far-reaching vision of the work.

I wish to thank my entire committee: Prof. Frank Kschischang, Prof. Ravi Adve,

Prof. Dimitris Hatzinakos, Prof. Bruce A. Francis, and Prof. Murat Uysal of the Univer-

sity of Waterloo for their effort, discussions and constructive comments. In particular, I

would like to thank Prof. Kschischang for his invaluable inputs and constant encourage-

ment throughout the course of this research.

I also acknowledge the administrative support of Diane B. Silva during these years.

I am appreciative of my colleagues in the communication group as well as my friends

in Toronto who made this period of my life most enjoyable and beneficial.

The financial support of the University of Toronto and Ontario Graduate Scholarships

in Science and Technology (OGSST) is also greatly appreciated.

I would like to extend my appreciation to the professors from whom I learned a great

deal in earlier stages of my studies in Isfahan University of Technology, in particular, Dr.

H. Alavi, Dr. A. Doosthoseini, Dr. S. Sadri and Dr. V. Tahani.

It is impossible to express the debt that I owe to my late parents. My father who

shaped the first stages of my education and has been always a role model for me, and

my mother, that if it was not because of her intense care and compassionate support,

I would have never been able to come this far. I would also like to thank my siblings,

Rezvan, Mehdi and Ahmad, and my in-laws Akbar Abdollahi and Shahla Dardashti who

have always been a source of encouragement and drive behind my achievements.

At last, my most special thanks goes to my husband, Kambiz Bayat for infinite love,

support, patience and devotion, and to my little one for inspiration at the end of this

journey.


To my husband, Kambiz

and

In memory of my parents.


Contents

Abstract
Acknowledgements
List of Tables
List of Figures

1 Introduction
  1.1 Multipath Fading Channels
  1.2 Space-Time Coding
    1.2.1 System Model
    1.2.2 Design Criteria
    1.2.3 Space-time Coding Schemes
    1.2.4 Space-Time Coding in a Multiuser System
  1.3 Precoding
    1.3.1 MIMO Single-user Systems
    1.3.2 MIMO Multiuser Systems
  1.4 Overview of the Thesis
  1.5 Notations

2 Space-Time Multiplexing for MIMO Multiuser Downlink Channels
  2.1 System Model
  2.2 Transmitted Signal Design
    2.2.1 Assumptions and Goals
    2.2.2 Spreading Matrix Design
    2.2.3 Constellation and Power Allocation
  2.3 Receiver Structures
    2.3.1 Joint ML Detection
    2.3.2 Multi-Stage Successive Interference Cancellation
  2.4 Comparison With Other STC-CDMA Transceivers
  2.5 Simulations
  2.6 Summary

3 Precoding and Beamforming for MIMO Downlink Channels with Per-User Power Constraints
  3.1 Problem Formulation
    3.1.1 Signal Model
    3.1.2 Precoding
  3.2 MMSE Beamforming/Precoding
    3.2.1 Precoding Matrix Design
    3.2.2 Optimum Receive Matrix
    3.2.3 Optimum Transmit Matrix
    3.2.4 Precoding Ordering
  3.3 Space-Time Spreading
  3.4 Simulation Results
  3.5 Summary

4 Precoding and Beamforming for MIMO Downlink Channels to Minimize Total Transmit Power
  4.1 Problem Formulation
    4.1.1 Signal Model
    4.1.2 Background
  4.2 Joint Power Allocation and MMSE Beamforming Using Uplink/Downlink Duality
    4.2.1 Uplink-Downlink Duality for MIMO channels
    4.2.2 Proposed Algorithm
  4.3 Space-Time Spreading
  4.4 Multiple Symbol Transmission to each user
  4.6 Summary

5 Precoding and Beamforming for the Down-link in a MIMO/OFDM System
  5.1 Single User MIMO/OFDM Systems
  5.2 Signal Model
    5.2.1 Transmit Signal
    5.2.2 Received Signal
  5.3 SFS Matrix Design with no Channel Information at the Transmitter
  5.4 Comparison With MIMO Multi-Carrier CDMA Schemes
  5.5 SFS Matrix Design with Perfect Channel Knowledge at the Transmitter
  5.7 Summary

6 Conclusion
  6.1 Summary of Contributions
  6.2 Future Work

A Spreading Matrix Design Examples
B Proof of Uplink-Downlink Duality in MIMO Multiuser Systems
C The Algorithms for Multiple Symbol Transmission to each user

Bibliography

List of Tables

2.1 Comparison between different STC schemes for the downlink in a MIMO multiuser channel
3.1 The algorithm for precoding and MMSE beamforming
4.1 The precoding/beamforming algorithm for MIMO-BC channels minimizing total transmit power
4.2 The error rate performance of TTPC versus PUPC algorithm for t = r = 4, K = 4, SINR = 10 (dB)
4.3 The error rate performance of TTPC versus PUPC algorithm for t = r = 4, G = 4, K = 16, SINR = 10 (dB)
C.1 The precoding/beamforming for multiple symbol transmission
C.2 The space-time precoding/beamforming for multiple symbol transmission

List of Figures

1.1 Multiple Access Channel
1.2 Broadcast Channel
1.3 Matrix DFE
2.1 Transmission system model
2.2 Structure of two-stage SIC
2.3 Performance of 2-D STSC for different receivers, t = r = 2, G = 2, U = 4
2.4 The effect of MAI (number of users) on the achieved diversity with MMSE for t = r = 4, G = 4
2.5 The impact of power allocation on the performance of 2-D STSC for t = r = 2, U = 8
2.6 The impact of power allocation on the performance of SIC for t = r = 4, G = 4, U = 16
2.7 Performance of 2-D STSC in correlated fading channels for t = r = 2
2.8 Performance comparison of various schemes for multiuser channel in the downlink for t = r = 2, G = 2
2.9 Performance of proposed 2D-STSC versus randomly generated ST spreading codes which do not have the zero average MAI property, and Hadamard codes which give zero average MAI but do not satisfy the full-diversity criterion
2.10 Performance comparison of the proposed space-time coding scheme and rotated constellation (TAST) in a single user system for t = r = 2, G = 2
3.1 Block diagram of the matrix DFE
3.2 Matrix form of the Tomlinson-Harashima precoder
3.3 The average Pe for different number of receive antennas and t = 4, K = 2, z = 2
3.4 The performance of space-time spreading for different number of receive antennas, t = 4, G = 4, K = 8, z = 2
3.5 Average Pe compared with Pe for each individual user, t = 2, r = 2, K = 2, z = 1
4.1 Uplink-downlink duality: these two multi-user channels have the same achievable SINR region for a given sum power constraint
4.2 Performance of the iterative linear beamforming and the proposed algorithm with MMSE and random initializations, for t = r = 4, K = 4, SINR = 10 dB
4.3 Total transmit power versus the required SINR for different number of transmit/receive antennas for K = 4
4.4 Transmit power per user versus the number of active users at the system for r = 4, SINR = 10 dB
4.5 Precoding/Beamforming over space and time for t = r = 4, K = 16
4.6 Precoding/Beamforming over space and time for t = r = 4, K = 8
5.1 OFDM/MIMO Block Diagram
5.2 Transceiver structure of MIMO MC-CDMA systems
5.3 Performance comparison of Space-Frequency Spreading methods for one, two and four tap equal power fading channels for t = r = G = 2
5.4 Performance of ML detection versus SIC with and without power control for two-tap channel for t = r = 2, L = 2, Nf = 8
5.5 The performance of space-frequency spreading compared with MMSE beamforming over flat fading channel t = 2, r = 2, Nf = 8, z K = 16
5.6 Average Pe compared with Pe for individual users in space-frequency spreading, t = 2, r = 2, Nf = 8, K = 8, z = 2
5.7 Precoding/Beamforming over space and frequency for t = r = 2, K = 16, SINR = 10 (dB)

Chapter 1

Introduction

The use of antenna arrays at both the transmitter and the receiver has received signifi-

cant attention as a promising method to provide diversity and/or multiplexing gain over

wireless links. Multiple antennas create extra dimensions in the signal space which can

be used in different ways. The receiver can be provided with replicas of the same data

to increase the reliability of signal transmission which results in spatial diversity gain.

The spatial dimensions can also be used to carry independent data streams to increase

the data rate which results in spatial multiplexing gain. This collective improvement

associated with spatial multiple-input multiple-output (MIMO) channels is based on the

premise that in the wireless system with enough separation between antennas in an ar-

ray, a rich scattering environment provides different channels between each transmit and

receive antenna which are statistically uncorrelated to some extent.

MIMO techniques were first investigated in a point-to-point or single-user commu-

nication link. In a MIMO single-user system with t transmit and r receive antennas, a

diversity order of tr can be provided for the system. Also, if the channel is perfectly

known at the receiver, capacity scales linearly with min(t, r) relative to a system with

just one transmit and one receive antenna. A MIMO system is thus able to provide im-


proved power and bandwidth efficiencies, at the cost of setting up additional antennas.

Space-time coding schemes have been designed for MIMO single-user systems to achieve

diversity gain [1–3], or achieve high data rates by taking advantage of multiplexing gain

of MIMO systems [4, 5], or both [6, 7].

[Figure: a base station communicating with User 1, User 2, ..., User K]

Figure 1.1: Multiple Access Channel

In many applications ranging from wireless LAN to cellular telephony, multiuser communication is a reality. Researchers have therefore recently begun to investigate the impact and implications of using MIMO systems in multiuser environments.

There are two basic multiuser MIMO channel models: the MIMO multiple-access chan-

nel (MAC) and the MIMO broadcast channel (BC). In MIMO MAC, a number of users

share a common communication channel to transmit their individual signals to a receiver.

Such a system is shown in Figure 1.1. In the uplink of a mobile cellular communication

system, the users are the mobile transmitters in any particular cell and the receiver is


the base station of that cell. In MIMO BC, a transmitter sends information to multiple

receivers as shown in Figure 1.2. In the downlink of a mobile cellular communication

system, the transmitter is the base station and the receivers are the mobile stations.

A key difference between single-user, MAC, and BC channels is that in the single-user channel there is full collaboration at both the transmitter and the receiver, while in the MAC channel there is collaboration only at the receiver, and in the BC channel collaboration exists only at the transmitter. Therefore, in the BC channel, joint processing between the receivers cannot be supported. For this reason, the design of the BC channel has proved to be more challenging [8–10].

[Figure: a base station communicating with User 1, User 2, ..., User K]

Figure 1.2: Broadcast Channel

This thesis is primarily concerned with the design of the transmitter in a MIMO broad-

cast channel. Assuming no channel information at the transmitter, space-time spreading

matrices are designed to maximize diversity gain and spectral efficiency. Assuming per-


fect channel knowledge at the transmitter, an algorithm based on MMSE beamforming

combined with non-linear interference pre-subtraction is proposed which is applicable to

a multiuser BC channel with any desired number of transmit/receive antennas.

This chapter provides the basis for the rest of the thesis. Multipath fading and different diversity schemes are explained in the next section. Section 1.2 gives a brief review of space-time coding schemes for single-user systems and then explains their extension to multiuser systems. In Section 1.3, precoding is introduced for both single-user and multiuser systems when the channel is known perfectly at the transmitter. The overview of the thesis is provided in Section 1.4, and the notation used throughout the thesis is given in Section 1.5.

1.1 Multipath Fading Channels

The physical characteristics of the wireless channel present a fundamental technical challenge for reliable communications. This is mainly because of the time-varying multipath nature of the channel. Multipath propagation is a result of the propagation of the signal over a number of different paths due to reflections of the signal by mountains, buildings, and other objects. Because of the time variations in the structure of the wireless channel, the nature of the multipath varies with time. This results in signal fading over time. The amplitude variations in the received signal are due to the destructive and constructive addition of multiple signal paths between receiver and transmitter. For a multipath fading channel, we define the time-variant impulse response of the channel as $h(t, \tau)$, which is the output of the channel at time $t$ to an impulse applied at time $t - \tau$. Since the channel time variations are not predictable, the time-variant multipath channel is modelled statistically. The most common statistical fading model is the Rayleigh fading model, in which the impulse response of the channel, $h(t, \tau)$, is assumed to be a complex random variable whose real and imaginary parts are zero-mean statistically independent Gaussian random variables, each having a variance $\sigma^2$. Therefore the magnitude of the channel at any instant $t$, $r = |h(t, \tau)|$, has a Rayleigh distribution

\[ P(r) = \frac{r}{\sigma^2}\, e^{-r^2/2\sigma^2}, \quad r \ge 0. \tag{1.1} \]
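As an illustrative check of (1.1), the following minimal sketch (Python with NumPy, not part of the original work) draws complex Gaussian channel samples and compares the empirical distribution of their magnitude with the distribution function obtained by integrating (1.1). The function names, the sample count and the value of σ are assumptions chosen only for illustration.

```python
import numpy as np

def rayleigh_magnitudes(sigma, num_samples, seed=0):
    """Draw |h| where Re(h) and Im(h) are independent N(0, sigma^2) variables."""
    rng = np.random.default_rng(seed)
    h = rng.normal(0.0, sigma, num_samples) + 1j * rng.normal(0.0, sigma, num_samples)
    return np.abs(h)

def rayleigh_cdf(r, sigma):
    """Integral of the density in (1.1) from 0 to r: 1 - exp(-r^2 / (2 sigma^2))."""
    return 1.0 - np.exp(-r**2 / (2.0 * sigma**2))

if __name__ == "__main__":
    sigma = 1.0  # illustrative value
    r = rayleigh_magnitudes(sigma, 200_000)
    print("empirical  P(r <= 1):", np.mean(r <= 1.0))
    print("analytical P(r <= 1):", rayleigh_cdf(1.0, sigma))
```

The two printed values should agree to within the Monte Carlo error, which shrinks as the number of samples grows.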

The autocorrelation function of $h(t, \tau)$ is given by [11]

\[ \phi_h(\Delta t; \tau_i, \tau_j) = \frac{1}{2} E\left[h^*(t, \tau_i)\, h(t + \Delta t, \tau_j)\right], \tag{1.2} \]

where $\Delta t$ is the observation time difference. Since in most radio transmissions the impulse responses of the channel for different paths are independent, if we let $\Delta t = 0$, then we have

\[ \phi_h(\tau_i, \tau_j) = \phi_h(\tau_i)\, \delta(\tau_i - \tau_j). \tag{1.3} \]

In fact $\phi_h(\tau_i)$ represents the average channel output power as a function of the time delay $\tau_i$. The different paths have different time delays and different average powers. We call

\[ \phi_h(\tau) = \frac{1}{2} E\left[h^*(t, \tau)\, h(t, \tau)\right] \tag{1.4} \]

the multipath intensity profile of the channel [11]. The range of values of $\tau$ over which $\phi_h(\tau)$ is nonzero is called the multipath spread of the channel, and the largest value among these delays is defined as the delay spread of the channel, which is denoted by $T_m$. In other words, $\phi_h(\tau) \approx 0$ for $\tau > T_m$.

The coherence bandwidth of the channel, $(\Delta f)_c$, is defined as the frequency separation at which two frequency components of the signal undergo independent attenuations by the channel. This parameter is related to $T_m$ as

\[ (\Delta f)_c \approx \frac{1}{T_m}. \tag{1.5} \]

If the bandwidth of the signal, $W$, that is transmitted through the channel is smaller than the coherence bandwidth of the channel, i.e. $W < (\Delta f)_c$, the channel is called a flat-fading channel, in which all the frequency components of the signal undergo the same attenuation by the channel. In other words, within the bandwidth of the signal, the transfer function of the channel is constant in the frequency variable. In this case, the multipath components in the received signal are not resolvable, and the channel appears as a single fading path. This implies that in flat-fading channels, the received signal is simply the transmitted signal multiplied by the channel coefficient, $h$, where $h$ is a zero-mean complex-valued Gaussian random process. For a single-antenna system, this can be simply modelled as

\[ y = hx + \eta, \tag{1.6} \]

where $x$ and $y$ are the transmitted and received signals respectively, and $\eta$ is additive noise, which is usually assumed to be Gaussian distributed and independent of $x$.

If the signal bandwidth is such that $W > (\Delta f)_c$, the channel is said to be frequency-selective and the signal is severely distorted by the channel. In this case, the multipath components can be resolved in the received signal and therefore the receiver is provided with several independently fading signal paths [11]. Consequently, the frequency-selective channel is modelled as a tapped delay line filter with time-variant tap coefficients. Frequency-selective fading can degrade system performance by causing inter-symbol interference (ISI), which results in an irreducible bit error rate (BER). Time-domain equalization [11] and orthogonal frequency-division multiplexing (OFDM) [11–13] are practical techniques that can be used to resolve ISI.
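The distinction between the two regimes can be sketched numerically. The following is a minimal illustration under stated assumptions: the function names are hypothetical, the delay spread and bandwidths in the usage note are arbitrary example values, and the boundary case of a bandwidth exactly equal to the coherence bandwidth, which the text leaves open, is arbitrarily assigned to the frequency-selective branch.

```python
def coherence_bandwidth(delay_spread_s):
    """Approximate coherence bandwidth in Hz from (1.5): about 1 / T_m."""
    if delay_spread_s <= 0:
        raise ValueError("delay spread must be positive")
    return 1.0 / delay_spread_s

def classify_channel(signal_bandwidth_hz, delay_spread_s):
    """Return 'flat' if W is below the coherence bandwidth, else 'frequency-selective'."""
    if signal_bandwidth_hz < coherence_bandwidth(delay_spread_s):
        return "flat"
    return "frequency-selective"
```

For example, with an assumed delay spread of 1 microsecond the coherence bandwidth is about 1 MHz, so `classify_channel(200e3, 1e-6)` falls in the flat-fading regime while `classify_channel(5e6, 1e-6)` falls in the frequency-selective regime.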

Diversity techniques are based on the fact that if the channel is in a deep fade because of the destructive addition of the multipath signals, errors may occur due to the large channel attenuation. However, if we can provide the receiver with several replicas of the same signal transmitted over $L$ independently fading channels, the probability that all the signals fade simultaneously will be reduced considerably. If $p$ is the probability that any one signal fades below a threshold level, then $p^L$ is the probability that all $L$ independently fading replicas will fade below the threshold level. There are several diversity techniques that can be employed in wireless communication systems to supply the receiver with $L$ independently fading replicas of the same information signal. Diversity techniques which may be used include time, frequency, and space diversity.

- Time Diversity refers to transmitting the same signal over $L$ different time slots where the separation between successive time slots is enough to make their channels independent. A common example of time diversity is the interleaving of coded symbols over a large block length.

- Frequency Diversity refers to transmitting the same signal over a large bandwidth, exceeding the coherence bandwidth of the channel. An example of the use of frequency diversity is spread spectrum modulation. In fact, in a frequency-selective fading channel, the receiver is provided with $T_m W \approx W/(\Delta f)_c$ resolvable signal components. By applying either OFDM or time-domain equalization schemes, a frequency diversity of order $L \approx W/(\Delta f)_c$ can be obtained.

- Space Diversity refers to transmitting or receiving the same signal over multiple antennas that are separated enough to create independent fading channels. To provide space diversity, multiple antennas are used at the transmitter and/or the receiver. The independent spatial channels provided by multiple antennas can also be used to carry independent data streams to increase the data rate. This latter technique is known as spatial multiplexing.

In this thesis, both flat-fading and frequency-selective fading channels are considered.

In the latter case, OFDM is applied to resolve ISI and extract the frequency diversity of

the channel.
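The benefit of diversity described in this section can be made concrete with a short calculation. The sketch below assumes Rayleigh fading as in (1.1), for which the channel power $|h|^2$ is exponentially distributed with mean $2\sigma^2$; the function names and the numerical values in the usage note are illustrative assumptions.

```python
import math

def fade_probability(threshold_power, sigma):
    """P(|h|^2 < threshold) for Rayleigh fading with per-component variance sigma^2."""
    return 1.0 - math.exp(-threshold_power / (2.0 * sigma**2))

def all_branches_fade(p, L):
    """Probability that L independently fading replicas all fall below the threshold."""
    return p ** L

if __name__ == "__main__":
    # Illustrative: unit average channel power, threshold 10 dB below the average.
    p = fade_probability(0.1, math.sqrt(0.5))
    for L in (1, 2, 4):
        print(L, all_branches_fade(p, L))
```

With these example values a single branch is in a deep fade roughly one time in ten, and each added independent branch reduces the probability of a simultaneous fade by the same factor again.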


1.2 Space-Time Coding

1.2.1 System Model

A single-user channel is considered with $t$ transmit and $r$ receive antennas. The transmit symbols $s_1, \ldots, s_p$ are encoded to an $n \times t$ (possibly complex) space-time code matrix $\mathbf{C}$ which is transmitted from $t$ transmit antennas over $n$ time slots. The rate of this code is defined as

\[ R = p/n \ \text{symbols/channel-use}, \tag{1.7} \]

where again $p$ is the number of data symbols transmitted in $n$ time slots. The $t \times r$ channel matrix $\mathbf{H}$ is defined such that $\mathbf{H}(i, j)$, which represents the element in the $i$th row and the $j$th column of the matrix $\mathbf{H}$, is the channel gain between transmit antenna $i$ and receive antenna $j$. Each channel coefficient has the same variance $\sigma_h^2$, and the $tr$ channels are assumed to be independent. We also assume a quasi-static Rayleigh fading channel which is constant over a block of $n$ time slots and independent from block to block. Then if the power per input symbol transmitted from each transmit antenna is $p_s/t$, the received signal, which is an $n \times r$ matrix $\mathbf{Y}$, will be

\[ \mathbf{Y} = \sqrt{p_s/t}\; \mathbf{C}\mathbf{H} + \mathbf{N}, \tag{1.8} \]

where $\mathbf{N} \in \mathbb{C}^{n \times r}$ is a matrix of i.i.d. complex Gaussian random variables with zero mean and variance $\sigma^2$, representing receiver noise. To simplify the analysis, $\sigma_h^2 = 1$ is assumed and therefore the signal-to-noise ratio per receive antenna is defined as $\rho = p_s/\sigma^2$.

The channel is known at the receiver but not at the transmitter. The goal is to design

the matrix C to achieve full diversity and multiplexing gain.
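The signal model (1.8) can be simulated in a few lines. The following is a minimal sketch rather than a definitive implementation: the function name is hypothetical, and it is assumed that the stated variances refer to the total variance of each complex entry (half in the real part and half in the imaginary part).

```python
import numpy as np

def simulate_received_block(C, r, ps, noise_var, seed=0):
    """Return (Y, H) with Y = sqrt(ps / t) * C @ H + N, following (1.8).

    C is the n x t code matrix; H is drawn as a t x r matrix of i.i.d.
    complex Gaussian entries with unit variance (sigma_h^2 = 1).
    """
    rng = np.random.default_rng(seed)
    n, t = C.shape
    H = (rng.standard_normal((t, r)) + 1j * rng.standard_normal((t, r))) / np.sqrt(2)
    # n x r receiver noise with total variance noise_var per complex entry
    N = np.sqrt(noise_var / 2) * (
        rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))
    )
    Y = np.sqrt(ps / t) * C @ H + N
    return Y, H
```

The channel is drawn once per call, which matches the quasi-static assumption that it stays constant over the block of n time slots.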


1.2.2 Design Criteria

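Design criteria for space-time codes are derived based on maximum likelihood (ML) detection in [2, 14]. The analysis is based on pairwise error probability [11]. For a given channel matrix $\mathbf{H}$, the probability that an ML receiver decides erroneously in favor of the code matrix $\mathbf{C}_j$ when the code matrix $\mathbf{C}_i$ is transmitted will be

\[ P(\mathbf{C}_i \to \mathbf{C}_j \mid \mathbf{H}) = P\left(\|\mathbf{Y} - \mathbf{C}_j\mathbf{H}\|^2 \le \|\mathbf{Y} - \mathbf{C}_i\mathbf{H}\|^2\right) = Q\left(\sqrt{\frac{\rho A^2(\mathbf{C}_i, \mathbf{C}_j)}{2}}\right) \le \exp\left(-\rho A^2(\mathbf{C}_i, \mathbf{C}_j)/4\right), \tag{1.9} \]

where $A^2(\mathbf{C}_i, \mathbf{C}_j) = \|(\mathbf{C}_i - \mathbf{C}_j)\mathbf{H}\|^2$. Equation (1.9) needs to be averaged over the channel distribution. An upper bound on the average probability of error in the case of a Rayleigh fading channel is obtained in [2] as follows,

\[ P(\mathbf{C}_i \to \mathbf{C}_j) \le \left(\prod_{i=1}^{l} \lambda_i\right)^{-r} (\rho/4)^{-lr} \quad \text{for } \mathbf{C}_i \ne \mathbf{C}_j, \tag{1.10} \]

where $l$ is the minimum rank of the difference matrix, $\mathbf{D}_{ij} = (\mathbf{C}_i - \mathbf{C}_j)$, over different possible code matrices $\mathbf{C}_i \ne \mathbf{C}_j$, and $\lambda_i$ are the nonzero eigenvalues of the matrix $\mathbf{D}_{ij}\mathbf{D}_{ij}^H$. This results in the following design criteria for space-time codes:

- Rank Criterion (diversity gain): The achieved transmit diversity at the receiver is the minimum rank of the difference matrix, $\mathbf{D}_{ij}$, over all possible code matrices $\mathbf{C}_i \ne \mathbf{C}_j$. A full diversity code is obtained if $l = t$.

- Determinant Criterion (coding gain): The coding gain, $g$, is defined as

\[ g = \min_{\mathbf{C}_i \ne \mathbf{C}_j} \left(\prod_{i=1}^{l} \lambda_i\right). \tag{1.11} \]

Space-time codes are designed to maximize both diversity and coding gain.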
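The two criteria can be evaluated by brute force for small codebooks. The sketch below is illustrative only: the function name and the two toy BPSK codebooks are assumptions, and the coding gain is computed over the codeword pairs whose difference matrix attains the minimum rank, which is one reading of (1.11).

```python
import itertools
import numpy as np

def rank_and_coding_gain(codewords, tol=1e-9):
    """Return (l, g) for a list of n x t code matrices.

    l: minimum rank of D_ij = C_i - C_j over all distinct pairs.
    g: minimum product of the nonzero eigenvalues of D_ij D_ij^H over the
       pairs whose rank equals l.
    """
    stats = []
    for Ci, Cj in itertools.combinations(codewords, 2):
        D = Ci - Cj
        eig = np.linalg.eigvalsh(D @ D.conj().T)
        nonzero = eig[eig > tol]
        stats.append((len(nonzero), float(np.prod(nonzero))))
    l = min(rank for rank, _ in stats)
    g = min(prod for rank, prod in stats if rank == l)
    return l, g

# Toy BPSK codebooks for t = 2 antennas and n = 2 time slots (hypothetical examples).
symbols = [1.0, -1.0]
switched = [s * np.eye(2) for s in symbols]            # antenna 1 in slot 1, antenna 2 in slot 2
simultaneous = [s * np.ones((2, 2)) for s in symbols]  # same symbol from both antennas at once

print(rank_and_coding_gain(switched))      # rank 2: full transmit diversity
print(rank_and_coding_gain(simultaneous))  # rank 1: no transmit diversity
```

Sending the same symbol from both antennas at the same time gives a rank-one difference matrix, whereas separating the antennas in time restores full rank, at the cost of a lower rate.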


1.2.3 Space-time Coding Schemes

In this section, we briefly review a few well known space-time coding (STC) schemes so

that we may refer to them later in the thesis.

Space-Time Block Coding (STBC): In [1] an STC scheme is proposed for two transmit antennas. The input symbols ($s_i$) are divided into groups of two symbols each. Then the STC matrix is generated as follows:

\[ \mathbf{C} = \begin{bmatrix} s_1 & s_2 \\ -s_2^* & s_1^* \end{bmatrix} \tag{1.12} \]

It is shown that because of the orthogonal structure of this code, i.e.

\[ \mathbf{C}\mathbf{C}^H = \sum_{i=1}^{2} |s_i|^2\, \mathbf{I}_2 \tag{1.13} \]

(where $\mathbf{I}_k$ is a $k \times k$ identity matrix), ML detection simplifies to linear processing at the receiver [1]. Also it can be easily shown that this code has full diversity (the difference matrix is full rank). This scheme is later generalized in [3] to an arbitrary number of antennas. For $t$ transmit antennas, the input symbols are divided into groups of $t$ symbols each and then the STC matrix is generated as an orthogonal matrix, i.e. $\mathbf{C}\mathbf{C}^H = \sum_{i=1}^{t} |s_i|^2\, \mathbf{I}_t$.

Here, we summarize some important properties of STBC:

- STBC are full diversity codes.

- Real orthogonal codes with rate $R = 1$ can be designed for any number of transmit antennas.

- A complex orthogonal design with rate $R = 1$ exists if and only if $t = 2$ (see (1.12)). There are also complex orthogonal designs for $t = 3$ and $t = 4$, but with a rate $R = 3/4$.

- ML detection requires only linear processing at the receiver.
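The orthogonality property (1.13) and the resulting linear receiver can be checked with a short sketch. The code below assumes the convention of (1.8), in which rows of the code matrix are time slots and the channel matrix has one row per transmit antenna; the function names are hypothetical and the combining rule is the standard one for the two-antenna code of [1], written out here for illustration.

```python
import numpy as np

def alamouti_encode(s1, s2):
    """2 x 2 code matrix of (1.12): rows are time slots, columns are transmit antennas."""
    return np.array([[s1, s2], [-np.conj(s2), np.conj(s1)]], dtype=complex)

def alamouti_combine(Y, H):
    """Linear combining for Y = C H (+ noise), with H of size 2 x r and Y of size 2 x r.

    Returns soft estimates of (s1, s2); in the noise-free case they equal the
    transmitted symbols exactly.
    """
    h1, h2 = H[0], H[1]          # gains from transmit antennas 1 and 2
    y1, y2 = Y[0], Y[1]          # received samples in time slots 1 and 2
    gain = np.sum(np.abs(H) ** 2)
    s1_hat = np.sum(np.conj(h1) * y1 + h2 * np.conj(y2)) / gain
    s2_hat = np.sum(np.conj(h2) * y1 - h1 * np.conj(y2)) / gain
    return s1_hat, s2_hat
```

Because the cross terms cancel, each symbol is recovered separately with the gain of all 2r channel coefficients, which is why ML detection reduces to symbol-by-symbol slicing of these soft estimates.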

BLAST Codes: BLAST stands for Bell Laboratories Layered Space-Time. This architecture breaks the data stream into $t$ sub-streams that are transmitted simultaneously from $t$ antennas. Hence, there is no built-in spatial transmit diversity. This scheme is implemented as Diagonal-BLAST (D-BLAST) [4], Vertical-BLAST (V-BLAST) [5] and Turbo-BLAST (T-BLAST) [15]. In particular, BLAST is designed to provide very high data rate communications over wireless flat-fading channels. A typical example for V-BLAST when we have two transmit and two receive antennas is

\[ \mathbf{C} = \begin{bmatrix} s_1 & s_2 \\ s_3 & s_4 \end{bmatrix}. \tag{1.14} \]

Comparing this with STBC (1.12), one can see that in the BLAST scheme, transmit diversity is not provided for the system. However, two symbols are transmitted per channel use, which is twice the rate of STBC in terms of channel symbols per channel use.

At the receiver, successive nulling and cancelling is applied. The interference from an already-detected symbol is subtracted from the received signal before the next symbol is detected. Each symbol is detected based on a zero-forcing method [16]. Therefore, it is necessary to have $r \ge t$. The order in which the symbols are detected affects the overall performance of the algorithm. The best-first cancellation scheme is widely known within the multiuser detection community [17]. This can also be applied at the receiver for BLAST. Based on this scheme, the symbols are ordered based on their received signal-to-interference-plus-noise ratios (SINRs). Then the symbols with higher SINRs are detected first. Because of the particular structure of BLAST, it can be easily seen that the symbols are automatically received with different SINRs.

Here, we summarize some important properties of BLAST:

- Spatial diversity is not provided at the transmitter.

- BLAST can be designed for any number of transmit/receive antennas as long as the number of receive antennas is equal to or greater than the number of transmit antennas, $r \ge t$.

- BLAST is a full rate code, $R = t$.
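The nulling and cancelling receiver described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector of [5] verbatim: one time slot is processed in the row convention of (1.8), so the received vector is y = s H + noise with H of size t x r, the function name is hypothetical, and the stream with the smallest zero-forcing nulling-vector norm (the highest post-nulling SNR) is detected first as a stand-in for SINR ordering.

```python
import numpy as np

def vblast_zf_sic(y, H, constellation):
    """Detect s from y = s @ H + noise by ordered zero-forcing nulling and cancelling.

    y: length-r received vector; H: t x r channel matrix with r >= t;
    constellation: 1-D array of candidate symbol values.
    """
    t = H.shape[0]
    y = np.asarray(y, dtype=complex).copy()
    remaining = list(range(t))
    s_hat = np.zeros(t, dtype=complex)
    while remaining:
        # Columns of P are the nulling vectors: H[remaining] @ P = identity.
        P = np.linalg.pinv(H[remaining, :])
        k = int(np.argmin(np.sum(np.abs(P) ** 2, axis=0)))  # best stream first
        z = y @ P[:, k]                                      # interference-free statistic
        idx = remaining[k]
        s_hat[idx] = constellation[np.argmin(np.abs(constellation - z))]
        y = y - s_hat[idx] * H[idx, :]                       # cancel the detected stream
        remaining.pop(k)
    return s_hat
```

The pseudo-inverse exists with the required nulling property only when the remaining rows of H are linearly independent, which is where the condition that r be at least t enters.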

Linear Dispersion (LD) Codes: In this scheme which is proposed in [6], thedata stream is broken into Q sub-streams, sq = q + jq, q = 1, ..., Q, that are

transmitted over space and time as indicated by the codeword matrix

C =

Qq=1

(qAq + jqBq). (1.15)

The performance of LD codes is dependent on Q, {Aq}, and {Bq}. The LD codesin [6] were designed to maximize the mutual information between the transmit and

receive signals.

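Note that, in a MIMO single-user channel, if the channel is known at the receiver, the resulting channel capacity is [4, 18]:

$$C(\rho, t, r) = \max_{\mathbf{R}_s,\ \mathrm{Tr}(\mathbf{R}_s) = t} E\left\{\log\left[\det\left(\mathbf{I}_r + \frac{\rho}{t}\mathbf{H}\mathbf{R}_s\mathbf{H}^H\right)\right]\right\}, \tag{1.16}$$

where the expectation is taken over the distribution of the random matrix $\mathbf{H}$, and $\mathbf{R}_s$ is the covariance matrix of the input signal. If the channel matrix $\mathbf{H} \in \mathbb{C}^{r \times t}$ is a matrix of i.i.d. complex Gaussian random variables with zero mean and variance $\sigma_h^2 = 1$, the optimal covariance matrix when $\mathbf{H}$ is unknown to the transmitter is $\mathbf{R}_s = \mathbf{I}_t$, and (1.16) becomes

$$C(\rho, t, r) = E\left\{\log\left[\det\left(\mathbf{I}_r + \frac{\rho}{t}\mathbf{H}\mathbf{H}^H\right)\right]\right\}. \tag{1.17}$$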
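The expectation in (1.17) has no simple closed form, but it is easy to estimate by Monte Carlo averaging. The following is a minimal sketch for $r = t = 2$, assuming that $\rho$ plays the role of the SNR, that the logarithm is taken to base 2 (capacity in bits per channel use), and using hypothetical function names; the trial count and seed are arbitrary.

```python
import math
import random

def capacity_sample(H, rho):
    """log2 det(I_r + (rho/t) H H^H) for one 2x2 channel realization."""
    t = 2
    g = rho / t
    # Entries of the Hermitian matrix I + g * H H^H.
    m00 = 1 + g * (abs(H[0][0]) ** 2 + abs(H[0][1]) ** 2)
    m11 = 1 + g * (abs(H[1][0]) ** 2 + abs(H[1][1]) ** 2)
    m01 = g * (H[0][0] * H[1][0].conjugate() + H[0][1] * H[1][1].conjugate())
    det = m00 * m11 - abs(m01) ** 2
    return math.log2(det)

def ergodic_capacity(rho, trials=2000, seed=0):
    """Average capacity_sample over i.i.d. unit-variance complex Gaussian H."""
    rng = random.Random(seed)

    def cn():
        # Zero-mean complex Gaussian sample with unit variance.
        return complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))

    total = 0.0
    for _ in range(trials):
        H = [[cn(), cn()], [cn(), cn()]]
        total += capacity_sample(H, rho)
    return total / trials
```

Calling `ergodic_capacity(10.0)` returns an estimate for a linear SNR of 10; because the same seed reproduces the same channel draws, estimates at different SNRs are directly comparable.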


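By substituting (1.15) in (1.8) we get

$$\mathbf{Y} = \sqrt{\rho_s/t}\,\sum_{q=1}^{Q} (\alpha_q \mathbf{A}_q + j\beta_q \mathbf{B}_q)\mathbf{H} + \mathbf{N}. \tag{1.18}$$

By decomposing the matrices in (1.18) into their real and imaginary parts and then collecting the real and imaginary parts of the received signal in the vector $\mathbf{y}$, equation (1.18) is reformulated as [6]

$$\mathbf{y} = \sqrt{\rho_s/t}\,\mathbf{H}_g\mathbf{x} + \boldsymbol{\nu}, \tag{1.19}$$

where $\mathbf{H}_g \in \mathbb{R}^{2nr \times 2Q}$ is a modified channel matrix which is a function of the real and imaginary components of $\mathbf{A}_q$ and $\mathbf{B}_q$ as well as the original channel gains, $\mathbf{y}, \boldsymbol{\nu} \in \mathbb{R}^{2nr \times 1}$, and $\mathbf{x} \in \mathbb{R}^{2Q \times 1}$ is a vector of the real and imaginary parts of the transmitted symbols $(\alpha_q, \beta_q)$. Therefore the LD codes are linear in the variables $\alpha_q, \beta_q$, and the same detection algorithm as explained for the BLAST scheme can be applied.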

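Without loss of generality, $s_1, \ldots, s_Q$ are assumed to be unit-variance and uncorrelated. Then $E[\mathrm{Tr}(\mathbf{C}\mathbf{C}^H)] = nt$ and therefore

$$\sum_{q=1}^{Q} \left(\mathrm{Tr}(\mathbf{A}_q^H\mathbf{A}_q) + \mathrm{Tr}(\mathbf{B}_q^H\mathbf{B}_q)\right) = 2nt. \tag{1.20}$$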
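The step from the power normalization to (1.20) can be spelled out as follows. This is a short sketch that assumes, in addition to the statement above, that the real and imaginary parts $\alpha_q$ and $\beta_q$ are zero-mean, mutually uncorrelated, and each of variance $1/2$, so that all cross terms vanish in expectation.

```latex
\begin{align*}
E[\mathrm{Tr}(\mathbf{C}\mathbf{C}^H)]
  &= \sum_{q=1}^{Q} \left( E[\alpha_q^2]\,\mathrm{Tr}(\mathbf{A}_q\mathbf{A}_q^H)
     + E[\beta_q^2]\,\mathrm{Tr}(\mathbf{B}_q\mathbf{B}_q^H) \right) \\
  &= \frac{1}{2} \sum_{q=1}^{Q} \left( \mathrm{Tr}(\mathbf{A}_q^H\mathbf{A}_q)
     + \mathrm{Tr}(\mathbf{B}_q^H\mathbf{B}_q) \right) = nt .
\end{align*}
```

Multiplying both sides by two gives (1.20); the second line uses $\mathrm{Tr}(\mathbf{A}\mathbf{A}^H) = \mathrm{Tr}(\mathbf{A}^H\mathbf{A})$.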

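To design LD codes, first of all $Q = n \cdot \min(t, r)$ is chosen. As mentioned earlier, the LD codes are designed to maximize the mutual information between the transmit and receive signals. Therefore, to choose $\{\mathbf{A}_q, \mathbf{B}_q\}$, the following optimization problem has to be solved

$$C_{LD}(\rho, n, t, r) = \max_{\mathbf{A}_q, \mathbf{B}_q,\ q = 1, \ldots, Q} \frac{1}{2n} E\left\{\log\left[\det\left(\mathbf{I}_{2nr} + \frac{\rho}{t}\mathbf{H}_g\mathbf{H}_g^T\right)\right]\right\}, \tag{1.21}$$

subject to one of the following constraints:

1. $\sum_{q=1}^{Q} \left(\mathrm{Tr}\,\mathbf{A}_q^H\mathbf{A}_q + \mathrm{Tr}\,\mathbf{B}_q^H\mathbf{B}_q\right) = 2nt$

2. $\mathrm{Tr}\,\mathbf{A}_q^H\mathbf{A}_q = \mathrm{Tr}\,\mathbf{B}_q^H\mathbf{B}_q = \frac{nt}{Q}$, $q = 1, \ldots, Q$

3. $\mathbf{A}_q^H\mathbf{A}_q = \mathbf{B}_q^H\mathbf{B}_q = \frac{n}{Q}\mathbf{I}_t$, $q = 1, \ldots, Q$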

The first constraint is the power constraint that ensures $E[\mathrm{Tr}(\mathbf{C}\mathbf{C}^H)] = nt$. The second constraint ensures that all the data symbols are transmitted with the same power. The third constraint ensures that all the data symbols are transmitted with equal energy in all spatial and temporal directions.
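As a sanity check on the three constraints, one can write a familiar code in the form (1.15) and test it. The sketch below uses the Alamouti code with $n = t = 2$ and $Q = 2$ (an illustrative choice, not one of the optimized codes of [6]); the dispersion matrices and helper names are assumptions made for the example, with rows of the codeword indexing time and columns indexing antennas.

```python
# Alamouti's code written as an LD code with n = t = 2 and Q = 2:
# C = sum_q (alpha_q A_q + j beta_q B_q), where s_q = alpha_q + j beta_q.
n, t, Q = 2, 2, 2
A = [[[1, 0], [0, 1]], [[0, 1], [-1, 0]]]
B = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]

def gram(M):
    """Return M^H M for a matrix given as nested lists."""
    rows, cols = len(M), len(M[0])
    return [[sum(complex(M[k][i]).conjugate() * M[k][j] for k in range(rows))
             for j in range(cols)] for i in range(cols)]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

def codeword(symbols):
    """Build C from complex symbols according to (1.15)."""
    C = [[0j] * t for _ in range(n)]
    for q, s in enumerate(symbols):
        for i in range(n):
            for k in range(t):
                C[i][k] += s.real * A[q][i][k] + 1j * s.imag * B[q][i][k]
    return C

traces = [(trace(gram(A[q])), trace(gram(B[q]))) for q in range(Q)]
constraint1 = sum(ta + tb for ta, tb in traces) == 2 * n * t
constraint2 = all(ta == tb == n * t / Q for ta, tb in traces)
scaled_identity = [[n / Q if i == k else 0 for k in range(t)] for i in range(t)]
constraint3 = all(gram(M) == scaled_identity for M in A + B)
```

For this code all three flags evaluate to true, which is consistent with the ordering of the constraints: the third implies the second, and the second implies the first.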

Here, we summarize some important properties of LD codes:

• Full diversity is not guaranteed, but the codes are shown to provide good performance with respect to the probability of error [6].

• The optimization problem in (1.21) is neither convex nor concave. Therefore the optimization may converge to a local optimum.

• The solution $(\mathbf{A}_q, \mathbf{B}_q)$ is not unique.

• An LD code is a full-rate code, $R = Q/n$, which results in $R = t$ if $r \geq t$.

TAST Codes: TAST stands for Threaded Algebraic Space-Time. This scheme, which is proposed in [7, 19, 20], uses a threaded structure and algebraic number-theoretic tools to design full-diversity codes. The codes are directly optimized based on the rank criterion (diversity gain) and the determinant criterion (coding gain) (see Section 1.2.2). The problem of space-time diversity gain is related to algebraic number theory, and the coding gain optimization is related to the theory of simultaneous Diophantine approximation in the geometry of numbers. The coding gain optimization is found to be equivalent to finding irrational numbers that are the furthest from any simultaneous rational approximations.

With ML detection at the receiver, these codes achieve full diversity while the coding gain is optimized as well.


For comparison, a design of TAST codes for two transmit/receive antennas is given in the following:

$$\mathbf{C} = \begin{bmatrix} s_1 + \phi s_2 & \theta(s_3 + \phi s_4) \\ \theta(s_3 - \phi s_4) & s_1 - \phi s_2 \end{bmatrix}, \tag{1.22}$$

where $\theta^2 = \phi$, and $\phi = e^{j/2}$.

It can be seen that the rate of this code is $R = 2$. Also, based on the rank criterion, it can easily be shown that this code achieves full diversity. $\phi$ is chosen to maximize the coding gain.

Other details of the design of these codes are omitted here; a comprehensive explanation is provided in [7, 19, 20]. Here we summarize some key points of TAST codes:

• TAST codes can be designed for any number of transmit antennas.

• TAST codes are full-diversity, full-rate ($R = t$) codes.

• Optimal (ML) detection is required to achieve full diversity.

1.2.4 Space-Time Coding in a Multiuser System

The design of space-time codes for single-user systems is very well understood. However, there has not been extensive work on space-time code design for multiuser applications. In fact, splitting the channel resources among independent users, either in the form of multiple access (uplink) or broadcasting (downlink), is often considered a straightforward task involving the concatenation of a multiple access scheme such as CDMA with the space-time (ST) processor [21-23]. For instance, each user can be assigned an orthogonal spreading code, which is used to spread the symbols at the output of a space-time encoder. The channel symbols can be generated using orthogonal space-time block codes, or any other STC designed for a single-user system. With flat-fading synchronous channels, as seen on the downlink, a de-spreading front-end at each receiver results in a single-user channel without multi-access interference (MAI). Note that the maximum number of active users is only equal to the processing gain (bandwidth expansion) of the system, regardless of the number of space-time dimensions ($n$ time slots, $t$ transmit antennas) used in the STC part of the system. Furthermore, all the constraints and complexities of the applied STC scheme carry over to the multiuser case. As discussed before, some STC designs, such as orthogonal space-time block codes, cannot accommodate certain antenna configurations; for others, such as linear dispersion codes, it is necessary to maximize an objective function for a given $t$ and $r$.

In applying the BLAST scheme to a multiuser system, one can use the same spread-

ing code to spread each of the sub-streams. Since the same code is used to spread the

sub-streams, the spreading does not aid the receiver in distinguishing among them. As an

alternative, different spreading codes could be used for the sub-streams which are trans-

mitted simultaneously from different transmit antennas. In this case, the sub-streams

can be separated by their spatial characteristics and their codes. In either case, we can

either transmit multiple sub-streams to each user or transmit one sub-stream per user.

In addition, different spreading codes can be used to transmit the same sub-stream from

different antennas to achieve transmit diversity. In this case, a different spreading code

is used on each antenna to distinguish the sub-streams [22]. Although applying different spreading codes over different antennas improves the performance significantly, it decreases the spectral efficiency, as we will show later. In Chapter 2, we propose a space-time spreading scheme designed for the multiuser downlink channel and compare its spectral efficiency with that of other STC schemes presented for CDMA systems.


1.3 Precoding

The main difficulty in MIMO channels is the separation of the data streams which are

sent in parallel. In the context of the multiple access channel, this task is called multiuser

detection.

In this section we discuss precoding or pre-equalization of the transmitted signals

for MIMO systems. This type of processing at the transmitter requires the channel

state information (CSI) at the transmitter. In order to be able to obtain CSI at the

transmitter, the channel should be fixed (non-mobile) or approximately constant over

a reasonably large time period. If CSI is available at the transmitter, the transmitted

symbols, either for a single-user or for multiple users, can be partially separated by

means of pre-equalization at the transmitter. In this section, we give a brief overview of

precoding schemes for single-user and multiuser systems.

1.3.1 MIMO Single-user Systems

A MIMO channel can be described by a very basic model as $\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\nu}$, where $\mathbf{x}$ and $\mathbf{y}$ are the transmit and receive signal vectors respectively, $\boldsymbol{\nu}$ represents the receive noise, and $\mathbf{H}$ is the $r \times t$ MIMO channel. In a zero-forcing receiver, the transmit data signals are detected by multiplying the received signal vector by the pseudo-inverse of the channel matrix:

$$\hat{\mathbf{x}} = (\mathbf{H}^H\mathbf{H})^{-1}\mathbf{H}^H\mathbf{y}, \quad \text{if } t \leq r. \tag{1.23}$$

For this, the number of receive antennas should be greater than or equal to the number of transmit antennas. It is well known that zero-forcing equalization suffers from noise enhancement. To overcome this deficiency, decision-feedback equalization (DFE) can be applied at the receiver [24]. In DFE, the symbols are detected sequentially. After each symbol is detected, its contribution is cancelled before the next symbol is detected; a wrong decision therefore affects the subsequent symbols, and DFE suffers from error propagation. The structure of the DFE is shown in Figure 1.3. The matrix $\mathbf{B}$ is a lower triangular matrix representing the decision feedback operation, and the matrix $\mathbf{F}$ is the feedforward matrix.

For the above methods, the CSI is required only at the receiver. By assuming perfect CSI at the transmitter, the interference between the transmitted symbols can be completely avoided at the receiver by multiplying the transmit signal by the pseudo-inverse of the channel, which means transmitting

$$\mathbf{x} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}\mathbf{s}, \quad \text{if } r \leq t, \tag{1.24}$$

rather than transmitting the data vector $\mathbf{s}$. In this linear pre-equalization, instead of enhancing the noise, the average transmit power is increased. Also, the number of transmit antennas should be equal to or greater than the number of receive antennas.
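The pre-equalizer in (1.24) can be checked numerically on a small example. The following sketch assumes a hypothetical channel with $r = 2$ and $t = 3$ and uses hand-written helpers (the names `herm`, `matmul` and `inv2` are ours); it verifies that the noise-free channel output equals the data vector and reports the transmit power, which equals $\mathbf{s}^H(\mathbf{H}\mathbf{H}^H)^{-1}\mathbf{s}$ and therefore grows when $\mathbf{H}\mathbf{H}^H$ is close to singular.

```python
def herm(M):
    """Conjugate transpose of a matrix stored as nested lists."""
    return [[complex(M[i][k]).conjugate() for i in range(len(M))]
            for k in range(len(M[0]))]

def matmul(X, Y):
    """Plain matrix product of nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical channel with r = 2 receive and t = 3 transmit antennas.
H = [[1, 1j, 0], [0, 1, 1]]
s = [[1], [-1]]                                  # data vector as a column

P = matmul(herm(H), inv2(matmul(H, herm(H))))    # H^H (H H^H)^{-1}, as in (1.24)
x = matmul(P, s)                                 # transmitted vector
received = matmul(H, x)                          # noise-free channel output
tx_power = sum(abs(v[0]) ** 2 for v in x)
```

For larger $r$ a general linear solver would replace `inv2`; the structure of the computation stays the same.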

The equalization can also be split between the transmitter and the receiver. A popular strategy is based on the singular value decomposition (SVD) of the channel matrix. The channel can be written as $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$, where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices and $\boldsymbol{\Sigma}$ is diagonal. By multiplying the data signal by $\mathbf{V}$ at the transmitter, and then applying $\mathbf{U}^H$ at the receiver, the channel is diagonalized [25]. In this scheme, neither the transmit power is increased nor the channel noise is enhanced. The above schemes are considered linear pre-equalization.

Figure 1.3: Matrix DFE

As mentioned before, the DFE is a non-linear equalizer. With perfect channel knowledge at the transmitter, the feedback part of the DFE can be transferred to the transmitter, which results in a non-linear precoding scheme known as Tomlinson-Harashima precoding (THP). The performance of DFE and THP is the same, but since THP is applied at the transmitter, error propagation is avoided [26].

The feedforward and feedback filters are calculated as follows. We begin by applying a QL factorization to the channel matrix such that $\mathbf{H} = \mathbf{Q}^H\mathbf{S}$, where $\mathbf{Q}$ is a unitary matrix and $\mathbf{S}$ is a lower triangular matrix. This can be obtained through a Cholesky factorization of $\mathbf{H}^H\mathbf{H}$ because $\mathbf{H}^H\mathbf{H} = \mathbf{S}^H\mathbf{S}$ [26]. Now, we define $\mathbf{C} = \mathbf{V}\mathbf{S}$, where $\mathbf{V}$ is a diagonal matrix whose elements are the inverses of the diagonal elements of $\mathbf{S}$, so that $\mathbf{C}$ becomes a unit-diagonal lower triangular matrix. It can easily be verified that the feedback matrix at the transmitter and the feedforward filter at the receiver should be calculated as $\mathbf{B} = \mathbf{C} - \mathbf{I}$ and $\mathbf{F} = \mathbf{V}\mathbf{Q}$ respectively. Therefore, at the transmitter the symbols $a_i$, $i = 1, \ldots, K$, are generated successively from the original data $\mathbf{x}$:

$$a_i = x_i - \sum_{l=1}^{i-1} \mathbf{B}(i, l)\,a_l, \quad i = 1, \ldots, K, \tag{1.25}$$

where $x_i$ is the $i$th element of $\mathbf{x}$ and $\mathbf{B}(i, l)$ is the element in the $i$th row and the $l$th column of the matrix $\mathbf{B}$. This strategy can significantly increase the transmit power; therefore the symbols are modulo-reduced into the boundary region of the signal constellation in use. Mathematically, integers are added to the real and imaginary parts of $a_i$ to bound the transmit signals to the constellation region (see [27] and Chapter 3 for more details). Because of this modulo operation, THP is considered a non-linear precoding scheme. As shown in [27], the transmit power is still slightly increased, but the scheme outperforms linear precoding schemes in terms of error probability.
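The successive generation of the symbols in (1.25), followed by the modulo reduction, can be sketched as follows. The example is deliberately simplified and rests on several assumptions of ours: real-valued 4-PAM data in $\{-3, -1, 1, 3\}$, a modulo interval $[-4, 4)$, a made-up unit-diagonal lower triangular matrix $\mathbf{C}$ standing in for $\mathbf{V}\mathbf{S}$, and no noise. For complex constellations the same folding would be applied to the real and imaginary parts separately.

```python
import math

def modulo(v, M=4):
    """Fold v into [-M, M) by adding an integer multiple of 2M."""
    return v - 2 * M * math.floor((v + M) / (2 * M))

def thp_precode(x, B, M=4):
    """Apply the recursion (1.25), with modulo reduction, symbol by symbol."""
    a = []
    for i in range(len(x)):
        feedback = sum(B[i][l] * a[l] for l in range(i))
        a.append(modulo(x[i] - feedback, M))
    return a

# Hypothetical effective channel after the feedforward filter (C = V S).
C = [[1.0, 0.0, 0.0], [1.5, 1.0, 0.0], [-2.0, 0.5, 1.0]]
K = len(C)
B = [[C[i][l] - (1.0 if i == l else 0.0) for l in range(K)] for i in range(K)]

x = [3, -3, 1]                                   # 4-PAM data symbols
a = thp_precode(x, B)                            # precoded symbols, all in [-4, 4)
received = [sum(C[i][l] * a[l] for l in range(K)) for i in range(K)]
recovered = [modulo(v) for v in received]        # receiver-side modulo
```

Without the modulo step the second precoded symbol would be $-7.5$; with it, every precoded symbol stays inside the modulo interval and the receiver still recovers the data after its own modulo operation.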


1.3.2 MIMO Multiuser Systems

A multiuser downlink channel can also be modelled as $\mathbf{y} = \mathbf{H}\mathbf{x} + \boldsymbol{\nu}$, where $\mathbf{H}$ is the overall downlink channel matrix and $\mathbf{y}$ includes the received signals of all users. However, since the receivers do not collaborate, joint processing of the vector $\mathbf{y}$ is not possible, and consequently the schemes proposed for single-user systems may not be applicable. For instance, the SVD of the known channel matrix, as explained in the last section, cannot be applied. Also, in the THP implementation, although the feedback part is moved to the transmitter, the feedforward filter still requires joint processing of the received signals. However, THP can be modified to suit a multiuser channel by transferring the feedforward filter to the transmitter as well. The feedforward and feedback filters for this new structure are calculated as follows. A QR factorization is applied to $\mathbf{H}^H$ such that $\mathbf{H}^H = \mathbf{Q}\mathbf{R}$ and $\mathbf{H} = \mathbf{S}\mathbf{Q}^H$, where $\mathbf{Q}$ is a unitary matrix and $\mathbf{S} = \mathbf{R}^H$ is a lower triangular matrix. By defining $\mathbf{C} = \mathbf{V}\mathbf{S}$, where $\mathbf{V}$ is a diagonal matrix whose elements are the inverses of the diagonal elements of $\mathbf{S}$, so that $\mathbf{C}$ becomes a unit-diagonal lower triangular matrix, the feedforward and feedback matrices are $\mathbf{F} = \mathbf{Q}$ and $\mathbf{B} = \mathbf{C} - \mathbf{I}$ respectively. The output signals of the Tomlinson-Harashima precoder are now applied to the feedforward filter before being transmitted through the downlink channel. As a result, the effective channel seen by the data is $\mathbf{H}\mathbf{Q}\mathbf{C}^{-1} = \mathbf{V}^{-1}$, which is a diagonal matrix, and therefore joint processing is not required at the receivers. In this scheme, the number of transmit antennas has to be equal to or greater than the total number of receive antennas, which is a restrictive condition on the number of users in the system or the number of receive antennas at each user. In Chapter 3, we design a non-linear precoding scheme based on THP which is valid for any number of transmit/receive antennas.
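The diagonalization $\mathbf{H}\mathbf{Q}\mathbf{C}^{-1} = \mathbf{V}^{-1}$ can be illustrated on a toy example. The sketch below makes several simplifying assumptions of ours: a real-valued channel with two single-antenna users and three transmit antennas, a classical Gram-Schmidt procedure in place of a library QR routine, and no modulo operation or noise. The function name `gram_schmidt_rows` and the channel values are hypothetical.

```python
import math

def gram_schmidt_rows(H):
    """Factor a real K x t matrix as H = S Q^T (S lower triangular).

    This is the QR factorization of H^T with S = R^T; the K orthonormal
    columns of Q are returned as a list of length-t vectors.
    """
    K, t = len(H), len(H[0])
    Q_cols = []
    S = [[0.0] * K for _ in range(K)]
    for i in range(K):
        u = list(H[i])
        for l in range(i):
            S[i][l] = sum(H[i][m] * Q_cols[l][m] for m in range(t))
            u = [u[m] - S[i][l] * Q_cols[l][m] for m in range(t)]
        S[i][i] = math.sqrt(sum(v * v for v in u))
        Q_cols.append([v / S[i][i] for v in u])
    return S, Q_cols

H = [[3.0, 0.0, 4.0], [1.0, 2.0, 2.0]]           # rows: users, columns: antennas
K, t = len(H), len(H[0])
S, Q_cols = gram_schmidt_rows(H)
C = [[S[i][l] / S[i][i] for l in range(K)] for i in range(K)]   # C = V S

x = [1.0, -1.0]                                  # one data symbol per user
a = []
for i in range(K):                               # feedback with B = C - I
    a.append(x[i] - sum(C[i][l] * a[l] for l in range(i)))
tx = [sum(Q_cols[l][m] * a[l] for l in range(K)) for m in range(t)]  # F = Q
rx = [sum(H[j][m] * tx[m] for m in range(t)) for j in range(K)]
scaled = [rx[j] / S[j][j] for j in range(K)]     # each user scales by its own gain
```

Each user only needs its own scalar gain, the corresponding diagonal element of $\mathbf{S}$, to recover its symbol, which is why no cooperation between receivers is required in this structure.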

Also, in [8] the authors show that the broadcast channel sum capacity is achieved


using a precoder with the structure of a DFE that decomposes the broadcast channel

into a series of single-user channels with interference pre-subtracted at the transmitter.

The proposed precoder is a generalization of the Tomlinson-Harashima precoder.

1.4 Overview of the Thesis

The focus of this thesis is on the precoding and beamforming design for the multiuser

downlink channel (Figure 1.2) when multiple antennas are employed at both transmitter

and receiver sides. We address two scenarios:

• No channel state information is available at the transmitter (NCSIT)

• Perfect channel state information is available at the transmitter (CSIT)

The design problem with NCSIT is addressed in Chapter 2. The channel is assumed

to be flat fading. A space-time spreading matrix is proposed for each user, rather than

a temporal spreading code vector as is usual in code-division-multiple-access (CDMA)

systems. The spreading matrices are designed to provide full spatial diversity at each

receiver while the multiplexing gain is maximized as well. The bandwidth expansion,

for a given number of users, is then reduced by a factor of min(t, r), while full spatial

diversity is provided for each user, where t is the number of transmit antennas at the

base station, and r is the minimum number of receive antennas at user stations. In the

downlink since the receivers are portable end-user devices, we are concerned with the

complexity at the receiver. Therefore, suboptimal detectors are preferred over optimal

detectors (maximum likelihood detection). We have a two-stage interference canceller

(IC) applied at each receiver. A power allocation scheme is then suggested to improve

the performance of IC towards achieving full diversity.


The design problem with CSIT is studied in two parts. In the first part which is

addressed in Chapter 3, we have a per-user power constraint. In the second part which

is addressed in Chapter 4, the design goal is to minimize the total transmitted power

in the downlink, while signal-to-interference-noise ratio (SINR) requirements are to be

satisfied at each receiver. In the following we explain these two parts in more detail.

As mentioned before, since there is no collaboration between the receivers in the downlink broadcast (BC) channel, and low-complexity receivers are preferred at mobile stations, our goal is to transfer the processing load from the receivers to the transmitter. It is well known that, assuming perfect channel knowledge at the transmitter, complexity can be moved from the receivers to the transmitter without loss of performance [10], [8]. We also know that the boundary of the capacity region of the BC channel is attained with channel knowledge at the transmitter, by using it for successive dirty paper coding (DPC) [28]. Dirty paper coding is a technique that can be seen as interference pre-cancellation at the transmitter.

In this work, assuming perfect channel knowledge at the transmitter, a successive

interference pre-subtraction is applied via a matrix version of Tomlinson-Harashima Pre-

coding (THP).

In Chapter 3, the multiuser MMSE beamforming is combined with THP to mini-

mize the mean squared error between transmit and receive data streams. The receive

beam vectors are obtained with the MMSE criterion, and the transmit beam vectors are

obtained through an eigen-value-decomposition scheme. In fact since interference pre-

cancellation is applied at the BS, the single user algorithms are applicable over individual

single user channels. The proposed scheme is extended to design the beam vectors over

the time domain as well.

In Chapter 4, the same interference pre-cancellation is applied at the transmitter.

However, since the goal is to minimize the total transmit power, the design problem


is more complicated. We show that transmit beamforming is much more complicated than receive beamforming when the total transmit power is to be minimized. We propose an iterative algorithm for designing the single transmitter and the multiple receivers. An uplink-downlink SINR duality result is proved and used in this algorithm, which computes MMSE beamforming receivers for the virtual uplink and the downlink in turn. Initialization is provided by the eigenvalue-decomposition scheme explained in Chapter 3. This algorithm can be used to design space-time beam vectors as well.

In the above proposed algorithms, there is no limitation on the number of trans-

mit/receive antennas.

In Chapter 5, the proposed designs in Chapters 2, 3, 4 are extended to perform

precoding and beamforming in the MIMO multiuser frequency selective fading channel

when orthogonal frequency-division multiplexing (OFDM) is applied. In frequency se-

lective MIMO channels, there is an additional source of diversity, frequency diversity,

due to the existence of multiple propagation paths between each transmit and receive

antenna pair. In MIMO/OFDM systems, the channel frequency diversity can be also ex-

ploited through the proper design of space-frequency codes. In this chapter, first without

any knowledge of the channel at the transmitter a multiple access scheme is proposed

for the downlink in a MIMO/OFDM system. The space-frequency codes are designed

to exploit the space and frequency diversity. Then assuming perfect channel knowledge

at the transmitter, the precoding and beamforming design is performed over space and

frequency. It is shown that the optimization algorithm benefits from cooperation among

the processing at different frequency bins.

We conclude the thesis in Chapter 6 where we summarize the contributions of this

work and suggest some directions for future work.


1.5 Notations

The notations used in this thesis are as follows. Boldface lower case letters are used to denote vectors, and boldface upper case letters are used to denote matrices. The superscripts $*$, $T$, and $H$ denote conjugate, transpose, and conjugate transpose respectively, $\mathbf{I}$ denotes the identity matrix, $\mathrm{Diag}(\cdot)$ denotes a block diagonal matrix, and $E$ denotes statistical expectation. $\det(\cdot)$ and $\mathrm{Tr}(\cdot)$ denote the determinant and trace respectively. $\mathbf{M}(i, j)$ represents the element in the $i$th row and the $j$th column of the matrix $\mathbf{M}$.

Chapter 2

Space-Time Multiplexing for MIMO

Multiuser Downlink Channels

In this chapter, we study the downlink of a multiple-input multiple-output (MIMO)

multiuser system, in which antenna arrays are employed at both the transmitter (base

station) and the receivers (clients) to provide diversity and also multiplexing gain. The

channel is assumed to be unknown at the transmitter but each user receiver is assumed

to know its own channel. A modulation technique that can be seen as a two-dimensional space-time spreading code (2D-STSC) is described. It is based on well-known Walsh codes, provides full transmit diversity and high spectral efficiency, and produces groups of users that are orthogonal to each other. This last point translates into simplified detection strategies without loss of performance, as we will show later.

Note that the full-diversity space-time coding (STC) schemes designed based on the ML criterion [2, 7, 29] require optimal detection at the receiver. In the downlink we may be concerned with the complexity at the receiver, and suboptimal detectors are preferred.

The main results of this chapter are the following. We propose a joint space-time cod-

ing/spreading scheme that is designed for the multiuser downlink channel without channel


knowledge at the transmitter. It is effectively a two-dimensional spreading scheme. The

spreading codes are designed to provide full spatial diversity at each receiver while the

multiplexing gain is maximized as well. The bandwidth expansion of a system with a given number of users may then be reduced by a factor equal to the minimum of the numbers of transmit and receive antennas.

The main detector structure of interest is a two-stage interference canceller (IC) which

employs serial interference cancellation (SIC) in the first stage. We will demonstrate

that in conjunction with an unequal power allocation scheme, this receiver is able to

provide full diversity and suffers from only a small performance loss compared to the

full-complexity maximum likelihood (ML) receiver. While assigning different powers to

individual users may seem controversial and appears to lead to enforced differences in

quality of service, in fact wireless systems already employ power control and with a very

high probability, transmissions to different users have different powers. Therefore, we

are only pointing out that the proposed scheme has a good chance of being successfully

implemented in a practical, power-controlled system, even with low complexity receivers.

Our perspective also enables us to find a very simple design that is applicable to any

number of transmit and receive antennas, and which works even for single-user multiple

antenna systems. In that case, applying the same spreading scheme and assigning dif-

ferent power levels on different symbols gives a new approach to designing full rate, full

diversity space-time codes even with suboptimal detectors at the receiver.

The remainder of this chapter is organized as follows. The system model is provided in

the next section. The spreading matrix design and power control algorithm are explained

in Section 2.2. Receiver structures are discussed in Section 2.3. The comparison with the

other STC schemes for the downlink in a MIMO multiuser channel is given in Section

2.4. The simulation results are shown in Section 2.5. We conclude this chapter in Section

2.6.


2.1 System Model

The downlink transmitter sends $U$ data streams to $K$ users, where $K$ can be smaller than $U$, by memoryless linear modulation of the $U$ symbols $s_i$, $i = 1, \ldots, U$, in each symbol epoch. The transmission model is shown in Figure 2.1. The modulation waveforms are two-dimensional, unique to each data stream, and represented by the matrices $\boldsymbol{\Gamma}_i$, $i = 1, \ldots, U$. If the number of time dimensions is $G$ (this is also equivalent to the spreading gain) and the number of transmit antennas is $t$, then $\boldsymbol{\Gamma}_i \in \mathbb{C}^{t \times G}$. The transmitted signal is therefore

$$\mathbf{S} = \sum_{i=1}^{U} s_i\boldsymbol{\Gamma}_i \in \mathbb{C}^{t \times G}. \tag{2.1}$$

By the above equation, one is reminded of the class of linear dispersion codes explained

in [29], and indeed (2.1) represents space-time block codes of many types. However, un-

like in [29] where a single-user or point-to-point system is considered, in the downlink,

point-to-multipoint problem, we do not have the luxury of a capacity expression to max-

imize. For instance, the sum capacity in a MIMO broadcast channel does not have a

closed-form expression and is instead expressed as the saddle point of a certain function.

Therefore, we need to resort to other design goals, as will be explained later.

From (2.1), the signal received by user $j$ ($j = 1, \ldots, K$) over $G$ channel uses (or chips, in spread-spectrum terminology) and $r_j$ receive antennas is

$$\mathbf{Y}_j = \sum_{i=1}^{U} s_i\mathbf{H}_{j,0}\boldsymbol{\Gamma}_i + \mathbf{V}_j \in \mathbb{C}^{r_j \times G}, \tag{2.2}$$

where $\mathbf{H}_{j,0} \in \mathbb{C}^{r_j \times t}$ is the matrix of flat-fading channel gains between the transmitter and the $j$th receiver's antennas, and $\mathbf{V}_j$ is a matrix of i.i.d. complex Gaussian random variables with zero mean and variance $\sigma_j^2$, representing receiver noise. We note that, because of the assumption that the channel is non-dispersive in time, $\mathbf{Y}_j$ constitutes a sufficient statistic for the detection of the symbol vector $\mathbf{s} = [s_1, \ldots, s_U]^T$; hence the symbol epoch is not explicitly mentioned.

Figure 2.1: Transmission system model

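It is assumed that in the $j$th MIMO channel, each flat-fading coefficient has the same variance $\sigma_{h,j}^2$, and that the $t \times r_j$ channels are independent, and so

$$E[\mathbf{H}_{j,0}^H\mathbf{H}_{j,0}] = r_j\sigma_{h,j}^2\mathbf{I}. \tag{2.3}$$

By stacking the columns of $\mathbf{Y}_j$, we obtain the familiar linear signal model

$$\mathbf{y}_j = \mathbf{H}_j\mathbf{C}\mathbf{s} + \mathbf{v}_j, \tag{2.4}$$

where

$$\mathbf{y}_j = \mathrm{vect}(\mathbf{Y}_j) \in \mathbb{C}^{r_jG \times 1},$$

$$\mathbf{H}_j = \mathrm{Diag}(\mathbf{H}_{j,0}, \ldots, \mathbf{H}_{j,0}) \in \mathbb{C}^{r_jG \times tG},$$

$$\mathbf{C} = [\mathrm{vect}(\boldsymbol{\Gamma}_1), \ldots, \mathrm{vect}(\boldsymbol{\Gamma}_U)] \in \mathbb{C}^{tG \times U}, \text{ and}$$

$$\mathbf{v}_j = \mathrm{vect}(\mathbf{V}_j) \in \mathbb{C}^{r_jG \times 1}.$$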


Essentially, each column of $\mathbf{C}$ corresponds to a space-time modulation waveform for one data stream. For transmitter design, we are interested in finding $\mathbf{C}$; for receiver design, the signal model suggests that joint detection at each receiver is necessary to mitigate the inter-user interference that inevitably exists when the channel is unknown to the transmitter.¹

2.2 Transmitted Signal Design

2.2.1 Assumptions and Goals

We consider the problem of designing the spreading matrices $\boldsymbol{\Gamma}_i$, $i = 1, \ldots, U$, or equivalently the matrix $\mathbf{C}$, assuming that no channel information is available at the base station transmitter, whereas receiver $j$ knows $\mathbf{H}_j$ perfectly at all times.

There are two main factors influencing the performance of a multiuser MIMO system: multi-access interference (MAI) and diversity gain. To achieve single-user performance for all users, or no MAI, requires that

$$\mathbf{C}^H\mathbf{H}_j^H\mathbf{H}_j\mathbf{C} = \mathbf{D}_j, \tag{2.5}$$

where $\mathbf{D}_j$ is a diagonal matrix, for all $j \in \{1, \ldots, U\}$ and all realizations of the channel $\mathbf{H}_j$. However, since the transmitter does not know $\mathbf{H}_j$, zero MAI is not possible through spreading matrix design. Instead, we introduce a much looser objective, that of achieving zero MAI on average, i.e.

$$E[\mathbf{C}^H\mathbf{H}_j^H\mathbf{H}_j\mathbf{C}] = \mathbf{D}_j, \tag{2.6}$$

where the expectation is taken over the distribution of $\mathbf{H}_j$.

¹ This precludes any sort of precoding for performance improvement.


From (2.3), we have

$$E[\mathbf{H}_j^H\mathbf{H}_j] = r_j\sigma_{h,j}^2\mathbf{I}, \tag{2.7}$$

and hence (2.6) leads to

Design Requirement 1: In order for the average MAI to be zero at each receiver, the spreading matrices represented by $\mathbf{C}$ should be designed so that $\mathbf{C}^H\mathbf{C} = \mathbf{D}$, where $\mathbf{D}$ is a diagonal matrix.
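The step from (2.6) and (2.7) to Design Requirement 1 uses only the fact that $\mathbf{C}$ is deterministic, so that it can be taken outside the expectation:

```latex
\begin{equation*}
E[\mathbf{C}^H\mathbf{H}_j^H\mathbf{H}_j\mathbf{C}]
  = \mathbf{C}^H E[\mathbf{H}_j^H\mathbf{H}_j]\,\mathbf{C}
  = r_j\sigma_{h,j}^2\,\mathbf{C}^H\mathbf{C}.
\end{equation*}
```

The left-hand side is therefore diagonal for every user $j$ exactly when $\mathbf{C}^H\mathbf{C}$ is diagonal.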

While Requirement 1 is very loose and easily satisfied, it is nonetheless important because violating it leads to non-zero MAI, even on average. Therefore we impose the constraint that $\mathbf{C}$ has orthogonal columns. Since $\mathbf{C}$ has $U$ columns and $Gt$ rows, the orthogonality constraint also leads to a strict upper limit on the number of users (symbols) that can be supported in this system:

$$U \leq Gt. \tag{2.8}$$

To tackle the issue of diversity gain, assume that receiver $j$ performs joint maximum likelihood (ML) decoding of $\mathbf{s}$ based on $\mathbf{y}_j$. The transmitted space-time codeword is $\mathbf{S}$ from (2.1), where $\mathbf{S} \in \mathcal{S}$ and the set $\mathcal{S}$ has cardinality $|\mathcal{S}| = M^U$ if $M$-ary modulation is used for each $s_i$. From Tarokh's work [2], we know that full transmit diversity is obtained when $\mathbf{S}_m - \mathbf{S}_n$ has a rank of $t$ for every pair of distinct codewords $(\mathbf{S}_m, \mathbf{S}_n) \in \mathcal{S} \times \mathcal{S}$. Given the structure of (2.1), we have

$$\mathbf{E}_{m,n} = \mathbf{S}_m - \mathbf{S}_n = \sum_{i=1}^{U} (s_{m,i} - s_{n,i})\boldsymbol{\Gamma}_i, \tag{2.9}$$

where $s_{m,i}$ and $s_{n,i}$ are the $i$th symbols in the $m$th and $n$th codewords respectively.

The codeword difference matrix $\mathbf{E}_{m,n}$ has dimensions $t \times G$, so for it to have a rank of $t$ requires

Design Requirement 2: The temporal spreading factor $G$ must be at least equal to the number of transmit antennas, or $G \geq t$.


In addition, a necessary condition for $\mathbf{E}_{m,n}$ to have a rank of $t$ for all codeword pairs is

Design Requirement 3: Every spreading matrix $\boldsymbol{\Gamma}_i$, $i = 1, \ldots, U$, must have a rank of $t$.

The proof of this statement is straightforward: $\mathbf{E}_{m,n}$ has to be full rank for the pair $(\mathbf{S}_m, \mathbf{S}_n)$ with $s_{m,i} \neq s_{n,i}$ and $s_{m,j} = s_{n,j}$ for all $j \neq i$, $j = 1, \ldots, U$, in which case $\mathbf{E}_{m,n}$ is a scalar multiple of $\boldsymbol{\Gamma}_i$. This implies that each spreading matrix $\boldsymbol{\Gamma}_i$ has to be full rank. Note that Design Requirements 2 and 3 are necessary but, as will be shown shortly, not sufficient to achieve full diversity. In the next section, we present a spreading matrix design based on Walsh-Hadamard matrices which meets all three design requirements.

2.2.2 Spreading Matrix Design

Consider a system with $U = Gt$ users², where $G \geq t$ determines the bandwidth expansion of the system, and $t$ can be any positive integer. In the proposed design, $G$ must be a power of two because of the use of Walsh-Hadamard basis vectors. It will be clear shortly that the proposed algorithm works for $U < Gt$ too, and therefore the assumption of $U = Gt$ is non-restrictive and adopted for convenience only.

We first divide the $U$ users into $t$ groups of $G$ users each. Length-$G$ Walsh codes are assigned to each group, and user $g$ ($g = 1, \ldots, G$) in group $n$ ($n = 1, \ldots, t$) is assigned Walsh code $g$, denoted $\mathbf{w}_g$. Observe that the same Walsh code is used by $t$ users in the system.

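Then the spreading matrix $\boldsymbol{\Gamma}_i$ for user $g$ of group $n$ is formed³ by threading $\mathbf{w}_g$ diagonally starting from antenna $n$. To illustrate, suppose $t = 3$ and $G = 4$. Then the four spreading matrices of group one are:

$$\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix},\ 
\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix},\ 
\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \end{bmatrix},\ 
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \end{bmatrix} \tag{2.10}$$

those for group two are:

$$\begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{bmatrix},\ 
\begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & -1 & 0 & 0 \end{bmatrix},\ 
\begin{bmatrix} 0 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 \end{bmatrix},\ 
\begin{bmatrix} 0 & 0 & -1 & 0 \\ 1 & 0 & 0 & 1 \\ 0 & -1 & 0 & 0 \end{bmatrix} \tag{2.11}$$

and the four matrices for group 3 are:

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix},\ 
\begin{bmatrix} 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & -1 \end{bmatrix},\ 
\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 1 & 0 & 0 & -1 \end{bmatrix},\ 
\begin{bmatrix} 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}. \tag{2.12}$$

² For convenience and without loss of generality, we assume that $U = K$, i.e. each data stream belongs to a distinct user.

³ The index $i$ depends on $g$ and $n$ according to $i = (n - 1)G + g$.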

In the rest of the chapter, we will assume without loss of generality that each $\boldsymbol{\Gamma}_i$ is scaled so that $\mathbf{c}_i^H\mathbf{c}_i = 1$, where $\mathbf{c}_i = \mathrm{vec}(\boldsymbol{\Gamma}_i)$. The transmitted symbol energy will then be equal to $E|s_i|^2$.

Referring to the second matrix in (2.11) for instance, we see that user 2 of group 2 ($i = 6$) transmits its symbol $s_6$ over antenna 2 in chip interval 1, $-s_6$ over antenna 3 at chip 2, $s_6$ over antenna 1 at chip 3, and $-s_6$ over antenna 2 at chip 4. All 12 users transmit over the three antennas and four chips using their assigned spreading matrices. For more examples, one can see Appendix A and [30].
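The threading construction for the $t = 3$, $G = 4$ illustration can be sketched in code. The snippet below is our own reading of the procedure, with hypothetical function names: it assumes the Sylvester ordering of the Walsh codes and places chip $k$ of group $n$ on antenna $(n - 1 + k) \bmod t$, leaves the matrices unnormalized, and is only meant to reproduce this small example. It also checks Design Requirements 1 and 3 and shows one codeword difference that loses rank when two users of the same group have equal error terms, which illustrates why Requirements 2 and 3 alone are not sufficient.

```python
from fractions import Fraction

def walsh(G):
    """Rows of the G x G Sylvester-Hadamard matrix (G a power of two)."""
    W = [[1]]
    while len(W) < G:
        W = [r + r for r in W] + [r + [-v for v in r] for r in W]
    return W

def spreading_matrix(g, n, t, G):
    """Unnormalized t x G matrix of user g in group n (both 1-based)."""
    w = walsh(G)[g - 1]
    gamma = [[0] * G for _ in range(t)]
    for k in range(G):
        gamma[(n - 1 + k) % t][k] = w[k]         # thread the code diagonally
    return gamma

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

t, G = 3, 4
groups = [[spreading_matrix(g, n, t, G) for g in range(1, G + 1)]
          for n in range(1, t + 1)]
mats = [m for grp in groups for m in grp]        # list index is i - 1, i = (n-1)G + g
vecs = [[m[a][k] for k in range(G) for a in range(t)] for m in mats]  # column stacking
gram = [[sum(p * q for p, q in zip(u, v)) for v in vecs] for u in vecs]
orthogonal = all(gram[i][j] == 0
                 for i in range(len(mats)) for j in range(len(mats)) if i != j)
ranks = [rank(m) for m in mats]

def difference(e1, e2):
    """Codeword difference when only users 1 and 2 of group 1 are in error."""
    g1, g2 = groups[0][0], groups[0][1]
    return [[e1 * g1[a][k] + e2 * g2[a][k] for k in range(G)] for a in range(t)]

rank_equal = rank(difference(2, 2))              # equal error terms
rank_unequal = rank(difference(2, 1))            # unequal error terms
```

Here `mats[5]` is the matrix of user 2 in group 2 ($i = 6$). In this instance the rank drops to 2 for equal error terms and is restored to 3 when the two terms differ, for example when the two streams are sent with different amplitudes.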

It is straightforward to verify that this design procedure satisfies all the design requirements stated in the last section for any $t$ and all values of $G$ that are powers of two. If we use non-binary spreading vectors $w_g$, we can even relax the requirement that $G = 2^k$, $k \in \mathbb{Z}^+$, by choosing any set of $G$ orthonormal length-$G$ basis vectors which have only non-zero elements. However, the Walsh-code design just described has a very useful and unique property, captured in the following lemma.

Lemma 1 For any $G = 2^k$, where $k \in \mathbb{Z}^+$, and $t \in \mathbb{Z}^+$ such that $G \ge t$, if $U \le Gt$, the proposed design results in $U_{int}$ mutually interfering data streams at each receiver, where

$$ U_{int} = \min(U, 2^{k_0} t), \tag{2.13} $$

and $k_0$ is the integer satisfying $\log_2 t \le k_0 < \log_2 t + 1$, which implies that $k_0 \le k$, regardless of the channel realizations $H_j$, $j = 1, \ldots, U$. This indicates that by increasing $G$ above $2^{k_0}$ to activate more users, the number of interferers is not necessarily increased.

Proof

The number of mutually interfering data streams at receiver $j$ can be found from the maximum number of non-zero elements in the rows of $C^H H_j^H H_j C \in \mathbb{C}^{U \times U}$. Clearly if $U < 2^{k_0} t$, then $U_{int}$ is at most equal to $U$ and this explains the first part of the lemma.

To show that $U_{int}$ does not grow with $U$ for values of $U > 2^{k_0} t$, let $G_0 = 2^{k_0}$ and let $C_0$ be a $tG_0 \times U$ matrix which results from applying our design procedure when $G = G_0$. As well, let $H_{0j}$ be the $H_j$ matrix for this system and fix $U$ at its maximum value of $Gt$ for all values of $G$. Now consider $G = G_1 = 2G_0 = 2^{k_0+1}$. Because of the Walsh-code basis of our design, we can arrange the $2tG_0 \times 2U$ spreading code matrix $C_1$ so that

$$ C_1 = \begin{bmatrix} C_0 & C_0 \\ C_0 & -C_0 \end{bmatrix}, \tag{2.14} $$

where the scaling of $1/\sqrt{2}$ that keeps the columns of $C_1$ at unit norm is left implicit. Furthermore, with the doubling of the spreading factor from $G_0$ to $2G_0$, $H_j$ will have twice its original number of columns and rows, and thus we have a new $H_j$ matrix

$$ H_{1j} = \mathrm{Diag}(H_{0j}, H_{0j}). \tag{2.15} $$

To find $U_{int}$, we form

$$
C_1^H H_{1j}^H H_{1j} C_1
= \begin{bmatrix} C_0^H & C_0^H \\ C_0^H & -C_0^H \end{bmatrix}
\begin{bmatrix} H_{0j}^H H_{0j} & 0 \\ 0 & H_{0j}^H H_{0j} \end{bmatrix}
\begin{bmatrix} C_0 & C_0 \\ C_0 & -C_0 \end{bmatrix}
= \begin{bmatrix} C_0^H H_{0j}^H H_{0j} C_0 & 0 \\ 0 & C_0^H H_{0j}^H H_{0j} C_0 \end{bmatrix}.
$$

Since the maximum number of non-zero elements in each row of this matrix is exactly equal to that of $C_0^H H_{0j}^H H_{0j} C_0$, which is $2^{k_0} t$, the number of interferers remains unchanged even though the number of users is doubled.

By induction, we can now conclude that the number of interferers when $G > G_0$ is the same as when $G = G_0$, i.e. $U_{int} = 2^{k_0} t$. Finally, since the number of interfering users cannot increase when $U$ takes values smaller than $Gt$, the result holds for all values of $U \le Gt$.

The implications of Lemma 1 can be revealed by re-visiting the $G = 4$, $t = 3$ example

presented earlier. In this case, $k_0 = 2$, and so $G = 2^{k_0}$. The number of interferers is the number of users, i.e. $U_{int} = U = 12$. Suppose that $G$ is increased to 8, and the number of users to $U = 24$. Lemma 1 tells us that $U_{int}$ does not increase to 24 but instead remains at 12! This means that a joint detector of the same complexity can be used for both the 12-user and 24-user systems and still give the same performance. If there are only a small number of transmit antennas, say $t = 2$, even the ML detector can be used at each receiver because each user effectively acts inside a four-user system only: regardless of the actual number of users in the system, the number of interferers seen by each user is limited to 3. Also, to achieve high performance with suboptimal detectors, we only need $r_j G \ge U_{int}$; e.g. if $t = 4$, $G = 8$ and $U = 32$, then at each receiver $U_{int} = 16$ (15 interferers per user), and $r_j = 2$ is sufficient.
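The numbers quoted in these examples follow directly from (2.13). A minimal sketch, with hypothetical function names, is:

```python
def k0_for(t):
    """Integer k0 with log2(t) <= k0 < log2(t) + 1, i.e. the smallest k0 with 2**k0 >= t."""
    k0 = 0
    while (1 << k0) < t:
        k0 += 1
    return k0


def num_interfering_streams(U, t, G):
    """U_int = min(U, 2**k0 * t), under the conditions of Lemma 1."""
    if G < t or G & (G - 1) or not 0 < U <= G * t:
        raise ValueError("need G a power of two, G >= t and 0 < U <= G*t")
    return min(U, (1 << k0_for(t)) * t)
```

For example, `num_interfering_streams(12, 3, 4)` and `num_interfering_streams(24, 3, 8)` both evaluate the $t = 3$ case discussed above, with $G = 4$ and $G = 8$ respectively.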

Remarks:

1. Note that Lemma 1 relies only on the Walsh matrix structure (2.14) being used for extensions to higher orders, i.e. for any $G_0 \times G_0$ orthogonal matrix $C_0$, if $C_k$ is a $2^k G_0 \times 2^k G_0$ orthogonal matrix formed using (2.14), then Lemma 1 holds. So any orthogonal matrix may be used as a seed for a whole family of orthogonal matrices of order $2^k G_0$, $k \in \mathbb{Z}^+$, where $G_0$ is the order of the seed matrix. In this work, we are interested in antipodal binary orthogonal matrices because they are easy to generate, and this constraint leads us to Walsh matrices as the natural choice.

2. To multiplex $U = Gt$ users with only $G_0 t$ mutual interferers, as an alternative, one can divide the $G$ chip intervals into $U/(G_0 t)$ time slots, each spanning $G_0$ chips (TDMA). Thus, the dimensionality of the received signal when the receiver has $r_j$ antennas is $r_j G_0$. Assuming that the signal dimensionality has to be no smaller than the number of mutually interfering users for good performance⁴, then with this TDMA-type transmission, $r_j \ge t$. On the other hand, in 2D-STSC, the dimensionality of the received signal is $r_j G$ because the time axis has not been divided, so setting $r_j G$ to a minimum of $G_0 t$ yields $r_j \ge G_0 t / G$. Hence

$$ \frac{r_j(\text{TDMA})}{r_j(\text{2D-STSC})} = \frac{G}{G_0} = 2^{k} \ge 1, \tag{2.16} $$

since $k \ge 0$ (writing $G = 2^k G_0$ as in Remark 1), showing that a TDMA scheme may require more receive antennas than the proposed 2D-STSC method.
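The comparison in Remark 2 can be made concrete with a short sketch. The function names are hypothetical, and rounding up to a whole number of antennas is an assumption added here, so the integer results need not reproduce the exact ratio in (2.16).

```python
def min_rx_antennas_tdma(t):
    """TDMA slots of G0 chips: r_j * G0 >= G0 * t, hence r_j >= t."""
    return t


def min_rx_antennas_2d_stsc(t, G, G0):
    """2D-STSC: r_j * G >= G0 * t, hence r_j >= G0 * t / G (rounded up, at least 1)."""
    return max(1, (G0 * t + G - 1) // G)
```

With $t = 4$, $G_0 = 4$ and $G = 8$, the TDMA arrangement needs four receive antennas under this criterion while 2D-STSC needs two.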

⁴ This corresponds to a relative load of unity.

2.2.3 Constellation and Power Allocation

So far we have designed the spreading matrices to ensure zero average MAI, and to satisfy a necessary condition for full transmit diversity. However, full diversity is guaranteed only if $E_{m,n}$ has full rank. If all users employ the same modulation format, this condition will not be met. Again, consider the example introduced earlier in this section. If the users in the first group all use antipodal BPSK, i.e. $s_1, s_2, s_3, s_4 \in \{-1, +1\}$, and we consider a pair of codewords which differ only in $s_1$, $s_2$, $s_3$ and $s_4$ as follows:

Codeword    s1    s2    s3    s4
A           +1    -1    -1    +1
B           -1    +1    +1    -1

then the rank of $E_{m,n}$ is one.

To overcome the problem, we need to ensure that the users in the same group, which are transmitted from the same antenna, employ different constellations, through constellation rotation for instance, as described by Giraud et al. [31], DaSilva and Sousa [32], or Damen et al. [7]. In this work, we examine the technique in [7], and show through simulations that, for each user $j$, full diversity ($t r_j$) is indeed obtained with ML detection. However, this is not the focus of this work because in the downlink, we are more interested in low-complexity suboptimal receivers. Readers interested in the design of constellation rotations are referred to [7].

With linear detectors, the diversity gain is upper bounded by the number of excess degrees of freedom in the receiver. In [33], it is shown that, for $M$ transmitted symbols per channel use (PCU), with transmit diversity order of one, the achieved diversity order, $L$, at a receiver with $r_j$ antennas is $L = r_j - M + 1$. Extending this result to our system model with transmit diversity order of $t$, we hypothesize that

$$ L = t\,[r_j - M]^+ + 1, \tag{2.17} $$

where $[x]^+ = x$ for $x \ge 0$ and $[x]^+ = 0$ for $x < 0$. This implies that if $t = r_j$, for a full rate transmission ($M = t$), then $L = 1$. The same result is also obtained in [20] for TAST codes with rotated constellation. Therefore, as we show by simulation results, constellation rotation appears to be inadequate to achieve full diversity with linear/suboptimal detectors.
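The hypothesized diversity order (2.17) is easy to tabulate. The helper below is a hypothetical illustration of the formula only, not a simulation result.

```python
def linear_detector_diversity(t, r_j, M):
    """Hypothesized diversity order L = t * [r_j - M]^+ + 1 of equation (2.17)."""
    return t * max(r_j - M, 0) + 1
```

Setting `t = 1` recovers $L = r_j - M + 1$ whenever $r_j \ge M$, and a full-rate system with $t = r_j = M$ gives $L = 1$.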


To improve the achieved diversity gain with a non-linear SIC, we need to maximize

its performance by allocating different powers across the interfering data streams in each

group [34]. This is because the symbols are decoded successively and the first decoded

symbols do not benefit from much interference cancellation. For these first symbols to

be decoded reliably, they must be the ones received with the highest SINR.

With downlink power control, in general, users are allocated different transmit powers

based on their received signal strengths so that the same quality of service is provided to

all users, in spite of their different channel states. Since every user's signal is transmitted

through the same channel to arrive at any given receiver, the differences in transmit

powers are preserved at the receivers. They can then sort the interfering users according

to decreasing SINR, and decode using the SIC to maximum effect. This means that

transmitting with a range of signal powers is a natural result of downlink power control,

and the SIC is well-suited for use on the downlink channel, which is why it is the detector

structure of greatest interest in this work.

Devising an optimal power control algorithm for the downlink channel is beyond the

scope of this work. However, we develop a simple power control scheme which can be

implemented with a very low rate feedback. We consider l different levels of power such

that the interfering users in the same group (the ones transmitted from the same antenna)

are allocated different powers. Since there are t groups and Uint co-channel users, we have

Uint/t co-channel users per group. Therefore it is sensible to define

$$ l = U_{int}/t = \min(G, 2^{k_0}), \tag{2.18} $$

where $U_{int}$ and $k_0$ are defined in equation (2.13). Each user is then required to measure the received signal-to-interference-noise ratio (SINR), compare it with $l$ threshold levels, and then feed back $\log_2 l$ bits to the base station. At the base station, the transmitted power is adjusted so that the user with the higher SINR will get less power. Also, the spreading


codes/matrices are assigned to the users so that, as far as possible, symbols with the

same power are transmitted from different antennas. For SIC to perform correctly, the

ordering of the users has to be sent to the receivers.
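A minimal sketch of this feedback loop is given below. Everything beyond (2.18) is an assumption made for illustration: the threshold values, the use of $l - 1$ thresholds to separate $l$ bins, the geometric spacing `ratio` and the maximum power `p_max` are hypothetical parameters, not values taken from this work.

```python
from bisect import bisect_right


def num_power_levels(G, k0):
    """l = min(G, 2**k0), equation (2.18)."""
    return min(G, 1 << k0)


def feedback_bits(levels):
    """Bits needed to report one of `levels` bins (log2 l when l is a power of two)."""
    return (levels - 1).bit_length()


def sinr_level(sinr_db, thresholds_db):
    """Bin index 0..len(thresholds_db) of a measured SINR; thresholds sorted ascending."""
    return bisect_right(thresholds_db, sinr_db)


def transmit_power(level, p_max, ratio):
    """Base-station rule: a higher reported SINR level receives geometrically less power."""
    return p_max / (ratio ** level)
```

With $G = 4$ and $k_0 = 2$ there are four levels and two feedback bits per user. A user reporting a higher bin index is assigned a lower transmit power.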

The simulation results show that even with this simple power control algorithm, the

two-stage interference canceller (Figure 2.2) explained in Section 2.3.2 is able to achieve

almost the same diversity gain as ML detection, when the bit error rate (BER) is averaged

over all users in the system.

It can be easily verified that with this power allocation scheme, the difference matrix, $E_{m,n}$, is full rank, and full diversity is achieved by ML detection. For instance, in the previous example, if we define $A_k = \sqrt{P_k}$, $k = 1, \ldots, U$, then

$$
E_{m,n} =
\begin{bmatrix}
A_1 - A_2 - A_3 + A_4 & 0 & 0 & A_1 + A_2 + A_3 + A_4 \\
0 & A_1 + A_2 - A_3 - A_4 & 0 & 0 \\
0 & 0 & A_1 - A_2 + A_3 - A_4 & 0
\end{bmatrix},
$$

which has a rank of 3. However, to optimize the performance with the ML receiver, the powers should be defined to maximize the coding gain according to the ML criterion (see Section 1.2.2).
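The rank claims for the $t = 3$, $G = 4$ example can be checked numerically. The sketch below assumes Sylvester-ordered Walsh codes, the sign pattern $(+, -, -, +)$ for the codeword difference, and drops any common scale factor. The function names are hypothetical.

```python
from fractions import Fraction

# Length-4 Walsh codes in Sylvester order (assumed ordering).
W = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]


def rank(M):
    """Matrix rank by Gaussian elimination over exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == len(A):
            break
    return r


def difference_matrix(signs, amplitudes, t=3):
    """Difference matrix of group one: sum over users of sign * amplitude * threaded code."""
    G = len(W)
    E = [[0] * G for _ in range(t)]
    for k in range(G):
        for c in range(G):
            # Group one threads from antenna 1, so chip c goes to antenna (c mod t).
            E[c % t][c] += signs[k] * amplitudes[k] * W[k][c]
    return E
```

With equal amplitudes, `rank(difference_matrix([1, -1, -1, 1], [1, 1, 1, 1]))` is 1, while geometrically spaced amplitudes such as `[1, 2, 4, 8]` give rank 3.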

In this work, concentrating on SIC, the different powers are selected geometrically. Performance improves as the ratio between the powers grows; however, a large ratio may not be practically feasible. The dynamic range of the power for each user has to be taken into account as well. Also, if the users are in the same channel state (which is very unlikely) and they all demand the same BER, the ratio of the allocated powers cannot be chosen arbitrarily.

In a single-user ($K = 1$) system, the same spreading scheme can be employed over different symbols, to design full rate space-time coding. In that case, it gives a structure similar to the threaded algebraic space-time (TAST) code of Damen et al. [7]. However, the proposed design allows us to obtain vastly improved performance with suboptimal detectors, because of the unequal power allocation feature. The number of time slots, $T$, is chosen to be $T = 2^{k_0}$, where $k_0$ is the integer satisfying $\log_2 t \le k_0 < \log_2 t + 1$. Without any knowledge of the channel, we deliberately allocate powers unequally to the

symbols transmitted from the same antenna. As explained before, with this technique,

full diversity is achieved by ML detection. In addition to design simplicity, the main

advantage of this scheme is the ability to obtain almost full diversity with two-stage IC,

which is less complex compared to ML detection. In fact this would lead to a new family

of space-time codes with reduced receiver complexity. In Section 2.5 we have compared

the performance of these codes with TAST codes when two-stage IC is applied at the

receiver.

2.3 Receiver Structures

We study both optimal (ML) and suboptimal (SIC) detection at the receiver.

2.3.1 Joint ML Detection

In the ML detection algorithm, each receiver searches for the symbol vector, $s$, that maximizes the conditional density $P(y_j \mid s)$, which is equivalent to:

$$
\hat{s} = \arg\min_{s} \| y_j - H_j C s \|^2
= \arg\min_{s} \left( s^H C^T H_j^H H_j C s + y_j^H y_j - y_j^H H_j C s - s^H C^T H_j^H y_j \right).
\tag{2.19}
$$

From Lemma 1, we know that for $\log_2 t \le k_0 < \log_2 t + 1$, if $G = 2^k > 2^{k_0}$, then $C^T H_j^H H_j C = \mathrm{Diag}(A_1, A_2, \ldots, A_{2^{k-k_0}})$, where $A_q$, $q = 1, \ldots, 2^{k-k_0}$, is a $2^{k_0} t \times 2^{k_0} t$ matrix. In other words, we have $2^{k-k_0}$ orthogonal subgroups, $s_q$, $q = 1, \ldots, 2^{k-k_0}$, of the user symbols, and each $s_q$ has $2^{k_0} t$ symbols. Therefore equation (2.19) can be written as

$$
\hat{s} = \arg\min_{s} \sum_{q=1}^{2^{k-k_0}} \left( s_q^H A_q s_q - s_q^H r_q - r_q^H s_q \right)
= \arg\min_{\{s_q\}} \sum_{q=1}^{2^{k-k_0}} f(s_q, r_q),
\tag{2.20}
$$

where $r \triangleq C^T H_j^H y_j$, and $r_q$ is the vector of the elements of $r$ associated with $s_q$. Therefore ML optimization can be performed over each subgroup separately. In fact, instead of one optimization over $U$ symbols, we have $2^{k-k_0}$ parallel optimizations over $2^{k_0} t$ symbols, which significantly decreases the computational complexity of ML detection. Since we have parallel detectors, to improve the achieved diversity gain, constellation rotation or power allocation needs to be applied only over each subgroup $s_q$.
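The decomposition in (2.20) can be illustrated with a brute-force search. The sketch below is restricted, as a simplifying assumption, to real-valued blocks and a BPSK alphabet, for which the per-block cost reduces to $s_q^T A_q s_q - 2 r_q^T s_q$. The function names and the toy numbers in the usage note are hypothetical.

```python
from itertools import product


def block_cost(A, r, s):
    """Real-valued form of f(s_q, r_q): s^T A s - 2 r^T s."""
    n = len(s)
    quad = sum(s[i] * A[i][j] * s[j] for i in range(n) for j in range(n))
    return quad - 2 * sum(ri * si for ri, si in zip(r, s))


def ml_per_block(blocks, rs, alphabet=(-1, 1)):
    """Minimize each block on its own and concatenate the decisions."""
    decisions = []
    for A, r in zip(blocks, rs):
        candidates = product(alphabet, repeat=len(r))
        decisions.extend(min(candidates, key=lambda s: block_cost(A, r, s)))
    return decisions


def ml_joint(blocks, rs, alphabet=(-1, 1)):
    """Reference: one exhaustive search over all symbols at once."""
    sizes = [len(r) for r in rs]

    def total(s):
        cost, pos = 0, 0
        for A, r, n in zip(blocks, rs, sizes):
            cost += block_cost(A, r, s[pos:pos + n])
            pos += n
        return cost

    return list(min(product(alphabet, repeat=sum(sizes)), key=total))
```

For two toy blocks such as `[[2, 1], [1, 2]]` and `[[3, -1], [-1, 1]]` with `rs = [[1, -3], [2, 0.5]]`, both searches return the same decisions, but the per-block search visits $2 \cdot 2^2$ candidates instead of $2^4$.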

2.3.2 Multi-Stage Successive Interference Cancellation

Interference cancellation is applied at each receiver to improve the achieved diversity gain by combating the interference from other users' symbols. In the multiuser detection literature, SIC is an effective approach to mitigate the interference between the users' symbols (see [35–37], and their references).

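[Block diagram: the received vector $y_j$ passes through successive MMSE filters $w_1, \ldots, w_{U_{int}}$, each followed by cancellation of the detected symbol's contribution; the matched filter $w_{match}$ (MF) then detects $\hat{s}_j$.]

Figure 2.2: Structure of two-stage SIC.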

In this work, a two-stage IC is considered. In the first stage, MMSE-SIC is employed: the symbols are detected using a linear MMSE detector, and cancelled in descending order according to their power levels. In the second stage, after cancelling out all other symbols, the desired symbol is detected based on matched filtering. The structure of receiver $j$ is shown in Figure 2.2. As explained above, interference cancellation is required only over the interfering symbols. Hence, if $C_I$ is the spreading code matrix for the interfering symbols, sorted in descending order of power, we define $H_{c,j} = H_j C_I$. With $y_{1,j} = y_j$, the notation in Figure 2.2 follows from the following two-stage algorithm:

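Stage I:

Recursion ($m = 1 : U_{int}$):

$$
\begin{aligned}
H_{m,j} &= (H_{c,j})_{m:U_{int}} \\
a_m &= (H_{c,j})_m \\
w_m &= \left( H_{m,j} H_{m,j}^H + \sigma_j^2 I \right)^{-1} a_m \\
\hat{s}_m &= Q(w_m^H y_{m,j}) \\
y_{m+1,j} &= y_{m,j} - a_m \hat{s}_m
\end{aligned}
$$

Stage II:

$$
\begin{aligned}
z_j &= y_j - \sum_{n \ne j}^{U_{int}} \hat{s}_n a_n \\
w_{match} &= H_j (C)_j \\
\hat{s}_j &= Q(w_{match}^H z_j)
\end{aligned}
$$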

where $(H_{c,j})_{m:U_{int}}$ and $(H_{c,j})_m$ denote columns $m, \ldots, U_{int}$ and the $m$th column of $H_{c,j}$, respectively, and $(C)_j$ is the spreading code vector for user $j$. In addition, $w_m$ and $w_{match}$ are the MMSE and matched-filter weight vectors, respectively, $\sigma_j^2$ is the noise variance, and $Q(\cdot)$ is the quantization operation appropriate to the constellation in use.
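The two stages can be sketched in plain Python under simplifying assumptions that are not part of the algorithm above: the effective channel is real-valued, all users send BPSK so that $Q(\cdot)$ is a sign decision, and the per-user amplitudes are absorbed into the columns of $H_{c,j}$. All function names are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def bpsk(x):
    """Q(.) for BPSK: sign decision."""
    return 1.0 if x >= 0 else -1.0


def two_stage_sic(y, Hc, sigma2, desired):
    """Hc: rows of the real effective channel, columns sorted by descending power.

    Returns the stage-I decisions and the stage-II decision for column `desired`.
    """
    n_rx, U = len(Hc), len(Hc[0])
    cols = [[row[m] for row in Hc] for m in range(U)]
    y_m, s_hat = y[:], []
    for m in range(U):  # Stage I: MMSE detection followed by cancellation
        R = [[sum(cols[k][i] * cols[k][j] for k in range(m, U))
              + (sigma2 if i == j else 0.0) for j in range(n_rx)]
             for i in range(n_rx)]
        w = solve(R, cols[m])
        s = bpsk(dot(w, y_m))
        s_hat.append(s)
        y_m = [yi - ai * s for yi, ai in zip(y_m, cols[m])]
    z = y[:]  # Stage II: cancel all other streams, then matched filter
    for n in range(U):
        if n != desired:
            z = [zi - ai * s_hat[n] for zi, ai in zip(z, cols[n])]
    return s_hat, bpsk(dot(cols[desired], z))
```

The matrix `R` is $H_{m,j} H_{m,j}^H + \sigma_j^2 I$ restricted to the streams not yet cancelled, so a positive `sigma2` also keeps the linear system well posed when fewer columns than receive dimensions remain.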

For two transmit antennas and $G \ge 2$, since the number of interfering users is small ($U_{int} = 4$), the first stage (SIC) can also be implemented by matched filtering, instead of MMSE, achieving the same performance. In this case, at the first stage we have:

$$ w_m = H_j (C)_m. $$

This simply equalizes the channel, and then de-spreads each user symbol with its own spreading code. The simulation results show that the two-stage matched filter IC, incorporating power control, provides almost the same performance as ML detection.

2.4 Comparison With Other STC-CDMA Transceivers

Under flat fading conditions, an alternative methodology for transmission over the downlink in a MIMO multiuser channel is to assign an orthogonal spreading vector to each user, perform single-user STC on each user's message sequence, and then spread (in time only) user $j$'s signals on every antenna with user $j$'s spreading vector. The existing transmission strategies are based on BLAST [22] or orthogonal STBC [21, 23]. There are three proposals in [22]: (1) Spreading all the symbols of one user with the same code to maximize the spectral efficiency. (2) Assigning each user $t$ orthogonal spreading sequences, and then spreading each antenna's signal with its own spreading code. (3) Transmitting each symbol simultaneously over $n \le t$ antennas with $n$ different spreading codes, to improve the diversity gain. We have compared the above techniques with the proposed scheme, 2D-STSC, in Table 2.1, where spectral efficiency, $e_s$, is defined as the number of transmitted symbols per channel use.

Scheme           Spectral efficiency, e_s    Diversity gain    Number of users, U
STBC-CDMA        e_s ≤ 1                     t·r               G
BLAST-CDMA(1)    min(t, r)                   r                 G
BLAST-CDMA(2)    min(t, r)/t                 r                 G/t
BLAST-CDMA(3)    min(t, r)/n                 n·r               G/n
2-D STSC         min(t, r) ≤ e_s ≤ t         t·r               min(t, r)·G ≤ U ≤ t·G

Table 2.1: Comparison between different STC schemes for the downlink in a MIMO multiuser channel

It should be mentioned that TAST codes, proposed in [7] for single-user systems, can also be extended to multiuser systems in the same way as explained for STBC in [23]. This may achieve the same spectral efficiency and diversity gain as 2-D STSC, but we are bound to transmit a certain number of symbols to each user and the maximum number of users is $G$. Besides, TAST codes are optimized based on the ML criterion, which does not guarantee high performance (full diversity) with suboptimal detectors. In [38], s
