Copyright
by
Ren Wu
2008
The Dissertation Committee for Ren Wu
certifies that this is the approved version of the following dissertation:
Multiple-Antenna Wireless Communications:
Detection and Estimation with Smart Antennas,
and Space-Time Code Design Considerations
Committee:
Ioannis Psaromiligkos, Supervisor
Milica Popovich
Jan Bajcsy
Multiple-Antenna Wireless Communications:
Detection and Estimation with Smart Antennas,
and Space-Time Code Design Considerations
by
Ren Wu, M.Eng.
Dissertation
Presented to the Faculty of Engineering of
McGill University
in Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
McGill University
December 2008
To my family.
Acknowledgments
This research work would not have been possible without the help and support
of many people. In the first place, the author wishes to express his gratitude to his
supervisor, Prof. Dr. Ioannis Psaromiligkos, who was abundantly helpful and offered
invaluable assistance, support and guidance. By guiding the author through the
initial research work, Prof. Dr. Ioannis Psaromiligkos ensured that the author was
well positioned from the early stage. Prof. Dr. Ioannis Psaromiligkos has provided
endless encouragement and support, instructions and guidance, scientific insights and
inspiration, and academic and research knowledge and experience to the author in
every aspect of his PhD study. Prof. Dr. Ioannis Psaromiligkos has given valuable
instruction and supervision on the scientific and engineering areas to explore and on
the direction to take in pursuit of scientific truth. Along the way, all of the advice,
suggestions and discussions turned out to be extremely helpful in clarifying uncertainty
and removing ambiguity.
Deepest gratitude is also due to the members of the supervisory committee,
Prof. Dr. Milica Popovich and Dr. Jan Bajcsy, without whose knowledge and
assistance this study would not have been successful. The author attended both
professors’ classes and learnt essential knowledge that has been beneficial to
his study. Lessons given by Prof. Dr. Milica Popovich provided the author with
a solid background in the area of antenna systems. Lessons taught by Prof. Dr. Jan
Bajcsy equipped the author with fundamental and advanced wireless digital
communications theory. In addition, they kindled the author’s interest in research in
wireless digital communications.
The author cannot end without thanking his family, whose constant encour-
agement and dedication inspired him and supported him all along the way through
his study. He is grateful to his wife Yiying Zuo, and his parents Hanrong Wu and
Jianfen Li.
Ren Wu
McGill University
December 2008
Multiple-Antenna Wireless Communications:
Detection and Estimation with Smart Antennas,
and Space-Time Code Design Considerations
Publication No.
Ren Wu, Ph.D.
McGill University, 2008
Supervisor: Ioannis Psaromiligkos
The main theme of this thesis is wireless communications using multiple anten-
nas. The thesis consists of four topics on smart antenna technology, its applications
to direct sequence code division multiple access (DS/CDMA) communications, and
multiple-input multiple-output wireless communications. The first problem under
consideration is the joint estimation of direction-of-arrival (DoA), propagation de-
lay, and complex channel gain for antenna-array DS/CDMA communications over
frequency selective multipath channels. We propose a subspace based MUSIC-type
estimation algorithm which utilizes the spatial smoothing preprocessing technique.
The proposed algorithm essentially breaks the multipath induced coherency within
the received signals and recovers the full signal subspace spanned by the dominant
signal paths of all users. This allows for the use of MUSIC-type DoA and delay
estimators for individual paths of a particular user. We then describe a new crite-
rion for detecting the number of signals impinging on a uniform linear array (ULA),
which exploits eigenvector information of the sample array covariance matrix and
makes explicit use of the peak information of the MUSIC spectrum. In the third part
we present an iterative weight matrix approximation (IWMA) algorithm. IWMA
computes an approximation to the optimum weight matrix used by weighted spatial
smoothing (WSS) to completely decorrelate input sources and generate a diagonal
source covariance matrix. A useful observation regarding IWMA is that the gen-
erated matrix is suitable as a basis for subspace-type DoA estimation. In the last
part we discuss two deterministic measures for designing linear processing space-time
block codes (a.k.a. linear dispersion codes). The first measure is obtained by ap-
plying Jensen’s Inequality to the mutual information criterion for linear dispersion
codes. We show that there is a tractable relationship between this measure and the
mutual information criterion. The second measure is a natural extension of the con-
ditions required for complex linear processing orthogonal designs. The relationship of
the second measure to the total-squared-correlation (TSC) is revealed. The
connection and difference between the set of conditions and the LDC mutual information
are illustrated via the first and the second measures we obtained.
Multiple-Antenna Wireless Communication Systems:
Detection and Estimation with Smart Antennas, and
Space-Time Code Design Considerations
Publication No.
Ren Wu, Ph.D.
McGill University, 2008
Supervisor: Ioannis Psaromiligkos
Wireless communications using multiple antennas constitute the main theme
of this thesis, which addresses four topics concerning smart antenna technology,
its application to direct-sequence CDMA communications, and
multiple-input multiple-output (MIMO) communications. The first
topic is the joint estimation of the directions of arrival, the propagation
delay and the complex channel gain for CDMA communications over
frequency-selective multipath channels. We propose a subspace-based MUSIC-type
estimation algorithm that uses spatial smoothing as a preprocessing step.
The proposed algorithm essentially breaks the coherence induced by the multiple
paths in order to fully recover the signal subspace created by the
dominant signals of all users. This allows the use of MUSIC estimators
of the directions of arrival and of the delay for the signals of a given user. We
describe a new criterion for detecting the number of signals received on a uniform
linear antenna array, which exploits the eigenvectors of the sample covariance
matrix and uses the peak information of the MUSIC spectrum.
Thirdly, we present an iterative weight matrix approximation (IWMA)
algorithm, which computes an approximation of the optimal weight matrix
used by weighted spatial smoothing (WSS) in order to completely decorrelate the
input sources. With IWMA, the generated matrix can be used as a basis
for subspace-type estimation of the directions of arrival. Finally, we discuss
two deterministic measures for designing linear dispersion codes.
The first measure is obtained by applying Jensen’s Inequality to the mutual
information criterion for linear dispersion codes. We show that there is a
tractable relationship between this first measure and the mutual information criterion.
The second measure is a natural extension of the conditions required for complex
linear processing orthogonal designs. We bring to light the relationship of
the second measure to the total squared correlation (TSC). These two measures
illustrate the connection and the difference between the set of conditions and the mutual
information of linear dispersion codes.
Contents
Acknowledgments
Abstract
Abrégé
List of Figures
Chapter 1 Introduction
1.1 Wireless Digital Communications in Spatial and Temporal Dimensions
1.2 Multiple-Antenna Wireless Communications
1.3 Smart Antenna Based Techniques
1.4 The Parameter Estimation Problem in Antenna Array Signal Processing
1.4.1 The Problem of Number of Signals Detection
1.4.2 Estimation of Directions-of-Arrival
1.4.3 Joint DoA and Delay Estimation in Array CDMA Communications
1.5 Multiple-Input Multiple-Output Wireless Communications
1.6 Objective of the Thesis and Summary of the Contributions
1.6.1 Objective of the Thesis
1.6.2 Summary of the Contributions
1.7 Organization of the Thesis
Chapter 2 Background
2.1 DS/CDMA Wireless Communications
2.2 Antenna Array - ULA
2.3 Multiple-Input Multiple-Output and Space-Time Block Coding
2.4 Notes on the Notations Used In The Thesis
2.5 Systems, Models, Signals and Assumptions
Chapter 3 Spatial-Smoothing Based MUSIC-Type Joint DoA and Time-Delay Estimation
3.1 Introduction
3.2 System Model
3.3 Spatial Smoothing Based Joint Direction of Arrival and Delay Estimation
3.4 Channel Estimation And Removal of Timing Ambiguity
3.5 SS Based Joint DoA-Delay Estimation Using Chip-Shifted Estimates of the Covariance Matrix
3.5.1 Joint DoA and Delay Estimation Using Chip-Shifted Covariance Matrix Estimates
3.5.2 Joint DoA and Delay Estimation Using the Non-Shifted Covariance Matrix Estimate
3.6 Simulation Results
3.7 Conclusion
Chapter 4 The MUSIC MDL Criterion
4.1 Introduction
4.2 System Model
4.3 Background
4.3.1 Optimal MDL-based signal enumeration criterion
4.3.2 Suboptimal MDL-based criterion
4.4 Detection Criterion Exploiting Peaks in the MUSIC Spectrum
4.4.1 Observations and Remarks
4.5 Simulation and Performance Evaluation
4.6 Conclusion
4.7 Appendix I - Proof of Theorem 3
4.8 Appendix II - Proof of Theorem 4
Chapter 5 Weighted Spatial Smoothing Based Iterative Weight Matrix Approximation
5.1 Introduction
5.2 System Model and Background
5.2.1 Weighted Spatial Smoothing
5.3 Proposed Iterative Weight Matrix Approximation Algorithm
5.4 Theoretical Analysis
5.5 DoA Estimation Using $W_n$
5.6 Simulations
5.7 Conclusion
Chapter 6 On Two Deterministic Measures for Linear Processing Space-Time Block Codes
6.1 Introduction
6.2 System Model and the Linear Processing ST Coding Scheme
6.2.1 The Equivalent MIMO Channel
6.3 Jensen’s Inequality and Relaxation of LDC Mutual Information
6.4 The GTSC and TSA Metrics
6.5 Lower Bounds for the GTSC Metric
6.5.1 Bound That is Analogous to Welch’s Bound
6.5.2 Lower Bounds for Other Cases of r and n
6.5.3 Lower Bounds - Further Results
6.6 Computer Simulations
6.6.1 Jensen’s Relaxation of LDC-MI
6.6.2 Examples of LP-STBCs: Constellation Rotation and Product Distance Gain
6.6.3 GTSC-TSA Metric and the LDC Mutual Information Criterion
6.7 Conclusion
Chapter 7 Conclusion, Discussion, and Future Work
7.1 Spatial Smoothing Based JADE-MUSIC
7.2 The MUSIC-MDL Criterion
7.3 The IWMA Algorithm
7.4 Two Deterministic Design Criteria for LP-STBC
7.5 Future Work
Bibliography
List of Figures
2.1 DS/CDMA transmitters and a DS/CDMA receiver with antenna array
2.2 Direct-sequence spread spectrum transmission
2.3 Multipath fading channel
2.4 Uniform Linear Array
2.5 ULA system model
2.6 Conventional spatial smoothing
2.7 Multiple-Input Multiple-Output system
3.1 Spatial-smoothing-based MUSIC spectrum vs. DoA for six possible values of the delay of user 0
3.2 MSE of joint DoA and delay estimator vs. SNR of user 0
3.3 MSE of joint DoA and delay estimator vs. number of samples
3.4 MSE of channel estimator vs. SNR of user 0
3.5 Probability of ambiguity resolution vs. SNR of user 0
3.6 Spectrum of the proposed spatial-smoothing-based MUSIC algorithm (using chip-shifted covariance matrices).
3.7 Spectrum of the proposed spatial-smoothing-based MUSIC algorithm (using a non-time-shifted matrix).
3.8 Probability of acquisition vs. SNR of user 0
3.9 MSE of joint DoA and delay estimator vs. SNR of user 0
4.1 Number of signals detection: MDL$_{\text{MUSIC}}$ vs. MDL; 10-element ULA, 4 equal-power sources; in terms of SNR (dB); number of samples is 1500; averaged over 400 Monte-Carlo runs.
4.2 Number of signals detection: MDL$_{\text{MUSIC}}$ vs. MDL; 10-element ULA, 8 equal-power sources; in terms of number of samples; averaged over 400 Monte-Carlo runs.
5.1 Operations of the IWMA algorithm: averaged squared error of $W_i$ versus number of iterations.
5.2 Performance of DoA estimation using IWMA-generated weight matrix: MSE of DoA estimates versus system SNR in dB.
5.3 Operations of the IWMA algorithm: averaged squared error of $W_i$ versus system SNR in dB.
5.4 Performance of DoA estimation using IWMA-generated weight matrix: MSE of DoA estimates versus system SNR in dB.
6.1 The analogy of the Row Column Equivalence
6.2 Graphical representation of $\{A_i\}$’s for [3,2,2]
6.3 LDC-MI vs. $\frac{1}{2T}\log\det\left(I_{2r} + \frac{\rho N}{M} Z\right)$ for 3I2O: SNR=20dB; r = 3; produced from $10^5$ randomly generated LPM’s.
6.4 An STBC obtained by 2 × [4, 4, 4] design of ψ = 43.639 and optimally rotated QAM.
6.5 An LP-STBC obtained by the two-step design procedure: first obtain a set of LPM designs with ψ ≤ 48; then use QAM signalling and maximize with respect to the product distance gain.
6.6 An STBC obtained by 2 × [4, 4, 4] design of ψ = 68 and QAM constellation; versus QOSTBC (QAM).
6.7 LDC-MI vs. GTSC metric for 2I2O: SNR=20dB; r = 3; produced from $10^5$ randomly generated LPM’s.
6.8 LDC-MI vs. GTSC metric for 3I2O: SNR=20dB; r = 3; produced from $10^5$ randomly generated LPM’s.
Glossary
AIC Akaike Information Criterion
BER Bit Error Rate
CDMA Code Division Multiple Access
CML Conditional Maximum Likelihood
CMLE Conditional Maximum Likelihood Estimate/Estimator
CRB Cramér-Rao Bound
CSI Channel State Information
CSS Conventional Spatial Smoothing
DEML DEcoupled Maximum Likelihood
DoA Direction of Arrival
DS/CDMA Direct Sequence/Code Division Multiple Access
EDGE Enhanced Data Rates for GSM Evolution
ESPRIT Estimation of Signal Parameters via Rotational Invariance Techniques
ETSI European Telecommunications Standards Institute
EVD EigenValue Decomposition
FDMA Frequency Division Multiple Access
FER Frame Error Rate
GIS Geographic Information System
GPRS General Packet Radio Service
GPS Global Positioning System
GSM Global System for Mobile communications
GTSC Generalized TSC
IQML Iterative Quadratic Maximum Likelihood
IWMA Iterative Weight Matrix Approximation
JADE Joint Angle and Delay Estimation
LBS Location-Based Service
LDC Linear Dispersion Code
LDC-MI Linear Dispersion Code-Mutual Information
LP-STBC Linear Processing STBC
LPM Linear Processing Matrix
LS Least Squares
LSML Large Sample Maximum Likelihood
LTE Long Term Evolution
MAI Multiuser Access Interference
MAN Metropolitan Area Network
MDL Minimum Description Length
MIMO Multiple-Input Multiple-Output
MLE Maximum Likelihood Estimate/Estimator
MODE Method Of Direction Estimation
MUSIC MUltiple SIgnal Classification
MVDR Minimum Variance Distortionless Response
OFDM Orthogonal Frequency Division Multiplexing
OSMDL Order Statistics MDL
OSTBC Orthogonal STBC
PN Pseudo Noise
QAM Quadrature Amplitude Modulation
QOSTBC Quasi-Orthogonal STBC
RHS Right-Hand Side
SISO Single Input Single Output
SNR Signal to Noise Ratio
SS Spatial Smoothing
STBC Space-Time Block Code
SVD Singular Value Decomposition
TDMA Time Division Multiple Access
TLS Total Least Squares
TSA Total Squared Amicability
TSC Total Squared Correlation
ULA Uniform Linear Array
UMTS Universal Mobile Telecommunications System
V-BLAST Vertical Bell Laboratories Layered Space-Time
WiMAX Worldwide Interoperability for Microwave Access
WSF Weighted Subspace Fitting
WSS Weighted Spatial Smoothing
Chapter 1
Introduction
1.1 Wireless Digital Communications in Spatial and
Temporal Dimensions
Since the last decade of the 20th century, mobile telecommunications and wireless
personal communications have experienced an unprecedented, globally spreading
boom. This is due in particular to the success of the GSM (Global System for Mobile
communications) [1] digital cellular phone system, which has been widely deployed
around the world. GSM is a system based on the TDMA (Time Division Multiple
Access) and FDMA (Frequency Division Multiple Access) technologies. In addition
to TDMA and FDMA, the technology of CDMA (Code Division Multiple Access) [2]
has been standardized and widely deployed.¹ Recent developments in digital wireless
communication technology are further driven by new upper-layer user applications.
For example, the incorporation of GPS (Global Positioning System) devices in cellular
phones has created GIS (Geographic Information System) related services and
applications (e.g., location-based services (LBS)) which further push the limits of
wireless communications.
¹ CDMA first appeared as a proposed standard in the early 90s.
Currently, wide-range wireless personal communications are mainly narrowband
voice communications and low-to-medium rate data communications. Broadband
wireless communications, on the other hand, are expected to become increasingly
popular in the near future. Smart phones and laptops/notebooks (possibly with
reduced dimensions) that are enabled by broadband wireless communication
technologies are envisioned to grow in popularity. Customers could benefit from these
devices and technologies and obtain substantially improved access to the future mobile
Internet. One important broadband MAN-scale (Metropolitan Area Network) mobile data
communication system is the WiMAX (Worldwide Interoperability for Microwave
Access) system [3]. Important features of WiMAX include the OFDM (Orthogonal
Frequency Division Multiplexing) wideband modulation technology and the
multiple-antenna enabled MIMO (Multiple-Input Multiple-Output) communication scheme.
OFDM is a physical layer communication technique that is suitable for broadband
wireless transmission and reception in hostile multipath frequency-selective
environments. The multipath problem is conventionally tackled by time-domain equalization
at the receiver, which is computationally demanding. In an OFDM system, the strategy
for combating multipath fading is to transmit several low-rate data streams in parallel
across the available spectrum, where each stream experiences a flat-fading
channel. MIMO technology, on the other hand, is capable of increasing a
communication system’s capacity multi-fold by means of the spatial dimension of
communication that becomes available when multiple antennas are deployed at both the
transmitter and receiver. With MIMO systems, a high spectral efficiency can be achieved by
exploiting the new spatial channels existing within the systems. The MIMO and
OFDM technologies underlie not only WiMAX but also the IEEE 802.11n WLAN
(Wireless Local Area Network) system. Furthermore, the 3GPP (3rd Generation
Partnership Project) LTE (Long Term Evolution) standardization body also advo-
cates OFDM and MIMO as key technologies for the physical layer. 3GPP LTE is
a strong competitor to WiMAX and consists of a comprehensive set of technologies,
systems, protocols and specifications. Its goal is to combine the most recent state-
of-the-art developments in wireless digital communications with ETSI’s (European
Telecommunications Standards Institute) successful engineering practices in the field
of cellular communications. These practices in the past resulted in the introduction
of GSM, GPRS (General Packet Radio Service), EDGE (Enhanced Data Rates for
GSM Evolution) and UMTS (Universal Mobile Telecommunications System) wireless
systems and standards.
This thesis presents research work in the area of multiple-antenna wireless
communications. Two scenarios of multiple-antenna communications are
considered: in the first, only the receiver employs multiple antennas, while in
the second, both the transmitter and receiver employ multiple antennas.
1.2 Multiple-Antenna Wireless Communications
The use of multiple antennas (at either the receive side or both the transmit
and receive sides) in wireless point-to-point communications adds an extra spatial
dimension to the existing time and frequency dimensions. With the addition of the
new dimension, new transmission and reception methods and techniques can be used
to improve system reliability and spectral efficiency relative to single-antenna
systems. Several representative techniques have been developed for this
purpose, including [4], [5]:
1. Spatial multiplexing that improves spectral efficiency and increases data transfer
rate of wireless systems;
2. SDMA (Space-Division Multiple Access) that provides a new type of multiple
access technology;
3. Transmit beamforming that takes advantage of partial channel knowledge at
the transmitter side;
4. Interference-nulling achieved through array signal processing techniques such as
receiver-side beamforming;
5. Spatial transmit and receive diversity, achieved via space-time coding
and diversity combining at the receiver, respectively.
The above list provides only a glimpse of the possible benefits that
multiple-antenna systems can bring to contemporary wireless communications. Generally
speaking, however, we can divide the current multiple-antenna techniques into two
categories: smart antenna based techniques and MIMO based techniques.
1.3 Smart Antenna Based Techniques
Smart antenna technology consists of signal processing techniques and
methods that use multiple antennas to improve system performance and
reliability over conventional single-antenna systems. As an example, an antenna array
can be used to mitigate fading. With multiple antennas at the receiver,
and assuming that there is sufficient separation between the antennas, the
receiver's resistance to fading (i.e., the system reliability) can
be substantially improved by combining independent copies of the transmitted signal
obtained from the multiple antennas. This is a spatial diversity technique. On the other
hand, the array signal processing technique of beamforming works on received signals
that are correlated with each other. It maximizes the Signal-to-Noise Ratio (SNR)
of the intended signal while achieving interference nulling (cancellation) of unwanted
signals. In the next section we discuss several topics within the research area of smart
antennas.
1.4 The Parameter Estimation Problem in Antenna
Array Signal Processing
When using an antenna array at the receiver side, several key parameters of
the input signals must first be estimated before subsequent processing of the signals
and symbol detection can be performed. Of these parameters of interest, the number
of signals and the directions-of-arrival (DoAs) are two of the most important ones.
This section provides an overview of these two parameter estimation problems. The
major focus is on Uniform Linear Arrays (ULAs), the antenna elements of which lie
along one dimension in space and are uniformly spaced.
1.4.1 The Problem of Number of Signals Detection
In general, detection of the number of signals (for historical reasons, “detection”
is used more often than “estimation” for this parameter, although
both terms can be used) is done by computing various statistics from the eigenvalues
of the sample covariance matrix of the antenna array. The traditional approach to the
problem, proposed and discussed by Wax and Kailath [8],
is based on information-theoretic criteria for model order selection, namely the MDL
(Minimum Description Length) principle [6] and the Akaike Information Criterion
(AIC) [7]. In this method, the eigenvalues of the sample covariance matrix of the
antenna array are obtained and sorted, and then the multiplicity of the smallest
eigenvalue is identified. A penalty term, formulated according to the AIC or MDL
criterion, is introduced within the algorithm. The resultant test is a function of
the hypothesized number of signals k, and its minimizer is the estimate
of the number of input signals.
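The eigenvalue-based test described above can be sketched in a few lines of Python. This is a minimal illustration rather than the exact formulation of [8]: it assumes the commonly quoted form of the criteria, in which the log-likelihood term is the log-ratio of the geometric to the arithmetic mean of the smallest eigenvalues and the penalty counts k(2p − k) free parameters. The function names, the steering-vector sign convention, and the simulation parameters are hypothetical choices made for the example.

```python
import numpy as np


def simulate_ula_snapshots(num_sensors, electrical_angles, num_snapshots,
                           snr_db, seed=0):
    """Generate ULA snapshots for uncorrelated equal-power sources (toy model).

    Assumption: steering vectors use exp(+j*m*phi) and the noise has unit variance.
    """
    rng = np.random.default_rng(seed)
    phis = np.asarray(electrical_angles, dtype=float)
    sensor_idx = np.arange(num_sensors)[:, None]
    steering = np.exp(1j * sensor_idx * phis[None, :])  # shape (p, K)
    amplitude = 10.0 ** (snr_db / 20.0)
    shape_s = (phis.size, num_snapshots)
    shape_n = (num_sensors, num_snapshots)
    signals = amplitude * (rng.standard_normal(shape_s)
                           + 1j * rng.standard_normal(shape_s)) / np.sqrt(2.0)
    noise = (rng.standard_normal(shape_n)
             + 1j * rng.standard_normal(shape_n)) / np.sqrt(2.0)
    return steering @ signals + noise


def estimate_num_signals(snapshots, criterion="mdl"):
    """Return the k in {0, ..., p-1} that minimizes the AIC or MDL test."""
    p, n = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n            # sample covariance
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]        # descending order
    eig = np.maximum(eig, np.finfo(float).tiny)         # guard against log(0)
    scores = []
    for k in range(p):
        tail = eig[k:]                                  # p - k smallest eigenvalues
        # log(geometric mean / arithmetic mean) <= 0, zero iff all equal
        log_ratio = np.mean(np.log(tail)) - np.log(np.mean(tail))
        neg_log_lik = -n * (p - k) * log_ratio
        free_params = k * (2 * p - k)
        if criterion == "aic":
            scores.append(2.0 * neg_log_lik + 2.0 * free_params)
        else:
            scores.append(neg_log_lik + 0.5 * free_params * np.log(n))
    return int(np.argmin(scores))
```

For instance, `estimate_num_signals(simulate_ula_snapshots(6, [0.5, -1.2], 1000, 10.0))` runs the MDL test on a simulated 6-element array with two sources.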
It has been shown (e.g., [9]) that the estimator obtained via the AIC criterion
is not a consistent estimator² and that it tends to overestimate the true value of
the parameter asymptotically as the number of samples increases, while the
estimator obtained via the MDL principle provides a consistent estimate. Furthermore, it
can be shown that detection based on the MDL criterion has only moderate
performance when the SNR of the system is low.
The MDL estimator proposed in [8] is an unstructured estimator, i.e., it does
not take into consideration that the array is a ULA and that the signal model is
parameterized by the directions-of-arrival. In [10] the author described an estimator that has
the DoAs parameterized and at the same time utilizes the MDL criterion. However,
the algorithm is based upon the maximum likelihood estimates of the DoAs, meaning
that it needs to perform a multi-dimensional search, which is computationally demanding.
² An estimator is said to be consistent if the estimate converges in probability to the true value when the number of samples increases to infinity.
In [11], the authors proposed the use of order statistics³ to improve the
performance of the MDL detection criterion for a finite number of snapshots. The paper
pointed out the fact that when only a finite number of snapshots are available at the
receiver side, the one-to-one relationship between the sorted eigenvalue estimates and
the actual eigenvalues (also sorted in the same order) is not valid with high
probability. As a solution to this problem, the asymptotic distributions (as the number of
samples increases to infinity) of the eigenvalue estimates are utilized to compute the
order statistics. The computed order statistics are then used to obtain a different
version of the MDL criterion called OSMDL (Order Statistics MDL).
A different solution to the problem is proposed in [12], where the authors
consider the upper thresholds for the estimates of the eigenvalues, which, together
with their counterparts, the lower thresholds, define regions such that the
probability of the eigenvalues falling outside of them is equal to some predefined values.
These thresholds are adaptively predicted using the asymptotic distributions of the
estimated eigenvalues and are used to perform hypothesis testing for the number of
input signals. It can be shown that the performance of that method is superior to
that of MDL in the low SNR regime. Yet another source enumeration solution is
described in [13], based on testing the equality of all pairs of eigenvalues. The
technique utilizes the bootstrap method to remove the dependence of the algorithm on
the Gaussian assumptions of its input signals. This makes the algorithm more robust
to deviations of the underlying model from the Gaussian assumptions, but it also
implies higher computational complexity.
³ Consider n random variables X1, X2, . . . , Xn that, when rearranged in descending order, produce the sequence X(1), X(2), . . . , X(k), . . . , X(n). The random variable X(k) is the kth order statistic of the original n random variables.
1.4.2 Estimation of Directions-of-Arrival
In array signal processing, the estimation of the directions-of-arrival (DoAs) of the
wavefronts impinging on a ULA has been a research topic of considerable interest, as
knowledge of the DoAs is essential for subsequent signal processing and for such practical
applications as passive source localization. A DoA is usually represented by the electrical
angle, which is defined with respect to the actual incidence angle. The electrical
angle $\phi$ of an impinging wave is $\phi = \frac{2\pi d}{\lambda}\sin\theta$, where $d$ is the antenna spacing, $\lambda$ is
the wavelength, and $\theta$ is the actual angle of incidence. When the input signals have
incidence angles (measured with respect to the straight line that is perpendicular to
the line of the ULA) within the range $[-\frac{\pi}{2}, \frac{\pi}{2}]$ and when the distance between array
elements is constrained to be less than one half of the wavelength of the input signal,
there is a one-to-one correspondence between the electrical angle and the actual
incidence angle.
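The mapping between the incidence angle and the electrical angle can be sketched numerically as follows. This is only an illustration of the formula above; the function names and the example values (spacing of 0.4 wavelengths, 30 degrees of incidence) are assumptions chosen for the example.

```python
import numpy as np

def electrical_angle(theta, d, wavelength):
    """phi = (2*pi*d/lambda) * sin(theta), with theta in radians."""
    return 2.0 * np.pi * d / wavelength * np.sin(theta)

def incidence_angle(phi, d, wavelength):
    """Inverse mapping; unambiguous when d < lambda/2 and |theta| <= pi/2."""
    return np.arcsin(phi * wavelength / (2.0 * np.pi * d))

theta = np.deg2rad(30.0)                                  # actual incidence angle
phi = electrical_angle(theta, d=0.4, wavelength=1.0)      # 0.8*pi*sin(30 deg) = 0.4*pi
theta_back = incidence_angle(phi, d=0.4, wavelength=1.0)  # recovers theta
```

Because the spacing is below half a wavelength, the electrical angle stays strictly inside (−π, π) and the inverse mapping returns the original incidence angle.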
In this thesis, especially in its first three parts, it is assumed that the
inputs are narrowband signals, the reason being that the ULA responds differently to
the different frequency components of a broadband signal. Firstly, if an input is not
narrowband, spatial aliasing can occur for its high-frequency components (but not
for its low-frequency components). Secondly, if the inputs are narrowband signals,
i.e., the bandwidth of individual signals is such that its product with the maximum
travel time of the wavefronts across the antenna array is much smaller than 1, the
difference between the signal’s arrival times at the different antenna elements would
have a negligible effect on the phase of the signal. Thus the only phase shifts that
are significant between the antenna elements are those caused by the incidence angles
of the input signals, which is desired as we can derive the incidence angles (i.e., the
DoAs) from the phase shifts.
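As a rough numerical illustration of the narrowband condition, the product of the signal bandwidth and the maximum travel time across the array can be evaluated directly. The sketch below assumes that the maximum travel time occurs at endfire incidence, i.e., it equals the array aperture divided by the speed of light; the carrier frequency, bandwidth and array size are hypothetical values chosen only for the example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def time_bandwidth_product(num_elements, spacing_m, bandwidth_hz):
    """Bandwidth times the maximum wavefront travel time across a ULA (endfire case)."""
    max_travel_time = (num_elements - 1) * spacing_m / SPEED_OF_LIGHT
    return bandwidth_hz * max_travel_time

# Hypothetical example: 8 elements, half-wavelength spacing at 2 GHz (0.075 m), 5 MHz bandwidth.
product = time_bandwidth_product(num_elements=8, spacing_m=0.075, bandwidth_hz=5e6)
is_narrowband = product < 0.01   # "much smaller than 1"; the 0.01 threshold is an assumption
```

For these values the product is below 0.01, so the narrowband assumption is reasonable under the stated threshold.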
There exist many direction-of-arrival (DoA) estimation methods within the
literature. Among them are Capon’s method [14], MUSIC [15], Min-Norm [22] [23],
stochastic maximum likelihood [17], [18], conditional maximum likelihood [19], WSF
[26], MODE [27], Root-MUSIC [16], ESPRIT [28] and IQML (Iterative Quadratic
Maximum Likelihood) [29] methods, etc.
Capon’s algorithm [14] is also known as the minimum variance distortionless
response (MVDR) spectral estimator. In the algorithm, the DoAs are estimated by
the locations of the peaks of the MVDR power spectrum that represents the output
power of the MVDR filter as a function of the steering direction. The filter is ob-
tained by minimizing its output power subject to the constraint that the response to
a given steering direction is constrained to be unity. In the MUltiple SIgnal Classi-
fication (MUSIC) algorithm, the vector of the received signals is viewed as a linear
combination of signal vectors embedded in additive Gaussian noise. In MUSIC, the
so-called noise subspace consists of the set of eigenvectors of the covariance matrix
of the array output that correspond to the M − D smallest eigenvalues, where M is the number
of antenna elements, D is the number of input signals and D < M. When the
estimated array covariance matrix is ideal, it can be shown that the noise subspace
is orthogonal to the steering vectors of the input signals. The DoA estimates can
be obtained as the directions that yield steering vectors orthogonal to the noise sub-
space. For noisy observations, the noise subspace of the estimated covariance matrix
of the array vector is first identified, and then the MUSIC spectrum is obtained by
plotting the magnitudes of the projections of the vectors upon this noise subspace as
a function of the steering direction. The DoA estimates are obtained by identifying
the points where the magnitude of the projection achieves local minima. A more
detailed description of the algorithm is given in Section 2.2. In [20] and [21], the
authors discussed the connection between the MVDR and MUSIC algorithms. It was
shown that the MVDR algorithm utilizes the covariance matrix raised to the power
of −1; if this exponent is decreased to −n (n > 1), then in the limit as n
goes to infinity the MVDR estimator is transformed into the MUSIC estimator.
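The two spectral estimators described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in the thesis: the steering-vector sign convention, the simulated data model (uncorrelated unit-power sources in white noise), the grid search and all function names are assumptions made for the example.

```python
import numpy as np

def simulate_snapshots(phis, M, T, noise_std, rng):
    """T snapshots of uncorrelated unit-power sources on an M-element ULA plus white noise."""
    A = np.exp(1j * np.outer(np.arange(M), phis))          # M x D steering matrix (assumed convention)
    S = (rng.standard_normal((len(phis), T)) + 1j * rng.standard_normal((len(phis), T))) / np.sqrt(2)
    W = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    return A @ S + noise_std * W

def mvdr_and_music(X, D, grid):
    """Return the MVDR power spectrum and the MUSIC noise-subspace projection on a grid."""
    M, T = X.shape
    R = X @ X.conj().T / T                                  # sample covariance matrix
    R_inv = np.linalg.inv(R)
    _, V = np.linalg.eigh(R)                                # eigenvalues in ascending order
    En = V[:, :M - D]                                       # eigenvectors of the M - D smallest eigenvalues
    A = np.exp(1j * np.outer(np.arange(M), grid))           # steering vectors for every grid point
    mvdr = 1.0 / np.real(np.sum(A.conj() * (R_inv @ A), axis=0))   # 1 / (a^H R^-1 a)
    proj = np.sum(np.abs(En.conj().T @ A) ** 2, axis=0)             # ||En^H a||^2
    return mvdr, proj

def strongest_peaks(values, count):
    """Indices of the `count` largest interior local maxima."""
    inner = values[1:-1]
    idx = np.flatnonzero((inner > values[:-2]) & (inner >= values[2:])) + 1
    return np.sort(idx[np.argsort(values[idx])[-count:]])

rng = np.random.default_rng(0)
true_phis = np.array([-1.0, 0.8])                           # electrical angles in radians
X = simulate_snapshots(true_phis, M=8, T=500, noise_std=0.1, rng=rng)
grid = np.linspace(-np.pi, np.pi, 2001)
mvdr, proj = mvdr_and_music(X, D=2, grid=grid)
phi_mvdr = grid[strongest_peaks(mvdr, 2)]                   # peaks of the MVDR spectrum
phi_music = grid[strongest_peaks(-proj, 2)]                 # minima of the projection magnitude
```

The MVDR estimates are read off the peaks of the output-power spectrum, whereas the MUSIC estimates are the grid points where the projection onto the noise subspace is smallest.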
The Min-Norm algorithm [22] [23] belongs to the category of subspace-based
DoA estimators (same as the MUSIC algorithm) and more specifically to the category
of weighted MUSIC algorithms. It uses a single vector in the noise subspace, which
is a linear combination of the noise subspace vectors. Localization is performed by
projecting the steering vector of the scanned electrical angle onto this single vector. In
deriving the estimator, the goal is to find a single vector⁴ v in the noise subspace which
has v1, the first element of v, equal to unity and whose Euclidean norm is minimized. This is
equivalent to requiring that the M − D − 1 zeros of the polynomial $D(z) = \sum_{i=1}^{M} v_i z^{-(i-1)}$
(which has M − 1 zeros in total) that do not correspond to the DoAs of the input signals
lie inside the unit circle and that these zeros be uniformly distributed in the region where
the other D zeros, which correspond to the input DoAs, are absent. The minimum norm
argument and properties of the algorithm are given in [23].
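The construction of the minimum-norm vector can be illustrated with an ideal covariance matrix. In the sketch below the vector is obtained as the first column of the noise-subspace projector, scaled so that its first element is one; the steering convention, the example DoAs and the function name are assumptions made for illustration.

```python
import numpy as np

def min_norm_vector(R, D):
    """Minimum-norm vector in the noise subspace of R with first element equal to one."""
    M = R.shape[0]
    _, V = np.linalg.eigh(R)                 # eigenvalues in ascending order
    En = V[:, :M - D]                        # noise subspace basis
    p = En @ En[0, :].conj()                 # first column of the projector En En^H
    return p / p[0]

M, phis = 6, np.array([-0.7, 1.2])           # hypothetical array size and electrical angles
A = np.exp(1j * np.outer(np.arange(M), phis))
R = A @ A.conj().T + 0.1 * np.eye(M)         # ideal covariance: unit-power sources, white noise
v = min_norm_vector(R, D=2)

# Zeros of D(z) = sum_i v_i z^{-(i-1)} are the roots of v_1 z^{M-1} + ... + v_M.
zeros = np.roots(v)
order = np.argsort(np.abs(np.abs(zeros) - 1.0))
signal_zeros, other_zeros = zeros[order[:2]], zeros[order[2:]]
phi_hat = np.sort(np.angle(signal_zeros))    # DoA estimates from the zeros on the unit circle
```

With an ideal covariance matrix the two signal zeros fall on the unit circle at the source electrical angles, while the remaining zeros lie inside the unit circle.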
The DoA estimation problem can also be solved by using maximum likelihood
techniques. Two maximum likelihood estimators (MLEs) exist within the literature
that differ on the assumptions about the underlying data. The first one assumes
that the input data are deterministic but unknown, and they are formulated as nui-
sance parameters; the resultant estimator is called the conditional MLE (CMLE)
[19]. In the second treatment it is assumed that the input data are samples from
some probability distribution (e.g., from the Gaussian distribution) and the resultant
estimator is known as the stochastic MLE [17] [18]. In [25] the relationship between
the MUSIC algorithm and the CML (Conditional Maximum Likelihood) estimator
was established and elaborated on. It was shown that MUSIC achieves an estimation
accuracy equal to that of conditional maximum likelihood estimator for large sample
size as long as the source signals are uncorrelated.
Weighted Subspace Fitting (WSF) was developed in [26] and it was shown
that its performance approaches the Cramér-Rao Bound (CRB⁵) asymptotically with
increasing sample size. However, the computational requirements of WSF are high
as it relies upon a multidimensional search. It has been observed that another high-resolution
DoA estimation algorithm, the Method Of Direction Estimation (MODE) [27], is
equivalent to WSF if the impinging signals are not coherent.
⁴ Normally we use bold lower case letters to denote column vectors and bold upper case letters to denote matrices.
⁵ The CRB is a lower bound on the variance of any unbiased estimator of a deterministic parameter.
In Root-MUSIC, the requirement that the array manifold be orthogonal to
the noise subspace is expressed via a polynomial representation and the DoAs are
estimated by first finding the roots of the polynomial and then selecting the D roots
(D again is the number of signals) that are closest to the unit circle. Errors in the
radial part of the root estimates do not contribute to the error of the DoA
estimates, as only the phases are important. In this sense Root-MUSIC is
expected to perform better than MUSIC. However, MUSIC is more general in that it
can be used for a large variety of antenna shapes and geometries, while the application
of Root-MUSIC is limited to ULAs.
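A minimal Root-MUSIC sketch is given below. The polynomial coefficients are the sums along the diagonals of the noise-subspace projector. Because the roots of this polynomial occur in reciprocal-conjugate pairs, the sketch keeps only the roots inside the unit circle before selecting the D closest to it, which is an implementation choice rather than something prescribed by the text. The data model, sign convention and function names are assumptions.

```python
import numpy as np

def simulate_snapshots(phis, M, T, noise_std, rng):
    """T snapshots of uncorrelated unit-power sources on an M-element ULA plus white noise."""
    A = np.exp(1j * np.outer(np.arange(M), phis))
    S = (rng.standard_normal((len(phis), T)) + 1j * rng.standard_normal((len(phis), T))) / np.sqrt(2)
    W = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    return A @ S + noise_std * W

def root_music(X, D):
    """Estimate D electrical angles from the roots of the MUSIC polynomial."""
    M, T = X.shape
    R = X @ X.conj().T / T
    _, V = np.linalg.eigh(R)                      # eigenvalues in ascending order
    En = V[:, :M - D]
    C = En @ En.conj().T                          # noise-subspace projector
    # Coefficient of z^k is the sum of the k-th diagonal of C; highest power first.
    coeffs = np.array([np.trace(C, offset=k) for k in range(M - 1, -M, -1)])
    roots = np.roots(coeffs)
    roots = roots[np.abs(roots) < 1.0]            # one root from each reciprocal-conjugate pair
    closest = roots[np.argsort(1.0 - np.abs(roots))[:D]]
    return np.sort(np.angle(closest))             # only the phases are used

rng = np.random.default_rng(0)
true_phis = np.array([-1.0, 0.8])
X = simulate_snapshots(true_phis, M=8, T=500, noise_std=0.1, rng=rng)
phi_root_music = root_music(X, D=2)
```

Only the phases of the selected roots enter the estimates, which reflects the remark that radial errors do not affect the DoA estimates.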
Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT)
[28] is another important direction-of-arrival estimation algorithm. ESPRIT
essentially requires that a translational invariance [28] exists within the antenna array,
a condition that is satisfied by ULAs. In ESPRIT the rotational invariance [28] that
corresponds to the translational invariance is exploited. The main idea is as follows.
Mathematically, two overlapping subarrays of length M − 1 (the first array is formed
by antennas 1 to M − 1 and the second one by elements 2 to M) differ by a transla-
tional displacement, and the second subarray’s manifold can be written as a product
of the diagonal matrix of DoAs and the first subarray’s manifold. Correspondingly,
the partitions of the eigenvectors that correspond to the subarray manifolds are re-
lated by a linear transformation described by a transformation matrix. It can be
shown that this transformation matrix has the same eigenvalues (which are actually
the DoAs) as the translational matrix between the two subarray manifolds. This fact
can be exploited to obtain the DoA estimates. In ESPRIT, this linear transformation
is estimated using either least squares (LS) or total least squares (TLS) methods. An
improvement to ESPRIT is the Unitary ESPRIT algorithm [24]. In Unitary ESPRIT
the computational load is reduced by using real-valued computations, which are obtained by
mapping the original centro-Hermitian matrices of ESPRIT to real-valued matrices (a matrix
A is centro-Hermitian if JA*J = A, where A* denotes the elementwise complex conjugate of A
and J is the exchange matrix, which has ones on the anti-diagonal and zeros elsewhere).
The estimation accuracy is
improved by exploiting the unit-magnitude property of the phasors representing the
DoAs. Further, its performance is enhanced through forward/backward averaging of
the data, a processing step inherent in the operation of the algorithm.
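The least-squares variant of ESPRIT for a ULA can be sketched as follows. The two subarrays are formed from elements 1 to M − 1 and 2 to M, the transformation between the corresponding partitions of the signal eigenvectors is estimated by least squares, and the phases of its eigenvalues give the electrical angles. The data model, the sign convention and the function names are assumptions made for this example; the Unitary ESPRIT refinements are not included.

```python
import numpy as np

def simulate_snapshots(phis, M, T, noise_std, rng):
    """T snapshots of uncorrelated unit-power sources on an M-element ULA plus white noise."""
    A = np.exp(1j * np.outer(np.arange(M), phis))
    S = (rng.standard_normal((len(phis), T)) + 1j * rng.standard_normal((len(phis), T))) / np.sqrt(2)
    W = (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
    return A @ S + noise_std * W

def ls_esprit(X, D):
    """LS-ESPRIT: electrical angles from the rotational invariance of two overlapping subarrays."""
    M, T = X.shape
    R = X @ X.conj().T / T
    _, V = np.linalg.eigh(R)                        # eigenvalues in ascending order
    Es = V[:, M - D:]                               # signal subspace (D largest eigenvalues)
    E1, E2 = Es[:-1, :], Es[1:, :]                  # partitions for elements 1..M-1 and 2..M
    Psi = np.linalg.lstsq(E1, E2, rcond=None)[0]    # least-squares solution of E1 Psi = E2
    return np.sort(np.angle(np.linalg.eigvals(Psi)))

rng = np.random.default_rng(0)
true_phis = np.array([-1.0, 0.8])
X = simulate_snapshots(true_phis, M=8, T=500, noise_std=0.1, rng=rng)
phi_esprit = ls_esprit(X, D=2)
```

No spectral search or polynomial rooting is required; the estimates follow in closed form from one small eigenvalue problem.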
Finally, iterative quadratic maximum likelihood (IQML) [29] is a computa-
tional algorithm for the minimization of the deterministic maximum likelihood esti-
mator. While the maximum likelihood estimator provides an analytic answer to the
direction-finding problem, the IQML algorithm gives an efficient computational im-
plementation. It is based upon the polynomial parameterization of the deterministic
ML estimator, after which an iterative procedure can be established.
1.4.3 Joint DoA and Delay Estimation in Array CDMA Communications
In recent years, multiple antennas were successfully applied to DS-CDMA com-
munications. For DS-CDMA communications employing antenna arrays, we similarly
need to obtain full knowledge of the channel parameters before the detection of the
transmitted symbols from each user can be performed. Of these parameters we are
particularly interested in the direction-of-arrival and the time delay of individual
CDMA users (or individual multipath components of each user) as well as the chan-
nel fading coefficients.
One solution to the joint DoA, delay and channel estimation problem was
proposed in [31] which is based upon the work described in [32] and a variant of
it (presented in [33]). This method requires the transmission of training sequences
and is valid for additive noise of unknown covariance matrix. It does not provide an
explicit formula for the estimation of the direction-of-arrival of individual multipath
component. Instead, it computes a composite channel impulse response utilizing the
DEML (DEcoupled Maximum Likelihood) [32], [33] estimator. It then directly uses
the obtained channel impulse response for detecting the symbols of a particular user.
The idea is similar to the notion of received effective signature waveform.
Two other solutions to the joint estimation problem are JADE-MUSIC (Joint
Angle and Delay Estimation) [34] and JADE-ESPRIT [35]. Both are blind estima-
tion procedures and belong to the category of subspace based estimation algorithms.
JADE-MUSIC extends the MUSIC technique to the joint angle and delay estimation
problem. In JADE-ESPRIT the composite channel matrix containing the unknown
DoAs, time-delays and channel coefficients needs to be estimated first. Then by tak-
ing the Fourier transform of the rows of the channel matrix estimate the ESPRIT
algorithm can be applied to the joint angle and delay estimation problem, and a
closed-form solution is therefore obtained.
Another maximum-likelihood-based estimation algorithm was proposed in [36],
where the received signals are transformed to the frequency domain and the distinct
time delays are converted to phasors. The algorithm requires the transmission of
a data sequence that is known a priori at the receiver side, i.e., it is a supervised
approach. Further, it does not directly estimate the DoAs of the individual multipath
components of a single source. Instead, the effective spatial signature vectors, which
are linear combinations of all coherent paths from one specific source, are estimated.
Further, the formulation of the system model exchanges the positions of the matrix of
transmitted data symbols and the matrix of spatial signature vectors, which makes the
resultant model analogous to a standard array DoA estimation model. ML estimators
can then be formulated. One advantage of this technique is that it has the ability to
estimate more unknowns than the number of antenna elements of the system.
In [37] the authors extend the code acquisition process in time domain to
two-dimensional angular and time domain, where the continuous angular domain
is divided into discrete bins and a search of the optimum inside these bins can be
performed.
In [38], TST-MUSIC (Time-Space-Time) was proposed as a solution to the joint
DoA and time-delay estimation problem. In the TST-MUSIC approach, the received
signals are arranged into two-dimensional data (the space and time dimension). The
TST-MUSIC algorithm has a tree structure and consists of two temporal T-MUSIC
algorithms and one spatial S-MUSIC algorithm. First the T-MUSIC algorithm in
the temporal dimension is performed to obtain initial estimates of the parameters
and then temporal filters are constructed using these estimates. S-MUSIC is then
performed on the filter outputs to obtain estimates of the spatial parameters and
the estimates are used to construct beamformers filtering the signals in the spatial
dimension. Note that in this stage the originally hard-to-differentiate temporal pa-
rameters, such as time delays that are close to each other, can be separated by their
spatial parameters. Finally, T-MUSIC is performed again to accurately recover the
time-delay parameters. The grouping of the time-delays and the incident angles can
be done automatically. In this algorithm the number of antenna elements can be less
than the total number of input signals.
1.5 Multiple-Input Multiple-Output Wireless Communications
The second, fundamentally different viewpoint on multiple-antenna systems is
the concept of Multiple-Input Multiple-Output (MIMO). MIMO studies systems with
multiple antennas at both the transmitting and receiving sides. In MIMO systems
the received signals from different antenna elements are usually uncorrelated with
each other. This is in general not the case for an antenna-array system. Also, the
geometry of the array of antennas is not as important for a MIMO system as it is for
an antenna-array system. Another important difference between the techniques for
MIMO systems and those for antenna arrays is that a large part of the research effort for
the former is devoted to the design of the transmission schemes.
In the late 90s, several seminal works laid out the foundations for the concept
of MIMO. In [39] (see also [40] and [41]) the channel capacity of the MIMO system
was discussed. It was shown that the MIMO structure, with rich scattering between
transmitting and receiving pairs, contains orthogonal spatial channels, the number of
which grows linearly with the smaller of the numbers of transmit and receive
antennas. When perfect channel knowledge is available at both the transmitter and
the receiver, the MIMO structure opens up a new spatial dimension, and the additional
degrees of freedom can be exploited to obtain a spectral efficiency that is
an integer multiple of that of a single-input single-output (SISO) system.
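The decomposition of a rich-scattering MIMO channel into parallel spatial channels can be checked numerically. The sketch below assumes i.i.d. complex Gaussian channel coefficients and, for simplicity, an equal split of the transmit power across the transmit antennas; the antenna numbers and the SNR value are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rx, n_tx, snr = 4, 3, 10.0                          # hypothetical configuration, linear SNR

# Rich-scattering model: i.i.d. complex Gaussian channel coefficients.
H = (rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

sv = np.linalg.svd(H, compute_uv=False)               # gains of the orthogonal spatial channels
num_channels = int(np.sum(sv > 1e-10))                # equals min(n_rx, n_tx) almost surely

# Spectral efficiency with equal power per transmit antenna, computed in two equivalent ways.
rate_per_mode = np.log2(1.0 + snr / n_tx * sv ** 2)
rate_modes = float(np.sum(rate_per_mode))
rate_logdet = float(np.log2(np.real(np.linalg.det(np.eye(n_rx) + snr / n_tx * H @ H.conj().T))))
```

The sum over the individual spatial channels agrees with the log-determinant expression, which shows how the rate accumulates over min(n_rx, n_tx) parallel channels.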
Significant research efforts have been devoted to developing MIMO-related the-
ories and many important research results emerged after the discovery of the potential
capacity gain. Various MIMO configurations and channel models were under investi-
gation and techniques and algorithms that can be used to exploit the system capacities
for these systems were proposed. Generally speaking, the MIMO system’s capacity is
affected by the variation of the channel coefficients and the cross-correlations between
the transmitting and receiving antennas. Recent research also shows that feedback
from the receiver, even if it is partial, can dramatically improve the data rate of
MIMO systems [42].
The technique of water filling can be used to allocate more power (within the
constraint of fixed total power) to favorable channels in order to achieve system capac-
ity. As noted before, the MIMO channel’s capacity can be affected by many factors.
For example, the “keyhole” phenomenon could possibly exist [100]. Mathematically,
the keyhole effect can be expressed as the cascade of one MISO and one SIMO
Rayleigh channel, and in this case the rank of the overall MIMO channel matrix is reduced
to only 1, meaning that only one SISO channel can possibly be used to transfer
data. The phenomenon could be observed when there is a vertical array at the base
station on top of a building, with the mobile beneath the building, and transmission
takes place via diffraction over the rooftop.
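Both ideas in this paragraph can be illustrated with short sketches. The first function is a standard water-filling allocation over parallel channels with strictly positive gain-to-noise ratios; the second part builds a keyhole channel as the outer product of two Rayleigh-fading vectors and checks its rank. The function name, the example gains and the antenna numbers are assumptions made for illustration.

```python
import numpy as np

def water_filling(gains, total_power):
    """Power allocation p_i = max(mu - 1/g_i, 0) with sum(p_i) = total_power (all g_i > 0)."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]                 # strongest channel first
    inv = 1.0 / gains[order]
    for k in range(len(gains), 0, -1):              # try to use the k strongest channels
        mu = (total_power + inv[:k].sum()) / k      # water level for k active channels
        if mu > inv[k - 1]:                         # weakest active channel gets positive power
            break
    power = np.zeros_like(gains)
    power[order[:k]] = mu - inv[:k]
    return power

power = water_filling([2.0, 1.0, 0.1], total_power=1.0)

# Keyhole channel: every path passes through a single "hole", so H is an outer product.
rng = np.random.default_rng(4)
h_rx = (rng.standard_normal(3) + 1j * rng.standard_normal(3)) / np.sqrt(2)
h_tx = (rng.standard_normal(3) + 1j * rng.standard_normal(3)) / np.sqrt(2)
H_keyhole = np.outer(h_rx, h_tx.conj())
keyhole_rank = int(np.linalg.matrix_rank(H_keyhole))
```

In the example the weakest channel receives no power and the strongest receives the most, while the keyhole channel matrix has rank one regardless of the number of antennas.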
Transmit beamforming is a powerful technique for wireless transmissions in
MIMO channels [43], [44]. It was shown that when the channel matrix contains
certain structure due to the existence of distinct multipath propagations, transmit
beamforming can be used to minimize the error probability. Note that in [44] the
receiver is assumed to have only one antenna. The combined effect of transmit beam-
forming with space-time block codes was investigated in [43] and [45]. The transmit
beamforming techniques developed in [45] using the maximized averaged SNR crite-
rion can cause the diversity that is available by using a STBC to disappear with only
a single channel being favored; while this is not the case in [43].
Alternatively, transmit beamforming transmission can be derived using the
capacity criterion, as in [46], [47] and [44], where the optimization problem of max-
imizing the mutual information between the inputs and the outputs with transmit
beamforming is considered.
1.6 Objective of the Thesis and Summary of the Contributions
1.6.1 Objective of the Thesis
To summarize the previous sections, the design of a multiple-antenna system
is an important engineering practice and multiple-antenna systems have attracted
enormous research interest ever since their inception. The adoption of multiple antennas
provides substantial capacity and performance improvements over current single-antenna
communication systems. The MIMO technology is envisioned to become
popular within the near future. These observations have suggested that the field of
multiple-antenna wireless communication deserves much more research effort, which
further motivates this PhD study into this area. In conducting this PhD research, the
objective and goal are to investigate present-day problems in multiple-antenna wire-
less communications and propose practical solutions and alternatives; and to expand
the theory and knowledge in this area through original work and fresh perspectives.
1.6.2 Summary of the Contributions
The contributions of the thesis are summarized below:
• We propose a subspace-based MUSIC-type joint DoA-delay estimation algo-
rithm which utilizes the spatial smoothing preprocessing technique. The pro-
posed technique addresses the problem of joint estimation of direction-of-arrival
(DoA), propagation delay, and complex channel fading coefficients of individ-
ual multipath components of a particular user for antenna-array DS/CDMA
communications over frequency selective multipath channels and provides an
efficient and attractive alternative.
• A new criterion for detecting the number of signals impinging on a uniform linear
array (ULA) is described. The criterion is unique in that it makes explicit use
of the peak information of the MUSIC spectrum. It exhibits a systematic way
of utilizing the eigenvector information of the array covariance matrix in the
number of signals detection problem.
• Next, we describe the iterative weight matrix approximation (IWMA) algorithm,
which is a novel smart antenna technique used to obtain an approximation
to the optimum weight matrix for performing weighted spatial smoothing
(WSS) so as to obtain a diagonal signal covariance matrix. We provide a detailed
analysis of the algorithm’s behavior for the case when the estimate of the array
covariance matrix is ideal, which shows that the approximation matrix obtained
not only contains the DoA information but also has a structure that is optimized such
that it is suited as a basis upon which subspace-based DoA estimation can be
performed. We illustrate this DoA estimation strategy by computer simula-
tions. The improvement in performance also suggests the effectiveness of the
IWMA algorithm.
• The last contribution is a study of two deterministic design measures for linear
processing space-time block codes (LP-STBC’s) which consist of sets of linear
processing matrices (LPM’s) and scalar signal constellations. The measures are
deterministic in the sense that their computations do not involve any statistical
operators and are defined solely with respect to the set of LPM’s. The first mea-
sure is obtained by applying Jensen’s Inequality to the linear dispersion code
mutual information [112] and it has a tractable relationship to the LDC mu-
tual information. The measure has the advantage of simplifying the LP-STBC
design as it separates the design of LPM’s from the statistical properties of the
channel. The second measure is a natural extension and generalization of the
conditions required for the amicable orthogonal design, which lays a theoretical
foundation for orthogonal space-time block codes. We study its properties
and investigate lower bounds for it in the case of real designs. We illustrate the
analogy between a derived lower bound and that of the total-squared-correlation
(TSC) used in designing CDMA sequence sets. We also illustrate the connection and
difference between the conditions for amicable orthogonal design and the LDC
mutual information via the first and the second measures we obtained.
1.7 Organization of the Thesis
In what follows we provide brief descriptions of the remaining chapters of the
thesis.
The thesis presents several research works in the field of multiple-antenna wire-
less communication. Before these works are discussed in the individual chapters, a
background chapter is included which provides necessary background information on
the main topics of the thesis.
The main contents of this thesis are arranged in four chapters. First, we con-
sider the problem of joint estimation of direction-of-arrival, propagation delay, and
complex channel fading coefficients for individual multipath components of a partic-
ular user in the context of antenna-array DS/CDMA communications over frequency
selective multipath channels. We propose a subspace-based MUSIC-type joint DoA-
delay estimation algorithm which utilizes the spatial smoothing preprocessing tech-
nique. The proposed algorithm essentially breaks the multipath induced coherency
within the received signals and recovers the full signal subspace spanned by all dom-
inant signal paths of all users. This allows for the use of MUSIC-type joint DoA
and time-delay estimators for the individual path components of the user of interest.
Based on the angle and timing information, we then estimate the multipath fad-
ing coefficients. We also consider two variants of the spatial-smoothing based joint
DoA-delay MUSIC technique, which are based upon chip-shifted estimates of the
space-time autocorrelation matrix. Another feature of the second type of algorithms
is that they utilize space-time received vectors that span only a single information
symbol period and exhibit superior performance when the data record size available
for parameter estimation is limited. This work is described in Chapter 3.
In Chapter 4 we describe a new criterion for detecting the number of signals
impinging on a uniform linear array (ULA). The criterion makes explicit use of the
peak information of the MUSIC spectrum. Specifically, we consider two maximum
likelihood estimates (MLEs) of the noise variance, that is, the MLE which is derived
from the unstructured eigenvalue decomposition (EVD) based parameterization and
the MLE that is obtained using structured DoA parameterization. Based upon a
large-sample formulation of the difference between these two MLEs, and by applying
the minimum description length (MDL) principle, we obtain the proposed criterion.
We prove that the proposed criterion provides a consistent estimate of the number
of signals and demonstrate that it has a better performance at low SNR for equal-
power sources when compared with the original MDL-based signal number detection
criterion.
In Chapter 5 we describe the iterative weight matrix approximation (IWMA)
algorithm. We consider the design of weighted spatial smoothing (WSS) which was
proposed as a technique to obtain a diagonal source covariance matrix for array signal
processing. A diagonal source covariance matrix is a desired feature for subspace-
based direction-of-arrival (DoA) estimation algorithms as the cross-correlation among
the input signals can markedly impair the performance of these estimators. However,
the optimal weight matrix for such a purpose requires explicit knowledge of the DoAs.
In this thesis, we present an iterative weight matrix approximation (IWMA) algorithm
which is capable of obtaining an approximation to optimal weight matrix in an itera-
tive fashion. The algorithm is applicable when the input covariance matrix is positive
definite. The algorithm starts from a scaled identity matrix as an initial guess and
carries out a series of weighted spatial smoothing operations. After each WSS the
algorithm computes a new weight matrix, which is to be used for the next iteration.
The algorithm is based on the use of an effective correlation matrix⁶, which is naturally
brought about by the operations performed in each iteration and on the fact
that for a positive definite Hermitian matrix, the set of eigenvalues of its Hadamard
product with a correlation matrix is majorized by its own set of eigenvalues. While
WSS that is based on IWMA is an effective method to decorrelate highly correlated
signals, it is also interesting to note that with the IWMA algorithm the approximate
matrix generated can form the basis for subspace-type DoA estimation. Simulation
results illustrate the effectiveness of this estimation strategy which also suggests the
effectiveness of the IWMA algorithm.
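The majorization property on which the algorithm relies can be verified numerically. The sketch below does not implement IWMA itself. It only checks, for one random example, that the eigenvalues of the Hadamard product of a positive definite Hermitian matrix with a correlation matrix are majorized by the eigenvalues of the original matrix. The correlation matrix is assumed here to be positive semidefinite with unit diagonal, and the matrix sizes and random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random positive definite Hermitian matrix A.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = G @ G.conj().T + 0.1 * np.eye(n)

# Random correlation matrix C: positive definite, unit diagonal (assumption for this check).
B = rng.standard_normal((n, 3)) + 1j * rng.standard_normal((n, 3))
K = B @ B.conj().T + 0.05 * np.eye(n)
scale = 1.0 / np.sqrt(np.real(np.diag(K)))
C = K * np.outer(scale, scale)

lam_A = np.sort(np.linalg.eigvalsh(A))[::-1]          # eigenvalues of A, descending
lam_AC = np.sort(np.linalg.eigvalsh(A * C))[::-1]     # eigenvalues of the Hadamard product

# Majorization: every partial sum of lam_AC is bounded by that of lam_A, with equal totals.
partial_sums_ok = bool(np.all(np.cumsum(lam_AC) <= np.cumsum(lam_A) + 1e-9))
totals_equal = bool(abs(lam_AC.sum() - lam_A.sum()) < 1e-9)
```

Since the totals agree while the partial sums shrink, the Hadamard product has a less spread-out eigenvalue distribution than the original matrix.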
⁶ A correlation matrix is defined as a square matrix with a diagonal of all 1’s and off-diagonal elements less than or equal to 1 in magnitude [134].
The last topic of the thesis is the study of two deterministic design measures
for linear processing space-time block codes (LP-STBC’s) which consist of sets of
linear processing matrices (LPM’s) and scalar signal constellations. The measures
are deterministic in the sense that their computations do not involve any statistical
operators and are defined solely with respect to the set of LPM’s. The first measure
is obtained by applying Jensen’s Inequality to the mutual information criterion for
linear dispersion codes [112], denoted by CLD. The expectation operator is moved inside
the log det(·) operator following Jensen’s inequality. By assuming channel coefficients that
are independent and identically distributed (i.i.d.) Gaussian, we compute the expectations,
after which we obtain a deterministic design measure. We shall show that
there is a tractable relationship between this measure and CLD and that the
design of LP-STBC’s using this relationship can be simplified. The second measure is a
natural extension of the conditions required for complex linear processing orthogonal
design or amicable orthogonal design. For the LPM’s of a LP-STBC, we associate
with them two measures of non-orthogonality: total-squared-skew-symmetry and
total-squared-amicability (TSA). The relationship of total-squared-skew-symmetry
to total-squared-correlation (TSC) is revealed. TSC measures the non-orthogonality
(cross-correlation) properties of a vector set, and is commonly used in the design of se-
quence sets for Code Division Multiple Access (CDMA) systems. It can be shown that
total-squared-skew-symmetry is a generalization of total-squared-correlation (GTSC).
For GTSC a lower bound analogous to Welch’s lower bound for TSC exists, which
establishes itself upon the Hurwitz-Radon numbers and the Hurwitz-Radon family
of matrices. By computer simulations, we shall establish that the second measure
is less revealing than the first one for the performance of the final codes. However,
the lower bound derived can still be a good indicator of the performance of real designs
of size 3 × 3. Comparing the two deterministic measures reveals to some extent
the differences and similarities between CLD and the criterion for amicable orthogonal
design.
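For reference, the ordinary total-squared-correlation and Welch's lower bound can be sketched as follows. The example concerns TSC for unit-norm vector sets only, not the generalized measure (GTSC) studied in the thesis; the set sizes and the construction of the bound-achieving set from a truncated DFT matrix are assumptions chosen for illustration.

```python
import numpy as np

def total_squared_correlation(S):
    """TSC of the columns of S: sum of squared magnitudes of all pairwise inner products."""
    gram = S.conj().T @ S
    return float(np.sum(np.abs(gram) ** 2))

rng = np.random.default_rng(3)
N, K = 4, 6                                           # sequence length and number of sequences

# Random binary antipodal signature set with unit-norm columns.
S_rand = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)

# A set meeting Welch's bound with equality: first N rows of the K-point DFT matrix.
S_wbe = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(K)) / K) / np.sqrt(N)

welch_bound = K ** 2 / N                              # valid for K >= N unit-norm vectors
tsc_rand = total_squared_correlation(S_rand)
tsc_wbe = total_squared_correlation(S_wbe)
```

The random set has a TSC at or above the bound, while the DFT-based set attains it exactly.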
Finally, the conclusions are given in the last chapter of the thesis together with
a discussion of the future work.
Chapter 2
Background
Figure 2.1: DS/CDMA transmitters and a DS/CDMA receiver with antenna array
The intent of this chapter is to provide necessary background information for
the major topics of this thesis, including brief descriptions of code division multiple
access systems, multipath fading, block fading, synchronization, channel estimation,
uniform linear array, MUSIC, spatial smoothing, multiple-input multiple-output sys-
tems and space-time block codes.
2.1 DS/CDMA Wireless Communications
In the first part of this PhD work (which is given in Chapter 3), our focus
is the design of signal processing techniques for DS/CDMA (Direct Sequence/Code
Division Multiple Access) wireless communications over frequency selective multipath
channels. The receiver is equipped with a uniform linear array (ULA), as shown on the
right side of Fig. 2.1. More details about the ULA structure will be given in a later
section of this chapter. The antenna array is a vital component of contemporary digital
wireless communication systems and the adoption of an antenna array in a DS/CDMA
system can dramatically improve its performance. For example, a CDMA system’s
performance is limited by the MAI (Multiple Access Interference) within the system.
By utilizing an antenna array at the receiver side, the system’s resistance to MAI can be
enhanced through appropriate signal processing algorithms.
Figure 2.2: Direct-sequence spread spectrum transmission
Fig. 2.1 shows a DS/CDMA wireless communications system. DS/CDMA is
the underlying physical layer transmission technology of several major third-generation
wireless standards like CDMA2000 and UMTS. DS/CDMA uses the direct-sequence
spread spectrum transmission technique, in which a user’s data sequence is modulated
onto a code sequence, e.g., a PN (Pseudo Noise) code sequence. In DS/CDMA,
each user is assigned a different code sequence and at the same time all users share
a common communication medium. Fig. 2.2 is a diagram showing the generation of a
DS/CDMA signal. The generated signal is given by:
\[
s(t) = \sum_{i} \sum_{l=0}^{N-1} b[i]\, d[l]\, P_{T_c}(t - iT - lT_c), \tag{2.1}
\]
where $b[i] \in \{-1, +1\}$ is the $i$th information symbol (bit) of a CDMA user; $T$
is the symbol duration; $T_c$ is the chip duration; $N = T/T_c$ is the bandwidth
expansion factor or system processing gain; $\mathbf{d} \triangleq ( d[0] \;\; d[1] \;\; \ldots \;\; d[N-1] )^T$, with
$d[l] \in \{-1/\sqrt{N}, +1/\sqrt{N}\}$, is the unique code (signature) vector assigned to the user;
and $P_{T_c}(t)$ is the chip pulse waveform.
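A chip-rate, discrete-time version of (2.1) can be sketched as follows. The example assumes a rectangular chip pulse sampled once per chip, so that chip l of symbol i is simply b[i] d[l]; the code length, the random signature and the bit pattern are hypothetical.

```python
import numpy as np

def spread(bits, code):
    """Chip-rate DS/CDMA signal: chip l of symbol i equals bits[i] * code[l]."""
    return np.kron(bits, code)

rng = np.random.default_rng(2)
N = 8                                                  # processing gain (chips per symbol)
code = rng.choice([-1.0, 1.0], size=N) / np.sqrt(N)    # signature with entries +-1/sqrt(N)
bits = np.array([1.0, -1.0, -1.0, 1.0])                # information symbols b[i]
chips = spread(bits, code)                             # length len(bits) * N
```

Each symbol occupies N consecutive chips, and the normalization of the signature keeps the energy per symbol equal to one.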
The generated DS/CDMA signal goes through a frequency-selective multipath
Rayleigh fading channel (see Fig. 2.1), which is typically modeled as a tapped delay
line:
\[
y(t) = s(t) \star h(t), \tag{2.2}
\]
where “$\star$” denotes convolution of two functions; and
\[
h(t) = \sum_{n=0}^{L-1} \alpha_n \delta(t - \tau_n). \tag{2.3}
\]
The quantities $\alpha_n$ and $\tau_n$ denote the amplitude and delay of the $n$th tap, respectively.
Fig. 2.3 illustrates a typical multipath fading channel with two path components that
can be modeled as a tapped delay line with two taps using (2.3). The amplitude $\alpha_n$
is a complex Gaussian random variable and its magnitude is Rayleigh distributed.
The delay $\tau_n$ is assumed to be an integer multiple of the chip period
$T_c$. The frequency-selective fading channel is assumed to be static over a period long
enough that the receiver can obtain meaningful estimates of the signal and channel
parameters and use them for detection.
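With delays that are integer multiples of the chip period, the tapped delay line of (2.2) and (2.3) reduces to a discrete convolution at the chip rate. The following sketch illustrates this for a two-path channel; the tap values, delays and chip sequence are hypothetical.

```python
import numpy as np

def tapped_delay_line(chips, amplitudes, delays):
    """Pass a chip-rate signal through taps alpha_n at integer chip delays tau_n."""
    h = np.zeros(max(delays) + 1, dtype=complex)
    for alpha, tau in zip(amplitudes, delays):
        h[tau] += alpha                                # impulse response sampled at the chip rate
    return np.convolve(chips, h)

chips = np.array([1.0, -1.0, 1.0, 1.0])
alphas = [0.8 + 0.1j, -0.3j]                           # complex tap amplitudes
taus = [0, 2]                                          # delays in chips
y = tapped_delay_line(chips, alphas, taus)
```

Each output sample is the sum of the direct-path chip and a scaled copy of the chip transmitted two chip periods earlier.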
The mobile DS/CDMA users/devices are assumed to be stationary or moving
slowly. Stationary or slowly moving users/devices typically experience slow fading.
On the other hand, fast fading should be expected when there is relatively quick
motion between the CDMA user and the ULA receiver, which usually causes a noticeable
amount of Doppler spread/shift.
At the receiver, the continuous-time received signal is chip-matched-filtered
and the output is sampled (at the chip rate) to produce the discrete-time received signal. We
collect N outputs (N is the processing gain of a DS/CDMA system, i.e., the length
of the code signature sequence) into a column vector:
$$ \mathbf{y}[i] = \left( y[iN]\ \ y[iN+1]\ \ \ldots\ \ y[iN+N-1] \right)^T. \tag{2.4} $$
Figure 2.3: Multipath fading channel
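Suppose that there is one user transmitting signals. The vector y[i] can be written
as follows [48]:
$$ \mathbf{y}[i] = D \begin{bmatrix} b[i-1] \\ b[i] \end{bmatrix} + \mathbf{n}[i], \tag{2.5} $$
where n[i] is a column vector that consists of the additive Gaussian noise, and $D \triangleq [d_l\ d_u]$
consists of the two effective signature vectors for the user, where
$$ d_l = \begin{bmatrix} 0 & d[N-1] & \cdots & d[1] \\ 0 & 0 & \cdots & d[2] \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d[N-1] \\ 0 & 0 & \cdots & 0 \end{bmatrix} \cdot \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{N-2} \\ h_{N-1} \end{bmatrix}, \tag{2.6} $$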
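and
$$ d_u = \begin{bmatrix} d[0] & 0 & \cdots & 0 \\ d[1] & d[0] & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ d[N-2] & d[N-3] & \cdots & 0 \\ d[N-1] & d[N-2] & \cdots & d[0] \end{bmatrix} \cdot \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{N-2} \\ h_{N-1} \end{bmatrix}. \tag{2.7} $$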
We define
$$ \mathbf{h} \triangleq \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{N-2} \\ h_{N-1} \end{bmatrix} \tag{2.8} $$
as a column vector that consists of the channel fading coefficients. Note that some
entries of h might be zeros. For example, suppose that there are two propagation
paths in the channel with delays 0 ≤ τ1 = 2 ≤ N − 1 and 0 ≤ τ2 = 4 ≤ N − 1,
respectively, then only the two elements h2 and h4 are non-zero and denote the two
complex channel coefficients α1 and α2, respectively.
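The two-path example can be checked numerically. The sketch below builds the effective signature vectors directly from the chip-rate convolution, under the assumption that the code matrix multiplying h has entry d[m − j] in row m, column j for the current bit (j ≤ m) and d[N + m − j] for the previous bit (j > m). It then compares them with a brute-force convolution over one N-chip window. The signature, the two coefficients and the name `effective_signatures` are hypothetical.

```python
def effective_signatures(d, h):
    """Return (d_l, d_u): responses to the previous and current bit over one N-chip window."""
    N = len(d)
    d_u = [sum(d[m - j] * h[j] for j in range(m + 1)) for m in range(N)]
    d_l = [sum(d[N + m - j] * h[j] for j in range(m + 1, N)) for m in range(N)]
    return d_l, d_u

N = 8
d = [s / N ** 0.5 for s in (1, -1, 1, 1, -1, 1, -1, -1)]   # hypothetical signature
h = [0j] * N
h[2], h[4] = 0.9 + 0.1j, -0.3 + 0.4j                       # hypothetical alpha_1 and alpha_2
d_l, d_u = effective_signatures(d, h)

# Cross-check against a direct chip-rate convolution for bits b[i-1] = -1, b[i] = +1.
b_prev, b_cur = -1, +1
chips = [b_prev * x for x in d] + [b_cur * x for x in d]
window = [sum(h[j] * chips[N + m - j] for j in range(N)) for m in range(N)]
```

Only h[2] and h[4] are non-zero, so every received chip is the sum of two delayed copies of the code, one of which may belong to the previous bit.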
The estimation of the channel vector h can be carried out by using subspace
based methods [48]. (A typical subspace based method is the MUSIC algorithm. See
Section 2.2 for a detailed description of the algorithm.) This is done via projections
of dl and du onto the noise subspace found from the sample covariance matrix of
y[i] and solving for h (note that h is included in both dl and du respectively) that
minimizes the Euclidean norm of the projections.
Denote the ideal covariance matrix of y[i] by Ryy. The noise subspace is the
span of the eigenvectors of the N − 2 smallest eigenvalues of Ryy. Here 2 (in N − 2)
is specific to our example of one user and two effective signature vectors dl and
du. From the theory for subspace based methods, the projections of dl and du on
the noise subspace will be zero. In reality, instead of the ideal $R_{yy}$, a sample-average
estimate of the covariance matrix is used. Denote the sample covariance matrix
by $\hat{R}_{yy}$ (normally we use "$\hat{(\cdot)}$" to denote the estimate of a parameter, a vector, or a
matrix, etc.). The noise subspace is estimated from the eigenvectors that correspond to
the $N - 2$ smallest eigenvalues of $\hat{R}_{yy}$. Suppose that the estimated noise subspace is given by
$\hat{V}_n$; we then have:
$$ \hat{h} = \arg\min_{h} \left( \| (d_l)^H \hat{V}_n \|^2 + \| (d_u)^H \hat{V}_n \|^2 \right). \tag{2.9} $$
After the estimate $\hat{h}$ is obtained, the synchronization problem is readily solved.
This is done as follows [48]:
• Given $\hat{h}$, find the least-squares fit to $\hat{h}_i$, denoted as $\{\hat{\alpha}_i\}$, for i = 0, 1, . . . , N − 1;
• Select the maximum of $\{\hat{\alpha}_i\}$; the corresponding i gives the delay in number
of chips;
• Subtract the found path from $\hat{h}$ and repeat the above process until the known
number of paths has been found.
2.2 Antenna Array - ULA
The usage of antenna arrays keeps growing, especially for the uplink of a mobile
wireless communication system. In this thesis, the first three topics focus on the
smart antenna technology and more specifically on ULA arrays. ULA requires that
the antenna elements lie along a straight line in three-dimensional space and the inter-
element spacing be fixed as shown in Fig. 2.4. The electrical angle φ of a wave impinging
on a ULA is given by $\phi = \frac{2\pi d}{\lambda} \sin\theta$, where d is the antenna spacing, λ is the
wavelength, and θ is the actual angle of incidence.
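As a quick numerical check of this relation, the sketch below evaluates the electrical angle for a half-wavelength spacing. The function name `electrical_angle` and the example values are hypothetical.

```python
import math

def electrical_angle(theta, spacing, wavelength):
    """phi = (2*pi*d/lambda) * sin(theta), with theta in radians."""
    return 2.0 * math.pi * spacing / wavelength * math.sin(theta)

# Half-wavelength spacing and a 30-degree angle of incidence give phi = pi/2.
phi = electrical_angle(math.radians(30.0), spacing=0.5, wavelength=1.0)
```

With d = λ/2, the physical angles θ ∈ [−π/2, π/2] map onto electrical angles φ ∈ [−π, π].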
Fig. 2.5 shows the system diagram that is used in the first three parts of
this thesis. In Fig. 2.5, y(t) = Ax(t) + n(t), where x(t) is a D × 1 column vector
consisting of the input signals. Matrix A has dimension M × D and is the array
manifold of the ULA. A is parameterized by the directions-of-arrival (DoAs) of the
input signals and more specifically the corresponding electrical angles, i.e., we have
$A \triangleq [a(\phi_1), a(\phi_2), \ldots, a(\phi_D)]$. The ith column vector $a(\phi_i)$ is known as the steering
vector of the ith input signal. The M × 1 column vector n(t) consists of spatially and
temporally uncorrelated zero-mean Gaussian random variables.
Figure 2.4: Uniform Linear Array
Figure 2.5: ULA system model
In Chapters 3, 4 and 5, MUSIC or MUSIC-type algorithms play an essential
role in the problems under consideration. In what follows we briefly describe the
MUSIC algorithm. We start from the properties of the array covariance matrix for
ULA, when the additive noise is spatially white. The exact covariance matrix of the
received signal vector y(t) is given by:
$$ R_{yy} = A R_{xx} A^H + \sigma^2 I, \tag{2.10} $$
where $R_{xx} \triangleq E\{x(t)x(t)^H\}$ is the covariance matrix of the input signal vector x(t)
and $\sigma^2$ is the variance of the additive Gaussian noise. The eigenvalue decomposition of
$R_{yy}$ is given by:
$$ R_{yy} = \sum_{i=1}^{M} \lambda_i v_i v_i^H = V \Lambda V^H, \tag{2.11} $$
where
$$ \Lambda = \begin{bmatrix} \lambda_1 & & & \emptyset \\ & \lambda_2 & & \\ & & \ddots & \\ \emptyset & & & \lambda_M \end{bmatrix} \tag{2.12} $$
consists of the eigenvalues of $R_{yy}$ (i.e., $\lambda_1, \lambda_2, \ldots, \lambda_M$) sorted in descending
order, and $V = [v_1, v_2, \ldots, v_M]$ consists of the corresponding eigenvectors. The
eigenvalues satisfy the defining conditions
$$ \det(R_{yy} - \lambda_i I) = 0, \quad i = 1, 2, \ldots, M. \tag{2.13} $$
Substituting $A R_{xx} A^H + \sigma^2 I$ for $R_{yy}$ in the above equation, we have:
$$ \det[A R_{xx} A^H - (\lambda_i - \sigma^2) I] = 0, \quad i = 1, 2, \ldots, M. \tag{2.14} $$
This says that the eigenvalues of $A R_{xx} A^H$ are $\lambda_i - \sigma^2$, i = 1, 2, . . . , M. Suppose that
M > D and A has full column rank. Suppose further that $R_{xx}$ has full rank and is positive
definite. Then $A R_{xx} A^H$ has rank D and its smallest M − D eigenvalues are
zero, meaning that:
$$ \lambda_i - \sigma^2 = 0, \quad i = D+1, D+2, \ldots, M, \tag{2.15} $$
that is,
$$ \lambda_i = \sigma^2, \quad i = D+1, D+2, \ldots, M. \tag{2.16} $$
Further, $R_{yy}$ and its eigenvalues and eigenvectors satisfy the following defining conditions:
$$ (R_{yy} - \lambda_i I) v_i = 0, \quad i = 1, 2, \ldots, M. \tag{2.17} $$
For i = D + 1, D + 2, . . . , M, we then have:
$$ A R_{xx} A^H v_i = 0, \quad i = D+1, D+2, \ldots, M. \tag{2.18} $$
Since A has full column rank and $R_{xx}$ is positive definite, (2.18) gives $A^H v_i = 0$,
which implies that the array manifold and the steering vectors are orthogonal to the
set of eigenvectors that correspond to the M − D smallest eigenvalues $\sigma^2$. The span of
$V_n \triangleq [v_{D+1}, v_{D+2}, \ldots, v_M]$ is known as the noise subspace.
Figure 2.6: Conventional spatial smoothing
The above property is utilized by the MUSIC algorithm to estimate the directions-of-arrival
of the input signals, as detailed in what follows. Eq. (2.18) holds true only
for the ideal $R_{yy}$. When the array covariance matrix is estimated from noisy observation
data, MUSIC performs the DoA search by identifying the peaks of the following spatial
spectrum:
$$ P(\phi) \triangleq \frac{1}{a^H(\phi)\, \hat{V}_n \hat{V}_n^H\, a(\phi)}, \quad \phi \in [-\pi, \pi], \tag{2.19} $$
where $\hat{V}_n = [\hat{v}_{D+1}, \hat{v}_{D+2}, \ldots, \hat{v}_M]$ consists of the M − D eigenvectors of $\hat{R}_{yy}$, the
estimated array covariance matrix, that correspond to its smallest eigenvalues. In practice, $\hat{R}_{yy}$ can be
obtained by sample-averaging over N observed vectors as follows:
$$ \hat{R}_{yy} \triangleq \frac{1}{N} \sum_{i=1}^{N} y(t_i)\, y(t_i)^H. \tag{2.20} $$
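The following sketch illustrates (2.19) and (2.20) on simulated data. It is only an illustration under stated assumptions: two uncorrelated unit-power sources, the steering-vector convention $a(\phi) = [1, e^{-j\phi}, \ldots, e^{-j(M-1)\phi}]^T$, a uniform search grid, and a simple local-maximum peak picker. All function names and parameter values are hypothetical.

```python
import numpy as np

def steering(phi, M):
    """ULA steering vector [1, e^{-j phi}, ..., e^{-j(M-1) phi}]^T."""
    return np.exp(-1j * phi * np.arange(M))

def music_spectrum(R_hat, D, grid):
    """Evaluate 1 / (a^H Vn Vn^H a) on a grid of electrical angles."""
    M = R_hat.shape[0]
    _, eigvecs = np.linalg.eigh(R_hat)            # eigenvalues in ascending order
    Vn = eigvecs[:, :M - D]                       # eigenvectors of the M-D smallest eigenvalues
    A = np.exp(-1j * np.outer(np.arange(M), grid))
    return 1.0 / np.sum(np.abs(Vn.conj().T @ A) ** 2, axis=0)

def largest_peaks(spectrum, grid, D):
    """Return the grid locations of the D largest local maxima, sorted."""
    inner = spectrum[1:-1]
    idx = np.flatnonzero((inner > spectrum[:-2]) & (inner > spectrum[2:])) + 1
    best = idx[np.argsort(spectrum[idx])[-D:]]
    return np.sort(grid[best])

rng = np.random.default_rng(0)
M, D, snapshots, sigma2 = 6, 2, 500, 0.1
true_phi = np.array([-1.0, 0.8])
A = np.stack([steering(p, M) for p in true_phi], axis=1)              # M x D
x = (rng.standard_normal((D, snapshots)) + 1j * rng.standard_normal((D, snapshots))) / np.sqrt(2)
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, snapshots))
                               + 1j * rng.standard_normal((M, snapshots)))
y = A @ x + noise
R_hat = y @ y.conj().T / snapshots               # sample average, as in (2.20)
grid = np.linspace(-np.pi, np.pi, 2001)
estimates = largest_peaks(music_spectrum(R_hat, D, grid), grid, D)
```

The grid resolution limits the accuracy of the estimates; a finer grid or a local refinement around each peak would be needed in practice.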
The MUSIC algorithm requires that the signal covariance matrix Rxx has full
rank. When there are coherent signals, the signal covariance matrix will be rank
deficient and the algorithm is no longer applicable. Spatial smoothing ([75] [76]) is
a technique designed to overcome this difficulty. It is a preprocessing technique that
manipulates the estimated array covariance matrix such that the corresponding signal
covariance matrix can have full rank.
In spatial smoothing, the M -element ULA is divided into Q overlapping sub-
arrays of P elements each. Variables Q and P satisfy the condition Q = M − P + 1.
The qth subarray, q = 1, 2, . . . , Q, is formed by the q, (q + 1), . . . , (q + P − 1)th
elements of the ULA. Let us denote by yq(t) the received signal and nq(t) the ad-
ditive noise over the qth subarray, and by Aq the submatrix of A formed by the
q, (q + 1), . . . , (q + P − 1)th rows of A. Then we have:
$$ y_q(t) = A_q x(t) + n_q(t), \quad q = 1, 2, \ldots, Q. \tag{2.21} $$
To perform spatial smoothing, we compute the equally weighted sum of the Q autocorrelation
matrices $E\{y_q(t) y_q^H(t)\}$, for q = 1, 2, . . . , Q (we use "$\tilde{(\cdot)}$" to denote vectors or
matrices after spatial smoothing preprocessing):
$$ \tilde{R}_{yy} \triangleq \frac{1}{Q} \sum_{q=1}^{Q} E\{y_q(t)\, y_q^H(t)\}. \tag{2.22} $$
The matrix $\tilde{R}_{yy}$ of (2.22) can be rewritten as:
$$ \tilde{R}_{yy} = \tilde{A} \tilde{R}_{xx} \tilde{A}^H + \sigma^2 I_P, \tag{2.23} $$
where $\tilde{A} \triangleq [\tilde{a}(\phi_1), \tilde{a}(\phi_2), \ldots, \tilde{a}(\phi_D)]$ is the subarray manifold with
$\tilde{a}(\phi_k) \triangleq [1, e^{-j\phi_k}, \ldots, e^{-j(P-1)\phi_k}]^T$, k = 1, 2, . . . , D. The D × D matrix $\tilde{R}_{xx}$ is the signal autocorrelation
matrix after spatial smoothing and is given by
$$ \tilde{R}_{xx} \triangleq \frac{1}{Q} \sum_{q=1}^{Q} \Phi^{q-1} R_{xx} \Phi^{-(q-1)} = \frac{1}{Q} R_{xx} \circ (B^H B). \tag{2.24} $$
In (2.24), $\Phi \triangleq \mathrm{diag}([e^{-j\phi_1}, e^{-j\phi_2}, \ldots, e^{-j\phi_D}]^T)$, $B \triangleq [b(\phi_1)\ b(\phi_2)\ \ldots\ b(\phi_D)]$ with
$b(\phi_k) \triangleq [1, e^{j\phi_k}, \ldots, e^{j(Q-1)\phi_k}]^T$ for k = 1, 2, . . . , D, and "$\circ$" denotes the matrix Hadamard
product. It can be shown that $\tilde{R}_{xx}$ has full rank, as required by MUSIC. In Fig. 2.6
the spatial smoothing preprocessing is illustrated using a simple diagram.
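In practice, $\hat{R}_{yy}$ is used as the basis to perform spatial smoothing:
$$ \hat{\tilde{R}}_{yy} = \frac{1}{Q} \sum_{q=1}^{Q} F_q \hat{R}_{yy} F_q^T, \tag{2.25} $$
where $F_q = [\, 0_{P \times (q-1)}\ \ I_{P \times P}\ \ 0_{P \times (M-P-q+1)} \,]$, q = 1, 2, . . . , Q, are matrices used to
obtain the submatrices of $\hat{R}_{yy}$.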
After we obtain $\hat{\tilde{R}}_{yy}$, the MUSIC estimation is then performed as follows.
The noise subspace spanned by the eigenvectors associated with the P − D smallest
eigenvalues of $\hat{\tilde{R}}_{yy}$ is first identified. If $\hat{\tilde{V}}_n$ is the matrix whose columns are formed
by the eigenvectors spanning the noise subspace, then we can estimate the DoAs
$\phi_k$, k = 1, 2, . . . , D, from the locations of the D largest peaks of the spectrum P(φ)
which is defined as follows:
$$ P(\phi) \triangleq \left\| \hat{\tilde{V}}_n^H \cdot \tilde{a}(\phi) \right\|^{-2}, \quad \phi \in [-\pi, \pi], \tag{2.26} $$
where $\tilde{a}(\phi) \triangleq [1, e^{-j\phi}, \ldots, e^{-j(P-1)\phi}]^T$. To obtain the locations of the D largest peaks
a search on the real line would need to be performed.
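To make the effect of spatial smoothing concrete, the sketch below applies the block averaging of (2.25) to an ideal covariance matrix of two fully coherent sources and compares the rank of the signal part before and after smoothing. The source gains, array sizes and the name `spatially_smooth` are hypothetical, and the exact covariance is used in place of a sample estimate to keep the example deterministic.

```python
import numpy as np

def spatially_smooth(R, P):
    """Average the Q = M - P + 1 overlapping P x P diagonal blocks of R, as in (2.25)."""
    M = R.shape[0]
    Q = M - P + 1
    return sum(R[q:q + P, q:q + P] for q in range(Q)) / Q

M, P, sigma2 = 8, 5, 0.1
phis = np.array([-1.0, 0.8])
A = np.exp(-1j * np.outer(np.arange(M), phis))        # M x D array manifold
c = np.array([1.0, 0.7 * np.exp(0.3j)])               # hypothetical coherent gains
R_xx = np.outer(c, c.conj())                          # rank one: fully coherent sources
R_yy = A @ R_xx @ A.conj().T + sigma2 * np.eye(M)
R_smooth = spatially_smooth(R_yy, P)

rank_before = np.linalg.matrix_rank(R_yy - sigma2 * np.eye(M), tol=1e-8)
rank_after = np.linalg.matrix_rank(R_smooth - sigma2 * np.eye(P), tol=1e-8)

# Noise subspace of the smoothed matrix: eigenvectors of the P - D smallest eigenvalues.
_, vecs = np.linalg.eigh(R_smooth)
Vn = vecs[:, :P - len(phis)]
a_sub = np.exp(-1j * np.outer(np.arange(P), phis))    # subarray steering vectors
leakage = np.abs(Vn.conj().T @ a_sub).max()
```

After smoothing, both subarray steering vectors are again orthogonal to the noise subspace, which is the property that the spectrum (2.26) relies on.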
Having described the MUSIC algorithm and the spatial smoothing preprocessing
technique, we next briefly discuss the reception at the ULA of a transmitted
signal passing through the frequency-selective multipath fading channel h(t) (see Eq.
(2.2)). The received baseband signal at the ULA is given by:
$$ \mathbf{y}(t) = s(t) \star \mathbf{h}(t), \tag{2.27} $$
where
$$ \mathbf{h}(t) = \sum_{n=0}^{L-1} a(\phi_n)\, \alpha_n\, \delta(t - \tau_n), \tag{2.28} $$
and $a(\phi_n)$ is the steering vector for the nth multipath from (electrical) angle $\phi_n$.
Figure 2.7: Multiple-Input Multiple-Output system
In the next section we give a brief description of multiple-input multiple-output
systems and space-time block coding.
2.3 Multiple-Input Multiple-Output and Space-Time
Block Coding
The last part of the thesis presents research work on the MIMO wireless com-
munication technology. A typical MIMO system is shown in Fig. 2.7. The received
signal matrix is given by:
$$ Y = \sqrt{\frac{\rho}{M}}\, S H + N, \tag{2.29} $$
where S is a T × M matrix consisting of the symbols to be transmitted. Here M
denotes the number of transmitting antennas and T is the number of symbol periods.
The elements of matrix S are arranged such that its (i, j)th element denotes the
symbol transmitted at the ith symbol period from the jth transmitting antenna.
The channel matrix H is of dimension M × N. Its (i, j)th element denotes
the channel coefficient between the ith transmitting antenna and the jth receiving
antenna. Variable N is the number of receiving antennas. The received matrix Y
is of dimension T × N and consists of the received signals from N antennas over T
symbol periods. The normalization $\sqrt{\rho/M}$ ensures a receiver-side signal-to-noise ratio
per antenna equal to ρ. Rewrite (2.29) as follows:
$$ Y^T = \sqrt{\frac{\rho}{M}}\, H^T S^T + N^T, \tag{2.30} $$
and suppose that the singular value decomposition (SVD) of $H^T$ is given by:
$$ H^T = U \Sigma V^H, \tag{2.31} $$
where "$(\cdot)^T$" denotes matrix transposition and "$(\cdot)^H$" denotes the Hermitian
transpose of a matrix. We then have:
$$ U^H Y^T = \sqrt{\frac{\rho}{M}} \cdot \Sigma \cdot V^H S^T + U^H N^T. \tag{2.32} $$
As Σ is a diagonal matrix, we can further rewrite (2.32) as follows:
$$ [U^H Y^T]_i = \sqrt{\frac{\rho}{M}}\, \sigma_i\, [V^H S^T]_i + [U^H N^T]_i, \quad i = 1, 2, \ldots, \min\{M, N\}, \tag{2.33} $$
where $\sigma_i$ is the ith diagonal entry of Σ and "$[\cdot]_i$" denotes the ith row of a matrix. From
Eq. (2.33) we see that when there is perfect knowledge of H at both the transmitter
and receiver, the MIMO channel is decomposed into min{M, N} independent scalar
channels, which implies the potential capacity gain that is available by using MIMO
instead of SISO.
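The decomposition in (2.30) to (2.33) can be verified numerically. The sketch below draws a random channel, forms the noise-free received matrix, and checks that each row of $U^H Y^T$ is the corresponding row of $V^H S^T$ scaled by one singular value. The dimensions, the BPSK symbol block and the random seed are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, T, rho = 3, 2, 4, 10.0     # transmit antennas, receive antennas, symbol periods, SNR
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
S = rng.choice([-1.0, 1.0], size=(T, M)).astype(complex)   # hypothetical BPSK block
Y = np.sqrt(rho / M) * S @ H                                # (2.29) without the noise term

U, sigma, Vh = np.linalg.svd(H.T)      # H^T = U @ Sigma @ Vh; sigma holds min(M, N) values
lhs = U.conj().T @ Y.T                 # rows of U^H Y^T
rhs = np.sqrt(rho / M) * sigma[:, None] * (Vh @ S.T)[:len(sigma)]
```

Here `lhs` and `rhs` have min{M, N} rows, one per scalar channel; with noise present, each row would additionally carry the corresponding row of $U^H N^T$.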
STBC (Space-Time Block Coding) is an important class of coding techniques
for a MIMO system, which is also the focus of the last part of this work (Chapter 6).
In STBC, a block of input symbols is transformed into a code matrix C, the ith row
of which represents the antenna outputs at the ith symbol period. An example of STBC
is the Alamouti code [49]. Suppose that symbols $s_1$ and $s_2$ are to be transmitted
via two transmitting antennas. The Alamouti scheme constructs the following space-time
code word:
$$ C = \begin{bmatrix} s_1 & s_2 \\ -s_2^* & s_1^* \end{bmatrix}. \tag{2.34} $$
The first row of C denotes that at time interval $t_1$, $s_1$ will be transmitted from the
first antenna and $s_2$ will be transmitted from the second antenna. The second row of
C denotes that at time interval $t_2$, $-s_2^*$ will be transmitted from the first antenna and
$s_1^*$ will be transmitted from the second antenna. The importance of such a coding
scheme lies in the simplicity of its decoding and the fact that it exploits the full transmit
diversity of the two-transmit-antenna system.
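Suppose that the receiver has one receiving antenna, i.e., we have a 2 × 1
MIMO system. Suppose that H is given by
$$ H = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}. \tag{2.35} $$
Let S of Eq. (2.29) be equal to C. We then have:
$$ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} s_1 & s_2 \\ -s_2^* & s_1^* \end{bmatrix} \begin{bmatrix} h_1 \\ h_2 \end{bmatrix} + \begin{bmatrix} n_1 \\ n_2 \end{bmatrix}, \tag{2.36} $$
which, after conjugating the second row, can be rewritten as:
$$ \begin{bmatrix} y_1 \\ y_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} + \begin{bmatrix} n_1 \\ n_2^* \end{bmatrix}. \tag{2.37} $$
Note that $\begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix}$ is orthogonal (so is $\begin{bmatrix} s_1 & s_2 \\ -s_2^* & s_1^* \end{bmatrix}$), i.e.,
$$ \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix}^H \begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix} = \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix}. \tag{2.38} $$
Thus we can left-multiply Eq. (2.37) by $\begin{bmatrix} h_1 & h_2 \\ h_2^* & -h_1^* \end{bmatrix}^H$ and this gives:
$$ \begin{bmatrix} h_1^* y_1 + h_2 y_2^* \\ h_2^* y_1 - h_1 y_2^* \end{bmatrix} = \sqrt{\frac{\rho}{M}} \begin{bmatrix} |h_1|^2 + |h_2|^2 & 0 \\ 0 & |h_1|^2 + |h_2|^2 \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} + \begin{bmatrix} h_1^* n_1 + h_2 n_2^* \\ h_2^* n_1 - h_1 n_2^* \end{bmatrix}, \tag{2.39} $$
which shows that the detections of $s_1$ and $s_2$ are decoupled (as manifested by the diagonal
matrix in Eq. (2.39)), and that the full transmit
diversity of the "2I1O" channel is obtained (as manifested by the term $|h_1|^2 + |h_2|^2$ in Eq.
(2.39)), without altering the whiteness of the noise vector. Further, the Alamouti
scheme obtains full-rate transmission, i.e., it transmits two symbols over two symbol
periods.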
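The decoupling in (2.39) can be reproduced with a few lines of complex arithmetic. The sketch below transmits one Alamouti code word over a noise-free 2 × 1 channel and applies the linear combining step; the symbol values, channel coefficients and function names are hypothetical.

```python
import math

def alamouti_receive(s1, s2, h1, h2, rho, n1=0j, n2=0j):
    """Two received samples of (2.36) for M = 2 transmit antennas and one receive antenna."""
    g = math.sqrt(rho / 2)
    y1 = g * (s1 * h1 + s2 * h2) + n1
    y2 = g * (-s2.conjugate() * h1 + s1.conjugate() * h2) + n2
    return y1, y2

def alamouti_combine(y1, y2, h1, h2):
    """Left-multiply [y1, y2*]^T by the Hermitian transpose of [[h1, h2], [h2*, -h1*]]."""
    z1 = h1.conjugate() * y1 + h2 * y2.conjugate()
    z2 = h2.conjugate() * y1 - h1 * y2.conjugate()
    return z1, z2

s1, s2 = 1 + 1j, -1 + 1j                 # hypothetical QPSK symbols
h1, h2 = 0.8 - 0.3j, -0.2 + 0.9j         # hypothetical channel coefficients
rho = 4.0
y1, y2 = alamouti_receive(s1, s2, h1, h2, rho)
z1, z2 = alamouti_combine(y1, y2, h1, h2)
gain = math.sqrt(rho / 2) * (abs(h1) ** 2 + abs(h2) ** 2)
s1_hat, s2_hat = z1 / gain, z2 / gain
```

Each combiner output depends on only one symbol, scaled by $|h_1|^2 + |h_2|^2$, so the two symbols can be detected separately.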
Unfortunately, the coding scheme given in Eq. (2.34) cannot be extended to
numbers of transmit antennas other than two, as explained in [104], in
which Alamouti's scheme is shown to be a special case of the orthogonal space-time
block codes (OSTBC) from the theory of orthogonal designs [113]. By using
the theory of orthogonal designs it can be proved that it is not possible to construct
codes similar to the Alamouti code for spatial dimensions other than two. Instead
of OSTBC, quasi-orthogonal space-time block codes can be constructed [131], where
the orthogonality is relaxed:
$$ C = \begin{bmatrix} s_1 & s_2 & s_3 & s_4 \\ s_2^* & -s_1^* & s_4^* & -s_3^* \\ s_3 & -s_4 & -s_1 & s_2 \\ s_4^* & s_3^* & -s_2^* & -s_1^* \end{bmatrix}. \tag{2.40} $$
These codes are full-rate codes but they do not achieve the full transmit diversity
that is available.
In [112] the authors introduced linear dispersion codes (LDCs), together with
a criterion for designing the codes, which is to maximize the mutual information
obtainable by the constructed codes. Linear dispersion codes with the mutual
information design criterion are an important class of space-time coding schemes for
wireless MIMO communications.
Suppose that r information symbols $s_i$, i = 1, 2, . . . , r, are to be transmitted.
The linear dispersion code for $\{s_i\}$ is given by [112]:
$$ C = \sum_{i=1}^{r} s_i^{\mathrm{R}} A_i + j \sum_{i=1}^{r} s_i^{\mathrm{I}} B_i. \tag{2.41} $$
Here $j := \sqrt{-1}$ (we use ":=" and "$\triangleq$" interchangeably to denote the notion of "by
definition"), while $s_i^{\mathrm{R}} := \mathrm{Re}\{s_i\}$ and $s_i^{\mathrm{I}} := \mathrm{Im}\{s_i\}$ are the real and imaginary parts
of $s_i$, respectively. The matrices $A_i$ and $B_i \in \mathbb{R}^{T \times M}$ are real-valued linear dispersion
matrices of size T × M. The choices of the dispersion matrices $A_i$ and $B_i$, i = 1, 2, . . . , r,
decide the final form of the code, under the power constraint that $E\{\mathrm{tr}(\beta C (\beta C)^*)\} = MT$
with a normalizing factor β. It can be seen from the definition given in Eq.
(2.41) that linear dispersion codes include orthogonal space-time block codes as their
special cases.
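As an illustration of the last remark, the sketch below writes the Alamouti code word of (2.34) in the form (2.41) with r = 2 and T = M = 2. The particular dispersion matrices were worked out by hand for this example and are not taken from [112]; the function name `ldc_codeword` is hypothetical.

```python
def ldc_codeword(symbols, A, B):
    """C = sum_i Re(s_i) * A_i + j * sum_i Im(s_i) * B_i for real T x M matrices A_i, B_i."""
    T, M = len(A[0]), len(A[0][0])
    C = [[0j] * M for _ in range(T)]
    for s, Ai, Bi in zip(symbols, A, B):
        for t in range(T):
            for m in range(M):
                C[t][m] += s.real * Ai[t][m] + 1j * s.imag * Bi[t][m]
    return C

# Dispersion matrices that reproduce [[s1, s2], [-s2*, s1*]] (derived for this example).
A = [[[1, 0], [0, 1]], [[0, 1], [-1, 0]]]
B = [[[1, 0], [0, -1]], [[0, 1], [1, 0]]]

s1, s2 = 1 + 2j, 3 - 1j
C = ldc_codeword([s1, s2], A, B)
expected = [[s1, s2], [-s2.conjugate(), s1.conjugate()]]
```

Other space-time block codes that are linear in the real and imaginary parts of the symbols can be expressed in the same way by changing `A` and `B`.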
Linear dispersion codes are an important class of codes because, on
the one hand, the linear coding structure simplifies the code design process and, on the other
hand, a large part of MIMO's potential can be achieved via such a coding scheme.
Further, the decoding (not necessarily the optimal decoding) can be done linearly.
The design of a linear dispersion code centers around the design of the dispersion
matrices. The mutual information criterion described in [112] is used by the authors
to search for the dispersion matrices via numerical optimization. In what follows we
introduce the mutual information criterion.
As will be seen from the account given in Chapter 6, the linear dispersion
matrices $\{A_i\}$ and $\{B_i\}$, i = 1, 2, . . . , r, transform the original MIMO channel H into
an equivalent MIMO channel represented by the matrix $\mathcal{H} \in \mathbb{R}^{2NT \times 2r}$. The equivalent
MIMO channel $\mathcal{H}$ is a function of the linear dispersion matrices $A_i$ and $B_i$ for i =
1, 2, . . . , r and the original H. The inputs to the equivalent MIMO channel $\mathcal{H}$ are now
the real and imaginary parts of $s_i$ for i = 1, 2, . . . , r, while the outputs are the real
and imaginary parts of the (i, j)th entry of the received matrix Y, for i = 1, 2, . . . , T
and j = 1, 2, . . . , N. The linear dispersion code mutual information is defined as the
following quantity with respect to $\mathcal{H}$:
$$ C_{LD} := \frac{1}{2T} E_H \left\{ \log\det\left( I_{2NT} + \frac{\rho}{M} \mathcal{H} \mathcal{H}^T \right) \right\} = \frac{1}{2T} E_H \left\{ \log\det\left( I_{2r} + \frac{\rho}{M} \mathcal{H}^T \mathcal{H} \right) \right\}. \tag{2.42} $$
The selection of Ai and Bi (i = 1, 2, . . . , r) by maximizing the above quantity
was demonstrated to produce linear dispersion codes that have outstanding perfor-
mance in terms of probability of bit error. The main difficulty of this design process
is in that it requires a search for the dispersion matrices via numerical optimization
algorithms.
In the following we review the two important criteria for designing space-time
codes, i.e., the rank and determinant criteria [99] [102].
Suppose that C is transmitted and the receiver detects E ≠ C. Under the
assumption of quasi-static Rayleigh fading, the Chernoff bound [50] on the pairwise
probability of error is given by:
$$ P\{C \to E \mid H\} \le \det[I_M + \rho A(C, E)]^{-N}, \tag{2.43} $$
where
$$ A(C, E) = B(C, E)\, B^*(C, E), \tag{2.44} $$
and
$$ B(C, E) \triangleq (C - E)^T. \tag{2.45} $$
The rank criterion requires that the minimum rank of A(C, E) over all distinct codeword
pairs {C, E} be at least r in order to achieve a diversity gain of rN. Suppose
that the non-zero eigenvalues of A(C, E) are $\lambda_1, \lambda_2, \ldots, \lambda_r$. The determinant criterion
requires that the minimum of $\left( \prod_{i=1}^{r} \lambda_i \right)^{1/r}$ over all distinct codeword pairs {C, E} be
maximized in order to maximize the coding gain.
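The two criteria can be evaluated for a single codeword pair as sketched below. The example compares the Alamouti code with a deliberately poor, hypothetical "repetition" code for one pair of BPSK symbol vectors; a full design would take the minimum over all distinct pairs. The function names and the eigenvalue threshold are assumptions of this sketch.

```python
import numpy as np

def pairwise_metrics(C, E, tol=1e-12):
    """Return (rank, (product of non-zero eigenvalues)^(1/rank)) of A = B B^*, B = (C - E)^T."""
    B = (C - E).T
    A = B @ B.conj().T
    eig = np.linalg.eigvalsh(A)
    nonzero = eig[eig > tol]
    r = len(nonzero)
    return r, float(np.prod(nonzero) ** (1.0 / r))

def alamouti(s1, s2):
    return np.array([[s1, s2], [-np.conj(s2), np.conj(s1)]], dtype=complex)

def repetition(s1, s2):
    # Hypothetical non-orthogonal code: both antennas send the same symbol in each period.
    return np.array([[s1, s1], [s2, s2]], dtype=complex)

rank_a, gain_a = pairwise_metrics(alamouti(1, 1), alamouti(1, -1))
rank_r, gain_r = pairwise_metrics(repetition(1, 1), repetition(1, -1))
```

For this pair the Alamouti difference matrix has full rank 2, whereas the repetition code has rank 1 and therefore offers no transmit diversity for this error event.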
In Chapter 6, the MIMO channel considered is mainly the flat-fading
Rayleigh model. The fading is quasi-static with a coherence time of T channel uses, during
which the fades are assumed to be constant, though they may change from one block
of T channel uses to the next.
2.4 Notes on the Notations Used In The Thesis
In the thesis, we let either "$(\cdot)^*$" or "$(\cdot)^H$" denote the Hermitian transpose of
a matrix and either "$\triangleq$" or ":=" denote a new definition, i.e., "$\triangleq$" or ":=" means
"by definition". We let either "conj(X)" or "$\bar{X}$" denote the element-wise complex
conjugation of the matrix X.
The thesis is divided into four parts and a few notations have different mean-
ings/definitions in different parts. We identify here those notations. We note that
there is no ambiguity in understanding these notations within their contexts. M is
used to denote the number of receiving antennas in Chapters 3, 4 and 5 and
the number of transmission antennas in Chapter 6. N is used to denote the
DS/CDMA system processing gain in Chapter 3, the total number of samples
in Chapters 4 and 5, and the number of receiving antennas in
Chapter 6. T is used to denote the symbol period in Chapter 3 and the
number of channel uses in Chapter 6.
2.5 Systems, Models, Signals and Assumptions
Different parts of the thesis have different systems, models, signals and as-
sumptions, which are listed in Table 2.1.
Table 2.1: Systems, models, and signals considered in the thesis

Chapter | Model
--------+---------------------------------------------------------------------
3       | Multiuser DS/CDMA; Uniform Linear Array
        | Mobile users are assumed to be stationary or slowly moving; no Doppler effect
        | Frequency-selective multipath Rayleigh fading channel
        | Parameters φk,n, κk,n and αk,n
        | No angular spread; static fading
4       | Uniform Linear Array; D narrowband sources
        | No multipath propagation; static fading or no fading
5       | Uniform Linear Array; D narrowband sources
        | No multipath propagation; static fading or no fading
6       | MIMO system; flat Rayleigh fading
        | Block fading; quasi-static fading
Chapter 3
Spatial-Smoothing Based MUSIC-Type Joint DoA and Time-Delay Estimation
3.1 Introduction
In this chapter, we consider the problem of joint estimation of direction-of-arrival
(DoA), propagation delay, and complex channel gain for antenna-array
DS/CDMA communications over frequency-selective multipath channels. Accurate
timing and direction-of-arrival estimation is a prerequisite for successful demodulation
of the received data stream. Subspace-based joint DoA and delay
estimation algorithms for antenna-array DS/CDMA systems have been considered
in [51] and [52]. However, extension of these algorithms to realistic multipath fading
channels is difficult due to the degeneration of the signal subspace caused by multipath
propagation. While the approach given in [53], based on the DEcoupled Maximum
Likelihood (DEML) [54], provides a possible solution, it is not a blind scheme and
requires the transmission of a training symbol sequence. The approach reported in
[55] resorts to separate spatial and temporal processing of the 2-D space-time data.
Maximum-likelihood-type algorithms based on, e.g., Deterministic Maximum Likelihood
[56] or Weighted Subspace Fitting [57] perform multidimensional searches and
are computationally intensive.
To overcome this difficulty, in this work, we propose a blind MUSIC-type
estimation algorithm that utilizes the spatial smoothing preprocessing technique [58].
By properly transforming the received covariance matrix, the algorithm essentially
breaks the coherency within the received signals and recovers the full signal subspace
spanned by all dominant multipaths of all users while keeping the additive noise white
and Gaussian. MUSIC-type joint DoA and delay estimation can then be carried out
for individual paths. Based on obtained angle and timing estimates, we proceed to
evaluate the multipath fading coefficients. The algorithm requires only N parallel
exhaustive searches on the real line to find the DoAs and time delays, where N is
the system processing gain, and exhibits reduced computational complexity compared
to maximum-likelihood-based algorithms that rely on multidimensional searches in a
space of at least 2KL dimensions, where K is the number of users and L is the number
of multipaths (assumed to be the same for all users).
Our treatment takes into account large multipath delays (greater than a symbol
period). As we explain later, there is an intrinsic ambiguity associated with the
channel synchronization for this case. We show how this ambiguity can be resolved
during channel coefficient estimation.
Two variants of the proposed spatial-smoothing based MUSIC-type joint DoA-
delay estimation scheme are further studied, which utilize ST received vectors that
span a single information symbol period i.e., they are of length NP and their co-
variance matrix has dimensions NP ×NP (P is the length of the antenna subarray
after spatial smoothing). This is motivated by the findings in [60], [61] where it was
shown, in the context of minimum-variance-distortionless-response (MVDR)-based
synchronization, that the dimension of the matrix can severely impact the accuracy
of sample-average covariance matrix estimates: the larger the dimension, the higher
the number of data vector samples that are required to achieve a given accuracy.
This chapter is organized as follows: in Section 3.2 we define the system model;
in Section 3.3 we describe the spatial-smoothing based joint DoA and delay estima-
tion algorithm. The ambiguity caused by long delay spread is dealt with in Section
3.4. In Section 3.5 we describe the two variants of the proposed estimation scheme.
We present an estimation algorithm that uses a sequence of chip-shifted covariance
matrices and a lower complexity variant that utilizes a sequence of chip-shifted ST
signature vectors. The simulation results are presented in Section 3.6.
3.2 System Model
We consider the uplink of a K-user asynchronous direct sequence code division
multiple access (DS/CDMA) system. The continuous-time baseband transmitted
signal of the kth user, k = 0, 1, . . . , K − 1, can be expressed as follows:
$$ s_k(t) \triangleq \sum_{i=-\infty}^{+\infty} \sum_{l=0}^{N-1} b_k[i] \sqrt{E_k}\, d_k[l]\, P_{T_c}(t - iT - lT_c). \tag{3.1} $$
In (3.1), $b_k[i] \in \{-1, +1\}$ is the ith information symbol (bit) of the kth user; T is the
symbol duration; $E_k$ is the transmitted energy of the kth user; N is the system processing
gain; $\mathbf{d}_k \triangleq (d_k[0]\ d_k[1]\ \ldots\ d_k[N-1])^T$, with $d_k[l] \in \{-1/\sqrt{N}, +1/\sqrt{N}\}$,
is the unique code (signature) vector assigned to the kth user; $T_c$ is the chip duration;
and $P_{T_c}(t)$ is the chip pulse waveform.
The transmitted signal propagates through a multipath fading channel with L
paths (for simplicity in presentation we assume that the number of paths is the same
for all users). The receiver consists of a uniform linear antenna array (ULA) of M
antenna elements which are spaced half a wavelength apart. At the mth antenna
element, m = 0, 1, . . . , M − 1, the received baseband signal is given by
$$ y_m(t) = \sum_{k=0}^{K-1} \sum_{n=0}^{L-1} \alpha_{k,n}\, s_k(t - \tau_{k,n})\, e^{-jm\phi_{k,n}} + \eta_m(t), \tag{3.2} $$
where $\phi_{k,n}$ is the electrical angle¹ of the impinging signal corresponding to the nth
path of the kth user, $\eta_m(t)$ is AWGN, and $\alpha_{k,n}$ is the channel fading coefficient of
the nth path of the kth user, modeled as a complex deterministic unknown. The
delay variable $\tau_{k,n}$ accounts for both the transmission delay and the multipath
propagation delay of the nth path of the kth user. We assume that $\tau_{k,n} = \kappa_{k,n} T_c$ for
some $\kappa_{k,n} \in \{0, 1, \ldots, 2N-1\}$, i.e., we consider large delay spreads that can be up to
$(2N-1)T_c$ in value.
After chip-matched filtering and chip-rate sampling we obtain M discrete-time
received signals $y_m[l]$, m = 0, 1, . . . , M − 1:
$$ y_m[l] = \sum_{k=0}^{K-1} \sum_{n=0}^{L-1} \sqrt{E_k}\, \alpha_{k,n}\, b_k\!\left[ \left\lfloor \frac{l - \kappa_{k,n}}{N} \right\rfloor \right] \cdot d_k\!\left[ \lfloor (l - \kappa_{k,n}) \% N \rfloor \right] \cdot e^{-jm\phi_{k,n}} + \eta_m[l], \tag{3.3} $$
where "%" denotes the modulo operator, "$\lfloor \cdot \rfloor$" denotes the floor operator, and $\eta_m[l]$
is a zero-mean spatially and temporally uncorrelated Gaussian random variable with
variance $\sigma^2$.
¹ The electrical angle φ of an impinging wave is $\phi = \frac{2\pi d}{\lambda} \sin\theta$, where d is the antenna spacing, λ is
the wavelength, and θ is the actual angle of incidence.
The problem we consider here is the estimation of the direction of arrival φ0,n,
time delay τ0,n (equivalently κ0,n) and complex channel gain α0,n of the nth multipath
component, n = 0, 1, . . . , L−1, of the user of interest e.g., user 0. The only quantities
assumed known are the system processing gain N , the signature vector d0, the number
of active users K and the number of paths L.
3.3 Spatial Smoothing Based Joint Direction of
Arrival and Delay Estimation
The M-antenna array is divided into Q overlapping subarrays of P elements
each, i.e., Q = M − P + 1. The qth antenna subarray, q = 0, 1, . . . , Q − 1, is formed
by the q, q + 1, . . . , (q + P − 1)th elements of the original array. Let $\mathbf{y}_q[i]$ be the
space-time received vector of the qth subarray, spanning 2N chips around the ith information symbol
period, given by
$$ \mathbf{y}_q[i] \triangleq \left( y_q[iN-N], \ldots, y_{q+P-1}[iN-N], \ldots, y_q[iN+N-1], \ldots, y_{q+P-1}[iN+N-1] \right)^T = \sum_{k=0}^{K-1} S_k^{(q)} H_k \begin{bmatrix} b_k[i-2] \\ b_k[i-1] \\ b_k[i] \\ b_k[i+1] \end{bmatrix} + \mathbf{n}, \tag{3.4} $$
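where $H_k$ is defined as
$$ H_k \triangleq \begin{bmatrix} h_k & & & \emptyset \\ & h_k & & \\ & & h_k & \\ \emptyset & & & h_k \end{bmatrix}, \quad k = 0, 1, \ldots, K-1, \tag{3.5} $$
in which $h_k \triangleq \sqrt{E_k} \cdot [\alpha_{k,0}, \ldots, \alpha_{k,L-1}]^T$ is the complex channel gain vector for the kth
user that incorporates the transmitted energy $E_k$; $\mathbf{n} \triangleq (\eta_q[iN-N], \ldots, \eta_{q+P-1}[iN-N], \ldots, \eta_q[iN+N-1], \ldots, \eta_{q+P-1}[iN+N-1])^T$ is a zero-mean Gaussian noise vector
with covariance matrix $\sigma^2 I$; and $S_k^{(q)}$ is given by
$$ S_k^{(q)} \triangleq S_k \begin{bmatrix} \Psi_k & & & \emptyset \\ & \Psi_k & & \\ & & \Psi_k & \\ \emptyset & & & \Psi_k \end{bmatrix}^{q} = S_k J_k^q, \quad q = 0, 1, \ldots, Q-1. \tag{3.6} $$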
We note that one observation period of 2N chips can span up to four transmitted bits;
therefore $\mathbf{y}_q[i]$ in (3.4) will contain up to four bits $b_k[i-2]$, $b_k[i-1]$, $b_k[i]$ and $b_k[i+1]$.
In (3.6), $\Psi_k$ is defined as
$$ \Psi_k \triangleq \mathrm{diag}\left( [e^{-j\phi_{k,0}}, e^{-j\phi_{k,1}}, \ldots, e^{-j\phi_{k,L-1}}]^T \right), \tag{3.7} $$
while $S_k$ is the space-time signature matrix of the kth user, k = 0, 1, . . . , K − 1,
defined as
$$ S_k \triangleq \begin{bmatrix} S_k^l & S_k^m & S_k^u & 0 \\ 0 & S_k^l & S_k^m & S_k^l \end{bmatrix}. \tag{3.8} $$
The matrices $S_k^u$, $S_k^m$ and $S_k^l$ in (3.8), each of dimension NP × L, are the upper,
middle and lower part, respectively, of the following matrix of the kth user, i.e.,
$$ \begin{bmatrix} S_k^u \\ S_k^m \\ S_k^l \end{bmatrix} = (s_{k,0}\ s_{k,1}\ \ldots\ s_{k,L-1}), \tag{3.9} $$
where $s_{k,n}$ is the space-time signature vector of the nth path of the kth user, given
by
$$ s_{k,n} = \big( \underbrace{[0 \ldots 0]}_{\kappa_{k,n}}\ \underbrace{d_k[0] \ldots d_k[N-1]}_{N}\ \underbrace{[0 \ldots 0]}_{2N-\kappa_{k,n}} \big)^T \otimes a_{k,n}. \tag{3.10} $$
In (3.10), $a_{k,n} = [1, e^{-j\phi_{k,n}}, \ldots, e^{-j(P-1)\phi_{k,n}}]^T$ is the steering vector associated with
the nth path of the kth user, and ⊗ denotes the Kronecker product.
The proposed method is based on the eigenvalue decomposition (EVD) of the
spatially-smoothed received vector autocorrelation matrix
$$ R_{yy} \triangleq \frac{1}{Q} \sum_{q=0}^{Q-1} E\{\mathbf{y}_q[i]\, \mathbf{y}_q^H[i]\} = \frac{1}{Q} \sum_{k=0}^{K-1} S_k \left[ \sum_{q=0}^{Q-1} J_k^q H_k (J_k^q H_k)^H \right] S_k^H + \sigma^2 I. \tag{3.11} $$
Theorem 1 deals with the rank of the following matrices
$$ \sum_{q=0}^{Q-1} J_k^q H_k (J_k^q H_k)^H, \quad k = 0, 1, \ldots, K-1 \tag{3.12} $$
and shows that spatial smoothing preprocessing, when applied to the space-time covariance
matrix, can restore the full signal subspace spanned by all signal paths from
all users.
Theorem 1 The matrices $\sum_{q=0}^{Q-1} J_k^q H_k (J_k^q H_k)^H$, k = 0, 1, . . . , K − 1, are of full rank
if and only if Q ≥ L and the DoAs of the multipath signals of the kth user are distinct.
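Proof: Let us define the vectors $\varphi_k$ and matrices $F_k$ for k = 0, 1, . . . , K − 1, by
$$ \varphi_k \triangleq [e^{-j\phi_{k,0}}, e^{-j\phi_{k,1}}, \ldots, e^{-j\phi_{k,L-1}}]^T \tag{3.13} $$
and
$$ F_k \triangleq \begin{bmatrix} \varphi_k & & & \emptyset \\ & \varphi_k & & \\ & & \varphi_k & \\ \emptyset & & & \varphi_k \end{bmatrix}, \tag{3.14} $$
respectively. Then
$$ R_{yy} = \frac{1}{Q} \sum_{k=0}^{K-1} S_k \left[ \left( \sum_{q=0}^{Q-1} F_k^{\bullet q} (F_k^{\bullet q})^H \right) \bullet (H_k H_k^H) \right] S_k^H + \sigma^2 I, \tag{3.15} $$
where "•" denotes the Hadamard (element-wise) product and power and
$$ F_k^{\bullet 0} = \begin{bmatrix} \mathbf{1} & & & \emptyset \\ & \mathbf{1} & & \\ & & \mathbf{1} & \\ \emptyset & & & \mathbf{1} \end{bmatrix}, \tag{3.16} $$
where "$\mathbf{1}$" denotes an L × 1 column vector of all ones. Furthermore we have
$$ \sum_{q=0}^{Q-1} F_k^{\bullet q} (F_k^{\bullet q})^H = \begin{bmatrix} \sum_q \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H & & & \emptyset \\ & \sum_q \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H & & \\ & & \sum_q \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H & \\ \emptyset & & & \sum_q \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H \end{bmatrix} \tag{3.17} $$
which is positive definite if and only if Q ≥ L and the directions-of-arrival of the
signal paths from user k are distinct. This conclusion is obtained by observing that
$\sum_{q=0}^{Q-1} \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H = B^H B$ where $B = (b_{k,0}\ b_{k,1}\ \ldots\ b_{k,L-1})$ and
$b_{k,n} \triangleq [1, e^{j\phi_{k,n}}, \ldots, e^{j(Q-1)\phi_{k,n}}]^T$. Finally,
$$ \left[ \sum_{q=0}^{Q-1} \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H \right] \bullet (h_k h_k^H) = \mathrm{diag}(h_k) \left[ \sum_{q=0}^{Q-1} \varphi_k^{\bullet q} (\varphi_k^{\bullet q})^H \right] \mathrm{diag}(h_k^H) \tag{3.18} $$
which implies that $\sum_{q=0}^{Q-1} J_k^q H_k (J_k^q H_k)^H$, k = 0, 1, . . . , K − 1, are of full rank.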
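The rank condition of Theorem 1 can be checked numerically on one diagonal block of the matrix in (3.12). The sketch below forms $\sum_q (\Psi^q h)(\Psi^q h)^H$ for a single user with L = 3 paths and reports its rank for several values of Q and for a case with two equal DoAs. The path gains, the angles, the rank tolerance and the function names are hypothetical.

```python
import numpy as np

def smoothed_path_matrix(phis, h, Q):
    """One diagonal block of sum_q J^q H (J^q H)^H, i.e., sum_q (Psi^q h)(Psi^q h)^H."""
    phis = np.asarray(phis, dtype=float)
    h = np.asarray(h, dtype=complex)
    total = np.zeros((len(h), len(h)), dtype=complex)
    for q in range(Q):
        v = np.exp(-1j * q * phis) * h          # Psi^q applied to the channel gain vector
        total += np.outer(v, v.conj())
    return total

def rank(X):
    return int(np.linalg.matrix_rank(X, tol=1e-9))

h = [1.0, 0.5j, -0.8 + 0.2j]                    # hypothetical path gains, L = 3
distinct = [0.3, -0.9, 1.7]
repeated = [0.3, 0.3, 1.7]
ranks = {
    "Q=1, distinct": rank(smoothed_path_matrix(distinct, h, 1)),
    "Q=2, distinct": rank(smoothed_path_matrix(distinct, h, 2)),
    "Q=3, distinct": rank(smoothed_path_matrix(distinct, h, 3)),
    "Q=3, repeated": rank(smoothed_path_matrix(repeated, h, 3)),
}
```

Full rank L = 3 is reached only when Q ≥ L and all three angles differ; without smoothing (Q = 1) the block has rank one, which is the subspace degeneration that the preprocessing is meant to remove.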
Applying the EVD on Ryy we obtain
Ryy = V ΛV H , (3.19)
where the matrix V consists of the eigenvectors of Ryy and Λ is the (diagonal) eigen-
value matrix. It is well known that the noise subspace is spanned by the eigenvec-
tors associated with the 2NP − 3KL smallest eigenvalues. From the above theorem
and (3.19) we conclude that, if $V_n$ is the matrix whose columns are formed by the eigenvectors spanning the noise subspace, then the space-time signature matrices are orthogonal to $V_n$, i.e.,
\[ V_n^H S_k = 0, \quad k = 0, 1, \ldots, K-1. \tag{3.20} \]
Therefore, we obtain the estimates $\hat\phi_{0,n}$ and $\hat\kappa_{0,n}$ of the direction-of-arrival $\phi_{0,n}$ and the delay $\kappa_{0,n}$, $n = 0, 1, \ldots, L-1$, of all paths of user 0 from the locations of the $L$ largest peaks of the spectrum $\mathcal{P}(\phi, \kappa)$ defined as
\[ \mathcal{P}(\phi, \kappa) \triangleq \left\| V_n^H \cdot s(\phi, \kappa) \right\|^{-2}, \quad \phi \in [-\pi, \pi], \quad \kappa = 0, 1, \ldots, N-1, \tag{3.21} \]
where
\[ s(\phi, \kappa) \triangleq (\underbrace{0 \ldots 0}_{\kappa}\ \underbrace{d_0[0] \ldots d_0[N-1]}_{N}\ \underbrace{0 \ldots 0}_{N-\kappa})^T \otimes [1, e^{-j\phi}, \ldots, e^{-j(P-1)\phi}]^T. \tag{3.22} \]
To obtain the locations of the L largest peaks N parallel searches on the real line
need to be performed.
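The search described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the dissertation: it assumes NumPy, a noise-subspace matrix `Vn` that has already been computed, a spreading code `d` of length $N$, and a finite grid of candidate angles `phis` (the grid is an assumption; the text only speaks of searches over $\phi$). The names `st_signature` and `music_spectrum` are hypothetical.

```python
import numpy as np

def st_signature(d, kappa, phi, P):
    """Space-time signature s(phi, kappa) of (3.22): delayed code (x) steering vector."""
    N = len(d)
    t = np.concatenate([np.zeros(kappa), d, np.zeros(N - kappa)])  # length 2N
    a = np.exp(-1j * phi * np.arange(P))                           # length P
    return np.kron(t, a)                                           # length 2NP

def music_spectrum(Vn, d, P, phis):
    """Evaluate P(phi, kappa) = ||Vn^H s(phi, kappa)||^-2 on an N x len(phis) grid."""
    N = len(d)
    tiny = np.finfo(float).tiny            # guards against division by exactly zero
    spec = np.empty((N, len(phis)))
    for kappa in range(N):                 # one angle search per candidate delay
        for j, phi in enumerate(phis):
            s = st_signature(d, kappa, phi, P)
            power = np.linalg.norm(Vn.conj().T @ s) ** 2
            spec[kappa, j] = 1.0 / max(power, tiny)
    return spec
```

The outer loop over `kappa` corresponds to the $N$ parallel one-dimensional searches; the $L$ largest peaks of the returned array give the joint delay and angle estimates.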
In practice, $R_{yy}$ is not known. The most commonly used method for estimating it is sample averaging over $H$ received vectors. The resulting estimate is given by
\[ \hat{R}_{yy} = \frac{1}{QH} \sum_{q=0}^{Q-1} \sum_{i=0}^{H-1} y_q[i] y_q[i]^H. \tag{3.23} \]
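As a minimal sketch of (3.23), assuming NumPy and assuming the subarray vectors have already been stacked into an array of shape $(Q, H, d)$ (this storage layout and the name `smoothed_sample_covariance` are assumptions made for illustration):

```python
import numpy as np

def smoothed_sample_covariance(Y):
    """Average y_q[i] y_q[i]^H over Q subarrays and H snapshots.

    Y has shape (Q, H, d): Y[q, i] is the d-dimensional vector y_q[i].
    """
    Y = np.asarray(Y)
    Q, H, d = Y.shape
    flat = Y.reshape(Q * H, d)               # each row is one y_q[i]^T
    return flat.T @ flat.conj() / (Q * H)    # (1/QH) * sum of outer products
```

The result is Hermitian by construction, so a Hermitian eigensolver can be applied to it directly.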
3.4 Channel Estimation And Removal of Timing
Ambiguity
For large delay spreads (i.e., $NT_c \le \tau_{0,n} \le (2N-1)T_c$ for some path $n$), timing estimation is subject to the following ambiguity: the estimated $\hat\kappa_{0,n}$ will be within the range $[0, N-1]$, while the correct estimate would be either $\hat\kappa_{0,n}$ or $\hat\kappa_{0,n} + N$. In the following, we show how this ambiguity can be resolved during the channel estimation stage.
Suppose that the angle and timing estimates (see Section 3.3) are given by $\hat\phi_{0,n}$ and $\hat\kappa_{0,n}$ for the $n$th path of user 0. Define $y[i] \triangleq (y_0[iN-N], \ldots, y_{M-1}[iN-N], \ldots, y_0[iN+N-1], \ldots, y_{M-1}[iN+N-1])^T$ and the autocorrelation matrix
\[ R'_{yy} \triangleq E\left\{ y[i] y^H[i] \right\}. \tag{3.24} \]
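Construct the following space-time signature matrices with $\hat\phi_{0,n}$ and $\hat\kappa_{0,n}$:
\[ S_k(\hat\kappa_{0,0} + \delta_{0,0}, \ldots, \hat\kappa_{0,L-1} + \delta_{0,L-1}, \hat\phi_{0,0}, \ldots, \hat\phi_{0,L-1}) \triangleq \begin{pmatrix} S_k^l & S_k^m & S_k^u & 0 \\ 0 & S_k^l & S_k^m & S_k^l \end{pmatrix}, \tag{3.25} \]
where $S_k^u$, $S_k^m$ and $S_k^l$ are defined similarly to (3.8), (3.9) and (3.10), with the size of the steering vector being $M$.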
In (3.25), $\delta_{0,n}$, $n = 0, 1, \ldots, L-1$, can take two values, 0 or $N$, corresponding to the two possible values for the actual delay, $\hat\kappa_{0,n}$ or $\hat\kappa_{0,n} + N$. We find $\delta_{0,n}$, $n = 0, 1, \ldots, L-1$, by minimizing the smallest eigenvalue of $G$ (see below) over all $2^L$ values of the vector $\delta_0 \triangleq [\delta_{0,0}, \delta_{0,1}, \ldots, \delta_{0,L-1}]^T$. The associated eigenvector is the estimate of the channel coefficients $h_0$. In other words, the channel estimate is
\[ \arg\min_{\|h_0\|=1,\ \delta_0} h_0^H\, G(\delta_{0,0}, \delta_{0,1}, \ldots, \delta_{0,L-1})\, h_0. \tag{3.26} \]
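The matrix $G(\delta_{0,0}, \delta_{0,1}, \ldots, \delta_{0,L-1})$ in (3.26) is defined by
\[ G \triangleq \begin{pmatrix} S_k^l \\ 0 \end{pmatrix}^H (R'_{yy})^{-1} \begin{pmatrix} S_k^l \\ 0 \end{pmatrix} + \begin{pmatrix} S_k^m \\ S_k^l \end{pmatrix}^H (R'_{yy})^{-1} \begin{pmatrix} S_k^m \\ S_k^l \end{pmatrix} + \begin{pmatrix} S_k^u \\ S_k^m \end{pmatrix}^H (R'_{yy})^{-1} \begin{pmatrix} S_k^u \\ S_k^m \end{pmatrix} + \begin{pmatrix} 0 \\ S_k^l \end{pmatrix}^H (R'_{yy})^{-1} \begin{pmatrix} 0 \\ S_k^l \end{pmatrix}, \tag{3.27} \]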
where $S_k^u$, $S_k^l$ and $S_k^m$, as functions of $\delta_{0,0}, \delta_{0,1}, \ldots, \delta_{0,L-1}$, are given by (3.25). This technique is based on the minimum-variance-distortionless-response (MVDR) channel estimation method. The reason behind this choice, as opposed to the use of MUSIC-type methods, is the unavailability of the rank of the signal subspace of $R'_{yy}$. In contrast to subspace algorithms, MVDR-type algorithms do not require knowledge of the rank of the signal subspace.
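The exhaustive search over the $2^L$ candidate vectors $\delta_0$ in (3.26) can be sketched as below. This is a hedged illustration assuming NumPy: `build_G` is a hypothetical user-supplied callback that returns the $L \times L$ Hermitian matrix $G$ of (3.27) for a given $\delta_0$; constructing $G$ from the signature matrices is not shown here.

```python
import itertools
import numpy as np

def resolve_timing_ambiguity(build_G, L, N):
    """Minimize the smallest eigenvalue of G(delta) over delta in {0, N}^L.

    build_G(delta) is a hypothetical callback returning the Hermitian matrix G.
    Returns the best delta tuple and the unit-norm channel estimate h0.
    """
    best = None
    for delta in itertools.product((0, N), repeat=L):   # all 2^L candidates
        w, V = np.linalg.eigh(build_G(delta))           # eigenvalues in ascending order
        if best is None or w[0] < best[0]:
            best = (w[0], delta, V[:, 0])               # keep the minimizing eigenpair
    return best[1], best[2]
```

The eigenvector is determined only up to a unit-modulus factor, which matches the constraint $\|h_0\| = 1$ in (3.26). The cost grows as $2^L$, which stays small for the path counts considered in the simulations.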
3.5 SS Based Joint DoA-Delay Estimation Using Chip-Shifted Estimates of the Covariance Matrix
One of the characteristics of the algorithm given in Section 3.3 is that it em-
ploys space-time (ST) received vectors that span a time interval equal to twice the
information symbol period i.e., their length is equal to 2NP , where P is the length
of the antenna subarray and N the CDMA processing gain. As a result, the algo-
rithm requires the estimation of a 2NP × 2NP received vector covariance matrix.
However, in the context of minimum-variance-distortionless-response (MVDR)-based
synchronization, it was recently shown ([60], [61]) that the dimension of the matrix
can severely impact the accuracy of sample-average covariance matrix estimates: the
larger the dimension the higher the number of data vector samples that are required
to achieve a given accuracy. Therefore, motivated by the findings in [60], [61] we
consider two variants of the proposed scheme for the joint estimation of the DoA and
the timing of a user in a DS/CDMA system. The proposed algorithms utilize ST
received vectors that span a single information symbol period i.e., they are of length
NP and their covariance matrix has dimensions NP ×NP . Simulation studies reveal
that this reduction results in improved performance in short data record situations.
The treatment given in the following subsections will not include the case of
large delay spread.
3.5.1 Joint DoA and Delay Estimation Using Chip-Shifted
Covariance Matrix Estimates
The $M$-antenna array is divided into $Q$ overlapping subarrays of $P$ elements each, i.e., $Q = M - P + 1$. The $q$th antenna subarray, $q = 0, 1, \ldots, Q-1$, is formed by the $q, (q+1), \ldots, (q+P-1)$th elements of the original array. Let $y_q^{(\kappa)}[i]$ be the space-time received vector of the $q$th subarray over the $i$th information symbol period, that is observed with a delay of $\kappa$ chips, $\kappa = 0, 1, \ldots, N-1$. The reason for collecting $\kappa$-chip shifted ST observations will become apparent later. The vector $y_q^{(\kappa)}[i]$ is given by
\[ \begin{aligned} y_q^{(\kappa)}[i] &\triangleq (y_q[iN+\kappa], \ldots, y_{q+P-1}[iN+\kappa], \ldots, y_q[iN+N-1+\kappa], \ldots, y_{q+P-1}[iN+N-1+\kappa])^T \\ &= \sum_{k=0}^{K-1} S_k^{(q,\kappa)} \begin{pmatrix} h_k & \emptyset \\ \emptyset & h_k \end{pmatrix} \begin{pmatrix} b_k[i-1] \\ b_k[i] \end{pmatrix} + n^{(\kappa)}, \end{aligned} \tag{3.28} \]
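where $n^{(\kappa)} \triangleq (\eta_q[iN+\kappa], \ldots, \eta_{q+P-1}[iN+\kappa], \ldots, \eta_q[iN+N-1+\kappa], \ldots, \eta_{q+P-1}[iN+N-1+\kappa])^T$ is a zero-mean Gaussian noise vector with covariance matrix $\sigma^2 I$; $h_k$ is the channel gain vector; and $S_k^{(q,\kappa)}$ is an $NP \times 2L$ matrix given by
\[ S_k^{(q,\kappa)} \triangleq S_k^{(\kappa)} \begin{pmatrix} \Psi_k & \emptyset \\ \emptyset & \Psi_k \end{pmatrix}^q = S_k^{(\kappa)} K_k^q, \quad q = 0, 1, \ldots, Q-1. \tag{3.29} \]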
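The stacking order of the chip-shifted subarray vector in (3.28) (antenna index varying fastest, chip index slowest) can be illustrated with a short sketch. It assumes NumPy and assumes that the chip-rate samples are stored in an array `y` of shape (number of chips, $M$); that storage layout and the name `st_vector` are illustrative choices, not part of the original text.

```python
import numpy as np

def st_vector(y, q, P, i, N, kappa=0):
    """Return the length-NP vector y_q^{(kappa)}[i] from chip-rate samples.

    y[c, m] is assumed to hold the sample of antenna m at chip c.
    """
    start = i * N + kappa
    block = y[start:start + N, q:q + P]      # N chips by P subarray elements
    if block.shape != (N, P):
        raise ValueError("window falls outside the recorded samples")
    return block.reshape(-1)                 # antenna index varies fastest
```

Collecting `st_vector(y, q, P, i, N, kappa)` over all subarrays $q$ and symbols $i$ provides the vectors needed for a chip-shifted sample covariance estimate.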
In (3.29), $\Psi_k$ is as defined in (3.7), while $S_k^{(\kappa)}$ is the space-time signature matrix of the $k$th user, $k = 0, 1, \ldots, K-1$, defined as
\[ S_k^{(\kappa)} \triangleq \left( l_{k,0}^{(\kappa)} \ldots l_{k,L-1}^{(\kappa)}\ u_{k,0}^{(\kappa)} \ldots u_{k,L-1}^{(\kappa)} \right). \tag{3.30} \]
The vectors $u_{k,n}^{(\kappa)}$ and $l_{k,n}^{(\kappa)}$ in (3.30) are the upper and lower halves, respectively, of the space-time signature vector $s_{k,n}^{(\kappa)}$ for the $n$th path of the $k$th user that is given by
\[ s_{k,n}^{(\kappa)} \triangleq (\underbrace{0 \ldots 0}_{\kappa_{k,n}+\kappa'}\ \underbrace{d_k[0] \ldots d_k[N-1]}_{N}\ \underbrace{0 \ldots 0}_{N-\kappa_{k,n}-\kappa'})^T \otimes a_{k,n}. \tag{3.31} \]
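\[ R_{yy}^{(\kappa)} = \frac{1}{Q} \left( S_0^{(\kappa)} \ldots S_{K-1}^{(\kappa)} \right) \begin{pmatrix} \sum_{q=0}^{Q-1} K_0^q L_0 (K_0^q L_0)^H & & \emptyset \\ & \ddots & \\ \emptyset & & \sum_{q=0}^{Q-1} K_{K-1}^q L_{K-1} (K_{K-1}^q L_{K-1})^H \end{pmatrix} \begin{pmatrix} (S_0^{(\kappa)})^H \\ \vdots \\ (S_{K-1}^{(\kappa)})^H \end{pmatrix} + \sigma^2 I \tag{3.32} \]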
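In (3.31), $\kappa'$ is defined as follows:
\[ \kappa' = \begin{cases} -\kappa & \text{if } \kappa \le \kappa_{k,n} \\ N - \kappa & \text{if } \kappa > \kappa_{k,n} \end{cases}. \tag{3.33} \]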
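The two estimation algorithms that we will describe are based on the eigenvalue decomposition (EVD) of the spatially-smoothed received autocorrelation matrix
\[ \begin{aligned} R_{yy}^{(\kappa)} &\triangleq \frac{1}{Q} \sum_{q=0}^{Q-1} E\{ y_q^{(\kappa)}[i]\, y_q^{(\kappa)}[i]^H \} \\ &= \frac{1}{Q} \sum_{k=0}^{K-1} S_k^{(\kappa)} \left[ \sum_{q=0}^{Q-1} K_k^q L_k (K_k^q L_k)^H \right] (S_k^{(\kappa)})^H + \sigma^2 I, \end{aligned} \tag{3.34} \]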
where $L_k$ is defined as:
\[ L_k \triangleq \begin{pmatrix} h_k & \emptyset \\ \emptyset & h_k \end{pmatrix}, \quad k = 0, 1, \ldots, K-1. \tag{3.35} \]
Alternatively, (3.34) can be written in matrix form as shown in (3.32).
The following theorem deals with the rank of the matrices
\[ \sum_{q=0}^{Q-1} K_k^q L_k (K_k^q L_k)^H, \quad k = 0, 1, \ldots, K-1. \tag{3.36} \]
Theorem 2 The matrices $\sum_{q=0}^{Q-1} K_k^q L_k (K_k^q L_k)^H$, $k = 0, 1, \ldots, K-1$, are of full rank if and only if $Q \ge L$ and the DoAs of the multipath signals of the $k$th user are distinct.
Proof: The proof is similar to that for Theorem 1 and is omitted here.
The first of the two estimation algorithms is based on the use of a sequence of chip-shifted estimates of the ST covariance matrix $R_{yy}^{(\kappa)}$, $\kappa = 0, 1, \ldots, N-1$. The second one uses the non-shifted estimate $R_{yy}^{(0)}$.
Applying eigenvalue decomposition on $R_{yy}^{(\kappa)}$, $\kappa = 0, 1, \ldots, N-1$, we obtain
\[ R_{yy}^{(\kappa)} = V^{(\kappa)} \Lambda^{(\kappa)} (V^{(\kappa)})^H, \tag{3.37} \]
where the matrix $V^{(\kappa)}$ contains in its columns the eigenvectors of $R_{yy}^{(\kappa)}$ and $\Lambda^{(\kappa)}$ is the (diagonal) eigenvalue matrix. If $V_n^{(\kappa)}$ is the matrix whose columns are formed by the eigenvectors spanning the noise subspace, then the space-time signature vectors are orthogonal to $V_n^{(\kappa)}$, i.e.,
\[ [V_n^{(\kappa)}]^H u_{k,n}^{(\kappa)} = 0 \ \text{ and } \ [V_n^{(\kappa)}]^H l_{k,n}^{(\kappa)} = 0, \quad \text{for } k = 0, 1, \ldots, K-1,\ n = 0, 1, \ldots, L-1. \tag{3.38} \]
In the special case when $\kappa$ is equal to $\kappa_{k,n}$, the lower half ($l_{k,n}^{(\kappa)}$) of the space-time signature vector becomes zero while the upper half $u_{k,n}^{(\kappa)}$ becomes equal to $(d_k[0] \ldots d_k[N-1])^T \otimes a_{k,n}$. This simplifies expression (3.38) to
\[ (V_n^{(\kappa)})^H u_{k,n}^{(\kappa)} = 0, \quad \text{when } \kappa = \kappa_{k,n}. \tag{3.39} \]
Therefore, we may estimate the direction-of-arrival $\phi_{0,n}$ and the delay $\kappa_{0,n}$, $n = 0, 1, \ldots, L-1$, of all paths of user 0 as the locations of the peaks of the spectrum $\mathcal{P}(\phi, \kappa)$ defined as follows:
\[ \mathcal{P}(\phi, \kappa) \triangleq \| (V_n^{(\kappa)})^H s(\phi) \|^{-2}, \tag{3.40} \]
where $s(\phi)$ is the space-time signature vector defined as
\[ s(\phi) \triangleq (d_0[0] \ldots d_0[N-1])^T \otimes [1, e^{-j\phi}, \ldots, e^{-j(P-1)\phi}]^T. \tag{3.41} \]
In practice, $R_{yy}^{(\kappa)}$ is not known and, as described earlier, is estimated as follows:
\[ \hat{R}_{yy}^{(\kappa)} = \frac{1}{QH} \sum_{q=0}^{Q-1} \sum_{i=0}^{H-1} y_q^{(\kappa)}[i]\, y_q^{(\kappa)}[i]^H. \tag{3.42} \]
3.5.2 Joint DoA and Delay Estimation Using the Non-Shifted
Covariance Matrix Estimate
The aforementioned algorithm requires the EVD of $N$ chip-shifted matrices $R_{yy}^{(\kappa)}$, $\kappa = 0, 1, \ldots, N-1$, a task that can be computationally complex. In this subsection we describe an algorithm of lower complexity that requires the EVD of a single estimated covariance matrix.
Applying EVD on $R_{yy}^{(0)}$ we obtain
\[ R_{yy}^{(0)} = V^{(0)} \Lambda^{(0)} (V^{(0)})^H, \tag{3.43} \]
where the matrix $V^{(0)}$ consists of the eigenvectors of $R_{yy}^{(0)}$ and $\Lambda^{(0)}$ is the (diagonal) eigenvalue matrix. Let $V_n^{(0)}$ be the matrix whose columns consist of the noise eigenvectors (that in turn are associated with the smallest eigenvalues). The space-time signature vectors are orthogonal to $V_n^{(0)}$, i.e.,
\[ [V_n^{(0)}]^H u_{k,n}^{(0)} = 0 \ \text{ and } \ [V_n^{(0)}]^H l_{k,n}^{(0)} = 0, \quad \text{for } k = 0, 1, \ldots, K-1,\ n = 0, 1, \ldots, L-1. \tag{3.44} \]
Therefore, we may estimate the direction-of-arrival $\phi_{0,n}$ and the delay $\kappa_{0,n}$, $n = 0, 1, \ldots, L-1$, of all paths of user 0 as the locations of the peaks of the spectrum $\mathcal{P}(\phi, \kappa)$ defined as follows:
\[ \mathcal{P}(\phi, \kappa) \triangleq \frac{1}{\| [V_n^{(0)}]^H s^{(L)}(\phi, \kappa) \|^2 + \| [V_n^{(0)}]^H s^{(U)}(\phi, \kappa) \|^2}, \tag{3.45} \]
where $s^{(U)}(\phi, \kappa)$ and $s^{(L)}(\phi, \kappa)$ are the upper and lower half, respectively, of the space-time signature vector,
\[ s'(\phi, \kappa) \triangleq [\underbrace{0 \ldots 0}_{\kappa}\ \underbrace{d_0[0] \ldots d_0[N-1]}_{N}\ \underbrace{0 \ldots 0}_{N-\kappa}]^T \otimes [1, e^{-j\phi}, \ldots, e^{-j(P-1)\phi}]^T. \tag{3.46} \]
In practice, $R_{yy}^{(0)}$ is not known and is estimated through sample-averaging over $H$ received vector samples (as in (3.42)).
3.6 Simulation Results
In the first simulation we consider a 4-user system with a processing gain equal to 35. The number of paths for all users is 3. The receiver ULA consists of 9 sensors while the subarray size is 5. Fig. 3.1 shows the spectrum $\mathcal{P}(\phi, \kappa)$ for the proposed algorithm (see eq. (3.21)). It is plotted against $\phi$ and, for simplicity, we show six curves corresponding to the delays $\kappa = 0, 1, \ldots, 5$. The actual angles of arrival are shown by the vertical lines ($\phi_{0,0} = 0.7250$, $\phi_{0,1} = -2.6583$ and $\phi_{0,2} = 1.6916$) while the actual delays are $\kappa_{0,0} = 3$, $\kappa_{0,1} = 4$ and $\kappa_{0,2} = 5$. We see that the three largest peaks are located at the actual DoAs, and correspond to the correct delays. In Fig. 3.2 we plot the MSE (squared error between the estimates $[\hat\phi_{0,0}, \hat\kappa_{0,0}, \ldots, \hat\phi_{0,L-1}, \hat\kappa_{0,L-1}]^T$ and the true values $[\phi_{0,0}, \kappa_{0,0}, \ldots, \phi_{0,L-1}, \kappa_{0,L-1}]^T$ averaged over 250 runs) for the conventional and the proposed MUSIC-based algorithms as a function of the SNR of user 0, with the other users’ SNRs fixed at 16, 10 and 15 dB, respectively. In Fig. 3.3 we give the MSE performance in terms of the number of received samples. In Fig. 3.4 the MSE of the channel estimation with respect to the SNR of user 0 is given assuming correct timing estimation, while in Fig. 3.5 we set $\kappa_{0,1} = 39$ and we plot the probability of ambiguity resolution of the proposed approach against the SNR of user 0. The superiority of the proposed spatial smoothing technique is evident.

Figure 3.1: Spatial-smoothing-based MUSIC spectrum (dB) vs. electrical angle (DoA) for six possible values of the delay of user 0
In the second simulation we consider a 4-user system with a processing gain
N = 35. The users’ signal-to-noise ratios (SNRs) are 10, 13, 15 and 16dB, respectively.
The number of paths for each channel is L = 3 (assumed to be the same for all users).
The receiver is a ULA of M = 9 sensors while the subarray size is P = 5.
In Figs. 3.6 and 3.7 we plot the spectrum $\mathcal{P}(\phi, \kappa)$ for the algorithms presented in Section 3.5 as a function of the DoA $\phi$. Each curve corresponds to a different value of $\kappa$ (for clarity of presentation we assume the maximum possible path delay is 5). The actual delays are $\kappa_{0,0} = 3$, $\kappa_{0,1} = 4$, $\kappa_{0,2} = 5$. Each of the vertical dotted lines denotes the electrical angle of the $n$th path of the $k$th user and is marked by the pair $k, n$. The corresponding values are $\phi_{0,0} = 0.7205$ rad, $\phi_{0,1} = 1.6916$ rad and $\phi_{0,2} = -2.6583$ rad, respectively. In Figs. 3.6 and 3.7 the data record size is $H = 570$. We see that the algorithm based on chip-shifted matrices exhibits taller and sharper peaks compared to its lower-complexity counterpart.

Figure 3.2: MSE of joint DoA and delay estimator vs. SNR of user 0 (curves: without and with spatial smoothing)
Figure 3.3: MSE of joint DoA and delay estimator vs. number of samples (curves: without and with spatial smoothing)
Figure 3.4: MSE of channel estimator vs. SNR of user 0
Figure 3.5: Probability of ambiguity resolution vs. SNR of user 0
In Fig. 3.8 we compare the algorithm given in Section 3.3 (that utilizes received vectors of length $2NP$) with that given in Section 3.5 (that utilizes received vectors of length $NP$). The data record size is $H = 570$. We plot the probability of acquisition for each path of user 0 against the SNR of user 0. This is the probability that the estimated delay is equal to the actual delay (in number of chips). In Fig. 3.9 we plot the total mean squared error (MSE) of the joint DoA and delay estimators defined as $|\hat\kappa_{k,n} - \kappa_{k,n}|^2 + |\hat\phi_{k,n} - \phi_{k,n}|^2$. The MSE was evaluated over 40 Monte-Carlo runs. The superiority of the proposed algorithms is apparent.
3.7 Conclusion
We described a blind MUSIC-type DoA and delay estimation algorithm that is
suitable for antenna-array-based DS/CDMA systems in multipath environments. The
proposed algorithm is based on the spatial smoothing preprocessing technique and
requires only N one-dimensional parameter searches to estimate the timings (within
a described ambiguity factor) and DoAs. These estimates are then used for channel
gain estimation and timing ambiguity resolution. We further study two variants of
the proposed estimation schemes which are suited for short-data-record scenarios.
Figure 3.6: Spectrum of the proposed spatial-smoothing-based MUSIC algorithm (using chip-shifted covariance matrices).
Figure 3.7: Spectrum of the proposed spatial-smoothing-based MUSIC algorithm (using a non-time-shifted matrix).
Figure 3.8: Probability of acquisition vs. SNR of user 0
Figure 3.9: MSE of joint DoA and delay estimator vs. SNR of user 0
Chapter 4
The MUSIC MDL Criterion
4.1 Introduction
In array signal processing, the classical approach to signal enumeration is the
method proposed by M. Wax and T. Kailath [62]. It is based on the information the-
oretical minimum description length (MDL) criterion and is simple and computation-
ally efficient. The most recent research results regarding the MDL criterion include
for example [63], where the authors present a detailed analysis of the MDL criterion
and of the situation of overmodeling, based on their findings regarding the conditions
for the MDL criterion in analytic form. In addition to the MDL detection strategy,
there exist many other techniques and solutions to the problem of source enumera-
tion, which exploit specific features and properties of a particular signal model. For
example, the authors in [64] consider closely spaced sources and identify the asymp-
totic covariance matrix and its eigenvectors under this assumption. In their method,
the estimated eigenvectors are tested against the derived asymptotic pattern. In [65]
the authors suggest a detection method which takes into account the measured array
manifold and uses the eigenvectors of the estimated covariance matrix. Finally in
[66], the author designs a subspace based detection method specifically for CDMA
communications. While these techniques constitute better alternatives that utilize
the eigenvectors of the sample covariance matrix (in contrast, the MDL criterion uti-
lizes only the eigenvalues), they are specific to their respective system models and
they do not provide further insight into how we can exploit the extra information
provided by the eigenvectors.
In this work, we approach the problem of number of signals detection from the
direction of joint order detection and direction-of-arrival estimation. More specifically,
for each possible value of the number of signals k, the proposed solution constructs
first the (M − k)-dimensional (M is the total number of array elements) testing
subspaces, for k = 1, 2, . . . , M−1, and generates the corresponding MUSIC spectra. It
then utilizes the DoA estimates and the corresponding values of the MUSIC spectrum,
together with the M − k smallest eigenvalues of the sample covariance matrix, to
compute an MDL-type metric. This metric uses not only the eigenvalues but also the
eigenvectors of the sample covariance matrix, and jointly exploits the model order
information contained within. This results in better performance in low SNRs. In
addition, this method is based on a generic ULA system and, provided that certain
minor assumptions (discussed in later sections) are satisfied, it can be applied to a
wide variety of signal structures.
The rest of the chapter is structured as follows: in Sections 4.2 and 4.3 we
define the array system and give a brief description of the MDL criterion, respectively;
in Section 4.4 we detail the model order detecting criterion and its properties of
consistency. The simulation results are presented in Section 4.5, while Section 4.6
concludes the chapter.
4.2 System Model
We consider $D$ signals $x_1(t), x_2(t), \ldots, x_D(t)$, each impinging with an electrical angle $\phi_k \in [-\pi, \pi]$ on an $M$-element uniform linear array (ULA). Let $y(t)$ be the vector consisting of the $M$ received signals from the ULA at time $t$. Sampling $y(t)$ at instants $t_i$ ($i = 1, 2, \ldots, N$) we obtain the discrete-time received vector sequence $y(t_i)$ given by
\[ y(t_i) = A x(t_i) + n(t_i), \quad i = 1, 2, \ldots, N. \tag{4.1} \]
In (4.1), $x(t_i) \triangleq [x_1(t_i), x_2(t_i), \ldots, x_D(t_i)]^T$ is the vector containing the $D$ transmitted signals and $A \triangleq [a(\phi_1), a(\phi_2), \ldots, a(\phi_D)]$ is the array manifold whose $k$th column $a(\phi_k) \triangleq [1, e^{-j\phi_k}, \ldots, e^{-j(M-1)\phi_k}]^T$, $k = 1, 2, \ldots, D$, is the steering vector associated with the input $x_k(t)$. In general, the angles $\phi_k$, $k = 1, 2, \ldots, D$, are distinct, which implies that the matrix $A$ is of full column rank. We assume that the transmitted vectors $x(t_i)$, $i = 1, 2, \ldots, N$, are random, identically distributed, with a covariance matrix given by $R_{xx} \triangleq E\{x(t)x(t)^H\}$, where $E\{\cdot\}$ is the expectation operator. Finally, $n(t_i)$ is the additive Gaussian noise, assumed to be zero-mean, spatially and temporally white, with covariance matrix $\sigma^2 I_M$, where $I_M$ is the $M \times M$ identity matrix.
The problem we consider in this chapter is the estimation of the number of
signals D (1 ≤ D < M) and the DoAs φk, k = 1, 2, . . . , D, given N observed vector
samples y(ti), i = 1, 2, . . . , N .
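For experimentation, the model (4.1) can be simulated with a few lines of NumPy. The sketch below is illustrative only: it assumes unit-power, circularly symmetric complex Gaussian sources with $R_{xx} = I$ (the text only requires the sources to be random and identically distributed), and the function names and the `snr_db` parameter are hypothetical.

```python
import numpy as np

def steering_matrix(phis, M):
    """Array manifold A = [a(phi_1), ..., a(phi_D)] of an M-element ULA."""
    m = np.arange(M)[:, None]
    return np.exp(-1j * m * np.asarray(phis, dtype=float)[None, :])   # M x D

def simulate_snapshots(phis, M, N, snr_db, rng):
    """Draw N snapshots y(t_i) = A x(t_i) + n(t_i) as the columns of an M x N array."""
    D = len(phis)
    A = steering_matrix(phis, M)
    # Assumption: unit-power complex Gaussian sources.
    x = (rng.standard_normal((D, N)) + 1j * rng.standard_normal((D, N))) / np.sqrt(2)
    sigma2 = 10.0 ** (-snr_db / 10.0)        # noise variance for the chosen SNR
    n = np.sqrt(sigma2 / 2) * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
    return A @ x + n
```

The sample covariance of the returned snapshots `Y` is then `Y @ Y.conj().T / N`.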
4.3 Background
Given $N$ observations that are generated from some distribution, the information theoretical MDL [67] principle is a criterion for model selection among the family of competing models (probability densities) $\{f(\cdot|\xi), \xi \in \Xi\}$, parameterized with parameter vector $\xi$, that best describes the data. It seeks the minimum encoding length of the $N$ observations, which is formulated in [67] and is given by
\[ -\ln f(y(t_1), \ldots, y(t_N) \,|\, \hat\xi) + \tfrac{1}{2} m \ln N. \tag{4.2} \]
In (4.2) the first term is the minus log-likelihood of the observations $y(t_1), y(t_2), \ldots, y(t_N)$, with $\hat\xi$ being the maximum likelihood estimate of the vector of parameters based on the same observations. The second term is a penalty term that is given in terms of the number of data $N$ and the number of independent parameters $m$ within the vector $\xi$.
4.3.1 Optimal MDL-based signal enumeration criterion
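The optimal signal enumeration criterion in the minimum description length (MDL) sense can be obtained as follows. Let $\hat{R}_{yy} \triangleq \frac{1}{N} \sum_{i=1}^{N} y(t_i) y^H(t_i)$ be the sample covariance matrix of the $N$ observations. Assuming that there are $k$ sources in the system, the MLE of the DoA vector $\psi^{(k)} \triangleq [\phi_1, \phi_2, \ldots, \phi_k]$ is given by [70]
\[ \hat\psi^{(k)} = \arg\max_{\psi^{(k)}} \left\{ -N \ln\det\left[ P_{A[\psi^{(k)}]} \hat{R}_{yy} P_{A[\psi^{(k)}]} + \tfrac{1}{M-k} \operatorname{tr}\!\left( P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} \right) \cdot P^{\perp}_{A[\psi^{(k)}]} \right] \right\}. \tag{4.3} \]
Here $P_{A[\psi^{(k)}]} \triangleq A[\psi^{(k)}] [A^H[\psi^{(k)}] A[\psi^{(k)}]]^{-1} A^H[\psi^{(k)}]$ is the projection matrix onto $A[\psi^{(k)}]$. $A[\psi^{(k)}]$ is the array manifold with parameter vector $\psi^{(k)}$, i.e., $A[\psi^{(k)}] \triangleq [a(\phi_1), a(\phi_2), \ldots, a(\phi_k)]$. Also, $P^{\perp}_{A[\psi^{(k)}]} = I - P_{A[\psi^{(k)}]}$. Removing the “$\arg\max\{\cdot\}$” operator and the minus sign, and adding a penalty term, we apply the MDL principle to (4.3), which gives:
\[ \mathrm{MDL}_{\mathrm{opt}}(k) \triangleq N \ln\det\left[ P_{A[\hat\psi^{(k)}]} \hat{R}_{yy} P_{A[\hat\psi^{(k)}]} + \tfrac{1}{M-k} \operatorname{tr}\!\left( P^{\perp}_{A[\hat\psi^{(k)}]} \hat{R}_{yy} \right) \cdot P^{\perp}_{A[\hat\psi^{(k)}]} \right] + \tfrac{1}{2}[k(k+1)+1] \ln N. \tag{4.4} \]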
In (4.4), $\hat\psi^{(k)} = [\hat\phi_1, \hat\phi_2, \ldots, \hat\phi_k]$ consists of the $k$ ML DoA estimates obtained from (4.3). The counting of the freely adjusted parameters and the formulation of the penalty term follow that of [71]. The number of signals is then given by $\arg\min_k \mathrm{MDL}_{\mathrm{opt}}(k)$. This estimate is optimal in the MDL sense, but it exhibits prohibitively high computational complexity since for each $k$ it performs a multi-dimensional search for the evaluation of the MLE $\hat\psi^{(k)}$ of the DoAs. For this reason, the most commonly used signal enumeration method is the one proposed in [62] and is summarized next.
4.3.2 Suboptimal MDL-based criterion
The criterion described here is based on a generic parameterization by the model’s eigensystem first discussed by T. W. Anderson [68]. Let $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_M$ be the eigenvalues of $\hat{R}_{yy}$ in descending order. For each $k \in \{0, 1, \ldots, M-1\}$, calculate [62]
\[ \mathrm{MDL}(k) \triangleq (M-k) N \ln\left[ \frac{\frac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i}{\left( \prod_{i=k+1}^{M} \lambda_i \right)^{\frac{1}{M-k}}} \right] + \frac{1}{2}[k(2M-k)+1] \ln N. \tag{4.5} \]
The number of signals is then given by $\arg\min_k \mathrm{MDL}(k)$. As dictated by the MDL principle [67], the first term is the minus log-likelihood function with parameters (the eigenvalues and the eigenvectors) substituted by their ML estimators. The second term of (4.5) is the penalty term, which is directly proportional to the number of freely adjusted parameters of the model, $k(2M-k)+1$ [62].
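The criterion (4.5) is straightforward to evaluate once the eigenvalues are available. The following is a minimal sketch assuming NumPy; the function name `mdl_wax_kailath` is illustrative. The ratio of arithmetic to geometric mean is computed in the log domain to avoid overflow in the product.

```python
import numpy as np

def mdl_wax_kailath(eigvals, N):
    """Evaluate MDL(k) of (4.5) for k = 0..M-1 and return (argmin, scores)."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    M = lam.size
    scores = np.empty(M)
    for k in range(M):
        tail = lam[k:]                                  # lambda_{k+1}, ..., lambda_M
        log_arith = np.log(tail.mean())                 # log of arithmetic mean
        log_geo = np.log(tail).mean()                   # log of geometric mean
        penalty = 0.5 * (k * (2 * M - k) + 1) * np.log(N)
        scores[k] = (M - k) * N * (log_arith - log_geo) + penalty
    return int(np.argmin(scores)), scores
```

For example, with eigenvalues `[10, 5, 1, 1, 1, 1]` and `N = 100`, the log-likelihood term vanishes for every `k >= 2` because the remaining eigenvalues are equal, so the penalty selects `k = 2`.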
In the next section we consider appropriate simplifications/approximations of
the RHS (right-hand side) of (4.4) by which a good tradeoff between accuracy and
computation complexity is achieved.
4.4 Detection Criterion Exploiting Peaks in the
MUSIC Spectrum
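In this section we describe the proposed technique. Eq. (4.4), as a function of $\psi^{(k)}$, is equivalent to [71]
\[ L_1(k, \psi^{(k)}) \triangleq N \ln\left[ \det R_s[\psi^{(k)}] \cdot \det\!\left( \tfrac{1}{M-k} \operatorname{tr} R_n[\psi^{(k)}] \cdot I \right) \right] + \tfrac{1}{2}[k(k+1)+1] \ln N \tag{4.6} \]
where $R_s[\psi^{(k)}]$ is a $k \times k$ matrix and $R_n[\psi^{(k)}]$ is an $(M-k) \times (M-k)$ matrix such that
\[ U[\psi^{(k)}] \begin{pmatrix} R_s[\psi^{(k)}] & \emptyset \\ \emptyset & \emptyset \end{pmatrix} U[\psi^{(k)}]^H = P_{A[\psi^{(k)}]} \hat{R}_{yy} P_{A[\psi^{(k)}]}, \tag{4.7} \]
and
\[ U[\psi^{(k)}] \begin{pmatrix} \emptyset & \emptyset \\ \emptyset & R_n[\psi^{(k)}] \end{pmatrix} U[\psi^{(k)}]^H = P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} P^{\perp}_{A[\psi^{(k)}]}, \tag{4.8} \]
respectively, with $U[\psi^{(k)}]$ being the unitary matrix suitable for this coordinate transformation [71]. We note that
\[ \operatorname{tr} R_n[\psi^{(k)}] = \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} P^{\perp}_{A[\psi^{(k)}]} = \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}, \tag{4.9} \]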
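which is by itself related to the conditional$^1$ MLE of $\psi^{(k)}$ [69]:
\[ \hat\psi^{(k)} = \arg\max_{\psi^{(k)}} \left[ -MN \ln\!\left( \tfrac{1}{M} \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} \right) \right]. \tag{4.10} \]
Regarding $\operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}$ we have the following theorem. The proof is included in Appendix I at the end of this chapter.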
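Theorem 3 Let the eigenvalue decomposition (EVD) of $\hat{R}_{yy}$ be
\[ \hat{R}_{yy} = V_s^{(k)} \Lambda_s^{(k)} [V_s^{(k)}]^H + V_n^{(M-k)} \Lambda_n^{(M-k)} [V_n^{(M-k)}]^H, \tag{4.11} \]
where $\Lambda_s^{(k)}$ denotes the $k$ largest eigenvalues and $V_s^{(k)}$ consists of the corresponding eigenvectors, while $\Lambda_n^{(M-k)}$ and $V_n^{(M-k)}$ denote respectively the $M-k$ smallest eigenvalues and their eigenvectors. Then, under the assumption that $[V_s^{(k)}]^H A$ is nonsingular, we have
\[ \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} \le \operatorname{tr} \Lambda_n^{(M-k)} + \operatorname{tr}\left[ A[\psi^{(k)}]^H V_n^{(M-k)} [V_n^{(M-k)}]^H A[\psi^{(k)}] \right][K], \tag{4.12} \]
where $\psi^{(k)} \triangleq [\phi_1, \phi_2, \ldots, \phi_k]$ and $K$ is defined by
\[ K \triangleq [[V_s^{(k)}]^H A[\psi^{(k)}]]^{-1} [\Lambda_s^{(k)} - \lambda_M I] [A[\psi^{(k)}]^H V_s^{(k)}]^{-1}. \tag{4.13} \]
For $k = D$, $K$ is asymptotically equal to $R_{xx}$ and we have by [72]
\[ \operatorname{tr} P^{\perp}_{A[\psi^{(D)}]} \hat{R}_{yy} \simeq \operatorname{tr} \Lambda_n^{(M-D)} + \operatorname{tr}\left[ A[\psi^{(D)}]^H V_n^{(M-D)} [V_n^{(M-D)}]^H A[\psi^{(D)}] \right][R_{xx}]. \tag{4.14} \]
$\square$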
$^1$ In the conditional model, the transmitted signal vectors $x(t_i)$, $i = 1, 2, \ldots, N$, are assumed to be deterministic unknown.

We shall show that the approximation of $\operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}$ by the RHS of (4.12) does not affect the detection of $L_1(k, \phi_1, \phi_2, \ldots, \phi_k)$ in the large sample limit. As we know, an MLE by itself cannot detect the model order. The addition of the MDL penalty term transforms the (monotonically decreasing) curve of MLE-versus-$k$ so that the resultant shape is convex with a minimum at $k = D$. The two observations given by Theorem 3 suggest that the approximation of $\operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}$ by the RHS of (4.12) can be utilized: it approximately holds true when $k = D$, as manifested by Eq. (4.14); for $k \ne D$, the approximation would uniformly contribute to bending the curve upward.
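In the case of diagonal $R_{xx}$, (4.14) can be further simplified to
\[ \operatorname{tr} P^{\perp}_{A[\psi^{(D)}]} \hat{R}_{yy} \simeq \operatorname{tr} \Lambda_n^{(M-D)} + \sum_{i=1}^{D} r'_i \cdot a(\phi_i)^H V_n^{(M-D)} [V_n^{(M-D)}]^H a(\phi_i), \tag{4.15} \]
where $r'_i$ denotes the $i$th diagonal entry in $R_{xx}$ and $a(\phi_i)^H V_n^{(M-D)} [V_n^{(M-D)}]^H a(\phi_i)$ is the inverse of $\mathcal{P}(\phi_i)$, the MUSIC spectrum point. When $k \ne D$ and supposing $K$ is diagonal with its $i$th diagonal element denoted by $r''_i$, the inequality (4.12) may be similarly rewritten as follows:
\[ \operatorname{tr} \Lambda_n^{(M-k)} + \operatorname{tr}\left[ A[\psi^{(k)}]^H V_n^{(M-k)} [V_n^{(M-k)}]^H A[\psi^{(k)}] \right][K] \le \operatorname{tr} \Lambda_n^{(M-k)} + \sum_{i=1}^{k} r''_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i). \tag{4.16} \]
Using $K$ for both $k = D$ and $k \ne D$, and denoting consistently $K$’s diagonal entry by $r_i$, we have the following substitute for $\operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}$:
\[ \operatorname{tr} \Lambda_n^{(M-k)} + \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i). \tag{4.17} \]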
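Substituting (4.17) into (4.6) we obtain the following cost function
\[ L_2(k, \phi_1, \phi_2, \ldots, \phi_k) \triangleq (M-k) N \ln\left[ \tfrac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i + \tfrac{1}{M-k} \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i) \right] + N \ln\left[ \det R_s[\psi^{(k)}] \right] + \tfrac{1}{2}[k(k+1)+1] \ln N. \tag{4.18} \]
The first term within the RHS of (4.18) contains in its logarithm an approximation to the MLE of the noise variance, which is given by [70]
\[ \hat\sigma^2 = \tfrac{1}{M-k} \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy}. \tag{4.19} \]
This approximation consists of the summation of two terms: the average of the $M-k$ eigenvalues, $\frac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i$, and a weighted sum of the $k$ inverses of the MUSIC spatial spectrum heights, $\frac{1}{M-k} \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i)$. The term $\ln[\det R_s[\psi^{(k)}]]$ in (4.18) has the following interpretation. Observe that
\[ \operatorname{tr} \hat{R}_{yy} = \operatorname{tr} P_{A[\psi^{(k)}]} \hat{R}_{yy} P_{A[\psi^{(k)}]} + \operatorname{tr} P^{\perp}_{A[\psi^{(k)}]} \hat{R}_{yy} P^{\perp}_{A[\psi^{(k)}]} = \operatorname{tr} R_s[\psi^{(k)}] + \operatorname{tr} R_n[\psi^{(k)}], \tag{4.20} \]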
which is a constant for the $N$ observations. As $\operatorname{tr} R_n[\psi^{(k)}]$ on the RHS of Eq. (4.20) is replaced by
\[ \operatorname{tr} \Lambda_n^{(M-k)} + \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i), \tag{4.21} \]
$\operatorname{tr} R_s[\psi^{(k)}]$ is replaced by
\[ \operatorname{tr} \Lambda_s^{(k)} - \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i). \tag{4.22} \]
This says that on one hand the noise estimate $\hat\sigma^2 = \frac{1}{M-k} \operatorname{tr} \Lambda_n^{(M-k)}$ is increased by an amount equal to
\[ \tfrac{1}{M-k} \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i), \tag{4.23} \]
while on the other hand the sum of the signal eigenvalues $\operatorname{tr} \Lambda_s^{(k)}$ is decreased by the same amount. Thus $\det R_s[\psi^{(k)}]$, which is the product of the non-zero eigenvalues of $P_{A[\psi^{(k)}]} \hat{R}_{yy} P_{A[\psi^{(k)}]}$, is then given by $\det \Lambda_s^{(k)} - \delta(\phi_1, \phi_2, \ldots, \phi_k)$, where $\delta(\phi_1, \phi_2, \ldots, \phi_k)$ is the amount of modification obtained by the knowledge of (the maximum likelihood estimates of) the directions-of-arrival. Approximating $\ln[\det R_s[\psi^{(k)}]]$ by $\ln\det \Lambda_s^{(k)}$ yields the following cost function
\[ L_3(k, \phi_1, \phi_2, \ldots, \phi_k) \triangleq (M-k) N \ln\left[ \tfrac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i + \tfrac{1}{M-k} \sum_{i=1}^{k} r_i \cdot a(\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\phi_i) \right] + N \ln\det \Lambda_s^{(k)} + \tfrac{1}{2}[k(k+1)+1] \ln N. \tag{4.24} \]
Technically, it is feasible to use $L_3(k)$ as a source enumerator. However, there are several difficulties involved. First, we need a way to associate $K$’s diagonal entries with the MUSIC peaks$^2$. Second, as it turns out in practice, for $L_3(k)$ to work properly in the high signal-to-noise ratio regime, the number of data samples $N$ should be large. In what follows, we consider replacing the $r_i$’s with a single small-valued coefficient $r$.
The first observation regarding the coefficient $r$ is that its value should not be $\frac{1}{N}$, as in this case the criterion (4.24) would degenerate to
\[ (M-k) N \ln\left[ \tfrac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i \right] + N \ln\det \Lambda_s^{(k)} + \tfrac{1}{2}[k(k+1)+1] \ln N, \tag{4.25} \]
which is obviously not an efficient estimator when compared term by term to the original MDL criterion (cf. (4.5)).
$^2$ One possible way of doing this is to sort both the diagonal entries and the MUSIC peaks by their magnitudes and then relate them, i.e., the largest diagonal entry will weight the highest MUSIC peak, the second largest one weights the second highest one, and so on. We do not consider this approach here.

In [73], the asymptotic mean and variance of the MUSIC null spectrum $\frac{1}{\mathcal{P}(\phi)}$ are given by (for $\phi$ around the true direction-of-arrival)
\[ (M-D) \left[ \frac{\sigma^2}{N} \sum_{i=1}^{D} \frac{\lambda_i}{(\lambda_i - \sigma^2)^2} a(\phi)^H v_i v_i^H a(\phi) \right] \quad \text{and} \quad (M-D) \left[ \frac{\sigma^2}{N} \sum_{i=1}^{D} \frac{\lambda_i}{(\lambda_i - \sigma^2)^2} a(\phi)^H v_i v_i^H a(\phi) \right]^2, \tag{4.26} \]
respectively, where $v_i$ is the $i$th eigenvector of $R_{yy}$ that corresponds to $\lambda_i$. When the signal-to-noise ratio is low, the quantities $\frac{\lambda_i}{(\lambda_i - \sigma^2)^2}$, $i = 1, 2, \ldots, D$, (see (4.26)) increase in value as the denominator is small. Suppose the SNR is such that $\frac{\lambda_i}{(\lambda_i - \sigma^2)^2}$ are $O(N)$; then the entries of the signal covariance matrix $R_{xx} = [[V_s^{(D)}]^H A]^{-1} [\Lambda_s^{(D)} - \sigma^2 I] [A^H V_s^{(D)}]^{-1}$ will be of magnitude $O(\frac{1}{\sqrt{N}})$. This corresponds to a single $r = \frac{1}{\sqrt{N}}$. We note that in this case $r \frac{1}{\mathcal{P}(\phi)}$ still has a Gaussian asymptotic behavior with mean and variance given by
\[ (M-D) \frac{1}{\sqrt{N}} \left[ \frac{\sigma^2}{N} \sum_{i=1}^{D} \frac{\lambda_i}{(\lambda_i - \sigma^2)^2} a(\phi)^H v_i v_i^H a(\phi) \right] \quad \text{and} \quad (M-D) \frac{1}{N} \left[ \frac{\sigma^2}{N} \sum_{i=1}^{D} \frac{\lambda_i}{(\lambda_i - \sigma^2)^2} a(\phi)^H v_i v_i^H a(\phi) \right]^2, \tag{4.27} \]
respectively. Since the EVD-parameterization-based MLE of the noise variance $\hat\sigma^2 = \frac{1}{M-D} \operatorname{tr} \Lambda_n^{(M-D)}$ is asymptotically Gaussian with mean $\sigma^2$ and variance $\frac{\sigma^4}{N(M-D)}$, it should be expected that its correction term (due to the DoA-parameterization) is on the scale of $\frac{1}{\sqrt{N}}$ or less. These observations suggest a choice of the coefficient $r = \frac{1}{\sqrt{N}}$. Actual simulation shows the validity of this choice for a reasonable range of $N$. Setting $r$ equal to $\frac{1}{\sqrt{N}}$, we obtain
\[ \mathrm{MDL}_{\mathrm{MUSIC}}(k) \triangleq (M-k) N \ln\left[ \tfrac{1}{M-k} \sum_{i=k+1}^{M} \lambda_i + \tfrac{1}{M-k} \sum_{i=1}^{k} \tfrac{1}{\sqrt{N}} \cdot a(\hat\phi_i)^H V_n^{(M-k)} [V_n^{(M-k)}]^H a(\hat\phi_i) \right] + N \ln\det \Lambda_s^{(k)} + \tfrac{1}{2}[k(k+1)+1] \ln N, \tag{4.28} \]
which is our proposed signal enumeration criterion. The criterion can be evaluated efficiently for each $k$ using Algorithm 1.
In Algorithm 1, $v_i$ is the eigenvector of the $i$th eigenvalue $\lambda_i$, $i = 1, 2, \ldots, M$. The number of signals is determined by the $k$ that minimizes $\mathrm{MDL}_{\mathrm{MUSIC}}(k)$.
The consistency property of $\mathrm{MDL}_{\mathrm{MUSIC}}(k)$ is summarized in the following theorem.
Theorem 4 The estimator, as given by (4.28), is asymptotically consistent. $\square$
The proof is given in Appendix II.
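Algorithm 1 MUSIC MDL Number of Signals Detection
for $k = 1$ to $M-1$ do
    $V_n^{(M-k)} \leftarrow (v_{k+1}\ v_{k+2}\ \ldots\ v_M)$
    for $\phi = -\pi$ to $\pi$ do
        $\mathcal{P}(\phi) = \frac{1}{a^H(\phi)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi)}$
    end for
    if number of peaks $\geq k$ then
        Obtain the $k$ largest maxima: $\mathcal{P}(\phi_i)$, $i = 1, 2, \ldots, k$
        Compute $\mathrm{MDL}_{\mathrm{MUSIC}}(k)$ using $\lambda_{k+1}, \lambda_{k+2}, \ldots, \lambda_M$ and $\frac{1}{\mathcal{P}(\phi_1)}, \frac{1}{\mathcal{P}(\phi_2)}, \ldots, \frac{1}{\mathcal{P}(\phi_k)}$
    end if
end for
$D = \arg\min_k \mathrm{MDL}_{\mathrm{MUSIC}}(k)$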
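The steps of Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the steering vector is taken as $a(\phi) = [1, e^{-j\phi}, \ldots, e^{-j(M-1)\phi}]^T$, the search over $\phi$ uses a uniform grid whose size is an arbitrary choice, peaks are detected as strict local minima of the null spectrum on that circular grid, and the names `steering` and `mdl_music` as well as the demo scenario (6 elements, 2 uncorrelated sources at 20 dB) are hypothetical.

```python
import numpy as np


def steering(phi, m):
    """ULA steering vector a(phi) = [1, e^{-j phi}, ..., e^{-j (m-1) phi}]^T."""
    return np.exp(-1j * phi * np.arange(m))


def mdl_music(y, grid_size=2048):
    """Evaluate criterion (4.28) for k = 1..M-1 and return (argmin, scores).

    y is an M x N matrix of array snapshots. Hypotheses whose MUSIC
    spectrum shows fewer than k peaks are skipped, as in Algorithm 1.
    """
    m, n = y.shape
    r_yy = y @ y.conj().T / n                      # sample covariance
    lam, v = np.linalg.eigh(r_yy)                  # ascending eigenvalues
    lam, v = lam[::-1], v[:, ::-1]                 # sort in descending order
    grid = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    a_grid = np.exp(-1j * np.outer(np.arange(m), grid))   # columns are a(phi)
    r = 1.0 / np.sqrt(n)                           # coefficient r = 1/sqrt(N)
    scores = {}
    for k in range(1, m):
        v_n = v[:, k:]                             # candidate noise subspace
        # Null spectrum 1/P(phi) = a^H V_n V_n^H a evaluated on the grid.
        null = np.sum(np.abs(v_n.conj().T @ a_grid) ** 2, axis=0)
        # Peaks of P(phi) are local minima of the null spectrum (circular grid).
        is_min = (null < np.roll(null, 1)) & (null < np.roll(null, -1))
        minima = np.sort(null[is_min])
        if minima.size < k:
            continue                               # not enough MUSIC peaks
        noise = (lam[k:].sum() + r * minima[:k].sum()) / (m - k)
        scores[k] = ((m - k) * n * np.log(noise)
                     + n * np.log(lam[:k]).sum()
                     + 0.5 * (k * (k + 1) + 1) * np.log(n))
    return min(scores, key=scores.get), scores


# Hypothetical demo: 6-element ULA, 2 uncorrelated sources at 20 dB SNR.
rng = np.random.default_rng(0)
m, n, phis = 6, 1000, np.array([-1.0, 1.0])
a = np.stack([steering(p, m) for p in phis], axis=1)
x = np.sqrt(50.0) * (rng.standard_normal((2, n)) + 1j * rng.standard_normal((2, n)))
w = np.sqrt(0.5) * (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
d_hat, scores = mdl_music(a @ x + w)
print("estimated number of signals:", d_hat)
```

The demo operates at a much higher SNR than the simulations of Section 4.5, so it only illustrates the mechanics of the criterion, not its low-SNR behavior.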
4.4.1 Observations and Remarks
The derivation of MDL(k) is based on a generic eigenvalue decomposition
(EVD) model first discussed by T. W. Anderson [68]. For a particular system, such
as that in (4.1), a more specific parameterization that yields a better detector is pos-
sible, but usually at a much higher computation cost. By adopting the large sample
approximation, we are able to reduce the computations required while maintaining
sufficient accuracy.
Compared to the original MDL criterion, (4.28) incorporates the MUSIC maxima
into the computation of the noise variance (while counting the number of independent
parameters differently). This is easily seen by
inspecting the conditional MLE (Eq. (4.19)), its large sample approximation (Eq.
(4.15)), and the MLE that is discussed in [68]
$$\sigma^2 = \frac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i = \frac{1}{M-D}\operatorname{tr}\Lambda_n^{(M-D)}. \tag{4.29}$$
We see that the estimator (4.19) is specific to array signal processing, and that its
calculation involves not only the M − D eigenvalues but also the estimates of the
DoAs, as manifested by the second term on the RHS of Eq. (4.15).
With the estimator in (4.28), we make explicit usage of the MUSIC spectra
[Figure 4.1 plot, titled "MUSIC MDL Number of Signals Detection". Horizontal axis: SNR (dB), from −25 to −15. Vertical axis: Probability of Detection. Curves: MDL Principle; MUSIC MDL ($r = 1/\sqrt{N}$).]
Figure 4.1: Number of signals detection: $\mathrm{MDL}_{\mathrm{MUSIC}}$ vs. MDL; 10-element ULA, 4 equal-power sources; in terms of SNR (dB); number of samples is 1500; averaged over 400 Monte-Carlo runs.
peaks. As mentioned before, several other source enumerators [64], [65], [66] are
distinct from the MDL method in that they exploit the eigenvectors of the received
autocorrelation matrix. Criterion (4.28) presents another example illustrating this
point.
Another feature of the MUSIC MDL estimator is that it eliminates spurious
hypotheses for which the number of MUSIC peaks within the tested
spectrum is insufficient. This is an observation drawn from the consistency proof in
Appendix II.
[Figure 4.2 plot, titled "MUSIC MDL Number of Signals Detection". Horizontal axis: Number of Samples, from 1000 to 10000. Vertical axis: Probability of Detection. Curves: MDL Principle; MUSIC MDL ($r = 1/\sqrt{N}$).]
Figure 4.2: Number of signals detection: $\mathrm{MDL}_{\mathrm{MUSIC}}$ vs. MDL; 10-element ULA, 8 equal-power sources; in terms of number of samples; averaged over 400 Monte-Carlo runs.
4.5 Simulation and Performance Evaluation
We simulate a 10-element ULA and consider two sets of simulations. In the first
simulation there are 4 source signals with equal power. Their directions-of-arrival are
evenly distributed within the range $[-\pi, \pi]$. In Fig. 4.1 we plot the probability of
correct detection versus the system's signal-to-noise ratio, which is defined as $\frac{\operatorname{tr} R_{xx}}{D\sigma^2}$.
The probability of correct detection is calculated by averaging over 400 Monte-Carlo
runs. The number of input samples is $N = 1500$. The second simulation is an 8-source
scenario in which we evaluate the probability of detection against the number of
input samples $N$. We fix the input SNRs of all sources at −24 dB. The results are
shown in Fig. 4.2. In each of the two plots we compare the new technique with
the original MDL. The performance improvement is apparent.
4.6 Conclusion
It is shown that we can utilize the MUSIC maxima information to improve
the detection of the number of signals based on the MDL principle. This information
is carefully incorporated and the computational burden is controlled. The resulting
estimator is computationally efficient while achieving a perceptible performance improvement.
4.7 Appendix I - Proof of Theorem 3
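In this section we provide the proof of Theorem 3. We start from
$$\operatorname{tr} P_A[\psi^{(k)}]R_{yy} = \operatorname{tr}\bigl[[V_s^{(k)}]^H A(A^HA)^{-1}A^H V_s^{(k)}\Lambda_s^{(k)}\bigr] + \operatorname{tr}\bigl[[V_n^{(M-k)}]^H A(A^HA)^{-1}A^H V_n^{(M-k)}\Lambda_n^{(M-k)}\bigr], \tag{4.30}$$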
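where on the RHS we omit the dependence of $A$ on $\psi^{(k)}$. With some manipulations,
we obtain
$$\begin{aligned}
\operatorname{tr} P_A[\psi^{(k)}]R_{yy} &= \operatorname{tr}\Lambda_s^{(k)} + \operatorname{tr}\bigl[[V_s^{(k)}]^H A(A^HA)^{-1}A^H V_s^{(k)} - I\bigr]\Lambda_s^{(k)} \\
&\quad - \operatorname{tr}\bigl[[V_s^{(k)}]^H A(A^HA)^{-1}A^H V_s^{(k)} - I\bigr]\Lambda_n^{(M-k)} \\
&\geq \operatorname{tr}\Lambda_s^{(k)} + \operatorname{tr}\bigl[[V_s^{(k)}]^H A(A^HA)^{-1}A^H V_s^{(k)} - I\bigr]\bigl[\Lambda_s^{(k)} - \lambda_M I\bigr],
\end{aligned} \tag{4.31}$$
where the inequality is a result of $\Lambda_n^{(M-k)} \succeq \lambda_M I$, and "$\succeq$" denotes the positive
semidefinite ordering.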
Assuming that $[V_s^{(k)}]^H A$ is nonsingular, i.e., the $k$ largest peaks correspond to
the signal subspace, we have [72]:
$$[V_s^{(k)}]^H A(A^HA)^{-1}A^H V_s^{(k)} - I = \Bigl[I + [A^H V_s^{(k)}]^{-1}A^H V_n^{(M-k)}[V_n^{(M-k)}]^H A\,[[V_s^{(k)}]^H A]^{-1}\Bigr]^{-1} - I. \tag{4.32}$$
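Now by noting that for a positive semidefinite matrix $\Gamma$
$$(I + \Gamma)^{-1} - I \succeq -\Gamma, \tag{4.33}$$
we have
$$\operatorname{tr} P_A[\psi^{(k)}]R_{yy} \geq \operatorname{tr}\Lambda_s^{(k)} - \operatorname{tr}\bigl[A^H V_n^{(M-k)}[V_n^{(M-k)}]^H A\bigr][K], \tag{4.34}$$
where $K \triangleq [[V_s^{(k)}]^H A]^{-1}[\Lambda_s^{(k)} - \lambda_M I][A^H V_s^{(k)}]^{-1}$. For $k = D$, the proof is given by
[72].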
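For completeness, inequality (4.33) can be checked directly. The short derivation below is a sketch that only assumes $\Gamma$ is Hermitian positive semidefinite, so that $I + \Gamma$ is invertible and commutes with $\Gamma$:

```latex
\begin{align*}
(I+\Gamma)^{-1} - I + \Gamma
  &= (I+\Gamma)^{-1}\bigl[\,I - (I+\Gamma) + (I+\Gamma)\Gamma\,\bigr] \\
  &= (I+\Gamma)^{-1}\Gamma^{2} \\
  &= \Gamma\,(I+\Gamma)^{-1}\,\Gamma \;\succeq\; 0 ,
\end{align*}
```

where the last step uses that $(I+\Gamma)^{-1}$ is positive definite. Rearranging gives $(I+\Gamma)^{-1} - I \succeq -\Gamma$.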
4.8 Appendix II - Proof of Theorem 4
In this section, following the methodology given in [62] and using the
consistency result of [74], we prove the consistency of the estimator (4.28) (Theorem
4). In the proof we keep the parameter $r$ general, without substituting its value. First, we see that
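$$\begin{aligned}
&\mathrm{MDL}_{\mathrm{MUSIC}}(k) - \mathrm{MDL}_{\mathrm{MUSIC}}(D) = \Delta_{\mathrm{MLE}}(k) + \Delta_{\mathrm{Penalty}}(k) = \\
&\Bigl\{ (M-k)N\ln\Bigl[\tfrac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i + \tfrac{1}{M-k}\sum_{i=1}^{k} r\cdot a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)\Bigr] \\
&\;\; - (M-D)N\ln\Bigl[\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i + \tfrac{1}{M-D}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i)\Bigr] \\
&\;\; + N\ln\det\Lambda_s(\psi^{(k)}) - N\ln\det\Lambda_s(\psi^{(D)}) \Bigr\} \\
&+ \Bigl\{ \tfrac{1}{2}[k(k+1)+1]\ln N - \tfrac{1}{2}[D(D+1)+1]\ln N \Bigr\}.
\end{aligned} \tag{4.35}$$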
For the terms on the RHS of Eq. (4.35), let us denote the expression enclosed in the
first pair of curly brackets as ∆MLE(k) and that of the second pair as ∆Penalty(k). In the
following we consider two cases according to the values of k and D: k > D and k < D.
We shall show that in both cases MDLMUSIC(k) > MDLMUSIC(D) asymptotically.
For k > D, except for an additional correction term and a different count of
the degrees of freedom, the other parts of (4.35) are the same as the original MDL.
Thus we only need to show that the additional correction term obtained from testing
MUSIC spectra increases the probability that MDLMUSIC(k) > MDLMUSIC(D) (when
compared to the original MDL).
Partitioning $V_n^{(M-D)}$ into two submatrices
$$V_n^{(M-D)} \triangleq \bigl[V_n^{(k-D)}\ \ V_n^{(M-k)}\bigr], \tag{4.36}$$
we have
$$a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i) \leq a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i), \tag{4.37}$$
which says that for $k > D$, there are at least $D$ peaks (of the $k$ peaks) that can be
derived from the MUSIC spectrum using $V_n^{(M-D)}$. Asymptotically these $D$ peaks are
distinct and the corresponding $a(\phi_i)$, $i = 1, 2, \ldots, D$, span the true signal subspace.
For the remaining $k - D$ peaks we have two possibilities. They are either
completely indistinguishable from the previous $D$ peaks, or some of them are spurious
and the corresponding $a(\phi_i)$'s fall within the true noise subspace, in which case we
would have some peaks whose null spectrum values $a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)$ are
non-decreasing. For this second case, the probability that $\mathrm{MDL}_{\mathrm{MUSIC}}(k) - \mathrm{MDL}_{\mathrm{MUSIC}}(D)$ is positive is increased.
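For the first case (when the $k - D$ peaks are not separable from the other $D$
peaks), we obtain by Taylor series expansion
$$(M-k)N\ln\Bigl[\tfrac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i + \tfrac{1}{M-k}\sum_{i=1}^{k} r\cdot a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)\Bigr] \simeq (M-k)N\ln\Bigl[\tfrac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i\Bigr] + \frac{(M-k)N\cdot r\cdot O(\frac{1}{N})}{\frac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i}, \tag{4.38}$$
and
$$(M-D)N\ln\Bigl[\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i + \tfrac{1}{M-D}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)\Bigr] \simeq (M-D)N\ln\Bigl[\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i\Bigr] + \frac{(M-D)N\cdot r\cdot O(\frac{1}{N})}{\frac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i}. \tag{4.39}$$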
Thus the effect of the correction terms is negligible. In practice this corresponds to a
situation where for a given $k$ there are not enough MUSIC peaks.
Since the correction term improves the detection probability, then by the consistency
of the original MDL for $k > D$, we have that $\mathrm{MDL}_{\mathrm{MUSIC}}(k) - \mathrm{MDL}_{\mathrm{MUSIC}}(D)$
is asymptotically positive with probability one.
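For $k < D$, let us define
$$\Delta_{\mathrm{eigenvalues}}(k) \triangleq \frac{1}{D-k}\sum_{i=k+1}^{D}\lambda_i + \frac{1}{D-k}\sum_{i=1}^{k} r\cdot a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i) - \frac{1}{D-k}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i). \tag{4.40}$$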
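And we have
$$\begin{aligned}
&\Bigl[\tfrac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i + \tfrac{1}{M-k}\sum_{i=1}^{k} r\cdot a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)\Bigr]^{M-k} = \\
&\Bigl[\tfrac{M-D}{M-k}\,\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i + \tfrac{D-k}{M-k}\cdot\Delta_{\mathrm{eigenvalues}}(k) + \tfrac{M-D}{M-k}\,\tfrac{1}{M-D}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i)\Bigr]^{M-k}.
\end{aligned} \tag{4.41}$$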
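Using the generalized arithmetic-geometric means inequality, (4.41) is larger than or
equal to
$$\bigl[\Delta_{\mathrm{eigenvalues}}(k)\bigr]^{D-k}\cdot\Bigl[\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i + \tfrac{1}{M-D}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i)\Bigr]^{M-D}. \tag{4.42}$$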
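Rewriting $\exp\bigl[\frac{1}{N}\Delta_{\mathrm{MLE}}(k)\bigr]$, we have
$$\exp\Bigl[\tfrac{1}{N}\Delta_{\mathrm{MLE}}(k)\Bigr] = \frac{\bigl[\Delta_{\mathrm{eigenvalues}}(k)\bigr]^{D-k}}{\prod_{i=k+1}^{D}\lambda_i}\cdot\frac{\Bigl[\tfrac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i + \tfrac{1}{M-k}\sum_{i=1}^{k} a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i)\Bigr]^{M-k}}{\bigl[\Delta_{\mathrm{eigenvalues}}(k)\bigr]^{D-k}\cdot\Bigl[\tfrac{1}{M-D}\sum_{i=D+1}^{M}\lambda_i + \tfrac{1}{M-D}\sum_{i=1}^{D} r\cdot a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i)\Bigr]^{M-D}}. \tag{4.43}$$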
In the large sample limit, the estimated $V_n^{(M-k)}$ approaches its true value and includes some of the
dimensions from the signal subspace. Therefore the expression
$\sum_{i=1}^{k} a^H(\phi_i)V_n^{(M-k)}[V_n^{(M-k)}]^H a(\phi_i) - \sum_{i=1}^{D} a^H(\phi_i)V_n^{(M-D)}[V_n^{(M-D)}]^H a(\phi_i)$
is either positive or $O(\frac{1}{N})$. Besides, the eigenvalues $\lambda_i$, $i = k+1, k+2, \ldots, D$, are not
asymptotically equal. Thus, as $N$ approaches the limit, $\Delta_{\mathrm{eigenvalues}}(k) > \frac{1}{D-k}\sum_{i=k+1}^{D}\lambda_i > \prod_{i=k+1}^{D}\lambda_i$.
Here the second inequality is the (basic) arithmetic mean-geometric mean
(AM-GM) inequality.
Summing up, $\frac{1}{N}\Delta_{\mathrm{MLE}}(k)$ is asymptotically positive with probability one. Combining
this with the fact that $\frac{1}{N}\Delta_{\mathrm{Penalty}}(k)$ approaches zero with increasing $N$, we
conclude that for $k < D$, $\mathrm{MDL}_{\mathrm{MUSIC}}(k) - \mathrm{MDL}_{\mathrm{MUSIC}}(D)$ is positive with probability
one in the large sample limit.
Chapter 5
Weighted Spatial Smoothing Based Iterative Weight Matrix Approximation
5.1 Introduction
In this chapter we describe the weighted spatial smoothing (WSS) based
iterative weight matrix approximation (IWMA) algorithm. As we have mentioned, in
wireless communication systems employing antenna arrays at the receivers, the
directions-of-arrival (DoAs) of the signals can be estimated using subspace-based
techniques and algorithms. In realistic communication environments, correlated sources
and propagation paths often exist, due to deliberate jamming or unavoidable multi-
path propagation. This situation can severely affect the accuracy of subspace-based
estimators. As a remedy to the situation, the technique of spatial smoothing and its
variations have been extensively studied. Spatial smoothing was originally proposed
in [75] and its concept was further analyzed and formulated in [76]. In [79] an im-
provement to the technique was discussed based on a strategy to take advantage of
the cross correlations between the subarrays’ outputs. In [80], backed by theoretical
analysis of the robustness of eigenvectors to noise interference (which determines the
estimator’s performance), it was suggested that squaring the array covariance ma-
trix can improve the performance of forward or forward/backward spatial smoothing.
In [81] correlated and uncorrelated signals are separately estimated. MUSIC is first
applied to the array covariance matrix to have the DoA estimates of the uncorre-
lated signals. The obtained DoAs are used to construct a covariance matrix which
is subtracted from the original covariance matrix. Spatial smoothing is then applied
iteratively, starting from a 3-element array with 2 overlapping subarrays, then increasing
to 4 and 3, respectively, and so on. This process continues until the
MUSIC peaks emerge. It was shown that the computational complexity required is
much lower than that of standard methods. In [82], the author designed a method
which utilizes not only the sum of the forward and backward spatially smoothed co-
variance matrices, but also the difference between the two matrices. It was shown
that this method benefits from generating the basis for the signal subspace from the
columns of the sample covariance matrix. Methods handling coherent signals include
also [83, 84, 85], where the spatial filtering method and enhancements to it are con-
sidered. In [85] it was shown that while the scheme’s requirement on the antenna
aperture is the same as corresponding forward/backward spatial smoothing meth-
ods, the estimation performance is improved. More traditionally, DoA estimation for
coherent cases can be obtained by maximum likelihood algorithms such as Determin-
istic Maximum Likelihood [77] or Weighted Subspace Fitting [78]. They both perform
multidimensional searches and are computationally intensive.
In this work we consider cases in which the incoming signals are highly-
correlated and their cross-correlation coefficients are not of value one, but can come
very close to it. We present a directions-of-arrival estimation scheme, the key step of
which is a weighted spatial smoothing (WSS) based iterative weight matrix approx-
imation (IWMA) algorithm. WSS [87] is a generalization of the spatial smoothing
technique. The latter was originally proposed in [75] and the concept was further
analyzed and formulated in [76]. In WSS the spatially-smoothed covariance matrix is
obtained as a weighted sum of the cross-covariance matrices between the $i$th and the
$j$th subarrays (the $i$th subarray is formed by the $i$th, $(i+1)$th, $\ldots$, $(i+P-1)$th antenna
elements), $i = 1, 2, \ldots, Q$ and $j = 1, 2, \ldots, Q$, where $Q$ is the total number of
subarrays. The $(i, j)$th covariance matrix is weighted by a coefficient $w_{ij}$. Accordingly, a
weight matrix for a WSS processing is defined as the matrix consisting of the weights
$w_{ij}$, i.e., $W \triangleq [w_{ij}]$. WSS includes conventional spatial smoothing (CSS) [75], [76]
as a special case, in which the weight matrix is given by $W = \frac{1}{Q}I$. The weights
of a WSS pre-processing scheme are determined by different design criteria [87], [88].
In this work we choose the $w_{ij}$ such that the smoothed source covariance matrix after
the application of WSS, $\vec{R}_{xx}$ (denoted using an over-vector $\vec{(\cdot)}$), is diagonal. A diagonal
$\vec{R}_{xx}$ is a desired feature for subspace-based estimation algorithms: a generic MUSIC
estimator is a Large Sample Maximum Likelihood (LSML) estimator if and only
if the signal covariance matrix is diagonal [86]. The optimum weight matrix $W_{\mathrm{opt}}$
that produces a diagonal $\vec{R}_{xx}$ can be obtained analytically [87]. But its computation
requires explicit knowledge of the DoAs, which are actually unknowns that need to
be estimated. In [87] it is suggested that the coarse DoA estimates obtained from
Capon’s spectrum are used to calculate an approximation to Wopt.
IWMA is a procedure that is capable of approximating $W_{\mathrm{opt}}$ in an iterative
fashion and without any prior knowledge of the DoAs. It is applicable as long as
the input covariance matrix is positive definite. The algorithm starts with
a weight matrix $W_0 = \frac{1}{Q}I_Q$ and carries out a series of weighted spatial smoothing
steps with respect to the estimated noise-free array covariance matrix. Within each
iteration it performs WSS using the weight matrix that is obtained from the previous
iteration step. We shall show that for a noise-free array covariance matrix the above
procedure is convergent with a finite number of iterations.
Suppose that the algorithm stops at the nth iteration where we obtain the
weight matrix Wn. Two features of the ideal Wn make it a suitable basis upon which
subspace-based DoA estimation can be performed. First, Wn is parameterized by
the directions-of-arrival of the input signals. Second, its structure is optimized as
a result of the IWMA algorithm (this point will become clear in a later section).
We will illustrate the operations of the IWMA algorithm and the performance of
IWMA-generated weight matrix based estimation strategy.
The remainder of the chapter is organized as follows. In Section 5.2 we present
the system model and briefly describe the weighted spatial smoothing preprocessing
technique. In Section 5.3 we present the proposed IWMA algorithm. Theoretical
analysis of the algorithm is provided in Section 5.4. Finally, computer simulations
are presented and discussed in Section 5.6.
5.2 System Model and Background
The system model is similar to that of the previous chapter. We consider $D$
signals $x_1(t), x_2(t), \ldots, x_D(t)$ ($D$ is assumed to be known a priori), each impinging
with an electrical angle $\phi_k \in [-\pi, \pi]$ on an $M$-element uniform linear array (ULA).
Let $y(t)$ denote the vector consisting of the $M$ received signals from the ULA at time
$t$. Sampling $y(t)$ at time instants $t_i$ ($i = 1, 2, \ldots$) we obtain a discrete-time received
vector sequence $y[i] = y(t_i)$ that is given by:
$$y[i] = Ax[i] + n[i], \quad i = 1, 2, \ldots, N, \tag{5.1}$$
where $x[i] \triangleq [x_1(t_i), x_2(t_i), \ldots, x_D(t_i)]^T$ is the vector containing the $D$ transmitted
signals and $A \triangleq [a(\phi_1), \ldots, a(\phi_D)]$ is the array manifold, the $k$th column of which is
the steering vector $a(\phi_k) \triangleq [1, e^{-j\phi_k}, \ldots, e^{-j(M-1)\phi_k}]^T$ associated with the $k$th input
$x_k(t)$, $k = 1, 2, \ldots, D$. In general, the angles $\phi_k$, $k = 1, 2, \ldots, D$, are distinct, which
implies that the matrix $A$ is of full column rank. We assume that the transmitted
vectors $x[i]$, $i = 1, 2, \ldots, N$, are random, identically distributed, with a covariance
matrix given by $R_{xx} \triangleq E\{x[i]x[i]^H\}$, where $E\{\cdot\}$ denotes the expectation operator.
Finally, in (5.1), $n[i] = n(t_i)$ is additive Gaussian noise assumed to be zero-mean,
spatially and temporally white, with covariance matrix $\sigma^2 I_M$, where $I_M$ is the $M \times M$
identity matrix.
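The model (5.1) can be simulated with a few lines of NumPy. The sketch below is illustrative only: it assumes circular complex Gaussian sources (the text only requires random, identically distributed vectors), and the function names, angles, correlation value and noise level are hypothetical choices.

```python
import numpy as np


def array_manifold(phis, m):
    """Columns are a(phi_k) = [1, e^{-j phi_k}, ..., e^{-j (m-1) phi_k}]^T."""
    return np.exp(-1j * np.outer(np.arange(m), phis))


def simulate_snapshots(phis, r_xx, m, n, sigma2, rng):
    """Draw N snapshots y[i] = A x[i] + n[i] with E{x x^H} = r_xx."""
    d = len(phis)
    a = array_manifold(phis, m)
    # Colour unit-variance circular Gaussian samples with a Cholesky factor of r_xx.
    chol = np.linalg.cholesky(r_xx)
    s = (rng.standard_normal((d, n)) + 1j * rng.standard_normal((d, n))) / np.sqrt(2)
    x = chol @ s
    # Spatially and temporally white noise with covariance sigma2 * I_M.
    noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((m, n))
                                   + 1j * rng.standard_normal((m, n)))
    return a @ x + noise, a


rng = np.random.default_rng(1)
phis = np.array([0.5, 1.3])
r_xx = np.array([[1.0, 0.95], [0.95, 1.0]])   # highly correlated, not coherent
y, a = simulate_snapshots(phis, r_xx, m=8, n=2000, sigma2=0.1, rng=rng)
```

Because the two angles are distinct, the returned manifold `a` has full column rank, as the text requires.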
The problem we consider in this work is the estimation of the direction-of-
arrival (DoA) φk of the kth input signal xk(t). Traditionally, DoA estimation is done
via subspace-type algorithms [89]. However, these estimators can perform poorly
in situations when the signals xk(t) are highly correlated. In these cases the signal
covariance matrix Rxx is no longer diagonal and can even be ill-conditioned which
can noticeably limit the effectiveness of subspace based DoA estimation algorithms.
In this chapter, we consider exactly such cases where the signal cross-correlation
coefficients can be very close to unity.
In the rest of this chapter, we use "conj($X$)" to denote the element-wise complex
conjugation of the matrix $X$. To simplify the notation, we will also use $\bar{X}$ to
denote the same operation. Also, for a Hermitian matrix $X$ of size $D$ we will denote
its eigenvalues (sorted in descending order) as $\lambda_l(X)$, $l = 1, 2, \ldots, D$, its eigenvalue
vector as $\lambda(X) \triangleq [\lambda_1(X), \lambda_2(X), \ldots, \lambda_D(X)]^T$, and its maximum and minimum
eigenvalues as $\lambda_{\max}(X) \triangleq \lambda_1(X)$ and $\lambda_{\min}(X) \triangleq \lambda_D(X)$, respectively.
5.2.1 Weighted Spatial Smoothing
In WSS, the $M$-element ULA is divided into $Q = M - P + 1$ overlapping
subarrays of $P$ elements each. The $q$th subarray, $q = 1, 2, \ldots, Q$, is formed by the
$q$th, $(q+1)$th, $\ldots$, $(q+P-1)$th elements of the ULA. Let us denote by $y_q[i]$ ($n_q[i]$) the
received signal (noise) over the $q$th subarray at the $i$th time instant, $i = 1, 2, \ldots, N$,
and by $A_q$ the submatrix of $A$ formed by the $q$th, $(q+1)$th, $\ldots$, $(q+P-1)$th rows of $A$.
Then
$$y_q[i] = A_q x[i] + n_q[i], \quad q = 1, 2, \ldots, Q. \tag{5.2}$$
To perform WSS, we first compute the weighted sum of the $Q^2$ cross-correlation
matrices between $y_p[i]$ and $y_q[i]$, for $p, q = 1, 2, \ldots, Q$:
$$\vec{R}_{yy} \triangleq \sum_{p=1}^{Q}\sum_{q=1}^{Q} w_{pq}E\{y_p[i]y_q^H[i]\} = \sum_{p=1}^{Q}\sum_{q=1}^{Q} w_{pq}F_p R_{yy}F_q^T, \tag{5.3}$$
where $w_{pq}$ denotes the weight for the $(p, q)$th cross-correlation matrix and
$F_i = [0_{P\times(i-1)}\ \ I_{P\times P}\ \ 0_{P\times(M-P-i+1)}]$, $i = 1, 2, \ldots, Q$. It is easily seen that conventional
spatial smoothing (CSS) [75], [76] is a special case of WSS obtained using $w_{pq} = \delta(p-q)$,
where $\delta(\cdot)$ is the Kronecker delta. The matrix $\vec{R}_{yy}$ of (5.3) can be decomposed into
two terms:
$$\vec{R}_{yy} = \vec{A}\vec{R}_{xx}\vec{A}^H + \vec{R}_{nn}, \tag{5.4}$$
where $\vec{R}_{nn} \triangleq \sum_{p=1}^{Q}\sum_{q=1}^{Q} w_{pq}E\{n_p[i]n_q^H[i]\}$ denotes the spatially-smoothed noise
covariance matrix and $\vec{A} \triangleq [\vec{a}(\phi_1), \vec{a}(\phi_2), \ldots, \vec{a}(\phi_D)]$ is the subarray manifold with
$\vec{a}(\phi_k) \triangleq [1, e^{-j\phi_k}, \ldots, e^{-j(P-1)\phi_k}]^T$, $k = 1, 2, \ldots, D$. Finally, $\vec{R}_{xx}$ is the signal
autocorrelation matrix after WSS and is given by [87]
$$\vec{R}_{xx} \triangleq \sum_{p=1}^{Q}\sum_{q=1}^{Q} w_{pq}\Phi^{p}R_{xx}\Phi^{-q} = R_{xx}\circ(B^H W B). \tag{5.5}$$
In (5.5), $W = [w_{pq}]$, $p, q = 1, 2, \ldots, Q$, is the $Q \times Q$ weight matrix, $\Phi \triangleq \operatorname{diag}([e^{-j\phi_1}, e^{-j\phi_2}, \ldots, e^{-j\phi_D}]^T)$,
$B \triangleq [b(\phi_1)\ b(\phi_2)\ \ldots\ b(\phi_D)]$ with $b(\phi_k) \triangleq [1, e^{j\phi_k}, \ldots, e^{j(Q-1)\phi_k}]^T$ for
$k = 1, 2, \ldots, D$, and "$\circ$" denotes the matrix Hadamard product. The weight matrix
$W$ is chosen such that the signal autocorrelation matrix after applying WSS, $\vec{R}_{xx}$, is
diagonal. It can be shown [87] that the matrix $W$ that results in a diagonal covariance
matrix is given by $W_{\mathrm{opt}} = (BB^H)^\dagger$, where "$(\cdot)^\dagger$" denotes the Moore-Penrose
matrix inverse.
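The effect of $W_{\mathrm{opt}}$ can be checked numerically. The following sketch applies (5.3) to a noise-free covariance matrix and compares the result with $\vec{A}\,\operatorname{diag}(R_{xx})\,\vec{A}^H$. It is a minimal illustration under stated assumptions: the array sizes, angles and correlation value are hypothetical, the function name `wss` is introduced here, and the explicit `rcond` passed to the pseudo-inverse is a numerical safeguard chosen for this example.

```python
import numpy as np


def wss(r, w, p):
    """Weighted spatial smoothing (5.3): sum over p, q of w_pq F_p R F_q^T.

    F_p R F_q^T is the P x P block of R whose rows start at subarray p and
    whose columns start at subarray q (0-based offsets are used here).
    """
    q = r.shape[0] - p + 1
    out = np.zeros((p, p), dtype=complex)
    for i in range(q):
        for j in range(q):
            out += w[i, j] * r[i:i + p, j:j + p]
    return out


m, p = 8, 5
q = m - p + 1
phis = np.array([0.5, 1.3])
r_xx = np.array([[1.0, 0.95], [0.95, 1.0]])        # highly correlated sources
a = np.exp(-1j * np.outer(np.arange(m), phis))     # array manifold A
a_sub = a[:p]                                      # subarray manifold
b = np.exp(1j * np.outer(np.arange(q), phis))      # B with columns b(phi_k)
r_signal = a @ r_xx @ a.conj().T                   # noise-free covariance

# W_opt = (B B^H)^+; rcond discards the numerically zero singular values.
w_opt = np.linalg.pinv(b @ b.conj().T, rcond=1e-10)
smoothed = wss(r_signal, w_opt, p)
expected = a_sub @ np.diag(np.diag(r_xx)) @ a_sub.conj().T
print(np.allclose(smoothed, expected))
```

With $B$ of full column rank, $B^H W_{\mathrm{opt}} B$ equals the identity, so the Hadamard product in (5.5) keeps only the diagonal of $R_{xx}$.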
After WSS the subspace-based DoA estimation is then performed as follows.
The noise subspace spanned by the eigenvectors associated with the $P - D$ smallest
eigenvalues of the spatially smoothed covariance matrix is first identified. If $\vec{V}_n$ is
the matrix whose columns are formed by those eigenvectors of the spatially smoothed
estimated covariance matrix that span the noise subspace, then we can estimate the
DoAs $\phi_k$, $k = 1, 2, \ldots, D$, of all sources from the locations of the $D$ largest peaks of
the spectrum $\mathcal{P}(\phi)$ defined as
$$\mathcal{P}(\phi) \triangleq \bigl\|\vec{V}_n^H\cdot\vec{a}(\phi)\bigr\|^{-2}, \quad \phi \in [-\pi, \pi], \tag{5.6}$$
where $\vec{a}(\phi) \triangleq [1, e^{-j\phi}, \ldots, e^{-j(P-1)\phi}]^T$. To obtain the locations of the $D$ largest peaks
a search on the real line would need to be performed.
Clearly, the inherent difficulty associated with the calculation of Wopt is that
it depends on the unknown φk, k = 1, 2, . . . , D. In the next section we propose the
iterative weight matrix approximation (IWMA) algorithm that can approximate the
optimum weight matrix Wopt in an iterative fashion without any knowledge of the
DoAs φk.
5.3 Proposed Iterative Weight Matrix Approximation Algorithm
The proposed algorithm starts with an initial weight matrix $W_0$ set equal to
$\frac{1}{Q}I_Q$ and applies a series of weighted spatial smoothing operations on the "de-noised"
array covariance matrix $R_{dn} \triangleq R_{yy} - \sigma^2 I$, where $R_{yy}$ is the autocorrelation matrix
of the spatial observation vector, i.e., $R_{yy} \triangleq E\{y[i]y[i]^H\}$. Each WSS operation uses
a weight matrix obtained from the previous iteration step. This iterative procedure
is formally described in Algorithm 2. We shall show in the next section that in the
ideal and noise-free case, the weight matrix $W_i$ iteratively approaches a matrix of the
form $(B\cdot\mathbf{D}\cdot B^H)^\dagger$, where $\mathbf{D}$ is a diagonal matrix.
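Algorithm 2 Iterative Weight Matrix Approximation
Input: $R_{dn} = R_{yy} - \sigma^2 I$
Initialization:
    $W_0 \Leftarrow \frac{1}{Q}I_Q$ and $F \Leftarrow [I_{Q\times Q}\ \ 0_{(P-Q)\times Q}]$
    $n \Leftarrow$ number of iterations, $n$ is a positive even number
Main Loop:
for $i = 1$ to $n$ do
    Perform WSS (see (5.3)) on $R_{dn}$ using $W_{i-1}$
    Obtain $\vec{R}_{dn,i} = \vec{A}[R_{xx}\circ(B^H W_{i-1}B)]\vec{A}^H$
    $W_i \Leftarrow [\mathrm{conj}(F\vec{R}_{dn,i}F^T)]^\dagger$
end for
Output: $W_n$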
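A minimal NumPy sketch of Algorithm 2 for the ideal noise-free case is given below. It is an illustration under stated assumptions rather than the author's implementation: $F$ is interpreted as selecting the leading $Q \times Q$ block of the smoothed $P \times P$ matrix (which requires $P \geq Q$), the pseudo-inverse uses an explicit `rcond` as a numerical safeguard, and the scenario (8-element array, two sources with correlation 0.9) and helper names are hypothetical.

```python
import numpy as np


def wss(r, w, p):
    """Weighted spatial smoothing (5.3) using P x P blocks of r."""
    q = r.shape[0] - p + 1
    out = np.zeros((p, p), dtype=complex)
    for i in range(q):
        for j in range(q):
            out += w[i, j] * r[i:i + p, j:j + p]
    return out


def iwma(r_dn, p, n_iter):
    """Sketch of Algorithm 2: returns W_n, assuming P >= Q."""
    q = r_dn.shape[0] - p + 1
    w = np.eye(q) / q                                  # W_0 = I_Q / Q
    for _ in range(n_iter):
        smoothed = wss(r_dn, w, p)                     # WSS with W_{i-1}
        # W_i = [conj(F R F^T)]^+, with F selecting the leading Q x Q block.
        w = np.linalg.pinv(np.conj(smoothed[:q, :q]), rcond=1e-10)
    return w


def offdiag_ratio(s):
    """Normalized off-diagonal magnitude of a 2 x 2 Hermitian matrix."""
    return abs(s[0, 1]) / np.sqrt(abs(s[0, 0] * s[1, 1]))


m, p = 8, 5
q = m - p + 1
phis = np.array([0.5, 1.3])
r_xx = np.array([[1.0, 0.9], [0.9, 1.0]])
a = np.exp(-1j * np.outer(np.arange(m), phis))         # array manifold A
b = np.exp(1j * np.outer(np.arange(q), phis))          # B with columns b(phi_k)
r_dn = a @ r_xx @ a.conj().T                           # ideal de-noised covariance

w0 = np.eye(q) / q
w6 = iwma(r_dn, p, 6)
# Smoothed source covariance R_xx o (B^H W B) before and after six iterations.
ratio_before = offdiag_ratio(r_xx * (b.conj().T @ w0 @ b))
ratio_after = offdiag_ratio(r_xx * (b.conj().T @ w6 @ b))
print(ratio_before, ratio_after)
```

In this example the normalized off-diagonal magnitude of the smoothed source covariance is smaller with $W_6$ than with $W_0$, which is the decorrelating behavior analyzed in Section 5.4.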
In practice, the exact covariance matrix $R_{yy}$ is not known and is estimated by
sample-averaging over $N$ observed vectors as follows:
$$\hat{R}_{yy} \triangleq \frac{1}{N}\sum_{i=1}^{N} y[i]y[i]^H. \tag{5.7}$$
The "de-noised" version of the array covariance matrix can then be estimated by first
applying an eigenvalue decomposition on $\hat{R}_{yy}$. More specifically, let $\Lambda_s$ be the diagonal
matrix formed by the signal eigenvalues $\lambda_1(\hat{R}_{yy}), \lambda_2(\hat{R}_{yy}), \ldots, \lambda_D(\hat{R}_{yy})$ and $V_s$
the matrix containing in its columns the corresponding signal eigenvectors. Similarly,
let $\Lambda_n$ be the diagonal matrix formed by the noise eigenvalues
$\lambda_{D+1}(\hat{R}_{yy}), \lambda_{D+2}(\hat{R}_{yy}), \ldots, \lambda_M(\hat{R}_{yy})$ and $V_n$ the matrix containing in its columns the
corresponding noise eigenvectors. Then, $\sigma^2$ and $R_{dn}$ can be estimated by $\hat{\sigma}^2 = \frac{1}{M-D}\operatorname{tr}\Lambda_n$
and $\hat{R}_{dn} \triangleq V_s(\Lambda_s - \hat{\sigma}^2 I_D)V_s^H$, respectively. The procedure is described
by Algorithm 3 below.
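Algorithm 3 Iterative Weight Matrix Approximation
Input: $\hat{R}_{yy}$, estimate of $R_{yy}$
Initialization:
    Perform eigenvalue decomposition (EVD) on $\hat{R}_{yy} = V_s\Lambda_s V_s^H + V_n\Lambda_n V_n^H$
    Obtain $\hat{\sigma}^2 = \frac{1}{M-D}\operatorname{tr}\Lambda_n$ and $\hat{R}_{dn} \triangleq V_s(\Lambda_s - \hat{\sigma}^2 I_D)V_s^H$
    $W_0 \Leftarrow \frac{1}{Q}I_Q$ and $F \Leftarrow [I_{Q\times Q}\ \ 0_{(P-Q)\times Q}]$
    $n \Leftarrow$ number of iterations, $n$ is a positive even number
Main Loop:
for $i = 1$ to $n$ do
    Perform WSS (see (5.3)) on $\hat{R}_{dn}$ using $W_{i-1}$ to obtain $\vec{R}_{dn,i}$
    $W_i \Leftarrow [\mathrm{conj}(F\vec{R}_{dn,i}F^T)]^\dagger$
end for
Output: $W_n$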
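The initialization step of Algorithm 3 can be sketched as follows. This is a minimal illustration: the function name `denoise` and the test scenario are hypothetical, and the check at the end uses the exact covariance matrix (rather than a sample estimate) so that the recovered quantities can be compared with their true values.

```python
import numpy as np


def denoise(r_yy, d):
    """EVD-based estimates of sigma^2 and R_dn for a Hermitian M x M matrix."""
    lam, v = np.linalg.eigh(r_yy)             # ascending eigenvalues
    lam, v = lam[::-1], v[:, ::-1]            # sort in descending order
    sigma2 = lam[d:].mean()                   # (1/(M-D)) tr(Lambda_n)
    v_s = v[:, :d]                            # signal eigenvectors
    # V_s (Lambda_s - sigma2 I_D) V_s^H, scaling the columns of V_s.
    r_dn = (v_s * (lam[:d] - sigma2)) @ v_s.conj().T
    return sigma2, r_dn


m, d, sigma2_true = 8, 2, 0.1
phis = np.array([0.5, 1.3])
r_xx = np.array([[1.0, 0.95], [0.95, 1.0]])
a = np.exp(-1j * np.outer(np.arange(m), phis))
r_signal = a @ r_xx @ a.conj().T
sigma2_hat, r_dn_hat = denoise(r_signal + sigma2_true * np.eye(m), d)
print(sigma2_hat)
```

With a sample covariance matrix in place of the exact one, the same function returns estimates whose accuracy depends on $N$ and on the SNR.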
5.4 Theoretical Analysis
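In this section we study the behavior of Algorithm 2. We start with an
observation about its iterative process: assuming that at the $i$th iteration the weight
matrix is given by $W_i = (B\mathbf{D}B^H)^\dagger$, where $\mathbf{D}$ is a $D \times D$ diagonal matrix, the entries
of which are strictly larger than zero, we have $W_{i+2} = (B\mathbf{D}B^H)^\dagger = W_i$. The matrix
$W_{i+1}$ at the next iteration is given by:
$$\begin{aligned}
W_{i+1} &= [\mathrm{conj}(F\vec{R}_{dn}F^T)]^\dagger = [\mathrm{conj}(F(\vec{A}\vec{R}_{xx}\vec{A}^H)F^T)]^\dagger = (B\,\mathrm{conj}(\vec{R}_{xx})B^H)^\dagger \\
&= (B\,\mathrm{conj}(R_{xx}\circ(B^H W_i B))B^H)^\dagger = (B\mathbf{D}_R\mathbf{D}^\dagger B^H)^\dagger = (B^H)^\dagger\mathbf{D}\mathbf{D}_R^\dagger B^\dagger,
\end{aligned} \tag{5.8}$$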
where $\mathbf{D}_R$ is a diagonal matrix whose main diagonal is equal to that of the signal
covariance matrix $R_{xx}$. Similarly, at the $(i+2)$th iteration the matrix $W_{i+2}$ is
$$\begin{aligned}
W_{i+2} &= (B\,\mathrm{conj}(R_{xx}\circ(B^H(B^H)^\dagger\mathbf{D}\mathbf{D}_R^\dagger B^\dagger B))B^H)^\dagger \\
&= (B\mathbf{D}_R\mathbf{D}_R^\dagger\mathbf{D}B^H)^\dagger = (B\mathbf{D}B^H)^\dagger = W_i.
\end{aligned} \tag{5.9}$$
Eqs. (5.8) and (5.9) reveal that if at the $i$th iteration $W_i$ is equal to the optimum
weight matrix$^1$ $W_{\mathrm{opt}} = (BB^H)^\dagger$, or to some matrix of the form $\alpha(BB^H)^\dagger$ for $\alpha > 0$,
then the algorithm will enter a loop where its output will alternate between two
values, one of which is equal to $W_{\mathrm{opt}}$. Motivated by this observation, we will treat
the IWMA algorithm's iteration steps in pairs, starting at an even iteration step.
The behavior of the proposed IWMA algorithm will be investigated by studying
how the source correlation matrix adapts after every two consecutive iterations. Let
$\vec{R}_{xx,i}$ ($i = 1, 2, 3, \ldots, n$) (again $n$ is a positive even number) denote the spatially
smoothed source covariance matrix implicit in the $i$th iteration, i.e., a $D \times D$ matrix
such that $\vec{R}_{dn,i} = \vec{A}\vec{R}_{xx,i}\vec{A}^H$. We will now show that two consecutive iterations of
the IWMA algorithm modify the $i$th source covariance matrix $\vec{R}_{xx,i}$ to the $(i+2)$th
source covariance matrix $\vec{R}_{xx,i+2}$ as follows:
$$\vec{R}_{xx,i+2} = R_{xx}\circ[\bar{R}_{xx}\circ(\vec{R}_{xx,i})^\dagger]^\dagger, \quad i = 0, 2, \ldots, n, \tag{5.10}$$
where $\vec{R}_{xx,0}$ is defined as $\vec{R}_{xx,0} \triangleq (\frac{1}{Q}\bar{B}^H\bar{B})^\dagger$. Indeed, for the first four iteration steps
of the IWMA algorithm we have
$$\vec{R}_{xx,1} = R_{xx}\circ\bigl(\tfrac{1}{Q}B^HB\bigr) \tag{5.11}$$
$$\vec{R}_{xx,2} = R_{xx}\circ\Bigl(B^H\bigl(B\bigl(\bar{R}_{xx}\circ(\tfrac{1}{Q}\bar{B}^H\bar{B})\bigr)B^H\bigr)^\dagger B\Bigr) = R_{xx}\circ\bigl(\bar{R}_{xx}\circ(\tfrac{1}{Q}\bar{B}^H\bar{B})\bigr)^\dagger. \tag{5.12}$$
1 We note that $W_{\mathrm{opt}} = (B\mathbf{D}B^H)^\dagger$ for $\mathbf{D} = I$.
$$\vec{R}_{xx,3} = R_{xx}\circ(\mathrm{conj}(\vec{R}_{xx,2}))^\dagger \tag{5.13}$$
$$\vec{R}_{xx,4} = R_{xx}\circ(\bar{R}_{xx}\circ(\vec{R}_{xx,2})^\dagger)^\dagger. \tag{5.14}$$
Proceeding by induction on $i$ and assuming that the updating formula is true for
some even-numbered integer $i$ and any even numbers less than $i$, we have
$$\vec{R}_{xx,i+1} = R_{xx}\circ(\mathrm{conj}(\vec{R}_{xx,i}))^\dagger \tag{5.15}$$
$$\vec{R}_{xx,i+2} = R_{xx}\circ(\bar{R}_{xx}\circ(\vec{R}_{xx,i})^\dagger)^\dagger, \tag{5.16}$$
which proves the updating formula as given in (5.10). The updating procedure depends
on the two matrices $\vec{R}_{xx,0}$ and $R_{xx}$, both of which are positive definite$^2$. As
the Hadamard product of positive definite Hermitian matrices maintains positive definiteness [76],
the above iteration process is non-degenerate and $\vec{R}_{xx,i}$ is always positive definite
and Hermitian. Thus the pseudo-inverse operation in (5.10) can be replaced by the
ordinary matrix inverse, i.e., $\vec{R}_{xx,i+2} = R_{xx}\circ[\bar{R}_{xx}\circ(\vec{R}_{xx,i})^{-1}]^{-1}$, $i = 0, 2, \ldots, n$.
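A worked two-source example makes the effect of one pair of iterations concrete. The sketch below assumes that the inner factor of the two-step update is the conjugate $\bar{R}_{xx}$, as obtained by applying (5.15) twice, and takes unit-power sources for brevity; the symbols $a$, $b$, $c$, $\rho$ and $\gamma$ are introduced here for illustration only.

```latex
\begin{align*}
R_{xx} &= \begin{bmatrix} 1 & \rho \\ \bar{\rho} & 1 \end{bmatrix},
\qquad
\vec{R}_{xx,i} = \begin{bmatrix} a & b \\ \bar{b} & c \end{bmatrix},
\qquad a, c > 0,\ ac > |b|^2,\ |\rho| < 1, \\[4pt]
\bar{R}_{xx}\circ(\vec{R}_{xx,i})^{-1}
  &= \frac{1}{ac-|b|^2}
     \begin{bmatrix} c & -\bar{\rho}\,b \\ -\rho\,\bar{b} & a \end{bmatrix}, \\[4pt]
\vec{R}_{xx,i+2}
  &= R_{xx}\circ\bigl[\bar{R}_{xx}\circ(\vec{R}_{xx,i})^{-1}\bigr]^{-1}
   = \gamma \begin{bmatrix} a & |\rho|^2 b \\ |\rho|^2\bar{b} & c \end{bmatrix},
\qquad
\gamma = \frac{ac-|b|^2}{ac-|\rho|^2|b|^2} \le 1 .
\end{align*}
```

In this special case each pair of iterations multiplies the normalized off-diagonal term $|b|/\sqrt{ac}$ by $|\rho|^2$ and scales the diagonal by $\gamma \le 1$, so the matrix moves toward a diagonal one whenever the sources are not coherent.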
The next theorem establishes the fact that two consecutive iterations of the
IWMA algorithm result in a reduction of the Frobenius norm of the source covariance
matrix.
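Theorem 5 For $i = 0, 2, \ldots, n$, we have
$$\bigl\|\vec{R}_{xx,i+2}\bigr\|_F^2 \leq \bigl\|\vec{R}_{xx,i}\bigr\|_F^2, \tag{5.17}$$
where $\|\cdot\|_F$ denotes the Frobenius matrix norm.
Proof: Letting $\Sigma_i \triangleq (\vec{R}_{xx,i})^{-1}$ we need to show that
$$\bigl\|R_{xx}\circ(\bar{R}_{xx}\circ\Sigma_i)^{-1}\bigr\|_F^2 \leq \bigl\|\Sigma_i^{-1}\bigr\|_F^2. \tag{5.18}$$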
2 The matrix $\vec{R}_{xx,0} = \frac{1}{Q}\bar{B}^H\bar{B}$ is positive definite as long as $Q \geq D$ and the DoAs of the input
signals are distinct. $R_{xx}$ is assumed to be positive definite throughout this text.
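We first note that
$$\vec{R}_{xx,i+2} = R_{xx}\circ(\bar{R}_{xx}\circ\Sigma_i)^{-1} = C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}, \tag{5.19}$$
where $C_R \triangleq U_R R_{xx}U_R$, $\bar{C}_R \triangleq U_R\bar{R}_{xx}U_R$ and $U_R \triangleq \operatorname{diag}([r_{11}^{-1/2}, r_{22}^{-1/2}, \ldots, r_{DD}^{-1/2}]^T)$,
with $r_{ii}$ denoting the $i$th diagonal entry of $R_{xx}$. $C_R$ is a correlation matrix and so is
$\bar{C}_R$; therefore we have
$$\bigl\|R_{xx}\circ(\bar{R}_{xx}\circ\Sigma_i)^{-1}\bigr\|_F^2 = \bigl\|C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}\bigr\|_F^2 \leq \bigl\|(\bar{C}_R\circ\Sigma_i)^{-1}\bigr\|_F^2. \tag{5.20}$$
Thus it is sufficient to show that
$$\bigl\|(\bar{C}_R\circ\Sigma_i)^{-1}\bigr\|_F^2 \leq \bigl\|\Sigma_i^{-1}\bigr\|_F^2. \tag{5.21}$$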
We have the following properties regarding the Hadamard product of a Hermitian
matrix and a correlation matrix [92, Theorem 5.5.11]: the set of eigenvalues of $\bar{C}_R\circ\Sigma_i$,
$\lambda(\bar{C}_R\circ\Sigma_i)$, is majorized$^3$ by the set of eigenvalues of $\Sigma_i$, $\lambda(\Sigma_i)$. Combined with the
fact that the function $f(\lambda_l) = \lambda_l^{-2}$ is convex, we have [90, p. 64]
$$\sum_{l=1}^{D}[\lambda_l(\bar{C}_R\circ\Sigma_i)]^{-2} \leq \sum_{l=1}^{D}[\lambda_l(\Sigma_i)]^{-2}, \tag{5.22}$$
which is exactly the relationship (5.21). Together with (5.20) this proves the
relationship (5.18). It is easily seen that the equality in (5.18) holds if and only if $(\bar{C}_R\circ\Sigma_i)^{-1}$
is diagonal which, in turn, is diagonal if and only if $\Sigma_i$ is diagonal.
3 For real vectors $a$ and $b$ of the same size $n$, $a$ is said to majorize $b$ if (i) $\sum_{i=1}^{n} a_i = \sum_{i=1}^{n} b_i$ and (ii) $\sum_{i=1}^{k} a_{[i]} \geq \sum_{i=1}^{k} b_{[i]}$, $k = 1, 2, \ldots, n-1$, where $a_{[i]}$ denotes the $i$th largest element of $a$ and $b_{[i]}$ denotes the $i$th largest element of $b$. If condition (i) is relaxed to $\sum_{i=1}^{n} a_i \geq \sum_{i=1}^{n} b_i$ then $a$ is said to weakly submajorize $b$. Intuitively, majorization is a partial order over vectors of real numbers and the statement "$a$ majorizes $b$" means that the components of $a$ are more spread out than those of $b$.
Moreover, two consecutive iterations of the IWMA algorithm lead to a reduction
of the eigenvalue range of the source signal correlation matrix. This is formally
stated by the following theorem.
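Theorem 6 The range of the eigenvalues of $\vec{R}_{xx,i+2}$ is confined within that of $\vec{R}_{xx,i}$,
i.e.,
$$\lambda_{\min}(\vec{R}_{xx,i}) \leq \lambda_{\min}(\vec{R}_{xx,i+2}) \leq \lambda_{\max}(\vec{R}_{xx,i+2}) \leq \lambda_{\max}(\vec{R}_{xx,i}). \tag{5.23}$$
Proof: It is equivalent to show that
$$\lambda_{\min}(\Sigma_i^{-1}) \leq \lambda_{\min}[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}] \leq \lambda_{\max}[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}] \leq \lambda_{\max}(\Sigma_i^{-1}). \tag{5.24}$$
The set of eigenvalues of $\Sigma_i$, $\lambda(\Sigma_i)$, majorizes the set of eigenvalues of $\bar{C}_R\circ\Sigma_i$,
$\lambda(\bar{C}_R\circ\Sigma_i)$. Therefore,
$$\lambda_{\min}(\Sigma_i) \leq \lambda_{\min}(\bar{C}_R\circ\Sigma_i) \leq \ldots \leq \lambda_{\max}(\bar{C}_R\circ\Sigma_i) \leq \lambda_{\max}(\Sigma_i). \tag{5.25}$$
This in turn gives
$$\lambda_{\max}^{-1}(\Sigma_i) \leq [\lambda_{\max}(\bar{C}_R\circ\Sigma_i)]^{-1} \leq \ldots \leq [\lambda_{\min}(\bar{C}_R\circ\Sigma_i)]^{-1} \leq \lambda_{\min}^{-1}(\Sigma_i). \tag{5.26}$$
We then have
$$\begin{aligned}
\lambda_{\min}(\Sigma_i^{-1}) &\leq \lambda_{\min}[(\bar{C}_R\circ\Sigma_i)^{-1}] \leq \lambda_{\min}[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}] \\
&\leq \ldots \leq \lambda_{\max}[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}] \leq \lambda_{\max}[(\bar{C}_R\circ\Sigma_i)^{-1}] \leq \lambda_{\max}(\Sigma_i^{-1}),
\end{aligned} \tag{5.27}$$
which completes the proof.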
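Theorems 5 and 6 can be spot-checked numerically on random inputs. The sketch below is a sanity check, not a proof: it assumes the two-step update in the form $R_{xx}\circ[\bar{R}_{xx}\circ S^{-1}]^{-1}$ with $S$ standing for $\vec{R}_{xx,i}$, and the helper names, matrix size and random seed are arbitrary choices.

```python
import numpy as np


def two_step(r_xx, s):
    """Two-step update R_xx o [conj(R_xx) o S^{-1}]^{-1} (o is the Hadamard product)."""
    return r_xx * np.linalg.inv(np.conj(r_xx) * np.linalg.inv(s))


def random_pd(d, rng):
    """Random Hermitian positive definite d x d matrix."""
    g = rng.standard_normal((d, 2 * d)) + 1j * rng.standard_normal((d, 2 * d))
    return g @ g.conj().T / (2 * d)


rng = np.random.default_rng(2)
r_xx, s = random_pd(4, rng), random_pd(4, rng)
s_next = two_step(r_xx, s)

eig_old = np.linalg.eigvalsh(s)            # ascending eigenvalues of S
eig_new = np.linalg.eigvalsh(s_next)       # ascending eigenvalues of the update
norm_old = np.linalg.norm(s, "fro")
norm_new = np.linalg.norm(s_next, "fro")
print(norm_new <= norm_old, eig_old[0] <= eig_new[0], eig_new[-1] <= eig_old[-1])
```

The three printed comparisons correspond to (5.17) and to the two outer inequalities of (5.23).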
A consequence of Theorem 6 is that the eigenvalues of $\vec{R}_{xx,i+2}$ are always kept
within the range $[\lambda_{\max}^{-1}(\frac{1}{Q}\bar{B}^H\bar{B}), \lambda_{\min}^{-1}(\frac{1}{Q}\bar{B}^H\bar{B})]$, which is the range of the eigenvalues
of $\Sigma_0 = (\vec{R}_{xx,0})^{-1}$.
Theorem 7 The eigenvalue vector of $\vec{R}_{xx,i+2}$, $\lambda(\vec{R}_{xx,i+2})$, is weakly submajorized by
the eigenvalue vector of $\vec{R}_{xx,i}$, $\lambda(\vec{R}_{xx,i})$. Moreover,
$$\lambda(\vec{R}_{xx,i+2}) = Q_i\lambda(\vec{R}_{xx,i}), \tag{5.28}$$
where $Q_i$ is a doubly substochastic matrix, i.e., a square matrix whose entries are
nonnegative and whose row sums and column sums are less than or equal to one.
Proof: It suffices to show that $\lambda(\vec{R}_{xx,i+2}) = \lambda(R_{xx}\circ(\bar{R}_{xx}\circ\Sigma_i)^{-1}) = \lambda(C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1})$
is weakly submajorized by $\lambda(\Sigma_i^{-1})$ and that $\lambda(C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}) = Q\lambda(\Sigma_i^{-1})$, where $Q$ is a doubly
substochastic matrix. We first see that the vector of eigenvalues $\lambda(\bar{C}_R\circ\Sigma_i)$ is majorized
by that of $\Sigma_i$, $\lambda(\Sigma_i)$. Since $f(\lambda_l) = \lambda_l^{-1}$ is a convex function for positive
eigenvalues $\lambda_l$, $l = 1, 2, \ldots, D$, $\lambda[(\bar{C}_R\circ\Sigma_i)^{-1}]$ is weakly submajorized by $\lambda(\Sigma_i^{-1})$ [90,
p. 115]. Next, since $\lambda[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}]$ is majorized by $\lambda[(\bar{C}_R\circ\Sigma_i)^{-1}]$, we
conclude that $\lambda[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}]$ is submajorized by $\lambda(\Sigma_i^{-1})$. Finally, since each value
of $\lambda[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}]$ and $\lambda(\Sigma_i^{-1})$ is nonnegative, we then have that [90, p. 27]
$\lambda[C_R\circ(\bar{C}_R\circ\Sigma_i)^{-1}] = Q_i\lambda(\Sigma_i^{-1})$.
The update from $\lambda(\vec{R}_{xx,i})$ to $\lambda(\vec{R}_{xx,i+2})$ as given by (5.28), combined with
(5.23), shows that the eigenvalue range of $\vec{R}_{xx,i+2}$ is squeezed by the algorithm on
a best-effort basis. This is because the $p$th eigenvalue is a weighted sum of the elements
of $\lambda(\vec{R}_{xx,i})$ and the weights are given by the $p$th row of $Q_i$. As $Q_i$ is a doubly
substochastic matrix, the weights are non-negative and their sum is less than or
equal to one. Furthermore, the eigenvalues cannot be squeezed below $\lambda_{\min}(\vec{R}_{xx,i})$, as
seen from (5.23).
To summarize, the working principles of the proposed algorithm as described
by Theorems 5-7 are:
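1. The Frobenius matrix norm of $\tilde{R}_{xx,i+2}$ is a decreasing function of $i$.
2. With increasing $i$, the eigenvalue range of $\tilde{R}_{xx,i+2}$, $[\lambda_{\min}(\tilde{R}_{xx,i+2}), \lambda_{\max}(\tilde{R}_{xx,i+2})]$, is kept within the eigenvalue range two iterations before, i.e.
\[ \overbrace{\lambda_{\min}(\tilde{R}_{xx,i}) \le \underbrace{\lambda_{\min}(\tilde{R}_{xx,i+2}) \le \lambda_{\max}(\tilde{R}_{xx,i+2})}_{\text{eigenvalue range of } \tilde{R}_{xx,i+2}} \le \lambda_{\max}(\tilde{R}_{xx,i})}^{\text{eigenvalue range of } \tilde{R}_{xx,i}}. \tag{5.29} \]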
3. Furthermore, the eigenvalue range is reduced by the algorithm on a best-effort basis, as prescribed by Theorems 6 and 7. Note that the range may or may not be reduced. A strict decrease of the maximum eigenvalue happens when, at the $(i+2)$th iteration, the elements of the first column of $Q_i$ are less than 1. If $R_{xx}$ or $\tilde{R}_{xx,i+2}$ is diagonal, then the eigenvalue range is not reduced. Whether or not the range is reduced depends on $Q_i$ and, ultimately, on the inputs to the algorithm and the current conditions of the iteration. This point is further elaborated on in the next theorem.
Theorem 8 Let $\tilde{R}_{xx,i+2}$ be as defined before (Eq. (5.10)). $\tilde{R}_{xx,i+2}$ approaches a diagonal matrix $D$ of the same size within a finite (and even) number of IWMA iterations. The entries of the main diagonal of $D$ are positive and their range is reduced by the algorithm on a best-effort basis.
Proof: We proceed by considering the two possible cases when the algorithm
is at the (i + 2)th iteration: (i) the matrix (CR ◦ Σi)−1 is diagonal; (ii) the matrix
(CR ◦Σi)−1 is not diagonal.
We shall use the following term to quantify how close $(C_R \circ \Sigma_i)^{-1}$ is to a diagonal matrix:
\[ \sum_{\substack{p,q=1 \\ p \neq q}}^{D} \big|[(C_R \circ \Sigma_i)^{-1}]_{pq}\big|^2 \big(1 - |[C_R]_{pq}|^2\big). \tag{5.30} \]
In (5.30), $[X]_{pq}$ is used to denote the $(p, q)$th element of the matrix $X$. As noted before, $C_R$ is a correlation matrix and it is positive definite ($C_R$ has the same positive definiteness as the signal covariance matrix $R_{xx}$). A correlation matrix has its main diagonal elements equal to 1 and its off-diagonal elements less than or equal to 1 in magnitude. Since the input signals are correlated but not coherent by assumption, none of the off-diagonal entries of $C_R$ has magnitude equal to one. When $(C_R \circ \Sigma_i)^{-1}$ is a diagonal matrix, (5.30) is equal to 0. Otherwise (5.30) is larger than 0 and the magnitude of its value corresponds to the deviation of $(C_R \circ \Sigma_i)^{-1}$ from a diagonal matrix. When (5.30) approaches 0, $(C_R \circ \Sigma_i)^{-1}$ approaches a diagonal matrix. Likewise, the converse is true.
Consider the first case, i.e., at the $(i + 2)$th iteration the matrix $(C_R \circ \Sigma_i)^{-1}$ is diagonal, either because $C_R$ is diagonal (equivalently, because $R_{xx}$ is diagonal) or because $\Sigma_i$ is diagonal (equivalently, because $\tilde{R}_{xx,i}$ is diagonal). We then know from Eq. (5.19) that $\tilde{R}_{xx,i+2}$ is diagonal, which proves the first part of the theorem.
Suppose now that the matrix $(C_R \circ \Sigma_i)^{-1}$ is not diagonal at the $(i + 2)$th iteration. We then have
\[
\begin{aligned}
\|\tilde{R}_{xx,i}\|_F^2 - \|\tilde{R}_{xx,i+2}\|_F^2 &= \|\Sigma_i^{-1}\|_F^2 - \|C_R \circ (C_R \circ \Sigma_i)^{-1}\|_F^2 \\
&\ge \|(C_R \circ \Sigma_i)^{-1}\|_F^2 - \|C_R \circ (C_R \circ \Sigma_i)^{-1}\|_F^2 \\
&= \sum_{\substack{p,q=1 \\ p \neq q}}^{D} \big|[(C_R \circ \Sigma_i)^{-1}]_{pq}\big|^2 \big(1 - |[C_R]_{pq}|^2\big) > 0,
\end{aligned} \tag{5.31}
\]
where the second line follows from the inequality (5.21). From (5.31) we have that the squared Frobenius norm $\|\tilde{R}_{xx,i+2}\|_F^2$ is decreasing, by an amount that is larger than or equal to (5.30), which quantifies how close $(C_R \circ \Sigma_i)^{-1}$ is to a diagonal matrix. If the amount of decrease approaches zero, then $(C_R \circ \Sigma_i)^{-1}$ approaches a diagonal matrix, which proves the first part of the theorem. Otherwise we continue our discussion as below.
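The last equality in (5.31) is an exact identity for any matrix and any correlation matrix with unit diagonal, which makes it easy to check in code. The sketch below is illustrative only: the function names and the 2 x 2 example values are assumptions chosen for the demonstration, not quantities from the simulations.

```python
import numpy as np

def offdiag_measure(x, c_r):
    """Evaluate (5.30): sum over p != q of |x_pq|^2 (1 - |c_pq|^2)."""
    weights = 1.0 - np.abs(c_r) ** 2
    np.fill_diagonal(weights, 0.0)          # exclude the terms with p == q
    return float(np.sum(np.abs(x) ** 2 * weights))

def frobenius_gap(x, c_r):
    """Evaluate ||x||_F^2 - ||c_r o x||_F^2 ('o' = Hadamard product)."""
    return float(np.linalg.norm(x, "fro") ** 2
                 - np.linalg.norm(c_r * x, "fro") ** 2)

# Hypothetical example: two sources with correlation 0.9.
c_r = np.array([[1.0, 0.9], [0.9, 1.0]])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.linalg.inv(c_r * sigma)              # (C_R o Sigma_i)^{-1}

print(frobenius_gap(x, c_r), offdiag_measure(x, c_r))
```

Both printed numbers agree up to rounding, and they are strictly positive here because `x` is not diagonal and the off-diagonal entries of `c_r` have magnitude below one.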
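As long as $(C_R \circ \Sigma_i)^{-1}$ deviates from a diagonal matrix for $i = 0, 2, 4, \ldots$, we have
\[ \sum_{i=0,2,4,\ldots} \big( \|\tilde{R}_{xx,i}\|_F^2 - \|\tilde{R}_{xx,i+2}\|_F^2 \big) \ge \sum_{i=0,2,4,\ldots} \sum_{\substack{p,q=1 \\ p \neq q}}^{D} \big|[(C_R \circ \Sigma_i)^{-1}]_{pq}\big|^2 \big(1 - |[C_R]_{pq}|^2\big), \tag{5.32} \]
and
\[
\begin{aligned}
\|\tilde{R}_{xx,i+2}\|_F^2 &= \|\tilde{R}_{xx,0}\|_F^2 - \sum_{i=0,2,4,\ldots} \big( \|\tilde{R}_{xx,i}\|_F^2 - \|\tilde{R}_{xx,i+2}\|_F^2 \big) \\
&\le \|\tilde{R}_{xx,0}\|_F^2 - \sum_{i=0,2,4,\ldots} \sum_{\substack{p,q=1 \\ p \neq q}}^{D} \big|[(C_R \circ \Sigma_i)^{-1}]_{pq}\big|^2 \big(1 - |[C_R]_{pq}|^2\big).
\end{aligned} \tag{5.33}
\]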
Note that $\|\tilde{R}_{xx,i+2}\|_F^2$ cannot be decreased indefinitely towards 0. This can be explained by viewing the vector of the eigenvalues of the smoothed covariance matrix $\tilde{R}_{xx,i+2}$ at the $(i+2)$th iteration, i.e. $\lambda_{i+2} = [\lambda_1(\tilde{R}_{xx,i+2}), \ldots, \lambda_D(\tilde{R}_{xx,i+2})]^T$, as a point within the $D$-dimensional real space $\mathbb{R}^D$. $\lambda_{i+2}$ lies on the surface of a hypersphere centered at the origin $[0, 0, \ldots, 0]^T$ with radius $\|\tilde{R}_{xx,i+2}\|_F = \big[\sum_{l=1}^{D} \lambda_l^2(\tilde{R}_{xx,i+2})\big]^{1/2}$. The decrease of $\|\tilde{R}_{xx,i+2}\|_F^2$ with increasing $i$ corresponds to the shrinking of the hypersphere on which $\lambda_{i+2}$ lies. At the same time, $\lambda_{i+2}$ lies within an imaginary hypercube, each side of which runs from $[0, \ldots, 0, \lambda_{\min}(\tilde{R}_{xx,i+2}), 0, \ldots, 0]^T$ to $[0, \ldots, 0, \lambda_{\max}(\tilde{R}_{xx,i+2}), 0, \ldots, 0]^T$. According to Theorem 6 the hypercube may or may not shrink with increasing $i$, but it is always contained within the previous hypercube, as suggested by the inequality (5.23). As $\lambda_{i+2}$ is confined within the hypercube, $\|\tilde{R}_{xx,i+2}\|_F^2$ is prevented from decreasing indefinitely. Furthermore, the shrinking of the hypersphere combined with the shrinking of the hypercube defines the behavior of the iterative algorithm.
With increasing $i$, the value of $\|\tilde{R}_{xx,i+2}\|_F^2$ decreases as in (5.33), and we will have $\|\tilde{R}_{xx,i+2}\|_F^2 = \lambda_{\max}^2(\tilde{R}_{xx,i+2}) + \sum_{l=1}^{D-1} \lambda_{\min}^2(\tilde{R}_{xx,i+2})$, beyond which point further iterations will strictly decrease the maximum eigenvalue. For example, with $j$ additional iterations ($j$ even and $j > 0$) we will have $\|\tilde{R}_{xx,i+2+j}\|_F^2 < \|\tilde{R}_{xx,i+2}\|_F^2$ and $\lambda_{\min}(\tilde{R}_{xx,i+2}) \le \lambda_{\min}(\tilde{R}_{xx,i+2+j}) \le \lambda_{\max}(\tilde{R}_{xx,i+2+j}) \le \lambda_{\max}(\tilde{R}_{xx,i+2})$, which implies that $\lambda_{\max}(\tilde{R}_{xx,i+2+j}) < \lambda_{\max}(\tilde{R}_{xx,i+2})$. Eventually we will have $\|\tilde{R}_{xx,i+2+j}\|_F^2 = \sum_{l=1}^{D} \lambda_{\min}^2(\tilde{R}_{xx,i+2+j})$ for some $j$. At this time the hypersphere and the hypercube will have only one intersection point, which is $\lambda_{i+2+j}$, and the eigenvalues are seen to be squeezed towards some real number $\alpha$ which is within the predetermined range $\big[\lambda_{\max}^{-1}\big(\tfrac{1}{Q} B^H B\big), \lambda_{\min}^{-1}\big(\tfrac{1}{Q} B^H B\big)\big]$. The quantity $\|\tilde{R}_{xx,i+2+j}\|_F^2$ can no longer be decreased with increasing $j$ (and the hypersphere will not shrink). From Theorem 5 we see that this means that the equality of (5.18) holds and $\tilde{R}_{xx,i+2+j}$ is $\alpha I$. This completes the proof of the first part of the claim.
As the eigenvalue range is decreased by the algorithm on a best-effort basis, the range of the entries of the final diagonal matrix $D$ is likewise decreased on a best-effort basis.
With the above, we see that the IWMA algorithm converges to the two alternating stages (5.8) and (5.9) and will stabilize.
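The behavior described in the proof can be observed by iterating the two-step map used in the proof of Theorem 7, $\tilde{R}_{xx,i+2} = C_R \circ (C_R \circ \tilde{R}_{xx,i}^{-1})^{-1}$. The sketch below is an illustration under stated assumptions: the matrices `c_r` and `r0` are made-up real symmetric examples, the function names are hypothetical, and the loop reproduces only this map, not the full IWMA procedure of (5.8) and (5.9).

```python
import numpy as np

def iterate_two_step(c_r, r0, num_updates):
    """Apply R <- C_R o (C_R o R^{-1})^{-1} repeatedly and return all iterates."""
    iterates = [r0]
    for _ in range(num_updates):
        r = iterates[-1]
        iterates.append(c_r * np.linalg.inv(c_r * np.linalg.inv(r)))
    return iterates

def offdiag_energy(r):
    """Squared Frobenius norm of the off-diagonal part of r."""
    return float(np.sum(np.abs(r) ** 2) - np.sum(np.abs(np.diag(r)) ** 2))

# Hypothetical inputs: a positive definite correlation matrix and a start matrix.
c_r = np.array([[1.0, 0.9, 0.81],
                [0.9, 1.0, 0.9],
                [0.81, 0.9, 1.0]])
r0 = np.array([[2.0, 1.0, 0.5],
               [1.0, 3.0, 1.0],
               [0.5, 1.0, 1.5]])

iterates = iterate_two_step(c_r, r0, 20)
norms = [np.linalg.norm(r, "fro") for r in iterates]
print("Frobenius norm, first and last:", norms[0], norms[-1])
print("off-diagonal energy, first and last:",
      offdiag_energy(iterates[0]), offdiag_energy(iterates[-1]))
```

The Frobenius norms form a non-increasing sequence and the eigenvalue range of each iterate stays inside that of its predecessor, as Theorems 5 through 7 predict; how quickly the off-diagonal energy decays depends on the inputs.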
Some remarks are now in order.
• Although the algorithm reduces the range of the entries of $D$ on a best-effort basis, it is possible that $\tilde{R}_{xx,i+2}$ will converge to some diagonal matrix $D$ instead of $\alpha I$ (which is the case when the algorithm converges before the hypersphere and the hypercube intersect at only one point), as this depends on the inputs to the algorithm, over which the algorithm has no control. Extensive simulations have shown that, under the assumptions of interest within this text, $D$ deviates only slightly from an ideal diagonal matrix $\alpha I$.
• The dependence of IWMA upon the correlation matrix $C_R$ (Eq. (5.20)) implies that the algorithm is suited to highly correlated input signals, the cross-correlation coefficients of which are very close to one; it is not applicable to coherent signals, for which the cross-correlation coefficients are all equal to one.
5.5 DoA Estimation Using Wn
With IWMA we have $W_n = (B D B^H)^{\dagger}$ for sufficiently large $n$. $W_n$ is well structured and is a suitable basis upon which DoA estimation can be directly carried out. This is because $D$ is a diagonal matrix whose diagonal elements have a spread that the algorithm minimizes on a best-effort basis (again, this is suggested by Theorems 6 and 7). DoA estimation is carried out by identifying the $Q \times (Q - D)$ dimensional noise subspace that is associated with $W_n$, and using the noise subspace to perform MUSIC-type DoA estimation.
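As a rough illustration of this step, the sketch below runs a generic MUSIC-type search on a weight matrix of the form $(B D B^H)^{\dagger}$. It is a sketch under stated assumptions rather than the estimator used in the simulations: the steering vector $a(\phi) = [1, e^{j\phi}, \ldots, e^{j(Q-1)\phi}]^T$ for electrical angle $\phi$, the noise-free matrix, the grid search, and all function names are assumptions made for the example.

```python
import numpy as np

def steering_matrix(num_elements, angles):
    """Columns a(phi) = [1, e^{j phi}, ..., e^{j (Q-1) phi}]^T (assumed ULA model)."""
    k = np.arange(num_elements)[:, None]
    return np.exp(1j * k * np.asarray(angles)[None, :])

def music_spectrum(w, num_sources, grid):
    """Pseudospectrum 1 / ||E_n^H a(phi)||^2, E_n spanning the noise subspace of w."""
    w = 0.5 * (w + w.conj().T)                      # enforce Hermitian symmetry
    _, eigvecs = np.linalg.eigh(w)                  # eigenvalues in ascending order
    noise = eigvecs[:, : w.shape[0] - num_sources]  # smallest Q - D eigenvalues
    proj = noise.conj().T @ steering_matrix(w.shape[0], grid)
    return 1.0 / (np.sum(np.abs(proj) ** 2, axis=0) + 1e-12)

def pick_peaks(spectrum, grid, num_sources):
    """Grid angles of the num_sources largest local maxima of the spectrum."""
    mid = np.arange(1, len(grid) - 1)
    is_peak = (spectrum[mid] > spectrum[mid - 1]) & (spectrum[mid] >= spectrum[mid + 1])
    peaks = mid[is_peak]
    best = peaks[np.argsort(spectrum[peaks])[-num_sources:]]
    return np.sort(grid[best])

# Hypothetical noise-free example with Q = 3 subarrays and D = 2 sources.
true_angles = np.array([np.pi / 7, np.pi / 4])
b = steering_matrix(3, true_angles)
d = np.diag([1.0, 1.2])
# An explicit rcond keeps the numerically zero singular value from being inverted.
w_n = np.linalg.pinv(b @ d @ b.conj().T, rcond=1e-10)

grid = np.linspace(-np.pi, np.pi, 20001)
estimates = pick_peaks(music_spectrum(w_n, 2, grid), grid, 2)
print(estimates)
```

Because $W_n$ is a pseudoinverse of rank $D$, its noise subspace corresponds to its (numerically) zero eigenvalues, which is why the eigenvectors of the smallest eigenvalues are used here.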
5.6 Simulations
In the first simulation (shown in Fig. 5.1) we demonstrate the operation of the proposed algorithm by plotting the approximation error of $W_{opt}$ by $W_i$, as a function of the number of iterations for different values of the array snapshot sample size $N$. The approximation error is defined as $\min_c e(c)$ where $e(c) \triangleq \|W_i - cW_{opt}\|_F^2$. It can be shown that the value of $c$ that minimizes $e(c)$ is given by
\[ c = \frac{\sum_{p,q=1}^{Q} [W_i]_{pq}^{*}[W_{opt}]_{pq} + \sum_{p,q=1}^{Q} [W_{opt}]_{pq}^{*}[W_i]_{pq}}{2\sum_{p,q=1}^{Q} [W_{opt}]_{pq}^{*}[W_{opt}]_{pq}}. \tag{5.34} \]
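For reference, (5.34) and the resulting error can be evaluated in a few lines. This is a minimal sketch; the function names and the 2 x 2 test matrices are hypothetical. Note that the numerator of (5.34) is a sum of a quantity and its complex conjugate, so the resulting $c$ is real.

```python
import numpy as np

def optimal_scale(w_i, w_opt):
    """Real scalar c of (5.34) that minimizes ||w_i - c * w_opt||_F^2."""
    # np.vdot conjugates its first argument and sums over all entries.
    cross = np.vdot(w_i, w_opt) + np.vdot(w_opt, w_i)
    return float(np.real(cross) / (2.0 * np.real(np.vdot(w_opt, w_opt))))

def approximation_error(w_i, w_opt):
    """min_c e(c), with e(c) = ||w_i - c * w_opt||_F^2."""
    c = optimal_scale(w_i, w_opt)
    return float(np.linalg.norm(w_i - c * w_opt, "fro") ** 2)

# Hypothetical example: w_i is a scaled copy of w_opt plus a small perturbation.
w_opt = np.array([[1.0, 1.0j], [-1.0j, 2.0]])
w_i = 3.0 * w_opt + 0.01 * np.array([[1.0, 0.0], [0.0, -1.0]])
print(optimal_scale(w_i, w_opt), approximation_error(w_i, w_opt))
```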
In this study, we consider $D = 2$ signals impinging at electrical angles $-0.1098$ and $1.4654$, respectively, on a 5-element ULA that is divided into $Q = 5 - 4 + 1 = 2$ subarrays of size $P = 4$. The system's signal-to-noise ratio (SNR), defined as $\mathrm{tr}(R_{xx})/(D\sigma^2)$, is set at 5 dB. The two sources have equal power and the cross-correlation between them is 0.9. Each curve shown in Fig. 5.1 represents an average over 20 Monte-Carlo runs. The curves show that, when the number of samples is sufficient, the error of the weight matrix estimate decreases with the number of iterations. Note that since even the ideal $W_i$ does not necessarily converge to a scaled version of $W_{opt}$, the floors of the curves in the figure are anticipated.
In our next simulation, we evaluate the performance of the DoA estimator that is based on the weight matrix generated by the IWMA algorithm, as described at the end of the previous section. We consider a 5-element ULA and $D = 2$ incident source signals with electrical angles $\pi/7$ and $\pi/4$, respectively. The subarray size is $P = 3$, which implies that $Q = 5 - 3 + 1 = 3$. The two sources are assumed to have equal power and correlation 0.999. In Fig. 5.2 we plot the total MSE $\sum_{k=1}^{D} |\hat{\phi}_k - \phi_k|^2$ of the two DoA estimates versus the input SNR. The number of observation snapshots is $N = 900$ while the number of iterations of the IWMA algorithm is 300. In this figure we include for comparison purposes the performance of MUSIC, CSS-based MUSIC (for which the subarray size is also $P = 3$), LMUSIC [93] and SSMUSIC [94]. All curves shown are averages over 800 Monte-Carlo runs. The performance improvement is apparent. In Fig. 5.3 we also plot the squared error of the generated weight matrix versus the system's SNR. The error has the same definition as that of the first simulation.
In the third simulation we compare the performance of an IWMA-based DoA
estimator against that of a CSS-based MUSIC estimator for the case of M = 13
antennas. The subarray size is P = 7 (therefore, Q = 7). We consider the scenarios
of D = 2, 3, 4 or 5 incoming sources. In each case, the electrical angles are given by
the first D elements of the vector [0.4488, 0.6283, 0.8976, 1.5708, 2.6180]. As in Fig.
5.2 the sources are assumed to have equal powers and the cross correlation coefficients
between them are all 0.9999. The number of observed snapshots is N = 1500 while
the number of iterations of the IWMA algorithm is 300. In Fig. 5.4 we plot the
total MSEs of the DoA estimates versus SNR. All results shown are averages over 500
Monte-Carlo runs.
5.7 Conclusion
We described an iterative weight matrix approximation (IWMA) procedure that efficiently generates an approximating weight matrix for use in weighted spatial smoothing (WSS). The procedure can effectively decorrelate the incoming signals received by an antenna array. We studied the performance of the algorithm through theoretical analysis and computer simulations. We finally note that the application area of IWMA and of the estimation methods based on it is different from that of conventional spatial smoothing and weighted spatial smoothing.
[Plot: squared error of W (logarithmic scale) versus iteration index i; one curve for each of N = 100, 1000, 10000 and 100000 samples.]
Figure 5.1: Operations of the IWMA algorithm: averaged squared error of Wi versus number of iterations.
[Plot: MSE of DoA estimation (logarithmic scale) versus SNR (dB); curves for MUSIC, SSMUSIC, LMUSIC, DoA estimation using the IWMA weight matrix, and CSS-based MUSIC.]
Figure 5.2: Performance of DoA estimation using IWMA-generated weight matrix: MSE of DoA estimates versus system SNR in dB.
[Plot: squared error of W versus SNR (dB).]
Figure 5.3: Operations of the IWMA algorithm: averaged squared error of Wi versus system SNR in dB.
[Plot: MSE of DoA estimation (logarithmic scale) versus SNR (dB); for each of 2, 3, 4 and 5 sources, one curve for DoA estimation using the IWMA weight matrix and one for CSS-based MUSIC.]
Figure 5.4: Performance of DoA estimation using IWMA-generated weight matrix: MSE of DoA estimates versus system SNR in dB.
Chapter 6
On Two Deterministic Measures
for Linear Processing Space-Time
Block Codes
6.1 Introduction
In this chapter we consider deterministic measures for linear processing space-time block codes. Multiple-input multiple-output (MIMO) is a recent and promising technology for wireless communications, the main idea behind which is the recognition and full exploitation of the spatial degrees of freedom in systems that use multiple transmit and multiple receive antennas. In the 1990s, the capacity gain of such multiple-antenna systems was revealed [95], [96], [97]. This led to enormous research interest in exploiting MIMO's inherent degrees of freedom.
There are two fundamental viewpoints on how MIMO can be used. The first one exploits the possible capacity increase and is represented by the V-BLAST (Vertical Bell Laboratories Layered Space Time) transmission scheme [100] and the transmit beamforming technique [101]. Transmit beamforming utilizes channel state information (CSI) at the transmitter side. For this purpose, research has focused on analyzing various MIMO channels, their corresponding capacities and efficient transmission strategies. A completely different MIMO strategy is to improve the reliability of transmission and achieve full transmit diversity; a representative method is orthogonal space-time block coding (OSTBC), proposed in [103] and [104], which assumes no knowledge of CSI at the transmitter side.
Since the works of [103] and [104], many space-time block coding schemes have
been proposed. B. Hassibi and B. M. Hochwald in [112] proposed the linear disper-
sion codes (LDCs) and a design criterion which maximizes the mutual information
between the (multiple) input symbols and the (multiple) output signals. In [117] the
authors proposed unitary space-time constellations, a technique similar to that of
[118]. Su and Xia in [129] considered several quasi-orthogonal schemes and studied
optimal rotations of the scalar signal constellations which can be used to obtain full
transmit diversity and maximize product distance gain. Constellation rotation was
also considered by N. Sharma and C. Papadias in a well-known work [128]. In [115]
the authors designed a space-time code specifically for systems with two transmitting
antennas which achieves both diversity and multiplexing gain. In [119] the authors
proposed a design criterion which seeks to minimize a quasi-orthogonality measure
of the space-time codes. The criterion is claimed to be equivalent to LDC’s mutual
information. In [121] space-time codes are designed using the mathematical tool of
division algebra (a division algebra has multiplicative inverses) where the rank crite-
rion (correspondingly the transmit diversity [104]) is of major concern and this rank
condition is maintained by the properties of division algebra.
In general, an STBC design involves spatial and temporal dimensions and can
be based on diverse performance and reliability criteria. We may divide the current
approaches to this problem into two main categories:
(i) Designing the overall space-time constellations. This includes, for example,
ST block codes based on division algebra, unitary STBC and minimal quasi-
orthogonality STBC.
(ii) Employing a linear processing (LP) structure ([104], [112]) and separately de-
signing the linear processing matrices (LPM’s) [104] (this definition is similar to
that of the so-called dispersion matrices in [112], for a formal definition within
this context, please refer to Eq. (6.2)) and the scalar signal constellations of
the individual transmitted symbols [104], [112] (see Eq. (6.2)). This cate-
gory includes, for example, OSTBC, quasi-orthogonal STBC (QOSTBC) [104],
QOSTBC with constellation rotation [128] and the linear dispersion codes.
We identify in the following several important considerations when designing
a space-time coding scheme:
(i) Diversity. It requires that the minimum rank of all differences of the ST codewords be maximized. (The rank and determinant criteria are given and discussed in [99] and [102].)
(ii) Coding gain (product distance gain). Maximizing the product distance gain is
a more desirable characteristic as it is directly related to the frame-error-rate
performance of the ST code.
(iii) Linear decoding. A linear (fast) decoder is an important design consideration for ease of implementation and real-time decoding of space-time block codes.
(iv) Adaptability to scalar signal constellations. The consideration of this is specific
to the linear processing schemes, which separate the design of the set of linear
processing matrices from that of the transmitted (complex) signal constella-
tions. A space-time block code from orthogonal design, for example, achieves a
complete decoupling of the design into two separate procedures. On the other
hand, an STBC design with respect to the overall space-time constellation would
not have such problems.
In this work we study linear processing space-time block coding (LP-STBC)
and consider features that characterize the set of linear processing matrices (LPM’s)
associated with an LP-STBC. We discuss deterministic measures defined for the
LPM’s and investigate the relations of these measures to the final performance of
the STBC’s and to other design criteria. These measures are deterministic in their
nature and involve no statistical operators (such as expectations) and are solely con-
cerned with the LPM’s. Within this theme we study and discuss next two such
measures for LP-STBC’s.
The first one is a design criterion obtained by Jensen’s relaxation of the LDC
mutual information. In general the LPM set should be combined with the transmit-
ting constellations in order to evaluate meaningful performance criteria. For exam-
ple, QOSTBC’s FER (Frame Error Rate) performance can be substantially improved
through constellation rotations. Fortunately, the two can be effectively decoupled
using the LDC mutual information criterion [112]. The derivation of the criterion
is based upon the observation that the set of linear processing matrices effectively
reshapes the original MIMO channel. The criterion requires only the statistical dis-
tribution of the channel coefficients.
To obtain the first measure, we proceed by applying Jensen’s Inequality to the
mutual information measure of the linear dispersion code, denoted by CLD, and then
evaluate the statistical expectations. This gives a design criterion that is deterministic
as it does not involve statistical operators and is solely concerned with the LPM
set. We shall show using computer simulations that the criterion maintains a close
relationship with the LDC mutual information, and it has the advantage of simplifying
the LP-STBC design as it separates the design of LPM’s from the statistical properties
of the channel.
The second criterion that will be discussed consists of two metrics that measure the non-orthogonality of the LPM's and are obtained by generalizing the concept of orthogonal STBC [104]. OSTBC and QOSTBC signaling schemes are special cases
of the designs obtained by minimizing these metrics. The first non-orthogonality
metric, termed total-squared-skew-symmetry, is defined with respect to {Ai} (see
Eq. (6.2)), the set of linear processing matrices that is assigned to the real parts of
the transmitting constellations. Similarly, we have non-orthogonality that is defined
with respect to {Bi} (Eq. (6.2)), which are the LPM’s associated with the imaginary
parts. The second metric, termed total-squared-amicability (TSA), involves both sets
of matrices (Ai’s and Bi’s).
We establish that the total-squared-skew-symmetry is a generalized total-squared-
correlation (GTSC). TSC [122], [123] is a measure of non-orthogonality and is com-
monly used in the design of the sequence set for Code Division Multiple Access
(CDMA) systems. We give a lower bound for GTSC that is analogous to that of TSC, i.e., Welch's bound. We then discuss the relationship between minimizing GTSC
and the orthogonal Procrustes rotation problem [126] together with the closed form
solution to the problem. Furthermore, the lower bounds for GTSC are established
using the Hurwitz-Radon family of matrices.
By computer simulations we give several examples of LP-STBC's using this measure. We also demonstrate the relationship of this measure to $C_{LD}$. We establish that this measure reveals less about the performance of the final codes than the first one does. However, a lower bound that is derived here can still indicate well the performance limit of some code designs.
The rest of the chapter is organized as follows. In Section 6.2 we give the system model. In Section 6.3 we study Jensen's relaxation of the LDC mutual information. In Section 6.4 we study the GTSC-TSA design criterion, and the lower bounds for the GTSC metric are given in Section 6.5. In Section 6.6 we describe computer simulations and give several examples of LP-STBC's. We conclude the work in Section 6.7.
6.2 System Model and the Linear Processing ST
Coding Scheme
The system we consider is a narrow-band wireless communication system consisting of $M$ transmit antennas and $N$ receive antennas. The MIMO channel is assumed to be flat-fading. The fading is quasi-static with a coherence time of $T$ channel uses, during which the fades are assumed to be constant, though they may change from one block of $T$ channel uses to the next. The received signal matrix containing the received signal vectors during $T$ channel uses is given as
\[ Y = \sqrt{\frac{\rho}{M}}\, S H + \mathbf{N}. \tag{6.1} \]
Here $Y$ is of dimension $T \times N$ and $H$, of dimension $M \times N$, is the channel matrix whose $(i, j)$th element $[H]_{ij}$ denotes the fading coefficient between the $i$th transmit antenna and the $j$th receive antenna. The channel coefficients $[H]_{ij}$, $i = 1, 2, \ldots, M$ and $j = 1, 2, \ldots, N$, are assumed to be i.i.d. zero-mean unit-variance circularly symmetric complex Gaussian random variables, i.e., $[H]_{ij} \sim \mathcal{CN}(0, 1)$ and $E\{\mathrm{vec}(H)\mathrm{vec}(H)^{*}\} = I_{MN}$, where "$(\cdot)^{*}$" denotes the Hermitian transpose of a matrix and "$E\{\cdot\}$" is the expectation operator. Also, $S$ represents the $T \times M$ space-time transmit code word, the entries of which depend on the ST transmission scheme adopted. Finally, $\mathbf{N}$ denotes the matrix of additive Gaussian noise, which is independent of the transmitted signals. It consists of entries that are $\mathcal{CN}(0, 1)$ distributed and is spatially and temporally white. In (6.1) the baseband transmission $S$ is power constrained such that $E\{\mathrm{vec}(S)^{*}\mathrm{vec}(S)\} = E\{\mathrm{tr}(SS^{*})\} \le MT$. The normalization $\sqrt{\rho/M}$ ensures a receiver-side signal-to-noise ratio per antenna equal to $\rho$.
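The role of the normalization can be made explicit with a short worked step. The following is a sketch under two assumptions that go slightly beyond the text: the power constraint is met with equality and spread evenly, so that $E|[S]_{tm}|^2 = 1$ for every entry, and $S$ is independent of $H$.

```latex
% Received signal power at receive antenna n during channel use t.
% Cross terms vanish because the entries of H are independent and zero mean.
E\left\{ \left| \sqrt{\tfrac{\rho}{M}} \sum_{m=1}^{M} [S]_{tm} [H]_{mn} \right|^2 \right\}
  = \frac{\rho}{M} \sum_{m=1}^{M} E\{|[S]_{tm}|^2\}\, E\{|[H]_{mn}|^2\}
  = \frac{\rho}{M} \cdot M
  = \rho .
```

Since each noise entry has unit variance, the ratio of signal power to noise power at every receive antenna is $\rho$ under these assumptions.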
The design of an ST transmission scheme can be viewed as constructing a linear or non-linear mapping from a block of $r$ constellation points (input symbols) $\{s_1, s_2, \ldots, s_r\}$ to a space-time code word matrix $C$ of dimension $T \times M$, whose $(i, j)$th element is transmitted during the $i$th channel use and at the $j$th antenna. The linear dispersion code, or linear processing space-time block code, is given by [112]
\[ C = \sum_{i=1}^{r} \bar{s}_i A_i + j \sum_{i=1}^{r} \tilde{s}_i B_i. \tag{6.2} \]
Here $j := \sqrt{-1}$, while $\bar{s}_i := \mathrm{Re}\{s_i\}$ and $\tilde{s}_i := \mathrm{Im}\{s_i\}$ are the real and imaginary parts of $s_i$, respectively. The matrices $A_i$ and $B_i \in \mathbb{R}^{n \times s}$ are real-valued linear processing (dispersion) matrices of size $n \times s$ (with $n \ge s$). In this work we consider only LPM's that are real and orthogonal, i.e., the $A_i$'s and $B_i$'s satisfy the following conditions:
\[ A_i^T A_i = A_i A_i^T = I_s, \quad i = 1, 2, \ldots, r, \tag{6.3} \]
and
\[ B_i^T B_i = B_i B_i^T = I_s, \quad i = 1, 2, \ldots, r, \tag{6.4} \]
where "$(\cdot)^T$" denotes matrix transposition. Henceforth the dimension of the identity matrix will be omitted if it can be inferred from the context. In general $r$, $s$ and $n$ are parameters that we can choose. The parameter $r$ denotes the number of input information symbols to be transmitted; $s$ is equivalent to the number of transmit antennas $M$ and $n$ is equivalent to the block length $T$, both of which are defined in the system model given by (6.1). From (6.3) and (6.4) and assuming i.i.d. $\mathcal{CN}(0, 1)$ inputs $s_i$, we see that the power constraint can be satisfied by a normalizing factor $\beta$ such that $E\{\mathrm{tr}(\beta C(\beta C)^{*})\} = sn$.
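To make the structure of (6.2) concrete, the sketch below builds a code word from a list of symbols and linear processing matrices. The function name `ld_codeword` is hypothetical, and the matrices used are those of the well-known 2 x 2 Alamouti code, chosen here purely as a familiar illustration rather than taken from this chapter.

```python
import numpy as np

def ld_codeword(symbols, a_mats, b_mats):
    """C = sum_i Re{s_i} A_i + j * sum_i Im{s_i} B_i, as in (6.2)."""
    c = np.zeros(a_mats[0].shape, dtype=complex)
    for s, a, b in zip(symbols, a_mats, b_mats):
        c += s.real * a + 1j * s.imag * b
    return c

# Illustrative LPM set (r = 2, n = s = 2): the Alamouti code written in LD form.
A = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
B = [np.diag([1.0, -1.0]), np.array([[0.0, 1.0], [1.0, 0.0]])]

s = np.array([1 + 2j, 3 - 1j])
C = ld_codeword(s, A, B)
print(C)   # rows are [s1, s2] and [-conj(s2), conj(s1)]
```

Each matrix in `A` and `B` satisfies the orthogonality conditions (6.3) and (6.4), which is easy to confirm by multiplying it with its transpose.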
6.2.1 The Equivalent MIMO Channel
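In this subsection a MIMO channel model ([112]) that is equivalent to the model of (6.1) is given. Neglecting the noise matrix $\mathbf{N}$ we can rewrite (6.1) as follows:
\[
\begin{aligned}
Y &= \mathrm{Re}\{Y\} + j\,\mathrm{Im}\{Y\} \\
&= \sqrt{\frac{\rho}{M}} \Big( \sum_{i=1}^{r} \bar{s}_i A_i + j \sum_{i=1}^{r} \tilde{s}_i B_i \Big) \times (\bar{H} + j\tilde{H}) \\
&= \sqrt{\frac{\rho}{M}} \sum_{i=1}^{r} (\bar{s}_i A_i \bar{H} - \tilde{s}_i B_i \tilde{H}) + j \sqrt{\frac{\rho}{M}} \sum_{i=1}^{r} (\bar{s}_i A_i \tilde{H} + \tilde{s}_i B_i \bar{H}).
\end{aligned} \tag{6.5}
\]
In (6.5), the operators $\bar{(\cdot)}$ and $\tilde{(\cdot)}$ denote $\mathrm{Re}\{\cdot\}$ and $\mathrm{Im}\{\cdot\}$, respectively. We will use these two sets of notations interchangeably. Expressing $\mathrm{Re}\{Y\}$, $\mathrm{Im}\{Y\}$, $\mathrm{Re}\{H\}$ and $\mathrm{Im}\{H\}$ in terms of their columns:
\[
\begin{aligned}
\mathrm{Re}\{Y\} &= \begin{pmatrix} \bar{y}_1 & \bar{y}_2 & \cdots & \bar{y}_N \end{pmatrix}, \\
\mathrm{Im}\{Y\} &= \begin{pmatrix} \tilde{y}_1 & \tilde{y}_2 & \cdots & \tilde{y}_N \end{pmatrix}, \\
\mathrm{Re}\{H\} &= \begin{pmatrix} \bar{h}_1 & \bar{h}_2 & \cdots & \bar{h}_N \end{pmatrix}, \text{ and} \\
\mathrm{Im}\{H\} &= \begin{pmatrix} \tilde{h}_1 & \tilde{h}_2 & \cdots & \tilde{h}_N \end{pmatrix},
\end{aligned} \tag{6.6}
\]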
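we have:
\[
\begin{bmatrix} \bar{y}_1 \\ \vdots \\ \bar{y}_N \\ \tilde{y}_1 \\ \vdots \\ \tilde{y}_N \end{bmatrix}
= \sqrt{\frac{\rho}{M}}
\underbrace{\begin{bmatrix}
A_1\bar{h}_1 & -B_1\tilde{h}_1 & \cdots & A_r\bar{h}_1 & -B_r\tilde{h}_1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
A_1\bar{h}_N & -B_1\tilde{h}_N & \cdots & A_r\bar{h}_N & -B_r\tilde{h}_N \\
A_1\tilde{h}_1 & B_1\bar{h}_1 & \cdots & A_r\tilde{h}_1 & B_r\bar{h}_1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
A_1\tilde{h}_N & B_1\bar{h}_N & \cdots & A_r\tilde{h}_N & B_r\bar{h}_N
\end{bmatrix}}_{\mathcal{H}}
\begin{bmatrix} \bar{s}_1 \\ \tilde{s}_1 \\ \bar{s}_2 \\ \tilde{s}_2 \\ \vdots \\ \bar{s}_r \\ \tilde{s}_r \end{bmatrix}. \tag{6.7}
\]
The equivalent MIMO channel $\mathcal{H} \in \mathbb{R}^{2NT \times 2r}$ is the matrix that left-multiplies the symbol vector on the right-hand side (RHS) of Eq. (6.7), and is a function of the original channel $H$ and the set of linear processing matrices $\{A_i, B_i, i = 1, 2, \ldots, r\}$. After the transformation (6.7), suboptimal linear decoders can be employed for fast recovery of the $s_i$'s.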
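The equivalence between (6.5) and (6.7) can be confirmed numerically. The sketch below is an illustration with assumed names (`equivalent_channel` is hypothetical) and with the Alamouti matrices again used as a stand-in LPM set; the factor $\sqrt{\rho/M}$ is omitted on both sides because it does not affect the comparison.

```python
import numpy as np

def equivalent_channel(h, a_mats, b_mats):
    """Real 2NT x 2r matrix of (6.7), without the sqrt(rho/M) factor."""
    h_re, h_im = h.real, h.imag
    num_rx = h.shape[1]
    cols = []
    for a, b in zip(a_mats, b_mats):
        # Column multiplying Re{s_i}: A_i Re{h_n} stacked over A_i Im{h_n}.
        col_a = [a @ h_re[:, n] for n in range(num_rx)] + \
                [a @ h_im[:, n] for n in range(num_rx)]
        # Column multiplying Im{s_i}: -B_i Im{h_n} stacked over B_i Re{h_n}.
        col_b = [-(b @ h_im[:, n]) for n in range(num_rx)] + \
                [b @ h_re[:, n] for n in range(num_rx)]
        cols.append(np.concatenate(col_a))
        cols.append(np.concatenate(col_b))
    return np.column_stack(cols)

# Illustrative setup: M = 2 transmit antennas, N = 3 receive antennas, T = 2.
A = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
B = [np.diag([1.0, -1.0]), np.array([[0.0, 1.0], [1.0, 0.0]])]
rng = np.random.default_rng(1)
H = (rng.standard_normal((2, 3)) + 1j * rng.standard_normal((2, 3))) / np.sqrt(2)
s = np.array([0.7 - 0.2j, -1.1 + 0.4j])

C = sum(si.real * a + 1j * si.imag * b for si, a, b in zip(s, A, B))
Y = C @ H                                        # noise-free received block
stacked_y = np.concatenate([Y.real.flatten("F"), Y.imag.flatten("F")])
stacked_s = np.column_stack([s.real, s.imag]).ravel()   # [Re s1, Im s1, Re s2, ...]

H_eq = equivalent_channel(H, A, B)
print(H_eq.shape, np.allclose(H_eq @ stacked_s, stacked_y))
```

For this particular orthogonal LPM set the Gram matrix `H_eq.T @ H_eq` is a scaled identity, which is what makes linear decoding optimal in that special case.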
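The elements of $\mathcal{H}^T\mathcal{H}$, $[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}$, $\kappa = 1, 2, \ldots, 2r$ and $\iota = 1, 2, \ldots, 2r$, and their counterparts across the main diagonal (denoted by $\leftrightarrow$), $[\mathcal{H}^T\mathcal{H}]_{\iota\kappa}$, are given by:
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= \sum_{n=1}^{N} \bar{h}_n^T A_i^T A_j \bar{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T A_i^T A_j \tilde{h}_n \;\leftrightarrow \\
[\mathcal{H}^T\mathcal{H}]_{\iota\kappa} &= \sum_{n=1}^{N} \bar{h}_n^T A_j^T A_i \bar{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T A_j^T A_i \tilde{h}_n,
\end{aligned} \tag{6.8}
\]
for $\kappa = 2i - 1$, $\iota = 2j - 1$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$,
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= \sum_{n=1}^{N} \tilde{h}_n^T B_i^T B_j \tilde{h}_n + \sum_{n=1}^{N} \bar{h}_n^T B_i^T B_j \bar{h}_n \;\leftrightarrow \\
[\mathcal{H}^T\mathcal{H}]_{\iota\kappa} &= \sum_{n=1}^{N} \tilde{h}_n^T B_j^T B_i \tilde{h}_n + \sum_{n=1}^{N} \bar{h}_n^T B_j^T B_i \bar{h}_n,
\end{aligned} \tag{6.9}
\]
for $\kappa = 2i$, $\iota = 2j$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$,
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= -\sum_{n=1}^{N} \bar{h}_n^T A_i^T B_j \tilde{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T A_i^T B_j \bar{h}_n \;\leftrightarrow \\
[\mathcal{H}^T\mathcal{H}]_{\iota\kappa} &= -\sum_{n=1}^{N} \tilde{h}_n^T B_j^T A_i \bar{h}_n + \sum_{n=1}^{N} \bar{h}_n^T B_j^T A_i \tilde{h}_n,
\end{aligned} \tag{6.10}
\]
for $\kappa = 2i - 1$, $\iota = 2j$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$, and
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= -\sum_{n=1}^{N} \tilde{h}_n^T B_i^T A_j \bar{h}_n + \sum_{n=1}^{N} \bar{h}_n^T B_i^T A_j \tilde{h}_n \;\leftrightarrow \\
[\mathcal{H}^T\mathcal{H}]_{\iota\kappa} &= -\sum_{n=1}^{N} \bar{h}_n^T A_j^T B_i \tilde{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T A_j^T B_i \bar{h}_n,
\end{aligned} \tag{6.11}
\]
for $\kappa = 2i$, $\iota = 2j - 1$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$.
In the next section we discuss a relaxation of the LDC mutual information $C_{LD}$ by using Jensen's Inequality.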
6.3 Jensen's Inequality and Relaxation of LDC Mutual Information
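As seen from (6.7), the set of linear processing matrices transforms the original MIMO channel $H$ into an equivalent channel represented by the matrix $\mathcal{H}$. The LDC mutual information is defined as
\[
\begin{aligned}
C_{LD} &:= \frac{1}{2T} E_H \Big\{ \log\det\Big( I_{2NT} + \frac{\rho}{M} \mathcal{H}\mathcal{H}^T \Big) \Big\} \\
&= \frac{1}{2T} E_H \Big\{ \log\det\Big( I_{2r} + \frac{\rho}{M} \mathcal{H}^T\mathcal{H} \Big) \Big\}.
\end{aligned} \tag{6.12}
\]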
By Jensen's inequality and the concavity of the $\log\det(\cdot)$ function, we have
\[ C_{LD} \le \frac{1}{2T} \log\det\Big( I_{2r} + \frac{\rho}{M} E_H\{\mathcal{H}^T\mathcal{H}\} \Big), \tag{6.13} \]
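where the elements of the product matrix $\mathcal{H}^T\mathcal{H}$ are given by Eqs. (6.8) through (6.11). To compute the expectation on the RHS of (6.13), we have to evaluate three types of quantities within $\mathcal{H}^T\mathcal{H}$. The first type are the quantities given by (6.8), i.e., $\{[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}, \kappa = 2i - 1, \iota = 2j - 1, i = 1, 2, \ldots, r \text{ and } j = 1, 2, \ldots, r\}$. We note that $\mathcal{H}^T\mathcal{H} = \frac{\mathcal{H}^T\mathcal{H} + (\mathcal{H}^T\mathcal{H})^T}{2}$, which implies that they can be expressed as follows:
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= \sum_{n=1}^{N} \bar{h}_n^T \frac{A_i^T A_j + A_j^T A_i}{2} \bar{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T \frac{A_i^T A_j + A_j^T A_i}{2} \tilde{h}_n \\
&= \begin{bmatrix} \bar{h}_1 \\ \vdots \\ \bar{h}_N \\ \tilde{h}_1 \\ \vdots \\ \tilde{h}_N \end{bmatrix}^T
\underbrace{\begin{bmatrix}
\frac{A_i^T A_j + A_j^T A_i}{2} & \emptyset & \cdots & \emptyset \\
\emptyset & \frac{A_i^T A_j + A_j^T A_i}{2} & \cdots & \emptyset \\
\vdots & \vdots & \ddots & \vdots \\
\emptyset & \emptyset & \cdots & \frac{A_i^T A_j + A_j^T A_i}{2}
\end{bmatrix}}_{G}
\begin{bmatrix} \bar{h}_1 \\ \vdots \\ \bar{h}_N \\ \tilde{h}_1 \\ \vdots \\ \tilde{h}_N \end{bmatrix},
\end{aligned} \tag{6.14}
\]
for $\kappa = 2i - 1$, $\iota = 2j - 1$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$.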
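Assuming that the entries of $H$ are independent and identically distributed (i.i.d.) zero-mean, unit-variance, circularly symmetric complex Gaussian random variables, the characteristic function of (6.14) is given by:
\[
\begin{aligned}
\varphi(j\upsilon) &= \int \cdots \int d\bar{h}_1 \cdots d\bar{h}_N \, d\tilde{h}_1 \cdots d\tilde{h}_N \, \pi^{-MN} \\
&\quad \cdot \exp\Big[ -[\bar{h}_1^T \cdots \bar{h}_N^T \; \tilde{h}_1^T \cdots \tilde{h}_N^T] (I_{2MN} - j\upsilon G) [\bar{h}_1^T \cdots \bar{h}_N^T \; \tilde{h}_1^T \cdots \tilde{h}_N^T]^T \Big] \\
&= \det(I_{2MN} - j\upsilon G)^{-\frac{1}{2}}.
\end{aligned} \tag{6.15}
\]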
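In (6.15), $G$ denotes the square block-diagonal matrix on the RHS of (6.14), with diagonal blocks given by $\frac{A_i^T A_j + A_j^T A_i}{2}$. The last step follows from the general multivariate Gaussian integral [112]. If we define $K := I - j\upsilon G$, then the derivative of $\varphi(j\upsilon)$ with respect to $\upsilon$ is given by:
\[
\begin{aligned}
\frac{d\varphi(j\upsilon)}{d\upsilon} &= -\frac{1}{2} \det(I - j\upsilon G)^{-\frac{3}{2}} \cdot \frac{d \det(I - j\upsilon G)}{d\upsilon} \\
&= -\frac{1}{2} \det(I - j\upsilon G)^{-\frac{3}{2}} \cdot \mathrm{tr}\Big[ \Big( \frac{\partial \det K}{\partial K} \Big)^T \frac{\partial (I - j\upsilon G)}{\partial \upsilon} \Big] \\
&= -\frac{1}{2} \det(I - j\upsilon G)^{-\frac{3}{2}} \cdot \mathrm{tr}\Big[ \big[ \det K \, (K^{-1})^T \big]^T (-j) G \Big] \\
&= \frac{1}{2} j \det(I - j\upsilon G)^{-\frac{1}{2}} \, \mathrm{tr}[(I - j\upsilon G)^{-1} G].
\end{aligned} \tag{6.16}
\]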
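Using (6.16), we can show that the expectations of the quantities given in (6.14) are:
\[ E\{[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}\} = -j \frac{d\varphi(j\upsilon)}{d\upsilon} \Big|_{\upsilon=0} = \frac{1}{2} \mathrm{tr}\, G = N \, \mathrm{tr}\, \frac{A_i^T A_j + A_j^T A_i}{2}, \tag{6.17} \]
for $\kappa = 2i - 1$, $\iota = 2j - 1$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$.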
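The result (6.17) can be cross-checked without the characteristic function. The following worked step is a sketch that uses only the channel assumption stated above: writing $h$ for the stacked real vector of (6.14), of length $2MN$, its entries are i.i.d. real Gaussian with zero mean and variance $1/2$, because each $\mathcal{CN}(0,1)$ coefficient splits its unit variance equally between real and imaginary parts.

```latex
% Direct evaluation of the mean of the quadratic form h^T G h.
E\{ h^T G h \}
  = \sum_{p,q=1}^{2MN} [G]_{pq}\, E\{ h_p h_q \}
  = \frac{1}{2} \sum_{p=1}^{2MN} [G]_{pp}
  = \frac{1}{2}\,\mathrm{tr}\, G ,
\qquad
\frac{1}{2}\,\mathrm{tr}\, G
  = \frac{1}{2} \cdot 2N \cdot \mathrm{tr}\,\frac{A_i^T A_j + A_j^T A_i}{2}
  = N\,\mathrm{tr}\,\frac{A_i^T A_j + A_j^T A_i}{2} .
```

The second chain uses the fact that $G$ consists of $2N$ identical diagonal blocks, and it agrees with (6.17).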
The second type of quantities are those given by (6.9). The computation of the expected values of these quantities is similar to that of the first type, the only difference being that these quantities are defined with respect to the $B_i$'s. Thus we have:
\[ E\{[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}\} = N \, \mathrm{tr}\, \frac{B_i^T B_j + B_j^T B_i}{2}, \quad \kappa = 2i, \; \iota = 2j, \; i = 1, 2, \ldots, r \text{ and } j = 1, 2, \ldots, r. \tag{6.18} \]
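Finally, the third type of quantities are those given by (6.10) and (6.11). Because the two are similar, we will only consider one of them, e.g., those given by (6.10):
\[
\begin{aligned}
[\mathcal{H}^T\mathcal{H}]_{\kappa\iota} &= \sum_{n=1}^{N} \bar{h}_n^T \frac{B_j^T A_i - A_i^T B_j}{2} \tilde{h}_n + \sum_{n=1}^{N} \tilde{h}_n^T \frac{A_i^T B_j - B_j^T A_i}{2} \bar{h}_n \\
&= \begin{bmatrix} \bar{h}_1^T & \cdots & \bar{h}_N^T & \tilde{h}_1^T & \cdots & \tilde{h}_N^T \end{bmatrix} \\
&\quad \cdot
\begin{bmatrix}
\emptyset & \cdots & \emptyset & \frac{B_j^T A_i - A_i^T B_j}{2} & \cdots & \emptyset \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\emptyset & \cdots & \emptyset & \emptyset & \cdots & \frac{B_j^T A_i - A_i^T B_j}{2} \\
\frac{A_i^T B_j - B_j^T A_i}{2} & \cdots & \emptyset & \emptyset & \cdots & \emptyset \\
\vdots & \ddots & \vdots & \vdots & \ddots & \vdots \\
\emptyset & \cdots & \frac{A_i^T B_j - B_j^T A_i}{2} & \emptyset & \cdots & \emptyset
\end{bmatrix}
\begin{bmatrix} \bar{h}_1 \\ \vdots \\ \bar{h}_N \\ \tilde{h}_1 \\ \vdots \\ \tilde{h}_N \end{bmatrix},
\end{aligned} \tag{6.19}
\]
for $\kappa = 2i - 1$, $\iota = 2j$, $i = 1, 2, \ldots, r$ and $j = 1, 2, \ldots, r$. The expectations can be computed in the same way as for the first type, and we have
\[ E\{[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}\} = \frac{1}{2} \mathrm{tr}\, G' = 0, \quad \kappa = 2i - 1, \; \iota = 2j, \; i = 1, 2, \ldots, r \text{ and } j = 1, 2, \ldots, r, \tag{6.20} \]
where $G'$ denotes the square matrix on the RHS of Eq. (6.19); its trace vanishes because all of its diagonal blocks are zero. Similarly, we have:
\[ E\{[\mathcal{H}^T\mathcal{H}]_{\kappa\iota}\} = 0, \quad \kappa = 2i, \; \iota = 2j - 1, \; i = 1, 2, \ldots, r \text{ and } j = 1, 2, \ldots, r. \tag{6.21} \]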
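Having obtained the three types of expectations, we can now define a criterion for choosing $A_i$ and $B_i$, $i = 1, 2, \ldots, r$, which is to search for the $A_i$'s and $B_i$'s that maximize the following quantity, derived from $C_{LD}$ by applying Jensen's Inequality:
\[ \varrho := \frac{1}{2T} \log\det\Big( I_{2r} + N \frac{\rho}{M} Z \Big), \tag{6.22} \]
where the matrix $Z$ is defined as follows:
\[
Z := \begin{bmatrix}
\mathrm{tr}\,\frac{A_1^T A_1 + A_1^T A_1}{2} & \emptyset & \mathrm{tr}\,\frac{A_1^T A_2 + A_2^T A_1}{2} & \emptyset & \cdots \\
\emptyset & \mathrm{tr}\,\frac{B_1^T B_1 + B_1^T B_1}{2} & \emptyset & \mathrm{tr}\,\frac{B_1^T B_2 + B_2^T B_1}{2} & \cdots \\
\mathrm{tr}\,\frac{A_2^T A_1 + A_1^T A_2}{2} & \emptyset & \mathrm{tr}\,\frac{A_2^T A_2 + A_2^T A_2}{2} & \emptyset & \cdots \\
\emptyset & \mathrm{tr}\,\frac{B_2^T B_1 + B_1^T B_2}{2} & \emptyset & \mathrm{tr}\,\frac{B_2^T B_2 + B_2^T B_2}{2} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}, \tag{6.23}
\]
which has as its elements $\mathrm{tr}\,\frac{A_i^T A_j + A_j^T A_i}{2}$ and $\mathrm{tr}\,\frac{B_i^T B_j + B_j^T B_i}{2}$, $1 \le i, j \le r$, and zeros elsewhere.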
By computer simulations, we see that there exists an almost monotonic relationship between Jensen's relaxation $\varrho$ and $C_{LD}$ (see Section 6.6), so that the relation between the two is tractable. Moreover, $\varrho$ is a deterministic criterion and, as it does not contain any statistical operators, its computation is fairly easy. Thus we anticipate that its application in LP-STBC design would greatly reduce the search space of the problem.
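Because $\varrho$ depends only on the LPM set and a few scalars, it is straightforward to evaluate. The sketch below is illustrative: the function names are hypothetical, the natural logarithm is assumed for $\log$ (a different base only rescales $\varrho$), and the Alamouti matrices are used as an example LPM set that is not taken from this chapter.

```python
import numpy as np

def jensen_matrix(a_mats, b_mats):
    """Matrix Z of (6.23): A-terms on odd rows/columns, B-terms on even ones."""
    r = len(a_mats)
    z = np.zeros((2 * r, 2 * r))
    for i in range(r):
        for j in range(r):
            a_sym = (a_mats[i].T @ a_mats[j] + a_mats[j].T @ a_mats[i]) / 2.0
            b_sym = (b_mats[i].T @ b_mats[j] + b_mats[j].T @ b_mats[i]) / 2.0
            z[2 * i, 2 * j] = np.trace(a_sym)            # (kappa, iota) = (2i-1, 2j-1)
            z[2 * i + 1, 2 * j + 1] = np.trace(b_sym)    # (kappa, iota) = (2i, 2j)
    return z

def jensen_relaxation(a_mats, b_mats, rho, num_tx, num_rx, block_len):
    """Criterion (6.22), computed with the natural logarithm (an assumption)."""
    z = jensen_matrix(a_mats, b_mats)
    arg = np.eye(z.shape[0]) + num_rx * rho / num_tx * z
    _, logdet = np.linalg.slogdet(arg)
    return float(logdet / (2.0 * block_len))

# Illustrative LPM set: the Alamouti code (r = 2, M = T = 2) with N = 1, rho = 10.
A = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
B = [np.diag([1.0, -1.0]), np.array([[0.0, 1.0], [1.0, 0.0]])]
Z = jensen_matrix(A, B)
print(Z)
print(jensen_relaxation(A, B, rho=10.0, num_tx=2, num_rx=1, block_len=2))
```

For an orthogonal design such as this one, $Z$ reduces to $s I_{2r}$, so the criterion collapses to $\frac{r}{T}\log(1 + N\rho s/M)$.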
6.4 The GTSC and TSA Metrics
In what follows, we discuss another deterministic design criterion by which we extract an important feature of the linear processing ST codes. In this criterion, the goal is to find a set of orthogonal matrices $A_i$ and $B_i \in \mathbb{R}^{n \times s}$ that minimize the following measure ("$\|\cdot\|_F$" denotes the matrix Frobenius norm):
\[
\begin{aligned}
\psi := {} & \sum_{i=1}^{r} \sum_{j=1}^{r} \Big\| \frac{A_i^T A_j + A_j^T A_i}{2} \Big\|_F^2 + \sum_{i=1}^{r} \sum_{j=1}^{r} \Big\| \frac{B_i^T B_j + B_j^T B_i}{2} \Big\|_F^2 \\
& + \sum_{i=1}^{r} \sum_{j=1}^{r} \Big\| \frac{A_i^T B_j - B_j^T A_i}{2} \Big\|_F^2.
\end{aligned} \tag{6.24}
\]
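Let us denote the three summation terms of (6.24) as follows (the notations ψGTSC(A), ψGTSC(B) and ψTSA will be used later):
$$\psi_{\mathrm{GTSC}(A)} := \sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2, \tag{6.25}$$
$$\psi_{\mathrm{GTSC}(B)} := \sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{B_i^T B_j + B_j^T B_i}{2}\right\|_F^2, \tag{6.26}$$
and
$$\psi_{\mathrm{TSA}} := \sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{A_i^T B_j - B_j^T A_i}{2}\right\|_F^2. \tag{6.27}$$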
We note that $\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2 = \operatorname{tr}\left(\frac{A_i^T A_j + A_j^T A_i}{2}\right)^2$. The measure ψ, together with the orthogonality requirements (6.3) and (6.4), is a natural generalization of the set of conditions of the following problem [113]:
Find $A_i, B_i \in \mathbb{R}^{n\times s}$ such that they satisfy the following system of equations:
$$\begin{aligned}
A_i^T A_i &= I_s, & i &= 1, 2, \ldots, r\\
B_i^T B_i &= I_s, & i &= 1, 2, \ldots, r\\
A_i^T A_j &= -A_j^T A_i, & 1 &\le i \ne j \le r\\
B_i^T B_j &= -B_j^T B_i, & 1 &\le i \ne j \le r\\
A_i^T B_j &= B_j^T A_i, & 1 &\le i, j \le r.
\end{aligned} \tag{6.28}$$
The set of matrices {Ai, Bi, i = 1, 2, . . . , r} satisfying these equations yields the so-called amicable orthogonal design [113], which can be viewed as a generalization of the real orthogonal design [104] [113]. In [104] it is also called complex linear processing orthogonal design. As can be easily seen, matrices satisfying (6.28) would yield a minimum measure ψ = 2rs. Thus ψ-minimizing designs include orthogonal designs (amicable orthogonal designs and real orthogonal designs) as special cases.
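The claim ψ = 2rs for designs satisfying (6.28) can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, and the 2 × 2 Alamouti-type matrices are an assumed example of an amicable orthogonal design with r = 2 and n = s = 2.

```python
import numpy as np

def psi_gtsc(mats):
    """Sum over all (i, j) of ||(X_i^T X_j + X_j^T X_i) / 2||_F^2, as in (6.25)/(6.26)."""
    return sum(np.linalg.norm(0.5 * (X.T @ Y + Y.T @ X), "fro") ** 2
               for X in mats for Y in mats)

def psi_tsa(A, B):
    """Sum over all (i, j) of ||(A_i^T B_j - B_j^T A_i) / 2||_F^2, as in (6.27)."""
    return sum(np.linalg.norm(0.5 * (X.T @ Y - Y.T @ X), "fro") ** 2
               for X in A for Y in B)

# Assumed example: an Alamouti-type amicable orthogonal design (r = 2, n = s = 2).
A = [np.eye(2), np.array([[0.0, 1.0], [-1.0, 0.0]])]
B = [np.array([[1.0, 0.0], [0.0, -1.0]]), np.array([[0.0, 1.0], [1.0, 0.0]])]

gtsc_a = psi_gtsc(A)      # only the i = j terms survive: r * s
gtsc_b = psi_gtsc(B)      # likewise r * s
tsa = psi_tsa(A, B)       # vanishes for an amicable design
psi_total = gtsc_a + gtsc_b + tsa
```

Only the diagonal terms i = j contribute, each equal to the squared Frobenius norm of I_s, which is where the value 2rs comes from.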
In the next section we study lower bounds for ψGTSC. Our main focus is the real quasi-orthogonal design using square matrices (s = n and $A_i \in \mathbb{R}^{n\times n}$, i = 1, 2, . . . , r).
6.5 Lower Bounds for the GTSC Metric
6.5.1 Bound That is Analogous to Welch’s Bound
There exists an analogy between total-squared-skew-symmetry and total-squared-correlation (TSC) [123], which is a design criterion for the signature sequence set of a synchronous Direct Sequence Code Division Multiple Access (DS-CDMA) system. To illustrate this, let us consider a K-user DS-CDMA system and let us define $\mathcal{L} := \{s_1, s_2, \ldots, s_K\}$ as the signature set, where the signatures $s_i$ are column vectors of length equal to the system processing gain L. The TSC is defined as a measure of the correlations between the signatures $s_i$:
$$\mathrm{TSC}(\mathcal{L}) := \sum_{i=1}^{K}\sum_{j=1}^{K} |s_i^T s_j|^2 = \|\mathcal{L}^T \mathcal{L}\|_F^2 = \|\mathcal{L}\mathcal{L}^T\|_F^2 = \operatorname{tr}\left(\mathcal{L}^T \mathcal{L}\right)^2, \tag{6.29}$$
where each $s_i$ is assumed to be normalized, i.e., $s_i^T s_i = 1$. Thus minimization of TSC would yield sequence sets that have the least cross-correlation sum. If we denote the transmitted information bit of the ith user as $b_i$, then we may write (cf. (6.31))
$$(b_1 s_1 + \ldots + b_K s_K)^T (b_1 s_1 + \ldots + b_K s_K) = \sum_{i=1}^{K} b_i^2 \|s_i\|^2 + \sum_{i=1}^{K-1}\sum_{j>i}^{K} b_i b_j \left(s_i^T s_j + s_j^T s_i\right). \tag{6.30}$$
Therefore we may alternatively interpret the TSC-minimization criterion as one whose objective is to suppress the magnitudes of the pairing sums $s_i^T s_j + s_j^T s_i$. Complete suppression, $s_i^T s_j + s_j^T s_i = 0$, is equivalent to requiring that $s_i^T s_j = -s_j^T s_i$ which, since $s_j^T s_i = s_i^T s_j$, can be rewritten as $s_i^T s_j = -s_i^T s_j$. This in turn is equivalent to requiring that $s_i^T s_j = 0$. This observation is the motivation behind the TSC criterion that seeks to minimize the magnitudes of $(s_i^T s_j)^2$.
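To make the TSC definition in (6.29) concrete, here is a small sketch that evaluates it for an assumed signature set and compares it with Welch's bound K²/L for K ≥ L. The three unit vectors at 120° spacing are a textbook example chosen for this illustration, not a set taken from the text, and the function name `tsc` is hypothetical.

```python
import numpy as np

def tsc(S):
    """Total squared correlation of the columns of S: sum_ij |s_i^T s_j|^2."""
    G = S.T @ S                 # K x K Gram matrix of the signatures
    return float(np.sum(G ** 2))

# Assumed example: K = 3 unit-norm signatures of length L = 2, spaced by 120 degrees.
K, L = 3, 2
angles = np.deg2rad([90.0, 210.0, 330.0])
S = np.vstack([np.cos(angles), np.sin(angles)])   # shape (L, K)

tsc_value = tsc(S)
welch_bound = K ** 2 / L        # Welch's bound for K >= L

# An orthonormal pair (K = L = 2) has no cross-correlation, so TSC = K.
tsc_orthonormal = tsc(np.eye(2))
```

In this example the cross-correlations all equal cos(120°), and the set meets the bound with equality.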
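This alternative viewpoint helps illustrate the analogy between TSC and the measure ψ. Indeed, in linear processing STBC we write
$$\begin{aligned}
(s_1 A_1 + \cdots + s_r A_r)^T (s_1 A_1 + \cdots + s_r A_r)
&= \sum_{i=1}^{r} s_i^2 A_i^T A_i + \sum_{i=1}^{r-1}\sum_{j>i}^{r} s_i s_j \left(A_i^T A_j + A_j^T A_i\right)\\
&= \Big(\sum_{i=1}^{r} s_i^2\Big) I + \sum_{i=1}^{r-1}\sum_{j>i}^{r} s_i s_j \left(A_i^T A_j + A_j^T A_i\right).
\end{aligned} \tag{6.31}$$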
While the linear processing orthogonal design seeks to eliminate the second term in (6.31), it can be generalized as a minimization problem with respect to the metric:
$$\psi = \sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2, \tag{6.32}$$
where the symmetric part of the product matrix $A_i^T A_j$, namely $\frac{A_i^T A_j + A_j^T A_i}{2}$, is measured using the matrix Frobenius norm.
Comparing (6.30) to (6.31) we can establish a one-to-one correspondence between the DS-CDMA signature sequences si's and the space-time linear processing matrices Ai's. Thus the total-squared-skew-symmetry of LP-STBC can be viewed as a generalization of TSC for DS-CDMA. In the sequel we refer to the former as GTSC. In what follows, we will establish a lower bound for GTSC similar to that of TSC (Welch's bound [124], [125]).
First, we have the following theorem that describes properties of the set of linear processing matrices {Ai}, which is analogous to the row-column equivalence lemma given in [125].
Theorem 9 Let $\psi_{[r,s,n]}$ denote the measure ψ with respect to the r matrices of size n × s. Let also $\psi_{[s,r,n]}$ denote ψ with respect to the s matrices of size n × r. We have the following:
(i) $\psi_{[r,s,n]} = \psi_{[s,r,n]}$;
(ii) $\psi_{[r,n,s]} = \psi_{[n,r,s]}$;
(iii) When s = n and $A_i^T A_i = A_i A_i^T$, $\psi_{[r,s,n]} = \psi_{[s,r,n]} = \psi_{[r,n,s]} = \psi_{[n,r,s]}$.
□
Proof: We only need to provide the proof of the first claim. The proofs of the other claims follow immediately.
Suppose that the matrices Ai's are stacked along the z-axis (illustrated in Fig. 6.1), and let Dκ's denote the matrices that are "sliced" along the x-axis. Let $a_\kappa[i]$ stand for the κth column of $A_i$, and $d_i[\kappa]$ stand for the ith column of $D_\kappa$. It can be seen that $d_i[\kappa] = a_\kappa[i]$. Thus we have:
$$\begin{aligned}
\psi_{[r,s,n]} &= \sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2\\
&= \sum_{i=1}^{r}\sum_{j=1}^{r}\sum_{\kappa=1}^{s}\sum_{\iota=1}^{s}\left|\frac{a_\kappa^T[i]\,a_\iota[j] + a_\kappa^T[j]\,a_\iota[i]}{2}\right|^2\\
&= \sum_{\kappa=1}^{s}\sum_{\iota=1}^{s}\sum_{i=1}^{r}\sum_{j=1}^{r}\left|\frac{d_i^T[\kappa]\,d_j[\iota] + d_j^T[\kappa]\,d_i[\iota]}{2}\right|^2\\
&= \psi_{[s,r,n]}.
\end{aligned} \tag{6.33}$$
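The row-column equivalence of (6.33) can be checked numerically on random matrices. This is only a sketch under assumptions: the dimensions, random seed and helper name `psi` are arbitrary choices for illustration.

```python
import numpy as np

def psi(mats):
    """Sum over all (i, j) of ||(X_i^T X_j + X_j^T X_i) / 2||_F^2."""
    return sum(np.linalg.norm(0.5 * (X.T @ Y + Y.T @ X), "fro") ** 2
               for X in mats for Y in mats)

rng = np.random.default_rng(0)
r, n, s = 3, 4, 2
A = rng.standard_normal((r, n, s))      # A[i] is the n x s matrix A_i

# D_kappa is the n x r matrix whose i-th column is the kappa-th column of A_i,
# i.e. the stack of A_i's "sliced" along the other axis.
D = [A[:, :, kappa].T for kappa in range(s)]

psi_rsn = psi(list(A))   # measure for r matrices of size n x s
psi_srn = psi(D)         # measure for s matrices of size n x r
```

Both sums run over the same four-index set of inner products, only grouped differently, so the two values agree up to floating-point error.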
Figure 6.1: The analogy of the Row Column Equivalence
The following properties will be used in proving the lower bound.
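Theorem 10 Given r matrices $A_i \in \mathbb{R}^{n\times n}$ that are orthogonal, i.e., $A_i^T A_i = A_i A_i^T = I$, i = 1, 2, . . . , r, we have the following inequality
$$\sum_{\kappa=1}^{n}\sqrt{\psi'_{\kappa\kappa}} \ge \sum_{i=1}^{r}\sqrt{\psi_{ii}} = r\sqrt{n}, \tag{6.34}$$
where $\psi'_{\kappa\kappa} = \left\|\frac{D_\kappa^T D_\kappa + D_\kappa^T D_\kappa}{2}\right\|_F^2$ and $\psi_{ii} = \left\|\frac{A_i^T A_i + A_i^T A_i}{2}\right\|_F^2$. □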
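Proof: We consider two cases: (i) r ≥ n and (ii) r ≤ n.
(i) For r ≥ n, we have
$$\begin{aligned}
\sum_{\kappa=1}^{n}\sqrt{\psi'_{\kappa\kappa}} &:= \sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F\\
&= \sum_{\kappa=1}^{n}\Big(\sum_{i=1}^{r}\sum_{j=1}^{r}\left|d_i^T[\kappa]\,d_j[\kappa]\right|^2\Big)^{\frac{1}{2}}\\
&\ge \sum_{\kappa=1}^{n}\Big(\frac{r^2}{n}\Big)^{\frac{1}{2}}\\
&= n\frac{r}{\sqrt{n}} = r\sqrt{n}\\
&= \sum_{i=1}^{r}\left\|A_i^T A_i\right\|_F,
\end{aligned} \tag{6.35}$$
where the inequality $\sum_{i=1}^{r}\sum_{j=1}^{r}\left|d_i^T[\kappa]\,d_j[\kappa]\right|^2 \ge \frac{r^2}{n}$ comes from Welch's bound.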
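(ii) For r ≤ n, we have
$$\begin{aligned}
\sum_{\kappa=1}^{n}\sqrt{\psi'_{\kappa\kappa}} &:= \sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F\\
&= \sum_{\kappa=1}^{n}\Big(\sum_{i=1}^{r}\sum_{j=1}^{r}\left|d_i^T[\kappa]\,d_j[\kappa]\right|^2\Big)^{\frac{1}{2}}\\
&\ge \sum_{\kappa=1}^{n}\Big(\sum_{i=1}^{r}\left|d_i^T[\kappa]\,d_i[\kappa]\right|^2\Big)^{\frac{1}{2}}\\
&= n\sqrt{r} \ge r\sqrt{n}.
\end{aligned} \tag{6.36}$$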
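With the above two properties, we can now prove a lower bound of $\psi_{[r,n,n]}$ for r ≥ n, which is similar to Welch's bound. Indeed,
$$\begin{aligned}
\psi_{[r,n,n]} = \psi_{[n,r,n]} &= \sum_{\kappa=1}^{n}\sum_{\iota=1}^{n}\left\|\frac{D_\kappa^T D_\iota + D_\iota^T D_\kappa}{2}\right\|_F^2\\
&\ge \sum_{\kappa=1}^{n}\left\|\frac{D_\kappa^T D_\kappa + D_\kappa^T D_\kappa}{2}\right\|_F^2 = \sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F^2\\
&\ge \frac{1}{n}\Big(\sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F\Big)^2\\
&\ge \frac{1}{n}\left(r\sqrt{n}\right)^2.
\end{aligned} \tag{6.37}$$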
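The second inequality in (6.37):
$$\sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F^2 \ge \frac{1}{n}\Big(\sum_{\kappa=1}^{n}\left\|D_\kappa^T D_\kappa\right\|_F\Big)^2, \tag{6.38}$$
is a form of the Cauchy-Schwarz inequality (also see [125]). It becomes an equality if and only if
$$\left\|D_1^T D_1\right\|_F = \left\|D_2^T D_2\right\|_F = \ldots = \left\|D_n^T D_n\right\|_F. \tag{6.39}$$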
This lower bound is achievable by an LPM design of size [r, 2, 2] for r ≥ 2, i.e., $\psi_{[r,2,2]} \ge r^2$ for r ≥ 2. For example, when r = 2, there exists an orthogonal design which gives $\psi_{[2,2,2]} = 4$, with $\left\|A_i^T A_i\right\|_F^2 = 2$. For r = 3, the minimum $\psi_{[3,2,2]} = 9$ is approximately achieved by the following set of matrices:
$$A_1 = \begin{bmatrix} 0.2764 & -0.9610\\ 0.9610 & 0.2764 \end{bmatrix},\quad
A_2 = \begin{bmatrix} -0.9684 & 0.2493\\ -0.2493 & -0.9684 \end{bmatrix}\quad\text{and}\quad
A_3 = \begin{bmatrix} -0.6923 & -0.7216\\ 0.7216 & -0.6923 \end{bmatrix}, \tag{6.40}$$
for which ψ = 9.0004, with the individual terms given by:
$$\Psi = \begin{bmatrix} 2.0000 & 0.5147 & 0.5042\\ 0.5147 & 2.0000 & 0.4814\\ 0.5042 & 0.4814 & 2.0000 \end{bmatrix}, \tag{6.41}$$
where for a better presentation we use the matrix Ψ, which is defined by $\Psi := [\psi_{ij}]$ and $\psi_{ij} := \left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2$. A graphical representation of {Ai} is given in Fig. 6.2 for a visual understanding of the matrices obtained from the minimization, where the two arrows of the same line type denote the two column vectors of each matrix Ai for i = 1, 2, 3.
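The numbers in (6.40) and (6.41) can be reproduced directly from the printed matrices. The sketch below recomputes Ψ and its sum; since the entries of (6.40) are rounded to four decimals, agreement is only expected to about three decimals. Variable names are chosen here for illustration.

```python
import numpy as np

# Matrices of (6.40), entered as printed (four-decimal rounding).
A = [
    np.array([[0.2764, -0.9610], [0.9610, 0.2764]]),
    np.array([[-0.9684, 0.2493], [-0.2493, -0.9684]]),
    np.array([[-0.6923, -0.7216], [0.7216, -0.6923]]),
]

# Psi[i, j] = ||(A_i^T A_j + A_j^T A_i) / 2||_F^2, as in (6.41).
Psi = np.array([[np.linalg.norm(0.5 * (X.T @ Y + Y.T @ X), "fro") ** 2
                 for Y in A] for X in A])
psi_total = Psi.sum()

r, n = 3, 2
welch_type_bound = (r * np.sqrt(n)) ** 2 / n   # right-hand side of (6.37), equal to r^2
```

Each A_i here is a planar rotation, so the symmetric part of A_i^T A_j is cos(θ_j − θ_i)·I and the off-diagonal entries of Ψ are 2cos² of the angle differences, which are close to ±60° and 120°.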
This bound is also applicable to LPM design of size [5, 4, 4]. In this case we
have ψ[5,4,4] ≥ 25. By numerical optimization with respect to LPM over the integer
field (and essentially over {−1, 0, 1}), we can obtain a minimum of ψ = 28.
Figure 6.2: Graphical representation of {Ai}'s for [3,2,2] (polar plot; radial axis from 0.2 to 1, angular axis from 0° to 330°).
6.5.2 Lower Bounds for Other Cases of r and n
The bound given in the previous section is valid for [r, n, n] and r ≥ n. In this
section we take a look at lower bounds for several other cases of r and n. We first
have the following properties regarding a general lower bound.
Theorem 11 Given r matrices $A_i \in \mathbb{R}^{n\times n}$ such that $A_i^T A_i = A_i A_i^T = I$, i = 1, 2, . . . , r, we have the following inequalities:
(i) $\psi_{[r,n,n]} \ge rn$, when 1 ≤ r ≤ ρ(n);
(ii)
$$\psi_{[r,n,n]} \ge rn + r(r-1)\,\frac{\psi_{[\rho(n)+1,n,n]} - (\rho(n)+1)\,n}{\rho(n)\,(\rho(n)+1)}, \tag{6.42}$$
when r ≥ ρ(n) + 1.
In the above, ρ(n) is the Hurwitz-Radon number, which is given by $\rho(n) = 8c + 2^d$ for $n = 2^a b$, where b is odd and a = 4c + d for 0 ≤ d ≤ 3; a, b, c, d, n ∈ Z+. □
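The Hurwitz-Radon number defined in Theorem 11 is simple to compute from the power of two dividing n. The following is a minimal sketch; the function name `hurwitz_radon` is a hypothetical choice for this illustration.

```python
def hurwitz_radon(n: int) -> int:
    """Hurwitz-Radon number rho(n) = 8c + 2^d, where n = 2^a * b with b odd
    and a = 4c + d, 0 <= d <= 3."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    a = 0
    while n % 2 == 0:       # strip factors of two to find the exponent a
        n //= 2
        a += 1
    c, d = divmod(a, 4)     # a = 4c + d with 0 <= d <= 3
    return 8 * c + 2 ** d

# A few values used in this chapter: odd n gives 1, n = 2 (mod 4) gives 2.
rho_values = {n: hurwitz_radon(n) for n in (1, 2, 3, 4, 6, 8, 12, 16)}
```

These values match the cases treated in the text: ρ(n) = 1 for odd n, ρ(n) = 2 for n ≡ 2 (mod 4), and ρ(n) ≥ 4 whenever n is a multiple of 4.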
Case (i) is trivial as the lower bound can be achieved by the Hurwitz-Radon
family of matrices. Case (ii) states that for r ≥ ρ(n)+1, ψ[r,n,n] can be lower bounded
by the minimal ψ[ρ(n)+1,n,n]. The proof can be deduced by contradiction: suppose we
can find a design ψ[r,n,n] which has a smaller value than that given in (6.42). We are
going to have ψ[ρ(n)+1,n,n] < ψ[ρ(n)+1,n,n] which can not be true.
Thus, in general, the procedure to evaluate a lower bound will be to find
the minimum of ψ[ρ(n)+1,n,n] using optimization algorithms. The numerical solution
obtained provides a lower bound for ψ[r,n,n] for r ≥ n.
In the following section we will study several cases where the minimum of
ψ[n+1,n,n] can be evaluated analytically.
The first case is when n is odd, i.e., n ≡ 1 (mod 2). In this case ρ(n) is 1 and $\psi_{[r,n,n]}$ can be lower-bounded by the minimum skew-symmetry incurred by any ρ(n) + 1 = 2 linear processing matrices. We have the following theorem about the minimum skew-symmetry between any two LPM's and the lower bound on $\psi_{[r,n,n]}$:
Theorem 12 Given n × n matrices Ai that satisfy (6.3) and n ≡ 1 (mod 2), we have: (i) $\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2 \ge 1$ for any 1 ≤ i ≠ j ≤ r; (ii) $\psi_{[r,n,n]} \ge rn + r(r-1)$. □
Proof: We first observe that
$$\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2 = \left\|\frac{I + \left(A_i^T A_j\right)^2}{2}\right\|_F^2, \tag{6.43}$$
since the Frobenius norm is invariant under unitary multiplication. The determinant of $\left(A_i^T A_j\right)^2$ is given by
$$\det\left(A_i^T A_j\right)^2 = \left(\det A_i^T A_j\right)^2 \tag{6.44}$$
and is positive. Thus the orthogonal matrix $\left(A_i^T A_j\right)^2$ is special orthogonal¹. An orthogonal matrix can only have eigenvalues from the set $\{1, -1, e^{j\phi}, e^{-j\phi}\}$ for some φ ([126]), and since {−1, −1} and $\{e^{j\phi}, e^{-j\phi}\}$ must occur in pairs because $\left(A_i^T A_j\right)^2$ is special orthogonal, there must exist a real eigenvalue λ = 1 for n ≡ 1 (mod 2). Therefore,
$$\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2 \ge 1. \tag{6.45}$$
By applying Theorem 11, we then have $\psi_{[r,n,n]} \ge rn + r(r-1)$.
The equality of (6.45) can be attained by explicit construction of the following orthogonal matrices:
$$A_1 = \begin{bmatrix} 1 & 0^T\\ 0 & V_1 \end{bmatrix}\quad\text{and}\quad A_2 = \begin{bmatrix} 1 & 0^T\\ 0 & V_2 \end{bmatrix}, \tag{6.46}$$
where $V_1, V_2 \in \mathbb{R}^{(n-1)\times(n-1)}$ are two matrices from the Hurwitz-Radon family of matrices of order n − 1. Note that ρ(n − 1) ≥ 2 for n − 1 ≡ 0 (mod 2) ([113]), so it is always possible to have two such matrices.
¹ An orthogonal matrix X is said to be special orthogonal if det X = 1 [126].

This problem of finding $A_i^T A_j$ of minimum skew-symmetry can be cast as an orthogonal Procrustes problem [126]. Let $F \in \mathbb{R}^{m\times n}$ and $G \in \mathbb{R}^{m\times n}$, m ≥ n, denote two sets of n points within an m-dimensional space. In the Procrustes problem, the goal is to seek an orthogonal rotation matrix $Q \in \mathbb{R}^{n\times n}$ such that the following is minimized:
$$\left\|F - GQ\right\|_F^2. \tag{6.47}$$
When m = n, the Procrustes rotation matrix Q can be expressed analytically:
$$Q_{\min} = UV^T, \tag{6.48}$$
where U and V are obtained from the singular value decomposition (SVD) of the inner product of G and F as follows:
$$G^T F = USV^T. \tag{6.49}$$
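The closed-form solution (6.48)-(6.49) can be sketched in a few lines. The function name `procrustes_rotation`, the random test data and the comparison against randomly drawn orthogonal matrices are assumptions made for this illustration only.

```python
import numpy as np

def procrustes_rotation(F, G):
    """Orthogonal Q minimizing ||F - G Q||_F, from the SVD G^T F = U S V^T."""
    U, _, Vt = np.linalg.svd(G.T @ F)   # numpy returns V^T as the third factor
    return U @ Vt

rng = np.random.default_rng(1)
n = 4
F = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))

Q_min = procrustes_rotation(F, G)
best = np.linalg.norm(F - G @ Q_min, "fro") ** 2

# Sanity check: no randomly drawn orthogonal matrix should do better.
others = []
for _ in range(200):
    R, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal matrix
    others.append(np.linalg.norm(F - G @ R, "fro") ** 2)
```

The random comparison is not a proof of optimality, only a quick check that the SVD-based rotation is at least as good as every sampled alternative.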
The following theorem describes the relation between the problem of finding $A_i^T A_j$ of minimum skew-symmetry and the orthogonal Procrustes problem:
Theorem 13 If we choose F and G such that F is a skew-symmetric matrix of rank n − 1 for n ≡ 1 (mod 2) or n for n ≡ 0 (mod 2), and G = I, then minimization of (6.47), rewritten as follows:
$$\left\|F - Q\right\|_F^2, \tag{6.50}$$
is equivalent to minimization of the skew-symmetry measure
$$\left\|Q_{\min} + Q_{\min}^T\right\|_F^2. \tag{6.51}$$
□
Noting that $Q_{\min} = UV^T$ (cf. (6.48)), we see that the solution to the Procrustes problem yields two orthogonal matrices U and V that minimize the measure of skew-symmetry given by
$$\left\|\frac{UV^T + VU^T}{2}\right\|_F^2. \tag{6.52}$$
This is of exactly the same form as $\left\|\frac{A_i^T A_j + A_j^T A_i}{2}\right\|_F^2$. This suggests that for the cases of n ≡ 1 (mod 2), the problem can be solved analytically by the solution to the Procrustes rotation problem as outlined above.
Before giving the proof of Theorem 13, we first need the following theorem:
Theorem 14 For given skew-symmetric matrices $M, N \in \mathbb{R}^{n\times n}$ and for a unitary matrix $P \in \mathbb{C}^{n\times n}$ we have:
$$\min_{P^*P=I}\operatorname{tr} MP^*NP = \sum_{i=1}^{n}\lambda_i\sigma_i, \tag{6.53}$$
where $\lambda_i$ and $\sigma_i$ are the eigenvalues of M and N, respectively, sorted in the order of decreasing imaginary parts, e.g., $\{\lambda_1 = 2j, \lambda_2 = j, \lambda_3 = 0, \lambda_4 = -j, \lambda_5 = -2j\}$. Note that the eigenvalues of a skew-symmetric matrix are either purely imaginary or zero, and they appear in conjugate pairs. □
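Proof: (This proof follows that of [127].) Suppose that the eigenvalue decompositions of M and N are given by $M = U\Lambda U^*$ and $N = V\Sigma V^*$. We write
$$\operatorname{tr} MP^*NP = \operatorname{tr} U\Lambda U^*P^*V\Sigma V^*P = \operatorname{tr}\Lambda\left(U^*P^*V\right)\Sigma\left(V^*PU\right). \tag{6.54}$$
Letting $W := U^*P^*V$ we then have
$$\begin{aligned}
\operatorname{tr}\Lambda\left(U^*P^*V\right)\Sigma\left(V^*PU\right) &= \sum_{i=1}^{n}\lambda_i\left[W\Sigma W^*\right]_{ii}\\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\left|[W]_{ij}\right|^2\lambda_i\sigma_j\\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\left[uv^T\circ F\right]_{ij},
\end{aligned} \tag{6.55}$$
where $u := \operatorname{diag}(\Lambda)$ and $v := \operatorname{diag}(\Sigma)$ are column vectors consisting of the eigenvalues of M and N, respectively, arranged in the order described previously. The operator "◦" denotes the element-wise (Hadamard) matrix product. The matrix $F := \left[\left|[W]_{ij}\right|^2\right]$ is doubly stochastic, as each of its columns and rows adds up to one, i.e.,
$$\sum_{i=1}^{n}\left|[W]_{ij}\right|^2 = 1, \quad\text{for } j = 1, 2, \ldots, n \tag{6.56}$$
and
$$\sum_{j=1}^{n}\left|[W]_{ij}\right|^2 = 1, \quad\text{for } i = 1, 2, \ldots, n. \tag{6.57}$$
This is due to the fact that W is unitary. We will now show that $F = I_n$ is a minimizer of (6.55).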
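The doubly stochastic property in (6.56)-(6.57), and the claim that the identity minimizes the weighted sum in (6.55), can be illustrated numerically. The sketch below is an assumption-laden example: the unitary W is generated from a QR factorization of a random complex matrix, and the eigenvalue lists `lam` and `sig` are made-up purely imaginary values sorted by decreasing imaginary part.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Random unitary W via QR of a random complex matrix (an arbitrary choice).
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
W, _ = np.linalg.qr(X)

F = np.abs(W) ** 2                 # F = [|W_ij|^2]
row_sums = F.sum(axis=1)
col_sums = F.sum(axis=0)

# Hypothetical eigenvalues of two skew-symmetric matrices, sorted by
# decreasing imaginary part.
lam = 1j * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
sig = 1j * np.array([3.0, 1.0, 0.0, -1.0, -3.0])

value = np.real(lam @ F @ sig)                 # sum_ij F_ij * lam_i * sig_j
value_identity = np.real(np.sum(lam * sig))    # the same sum with F = I
```

With these eigenvalues the F = I value is −14, and the value obtained from any unitary W should not fall below it.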
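Suppose we have k ∈ {1, 2, . . . , n} such that
$$\left|[W]_{kk}\right|^2 < 1,\quad \left|[W]_{ii}\right|^2 = 1 \text{ for } i < k,\quad \left|[W]_{ij}\right|^2 = 0 \text{ for } 1 \le i \ne j < k. \tag{6.58}$$
Then by the properties of doubly stochastic matrices, we have that for some k < p ≤ n and k < q ≤ n:
$$\left|[W]_{kq}\right|^2 > 0,\quad \left|[W]_{pk}\right|^2 > 0,\quad\text{and}\quad \left|[W]_{pq}\right|^2 < 1. \tag{6.59}$$
Given some 0 < ε < 1, we can construct a new doubly stochastic matrix F′, which has the following entries updated (denoted by →) from F:
$$\begin{aligned}
\left|[W]_{kk}\right|^2 &\to \left|[W]_{kk}\right|^2 + \varepsilon, &
\left|[W]_{kq}\right|^2 &\to \left|[W]_{kq}\right|^2 - \varepsilon,\\
\left|[W]_{pk}\right|^2 &\to \left|[W]_{pk}\right|^2 - \varepsilon, &
\left|[W]_{pq}\right|^2 &\to \left|[W]_{pq}\right|^2 + \varepsilon.
\end{aligned} \tag{6.60}$$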
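We then have:
$$\begin{aligned}
\sum_{i=1}^{n}\sum_{j=1}^{n}\left[uv^T\circ F'\right]_{ij} - \sum_{i=1}^{n}\sum_{j=1}^{n}\left[uv^T\circ F\right]_{ij}
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\left[uv^T\circ F' - uv^T\circ F\right]_{ij}\\
&= \sum_{i=1}^{n}\sum_{j=1}^{n}\left[uv^T\circ\left(F' - F\right)\right]_{ij}\\
&= \varepsilon\lambda_k\sigma_k - \varepsilon\lambda_k\sigma_q - \varepsilon\lambda_p\sigma_k + \varepsilon\lambda_p\sigma_q\\
&= \varepsilon\left(\lambda_k - \lambda_p\right)\left(\sigma_k - \sigma_q\right) \le 0.
\end{aligned} \tag{6.61}$$
The last inequality comes from the fact that the eigenvalues are so arranged that $\lambda_k - \lambda_p = aj$ and $\sigma_k - \sigma_q = bj$ (k < p ≤ n and k < q ≤ n) for some nonnegative a, b ∈ R. This updating can be performed for every $\left|[W]_{ii}\right|^2 < 1$ and in the end we would have $F_{\mathrm{opt}} = I$, where $\left|[W]_{ii}\right|^2 = 1$, i = 1, 2, . . . , n. This is achievable since W is unitary and the set of unitary matrices is closed under multiplication.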
Next we give the proof of Theorem 13.
Proof: We will first show that:
$$\min_{Q^TQ=I_n}\left\|F - Q\right\|_F^2 = \min_{Q^TQ=I_n}\operatorname{tr} F\,\frac{Q - Q^T}{2}. \tag{6.62}$$
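We first see that:
$$\begin{aligned}
\left\|F - Q\right\|_F^2 &= \left\|F - \frac{Q + Q^T}{2} - \frac{Q - Q^T}{2}\right\|_F^2\\
&= \left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|F - \frac{Q - Q^T}{2}\right\|_F^2 - 2\operatorname{tr}\frac{Q + Q^T}{2}\Big(F - \frac{Q - Q^T}{2}\Big)\\
&= \left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|F - \frac{Q - Q^T}{2}\right\|_F^2,
\end{aligned} \tag{6.63}$$
where in the last line the first term contains the symmetric part of F − Q and the second term contains the skew-symmetric part. The last step follows from the fact that for any symmetric matrix A and skew-symmetric matrix B of the same dimension, tr AB = 0. We then have:
$$\begin{aligned}
\left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|F - \frac{Q - Q^T}{2}\right\|_F^2
&= \left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|F\right\|_F^2 + \left\|\frac{Q - Q^T}{2}\right\|_F^2 - 2\operatorname{tr} F^T\frac{Q - Q^T}{2}\\
&= \left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|F\right\|_F^2 + \left\|\frac{Q - Q^T}{2}\right\|_F^2 + 2\operatorname{tr} F\,\frac{Q - Q^T}{2}.
\end{aligned} \tag{6.64}$$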
Since $\left\|\frac{Q + Q^T}{2}\right\|_F^2 + \left\|\frac{Q - Q^T}{2}\right\|_F^2$ is constant for any orthogonal matrix Q, the minimization of $\left\|F - Q\right\|_F^2$ with respect to Q is equivalent to the minimization of $\operatorname{tr} F\,\frac{Q - Q^T}{2}$. Regarding the minimum of $\operatorname{tr} F\,\frac{Q - Q^T}{2}$ we have the following. From Theorem 14 we see that for skew-symmetric F and $\frac{Q - Q^T}{2}$, $\operatorname{tr} F\,\frac{Q - Q^T}{2}$ has a lower bound $\sum_{i=1}^{n}\lambda_i\sigma_i$, where $\{\lambda_i\}$ and $\{\sigma_i\}$ are the eigenvalues of F and $\frac{Q - Q^T}{2}$ respectively, sorted by decreasing imaginary parts. As F is fixed, we anticipate that the minimization of $\operatorname{tr} F\,\frac{Q - Q^T}{2}$ with respect to Q would yield $\{\sigma_i\}$ adjusted according to $\{\lambda_i\}$. Note that F must have the largest possible rank, i.e., n − 1 for n ≡ 1 (mod 2) and n for n ≡ 0 (mod 2); otherwise the minimization would not be meaningful, e.g., if F is an all-zero matrix. Furthermore, the minimization will maximize the total sum of the magnitudes of the σi's, that is, $\left\|\frac{Q - Q^T}{2}\right\|_F^2 = \sum_{i=1}^{n}|\sigma_i|^2$ will be maximized. Now since $\left\|\frac{Q - Q^T}{2}\right\|_F^2 + \left\|\frac{Q + Q^T}{2}\right\|_F^2 = \left\|Q\right\|_F^2 = n$, this is equivalent to requiring that $\left\|\frac{Q + Q^T}{2}\right\|_F^2$ be minimized, as claimed.
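The identities used in this proof, namely the splitting in (6.63), the expansion in (6.64), and the fact that the symmetric and skew-symmetric parts of an orthogonal Q have squared norms summing to n, can be verified numerically. The random skew-symmetric F and orthogonal Q below are arbitrary choices made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

X = rng.standard_normal((n, n))
F = X - X.T                                        # skew-symmetric F
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # random orthogonal Q

sym = 0.5 * (Q + Q.T)      # symmetric part of Q
skew = 0.5 * (Q - Q.T)     # skew-symmetric part of Q

def fro2(M):
    """Squared Frobenius norm."""
    return np.linalg.norm(M, "fro") ** 2

lhs = fro2(F - Q)
rhs_63 = fro2(sym) + fro2(F - skew)                                   # (6.63)
rhs_64 = fro2(sym) + fro2(F) + fro2(skew) + 2 * np.trace(F @ skew)    # (6.64)
parts_sum = fro2(sym) + fro2(skew)                                    # equals n
```

Since the sum of the two squared norms is fixed at n, only the trace term in (6.64) depends on the choice of Q, which is the step the proof relies on.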
6.5.3 Lower Bounds - Further Results
Minimization of the skew-symmetry measure would require the minimization of
$$\left\|\frac{I + Q^2}{2}\right\|_F^2, \tag{6.65}$$
where $Q \in \mathbb{R}^{n\times n}$ is orthogonal. Note that since Q is orthogonal, Q² is also orthogonal. Further, det(Q²) = 1. Thus Q² belongs to the so-called special orthogonal group² SO(n). As discussed before, the eigenvalues of Q², denoted $\lambda_i$, i = 1, 2, . . . , n, are within the set $\{1, -1, e^{j\phi}, e^{-j\phi}\}$. In addition, except for n ≡ 1 (mod 2), where there exists one eigenvalue $\lambda_i = 1$, the eigenvalues must exist in pairs. So, there are either two eigenvalues such that $\lambda_i = \lambda_j = 1$ (i ≠ j), two eigenvalues such that $\lambda_i = \lambda_j = -1$, or two eigenvalues such that $\lambda_i = e^{j\phi}$ and $\lambda_j = e^{-j\phi}$ (i ≠ j). This eigenvalue pattern is closely related to our analysis of the lower bounds for the skew-symmetry measure ψ. For example, a study of the Hurwitz-Radon family of matrices would reveal that for two matrices Ai and Aj from this family, with $Q = A_i^T A_j$, the eigenvalues of Q² would all be −1, such that $\left\|\frac{I + Q^2}{2}\right\|_F^2 = 0$. With this in mind, we are able to derive the lower bound for the case of n ≡ 2 (mod 4).
Theorem 15 For n ≡ 2 (mod 4), we have:
$$\psi_{[r,n,n]} \ge rn + r(r-1)\,\frac{\frac{1}{2}\left(3\sqrt{2}\right)^2 - 3\times 2}{3\times 2}. \tag{6.66}$$
□
Proof: Given n ≡ 2 (mod 4), we let r = ρ(n) + 1 = 3. Now since there do not exist three orthogonal matrices such that Ψ is given by
$$\Psi = \begin{bmatrix} n & 0 & 0\\ 0 & n & 0\\ 0 & 0 & n \end{bmatrix}, \tag{6.67}$$
we cannot find three matrices A1, A2 and A3 such that each pair $A_i^T A_j$ is skew-symmetric. This means that the eigenvalues of each $\left(A_i^T A_j\right)^2$ (i ≠ j) cannot all be −1. So there must exist one $\left(A_i^T A_j\right)^2$ such that two of its eigenvalues, $\lambda_k$ and $\lambda_l$, are $\lambda_k = \lambda_l = 1$, or $\lambda_k = e^{j\phi}$ and $\lambda_l = e^{-j\phi}$. For n ≡ 2 (mod 4), we have ρ(n) = 2, thus there are at least two distinct pairs $\left(A_{i_1}^T A_{j_1}\right)^2$ ($i_1 \ne j_1$) and $\left(A_{i_2}^T A_{j_2}\right)^2$ ($i_2 \ne j_2$), each of which will have eigenvalues of the pattern described above. Thus a lower bound can be given by explicitly constructing the following three matrices:
$$A_1 = \begin{bmatrix} U_1 & 0^T\\ 0 & V_1 \end{bmatrix},\quad
A_2 = \begin{bmatrix} U_2 & 0^T\\ 0 & V_2 \end{bmatrix}\quad\text{and}\quad
A_3 = \begin{bmatrix} U_3 & 0^T\\ 0 & V_3 \end{bmatrix}, \tag{6.68}$$
where 0 denotes the all-zero matrix of size (n − 2) × 2; Ui, for i = 1, 2, 3, are 2 × 2 orthogonal matrices, and Vi, i = 1, 2, 3, are orthogonal matrices of dimension n − 2 ≡ 0 (mod 4). Note that since ρ(n − 2) ≥ 4, for A1, A2 and A3, we can find Vi, i = 1, 2, 3 from the Hurwitz-Radon family of order n − 2. We find U1, U2 and U3 such that the set of Ui's has the minimum skew-symmetry measure. The minimum ψ of the set of Ui's is given by the generalized Welch's bound $\frac{1}{n}\left(r\sqrt{n}\right)^2$ (cf. Eq. (6.37)). Thus we have the following lower bound for n ≡ 2 (mod 4) from Theorem 11:
$$\psi_{[r,n,n]} \ge rn + r(r-1)\,\frac{\frac{1}{2}\left(3\sqrt{2}\right)^2 - 3\times 2}{3\times 2}. \tag{6.69}$$

² The special orthogonal matrices of size n × n form a special orthogonal group.
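As a worked evaluation of the constant in (6.69) (an arithmetic simplification added here for readability, not an additional result), the fraction reduces as follows:

$$\frac{\frac{1}{2}\left(3\sqrt{2}\right)^2 - 3\times 2}{3\times 2} = \frac{9 - 6}{6} = \frac{1}{2},
\qquad\text{so that}\qquad
\psi_{[r,n,n]} \ge rn + \frac{r(r-1)}{2}\quad\text{for } n \equiv 2 \pmod 4.$$

For instance, taking r = 3 and n = 6 as an illustrative case gives $\psi_{[3,6,6]} \ge 18 + 3 = 21$, and for r = 3, n = 2 the expression gives 9, which coincides with the bound $\psi_{[3,2,2]} \ge r^2 = 9$ quoted in Section 6.5.1.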
6.6 Computer Simulations
6.6.1 Jensen’s Relaxation of LDC-MI
In this subsection, we study the relationship between the LDC mutual information and its Jensen's relaxation ϱ. We consider an LPM design of size 2 × [3, 3, 3] ("2×" within 2 × [3, 3, 3] denotes a complex design). The number of receiving antennas is 2. The system SNR is set to ρ = 20 dB. The MIMO channel H consists of entries that are independent and identically distributed (i.i.d.) zero-mean, variance-one, circularly-symmetric complex Gaussian random variables. The linear processing matrices are generated randomly. After the raw data is collected, the final data points are produced in the following way: for each fixed interval within the total range of ϱ, we use the largest LDC-MI value obtainable by the generated LPM's. The result is plotted in Fig. 6.3. We observe that the curve of LDC-MI versus ϱ is almost monotonic. Thus there is a tractable connection between the two measures (at least for the LPM design under investigation).
Figure 6.3: LDC-MI vs. $\frac{1}{2T}\log\det\left(I_{2r} + \frac{N\rho}{M}Z\right)$ for 3I2O: SNR = 20 dB; r = 3; produced from 10^5 randomly generated LPM's. (Plot title: LDC Mutual Information vs. 1/(2T) log det(I + Nρ/M Z), 2×[3,3,3]: 3 complex symbols, 3×3 LPM's, N = 2, SNR = 20 dB; x-axis: 1/(2T) log det(I + Nρ/M Z); y-axis: LDC-MI.)
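The post-processing step described above (keeping, for each fixed interval of the relaxation value, the largest LDC-MI among the generated LPM sets) amounts to an upper-envelope binning. The sketch below is a hypothetical helper written for illustration; the function name, the equal-width binning and the tiny synthetic data are assumptions, not the procedure or data actually used for Fig. 6.3.

```python
import numpy as np

def upper_envelope(x, y, num_bins):
    """For equal-width bins over the range of x, return the bin centers and
    the largest y observed in each bin (NaN for empty bins)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(x.min(), x.max(), num_bins + 1)
    # digitize gives 1-based bin indices; the right end point is folded
    # into the last bin by clipping.
    idx = np.clip(np.digitize(x, edges) - 1, 0, num_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    envelope = np.full(num_bins, np.nan)
    for b in range(num_bins):
        mask = idx == b
        if mask.any():
            envelope[b] = y[mask].max()
    return centers, envelope

# Tiny synthetic example (made-up numbers, for illustration only).
x_demo = [0.0, 0.1, 0.6, 0.9, 1.0]
y_demo = [1.0, 3.0, 2.0, 5.0, 4.0]
centers_demo, envelope_demo = upper_envelope(x_demo, y_demo, num_bins=2)
```

In a real experiment `x` would hold the relaxation values and `y` the corresponding LDC-MI values of the randomly generated LPM sets.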
6.6.2 Examples of LP-STBCs: Constellation Rotation and
Product Distance Gain
In this subsection we examine several examples of LPM designs based on com-
puter simulations.
By numerical optimization, we can show that for $A_i, B_i \in \mathbb{Z}^{4\times 4}$, i = 1, 2, . . . , 4, the minimum of GTSC-TSA is given by ψmin = 48. We can see that each LPM set associated with the quasi-orthogonal STBC's as described in [130] and [131] exhibits a GTSC-TSA equal to ψmin.
For an LPM design over R, there exist sets {Ai} and {Bi} (i = 1, 2, . . . , 4) that yield ψ of smaller value. For example, the GTSC-TSA value of the following linear processing matrices is ψ = 43.639.
$$A_1 = \begin{bmatrix}
-0.0000 & 0.9005 & -0.0371 & 0.4332\\
-0.9105 & 0.0961 & 0.3651 & -0.1686\\
-0.3081 & -0.4028 & -0.2871 & 0.8127\\
0.2758 & -0.1326 & 0.8848 & 0.3514
\end{bmatrix}, \tag{6.70}$$
$$A_2 = \begin{bmatrix}
-0.0000 & 0.1306 & 0.5138 & -0.8479\\
-0.0149 & -0.1251 & -0.8397 & -0.5282\\
-0.1010 & 0.9787 & -0.1730 & 0.0460\\
-0.9948 & -0.0975 & 0.0302 & 0.0033
\end{bmatrix}, \tag{6.71}$$
$$A_3 = \begin{bmatrix}
-0.1773 & 0.3907 & 0.8544 & 0.2930\\
-0.2390 & -0.1000 & 0.3098 & -0.9148\\
-0.9474 & 0.0653 & -0.2768 & 0.1466\\
-0.1176 & -0.9127 & 0.3120 & 0.2361
\end{bmatrix}, \tag{6.72}$$
$$A_4 = \begin{bmatrix}
-0.4678 & -0.5800 & 0.5099 & 0.4299\\
0.6680 & -0.3823 & 0.5061 & -0.3891\\
-0.5461 & -0.2118 & -0.0605 & -0.8082\\
0.1914 & -0.6875 & -0.6929 & 0.1027
\end{bmatrix}, \tag{6.73}$$
and
$$B_1 = \begin{bmatrix}
-0.0000 & -0.2127 & -0.8874 & 0.4091\\
0.5022 & -0.4101 & -0.2355 & -0.7240\\
0.8579 & 0.3437 & 0.0889 & 0.3715\\
0.1088 & -0.8176 & 0.3863 & 0.4129
\end{bmatrix}, \tag{6.74}$$
$$B_2 = \begin{bmatrix}
-0.2275 & -0.8650 & -0.0725 & 0.4414\\
0.9724 & -0.1919 & 0.0286 & 0.1299\\
-0.0218 & 0.4637 & -0.1289 & 0.8763\\
-0.0476 & 0.0026 & 0.9886 & 0.1429
\end{bmatrix}, \tag{6.75}$$
$$B_3 = \begin{bmatrix}
-0.0376 & -0.6019 & 0.0389 & -0.7967\\
0.0501 & -0.0960 & 0.9870 & 0.1184\\
-0.1601 & -0.7794 & -0.1384 & 0.5896\\
-0.9851 & 0.1448 & 0.0712 & -0.0594
\end{bmatrix}, \tag{6.76}$$
$$B_4 = \begin{bmatrix}
-0.5155 & -0.1848 & 0.7256 & 0.4166\\
0.2485 & 0.0303 & -0.3369 & 0.9077\\
-0.7461 & 0.5184 & -0.4166 & 0.0323\\
0.3404 & 0.8344 & 0.4318 & 0.0392
\end{bmatrix}. \tag{6.77}$$
With {Ai} and {Bi} given as above and by employing a rotated QAM constellation for $s_4$ (rotated by $e^{j\frac{2}{9}\pi}$) we obtain the first example of LP-STBC. Its FER-vs-SNR curve (FER: Frame Error Rate, the rate at which an error occurs within T channel uses) is plotted against the performance of the QOSTBC (QAM) scheme in Fig. 6.4. In the figure "J-P-F scheme" (short for Jafarkhani-Papadias-Foschini) denotes QO-STBC's, and "2×" within 2 × [4, 4, 4] denotes a complex design.
Figure 6.4: A STBC obtained by 2 × [4, 4, 4] design of ψ = 43.639 and optimally rotated QAM. (Plot title: A LP-STBC Design with ψ≈43 and Optimally Rotated 4QAM; x-axis: SNR (dB); y-axis: Probability of Frame Error; curves: J-P-F Scheme; 2×[4,4,4] LP-STBC, ψ≈43, Rotated 4QAM.)
The above design involves an independent determination of the Ai's and Bi's with respect to the GTSC-TSA metric. The design procedure is:
(i) choose a superset of LPM sets that have a certain range of ψGTSC + ψTSA values;
(ii) design suitable scalar signal constellations and optimize the performance of the final code over the LPM superset.
In practice there are several performance criteria we can adopt for the second step:
diversity (rank), product distance gain and obtainable channel capacity, etc. The
following is another example of the above procedure. Focusing on LPM designs with
ψ ≤ 48, which is the GTSC-TSA value of QOSTBC’s LPM set, and assuming that the
input symbols use QAM signalling, we specifically search for LPM’s that maximize
the product distance gain [102] of the final code. The {Ai} and {Bi} are given as
below: The FER of the final code is plotted in Fig. 6.5. We observe from simulations
that for many candidate LPM sets, their combinations with QAM signaling give full
transmit diversity and that it is the product distance gain that is more decisive in
the final choice.
The last example has its complete space-time constellation given as follows
$$C = \begin{bmatrix}
s_1 & s_2 & s_3 & s_4\\
-s_2 & -s_1 & -s_4^* & s_3^*\\
-s_3 & s_4^* & -s_1 & -s_2^*\\
-s_4 & -s_3^* & s_2^* & -s_1
\end{bmatrix}. \tag{6.78}$$
Its LPM set has a non-orthogonality measure of ψGTSC + ψTSA = 68, which is larger
than that of QOSTBC. Using QAM signalling, we compare its performance to that
of QOSTBC (QAM). The FER-vs-SNR curves of the two schemes are given in Fig.
6.6.
As can be seen from the above examples, the procedure is less tractable than designing LP-STBC by the first measure/criterion. This is also illustrated by the following subsection, which shows the relationship between GTSC-TSA and the LDC mutual information criterion. However, we still find that the GTSC metric can be an indicator of the performance of the final code in some real designs.
Figure 6.5: A LP-STBC obtained by the two-step design procedure: firstly obtain a set of LPM designs with ψ ≤ 48; then use QAM signalling and maximize with respect to the product distance gain. (Plot title: A LP-STBC Design with ψ<48 and Maximized wrt Product Distance Gain; x-axis: SNR (dB); y-axis: Probability of Frame Error; curves: J-P-F Scheme; 2×[4,4,4] LP-STBC, ψ<48, 4QAM.)
Figure 6.6: A STBC obtained by 2 × [4, 4, 4] design of ψ = 68 and QAM constellation; versus QOSTBC (QAM). (Plot title: A LP-STBC Design with ψ=68; x-axis: SNR (dB); y-axis: Probability of Frame Error; curves: J-P-F Scheme; 2×[4,4,4] LP-STBC, ψ=68, 4QAM.)
Figure 6.7: LDC-MI vs. GTSC metric for 2I2O: SNR = 20 dB; r = 3; produced from 10^5 randomly generated LPM's. (Plot title: LDC Mutual Information vs. GTSC Metric, [3,2,2]: 3 real symbols, 2×2 LPM's, N = 2, SNR = 20 dB; x-axis: ψGTSC; y-axis: LDC-MI.)
6.6.3 GTSC-TSA Metric and the LDC Mutual Information
Criterion
In this subsection we study the relationship between the GTSC-TSA metric and the LDC mutual information. In Fig. 6.7, we plot the LDC mutual information of randomly generated LPM sets of size [3, 2, 2] versus their GTSC measure. The [3, 2, 2] LPM design corresponds to a 2I2O system with 3 input symbols (r = 3). The SNR used to calculate the LDC-MI is 20 dB. The plot of Fig. 6.7 is obtained as follows: for each small increment of ψGTSC, the set of LPM's that maximizes the LDC-MI measure is determined and the corresponding LDC-MI value is plotted. The data are collected with a step size of 0.22 for ψGTSC. We see that in the plot there exists an abrupt jump of LDC-MI in the neighborhood of ψGTSC = 14. LDC-MI actually achieves its maximum around ψGTSC = 14, which suggests that by focusing on LPM's with ψGTSC close to 14, we are able to obtain good LPM sets in terms of LDC-MI.
In Fig. 6.8 we plot the curve of LDC-MI-versus-GTSC for a 3I2O system and
LPM design of size [3, 3, 3]. The SNR is also 20dB. In this case, the LDC-MI measure
is roughly monotonic with respect to the GTSC metric. By focusing on LPM’s with
smaller GTSC we obtain good STBC’s in terms of LDC-MI. The lower bounds given
in the theoretical analysis help us understand this region of GTSC. Since GTSC is a
deterministic metric and involves no statistical operator the design of LPM’s in this
case can be made easier by using GTSC.
We note that the matrix Z (cf. (6.23)) in some way shows the difference and connection between the GTSC-TSA measure and the LDC mutual information. As a natural extension of the complex linear processing orthogonal design, a measurement of the symmetry of $A_i^T B_j$ is included in the GTSC-TSA measure, i.e., the TSA term $\sum_{i=1}^{r}\sum_{j=1}^{r}\left\|\frac{A_i^T B_j - B_j^T A_i}{2}\right\|_F^2$ is calculated, while in (6.22) there is no similar term.
Figure 6.8: LDC-MI vs. GTSC metric for 3I2O: SNR = 20 dB; r = 3; produced from 10^5 randomly generated LPM's. (Plot title: LDC Mutual Information vs. GTSC Metric, [3,3,3]: 3 real symbols, 3×3 LPM's, N = 2, SNR = 20 dB; x-axis: ψGTSC; y-axis: LDC-MI.)
6.7 Conclusion
We investigated two deterministic measures for the design of linear processing
space-time block codes (LP-STBC’s). The measures are deterministic in the sense
that their computations do not involve any statistical operators and are defined solely
with respect to the set of LPM’s. The first measure is obtained by applying Jensen’s
Inequality to the mutual information criterion for linear dispersion codes. The expec-
tation operator is moved into the log det() operator following Jensen’s rule. By as-
suming channel coefficients that are i.i.d. Gaussian, we compute the expectations and
this gives a deterministic design measure. We show that there is a tractable relation-
ship between this measure and CLD and show that the design of LP-STBC using this
relationship can be simplified. The second measure is a natural extension of the condi-
tions required for complex linear processing orthogonal design or amicable orthogonal
design. For the LPM’s of an LP-STBC, we associate with them two measures of non-
orthogonality: total-squared-skew-symmetry and total-squared-amicability (TSA).
The relationship of total-squared-skew-symmetry to total-squared-correlation (TSC)
is revealed. TSC measures the non-orthogonality (cross-correlation) properties of a
vector set, and is commonly used in the design of sequence sets for Code Division Mul-
tiple Access (CDMA) systems. It can be shown that total-squared-skew-symmetry
is a generalization of total-squared-correlation (GTSC). For GTSC a lower bound
analogous to Welch’s lower bound for TSC exists, which establishes itself upon the
Hurwitz-Radon numbers and the Hurwitz-Radon family of matrices. By computer
simulations, we observe that the second measure is less tractable than the first one.
However, the lower bound derived can still be a good indicator of the performance of
real designs of size 3 × 3. Comparing the two deterministic measures reveals to some
extent the differences and connections between CLD and the criterion for amicable
orthogonal design.
Chapter 7
Conclusion, Discussion, and Future Work
In this thesis we have presented the results of several research works, conducted
within a unified theme of multiple-antenna technology.
7.1 Spatial Smoothing Based JADE-MUSIC
In Chapter 3 we considered the problem of joint estimation of direction-of-
arrival (DoA), propagation delay, and complex channel gain for antenna-array DS/CDMA
communications over frequency selective multipath channels and proposed a subspace
based MUSIC-type estimation algorithm which utilizes the spatial smoothing prepro-
cessing technique. The proposed algorithm essentially breaks the multipath induced
coherency within the received signals and recovers the full signal subspace spanned
by all dominant signal paths of all users. This allows for the use of MUSIC-type DoA
and delay estimators for the individual paths of the user of interest. Based on the
angle and timing information, we then estimated the multipath fading coefficients.
Simulation results illustrated the effectiveness of this approach. We further considered two variants of the proposed spatial smoothing based MUSIC-type estimation scheme. The proposed algorithms utilize space-time received vectors that span only a single information symbol period and exhibit superior performance when the data record size available for parameter estimation is limited.
7.2 The MUSIC-MDL Criterion
In Chapter 4 we described a new criterion for detecting the number of signals
impinging on a uniform linear array (ULA). Our criterion makes explicit use of the
peak information of the MUSIC spectrum. We considered two maximum likelihood
estimates (MLEs) of the noise variance, σ²: the MLE derived from the eigenvalue
decomposition (EVD) parameterization and the MLE based upon the
direction-of-arrival (DoA) parameterization. Based upon a large-sample formulation
of the difference between these two MLEs, and by applying the minimum description
length (MDL) principle, we obtained the proposed criterion. For each hypothesis of
k sources, in addition to estimating σ² from the M − k smallest eigenvalues of the
sample covariance matrix, the new criterion applies a correction term
calculated from the k largest peaks of the MUSIC spectrum, which is generated from
the testing noise subspace of dimension M − k. We proved that the proposed criterion
provides a consistent estimate of the number of signals and demonstrated that it
performs better at low SNR for equal-power sources than the
original MDL-based signal number detection criterion [62].
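For reference, the baseline MDL criterion of [62], against which the proposed criterion
is compared, can be sketched in a few lines of Python. The sketch covers only the
eigenvalue-based log-likelihood term and the penalty term of [62]; the additional
MUSIC-peak correction term of Chapter 4 is not reproduced here. The function name and
the example eigenvalues are hypothetical.

```python
import numpy as np

def mdl_num_sources(eigvals, n_snapshots):
    """Baseline MDL estimate of the number of sources from sample eigenvalues.

    For each hypothesis k, the M - k smallest eigenvalues are compared through
    the ratio of their geometric and arithmetic means, and a penalty of
    0.5 * k * (2M - k) * log(N) is added.
    """
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    m = lam.size
    scores = []
    for k in range(m):
        tail = lam[k:]                                       # M - k smallest
        log_geo_mean = np.mean(np.log(tail))
        log_arith_mean = np.log(np.mean(tail))
        log_lik = -n_snapshots * (m - k) * (log_geo_mean - log_arith_mean)
        penalty = 0.5 * k * (2 * m - k) * np.log(n_snapshots)
        scores.append(log_lik + penalty)
    return int(np.argmin(scores)), scores

# Hypothetical eigenvalues: two strong sources over a noise floor close to 1.
k_hat, scores = mdl_num_sources([10.0, 5.0, 1.02, 1.0, 0.99, 0.98], n_snapshots=200)
print("estimated number of sources:", k_hat)
```

The log-likelihood term vanishes when the remaining eigenvalues are all equal, so the
minimum is reached at the smallest k for which the tail of the spectrum looks flat
relative to the penalty.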
7.3 The IWMA Algorithm
In Chapter 5 we presented an iterative weight matrix approximation (IWMA)
algorithm, which iteratively obtains an approximation to the optimal weight matrix
Wopt used in weighted spatial smoothing (WSS). WSS was proposed
in [87] as a technique to obtain a diagonal source covariance matrix for array signal
processing. A diagonal source covariance matrix is a desirable feature for subspace-based
direction-of-arrival (DoA) estimation algorithms, as the cross-correlations among the
input signals can markedly deteriorate the performance of these estimators. However,
the optimum weight matrix of [87] requires explicit knowledge of
the DoAs. The IWMA algorithm is applicable when the input covariance matrix is positive
definite. It starts from a scaled identity matrix as an initial guess of
Wopt and then carries out a series of weighted spatial smoothing operations. Each
WSS is performed using the weight matrix obtained from the previous iteration. After
each WSS the algorithm computes a new weight matrix, which is used in
the next iteration. The algorithm rests on two facts: an effective
correlation matrix arises naturally from the operations performed in
each iteration, and, for a positive definite Hermitian matrix, the set of
eigenvalues of its Hadamard product with a correlation matrix is majorized by the set
of eigenvalues of the matrix itself. WSS based on IWMA can be shown to be an effective method
to decorrelate highly correlated signals. In addition, a useful observation regarding
the IWMA algorithm is that the approximate matrix it generates is suitable as a basis
for subspace-type DoA estimation. Simulation results illustrated the effectiveness of
this estimation strategy, which also suggests the effectiveness of the IWMA algorithm.
Some considerations for future work on this research are given in Section 7.5.
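The majorization property mentioned above can be checked numerically. The following
Python sketch draws a random positive definite Hermitian matrix and a random
correlation matrix (positive semidefinite with unit diagonal) and verifies that the
eigenvalues of their Hadamard product are majorized by the eigenvalues of the first
matrix. It illustrates only this matrix fact, not the IWMA iteration itself; the
matrix size, the random seed, and the helper name are assumptions of the example.

```python
import numpy as np

def is_majorized(x, y, tol=1e-9):
    """Return True if the vector x is majorized by the vector y."""
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    partial_ok = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    total_ok = abs(xs.sum() - ys.sum()) <= tol * max(1.0, abs(ys.sum()))
    return bool(partial_ok and total_ok)

rng = np.random.default_rng(0)
n = 5

# Random positive definite Hermitian matrix A.
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = G @ G.conj().T + 0.1 * np.eye(n)

# Random correlation matrix C: positive semidefinite with unit diagonal.
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
P = B @ B.conj().T
d = np.sqrt(np.real(np.diag(P)))
C = P / np.outer(d, d)

eig_A = np.linalg.eigvalsh(A)
eig_AC = np.linalg.eigvalsh(A * C)     # '*' is the elementwise (Hadamard) product

print("eigenvalues of A     :", np.round(eig_A[::-1], 3))
print("eigenvalues of A o C :", np.round(eig_AC[::-1], 3))
print("majorized:", is_majorized(eig_AC, eig_A))
```

Because the diagonal of C is all ones, the Hadamard product keeps the trace of A, so
the eigenvalue sums agree and only the spread of the eigenvalues can shrink.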
7.4 Two Deterministic Design Criteria for LP-STBC
In Chapter 6 we discussed two deterministic measures for designing linear processing
space-time block codes (LP-STBC’s). The measures are deterministic in the
sense that their computation does not involve any statistical operators and that they
are defined solely with respect to the set of LPM’s. The first measure is obtained by
applying Jensen’s inequality to the mutual information criterion for linear dispersion
codes (CLD) [112]: the expectation operator is moved inside the log det(·) operator.
By assuming channel coefficients that are independent and identically
distributed (i.i.d.) Gaussian, we computed the expectation, which gives a deterministic
design measure. We showed that there is a tractable relationship between
this measure and CLD, and that the design of LP-STBC’s can be simplified by using this
relationship. The second measure is a natural extension of the conditions
required for complex linear processing orthogonal designs or amicable orthogonal
designs. With the LPM’s of an LP-STBC we associated two metrics of
non-orthogonality: total-squared-skew-symmetry and total-squared-amicability (TSA).
We revealed the relationship of total-squared-skew-symmetry to total-squared-correlation
(TSC). TSC measures the non-orthogonality (cross-correlation) properties of a
vector set and is commonly used in the design of sequence sets for Code Division
Multiple Access (CDMA) systems. It can be shown that total-squared-skew-symmetry
is a generalization of TSC (GTSC). For GTSC a lower bound
analogous to Welch’s lower bound for TSC exists, which is built upon the
Hurwitz-Radon numbers and the Hurwitz-Radon family of matrices. By computer
simulations, we observed that the second measure is less tractable than the first
one. However, the lower bound derived can still be a good indicator of the performance
of real designs of size 3×3. Comparing the two deterministic measures revealed, to some
extent, the differences and connections between CLD and the criterion for
amicable orthogonal design.
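To make the TSC quantity and Welch’s lower bound concrete, the following Python sketch
computes the classical TSC of a set of K unit-norm vectors of length L and compares it
with the bound K²/L. It covers only the vector-set case used in CDMA sequence design;
the GTSC for sets of LPM’s and its Hurwitz-Radon based bound are defined in Chapter 6
and are not reproduced here. The function names and the example vector sets are
assumptions of the example.

```python
import numpy as np

def total_squared_correlation(S):
    """TSC of the columns of S (L x K, unit-norm columns): sum_{i,j} |s_i^H s_j|^2."""
    gram = S.conj().T @ S
    return float(np.sum(np.abs(gram) ** 2))

def welch_bound(L, K):
    """Welch's lower bound on the TSC of K unit-norm vectors of length L."""
    return K * K / L

# Hypothetical example 1: two orthonormal bases of R^2 stacked together (K = 4, L = 2).
r = 1.0 / np.sqrt(2.0)
S_tight = np.array([[1.0, 0.0, r,  r],
                    [0.0, 1.0, r, -r]])
tsc_tight = total_squared_correlation(S_tight)

# Hypothetical example 2: a random unit-norm set (K = 5, L = 3).
rng = np.random.default_rng(1)
S_rand = rng.standard_normal((3, 5))
S_rand /= np.linalg.norm(S_rand, axis=0)
tsc_rand = total_squared_correlation(S_rand)

print("stacked bases: TSC =", tsc_tight, " bound =", welch_bound(2, 4))
print("random set   : TSC =", round(tsc_rand, 4), " bound =", round(welch_bound(3, 5), 4))
```

The first set meets the bound with equality because its outer-product sum is a scaled
identity; a randomly drawn set will in general lie strictly above the bound.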
7.5 Future Work
In this section we describe some possible directions for future work on the IWMA
algorithm.
Within the thesis, the initial weight matrix for the IWMA algorithm is chosen
to be I. Other choices for the initial weight matrix can be investigated
in order to better understand how they affect
the behavior of the algorithm. The currently developed theory can also be generalized
to cover generic initial weight matrices.
Another important aspect of the IWMA algorithm is its statistical characteristics.
Further studies in this direction would give us a
better understanding of the algorithm’s performance for different sample sizes. With
solid mathematical formulations of the algorithm’s statistical properties, we would have
a thorough understanding of the advantages and shortcomings of the algorithm, and
with this understanding we may come up with improvements to the algorithm and
new algorithms or schemes that perform better.
The computational complexity of an algorithm is an important characteristic,
and further effort is needed to fully understand the computational
requirements of the IWMA algorithm.
Finally, we are also interested in what other applications
this algorithm may have. The algorithm is self-contained,
and it might therefore find uses in areas other than array
signal processing.
Bibliography
[1] T. Halonen and J. Melero, “GSM, GPRS and EDGE Performance: Evolution
Towards 3G/UMTS,” Wiley, 2003.
[2] S. Verdu, “Multiuser Detection,” Cambridge University Press, 1998.
[3] “The IEEE Standard for Local and Metropolitan Area Networks, Part 16: Air
Interface for Fixed Broadband Wireless Access Systems,” IEEE P802.16e-2005,
December 2005.
[4] D. Tse and P. Viswanath, “Fundamentals of Wireless Communication,” Cam-
bridge University Press, 2005.
[5] H. L. Van Trees, “Optimum Array Processing,” Wiley-Interscience, New York,
2002.
[6] J. Rissanen, “Modeling by shortest data description,” Automatica, vol. 14, pp.
465–471, 1978.
[7] H. Akaike, “A new look at the statistical model identification,” IEEE Trans. on
Automatic Control, vol. AC-19, no. 6, pp. 716–723, 1974.
[8] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,”
IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 33, no. 2, pp. 387–
392, Apr. 1985.
[9] R. Pintelon and J. Schoukens, “System Identification: A Frequency Domain
Approach,” Wiley-IEEE Press, 2004.
[10] M. Wax, “Detection and localization of multiple sources via the stochastic signal
model,” IEEE Trans. on Signal Processing, vol. 39, no. 11, pp. 2450–2456, Nov.
1991.
[11] E. Fishler and H. Messer, “On the use of order statistics for improved detection
of signals by the MDL criterion,” IEEE Trans. on Signal Processing, vol. SP-48,
no. 8, pp. 2242–2247, 2000.
[12] W. Chen, K. M. Wong, and J. P. Riley, “Detection of the number of signals: a
predicted eigen-threshold approach,” IEEE Trans. on Signal Processing, vol. 39,
pp. 1088–1098, 1991.
[13] R. F. Brcich, A. M. Zoubir, and P. Pelin, “Detection of sources using bootstrap
techniques,” IEEE Trans. on Signal Processing, vol. 50, no. 2, pp. 206–215, 2002.
[14] J. Capon, “High-resolution frequency-wavenumber spectrum analysis,” Proc.
IEEE, vol. 57, no. 8, pp. 1408–1418, Aug. 1969.
[15] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE
Trans. on Antennas and Propagation, vol. 34, no. 3, pp. 276–280, Mar. 1986.
[16] A. J. Barabell, “Improving the resolution performance of eigenstructure-based
direction-finding algorithms,” Proc. of IEEE Int. Conf. on Acoust., Speech, and
Signal Processing, Boston, MA, pp. 336–339, May 1983.
[17] J. F. Bohme, “Separated estimation of wave parameters and spectral parameters
by maximum likelihood,” Proc. of IEEE Int. Conf. on Acoust., Speech, and Signal
Processing, Tokyo, Japan, pp. 2819–2822, 1986.
[18] A. G. Jaffer, “Maximum likelihood direction finding of stochastic sources: A
separable solution,” Proc. of IEEE Int. Conf. on Acoust., Speech, and Signal
Processing, New York, vol. 5, pp. 2893–2896, Apr. 1988.
[19] F. C. Schweppe, “Sensor array data processing for multiple-signal sources,” IEEE
Trans. on Information Theory, vol. IT-14, pp. 294–305, Mar. 1968.
[20] Z. Xu, P. Liu, and X. Wang, “Blind multiuser detection: from MOE to subspace
methods,” IEEE Trans. on Signal Processing, vol. 52, no. 2, pp. 510–524, Feb.
2004.
[21] X. G. Doukopoulos and G. V. Moustakides, “Adaptive power techniques for blind
channel estimation in CDMA systems,” IEEE Trans. on Signal Processing, vol.
53, no. 3, pp. 1110–1120, Mar. 2005.
[22] S. S. Reddi, “Multiple source location - a digital approach,” IEEE Trans. on
Aeros. and Elect. Sys., vol. AES-15, no. 1, pp. 95–105, Jan. 1979.
[23] R. Kumaresan and D. W. Tufts, “Estimating the angles of arrival of multiple
plane waves,” IEEE Trans. on Aeros. and Elect. Sys., vol. AES-19, no. 1, pp.
134–139, Jan. 1983.
[24] M. Haardt and J. A. Nossek, “Unitary ESPRIT: how to obtain increased estimation
accuracy with a reduced computational burden,” IEEE Trans. on Signal
Processing, vol. 43, no. 5, pp. 1232–1242, 1995.
[25] P. Stoica and A. Nehorai, “MUSIC, maximum likelihood, and Cramer-Rao
bound,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 37, no.
5, pp. 720–741, May 1989.
[26] M. Viberg and B. Ottersten, “Sensor array processing based on subspace fitting,”
IEEE Trans. on Signal Processing, vol. 39, pp. 1110–1121, May 1991.
[27] P. Stoica and K. C. Sharman, “Maximum likelihood methods for direction of
arrival estimation,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol.
ASSP-38, pp. 1132–1143, 1990.
[28] R. Roy and T. Kailath, “ESPRIT - estimation of signal parameters via rotational
invariance techniques,” IEEE Trans. on Acoust., Speech, and Signal Processing,
vol. 37, no. 7, pp. 984–995, Jul. 1989.
[29] Y. Bresler and A. Macovski, “Exact maximum likelihood parameter estimation of
superimposed exponential signals in noise,” IEEE Trans. on Acoust., Speech,
and Signal Processing, vol. 34, no. 5, pp. 1081–1089, 1986.
[30] T. Kaiser et al. (Eds.), “Smart Antennas: State of the Art,” EURASIP book
series, Hindawi Publishing Corporation, 2005.
[31] C. Sengupta, J. R. Cavallaro, and B. Aazhang, “On multipath channel estimation
for CDMA systems using multiple sensors,” IEEE Trans. on Communications,
vol. 49, no. 3, pp. 543–553, Mar. 2001.
[32] J. Li, B. Halder, P. Stoica, and M. Viberg, “Computationally efficient angle es-
timation for signals with known waveforms,” IEEE Trans. on Signal Processing,
vol. 43, no. 9, pp. 2154–2163, Sep. 1995.
[33] M. Cedervall and R. Moses, “Decoupled maximum likelihood angle estimation
for coherent signals,” in Proceedings of the 29th Asilomar Conference on Signals,
Systems and Computers, Pacific Grove, CA, October-November 1995.
[34] M. C. Vanderveen, C. B. Papadias, and A. Paulraj, “Joint angle and delay
estimation (JADE) for multipath signals arriving at an antenna array,” IEEE
Communications Letters, vol. 1, pp. 12–14, Jan. 1997.
[35] M. C. Vanderveen, A. van der Veen, and A. Paulraj, “Estimation of multipath
parameters in wireless communications,” IEEE Trans. on Signal Processing, vol.
46, no. 3, pp. 682–690, Mar. 1998.
[36] A. Swindlehurst, “Time delay and spatial signature estimation using known asyn-
chronous signals,” IEEE Trans. on Signal Processing, vol. 46, no. 2, pp. 449–462,
Feb. 1998.
[37] M. Katz, J. Iinatti, and S. Glisic, “Two-dimensional code acquisition in time and
angular domains,” IEEE J. Selected Areas in Communications, vol. 19, no. 12,
pp. 2441–2451, Dec. 2001.
[38] Y.-Y. Wang, J.-T. Chen, and W.-H. Fang, “TST-MUSIC for joint DOA-delay
estimation,” IEEE Trans. on Signal Processing, vol. 49, no. 4, pp. 721–729, Apr.
2001.
[39] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transac-
tions on Telecommunications, vol. 10, no. 6, pp. 585–596, Nov. 1999.
[40] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a
fading environment when using multiple antennas,” Wireless Personal Commu-
nications, vol. 6, no. 3, pp. 311–335, Mar. 1998.
[41] J. Winters, “On the capacity of radio communication systems with diversity in a
Rayleigh fading environment,” IEEE J. Selected Areas in Communications, vol.
5, no. 5, pp. 871–878, Jun. 1987.
[42] A. Medles, S. Visuri and D. T. M. Slock, “On MIMO capacity for various types
of partial channel knowledge at the transmitter,” Proc. of Information Theory
Workshop (ITW ’03), pp. 99–102, Mar. 2003.
[43] R. Negi, A. M. Tehrani, and J. M. Cioffi, “Adaptive antennas for space-time
codes in outdoor channels,” IEEE Trans. on Communications, vol. 50, no. 12,
pp. 1918–1925, Dec. 2002.
[44] E. Visotsky and U. Madhow, “Space-time transmit precoding with imperfect
feedback,” IEEE Trans. on Information Theory, vol. 47, no. 6, pp. 2632–2639,
Sep. 2001.
[45] G. Jongren, M. Skoglund, and B. Ottersten, “Combining beamforming and or-
thogonal space-time block coding,” IEEE Trans. on Information Theory, vol. 48,
no. 3, pp. 611–627, Mar. 2002.
[46] A. Narula, M. J. Lopez, M. D. Trott, and G. W. Wornell, “Efficient use of side
information in multiple-antenna data transmission over fading channels,” IEEE
J. Selected Areas in Communications, vol. 16, no. 8, pp. 1423–1436, Oct. 1998.
[47] S. A. Jafar, S. Vishwanath, and A. Goldsmith, “Channel capacity and beam-
forming for multiple transmit and receive antennas with covariance feedback,”
in Proc. of International Conference on Communications, vol. 7, pp. 2266–2270,
Helsinki, Finland, Jun. 2001.
[48] S. E. Bensley and B. Aazhang, “Subspace-based channel estimation for code division
multiple access communication systems,” IEEE Trans. on Communications, vol.
44, no. 8, pp. 1009–1020, Aug. 1996.
[49] S. M. Alamouti, “A simple transmit diversity technique for wireless communications,”
IEEE J. Selected Areas in Communications, vol. 16, no. 8, pp.
1451–1458, Oct. 1998.
[50] J. G. Proakis, “Digital Communications,” Fourth Edition, McGraw Hill Inc., New
York, 2001.
[51] M. Eric, S. Parkvall, M. Dukic, and M. Obradovic, “An algorithm for joint direction
of arrival, time-delay and frequency-shift estimation in asynchronous DS-CDMA
systems,” in Proc. of IEEE Intern. Symp. on Spread Spectrum Tech. and Applications,
vol. 2, pp. 595–598, September 1998.
[52] R. K. Madyastha and B. Aazhang, “Synchronization and detection of spread
spectrum signals in multipath channels using antenna arrays,” in Proc. of 1995
IEEE Military Commun. Conf. (MILCOM ’95), pp. 1170–1174, 1995.
[53] C. Sengupta, J. R. Cavallaro, and B. Aazhang, “On multipath channel estimation
for CDMA systems using multiple sensors,” IEEE Trans. on Communications,
vol. 49, no. 3, pp. 543–553, Mar. 2001.
[54] J. Li, B. Halder, P. Stoica, and M. Viberg, “Computationally efficient angle
estimation for signals with known waveforms,” IEEE Trans. on Signal Processing,
vol. 43, pp. 2154–2163, Sep. 1995.
[55] C. Y. Chuang, X. Yu, and C. C. Jay Kuo, “Blind delay and DOA estimation
in correlated multipath DS-CDMA systems,” in Proc. of Fall 2004 IEEE Vehic.
Tech. Conf. (VTC Fall ’04), vol. 3, pp. 2173–2177, Sept. 2004.
[56] Y. Bresler and A. Macovski, “Exact Maximum Likelihood parameter estimation
of superimposed exponential signals in noise,” IEEE Trans. on Acoust., Speech,
and Signal Processing, vol. 34, pp. 1081–1089, Oct. 1986.
[57] M. Viberg and B. Ottersten, “Sensor array processing based on subspace fitting,”
IEEE Trans. on Signal Processing, vol. 39, pp. 1110–1121, May 1991.
[58] T-J. Shan and T. Kailath, “Adaptive Beamforming for Coherent Signals and
Interference,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 33,
no. 3, pp. 527–536, June 1985.
[59] R. Wu and I. N. Psaromiligkos, “Direction-of-arrival and propagation delay
estimation of DS/CDMA transmissions in multipath environments,” in Proc. of
22nd Biennial Symposium on Communications, June 2004.
[60] I. N. Psaromiligkos, S. N. Batalama, and M. J. Medley, “Rapid synchronization
and combined demodulation. Part I: Algorithmic developments,” IEEE Trans.
on Communications, vol. 51, pp. 983–994, June 2003.
[61] I. N. Psaromiligkos and S. N. Batalama, “Rapid synchronization and combined
demodulation. Part II: Finite data record performance analysis,” IEEE Trans.
on Communications, vol. 51, pp. 1162–1172, July 2003.
[62] M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,”
IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 33, no. 2, pp. 387–
392, Apr. 1985.
[63] A. P. Liavas and P. A. Regalia, “On the behavior of information theoretic criteria
for model order selection,” IEEE Trans. on Signal Processing, vol. 49, pp. 1689–
1695, August 2001.
[64] H. Lee and F. Li, “A novel approach for detecting the number of emitters in a
cluster,” Proc. of IEEE Int. Conf. on Acoust., Speech, and Signal Processing,
vol. 1, pp. 261–264, Apr. 27-30, 1993.
[65] W. Xu, J. Pierre, and M. Kaveh, “Practical detection with calibrated arrays,” in
Proc. of Statistical Signal and Array Processing Workshop, pp. 82–85, 1992.
[66] J. Joutsensalo, “A subspace method for model order estimation in CDMA,”
Proc. of the Fourth IEEE Intern. Symp. on Spread Spectrum Tech. and
Applications (ISSSTA ’96), Mainz, Germany, pp. 79–85, 1996.
[67] J. Rissanen, “Modeling by shortest data description,” Automatica, vol. 14, pp.
465–471, 1978.
[68] T. W. Anderson, “Asymptotic theory for principal component analysis,” Ann.
Math. Stat., vol. 34, pp. 122–148, 1963.
[69] P. Stoica and A. Nehorai, “Performance study of conditional and unconditional
direction-of-arrival estimation,” IEEE Trans. on Acoust., Speech, and Signal Pro-
cessing, vol. 38, pp. 1783–1795, Oct. 1990.
[70] A. G. Jaffer, “Maximum likelihood direction finding of stochastic sources: A separable
solution,” in Proc. of IEEE Int. Conf. on Acoust., Speech, and Signal Processing,
New York, NY, pp. 2893–2896, Apr. 1988.
[71] M. Wax, “Model-based processing in sensor arrays,” in S. Haykin, editor, Ad-
vances in Spectral Analysis and Array Processing, volume III, pages 1–47.
Prentice-Hall, Englewood Cliffs, N. J., 1995.
[72] P. Stoica and A. Nehorai, “MUSIC, maximum likelihood, and Cramer-Rao
bound,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 37, pp.
720–741, May 1989.
[73] J. Choi and I. Song, “Asymptotic distribution of the MUSIC null spectrum,”
IEEE Trans. on Signal Processing, vol. 41, pp. 985–988, Feb. 1993.
[74] L. C. Zhao, P. R. Krishnaiah, and Z. D. Bai, “On detection of the number of
signals in the presence of white noise,” J. Multivariate Analysis, vol. 20, pp.
1–25, Jan. 1986.
[75] J. E. Evans, J. R. Johnson, and D. F. Sun, “Application of advanced signal pro-
cessing techniques to angle of arrival estimation in ATC navigation and surveil-
lance system,” M.I.T. Lincoln Lab., Lexington, MA, Rep. 582, 1982.
[76] T-J. Shan and T. Kailath, “Adaptive Beamforming for Coherent Signals and
Interference,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol. 33,
no. 3, pp. 527–536, June 1985.
[77] Y. Bresler and A. Macovski, “Exact Maximum Likelihood parameter estimation
of superimposed exponential signals in noise,” IEEE Trans. on Acoust., Speech,
and Signal Processing, vol. 34, pp. 1081–1089, Oct. 1986.
[78] M. Viberg and B. Ottersten, “Sensor array processing based on subspace fitting,”
IEEE Trans. on Signal Processing, vol. 39, pp. 1110–1121, May 1991.
[79] W. Du and R. L. Kirlin, “Improved spatial smoothing techniques for DOA es-
timation of coherent signals,” IEEE Trans. on Signal Processing, vol. 39, no. 5,
pp. 1208–1210, May 1991.
[80] J. Li, “Improved angular resolution for spatial smoothing techniques,” IEEE
Trans. on Signal Processing, vol. 40, no. 12, pp. 3078–3081, Dec. 1992.
[81] E. M. Al-Ardi, R. M. Shubair, and M. E. Al-Mualla, “Computationally efficient
high-resolution DOA estimation in multipath environment,” Electronics Letters,
vol. 40, no. 14, pp. 908–910, Jul. 2004.
[82] Y. H. Choi, “Subspace-based coherent source localization with forward/backward
covariance matrices,” Proc. of Inst. Elect. Eng., Radar Sonar Navig., vol. 149,
no. 3, pp. 145–151, Mar. 2002.
[83] A. Moghaddamjoo, “Application of spatial filters to DOA estimation of coherent
sources,” IEEE Trans. on Signal Processing, vol. 39, no. 1, pp. 221–224, Jan.
1991.
[84] A. Moghaddamjoo and T.-C. Chang, “Analysis of the spatial filtering approach
to the decorrelation of coherent sources,” IEEE Trans. on Signal Processing, vol.
40, no. 3, pp. 692–694, Mar. 1992.
[85] A. Delis and G. Papadopoulos, “Enhanced forward/backward spatial filtering
method for DOA estimation of narrow-band coherent sources,” Proc. of Inst.
Elect. Eng., Radar Sonar Navig., vol. 143, no. 1, pp. 10–16, Feb. 1996.
[86] P. Stoica and A. Nehorai, “MUSIC, maximum likelihood, and Cramer-Rao
bound,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 37, pp.
720–741, May 1989.
[87] B.-H. Wang, Y.-L. Wang, and H. Chen, “Weighted spatial smoothing for
direction-of-arrival estimation of coherent signals,” in Digest of Antennas and
Propagation Society International Symposium, vol. 2, pp. 668–671, June 2002.
[88] K.-C. Tan and G.-L. Oh, “Estimating directions-of-arrival of coherent signals in
unknown correlated noise via spatial smoothing,” IEEE Trans. on Signal Pro-
cessing, vol. 45, no. 4, pp. 1087–1091, 1997.
[89] R. O. Schmidt, “Multiple emitter location and signal parameter estimation,”
IEEE Trans. on Antennas and Propagation, vol. 34, no. 3, pp. 276–280, Mar.
1986.
[90] A. W. Marshall and I. Olkin, “Inequalities: Theory of Majorization and Its
Applications,” Volume 143 in Mathematics in Science and Engineering Series,
Academic Press, 1979.
[91] B. C. Arnold, “Majorization and the Lorenz Order: A Brief Introduction,”
Springer-Verlag Lecture Notes in Statistics, vol. 43, 1987.
[92] R. A. Horn and C. R. Johnson, “Topics in Matrix Analysis,” New York: Cam-
bridge Univ. Press, 1994.
[93] K. C. Sharman and T. S. Durrani, “A comparative study of modern eigenstructure
methods for bearing estimation - a new high-performance approach,” Proc. of
Conf. on Decision and Control (Athens, Greece), pp. 1737–1742, 1986.
[94] M. L. McCloud and L. L. Scharf, “New Subspace Identification Algorithm for
High Resolution DOA Estimation,” IEEE Trans. on Antennas and Propagation,
vol. 50, no. 10, pp. 1382–1390, Oct. 2002.
[95] G. J. Foschini, “Layered space-time architecture for wireless communication in
a fading environment when using multi-element antennas,” Bell Labs Technical
Journal, vol. 1, no. 2, pp. 41–59, Autumn 1996.
[96] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a Fad-
ing Environment when using multiple antennas,” Wireless Personal Communi-
cations, vol. 6, no. 3, pp. 311–335, Mar. 1998.
[97] E. Telatar, “Capacity of the Multiple antenna Gaussian channel,” European
Transactions on Telecommunications, vol. 10, no. 6, pp. 585–595, Nov. and Dec.
1999.
[98] D. Chizhik, G. J. Foschini, M. J. Gans and R. A. Valenzuela, “Keyhole, cor-
relations, and capacities of multielement transmit and receive antennas,” IEEE
Trans. on Wireless Communications, vol. 1, no. 2, pp. 361–368, Apr. 2002.
[99] J.-C. Guey, M. P. Fitz, M. R. Bell, and W.-Y. Kuo, “Signal design for transmitter
diversity wireless communication systems over Rayleigh fading channels,” IEEE
Trans. on Communications, vol. 47, pp. 527–537, Apr. 1999.
[100] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela,
“V-BLAST: An architecture for realizing very high data rates over the rich-scattering
wireless channel,” Proc. of IEEE Int. Symp. on Signals, Systems and Electronics
(ISSSE ’98), Pisa, Italy, Sep. 1998.
[101] S. Jafar, S. Vishwanath, and A. Goldsmith, “Channel capacity and beamform-
ing for multiple transmit and receive antennas with covariance feedback,” Proc.
of Int. Conf. on Communications, vol. 7, pp. 2266–2270, Jun. 2001.
[102] V. Tarokh, N. Seshadri, and A. R. Calderbank, “Space-time codes for high
data rate wireless communication: Performance criterion and code construction,”
IEEE Trans. on Information Theory, vol. 44, pp. 744–765, Mar. 1998.
[103] S. Alamouti, “A simple transmit diversity technique for wireless communications,”
IEEE J. Selected Areas in Communications, vol. 16, pp. 1451–1458, Oct.
1998.
[104] V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from
orthogonal designs,” IEEE Trans. on Information Theory, vol. 45, pp. 1456–1467,
Jul. 1999.
[105] S. T. Chung, A. Lozano, H. C. Huang, A. Sutivong, and J. M. Cioffi, “Ap-
proaching the MIMO capacity with a low-rate feedback channel in V-BLAST,”
EURASIP J. Applied Signal Processing, vol. 5, pp. 762–771, 2004.
[106] R. Heath, Jr. and A. Paulraj, “Switching between multiplexing and diversity
based on constellation distance,” Proc. of Allerton Conf. on Communication,
Control and Computing, Oct. 2000.
[107] B. M. Hochwald and T. L. Marzetta, “Unitary space-time modulation for
multiple-antenna communication in Rayleigh flat fading,” IEEE Trans. on In-
formation Theory, vol. 46, pp. 543–564, Mar. 2000.
[108] B. Hochwald and W. Sweldens, “Differential unitary space time modulation,”
IEEE Trans. on Communications, vol. 48, pp. 2041–2052, Dec. 2000.
[109] B. Hughes, “Differential space-time modulation,” IEEE Trans. on Information
Theory, vol. 46, pp. 2567–2578, Nov. 2000.
[110] A. Shokrollahi, B. Hassibi, B. Hochwald, and W. Sweldens, “Representation
theory for high-rate multiple-antenna code design,” IEEE Trans. on Information
Theory, vol. 47, pp. 2335–2367, Sep. 2001.
[111] V. Tarokh and H. Jafarkhani, “A differential detection scheme for transmit
diversity,” IEEE J. Selected Areas in Communications, pp. 1169–1174, Jul. 2000.
[112] B. Hassibi and B. M. Hochwald, “High-rate codes that are linear in space and
time,” IEEE Trans. on Information Theory, vol. 48, no. 7, pp. 1804–1824, Jul.
2002.
[113] A. V. Geramita and J. Seberry, “Orthogonal designs. Quadratic forms and
Hadamard matrices,” Lecture Notes in Pure and Appl. Math. 45, Marcel Dekker,
New York, 1979.
[114] G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, “De-
tection algorithm and initial laboratory results using V-BLAST space-time com-
munication architecture,” Electronics Letters, vol. 35, pp. 14–16, Jan. 1999.
[115] M. O. Damen, A. Tewfik, and J.-C. Belfiore, “A construction of a space-
time code based on number theory,” IEEE Trans. on Information Theory, vol.
48, no. 3, Mar. 2002.
[116] G. Ganesan and P. Stoica, “Space-time block codes: A maximum SNR ap-
proach,” IEEE Trans. on Information Theory, vol. 47, pp. 1650–1656, May 2001.
[117] B. M. Hochwald, T. Marzetta, T. Richardson, W. Sweldens, and R. Urbanke,
“Systematic design of unitary space-time constellations,” IEEE Trans. on Information
Theory, vol. 46, no. 6, pp. 1962–1973, Sept. 2000.
[118] K. K. Mukkavilli, A. Sabharwal, E. Erkip, and B. Aazhang, “On beamforming
with finite rate feedback in multiple antenna systems,” IEEE Trans. on
Information Theory, vol. 49, no. 10, pp. 2562–2579, Oct. 2003.
[119] O. Tirkkonen, A. Boariu, and A. Hottinen, “Minimal non-orthogonality rate 1
space-time block code for 3+ Tx antennas,” Proc. of 6th IEEE Int. Symp. on
Spread Spectrum Tech. and Applications (ISSSTA ’00), pp. 429–432, Sep. 2000.
[120] M. O. Damen, A. Chkeif, and J.-C. Belfiore, “Lattice codes decoder for space-
time codes,” IEEE Communications Letters, vol. 4, no. 5, pp. 161–163, May 2000.
[121] B. A. Sethuraman, B. Sundar Rajan, and V. Shashidhar, “Full-diversity, high-
rate space-time block codes from division algebras,” IEEE Trans. on Information
Theory, vol. 49, no. 10, Oct. 2003.
[122] G. N. Karystinos and D. A. Pados, “The maximum squared correlation, total
asymptotic efficiency, and sum capacity of minimum total-squared-correlation
binary signature sets,” IEEE Trans. on Information Theory, vol. 51, pp. 348–
355, Jan. 2005.
[123] M. Rupf and J. L. Massey, “Optimum sequence multisets for synchronous code-
division multiple-access channels,” IEEE Trans. on Information Theory, vol. 40,
no. 4, pp. 1261–1266, Jul. 1994.
[124] L. R. Welch, “Lower bounds on the maximum cross-correlation of signals,”
IEEE Trans. on Information Theory, vol. IT-20, no. 3, pp. 397–399, May 1974.
[125] J. L. Massey and T. Mittelholzer, “Welch’s bound and sequence sets for code
division multiple-access systems,” in Sequences II: Methods in Communication,
Security, and Computer Science, New York: Springer-Verlag, pp. 63–78, 1993.
[126] G. H. Golub and C. F. Van Loan, “Matrix Computations,” Johns Hopkins University
Press, Third Edition, 1996.
[127] I. D. Coope and P. F. Renaud, “Trace inequalities with applications to orthogo-
nal regression and matrix nearness problems,” Report UCDMS2000/17, Depart-
ment of Mathematics and Statistics, University of Canterbury, Christchurch,
New Zealand, November 2000.
[128] N. Sharma and C. Papadias, “Improved quasi-orthogonal codes through constel-
lation rotation,” IEEE Trans. on Communications, vol. 51, no. 3, pp. 332–335,
Mar. 2003.
[129] W. Su and X.-G. Xia, “Signal constellations for quasi-orthogonal space-
time block codes with full diversity,” IEEE Trans. on Information Theory, vol.
50, no. 10, pp. 2331–2347, Oct. 2004.
[130] H. Jafarkhani, “A quasi-orthogonal space-time block code,” IEEE Trans. on
Communications, vol. 49, no. 1, pp. 1–4, Jan. 2001.
[131] C. B. Papadias and G. J. Foschini, “A space-time coding approach for systems
employing four transmit antennas,” Proc. of IEEE Int. Conf. on Acoust., Speech,
and Signal Processing, Salt Lake City, UT, vol. 4, pp. 2481–2484, May 2001.
[132] O. Tirkkonen and A. Hottinen, “Square-matrix embeddable space-time block
codes for complex signal constellations,” IEEE Trans. on Information Theory,
vol. 48, no. 2, pp. 1122–1126, Feb. 2002.
[133] L. Zheng and D. Tse, “Diversity and multiplexing: a fundamental tradeoff in
multiple-antenna channels,” IEEE Trans. on Information Theory, vol. 49, no. 5,
pp. 1073–1096, May 2003.
[134] R. A. Horn and C. R. Johnson, “Matrix Analysis,” Cambridge University Press,
Cambridge, UK, 1990.
[135] D. B. Shapiro, “Compositions of Quadratic Forms,” W. de Gruyter Verlag,
2000.