On Forward Error Correction Codes and Line-coding
Schemes in Optical Fiber Communications
by
Yi Cai
Dissertation submitted to the Faculty of the Graduate School
of the University of Maryland in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
2001
Copyright 2001 by Yi Cai
To my parents and my wife
Acknowledgements
I could not have completed my Ph.D. studies and research in three years without the
support of many people who are gratefully acknowledged here.
I would like to express my sincere gratitude to Dr. Tülay Adalı and Dr. Joel M. Morris,
my co-advisors, for their extraordinary enthusiasm, encouragement, and guidance
throughout the course of my Ph.D. study. Dr. Adalı was the initiator who made the whole
procedure possible by offering me a research assistant position in her group in 1998. She
has been a dependable helper in any kind of difficulty, and her expertise in signal processing
and her probing suggestions in the technical discussions helped develop some
fresh ideas in my research. Dr. Morris, from whom I took most of my courses at UMBC,
has been a constant source of insight and vision, and I have benefited from his valuable
experience and constructive criticism. His expertise in the field of coding technology
has been the solid base for my dissertation research. He has also made valuable and extensive
contributions in reviewing and revising this dissertation.
I would like to gratefully acknowledge the contributions of Dr. Curtis R. Menyuk, who
has provided the major direction for my research and pointed out the promise of signal
processing and coding technology for significant advances in optical fiber communications.
His effort and contribution in reviewing and revising the dissertation are also greatly
appreciated.
I would like to deeply thank Dr. Gary M. Carter for offering me the chance to perform
experiments in his lab and financially supporting this work together with Drs. Adalı and
Menyuk.
I would also like to express my sincere appreciation to Dr. A. Brinton Cooper, III, for
serving on my dissertation committee and carefully going through details of the dissertation.
His valuable advice has led to improvement of this dissertation.
Special thanks go to Drs. Nandakumar Ramanujam, Alexei Pilipetskii, Andrej Puc, and
Gerald E. Lenner from TyCom for a very productive and exciting summer internship experience.
They offered me the opportunity to get a flavor of the issues in a real optical
fiber transmission system, and the technical discussions with them stimulated some of the
ideas in the dissertation.
Thanks and appreciation are also due to Bo Wang, Hongmei Ni, Sneha Agarwal, Arvind
Ananthan, and Wenze Xi, my lab-mates in the Information Technology Laboratory,
for providing me such a creative and friendly work environment. I also want to thank my
classmates in ENEE728A, Chuck LaBerge, William R. Martin, Amitkumar Mahadevan,
and Felix Watson, for the interesting class discussions that helped me gain a deeper
understanding of turbo codes and low-density parity-check codes.
Thanks also to my research colleagues in the photonics group, Ruomei Mu, Vladimir
Grigoryan, Yu Sun, Hai Xu, Ronald Holzlöhner, John Zweck, Ivan T. Lima, Jr., Brian
Marks, Hua Jiao, Jiping Wen, Oleg Sinkin, Aurenice Lima, and Heider Ereifej, for their
valuable comments on my research, their kind help in setting up an office with their
group for me, and their active cooperation in performing the experiments.
Finally, I would like to express very special thanks to my parents, my wife, and my
daughter for their love, encouragement, endurance, and understanding. I hope this degree
will be a realization of my parents' dream and a nice reward for all the lonely weekends
my wife and daughter had to spend during my study.
Table of Contents
List of Tables
List of Figures
1. Introduction
1.1 Introduction
1.2 Major sources of impairment in optical fiber communications
1.3 Previous work on coding techniques in optical fiber communications
1.4 Motivation of our research
1.5 Dissertation organization
2. Modeling of amplified spontaneous emission noise (ASE) and soliton-soliton collisions (SSC) in optical fiber transmission systems
2.1 Statistics of ASE noise and channel models
2.2 Physical mechanism of SSC and simplified model for SSC-induced timing jitter
2.3 Summary
3. Performance of forward error correction (FEC) codes in correcting ASE-induced errors
3.1 Lower bound for general FEC code performance
3.2 Upper bound for linear FEC code performance
3.3 Performance improvement of turbo codes
3.4 Summary
4. A sliding window criterion (SWC) line-code for mitigating soliton-soliton collision induced errors
4.1 Reed-Solomon (RS) codes without line-coding
4.2 SWC code
4.3 Block and trellis-based SWC codes
4.4 Concatenated RS/SWC coding scheme
4.5 Performance and comparisons via simulations
4.6 Summary
5. Summary and Conclusions
5.1 Summary
5.2 Conclusions
5.3 Suggestions for future research
Bibliography
List of Tables
4.1 SWC codeword examples
4.2 Codeword look-up tables for a trellis-based SWC code
4.3 Bit errors and symbol errors induced by soliton-soliton collision
List of Figures
2.1 Comparison of the chi-square distribution and the Gaussian approximation
2.2 Binary-in-binary-out (BIBO) channel model
2.3 Comparison of the hard-decision thresholds based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation
2.4 Comparison of the detected BERs as a function of Q, based on the chi-square distribution, Gaussian approximation, and Gaussian + BSC approximations
2.5 Comparison of the transition probabilities based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 as functions of Q2
2.6 Optical soliton transmission
2.7 Changes of soliton velocity and acceleration during collision versus distance
2.8 Soliton-soliton collision in a two-channel WDM system, the rectangular block in the figure is defined as the sliding window
2.9 Patterns of SSC-induced bit errors, in (a) middle channel of a 4-channel 12 Gb/s WDM system and (b) middle channel of a 4-channel 14 Gb/s WDM system
3.1 Illustration of the source-channel coding theorem
3.2 Comparison of the channel capacities evaluated based on the chi-square BAC, Gaussian BAC, and Gaussian BSC models of optical fiber channel with dominant ASE noise
3.3 The quantity fs giving minimum value of I(U, V) as a function of Pe for p = 0.1, 0.2, …, 0.9
3.4 Comparison of the exact rate distortion function and the approximation based on equal transition probabilities, fs = ms, for different source distributions, p = 0.1, …, 0.9
3.5 Comparison of the lower performance bounds of FEC codes evaluated with chi-square BAC, Gaussian BAC, and Gaussian BSC models
3.6 The µ(s, dj+) curves with different values of dj+
3.7 Comparison of µ(0.5, d/2)/2 and Minµ(s, dj+)/2 at d = 3, 6, 9, 12 for the optical fiber channel with M = 3
3.8 Codeword structure of the Hamming (7, 4) × (7, 4) TPC
3.9 Encoder structure of (1, 5/7, 5/7) TCC with 100-bit interleaver
3.10 Upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (triangles) and the (1, 5/7, 5/7) TCC with interleaver length 100 (circles) using the Gaussian (dotted) and the chi-square (solid) ASE noise models
3.11 Comparison of the upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (solid) and the (1, 5/7, 5/7) TCC with interleaver length 20 (dashed) using the chi-square ASE noise model
3.12 Comparison of the pdfs of the ASE noise with chi-square distribution and Gaussian approximation with the same mean and variance
3.13 Likelihood ratio using the hard-decision threshold based on a Gaussian BSC model for Bo/Be = 3
3.14 Turbo code encoder and decoder structure
3.15 Output BER comparison of the turbo code (31, 27, 400) decoder based on the chi-square model (solid), the Gaussian model (dotted), and the Gaussian BSC model (dashed) of the ASE noise in the optical fiber transmission system, the rate 1/2 and rate 3/4 codes are punctured versions of the rate 1/3 turbo code
4.1 Approximated distribution of SSC-induced time shift
4.2 SSC-induced BERs before RS decoding and error correction capability of RS (255, m) codes as a function of redundancy k = 255 – m at the data rate of 12.5 Gb/s
4.3 Soliton-soliton collision in a two-channel WDM system, the rectangular block is defined as the sliding window
4.4 Algorithms for generating the SWCMBNB code table
4.5 Continuous components of the power spectral densities of the uncoded random signal (solid) and the signals encoded by the FF8B10B (dash-dot), the WF8B10B (dotted), and the Manchester (dotted) codes
4.6 Implementation of the block SWC code
4.7 Function diagram of the trellis-based SWC encoder
4.8 Trellis diagram of the trellis-based SWC encoder
4.9 Trellis of the (4, 3, 2) SWC encoder
4.10 Possible combinations of number of marks in v(0), v(1) and v(2)
4.11 Concatenated RS/SWC coding scheme
4.12 Reduction of the SSC-induced timing jitter with a SWC (10, 8) code
4.13 Comparison of the code performances in enhancing the WDM system capacity in (a) transmission bit rate and (b) channel spacing
4.14 Probability mass function (pmf) of the number of marks in the sliding window on the data sequence encoded with the fragmentation-first (star) and the weight-first (triangle) algorithms for codeword length = 14 bits, and, (a) sliding window length = 4 bits and (b) sliding window length = 14 bits
4.15 SSC-induced timing jitter of desirable (square), random (no sign), and undesirable (circle) data patterns. Solid: timing jitter in middle channel. Dotted: timing jitter in the outmost channel
4.16 Eye diagrams of the received signals with undesirable (upper) and desirable (lower) patterns
Chapter 1
Introduction
1.1 Introduction
The growth in demand for broadband services has led to considerably increased activity
in research for optical fiber communications systems and networks with high transmission
bit rate and high spectral efficiency [1]–[4]. The standard optical fiber installation
can provide ~25 THz bandwidth, which is far greater than what is currently in use.
This potential capacity can be exploited through the use of wavelength division multiplexing
(WDM) in optical fiber communications. In WDM systems, a number of different
independent wavelengths are transmitted simultaneously on one optical fiber and,
thus, they more fully utilize the enormous fiber bandwidth [5], [6]. A major enabling
technology for multi-wavelength systems is the optical amplifier that can provide gain to
many channels simultaneously over a ~THz wavelength range. Moreover, the transmission
bit rate per channel has been increasing, and systems with a single channel rate of 40
Gbps have emerged [7]–[10]. With the continuous efforts in channel spacing reduction
and transmission bit rate enhancement, optical fiber transmission systems with bit rate as
high as several Tbps [7]–[13] and spectral efficiency more than 0.6 bit/s/Hz [9], [10] have
been demonstrated.
However, the physical impairments in the optical fiber transmission lines limit the obtainable
channel spacing and data rates in optical fiber communications. The major
sources of impairment in optical fiber communications systems include the amplified
spontaneous emission (ASE) noise from the optical amplifiers, chromatic dispersion, fiber
nonlinearities (particularly the Kerr nonlinearity), and polarization effects (particularly
polarization mode dispersion in terrestrial systems) [14].
Two important trends have emerged in the drive to combat these impairments. The first is
the use of modulation formats for the launched optical pulses that are quasi-linear [15].
This leads to two major types of optical fiber transmission systems: chirped return-to-zero
(CRZ) systems and dispersion managed soliton (DMS) systems [15]. CRZ and DMS
are two signal modulation formats.
The CRZ format has a quasi-linear evolution, which means that when the fiber nonlinearity
has been carefully mitigated, the optical pulse evolution appears linear in important
respects [52].
The DMS format comes in two major variants. The first, which we refer to as periodically
stationary DMS, is a format in which pulses return to the same shape at the end of
every period in the dispersion map, so that there is a balance between nonlinearity and
dispersion. The second variant of the DMS format, which we refer to as quasi-linear DMS,
does not have periodically stationary behavior. Like the CRZ format, the optical pulse
evolution appears linear in important respects once the nonlinearity is mitigated [52].
The second trend is the growth in importance of error correction and line-coding, as
well as signal processing, in optical fiber communications systems [17]–[46]. Applications
of coding techniques in optically amplified WDM fiber transmission systems significantly
add system margin against physical impairments and, thus, increase the bit rate,
reduce the channel spacing, increase transmission distance, and reduce system power
budget. Applications of forward error correction (FEC) codes have been standardized in
long-haul undersea systems [47] and are predicted to play an important role in future 40
Gbps long-haul terrestrial systems.
This dissertation addresses the application of FEC and line-coding to achieve higher bit
rates and spectral efficiency in optical fiber communications. In this introductory chapter,
we first describe the main sources of impairment in optical fiber communications and
their major physical effects. Then, we provide a brief survey of the previous work on
applications of coding techniques in optical fiber communications. Based on this survey, we
point out some very important research topics that have not been addressed by other
researchers, which provides the motivation for our research and the subject of this dissertation.
This chapter ends with an outline of the dissertation.
1.2 Major sources of impairment in optical fiber communications
ASE noise, chromatic dispersion, fiber nonlinearities, and polarization effects are the
major sources of physical impairments limiting the achievable transmission capacity in
optical fiber communications systems. We now give a brief description of each of them.
1.2.1 ASE noise in optical amplifiers
Optical amplifiers consist of an active medium that has the carrier population in its
quantum energy levels inverted by a pump source, so that an input optical signal can initiate
stimulated emission and achieve coherent gain. Along with stimulated emission,
there is always spontaneous emission, which leads to noise. A fraction of the spontaneous
emission is coupled into the beam propagation path in the optical fiber and is amplified
[6]. This amplified spontaneous emission (ASE) noise is broadband, occurring over the
entire gain bandwidth of the optical amplifier [6]. Moreover, in systems with lumped optical
amplifiers, the accumulated ASE noise may cause gain saturation and thereby limit
the achievable signal gain. ASE noise can be characterized as white noise for each channel
in a WDM system, and it decreases the signal-to-noise ratio (SNR) at the receiver.
To investigate the statistics of ASE noise, it is advantageous [48], [49] to represent it
with a set of orthonormal functions, $\phi_i(t)$, over the transmitted optical signal period T by
\[ \sum_{i=1}^{2M} n_i \phi_i(t), \tag{1.1} \]
where 2M is the dimensionality of the space of the transmitted optical signals, and the $n_i$
represent independent Gaussian random variables with zero mean and identical variance. The
transmitted optical signal under these assumptions can be expanded in the same basis as
\[ \sum_{i=1}^{2M} s_i \phi_i(t). \tag{1.2} \]
Thus, the transmitted optical signal with ASE noise can be characterized as the sum of a
set of independent Gaussian random processes given by
\[ \sum_{i=1}^{2M} (s_i + n_i) \phi_i(t). \tag{1.3} \]
If we neglect the changes that occur during the transmission (which cannot be done in
practice), then the received optical signal will have the same distribution. However, because
the photo-detector at the receiver with direct detection is inherently a square-law
device, the detected electrical signal, I, will equal the square of the incoming optical signal,
and can be approximated by
\[ I = \int_0^T \left[ \sum_{i=1}^{2M} (s_i + n_i) \phi_i(t) \right]^2 dt = \sum_{i=1}^{2M} (s_i + n_i)^2. \tag{1.4} \]
The detected electrical signal, I, therefore can be characterized by a sum of squared
non-zero-mean Gaussian random variables. Hence, the statistics of the detected signal
given by Eq. (1.4) are no longer Gaussian but chi-square [48]. Moreover, the expansion
of the square terms in Eq. (1.4) yields a “signal/noise beat” term, $2 n_i s_i$, for the
case where $s_i \neq 0$. This case corresponds to the occurrence of optical pulses, and it is
customary to call these occurrences “marks.” There are no such terms for the case $s_i = 0$,
where the time slots do not contain an optical pulse. It is customary to call these empty
slots “spaces.” Therefore, marks and spaces have different distribution functions and
variances. As a consequence, the distributions of marks and spaces are asymmetric. This
will lead to a binary asymmetric channel model for the post-detection signal.
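The difference between the mark and space statistics implied by Eq. (1.4) can be checked numerically. The following Python sketch is illustrative only: it assumes unit-variance noise components, M = 3, and a mark whose energy (25, an arbitrary value) sits entirely in the first basis coefficient. The function names are hypothetical and none of these values are taken from the dissertation.

```python
import random

def detected_moments(signal_coeffs, sigma):
    """Analytic mean and variance of I = sum_i (s_i + n_i)^2 with n_i ~ N(0, sigma^2)."""
    dims = len(signal_coeffs)                     # this is 2M in Eq. (1.4)
    energy = sum(s * s for s in signal_coeffs)    # sum_i s_i^2
    mean = energy + dims * sigma ** 2
    # The 4*sigma^2*energy term is contributed by the signal/noise beat terms 2*n_i*s_i.
    variance = 4.0 * sigma ** 2 * energy + 2.0 * dims * sigma ** 4
    return mean, variance

def sample_detected(signal_coeffs, sigma, rng):
    """Draw one square-law detector output according to Eq. (1.4)."""
    return sum((s + rng.gauss(0.0, sigma)) ** 2 for s in signal_coeffs)

def empirical_moments(signal_coeffs, sigma, trials, seed=0):
    """Monte-Carlo estimate of the mean and variance of the detector output."""
    rng = random.Random(seed)
    samples = [sample_detected(signal_coeffs, sigma, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    variance = sum((x - mean) ** 2 for x in samples) / (trials - 1)
    return mean, variance

M = 3
space = [0.0] * (2 * M)               # s_i = 0: empty time slot
mark = [5.0] + [0.0] * (2 * M - 1)    # assumed mark with energy 25

if __name__ == "__main__":
    for name, coeffs in (("space", space), ("mark", mark)):
        print(name, detected_moments(coeffs, 1.0), empirical_moments(coeffs, 1.0, 20000))
```

Under these assumptions the mark has a much larger variance than the space, and the extra variance comes entirely from the beat term, which is the asymmetry described above.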
From the above discussion, we can see that ASE noise causes non-Gaussian and
asymmetric distributions of the detected signals. However, for simplicity in the analytical
studies involving ASE noise, the Gaussian approximation and the binary symmetric
channel approximation for optical fiber channels with dominant ASE noise are widely
used. These approximations are used even in studies of FEC codes [20]–[23], which yields
suboptimal designs and performance assessments, as will be shown in the following
chapters. We want to point out that Eq. (1.4) is still an approximation of the detected signal
with additive ASE noise, because only the amplitude fluctuation due to ASE noise is
considered here. However, ASE may also induce timing jitter of the optical signals at the
detector, where timing jitter is defined as the random deviation of the optical pulse position
from its nominal location at the time slot center [16]. It has been shown in [50] that
ASE-induced timing jitter may also cause significant differences in the detected signal
statistics, especially in the tails of the probability density functions (pdfs).
1.2.2 Chromatic dispersion
Chromatic dispersion is a fundamental physical phenomenon in optical fibers that is
also called group velocity dispersion. It refers to the wavelength dependence of the refractive
index of optical fibers [6]. We know that the speed of light in optical fiber is determined
by the refractive index. If the refractive index is wavelength dependent, optical
signals at different wavelengths travel at different speeds in the optical fiber.
Consequently, optical signals belonging to different channels may pass through and
interact with each other during the propagation in WDM systems. We describe this phenomenon
as signal collision. Specifically, in WDM optical soliton transmission systems,
it is called soliton-soliton collision.
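As a rough illustration of why pulses in neighboring WDM channels pass through each other, the sketch below estimates the relative delay accumulated between two channels using the standard first-order relation Δt = D·Δλ·L, where D is the dispersion parameter in ps/(nm·km). The function names and the numerical values (D = 17 ps/(nm·km), 0.8 nm channel spacing, a 100 ps bit slot) are illustrative assumptions, not parameters from this dissertation, and the dispersion slope is ignored.

```python
def walkoff_ps(dispersion_ps_nm_km, spacing_nm, length_km):
    """Relative delay (ps) accumulated between two channels separated by spacing_nm."""
    return abs(dispersion_ps_nm_km) * spacing_nm * length_km

def slip_distance_km(dispersion_ps_nm_km, spacing_nm, bit_slot_ps):
    """Fiber length over which one channel slips by one bit slot relative to the other."""
    return bit_slot_ps / (abs(dispersion_ps_nm_km) * spacing_nm)

if __name__ == "__main__":
    print(walkoff_ps(17.0, 0.8, 100.0))         # delay after 100 km, in ps
    print(slip_distance_km(17.0, 0.8, 100.0))   # km needed to slip one 100 ps bit slot
```

With these assumed numbers the channels slip by a full bit slot every few kilometers, so pulses in different channels overlap and separate many times over a long link.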
Moreover, in a single channel, the chromatic dispersion may cause envelope distortion
of a single optical signal because the different frequency components comprising the
optical signal propagate at different speeds. The signal distortion increases as the
transmission distance increases and can be in the form of pulse broadening or narrowing,
depending on how the different frequency components are distributed in the time domain.
1.2.3 Fiber nonlinearities
Fiber nonlinearities are signal-intensity-dependent effects in optical fibers. The important
effects of fiber nonlinearities in optical fiber communications systems result from the
fact that optical signals with high intensity are confined to a small cross section over long
fiber lengths. The most common nonlinear effects in optical fiber communications systems
are stimulated light scattering due to the Raman and Brillouin effects and the nonlinear
refractive index change due to the Kerr effect [6]. The Kerr effect leads to the effects
referred to as self-phase-modulation, cross-phase-modulation, and four-wave mixing
[6]. These three effects are not unambiguously separable and they are defined as follows.
Self-phase-modulation results from the intensity-dependent refractive index in optical
fiber. The refractive index determines the speed of light in the fiber; therefore, different
intensity components contained within an optical pulse travel at different speeds. Thus,
the different intensity components become phase shifted. This effect, in which a single optical
pulse distorts its own phase profile, is referred to as self-phase-modulation [6].
In WDM systems, the Kerr effect may cause nonlinear interactions between different
channels. It can be viewed in the following way: A high intensity signal S1 in one channel
distorts the refractive index of the optical fiber, which in turn changes the propagation
speed of another signal S2 that is in a different channel and collides with S1 during their
propagation. This nonlinear inter-channel interaction in WDM soliton transmission systems
may cause severe timing jitter problems in all channels. This effect, which is purely
intensity dependent, is referred to as cross-phase-modulation [6].
By contrast, four-wave mixing is a phase dependent effect in which two waves at frequencies
ω1 and ω2 propagate inside the optical fiber simultaneously and maintain certain phase-matching
requirements as described in [96], so that signals at two new frequencies, ω3
and ω4, that satisfy ω3 + ω4 = ω1 + ω2 may be generated [6]. In WDM systems, channels
close to the zero-dispersion wavelength become nearly phase-matched because of the
similar propagation speeds. Thus, four-wave mixing introduces a trade-off in the dispersion
map design in optical fiber communications systems. Low dispersion is preferred to
lower the required average power and reduce the timing jitter, while four-wave mixing
becomes severe when the dispersion is close to zero [6]. This problem can be solved with
the dispersion management that will be described later in this section.
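The relation ω3 + ω4 = ω1 + ω2 can be illustrated for two waves. In the partially degenerate case, the new components appear at 2ω1 − ω2 and 2ω2 − ω1, which satisfy that relation. The sketch below works in optical frequency (THz) rather than angular frequency, and the channel grid is an assumed example, not one used in this dissertation.

```python
def fwm_products(f1, f2):
    """Return the two new frequencies from partially degenerate four-wave mixing."""
    f3 = 2.0 * f1 - f2
    f4 = 2.0 * f2 - f1
    return f3, f4

def falls_on_channel(product, channels, tol=1e-6):
    """True if a mixing product coincides with an existing WDM channel."""
    return any(abs(product - ch) < tol for ch in channels)

if __name__ == "__main__":
    channels = [193.0, 193.1, 193.2, 193.3]   # assumed equally spaced grid, in THz
    f3, f4 = fwm_products(channels[1], channels[2])
    print(f3, f4, falls_on_channel(f3, channels), falls_on_channel(f4, channels))
```

On an equally spaced grid the mixing products fall on the neighboring channels, which is why four-wave mixing appears as inter-channel crosstalk in WDM systems.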
All the nonlinear effects described above cause signal distortions that become worse
with higher signal intensities and longer transmission distances. In the case of long
distance transmission, fiber nonlinearities are more severe because the interactions that
cause the nonlinearity are allowed to accumulate. Generally speaking, fiber nonlinearities
are not important in low power and short distance transmission systems, but are important
in WDM systems with high power or narrow channel spacing, as well as in long-haul
transmission systems.
1.2.4 Polarization effects
Polarization effects are due to randomly varying birefringence in the optical fiber [51].
Birefringence leads to variations in the state of polarization of the launched optical signal
as it propagates in the optical fiber. This variation is caused by fluctuations in the core
shape of the optical fiber, temperature changes, or non-uniform stresses in the optical fiber.
Because the two polarization components have different group velocities, the optical
signal at the receiver suffers dispersion. This phenomenon is referred to as polarization
mode dispersion (PMD). The pulse broadening due to PMD is typically small compared
to the magnitude of the local chromatic dispersion. However, when fiber attenuation and
chromatic dispersion effects are compensated, PMD can become a limiting factor for
long-haul, high-bit-rate systems. It is difficult to compensate for PMD because it varies
randomly over time on a scale of milliseconds to hours.
All these physical impairments are combined in optical fiber transmission systems and
interact with each other. Different types of optical fiber transmission systems may be
dominated by different sources of impairment.
For example, in a chirped return-to-zero (CRZ) system, ASE noise and PMD are expected
to be the dominant impairments. An optical pulse is said to be chirped if its carrier
frequency changes with time [6]. A CRZ system has a return-to-zero modulation and
chirped optical pulses such that, on average, the trailing portion of the optical energy in a
single time slot moves faster than the leading portion. The chirped signal with carefully
designed dispersion compensation broadens significantly during propagation, but it is
compressed at the receiver [52]. The signal broadening during propagation significantly
decreases the signal power. Also, because of the pulse overlap caused by signal broadening,
the total signal powers must be kept small so that the fiber nonlinearity during the
signal propagation remains relatively small. All the behaviors just described also hold in a
quasi-linear dispersion-managed soliton (DMS) system.
By contrast, in WDM optical soliton systems that use traditional solitons, nonlinear
interactions of optical pulses among different channels (soliton-soliton collisions) can
cause severe inter-channel interference that may be converted into timing jitter, effectively
decreasing the actual transmission capacity [63], [64].
All modern-day WDM systems with a large number of channels (> 10) are quasi-linear.
First, the large third-order dispersion as defined in [6] that is present in most systems implies
that pulses in most channels undergo a large spread and overlap with their neighbors.
The power must be kept low to avoid unacceptably large inter-pulse interactions.
Second, even when the third-order dispersion is compensated, it is still necessary to
maintain a large dispersion to avoid cross-phase modulation between channels. Then,
even in this case, the power must be low to avoid strong inter-pulse interactions.
In any optical communication system, the ideal operating power is determined by the
interplay between ASE noise at low power and the Kerr nonlinearity at high power.
However, the Kerr nonlinearity can manifest itself in a wide variety of ways. In the systems
that are discussed in this dissertation, the dominant nonlinearity is the inter-channel
nonlinearity.
1.3 Previous work on coding techniques in optical fiber communications
Although FEC technology has become a hot topic in optical fiber communications in
the last 2–3 years, researchers have been studying and applying coding techniques in
optical fiber communications systems for about 20 years [17]–[46]. The coding techniques
that we are discussing here include both FEC and line-coding.
The basic idea behind FEC coding is to correct the possible transmission errors at the
receiver by adding well-defined redundancy that can be exploited at the receiver. The basic
idea behind line-coding [53], [54] is to modify a source signal waveform to enhance
proper signal reception in the presence of transmission impairments. In contrast to the
focus of FEC, which is to correct transmission errors in general, the focus of line-coding
is to provide timing information, remove DC content, provide power spectral density
shaping, facilitate performance monitoring, minimize pattern dependent BER, and ensure
against inducing too many decoded errors [53], [54].
Both FEC and line-coding add redundancy to the original data stream, so that encoding
increases the transmission signaling rate for the same data rate. Code rate r and overhead
h are two measures of the degree of encoded data redundancy; they are defined as
\[ r = \frac{\text{length of input data sequence}}{\text{length of encoded data sequence}}, \tag{1.5} \]
\[ h = \frac{\text{number of redundant bits}}{\text{length of input data sequence}}. \tag{1.6} \]
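As a worked example of Eqs. (1.5) and (1.6), the sketch below evaluates the code rate and overhead of an (n, k) block code, using the RS (255, 239) code mentioned later in this chapter. The function names and the 10 Gbps data rate are illustrative assumptions.

```python
from fractions import Fraction

def code_rate(k, n):
    """Eq. (1.5): input length k divided by encoded length n."""
    return Fraction(k, n)

def overhead(k, n):
    """Eq. (1.6): number of redundant bits (n - k) divided by input length k."""
    return Fraction(n - k, k)

def line_rate(data_rate, k, n):
    """Signaling rate needed to carry data_rate after encoding."""
    return data_rate / float(code_rate(k, n))

if __name__ == "__main__":
    k, n = 239, 255
    print(float(code_rate(k, n)))    # code rate r
    print(float(overhead(k, n)))     # overhead h
    print(line_rate(10.0, k, n))     # assumed 10 Gbps payload, result in Gbps
```

The two measures are linked by h = 1/r − 1, so a lower code rate always means a higher signaling rate for the same payload.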
In the coding scheme design, there is a trade-off between overhead and spectral efficiency.
An encoded stream with larger overhead uses more bandwidth for the same data
rate. In return for the price paid for redundancy, the coded transmission can achieve a
given BER at a lower SNR than uncoded transmission can, and the difference between
these two SNR values for the same BER is defined as the coding gain. In optical fiber
communications systems, the SNR is commonly represented by the Q factor defined as
\[ Q = \frac{\mu_1 - \mu_0}{\sigma_1 + \sigma_0}, \tag{1.7} \]
where $\mu_1$, $\mu_0$, $\sigma_1$, and $\sigma_0$ represent the mean values and standard deviations
of the received marks and spaces, respectively [49].
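The Q factor of Eq. (1.7) can be estimated directly from received samples. The sketch below does this and also converts Q to a BER using the widely used Gaussian-approximation formula BER ≈ ½ erfc(Q/√2). That formula and the 20·log10(Q) decibel convention are standard assumptions brought in for illustration; as noted in Section 1.2.1, the Gaussian model is itself only an approximation for ASE-dominated channels. The sample values are made up.

```python
import math
from statistics import mean, stdev

def q_factor(marks, spaces):
    """Eq. (1.7), with sigma taken as the sample standard deviation."""
    return (mean(marks) - mean(spaces)) / (stdev(marks) + stdev(spaces))

def q_to_db(q):
    """Q expressed in decibels, using the 20*log10 convention (assumed)."""
    return 20.0 * math.log10(q)

def ber_gaussian(q):
    """BER under the Gaussian approximation with an optimized decision threshold."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

if __name__ == "__main__":
    marks = [9.0, 10.0, 11.0]    # made-up detected mark levels
    spaces = [0.5, 1.0, 1.5]     # made-up detected space levels
    q = q_factor(marks, spaces)
    print(q, q_to_db(q), ber_gaussian(q))
```

Under this approximation, the coding gain defined above can be read as the difference, in dB, between the Q values that uncoded and coded transmission need to reach the same BER.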
In the following two subsections, we review the progress of FEC and line-coding in
optical fiber communications, respectively.
1.3.1 Survey on FEC technology in optical fiber communications
Research and application of FEC technology in optical fiber communications started in
the early 1990s. Most of the published work was done by researchers in industry. The FEC
coding schemes applied to optical fiber communications systems during the last decade
can be categorized into three generations: standard block codes with hard-decision decoding,
concatenated FEC codes with hard- or soft-decision decoding, and concatenated
FEC codes with soft-decision and iterative decoding (or so-called turbo codes). The corresponding
concepts in FEC coding techniques are defined in the following paragraph
[55].
Hard-decision decoding refers to the case in which the FEC code decoder has only binary
inputs when a binary demodulator output is used [55]. Similarly, if the demodulator
has more than two quantization levels (or the output is left unquantized) the code decoder
must accept multilevel (or analog) inputs, which is referred to as soft-decision decoding
[55]. Concatenated coding schemes generally involve two constituent codes, an inner
code and an outer code. At the transmitter, the original data sequence is first encoded by
the outer encoder and then by the inner encoder. Correspondingly, at the receiver, the received
data sequence is decoded in turn by the inner decoder and the outer decoder [55].
Iterative decoding means that the output decoded data sequence is fed back to the decoder
to be decoded again iteratively [67]. In a concatenated coding scheme, iterative decoding
allows information exchange between the constituent decoders and, thus, improves the
decoding performance.
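The distinction between hard- and soft-decision decoding can be illustrated with a toy example. The sketch below uses a length-3 repetition code and an antipodal signal mapping (bit 1 sent as +1, bit 0 as −1). Both are assumptions chosen for brevity; they are not codes or signal formats studied in this dissertation.

```python
def hard_decision_decode(received):
    """Quantize each sample to one bit first, then take a majority vote."""
    bits = [1 if r > 0.0 else 0 for r in received]
    return 1 if sum(bits) * 2 > len(bits) else 0

def soft_decision_decode(received):
    """Combine the unquantized samples before making a single decision."""
    return 1 if sum(received) > 0.0 else 0

if __name__ == "__main__":
    received = [0.9, -0.1, -0.2]   # noisy observations of the codeword 1 1 1
    print(hard_decision_decode(received), soft_decision_decode(received))
```

In this example two weak samples fall just on the wrong side of the threshold. The hard-decision decoder discards how unreliable they were and is outvoted, while the soft-decision decoder lets the single strong sample dominate.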
The first generation of FEC codes applied to optical fiber communications systems in-
cludes Hamming codes and Reed-Solomon (RS) codes using hard-decision decoding
[17]–[19]. In [17], Grover, et al., implemented long block length Hamming codes for
SONET STS-1 tributary. The [6208, 6195] shortened Hamming code was designed to
satisfy the STS-1 format and provide single-error correction and double-error detection.
They reported a reduction of the payload BER to about 8.6 × 10³ × Pe², where Pe is the
BER before decoding. In late 1993, Yamamoto, et al., with KDD, demonstrated a 5 dB
coding gain with the standard RS (255, 239) code in a 5 Gbps 210 km optical fiber
transmission experiment [18]. At almost the same time, Pamart, et al., with Alcatel,
and Chen, et al., with AT&T, reported more than 5 dB of coding gain with a 14%
overhead RS code in a 5 Gbps 6400 km optical fiber transmission experiment [19]. The
RS (255, 239) code has been standardized for the undersea cable system by the Interna-
tional Telecommunication Union [47]. All these investigations done in the early 90s were
experimental; not much theoretical study of FEC codes in an optical fiber transmission
environment was done.
Recently, Kidorf, et al., with TSSL (which is now TyCom), performed a detailed study
of the performance of the family of RS codes in long-haul WDM optical fiber transmission
systems [21]. They compared the theoretical bounds on the maximum decoded BER of RS
codes with various levels of overhead. They also performed Monte-Carlo simulations of RS
codes and compared the simulation results of the
code performance to the experimental measurements. They observed a good match be-
tween the simulation results and the experimental results. In the theoretical calculation of
the performance bound, they assumed a binomial distribution for the uncorrelated bit er-
rors, implying a binary symmetric channel (BSC) model for the optical fiber channel,
where the BSC represents a binary-in binary-out channel with equal transition probabili-
ties. In the Monte-Carlo simulations, they assumed additive white Gaussian noise
(AWGN) statistics for ASE noise. The AWGN channel, a binary-input continuous-output
channel, is analogous to the BSC with the exception that the output is a binary signal plus
AWGN.
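The binomial calculation described above can be sketched numerically. The snippet below is a minimal illustration and not the procedure of [21]: it assumes independent bit errors (a BSC), 8-bit RS symbols, a bounded-distance decoder that leaves uncorrectable words unchanged, and the common approximation that the output bit-to-symbol error ratio equals the input one. The function name `rs_decoded_ber` is hypothetical.

```python
from math import comb

def rs_decoded_ber(p, n=255, k=239, bits_per_symbol=8):
    """Approximate decoded BER of an RS(n, k) code on a BSC with bit error rate p."""
    t = (n - k) // 2                          # correctable symbol errors per codeword
    ps = 1.0 - (1.0 - p) ** bits_per_symbol   # symbol error probability before decoding
    # Binomial sum over codewords with more than t symbol errors; each such word
    # is assumed to keep its i erroneous symbols (fraction i / n of the word).
    ser_out = sum(
        (i / n) * comb(n, i) * ps**i * (1.0 - ps) ** (n - i)
        for i in range(t + 1, n + 1)
    )
    # Assumed conversion from symbol error ratio to bit error ratio.
    return ser_out * p / ps if ps > 0 else 0.0

for p in (1e-3, 1e-4):
    print(f"input BER {p:.0e} -> decoded BER about {rs_decoded_ber(p):.2e}")
```

Changing `n` and `k` gives a quick comparison of RS codes with different overhead under the same BSC assumption.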
The second generation of FEC codes applied to optical fiber communications systems
appeared in the late 90s [20]–[22]. It includes different concatenated FEC coding schemes.
Puc, et al., with TSSL, considered the NASA standard concatenated code consisting of
the RS (255, 239) block code using hard-decision decoding, and the rate 1/2, constraint-
length 7, convolutional code using soft-decision Viterbi decoding [20]. This scheme
yielded a 10 dB coding gain with 113% overhead in a 2.5 Gbps 5000 km WDM transmis-
sion experiment. They mentioned that the theoretical evaluation of the coding gain was
made based on AWGN. Ait Sab, et al., with Alcatel, evaluated the performance of the
concatenated RS (255, 223)/RS (255, 239) with simulations based on a Gaussian channel
model [22]. The Gaussian channel model they used has different noise levels for the
marks and spaces, which makes it more accurate than the AWGN model because ASE noise
has different distributions for marks and spaces. The simulation result showed a 7.7 dB
coding gain in a 10 Gbps 6500 km WDM transmission system [22], [23].
The third generation of FEC codes applied to optical fiber communications systems
has recently been proposed [22]–[24]. It includes concatenated RS/RS codes and con-
catenated BCH/BCH codes with soft iterative-decoding. These codes belong to the new
class of codes, called turbo codes, with iterative soft-decision (soft-input soft-output) de-
coding [56], [57]. There are two types of turbo codes depending on the type of constitu-
ent codes. If the constituent codes are block codes, we have turbo product codes (TPC). If
the constituent codes are convolutional codes, we have turbo convolutional codes (TCC)
[67]. Because the concept of a turbo code was first introduced as TCC in the literature,
the two names, turbo code and TCC, are not always clearly distinguished.
Ait Sab, et al., with Alcatel, demonstrated a coding gain of 10 dB using the TPC BCH
(128, 113) code with 28% overhead in a 10 Gbps 7800 km WDM transmission system
with simulations based on the Gaussian channel assumption [22], [23]. Taga, et al., with
KDD, experimentally demonstrated that concatenated RS (239, 223)/RS (255, 239) with
iterative decoding yields 2 dB of extra coding gain, compared to the RS (255, 239) code,
in a 10 Gbps 10 Mm WDM transmission experiment [24].
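The overhead figures quoted in this survey can be checked with simple arithmetic. The sketch below assumes that overhead is defined as n/k − 1 and that the rate expansions of concatenated (or product) constituent codes multiply. It also assumes that both constituent codes of the TPC are BCH (128, 113), which the text does not state explicitly. The helper name `overhead` is hypothetical.

```python
def overhead(*codes):
    """Overhead (redundancy relative to payload) of concatenated (n, k) codes."""
    expansion = 1.0
    for n, k in codes:
        expansion *= n / k      # each constituent code expands the stream by n / k
    return expansion - 1.0

print(f"RS(255,239):                  {overhead((255, 239)):.1%}")
print(f"RS(255,239) + rate-1/2 conv.: {overhead((255, 239), (2, 1)):.1%}")
print(f"RS(255,223)/RS(255,239):      {overhead((255, 223), (255, 239)):.1%}")
print(f"BCH(128,113) x BCH(128,113):  {overhead((128, 113), (128, 113)):.1%}")
```

Under these assumptions the second and fourth lines come out close to the 113% and 28% overheads cited above.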
We suggest the development of a third generation code based on TPC instead of TCC.
The reason will be made clear in Chapter 2 with the results of our studies on the upper
performance bound for linear codes in optical fiber communications.
1.3.2 Survey on line-coding in optical fiber communications
Research on line-coding in optical fiber communications started in the early 80s, and
line-coding was a beneficial technique throughout the 80s and early 90s. Although it has a longer
history than FEC coding, research on line-coding has not been as popular as research on
FEC techniques in recent years. The reason will follow from a later discussion of the de-
velopment of line-coding in optical fiber communications.
There are two major objectives in previous line-coding applications in optical fiber
communications. The first was to transmit adequate timing information to allow for
proper operation of clock recovery circuitry, and, in the meantime, to keep low frequency
content small to allow for ac coupling in the receiver [30]–[39]. In this kind of applica-
tion, transition density and transmission balance are two critical criteria used for the line-
code design [31], [53], [54]. Transition density refers to the frequency of the signal level
transitions in the encoded data sequence. High transition density ensures adequate timing
information and, thus, easy clock recovery. A transmission is balanced if there are an
equal number of marks and spaces, thus small low frequency content, in the encoded data
sequence.
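The two criteria defined above are easy to measure on a bit sequence. The following sketch is illustrative only: the function names are hypothetical, the running digital sum is used here as one possible measure of balance, and the Manchester mapping (1 → 10, 0 → 01) is one common convention chosen for the example.

```python
def transition_density(bits):
    """Fraction of adjacent bit pairs that differ (level transitions)."""
    if len(bits) < 2:
        return 0.0
    changes = sum(a != b for a, b in zip(bits, bits[1:]))
    return changes / (len(bits) - 1)

def mark_fraction(bits):
    """Fraction of marks; 0.5 means a balanced sequence."""
    return sum(bits) / len(bits)

def max_running_digital_sum(bits):
    """Largest |#marks - #spaces| over all prefixes (low-frequency content proxy)."""
    rds, worst = 0, 0
    for b in bits:
        rds += 1 if b else -1
        worst = max(worst, abs(rds))
    return worst

raw = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# Manchester-style line code (assumed convention): 1 -> (1, 0), 0 -> (0, 1).
manchester = [half for b in raw for half in ((1, 0) if b else (0, 1))]

for name, seq in (("raw", raw), ("manchester", manchester)):
    print(name, transition_density(seq), mark_fraction(seq), max_running_digital_sum(seq))
```

The raw sequence is balanced overall but has long runs, so its transition density is low and its running digital sum drifts; the encoded sequence trades 100% overhead for frequent transitions and a bounded running sum.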
In 1983, Takasaki, et al., proposed two-level alternate-mark-inversion (AMI) line-
coding for optical fiber communications. In [31]–[33], several different alphabetic block
line-codes implemented with look-up tables to ensure high transition density and bal-
anced transmission were proposed. In [34]–[36], several block line-codes were imple-
mented in GHz systems by using coders in parallel. In [37], Krzymien proposed a new
class of binary, nonalphabetic, balanced line-codes with m-bit data word and (m+1)-bit
codeword that requires small overhead and is thus efficient for high bit rate optical fiber
systems. In [38], Fair, et al., developed a guided scrambling approach for line-coding, in
which the current scrambling process depends on feedback from the previous encoded
output data sequence. It can be implemented in Gbps transmission systems with its sim-
ple scrambler-like structure and provides balanced transmission with high transition den-
sity. We can see that the key issue in this kind of line-coding application is the imple-
mentation in high bit rate optical fiber communications. No particular impairments in op-
tical fiber transmission lines were involved in the line-code design.
The other objective of line-coding applied to optical fiber communications, however,
does relate to a particular physical effect –– the non-flat laser frequency modulation (FM)
response. The overall FM response of the laser diode is due to the combined thermal and
carrier effect, which may either produce a “dip” or an enhanced response at low frequen-
cies [46]. This effect is referred to as the non-flat FM response. The non-flat FM response
of conventional distributed feedback (DFB) laser diodes (LD) is a major problem in
coherent optical frequency-shift-keying (FSK) systems [58], [59]. It causes
data-pattern-dependent performance degradation, which is the kind of impairment that
line-coding can help mitigate. Hence, extensive studies on using line-coding schemes,
including AMI, Manchester code, and delay modulation, to counteract the non-flat laser
FM response, were carried out during the late 80s and early 90s when coherent systems
were a hot topic in optical fiber communications [40]–[46].
However, as we know, the successful development of erbium-doped fiber amplifiers
(EDFA) in the early 90s has allowed significant increases in the sensitivity of intensity-
modulation direct-detection (IM-DD) systems, which overshadowed the high sensitivity
advantage of coherent systems. Thus, line-coding research temporarily lost its justifica-
tion in optical fiber communications.
1.4 Motivation of our research
From the survey in the previous section, we can see that coding technology is impor-
tant and practical in optical fiber communications. It has been responsible for some of the
progress in optical fiber communications. However, we also have the following impres-
sions regarding the research in this field.
The previous studies are mostly based on standard FEC codes and line-coding
schemes, for example Hamming codes, RS codes, AMI codes, and Manchester codes,
which were initially developed in wireless communications or older communications
systems. Moreover, for the theoretical studies of FEC codes in optical fiber communica-
tions, the channel models mostly used for optical fiber channels assume a binary sym-
metric channel with hard-decisioning [21], or assume AWGN or a Gaussian channel with
soft-decisioning [20]–[23]. Current line-coding schemes that have been applied in optical
fiber communications all use the conventional transition density and transmission bal-
ance as the performance criteria for the encoded data sequence [30]–[46]. There has been
little effort to optimize the choice of codes and design new codes by taking into account
the physical mechanisms behind the particular impairments in optical fiber transmission
lines and systems. By contrast, the goal of our research is to analyze and design FEC
codes and line-coding schemes by taking into account the particular physical impair-
ments in optical fiber transmission systems.
Specifically, we note that the most extensive results for FEC codes and their perform-
ance are based on BSC and AWGN channels. Many practical channels have been mod-
eled via the BSC and AWGN channels, e.g., the deep-space and satellite channels, the
telephone network, and, more recently, the subchannels of the ADSL system, after ap-
propriate equalization [60], [61]. However, from Sec. 1.2, we see that ASE noise, a major
source of random errors in optically amplified fiber communications systems, has non-
Gaussian and asymmetric distributions. This situation leaves a wide research opportunity
for improving the performances of FEC codes by taking into account the more accurate
noise statistics of optical fiber channels [25]–[27].
A question may be raised at this point. As mentioned in the previous section, the theo-
retical and simulation results of the RS code performance in [21], using BSC and AWGN
assumptions, agree well with the experimental measurements: Does this not mean that
BSC and AWGN are good approximations for optical fiber channels? The answer is no.
The reason is that RS codes or any other FEC codes using hard-decision algebraic de-
coding are not sensitive to the exact noise statistics. Because the a priori knowledge of
the channel noise statistics is not used in algebraic decoding [55], [68], as long as the
channel model assumption gives a good estimate of the uncoded BER, it also gives a
good estimate of the algebraic block coded BER [55], [68].
By contrast, a priori knowledge of the channel noise statistics is essential for FEC
codes that use more sophisticated decoding algorithms such as the Viterbi algorithm
(maximum likelihood), the BCJR algorithm (maximum a posteriori probability), and the
sum-product algorithm (approximate maximum a posteriori probability). One measure of the progress in FEC
codes is to see how close the codes approach the Shannon limit. Shannon’s noisy channel
coding theorem [62] states that there exists a code, with a code rate r not exceeding the
channel capacity C, that can achieve arbitrarily small probability of error. The channel
capacity C is defined as the maximum mutual information that can be transmitted over
the physical channel. It is a function of the probability density function of the noisy sig-
nals after transmission. Thus, to approach the Shannon limit, the code and decoder
should utilize the channel noise statistics as much as possible.
This trend can be seen in the progress of FEC codes. From algebraic codes, including
Hamming codes and RS codes, to convolutional codes with hard-decision Viterbi de-
coding, then to convolutional codes with soft-decision Viterbi decoding, and further to
turbo codes with soft-decision iterative MAP decoding, code performance has ap-
proached closer and closer to the Shannon limit. This improvement has occurred because
more and more information on channel noise statistics is incorporated into the decoding
algorithms. We observed the same steps in the progression of the three generations of
FEC codes in optical fiber communications as shown in the previous section.
This historical observation is the motivation for our research on the effect of ASE
noise statistics on FEC code performance, including the study of the Shannon limit for
general FEC codes, the upper performance bound for linear codes, and the performance
improvement for turbo codes in non-Gaussian asymmetric optical fiber channels. A basic
question that we must answer in our FEC research is: do the non-Gaussian asymmetric
statistics of the ASE noise, compared to the Gaussian symmetric noise approximations,
make enough of a difference in the FEC studies to justify the effort of including
more accurate noise statistics in the analysis and design of FEC codes?
For the line-coding research, the motivation is more direct. The nonlinear inter-channel
interference becomes the main source of errors in optical WDM transmission systems
when signal intensity is high, as is the case in optical soliton transmission systems. We
note that the nonlinear inter-channel interference causes correlated errors that are highly
dependent on the data patterns in WDM channels, for which line-coding is supposed to
be a direct solution. Our goal is to develop a line-coding scheme based on an under-
standing of how the data patterns affect the impairments induced by inter-channel inter-
ference. We need to determine in the line-coding research whether one can effectively
use line-coding to solve the nonlinear inter-channel interference problem.
The basic idea, which is the theme throughout this dissertation, is to analyze and de-
sign FEC codes and line-codes by taking into account the particular physical characteris-
tics and mechanisms in optical fiber transmission lines.
1.5 Dissertation organization
There are 5 chapters in this dissertation. Chapter 1 gives the introduction to the prob-
lems discussed in this dissertation, the survey on previous related work that has been
done, the motivation for our research, and the outline of the dissertation. Chapter 2 de-
scribes the physical dynamics behind the two sources of impairment of concern in this
dissertation –– the ASE noise from optical amplifiers and soliton-soliton collisions in
WDM soliton systems –– and describes the construction of corresponding theoretical
models to facilitate further discussions about FEC and line-coding solutions. The major
results of our research are reported in Chapters 3 and 4.
Chapter 3 is dedicated to the performance evaluation and improvement of FEC codes
in optical fiber transmission systems with dominant ASE noise. We do a three-level study
of FEC codes for correcting ASE-induced errors, using more accurate ASE noise statis-
tics: the Shannon limit for general FEC codes, the upper performance bound for linear
codes, and the performance improvement of turbo codes. The results are presented in three
sections. In Sec. 3.1, we evaluate the lower performance bound, i.e., the Shannon limit,
for optical fiber channels with dominant ASE noise. The Shannon limit is the very basic
lower performance bound for all FEC codes. In Sec. 3.2, we derive the theoretical upper
performance bound for all linear FEC codes in optical fiber channels. This upper bound is
based on the union bound, which is a tight bound at low BER. We note that the code
performance at very low BER (≤ 10⁻¹¹) is the major concern in optical fiber communications.
Hence, the upper bound that we derive is a useful tool for the analytic evaluation of the
performance of linear FEC codes, a class of codes that includes all three generations of
FEC codes in optical fiber communications. In Sec. 3.3, we modify the BCJR algorithm,
a maximum a posteriori probability (MAP) algorithm, for turbo code decoding according
to the chi-square ASE noise distributions in optical fiber channels and compare the re-
sulting code performance to the one based on the Gaussian noise assumption.
Chapter 4 is dedicated to line-coding for mitigating errors induced by soliton-soliton
collisions (SSC). We introduce the sliding window criterion (SWC) line-coding scheme
for this purpose. We develop two types of SWC codes,
the block SWC code and trellis-based SWC code. We also discuss the concatenation of
the SWC code with the Reed-Solomon (RS) code to achieve the very low BERs required
by optical fiber communications. We compare the simulation performance of the pro-
posed concatenated SWC/RS to the cases of using the RS code alone or using concate-
nated FEC codes without line-coding in correcting SSC-induced errors.
Chapter 5 completes the dissertation with a summary of the work, the conclusions that
we draw, and some suggestions for future research.
Chapter 2
Modeling of amplified spontaneous emission
(ASE) noise and soliton-soliton collisions (SSC) in
optical fiber transmission systems
Our goal in this dissertation is to design better coding schemes for optical fiber com-
munications by taking into account the particular physical impairments in optical fiber
channels discussed in Chapter 1. Hence, understanding the physical impairment mecha-
nisms and, based on that, modeling the physical effects is a critical first step in our re-
search. In this chapter, we focus on the two major physical effects in optical fiber WDM
systems –– the amplified spontaneous emission (ASE) noise from optical amplifiers and
soliton-soliton collisions (SSC).
In optical fiber transmission systems with optical amplifiers, ASE is a major source of
errors. Under low-power operation of the optical fiber channel, ASE is expected to domi-
nate over other sources of error producing impairments such as nonlinearity-induced im-
pairments, even over long transmission distances. Correcting ASE induced errors, there-
fore, is a major objective of forward error correction (FEC) applications. ASE noise can
be characterized as a random variable and, thus, the ASE noise statistics are critical in the
analysis and design of FEC codes and will be discussed in Chapter 3.
In optical soliton WDM systems, SSC becomes a major nonlinear effect causing severe
timing jitter and limiting the achievable transmission bit rate and channel spacing [63],
[64]. Based on the understanding of the SSC physical mechanism, we show that SSC in-
duces correlated errors after optical detection that are highly data-pattern dependent. The
data-pattern dependence of SSC-induced errors leads to the development of a line-coding
scheme, called the sliding window criterion (SWC) code that will be discussed in Chapter
4. The SWC line-coding scheme mitigates SSC-induced errors by reshaping the data
pattern.
In the first two sections of this chapter we discuss the statistics of the ASE noise, and
construct the corresponding models for optical fiber channels with dominant ASE noise.
In the next two sections, we describe the physical mechanism of SSC in WDM systems
and introduce a simplified model for the collision induced timing jitter. Based on this
model, we introduce the main motivation behind the line-coding scheme for mitigating
SSC-induced timing jitter and, hence, for mitigating collision induced errors.
2.1 Statistics of ASE noise and channel models
2.1.1 ASE noise statistics
The probability density function (pdf) of the detected signal I is a function of the en-
ergy E of the transmitted signal as well as the power spectral density N0 of the ASE noise
as described in [48]. The received marks and spaces have different pdfs that are approxi-
mately given by [48]
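$$p_0(I) = \frac{1}{N_0}\,\frac{(I/N_0)^{M-1}\exp(-I/N_0)}{(M-1)!}, \tag{2.1}$$
$$p_1(I) = \frac{1}{N_0}\left(\frac{I}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{I+E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{EI}}{N_0}\right), \tag{2.2}$$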
where M = Bo / Be is the number of modes per polarization state in the received optical
spectrum, Bo and Be are, respectively, the optical bandwidth and the electrical bandwidth
of the system at the detector, and I_{M−1} denotes the modified Bessel function of the
first kind of order M − 1. The mean values and variances of the received marks and spaces
can be derived from the pdfs given in Eqs. (2.1) and (2.2) as µ1 = MN0 + E,
σ1² = MN0² + 2EN0, µ0 = MN0, and σ0² = MN0², respectively [48]. We can also obtain
σ1² = 2(µ1µ0 – µ0²)/M + σ0² from the above formulae for µ1, σ1, µ0, and σ0 [48]. With the
definition of an SNR measure
Q = (µ1 – µ0)/(σ1 + σ0) (2.3)
and the above results, along with signal levels re-defined as I0 = µ0 and I1 = µ1, we can
evaluate (normalized) I1, σ1, I0, and σ0 as functions of the system parameters, Bo, Be, and
Q as
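$$\sigma_0 = \sqrt{\frac{B_o}{B_e}}, \quad \sigma_1 = \sqrt{\frac{B_o}{B_e}} + 2Q, \quad I_0 = \frac{B_o}{B_e}, \quad I_1 = 2Q^2 + 2Q\sqrt{\frac{B_o}{B_e}} + \frac{B_o}{B_e}, \tag{2.4}$$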
where N0 is normalized to 1.
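As a quick consistency check of Eq. (2.4), the normalized levels can be computed from M = Bo/Be and Q and substituted back into Eq. (2.3) and into the variance formula σ1² = MN0² + 2EN0 with N0 = 1. The function name `normalized_levels` is hypothetical, and Q² in dB is read here as 20·log10(Q).

```python
from math import sqrt

def normalized_levels(M, Q):
    """Return (sigma0, sigma1, I0, I1) from Eq. (2.4), with N0 = 1 and M = Bo/Be."""
    sigma0 = sqrt(M)
    sigma1 = sqrt(M) + 2.0 * Q
    I0 = float(M)
    I1 = 2.0 * Q**2 + 2.0 * Q * sqrt(M) + M
    return sigma0, sigma1, I0, I1

M = 3
Q = 10 ** (6.2 / 20)            # Q^2 = 6.2 dB
sigma0, sigma1, I0, I1 = normalized_levels(M, Q)
E = I1 - I0                     # signal energy, since mu1 = M*N0 + E and mu0 = M*N0

print("sigma0, sigma1 =", sigma0, sigma1)
print("I0, I1         =", I0, I1)
print("Q recovered    =", (I1 - I0) / (sigma1 + sigma0))
print("sigma1^2 vs M + 2E:", sigma1**2, M + 2 * E)
```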
We can see that the marks have a noncentral chi-square distribution and the spaces
have a central chi-square distribution; both are asymmetric pdfs with 2M degrees of
freedom [48]. The chi-square distribution is the most accurate theoretical model of the ASE
noise statistics known to date [48], [49].
For simplicity of analytical studies of the ASE noise and the induced error probability,
however, Gaussian pdfs with the same means and variances as the chi-square distribu-
tions are commonly used. The Gaussian approximation is given by
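$$p_0(I) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\!\left[-\frac{(I-I_0)^2}{2\sigma_0^2}\right], \tag{2.5}$$
$$p_1(I) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\!\left[-\frac{(I-I_1)^2}{2\sigma_1^2}\right]. \tag{2.6}$$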
Note that the detected signal I, as shown in Eq. (1.4), is a sum of 2M independent random
variables. From the central limit theorem the Gaussian approximation can be a good
model for both p1(Id) and p0(Id) for large M. But for small M (which is the case for
DWDM systems) and at low Q, the Gaussian distribution is not a good approximation of
the chi-square distribution as shown in Fig. 2.1.
Figure 2.1 plots the chi-square pdfs and the Gaussian approximations of the marks and
spaces in a transmission system with Q² = 6.2 dB and M = 3. It shows that the central
chi-square pdf of the spaces is quite different from the Gaussian approximation even in the
central part of the pdfs. The difference between the pdfs of the marks, although not as
significant as that between the pdfs of the spaces, is clearly observed. Because the optical
detector is a square-law device and thus always outputs positive electrical voltage, the
probability of a negative signal is zero. The chi-square pdfs have zero probability density
for a signal voltage less than zero. The Gaussian approximation loses this non-negative
signal property. Thus, using the Gaussian approximation in the analysis and design of
FEC codes may cause poor estimation and significant degradation of the code perform-
ances as will be shown in Chapter 3.
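Figure 2.1: Comparison of the chi-square distribution and the Gaussian approximation for M = 3, Q² = 6.2 dB. [Plot: pdfs of the spaces and marks versus detected signal (I); solid: chi-square, dashed: Gaussian.]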
Figure 2.1 also clearly shows the asymmetric distribution of the marks and spaces with
ASE noise. For both the chi-square pdfs and the Gaussian pdfs, the variance of the marks
is much larger than that of the spaces. The difference between the variances comes from
the signal/noise beat term in the expansion of Eq. (1.4).
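The comparison in this subsection can be reproduced numerically. The sketch below evaluates the chi-square pdfs of Eqs. (2.1) and (2.2) and Gaussian pdfs with the same means and variances, with N0 normalized to 1. It is a minimal illustration under stated assumptions: the function names are hypothetical, the modified Bessel function is summed from its power series (adequate for the moderate arguments used here), and the integrals are simple trapezoid sums.

```python
from math import exp, factorial, pi, sqrt

def bessel_i(order, x, max_terms=300):
    """Modified Bessel function of the first kind, integer order >= 0 (power series)."""
    half = x / 2.0
    term = half**order / factorial(order)      # k = 0 term
    total = term
    for k in range(1, max_terms):
        term *= half * half / (k * (k + order))
        total += term
        if term <= 1e-17 * total:              # remaining terms are negligible
            break
    return total

def p0_chi(I, M, N0=1.0):
    """Central chi-square pdf of the spaces, Eq. (2.1); zero for negative I."""
    if I < 0:
        return 0.0
    return (I / N0) ** (M - 1) * exp(-I / N0) / (N0 * factorial(M - 1))

def p1_chi(I, M, E, N0=1.0):
    """Noncentral chi-square pdf of the marks, Eq. (2.2); zero for negative I."""
    if I < 0:
        return 0.0
    return ((I / E) ** ((M - 1) / 2) * exp(-(I + E) / N0)
            * bessel_i(M - 1, 2.0 * sqrt(E * I) / N0) / N0)

def gaussian_pdf(I, mean, sigma):
    """Gaussian pdf with the given mean and standard deviation."""
    return exp(-((I - mean) ** 2) / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)

def trapezoid(func, a, b, n=4000):
    """Composite trapezoid rule on [a, b] with n intervals."""
    h = (b - a) / n
    return h * (0.5 * func(a) + sum(func(a + i * h) for i in range(1, n)) + 0.5 * func(b))

M = 3
Q = 10 ** (6.2 / 20)                   # Q^2 = 6.2 dB
E = 2 * Q**2 + 2 * Q * sqrt(M)         # I1 - I0 from Eq. (2.4)
mu0, s0 = M, sqrt(M)
mu1, s1 = M + E, sqrt(M + 2 * E)

area = trapezoid(lambda I: p1_chi(I, M, E), 0.0, 120.0)
mean = trapezoid(lambda I: I * p1_chi(I, M, E), 0.0, 120.0)
print("marks pdf: area =", area, " mean =", mean, " expected mean =", mu1)

for I in (-1.0, 1.0, 3.0, 10.0, 20.0):
    print(I, p0_chi(I, M), gaussian_pdf(I, mu0, s0),
          p1_chi(I, M, E), gaussian_pdf(I, mu1, s1))
```

At negative I the chi-square pdfs vanish while the Gaussian pdfs do not, which is the non-negativity property discussed above.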
2.1.2 Channel models for optical fiber channels with dominant ASE noise
As discussed in the previous section, the statistics of ASE noise can be described by
the chi-square or Gaussian distributions. The Gaussian distribution is in fact an
approximation of the chi-square distribution [48], [49]; in other words, the chi-square
distribution describes the ASE noise more accurately. On the other hand, we can see that the
chi-square distribution has a more complex formula than the Gaussian approximation [Eq.
(2.1) and (2.2) vs. Eq. (2.5) and (2.6)]. Based on the two distributions, we can introduce
different channel models for optical fiber channels with dominant ASE noise.
Optical fiber channels can be characterized as binary-in binary-out (BIBO) channels or
binary-in soft-out (BISO) channels for hard-decision and soft-decision cases, respec-
tively. For the soft-decision case, we can use two models for optical fiber channels, the
chi-square and Gaussian models. For the hard-decision case, we introduce three channel
models: the chi-square binary asymmetric channel (BAC), the Gaussian BAC, and the
Gaussian binary symmetric channel (BSC).
The general BIBO channel model is shown in Fig. 2.2, where f ≡ Pr(1 | 0) and m ≡ Pr(0 | 1)
are the two transition probabilities at the detector output after thresholding. If f = m, we
have a BSC, and when f ≠ m we have a BAC [55].
[Figure 2.2 diagram: input 0 goes to output 0 with probability 1 − f and to output 1 with
probability f; input 1 goes to output 1 with probability 1 − m and to output 0 with probability m.]
Figure 2.2: Binary-in binary-out (BIBO) channel model.
In the hard-decision case, we need to find the optimal hard-decision threshold Iopt.
With the chi-square and Gaussian pdfs given in Eqs. (2.1), (2.2), (2.5), and (2.6), we can
derive Iopt for each of them, obtaining
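$$\left(\frac{I_{\mathrm{opt}}}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{E I_{\mathrm{opt}}}}{N_0}\right) = \frac{(I_{\mathrm{opt}}/N_0)^{M-1}}{(M-1)!}, \tag{2.7}$$
$$\left(\frac{I_1 - I_{\mathrm{opt}}}{\sigma_1}\right)^2 + \ln\sigma_1^2 = \left(\frac{I_{\mathrm{opt}} - I_0}{\sigma_0}\right)^2 + \ln\sigma_0^2, \tag{2.8}$$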
respectively.
We can see that for the chi-square distribution there is no closed-form formula for
evaluating the optimal hard-decision threshold and, thus, no closed-form formula for the
evaluation of the corresponding channel transition probabilities and detector BERs. For
the Gaussian approximation case, the solution is quite complex. Hence, in addition to the
Gaussian approximation of the ASE noise distribution, the hard-decision threshold is
customarily set so that the two transition probabilities, f and m, are equal, which implies a
BSC assumption. Then, the hard-decision threshold is given by [49]
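$$I_{\mathrm{th}} = \frac{\sigma_0 I_1 + \sigma_1 I_0}{\sigma_0 + \sigma_1}, \tag{2.9}$$
and the detector BER, pe, is given by
$$p_e = f = m = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{Q}{\sqrt{2}}\right). \tag{2.10}$$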
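The equal-transition-probability property of the threshold in Eq. (2.9) can be verified directly. The sketch below is a small numerical check under the Gaussian approximation, using the normalized levels of Eq. (2.4) with N0 = 1; the function name `bsc_threshold` is hypothetical.

```python
from math import erfc, sqrt

def bsc_threshold(I0, I1, sigma0, sigma1):
    """Threshold of Eq. (2.9) and the two Gaussian transition probabilities."""
    I_th = (sigma0 * I1 + sigma1 * I0) / (sigma0 + sigma1)
    f = 0.5 * erfc((I_th - I0) / (sqrt(2) * sigma0))   # Pr(1 | 0)
    m = 0.5 * erfc((I1 - I_th) / (sqrt(2) * sigma1))   # Pr(0 | 1)
    return I_th, f, m

M = 3
Q = 10 ** (6.2 / 20)                                   # Q^2 = 6.2 dB
sigma0, sigma1 = sqrt(M), sqrt(M) + 2 * Q
I0, I1 = float(M), M + 2 * Q * sqrt(M) + 2 * Q**2

I_th, f, m = bsc_threshold(I0, I1, sigma0, sigma1)
p_e = 0.5 * erfc(Q / sqrt(2))                          # Eq. (2.10)
print("I_th =", I_th, " f =", f, " m =", m, " p_e =", p_e)
```

Both normalized distances, (Ith − I0)/σ0 and (I1 − Ith)/σ1, reduce to Q, which is why f and m coincide with Eq. (2.10).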
We note that, for ASE noise, spaces and marks have different variances (σ1 > σ0).
Thus, in the optimal hard-decision case (i.e., minimum BER condition), this in fact gives
rise to a binary asymmetric channel with different transition probabilities f and m. For the
chi-square distribution we do not have closed-form formulae for f and m, but they can be
evaluated numerically using
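$$f = \int_{I_{\mathrm{opt}}}^{\infty} \frac{1}{N_0}\,\frac{(I/N_0)^{M-1}\exp(-I/N_0)}{(M-1)!}\,dI, \tag{2.11}$$
$$m = \int_{-\infty}^{I_{\mathrm{opt}}} \frac{1}{N_0}\left(\frac{I}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{I+E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{EI}}{N_0}\right) dI. \tag{2.12}$$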
With the Gaussian approximation we have
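$$f = \frac{1}{2}\,\mathrm{erfc}\!\left(I_{\mathrm{opt}}\sqrt{\frac{B_e}{2B_o}}\right), \tag{2.13}$$
$$m = 1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{I_{\mathrm{opt}} - 2Q^2 - 2Q\sqrt{B_o/B_e}}{2\sqrt{2}\,Q + \sqrt{2B_o/B_e}}\right), \tag{2.14}$$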
where the signal levels have been offset by I0.
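For the Gaussian BAC, the optimal threshold is the point between I0 and I1 where the two Gaussian pdfs are equal, which is the condition expressed by Eq. (2.8). The sketch below finds it by bisection and then evaluates f and m. It is an illustration under stated assumptions: the function name `gaussian_bac` is hypothetical, the levels are not offset by I0 (unlike Eqs. (2.13) and (2.14)), and equiprobable marks and spaces are assumed when averaging the two error probabilities.

```python
from math import erfc, log, sqrt

def gaussian_bac(I0, I1, sigma0, sigma1, iterations=200):
    """Solve Eq. (2.8) for I_opt by bisection on [I0, I1]; return (I_opt, f, m)."""
    def g(I):
        # 2*ln(p1/p0) up to a common constant: negative near I0, positive near I1.
        return (((I - I0) / sigma0) ** 2 + log(sigma0**2)
                - ((I1 - I) / sigma1) ** 2 - log(sigma1**2))

    lo, hi = I0, I1
    if not (g(lo) < 0.0 < g(hi)):
        raise ValueError("no pdf crossing between I0 and I1")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    I_opt = 0.5 * (lo + hi)
    f = 0.5 * erfc((I_opt - I0) / (sqrt(2) * sigma0))   # Pr(1 | 0)
    m = 0.5 * erfc((I1 - I_opt) / (sqrt(2) * sigma1))   # Pr(0 | 1) = 1 - erfc(-x)/2
    return I_opt, f, m

M = 3
Q = 10 ** (6.2 / 20)                                    # Q^2 = 6.2 dB
sigma0, sigma1 = sqrt(M), sqrt(M) + 2 * Q
I0, I1 = float(M), M + 2 * Q * sqrt(M) + 2 * Q**2

I_opt, f, m = gaussian_bac(I0, I1, sigma0, sigma1)
ber_bac = 0.5 * (f + m)                                 # equiprobable marks and spaces
ber_bsc = 0.5 * erfc(Q / sqrt(2))                       # Eq. (2.10)
print("I_opt =", I_opt, " f =", f, " m =", m)
print("average BER (BAC) =", ber_bac, " BER (BSC) =", ber_bsc)
```

Because σ1 > σ0, the crossing point lies closer to the marks than the equal-error threshold of Eq. (2.9), so m exceeds f while the average error probability is no larger than in the BSC case.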
From the above discussions, we have three channel models for optical fiber channels
with dominant ASE noise: chi-square BAC [Eqs. (2.7), (2.11), (2.12)], Gaussian BAC
[Eqs. (2.8), (2.13), and (2.14)], and Gaussian BSC [Eqs. (2.9), (2.10)]. The hard-decision
thresholds evaluated with Eqs. (2.7)–(2.9) and the corresponding detector BER (without
coding) are plotted in Figs. 2.3 and 2.4, respectively.
Figure 2.3 plots the optimal hard-decision thresholds corresponding to the chi-square
BAC, Gaussian BAC, and Gaussian BSC models in a transmission system with Q² = 6.2
dB and M = 3. It shows that the resulting hard-decision thresholds are clearly at different
positions. Hence, compared to the chi-square BAC model, which is the most accurate of
the three models, both the Gaussian BAC and Gaussian BSC models cause suboptimal
hard decisions. The suboptimal hard decisions will lead to poor estimates of the detector
BER as shown in Fig. 2.4.
[Figure 2.3 plot: pdfs of the spaces (dashed) and marks (solid) versus detected signal (I), with the Gaussian BSC, Gaussian BAC, and chi-square BAC thresholds marked.]
Figure 2.3: Comparison of the hard-decision thresholds based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 and Q² = 6.2 dB.
[Figure 2.4 plot: log10(BER) versus Q² (dB); dashed: Gaussian BSC, dotted: Gaussian BAC, solid: chi-square BAC.]
Figure 2.4: Comparison of the detected BERs as a function of Q, based on the chi-square distribution, Gaussian approximation, and Gaussian + BSC approximations, for M = 3.
From Fig. 2.4, we can see that, compared to the chi-square BAC model, the Gaussian
BSC model always gives a higher BER than does the optimal hard-decision case, i.e., it
overestimates the detector BER. By contrast, the Gaussian BAC model underestimates
the BER at low Q and overestimates the BER at high Q. Nevertheless, the resulting BERs
are not significantly different; in other words, both the Gaussian BAC and Gaussian BSC
models work well in evaluating the detector BER without coding.
However, we will show in Chapter 3 that, although the non-optimal hard-decision
thresholds do not cause a significant difference in the evaluation of BER without coding,
the Gaussian BSC approximation may lead to a poor estimate of the Shannon limit of the
code performance if we use FEC coding. Moreover, if we incorporate the Gaussian BSC
approximation in the FEC code design, it may cause significant code performance degra-
dation. This issue will be discussed in Chapter 3 where we will use specific codes as ex-
amples. Even with only the Gaussian approximation, we will show in Chapter 3 that, in
the soft-decision and decoding case, a poor estimate and significant degradation of FEC
code performance may result.
The channel capacity is a function of the probability density functions of the noisy sig-
nals after transmission, as mentioned in Chapter 1. In the hard-decision case, the channel
capacity is a function of the transition probabilities. Hence, the resulting transition prob-
abilities for the different channel models are critical parameters in the analysis and design
of FEC codes.
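To make the dependence of the hard-decision capacity on the transition probabilities concrete, the following sketch computes the capacity of a binary asymmetric channel by maximizing the mutual information over the input distribution. It is a minimal illustration, not the evaluation carried out in Chapter 3: the function names are hypothetical, and the ternary search relies on the mutual information being concave in the input probability.

```python
from math import log2

def h2(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def mutual_information(q, f, m):
    """I(X;Y) in bits for Pr(X = 1) = q, f = Pr(1 | 0), m = Pr(0 | 1)."""
    y1 = (1 - q) * f + q * (1 - m)            # Pr(Y = 1)
    return h2(y1) - (1 - q) * h2(f) - q * h2(m)

def bac_capacity(f, m, iterations=200):
    """Capacity and maximizing Pr(X = 1), found by ternary search over q."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if mutual_information(a, f, m) < mutual_information(b, f, m):
            lo = a
        else:
            hi = b
    q = 0.5 * (lo + hi)
    return mutual_information(q, f, m), q

print("BSC, f = m = 0.02:      ", bac_capacity(0.02, 0.02))
print("BAC, f = 0.01, m = 0.03:", bac_capacity(0.01, 0.03))
```

For f = m the result reduces to the familiar BSC capacity 1 − H2(p) with equiprobable inputs; for f ≠ m the maximizing input distribution is no longer uniform.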
As shown in Fig. 2.5, the transition probabilities evaluated with Eqs. (2.10)–(2.14) can be
significantly different. Figure 2.5a plots the transition probabilities, m = p(0|1) and f =
p(1|0), as functions of Q for the three channel models. The two transition probability
curves overlap in the Gaussian BSC case, indicating equal f and m as expected for BSC.
Comparing the transition probability curves, f(Q) and m(Q), in the two BAC models, we
can see that the f(Q) and m(Q) curves are farther separated from each other in the Gaus-
sian BAC than in the chi-square BAC. This implies different degrees of asymmetry of the
two channel models.
To clearly show the asymmetry characteristics of the three models, we define the
transition probability ratio as m/f and plot its logarithm, log10(m/f), as a function of Q
in Fig. 2.5b. The farther log10(m/f) is from zero (in either the
positive or negative direction), the more asymmetric the channel. Figure 2.5b shows that,
compared to the chi-square BAC model, the Gaussian BSC model totally disregards the
asymmetry of the ASE noise distributions, while the Gaussian BAC model overempha-
sizes the ASE-induced channel asymmetry.
[Figure 2.5 panels: (a) transition probabilities, m = p(0|1) (solid) and f = p(1|0) (dashed), versus Q² (dB); (b) transition probability ratios, log10(m/f), versus Q² (dB). Curves are shown for the Gaussian BSC, Gaussian BAC, and chi-square BAC models.]
Figure 2.5: Comparison of the transition probabilities based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 as functions of Q².
2.2 Physical mechanism of SSC and simplified model for SSC-induced timing jitter
The traditional optical soliton is an optical pulse that can propagate undistorted in dis-
persive nonlinear optical fiber under specific pulse power and pulse shape conditions
[16]. Figure 2.6 depicts the basic optical soliton transmission system. The transmitted
data stream is implemented with a stream of optical pulses indicating the marks. At the
receiver, if an optical pulse is detected in the middle of a receiving time slot with duration
T, a mark is received. Conversely, the absence of an optical pulse in the time slot is inter-
preted as a space received. We consider only binary data sequences in studying optical
soliton transmissions.
Figure 2.6: Optical soliton transmission. (Diagram: a binary data stream is carried from the transmitter end to the receiver end along the fiber path z as a soliton stream in Channel 1 at frequency f1, with one timing slot of duration T per bit.)
2.2.1 Physical mechanism of SSC
In our studies of SSC-induced impairments, our major concern is timing jitter defined
as the random deviation of the optical pulse position from its nominal location at the time
slot center [16]. Timing jitter causes sub-optimal detectability of a pulse and inter-symbol
interference and, thus, limits both the bit rate and the transmission distance in soliton
transmission systems. In WDM communications, timing jitter also limits the channel
spacing and, thus, system spectral efficiency.
SSC results from collisions among solitons in WDM systems that belong to different channels and therefore have different group velocities. To understand the physical mechanism of SSC, consider the nonlinear Schrödinger equation (NLS) [16],

$$ i\frac{\partial u}{\partial z} + \frac{1}{2}\,p(z)\,\frac{\partial^2 u}{\partial t^2} + |u|^2 u = -i\,\frac{\Gamma}{2}\,u, \tag{2.15} $$

where p(z) is the normalized group velocity dispersion as a function of transmission distance z. Because of the dependence of the second term on transmission distance z, Eq. (2.15) is not the standard NLS. However, it can be transformed into a perturbed NLS by defining u′ through u ≡ u′ exp(–Γz/2) and z′ ≡ ∫₀ᶻ p(z) dz. In the transformed variables, Eq. (2.15) becomes

$$ i\frac{\partial u'}{\partial z'} + \frac{1}{2}\,\frac{\partial^2 u'}{\partial t^2} + b\,|u'|^2 u' = 0, \tag{2.16} $$

where b(z) = exp(–Γz) / p(z).
The effect of SSC on the performance of WDM systems can be demonstrated by con-
sidering the simplest case of two WDM channels in the NLS equation.
Complete SSC
In an optical fiber with constant chromatic dispersion and negligible losses, the nonlinear interaction (collision) of two solitons having angular frequencies ±Ω induces a frequency shift on each soliton that approximately equals [63]

$$ \delta\Omega = \frac{1}{2\Omega}\int_{-\infty}^{\infty} \mathrm{sech}^2(t - \Omega z)\,\mathrm{sech}^2(t + \Omega z)\,dt = \frac{2}{\Omega}\,\frac{2\Omega z\cosh(2\Omega z) - \sinh(2\Omega z)}{\sinh^3(2\Omega z)}, \tag{2.17} $$
where the angular frequency Ω = 1.763 radians/τ, and τ is the full width (in the time do-
main) at half magnitude (FWHM) of the optical pulse intensity. The angular frequencies
of the two solitons change by the same amount but in opposite directions. Given chro-
matic dispersion, the propagation speed of a soliton changes with its frequency. Hence,
the SSC-induced frequency shift leads to a velocity shift. As shown in [63], a collision
speeds up the faster soliton and slows down the slower one. From Eq. (2.17), note that
δ Ω max = 2/(3Ω), and that δ Ω returns to zero after the collision is completed. Similarly,
the velocity shift returns to zero after the collision is completed and, thus, the net result is
a time shift of each soliton from the pulse center. This kind of soliton-soliton collision is
called a complete collision.
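As a numerical illustration of Eq. (2.17), the following Python sketch evaluates the closed-form frequency shift and compares it with a direct trapezoidal evaluation of the overlap integral. The function names, the integration range, and the step count are our own illustrative choices and are not taken from [63]; Ω and z are assumed to be in the normalized soliton units of Eq. (2.17).

```python
import math


def sech2(x: float) -> float:
    """Squared hyperbolic secant; returns 0 where cosh would overflow."""
    if abs(x) > 350.0:
        return 0.0
    return 1.0 / math.cosh(x) ** 2


def delta_omega_closed_form(z: float, omega: float) -> float:
    """Closed-form right-hand side of Eq. (2.17), normalized units."""
    a = 2.0 * omega * z
    if abs(a) < 1e-4:
        # Series limit of (a*cosh(a) - sinh(a)) / sinh(a)**3 near a = 0.
        ratio = 1.0 / 3.0 - 2.0 * a * a / 15.0
    elif abs(a) > 200.0:
        ratio = 0.0  # the two pulses no longer overlap
    else:
        ratio = (a * math.cosh(a) - math.sinh(a)) / math.sinh(a) ** 3
    return 2.0 * ratio / omega


def delta_omega_numerical(z: float, omega: float, steps: int = 6000) -> float:
    """Trapezoidal evaluation of the overlap integral in Eq. (2.17)."""
    half_width = abs(omega * z) + 30.0  # assumed truncation of the time axis
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        t = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * sech2(t - omega * z) * sech2(t + omega * z)
    return total * h / (2.0 * omega)
```

Evaluating `delta_omega_closed_form(0.0, omega)` reproduces the peak value 2/(3Ω) quoted in the text, and the shift decays to zero on both sides of the collision center, which is the behavior that characterizes a complete collision.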
Figures 2.7a and 2.7b depict the soliton velocity changes and corresponding accelera-
tion changes (derivative of velocity changes) during the collisions occurring in optical
fibers with uniform dispersion and optical fibers with dispersion management, respec-
tively. Figure 2.7a shows the symmetric characteristics of the soliton acceleration change
caused by SSC in optical fibers with uniform dispersion. Thus, the soliton speed in-
creases during the first half and decreases in the second half of the collision duration.
Hence, after the SSC, the speeds of the solitons change back to their original speeds be-
fore the collision. Thus, the only net result of the collision is a displacement in time, δt,
of each soliton [63]. The collision retards the slower and advances the faster of the soli-
tons [63].
Figure 2.7: Changes of soliton velocity and acceleration during collision versus normalized distance (z/Lcoll). (a) Complete soliton-soliton collision. (b) Partial soliton-soliton collision, shown together with the step change in dispersion.
Partial SSC
In realistic optical WDM systems, however, the use of lumped amplifiers and optical
fiber dispersion management has the potential to unbalance the SSC and, thus, cause a
partial SSC.
Consider the worst case, in which a collision occurs at a point where there is a step change of the optical fiber dispersion, D, in a dispersion-managed soliton system, as shown in Fig. 2.7b [63]. Although the acceleration has the same absolute peak value for each half of the collision, the duration is different for each half, corresponding to the different alternating values of optical fiber dispersion. Thus, the integral of the acceleration over the entire collision is not zero. This unbalanced acceleration yields a net velocity shift that remains after the collision has been completed. Such velocity shifts, when multiplied by the remaining distances to the end of the system, could easily result in an unacceptably large jitter in pulse arrival times.
With the assumptions of uniform optical fiber dispersion and distributed amplifiers, a
complete SSC can be analytically modeled as described in the following section [63]. On
the other hand, an analytical model for a partial SSC is not available because the indefi-
nite integral corresponding to Eq. (2.17) cannot be written in closed-form. Hence, in our
analytical studies in Chapter 4 of the line-coding performance in mitigating the SSC-
induced timing jitter, we consider only the case of a complete SSC. However, a partial
SSC will be considered during full simulations of a dispersion-managed soliton WDM
system in Chapter 4.
2.2.2 Simplified model for SSC-induced timing jitter
Figure 2.8 describes the collisions within two WDM channels. The rectangular block
shown in Fig. 2.8 is defined as a sliding window that slides along the data sequence bit by
bit. The length of the sliding window is equal to the number of symbols (marks or spaces)
in one channel that may interact with symbols in the other channel along the whole
transmission path; it also represents the maximum number of collisions a soliton may ex-
perience in a 2-channel optical fiber transmission system.
Figure 2.8: Soliton-soliton collision in a two-channel WDM system. The rectangular block is defined as the sliding window.
We first consider a simplified model of SSC in which all collisions are complete colli-
sions to explain the main motivation for our line-coding scheme [63], [64]. Taking only
complete SSCs into account, for two channels with optical frequency difference ∆f, the
simplified model of SSC can be described by the following [63]:
(a) Time shift detected at the receiver induced by each collision is
$$ \delta t \approx \pm\,0.1768\,\frac{1}{\tau\,(\Delta f)^2}, \tag{2.18} $$
where δt (ps) is the time shift, τ (ps) is the FWHM of a soliton pulse, ∆f (THz) is the
channel spacing, 0.1768 is a constant ratio without units, the plus sign indicates the
slowing down of the slower soliton, and the minus sign indicates the speeding up of the
faster soliton.
(b) Full width collision length is
$$ L_{\mathrm{coll}} = \frac{2\tau}{D \cdot \Delta\lambda}, \tag{2.19} $$
where D (ps/nm/km) is the optical fiber chromatic dispersion, ∆λ (nm) is the wavelength
difference between the two channels, and Lcoll (km) is the full width collision length that
refers to the distance between the two positions, corresponding to the beginning and end
of the collision, where the solitons overlap at their half power points [63].
(c) Maximum number of collisions for each soliton along the entire transmission path is
$$ N_{12} = \frac{Z \cdot D \cdot \Delta\lambda}{T}, \tag{2.20} $$
where Z is the transmission distance and T is the bit period at the transmitter.
After each collision, the faster of the two colliding solitons is advanced and the slower
one is delayed with the same absolute value of arrival time shift (Eq. 2.18). Given the
system parameters Z, D, ∆f, T, and τ, we can calculate the collision length Lcoll with Eq.
(2.19). We can also calculate N12, the number of collisions each soliton experiences if
data sequences of “all marks” are transmitted in both channels. Thus, the total time shift
induced by SSC over the entire transmission path is simply the product of the number of
collisions experienced and the time shift associated with each collision (δt).
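The calculation described above can be sketched in a few lines of Python. The function names and the numerical values used below are hypothetical and serve only to illustrate Eqs. (2.18)–(2.20); in particular, ∆f and ∆λ are passed separately and are assumed to describe the same channel spacing.

```python
def collision_time_shift_ps(tau_ps: float, delta_f_thz: float) -> float:
    """Magnitude of the time shift per complete collision, Eq. (2.18)."""
    return 0.1768 / (tau_ps * delta_f_thz ** 2)


def collision_length_km(tau_ps: float, d_ps_nm_km: float,
                        delta_lambda_nm: float) -> float:
    """Full width collision length, Eq. (2.19)."""
    return 2.0 * tau_ps / (d_ps_nm_km * delta_lambda_nm)


def max_collisions(z_km: float, d_ps_nm_km: float,
                   delta_lambda_nm: float, bit_period_ps: float) -> float:
    """Maximum number of collisions along the path, Eq. (2.20)."""
    return z_km * d_ps_nm_km * delta_lambda_nm / bit_period_ps


def total_shift_all_marks_ps(z_km: float, d_ps_nm_km: float,
                             delta_lambda_nm: float, bit_period_ps: float,
                             tau_ps: float, delta_f_thz: float) -> float:
    """Total shift when both channels carry 'all marks': N12 times delta t."""
    n12 = max_collisions(z_km, d_ps_nm_km, delta_lambda_nm, bit_period_ps)
    return n12 * collision_time_shift_ps(tau_ps, delta_f_thz)


if __name__ == "__main__":
    # Hypothetical parameters chosen only to exercise the formulas.
    print(collision_time_shift_ps(20.0, 0.1))
    print(collision_length_km(20.0, 0.5, 0.8))
    print(max_collisions(10000.0, 0.5, 0.8, 100.0))
```

For a data-dependent pattern, the factor N12 would be replaced by the number of marks that the soliton actually meets, which is the quantity that the line-code discussed later tries to keep nearly constant.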
It is straightforward to obtain an equation for the time shift in WDM systems with
more than two wavelength channels by using Eqs. (2.18)–(2.20) for each pair of channels
and then summing the results over all channels. In [63], the SSC-induced time shift for a
soliton in the i-th channel in a WDM system is given by
$$ \delta t_i = \pm\,0.1418\,\frac{Z\,\tau}{z_0\,T}\sum_{j \neq i}\frac{1}{\Delta f_{ij}}, \tag{2.21} $$

where z0 represents the soliton period in distance [93], ∆fij is the frequency difference between the i-th and j-th channels, Nij represents the maximum number of collisions a soliton in the i-th channel may experience with solitons in the j-th channel, and the average number of collisions a soliton in the i-th channel may experience with solitons in the j-th channel is assumed to be Nij/2.
In this complete SSC model, because δt is constant for a given τ and ∆f, the total time
shift of each soliton only depends on the number of collisions as determined by the
transmitted data pattern in the other channels. Thus, if we could make the number of collisions constant for a given channel, we would eliminate the timing jitter. It is not possible, however, to achieve this goal exactly, because a data pattern constrained in this way would carry no information; we show, however, that we can approach this goal. We will
use line-codes to reduce the variation in the number of collisions and, hence, reduce the
timing jitter and BER. As usual with any coding scheme, we will achieve this result by
adding redundancy to the data, but it is done here to reshape the transmitted data pattern
in a way that minimizes the SSC-induced timing jitter errors.
It is obvious that SSC-induced timing jitter is highly correlated from pulse to pulse.
The net time shift of a given pulse from collisions with pulses of another channel is pro-
portional to the number of collisions that it experiences as it traverses the entire optical
fiber path, and that number can change by only ±1 from one pulse to the next. Hence, the
bit errors caused by SSC have bursty characteristics. SSC-induced bit error patterns are
plotted in Fig. 2.9 for two different bit rates, 12 Gbps and 14 Gbps, in a 4-channel 20 Mm
WDM soliton transmission system.
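The pulse-to-pulse correlation described above can be illustrated with a short Python sketch. It is a deliberately simplified model under our own assumptions: the number of collisions seen by each soliton is taken to be the number of marks of the other channel inside a sliding window of fixed length, and the function name and the bit sequence in the usage example are hypothetical.

```python
from typing import List


def collision_counts(other_channel_bits: List[int], window: int) -> List[int]:
    """Count the marks of the other channel inside each window position.

    Entry k is the number of marks in bits[k : k + window], used here as a
    stand-in for the number of collisions experienced by the k-th soliton.
    """
    if window <= 0 or window > len(other_channel_bits):
        raise ValueError("window must be between 1 and the sequence length")
    current = sum(other_channel_bits[:window])
    counts = [current]
    for k in range(window, len(other_channel_bits)):
        # Sliding by one bit adds one symbol and drops one symbol.
        current += other_channel_bits[k] - other_channel_bits[k - window]
        counts.append(current)
    return counts


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]  # hypothetical data pattern
    counts = collision_counts(bits, window=4)
    steps = [b - a for a, b in zip(counts, counts[1:])]
    print(counts, steps)
```

Because each slide of the window adds one symbol and removes one symbol, successive counts differ by at most one, so the time shift drifts slowly from pulse to pulse. An "all marks" pattern gives a constant count and therefore no jitter in this simplified model.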
Figure 2.9: Patterns of SSC-induced bit errors in (a) a middle channel of a 4-channel 12 Gb/s WDM system and (b) a middle channel of a 4-channel 14 Gb/s WDM system.
In the figures, the bit index of a transmitted sequence is plotted as a function of the bit
error index, where the bit index counts the transmitted bits and the error index counts the
detector bit errors, in sequential order, respectively. The figures show that SSC-induced
errors for the two bit rates are burst errors in both cases. A higher bit rate implies longer
burst length and smaller burst spacing. The bursty characteristics of SSC-induced errors may significantly affect the performance of error correction codes and will be taken into account in the performance comparison of coding schemes in the following chapter.
2.3 Summary
In this chapter, we described the statistics of ASE noise and the physical mechanism of
soliton-soliton collision (SSC). We then constructed several different channel models for
optical fiber channels with dominant ASE noise and a simplified model for SSC-induced
timing jitter.
For ASE noise, we discussed two approximations of the ASE noise distributions: the
Gaussian approximation and the binary symmetric channel (BSC) approximation. We
observed that the Gaussian BSC model, which combines both approximations, gives simple closed-form formulae for evaluation of the hard-decision threshold and BER, but at the price of a non-optimal hard decision that yields higher BERs.
Although the chi-square distribution is also an approximation, as mentioned, it is the
most accurate theoretical model known for ASE noise. Hence, in the following studies of
the FEC code performance, we will assume ASE noise with a chi-square distribution.
For the SSC, we described its physical mechanism. We also introduced the concepts of
complete SSC and partial SSC that may be observed in a DMS system. We constructed a
simplified model for SSC-induced timing jitter by considering only complete SSC. This
model shows that the total time-shift for each soliton only depends on the number of col-
lisions, which is determined by the transmitted data pattern in the other channels. This
observation motivates the idea of developing a line-coding scheme that can reduce the
variation in the number of collisions each soliton may experience over the entire optical
fiber path and, thus, mitigate the SSC-induced timing jitter.
We also note that SSC induces time shifts of solitons that are highly correlated from
pulse to pulse and, hence, causes burst errors. The bursty characteristics of the SSC-
induced errors should be taken into account in the development of error correction coding
schemes for these kinds of errors.
Chapter 3
Forward Error Correction (FEC) Codes for Correcting ASE Induced Errors
In both undersea and terrestrial systems, the optical amplifiers are critical components,
and amplified spontaneous emission (ASE) noise in the optical amplifiers is the major
source of noise in optical fiber channels. ASE noise has an asymmetric statistical nature,
and the chi-square distribution model is currently the best theoretical approximation of
the ASE noise statistics. However, for simplicity, the chi-square distributions are usually
approximated with Gaussian distributions having the same means and variances. Moreo-
ver, in the hard-decision case, the binary symmetric channel (BSC) model is widely used
in characterizing optical fiber channels [21], [48], [49]. The BSC model gives a good ap-
proximation of bit error rate (BER) induced by ASE noise at high Q. Although an accu-
rate hard-decision model would be based on a binary asymmetric channel (BAC) [25],
[48], [49] assumption, most existing FEC codes are developed and evaluated with addi-
tive white Gaussian noise (AWGN) or BSC assumptions. Thus, the previous applications
and performance evaluations of FEC codes in optical fiber transmission systems are
mostly based on the Gaussian or BSC approximation with little effort to use a more accu-
rate model of the optical fiber channels.
In this chapter, based on previously discussed ASE noise statistics and optical fiber
channel models, we study the performance of FEC codes at three levels. First, at the
highest level, the study focus is the set of general FEC codes. We evaluate the lower per-
formance bound (Shannon limit) for general FEC codes based on the chi-square BAC, the
Gaussian BAC, and the Gaussian BSC models of optical fiber channels. Second, at the
middle level, the study focus is the set of linear codes, a subset of general FEC codes. We
derive the upper performance bound for linear codes in channels with asymmetric noise
distributions and apply the bound to optical fiber channels with ASE noise. Finally, at the
lowest level, the study focus is the set of turbo codes, a subset of linear codes. We discuss
the effects of different ASE noise models on the performance of the turbo code decoder.
3.1 Lower performance bound for general FEC codes
A fundamental question in FEC code applications is: how much can performance be
improved with these codes, or from an information theoretic standpoint, what is the
Shannon limit for optical fiber channels? Generally, the Shannon limit can be interpreted
as the lowest system BER that can be achieved after FEC decoding for a given FEC code
rate [defined in Eq. (1.5)], system SNR, and channel noise statistics. Previous evaluations
of the Shannon limit in optical fiber communications are based on the Gaussian BSC ap-
proximation with little effort to use a more accurate model for optical fiber channels.
In this section, we investigate the coding performance limit of optical fiber channels
with ASE as the dominant source of noise. The goal is to evaluate the bound on code per-
formance in terms of the decoded BER, Pe(Q, r), where the Q factor is defined in Eq.
(2.3) and r is the code rate. Given a code rate r, Pe is a function of the Q factor. We have
shown that optical fiber channels with dominant ASE noise have a distribution that is
asymmetric, especially at lower values of Q, but the Gaussian BAC and Gaussian BSC
models do not accurately represent the asymmetry. In the following, we evaluate the
lower bound on Pe with the chi-square BAC, the Gaussian BAC, and the Gaussian BSC
models. By comparing the results, we show that both the Gaussian BAC and the Gaussian
BSC models may poorly estimate the maximum coding gain achievable in optical fiber
communications.
To achieve our goal of evaluating the lower BER bound as a function of the Q factor
for a given code rate r, we apply the source-channel coding theorem [65] to the optical
fiber channel. The source-channel coding theorem states that, for a given source and
channel with the source sequence U = (U1, …, Uk), the codeword X = (X1, …, Xn), the
received noisy codeword Y = (Y1, …, Yn), and the decoded sequence V = (V1, …, Vk), the
average cost β, the average distortion δ, and the code rate r = k/n must satisfy r ≤ C(β)/R(δ), where C(β) is the channel capacity (information bits/line symbol) and
R(δ) is the rate-distortion function. With the Q factor as the cost parameter, error prob-
ability as the distortion measure, and the memoryless channel assumption (which is true
for the ASE noise case), the source-channel coding theorem relates code rate r and the Q
factor to Pe by [65]
$$ r \le \frac{C(Q)}{R(P_e)}. \tag{3.2} $$
The above inequality gives an upper bound on the best code rate achievable for a given Q
factor and decoded BER (Pe). It can be illustrated with the diagram shown in Fig. 3.1.
Figure 3.1: Illustration of the source-channel coding theorem.
The source-channel coding theorem can be intuitively partitioned into two concate-
nated procedures, an outer lossy compression procedure and an inner channel coding pro-
cedure [94]. As shown in Fig. 3.1, an error-free transmission can be achieved with chan-
nel coding having a code rate rc not higher than the channel capacity C. Channel capacity,
defined later, is a function of the transition probabilities. For a given noise distribution
and decision threshold, the transition probabilities, f and m, are functions of (I1, I0, σ1, σ0)
and, thus, functions of Q. We can relate the Q factor to the channel code rate rc, therefore,
via the channel capacity by
$$ r_c \le C(Q). \tag{3.3} $$
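[Figure 3.1 block diagram: source with p = P(x = 0), lossy compressor (rs ≤ 1/R(Pe)), channel encoder (rc ≤ C(Q)), optical fiber channel with ASE noise (f, m), channel decoder, lossy decompressor. The channel encoder, optical fiber channel, and channel decoder together form the error-free channel; the overall system has error probability Pe, equivalent transition probabilities (fs, ms), and code rate r = rs rc.]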
We may view the noisy optical fiber channel with channel coder as a virtual error-free
channel; however, error-free transmission is not really required. Given a tolerable system
BER (after decoding), Pe, a lossy compressor with a compression rate
$$ r_s = \frac{\text{length of input data word}}{\text{length of compressed data word}} \tag{3.4} $$
can be used to increase the overall code rate to r = rs rc [65], [94]. The compression rate,
rs, is in fact the reciprocal of the rate distortion function representing the minimum num-
ber of bits needed to be transmitted for each source information bit, given a tolerable
system error probability Pe. Thus we relate Pe to the compression rate by [94]
$$ r_s \le \frac{1}{R(P_e)}. \tag{3.5} $$
We obtain Eq. (3.2) by combining Eqs. (3.3) and (3.5) with r = rs rc.
From the above discussion, we see that the code rate can be factorized into independ-
ent terms, the channel capacity C(Q) and the reciprocal of the rate distortion function
R(Pe). The independence between C(Q) and R(Pe) is visually shown in Fig. 3.1. We can
see that C(Q) depends on the transition probabilities (f and m) of the optical fiber channel,
but does not depend on the source data distribution, p = Pr (x = 0). On the other hand,
R(Pe) depends on the source data distribution, but does not depend on the real optical fi-
ber channel. Given the source data distribution, p = Pr (x = 0), the rate distortion function
R(Pe) can be evaluated. This independence property facilitates the evaluation of the
Shannon limit for non-error-free transmissions. The following gives the details of the
evaluation.
Channel capacity is defined by
$$ C = \max_{p_c} I_{XY}(f, m, p_c), \tag{3.6} $$
where IXY is the mutual information function and pc is the probability of a space in the
input data sequence to the channel encoder [65]. Note that pc is different from the source
data distribution p.
For a BAC, the channel capacity can be evaluated by definition, as a function of the
transition probabilities:
$$ C(f, m) = H_2\!\left[p_c^*(1 - f) + (1 - p_c^*)\,m\right] - p_c^*\,H_2(f) - (1 - p_c^*)\,H_2(m), \tag{3.7} $$
where H2(p) = −p log2 p − (1−p) log2(1−p) is the binary entropy function and pc* represents
the optimal distribution of the input data sequence to the channel encoder that maximizes
the mutual information function IXY. With Eqs. (2.11) – (2.14), and (3.7), C(Q) can be
evaluated numerically with the chi-square BAC and Gaussian BAC models.
For the BSC case (pc* = 1/2, f = m), a simpler formula can be derived as C = 1 − H2(f).
However, when the channel is asymmetric, the BSC approximation will be inaccurate
because the decision threshold will not be at the optimal position, as previously men-
tioned. Figure 3.2 plots the resulting C(Q) based on the three different channel models. It
shows that, compared to the chi-square BAC model, the Gaussian BAC model overestimates the channel capacity at low Q and underestimates it at high Q. The Gaussian BSC
model always underestimates the channel capacity.
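The maximization in Eqs. (3.6) and (3.7) can be sketched numerically as follows. This is a minimal illustration under our own assumptions: the function names are hypothetical, the transition probabilities f and m are supplied directly rather than computed from Q, and the golden-section search is our choice of optimizer (it relies on the mutual information being concave in pc), not necessarily the method used to produce Fig. 3.2.

```python
import math


def h2(x: float) -> float:
    """Binary entropy function H2(x) in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def mutual_information(f: float, m: float, pc: float) -> float:
    """I_XY for a BAC with f = p(1|0), m = p(0|1), and P(space) = pc."""
    prob_y0 = pc * (1.0 - f) + (1.0 - pc) * m
    return h2(prob_y0) - pc * h2(f) - (1.0 - pc) * h2(m)


def bac_capacity(f: float, m: float, tol: float = 1e-10):
    """Return (capacity, pc*) by golden-section search over pc in [0, 1]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, 1.0
    x1 = hi - inv_phi * (hi - lo)
    x2 = lo + inv_phi * (hi - lo)
    i1 = mutual_information(f, m, x1)
    i2 = mutual_information(f, m, x2)
    while hi - lo > tol:
        if i1 < i2:
            # The maximum lies to the right of x1.
            lo, x1, i1 = x1, x2, i2
            x2 = lo + inv_phi * (hi - lo)
            i2 = mutual_information(f, m, x2)
        else:
            # The maximum lies to the left of x2.
            hi, x2, i2 = x2, x1, i1
            x1 = hi - inv_phi * (hi - lo)
            i1 = mutual_information(f, m, x1)
    pc_star = 0.5 * (lo + hi)
    return mutual_information(f, m, pc_star), pc_star
```

For f = m the search returns pc* close to 1/2 and a capacity equal to 1 − H2(f), which is the BSC special case quoted above; for f ≠ m the optimal input distribution moves away from 1/2.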
It is not straightforward to evaluate the rate-distortion function R(Pe), which indicates
the minimum number of bits needed to represent a source symbol for a given output er-
ror probability Pe. In general, the rate-distortion function is defined as [65]
$$ R(\delta) = \inf_{k}\,\frac{1}{k}\,\min_{p(\mathbf{v}|\mathbf{u})}\left\{ I(\mathbf{U}, \mathbf{V}) : E(d) \le k\delta \right\}. \tag{3.8} $$
Figure 3.2: Comparison of the channel capacities evaluated based on the chi-square BAC, Gaussian BAC, and Gaussian BSC models of the optical fiber channel with dominant ASE noise.
The minimization of the mutual information I(U, V) is extended over all p(v|u) = P{V = v | U = u} that define V for a fixed δ and average distortion E(d) ≤ kδ. As proved in [65],
the computation of R(δ) becomes considerably easier for a discrete memoryless source U.
The simplified formula becomes [65]
$$ R(\delta) = \min_{p(v|u)}\left\{ I(U, V) : E(d) \le \delta \right\}. \tag{3.9} $$
[Figure 3.2 plot: channel capacity C versus Q² (dB); solid: chi-square BAC, dashed: Gaussian BAC, dotted: Gaussian BSC.]
Further simplification is possible for channels with binary input and output by using the
Hamming distortion measure [d(U, V) = 1 if U ≠ V, 0 if U = V] such that E(d) = Pe = δ.
Thus we have
$$ E[d(U, V)] = \sum_{u, v \in \{0, 1\}} p(u)\,p(v|u)\,d(u, v) = p f_s + (1 - p)\,m_s, \tag{3.10} $$
where fs and ms are arbitrary transition probabilities in a binary channel and are different
from the actual transition probabilities, f and m, of the optical fiber channel. Given the
source distribution p, we can evaluate R(Pe) by minimizing I(U, V) over all the transition
probability pairs of fs and ms satisfying pfs + (1−p)ms ≤ Pe, as shown below.
$$ R(P_e) = \min_{f_s,\,m_s} I_{XY}(f_s, m_s, p) \quad \text{over all } (f_s, m_s) \text{ such that } p f_s + (1 - p)\,m_s \le P_e. \tag{3.11a} $$
As previously mentioned and shown in Fig. 3.1, R(Pe) is independent of the real optical
fiber channel. Equation (3.11a) can be significantly simplified by letting fs = ms, in which
case
$$ R_{\mathrm{sym}}(P_e) = H_2(p + P_e - 2pP_e) - H_2(P_e). \tag{3.11b} $$
However, Eq. (3.11b) is not always an accurate formula for R(Pe). In fact, for a sym-
metric source data distribution, i.e., p = 1/2, the minimum value of I(U, V) corresponds to
the case of fs = ms, in which case Eq. (3.11b) is an exact expression of R(Pe). However,
when p ≠ 1/2, the minimum values of I(U, V) are not at fs = ms and, thus, Eq. (3.11b) is
only an approximation of R(Pe). The proof of this statement follows.
Proof:
With (1 – p)m + pf = Pe and
$$ I(U, V) = H_2\!\left[p(1 - f) + (1 - p)\,m\right] - p\,H_2(f) - (1 - p)\,H_2(m), $$
we have
$$ \frac{dI(U, V)}{df} = p \log_2\!\left[\left(\frac{p + P_e - 2pf}{1 - p - P_e + 2pf}\right)^{2} \cdot \frac{f}{1 - f} \cdot \frac{1 - p - P_e + pf}{P_e - pf}\right]. $$
When f = m, i.e., f = Pe, we obtain
$$ \frac{dI(U, V)}{df} = 2p \log_2 \frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e}. \tag{3.12} $$
We may reasonably assume p > 0 and Pe < 0.5. Letting p = 1/2 in Eq. (3.12), we have
$$ \frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 1 $$
and, thus,
$$ 2p \log_2 \frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 0. $$
Hence, we find that the right side of Eq. (3.12) equals 0 if p = 1/2.
Conversely, letting
$$ 2p \log_2 \frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 0 $$
for p > 0 and Pe < 0.5, the only solution of this equation is p = 1/2. Hence, we find that the right side of Eq. (3.12) equals 0 only if p = 1/2. QED.
Figure 3.3 plots the transition probability fs giving the minimum value of mutual in-
formation I(U, V) as a function of Pe, for different source distribution values p. It shows
that to achieve the minimum value of I(U, V), fs = Pe (thus fs = ms) only for p = 0.5 and
illustrates the statement made in the previous paragraph.
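The comparison between Eqs. (3.11a) and (3.11b) can be reproduced with the following Python sketch. It rests on assumptions of our own: the names are hypothetical, the search is a plain grid search, and the minimum is sought only on the boundary p fs + (1 − p) ms = Pe, which is where it lies when Pe < min(p, 1 − p) because R(Pe) is decreasing in Pe.

```python
import math


def h2(x: float) -> float:
    """Binary entropy function H2(x) in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def mutual_information(p: float, fs: float, ms: float) -> float:
    """I(U, V) for P(U = 0) = p, fs = P(V = 1 | U = 0), ms = P(V = 0 | U = 1)."""
    prob_v0 = p * (1.0 - fs) + (1.0 - p) * ms
    return h2(prob_v0) - p * h2(fs) - (1.0 - p) * h2(ms)


def rate_distortion_exact(p: float, pe: float, steps: int = 20000):
    """Grid-search version of Eq. (3.11a); returns (R(Pe), minimizing fs)."""
    if not (0.0 < p < 1.0) or not (0.0 < pe < min(p, 1.0 - p)):
        raise ValueError("sketch assumes 0 < Pe < min(p, 1 - p)")
    fs_max = min(1.0, pe / p)
    best_rate, best_fs = float("inf"), 0.0
    for k in range(steps + 1):
        fs = fs_max * k / steps
        # ms follows from the distortion constraint held with equality.
        ms = min(1.0, max(0.0, (pe - p * fs) / (1.0 - p)))
        rate = mutual_information(p, fs, ms)
        if rate < best_rate:
            best_rate, best_fs = rate, fs
    return best_rate, best_fs


def rate_distortion_symmetric(p: float, pe: float) -> float:
    """Equal-transition-probability approximation, Eq. (3.11b)."""
    return h2(p + pe - 2.0 * p * pe) - h2(pe)
```

Calling `rate_distortion_exact(0.5, 0.05)` returns a minimizing fs equal to Pe, whereas `rate_distortion_exact(0.2, 0.05)` returns an fs well away from Pe and a rate below `rate_distortion_symmetric(0.2, 0.05)`, in line with the proof above.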
Figure 3.3: The quantity fs for minimum value of I(U, V) as a function of Pe for p = 0.1, …, 0.9.
Figure 3.4 plots the rate distortion function R(Pe) given by Eqs. (3.11a) and (3.11b),
respectively. It shows for low Pe that Eq. (3.11b) is a good approximation of the rate dis-
tortion function, but at high BERs it is not. It also shows that the more asymmetric the
source data distribution, the lower the rate distortion function value that can be achieved.
Although, as shown above, the asymmetric distributed source is favorable for lower
values of rate distortion function, the symmetric source is the most likely case (and usual
assumption) in communication systems. Hence, in the following evaluations we assume
the symmetric source distribution, i.e., p = 1/2, that gives the rate distortion function as
$$ R(P_e) = 1 - H_2(P_e). \tag{3.13} $$
Figure 3.4: Comparison of the exact rate distortion function and the approximation based on equal transition probabilities, fs = ms, for different source distributions, p = 0.1, …, 0.9.
With the source-channel coding theorem in Eq. (3.2), the channel capacity in Eq. (3.7),
and rate distortion function in Eq. (3.13), we are ready to evaluate the lower bound on Pe,
i.e., the lower bound on the system BER after FEC decoding. Note that R(Pe) is a decreasing function of Pe, so that the upper bound on the code rate r in Eq. (3.2) becomes a lower bound for Pe after rearranging the inequality as shown in Eq. (3.14):
$$ P_e \ge R^{-1}\!\left(\frac{C(f, m)}{r}\right) = R^{-1}\!\left(\frac{C(Q)}{r}\right), \tag{3.14} $$
where R–1(x) represents the inverse rate distortion function that can be evaluated numerically.
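One simple way to carry out this numerical inversion for the symmetric source of Eq. (3.13) is bisection, as in the Python sketch below. The function names are hypothetical and bisection is our own choice of method; the channel capacity is passed in as a number and is assumed to have been computed separately for the channel model of interest.

```python
import math


def h2(x: float) -> float:
    """Binary entropy function H2(x) in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def inverse_rate_distortion(rate: float) -> float:
    """Solve 1 - H2(Pe) = rate for Pe in [0, 0.5], i.e., invert Eq. (3.13)."""
    if rate >= 1.0:
        return 0.0  # error-free transmission is possible
    if rate <= 0.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # R(Pe) decreases with Pe: if R(mid) is still too large, move right.
        if 1.0 - h2(mid) > rate:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def ber_lower_bound(capacity: float, code_rate: float) -> float:
    """Lower bound on the decoded BER from Eq. (3.14)."""
    if code_rate <= 0.0:
        raise ValueError("code rate must be positive")
    return inverse_rate_distortion(capacity / code_rate)
```

When the capacity is at least as large as the code rate, the bound is zero, which corresponds to the error-free regime of the channel coding theorem; otherwise the bound grows as the ratio C(Q)/r decreases.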
[Figure 3.4 plot: R(Pe) versus log10(Pe), showing the approximation (fs = ms) and the accurate result (fs ≠ ms) for p = 0.5; p = 0.4, 0.6; p = 0.3, 0.7; p = 0.2, 0.8; and p = 0.1, 0.9.]
The results for the lower bound on the decoded error probability, Pe, evaluated with the
chi-square BAC, the Gaussian BAC, and the Gaussian BSC models, are plotted in Fig.
3.5. For r = 1, which corresponds to the uncoded case, the Gaussian BSC and Gaussian
BAC models are very nearly identical, as should be expected according to the detected
BERs evaluated in Sec. 2.1 and shown in Fig. 2.4. Compared to the chi-square BAC case,
the Gaussian BSC model overestimates the BER (estimates better performance), and the
Gaussian BAC model underestimates the BER (estimates poorer performance) at low Q
and overestimates the BER at high Q, but not significantly. However, in the FEC code
case, i.e., r < 1, we can see differences in the resulting bounds, which we illustrate by
displaying the comparisons among all possible pairs of the three channel models.
First, as shown in Fig. 3.5a, compared to the chi-square BAC model, the Gaussian
BAC model gives a good approximation of the lower bound on FEC code performance
for high code rates (r ≥ 0.8). However, at low code rates (r ≤ 0.5), the Gaussian BAC
model underestimates the lower bounds on code performance; and, as code rates become
lower, the underestimate becomes more significant. For example, for r = 0.5, the under-
estimate is about 0.4 dB in Q2 for Pe ≤ 10–4.
Second, as shown in Fig. 3.5b, compared to the Gaussian BAC model, the Gaussian
BSC model overestimates the lower bound on code performance. The overestimate be-
comes more severe for lower code rates. For example, for r = 0.8, the overestimate is
about 0.5 dB for Pe ≤ 10–4, and for r = 0.5, the overestimate is about 0.8 dB for Pe ≤ 10–4.
Figure 3.5: Comparison of the lower performance bounds of FEC codes evaluated with the chi-square BAC, Gaussian BAC, and Gaussian BSC models, plotted as the lower bound on BER versus Q² (dB) for r = 0.5, 0.6, 0.7, 0.8, 0.9, 1. (a) chi-square BAC (solid) vs. Gaussian BAC (dotted). (b) Gaussian BAC (dotted) vs. Gaussian BSC (dashed). (c) chi-square BAC (solid) vs. Gaussian BSC (dashed).
Finally, as shown in Fig. 3.5c, compared to the chi-square BAC model, the Gaussian
BSC model overestimates the lower bound on code performance by about 0.4 to 0.5 dB
for all code rates studied and for Pe ≤ 10–4.
An interesting observation can be made from the above comparisons. The Gaussian
approximation of the chi-square distribution leads to an underestimate of the code per-
formance bound (estimates poorer performance), while the BSC approximation of the
Gaussian distribution leads to an overestimate of the code performance bound (estimates
better performance). Thus, for the Gaussian BSC that combines both the Gaussian and
BSC approximations, the Gaussian approximation-induced underestimate (poorer per-
formance estimate) cancels part of the BSC approximation-induced overestimate (better
performance estimate), especially at low code rates.
3.2 Upper bound for linear FEC code performance
In this section, we present theoretical studies on the upper bound of linear code per-
formance in optical fiber channels with ASE as the dominant source of noise. We derive
a general upper bound for Pd, the pairwise error probability (defined as the probability
that the decoder makes a wrong decision by selecting an error sequence), as a function of
the error weight d in asymmetric channels. Utilizing this derived general upper bound,
the weight distribution of linear codes, and the union bound theorem, we evaluate ana-
lytically the upper bound on linear code performance in optical fiber channels, with par-
ticular consideration of the turbo product code (TPC) and the turbo convolutional code
(TCC).
It is worth noting that, while other communications systems aim at achieving bit error
rates (BERs) around 10–4 to 10–6, optical fiber communications systems require more re-
liable performance, i.e., BERs less than 10–11. As we know, the union bound [68] on the
decoded BER diverges significantly from the actual code performance when the SNR
drops below a threshold determined by the computation cutoff rate [67], [95]. However,
the union bound gives a good estimate of code performance at high SNR, or equivalently,
at very low decoded BER. In most cases, however, simulations of decoded BER down to
10–11 are not possible. Hence, the union bound is more important and useful in optical
fiber communication channels, as a guide for code design, than in any other kind of
channel.
We evaluate the upper bounds on the performance of a TPC and a TCC in an optical
fiber channel using both the Gaussian model and the chi-square model for ASE noise. We
show that, compared to the more accurate chi-square model of the ASE noise, the Gaus-
sian approximation mis-estimates the code performance bounds. We also show that the
TPC outperforms the TCC (according to these bounds), with similar code rate and block
length, in the optical fiber channel requiring very low BER.
3.2.1 Upper bound on linear code performance in asymmetric channels
The upper bound on linear block code performance can be evaluated based on their
weight enumerating functions (WEF), while a convolutional code can be represented by
an equivalent block code if the encoder is forced to the all-zero state at the end of each
block. Moreover, it has been shown that TCC is also a linear code [66]. Obviously, the
TPC with block constituent codes can be treated as a linear block code. For TCC with two
convolutional constituent codes, the resulting performance degradation is negligible for a
large interleaver size even though only one of the two convolutional encoders, but not
both, can be forced to the all-zero state at the end of each block [67]. Hence, the formulae
that we have derived apply generally to linear block codes, linear convolutional codes,
TPCs, and TCCs in asymmetric channels. The asymmetric channel here refers to a bi-
nary-input soft-output (BISO) asymmetric channel instead of a binary asymmetric chan-
nel which is binary-input binary-output (BIBO). The specific asymmetric channel studied
is the optical fiber channel with ASE noise approximated by the chi-square distributions
given by Eqs. (2.1) and (2.2).
Evaluation of the performance upper bound is based on the Union Bound Theorem
[68]. For linear codes, the set of decoded error patterns is identical to the set of code-
words. Thus the weight distribution of the decoded error patterns can be equivalently de-
scribed with the WEF of the code as
$$A(x) = \sum_{d=d_{\min}}^{d_{\max}} A_d\, x^{d} , \tag{3.15}$$
where Ad is the number of codewords and, thus, the number of error patterns of weight d.
The pairwise error probability is defined as the probability that the decoder makes a
wrong decision by selecting a codeword other than the transmitted one [67]. Suppose we
are given the pairwise error probability Pd for each possible weight d, then the decoded
word error probability Pe can be bounded by the union bound as [68]
$$P_e \le \sum_{d=d_{\min}}^{d_{\max}} A_d\, P_d . \tag{3.16}$$
It is straightforward to evaluate Pd for symmetric channels, particularly for the AWGN
channels, but it is quite involved for asymmetric channels. In the following sections, in-
stead of deriving an exact formula for Pd, a general upper bound of Pd is derived for
asymmetric channels.
3.2.2 Upper bound on Pd for asymmetric channels with two codewords
First consider a BISO asymmetric channel with only two codewords, x1 = (x11, x12, …,
x1n) and x2 = (x21, x22, …, x2n), where xij ∈ {0, 1}. Suppose x1 is the transmitted codeword
and assume the decoded error pattern e = x1 + x2 has weight d. Let d+ represent the
number of mark errors in e, and similarly let d– represent the number of space errors, so
that d = d+ + d–. We define Pe1 and Pe2 as
$$P_{e1} = P(\hat{\mathbf{x}} = \mathbf{x}_2 \mid \mathbf{x} = \mathbf{x}_1), \qquad P_{e2} = P(\hat{\mathbf{x}} = \mathbf{x}_1 \mid \mathbf{x} = \mathbf{x}_2),$$
where x = (x1, x2, …, xn), xi ∈ {0, 1}, represents the transmitted codeword, and x̂ represents
the decoded codeword. Assuming that the codewords x1 and x2 are equally probable, it
follows from [69] that the pairwise error probability is bounded by
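$$P_d = \frac{P_{e1} + P_{e2}}{2} \le \frac{1}{2} \min_{0 \le s \le 1} \sum_{\mathbf{y}} P(\mathbf{y} \mid \mathbf{x}_1)^{1-s}\, P(\mathbf{y} \mid \mathbf{x}_2)^{s} , \tag{3.17}$$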
where y = (y1, y2, …, yn), yi ∈ [0, ∞) (for optical fiber transmissions), represents the
received noisy codeword.
With the independence assumption, and noting that the continuous value of yi requires
integration of the pdf in probability calculations, we have
$$P_d = \frac{P_{e1} + P_{e2}}{2} \le \frac{1}{2} \min_{0 \le s \le 1} \prod_{i=1}^{n} \int_{y_i} p(y_i \mid x_{1i})^{1-s}\, p(y_i \mid x_{2i})^{s}\, dy_i . \tag{3.18a}$$
Also noting that yi ∈ [0, ∞), we replace yi by y ∈ [0, ∞) in the above equation, and we
obtain
$$P_d \le \frac{1}{2} \min_{0 \le s \le 1} \prod_{i=1}^{n} \int_{y} p(y \mid x_{1i})^{1-s}\, p(y \mid x_{2i})^{s}\, dy . \tag{3.18b}$$
We know that x2 differs from x1 with d+ mark errors (x1i = 0 and x2i = 1) and d– space
errors (x1i = 1 and x2i = 0) and, thus, the two codewords have n – d+ – d– identical
bits (x1i = x2i = x). It follows that Eq. (3.18b) can be written as
$$P_d \le \frac{1}{2} \min_{0 \le s \le 1} \left[\int_{y} p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d^+} \left[\int_{y} p(y \mid 1)^{1-s} p(y \mid 0)^{s}\, dy\right]^{d^-} \left[\int_{y} p(y \mid x)^{1-s} p(y \mid x)^{s}\, dy\right]^{n - d^+ - d^-} . \tag{3.18c}$$
We now note that
$$\int_{y} p(y \mid x)^{1-s}\, p(y \mid x)^{s}\, dy = \int_{y} p(y \mid x)\, dy = 1 ,$$
so that, by simplifying Eq. (3.18c), we obtain
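$$P_d = \frac{P_{e1} + P_{e2}}{2} \le \frac{1}{2} \min_{0 \le s \le 1} \left[\int_{y} p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d^+} \left[\int_{y} p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d^-} . \tag{3.18d}$$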
Equation (3.18d) shows that for a channel with asymmetric noise, the same error pat-
tern may have different pairwise error probabilities when occurring on different code-
words. Hence, there is not an exact formula for Pd as a function of d for asymmetric
channels. Based on Eq. (3.18d), however, we can obtain a general upper bound for Pd as
discussed in the following section.
3.2.3 General upper bound on Pd in asymmetric channels
Now we extend our discussion to the general case where a linear code with N code-
words is used on an asymmetric channel. For a decoded error pattern e with weight d, all
the N codewords can be paired into N/2 pairs, such that the modulo-2 summation of the
two codewords in any pair equals e. For any possible e, there always exists the codeword
pair, (0, e). Thus, the pairwise error probability Pd in the N codeword case can be written
as
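$$P_d = \frac{1}{N} \sum_{i=1}^{N} P_{ei} = \frac{1}{N} \sum_{j=1}^{N/2} \left(P^{j}_{e1} + P^{j}_{e2}\right), \tag{3.19a}$$
where $P^{j}_{e1}$ and $P^{j}_{e2}$ are the error probabilities Pe1 and Pe2 for the j-th of the N/2
codeword pairs. We can see that Eq. (3.18d) holds for any of the N/2 codeword pairs.
Thus, with Eqs. (3.18d) and (3.19a) we obtain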
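$$P_d = \frac{1}{N} \sum_{i=1}^{N} P_{ei} \le \frac{1}{N} \sum_{j=1}^{N/2} \min_{0 \le s \le 1} \left[\int_{y} p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d_j^+} \left[\int_{y} p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d_j^-} , \tag{3.19b}$$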
where j is the index of the codeword pair, dj+ represents the number of mark errors for
one codeword in pair j, and dj– represents the number of space errors for the same code-
word in pair j. The right side of the inequality in Eq. (3.19b) is still a function of dj+ and
dj– instead of d.
We can loosen the bound to facilitate the computation by replacing all the terms in the
summation in Eq. (3.19b) with their maximum value. It will be shown with examples that
this does not significantly loosen the bound for the optical fiber channels with chi-square
noise. Thus we obtain
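$$P_d \le \frac{1}{2} \max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \left[\int_{y} p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d_j^+} \left[\int_{y} p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d_j^-} = \frac{1}{2} \max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \mu(s, d_j^+). \tag{3.20}$$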
The µ(s, dj+) function in Eq. (3.20) has the following properties:
Property 1: µ(s, dj+) is convex with respect to s [69].
Property 2: the set of curves µ(s, dj+), 1 ≤ j ≤ N/2, cross at the single point
s = 0.5 and have the same value µ(0.5, d/2).
Property 3: µ(0.5, d/2) is the minimum value of µ(s, dj+) for dj+ = d/2.
Shannon et al. showed Property 1 in [69]. We prove the other two properties as follows.
Proof for Property 2:
We substitute s = 0.5 into µ(s, dj+) as defined in Eq. (3.20), obtaining
$$\begin{aligned}
\mu(0.5, d_j^+) &= \left[\int_{y} p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^+} \left[\int_{y} p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^-} \\
&= \left[\int_{y} p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^+ + d_j^-} \\
&= \left[\int_{y} p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d} \\
&= \mu(0.5, d/2) .
\end{aligned}$$
Hence, for s = 0.5 and a given d, µ(s, dj+) has the same value for different values of dj+.
QED.
Proof for Property 3:
Substituting dj+ = d/2 into µ(s, dj+) as defined in Eq. (3.20) and noting that dj+ = d – dj–,
we have
$$\mu(s, d/2) = \left[\int_{y} p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d/2} \left[\int_{y} p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d/2} ,$$
which is a symmetric function with respect to s = 0.5. Suppose that there exists a ∆s ≠ 0
such that µ(0.5+∆s, d/2) is the minimum value of µ(s, d/2). From the symmetry of the
function, we must have µ(0.5–∆s, d/2) = µ(0.5+∆s, d/2). Thus, we find µ(0.5, d/2) >
[µ(0.5–∆s, d/2) + µ(0.5+∆s, d/2)]/2, in contradiction to Property 1. Hence, the
supposition is wrong and, thus, µ(0.5, d/2) is the minimum value of µ(s, d/2). QED.
With Properties 2 and 3, we can prove that
$$\max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \mu(s, d_j^+) \le \mu(0.5, d/2), \tag{3.21}$$
as follows.
Proof:
Suppose there exists a j such that dj+ ≠ d/2 and $\min_{0 \le s \le 1} \mu(s, d_j^+) > \mu(0.5, d/2)$.
According to Property 2, µ(0.5, dj+) = µ(0.5, d/2) and, thus,
$\mu(0.5, d/2) \ge \min_{0 \le s \le 1} \mu(s, d_j^+)$ from Property 1. This contradicts the
supposition. Hence, the supposition is wrong, and
$\min_{0 \le s \le 1} \mu(s, d_j^+) \le \mu(0.5, d/2)$ for all j. With Property 3, we prove that
the equality in Eq. (3.21) holds when a pair with dj+ = d/2 exists. QED.
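Properties 2 and 3 and the inequality in Eq. (3.21) can be checked numerically on a toy channel. The sketch below is only an illustration: it assumes exponential pdfs p(y|0) = A·exp(–A·y) and p(y|1) = B·exp(–B·y) in place of the chi-square pdfs of Eqs. (2.1) and (2.2), because the integrals in µ(s, dj+) then have a closed form. The names `A`, `B`, `chernoff_integral`, `mu`, and `min_over_s` are hypothetical.

```python
import math

# Toy asymmetric channel, assumed only for illustration: exponential pdfs on y >= 0,
# p(y|0) = A*exp(-A*y) and p(y|1) = B*exp(-B*y). They stand in for Eqs. (2.1), (2.2).
A, B = 1.0, 0.2


def chernoff_integral(s):
    """Closed form of the integral over y >= 0 of p(y|0)^(1-s) * p(y|1)^s."""
    return A ** (1 - s) * B ** s / ((1 - s) * A + s * B)


def mu(s, d_plus, d):
    """mu(s, d_j+) of Eq. (3.20) for an error pattern of weight d, d_j- = d - d_j+."""
    d_minus = d - d_plus
    # The second integral, p(y|0)^s * p(y|1)^(1-s), is the first one evaluated at 1 - s.
    return chernoff_integral(s) ** d_plus * chernoff_integral(1 - s) ** d_minus


def min_over_s(d_plus, d, steps=1000):
    """Grid search for the minimum of mu(s, d_j+) over 0 <= s <= 1."""
    return min(mu(k / steps, d_plus, d) for k in range(steps + 1))


d = 6
ref = mu(0.5, d // 2, d)  # mu(0.5, d/2), the right side of Eq. (3.21)

# Property 2: at s = 0.5 every curve takes the same value.
same_at_half = all(math.isclose(mu(0.5, dp, d), ref) for dp in range(d + 1))
# Property 3: for d_j+ = d/2 the minimum over s is attained at s = 0.5.
symmetric_min = math.isclose(min_over_s(d // 2, d), ref)
# Eq. (3.21): no curve has a minimum above mu(0.5, d/2).
bound_holds = all(min_over_s(dp, d) <= ref * (1 + 1e-12) for dp in range(d + 1))

print(same_at_half, symmetric_min, bound_holds)
```

For the unbalanced patterns (dj+ = 0 or dj+ = d) the minimum over s lies strictly below µ(0.5, d/2) in this toy channel, which is the loosening that Eq. (3.21) accepts in exchange for a bound that depends on d only.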
Equations (3.20) and (3.21) can be combined to obtain a simple general upper bound
on Pd given by
$$P_d \le \frac{1}{2} \left[\int_{y} \sqrt{p(y \mid 0)\, p(y \mid 1)}\; dy\right]^{d} . \tag{3.22}$$
And Eqs. (3.16) and (3.22) can be combined to obtain an upper bound on the decoded
word error probability as
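$$P_e \le \frac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d \left[\int_{y} \sqrt{p(y \mid 0)\, p(y \mid 1)}\; dy\right]^{d} = \frac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d\, Z^{d} . \tag{3.23}$$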
Now, given the WEF of the linear code and the channel noise statistics, the evaluation of
the upper bound via Eq. (3.23) of the decoded error probability becomes a routine task.
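A minimal sketch of this routine evaluation is given below. It assumes a plain trapezoidal rule for the integral that defines Z, and it uses the well-known weight distribution of a Hamming (7, 4) code (A3 = 7, A4 = 7, A7 = 1) as the example WEF. The equal-variance Gaussian pdfs are included only because their Z has the closed form exp(–(m1 – m0)^2 / (8σ^2)), which gives a check on the integration; they are not the channel model of this chapter. All function names are hypothetical.

```python
import math


def bhattacharyya_z(p0, p1, lo, hi, steps=20000):
    """Z = integral of sqrt(p(y|0) * p(y|1)) dy over [lo, hi], trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        y = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.sqrt(p0(y) * p1(y))
    return total * h


def word_error_bound(weights, z):
    """Right side of Eq. (3.23): 0.5 * sum over d of A_d * Z^d (weights maps d -> A_d)."""
    return 0.5 * sum(a_d * z ** d for d, a_d in weights.items())


def gaussian_pdf(mean, sigma):
    """Gaussian pdf, used here only to validate the numerical integration."""
    return lambda y: math.exp(-0.5 * ((y - mean) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


# Sanity check of the integration against a closed form (equal-variance Gaussians).
z = bhattacharyya_z(gaussian_pdf(0.0, 1.0), gaussian_pdf(4.0, 1.0), -10.0, 14.0)
z_exact = math.exp(-(4.0 ** 2) / 8.0)

# Weight distribution of a Hamming (7, 4) code, excluding the all-zero codeword.
hamming_weights = {3: 7, 4: 7, 7: 1}
bound = word_error_bound(hamming_weights, z)
print(z, z_exact, bound)
```

For an asymmetric channel only the two pdf callables change; the union-bound summation is the same.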
3.2.4 Upper bound on linear code performance in optical fiber channels
The upper bound on linear code performance in optically amplified fiber channels can
now be obtained by substituting Eqs. (2.1) and (2.2) into Eq. (3.23) yielding
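$$P_e \le \frac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d \left[\int_{0}^{\infty} \frac{y^{3(M-1)/4}\, \exp\!\left(-\dfrac{y + E/2}{N_0}\right) I_{M-1}^{1/2}\!\left(\dfrac{2\sqrt{E y}}{N_0}\right)}{N_0^{(M+1)/2}\, [(M-1)!]^{1/2}\, E^{(M-1)/4}}\; dy\right]^{d} = \frac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d\, Z_{\mathrm{chi}}^{d} , \tag{3.24}$$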
where Zchi (for chi-square ASE model) is a constant for the given optical fiber channel
parameters, M, E, and N0, and can be evaluated numerically.
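The numerical evaluation of Zchi can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the figures: the chi-square pdfs are written in the form restated later in Eqs. (3.31) and (3.32), the modified Bessel function is summed from its power series, the integral is truncated at a finite `Y_MAX`, and the values E = 20, N0 = 1 are arbitrary (only M = 3 is taken from the text). All function names are hypothetical.

```python
import math


def bessel_i(order, x, terms=150):
    """Modified Bessel function of the first kind, integer order >= 0, by power series."""
    if x == 0.0:
        return 1.0 if order == 0 else 0.0
    log_half_x = math.log(x / 2.0)
    # Each term is (x/2)^(2k+order) / (k! (k+order)!), evaluated in the log domain.
    return sum(math.exp((2 * k + order) * log_half_x
                        - math.lgamma(k + 1) - math.lgamma(k + order + 1))
               for k in range(terms))


def p_mark(y, E, N0, M):
    """p(y|1): noncentral chi-square pdf with 2M degrees of freedom."""
    return ((y / E) ** ((M - 1) / 2) * math.exp(-(y + E) / N0)
            * bessel_i(M - 1, 2.0 * math.sqrt(E * y) / N0) / N0)


def p_space(y, N0, M):
    """p(y|0): central chi-square pdf with 2M degrees of freedom."""
    return (y / N0) ** (M - 1) * math.exp(-y / N0) / (N0 * math.factorial(M - 1))


def integrate(f, lo, hi, steps=2000):
    """Trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * (0.5 * (f(lo) + f(hi)) + sum(f(lo + k * h) for k in range(1, steps)))


E, N0, M = 20.0, 1.0, 3   # illustrative values; E and N0 are assumptions
Y_MAX = 100.0             # assumed truncation point, far in the tail of both pdfs

area_mark = integrate(lambda y: p_mark(y, E, N0, M), 0.0, Y_MAX)
area_space = integrate(lambda y: p_space(y, N0, M), 0.0, Y_MAX)
z_chi = integrate(lambda y: math.sqrt(p_mark(y, E, N0, M) * p_space(y, N0, M)), 0.0, Y_MAX)

# Upper bounds on the pairwise error probability, Eq. (3.22), for a few weights d.
pairwise_bounds = {d: 0.5 * z_chi ** d for d in (3, 6, 9, 12)}
print(area_mark, area_space, z_chi, pairwise_bounds)
```

The two `area_*` values should both be close to 1, which is a quick check that the pdfs and the truncation point are consistent before Zchi is used in Eq. (3.24).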
Figure 3.6 plots the µ(s, dj+) vs. s curves with dj+ = d, 2d/3, d/2, d/3, and 0, respectively,
where d = 6, Q^2 = 2 dB, M = 3. These plots display the properties of µ(s, dj+) discussed
in the previous section.
Figure 3.6: The µ(s, dj+) curves for different values of dj+.
[Plot: µ(s, dj+) from 5.5 × 10^{-3} to 6.5 × 10^{-3} versus s from 0.3 to 0.6, with curves for dj+ = d, 2d/3, d/2, d/3, and 0.]
Figure 3.7 plots two sets of curves corresponding to the logarithms of µ(0.5, d/2)/2 and
Minµ(s, d)/2, respectively, as functions of the Q factor in a given optical fiber channel
for d = 3, 6, 9, 12. The first set of curves, µ(0.5, d/2)/2 vs. Q^2, are the upper bounds
on Pd obtained in Eq. (3.22), while the second set, Minµ(s, d)/2 vs. Q^2, equals
$\min_{1 \le j \le N/2} \min_{0 \le s \le 1} \mu(s, d_j^+)/2$. Observing Eqs. (3.20) and (3.21), we can
see that the tightest bound for Pd (Eq. (3.19b)) falls between the µ(0.5, d/2)/2 and
Minµ(s, d)/2 curves with the same d, and the two sets of curves almost overlay each
other. Hence, Fig. 3.7 shows that using MaxMinµ(s, dj+) in Eq. (3.20) gives a very good
approximation of the upper bound of Pd for the optical fiber channel studied. The
computation is significantly simplified, while the loosening of the bound is negligible.
Figure 3.7: Comparison of µ(0.5, d/2)/2 and Minµ(s, dj+)/2 at d = 3, 6, 9, 12 for the optical fiber channel with M = 3.
3.2.5 Upper bounds on performance for TPC and TCC example
In this section, we apply the upper bound on linear code performance in asymmetric
channels derived in the previous section to two example codes — the Hamming (7, 4) ×
(7, 4) TPC and the (1, 5/7, 5/7) TCC with interleaver length 100. We start with a descrip-
tion of the encoding procedures of the two codes.
[Figure 3.7 plot: bounds on log10(Pd) from –15 to 0 versus Q^2 (dB) from 0 to 5 dB, for d = 3, 6, 9, 12; dotted: log10(µ(0.5, d/2)/2); dashed: log10(Minµ(s, d)/2).]
The Hamming (7, 4) × (7, 4) TPC is a two-dimensional product code. The Hamming
(7, 4) constituent code is a standard single-error-correction code, where 4 is the input
data-word length and 7 is the codeword length. The parity-check matrix H of the Ham-
ming (7, 4) code is shown below
$$\mathbf{H} = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 & 1 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & 1 & 1
\end{bmatrix} .$$
From the above systematic H = [I | P] matrix, we can get the systematic code generator
matrix G = [P’ | I] (such that GH’ = 0) as shown below
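$$\mathbf{G} = \begin{bmatrix}
1 & 1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 1
\end{bmatrix} .$$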
The (7, 4) × (7, 4) TPC is encoded in the row-first column-second order and, thus, has the
codeword structure as shown in Fig. 3.8.
Figure 3.8: Codeword structure of the Hamming (7, 4) × (7, 4) TPC.
[Diagram: a 7 × 7 array divided into four blocks: information bits (4 × 4), row parity bits (4 × 3), column parity for information bits (3 × 4), and column parity for row parity bits (3 × 3).]
We can see that the Hamming (7, 4) × (7, 4) TPC has a block length of 49 bits, and a
code rate r = 16/49 ≈ 0.327.
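The row-first, column-second encoding can be sketched in a few lines. The sketch assumes that the parity part P of H = [I | P] is the 3 × 4 matrix listed in the code and builds G = [P' | I] from it; with this G the three parity bits precede the four data bits in every row and column codeword. The names `encode`, `syndrome`, and `tpc_encode` are hypothetical.

```python
from collections import Counter
from itertools import product

# Parity part of the Hamming (7, 4) parity-check matrix H = [I | P] (assumed values).
P = [[1, 0, 1, 1],
     [1, 1, 1, 0],
     [0, 1, 1, 1]]
H = [[int(c == r) for c in range(3)] + P[r] for r in range(3)]                          # H = [I | P]
G = [[P[r][c] for r in range(3)] + [int(i == c) for i in range(4)] for c in range(4)]   # G = [P' | I]


def encode(u):
    """Hamming (7, 4) codeword u*G over GF(2): 3 parity bits followed by the 4 data bits."""
    return [sum(u[r] * G[r][c] for r in range(4)) % 2 for c in range(7)]


def syndrome(v):
    """Syndrome H*v' over GF(2); all zeros for a valid codeword."""
    return [sum(H[r][c] * v[c] for c in range(7)) % 2 for r in range(3)]


def tpc_encode(data):
    """Encode a 4 x 4 data block: rows first, then every one of the 7 columns."""
    rows = [encode(row) for row in data]                                   # 4 x 7
    cols = [encode([rows[r][c] for r in range(4)]) for c in range(7)]      # 7 columns of 7 bits
    return [[cols[c][r] for c in range(7)] for r in range(7)]              # 7 x 7 array


gh_is_zero = all(sum(G[r][i] * H[c][i] for i in range(7)) % 2 == 0
                 for r in range(4) for c in range(3))
hamming_weights = Counter(sum(encode(list(u))) for u in product((0, 1), repeat=4))

data = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]]
array = tpc_encode(data)
rows_ok = all(syndrome(row) == [0, 0, 0] for row in array)
cols_ok = all(syndrome([array[r][c] for r in range(7)]) == [0, 0, 0] for c in range(7))
print(gh_is_zero, dict(hamming_weights), rows_ok, cols_ok)
```

Every row and every column of the 7 × 7 array is a Hamming codeword, which is the defining property of the product code; the 16 data bits occupy a 4 × 4 corner of the array and the remaining 33 positions are parity.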
The (1, 5/7, 5/7) TCC is a rate 1/3 code, where the parameters in the parentheses are
octal numbers representing the structures of the constituent encoders. As depicted in Fig.
3.9, the “1” represents the information bit sequence and the two “5/7”s correspond to the
recursive parity-check generator polynomial (1 + D^2)/(1 + D + D^2). A 100-bit interleaver
is used between the two constituent encoders; thus, the TCC has a block length of 300.
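The encoder described above can be sketched as follows. The feedback and feedforward taps follow the polynomial (1 + D^2)/(1 + D + D^2) given in the text; the interleaver permutation is not specified in the text, so a seeded pseudo-random permutation is assumed here, and no trellis termination is applied. The names `rsc_57_parity` and `tcc_encode` are hypothetical.

```python
import random


def rsc_57_parity(bits):
    """Parity sequence of the recursive encoder (1 + D^2)/(1 + D + D^2), zero initial state."""
    a1 = a2 = 0                      # register contents a_{k-1} and a_{k-2}
    parity = []
    for u in bits:
        a = u ^ a1 ^ a2              # feedback polynomial 1 + D + D^2 (octal 7)
        parity.append(a ^ a2)        # feedforward polynomial 1 + D^2 (octal 5)
        a1, a2 = a, a1
    return parity


def tcc_encode(bits, perm):
    """Rate-1/3 output (u_k, x1p_k, x2p_k) for each k; perm is the interleaver permutation."""
    p1 = rsc_57_parity(bits)
    p2 = rsc_57_parity([bits[i] for i in perm])
    return [b for triple in zip(bits, p1, p2) for b in triple]


L = 100
rng = random.Random(0)
perm = list(range(L))
rng.shuffle(perm)                    # assumed interleaver; the actual permutation is not given
data = [rng.randint(0, 1) for _ in range(L)]
codeword = tcc_encode(data, perm)
impulse = rsc_57_parity([1, 0, 0, 0, 0, 0, 0])
print(len(codeword), impulse)
```

The impulse response of the recursive encoder does not die out (it repeats with period 3 after the first bit), which is why a single information bit can produce a large parity weight in one of the two constituent codewords.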
Figure 3.9: Encoder structure of (1, 5/7, 5/7) TCC with 100-bit interleaver.
[Diagram labels: input uk; outputs uk, x1pk, x2pk (encoded data sequence); 100-bit interleaver.]
The upper bounds on the decoded BERs of the TPC and TCC codes can be evaluated
with Eqs. (3.25) and (3.26), respectively,
$$P_{\mathrm{BER}} \le \sum_{i=1}^{k} \sum_{j=0}^{n-k} \frac{i}{2k}\, A(i, j)\, Z^{(i+j)} , \tag{3.25}$$
$$P_{\mathrm{BER}} \le \sum_{i=1}^{L} \sum_{j_1=0}^{L} \sum_{j_2=0}^{L} \frac{i}{2L} \binom{L}{i}^{-1} t(L, i, j_1)\, t(L, i, j_2)\, Z^{(i+j_1+j_2)} , \tag{3.26}$$
where k is the input data-word length, i.e., the number of information bits in the TPC
codeword, n is the TPC codeword length, A(i, j) is the coefficient of the conditional WEF
of the TPC, L is the TCC interleaver length, t(l, i, j) is the transfer function coefficient of
the constituent (5/7) convolutional code for the TCC, and Z was defined in Eq. (3.23).
The TPC studied here has short data-word length (k = 16 bits) and codeword length (n =
49 bits); hence, its conditional WEF can be easily obtained by counting all the possible
codewords. The transfer function of convolutional codes can be obtained with the recur-
sive algorithm introduced in [70]. As defined in Eq. (3.23), Z is a function of the pdfs of
the channel noises. Using both the chi-square and Gaussian models of the ASE noise for
both the TPC and TCC, we obtained four performance bound curves as plotted in Fig.
3.10.
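For the TPC, the counting step and the evaluation of Eq. (3.25) can be sketched as below. The sketch assumes the same parity part P of H = [I | P] as in the Hamming (7, 4) matrices above and takes Z as a plain input number (in practice it would come from the channel pdfs, as in Eq. (3.23)). The names `conditional_wef` and `ber_bound` are hypothetical.

```python
from collections import Counter
from itertools import product

# Parity part of the Hamming (7, 4) parity-check matrix H = [I | P] (assumed values).
P = [[1, 0, 1, 1],
     [1, 1, 1, 0],
     [0, 1, 1, 1]]


def encode(u):
    """Systematic Hamming (7, 4) codeword: 3 parity bits followed by the 4 data bits."""
    parity = [sum(P[r][c] * u[c] for c in range(4)) % 2 for r in range(3)]
    return tuple(parity) + tuple(u)


DATA_WORDS = list(product((0, 1), repeat=4))
CODEWORD = {u: encode(u) for u in DATA_WORDS}
WEIGHT = {u: sum(CODEWORD[u]) for u in DATA_WORDS}


def conditional_wef():
    """A(i, j): number of TPC codewords with information weight i and parity weight j."""
    a = Counter()
    for block in product(DATA_WORDS, repeat=4):            # all 2^16 data blocks (4 rows)
        rows = [CODEWORD[u] for u in block]                # row encoding, 4 x 7
        total = sum(WEIGHT[col] for col in zip(*rows))     # column encoding, 7 x 7
        i = sum(map(sum, block))
        a[(i, total - i)] += 1
    return a


def ber_bound(a, z, k=16):
    """Right side of Eq. (3.25) for a given value of Z."""
    return sum(i / (2 * k) * count * z ** (i + j) for (i, j), count in a.items() if i > 0)


wef = conditional_wef()
min_weight = min(i + j for (i, j) in wef if i > 0)
num_min_weight = sum(count for (i, j), count in wef.items() if i + j == min_weight)
print(min_weight, num_min_weight, ber_bound(wef, 0.1))
```

The enumeration visits all 65536 codewords once; for longer codes this exhaustive count is no longer practical, which is where the transfer-function approach of Eq. (3.26) comes in.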
Figure 3.10 shows that, for both TPC and TCC, the upper bounds on performance
evaluated with the Gaussian approximation and chi-square distribution models cross
around the point Q^2 = 2 dB. Compared to the chi-square distribution model, the Gaussian
model underestimates the code performance bounds by about 1.5 dB at a BER of 10^{-12}.
As a whole, the Gaussian approximation overestimates the upper bounds on the TPC and
TCC performance at low Q and underestimates them at high Q. If we interpret the above
comparison as a comparison of two channels, a Gaussian channel and a chi-square
channel, we can see that the TPC and TCC perform better in the Gaussian channel at very
low Q (Q^2 < 2 dB) and in the chi-square channel at higher Q.
Figure 3.10 also shows that the TCC outperforms the TPC by 2.5–3 dB when the BER
is about 10^{-4}, as indicated by the horizontal dash-dot line. However, the rate at which
the BER decreases as a function of Q^2 is smaller with the TCC than with the TPC. Thus,
the TPC outperforms the TCC at very low BER (BER < 10^{-16}, as indicated by the
horizontal dash-dot line). Note that the TPC and TCC studied here have similar code rates
(TPC rate = 0.327, TCC rate = 0.333), but the TCC is a longer code (300 bits) than the TPC
(49 bits). Hence, this suggests that, with a similar code rate and code length, the TPC may
be a better choice than the TCC in optical fiber channels requiring very low BERs. In fact,
as shown in Fig. 3.11, when the interleaver length of the (1, 5/7, 5/7) TCC is decreased to
20, corresponding to a 60-bit code length, the 49-bit TPC outperforms the TCC for BER
< 10^{-10}, as indicated by the horizontal dash-dot line.
Figure 3.10: Upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (triangles) and the (1, 5/7, 5/7) TCC with interleaver length 100 (circles) using the Gaussian (dotted) and the chi-square (solid) ASE noise models.
[Plot: log10(BER) from –20 to 0 versus Q^2 (dB) from 0 to 10 dB.]
Figure 3.11: Comparison of the upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (solid) and the (1, 5/7, 5/7) TCC with interleaver length 20 (dashed) using the chi-square ASE noise model.
3.3 Performance improvement of turbo codes
In this section, we study the effects of different ASE noise models on the performance
of turbo code (TC) decoders. A soft decoding algorithm, the BCJR algorithm [71], is
generally used in TC decoders. The BCJR algorithm is a maximum a posteriori
probability (MAP) algorithm and is generally very sensitive to the noise statistics. The
Gaussian approximation of the ASE noise is widely used in the study of optical fiber
transmission systems [20]–[23], [48], [49], and there exist standard TCs for Gaussian
channels. We show, however, that using a MAP decoding algorithm based on the
Gaussian noise assumption may significantly degrade the TC decoder performance in an
optical fiber channel with non-Gaussian ASE noise. To take full advantage of TC,
accurate asymmetric noise statistics for optical fiber transmission should be used in the
BCJR decoding algorithm.
[Figure 3.11 plot: log10(BER) from –15 to 0 versus Q^2 (dB) from 0 to 10 dB.]
In the following, modifications of the BCJR algorithm according to the chi-square
noise distribution are described and simulation results for the performance of the im-
proved TC decoder are discussed. The effects of the three different optical fiber channel
models –– the chi-square distribution model, the approximated Gaussian distribution
model, and the approximated Gaussian BSC model –– on the TC decoder are discussed.
The BCJR decoding algorithm is an iterative soft decoding algorithm and requires a
soft-decision channel model, such as the chi-square and approximated Gaussian distribu-
tion models. The following discussions will show, however, that the hard-decision chan-
nel model can also affect the performance of the punctured TC decoder. Specifically, the
approximated Gaussian BSC model, as shown later in the simulations, degrades the per-
formance of the punctured TC whose decoding requires the optimal hard-decision
threshold.
3.3.1 Modification of the BCJR Decoding Algorithm according to the chi-square noise
distributions
The BCJR algorithm is a recursive algorithm for the maximum a posteriori probability
(MAP) decoding of the received noisy codeword
$\mathbf{Y} = (y^{s}_{1}, \ldots, y^{s}_{N}, y^{ip}_{1}, \ldots, y^{ip}_{N}, \ldots)$,
where $y^{s}_{k}$ represents a received information bit corresponding to the transmitted
information bit $u_k$, and $y^{ip}_{k}$ represents a received parity-check bit corresponding to
the transmitted parity-check bit $x^{ip}_{k}$ generated by the i-th constituent encoder. Note
that i = 1, 2 for our rate 1/3 TC, where each constituent convolutional encoder has rate
1/2. In the i-th constituent MAP decoder for TC, the information bit $u_k$ in the transmitted
codeword $\mathbf{X} = (u_1, \ldots, u_N, x^{ip}_{1}, \ldots, x^{ip}_{N}, \ldots)$ is estimated based on the
received noisy codeword Y by
$$\hat{u}_k = \begin{cases} 1, & \text{if } L(u_k) > 0 \\ 0, & \text{if } L(u_k) < 0 \end{cases} \tag{3.27}$$
where L(uk) is the log likelihood ratio (LLR)
$$L(u_k) \equiv \log \frac{P(u_k = 1 \mid \mathbf{Y})}{P(u_k = 0 \mid \mathbf{Y})} , \qquad 0 \le k \le N . \tag{3.28}$$
The key to the BCJR algorithm is to decompose the a posteriori probability into three
factors αk–1, γk, and βk, relating the decision on uk (we refer to the subscript k as “time k”
in the following discussions) to the previous, current, and future observations,
respectively, as
$$P(u_k = u \text{ causing state transition } s' \text{ to } s \mid \mathbf{Y}) = \frac{1}{p(\mathbf{Y})} \sum_{s', s \in S} \alpha_{k-1}(s')\, \gamma_k(s', s)\, \beta_k(s) , \tag{3.29}$$
where S = {s1, …, sk, …, sN} is the set of all constituent encoder states, the state pair (s’,
s) represents a state transition from (sk–1 = s’) to (sk = s),
$\alpha_{k-1}(s') = p[s_{k-1} = s', (y^{s}_{1}, \ldots, y^{s}_{k-1}, y^{ip}_{1}, \ldots, y^{ip}_{k-1})]$
is a probability measure for state s’ at time k–1 that depends only on the past
observations, i.e., the received information and parity-check bits before time k,
$\beta_k(s) = p[(y^{s}_{k+1}, \ldots, y^{s}_{N}, y^{ip}_{k+1}, \ldots, y^{ip}_{N}) \mid s_k = s]$
is a probability measure for state s at time k that depends only on the future
observations, i.e., the received information and parity-check bits after time k, and
γk(s’, s) is a probability measure connecting state s’ at time k–1 to state s at time k that
depends only on the present observation $(y^{s}_{k}, y^{ip}_{k})$. The γk(s’, s) can be written as
$$\gamma_k(s', s) = P(u_k)\, p(y^{s}_{k}, y^{p}_{k} \mid u_k) \cong P(u_k)\, p(y^{s}_{k} \mid u_k)\, p(y^{p}_{k} \mid x^{p}_{k}) , \tag{3.30}$$
and αk–1(s’) and βk(s) can be computed recursively as functions of γk(s’, s) given by
$$\alpha_k(s) = \sum_{s' \in S} \alpha_{k-1}(s')\, \gamma_k(s', s)$$
and
$$\beta_{k-1}(s') = \sum_{s \in S} \beta_k(s)\, \gamma_k(s', s) ,$$
respectively [92].
We observe that γk(s’, s) depends on the conditional pdfs of the received signals and is
the key factor in the BCJR algorithm; hence, the performance of the BCJR algorithm de-
pends strongly on the accuracy of the noise model.
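The forward and backward recursions can be sketched for a single constituent decoder as follows. This is a minimal illustration, not the decoder used in the simulations: it assumes the four-state (1 + D^2)/(1 + D + D^2) recursive encoder of Section 3.2.5 rather than the (31, 27) code, an unterminated trellis, a uniform a priori probability, and a toy hard-decision likelihood passed in as `lik`. Swapping `lik` for a chi-square or Gaussian pdf is what changes the noise model. All names are hypothetical.

```python
import math

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]        # register contents (a_{k-1}, a_{k-2})


def step(state, u):
    """One trellis step of the (1 + D^2)/(1 + D + D^2) encoder: (next state, parity bit)."""
    a1, a2 = state
    a = u ^ a1 ^ a2
    return (a, a1), a ^ a2


def encode(bits):
    """Parity sequence for the given information bits, starting from state (0, 0)."""
    state, parity = (0, 0), []
    for u in bits:
        state, p = step(state, u)
        parity.append(p)
    return parity


def bcjr_llr(ys, yp, lik, prior_one=0.5):
    """LLR of each information bit from the alpha, gamma, beta factors of Eq. (3.29).

    lik(y, x) must return p(y | x) > 0. The trellis starts in state (0, 0) and is
    not terminated, so beta is initialised uniformly at the last stage.
    """
    n = len(ys)
    branches = [(s, u) + step(s, u) for s in STATES for u in (0, 1)]   # (s', u, s, parity)
    gamma = [{(s0, u): (prior_one if u else 1.0 - prior_one) * lik(ys[k], u) * lik(yp[k], p)
              for s0, u, s1, p in branches} for k in range(n)]
    alpha = [dict.fromkeys(STATES, 0.0) for _ in range(n + 1)]
    beta = [dict.fromkeys(STATES, 0.0) for _ in range(n + 1)]
    alpha[0][(0, 0)] = 1.0
    for s in STATES:
        beta[n][s] = 1.0
    for k in range(n):                                   # forward recursion
        for s0, u, s1, p in branches:
            alpha[k + 1][s1] += alpha[k][s0] * gamma[k][(s0, u)]
        scale = sum(alpha[k + 1].values())               # normalise to avoid underflow
        for s in STATES:
            alpha[k + 1][s] /= scale
    for k in range(n - 1, -1, -1):                       # backward recursion
        for s0, u, s1, p in branches:
            beta[k][s0] += beta[k + 1][s1] * gamma[k][(s0, u)]
        scale = sum(beta[k].values())
        for s in STATES:
            beta[k][s] /= scale
    llrs = []
    for k in range(n):
        num = sum(alpha[k][s0] * gamma[k][(s0, u)] * beta[k + 1][s1]
                  for s0, u, s1, p in branches if u == 1)
        den = sum(alpha[k][s0] * gamma[k][(s0, u)] * beta[k + 1][s1]
                  for s0, u, s1, p in branches if u == 0)
        llrs.append(math.log(num / den))
    return llrs


def toy_lik(y, x):
    """Toy likelihood (an assumption): the observation equals the sent bit with probability 0.9."""
    return 0.9 if y == x else 0.1


data = [1, 0, 1, 1, 0, 0, 1, 0]
llrs = bcjr_llr(data, encode(data), toy_lik)
decoded = [int(value > 0) for value in llrs]
print(decoded)
```

Because γk enters every term of both recursions, any mismatch between `lik` and the true channel statistics propagates into α, β, and the final LLR, which is the sensitivity discussed above.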
As shown in Fig. 3.12, the differences between the pdfs of the ASE noise with the
chi-square distribution and the Gaussian approximation with the same mean and variance
are not negligible, especially at low Q, as in the case of Q^2 = 5 dB. An obvious question
is, therefore, whether better TC performance can be achieved by modifying the standard
formula of γk(s’, s), which uses the Gaussian noise model, to a new formula using the
more accurate chi-square distribution model given by Eqs. (2.1) and (2.2).
Figure 3.12: Comparison of the pdfs of the ASE noise with a chi-square distribution and a Gaussian distribution with the same mean and variance.
[Two panels at Q^2 = 5 dB: p(y | u = 0) versus y from 0 to 10, and p(y | u = 1) versus y from 0 to 30, each showing the chi-square pdf and the Gaussian approximation.]
We rewrite the pdfs here, with new notation for convenience:
$$p(y_k \mid x_k = 1) = \frac{1}{N_0} \left(\frac{y_k}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\frac{y_k + E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{E y_k}}{N_0}\right) , \qquad y_k \ge 0 , \tag{3.31}$$
$$p(y_k \mid x_k = 0) = \frac{1}{N_0}\, \frac{(y_k / N_0)^{M-1} \exp(-y_k / N_0)}{(M-1)!} , \qquad y_k \ge 0 , \tag{3.32}$$
where $y_k$ represents $y^{s}_{k}$ or $y^{p}_{k}$, $x_k$ represents $u_k$ or $x^{p}_{k}$, E is the transmitted
signal energy, N0/2 is the two-sided power spectral density of the ASE noise, and 2M is
the dimensionality of the optical signal space. When we substitute Eqs. (3.31) and (3.32)
into Eq. (3.30), we obtain
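$$\gamma_k(s', s) \cong \begin{cases}
P(u_k)\, \dfrac{1}{N_0^{2}} \left(\dfrac{y^{s}_{k} y^{p}_{k}}{E^{2}}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^{s}_{k} + y^{p}_{k} + 2E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^{s}_{k}}}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^{p}_{k}}}{N_0}\right) , & u_k = 1,\ x^{p}_{k} = 1 \\[2ex]
P(u_k)\, \dfrac{1}{N_0^{2}} \left(\dfrac{y^{s}_{k}}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^{s}_{k} + y^{p}_{k} + E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^{s}_{k}}}{N_0}\right) \dfrac{(y^{p}_{k} / N_0)^{M-1}}{(M-1)!} , & u_k = 1,\ x^{p}_{k} = 0 \\[2ex]
P(u_k)\, \dfrac{1}{N_0^{2}} \left(\dfrac{y^{p}_{k}}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^{s}_{k} + y^{p}_{k} + E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^{p}_{k}}}{N_0}\right) \dfrac{(y^{s}_{k} / N_0)^{M-1}}{(M-1)!} , & u_k = 0,\ x^{p}_{k} = 1 \\[2ex]
P(u_k)\, \dfrac{1}{N_0^{2}} \left(\dfrac{y^{s}_{k} y^{p}_{k}}{N_0^{2}}\right)^{M-1} \dfrac{\exp\!\left(-\dfrac{y^{s}_{k} + y^{p}_{k}}{N_0}\right)}{[(M-1)!]^{2}} , & u_k = 0,\ x^{p}_{k} = 0 .
\end{cases} \tag{3.33}$$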
Defining
$$L^{e}(u_k) \equiv \log \frac{P(u_k = 1)}{P(u_k = 0)} ,$$
we may write
$$P(u_k) = \left(\frac{\exp[-L^{e}(u_k)/2]}{1 + \exp[-L^{e}(u_k)]}\right) \exp\!\left[\frac{(2u_k - 1)\, L^{e}(u_k)}{2}\right] . \tag{3.34}$$
Note that Eqs. (3.33) and (3.34) can be substituted into Eqs. (3.28) and (3.29) to calculate
the LLR. Thus, all the common terms in the 4 cases in Eq. (3.33) can be removed to sim-
plify the calculations. Then, the γk(s’, s) can be calculated with
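$$\gamma_k(s', s) \cong \begin{cases}
c_1 \exp\!\left(L^{e}(u_k)\right) I_{M-1}\!\left(a\sqrt{y^{s}_{k}}\right) I_{M-1}\!\left(a\sqrt{y^{p}_{k}}\right) , & u_k = 1,\ x^{p}_{k} = 1 \\
\exp\!\left(L^{e}(u_k)\right) I_{M-1}\!\left(a\sqrt{y^{s}_{k}}\right) (y^{p}_{k})^{b} , & u_k = 1,\ x^{p}_{k} = 0 \\
I_{M-1}\!\left(a\sqrt{y^{p}_{k}}\right) (y^{s}_{k})^{b} , & u_k = 0,\ x^{p}_{k} = 1 \\
c_0\, (y^{s}_{k} y^{p}_{k})^{b} , & u_k = 0,\ x^{p}_{k} = 0 ,
\end{cases} \tag{3.35}$$
where a, b, c0, and c1 are constants given by
$$a = \frac{2\sqrt{E}}{N_0} , \quad b = \frac{M-1}{2} , \quad c_1 = (M-1)! \left(\frac{N_0}{\sqrt{E}}\right)^{M-1} \exp\!\left(-\frac{E}{N_0}\right) , \quad c_0 = \frac{1}{(M-1)!} \left(\frac{\sqrt{E}}{N_0}\right)^{M-1} \exp\!\left(\frac{E}{N_0}\right) .$$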
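Defining
$$\gamma^{e}_{k}(s', s) \equiv \begin{cases}
c_1^{\,u_k}\, I_{M-1}\!\left(a\sqrt{y^{p}_{k}}\right) , & x^{p}_{k} = 1 \\
c_0^{\,1-u_k}\, (y^{p}_{k})^{b} , & x^{p}_{k} = 0 ,
\end{cases}$$
the LLR can be calculated iteratively to yield
$$L(u_k) = \log \frac{I_{M-1}\!\left(a\sqrt{y^{s}_{k}}\right)}{(y^{s}_{k})^{b}} + L^{e}(u_k) + \log \frac{\sum_{S^{+}} \tilde{\alpha}_{k-1}(s')\, \gamma^{e}_{k}(s', s)\, \tilde{\beta}_{k}(s)}{\sum_{S^{-}} \tilde{\alpha}_{k-1}(s')\, \gamma^{e}_{k}(s', s)\, \tilde{\beta}_{k}(s)} , \tag{3.36}$$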
where S+ is the set of state pairs (s’, s) caused by uk = 1, and S– is similarly defined for
uk = 0. The first term on the right side of Eq. (3.36), which depends on the currently
observed information bit and the channel SNR, is sometimes called the channel value.
The second term Le(uk) represents any a priori information provided (extrinsic
information received) by the other decoder, and the third term represents extrinsic
information passed to the other decoder.
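The simplification from the full branch metric to the reduced one can be checked numerically. The sketch below assumes the chi-square pdfs in the form of Eqs. (3.31) and (3.32) and the constants a, b, c0, c1 as reconstructed above; the parameter values and the sample observations are arbitrary. It verifies that the full product P(uk)·p(ys|uk)·p(yp|xp) and the reduced metric differ by the same factor in all four cases, so the factor cancels in the LLR. All names are hypothetical.

```python
import math


def bessel_i(order, x, terms=150):
    """Modified Bessel function of the first kind, integer order >= 0, by power series."""
    if x == 0.0:
        return 1.0 if order == 0 else 0.0
    log_half_x = math.log(x / 2.0)
    return sum(math.exp((2 * k + order) * log_half_x
                        - math.lgamma(k + 1) - math.lgamma(k + order + 1))
               for k in range(terms))


E, N0, M = 20.0, 1.0, 3                      # illustrative values (assumptions)
a = 2.0 * math.sqrt(E) / N0
b = (M - 1) / 2
c1 = math.factorial(M - 1) * (N0 / math.sqrt(E)) ** (M - 1) * math.exp(-E / N0)
c0 = (math.sqrt(E) / N0) ** (M - 1) * math.exp(E / N0) / math.factorial(M - 1)


def pdf(y, x):
    """Chi-square pdf p(y | x) for a mark (x = 1) or a space (x = 0)."""
    if x == 1:
        return (y / E) ** b * math.exp(-(y + E) / N0) * bessel_i(M - 1, a * math.sqrt(y)) / N0
    return (y / N0) ** (M - 1) * math.exp(-y / N0) / (N0 * math.factorial(M - 1))


def prior(u, le):
    """P(u_k) written in terms of the a priori LLR L^e(u_k)."""
    return math.exp(-le / 2) / (1 + math.exp(-le)) * math.exp((2 * u - 1) * le / 2)


def gamma_full(ys, yp, u, xp, le):
    """Full branch metric: P(u_k) * p(ys | u_k) * p(yp | xp_k)."""
    return prior(u, le) * pdf(ys, u) * pdf(yp, xp)


def gamma_reduced(ys, yp, u, xp, le):
    """Reduced branch metric with the common factors removed."""
    bes_s = bessel_i(M - 1, a * math.sqrt(ys))
    bes_p = bessel_i(M - 1, a * math.sqrt(yp))
    if u == 1 and xp == 1:
        return c1 * math.exp(le) * bes_s * bes_p
    if u == 1:
        return math.exp(le) * bes_s * yp ** b
    if xp == 1:
        return bes_p * ys ** b
    return c0 * (ys * yp) ** b


def channel_value(ys):
    """First term of the LLR: log of I_{M-1}(a * sqrt(ys)) / ys^b."""
    return math.log(bessel_i(M - 1, a * math.sqrt(ys)) / ys ** b)


ys, yp, le = 18.0, 4.0, 0.7                  # arbitrary sample observations and a priori LLR
ratios = [gamma_full(ys, yp, u, xp, le) / gamma_reduced(ys, yp, u, xp, le)
          for u in (1, 0) for xp in (1, 0)]
llr_direct = math.log(pdf(ys, 1) / pdf(ys, 0))
print(ratios, channel_value(ys) + math.log(c1), llr_direct)
```

In this form the channel value differs from the plain log-likelihood ratio of the systematic observation only by the constant log c1, which the reduced metric carries inside γe instead.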
3.3.2 Effect of the BSC model on the performance of the punctured TC
Punctured TC is more practical than the standard TC in optical fiber transmissions be-
cause of the higher code rates that can be obtained from lower code rate codes. Punctur-
ing can be implemented by deleting some parity and/or information bits at the output of
the encoder [67], [92]. At the input of the decoder, the signals corresponding to the
punctured bits are set to the same value as the optimal hard-decision threshold Iopt [92].
The reason follows. If we assume that the pdfs of the spaces and the marks, p(x|0) and
p(x|1), cross at the point (Icross, pcross) and satisfy the conditions:
(a) p(x|0) > p(x|1) for all x < Icross,
(b) p(x|0) < p(x|1) for all x > Icross,
we can prove that Iopt, which is optimal in the sense of giving the minimum hard-decision
detection error probability, is the crossover point of the two pdf curves, i.e., Iopt = Icross.
Proof:
Suppose Iopt ≠ Icross. Then there are only two possible cases, Iopt < Icross or Iopt > Icross.
First, consider the case when Iopt < Icross. With condition (a), we have p(x|0) > p(x|1) for
x ∈ [Iopt, Icross). Then, the minimum hard-decision detection error probability can be
expressed as
$$\begin{aligned}
P_{\min} &= \int_{-\infty}^{I_{\mathrm{opt}}} p(I \mid 1)\, dI + \int_{I_{\mathrm{opt}}}^{\infty} p(I \mid 0)\, dI \\
&= \int_{-\infty}^{I_{\mathrm{cross}}} p(I \mid 1)\, dI + \int_{I_{\mathrm{cross}}}^{\infty} p(I \mid 0)\, dI + \int_{I_{\mathrm{opt}}}^{I_{\mathrm{cross}}} [\,p(I \mid 0) - p(I \mid 1)\,]\, dI \\
&= P_{\mathrm{cross}} + P_{\mathrm{ext}} ,
\end{aligned}$$
where Pcross is the hard-decision detection error probability with Icross as the decision
threshold, and Pext > 0. Thus, we obtain Pmin > Pcross, which contradicts the definition
of Pmin. Hence, Iopt < Icross is not possible.
Similarly, with condition (b), we can prove that Iopt > Icross is also not possible and,
hence, Iopt = Icross. QED.
This proof leads to the straightforward likelihood ratio result, i.e., if we set punctured
bits to the same value as the optimal hard-decision threshold Iopt, then
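$$\frac{p(\text{preset signal value for punctured bit} \mid \text{punctured bit} = 0)}{p(\text{preset signal value for punctured bit} \mid \text{punctured bit} = 1)} = \frac{p(I_{\mathrm{opt}} \mid 0)}{p(I_{\mathrm{opt}} \mid 1)} = \frac{p(I_{\mathrm{cross}} \mid 0)}{p(I_{\mathrm{cross}} \mid 1)} = 1 . \tag{3.37}$$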
Obviously, a likelihood ratio equal to 1 (and LLR = 0) is the best guess for the punctured
bits in the sense of achieving minimum error probability. Hence, Iopt is the best value to
use for those virtual signals corresponding to the punctured bits. Note that both the chi-
square and Gaussian distributions satisfy the two conditions mentioned above and, hence,
the proof and statements made above are valid for them.
As discussed in Chapter 2, with the Gaussian BSC model of an optical fiber channel
with ASE noise, the optimal hard-decision threshold is assumed to be at the point that
gives equal transition probabilities. Thus, the approximate optimal threshold can be
evaluated with a simple formula as shown in Eq. (2.9), while the actual optimal threshold
for Gaussian noises satisfies Eq. (2.8). Figure 2.3 plots the optimal thresholds evaluated
at Q2 = 6.2 dB for the chi-square BAC, Gaussian BAC, and Gaussian BSC models. We
can see that the optimal thresholds in the first two models both give an ideal likelihood
ratio of 1, while the BSC model gives a likelihood ratio greater than 1.
The Gaussian BSC model gives a more accurate estimate for the BER at higher Q and,
thus, it is expected to perform better at higher Q as a model of optical fiber channels. This
is not true, however, if we use the Gaussian BSC model in the decoding of punctured
TCs. With Eq. (2.4) and (2.9), the approximate optimal threshold IoptBSC based on the
Gaussian BSC model can be expressed as a function of Q given by
QB
B
B
B
B
B
B
BQ
B
BQ
I
+
++=
e
o
e
o
e
o
e
o2
e
o
optBSC
2
where Bo and Be are, respectively, the optical bandwidth and the electrical bandwidth of
the system at the detector. As Q increases, the likelihood ratio
$$\frac{p(I_{\mathrm{optBSC}} \mid x_k = 0)}{p(I_{\mathrm{optBSC}} \mid x_k = 1)}$$
xIp
increases exponentially as shown in Fig. 3.13 for Bo/Be = 3, while the ideal value in turbo
decoding should be 1. Hence, for punctured TC, the BSC model performs even worse at
higher Q.
Figure 3.13: Likelihood ratio using the hard-decision threshold based on the Gaussian BSC model for Bo/Be = 3.
[Plot: likelihood ratio at IoptBSC versus Q^2 (dB) from 0 to 20 dB.]
3.3.3 Simulations of the TC decoders using different channel models
In the simulations, we use a (31, 27, 400) parallel-concatenated-convolutional TC with
the encoder and decoder structure depicted in Fig. 3.14. The (31, 27, 400) TC is a rate
1/3 code, where the first two parameters in the parentheses, 31 and 27, are octal numbers
representing the structures of the constituent encoders. If we transform the octal numbers
31 and 27 into the binary numbers 11001 and 10111, then the digits of the binary numbers
represent the coefficients of the parity-check generator polynomials 1 + D + D^4 and 1 +
D^2 + D^3 + D^4. As depicted in Fig. 3.14(a), “31/27” corresponds to the recursive
parity-check generator polynomial (1 + D + D^4)/(1 + D^2 + D^3 + D^4).
A 400-bit interleaver is used between the two constituent encoders shown in Fig.
3.14(a). The major purposes of using an interleaver are [67]: (1) to generate a long block
code from small memory length convolutional codes, and (2) to decorrelate the two parity
check sequences so that an iterative suboptimum decoding algorithm based on
information exchange between the two constituent decoders can be applied.
[Diagram (a) labels: input uk; outputs uk, x1pk, x2pk; constituent encoder polynomials 1 + D + D^4 and 1 + D^2 + D^3 + D^4; 400-bit interleaver; puncturing mechanism; encoded data sequence.]
(a) Turbo code encoder
[Diagram (b) labels: MAP decoder 1; MAP decoder 2; 400-bit interleavers; 400-bit deinterleaver; inputs ys, y1p, y2p; extrinsic information Le12 and Le21.]
(b) Turbo code decoder
Figure 3.14: Turbo code encoder and decoder structure.
In the turbo encoder, for each original information bit $u_k$, there are two parity
check bits, $x^{1p}_{k}$ and $x^{2p}_{k}$, generated by the two parallel concatenated convolutional
encoders, respectively. Thus, we have a rate 1/3 code. To achieve a higher code rate, a
puncturer can be added at the output of the turbo encoder. The puncturing operation can
be represented by a puncturing matrix, in which each column represents an output block
with the element in the first row corresponding to the information bit and the other
elements corresponding to the parity check bits. A “0” element in the puncturing matrix
means that the corresponding information bit or parity check bit is deleted according to
the puncturing mechanism. Similarly, a “1” means that the corresponding bit is
transmitted. The puncturing matrices for the rate 1/2 and rate 3/4 punctured TCs are
shown in Eqs. (3.38) and (3.39), respectively,
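$$\text{Puncturing Matrix (rate 1/3 to rate 1/2)} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} , \tag{3.38}$$
$$\text{Puncturing Matrix (rate 1/3 to rate 3/4)} = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \tag{3.39}$$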
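The puncturing operation and its inverse at the decoder input can be sketched as follows. The matrix in the example is an illustrative rate-1/2 pattern (every information bit kept, the two parity streams alternating) and is an assumption for the demonstration only; the `fill` argument plays the role of the preset value written into the punctured positions, which Section 3.3.2 argues should be the optimal threshold Iopt. The names `puncture` and `depuncture` are hypothetical.

```python
def puncture(streams, matrix):
    """Keep stream r at time k only if matrix[r][k mod period] is 1.

    streams is [information, parity 1, parity 2]; each column of matrix is one output block.
    """
    period = len(matrix[0])
    out = []
    for k in range(len(streams[0])):
        for r, stream in enumerate(streams):
            if matrix[r][k % period]:
                out.append(stream[k])
    return out


def depuncture(received, matrix, length, fill):
    """Rebuild the three streams; punctured positions are set to the preset value fill."""
    period = len(matrix[0])
    streams = [[fill] * length for _ in matrix]
    values = iter(received)
    for k in range(length):
        for r in range(len(matrix)):
            if matrix[r][k % period]:
                streams[r][k] = next(values)
    return streams


# Illustrative rate-1/2 pattern (an assumption): rows are information, parity 1, parity 2.
matrix = [[1, 1],
          [0, 1],
          [1, 0]]
info = [1, 0, 1, 1, 0, 1]
parity1 = [0, 0, 1, 1, 1, 0]
parity2 = [1, 1, 0, 0, 1, 0]

sent = puncture([info, parity1, parity2], matrix)
restored = depuncture(sent, matrix, len(info), fill=None)
print(len(sent), restored)
```

In a soft-decision decoder the restored streams would hold received signal values rather than bits, and `fill` would be a number on the same scale, so that the punctured positions contribute a likelihood ratio of 1.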
As shown in Fig. 3.14(b), the iterative turbo decoder consists of two serially
concatenated constituent decoders, between which there is a 400-bit interleaver identical
to the one in the turbo encoder shown in Fig. 3.14(a). The first decoder uses MAP decoding
on the received information sequence ys and the parity check sequence y1p generated by
the first encoder and passes the soft extrinsic information Le12 to the second MAP decoder
via the interleaver. Then, the second decoder uses MAP decoding on the interleaved
information sequence and the parity check sequence y2p generated by the second encoder,
with an improved estimate of the a priori probabilities of the information sequence. The
soft extrinsic information Le21 produced by the second MAP decoder is then transferred to
the first decoder as improved a priori knowledge of the information sequence. Thus,
iterative MAP decoding is constructed via the information exchange between the two
constituent MAP decoders.
We simulate the performance of the turbo code with BCJR (MAP) decoding algorithms designed
for the chi-square, Gaussian, and Gaussian BSC models of the optical fiber channel with ASE
noise. In the simulations, chi-square distributed ASE noise is added to the optical fiber
transmission line. We repeat the simulations for different code rates by puncturing the
rate 1/3 turbo code.
In Fig. 3.15, the decoded BER is plotted as a function of the Q factor. In all the
simulations, the Q factor is evaluated based on the encoded data sequence instead of the
original uncoded data sequence. In other words, the Q factor shown in Fig. 3.15 is equivalent
to Es/N0 instead of Eb/N0, where Es represents the average energy of a line symbol, Eb
represents the average energy of an information bit, and N0/2 is the two-sided noise spectral
density of the channel. The results show that the modified TC decoder can achieve more than
2 dB of extra coding gain compared to the TC decoder based on the Gaussian approximation. It
is also shown that the rate 3/4 punctured TC based on the Gaussian BSC model fails to improve
upon the BER of uncoded data. The uncoded data, here, is transmitted at the same signaling
rate as the encoded data. The performance divergence of the rate 3/4 punctured TC from the
BER curve of the uncoded data agrees with the discussion of the effect of the BSC model on
the punctured TC in the previous section.
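The distinction between Es/N0 and Eb/N0 can be made concrete with a short sketch. It assumes only the standard relation Eb = Es/r for a binary code of rate r (each information bit is carried by 1/r line symbols); the function name and the example rates are illustrative, and the mapping from the Q factor to Es/N0 itself is not modeled here.

```python
import math


def ebn0_db_from_esn0_db(esn0_db: float, code_rate: float) -> float:
    """Convert Es/N0 (per line symbol) to Eb/N0 (per information bit), both in dB.

    Assumes Eb = Es / code_rate, so the shift is -10*log10(code_rate) dB.
    """
    return esn0_db - 10.0 * math.log10(code_rate)


# Penalty paid per information bit at the three code rates considered in the text.
for r in (1 / 3, 1 / 2, 3 / 4):
    shift = ebn0_db_from_esn0_db(0.0, r)
    print(f"rate {r:.3f}: Eb/N0 exceeds Es/N0 by {shift:.2f} dB")
```

Under this assumption, a curve plotted against Es/N0 would shift to the right by about 4.8 dB, 3.0 dB, and 1.2 dB for the rate 1/3, 1/2, and 3/4 codes, respectively, if it were replotted against Eb/N0.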
Figure 3.15: Output BER comparison of the turbo code (31, 27, 400) decoder based on the chi-square model (solid), the Gaussian model (dotted), and the Gaussian BSC model (dashed) of the ASE noise in the optical fiber transmission system, together with the BER before decoding (dash-dot). The rate 1/2 and rate 3/4 codes are punctured versions of the rate 1/3 turbo code. Axes: log10(BER) versus Q^2 (dB).
3.4 Summary
In the highest-level studies, we evaluated the lower performance bound (Shannon limit) for
general FEC codes based on the chi-square BAC, the Gaussian BAC, and the Gaussian BSC models
of optical fiber channels. We showed that the use of the simpler channel models, the Gaussian
BAC and Gaussian BSC, is a convenient way to calculate the BER of optical fiber transmission
systems without FEC coding, but is inappropriate for evaluating the lower bound on FEC code
performance. Although the Gaussian BAC model gives acceptable estimates of the lower bound on
performance at code rates higher than 0.8, it underestimates the lower bound (predicts lower
BER or Q) by about 0.4 dB in Q^2 at a code rate of 0.5. The problem becomes more severe at
lower code rates. The Gaussian BSC model, however, overestimates the lower bound on FEC code
performance (predicts higher BER or Q) by about 0.4 to 0.5 dB at all studied code rates from
0.5 to 0.9. Thus, the maximum coding gain achievable with a FEC code is overestimated at low
code rates by the Gaussian BAC model and always underestimated by the Gaussian BSC model. A
more accurate determination of the lower bound, i.e., the Shannon limit, on performance for
optical fiber channels dominated by ASE noise is possible with the chi-square BAC model. A
better calculation of the achievable coding gain and of how closely a code approaches the
Shannon limit is an important step in the search for efficient FEC codes for optical fiber
transmission systems.
In the middle-level studies of linear code performance, we derived upper bounds on linear
code performance in optical fiber channels with ASE as the dominant source of noise. A
general upper bound on the pairwise error probability, P_d, in asymmetric channels was
obtained, and the corresponding bound on P_d for optical fiber channels with dominant ASE
noise was evaluated.
With two example codes, we investigated the accuracy of performance estimates obtained with
the Gaussian approximation of ASE noise instead of the exact chi-square model. We showed
that the Gaussian approximation model overestimates (predicts lower BER or Q) at low Q and
underestimates at high Q the upper bounds on the linear code performance in the optical fiber
channel. The underestimate can be as high as 2 dB in Q^2 at a BER of 10^-12, and becomes
larger for lower BER. The resulting performance bounds also suggest that, with similar code
rate and block length, the TPC is a better choice than the TCC in optical fiber channels
requiring very low BER.
Based on these results, we conclude that accurate noise statistics are critical in the
performance evaluation for turbo codes, which require a priori knowledge of the channel
noise distribution in the decoding. We can also conclude that the derived upper bound on
code performance is a useful tool in the selection and design of linear codes for channels
with asymmetric marks and spaces distributions.
In the lowest-level studies of turbo code performance, we discussed the effects of different
ASE noise models on the performance of the turbo code decoder. We showed that if one uses a
decoder assuming Gaussian noise statistics for a channel that actually has a chi-square noise
distribution, the performance of the decoder degrades significantly. The performance
degradation for the rate 1/2 turbo code can be more than 2 dB in Q^2 at a BER of 10^-6, and
becomes larger at lower BER. Moreover, if the Gaussian BSC approximation is combined with the
puncturing operation to obtain a high-rate turbo code, the resulting punctured turbo code may
cause coding loss instead of the expected coding gain. Based on these results, we conclude
that using accurate channel noise statistics in the iterative MAP decoding algorithm is
critical to achieving the expected coding gain from a turbo code. We recommend that MAP
decoding chip-sets designed for a Gaussian channel not be used in non-Gaussian optical fiber
transmission systems without modification if one wants the best possible code performance.
Chapter 4
A sliding window criterion (SWC) line-code for
mitigating soliton-soliton collision induced errors
The optical soliton is an optical pulse that can propagate undistorted in dispersive
nonlinear optical fibers under specific pulse power and pulse shape conditions [16]. In WDM
soliton systems, soliton-soliton collisions (SSC) are a major nonlinear effect that causes
both timing jitter and amplitude fluctuation and, thus, limits the achievable system data
rate and transmission distance. Unlike ASE noise, which causes random errors,
soliton-soliton collisions cause correlated errors that are highly data-pattern dependent,
as discussed in Chapter 2. Based on the particular characteristics of the data-pattern
dependence of the SSC-induced errors, we introduce a new line-coding design criterion, the
sliding window criterion (SWC), and develop a new line-coding scheme to mitigate the
SSC-induced errors. The SWC code mitigates the SSC-induced errors by reshaping the data
pattern according to the SWC.
In this chapter, we first investigate the limitations of FEC codes, specifically the
Reed-Solomon (RS) codes, in correcting SSC-induced errors. This leads us to introduce
line-coding to resolve this problem. We introduce the new concepts related to the SWC codes.
Then we describe two encoding algorithms, block-based and trellis-based, developed for SWC
codes. We also discuss the concatenation of the SWC code with a RS code to achieve the very
low BER (< 10^-11) required by optical fiber communications. Finally, we evaluate and compare
the performance of the proposed concatenated RS/SWC code with a couple of RS codes and a
concatenated RS/convolutional code via simulations.
4.1 Reed-Solomon code without line-coding
Reed-Solomon (RS) codes are increasingly used in optical fiber systems. Here, we
study their performance in the presence of SSC-induced errors, and we then explain the
motivation for a concatenated coding scheme in which our SWC line-code is used as the
inner code and a RS code is used as the outer code.
As discussed in Chapter 2, bit errors caused by soliton-soliton collisions are highly cor-
related and, thus, are bursty. RS codes are designed to correct burst errors of limited
length [55]. Thus, it appears that a RS code might be a good solution for correcting SSC-
induced errors. However, the actual performance of RS codes in an optical fiber channel
with a high SSC-induced BER is limited as shown below.
There are two ways to obtain stronger RS codes. One is to use longer codewords, and
the other is to introduce a larger redundancy by reducing the code rate. RS codes with
very long codewords are difficult to implement in a practical system, especially at the
very high data rates that are present in optical fiber communications systems. Moreover,
increasing the redundancy of the RS code does not always improve the performance for
SSC-induced errors, as we show next.
First, we need to evaluate the probability distribution of a complete SSC-induced time
shift in a two-channel system. Let X(nT) denote the SSC-induced time-shift process, which
can be expressed as
$$X(nT) = W(n) + W(n-1) + \cdots + W(n - N_{\max} + 1), \tag{4.1}$$
where W_n ≡ W(n) is a Bernoulli random variable representing the arrival time shift of a
soliton in one channel induced by the interference with a soliton in the other channel. The
probability mass function (PMF) of W_n is given by
$$P(W_n = \delta t) = p, \qquad P(W_n = 0) = 1 - p,$$
where p is the probability of individual marks in the transmitted data sequence. The
probability mass function, expectation, and variance of X_n ≡ X(nT) are given by
$$P(X_n) = \binom{N_{\max}}{X_n/\delta t}\, p^{N} (1-p)^{N_{\max} - N}, \tag{4.2}$$
$$\mathrm{Var}\,X_n = p(1-p)\,\delta t^{2} N_{\max}, \tag{4.3}$$
$$E\,X_n = p\,\delta t\,N_{\max}, \tag{4.4}$$
where N = X_n/δt is the number of collisions.
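As a numerical check of Eqs. (4.2) through (4.4), the following sketch builds the PMF of the time shift as a scaled binomial distribution and compares its mean and variance with the closed-form expressions. The parameter values (20 collisions, a mark probability of 0.5, and a shift of 0.1 time units per collision) are arbitrary assumptions for illustration.

```python
from math import comb


def time_shift_pmf(n_max: int, p: float, dt: float) -> dict:
    """PMF of X_n = dt * (number of collisions), with the count binomial(n_max, p)."""
    return {k * dt: comb(n_max, k) * p**k * (1 - p) ** (n_max - k)
            for k in range(n_max + 1)}


# Hypothetical parameters, not taken from the dissertation.
n_max, p, dt = 20, 0.5, 0.1

pmf = time_shift_pmf(n_max, p, dt)
mean = sum(x * q for x, q in pmf.items())
var = sum((x - mean) ** 2 * q for x, q in pmf.items())

print(mean, p * dt * n_max)               # numerical mean vs. Eq. (4.4)
print(var, p * (1 - p) * dt**2 * n_max)   # numerical variance vs. Eq. (4.3)
```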
When the number of channels increases, N_max will be large and δt_min will be small. In this
case, the central limit theorem implies that P(X_n) approaches a normal distribution
N(E X_n, Var X_n). The distribution of SSC-induced time shifts, therefore, can be
approximated by the normal distribution N(µ, σ) as shown in Fig. 4.1, where µ = p δt N_max,
σ^2 = p(1 − p) δt^2 N_max, T_r is the duration of the signal receiving window at the
receiver, and T is the time slot for one symbol. Generally, the center of the signal
receiving window is set at the mean of the time shifts. Detection errors are induced when
solitons shift outside the signal receiving window. Hence, the probability of SSC-induced
errors can be estimated by integrating the normal pdf outside the receiving window, which
gives
$$\mathrm{BER}_{\mathrm{un}} = \mathrm{erfc}\!\left(\frac{T_r/2}{\sqrt{2}\,\sigma}\right),$$
where erfc(x) is the complementary error function. Let a = T_r/(2T); in uncoded systems,
T = 1/F, where F is the data rate. Thus, the SSC-induced BER of the received uncoded data
sequence can be estimated by
$$\mathrm{BER}_{\mathrm{un}} = \mathrm{erfc}\!\left(\frac{a/F}{\sqrt{2p(1-p)\,\delta t^{2} N_{\max}}}\right). \tag{4.5}$$
[Figure 4.1: Approximated distribution of SSC-induced time shift. The plot shows the normal distribution N(µ, σ) versus the SSC-induced time shift, with the receiving window spanning −Tr/2+µ to Tr/2+µ inside the time slot for one symbol, which spans −T/2+µ to T/2+µ.]
For a FEC code with code rate r, the signaling rate for a fixed data rate F increases to
F/r, so that the maximum number of collisions for each soliton increases to
N'_max = N_max/r. Thus, the SSC-induced BER of the received FEC encoded data sequence becomes
$$\mathrm{BER}_{\mathrm{FEC}} = \mathrm{erfc}\!\left(\frac{a/F}{\sqrt{2\,r^{-3}\,p(1-p)\,\delta t^{2} N_{\max}}}\right), \tag{4.6}$$
and the ratio BER_FEC/BER_un increases very rapidly with increased redundancy because r < 1
yields BER_FEC > BER_un. Even though the error correction capability of the FEC code
increases with redundancy, the price to be paid in this case is an increased SSC-induced BER
of the received data sequence. Hence, the FEC code can only improve the system performance
as long as the increase of its error correction capability is greater than the degradation
of the channel due to the increased transmission bit rate. We conclude that increasing the
redundancy of a FEC code does not always imply better performance. Indeed, there is an
optimal code rate at which the FEC code achieves the best performance in correcting
SSC-induced errors. This statement is true for FEC codes in general and, hence, holds for RS
codes as well as for convolutional codes.
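The dependence of the SSC-induced BER on the code rate can be sketched numerically. The function below follows the steps stated in the text: the symbol slot shrinks to T = r/F, and the number of collisions grows to N_max/r. The function name and all parameter values are assumptions for illustration (the 12.5 Gb/s data rate is the one quoted for Fig. 4.2; a, δt, and N_max are made up).

```python
from math import erfc, sqrt


def ssc_ber(a: float, data_rate: float, p: float, dt: float, n_max: float,
            code_rate: float = 1.0) -> float:
    """SSC-induced BER before decoding under the normal approximation.

    code_rate = 1 reproduces the uncoded case; code_rate < 1 shortens the
    symbol slot (T = code_rate / data_rate) and raises the number of
    collisions to n_max / code_rate.
    """
    half_window = a * code_rate / data_rate              # T_r / 2 = a * T
    sigma2 = p * (1 - p) * dt**2 * n_max / code_rate     # variance of the time shift
    return erfc(half_window / sqrt(2 * sigma2))


# Hypothetical operating point: 12.5 Gb/s, a = 0.4, 2 ps shift per collision, 50 collisions.
params = dict(a=0.4, data_rate=12.5e9, p=0.5, dt=2e-12, n_max=50)

ber_un = ssc_ber(**params)
ber_fec = ssc_ber(**params, code_rate=0.8)
print(f"uncoded: {ber_un:.3e}, rate 0.8: {ber_fec:.3e}")
```

With these assumed numbers the BER before decoding rises by more than two orders of magnitude when the code rate drops from 1 to 0.8, which illustrates why added redundancy does not automatically pay off.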
Figure 4.2 plots the error correction capability of the RS (255, m) codes and the
corresponding estimated BERs of the received RS encoded data sequence before decoding. We
use the highest BER of the received data sequence that is error free after decoding as the
measure of the RS code error correction capability. The error correction capability of the
RS (255, m) codes is a function of the code redundancy, defined as k = 255 − m in our
calculations. The RS (255, m) codes have a codeword length of 255 symbols, and each symbol
has 8 bits. The upper bound on the error correction capability of the RS (255, m) codes
plotted in Fig. 4.2 corresponds to the case in which all 8 bits of a symbol are in error
whenever a symbol error occurs. At the other extreme, the lower bound corresponds to the
case in which only one bit of the 8-bit symbol is in error whenever a symbol error occurs.
A RS (255, m) code can correct up to (255 − m)/2 = k/2 symbol errors in a codeword
comprising 255 symbols [55]. Considering that SSC-induced errors are correlated, we assume
that the average number of bit errors in each symbol error equals 2; thus, the equivalent
number of correctable bit errors equals k. Based on this assumption, we evaluate the error
correction capability (BER_ECC) of the RS (255, m) codes as a function of the code
redundancy k = 255 − m, which is given by BER_ECC = k/(255×8). We evaluate the SSC-induced
BER of the RS (255, m) encoded data sequence as a function of the code redundancy k by
replacing the code rate r in Eq. (4.6) with 1 − k/255.
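The three error-correction-capability curves described above follow from simple arithmetic, sketched below. The function name and argument names are illustrative assumptions; the formula is the one in the text, generalized to an arbitrary average number of bit errors per symbol error.

```python
def rs_ecc_ber(k: int, bit_errors_per_symbol_error: float,
               n_symbols: int = 255, bits_per_symbol: int = 8) -> float:
    """Highest pre-decoding BER that RS (255, 255 - k) can correct.

    The code corrects k/2 symbol errors per codeword; each symbol error is
    assumed to contain the given average number of bit errors.
    """
    correctable_bits = (k / 2) * bit_errors_per_symbol_error
    return correctable_bits / (n_symbols * bits_per_symbol)


k = 56  # redundancy of RS (255, 199)
lower = rs_ecc_ber(k, 1)    # one bit in error per symbol error
assumed = rs_ecc_ber(k, 2)  # assumption used in the text: k / (255 * 8)
upper = rs_ecc_ber(k, 8)    # all 8 bits in error per symbol error
print(lower, assumed, upper)
```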
Figure 4.2: SSC-induced BERs before RS decoding and error correction capability of RS (255, m) codes as a function of redundancy k = 255 − m at the data rate of 12.5 Gb/s. Dashed: upper and lower bounds on the error correction capability of RS (255, m) codes. Circles: error correction capability of RS (255, m) codes when the average number of bit errors in each symbol error equals 2. Triangles: received SSC-induced BER before RS decoding. Stars: margin of error correction of RS (255, m) codes. Axes: log10(BER) versus RS (255, m) code redundancy (k = 255 − m).
As shown in Fig. 4.2, the RS-code error correction capability increases more slowly than
the SSC-induced BER as the code redundancy k = 255 − m increases. These two curves cross
when the RS code has a redundancy of k = 56, corresponding to RS (255, 199). To achieve the
largest error correction margin, which is defined as the difference between the error
correction capability and the BER before RS decoding, the optimal redundancy of the
RS (255, m) codes for correcting SSC-induced errors is k = 36. We can see that RS codes have
a limited error correction capability to combat SSC-induced bit errors and, hence, their use
is not efficient in WDM systems with a high SSC-induced BER. However, if a line-code is
first used to decrease the SSC-induced BER to a low level, then very low BERs can be
achieved by using a RS code with little redundancy. This idea leads to the concatenated
RS/SWC codes described in the following section.
4.2 SWC code
The goal of the SWC code is to minimize the deviation of the number of collisions
each soliton may experience and, hence, minimize the timing jitter and the bit error rate.
Consider the two-channel WDM system shown in Fig. 4.3, where the maximum number
of collisions for each soliton is N12. Each soliton in one of the two channels will interact
with a bit block of length N12 in the other channel along the entire optical fiber path. If all
blocks with N12 bits in the other channel have “almost” the same number of marks, then
solitons in the first channel will experience “almost” the same number of collisions.
Based on this observation, the problem of minimizing the deviation in the number of col-
lisions can be transformed into an encoding problem in which the goal is to minimize the
deviation in the number of marks in each block of length N12. A simple binary block code
can be constructed to achieve this goal in which all the codewords have N12 bits, and each
has the same number of marks.
[Figure 4.3: Soliton-soliton collision in a two-channel WDM system; the rectangular block is defined as the sliding window. The diagram shows the pulse sequences of channel 1 and channel 2, with symbol period T, along the fiber path from the transmitter end to the receiver end.]
In order to make any block with N12 bits in the encoded data stream have almost the
same number of marks, the pattern of the encoded data at the beginning and the end of
codewords in the encoded data stream must be taken into account as well. We introduce
the following concepts in order to construct the block SWC codes:
Fragmental
A binary block is fragmental if it has at least one transition from mark to space or from
space to mark. A codeword is n-bit fragmental if every n-bit block in the codeword is
fragmental.
Fragmentation degree (FD)
The n-bit fragmentation degree of a binary codeword is defined as
$$\mathrm{FD}_n = \frac{m}{l - n + 1} \in [0, 1], \tag{4.7}$$
where l is the length of the codeword and m is the number of (overlapping) n-bit fragmental
blocks in the codeword.
Fragmental end
A binary codeword has n-bit fragmental ends if both its first n bits and its last n bits
are fragmental.
The following are two examples of these new concepts.
Examples:
For codeword “10100110”, l = 8, n = 3, m = 6, FD_3 = m/(l − n + 1) = 1, and it has 3-bit
fragmental ends.
For codeword “11110000”, l = 8, n = 3, m = 2, FD_3 = m/(l − n + 1) = 1/3, and it does not
have 3-bit fragmental ends.
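The definitions above translate directly into code. The sketch below is one possible reading of them (function names are illustrative): a binary block contains a mark-to-space or space-to-mark transition exactly when it contains both symbols, and the fragmentation degree is the fraction of overlapping n-bit windows that are fragmental. It is checked against the two example codewords.

```python
def is_fragmental(block: str) -> bool:
    """A binary block has at least one transition iff it contains both 0 and 1."""
    return "0" in block and "1" in block


def fragmentation_degree(word: str, n: int) -> float:
    """n-bit fragmentation degree: m / (l - n + 1), with overlapping windows."""
    windows = [word[i:i + n] for i in range(len(word) - n + 1)]
    return sum(is_fragmental(w) for w in windows) / len(windows)


def has_fragmental_ends(word: str, n: int) -> bool:
    """True if both the first n bits and the last n bits are fragmental."""
    return is_fragmental(word[:n]) and is_fragmental(word[-n:])


for word in ("10100110", "11110000"):
    print(word, fragmentation_degree(word, 3), has_fragmental_ends(word, 3))
```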
We define the sliding window criterion (SWC) used to test the performance of SWC codes as
J_L = var(K_L), (4.8)
where K_L is the number of marks in a sliding window of length L. We can now define the
rules for constructing the SWC code look-up table as follows. Select codewords with: (1)
similar weights, (2) high fragmentation degrees, and (3) fragmental ends. A smaller J_L
implies better satisfaction of the rules.
4.3 Block and trellis-based SWC codes
Based on the SWC, we develop two different types of SWC codes: the block SWC code and the
trellis-based SWC code. The block SWC code has a standard block code structure and, thus,
has a simple implementation that uses a codeword look-up table for encoding and decoding.
By contrast, the trellis-based SWC code has a convolutional code structure, which allows it
to improve its performance by increasing the memory depth while keeping a short codeword
length.
4.3.1 Block SWC codes
Both the length of the SWC codeword and the length of the sliding window should be taken
into consideration in constructing the best codeword look-up table defining a block code in
the SWC sense. The sliding window length L by definition is set to the maximum number of
expected collisions between the two neighboring channels, N12. Hence, according to
Eq. (2.20), L depends on the bit rate F, channel spacing ∆λ, transmission distance Z,
optical fiber dispersion D, and the dispersion map in dispersion-managed soliton systems.
Given the sliding window length L, the codeword length N and the code rate R = M/N, where M
is the data-word length, must be determined by taking the system framing structure and
available bandwidth into account. If the SWC codeword is much shorter than the sliding
window, there may be several codewords within the sliding window. In this case, the SWC
depends more on the numbers of marks in the codewords than on their fragmentation degrees.
Hence, rule (1) for SWC codeword look-up table construction should be weighted more heavily
than rule (2).
Conversely, if the SWC codeword is longer than the sliding window, there is less than one
codeword within the sliding window. In this case, the SWC depends more on the fragmentation
degrees than on the numbers of marks in the codewords, and rule (2) should be emphasized
over rule (1). This point is illustrated in the following example: two 24-bit blocks of
four 6-bit codewords each are given in Table 4.1. The four codewords in the first block
have a higher fragmentation degree, but different numbers of marks. The codewords in the
second block have the same weight, but a lower fragmentation degree. We evaluate these two
blocks with a 3-bit sliding window and a 12-bit sliding window, respectively. The results
are shown in Table 4.1.
Table 4.1: SWC codeword examples

                               Data block 1                   Data block 2
Codewords                      001010 010010 101101 101011    111000 100011 001110 000111
3-bit FD                       1 1 1 1                        0.5 0.75 0.75 0.5
Number of marks in codewords   2 2 4 4                        3 3 3 3
Number of marks in 3-bit SW    1121111111212222222122         3210111012211232100123
Number of marks in 12-bit SW   4 5 5 5 6 5 6 7 6 7 7 7 8      6 5 4 4 5 6 6 5 5 5 6 6 6
SWC testing result             Better for 3-bit SW            Better for 12-bit SW
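The entries of Table 4.1 can be reproduced with a short sketch of the SWC metric J_L = var(K_L). Two assumptions are made here: the window slides one bit at a time over the 24-bit block without wrapping around, and the population variance is used (the text does not say which variance estimator is meant; the ranking of the two blocks is the same with the sample variance).

```python
from statistics import pvariance


def window_counts(bits: str, L: int) -> list:
    """Number of marks K_L in every length-L window, sliding one bit at a time."""
    return [bits[i:i + L].count("1") for i in range(len(bits) - L + 1)]


def swc(bits: str, L: int) -> float:
    """Sliding window criterion J_L = var(K_L) (population variance assumed)."""
    return pvariance(window_counts(bits, L))


block1 = "001010" "010010" "101101" "101011"   # data block 1 of Table 4.1
block2 = "111000" "100011" "001110" "000111"   # data block 2 of Table 4.1

for L in (3, 12):
    print(L, swc(block1, L), swc(block2, L))
```

With these assumptions, block 1 gives the smaller J_3 (0.25 versus about 0.87) and block 2 gives the smaller J_12 (about 0.52 versus about 1.23), in agreement with the last row of the table.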
For each of the two weighting criteria (achieving a higher fragmentation degree or similar
codeword weights), we introduce a corresponding algorithm to generate the codeword look-up
tables. The first is called the fragmentation-first (FF) algorithm, and the second is
called the weight-first (WF) algorithm. We construct the codeword look-up table with the FF
algorithm for N > L and with the WF algorithm for N ≤ L. The flow diagrams for the FFMBNB
and WFMBNB algorithms are shown in Fig. 4.4, where FFMBNB and WFMBNB denote the block SWC
codes with an M-bit data word and an N-bit codeword based on the FF algorithm and the WF
algorithm, respectively.
In Fig. 4.4, M is the data-word length; N is the codeword length; p is the mark probability
of the original information data sequence; i and j are counters used to calculate the index
of the sections in the codeword look-up table; W is a counter for the number of currently
selected codewords in the code look-up table; n is the fragmentation order; J is the total
number of sections in the code look-up table in the FF algorithm, and the number of sections
for each i in the WF algorithm; d_j is the minimum fragmentation degree of codewords in the
j-th section; and FD_n is the n-bit fragmentation degree of the codewords.
Both the FF and WF algorithms for the construction of the codeword look-up table of
block SWC codes are based on exhaustive search, but they place different emphases on
high fragmentation degree and similar weights of codewords. The FF algorithm starts
with a very high d1, which is the minimum fragmentation degree of codewords selected
in the first section of the codeword look-up table, and gradually decreases this minimum
fragmentation degree limit to get more sections in the codeword look-up table until ob-
taining sufficient codewords. On the other hand, the WF algorithm starts with codewords
of weight S = N/2, and gradually changes the preferred code weights away from N/2
until obtaining sufficient codewords in the codeword look-up table. Hence, the FF algo-
rithm places more emphasis on high fragmentation degrees, while the WF algorithm
places more emphasis on similar code weights.
[Figure 4.4: Algorithms for generating the SWCMBNB code look-up table. (a) Fragmentation-first algorithm; (b) weight-first algorithm.
Flow-diagram boxes in (a): set n, J, d1, d2, ..., dJ for given M, N, p, and let j = 1, W = 0; select x codewords with FDn > dj and fragmental ends as the (2j − 1)th section of the code table, W = W + x; test W >= 2^M; select x codewords with FDn > dj and no fragmental ends as the 2j-th section of the code table, W = W + x; test W >= 2^M; j = j + 1, test j > J; discard extra codewords in the last section, keeping 2^M codewords; obtain the code table.
Flow-diagram boxes in (b): set n, J, d1, d2, ..., dJ for given M, N, p, and let j = 1, i = 0, W = 0, S = N/2; select x codewords with S marks, FDn > dj, and fragmental ends as the (2j − 1 + 2iJ)th section of the code table, W = W + x; test W >= 2^M; select x codewords with S marks, FDn > dj, and no fragmental ends as the (2j + 2iJ)th section of the code table, W = W + x; test W >= 2^M; j = j + 1, test j > J; i = i + 1, j = 1, S = S + (−1)^i i; discard extra codewords in the last section, keeping 2^M codewords; obtain the code table.]
Although the two algorithms have their individual emphases, all the three factors ––
(1) similar weights, (2) high fragmentation degrees, and (3) fragmental ends –– are taken
into account in codeword selection. According to the FF and WF algorithms described in
Fig. 4.4, codewords in a section constructed earlier, i.e., with a lower section index value,
are preferred (in the sense of better satisfying the SWC) to codewords in a section con-
structed later, i.e., with a larger section index value.
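A simplified sketch of the weight-first selection may help to make the section ordering concrete. It is an interpretation of the flow diagram, not the author's implementation: codeword weights are visited in the order N/2, N/2 − 1, N/2 + 1, ... (the update S = S + (−1)^i i), and within each weight the fragmentation-degree thresholds d_j are applied in the given order, with codewords that have fragmental ends placed before those that do not. The function names and the threshold values are assumptions.

```python
from itertools import product


def is_fragmental(block: str) -> bool:
    return "0" in block and "1" in block


def fd(word: str, n: int) -> float:
    windows = [word[i:i + n] for i in range(len(word) - n + 1)]
    return sum(is_fragmental(w) for w in windows) / len(windows)


def has_fragmental_ends(word: str, n: int) -> bool:
    return is_fragmental(word[:n]) and is_fragmental(word[-n:])


def weight_first_table(M: int, N: int, n: int, thresholds: list) -> list:
    """Pick 2**M N-bit codewords, earlier entries satisfying the rules better."""
    need = 2 ** M
    words = ["".join(bits) for bits in product("01", repeat=N)]
    table, chosen = [], set()
    s = N // 2
    for i in range(N + 2):
        s += (-1) ** i * i                 # S = S + (-1)^i * i
        if not 0 <= s <= N:
            continue
        for d in thresholds:               # d_1 > d_2 > ... > d_J
            for want_ends in (True, False):
                section = [w for w in words
                           if w not in chosen and w.count("1") == s
                           and fd(w, n) > d
                           and has_fragmental_ends(w, n) == want_ends]
                table.extend(section)
                chosen.update(section)
                if len(table) >= need:
                    return table[:need]    # discard extras in the last section
    raise ValueError("not enough codewords for the given thresholds")


# Hypothetical 4B6B example with 3-bit fragmentation and made-up thresholds.
table = weight_first_table(M=4, N=6, n=3, thresholds=[0.9, 0.6, -1.0])
print(table)
```

In this toy case all 16 selected codewords have weight 3, because the 20 available weight-3 words already exceed the 2^M = 16 that are needed.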
For any random binary input sequence with equal probability (p = 0.5) of marks and spaces,
the mappings into all codewords are equally likely; hence, the arrangement of codewords in
the look-up table does not affect the code performance. However, inspection of real framed
data in communications has shown that it is unrealistic to assume that all data words are
equally likely [72]. Thus, for a given mark probability p of the input data sequence, we can
calculate the probability of an M-bit data word with m marks as p_c = p^m (1 − p)^(M − m).
Then, by assigning codewords that better satisfy the SWC, i.e., those in sections with lower
section index values, to data words with higher p_c, better code performance can be
achieved. Therefore, in both algorithms, codeword look-up tables are divided into several
sections depending on how well the selected codewords satisfy the SWC, and the optimal code
can be achieved with appropriate assignments of the codewords to the data words according
to the statistics of the source data.
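The assignment step can be sketched as a sort: data words are ranked by p_c and paired with the codewords in table order, so that the most probable data word receives the codeword from the earliest section. The 2-bit data words, the mark probability, and the four codewords below are made-up values for illustration, and the tie-breaking between equally probable data words is arbitrary here.

```python
from itertools import product


def word_probability(word: str, p: float) -> float:
    """p_c = p^m * (1 - p)^(M - m) for an M-bit word with m marks."""
    m = word.count("1")
    return p**m * (1 - p) ** (len(word) - m)


def assign_codewords(table: list, M: int, p: float) -> dict:
    """Map the most probable data words to the earliest (best) codewords."""
    data_words = ["".join(bits) for bits in product("01", repeat=M)]
    # sorted() is stable, so equally probable words keep their natural order.
    ranked = sorted(data_words, key=lambda w: word_probability(w, p), reverse=True)
    return dict(zip(ranked, table))


# Hypothetical ordered code table (best codeword first) and mark probability.
ordered_table = ["0101", "1010", "0110", "0011"]
encoder = assign_codewords(ordered_table, M=2, p=0.25)
print(encoder)
```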
To compare the proposed SWC with the conventional line-code performance criteria,
transition density and balanced transmission [53], [54], we evaluate the influence of the
SWC code construction algorithms on the power spectral density of the encoded data sequence
using the spectral analysis technique developed by Cariolaro and Tronca [73]. The spectral
density of block-coded sequences is evaluated by representing the encoding process as a
finite-state synchronous sequential machine and using the theory of homogeneous Markov
chains to obtain both the continuous and the discrete spectral components. Figure 4.5 plots
the continuous power spectral density components of a random sequence, a FF8B10B code, a
WF8B10B code, and the Manchester code for comparison.
Figure 4.5: Continuous components of the power spectral densities of the uncoded random signal (solid) and the signals encoded by the FF8B10B (dash-dot), the WF8B10B (dotted), and the Manchester (dotted) codes. Axes: power spectral density versus ω (0 ~ 2π).
As expected, the power spectral density of the FF8B10B code has larger components at high
frequencies and, hence, implies a higher transition density than the WF8B10B code. However,
the WF8B10B code, as shown by the lower power of its DC component, is more balanced than
the FF8B10B code. This observation indicates that the conventional transition density and
balance criteria used for line-coding schemes are not effective measures for evaluating the
performance of SWC codes in decreasing SSC-induced errors. Hence, the appropriate
performance criterion for SWC codes is the SWC. We have shown in Table 4.1 that, with the
SWC as the performance criterion, the FF SWC codes perform better for N > L and the WF SWC
codes perform better for N ≤ L, where N is the codeword length and L is the sliding window
length. This general claim is also supported by the simulation results presented in
Sec. 4.5.
As shown in Fig. 4.6, the encoding and decoding of the block SWC code can be simply
implemented by writing the codeword look-up table into a memory chip and using the
input data block as the memory address. Thus, the output of the memory chip would be
the encoded data and the encoding speed is determined by the read cycle time of the chip.
Similarly, we can use the received encoded data (hard-decisioned) as the memory address
to implement the decoding.
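In software, the memory-based encoder and decoder reduce to two table look-ups, as the following sketch shows. Only the three table entries listed in Fig. 4.6(b) are used; the remaining entries of the 8B10B table are not given there and are left out, so this partial table can encode only those three data words. The function names are illustrative.

```python
# Partial 8B10B look-up table: only the three entries shown in Fig. 4.6(b).
ENCODE_TABLE = {
    "00000000": "1010101010",
    "00111100": "1100110010",
    "11111111": "0101010101",
}
# The decoder memory is the inverse mapping (codeword used as the address).
DECODE_TABLE = {code: data for data, code in ENCODE_TABLE.items()}


def lut_map(bits: str, table: dict, block_len: int) -> str:
    """Split the stream into fixed-length blocks and look each block up."""
    if len(bits) % block_len:
        raise ValueError("stream length must be a multiple of the block length")
    return "".join(table[bits[i:i + block_len]]
                   for i in range(0, len(bits), block_len))


data = "00000000" + "00111100" + "11111111"
encoded = lut_map(data, ENCODE_TABLE, 8)     # 8-bit address, 10-bit output
decoded = lut_map(encoded, DECODE_TABLE, 10) # 10-bit address, 8-bit output
print(encoded)
print(decoded == data)
```

A hardware memory returns some word for every address, whereas this dictionary raises a `KeyError` for a 10-bit block that is not a valid codeword; how invalid received blocks are handled is a design choice not covered by this sketch.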
Currently, a number of high-speed memory chips are commercially available. For ex-
ample, we can use the Motorola MCM64E918 RAM chip (currently in production) with a
19-bit address and an 18-bit output to implement a 16B18B SWC code. The minimum
read-cycle time that can be achieved with this chip is 3 ns; hence, we can achieve 18
bits/3 ns, i.e., 6 Gbps encoding and decoding speeds. By using m of these chips in paral-
lel, we can achieve as high as 6m Gbps encoding and decoding speeds.
[Figure 4.6: Implementation of the block SWC code.
(a) Hardware structure of the block SWC code implementation: input data sequence, serial-to-parallel conversion (8-bit block), SWC code encoder (memory chip; address A0~A7, data D0~D9), parallel-to-serial conversion (10-bit block), encoded data sequence, optical modulator, optical pulse sequence, fiber channel, received optical pulse sequence, optical detector, detected data sequence, serial-to-parallel conversion (10-bit block), SWC code decoder (memory chip; address A0~A9, data D0~D7), parallel-to-serial conversion (8-bit block), decoded data sequence.
(b) Block SWC code encoding example: the SWC code encoder (memory chip; address A0~A7, data D0~D9) holds one codeword per memory unit, e.g., 00000000: 1010101010, ..., 00111100: 1100110010, ..., 11111111: 0101010101; the input stream is split into 8-bit blocks and the 10-bit output blocks are serialized.]
4.3.2 Trellis-Based SWC Codes
As an alternative to the block SWC code, we develop a trellis-based SWC code using a
more sophisticated encoding algorithm. The trellis-based coding schemes are particularly
promising for this application because of the natural match between the sliding window
nature of the physical effects in optical communications and the operation of trellis-based
encoding [74]–[78].
By using SWC as the metric, we develop a new set of trellis-based codes. In the trellis-
based code approach, we use knowledge of the recent output sequence, equivalently the
recent input sequence and encoder state, to determine the next output sequence compo-
nents, i.e., it is Markovian with a code-designer-chosen memory (finite-state machine).
With the continuous input stream structure of the trellis-based approach, we do not need
to consider the fragmental ends in our development. We investigate the design of SWC-
based trellis codes where the main difference from conventional trellis codes is in the
output coded sequence design using the SWC metric. Trellis-based encoding allows us to
choose from a subset of output sequences at any given symbol interval that satisfy the
SWC with respect to the previous output.
Figure 4.7 shows the general (n, k, m) trellis-based SWC encoder structure, where n is the
codeword length in the codeword look-up tables, k is the input data-word length, and m is
the memory depth of the trellis-based SWC encoder. For a trellis-based SWC code with a
sliding window of length L > n, the encoder memory depth m is given by
$$m = \frac{L}{n} - 0.5, \tag{4.9}$$
with m taken as an integer.
As shown in the figure, there are two main modules in the structure of the trellis-based
SWC encoder: the codeword-mapping module and the state-determination module. The
codeword-mapping module includes q codeword look-up tables, T_0, T_1, …, T_{q−1}, and a
look-up table selector. Codewords in the q codeword look-up tables have different average
weights. The state-determination module has n×m memory units storing the m previous output
data vectors, v^1, v^2, …, v^m, and defining the current trellis state S. The k-bit input
binary data vector u^i = (u_1^i, u_2^i, …, u_k^i) is encoded into an n-bit output binary
data vector v^i = (v_1^i, v_2^i, …, v_n^i) by using one of the q codeword look-up tables,
which is enabled according to the current trellis state S. Let N be the number of marks
within the m previous output data vectors, v^1, v^2, …, v^m, which is given by
$$N = \sum_{i=1}^{n} \sum_{j=1}^{m} v_i^{j}, \tag{4.10}$$
then the current trellis state S, which is updated by the state-determination module, is a
function of N given by
$$S = f(N) \in \{S_0, S_1, \ldots, S_{p-1}\}, \tag{4.11}$$
where p represents the number of possible states.
The basic idea of the proposed trellis-based SWC encoder is: if the previous n×m output
bits comprise many (few) marks, then choose a codeword look-up table whose
codewords have low (high) weights for the current encoding. Thus, we can minimize the
variance of the number of marks in the data blocks comprising the current output codeword
and the m previous output data words. According to Eq. (4.9), the current output n-bit
codeword plus the m previous output n-bit codewords have a similar block length, if
not the same, as the sliding window length. Therefore, the proposed trellis-based SWC
encoding algorithm can minimize JL in Eq. (4.8) and, hence, satisfy the SWC.
[Figure 4.7 block diagram. Codeword mapping: the input u0 = (u_1^0, u_2^0, ..., u_k^0) is mapped to the output v0 = (v_1^0, v_2^0, ..., v_n^0) by one of the (n, k) look-up tables T0, T1, ..., Tq-1, which the look-up table selector enables through the lines enable1, enable2, ..., enableq. State determination: the n×m memory units hold v1, v2, ..., vm, from which N (Eq. 4.10) and the trellis state S = f(N) (Eq. 4.11) are computed and fed to the selector.]
Figure 4.7: Function diagram of the trellis-based SWC encoder.
The encoding operation described above can be represented by a trellis structure as
shown in Fig. 4.8. The states of the trellis diagram correspond to the encoder states de-
termined by the number of marks in the n×m memory units in the state determination
module. The labels Ti on the branches represent the look-up table for current encoding.
Generally, for each state, only one look-up table can be selected. The neighboring states
may share the same look-up table. Hence, the number of states, p, is always greater than
or equal to the number of look-up tables q.
[Figure 4.8 diagram: trellis with states S0, S1, ..., Sp-2, Sp-1 and branches labeled with the look-up tables T0, T1, ..., Tp-2, Tp-1.]
Figure 4.8: Trellis diagram of the trellis-based SWC encoder.
As a practical example, consider an encoder for a (4, 3, m) trellis-based SWC code
with a 12-bit sliding window, i.e., codeword length n = 4, input data word length k = 3,
and L = 12. With Eq. (4.9) we obtain m = 2, and the encoder state S is given as
$$ S = f(N) = \begin{cases} S_0, & N \le 4, \\ S_1, & N > 4. \end{cases} \tag{4.12} $$
Hence, we have a (4, 3, 2) trellis-based SWC encoder with p = 2 states. We construct two
(4, 3) binary codeword look-up tables, T0 and T1, as shown in Table 4.2. The left column
in Table 4.2 lists all the possible 3-bit input data words u, and the other two columns list
the corresponding 4-bit codewords for S = S0 and S = S1, respectively. The selection of
codeword look-up table depends only on the encoder state S. When S = S0 the left and
middle columns in Table 4.2 construct the codeword look-up table T0. Similarly, when S
= S1 the left and right columns in Table 4.2 construct the codeword look-up table T1.
Table 4.2: Codeword look-up tables for a trellis-based SWC code
u      v in T0 (S = S0)   v in T1 (S = S1)
000    1110               0001
001    1101               0010
010    1011               0100
011    1010               1010
100    0101               0101
101    0110               0110
110    1001               1001
111    0011               0011
As defined in Eq. (4.12), S = S0 indicates that there are fewer than 5 marks in the previous
m = 2 output 4-bit codewords; hence, codeword look-up table T0, which has the higher
average codeword weight, should be selected for the current encoding. Conversely, S =
S1 indicates that there are more than 4 marks in the previous 2 output codewords; hence,
codeword look-up table T1, which has the lower average codeword weight, should be
selected for the current encoding. Thus, we can adaptively decrease the variance of the number
of marks in any 3 consecutive codewords and, hence, in the 12-bit sliding window.
This (4, 3, 2) SWC encoder has a simple two-state trellis as shown in Fig. 4.9.
Figure 4.9: Trellis of the (4, 3, 2) SWC encoder.
We initialize v1 and v2 stored in the memory with “1010”, so that the initial value of N
equals 4 and look-up table T0 is chosen for the encoding of the first input data word. Fol-
lowing the encoding procedure, we can get all the possible combinations of the number
of marks in v0, v1, and v2 as described in Fig. 4.10.
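The encoding procedure for this (4, 3, 2) example can be sketched in a few lines. The sketch below transcribes Table 4.2, uses the state rule of Eq. (4.12), and starts from the memory contents "1010", "1010" stated above. The function and variable names are ours, and the code is only an illustration of the procedure, not the implementation used in our simulations.

```python
# Table 4.2, transcribed: list index = 3-bit input word, value = 4-bit codeword.
T0 = ["1110", "1101", "1011", "1010", "0101", "0110", "1001", "0011"]  # S = S0
T1 = ["0001", "0010", "0100", "1010", "0101", "0110", "1001", "0011"]  # S = S1


def encode(data_words, memory=("1010", "1010")):
    """Encode 3-bit words such as "011" with the (4, 3, 2) trellis-based SWC code.

    `memory` holds the two previous output codewords (v1, v2).
    """
    v1, v2 = memory
    out = []
    for u in data_words:
        n_marks = v1.count("1") + v2.count("1")  # N of Eq. (4.10)
        table = T0 if n_marks <= 4 else T1       # state rule of Eq. (4.12)
        v0 = table[int(u, 2)]
        out.append(v0)
        v1, v2 = v0, v1                          # v0 becomes v1, v1 becomes v2
    return out


if __name__ == "__main__":
    print(encode(["000", "000", "000", "000"]))
```

Feeding the same data word repeatedly shows the encoder alternating between the heavy and light tables as the number of marks in the memory crosses the threshold.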
Let W(vi) be the weight of binary data vector vi, i.e., the number of marks in vi. Figure
4.10 depicts the possible weight combinations of v0, v1, and v2 with a tree structure
comprising nodes and one-direction branches. At each node, there are three entries, from left
to right corresponding to W(v0), W(v1), and W(v2), respectively. Because the previous 2
output codewords v1 and v2 are stored in the memory, W(v1) and W(v2) are always known
and determine the encoder state S. Given the encoder state S, either T0 or T1 is selected
for current encoding. From Table 4.2 we can see that if current encoding is based on T0,
W(v0) ∈ {2, 3} for all the 8 possible input data words. On the other hand, if current
encoding is based on T1, then W(v0) ∈ {1, 2}.
[Figure 4.9 diagram: two-state trellis with states S0 and S1; branches labeled T0 and T1.]
Figure 4.10: Possible combinations of number of marks in v0, v1, and v2.
A one-direction branch in the tree structure in Fig. 4.10 leads the current encoding
operation to the next encoding operation, i.e., current data vector v2 shifts out of the memory,
current data vector v1 shifts into the memory units for the next v2, etc. Hence, all the
possible combinations of W(v0), W(v1), and W(v2) can be obtained with an exhaustive
search.
As shown in Fig. 4.10, N = [W(v1) + W(v2)] ∈ {3, 4, 5, 6}, and the number of marks
within any three consecutive output codewords, i.e., W(v0) + W(v1) + W(v2), belongs to
{5, 6, 7, 8}. Therefore, the number of marks in a 12-bit sliding window K12 ∈ {5, 6, 7, 8}.
[Figure 4.10 node entries, listed as (W(v0), W(v1), W(v2)): (3 or 2, 2, 2); (2 or 1, 3, 2); (2 or 1, 2, 3); (3 or 2, 1, 3); (3 or 2, 1, 2); (3 or 2, 3, 1); (3 or 2, 2, 1); (2 or 1, 3, 3).]
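The exhaustive search over weight combinations can be reproduced with a short sketch. It works only at the level of codeword weights, taking from Table 4.2 that T0 offers weights 2 or 3 and T1 offers weights 1 or 2, and it counts marks over three consecutive codewords (codeword-aligned windows). The function names are ours.

```python
T0_WEIGHTS = {2, 3}  # codeword weights available in T0 (state S0, N <= 4)
T1_WEIGHTS = {1, 2}  # codeword weights available in T1 (state S1, N > 4)


def allowed_weights(w1, w2):
    """Weights W(v0) permitted by Eq. (4.12) given W(v1) and W(v2)."""
    return T0_WEIGHTS if w1 + w2 <= 4 else T1_WEIGHTS


def reachable_weight_states(start=(2, 2)):
    """All reachable (W(v1), W(v2)) pairs, starting from memory "1010", "1010"."""
    seen, frontier = {start}, [start]
    while frontier:
        w1, w2 = frontier.pop()
        for w0 in allowed_weights(w1, w2):
            nxt = (w0, w1)  # v0 becomes the next v1, v1 the next v2
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def window_counts(states):
    """Possible values of W(v0) + W(v1) + W(v2) over all reachable states."""
    return {w0 + w1 + w2
            for w1, w2 in states
            for w0 in allowed_weights(w1, w2)}


states = reachable_weight_states()
print(sorted({w1 + w2 for w1, w2 in states}))  # values taken by N
print(sorted(window_counts(states)))           # marks in three consecutive codewords
```

The search visits eight (W(v1), W(v2)) states, one per node of the tree in Fig. 4.10.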
In the above example, “1000”, “0111”, and “1100” are not included in the codeword
look-up tables and, thus, it is guaranteed that K12 ∈ {5, 6, 7, 8} for the encoded data
sequence. In contrast, for an uncoded random sequence, K12 ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12}. Hence, the trellis-based SWC encoded data sequence achieves a smaller
variance in the number of marks in the sliding window, i.e., a smaller J12 = var(K12), at
the price of a (n – k)/k = 33.3% code redundancy. This example shows that, by carefully
choosing codewords in look-up tables, we can limit the possible values of KL to a smaller
range and, thus, decrease JL (KL) = var(KL) as defined in Eq. (4.8).
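To give a rough sense of the reduction, the following worked bound compares the two cases. It assumes that the uncoded bits are independent and equiprobable, so that K12 is binomial, and for the coded case it uses only the range K12 ∈ {5, 6, 7, 8} stated above together with the standard bound var(X) ≤ (b − a)²/4 for a variable confined to [a, b]. It is an illustrative estimate, not a result from our simulations.

```latex
\operatorname{var}\!\left(K_{12}^{\text{uncoded}}\right)
  = 12 \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = 3,
\qquad
\operatorname{var}\!\left(K_{12}^{\text{coded}}\right)
  \le \frac{(8 - 5)^2}{4} = 2.25 .
```

The actual coded variance depends on the input statistics and may be well below this worst-case bound.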
A simple way to decode the trellis-based SWC code is a reverse procedure of encoding
based on the codeword look-up table. For the possible n-tuple words not included in the
codeword table, we can use the minimum Hamming distance criterion to assign the clos-
est codeword. A more sophisticated decoding algorithm, for example the Viterbi algo-
rithm [78], may be designed for the trellis-based SWC codes because of their trellis-based
code nature.
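The simple table-based decoding described above can be sketched for the (4, 3, 2) code of Table 4.2 as follows. The sketch relies on an observation about that particular table: no codeword is assigned to two different data words across T0 and T1, so a single state-independent inverse table suffices. The tie-breaking rule for the minimum Hamming distance fallback is an arbitrary choice of ours.

```python
# Table 4.2, transcribed: list index = 3-bit input word, value = 4-bit codeword.
T0 = ["1110", "1101", "1011", "1010", "0101", "0110", "1001", "0011"]
T1 = ["0001", "0010", "0100", "1010", "0101", "0110", "1001", "0011"]

# Build the inverse look-up table over both tables.
INVERSE = {}
for table in (T0, T1):
    for index, codeword in enumerate(table):
        INVERSE[codeword] = format(index, "03b")


def hamming(a, b):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_word(received):
    """Decode one 4-bit word; fall back to the nearest valid codeword."""
    if received in INVERSE:
        return INVERSE[received]
    # Ties are broken by sorted codeword order (an arbitrary choice for this sketch).
    nearest = min(sorted(INVERSE), key=lambda c: hamming(c, received))
    return INVERSE[nearest]


if __name__ == "__main__":
    print([decode_word(w) for w in ["1110", "0001", "1010", "1111"]])
```

A decoder that tracks the trellis state could instead restrict the nearest-codeword search to the table that was active at the encoder.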
4.3.3 Comparison of the block and trellis-based SWC codes
The advantage of the block SWC codes is the simple structure. Once the codeword
look-up table is obtained, the encoding and decoding can be simply implemented with a
high-speed memory chip, as shown in Fig. 4.6. The trellis-based SWC encoder, however,
as shown in Fig. 4.7, requires more complicated logic in the encoding and decoding
procedures.
On the other hand, block SWC codes use an exhaustive search algorithm in construct-
ing the codeword look-up table, and it is desirable to increase the block length to achieve
better code performance without increasing the code rate. For very long codewords, how-
ever, the algorithm may be too slow to be practical. In contrast, the trellis-based SWC
code can improve code performance by increasing the memory depth m in the state de-
termination module (Fig. 4.7) without increasing the length of the codeword or the code
rate. We can see, therefore, that the important consideration in choosing block or trellis-
based SWC codes is the tradeoff in performance and implementation complexity.
4.4 Concatenated RS/SWC coding scheme
Optical fiber communications require very low BER (< $10^{-11}$), but with only SWC
codes this requirement may not be satisfied. Moreover, as discussed in the previous sec-
tions, the basic idea of SWC line-codes is to prevent SSC-induced errors during the soli-
ton propagation in optical fiber instead of correcting errors in decoding at the receiver.
Therefore, the redundancy added to the original data sequence in SWC encoding is util-
ized to reshape the transmitted data pattern rather than to ensure an effective error-
correction decoding. Decoding for both the block and trellis-based SWC codes is simply
an inverse procedure of the look-up table encoding. Thus, SWC code decoders may in-
troduce decoding bit errors by decoding the received codeword with few bit errors to a
wrong data word with more bit errors compared to the original data word. Hence, to
achieve very low decoded BER, we propose a concatenated coding scheme, the concate-
nated RS/SWC codes.
Forney [79] shows that a concatenated coding system with a powerful outer code can
perform reasonably well when its inner decoder is operated with a probability of error in
a range between $10^{-2}$ and $10^{-3}$. Thus, by concatenating the SWC code with a RS code, an
efficient coding scheme can be achieved, as we show schematically in Fig. 4.11. As the
inner code, the SWC code can prevent most of the bit errors caused by SSC-induced
timing jitter and, hence, decrease the total BER to the range between $10^{-2}$ and $10^{-3}$ or
lower. Then, with an outer RS code, very low BER can be achieved.
In Fig. 4.11, Nr and Mr are the codeword length and data-word length in symbols of the
outer RS code, respectively. Ns and Ms are the codeword length and data-word length in
bits of the inner SWC code, respectively. If we choose RS code symbols with the same
length as the SWC data words, i.e., Ms = log2(1 + Nr), then even though the SWC code
may introduce decoding bit errors and transform single-bit errors into multi-bit errors in
a RS code symbol, the number of symbol errors does not increase after the SWC decoding.
In other words, from the view of the RS decoder, there are no extra decoding symbol
errors introduced by the SWC decoder. Hence, the decoding bit errors generated by the
SWC decoder do not affect the performance of the concatenated RS/SWC code as a
whole.
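The symbol-alignment argument can be illustrated with a small sketch: when extra bit errors stay inside one RS symbol, the bit-error count grows but the symbol-error count does not. The bit patterns below are made-up examples, and the helper names are ours.

```python
import math


def bit_errors(sent_bits, received_bits):
    """Number of differing bit positions."""
    return sum(a != b for a, b in zip(sent_bits, received_bits))


def symbol_errors(sent_bits, received_bits, bits_per_symbol=8):
    """Number of symbols that differ in at least one bit position."""
    assert len(sent_bits) == len(received_bits)
    assert len(sent_bits) % bits_per_symbol == 0
    return sum(
        sent_bits[i:i + bits_per_symbol] != received_bits[i:i + bits_per_symbol]
        for i in range(0, len(sent_bits), bits_per_symbol)
    )


# RS (255, 239): N_r = 255, so M_s = log2(1 + N_r) = 8 bits per symbol.
bits_per_symbol = int(math.log2(1 + 255))

sent = "00000000" + "11110000"              # two 8-bit symbols (hypothetical data)
one_bit_wrong = "00000000" + "11110001"     # a single bit error in the second symbol
three_bits_wrong = "00000000" + "11111101"  # more bit errors, same symbol

for received in (one_bit_wrong, three_bits_wrong):
    print(bit_errors(sent, received), symbol_errors(sent, received, bits_per_symbol))
```

Both received sequences contain one symbol error, which is all the RS decoder sees.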
As an example, Table 4.3 lists the BERs and number of symbol errors in RS code-
words observed in one of our 4-channel WDM transmission simulations at different de-
coding stages of the received RS (255, 239) and concatenated RS (255, 239)/SWC (10, 8)
encoded data sequence. The 2nd row shows the SSC-induced BER and number of symbol
errors in 5 RS codewords when the RS (255, 239) code is used alone without concatena-
tion with a SWC code. The RS (255, 239) can correct up to 8 symbol errors in each
codeword, but 2 of the 5 RS codewords in the simulation have more than 8 symbol errors
and, hence, cannot be corrected by the RS decoder as shown in the 6th row in Table 4.3.
Figure 4.11: Concatenated RS/SWC coding scheme.
By comparing the 2nd and the 3rd rows in Table 4.3, we can see that the SWC code can
reduce the SSC-induced errors during transmission, thus decreasing the received BER
from $1.59\times10^{-2}$ to $1.66\times10^{-6}$. As a line-code, the SWC code avoids errors during the
transmission rather than corrects errors in the decoder. Moreover, as discussed at the start
of this section, the SWC code may introduce extra decoding bit errors. As shown in the
4th row, the BER is increased from $1.66\times10^{-6}$ to $8.31\times10^{-6}$ by the SWC decoding.
However, because all the extra decoding bit errors are in the corrupted symbol, the number of
symbol errors does not increase after SWC decoding. Thus, as shown in the 5th row in
Table 4.3, after the RS decoding, an error-free transmission is achieved with the
concatenated RS (255, 239)/SWC (10, 8) code in this simulation.
Table 4.3: Bit errors and symbol errors induced by soliton-soliton collision
Data sequence                             BER       Number of symbol errors in RS codewords
Received RS encoded data                  1.59e–2   10  5  9  5  7
Received RS/SWC encoded data              1.66e–6    0  0  0  1  0
RS/SWC encoded data after SWC decoding    8.31e–6    0  0  0  1  0
RS/SWC encoded data after RS decoding     0          0  0  0  0  0
RS encoded data after RS decoding         8.40e–3   10  0  9  0  0
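The symbol-error rows of Table 4.3 can be related to each other with a simple threshold rule. The sketch below assumes bounded-distance RS decoding that corrects a codeword when it has at most t = (n − k)/2 symbol errors and otherwise leaves it unchanged. This "leave unchanged" behavior is our simplifying assumption, chosen because it matches the last row of the table.

```python
def after_rs_decoding(symbol_errors_per_codeword, n=255, k=239):
    """Symbol errors left per codeword after idealized bounded-distance decoding."""
    t = (n - k) // 2  # correctable symbol errors per codeword (8 for RS (255, 239))
    return [0 if e <= t else e for e in symbol_errors_per_codeword]


rs_only = [10, 5, 9, 5, 7]  # "Received RS encoded data" row of Table 4.3
rs_swc = [0, 0, 0, 1, 0]    # "RS/SWC encoded data after SWC decoding" row

print(after_rs_decoding(rs_only))
print(after_rs_decoding(rs_swc))
```

Under this rule the two codewords with 10 and 9 symbol errors remain uncorrected, while the single symbol error left by the SWC code is removed.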
The above comparison shows that the concatenated RS (255, 239)/SWC (10, 8) code
outperforms the RS (255, 239) code in reducing and correcting SSC-induced errors. We
note that the concatenated RS (255, 239)/SWC (10, 8) code has a code rate of about
0.75, while the RS (255, 239) code alone has a higher code rate of about 0.94. As dis-
cussed in Sec. 4.1, however, increasing the redundancy of RS codes without line-coding
does not always help to improve code performance in correcting SSC-induced errors. In
fact, it will be shown in the next section that the RS (255, 239)/SWC (10, 8) code also
significantly outperforms a RS code with similar effective code rate –– the RS (255, 191)
code with a code rate of about 0.75.
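The code rates quoted in this comparison follow directly from the code parameters. The short sketch below computes them exactly with rational arithmetic; the helper name `code_rate` is ours.

```python
from fractions import Fraction


def code_rate(n, k):
    """Code rate k/n of an (n, k) code."""
    return Fraction(k, n)


rs_239 = code_rate(255, 239)
swc = code_rate(10, 8)
concatenated = rs_239 * swc  # rates multiply in a concatenated scheme
rs_191 = code_rate(255, 191)

for name, rate in [("RS (255, 239)", rs_239),
                   ("RS (255, 239)/SWC (10, 8)", concatenated),
                   ("RS (255, 191)", rs_191)]:
    print(f"{name}: {float(rate):.4f}")
```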
The concatenated RS/convolutional code is a strong concatenated FEC coding scheme
that has been proposed for long-haul submarine optical fiber communications [20], where
ASE noise is dominant. We can show that this concatenated FEC coding scheme, how-
ever, is not as effective as our proposed concatenated RS/SWC code for correcting SSC-
induced errors. As shown in Fig. 2.9 on page 43, the SSC-induced bit errors have a burst
error pattern. Systems with higher bit rates have errors with a longer burst length and a smaller
burst spacing. Because the Viterbi decoder for the convolutional code performs better for
memoryless channels than for channels with memory [78], [80], [81], the burst nature of
the SSC-induced errors degrades the performance of the concatenated RS/convolutional
code. Although interleaving can be used to convert convolutional codes for correcting
random errors into burst-error-correcting codes, it will introduce transmission delay and
requires more complex hardware.
From the above discussions, we can see that both the RS codes and the concatenated
RS/convolutional codes have limited error correction abilities for combating the SSC-
induced bit errors. The proposed RS/SWC code first avoids most SSC-induced errors by
taking advantage of the special SWC encoded data pattern and then corrects the rest of
the errors with a high rate outer RS code. Hence, in systems with high SSC-induced tim-
ing jitter, using an RS code or a concatenated RS/convolutional code is not as effective
and efficient as is using the proposed concatenated RS/SWC codes in achieving low
BER.
4.5 Performance and comparisons via simulations
We have performed two sets of simulations to study the performance of our proposed
coding scheme. One set is based on a simplified model of soliton-soliton collisions
(SSCs) given by Eqs. (2.18)–(2.20) that addresses only complete collisions. The other set
is a full simulation of SSC-induced timing jitter in a dispersion-managed optical fiber
system using the Photonic Transmission Design Suite (PTDS) simulation environment
[82]. In the PTDS simulation environment we can construct WDM optical fiber transmis-
sion systems with configurable channel spacing, transmission distance, transmission data
rate, dispersion-management scheme, optical amplifier gain and spacing, optical pulse
shape, etc., and simulate the optical pulse propagation in a configurable step size in terms
of propagation distance.
4.5.1 Simulations based on simplified SSC model
Based on the simplified SSC model, four sets of simulations were performed. These
include: (1) calculating the reduction of SSC-induced timing jitter with the SWC code
alone; (2) comparing the performance of two RS codes, a concatenated RS/convolutional
code, and a concatenated RS/SWC code in mitigating timing-jitter-induced errors in
WDM systems; (3) determining the characteristic of the SSC-induced bit errors; and (4)
comparing the performance of the SWC codes constructed with the fragmentation-first
algorithm and the weight-first algorithm. The results of these simulations are plotted in
Figs. 4.12–4.14.
Figure 4.12 plots the time-shift distributions of the uncoded and the SWC (10, 8) en-
coded data sequence in a WDM system having a data rate F = 14 Gbps, transmission
distance Z = 20 Mm, fiber dispersion D = 0.25 ps/nm/km, and channel spacing = 0.8 nm.
To make the figure easy to read, not all the probability mass function (pmf) points of the
time shifts of solitons were plotted. As shown in Fig. 4.12, the variance of the time shifts
of the received data sequence, and the corresponding SSC-induced BER, is effectively
decreased by using the SWC code. SSC-induced BER is decreased from a floor of $10^{-2}$ to
a floor of $10^{-6}$.
Figure 4.12: Reduction of the SSC-induced timing jitter with a SWC (10, 8) code. Circles: pmf in uncoded random data sequence. Triangles: pmf in SWC coded data sequence. Dotted curve: Gaussian distribution approximating the pmf in uncoded case. Solid curve: Gaussian distribution approximating the pmf in SWC-coded case. Dotted line pair: receiving-window for uncoded signal. Solid line pair: receiving-window for the SWC-coded signal.
[Plot axes: time shift of soliton (ps), from –50 to 50, versus probability, from 0 to 0.05.]
Figures 4.13a and 4.13b plot the output BERs of the binary data streams without cod-
ing, with RS coding, with concatenated RS/convolutional coding, and with concatenated
RS/SWC coding transmitted through the WDM soliton system. The BERs of these data
streams are evaluated for different transmission data rates and channel spacing values. In
Fig. 4.13a, we can see that the highest error-free (BER < $10^{-9}$) bit rate can almost be doubled
with the concatenated RS/SWC code compared to the original uncoded system. Figure
4.13b shows that the channel spacing for BER < $10^{-11}$ can be decreased by half with
the RS/SWC code. These results show that the SWC codes can effectively decrease the
SSC-induced timing jitter in WDM soliton systems, and they significantly enhance the
capacity in terms of data rate and channel spacing.
Comparing the performances of different coding schemes plotted in Fig. 4.13a, we can
see that with the same redundancy the RS (255, 239)/SWC (10,8) code performs better
than the RS (255, 191) code. This result agrees with the discussion in Sec. 4.4 about the
advantage of the RS/SWC codes over the conventional FEC codes. We note that the per-
formance of these coding schemes becomes worse rather than better as the code redun-
dancy increases. This is because, as discussed in Sec. 4.1, the probability of SSC-induced
timing jitter errors is very sensitive to the width of the soliton receiving-window. To keep
a constant data rate, a higher code redundancy requires a higher signaling rate and, thus, a
narrower signal receiving-window. Therefore, the increase of the timing jitter errors in-
duced by increasing code redundancy may be faster than the improvement of code per-
formance.
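The trade-off between redundancy and receiving-window width can be made concrete with a little arithmetic. The sketch below assumes a fixed payload data rate of 10 Gb/s (an illustrative value) and a receiving window equal to 0.8 of the bit period at the signaling rate; applying that window fraction to the coded signaling rate is our assumption for illustration.

```python
def signaling_rate_gbps(data_rate_gbps, code_rate):
    """Line rate needed to carry `data_rate_gbps` of payload at the given code rate."""
    return data_rate_gbps / code_rate


def receiving_window_ps(line_rate_gbps, fraction=0.8):
    """Receiving-window width in ps, assuming T_r = fraction / F with F the line rate."""
    return fraction / line_rate_gbps * 1000.0


DATA_RATE = 10.0  # Gb/s, illustrative
schemes = {
    "uncoded": 1.0,
    "RS (255, 239)": 239 / 255,
    "RS (255, 191)": 191 / 255,
    "RS (255, 239)/SWC (10, 8)": (239 / 255) * (8 / 10),
}
for name, rate in schemes.items():
    line_rate = signaling_rate_gbps(DATA_RATE, rate)
    window = receiving_window_ps(line_rate)
    print(f"{name}: line rate {line_rate:.2f} Gb/s, window {window:.1f} ps")
```

Lower code rates shrink the window in direct proportion, which is why added redundancy alone can increase timing-jitter errors.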
Figure 4.13: Comparison of the code performances in enhancing (a) transmission bit rate and (b) channel spacing. Solid: code performances in middle channel. Dotted: code performances in outmost channel. Triangle: concatenated RS (255, 239)/convolutional (2, 1, 7). Plus: RS (255, 191). Circle: uncoded random data sequence. Square: RS (255, 239). Star: concatenated RS (255, 239)/SWC (10, 8).
[Plot axes: (a) F (Gb/s), from 5 to 20, versus log10(BER), from –10 to 0; (b) channel spacing (nm), from 0 to 1, versus log10(BER), from –10 to 0.]
We simulated a 4-channel 20-Mm system with the simplified soliton-soliton collision
model. We set the soliton receiving-window duration Tr = 0.8/F. Figure 4.14 plots the
distributions of the number of marks inside the sliding window of two SWC encoded data
sequences generated by the two different algorithms, the fragmentation-first (FF) and
weight-first (WF) algorithms. In Fig. 4.14a, the sliding window length is much shorter
than the codeword length, hence the FF12B14B encoded data sequence achieves a
smaller variance in the number of collisions than does the WF12B14B encoded data se-
quence. On the contrary, as observed in Fig. 4.14b, the WF12B14B code performs better
for a 14-bit sliding window that is as long as the codeword. These results are consistent
with the discussion in Sec. 4.3 about the performances of the FF and WF algorithms.
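The distributions compared in Fig. 4.14 are empirical pmfs of the number of marks in a window that slides one bit at a time over the encoded sequence. A minimal sketch of that measurement is given below. The function names and the toy input sequences are ours, and the code is not the one used to produce the figure.

```python
from collections import Counter


def window_mark_pmf(bits, window_len):
    """Empirical pmf of the number of marks over all length-`window_len` windows."""
    if window_len > len(bits):
        raise ValueError("window is longer than the sequence")
    count = sum(bits[:window_len])
    counts = Counter([count])
    for i in range(window_len, len(bits)):
        count += bits[i] - bits[i - window_len]  # slide the window by one bit
        counts[count] += 1
    total = len(bits) - window_len + 1
    return {k: c / total for k, c in sorted(counts.items())}


def variance(pmf):
    """Variance of the number of marks, i.e., var(K_L) for the measured pmf."""
    mean = sum(k * p for k, p in pmf.items())
    return sum(p * (k - mean) ** 2 for k, p in pmf.items())


if __name__ == "__main__":
    balanced = [1, 0, 1, 0, 1, 0, 1, 0]  # toy sequence: every 4-bit window has 2 marks
    pmf = window_mark_pmf(balanced, 4)
    print(pmf, variance(pmf))
```

Applying the same measurement to sequences produced by the FF and WF constructions would allow their variances to be compared for any chosen window length.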
Figure 4.14: Probability mass function (pmf) of the number of marks in the sliding window on the data sequence encoded with the fragmentation-first (star) and the weight-first (triangle) algorithms for codeword length = 14 bits, and (a) sliding window length = 4 bits and (b) sliding window length = 14 bits. The solid curves in the figures represent the corresponding normal distributions.
[Plot axes, both panels: number of marks in sliding window versus probability.]
For a given optical fiber transmission line, the sliding window length is determined by
the maximum number of collisions one soliton may experience, and the SWC codeword
length depends on the data frame structure and other system design requirements. The
simulation results in Fig. 4.14 show that the choice of the FF or WF algorithms for the
SWC code construction should be made after the sliding window length and the SWC
codeword length have been determined.
4.5.2 Full simulations using PTDS
A full simulation is required to study the performance of our coding scheme in disper-
sion-managed soliton systems, which is very time consuming given the current state of
the art in optical system simulations. Since our ability to validate our coding scheme
through full simulations is therefore limited, we present simulation results for some se-
lected data patterns –– “desirable”, “undesirable”, and random data patterns –– to demon-
strate the effectiveness of our coding scheme. Here, we define a “desirable” pattern as
one that satisfies the SWC and an “undesirable” pattern as one that does not.
In the full simulations, independent SWC encoded data sequences are transmitted
along 8 WDM channels. Both SSC-induced errors and ASE errors are simulated. The
128-bit soliton trains in the 1st channel (outermost channel) and the 4th channel (middle
channel) are recorded after every 200 km. The system parameters are: 12 GHz data rate,
100 GHz channel spacing, Gaussian pulses with tFWHM = 14 ps, and a symmetrical disper-
sion map with D1 = 2.34 ps/nm-km, D2 = −2.19 ps/nm-km. Each optical fiber segment is
100 km long, and lumped amplifiers are placed every 50 km.
Figure 4.15 plots the SSC-induced timing jitter versus transmission length for “desir-
able,” “undesirable,” and random input data patterns. The timing jitter curves for random
input data are obtained by using Richter and Grigoryan’s approach [83] that has been
shown to have good agreement with full simulation results. The results show that the se-
quence with “desirable” data pattern suffers much smaller SSC-induced timing jitter
compared to the sequence with “undesirable” data pattern in both the outmost and middle
channels. The eye diagrams of the received signals with undesirable and desirable pat-
terns are plotted in Fig. 4.16. We can see that the eyes of the received signals with desir-
able data pattern are more open than the eyes of the signals with undesirable data pattern.
Hence, the full simulation results show that the basic idea of the SWC code is quite ef-
fective and is a promising technique for dispersion-managed WDM soliton systems as
well.
4.6 Summary
This chapter introduced a new line-coding technique, the SWC code, that can effectively
decrease SSC-induced timing jitter in WDM soliton systems. Two types of SWC
codes, the block and the trellis-based SWC codes, were developed. A concatenated
RS/SWC coding scheme was developed that was shown by simulations to enhance the
WDM system capacity in both data rate and channel spacing.
Figure 4.15: SSC-induced timing jitter of desirable (square), random (no sign), and undesirable (circle) data patterns. Solid: timing jitter in middle channel. Dotted: timing jitter in the outmost channel.
[Plot axes: fiber length (Mm), from 0 to 10, versus timing jitter (ps), from 0 to 5.]
Figure 4.16: Eye diagrams of the received signals with undesirable (upper) and desirable (lower) patterns.
[Plot axes, both panels: time (ps), from 0 to 150, versus normalized amplitude, from 0 to 1.]
We studied the performance of RS codes for SSC-induced errors and showed that there
is an optimal redundancy for RS codes in the sense of achieving the largest error correc-
tion margin. Increasing code redundancy can enhance the error correction capability of
RS codes, but on the other hand it also increases SSC-induced errors that are very sensi-
tive to the system signaling bit rate. Hence, there is an optimal RS code redundancy that
gives the best code performance in correcting SSC-induced errors. More redundancy
(stronger error-correction capacity) for RS codes does not always imply better perform-
ance in correcting SSC-induced errors. We showed the advantages of the proposed con-
catenated RS/SWC coding scheme over the RS codes and the concatenated
RS/convolutional codes with both analysis and simulation results. Because of the simple
structure of the proposed SWC codes, this concatenated RS/SWC coding scheme can be
implemented with ASICs. Evaluation with a full simulation of the WDM DMS system
demonstrated that the proposed SWC line-code is a very promising technique for disper-
sion-managed-fiber systems.
Chapter 5
Summary and conclusions
5.1 Summary
In this dissertation, we studied the effectiveness of FEC codes for correcting ASE-
induced errors and a SWC line-code for mitigating SSC-induced errors in optical fiber
communication systems.
In the Introduction, we described the major sources of physical impairment in optical
fiber communications systems. We pointed out that ASE noise from optical amplifiers
causes non-Gaussian asymmetric pdfs of marks and spaces, and the nonlinear inter-
channel interference in WDM systems causes highly pattern-dependent errors. These two
physical effects are among the main sources of errors in optical fiber transmissions and
were our major concerns in this dissertation. We then surveyed the literature on previous
FEC and line-coding studies and applications in optical fiber communications. We
pointed out that previous work is mostly based on standard FEC codes and line-coding
schemes, and the channel models used in the theoretical studies include the binary asym-
metric channel (BAC), the AWGN, and the asymmetric Gaussian, all of which do not
accurately describe the optical fiber transmission output. This observation motivated the
goal of this dissertation –– to analyze and design FEC codes and line-coding schemes in
optical fiber communication systems by incorporating the particular physical characteris-
tics and mechanisms of these systems.
In Chapter 2, we discussed the statistics of the ASE noise, and constructed the corre-
sponding models for optical fiber channels with dominant ASE noise. In the hard-
decision case, we introduced the chi-square binary asymmetric channel (BAC), Gaussian
BAC, and Gaussian binary symmetric channel (BSC) models for optical fiber channels.
In the soft-decision case, we focused on the asymmetric chi-square and the asymmetric
Gaussian models. We also described the physical mechanism of soliton-soliton collisions
(SSC) in WDM systems, and constructed a simplified model for the SSC-induced timing
jitter. With this model we showed the data pattern dependence of the SSC-induced timing
jitter that was the motivation behind a line-coding scheme, the SWC code, for mitigating
SSC-induced timing jitter and the corresponding SSC-induced errors.
Chapters 3 and 4 are the two major chapters, where our research is discussed and re-
sults are presented. Specifically, in Chapter 3, we studied the effects of one of the main
sources of errors in systems with optical amplifiers, the statistical model of ASE noise, on
both the performance evaluation and performance of FEC codes. We performed a three-
level study regarding a lower bound (the Shannon limit) on general FEC code perform-
ance, an upper bound on linear code performance, and improvement of turbo code per-
formance.
In the study of the Shannon limit for optical fiber channels with dominant ASE noise,
we showed that the use of simpler models, the Gaussian BAC and Gaussian BSC, to
calculate the uncoded BERs is a convenient but inappropriate way to evaluate the Shannon
limit. Both the Gaussian approximation and the BSC approximation of optical fiber
channels with ASE noise mis-estimate, compared to the chi-square BAC model, the po-
tential of error correction in optical fiber transmission systems. We showed that the
Gaussian BAC model gives acceptable estimates of the Shannon limit at code rates
higher than 0.8, but it underestimates (predicts higher required Q) the lower bound by
about 0.4 dB in $Q^2$ at code rate 0.5, and the problem tends to be more severe at lower
code rates. The Gaussian BSC model overestimates (predicts lower required Q) the lower
bound on FEC code performance by about 0.4 to 0.5 dB at all code rates ranging from 0.5
to 0.9. Thus, the maximum coding gain achievable with the best FEC code, is overesti-
mated at low code rates by the Gaussian BAC model and always underestimated by the
Gaussian BSC model.
In the study of upper bounds on linear code performance in optical fiber communica-
tions, we derived a general upper bound on the pairwise error probability, Pd, in asym-
metric channels. We evaluated the corresponding bound on Pd in optical fiber channels
with dominant ASE noise. With two example codes, a turbo product code (TPC) and a
turbo convolutional code (TCC), we investigated the accuracy of the ASE noise Gaussian
approximation in evaluating the upper bound on code performance. We showed that the
Gaussian approximation model overestimates (predicts lower BER) at low Q and under-
estimates at high Q the upper bounds on both the TPC and TCC code performance in the
optical fiber channel. The resulting performance bounds also suggest that, with similar
code rate and block length, the TPC is a better choice than the TCC in optical fiber chan-
nels requiring very low BER. The derived bound is a useful tool in estimating code per-
formance in optical fiber communications: the union bound is in general tight at very low
BERs, which is the desired operating range in optical fiber communications, and where
simulation is impractical for evaluating code performance.
In the study of the effect of ASE noise models on the turbo code decoder performance,
we showed that the turbo code decoder design based on the chi-square ASE noise distri-
bution can achieve more than 2 dB coding gain as compared to the design based on Gaus-
sian approximations. We also showed that if the Gaussian BSC model is used in the
puncturing operation to achieve higher code rates, then the likelihood ratio of the re-
ceived signal at the decision threshold, which is supposed to be 1, increases exponentially
as a function of the Q factor. This observation explains the simulation result that the rate-
3/4 punctured-TC based on the Gaussian BSC model fails to improve the system BER
compared to the uncoded data at the same signaling rate as the encoded data.
In Chapter 4, based on the physical mechanism of soliton-soliton collisions (SSC) in
WDM soliton transmission systems, we developed a new line-coding scheme, the SWC
code, to reduce the SSC-induced timing jitter and, thus, bit errors. Two types of SWC
codes, the block- and trellis-based SWC codes, are introduced. A concatenated RS/SWC
coding scheme is proposed, which is shown by simulations to enhance bit rate and reduce
channel spacing in WDM systems.
We studied the performance of RS codes for SSC-induced errors and showed that there
is an optimal RS code redundancy that achieves the best code performance in correcting
SSC-induced errors. We also showed that more redundancy, which implies a stronger
error-correction capability, does not always give RS codes better performance in
correcting SSC-induced errors. We showed via analysis and simulation
the advantages of the proposed concatenated RS/SWC coding scheme over
standalone RS codes and concatenated RS/convolutional codes. With a full simulation
incorporating both complete collisions and partial collisions in a 10 Mm 8-channel 10
Gbps WDM DMS system, we showed that the data sequence with the desired pattern ac-
cording to the SWC significantly reduces SSC-induced timing jitter. Hence, the SWC
line-coding is a very promising technique for dispersion-managed-fiber systems.
5.2 Conclusions
Our research on FEC and line-coding techniques in optical fiber communications
systems has focused on analyzing code performance and designing coding schemes based on
the understanding of the particular physical characteristics of the optical fiber channels
[25]–[29]. Based on all the calculation and simulation results described in the previous
chapters, we conclude this dissertation by answering the two questions posed as the
motivation of our research.
1. Do the non-Gaussian asymmetric statistics of the ASE noise, compared to the
Gaussian symmetric approximations, cause a sufficient difference in the FEC performance
to be worth the effort of including more accurate noise statistics in the analysis
and design of FEC codes?
2. Is there sufficient benefit worth the effort in using line-coding approaches to mitigate
the nonlinear inter-channel interference problem?
The answer is yes to both questions.
Specifically, a more accurate determination of the Shannon limit for optical fiber
channels dominated by ASE noise is possible with the chi-square BAC model. Although
the evaluation of the Shannon limit for binary-in binary-out channels involves only the
two transition probabilities of the received signals, rather than the complete pdfs, the
resulting Shannon limits based on the chi-square BAC and Gaussian BSC models
showed a 0.4–0.5 dB difference in the Q factor. From the viewpoint of the Shannon
limit, a 0.5 dB difference in the Q factor is sufficient motivation for a continued search
for efficient FEC codes for optical fiber transmission systems.
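
The dependence of the binary-in binary-out Shannon limit on only two transition probabilities can be illustrated with a minimal sketch. It is not the computation performed in the earlier chapters: the function names and the numerical transition probabilities are hypothetical, and the capacity is found by a simple grid search over the input distribution rather than in closed form. Setting the two probabilities equal recovers the BSC as a special case.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def mutual_information(pi1, p01, p10):
    """I(X;Y) in bits for a binary asymmetric channel.

    pi1 = P(X=1), p01 = P(Y=1 | X=0), p10 = P(Y=0 | X=1).
    """
    py1 = (1.0 - pi1) * p01 + pi1 * (1.0 - p10)
    return h2(py1) - (1.0 - pi1) * h2(p01) - pi1 * h2(p10)

def bac_capacity(p01, p10, steps=20000):
    """Capacity by grid search over P(X=1); I(X;Y) is concave in pi1."""
    return max(mutual_information(k / steps, p01, p10)
               for k in range(steps + 1))

# Hypothetical transition probabilities, chosen only for illustration.
print(bac_capacity(0.02, 0.005))   # asymmetric channel (BAC)
print(bac_capacity(0.0125, 0.0125))  # symmetric channel (BSC)
```

In such a sketch, the Shannon limit for a given code rate would be the Q factor at which the capacity falls to that rate; mapping the Q factor to the two transition probabilities requires the channel model (chi-square or Gaussian) and is not shown here.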
As expected, when the complete pdfs of optical signals with ASE noise are incorpo-
rated in the upper performance bound calculation for linear codes with soft-decision de-
coding, a significant difference between the results based on the chi-square and Gaussian
models shows up. More than 2 dB of coding gain in the performance bound for the turbo
code at 10⁻¹² BER results when the more accurate chi-square model is used. Because the
upper bound derived is based on the union bound that is tight in general at low BER, the
resulting performance bounds can be confidently used for the performance estimate in
the very low BER range required by optical fiber communications.
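
The general shape of a union bound can be sketched as follows. This is only an illustration under loudly stated assumptions: it uses the textbook pairwise error probability for antipodal signaling in additive white Gaussian noise, not the chi-square pairwise error probability on which the bound of Chapter 3 is based, and it uses the (7,4) Hamming code only because its weight distribution is small and well known. The function names are hypothetical.

```python
import math

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Weight distribution A_d of the (7,4) Hamming code (nonzero codewords only).
HAMMING_7_4_WEIGHTS = {3: 7, 4: 7, 7: 1}

def union_bound_word_error(weights, rate, ebn0_db):
    """Union bound on the word error probability with soft-decision ML decoding.

    ASSUMPTION: antipodal signaling in AWGN, so the pairwise error probability
    between codewords at Hamming distance d is Q(sqrt(2 * d * rate * Eb/N0)).
    """
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return sum(a_d * q_func(math.sqrt(2.0 * d * rate * ebn0))
               for d, a_d in weights.items())

for snr_db in (2.0, 6.0, 10.0):
    print(snr_db, union_bound_word_error(HAMMING_7_4_WEIGHTS, 4 / 7, snr_db))
```

As the signal-to-noise ratio grows, the minimum-distance term dominates the sum, which is why a bound of this form becomes tight in the low-BER range.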
Our point is further reinforced by the simulation results for the turbo code performance
in ASE-noise-dominant optical fiber channels. The soft-decision iterative MAP decoding
algorithm used in turbo code decoding takes full advantage of the statistical information
provided by the soft-decision signals. The high sensitivity of turbo code performance to
the accuracy of the noisy signal distribution causes a 1.5–2 dB performance degradation
at 10⁻⁶ BER when the Gaussian approximation is used for a chi-square distributed channel.
The performance degradation becomes more severe at lower BER. Clearly, more than 2
dB of extra coding gain would be very useful to a designer of optical fiber communica-
tion systems.
We see a clear trend toward FEC technology advancement in the three generations of
FEC codes applied in optical fiber communications systems. From Hamming codes and
RS codes with algebraic decoding, to concatenated FEC codes with soft decoding and,
further, to turbo product code with soft and iterative decoding, each new generation in
FEC is closer to the Shannon limit. As mentioned in the Introduction, this trend is fun-
damentally guided by the technique of including more and more noise statistical infor-
mation into the FEC code design. However, to make continued progress in the analysis
and design of FEC codes, we should use more accurate channel models for optical fiber
channels that are different from the conventional BSC, AWGN, or Gaussian channels.
FEC codes correct errors after they have occurred in transmission. By contrast, SWC
line-codes, by reshaping the data pattern, prevent SSC-induced errors from occurring.
Hence, in concept, the SWC code is more efficient than the FEC code in mitigating the
particular data-pattern dependent errors. A decrease in the error floor from 10⁻² to 10⁻⁶
using an SWC code was demonstrated in a 4-channel WDM system simulation. Although it
may not always be possible to achieve a very low BER such as 10⁻¹² with only the SWC
code in a highly nonlinear WDM system, the SWC code can be a very efficient component
code in a concatenated coding scheme. Our simulation results showed that for BERs
< 10⁻⁹, compared to the original uncoded system, the highest attainable data rate can
almost be doubled and the smallest channel spacing can be halved with the concatenated
RS/SWC code. Hence, it is worth further effort to
study line-coding schemes to mitigate nonlinearity-based data-pattern dependent errors.
We saw the application of line-coding in counteracting the non-flat laser FM response
in coherent optical fiber communication systems. We believe that a new and promising
application of line-coding in WDM optical fiber communication systems is the mitigation
of nonlinear inter-channel interference.
The dissertation results demonstrate, therefore, that more accurate FEC code perform-
ance evaluation, significant improvement of FEC code performance, and highly efficient
and effective line-coding schemes can be achieved when the physical characteristics of
the optical fiber transmission line are taken into account.
5.3 Suggestions for future research
We believe that we have just taken a first step in exploring an important research area
addressing the particular physical characteristics and impairments of optical fiber
communications systems when evaluating and designing coding techniques for
improving system performance. The results of our research suggest some important topics
for future research.
5.3.1 Further investigation of the noise statistics in optical fiber communication systems
Our FEC research is rooted in accurate noise statistics in optical fiber transmission
systems. As shown in our calculations and simulation results, an accurate channel model
is critical for achieving the best possible FEC code performance, especially for those FEC
codes using soft-decision and iterative decoding. We used the chi-square distribution for
the ASE noise statistics in our studies; however, it is still an approximation. In the
derivation of the chi-square model of ASE noise, only the amplitude fluctuation was taken
into account [48], [49], but ASE-induced timing jitter is also an important source of
errors. Moreover, the chi-square model does not account for transmission effects.
Holzlöhner et al. [50] have introduced an efficient simulation algorithm that includes
the ASE-induced timing jitter in the Monte Carlo simulation. They performed
simulations of a long-haul DMS system and showed that the resulting signal distribution
differs significantly from the Gaussian and chi-square distributions. Simulations of
CRZ systems and comparisons between the resulting signal distributions and chi-square
distributions have not been reported.
In real optical fiber transmission systems, all the physical impairments (optical fiber
dispersion, fiber nonlinearity, PMD, and ASE noise) are combined, and the system
may drift from time to time. Hence, the most direct way to obtain the noise distributions
is experimental measurement.
We performed experimental studies of the noise distribution in a recirculating optical
fiber loop described in [85]. We used a BER tester to measure the BER curve as a func-
tion of the decision threshold, and used an oscilloscope to record the histogram of the
detected electrical signals. In theory, it can be proved that the derivative of the BER
curve gives the difference between the two pdfs corresponding to the marks and spaces,
while the histogram of the detected signal gives the sum of the two pdfs. Hence, with the
difference and sum equations of the two pdfs, we can obtain each pdf separately. We
could not obtain reasonably accurate measurements of the optical signal distributions,
however, because of transmission system drifting, the accuracy limit of the oscilloscope,
and thermal noise in the electrical amplifier for the BER tester. To obtain accurate noise
statistics in optical fiber transmission systems, we need to perform more comprehensive
theoretical analysis, develop more efficient simulation algorithms, and design more prac-
tical experiments.
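
The sum-and-difference argument above can be checked numerically with a short sketch. It assumes that a sample is decided as a mark when it is at or above the threshold, that the prior probability of a mark is p1, and, purely to generate synthetic data, that the two pdfs are Gaussian with hypothetical means and standard deviations. Under these assumptions the derivative of the BER curve equals p1·f1 − (1 − p1)·f0 and the normalized histogram equals p1·f1 + (1 − p1)·f0, so the two pdfs follow by adding and subtracting.

```python
import math

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical synthetic channel: space ~ N(0, 0.1^2), mark ~ N(1, 0.2^2).
MU0, S0, MU1, S1 = 0.0, 0.1, 1.0, 0.2

def ber_curve(t, p1=0.5):
    """BER versus decision threshold t (decide 'mark' when the sample >= t)."""
    missed_mark = 0.5 * math.erfc((MU1 - t) / (S1 * math.sqrt(2.0)))   # P(x < t | mark)
    false_mark = 0.5 * math.erfc((t - MU0) / (S0 * math.sqrt(2.0)))    # P(x >= t | space)
    return p1 * missed_mark + (1.0 - p1) * false_mark

def histogram(t, p1=0.5):
    """Normalized histogram of the detected signal: the prior-weighted sum of the pdfs."""
    return p1 * gauss_pdf(t, MU1, S1) + (1.0 - p1) * gauss_pdf(t, MU0, S0)

def split_pdfs(ber, hist, t, p1=0.5, step=1e-4):
    """Recover (f0(t), f1(t)) from the BER curve and the histogram."""
    diff = (ber(t + step) - ber(t - step)) / (2.0 * step)  # p1*f1 - (1-p1)*f0
    total = hist(t)                                        # p1*f1 + (1-p1)*f0
    f1 = (total + diff) / (2.0 * p1)
    f0 = (total - diff) / (2.0 * (1.0 - p1))
    return f0, f1

f0, f1 = split_pdfs(ber_curve, histogram, 0.4)
print(f0, gauss_pdf(0.4, MU0, S0))
print(f1, gauss_pdf(0.4, MU1, S1))
```

With measured data, the numerical derivative amplifies measurement noise in the BER curve, which is consistent with the accuracy difficulties reported above.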
5.3.2 Application of low density parity check codes in optical fiber communication systems
Low density parity check (LDPC) codes are a class of linear block codes originally
discovered by Gallager [86] in the early 1960s that have recently been rediscovered and
generalized [87]–[89]. LDPC codes with soft-decision iterative decoding have been
demonstrated in simulations to perform quite close to the Shannon limit [88]–[91].
One of the advantages of LDPC codes is that the simple linear block code structure and
low density of the parity check matrix make the code implementation relatively easy.
Another advantage is that by increasing the codeword length, high performance can be
achieved with low overhead (redundancy). It has been shown that with sufficient block
length, LDPC codes may outperform turbo codes [91]. Simple implementation structure
and low overhead are two major factors in selecting FEC codes for optical fiber commu-
nication systems with very high data rate. Hence, LDPC codes may be very promising for
optical fiber communications systems.
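
The implementation simplicity that a sparse parity check structure affords can be illustrated with a toy sketch. It is not one of the LDPC constructions cited above, and it uses hard-decision bit flipping rather than the soft-decision iterative decoding discussed in this section. The code is a hypothetical 9-bit example in which the bits form a 3 × 3 array, every row and every column has even parity, and each bit therefore takes part in only two checks.

```python
# Each check lists the bit positions it covers (a sparse parity-check matrix).
GRID_CHECKS = [
    [0, 1, 2], [3, 4, 5], [6, 7, 8],   # row parity checks
    [0, 3, 6], [1, 4, 7], [2, 5, 8],   # column parity checks
]

def syndrome(checks, word):
    """Return 1 for each unsatisfied parity check, 0 otherwise."""
    return [sum(word[i] for i in check) % 2 for check in checks]

def bit_flip_decode(checks, word, max_iters=10):
    """Hard-decision bit flipping: flip the bits in the most failed checks."""
    word = list(word)
    for _ in range(max_iters):
        failed = syndrome(checks, word)
        if not any(failed):
            return word  # all parity checks satisfied
        counts = [0] * len(word)
        for is_failed, check in zip(failed, checks):
            if is_failed:
                for i in check:
                    counts[i] += 1
        worst = max(counts)
        word = [bit ^ 1 if c == worst else bit for bit, c in zip(word, counts)]
    return word  # may still contain errors if decoding did not converge

codeword = [1, 1, 0, 1, 0, 1, 0, 1, 1]
received = list(codeword)
received[0] ^= 1  # single bit error
print(bit_flip_decode(GRID_CHECKS, received) == codeword)
```

Each iteration touches only the few bits listed in each check, so the work per iteration grows with the number of nonzero entries of the parity check matrix rather than with its full size.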
Moreover, LDPC codes are linear codes; hence, the upper bound on linear code
performance derived in Chapter 3 can be directly applied to evaluate the LDPC code
performance.
5.3.3 Performance comparison of different FEC codes in optical fiber communication systems
Up to now, all the FEC codes representing third generation FEC codes in optical fiber
communications belong to the class of turbo product codes (TPC). There are several dif-
ferent classes of codes using soft iterative decoding and approaching the Shannon limit;
these include LDPC codes, parallel concatenated convolutional (PCC) turbo codes, and
serial concatenated convolutional (SCC) turbo codes. For future research, we suggest an
investigation of the performance of these different classes of codes in optical fiber chan-
nels, using both hard-decision iterative decoding and soft-decision iterative decoding, and
using the various channel models including the chi-square BAC, Gaussian BAC,
Gaussian BSC, chi-square continuous, and Gaussian continuous models.
Comparison of the code performances will help us evaluate the applicability of these
new classes of codes to optical fiber communications systems. Of particular concern here
is that while other communications systems aim at achieving BERs around 10⁻⁴ to 10⁻⁶,
optical fiber communications systems require more reliable performance, e.g., BERs <
10⁻¹¹. Hence, code performance in optical fiber channels should be compared at very low
BERs. In Chapter 3, we have shown that the PCC turbo codes may outperform the other
codes at low Q, and that the error floor effect can significantly decrease the slope of the
decoded BER curve as a function of the Q factor at comparatively high Qs (predicting very
low decoded BERs). Thus, other codes with similar code rate and block length, but without
the error floor effect, may outperform the PCC turbo code at very low decoded BERs.
To investigate the code performance at BERs < 10⁻¹¹, analytical evaluation of tight
performance bounds is more practical than code performance simulations, which are
too slow at such low BERs. Some other issues, including overhead costs, puncturing,
decoding complexity, and decoding delay, should also be investigated in a system environment.
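
A rough calculation indicates why direct simulation is impractical at such low BERs. The sketch below assumes that about 100 observed bit errors are wanted for a stable BER estimate and that the simulator processes one million decoded bits per second; both numbers and the function names are hypothetical and serve only to show the order of magnitude.

```python
def bits_needed(target_ber, min_errors=100):
    """Number of simulated bits needed to observe about min_errors bit errors."""
    return min_errors / target_ber

def simulation_days(target_ber, bits_per_second, min_errors=100):
    """Run time in days at an assumed simulator throughput."""
    return bits_needed(target_ber, min_errors) / bits_per_second / 86400.0

for ber in (1e-6, 1e-9, 1e-11):
    print(ber, bits_needed(ber), simulation_days(ber, bits_per_second=1e6))
```

Under these assumptions a BER of 10⁻¹¹ calls for about 10¹³ simulated bits, or more than one hundred days of run time, whereas a BER of 10⁻⁶ needs only 10⁸ bits.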
5.3.4 Experimental study and improvement of the SWC code
More work also remains to be done on line coding. The performance
of the SWC codes needs to be evaluated in general quasi-linear systems instead of
pure soliton systems. Because the soliton-soliton collision is a particular case of the
nonlinear inter-channel interference in WDM systems, the inter-channel interference has
similar physical dynamics to what was used in the development of the SWC code. Hence,
the SWC code is expected to work for errors induced by inter-channel interference in
general. To show this, some experiments in WDM optical fiber transmission systems us-
ing SWC codes should be performed. Moreover, there are other effects that may cause
data-pattern-dependent errors, for example, PMD. In future research, the effects of partial
collisions and PMD need to be addressed in the SWC code design.
Bibliography
[1] P. Kaiser, “OIDA Communications Roadmap Study,” Kaiser Global Consulting,
Aug. 1998.
[2] T. Georges and F. Favre, “WDM soliton transmission in dispersion-managed links,”
in European Conf. Opt. Comm, Sept. 1999, Nice, France, paper TuA3.1.
[3] A. Chraplyvy, “Terabit optical communications,” in European Conf. Opt. Comm,
Sep., 1999, Nice, France, paper MoC2.1.
[4] C. R. Davidson, C. J. Chen, M. Nissov, A. Pilipetskii, N. Ramanujam, H. D. Kidorf,
B. Pedersen, M. A. Mills, C. Lin, M. I. Hayee, J. X. Cai, A. B. Puc, P. C. Corbett,
R. Menges, H. Li, A. Elyamani, C. Rivers, and N. Bergano, “1800 Gb/s transmis-
sion of one hundred and eighty 10 Gb/s WDM channels over 7,000 km using full
EDFA C-band,” in OFC/IOOC’00 Technical Digest, Baltimore, MD, Mar. 2000,
paper PD25.
[5] C. A. Brackett, “Dense wavelength division multiplexing principles and applica-
tions,” IEEE Journal on Selected Areas in Communications, vol. 8, no. 6, pp. 948–
964, 1990.
[6] G. P. Agrawal, Fiber-optic Communication Systems, 2nd edition, John Wiley and
Sons, Inc., New York, 1997.
[7] B. Zhu, L. Leng, L. E. Nelson, Y. Qian, S. Stulz, Thiele, J. Bromage, L. Gruner-
Nielsen, S. Knudsen, C. Doerr, L. Stulz, S. Chandrasekhar, S. Radic, J. Park, K. S.
Feder, D. Vengsarkar, and Z. Chen, “3.08 Tb/s (77 × 42.7 Gb/s) transmission over
1200 km of non-zero dispersion-shifted fiber with 100-km spans using C- and L-band
distributed Raman amplification,” in OFC/IOOC’00 Technical Digest, Anaheim,
CA, Mar. 2001, paper PD23.
[8] K. Fukuchi, T. Kasamatsu, M. Morie, R. Ohhira, T. Ito, K. Sekiya, D. Ogasahara,
and T. Ono, “10.92-Tb/s (273 × 40-Gb/s) triple-band/ultra-dense WDM
optical-repeatered transmission experiment,” in OFC/IOOC’00 Technical Digest, Anaheim,
CA, Mar. 2001, paper PD24.
[9] S. Bigo, Y. Frignac, G. Charlet, S. Borne, P. Tran, C. Simonneau, D. Bayart, A.
Jourdan, J. P. Hamaide, W. Idler, R. Dischler, G. Veith, H. Gross, and W. Poehlmann,
“10.2 Tbit/s (256 × 42.7 Gbit/s PDM/WDM) transmission over 100 km TeraLight™
fiber with 1.28 bit/s/Hz spectral efficiency,” in OFC/IOOC’00 Technical
Digest, Anaheim, CA, Mar. 2001, paper PD25.
[10] T. Miyakawa, I. Morita, K. Tanaka, H. Sakata, and N. Edagawa, “2.56 Tbit/s (40
Gbit/s × 64 WDM) unrepeatered 230 km transmission with 0.8 bit/s/Hz spectral
efficiency using low-noise fiber Raman amplifier and 170 µm²-Aeff fiber,” in
OFC/IOOC’00 Technical Digest, Anaheim, CA, Mar. 2001, paper PD26.
[11] J. X. Cai, M. Nissov, A. N. Pilipetskii, A. J. Lucero, C.R. Davidson, D. Foursa, H.
Kidorf, M. A. Mills, R. Menges, P. C. Corbett, D. Sutton, and N. S. Bergano, “2.4
Tb/s (120 × 20 Gb/s) transmission over transoceanic distance using optimum FEC
overhead and 48% spectral efficiency,” in OFC/IOOC’00 Technical Digest, Anaheim,
CA, Mar. 2001, paper PD20.
[12] B. Bakhshi, M. F. Arend, M. Vaa, E. A. Golovchenko, D. Duff, H. Li, S. Jiang, W.
W. Patterson, R. L. Maybach, and D. Kovsh, “1 Tbit/s (101 × 10 Gbit/s) transmis-
sion over transpacific distance using 28 nm C-band EDFAs,” in OFC/IOOC’00
Technical Digest, Anaheim, CA, Mar. 2001, paper PD21.
[13] G. Vareille, F. Pitel, and J. F. Marcerou, “3 Tbit/s (300 × 11.6 Gbit/s) transmission
over 7380 km using C+L band with 25 GHz channel spacing and NRZ format,” in
OFC/IOOC’00 Technical Digest, Anaheim, CA, Mar. 2001, paper PD22.
[14] C. R. Menyuk, “Tutorial on modeling nonlinear lightwave systems,” in
OFC/IOOC’99 Technical Digest, San Diego, CA, Feb. 1999, paper ThW.
[15] D. Marcuse, “Single-channel operation in very long nonlinear fibers with optical
amplifiers at zero dispersion,” Journal of Lightwave Technology, no. 9, pp. 356–
361, 1991.
[16] G. P. Agrawal, Fiber-optic Communication Systems, 2nd edition, Chapter 10, John
Wiley and Sons, Inc., New York, 1997.
[17] W. D. Grover, and T. E. Moore, “Design and characterization of an error-correcting
code for the SONET STS-1 tributary,” IEEE Transactions on Communications, vol.
38, no. 4, pp. 467-476, April 1990.
[18] S. Yamamoto, H. Takahira, and M. Tanaka, “5 Gbps optical transmission terminal
equipment using forward error correction code and optical amplifier,” Electronics
Letters, vol. 30, no. 3, Feb. 1994.
[19] J. L. Pamart, E. Lefranc, S. Morin, G. Balland, Y. C. Chen, T. M. Kissell and J. L.
Miller, “Forward error correction in a 5 Gbit/s 6400 km EDFA based system,”
Electronics Letters, vol. 30, no. 4, pp. 342–343, Feb. 17, 1994.
[20] A. Puc, F. Kerfoot, A. Simons, and D. L. Wilson, “Concatenated FEC experiment
over 5000 km long straight line WDM test bed,” in OFC/IOOC’99 Technical Digest,
San Diego, CA, Feb. 1999, pp. ThQ6-1–ThQ6-3.
[21] H. Kidorf, N. Ramanujam, I. Hayee, M. Nissov, J. Cai, B. Pedersen, A. Puc, and C.
Rivers, “Performance improvement in high capacity, ultra-long distance, WDM
systems using forward error correction codes,” in OFC/IOOC’00 Technical Digest,
Baltimore, MD, Mar. 2000, pp. ThS3-1–ThS3-3.
[22] O. Ait Sab, and V. Lemaire, “Block turbo code performances for long-haul DWDM
optical transmission systems,” in OFC/IOOC’00 Technical Digest, Baltimore, MD,
Mar. 2000, pp. ThS5-1–ThS5-3.
[23] O. Ait Sab, “FEC techniques in submarine transmission systems,” in OFC/IOOC’01
Technical Digest, Anaheim, CA, Mar. 2001, pp. TuF1-1–TuF1-3.
[24] H. Taga, H. Yamauchi, T. Inoue, and K. Goto, “Performance improvement of
highly nonlinear long-distance optical fiber transmission system using novel high
gain forward error correcting code,” in OFC/IOOC’01 Technical Digest, Anaheim,
CA, Mar. 2001, pp. TuF3-1–TuF3-3.
[25] Y. Cai, N. Ramanujam, J. M. Morris, T. Adali, G. Lenner, A. B. Puc, and A.
Pilipetskii, “Performance limit of forward error correction codes in optical fiber
communications,” in OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001,
pp. TuF2-1–TuF2-3.
[26] Y. Cai, and J. M. Morris, “On Performance Bounds for Linear Codes in Optical Fi-
ber Communications Systems with Asymmetric Amplified Spontaneous Emission
Noise,” in Proceedings of Conference on Information Sciences and Systems, Balti-
more, MD, Mar. 2001.
[27] Y. Cai, J. M. Morris, T. Adalι, and C. R. Menyuk, “On The Effects of ASE Noise
Models on Turbo Code Decoder Performance in Optical Fiber Transmissions,” to
appear in CLEO/QELS’ 01 Tech. Digest, Baltimore, MD, May 2001, paper CThO.
[28] Y. Cai, T. Adalι, and C. R. Menyuk, “A line coding scheme for reducing timing
jitter in WDM soliton systems,” in OFC/IOOC’00 Technical Digest, Baltimore,
MD, Mar. 2000, pp. ThS4-1–ThS4-3.
[29] Y. Cai, T. Adalι, and C. R. Menyuk, “Error Mitigation System Using Line Coding
for Optical WDM Communications,” Patent Application S/N 06/185,400, filed on
February 28, 2000.
[30] Y. Takasaki, et al., “Two-level AMI line coding family for optical fiber systems,”
International Journal of Electronics, vol. 55, no. 1, pp. 121–131, July 1983.
[31] R. M. Brooks and A. Jessop, “Line coding for optical fiber systems,” International
Journal of Electronics, vol. 55, no. 1, pp. 81–120, July 1983.
[32] A. J. Sharland and A. Stevenson, “A simple in-service error detection scheme based
on the statistical properties of line codes for optical fibre systems,” International
Journal of Electronics, vol. 55, no. 1, pp. 141–158, July 1983.
[33] R. Petrovic, “5B6B optical fibre line code bearing auxiliary signals,” Electronics
Letters, vol. 24, no. 5, pp. 274–275, Mar. 1988.
[34] A. Wismeijer, P.W.G. Duijves, H. Van Harten, A.M.J. Koonen, J.S. Leong, P.E.
Schaafsma, and M. Weeda, “A 1.13 Gb/s optical transmission system with ternary
line code,” in Proc. of ECOC ’86, Barcelona, pp. 475–478.
[35] G. Hanke, and B. Hein, “Monomode transmission system operating with 1300 nm
lasers and 1550 nm DFB lasers at a bitrate of 2.23 Gbit/s,” in Proceedings IEEE
International Conference on Communications, Seattle, WA, Jun. 1987.
[36] A. M. J. Koonen, P. V. Eijk, P. H. V. Heijningen, and T. W. M. Mosch, “2.26
Gbit/s optical transmission system with 5B6B line coding,” Electronics Letters, vol.
26, no. 12, pp. 799–801, June 1990.
[37] W. A. Krzymien, “Transmission performance analysis of a new class of line codes
for optical fiber systems,” IEEE Transactions on Communications, vol. 37, no. 4,
pp. 402–404, Apr. 1989.
[38] I. J. Fair, W. D. Grover, W. A. Krzymien, and R. I. MacDonald, “Guided scram-
bling: a new line coding technique for high bit rate fiber optic transmission sys-
tems,” IEEE Transactions on Communications, vol. 39, no. 2, pp. 289–297, Feb.
1991.
[39] R. L. Fellows and T. B. Reynolds, “Synchronous optical digital transmission system
and method,” U.S. Patent, Patent no. 5,459,607, Oct. 17, 1995.
[40] R. S. Vodhanel, B. Enning, and A. F. Elrefaie, “Bipolar optical FSK transmission
experiments at 150 Mb/s and 1 Gb/s,” Journal of Lightwave Technology, vol. 6, no.
10, pp. 1549–1553, Oct. 1988.
[41] R. Noe, M. W. Maeda, S. G. Menocal, and C. E. Zah, “Pattern independent FSK
heterodyne transmission with AMI signal format and two channel cross-talk meas-
urements,” Journal of Optical Communications, vol. 10, no. 3, pp. 82–84, Sep.
1989.
[42] H. Tsushima, S. Sasaki, R. Takeyasi, and K. Uomi, “Alternate-mark-inversion opti-
cal continuous phase FSK heterodyne transmission using delay line demodulation,”
Journal of Lightwave Technology, vol. 9, no. 3, pp. 666–674, May 1991.
[43] P. W. Hooijmans, M. T. Tomesen, and A. Van de Grip, “Penalty free biphase line
coding for pattern independent FSK coherent transmission systems,” Journal of
Lightwave Technology, vol. 8, no. 3, pp. 323–328, Mar. 1990.
[44] R. C. Steele and M. Creaner, “565 Mbit/s AMI FSK coherent system using com-
mercial DFB lasers,” Electronics Letters, vol. 25, no. 11, pp. 732–734, May 1989.
[45] S. P. Majunder, R. Gangopadhyay, and G. Prati, “Effect of line coding on hetero-
dyne FSK optical systems with nonuniform laser FM response,” IEE Proceedings.
J, Optoelectronics., vol. 141, no. 3, pp. 200–208, June 1994.
[46] E. Forestieri, and G. Prati, “Analysis of delay-and-multiply optical FSK receivers
with line coding and non-flat laser FM response,” IEEE Journal on Selected Areas
in Communications, vol. 13, no. 3, pp. 543–556, Apr. 1995.
[47] International Telecommunication Union Telecommunication standardization sector
(ITU-T), Series G: Transmission Systems and Media, Digital Systems and Net-
works, G.975.
[48] P. A. Humblet and M. Azizoglu, “On the bit error rate of lightwave systems with
optical amplifiers,” Journal of Lightwave Technology, vol. 9, no. 11, pp. 1576–
1582, Nov. 1991.
[49] D. Marcuse, “Derivation of analytical expressions for the bit-error probability in
lightwave systems with optical amplifiers,” Journal of Lightwave Technology, vol.
8, no. 12, pp. 1816–1823, Dec. 1990.
[50] R. Holzlöhner, V. S. Grigoryan, C. R. Menyuk, and W. L. Kath, “Accurate calcula-
tion of eye diagrams and error rates in long-haul transmission systems,” in
OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, pp. MF3-1–MF3-3.
[51] C. D. Poole and J. Nagel, “Polarization effects in lightwave systems,” in Optical
Fiber Telecommunications IIIA, I. P. Kaminow and T. L. Koch, eds. Academic, San
Diego, 1997, (Chap. 6).
[52] R. M. Mu, T. Yu, V. S. Grigoryan, and C. R. Menyuk, “Convergence of the CRZ
and DMS formats in WDM systems using dispersion management,” in
OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, pp. MF3-1–MF3-3.
[53] K. W. Cattermole, “Principles of digital Line Coding,” International Journal of
Electronics, vol. 55, no. 1, pp. 3–33, July 1983.
[54] J. L. LoCicero and B. P. Patel, “Line Coding,” The Communications Handbook,
Jerry D. Gibson, Editor-in-Chief, CRC Press, Boca Raton, FL, 1997.
[55] S. Lin, and D. J. Costello, Jr., Error control coding: fundamentals and applications,
Prentice Hall, Inc. Englewood Cliffs, NJ, 1983.
[56] C. Berrou, et al., “Near Shannon Limit Error-Correcting Coding and Decoding,” in
Proceedings IEEE International Conference on Communications, Geneva, Swit-
zerland, May 1993, pp. 1064-1070.
[57] S. Benedetto and G. Montorsi, “Unveiling Turbo Codes: Some Results on Parallel
Concatenated Coding Schemes,” IEEE Transactions on Information Theory, vol.
42, no. 3, Mar. 1996, pp. 409-428.
[58] G. S. Pandian and S. Dilwali, “On the thermal FM response of a semiconductor la-
ser diode,” IEEE Photonics Technology Letters, vol. 4, no. 2, pp. 130-133, Feb.
1992.
[59] S. J. Wang, Y. J. Wang, N. K. Dutta, and Y. Twu, “FM response of InGaAsP buried
heterostructure distributed feedback lasers and their applications in incoherent FSK
systems,” Journal of Lightwave Technology, vol. 8, no. 12, pp. 1769–1771, Dec.
1990.
[60] D. J. Costello, Jr., et al., “Applications of Error-Control Coding”, IEEE Transac-
tions on Information Theory, vol. 44, no. 6, Oct. 1998, pp. 2531-2560.
[61] A. R. Calderbank, “The Art of Signaling: Fifty Years of Coding Theory”, IEEE
Transactions on Information Theory, vol. 44, no. 6, Oct. 1998, pp. 2561-2595.
[62] C. E. Shannon, “A Mathematical Theory of Communication”, Bell System Tech. J.,
vol. 27, 1948, pp. 379-423, 623-656.
[63] L. F. Mollenauer, S. G. Evangelides, and J. P. Gordon, “Wavelength division multi-
plexing with solitons in ultra-long distance transmission using lumped amplifiers,”
Journal of Lightwave Technology, vol. 9, no. 3, pp. 362-367, Mar. 1991.
[64] L. F. Mollenauer, “Method for nulling nonrandom timing jitter in soliton
transmission,” Optics Letters, vol. 21, no. 6, pp. 384-386, Mar. 15, 1996.
[65] R. J. McEliece, The Theory of Information and Coding, Reading, Mass.: Addison-
Wesley Publishing Company, 1977.
[66] Y. V. Svirid, “Weight distributions and bounds for Turbo-codes,” European Trans-
actions on Telecommunications, vol. 6, no. 5, pp. 543–555, September–October,
1995.
[67] B. Vucetic and J. Yuan, Turbo codes principles and applications, Kluwer Academic
Publishers, Norwell, Massachusetts, 2000.
[68] S. B. Wicker, Error control systems for digital communication and storage, Pren-
tice Hall, Englewood Cliffs, NJ, 1995, pp. 305.
[69] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, “Lower bounds to error prob-
ability for coding on discrete memoryless channels,” Information and Control, vol.
10, Part I: pp. 65–103, Part II: pp. 522–552, 1967.
[70] D. Divsalar, S. Dolinar, R. J. McEliece, and F. Pollara, “Transfer function bounds
on the performance of turbo codes,” Jet Propulsion Lab., Pasadena, CA, TDA Prog-
ress Report 42-122, pp. 44–55, Aug. 15, 1995.
[71] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for
minimizing symbol error rate,” IEEE Transactions on Information Theory, pp. 284–
287, Mar. 1974.
[72] C. Partridge, J. Hughes, and J. Stone, “Performance of checksums and CRC's over
real data”, Sigcomm '95, Cambridge, MA USA, 1995.
[73] G. L. Cariolaro, and G. P. Tronca, “Spectra of block coded digital signals”, IEEE
Transactions on Communications, vol. Com-22, no. 10, Oct. 1974.
[74] E. Biglieri, et al., Introduction to Trellis-Coded Modulation with Applications,
Macmillan, 1991.
[75] D. Haccoun, and G. Begin, “High-rate punctured convolutional codes,” IEEE
Transactions on Communications, vol. COM-37, no. 11, pp. 1113–1125, Nov 1989.
[76] J. W. Modestino, and S. Y. Mui, “Convolutional codes on Rician fading channels,”
IEEE Transactions on Communications, vol. COM-24, no. 6, pp. 592–606, June
1976.
[77] G. Ungerboeck, “Channel coding with amplitude/phase modulation,” IEEE Trans-
actions on Information Theory, vol. IT-28, pp. 55–67, Jan. 1982.
[78] A. J. Viterbi, “Convolutional codes and their performance in communication sys-
tems,” IEEE Transactions on Communications, vol. COM-19, no. 10, pp. 751–772,
Oct. 1971.
[79] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applica-
tions, Prentice-Hall, Inc. Englewood Cliffs, New Jersey, 1983.
[80] G. D. Forney, Jr., Concatenated codes, Cambridge, Mass.: MIT. Press, 1966.
[81] J. M. Morris, “Burst error statistics of simulated Viterbi decoded BPSK on fading
and scintillating channels,” IEEE Transactions on Communications, vol. 40, no. 1,
Jan. 1992.
[82] J. M. Morris and J. Chang, “Burst error statistics of simulated Viterbi Decoded
BFSK and high-rate punctured codes on fading and scintillating channels,” IEEE
Transactions on Communications, vol. 43, no. 2/3/4, February/March/April 1995.
[83] PTDS Version 1.1 for Windows NT, Virtual Photonics Incorporated, 1999.
[84] A. Richter, and V. S. Grigoryan, “Efficient approach to estimate collision-induced
timing jitter in dispersion-managed WDM RZ systems,” in OFC/IOOC’99 Techni-
cal Digest, San Diego, California USA, Feb. 1999, pp. WM33-1–WM33-3.
[85] R. M. Mu, V. S. Grigoryan, C. R. Menyuk, G. M. Carter, and J. M. Jacob, “Com-
parison of theory and experiment for dispersion-managed solitons in a recirculating
fiber loop,” IEEE Journal on Selected Topics in Quantum Electronics, vol. 6, no. 2,
pp. 248–257, Mar. 2000.
[86] R. G. Gallager, Low Density Parity Check Codes, MIT Press, Cambridge, MA,
1963.
[87] Y. Kou, S. Lin and M. Fossorier, “Construction of Low Density Parity Check
Codes – A Geometric Approach,” in Proceedings of International Symposium on
Turbo Codes and Related Topics, Brest, France, 4–7 Sept. 2000.
[88] S. Lin, H. Tang, and Y. Kou, “Finite Geometry Low Density Parity Check Codes”,
in Proceedings of Conference on Information Sciences and Systems, Baltimore,
MD, Mar. 2001.
[89] D. J. C. MacKay, “Good Error-Correcting Codes Based on Very Sparse Matrices”,
IEEE Transactions on Information Theory, vol. 45, no. 3, pp. 399–432, Mar. 1999.
[90] D. J. C. MacKay and R. M. Neal, “Near Shannon Limit Performance of Low Density
Parity Check Codes”, Electronics Letters, vol. 32, no. 18, pp. 1645–1646, Aug.
1996.
[91] S.-Y. Chung, G. D. Forney, Jr., T. J. Richardson, and R. Urbanke, “On the Design
of Low-Density Parity-Check Codes within 0.0057 dB from the Shannon Limit”,
IEEE Communications Letters, vol. 5, no. 2, pp. 58–60, Feb. 2001.
[92] W. E. Ryan, “A turbo code tutorial,” http://www.ece.arizona.edu/~ryan/turbo2c.pdf.
[93] L. F. Mollenauer, J. P. Gordon, and M. N. Islam, “Soliton propagation in long fibers
with periodically compensated loss,” IEEE Journal of Quantum Electronics, vol.
QE-22, pp. 157-173, Jan. 1986.
[94] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Draft
2.0.7, Part II, Chapter 10, Feb. 14, 2000.
[95] I. Sason and S. Shamai, “Improved upper bounds on the ML decoding error prob-
ability of parallel and serial concatenated turbo codes via their ensemble distance
spectrum,” IEEE Transactions on Information Theory, vol. 46, no. 1, pp. 24–47,
Jan. 2000.
[96] G. P. Agrawal, Nonlinear fiber optics, 2nd edition, Academic Press, Inc., San Di-
ego, CA, 1995.