On Forward Error Correction Codes and Line-coding

Schemes in Optical Fiber Communications

by

Yi Cai

Dissertation submitted to the Faculty of the Graduate School

of the University of Maryland in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

2001

Copyright 2001 by Yi Cai

To my parents and my wife

Acknowledgements

I could not have completed my Ph.D. studies and research in three years without the

support of many people who are gratefully acknowledged here.

I would like to express my sincere gratitude to Dr. Tülay Adalı and Dr. Joel M. Morris,
my co-advisors, for their extraordinary enthusiasm, encouragement, and guidance
throughout the course of my Ph.D. study. Dr. Adalı was the initiator who made the whole
procedure possible by offering me a research assistant position in her group in 1998. She
has been a dependable helper in any kind of difficulty, and her expertise in signal
processing and her probing suggestions in the technical discussions helped develop some
fresh ideas in my research. Dr. Morris, from whom I took most of my courses at UMBC,
has been a constant source of insight and vision, and I have benefited from his valuable
experience and constructive criticism. His expertise in the field of coding technology
has been the solid base for my dissertation research. He has also made valuable and
extensive contributions in reviewing and revising this dissertation.

I would like to gratefully acknowledge the contributions of Dr. Curtis R. Menyuk who

has provided the major direction for my research and pointed out the promise of signal

processing and coding technology for significant advances in optical fiber communica-

tions. His effort and contribution in reviewing and revising the dissertation is also greatly

appreciated.

I would like to deeply thank Dr. Gary M. Carter for offering me the chance to perform
experiments in his lab and financially supporting this work together with Drs. Adalı and
Menyuk.

I would also like to express my sincere appreciation to Dr. A. Brinton Cooper, III, for

serving on my dissertation committee and carefully going through details of the disserta-

tion. His valuable advice has led to improvement of this dissertation.

Special thanks go to Drs. Nandakumar Ramanujam, Alexei Pilipetskii, Andrej Puc, and

Gerald E. Lenner from TyCom for a very productive and exciting summer internship ex-

perience. They offered me the opportunity to get a flavor of the issues in a real optical

fiber transmission system, and the technical discussions with them stimulated some of the

ideas in the dissertation.

Thanks and appreciation are also due to Bo Wang, Hongmei Ni, Sneha Agarwal, Arvind
Ananthan, and Wenze Xi, my lab-mates in the Information Technology Laboratory,
for providing me such a creative and friendly work environment. I also want to thank my
classmates in ENEE728A, Chuck LaBerge, William R. Martin, Amitkumar Mahadevan,
and Felix Watson, for the interesting class discussions that helped me gain a deeper
understanding of turbo codes and low density parity check codes.

Thanks also to my research colleagues in the photonics group, Ruomei Mu, Vladimir

Grigoryan, Yu Sun, Hai Xu, Ronald Holzlöhner, John Zweck, Ivan T. Lima, Jr., Brian

Marks, Hua Jiao, Jiping Wen, Oleg Sinkin, Aurenice Lima, and Heider Ereifej, for their

valuable comments on my research, their kind help in setting up an office with their

group for me, and their active cooperation in performing the experiments.

Finally, I would like to express very special thanks to my parents, my wife, and my

daughter for their love, encouragement, endurance, and understanding. I hope this degree

will be a realization of my parents' dream and a nice reward for all the lonely weekends

my wife and daughter had to spend during my study.

Table of Contents

List of Tables
List of Figures
1. Introduction
    1.1 Introduction
    1.2 Major sources of impairment in optical fiber communications
    1.3 Previous work on coding techniques in optical fiber communications
    1.4 Motivation of our research
    1.5 Dissertation organization
2. Modeling of amplified spontaneous emission noise (ASE) and soliton-soliton collisions (SSC) in optical fiber transmission systems
    2.1 Statistics of ASE noise and channel models
    2.2 Physical mechanism of SSC and simplified model for SSC-induced timing jitter
    2.3 Summary
3. Performance of forward error correction (FEC) codes in correcting ASE-induced errors
    3.1 Lower bound for general FEC code performance
    3.2 Upper bound for linear FEC code performance
    3.3 Performance improvement of turbo codes
    3.4 Summary
4. A sliding window criterion (SWC) line-code for mitigating soliton-soliton collision induced errors
    4.1 Reed-Solomon (RS) codes without line-coding
    4.2 SWC code
    4.3 Block and trellis-based SWC codes
    4.4 Concatenated RS/SWC coding scheme
    4.5 Performance and comparisons via simulations
    4.6 Summary
5. Summary and Conclusions
    5.1 Summary
    5.2 Conclusions
    5.3 Suggestions for future research
Bibliography

List of Tables

4.1 SWC codeword examples
4.2 Codeword look-up tables for a trellis-based SWC code
4.3 Bit errors and symbol errors induced by soliton-soliton collision

List of Figures

2.1 Comparison of the chi-square distribution and the Gaussian approximation
2.2 Binary-in-binary-out (BIBO) channel model
2.3 Comparison of the hard-decision thresholds based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation
2.4 Comparison of the detected BERs as a function of Q, based on the chi-square distribution, Gaussian approximation, and Gaussian + BSC approximations
2.5 Comparison of the transition probabilities based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 as functions of Q^2
2.6 Optical soliton transmission
2.7 Changes of soliton velocity and acceleration during collision versus distance
2.8 Soliton-soliton collision in a two-channel WDM system; the rectangular block in the figure is defined as the sliding window
2.9 Patterns of SSC-induced bit errors in (a) the middle channel of a 4-channel 12 Gb/s WDM system and (b) the middle channel of a 4-channel 14 Gb/s WDM system
3.1 Illustration of the source-channel coding theorem
3.2 Comparison of the channel capacities evaluated based on the chi-square BAC, Gaussian BAC, and Gaussian BSC models of optical fiber channel with dominant ASE noise
3.3 The quantity fs giving minimum value of I(U, V) as a function of Pe for p = 0.1, 0.2, …, 0.9
3.4 Comparison of the exact rate distortion function and the approximation based on equal transition probabilities, fs = ms, for different source distributions, p = 0.1, …, 0.9
3.5 Comparison of the lower performance bounds of FEC codes evaluated with chi-square BAC, Gaussian BAC, and Gaussian BSC models
3.6 The µ(s, d_j^+) curves with different values of d_j^+
3.7 Comparison of µ(0.5, d/2)/2 and min µ(s, d_j^+)/2 at d = 3, 6, 9, 12 for the optical fiber channel with M = 3
3.8 Codeword structure of the Hamming (7, 4) × (7, 4) TPC
3.9 Encoder structure of (1, 5/7, 5/7) TCC with 100-bit interleaver
3.10 Upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (triangles) and the (1, 5/7, 5/7) TCC with interleaver length 100 (circles) using the Gaussian (dotted) and the chi-square (solid) ASE noise models
3.11 Comparison of the upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (solid) and the (1, 5/7, 5/7) TCC with interleaver length 20 (dashed) using the chi-square ASE noise model
3.12 Comparison of the pdfs of the ASE noise with chi-square distribution and Gaussian approximation with the same mean and variance
3.13 Likelihood ratio using the hard-decision threshold based on a Gaussian BSC model for Bo/Be = 3
3.14 Turbo code encoder and decoder structure
3.15 Output BER comparison of the turbo code (31, 27, 400) decoder based on the chi-square model (solid), the Gaussian model (dotted), and the Gaussian BSC model (dashed) of the ASE noise in the optical fiber transmission system; the rate 1/2 and rate 3/4 codes are punctured versions of the rate 1/3 turbo code
4.1 Approximated distribution of SSC-induced time shift
4.2 SSC-induced BERs before RS decoding and error correction capability of RS (255, m) codes as a function of redundancy k = 255 – m at the data rate of 12.5 Gb/s
4.3 Soliton-soliton collision in a two-channel WDM system; the rectangular block is defined as the sliding window
4.4 Algorithms for generating the SWCMBNB code table
4.5 Continuous components of the power spectral densities of the uncoded random signal (solid) and the signals encoded by the FF8B10B (dash-dot), the WF8B10B (dotted), and the Manchester (dotted) codes
4.6 Implementation of the block SWC code
4.7 Function diagram of the trellis-based SWC encoder
4.8 Trellis diagram of the trellis-based SWC encoder
4.9 Trellis of the (4, 3, 2) SWC encoder
4.10 Possible combinations of number of marks in v(0), v(1) and v(2)
4.11 Concatenated RS/SWC coding scheme
4.12 Reduction of the SSC-induced timing jitter with a SWC (10, 8) code
4.13 Comparison of the code performances in enhancing the WDM system capacity in (a) transmission bit rate and (b) channel spacing
4.14 Probability mass function (pmf) of the number of marks in the sliding window on the data sequence encoded with the fragmentation-first (star) and the weight-first (triangle) algorithms for codeword length = 14 bits, and (a) sliding window length = 4 bits and (b) sliding window length = 14 bits
4.15 SSC-induced timing jitter of desirable (square), random (no sign), and undesirable (circle) data patterns. Solid: timing jitter in middle channel. Dotted: timing jitter in the outmost channel
4.16 Eye diagrams of the received signals with undesirable (upper) and desirable (lower) patterns

Chapter 1

Introduction

1.1 Introduction

The growth in demand for broadband services has led to considerably increased activ-

ity in research for optical fiber communications systems and networks with high trans-

mission bit rate and high spectral efficiency [1]–[4]. The standard optical fiber installa-

tion can provide ~25 THz bandwidth, which is far greater than what is currently in use.

This potential capacity can be exploited through the use of wavelength division multi-

plexing (WDM) in optical fiber communications. In WDM systems, a number of differ-

ent independent wavelengths are transmitted simultaneously on one optical fiber and,

thus, they more fully utilize the enormous fiber bandwidth [5], [6]. A major enabling

technology for multi-wavelength systems is the optical amplifier that can provide gain to

many channels simultaneously over a ~THz wavelength range. Moreover, the transmis-

sion bit rate per channel has been increasing, and systems with a single channel rate of 40

Gbps have emerged [7]–[10]. With the continuous efforts in channel spacing reduction

and transmission bit rate enhancement, optical fiber transmission systems with bit rates as
high as several Tbps [7]–[13] and spectral efficiencies of more than 0.6 bit/s/Hz [9], [10] have
been demonstrated.

However, the physical impairments in the optical fiber transmission lines limit the ob-

tainable channel spacing and data rates in optical fiber communications. The major

sources of impairment in optical fiber communications systems include the amplified

spontaneous emission (ASE) noise from the optical amplifiers, chromatic dispersion, fi-

ber nonlinearities (particularly the Kerr nonlinearity), and polarization effects (particu-

larly polarization mode dispersion in terrestrial systems) [14].

Two important trends have emerged in the drive to combat these impairments. First is

the use of modulation formats, for the launched optical pulses, that are quasi-linear [15].

This leads to two major types of optical fiber transmission systems –– chirped return-to-

zero (CRZ) systems and dispersion managed soliton (DMS) systems [15]. CRZ and DMS

are two signal modulation formats.

The CRZ format has a quasi-linear evolution, which means that when the fiber non-

linearity has been carefully mitigated, the optical pulse evolution appears linear in im-

portant respects [52].

The DMS format comes in two major variants. The first, which we refer to as periodi-

cally stationary DMS, is a format in which pulses return to the same shape at the end of

every period in the dispersion map. So, there is a balance between nonlinearity and dis-

persion. The second variant of the DMS format, which we refer to as quasi-linear DMS,

does not have periodically stationary behavior. Like the CRZ format, the optical pulse

evolution appears linear in important respects once the nonlinearity is mitigated [52].

The second trend is the growth in importance of error correction and line-coding, as
well as signal processing, in optical fiber communications systems [17]–[46]. Applications
of coding techniques in optically amplified WDM fiber transmission systems add significant
system margin against physical impairments and can thus be used to increase the bit rate,
reduce the channel spacing, increase the transmission distance, and reduce the system power
budget. Applications of forward error correction (FEC) codes have been standardized in
long-haul undersea systems [47] and are predicted to play an important role in future 40
Gbps long-haul terrestrial systems.

This dissertation addresses the application of FEC and line-coding to achieve higher bit
rates and spectral efficiency in optical fiber communications. In this introductory chapter,
we first describe the main sources of impairment in optical fiber communications and

their major physical effects. Then, we provide a brief survey of the previous work on ap-

plications of coding techniques in optical fiber communications. Based on this survey, we

point out some very important research topics that have not been addressed by other re-

searchers, which provides the motivation for our research and the subject of this disserta-

tion. This chapter ends with an outline of the dissertation.

1.2 Major sources of impairment in optical fiber communications

ASE noise, chromatic dispersion, fiber nonlinearities, and polarization effects are the

major sources of physical impairments limiting the achievable transmission capacity in

optical fiber communications systems. We now give a brief description of each of them.

1.2.1 ASE noise in optical amplifiers

Optical amplifiers consist of an active medium that has the carrier population in its
quantum energy levels inverted by a pump source, so that an input optical signal can
initiate stimulated emission and achieve coherent gain. Along with stimulated emission,
there is always spontaneous emission, which leads to noise. A fraction of the spontaneous
emission is coupled into the beam propagation path in the optical fiber and is amplified
[6]. This amplified spontaneous emission (ASE) noise is broadband, occurring over the
entire gain bandwidth of the optical amplifier [6]. Moreover, in systems with lumped
optical amplifiers, the accumulated ASE noise may cause gain saturation and thereby limit
the achievable signal gain. ASE noise can be characterized as white noise for each
channel in a WDM system, and it decreases the signal-to-noise ratio (SNR) at the receiver.

To investigate the statistics of ASE noise, it is advantageous [48], [49] to represent it
with a set of orthonormal functions, \phi_i(t), over the transmitted optical signal period T by

$$\sum_{i=1}^{2M} n_i \phi_i(t), \tag{1.1}$$

where 2M is the dimensionality of the space of the transmitted optical signals and the n_i
are independent Gaussian random variables with zero mean and identical variance. The
transmitted optical signal under these assumptions can be expanded in the same basis as

$$\sum_{i=1}^{2M} s_i \phi_i(t). \tag{1.2}$$

Thus, the transmitted optical signal with ASE noise can be characterized as the sum of a

set of independent Gaussian random processes given by

$$\sum_{i=1}^{2M} (s_i + n_i)\, \phi_i(t). \tag{1.3}$$

If we neglect the changes that occur during the transmission (which cannot be done in
practice), then the received optical signal will have the same distribution. However,
because the photo-detector at the receiver with direct detection is inherently a square-law
device, the detected electrical signal, I, will equal the square of the incoming optical
signal, and can be approximated by

$$I = \int_0^T \left[ \sum_{i=1}^{2M} (s_i + n_i)\, \phi_i(t) \right]^2 dt = \sum_{i=1}^{2M} (s_i + n_i)^2. \tag{1.4}$$

The detected electrical signal, I, can therefore be characterized as a sum of squared,
non-zero-mean Gaussian random variables. Hence, the statistics of the detected signal
in Eq. (1.4) are no longer Gaussian but chi-square [48]. Moreover, the expansion of the
square terms in Eq. (1.4) yields a “signal/noise beat” term, 2 n_i s_i, for the
case where s_i ≠ 0. This case corresponds to the occurrence of optical pulses, and it is
customary to call these occurrences “marks.” There are no such terms for the case s_i = 0,
where the time slots do not contain an optical pulse. It is customary to call these empty
slots “spaces.” Therefore, marks and spaces have different distribution functions and
variances. As a consequence, the distributions of marks and spaces are asymmetric. This
will lead to a binary asymmetric channel model for the post-detection signal.
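The asymmetry between marks and spaces can be checked numerically. The following is a minimal Monte Carlo sketch of Eq. (1.4), not the channel model used later in this dissertation. It assumes M = 3, a noise standard deviation of 0.2 per component, and a mark of unit energy split equally over the 2M components; these values and the function names are illustrative assumptions.

```python
import random
import statistics

def detected_signal(s, sigma, rng):
    """One sample of I = sum_i (s_i + n_i)^2 with n_i ~ N(0, sigma^2), i.i.d."""
    return sum((s_i + rng.gauss(0.0, sigma)) ** 2 for s_i in s)

def simulate(M=3, energy=1.0, sigma=0.2, trials=20000, seed=1):
    """Draw detected-signal samples for marks and for spaces."""
    rng = random.Random(seed)
    dims = 2 * M
    # Assumption: the mark energy is split equally over the 2M components.
    mark = [(energy / dims) ** 0.5] * dims
    space = [0.0] * dims  # a space carries no optical pulse, so s_i = 0
    marks = [detected_signal(mark, sigma, rng) for _ in range(trials)]
    spaces = [detected_signal(space, sigma, rng) for _ in range(trials)]
    return marks, spaces

marks, spaces = simulate()
print("marks : mean, variance =", statistics.mean(marks), statistics.variance(marks))
print("spaces: mean, variance =", statistics.mean(spaces), statistics.variance(spaces))
```

For these assumed parameters, summing the means and variances of the individual squared terms gives a mean of 1.24 and a variance of 0.1792 for marks, against 0.24 and 0.0192 for spaces. The extra variance of the marks comes entirely from the signal/noise beat term.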

From the above discussion, we can see that ASE noise causes non-Gaussian and

asymmetric distributions of the detected signals. However, for simplicity in the analytical

studies involving ASE noise, the Gaussian approximation and the binary symmetric

channel approximation for optical fiber channels with dominant ASE noise are widely

used. This approximation is used even in studies of FEC codes [20]–[23], which yields

suboptimal designs and performance assessments as will be shown in the following

chapters. We want to point out that Eq. (1.4) is still an approximation of the detected sig-

nal with additive ASE noise, because only the amplitude fluctuation due to ASE noise is

considered here. However, ASE may also induce timing jitter of the optical signals at the

detector, where timing jitter is defined as the random deviation of the optical pulse posi-

tion from its nominal location at the time slot center [16]. It has been shown in [50] that

ASE-induced timing jitter may also cause significant differences in the detected signal

statistics at the detector due to the ASE noise, especially in the tails of the probability

density functions (pdf).

1.2.2 Chromatic dispersion

Chromatic dispersion is a fundamental physical phenomenon in optical fibers that is

also called group velocity dispersion. It refers to the wavelength dependence of the re-

fractive index of optical fibers [6]. We know that the speed of light in optical fiber is de-

termined by the refractive index. If the refractive index is wavelength dependent, optical

signals at different wavelengths travel at different speeds in the optical fiber.

Consequently, optical signals belonging to different channels may pass through and

interact with each other during the propagation in WDM systems. We describe this phe-

nomenon as signal collision. Specifically, in WDM optical soliton transmission systems,

it is called soliton-soliton collision.
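To give a feeling for the size of this effect, the relative delay between two channels can be estimated to first order as the product of a dispersion parameter D, the channel spacing, and the distance. The sketch below uses this standard first-order relation, which is not stated in this section; the dispersion parameter and all numerical values are illustrative assumptions rather than parameters of the systems studied here.

```python
def walkoff_ps(dispersion_ps_per_nm_km, spacing_nm, length_km):
    """First-order delay between two channels: dt = D * d_lambda * L (in ps)."""
    return dispersion_ps_per_nm_km * spacing_nm * length_km

def slots_walked(dispersion_ps_per_nm_km, spacing_nm, length_km, bit_rate_gbps):
    """Number of bit slots by which one channel slides past the other."""
    slot_ps = 1000.0 / bit_rate_gbps  # duration of one bit slot in ps
    return walkoff_ps(dispersion_ps_per_nm_km, spacing_nm, length_km) / slot_ps

# Hypothetical values: D = 0.5 ps/(nm km), 0.6 nm spacing, 1000 km, 10 Gb/s.
print(walkoff_ps(0.5, 0.6, 1000.0), "ps")
print(slots_walked(0.5, 0.6, 1000.0, 10.0), "bit slots")
```

With these assumed numbers the two channels slide past each other by about 300 ps, or roughly three bit slots, so every pulse meets several pulses of the neighboring channel during propagation.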

Moreover, in a single channel, the chromatic dispersion may cause envelope distortion

of a single optical signal because of the different frequency components comprising the

optical signal propagating at different speeds. The signal distortion increases as the

transmission distance increases and can be in the form of pulse broadening or narrowing,

depending on how the different frequency components are distributed in the time domain.

1.2.3 Fiber nonlinearities

Fiber nonlinearities are signal-intensity-dependent effects in optical fibers. The impor-

tant effects of fiber nonlinearities in optical fiber communications systems result from the

fact that optical signals with high intensity are confined to a small cross section over long

fiber lengths. The most common nonlinear effects in optical fiber communications sys-

tems are stimulated light scattering due to the Raman and Brillouin effects and the non-

linear refractive index change due to the Kerr effect [6]. The Kerr effect leads to the ef-

fects referred to as self-phase-modulation, cross-phase-modulation, and four-wave mix-

ing [6]. These three effects are not unambiguously separable and they are defined as fol-

lows.

Self-phase-modulation results from the intensity-dependent refractive index in optical
fiber. Because the refractive index determines the speed of light in the fiber, different
intensity components contained within an optical pulse travel at different speeds. Thus,
the different intensity components become phase shifted. This effect, in which a single
optical pulse distorts its own phase profile, is referred to as self-phase-modulation [6].

In WDM systems, the Kerr effect may cause nonlinear interactions between different
channels. It can be viewed in the following way: a high-intensity signal S1 in one channel
distorts the refractive index of the optical fiber, which in turn changes the propagation
speed of another signal S2 that travels in a different channel and collides with S1 during
propagation. This nonlinear inter-channel interaction in WDM soliton transmission systems
may cause severe timing jitter problems in all channels. This effect, which is purely
intensity dependent, is referred to as cross-phase-modulation [6].

By contrast, four-wave mixing is a phase-dependent effect in which two optical waves at
frequencies ω1 and ω2 propagate inside the optical fiber simultaneously and, when certain
phase-matching requirements described in [96] are met, may generate signals at two new
frequencies, ω3 and ω4, that satisfy ω3 + ω4 = ω1 + ω2 [6]. In WDM systems, channels
close to the zero-dispersion wavelength become nearly phase-matched because of the
similar propagation speeds. Thus, four-wave mixing introduces a trade-off in the
dispersion map design in optical fiber communications systems. Low dispersion is preferred to
lower the required average power and reduce the timing jitter, while four-wave mixing
becomes severe when the dispersion is close to zero [6]. This problem can be solved with
the dispersion management that will be described later in this section.
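The frequency relation above already shows why four-wave mixing is troublesome in WDM systems: on an equally spaced channel grid, mixing products land exactly on other channels. The sketch below illustrates only this frequency bookkeeping, using the common three-wave form f_i + f_j - f_k (which satisfies the relation above, since the product and f_k sum to f_i + f_j). It says nothing about phase matching or mixing efficiency, and the channel grid is a hypothetical example.

```python
from itertools import product

def fwm_products(freqs):
    """All mixing products f_i + f_j - f_k with k different from i and j."""
    products = set()
    for i, j, k in product(range(len(freqs)), repeat=3):
        if i <= j and k != i and k != j:
            products.add(freqs[i] + freqs[j] - freqs[k])
    return products

# Hypothetical three-channel grid, given as frequency offsets in GHz
# (integers keep the arithmetic exact).
channels = [0, 100, 200]
on_grid = sorted(fwm_products(channels) & set(channels))
print("products falling on existing channels:", on_grid)
```

In this example every channel of the equally spaced grid is hit by at least one mixing product, so the generated light acts as in-band interference that cannot be filtered out at the receiver.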

All the nonlinear effects described above cause signal distortions that become worse
with higher signal intensities and longer transmission distances. In long-distance
transmission, fiber nonlinearities are more severe because the underlying nonlinear
interactions are allowed to accumulate. Generally speaking, fiber nonlinearities
are not important in low-power, short-distance transmission systems, but they are important
in WDM systems with high power or narrow channel spacing, as well as in long-haul
transmission systems.

1.2.4 Polarization effects

Polarization effects are due to randomly varying birefringence in the optical fiber [51].

Birefringence leads to variations in the state of polarization of the launched optical signal

as it propagates in the optical fiber. This variation is caused by fluctuations in the core

shape of the optical fiber, temperature changes, or non-uniform stresses in the optical fi-

ber. Because the two polarization components have different group velocities, the optical

signal at the receiver suffers dispersion. This phenomenon is referred to as polarization

mode dispersion (PMD). The pulse broadening due to PMD is typically small compared

to the magnitude of the local chromatic dispersion. However, when fiber attenuation and

chromatic dispersion effects are compensated, PMD can become a limiting factor for

long-haul, high-bit-rate systems. It is difficult to compensate for PMD because it varies

randomly over time on a scale of milliseconds to hours.

All these physical impairments are combined in optical fiber transmission systems and

interact with each other. Different types of optical fiber transmission systems may be

dominated by different sources of impairment.

For example, in a chirped return-to-zero (CRZ) system, ASE noise and PMD are ex-

pected to be the dominant impairments. An optical pulse is said to be chirped if its carrier

frequency changes with time [6]. A CRZ system has a return-to-zero modulation and

chirped optical pulses such that, on average, the trailing portion of the optical energy in a

single time slot moves faster than the leading portion. The chirped signal with carefully

designed dispersion compensation broadens significantly during propagation, but it is

compressed at the receiver [52]. The signal broadening during propagation significantly

decreases the signal power. Also, because of the pulse overlap caused by signal broad-

ening, the total signal powers must be kept small so that the fiber nonlinearity during the

signal propagation remains relatively small. All the behaviors just described hold in a

quasi-linear dispersion-managed soliton (DMS) system.

By contrast, in WDM optical soliton systems that use traditional solitons, nonlinear
interactions of optical pulses among different channels (soliton-soliton collisions) can
cause severe inter-channel interference that may be converted into timing jitter,
effectively decreasing the actual transmission capacity [63], [64].

All modern-day WDM systems with a large number of channels (> 10) are quasi-linear.
First, the large third-order dispersion as defined in [6] that is present in most systems
implies that pulses in most channels undergo a large spread and overlap with their
neighbors. The power must be kept low to avoid unacceptably large inter-pulse interactions.

Second, even when the third-order dispersion is compensated, it is still necessary to

maintain a large dispersion to avoid cross-phase modulation between channels. Then,

even in this case, the power must be low to avoid strong inter-pulse interactions.

In any optical communication system, the ideal operating power is determined by the

interplay between ASE noise at low power and the Kerr nonlinearity at high power.

However, the Kerr nonlinearity can manifest itself in a wide variety of ways. In the sys-

tems that are discussed in this dissertation, the dominant nonlinearity is the inter-channel

nonlinearity.

1.3 Previous work on coding techniques in optical fiber communications

Although FEC technology has become a hot topic in optical fiber communications in
the last 2–3 years, researchers have been studying and applying coding techniques in
optical fiber communications systems for about 20 years [17]–[46]. The coding
techniques that we are discussing here include both FEC and line-coding.

The basic idea behind FEC coding is to correct the possible transmission errors at the

receiver by adding well-defined redundancy that can be exploited at the receiver. The ba-

sic idea behind line-coding [53], [54] is to modify a source signal waveform to enhance

proper signal reception in the presence of transmission impairments. In contrast to the

focus of FEC, which is to correct transmission errors in general, the focus of line-coding

is to provide timing information, remove DC content, provide power spectral density

shaping, facilitate performance monitoring, minimize pattern dependent BER, and ensure

against inducing too many decoded errors [53], [54].

Both FEC and line-coding add redundancy to the original data stream, so that encoding

increases the transmission signaling rate for the same data rate. Code rate r and overhead

h are two measures of the degree of encoded data redundancy; they are defined as

$$r = \frac{\text{length of input data sequence}}{\text{length of encoded data sequence}}, \tag{1.5}$$

$$h = \frac{\text{number of redundant bits}}{\text{length of input data sequence}}. \tag{1.6}$$
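As a worked example of Eqs. (1.5) and (1.6), the sketch below evaluates the code rate and overhead of an (n, k) block code, using the RS (255, 239) code discussed later in this chapter. The function names are illustrative, and the 5 Gbps data rate is only an assumed input for the line-rate calculation.

```python
from fractions import Fraction

def code_rate(n, k):
    """Eq. (1.5): input length k divided by encoded length n."""
    return Fraction(k, n)

def overhead(n, k):
    """Eq. (1.6): number of redundant symbols (n - k) divided by input length k."""
    return Fraction(n - k, k)

def line_rate(data_rate, n, k):
    """Signaling rate needed to carry data_rate after encoding."""
    return data_rate / float(code_rate(n, k))

n, k = 255, 239  # RS (255, 239)
print("code rate r =", float(code_rate(n, k)))
print("overhead  h =", float(overhead(n, k)))
print("line rate for 5 Gbps of data =", line_rate(5.0, n, k), "Gbps")
```

For RS (255, 239) this gives r = 239/255 and h = 16/239, that is, an overhead of about 6.7%, so the signaling rate must exceed the data rate by the same percentage.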

In the coding scheme design, there is a trade-off between overhead and spectral
efficiency. An encoded stream with larger overhead uses more bandwidth for the same data
rate. In return for the price paid for redundancy, coded transmission can achieve a
given BER at a lower SNR than uncoded transmission, and the difference between
these two SNR values for the same BER is defined as the coding gain. In optical fiber
communications systems, the SNR is commonly represented by the Q factor defined as

Q = (µ1 – µ0)/(σ1+σ0), (1.7)

where µ1 and µ0 represent the mean values and σ1 and σ0 the standard deviations of the
received marks and spaces, respectively [49].
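The Q factor in Eq. (1.7) is usually translated into a BER through the Gaussian approximation, BER ≈ (1/2) erfc(Q/√2). This relation is a standard assumption that is not stated in this section, and it inherits the limitations of the Gaussian approximation discussed in Section 1.2.1. The following sketch evaluates both quantities; the sample statistics are hypothetical.

```python
import math

def q_factor(mu1, mu0, sigma1, sigma0):
    """Eq. (1.7): Q = (mu1 - mu0) / (sigma1 + sigma0)."""
    return (mu1 - mu0) / (sigma1 + sigma0)

def ber_gaussian(q):
    """BER under the Gaussian approximation (an assumption, see the text above)."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

# Hypothetical received statistics for marks (1) and spaces (0).
q = q_factor(mu1=1.0, mu0=0.1, sigma1=0.1, sigma0=0.05)
print("Q   =", q)
print("BER =", ber_gaussian(q))
```

With these assumed statistics Q is about 6, which corresponds to a BER of roughly 10^-9 under the Gaussian approximation.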

In the following two subsections, we review the progress of FEC and line-coding in

optical fiber communications, respectively.

1.3.1 Survey on FEC technology in optical fiber communications

Research and application of FEC technology in optical fiber communications started in
the early 1990s. Most of the published work was done by researchers in industry. The FEC
coding schemes applied to optical fiber communications systems during the last decade
can be categorized into three generations: standard block codes with hard-decision
decoding, concatenated FEC codes with hard- or soft-decision decoding, and concatenated
FEC codes with soft-decision and iterative decoding (the so-called turbo codes). The
corresponding concepts in FEC coding techniques are defined in the following paragraph
[55].

Hard-decision decoding refers to the case in which the FEC code decoder has only bi-

nary inputs when a binary demodulator output is used [55]. Similarly, if the demodulator

has more than two quantization levels (or the output is left unquantized) the code decoder

must accept multilevel (or analog) inputs, which is referred to as soft-decision decoding

[55]. Concatenated coding schemes generally involve two constituent codes, an inner

code and an outer code. At the transmitter, the original data sequence is first encoded by

the outer encoder and then by the inner encoder. Correspondingly, at the receiver, the re-

ceived data sequence is decoded in turn by the inner decoder and the outer decoder [55].

Iterative decoding means that the output decoded data sequence is fed back to the decoder

to be decoded again iteratively [67]. In a concatenated coding scheme, iterative decoding

allows information exchange between the constituent decoders and, thus, improves the

decoding performance.
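The difference between hard-decision and soft-decision decoding can be illustrated with a toy example. The sketch below is not a scheme from the surveyed papers: it assumes a hypothetical (3, 1) repetition code, signal levels of −1 for a space and +1 for a mark, and illustrative function names.

```python
def hard_decision_decode(samples, threshold=0.0):
    """Quantize each sample to one bit, then take a majority vote."""
    bits = [1 if s > threshold else 0 for s in samples]
    return 1 if sum(bits) * 2 > len(bits) else 0


def soft_decision_decode(samples):
    """Use the unquantized samples: decide on the sign of their sum.

    For equal-variance Gaussian noise and levels of -1/+1 this is the
    maximum-likelihood rule for a repetition code.
    """
    return 1 if sum(samples) > 0.0 else 0


# One transmitted mark (+1, +1, +1) received with noise (assumed values).
received = [-0.1, -0.2, 1.5]
hard = hard_decision_decode(received)   # two weak samples outvote one strong sample
soft = soft_decision_decode(received)   # the strong sample dominates the sum
```

Here the hard-decision decoder discards the reliability of each sample before voting, while the soft-decision decoder keeps it, which is why the two decoders can disagree on the same received word.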

The first generation of FEC codes applied to optical fiber communications systems in-

cludes Hamming codes and Reed-Solomon (RS) codes using hard-decision decoding

[17]–[19]. In [17], Grover, et al., implemented long block length Hamming codes for

SONET STS-1 tributary. The [6208, 6195] shortened Hamming code was designed to

satisfy the STS-1 format and provide single-error correction and double-error detection.

They reported a reduction of the payload BER to about 8.6 × 10^3 × Pe^2, where Pe is the

BER before decoding. In late 1993, Yamamoto, et al., with KDD, demonstrated a 5 dB

coding gain with the standard RS (255, 239) code in a 5 Gbps 210 km optical fiber

transmission experiment [18]. Almost in the same time period, Pamart, et al., with Al-

catel, and Chen, et al., with AT&T, reported more than 5 dB of coding gain with a 14%

overhead RS code in a 5 Gbps 6400 km optical fiber transmission experiment [19]. The

RS (255, 239) code has been standardized for the undersea cable system by the Interna-

tional Telecommunication Union [47]. All these investigations done in the early 90s were

experimental; not much theoretical study of FEC codes in an optical fiber transmission

environment was done.

Recently, Kidorf, et al., with TSSL (which is now TyCom), performed a detailed study

of the performance of the family of RS codes in long-haul WDM optical fiber transmission systems [21]. They compared the theoretical bounds on the maximum decoded BER of RS codes with various levels of overhead. They also performed Monte-Carlo simulations of RS codes and compared the simulation results of the

code performance to the experimental measurements. They observed a good match be-

tween the simulation results and the experimental results. In the theoretical calculation of

the performance bound, they assumed a binomial distribution for the uncorrelated bit er-

rors, implying a binary symmetric channel (BSC) model for the optical fiber channel,

where the BSC represents a binary-in binary-out channel with equal transition probabili-

ties. In the Monte-Carlo simulations, they assumed additive white Gaussian noise

(AWGN) statistics for ASE noise. The AWGN channel, a binary-input continuous-output

channel, is analogous to the BSC with the exception that the output is a binary signal plus

AWGN.
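A calculation of the kind described above, with binomially distributed, uncorrelated errors on a BSC, can be sketched as follows. This is a commonly used textbook approximation for bounded-distance decoding of an RS (n, k) code over m-bit symbols, not necessarily the exact bound evaluated in [21]; the function name and its defaults are illustrative assumptions.

```python
import math


def rs_output_ber(p_bit, n=255, k=239, m=8):
    """Approximate decoded BER of an RS(n, k) code on a BSC with bit error rate p_bit."""
    t = (n - k) // 2  # number of correctable symbol errors per codeword
    # Probability that an m-bit symbol contains at least one bit error.
    p_sym = -math.expm1(m * math.log1p(-p_bit))
    # Expected fraction of symbols in error when more than t symbols are hit
    # (the decoder is assumed to leave such codewords uncorrected).
    p_sym_out = 0.0
    for i in range(t + 1, n + 1):
        p_sym_out += (i / n) * math.comb(n, i) * p_sym**i * (1.0 - p_sym) ** (n - i)
    # Convert the residual symbol error rate back to a bit error rate.
    return -math.expm1(math.log1p(-p_sym_out) / m)


ber_out = rs_output_ber(1e-4)  # RS (255, 239) at an input BER of 1e-4
```

The `expm1`/`log1p` forms are used only to keep precision when the probabilities are many orders of magnitude below one.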

The second generation of FEC codes applied to optical fiber communications systems

appeared in the late 90s [20]–[22]. It includes different concatenated FEC coding schemes.

Puc, et al., with TSSL, considered the NASA standard concatenated code consisting of

the RS (255, 239) block code using hard-decision decoding, and the rate 1/2, constraint-

length 7, convolutional code using soft-decision Viterbi decoding [20]. This scheme

yielded a 10 dB coding gain with 113% overhead in a 2.5 Gbps 5000 km WDM transmis-

sion experiment. They mentioned that the theoretical evaluation of the coding gain was

made based on AWGN. Ait Sab, et al., with Alcatel, evaluated the performance of the

concatenated RS (255, 223)/RS (255, 239) with simulations based on a Gaussian channel

model [22]. The Gaussian channel model they used has different noise levels for the

marks and spaces, which is more accurate than the AWGN model for ASE noise that has

different distributions for marks and spaces. The simulation result showed a 7.7 dB cod-

ing gain in a 10 Gbps 6500 km WDM transmission system [22], [23].

The third generation of FEC codes applied to optical fiber communications systems

has recently been proposed [22]–[24]. It includes concatenated RS/RS codes and con-

catenated BCH/BCH codes with soft iterative-decoding. These codes belong to the new

class of codes, called turbo codes, with iterative soft-decision (soft-input soft-output) de-

coding [56], [57]. There are two types of turbo codes depending on the type of constitu-

ent codes. If the constituent codes are block codes, we have turbo product codes (TPC). If

the constituent codes are convolutional codes, we have turbo convolutional codes (TCC)

[67]. Because the concept of a turbo code was first introduced as TCC in the literature,

the two names, turbo code and TCC, are not always clearly distinguished.

Ait Sab, et al., with Alcatel, demonstrated a coding gain of 10 dB using the TPC BCH

(128, 113) code with 28% overhead in a 10 Gbps 7800 km WDM transmission system

with simulations based on the Gaussian channel assumption [22], [23]. Taga, et al., with

KDD, experimentally demonstrated that concatenated RS (239, 223)/RS (255, 239) with

iterative decoding yields 2 dB of extra coding gain, compared to the RS (255, 239) code,

in a 10 Gbps 10 Mm WDM transmission experiment [24].
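The overhead figures quoted in this survey follow from the code rates. As a small check, assuming that overhead means the fractional increase in transmitted bits, n/k − 1, and that the rates of serially concatenated or product codes multiply, the 113% and 28% values can be reproduced as follows (the helper name is illustrative):

```python
def overhead(*codes):
    """Overhead of a concatenation of (n, k) codes: product of n/k, minus 1."""
    expansion = 1.0
    for n, k in codes:
        expansion *= n / k
    return expansion - 1.0


rs = overhead((255, 239))                 # RS (255, 239) alone
nasa = overhead((255, 239), (2, 1))       # RS (255, 239) with a rate-1/2 convolutional code
tpc = overhead((128, 113), (128, 113))    # product of two BCH (128, 113) codes
```

Under these assumptions the three values come out at roughly 7%, 113%, and 28%, respectively.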

We suggest the development of a third generation code based on TPC instead of TCC.

The reason will be made clear in Chapter 2 with the results of our studies on the upper

performance bound for linear codes in optical fiber communications.

1.3.2 Survey on line-coding in optical fiber communications

Research on line-coding in optical fiber communications started in the early 80s, and line-coding remained a beneficial technique throughout the 80s and early 90s. Although it has a longer

history than FEC coding, research on line-coding has not been as popular as research on

FEC techniques in recent years. The reason will follow from a later discussion of the de-

velopment of line-coding in optical fiber communications.

There are two major objectives in previous line-coding applications in optical fiber

communications. The first was to transmit adequate timing information to allow for

proper operation of clock recovery circuitry, and, in the meantime, to keep low frequency

content small to allow for ac coupling in the receiver [30]–[39]. In this kind of applica-

tion, transition density and transmission balance are two critical criteria used for the line-

code design [31], [53], [54]. Transition density refers to the frequency of the signal level

transitions in the encoded data sequence. High transition density ensures adequate timing

information and, thus, easy clock recovery. A transmission is balanced if there are an

equal number of marks and spaces, thus small low frequency content, in the encoded data

sequence.
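The two criteria can be made concrete with a short sketch. The definitions below (fraction of adjacent bit pairs that differ, and marks minus spaces) are one straightforward reading of the descriptions above, and the Manchester mapping 1 → 10, 0 → 01 is one common convention; all names are illustrative.

```python
def transition_density(bits):
    """Fraction of adjacent bit pairs that differ."""
    if len(bits) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    return changes / (len(bits) - 1)


def disparity(bits):
    """Number of marks minus number of spaces (0 for a balanced sequence)."""
    ones = sum(bits)
    return ones - (len(bits) - ones)


def manchester_encode(bits):
    """Map each data bit to two channel bits: 1 -> 10, 0 -> 01."""
    out = []
    for b in bits:
        out.extend([1, 0] if b else [0, 1])
    return out


data = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # long runs: few transitions, unbalanced
coded = manchester_encode(data)         # balanced, with a transition in every bit slot
```

The Manchester code achieves balance and high transition density at the cost of 100% overhead, which is why lower-overhead block line-codes were of interest for high bit rates.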

In 1983, Takasaki, et al., proposed two-level alternate-mark-inversion (AMI) line-

coding for optical fiber communications. In [31]–[33], several different alphabetic block

line-codes implemented with look-up tables to ensure high transition density and bal-

anced transmission were proposed. In [34]–[36], several block line-codes were imple-

mented in GHz systems by using coders in parallel. In [37], Krzymien proposed a new

class of binary, nonalphabetic, balanced line-codes with m-bit data word and (m+1)-bit

codeword that requires small overhead and is thus efficient for high bit rate optical fiber

systems. In [38], Fair, et al., developed a guided scrambling approach for line-coding, in

which the current scrambling process depends on feedback from the previous encoded

output data sequence. It can be implemented in Gbps transmission systems with its sim-

ple scrambler-like structure and provides balanced transmission with high transition den-

sity. We can see that the key issue in this kind of line-coding application is the imple-

mentation in high bit rate optical fiber communications. No particular impairments in op-

tical fiber transmission lines were involved in the line-code design.

The other objective of line-coding applied to optical fiber communications, however,

does relate to a particular physical effect –– the non-flat laser frequency modulation (FM)

response. The overall FM response of the laser diode is due to the combined thermal and

carrier effect, which may either produce a “dip” or an enhanced response at low frequen-

cies [46]. This effect is referred to as the non-flat FM response. The non-flat FM response

of conventional distributed feedback (DFB) laser diodes (LD) is a major problem in co-

herent optical frequency-shift-keying (FSK) systems [58], [59]. This physical effect

causes data-pattern-dependent performance degradation that is a major problem that line-

coding should help solve. Hence, extensive studies on using line-coding schemes, in-

cluding AMI, Manchester code, and delay modulation, to counteract the non-flat laser

FM response, were carried out during the late 80s and early 90s when coherent systems

were a hot topic in optical fiber communications [40]–[46].

However, as we know, the successful development of erbium-doped fiber amplifiers

(EDFA) in the early 90s has allowed significant increases in the sensitivity of intensity-

modulation direct-detection (IM-DD) systems, which overshadowed the high sensitivity

advantage of coherent systems. Thus, line-coding research temporarily lost its justifica-

tion in optical fiber communications.

1.4 Motivation of our research

From the survey in the previous section, we can see that coding technology is impor-

tant and practical in optical fiber communications. It has been responsible for some of the

progress in optical fiber communications. However, we also have the following impres-

sions regarding the research in this field.

The previous studies are mostly based on standard FEC codes and line-coding

schemes, for example Hamming codes, RS codes, AMI codes, and Manchester codes,

which were initially developed in wireless communications or older communications

systems. Moreover, for the theoretical studies of FEC codes in optical fiber communica-

tions, the channel models mostly used for optical fiber channels assume a binary sym-

metric channel with hard-decisioning [21], or assume AWGN or a Gaussian channel with

soft-decisioning [20]–[23]. Current line-coding schemes that have been applied in optical

fiber communications all use the conventional transition density and transmission bal-

ance as the performance criteria for the encoded data sequence [30]–[46]. There has been

little effort to optimize the choice of codes and design new codes by taking into account

the physical mechanisms behind the particular impairments in optical fiber transmission

lines and systems. By contrast, the goal of our research is to analyze and design FEC

codes and line-coding schemes by taking into account the particular physical impair-

ments in optical fiber transmission systems.

Specifically, we note that the most extensive results for FEC codes and their perform-

ance are based on BSC and AWGN channels. Many practical channels have been mod-

eled via the BSC and AWGN channels, e.g., the deep-space and satellite channels, the

telephone network, and, more recently, the subchannels of the ADSL system, after ap-

propriate equalization [60], [61]. However, from Sec. 1.2, we see that ASE noise, a major

source of random errors in optically amplified fiber communications systems, has non-

Gaussian and asymmetric distributions. This situation leaves a wide research opportunity

for improving the performances of FEC codes by taking into account the more accurate

noise statistics of optical fiber channels [25]–[27].

A question may be raised at this point. As mentioned in the previous section, the theo-

retical and simulation results of the RS code performance in [21], using BSC and AWGN

assumptions, agree well with the experimental measurements: Does this not mean that

BSC and AWGN are good approximations for optical fiber channels? The answer is no.

The reason is that RS codes or any other FEC codes using hard-decision algebraic de-

coding are not sensitive to the exact noise statistics. Because the a priori knowledge of

the channel noise statistics is not used in algebraic decoding [55], [68], as long as the

channel model assumption gives a good estimate of the uncoded BER, it also gives a

good estimate of the algebraic block coded BER [55], [68].

By contrast, a priori knowledge of the channel noise statistics is essential for FEC

codes that use more sophisticated decoding algorithms such as the Viterbi algorithm

(maximum likelihood), the BCJR algorithm (maximum a posteriori probability), and the

sum-product algorithm (maximum likelihood). One measure of the progress in FEC

codes is to see how close the codes approach the Shannon limit. Shannon’s noisy channel

coding theorem [62] states that there exists a code, with a code rate r not exceeding the

channel capacity C, that can achieve arbitrarily small probability of error. The channel

capacity C is defined as the maximum mutual information that can be transmitted over

the physical channel. It is a function of the probability density function of the noisy sig-

nals after transmission. Thus, to approach the Shannon limit, the code and decoder

should utilize the channel noise statistics as much as possible.

This trend can be seen in the progress of FEC codes. From algebraic codes, including

Hamming codes and RS codes, to convolutional codes with hard-decision Viterbi de-

coding, then to convolutional codes with soft-decision Viterbi decoding, and further to

turbo codes with soft-decision iterative MAP decoding, code performance has ap-

proached closer and closer to the Shannon limit. This improvement has occurred because

more and more information on channel noise statistics is incorporated into the decoding

algorithms. We observed the same steps in the progression of the three generations of

FEC codes in optical fiber communications as shown in the previous section.

This historical observation is the motivation for our research on the effect of ASE

noise statistics on FEC code performance, including the study of the Shannon limit for

general FEC codes, the upper performance bound for linear codes, and the performance

improvement for turbo codes in non-Gaussian asymmetric optical fiber channels. A basic

question that we must answer in our FEC research is: do the non-Gaussian asymmetric statistics of the ASE noise, compared to the Gaussian symmetric noise approximations, make enough of a difference in the FEC studies to justify including more accurate noise statistics in the analysis and design of FEC codes?

For the line-coding research, the motivation is more direct. The nonlinear inter-channel

interference becomes the main source of errors in optical WDM transmission systems

when signal intensity is high, as is the case in optical soliton transmission systems. We

note that the nonlinear inter-channel interference causes correlated errors that are highly

dependent on the data patterns in WDM channels, for which line-coding is supposed to

be a direct solution. Our goal is to develop a line-coding scheme based on an under-

standing of how the data patterns affect the impairments induced by inter-channel inter-

ference. We need to determine in the line-coding research whether one can effectively

use line-coding to solve the nonlinear inter-channel interference problem.

The basic idea, which is the theme throughout this dissertation, is to analyze and de-

sign FEC codes and line-codes by taking into account the particular physical characteris-

tics and mechanisms in optical fiber transmission lines.

1.5 Dissertation organization

There are 5 chapters in this dissertation. Chapter 1 gives the introduction to the prob-

lems discussed in this dissertation, the survey on previous related work that has been

done, the motivation for our research, and the outline of the dissertation. Chapter 2 de-

scribes the physical dynamics behind the two sources of impairment of concern in this

dissertation –– the ASE noise from optical amplifiers and soliton-soliton collisions in

WDM soliton systems –– and describes the construction of corresponding theoretical

models to facilitate further discussions about FEC and line-coding solutions. The major results of our research on FEC codes and line-coding are reported in Chapters 3 and 4, respectively.

Chapter 3 is dedicated to the performance evaluation and improvement of FEC codes

in optical fiber transmission systems with dominant ASE noise. We do a three-level study

of FEC codes for correcting ASE-induced errors, using more accurate ASE noise statis-

tics: the Shannon limit for general FEC codes, the upper performance bound for linear

codes, and the performance improvement of turbo codes. The results are presented in three

sections. In Sec. 3.1, we evaluate the lower performance bound, i.e., the Shannon limit,

for optical fiber channels with dominant ASE noise. The Shannon limit is the very basic

lower performance bound for all FEC codes. In Sec. 3.2, we derive the theoretical upper

performance bound for all linear FEC codes in optical fiber channels. This upper bound is

based on the union bound, which is a tight bound at low BER. We note that the code performance at very low BER (≤ 10^-11) is the major concern in optical fiber communications.

Hence, the upper bound that we derive is a useful tool for the analytic evaluation of the

performance of linear FEC codes, a class of codes that includes all three generations of

FEC codes in optical fiber communications. In Sec. 3.3, we modify the BCJR algorithm,

a maximum a posteriori probability (MAP) algorithm, for turbo code decoding according

to the chi-square ASE noise distributions in optical fiber channels and compare the re-

sulting code performance to the one based on the Gaussian noise assumption.

Chapter 4 is dedicated to a line-coding scheme for mitigating soliton-soliton collision-induced errors: the sliding window criterion (SWC) line-coding scheme. We develop two types of SWC codes,

the block SWC code and trellis-based SWC code. We also discuss the concatenation of

the SWC code with the Reed-Solomon (RS) code to achieve the very low BERs required

by optical fiber communications. We compare the simulation performance of the pro-

posed concatenated SWC/RS to the cases of using the RS code alone or using concate-

nated FEC codes without line-coding in correcting SSC-induced errors.

Chapter 5 completes the dissertation with a summary of the work, the conclusions that

we draw, and some suggestions for future research.

Chapter 2

Modeling of amplified spontaneous emission (ASE) noise and soliton-soliton collisions (SSC) in optical fiber transmission systems

Our goal in this dissertation is to design better coding schemes for optical fiber com-

munications by taking into account the particular physical impairments in optical fiber

channels discussed in Chapter 1. Hence, understanding the physical impairment mecha-

nisms and, based on that, modeling the physical effects is a critical first step in our re-

search. In this chapter, we focus on the two major physical effects in optical fiber WDM

systems –– the amplified spontaneous emission (ASE) noise from optical amplifiers and

soliton-soliton collisions (SSC).

In optical fiber transmission systems with optical amplifiers, ASE is a major source of

errors. Under low-power operation of the optical fiber channel, ASE is expected to domi-

nate over other sources of error producing impairments such as nonlinearity-induced im-

pairments, even over long transmission distances. Correcting ASE induced errors, there-

fore, is a major objective of forward error correction (FEC) applications. ASE noise can

be characterized as a random variable and, thus, the ASE noise statistics are critical in the

analysis and design of FEC codes and will be discussed in Chapter 3.

In optical soliton WDM systems, SSC becomes a major nonlinear effect causing severe

timing jitter and limiting the achievable transmission bit rate and channel spacing [63],

[64]. Based on the understanding of the SSC physical mechanism, we show that SSC in-

duces correlated errors after optical detection that are highly data-pattern dependent. The

data-pattern dependence of SSC-induced errors leads to the development of a line-coding

scheme, called the sliding window criterion (SWC) code that will be discussed in Chapter

4. The SWC line-coding scheme mitigates SSC-induced errors by reshaping the data

pattern.

In the first two sections of this chapter we discuss the statistics of the ASE noise, and

construct the corresponding models for optical fiber channels with dominant ASE noise.

In the next two sections, we describe the physical mechanism of SSC in WDM systems

and introduce a simplified model for the collision induced timing jitter. Based on this

model, we introduce the main motivation behind the line-coding scheme for mitigating

SSC-induced timing jitter and, hence, for mitigating collision induced errors.

2.1 Statistics of ASE noise and channel models

2.1.1 ASE noise statistics

The probability density function (pdf) of the detected signal I is a function of the en-

ergy E of the transmitted signal as well as the power spectral density N0 of the ASE noise

as described in [48]. The received marks and spaces have different pdfs that are approxi-

mately given by [48]

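$$p_0(I) = \frac{1}{N_0}\,\frac{(I/N_0)^{M-1}\exp(-I/N_0)}{(M-1)!}, \tag{2.1}$$

$$p_1(I) = \frac{1}{N_0}\left(\frac{I}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{I+E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{EI}}{N_0}\right), \tag{2.2}$$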

where M = Bo / Be is the number of modes per polarization state in the received optical

spectrum, Bo and Be are, respectively, the optical bandwidth and the electrical bandwidth

of the system at the detector, and I_{M−1} denotes the modified Bessel function of the first kind of order M − 1. The mean values and variances of the received marks and spaces can be derived from the pdfs given in Eqs. (2.1) and (2.2) as µ1 = MN0 + E, σ1^2 = MN0^2 + 2EN0, µ0 = MN0, and σ0^2 = MN0^2, respectively [48]. We can also obtain σ1^2 = 2(µ1µ0 − µ0^2)/M + σ0^2 from the above formulae for µ1, σ1, µ0, and σ0 [48]. With the definition of a SNR measure

Q = (µ1 – µ0)/(σ1+σ0) (2.3)

and the above results, along with signal levels re-defined as I0 = µ0 and I1 = µ1, we can

evaluate (normalized) I1, σ1, I0, and σ0 as functions of the system parameters, Bo, Be, and

Q as

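$$\sigma_0 = \sqrt{\frac{B_o}{B_e}}, \quad \sigma_1 = \sqrt{\frac{B_o}{B_e}} + 2Q, \quad I_0 = \frac{B_o}{B_e}, \quad I_1 = 2Q^2 + 2Q\sqrt{\frac{B_o}{B_e}} + \frac{B_o}{B_e}, \tag{2.4}$$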

where N0 is normalized to 1.
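As a consistency check on Eq. (2.4), the sketch below evaluates the normalized levels for given M = Bo/Be and Q and confirms that they reproduce the Q definition of Eq. (2.3) and the variance relation σ1^2 = M + 2E (with N0 = 1). The function name is illustrative, and the conversion Q = 10^(dB/20) assumes that a value quoted as "Q^2 in dB" means 20 log10 Q.

```python
import math


def normalized_levels(M, Q):
    """Eq. (2.4) with N0 = 1: returns (sigma0, sigma1, I0, I1)."""
    root_m = math.sqrt(M)
    sigma0 = root_m
    sigma1 = root_m + 2.0 * Q
    i0 = float(M)
    i1 = 2.0 * Q**2 + 2.0 * Q * root_m + M
    return sigma0, sigma1, i0, i1


M = 3
Q = 10 ** (6.2 / 20)              # Q^2 = 6.2 dB (assumed dB convention)
sigma0, sigma1, i0, i1 = normalized_levels(M, Q)
E = i1 - i0                       # signal energy, since mu1 - mu0 = E
q_check = (i1 - i0) / (sigma1 + sigma0)
```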

We can see that the marks have a noncentral chi-square distribution and the spaces have a central chi-square distribution; both are asymmetric pdfs with 2M degrees of freedom [48]. The chi-square distribution is the most accurate theoretical model of the ASE noise statistics known to date [48], [49].
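The pdfs of Eqs. (2.1) and (2.2) can be evaluated numerically with the standard library alone. The following is a minimal sketch, assuming integer M and a truncated power series for the modified Bessel function; the function names, the trapezoidal integration, and the integration range are illustrative choices, not part of the original analysis. It checks that both pdfs integrate to one and that the mean of the marks equals MN0 + E.

```python
import math


def bessel_i(order, x, terms=150):
    """Modified Bessel function of the first kind (integer order), by power series."""
    if x == 0.0:
        return 1.0 if order == 0 else 0.0
    log_half = math.log(0.5 * x)
    total = 0.0
    for k in range(terms):
        log_term = ((2 * k + order) * log_half
                    - math.lgamma(k + 1) - math.lgamma(k + order + 1))
        total += math.exp(log_term)
    return total


def pdf_space(i, M, n0=1.0):
    """Eq. (2.1): central chi-square pdf of a received space."""
    if i < 0:
        return 0.0
    return (i / n0) ** (M - 1) * math.exp(-i / n0) / (n0 * math.factorial(M - 1))


def pdf_mark(i, M, energy, n0=1.0):
    """Eq. (2.2): noncentral chi-square pdf of a received mark."""
    if i < 0:
        return 0.0
    arg = 2.0 * math.sqrt(energy * i) / n0
    return ((i / energy) ** ((M - 1) / 2) * math.exp(-(i + energy) / n0)
            * bessel_i(M - 1, arg) / n0)


def trapezoid(func, a, b, steps):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for j in range(1, steps):
        total += func(a + j * h)
    return total * h


M, N0 = 3, 1.0
Q = 10 ** (6.2 / 20)                      # Q^2 = 6.2 dB (assumed dB convention)
E = 2 * Q**2 + 2 * Q * math.sqrt(M)       # E = I1 - I0 from Eq. (2.4)
area_space = trapezoid(lambda i: pdf_space(i, M, N0), 0.0, 80.0, 1600)
area_mark = trapezoid(lambda i: pdf_mark(i, M, E, N0), 0.0, 80.0, 1600)
mean_mark = trapezoid(lambda i: i * pdf_mark(i, M, E, N0), 0.0, 80.0, 1600)
```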

For simplicity of analytical studies of the ASE noise and the induced error probability,

however, Gaussian pdfs with the same means and variances as the chi-square distribu-

tions are commonly used. The Gaussian approximation is given by

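$$p_0(I) = \frac{1}{\sqrt{2\pi\sigma_0^2}}\exp\!\left[-\frac{(I-I_0)^2}{2\sigma_0^2}\right], \tag{2.5}$$

$$p_1(I) = \frac{1}{\sqrt{2\pi\sigma_1^2}}\exp\!\left[-\frac{(I-I_1)^2}{2\sigma_1^2}\right]. \tag{2.6}$$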

Note that the detected signal I, as shown in Eq. (1.4), is a sum of 2M independent random

variables. From the central limit theorem the Gaussian approximation can be a good

model for both p1(Id) and p0(Id) for large M. But for small M (which is the case for

DWDM systems) and at low Q, the Gaussian distribution is not a good approximation of

the chi-square distribution as shown in Fig. 2.1.

Figure 2.1 plots the chi-square pdfs and the Gaussian approximations of the marks and

spaces in a transmission system with Q^2 = 6.2 dB and M = 3. It shows that the central chi-square pdf of the spaces is quite different from the Gaussian approximation even in the

central part of the pdfs. The difference between the pdfs of the marks, although not as

significant as that between the pdfs of the spaces, is clearly observed. Because the optical

detector is a square-law device and thus always outputs positive electrical voltage, the

probability of a negative signal is zero. The chi-square pdfs have zero probability density

for a signal voltage less than zero. The Gaussian approximation loses this non-negative

signal property. Thus, using the Gaussian approximation in the analysis and design of

FEC codes may cause poor estimation and significant degradation of the code perform-

ances as will be shown in Chapter 3.
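The loss of the non-negative signal property can be quantified directly. As a small illustration, assuming the normalization of Eq. (2.4) with N0 = 1 (so that I0 = M and σ0 = √M), the Gaussian approximation of the spaces assigns the following probability to negative signal values, whereas the chi-square pdf assigns exactly zero. The function name is illustrative.

```python
import math


def gaussian_negative_mass(mean, sigma):
    """Probability that a Gaussian variable with the given mean and sigma is below zero."""
    return 0.5 * math.erfc(mean / (math.sqrt(2.0) * sigma))


M = 3
p_neg_space = gaussian_negative_mass(M, math.sqrt(M))   # roughly 4% for M = 3
```

For M = 3 about four percent of the Gaussian probability mass of the spaces lies at physically impossible negative values, and this fraction grows as M decreases.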

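Figure 2.1: Comparison of the chi-square distribution (solid) and the Gaussian approximation (dashed) for M = 3, Q^2 = 6.2 dB. Horizontal axis: detected signal (I); vertical axis: pdf; curves are shown for the spaces and the marks.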

Figure 2.1 also clearly shows the asymmetric distribution of the marks and spaces with

ASE noise. For both the chi-square pdfs and the Gaussian pdfs, the variance of the marks is much larger than that of the spaces. The difference between the variances comes from

the signal/noise beat term in the expansion of Eq. (1.4).

2.1.2 Channel models for optical fiber channels with dominant ASE noise

As discussed in the previous section, the statistics of ASE noise can be described by the chi-square or Gaussian distributions. The Gaussian distribution is in fact an approximation of the chi-square distribution [48], [49]; in other words, the chi-square distribution describes the ASE noise more accurately. On the other hand, we can see that the chi-square distribution has a more complex formula than the Gaussian approximation [Eqs. (2.1) and (2.2) vs. Eqs. (2.5) and (2.6)]. Based on the two distributions, we can introduce different channel models for optical fiber channels with dominant ASE noise.

Optical fiber channels can be characterized as binary-in binary-out (BIBO) channels or

binary-in soft-out (BISO) channels for hard-decision and soft-decision cases, respec-

tively. For the soft-decision case, we can use two models for optical fiber channels, the

chi-square and Gaussian models. For the hard-decision case, we introduce three channel

models: the chi-square binary asymmetric channel (BAC), the Gaussian BAC, and the

Gaussian binary symmetric channel (BSC).

The general BIBO channel model can be depicted as shown in Fig. 2.2.

Figure 2.2: Binary-in binary-out (BIBO) channel model.

where f ≡ Pr(1 | 0) and m ≡ Pr(0 | 1) are the two transition probabilities at the detector

output after thresholding. If f = m, we have a BSC, and when f ≠ m we have a BAC [55].

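[Figure 2.2 diagram: input 0 goes to output 0 with probability 1 − f and to output 1 with probability f; input 1 goes to output 1 with probability 1 − m and to output 0 with probability m.]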

In the hard-decision case, we need to find the optimal hard-decision threshold Iopt. With the chi-square and Gaussian pdfs given in Eqs. (2.1), (2.2), (2.5), and (2.6), we can derive the condition p0(Iopt) = p1(Iopt) that Iopt satisfies for each of them, obtaining

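$$\left(\frac{I_{\mathrm{opt}}}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{E I_{\mathrm{opt}}}}{N_0}\right) = \frac{(I_{\mathrm{opt}}/N_0)^{M-1}}{(M-1)!}, \tag{2.7}$$

$$\left(\frac{I_1 - I_{\mathrm{opt}}}{\sigma_1}\right)^2 + \ln\sigma_1^2 = \left(\frac{I_{\mathrm{opt}} - I_0}{\sigma_0}\right)^2 + \ln\sigma_0^2, \tag{2.8}$$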

respectively.

We can see that for the chi-square distribution there is no closed-form formula for
evaluating the optimal hard-decision threshold and, thus, no closed-form formula for the

evaluation of the corresponding channel transition probabilities and detector BERs. For

the Gaussian approximation case, the solution is quite complex. Hence, in addition to the

Gaussian approximation of the ASE noise distribution, the hard-decision threshold is

customarily set so that the two transition probabilities, f and m, are equal, which implies a

BSC assumption. Then, the hard-decision threshold is given by [49]

$$I_{\mathrm{th}} = \frac{\sigma_0 I_1 + \sigma_1 I_0}{\sigma_0 + \sigma_1}, \tag{2.9}$$

and the detector BER, pe, is given by

$$p_e = f = m = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{Q}{\sqrt{2}}\right). \tag{2.10}$$
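Equations (2.9) and (2.10) are easy to verify numerically. The sketch below, with illustrative function names and the normalization of Eq. (2.4) (N0 = 1), computes the threshold Ith and checks that the two Gaussian transition probabilities at that threshold are equal to each other and to (1/2)erfc(Q/√2).

```python
import math


def gaussian_tail(x):
    """Pr(Z > x) for a standard Gaussian variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bsc_threshold(i0, i1, sigma0, sigma1):
    """Eq. (2.9): threshold that equalizes the two Gaussian transition probabilities."""
    return (sigma0 * i1 + sigma1 * i0) / (sigma0 + sigma1)


def bsc_ber(q):
    """Eq. (2.10): detector BER under the Gaussian BSC assumption."""
    return gaussian_tail(q)


M, Q = 3, 6.0                                   # example values
sigma0 = math.sqrt(M)
sigma1 = sigma0 + 2.0 * Q
i0 = float(M)
i1 = i0 + 2.0 * Q**2 + 2.0 * Q * sigma0
i_th = bsc_threshold(i0, i1, sigma0, sigma1)
f = gaussian_tail((i_th - i0) / sigma0)         # Pr(1 | 0)
m = gaussian_tail((i1 - i_th) / sigma1)         # Pr(0 | 1)
```

With Q = 6 the resulting BER is close to 10^-9, the familiar rule of thumb for the Gaussian approximation.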

We note that, for ASE noise, spaces and marks have different variances (σ1 > σ0).

Thus, in the optimal hard-decision case (i.e., minimum BER condition), this in fact gives

rise to a binary asymmetric channel with different transition probabilities f and m. For the

chi-square distribution we do not have closed-form formulae for f and m, but they can be

evaluated numerically using

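$$f = \int_{I_{\mathrm{opt}}}^{\infty} \frac{1}{N_0}\,\frac{(I/N_0)^{M-1}\exp(-I/N_0)}{(M-1)!}\,dI, \tag{2.11}$$

$$m = \int_{-\infty}^{I_{\mathrm{opt}}} \frac{1}{N_0}\left(\frac{I}{E}\right)^{(M-1)/2}\exp\!\left(-\frac{I+E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{EI}}{N_0}\right) dI. \tag{2.12}$$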

With the Gaussian approximation we have

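$$f = \frac{1}{2}\,\mathrm{erfc}\!\left(I_{\mathrm{opt}}\sqrt{\frac{B_e}{2B_o}}\right), \tag{2.13}$$

$$m = 1 - \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{I_{\mathrm{opt}} - 2Q^2 - 2Q\sqrt{B_o/B_e}}{2\sqrt{2}\,Q + \sqrt{2B_o/B_e}}\right), \tag{2.14}$$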

where the signal levels have been offset by I0.
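For the Gaussian BAC model, the threshold condition of Eq. (2.8) and the transition probabilities of Eqs. (2.13) and (2.14) can be evaluated as in the following sketch. It assumes the normalization of Eq. (2.4) with N0 = 1, solves Eq. (2.8) by bisection between I0 and I1, and works with unshifted signal levels rather than levels offset by I0; the function names are illustrative, and the average BER assumes equiprobable marks and spaces.

```python
import math


def gaussian_tail(x):
    """Pr(Z > x) for a standard Gaussian variable Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gaussian_bac(M, Q, tol=1e-12):
    """Solve Eq. (2.8) by bisection and return (I_opt, f, m), with N0 = 1."""
    sigma0 = math.sqrt(M)
    sigma1 = sigma0 + 2.0 * Q
    i0 = float(M)
    i1 = i0 + 2.0 * Q**2 + 2.0 * Q * sigma0

    def g(i):
        # Right-hand side minus left-hand side of Eq. (2.8).
        return (((i - i0) / sigma0) ** 2 + math.log(sigma0**2)
                - ((i1 - i) / sigma1) ** 2 - math.log(sigma1**2))

    lo, hi = i0, i1
    if not (g(lo) < 0.0 < g(hi)):
        raise ValueError("no threshold between I0 and I1 for these parameters")
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    i_opt = 0.5 * (lo + hi)
    f = gaussian_tail((i_opt - i0) / sigma0)   # Pr(1 | 0), cf. Eq. (2.13)
    m = gaussian_tail((i1 - i_opt) / sigma1)   # Pr(0 | 1), cf. Eq. (2.14)
    return i_opt, f, m


M = 3
Q = 10 ** (6.2 / 20)                 # Q^2 = 6.2 dB (assumed dB convention)
i_opt, f, m = gaussian_bac(M, Q)
ber_bac = 0.5 * (f + m)              # equiprobable marks and spaces assumed
ber_bsc = gaussian_tail(Q)           # Eq. (2.10), threshold of Eq. (2.9)
```

Within the Gaussian model, the threshold from Eq. (2.8) gives unequal f and m and an average BER no larger than the BSC value of Eq. (2.10).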

From the above discussions, we have three channel models for optical fiber channels

with dominant ASE noise: chi-square BAC [Eqs. (2.7), (2.11), (2.12)], Gaussian BAC

[Eqs. (2.8), (2.13), and (2.14)], and Gaussian BSC [Eqs. (2.9), (2.10)]. The hard-decision

thresholds evaluated with Eqs. (2.7)–(2.9) and the corresponding detector BER (without

coding) are plotted in Figs. 2.3 and 2.4, respectively.

Figure 2.3 plots the hard-decision thresholds corresponding to the chi-square BAC, Gaussian BAC, and Gaussian BSC models in a transmission system with Q^2 = 6.2 dB and M = 3. It shows that the resulting hard-decision thresholds are clearly at different

positions. Hence, compared to the chi-square BAC model, which is the most accurate of

the three models, both the Gaussian BAC and Gaussian BSC models cause suboptimal

hard decisions. The suboptimal hard decisions will lead to poor estimates of the detector

BER as shown in Fig. 2.4.

Figure 2.3: Comparison of the hard-decision thresholds based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 and Q^2 = 6.2 dB.

Figure 2.4: Comparison of the detected BERs as a function of Q, based on the chi-square distribution, Gaussian approximation, and Gaussian + BSC approximations, for M = 3.

[Figure 2.3 plot: horizontal axis, detected signal (I); vertical axis, pdf; dashed: pdf of spaces, solid: pdf of marks; the Gaussian BSC, Gaussian BAC, and chi-square BAC thresholds are marked.]

[Figure 2.4 plot: horizontal axis, Q^2 (dB); vertical axis, log10(BER); dashed: Gaussian BSC, dotted: Gaussian BAC, solid: chi-square BAC.]

From Fig. 2.4, we can see that, compared to the chi-square BAC model, the Gaussian

BSC model always gives a higher BER than does the optimal hard-decision case, i.e., it

overestimates the detector BER. By contrast, the Gaussian BAC model underestimates

the BER at low Q, and overestimates the BER at high Q. Nevertheless, the resulting BERs

are not significantly different; in other words, both the Gaussian BAC and Gaussian BSC

models work well in evaluating the detector BER without coding.

However, we will show in Chapter 3 that, although the non-optimal hard-decision

thresholds do not cause a significant difference in the evaluation of BER without coding,

the Gaussian BSC approximation may lead to a poor estimate of the Shannon limit of the

code performance if we use FEC coding. Moreover, if we incorporate the Gaussian BSC

approximation in the FEC code design, it may cause significant code performance degra-

dation. This issue will be discussed in Chapter 3 where we will use specific codes as ex-

amples. Even with only the Gaussian approximation, we will show in Chapter 3 that, in

the soft-decision and decoding case, a poor estimate and significant degradation of FEC

code performance may result.

The channel capacity is a function of the probability density functions of the noisy sig-

nals after transmission, as mentioned in Chapter 1. In the hard-decision case, the channel

capacity is a function of the transition probabilities. Hence, the resulting transition prob-

abilities for the different channel models are critical parameters in the analysis and design

of FEC codes.
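To make the dependence on the transition probabilities concrete, the sketch below computes the capacity of a binary-in binary-out channel with Pr(1|0) = f and Pr(0|1) = m by maximizing the mutual information over the input probability of a mark. The ternary search relies on the mutual information being concave in the input distribution; the function names and the search method are illustrative choices, not the method used in Chapter 3.

```python
import math


def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(pi1, f, m):
    """I(X;Y) in bits for Pr(X=1) = pi1, Pr(1|0) = f, Pr(0|1) = m."""
    p_y1 = (1.0 - pi1) * f + pi1 * (1.0 - m)
    return h2(p_y1) - (1.0 - pi1) * h2(f) - pi1 * h2(m)


def bac_capacity(f, m, iterations=200):
    """Capacity of a binary asymmetric channel and the maximizing Pr(X=1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if mutual_information(a, f, m) < mutual_information(b, f, m):
            lo = a
        else:
            hi = b
    pi_star = 0.5 * (lo + hi)
    return mutual_information(pi_star, f, m), pi_star


c_bsc, pi_bsc = bac_capacity(0.1, 0.1)      # symmetric case: reduces to 1 - h2(p)
c_bac, pi_bac = bac_capacity(0.01, 0.03)    # asymmetric example values
```

For f = m the result reduces to the familiar BSC capacity 1 − H(p) with equiprobable inputs; for f ≠ m the maximizing input distribution is in general no longer uniform.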

As shown in Fig. 2.5, the transition probabilities evaluated with Eqs. (2.10)–(2.14) can be

significantly different. Figure 2.5a plots the transition probabilities, m = p(0|1) and f =

p(1|0), as functions of Q for the three channel models. The two transition probability


curves overlap in the Gaussian BSC case, indicating equal f and m as expected for BSC.

Comparing the transition probability curves, f(Q) and m(Q), in the two BAC models, we

can see that the f(Q) and m(Q) curves are farther separated from each other in the Gaus-

sian BAC than in the chi-square BAC. This implies different degrees of asymmetry of the

two channel models.

To clearly show the asymmetry characteristics of the three models, we define the transition probability ratio as m/f and plot log10(m/f) as a function of Q in Fig. 2.5b. On this logarithmic scale, the farther log10(m/f) is from zero (in either the positive or negative direction), the more asymmetric the channel. Figure 2.5b shows that, compared to the chi-square BAC model, the Gaussian BSC model totally disregards the asymmetry of the ASE noise distributions, while the Gaussian BAC model overemphasizes the ASE-induced channel asymmetry.
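As a rough numerical illustration of this asymmetry measure, the sketch below computes f, m, and log10(m/f) for Gaussian-distributed marks and spaces at a given decision threshold. The Gaussian tail expressions and the equal-error threshold (σ0·I1 + σ1·I0)/(σ0 + σ1) are the standard textbook forms and are assumed here; they are not copied from Eqs. (2.8)–(2.12). All function names and numerical values are hypothetical.

```python
import math

def gaussian_transition_probs(i1, i0, sigma1, sigma0, threshold):
    """Return (f, m) for Gaussian marks/spaces: f = p(1|0), m = p(0|1)."""
    f = 0.5 * math.erfc((threshold - i0) / (sigma0 * math.sqrt(2.0)))
    m = 0.5 * math.erfc((i1 - threshold) / (sigma1 * math.sqrt(2.0)))
    return f, m

def log_ratio(f, m):
    """Asymmetry measure log10(m/f); zero corresponds to a symmetric channel."""
    return math.log10(m / f)

def equal_error_threshold(i1, i0, sigma1, sigma0):
    """Threshold that makes f equal to m (assumed textbook form)."""
    return (sigma0 * i1 + sigma1 * i0) / (sigma0 + sigma1)

# Hypothetical signal levels and noise standard deviations (arbitrary units).
I1, I0, S1, S0 = 1.0, 0.1, 0.15, 0.05

th_sym = equal_error_threshold(I1, I0, S1, S0)
f_sym, m_sym = gaussian_transition_probs(I1, I0, S1, S0, th_sym)

# A naive midpoint threshold ignores the unequal noise variances.
f_mid, m_mid = gaussian_transition_probs(I1, I0, S1, S0, 0.5 * (I1 + I0))

print("equal-error threshold:", log_ratio(f_sym, m_sym))
print("midpoint threshold:   ", log_ratio(f_mid, m_mid))
```

With the equal-error threshold the log ratio is zero to within rounding, which is the BSC situation; moving the threshold away from that point drives log10(m/f) away from zero.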

Figure 2.5: Comparison of the transition probabilities based on the chi-square distribution, Gaussian approximation, and Gaussian approximation + BSC approximation for M = 3 as functions of Q² (dB). (a) Transition probabilities, f and m (solid: m = p(0|1); dashed: f = p(1|0)). (b) Transition probability ratios, log10(m/f). Curves in both panels: Gau. BSC, Gau. BAC, and chi. BAC.

2.2 Physical mechanism of SSC and simplified model for SSC-induced timing jitter

The traditional optical soliton is an optical pulse that can propagate undistorted in dis-

persive nonlinear optical fiber under specific pulse power and pulse shape conditions

[16]. Figure 2.6 depicts the basic optical soliton transmission system. The transmitted

data stream is implemented with a stream of optical pulses indicating the marks. At the

receiver, if an optical pulse is detected in the middle of a receiving time slot with duration

T, a mark is received. Conversely, the absence of an optical pulse in the time slot is inter-

preted as a space received. We consider only binary data sequences in studying optical

soliton transmissions.

[Diagram labels: transmitter end, receiver end, Channel 1: f1, fiber path z, soliton stream, binary data stream (0 0 0 0 0 1 1 1 1 1 1), timing slot T.]

Figure 2.6: Optical soliton transmission.

2.2.1 Physical mechanism of SSC

In our studies of SSC-induced impairments, our major concern is timing jitter defined

as the random deviation of the optical pulse position from its nominal location at the time


slot center [16]. Timing jitter causes sub-optimal detectability of a pulse and inter-symbol

interference and, thus, limits both the bit rate and the transmission distance in soliton

transmission systems. In WDM communications, timing jitter also limits the channel

spacing and, thus, system spectral efficiency.

SSC is the result of collisions among solitons in WDM systems that belong to different

channels because of their different group velocities. To understand the physical mechanism of SSC, consider the nonlinear Schrödinger equation (NLS) [16],

$$ i\frac{\partial u}{\partial z} + \frac{1}{2}\,p(z)\,\frac{\partial^2 u}{\partial t^2} + |u|^2 u = -i\Gamma u, \tag{2.15} $$

where p(z) is the normalized group velocity dispersion as a function of transmission distance z. Because of the dependence of the second term on transmission distance z, Eq. (2.15) is not the standard NLS. However, it can be transformed into a perturbed NLS by defining u′ ≡ u exp(–Γz/2) and $z' \equiv \int_0^z p(z)\,dz$. In the transformed variables, Eq. (2.15) becomes

$$ i\frac{\partial u'}{\partial z'} + \frac{1}{2}\,\frac{\partial^2 u'}{\partial t^2} + b\,|u'|^2 u' = 0, \tag{2.16} $$

where b(z) = exp(–Γz) / p(z).

The effect of SSC on the performance of WDM systems can be demonstrated by con-

sidering the simplest case of two WDM channels in the NLS equation.


Complete SSC

In an optical fiber with constant chromatic dispersion and negligible losses, the nonlinear interaction (collision) of two solitons having angular frequencies ±Ω induces a frequency shift on each soliton that approximately equals [63]

$$ \delta\Omega = \frac{1}{2\Omega}\int_{-\infty}^{\infty} \mathrm{sech}^2(t+\Omega z)\,\mathrm{sech}^2(t-\Omega z)\,dt = \frac{2}{\Omega}\,\frac{2\Omega z\cosh(2\Omega z) - \sinh(2\Omega z)}{\sinh^3(2\Omega z)}, \tag{2.17} $$

where the angular frequency Ω = 1.763 radians/τ, and τ is the full width at half maximum (FWHM), in the time domain, of the optical pulse intensity. The angular frequencies of the two solitons change by the same amount but in opposite directions. Given chromatic dispersion, the propagation speed of a soliton changes with its frequency. Hence, the SSC-induced frequency shift leads to a velocity shift. As shown in [63], a collision speeds up the faster soliton and slows down the slower one. From Eq. (2.17), note that δΩmax = 2/(3Ω), and that δΩ returns to zero after the collision is completed. Similarly,

the velocity shift returns to zero after the collision is completed and, thus, the net result is

a time shift of each soliton from the pulse center. This kind of soliton-soliton collision is

called a complete collision.
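The behavior of Eq. (2.17) can be checked numerically. The sketch below assumes the closed form reads (2/Ω)[2Ωz cosh(2Ωz) − sinh(2Ωz)]/sinh³(2Ωz), compares it with a direct trapezoidal evaluation of the overlap integral, and confirms the peak value 2/(3Ω) at z = 0 and the return to zero far from the collision center. Function names, the integration window, and the step count are hypothetical choices made for illustration.

```python
import math

def sech2(x):
    """sech^2(x); safe for the moderate arguments used in this sketch."""
    return 1.0 / math.cosh(x) ** 2

def freq_shift_integral(omega, z, half_width=40.0, n=40001):
    """Trapezoidal estimate of (1/(2*omega)) * integral of the sech^2 overlap."""
    a = omega * z
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        t = -half_width + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * sech2(t + a) * sech2(t - a)
    return total * h / (2.0 * omega)

def freq_shift_closed(omega, z):
    """Closed form assumed for Eq. (2.17); uses the z -> 0 limit near zero."""
    x = 2.0 * omega * z
    if abs(x) < 1e-4:
        return 2.0 / (3.0 * omega)  # limiting value at the collision center
    return (2.0 / omega) * (x * math.cosh(x) - math.sinh(x)) / math.sinh(x) ** 3

omega = 1.0
for z in (0.0, 0.3, 1.0, 3.0):
    print(z, freq_shift_integral(omega, z), freq_shift_closed(omega, z))
```

The two columns agree closely, the shift is largest at z = 0, and it decays toward zero on either side, which is the signature of a complete collision.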

Figures 2.7a and 2.7b depict the soliton velocity changes and corresponding accelera-

tion changes (derivative of velocity changes) during the collisions occurring in optical

fibers with uniform dispersion and optical fibers with dispersion management, respec-

tively. Figure 2.7a shows the symmetric characteristics of the soliton acceleration change

caused by SSC in optical fibers with uniform dispersion. Thus, the soliton speed increases during the first half and decreases in the second half of the collision duration. Hence, after the SSC, the solitons return to the speeds they had before the collision. Thus, the only net result of the collision is a displacement in time, δt, of each soliton [63]. The collision retards the slower and advances the faster of the solitons [63].

Figure 2.7: Changes of soliton velocity and acceleration during collision versus normalized distance (z/Lcoll). (a) Complete soliton-soliton collision (curves: velocity, acceleration). (b) Partial soliton-soliton collision (curves: dispersion, velocity, acceleration).

Partial SSC

In realistic optical WDM systems, however, the use of lumped amplifiers and optical

fiber dispersion management has the potential to unbalance the SSC and, thus, cause a

partial SSC.

Consider the worst case, in which a collision occurs at a point where there is a step change of the optical fiber dispersion, D, in a dispersion-managed soliton system, as shown in Fig. 2.7b [63]. Although the acceleration has the same absolute peak value for each half of the collision, the duration is different for each half, corresponding to the different alternating values of optical fiber dispersion. Thus, the integral of the acceleration over the entire collision is not zero. This unbalanced acceleration therefore yields a net velocity shift that remains after the collision has been completed. Such velocity shifts, when multiplied by the remaining distances to the end of the system, could easily result in an unacceptably large jitter in pulse arrival times.

With the assumptions of uniform optical fiber dispersion and distributed amplifiers, a

complete SSC can be analytically modeled as described in the following section [63]. On

the other hand, an analytical model for a partial SSC is not available because the indefi-

nite integral corresponding to Eq. (2.17) cannot be written in closed-form. Hence, in our

analytical studies in Chapter 4 of the line-coding performance in mitigating the SSC-

induced timing jitter, we consider only the case of a complete SSC. However, a partial

SSC will be considered during full simulations of a dispersion-managed soliton WDM

system in Chapter 4.


2.2.2 Simplified model for SSC-induced timing jitter

Figure 2.8 describes the collisions within two WDM channels. The rectangular block

shown in Fig. 2.8 is defined as a sliding window that slides along the data sequence bit by

bit. The length of the sliding window is equal to the number of symbols (marks or spaces)

in one channel that may interact with symbols in the other channel along the whole

transmission path; it also represents the maximum number of collisions a soliton may ex-

perience in a 2-channel optical fiber transmission system.

Figure 2.8: Soliton-soliton collision in a two-channel WDM system. The rectangular block is defined as the sliding window. [Diagram labels: Channel 1, Channel 2, fiber path, timing slot T, transmitter end, receiver end.]

We first consider a simplified model of SSC in which all collisions are complete colli-

sions to explain the main motivation for our line-coding scheme [63], [64]. Taking only

complete SSCs into account, for two channels with optical frequency difference ∆f, the

simplified model of SSC can be described by the following [63]:

(a) Time shift detected at the receiver induced by each collision is

$$ \delta t \approx \pm 0.1768\,\frac{1}{\tau\,(\Delta f)^2}, \tag{2.18} $$

where δt (ps) is the time shift, τ (ps) is the FWHM of a soliton pulse, ∆f (THz) is the

channel spacing, 0.1768 is a constant ratio without units, the plus sign indicates the

slowing down of the slower soliton, and the minus sign indicates the speeding up of the

faster soliton.

(b) Full width collision length is

$$ L_{\mathrm{coll}} = \frac{2\tau}{D\cdot\Delta\lambda}, \tag{2.19} $$

where D (ps/nm/km) is the optical fiber chromatic dispersion, ∆λ (nm) is the wavelength

difference between the two channels, and Lcoll (km) is the full width collision length that

refers to the distance between the two positions, corresponding to the beginning and end

of the collision, where the solitons overlap at their half power points [63].

(c) Maximum number of collisions for each soliton along the entire transmission path is

$$ N_{12} = \frac{Z\cdot D\cdot\Delta\lambda}{T}, \tag{2.20} $$

where Z is the transmission distance and T is the bit period at the transmitter.

After each collision, the faster of the two colliding solitons is advanced and the slower

one is delayed with the same absolute value of arrival time shift (Eq. 2.18). Given the

system parameters Z, D, ∆f, T, and τ, we can calculate the collision length Lcoll with Eq.


(2.19). We can also calculate N12, the number of collisions each soliton experiences if

data sequences of “all marks” are transmitted in both channels. Thus, the total time shift

induced by SSC over the entire transmission path is simply the product of the number of

collisions experienced and the time shift associated with each collision (δt).
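The calculation described above can be sketched in a few lines. The functions below assume Eqs. (2.18)–(2.20) read δt ≈ 0.1768/(τ·∆f²), Lcoll = 2τ/(D·∆λ), and N12 = Z·D·∆λ/T, with the units stated in the text. The parameter values are hypothetical and are chosen only to make the arithmetic easy to follow; they are not taken from the systems studied here.

```python
def time_shift_per_collision_ps(tau_ps, delta_f_thz):
    """Magnitude of the per-collision time shift, Eq. (2.18), in ps."""
    return 0.1768 / (tau_ps * delta_f_thz ** 2)

def collision_length_km(tau_ps, d_ps_nm_km, delta_lambda_nm):
    """Full width collision length, Eq. (2.19), in km."""
    return 2.0 * tau_ps / (d_ps_nm_km * delta_lambda_nm)

def max_collisions(z_km, d_ps_nm_km, delta_lambda_nm, bit_period_ps):
    """Maximum number of collisions over the path, Eq. (2.20)."""
    return z_km * d_ps_nm_km * delta_lambda_nm / bit_period_ps

# Hypothetical two-channel system (illustrative values only).
TAU_PS = 20.0            # soliton FWHM
DELTA_F_THZ = 0.1        # channel spacing in frequency
DELTA_LAMBDA_NM = 0.8    # channel spacing in wavelength
D_PS_NM_KM = 0.5         # chromatic dispersion
Z_KM = 10000.0           # transmission distance
T_PS = 100.0             # bit period

dt = time_shift_per_collision_ps(TAU_PS, DELTA_F_THZ)
l_coll = collision_length_km(TAU_PS, D_PS_NM_KM, DELTA_LAMBDA_NM)
n12 = max_collisions(Z_KM, D_PS_NM_KM, DELTA_LAMBDA_NM, T_PS)

# "All marks" in both channels: every possible collision occurs.
total_shift_ps = n12 * dt
print(dt, l_coll, n12, total_shift_ps)
```

For these values the per-collision shift is below 1 ps, but the accumulated shift over 40 collisions is a sizable fraction of the 100 ps bit period, which is why the number of collisions matters far more than any single collision.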

It is straightforward to obtain an equation for the time shift in WDM systems with

more than two wavelength channels by using Eqs. (2.18)–(2.20) for each pair of channels

and then summing the results over all channels. In [63], the SSC-induced time shift for a

soliton in the i-th channel in a WDM system is given by

$$ \delta t_i = \pm 0.1418\,\frac{Z\,\tau}{z_0\,T}\sum_{j\neq i}\frac{1}{\Delta f_{ij}}, \tag{2.21} $$

where z0 represents the soliton period in distance [93], Nij represents the maximum number of collisions a soliton in the i-th channel may experience with solitons in the j-th channel, and the average number of collisions a soliton in the i-th channel may experience with solitons in the j-th channel is assumed to be Nij/2.

In this complete SSC model, because δt is constant for a given τ and ∆f, the total time

shift of each soliton only depends on the number of collisions as determined by the

transmitted data pattern in the other channels. Thus, if we can make the number of collisions constant for a given channel, then we would eliminate the timing jitter. It is not possible, however, to achieve this goal exactly, because a data pattern constrained this tightly would carry no information; we show instead that we can approach it. We will use line-codes to reduce the variation in the number of collisions and, hence, reduce the timing jitter and BER. As usual with any coding scheme, we will achieve this result by adding redundancy to the data, but it is done here to reshape the transmitted data pattern in a way that minimizes the SSC-induced timing jitter errors.
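The link between data pattern and collision count can be illustrated with the sliding window of Fig. 2.8. The sketch below assumes the complete-SSC model, in which every mark of the other channel that falls inside the window contributes exactly one collision; the window length, the bit patterns, and the function names are hypothetical.

```python
def collision_counts(other_channel_bits, window):
    """Number of marks (1s) inside each length-`window` sliding window."""
    if window <= 0 or window > len(other_channel_bits):
        raise ValueError("window must be between 1 and the sequence length")
    running = sum(other_channel_bits[:window])
    counts = [running]
    for k in range(window, len(other_channel_bits)):
        # Slide by one bit: one symbol enters the window, one leaves it.
        running += other_channel_bits[k] - other_channel_bits[k - window]
        counts.append(running)
    return counts

def count_spread(counts):
    """Peak-to-peak variation in the collision count."""
    return max(counts) - min(counts)

unconstrained = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1]  # arbitrary data
balanced = [1, 0] * 6                                  # fixed-weight pattern

c_unconstrained = collision_counts(unconstrained, window=4)
c_balanced = collision_counts(balanced, window=4)
print(c_unconstrained, count_spread(c_unconstrained))
print(c_balanced, count_spread(c_balanced))
```

The balanced pattern gives a constant count and hence no jitter in this model, but it also carries no data; a line code sits between the two extremes. Because the window advances one bit at a time, consecutive counts never differ by more than one.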

It is obvious that SSC-induced timing jitter is highly correlated from pulse to pulse.

The net time shift of a given pulse from collisions with pulses of another channel is pro-

portional to the number of collisions that it experiences as it traverses the entire optical

fiber path, and that number can change by only ±1 from one pulse to the next. Hence, the

bit errors caused by SSC have bursty characteristics. SSC-induced bit error patterns are

plotted in Fig. 2.9 for two different bit rates, 12 Gbps and 14 Gbps, in a 4-channel 20 Mm

WDM soliton transmission system.

Figure 2.9: Patterns of SSC-induced bit errors in (a) a middle channel of a 4-channel 12 Gb/s WDM system and (b) a middle channel of a 4-channel 14 Gb/s WDM system. [Axes: bit index versus error index.]

In the figures, the bit index of a transmitted sequence is plotted as a function of the bit

error index, where the bit index counts the transmitted bits and the error index counts the

detector bit errors, in sequential order, respectively. The figures show that SSC-induced

errors for the two bit rates are burst errors in both cases. A higher bit rate implies longer


burst length and smaller burst spacing. The bursty character of SSC-induced errors

may significantly affect the performance of error correction codes and will be taken into

account in the performance comparison of coding schemes in the following chapter.

2.3 Summary

In this chapter, we described the statistics of ASE noise and the physical mechanism of

soliton-soliton collision (SSC). We then constructed several different channel models for

optical fiber channels with dominant ASE noise and a simplified model for SSC-induced

timing jitter.

For ASE noise, we discussed two approximations of the ASE noise distributions: the

Gaussian approximation and the binary symmetric channel (BSC) approximation. We

observed that the Gaussian BSC model, which combines both approximations, gives

simple closed-form formulae for evaluation of the hard-decision threshold and BER. But

the price that must be paid is a non-optimal hard-decision that yields higher BERs.

Although the chi-square distribution is also an approximation, as mentioned, it is the

most accurate theoretical model known for ASE noise. Hence, in the following studies of

the FEC code performance, we will assume ASE noise with a chi-square distribution.

For the SSC, we described its physical mechanism. We also introduced the concepts of

complete SSC and partial SSC that may be observed in a DMS system. We constructed a

simplified model for SSC-induced timing jitter by considering only complete SSC. This

model shows that the total time-shift for each soliton only depends on the number of col-

lisions, which is determined by the transmitted data pattern in the other channels. This

observation motivates the idea of developing a line-coding scheme that can reduce the


variation in the number of collisions each soliton may experience over the entire optical

fiber path and, thus, mitigate the SSC-induced timing jitter.

We also note that SSC induces time shifts of solitons that are highly correlated from

pulse to pulse and, hence, causes burst errors. The bursty characteristics of the SSC-

induced errors should be taken into account in the development of error correction coding

schemes for these kinds of errors.


Chapter 3

Forward Error Correction (FEC) Codes for Correcting ASE-Induced Errors

In both undersea and terrestrial systems, the optical amplifiers are critical components,

and amplified spontaneous emission (ASE) noise in the optical amplifiers is the major

source of noise in optical fiber channels. ASE noise has an asymmetric statistical nature,

and the chi-square distribution model is currently the best theoretical approximation of

the ASE noise statistics. However, for simplicity, the chi-square distributions are usually

approximated with Gaussian distributions having the same means and variances. Moreo-

ver, in the hard-decision case, the binary symmetric channel (BSC) model is widely used

in characterizing optical fiber channels [21], [48], [49]. The BSC model gives a good ap-

proximation of bit error rate (BER) induced by ASE noise at high Q. Although an accu-

rate hard-decision model would be based on a binary asymmetric channel (BAC) [25],

[48], [49] assumption, most existing FEC codes are developed and evaluated with addi-

tive white Gaussian noise (AWGN) or BSC assumptions. Thus, the previous applications

and performance evaluations of FEC codes in optical fiber transmission systems are

mostly based on the Gaussian or BSC approximation with little effort to use a more accu-

rate model of the optical fiber channels.


In this chapter, based on previously discussed ASE noise statistics and optical fiber

channel models, we study the performance of FEC codes at three levels. First, at the

highest level, the study focus is the set of general FEC codes. We evaluate the lower per-

formance bound (Shannon limit) for general FEC codes based on the chi-square BAC, the

Gaussian BAC, and the Gaussian BSC models of optical fiber channels. Second, at the

middle level, the study focus is the set of linear codes, a subset of general FEC codes. We

derive the upper performance bound for linear codes in channels with asymmetric noise

distributions and apply the bound to optical fiber channels with ASE noise. Finally, at the

lowest level, the study focus is the set of turbo codes, a subset of linear codes. We discuss

the effects of different ASE noise models on the performance of the turbo code decoder.

3.1 Lower performance bound for general FEC codes

A fundamental question in FEC code applications is: how much can performance be

improved with these codes, or from an information theoretic standpoint, what is the

Shannon limit for optical fiber channels? Generally, the Shannon limit can be interpreted

as the lowest system BER that can be achieved after FEC decoding for a given FEC code

rate [defined in Eq. (1.5)], system SNR, and channel noise statistics. Previous evaluations

of the Shannon limit in optical fiber communications are based on the Gaussian BSC ap-

proximation with little effort to use a more accurate model for optical fiber channels.

In this section, we investigate the coding performance limit of optical fiber channels

with ASE as the dominant source of noise. The goal is to evaluate the bound on code per-

formance in terms of the decoded BER, Pe(Q, r), where the Q factor is defined in Eq.

(2.3) and r is the code rate. Given a code rate r, Pe is a function of the Q factor. We have


shown that optical fiber channels with dominant ASE noise have a distribution that is

asymmetric, especially at lower values of Q, but the Gaussian BAC and Gaussian BSC

models do not accurately represent the asymmetry. In the following, we evaluate the

lower bound on Pe with the chi-square BAC, the Gaussian BAC, and the Gaussian BSC

models. By comparing the results, we show that both the Gaussian BAC and the Gaussian

BSC models may poorly estimate the maximum coding gain achievable in optical fiber

communications.

To achieve our goal of evaluating the lower BER bound as a function of the Q factor

for a given code rate r, we apply the source-channel coding theorem [65] to the optical

fiber channel. The source-channel coding theorem states that, for a given source and

channel with the source sequence U = (U1, …, Uk), the codeword X = (X1, …, Xn), the

received noisy codeword Y = (Y1, …, Yn), and the decoded sequence V = (V1, …, Vk), the

average cost β, the average distortion δ, and the code rate r = k/n must satisfy r ≤ C(β)/R(δ), where C(β) is the channel capacity (information bits/line symbol) and R(δ) is the rate-distortion function. With the Q factor as the cost parameter, error probability as the distortion measure, and the memoryless channel assumption (which is true for the ASE noise case), the source-channel coding theorem relates code rate r and the Q factor to Pe by [65]

$$ r \le \frac{C(Q)}{R(P_e)}. \tag{3.2} $$

The above inequality gives an upper bound on the best code rate achievable for a given Q

factor and decoded BER (Pe). It can be illustrated with the diagram shown in Fig. 3.1.


Figure 3.1: Illustration of the source-channel coding theorem.

The source-channel coding theorem can be intuitively partitioned into two concate-

nated procedures, an outer lossy compression procedure and an inner channel coding pro-

cedure [94]. As shown in Fig. 3.1, an error-free transmission can be achieved with chan-

nel coding having a code rate rc not higher than the channel capacity C. Channel capacity,

defined later, is a function of the transition probabilities. For a given noise distribution

and decision threshold, the transition probabilities, f and m, are functions of (I1, I0, σ1, σ0)

and, thus, functions of Q. We can relate the Q factor to the channel code rate rc, therefore,

via the channel capacity by

$$ r_c \le C(Q). \tag{3.3} $$

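[Figure 3.1 block diagram: source with p = P(x = 0), lossy compressor (rs ≤ 1/R(Pe)), channel encoder (rc ≤ C(Q)), optical fiber channel with ASE noise (f, m), channel decoder, lossy decompressor. The channel encoder, optical fiber channel, and channel decoder together are labeled as an error-free channel; the overall system is labeled with error probability Pe, transition probabilities (fs, ms), and code rate r = rs rc.]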

We may view the noisy optical fiber channel with channel coder as a virtual error-free

channel; however, error-free transmission is not really required. Given a tolerable system

BER (after decoding), Pe, a lossy compressor with a compression rate

$$ r_s = \frac{\text{length of input data word}}{\text{length of compressed data word}} \tag{3.4} $$

can be used to increase the overall code rate to r = rs·rc [65], [94]. The compression rate,

rs, is in fact the reciprocal of the rate distortion function representing the minimum num-

ber of bits needed to be transmitted for each source information bit, given a tolerable

system error probability Pe. Thus we relate Pe to the compression rate by [94]

$$ r_s \le \frac{1}{R(P_e)}. \tag{3.5} $$

We obtain Eq. (3.2) by combining Eqs. (3.3) and (3.5) with r = rs·rc.

From the above discussion, we see that the code rate can be factorized into independ-

ent terms, the channel capacity C(Q) and the reciprocal of the rate distortion function

R(Pe). The independence between C(Q) and R(Pe) is visually shown in Fig. 3.1. We can

see that C(Q) depends on the transition probabilities (f and m) of the optical fiber channel,

but does not depend on the source data distribution, p = Pr (x = 0). On the other hand,

R(Pe) depends on the source data distribution, but does not depend on the real optical fi-

ber channel. Given the source data distribution, p = Pr (x = 0), the rate distortion function

R(Pe) can be evaluated. This independence property facilitates the evaluation of the

Shannon limit for non-error-free transmissions. The following gives the details of the

evaluation.

Channel capacity is defined by

$$ C = \max_{p_c} I_{XY}(f, m, p_c), \tag{3.6} $$

where IXY is the mutual information function and pc is the probability of a space in the

input data sequence to the channel encoder [65]. Note that pc is different from the source

data distribution p.

For a BAC, the channel capacity can be evaluated by definition, as a function of the

transition probabilities:

$$ C(f, m) = H_2\!\left[p_c^{*}(1-f) + (1-p_c^{*})\,m\right] - p_c^{*}H_2(f) - (1-p_c^{*})H_2(m), \tag{3.7} $$

where H2(p) = −plog2p − (1−p)log2(1−p) is the binary entropy function and pc* represents

the optimal distribution of the input data sequence to the channel encoder that maximizes

the mutual information function IXY. With Eqs. (2.11) – (2.14), and (3.7), C(Q) can be

evaluated numerically with the chi-square BAC and Gaussian BAC models.
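A minimal sketch of this numerical evaluation follows. It assumes Eq. (3.7) is the mutual information H2[pc(1−f) + (1−pc)m] − pc·H2(f) − (1−pc)·H2(m) maximized over pc, and it performs the maximization with a ternary search, which is adequate because the mutual information is concave in the input distribution. The transition probabilities are supplied directly here rather than computed from Eqs. (2.11)–(2.14); the function names and the search method are illustrative choices, not the procedure used to produce the results in this chapter.

```python
import math

def h2(x):
    """Binary entropy function in bits; taken as 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def mutual_information(f, m, pc):
    """I_XY for a BAC with f = p(1|0), m = p(0|1), and pc = P(space) at the input."""
    p_y0 = pc * (1.0 - f) + (1.0 - pc) * m
    return h2(p_y0) - pc * h2(f) - (1.0 - pc) * h2(m)

def bac_capacity(f, m, iterations=200):
    """Return (C, pc*) by ternary search over the input distribution pc."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if mutual_information(f, m, a) < mutual_information(f, m, b):
            lo = a
        else:
            hi = b
    pc_star = 0.5 * (lo + hi)
    return mutual_information(f, m, pc_star), pc_star

# Symmetric case: should reduce to 1 - H2(f) with pc* = 1/2.
c_bsc, pc_bsc = bac_capacity(0.1, 0.1)
# Strongly asymmetric case (hypothetical values): spaces never fail, marks fail half the time.
c_asym, pc_asym = bac_capacity(0.0, 0.5)
print(c_bsc, pc_bsc, c_asym, pc_asym)
```

In the asymmetric case the optimal input distribution moves away from 1/2, toward the more reliable symbol, which is why pc* has to be found rather than assumed.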

For the BSC case (pc* = 1/2, f = m), a simpler formula can be derived as C = 1 − H2(f).

However, when the channel is asymmetric, the BSC approximation will be inaccurate

because the decision threshold will not be at the optimal position, as previously men-

tioned. Figure 3.2 plots the resulting C(Q) based on the three different channel models. It

shows that, compared to the chi-square BAC model, the Gaussian BAC model overestimates the channel capacity at low Q and underestimates it at high Q. The Gaussian BSC

model always underestimates the channel capacity.

It is not straightforward to evaluate the rate-distortion function R(Pe), which indicates

the minimum number of bits needed to represent a source symbol for a given output er-

ror probability Pe. In general, the rate-distortion function is defined as [65]

$$ R(\delta) = \inf_{k}\,\frac{1}{k}\,\min_{p(\mathbf{v}|\mathbf{u})}\left\{ I(\mathbf{U}, \mathbf{V}) : E(d) \le k\delta \right\}. \tag{3.8} $$

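Figure 3.2: Comparison of the channel capacities evaluated based on the chi-square BAC, Gaussian BAC, and Gaussian BSC models of the optical fiber channel with dominant ASE noise. [Axes: C (channel capacity) versus Q² (dB); dashed: Gau. BAC, solid: chi. BAC, dotted: Gau. BSC.]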

The minimization of the mutual information I(U, V) is extended over all p(v|u) = P{V = v | U = u} that define V for a fixed δ and average distortion E(d) ≤ kδ. As proved in [65], the computation of R(δ) becomes considerably easier for a discrete memoryless source U. The simplified formula becomes [65]

$$ R(\delta) = \min_{p(v|u)}\left\{ I(U, V) : E(d) \le \delta \right\}. \tag{3.9} $$


Further simplification is possible for channels with binary input and output by using the

Hamming distortion measure [d(U, V) = 1 if U ≠ V, 0 if U = V] such that E(d) = Pe = δ. Thus we have

$$ E[d(U, V)] = \sum_{u,v\in\{0,1\}} p(u)\,p(v|u)\,d(u, v) = p f_s + (1-p)\,m_s, \tag{3.10} $$

where fs and ms are arbitrary transition probabilities in a binary channel and are different

from the actual transition probabilities, f and m, of the optical fiber channel. Given the

source distribution p, we can evaluate R(Pe) by minimizing I(U, V) over all the transition

probability pairs of fs and ms satisfying pfs + (1−p)ms ≤ Pe, as shown below.

$$ R(P_e) = \min_{(f_s,\,m_s)} I_{XY}(f_s, m_s, p), \quad \forall\,(f_s, m_s) \ni p f_s + (1-p)\,m_s \le P_e. \tag{3.11a} $$

As previously mentioned and shown in Fig. 3.1, R(Pe) is independent of the real optical

fiber channel. Equation (3.11a) can be significantly simplified by letting fs = ms, in which

case

$$ R_{\mathrm{sym}}(P_e) = H_2(p + P_e - 2pP_e) - H_2(P_e). \tag{3.11b} $$

However, Eq. (3.11b) is not always an accurate formula for R(Pe). In fact, for a sym-

metric source data distribution, i.e., p = 1/2, the minimum value of I(U, V) corresponds to

the case of fs = ms, in which case Eq. (3.11b) is an exact expression of R(Pe). However,

when p ≠ 1/2, the minimum values of I(U, V) are not at fs = ms and, thus, Eq. (3.11b) is

only an approximation of R(Pe). The proof of this statement follows.

Proof:

With (1 – p)m + pf = Pe and

$$ I(U, V) = H_2\!\left[p(1-f) + (1-p)\,m\right] - pH_2(f) - (1-p)H_2(m), $$

we have

$$ \frac{dI(U, V)}{df} = p\log_2\!\left[\left(\frac{p + P_e - 2pf}{1 - p - P_e + 2pf}\right)^{2}\cdot\frac{f}{1-f}\cdot\frac{1 - p - P_e + pf}{P_e - pf}\right]. $$

When f = m, i.e., f = Pe, then we obtain

$$ \frac{dI(U, V)}{df} = 2p\log_2\frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e}. \tag{3.12} $$

We may reasonably assume p > 0 and Pe < 0.5. Letting p = 1/2 in Eq. (3.12), we have

$$ \frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 1 $$

and, thus,

$$ 2p\log_2\frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 0. $$

Hence, we find that the right side of Eq. (3.12) equals 0 if p = 1/2.

Letting

$$ 2p\log_2\frac{p + P_e - 2pP_e}{1 - p - P_e + 2pP_e} = 0, $$

for p > 0 and Pe < 0.5, the only solution for this equation is p = 1/2. Hence, we find that the right side of Eq. (3.12) equals 0 only if p = 1/2. QED.

Figure 3.3 plots the transition probability fs giving the minimum value of mutual in-

formation I(U, V) as a function of Pe, for different source distribution values p. It shows

that to achieve the minimum value of I(U, V), fs = Pe (thus fs = ms) only for p = 0.5 and

illustrates the statement made in the previous paragraph.
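The same conclusion can be reproduced with a small numerical search. The sketch below minimizes I(U, V) along the constraint p·fs + (1−p)·ms = Pe by a plain grid search over fs and compares the result with the equal-transition-probability expression H2(p + Pe − 2pPe) − H2(Pe). The grid search, step count, and function names are hypothetical choices; as an independent cross-check, the standard textbook rate-distortion result for a binary source with Hamming distortion, H2(p) − H2(Pe) for Pe ≤ min(p, 1−p), is used in the comments, and that formula is not taken from this chapter.

```python
import math

def h2(x):
    """Binary entropy function in bits; taken as 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def mutual_info(p, f, m):
    """I(U,V) for P(U=0) = p and test channel f = p(1|0), m = p(0|1)."""
    p_v0 = p * (1.0 - f) + (1.0 - p) * m
    return h2(p_v0) - p * h2(f) - (1.0 - p) * h2(m)

def min_info_on_constraint(p, pe, steps=20000):
    """Grid-search f on the line p*f + (1-p)*m = pe; return (f, I) at the minimum."""
    f_lo = max(0.0, (pe - (1.0 - p)) / p)
    f_hi = min(1.0, pe / p)
    best_f, best_i = f_lo, float("inf")
    for k in range(steps + 1):
        f = f_lo + (f_hi - f_lo) * k / steps
        m = min(1.0, max(0.0, (pe - p * f) / (1.0 - p)))
        i = mutual_info(p, f, m)
        if i < best_i:
            best_f, best_i = f, i
    return best_f, best_i

def symmetric_approx(p, pe):
    """Rate-distortion value obtained by forcing fs = ms = Pe."""
    return h2(p + pe - 2.0 * p * pe) - h2(pe)

PE = 0.1
f_half, i_half = min_info_on_constraint(0.5, PE)   # symmetric source
f_asym, i_asym = min_info_on_constraint(0.2, PE)   # asymmetric source
print(f_half, i_half, symmetric_approx(0.5, PE))
print(f_asym, i_asym, symmetric_approx(0.2, PE))   # textbook value: h2(0.2) - h2(0.1)
```

For p = 0.5 the minimizing fs coincides with Pe and the two expressions agree; for p = 0.2 the minimum moves to a much larger fs and lies below the equal-transition-probability value.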


Figure 3.3: The quantity fs for minimum value of I(U, V) as a function of Pe for p = 0.1,…, 0.9.

Figure 3.4 plots the rate distortion function R(Pe) given by Eqs. (3.11a) and (3.11b),

respectively. It shows for low Pe that Eq. (3.11b) is a good approximation of the rate dis-

tortion function, but at high BERs it is not. It also shows that the more asymmetric the

source data distribution, the lower the rate distortion function value that can be achieved.

Although, as shown above, the asymmetrically distributed source is favorable for lower

values of rate distortion function, the symmetric source is the most likely case (and usual

assumption) in communication systems. Hence, in the following evaluations we assume

the symmetric source distribution, i.e., p = 1/2, that gives the rate distortion function as

$$ R(P_e) = 1 - H_2(P_e). \tag{3.13} $$


Figure 3.4: Comparison of the exact rate distortion function and the approximation based on equal transition probabilities, fs = ms, for different source distributions, p = 0.1, …, 0.9. [Axes: R(Pe) versus log10(Pe); curves: approximation (fs = ms) and accurate (fs ≠ ms).]

With the source-channel coding theorem in Eq. (3.2), the channel capacity in Eq. (3.7),

and rate distortion function in Eq. (3.13), we are ready to evaluate the lower bound on Pe,

i.e., the lower bound on the system BER after FEC decoding. Note that R(Pe) is a decreasing function of Pe, so that the upper bound on the code rate r in Eq. (3.2) becomes a lower bound for Pe after rearranging the inequality as shown in Eq. (3.14):

$$ P_e \ge R^{-1}\!\left(\frac{C(f, m)}{r}\right) = R^{-1}\!\left(\frac{C(Q)}{r}\right), \tag{3.14} $$

where R–1(x) represents the inverse rate distortion function that can be evaluated

numerically.
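One simple way to carry out this numerical inversion is bisection, since R(Pe) = 1 − H2(Pe) is monotonically decreasing on [0, 0.5]. The sketch below assumes the symmetric-source rate distortion function of Eq. (3.13) and takes the capacity as an input number; the function names and the example capacity values are hypothetical.

```python
import math

def h2(x):
    """Binary entropy function in bits; taken as 0 at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def inverse_rate_distortion(value, iterations=200):
    """Solve 1 - H2(Pe) = value for Pe in [0, 0.5] by bisection."""
    if value >= 1.0:
        return 0.0   # error-free operation is not excluded by the bound
    if value <= 0.0:
        return 0.5
    lo, hi = 0.0, 0.5
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if 1.0 - h2(mid) > value:
            lo = mid   # R(mid) still too large, so Pe must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pe_lower_bound(capacity, code_rate):
    """Lower bound on the decoded BER from Eq. (3.14)."""
    return inverse_rate_distortion(capacity / code_rate)

# Uncoded BSC check (r = 1): the bound should return the raw crossover probability.
raw_ber = 0.01
bound_uncoded = pe_lower_bound(1.0 - h2(raw_ber), 1.0)

# Hypothetical capacity of 0.4 bits/symbol at two code rates.
bound_half_rate = pe_lower_bound(0.4, 0.5)
bound_full_rate = pe_lower_bound(0.4, 1.0)
print(bound_uncoded, bound_half_rate, bound_full_rate)
```

For r = 1 on a BSC the bound simply returns the channel crossover probability, and whenever C/r reaches 1 the bound drops to zero, meaning that arbitrarily low decoded BER is not ruled out.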


The results for the lower bound on the decoded error probability, Pe, evaluated with the

chi-square BAC, the Gaussian BAC, and the Gaussian BSC models, are plotted in Fig.

3.5. For r = 1, which corresponds to the uncoded case, the Gaussian BSC and Gaussian

BAC models are very nearly identical, as should be expected according to the detected

BERs evaluated in Sec. 2.1 and shown in Fig. 2.4. Compared to the chi-square BAC case,

the Gaussian BSC model overestimates the BER (estimates poorer performance), and the Gaussian BAC model underestimates the BER (estimates better performance) at low Q

and overestimates the BER at high Q, but not significantly. However, in the FEC code

case, i.e., r < 1, we can see differences in the resulting bounds, which we illustrate by

displaying the comparisons among all possible pairs of the three channel models.

First, as shown in Fig. 3.5a, compared to the chi-square BAC model, the Gaussian

BAC model gives a good approximation of the lower bound on FEC code performance

for high code rates (r ≥ 0.8). However, at low code rates (r ≤ 0.5), the Gaussian BAC

model underestimates the lower bounds on code performance; and, as code rates become

lower, the underestimate becomes more significant. For example, for r = 0.5, the underestimate is about 0.4 dB in Q² for Pe ≤ 10⁻⁴.

Second, as shown in Fig. 3.5b, compared to the Gaussian BAC model, the Gaussian

BSC model overestimates the lower bound on code performance. The overestimate becomes more severe for lower code rates. For example, for r = 0.8, the overestimate is about 0.5 dB for Pe ≤ 10⁻⁴, and for r = 0.5, the overestimate is about 0.8 dB for Pe ≤ 10⁻⁴.

Figure 3.5: Comparison of the lower performance bounds of FEC codes evaluated with the chi-square BAC, Gaussian BAC, and Gaussian BSC models, plotted as the lower bound on BER versus Q² (dB) for r = 0.5, 0.6, 0.7, 0.8, 0.9, 1. (a) chi-square BAC (solid) vs. Gaussian BAC (dotted). (b) Gaussian BAC (dotted) vs. Gaussian BSC (dashed). (c) chi-square BAC (solid) vs. Gaussian BSC (dashed).

Finally, as shown in Fig. 3.5c, compared to the chi-square BAC model, the Gaussian

BSC model overestimates the lower bound on code performance by about 0.4 to 0.5 dB

for all code rates studied and for Pe ≤ 10⁻⁴.

An interesting observation can be made from the above comparisons. The Gaussian

approximation of the chi-square distribution leads to an underestimate of the code per-

formance bound (estimates poorer performance), while the BSC approximation of the

Gaussian distribution leads to an overestimate of the code performance bound (estimates

better performance). Thus, for the Gaussian BSC that combines both the Gaussian and

BSC approximations, the Gaussian approximation-induced underestimate (poorer per-

formance estimate) cancels part of the BSC approximation-induced overestimate (better

performance estimate), especially at low code rates.

3.2 Upper bound for linear FEC code performance

In this section, we present theoretical studies on the upper bound of linear code per-

formance in optical fiber channels with ASE as the dominant source of noise. We derive

a general upper bound for Pd, the pairwise error probability (defined as the probability

that the decoder makes a wrong decision by selecting an error sequence), as a function of

the error weight d in asymmetric channels. Utilizing this derived general upper bound,

the weight distribution of linear codes, and the union bound theorem, we evaluate ana-

lytically the upper bound on linear code performance in optical fiber channels, with par-

ticular consideration of the turbo product code (TPC) and the turbo convolutional code

(TCC).


It is worth noting that, while other communications systems aim at achieving bit error rates (BERs) around 10⁻⁴ to 10⁻⁶, optical fiber communications systems require more reliable performance, i.e., BERs less than 10⁻¹¹. As we know, the union bound [68] on the decoded BER diverges significantly from the actual code performance when the SNR drops below a threshold determined by the computation cutoff rate [67], [95]. However, the union bound gives a good estimate of code performance at high SNR, or equivalently, at very low decoded BER. In most cases, simulations of decoded BER down to 10⁻¹¹ are not possible. Hence, the union bound is more important and useful in optical fiber communication channels, as a guide for code design, than in any other kind of channel.

We evaluate the upper bounds on the performance of a TPC and a TCC in an optical

fiber channel using both the Gaussian model and the chi-square model for ASE noise. We

show that, compared to the more accurate chi-square model of the ASE noise, the Gaus-

sian approximation mis-estimates the code performance bounds. We also show that the

TPC outperforms the TCC (according to these bounds), with similar code rate and block

length, in the optical fiber channel requiring very low BER.

3.2.1 Upper bound on linear code performance in asymmetric channels

The upper bound on linear block code performance can be evaluated based on their

weight enumerating functions (WEF), while a convolutional code can be represented by

an equivalent block code if the encoder is forced to the all-zero state at the end of each

block. Moreover, it has been shown that TCC is also a linear code [66]. Obviously, the


TPC with block constituent codes can be treated as linear block code. For TCC with two

convolutional constituent codes, the resulting performance degradation is negligible for a

large interleaver size even though only one of the two convolutional encoders, but not

both, can be forced to the all-zero state at the end of each block [67]. Hence, the formulae

that we have derived apply generally to linear block codes, linear convolutional codes,

TPCs, and TCCs in asymmetric channels. The asymmetric channel here refers to a bi-

nary-input soft-output (BISO) asymmetric channel instead of a binary asymmetric chan-

nel which is binary-input binary-output (BIBO). The specific asymmetric channel studied

is the optical fiber channel with ASE noise approximated by the chi-square distributions

given by Eqs. (2.1) and (2.2).

Evaluation of the performance upper bound is based on the Union Bound Theorem

[68]. For linear codes, the set of decoded error patterns is identical to the set of code-

words. Thus the weight distribution of the decoded error patterns can be equivalently de-

scribed with the WEF of the code as

$$A(x) = \sum_{d=d_{\min}}^{d_{\max}} A_d\, x^d, \tag{3.15}$$

where Ad is the number of codewords and, thus, the number of error patterns of weight d.

The pairwise error probability is defined as the probability that the decoder makes a

wrong decision by selecting a codeword other than the transmitted one [67]. Suppose we

are given the pairwise error probability Pd for each possible weight d, then the decoded

word error probability Pe can be bounded by the union bound as [68]

$$P_e \le \sum_{d=d_{\min}}^{d_{\max}} A_d P_d. \tag{3.16}$$


It is straightforward to evaluate Pd for symmetric channels, particularly for the AWGN

channels, but it is quite involved for asymmetric channels. In the following sections, in-

stead of deriving an exact formula for Pd, a general upper bound of Pd is derived for

asymmetric channels.

3.2.2 Upper bound on Pd for asymmetric channels with two codewords

First consider a BISO asymmetric channel with only two codewords, x1 = (x11, x12, …, x1n) and x2 = (x21, x22, …, x2n), where xij ∈ {0, 1}. Suppose x1 is the transmitted codeword and assume the decoded error pattern e = x1 + x2 has weight d. Let d+ represent the number of mark errors in e, and similarly define d– for the space errors, so that d = d+ + d–. We define Pe1 and Pe2 as

$$P_{e1} = P(\hat{\mathbf{x}} = \mathbf{x}_2 \mid \mathbf{x} = \mathbf{x}_1), \qquad P_{e2} = P(\hat{\mathbf{x}} = \mathbf{x}_1 \mid \mathbf{x} = \mathbf{x}_2),$$

where x = (x1, x2, …, xn), xi ∈ {0, 1}, represents the transmitted codeword, and x̂ represents the decoded codeword. Assuming that the codewords x1 and x2 are equally probable, it follows from [69] that the pairwise error probability is bounded by

$$P_d = \tfrac{1}{2}\,(P_{e1} + P_{e2}) \le \tfrac{1}{2} \min_{0 \le s \le 1} \sum_{\mathbf{y}} P(\mathbf{y} \mid \mathbf{x}_1)^{1-s}\, P(\mathbf{y} \mid \mathbf{x}_2)^{s}, \tag{3.17}$$

where y = (y1, y2, …, yn), yi ∈ [0, ∞) (for optical fiber transmissions), represents the re-

ceived noisy codeword.

With the independence assumption, and noting that the continuous value of yi requires integration of the pdf in probability calculations, we have

$$P_d = \tfrac{1}{2}\,(P_{e1} + P_{e2}) \le \tfrac{1}{2} \min_{0 \le s \le 1} \prod_{i=1}^{n} \int_{y_i} p(y_i \mid x_{1i})^{1-s}\, p(y_i \mid x_{2i})^{s}\, dy_i. \tag{3.18a}$$

Also noting that yi ∈ [0, ∞), we replace yi by y ∈ [0, ∞) in the above equation, and we

obtain

$$P_d \le \tfrac{1}{2} \min_{0 \le s \le 1} \prod_{i=1}^{n} \int_{y} p(y \mid x_{1i})^{1-s}\, p(y \mid x_{2i})^{s}\, dy. \tag{3.18b}$$

We know that x2 differs from x1 with d+ mark errors (x1i = 0 and x2i = 1) and d– space errors (x1i = 1 and x2i = 0) and, thus, the two codewords have n – d+ – d– identical bits (x1i = x2i = x). It follows that Eq. (3.18b) can be written as

$$P_d \le \tfrac{1}{2} \min_{0 \le s \le 1} \left[\int_y p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d^+} \left[\int_y p(y \mid 1)^{1-s} p(y \mid 0)^{s}\, dy\right]^{d^-} \left[\int_y p(y \mid x)^{1-s} p(y \mid x)^{s}\, dy\right]^{n-d^+-d^-}. \tag{3.18c}$$

We now note that

$$\int_y p(y \mid x)^{1-s}\, p(y \mid x)^{s}\, dy = \int_y p(y \mid x)\, dy = 1,$$

so that, by simplifying Eq. (3.18c), we obtain

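$$P_d = \tfrac{1}{2}\,(P_{e1} + P_{e2}) \le \tfrac{1}{2} \min_{0 \le s \le 1} \left[\int_y p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d^+} \left[\int_y p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d^-}. \tag{3.18d}$$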

Equation (3.18d) shows that for a channel with asymmetric noise, the same error pat-

tern may have different pairwise error probabilities when occurring on different code-

words. Hence, there is not an exact formula for Pd as a function of d for asymmetric


channels. Based on Eq. (3.18d), however, we can obtain a general upper bound for Pd as

discussed in the following section.

3.2.3 General upper bound on Pd in asymmetric channels

Now we extend our discussion to the general case where a linear code with N code-

words is used on an asymmetric channel. For a decoded error pattern e with weight d, all

the N codewords can be paired into N/2 pairs, such that the modulo-2 summation of the

two codewords in any pair equals e. For any possible e, there always exists the codeword

pair, (0, e). Thus, the pairwise error probability Pd in the N codeword case can be written

as

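$$P_d = \frac{1}{N} \sum_{i=1}^{N} P_{ei} = \frac{1}{N} \sum_{j=1}^{N/2} \left(P_{e1}^{\,j} + P_{e2}^{\,j}\right), \tag{3.19a}$$

where $P_{e1}^{\,j}$ and $P_{e2}^{\,j}$ are the two error probabilities for the j-th of the N/2 codeword pairs. We can see that Eq. (3.18d) holds for any of the N/2 codeword pairs. Thus, with Eqs. (3.18d) and (3.19a) we obtain

$$P_d = \frac{1}{N} \sum_{i=1}^{N} P_{ei} \le \frac{1}{N} \sum_{j=1}^{N/2} \min_{0 \le s \le 1} \left[\int_y p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d_j^+} \left[\int_y p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d_j^-}, \tag{3.19b}$$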

where j is the index of the codeword pair, dj+ represents the number of mark errors for

one codeword in pair j, and dj– represents the number of space errors for the same code-

word in pair j. The right side of the inequality in Eq. (3.19b) is still a function of dj+ and

dj– instead of d.

We can loosen the bound to facilitate the computation by replacing all the terms in the

summation in Eq. (3.19b) with their maximum value. It will be shown with examples that


this does not significantly loosen the bound for the optical fiber channels with chi-square

noise. Thus we obtain

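$$\begin{aligned} P_d &\le \tfrac{1}{2} \max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \left[\int_y p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d_j^+} \left[\int_y p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d_j^-} \\ &= \tfrac{1}{2} \max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \mu(s, d_j^+). \end{aligned} \tag{3.20}$$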

The µ(s, dj+) function in Eq. (3.20) has the following properties:

Property 1: µ(s, dj+) is convex with respect to s [69].

Property 2: the set of curves µ(s, dj+), 1 ≤ j ≤ N/2, cross at the single point s = 0.5 and have the same value µ(0.5, d/2).

Property 3: µ(0.5, d/2) is the minimum value of µ(s, dj+) for dj+ = d/2.

Shannon, et al., showed Property 1 in [69]. We prove the other two properties as follows.

Proof for property 2:

We substitute s = 0.5 into µ(s, dj+) as defined in Eq. (3.20), obtaining

$$\begin{aligned} \mu(0.5, d_j^+) &= \left[\int_y p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^+} \left[\int_y p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^-} \\ &= \left[\int_y p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d_j^+ + d_j^-} \\ &= \left[\int_y p(y \mid 0)^{1/2} p(y \mid 1)^{1/2}\, dy\right]^{d} \\ &= \mu(0.5, d/2). \end{aligned}$$

Hence, for s = 0.5 and a given d, µ(s, dj+) has the same value for different values of dj+. QED.

Proof for property 3:

Substituting dj+ = d/2 into µ(s, dj+) as defined in Eq. (3.20) and noting that dj+ = d – dj–, we have

$$\mu(s, d/2) = \left[\int_y p(y \mid 0)^{1-s} p(y \mid 1)^{s}\, dy\right]^{d/2} \left[\int_y p(y \mid 0)^{s} p(y \mid 1)^{1-s}\, dy\right]^{d/2},$$

which is a symmetric function with respect to s = 0.5. Suppose that there exists a ∆s ≠ 0, such that µ(0.5+∆s, d/2) is the minimum value of µ(s, d/2). From the symmetry of the function, we must have µ(0.5–∆s, d/2) = µ(0.5+∆s, d/2). Thus, we find µ(0.5, d/2) > [µ(0.5–∆s, d/2) + µ(0.5+∆s, d/2)]/2, in contradiction to property 1. Hence, the supposition is wrong and, thus, µ(0.5, d/2) is the minimum value of µ(s, d/2). QED.

With properties 2 and 3, we can prove that

$$\max_{1 \le j \le N/2}\, \min_{0 \le s \le 1} \mu(s, d_j^+) \le \mu(0.5, d/2), \tag{3.21}$$

as follows.

Proof:

Suppose there exists a j such that dj+ ≠ d/2 and $\min_{0 \le s \le 1} \mu(s, d_j^+)$ > µ(0.5, d/2). According to property 2, µ(0.5, dj+) = µ(0.5, d/2) and, thus, µ(0.5, d/2) ≥ $\min_{0 \le s \le 1} \mu(s, d_j^+)$ from property 1. This contradicts the supposition. Hence, the supposition is wrong, and $\min_{0 \le s \le 1} \mu(s, d_j^+)$ ≤ µ(0.5, d/2) for all j. With property 3, we prove that the equality in Eq. (3.21) holds when dj+ = d/2 exists. QED.
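The three properties and Eq. (3.21) can be checked numerically on a small example. The sketch below is only an illustration: it assumes a hypothetical binary-input, three-output asymmetric channel (the probabilities `P0` and `P1` are made up), so that the integrals over y become finite sums. The function names are likewise illustrative.

```python
# Toy check of Properties 2 and 3 and of Eq. (3.21).
# Assumption: a hypothetical three-output asymmetric channel stands in for
# the continuous chi-square channel, so integrals over y become sums.
P0 = [0.80, 0.15, 0.05]   # p(y|0), illustrative values only
P1 = [0.10, 0.30, 0.60]   # p(y|1), illustrative values only


def chernoff_factor(s):
    """Sum over y of p(y|0)^(1-s) * p(y|1)^s."""
    return sum(a ** (1.0 - s) * b ** s for a, b in zip(P0, P1))


def mu(s, d_plus, d):
    """mu(s, d+) for d+ mark errors and d - d+ space errors."""
    return chernoff_factor(s) ** d_plus * chernoff_factor(1.0 - s) ** (d - d_plus)


def min_over_s(d_plus, d, steps=1000):
    """Grid search for the minimum of mu(s, d+) over 0 <= s <= 1."""
    return min(mu(k / steps, d_plus, d) for k in range(steps + 1))


d = 6
at_half = [mu(0.5, d_plus, d) for d_plus in range(d + 1)]
max_min = max(min_over_s(d_plus, d) for d_plus in range(d + 1))
print("mu(0.5, d+) for d+ = 0..d:", at_half)
print("max over d+ of min over s:", max_min)
```

All entries of `at_half` coincide (Property 2), and `max_min` does not exceed µ(0.5, d/2), as Eq. (3.21) states.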

Equations (3.20) and (3.21) can be combined to obtain a simple general upper bound

on Pd given by

$$P_d \le \tfrac{1}{2} \left[\int_y \sqrt{p(y \mid 0)\, p(y \mid 1)}\; dy\right]^{d}. \tag{3.22}$$

And Eqs. (3.16) and (3.22) can be combined to obtain an upper bound on the decoded

word error probability as

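$$P_e \le \sum_{d=d_{\min}}^{d_{\max}} \tfrac{1}{2} A_d \left[\int_y \sqrt{p(y \mid 0)\, p(y \mid 1)}\; dy\right]^{d} = \tfrac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d Z^{d}. \tag{3.23}$$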

Now, given the WEF of the linear code and the channel noise statistics, the evaluation of

the upper bound via Eq. (3.23) of the decoded error probability becomes a routine task.
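As a small illustration of how Eq. (3.23) is evaluated once the weight distribution and Z are known, the sketch below uses the well-known weight distribution of the Hamming (7, 4) code (A3 = 7, A4 = 7, A7 = 1). The function name and the sample values of Z are assumptions made for this example, not results from the text.

```python
def union_bound_word_error(wef, z):
    """Eq. (3.23): Pe <= 1/2 * sum over d of A_d * Z^d (all-zero word excluded)."""
    return 0.5 * sum(a_d * z ** d for d, a_d in wef.items() if d > 0)


# Weight distribution of the Hamming (7, 4) code: A_3 = 7, A_4 = 7, A_7 = 1.
hamming_wef = {3: 7, 4: 7, 7: 1}

for z in (0.3, 0.1, 0.03):   # illustrative values of Z only
    print(z, union_bound_word_error(hamming_wef, z))
```

For small Z the bound is dominated by the minimum-weight term, which is why the union bound is most informative at low decoded error probability.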

3.2.4 Upper bound on linear code performance in optical fiber channels

The upper bound on linear code performance in optically amplified fiber channels can

now be obtained by substituting Eqs. (2.1) and (2.2) into Eq. (3.23) yielding

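$$\begin{aligned} P_e &\le \tfrac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d \left[\int_0^{\infty} \frac{y^{3(M-1)/4} \exp\!\left(-(y + E/2)/N_0\right) I_{M-1}^{1/2}\!\left(2\sqrt{Ey}/N_0\right)}{N_0^{(M+1)/2}\, [(M-1)!]^{1/2}\, E^{(M-1)/4}}\; dy\right]^{d} \\ &= \tfrac{1}{2} \sum_{d=d_{\min}}^{d_{\max}} A_d Z_{\text{chi}}^{d}, \end{aligned} \tag{3.24}$$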

where Zchi (for chi-square ASE model) is a constant for the given optical fiber channel

parameters, M, E, and N0, and can be evaluated numerically.
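The numerical evaluation of Zchi can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes the chi-square pdfs in the form used later in Eqs. (3.31) and (3.32), a plain trapezoidal rule, and a power-series Bessel routine written only for this example. The parameter values (M = 3, E = 10, N0 = 1, upper limit 80) are arbitrary, and very large E/N0 would need a log-domain implementation to avoid overflow.

```python
import math


def bessel_i(order, x, max_terms=500):
    """Modified Bessel function I_order(x), integer order, by power series."""
    half = 0.5 * x
    term = half ** order / math.factorial(order)
    total = term
    for m in range(1, max_terms):
        term *= half * half / (m * (m + order))
        total += term
        if term < 1e-17 * total:
            break
    return total


def pdf_space(y, M, N0):
    """p(y|0): central chi-square pdf."""
    return (y / N0) ** (M - 1) * math.exp(-y / N0) / (N0 * math.factorial(M - 1))


def pdf_mark(y, M, E, N0):
    """p(y|1): noncentral chi-square pdf."""
    return ((y / E) ** ((M - 1) / 2.0) * math.exp(-(y + E) / N0)
            * bessel_i(M - 1, 2.0 * math.sqrt(y * E) / N0) / N0)


def z_chi(M, E, N0, y_max, steps=4000):
    """Trapezoidal estimate of the integral of sqrt(p(y|0) p(y|1)) on [0, y_max]."""
    h = y_max / steps
    total = 0.0
    for k in range(steps + 1):
        y = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.sqrt(pdf_space(y, M, N0) * pdf_mark(y, M, E, N0))
    return total * h


print(z_chi(M=3, E=10.0, N0=1.0, y_max=80.0))
```

The result lies between 0 and 1 and shrinks as E/N0 grows, so each term A_d Z^d in Eq. (3.24) decays geometrically with d.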


Figure 3.6 plots the µ(s, dj+) vs. s curves with dj+ = d, 2d/3, d/2, d/3, and 0, respectively, where d = 6, Q² = 2 dB, M = 3. These plots display the properties of µ(s, dj+) discussed in the previous section.

Figure 3.6: The µ(s, dj+) curves for different values of dj+.

Figure 3.7 plots two sets of curves corresponding to the logarithm values of µ(0.5, d/2)/2 and Minµ(s, d)/2, respectively, as functions of the Q factor in a given optical fiber channel for d = 3, 6, 9, 12. The first set of curves, µ(0.5, d/2)/2 vs. Q², are actually the upper bounds of Pd obtained in Eq. (3.22), while the second set, Minµ(s, d)/2 vs. Q², equals $\tfrac{1}{2} \min_{1 \le j \le N/2} \min_{0 \le s \le 1} \mu(s, d_j^+)$. Observing Eqs. (3.20) and (3.21), we can see that the tightest bound for Pd (Eq. (3.19b)) falls between the µ(0.5, d/2)/2 and Minµ(s, d)/2 curves with


the same d, and the two sets of curves almost overlay each other. Hence, Fig. 3.7 shows

that using MaxMinµ(s, dj+) in Eq. (3.20) gives a very good approximation of the upper

bound of Pd for the optical fiber channel studied. The computation is significantly simpli-

fied, while the loosening of the bound is negligible.

Figure 3.7: Comparison of µ(0.5, d/2)/2 and Minµ(s, dj+)/2 at d = 3, 6, 9, 12 for the opti-

cal fiber channel with M = 3.

3.2.5 Upper bounds on performance for TPC and TCC example

In this section, we apply the upper bound on linear code performance in asymmetric

channels derived in the previous section to two example codes — the Hamming (7, 4) ×

(7, 4) TPC and the (1, 5/7, 5/7) TCC with interleaver length 100. We start with a descrip-

tion of the encoding procedures of the two codes.

[Figure 3.7 plot: bounds on log10(Pd) versus Q² (dB) for d = 3, 6, 9, 12; dotted: log10(µ(0.5, d/2)/2), dashed: log10(Minµ(s, d)/2).]

The Hamming (7, 4) × (7, 4) TPC is a two-dimensional product code. The Hamming

(7, 4) constituent code is a standard single-error-correction code, where 4 is the input

data-word length and 7 is the codeword length. The parity-check matrix H of the Ham-

ming (7, 4) code is shown below

$$H = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 1 & 1 \\ 0 & 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 1 & 1 \end{bmatrix}.$$

From the above systematic H = [I | P] matrix, we can get the systematic code generator

matrix G = [P’ | I] (such that GH’ = 0) as shown below

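$$G = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 1 \end{bmatrix}.$$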

The (7, 4) × (7, 4) TPC is encoded in the row-first column-second order and, thus, has the

codeword structure as shown in Fig. 3.8.

[Figure 3.8 layout: information bits (4 × 4), row parity bits (4 × 3), column parity for information bits (3 × 4), column parity for row parity bits (3 × 3).]

Figure 3.8: Codeword structure of the Hamming (7, 4) × (7, 4) TPC.

We can see that the Hamming (7, 4) × (7, 4) TPC has a block length of 49 bits, and a

code rate r = 16/49 ≈ 0.327.
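The row-first, column-second encoding can be sketched in a few lines. The example below is illustrative only: it assumes the systematic matrices G = [P' | I] and H = [I | P] with the rows of P' equal to 110, 011, 111, 101, and the helper names are made up. It checks that GH' = 0 and that every row and every column of the 7 × 7 result is a Hamming (7, 4) codeword. With this particular arrangement the information block ends up in the lower-right 4 × 4 corner, which may differ from the layout drawn in Fig. 3.8.

```python
# Parity part P' of G = [P' | I]; H = [I | P] with P the transpose of P'.
P_T = [[1, 1, 0],
       [0, 1, 1],
       [1, 1, 1],
       [1, 0, 1]]
G = [P_T[i] + [1 if j == i else 0 for j in range(4)] for i in range(4)]
H = [[1 if j == r else 0 for j in range(3)] + [P_T[i][r] for i in range(4)]
     for r in range(3)]


def matmul2(a, b):
    """Matrix product over GF(2)."""
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def tpc_encode(info):
    """Encode a 4 x 4 information block: rows first, then columns."""
    rows_encoded = matmul2(info, G)               # 4 x 7
    return matmul2(transpose(G), rows_encoded)    # 7 x 7


info = [[1, 0, 1, 1],
        [0, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 1]]
codeword = tpc_encode(info)
print("block length:", len(codeword) * len(codeword[0]), "rate:", 16 / 49)
```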

The (1, 5/7, 5/7) TCC is a rate 1/3 code, where the parameters in the parentheses are octal numbers representing the structures of the constituent encoders. As depicted in Fig. 3.9, the “1” represents the information bit sequence and the two “5/7”s correspond to the recursive parity-check generator polynomial (1 + D²)/(1 + D + D²). A 100-bit interleaver is used in between the two constituent encoders; thus, the TCC has a block length of 300.

Figure 3.9: Encoder structure of (1, 5/7, 5/7) TCC with 100-bit interleaver.

The upper bounds on the decoded BERs of the TPC and TCC codes can be evaluated

with Eqs. (3.25) and (3.26), respectively,

$$P_{\text{BER}} \le \sum_{i=1}^{k} \sum_{j=0}^{n-k} \frac{i}{2k}\, A(i, j)\, Z^{(i+j)}, \tag{3.25}$$

$$P_{\text{BER}} \le \sum_{i=1}^{L} \sum_{j_1=0}^{L} \sum_{j_2=0}^{L} \frac{i}{2L} \binom{L}{i}^{-1} t(L, i, j_1)\, t(L, i, j_2)\, Z^{(i+j_1+j_2)}, \tag{3.26}$$

where k is the input data-word length, i.e., the number of information bits in the TPC

codeword, n is the TPC codeword length, A(i, j) is the coefficient of the conditional WEF

of the TPC, L is the TCC interleaver length, t(l, i, j) is the transfer function coefficient of

the constituent (5/7) convolutional code for the TCC, and Z was defined in Eq. (3.23).

The TPC studied here has short data-word length (k = 16 bits) and codeword length (n =

49 bits); hence, its conditional WEF can be easily obtained by counting all the possible

codewords. The transfer function of convolutional codes can be obtained with the recur-

sive algorithm introduced in [70]. As defined in Eq. (3.23), Z is a function of the pdfs of

the channel noises. Using both the chi-square and Gaussian models of the ASE noise for

both the TPC and TCC, we obtained four performance bound curves as plotted in Fig.

3.10.
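Counting all 2^16 codewords of the TPC is indeed cheap. The sketch below is one possible way to do it, under stated assumptions: the Hamming (7, 4) parity rows are taken as 110, 011, 111, 101, codeword rows are packed into integers, and the function names and the sample value Z = 0.2 are illustrative rather than taken from the text. It tabulates A(i, j), with i the information weight and j the parity weight, and then evaluates the bound of Eq. (3.25).

```python
# Rows of P' in G = [P' | I] for the Hamming (7, 4) code, as 3-bit parity words.
PARITY_ROWS = [0b110, 0b011, 0b111, 0b101]


def hamming_encode(u):
    """Map a 4-bit data word to a 7-bit codeword (parity in the high 3 bits)."""
    parity = 0
    for i in range(4):
        if (u >> (3 - i)) & 1:
            parity ^= PARITY_ROWS[i]
    return (parity << 4) | u


ROW_CODEWORDS = [hamming_encode(u) for u in range(16)]


def popcount(x):
    return bin(x).count("1")


def conditional_wef():
    """Return {(i, j): A(i, j)} for the (7, 4) x (7, 4) product code."""
    counts = {}
    for block in range(1 << 16):
        rows = [ROW_CODEWORDS[(block >> (4 * r)) & 0xF] for r in range(4)]
        # Column encoding: parity row r is the XOR of the encoded rows
        # selected by column r of P'.
        for r in range(3):
            parity_row = 0
            for i in range(4):
                if (PARITY_ROWS[i] >> (2 - r)) & 1:
                    parity_row ^= rows[i]
            rows.append(parity_row)
        i_weight = popcount(block)
        j_weight = sum(popcount(row) for row in rows) - i_weight
        key = (i_weight, j_weight)
        counts[key] = counts.get(key, 0) + 1
    return counts


def tpc_ber_bound(wef, z, k=16):
    """Eq. (3.25): sum of (i / 2k) * A(i, j) * Z^(i + j) over i >= 1."""
    return sum(i / (2.0 * k) * a * z ** (i + j)
               for (i, j), a in wef.items() if i >= 1)


wef = conditional_wef()
print(tpc_ber_bound(wef, 0.2))   # Z = 0.2 is an illustrative value only
```

The smallest nonzero total weight i + j in the table is 9, the product of the two constituent minimum distances, so the leading term of the bound decays as Z^9.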

Figure 3.10 shows that, for both TPC and TCC, the upper bounds on performance evaluated with the Gaussian approximation and chi-square distribution models cross around the point Q² = 2 dB. Compared to the chi-square distribution model, the Gaussian model underestimates the code performance bounds by about 1.5 dB at BER 10⁻¹². As a whole, the Gaussian approximation overestimates at low Q and underestimates at high Q the upper bounds on the TPC and TCC performance. If we interpret the above comparison as the comparison of two channels, a Gaussian channel and a chi-square channel, we can see that the TPC and TCC perform better in the Gaussian channel at very low Q (Q² < 2 dB) and in the chi-square channel at higher Q.


Figure 3.10 also shows that the TCC outperforms the TPC by 2.5–3 dB when the BER is about 10⁻⁴, as indicated by the horizontal dash-dot line. However, the rate at which the BER decreases with the TCC as a function of Q² is smaller than with the TPC. Thus, the TPC outperforms the TCC at very low BER (BER < 10⁻¹⁶ as indicated by the horizontal dash-dot line). Note that the TPC and TCC studied here have similar code rate (TPC rate = 0.327, TCC rate = 0.333), but the TCC is a longer code (300 bits) than the TPC (49 bits). Hence, it suggests that, with a similar code rate and code length, the TPC may be a better choice than the TCC in optical fiber channels requiring very low BERs. In fact, as shown in Fig. 3.11, when the interleaver length of the (1, 5/7, 5/7) TCC is decreased to 20, corresponding to a 60-bit code length, the 49-bit TPC outperforms the TCC for BER < 10⁻¹⁰ as indicated by the horizontal dash-dot line.

[Figure 3.10 plot: log10(BER) versus Q² (dB).]

Figure 3.10: Upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (triangles) and the (1, 5/7, 5/7) TCC with interleaver length 100 (circles) using the Gaussian (dotted) and the chi-square (solid) ASE noise models.

Figure 3.11: Comparison of the upper bounds on the performance of the Hamming (7, 4) × (7, 4) TPC (solid) and the (1, 5/7, 5/7) TCC with interleaver length 20 (dashed) using the chi-square ASE noise model.

3.3 Performance improvement of turbo codes

In this section, we study the effects of different ASE noise models on the performance

of turbo code (TC) decoders. A soft decoding algorithm, the BCJR algorithm [71], is

generally used in the TC decoders. The BCJR algorithm is a maximum a posteriori prob-

ability (MAP) algorithm, and, is generally very sensitive to the noise statistics. We noted

that the Gaussian approximation of the ASE noise is widely used in the study of optical

fiber transmission systems [20]–[23], [48], [49], and there exist standard TCs for Gaus-

sian channels. We show that, however, using a MAP decoding algorithm based on the

Gaussian noise assumption may significantly degrade the TC decoder performance in an

optical fiber channel with non-Gaussian ASE noise. To take full advantage of TC, accu-

rate asymmetric noise statistics in optical fiber transmissions should be used in the BCJR

decoding algorithm.

[Figure 3.11 plot: log10(BER) versus Q² (dB).]

In the following, modifications of the BCJR algorithm according to the chi-square

noise distribution are described and simulation results for the performance of the im-

proved TC decoder are discussed. The effects of the three different optical fiber channel

models –– the chi-square distribution model, the approximated Gaussian distribution

model, and the approximated Gaussian BSC model –– on the TC decoder are discussed.

The BCJR decoding algorithm is an iterative soft decoding algorithm and requires a

soft-decision channel model, such as the chi-square and approximated Gaussian distribu-

tion models. The following discussions will show, however, that the hard-decision chan-

nel model can also affect the performance of the punctured TC decoder. Specifically, the

approximated Gaussian BSC model, as shown later in the simulations, degrades the per-

formance of the punctured TC whose decoding requires the optimal hard-decision

threshold.

3.3.1 Modification of the BCJR Decoding Algorithm according to the chi-square noise

distributions

The BCJR algorithm is a recursive algorithm for the maximum a posteriori probability (MAP) decoding of the received noisy codeword $Y = (y^s_1, \ldots, y^s_N, y^{ip}_1, \ldots, y^{ip}_N, \ldots)$, where $y^s_k$ represents a received information bit corresponding to the transmitted information bit $u_k$, and $y^{ip}_k$ represents a received parity-check bit corresponding to the transmitted parity-check bit $x^{ip}_k$ generated by the i-th constituent encoder. Note, i = 1, 2 for our rate 1/3 TC where each constituent convolutional encoder has rate 1/2. In the i-th constituent MAP decoder for TC, the information bit $u_k$ in the transmitted codeword $X = (u_1, \ldots, u_N, x^{ip}_1, \ldots, x^{ip}_N, \ldots)$ is estimated based on the received noisy codeword Y by

$$\hat{u}_k = \begin{cases} 1, & \text{if } L(u_k) > 0 \\ 0, & \text{if } L(u_k) < 0 \end{cases} \tag{3.27}$$

where $L(u_k)$ is the log likelihood ratio (LLR)

$$L(u_k) \equiv \log \frac{P(u_k = 1 \mid Y)}{P(u_k = 0 \mid Y)}, \qquad 0 \le k \le N. \tag{3.28}$$

The key to the BCJR algorithm is to decompose the a posteriori probability into three factors $\alpha_{k-1}$, $\gamma_k$, and $\beta_k$, relating the decision on $u_k$ (we refer to the subscript k as “time k” in the following discussions) to the previous, current, and future observations, respectively, as

$$P(u_k = u \text{ causing state transition } s' \text{ to } s \mid Y) = \frac{1}{p(Y)} \sum_{s', s \in S} \alpha_{k-1}(s')\, \gamma_k(s', s)\, \beta_k(s), \tag{3.29}$$

where $S = \{s_1, \ldots, s_k, \ldots, s_N\}$ is the set of all constituent encoder states, the state pair (s', s) represents a state transition from ($s_{k-1}$ = s') to ($s_k$ = s), $\alpha_{k-1}(s') = p[s_{k-1} = s', (y^s_1, \ldots, y^s_{k-1}, y^{ip}_1, \ldots, y^{ip}_{k-1})]$ is a probability measure for state s' at time k–1 that depends only on the past observations, i.e., the received information and parity-check bits before time k, $\beta_k(s) = p[(y^s_{k+1}, \ldots, y^s_N, y^{ip}_{k+1}, \ldots, y^{ip}_N) \mid s_k = s]$ is a probability measure for state s at time k that depends only on the future observations, i.e., the received information and parity-check bits after time k, and $\gamma_k(s', s)$ is a probability measure connecting state s' at time k–1 to state s at time k that depends only on the present observation $(y^s_k, y^{ip}_k)$. The $\gamma_k(s', s)$ can be written as

$$\gamma_k(s', s) = P(u_k)\, p(y^s_k, y^p_k \mid u_k) \cong P(u_k)\, p(y^s_k \mid u_k)\, p(y^p_k \mid x^p_k), \tag{3.30}$$

and $\alpha_{k-1}(s')$ and $\beta_k(s)$ can be computed recursively as functions of $\gamma_k(s', s)$ given by

$$\alpha_k(s) = \sum_{s' \in S} \alpha_{k-1}(s')\, \gamma_k(s', s)$$

and

$$\beta_{k-1}(s') = \sum_{s \in S} \beta_k(s)\, \gamma_k(s', s),$$

respectively [92].
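The two recursions can be written down directly. The sketch below is a generic forward-backward pass over arbitrary branch metrics, not a full turbo decoder: the random γ values, the choice of starting in state 0, the uniform final-state weights, and the absence of per-step normalization are all assumptions made for illustration.

```python
import random


def forward_backward(gammas, num_states):
    """Run the alpha and beta recursions for metrics gammas[k][s_prev][s]."""
    n = len(gammas)
    alpha = [[0.0] * num_states for _ in range(n + 1)]
    beta = [[0.0] * num_states for _ in range(n + 1)]
    alpha[0][0] = 1.0                    # assumption: encoder starts in state 0
    beta[n] = [1.0] * num_states         # assumption: final state is unknown
    for k in range(1, n + 1):
        for s in range(num_states):
            alpha[k][s] = sum(alpha[k - 1][sp] * gammas[k - 1][sp][s]
                              for sp in range(num_states))
    for k in range(n, 0, -1):
        for sp in range(num_states):
            beta[k - 1][sp] = sum(beta[k][s] * gammas[k - 1][sp][s]
                                  for s in range(num_states))
    return alpha, beta


random.seed(1)
num_states, n = 4, 6
gammas = [[[random.random() for _ in range(num_states)]
           for _ in range(num_states)] for _ in range(n)]
alpha, beta = forward_backward(gammas, num_states)
totals = [sum(a * b for a, b in zip(alpha[k], beta[k])) for k in range(n + 1)]
print(totals)
```

The sum over s of α_k(s) β_k(s) is the same at every time k, which is a convenient sanity check on any implementation of the recursions. A practical decoder would normalize α and β at each step to avoid underflow on long blocks.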

We observe that γk(s’, s) depends on the conditional pdfs of the received signals and is

the key factor in the BCJR algorithm; hence, the performance of the BCJR algorithm de-

pends strongly on the accuracy of the noise model.

As shown in Fig. 3.12, the differences between the pdfs of the ASE noise with the chi-

square distribution and the Gaussian approximation with the same mean and variance are

not negligible, especially at low Q as in the case of Q2 = 5 dB. An obvious question is,

therefore, can better TC performance be achieved by modifying the standard formula of

γk(s’, s), which uses the Gaussian noise model, to a new formula using the more accurate

chi-square distribution model given by Eqs. (2.1) and (2.2)? We rewrite the pdfs here,

with new notations for convenience:


Figure 3.12: Comparison of the pdfs of the ASE noise with a chi-square distribution and a Gaussian distribution with the same mean and variance. The panels show p(y | u = 0) and p(y | u = 1) versus y at Q² = 5 dB.

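$$p(y_k \mid x_k = 1) = \frac{1}{N_0} \left(\frac{y_k}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\frac{y_k + E}{N_0}\right) I_{M-1}\!\left(\frac{2\sqrt{y_k E}}{N_0}\right), \qquad y_k \ge 0, \tag{3.31}$$

$$p(y_k \mid x_k = 0) = \frac{1}{N_0}\, \frac{(y_k/N_0)^{M-1} \exp(-y_k/N_0)}{(M-1)!}, \qquad y_k \ge 0, \tag{3.32}$$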

where $y_k$ represents $y^s_k$ or $y^p_k$, $x_k$ represents $u_k$ or $x^p_k$, E is the transmitted signal energy, N0/2 is the two-sided power spectral density of the ASE noise, and 2M is the dimensionality of the optical signal space. When we substitute Eqs. (3.31) and (3.32) into Eq. (3.30), we obtain

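$$\gamma_k(s', s) \cong \begin{cases} \dfrac{P(u_k)}{N_0^2} \left(\dfrac{y^s_k y^p_k}{E^2}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^s_k + y^p_k + 2E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^s_k}}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^p_k}}{N_0}\right), & u_k = 1,\ x^p_k = 1, \\[2ex] \dfrac{P(u_k)}{N_0^2} \left(\dfrac{y^s_k}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^s_k + y^p_k + E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^s_k}}{N_0}\right) \dfrac{(y^p_k/N_0)^{M-1}}{(M-1)!}, & u_k = 1,\ x^p_k = 0, \\[2ex] \dfrac{P(u_k)}{N_0^2} \left(\dfrac{y^p_k}{E}\right)^{\frac{M-1}{2}} \exp\!\left(-\dfrac{y^s_k + y^p_k + E}{N_0}\right) I_{M-1}\!\left(\dfrac{2\sqrt{E y^p_k}}{N_0}\right) \dfrac{(y^s_k/N_0)^{M-1}}{(M-1)!}, & u_k = 0,\ x^p_k = 1, \\[2ex] \dfrac{P(u_k)}{N_0^2} \exp\!\left(-\dfrac{y^s_k + y^p_k}{N_0}\right) \dfrac{(y^s_k y^p_k/N_0^2)^{M-1}}{[(M-1)!]^2}, & u_k = 0,\ x^p_k = 0. \end{cases} \tag{3.33}$$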

Defining

$$L^e(u_k) \equiv \log \frac{P(u_k = 1)}{P(u_k = 0)},$$

we may write

$$P(u_k) = \frac{\exp[-L^e(u_k)/2]}{1 + \exp[-L^e(u_k)]}\, \exp\!\left[\frac{(2u_k - 1)\, L^e(u_k)}{2}\right]. \tag{3.34}$$
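Equation (3.34) and the decision rule of Eq. (3.27) translate into two short functions. The sketch below is illustrative; the function names are made up, and the handling of a zero LLR (returning 0) is an assumption, since Eq. (3.27) leaves that case open.

```python
import math


def prior_from_llr(u, le):
    """Eq. (3.34): P(u_k = u) from the a priori LLR Le(u_k), with u in {0, 1}."""
    return (math.exp(-le / 2.0) / (1.0 + math.exp(-le))
            * math.exp((2 * u - 1) * le / 2.0))


def hard_decision(llr):
    """Eq. (3.27): decide 1 for a positive LLR and 0 otherwise (tie -> 0, assumed)."""
    return 1 if llr > 0 else 0


le = 1.3   # illustrative a priori LLR
p1, p0 = prior_from_llr(1, le), prior_from_llr(0, le)
print(p1, p0, math.log(p1 / p0), hard_decision(le))
```

The two probabilities sum to one and their log ratio returns the LLR that was put in, which confirms that the factor exp[–Le/2]/(1 + exp[–Le]) is common to both values of u_k and can be dropped when only ratios matter.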

Note that Eqs. (3.33) and (3.34) can be substituted into Eqs. (3.28) and (3.29) to calculate

the LLR. Thus, all the common terms in the 4 cases in Eq. (3.33) can be removed to sim-

plify the calculations. Then, the γk(s’, s) can be calculated with

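$$\gamma_k(s', s) \cong \begin{cases} c_1 \exp\!\left(L^e(u_k)\right) I_{M-1}\!\left(a\sqrt{y^s_k}\right) I_{M-1}\!\left(a\sqrt{y^p_k}\right), & u_k = 1,\ x^p_k = 1, \\ \exp\!\left(L^e(u_k)\right) I_{M-1}\!\left(a\sqrt{y^s_k}\right) (y^p_k)^{b}, & u_k = 1,\ x^p_k = 0, \\ I_{M-1}\!\left(a\sqrt{y^p_k}\right) (y^s_k)^{b}, & u_k = 0,\ x^p_k = 1, \\ c_0\, (y^s_k\, y^p_k)^{b}, & u_k = 0,\ x^p_k = 0, \end{cases} \tag{3.35}$$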

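where a, b, c0, and c1 are constants given by

$$a = \frac{2\sqrt{E}}{N_0}, \qquad b = \frac{M-1}{2}, \qquad c_1 = (M-1)! \left(\frac{N_0}{\sqrt{E}}\right)^{M-1} \exp\!\left(-\frac{E}{N_0}\right), \qquad c_0 = \frac{1}{(M-1)!} \left(\frac{\sqrt{E}}{N_0}\right)^{M-1} \exp\!\left(\frac{E}{N_0}\right).$$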

Defining

$$\gamma^e_k(s', s) \equiv \begin{cases} c_1^{\,u_k}\, I_{M-1}\!\left(a\sqrt{y^p_k}\right), & x^p_k = 1, \\ c_0^{\,1-u_k}\, (y^p_k)^{b}, & x^p_k = 0, \end{cases}$$

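the LLR can be calculated iteratively to yield

$$L(u_k) = \log \frac{I_{M-1}\!\left(a\sqrt{y^s_k}\right)}{(y^s_k)^{b}} + L^e(u_k) + \log \frac{\sum_{S^+} \tilde{\alpha}_{k-1}(s')\, \gamma^e_k(s', s)\, \tilde{\beta}_k(s)}{\sum_{S^-} \tilde{\alpha}_{k-1}(s')\, \gamma^e_k(s', s)\, \tilde{\beta}_k(s)}, \tag{3.36}$$

where $S^+$ is the set of (s', s) caused by $u_k$ = 1, and $S^-$ is similarly defined for $u_k$ = 0. The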

first term on the right side of Eq. (3.36), which depends on the currently observed infor-

mation bit and the channel SNR, is sometimes called the channel value. The second term

Le(uk) represents any a priori information provided (extrinsic information received) by

the other decoder, and the third term represents extrinsic information passed to the other

decoder.


3.3.2 Effect of the BSC model on the performance of the punctured TC

Punctured TC is more practical than the standard TC in optical fiber transmissions be-

cause of the higher code rates that can be obtained from lower code rate codes. Punctur-

ing can be implemented by deleting some parity and/or information bits at the output of

the encoder [67], [92]. At the input of the decoder, the signals corresponding to the

punctured bits are set to the same value as the optimal hard-decision threshold Iopt [92].

The reason follows. If we assume that the pdfs of the spaces and the marks, p(x|0) and

p(x|1), cross at the point (Icross, pcross) and satisfy the conditions:

(a) p(x|0) > p(x|1) for all x < Icross,

(b) p(x|0) < p(x|1) for all x > Icross,

we can prove that Iopt, which is optimal in the sense of giving the minimum hard-decision

detection error probability, is the crossover point of the two pdf curves, i.e., Iopt = Icross.

Proof:

Suppose Iopt ≠ Icross, there are only two possible cases, Iopt < Icross or Iopt > Icross. First,

consider the case when Iopt < Icross. With condition (a), we have p(x|0) > p(x|1) for x ∈

[Iopt, Icross). Then, the minimum hard-decision detection error probability can be expressed

as

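$$\begin{aligned} P_{\min} &= \int_{-\infty}^{I_{\text{opt}}} p(I \mid 1)\, dI + \int_{I_{\text{opt}}}^{\infty} p(I \mid 0)\, dI \\ &= \int_{-\infty}^{I_{\text{cross}}} p(I \mid 1)\, dI + \int_{I_{\text{cross}}}^{\infty} p(I \mid 0)\, dI + \int_{I_{\text{opt}}}^{I_{\text{cross}}} \left[p(I \mid 0) - p(I \mid 1)\right] dI \\ &= P_{\text{cross}} + P_{\text{ext}}, \end{aligned}$$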

where Pcross is actually the hard-decision detection error probability with Icross as the deci-

sion threshold, and Pext > 0. Thus, we obtain Pmin > Pcross, which contradicts the definition

of Pmin. Hence, Iopt < Icross is not possible.

Similarly, with condition (b), we can prove that Iopt > Icross is also not possible and, hence, Iopt = Icross. QED.

This proof leads to the straightforward likelihood ratio result, i.e., if we set punctured

bits to the same value as the optimal hard-decision threshold Iopt, then

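$$\frac{p(\text{preset signal value for punctured bit} \mid \text{punctured bit} = 0)}{p(\text{preset signal value for punctured bit} \mid \text{punctured bit} = 1)} = \frac{p(I_{\text{opt}} \mid 0)}{p(I_{\text{opt}} \mid 1)} = \frac{p(I_{\text{cross}} \mid 0)}{p(I_{\text{cross}} \mid 1)} = 1. \tag{3.37}$$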

Obviously, a likelihood ratio equal to 1 (and LLR = 0) is the best guess for the punctured

bits in the sense of achieving minimum error probability. Hence, Iopt is the best value to

use for those virtual signals corresponding to the punctured bits. Note that both the chi-

square and Gaussian distributions satisfy the two conditions mentioned above and, hence,

the proof and statements made above are valid for them.

As discussed in Chapter 2, with the Gaussian BSC model of an optical fiber channel

with ASE noise, the optimal hard-decision threshold is assumed to be at the point that

gives equal transition probabilities. Thus, the approximate optimal threshold can be

evaluated with a simple formula as shown in Eq. (2.9), while the actual optimal threshold

for Gaussian noises satisfies Eq. (2.8). Figure 2.3 plots the optimal thresholds evaluated

at Q2 = 6.2 dB for the chi-square BAC, Gaussian BAC, and Gaussian BSC models. We


can see that the optimal thresholds in the first two models both give an ideal likelihood

ratio of 1, while the BSC model gives a likelihood ratio greater than 1.

The Gaussian BSC model gives a more accurate estimate for the BER at higher Q and,

thus, it is expected to perform better at higher Q as a model of optical fiber channels. This

is not true, however, if we use the Gaussian BSC model in the decoding of punctured

TCs. With Eq. (2.4) and (2.9), the approximate optimal threshold IoptBSC based on the

Gaussian BSC model can be expressed as a function of Q given by

QB

B

B

B

B

B

B

BQ

B

BQ

I

+

++=

e

o

e

o

e

o

e

o2

e

o

optBSC

2

where Bo and Be are, respectively, the optical bandwidth and the electrical bandwidth of

the system at the detector. As Q increases, the likelihood ratio

$$\frac{p(I_{\text{optBSC}} \mid x_k = 0)}{p(I_{\text{optBSC}} \mid x_k = 1)}$$

increases exponentially as shown in Fig. 3.13 for Bo/Be = 3, while the ideal value in turbo

decoding should be 1. Hence, for punctured TC, the BSC model performs even worse at

higher Q.
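The growth of this likelihood ratio can be illustrated with a short calculation. The sketch below rests on several assumptions that are not taken from Eqs. (2.4) and (2.9), which are not reproduced here: the Gaussian model is moment-matched to the chi-square pdfs of Eqs. (3.31) and (3.32), Q is taken as (µ1 – µ0)/(σ1 + σ0), and the equal-transition-probability threshold is (σ0µ1 + σ1µ0)/(σ0 + σ1). All function names and parameter values are illustrative.

```python
import math


def gaussian_pdf(x, mean, sigma):
    return (math.exp(-0.5 * ((x - mean) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi)))


def moments(M, E, N0):
    """Means and standard deviations of the space and mark chi-square pdfs."""
    mu0, sigma0 = M * N0, math.sqrt(M) * N0
    mu1, sigma1 = E + M * N0, math.sqrt(2.0 * E * N0 + M * N0 * N0)
    return mu0, sigma0, mu1, sigma1


def bsc_threshold(mu0, sigma0, mu1, sigma1):
    """Threshold that equalizes the two Gaussian transition probabilities."""
    return (sigma0 * mu1 + sigma1 * mu0) / (sigma0 + sigma1)


def crossing_threshold(mu0, sigma0, mu1, sigma1, iters=200):
    """Bisection for the point between the means where the two pdfs cross."""
    lo, hi = mu0, mu1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gaussian_pdf(mid, mu0, sigma0) > gaussian_pdf(mid, mu1, sigma1):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def likelihood_ratio(x, mu0, sigma0, mu1, sigma1):
    return gaussian_pdf(x, mu0, sigma0) / gaussian_pdf(x, mu1, sigma1)


for E in (10.0, 40.0, 160.0):            # illustrative signal energies, N0 = 1
    m = moments(3, E, 1.0)
    q = (m[2] - m[0]) / (m[1] + m[3])
    print(q,
          likelihood_ratio(bsc_threshold(*m), *m),
          likelihood_ratio(crossing_threshold(*m), *m))
```

Under these assumptions the ratio at the pdf crossing point is 1, while the ratio at the equal-transition-probability threshold equals σ1/σ0, which exceeds 1 and grows with Q. This is consistent with the trend described for Fig. 3.13.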


Figure 3.13: Likelihood ratio using the hard-decision threshold based on the Gaussian BSC model for Bo/Be = 3.

3.3.3 Simulations of the TC decoders using different channel models

In the simulations, we use a (31, 27, 400) parallel-concatenated-convolutional TC with the encoder and decoder structure depicted in Fig. 3.14. The (31, 27, 400) TC is a rate 1/3 code, where the first two parameters in the parentheses, 31 and 27, are octal numbers representing the structures of the constituent encoders. If we transform the octal numbers 31 and 27 into the binary numbers 11001 and 10111, then the digits of the binary numbers represent the coefficients of the parity-check generator polynomials 1 + D + D^4 and 1 + D^2 + D^3 + D^4. As depicted in Fig. 3.14(a), “31/27” corresponds to the recursive parity-check generator polynomial (1 + D + D^4)/(1 + D^2 + D^3 + D^4).
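The octal-to-polynomial conversion described above can be sketched as follows. This is a minimal illustration; the function names are hypothetical, and the convention that the leftmost binary digit is the coefficient of D^0 is taken from the 11001 and 10111 examples in the text.

```python
def octal_to_exponents(octal_str):
    """Return the exponents of D that have coefficient 1.

    The binary expansion is read left to right as the coefficients
    of D^0, D^1, D^2, ... (the convention used in the text).
    """
    bits = bin(int(octal_str, 8))[2:]
    return [i for i, b in enumerate(bits) if b == "1"]


def poly_to_str(exponents):
    """Format a list of exponents as a polynomial in D."""
    terms = []
    for e in exponents:
        if e == 0:
            terms.append("1")
        elif e == 1:
            terms.append("D")
        else:
            terms.append(f"D^{e}")
    return " + ".join(terms)


feedforward = octal_to_exponents("31")  # binary 11001
feedback = octal_to_exponents("27")     # binary 10111
print(poly_to_str(feedforward), "/", poly_to_str(feedback))
```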

A 400-bit interleaver is used between the two constituent encoders shown in Fig. 3.14(a). The major purposes of using an interleaver are [67]: (1) to generate a long block code from convolutional codes with small memory length, and (2) to decorrelate the two parity check sequences so that an iterative suboptimum decoding algorithm based on information exchange between the two constituent decoders can be applied.

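Figure 3.14: Turbo code encoder and decoder structure. (a) Turbo code encoder: the input u_k feeds two recursive constituent encoders (1 + D + D^4)/(1 + D^2 + D^3 + D^4), the second through a 400-bit interleaver, producing x^{1p}_k and x^{2p}_k; a puncturing mechanism then outputs the encoded data sequence. (b) Turbo code decoder: MAP decoder 1 and MAP decoder 2, with inputs y^s, y^{1p}, and y^{2p}, exchange the extrinsic information L^e_{12} and L^e_{21} through a 400-bit interleaver and a 400-bit deinterleaver.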

In the turbo encoder, for each input information bit u_k, there are two parity check bits, x^{1p}_k and x^{2p}_k, generated by the two parallel concatenated convolutional encoders, respectively. Thus, we have a rate 1/3 code. To achieve a higher code rate, a puncturer can be added at the output of the turbo encoder. The puncturing operation can be represented by a puncturing matrix, in which each column represents an output block, with the element in the first row corresponding to the information bit and the other elements corresponding to the parity check bits. A “0” element in the puncturing matrix means that the corresponding information bit or parity check bit is deleted according to the puncturing mechanism. Similarly, a “1” means that the corresponding bit is transmitted. The puncturing matrices for the rate 1/2 and rate 3/4 punctured TCs are shown in Eqs. (3.38) and (3.39), respectively,

Puncturing matrix (rate 1/3 to rate 1/2): 3 × 2 binary matrix, entries in extracted order 0 1 1 1 0 1, (3.38)

Puncturing matrix (rate 1/3 to rate 3/4): 3 × 3 binary matrix, entries in extracted order 1 0 0 0 0 1 0 1 1. (3.39)
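The puncturing operation described above can be sketched as below. The pattern used is a common illustrative rate 1/2 pattern and is an assumption; it is not claimed to be the matrix of Eq. (3.38). The function names are hypothetical.

```python
from fractions import Fraction


def puncture(info, parity1, parity2, pattern):
    """Apply a puncturing pattern to the three turbo encoder output streams.

    pattern has three rows (information, parity 1, parity 2) and one column
    per time slot of the puncturing period; 1 = transmit, 0 = delete.
    """
    period = len(pattern[0])
    streams = (info, parity1, parity2)
    out = []
    for k in range(len(info)):
        col = k % period
        for row, stream in enumerate(streams):
            if pattern[row][col] == 1:
                out.append(stream[k])
    return out


def punctured_rate(pattern):
    """Information bits per transmitted bit over one puncturing period."""
    period = len(pattern[0])
    kept = sum(sum(row) for row in pattern)
    return Fraction(period, kept)


# Illustrative pattern (assumption): keep every information bit and
# alternate between the two parity streams.
ILLUSTRATIVE_RATE_HALF = [[1, 1],
                          [1, 0],
                          [0, 1]]

sent = puncture(["u0", "u1", "u2", "u3"],
                ["a0", "a1", "a2", "a3"],
                ["b0", "b1", "b2", "b3"],
                ILLUSTRATIVE_RATE_HALF)
print(sent, punctured_rate(ILLUSTRATIVE_RATE_HALF))
```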

As shown in Fig. 3.14(b), the iterative turbo decoder consists of two serially concatenated constituent decoders, between which there is a 400-bit interleaver identical to the one in the turbo encoder shown in Fig. 3.14(a). The first decoder uses MAP decoding on the received information sequence y^s and the parity check sequence y^{1p} generated by the first encoder and passes the soft extrinsic information L^e_{12} to the second MAP decoder via the interleaver. Then, the second decoder uses MAP decoding on the interleaved information sequence and the parity check sequence y^{2p} generated by the second encoder, with an improved estimate of the a priori probabilities of the information sequence. The soft extrinsic information L^e_{21} produced by the second MAP decoder is then transferred to the first decoder as improved a priori knowledge of the information sequence. Thus, iterative MAP decoding is constructed via the information exchange between the two constituent MAP decoders.

We simulate the performance of the turbo code with BCJR (MAP) decoding algorithms designed based on the chi-square, Gaussian, and Gaussian BSC models of the optical fiber channel with ASE noise. In the simulations, chi-square distributed ASE noise is added to the optical fiber transmission line. We repeat the simulations for different code rates by puncturing the rate 1/3 turbo code.

In Fig. 3.15, the decoded BER is plotted as a function of the Q factor. In all the simulations, the Q factor is evaluated based on the encoded data sequence instead of the original uncoded data sequence. In other words, the Q factor shown in Fig. 3.15 is equivalent to Es/N0 instead of Eb/N0, where Es represents the average energy per line symbol, Eb represents the average energy per information bit, and N0/2 is the two-sided noise spectral density of the channel. The results show that the modified TC decoder can achieve more than 2 dB of extra coding gain compared to the TC decoder based on the Gaussian approximation. It is also shown that the rate 3/4 punctured TC based on the Gaussian BSC model fails to improve upon the BER of uncoded data. The uncoded data, here, is transmitted at the same signaling rate as the encoded data. The performance divergence of the rate 3/4 punctured TC from the BER curve of the uncoded data agrees with the discussion of the effect of the BSC model on the punctured TC in the previous section.
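The distinction between Es/N0 and Eb/N0 can be made concrete with a small sketch. It assumes binary signaling, so that each line symbol carries a fraction of an information bit equal to the code rate and Eb = Es / rate; the function name is hypothetical.

```python
import math


def eb_minus_es_db(code_rate):
    """Offset in dB between Eb/N0 and Es/N0 for a given code rate.

    Assumes Eb = Es / code_rate, so Eb/N0 exceeds Es/N0 by
    10*log10(1/code_rate) dB.
    """
    return 10.0 * math.log10(1.0 / code_rate)


for name, rate in (("1/3", 1 / 3), ("1/2", 1 / 2), ("3/4", 3 / 4)):
    print(f"rate {name}: Eb/N0 = Es/N0 + {eb_minus_es_db(rate):.2f} dB")
```

Under this assumption, curves plotted against Es/N0 would shift to the right by the printed offset if they were replotted against Eb/N0.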


Figure 3.15: Output BER comparison of the turbo code (31, 27, 400) decoder based on the chi-square model (solid), the Gaussian model (dotted), and the Gaussian BSC model (dashed) of the ASE noise in the optical fiber transmission system, and the BER before decoding (dash-dot). The rate 1/2 and rate 3/4 codes are punctured versions of the rate 1/3 turbo code. (Horizontal axis: Q^2 in dB; vertical axis: log10(BER).)

3.4 Summary

In the highest level studies, we evaluated the lower performance bound (Shannon

limit) for general FEC codes based on the chi-square BAC, the Gaussian BAC, and the

Gaussian BSC models of optical fiber channels. We showed that the use of the simpler

channel models, the Gaussian BAC and Gaussian BSC, is a convenient way to calculate

the BER of optical fiber transmission systems without FEC coding, but is inappropriate

for evaluating the lower bound on FEC code performance. Although the Gaussian BAC

model gives acceptable estimates of the lower bound on performance at code rates higher


than 0.8, it underestimates the lower bound (predicts lower BER or Q) by about 0.4 dB in Q^2 at a code rate of 0.5. The problem becomes more severe at lower code rates. The Gaussian BSC model, however, overestimates the lower bound on FEC code performance (predicts higher BER or Q) by about 0.4 to 0.5 dB at all studied code rates from 0.5 to 0.9. Thus, the maximum coding gain achievable with an FEC code is overestimated at low code rates by the Gaussian BAC model and always underestimated by the Gaussian BSC model. A more accurate determination of the lower bound, i.e., the Shannon limit,

on performance for optical fiber channels dominated by ASE noise is possible with the

chi-square BAC model. A better calculation of the achievable coding gain and how close

a code approaches the Shannon limit is an important step in the search for efficient FEC

codes for optical fiber transmission systems.

In the middle level studies of linear code performance, we derived the upper bounds on

linear code performance in optical fiber channels with ASE as the dominant source of

noise. A general upper bound of the pairwise error probability, Pd, in asymmetric chan-

nels was obtained, and the corresponding bound of Pd for optical fiber channels with

dominant ASE noise was evaluated.

With two example codes, we investigated the accuracy of their performance using the

Gaussian approximation of ASE noise instead of the exact chi-square model. We showed

that the Gaussian approximation model overestimates (predicts lower BER or Q) at low

Q and underestimates at high Q the upper bounds on the linear code performance in the

optical fiber channel. The underestimate can be as high as 2 dB in Q^2 at a BER of 10^-12, and becomes larger for lower BER. The resulting performance bounds also suggest that, with similar code rate and block length, the TPC is a better choice than the TCC in optical fiber channels requiring very low BER.

Based on these results, we conclude that accurate noise statistics are critical in the

performance evaluation for turbo codes, which require a priori knowledge of the channel

noise distribution in the decoding. We can also conclude that the derived upper bound on

code performance is a useful tool in the selection and design of linear codes for channels

with asymmetric marks and spaces distributions.

In the lowest level studies of turbo code performance, we discussed the effects of dif-

ferent ASE noise models on the performance of the turbo code decoder. We showed that

if one uses a decoder assuming Gaussian noise statistics for a channel that actually has a

chi-square noise distribution, the performance of the decoder significantly degrades. The

performance degradation for the rate 1/2 turbo code can be more than 2 dB in Q^2 at a BER of 10^-6, and becomes larger at lower BER. Moreover, if the Gaussian BSC approximation is combined with puncturing to obtain a high-rate turbo code, the resulting punctured turbo code may cause a coding loss instead of the expected coding gain. Based on these results, we conclude that using accurate channel noise statistics in the iterative MAP decoding algorithm is critical to achieving the expected coding gain from a turbo code. We recommend that MAP decoding chip-sets designed for a Gaussian channel not be used in non-Gaussian optical fiber transmission systems without modification if one wants the best possible code performance.


Chapter 4

A sliding window criterion (SWC) line-code for

mitigating soliton-soliton collision induced errors

The optical soliton is an optical pulse that can propagate undistorted in dispersive nonlinear optical fibers under specific pulse power and pulse shape conditions [16]. In WDM soliton systems, soliton-soliton collisions (SSC) are a major nonlinear effect that causes both timing jitter and amplitude fluctuation and, thus, limits the achievable system data rate and transmission distance. Unlike the ASE noise, which causes random errors, soliton-soliton collisions cause correlated errors that are highly data-pattern dependent, as discussed in Chapter 2. Based on the particular characteristics of the data-pattern dependence of the SSC-induced errors, we introduce a new line-coding design criterion, the sliding window criterion (SWC), and develop a new line-coding scheme to mitigate the SSC-induced errors. The SWC code mitigates the SSC-induced errors by reshaping the data pattern according to the SWC.

In this chapter, we first investigate the limitations of FEC codes, specifically the Reed-Solomon (RS) codes, in correcting SSC-induced errors. This then leads us to introduce line-coding to resolve this problem. We introduce the new concepts related to the SWC codes. Then we describe the two encoding algorithms, block-based and trellis-based, developed for SWC codes. We also discuss the concatenation of the SWC code with an RS code to achieve the very low BER (< 10^-11) required by optical fiber communications. Finally, we evaluate and compare, via simulations, the performance of the proposed concatenated RS/SWC code with that of a couple of RS codes and a concatenated RS/convolutional code.

4.1 Reed-Solomon code without line-coding

Reed-Solomon (RS) codes are increasingly used in optical fiber systems. Here, we study their performance in the presence of SSC-induced errors, and we then explain the motivation for a concatenated coding scheme in which our SWC line-code is used as the inner code and an RS code is used as the outer code.

As discussed in Chapter 2, bit errors caused by soliton-soliton collisions are highly correlated and, thus, are bursty. RS codes are designed to correct burst errors of limited length [55]. Thus, it appears that an RS code might be a good solution for correcting SSC-induced errors. However, the actual performance of RS codes in an optical fiber channel with a high SSC-induced BER is limited, as shown below.

There are two ways to obtain stronger RS codes. One is to use longer codewords, and

the other is to introduce a larger redundancy by reducing the code rate. RS codes with

very long codewords are difficult to implement in a practical system, especially at the

very high data rates that are present in optical fiber communications systems. Moreover,

increasing the redundancy of the RS code does not always improve the performance for

SSC-induced errors, as we show next.


First, we need to evaluate the probability distribution of a complete SSC-induced time shift in a two-channel system. Let X(nT) denote the SSC-induced time-shift process, which can be expressed as

\[ X(nT) = W(n) + W(n-1) + \cdots + W(n - N_{\max} + 1), \tag{4.1} \]

where W_n ≡ W(n) is a Bernoulli random variable representing the arrival time shift of a soliton in one channel induced by the interference with a soliton in the other channel. The probability mass function (PMF) of W_n is given by

\[ P(W_n = \delta t) = p, \qquad P(W_n = 0) = 1 - p, \]

where p is the probability of individual marks in the transmitted data sequence. The probability mass function, variance, and expectation of X_n ≡ X(nT) are given by

\[ P(X_n) = \binom{N_{\max}}{X_n/\delta t}\, p^{N} (1-p)^{N_{\max}-N}, \tag{4.2} \]

\[ \mathrm{Var}\{X_n\} = p(1-p)\,\delta t^{2} N_{\max}, \tag{4.3} \]

\[ E\{X_n\} = p\,\delta t\,N_{\max}. \tag{4.4} \]
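As a numerical check of the mean and variance expressions, the sketch below builds the PMF of a sum of N_max independent Bernoulli time shifts and compares its moments with p·δt·N_max and p(1 − p)·δt²·N_max. The parameter values and function names are hypothetical, chosen only for illustration.

```python
from math import comb


def time_shift_pmf(n_max, p, dt):
    """PMF of X = dt * K, where K ~ Binomial(n_max, p) counts collisions."""
    return {k * dt: comb(n_max, k) * p**k * (1 - p) ** (n_max - k)
            for k in range(n_max + 1)}


def moments(pmf):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(x * w for x, w in pmf.items())
    var = sum((x - mean) ** 2 * w for x, w in pmf.items())
    return mean, var


# Hypothetical values: 40 collisions at most, equiprobable marks,
# time shift of 0.25 (arbitrary units) per collision.
n_max, p, dt = 40, 0.5, 0.25
mean, var = moments(time_shift_pmf(n_max, p, dt))
print(mean, p * dt * n_max)
print(var, p * (1 - p) * dt**2 * n_max)
```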

When the number of channels increases, N_max will be large and δt_min will be small. In this case, the central limit theorem implies that P(X_n) approaches a normal distribution N(E{X_n}, Var{X_n}). The distribution of SSC-induced time shifts, therefore, can be approximated by the normal distribution N(µ, σ) as shown in Fig. 4.1, where µ = pδtN_max, σ^2 = p(1 − p)δt^2 N_max, T_r is the duration of the signal receiving-window at the receiver, and T is the time slot for one symbol. Generally, the center of the signal receiving-window is set at the mean of the time shifts. Detection errors are induced when solitons shift outside the signal receiving-window. Hence, the probability of SSC-induced errors can be estimated by integrating the normal pdf outside the receiving-window, which gives

\[ \mathrm{BER}_{\mathrm{un}} = \mathrm{erfc}\!\left( \frac{T_r/2}{\sqrt{2}\,\sigma} \right), \]

where erfc(x) is the complementary error function. Let a = T_r/(2T); in uncoded systems, T = 1/F, where F is the data rate. Thus, the SSC-induced BER of the received uncoded data sequence can be estimated by

\[ \mathrm{BER}_{\mathrm{un}} = \mathrm{erfc}\!\left( \frac{a/F}{\sqrt{2p(1-p)\,\delta t^{2} N_{\max}}} \right). \tag{4.5} \]

Figure 4.1: Approximated distribution of SSC-induced time shift. (Sketch: normal pdf N(µ, σ) of the SSC-induced time shift, centered at µ; the receiving window spans –T_r/2 + µ to T_r/2 + µ, inside the one-symbol time slot from –T/2 + µ to T/2 + µ.)

For an FEC code with code rate r, the signaling rate for a fixed data rate F increases to F/r, so that the maximum number of collisions for each soliton increases to N'_max = N_max/r. Thus, the SSC-induced BER of the received FEC encoded data sequence becomes

\[ \mathrm{BER}_{\mathrm{FEC}} = \mathrm{erfc}\!\left( \frac{a/F}{\sqrt{2 r^{-3} p(1-p)\,\delta t^{2} N_{\max}}} \right), \tag{4.6} \]

and the ratio BER_FEC/BER_un increases very rapidly with increased redundancy because r < 1 yields BER_FEC > BER_un. Even though the error correction capability of the FEC code increases with redundancy, the price to be paid in this case is an increased SSC-induced BER of the received data sequence. Hence, the FEC code can only improve the system performance as long as the increase of its error correction capability is greater than the degradation of the channel due to the increased transmission bit rate. We conclude that increasing the redundancy of an FEC code does not always imply better performance. Indeed, there is an optimal code rate at which the FEC code achieves the best performance in correcting SSC-induced errors. This statement is true for FEC codes in general and, hence, holds for RS codes as well as for convolutional codes.
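The dependence of the SSC-induced BER on the code rate can be explored with a short sketch of Eqs. (4.5) and (4.6). The parameter values below (a, F, p, δt, N_max) are hypothetical placeholders, not the values used in this chapter, and the function name is an assumption.

```python
import math


def ssc_ber(r, a, F, p, dt, n_max):
    """SSC-induced BER before decoding for code rate r, following Eq. (4.6).

    Setting r = 1 gives the uncoded case of Eq. (4.5).
    """
    denom = math.sqrt(2.0 * r**-3 * p * (1.0 - p) * dt**2 * n_max)
    return math.erfc((a / F) / denom)


# Hypothetical parameters: a = Tr/(2T), data rate in bit/s, mark probability,
# time shift per collision in seconds, maximum number of collisions.
params = dict(a=0.4, F=10e9, p=0.5, dt=2e-12, n_max=100)

ber_un = ssc_ber(1.0, **params)
for r in (1.0, 0.9, 0.8, 0.7, 0.5):
    ber = ssc_ber(r, **params)
    print(f"r = {r:.1f}: BER = {ber:.3e}, BER/BER_un = {ber / ber_un:.1f}")
```

With any such parameter set, lowering r shrinks the erfc argument by a factor r^(3/2), which is why the channel BER rises quickly as redundancy is added.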

Figure 4.2 plots the error correction capability of the RS (255, m) codes and the corresponding estimated BERs of the received RS encoded data sequence before decoding. We use the highest BER of the received data sequence that is error free after decoding as the measure of the RS code error correction capability. The error correction capability of the RS (255, m) codes is a function of the code redundancy, defined as k = 255 – m in our calculations. The RS (255, m) codes have a codeword length of 255 symbols, and each symbol has 8 bits. The upper bound on the error correction capability of the RS (255, m) codes plotted in Fig. 4.2 corresponds to the case in which, when a symbol error occurs, all 8 bits in the symbol are in error. At the other extreme, the lower bound corresponds to the case in which, when a symbol error occurs, only one bit in the 8-bit symbol is in error.

An RS (255, m) code can correct up to (255 – m)/2 = k/2 symbol errors in a codeword comprising 255 symbols [55]. Considering that SSC-induced errors are correlated, we assume that the average number of bit errors in each symbol error equals 2; thus, the equivalent number of correctable bit errors equals k. Based on this assumption, we evaluate the error correction capability (BER_ECC) of the RS (255, m) codes as a function of the code redundancy k = 255 – m, which is given by BER_ECC = k/(255×8). We evaluate the SSC-induced BER of the RS (255, m) encoded data sequence as a function of the code redundancy k by replacing the code rate r in Eq. (4.6) with 1 – k/255.
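The margin calculation described above can be sketched as follows. Only the 12.5 Gb/s data rate comes from the caption of Fig. 4.2; the remaining channel parameters are hypothetical, so the optimum found by this sketch is not expected to reproduce the values read from the figure. Function names are assumptions.

```python
import math


def ssc_ber(r, a=0.4, F=12.5e9, p=0.5, dt=1.5e-12, n_max=100):
    """SSC-induced BER before decoding, Eq. (4.6); defaults are hypothetical
    except F, which is the 12.5 Gb/s data rate quoted for Fig. 4.2."""
    return math.erfc((a / F) / math.sqrt(2 * r**-3 * p * (1 - p) * dt**2 * n_max))


def rs_ecc_ber(k, bits_per_symbol_error=2):
    """Correctable BER of RS(255, 255 - k): k/2 symbol errors per codeword
    of 255 eight-bit symbols, with an assumed number of bit errors per
    symbol error (2 gives k / (255 * 8))."""
    return (k / 2) * bits_per_symbol_error / (255 * 8)


def margin(k):
    """Error correction capability minus channel BER at rate 1 - k/255."""
    return rs_ecc_ber(k) - ssc_ber(1 - k / 255)


# Search even redundancies only, since k/2 symbol errors must be an integer.
best_k = max(range(2, 128, 2), key=margin)
print(best_k, margin(best_k))
```

Passing `bits_per_symbol_error=8` or `bits_per_symbol_error=1` to `rs_ecc_ber` gives the upper and lower bounds described in the text.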

Figure 4.2: SSC-induced BERs before RS decoding and error correction capability of RS (255, m) codes as a function of redundancy k = 255 – m at the data rate of 12.5 Gb/s. Dashed: upper and lower bounds on the error correction capability of RS (255, m) codes. Circles: error correction capability of RS (255, m) codes when the average number of bit errors in each symbol error equals 2. Triangles: received SSC-induced BER before RS decoding. Stars: margin of error correction of RS (255, m) codes. (Horizontal axis: RS (255, m) code redundancy k = 255 – m; vertical axis: log10(BER).)

As shown in Fig. 4.2, the RS-code error correction capability increases more slowly than the SSC-induced BER does as the code redundancy k = 255 – m increases. These two curves cross over when the RS code has a redundancy of k = 56, corresponding to RS (255, 199). To achieve the largest error correction margin, which is defined as the difference between the error correction capability and the BER before RS decoding, the optimal redundancy of the RS (255, m) codes for correcting SSC-induced errors is k = 36. We can see that RS codes have a limited error correction capability to combat SSC-induced bit errors and, hence, their use is not efficient in WDM systems with a high SSC-induced BER. However, if a line-code is first used to decrease the SSC-induced BER to a low level, then very low BERs can be achieved by using an RS code with little redundancy. This idea leads to the concatenated RS/SWC codes described in the following section.

4.2 SWC code

The goal of the SWC code is to minimize the deviation of the number of collisions

each soliton may experience and, hence, minimize the timing jitter and the bit error rate.

Consider the two-channel WDM system shown in Fig. 4.3, where the maximum number

of collisions for each soliton is N12. Each soliton in one of the two channels will interact

with a bit block of length N12 in the other channel along the entire optical fiber path. If all

blocks with N12 bits in the other channel have “almost” the same number of marks, then

solitons in the first channel will experience “almost” the same number of collisions.

Based on this observation, the problem of minimizing the deviation in the number of col-

lisions can be transformed into an encoding problem in which the goal is to minimize the


deviation in the number of marks in each block of length N12. A simple binary block code

can be constructed to achieve this goal in which all the codewords have N12 bits, and each

has the same number of marks.

Figure 4.3: Soliton-soliton collision in a two-channel WDM system; the rectangular block is defined as the sliding window. (Sketch: bit streams of Channel 1 and Channel 2, with bit period T, along the fiber path from the transmitter end to the receiver end.)

In order to make any block with N12 bits in the encoded data stream have almost the

same number of marks, the pattern of the encoded data at the beginning and the end of

codewords in the encoded data stream must be taken into account as well. We introduce

the following concepts in order to construct the block SWC codes:

Fragmental

A binary block is fragmental if it has at least one transition from mark to space or from space to mark. A codeword is n-bit fragmental if every n-bit block in the codeword is fragmental.

Fragmentation degree (FD)

The n-bit fragmentation degree of a binary codeword is defined as

\[ \mathrm{FD}_n = \frac{m}{l - n + 1} \in [0, 1], \tag{4.7} \]

where l is the length of the codeword and m is the number of (overlapping) n-bit fragmental blocks in the codeword.

Fragmental end

A binary codeword has n-bit fragmental ends if its first n bits and its last n bits are each fragmental.

The following are two examples of these new concepts.

Examples:

For codeword “10100110”, l = 8, n = 3, m = 6, FD_3 = m/(l – n + 1) = 1, and it has 3-bit fragmental ends.

For codeword “11110000”, l = 8, n = 3, m = 2, FD_3 = m/(l – n + 1) = 1/3, and it does not have 3-bit fragmental ends.
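The definitions above translate directly into a few lines of code. The sketch below, with hypothetical function names, represents codewords as strings of "0" and "1" and reproduces the two examples.

```python
def is_fragmental(block):
    """A block is fragmental if it contains at least one 0/1 transition."""
    return "01" in block or "10" in block


def fragmentation_degree(codeword, n):
    """n-bit fragmentation degree, Eq. (4.7): the fraction of the
    l - n + 1 overlapping n-bit blocks that are fragmental."""
    windows = [codeword[i:i + n] for i in range(len(codeword) - n + 1)]
    m = sum(is_fragmental(w) for w in windows)
    return m / len(windows)


def has_fragmental_ends(codeword, n):
    """True if both the first n bits and the last n bits are fragmental."""
    return is_fragmental(codeword[:n]) and is_fragmental(codeword[-n:])


for cw in ("10100110", "11110000"):
    print(cw, fragmentation_degree(cw, 3), has_fragmental_ends(cw, 3))
```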

We define the sliding window criterion (SWC) to test the performance of SWC codes as

\[ J_L = \mathrm{var}(K_L), \tag{4.8} \]

where K_L is the number of marks in a sliding window of length L. We can now define the rules for constructing the SWC code look-up table as follows. Select codewords with: (1) similar weights, (2) high fragmentation degrees, and (3) fragmental ends. A smaller J_L implies better satisfaction of the rules.


4.3 Block and trellis-based SWC codes

Based on the SWC, we develop two different types of SWC codes –– the block- and

trellis-based SWC codes. The block SWC code has a standard block code structure and,

thus, it has a simple implementation with codeword look-up table for encoding and de-

coding. By contrast, the trellis-based SWC code has a convolutional code structure,

which can thus improve its performance by increasing the memory depth while keeping a

short codeword length.

4.3.1 Block SWC codes

Both the length of the SWC codeword and the length of the sliding window should be

taken into consideration in constructing the best codeword look-up table defining a block

code in the SWC sense. The sliding window length L by definition is set to the maximum

number of expected collisions between the two neighboring channels: N12. Hence, ac-

cording to Eq. (2.20), L depends on the bit rate F, channel spacing ∆λ, transmission dis-

tance Z, optical fiber dispersion D, and dispersion map in dispersion-managed soliton

systems. Given the sliding window length L, the codeword length N and the code rate R = M/N, where M is the data-word length, must be determined by taking the system framing structure and available bandwidth into account. If the SWC codeword is much shorter than the sliding window, there may be several codewords within the sliding window. In this case, the SWC depends more on the numbers of marks in the codewords than on their fragmentation degrees. Hence, rule (1) for SWC codeword look-up table construction should be weighted more heavily than rule (2).


Conversely, if the SWC codeword is longer than the sliding window, there is less than one codeword within the sliding window. In this case, the SWC depends more on the fragmentation degrees than on the numbers of marks in the codewords, and rule (2) should be emphasized over rule (1). This point is illustrated in the

following example: Two 24-bit blocks of four 6-bit codewords each are given in Table

4.1. The difference between the two blocks is that the four codewords in the first block

have a higher fragmentation degree, but a different number of marks. The codewords in

the second block have the same weight, but a lower fragmentation degree. We evaluate

these two blocks with a 3-bit sliding window and a 12-bit sliding window, respectively.

The results are shown in Table 4.1.

Table 4.1: SWC codeword examples

                              | Data block 1                  | Data block 2
Codewords                     | 001010 010010 101101 101011   | 111000 100011 001110 000111
3-bit FD                      | 1  1  1  1                    | 0.5  0.75  0.75  0.5
Number of marks in codewords  | 2  2  4  4                    | 3  3  3  3
Number of marks in 3-bit SW   | 1121111111212222222122        | 3210111012211232100123
Number of marks in 12-bit SW  | 4 5 5 5 6 5 6 7 6 7 7 7 8     | 6 5 4 4 5 6 6 5 5 5 6 6 6
SWC testing result            | Better for 3-bit SW           | Better for 12-bit SW
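The sliding-window rows of Table 4.1 and the resulting SWC comparison can be reproduced with the sketch below. It assumes that var(K_L) in Eq. (4.8) is taken as the population variance over all window positions in the block; the text does not state which estimator is used, and the function names are hypothetical.

```python
def window_marks(bits, L):
    """Number of marks in every length-L sliding window over the bit string."""
    return [bits[i:i + L].count("1") for i in range(len(bits) - L + 1)]


def swc(bits, L):
    """J_L = var(K_L), here the population variance of the window counts."""
    counts = window_marks(bits, L)
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 for c in counts) / len(counts)


block1 = "001010" "010010" "101101" "101011"
block2 = "111000" "100011" "001110" "000111"

for L in (3, 12):
    j1, j2 = swc(block1, L), swc(block2, L)
    better = "block 1" if j1 < j2 else "block 2"
    print(f"L = {L}: J(block 1) = {j1:.3f}, J(block 2) = {j2:.3f}, better: {better}")
```

A smaller J_L indicates a more uniform number of marks per window, which is the sense in which one block is "better" for a given window length.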

For each of the two weighting criteria –– achieving higher fragmentation degree or

similar weights of codewords –– we introduce a corresponding algorithm to generate the


codeword look-up tables. The first one is called the fragmentation-first (FF) algorithm,

and the second one is called the weight-first (WF) algorithm. We construct the codeword

look-up table with the FF algorithm for N > L and the WF algorithm for N ≤ L, respec-

tively. The flow diagrams for the FFMBNB and WFMBNB algorithms are shown in Fig.

4.4, where FFMBNB and WFMBNB represent the block SWC codes having M-bit data

word and N-bit codeword based on the FF algorithm and WF algorithm, respectively.

In Fig. 4.4, M is the data-word length, N is the codeword length, p is the mark probability of the original information data sequence, i and j are counters introduced for the calculation of the index of the sections in the codeword look-up table, W is a counter for the number of currently selected codewords in the code look-up table, n is the fragmentation order, J is the total number of sections in the code look-up table in the FF algorithm and the number of sections for each i in the WF algorithm, d_j is the minimum fragmentation degree of codewords in the j-th section, and FD_n is the n-bit fragmentation degree of the codewords.

Both the FF and WF algorithms for the construction of the codeword look-up table of

block SWC codes are based on exhaustive search, but they place different emphases on

high fragmentation degree and similar weights of codewords. The FF algorithm starts

with a very high d1, which is the minimum fragmentation degree of codewords selected

in the first section of the codeword look-up table, and gradually decreases this minimum

fragmentation degree limit to get more sections in the codeword look-up table until ob-

taining sufficient codewords. On the other hand, the WF algorithm starts with codewords

of weight S = N/2, and gradually changes the preferred code weights away from N/2


until obtaining sufficient codewords in the codeword look-up table. Hence, the FF algo-

rithm places more emphasis on high fragmentation degrees, while the WF algorithm

places more emphasis on similar code weights.
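A much-simplified sketch of the weight-first idea is given below. It is not the algorithm of Fig. 4.4: it uses a single fragmentation threshold `d_min` instead of the sequence d_1, ..., d_J, assumes an even codeword length N, and searches exhaustively, so it is only practical for short codewords. All function names are hypothetical.

```python
from itertools import product


def is_fragmental(block):
    return "01" in block or "10" in block


def fd(codeword, n):
    windows = [codeword[i:i + n] for i in range(len(codeword) - n + 1)]
    return sum(is_fragmental(w) for w in windows) / len(windows)


def has_fragmental_ends(codeword, n):
    return is_fragmental(codeword[:n]) and is_fragmental(codeword[-n:])


def weight_order(N):
    """Weights in order of preference: S = N/2, then S + (-1)^i * i."""
    S, i = N // 2, 0
    while 0 <= S <= N:
        yield S
        i += 1
        S += (-1) ** i * i


def weight_first_table(M, N, n, d_min):
    """Collect 2^M codewords, preferring weights near N/2; within each weight,
    codewords with fragmental ends form the earlier (preferred) section."""
    need, table = 2 ** M, []
    for S in weight_order(N):
        same_weight = ["".join(b) for b in product("01", repeat=N)
                       if b.count("1") == S]
        good = [c for c in same_weight if fd(c, n) > d_min]
        table += [c for c in good if has_fragmental_ends(c, n)]
        table += [c for c in good if not has_fragmental_ends(c, n)]
        if len(table) >= need:
            return table[:need]  # discard extra codewords in the last section
    return table


table = weight_first_table(M=4, N=6, n=3, d_min=0.5)
print(list(weight_order(10)))
print(table)
```

A fragmentation-first variant would swap the loop order: iterate over decreasing fragmentation thresholds first and accept any weight within each threshold.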

Figure 4.4: Algorithms for generating the SWC MBNB code look-up table: (a) fragmentation-first algorithm; (b) weight-first algorithm.

Flowchart boxes in (a):
(1) Set n, J, d_1, d_2, ..., d_J for given M, N, p. Let j = 1, W = 0.
(2) Select x codewords with FD_n > d_j and fragmental ends as the (2j – 1)th section of the code table; W = W + x.
(3) Decision: W ≥ 2^M?
(4) Select x codewords with FD_n > d_j and no fragmental end as the (2j)th section.
(5) Decision: W ≥ 2^M?
(6) j = j + 1; decision: j > J?
(7) Discard extra codewords in the last section; keep 2^M codewords.
(8) Obtain the code table.

Flowchart boxes in (b):
(1) Set n, J, d_1, d_2, ..., d_J for given M, N, p. Let j = 1, i = 0, W = 0, S = N/2.
(2) Select x codewords with S marks, FD_n > d_j, and fragmental ends as the (2j – 1 + 2iJ)th section of the code table; W = W + x.
(3) Decision: W ≥ 2^M?
(4) Select x codewords with S marks, FD_n > d_j, and no fragmental end as the (2j + 2iJ)th section.
(5) Decision: W ≥ 2^M?
(6) j = j + 1; decision: j > J?
(7) i = i + 1, j = 1, S = S + (–1)^i i.
(8) Discard extra codewords in the last section; keep 2^M codewords.
(9) Obtain the code table.

Although the two algorithms have their individual emphases, all the three factors ––

(1) similar weights, (2) high fragmentation degrees, and (3) fragmental ends –– are taken

into account in codeword selection. According to the FF and WF algorithms described in

Fig. 4.4, codewords in a section constructed earlier, i.e., with a lower section index value,

are preferred (in the sense of better satisfying the SWC) to codewords in a section con-

structed later, i.e., with a larger section index value.

For any random binary input sequence with equal probability (p = 0.5) of marks and spaces, all codewords are equally likely to be used; hence, the arrangement of codewords in the look-up table does not affect the code performance. However, inspection of real framed data in communications has shown that it is unrealistic to assume that all data words are equally likely [72]. Thus, for a given mark probability p of the input data sequence, we can calculate the probability of an M-bit data word with m marks as p_c = p^m (1 – p)^(M – m). Then, by assigning codewords that better satisfy the SWC, i.e., those in the sections with lower section index values, to data words with higher p_c, better code performance can be achieved. Therefore, in both algorithms, codeword look-up tables are divided into several sections depending on how well the selected codewords satisfy the SWC, and the optimal code can be achieved with appropriate assignments of the codewords to the data words according to the statistics of the source data.
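The assignment rule can be sketched as follows: data words are sorted by decreasing probability p_c and paired with codewords listed from best to worst. The codeword list in the example is a placeholder, and the function names are hypothetical.

```python
from itertools import product


def word_probability(word, p):
    """p_c = p^m * (1 - p)^(M - m) for an M-bit word with m marks."""
    m = word.count("1")
    return p**m * (1 - p) ** (len(word) - m)


def assign_codewords(codewords_best_first, M, p):
    """Map the most probable data words to the best codewords.

    codewords_best_first must hold at least 2^M codewords, ordered by
    increasing section index (best SWC satisfaction first).
    """
    data_words = ["".join(b) for b in product("01", repeat=M)]
    # Python's sort is stable, so equally probable words keep binary order.
    data_words.sort(key=lambda w: word_probability(w, p), reverse=True)
    return dict(zip(data_words, codewords_best_first))


# Placeholder codeword labels, best first; p = 0.25 favors words with few marks.
mapping = assign_codewords(["c0", "c1", "c2", "c3"], M=2, p=0.25)
print(mapping)
```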

To compare the proposed SWC to the conventional line-code performance criteria ––

transition density and balanced transmission [53], [54] –– we evaluate the influence of

the SWC code construction algorithms on the power spectral density of the encoded data

sequence using the spectral analysis technique developed by Cariolaro and Tronca [73].

The spectral density of block coded sequences is evaluated by representing the encoding process as a finite-state synchronous sequential machine and using the theory of homogeneous Markov chains to obtain both the continuous and the discrete spectral components. Figure 4.5 plots the continuous power spectral density components of a random sequence, an FF8B10B code, a WF8B10B code, and the Manchester code for comparison.

Figure 4.5: Continuous components of the power spectral densities of the uncoded random signal (solid) and the signals encoded by the FF8B10B (dash-dot), the WF8B10B (dotted), and the Manchester (dotted) codes. (Horizontal axis: ω from 0 to 2π; vertical axis: power spectral density.)

As expected, the power spectral density of the FF8B10B code results in larger compo-

nents at high frequencies and, hence, implies a higher transition density than the

WF8B10B code. However, the WF8B10B code, shown by the lower power of its DC

component, is more balanced than the FF8B10B code. This observation indicates that the

conventional transition density and balance criteria used for line-coding schemes are not

effective measures for evaluating the performance of SWC codes in decreasing SSC-induced errors. Hence, the appropriate performance criterion for SWC codes is the SWC. We have shown in Table 4.1 that, by using the SWC as the performance criterion, the FF SWC codes have better performance for N > L and the WF SWC codes have better performance for N ≤ L, where N is the codeword length and L is the sliding window length. This general claim is also supported by our simulation results, which will be presented in Sec. 4.5.

As shown in Fig. 4.6, the encoding and decoding of the block SWC code can be simply

implemented by writing the codeword look-up table into a memory chip and using the

input data block as the memory address. Thus, the output of the memory chip would be

the encoded data and the encoding speed is determined by the read cycle time of the chip.

Similarly, we can use the received encoded data (hard-decisioned) as the memory address

to implement the decoding.

Currently, a number of high-speed memory chips are commercially available. For ex-

ample, we can use the Motorola MCM64E918 RAM chip (currently in production) with a

19-bit address and an 18-bit output to implement a 16B18B SWC code. The minimum

read-cycle time that can be achieved with this chip is 3 ns; hence, we can achieve 18

bits/3 ns, i.e., 6 Gbps encoding and decoding speeds. By using m of these chips in paral-

lel, we can achieve as high as 6m Gbps encoding and decoding speeds.
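The throughput arithmetic above can be checked with a short sketch. The function name and its parameters are hypothetical, and the linear scaling with the number of parallel chips is the idealized assumption made in the text.

```python
def lookup_throughput_gbps(output_bits, read_cycle_ns, chips=1):
    """Throughput of a look-up-table codec: bits per nanosecond equals Gbps.

    Assumes one output word per read cycle and ideal linear scaling
    when several chips operate in parallel.
    """
    return chips * output_bits / read_cycle_ns


single = lookup_throughput_gbps(18, 3.0)            # one chip, 18-bit output, 3 ns
parallel = lookup_throughput_gbps(18, 3.0, chips=4)  # m = 4 chips in parallel
print(single, parallel)
```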

Figure 4.6: Implementation of the block SWC code. (a) Hardware structure of the block SWC code implementation. (b) Block SWC code encoding example.

[Panel (a): input data sequence, serial-to-parallel conversion (8-bit block), SWC code encoder (memory chip; address A0~A7, data D0~D9), parallel-to-serial conversion (10-bit block), encoded data sequence, optical modulator, optical pulse sequence, fiber channel, received optical pulse sequence, optical detector, detected data sequence, serial-to-parallel conversion (10-bit block), SWC code decoder (memory chip; address A0~A9, data D0~D7), parallel-to-serial conversion (8-bit block), decoded data sequence.]

[Panel (b): SWC code encoder (memory chip; address A0~A7, data D0~D9) with example memory contents 00000000: 1010101010, ..., 00111100: 1100110010, ..., 11111111: 0101010101.]

4.3.2 Trellis-Based SWC Codes

As an alternative to the block SWC code, we develop a trellis-based SWC code using a

more sophisticated encoding algorithm. The trellis-based coding schemes are particularly

promising for this application because of the natural match between the sliding window

nature of the physical effects in optical communications and the operation of trellis-based

encoding [74]–[78].

By using SWC as the metric, we develop a new set of trellis-based codes. In the trellis-

based code approach, we use knowledge of the recent output sequence, equivalently the

recent input sequence and encoder state, to determine the next output sequence compo-

nents, i.e., it is Markovian with a code-designer-chosen memory (finite-state machine).

With the continuous input stream structure of the trellis-based approach, we do not need

to consider the fragmental ends in our development. We investigate the design of SWC-

based trellis codes where the main difference from conventional trellis codes is in the

output coded sequence design using the SWC metric. Trellis-based encoding allows us to

choose from a subset of output sequences at any given symbol interval that satisfy the

SWC with respect to the previous output.

Figure 4.7 shows the general (n, k, m) trellis-based SWC encoder structure, where n is

the codeword length in the codeword look-up tables, k is the input data word length, and

m is the memory depth of the trellis-based SWC encoder. For a trellis-based SWC code

with a sliding window of length L > n, the encoder memory depth m is given by

$$ m = \left\lfloor \frac{L}{n} - 0.5 \right\rfloor . \qquad (4.9) $$


As shown in the figure, there are two main modules in the structure of the trellis-based SWC encoder: the codeword-mapping module and the state-determination module.

The codeword-mapping module includes q codeword look-up tables, $T_0, T_1, \ldots, T_{q-1}$, and a look-up table selector. Codewords in the q codeword look-up tables have different average weights. The state-determination module has n×m memory units storing the m previous output data vectors, $v^1, v^2, \ldots, v^m$, and defining the current trellis state S. The k-bit input binary data vector $u^i = (u_1^i, u_2^i, \ldots, u_k^i)$ is encoded into an n-bit output binary data vector $v^i = (v_1^i, v_2^i, \ldots, v_n^i)$ by using one of the q codeword look-up tables, which is enabled according to the current trellis state S. Let N be the number of marks within the m previous output data vectors, $v^1, v^2, \ldots, v^m$, which is given by

$$ N = \sum_{i=1}^{n} \sum_{j=1}^{m} v_i^j . \qquad (4.10) $$

The current trellis state S, which is updated by the state-determination module, is then a function of N given by

$$ S = f(N) \in \{ S_0, S_1, \ldots, S_{p-1} \} , \qquad (4.11) $$

where p represents the number of possible states.

The basic idea of the proposed trellis-based SWC encoder is: if the previous n×m output bits comprise many (few) marks, then choose a codeword look-up table whose codewords have low (high) weights for the current encoding. Thus, we can minimize the variance of the number of marks in the data blocks comprising the current output codeword and the m previous output data words. According to Eq. (4.9), the current output n-bit codeword plus the m previous output n-bit codewords have a block length similar to, if not the same as, the sliding window length. Therefore, the proposed trellis-based SWC encoding algorithm can minimize JL in Eq. (4.8) and, hence, satisfy the SWC.

Figure 4.7: Function diagram of the trellis-based SWC encoder.

[Diagram: the codeword-mapping module contains the (n, k) look-up tables $T_0, T_1, \ldots, T_{q-1}$ and a look-up table selector with enable lines 1 to q; it maps the input $u^0 = (u_1^0, u_2^0, \ldots, u_k^0)$ to the output $v^0 = (v_1^0, v_2^0, \ldots, v_n^0)$. The state-determination module stores $v^1, v^2, \ldots, v^m$ in n×m memory units, computes $N = \sum_{i=1}^{n} \sum_{j=1}^{m} v_i^j$, and passes the trellis state $S = f(N) \in \{S_0, S_1, \ldots, S_{p-1}\}$ to the selector.]

The encoding operation described above can be represented by a trellis structure as shown in Fig. 4.8. The states of the trellis diagram correspond to the encoder states determined by the number of marks in the n×m memory units in the state-determination module. The labels Ti on the branches represent the look-up table used for the current encoding. Generally, for each state, only one look-up table can be selected. Neighboring states may share the same look-up table. Hence, the number of states, p, is always greater than or equal to the number of look-up tables, q.

Figure 4.8: Trellis diagram of the trellis-based SWC encoder.

[Diagram: states S0, S1, ..., Sp-2, Sp-1 at successive encoding steps, connected by branches labeled with the look-up tables T0, T1, ..., Tp-2, Tp-1.]

As a practical example, consider an encoder for a (4, 3, m) trellis-based SWC code

with a 12-bit sliding window, i.e., codeword length n = 4, input data word length k = 3,

and L = 12. With Eq. (4.9) we obtain m = 2, and the encoder state S is given as

$$ S = f(N) = \begin{cases} S_0, & N \le 4 \\ S_1, & N > 4 . \end{cases} \qquad (4.12) $$


Hence, we have a (4, 3, 2) trellis-based SWC encoder with p = 2 states. We construct two

(4, 3) binary codeword look-up tables, T0 and T1, as shown in Table 4.2. The left column

in Table 4.2 lists all the possible 3-bit input data words u, and the other two columns list

the corresponding 4-bit codewords for S = S0 and S = S1, respectively. The selection of

codeword look-up table depends only on the encoder state S. When S = S0 the left and

middle columns in Table 4.2 construct the codeword look-up table T0. Similarly, when S

= S1 the left and right columns in Table 4.2 construct the codeword look-up table T1.

Table 4.2: Codeword look-up tables for a trellis-based SWC code

u      v in T0 (S = S0)    v in T1 (S = S1)
000    1110                0001
001    1101                0010
010    1011                0100
011    1010                1010
100    0101                0101
101    0110                0110
110    1001                1001
111    0011                0011

As defined in Eq. (4.12), S = S0 indicates that there are fewer than 5 marks in the previous m = 2 output 4-bit codewords; hence, codeword look-up table T0, which has the higher average codeword weight, should be selected for the current encoding. Conversely, S = S1 indicates that there are more than 4 marks in the previous 2 output codewords; hence, codeword look-up table T1, which has the lower average codeword weight, should be selected for the current encoding. Thus, we can adaptively decrease the variance of the number of marks in any 3 consecutive codewords and, hence, in the 12-bit sliding window. This (4, 3, 2) SWC encoder has a simple two-state trellis as shown in Fig. 4.9.

Figure 4.9: Trellis of the (4, 3, 2) SWC encoder.

We initialize v1 and v2 stored in the memory with “1010”, so that the initial value of N

equals 4 and look-up table T0 is chosen for the encoding of the first input data word. Fol-

lowing the encoding procedure, we can get all the possible combinations of the number

of marks in v0, v1, and v2 as described in Fig. 4.10.
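The encoding procedure of this (4, 3, 2) example can be sketched as follows. The tables are those of Table 4.2 and the memory is initialized with "1010" as described above; the function names are hypothetical, and `memory_depth` assumes that Eq. (4.9) rounds L/n − 0.5 down, which reproduces m = 2 for L = 12 and n = 4.

```python
import math

# Look-up tables of Table 4.2, indexed by the 3-bit data word u = 0..7.
T0 = ["1110", "1101", "1011", "1010", "0101", "0110", "1001", "0011"]  # state S0
T1 = ["0001", "0010", "0100", "1010", "0101", "0110", "1001", "0011"]  # state S1


def memory_depth(L, n):
    """Memory depth m, assuming Eq. (4.9) rounds L/n - 0.5 down."""
    return math.floor(L / n - 0.5)


def trellis_encode(data_words, L=12, n=4, init="1010"):
    """Encode 3-bit data words (integers 0..7) into 4-bit codewords."""
    m = memory_depth(L, n)
    memory = [init] * m                       # v1, ..., vm, newest first
    encoded = []
    for u in data_words:
        N = sum(word.count("1") for word in memory)  # marks in memory, Eq. (4.10)
        table = T0 if N <= 4 else T1                 # state rule, Eq. (4.12)
        v0 = table[u]
        encoded.append(v0)
        memory = ([v0] + memory)[:m]                 # v0 becomes v1, oldest word drops out
    return encoded


print(trellis_encode([0, 0, 0, 7]))
```

For the input words 000, 000, 000, 111 the states visited are S0, S1, S0, S0, so the same data word 000 is mapped to a heavy codeword, then a light one, then a heavy one again.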

Let W(vi) be the weight of binary data vector vi, i.e., the number of marks in vi. Figure 4.10 depicts the possible weight combinations of v0, v1, and v2 with a tree structure comprising nodes and one-direction branches. At each node, there are three entries, from left to right corresponding to W(v0), W(v1), and W(v2), respectively. Because the previous 2 output codewords v1 and v2 are stored in the memory, W(v1) and W(v2) are always known and determine the encoder state S. Given the encoder state S, either T0 or T1 is selected for the current encoding. From Table 4.2 we can see that if the current encoding is based on T0, then W(v0) ∈ {2, 3} for all 8 possible input data words. On the other hand, if the current encoding is based on T1, then W(v0) ∈ {1, 2}.


Figure 4.10: Possible combinations of number of marks in v0, v1, and v2.

A one-direction branch in the tree structure in Fig. 4.10 leads from the current encoding operation to the next encoding operation, i.e., the current data vector v2 shifts out of the memory, the current data vector v1 shifts into the memory units for the next v2, and so on. Hence, all the

possible combinations of W(v0), W(v1), and W(v2) can be obtained with an exhaustive

search.

As shown in Fig. 4.10, N = [W(v1) + W(v2)] ∈ {3, 4, 5, 6}, and the number of marks within any three consecutive output codewords, i.e., W(v0) + W(v1) + W(v2), belongs to {5, 6, 7, 8}. Therefore, the number of marks in a 12-bit sliding window is K12 ∈ {5, 6, 7, 8}.
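The exhaustive search over weight combinations can be reproduced with a small sketch. It tracks only the weights W(v1) and W(v2), starts from the initial memory contents (both of weight 2), and applies the state rule of Eq. (4.12) with the codeword weights available in T0 and T1; the function name is hypothetical.

```python
def reachable_weights(start=(2, 2)):
    """Enumerate reachable (W(v1), W(v2)) pairs and (W(v0), W(v1), W(v2)) triples."""
    seen = {start}
    frontier = [start]
    triples = set()
    while frontier:
        w1, w2 = frontier.pop()
        # N <= 4 selects T0 (weights 2 or 3); N > 4 selects T1 (weights 1 or 2).
        choices = (2, 3) if w1 + w2 <= 4 else (1, 2)
        for w0 in choices:
            triples.add((w0, w1, w2))
            nxt = (w0, w1)            # v0 becomes v1 and v1 becomes v2
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, triples


states, triples = reachable_weights()
print(sorted({w1 + w2 for w1, w2 in states}))   # values taken by N
print(sorted({sum(t) for t in triples}))        # marks in three consecutive codewords
```

The search visits eight distinct (W(v1), W(v2)) pairs, and the two printed sets agree with the ranges stated above. Note that the sketch counts marks over three whole codewords only, i.e., over windows aligned with codeword boundaries.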


In the above example, “1000”, “0111”, and “1100” are not included in the codeword look-up tables and, thus, it is guaranteed that K12 ∈ {5, 6, 7, 8} for the encoded data sequence. In contrast, for an uncoded random sequence, K12 ∈ {0, 1, 2, ..., 12}. Hence, the trellis-based SWC encoded data sequence achieves a smaller variance in the number of marks in the sliding window, i.e., a smaller J12 = var(K12), at the price of a code redundancy of (n – k)/k = 33.3%. This example shows that, by carefully choosing codewords in the look-up tables, we can limit the possible values of KL to a smaller range and, thus, decrease JL = var(KL) as defined in Eq. (4.8).
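To make the quantity JL = var(KL) concrete, the following sketch counts the marks in every length-L window of a bit sequence and takes the population variance of those counts. The two test sequences are hypothetical illustrations, not codewords of the proposed codes.

```python
from statistics import pvariance


def window_counts(bits, L):
    """Number of marks K_L in every length-L window of a 0/1 sequence."""
    count = sum(bits[:L])
    counts = [count]
    for i in range(L, len(bits)):
        count += bits[i] - bits[i - L]    # slide the window by one bit
        counts.append(count)
    return counts


def J(bits, L):
    """Sliding-window variance J_L = var(K_L) over all window positions."""
    return pvariance(window_counts(bits, L))


alternating = [1, 0] * 12                       # marks spread evenly
clustered = [1, 1, 1, 1, 0, 0, 0, 0] * 3        # marks grouped in runs of four
print(J(alternating, 4), J(clustered, 4))
```

Both sequences have the same overall mark density of 0.5, yet the evenly spread sequence has zero variance in a 4-bit window while the clustered one does not, which is the property the SWC criterion measures.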

A simple way to decode the trellis-based SWC code is a reverse procedure of encoding

based on the codeword look-up table. For the possible n-tuple words not included in the

codeword table, we can use the minimum Hamming distance criterion to assign the clos-

est codeword. A more sophisticated decoding algorithm, for example the Viterbi algo-

rithm [78], may be designed for the trellis-based SWC codes because of their trellis-based

code nature.
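The simple table-based decoding can be sketched for the (4, 3, 2) example. Because no codeword of Table 4.2 is assigned to two different data words, a single inverse table serves both states. The tie-breaking rule for words that are equally close to several codewords (first codeword in sorted order) is an arbitrary assumption of this sketch.

```python
T0 = ["1110", "1101", "1011", "1010", "0101", "0110", "1001", "0011"]
T1 = ["0001", "0010", "0100", "1010", "0101", "0110", "1001", "0011"]

# Inverse table: codeword -> data word; shared codewords map to the same u.
INVERSE = {}
for table in (T0, T1):
    for u, codeword in enumerate(table):
        INVERSE[codeword] = u


def hamming(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(x != y for x, y in zip(a, b))


def decode_word(received):
    """Decode one hard-decision 4-bit word to a 3-bit data word (0..7)."""
    if received in INVERSE:
        return INVERSE[received]
    # Not a codeword: pick the closest codeword (ties: first in sorted order).
    nearest = min(sorted(INVERSE), key=lambda c: hamming(c, received))
    return INVERSE[nearest]


print([decode_word(w) for w in ["1110", "0001", "0011", "1111"]])
```

Here "1111" is not a codeword; it lies at distance 1 from "1011", "1101", and "1110", and the sketch resolves the tie in favor of "1011".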

4.3.3 Comparison of the block and trellis-based SWC codes

The advantage of the block SWC codes is the simple structure. Once the codeword

look-up table is obtained, the encoding and decoding can be simply implemented with a

high-speed memory chip, as shown in Fig. 4.6. The trellis-based SWC encoder shown in Fig. 4.7, however, requires more complicated logic in the encoding and decoding procedures.


On the other hand, block SWC codes use an exhaustive search algorithm in construct-

ing the codeword look-up table, and it is desirable to increase the block length to achieve

better code performance without increasing the code rate. For very long codewords, how-

ever, the algorithm may be too slow to be practical. In contrast, the trellis-based SWC

code can improve code performance by increasing the memory depth m in the state de-

termination module (Fig. 4.7) without increasing the length of the codeword or the code

rate. We can see, therefore, that the important consideration in choosing block or trellis-

based SWC codes is the tradeoff in performance and implementation complexity.

4.4 Concatenated RS/SWC coding scheme

Optical fiber communications require a very low BER (< $10^{-11}$), and SWC codes alone may not satisfy this requirement. Moreover, as discussed in the previous sections, the basic idea of SWC line-codes is to prevent SSC-induced errors during the soliton propagation in optical fiber instead of correcting errors in decoding at the receiver.

Therefore, the redundancy added to the original data sequence in SWC encoding is util-

ized to reshape the transmitted data pattern rather than to ensure an effective error-

correction decoding. Decoding for both the block and trellis-based SWC codes is simply

an inverse procedure of the look-up table encoding. Thus, SWC code decoders may in-

troduce decoding bit errors by decoding the received codeword with few bit errors to a

wrong data word with more bit errors compared to the original data word. Hence, to

achieve very low decoded BER, we propose a concatenated coding scheme, the concate-

nated RS/SWC codes.


Forney [79] shows that a concatenated coding system with a powerful outer code can perform reasonably well when its inner decoder is operated with a probability of error in a range between $10^{-2}$ and $10^{-3}$. Thus, by concatenating the SWC code with an RS code, an efficient coding scheme can be achieved, as we show schematically in Fig. 4.11. As the inner code, the SWC code can prevent most of the bit errors caused by SSC-induced timing jitter and, hence, decrease the total BER to the range between $10^{-2}$ and $10^{-3}$ or lower. Then, with an outer RS code, a very low BER can be achieved.

In Fig. 4.11, Nr and Mr are the codeword length and data-word length in symbols of the outer RS code, respectively. Ns and Ms are the codeword length and data-word length in bits of the inner SWC code, respectively. If we choose RS code symbols with the same length as the SWC data words, i.e., $M_s = \log_2(1 + N_r)$, then even though the SWC decoder may introduce decoding bit errors and transform a single-bit error into a multi-bit error within an RS code symbol, the number of symbol errors does not increase after the SWC decoding. In other words, from the viewpoint of the RS decoder, no extra symbol errors are introduced by the SWC decoder. Hence, the decoding bit errors generated by the SWC decoder do not affect the performance of the concatenated RS/SWC code as a whole.
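The symbol-alignment argument can be illustrated with a short sketch that counts bit errors and symbol errors between a sent and a received bit string. The bit patterns below are hypothetical: they model one SWC data word that is decoded incorrectly, once with a single wrong bit and once with four wrong bits.

```python
def count_errors(sent, received, symbol_bits=8):
    """Return (bit errors, symbol errors) for two equal-length bit strings."""
    bit_errors = sum(a != b for a, b in zip(sent, received))
    symbol_errors = sum(
        sent[i:i + symbol_bits] != received[i:i + symbol_bits]
        for i in range(0, len(sent), symbol_bits)
    )
    return bit_errors, symbol_errors


sent = "00111100" + "11111111"            # two 8-bit RS symbols
one_bit_wrong = "00111101" + "11111111"   # single bit error in the first symbol
four_bits_wrong = "00000000" + "11111111" # same symbol decoded to a distant word
print(count_errors(sent, one_bit_wrong))
print(count_errors(sent, four_bits_wrong))
```

In both cases exactly one RS symbol is in error, so the burden on the RS decoder is the same even though the bit error count differs.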

As an example, Table 4.3 lists the BERs and the number of symbol errors in RS codewords observed in one of our 4-channel WDM transmission simulations at different decoding stages of the received RS (255, 239) and concatenated RS (255, 239)/SWC (10, 8) encoded data sequences. The 2nd row shows the SSC-induced BER and the number of symbol errors in 5 RS codewords when the RS (255, 239) code is used alone without concatenation with a SWC code. The RS (255, 239) code can correct up to 8 symbol errors in each codeword, but 2 of the 5 RS codewords in the simulation have more than 8 symbol errors and, hence, cannot be corrected by the RS decoder, as shown in the 6th row of Table 4.3.

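Figure 4.11: Concatenated RS/SWC coding scheme. [Block diagram: X, RS encoder (Nr, Mr), SWC encoder (Ns, Ms), Y, WDM soliton transmission channel (ASE, SSC, PMD), ~Y, SWC decoder (Ns, Ms), RS decoder (Nr, Mr), ~X.]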

By comparing the 2nd and the 3rd rows in Table 4.3, we can see that the SWC code can reduce the SSC-induced errors during transmission, thus decreasing the received BER from $1.59 \times 10^{-2}$ to $1.66 \times 10^{-6}$. As a line-code, the SWC code avoids errors during the transmission rather than correcting errors in the decoder. Moreover, as discussed at the start of this section, the SWC code may introduce extra decoding bit errors. As shown in the 4th row, the BER is increased from $1.66 \times 10^{-6}$ to $8.31 \times 10^{-6}$ by the SWC decoding. However, because all the extra decoding bit errors fall in the already corrupted symbol, the number of symbol errors does not increase after SWC decoding. Thus, as shown in the 5th row in Table 4.3, after the RS decoding, an error-free transmission is achieved with the concatenated RS (255, 239)/SWC (10, 8) code in this simulation.

Table 4.3: Bit errors and symbol errors induced by soliton-soliton collision

Data sequence                              BER        Number of symbol errors in RS codewords
Received RS encoded data                   1.59e–2    10   5   9   5   7
Received RS/SWC encoded data               1.66e–6     0   0   0   1   0
RS/SWC encoded data after SWC decoding     8.31e–6     0   0   0   1   0
RS/SWC encoded data after RS decoding      0           0   0   0   0   0
RS encoded data after RS decoding          8.40e–3    10   0   9   0   0
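The symbol-error rows of Table 4.3 can be cross-checked against the correction capability t = (n − k)/2 of the RS code. The sketch below assumes an idealized bounded-distance decoder that clears every codeword with at most t symbol errors and leaves any other codeword unchanged; the function names are hypothetical.

```python
def rs_correction_capability(n, k):
    """Maximum number of correctable symbol errors per RS(n, k) codeword."""
    return (n - k) // 2


def after_rs_decoding(n, k, symbol_errors):
    """Idealized bounded-distance decoding of a list of per-codeword error counts."""
    t = rs_correction_capability(n, k)
    return [0 if e <= t else e for e in symbol_errors]


rs_only = [10, 5, 9, 5, 7]    # 2nd row of Table 4.3
rs_swc = [0, 0, 0, 1, 0]      # 4th row of Table 4.3
print(rs_correction_capability(255, 239))
print(after_rs_decoding(255, 239, rs_only))
print(after_rs_decoding(255, 239, rs_swc))
```

Under this assumption the sketch reproduces the 6th and 5th rows of the table: two RS-only codewords remain uncorrected, while all RS/SWC codewords are cleared.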

The above comparison shows that the concatenated RS (255, 239)/SWC (10, 8) code outperforms the RS (255, 239) code in reducing and correcting SSC-induced errors. We note that the concatenated RS (255, 239)/SWC (10, 8) code has a code rate of about 0.75, while the RS (255, 239) code alone has a higher code rate of about 0.94. As discussed in Sec. 4.1, however, increasing the redundancy of RS codes without line-coding does not always help to improve code performance in correcting SSC-induced errors. In fact, it will be shown in the next section that the RS (255, 239)/SWC (10, 8) code also significantly outperforms an RS code with a similar effective code rate, namely the RS (255, 191) code with a code rate of about 0.75.
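The code rates quoted above follow from the code parameters, as the following sketch shows. The overall rate of the concatenated scheme is taken as the product of the outer and inner rates; the variable names are illustrative only.

```python
from math import log2

rs_239_rate = 239 / 255                 # RS (255, 239) alone
swc_rate = 8 / 10                       # SWC (10, 8) inner line-code
concatenated_rate = rs_239_rate * swc_rate
rs_191_rate = 191 / 255                 # RS (255, 191) with similar overall rate
rs_symbol_bits = log2(1 + 255)          # RS symbol length in bits

print(round(rs_239_rate, 2), round(concatenated_rate, 2), round(rs_191_rate, 2))
print(rs_symbol_bits)
```

The RS symbol length of 8 bits equals the SWC (10, 8) data-word length, which is the alignment condition used in this section.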


The concatenated RS/convolutional code is a strong concatenated FEC coding scheme

that has been proposed for long-haul submarine optical fiber communications [20], where

ASE noise is dominant. We can show that this concatenated FEC coding scheme, how-

ever, is not as effective as our proposed concatenated RS/SWC code for correcting SSC-

induced errors. As shown in Fig. 2.9 on page 43, the SSC-induced bit errors have a burst

error pattern. Systems with higher bit rates have errors with a longer burst length and a smaller burst spacing. Because the Viterbi decoder for the convolutional code performs better for memoryless channels than for channels with memory [78], [80], [81], the burst nature of

the SSC-induced errors degrades the performance of the concatenated RS/convolutional

code. Although interleaving can be used to convert convolutional codes for correcting

random errors into burst-error-correcting codes, it will introduce transmission delay and

requires more complex hardware.

From the above discussions, we can see that both the RS codes and the concatenated

RS/convolutional codes have limited error correction abilities for combating the SSC-

induced bit errors. The proposed RS/SWC code first avoids most SSC-induced errors by

taking advantage of the special SWC encoded data pattern and then corrects the rest of

the errors with a high rate outer RS code. Hence, in systems with high SSC-induced tim-

ing jitter, using an RS code or a concatenated RS/convolutional code is not as effective

and efficient as is using the proposed concatenated RS/SWC codes in achieving low

BER.


4.5 Performance and comparisons via simulations

We have performed two sets of simulations to study the performance of our proposed

coding scheme. One set is based on a simplified model of soliton-soliton collisions

(SSCs) given by Eqs. (2.18)–(2.20) that addresses only complete collisions. The other set

is a full simulation of SSC-induced timing jitter in a dispersion-managed optical fiber

system using the Photonic Transmission Design Suite (PTDS) simulation environment

[82]. In the PTDS simulation environment we can construct WDM optical fiber transmis-

sion systems with configurable channel spacing, transmission distance, transmission data

rate, dispersion-management scheme, optical amplifier gain and spacing, optical pulse

shape, etc., and simulate the optical pulse propagation in a configurable step size in terms

of propagation distance.

4.5.1 Simulations based on simplified SSC model

Based on the simplified SSC model, four sets of simulations were performed. These

include: (1) calculating the reduction of SSC-induced timing jitter with the SWC code

alone; (2) comparing the performance of two RS codes, a concatenated RS/convolutional

code, and a concatenated RS/SWC code in mitigating timing-jitter-induced errors in

WDM systems; (3) determining the characteristic of the SSC-induced bit errors; and (4)

comparing the performance of the SWC codes constructed with the fragmentation-first

algorithm and the weight-first algorithm. The results of these simulations are plotted in

Figs. 4.12–4.14.


Figure 4.12 plots the time-shift distributions of the uncoded and the SWC (10, 8) en-

coded data sequence in a WDM system having a data rate F = 14 Gbps, transmission

distance Z = 20 Mm, fiber dispersion D = 0.25 ps/nm/km, and channel spacing = 0.8 nm.

To make the figure easy to read, not all the probability mass function (pmf) points of the

time shifts of solitons were plotted. As shown in Fig. 4.12, the variance of the time shifts

of the received data sequence, and the corresponding SSC-induced BER, is effectively

decreased by using the SWC code. The SSC-induced BER is decreased from a floor of $10^{-2}$ to a floor of $10^{-6}$.

Figure 4.12: Reduction of the SSC-induced timing jitter with a SWC (10, 8) code. Circles: pmf in uncoded random data sequence. Triangles: pmf in SWC coded data sequence. Dotted curve: Gaussian distribution approximating the pmf in uncoded case. Solid curve: Gaussian distribution approximating the pmf in SWC-coded case. Dotted line pair: receiving-window for uncoded signal. Solid line pair: receiving-window for the SWC-coded signal. [Axes: probability versus time shift of soliton (ps).]

Figures 4.13a and 4.13b plot the output BERs of the binary data streams without cod-

ing, with RS coding, with concatenated RS/convolutional coding, and with concatenated

RS/SWC coding transmitted through the WDM soliton system. The BERs of these data

streams are evaluated for different transmission data rates and channel spacing values. In

Fig. 4.13a, we can see that the highest error-free (BER < $10^{-9}$) bit rate can almost be doubled with the concatenated RS/SWC code compared to the original uncoded system. Figure 4.13b shows that the channel spacing for BER < $10^{-11}$ can be decreased by half with

the RS/SWC code. These results show that the SWC codes can effectively decrease the

SSC-induced timing jitter in WDM soliton systems, and they significantly enhance the

capacity in terms of data rate and channel spacing.

Comparing the performances of different coding schemes plotted in Fig. 4.13a, we can

see that with the same redundancy the RS (255, 239)/SWC (10,8) code performs better

than the RS (255, 191) code. This result agrees with the discussion in Sec. 4.4 about the

advantage of the RS/SWC codes over the conventional FEC codes. We note that the per-

formance of these coding schemes becomes worse rather than better as the code redun-

dancy increases. This is because, as discussed in Sec. 4.1, the probability of SSC-induced

timing jitter errors is very sensitive to the width of the soliton receiving-window. To keep

a constant data rate, a higher code redundancy requires a higher signaling rate and, thus, a

narrower signal receiving-window. Therefore, the increase of the timing jitter errors in-

duced by increasing code redundancy may be faster than the improvement of code per-

formance.
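The trade-off between redundancy and receiving-window width can be made concrete with a rough sketch. It assumes that the signaling rate equals the data rate divided by the code rate and that the receiving-window is a fixed fraction (0.8 here) of the bit slot; both the function name and the 10 Gb/s data rate are hypothetical choices for illustration.

```python
def receiving_window_ps(data_rate_gbps, code_rate, window_fraction=0.8):
    """Receiving-window width in ps for a fixed data rate and a given code rate."""
    signaling_rate_gbps = data_rate_gbps / code_rate   # line rate after coding
    bit_slot_ps = 1000.0 / signaling_rate_gbps
    return window_fraction * bit_slot_ps


uncoded = receiving_window_ps(10.0, 1.0)
rs_239 = receiving_window_ps(10.0, 239 / 255)
rate_075 = receiving_window_ps(10.0, 0.75)
print(uncoded, rs_239, rate_075)
```

Under these assumptions the window shrinks in proportion to the code rate, so a code with 25% less rate must tolerate timing jitter within a window that is 25% narrower.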

Figure 4.13: Comparison of the code performances in enhancing (a) transmission bit rate and (b) channel spacing. Solid: code performances in middle channel. Dotted: code performances in outermost channel. Triangle: concatenated RS (255, 239)/convolutional (2, 1, 7). Plus: RS (255, 191). Circle: uncoded random data sequence. Square: RS (255, 239). Star: concatenated RS (255, 239)/SWC (10, 8). [Axes: (a) log10(BER) versus F (Gb/s); (b) log10(BER) versus channel spacing (nm).]

We simulated a 4-channel 20-Mm system with the simplified soliton-soliton collision

model. We set the soliton receiving-window duration Tr = 0.8/F. Figure 4.14 plots the

distributions of the number of marks inside the sliding window for two SWC encoded data sequences generated by the two different algorithms, the fragmentation-first (FF) and

weight-first (WF) algorithms. In Fig. 4.14a, the sliding window length is much shorter

than the codeword length, hence the FF12B14B encoded data sequence achieves a

smaller variance in the number of collisions than does the WF12B14B encoded data se-

quence. On the contrary, as observed in Fig. 4.14b, the WF12B14B code performs better

for a 14-bit sliding window that is as long as the codeword. These results are consistent

with the discussion in Sec. 4.3 about the performances of the FF and WF algorithms.

Figure 4.14: Probability mass function (pmf) of the number of marks in the sliding window on the data sequence encoded with the fragmentation-first (star) and the weight-first (triangle) algorithms for codeword length = 14 bits, and (a) sliding window length = 4 bits and (b) sliding window length = 14 bits. The solid curves in the figures represent the corresponding normal distributions. [Axes, both panels: probability versus number of marks in sliding window.]

For a given optical fiber transmission line, the sliding window length is determined by

the maximum number of collisions one soliton may experience, and the SWC codeword

length depends on the data frame structure and other system design requirements. The

simulation results in Fig. 4.14 show that the choice of the FF or WF algorithms for the

SWC code construction should be made after the sliding window length and the SWC

codeword length have been determined.

4.5.2 Full simulations using PTDS

A full simulation is required to study the performance of our coding scheme in disper-

sion-managed soliton systems, which is very time consuming given the current state of

the art in optical system simulations. Since our ability to validate our coding scheme

through full simulations is therefore limited, we present simulation results for some selected data patterns (“desirable”, “undesirable”, and random data patterns) to demonstrate the effectiveness of our coding scheme. Here, we define a “desirable” pattern as one that satisfies the SWC and an “undesirable” pattern as one that does not.

In the full simulations, independent SWC encoded data sequences are transmitted

along 8 WDM channels. Both SSC-induced errors and ASE errors are simulated. The

128-bit soliton trains in the 1st channel (outermost channel) and the 4th channel (middle

channel) are recorded after every 200 km. The system parameters are: 12 GHz data rate,

100 GHz channel spacing, Gaussian pulses with tFWHM = 14 ps, and a symmetrical disper-

sion map with D1 = 2.34 ps/nm-km, D2 = −2.19 ps/nm-km. Each optical fiber segment is

100 km long, and lumped amplifiers are placed every 50 km.


Figure 4.15 plots the SSC-induced timing jitter versus transmission length for “desir-

able,” “undesirable,” and random input data patterns. The timing jitter curves for random

input data are obtained by using Richter and Grigoryan’s approach [83] that has been

shown to have good agreement with full simulation results. The results show that the sequence with the “desirable” data pattern suffers much smaller SSC-induced timing jitter than the sequence with the “undesirable” data pattern in both the outermost and middle channels. The eye diagrams of the received signals with undesirable and desirable patterns are plotted in Fig. 4.16. We can see that the eyes of the received signals with the desirable data pattern are more open than the eyes of the signals with the undesirable data pattern.

Hence, the full simulation results show that the basic idea of the SWC code is quite ef-

fective and is a promising technique for dispersion-managed WDM soliton systems as

well.

4.6 Summary

This chapter introduced a new line-coding technique, the SWC code, which can effectively decrease SSC-induced timing jitter in WDM soliton systems. Two types of SWC codes, the block and the trellis-based SWC codes, were developed. A concatenated RS/SWC coding scheme was also developed and was shown by simulations to enhance the WDM system capacity in both data rate and channel spacing.

Figure 4.15: SSC-induced timing jitter of desirable (square), random (no sign), and undesirable (circle) data patterns. Solid: timing jitter in middle channel. Dotted: timing jitter in the outermost channel. [Axes: timing jitter (ps) versus fiber length (Mm).]

Figure 4.16: Eye diagrams of the received signals with undesirable (upper) and desirable (lower) patterns. [Axes: normalized amplitude versus time (ps).]

We studied the performance of RS codes for SSC-induced errors and showed that there

is an optimal redundancy for RS codes in the sense of achieving the largest error correc-

tion margin. Increasing code redundancy can enhance the error correction capability of

RS codes, but on the other hand it also increases SSC-induced errors that are very sensi-

tive to the system signaling bit rate. Hence, there is an optimal RS code redundancy that

gives the best code performance in correcting SSC-induced errors. More redundancy

(stronger error-correction capacity) for RS codes does not always imply better perform-

ance in correcting SSC-induced errors. We showed the advantages of the proposed con-

catenated RS/SWC coding scheme over the RS codes and the concatenated

RS/convolutional codes with both analysis and simulation results. Because of the simple

structure of the proposed SWC codes, this concatenated RS/SWC coding scheme can be

implemented with ASICs. Evaluation with a full simulation of the WDM DMS system demonstrated that the proposed SWC line-code is a very promising technique for dispersion-managed-fiber systems.


Chapter 5

Summary and conclusions

5.1 Summary

In this dissertation, we studied the effectiveness of FEC codes for correcting ASE-

induced errors and a SWC line-code for mitigating SSC-induced errors in optical fiber

communication systems.

In the Introduction, we described the major sources of physical impairment in optical

fiber communications systems. We pointed out that ASE noise from optical amplifiers

causes non-Gaussian asymmetric pdfs of marks and spaces, and the nonlinear inter-

channel interference in WDM systems causes highly pattern-dependent errors. These two

physical effects are among the main sources of errors in optical fiber transmissions and

were our major concerns in this dissertation. We then surveyed the literature on previous

FEC and line-coding studies and applications in optical fiber communications. We

pointed out that previous work is mostly based on standard FEC codes and line-coding

schemes, and the channel models used in the theoretical studies include the binary asymmetric channel (BAC), the AWGN, and the asymmetric Gaussian, none of which accurately describes the optical fiber transmission output. This observation motivated the goal of this dissertation: to analyze and design FEC codes and line-coding schemes in optical fiber communication systems by incorporating the particular physical characteristics and mechanisms of these systems.

In Chapter 2, we discussed the statistics of the ASE noise, and constructed the corre-

sponding models for optical fiber channels with dominant ASE noise. In the hard-

decision case, we introduced the chi-square binary asymmetric channel (BAC), Gaussian

BAC, and Gaussian binary symmetric channel (BSC) models for optical fiber channels.

In the soft-decision case, we focused on the asymmetric chi-square and the asymmetric

Gaussian models. We also described the physical mechanism of soliton-soliton collisions

(SSC) in WDM systems, and constructed a simplified model for the SSC-induced timing

jitter. With this model we showed the data pattern dependence of the SSC-induced timing

jitter that was the motivation behind a line-coding scheme, the SWC code, for mitigating

SSC-induced timing jitter and the corresponding SSC-induced errors.

Chapters 3 and 4 are the two major chapters, where our research is discussed and results

are presented. Specifically, in Chapter 3, we studied how the statistical model of ASE noise,

one of the main sources of errors in systems with optical amplifiers, affects both the

performance evaluation and the actual performance of FEC codes. We performed a three-level

study covering a lower bound (the Shannon limit) on general FEC code performance, an upper

bound on linear code performance, and the improvement of turbo code performance.

In the study of the Shannon limit for optical fiber channels with dominant ASE noise,

we showed that using the simpler Gaussian BAC and Gaussian BSC models to calculate

the uncoded BERs is a convenient but inappropriate way to evaluate the Shannon limit.

Compared to the chi-square BAC model, both the Gaussian approximation and the BSC

approximation of optical fiber channels with ASE noise mis-estimate the potential of

error correction in optical fiber transmission systems. We showed that the

Gaussian BAC model gives acceptable estimates of the Shannon limit at code rates

higher than 0.8, but it underestimates (predicts higher required Q) the lower bound by

about 0.4 dB in Q^2 at code rate 0.5, and the problem tends to be more severe at lower

code rates. The Gaussian BSC model overestimates (predicts lower required Q) the lower

bound on FEC code performance by about 0.4 to 0.5 dB at all code rates ranging from 0.5

to 0.9. Thus, the maximum coding gain achievable with the best FEC code is overestimated

at low code rates by the Gaussian BAC model and always underestimated by the

Gaussian BSC model.
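
For orientation, the dB figures quoted above can be related to Q and to the uncoded BER with the textbook Gaussian-approximation formula BER = 0.5 erfc(Q / sqrt(2)). The sketch below uses only that standard relation, not the chi-square model of Chapter 2, and the function names are illustrative assumptions.

```python
import math

def ber_from_q(q: float) -> float:
    """Uncoded BER under the Gaussian approximation: 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))

def q2_db(q: float) -> float:
    """Q^2 expressed in decibels: 10*log10(Q^2) = 20*log10(Q)."""
    return 20.0 * math.log10(q)

def q_from_q2_db(q2_in_db: float) -> float:
    """Inverse of q2_db: recover the linear Q factor from Q^2 in dB."""
    return 10.0 ** (q2_in_db / 20.0)
```

Under this relation, a 0.4 dB shift in Q^2 corresponds to a factor of about 1.047 in the linear Q factor.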

In the study of upper bounds on linear code performance in optical fiber communica-

tions, we derived a general upper bound on the pairwise error probability, Pd, in asym-

metric channels. We evaluated the corresponding bound on Pd in optical fiber channels

with dominant ASE noise. With two example codes, a turbo product code (TPC) and a

turbo convolutional code (TCC), we investigated the accuracy of the ASE noise Gaussian

approximation in evaluating the upper bound on code performance. We showed that the

Gaussian approximation model overestimates (predicts lower BER) at low Q and under-

estimates at high Q the upper bounds on both the TPC and TCC code performance in the

optical fiber channel. The resulting performance bounds also suggest that, with similar

code rate and block length, the TPC is a better choice than the TCC in optical fiber chan-

nels requiring very low BER. The derived bound is a useful tool in estimating code per-

formance in optical fiber communications: the union bound is in general tight at very low

BERs, which is the desired operating range in optical fiber communications, and where

simulation is impractical for evaluating code performance.
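
In generic form, a union bound on the decoded bit error probability of an (n, k) linear block code sums pairwise error probabilities over the code's weight spectrum. The expression below is the standard textbook form, given here only as a reminder. The symbols B_d and A_{w,d} are assumed notation that may differ from Chapter 3, and in an asymmetric channel P_d is replaced by the upper bound on the pairwise error probability.

```latex
P_b \;\le\; \sum_{d=d_{\min}}^{n} B_d \, P_d ,
\qquad
B_d \;=\; \sum_{w=1}^{k} \frac{w}{k}\, A_{w,d}
```

Here A_{w,d} denotes the number of codewords of Hamming weight d generated by information words of weight w, and d_min is the minimum distance of the code.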

In the study of the effect of ASE noise models on the turbo code decoder performance,

we showed that the turbo code decoder design based on the chi-square ASE noise distri-

bution can achieve more than 2 dB coding gain as compared to the design based on Gaus-

sian approximations. We also showed that if the Gaussian BSC model is used in the

puncturing operation to achieve higher code rates, then the likelihood ratio of the re-

ceived signal at the decision threshold, which is supposed to be 1, increases exponentially

as a function of the Q factor. This observation explains the simulation result that the rate-

3/4 punctured-TC based on the Gaussian BSC model fails to improve the system BER

compared to the uncoded data at the same signaling rate as the encoded data.

In Chapter 4, based on the physical mechanism of soliton-soliton collisions (SSC) in

WDM soliton transmission systems, we developed a new line-coding scheme, the SWC

code, to reduce the SSC-induced timing jitter and, thus, bit errors. Two types of SWC

codes, the block- and trellis-based SWC codes, are introduced. A concatenated RS/SWC

coding scheme is proposed, which is shown by simulations to enhance bit rate and reduce

channel spacing in WDM systems.

We studied the performance of RS codes for SSC-induced errors and showed that there

is an optimal RS code redundancy, which is optimal in the sense of achieving the best

code performance in correcting SSC-induced errors. We also showed that more redun-

dancy, implying a stronger error-correction capacity, for RS codes does not always imply

better performance in correcting SSC-induced errors. We showed via analysis and simu-

lation the advantages of the proposed concatenated RS/SWC coding scheme over

standalone RS codes and concatenated RS/convolutional codes. With a full simulation

incorporating both complete collisions and partial collisions in a 10 Mm 8-channel 10

Gbps WDM DMS system, we showed that the data sequence with the desired pattern ac-

cording to the SWC significantly reduces SSC-induced timing jitter. Hence, the SWC

line-coding is a very promising technique for dispersion-managed-fiber systems.

5.2 Conclusions

Our research on FEC and line-coding techniques in optical fiber communications systems

has focused on analyzing code performance and designing coding schemes based on

an understanding of the particular physical characteristics of optical fiber channels

[25]–[29]. Based on the calculation and simulation results described in the previous

chapters, we conclude this dissertation by answering the two questions posed as the

motivation of our research.

1. Do the non-Gaussian asymmetric statistics of the ASE noise, compared to the

Gaussian symmetric approximations, cause a sufficient difference in FEC performance

to justify the effort of including more accurate noise statistics in the analysis

and design of FEC codes?

2. Is the benefit of using line-coding approaches to mitigate the nonlinear

inter-channel interference problem worth the effort?

The answer is yes to both questions.

Specifically, a more accurate determination of the Shannon limit for optical fiber

channels dominated by ASE noise is possible with the chi-square BAC model. Although

the evaluation of the Shannon limit for binary-in binary-out channels involves only the

two transition probabilities of the received signals instead of the complete pdfs, the

resulting Shannon limits based on the chi-square BAC and Gaussian BSC models

showed a 0.4 to 0.5 dB difference in the Q factor. From the viewpoint of the Shannon

limit, a 0.5 dB difference in the Q factor is sufficient motivation for a continued search

for efficient FEC codes for optical fiber transmission systems.
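
To illustrate why only the two transition probabilities matter for a binary-in binary-out channel, the following minimal sketch computes the capacity of a binary asymmetric channel by maximizing the mutual information over the input distribution. It is a generic information-theoretic calculation, not the procedure of Chapter 3. The names `p01` (space received as mark) and `p10` (mark received as space) are assumptions, and the transition probabilities would have to come from a channel model such as the chi-square BAC.

```python
import math

def h2(p: float) -> float:
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def mutual_information(pi1: float, p01: float, p10: float) -> float:
    """I(X;Y) in bits for P(X=1)=pi1, P(Y=1|X=0)=p01, P(Y=0|X=1)=p10."""
    py1 = (1.0 - pi1) * p01 + pi1 * (1.0 - p10)
    return h2(py1) - (1.0 - pi1) * h2(p01) - pi1 * h2(p10)

def bac_capacity(p01: float, p10: float, iters: int = 200) -> float:
    """Capacity of a binary asymmetric channel in bits per channel use.

    Mutual information is concave in the input distribution, so a ternary
    search over P(X=1) converges to the maximizing input probability.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if mutual_information(m1, p01, p10) < mutual_information(m2, p01, p10):
            lo = m1
        else:
            hi = m2
    return mutual_information(0.5 * (lo + hi), p01, p10)
```

Comparing this capacity with a target code rate, as a function of Q, gives the smallest Q at which reliable transmission at that rate is possible in principle.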

As expected, when the complete pdfs of optical signals with ASE noise are incorpo-

rated in the upper performance bound calculation for linear codes with soft-decision de-

coding, a significant difference between the results based on the chi-square and Gaussian

models shows up. More than 2 dB of coding gain in the performance bound for the turbo

code at 10^-12 BER results when the more accurate chi-square model is used. Because the

upper bound derived is based on the union bound that is tight in general at low BER, the

resulting performance bounds can be confidently used for the performance estimate in

the very low BER range required by optical fiber communications.

Our point is further reinforced by the simulation results for the turbo code performance

in ASE-noise-dominant optical fiber channels. The soft-decision iterative MAP decoding

algorithm used in turbo code decoding takes full advantage of the statistical information

provided by the soft-decision signals. The high sensitivity of turbo code performance to

the accuracy of the noisy signal distribution causes 1.5 – 2 dB performance degradation

at 10^-6 BER when a Gaussian approximation is used for a chi-square distributed channel.

The performance degradation becomes more severe at lower BER. Clearly, more than 2

dB of extra coding gain would be very useful to a designer of optical fiber communica-

tion systems.

We see a clear trend toward FEC technology advancement in the three generations of

FEC codes applied in optical fiber communications systems. From Hamming codes and

RS codes with algebraic decoding, to concatenated FEC codes with soft decoding and,

further, to turbo product codes with soft and iterative decoding, each new generation of

FEC is closer to the Shannon limit. As mentioned in the Introduction, this trend is fun-

damentally guided by the technique of including more and more noise statistical infor-

mation into the FEC code design. However, to make continued progress in the analysis

and design of FEC codes, we should use more accurate channel models for optical fiber

channels that are different from the conventional BSC, AWGN, or Gaussian channels.

FEC codes correct errors after they have occurred in transmission. By contrast, SWC

line-codes, by reshaping the data pattern, prevent SSC-induced errors from occurring.

Hence, in concept, the SWC code is more efficient than the FEC code in mitigating the

particular data-pattern dependent errors. An error floor decrease from 10^-2 to 10^-6 by using

a SWC code was demonstrated in a 4-channel WDM system simulation. Although it

may not always be possible to achieve very low BER such as 10^-12 with only the SWC

code in a highly nonlinear WDM system, the SWC code can be a very efficient component

code in a concatenated coding scheme. Our simulation results showed that for BERs

< 10^-9 and compared to the original uncoded system, the highest data rate attainable can

almost be doubled with the concatenated RS/SWC code, and the smallest channel spac-

ing can be decreased by half with the RS/SWC code. Hence, it is worth further effort to

study line-coding schemes to mitigate nonlinearity-based data-pattern dependent errors.

We saw the application of line-coding in counteracting the non-flat laser FM response

in coherent optical fiber communication systems. We believe that a new and promising

application of line-coding in WDM optical fiber communication systems is the mitigation

of nonlinear inter-channel interference.

The dissertation results demonstrate, therefore, that more accurate FEC code perform-

ance evaluation, significant improvement of FEC code performance, and highly efficient

and effective line-coding schemes can be achieved when the physical characteristics of

the optical fiber transmission line are taken into account.

5.3 Suggestions for future research

We believe that we have taken only a first step in exploring an important research area:

addressing the particular physical characteristics and impairments of optical fiber

communications systems when evaluating and designing coding techniques for improving

system performance. The results of our research suggest some important topics

for future research.

5.3.1 Further investigation of the noise statistics in optical fiber communication systems

Our FEC research is rooted in accurate noise statistics in optical fiber transmission

systems. As shown in our calculations and simulation results, an accurate channel model

is critical for achieving the best possible FEC code performance, especially for those FEC

codes using soft-decision and iterative decoding. We used the chi-square distribution for

the ASE noise statistics in our studies; however, it is still an approximation. In the deri-

vation of the chi-square model of ASE noise, only the amplitude fluctuation was taken

into account [48], [49], but ASE-induced timing jitter is also an important source of errors.

Moreover, the chi-square model does not account for transmission effects.

Holzlöhner, et al., [50] have introduced an efficient simulation algorithm with which

the ASE-induced timing jitter is included in the Monte-Carlo simulation. They performed

simulations of a long-haul DMS system and showed a significant difference of the re-

sulting signal distribution from the Gaussian and chi-square distributions. Simulations of

CRZ systems and comparisons between the resulting signal distributions and chi-square

distributions have not been reported.

In real optical fiber transmission systems, all the physical impairments –– optical fiber

dispersion, fiber nonlinearity, PMD, and ASE noise –– are combined, and the system

may drift from time to time. Hence, the most direct way to obtain the noise distributions

is experimental measurement.

We performed experimental studies of the noise distribution in a recirculating optical

fiber loop described in [85]. We used a BER tester to measure the BER curve as a func-

tion of the decision threshold, and used an oscilloscope to record the histogram of the

detected electrical signals. In theory, it can be proved that the derivative of the BER

curve gives the difference between the two pdfs corresponding to the marks and spaces,

while the histogram of the detected signal gives the sum of the two pdfs. Hence, with the

difference and sum equations of the two pdfs, we can obtain each pdf separately. We

could not obtain reasonably accurate measurements of the optical signal distributions,

however, because of transmission system drifting, the accuracy limit of the oscilloscope,

and thermal noise in the electrical amplifier for the BER tester. To obtain accurate noise

statistics in optical fiber transmission systems, we need to perform more comprehensive

theoretical analysis, develop more efficient simulation algorithms, and design more prac-

tical experiments.
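
The sum-and-difference argument above can be sketched numerically. The example assumes equiprobable marks and spaces, with marks lying above the decision threshold, so that dBER/dt = (f1 - f0)/2 and the normalized histogram equals (f1 + f0)/2. The function and argument names are hypothetical, and a central difference stands in for the derivative of the measured BER curve.

```python
def recover_pdfs(thresholds, ber, hist_density):
    """Recover the mark pdf f1 and space pdf f0 on a grid of thresholds.

    thresholds   : increasing decision-threshold values
    ber          : measured BER at each threshold
    hist_density : histogram of the detected signal, normalized to a density

    Assumes equal priors, so f1 = hist + dBER/dt and f0 = hist - dBER/dt.
    """
    n = len(thresholds)
    f1, f0 = [], []
    for i in range(n):
        # Central difference in the interior, one-sided at the two ends.
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        dber = (ber[hi] - ber[lo]) / (thresholds[hi] - thresholds[lo])
        f1.append(hist_density[i] + dber)
        f0.append(hist_density[i] - dber)
    return f1, f0
```

With measured data, noise in the BER curve is amplified by the numerical derivative, which is consistent with the accuracy problems reported above; some smoothing would likely be needed before differencing.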

5.3.2 Application of low density parity check codes in optical fiber communication sys-

tems

Low density parity check (LDPC) codes are a class of linear block codes originally

discovered by Gallager [86] in the early 1960s that have recently been rediscovered and

generalized [87]–[89]. LDPC codes with soft-decision iterative decoding have been

demonstrated in simulations to perform quite close to the Shannon limit [88]–[91].

One of the advantages of LDPC codes is that the simple linear block code structure and

low density of the parity check matrix make the code implementation relatively easy.

Another advantage is that by increasing the codeword length, high performance can be

achieved with low overhead (redundancy). It has been shown that with sufficient block

length, LDPC codes may outperform turbo codes [91]. Simple implementation structure

and low overhead are two major factors in selecting FEC codes for optical fiber commu-

nication systems with very high data rate. Hence, LDPC codes may be very promising for

optical fiber communications systems.
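
As a small illustration of why a sparse parity-check matrix keeps the implementation simple, the sketch below stores each check as the list of bit positions it involves, so the cost of a syndrome computation grows with the number of ones in the matrix rather than with its full size. The 4-by-8 matrix is a hypothetical toy example chosen for readability; it is not a Gallager construction and is far too short to be a useful LDPC code.

```python
def syndrome(check_rows, word):
    """Syndrome of a binary word under a sparse parity-check matrix.

    check_rows lists, for each parity check, the bit positions that take
    part in it (the nonzero columns of that row of H).
    """
    return [sum(word[j] for j in row) % 2 for row in check_rows]

# Toy matrix: 4 checks on 8 bits, every bit in exactly 2 checks.
TOY_CHECKS = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [0, 2, 4, 6],
    [1, 3, 5, 7],
]

codeword = [1, 1, 0, 0, 1, 1, 0, 0]   # satisfies all four checks
corrupted = [1, 1, 1, 0, 1, 1, 0, 0]  # bit 2 flipped
print(syndrome(TOY_CHECKS, codeword))   # [0, 0, 0, 0]
print(syndrome(TOY_CHECKS, corrupted))  # [1, 0, 1, 0]
```

The nonzero syndrome entries identify the checks that involve the flipped bit, which is the information an iterative decoder passes between bits and checks.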

Moreover, LDPC codes are linear codes; hence, the upper bound on linear code performance

derived in Chapter 3 can be directly applied to evaluate LDPC code performance.

5.3.3 Performance comparison of different FEC codes in optical fiber communication

systems

Up to now, all the FEC codes representing third generation FEC codes in optical fiber

communications belong to the class of turbo product codes (TPC). There are several dif-

ferent classes of codes using soft iterative decoding and approaching the Shannon limit;

these include LDPC codes, parallel concatenated convolutional (PCC) turbo codes, and

serial concatenated convolutional (SCC) turbo codes. For future research, we suggest an

investigation of the performance of these different classes of codes in optical fiber chan-

nels, using both hard-decision iterative decoding and soft-decision iterative decoding, and

using the various channel models including the chi-square BAC, Gaussian BAC, Gaussian

BSC, chi-square continuous, and Gaussian continuous models.

Comparison of the code performances will help us evaluate the applicability of these

new classes of codes to optical fiber communications systems. Of particular concern here

is that while other communications systems aim at achieving BERs around 10^-4 to 10^-6,

optical fiber communications systems require more reliable performance, e.g., BERs <

10^-11. Hence, code performance in optical fiber channels should be compared at very low

BERs. In Chapter 3, we have shown that the PCC turbo codes may outperform the other

codes at low Q, and the error floor effect can significantly decrease the slope of the

decoded BER curve as a function of the Q factor at comparatively high Q (predicting very

low decoded BERs). Thus, other codes with similar code rate and block length, but without

the error floor effect, may outperform the PCC turbo code at very low decoded BERs.

To investigate the code performance at BERs < 10^-11, analytical evaluation of tight

performance bounds is more practical than code performance simulations, which are

too slow at such error rates. Some other issues, including overhead costs, puncturing,

decoding complexity, and decoding delay, should also be investigated in a system environment.
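
A rough count shows why direct simulation is impractical at these error rates. The sketch assumes a Monte Carlo run is trusted only after a minimum number of bit errors has been observed; the 100-error default and the decoder throughput used below are illustrative assumptions, not figures from this dissertation.

```python
def bits_needed(target_ber: float, min_errors: int = 100) -> float:
    """Expected number of decoded bits before min_errors errors are seen."""
    return min_errors / target_ber

def simulation_days(target_ber: float, throughput_bps: float,
                    min_errors: int = 100) -> float:
    """Run time in days at a given simulated decoder throughput (bits/s)."""
    return bits_needed(target_ber, min_errors) / throughput_bps / 86400.0
```

At a BER of 10^-11, about 10^13 decoded bits are needed to see 100 errors; at an assumed simulation throughput of 10^6 bits per second this is roughly 116 days for a single point on the BER curve.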

5.3.4 Experimental study and improvement of the SWC code

More work also remains to be done on line-coding. The performance of the SWC codes

needs to be evaluated in general quasi-linear systems instead of

pure soliton systems. Because the soliton-soliton collision is a particular case of the

nonlinear inter-channel interference in WDM systems, the inter-channel interference has

similar physical dynamics to what was used in the development of the SWC code. Hence,

the SWC code is expected to work for errors induced by inter-channel interference in

general. To show this, some experiments in WDM optical fiber transmission systems us-

ing SWC codes should be performed. Moreover, there are other effects that may cause

data-pattern-dependent errors, for example, PMD. In future research, the effects of partial

collisions and PMD need to be addressed in the SWC code design.

Bibliography

[1] P. Kaiser, “OIDA Communications Roadmap Study,” Kaiser Global Consulting,

Aug. 1998.

[2] T. Georges and F. Favre, “WDM soliton transmission in dispersion-managed links,”

in European Conf. Opt. Comm, Sept. 1999, Nice, France, paper TuA3.1.

[3] A. Chraplyvy, “Terabit optical communications,” in European Conf. Opt. Comm,

Sep., 1999, Nice, France, paper MoC2.1.

[4] C. R. Davidson, C. J. Chen, M. Nissov, A. Pilipetskii, N. Ramanujam, H. D. Kidorf,

B. Pedersen, M. A. Mills, C. Lin, M. I. Hayee, J. X. Cai, A. B. Puc, P. C. Corbett,

R. Menges, H. Li, A. Elyamani, C. Rivers, and N. Bergano, “1800 Gb/s transmis-

sion of one hundred and eighty 10 Gb/s WDM channels over 7,000 km using full

EDFA C-band,” in OFC/IOOC’00 Technical Digest, Baltimore, MD, Mar. 2000,

paper PD25.

[5] C. A. Brackett, “Dense wavelength division multiplexing principles and applica-

tions,” IEEE Journal on Selected Areas in Communications, vol. 8, no. 6, pp. 948–

964, 1990.

[6] G. P. Agrawal, Fiber-optic Communication Systems, 2nd edition, John Wiley and

Sons, Inc., New York, 1997.

[7] B. Zhu, L. Leng, L. E. Nelson, Y. Qian, S. Stulz, Thiele, J. Bromage, L. Gruner-

Nielsen, S. Knudsen, C. Doerr, L. Stulz, S. Chandrasekhar, S. Radic, J. Park, K. S.

Feder, D. Vengsarkar, and Z. Chen, “3.08 Tb/s (77 × 42.7 Gb/s) transmission over

1200 km of non-zero dispersion-shifted fiber with 100-km spans using C- and L-band

distributed Raman amplification,” in OFC/IOOC’01 Technical Digest, Anaheim,

CA, Mar. 2001, paper PD23.

[8] K. Fukuchi, T. Kasamatsu, M. Morie, R. Ohhira, T. Ito, K. Sekiya, D. Ogasahara,

and T. Ono, “10.92-Tb/s (273 × 40-Gb/s) triple-band/ultra-dense WDM optical-repeatered

transmission experiment,” in OFC/IOOC’01 Technical Digest, Anaheim,

CA, Mar. 2001, paper PD24.

[9] S. Bigo, Y. Frignac, G. Charlet, S. Borne, P. Tran, C. Simonneau, D. Bayart, A.

Jourdan, J. P. Hamaide, W. Idler, R. Dischler, G. Veith, H. Gross, and W. Poehlmann,

“10.2 Tbit/s (256 × 42.7 Gbit/s PDM/WDM) transmission over 100 km TeraLight™

fiber with 1.28 bit/s/Hz spectral efficiency,” in OFC/IOOC’01 Technical

Digest, Anaheim, CA, Mar. 2001, paper PD25.

[10] T. Miyakawa, I. Morita, K. Tanaka, H. Sakata, and N. Edagawa, “2.56 Tbit/s (40

Gbit/s × 64 WDM) unrepeatered 230 km transmission with 0.8 bit/s/Hz spectral efficiency

using low-noise fiber Raman amplifier and 170 µm²-Aeff fiber,” in

OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, paper PD26.

[11] J. X. Cai, M. Nissov, A. N. Pilipetskii, A. J. Lucero, C.R. Davidson, D. Foursa, H.

Kidorf, M. A. Mills, R. Menges, P. C. Corbett, D. Sutton, and N. S. Bergano, “2.4

Tb/s (120 × 20 Gb/s) transmission over transoceanic distance using optimum FEC

overhead and 48% spectral efficiency,” in OFC/IOOC’01 Technical Digest, Anaheim,

CA, Mar. 2001, paper PD20.

[12] B. Bakhshi, M. F. Arend, M. Vaa, E. A. Golovchenko, D. Duff, H. Li, S. Jiang, W.

W. Patterson, R. L. Maybach, and D. Kovsh, “1 Tbit/s (101 × 10 Gbit/s) transmission

over transpacific distance using 28 nm C-band EDFAs,” in OFC/IOOC’01

Technical Digest, Anaheim, CA, Mar. 2001, paper PD21.

[13] G. Vareille, F. Pitel, and J. F. Marcerou, “3 Tbit/s (300 × 11.6 Gbit/s) transmission

over 7380 km using C+L band with 25 GHz channel spacing and NRZ format,” in

OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, paper PD22.

[14] C. R. Menyuk, “Tutorial on modeling nonlinear lightwave systems,” in

OFC/IOOC’99 Technical Digest, San Diego, CA, Feb. 1999, paper ThW.

[15] D. Marcuse, “Single-channel operation in very long nonlinear fibers with optical

amplifiers at zero dispersion,” Journal of Lightwave Technology, no. 9, pp. 356–

361, 1991.

[16] G. P. Agrawal, Fiber-optic Communication Systems, 2nd edition, Chapter 10, John

Wiley and Sons, Inc., New York, 1997.

[17] W. D. Grover, and T. E. Moore, “Design and characterization of an error-correcting

code for the SONET STS-1 tributary,” IEEE Transactions on Communications, vol.

38, no. 4, pp. 467-476, April 1990.

[18] S. Yamamoto, H. Takahira, and M. Tanaka, “5 Gbps optical transmission terminal

equipment using forward error correction code and optical amplifier,” Electronics

Letters, vol. 30, no. 3, Feb. 1994.

[19] J. L. Pamart, E. Lefranc, S. Morin, G. Balland, Y. C. Chen, T. M. Kissell and J. L.

Miller, “Forward error correction in a 5 Gbit/s 6400 km EDFA based system,”

Electronics Letters, vol. 30, no. 4, pp. 342–343, Feb. 17, 1994.

[20] A. Puc, F. Kerfoot, A. Simons, and D. L. Wilson, “Concatenated FEC experiment

over 5000 km long straight line WDM test bed,” in OFC/IOOC’99 Technical Digest,

San Diego, CA, Feb. 1999, pp. ThQ6-1–ThQ6-3.

[21] H. Kidorf, N. Ramanujam, I. Hayee, M. Nissov, J. Cai, B. Pedersen, A. Puc, and C.

Rivers, “Performance improvement in high capacity, ultra-long distance, WDM

systems using forward error correction codes,” in OFC/IOOC’00 Technical Digest,

Baltimore, MD, Mar. 2000, pp. ThS3-1–ThS3-3.

[22] O. Ait Sab, and V. Lemaire, “Block turbo code performances for long-haul DWDM

optical transmission systems,” in OFC/IOOC’00 Technical Digest, Baltimore, MD,

Mar. 2000, pp. ThS5-1–ThS5-3.

[23] O. Ait Sab, “FEC techniques in submarine transmission systems,” in OFC/IOOC’01

Technical Digest, Anaheim, CA, Mar. 2001, pp. TuF1-1–TuF1-3.

[24] H. Taga, H. Yamauchi, T. Inoue, and K. Goto, “Performance improvement of

highly nonlinear long-distance optical fiber transmission system using novel high

gain forward error correcting code,” in OFC/IOOC’01 Technical Digest, Anaheim,

CA, Mar. 2001, pp. TuF3-1–TuF3-3.

[25] Y. Cai, N. Ramanujam, J. M. Morris, T. Adali, G. Lenner, A. B. Puc, and A.

Pilipetskii, “Performance limit of forward error correction codes in optical fiber

communications,” in OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001,

pp. TuF2-1–TuF2-3.

[26] Y. Cai, and J. M. Morris, “On Performance Bounds for Linear Codes in Optical Fi-

ber Communications Systems with Asymmetric Amplified Spontaneous Emission

Noise,” in Proceedings of Conference on Information Sciences and Systems, Balti-

more, MD, Mar. 2001.

[27] Y. Cai, J. M. Morris, T. Adalı, and C. R. Menyuk, “On The Effects of ASE Noise

Models on Turbo Code Decoder Performance in Optical Fiber Transmissions,” to

appear in CLEO/QELS’ 01 Tech. Digest, Baltimore, MD, May 2001, paper CThO.

[28] Y. Cai, T. Adalı, and C. R. Menyuk, “A line coding scheme for reducing timing

jitter in WDM soliton systems,” in OFC/IOOC’00 Technical Digest, Baltimore,

MD, Mar. 2000, pp. ThS4-1–ThS4-3.

[29] Y. Cai, T. Adalı, and C. R. Menyuk, “Error Mitigation System Using Line Coding

for Optical WDM Communications,” Patent Application S/N 06/185,400, filed on

February 28, 2000.

[30] Y. Takasaki, et al., “Two-level AMI line coding family for optical fiber systems,”

International Journal of Electronics, vol. 55, no. 1, pp. 121–131, July 1983.

[31] R. M. Brooks and A. Jessop, “Line coding for optical fiber systems,” International

Journal of Electronics, vol. 55, no. 1, pp. 81–120, July 1983.

[32] A. J. Sharland and A. Stevenson, “A simple in-service error detection scheme based

on the statistical properties of line codes for optical fibre systems,” International

Journal of Electronics, vol. 55, no. 1, pp. 141–158, July 1983.

[33] R. Petrovic, “5B6B optical fibre line code bearing auxiliary signals,” Electronics

Letters, vol. 24, no. 5, pp. 274–275, Mar. 1988.

[34] A. Wismeijer, P.W.G. Duijves, H. Van Harten, A.M.J. Koonen, J.S. Leong, P.E.

Schaafsma, and M. Weeda, “A 1.13 Gb/s optical transmission system with ternary

line code,” in Proc. of ECOC ’86, Barcelona, pp. 475–478.

[35] G. Hanke, and B. Hein, “Monomode transmission system operating with 1300 nm

lasers and 1550 nm DFB lasers at a bitrate of 2.23 Gbit/s,” in Proceedings IEEE

International Conference on Communications, Seattle, WA, Jun. 1987.

[36] A. M. J. Koonen, P. V. Eijk, P. H. V. Heijningen, and T. W. M. Mosch, “2.26

Gbit/s optical transmission system with 5B6B line coding,” Electronics Letters, vol.

26, no. 12, pp. 799–801, June 1990.

[37] W. A. Krzymien, “Transmission performance analysis of a new class of line codes

for optical fiber systems,” IEEE Transactions on Communications, vol. 37, no. 4,

pp. 402–404, Apr. 1989.

[38] I. J. Fair, W. D. Grover, W. A. Krzymien, and R. I. MacDonald, “Guided scram-

bling: a new line coding technique for high bit rate fiber optic transmission sys-

tems,” IEEE Transactions on Communications, vol. 39, no. 2, pp. 289–297, Feb.

1991.

[39] R. L. Fellows and T. B. Reynolds, “Synchronous optical digital transmission system

and method,” U.S. Patent, Patent no. 5,459,607, Oct. 17, 1995.

[40] R. S. Vodhanel, B. Enning, and A. F. Elrefaie, “Bipolar optical FSK transmission

experiments at 150 Mb/s and 1 Gb/s,” Journal of Lightwave Technology, vol. 6, no.

10, pp. 1549–1553, Oct. 1988.

[41] R. Noe, M. W. Maeda, S. G. Menocal, and C. E. Zah, “Pattern independent FSK

heterodyne transmission with AMI signal format and two channel cross-talk meas-

urements,” Journal of Optical Communications, vol. 10, no. 3, pp. 82–84, Sep.

1989.

[42] H. Tsushima, S. Sasaki, R. Takeyasi, and K. Uomi, “Alternate-mark-inversion opti-

cal continuous phase FSK heterodyne transmission using delay line demodulation,”

Journal of Lightwave Technology, vol. 9, no. 3, pp. 666–674, May 1991.

[43] P. W. Hooijmans, M. T. Tomesen, and A. Van de Grip, “Penalty free biphase line

coding for pattern independent FSK coherent transmission systems,” Journal of

Lightwave Technology, vol. 8, no. 3, pp. 323–328, Mar. 1990.

[44] R. C. Steele and M. Creaner, “565 Mbit/s AMI FSK coherent system using com-

mercial DFB lasers,” Electronics Letters, vol. 25, no. 11, pp. 732–734, May 1989.

[45] S. P. Majunder, R. Gangopadhyay, and G. Prati, “Effect of line coding on heterodyne

FSK optical systems with nonuniform laser FM response,” IEE Proceedings J,

Optoelectronics, vol. 141, no. 3, pp. 200–208, June 1994.

[46] E. Forestieri, and G. Prati, “Analysis of delay-and-multiply optical FSK receivers

with line coding and non-flat laser FM response,” IEEE Journal on Selected Areas

in Communications, vol. 13, no. 3, pp. 543–556, Apr. 1995.

[47] International Telecommunication Union Telecommunication standardization sector

(ITU-T), Series G: Transmission Systems and Media, Digital Systems and Net-

works, G.975.

[48] P. A. Humblet and M. Azizoglu, “On the bit error rate of lightwave systems with

optical amplifiers,” Journal of Lightwave Technology, vol. 9, no. 11, pp. 1576–

1582, Nov. 1991.

[49] D. Marcuse, “Derivation of analytical expressions for the bit-error probability in

lightwave systems with optical amplifiers,” Journal of Lightwave Technology, vol.

8, no. 12, pp. 1816–1823, Dec. 1990.

[50] R. Holzlöhner, V. S. Grigoryan, C. R. Menyuk, and W. L. Kath, “Accurate calcula-

tion of eye diagrams and error rates in long-haul transmission systems,” in

OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, pp. MF3-1–MF3-3.

[51] C. D. Poole and J. Nagel, “Polarization effects in lightwave systems,” in Optical

Fiber Telecommunications IIIA, I. P. Kaminow and T. L. Koch, eds. Academic, San

Diego, 1997, (Chap. 6).

[52] R. M. Mu, T. Yu, V. S. Grigoryan, and C. R. Menyuk, “Convergence of the CRZ

and DMS formats in WDM systems using dispersion management,” in

OFC/IOOC’01 Technical Digest, Anaheim, CA, Mar. 2001, pp. MF3-1–MF3-3.

[53] K. W. Cattermole, “Principles of digital Line Coding,” International Journal of

Electronics, vol. 55, no. 1, pp. 3–33, July 1983.

[54] J. L. LoCicero and B. P. Patel, “Line Coding,” The Communications Handbook,

Jerry D. Gibson, Editor-in-Chief, CRC Press, Boca Raton, FL, 1997.

[55] S. Lin, and D. J. Costello, Jr., Error control coding: fundamentals and applications,

Prentice Hall, Inc. Englewood Cliffs, NJ, 1983.

[56] C. Berrou, et al., “Near Shannon Limit Error-Correcting Coding and Decoding,” in

Proceedings IEEE International Conference on Communications, Geneva, Swit-

zerland, May 1993, pp. 1064-1070.

[57] S. Benedetto and G. Montorsi, “Unveiling Turbo Codes: Some Results on Parallel

Concatenated Coding Schemes,” IEEE Transactions on Information Theory, vol.

42, no. 3, Mar. 1996, pp. 409-428.

[58] G. S. Pandian and S. Dilwali, “On the thermal FM response of a semiconductor la-

ser diode,” IEEE Photonics Technology Letters, vol. 4, no. 2, pp. 130-133, Feb.

1992.

[59] S. J. Wang, Y. J. Wang, N. K. Dutta, and Y. Twu, “FM response of InGaAsP buried

heterostructure distributed feedback lasers and their applications in incoherent FSK

systems,” Journal of Lightwave Technology, vol. 8, no. 12, pp. 1769–1771, Dec.

1990.

[60] D. J. Costello, Jr., et al., “Applications of Error-Control Coding”, IEEE Transac-

tions on Information Theory, vol. 44, no. 6, Oct. 1998, pp. 2531-2560.

[61] A. R. Calderbank, “The Art of Signaling: Fifty Years of Coding Theory”, IEEE

Transactions on Information Theory, vol. 44, no. 6, Oct. 1998, pp. 2561-2595.

[62] C. E. Shannon, “A Mathematical Theory of Communication”, Bell System Tech. J.,

vol. 27, 1948, pp. 379-423, 623-656.

[63] L. F. Mollenauer, S. G. Evangelides, and J. P. Gordon, “Wavelength division multi-

plexing with solitons in ultra-long distance transmission using lumped amplifiers,”

Journal of Lightwave Technology, vol. 9, no. 3, pp. 362-367, Mar. 1991.

[64] L. F. Mollenauer, “Method for nulling nonrandom timing jitter in soliton transmission,”

Optics Letters, vol. 21, no. 6, pp. 384-386, Mar. 15, 1996.

[65] R. J. McEliece, The Theory of Information and Coding, Reading, Mass.: Addison-

Wesley Publishing Company, 1977.

[66] Y. V. Svirid, “Weight distributions and bounds for Turbo-codes,” European Trans-

actions on Telecommunications, vol. 6, no. 5, pp. 543–555, September–October,

1995.

[67] B. Vucetic and J. Yuan, Turbo codes principles and applications, Kluwer Academic

Publishers, Norwell, Massachusetts, 2000.

[68] S. B. Wicker, Error control systems for digital communication and storage,

Prentice Hall, Englewood Cliffs, NJ, 1995, p. 305.

[69] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, “Lower bounds to error prob-

ability for coding on discrete memoryless channels,” Information and Control, vol.

10, Part I: pp. 65–103, Part II: pp. 522–552, 1967.

[70] D. Divsalar, S. Dolinar, R. J. McEliece, and F. Pollara, “Transfer function bounds

on the performance of turbo codes,” Jet Propulsion Lab., Pasadena, CA, TDA Prog-

ress Report 42-122, pp. 44–55, Aug. 15, 1995.

[71] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for

minimizing symbol error rate,” IEEE Transactions on Information Theory, pp. 284–

287, Mar. 1974.

[72] C. Partridge, J. Hughes, and J. Stone, “Performance of checksums and CRC's over

real data”, Sigcomm '95, Cambridge, MA USA, 1995.

[73] G. L. Cariolaro, and G. P. Tronca, “Spectra of block coded digital signals”, IEEE

Transactions on Communications, vol. Com-22, no. 10, Oct. 1974.

[74] E. Biglieri, et al., Introduction to Trellis-Coded Modulation with Applications,

Macmillan, 1991.

[75] D. Haccoun, and G. Begin, “High-rate punctured convolutional codes,” IEEE

Transactions on Communications, vol. COM-37, no. 11, pp. 1113–1125, Nov 1989.

[76] J. W. Modestino, and S. Y. Mui, “Convolutional codes on Rician fading channels,”

IEEE Transactions on Communications, vol. COM-24, no. 6, pp. 592–606, June

1976.

[77] G. Ungerboeck, “Channel coding with amplitude/phase modulation,” IEEE Trans-

actions on Information Theory, vol. IT-28, pp. 55–67, Jan. 1982.

[78] A. J. Viterbi, “Convolutional codes and their performance in communication sys-

tems,” IEEE Transactions on Communications, vol. COM-19, no. 10, pp. 751–772,

Oct. 1971.

[79] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applica-

tions, Prentice-Hall, Inc. Englewood Cliffs, New Jersey, 1983.

[80] G. D. Forney, Jr., Concatenated codes, Cambridge, Mass.: MIT Press, 1966.

[81] J. M. Morris, “Burst error statistics of simulated Viterbi decoded BPSK on fading

and scintillating channels,” IEEE Transactions on Communications, vol. 40, no. 1,

Jan. 1992.

[82] J. M. Morris and J. Chang, “Burst error statistics of simulated Viterbi Decoded

BFSK and high-rate punctured codes on fading and scintillating channels,” IEEE

Transactions on Communications, vol. 43, no. 2/3/4, February/March/April 1995.

[83] PTDS Version 1.1 for Windows NT, Virtual Photonics Incorporated, 1999.

[84] A. Richter, and V. S. Grigoryan, “Efficient approach to estimate collision-induced

timing jitter in dispersion-managed WDM RZ systems,” in OFC/IOOC’99 Techni-

cal Digest, San Diego, California USA, Feb. 1999, pp. WM33-1–WM33-3.

[85] R. M. Mu, V. S. Grigoryan, C. R. Menyuk, G. M. Carter, and J. M. Jacob, “Com-

parison of theory and experiment for dispersion-managed solitons in a recirculating

fiber loop,” IEEE Journal on Selected Topics in Quantum Electronics, vol. 6, no. 2,

pp. 248–257, Mar. 2000.

[86] R. G. Gallager, Low Density Parity Check Codes, MIT Press, Cambridge, MA,

1963.

[87] Y. Kou, S. Lin and M. Fossorier, “Construction of Low Density Parity Check

Codes – A Geometric Approach,” in Proceedings of International Symposium on

Turbo Codes and Related Topics, Brest, France, 4–7 Sept. 2000.

[88] S. Lin, H. Tang, and Y. Kou, “Finite Geometry Low Density Parity Check Codes”,

in Proceedings of Conference on Information Sciences and Systems, Baltimore,

MD, Mar. 2001.

[89] D. J. C. MacKay, “Good Error-Correcting Codes Based on Very Sparse Matrices”,

IEEE Transactions on Information Theory, vol. 45, no. 3, pp. 399–432, Mar. 1999.

[90] D. J. C. MacKay and R. M. Neal, “Near Shannon Limit Performance of Low

Density Parity Check Codes”, Electronics Letters, vol. 32, no. 18, pp. 1645–1646, Aug.

1996.

[91] S.-Y. Chung, G. D. Forney, Jr., T. J. Richardson, and R. Urbanke, “On the Design

of Low-Density Parity-Check Codes within 0.0057 dB from the Shannon Limit”,

IEEE Communications Letters, vol. 5, no. 2, pp. 58–60, Feb. 2001.

[92] W. E. Ryan, “A turbo code tutorial,” http://www.ece.arizona.edu/~ryan/turbo2c.pdf.

[93] L. F. Mollenauer, J. P. Gordon, and M. N. Islam, “Soliton propagation in long fibers

with periodically compensated loss,” IEEE Journal of Quantum Electronics, vol.

QE-22, pp. 157-173, Jan. 1986.

[94] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Draft

2.0.7, Part II, Chapter 10, Feb. 14, 2000.

[95] I. Sason and S. Shamai, “Improved upper bounds on the ML decoding error prob-

ability of parallel and serial concatenated turbo codes via their ensemble distance

spectrum,” IEEE Transactions on Information Theory, vol. 46, no. 1, pp. 24–47,

Jan. 2000.

[96] G. P. Agrawal, Nonlinear fiber optics, 2nd edition, Academic Press, Inc., San Di-

ego, CA, 1995.
