EFFICIENT COMMUNICATIONS FOR NONSYMMETRIC INFORMATION
SOURCES WITH APPLICATION TO PICTURE TRANSMISSION
By
SHAHRIAR EMAMI
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1993
To my parents
for their love, patience and support.
ACKNOWLEDGEMENTS
I wish to express my gratitude to my advisor, Dr. Scott
L. Miller, for his encouragement, support and friendship. I am
also grateful to Dr. Couch for his guidance and insight. In
addition, I would like to thank Dr. Childers and Dr. Najafi
for their time and interest in serving on my committee.
Special thanks also go to Dr. Sigmon who has offered his
valuable assistance throughout the course of this study.
TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER ONE REVIEW OF MODULATION TECHNIQUES, SOURCE AND CHANNEL CODING
 1.1 Introduction
 1.2 Two-Dimensional Modulation Formats
 1.3 Source Coding
 1.4 DPCM
 1.5 Transform Coding
 1.6 Channel Coding
 1.7 Transmission Errors in a DPCM System
 1.8 Optimum Prediction for Noisy Channels
 1.9 Research Objectives
 1.10 Description of Chapters
CHAPTER TWO DPCM VIDEO SIGNAL: A NONSYMMETRIC INFORMATION SOURCE
 2.1 Introduction
 2.2 Basics of Quantizers
 2.3 Approaches to Quantizer Design
 2.4 MSQE Quantizer Design
 2.5 Analysis of DPCM Encoder
 2.6 Results
 2.7 Discussion
CHAPTER THREE SIGNAL DESIGN FOR NONSYMMETRIC SOURCES
 3.1 Introduction
 3.2 Maximum Likelihood Signal Design for Three Signals
 3.3 A Numerical Approach Based on the Lagrange Multipliers Method
 3.4 Minimum Error Signal Selection
 3.5 Minimum Average Cost Signal Selection
 3.6 Results
CHAPTER FOUR ALPHABET SIZE SELECTION FOR VIDEO SIGNAL CODING
 4.1 Introduction
 4.2 Preliminaries
 4.3 Analysis
 4.4 Implementation Issues
 4.5 Nonbinary BCH Codes
 4.6 Results
 4.7 Summary
CHAPTER FIVE EFFICIENT DECODING OF CORRELATED SEQUENCES
 5.1 Introduction
 5.2 Optimum Decoding of Markov Sequences
 5.3 A Modified MAP (MMAP) Receiver
 5.4 A Minimum Cost Decoder
 5.5 A Maximum Signal-to-Noise Ratio (MSNR) Receiver
 5.6 Redundancy in the Encoded Signals
 5.7 Picture Transmission over Noisy Channels
 5.8 Side Information
 5.9 Summary
CHAPTER SIX CONCLUSIONS AND SUMMARY
 6.1 Summary of the Work
 6.2 Directions of Future Research
APPENDIX A EVALUATION OF AN INTEGRAL
APPENDIX B EVALUATION OF THE DERIVATIVES
REFERENCES
BIOGRAPHICAL SKETCH
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
EFFICIENT COMMUNICATIONS FOR NONSYMMETRIC INFORMATION
SOURCES WITH APPLICATION TO PICTURE TRANSMISSION
By
Shahriar Emami
August 1993
Chairperson: Dr. Scott L. Miller
Major Department: Electrical Engineering
This dissertation is concerned with issues related to
nonsymmetric information sources. Signal design, alphabet size
selection and decoding of information from these sources are
among the topics covered in this dissertation. Although the
techniques presented here are applicable to any nonsymmetric
source, the emphasis is placed on video sources. Initially a
model for the statistics of DPCM (Differential Pulse Code
Modulation) of video signals is derived and it is shown that
DPCM of video signals results in a nonsymmetric source.
The problem of signal selection for nonsymmetric sources
in two dimensions is considered. Iterative methods for finding
the minimum error signal (and minimum cost) constellation
subject to an average (or a peak) power constraint are
presented.
Even though efficient techniques for source coding,
channel coding and signal design exist, it is not known how
the choice of alphabet size affects a communication system.
Image transmission systems with various alphabet sizes are
compared on the basis of equal information rate, bandwidth and
average power. The systems employing various alphabet sizes
are analyzed, and computer simulations are performed using
pictures with different amounts of detail.
An optimum procedure for decoding Markov sequences is
developed and the path metric is derived. A heuristic tree
searching algorithm is employed to obtain a suboptimum
solution.
Two other techniques for decoding Markov sequences, a
symbol-by-symbol modified MAP (MMAP) receiver using higher
order statistics and a maximum signal-to-noise ratio (MSNR)
receiver, are also given. The decoding procedures were applied
to image communication over noisy channels.
In summary, the major contributions of this dissertation
are the development of signal selection methods for
nonsymmetric sources, the derivation of procedures for
decoding correlated sources and the application of these
procedures to picture communication over noisy channels.
CHAPTER ONE
REVIEW OF MODULATION TECHNIQUES, SOURCE AND CHANNEL CODING
IN DIGITAL COMMUNICATIONS
1.1 Introduction
The goal of this chapter is to present the background
necessary to follow the work presented in this dissertation,
introduce the research objectives and give a brief description
of the chapters.
Since two-dimensional modulation has been utilized in
this work, these formats are reviewed, their spectral
efficiency is calculated and the upper bound on their
spectral efficiency is given. Picture transmission is one of
the applications presented here; to familiarize the reader
with the field, a number of source coding techniques are
described. Error correction over noisy channels is an
important topic and has been utilized in this thesis. The gain
from channel coding is explained and different methods of
error correction are discussed. The effects of channel errors
in a DPCM system and optimum prediction for noisy channels are
presented, because the enhancement of picture quality in DPCM
systems is addressed in this dissertation.
1.2 Two-Dimensional Modulation Formats
M-ary phase-shift keying (MPSK) and quadrature amplitude
modulation (QAM) are among the most popular two-dimensional
formats. In MPSK the transmitted signal is given by

s(t) = Re{g(t) e^{jω_c t}}    (1.1)

where

g(t) = A e^{jθ(t)},    (1.2)

θ(t) = 2π(i−1)/M,  i = 1, 2, ..., M.
In other words, in MPSK the amplitude is held constant while
the phase of the signal takes on one of M values in each
symbol interval. MPSK with M = 4 is called quadrature
phase-shift keying (QPSK).
In MPSK the signal points are confined to the circumference
of a circle. In QAM, by contrast, the transmitted signal is
s(t) = Re{g(t) e^{jω_c t}}    (1.3)

where

g(t) = x(t) + j y(t),    (1.4)

s(t) = x(t) cos(ω_c t) − y(t) sin(ω_c t).    (1.5)

The waveforms x(t) and y(t) are

x(t) = Σ_i x_i h(t − iT)    (1.6)

and

y(t) = Σ_i y_i h(t − iT),    (1.7)
where T is the symbol interval in seconds and h(t) is the
pulse shape.
Let us find the spectral efficiency of MPSK and QAM with
rectangular pulses. The null-to-null transmission bandwidth of
MPSK and QAM is

B_T = 2R/ℓ    (1.8)

where M = 2^ℓ is the number of points in the signal constellation
and R is the bit rate. The spectral efficiency is therefore given by

η = R/B_T = ℓ/2  bits/sec/Hz.    (1.9)
When operating over a bandlimited channel whose overall
pulse shape satisfies the raised cosine rolloff filter
characteristics, the bandwidth of the modulating signal is

B = (1 + r)D/2    (1.10)

where D = R/ℓ is the symbol rate and r is the rolloff factor.
Since B_T = 2B, the transmission bandwidth of QAM is

B_T = (1 + r)R/ℓ    (1.11)

and the spectral efficiency with raised cosine filtering is
given by

η = log₂M/(1 + r)  bits/sec/Hz.    (1.12)
The spectral efficiency increases with the number of
points in the constellation. However, one cannot increase the
spectral efficiency indefinitely by adding points to the
signal constellation, because as more signals are placed in
the constellation the error rate increases.
For reliable communications, the information rate must be kept
below the channel capacity. Therefore, the spectral efficiency
is upper bounded by

η ≤ log₂(1 + S/N)    (1.13)

where S/N is the signal-to-noise power ratio.
Two dimensional formats are well suited for high speed
data transmission because of their efficient use of bandwidth.
However, they require coherent detection, which implies the
need for synchronization circuits.
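As a concrete check on eqs. (1.12) and (1.13), the sketch below computes both quantities for an illustrative 16-QAM link; the function names and the 20 dB operating point are my assumptions, not values from the text.

```python
import math

def spectral_efficiency(M, rolloff):
    """Spectral efficiency of M-ary PSK/QAM with raised-cosine filtering, eq. (1.12)."""
    return math.log2(M) / (1 + rolloff)

def capacity_bound(snr_linear):
    """Upper bound on spectral efficiency from channel capacity, eq. (1.13)."""
    return math.log2(1 + snr_linear)

# 16-QAM with 25% rolloff: eta = 4 / 1.25 = 3.2 bits/sec/Hz
eta = spectral_efficiency(16, 0.25)
# capacity bound at an assumed 20 dB signal-to-noise ratio
bound = capacity_bound(10 ** (20 / 10))
print(eta, bound)
```

The comparison shows the trade-off discussed above: the constellation size can be grown only while the resulting efficiency stays under the capacity bound for the available S/N.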
1.3 Source Coding
The purpose of source coding is to remove as much
redundancy as possible from the message. Efficient coding of
messages provides the opportunity to decrease transmission
costs significantly. Two main approaches to picture coding,
predictive coding and transform coding, will be addressed here.
1.4 DPCM
There is considerable correlation between adjacent
samples of speech or image data, and indeed the correlation is
significant even between samples that are several sampling
intervals apart. The meaning of this high correlation is that,
in an average sense, the signal does not change rapidly from
sample to sample so that the difference between adjacent
samples should have a lower variance than the variance of the
signal itself.
The predicted value is the output of the predictor
system, whose input is a quantized version of the input
signal. The difference signal may also be called the
prediction error signal, since it is the amount by which the
predictor fails to exactly predict the input.
Since the variance of error signal is smaller than the
variance of signal, a quantizer with a given number of levels
can be adjusted to give a smaller quantization error than
would be possible when quantizing the input directly.
1.4.1 Optimum Prediction
We are interested in linear prediction of the form

x̂(n) = Σ_{j=1}^{N} a_j x(n−j)    (1.14)

which is the weighted sum of N previous samples. The weights
a_j are the linear prediction coefficients. The filter is
optimized by finding the weights that minimize the prediction
error in a mean squared sense,

ε² = E[(x(n) − x̂(n))²].    (1.15)

Since the mean squared error is a function of the a_j,

∂ε²/∂a_i = 0;  i = 1, 2, ..., N    (1.16)

is a necessary condition for minimum MSE (mean-squared error).
Evaluating the derivative gives

∂ε²/∂a_i = E[−2(x(n) − x̂(n)) x(n−i)].    (1.17)

Equating this to zero yields

E[{x(n) − x̂(n)} x(n−i)] = 0;  i = 1, 2, ..., N.    (1.18)
This is called the orthogonality principle, which states that
the minimum error must be orthogonal to all data used in the
prediction. The expansion of this equation gives the following
condition for the optimum a_j,

Σ_{j=1}^{N} a_j R_x(k−j) = R_x(k);  k = 1, 2, ..., N    (1.19)

or, written out,

[ R_x(0)    R_x(1)    ... R_x(N−1) ] [ a_1 ]   [ R_x(1) ]
[ R_x(1)    R_x(0)    ... R_x(N−2) ] [ a_2 ] = [ R_x(2) ]
[   ...       ...     ...    ...   ] [ ... ]   [  ...   ]
[ R_x(N−1)  R_x(N−2)  ... R_x(0)   ] [ a_N ]   [ R_x(N) ]    (1.20)

or, in matrix notation,

R a = r    (1.21)

where

R = {R_x(i−j)},  r = {R_x(i)};  i, j = 1, 2, ..., N.    (1.22)

These are called the normal equations, the Yule-Walker
prediction equations or the Wiener-Hopf equations.

The mean squared error decreases significantly by using
up to three elements in predictive coding. However, if the
coefficients are not exactly matched to the statistics of a
picture, the decrease in mean squared error from using three
previous elements as compared to one is not significant [1].
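Because R in eq. (1.21) is a Toeplitz autocorrelation matrix, the normal equations can be solved in O(N²) operations with the Levinson-Durbin recursion. A minimal pure-Python sketch (the function name and the test autocorrelation sequence are mine):

```python
def levinson_durbin(r):
    """Solve the normal equations R a = r of eq. (1.21), where R is the
    Toeplitz matrix built from r = [R_x(0), R_x(1), ..., R_x(N)].
    Returns the prediction coefficients a_1..a_N and the final MSE."""
    a = []        # coefficients of the current-order predictor
    err = r[0]    # prediction error variance, starts at var(x)
    for k in range(1, len(r)):
        # reflection coefficient from the residual correlation at lag k
        acc = r[k] - sum(a[j] * r[k - 1 - j] for j in range(len(a)))
        refl = acc / err
        a = [a[j] - refl * a[k - 2 - j] for j in range(len(a))] + [refl]
        err *= 1 - refl * refl
    return a, err

# For a source with R_x(k) = 0.9**k, one previous element suffices:
coeffs, mse = levinson_durbin([1.0, 0.9, 0.81])
print(coeffs, mse)  # the second coefficient is ~0: extra taps add nothing here
```

For this exponentially correlated source the second tap vanishes, mirroring the observation above that going beyond one previous element may gain little when the correlation is simple [1].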
1.5 Transform Coding
In transform coding a picture is divided into subpictures,
and each of these subpictures is transformed into a set
of independent coefficients. The coefficients are then
quantized and coded for transmission. An inverse
transformation is applied to recover the picture-element
intensities. Much of the compression results from dropping
small coefficients from transmission and coarsely quantizing
the others as the required picture quality allows.
It is desirable to have a transform which compacts most
of the image energy in as few coefficients as possible.
Another consideration is the ease of implementation.
1.5.1 Optimum Transform
The optimum transform (the Karhunen-Loève, or KL, transform)
is explicitly known, but it is computationally very demanding.
This undesirable feature has prevented hardware implementation
of the optimum transform; it is mainly studied in simulations
to obtain bounds.
The most practical transform coding techniques are based
on the DCT (discrete cosine transform), which provides a good
compromise between information packing ability and
computational complexity. In fact, the properties of DCT have
proved of such practical value that it has become the
international standard for transform coding systems. In
addition, it minimizes the blocking artifact that results
when the boundaries between the subimages become visible.
1.5.2 Size of Subpictures
Computer simulations on real pictures show that the mean
square error produced by transform coding improves with the
size of the subpicture. However, the improvement is not
significant as the subpicture is enlarged beyond a block size
of 16x16. Subjective quality of pictures, however, does not
appear to improve with block sizes beyond 4x4 pixels [2].
1.5.3 Bit Allocation
One method of choosing the coefficients for transmission
is to evaluate the coefficient variances on a set of typical
pictures and then discard all the coefficients whose variance
is lower than a certain value. Such a scheme is called zonal
filtering [3].
Having decided which coefficients to transmit, we must
then design a quantizer for each of them. This can be done
by dividing a given total number of bits among all the
coefficients. To minimize the mean square error for
a given total number of bits for Gaussian variables, the
optimum assignment makes the average quantization error of
each coefficient the same. This requires that the bits be
assigned to the coefficients in proportion to the logarithm
of their variance.
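The equal-error rule above has a commonly used closed form: each coefficient receives the average bit budget plus half the log-ratio of its variance to the geometric mean of the variances. The sketch below is a generic illustration of that rule, not a procedure taken from this dissertation; the function name and example variances are mine.

```python
import math

def allocate_bits(variances, total_bits):
    """Allocate bits so each coefficient's average quantization error is
    roughly equal: b_i = B/N + 0.5*log2(var_i / geometric_mean(var)).
    Fractional bits are returned; a real coder would round and redistribute."""
    n = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / n  # log2 of geometric mean
    return [total_bits / n + 0.5 * (math.log2(v) - log_gm) for v in variances]

# Illustrative coefficient variances for a 4-coefficient block, 16-bit budget
bits = allocate_bits([16.0, 4.0, 1.0, 0.25], total_bits=16)
print(bits)  # high-variance coefficients get more bits; the total is preserved
```

Note that the allocation depends on the variances only through their logarithms, which is exactly the proportionality stated in the text.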
1.6 Channel Coding
Channel coding is a method of inserting structured
redundancy into the source data so that transmission errors
can be identified and corrected. Block coding and
convolutional coding are two important subcategories of
channel coding techniques.
1.6.1 Block Codes
With block coding the source data is first segmented into
blocks of k bits; each block can represent any of M = 2^k
distinct messages. The encoder transforms each message into a
larger block of n digits. This set of 2^k coded messages is
called a code block. The (n − k) digits the encoder adds
to each message block are called redundant digits; they carry
no new information. The ratio of data bits to total bits
within a block, k/n, is called the code rate. The code itself
is referred to as an (n,k) code.
To demonstrate the performance improvement possible with
channel coding, let us pick a (15,11) single error correcting
code. Assume BPSK modulation, a signal-to-noise ratio of
S/N₀ = 43,776 and a data rate of R = 4800 b/s. Let P_ub
and P_um represent the bit and message error rates for the
uncoded system, and P_cb and P_cm represent the bit and
message error rates for the coded system, respectively.

Without coding,

E_b/N₀ = S/(R N₀) = 9.6 dB,    (1.23)

P_ub = Q(√(2E_b/N₀)) ≈ 1.02×10⁻⁵    (1.24)

and

P_um = 1 − (1 − P_ub)^11 ≈ 1.12×10⁻⁴.    (1.25)

With coding, the data rate becomes R_c = 4800×(15/11) = 6545 b/s, so

E_b/N₀ = S/(R_c N₀) = 8.25 dB,

P_cb = Q(√(2E_b/N₀)) ≈ 1.36×10⁻⁴.    (1.26)

The bit error rate for the coded system is inferior to that of
the uncoded system, and the performance improvement due to
coding is not apparent yet.

Since the code corrects all single errors within a block
of 15 bits, the message error rate for the coded system will
be

P_cm = Σ_{k=2}^{15} (15 choose k) P_cb^k (1 − P_cb)^{15−k} ≈ 1.94×10⁻⁶.    (1.27)

Comparing the message error rates shows an improvement by a
factor of 58 through the use of a block code.
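The (15,11) example can be reproduced numerically. Small differences from the quoted figures come from evaluating Q(x) exactly rather than reading it from tables.

```python
import math

def Q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

S_over_N0 = 43776.0     # received S/N0 from the example
R = 4800.0              # uncoded data rate, b/s

# Uncoded BPSK: Eb/N0 = S/(R*N0), Pb = Q(sqrt(2*Eb/N0)), eqs. (1.23)-(1.25)
ebn0_u = S_over_N0 / R
p_ub = Q(math.sqrt(2 * ebn0_u))
p_um = 1 - (1 - p_ub) ** 11                 # 11-bit message error rate

# (15,11) single-error-correcting code: the rate increase shrinks Eb/N0
Rc = R * 15 / 11
ebn0_c = S_over_N0 / Rc
p_cb = Q(math.sqrt(2 * ebn0_c))
# message fails only when two or more bit errors fall in a block, eq. (1.27)
p_cm = sum(math.comb(15, k) * p_cb ** k * (1 - p_cb) ** (15 - k)
           for k in range(2, 16))

print(p_ub, p_um)   # roughly 1e-5 and 1e-4
print(p_cb, p_cm)   # bit rate is worse, but the message rate drops sharply
```

The numbers confirm the point of the example: coding degrades the raw bit error rate yet improves the message error rate by well over an order of magnitude.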
Most of the research on block codes has been concentrated
on a subclass of linear codes known as cyclic codes. A cyclic
code word, after any number of cyclic shifts, has the property
of remaining a valid code word from the original set of code
words. Cyclic codes are attractive because they can be easily
implemented with feedback shift registers. The decoding
methods are simple and efficient.
Examples of cyclic and related codes are BCH, Reed-Solomon,
Hamming, Reed-Muller, Golay, quadratic residue and
Goppa codes. These classes form overlapping sets, so a
particular code may be both a BCH code and a quadratic residue
code. Recent applications of these codes to digital
communication include a (31,15) Reed-Solomon code for the
Joint Tactical Information Distribution System (JTIDS) and a
(127,112) BCH code for the INTELSAT V system [4].
1.6.2 Convolutional Codes
A convolutional encoder consists of shift registers
and modulo-2 adders. In the general case, k bits at a time
are entered into the shift register, and the code rate is k/n.
The state of the encoder is determined by the contents of the
shift registers.
Convolutional codes can be described by a code tree. It
is seen that the tree contains redundant information which can
be eliminated by merging, at any level, all nodes
corresponding to the same encoder state. The redrawing of the
tree with merging paths has been called a trellis by Forney.
The problem of decoding a convolutional code can be thought of
as attempting to find a path through the trellis or the tree
by making use of some decoding rule.
The Viterbi algorithm [5], which has been shown to be a maximum
likelihood decoder for convolutional codes, involves computing
a metric between the received signal and the trellis paths
entering each state. When two paths terminate
on a given state, the one having the better
metric is stored (the surviving path). This selection of a
survivor is performed for the paths entering each of the
other states as well. The decoder continues in this way to advance
deeper into the trellis, making decisions by eliminating the
least likely paths.
The complexity of the Viterbi algorithm is an exponential
function of the code's constraint length. For large values of
the constraint length (K >> 10) one might consider other
decoding algorithms.
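As an illustration of the survivor-selection rule, here is a hard-decision Viterbi decoder for a small rate-1/2, constraint-length-3 code. The generator pair (7, 5) octal, the Hamming path metric and all function names are my choices for the sketch, not a code taken from the text.

```python
# Hard-decision Viterbi decoding of the rate-1/2, constraint-length-3
# convolutional code with generators (7, 5) octal.  The state is the two
# most recent input bits; path metrics count Hamming distance.

G = (0b111, 0b101)  # generator polynomials

def encode(bits):
    state = 0
    out = []
    for b in bits:
        reg = (b << 2) | state                  # [newest bit | state]
        out += [bin(reg & g).count("1") % 2 for g in G]
        state = reg >> 1
    return out

def viterbi(received):
    n_states, INF = 4, float("inf")
    metric = [0] + [INF] * (n_states - 1)       # start in the all-zero state
    paths = [[] for _ in range(n_states)]
    for i in range(0, len(received), 2):
        r = received[i:i + 2]
        new_metric = [INF] * n_states
        new_paths = [None] * n_states
        for s in range(n_states):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b << 2) | s
                out = [bin(reg & g).count("1") % 2 for g in G]
                ns = reg >> 1
                m = metric[s] + sum(x != y for x, y in zip(out, r))
                if m < new_metric[ns]:          # keep the surviving path
                    new_metric[ns] = m
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(n_states), key=lambda s: metric[s])
    return paths[best]

msg = [1, 0, 1, 1, 0, 0]      # two tail zeros flush the encoder
code = encode(msg)
code[3] ^= 1                  # inject one channel error
print(viterbi(code))          # recovers the transmitted bits
```

With free distance 5, this code corrects the injected single error; the merging of paths at each state is exactly the trellis redundancy elimination described above.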
The complexity of sequential decoders is relatively
independent of the constraint length, so codes with much larger
constraint lengths can be used. This technique is also more
suitable than the Viterbi algorithm for low bit error rates.
Sequential decoding [5] was first introduced by
Wozencraft but the most widely used algorithm to date is due
to Fano. It is an efficient method for finding the most
probable code word, given the received sequence, without
searching the entire tree. The explored path is probably only
local; that is, the procedure is suboptimum. The search is
performed in a sequential manner, always operating on a single
path, but the decoder can back up and change previous
decisions. Each time the decoder moves forward, a tentative
decision is made. If an incorrect decision is made, subsequent
extensions of the path will be wrong. The decoder will
eventually be able to recognize the situation. When this
happens, a substantial amount of computation is needed to
recover the correct path. Backtracking and trying alternate
paths continue until it finally decodes successfully.
Convolutional codes using either Viterbi or sequential
decoding have the ability to utilize whatever soft-decision
information might be available to the decoder. It is not
surprising that they have been used widely even though their
theory is not as mathematically profound as that of the
block codes. Most good convolutional codes have been found by
computer search rather than algebraic construction.
1.7 Transmission Errors in a DPCM System
Differential PCM systems are affected differently by bit
errors than PCM systems because the DPCM decoder loop causes
an error propagation, while a PCM error does not propagate in
time. Subjectively, DPCM is more errorrobust than PCM in
speech coding, but less robust than PCM for image coding.
Assume a channel error changes the channel input u(n) to a
wrong value v(n). Due to the linearity of the decoder filter,
the correct output is superposed with an error output
caused by the input ε(n) = u(n) − v(n) to the decoder loop.
Since the decoder is an all-pole filter, there will be an
infinite sequence of error samples at the output, with
decaying amplitudes. In the case of first-order prediction,
the effect on a future value at time m is described by [3]

ε(m) = ε(n) a^(m−n);  m ≥ n.    (1.28)

Transmission errors therefore propagate in the reconstructed
DPCM waveform.
This kind of error smearing is perceptually desirable in
speech coding where a PCM error spike of large magnitude is
more annoying than a low amplitude error smeared over a long
duration.
In picture coding, on the other hand, error propagation
is perceptually very undesirable, taking the form of very
visible streaks or blotches with one and two dimensional
predictors.
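Equation (1.28) is easy to visualize: feed a single erroneous sample into a first-order decoder loop and watch the geometric decay. The leak factor and names below are illustrative choices of mine.

```python
# A single channel error epsilon at time 0 drives the first-order DPCM
# decoder loop y(m) = a*y(m-1) + u(m); eq. (1.28) says the output error
# decays as epsilon * a**m.

def error_response(a, epsilon, length):
    """Decoder output error caused by one erroneous channel input."""
    y, out = 0.0, []
    for m in range(length):
        u = epsilon if m == 0 else 0.0   # the lone erroneous input sample
        y = a * y + u
        out.append(y)
    return out

resp = error_response(a=0.9, epsilon=1.0, length=8)
print(resp)  # 1.0, 0.9, 0.81, ... : the streak visible in DPCM pictures
```

For a predictor coefficient near one, the decay is slow, which is why a single error smears over many samples and appears as a streak in DPCM-coded images.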
1.8 Optimum Prediction for Noisy Channels
One of the early approaches to system optimization under
noisy conditions was presented by Chang and Donaldson [6].
Because of the importance of the result and the relevance to
this dissertation a summary of the methods is given here.
Let r_i denote the received signal and f_i the impulse
response of the DPCM decoder. The output of the decoder is
therefore given by

x̂_i = r_i * f_i = (s_i + n_i) * f_i
    = s_i * f_i + n_i * f_i = x_i + q_i * f_i + n_i * f_i.    (1.29)

Let us define the reconstruction error X'_i as

X'_i = x̂_i − x_i = q_i * f_i + n_i * f_i,

so that

var(X'_i) = E[Q_i²] + 2 Σ_k f_k E[Q_i N_{i−k}]
          + Σ_k Σ_l f_k f_l E[N_{i−k} N_{i−l}].    (1.30)

Let us assume the channel noise is uncorrelated and the
difference signal samples are statistically independent; then

var(X'_i) = E[Q_i²] + 2 E[Q_i N_i] + E[N_i²] Σ_k f_k².    (1.31)

The second term is called the mutual error and can be shown to
be approximately zero if the quantizer is near optimum. The
error power reduces to

var(X'_i) = E[Q_i²] + E[N_i²] Σ_k f_k².    (1.32)

The sum can be evaluated using an identity,

Σ_k f_k² = 1 + a² + a⁴ + a⁶ + ... = 1/(1 − a²) = b.    (1.33)

The expression for the reconstruction error variance then
becomes

var(X'_i) = var(Q_i) + b var(N_i).    (1.34)

We now define the following quantities to relate the quantizer
and noise variances to the differential signal variance,

var(Q_i) = K_q var(E_i),
var(N_i) = K_n var(E_i).    (1.35)

The DPCM prediction gain is also given by

G_p = var(X_i)/var(E_i) = 1/(1 + a² − 2a ρ₁),    (1.36)

where ρ₁ is equal to R_x(1)/R_x(0). Putting all this together
yields

var(X'_i) = (K_q + b K_n) var(E_i).    (1.37)

The second term above is dominant because the effect
of channel noise is much more destructive to the
reconstruction of the image than the effect of quantization
noise, so

var(X'_i) ≈ K_n var(X_i) (1 − 2a ρ₁ + a²)/(1 − a²).    (1.38)

To minimize the variance of the reconstruction error we
set the derivative of this expression with respect to a to
zero. The optimum value of a turns out to be

a = (1 − √(1 − ρ₁²))/ρ₁.    (1.39)
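The optimum leak factor of eq. (1.39) can be checked numerically by brute force over a grid. The helper names and the choice ρ₁ = 0.95 are mine.

```python
import math

def recon_error_factor(a, rho):
    """Channel-noise amplification factor of eq. (1.38), up to the constant K_n."""
    return (1 - 2 * a * rho + a * a) / (1 - a * a)

def optimum_a(rho):
    """Closed-form minimizer, eq. (1.39): a = (1 - sqrt(1 - rho^2)) / rho."""
    return (1 - math.sqrt(1 - rho * rho)) / rho

rho = 0.95
a_star = optimum_a(rho)

# brute-force search over a fine grid to confirm nothing nearby does better
grid = [k / 1000 for k in range(1, 1000)]
a_best = min(grid, key=lambda a: recon_error_factor(a, rho))
print(a_star, a_best)  # the two agree to the grid resolution
```

Note that a_star is noticeably smaller than ρ₁ itself: over a noisy channel the optimum predictor "leaks" so that channel errors decay faster, at a small cost in prediction gain.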
1.9 Research Objectives
This dissertation is concerned with issues related to
nonsymmetric information sources. To motivate the work on
nonsymmetric sources, it is shown that DPCM of digitized video
signals results in a nonsymmetric information source. One of
the main goals is to address the problem of signal design in
two dimensions for nonsymmetric sources. It is desired to find
an algorithmic solution to the minimum error signal
constellation for average and peak power constraints. In
addition the general case where the cost function is not
necessarily the error rate is discussed.
Even though efficient techniques for source coding,
channel coding and signal design exist, it is not known how
the choice of alphabet size affects a communication system. We
would like to compare communication systems with various
alphabet sizes for the transmission of video signals on the
basis of equal information rate, bandwidth and average power.
Two realistic situations will be considered: operating under a
tight bandwidth constraint, and operating under a somewhat
loose one.
System performance can be improved using standard error
correction techniques at the cost of increasing the bandwidth
or reducing the information rate. However, we would like to
use inherent asymmetry and redundancy in the transmitted
picture to improve the reception. We will model the data as a
Markov source and derive the optimum method for decoding the
data. We will also find a receiver that, instead of minimizing
the error rate, maximizes the signal-to-noise ratio (SNR).
1.10 Description of Chapters
A review of background material relevant to this
dissertation is given in Chapter One. Two-dimensional
modulation techniques, source coding techniques for images,
standard error correcting techniques and the effect of channel
errors on predictive systems are among the topics addressed in
this chapter. In Chapter Two a DPCM system will be analyzed
and a model for the statistics of the source will be derived.
It will be shown theoretically and empirically that DPCM of
video signals produces nonsymmetric sources.
The issues of signal design are addressed in Chapter
Three. Algorithmic solutions to signal design for nonsymmetric
information sources under average and peak power constraints
for minimizing the error rate and average cost are presented.
The study of the role of alphabet size for nonsymmetric
sources in a communication system is given in Chapter Four. In
Chapter Five various methods for decoding Markov sequences are
presented. The application to the transmission of video
signals over noisy channels and a comparison of the methods
are also given. Chapter Six contains a summary of the presented
approaches, conclusions and comments regarding future
research directions.
CHAPTER TWO
DPCM VIDEO SIGNAL: A NONSYMMETRIC
INFORMATION SOURCE
2.1 Introduction
The purpose of this chapter is to demonstrate that DPCM
(differential pulse code modulation) of pictures results in a
nonsymmetric information source.
To do so, some introductory material is presented first.
Since a quantizer is an important component of a DPCM system,
it will be examined in some detail. Quantizers will be
introduced, different criteria for the design will be
mentioned and the procedure for finding an optimal quantizer
(in the MSQE sense) will be explained step by step. A DPCM
encoder will be analyzed and a model for the resulting source
statistics will be given. Finally, the model will be
compared with actual picture statistics.
2.2 Basics of Quantizers
Quantization is the process of rounding sample values to
a finite number of discrete values. This process is not an
information preserving process and the reconstructed signal is
only as good as the quantized samples allow. In other words,
there remains some error, the quantization error between the
original and the reconstructed waveform which is related to
the parameters of the quantizer.
Let the analog signal be modeled as a random waveform and
let p(x) be the probability density function of the signal.
The process of quantization subdivides the range of the values
of x into a number of discrete intervals. If a particular
sample value of the analog signal falls anywhere in a given
interval, it is assigned a single discrete value corresponding
to that interval. The intervals fall between boundaries
denoted by x₁, x₂, ..., x_{L+1}, where there are L intervals. The
quantized values are denoted by l₁, l₂, ..., l_L and are called
quantum levels or representative levels. The width of an
interval, x_{i+1} − x_i, is called the interval's step size. If all
the steps are equal and, in addition, the quantum level
separations are all the same, the quantizer is said to be
uniform; otherwise it is a nonuniform quantizer.
It is possible to design a quantizer for a given
probability density function and a given number of levels. The
optimal quantizer is nonuniform unless the signal has a
uniform pdf. If a uniform quantizer is used instead the mean
squared quantization error will be larger than that of the
optimal nonuniform quantizer.
2.3 Approaches to Quantizer Design
The quantizers can be designed based on a mean squared
error criterion. This results in overspecification of the low
detailed areas of picture and consequently a small amount of
granular noise but relatively poor reproduction of edges.
It has been recognized for some time that for better
picture quality, quantizers should be designed on the basis of
psychovisual criteria [7]. One method of designing
psychovisual quantizers is to minimize a weighted mean squared
quantization error, where the weights are derived from
subjective experiments [8]. Such an optimization is similar
to the mean squared error criterion, with the density function
replaced by a weighting function.
2.4 MSQE Quantizer Design
An optimal quantizer is defined to be a quantizer with
the smallest mean squared quantization error. It is desired to
find the quantizer that minimizes the mean-squared
quantization error for a given probability density function
and number of levels. The mean-squared quantization error is
given by

E = Σ_{i=1}^{L} ∫_{x_i}^{x_{i+1}} (x − l_i)² p(x) dx.    (2.1)

Our purpose is to choose the quantum levels l_i and the interval
boundaries x_i so that eq. (2.1) is minimized.

The above expression can be differentiated to obtain a
set of necessary conditions that must hold for the optimum
quantizer. Applying Leibniz's rule, we get

∂E/∂l_i = −2 ∫_{x_i}^{x_{i+1}} (x − l_i) p(x) dx = 0,  i = 1, 2, ..., L,    (2.2)

∂E/∂x_i = [(x_i − l_i)² − (x_i − l_{i−1})²] p(x_i) = 0,  i = 2, ..., L,    (2.3)

where x₁ = −∞ and x_{L+1} = ∞. Equation (2.3) is equivalent to

x_i = (l_i + l_{i−1})/2,  i = 2, 3, ..., L,    (2.4)

which says that the interval boundaries should fall midway
between the adjacent quantum levels. Alternatively,

l_i = 2x_i − l_{i−1}.    (2.5)

Equation (2.2) is readily solved for l_i,

l_i = ∫_{x_i}^{x_{i+1}} x p(x) dx / ∫_{x_i}^{x_{i+1}} p(x) dx,  i = 1, 2, ..., L.    (2.6)
The solution of the equations for the general nonuniform
quantizer is difficult. However, a procedure to obtain a
solution by computer iteration has been introduced by Lloyd
and Max [9],[10]. For a specified probability density function
and a fixed value of L, l₁ is first selected arbitrarily. With
x₁ = −∞, we solve eq. (2.6) for x₂. Next, x₂ and l₁ are used in
eq. (2.5) to obtain l₂. The process is repeated to obtain x₃ from
eq. (2.6) and l₃ from eq. (2.5). The iteration stops
when l_L is obtained from eq. (2.5). If l₁ has been
correctly guessed, then l_L will satisfy eq. (2.6) with
x_{L+1} = ∞. If it does not, l₁ is corrected to a new value and the
process is repeated until l_L satisfies eq. (2.6). This
procedure satisfies conditions (2.5) and (2.6), which
are necessary for optimality.
Max [10] used the above procedure to find the quantum
levels of quantizers with up to 36 levels for a zero-mean
Gaussian message. Paez and Glisson [11] used the procedure to
find optimum levels for signals having either a gamma density
or a Laplace density for L = 2, 4, 8, 16 and 32.
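The procedure above sweeps forward from a guessed l₁. An alternative fixed-point form of the same necessary conditions (2.4) and (2.6), which alternates midpoint and centroid updates over a sampled pdf, is sketched below; the grid approximation, iteration counts and function names are mine.

```python
import math

def lloyd_max(pdf, lo, hi, L, iters=200, grid=4000):
    """Fixed-point iteration on conditions (2.4) and (2.6): boundaries at
    midpoints of the levels, levels at centroids of the intervals.  The pdf
    is evaluated on a finite grid, so the result is approximate."""
    xs = [lo + (hi - lo) * (k + 0.5) / grid for k in range(grid)]
    ws = [pdf(x) for x in xs]
    levels = [lo + (hi - lo) * (i + 0.5) / L for i in range(L)]  # initial guess
    for _ in range(iters):
        bounds = [(levels[i] + levels[i + 1]) / 2 for i in range(L - 1)]
        sums = [0.0] * L
        mass = [0.0] * L
        for x, w in zip(xs, ws):
            i = sum(x > b for b in bounds)       # interval index of sample x
            sums[i] += w * x
            mass[i] += w
        levels = [sums[i] / mass[i] if mass[i] > 0 else levels[i]
                  for i in range(L)]
    return levels

# Zero-mean, unit-variance Gaussian, L = 4: Max's tabulated levels are
# approximately +/-0.4528 and +/-1.510
gauss = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
print(lloyd_max(gauss, -5, 5, 4))
```

For the four-level Gaussian case the iteration reproduces Max's tabulated quantum levels to the grid resolution.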
2.5 Analysis of DPCM Encoder
Consider an information source with alphabet {a₁, a₂, ...,
a_Q}. The information source is said to be nonsymmetric if the
source symbols are not equally likely. It will be shown that
the output of a DPCM encoder can be viewed as a nonsymmetric
information source.
Assume that in a DPCM encoder a predictor of order M is
used. Let us model the quantizer as an additive noise source,

ê(n) = e(n) + q(n)    (2.7)

where e(n) and ê(n) are the input and the output of the
quantizer, respectively, and q(n) is the quantization noise. It
has been shown [12] that

e(n) = ε(n) − Σ_{i=1}^{M} b_i q(n−i)    (2.8)

where ε(n) is the difference signal in a DPCM system without the
quantizer and the b_i are the prediction coefficients. The
distribution of e(n) is therefore given by the convolution of
the pdf's of ε(n) and Q = Σ_{i=1}^{M} b_i q(n−i),

p_e = p_ε * p_Q.    (2.9)

The probability of quantum level i is determined by

P_i = ∫_{x_i}^{x_{i+1}} p(x) dx.    (2.10)

Figure 2.1. Block diagram of a DPCM system: (a) encoder, (b) decoder.
O'Neal [13] has shown experimentally that the pdf of \tilde{e}(n)
can be approximated with a Laplacian distribution,

p(x) = \frac{1}{\sqrt{2}\,\sigma} \exp\left(-\frac{\sqrt{2}\,|x|}{\sigma}\right).   (2.11)

To find the statistics of the levels an optimum quantizer must
be placed in the DPCM system. If we utilize a MSQE quantizer,
the statistics associated with each level on the positive side
(the negative side follows by symmetry) will be

P_i = \int_{\alpha_i \sigma}^{\beta_i \sigma} \frac{1}{\sqrt{2}\,\sigma} \exp\left(-\frac{\sqrt{2}\,|x|}{\sigma}\right) dx = \frac{1}{2}\left(e^{-\sqrt{2}\,\alpha_i} - e^{-\sqrt{2}\,\beta_i}\right),   (2.12)

where the normalized decision levels \alpha_i and \beta_i are tabulated
in [11].
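Eq. (2.12) is straightforward to evaluate once the normalized decision levels are known; the sketch below uses illustrative boundary values rather than the tabulated ones from [11]:

```python
import math

def laplace_level_probs(bounds):
    """Level probabilities for a symmetric quantizer driven by a
    Laplacian source, eq. (2.12). `bounds` are the nonnegative decision
    levels 0 = b0 < b1 < ... < inf, in units of sigma."""
    half = [0.5 * (math.exp(-math.sqrt(2.0) * a) - math.exp(-math.sqrt(2.0) * b))
            for a, b in zip(bounds[:-1], bounds[1:])]
    return half[::-1] + half   # mirror for the negative levels

# illustrative boundaries for an eight-level quantizer (placeholders,
# not the MSQE-optimized values)
probs = laplace_level_probs([0.0, 0.5, 1.2, 2.3, float("inf")])
```

The probabilities sum to one by construction, and the inner levels carry most of the probability, which is exactly the nonsymmetry exploited in later chapters.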
We have also verified through simulations that the choice
of a gamma distribution for e(n) and a uniform pdf for Q results
in a satisfactory approximation to the density of \tilde{e}(n). In
other words,

p_{\tilde{e}}(x) = p_e(x) * p_Q(x),  where  p_e(x) = \frac{3^{1/4}}{\sqrt{8\pi\sigma_e |x|}} \exp\left(-\frac{\sqrt{3}\,|x|}{2\sigma_e}\right)   (2.13)

and p_Q is uniform. A numerical method must be employed to find
the statistics associated with each level.
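One such numerical route is to convolve the two densities on a grid and integrate the result over the quantizer bins; a sketch under eq. (2.13), with an assumed uniform half-width `delta` and illustrative bin boundaries:

```python
import numpy as np

def model_level_probs(bounds, sigma_e=1.0, delta=0.5):
    """Numerically evaluate level statistics from eq. (2.13): convolve
    the gamma density of e(n) with a uniform pdf of half-width delta,
    then integrate the result over the quantizer bins."""
    x = np.linspace(-20.0, 20.0, 4001)
    dx = x[1] - x[0]
    ax = np.maximum(np.abs(x), dx)     # tame the integrable singularity at 0
    pe = 3.0 ** 0.25 / np.sqrt(8.0 * np.pi * sigma_e * ax) \
         * np.exp(-np.sqrt(3.0) * ax / (2.0 * sigma_e))
    pq = np.where(np.abs(x) <= delta, 1.0 / (2.0 * delta), 0.0)
    p = np.convolve(pe, pq, mode="same") * dx
    p /= p.sum() * dx                  # renormalize after truncation
    return [p[(x >= a) & (x < b)].sum() * dx
            for a, b in zip(bounds[:-1], bounds[1:])]

# illustrative four-level bins, not the optimized boundaries
probs = model_level_probs([-np.inf, -1.0, 0.0, 1.0, np.inf])
```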
2.6 Results
In this section we compute the statistics of the
quantizers for the model developed earlier and compare
them to the actual source statistics for real-world pictures. Two
types of quantizer were used, a MSQE quantizer and a quantizer
that was found by psychovisual experiments [14].
The test material consisted of two eight-bit pictures: a low
detail picture, LENNA, and a high detail picture, AERIAL MAP
(Fig. 2.2 and Fig. 2.3). Both pictures consist of 512x512
pixels.
An optimized eight-level quantizer for the Laplacian
distribution was chosen. The model parameter can be estimated
from the picture by

\sigma^2 = \langle (x(n) - \hat{x}(n))^2 \rangle.   (2.14)

Table 2.1 contains theoretical and actual source statistics.
Theoretical values are seen to be reasonably close to the actual
source statistics.
Then a seven-level quantizer [14] that is known to work
well with different pictures was selected. The histograms for
the two pictures were prepared. The Laplacian distribution and
the distribution given in eq. (2.13) were compared with the
histograms (Figures 2.4 and 2.5). Both appear to be a fair
approximation to the histograms.
Table 2.2 gives the statistics predicted by eq. (2.13),
by the Laplacian pdf, and measured from the picture. The
model-based statistics are close to the actual source statistics.
Fig. 2.2. Lenna.
Fig. 2.3. Aerial map.
2.7 Discussion
It was shown that the pdf of the input to the quantizer can
be fairly approximated by eq. (2.13) (which is the
convolution of a gamma and a uniform pdf) or simply by a
Laplacian distribution. The theoretical statistics derived
from the models agreed well with the actual source statistics.
It is seen that DPCM of video signals does in fact produce a
nonsymmetric source.
TABLE 2.1. AN EIGHT-LEVEL MSQE QUANTIZER IS USED. THE FIRST AND
THE SECOND COLUMNS SHOW THE ACTUAL SOURCE STATISTICS FOR THE
PICTURES. THE THIRD COLUMN SHOWS THE STATISTICS USING A
LAPLACIAN DISTRIBUTION FOR THE SOURCE.
Lenna    Aerial Map    Model Statistics
.3324    .2800         .2549
.3574    .2795         .2549
.0946    .1416         .1510
.1060    .1390         .1510
.0337    .0578         .0744
.0331    .0573         .0744
.0225    .0220         .0197
.0197    .0224         .0197
TABLE 2.2. STATISTICS FOR THE LENNA PICTURE. A SEVEN-LEVEL
QUANTIZER FOUND BY PSYCHOVISUAL EXPERIMENTS IS USED.
Laplace    Eq. (2.13)    Actual Statistics
.5541      .5967         .6578
.1965      .1558         .1544
.1965      .1558         .1418
.0259      .0270         .0193
.0259      .0270         .0200
.0005      .0019         .0026
.0005      .0019         .0038
Figure 2.4. The histogram for the LENNA picture is compared with
(a) the Laplacian distribution and (b) the distribution given in
eq. (2.13). The histogram is shown with a broken line.
Figure 2.5. The histogram for the AERIAL MAP picture is compared
with (a) the Laplacian distribution and (b) the distribution
given in eq. (2.13). The histogram is shown with a broken line.
CHAPTER THREE
SIGNAL SELECTION FOR NONSYMMETRIC SOURCES
3.1 Introduction
In many applications one has a bandlimited channel and
has to achieve the least error rate for a given signal-to-
noise ratio. The design of high speed modems is one example
where the designer is faced with the problem of selecting an
efficient set of signals with in-phase and quadrature
components.
The objective in signal design is to find the optimum
signal constellation in the presence of additive white Gaussian
noise under a power constraint. Two-dimensional modulation
formats such as MPSK and QAM have been studied before [15].
These formats confine the signal points to a certain geometry
and are not optimum in the sense of minimum error rate.
There have been a few attempts to solve the signal design
problem under a peak or average power constraint without
constraining the signal points to a special geometry such as
a circle or a certain lattice. Foschini et al. [16] presented
an iterative approach for signal selection. A gradient search
procedure is given that incorporates a radial contraction
technique to meet the average signal power constraint.
Kernighan and Lin [17] came up with a heuristic procedure for
solving the signal design problem under a peak power constraint.
Previous investigations on signal design have focused on
signal selection for equally likely signals [15]-[17]. There
are some applications where the information source is
nonsymmetric. A practical instance in which such a model proves
rewarding is the transmission of video signals. It was
demonstrated in Chapter Two that DPCM of digitized video
signals results in a nonsymmetric source. In this case signals
should be mapped into a two-dimensional signal constellation
in an optimum manner. In other words, the goal is to determine
the signal constellation that minimizes the probability of
error (or a given cost function) in the presence of additive
white Gaussian noise under an average power (or a peak power)
constraint, given N signals with unequal probabilities.
To illustrate the difficulty of a direct approach we will
design a ML receiver for a three signal constellation. We will
also describe a numerical method that uses the Lagrange
multipliers method for optimization. These two methods are
appropriate for smaller signal sets.
Then a number of iterative algorithms are developed.
First, a normalized conjugate gradient search algorithm and a
gradient search algorithm are presented that can be
applied to signal sets of any size and with any probability
distribution. The methods presented here are applicable to the
design of both MAP (maximum a posteriori) and ML (maximum
likelihood) receivers. These methods are generalizations and
modifications of the method given in [16]. Then a gradient
search method for a peak power constraint is developed.
Finally, a gradient search method that finds a signal
constellation for an average cost function subject to a peak
or average power constraint is presented. In the end, a few
examples are given and conclusions are drawn.
3.2 Maximum Likelihood Signal Design for Three Signals
Here a three signal constellation is designed for a three-
symbol source for transmission on a white Gaussian noise
channel. The signal constellation is depicted in Figure 3.1.
From geometrical considerations,

\alpha_1 + \alpha_2 + \alpha_3 + \beta_1 + \beta_2 + \beta_3 = 2\pi,   (3.1)
AH \cos\beta_1 = BH \cos\alpha_1,   (3.2)
AH \cos\alpha_2 = CH \cos\beta_2,   (3.3)
BH \cos\beta_3 = CH \cos\alpha_3.   (3.4)

Let r(t) = (r_1, r_2) be the received waveform and \rho(t) = (\rho_1, \rho_2)
a particular value of r(t) received in the symbol interval. Let
p_r(r_1, r_2 | m_i) be the conditional joint probability density
of the random variables defining r(t), and let s_1, s_2, s_3 be the
signal vectors denoted by A, B and C on the constellation. A
maximum likelihood receiver sets the message estimate to m_k if,
for i = 1, 2, 3,

p_r(r_1 = \rho_1, r_2 = \rho_2 | m_k) > p_r(r_1 = \rho_1, r_2 = \rho_2 | m_i),  i \ne k.   (3.5)
Assuming the noise components are Gaussian and statistically
independent,

p_r(\rho_1, \rho_2 | m_i) = \frac{1}{\pi N_0} \exp\left[-\frac{(\rho_1 - s_{i1})^2 + (\rho_2 - s_{i2})^2}{N_0}\right].   (3.6)

The decision rule then becomes

-\frac{(\rho_1 - s_{k1})^2 + (\rho_2 - s_{k2})^2}{N_0} > -\frac{(\rho_1 - s_{i1})^2 + (\rho_2 - s_{i2})^2}{N_0},  i \ne k.   (3.7)

Figure 3.1. A three signal constellation.

Notice that the sums of squared terms on either side of the
inequality are the squares of the Euclidean distances between
the received signal and the signals s_i and s_k. Thus the decision
rule can be rewritten as

d(\rho, s_i) > d(\rho, s_k),  i \ne k,   (3.8)

where d(x, y) is the Euclidean distance between x and y.
Point H is the intersection of the decision regions and is
therefore on the boundaries of the decision regions. For the
boundary points the above inequality changes to an equality,

d(H, A) = d(H, B),   (3.9)
d(H, A) = d(H, C),   (3.10)
AH = BH = CH,   (3.11)

and this results in

\alpha_2 = \beta_2,   (3.12)
\alpha_3 = \beta_3.   (3.13)
Define P_w to be the probability of word error. The
probability of symbol error is one minus the probability that
a symbol is correct, and the probability of a correct symbol is
obtained by averaging all of the conditional probabilities of
a correct symbol:

P_w = 1 - \sum_{i=1}^{3} P(C | m_i)\, p_i.   (3.14)

But P(C | m_i) is the probability of the point \rho falling in the
decision region I_i (Figure 3.2),

P_w = 1 - \sum_{i=1}^{3} p_i \iint_{I_i} p_r(\rho_1, \rho_2 | m_i)\, d\rho_1\, d\rho_2.   (3.15)
Figure 3.2. One decision region.
By substituting the expression for the probability density
function, eq. (3.6), into eq. (3.15) we get

P_w = 1 - \sum_{i=1}^{3} \frac{p_i}{\pi N_0} \iint_{I_i} \exp\left[-\frac{(\rho_1 - AH)^2 + \rho_2^2}{N_0}\right] d\rho_1\, d\rho_2,   (3.16)

where for each region the coordinates are chosen with H at the
origin and the signal on the positive \rho_1 axis. Writing the
wedge-shaped regions out explicitly,

P_w = 1 - \frac{p_1}{\pi N_0} \int_{\rho_1=0}^{\infty} \int_{\rho_2=-\rho_1\tan\beta_1}^{\rho_1\tan\alpha_2} \exp\left[-\frac{(\rho_1 - AH)^2 + \rho_2^2}{N_0}\right] d\rho_2\, d\rho_1
      - \frac{p_2}{\pi N_0} \int_{\rho_1=0}^{\infty} \int_{\rho_2=-\rho_1\tan\beta_3}^{\rho_1\tan\alpha_1} \exp\left[-\frac{(\rho_1 - AH)^2 + \rho_2^2}{N_0}\right] d\rho_2\, d\rho_1
      - \frac{p_3}{\pi N_0} \int_{\rho_1=0}^{\infty} \int_{\rho_2=-\rho_1\tan\beta_2}^{\rho_1\tan\alpha_3} \exp\left[-\frac{(\rho_1 - AH)^2 + \rho_2^2}{N_0}\right] d\rho_2\, d\rho_1.   (3.17)
By making the following changes of variables,

\epsilon = \frac{\rho_2}{\sqrt{N_0}},  d\rho_2 = \sqrt{N_0}\, d\epsilon,   (3.18)
u = \frac{\rho_1}{\sqrt{N_0}},  d\rho_1 = \sqrt{N_0}\, du,   (3.19)

the probability of error in terms of the new variables becomes
P_w = 1 - \frac{p_1}{\pi} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \int_{-u\tan\beta_1}^{u\tan\alpha_2} e^{-\epsilon^2}\, d\epsilon\, du
      - \frac{p_2}{\pi} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \int_{-u\tan\beta_3}^{u\tan\alpha_1} e^{-\epsilon^2}\, d\epsilon\, du
      - \frac{p_3}{\pi} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \int_{-u\tan\beta_2}^{u\tan\alpha_3} e^{-\epsilon^2}\, d\epsilon\, du.   (3.20)

Now by definition

\operatorname{erf}(x_1, x_2) = \frac{2}{\sqrt{\pi}} \int_{x_1}^{x_2} e^{-t^2}\, dt.   (3.21)

Therefore, the following is the equation for P_w:

P_w = 1 - \frac{p_1}{2\sqrt{\pi}} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \operatorname{erf}(-u\tan\beta_1,\, u\tan\alpha_2)\, du
      - \frac{p_2}{2\sqrt{\pi}} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \operatorname{erf}(-u\tan\beta_3,\, u\tan\alpha_1)\, du
      - \frac{p_3}{2\sqrt{\pi}} \int_{0}^{\infty} e^{-(u - AH/\sqrt{N_0})^2} \operatorname{erf}(-u\tan\beta_2,\, u\tan\alpha_3)\, du.   (3.22)
For a given signal-to-noise ratio the probability of word
error depends only on \alpha_1 and \alpha_2, because with \alpha_i = \beta_i the
angles \alpha_1, \alpha_2 and \alpha_3 add up to \pi. To find the values of
\alpha_1, \alpha_2, \alpha_3 that minimize the probability of word error, we
differentiate P_w with respect to \alpha_1 and \alpha_2 and set the
derivatives equal to zero. With k = AH/\sqrt{N_0},

\frac{\partial P_w}{\partial \alpha_1} = -\frac{p_1}{\pi} \int_0^{\infty} e^{-(u-k)^2}\, u(1+\tan^2\alpha_1)\, e^{-u^2\tan^2\alpha_1}\, du
 - \frac{p_2}{\pi} \int_0^{\infty} e^{-(u-k)^2} \left[u(1+\tan^2\alpha_1)\, e^{-u^2\tan^2\alpha_1} - u(1+\tan^2(\alpha_1+\alpha_2))\, e^{-u^2\tan^2(\alpha_1+\alpha_2)}\right] du
 + \frac{p_3}{\pi} \int_0^{\infty} e^{-(u-k)^2}\, u(1+\tan^2(\alpha_1+\alpha_2))\, e^{-u^2\tan^2(\alpha_1+\alpha_2)}\, du = 0,   (3.23)

\frac{\partial P_w}{\partial \alpha_2} = -\frac{p_1}{\pi} \int_0^{\infty} e^{-(u-k)^2}\, u(1+\tan^2\alpha_2)\, e^{-u^2\tan^2\alpha_2}\, du
 + \frac{p_2}{\pi} \int_0^{\infty} e^{-(u-k)^2}\, u(1+\tan^2(\alpha_1+\alpha_2))\, e^{-u^2\tan^2(\alpha_1+\alpha_2)}\, du
 - \frac{p_3}{\pi} \int_0^{\infty} e^{-(u-k)^2} \left[u(1+\tan^2\alpha_2)\, e^{-u^2\tan^2\alpha_2} - u(1+\tan^2(\alpha_1+\alpha_2))\, e^{-u^2\tan^2(\alpha_1+\alpha_2)}\right] du = 0.   (3.24)
Inspection shows that the above equations consist of only one
type of integral, which can be expressed in a computable form:

\int_0^{\infty} u\, e^{-(u-k)^2}\, e^{-l u^2}\, du = \int_0^{\infty} u\, e^{-((1+l)u^2 - 2uk + k^2)}\, du.   (3.25)

By producing a perfect square in the exponent,

\int_0^{\infty} u\, e^{-((1+l)u^2 - 2uk + k^2)}\, du = e^{-\frac{k^2 l}{1+l}} \int_0^{\infty} u\, e^{-(1+l)\left(u - \frac{k}{1+l}\right)^2}\, du,   (3.26)

and using the results obtained in Appendix A we get

\int_0^{\infty} u\, e^{-(u-k)^2}\, e^{-l u^2}\, du = \frac{e^{-k^2}}{2(1+l)} + \frac{\sqrt{\pi}\, k}{2(1+l)^{3/2}}\, e^{-\frac{k^2 l}{1+l}}\, \operatorname{erf}\!\left(-\frac{k}{\sqrt{1+l}},\, \infty\right).   (3.27)
If we substitute for the integrals from eq. (3.27) into eq.
(3.23) and eq. (3.24) we end up with the following set of
nonlinear equations:

(p_1 - p_3)\, e^{-k^2} + \sqrt{\pi}\, k \left[(p_1 + p_2)\cos\alpha_1\, e^{-k^2\sin^2\alpha_1}\, \operatorname{erf}(-k\cos\alpha_1, \infty) - (p_2 + p_3)\, |\cos(\alpha_1+\alpha_2)|\, e^{-k^2\sin^2(\alpha_1+\alpha_2)}\, \operatorname{erf}(-k|\cos(\alpha_1+\alpha_2)|, \infty)\right] = 0,   (3.28)

(p_1 - p_2)\, e^{-k^2} + \sqrt{\pi}\, k \left[(p_1 + p_3)\cos\alpha_2\, e^{-k^2\sin^2\alpha_2}\, \operatorname{erf}(-k\cos\alpha_2, \infty) - (p_2 + p_3)\, |\cos(\alpha_1+\alpha_2)|\, e^{-k^2\sin^2(\alpha_1+\alpha_2)}\, \operatorname{erf}(-k|\cos(\alpha_1+\alpha_2)|, \infty)\right] = 0.   (3.29)
Given a signal-to-noise ratio and a probability set, the
optimum angles can be found using eq. (3.28) and eq. (3.29).
We experimented with a variety of signal-to-noise ratios and
probability sets. Figure 3.3 shows the optimum \alpha_1 for three
probability sets over a wide signal-to-noise ratio range. In
all cases the optimum \alpha_1 approaches 60 degrees as the
signal-to-noise ratio goes up. Even when the optimum angle is
somewhat different from 60 degrees, from the performance point
of view the two systems are almost indistinguishable.
Figure 3.3. \alpha_1 (deg.) versus average power (dB). (a) The top
curve, source statistics {.9,.05,.05}; (b) the middle curve,
source statistics {.8,.1,.1}; (c) the bottom curve, source
statistics {.6,.2,.2}.
The presented results suggest that the three signals
should be placed on the vertices of an equilateral triangle
(Figure 3.4). The origin of the signal constellation can be
shifted to a new location that minimizes the average energy
without affecting the probability of error. Let us define l to
be

l = AB = BC = AC

in triangle ABC. Since ABC is an equilateral triangle, AH and
l are related by

l = AH\sqrt{3}.   (3.30)

The point G(x, y) that minimizes the average energy can be found
as follows. With A(l/2, 0), B(-l/2, 0) and C(0, \sqrt{3}\,l/2),

P_{ave} = p_1\left[\left(x - \tfrac{l}{2}\right)^2 + y^2\right] + p_2\left[\left(x + \tfrac{l}{2}\right)^2 + y^2\right] + p_3\left[x^2 + \left(y - \tfrac{\sqrt{3}}{2}\,l\right)^2\right].   (3.31)

Setting

\frac{\partial P_{ave}}{\partial x} = 0,   (3.32)
\frac{\partial P_{ave}}{\partial y} = 0,   (3.33)

gives

x = \frac{(p_1 - p_2)\, l}{2},  y = \frac{\sqrt{3}\, p_3\, l}{2}.

Substituting x and y into eq. (3.31) and solving for l we get

l = \sqrt{\frac{P_{ave}}{k}},   (3.34)

where k is given by

k = \frac{p_1\left[(p_1-p_2-1)^2 + 3p_3^2\right] + p_2\left[(p_1-p_2+1)^2 + 3p_3^2\right] + p_3\left[(p_1-p_2)^2 + 3(p_1+p_2)^2\right]}{4}.   (3.35)

Figure 3.4. The three signals form an equilateral triangle, with
A(l/2, 0), B(-l/2, 0) and C(0, \sqrt{3}\,l/2).
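Numerically, eqs. (3.32) through (3.35) amount to building the triangle, shifting the origin to the probability-weighted centroid, and scaling the side to meet the power constraint. A small sketch; the probability set is an arbitrary example:

```python
import math

def triangle_constellation(p, p_ave=1.0):
    """Three-signal constellation on an equilateral triangle of side l
    (eq. 3.34), origin shifted to the weighted centroid (eqs. 3.32-3.33)."""
    p1, p2, p3 = p
    k = (p1 * ((p1 - p2 - 1.0) ** 2 + 3.0 * p3 ** 2)
         + p2 * ((p1 - p2 + 1.0) ** 2 + 3.0 * p3 ** 2)
         + p3 * ((p1 - p2) ** 2 + 3.0 * (p1 + p2) ** 2)) / 4.0
    l = math.sqrt(p_ave / k)                       # side length, eq. (3.34)
    pts = [(l / 2.0, 0.0), (-l / 2.0, 0.0), (0.0, math.sqrt(3.0) * l / 2.0)]
    gx = sum(pi * px for pi, (px, py) in zip(p, pts))
    gy = sum(pi * py for pi, (px, py) in zip(p, pts))
    return [(px - gx, py - gy) for px, py in pts]  # centroid at the origin

pts = triangle_constellation((0.8, 0.1, 0.1))
```

After the shift the average power \sum_i p_i \|s_i\|^2 equals P_ave, and the most probable signal sits closest to the origin.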
3.3 A Numerical Approach Based on the Lagrange Multipliers
Method
In this approach the problem of signal selection is
viewed as a constrained optimization problem. The constraints
are incorporated into the optimization problem by the use of
the Lagrange multipliers method.
Let us outline the design of a MAP receiver with three
signals with unequal probabilities (Figure 3.1).
A MAP receiver decision rule is m = m_k if, for i = 1, 2, 3,

p_r(r_1 = \rho_1, r_2 = \rho_2 | m_k)\, p_k > p_r(r_1 = \rho_1, r_2 = \rho_2 | m_i)\, p_i,  i \ne k.   (3.36)

Upon substituting eq. (3.6) into eq. (3.36), we get

-\frac{(\rho_1 - s_{k1})^2 + (\rho_2 - s_{k2})^2}{N_0} + \ln p_k > -\frac{(\rho_1 - s_{i1})^2 + (\rho_2 - s_{i2})^2}{N_0} + \ln p_i,  i \ne k.   (3.37)
Assume point H is the intersection of the decision regions. For
point H the inequality turns into an equality and the sums on
either side are the distances between H and the signals s_i and
s_j; therefore

AH^2 - BH^2 = N_0 \ln\left(\frac{p_1}{p_2}\right).   (3.38)

Similarly we obtain

AH^2 - CH^2 = N_0 \ln\left(\frac{p_1}{p_3}\right).   (3.39)

In addition to the two above MAP constraints there are four
geometric constraints (eq. (3.1) through eq. (3.4)), and an
additional constraint on the average energy. The
origin of the constellation is shifted to a location that
minimizes the average power; the average power function
evaluated at that point is then set to 1/N:

p_1 a + p_2 b + p_3 c = P_{ave},   (3.40)

where

a = (x - AH \sin\beta_1)^2 + y^2,
b = (x + BH \sin\alpha_1)^2 + y^2,
c = (x - CH \sin(\beta_1+\alpha_2+\beta_2))^2 + (y - AH \cos\beta_1 + CH \cos(\beta_1+\alpha_2+\beta_2))^2,
x = p_1 AH \sin\beta_1 - p_2 BH \sin\alpha_1 + p_3 CH \sin(\beta_1+\alpha_2+\beta_2),
y = p_3 (AH \cos\beta_1 - CH \cos(\beta_1+\alpha_2+\beta_2)).
Now the problem is to minimize a function of nine
variables under seven equality constraints. The method of
Lagrange multipliers can be applied to this problem. We define
a new cost function F of sixteen variables and take the
partial derivatives with respect to the sixteen variables
and set them equal to zero:

F(\alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, AH, BH, CH, \lambda_1, \lambda_2, \ldots, \lambda_7).

If we eliminate the \lambda_i's among the sixteen equations we end
up with a nonlinear set of equations in the other nine
variables, which must be solved numerically.
This method becomes quite complicated as the number of
the signals in the constellation grows. It was mainly used to
verify the solutions obtained by the gradient based method
when N was relatively small.
3.4 Minimum Error Signal Selection
3.4.1 Preliminaries
Let us denote the signals in the constellation by
s_1, s_2, \ldots, s_N, where s_i = (x_i, y_i). The average power is given
by

P_{ave} = \sum_{n=1}^{N} p_n \|s_n\|^2,   (3.41)

where \{p_1, p_2, \ldots, p_N\} is the probability distribution of the
signal set and \|s_n\| is the magnitude of s_n. Following the
convention of [16] we choose to set P_{ave} = 1/N and define the
signal-to-noise ratio to be

SNR = 10 \log_{10}\left(\frac{1/N}{N_0}\right),   (3.42)

where N_0 is the one-sided power spectral density of the white
noise.
By definition the error rate (probability of symbol
error) is

P_e = \sum_{n=1}^{N} p_n \Pr(\text{error} \mid s_n).   (3.43)

The union bound gives an upper bound on the conditional
probability of error. For a maximum likelihood receiver we
have

\Pr(\text{error} \mid s_n) \le \sum_{\substack{i=1 \\ i \ne n}}^{N} Q\left(\frac{\|s_i - s_n\|}{\sqrt{2N_0}}\right).   (3.44)

For large values of signal-to-noise ratio the conditional
probability of error approaches the upper bound. By using the
approximation for the Q function when the argument is large
and substituting back into eq. (3.43) we get

P_e = \sqrt{\frac{N_0}{\pi}} \sum_{n=1}^{N} p_n \sum_{\substack{i=1 \\ i \ne n}}^{N} \frac{1}{\|s_i - s_n\|} \exp\left[-\frac{\|s_i - s_n\|^2}{4N_0}\right].   (3.45)

Similarly it can be shown that the symbol error rate for
a MAP receiver is given by

P_e = \sqrt{\frac{N_0}{\pi}} \sum_{n=1}^{N} p_n \sum_{\substack{i=1 \\ i \ne n}}^{N} \frac{\|s_i - s_n\|}{\|s_i - s_n\|^2 + N_0 \ln(p_n/p_i)} \exp\left[-\frac{\left(\|s_i - s_n\|^2 + N_0 \ln(p_n/p_i)\right)^2}{4 N_0 \|s_i - s_n\|^2}\right].   (3.46)
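Eq. (3.45) is cheap to evaluate and is what the search algorithms below actually minimize. A direct transcription; the signal coordinates and N_0 below are arbitrary test values:

```python
import math

def union_bound_pe(signals, probs, n0):
    """High-SNR approximation to the ML symbol error rate, eq. (3.45):
    each pairwise term is the asymptotic form of Q(d / sqrt(2*N0))."""
    pe = 0.0
    for n, (sn, pn) in enumerate(zip(signals, probs)):
        for i, si in enumerate(signals):
            if i == n:
                continue
            d = math.hypot(si[0] - sn[0], si[1] - sn[1])
            pe += pn * math.sqrt(n0 / math.pi) / d * math.exp(-d * d / (4.0 * n0))
    return pe

# antipodal pair: the estimate should track Q(d / sqrt(2*N0)) at high SNR
pe = union_bound_pe([(-1.0, 0.0), (1.0, 0.0)], [0.5, 0.5], 0.1)
```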
3.4.2 Gradient Search Algorithms for an Average Power
Constraint
In search of the minimum, the gradient of the probability
of error is obtained analytically and an iterative gradient
search algorithm which modifies the constellation at each
iteration is used to find the optimum constellation.
In the gradient search algorithm, the iterative rule is
given by

s^*(k+1) = s(k) - a_k \nabla F_k,   (3.47)

where

s(k) = (s_1(k), s_2(k), \ldots, s_N(k))   (3.48)

denotes the signal vector at the kth step of the algorithm, a_k
is the step size and \nabla F_k is the gradient of P_e.
Since the signal power may change with k, the signal vector is
normalized at each step of the algorithm:

s(k+1) = \frac{s^*(k+1)}{\sqrt{N\left(p_1 \|s_1^*(k+1)\|^2 + p_2 \|s_2^*(k+1)\|^2 + \cdots + p_N \|s_N^*(k+1)\|^2\right)}}.   (3.49)
To speed up the convergence, instead of a conventional
gradient search algorithm, the Fletcher-Reeves conjugate
gradient method [18] can be utilized. In this method the
information about the second derivative is used indirectly.
The algorithm is described by

s^*(k+1) = s(k) + a_k h_k,   (3.50)

where

h_k = -\nabla F_k + \gamma_k h_{k-1}   (3.51)

and

\gamma_k = \frac{[\nabla F_k]^t\, \nabla F_k}{[\nabla F_{k-1}]^t\, \nabla F_{k-1}},   (3.52)

in which a_k is the step size and h_k is the quantity in this
algorithm that combines the information from the current and
previous steps to define a new direction. As in the gradient
search algorithm, the power may change with k, so the
signal vector needs to be normalized at each step of the
algorithm.
Let us summarize the procedure for the iterative methods:
1. Set k = 0 and select the starting points.
2. Determine the search direction by calculating h_k for the
conjugate gradient method and \nabla F_k for the gradient method.
3. Find the improved signal coordinates and normalize them.
If the improvement is smaller than a tolerance level, stop;
otherwise set k = k+1 and go to step 2.
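Steps 2 and 3 for the plain gradient method can be sketched as below, using the pairwise-exponential gradient of the union-bound error rate (the form derived in Section 3.4.3) and the power normalization of eq. (3.49); the step size and test points are arbitrary:

```python
import math

def gradient_step(signals, probs, n0, step):
    """One iteration of the normalized gradient search: move against the
    gradient of the union-bound P_e, then rescale so P_ave = 1/N."""
    N = len(signals)
    new = []
    for k, (xk, yk) in enumerate(signals):
        gx = gy = 0.0
        for i, (xi, yi) in enumerate(signals):
            if i == k:
                continue
            dx, dy = xk - xi, yk - yi
            d = math.hypot(dx, dy)
            c = (probs[k] + probs[i]) * math.sqrt(n0 / math.pi) \
                * (1.0 / d ** 2 + 1.0 / (2.0 * n0)) * math.exp(-d * d / (4.0 * n0))
            gx -= c * dx / d       # the gradient points toward close neighbors,
            gy -= c * dy / d       # so the descent step pushes signals apart
        new.append((xk - step * gx, yk - step * gy))
    p_ave = sum(p * (x * x + y * y) for p, (x, y) in zip(probs, new))
    scale = 1.0 / math.sqrt(N * p_ave)   # normalization, eq. (3.49)
    return [(x * scale, y * scale) for x, y in new]

pts = gradient_step([(0.5, 0.0), (-0.4, 0.0), (0.0, 0.6)], [0.5, 0.3, 0.2], 0.1, 0.01)
```

The conjugate gradient variant replaces the raw gradient with h_k from eq. (3.51) but keeps the same normalization.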
3.4.3 Analytical Expressions for the Gradient Vector
Let us find the gradient vector for the ML receiver
first. \nabla F, the gradient of P_e, is a vector of 2N components in
signal space, in which each signal occupies two dimensions.
The gradient as a vector of N two-dimensional vector
components is represented by

\nabla F = (g_1, g_2, \ldots, g_N)^t,   (3.53)

where g_k = (g_{k1}, g_{k2}). g_k is obtained by taking the derivative
of P_e with respect to s_k:

g_k = -\sqrt{\frac{N_0}{\pi}} \sum_{\substack{i=1 \\ i \ne k}}^{N} (p_k + p_i)\, \frac{s_k - s_i}{\|s_k - s_i\|} \left[\frac{1}{\|s_k - s_i\|^2} + \frac{1}{2N_0}\right] \exp\left[-\frac{\|s_k - s_i\|^2}{4N_0}\right].   (3.54)
Similarly one can find the kth component of \nabla F for a MAP
receiver:

g_k = -\sqrt{\frac{N_0}{\pi}} \sum_{\substack{i=1 \\ i \ne k}}^{N} \hat{l} \left\{ p_k \left(1 - \frac{N_0 \ln(p_k/p_i)}{\|s_k - s_i\|^2}\right) \left[\frac{\|s_k - s_i\|^2}{t_1^2} + \frac{1}{2N_0}\right] \exp\left[-\frac{t_1^2}{4N_0 \|s_k - s_i\|^2}\right] + p_i \left(1 - \frac{N_0 \ln(p_i/p_k)}{\|s_k - s_i\|^2}\right) \left[\frac{\|s_k - s_i\|^2}{t_2^2} + \frac{1}{2N_0}\right] \exp\left[-\frac{t_2^2}{4N_0 \|s_k - s_i\|^2}\right] \right\},   (3.55)

where

\hat{l} = \frac{s_k - s_i}{\|s_k - s_i\|},   (3.56)
t_1 = \|s_k - s_i\|^2 + N_0 \ln\left(\frac{p_k}{p_i}\right),   (3.57)
t_2 = \|s_k - s_i\|^2 + N_0 \ln\left(\frac{p_i}{p_k}\right).   (3.58)
3.4.4 Starting Points
The average signal power is equal to 1/N, or

\sum_{n=1}^{N} p_n \|s_n\|^2 = \frac{1}{N}.   (3.59)

Thus, individual signals must satisfy

p_n \|s_n\|^2 < \frac{1}{N}.   (3.60)

Solving the inequality for s_n we get

\|s_n\| < \frac{1}{\sqrt{p_n N}},   (3.61)

which states that the starting point for signal n must be
selected from inside a circle with radius 1/\sqrt{p_n N}.
3.4.5 Gradient Search Algorithm for the Peak Power Constraint
Even though the average power constraint is used more
often, in certain applications such as space communication the
peak power constraint is much more realistic. Given a peak
power constraint, the transmitted signal points must be placed
inside a circle such that the error rate is minimum. We
would like to modify the iterative procedure developed earlier
to accommodate the peak power constraint. All we need to do is
to further modify the updated constellation at each iteration
to ensure that the peak signal power is bounded.
Let the peak power be P = 1/N. The signal set is first modified
using the iterative rule (gradient search or conjugate
gradient search). Let

M = \max_n \|s_n^*(k+1)\|^2.   (3.62)

We then modify the signal set again in the following manner,

s(k+1) = \frac{s^*(k+1)}{\sqrt{NM}},   (3.63)

to meet the peak power constraint.
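The peak-power correction of eqs. (3.62) and (3.63) is a single rescaling after each update; the sketch below applies it only when the constraint is actually violated, which leaves an already-feasible constellation untouched (a small implementation choice, not from the text):

```python
import math

def enforce_peak(signals):
    """Rescale a constellation so the peak signal power does not exceed
    P = 1/N: compute M = max ||s_n||^2 (eq. 3.62) and divide by
    sqrt(N*M) when needed (eq. 3.63)."""
    N = len(signals)
    m = max(x * x + y * y for x, y in signals)
    if m <= 1.0 / N:
        return list(signals)              # already inside the peak circle
    scale = 1.0 / math.sqrt(N * m)
    return [(x * scale, y * scale) for x, y in signals]

pts = enforce_peak([(1.0, 0.0), (0.0, 2.0), (0.5, 0.5)])
```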
3.5 Minimum Average Cost Signal Selection
Communication channels are almost never error free.
The probability of error (error rate) is one measure of system
performance. Various error types are usually weighted equally,
but in some communication systems certain errors are more
costly than others. The Bayes receiver allows us to rank
the different error types [19]. To utilize the Bayes receiver,
one must know the source statistics, and a reasonable estimate
of a cost matrix must be obtained. The Bayes receiver requires
more a priori knowledge about the communication system than
other receivers, but it results in superior performance if
there is no mismatch between the design and operating conditions.
The goal is to find a signal constellation that minimizes
the average cost subject to a peak or average power
constraint. First a workable expression for the average cost
will be obtained.
3.5.1 Average Cost Function
The average cost, which the Bayes receiver minimizes, is
given by

\bar{C} = \sum_{i=1}^{N} \sum_{j=1}^{N} P(s_j \mid s_i)\, p_i\, L_{ji},   (3.64)

where L_{ji} is the cost involved when the receiver picks s_j when
s_i was actually sent.
The boundaries of the decision regions are no longer
straight lines; instead they are two-dimensional curves. For
instance, the boundary between s_i and s_j is given by

\sum_{k=1}^{N} L_{ik}\, p(r \mid s_k)\, p_k = \sum_{k=1}^{N} L_{jk}\, p(r \mid s_k)\, p_k,   (3.65)

where the conditional probability density is

p(r \mid s_k) = \frac{1}{\pi N_0} \exp\left(-\frac{\|r - s_k\|^2}{N_0}\right).   (3.66)

Channel noise can cause the received signal to move into an
adjacent decision region and result in an error. For large
values of signal-to-noise ratio, however, errors are almost
entirely the result of displacement of a received signal into
an adjacent decision region through the point on the boundary
nearest to the transmitted signal. Let s_i be the transmitted
signal and s_{ji} be the nearest point to s_i on the boundary
with s_j. P(s_j \mid s_i) can then be approximated with a Q
function,

P(s_j \mid s_i) = Q\left(\frac{\|s_i - s_{ji}\|}{\sqrt{N_0/2}}\right).   (3.67)

Substituting back into the average cost expression we get

\bar{C} = \frac{\sqrt{N_0}}{2\sqrt{\pi}} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \ne i}}^{N} L_{ji}\, p_i\, \frac{1}{\|s_i - s_{ji}\|} \exp\left[-\frac{\|s_i - s_{ji}\|^2}{N_0}\right].   (3.68)
3.5.2 The Nearest Points on the Boundaries
The first step in computing the average cost function is
to find, for each signal point in the constellation, the set of
nearest points on the boundaries. There are N signal points in
the constellation and for each one there are (N-1) nearest
points on (N-1) boundaries; therefore a total of N(N-1) nearest
points must be found.
The problem of finding s_{ji}, the nearest point to s_i on
the boundary between s_i and s_j, can be formulated as a
Lagrange multiplier problem. We would like to minimize \|s - s_i\|
subject to the constraint eq. (3.65). Let us form the
auxiliary function

\Phi = \|s - s_i\|^2 + \lambda \left( \sum_{k=1}^{N} L_{ik}\, p(r \mid s_k)\, p_k - \sum_{k=1}^{N} L_{jk}\, p(r \mid s_k)\, p_k \right)   (3.69)

and differentiate it with respect to x, y and \lambda:

\frac{\partial \Phi}{\partial x} = 0,  \frac{\partial \Phi}{\partial y} = 0,  \frac{\partial \Phi}{\partial \lambda} = 0.   (3.70)

Eliminating \lambda we come up with the following set of
equations:

\frac{x - x_i}{y - y_i} = \frac{\sum_{k=1}^{N} (L_{ik} - L_{jk})(x - x_k) \exp(-\|s - s_k\|^2 / N_0)\, p_k}{\sum_{k=1}^{N} (L_{ik} - L_{jk})(y - y_k) \exp(-\|s - s_k\|^2 / N_0)\, p_k},   (3.71)

\sum_{k=1}^{N} L_{ik} \exp\left(-\frac{\|s - s_k\|^2}{N_0}\right) p_k - \sum_{k=1}^{N} L_{jk} \exp\left(-\frac{\|s - s_k\|^2}{N_0}\right) p_k = 0,   (3.72)

where s_i = (x_i, y_i) and s = (x, y) = s_{ji}.
3.5.3 The Gradient of the Cost Function
To evaluate g_k, first the portion of the cost function that
depends on s_k must be formed:

F_k = \frac{\sqrt{N_0}}{2\sqrt{\pi}} \left[ \sum_{\substack{j=1 \\ j \ne k}}^{N} L_{jk}\, p_k\, \frac{1}{\|s_k - s_{jk}\|} \exp\left[-\frac{\|s_k - s_{jk}\|^2}{N_0}\right] + \sum_{\substack{i=1 \\ i \ne k}}^{N} L_{ki}\, p_i\, \frac{1}{\|s_i - s_{ki}\|} \exp\left[-\frac{\|s_i - s_{ki}\|^2}{N_0}\right] \right].

g_{k1} and g_{k2}, the components of g_k, are found by taking the
derivative of the above expression with respect to the
components of s_k:

g_{k1} = -\frac{\sqrt{N_0}}{2\sqrt{\pi}} \left[ \sum_{\substack{j=1 \\ j \ne k}}^{N} L_{jk}\, p_k\, \frac{u_1}{\|s_k - s_{jk}\|} \left[\frac{1}{\|s_k - s_{jk}\|^2} + \frac{2}{N_0}\right] \exp\left[-\frac{\|s_k - s_{jk}\|^2}{N_0}\right] + \sum_{\substack{i=1 \\ i \ne k}}^{N} L_{ki}\, p_i\, \frac{u_2}{\|s_i - s_{ki}\|} \left[\frac{1}{\|s_i - s_{ki}\|^2} + \frac{2}{N_0}\right] \exp\left[-\frac{\|s_i - s_{ki}\|^2}{N_0}\right] \right],   (3.73)

where

u_1 = (x_k - x_{jk})\left(1 - \frac{\partial x_{jk}}{\partial x_k}\right) - (y_k - y_{jk})\, \frac{\partial y_{jk}}{\partial x_k},   (3.74)
u_2 = -(x_i - x_{ki})\, \frac{\partial x_{ki}}{\partial x_k} - (y_i - y_{ki})\, \frac{\partial y_{ki}}{\partial x_k},   (3.75)

and

g_{k2} = -\frac{\sqrt{N_0}}{2\sqrt{\pi}} \left[ \sum_{\substack{j=1 \\ j \ne k}}^{N} L_{jk}\, p_k\, \frac{v_1}{\|s_k - s_{jk}\|} \left[\frac{1}{\|s_k - s_{jk}\|^2} + \frac{2}{N_0}\right] \exp\left[-\frac{\|s_k - s_{jk}\|^2}{N_0}\right] + \sum_{\substack{i=1 \\ i \ne k}}^{N} L_{ki}\, p_i\, \frac{v_2}{\|s_i - s_{ki}\|} \left[\frac{1}{\|s_i - s_{ki}\|^2} + \frac{2}{N_0}\right] \exp\left[-\frac{\|s_i - s_{ki}\|^2}{N_0}\right] \right],   (3.76)

where

v_1 = -(x_k - x_{jk})\, \frac{\partial x_{jk}}{\partial y_k} + (y_k - y_{jk})\left(1 - \frac{\partial y_{jk}}{\partial y_k}\right),   (3.77)
v_2 = -(x_i - x_{ki})\, \frac{\partial x_{ki}}{\partial y_k} - (y_i - y_{ki})\, \frac{\partial y_{ki}}{\partial y_k}.   (3.78)
The gradient of the cost function depends not only on the
nearest points on the boundary but also on their derivatives.
Let us find the derivatives of the components of s_{ji} with
respect to the components of s_i. Differentiating eq. (3.72)
with respect to x_i and y_i we get

\sum_{k=1}^{N} L_{ik}\, w_1 \exp\left[-\frac{\|s - s_k\|^2}{N_0}\right] p_k = \sum_{k=1}^{N} L_{jk}\, w_1 \exp\left[-\frac{\|s - s_k\|^2}{N_0}\right] p_k,   (3.79)

where

w_1 = (x - x_k)\, \frac{\partial x}{\partial x_i} + (y - y_k)\, \frac{\partial y}{\partial x_i},   (3.80)

and

\sum_{k=1}^{N} L_{ik}\, w_2 \exp\left[-\frac{\|s - s_k\|^2}{N_0}\right] p_k = \sum_{k=1}^{N} L_{jk}\, w_2 \exp\left[-\frac{\|s - s_k\|^2}{N_0}\right] p_k,   (3.81)

where

w_2 = (x - x_k)\, \frac{\partial x}{\partial y_i} + (y - y_k)\, \frac{\partial y}{\partial y_i}.   (3.82)

Differentiating eq. (3.71) with respect to x_i gives

\left(\frac{\partial x}{\partial x_i} - 1\right) \sum_{k=1}^{N} (L_{ik} - L_{jk})(y - y_k)\, e^{-\|s - s_k\|^2/N_0}\, p_k + (x - x_i) \sum_{k=1}^{N} (L_{ik} - L_{jk}) \left(\frac{\partial y}{\partial x_i} - \frac{2 z_1}{N_0}(y - y_k)\right) e^{-\|s - s_k\|^2/N_0}\, p_k
= \frac{\partial y}{\partial x_i} \sum_{k=1}^{N} (L_{ik} - L_{jk})(x - x_k)\, e^{-\|s - s_k\|^2/N_0}\, p_k + (y - y_i) \sum_{k=1}^{N} (L_{ik} - L_{jk}) \left(\frac{\partial x}{\partial x_i} - \frac{2 z_1}{N_0}(x - x_k)\right) e^{-\|s - s_k\|^2/N_0}\, p_k,   (3.83)

where

z_1 = (x - x_k)\, \frac{\partial x}{\partial x_i} + (y - y_k)\, \frac{\partial y}{\partial x_i},   (3.84)

and differentiating with respect to y_i gives

\frac{\partial x}{\partial y_i} \sum_{k=1}^{N} (L_{ik} - L_{jk})(y - y_k)\, e^{-\|s - s_k\|^2/N_0}\, p_k + (x - x_i) \sum_{k=1}^{N} (L_{ik} - L_{jk}) \left(\frac{\partial y}{\partial y_i} - \frac{2 z_2}{N_0}(y - y_k)\right) e^{-\|s - s_k\|^2/N_0}\, p_k
= \left(\frac{\partial y}{\partial y_i} - 1\right) \sum_{k=1}^{N} (L_{ik} - L_{jk})(x - x_k)\, e^{-\|s - s_k\|^2/N_0}\, p_k + (y - y_i) \sum_{k=1}^{N} (L_{ik} - L_{jk}) \left(\frac{\partial x}{\partial y_i} - \frac{2 z_2}{N_0}(x - x_k)\right) e^{-\|s - s_k\|^2/N_0}\, p_k,   (3.85)

where

z_2 = (x - x_k)\, \frac{\partial x}{\partial y_i} + (y - y_k)\, \frac{\partial y}{\partial y_i}.   (3.86)

It can be shown (refer to Appendix B) that

\frac{\partial x}{\partial x_j} = 0,  \frac{\partial x}{\partial y_j} = 0,   (3.87)

and

\frac{\partial y}{\partial x_j} = 0,  \frac{\partial y}{\partial y_j} = 0.   (3.88)

Equations (3.75) and (3.78) then imply that v_2 = u_2 = 0.
Contrary to the equations for finding s_{ji}, the equations for
finding its derivatives are linear.
3.5.4 The Iterative Algorithm
To find the signal constellation we start from an
initial guess. An iterative rule (either the gradient search or
the conjugate gradient search algorithm) is selected. An
improved signal set is found using the iterative rule. Depending
on the type of constraint (peak or average power), the
constellation is then modified accordingly (eq. (3.49) or
eq. (3.63)). The iteration is continued until a minimum is
reached.
The above procedure is valid under the assumption of
large channel SNR. When the channel SNR is not large, obtaining
an analytical expression for the gradient is not feasible
because of the complexity of the shape of the decision regions.
An obvious solution is an exhaustive search, which is
computationally very expensive. However, one can use
a modified form of the above procedure to save on the
computational expense. In the modified procedure, the average
cost is computed based on the received data and the gradient-
based update rule is replaced with the direction set
(Powell's) method in multidimensions [20], which does not
require the gradient.
3.5.5 Application to a DPCM System
In a DPCM system some errors are more costly than others;
therefore we can benefit from utilizing a Bayes receiver. In
Chapter Five an optimum method of selecting the cost matrix
will be presented.
3.6 Results
In this section we present some numerical results,
elaborate on the design procedure and discuss the performance
of the methods described in this chapter.
The probability of error, in general, is not a convex
function of the signal set; therefore the algorithm can
converge to local as well as global minima. The multistart
technique can be applied to the global optimization problem
[16]. In this technique one selects an optimization technique
and runs it from a number of different starting points. The
set of all terminating points hopefully includes the global
minimum point. Our implementation of the technique is somewhat
different. To come up with a set of starting points, a
search procedure is first implemented that finds signal sets
having small error rates, initiating from random starting
points. About thirty starting points are selected. Then we run
the gradient search techniques from these starting points.
To run the algorithms we start with a small step size and
monitor the changes in the probability of error as a function
of iteration number. If P_e changes very slowly we can proceed
to increase the step size. On the other hand, we may need to
decrease the step size if P_e starts to oscillate around a
minimum or diverges.
The best constellations for a number of sources with
three, four, five, seven and eight signals were found. Figure
3.6 shows the best constellations for three, five and seven
signals with the probability distributions given in Table 3.1.
We compared the speed of convergence of the conjugate
gradient method to that of the conventional gradient search
method for a given value of step size. The normalized
conjugate gradient method converges to a solution much faster
than the normalized gradient method. A set of initial starting
points and the optimum constellation for seven equally likely
signals are shown in Figure 3.7. The conjugate gradient
method is faster by about an order of magnitude.
To evaluate the gain in using a nonsymmetric signal
TABLE 3.1. VALUES OF SIGNAL-TO-NOISE RATIO AND THE PROBABILITY
DISTRIBUTIONS OF THE NONSYMMETRIC SOURCES.
Source Number   SNR (dB)   Probability Distribution
1               12.2       .8, .1, .1
2               9          .9, .025, .025, .025, .025
3               8          .96, .01, .01, .01, .01
4               15.5       .674, .142, .142, .018, .018, .003, .003

TABLE 3.2. SOURCE STATISTICS FOR THE FOUR SIGNAL SOURCE.
source statistics: .35, .35, .15, .15
Figure 3.6. Best signal constellations for nonsymmetric
sources. Probability distributions are given in Table 3.1.
(a) Source 1. (b) Source 2. (c) Source 3. (d) Source 4.
Figure 3.7. (a) Seven signal constellation for an equally
likely source. O's represent the starting points and x's
represent the final constellation. (b) The solid and dashed
curves show the error rate as a function of the number of
iterations for the gradient search and the conjugate gradient
method, respectively.
design relative to an equally likely signal selection, a first-
order Gauss-Markov source with a correlation coefficient of
0.9 was synthesized. The output was encoded using a first-
order DPCM encoder. A four signal constellation was designed
with the source statistics shown in Table 3.2. The performance
of the nonsymmetric constellation was compared with that of
the well-known equally likely constellation (for four equally
likely signals, the best constellation is formed by the
vertices of a square [16]). Figure 3.9 demonstrates a
comparison between the systems in terms of output signal-to-
noise ratio (output SNR). For large values of channel signal-
to-noise ratio (channel SNR) the two design procedures result
in identical performance. As the channel SNR decreases, the
curves representing the performance of the nonsymmetric and
equally likely signal designs separate and the difference
between the two systems gets larger. The nonsymmetric signal
design is 3 dB (in terms of output SNR) superior to the equally
likely signal design for the noisiest channel considered (a
channel SNR of 4 dB). It is seen that significant improvement
in performance can be obtained for noisy channels by utilizing
nonsymmetric signal design. The amount of improvement is a
function of the source statistics.
A comparison between the equally likely signals and
unequal signal probabilities shows that if the signal
statistics are not very different, the shape of constellation
is not appreciably different from the equally likely case. But
73
if the signal probabilities are very different from the
equally likely case the shape of the constellation could be
very different from the equally likely constellation. Figure
3.6(b) and Figure 3.6(c) show constellations for two different
five signal sources. The constellations for the extremely non
symmetric source is completely different from the other one.
Generally speaking the geometry of the constellations
depends upon the power constraint. For example, the optimum
signal constellations for five equally likely signals subject
to average power and peak power constraints are displayed in
Figure 3.8(a) and Figure 3.8(b). Clearly, the choice of power
constrain affects the geometry.
Simulations show that for large values of signal-to-noise ratio the average cost signal selection does not result in any improvement relative to the minimum error signal design. For other values of signal-to-noise ratio the decision regions form unusual shapes which vary with the signal-to-noise ratio. Figure 3.10 shows the decision regions for a four-signal constellation with the statistics listed in Table 3.2. Because of the complexity of these shapes, obtaining an analytical expression for the gradient was not feasible, so the modified procedure of Section 3.4.10 was utilized.
To compare the minimum error with the minimum cost signal selection, a first-order Gauss-Markov source with a correlation coefficient of 0.9 was generated. The output of the source was encoded with a first-order DPCM system. An eight-level MSQE (mean-squared quantization error) optimized quantizer was used, and the output (reconstruction) signal-to-noise ratio of the two systems was evaluated over a wide range of channel SNR.
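A minimal sketch of this simulation setup follows. The quantizer reproduction levels are approximate tabulated Lloyd-Max values for a Gaussian input, scaled to the prediction-error standard deviation; the exact quantizer values, sample size, and output SNR used in the study may differ.

```python
import numpy as np

def gauss_markov(n, rho=0.9, seed=0):
    """First-order Gauss-Markov source: x[k] = rho*x[k-1] + w[k]."""
    rng = np.random.default_rng(seed)
    w = rng.normal(0.0, np.sqrt(1.0 - rho**2), n)  # unit output variance
    x = np.zeros(n)
    for k in range(1, n):
        x[k] = rho * x[k - 1] + w[k]
    return x

# Approximate 8-level Lloyd-Max (MSQE-optimal) levels for a unit-variance
# Gaussian, scaled by the prediction-error std sqrt(1 - 0.9**2) ~ 0.436.
LEVELS = 0.436 * np.array([-2.152, -1.344, -0.756, -0.245,
                            0.245,  0.756,  1.344,  2.152])

def quantize(e):
    """Nearest reproduction level."""
    return LEVELS[np.argmin(np.abs(LEVELS - e))]

def dpcm(x, a=0.9):
    """First-order DPCM loop; returns quantized errors and reconstruction."""
    xhat, errs, recon = 0.0, [], []
    for s in x:
        eq = quantize(s - a * xhat)   # quantize the prediction error
        xhat = a * xhat + eq          # decoder-matched reconstruction
        errs.append(eq)
        recon.append(xhat)
    return np.array(errs), np.array(recon)

x = gauss_markov(20_000)
errs, recon = dpcm(x)
out_snr = 10.0 * np.log10(np.var(x) / np.mean((x - recon)**2))
```

On a noiseless channel this loop exhibits the familiar DPCM behavior: the reconstruction error equals the quantization error of the prediction residual.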
The results graphed in Figure 3.11 indicate that for large values of channel signal-to-noise ratio the two systems are almost identical. However, for smaller values of signal-to-noise ratio the minimum cost design is superior to the MAP system. In this case, the improvement in the output signal-to-noise ratio is around 0.5 dB for intermediate values of channel signal-to-noise ratio and 1 dB for low values of channel SNR.
Figure 3.8. The best signal constellations for five equally
likely signals subject to average power (a), and peak power
(b) constraints.
Figure 3.9. Performance results for nonsymmetric signal selection (the solid curve) and the equally likely signal design (the dashed curve) over a wide range of channel signal-to-noise ratio.
Figure 3.10. The decision regions for minimum cost signal
design. (a) channel SNR=3.5 dB and (b) channel SNR=11 dB. The
signals are represented by *'s and the decision regions are
marked by distinct symbols.
Figure 3.11. Comparison between minimum cost signal selection (the solid curve) and minimum error signal design (the dashed curve) for a first-order Gauss-Markov source with correlation coefficient of 0.9.
CHAPTER FOUR
ALPHABET SIZE SELECTION FOR VIDEO SIGNAL CODING
4.1 Introduction
Previously, methods of signal selection for nonsymmetric sources were studied. Here we utilize those methods in the design of an image transmission system.
An image transmission system typically consists of three parts: a source coder, a channel coder and a modulator.
Source coding is the first step in the transfer of information from a source. The purpose of source coding is to remove as much redundancy as possible from the source. DPCM of video signals with as few as seven quantization levels has been shown to produce pictures virtually indistinguishable from the original under standard viewing conditions [21]. Since the output of a DPCM system is a nonsymmetric source, an entropy coder must be used to take advantage of the remaining redundancy. The well-known Shannon-Fano method and the Huffman procedure are examples of entropy coding techniques [22].
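As an illustration of how such an entropy coder exploits skewed symbol statistics, the following sketch builds a binary Huffman code for the seven difference levels of the low-detail scene in Table 4.1. The symbol names are hypothetical labels, and this is a generic Huffman construction, not necessarily the exact coder used in the study.

```python
import heapq
from itertools import count

def huffman(probs):
    """Build a binary Huffman code; returns {symbol: codeword string}."""
    tie = count()  # tie-breaker so heapq never compares the dicts
    heap = [(p, next(tie), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]

# Seven DPCM difference levels for the low-detail scene (Table 4.1)
probs = {"y0": 0.674, "y+1": 0.142, "y-1": 0.142,
         "y+2": 0.018, "y-2": 0.018, "y+3": 0.003, "y-3": 0.003}
code = huffman(probs)
avg_len = sum(probs[s] * len(code[s]) for s in probs)  # bits per symbol
```

The most probable level (the zero difference) receives a one-bit codeword, so the average message length falls well below the three bits a fixed-length code would need.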
In channel coding the goal is to correct errors introduced in the channel by inserting redundancy in the data. Channel coding is accomplished by either decreasing the information rate or increasing the bandwidth. Block and convolutional codes are the two major classes of error correction codes.
Modulation is the process of mapping a baseband signal into a bandpass signal. To transmit the channel encoded signals, one must design a signal constellation that minimizes the error rate under a power constraint. This issue was addressed in Chapter Three.
Even though efficient techniques for source coding, channel coding and signal design exist, it is not known how the choice of alphabet size affects the performance of a communication system. In other words, given a nonsymmetric memoryless source with a known probability distribution, it is not clear what alphabet size results in the smallest error rate subject to equal average power, bandwidth and information rate. The purpose of this chapter is to explore the relationship between signal constellation size and a system performance measure (the error rate) for video signals under different bandwidth constraints.
4.2 Preliminaries
4.2.1 Description of the Communication Systems
The block diagrams of the systems under consideration are shown in Figures 4.1-4.2. Huffman optimal source coding is used to encode the source signals.
BCH codes are chosen for channel coding (if extra bandwidth is available) because codes close to any rate can be found. A q-ary BCH code is denoted by $(n, k_q, t_q)$, where $k_q$ represents the number of information symbols, $n$ represents the block size and $t_q$ represents the error correction capability of the code.
The receivers are coherent in-phase/quadrature detectors (except for BPSK, where only the in-phase branch is needed), and perfect carrier and symbol synchronization is assumed.
The code words for the nonbinary systems are not equally likely. A minimum error procedure is the optimum way of decoding at the receiver, but a maximum likelihood procedure can be used as a suboptimal alternative. The Berlekamp-Massey procedure will be used for decoding the received data [23]-[25].
4.2.2 Communication Channel
The communication channel is modeled as a q-ary independent error channel, and the source alphabet and channel output alphabet are the same. If a symbol $s_i$ is transmitted, there is a probability $p_{ji}$ that $s_j$ is received ($j \neq i$), a probability $p_{ki}$ that $s_k$ is received ($k \neq i$), and so forth. Therefore, the probability that $s_i$ is received correctly is

$$p_{ii} = 1 - \sum_{\substack{n=1 \\ n \neq i}}^{q} p_{ni} . \qquad (4.1)$$

Notice that $p_{ji}$ depends on the modulation scheme. Here a two-dimensional signal constellation is utilized. The channel matrix is

$$P = [\,p_{ji}\,] , \qquad (4.2)$$
Figure 4.1. The model of the communication channel.
where $p_{ji}$ is given (using the Gaussian tail approximation) by

$$p_{ji} \approx \sqrt{\frac{N_0}{\pi}}\,\frac{1}{|s_i - s_j|}\,\exp\!\left[-\frac{|s_i - s_j|^2}{4 N_0}\right] . \qquad (4.3)$$

It is also obvious that

$$p_{ji} = p_{ij} , \qquad 1 \leq i \leq q , \; 1 \leq j \leq q . \qquad (4.4)$$
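The channel matrix can be assembled numerically from a constellation. The sketch below uses the exact pairwise Q-function error probability rather than the exponential approximation, and treats the off-diagonal entries as pairwise (two-signal) error probabilities, which ignores the exact joint decision regions; the four-point constellation is hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def Q(x):
    """Gaussian tail function Q(x) = 0.5*erfc(x/sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2.0))

def channel_matrix(signals, N0):
    """Approximate q-ary transition matrix: off-diagonal entries are
    pairwise error probabilities Q(|s_i - s_j| / sqrt(2*N0)); the
    diagonal follows Eq. (4.1), so every column sums to one."""
    q = len(signals)
    P = np.zeros((q, q))
    for i in range(q):
        for j in range(q):
            if i != j:
                d = abs(signals[i] - signals[j])
                P[j, i] = Q(d / sqrt(2.0 * N0))  # P(receive s_j | send s_i)
    for i in range(q):
        P[i, i] = 1.0 - P[:, i].sum()            # Eq. (4.1)
    return P

# hypothetical 4-signal two-dimensional constellation as complex points
signals = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
P = channel_matrix(signals, N0=0.5)
```

For a symmetric constellation such as this one, the matrix also satisfies the symmetry property of Eq. (4.4).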
4.3 Analysis
For a fair comparison of systems employing different
alphabet sizes, average power, bandwidth and information rate
are held equal for all systems.
In practice the bandwidth constraint may be very strict, or it may be somewhat relaxed. Therefore two cases are considered:
I. The bandwidth constraint is strict and cannot be relaxed. The binary system against which all other systems will be compared uses source coding, BPSK modulation and, due to the strict bandwidth limitation, no channel coding. For the same information rate, higher alphabet systems send longer pulses and require less bandwidth. Therefore these systems can use the extra bandwidth for error-correction codes (Figure 4.1).
Here nonbinary BCH codes are used for error correction. For a nonbinary system of alphabet size q, a q-ary $(n, k_q, t_q)$ code must be found to satisfy the equal information rate constraint

$$\frac{n}{k_q} = \frac{h_b}{h_q} , \qquad (4.5)$$

where $h_b$ and $h_q$ are the average message lengths for the binary and q-ary systems, respectively.
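Selecting a q-ary code to satisfy Eq. (4.5) amounts to searching the available (n, k, t) BCH codes for the rate closest to h_b/h_q. A small sketch, with purely illustrative average message lengths:

```python
def pick_code(codes, h_b, h_q):
    """Pick the (n, k, t) BCH code whose expansion n/k best matches
    the equal-information-rate target h_b/h_q of Eq. (4.5)."""
    target = h_b / h_q
    return min(codes, key=lambda c: abs(c[0] / c[1] - target))

# candidate 3-ary codes from Table 4.4(b)
codes = [(26, 19, 2), (26, 14, 3)]
best = pick_code(codes, h_b=1.9, h_q=1.3)  # illustrative message lengths
```

Since BCH codes exist close to any desired rate, this nearest-rate search is usually sufficient in practice.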
Figure 4.2. The transmitters for systems employing (a) binary
(b) nonbinary alphabets.
In variable length coding (such as Huffman coding) a symbol decoding error propagates through the block. Therefore, it is appropriate to use the word error probability as the figure of merit. For the binary system, $P_w$ is given by

$$P_w = 1 - (1 - p_e)^n , \qquad (4.6)$$

where $p_e$ is the crossover probability. For a BPSK system, $p_e$ is known to be

$$p_e = Q\!\left(\sqrt{2\epsilon}\right) , \qquad (4.7)$$

where $\epsilon$ is the signal-to-noise ratio.
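Equations (4.6) and (4.7) combine into a short routine; the word length and linear SNR value below are illustrative, and Q is expressed via the standard complementary error function identity.

```python
from math import erfc, sqrt

def Q(x):
    """Gaussian tail function Q(x) = 0.5*erfc(x/sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2.0))

def bpsk_word_error(n, snr):
    """Eqs. (4.6)-(4.7): word error for n uncoded BPSK bits at linear SNR."""
    p_e = Q(sqrt(2.0 * snr))          # BPSK crossover probability
    return 1.0 - (1.0 - p_e) ** n

Pw = bpsk_word_error(26, 4.0)         # 26-bit word at a linear SNR of 4 (~6 dB)
```

Because an uncoded word fails if any single bit fails, the word error grows rapidly with the block length n.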
II. The bandwidth constraint is not very strict and can be somewhat relaxed. Let us allow some bandwidth expansion for the binary system (Figure 4.2). As a result, the nonbinary systems employing higher order alphabets will enjoy even greater bandwidth expansion.
[Block diagram: binary source → DPCM → Huffman encoding → BPSK]
Figure 4.3. Transmitter for the binary system when the
bandwidth can be expanded.
Let the bandwidth expansion for the binary system be a fraction $\beta$. To satisfy the equal information rate constraint, a binary BCH $(n, k, t)$ code and a q-ary BCH $(n_q, k_q, t_q)$ code must be found such that

$$\frac{n}{k} = 1 + \beta , \qquad \frac{n_q}{k_q} = \frac{h_b}{h_q}\,(1 + \beta) . \qquad (4.8)$$
The binary system makes no error unless $t+1$ or more of the $n$ total bits in a word are in error. The probability of $t+1$ or more symbol errors is given by

$$P_w = \sum_{i=t+1}^{n} \binom{n}{i}\, p_e^{\,i}\, (1 - p_e)^{\,n-i} , \qquad (4.9)$$

where $p_e$ is the crossover probability. For $p_e \ll 1$, $P_w$ can usually be approximated by

$$P_w \approx \binom{n}{t+1}\, p_e^{\,t+1}\, (1 - p_e)^{\,n-t-1} . \qquad (4.10)$$
Let the word error probability for the system using the q-ary alphabet be $P_{wq}$, and let

$$\frac{n}{n_q} = m , \qquad (4.11)$$

where $n$ and $n_q$ are the block lengths for the binary and q-ary systems, respectively, and $m$ is an integer. Then the word error rate for the q-ary system can be expressed as

$$P_{wq} = 1 - (1 - P_B)^m , \qquad (4.12)$$

where $P_B$ denotes the error probability of a single q-ary code block, computed as in (4.9).
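Equations (4.9)-(4.12) can be checked numerically. The block parameters, symbol error probability, and the value of m below are illustrative only:

```python
from math import comb

def word_error_exact(n, t, p):
    """Eq. (4.9): probability of more than t symbol errors in n symbols."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i)
               for i in range(t + 1, n + 1))

def word_error_approx(n, t, p):
    """Eq. (4.10): dominant term of the sum, valid when p << 1."""
    return comb(n, t + 1) * p ** (t + 1) * (1.0 - p) ** (n - t - 1)

n, t, p = 26, 2, 0.01                  # illustrative code parameters
exact = word_error_exact(n, t, p)
approx = word_error_approx(n, t, p)

m = 2                                  # illustrative value of n/n_q, Eq. (4.11)
Pwq = 1.0 - (1.0 - exact) ** m         # Eq. (4.12)
```

Since the approximation keeps only the first term of the tail sum, it always slightly underestimates the exact word error; for small p the gap is a few percent.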
4.4 Implementation Issues
As mentioned earlier, DPCM of video signals with as few as seven quantization levels has been shown to produce pictures virtually indistinguishable from the original. The quantizer uses seven quantization levels: a zero difference and three graded sizes each of positive and negative differences.
In order to make the results meaningful, we used pictures of different content. Low, medium and high detail pictures were used in the simulations. Table 4.1 gives the probabilities associated with each quantization level for three different scenes, namely a low detail scene MICHAEL, a medium detail scene NARROWS and a high detail scene BANKSIAS [21]. It is observed that

$$p(y_0) > p(y_{+1}) = p(y_{-1}) > p(y_{+2}) = p(y_{-2}) > p(y_{+3}) = p(y_{-3}) , \qquad (4.13)$$

where $p(y_i)$ denotes the probability of the ith difference level.
The primitive irreducible polynomials over the nonbinary fields are given in Table 4.2. The minimal polynomials and the generator polynomials of the nonbinary codes are given in Tables 4.4, 4.5, 4.6 and 4.7. Notice that GF(4), the ground field for GF(4^2), is itself an extension of GF(2). The elements of GF(2^2) are 0, 1, A and B, where B = A^2. The arithmetic tables for GF(4) are provided in Table 4.3.
Alphabets of size 6 are not used in the simulations because 6 is neither a prime nor a power of a prime and does not lead to a BCH code implementation.
TABLE 4.1. DIFFERENCE SYMBOL PROBABILITIES.

Picture          p(y0)   p(y+1) = p(y-1)   p(y+2) = p(y-2)   p(y+3) = p(y-3)
Low detail       .674    .142              .018              .003
Medium detail    .584    .172              .032              .004
High detail      .5      .166              .064              .02

TABLE 4.2. PRIMITIVE IRREDUCIBLE POLYNOMIALS OVER NONBINARY FIELDS.

Field    Polynomial
GF(2)    x^2 + x + 1
GF(3)    x^3 + 2x + 1
GF(4)    x^2 + x + A
GF(5)    x^2 + x + 2
GF(7)    x^2 + x + 3
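The probabilities in Table 4.1 determine the entropy of the difference symbols, which lower-bounds the average Huffman message length h_b appearing in Eq. (4.5). The sketch below computes it for all three scenes (the scene labels are shorthand for the table rows):

```python
from math import log2

scenes = {  # difference-level probabilities from Table 4.1
    "low":    [0.674] + [0.142, 0.018, 0.003] * 2,
    "medium": [0.584] + [0.172, 0.032, 0.004] * 2,
    "high":   [0.5]   + [0.166, 0.064, 0.02] * 2,
}
entropy = {name: -sum(p * log2(p) for p in probs)
           for name, probs in scenes.items()}
```

As expected, higher-detail scenes have flatter symbol distributions and hence higher entropy, so they compress less and leave less spare bandwidth for channel coding.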
TABLE 4.3. ADDITION AND MULTIPLICATION TABLES FOR GF(4).

+ | 0 1 A B
0 | 0 1 A B
1 | 1 0 B A
A | A B 0 1
B | B A 1 0

x | 0 1 A B
0 | 0 0 0 0
1 | 0 1 A B
A | 0 A B 1
B | 0 B 1 A
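The tables above can be reproduced mechanically by representing GF(4) elements as 2-bit polynomials over GF(2) reduced modulo x^2 + x + 1; this is the standard construction, though the particular bit encoding below is an implementation choice.

```python
# GF(4) as 2-bit polynomials over GF(2) modulo x^2 + x + 1:
# 0 -> 0b00, 1 -> 0b01, A -> 0b10, B = A^2 -> 0b11
def gf4_add(a, b):
    """Addition in characteristic 2 is bitwise XOR."""
    return a ^ b

def gf4_mul(a, b):
    """Schoolbook polynomial multiply with modular reduction."""
    r = 0
    for _ in range(2):                 # 2-bit operands
        if b & 1:
            r ^= a                     # add shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0b100:                  # degree-2 term: reduce mod x^2+x+1
            a ^= 0b111
    return r

O, I, A, B = 0b00, 0b01, 0b10, 0b11
```

Evaluating gf4_mul over all element pairs regenerates the multiplication table of Table 4.3 (for instance, A x A = B and B x B = A).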
TABLE 4.4. (A) MINIMAL POLYNOMIALS FOR ELEMENTS OF GF(27). THE POWERS OF α ARE SHOWN IN THE LEFT COLUMN AND THE COEFFICIENTS OF THE MINIMAL POLYNOMIALS ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMNS. (B) GENERATOR POLYNOMIALS FOR 3-ARY CODES. COEFFICIENTS OF G(X) ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMN. THE POWERS OF α ARE GIVEN IN THE MIDDLE COLUMN.
GF(3^3)   Minimal Polynomial
1,3,9 1 2 0 1
2,6,18 2 1 1 1
4,10,12 2 0 1 1
13 1 1
14,16,22 2 2 0 1
5,15,19 1 1 2 1
17,23,25 1 0 2 1
(a)
3-ary code   Powers of α   g(x)
(26,19,2) 13,14,15,16 20111201
(26,14,3) 14,15,16,17,18,19 1122002000021
(b)
TABLE 4.5. (A) MINIMAL POLYNOMIALS FOR ELEMENTS OF GF(16). THE POWERS OF α ARE SHOWN IN THE LEFT COLUMN AND THE COEFFICIENTS OF THE MINIMAL POLYNOMIALS ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMNS. (B) GENERATOR POLYNOMIALS FOR 4-ARY CODES. COEFFICIENTS OF G(X) ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMN. THE POWERS OF α ARE GIVEN IN THE MIDDLE COLUMN.
GF(4^2)   Minimal Polynomial
0         1 1
1,4       A 1 1
2,8       B 1 1
3,12      1 B 1
5         A 1
6,9       1 A 1
7,13      A A 1
10        B 1
11,14     B B 1
4-ary code   Powers of α   g(x)
(15,12,1) 0,1 A B 0 1
(15,9,2) 1,2,3,4 1 A A 1 1 B 1
(15,7,3) 0,1,2,3,4,5 A 0 B 1 B B 1 1
TABLE 4.6. (A) MINIMAL POLYNOMIALS FOR ELEMENTS OF GF(25). THE POWERS OF α ARE SHOWN IN THE LEFT COLUMN AND THE COEFFICIENTS OF THE MINIMAL POLYNOMIALS ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMNS. (B) GENERATOR POLYNOMIALS FOR 5-ARY CODES. COEFFICIENTS OF G(X) ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMN. THE POWERS OF α ARE GIVEN IN THE MIDDLE COLUMN.
GF(5^2)   Minimal Polynomial
0 4 1
1,5 2 1 1
2,10 4 3 1
3,15 3 0 1
4,20 1 4 1
6 3 1
7,11 3 2 1
8,16 1 1 1
9,21 2 0 1
12 1 1
13,17 2 4 1
14,22 4 2 1
18 2 1
19,23 3 3 1
(a)
5-ary code   Powers of α   g(x)
(24,15,3) 0,1,2,3,4,5 1322210121
(24,12,4) 0,1,2,3,4,5,6,7 4102441132021
TABLE 4.7. (A) MINIMAL POLYNOMIALS FOR ELEMENTS OF GF(49). THE POWERS OF α ARE SHOWN IN THE LEFT COLUMN AND THE COEFFICIENTS OF THE MINIMAL POLYNOMIALS ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMNS. (B) GENERATOR POLYNOMIALS FOR 7-ARY CODES. COEFFICIENTS OF G(X) ARE SHOWN IN ASCENDING ORDER IN THE RIGHT COLUMN. THE POWERS OF α ARE GIVEN IN THE MIDDLE COLUMN.
GF(7^2)   Minimal Polynomial
1,7       3 1 1
2,14      2 5 1
3         6 6 1
4         4 0 1
5         5 3 1
6         1 4 1
8         4 1
9,15      6 3 1
10        4 1 1
11        5 4 1
12        1 0 1
13        3 2 1
16        5 1
(a)
7-ary code   Powers of α   g(x)
(48,31,5) 1 through 10 235134333561123361
(48,27,6) 1 through 12 3202052111645042656531
(48,24,8) 1 through 16 3655056652662113433534431