BOUNDS ON RELIABILITY FOR BINARY CODES
IN A GAUSSIAN CHANNEL
By
JAMES ROBERT WOOD
A DISSERTATION PRESENTED TO THE GRADUATE COUNCIL OF
THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
August, 1964
ACKNOWLEDGMENTS
The author wishes to express his deep appreciation to
Dr. W. W. Peterson and Dr. T. S. George for their guidance and
counsel during the course of the research reported herein. A
large measure of gratitude is due also to the International
Business Machines Corporation, whose extraordinarily generous
support made this work possible.
TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
ABSTRACT

SECTION

I. INTRODUCTION
   A. The Coding Problem
   B. The Investigation
   C. The Plan of This Paper

II. THE COMMUNICATION SYSTEM DEFINED
   A. The Channel
   B. The Code
   C. The Noise Distribution
   D. The Decoding System
   E. Euclidean Distance as Related to Hamming Distance

III. BOUNDS ON RELIABILITY
   A. The Concept of a Bound
   B. First Upper Bound on E(R)
   C. First Lower Bound on E(R)
   D. Second Upper Bound on E(R)
   E. Second Lower Bound on E(R)

IV. COMPARISON WITH THE UNRESTRICTED THEORY
   A. Preliminary Remarks
   B. A Comparison of Upper Bounds
   C. A Comparison of Lower Bounds

V. CONCLUSIONS
   A. Summary of the Investigation
   B. Suggestions for Future Work

APPENDIX
LIST OF REFERENCES
BIOGRAPHICAL SKETCH
LIST OF FIGURES

1. Interpretation of First Lower Bound
2. Bounds on Reliability, Unrestricted Theory
3. Comparison of Reliability for A = 1/2
4. Comparison of Reliability for A = 1
5. Comparison of Reliability for A = 2
6. Comparison of Reliability for A = 3
7. Comparison of Reliability for A = 4
8. Comparison of Reliability for A = 8
9. Comparison of Rate vs. A
10. First Lower Bound on Reliability, A = 1/2, 1, 2, 3
11. First Lower Bound on Reliability, A = 4
12. First Lower Bound on Reliability, A = 8, 16
13. Low Rate Bound Compared with Sphere-Packing Bound for p = 0.01
Abstract of Dissertation Presented to the Graduate Council in
Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
BOUNDS ON RELIABILITY FOR BINARY CODES IN A GAUSSIAN CHANNEL
By
James Robert Wood
August 8, 1964
Chairman: Dr. T. S. George
Major Department: Electrical Engineering
This paper will report the results of an investigation of a particular coding method for continuous channels. Specifically, binary linear codes (group codes) are employed as a coding method for the inputs to a time-discrete continuous channel. This channel is presumed to be perturbed only by additive noise with a Gaussian amplitude distribution which affects each transmitted digit independently.

Shannon has obtained bounds on the optimum probability of error when the input signals are considered to be sequences of n real numbers, subject only to the constraint that the signal power in each sequence be a constant. This is a nominal restriction, resulting in a very general theory. The use of group codes for the input signal sequences restricts the individual numbers in the n-number sequence to take on only one of two distinct values and further requires that the input sequences be capable of being placed into one-to-one correspondence with a group code. This is more restrictive than the general case but has the advantage of being a constructive method of establishing the input sequences.
Four bounds, two upper and two lower, on the reliability of such a coding method are derived. The upper bounds show that the reliability of the binary coding is bounded away from the optimum for some ranges of the transmission rate. The range of rate over which these bounds give useful information is a function of the signal power to noise power ratio. The lower bounds on reliability for the binary coding technique are below those for the general case for all ranges of transmission rate. This is an expected result, since the binary case is a special case of the general theory. For a reasonably broad range of transmission rate, however, the lower bounds are quite close together, indicating that the binary case can guarantee reliability only slightly worse than that guaranteed by the general technique. The results of the investigation show that the use of group codes as input signal sequences is promising.
SECTION I
INTRODUCTION
A. The Coding Problem
Shannon (1)* has established that it is possible to transmit
information from a source to a receiver over a communications channel
in such a way that the probability that an error will occur can be
made as small as desired, provided that the rate of this information
transmission does not exceed a value called the channel capacity.
This startling and not at all obvious result is achieved by associating with the information to be transmitted additional quantities
of data which serve to detect and to correct errors introduced by a
noisy channel. Shannon's classic results, while demonstrating that
such a performance is possible, are in effect existence theorems
which do not provide constructive means for achieving this reliable
information transmission. The coding problem is that of devising
methods of appending redundant data to desired information in order
to achieve these results which are known to be possible.
The purpose of introducing redundancy into messages by
coding is to combat the effects of the noise which is present in
*Underlined numbers in parentheses refer to the List of
References at the end of this paper.
the transmission medium or channel. It follows, then, that work in
coding theory has been broadly classified according to the type of
channel over which communication is to take place.
In general, a channel may be considered as a device which
transforms successive input events, each represented by a point x
of an input space X, into output events, each represented by a
point y of an output space Y. This transformation of x to y is
governed by a conditional probability distribution P(Y/X) which is
determined by the noise in the channel.
Channels are usually classified according to the types of
the input and output spaces. If the input and output spaces are
discrete, the channel is said to be discrete. If the input and output spaces are continuous, the channel is said to be continuous. Discrete-to-continuous and continuous-to-discrete channels would also be possibilities.
Successive events in a discrete channel form a time-discrete sequence. However, two possibilities arise in the case of a continuous space. A point representing an event may be allowed to change only at specified instants of time. If the channel has input and output spaces of this type, it is said to be time-discrete with continuous amplitudes. Alternatively, the point representing an event may be free to change its value at any time, i.e., to move continuously. If the channel has input and output spaces of this type, it is said to be time-continuous with continuous amplitudes. Coding theory may then be classified according to its application in discrete channels, time-discrete channels with continuous amplitudes, or time-continuous channels with continuous amplitudes.
Research in coding for the discrete channel has been largely
centered on binary channels, wherein the input event or signal can
have only two states. Results in this area have been voluminous,
both as to coding methods and to evaluation of possible ranges and
bounds on error probabilities for various classes of discrete codes.
Peterson (2) presents a comprehensive study of current practices and
the present state of research in discrete channel coding theory. As
an indication of the complexity of this problem, it is significant
that even today Elias' Error-Free Coding (3) is the only known example of a constructive error coding technique which permits the realization of an error probability approaching zero as we increase without limit the size of the information blocks transmitted, while maintaining a nonzero information rate. Even this method requires that information be transmitted at a rate well below channel capacity, which is the theoretical upper limit.
The search for techniques for incorporation of redundancy into signals for continuous channels, both time-discrete and time-continuous, has proceeded largely under the name of signal design, rather than coding. Many of the error-correcting techniques considered have been correlation methods. While the object is the same as coding for discrete channels, the diversity of the mathematical and conceptual techniques has resulted in relatively little interplay between coding theory and signal design.
The application of coding theory to continuous channels has been restricted almost exclusively to time-discrete channels with continuous amplitudes. Shannon (4) has obtained results giving the possible limits on error probabilities for a particular type of input space. Franco and Lachs (5) and Harmuth (6) have investigated the use of orthogonal functions as signals for a time-discrete channel in a manner which is reminiscent of discrete coding theory.

There has been, to the author's knowledge, very little work done on coding for time-continuous channels with continuous amplitudes except for calculations of channel capacity, typically by Fano (7). There are two basic reasons for this. First, the transition from time-discrete to time-continuous channels is mathematically a formidable step. Second, while many channels which are of practical interest are of the time-continuous type, they may in many cases be represented to a satisfactory degree of accuracy as time-discrete channels, usually through sampling techniques.
B. The Investigation
This paper will report the results of an investigation of a particular coding method for time-discrete channels with continuous amplitudes. Specifically, binary linear codes (group codes) will be employed as a coding method for the inputs to a continuous channel which is presumed to be perturbed only by additive noise with a Gaussian amplitude distribution which affects each transmitted digit independently. The signals and the channel will be more precisely defined in a later section of this paper.

Shannon (4) has obtained results for a channel of this type when the input signals are considered to be sequences of n real numbers, subject only to the constraint that the signal power in each sequence be a constant. This is a nominal constraint, resulting in a very general theory for this type of channel. The use of group codes for the input signal sequences will restrict the individual numbers in the n-number sequences to take on only one of two distinct values, and will further require that the input sequences be capable of being placed into one-to-one correspondence with a group code. This method is much more constrained than the largely unrestricted signal sequences allowed in the work of Shannon. It has, however, the advantage of being a constructive method of establishing the input sequences. The bulk of the investigation is then an establishing of comparative results between the group code method and the unrestricted theory.
Specifically, it is desired to compare the reliability of the two methods. Reliability has a precise definition which will be given later, but it is in essence a measure of the error-correcting capabilities of a code. Exact figures for reliability are generally not obtainable when a large class of codes is considered, due to the many different codes within a class and also to inherent mathematical difficulties. Instead, upper and lower bounds on reliability are commonly obtained. For a specified code length, an estimate or bound on reliability does not give a very accurate determination of the probability of error of the code. However, given a desired level of error probability, a knowledge of reliability will permit a reasonably sharp estimate of the required length of the code. This is often the actual problem faced in coding applications.
Shannon's unrestricted results contain four bounds on reliability,
two upper bounds and two lower bounds. In this investigation, two new
upper and two new lower bounds on reliability are presented. These
bounds give a measure of the loss in reliability incurred when the
restrictive encoding method using group codes is employed. Certain
allied results obtained in the investigation will also be presented.
C. The Plan of This Paper
This paper contains five numbered sections and an appendix.
The first section is this Introduction. Section II defines the
channel and the encoding technique. Section III presents the derivation
of the bounds. Section IV compares the results of this investigation
with the unrestricted case. Section V contains conclusions and
recommendations for further work. Allied results are in the Appendix.
SECTION II
THE COMMUNICATION SYSTEM DEFINED
A. The Channel
The type of communication channel with which this investigation is concerned is termed a "continuous, time-discrete" channel. Fano (8) describes a continuous, time-discrete channel as one wherein the input and output events are represented by points of continuous, Euclidean space, but these points are permitted to change their positions only at specified time instants.
For simplicity, it may be assumed that the input changes once each
second, and that at any given time the input consists of a real
number. The input, then, is a sequence of real numbers which change
once each second. The ith real number will be denoted u_i.
The channel is assumed to be perturbed by an additive noise, whose amplitude has a Gaussian distribution and which affects each u_i independently. At the receiver, then, the ith real number will be observed as u_i + n_i, where the n_i are independent Gaussian random variables, all of which are assumed to have the same variance N. The assumption that the noise affecting the channel is of this type results in an admittedly highly idealized channel. It is felt, however, that understanding this channel thoroughly will be very helpful if more difficult generalizations are attempted.

In this investigation, the values of the u_i are restricted to be one of two distinct numbers. These numbers could be arbitrarily chosen, but it is shown in the Appendix, Section A, that because of the structure of group codes, minimum signal power will result when these numbers are chosen as ±B, where B is a real number. Hence, the ith real number observed at the receiver will have the form ±B + n_i.
This channel may be considered to be a form of sampled-data communication. However, the arguments to be used in developing the bounds on reliability will be largely geometric. No consideration will be given to the origin of these inputs or to the allied question of whether a continuous function of time can be adequately represented in the form given above. These problems arise in the application of this channel in a sampled system.
B. The Code
The sequences of real numbers used as inputs to the channel will be arranged as a block code. A block code is a code that uses sequences of n symbols, or n-tuples. Each sequence of n symbols is termed a code word or code block. With the restriction that each input symbol may take on only the values ±B, there are 2^n n-tuples which could be used. Of this total number, only M of these will be used as code words. In this paper, it will be further required that each ensemble of M code words be in one-to-one correspondence with a binary linear code or group code. Formally, this correspondence can be achieved by mapping a binary linear code into the set of all n-tuples containing ±B as elements, where "1" is mapped into "B," and "0" is mapped into "-B." The resulting subset of n-tuples with ±B as elements is then in one-to-one correspondence with the binary linear code.

In the notation of group codes, an (n,k) code is a code of length n in which each code word contains k information digits and (n-k) redundancy digits. There are M = 2^k code words in an (n,k) code.
A convenient interpretation of the ensemble of M code words is that they represent M points in n-dimensional Euclidean space. The origin of coordinates in this space is the zero vector, and a typical point of the ensemble might be (B, -B, B, ..., -B). The points in n-space whose coordinates consist of either B or -B will be called "binary points." Since coding involves the selection of M of the possible 2^n binary points, each code word will be a binary point, while the converse is not true. Each binary point is at the same distance from the origin, since in n-space (9) the length R of a line is given by

R^2 = \sum_{i=1}^{n} (X_i - Y_i)^2

where X_i and Y_i are the ith components of the vectors X and Y, respectively. Consider Y to be the origin and let X be any signal vector. Then

R^2 = \sum_{i=1}^{n} B^2 = nB^2

Thus, each binary point lies on the surface of an n-dimensional sphere of radius B\sqrt{n}.
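The mapping from code words to binary points and the constant radius B√n can be illustrated with a short sketch; the specific (3, 2) code and the value of B below are illustrative choices, not values from the text.

```python
import math

def to_binary_point(word, B):
    # "1" is mapped into B, "0" into -B, as in the correspondence above.
    return [B if bit == 1 else -B for bit in word]

B = 1.5
# Words of a small illustrative (3, 2) binary linear code.
code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
points = [to_binary_point(w, B) for w in code]

# Every binary point lies at the same distance B * sqrt(n) from the origin.
n = 3
radii = [math.sqrt(sum(x * x for x in p)) for p in points]
```

Each radius squared equals nB², so the four points all sit on the same sphere.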
A motivation for the terminology "unrestricted case" as applied to the case treated by Shannon (4) is apparent in the foregoing discussion. When input signal sequences are permitted to assume any value subject to the constraint of constant power for each sequence, it can be shown that these sequences may lie anywhere on the surface of an n-dimensional sphere. When only two levels are used, however, the sequences are restricted to the binary points.
C. The Noise Distribution
The noise present in the channel may also be given a geometrical interpretation. Each coordinate of the signal vector is affected independently by an additive noise n_i whose amplitude is Gaussianly distributed. At any given signal point, each n_i is Gaussian with zero mean and variance N. The displacement of any signal point from its original position due to noise can then be considered to be a random variable n = (n_1, n_2, ..., n_n), where each n_i is independent and has a one-dimensional Gaussian distribution. Cramer (10) shows that such a random variable has a probability distribution of the general form

p(n) = \frac{1}{(2\pi)^{n/2}\sqrt{A}} \exp\left(-\frac{1}{2A}\sum_{j,k} A_{jk}\, n_j n_k\right)    (1)

where A is the determinant of the moment matrix

A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}    (2)

In this notation, a_{ii} is the variance of the n_i, given by

a_{ii} = E(n_i - m_i)^2

and a_{ij} is the covariance between n_i and n_j, given by

a_{ij} = E\left[(n_i - m_i)(n_j - m_j)\right]

where m_i is the mean of each n_i and E is the usual notation for average or expected value. By hypothesis, each m_i is zero and each variance is N.
The form of the probability distribution of n can be obtained by determining the probability distribution of the standardized variable t = (t_1, t_2, ..., t_n), where

t_i = (n_i - m_i)/s_i = n_i/\sqrt{N}

s_i being the standard deviation of n_i. Then

a_{ii} = E(t_i^2) = \frac{1}{N} E(n_i^2) = 1

and

a_{ij} = E(t_i t_j) = E(t_i)\, E(t_j) = 0, \quad i \neq j,

since all the n_i are assumed to be independent. The moment matrix A then becomes the unit matrix and its determinant A is equal to 1. In (1), all terms in the exponent vanish except those where j = k, giving

p(t) = \frac{1}{(2\pi)^{n/2}} \exp\left(-\frac{1}{2}\sum_{i=1}^{n} t_i^2\right)

In n-space, the magnitude of a radius vector R from the origin is given by |R|^2 = \sum r_i^2, where r_i is the ith component of R. Let the magnitude of t be t. Then

p(t) = \frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2}t^2}    (3)

Thus, the probability distribution of the noise displacement is independent of the direction of displacement. For this type of distribution, the contours of equiprobable surfaces are given by

\frac{1}{2A}\sum_{j,k} A_{jk}\, t_j t_k = C^2

where A_{jk} is the cofactor of a_{jk} in A. Since A (the determinant of A) = 1, and

A_{jk} = \begin{cases} 1, & j = k \\ 0, & j \neq k \end{cases}

the surfaces of equal probability are given by

\frac{1}{2}\sum_{i=1}^{n} t_i^2 = C^2

which is the parametric equation for a sphere in n dimensions. Hence (3) may be termed a spherical Gaussian distribution. The probability distribution of displacement by noise in any given direction is independent of that direction and is a one-dimensional Gaussian distribution.
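The standardization argument can be sketched numerically: drawing independent Gaussian noise components of variance N and forming t_i = n_i/√N should give sample variances near 1 and covariances near 0. The value of N, the seed, and the sample size below are arbitrary illustrative choices.

```python
import math
import random

random.seed(7)
N = 4.0                  # illustrative noise variance
samples = 100_000
sigma = math.sqrt(N)

# Two independent noise coordinates, standardized: t = n / sqrt(N).
t1 = [random.gauss(0.0, sigma) / sigma for _ in range(samples)]
t2 = [random.gauss(0.0, sigma) / sigma for _ in range(samples)]

a_ii = sum(t * t for t in t1) / samples              # sample variance of t_i
a_ij = sum(u * v for u, v in zip(t1, t2)) / samples  # sample covariance of t_i, t_j
```

The moment matrix of t is thus approximately the unit matrix, as the derivation asserts.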
D. The Decoding System
The encoding and communication process has been characterized geometrically as the selection of points on the surface of an n-dimensional sphere, which, in the transmission over the channel, are displaced from their original location by noise that has a spherical Gaussian distribution. A decoding system for such a model is a partitioning of the n-space into M subsets corresponding to the transmitted messages. This is a method of deciding, at the receiver, which message was transmitted. If the received message is in the subset corresponding to the ith transmitted code word, then it is presumed that it was the ith code word which was sent.

For any code, the probability of error is defined as

P_e = \sum_{i=1}^{M} p_i q_i

where p_i is the probability that the ith message will be transmitted, and q_i is the probability that if code word i is sent, it will be decoded incorrectly, i.e., as a code word other than i. For the purposes of this investigation, it is assumed that all code words are equally likely to be transmitted, so that for a code of M words

P_e = \frac{1}{M}\sum_{i=1}^{M} q_i
An optimal decoding system for a code is one which minimizes the probability of error. The Gaussian density function is monotone decreasing with distance: the greater the displacement of a point from its original position, the less probable is that displacement. With this noise distribution, an optimal decoding system is one which decodes any received signal as the code word corresponding to the geometrically closest code word location. This type of decoding is called minimum distance or maximum likelihood decoding. This decoding system is assumed to be used throughout the investigation reported in this paper.
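Minimum distance decoding amounts to a nearest-neighbor search over the M code word locations. A minimal sketch follows; the code points and the received vector are made-up examples, not data from the text.

```python
def decode(received, code_points):
    """Return the index of the geometrically closest code word."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return min(range(len(code_points)),
               key=lambda i: sq_dist(received, code_points[i]))

B = 1.0
code_points = [(B, B, B), (B, -B, -B), (-B, B, -B), (-B, -B, B)]
# A vector displaced by a small amount of noise from the second code word.
received = (0.9, -1.2, -0.8)
choice = decode(received, code_points)
```

With the spherical Gaussian noise described above, choosing the closest code word is exactly the maximum likelihood decision.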
One additional comment pertinent to decoding is offered. As noted, the probability of error will depend on the geometrical distance between code words. It would be possible, for a fixed value of noise variance N, to decrease the probability of error by placing the code words far enough apart in n-space. For a fixed N and code length n, this would correspond to increasing the signal level B. Loosely speaking, this is equivalent to increasing the signal-to-noise ratio, which one expects to result in more reliable communications. Thus, B will necessarily remain as a parameter in error calculations.
E. Euclidean Distance as Related to Hamming Distance
A fundamental parameter of discrete codes is Hamming distance, usually denoted by d. Hamming distance is defined to be the number of digits in which two code words differ. For the purpose of this investigation, a relation between Hamming and Euclidean distance is required.

Consider two points in Euclidean n-space, w_1 = (s_1, s_2, ..., s_n) and w_2 = (t_1, t_2, ..., t_n). The distance D between these two points is given by

D^2 = \sum_{i=1}^{n} (s_i - t_i)^2

If these two points are binary points as defined in Part B above, then (s_i - t_i)^2 can have only the values 0 or 4B^2, depending on whether s_i = t_i or s_i \neq t_i. Now, w_1 and w_2 may also be considered to be code words. If the Hamming distance between these two words is d, then s_i \neq t_i in d places. Hence

D^2 = 4B^2 d

which is the necessary relation between Hamming and Euclidean distance.
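The relation D² = 4B²d is easy to verify directly; in the sketch below the two code words and the value of B are arbitrary examples.

```python
def hamming(w1, w2):
    # Number of digit positions in which the two words differ.
    return sum(a != b for a, b in zip(w1, w2))

B = 2.0
w1, w2 = (0, 1, 1, 0, 1), (1, 1, 0, 0, 1)
p1 = [B if b else -B for b in w1]   # corresponding binary points
p2 = [B if b else -B for b in w2]

d = hamming(w1, w2)
D_squared = sum((a - b) ** 2 for a, b in zip(p1, p2))
```

Each differing coordinate contributes (2B)² = 4B² to D², and matching coordinates contribute nothing, which is the whole content of the relation.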
SECTION III
BOUNDS ON RELIABILITY
A. The Concept of a Bound
The evaluation of a particular coding technique for a communication channel is ideally done by calculating the probability of error, P_e, for that technique. The P_e is more properly written P_e(N, B, n, R) to show that error probability is a function of noise power N, signal level B, code word length n, and information rate R. Determination of an exact P_e may be impossible, or mathematically quite complex. Consequently, it is necessary to resort to bounds on P_e rather than an exact result. The bounds usually derived on P_e are functions g which permit the inequality

g_1 \le P_e \le g_2

to be written. The functions g_1 and g_2 can all be placed in the form

e^{-nE(R) + o(n)}    (1)

where E(R) is a function of B and N. Here o(n) is a term of order less than n.

This investigation is concerned with the use of group codes as input signal sequences. For this class of codes, there are a finite number of codes. Hence, there is a best code, in the sense that some code has a P_e which is no larger than that of any other code. It is the P_e for this best code in the class of group codes that will be bounded.

To further simplify the mathematical operations, this report will be concerned primarily with determination of bounds on E(R), which is called reliability. If the bounds developed on P_e are placed in the form of (1), then E(R), or simply E, is defined as

E = \lim_{n \to \infty} -\frac{1}{n} \log_e P_e

E is then independent of code word length n, which permits a simplified presentation of results. E is a measure of how fast the probability of error goes to zero. In this connection, it should be noted that the exponent in the defining equation (1) for E is negative. Hence, a lower bound on E will correspond to an upper bound on P_e.

Knowledge of E and n will not permit the close determination of P_e from (1), since the term o(n) could be a large multiplier. However, given E and the P_e which is desired, the necessary value of n can be determined fairly sharply when n is large. In fact, n will be asymptotic to -\frac{1}{E}\log_e P_e. In applications of coding theory, this is normally the natural problem, i.e., how long must the code be to achieve a given level of P_e.
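The asymptotic statement can be turned around to estimate the code length needed for a target error probability, n ≈ -(1/E) ln P_e. A sketch with purely illustrative numbers (the o(n) term is ignored):

```python
import math

def required_length(E, Pe):
    # n asymptotic to -(1/E) * ln(Pe); the o(n) correction is dropped.
    return -math.log(Pe) / E

# Illustrative values: reliability E = 0.1, target Pe = 1e-6.
n_est = required_length(0.1, 1e-6)
```

For these numbers the estimate is roughly 138 digits, showing how a reliability figure translates into a required block length.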
B. First Upper Bound on E(R)
Plotkin (11) has obtained a bound on minimum Hamming distance which is applicable to binary codes in general and to binary linear codes, which are a subclass of binary codes. This bound on distance may be used as the origin of a bound on P_e in binary coding for continuous channels. For this purpose, the most convenient form of the bound is given by Peterson (12) as follows:

"Consider an n-symbol linear code with symbols taken from the field of q elements. Let k be the number of information symbols and (n-k) the number of check symbols. If

n \ge \frac{qd - 1}{q - 1}

then

k \le n - \frac{q(d-1)}{q-1} + \log_q d

where d is the minimum Hamming distance between code words."

Since this investigation is concerned with binary coding, q = 2. Then, for n \ge 2d - 1, the bound may be written as

k \le n - 2d + 2 + \log_2 d

or

n - k \ge 2d - \log_2 d - 2.    (1)
Future calculations will be simplified if the bound on minimum distance as given by (1) is changed so that d does not appear as the argument of a logarithm. Note that

d \le 2^{d-1}

for any positive integer value of d. Hence

\log_2 d \le d - 1.

Then, from (1),

n - k \ge 2d - (d - 1) - 2 = d - 1

so that

d \le n - k + 1.

This may be substituted in (1) to yield

n - k \ge 2d - \log_2 (n - k + 1) - 2

or

d \le \frac{1}{2}\left[n - k + \log_2 (n - k + 1) + 2\right]    (2)
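Bound (2) is simple to evaluate. For instance, for a (15, 5) binary code it gives d ≤ (10 + log₂ 11 + 2)/2 ≈ 7.73, which is consistent with the known (15, 5) BCH code of minimum distance 7 (that code is offered here only as an independent check, not as an example from the text):

```python
import math

def d_bound(n, k):
    # Bound (2): d <= (1/2) * [ n - k + log2(n - k + 1) + 2 ].
    return 0.5 * (n - k + math.log2(n - k + 1) + 2)

bound_15_5 = d_bound(15, 5)
```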
Equation (2) bounds the Hamming distance of a group code in terms of code length n and the number of information digits k. A more desirable form of (2) (for the purpose here) is one in which transmission rate R or the number of code words M appears explicitly. Rate, as used in the coding literature, is variously defined. A common definition is that rate R is the ratio of the average number of information symbols per code word to the average number of total symbols per word (word length). Specifically, for an (n,k) group code, rate is equal to k/n. Since there are M = 2^k code words in a group code, rate can also be written as \frac{1}{n}\log_2 M. Implicit in these definitions, however, is the idea that rate as used in coding theory is generally concerned with what a source (or code) is capable of transmitting, rather than what it is actually transmitting. For general use, a definition of rate should incorporate this maximal concept. This may be done by using the concept of self information.
Assume that there exists an ensemble of events X (such as the ensemble of M code words), each of which occurs with a probability p(x_i), where x_i is the ith event. The self information of x_i is defined as

I(x_i) = -\log p(x_i).

The units of I are determined by the base of logarithms chosen. Three commonly used units are bits for logarithms to the base 2, nats for natural or naperian logarithms, and hartleys when the base 10 is used. Self information may be interpreted as the maximum amount of information that can possibly be provided about the event x_i.

The desired definition of R is one which will specify conditions under which the maximum average amount of information per code word symbol is transmitted. The total self information of the source is simply the sum of the self information of each event. The rate R can then be defined as the maximum of

\frac{1}{n}\sum_{i=1}^{M} p(x_i)\, I(x_i).

If the definition of I is used, this can be written as

-\frac{1}{n}\sum_{i=1}^{M} p(x_i) \log p(x_i)    (3)

which is recognized as the entropy function of information theory, modified by the factor 1/n. It is known that the entropy function achieves a maximum when all the p(x_i)'s are equal (see, for example, Reza (13)). Then p(x_i) = 1/M for all i, and by hypothesis

\sum_{i=1}^{M} p(x_i) = 1.

Under this constraint, the maximum of (3) is

-\frac{1}{n}\sum_{i=1}^{M} \frac{1}{M}\log\frac{1}{M} = \frac{1}{n}\log M.    (4)

Thus, if rate R is defined as

R = \frac{1}{n}\log M,    (5)

the desired maximization of average transmitted information is achieved.
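The maximization behind definition (5) can be checked numerically: expression (3) attains (1/n) log M only for equal probabilities. The values of M and n and the skewed distribution below are arbitrary examples.

```python
import math

def expression_3(probs, n):
    # -(1/n) * sum p(x_i) * log p(x_i), the scaled entropy, in nats.
    return -sum(p * math.log(p) for p in probs if p > 0) / n

M, n = 16, 8
uniform = [1.0 / M] * M
skewed = [0.5] + [0.5 / (M - 1)] * (M - 1)

R = math.log(M) / n                    # definition (5), in nats
uniform_rate = expression_3(uniform, n)
skewed_rate = expression_3(skewed, n)
```

The uniform ensemble achieves R exactly, while any unequal assignment of probabilities falls short of it.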
With the definition (5), the bound on minimum distance (2) may be expressed as an explicit function of R, or of M, in the following manner:

d \le \frac{n}{2}\left[1 - \frac{k}{n} + \frac{1}{n}\log_2 (n - k + 1) + \frac{2}{n}\right]    (6)

From (5),

R = \frac{1}{n}\ln M \text{ nats},

where ln indicates natural logarithms. For the group code, M = 2^k and

R = \frac{1}{n}\ln 2^k = \frac{k}{n}\ln 2,

yielding

\frac{k}{n} = \frac{R}{\ln 2} = \frac{\ln M}{n \ln 2}    (7)

The relation (7) is then used in (6) to give

d \le \frac{n}{2 \ln 2}\left\{\ln 2 - \frac{1}{n}\ln M + \frac{\ln 2}{n}\left[\log_2 (n - k + 1) + 2\right]\right\}    (8)

as the desired bound on minimum distance.

In order to use (8) in a geometrical argument, Hamming distance d is converted to Euclidean distance D by the relation d = D^2/4B^2 developed in Section II. The final form of the bound on minimum distance is then

D \le \sqrt{\frac{2B^2 n}{\ln 2}\left\{\ln 2 - \frac{1}{n}\ln M + \frac{\ln 2}{n}\left[\log_2 (n - k + 1) + 2\right]\right\}}    (9)
The use of relation (9) requires that proper interpretation
be made of it. The bound does not guarantee that the minimum
distance between the code words in a group code can ever equal the
right side of (9). Rather, it only guarantees that the minimum
distance can never exceed the right side of (9). Specifically, (9)
says that in every binary code, at least one pair of code words is
no farther apart than the distance given. This fact is used by
Shannon in the determination of an upper bound on E for the unre
stricted case.
Assume a group code with the maximum minimum distance. There are two code words no farther apart than D as given by the right side of (9). Call these two words w_1 and w_2. If one of these words, say w_1, is transmitted, and maximum likelihood decoding is used, then the contribution to the probability of error of the code if w_1 is incorrectly decoded as w_2 is at least equal to the probability that w_1 is carried at least a distance D/2 towards w_2. This contribution can be expressed as

\frac{1}{M}\, P(w_1 \text{ moves at least } D/2 \text{ in a specified direction})

where 1/M is the probability that w_1 will be transmitted. As noted in Part C of Section II, the density function of displacement by noise in any given direction is one-dimensional Gaussian with zero mean and variance N. The contribution to the probability of error is then given by

\frac{1}{M}\,\phi\!\left(\sqrt{\frac{B^2 n}{2N \ln 2}\left\{\ln 2 - \frac{1}{n}\ln M + \frac{\ln 2}{n}\left[\log_2 (n - k + 1) + 2\right]\right\}}\right)
where

\phi(x) = 1 - \Phi(x)

and

\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\, dt.
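For computation, φ(x) = 1 - Φ(x) is conveniently expressed through the complementary error function, since 1 - Φ(x) = erfc(x/√2)/2; a minimal sketch:

```python
import math

def phi(x):
    # Upper-tail probability of a standard Gaussian: phi(x) = 1 - Phi(x).
    return 0.5 * math.erfc(x / math.sqrt(2.0))

half = phi(0.0)   # by symmetry, exactly 1/2
tail = phi(3.0)   # a small upper-tail probability (about 0.00135)
```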
Since the object here is a lower bound on the probability of error P_e, the contribution to error if w_1 is decoded as any other word except w_2 is neglected. This will result in an optimistic value for P_e, but this is the proper direction for reasoning to a lower bound.
Assume now that this first word w_1 is deleted from the code.
Since there are now (M-1) code words, the maximum minimum distance
cannot be greater than that which would be possible in a code containing
(M-1) code words, i.e.,

    D^2 \le \frac{2B^2 n}{\ln 2}\left[\ln 2 - \frac{1}{n}\ln(M-1) + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right].

Then, for the pair with this maximum minimum distance, the argument
as used above will yield a contribution to the probability of
error of at least

    \frac{1}{M}\,\Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln(M-1) + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right).
This process can be continued for (M-1) times, yielding a similar
contribution each time a code word is deleted. The last two code
words will yield a contribution of

    \frac{1}{M}\,\Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln 2 + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right).

The probability of error is thus bounded as

    P_e \ge \frac{1}{M}\left\{\Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln M + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right)\right.
          + \Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln(M-1) + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right) + \cdots          (10)
          \left. + \Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln 2 + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right)\right\}.
This expression may be simplified somewhat by weakening the bound.
Note that the terms on the right side of the inequality are decreasing
in value, because each argument is greater than the preceding
one and, for x \ge y, \Phi(x) \le \Phi(y). Discard the last (M/2)-1 terms
and replace each of the first M/2 terms with the last remaining term. This
last remaining term is

    \frac{1}{M}\,\Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right).

Then

    P_e \ge \frac{1}{2}\,\Phi\!\left(\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right).          (11)
Feller (14) shows that \Phi(x) is asymptotic to

    \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2}.          (12)
Since the primary concern here is for reliability, which implies
very large n, this asymptotic result can be used. If (12) is
applied to (11), there results

    P_e \ge \frac{1}{2\sqrt{2\pi}}\left(\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]\right)^{-1/2}
            \times \exp\!\left(-\frac{B^2 n}{4N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]\right).          (13)

To obtain reliability, the definition

    E = \lim_{n\to\infty} -\frac{1}{n}\ln P_e

is applied to (13):

    -\frac{1}{n}\ln P_e \le \frac{B^2}{4N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]
            + \frac{1}{n}\ln\left(2\sqrt{2\pi}\sqrt{\frac{B^2 n}{2N\ln 2}\left[\ln 2 - \frac{1}{n}\ln\frac{M}{2} + \frac{\ln 2}{n}\bigl(\log_2(n-k+1)+2\bigr)\right]}\right).

The quantity (1/n) ln M is, in the limit, equal to rate R. In the limit, those terms which have as a factor
1/n will vanish to give

    E \le \frac{B^2}{4N}\left(1 - \frac{R}{\ln 2}\right).          (14)

This is the first upper bound on E.
It is noted again that the Plotkin bound (1) is applicable
to binary codes in general, and thus to binary linear codes, which
are a subclass of binary codes. Hence, the bound on reliability
given by (14) applies to both and so may be considered to be a
somewhat stronger result than a bound for binary linear codes
only.
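The bound (14) is easy to evaluate numerically. The sketch below is illustrative only; it takes the rate in natural units, and the function name is an assumption of this note rather than anything from the text.

```python
import math

def first_upper_bound(R, B=1.0, N=1.0):
    """First upper bound on reliability, E <= (B^2/4N)(1 - R/ln 2).

    R is the rate in natural units; the bound applies for 0 <= R <= ln 2.
    """
    if not 0.0 <= R <= math.log(2):
        raise ValueError("rate must lie between 0 and ln 2")
    return (B * B / (4.0 * N)) * (1.0 - R / math.log(2))

# The bound falls linearly from B^2/4N at R = 0 to zero at R = ln 2.
print(first_upper_bound(0.0))          # 0.25 for B = N = 1
print(first_upper_bound(math.log(2)))  # 0.0
```

The linear fall-off with R is the signature of this bound; the second upper bound derived later replaces the straight line with a curved one.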
C. First Lower Bound on E(R)
The first lower bound on E(R) will be derived from an upper
bound on P_e. The distance structure of group codes is again needed,
but for an upper bound on P_e a lower bound on minimum distance will
be required. Because of the monotone decreasing nature of the Gaussian
density function, a lower bound on minimum distance in a group
code will determine the maximum contribution to P_e from any two code
words. The sum of the contributions to P_e from all code word pairs
will then yield an upper limit on P_e, provided that the summation is
done in the proper direction for an upper bound. This, in brief,
is the outline of the method to be used for the derivation.
The starting point for the derivation of the upper bound
on P_e is a bound on minimum distance for binary codes first found
by Gilbert (15) and refined by Varshamov. A convenient statement
of the bound for group codes is given by Peterson (16) as follows:
"If

    2^{\,n-k} \ge 2^{\,(n-1)\,H_2\left(\frac{d-2}{n-1}\right)},          (1)

where n = total number of digits per code word, k = number of
information digits per code word, and H_2(x) = -x \log_2 x - (1-x)\log_2(1-x),
then there exists a binary (n,k) group
code with minimum Hamming distance d."
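For particular values of (n, k, d), the statement (1) can be tested directly. The sketch below implements the entropy form of the condition exactly as quoted; this form is sufficient but conservative (the underlying binomial-sum form of the Varshamov-Gilbert condition admits some codes, such as the (7,4) Hamming code with d = 3, that the entropy form does not certify). The function names are illustrative.

```python
import math

def H2(x):
    """Binary entropy function in bits; H2(0) = H2(1) = 0."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def gv_code_exists(n, k, d):
    """Sufficient condition of (1): if 2^(n-k) >= 2^((n-1) H2((d-2)/(n-1))),
    a binary (n, k) group code with minimum Hamming distance d exists.
    Comparing exponents rather than powers of two avoids overflow."""
    return n - k >= (n - 1) * H2((d - 2) / (n - 1))

# A length-127 group code with 106 information digits and d = 5 is
# certified to exist; the condition cannot certify k = 110 at d = 5.
print(gv_code_exists(127, 106, 5), gv_code_exists(127, 110, 5))  # True False
```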
The entropy function H(x) is symmetrical in x and (1-x), i.e.,
H(x) = H(1-x). Then

    2^{\,(n-1)\,H_2\left(\frac{d-2}{n-1}\right)} = 2^{\,(n-1)\,H_2\left(\frac{n-d+1}{n-1}\right)}

and (1) can be written

    2^{\,n-k} \ge 2^{\,(n-1)\,H_2\left(\frac{n-d+1}{n-1}\right)}.          (2)

The inequality (2) is in the wrong direction for the desired
lower bound on minimum distance. Let k_m be the maximum value of k
for which (1) holds. Then there exists a code with k_m information
symbols and distance d. Since k_m is chosen to be the maximum, (1)
does not hold for k = (k_m + 1), so

    2^{\,n-(k_m+1)} < 2^{\,(n-1)\,H_2\left(\frac{d-2}{n-1}\right)}.          (3)

The inequality (3) is now in the desired direction. It can be
further manipulated to arrive at a more suitable form as
follows:

    2^{\,n-k_m} < 2^{\,(n-1)\,H_2\left(\frac{d-2}{n-1}\right)+1}

    2^{\,k_m} > 2^{\,n-1-(n-1)\,H_2\left(\frac{d-2}{n-1}\right)}

or

    2^{\,k} \ge 2^{\,n\left[1 - H_2\left(\frac{d-2}{n-1}\right)\right]-1},          (4)
where the subscript on k has been dropped. The theorem statement
of (1) guarantees the existence of an (n,k) group code such that (4)
is valid. For group codes, the number of code words M = 2^k. Hence

    M_d \ge 2^{\,n\left[1 - H_2\left(\frac{d-2}{n-1}\right)\right]-1},          (5)

where M_d denotes the number of words in a code with minimum distance d. Equation
(5) is interpreted as a lower bound on the number of words in
an (n,k) group code with minimum distance d. This bound may also
be expressed as a lower bound on rate, since

    R = \frac{1}{n}\ln M,

giving

    e^{nR} = M,

which, when placed in (5), yields

    e^{nR} \ge 2^{\,n\left[1 - H_2\left(\frac{d-2}{n-1}\right)\right]-1},          (6)

from which is obtained the inequality

    R \ge \left[1 - H_2\left(\frac{d-2}{n-1}\right) - \frac{1}{n}\right]\ln 2.          (7)

The quantity (d-2)/(n-1) will approach d/n for large n and d,
and the factor 1/n will become negligible, so that (7) may be
written

    R \ge \left[1 - H_2\left(\frac{d}{n}\right)\right]\ln 2,

or

    R \ge \ln 2 - H_e\left(\frac{d}{n}\right),          (8)

where H_e(x) indicates that the entropy function is computed with logarithms taken to the
base e. The relation (8) may be expressed in terms of Euclidean
distance as

    R \ge \ln 2 - H_e\left(\frac{D^2}{4B^2 n}\right).          (9)
The bound (5) establishes that, in an (n,k) group code,
there are at least M_d words which are separated by distance d,
where M_d is given by the right side of (5). An upper bound on P_e
for such a code can be obtained by adding up the probability that
each code word will be decoded as each of the remaining code words.
Consider that word w_1 is transmitted. Since maximum likelihood
decoding is used, the probability of error if w_1 is decoded as,
say, w_2, is the product of the probability that w_1 is transmitted
(1/M_d, since all words are equally likely) and the probability that
w_1 moves at least half way towards w_2. Since it is known that w_2 is
at least distance D from w_1, this latter probability is no
more than \Phi(D/2\sqrt{N}). Note that it is here that the lower bound
on distance is used. The word w_2 could be farther away than D,
but in that event the probability of w_1 moving halfway towards w_2
would be less. Hence, the most pessimistic case is taken. The
contribution to P_e by w_1 being decoded as w_2 can then be written as

    \frac{1}{M_d}\,\Phi\!\left(\frac{D}{2\sqrt{N}}\right).

There are at least M_d code words, each of which could be paired
(as an ordered pair) with (M_d - 1) other code words. Hence

    P_e \le M_d (M_d - 1)\,\frac{1}{M_d}\,\Phi\!\left(\frac{D}{2\sqrt{N}}\right).

The inequality is preserved and simplified if this is written as

    P_e \le M_d\,\Phi\!\left(\frac{D}{2\sqrt{N}}\right).          (10)

Since

    M_d = e^{nR},

(10) may then be written

    P_e \le e^{nR}\,\Phi\!\left(\frac{D}{2\sqrt{N}}\right).          (11)
As noted earlier, \Phi(x) is overbounded by

    \frac{1}{x\sqrt{2\pi}}\, e^{-x^2/2}.

This may be applied to (11), giving

    P_e \le e^{nR}\,\frac{2\sqrt{N}}{D\sqrt{2\pi}}\, e^{-D^2/8N},

which simplifies to

    P_e \le \frac{1}{D}\sqrt{\frac{2N}{\pi}}\; e^{\,nR - D^2/8N}.          (12)
It is apparent from (12) that the tightest bound will occur when
R as given by the right side of (9) takes on its minimum value.
The inequality (12) is an upper bound on P_e with D as a
parameter. It will be more convenient to express D as a function
of n by use of the relation

    D = \lambda K(n),          (13)

where K(n) is the maximum value which D can assume. This maximum
value can be determined from the Hamming-Euclidean distance relation,

    D = 2B\sqrt{d}.

The maximum Euclidean distance for an n-digit code length is thus
constrained to be 2B\sqrt{n}. However, in Section IV of this paper, a
comparison between these results and those of the unrestricted theory
will be made. In developing a similar bound for the general case,
Shannon found it expedient to restrict the maximum code word separation
to \sqrt{2B^2 n}, which is less than the maximum by a factor of \sqrt{2}.
This restriction in effect is a lower bound on rate and permitted a
simpler bounding technique. In order to make a meaningful comparison,
the maximum code word separation for the binary case will be assumed
to be the same. Then K(n) = \sqrt{2B^2 n}, and

    D = \lambda\sqrt{2B^2 n}.          (14)

The equation (12) can be written, using (14), as

    P_e \le \frac{1}{\lambda B}\sqrt{\frac{N}{\pi n}}\; e^{\,n\left(R - \lambda^2 B^2/4N\right)}.          (15)
Reliability, which is the asymptotic (as n → ∞) value of the negative
of the exponent in (15), divided by n, is then bounded as

    E \ge \frac{\lambda^2 B^2}{4N} - R.          (16)

This is the first lower bound on reliability.
If the relation (14) is used in (9), then R is bounded as

    R \ge \ln 2 - H_e\left(\frac{\lambda^2}{2}\right).          (17)

The equations (16) and (17) can be used to determine E(R) through
\lambda as \lambda varies from 0 to 1.
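The parametric recipe of (16) and (17) can be sketched directly; the choice B = N = 1 (so A = 1) is an assumption for illustration, and the bound is informative only where the right-hand side of (16) is positive.

```python
import math

def He(x):
    """Binary entropy function in natural units (nats)."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def first_lower_bound_point(lam, B=1.0, N=1.0):
    """One point of the first lower bound, parametrized by
    lam = D / sqrt(2 B^2 n) in (0, 1]:
        R = ln 2 - He(lam^2 / 2),    E >= lam^2 B^2 / (4N) - R."""
    R = math.log(2) - He(lam * lam / 2.0)
    E = lam * lam * B * B / (4.0 * N) - R
    return R, E

# Sweep lam toward 1; at lam = 1 the rate R = ln 2 - He(1/2) = 0
# and the guaranteed exponent is B^2/4N.
for lam in (0.25, 0.5, 0.75, 1.0):
    R, E = first_lower_bound_point(lam)
    print(f"lam={lam:.2f}  R={R:.4f}  E={E:.4f}")
```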
It should be pointed out here that the specialization of a
lower bound on reliability to group codes is desirable. Group
codes are a subclass of the class of all binary codes. Thus, if
a group code exists which will give this reliability, certainly a
binary code (the same code) exists which will do as well.
The bound obtained by the argument which was used in this
section is sometimes confusing because of the simultaneous bounding
of M_d and P_e. An overall interpretation may be of help in
understanding what has happened. The basic relation for the bound
on P_e is
    P_e \le M_d\,\Phi\!\left(\frac{D}{2\sqrt{N}}\right).

This says that the true value of P_e is less than the quantity on
the right, and that the quantity on the right increases directly with
M_d. Call this quantity on the right of (10) P_g. Further, M_d is
bounded from below by (5) as

    M_d \ge 2^{\,n\left[1 - H_2\left(\frac{d-2}{n-1}\right)\right]-1}.

Call the quantity on the right M_g, and note that M_g is a function
of d. Figure 1 shows P_g plotted as a function of M_g. The point
labeled d_o on the curve in Figure 1 shows a particular bound value
for P_g when a value d_o is used to calculate M_g. Now, P_e \le P_g
says that the true value of P_e must lie below the point d_o, and
M_d \ge M_g says that the true value of M_d is to the right. Since
the curve slopes up to the right, the true value definitely lies
below the curve. Point A is a lower bound on M_d for a fixed value
of P_e, and point B is an upper bound on P_e for a fixed value
of M_d.
D. Second Upper Bound on E(R)
In the theory of discrete codes for binary symmetric
channels, an optimum lower bound on the probability of error has
been clearly established. This bound is known variously as the
Sphere-Packing Bound or Hamming Bound. The bound is due originally
to Hamming (17) and has been restated in more modern and convenient
form by Peterson (18) and Fano (19). Briefly, the bound states that
the probability of error for the best possible (n,k) code can be no
smaller than that for a hypothetical code called a quasiperfect
merror correcting code. A quasiperfect code is one which corrects
all combinations of m or fewer errors, some of (m + 1) errors and no
combinations of greater than (m + 1) errors. The term hypothetical
is used because such a quasiperfect code may not exist for all combi
nations of n and k.
Peterson (18) has stated the Hamming Bound as an upper bound
on minimum distance for an (n,k) group code. It should be noted that
the bound actually applies to all binary codes. This bound is given by
    1 - \frac{k}{n} \ge H_2\left(\frac{m}{n}\right),          (1)

where n is code word length, k is the number of information digits
in the code word, m specifies the number of errors which the code
guarantees to correct, and H_2(x) is the binary entropy function.
A change of base in logarithms, plus the application of the general
definition for rate, yields

    \ln 2 - \frac{1}{n}\ln M \ge H_e\left(\frac{m}{n}\right),          (2)

where M is the number of words in the code.
If a code is an m-error correcting code, then the minimum
distance is at least given by

    d = 2m + 1,

which gives

    m = \frac{d-1}{2},

or, for large values of d,

    m = \frac{d}{2}.          (3)

If the Euclidean-Hamming distance conversion is applied to (3),
the bound (2) may be written

    \ln 2 - \frac{1}{n}\ln M \ge H_e\left(\frac{D^2}{8B^2 n}\right).          (4)
The argument to be used for obtaining a lower bound on
error probability is a geometrical argument similar to that used in
Part B of this section. However, the bound on distance (4) is
transcendental, and no explicit solution for D in closed form is
attainable. Rather than attempt to solve (4) by approximation or
by numerical techniques, an "inverse function" approach will be used.
From (3), the value of the argument of H_e is d/2n, approximately.
Since d is Hamming distance, the maximum value of d is
n, i.e., two code words can differ at most in every position. Thus,
the maximum value of the argument of H_e is 0.5. H_e(x) is monotone
increasing over the range 0 \le x \le 0.5. Hence, it is single-valued
over this range, and an inverse function H_e^{-1}(x) can be
defined. H_e^{-1}(x) is the number y for which H_e(y) = x. Using this
inverse notation, the bound (4) may be expressed as

    \frac{D^2}{8B^2 n} \le H_e^{-1}\left(\ln 2 - \frac{1}{n}\ln M\right)          (5)

or

    D \le \sqrt{8B^2 n\, H_e^{-1}\left(\ln 2 - \frac{1}{n}\ln M\right)}.          (6)

Now, H_e^{-1} is monotone increasing because H_e is. Hence, as the value
of M in (6) decreases, the bound value of D will increase. Therefore,
the same technique as was used in Part B of this section, i.e.,
the determination of contributions to the probability of error by
successive deletion of code words at maximum minimum distance and
subsequent modification of the distance expression, can be used
now on (6) to obtain a lower bound on P_e. These calculations are
straightforward and yield as a bound

    P_e \ge \frac{1}{2}\,\Phi\!\left(\sqrt{\frac{2B^2 n}{N}\, H_e^{-1}\left(\ln 2 - R + \frac{1}{n}\ln 2\right)}\right).          (7)
When the asymptotic approximation of \Phi(x) is applied, there
results

    P_e \ge \frac{\exp\!\left[-\dfrac{B^2 n}{N}\, H_e^{-1}\left(\ln 2 - R + \dfrac{1}{n}\ln 2\right)\right]}{2\sqrt{2\pi}\,\sqrt{\dfrac{2B^2 n}{N}\, H_e^{-1}\left(\ln 2 - R + \dfrac{1}{n}\ln 2\right)}}.          (8)

The established definition for reliability will yield

    E \le \frac{B^2}{N}\, H_e^{-1}(\ln 2 - R).          (9)

The relation (9) is the second upper bound on E(R). As was
true in the case of the first upper bound, (9) is actually
applicable to all binary codes.
E. Second Lower Bound on E(R)
Kennedy and Wozencraft (20) have established a random coding
bound for discrete, memoryless channels. This bound is an outgrowth
of work by Fano and Gallager. It states that it is possible to
code and decode data for such a channel with a probability of error
bounded by

    P_e \le 2^{-n(E_o - R)},          (1)

where n is the code word length, R is the rate in bits per symbol,
and E_o is the reliability obtained when only two code words are in
the message ensemble for the channel, i.e., M = 2. A brief outline
of the argument leading to this bound will be given, since the method
of Kennedy and Wozencraft was adapted for the derivation of a similar
bound for the time-discrete continuous channel.
Suppose that, from the set of n-tuples over the field of two
elements, two code words are chosen at random. Define the probability
of error if either of these words is transmitted as

    P_{e_o} = 2^{-nE_o}.

Now, choose M words at random. The probability of error if the
first (or any one) of these is transmitted is

    P_e = P(the received message is closer to word 2 than
            word 1, or the received message is closer to
            word 3 than word 1, or ...).

The probability of a union of events is overbounded by the sum of
the probabilities of the component events. Hence

    P_e \le P(received message is closer to word 2 than word 1)
          + P(received message is closer to word 3 than word 1) + \cdots.

But each component probability is the probability that an error
will occur in the transmission of two randomly chosen code words,
since all code words were chosen at random. Thus

    P_e \le (M-1)\,2^{-nE_o} \le M\,2^{-nE_o}.          (2)

Rate R in bits is given by

    R = \frac{1}{n}\log_2 M.

Then

    M = 2^{nR},

and (2) can be written

    P_e \le 2^{-n(E_o - R)},

which is the Kennedy-Wozencraft bound. The customary definition
of reliability gives

    E \ge E_o - R.          (3)

Note that defining the probability of error of transmitting one of
two randomly chosen code words as

    P_{e_o} = e^{-nE_o}

will result in the same expression, (3), for reliability, where R
will then be in natural units.
A method will now be given for obtaining a bound on E_o which
will result in a random coding bound for the time-discrete channel
with \pm B as input signal levels. This approach removes the restriction
that the input signal sequences form a binary linear code. The
bound will apply to the entire class of binary codes, i.e., codes
selected from the complete vector space of n-tuples over the field
of two elements. This method requires a simplified bound on \Phi(x)
and an evaluation, for the time-discrete channel, of the probability
of error for two randomly chosen code words.
It can be shown that

    \Phi(x) \le \frac{1}{2}\, e^{-x^2/2}, \qquad 0 \le x < \infty,          (4)

in the following manner:
Let

    D(x) = \Phi(x) - \frac{1}{2}\, e^{-x^2/2};

then the inequality (4) is valid if D(x) \le 0 over the indicated
range. By direct substitution, D(x) = 0 when x = 0, and D(x) \to 0 as
x \to \infty. Now,

    \frac{dD}{dx} = e^{-x^2/2}\left(\frac{x}{2} - \frac{1}{\sqrt{2\pi}}\right),          (5)

which has only one root, at x = 2/\sqrt{2\pi}. Also, dD/dx exists over
the entire range. Thus, D(x) has only one relative extremum and no
zero crossings between 0 and \infty. A test of any sample value shows D(x)
is negative in the indicated range; hence, D(x) \le 0 for 0 \le x < \infty.
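The inequality (4) can also be spot-checked numerically; in the sketch below the tail probability Φ(x) is expressed through the complementary error function, and equality holds at x = 0.

```python
import math

def Phi(x):
    """Gaussian tail probability, Phi(x) = P(X > x) for X ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Check Phi(x) <= (1/2) exp(-x^2/2) on a grid over [0, 10],
# with equality at x = 0.
assert Phi(0.0) == 0.5
for i in range(201):
    x = i * 0.05
    assert Phi(x) <= 0.5 * math.exp(-x * x / 2.0) + 1e-15
print("bound holds on the sampled grid")
```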
Consider that two binary code words are chosen at random.
If these two words differ by a distance d, then, still assuming a
binary signal level \pm B, they are separated by Euclidean distance
D = 2B\sqrt{d}. If one of these is transmitted, the probability of
error in decoding is no greater than the probability that one of
them is moved by noise at least D/2, i.e.,

    P_{e_d} = \Phi\!\left(\frac{B\sqrt{d}}{\sqrt{N}}\right),

or

    P_{e_d} \le \frac{1}{2}\, e^{-B^2 d/2N},          (6)

where (4) has been used.
Since each code word is chosen at random, the probability
of choosing any one is 2^{-n}. There are \binom{n}{d} words which will differ
in d places from the first chosen. Thus, the probability that the
two code words differ in d places is

    2^{-n}\binom{n}{d}.

The average probability of error in two randomly chosen code words
is then (6) averaged over all possible d, i.e.,

    P_{e_o} \le \sum_{d=0}^{n} 2^{-n}\binom{n}{d}\,\frac{1}{2}\, e^{-B^2 d/2N}
            = 2^{-n-1}\left(1 + e^{-B^2/2N}\right)^n.          (7)
By definition,

    P_{e_o} = e^{-nE_o}.

When this is used in (7), there results

    e^{-nE_o} \le 2^{-n-1}\left(1 + e^{-B^2/2N}\right)^n,

so that

    E_o \ge -\ln\left(\frac{1 + e^{-B^2/2N}}{2}\right).          (8)

The inequality (8) may be applied in (3) to give the second
lower bound on reliability as

    E \ge -\ln\left(\frac{1 + e^{-B^2/2N}}{2}\right) - R.
SECTION IV
COMPARISON WITH THE UNRESTRICTED THEORY
A. Preliminary Remarks
Prior to a comparison of the results of this investigation
and the unrestricted theory of Shannon (4), a summary and explanation
of the unrestricted bounds is in order. It should be remembered
that in the unrestricted case the channel input sequence is
considered to be n real numbers, subject only to the constraint
that the power P in each sequence of n numbers be identical.
Figure 2 depicts the four unrestricted bounds on reliability
plotted as a function of rate R, with the parameter A = (P/N)^{1/2}
= 3. The power P will correspond to B^2 in the binary coding case.
N is, as before, the average noise power. For ease of discussion the
bounds have been labeled E_1 through E_4.
Bound E_1 is the bound on reliability when P_e is obtained
by the sphere-packing method. The quantitative expression for this
bound is complex, but it will be recorded here for interest:

    E_1 \le \frac{A^2}{2} - \frac{A G \cos\theta}{2} - \ln(G\sin\theta),          (1)

where

    G = \frac{1}{2}\left(A\cos\theta + \sqrt{A^2\cos^2\theta + 4}\right)          (2)

and \theta is a function of rate through the expression

    e^{-R} = \sin\theta.          (3)

The bound (1) is valid for values of \theta greater than \theta_o, which is
the value of \theta which corresponds to channel capacity, i.e.,
\theta_o = \sin^{-1}\left(1/\sqrt{1 + A^2}\right). When \theta = \theta_o, then E_1 = 0, which is
the correct value of reliability at channel capacity. E_1 represents
the highest possible reliability attainable with the unrestricted
coding scheme. However, its derivation is based on all possible
codes which could be formed. It provides an upper bound on reliability
for all ranges of R, but at low values of R it is found that
this bound cannot be achieved.
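Shannon's expressions (1)-(3) evaluate directly once θ is taken as the parameter. The sketch below checks the stated property that E_1 vanishes at the capacity angle θ_o; the value A = 3 matches Figure 2, and the function name is illustrative.

```python
import math

def shannon_E1(theta, A=3.0):
    """Sphere-packing exponent for the unrestricted channel:
        E1 = A^2/2 - (A G cos t)/2 - ln(G sin t),
        G  = (A cos t + sqrt(A^2 cos^2 t + 4)) / 2,
    with rate R = -ln(sin t); valid for t >= t0 = arcsin(1/sqrt(1+A^2))."""
    c = math.cos(theta)
    G = (A * c + math.sqrt(A * A * c * c + 4.0)) / 2.0
    E = A * A / 2.0 - A * G * c / 2.0 - math.log(G * math.sin(theta))
    R = -math.log(math.sin(theta))
    return R, E

# At the capacity angle the exponent vanishes, as the text notes.
A = 3.0
theta0 = math.asin(1.0 / math.sqrt(1.0 + A * A))
R0, E0 = shannon_E1(theta0, A)
print(round(R0, 4), abs(E0) < 1e-9)  # R0 is capacity in nats; E0 ≈ 0
```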
The bound E_3 is an upper bound on reliability which is
sharper than E_1 at low transmission rates. E_3 is independent of
rate, and is given by

    E_3 \le \frac{P}{4N} = \frac{A^2}{4}.          (4)
The lower bound E_2 is one which is obtained when P_e is
calculated using a random coding technique. E_2 is actually given by
two different expressions, according to whether R is greater
than or less than a value called the critical rate, R_c. The concept of
critical rate occurs in coding theory whenever general derivations
of upper bounds on P_e are obtained with a random coding argument.
It is critical only in the sense that the nature of the bounds for
P_e and E are different for ranges of R on either side of R_c. When
R \ge R_c, the asymptotic lower and upper bounds on P_e differ only by
a multiplying factor which is a function of rate. Thus, for
R \ge R_c the reliability E, which is an exponent, is the same for
both bounds on P_e. In this range, then, E_2 is exactly the same as
E_1 and is given by (1). For rates less than R_c, E_1 and E_2 diverge.
In this range, E_2 is given by

    E_2 = E_L(\theta_c) + (R_c - R),          (5)

where

    E_L(\theta_c) = \frac{A^2}{2} - \frac{A G_c \cos\theta_c}{2} - \ln(G_c \sin\theta_c)

is the value of E_1 at the critical rate, G_c is given by (2), and
\theta_c is a function of the critical rate through equation (3). The value
of \theta_c is the solution of 2\cos\theta_c = A G_c \sin^2\theta_c. Despite the
complexity of the equations required to determine E_L(\theta_c) and R_c,
the bound E_2 is linear over the range 0 \le R \le R_c.
For low values of R, E_2 is sharpened by E_4. The bound E_4
is given by

    E_4 = \frac{\lambda^2 P}{4N} - R,

where

    R = -\ln\left[\sin\left(2\sin^{-1}\frac{\lambda}{2}\right)\right]

and

    \lambda = \frac{D}{\sqrt{2Pn}}.
If B^2 is substituted for P, E_4 has the same form as the first lower
bound on reliability as developed in Section III, Part C, of this
paper. However, the expression for rate R is considerably different.
The curves in Figure 2 give a typical picture of the bounds
obtained by the unrestricted theory. For rates less than the critical
rate R_c the bounds enclose an area within which the reliability
must lie. For rates greater than R_c the reliability is given by a
single curve. Thus, for rates near zero and rates near to or greater
than R_c, the reliability is determined fairly sharply.
All the bounds on reliability, as typified by Figure 2,
are derived for the best code which the unrestricted technique can
achieve. This fact requires that a slightly different interpretation
be made of upper and lower bounds. Bounds E1 and E3 say, in effect,
that regardless of what type of code is attempted, within the
limits set by the unrestricted coding technique, no higher relia
bility can be achieved. The best that can be done is to meet these
bounds. Hence, it is expected that a restrictive coding technique
such as that used in this investigation may fall short of these
bounds. The lower bounds on reliability, E2 and E4, are the result
of a random coding technique. This technique results in the following
reasoning: assume that each code is formed by selecting at random M
code words for each code out of the total ensemble of possible
message sequences. The average probability of error for all such
possible codes is then calculated. Since, for all such codes, the
average probability of error is then known, the best code must have
a probability of error which is no greater than the average. This is the key
point in random code bounding. It is asserted that there exists
a code which will give a probability of error and hence a relia
bility at least as good as the average. There is no indication of
how this code is to be found. The binary technique is a special
case of the unrestricted technique. In evaluating reliability for
this case, the question being asked is, in effect: is this one of
the codes capable of achieving the random coding bound?
B. A Comparison of Upper Bounds
Figures 3 through 8 show the first and second upper bounds
on reliability plotted as lines A and B, respectively. These bounds
are plotted against rate for various values of the parameter
2 1/2
A = (B2/N) 2, The first upper bound is given by
2
E A (1 R )
4 In 2
The second upper bound is given by
2 1
ES A H1 (In 2 R).
e
The two bounds may be considered to have different ranges of
validity. For RS 0.25 the first upper bound is lower and hence
predominates. For R 20.25 the second upper bound is lower and
is thus the prevailing one.
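The crossover of the two upper bounds can be located numerically. Since both expressions scale with A², the crossover rate is independent of A; the bisection below places it near R ≈ 0.26 nat, in line with the value of roughly 0.25 used in the text.

```python
import math

def He(x):
    """Binary entropy in natural units; monotone increasing on [0, 1/2]."""
    if x in (0.0, 1.0):
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def He_inv(y):
    """Inverse of He on [0, 1/2] by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if He(mid) < y:
            lo = mid
        else:
            hi = mid
    return lo

def first_ub(R):
    """First upper bound with the common factor A^2 removed."""
    return 0.25 * (1.0 - R / math.log(2))

def second_ub(R):
    """Second upper bound with the common factor A^2 removed."""
    return He_inv(math.log(2) - R)

# The second bound is the larger near R = 0 and the smaller near ln 2;
# bisect for the rate at which the two bounds coincide.
lo, hi = 0.01, 0.6
for _ in range(60):
    mid = (lo + hi) / 2.0
    if second_ub(mid) > first_ub(mid):
        lo = mid
    else:
        hi = mid
print(round(lo, 2))  # crossover rate in nats, approximately 0.26
```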
For all values of A, these bounds are lower than the low-rate
upper bound for the unrestricted case. For values of A
greater than 3, approximately, these bounds are also lower than
the sphere-packing bound for the unrestricted case. It is in this
range that the derived bounds give the greatest information. It
is seen that as A increases beyond 3, the reliability of the binary
code is bounded away from the optimum, represented by the sphere-packing
bound. The bounds show clearly that the binary coding
technique is restricted to a maximum signalling rate of ln 2.
For values of A less than 3, the upper bounds convey less
information. In this range of A, the first upper bound lies under
the sphere-packing bound for low ranges of R and intersects that
bound at a value of R always less than R_c. The second upper bound
lies above the sphere-packing bound except at very low R and at
R near ln 2. The amount of definite information given about the
binary technique for small A is less than that available for
higher A.
Where the sphere-packing bound is higher than the first and
second upper bounds, the difference between them may be used as
a crude measure of the loss in reliability resulting from the use
of the binary technique. It is not too accurate a measure, because
there is no guarantee that the first and second upper bounds can be
met, that is, that a code can be found which will achieve this
reliability, while with the unrestricted technique the optimum
bound could be very closely approached, if not actually reached.
To do so might require that many levels of signal quantization be
available, but this is, in theory, no barrier.
In all situations where the first and second upper bounds
convey information, they indicate that the binary coding technique
can give, at best, less reliability than the optimum unrestricted
code. The degree of this degradation increases as the signal to
noise ratio increases. In all cases, the situation is not too sharp
at very low rates. Each figure shows that the sphere-packing
bound, which is of exponential shape, is sharply truncated by
the low-rate bound for the unrestricted case. The resulting
unrestricted upper bound does not appear to be a realistic or
naturally achievable limit. A reasonable estimate is that the
true upper bound lies between the first upper bound for the binary
case and the truncated unrestricted bound, in which case the binary
case will not be bounded as far away from the optimum.
The second upper bound was derived from the Hamming distance
bound for group codes. A group code for the Binary Symmetric
Channel which meets the Hamming bound is optimum, i.e., no code
can be found with the same n and k which has a lower probability
of error. It has not been established that the second upper bound
is optimum for the channel investigated herein. However, the
second upper bound does serve as a guide to the performance to be
expected when an optimum group code is used in the time-discrete
channel.
C. A Comparison of Lower Bounds
1. The First Lower Bound
The first lower bound is plotted as curve C on Figures 4
through 8. The bound was derived in terms of minimum distance D
between code words, in the same manner as was the bound for the
unrestricted case. The expression for E is the same for both
and is

    E \ge \frac{\lambda^2 P}{4N} - R,

where

    \lambda = \frac{D}{\sqrt{2Pn}}.

Rate R is a function of \lambda, but the function is not the same for the
binary and unrestricted cases. For the binary case, R is given by

    R \ge \ln 2 - H_e\left(\frac{\lambda^2}{2}\right).

This can be compared with R for the unrestricted case as given in
Part A of this section. To present E as a function of R, several
auxiliary curves were used. Figure 9 plots rate as a function
of \lambda for both cases. Figures 10, 11, and 12 give E as a function
of \lambda for various values of A.
The first lower bound lies slightly under the low-rate
bound for the unrestricted case. For lower values of A, it coincides
with the unrestricted bound at low rates. It would be
unreasonable to expect that the binary coding technique could give
a higher reliability than that of the unrestricted case. It must
be remembered that the binary case is only a special case of the
unrestricted coding scheme. If the binary technique could guarantee
that codes could be found whose reliability was no worse than
some value K(R), and K(R) was better than the corresponding bound
for the unrestricted case, then the unrestricted bound could be
raised to match, since the binary group codes are among those available
in the unrestricted case. It is felt by the author that the
close proximity, at low rates, of the lower bounds of the binary and
unrestricted schemes is significant when the highly restrictive
nature of binary coding is compared with the unlimited numbers of
codes available in the unrestricted case.
2. The Second Lower Bound
The second lower bound is plotted as line D on Figures 4
through 8. This bound lies under the unrestricted random coding
bound for all values of A. Its separation from the unrestricted
bound increases as A increases. For small A, it approaches the
unrestricted bound quite closely. The departure of the second
lower bound from the unrestricted random coding bound can be used
as a much better measure of the loss of reliability by binary coding
than could the first upper bound. The unrestricted random coding
bound says that, from all the codes available, there can be
selected one or more which can have no lower reliability than is
given by the bound. The second lower bound guarantees that
among all the binary codes available one or more exists which can
have no lower reliability than that given by the bound. Since these
bounds are not widely divergent, and in fact, are quite close at
the lower values of A, the curves show that binary codes can be
found which are not significantly worse than the average code for
the unrestricted case.
SECTION V
CONCLUSIONS
A. Summary of the Investigation
The stated purpose of this investigation was to inquire into
the use of binary codes, particularly binary linear codes, as input
signal sequences for a time-discrete continuous channel. More
specifically, it was desired to determine if binary codes as a
class were so inferior in performance that a different technique of
coding should be sought, or if their performance approached the unrestricted
codes closely enough to warrant a more detailed study. The
method of this inquiry was to determine the reliability of such codes
for this channel and to contrast this with the reliability derived for
the more general unrestricted case by Shannon. The reliability
determination was done by the derivation of two upper and two lower bounds
on reliability for the binary coding technique.
The first and second upper bounds demonstrate that, for a
substantial range of the signal to noise ratio, the upper limit of
reliability of the binary coding technique is bounded below the
optimum reliability for the unrestricted case. This is not an
unexpected result. The value of the upper bounds is that they are an
analytical verification of the expected result. The plotted curves
show that the binary method is bounded away from optimum by a
fairly large amount in most cases.
The lower bounds on reliability are the most surprising and
significant results of the investigation. The first lower bound is
valid for low rates of information transmission. This bound at worst
lies only slightly below the corresponding bound for the unrestricted
case. It guarantees that binary linear codes exist which give, at
worst, a reliability only slightly less than that guaranteed for the
unrestricted case. Any lower bound on reliability is, in a sense, the
only positive statement which can be made about the merit of a code.
The close proximity of the lower bounds for the binary and
unrestricted cases indicates that, by the use of binary linear codes, a
value of reliability can be guaranteed which is only slightly worse
than that which is guaranteed if the entire ensemble of codes
available in the unrestricted case is used.
The second lower bound is valid over higher ranges of rate
than is the first lower bound. It, too, lies beneath the unrestricted
bound. The divergence of the two bounds is somewhat greater than is
the case for the first lower bound. The statements about the first
lower bound apply here also, except that the difference in
guaranteed reliability between the binary and unrestricted cases is greater
over the range for which the second lower bound is valid. It should
also be emphasized that the second lower bound is not derived for
binary linear codes, but rather for binary codes in general. It is
the only one of the bounds obtained which has this exception.
Overall, the use of binary codes appears to be a promising
method of coding for the time-discrete continuous channel, provided
the signal to noise ratio is not too large. As indicated by the
various figures, the bounds on reliability for the binary case become
increasingly divergent from those of the unrestricted case as the
signal to noise ratio increases. For larger values of A, nonbinary
discrete codes would probably yield results much closer to those of
the unrestricted case.
The bounds derived for this investigation are the only indications available (to the best of the author's knowledge) of what is possible when a class of discrete codes is used in a time-discrete continuous channel. It is reasonable to expect that when the input signals are restricted to a comparatively tiny subset of the total ensemble of codes available to the channel, the reliability will be decreased. It is, however, encouraging that the lower bounds for the binary coding technique can guarantee, at lower signal-to-noise ratios, a reliability which is close to that guaranteed by the unrestricted codes over a reasonably wide range of transmission rates.
B. Suggestions for Future Work
This investigation has shown that the use of binary codes as
an encoding technique for the time-discrete channel is promising. It
can be reasonably conjectured that even better results could be
achieved by the use of nonbinary discrete codes, which would make
available additional levels of signal quantization. The possible
problems for future work are numerous. They fall rather naturally into two groups: the continuation of research in the vein of this paper, and studies directed toward the application of discrete codes to actual information-handling systems.
The first theoretical investigation might well concern itself with a sharpening of the bounds for the binary case. A most informative result would be an optimum or least upper bound. This would pin down the maximum capability of this type of coding. The lower bounds herein were obtained, in some cases, by comparatively crude techniques.
It is, however, questionable how much of an improvement can be obtained
in any reasonable fashion.
A most fruitful area for future work would be the use of nonbinary linear codes, i.e., codes from ternary, quaternary, and, in general, q-ary systems. As additional levels of signal quantization are made available, the bounds on reliability, and in particular the upper bounds, would be expected to approach the unrestricted case, since, in effect, an infinite number of signal levels are available in the unrestricted case. A measure or technique for determining how many levels are required to approach the unrestricted bound to within some specified interval would be the ultimate result one might expect.
When the actual application of discrete codes is considered,
several problems are immediately apparent which warrant study. The
need for decoding methods is clear, and an investigation to determine
them would be challenging. The implementation of a maximum likelihood decoder does not appear to be practical. On the other hand, digit-by-digit decoding will penalize the reliability. Is there an
effective compromise? Finally, in the application of discrete codes,
some code must be selected. An investigation of specific codes for
use in this application would be informative.
[Figures 1 through 12 of the original appear here; only their captions and legends are recoverable.]

Figure 1. Interpretation of First Lower Bound.
Figure 2. Bounds on Reliability, Unrestricted Theory (A = 3; reliability vs. rate R).
Figure 3. Comparison of Reliability for A = 1/2.
Figure 4. Comparison of Reliability for A = 1.
Figure 5. Comparison of Reliability for A = 2.
Figure 6. Comparison of Reliability for A = 3.
Figure 7. Comparison of Reliability for A = 4.
Figure 8. Comparison of Reliability for A = 8.
(In Figures 3 through 8: solid curves, unrestricted theory bounds; dashed curves, binary bounds; A = first upper bound, B = second upper bound, C = first lower bound, D = second lower bound; abscissa is rate.)
Figure 9. Comparison of Rate vs. ... (A = unrestricted case, B = binary case; caption incomplete in source).
Figure 10. First Lower Bound on Reliability, A = 1/2, 1, 2, 3 (abscissa λ = D/√(2nP); for each value of A, the upper curve is the binary case and the lower curve the unrestricted case).
Figure 11. First Lower Bound on Reliability, A = 4 (A = binary case, B = unrestricted case).
Figure 12. First Lower Bound on Reliability, A = 8, 16 (B, D = binary case; C, E = unrestricted case; abscissa λ = D/√(2nP)).
APPENDIX
ALLIED RESULTS
A. Minimum Power Theorem
Suppose that there exists a set of M signals, each of which is represented as an n-dimensional vector in n-space in a primary coordinate frame S. Thus the ith signal will be

\[
x_i = (x_{i1}, x_{i2}, \ldots, x_{ij}, \ldots, x_{in}) .
\]

Each x_ij is a voltage, which could be derived from some sampled value of a more complex signal representation. For simplicity, assume that each signal component x_ij has a duration of 1 second. The power in the ith signal is, with respect to frame S, equal to

\[
\frac{1}{n} \sum_{j=1}^{n} x_{ij}^2 .
\]

The average power P of the ensemble of M signals is then

\[
P = \sum_{i=1}^{M} p_i \, \frac{1}{n} \sum_{j=1}^{n} x_{ij}^2 ,
\]

where p_i is the probability that the ith signal will occur, and

\[
\sum_{i=1}^{M} p_i = 1 .
\]
If the signal points remain fixed with regard to the primary coordinate frame S, then the average power will be a function of the position of the coordinate system in which the power is calculated. For assume that the power is calculated with respect to a new coordinate system S' centered on K = (k_1, k_2, ..., k_n). Then

\[
P = \sum_{i=1}^{M} p_i \, \frac{1}{n} \sum_{j=1}^{n} x_{ij}'^2 ,
\]

where

\[
x_{ij}' = (x_{ij} - k_j) ;
\]

thus

\[
P = \sum_{i=1}^{M} p_i \, \frac{1}{n} \sum_{j=1}^{n} (x_{ij} - k_j)^2 , \tag{1}
\]

which is clearly a function of K.
We wish to know if there exists a meaningful value of K which will minimize P. Consider a mechanical analogy. Let each signal vector x_i be the radius vector to a point of mass m_i, where \sum m_i = M and p_i = m_i / M. Then the center of mass of the system of mass points is given by

\[
R = \frac{\sum_{i=1}^{M} m_i x_i}{M} . \tag{2}
\]

Thus R = (r_1, r_2, ..., r_j, ..., r_n), where

\[
r_j = \frac{1}{M} \sum_{i=1}^{M} m_i x_{ij} , \tag{3}
\]

or

\[
r_j = \sum_{i=1}^{M} p_i x_{ij} . \tag{4}
\]
Taking the derivative of (1) with respect to K, we get

\[
\frac{dP}{dK} = \sum_{i=1}^{M} p_i \left[ \frac{1}{n} \sum_{j=1}^{n} (-2)(x_{ij} - k_j) \right] ,
\]

since p_i is not a function of K. The second derivative d^2P/dK^2 is positive; hence dP/dK = 0 will yield a minimum. Then

\[
-\frac{2}{n} \sum_{i=1}^{M} p_i \sum_{j=1}^{n} (x_{ij} - k_j) = 0 ,
\]

which requires

\[
\sum_{i=1}^{M} \sum_{j=1}^{n} p_i (x_{ij} - k_j) = 0 .
\]
Interchanging the order of summation, there results

\[
\sum_{j=1}^{n} \sum_{i=1}^{M} p_i (x_{ij} - k_j) = 0 .
\]

This will be satisfied if

\[
\sum_{i=1}^{M} p_i (x_{ij} - k_j) = 0 \tag{5}
\]

for each j. Since the sum is on i, (5) may be written

\[
\sum_{i=1}^{M} p_i x_{ij} - \sum_{i=1}^{M} p_i k_j = 0 .
\]

Recall that \sum_{i=1}^{M} p_i = 1; thus the condition

\[
\sum_{i=1}^{M} p_i x_{ij} = k_j \tag{6}
\]

for all j will minimize the average power in the signal. But (6) is equivalent to (4). Thus, the average power will be minimized if the coordinate system wherein power is calculated has its origin at the center of mass of the signal points.
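The minimum-power result can be checked numerically. The following sketch uses a hypothetical, randomly generated signal set (the specific numbers are illustrative only); it computes the average power about the center of mass, equation (6), and confirms that no other trial origin gives a smaller value.

```python
import random

# Hypothetical signal set: M signals, each an n-vector of voltages.
random.seed(1)
M, n = 8, 5
signals = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(M)]
probs = [1.0 / M] * M  # equally likely, as assumed in the body of the paper

def avg_power(signals, probs, origin):
    """Average power P = sum_i p_i * (1/n) * sum_j (x_ij - k_j)^2, eq. (1)."""
    n = len(origin)
    return sum(p * sum((x - k) ** 2 for x, k in zip(sig, origin)) / n
               for p, sig in zip(probs, signals))

# Center of mass, equation (6): k_j = sum_i p_i x_ij.
centroid = [sum(p * sig[j] for p, sig in zip(probs, signals))
            for j in range(n)]
p_min = avg_power(signals, probs, centroid)

# Any other origin should give at least as much average power.
for _ in range(100):
    other = [random.uniform(-2, 2) for _ in range(n)]
    assert avg_power(signals, probs, other) >= p_min
```

The assertion holds for every trial origin, as the theorem requires; moving the origin away from the centroid only adds power.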
In the investigation reported in the body of this paper, it has been assumed that all signals (code words) are equally likely, i.e., p_i = 1/M. By construction, each signal is in one-to-one correspondence with a word of a group code (linear code), and the ensemble of M signals has constituted a complete group code. It will now be shown that, under these constraints, the origin of coordinates (0, 0, ..., 0) of these signals is at the center of mass of the signal points, and hence minimum average power is required.
If the center of mass of the system of signal points is to coincide with the origin of coordinates, then R = 0, or, from (4),

\[
r_j = \sum_{i=1}^{M} p_i x_{ij} = 0
\]

for all j. By hypothesis, p_i = 1/M, requiring

\[
\frac{1}{M} \sum_{i=1}^{M} x_{ij} = 0 ,
\]

or

\[
\sum_{i=1}^{M} x_{ij} = 0 . \tag{7}
\]

Since x_ij can assume only the values ±B, (7) will be satisfied if and only if the positive and negative values each appear M/2 times in the jth component of the signal.
We will use the notation of linear codes. Consider an (n, k) linear code over the field of two elements. It is such a code which has been used in forming the signal vectors, by mapping 1 into B and 0 into -B. Hence, any property derived for the (n, k) code will be valid for the signals used in this investigation.

This (n, k) linear code contains M = 2^k n-tuples over the field of two elements, and these n-tuples form a subspace V of the vector space of all n-tuples over the field of two elements. Arrange these n-tuples as rows of a matrix. No column of this matrix contains all zeros, for if such a column existed, it could be deleted and an (n-1, k) linear code would remain.
Consider the subset S = {v_1, v_2, ...} of vectors of V which have a 0 in the jth column. Now S is a subspace of V, since S is closed under addition (the sum of any two vectors with 0 in the jth column will be a vector with 0 in the jth column) and closed under multiplication by scalar field elements (any vector which has a zero in the jth component will retain a zero in that component when multiplied by any scalar). Since S is a subspace of a vector space V, S is also a vector space and is an Abelian group under addition. Hence, we may form left cosets based upon S as a subgroup. Selecting a member of V which contains a 1 in the jth position, we can form a left coset. Now, all members of V which contain a 1 in the jth position must appear in this coset, since two elements v and v' of a group V are in the same left coset of a subgroup S of V if and only if (-v) + v' is an element of S, and the sum of any two vectors with a 1 in the jth position yields a vector with a 0 in the jth position. This theorem is proved by Peterson (21). Since the vectors of V can have only a 1 or 0 in the jth position, this partitioning exhausts V and divides the space into a subset containing all 0's in the jth position and another subset containing all 1's in the jth position. By construction, each subset has an identical number of elements. Thus, each must contain M/2 vectors. Consequently, in a linear code, 1's and 0's occur M/2 times in each component of the vector ensemble. Thus, if a linear code is mapped into a code containing components of ±B in each vector, (7) is satisfied. Hence, minimum power is required.
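The column-balance property proved above is easy to verify for a concrete code. The sketch below uses a standard (7, 4) Hamming code generator matrix as an illustrative example (this particular code is an assumption, not the code of the paper) and checks that every column of the codeword matrix contains exactly M/2 = 8 ones.

```python
from itertools import product

# Generator matrix of a (7,4) binary linear code (a standard Hamming code
# generator, chosen only as an example; the argument holds for any (n,k) code).
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
n, k = 7, 4
M = 2 ** k

# Generate all M = 2^k code words as GF(2) combinations of the rows of G.
codewords = []
for msg in product([0, 1], repeat=k):
    word = [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]
    codewords.append(word)

# In a linear code, every column that is not all zeros contains exactly
# M/2 ones, so the +B/-B signal points are balanced and (7) is satisfied.
ones_per_column = [sum(w[j] for w in codewords) for j in range(n)]
assert all(c in (0, M // 2) for c in ones_per_column)
assert ones_per_column == [M // 2] * n  # no all-zero column for this G
```

The same check passes for any generator matrix without an all-zero column, which is precisely the degenerate case excluded in the argument above.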
B. An Upper Bound on Reliability for the Binary Symmetric Channel
It was felt that the technique used in the derivation of the first upper bound in Section III, Part B, of the body of this paper might yield a comparable result for binary codes used in the binary symmetric channel (BSC). The resulting bound does prove to be lower than the sphere-packing bound for low transmission rates.
The Plotkin bound on maximum minimum distance guarantees that, for an (n, k) code, there exists at least one pair of words, w_1 and w_2, which are no farther apart than the bound value of d, which is given by

\[
d \le \frac{n}{2} \left[ 1 - R + \frac{1}{n} \log (n - k + 1) + \frac{1}{n} \right] , \tag{1}
\]

where R = k/n is the rate in bits, and logs are to the base 2. Assume that, in a BSC, one of this pair which has maximum minimum distance, say w_1, is transmitted. Then the contribution to the probability of error of the code is certainly no less than the probability that w_1 is mistaken for w_2, weighted by the probability that w_1 is transmitted. Since all words are equally likely, the probability that w_1 is transmitted is 2^{-k}. The word w_1 will be mistaken for w_2 if at least d/2 of the d places in which they differ are changed. If the BSC has an error probability of p and a probability of correct transmission of q = (1 - p), then the contribution to the probability of error of the code, P_e, is bounded by

\[
P_e \ge 2^{-k} \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i} . \tag{2}
\]
Now, discard w_1, since an expression bounding its contribution to the P_e has been obtained. The remaining (2^k - 1) words must contain a pair which is no farther apart than the bound value of d_1, which is given by

\[
d_1 \le \frac{n}{2} \left[ 1 - \frac{1}{n} \log (2^k - 1) + \frac{1}{n} \log (n - k + 1) + \frac{1}{n} \right] . \tag{3}
\]
Similarly, if one of this pair is transmitted and decoded as the other, it will yield a contribution to the probability of error of at least

\[
\frac{1}{2^k - 1} \sum_{i=d_1/2}^{d_1} \binom{d_1}{i} p^i q^{d_1 - i} .
\]

This can be continued until only two words remain. This last contribution will be

\[
\frac{1}{2} \sum_{i=d_0/2}^{d_0} \binom{d_0}{i} p^i q^{d_0 - i} ,
\]

where

\[
d_0 \le \frac{n}{2} \left[ 1 + \frac{1}{n} \log (n - k + 1) + \frac{2}{n} \right] .
\]

If the contributions from all these pairs are summed, there results a lower bound on P_e given by

\[
P_e \ge 2^{-k} \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i}
+ \frac{1}{2^k - 1} \sum_{i=d_1/2}^{d_1} \binom{d_1}{i} p^i q^{d_1 - i}
+ \cdots
+ \frac{1}{2} \sum_{i=d_0/2}^{d_0} \binom{d_0}{i} p^i q^{d_0 - i} . \tag{4}
\]
This bound can be simplified. First, since there are (2^k - 1) terms on the right of (4), the inequality is preserved if we retain only the first (2^(k-1) + 1) terms. Each term on the right of (4) can be shown to be smaller than the preceding term. Hence, these (2^(k-1) + 1) remaining terms can be replaced by the (2^(k-1) + 1)th term, giving

\[
P_e \ge \frac{2^{k-1} + 1}{2^k} \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i} , \tag{5}
\]

where now d is given by the right side of the relation

\[
d \le \frac{n}{2} \left[ 1 - \frac{1}{n} \log (2^{k-1} - 1) + \frac{1}{n} \log (n - k + 1) + \frac{2}{n} \right]
\le \frac{n}{2} \left[ 1 - R + \frac{1}{n} \log (n - k + 1) + \frac{3}{n} \right] . \tag{6}
\]

The inequality (5) is preserved and somewhat simplified if it is written

\[
P_e \ge \frac{1}{2} \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i} . \tag{7}
\]
Reliability E is, from (7),

\[
E \le \lim_{n \to \infty} \left\{ -\frac{1}{n} \log \left[ \frac{1}{2} \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i} \right] \right\} ,
\]

or

\[
E \le \lim_{n \to \infty} \left[ \frac{1}{n} - \frac{1}{n} \log \sum_{i=d/2}^{d} \binom{d}{i} p^i q^{d-i} \right] . \tag{8}
\]
The evaluation of (8) requires that an estimate be made of the "tail" of a binomial distribution. Peterson (22) gives the necessary relations. Write d/2 as λd, where λ = 1/2. The sum in the inequality (8) may then be written

\[
\sum_{i=\lambda d}^{d} \binom{d}{i} p^i q^{d-i} ,
\]

and the inequality

\[
\sum_{i=\lambda d}^{d} \binom{d}{i} p^i q^{d-i} \ge \binom{d}{\lambda d} p^{\lambda d} q^{\mu d} \tag{9}
\]

can be used, provided λ ≥ p. Here (λ + μ) = 1. As the code word length n increases, d will also increase. Peterson (22) shows that, for large d,

\[
-\frac{1}{d} \log \left[ \binom{d}{\lambda d} p^{\lambda d} q^{\mu d} \right] \to F(\lambda, p) , \tag{10}
\]

where

\[
F(\lambda, p) = H(p) - H(\lambda) + (\lambda - p) H'(p) , \tag{11}
\]

\[
H(x) = -x \log x - (1 - x) \log (1 - x) ,
\]

and

\[
H'(x) = \log \left( \frac{1 - x}{x} \right) .
\]
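The exponent F(λ, p) lends itself to a quick numerical check. The sketch below verifies that the form H(p) - H(λ) + (λ - p)H'(p) agrees with the equivalent divergence form λ log(λ/p) + μ log(μ/q), and that -(1/d) times the log of the single term in (9) approaches F(λ, p) as d grows, as relation (10) asserts.

```python
from math import comb, log

def H(x):
    """Binary entropy, logs to the base 2."""
    return -x * log(x, 2) - (1 - x) * log(1 - x, 2)

def Hprime(x):
    """Derivative of H: log2((1-x)/x)."""
    return log((1 - x) / x, 2)

def F(lam, p):
    """Tail exponent (11): F(lambda, p) = H(p) - H(lambda) + (lambda - p) H'(p)."""
    return H(p) - H(lam) + (lam - p) * Hprime(p)

p, lam = 0.01, 0.5
mu, q = 1 - lam, 1 - p

# F equals the divergence lambda*log2(lambda/p) + mu*log2(mu/q).
div = lam * log(lam / p, 2) + mu * log(mu / q, 2)
assert abs(F(lam, p) - div) < 1e-12

# The single term of (9) decays like 2^(-d F) for large d, per (10).
# Work in the log domain to avoid floating-point underflow of p**i.
for d in (200, 400):
    i = int(lam * d)
    log2_term = log(comb(d, i), 2) + i * log(p, 2) + (d - i) * log(q, 2)
    assert abs(-log2_term / d - F(lam, p)) < 0.05
```

The residual error at d = 200 and d = 400 comes from the slowly vanishing Stirling correction, consistent with (10) being an asymptotic relation.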
Thus, using (10), the bound (8) is written

\[
E \le \lim_{n \to \infty} \left[ \frac{1}{n} + \frac{d}{n} F(\lambda, p) \right] .
\]

But, from the Plotkin bound (6),

\[
\frac{d}{n} \le \frac{1}{2} \left[ 1 - R + \frac{1}{n} \log (n - k + 1) + \frac{3}{n} \right] .
\]

Also, recalling that λ = 1/2, there results, in the limit,

\[
E \le \frac{F(0.5,\, p)}{2} \, (1 - R) . \tag{12}
\]
The optimum or sphere-packing bound for the binary symmetric channel is due to Elias (23). It is given by

\[
E \le F(\lambda_0, p) , \tag{13}
\]

where F(λ_0, p) is given by (11), and λ_0 is defined by the expression

\[
1 - H(\lambda_0) = R . \tag{14}
\]

Figure 13 shows a typical plot of the bound derived here, as given by (12), and the sphere-packing bound as given by (13), for p = 0.01. The bound (12) is lower than the sphere-packing bound for low transmission rates R.
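The comparison shown in Figure 13 can be reproduced numerically. The sketch below evaluates bound (12) and the sphere-packing bound (13), solving (14) for λ_0 by bisection, at p = 0.01; the sample rates are illustrative.

```python
from math import log

def H(x):
    """Binary entropy, base-2 logs."""
    return -x * log(x, 2) - (1 - x) * log(1 - x, 2)

def F(lam, p):
    """F(lambda, p) = H(p) - H(lambda) + (lambda - p) log2((1-p)/p), eq. (11)."""
    return H(p) - H(lam) + (lam - p) * log((1 - p) / p, 2)

def low_rate_bound(R, p):
    """Bound (12): E <= F(0.5, p) (1 - R) / 2."""
    return F(0.5, p) * (1 - R) / 2

def sphere_packing_bound(R, p):
    """Bound (13): E <= F(lambda0, p), with 1 - H(lambda0) = R, eq. (14)."""
    lo, hi = 1e-12, 0.5          # 1 - H(lambda) falls from 1 to 0 on (0, 1/2]
    for _ in range(200):          # bisection on 1 - H(lambda0) = R
        mid = (lo + hi) / 2
        if 1 - H(mid) > R:
            lo = mid
        else:
            hi = mid
    return F((lo + hi) / 2, p)

p = 0.01
# At low rates the bound (12) lies below the sphere-packing bound...
for R in (0.05, 0.10, 0.15):
    assert low_rate_bound(R, p) < sphere_packing_bound(R, p)
# ...while at higher rates the ordering reverses, so the improvement
# is confined to low transmission rates, as the text states.
assert low_rate_bound(0.3, p) > sphere_packing_bound(0.3, p)
```

The crossover between the two curves is the numerical counterpart of the qualifier "for low transmission rates" in the paragraph above.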
Since the Plotkin bound (1) is applicable to binary codes in general, the results obtained here for P_e, expression (7), and E, expression (12), are valid for the entire class of binary codes. Weldon (24) has obtained low-rate results for the BSC which are applicable to group codes only. His result for E is the same as (12), while his result for P_e is higher than (7) by a factor of 2.
[Figure 13 of the original appears here; only its caption and labels are recoverable.]

Figure 13. Low Rate Bound Compared with Sphere-Packing Bound for p = 0.01 (reliability vs. rate R; the two curves are labeled Sphere-Packing Bound and Low Rate Bound).
LIST OF REFERENCES
1. C. E. Shannon and W. Weaver, "The Mathematical Theory of Communication," University of Illinois Press, Urbana, Ill.; 1949.
2. W. W. Peterson, "Error-Correcting Codes," John Wiley and Sons, Inc., New York, N. Y.; 1960.
3. P. Elias, "Error-Free Coding," IRE Transactions on Information Theory, vol. IT-4, pp. 29-37; 1954.
4. C. E. Shannon, "Probability of Error for Optimal Codes in a Gaussian Channel," Bell System Technical Journal, vol. 38, pp. 611-656; May, 1959.
5. G. A. Franco and G. Lachs, "An Orthogonal Coding Technique for Communication," IRE Convention Record, part 8, pp. 126-133; March, 1961.
6. F. F. Harmuth, "Orthogonal Codes," Institution of Electrical Engineers, monograph 369E; March, 1960.
7. R. M. Fano, "Transmission of Information," John Wiley and Sons, Inc., New York, N. Y., pp. 148-163; 1961.
8. R. M. Fano, op. cit., p. 141.
9. D. M. Y. Sommerville, "An Introduction to the Geometry of N Dimensions," Dover Publications, Inc., New York, N. Y., p. 76; 1958.
10. H. Cramer, "Mathematical Methods of Statistics," Princeton University Press, Princeton, N. J., pp. 310-312; 1946.
11. M. Plotkin, "Binary Codes with Specified Minimum Distance," IRE Transactions on Information Theory, vol. IT-6, pp. 445-450; September, 1960.
12. W. W. Peterson, op. cit., pp. 48-50.
13. F. M. Reza, "An Introduction to Information Theory," McGraw-Hill Book Company, Inc., New York, N. Y., pp. 83-84; 1961.
14. W. Feller, "An Introduction to Probability Theory and Its Applications," John Wiley and Sons, Inc., New York, N. Y., p. 131; 1950.
15. E. N. Gilbert, "A Comparison of Signaling Alphabets," Bell System Technical Journal, vol. 31, pp. 504-522; 1952.
16. W. W. Peterson, op. cit., pp. 51-52.
17. R. W. Hamming, "Error Detecting and Error Correcting Codes," Bell System Technical Journal, vol. 29, pp. 147-160; 1950.
18. W. W. Peterson, op. cit., pp. 52-54.
19. R. M. Fano, op. cit., pp. 224-231.
20. R. S. Kennedy and J. M. Wozencraft, "Coding and Communication," Massachusetts Institute of Technology, Cambridge, Mass., MIT Report No. MS927; 1963.
21. W. W. Peterson, op. cit., p. 17.
22. W. W. Peterson, op. cit., pp. 246-247.
23. P. Elias, "Coding for Two Noisy Channels," Third London Symposium on Information Theory, Academic Press, Inc., New York, N. Y.; 1956.
24. E. J. Weldon, "Asymptotic Error Coding Bounds for the Binary Symmetric Channel with Feedback," Air Force Cambridge Research Laboratories, Bedford, Mass., AFCRL Report No. 63122; April, 1963.
BIOGRAPHICAL SKETCH
James Robert Wood was born February 14, 1931, in Memphis,
Tennessee. In June, 1948, he was graduated from Bay County High
School, Panama City, Florida. He received the degree of Bachelor
of Electrical Engineering in August, 1956, from the University of
Florida. His undergraduate studies were interrupted from 1951 to
1955, when he served four years with the United States Air Force,
chiefly in Japan. In August, 1956, he joined the International
Business Machines Corporation, with whom he has been associated
until the present time. He has held several positions with that
company and had assignments in various phases of digital computer
development. He attended graduate school in the extension division
of Syracuse University during the years 1957 to 1961 and received
the degree Master of Electrical Engineering from that institution
in August, 1961. In 1961 he was awarded an IBM fellowship and began
work toward the degree of Doctor of Philosophy in September, 1961.
James Robert Wood is married to the former Barbara Lois
Braswell and has one son. He is a member of the Institute of
Electrical and Electronics Engineers, Phi Kappa Phi, and Sigma Tau.
This dissertation was prepared under the direction of the
chairman of the candidate's supervisory committee and has been
approved by all members of that committee. It was submitted to the
Dean of the College of Engineering and to the Graduate Council,
and was approved as partial fulfillment of the requirements for
the degree of Doctor of Philosophy.
August 8, 1964
Dean, College of Engineering
Dean, Graduate School
Supervisory Committee
Chairman