Note on history re Ville


Material Information

Note on history re Ville
Physical Description:
Chung, Kai Lai
Physical Location:
Box: 1
Folder: Note on history re Ville


Subjects / Keywords:
Mathematics -- History -- 20th century

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.

February 12, 1998: NOTE on HISTORY
Ville is most likely long dead. My vague memory is that he went into
business soon after his thesis; Levy told me. It does not seem
quite "fair" to say, as Snell recorded in the interview, that he did
not give a formal definition of martingales, a name concocted by him
meaning a system of bets. His definition is given on p. 99 in the discrete
case, on p. 111 in the continuous case. He referred to Levy for
conditional probability and expectations and to Doob (1937) in the
continuous case for reinforcement. It is true that he adopted a peculiar
notation for random variable Xn taking values xn, and inutilement
superposed nonnegative functions sn = sn(x1, x2, ..., xn).
His main definition

$$M_{x_1 \ldots x_{n-1}}\{s_n(x_1, \ldots, x_{n-1}, X_n)\} = s_{n-1}(x_1, \ldots, x_{n-1})$$
is somewhat redundantly mysterious. But we must assume a man innocent
until proven guilty and give Ville the benefit of doubt that he merely
confused his symbolism. Those dispensable functions sn are meant by him
to represent the "selection rules" in the frequency theory of Mises
(von or de) and Wald, which is to a large extent what Ville's collectif is
"all about"; see his final chapter VI: Conclusions.
What struck me as odd is that Ville did not cite Doob's 1936 "Note
on probability" which is extremely relevant to that theory, and has
clearly to do with a selection rule which he called optional sampling. Did
Ville know Doob's result? Another thing: although Ville referred
to Levy's Addition profusely he did not mention the latter's concrete
treatment of a sequence of random variables {Xn} with the property:

$$E\{X_n \mid X_1, \ldots, X_{n-1}\} = 0,$$

which we all know is a special case of a martingale. Thus he proved
the maximum inequality (due to Kolmogorov in the independent case) and
something about the law of iterated logarithm without seeming to realize
that Levy could have done similar things in his context (and probably
did in fact).


It is also odd that when it comes to a true stopping time, Ville
was just as casual as if random times were real times. He gave the
famous gambler's ruin problem and wrote down its solution on p. 99,
and "vaunted" a little on p. 100 that his martingale solution did not
depend on the specifics of the game as Bertrand's solution did. In our
notation, it amounts to observing

(1) $x = X_0 = E(X_T)$, where T = the ruin time.

Ville allowed T to be infinite with positive probability. The theory
of martingales yields the equation above for any "fair game" provided
the stopping time satisfies certain conditions. The most trivial
condition is that T be bounded with probability one, which is not true
even in coin-tossing. No other neat condition is available since we
now have necessary and sufficient conditions for its validity, which are
definitely messy (see p. [illegible] of Doob's 1953, though he did not
attempt necessity; cf. Neveu's book, p. 76, where it is stated in
roundabout ways [habitually this author gives definitions ("régulier")
among the statements of results, a vile habit that should be forbidden
by Napoleonic code!] but presumably (i.e. "let us hope") does
produce a N&S set of conditions). The heuristic idea was actually
already given in Khintchine's 1933 book, who seemed unaware of its
requirement of proof. Yet Ville never gave a hint of the fundamental
notion of "a martingale stopped at random times to remain a martingale".
The latter is the main idea in Doob's 1936 Note although there
it is a trivial situation: an IID sequence remains so when optionally
sampled, {X_{T_n}}. Strictly speaking this cannot be regarded as a
particular case of (1): firstly because we do not need a martingale
(E(Xn) need not be [illegible]); secondly and more seriously, the Tn
required is not a general optional time but is one (or two, or 17) plus
an optional time. When and who first thought of a general case of (1)?
Doob might have had something unpublished till 1953, but Wald had an
excellent special case


(1944). He did not bother with martingales and dealt with an IID
sequence with finite mean; when the mean is zero the cumulative sums
{Yn} form a martingale. If T is an optional time with E(T) finite, he
proved

(2) $E(Y_T) = E(T) \cdot E(X_1)$.
Actually when E(X1)=0 the stronger result holds that the pair (Y1, YT)
forms a martingale. There is a less elegant generalization of the result;
see Doob, p. 303. Wald also has a rarer result dealing with
the characteristic functions of the cumulative sums stopped at an
optional time; see Doob, p. 350ff. It is called the fundamental theorem
of sequential analysis, presumably very practical though not
particularly pretty, just as G. H. Hardy said long ago about Littlewood's
ballistic research. As mentioned above Ville was intent on Wald's
theory of collectives; one wonders if he got into sequential analysis
after Paris fell in 1940, and Wald invented the sequential stopping
(of quality control of army-navy supplies) to save war expenses? His
master Emile Borel served as minister of the Navy, as well as minister
of Science or Research, I no longer remember which.
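
Equations (1) and (2) can be checked exactly on a small example. The sketch below is an illustration only and is not taken from Ville, Doob or Wald. It assumes steps of +1 with probability p and -1 with probability 1-p, and a bounded optional time T (the first exit from an interval, capped at a fixed number of steps), so that the "most trivial" sufficient condition mentioned above holds. The function name `stopped_walk` and all of its parameters are hypothetical.

```python
from fractions import Fraction


def stopped_walk(p, start, lower, upper, cap):
    """Return (E[X_T], E[T]) exactly for a walk X_n started at `start`.

    Steps are +1 with probability p and -1 with probability 1 - p.
    T is the first time the walk is <= lower or >= upper, capped at `cap`
    steps, so T is a bounded optional time (assumption of this sketch).
    """
    q = 1 - p
    alive = {start: Fraction(1)}   # position -> probability, not yet stopped
    e_x = Fraction(0)              # accumulates E[X_T]
    e_t = Fraction(0)              # accumulates E[T]
    for n in range(1, cap + 1):
        moved = {}
        for pos, pr in alive.items():
            for new_pos, weight in ((pos + 1, p), (pos - 1, q)):
                moved[new_pos] = moved.get(new_pos, Fraction(0)) + pr * weight
        alive = {}
        for pos, pr in moved.items():
            if pos <= lower or pos >= upper or n == cap:
                # The walk stops here: record its contribution.
                e_x += pr * pos
                e_t += pr * n
            else:
                alive[pos] = pr
    return e_x, e_t


if __name__ == "__main__":
    # Fair game: equation (1) says E[X_T] equals the starting capital.
    print(stopped_walk(Fraction(1, 2), 2, 0, 5, 20))
    # Biased game: equation (2) applied to Y_n = X_n - start.
    print(stopped_walk(Fraction(2, 5), 2, 0, 5, 20))
```

With exact rational arithmetic the two identities hold as equalities rather than approximately, which is why `Fraction` is used here instead of simulation.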

In the particular case where p=a/b, q=(b-a)/b, so that E(Xi)=0 for
all i, it is a famous theorem due to Polya (1921/2?) that P(S2n=0
for infinitely many values of n) = 1; in particular P^x{T0 < ∞} = 1,
where P^x denotes the probability when the random walk starts at x
(any integer). Emile Borel called this "retour à l'équilibre" and
discoursed on it at length (see his Valeur pratique et philosophie des
probabilités). In this case the exact distribution of T0 (for x=0, or
for x=1) is given by the combinatorial formula of André generalized by
Aeppli.

The question arises whether the exact distribution of T0
may be obtained in the more general case of b>a≥1, at least when
[illegible] and render E(Xi)=0. Although there was a huge literature on the
subject, known as "duration of play" in Todhunter's History ([ ]),
and numerous analytical formulas in terms of generating functions
and their power series, due to DeMoivre, Montmort, Lagrange, Laplace
(...), such a result SEEMS to be missing. Whereas André's "reflection
principle" or "counting ballots in reverse order" has been greatly
extended by Rothe-Hagen-..... -Takacs ....., they do not yield an
explicit expression for the Fn in (4), except in one special case.
This is the case b>a=1. Let us repeat: let P{Xi=b-1} = 1/b, P{Xi=-1}
= (b-1)/b, where b is any integer >2. In this case we can establish

the formula F = P(T =nb) 0=6. N -0
n Ib-V n-1 ) In6b-1
This formula was found experimentally by Chung c. 1940 (when the
Japanese were bombing Kunming) and verified by induction. P. L. Hsu then
summed the series $\sum_n F_n$ to confirm it to be one, as it should. It was
announced in the Amer. Math. Monthly (see [1]) in the algebraic form ( ).
H. Gould gave a solution using the identities of R-S-H-.... without
reference to its probabilistic origin or meaning. WHO would care for
such a formula (out of million others) without the meaning? As Borel.
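
The first-return formula can be checked exactly by machine for small b and n. The sketch below is an illustration under stated assumptions, not Chung's inductive proof or Takacs's argument. It assumes the walk starts at 0 with steps b-1 (probability 1/b) and -1 (probability (b-1)/b), that T0 is the time of first return to 0, and that the formula reads F_n = (b-1)/(nb-1) * C(nb-1, n-1) * (b-1)^(n(b-1)) / b^(nb-1). The function names `first_return_probs` and `closed_form` are hypothetical.

```python
from fractions import Fraction
from math import comb


def first_return_probs(b, n_max):
    """Exact P(first return to 0 occurs at time n*b) for n = 1..n_max.

    The law of the walk is propagated step by step; mass arriving at 0
    is removed (absorbed) so that only first returns are counted.
    """
    up, down = Fraction(1, b), Fraction(b - 1, b)
    alive = {0: Fraction(1)}      # position -> probability, no return yet
    result = []
    for step in range(1, n_max * b + 1):
        moved = {}
        for pos, pr in alive.items():
            for new_pos, weight in ((pos + b - 1, up), (pos - 1, down)):
                moved[new_pos] = moved.get(new_pos, Fraction(0)) + pr * weight
        returned = moved.pop(0, Fraction(0))
        if step % b == 0:         # returns are only possible at multiples of b
            result.append(returned)
        alive = moved
    return result


def closed_form(b, n):
    """The assumed closed form for F_n = P(T_0 = n*b)."""
    return (Fraction(b - 1, n * b - 1)
            * comb(n * b - 1, n - 1)
            * Fraction((b - 1) ** (n * (b - 1)), b ** (n * b - 1)))


if __name__ == "__main__":
    for n, value in enumerate(first_return_probs(3, 4), start=1):
        print(n, value, closed_form(3, n))
```

For b = 2 the closed form reduces to the classical first-return probabilities of the simple symmetric walk, which gives an independent sanity check on the assumed expression.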


Around 1980, Chung gave a lecture to the pi-sigma society in the
University of Florida at Gainesville on the topic, and showed his
very old graph for b=[illegible], a=1, and demonstrated the inductive proof
of the result. This old document was lost. Early in 1993, in
the effort to recover the material he observed a peculiar numerical
equality that led to a logical proof of the result. Several experts
were consulted (incl. the French author Comtet whose little book
contains the formula) but nobody knew. Finally the result was
communicated to Takacs who gave a different proof. Since both proofs are
brief but based on rather diverse principles, one on the renewal
idea and the other on the "inclusion-exclusion" idea (Poincaré's
expression for the probability of the union of a number of arbitrary
events, much used also in elementary number theory (Hardy-Wright) such as
Mobius ...), they are presented here.

Vol. 5, pp. 554-558



Reprinted from
Canadian Journal of Mathematics



A distribution function $\phi(x)$ is assumed to have the following properties:
(1) $\phi(x)$ is non-decreasing,

(2) $\lim_{x \to -\infty} \phi(x) = 0$, $\lim_{x \to +\infty} \phi(x) = 1$,
(3) $\phi(x) = \lim_{y \to x+0} \phi(y)$ for every $x$.
The Fourier transform of $\phi(x)$ is defined by the Stieltjes integral

(4) $\Phi(t) = \int_{-\infty}^{\infty} e^{-itx} \, d\phi(x)$.

Let $\phi_1$ and $\phi_2$ be two distribution functions. Let a positive real number
$\delta$ be given. We consider the question, does there exist a positive $\epsilon$ such that
the condition
(5) $|\Phi_1(t) - \Phi_2(t)| < \epsilon$ for all $t$
implies
(6) $|\phi_1(x) - \phi_2(x)| < \delta$?
There are three separate problems here. (i) We may allow $\epsilon$ to depend on $\delta$,
$\phi_1$, and $x$. Then our question is, does the uniform convergence of $\Phi_2$ to $\Phi_1$ imply
a point-wise convergence of $\phi_2$ to $\phi_1$? The answer to this question is yes, as is
well known; in fact Lévy [1, p. 49] proves a theorem which states considerably
more than is needed for our problem. (ii) We may allow $\epsilon$ to depend on $\delta$ and
$\phi_1$, but not on $x$. Then our question is, does uniform convergence of $\Phi_2$ to $\Phi_1$
imply uniform convergence of $\phi_2$ to $\phi_1$? The answer to this question is also yes;
we prove this in Theorem 1 below. (iii) We may allow $\epsilon$ to depend on $\delta$ only.
In this case the answer is no, as we shall show by an example.

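Received June 10, 1952.

Counter-example for case (iii). Let $a$ and $b$ be real numbers with $b > a > 0$.
We consider the distribution functions

(7) $\phi_1(x) = \log\frac{x^2+b^2}{x^2+a^2} \Big/ \log\frac{b^2}{a^2}$ for $x < 0$, $\phi_1(x) = 1$ for $x \ge 0$,
(8) $\phi_2(x) = 1 - \phi_1(-x)$.
Then
(9) $\phi_1(x) - \phi_2(x) = \log\frac{x^2+b^2}{x^2+a^2} \Big/ \log\frac{b^2}{a^2}$, all $x$,
and in particular
(10) $\phi_1(0) - \phi_2(0) = 1$.
However, by (9) we have
(11) $\Phi_1(t) - \Phi_2(t) = \pi i \, \operatorname{sgn}(t) \, [e^{-a|t|} - e^{-b|t|}] / \log(b/a)$,
and therefore
(12) $|\Phi_1(t) - \Phi_2(t)| < \pi / \log(b/a)$.
Since $b/a$ may be arbitrarily large, we see that we can satisfy (5) for any $\epsilon > 0$
and still have (6) false for $\delta = 1$.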
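
The counter-example can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper. It assumes that (7) reads phi_1(x) = log((x^2+b^2)/(x^2+a^2)) / log(b^2/a^2) for x < 0 and phi_1(x) = 1 for x >= 0, that the transform is taken with the kernel e^{-itx}, and that (11) gives |Phi_1(t) - Phi_2(t)| = pi |e^{-a|t|} - e^{-b|t|}| / log(b/a). The function names, the truncation point `x_max` and the step `h` are hypothetical choices.

```python
import math


def phi1(x, a, b):
    """Distribution function (7), under the assumed reading."""
    if x >= 0:
        return 1.0
    return math.log((x * x + b * b) / (x * x + a * a)) / (2.0 * math.log(b / a))


def phi2(x, a, b):
    """Distribution function (8): the reflection of phi1."""
    return 1.0 - phi1(-x, a, b)


def transform_gap_closed_form(t, a, b):
    """|Phi_1(t) - Phi_2(t)| according to the assumed form of (11)."""
    return (math.pi * abs(math.exp(-a * abs(t)) - math.exp(-b * abs(t)))
            / math.log(b / a))


def transform_gap_numeric(t, a, b, x_max=2000.0, h=0.01):
    """Trapezoid estimate of |Phi_1(t) - Phi_2(t)|.

    The density of phi1 - phi2 is odd in x, so the transform reduces to
    2 * |integral over [0, inf) of sin(t x) * density(x) dx|, truncated
    at x_max.
    """
    log_ratio = math.log(b / a)

    def integrand(x):
        density = (x / (x * x + b * b) - x / (x * x + a * a)) / log_ratio
        return math.sin(t * x) * density

    n = int(round(x_max / h))
    total = 0.5 * (integrand(0.0) + integrand(n * h))
    for k in range(1, n):
        total += integrand(k * h)
    return 2.0 * abs(total * h)


if __name__ == "__main__":
    a, b, t = 1.0, 10.0, 0.7
    print(transform_gap_numeric(t, a, b), transform_gap_closed_form(t, a, b))
    print(phi1(0.0, a, b) - phi2(0.0, a, b), math.pi / math.log(b / a))
```

Increasing b/a shrinks the bound pi/log(b/a) only logarithmically, while the gap between the two distribution functions at x = 0 stays equal to 1.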

Statement of theorem for case (ii).
THEOREM 1. Let a positive $\delta$ and a distribution function $\phi_1$ be given. Then we
can find $\epsilon > 0$, depending only on $\delta$ and $\phi_1$, such that (5) implies (6) for all $x$ and
for all $\phi_2$.
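Let $h_\eta(x)$ be the function defined by
(13) $h_\eta(x) = \max(0, 1 - |x|/\eta)$.
Then (4) gives
(14) $\int_{-\infty}^{\infty} h_\eta(x - w) \, d\phi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{4 \sin^2(\tfrac12 \eta t)}{\eta t^2} \, e^{iwt} \, \Phi(t) \, dt$,

both sides being absolutely convergent integrals. If $\epsilon$ is chosen so that (5) is
satisfied, then (14) gives, for every $\eta$ and $w$,

(15) $\left| \int_{-\infty}^{\infty} h_\eta(x - w) \, [d\phi_1(x) - d\phi_2(x)] \right| < \epsilon$.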

Since $\phi_1$ is non-decreasing and (3) holds,
(16) $\phi_1(w) - \lim_{y \to w-0} \phi_1(y) = \lim_{\eta \to 0} \int h_\eta(x - w) \, d\phi_1(x)$,
the limits on both sides necessarily existing. Similarly (16) holds for $\phi_2$. Therefore
letting $\eta \to 0$ in (15), we have, for all $w$,
(17) $\left| \left( \phi_1(w) - \lim_{y \to w-0} \phi_1(y) \right) - \left( \phi_2(w) - \lim_{y \to w-0} \phi_2(y) \right) \right| \le \epsilon$.
That is to say, at every point the discontinuities in $\phi_1$ and $\phi_2$ differ by at most $\epsilon$.
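Another consequence of (15) is obtained by writing in turn $w + \eta$, $w + 2\eta$, ...,
$w + N\eta$ for $w$ and adding the resulting inequalities. From the definition of
$h_\eta(x)$,
$\sum_{m=1}^{N} h_\eta(x - w - m\eta) = 1$, $w + \eta \le x \le w + N\eta$,

$0 \le \sum_{m=1}^{N} h_\eta(x - w - m\eta) \le 1$, $w \le x \le w + (N+1)\eta$,
and the sum is zero for all other $x$.
Using the fact that $\phi_1$ and $\phi_2$ are non-decreasing, adding together (15) for these
$N$ values of $w$ therefore gives
(18) $\int_{w}^{w+(N+1)\eta} d\phi_2(x) > \int_{w+\eta}^{w+N\eta} d\phi_1(x) - N\epsilon$.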
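
The step from (15) to (18) rests on the shifted triangular functions adding up to 1 on the inner interval and vanishing outside the outer one. The sketch below checks this with exact rational arithmetic. It is an illustration only; the function names `hat` and `hat_sum` and the sample values of w, eta and N are hypothetical.

```python
from fractions import Fraction


def hat(x, eta):
    """The triangular function h_eta(x) = max(0, 1 - |x| / eta) of (13)."""
    return max(Fraction(0), 1 - abs(x) / eta)


def hat_sum(x, w, eta, n_terms):
    """Sum of h_eta(x - w - m * eta) over m = 1, ..., n_terms."""
    return sum(hat(x - w - m * eta, eta) for m in range(1, n_terms + 1))


if __name__ == "__main__":
    w, eta, n_terms = Fraction(1, 3), Fraction(2, 7), 5
    # Sample points running from w to w + (N + 1) * eta.
    for k in range(0, 13):
        x = w + (n_terms + 1) * eta * Fraction(k, 12)
        print(x, hat_sum(x, w, eta, n_terms))
```

On the inner interval exactly two neighbouring triangles overlap at any point, and their values are complementary, which is why the sum is identically 1 there.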

We write for brevity $\alpha = \tfrac14 \delta$. We can divide the whole line $(-\infty, +\infty)$
into a finite set of intervals $I_1, \ldots, I_m$ with the following properties. (i) Each
$I_n$ is closed on the left and open on the right. (ii) The total variation of $\phi_1(x)$
on $I_n$ is less than $\alpha$. Let $L_n$ and $R_n$ be the limits to which $\phi_1(x)$ tends as $x$ tends
to the left and right end-points within $I_n$. Similarly let $L_n'$ and $R_n'$ be the limits
of $\phi_2$. By (17) we have
(19) $R_n' - R_n \le L_{n+1}' - L_{n+1} + \epsilon$.
Now let $\lambda$ be the length of the shortest $I_n$, let $\Lambda$ be the combined length of
$I_2, \ldots, I_{m-1}$, and let $N$ be an integer greater than $(2\Lambda/\lambda)$. The choice of $N$ and
of the $I_n$ depends only on $\delta$ and $\phi_1$ and is independent of $\epsilon$. Given any $I_n$ with
$1 < n < m$, we can choose two points $x$, $x'$ inside $I_n$ such that
(20) $x' - x > \tfrac12 \lambda$.
Then we apply (18) with $w = x$, $w + \eta = x'$, giving
(21) $\phi_1(x') + \phi_2(x' + N\eta) > \phi_2(x) + \phi_1(x + N\eta) - N\epsilon$.
By the definition of $N$, the point $(x + N\eta)$ belongs to $I_m$ and so
$\phi_1(x + N\eta) > 1 - \alpha$, $\phi_2(x' + N\eta) \le 1$.
Hence (21) becomes
(22) $\phi_1(x') > \phi_2(x) - N\epsilon - \alpha$.
Again, applying (18) with $w = x - N\eta$, $w + \eta = x' - N\eta$,
$\phi_2(x') + \phi_1(x' - N\eta) > \phi_1(x) + \phi_2(x - N\eta) - N\epsilon$,
and since $(x' - N\eta)$ belongs to $I_1$ this becomes
(23) $\phi_2(x') > \phi_1(x) - N\epsilon - \alpha$.
Let $x'$ and $x$ tend respectively to the right and left end-points of $I_n$.
Then (22) and (23) give
(24) $L_n' \le R_n + N\epsilon + \alpha$,
(25) $R_n' \ge L_n - N\epsilon - \alpha$.


These inequalities, (24) and (25), which have been proved for $1 < n < m$, are
trivially true also for $n = 1$ and $n = m$.
Writing $n + 1$ for $n$ in (24) and combining it with (19), we find
$R_n' \le R_n + R_{n+1} - L_{n+1} + (N + 1)\epsilon + \alpha$
(26) $< R_n + (N + 1)\epsilon + 2\alpha$.

Similarly (25) combined with (19) gives
(27) $L_n' > L_n - (N + 1)\epsilon - 2\alpha$.
Now $R_n'$ and $L_n'$ are the upper and lower bounds of $\phi_2$ in $I_n$, and $R_n$ and $L_n$ differ
by at most $\alpha$. Therefore (26) and (27) imply
(28) $|\phi_2(x) - \phi_1(x)| < (N + 1)\epsilon + 3\alpha = (N + 1)\epsilon + \tfrac34 \delta$
for all $x$ in $(-\infty, +\infty)$. The choice of $N$ depended only on $\delta$ and $\phi_1$. Given $\delta$
and $\phi_1$ we can choose $\epsilon$ to be any number less than $(\delta/(4(N + 1)))$, and then
(5) will imply (6). This proves the theorem.

Additional remarks. Another theorem can be derived from Theorem 1 by
weakening both the hypothesis and the conclusion slightly. Let us define the
distance between two distributions $\phi_1$ and $\phi_2$ by
(29) $\|\phi_1 - \phi_2\| = \max(\{\phi_1, \phi_2\}, \{\phi_2, \phi_1\})$,
(30) $\{\phi_1, \phi_2\} = \max_{x, x'} (\min(x' - x, \phi_1(x) - \phi_2(x')))$.
This definition of the distance is equivalent to that given by Lévy [1, p. 47].
It is easy to see that $\|\phi_1 - \phi_2\|$ is the side of the largest square that can be
inserted between the graphs $y = \phi_1(x)$ and $y = \phi_2(x)$ when these are plotted in
cartesian coordinates in the usual way. Thus the convergence defined by
$\|\phi_2 - \phi_1\| \to 0$ is topologically weaker than uniform convergence of $\phi_2$ to $\phi_1$,
but topologically stronger than point-wise convergence of $\phi_2$ to $\phi_1$. The modified
form of Theorem 1 is
THEOREM 2. Let $\delta$ and $\phi_1$ be given. Then we can find $\epsilon > 0$ depending only on
$\delta$ and $\phi_1$, such that
(31) $|\Phi_1(t) - \Phi_2(t)| < \epsilon$ for all $|t| <$ [illegible]
implies
(32) $\|\phi_2 - \phi_1\| < \delta$.
The proof is similar to the proof of Theorem 1, only simpler. The counter-
example given previously also shows that the weaker conclusion (32) does not
follow from (5) with $\epsilon$ depending only on $\delta$.

F. J. DYSON

The author is indebted to Dr. K. L. Chung for suggesting this problem to
him, and for several stimulating discussions.

1. P. Lévy, Théorie de l'addition des variables aléatoires (Paris, 1937).

Cornell University