Chapter 1
Combinatorial inequalities for binomial tails

Abstract

Our next goal is to find explicit expressions for the rate function $\mathbb{I}(x)$ of the binomial tail. We shall use a simple inequality related to Stirling's approximation for $n!$. More exact inequalities and Stirling's formula are illustrated numerically.

1  Bounds for factorials

It is easy to capture a factorial $n!$ between two relatively simple expressions. To do so, notice that the Riemann sums for the integral of $\ln x$ involve $\ln n!$. Indeed,
\[ \sum_{k=1}^{n-1} \ln k \le \int_1^n \ln x\,dx \le \sum_{k=1}^{n} \ln k . \]
Since $\int_1^n \ln x\,dx = n\ln n - n + 1$, this gives $e\,n^n e^{-n} \le n! \le e\,n^{n+1} e^{-n}$. In particular, the rough asymptotic of factorials is $n! \asymp (n/e)^n$ as $n \to \infty$.

A more careful analysis is presented in Feller, Vol. I. We represent the expression $\ln(n!/\sqrt{n}) = \ln 2 + \ln 3 + \dots + \ln(n-1) + \frac{1}{2}\ln n$ in two different ways as the area of polygonal regions that lie above, or below, the curve $y = \ln x$. First, write it as
\[ \tfrac{1}{2}(\ln 1 + \ln 2) + \tfrac{1}{2}(\ln 2 + \ln 3) + \tfrac{1}{2}(\ln 3 + \ln 4) + \dots + \tfrac{1}{2}(\ln(n-1) + \ln n), \]
which is the total area of trapezoids that lie under the curve $y = \ln x$. This shows that $\ln(n!/\sqrt{n}) \le \int_1^n \ln x\,dx$.

Now notice that $\ln k$ is the area of a trapezoid bounded by $k - \frac{1}{2} < x < k + \frac{1}{2}$ and the tangent line to $y = \ln x$ at $x = k$. Therefore $\ln 2 + \ln 3 + \dots + \ln(n-1) > \int_{3/2}^{n-1/2} \ln x\,dx$. Since $\frac{1}{2}\ln n > \int_{n-1/2}^{n} \ln x\,dx$, we get the lower bound in the following inequality.

\[ \int_{3/2}^{n} \ln x\,dx \le \ln\frac{n!}{\sqrt{n}} \le \int_{1}^{n} \ln x\,dx \]
The integrals can be computed, so we get
\[ \left(\frac{2e}{3}\right)^{3/2} n^n e^{-n} \le \frac{n!}{\sqrt{n}} \le e\, n^n e^{-n} \]
This proves the following.

Theorem 1 $2.4\, n^{n+1/2} e^{-n} \le n! \le 2.8\, n^{n+1/2} e^{-n}$
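The two-sided bound of Theorem 1 is easy to check numerically. The following is a minimal sketch, not part of the original notes; the function name `factorial_bounds` is an illustrative choice, and floating-point arithmetic limits it to moderate values of n.

```python
import math

def factorial_bounds(n):
    """Return (lower, n!, upper) using the constants 2.4 and 2.8 of Theorem 1."""
    core = n ** (n + 0.5) * math.exp(-n)   # n^(n+1/2) e^(-n)
    return 2.4 * core, math.factorial(n), 2.8 * core

for n in (1, 2, 5, 10, 20):
    lower, exact, upper = factorial_bounds(n)
    # The last column is n! / (n^(n+1/2) e^(-n)), which stays between 2.4 and 2.8.
    print(n, lower <= exact <= upper, exact / (n ** (n + 0.5) * math.exp(-n)))
```

The printed ratio shows how much room the constants 2.4 and 2.8 leave on either side.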

2  Stirling's formula

Feller gives bounds that squeeze $n!$ more tightly:
\[ \sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{1/(12n+1)} \le n! \le \sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{1/(12n)} \tag{1} \]
This implies the more refined approximation
\[ n! \approx \sqrt{2\pi}\, e^{-n} n^{n+1/2} \approx 2.507\, n^{n+1/2} e^{-n}, \tag{2} \]
the so-called Stirling's approximation to the factorial. The approximation is quite accurate even for small values of $n$. For large $n$ the difference between $n!$ and $\sqrt{2\pi}\, e^{-n} n^{n+1/2}$ increases indefinitely, but the relative error of the approximation converges to 0, and the quotient of the two expressions converges to 1; for a numerical illustration see Table 1.

Remark: The main advantage of Theorem 1 or (1) over Stirling's formula is that they give explicit error bounds for finite $n$.

Table 1: Stirling's approximation to factorial

 n             n!  Stirling's approx  relative error      lower bound      upper bound
 1              1                0.9           0.078              1.0              1.0
 2              2                1.9           0.040              2.0              2.0
 3              6                5.8           0.027              6.0              6.0
 4             24               23.5           0.021             24.0             24.0
 5            120              118.0           0.017            120.0            120.0
 6            720              710.1           0.014            719.9            720.0
 7           5040             4980.4           0.012           5039.3           5040.0
 8          40320            39902.4           0.010          40315.9          40320.2
 9         362880           359536.9           0.009         362850.6         362881.4
10        3628800          3598695.6           0.008        3628560.1        3628810.0
11       39916800         39615624.7           0.008       39914609.1       39916882.8
12      479001600        475687482.4           0.007      478979424.2      479002364.4
13     6227020800       6187239422.4           0.006     6226774364.5     6227028606.8
14    87178291200      86661001001.5           0.006    87175308107.7    87178378579.8
15  1307674368000    1300430711108.0           0.006  1307635295008.7  1307675431760.2

Here Stirling's approx is $\sqrt{2\pi}\, n^{n+1/2} e^{-n}$, the lower bound is $\sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{1/(12n+1)}$, and the upper bound is $\sqrt{2\pi}\, n^{n+1/2} e^{-n} e^{1/(12n)}$.
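The columns of Table 1 can be recomputed with a few lines of code. This is a sketch under the assumption that the relative error column is $(n! - \text{approx})/n!$; the function name `stirling_row` is illustrative, and double-precision rounding may change the last printed digit for the larger values of n.

```python
import math

def stirling_row(n):
    """Return (n!, Stirling approximation, relative error, lower bound, upper bound)."""
    exact = math.factorial(n)
    approx = math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)
    rel_err = (exact - approx) / exact            # assumed definition of the error column
    lower = approx * math.exp(1 / (12 * n + 1))   # left-hand side of (1)
    upper = approx * math.exp(1 / (12 * n))       # right-hand side of (1)
    return exact, approx, rel_err, lower, upper

for n in range(1, 16):
    exact, approx, rel_err, lower, upper = stirling_row(n)
    print(f"{n:2d} {exact:14d} {approx:16.1f} {rel_err:6.3f} {lower:16.1f} {upper:16.1f}")
```

Since (1) gives $n!/\text{approx} \le e^{1/(12n)}$, the relative error computed this way is always below $1/(12n)$.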

3  Bounds for Binomial probabilities

Let $X$ be binomial $\mathrm{Bin}(n,p)$. In Lecture we introduced the rate function $\mathbb{I}(x)$ as the negative of the slope in the regression fit
\[ \ln \Pr(\bar{X}_n > x) \approx a(x) - \mathbb{I}(x)\, n \]
Numerical evidence in Table (page ) supports our claim that for $p = 1/2$ the rate function is given by $\mathbb{I}(x) = x\ln x + (1-x)\ln(1-x) + \ln 2$. Now we derive the explicit formula for $\mathbb{I}(x)$ for general $p$, and give two companion inequalities. (We will make no attempt to get sharp bounds.)

Our first task is to follow the idea indicated by the computer generated data in Lecture and find explicit inequalities for $\Pr(\frac{1}{n} X > x)$ when $x > p$.

Theorem 2 Let
\[ \mathbb{I}(x) = x \ln\frac{x}{p} + (1-x)\ln\frac{1-x}{1-p} \tag{3} \]
If $X$ is binomial $\mathrm{Bin}(n,p)$ then for all $x > p$ and $n > \frac{2}{1-x}$
\[ \Pr(X > nx) \le \frac{C}{\sqrt{n}} \exp(-n\mathbb{I}(x)) \tag{4} \]
where $C = C(x,p) = 0.7\,\frac{(1-p)\,x}{(x-p)\sqrt{x(1-x)}}$.

Moreover, for all $x > p$ and $n > \frac{1}{x-p}$ we have
\[ \Pr(X > nx) \ge c\,\frac{1}{\sqrt{n}} \exp(-n\mathbb{I}(x)) \tag{5} \]
where $c = 0.15\,\frac{p(1-x)}{\sqrt{x}\,(1-p)^{3/2}}$.

Similarly, for all $x < p$ and $n$ large enough we have
\[ \frac{c}{\sqrt{n}} \exp(-n\mathbb{I}(x)) \le \Pr(X < nx) \le \frac{C}{\sqrt{n}} \exp(-n\mathbb{I}(x)) \]
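The two inequalities for the upper tail can be compared with exact binomial sums. The sketch below is an illustration only: the function names and the values $p = 0.3$, $x = 0.5$ are assumptions made for the example, and the upper constant is taken as $0.7\,(1-p)x/((x-p)\sqrt{x(1-x)})$, which is what the proof of the upper bound produces once the factor $(1-p)\frac{x}{x-p}$ is kept.

```python
import math

def rate(x, p):
    """Rate function (3): x ln(x/p) + (1-x) ln((1-x)/(1-p)), for 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def upper_tail(n, p, x):
    """Exact Pr(X > n x) for X ~ Bin(n, p), summing the probability mass function."""
    k = math.floor(n * x)
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1, n + 1))

def tail_bounds(n, p, x):
    """Lower bound (5) and upper bound (4); needs n > 2/(1-x) and n > 1/(x-p)."""
    scale = math.exp(-n * rate(x, p)) / math.sqrt(n)
    c = 0.15 * p * (1 - x) / (math.sqrt(x) * (1 - p) ** 1.5)
    # Upper constant including the factor (1-p) x / (x-p) from the proof (see lead-in).
    C = 0.7 * (1 - p) * x / ((x - p) * math.sqrt(x * (1 - x)))
    return c * scale, C * scale

p, x = 0.3, 0.5   # illustrative values, not from the notes
for n in (20, 50, 100, 200):
    lower, upper = tail_bounds(n, p, x)
    print(n, lower, upper_tail(n, p, x), upper)
```

The exact sum is practical only for moderate n; for larger n the binomial coefficients no longer fit into floating point and a log-scale computation is needed.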

Remark: A calculus exercise shows that $\mathbb{I}(x)$ is a convex function with the unique minimum at $x = p$. Indeed, the derivatives are
\[ \mathbb{I}'(x) = \ln\frac{x}{1-x} - \ln\frac{p}{1-p}, \qquad \mathbb{I}''(x) = \frac{1}{x(1-x)} > 0 . \]

This implies that $\mathbb{I}(x)$ is an increasing function for $x > p$.

Remark:  In Lecture (Theorem and Corollary ) we give a simpler proof of another version of bound (4).

To prove the lower bound (5) let $k = [xn]$ be the integer part of $xn$. Since $n > \frac{1}{x-p}$ we have $p < x - \frac{1}{n} < \frac{k}{n} \le x$. In particular, since $\mathbb{I}(x)$ is increasing for $x > p$, we have
\[ \mathbb{I}\!\left(\frac{k}{n}\right) \le \mathbb{I}(x) \tag{6} \]
Notice that
\[ \frac{\Pr(X = k+1)}{\Pr(X = k)} = \frac{p(n-k)}{(k+1)(1-p)} \ge \frac{p}{1-p}\,\frac{1 - \frac{k}{n}}{1 + \frac{k}{n}} \ge \frac{p}{1-p}\,\frac{1-x}{1+x} \ge \frac{p}{2}\,\frac{1-x}{1-p} \tag{7} \]
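The lower bound now follows from (6), (7), the fact that $\Pr(X > nx) \ge \Pr(X = k+1)$, and the inequalities for factorials in Theorem 1. Namely,
\[ \Pr(X > nx) \ge \Pr(X = k+1) \ge \frac{p}{2}\,\frac{1-x}{1-p}\,\Pr(X = k) \]
and
\[ \begin{aligned}
\Pr(X = k) &\ge \frac{2.4}{2.8^2}\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{\frac{k}{n}\left(1-\frac{k}{n}\right)}} \left( \left(\frac{p}{k/n}\right)^{k/n} \left(\frac{1-p}{1-k/n}\right)^{1-k/n} \right)^{n} \\
&= \frac{2.4}{2.8^2}\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{\frac{k}{n}\left(1-\frac{k}{n}\right)}} \exp(-n\mathbb{I}(k/n)) \\
&\ge 0.3\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{x(1-p)}} \exp(-n\mathbb{I}(x)) .
\end{aligned} \]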

To prove the upper bound, we will show that $\Pr(X > nx)$ is, up to a multiplicative factor, comparable to $\Pr(X = k+1)$ for $k = [nx]$. To see this, notice that, similarly to (7), we have
\[ \frac{\Pr(X = j+1)}{\Pr(X = j)} = \frac{p}{1-p}\,\frac{n-j}{j+1} \le \frac{p}{1-p}\,\frac{n-k}{k+1} \quad \text{for all } j \ge k . \]

Therefore, for $j \ge 0$,
\[ \frac{\Pr(X = k+j)}{\Pr(X = k)} = \frac{\Pr(X = k+1)}{\Pr(X = k)}\,\frac{\Pr(X = k+2)}{\Pr(X = k+1)} \cdots \frac{\Pr(X = k+j)}{\Pr(X = k+j-1)} \le \left( \frac{p}{1-p}\,\frac{n-k}{k+1} \right)^{j} \]
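Let $r = \frac{p}{1-p}\,\frac{n-k-1}{k+2}$, where $k = [nx]$, and let $x' = \frac{k+2}{n+1} \ge \frac{nx+1}{n+1} \ge x > p$. The inequality $r < 1$ is equivalent to $x' > p$, so we have $r < 1$. The same argument as above, applied to the ratios with $j \ge k+1$, gives $\Pr(X = k+1+j) \le r^{j}\,\Pr(X = k+1)$ for $j \ge 0$. Therefore
\[ \Pr(X > nx) = \sum_{j=k+1}^{n} \Pr(X = j) \le \Pr(X = k+1) \sum_{j=0}^{\infty} r^{j} = \Pr(X = k+1)\,\frac{(1-p)(k+2)}{k+2-(n+1)p} = \Pr(X = k+1)\,\frac{(1-p)\,x'}{x'-p} . \]
In particular, since $x \mapsto \frac{x}{x-p} = 1 + \frac{p}{x-p}$ is a decreasing function of $x$, this shows that
\[ \Pr(X > nx) \le (1-p)\,\frac{x}{x-p}\,\Pr(X = k+1) \]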

Now we proceed similarly to the proof of the lower bound.

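\[ \begin{aligned}
\Pr(X = k+1) &\le \frac{2.8}{2.4^2}\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{\frac{k+1}{n}\left(1-\frac{k+1}{n}\right)}} \left( \left(\frac{p}{(k+1)/n}\right)^{(k+1)/n} \left(\frac{1-p}{1-(k+1)/n}\right)^{1-(k+1)/n} \right)^{n} \\
&= \frac{2.8}{2.4^2}\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{\frac{k+1}{n}\left(1-\frac{k+1}{n}\right)}} \exp\!\left(-n\mathbb{I}\!\left(\frac{k+1}{n}\right)\right) \\
&\le 0.49\,\frac{1}{\sqrt{n}}\,\frac{1}{\sqrt{x}\,\sqrt{1-\frac{1}{n}-x}} \exp(-n\mathbb{I}(x)),
\end{aligned} \]
where in the last bound we used the fact that $\mathbb{I}(\cdot)$ is increasing on $(p,1)$, and $\frac{k+1}{n} > x$. For $n > \frac{2}{1-x}$ we have $1 - \frac{1}{n} - x > \frac{1-x}{2}$, so together with the preceding bound on $\Pr(X > nx)$ this implies (4).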

Theorem 2 yields the following rough asymptotic: $\Pr(X > nx) \asymp e^{-n\mathbb{I}(x)}$ for $x > p$. Using more advanced methods one can show that $\Pr(X > nx) \approx \frac{c(x,p)}{\sqrt{n}} \exp(-n\mathbb{I}(x))$ as $n \to \infty$, which shows that (5) is sharp up to a multiplicative constant.
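The rough asymptotic says that $-\frac{1}{n}\ln\Pr(X > nx)$ approaches $\mathbb{I}(x)$. A possible numerical illustration is sketched below; it works on the logarithmic scale so that large n does not overflow. The helper names and the values $p = 0.3$, $x = 0.5$ are assumptions made for the example.

```python
import math

def rate(x, p):
    """Rate function (3)."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def log_upper_tail(n, p, x):
    """ln Pr(X > n x) for X ~ Bin(n, p), computed stably via log-sum-exp."""
    k = math.floor(n * x)
    logs = [math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
            + j * math.log(p) + (n - j) * math.log(1 - p)
            for j in range(k + 1, n + 1)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

p, x = 0.3, 0.5   # illustrative values, not from the notes
for n in (20, 200, 2000):
    # The second column should approach the third as n grows.
    print(n, -log_upper_tail(n, p, x) / n, rate(x, p))
```

The gap between the two columns shrinks roughly like $\frac{\ln n}{2n}$, which is the contribution of the factor $\frac{1}{\sqrt{n}}$ in (4) and (5).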

Example 1 Celatrams is a startup insurance company that plans to insure $n = 10{,}000$ computer owners in the Cincinnati area against lightning damage to their computers. According to the insurance policy, in case of an accident the computer owner will receive $5,000 regardless of the actual damage. If the probability of lightning hitting a computer in the Cincinnati area is about $p = 0.001$ (Cincinnati is a lightning capital of the U.S.A.), what yearly premium should they charge in order to have a chance of less than $10^{-5}$ of going out of business due to claims exceeding their assets, which consist solely of the premium collected?

We model the number of claims $X$ as a binomial $\mathrm{Bin}(n = 10^4, p = 10^{-3})$ random variable. With $P$ denoting the yearly premium, the equation is

\[ \Pr(5000\,X > 10000\,P) \approx 10^{-5} \]

To solve this equation we may try normal approximation, rough asymptotic, Poisson approximation, and compare them with exact answers (symbolic programs are good at manipulating large integers!). We expect good accuracy from the Poisson approximation, conservative answer from rough asymptotic, and perhaps an inaccurate answer from the normal distribution.
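Two of these approaches, the exact computation and the rough asymptotic, can be sketched as follows. This is an illustration under stated assumptions rather than a worked solution from the notes: the function names are hypothetical, the exact tail uses floating-point arithmetic instead of symbolic integers, and the premium per customer is read off the equation as $P = 5000\,m/n$, where $m$ is the largest number of claims the company must be able to pay.

```python
import math

def binomial_pmf(n, p):
    """Return [Pr(X = j) for j = 0..n] for X ~ Bin(n, p), via the ratio recursion."""
    pmf = [(1 - p) ** n]
    for j in range(n):
        pmf.append(pmf[-1] * p / (1 - p) * (n - j) / (j + 1))
    return pmf

def exact_claim_count(n, p, risk):
    """Smallest m with Pr(X > m) < risk, using exact tail sums."""
    pmf = binomial_pmf(n, p)
    tail, m = 0.0, n          # tail holds Pr(X > m), built from the top down
    while m > 0 and tail + pmf[m] < risk:
        tail += pmf[m]
        m -= 1
    return m

def rate(x, p):
    """Rate function (3)."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def rough_claim_count(n, p, risk):
    """Smallest m above n p with exp(-n I(m/n)) < risk (rough asymptotic)."""
    m = math.floor(n * p) + 1
    while m < n and math.exp(-n * rate(m / n, p)) >= risk:
        m += 1
    return m

n, p, risk, payout = 10_000, 0.001, 1e-5, 5000
for label, m in (("exact", exact_claim_count(n, p, risk)),
                 ("rough asymptotic", rough_claim_count(n, p, risk))):
    print(label, "claims covered:", m, "premium per customer:", payout * m / n)
```

Because $\Pr(X \ge nx) \le \exp(-n\mathbb{I}(x))$ for $x > p$, the rough asymptotic can never give a smaller claim count than the exact computation, which is the sense in which it is conservative.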

4  Exercises

The following exercises illustrate the concept of rough asymptotic.

Exercise 1 Let $p_n = 1/n$, $a_n = \frac{1}{n^2}$. Can we claim that $p_n - a_n \to 0$? Can we claim that $p_n \approx a_n$? Can we claim that $p_n \asymp a_n$?

Exercise 2 Let $p_n = 2^{-n}$, $a_n = 3^{-n}$. Can we claim that $p_n - a_n \to 0$? Can we claim that $p_n \approx a_n$? Can we claim that $p_n \asymp a_n$?

Exercise 3 Let $p_n = 2^{-n}$, $a_n = n\,2^{-n}$. Can we claim that $p_n - a_n \to 0$? Can we claim that $p_n \approx a_n$? Can we claim that $p_n \asymp a_n$?

The next exercises illustrate the use of rough expansions for binomial tail probabilities.

Exercise 4 Suppose $X$ is $\mathrm{Bin}(n,p)$ with $p = 0.5$. Determine $n$ such that $\Pr(X > 0.7n) = 10^{-5}$.

(For the same exercise with a smaller probability $\Pr(X > 0.7n) = 10^{-20}$ my answers are: $n = 536, 556, 504, 546$. Can you duplicate these?)

Exercise 5 For a binomial $X$ use the approximation $\Pr(X > k) \asymp \exp(-n\mathbb{I}(k/n))$ and formula (3) to solve the "airline overbooking problem" from Example ; for a sketch of the solution see page .

Answer: For a no-show probability of $p = 0.2$ and an airplane with a capacity of 300 passengers, the airline can overbook by about $B \approx ....$ without exceeding the probability of $10^{-8}$ for a bumped flight, a requirement that could have been imposed by a regulating agency.

(The exact answer according to Excel is $B = 29$; but we will not know how accurate this answer is until we try other methods!)

Exercise 6 Show that for $x > p$ we have the rough asymptotic $\Pr(x \le \bar{X}_n \le y) \asymp \exp(-n\mathbb{I}(x))$ for all $y > x$.

Exercise 7 Show that if $\mathbb{I}(x)$ is given by (3) then $\mathbb{I}(x) \ge 2(x-p)^2$.

Hint: What is $\mathbb{I}''(x)$?

Exercise 8 Expand the rate function $\mathbb{I}(x)$ given by (3) into a power series at $x = p$.

