Skip to main content
Logo image

Section 21.4 A Slice of the Prime Number Theorem

We end this chapter with a substantial piece of a real proof in the direction of the Prime Number Theorem, courtesy of a function also first introduced by Chebyshev. The argument is dense, but requires nothing beyond calculus and a willingness to allow a lot of algebraic and integral manipulation for the purposes of estimation. (See a good calculus text 17  to review integral concepts.)

Subsection 21.4.1 Functions to know

First, we’ll review the main function. Think of the prime counting function \(\pi\) as a so-called step function, where every time you hit a new prime you add 1. The picture reminds you of this attribute.
Plot of prime pi function
Figure 21.4.1. Plot of prime pi function
Let’s define a new function in that spirit. Instead of adding 1 each time \(x\) hits a prime, we will add \(\log(p)\) (recall that this is the natural logarithm) each time we hit a prime \(p\text{.}\) Of course, this value we add will get bigger as \(p\) gets bigger.
Plot of Chebyshev theta function
Figure 21.4.2. Plot of Chebyshev theta function

Definition 21.4.3.

We call the function given by this formula Chebyshev’s theta function:
\begin{equation*} \Theta(x)=\sum_{p\leq x}\log(p)\text{.} \end{equation*}

Sage note 21.4.4. Python can do math too.

We include an interactive version so you can see the code.
The syntax math.log is referring to Python’s builtin calculation of the natural logarithm, accessible in the math module 18 . This is sometimes faster and easier to use than Sage’s more powerful capabilities, because if you put an integer in Sage’s logarithm, it will normally not approximate it. All we want here is an easy approximation, so this should be faster.
Earlier in this chapter we noted that the Prime Number Theorem is logically equivalent to the limit \(\lim_{x\to\infty}\frac{\pi(x)}{x/\log(x)}=1\text{.}\) There are actually many such logical equivalences. One of them involves \(\Theta\text{:}\)
\begin{equation*} \lim_{x\to\infty}\frac{\Theta(x)}{x}=1 \end{equation*}
This is certainly numerically plausible. Here is a plot of both limits, along with the constant function \(1\text{.}\)
Plot of Chebyshev theta function limits
Figure 21.4.5. Plot of Chebyshev theta function limits
In the interact below, there is an option for an \(Li\) versus \(x/log(x)\) version of the theorem. Note how much better the prime number theorem limit looks with the \(Li\) version.
As usual, proving such things completely is beyond the level of this course, but we can prove the following partial implication.
The rest of this section is the proof.

Subsection 21.4.2 Getting a formula with sleights of hand

In order to prove this implication, we will first need a formula telling us more about \(\Theta(x)\text{.}\) Our strategy 19  will be to first turn \(\Theta(x)\) into an even more hopelessly complicated sum, but then use calculus trickily to get something usable by summing up integrals.
In order to do this, we need two subsidiary functions. First, recall the notation \(\lfloor x\rfloor\) for the greatest integer less than \(x\text{.}\) Secondly:

Definition 21.4.7.

We let \(a(n)\) be the prime number indicator function defined by
\begin{equation*} a(n)=\begin{cases}1 & \text{ if }n\text{ is prime}\\0 & \text{ otherwise }\end{cases}\text{.} \end{equation*}
Another way to say this is
\begin{equation*} a(n)=\pi(n)-\pi(n-1)\text{.} \end{equation*}
Prime pi versus indicator function
Figure 21.4.8. Prime \(\pi\) versus \(a\) indicator function
One can get the indicator function just by writing prime_pi(x)-prime_pi(x-1).
For convenience we write \(m=\lfloor x\rfloor\text{.}\) Then we can rewrite these step functions as weighted sums of \(a(n)\text{:}\)
\begin{equation*} \pi(x)=\sum_{n=1}^{m}a(n)\text{ and }\Theta(x)=\sum_{n=1}^{m}a(n)\log(n)\text{.} \end{equation*}
Our goal is to rearrange \(\Theta\) to be a sum of terms involving \(\pi\text{.}\) First we turn \(\Theta\) into a difference of sums by rearranging (and using \(\log(1)=0\)):
\begin{equation*} \Theta(x)=\sum_{1\leq n\leq x}a(n)\log(n)=\sum_{n=1}^{m}a(n)\log(n)= \end{equation*}
\begin{equation*} \sum_{n=2}^{m}[\pi(n)-\pi(n-1)]\log(n)=\sum_{n=2}^{m}\pi(n)\log(n)-\sum_{n=1}^{m-1}\pi(n)\log(n+1) \end{equation*}
This difference of sums can be combined into a single sum, with just two left over terms 20 .
\begin{equation*} \Theta(x)=\sum_{n=2}^{m-1}\pi(n)[\log(n)-\log(n+1)]+\pi(m)\log(m)-\pi(1)\log(2)\text{.} \end{equation*}
To continue, we will rewrite almost all of this as single integral. We use a few key facts:
  • The difference which appears in the the sum in the immediately preceding \(\Theta\) formula can be considered as an integral, \(-\left(\log(n+1)-\log(n)\right)=-\int_{n}^{n+1}\frac{dt}{t}\text{.}\)
  • We have that \(\pi(x)\) is constant on the interval \([\lfloor x\rfloor,x]\text{,}\) and in particular on any given interval \([n,n+1)\text{,}\) so it may be factored out of any integral from \(n\) to \(n+1\text{.}\)
  • We can rearrange and add sums and integrals as usual.
  • Note that \(\pi(1)=0\text{,}\) so the second extra term is zero.
This yields the following rewrite.
\begin{equation*} \Theta(x) = -\sum_{n=2}^{m-1}\Big[\pi(n)\int_{n}^{n+1}\frac{dt}{t}\Big]+\pi(m)\log(m) \end{equation*}
\begin{equation*} =-\sum_{n=2}^{m-1}\Big[\pi(n)\int_{n}^{n+1}\frac{dt}{t}\Big]+\pi(m)\log(m)-\pi(x)\log(x)+\pi(x)\log(x) \end{equation*}
\begin{equation*} =-\int_{2}^{m}\frac{\pi(t)dt}{t}+\pi(x)\log(x)-\int_{m}^{x}\frac{\pi(t)dt}{t} =\pi(x)\log(x)-\int_{2}^{x}\frac{\pi(t)dt}{t}\text{.} \end{equation*}
Now we have a formula for \(\Theta\) which will allow us to prove something.

Subsection 21.4.3 Finish the proof

We can divide the formula \(\Theta(x)=\pi(x)\log(x)-\int_{2}^{x}\frac{\pi(t)dt}{t}\) by \(x\text{:}\)
\begin{equation*} \frac{\Theta(x)}{x}=\frac{\pi(x)\log(x)}{x}-\frac{\int_{2}^{x}\frac{\pi(t)}{t}dt}{x}\text{.} \end{equation*}
Given that the Prime Number Theorem says that \(\lim_{x\to\infty}\) of the fraction with \(\pi(x)\) in it is 1, proving that \(\lim_{x\to\infty}\frac{\Theta(x)}{x}\) is also 1 is equivalent to proving
\begin{equation*} \lim_{x\to\infty}\frac{1}{x}\int_{2}^{x}\frac{\pi(t)}{t}dt=0\text{.} \end{equation*}
Now, the Prime Number Theorem also implies that \(\frac{\pi(t)}{t}\) and \(\frac{1}{\log(t)}\) are asymptotic (recall Definition 21.2.1), so that their averaging integrals
\begin{equation*} \frac{1}{x}\int_{2}^{x}\frac{\pi(t)}{t}dt\text{ and } \frac{1}{x}\int_{2}^{x}\frac{dt}{\log(t)} \end{equation*}
clearly are also asymptotic.
This reduces our proof to showing that the average value of \(1/\log(t)\) tends to zero. Since integrals have a graphical interpretation, we now use the following graph of the integral limit to finish the proof!
Consider that one possible upper sum for the integral of \(1/\log(t)\) between 2 and 9 is the area of the two rectangles shown below, one with area \(\frac{1}{\log(2)}(\sqrt{9}-2)\) and the other with area \(\frac{1}{\log(\sqrt{9})}(9-\sqrt{9})\text{.}\) (Of course \(\sqrt{9}=3\) but this form is more useful here.)
Estimating integrals for proof
Figure 21.4.9. Estimating integrals for proof
In general, the same argument should hold, so a possible overestimate of \(\int_2^x dt/\log(t)\) is
\begin{equation*} \frac{1}{\log(2)}(\sqrt{x}-2)+\frac{1}{\log(\sqrt{x})}(x-\sqrt{x}) \end{equation*}
and we want the limit as \(x\to\infty\) of \(\frac{1}{x}\) times that quantity.
Now is the time to recklessly use logarithmic identities:
\begin{equation*} \frac{1}{x}\left(\frac{1}{\log(2)}(\sqrt{x}-2)+\frac{1}{\log(\sqrt{x})}(x-\sqrt{x})\right) \end{equation*}
\begin{equation*} =\frac{1}{\log(2)x^{1/2}}-\frac{2}{x\log(2)}+\frac{1}{\log(\sqrt{x})}-\frac{1}{\log(\sqrt{x})x^{1/2}} \end{equation*}
\begin{equation*} =\frac{1}{\log(2)x^{1/2}}-\frac{2}{x\log(2)}+\frac{2}{\log(x)}-\frac{2}{\log(x)x^{1/2}} \end{equation*}
This last expression has positive powers of \(x\) and their logs in the denominators, so it pretty clearly goes to zero as \(x\to \infty\text{.}\)
If the algebra doesn’t convince you, perhaps an interactive graph will. Below, black is the overestimate to the integral and red is \(1/x\) times the integral.
The picture confirms our analytic proof that the limit of \(\frac{\theta(x)}{x}\) is the same as that of \(\frac{\pi(x)}{x/\log(x)}\text{,}\) which is what we desired!
activecalculus.org/single/C-4.html
docs.python.org/2.7/library/math.html
This is an expansion of the terse approach taken in [E.4.6, Theorems 4.3 and 4.4].
Students a little more familiar with calculus may want to compare this process to integration by parts, but in a discrete context, sometimes called Abel summation.