We end this chapter with a substantial piece of a real proof in the direction of the Prime Number Theorem, courtesy of a function also first introduced by Chebyshev. The argument is dense, but requires nothing beyond calculus and a willingness to allow a lot of algebraic and integral manipulation for the purposes of estimation. (See a good calculus text^{ 17 } to review integral concepts.)

First, we’ll review the main function. Think of the prime counting function \(\pi\) as a so-called step function: every time you hit a new prime, you add 1.

Let’s define a new function in that spirit. Instead of adding 1 each time \(x\) hits a prime, we will add \(\log(p)\) (recall that this is the natural logarithm) each time we hit a prime \(p\text{.}\) Of course, the value we add gets bigger as \(p\) gets bigger.

#### Definition 21.4.3.

We call the function given by this formula Chebyshev’s theta function:

\begin{equation*}
\Theta(x)=\sum_{p\leq x}\log(p)\text{.}
\end{equation*}

We include an interactive version so you can see the code.
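The interact itself is not reproduced here, but a minimal non-interactive sketch in plain Python might look like the following (the helper names are mine, not the book’s):

```python
import math

def is_prime(n):
    """Trial-division primality test; fine for the small inputs used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n**0.5) + 1))

def theta(x):
    """Chebyshev's theta function: the sum of log(p) over primes p <= x."""
    return sum(math.log(p) for p in range(2, int(x) + 1) if is_prime(p))

print(theta(10))  # log(2)+log(3)+log(5)+log(7) = log(210), about 5.347
```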

The syntax `math.log` refers to Python’s built-in calculation of the natural logarithm, accessible in the `math` module^{ 18 }. This is sometimes faster and easier to use than Sage’s more powerful capabilities, because if you put an integer in Sage’s logarithm, it will normally not approximate it. All we want here is an easy approximation, so this should be faster.

Earlier in this chapter we noted that the Prime Number Theorem is logically equivalent to the limit \(\lim_{x\to\infty}\frac{\pi(x)}{x/\log(x)}=1\text{.}\) There are actually many such logical equivalences. One of them involves \(\Theta\text{:}\)

\begin{equation*}
\lim_{x\to\infty}\frac{\Theta(x)}{x}=1
\end{equation*}

This is certainly numerically plausible. Here is a plot of both limits, along with the constant function \(1\text{.}\)
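For a rough numerical version of that comparison (a sketch with my own sieve-based helper, not the book’s plotting code):

```python
import math

def pnt_ratios(x):
    """Return (pi(x)/(x/log(x)), Theta(x)/x), using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(x**0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = b"\x00" * len(range(p * p, x + 1, p))
    primes = [n for n in range(2, x + 1) if sieve[n]]
    theta_x = sum(math.log(p) for p in primes)
    return len(primes) / (x / math.log(x)), theta_x / x

for x in (10**3, 10**5, 10**6):
    print(x, pnt_ratios(x))  # both ratios approach 1, Theta's noticeably faster
```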

In the interact below, there is an option for an \(Li\) versus \(x/\log(x)\) version of the theorem. Note how much better the Prime Number Theorem limit looks with the \(Li\) version.

As usual, proving such things completely is beyond the level of this course, but we *can* prove the following partial implication.

If the Prime Number Theorem is true, then it is also true that \(\Theta(x)/x\) approaches \(1\text{.}\)

The rest of this section is the proof.

In order to prove this implication, we will first need a formula telling us more about \(\Theta(x)\text{.}\) Our strategy^{ 19 } will be to first turn \(\Theta(x)\) into an even more hopelessly complicated sum, but then use calculus trickily to get something usable by summing up *integrals*.

In order to do this, we need two subsidiary functions. First, recall the notation \(\lfloor x\rfloor\) for the greatest integer less than or equal to \(x\text{.}\) Secondly:

We let \(a(n)\) be the prime number indicator function defined by

\begin{equation*}
a(n)=\begin{cases}1 & \text{ if }n\text{ is prime}\\0 & \text{ otherwise }\end{cases}\text{.}
\end{equation*}

Another way to say this is

\begin{equation*}
a(n)=\pi(n)-\pi(n-1)\text{.}
\end{equation*}

One can get the indicator function just by writing `prime_pi(x)-prime_pi(x-1)`.

For convenience we write \(m=\lfloor x\rfloor\text{.}\) Then we can rewrite these step functions as weighted sums of \(a(n)\text{:}\)

\begin{equation*}
\pi(x)=\sum_{n=1}^{m}a(n)\text{ and }\Theta(x)=\sum_{n=1}^{m}a(n)\log(n)\text{.}
\end{equation*}
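These weighted-sum identities are easy to check numerically; in this sketch, `prime_pi` is my plain-Python stand-in for Sage’s function of the same name:

```python
import math

def prime_pi(x):
    """pi(x) by trial division; a plain-Python stand-in for Sage's prime_pi."""
    def is_prime(n):
        return n >= 2 and all(n % d for d in range(2, int(n**0.5) + 1))
    return sum(1 for n in range(2, int(x) + 1) if is_prime(n))

def a(n):
    """The prime indicator function a(n) = pi(n) - pi(n-1)."""
    return prime_pi(n) - prime_pi(n - 1)

x = 30.7
m = math.floor(x)
# pi(x) as an unweighted sum of a(n); Theta(x) as a log-weighted sum
assert prime_pi(x) == sum(a(n) for n in range(1, m + 1))
theta = sum(a(n) * math.log(n) for n in range(1, m + 1))
print(theta)  # the sum of log(p) over the ten primes up to 30
```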

Our goal is to rearrange \(\Theta\) to be a sum of terms involving \(\pi\text{.}\) First we turn \(\Theta\) into a difference of sums by rearranging (and using \(\log(1)=0\)):

\begin{equation*}
\Theta(x)=\sum_{1\leq n\leq x}a(n)\log(n)=\sum_{n=1}^{m}a(n)\log(n)=
\end{equation*}

\begin{equation*}
\sum_{n=2}^{m}[\pi(n)-\pi(n-1)]\log(n)=\sum_{n=2}^{m}\pi(n)\log(n)-\sum_{n=1}^{m-1}\pi(n)\log(n+1)
\end{equation*}

This difference of sums can be combined into a single sum, with just two leftover terms^{ 20 }:

\begin{equation*}
\Theta(x)=\sum_{n=2}^{m-1}\pi(n)[\log(n)-\log(n+1)]+\pi(m)\log(m)-\pi(1)\log(2)\text{.}
\end{equation*}
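Since footnote 20 relates this rearrangement to Abel summation, it is easy to sanity-check the combined formula numerically (a sketch; the helper names are mine):

```python
import math

def pi(n):
    """pi(n) by trial division (fine for small n)."""
    return sum(all(k % d for d in range(2, int(k**0.5) + 1))
               for k in range(2, int(n) + 1))

x = 47.5
m = math.floor(x)
theta = sum(math.log(p) for p in range(2, m + 1)
            if all(p % d for d in range(2, int(p**0.5) + 1)))
# the single combined sum plus its two leftover terms
abel = (sum(pi(n) * (math.log(n) - math.log(n + 1)) for n in range(2, m))
        + pi(m) * math.log(m) - pi(1) * math.log(2))
assert abs(theta - abel) < 1e-9
print("summation-by-parts identity checked at x =", x)
```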

To continue, we will rewrite almost all of this as a single integral. We use a few key facts:

- The difference which appears in the sum in the immediately preceding \(\Theta\) formula can be considered as an integral, \(-\left(\log(n+1)-\log(n)\right)=-\int_{n}^{n+1}\frac{dt}{t}\text{.}\)
- We have that \(\pi(x)\) is constant on the interval \([\lfloor x\rfloor,x]\text{,}\) and in particular on any given interval \([n,n+1)\text{,}\) so it may be factored out of any integral from \(n\) to \(n+1\text{.}\)
- We can rearrange and add sums and integrals as usual.
- Note that \(\pi(1)=0\text{,}\) so the second extra term is zero.

This yields the following rewrite.

\begin{equation*}
\Theta(x) = -\sum_{n=2}^{m-1}\Big[\pi(n)\int_{n}^{n+1}\frac{dt}{t}\Big]+\pi(m)\log(m)
\end{equation*}

\begin{equation*}
=-\sum_{n=2}^{m-1}\Big[\pi(n)\int_{n}^{n+1}\frac{dt}{t}\Big]+\pi(m)\log(m)-\pi(x)\log(x)+\pi(x)\log(x)
\end{equation*}

\begin{equation*}
=-\int_{2}^{m}\frac{\pi(t)dt}{t}+\pi(x)\log(x)-\int_{m}^{x}\frac{\pi(t)dt}{t} =\pi(x)\log(x)-\int_{2}^{x}\frac{\pi(t)dt}{t}\text{.}
\end{equation*}

Now we have a formula for \(\Theta\) which will allow us to prove something.
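Because \(\pi\) is a step function, the integral \(\int_{2}^{x}\frac{\pi(t)dt}{t}\) can be evaluated exactly piece by piece, so the formula itself can be verified in code (again just an illustrative sketch):

```python
import math

def pi(t):
    """pi(t) by trial division (fine for small t)."""
    return sum(all(k % d for d in range(2, int(k**0.5) + 1))
               for k in range(2, int(t) + 1))

def integral_pi_over_t(x):
    """Exact integral of pi(t)/t from 2 to x: pi is constant on each [n, n+1),
    contributing pi(n)*log((n+1)/n), plus a leftover piece on [m, x]."""
    m = math.floor(x)
    total = sum(pi(n) * math.log((n + 1) / n) for n in range(2, m))
    return total + pi(m) * math.log(x / m)

x = 25.3
lhs = sum(math.log(p) for p in range(2, math.floor(x) + 1)
          if all(p % d for d in range(2, int(p**0.5) + 1)))  # Theta(x)
rhs = pi(x) * math.log(x) - integral_pi_over_t(x)
assert abs(lhs - rhs) < 1e-9
print("Theta(x) = pi(x)log(x) - integral, checked at x =", x)
```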

We can divide the formula \(\Theta(x)=\pi(x)\log(x)-\int_{2}^{x}\frac{\pi(t)dt}{t}\) by \(x\text{:}\)

\begin{equation*}
\frac{\Theta(x)}{x}=\frac{\pi(x)\log(x)}{x}-\frac{\int_{2}^{x}\frac{\pi(t)}{t}dt}{x}\text{.}
\end{equation*}

Given that the Prime Number Theorem says that \(\lim_{x\to\infty}\) of the fraction with \(\pi(x)\) in it is 1, proving that \(\lim_{x\to\infty}\frac{\Theta(x)}{x}\) is also 1 is equivalent to proving

\begin{equation*}
\lim_{x\to\infty}\frac{1}{x}\int_{2}^{x}\frac{\pi(t)}{t}dt=0\text{.}
\end{equation*}

Now, the Prime Number Theorem also implies that \(\frac{\pi(t)}{t}\) and \(\frac{1}{\log(t)}\) are asymptotic (recall Definition 21.2.1), so that their averaging integrals

\begin{equation*}
\frac{1}{x}\int_{2}^{x}\frac{\pi(t)}{t}dt\text{ and } \frac{1}{x}\int_{2}^{x}\frac{dt}{\log(t)}
\end{equation*}

clearly are also asymptotic.

This reduces our proof to showing that the average value of \(1/\log(t)\) tends to zero. Since integrals have a graphical interpretation, we now use the following graph of the integral limit to finish the proof!

Consider that one possible upper sum for the integral of \(1/\log(t)\) between 2 and 9 is the area of the two rectangles shown below, one with area \(\frac{1}{\log(2)}(\sqrt{9}-2)\) and the other with area \(\frac{1}{\log(\sqrt{9})}(9-\sqrt{9})\text{.}\) (Of course \(\sqrt{9}=3\) but this form is more useful here.)

In general, the same argument should hold, so a possible *over*estimate of \(\int_2^x dt/\log(t)\) is

\begin{equation*}
\frac{1}{\log(2)}(\sqrt{x}-2)+\frac{1}{\log(\sqrt{x})}(x-\sqrt{x})
\end{equation*}

and we want the limit as \(x\to\infty\) of \(\frac{1}{x}\) times that quantity.

Now is the time to recklessly use logarithmic identities:

\begin{equation*}
\frac{1}{x}\left(\frac{1}{\log(2)}(\sqrt{x}-2)+\frac{1}{\log(\sqrt{x})}(x-\sqrt{x})\right)
\end{equation*}

\begin{equation*}
=\frac{1}{\log(2)x^{1/2}}-\frac{2}{x\log(2)}+\frac{1}{\log(\sqrt{x})}-\frac{1}{\log(\sqrt{x})x^{1/2}}
\end{equation*}

\begin{equation*}
=\frac{1}{\log(2)x^{1/2}}-\frac{2}{x\log(2)}+\frac{2}{\log(x)}-\frac{2}{\log(x)x^{1/2}}
\end{equation*}

This last expression has positive powers of \(x\) and their logs in the denominators, so it pretty clearly goes to zero as \(x\to \infty\text{.}\)
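One can also watch the averaged overestimate shrink numerically (a throwaway check, not part of the proof):

```python
import math

def overestimate_avg(x):
    """(1/x) times the two-rectangle overestimate of the integral of 1/log(t)."""
    return ((math.sqrt(x) - 2) / math.log(2)
            + (x - math.sqrt(x)) / math.log(math.sqrt(x))) / x

for k in (4, 8, 16, 32):
    print(10**k, overestimate_avg(10**k))  # shrinks toward 0 as x grows
```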

If the algebra doesn’t convince you, perhaps an interactive graph will. Below, black is the overestimate to the integral and red is \(1/x\) times the integral.

The picture confirms our analytic proof that the limit of \(\frac{\Theta(x)}{x}\) is the same as that of \(\frac{\pi(x)}{x/\log(x)}\text{,}\) which is what we desired!

^{ 17 } `activecalculus.org/single/C-4.html`

^{ 18 } `docs.python.org/2.7/library/math.html`

^{ 19 } This is an expansion of the terse approach taken in [E.4.6, Theorems 4.3 and 4.4].

^{ 20 } Students a little more familiar with calculus may want to compare this process to integration by parts, but in a discrete context, sometimes called Abel summation.