Skip to main content
Logo image

Section 24.6 Four Facts

We are now ready to work with four applied facts which we can prove, using these tools. Some have other types of proofs, but number theory combined with calculus really provides a unified framework for a huge number of problems.

Subsection 24.6.1 Random integer lattice points

The following graphic will indicate what it means to have a point visible from the origin; is there a point directly between it and the origin or not? To rephrase, what is the probability that a point in the integer lattice has a line connecting the point to the origin that does not hit any other point? (We will explicitly avoid any discussion of why such infinitary probabilities are defined in this introductory text.)
Integer lattice points visible from the origin
Figure 24.6.1. Integer lattice points visible from the origin through \(n=5\)
For this example, the probability is about \(0.66\text{,}\) but the theoretical probability will not be two-third! We will as usual want an interactive version too.
Note that the probabilities estimated by this interact vary wildly. Especially at a prime distance one should expect the computed probability to be higher than the theoretical one; why?
It should be pretty clear from the pictures that if \(x\) and \(y\) have a nontrivial common divisor \(d\) then, \(\left(\frac{x}{d},\frac{y}{d}\right)\) is right on the line of sight from the origin to \((x,y)\) so that it is blocked off. This is most clearly so for \(d=\gcd(x,y)\text{,}\) so the following fact is the same thing as asking for the probability that two randomly chosen integers are relatively prime.
We will prove the statement about coprime random integers, or at least we will prove as much of it as we can without discussing infinite combinations of independent chances. We will also make an assumption about distribution of primes to simplify the proof; one can consider this a sketch, if necessary.
First, we know that \(\gcd(x,y)=1\) is true precisely if \(x\) and \(y\) are never simultaneously congruent to zero modulo \(p\text{,}\) for any prime \(p\text{.}\) (If there were such a \(p\text{,}\) of course it would be a divisor; by the Fundamental Theorem of Arithmetic we need only consider primes.)
For any given prime \(p\text{,}\) the chances that two integers will both be congruent to zero is
\begin{equation*} \frac{1}{p}\cdot\frac{1}{p}\text{.} \end{equation*}
This works because the probabilities are independent, since \(p\) is fixed, so we can just multiply probabilities.
Hence the probability that at least one of \(x\) or \(y\) will not be divisible by \(p\) is
\begin{equation*} 1-\frac{1}{p}\cdot\frac{1}{p}=1-\frac{1}{p^2}=1-p^{-2}\text{.} \end{equation*}
(This may remind you of the so-called birthday problem in probability.)
Now comes our assumption. We will suppose that if \(p \neq q\) are both prime, then the probability that any given integer is divisible by \(p\) has nothing to do with whether it is divisible by \(q\text{.}\) (Such properties are not true in general; if \(n\) is divisible by 4 it has a 100% likelihood of being divisible by 2, while if \(n\) is prime, it has almost no chance of being even.)
In such a case the probabilities are independent, so that (even in this infinitary case)
\begin{equation*} \prod_p (1-p^{-2})=1/\prod_p \frac{1}{1-p^{-2}}=1/\zeta(2)=\frac{6}{\pi^2}\text{.} \end{equation*}
We may note (as in the more extended discussion in [E.2.1, Chapter 9.4]) by using Fact 24.4.2 that this is also the value of the Dirichlet series of \(\mu\) evaluated at \(s=2\text{.}\)
This implies that a random pair of integers, selected from a large enough bound, will be relatively prime about 61% of the time. See this Numberphile video 6  for a real-time experiment on Twitter 7  doing something analogous with triples in order to estimate Apéry’s constant \(\zeta(3)\text{.}\)

Subsection 24.6.2 Dirichlet for the absolute Moebius

With all the tools we’ve gained, the proof 8  is nearly completely symbolic at this point!
First, we have the following from the definition of Moebius in Definition 23.1.1, or from Fact 24.5.5:
\begin{equation*} \sum_{n=1}^{\infty}\frac{|\mu(n)|}{n^s}=\prod_p\left(1+\frac{1}{p^s}\right)\text{.} \end{equation*}
Next, let us write \(x=\frac{1}{p^s}\text{;}\) then we can use the basic identity \((1+x)=\frac{1-x^2}{1-x}\) to rewrite the right-hand side as
\begin{equation*} \prod_p\left(1+\frac{1}{p^s}\right)=\frac{\prod_p\left(1-\frac{1}{p^{2s}}\right)}{\prod_p\left(1-\frac{1}{p^s}\right)}\text{.} \end{equation*}
Now we just invert both numerator and denominator to get familiar friends:
\begin{equation*} \frac{\prod_p\left(1-\frac{1}{p^{2s}}\right)}{\prod_p\left(1-\frac{1}{p^s}\right)}=\frac{\prod_p 1/\left(1-\frac{1}{p^s}\right)}{\prod_p 1/\left(1-\frac{1}{p^{2s}}\right)} \end{equation*}
which means the sum will be \(\zeta(s)/\zeta(2s)\text{.}\)
Let’s try this out computationally.
Computing these series doesn’t stop here, of course! For example, something analogous can be said about the Dirichlet series for multiples \(f(n)\left|\mu(n)\right|\) for certain types of \(f\text{;}\) see [E.4.6, Exercise 11.13] for a precise statement.

Subsection 24.6.3 The prime harmonic series

The divergence of the series created from the reciprocals of prime numbers is not necessarily a particularly obvious fact. Certainly it diverges much, much slower than the harmonic series (recalled before Definition 20.3.10), which already diverges very slowly. Euler showed this in 1737 9 .
This proof doesn’t actually use Dirichlet series, but has in common with them themes of convergence and estimation, so it is appropriate to include here.
This is a fairly expanded form of the proofs in [E.2.1, Theorem 9.2] and [E.4.6, Theorem 1.13], which the latter attributes to Clarkson in the Proceedings of the AMS.
As with many other occasions to prove series divergence, we will focus on the ‘tail’s beyond a point that keeps getting further out. In this case, we’ll choose the ‘tail’ beyond the first \(k\) primes,
\begin{equation*} T = \sum_{n>k}\frac{1}{p_n}\text{.} \end{equation*}
By examining certain terms in this, and assuming (falsely) that it can be made finite, we will obtain a contradiction.
First, let’s consider numbers of the form
\begin{equation*} p_1 p_2 p_3\cdots p_k r+1=p_k\#\cdot r+1 \end{equation*}
(Recall the primorial notation from Definition 22.2.7.) Such a number cannot be divisible by any of those first \(k\) primes, so by the Fundamental Theorem of Arithmetic any number like \(p_k\#\cdot r+1\) may be factored as
\begin{equation*} p_{n_1}p_{n_2}\cdots p_{n_{\ell}}\text{,} \end{equation*}
where all \(n_i>k\) (some may be repeated).
Return to the ‘tail’. Since this \(p_k\#\cdot r+1\) factors with \(\ell\) factors, then somewhere in the \(\ell\)th power of the ‘tail’ we have the following term:
\begin{equation*} T^\ell = \left(\sum_{n>k}\frac{1}{p_n}\right)^{\ell} = \frac{1}{p_1 p_2 p_3\cdots p_k r+1}+ \cdots \text{.} \end{equation*}
Now assume that in fact the prime harmonic series converges; we will derive a contradiction.
First, for some \(k\text{,}\) the ‘tail’ \(T\) is less than \(\frac{1}{2}\text{,}\) i.e. \(T =\sum_{n>k}\frac{1}{p_n}<\frac{1}{2}\text{.}\) Since each term is positive, \(T>0\) and a geometric series involving the \(\ell\)th powers of \(T\) is very precisely bounded:
\begin{equation*} 0\leq \sum_{\ell=1}^{\infty}T^{\ell}=\sum_{\ell=1}^{\infty}\left(\sum_{n>k}\frac{1}{p_n}\right)^{\ell}\leq \sum_{\ell=1}^{\infty}\frac{1}{2^{\ell}}=2\text{.} \end{equation*}
Now we bring in the first discussion in this proof; every single term of the form \(\frac{1}{p_1 p_2 p_3\cdots p_k r+1}\) will appear somewhere within this sum of the \(\ell\)th powers, though naturally \(\ell\) in each case will depend heavily upon \(r\text{.}\)
So the series of reciprocals of just these special terms is bounded.
\begin{equation*} 0<\sum_{r=1}^{\infty}\frac{1}{p_1 p_2 p_3\cdots p_k r+1}\leq \sum_{\ell=1}^{\infty}\left(\sum_{n>k}\frac{1}{p_n}\right)^{\ell}\leq 2\text{.} \end{equation*}
A bounded series of all positive number should converge (e.g. by comparison).
Here comes the contradiction. The same series is bounded below as follows, for each integer \(k\text{.}\)
\begin{equation*} \sum_{r=1}^{\infty}\frac{1}{p_1 p_2 p_3\cdots p_k r+1}>\sum_{r=1}^{\infty}\frac{1}{p_1 p_2 p_3\cdots p_k r+p_1 p_2 p_3\cdots p_k} \end{equation*}
\begin{equation*} =\frac{1}{p_1 p_2 p_3\cdots p_k}\sum_{r=1}^{\infty}\frac{1}{r+1} \end{equation*}
This series certainly diverges, as a multiple of the tail of the harmonic series!
Since no matter how big \(k\) is (and hence how far out in the ‘tail’ we go) we report that a certain series both converges and diverges, we have a contradiction. Hence our original assumption that we could choose \(k\) to make \(T\) finite was false, and the prime harmonic series must diverge!

Subsection 24.6.4 The average value of Euler phi

Finally, here is a really nice result to end with. Thinking about the average value of \(\phi\) will put together many themes from this text.
You may recall Section 20.5, and in particular Exercise 20.6.17, where you were asked to conjecture regarding this question. As there, it’s useful here to try to graph the average values first; here I have incuded the correct long-term average as well.
Average value of \(phi\) versus linear model
Figure 24.6.5. Average value of \(\phi\) versus \(\frac{3}{\pi^2} x\)
Before formally proving this, let’s look at a significant picture for conceptualizing the proof. This is similar to what we used for the average of \(\tau\) and \(\sigma\) in Chapter 24.
Integer lattice with labeled points
Figure 24.6.6. Integer lattice with labeled points
The text at each lattice point is the value of horizontal coordinate, multiplied by a factor of Moebius of the vertical coordinate. You can try it interactively if you are online.
We will crucially use two earlier facts in the proof:
  • From above (e.g. Fact 24.4.2),
    \begin{equation*} \sum_{n=1}^\infty\frac{\mu(n)}{n^2}=\frac{1}{\zeta(2)}=\frac{6}{\pi^2} \end{equation*}
  • From the previous chapter (e.g. Fact 23.3.2),
    \begin{equation*} \phi=\mu\star N\Rightarrow \phi(n)=\sum_{d\mid n}\mu(d)\frac{n}{d} \end{equation*}
This proof is based loosely on [E.4.6, Theorem 3.7]. See [E.2.8, Theorem 3.8.1] for a more detailed approach which is rewarded with a very nice error estimate – unusual in that it starts its discussion of averages with this example! Both books have much more related material, including useful (if difficult) exercises such as finding a bound for the sum of reciprocals of \(\phi\text{.}\)
Consider the summation function for \(\phi\text{,}\) \(\sum_{k=1}^n\phi(n)\text{.}\) As in Chapter 20, we will think of it as summing things up in two different ways.
In particular, look at the summation once we have replaced with the Moebius sum inside:
\begin{equation*} \sum_{k=1}^n\phi(k)=\sum_{k=1}^n\sum_{d\mid k}\mu(d)\frac{k}{d} \end{equation*}
Now instead of considering it as a sum over divisors for each \(k\text{,}\) we can think of it as summing for each divisor over the various hyperbolas \(xy=k\text{.}\) This yields
\begin{equation*} \sum_{k=1}^n\sum_{d\mid k}\mu(d)\frac{k}{d}=\sum_{d=1}^n\mu(d)\left(\sum_{k=1}^{\lfloor \frac{n}{d}\rfloor}k\right) \end{equation*}
Now let’s examine the terms of this sum. We will several times use Landau notation as in Definition 20.1.2.
Knowing about the sum of the first few consecutive integers (also used at the end of Subsection 20.3.2), we see that
\begin{equation*} \left(\sum_{k=1}^{\lfloor \frac{n}{d}\rfloor}k\right)=\frac{1}{2}\left(\frac{n}{d}\right)^2+O\left(\frac{n}{d}\right)\; . \end{equation*}
If we plug that in the double sum, we get
\begin{equation*} \sum_{k=1}^n\phi(n)=\frac{1}{2}n^2\sum_{d=1}^n\frac{\mu(d)}{d^2}+nO\left(\sum_{d=1}^n\frac{\mu(d)}{d}\right)\text{.} \end{equation*}
Let’s examine this.
  • The first term goes to \(\frac{6}{\pi^2}\) as \(n\to\infty\text{.}\) Further, its error is \(O(1/n)\text{,}\) because \(\mu(n)/n^2<1/n^2\) and \(\int x^{-2}\, dx=-x^{-1}\text{.}\)
  • The second term is certainly \(O(n\log(n))\text{,}\) since it is \(n\) times a sum which is less than something \(O(\log(n))\) (namely, \(\zeta\)).
Plugging everything in, we get that
\begin{equation*} \sum_{k=1}^n\phi(n) = \frac{1}{2}n^2\frac{6}{\pi^2}+O(\text{ stuff less than }n^2) \end{equation*}
Dividing by \(n\) and taking the limit, we get the asymptotic result.
\begin{equation*} \lim_{n\to\infty}\frac{\frac{1}{n}\sum_{k=1}^n\phi(n)}{\frac{3}{\pi^2}n} \end{equation*}
\begin{equation*} =\lim_{n\to\infty} \frac{ \frac{1}{2}\frac{n^2}{n}\frac{6}{\pi^2} +\frac{1}{n}O(\text{stuff less than }n^2)} {\frac{3}{\pi^2}n} \end{equation*}
\begin{equation*} =\lim_{n\to\infty}\frac{\frac{3}{\pi^2}n+O(\text{ stuff less than }n)}{\frac{3}{\pi^2}n}=1\text{.} \end{equation*}
Sounds like an extra-credit project to me.
This result is the first half of [E.2.1, Exercise 9.14], where it is then applied to the Liouville \(\lambda\) function of Definition 23.3.4 in an interesting way.