Section 20.3 Digging Deeper and Finding Limits
So where does the number-of-divisors function go, on average? To answer this, we will look at a very different graph!
The fundamental observation we will use is that \(\tau(n)\) is precisely the number of pairs of positive integers \((x,y)\) such that \(xy=n\text{.}\) Before going on, spend some time convincing yourself of this.
Then, if we translate \(xy=n\) to a graph of \(y=n/x\) and \((x,y)\) to a lattice point, we get the visualization in Figure 20.3.1.
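If you'd like a computational confirmation, here is a minimal Python sketch (the helper names are mine, not something from the text's interacts) checking that counting divisors and counting positive pairs \((x,y)\) with \(xy=n\) give the same answer:

```python
# Check that tau(n) equals the number of positive pairs (x, y) with x*y = n.

def tau(n):
    """Count the positive divisors of n by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def pair_count(n):
    """Count positive lattice points (x, y) with x*y = n (brute force)."""
    return sum(1 for x in range(1, n + 1) for y in range(1, n + 1) if x * y == n)

for n in [6, 12, 30, 36]:
    assert tau(n) == pair_count(n)
    print(n, tau(n))
```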
Subsection 20.3.1 Moving toward a proof
To be more in line with our previous notation, we will say that \(\tau(n)\) is exactly given by the number of positive integer points \(\left(d,\frac{n}{d}\right)\) with the property that \(d\cdot\frac{n}{d}=n\text{.}\) Now we can interpret \(\sum_{k=1}^n \tau(k)\) as the number of lattice points on or under the hyperbola \(y=n/x\text{.}\)
This is a completely different way of thinking of the divisor function! We can see it for various sizes in the interact below.
So what we will do is try to look at the lattice points as approximating an area! Just like with the sum of squares function (recall Subsection 18.2.3 and Section 20.1), we will exploit the geometry. For each lattice point involved in \(\sum_{k=1}^n \tau(k)\text{,}\) we put a unit square to the lower right.
In examining this graph, we will interpret the lattice points as two different sums.
- We can think of it as \(\sum_{k=1}^n \tau(k)\text{,}\) adding up the lattice points along each hyperbola \(xy=k\text{.}\)
- We can think of it as \(\sum_{j=1}^n \left\lfloor\frac{n}{j}\right\rfloor\text{,}\) adding up the lattice points in each vertical column \(x=j\text{.}\) (A sketch verifying that these agree appears below.)
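Both interpretations are easy to check by brute force; here is a short Python sketch (again with my own helper names) confirming that the two sums agree:

```python
def tau(n):
    """Number of positive divisors of n, by trial division."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

for n in [10, 50, 200]:
    by_hyperbolas = sum(tau(k) for k in range(1, n + 1))  # points grouped by xy = k
    by_columns = sum(n // j for j in range(1, n + 1))     # points grouped by column x = j
    assert by_hyperbolas == by_columns
    print(n, by_columns)
```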
The area of the squares can then be thought of as another Riemann-type sum, similar to our summation of \(\tau\text{.}\)
It should be clear that the area, an estimate for the sum, is “about”
\begin{equation*} \int_1^n \frac{n}{x}\, dx = n\log(n)\text{,} \end{equation*}
where the logarithm is the ‘natural’ one.
Definition 20.3.3.
Throughout this text we use \(\log(n)\) to mean the natural logarithm with base \(e\text{.}\)
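To get a feel for how good “about” is, here is a quick sketch comparing the actual sum (counted column-by-column, as above) with \(n\log(n)\):

```python
import math

for n in [100, 1000, 10000]:
    total = sum(n // j for j in range(1, n + 1))  # sum of tau(1..n), counted by columns
    print(n, total, round(n * math.log(n)), total - n * math.log(n))
```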
Why is this integral actually a good estimate, though? The answer is in the error!
Look at the shaded difference between the area under the curve (which is \(n\log(n)\)) and the area of the red squares (which is the sum of all the \(\tau\) values).
All the areas where the red squares are above the hyperbola add up to less than \(n\text{,}\) because they are all 1 in width or less, and do not intersect vertically (they stack, as it were).
Similarly, all the areas where the hyperbola is higher add up to less than \(n\text{,}\) because they are all 1 in height or less, and are horizontally non-intersecting.
(Actually, we would expect they would cancel quite a bit … and they do, as we will see. We don't need that yet.)
I find these points to be easier to see if you try a few different options in the interact below.
We can summarize this discussion in the following three implications.
Fact 20.3.5.
- The error \(\sum_{k=1}^n \tau(k)-n\log(n)\) is a difference of two positive real numbers, each less than \(n\text{.}\)
- So the error is certainly \(O(n)\) (less than some multiple of \(n\) as \(n\) gets huge).
- So the error in the average is less than some constant as \(n\) gets huge! I.e.,
\begin{equation*} \frac{1}{n}\sum_{k=1}^n \tau(k)-\log(n)=O(1)\text{.} \end{equation*}
(Recall we use \(\log(n)\) to mean the natural logarithm.)
We can verify this graphically by plotting the average value against \(\log(n)\text{.}\)
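Here is one possible way to make such a plot in Python (a sketch assuming matplotlib is available; the text's own interacts use Sage instead):

```python
import math
import matplotlib.pyplot as plt  # assumed available in your environment

N = 500
running, averages, logs = 0, [], []
for n in range(1, N + 1):
    running += sum(1 for d in range(1, n + 1) if n % d == 0)  # running sum of tau(1..n)
    averages.append(running / n)
    logs.append(math.log(n))

plt.plot(range(1, N + 1), averages, label="average value of tau")
plt.plot(range(1, N + 1), logs, label="log(n)")
plt.legend()
plt.show()
```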
Lookin' good! There does seem to be some predictable error. What might it be?
Observe Figure 20.3.7. Keeping \(x=0\) in view, the error seems to be somewhat less than \(0.2\text{,}\) although it clearly bounces around a bit. The long-term value seems to settle roughly between \(0.15\) and \(0.16\text{,}\) as \(x\) gets large. So will this give us something more precise?
Subsection 20.3.2 Getting a handle on error
To answer this, we will try one more geometric trick.
Notice we have now divided the lattice points up into three parts, two of which are ‘the same’:
- The ones on the line \(y=x\text{.}\)
- The lattice points above the line and below the hyperbola.
- The lattice points to the right of the line and below the hyperbola.
Try it interactively, and perhaps see if there is a formula for how many of each type there are.
Now let's count. First, there are exactly \(\lfloor\sqrt{n}\rfloor\leq \sqrt{n}\) points on the line. At each integer \(y\)-value \(d\) up to \(y=\sqrt{n}\text{,}\) there are \(\lfloor n/d\rfloor-d\) points to the right of the line and below the hyperbola. Analogously, at each integer \(x\)-value \(d\) up to \(x=\sqrt{n}\text{,}\) there are \(\lfloor n/d\rfloor-d\) points above the line and below the hyperbola. (These numbers are all nonnegative since \(d\leq \sqrt{n}\text{.}\))
Combine these computations as sums over the values \(d\leq\sqrt{n}\) and remove the floors to get an easier approximation:
\begin{equation*} \sum_{k=1}^n \tau(k)=\lfloor\sqrt{n}\rfloor+2\sum_{d\leq\sqrt{n}}\left(\left\lfloor\frac{n}{d}\right\rfloor-d\right)\approx \sqrt{n}+2\sum_{d\leq\sqrt{n}}\left(\frac{n}{d}-d\right)\text{.} \end{equation*}
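The exact (floored) version of this count is easy to sanity-check computationally; a minimal sketch:

```python
import math

def tau(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

for n in [10, 100, 1000]:
    r = math.isqrt(n)  # floor of sqrt(n)
    folded = r + 2 * sum(n // d - d for d in range(1, r + 1))
    assert folded == sum(tau(k) for k in range(1, n + 1))
    print(n, folded)
```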
Because the floor of a number differs from the number itself by less than one, the total error incurred by this approximation is at most the number of terms in the sum, or \(1+2\sqrt{n}=O(\sqrt{n})\text{.}\)
Next we rewrite this using the formula for the sum of the first \(\ell\) integers, with \(\ell=\lfloor \sqrt{n}\rfloor\text{,}\) and subsume all the \(\sqrt{n}\) pieces:
\begin{equation*} \sqrt{n}+2\sum_{d\leq\sqrt{n}}\left(\frac{n}{d}-d\right)=2n\sum_{d\leq\sqrt{n}}\frac{1}{d}-2\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)+O\left(\sqrt{n}\right)\text{.} \end{equation*}
Once \(n\geq 4\text{,}\) the difference between \(\frac{n}{2}\) and \(\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)\) is once again of size \(O\left(\sqrt{n}\right)\) (and negative to boot), so using some of the work in Exercise Group 20.6.1–5 we finally get that
\begin{equation*} \frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O\left(\frac{1}{\sqrt{n}}\right)\text{.} \end{equation*}
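A hedged numerical check of this: the scaled error \(\left(\text{average}-\text{approximation}\right)\cdot\sqrt{n}\) should stay bounded as \(n\) grows.

```python
import math

for n in [100, 10000, 1000000]:
    avg = sum(n // j for j in range(1, n + 1)) / n  # (1/n) * sum of tau(1..n)
    approx = 2 * sum(1 / d for d in range(1, math.isqrt(n) + 1)) - 1
    print(n, round(avg - approx, 6), (avg - approx) * math.sqrt(n))
```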
Subsection 20.3.3 The end of the story
We're almost at the end of the story! It's been a while since we explored the long-term average of \(\tau\) in Subsection 20.2.1; at that point, you likely convinced yourself that \(\log(n)\) is close to the average value of \(\tau\text{.}\)
So now we just need to relate the sum \(2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1\) to \(\log(n)\text{.}\) I wish to emphasize just how small the error term \(O(1/\sqrt{n})\) is!
Figure 20.3.9 shows the exact difference between \(\sum_{k=1}^{m-1} \frac{1}{k}\) and \(\log(m)\text{.}\) Even as \(m\to\infty\text{,}\) the total area is simply the sum of a bunch of nearly-triangles, each of width exactly one and with non-overlapping heights (the same stacking idea as before), of total height less than \(1\text{.}\) So the difference between \(\sum_{k=1}^{m-1} \frac{1}{k}\) and \(\log(m)\) approaches a finite limit as \(m\to\infty\text{.}\)
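You can watch this convergence yourself with a few lines of Python:

```python
import math

# Watch sum(1/k for k < m) - log(m) settle toward a constant (about 0.5772).
for m in [10, 100, 10000, 1000000]:
    h = sum(1 / k for k in range(1, m))
    print(m, h - math.log(m))
```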
This number is very important! First of all, it clearly is related to the archetypal divergent series from calculus, the harmonic series
\begin{equation*} \sum_{k=1}^{\infty}\frac{1}{k}\text{.} \end{equation*}
However, this constant has taken on a life of its own.
Definition 20.3.10.
The number \(\gamma\text{,}\) or the Euler-Mascheroni constant, is defined by
Remark 20.3.11.
You have almost certainly never heard of this number, but it is very important. There is even an entire book about it, by Julian Havil [E.4.15]. It's a pretty good book, in fact!
Among other crazy properties, \(\gamma\) is (up to sign) the derivative at \(1\) of a generalization of the factorial function, called Gamma (\(\Gamma\)); that is, \(\Gamma'(1)=-\gamma\text{.}\) I am not making this up.
Most baffling of all, \(\gamma\) is not known to be either rational or irrational. Maybe you will solve this mystery?
Consider the area corresponding to \(\gamma\) compared to its finite approximations. Notice that the “missing” part of the area (since we can't actually view all the way out to infinity) must be less than \(1/m\text{,}\) since it will be the part lower than all the pieces we can see in the graphic for any given \(m\text{.}\) So \(\gamma\) is within \(O(1/m)\) of any given finite approximation \(\sum_{k=1}^{m-1} \frac{1}{k}-\log(m)\text{.}\) Adapted to our context, we have
\begin{equation*} \sum_{d\leq \sqrt{n}}\frac{1}{d}=\log\left(\sqrt{n}\right)+\gamma+O\left(\frac{1}{\sqrt{n}}\right)\text{.} \end{equation*}
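As a quick hedged check of this adapted formula (hard-coding the known decimal value of \(\gamma\)), the scaled error again stays bounded:

```python
import math

EULER_GAMMA = 0.5772156649  # known decimal value of the Euler-Mascheroni constant

for n in [100, 10000, 1000000]:
    r = math.isqrt(n)  # floor of sqrt(n)
    lhs = sum(1 / d for d in range(1, r + 1))
    rhs = math.log(math.sqrt(n)) + EULER_GAMMA
    print(n, lhs - rhs, (lhs - rhs) * math.sqrt(n))  # last column stays bounded
```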
Now we put it all together! We know from above that
\begin{equation*} \frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O\left(\frac{1}{\sqrt{n}}\right)\text{.} \end{equation*}
Further, we can substitute for \(\sum_{d\leq \sqrt{n}}\frac{1}{d}\) as in our discussion of \(\gamma\text{,}\) and then take advantage of the log fact that \(2\log(z)=\log\left(z^2\right)\text{.}\) Then we get
\begin{equation*} \frac{1}{n}\sum_{k=1}^n \tau(k)=2\left(\log\left(\sqrt{n}\right)+\gamma\right)-1+O\left(\frac{1}{\sqrt{n}}\right)=\log(n)+(2\gamma-1)+O\left(\frac{1}{\sqrt{n}}\right)\text{.} \end{equation*}
That is exactly the asymptote and type of error that I have depicted in Figure 20.3.12!
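Indeed, \(2\gamma-1\approx 0.1544\text{,}\) squarely in the window we eyeballed in Figure 20.3.7. One last sketch confirms the convergence numerically:

```python
import math

EULER_GAMMA = 0.5772156649  # known decimal value of gamma

for n in [1000, 100000, 1000000]:
    avg = sum(n // j for j in range(1, n + 1)) / n  # average of tau(1..n)
    print(n, round(avg - math.log(n), 5), round(2 * EULER_GAMMA - 1, 5))
```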
It's not hard to prove that the average of \(\tau\) grows at least as fast as \(\log(n)\text{,}\) so this is a fairly sharp result. (It's even possible to show that the error in the sum \(\sum_{k\leq x}\tau(k)\) is \(O(\sqrt[3]{x})\) up to logarithmic factors, but is not \(O(\sqrt[4]{x})\text{;}\) once again see [E.7.25] for much more information.)