
Section 20.3 Digging Deeper and Finding Limits

So where does the number of divisors function go? To answer this, we will look at a very different graph!

The fundamental observation that makes this graphic possible is that \(\tau(n)\) is precisely the number of pairs of positive integers \((x,y)\) such that \(xy=n\). Before going on, spend some time convincing yourself of this.

Then, if we translate \(xy=n\) to a graph of \(y=n/x\) and \((x,y)\) to a lattice point, we get the following.
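If a picture alone doesn't convince you, here is a minimal Python sketch comparing the two counts by brute force. (The helper names tau and lattice_points_on_hyperbola are my own, not from the text.)

```python
# A brute-force check that tau(n) equals the number of positive
# lattice points (x, y) with x*y == n. Helper names are mine.

def tau(n):
    """Count the positive divisors of n directly."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

def lattice_points_on_hyperbola(n):
    """Count positive integer points (x, y) with x*y == n."""
    return sum(1 for x in range(1, n + 1)
                 for y in range(1, n + 1) if x * y == n)

for n in [6, 12, 30, 100]:
    assert tau(n) == lattice_points_on_hyperbola(n)
    print(n, tau(n))
```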

Subsection 20.3.1 Moving toward a proof

To be more in line with our previous notation, we will say that \(\tau(n)\) is exactly the number of positive integer points \(\left(d,\frac{n}{d}\right)\) with the property that \(d\cdot\frac{n}{d}=n\). Now we can interpret \(\sum_{k=1}^n \tau(k)\) as the number of lattice points on or under the hyperbola \(y=n/x\).
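That interpretation is easy to test by brute force, too; here is a hedged sketch using nothing beyond the definitions above (the helper tau is again my own name).

```python
# Counting every lattice point on or under the hyperbola y = n/x
# and comparing with the sum of tau(k) for k up to n.

def tau(k):
    return sum(1 for d in range(1, k + 1) if k % d == 0)

n = 30
under_hyperbola = sum(
    1 for x in range(1, n + 1) for y in range(1, n + 1) if x * y <= n
)
print(under_hyperbola, sum(tau(k) for k in range(1, n + 1)))  # equal counts
```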

This is a completely different way of thinking of the divisor function! We can see it for various sizes below.

So what we will do is try to look at the lattice points as approximating an area! Just like with the sum of squares function (recall Subsection 18.2.3 and Section 20.1), we will exploit the geometry. For each lattice point involved in \(\sum_{k=1}^n \tau(k)\), we put a unit square to the lower right.

In examining this graph, we will interpret the lattice points as two different sums.

  • We can think of it as \(\sum_{k=1}^n \tau(k)\), adding up the lattice points along each hyperbola.

  • We can think of it as \(\sum_{k=1}^n \left\lfloor\frac{n}{k}\right\rfloor\), adding up the lattice points in each vertical column. (A quick numerical check of this equality follows below.)
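As promised, here is a quick check that the two interpretations really count the same points; this is a sketch with my own helper name, not the text's interactive cell.

```python
# The same lattice points counted two ways: summing tau along the
# hyperbolas, versus summing floor(n/k) down the vertical columns.

def tau(k):
    return sum(1 for d in range(1, k + 1) if k % d == 0)

n = 50
by_hyperbolas = sum(tau(k) for k in range(1, n + 1))
by_columns = sum(n // k for k in range(1, n + 1))
print(by_hyperbolas, by_columns)  # both print the same total
assert by_hyperbolas == by_columns
```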

The area of the squares can then be thought of as another Riemann-type sum, similar to our summation of \(\tau\).

It should be clear that the area, an estimate for the sum, is “about” \begin{equation*}\int_1^n \frac{n}{x}dx=n\log(x)\biggr\vert_1^n=n\log(n)-n\log(1)=n\log(n)\end{equation*} where the logarithm is the ‘natural’ one. Why is this integral actually a good estimate, though? The answer is in the error!
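To get a feel for how good "about" is, here is a small comparison (again with a helper name of my own choosing). Notice that the gap between the exact count and \(n\log(n)\) grows, but visibly slower than \(n\) itself, which is exactly the point of the error analysis that follows.

```python
# Comparing the exact lattice-point count with the area estimate n*log(n).
# divisor_sum is my own helper, using the column count from above.
from math import log

def divisor_sum(n):
    # sum_{k<=n} tau(k), computed as sum_{k<=n} floor(n/k)
    return sum(n // k for k in range(1, n + 1))

for n in [10, 100, 1000, 10000]:
    exact = divisor_sum(n)
    estimate = n * log(n)
    # The last column stays well below 1, consistent with an O(n) error.
    print(n, exact, round(estimate, 1), round((exact - estimate) / n, 4))
```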

Look at the shaded difference between the area under the curve (which is \(n\log(n)\)) and the area of the red squares (which is the sum of all the \(\tau\) values).

  • All the areas where the red squares are above the hyperbola add up to less than \(n\), because they are all \(1\) in width or less, and do not overlap vertically (they stack, as it were).

  • Similarly, all the areas where the hyperbola is higher add up to less than \(n\), because they are all \(1\) in height or less, and do not overlap horizontally.

(Actually, we would expect they would cancel quite a bit … and they do, as we will see. We don't need that yet.)

We can summarize this in the following three implications.

We can verify this graphically by plotting the average value against \(\log(n)\).

Lookin' good! There does seem to be some predictable error. What might it be?

Keeping \(x=0\) in view, it seems to be somewhat less than \(0.2\), although the error clearly bounces around. By zooming in, we see the error bouncing around roughly between \(0.15\) and \(0.16\), more or less, as \(x\) gets large. So will this give us something more precise?
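Rather than zooming in on a plot, we can print the difference directly. This is a rough check, with my own helper name, standing in for the book's interactive graphic.

```python
# Printing (average of tau up to n) - log(n) for increasing n; the
# values settle into the narrow band described above.
from math import log

def divisor_sum(n):
    return sum(n // k for k in range(1, n + 1))

for n in [100, 1000, 10000, 100000]:
    avg = divisor_sum(n) / n
    print(n, round(avg - log(n), 4))
```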

Subsection 20.3.2 Getting a handle on error

To answer this, we will try one more geometric trick.

Notice we have now divided the lattice points up into three parts, two of which are ‘the same’:

  • The ones on the line \(y=x\).

  • The lattice points above the line and below the hyperbola.

  • The lattice points to the right of the line and below the hyperbola.

Let's count how many there are of each.

First, there are exactly \(\lfloor\sqrt{n}\rfloor\leq \sqrt{n}\) points on the line. At each integer \(y\)-value \(d\) up to \(y=\sqrt{n}\), there are \(\lfloor n/d\rfloor-d\) lattice points above the line and below the hyperbola. Analogously, at each integer \(x\)-value \(d\) up to \(x=\sqrt{n}\), there are \(\lfloor n/d\rfloor-d\) points to the right of the line and below the hyperbola.

Combining these computations as sums over the integers \(d\leq\sqrt{n}\), and noting that each floor \(\lfloor n/d\rfloor\) is within one of \(n/d\), \begin{equation*}\sum_{k=1}^n \tau(k)=\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d)+\sum_{d\leq \sqrt{n}} (\lfloor n/d\rfloor-d) +\lfloor\sqrt{n}\rfloor\leq 2\sum_{d\leq \sqrt{n}} (n/d-d) +\sqrt{n}\end{equation*} so the total error introduced by this approximation is at most \(2\sqrt{n}+1=O(\sqrt{n})\).
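Before any approximation enters, the three-part count is exact, and we can check it on the nose (a sketch assuming only math.isqrt from the standard library; the helper names are mine).

```python
# Verifying the exact three-part count: the two mirror-image regions
# plus the points on the line y = x reproduce sum_{k<=n} tau(k).
from math import isqrt

def divisor_sum(n):
    return sum(n // k for k in range(1, n + 1))

for n in [10, 100, 1000, 10000]:
    r = isqrt(n)  # this is floor(sqrt(n))
    three_part = 2 * sum(n // d - d for d in range(1, r + 1)) + r
    assert three_part == divisor_sum(n)
    print(n, three_part)
```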

Next we rewrite this using the formula for the sum of the first \(\ell\) integers: \begin{equation*}\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\sum_{d\leq \sqrt{n}}d+O(\sqrt{n})\end{equation*}\begin{equation*}= 2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-2\left(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\right)+O(\sqrt{n})\, .\end{equation*} The difference between \(\frac{\lfloor\sqrt{n}\rfloor(\lfloor\sqrt{n}\rfloor+1)}{2}\) and \(\frac{n}{2}\) is once again \(O(\sqrt{n})\), so using some of the work in Exercise Group 20.6.1–20.6.5 we finally get that \begin{equation*}\sum_{k=1}^n \tau(k)=2n\sum_{d\leq \sqrt{n}}\frac{1}{d}-n+O(\sqrt{n})\Rightarrow \frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\; .\end{equation*}
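And we can watch that final error term behave as advertised: if the error really is \(O(1/\sqrt{n})\), then multiplying it by \(\sqrt{n}\) should stay bounded. A minimal sketch, with the usual caveat that the helper name is mine:

```python
# Measuring the error in
#   (1/n) * sum tau(k) = 2 * sum_{d<=sqrt(n)} 1/d - 1 + error.
# If error = O(1/sqrt(n)), then error * sqrt(n) should stay bounded.
from math import isqrt, sqrt

def divisor_sum(n):
    return sum(n // k for k in range(1, n + 1))

for n in [100, 10000, 1000000]:
    approx = 2 * sum(1 / d for d in range(1, isqrt(n) + 1)) - 1
    error = divisor_sum(n) / n - approx
    print(n, round(error, 6), round(error * sqrt(n), 3))
```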

Subsection 20.3.3 The end of the story

We're almost at the end of the story! It's been a while since we explored the long-term average of \(\tau\) in Subsection 20.2.1; at that point, you likely convinced yourself that \(\log(n)\) is close to the average value of \(\tau\).

So now we just need to relate the sum \(2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1\) to \(\log(n)\). I wish to emphasize just how small the error term \(O(1/\sqrt{n})\) is!

This graphic shows the exact difference between \(\sum_{k=1}^{m-1} \frac{1}{k}\) and \(\log(m)\). Even as \(m\to\infty\), the total area is simply the sum of a bunch of nearly-triangles, each of width exactly one, whose heights do not overlap (the same idea again) and add up to less than \(1\). So the difference between \(\sum_{k=1}^{m-1} \frac{1}{k}\) and \(\log(m)\) approaches a finite limit as \(m\to\infty\).

This number is very important! First of all, it is clearly related to the archetypal divergent series from calculus, the harmonic series \begin{equation*}\sum_{k=1}^{\infty} \frac{1}{k}\, .\end{equation*} However, this constant has taken on a life of its own.

Definition 20.3.2

The number \(\gamma\), or the Euler-Mascheroni constant, is defined by \begin{equation*}\gamma=\lim_{m\to\infty}\left(\sum_{k=1}^{m-1} \frac{1}{k}-\log(m)\right)\, .\end{equation*}
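Here is the definition turned directly into a computation; this is a plain Python sketch of my own, not one of the text's interactive cells.

```python
# Approximating the Euler-Mascheroni constant straight from the
# definition: gamma = lim_{m -> oo} ( sum_{k=1}^{m-1} 1/k - log(m) ).
from math import log

def gamma_approx(m):
    return sum(1 / k for k in range(1, m)) - log(m)

for m in [10, 1000, 100000]:
    print(m, round(gamma_approx(m), 8))
# The printed values creep up toward gamma = 0.57721566...
```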

You have almost certainly never heard of this number, but it is very important. There is even an entire book about this number, by Julian Havil [C.3.14]. It's a pretty good book, in fact!

Remark 20.3.3

Among other crazy properties, it shows up in the derivative of a generalization of the factorial function, called Gamma (\(\Gamma\)); in fact, \(\Gamma'(1)=-\gamma\). I am not making this up.

Consider the area corresponding to \(\gamma\) compared to its finite approximations. Notice that the “missing” part of the area (since we can't actually view all the way out to infinity) must be less than \(1/m\), since it will be the part lower than all the pieces we can see in the graphic for any given \(m\). So \(\gamma\) is within \(O(1/m)\) of any given finite approximation \(\sum_{k=1}^{m-1} \frac{1}{k}-\log(m)\).
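We can confirm that \(1/m\) bound numerically using a hardcoded reference value of \(\gamma\) (the constant below is the standard decimal expansion, not something computed here).

```python
# Checking that the finite approximation sits within 1/m of gamma.
from math import log

GAMMA = 0.5772156649015329  # standard reference value, hardcoded

for m in [10, 100, 1000, 10000]:
    approx = sum(1 / k for k in range(1, m)) - log(m)
    print(m, abs(GAMMA - approx) < 1 / m)  # prints True each time
```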

Now we put it all together! We know from above that \begin{equation*}\frac{1}{n}\sum_{k=1}^n \tau(k)=2\sum_{d\leq \sqrt{n}}\frac{1}{d}-1+O(1/\sqrt{n})\, .\end{equation*} Further, we can now substitute in the following for \(\sum_{d\leq \sqrt{n}}\frac{1}{d}\): \begin{equation*}\sum_{d\leq \sqrt{n}}\frac{1}{d}= \log(\sqrt{n})+\gamma+O(1/\sqrt{n})\; .\end{equation*} Once we do that, and take advantage of the log fact \(2\log(z)=\log\left(z^2\right)\), we get \begin{equation*}\frac{1}{n}\sum_{k=1}^n \tau(k)= \log(n)+(2\gamma-1)+O(1/\sqrt{n})\, .\end{equation*} That is exactly the asymptote and type of error that I have depicted below!
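As one last sanity check (same caveats as before about my helper names and the hardcoded value of \(\gamma\)), the difference between the true average and \(\log(n)+(2\gamma-1)\) does shrink as \(n\) grows:

```python
# The punchline, numerically: the average of tau minus the predicted
# asymptote log(n) + (2*gamma - 1) shrinks on the order of 1/sqrt(n).
from math import log

GAMMA = 0.5772156649015329  # standard reference value, hardcoded

def divisor_average(n):
    return sum(n // k for k in range(1, n + 1)) / n

for n in [100, 10000, 1000000]:
    predicted = log(n) + (2 * GAMMA - 1)
    print(n, round(divisor_average(n) - predicted, 6))
```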

It's not hard to prove that the average value of \(\tau\) grows at least as fast as \(\log(n)\), so this is a fairly sharp result. (It's even possible to show that the error in the sum \(\sum_{k=1}^n \tau(k)\) is \(O(\sqrt[3]{n})\), but is not \(O(\sqrt[4]{n})\).)