Section 3.1 Linear Diophantine Equations
The first goal for this chapter is to completely solve all linear Diophantine equations (of two variables). This is the question of finding solutions \(x,y\in\mathbb{Z}\) of equations of the generic form
Historical remark 3.1.1. Diophantine and his equations.
These equations have been studied since the late Roman era, most notably by the (Greek speaking) mathematician Diophantus, from whom we derive their name, though we know little else about him. One of the most notable things about Diophantus' work is that it incorporates a proto-algebra which begins to use certain Greek letters for an unknown – an advance which, unfortunately, did not go anywhere for over a millenium.
While Diophantus studied much more complicated equations as well (as we will see), methods for solving equations like \(6x+4y=2\) were pursued throughout antiquity and the medieval period – see Historical remark 2.4.7.
There are several main cases involved in the solution, as we see in the following theorem.
Theorem 3.1.2. Solutions of Linear Diophantine Equations.
Given integers \(a,b,c\text{,}\) we wish to find all integer solutions \(x,y\) to \(ax+by=c\text{.}\)
Let \(d=\gcd(a,b)\text{,}\) unless \(a=b=0\) in which case let \(d=0\text{.}\) We will consider cases by ease of generating solutions.
When \(c\) is not a multiple of \(d\) (including if \(c\neq d=0\)), there is no solution.
When \(a\) or \(b\) is zero (but not both) and the nonzero one divides \(c\text{,}\) there are infinitely many solutions that require little work to obtain.
When \(a,b\neq 0\) and \(c=d\text{,}\) there are infinitely many solutions, but you will need to first obtain one solution in order to generate the others.
When \(a,b\neq 0\) and \(c\) is a nontrivial multiple of \(d\text{,}\) there are infinitely many solutions that are easiest to generate by means of a solution to \(ax+by=d\text{.}\)
Proof.
The details are in the following subsections.
When \(c\) is not a multiple of \(d\text{:}\) Subsection 3.1.1
When \(a\) or \(b\) is zero: Subsection 3.1.2
When \(c=d\text{:}\) Subsection 3.1.3
When \(c\) is a nontrivial multiple of \(d\text{:}\) Subsection 3.1.4
You should definitely follow the steps with specific simple numbers to see how each proof works. Examples 3.1.3 and 3.1.4 are good models.
Subsection 3.1.1 If \(c\) is not a multiple of \(\gcd(a,b)\)
When \(d\neq 0\text{,}\) our previous theorems say that solving \(ax+by=c\) is impossible. Can you see why? For instance, try it out with \(a=6\text{,}\) \(b=9\text{,}\) and \(c=5\text{.}\)
Reading the statement of Theorem 3.1.2 carefully shows that this case includes the situation where \(a=0=b\) but \(c\neq 0\text{.}\) It is also an easy exercise to show this is impossible. You can provide full details of all these things in Exercise 3.6.8. Don't forget the division algorithm!f
Subsection 3.1.2 If \(a\) or \(b\) is zero
Suppose \(b=0\) – in which case \(\gcd(a,b)=a\text{.}\) (Try \(a=55\) as an example.)
Then we are just solving \(ax=c\text{,}\) so the equation is true because we already assumed that \(d=a\mid c\text{.}\) All pairs \(\left(\frac{c}{a},y\right)\) with integer \(y\) are solutions.
If \(a=0\) the answer is analogous; write it down for yourself as practice!
Subsection 3.1.3 If \(c=\gcd(a,b)\)
Suppose \(a,b\neq 0\) and \(c\) actually is the gcd of \(a\) and \(b\;\ldots\) then there is some work to do. Follow along with \(a=60\text{,}\) \(b=42\text{,}\) and \(c=6\) if you wish.
Your first step should be to get that gcd \(d\) via the Euclidean algorithm. Then you will be able to go backwards (i.e. using the Bezout identity 2.4.1) to get one solution \((x_0,y_0)\text{.}\) That is important, since now at least one \(ax_0+by_0=c\) is known.
The next step is the last one; write down the entire solution set:
There are three comments to make to finish the proof.
First, look at the structure of the solutions. The constants \(a\) and \(b\) have switched their ‘affiliation’ from \(x\) and \(y\) to \(y\) and \(x\text{.}\) Also note that \(x\) and \(y\) have \(\pm\) involved. It doesn't really matter which is which (switch \(-n\) for \(n\) to see why), but if they have the same sign it is wrong. (When in doubt, try something and then check to see if the answers are right.)
-
It's easy to check that any particular solution works.
\begin{equation*} a\left(x_0+\frac{b}{d}n\right)+b\left(y_0-\frac{a}{d}n\right)=ax_0+\frac{abn}{d}+by_0-\frac{abn}{d} \end{equation*}and \(ax_0+by_0=c\) by hypothesis.
-
Why does this give all solutions? First note that since the only common divisors of \(a\) and \(b\) are divisors of \(d\text{,}\) the integers \(\frac{b}{d}\) and \(\frac{a}{d}\) must be relatively prime.
Now pick another solution \(x=x',y=y'\text{,}\) and let's show it has the desired form. Start with
\begin{equation*} ax'+by'=c=ax_0+by_0 \end{equation*}and gather terms so that
\begin{equation*} \frac{a}{d}(x'-x_0)=-\frac{b}{d}(y'-y_0)\text{.} \end{equation*}Since \(\frac{b}{d}\) divides the right side, it divides the left side as well. Now we use Proposition 2.4.10 and the observation in the previous paragraph to see \(\frac{b}{d}\) must divide the \(x'-x_0\) factor of the left-hand side, so that there exists an integer \(k\) such that
\begin{equation*} x'-x_0=k\frac{b}{d}\text{, which means }x'=x_0+k\frac{b}{d}\text{,} \end{equation*}which is exactly what we just said was the form of all solutions.
Example 3.1.3. An easy example: \(6x+4y=2\).
Trial and error tells us that \(6x+4y=2\) can be solved with \(x_0=1,y_0=-1\text{.}\) Thus the full answer is
which we may rewrite as
Subsection 3.1.4 If \(c\) is a nontrivial multiple of the gcd
Finally, what if \(c\) is not the greatest common divisor but we still have solutions because \(d\mid c\text{?}\) (Follow along in Example 3.1.4 if you wish.)
First, we can write \(c=dm\text{,}\) where again \(d\) is the greatest common divisor.
In Subsection 3.1.3 we just saw that there must be a solution for \(ax+by=d\text{.}\) Take any solution \((x_0,y_0)\) to this equation.
-
By hypothesis, \(d=ax_0+by_0\text{.}\) Now multiply this by \(m\) to obtain
\begin{equation*} c=dm=ax_0 m+by_0 m=a(x_0 m)+b(y_0 m) \end{equation*}which shows \(x=x_0 m,y=y_0 m\) is a solution to the original equation \(ax+by=c\text{.}\)
-
Finally, the surprise is that the full solution has the same form as in Subsection 3.1.3:
\begin{equation*} x=x_0 m +\frac{b}{d}n, y=y_0 m-\frac{a}{d}n \end{equation*}It is easy to check and the proof is very similar to the case \(c=d\) (see Exercise 3.6.9). Intuitively, the reason you don't need the \(m\) in the fractions is because they will just cancel anyway.
Example 3.1.4.
Try to do \(15x-21y=6\text{,}\) a slightly harder one. (Hint: \(d=3\text{;}\) what are \(c\) and \(d\text{?}\)