Polynomials are used constantly in Number Theory, so it is important to understand them thoroughly.

__Example__
Let \(a,b,c,r,s,t\) be real numbers and define functions \(f,g:\mathbb{R}\rightarrow\mathbb{R}\) by

\[ f(x)=ax^2+bx+c\mbox{ and }g(x)=rx^2+sx+t. \]

If \(a=r,b=s\) and \(c=t\) then \(f(x)=g(x)\) for all \(x\in\mathbb{R}\) (and both functions have the same domain and codomain) so \(f=g\).

What about the converse - if \(f=g\) (and therefore \(f(x)=g(x)\) for all \(x\in\mathbb{R}\)), does it follow that \(a=r,b=s\) and \(c=t\)?

One way to answer this is with a little calculus.

If we differentiate \(f\), we get the function \(f'\) given by \(f'(x)=2ax+b\) and, differentiating again, we get \(f''\) given by \(f''(x)=2a\) (a constant function).

Similarly the functions \(g'\) and \(g''\) are given by \(g'(x)=2rx+s\) and \(g''(x)=2r\).

If \(f=g\) then \(f'=g'\) and \(f''=g''\) so, in particular,

\[\begin{eqnarray*}

c & = & f(0)=g(0)=t \\

b & = & f'(0)=g'(0)=s \\

a & = & \frac{1}{2}f''(0)=\frac{1}{2}g''(0)=r.

\end{eqnarray*}\]

Thus \(f=g\) if and only if \(a=r,b=s\) and \(c=t\) but in order to prove this we relied upon the fact that we were working in the real numbers.

__Example__
Let \(R=\mathbb{Z}/2\mathbb{Z}\) (the set of integers modulo 2) and define functions \(f,g:R\rightarrow R\) by

\[ f(x)=x^2\mbox{ and }g(x)=x. \]

The integers modulo 2 are just \(\bar{0}\) and \(\bar{1}\), and we have \(\bar{0}^2=\bar{0}\) and \(\bar{1}^2=\bar{1}\), so \(f(x)=g(x)\) for all \(x\in R\) hence \(f=g\).

In this case, the two distinct polynomials \(x^2\) and \(x\) gave us the same function.

__Proposition 112__
Let \(R\) be an infinite integral domain, \(m,n\) non-negative integers and \(a_0,a_1,\ldots,a_n\) and \(b_0,b_1,\ldots,b_m\) elements of \(R\).

Define \(a_i=0\) for all integers \(i>n\) and \(b_i=0\) for all \(b>m\).

Let \(f,g:R\rightarrow R\) be the functions given by

\[ \begin{eqnarray*}

f(x) & = & a_0+a_1x+a_2x^2+a_3x^3+\ldots +a_nx^n \\

g(x) & = & b_0+b_1x+b_2x^2+b_3x^3+\ldots +b_mx^m.

\end{eqnarray*} \]

If \(f=g\) then it follows for all integers \(i\geq 0\) that \(a_i=b_i\).

__proof__
Let \(k\) be the maximum of \(m\) and \(n\), and \(h:R\rightarrow R\) the function given by

\[ h(x)=(a_0-b_0)+(a_1-b_1)x+(a_2-b_2)x^2+\ldots +(a_k-b_k)x^k. \]

Suppose there exists \(j\in \{0,1,\ldots,k\}\) for which \(a_j\neq b_j\), choosing \(j\) as large as possible.

If \(j>0\) then, by proposition 60, there are at most \(j\) values of \(x\) for which \(h(x)=0\).

But \(R\) is infinite, so there exists \(x\in R\) with \(h(x)\neq 0\) and therefore \(f(x)\neq g(x)\).

If instead \(j=0\) then \(h(0)\neq 0\) so \(f(0)\neq g(0)\).

It follows that if \(f=g\) then no such \(j\) exists, hence \(a_i=b_i\) for all integers \(i\geq 0\). ∎

__Proposition 113__
Let \(R\) be a finite ring and, for each positive integer \(n\), let \(f_n:R\rightarrow R\) be the function given by \(f_n(x)=x^n\).

Then there exist positive integers \(i,j\) such that \(i\neq j\) but \(f_i=f_j\).

__proof__
Let \(n\) be the number of elements in the finite ring \(R\).

Then there are only \(n^n\) distinct functions from \(R\) to itself (we only have \(n\) options for the image of each element)

so the functions \(f_1,f_2,\ldots,f_{n^n+1}\) cannot all be different.

Hence there exist \(i,j\) such that \(i\neq j\) but \(f_i=f_j\). ∎

In general, therefore, distinct polynomials may give us the same function.

So our rigorous definition of what a polynomial actually

*is* cannot make use of these functions - we must find a different approach.

Let \(R\) be a ring.

A

**sequence** in \(R\) is a list of elements of \(R\) indexed by the non-negative integers:

\[ a_0,a_1,a_2,a_3,\ldots \]

(it corresponds with a function \(a:\mathbb{N}\rightarrow R\)).

Sequences are by definition infinitely long, and the same element is allowed to appear more than once.

For each \(i\in\mathbb{N}\), the element \(a_i\in R\) is called a

**term** in the sequence.

We denote the entire sequence by giving the \(i\)-th term in brackets: \((a_i)_{i\in\mathbb{N}}\) or simply \((a_i)\).

For example, we write \((2i+1)_{i\in\mathbb{N}}\) for the sequence

\[ 1,3,5,7,9,11,\ldots \]

Sequences \((a_i)_{i\in\mathbb{N}}\) and \((b_i)_{i\in\mathbb{N}}\) in \(R\) are equal if and only if the corresponding terms are equal, i.e. \(a_i=b_i\) for all \(i\in\mathbb{N}\).

A

**polynomia**l over \(R\) is a sequence in \(R\) with only finitely many non-zero terms.

Thus, if \((a_i)_{i\in\mathbb{N}}\) is a polynomial over \(R\) then there exists an integer \(n\geq 0\) such that, for all \(i>n\), \(a_i=0\).

Let \((a_i)_{i\in\mathbb{N}}\) and \((b_i)_{i\in\mathbb{N}}\) be polynomials over \(R\).

We produce their sum by adding together corresponding terms, i.e. we define

\[ (a_i)_{i\in\mathbb{N}}+(b_i)_{i\in\mathbb{N}}=(c_i)_{i\in\mathbb{N}}

\mbox{ where }c_i=a_i+b_i\mbox{ for each }i\in\mathbb{N}. \]

It is easy to see that the resulting sequence has only finitely many non-zero terms, so that the sum of two polynomials is again a polynomial.

Now let \(c_0=a_0b_0\), \(c_1=a_0b_1+a_1b_0\), \(c_2=a_0b_2+a_1b_1+a_2c_0\) and in general, for each non-negative integer \(k\), let \(c_k=\sum_{i=0}^ka_ib_{k-i}\).

Then the product of the polynomials \((a_i)_{i\in\mathbb{N}}\) and \((b_i)_{i\in\mathbb{N}}\) is defined by

\[ (a_i)_{i\in\mathbb{N}}(b_i)_{i\in\mathbb{N}}=(c_i)_{i\in\mathbb{N}}. \]

There exist non-negative integers \(m,n\) such that \(a_i=0\) for all \(i>n\) and \(b_i=0\) for all \(i>m\).

Take any integer \(k>m+n\). For all \(i\in\{0,1,\ldots,k\}\), if \(i>n\) then \(a_i=0\) and otherwise \(k-i>m\) so \(b_{k-i}=0\), hence \(c_k=0\).

Thus the product of two polynomials is also a polynomial.

__Proposition 114__
Let \(R\) be a ring.

Then the set of all polynomials over \(R\) with addition and multiplication as defined above also forms a ring.

__proof__
Let \(S\) be the set of all polynomials over \(R\).

As we add polynomials together term by term, the ring axioms for addition in \(S\) follow directly from the corresponding properties in \(R\).

The neutral element is \((0)_{i\in\mathbb{N}}\) (i.e. the sequence \(0,0,0,\ldots\)) and, for each \((a_i)_{i\in\mathbb{N}}\in S\), its inverse under addition is \((-a_i)_{i\in\mathbb{N}}\).

The neutral element of \(S\) under multiplication is \(1,0,0,0,\ldots\).

Take any \((a_i)_{i\in\mathbb{N}},(b_i)_{i\in\mathbb{N}}\) and \((c_i)_{i\in\mathbb{N}}\in S\).

We shall abbreviate these to \(a,b\) and \(c\) respectively.

For any non-negative integer \(j\), we shall write \([ab]_j\) for the \(j\)-th term of the sequence obtained by multiplying the sequences \(a\) and \(b\) together,

and similarly for other sums and products.

Take any \(m\in\mathbb{N}\). Let \(X_2\) be the set of all ordered pairs \((i,j)\) of non-negative integers with \(i+j=m\),

and \(X_3\) the set of all ordered triples \((i,j,k)\) of non-negative integers with \(i+j+k=m\).

Then (using associativity of multiplication in \(R\))

\[ [(ab)c]_m=\sum_{(j,k)\in X_2}[ab]_jc_k

=\sum_{(i,j,k)\in X_3}a_ib_jc_k=\sum_{(i,j)\in X_2}a_i[bc]_j=[a(bc)]_m \]

hence multiplication in \(S\) is associative.

Also, using distributivity in \(R\),

\[\begin{eqnarray*}

[a(b+c)]_m & = & \sum_{i=0}^ma_i[b+c]_{m-i}=\sum_{i=0}^ma_i(b_{m-i}+c_{m-i}) \\

& = & \sum_{i=0}^m(a_ib_{m-i}+a_ic_{m-i})

=\sum_{i=0}^ma_ib_{m-i}+\sum_{i=0}^ma_ic_{m-i} \\

& = & [ab]_m+[ac]_m=[ab+ac]_m

\end{eqnarray*}\]

and similarly \((a+b)c=ac+bc\).

It follows that \(S\) is a ring under the given addition and multiplication. ∎

__Notation__
Let \(R\) be a ring, and \(S\) the set of all polynomials over \(R\).

For any \(a,b\in R\), the sum of the sequences \(a,0,0,0,\ldots\) and \(b,0,0,0,\ldots\) is the sequence \(a+b,0,0,0,\ldots\), while their product is \(ab,0,0,0,\ldots\).

The neutral element of \(S\) under addition (its zero) is the sequence \(0,0,0,0,\ldots\) while the neutral element under multiplication (its one) is \(1,0,0,0,\ldots\).

Thus the polynomials in which all terms after the first are zero behave exactly like the elements of \(R\).

For each \(a\in R\), we may therefore write the polynomial \(a,0,0,0\ldots\) simply as \(a\).

These polynomials are called the

**constant polynomials**, and identifying them with the elements of \(R\) makes \(R\) into a subring of \(S\).

We write \(X\) for the polynomial \(0,1,0,0,0,\ldots\). (Choose some other symbol if you are already using \(X\) for something.)

It then follows that \(X^2\) is the sequence \(0,0,1,0,0,0,\ldots\), \(X^3\) is \(0,0,0,1,0,0,0,\ldots\), etc.

So, for any non-negative integer \(n\) and any elements \(a_0,a_1,\ldots,a_n\in R\), the polynomial \(a_0,a_1,a_2,\ldots,a_n,0,0,0,\ldots\) can be written as \(a_0+a_1X+a_2X^2+\ldots +a_nX^n\),

and this is the standard notation for it.

We write \(R[X]\) for the set \(S\) of all polynomials over \(R\).

Let \(f\in R[X]\) with \(f\neq 0\).

Then \(f=a_0+a_1X+a_2X^2+\ldots +a_nX^n\) for some integer \(n\geq 0\) and some \(a_0,a_1,\ldots,a_n\in R\), and we may choose \(n\) so that \(a_n\neq 0\).

We call the elements \(a_0,a_1,\ldots,a_n\) the

**coefficients** of \(f\), and \(n\) is called the

**degree** of \(f\), written \(\deg(f)\).

We call \(a_n\) the

**leading coefficient** of \(f\), and say that \(f\) is

**monic** if its leading coefficient is 1.

(In the case \(f=0\), some people define the degree of \(f\) to be \(-1\) or \(-\infty\), while others leave it undefined.)

Thus the constant polynomials are the polynomials of degree 0 together with the zero polynomial.

__Example__

\(X^2-3X+5\in\mathbb{Z}[X]\) has degree 2 and leading coefficient 1, so it is monic.

__Proposition 115__
Let \(R\) be an integral domain.

Then

i) for all \(f,g\in R[X]\setminus\{0\}\), \(\deg(fg)=\deg(f)+\deg(g)\);

ii) \(R[X]\) is also an integral domain, and

iii) \(R[X]^*=R^*\).

__proof__
i) Take any \(f,g\in R[X]\setminus\{0\}\). Then

\[ f=a_0+a_1X+a_2X^2+\ldots +a_nX^n\mbox{ and }

g=b_0+b_1X+b_2X^2+\ldots +b_mX^m \]

for some non-negative integers \(m,n\) and elements \(a_0,a_1,\ldots,a_n,b_0,b_1,\ldots,b_m\in R\).

As \(f\neq 0\neq g\), we may choose these so that \(a_n\neq 0\neq b_m\), and then \(\deg(f)=n\) and \(\deg(g)=m\).

Define \(a_i=0\) for all integers \(i>n\) and similarly \(b_i=0\) for all \(i>m\).

For each \(k\in\mathbb{N}\), let \(c_k=\sum_{i=0}^ka_ib_{k-i}\).

Then, for all integers \(k>m+n\), \(c_k=0\) (shown above when we defined the product of polynomials).

And \(c_{m+n}=a_nb_m\neq 0\) as \(R\) is an integral domain,hence \(fg\neq 0\) and \(\deg(fg)=m+n=\deg(f)+\deg(g)\).

ii) Still using the notation from part (i) above, we have for all \(i,j\in\mathbb{N}\) that \(a_ib_j=b_ja_i\) (as \(R\) is a commutative ring)

and it follows for each \(k\in\mathbb{N}\) that \(\sum_{i=0}^ka_ib_{k-i}=\sum_{i=0}^kb_ia_{k-i}\) hence \(fg=gf\).

Thus \(R[X]\) is a commutative ring.

The constant polynomials in \(R[X]\) correspond with the elements of \(R\), and \(1\neq 0\) in \(R\) so \(1\neq 0\) in \(R[X]\) too.

Moreover, we saw in part (i) that \(R[X]\) has no zero divisors. Hence \(R[X]\) is an integral domain.

iii) For any \(u\in R^*\), there exists \(v\in R\) with \(uv=vu=1\).

As \(R\) is a subring of \(R[X]\), we have \(uv=vu=1\) in \(R[X]\), too (viewing \(u,v\) as constant polynomials) so \(u\in R[X]^*\).

Thus \(R^*\subset R[X]^*\).

Conversely, for any \(f\in R[X]^*\), there exists \(g\in R[X]\) with \(fg=gf=1\) so, by part (i) above, \(\deg(f)+\deg(g)=\deg(1)=0\).

As the degree of a polynomial is non-negative, it follows that \(\deg(f)=\deg(0)=0\).

Thus \(f\) and \(g\) are both constant polynomials, behaving like the corresponding elements of \(R\) hence \(f\in R^*\) as well.

It follows that \(R[X]^*\subset R^*\) too, and therefore \(R[X]^*=R^*\). ∎

Just as with the (rational) integers and Gaussian integers, division with remainder is important for polynomials.

__Proposition 116__
Let \(R\) be a ring and \(f,g\in R[X]\) with \(g\neq 0\) and \(g\)'s leading coefficient a unit in \(R\).

Then there exist unique \(q,r\in R[X]\) such that \(f=qg+r\) and either \(r=0\) or \(\deg(r)<\deg(g)\).

__proof__
EXISTENCE

If \(f=0\) then putting \(q=r=0\) suffices.

Otherwise,

\[ f=a_0+a_1X+a_2X^2+\ldots +a_nX^n\mbox{ and }

g=b_0+b_1X+b_2X^2+\ldots +b_mX^m \]

where \(n=\deg(f)\), \(m=\deg(g)\) and \(a_0,\ldots,a_n,b_0\ldots,b_m\in R\) with \(a_n\neq 0\) and \(b_m\in R^*\), so \(b_mc=cb_m=1\) for some \(c\in R\).

Induction on \(n\).

In the case \(n=0\), if \(m>0\) then putting \(q=0\) and \(r=f\) suffices.

If \(m=0\) too, let \(q=a_0c\) and \(r=0\). Then \(qg+r=a_0cb_0+0=a_0=f\) (and \(r=0\)).

Take any integer \(N>0\) and suppose existence holds for all \(n\in\mathbb{N}\) with \(n<N\).

In the case \(n=N\), if \(m>n\) then we can put \(q=0\) and \(r=f\) as before.

Suppose instead that \(m\leq n\), and let \(h=f-(a_ncX^{n-m})g\).

Then the coefficient of \(X^n\) in \(h\) is 0 so \(h=0\) or \(\deg(h)<n\) and thus, by our assumption, there exist \(q',r\in R[X]\) such that \(h=q'g+r\) and either \(r=0\) or \(\deg(r)<\deg(g)\).

Let \(q=q'+a_ncX^{n-m}\). Then \(f=qg+r\) with \(r=0\) or \(\deg(r)<\deg(g)\) as required.

By induction, existence holds for all non-negative integers \(n\).

UNIQUENESS

Suppose \(f=qg+r\) and \(f=q'g+r'\) with \(r=0\) or \(\deg(r)<\deg(g)\) and similarly for \(r'\).

Then \(r-r'=(q'-q)g\) with \(r-r'=0\) or \(\deg(r-r')<\deg(g)\).

Suppose \(q'-q\neq 0\). As the leading coefficient of \(g\) is a unit, it cannot be a zero divisor, and it follows that \((q'-q)g\neq 0\) with \(\deg((q'-q)g)\geq\deg(g)\).

But \((q'-q)g=r-r'\) so then \(r-r'\neq 0\) too and \(\deg(r-r')<\deg(g)\), which is a CONTRADICTION.

Hence \(q'-q=0\) giving \(q'=q\) and \(r'=r\). ∎

In particular, if \(R\) is a field then \(R^*=R\setminus\{0\}\) (exercise 89) so we can divide by any non-zero polynomial over \(R\), getting a quotient and remainder.

__Example__
Long division for polynomials: divide \(f=5x^3-3x^2+8\) by \(g=x+2\in\mathbb{Z}[X]\).

The leading coefficient of \(g\) is the unit 1, so division with remainder is possible.

First insert the missing power of \(x\): \(f=5x^3-3x^2+0x+8\).

Step 1: Dividing the leading term \(5x^3\) of \(f\) by the leading term \(x\) of \(g\), we get \(5x^2\), which we put as the leading term of the quotient.

Multiplying it by \(q=x+2\), we get \(5x^3+10x^2\), which we subtract from \(f=5x^3-3x^2+0x+8\) to get the remainder \(r_1=-13x^2+0x+8\) after step 1.

Step 2: Dividing the leading term \(-13x^2\) of \(r_1\) by the leading term \(x\) of \(q\), we get \(-13x\), which we put as the next term of the quotient.

Multiplying it by \(q=x+2\), we get \(-13x^2-26x\), which we subtract to get the remainder \(r_2=26x+8\) after step 2.

Step 3:We divide the leading term \(26x\) of the remainder \(r_2\) by the leading term \(x\) of \(q\), getting 26, which we put as the next term of the quotient.

We multiply it by \(q\), getting \(26x+52\), which we subtract to get the remainder \(r_3=-44\) after step 3.

Conclusion: the quotient is \(5x^2-13x+26\) and the remainder is -44, i.e. \(5x^3-3x^2+8=(5x^2-13x+26)(x+2)-44\).

Let \(R\) be a ring and \(f\in R[X]\).

Then \(f=a_0+a_1X+a_2X^2+\ldots +a_nX^n\) for some integer \(n\geq 0\) and some \(a_0,a_1,\ldots,a_n\in R\).

Let \(S\) be a ring with \(R\) as a subring (possibly with \(S=R\)).

For any element \(c\in S\), we define the

**evaluation** of \(f\) at \(c\) by

\[ f(c)=a_0+a_1c+a_2c^2+\ldots +a_nc^n\in S. \]

__Examples__
Let \(f=X^2-3X+2\in\mathbb{Z}[X]\).

Then

\[\begin{eqnarray*}

f(0) & = & 2\in\mathbb{Z} \\

f(1) & = & 0\in\mathbb{Z} \\

f\left(\frac{1}{2}\right) & = & \frac{3}{4}\in\mathbb{Q} \\

f(X+1) & = & X^2-X\in\mathbb{Z}[X]

\end{eqnarray*} \]

__Proposition 117__
Let \(R\) be a ring, \(a\in R\) and \(f\in R[X]\).

Then the unique remainder on dividing \(f\) by \(X-a\) is \(f(a)\).

__proof__
Write \(f=b_0+b_1X+b_2X^2+\ldots +b_nX^n\) where \(n\in\mathbb{N}\) and \(b_0,b_1,\ldots,b_n\in R\).

Then \(f-f(a)=b_1(X-a)+b_2(X^2-a^2)+\ldots +b_n(X^n-a^n)\).

For each integer \(m\geq 1\), define

\[ h_m=X^{m-1}+aX^{m-2}+a^2X^{m-3}+\ldots +a^{m-2}X+a^{m-1}\in R[X] \]

and let \(q=b_1h_1+b_2h_2+\ldots +b_nh_n\).

Then for all \(m\geq 1\) we have \(X^m-a^m=h_m\cdot (X-a)\) and hence \(f=q\cdot (X-a)+f(a)\). ∎

For any ring \(R\), we call \(a\in R\) a

**zero** or

**root** of \(f\in R[X]\) if \(f(a)=0\).

__Exercises__
91. Let \(R\) be a ring (not necessarily commutative). Show that \(X\cdot f=f\cdot X\) for all \(f\in R[X]\).

92. Let \(p\) be a prime number, \(R=\mathbb{Z}/p\mathbb{Z}\) and \(f=X^p-X\in R[X]\). Show that every \(x\in R\) is a zero of \(f\).

93. Let \(R\) be an integral domain, \(f\in R[X]\), \(n\geq 1\) an integer and \(a_1,a_2,\ldots,a_n\) distinct zeroes of \(f\).

Show there exists \(q\in R[X]\) such that \(f=q(X-a_1)(X-a_2)\ldots (X-a_n)\).

94. Show that every function from \(\mathbb{Z}/3\mathbb{Z}\) to itself can be specified using a polynomial of degree at most 2 (or the zero polynomial).