§6.1 Ideals and Congruence
Our goal is to develop a notion of congruence in arbitrary rings that includes as special cases congruence modulo \(n\) in \(\mathbb{Z}\) and congruence modulo \(p(x)\) in \(F[x]\). We begin by taking a second look at some examples of congruence in \(\mathbb{Z}\) and \(F[x]\) from a somewhat different viewpoint than before.
Example 1
In the ring \(\mathbb{Z}\), \(a \equiv b \ (\text{mod}\ 3)\) means that \(a - b\) is a multiple of 3. Let \(I\) be the set of all multiples of 3, so that:
\[ I = \{0, \pm 3, \pm 6, \dots \} \]Then congruence modulo 3 may be characterized like this:
\[ a \equiv b \ (\text{mod}\ 3) \ \text{means} \ a - b \in I. \]Observe that the subset \(I\) is actually a subring of \(\mathbb{Z}\) (sums and products of multiples of 3 are also multiples of 3). Furthermore, the product of any integer and a multiple of 3 is itself a multiple of 3. Thus, the subring \(I\) has this property:
\[ \text{Whenever } k \in \mathbb{Z} \text{ and } i \in I, \text{ then } ki \in I. \]
Example 2
The notation \(f(x) \equiv g(x) \ (\text{mod}\ x^2 - 2)\) in the polynomial ring \(\mathbb{Q}[x]\) means that \(f(x) - g(x)\) is a multiple of \(x^2 - 2\). Let \(I\) be the set of all multiples of \(x^2 - 2\) in \(\mathbb{Q}[x]\), that is,
\[ I = \{h(x)(x^2 - 2) \mid h(x) \in \mathbb{Q}[x]\}. \]Once again, it is not difficult to check that \(I\) is a subring of \(\mathbb{Q}[x]\) with this property:
\[ \text{Whenever } k(x) \in \mathbb{Q}[x] \text{ and } t(x) \in I, \text{ then } k(x)t(x) \in I \](The product of any polynomial with a multiple of \(x^2 - 2\) is itself a multiple of \(x^2 - 2\).)
Congruence modulo \(x^2 - 2\) may be described in terms of \(I\):
\[ f(x) \equiv g(x) \ (\text{mod}\ x^2 - 2) \text{ means } f(x) - g(x) \in I. \]These examples suggest that congruence in a ring \(R\) might be defined in terms of certain subrings. If we select such a subring, we might define \(a \equiv b \ (\text{mod}\ I)\) to mean \(a - b \in I\). The subring \(I\) might consist of all multiples of a fixed element, as in the preceding examples, but there is no reason for restricting to this situation. These examples indicate that the key property for such a subring \(I\) is that it "absorbs products": Whenever you multiply an element of \(I\) by any element of the ring (either inside or outside \(I\)), the resulting product is an element of \(I\). The set of all multiples of a fixed element has this absorption property. We will see that many other subrings have it as well. Because such subrings play a crucial role in what follows, we pause to give them a name and to consider their basic properties.
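The absorption property observed in Example 1 can be spot-checked numerically. The following Python sketch tests it on a finite window of integers; the helper name `in_I` and the window bounds are illustrative choices, not part of the text, and a finite check only illustrates the property rather than proving it.

```python
# Illustrative check (not a proof): multiples of 3 absorb products in Z.
def in_I(n: int) -> bool:
    """Membership test for I = {0, ±3, ±6, ...}."""
    return n % 3 == 0

window = range(-20, 21)                    # arbitrary finite sample of Z
members = [i for i in window if in_I(i)]   # elements of I inside the window

# Subring conditions: closure under subtraction and multiplication.
closed_sub = all(in_I(a - b) for a in members for b in members)
closed_mul = all(in_I(a * b) for a in members for b in members)

# Absorption: k * i lies in I for every sampled integer k and every i in I.
absorbs = all(in_I(k * i) for k in window for i in members)

print(closed_sub, closed_mul, absorbs)
```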
Definition
A subring \(I\) of a ring \(R\) is an ideal provided:
\[ \text{Whenever } r \in R \text{ and } a \in I, \text{ then } ra \in I \text{ and } ar \in I. \]The double absorption condition that \(ra \in I\) and \(ar \in I\) is necessary for noncommutative rings. When \(R\) is commutative, as in the preceding examples, this condition reduces to \(ra \in I\).
Example 3
The zero ideal in a ring \(R\) consists of the single element \(0_R\). This is a subring that absorbs all products since \(r0_R = 0_R = 0_Rr\) for every \(r \in R\). The entire ring \(R\) is also an ideal.
Example 4
In the ring \(\mathbb{Z}[x]\) of all polynomials with integer coefficients, let \(I\) be the set of polynomials whose constant terms are even integers. Thus, \(x^3 + x + 6\) is in \(I\), but \(4x^2 + 3\) is not. Verify that \(I\) is an ideal in \(\mathbb{Z}[x]\) (Exercise 2).
Example 5
Let \(T\) be the ring of all functions from \(\mathbb{R}\) to \(\mathbb{R}\), as described in Example 8 of Section 3.1. Let \(I\) be the subset consisting of those functions \(g\) such that \(g(2) = 0\). Then \(I\) is a subring of \(T\) (Exercise 14 of Section 3.1). If \(f\) is any function in \(T\) and if \(g \in I\), then:
\[ (fg)(2) = f(2)g(2) = f(2) \cdot 0 = 0. \]Therefore, \(fg \in I\). Similarly, \(gf \in I\), so that \(I\) is an ideal in \(T\).
Example 6
The subring \(\mathbb{Z}\) of the rational numbers \(\mathbb{Q}\) is not an ideal in \(\mathbb{Q}\) because \(\mathbb{Z}\) fails to have the absorption property. For instance, \(\frac{1}{2} \in \mathbb{Q}\) and \(5 \in \mathbb{Z}\), but their product:
\[ \frac{5}{2} \]is not in \(\mathbb{Z}\).
Example 7
Verify that the set \(I\) of all matrices of the form \(\begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix}\) with \(a, b \in \mathbb{R}\) forms a subring of the ring \(M(\mathbb{R})\) of all \(2 \times 2\) matrices over the reals. It is easy to see that \(I\) absorbs products on the left:
\[ \begin{pmatrix} r & s \\ t & u \end{pmatrix} \begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix} = \begin{pmatrix} ra + sb & 0 \\ ta + ub & 0 \end{pmatrix} \in I. \]But \(I\) is not an ideal in \(M(\mathbb{R})\) because it may not absorb products on the right—for instance:
\[ \begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} = \begin{pmatrix} 3 & 4 \\ 6 & 8 \end{pmatrix} \notin I. \]One sometimes says that \(I\) is a left ideal, but not a two-sided ideal, in \(M(\mathbb{R})\).
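The two products in Example 7 can be reproduced with a few lines of Python. This is a minimal sketch: matrices are stored as nested tuples, and the helper names `matmul` and `in_I` are assumptions made for illustration.

```python
# 2x2 matrices as nested tuples ((a, b), (c, d)); helper names are illustrative.
def matmul(A, B):
    """Ordinary 2x2 matrix product A * B."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def in_I(M):
    """True when the second column of M is zero, i.e. M has the form of Example 7."""
    return M[0][1] == 0 and M[1][1] == 0

a = ((1, 0), (2, 0))   # an element of I
m = ((3, 4), (5, 6))   # an arbitrary matrix in M(R)

left = matmul(m, a)    # multiplying on the left keeps the product in I
right = matmul(a, m)   # multiplying on the right does not

print(left, in_I(left))
print(right, in_I(right))
```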
The following generalization of Theorem 3.6 often simplifies the verification that a particular subset of a ring is an ideal.
Theorem 6.1
A nonempty subset \(I\) of a ring \(R\) is an ideal if and only if it has these properties:
- (i) If \(a, b \in I\), then \(a - b \in I\);
- (ii) If \(r \in R\) and \(a \in I\), then \(ra \in I\) and \(ar \in I\).
Proof: Every ideal certainly has these two properties. Conversely, suppose \(I\) has properties (i) and (ii). Then \(I\) absorbs products by (ii), so we need only verify that \(I\) is a subring. Property (i) states that \(I\) is closed under subtraction. Since \(I\) is a subset of \(R\), the product of any two elements of \(I\) must be in \(I\) by (ii). In other words, \(I\) is closed under multiplication. Therefore, \(I\) is a subring of \(R\) by Theorem 3.6.
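For a finite ring, the two conditions of Theorem 6.1 can be checked by exhaustive search. The sketch below does this for subsets of \(\mathbb{Z}_n\), with elements represented by the integers \(0, 1, \dots, n-1\); the function name `is_ideal` and the choice of \(\mathbb{Z}_{12}\) are assumptions made for illustration.

```python
def is_ideal(subset, n):
    """Apply the two conditions of Theorem 6.1 to a subset of Z_n."""
    I = {x % n for x in subset}
    if not I:
        return False  # the theorem requires a nonempty subset
    # Condition (i): closed under subtraction.
    closed_under_subtraction = all((a - b) % n in I for a in I for b in I)
    # Condition (ii): absorbs products. Z_n is commutative, so r*a covers a*r.
    absorbs_products = all((r * a) % n in I for r in range(n) for a in I)
    return closed_under_subtraction and absorbs_products

print(is_ideal({0, 4, 8}, 12))   # multiples of 4 in Z_12
print(is_ideal({0, 1}, 12))      # fails condition (i): 0 - 1 = 11 is missing
```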
Finitely Generated Ideals
In the first example of this section, we saw that the set \(I\) of all multiples of 3 is an ideal in \(\mathbb{Z}\). This fact is a special case of:
Theorem 6.2
Let \(R\) be a commutative ring with identity, \(c \in R\), and \(I\) the set of all multiples of \(c\) in \(R\), that is, \(I = \{rc \mid r \in R\}\). Then \(I\) is an ideal.
Proof: If \(r \in R\) and \(r_1c, r_2c \in I\), then:
\[ r_1c - r_2c = (r_1 - r_2)c \in I \]and
\[ r(r_1c) = (rr_1)c \in I \]because \(r_1 - r_2\) and \(rr_1\) are elements of \(R\). Similarly, since \(R\) is commutative, \((r_1c)r = (rr_1)c \in I\). Therefore, \(I\) is an ideal by Theorem 6.1.
The ideal \(I\) in Theorem 6.2 is called the principal ideal generated by \(c\) and hereafter will be denoted by \((c)\). In the ring \(\mathbb{Z}\), for example, \((3)\) indicates the ideal of all multiples of 3. In any commutative ring \(R\) with identity, the principal ideal \((1_R)\) is the entire ring \(R\) because \(r = r \cdot 1_R\) for every \(r \in R\). It can be shown that every ideal in \(\mathbb{Z}\) is a principal ideal (Exercise 40). However, there are ideals in other rings that are not principal, that is, ideals that do not consist of all the multiples of a particular element of the ring.
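In a finite commutative ring with identity, a principal ideal can be listed explicitly. The following sketch enumerates \((c)\) in \(\mathbb{Z}_n\); the function name `principal_ideal` and the use of \(\mathbb{Z}_{12}\) are illustrative assumptions, chosen because \(\mathbb{Z}\) itself cannot be enumerated.

```python
def principal_ideal(c, n):
    """All multiples r*c of c in Z_n, with elements written as 0..n-1."""
    return {(r * c) % n for r in range(n)}

print(sorted(principal_ideal(3, 12)))  # the ideal (3) in Z_12
print(sorted(principal_ideal(1, 12)))  # (1) is the whole ring
print(sorted(principal_ideal(8, 12)))  # coincides with (4)
```

The last line shows that different elements can generate the same principal ideal: in \(\mathbb{Z}_{12}\), the multiples of 8 and the multiples of 4 form the same set.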
Example 8
We have seen that the set \(I\) of all polynomials with even constant terms is an ideal in the ring \(\mathbb{Z}[x]\). We claim that \(I\) is not a principal ideal. To prove this, suppose, on the contrary, that \(I\) consists of all multiples of some polynomial \(p(x)\). Since the constant polynomial 2 is in \(I\), 2 must be a multiple of \(p(x)\). By Theorem 4.2, this is possible only if \(p(x)\) has degree 0, that is, if \(p(x)\) is a constant, say \(p(x) = c\). Since \(p(x) \in I\), the constant \(c\) must be an even integer. Since 2 is a multiple of \(p(x) = c\), the only possibility is \(c = \pm 2\). On the other hand, \(x \in I\), because it has even constant term 0. Therefore, \(x\) must be a multiple of \(p(x) = \pm 2\), say \(x = \pm 2g(x)\) with \(g(x) \in \mathbb{Z}[x]\). If \(a\) denotes the coefficient of \(x\) in \(g(x)\), then the coefficient of \(x\) must be the same on both sides, so that \(1 = \pm 2a\). This is impossible because \(a\) is an integer. Therefore, \(I\) does not consist of all multiples of \(p(x)\) and is not a principal ideal.
In a commutative ring with identity, a principal ideal consists of all multiples of a fixed element. Here is a generalization of that idea.
Theorem 6.3
Let \(R\) be a commutative ring with identity and \(c_1, c_2, \dots, c_n \in R\). Then the set:
\[ I = \{r_1c_1 + r_2c_2 + \dots + r_nc_n \mid r_1, r_2, \dots, r_n \in R\} \]is an ideal in \(R\).
Proof: Exercise 14.
The ideal \(I\) in Theorem 6.3 is called the ideal generated by \(c_1, c_2, \dots, c_n\), and is sometimes denoted by \((c_1, c_2, \dots, c_n)\). Such an ideal is said to be finitely generated. A principal ideal is the special case \(n = 1\), that is, an ideal generated by a single element.
The generators of a finitely generated ideal need not be unique, that is, the ideal generated by \(c_1, c_2, \dots, c_n\) might be the same set as the ideal generated by \(d_1, d_2, \dots, d_k\), even though no \(c_i\) is equal to any \(d_j\) (Exercise 16).
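The non-uniqueness of generators is easy to observe in a finite ring. The sketch below builds the set of Theorem 6.3 in \(\mathbb{Z}_n\) by brute force; the function name `generated_ideal` and the particular generators in \(\mathbb{Z}_{12}\) are assumptions chosen for illustration and are not taken from Exercise 16.

```python
from itertools import product

def generated_ideal(generators, n):
    """All sums r_1*c_1 + ... + r_k*c_k in Z_n, as in Theorem 6.3."""
    k = len(generators)
    return {
        sum(r * c for r, c in zip(coeffs, generators)) % n
        for coeffs in product(range(n), repeat=k)
    }

evens = generated_ideal([4, 6], 12)
print(sorted(evens))
print(evens == generated_ideal([2], 12))    # same ideal, one generator
print(evens == generated_ideal([10], 12))   # same ideal, another generator
```

Here the ideals generated by \(4, 6\), by \(2\), and by \(10\) coincide in \(\mathbb{Z}_{12}\), although the three generating lists have no element in common.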
Example 9
In the ring \(\mathbb{Z}[x]\), the ideal generated by the polynomial \(x\) and the constant polynomial 2 consists of all polynomials of the form:
\[ f(x) x + g(x) 2, \quad \text{with} \ f(x), g(x) \in \mathbb{Z}[x]. \]It can be shown that this ideal is the ideal \(I\) of all polynomials with even constant terms, which was discussed in Example 8 (Exercise 15).
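One direction of this claim is constructive: a polynomial with even constant term \(a_0\) can be written as \(f(x)x + g(x)2\) by taking \(g(x) = a_0/2\) and letting \(f(x)\) collect the remaining terms divided by \(x\). Conversely, the constant term of \(f(x)x + g(x)2\) is twice the constant term of \(g(x)\), hence even. The Python sketch below carries out the decomposition; polynomials are stored as coefficient lists with the constant term first, and the helper names are illustrative assumptions.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def decompose(p):
    """Return (f, g) with p = f*x + g*2, assuming the constant term of p is even."""
    if p[0] % 2 != 0:
        raise ValueError("constant term must be even")
    f = p[1:] or [0]   # coefficients of the terms of degree >= 1, shifted down
    g = [p[0] // 2]    # constant polynomial a_0 / 2
    return f, g

p = [6, 1, 0, 1]       # x^3 + x + 6, the element of I from Example 4
f, g = decompose(p)
rebuilt = poly_add(poly_mul(f, [0, 1]), poly_mul(g, [2]))
print(f, g, rebuilt == p)
```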
Congruence
Now that you are familiar with ideals, we can define congruence in an arbitrary ring:
Let \(I\) be an ideal in a ring \(R\) and let \(a, b \in R\). Then \(a\) is congruent to \(b\) modulo \(I\) (written \(a \equiv b \ (\text{mod}\ I)\)) provided that \(a - b \in I\).
*When a commutative ring does not have an identity, the ideal generated by \(c_1, c_2, \dots, c_n\) is defined somewhat differently (see Exercise 33).
Example 10
Let \(T\) be the ring of all functions from \(\mathbb{R}\) to \(\mathbb{R}\) and let \(I\) be the ideal of all functions \(g\) such that \(g(2) = 0\). If \(f(x) = x^2 + 6\) and \(h(x) = 5x\), then the function \(f - h\) is in \(I\) because:
\[ (f - h)(2) = f(2) - h(2) = (2^2 + 6) - (5 \cdot 2) = 0. \]Therefore, \(f \equiv h \ (\text{mod}\ I)\).
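Since membership in \(I\) depends only on the value at 2, congruence modulo \(I\) reduces to comparing two numbers. A minimal Python sketch of Example 10 follows; the helper name `congruent_mod_I` and the extra function `k` are assumptions introduced for illustration.

```python
def f(x):
    return x**2 + 6

def h(x):
    return 5 * x

def k(x):
    return x   # a hypothetical function that is not congruent to f

def congruent_mod_I(u, v):
    """u is congruent to v modulo I exactly when (u - v)(2) = 0."""
    return u(2) - v(2) == 0

print(congruent_mod_I(f, h))   # f(2) = 10 and h(2) = 10
print(congruent_mod_I(f, k))   # f(2) = 10 but k(2) = 2
```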
Theorem 6.4
Let \(I\) be an ideal in a ring \(R\). Then the relation of congruence modulo \(I\) is:
- Reflexive: \(a \equiv a \ (\text{mod}\ I)\) for every \(a \in R\);
- Symmetric: if \(a \equiv b \ (\text{mod}\ I)\), then \(b \equiv a \ (\text{mod}\ I)\);
- Transitive: if \(a \equiv b \ (\text{mod}\ I)\) and \(b \equiv c \ (\text{mod}\ I)\), then \(a \equiv c \ (\text{mod}\ I)\).
This theorem generalizes Theorems 2.1 and 5.1. Observe that the proof is virtually identical to that of Theorem 2.1—just replace statements like "k is divisible by \(n\)" or "\(n \mid k\)" or "\(k = nt\)" with the statement "\(k \in I\)".
Proof of Theorem 6.4
- \(a - a = 0_R \in I\); hence, \(a \equiv a \ (\text{mod}\ I)\).
- If \(a \equiv b \ (\text{mod}\ I)\), then \(a - b = i\) for some \(i \in I\). Therefore, \(b - a = -(a - b) = -i\). Since \(I\) is an ideal, the negative of an element of \(I\) is also in \(I\), and so \(b - a = -i \in I\). Hence, \(b \equiv a \ (\text{mod}\ I)\).
- If \(a \equiv b \ (\text{mod}\ I)\) and \(b \equiv c \ (\text{mod}\ I)\), then by the definition of congruence, there are elements \(i\) and \(j\) in \(I\) such that \(a - b = i\) and \(b - c = j\). Therefore, \(a - c = (a - b) + (b - c) = i + j\). Since the ideal \(I\) is closed under addition, \(i + j \in I\) and, hence, \(a \equiv c \ (\text{mod}\ I)\).
Theorem 6.5
Let \(I\) be an ideal in a ring \(R\). If \(a \equiv b \ (\text{mod}\ I)\) and \(c \equiv d \ (\text{mod}\ I)\), then:
- \(a + c \equiv b + d \ (\text{mod}\ I)\);
- \(ac \equiv bd \ (\text{mod}\ I)\).
This theorem generalizes Theorems 2.2 and 5.2. Its proof is quite similar to theirs once you make the change to the language of ideals.
Proof of Theorem 6.5
- By the definition of congruence, there are \(i, j \in I\) such that \(a - b = i\) and \(c - d = j\). Therefore: \[ (a + c) - (b + d) = (a - b) + (c - d) = i + j \in I. \] Hence, \(a + c \equiv b + d \ (\text{mod}\ I)\).
- \(ac - bd = ac - bc + bc - bd = (a - b)c + b(c - d) = ic + bj\). Since the ideal \(I\) absorbs products on both left and right, \(ic \in I\) and \(bj \in I\). Hence: \[ ac - bd \in I. \] Therefore, \(ac \equiv bd \ (\text{mod}\ I)\).
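Theorem 6.5 can be spot-checked in the familiar case \(R = \mathbb{Z}\), \(I = (3)\). The sketch below tests both conclusions on a finite sample of integers; the helper name `congruent`, the modulus, and the sample range are illustrative assumptions, and the check is an illustration rather than a proof.

```python
def congruent(a, b, n):
    """a is congruent to b modulo the ideal (n) in Z when a - b is a multiple of n."""
    return (a - b) % n == 0

n = 3
sample = range(-6, 7)   # arbitrary finite sample of Z

# For every sampled a = b and c = d (mod I), test both parts of the theorem.
ok = all(
    congruent(a + c, b + d, n) and congruent(a * c, b * d, n)
    for a in sample for b in sample if congruent(a, b, n)
    for c in sample for d in sample if congruent(c, d, n)
)
print(ok)
```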
If \(I\) is an ideal in a ring \(R\) and \(a \in R\), then the congruence class of \(a\) modulo \(I\) is the set of all elements of \(R\) that are congruent to \(a\) modulo \(I\), that is, the set:
\[ \{b \in R \mid b \equiv a \ (\text{mod}\ I)\} = \{b \in R \mid b - a \in I\} = \{b \in R \mid a \equiv b \ (\text{mod}\ I)\} = \{a + i \mid i \in I\}. \]Consequently, we shall denote the congruence class of \(a\) modulo \(I\) by the symbol \(a + I\) rather than the symbol \([a]\) that was used in \(\mathbb{Z}\) and \(F[x]\). The plus sign in \(a + I\) is just a formal symbol; we have not defined the sum of an element and an ideal. In this context, the congruence class \(a + I\) is usually called a (left) coset of \(I\) in \(R\).
Theorem 6.6
Let \(I\) be an ideal in a ring \(R\) and let \(a, c \in R\). Then \(a \equiv c \ (\text{mod}\ I)\) if and only if \(a + I = c + I\).
Proof: With only minor notational changes, the proof of Theorem 2.3 carries over almost verbatim to the present case. Simply replace "mod \(n\)" by "mod \(I\)" and "\([a]\)" by "\(a + I\)"; use Theorem 6.4 in place of Theorem 2.1.
Corollary 6.7
Let \(I\) be an ideal in a ring \(R\). Then two cosets of \(I\) are either disjoint or identical.
Proof: Copy the proof of Corollary 2.4 with the obvious notational changes.
If \(I\) is an ideal in a ring \(R\), then the set of all cosets of \(I\) (congruence classes modulo \(I\)) is denoted \(R/I\).
Example 11
Let \(I\) be the principal ideal \((3)\) in the ring \(\mathbb{Z}\). Then the cosets of \(I\) are just the congruence classes modulo 3, and so there are three distinct cosets: \(0 + I = [0]\), \(1 + I = [1]\), and \(2 + I = [2]\). The set \(\mathbb{Z}/I\) of all cosets is precisely the set \(\mathbb{Z}_3\) in our previous notation.
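The same picture can be computed in a finite ring, where every coset can be listed in full. The sketch below uses \(\mathbb{Z}_{12}\) with the ideal \((3) = \{0, 3, 6, 9\}\) in place of \(\mathbb{Z}\); this substitution and the function name `cosets` are assumptions made so that the sets are finite.

```python
def cosets(ideal, n):
    """Distinct cosets a + I of an ideal I in Z_n, each as a frozenset."""
    return {frozenset((a + i) % n for i in ideal) for a in range(n)}

I = {0, 3, 6, 9}            # the principal ideal (3) in Z_12
classes = cosets(I, 12)

for c in sorted(classes, key=min):
    print(sorted(c))

# Corollary 6.7: two different cosets never share an element.
disjoint = all(c1 == c2 or not (c1 & c2) for c1 in classes for c2 in classes)
print(len(classes), disjoint)
```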
Example 12
Let \(I\) be the ideal in \(\mathbb{Z}[x]\) consisting of all polynomials with even constant terms. We claim that \(\mathbb{Z}[x]/I\) consists of exactly two distinct cosets, namely, \(0 + I\) and \(1 + I\). To see this, consider any coset \(f(x) + I\). The constant term of \(f(x)\) is either even or odd. If it is even, then \(f(x) \in I\), so that \(f(x) \equiv 0 \ (\text{mod}\ I)\) and \(f(x) + I = 0 + I\) by Theorem 6.6. If \(f(x)\) has an odd constant term, then \(f(x) - 1\) has an even constant term, so that \(f(x) \equiv 1 \ (\text{mod}\ I)\). Thus \(f(x) + I = 1 + I\) by Theorem 6.6. Finally, the two cosets are distinct because \(1 - 0 = 1 \notin I\).
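In computational terms, the coset of a polynomial modulo this ideal is determined by the parity of its constant term. A minimal Python sketch, assuming polynomials are stored as coefficient lists with the constant term first (the function name `coset_of` is illustrative):

```python
def coset_of(p):
    """Return 0 or 1, naming the coset 0 + I or 1 + I that contains p."""
    return p[0] % 2

print(coset_of([6, 1, 0, 1]))   # x^3 + x + 6 has even constant term
print(coset_of([3, 0, 4]))      # 4x^2 + 3 has odd constant term
print(coset_of([-3, 5]))        # 5x - 3 also has odd constant term
```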
Example 13
Let \(T\) be the ring of functions from \(\mathbb{R}\) to \(\mathbb{R}\) and let \(I\) be the ideal of all functions \(g\) such that \(g(2) = 0\). Note that for each real number \(r\), the constant function \(f_r\), whose rule is \(f_r(x) = r\), is an element of \(T\). Let \(h(x)\) be any element of \(T\). Then \(h(2)\) is some real number, say \(h(2) = c\), and:
\[ (h - f_c)(2) = h(2) - f_c(2) = c - c = 0. \]Thus \(h - f_c \in I\), so that \(h \equiv f_c \ (\text{mod}\ I)\) and, hence, \(h + I = f_c + I\). Consequently, every coset of \(I\) can be written in the form \(f_r + I\) for some real number \(r\). Furthermore, if \(c \neq d\), then \(f_c(2) \neq f_d(2)\), so that \((f_c - f_d)(2) \neq 0\) and \(f_c - f_d \notin I\). Hence, \(f_c \not\equiv f_d \ (\text{mod}\ I)\) and \(f_c + I \neq f_d + I\). Therefore, there are infinitely many distinct cosets of \(I\), one for each real number \(r\).