
Practical problems in many fields of study—such as biology, business, chemistry, computer science, economics, electronics, engineering, physics and the social sciences—can often be reduced to solving a system of linear equations. Linear algebra arose from attempts to find systematic methods for solving these systems, so it is natural to begin this book by studying linear equations.

If \(a\), \(b\), and \(c\) are real numbers, the graph of an equation of the form

\[ax + by = c \nonumber \]

is a straight line (if \(a\) and \(b\) are not both zero), so such an equation is called a *linear* equation in the variables \(x\) and \(y\). However, it is often convenient to write the variables as \(x_1, x_2, \dots, x_n\), particularly when more than two variables are involved. An equation of the form

\[a_1x_1 + a_2x_2 + \dots + a_nx_n = b \nonumber \]

is called a **linear equation** in the \(n\) variables \(x_1, x_2, \dots, x_n\). Here \(a_1, a_2, \dots, a_n\) denote real numbers (called the **coefficients** of \(x_1, x_2, \dots, x_n\), respectively) and \(b\) is also a number (called the **constant term** of the equation). A finite collection of linear equations in the variables \(x_1, x_2, \dots, x_n\) is called a **system of linear equations** in these variables. Hence,

\[2x_1 - 3x_2 + 5x_3 = 7 \nonumber \]

is a linear equation; the coefficients of \(x_1\), \(x_2\), and \(x_3\) are \(2\), \(-3\), and \(5\), and the constant term is \(7\). Note that each variable in a linear equation occurs to the first power only.

Given a linear equation \(a_1x_1 + a_2x_2 + \dots + a_nx_n = b\), a sequence \(s_1, s_2, \dots, s_n\) of \(n\) numbers is called a **solution** to the equation if

\[a_1s_1 + a_2s_2 + \dots + a_ns_n = b \nonumber \]

that is, if the equation is satisfied when the substitutions \(x_1 = s_1, x_2 = s_2, \dots, x_n = s_n\) are made. A sequence of numbers is called **a solution to a system** of equations if it is a solution to every equation in the system.

For example, \(x = -2\), \(y = 5\), \(z = 0\) and \(x = 0\), \(y = 4\), \(z = -1\) are both solutions to the system

\[ \begin{array}{rrrrrrr} x & + & y & + & z & = & 3\\ 2x & + & y & + & 3z & = & 1 \end{array} \nonumber \]

A system may have no solution at all, or it may have a unique solution, or it may have an infinite family of solutions. For instance, the system \(x + y = 2\), \(x + y = 3\) has no solution because the sum of two numbers cannot be 2 and 3 simultaneously. A system that has no solution is called **inconsistent**; a system with at least one solution is called **consistent**. The system in the following example has infinitely many solutions.
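The two solutions listed above can be checked mechanically. A minimal Python sketch (the function name `satisfies` is illustrative, not part of the text):

```python
# Check that two proposed solutions satisfy the system
#   x + y + z  = 3
#   2x + y + 3z = 1
def satisfies(x, y, z):
    return x + y + z == 3 and 2*x + y + 3*z == 1

print(satisfies(-2, 5, 0))   # True
print(satisfies(0, 4, -1))   # True
```

Both triples make every equation in the system true, so both are solutions to the system.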

##### Example \(\PageIndex{1}\)

Show that, for arbitrary values of \(s\) and \(t\),

\[\begin{aligned} x_1 &= t - s + 1 \\ x_2 &= t + s + 2 \\ x_3 &= s\\ x_4 &= t\end{aligned} \nonumber \]

is a solution to the system

\[ \begin{array}{rrrrrrr} x_1 & - & 2x_2 & + 3x_3 & + x_4 & = & -3\\ 2x_1 & -& x_2 & + 3x_3 & - x_4 & = & 0 \end{array} \nonumber \]

###### Solution

Simply substitute these values of \(x_1\), \(x_2\), \(x_3\), and \(x_4\) in each equation.

\[\begin{aligned} x_1 - 2x_2 + 3x_3 + x_4 &= (t - s + 1) - 2(t + s + 2) + 3s + t = -3\\ 2x_1 - x_2 + 3x_3 - x_4 &= 2(t - s + 1) - (t + s + 2) + 3s - t = 0\end{aligned} \nonumber \]

Because both equations are satisfied, it is a solution for all choices of \(s\) and \(t\).
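The substitution in Example \(\PageIndex{1}\) can also be tested numerically for a range of parameter values; a short sketch (assuming integer test values of \(s\) and \(t\) for convenience):

```python
# Verify the parametric solution of Example 1 for several values of s and t
def is_solution(x1, x2, x3, x4):
    return (x1 - 2*x2 + 3*x3 + x4 == -3) and (2*x1 - x2 + 3*x3 - x4 == 0)

for s in range(-2, 3):
    for t in range(-2, 3):
        x1, x2, x3, x4 = t - s + 1, t + s + 2, s, t
        assert is_solution(x1, x2, x3, x4)
print("all parameter choices check out")
```

Of course, a finite check is no substitute for the algebraic verification above, which covers *all* values of \(s\) and \(t\) at once.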

The quantities \(s\) and \(t\) in Example\(\PageIndex{1}\) are called **parameters**, and the set of solutions, described in this way, is said to be given in **parametric form** and is called the **general solution** to the system. It turns out that the solutions to *every* system of equations (if there *are* solutions) can be given in parametric form (that is, the variables \(x_1\), \(x_2\), \(\dots\) are given in terms of new independent variables \(s\), \(t\), etc.). The following example shows how this happens in the simplest systems where only one equation is present.

##### Example \(\PageIndex{2}\)

Describe all solutions to \(3x - y + 2z = 6\) in parametric form.

###### Solution

Solving the equation for \(y\) in terms of \(x\) and \(z\), we get \(y = 3x + 2z -6\). If \(s\) and \(t\) are arbitrary then, setting \(x = s\), \(z = t\), we get solutions

\[\begin{aligned} x &= s\\ y &= 3s + 2t -6 \quad s \mbox{ and } t \mbox{ arbitrary} \\ z &= t\end{aligned} \nonumber \]

Of course we could have solved for \(x\): \(x = \frac{1}{3}(y -2z + 6)\). Then, if we take \(y = p\), \(z = q\), the solutions are represented as follows:

\[\begin{array}{rlll} x & = & \frac{1}{3} (p - 2q + 6) & \\ y & = & p & p \mbox{ and } q \mbox{ arbitrary}\\ z & = & q & \end{array} \nonumber \]

The same family of solutions can “look” quite different!
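Both parametric forms from Example \(\PageIndex{2}\) can be spot-checked against the original equation; a quick sketch (the tolerance and sample values are arbitrary choices):

```python
# Both parametric forms describe solutions of 3x - y + 2z = 6
def on_plane(x, y, z):
    return abs(3*x - y + 2*z - 6) < 1e-12

# First form: x = s, y = 3s + 2t - 6, z = t
for s in (-1.0, 0.0, 2.5):
    for t in (-3.0, 0.5):
        assert on_plane(s, 3*s + 2*t - 6, t)

# Second form: x = (p - 2q + 6)/3, y = p, z = q
for p in (-1.0, 4.0):
    for q in (0.0, 1.5):
        assert on_plane((p - 2*q + 6)/3, p, q)
print("both parametrizations satisfy the equation")
```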

When only two variables are involved, the solutions to systems of linear equations can be described geometrically because the graph of a linear equation \(ax + by = c\) is a straight line if \(a\) and \(b\) are not both zero. Moreover, a point \(P(s, t)\) with coordinates \(s\) and \(t\) lies on the line if and only if \(as + bt = c\)—that is, when \(x = s\), \(y = t\) is a solution to the equation. Hence the solutions to a *system* of linear equations correspond to the points \(P(s, t)\) that lie on *all* the lines in question.

In particular, if the system consists of just one equation, there must be infinitely many solutions because there are infinitely many points on a line. If the system has two equations, there are three possibilities for the corresponding straight lines:

- The lines intersect at a single point. Then the system has a *unique solution* corresponding to that point.
- The lines are parallel (and distinct) and so do not intersect. Then the system has *no solution*.
- The lines are identical. Then the system has *infinitely many solutions*, one for each point on the (common) line.

These three situations are illustrated in Figure \(\PageIndex{1}\). In each case the graphs of two specific lines are plotted and the corresponding equations are indicated. In the last case, the equations are \(3x - y = 4\) and \(-6x + 2y = -8\), which have identical graphs.

With three variables, the graph of an equation \(ax + by + cz = d\) can be shown to be a plane (see Section 4.2) and so again provides a “picture” of the set of solutions. However, this graphical method has its limitations: When more than three variables are involved, no physical image of the graphs (called hyperplanes) is possible. It is necessary to turn to a more “algebraic” method of solution.

Before describing the method, we introduce a concept that simplifies the computations involved. Consider the following system

\[ \begin{array}{rlrlrlrcr} 3x_1 & + & 2x_2 & - & x_3 & + & x_4 & = & -1\\ 2x_1 & & & - & x_3 & + & 2x_4 & = & 0\\ 3x_1 & + & x_2 & + & 2x_3 & + & 5x_4 & = & 2 \end{array} \nonumber \]

of three equations in four variables. The array of numbers^{1}

\[\left[ \begin{array}{rrrr|r} 3 & 2 & -1 & 1 & -1 \\ 2 & 0 & -1 & 2 & 0 \\ 3 & 1 & 2 & 5 & 2 \end{array} \right] \nonumber \]

occurring in the system is called the **augmented matrix** of the system. Each row of the matrix consists of the coefficients of the variables (in order) from the corresponding equation, together with the constant term. For clarity, the constants are separated by a vertical line. The augmented matrix is just a different way of describing the system of equations. The array of coefficients of the variables

\[\left[ \begin{array}{rrrr} 3 & 2 & -1 & 1 \\ 2 & 0 & -1 & 2 \\ 3 & 1 & 2 & 5 \end{array} \right] \nonumber \]

is called the **coefficient matrix** of the system and \(\left[ \begin{array}{r} -1 \\ 0 \\ 2 \end{array} \right]\) is called the **constant matrix** of the system.
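These three matrices are straightforward to build with NumPy (NumPy is not part of the text; the variable names here are illustrative):

```python
import numpy as np

# Coefficient matrix and constant matrix of the 3-equation, 4-variable system
A = np.array([[3, 2, -1, 1],
              [2, 0, -1, 2],
              [3, 1,  2, 5]])
b = np.array([[-1], [0], [2]])

# The augmented matrix appends the constant column to the coefficients
augmented = np.hstack([A, b])
print(augmented)
```

Each row of `augmented` records one equation: the coefficients in order, followed by the constant term.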

### Elementary Operations

The algebraic method for solving systems of linear equations is described as follows. Two such systems are said to be **equivalent** if they have the same set of solutions. A system is solved by writing a series of systems, one after the other, each equivalent to the previous system. Each of these systems has the same set of solutions as the original one; the aim is to end up with a system that is easy to solve. Each system in the series is obtained from the preceding system by a simple manipulation chosen so that it does not change the set of solutions.

As an illustration, we solve the system \(x + 2y = -2\), \(2x + y = 7\) in this manner. At each stage, the corresponding augmented matrix is displayed. The original system is

\[\begin{array}{lcl} \begin{array}{rlrcr} x & + & 2y & = & -2 \\ 2x & + & y & = & 7 \end{array} & \quad & \left[ \begin{array}{rr|r} 1 & 2 & -2 \\ 2 & 1 & 7 \end{array} \right] \end{array} \nonumber \]

First, subtract twice the first equation from the second. The resulting system is

\[\begin{array}{lcl} \begin{array}{rlrcr} x & + & 2y & = & -2 \\ & - & 3y & = & 11 \end{array} & \quad & \left[ \begin{array}{rr|r} 1 & 2 & -2 \\ 0 & -3 & 11 \end{array} \right] \end{array} \nonumber \]

which is equivalent to the original (see Theorem \(\PageIndex{1}\)). At this stage we obtain \(y = -\frac{11}{3}\) by multiplying the second equation by \(-\frac{1}{3}\). The result is the equivalent system

\[\begin{array}{lcl} \begin{array}{rcr} x + 2y & = & -2 \\ y & = & -\frac{11}{3} \end{array} & \quad & \left[ \begin{array}{rr|r} 1 & 2 & -2 \\ 0 & 1 & -\frac{11}{3} \end{array} \right] \end{array} \nonumber \]

Finally, we subtract twice the second equation from the first to get another equivalent system.

\[\begin{array}{lcl} \def\arraystretch{1.5} \begin{array}{rcr} x & = & \frac{16}{3} \\ y & = & -\frac{11}{3} \end{array} & \quad \quad & \def\arraystretch{1.5} \left[ \begin{array}{rr|r} 1 & 0 & \frac{16}{3} \\ 0 & 1 & -\frac{11}{3} \end{array} \right] \end{array} \nonumber \]

Now *this* system is easy to solve! And because it is equivalent to the original system, it provides the solution to that system.
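The same three manipulations can be carried out on the augmented matrix numerically; a sketch in Python using NumPy (an illustration of this particular elimination, not a general solver):

```python
import numpy as np

# Augmented matrix of the system  x + 2y = -2,  2x + y = 7
M = np.array([[1.0, 2.0, -2.0],
              [2.0, 1.0,  7.0]])

M[1] = M[1] - 2 * M[0]   # subtract twice row 1 from row 2
M[1] = M[1] / -3         # multiply row 2 by -1/3
M[0] = M[0] - 2 * M[1]   # subtract twice row 2 from row 1

print(M)   # left block is the identity; the last column holds x and y
```

The final matrix reads off \(x = \frac{16}{3}\) and \(y = -\frac{11}{3}\) directly, matching the hand calculation.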

Observe that, at each stage, a certain operation is performed on the system (and thus on the augmented matrix) to produce an equivalent system.

##### Definition: Elementary Operations

*The following operations, called elementary operations, can routinely be performed on systems of linear equations to produce equivalent systems.*

1. Interchange two equations.
2. Multiply one equation by a nonzero number.
3. Add a multiple of one equation to a different equation.

##### Theorem \(\PageIndex{1}\)

*Suppose that a sequence of elementary operations is performed on a system of linear equations. Then the resulting system has the same set of solutions as the original, so the two systems are equivalent.*

The proof is given at the end of this section.

Elementary operations performed on a system of equations produce corresponding manipulations of the *rows* of the augmented matrix. Thus, multiplying a row of a matrix by a number \(k\) means multiplying *every entry* of the row by \(k\). Adding one row to another row means adding *each entry* of that row to the corresponding entry of the other row. Subtracting two rows is done similarly. Note that we regard two rows as equal when corresponding entries are the same.

In hand calculations (and in computer programs) we manipulate the rows of the augmented matrix rather than the equations. For this reason we restate these elementary operations for matrices.

##### Definition: Elementary Row Operations

*The following are called elementary row operations on a matrix.*

1. Interchange two rows.
2. Multiply one row by a nonzero number.
3. Add a multiple of one row to a different row.

In the illustration above, a series of such operations led to a matrix of the form

\[\left[ \begin{array}{rr|r} 1 & 0 & * \\ 0 & 1 & * \end{array} \right] \nonumber \]

where the asterisks represent arbitrary numbers. In the case of three equations in three variables, the goal is to produce a matrix of the form

\[\left[ \begin{array}{rrr|r} 1 & 0 & 0 & * \\ 0 & 1 & 0 & * \\ 0 & 0 & 1 & * \end{array} \right] \nonumber \]

This does not always happen, as we will see in the next section. Here is an example in which it does happen.

##### Example \(\PageIndex{3}\)

Find all solutions to the following system of equations.

\[ \begin{array}{rlrlrcr} 3x & + & 4y & + & z & = & 1 \\ 2x & + & 3y & & & = & 0 \\ 4x & + & 3y & - & z & = & -2 \end{array} \nonumber \]

###### Solution

The augmented matrix of the original system is

\[\left[ \begin{array}{rrr|r} 3 & 4 & 1 & 1 \\ 2 & 3 & 0 & 0 \\ 4 & 3 & -1 & -2 \end{array} \right] \nonumber \]

To create a \(1\) in the upper left corner we could multiply row 1 through by \(\frac{1}{3}\). However, the \(1\) can be obtained without introducing fractions by subtracting row 2 from row 1. The result is

\[\left[ \begin{array}{rrr|r} 1 & 1 & 1 & 1 \\ 2 & 3 & 0 & 0 \\ 4 & 3 & -1 & -2 \end{array} \right] \nonumber \]

The upper left \(1\) is now used to “clean up” the first column, that is, to create zeros in the other positions in that column. First subtract \(2\) times row 1 from row 2 to obtain

\[\left[ \begin{array}{rrr|r} 1 & 1 & 1 & 1 \\ 0 & 1 & -2 & -2 \\ 4 & 3 & -1 & -2 \end{array} \right] \nonumber \]

Next subtract \(4\) times row 1 from row 3. The result is

\[\left[ \begin{array}{rrr|r} 1 & 1 & 1 & 1 \\ 0 & 1 & -2 & -2 \\ 0 & -1 & -5 & -6 \end{array} \right] \nonumber \]

This completes the work on column 1. We now use the \(1\) in the second position of the second row to clean up the second column by subtracting row 2 from row 1 and then adding row 2 to row 3. For convenience, both row operations are done in one step. The result is

\[\left[ \begin{array}{rrr|r} 1 & 0 & 3 & 3 \\ 0 & 1 & -2 & -2 \\ 0 & 0 & -7 & -8 \end{array} \right] \nonumber \]

Note that the last two manipulations *did not affect* the first column (the second row has a zero there), so our previous effort there has not been undermined. Finally we clean up the third column. Begin by multiplying row 3 by \(-\frac{1}{7}\) to obtain

\[\left[ \begin{array}{rrr|r} 1 & 0 & 3 & 3 \\ 0 & 1 & -2 & -2 \\ 0 & 0 & 1 & \frac{8}{7} \end{array} \right] \nonumber \]

Now subtract \(3\) times row 3 from row 1, and then add \(2\) times row 3 to row 2 to get

\[\def\arraystretch{1.5} \left[ \begin{array}{rrr|r} 1 & 0 & 0 & - \frac{3}{7} \\ 0 & 1 & 0 & \frac{2}{7} \\ 0 & 0 & 1 & \frac{8}{7} \end{array} \right] \nonumber \]

The corresponding equations are \(x = -\frac{3}{7}\), \(y = \frac{2}{7}\), and \(z = \frac{8}{7}\), which give the (unique) solution.
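The result of Example \(\PageIndex{3}\) can be cross-checked with a library solver; a quick sketch using NumPy (a numerical check, not the elimination method of the text):

```python
import numpy as np

# Cross-check the unique solution of Example 3
A = np.array([[3.0, 4.0,  1.0],
              [2.0, 3.0,  0.0],
              [4.0, 3.0, -1.0]])
b = np.array([1.0, 0.0, -2.0])

x = np.linalg.solve(A, b)
print(x)   # approximately [-3/7, 2/7, 8/7]
```

The solver returns the same values \(x = -\frac{3}{7}\), \(y = \frac{2}{7}\), \(z = \frac{8}{7}\) found by row reduction.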

Every elementary row operation can be **reversed** by another elementary row operation of the same type (called its **inverse**). To see how, we look at types I, II, and III separately:

- *Type I:* Interchanging two rows is reversed by interchanging them again.
- *Type II:* Multiplying a row by a nonzero number \(k\) is reversed by multiplying by \(1/k\).
- *Type III:* Adding \(k\) times row \(p\) to a different row \(q\) is reversed by adding \(-k\) times row \(p\) to row \(q\) (in the new matrix). Note that \(p \neq q\) is essential here.

To illustrate the Type III situation, suppose there are four rows in the original matrix, denoted \(R_1\), \(R_2\), \(R_3\), and \(R_4\), and that \(k\) times \(R_2\) is added to \(R_3\). Then the reverse operation adds \(-k\) times \(R_2\) to \(R_3\). The following diagram illustrates the effect of doing the operation first and then the reverse:

\[\left[ \begin{array}{c} R_1 \\ R_2 \\ R_3 \\ R_4 \end{array} \right] \rightarrow \left[ \begin{array}{c} R_1 \\ R_2 \\ R_3 + kR_2\\ R_4 \end{array} \right] \rightarrow \left[ \begin{array}{c} R_1 \\ R_2 \\ (R_3 + kR_2) - kR_2\\ R_4 \end{array} \right] = \left[ \begin{array}{c} R_1 \\ R_2 \\ R_3 \\ R_4 \end{array} \right] \nonumber \]
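This reversal is easy to demonstrate numerically; a small sketch (the matrix entries and the value of \(k\) are arbitrary choices):

```python
import numpy as np

# A Type III operation followed by its inverse restores the original matrix
M = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])
original = M.copy()
k = 5.0

M[2] = M[2] + k * M[1]   # add k times row 2 to row 3
M[2] = M[2] - k * M[1]   # the inverse operation undoes it

print(np.array_equal(M, original))   # True
```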

The existence of inverses for elementary row operations, and hence for elementary operations on a system of equations, gives:

**Proof of Theorem \(\PageIndex{1}\).** Suppose that a system of linear equations is transformed into a new system by a sequence of elementary operations. Then every solution of the original system is automatically a solution of the new system because adding equations, or multiplying an equation by a nonzero number, always results in a valid equation. In the same way, each solution of the new system must be a solution to the original system because the original system can be obtained from the new one by another series of elementary operations (the inverses of the originals). It follows that the original and new systems have the same solutions. This proves Theorem \(\PageIndex{1}\).

1. A rectangular array of numbers is called a matrix. Matrices will be discussed in more detail in Chapter 2.