
Why Gauss? (100 ways to solve a system of equations)

What would you do if you were asked to solve a simple system with three unknowns? Everyone has formed their own, personally most convenient approach. There are many ways to solve a system of linear algebraic equations. So why is preference given to the Gauss method?

First things first


Let's start simple. Let a third-order system of linear equations be given:

$$\left\{\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2,\\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3. \end{aligned}\right.$$


The Gauss method consists in successively "annihilating" the terms below the main diagonal. In the first stage, the first equation is multiplied by $\dfrac{a_{21}}{a_{11}}$ and subtracted from the second (and, similarly, multiplied by $\dfrac{a_{31}}{a_{11}}$ and subtracted from the third). After this transformation we get the following:

$$\left\{\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\ a_{22}'x_2 + a_{23}'x_3 &= b_2',\\ a_{32}'x_2 + a_{33}'x_3 &= b_3'. \end{aligned}\right.$$


Now the second equation is multiplied by $\dfrac{a_{32}'}{a_{22}'}$ and subtracted from the third:

$$\left\{\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1,\\ a_{22}'x_2 + a_{23}'x_3 &= b_2',\\ a_{33}''x_3 &= b_3''. \end{aligned}\right.$$


We have obtained a fairly simple triangular system, from which it is easy to find $x_3$, then $x_2$, and finally $x_1$.
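Written out explicitly, this back substitution reads:

$$x_3 = \dfrac{b_3''}{a_{33}''}, \qquad x_2 = \dfrac{b_2' - a_{23}'x_3}{a_{22}'}, \qquad x_1 = \dfrac{b_1 - a_{12}x_2 - a_{13}x_3}{a_{11}}.$$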

An attentive reader will surely ask: what if a diagonal element is zero? What do we do if, for example, $a_{11} = 0$? Does the famous Gauss method end there?
Nothing of the sort! Let us find $\max\limits_j |a_{j1}|$ and swap the $j$-th and the first rows (without loss of generality, suppose $\max\limits_j |a_{j1}| = |a_{31}|$). Note that the case when all $a_{j1} = 0$ is impossible: then the determinant of the coefficient matrix would vanish and the system would have no unique solution. So, after moving the third equation to the first line, we perform the steps described earlier.

One can search for the maximum-modulus element at every iteration: at the $k$-th iteration, find $\max\limits_{i \geq k} |a_{ik}|$, then swap the $i$-th and $k$-th rows. The algorithm that searches for the maximum element in a column is called the Gauss method with selection of the pivot element in the column.
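A minimal sketch of this pivoting step (using 0-based arrays, unlike the 1-based indexing in the full listing below; the helper name is illustrative):

// One pivoting step of the column-pivot variant: at step k, find the row
// with the largest |a[i][k]| on or below the diagonal and swap it with row k.
static void pivotRows(double[][] a, double[] b, int k) {
    int n = a.length;
    int best = k;
    for (int i = k + 1; i < n; i++) {
        if (Math.abs(a[i][k]) > Math.abs(a[best][k])) {
            best = i;
        }
    }
    double[] tmpRow = a[k]; a[k] = a[best]; a[best] = tmpRow; // swap matrix rows
    double tmpB = b[k];     b[k] = b[best]; b[best] = tmpB;   // and right-hand sides
}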

There is another way: at the $k$-th iteration one can search for $\max\limits_{j \geq k} |a_{kj}|$ and then swap not rows but columns. In doing so it is important to remember the indices of the swapped columns in an auxiliary array $\alpha$ (in order to restore the correct order of the variables afterwards). The code below implements exactly this variant.

An example of simple code that implements this algorithm:

import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.Locale;
import java.util.Scanner;

public class GuassianEliminationSearchMainElementsAtString {
    public static void main(String[] args) throws IOException {
        Scanner sc = new Scanner(new FileReader("input.txt"));
        sc.useLocale(Locale.US);
        int n = sc.nextInt();
        double[][] a = new double[n + 1][n + 1];
        double[] b = new double[n + 1];
        // input: n rows of n coefficients, each followed by the right-hand side b[i]
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                a[i][j] = sc.nextDouble();
            }
            b[i] = sc.nextDouble();
        }
        int[] alpha = new int[n + 1]; // permutation of the variables (records column swaps)
        for (int i = 1; i <= n; i++) {
            alpha[i] = i;
        }
        for (int m = 1; m <= n; m++) {
            // search for the maximum-modulus element in row m
            double max = Math.abs(a[m][m]);
            int count = m;
            for (int i = m + 1; i <= n; i++) {
                if (Math.abs(a[m][i]) > max) {
                    max = Math.abs(a[m][i]);
                    count = i;
                }
            }
            // swap columns m and count in all rows, remembering the permutation
            int tmp = alpha[m];
            alpha[m] = alpha[count];
            alpha[count] = tmp;
            for (int i = 1; i <= n; i++) {
                double tmp2 = a[i][m];
                a[i][m] = a[i][count];
                a[i][count] = tmp2;
            }
            // forward elimination
            for (int i = m + 1; i <= n; i++) {
                b[i] = b[i] - a[i][m] * b[m] / a[m][m];
                for (int j = m + 1; j <= n; j++) {
                    a[i][j] = a[i][j] - a[i][m] * a[m][j] / a[m][m];
                }
            }
        }
        // back substitution; x is indexed 0..n-1 by the original variable number
        double[] x = new double[n + 1];
        for (int i = n; i >= 1; i--) {
            double sum = 0;
            for (int j = i + 1; j <= n; j++) {
                sum += a[i][j] * x[alpha[j] - 1];
            }
            x[alpha[i] - 1] = (b[i] - sum) / a[i][i];
        }
        // output
        PrintWriter pw = new PrintWriter("output.txt");
        for (int i = 0; i < n; i++) {
            pw.printf(Locale.US, "x%d = %.5f\n", i + 1, x[i]);
        }
        pw.flush();
        pw.close();
    }
}
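For a quick sanity check (an illustrative run, not part of the original article): for the system $x_1 + x_2 = 3$, $x_1 - x_2 = 1$, input.txt would contain the lines "2", "1 1 3", "1 -1 1", and output.txt should receive x1 = 2.00000 and x2 = 1.00000.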

Why Gauss?


There are other ways to solve SLAEs as well. One of them is Cramer's method. It consists in first computing a certain number of determinants, from which the values of the variables are then obtained instantly. For the original system this method looks as follows:

$$\Delta = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}, \qquad \Delta_1 = \begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix},$$


$$\Delta_2 = \begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}, \qquad \Delta_3 = \begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix},$$


$$x_i = \dfrac{\Delta_i}{\Delta}.$$
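A minimal sketch of Cramer's method for the 3×3 case (the class and method names are illustrative; the determinant is expanded along the first row):

public class Cramer3x3 {
    // determinant of a 3x3 matrix, expanded along the first row
    static double det3(double[][] m) {
        return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
             - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
             + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    }

    // solves a*x = b by Cramer's rule; assumes det(a) != 0
    static double[] solve(double[][] a, double[] b) {
        double delta = det3(a);
        double[] x = new double[3];
        for (int col = 0; col < 3; col++) {
            double[][] ai = new double[3][3];
            for (int i = 0; i < 3; i++) {
                for (int j = 0; j < 3; j++) {
                    ai[i][j] = (j == col) ? b[i] : a[i][j]; // replace column col with b
                }
            }
            x[col] = det3(ai) / delta; // x_i = Delta_i / Delta
        }
        return x;
    }
}

Each $\Delta_i$ here is obtained by replacing the $i$-th column of the coefficient matrix with the right-hand side, exactly as in the formulas above.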



But let us recall: what is a determinant?

By definition, the determinant of a matrix $A = (a_{ij})$ is the sum

$$\det A = \sum\limits_{(i_1, \dots, i_n)} (-1)^{N(i_1, \dots, i_n)} a_{1i_1} \cdots a_{ni_n},$$

where the sum is taken over all permutations $(i_1, \dots, i_n)$ of the numbers $1, \dots, n$, and $N(i_1, \dots, i_n)$ is the number of inversions in the permutation $i_1, \dots, i_n$.

The determinant contains $n!$ terms. To solve the system by Cramer's method one needs to compute $n + 1$ determinants. For sufficiently large $n$ this is very expensive: for example, at $n = 12$ the number of operations is of the order of $12! \cdot (12 + 1) = 6227020800$, while the Gauss method, with its $n^3$ asymptotics, requires only on the order of $12^3 = 1728$ operations.
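To get a feel for how fast this gap grows, here is a throwaway illustration (not from the original article) that tabulates both operation counts:

public class CostComparison {
    public static void main(String[] args) {
        long factorial = 1;
        System.out.println("  n     n!*(n+1)         n^3");
        for (int n = 2; n <= 12; n++) {
            factorial *= n; // n! accumulated incrementally
            System.out.printf("%3d %12d %11d%n", n, factorial * (n + 1), (long) n * n * n);
        }
    }
}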

Iterative methods


So-called iterative methods are also suitable for solving SLAEs. They consist in sequentially computing approximations until one of them is close enough to the exact answer.

First, an arbitrary vector $x^0$ is chosen: this is the zero-th approximation. From it a vector $x^1$ is built: the first approximation. And so on. The computation ends when $\|x^k - x^{k+1}\| < \varepsilon$, where $\varepsilon$ is some preset value. The last inequality means that the "improvement" of the solution from one iteration to the next has become almost negligible.

Consider the popular Jacobi method, which is one of the simplest iterative methods for solving SLAEs.

First, we write the system in the following form:

$$\sum\limits_{j=1}^{n} a_{ij} x_j = b_i, \quad i = \overline{1, n}.$$


Separate the $i$-th term and express it through all the others:

$$x_i = \dfrac{b_i - \sum\limits_{j \neq i} a_{ij} x_j}{a_{ii}}, \quad i = \overline{1, n}.$$


Now we simply hang iteration "counters" on the variables and obtain the Jacobi iterative method:

$$x_i^{k+1} = \dfrac{b_i - \sum\limits_{j \neq i} a_{ij} x_j^{k}}{a_{ii}}, \quad i = \overline{1, n}, \quad k = 0, 1, \dots$$
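In matrix form, splitting $A = L + D + U$ into its strictly lower-triangular, diagonal, and strictly upper-triangular parts, the same iteration can be written as:

$$x^{k+1} = D^{-1}\left(b - (L + U)\,x^k\right).$$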



Note that a prerequisite for this method is the absence of zeros on the main diagonal. Moreover, convergence is guaranteed only under additional assumptions, for example when the matrix is strictly diagonally dominant, that is, $|a_{ii}| > \sum_{j \neq i} |a_{ij}|$ for every $i$; in general the iteration may diverge.

Implementation of the Jacobi method in Java:
As $\varepsilon$, the pre-computed machine epsilon is taken (EPSCalc halves eps until $1 + \text{eps}$ is no longer distinguishable from $1$ in double precision).

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.PrintWriter;
import java.util.Locale;
import java.util.Scanner;

public class JacobiMethod {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner sc = new Scanner(new FileReader("input.txt"));
        sc.useLocale(Locale.US);
        int n = sc.nextInt();
        double[][] a = new double[n + 1][n + 1];
        double[] b = new double[n + 1];
        double[] x0 = new double[n + 1];
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                a[i][j] = sc.nextDouble();
            }
            b[i] = sc.nextDouble();
            x0[i] = b[i] / a[i][i]; // initial approximation
        }
        double EPS = EPSCalc();
        double[] x = new double[n + 1];
        double norm;
        int counter = 0;
        do {
            // one Jacobi sweep: every x[i] is computed from the previous vector x0
            for (int i = 1; i <= n; i++) {
                double sum = 0;
                for (int j = 1; j <= n; j++) {
                    if (j == i) continue;
                    sum += a[i][j] * x0[j];
                }
                x[i] = (b[i] - sum) / a[i][i];
            }
            norm = normCalc(x0, x, n);
            for (int i = 1; i <= n; i++) {
                x0[i] = x[i];
            }
            counter++;
        } while (norm > EPS);
        PrintWriter pw = new PrintWriter("output.txt");
        pw.println(counter + " iterations");
        for (int i = 1; i <= n; i++) {
            pw.printf(Locale.US, "x%d = %f\n", i, x0[i]);
        }
        pw.flush();
        pw.close();
    }

    // L1 norm of the difference of two vectors
    static double normCalc(double[] x1, double[] x2, int n) {
        double sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += Math.abs(x1[i] - x2[i]);
        }
        return sum;
    }

    // machine epsilon: halve eps until 1 + eps is indistinguishable from 1
    static double EPSCalc() {
        double eps = 1;
        while (1 + eps > 1) {
            eps /= 2;
        }
        return eps;
    }
}

A modification of the Jacobi method is the relaxation method. Its main difference is that, with the help of a pre-selected parameter, the number of iterations is significantly reduced. Let us briefly describe the main idea of the method.

From the original system we again express $x$, but arrange the iteration counters a little differently and add a parameter $\omega$:

$$x_i^{k+1} = \dfrac{\omega \left( b_i - \sum\limits_{j=1}^{i-1} a_{ij} x_j^{k+1} - \sum\limits_{j=i+1}^{n} a_{ij} x_j^{k} \right)}{a_{ii}} + (1 - \omega)\, x_i^{k}, \quad i = \overline{1, n}, \quad k = 0, 1, \dots$$


With $\omega = 1$ all of this turns into the Gauss–Seidel method (note that the first sum already uses the freshly updated components $x_j^{k+1}$).

So, we will look for a "good" $\omega$. Fix some number $s$ and run through the values $\omega \in (0, 2)$, computing for each of them the norms $\|x^{k+1} - x^k\|$, $k = \overline{1, s}$. We remember the value of $\omega$ that gives the smallest of these norms, and use it to solve our system.

Illustration of the method in Java (here $s = 5$):

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.PrintWriter;
import java.util.Locale;
import java.util.Scanner;

public class SuccessiveOverRelaxation {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner sc = new Scanner(new FileReader("input.txt"));
        sc.useLocale(Locale.US);
        int n = sc.nextInt();
        double[][] a = new double[n + 1][n + 1];
        double[] b = new double[n + 1];
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                a[i][j] = sc.nextDouble();
            }
            b[i] = sc.nextDouble();
        }
        double EPS = EPSCalc();
        double w = bestRelaxationParameterCalc(a, b, n);
        double[] x = new double[n + 1];
        int counter = 0;
        double maxChange;
        do {
            maxChange = 0;
            for (int i = 1; i <= n; i++) {
                double firstSum = 0; // components already updated in this sweep
                for (int j = 1; j <= i - 1; j++) {
                    firstSum += a[i][j] * x[j];
                }
                double secondSum = 0; // components from the previous sweep
                for (int j = i + 1; j <= n; j++) {
                    secondSum += a[i][j] * x[j];
                }
                double lastTerm = (1 - w) * x[i];
                double z = b[i] - firstSum - secondSum;
                double temp = (w * z) / a[i][i] + lastTerm;
                maxChange = Math.max(maxChange, Math.abs(x[i] - temp));
                x[i] = temp;
            }
            counter++;
        } while (maxChange > EPS);
        PrintWriter pw = new PrintWriter("output.txt");
        pw.println(w + " is the best relaxation parameter");
        pw.println(counter + " iterations");
        for (int i = 1; i <= n; i++) {
            pw.printf(Locale.US, "x%d = %f\n", i, x[i]);
        }
        pw.flush();
        pw.close();
    }

    // tries w in (0, 2) with step 0.05, runs s = 5 sweeps for each,
    // and returns the w that gave the smallest last change
    static double bestRelaxationParameterCalc(double[][] a, double[] b, int n) {
        double bestW = 1, bestMaxChange = Double.MAX_VALUE;
        for (double w = 0.05; w <= 2; w += 0.05) {
            double[] x = new double[n + 1];
            double maxChange = 0;
            for (int s = 0; s < 5; s++) {
                maxChange = 0;
                for (int i = 1; i <= n; i++) {
                    double firstSum = 0;
                    for (int j = 1; j <= i - 1; j++) {
                        firstSum += a[i][j] * x[j];
                    }
                    double secondSum = 0;
                    for (int j = i + 1; j <= n; j++) {
                        secondSum += a[i][j] * x[j];
                    }
                    double lastTerm = (1 - w) * x[i];
                    double z = b[i] - firstSum - secondSum;
                    double temp = (w * z) / a[i][i] + lastTerm;
                    maxChange = Math.max(maxChange, Math.abs(x[i] - temp));
                    x[i] = temp;
                }
            }
            if (maxChange < bestMaxChange) {
                bestMaxChange = maxChange;
                bestW = w;
            }
        }
        return bestW;
    }

    // machine epsilon: halve eps until 1 + eps is indistinguishable from 1
    static double EPSCalc() {
        double eps = 1;
        while (1 + eps > 1) {
            eps /= 2;
        }
        return eps;
    }
}

Instead of a conclusion


There is also a mass of other algorithms for solving SLAEs. For example, the square-root method (based on the Cholesky decomposition), in which the original system is replaced by two "simple" systems whose solutions are computed by elementary formulas, or the sweep method, which is used for the specific case of tridiagonal matrices. Everyone chooses which method to use for their own problem.
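For illustration, a minimal sketch of the sweep (Thomas) algorithm for tridiagonal systems, assuming the three diagonals are passed as separate arrays (the helper name is illustrative; a matrix for which the sweep is stable, e.g. diagonally dominant, is assumed):

// Solves a tridiagonal system in O(n): the forward sweep eliminates the lower
// diagonal, back substitution recovers x.
static double[] thomasSolve(double[] lower, double[] diag, double[] upper, double[] b) {
    int n = diag.length; // assumes n >= 2; lower[0] and upper[n-1] are not used
    double[] c = new double[n]; // modified upper diagonal
    double[] d = new double[n]; // modified right-hand side
    c[0] = upper[0] / diag[0];
    d[0] = b[0] / diag[0];
    for (int i = 1; i < n; i++) {
        double denom = diag[i] - lower[i] * c[i - 1]; // pivot after elimination
        if (i < n - 1) {
            c[i] = upper[i] / denom;
        }
        d[i] = (b[i] - lower[i] * d[i - 1]) / denom;
    }
    double[] x = new double[n];
    x[n - 1] = d[n - 1];
    for (int i = n - 2; i >= 0; i--) {
        x[i] = d[i] - c[i] * x[i + 1];
    }
    return x;
}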

Source: https://habr.com/ru/post/418241/

