18-330 – Exercise 1: LU factorization (Solution)

Description

Suppos you are given an upper-triangular 𝑚 × 𝑚 matrix U, i.e. all elements below the main diagonal are known to be 0. We wish to solve Ux = b for the vector x.
1. Find the analytical solution for the components 𝑥𝑖 of x in terms of the 𝑈𝑖,𝑗 and 𝑏𝑖 using backsubstitution. (Hint: Where should you start?)
2. Write a function backsubstitution(U, b) that implements this.
3. What is the approximate operation count to perform backsubstitution?
4. Repeat [1]–[3] for forward substitution to solve Lx = b for a lowertriangular matrix L.
5. Write down what happens at a general step in the Gaussian elimination process for a matrix 𝐴, in terms of the old coefficients 𝐴𝑖,𝑗 and the new coefficients 𝐴′𝑖,𝑗.
6. Implement Gaussian elimination for an 𝑚 × 𝑚 square matrix 𝐴 to calculate the 𝐿𝑈 factorization without pivoting. Your function LU_factorization should return L and U.
7. What is the operation count of this algorithm?
8. Write a function solve_linear_system(A, b) that uses the functions you have written to solve the linear system Ax = b.
9. Consider the matrix A built by the following Julia code. Please use exactly this code so that the results are consistent:
using Random
rng = MersenneTwister(1234) A = randn(rng, Float64, (10, 10)) b = randn(rng, Float64, 10) Solve Ax = b for this A and b.
Print out the full x vector using jl show(stdout, “text/plain”, x) (This prevents the ellipses, …, that can occur in compressed output.)
Compare this to the result from Julia’s (backslash) operator and comment.
10. Find the size of the residual Ax − b using the maximum norm (i.e. find the largest entry of the vector) for the x you found using your solve_linear_system(A, b) function.
11. Plot the residual as a function of 𝑚 for matrices of sizes between 𝑚 = 10 and 𝑚 = 1000 (or however high your computer can manage) on suitable axes.
How does the residual grow as the size grows?
Exercise 2: Exploiting structure in matrices 1 : Banded LU
In this exercise we will exploit structure in a matrix, i.e. the location of zero elements. If we know in advance that there are zeros in a certain part of the matrix (“structural zeros”), we can exploit that information to define more efficient operations on matrices of that type.
𝐴𝑖𝑗 ≠ 0 ⟹ 𝑖 − 𝑝 ≤ 𝑗 ≤ 𝑖 + 𝑞
For example, a diagonal matrix is (0,0)-banded (only the main diagonal is present) and a tridiagonal matrix is (1,1)-banded.
As an example, think of a system of 𝑛 particles in a line, each of which interacts only with its nearest neighbour on each side. This could lead to an interaction matrix that is known in advance to be tridiagonal, hence leading to the possibility of more efficiently solving the dynamics. Finite difference matrices are also banded. The matrices without structure that we discussed in the lectures on LU and QR are called dense matrices. We will define a new type Banded as follows:
struct Banded{T} <: AbstractMatrix{T} bands::Matrix{T} p::Int q::Int
end
Here, bands is a matrix that stores only the bands (diagonals) of the original matrix, numbered (−𝑝,−(𝑝−1),…,0,1,…𝑞), where 0 corresponds to the main diagonal.
Note that there are more terms in bands than we need to store. We will store the lower bands justified to the bottom and the upper bands justified to the top. For example, consider the general (3, 2) banded matrix, i.e. with 3 diagonals below and 2 diagonals above the main diagonal.
𝑏11 𝑏12 𝑏13
⎡𝑏21 𝑏22 𝑏23 ⎢𝑏31 𝑏32 𝑏33
𝐵 = ⎢𝑏41 𝑏42 𝑏43 ⎢ 𝑏52 𝑏53
⎣ 𝑏63
This would be stored as follows in bands: 𝑏24
𝑏34
𝑏44
𝑏54
𝑏64 𝑏35
𝑏45
𝑏55
𝑏65 ⎤
⎥
𝑏46⎥⎥
𝑏56
𝑏66⎦
𝑏41 𝑏31 𝑏21
⎡𝑏52 𝑏42 𝑏32
𝐵 = ⎢⎢𝑏063 𝑏𝑏6453 𝑏𝑏5443
⎢ 0 0 𝑏65
⎣ 0 0 0 𝑏11
𝑏22
𝑏33
𝑏44
𝑏55 𝑏66 0
𝑏12
𝑏23
𝑏34
𝑏45
𝑏56 0
0 ⎤
𝑏13⎥
𝑏24⎥⎥
𝑏35
𝑏46⎦
The columns of the new matrix correspond to the diagonals of the old matrix.
Although this seems not to be more efficient in terms of storage, it will be for a larger matrix.
1. Write a function mymul(B::Banded, v::Vector) that returns the matrix– vector product B*v. The function should be optimized in the sense that terms we know are 0 are not calculated. Compare the result to the built-in dense matrix multiply using the code Array(B) * v. (Array(B) produces a dense matrix with the same bands as B).
2. Consider the LU factorization. Show, using the formula for matrix multiplication, that if 𝐿 is (𝑝,0)-banded and 𝑈 is (0,𝑞)-banded then A = LU is (𝑝,𝑞)-banded.
3. Now show, by thinking about how the elimination algorithm works, that if A is (𝑝,𝑞)-banded then the factor L is (𝑝,0)-banded and U is (0,𝑞)banded.
4. We have seen that the structure of bandedness is maintained during an LU factorization. We saw in Exercise 1 that the operation count for a dense LU factorization is 𝒪(𝑛3). What happens to the operation count when we exploit the banded structure of a (𝑝,𝑞)-banded matrix of size 𝑚 × 𝑚? Find the approximate operation count in terms of 𝑚, 𝑝 and 𝑞.
5. Implement LU factorization for a banded matrix, LU_factorization(B::Banded), making sure to operate only on non-zero entries.
6. Given the new operation count, is the naive back/forward substitution or the LU factorization the bottleneck when solving a banded system? If LU is no longer the bottleneck, add methods for your substitution functions so that they are specialized for banded matrices.
(Hint for parts 5 & 6: if you fully understand what is happening here, modifying your LU / substitution codes from Exercise 1 should be simple for banded matrices, changing at most 1 or two lines of code.)
7. As an example of how necessary it is to exploit structure for efficiency, create a random tridiagonal matrix A of size 𝑛 × 𝑛 and a random vector b of length 𝑛, taking 𝑛 = 10,000 (or whatever your computer can safely handle. How many times faster does your structured LU code run than if you were to use dense (standard) linear algebra (after you convert your banded matrix to a dense matrix)? (Remember to run your code once before timing it.)
8. Check numerically the dependence on 𝑚 that you found in question [4] by solving random tridiagonal systems of sizes 𝑚×𝑚 for 𝑚 between 10 and 10,000 and plotting the time taken as a function of 𝑚. (If the times are too fast, use larger values of 𝑚!)
[You can use @elapsed (or @belapsed from BenchmarkTools.jl) to capture the time taken by a Julia operation, so that you can automate this in a loop.]
Exercise 3: Exploiting structure in matrices 2 : Tridiagonal QR
In this exercise we will see how to exploit structure (zeros again) to make a more efficient QR algorithm. (However, we do not ask you to code the algorithm this time.)
Consider a general tridiagonal matrix (i.e. a (1,1)-banded matrix):
𝑏1 𝑐1
⎡𝑎2 𝑏2 𝑐2 ⎤
𝑇 = ⎢⎢ 𝑎3 𝑏⋱3 𝑐⋱3 ⋱ ⎥⎥
⎢ 𝑎𝑁−1 𝑏𝑁−1 𝑐𝑁−1⎥
⎣ 𝑎𝑁 𝑏𝑁 ⎦
Since there is only a single subdiagonal, it should be easier to make the matrix upper-triangular by operating on it.
1. Consider the following rotation matrix:
𝑅(𝜃) = [cossin 𝜃𝜃 −cossin𝜃𝜃]
How should you choose 𝜃 so that
𝑝 𝑟
𝑅(𝜃)[𝑞] = [0]
where 𝑟2 = 𝑝2 + 𝑞2 ? 2. Consider the matrix
𝑅(𝜃) 0 𝑃0 = [ 0 𝐼𝑁−2]
where 𝐼𝑁 is the 𝑁 ×𝑁 identity matrix, and where 𝜃 is chosen as above for the terms 𝑝 = 𝑏1 and 𝑞 = 𝑎2. What does the resulting matrix 𝐴1̃ = 𝑃0𝐴 look like? Show that the first column is now upper-triangular.
To be more concrete consider the matrix
1 4 0 0
⎡2 1 3 0⎤ 𝑀 = ⎢0 5 6 7⎥.
⎣0 0 1 2⎦
Show the resulting matrix when you multiply by 𝑃0 – is it what you expected?
3. What matrix should you multiply 𝐴1̃ by to make the second column uppertriangular? (Hint: it will have a similar structure to 𝑃0) Check that the first column is still upper triangular. Call this matrix 𝑃1. Multiply the result of 𝑃0𝑀 by 𝑃1 and show the result.
4. Generalize to the rotation matrix that makes the 𝑛th column of a tridiagonal matrix upper-triangular. Call this matrix 𝑃𝑛−1. Show that 𝑃𝑛−1 is orthogonal, i.e. 𝑃𝑛𝑃𝑛⊤ = 𝐼.
5. We can now write a tridiagonal matrix 𝑇 in upper-triangular form by forming the product 𝑃𝑁−2𝑃𝑁−3 ⋯𝑃1𝑃0𝑇 . From the above we see that the result of this will be an upper-triangular matrix 𝑅.
Show that the product of orthogonal matrices is orthogonal, and hence that this procedure gives a QR decomposition of 𝑇 .
6. What is the approximate operation count for a full QR factorization on a dense matrix 𝐴?
7. What is the approximate operation count for this reduced tridiagonal QR?
8. Extra credit Implement this algorithm in an efficient way.

Reviews

There are no reviews yet.

Be the first to review “18-330 – Exercise 1: LU factorization (Solution)”

18-330 – Exercise 1: LU factorization (Solution)

Description

Reviews

Related products

18-330 – Exercise 1: Fourier series (Solution)

18-330 – Exercise 1: Runge–Kutta methods (Solution)

18-330 – Exercise 1: Finite differences via interpolation (Solution)

18-330 – Exercise 1: Algorithmic differentiation (Solution)

18-330 – Exercise 1: Fixed-point iteration In this question we will find the roots of the function (Solution)