UNIVERSITY OF LJUBLJANA
FACULTY OF COMPUTER AND INFORMATION SCIENCE
LABORATORY FOR MATHEMATICAL METHODS IN COMPUTER SCIENCE AND INFORMATICS

Damir Franetič, Peter Marijan Kink

PROBLEMS IN MATHEMATICS 1 (WITH SOLUTIONS)
Study material for the course Mathematics 1 (63506A)

Ljubljana, 2024

Problems in Mathematics 1 (with solutions)
Study material for the course Mathematics 1 (63506A)
Authors: Damir Franetič, Peter Marijan Kink
Reviewers: Polona Oblak, Žiga Virk
Published by: University of Ljubljana Press (Založba Univerze v Ljubljani)
For the publisher: Gregor Majdič, The Rector of the University of Ljubljana
Issued by: Faculty of Computer and Information Science, University of Ljubljana
For the issuer: Mojca Ciglarič, The Dean of the Faculty of Computer and Information Science, University of Ljubljana
Ljubljana, 2024

A digital copy of the book is available at: https://ebooks.uni-lj.si/
First e-edition. The publication is free of charge.
DOI: 10.51939/9789612973186

This work is licensed under CC BY-NC-SA 4.0. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/

The cataloguing record of the publication (CIP) was prepared by the National and University Library in Ljubljana.
COBISS.SI-ID 191876355
ISBN 978-961-297-318-6 (PDF)

Preface

The following pages contain a representative selection of tasks and problems which appeared at problem sessions, midterms and final (computational) exams in the course Mathematics 1 for first-year students of the master's study programme at the Faculty of Computer and Information Science, University of Ljubljana.

In addition to the two authors, Polona Oblak also contributed many problems and ideas for problems. Uroš Kozole helped with suggestions for improvements and a thorough inspection of the first typed solutions. The entire text was reviewed by Polona Oblak and Žiga Virk. Of course, that does not mean that the solutions in front of you are error-free. If you find errors, please bring them to the authors' attention.

Individual tasks in each section are in no specific order. The idea is that you, the reader, attempt to solve an exercise regardless of its perceived difficulty. Nonetheless, the solution to each problem is linked at the right edge, e.g. (solution 3.17), while each solution contains a link back to the problem, e.g. ". . . to problem 3.17". While immediate reading of the solutions may seem tempting, arriving at the solution by yourself is much more rewarding.

Contents

1 Linear algebra . . . 5
 1.1 A recollection of basic concepts . . . 5
 1.2 Schur, Frobenius, Eckart–Young . . . 6
2 Vector spaces and linear maps . . . 11
 2.1 Vector spaces . . . 11
 2.2 Linear maps . . . 13
3 Functions of several variables . . . 16
 3.1 Multiple integrals . . . 16
 3.2 Local extrema of real multivariate functions . . . 18
 3.3 Constrained extrema . . .
19 4 Solutions 22 4 Chapter 1 Linear algebra 1.1 A recollection of basic concepts Problem 1.1 (solution 1.1, page 22) Determine the eigenvalues and corresponding eigenvectors of the matrix  0 2 2    A =      3 1 −3     −1 −1 3  Problem 1.2 (solution 1.2, page 23) Determine the eigenvalues and orthonormal bases of corresponding eigenspaces for the matrix 1 1 1 0      1 0 0 1  H =         1 0 0 −1     0 1 −1 1  Problem 1.3 (solution 1.3, page 25) You are given an n × n matrix  0 1 · · · 1       1 0 · · · 0    A =      . . .  ,  .   . . . .   . . . .       1 0 · · · 0  i.e., the adjacency matrix of a (undirected) star graph. (a) Determine bases of N ( A) and C( A), i.e., bases of the nullspace and the column space of A. 5 CHAPTER 1. LINEAR ALGEBRA 6 (b) Determine the eigenvalues and eigenvectors of A. Hint: Why is N ( A) an eigenspace of A? Why are eigenspaces of A other than N ( A) contained in N ( A)⊥? Problem 1.4 (solution 1.4, page 28) The following is known about a symmetric matrix A ∈ 4×4 R : A has 3 as a double eigenvalue and it interchanges the vectors 0 1         1 0 v     1 =       and v   ,   2 =   1 0         0 1 i.e. A v1 = v2 and A v2 = v1. Find such a matrix A or prove that it does not exist. Problem 1.5 (solution 1.5, page 29) Let A be an n × n matrix. One way to define the exponential of the matrix A is ∞ X 1 eA := Ak k! k=0 (we substituted x ∈ R with A in the Taylor series for the function ex). (a) Prove the identity det( eA) = e tr( A) (b) Assume A is an antisymmetric matrix, meaning A T = − A. Show that eA is an orthogonal matrix with determinant equal to 1. 1.2 Schur decomposition, Frobenius norm, Eckart– Young theorem Problem 1.6 (solution 1.6, page 31) Determine one Schur decomposition for each of the matrices 6 −1 1  2 −1 0     A =        0 1 0 4 3 1 and B =   .    √ √      2 2 3 − 2 − 2 2 Problem 1.7 (solution 1.7, page 34) Let A be an arbitrary matrix and let U and V be orthogonal matrices, so that one can form the product U AV . Prove that the following equalities hold: CHAPTER 1. LINEAR ALGEBRA 7 1. ∥ U A∥F = ∥ A∥F, 2. ∥ AV ∥F = ∥ A∥F, 3. ∥ U AV ∥F = ∥ A∥F. Problem 1.8 (solution 1.8, page 35) Denote by ⟨ A, B⟩F := tr( A T B) the Frobenius inner product of matrices A,B ∈ √ m× n R , and denote by ∥ A∥ ⟨ F := A, A⟩F the corresponding Frobenius norm. Prove: 1. the Cauchy–Schwarz inequality, |⟨ A, B⟩ | ≤ ∥ ∥ F A∥F B∥F, 2. the triangle inequality, ∥ A + B∥ ≤ ∥ F A∥F + ∥ B∥F, 3. submultiplicativity, ∥ AB∥ ≤ ∥ ∥ F A∥F B∥F, 4. multiplicativity for the Kronecker product, ∥ A ⊗ B∥ ∥ F = ∥ A∥F B∥F. Problem 1.9 (solution 1.9, page 37) Let I be the 2 × 2 identity matrix. Find an orthonormal basis of orthogonal complement of I (the vector subspace I⊥ ⊆ 2×2 R ) with respect to the (Frobe- nius) inner product ⟨ A, B⟩F := tr( A T B). Problem 1.10 (solution 1.10, page 37) Find rank 1 matrices closest to the matrices 2 0 0 "1 3# "2 0#   (b) , (c) (a)     0 −3 0, 3 1 0 2     0 0 1 with respect to the Frobenius norm. Are those rank 1 matrices unique? Problem 1.11 (solution 1.11, page 40) Prove the following: (a) If the set {u n j : j = 1 , . . . , q} ⊆ R is linearly independent, then, whenever q X x ⊗ mn j u j = 0 ⊗ 0 = 0 ∈ R j=1 holds for vectors x ∈ m 1 , . . . , x q R , the vectors x j are necessarily zero, i.e., x m 1 = · · · = x q = 0 ∈ R . (b) Given linearly independent sets of vectors {v m i : i = 1 , . . . 
, p} ⊆ R and {u n j : j = 1 , . . . , q} ⊆ R , the set {v ⊗ mn i u j : i = 1 ,...,p ; j = 1 ,...,q} ⊆ R is also linearly independent. CHAPTER 1. LINEAR ALGEBRA 8 Problem 1.12 (solution 1.12, page 41) Let A ∈ m× m n× n R and B ∈ R . Show that the Kronecker sum A ⊕ B := A ⊗ I ⊗ n + Im B has the property: Eigenvalues of A ⊕ B are all possible sums of the form λi + µj, where λ 1 ,...,λm are eigenvalues of A, and µ 1 ,...,µn eigenvalues of B. Use this to find eigenvalues and eigenvectors of A ⊕ B, where "−1 2# "1 0# A = and B = . 0 3 2 2 Problem 1.13 (solution 1.13, page 41) Let "2 2 # A = . 2 −1 (a) Determine a diagonal matrix D and an orthogonal matrix U such that A = U DU T. (b) Explain why for an orthogonal matrix U the matrix U ⊗ U is also orthogonal. (c) Find matrices of rank 1 and 2 which are best approximations to the Kronecker product A ⊗ A with respect to the Frobenius norm. Problem 1.14 (solution 1.14, page 43) You are given matrices "−2 2# "−3 2# A = and B = . 2 1 2 0 (a) Diagonalize matrices A and B. Write down both corresponding diagonal and transition matrices. (b) Let u be an eigenvector of A and v and eigenvector of B. Prove that u ⊗ v is an eigenvector of A⊗ B+ A⊗ I + I ⊗ B. Additionally, find all eigenvalues of A ⊗ B + A ⊗ I + I ⊗ B. (c) Find a rank 1 matrix M, which is closest to the matrix A⊗ B+ A⊗ I + I ⊗ B with respect to the Frobenius norm. Problem 1.15 (solution 1.15, page 44) Find the eigenvalues and corresponding eigenvectors of the matrix A⊗ A+ A 2⊗ I, where A is the matrix "−1 3 # A = . 3 −1 CHAPTER 1. LINEAR ALGEBRA 9 Problem 1.16 (solution 1.16, page 45) The objective of this exercise is to express the Sylvester matrix equation AX + XB = C in the ‘usual’ form ( ˆ A x = b) using the vec operator and then solve this equation. (a) Verify that the matrix equation AX + XB = C in the unknown matrix X is equivalent to the linear system ( B T ⊕ A)vec( X) = vec( C) in the unknown column vec( X). (b) Let A and B be 2 × 2 matrices "−1 2# "1 0# A = and B = . 0 3 2 2 Does AX + XB = 0 posses a non-trivial solution? (You need to answer quickly! Do not attempt to solve the corresponding linear system. . . ) (c) Find a matrix X which solves "−2 1# AX + XB = . 2 5 Problem 1.17 (solution 1.17, page 46) Let A ∈ n× n R be a matrix with only nonnegative eigenvalues. (a) Prove that A is invertible if and only if all of its eigenvalues are (strictly) positive. (b) Assume that A is invertible. Prove that all of its eigenvalues are positive if and only if all of the eigenvalues of A−1 are positive. (c) Assume A T = A. Prove that there exists a matrix S, with only nonnegative √ eigenvalues and S 2 = A holds. We denote such matrix S as S = A. Problem 1.18 (solution 1.18, page 46) You are given the matrix 2 3 1   A =     3 6 3 .     1 3 2 (a) Check that A is positive semidefinite. (b) Find all eigenvalues and corresponding eigenvectors of A. √ (c) Find A. CHAPTER 1. LINEAR ALGEBRA 10 Problem 1.19 (solution 1.19, page 47) Find the Cholesky decomposition ( A = LL T, where L is lower triangular) of the matrix  1 2 −1   A =      2 8 2      −1 2 6  using (recursive) algorithm below: Write a symmetric matrix A ∈ n× n R in the block form " a # A 11 bT 1 := A = b B and define √ " a # 11 0T L 1 := . √1 b I a n−1 11 Then " a # "1 0T # A 11 bT 1 = = L L T b B 1 0 B − 1 bbT 1. a 11 Repeat this on the symmetric matrix A ( n−1)×( n−1) 2 := B − 1 bbT ∈ R . a 11 Let L 2 ,L 3 ,...,Ln be the matrices obtained in repeated steps. 
The matrix L is then "1 0T# " I # L = L · · · · n−1 0 1 . (1.2) 0 L 2 0T Ln Problem 1.20 (solution 1.20, page 48) Is any of the matrices  1 −1 5  −4 6 −2      A =     −     1 4 −5 and B =  6 −10 5           5 −5 2  −2 5 −14 negative definite? For each negative definite matrix X ∈ { A, B} find the Cholesky decomposition of − X. Chapter 2 Vector spaces and linear maps 2.1 Vector spaces Problem 2.1 (solution 2.1, page 49) Which subsets of the vector space n× n R are vector subspaces? Determine the dimension of those that are. (a) All matrices, which have 0 as the (1 , 2)-entry. (b) All matrices, which have 1 as the (1 , 2)-entry. (c) All matrices with integer entries, i.e., for A = [ a ∈ ij ] we have aij Z (for all indices i, j). (d) All upper-triangular matrices. (e) All symmetric matrices; A = A T. (f) All antisymmetric matrices; A = − A T. (g) All invertible matrices; the subset GL( n, n× n R) ⊆ R . (h) All matrices with determinant 0, i.e., n× n R ∖ GL( n, R). (i) All nilpotent matrices, i.e., matrices N ∈ n× n R such that N n = 0. (j) All upper-triangular nilpotent matrices. (Hint: Which elements appear on the diagonal of an upper-triangular nilpotent matrix?) (k) All matrices with trace 0. Problem 2.2 (solution 2.2, page 52) Equip the open interval + R = (0 , ∞) with the operation x ⊕ y := xy, and define α ⊙ x = xα for scalars α ∈ R. (a) Show that ( + R , ⊕ , ⊙) is a vector space over R. (b) Find a basis for ( + R , ⊕ , ⊙) and determine its dimension. Problem 2.3 (solution 2.3, page 53) 11 CHAPTER 2. VECTOR SPACES AND LINEAR MAPS 12 Let F be the set of all Fibonacci sequences, i.e., sequences ( an)∞ n=0 = ( a 0 , a 1 , a 2 , . . . ), where a 0 and a 1 are arbitrary real numbers, and an = an−1 + an−2 holds for all n ≥ 2. (a) Show that F is a vector space under operations ( an) + ( bn) := ( an + bn) and α( an) := ( αan), where α ∈ R. (b) Find a basis for F and express the usual Fibbonacci sequence (the one with a 0 = a 1 = 1) in this basis. Problem 2.4 (solution 2.4, page 54) Let N be the matrix "0 0# N = . 1 0 Show that the set of all real 2 × 2 matrices which commute with N , i.e., U = { A ∈ 2×2 R : AN = N A}, is a vector subspace in 2×2 R . Find a basis for U and determine its dimension! Problem 2.5 (solution 2.5, page 54) (a) Is the set U 1 = { p( x) = ax + b : a , 0 ,a,b ∈ R} a vector subspace in the vector space of polynomials R1[ x]? (b) Is the set U 2 = { p( x) : p(0) = 0} a vector subspace in the vector space of polynomials R2[ x]? (c) Is the set U 3 = { p( x) : p(0) = 1} a vector subspace in the vector space of polynomials R n[ x]? (d) Is the set U 4 = { p( x) : p′′(3) = 0 } a vector subspace in the vector space of polynomials R n[ x]? Problem 2.6 (solution 2.6, page 55) Let A be the matrix "1 −1# A = . 0 −1 Define subsets V := { X ∈ 2×2 2×2 R : XA + AX = 0} and W := { X ∈ R : XAX = X}. of the vector space 2×2 2×2 R . Which of these subsets are vector subspaces of R ? Why/why not? For each subset that is a vector space find its basis and determine its dimension. CHAPTER 2. VECTOR SPACES AND LINEAR MAPS 13 Problem 2.7 For a polynomial p( x) = ax 3 + bx 2 + cx + d and a square matrix A denote p( A) = aA 3 + bA 2 + cA + dI. Let A ∈ 2×2 R be the matrix "1 2# A = . 2 1 Let U ⊆ R3[ x] be the subset of those polynomials (of degree at most 3), for which p( A) = 0 (the zero matrix). (a) Show that U is a vector subspace of R3[ x]. (b) Find a basis for U and determine dim U . (Hint: If ∆ A( x) is the characteristic polynomial of A, then ∆ A( A) = 0.) 
(c) Let q( x) = x( x 2 − 2 x − 3). Is the set of all 2 × 2 matrices X, for which q( X) = 0 holds, a vector subspace of 2×2 R ? Justify your answer! Problem 2.8 (solution 2.8, page 55) Let R[ x] be the set of all polynomials in the indeterminate x. (Hence, R[ x] contains polynomials of arbitrary degrees!) Show that R[ x] is a vector space for the usual addition of polynomials and multiplication of a scalar and a polynomial. Can you describe a basis for R[ x]? Can you find a basis for the subspace W = { p ∈ R[ x] : p(1) = p(−1) = 0}? Determine dim R[ x] and dim W . Problem 2.9 (solution 2.9, page 56) Let V ⊆ C∞(0 , 2 π) be the set of all solutions to the differential equation y′′ + y = 0. Show that V is a vector subspace of C∞(0 , 2 π). Find its basis. 2.2 Linear maps Problem 2.10 (solution 2.10, page 56) A map τ : 2×2 2×2 R → R is given by "1 1# "1 1# τ( X) = X + X . 1 0 1 0 (a) Show that τ is a linear map. (b) Find the matrix corresponding to τ with respect to the standard basis { E } 2×2 11 , E 12 , E 21 , E 22 of the vector space R . CHAPTER 2. VECTOR SPACES AND LINEAR MAPS 14 Problem 2.11 (solution 2.11, page 57) For a polynomial p( x) = ax 3 + bx 2 + cx + d, and a square matrix A denote p( A) = aA 3 + bA 2 + cA + dI. Let A ∈ 2×2 R be the matrix "1 2# A = . 2 1 (a) Show that the map given by φ : 2×2 R3[ x] → R , φ( p) = p( A) is linear and determine the matrix corresponding to φ in the standard bases of spaces 2×2 R3[ x] and R . (b) Find a basis for ker φ and determine dim(ker φ). (Hint: If ∆ A( λ) is the characteristic polynomial of A, then ∆ A( A) = 0.) (c) Let q( x) = x( x 2 − 2 x − 3). Is the set of all 2 × 2 matrices X, for which q( X) = 0 holds, a vector subspace of 2×2 R ? Problem 2.12 (solution 2.12, page 58) We are given vectors a = [1 , 1 , 0]T, b = [1 , 0 , 1]T, and c = [0 , 1 , 1]T in 3 R , and a linear map τ : 3 3 R → R for which τ(a) = a , τ(b) = a + b, and τ(c) = a + c holds. (a) Show that {a , b , c} is a basis of 3 R . (b) Find the matrix for τ in the basis B = {a , b , c}. (c) Find the matrix for τ in the standard basis S = {i , j , k}. (d) Where does the vector [1 , 1 , 1]T get mapped by τ? Problem 2.13 (solution 2.13, page 59) Let R3[ x] be the vector space of polynomials p of degree at most 3. (a) Check that the map φ : 3 R3[ x] → R , φ( p) := [ p(−1) , p(0) , p(1)]T is linear. (b) Find a basis Bker φ for the kernel ker φ of the map φ. (c) Find the matrix corresponding to φ in the basis {1 , x, x 2 , x 3} of R3[ x] and the standard basis of 3 R . Problem 2.14 (solution 2.14, page 60) A map ψ : R2[ x] → R2[ x] is given by ( ψ( p))( x) = ( xp( x + 1))′ − 2 p( x) . Show that ψ is linear. Find its matrix in the basis {1 , x, x 2}. Find its kernel and image. CHAPTER 2. VECTOR SPACES AND LINEAR MAPS 15 Problem 2.15 (solution 2.15, page 61) Let a = [1 , 1]T. A map φ : 2 2×2 R → R is given by φ(x) = x aT = x [1 , 1]. (a) Show that φ is linear. (b) Find the matrix corresponding to φ in the standard bases of 2 R and 2×2 R . (c) Determine dim(ker φ) and dim(im φ). (d) Find a basis of im φ. Problem 2.16 (solution 2.16, page 62) Let B = {1 , 1 − x 2 , 1 + x 4} and V = L(B) ⊂ R4[ x]. Define a map " p(−1) p′(−1)# φ : V → 2×2 R by φ( p) = . p(1) p′(1) (a) Show that φ is a linear map. (b) Find the matrix Aφ corresponding to φ with respect to the basis B of V and the standard basis of 2×2 R . (c) Determine vector space bases of the kernel ker( φ) and the image im( φ). Problem 2.17 (solution 2.17, page 63) Assume that U and V are vector subspaces of a vector space W . 
Define sets: U × V := {( u, v) : u ∈ U , v ∈ V }, U + V := { u + v : u ∈ U and v ∈ V }, and U ∩ V := { w ∈ W : w ∈ U and w ∈ V }. (a) Verify that U + V and U ∩ V are vector subspaces of W . (b) ‘Guess’ the appropriate vector space structure on U × V . Prove that U × V is actually a vector space in the guessed case! Determine dim( U × V ) from dim U and dim V . (c) Let a map φ : U × V → W be given by φ( u, v) = u − v. Confirm that φ is linear. (If it turns out, that it is not, return to part (b).) Determine ker φ and im φ. (d) Show that the map ψ : U ∩ V → ker φ, ψ( w) = ( w, w) is a linear bijection, therefore dim( U ∩ V ) = dim(ker φ). (e) Conclude that dim U + dim V = dim( U + V ) + dim( U ∩ V ). Chapter 3 Functions of several variables 3.1 Multiple integrals Problem 3.1 (solution 3.1, page 64) Let a vector–valued function F : 2 2 R → R be given by F(r) = F( r, ϕ) = [ x, y]T = x, where x = r cos ϕ, y = r sin ϕ, the so-called polar coordinates. Find the Jacobian matrix J F = ∂ F and the Jaco- ∂ r bian determinant det( J F) of F. Problem 3.2 (solution 3.2, page 64) Let a vector–valued function F : 3 3 R → R be given by F(r) = F( r, ϕ, z) = [ x, y, z]T = x, where x = r cos ϕ, y = r sin ϕ, z = z, the cylindrical coordinates. Find the Jacobian matrix J F = ∂ F and the Jacobian ∂ r determinant det( J F) of F. Problem 3.3 (solution 3.3, page 64) Let F : 3 3 R → R , r 7→ x be a vector–valued function given by F( r, ϕ, θ) = [ x, y, z]T, where: x = r cos θ cos ϕ, y = r cos θ sin ϕ, z = r sin θ, 16 CHAPTER 3. FUNCTIONS OF SEVERAL VARIABLES 17 the spherical coordinates. Find the Jacobian matrix J F = ∂ F and the Jacobian ∂ r determinant det( J F) of F. Problem 3.4 Let R ≥ 0, and let F : 3 3 R → R be a vector–valued function given by ( R + r cos θ) cos ϕ   F( r, ϕ, θ) = F [ r, ϕ, θ]T =     ( R + r cos θ) sin ϕ  ,      r sin θ  the toroidal coordinates. (a) Find the Jacobian matrix F; J F = ∂ F of F. ∂[ r,ϕ,θ]T (b) Find the determinant of that Jacobian matrix; det( J F). Problem 3.5 (solution 3.5, page 65) Evaluate double integrals below. " (a) (5 − x − y) dxdy, where D = [0 , 1] × [0 , 1], D " y (b) dxdy, where D is given by x ≥ 0, y ≥ x in x 2 + y 2 ≤ 2, D x + 1 " sin x (c) dxdy, where D is the triangle given by 0 ≤ y ≤ x, and x ≤ π, D x " Z ∞ (d) e− x 2− y 2 dxdy, and use this to evaluate e− x 2 dx. 2 R −∞ Problem 3.6 Sketch the domain of integration and evaluate the integrals. Z 1  Z x  1.      xeydy dx,   0  − x  √ √ Z 1  Z y    y Z 2 Z 2− y 2 y 2.          dx dy +  dx dy.     0  0 x + 1  1  0 x + 1  Problem 3.7 (solution 3.7, page 66) What is the volume of the solid bounded by the paraboloid z = 8 − x 2 − y 2 and the plane z = −1? Problem 3.8 (solution 3.8, page 66) Find the coordinates of the center of mass of the quarter of a disk given by inequalities x 2 + y 2 ≤ R 2, x ≥ 0, y ≥ 0 if the density at each point is equal to the distance from the origin, i.e., ρ( x, y) = p x 2 + y 2. CHAPTER 3. FUNCTIONS OF SEVERAL VARIABLES 18 ! Hint: the mass of a figure D ⊆ 2 R is given by m = ρ( x, y) dxdy, coordinates ! D ! of the center of mass are x∗ = 1 xρ( x, y) dxdy and y∗ = 1 yρ( x, y) dxdy. m D m D Use polar coordinates. Problem 3.9 (solution 3.9, page 67) Determine the mass and the coordinates of the center of mass of a homogeneous solid (i.e., ρ( x, y, z) = 1) bounded by surfaces z 2 = x 2 + y 2 and x 2 + y 2 + z 2 = 4, which lies in the half-space z ≥ 0. Hint: Use spherical coordinates. 
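The multiple integrals in this section lend themselves to a quick numerical cross-check of a hand computation. A minimal sketch, assuming Python with scipy is available (any quadrature tool would do), applied to problem 3.5 (a), where the exact value is 4:

```python
# Numerical sanity check for problem 3.5 (a): the integral of 5 - x - y
# over the unit square [0, 1] x [0, 1] should evaluate to 4.
from scipy.integrate import dblquad

# dblquad integrates f(y, x) for x in [a, b] and y in [gfun(x), hfun(x)].
value, abserr = dblquad(lambda y, x: 5 - x - y, 0, 1, lambda x: 0, lambda x: 1)
print(value)  # approximately 4.0
```

scipy.integrate.tplquad plays the same role for the triple integrals that appear further on, once the region of integration has been described by explicit bounds.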
Problem 3.10 (solution 3.10, page 68) Determine the mass and the coordinates of the center of mass of a ball given by inequality x 2 + y 2 + z 2 ≤ 2 z if the density at every point is equal to the distance from the origin. Hint: Use spherical coordinates. Problem 3.11 (solution 3.11, page 69) A solid D ⊆ 3 R is bounded by parabolic cylinders z = 2− x 2 and z = y 2−2. Determine the volume and mass of this solid if the density is given by ρ( x, y, z) = y 2. Hint: Find the (orthogonal) projection of this solid onto the xy-plane, use cylindrical coordinates. 3.2 Local extrema of real multivariate functions Problem 3.12 (solution 3.12, page 69) Find and classify the stationary points of functions below. (a) f ( x, y) = x 3 − 4 x 2 + 2 xy − y 2 (b) g( x, y) = xex + 2 yey + 1 (c) h( x, y) = (1 + ey) cos x − yey (d) k( x, y, z) = x 3 + y 3 + 3 z 2 − 3 xyz (e) r( x, y, z) = x 2 + y 2 + z 2 − 2 xyz (f) u( x, y) = 4 + x 3 + y 3 − 3 xy (g) v( x, y) = 3 x 2 y + y 3 − 3 x 2 − 3 y 2 Problem 3.13 (solution 3.13, page 73) Given a , b ∈ n R let f (x) = (xTa)(xTb). (a) Evaluate ∂f and ∂ 2 f . ∂ x ∂ x2 (b) Additionally assume that a and b are nonzero and orthogonal. What is the type of any stationary point of f in this case? CHAPTER 3. FUNCTIONS OF SEVERAL VARIABLES 19 Problem 3.14 (solution 3.14, page 74) Find the vector x ∈ n R for which the sum of squared distances from given vectors a ∈ n 1 , . . . , a k R is the smallest possible. 3.3 Constrained extrema Problem 3.15 (solution 3.15, page 74) Find the points in the domain described by the inequality 4( x − 1)2 + y 2 ≤ 16, at which the largest and the smallest values of the function f ( x, y) = 2 x 2 + y 2 are attained. Problem 3.16 (solution 3.16, page 76) Let T be the triangle which is the intersection of the first octant and the plane given by x + y + z = 5. At which point on this triangle is the largest value of the function g( x, y, z) = xy 2 z 2 attained? Problem 3.17 (solution 3.17, page 77) Find all points on the ellipse given by x 2 − xy + y 2 = 3, which are farthest from the origin. Problem 3.18 (solution 3.18, page 78) Which points on the curve given implicitly by ( x 2 + y 2)2 = x 3 + y 3 are farthest from the origin? Problem 3.19 (solution 3.19, page 78) Find the largest and the smallest value of the function f ( x, y) = xy − y + x − 1 (a) on the disk given by x 2 + y 2 ≤ 2, (b) on the half-disk given by x 2 + y 2 ≤ 2 and x ≥ 0. Problem 3.20 (solution 3.20, page 80) CHAPTER 3. FUNCTIONS OF SEVERAL VARIABLES 20 An ellipsoid is given by the equation x 2 y 2 z 2 + + = 1. a 2 b 2 c 2 A box with edges parallel to x, y, and z axes is inscribed inside this ellipsoid. (a) What is the largest possible volume of the inscribed box? (b) What is the largest possible surface area of the inscribed box? If you get stuck with the general case, assume c = b, i.e., an ellipsoid of revolution. Problem 3.21 (solution 3.21, page 84) We are given an ℓ meters long thin rod. We cut it into 12 shorter rods, from which a frame of a box can be assembled. (a) How long should those shorter rods be if the box is to have the largest possible volume? (b) Same question as above with an additional restriction – the base rectan-gle should have area equal to A. Problem 3.22 (solution 3.22, page 86) We wish to assemble a frame of a triangular prism (equilateral base triangle) from an ℓ metres long thin rod. (a) What should be the length of the pieces we cut our rod into, for the prism to have the largest possible volume? 
(b) What should be the length of the pieces we cut our rod into, for the prism to have the largest possible surface area? Problem 3.23 (solution 3.23, page 87) Let a ∈ n R and let d ≥ 0 be a real number. (a) Find the largest and least value of the expression aTx for a vector x ∈ n R with prescribed length ∥x∥ = d. (b) Explain the solution. Problem 3.24 (solution 3.24, page 88) Assume A ∈ n× n R , and let d be a positive real number. (a) Find the largest and least value of f (x) = xT A x with constraint ∥x∥ = d. (b) Find the largest and least value of f (x) = ∥x∥2 with constraint xT A x = d 2 if A is symmetric positive definite. Problem 3.25 (solution 3.25, page 89) Let A ∈ n× n n R , b , p ∈ R , and d > 0 a real number. (a) Minimize f (x) = ∥x∥2 with respect to ∥x − p∥ ≤ d. (b) Minimize f (x) = ∥x∥2 with respect to A x = b. CHAPTER 3. FUNCTIONS OF SEVERAL VARIABLES 21 (c) Minimize f (x) = ∥x∥2 with respect to ∥x − p∥ ≤ d and A x = b. Chapter 4 Solutions Solution to problem 1.1, page 5: First we evaluate the characteristic polynomial of A. − λ 2 2 − λ 2 2 − λ p A( λ) = det( A − λI ) = 3 1 − λ −3 = 3 1 − λ 0 −1 −1 3 − λ −1 −1 2 − λ 1 − λ 3 0 1 − λ 3 = 3 1 − λ 0 = (2 − λ) 3 1 − λ −1 −1 2 − λ = (2 − λ)((1 − λ)2 − 9) = (2 − λ)(4 − λ)(−2 − λ). The eigenvalues are the roots of the characteristic polynomial (indexed in decreasing order) λ 1 = 4, λ 2 = 2, λ 3 = −2. To find the corresponding eigenvectors we determine the null space N ( A − λI) = {v ∈ 3 R : ( A − λI)v = 0} for each eigenvalue λ, which can be done using Gaussian elimination. • For N ( A − λ 1 I) we compute −4 2 2  1 1 1  1 1 1  1 0 0                   ∼   ∼   ∼    3 −3 −3 1 −1 −1 0 −2 −2 0 1 1 .                 −1 −1 −1 2 −1 −1 0 3 3  0 0 0 The homogenous system of equations for components of the vector [ x, y, z]T is thus simplified to x = 0, y + z = 0. We can take z to be the free variable and choosing z = 1 gives an eigenvector for λ 1 = 4:  0    v −  1 =    1 .      1  22 CHAPTER 4. SOLUTIONS 23 • For N ( A − λ 2 I) we have −2 2 2   1 −1 −1 1 −1 −1 1 0 −1                   ∼   ∼   ∼    3 −1 −3  3 −1 −3 0 2 0  0 1 0  .                 −1 −1 1  −1 −1 1  0 −2 0  0 0 0  The equations are x − z = 0 and y = 0. Choosing z = 1 gives 1   v   2 =   0 .     1 • For N ( A − λ 3 I) we have  2 2 2   1 1 1  1 1 1  1 1 0                   ∼   ∼   ∼    3 3 −3  1 1 −1 0 0 −2 0 0 1 .                 −1 −1 5  −1 −1 5  0 0 6  0 0 0 The equations are x + y = 0 , z = 0. This time y is the free variable and the choice y = 1 gives −1   v   3 =    1  .      0  Solution to problem 1.2, page 5: We know that it will be possible to find an orthogonal basis of 4 R consisting of eigenvectors of H because H is a symmetric matrix ( H T = H). Computing the characteristic polynomial for H we get 1 − λ 1 1 0 1 − λ 1 1 0 1 − λ 0 1 2 − λ 0 1 det( H − λI) = = 1 0 − λ −1 0 0 − λ −1 0 1 −1 1 − λ 1 − λ 1 −1 1 − λ 1 − λ 1 1 0 2 − λ 0 1 1 − λ 1 − λ −1 = · = 0 0 − λ −1 2 − λ −2 1 − λ 0 0 −2 1 − λ = (−(1 − λ) λ − 2)2 = (2 − λ)2(−1 − λ)2. In addition to the usual rules for computing determinants (addition of multiples of rows/columns to other rows/columns) we used the following rule for determinants of block upper–triangular matrices: " A C#! det = det( A) · det( B), 0 B where A, B, C, and 0 are matrices of appropriate dimensions. 
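Hand computations of this kind are easy to cross-check numerically. A minimal sketch, assuming numpy is available (np.linalg.eigh would additionally return an orthonormal set of eigenvectors):

```python
# Cross-check: the symmetric matrix H from problem 1.2 should have
# eigenvalues -1, -1, 2, 2.
import numpy as np

H = np.array([[1, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, -1],
              [0, 1, -1, 1]])
print(np.linalg.eigvalsh(H))  # [-1. -1.  2.  2.]
```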
We therefore obtain two distinct eigenvalues λ 1 , 2 = 2 and λ 3 , 4 = −1 which are both double roots of the characteristic polynomial. Since H is symmetric, we know that it has a two-dimensional eigenspace for each eigenvalue (for general matrices this is not necessarily the case). Let’s proceed with eigenspaces and their bases: CHAPTER 4. SOLUTIONS 24 • Eigenspace for λ 1 , 2 = 2 is the nullspace N ( H − 2 I). We have −1 1 1 0  1 0 −2 −1          1 −2 0 1  0 1 −1 −1 H − 2 I =       ∼ · · · ∼       .      1 0 −2 −1 0 0 0 0           0 1 −1 −1 0 0 0 0  Denoting a general eigenvector as v = [ x 1 ,x 2 ,x 3 ,x 4]T, the equations for the components of v are x − − 1 2 x 3 x 4 = 0 ∴ x 1 =2 x 3 + x 4, x − − 2 x 3 x 4 = 0 ∴ x 2 = x 3 + x 4. Hence an eigenvector corresponding to λ 1 , 2 = 2 looks like 2 x      3 + x 4 2 1              x  1 1 v =  3 + x 4              = x   + x   .   3   4    x  1 0  3             x      4 0 1 Setting v } 1 = [2 , 1 , 1 , 0]T and v2 = [1 , 1 , 0 , 1]T, the set BN ( H−2 I) = {v1 , v2 is a basis of N ( H − 2 I). Note, however, that v1 and v2 are not orthogonal, so we don’t have an orthonormal basis yet. We can obtain an orthonormal basis via Gram–Schmidt orthogonalization: 1   u 1   1 u 2   2 := v2 and q2 = = √     , ∥u ∥   2 3 0     1  1   1  uTv         1  0  u 1  0  u − 2   1   1 := v1 u2 =   √     and q1 = =   . uTu      1  ∥u ∥  1  2 2   1 3       −1 −1 (Starting with index 2 produces ‘smaller’ square roots, we could just as well start with index 1.) So, we have an orthonormal basis B′ = N ( H−2 I) {q } 1 , q2 for N ( H − 2 I ). • Eigenspace for λ 3 , 4 = −1 is the nullspace N ( H + I). We have 2 1 1 0  1 0 1 −1         1 1 0 1  0 1 −1 2  H + I =       ∼ · · · ∼       .     1 0 1 −1 0 0 0 0          0 1 −1 2  0 0 0 0  Using the same notation as above, we obtain  x −  −    4 x 3 1 1              x − 2 x   1  −2 v =  3 4             = x   + x   .   3   4    x   1   0   3             x      4 0 1 CHAPTER 4. SOLUTIONS 25 Again, v3 = [−1 , 1 , 1 , 0]T and v4 = [1 , −2 , 0 , 1]T constitue a basis of N ( H + I), but they are not orthogonal. Let’s repeat the Gram–Schmidt orthogonalization: −1   u 1    1  u 3   3 := v3 and q3 = = √     , ∥u ∥   3 3  1       0   0   0  uTv         4 −1 u 1 −1 u − 3   4   4 := v4 u3 =   √     and q4 = =   . uTu      1  ∥u ∥  1  3 3   4 3        1   1  Now, B′ = {q } is an orthonormal basis of N ( H + I). N ( H+ I) 3 , q4 Solution to problem 1.3, page 5: (a) By performing Gaussian elimination  0 1 · · · 1   1 0 · · · 0               1 0 · · · 0   0 1 · · · 1              A =  1 0 · · · 0   0 0 · · · 0    ∼            . . . .   . . . .   . . . .   . . . .   .   .   . . .   . . .           1 0 · · · 0   0 0 · · · 0  or by simply looking at the matrix A it is clear that the column space C( A) is spanned by the first two column vectors. We denote these two vectors by 0 1         1 0     u =          .  and v =  .       .   .   .   .          1 0 so we can write C( A) = L({u , v}) and conclude that the dimension of C( A) equals 2. 
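For a concrete sanity check of this rank computation, one can build the adjacency matrix of a small star graph and ask numpy for its rank (a sketch; the choice n = 6 is arbitrary):

```python
# The adjacency matrix of a star graph on n vertices has rank 2,
# hence nullity n - 2.
import numpy as np

n = 6
A = np.zeros((n, n))
A[0, 1:] = 1   # first row:    0 1 ... 1
A[1:, 0] = 1   # first column: the same entries, since A is symmetric
print(np.linalg.matrix_rank(A))  # 2
```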
This implies that dim( N ( A)) = n − 2, which can also be seen from the row-reduced form of A above. To determine a basis for the null space N ( A), i.e., the subspace of the solutions to the equation A w = 0, we need to find n − 2 linearly independent solutions to the system of equations x 1 = 0, x 2 + ... + xn = 0, where the variables are the components of the vector w = [ x 1 ,...,xn]T. This system of equations can be obtained directly from the result of Gaussian elimination above or by observing that A is a symmetric matrix: We know that C( A)⊥ = N ( A T) = N ( A), so we can conclude that w ∈ N ( A) is equivalent to the condition that w is orthogonal to both u and v. Expressing these conditions via equations, w · u = 0 and w · v = 0, CHAPTER 4. SOLUTIONS 26 and writing these equations component-wise produces the same equa- tions as above. The ‘free’ variables of this system are x 3 ,...,xn which means we can choose n − 2 linearly independent solutions by setting xk = 1, xi = 0 for i , k for each choice of k = 3 , . . . , n. The solutions can then be written as  0   0   0   0                  −1 −1 −1 −1                                  1   0   0   0          w         3 =   , w   , . . . , w   , w    0  4 =  1  n−1 =  .  n =  .       .   .               .   .   .   .               .   .       .   .   1   0                   0   0   0   1  and hence one choice of a basis for N ( A) is the set of vectors {w } 3 , . . . , w n . (b) The subspace N ( A) is by definition the eigenspace of A for eigenvalue λ = 0 (if N ( A) is non-trivial). We can therefore immediately conclude that we have an ( n − 2)-dimensional eigenspace for eigenvalue λ = 0 spanned by the vectors {w } 3 , . . . , w n defined above. Since A is a symmetric matrix, we know that the algebraic multiplicity of each eigenvalue is equal to the dimension of corresponding eigenspace. This means we have (algebraically) two non-zero eigenvalues yet to be determined. It is possible to obtain the remaining two eigenvalues as usual by computing the characteristic polynomial directly and then solving the corresponding systems of equations to obtain the eigenvectors. As an alternative we will use the fact that A is a symmetric matrix. Since we know that the eigenspaces of a symmetric matrix for distinct eigenvalues are mutually orthogonal, and that the orthogonal complement of the eigenspace for λ = 0 is N ( A)⊥ = C( A T) = C( A) = L({u , v}), we can deduce that it must be possible to express the remaining two eigenvectors as linear combinations of the column vectors u and v. It therefore makes sense to see how the matrix A acts on the vectors u and v. Direct computation produces the equations A u = ( n − 1)v, (1.1) A v = u. There are now at least two ways to proceed from here. 1. Since we know that any eigenvector w of A for any non-zero eigenvalue λ can be written as a linear combination w = x u + y v, we can write the eigenvector equation A w = λ w using (1.1) as follows A w = A( x u + y v) = xA u + yA v = x( n − 1)v + y u = λ( x u + y v). The last equality can be rewritten as ( λx − y)u − ( x( n − 1) − λy)v = 0. Since u and v are linearly independent this implies that both coef- ficients above must equal 0, which yields a system of equations λx − y = 0, x( n − 1) − λy = 0. CHAPTER 4. 
SOLUTIONS 27 By expressing y = λx from the first equation and plugging into the √ second equation we get λ 2 = n − 1 or λ 1 , 2 = ± n − 1. (Of course this system also has an obvious solution for x = y = 0, but this is not a valid solution in this case because eigenvectors must be non-zero.) To obtain the eigenvectors for λ 1 , 2 we have freedom to choose any non-zero value for (say) x (because know that any nonzero multiple of an eigenvector is also an eigenvector for the same eigenvalue). We choose x = 1 and hence y = λ for each eigenvalue λ 1 , 2 to get √  n − 1   √      1    w   1 = x u + y v = u + n − 1v =    .   .     .       1  and √ − n − 1   √      1    w   2 = u − n − 1v =    .   .     .       1  as the remaining eigenvectors. 2. Another way is to use a little theory of linear maps (which are the subject of subsequent chapters). According to (1.1), we can view multiplication by A (restricted to the 2-dimensional subspace C( A)) as a linear map from C( A) to C( A). Using (1.1) we can represent this map in the basis {u , v} with the 2 × 2 matrix, which we denote by B; " 0 1# B = . n − 1 0 Theory tells us that the eigenvalues of a linear map do not depend on the choice of basis of the underlying vector space, which means that the eigenvectors of B must be same as the restriction of A to C( A). The characteristic polynomial for B is det( B − λI) = λ 2 − ( n − 1) = 0 √ which yields λ 1 , 2 = ± n − 1. Then the bases of the eigenspaces N ( B − λ 1 , 2 I) (expressed with respect to the basis {u , v}) can be obtained as usual via Gaussian elimination: " 1 # " 1 # w √ √ 1 = , w . n − 1 2 = − n − 1 Since the components of these vectors represent coefficients with respect to the basis {u , v} of C( A), we can express them as √ √ w1 = u + n − 1v , w2 = u − n − 1v. CHAPTER 4. SOLUTIONS 28 Solution to problem 1.4, page 6: The idea: Determine all eigenvalues and eigenvectors of A. Since A is symmetric, it can be diagonalized in an orthonormal basis, i.e., , written as A = QDQ T, where D is a diagonal matrix (with eigenvalues on the diagonal) and Q is orthogonal (with properly chosen eigenvectors as columns). So, once we have all eigenvalues and eigenvectors, we’ll just multiply QDQ T to get A. The simplest way to determine the remaining eigenvalues of A (apart from the double eigenvalue 3) is to notice what happens when we add and subtract the equations A v1 = v2 and A v2 = v1. By adding them we get A(v1 + v2) = v1 + v2 and by subtracting them we get A(v − − 1 v2) = −(v1 v2). From these two equations we can directly read two eigenvectors 1     1 u   1 = v1 + v2 =     for eigenvalue λ   1 = 1 1     1 and −1      1  u −   2 = v1 v2 =     for eigenvalue λ   2 = −1.  1      −1 An alternative way to determine these two eigenvectors would be similar to that in Exercise 1.3 (b): we could write a 2 × 2 matrix that represents multiplication by A on the subspace spanned by {v } 1 , v2 and compute λ 1 , λ 2 , u1 , u2 using the usual procedure. We also notice that the computed eigenvectors u1 and u2 are orthogonal, which is a necessary condition for A to be a symmetric matrix. If the computed u1 and u2 were not orthogonal we could conclude that a symmetric matrix A with the required properties does not exist. Now that we have all four eigenvalues λ 1 = 1 ,λ 2 = −1 (together with their eigenvectors) and λ 3 , 4 = 3, we only need to determine the corresponding eigenspace for λ 3 , 4 = 3. 
The requirement that A must be a symmetric matrix implies that the eigenspace N ( A−3 I) must be 2-dimensional and it must be orthogonal to the eigenspaces for λ 1 and λ 2. This means it is enough to find a basis for the orthogonal complement to L({u } 1 , u2 ) which is also 2-dimensional. The condition x ∈ L({u } · · 1 , u2 )⊥ can be described by equations u1 x = 0 and u2 x = 0, i.e., , a system of equations x 1 + x 2 + x 3 + x 4 = 0, − x − 1 + x 2 + x 3 x 4 = 0 for the components x = [ x 1 ,x 2 ,x 3 ,x 4]T. CHAPTER 4. SOLUTIONS 29 Another way to obtain this system of equations is to define the matrix 1 −1     1 1  U = [u   1 , u2] =       1 1      1 −1 and note that L({u } 1 , u2 )⊥ = C( U )⊥ = N ( U T) In any case, we can perform Gaussian elimination on the matrix U T to obtain "1 0 0 1# U T ∼ 0 1 1 0 or on the system above to directly to obtain the conditions x 1 + x 4 = 0, x 2 + x 3 = 0. By choosing the appropriate values for the ‘free’ variables x 3 and x 4 we can choose the following vectors for a basis of the eigenspace N ( A − 3 I) = N ( U T):  0  −1         −1  0  u     3 =       and u   .   4 =    1   0           0   1  By finding all the eigenvalues and eigenvectors we effectively have a diagonalisation of the matrix A. It remains to define the matrices D = diag(1 , −1 , 3 , 3) and P = [u1 , u2 , u3 , u4] and compute A = P DP −1. If we are computing this product by hand, it may be preferable to choose an orthonormal basis of eigenvectors (in order to avoid computing the inverse P −1) instead of {u } 1 , u2 , u3 , u4 . Luckily, all these vectors are already orthogonal, so we only need to normalize them. We can therefore define an orthonormal basis consisting of eigenvectors by 1 1 1 1 q1 = u u √ u √ u 2 1 , q2 = 2 2 , q3 = 3 , q4 = 4. 2 2 Now define the matrix Q = [q1 , q2 , q3 , q4] and evaluate  3 1 1 −3   1    1 3 −3 1  A = QDQ T =       . 2    1 −3 3 1      −3 1 1 3  Solution to problem 1.5, page 6: CHAPTER 4. SOLUTIONS 30 (a) We will discuss the general case below. Let us first assume that A is diagonalizable, so that we have A = P DP −1 where D = diag( λ 1 ,...,λn) is a diagonal matrix that contains the eigenvalues of A along the diagonal. The powers Ak of the matrix A can then be expressed as Ak = P DP −1 P DP −1 · · · P DP −1 = P DkP −1 where Dk = diag( λk 1 ,...,λkn) contains the powers of the eigenvalues along the diagonal. We use this expression in the power series for eA. ∞ ∞ X 1 X 1 eA = Ak = P DkP −1 k! k! k=0 k=0  ∞  ∞ ∞ X 1  X 1 X 1 = P    Dk P −1 = P · diag( λk λk   n) · P −1   1 , . . . ,  k!  k! k! k=0 k=0 k=0 = P · diag( eλ 1 , . . . , eλn ) · P −1 In other words, the matrix eA can be diagonalized in the same basis as A, and its eigenvalues are simply eλ 1 , . . . , eλn . Using elementary properties of determinants, the determinant of eA can then be expressed as det( eA) = det( P · diag( eλ 1 , . . . , eλn) · P −1) = det( P ) · det( diag( eλ 1 , . . . , eλn )) · det( P −1) = det( diag( eλ 1 , . . . , eλn )) = eλ 1 · · · eλ n = eλ 1+···+ λn We know that tr( A) equals the sum of the eigenvalues of A which completes the proof for diagonalizable A. For the general case we can use the Schur decomposition A = U T U ∗ where T is an upper-triangular matrix and U is a unitary matrix ( U ∗ U = U U ∗ = I). We know that the upper–triangular matrix in the Schur decomposition of A also contains the eigenvalues of A along its diagonal. 
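(As an aside, the identity det(e^A) = e^tr(A) itself is easy to test numerically, also on a non-diagonalizable matrix; a sketch assuming scipy is available:)

```python
# Numerical check of det(e^A) = e^tr(A) on a non-diagonalizable matrix.
import numpy as np
from scipy.linalg import expm

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])   # a Jordan block
print(np.linalg.det(expm(A)), np.exp(np.trace(A)))  # both approx. 54.598
```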
The proof then follows the same pattern as in the diagonalizable case, with the difference being that we work with upper-triangular matrices instead of diagonal ones. The upper-triangular matrices that appear contain the same diagonal elements as the diagonal matrices above, and since the entries above the diagonal affect neither the determinant nor the trace, the result is the same in the end.

(b) Using the power series for e^A we can directly see that e^{A^T} = (e^A)^T. For matrices A and B that commute (meaning AB = BA) it is also easy to see that e^A · e^B = e^{A+B}. (This identity is not valid for matrices that do not commute!) To show that e^A is orthogonal, we need to prove that (e^A)^T e^A = I. For an antisymmetric matrix A we have
\[ (e^A)^T e^A = e^{A^T} \cdot e^A = e^{-A} \cdot e^A = e^{-A+A} = e^0 = I, \]
since A and −A commute (the zero above denotes the zero matrix), hence e^A is an orthogonal matrix. If A^T = −A, then all elements on the diagonal of A must be equal to zero, implying that tr(A) = 0. Using the identity from (a) we then have det(e^A) = e^{tr(A)} = e^0 = 1.

Solution to problem 1.6, page 6:

In principle, a Schur decomposition of a matrix A ∈ R^{n×n} can be computed by the following algorithm. We first find an eigenvalue λ_1 of A and a corresponding normalized eigenvector q_1, so we have A q_1 = λ_1 q_1 with q_1^T q_1 = 1. Then we find an orthonormal basis {q_2, ..., q_n} for the orthogonal complement of q_1. In other words, we form an orthogonal matrix
\[ Q_1 = \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_n \end{bmatrix}, \]
meaning Q_1^T Q_1 = I, or, equivalently, q_i^T q_i = 1 and q_i^T q_j = 0 for i ≠ j, i, j = 1, ..., n. Then we compute the matrix
\[ T_1 := Q_1^T A Q_1 = \begin{bmatrix} \mathbf{q}_1^T \\ \mathbf{q}_2^T \\ \vdots \\ \mathbf{q}_n^T \end{bmatrix} A \begin{bmatrix} \mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_n \end{bmatrix} = \begin{bmatrix} \mathbf{q}_1^T \\ \mathbf{q}_2^T \\ \vdots \\ \mathbf{q}_n^T \end{bmatrix} \begin{bmatrix} \lambda_1\mathbf{q}_1 & A\mathbf{q}_2 & \cdots & A\mathbf{q}_n \end{bmatrix} = \begin{bmatrix} \lambda_1\mathbf{q}_1^T\mathbf{q}_1 & \mathbf{q}_1^T A\mathbf{q}_2 & \cdots & \mathbf{q}_1^T A\mathbf{q}_n \\ \lambda_1\mathbf{q}_2^T\mathbf{q}_1 & \mathbf{q}_2^T A\mathbf{q}_2 & \cdots & \mathbf{q}_2^T A\mathbf{q}_n \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1\mathbf{q}_n^T\mathbf{q}_1 & \mathbf{q}_n^T A\mathbf{q}_2 & \cdots & \mathbf{q}_n^T A\mathbf{q}_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & \mathbf{b}^T \\ \mathbf{0} & A_2 \end{bmatrix}, \]
where the vector b and the matrix A_2 are simply the results of the computation. This essentially yields the first column of the upper-triangular matrix T in the Schur decomposition A = Q T Q^T. We can then repeat the procedure on the block A_2 of the matrix T_1 to obtain (n−1) × (n−1) matrices Q_2 and T_2, and so on. The end result is the upper-triangular matrix from the Schur decomposition together with a sequence Q_1, Q_2, ..., Q_{n−1} of orthogonal matrices of decreasing size. The orthogonal matrix Q from the Schur decomposition can then be computed as
\[ Q = Q_1 \begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & Q_2 \end{bmatrix} \cdots \begin{bmatrix} I_{n-2} & \mathbf{0} \\ \mathbf{0}^T & Q_{n-1} \end{bmatrix}. \]

It should be noted that we have considerable freedom during the computation of the Schur decomposition using the described algorithm, from the choice of the 'first' eigenvalue to the choice of an orthonormal basis for the orthogonal complement of the chosen eigenvector at each step of the algorithm. When computing by hand it is therefore worthwhile to use this freedom at each step with an eye towards keeping the subsequent computations as simple as possible.

We will find the Schur decomposition of the matrix B first, since it is the easier example. The characteristic polynomial is
\[ \det(B - \lambda I) = \begin{vmatrix} 2-\lambda & -1 & 0 \\ 0 & 1-\lambda & 0 \\ -\sqrt{2} & -\sqrt{2} & 2-\lambda \end{vmatrix} = (2-\lambda)\begin{vmatrix} 2-\lambda & -1 \\ 0 & 1-\lambda \end{vmatrix} = (2-\lambda)^2(1-\lambda). \]
We choose the 'first' eigenvalue to be λ = 2, since we easily notice that the corresponding eigenvector is simply q_1 = [0, 0, 1]^T.
Using Gaussian elimination we could also verify that q1 is also the only eigenvector for the double eigenvalue λ = 2 which means that B is not diagonalizable, but a Schur decomposition still exists. Clearly, we have many choices for the orthogonal matrix Q 1 which should contain q1 in the first column. A good choice is 0 1 0   Q   1 =   0 0 1 .     1 0 0 We now compute 0 0 1  2 −1 0 0 1 0       T       1 = Q T    0 1 0   1 BQ 1 = 1 0 0   0 0 1    √ √          0 1 0 − 2 − 2 2 1 0 0 √ √ 0 0 1 0 2 −1  2 − 2 − 2       =         0 0 1    1 0 0   = 0 2 −1  .    √ √          0 1 0 2 − 2 − 2 0 0 1  The matrix multiplication in this case is easy since Q 1 happens to be a permutation matrix: multiplication by Q 1 from the right just permutes column vectors and multiplication from the left permutes row vectors. We notice that the result T 1 already happens to be an upper-triangular matrix so no further steps are needed. The Schur decomposition of B is simply B = Q 1 T 1 Q T. 1 Of course, a less fortunate choice of Q 1 would require more computation. For instance, a sensible choice for Q 1 also seems to be 0 0 1   Q   1 =   0 1 0 .     1 0 0 This choice then results in √ √ 2 − 2 − 2 " #   λ T   1 bT 1 =   0 1 0  = ,     0 B 0 −1 2  2 CHAPTER 4. SOLUTIONS 33 where √ √ " 1 0# bT = [− 2 , − 2] and B 2 = . −1 2 As before, we can notice the eigenvalue λ = 2 of B 2 (we also know this must be an eigenvalue since the matrix Q T1 BQ 1 has the same eigenvalues as B) with eigenvector q2 = [0 , 1]T. The (almost) only choice for Q 2 is then "0 1# Q 2 = 1 0 and we get "2 −1# T 2 = Q T2 B 2 Q 2 = . 0 1 ( Q 2 is a permutation matrix that just swaps the first and second columns/rows of matrices.) The final result is √ √   " 2 − 2 − 2 2 bT Q #   T = 2 =   0 2 −1  0 T     2   0 0 1  for the upper–triangular matrix of the decomposition and   " 0 1 0 1 0T #   Q = Q   1 = 0 0 1 0 Q     2   1 0 0 for the orthogonal matrix, which is the same result as before. A somewhat more involved example is matrix A. The characteristic polynomial is 6 − λ −1 1 6 − λ −1 1 det( A − λI) = 4 3 − λ 1 = λ − 2 4 − λ 0 2 2 3 − λ 2 2 3 − λ λ − 2 4 − λ 6 − λ −1 = + (3 − λ) 2 2 λ − 2 4 − λ = (2( λ − 2) − 2(4 − λ)) + (3 − λ) ((6 − λ)(4 − λ) + ( λ − 2)) = −4(3 − λ) + (3 − λ)( λ 2 − 9 λ + 22) = (3 − λ)( λ 2 − 9 λ + 18) = (3 − λ)2(6 − λ). Let us find an eigenvector for λ = 6. Using Gaussian elimination we get 0 −1 1  2 2 −3 2 2 −3 2 0 −1         A − 6 I ∼           ∼   ∼   ∼   4 −3 1  0 −1 1  0 −1 1  0 −1 1                  2 2 −3 4 −3 1  0 −7 7  0 0 0  The equations for the components of an eigenvector v1 = [ x 1 ,x 2 ,x 3]T for λ = 6 therefore reduce to 2 x − 1 x 3 = 0, − x 2 + x 3 = 0. CHAPTER 4. SOLUTIONS 34 If we choose x 3 = 2 for the value of the ‘free’ variable we get the eigenvector v1 = [1 , 2 , 2]T. Similarly, we can find that the only eigenvectors for the eigenvalue λ = 3 are nonzero multiples of v2 = [−1 , 1 , 4]T. However, the eigenvector v ∥ 1 seems ‘nicer’ than v2. For one, ∥v1 = 3 is an integer which means we don’t need to deal with square roots when normaliz-ing. Also, it is possible to find two mutually orthogonal vectors to v1 simply by cleverly permuting and changing signs of its coordinates, which also ensures they all have the same length. 
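Whichever choices are made along the way, a hand-computed Schur decomposition is easy to validate numerically: Q should be orthogonal, T upper-triangular with the eigenvalues on its diagonal, and Q T Q^T should reproduce the original matrix. A sketch assuming scipy is available (note that scipy.linalg.schur may return a different, equally valid pair Q, T):

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[6.0, -1.0, 1.0],
              [4.0,  3.0, 1.0],
              [2.0,  2.0, 3.0]])
T, Q = schur(A)  # real Schur form: A = Q @ T @ Q.T
print(np.allclose(Q @ T @ Q.T, A))   # True
print(np.round(np.diag(T), 6))       # eigenvalues 6, 3, 3 in some order
```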
After some guesswork, a sensible choice for Q 1 seems to be 1 2 2  1   Q   1 =   2 1 −2 3     2 −2 1  The result for T 1 is 2 3 5 6 3 3     T     1 = Q T     1 AQ 1 = Q T 1 4 3 1 = 0 3 3         4 0 1 0 0 3 Since T 1 happens to be an upper-triangular matrix we can terminate the algorithm and write the Schur decomposition as A = Q 1 T 1 Q T. 1 With a little less luck with our choice for Q 1, for instance if we had chosen 1 2 2  1   Q   1 =   2 −2 1  3     2 1 −2 the result for T 1 would be 6 3 3   T   1 =   0 0 3     0 3 3 The algorithm would then require one more step. But even in this case we can notice that T 1 can be transformed into an upper-triangular matrix by the same permutation matrix that transposes the second and third columns and rows as in the example for the B matrix above. Solution to problem 1.7, page 6: p 1. By the definition ∥ A∥ F = tr( A T A) and assumption U T U = I we have ∥ U A∥2F = tr(( UA)T UA) = tr( A T U T UA) = tr( A T A) = ∥ A∥2F. 2. Here, we also need a basic property of the trace operation tr( AB) = tr( BA). ∥ AV ∥2F = tr(( AV )T AV ) = tr( V T( A T AV )) = tr(( A T AV ) V T) = tr( A T A) = ∥ A∥2F. 3. This can also be proved directly by definition, or simply by combining the previous two equalities ∥ U AV ∥F = ∥ U( AV )∥F = ∥ AV ∥F = ∥ A∥F. CHAPTER 4. SOLUTIONS 35 Solution to problem 1.8, page 7: 1. Define the function f ( x) := ∥ xA + B∥2 for F x ∈ R. Clearly, f ( x) ≥ 0 for all x ∈ R. Expanding according to the definitions and the properties of the inner product we have f ( x) = ∥ Ax + B∥2F = ⟨ xA + B, xA + B⟩F = ⟨ xA, xA⟩F + ⟨ xA,B⟩F + ⟨ B,xA⟩F + ⟨ B,B⟩F = ⟨ A, A⟩ · · F x 2 + 2⟨ A, B⟩F x + ⟨ B, B⟩F. This shows f ( x) is a quadratic function of x. A quadratic function f ( x) is non-negative for all x ∈ R if and only if its discriminant D = b 2 − 4 ac is non-positive, D ≤ 0. In our case the discriminant equals D = 4(⟨ A, B⟩ ⟨ ∥ F)2 − 4⟨ A, A⟩F B, B⟩F = 4(⟨ A, B⟩F)2 − 4∥ A∥2 F B∥2 F Since D ≤ 0 we have (⟨ A, B⟩ ∥ | ≤ ∥ ∥ F)2 ≤ ∥ A∥2 , hence |⟨ F B∥2 F A, B⟩F A∥F B∥F. We also mention that the Cauchy–Schwarz inequality holds not only for the Frobenius inner product but for general inner products on vector spaces. The only properties of the inner product needed to prove the Cauchy–Schwarz inequality were ⟨ A, A⟩ ≥ 0, ⟨ A, B⟩ = ⟨ B, A⟩ and ⟨ xA, B⟩ = x⟨ A, B⟩ for scalar values x. 2. In the proof we need the Cauchy–Schwarz inequality (notice that |⟨ A, B⟩ | ≤ F ∥ A∥ ∥ ≤ ∥ ∥ F B∥F implies ⟨ A, B⟩F A∥F B∥F) but is otherwise quite direct. ∥ A + B∥2F = ⟨ A + B,A + B⟩F = ⟨ A, A⟩F + 2⟨ A,B⟩F + ⟨ B,B⟩F ≤ ∥ A∥2 ∥ F + 2∥ A∥F B∥F + ∥ B∥2 F = (∥ A∥F + ∥ B∥F)2. Since ∥ A+ B∥F and ∥ A∥F+∥ B∥F are both non-negative numbers this implies the triangle inequality. Obviously, as is the case with the Cauchy–Schwarz inequality, the triangle inequality also holds in general vector spaces with inner products. 3. First, we use the Cauchy–Schwarz inequality to obtain the following inequality ∥ AB∥2F = ⟨ AB,AB⟩F = tr(( AB)T AB) = tr( B T A T AB) = tr( A T ABB T) = tr( A T A( B T B)T) = ⟨ A T A, B T B⟩F ≤ ∥ A T A∥ ∥ F B T B∥F In order to obtain ∥ AB∥ ≤ ∥ ∥ F A∥F B∥F it therefore suffices to prove the in- equality ∥ A T A∥ ≤ ∥ F A∥2F CHAPTER 4. SOLUTIONS 36 To see this, we need to use a few properties of the matrix B = A T A ∈ m× m R . Clearly, B is a symmetric matrix since B T = ( A T A)T = A T A = B. 
Also, for all x ∈ m R we have ⟨ B x , x⟩ = ⟨ A T A x , x⟩ = ⟨ A x , A x⟩ ≥ 0 where ⟨x , y⟩ := xTy denotes the usual Euclidean inner product. In other words, B is a positive semidefinite matrix, so all of its eigenvalues are non-negative. Let λ ≥ 1 , . . . , λm 0 denote the eigenvalues of B. On the one hand we have  m 2 m 2 X  X X ∥ A∥4   F = tr( A T A) = tr( B)2 =  λ  = λ 2 + λ  i  i λj   i   i=1 i=1 i, j since we know that the trace of a matrix equals the sum of its eigenvalues. On the other hand we have m X ∥ A T A∥2F = ∥ B∥2F = tr( B T B) = tr( B 2) = λ 2 i i=1 since B is a symmetric matrix and we know that if λ is an eigenvalue of B then λ 2 is an eigenvalue of B 2. Combining the last two identities and noting that P ≥ i 0, because , j λi λj all the eigenvalues of B are non-negative, we can write m X ∥ A T A∥2F = λ 2 i i=1 m X X ≤ λ 2 + i λiλj i=1 i, j = ∥ A∥4 F from which the desired inequality follows. 4. Using the properties of the Kronecker product we can verify the identity directly. ∥ A ⊗ B∥2F = tr(( A ⊗ B)T( A ⊗ B)) = tr(( A T ⊗ B T)( A ⊗ B)) = tr( A T A ⊗ B T B) = tr( A T A)tr( B T B) = ∥ A∥2∥ F B∥2 F. CHAPTER 4. SOLUTIONS 37 Solution to problem 1.9, page 7: We simply write the orthogonality condition ⟨ I, A⟩F = 0 for A ∈ I⊥ in terms of the coordinates of " x # A = 1 x 2 x 3 x 4 The equation is ⟨ I, A⟩F = tr( I T A) = tr( A) = x 1 + x 4 = 0 This gives us three ’free’ variables x 2, x 3 and x 4, which agrees with the fact that 2×2 R is a four-dimensional space which implies that the orthogonal complement of any non-zero element should be three-dimensional. By choosing the appropriate values for the free variables we define three linearly independent solutions "0 1# "0 0# 1 "−1 0# A 1 = , A , A √ 0 0 2 = 1 0 3 = 2 0 1 It is straight-forward to verify that all the matrices I, A 1, A 2, A 3 are mutually orthogonal with respect to the Frobenius product. Obviously we also have ∥ A ∥ ∥ ∥ 1 F = ∥ A 2 F = ∥ A 3 F = 1, so these three matrices form an ONB for I ⊥. Solution to problem 1.10, page 7: The Eckart–Young theorem states that the problem of the best rank k approximation of a rank n matrix (with regard to the Frobenius norm) can be found using the SVD matrix decomposition. The SVD of a matrix A ∈ n× m R is a matrix factorization A = U Σ V T where  σ  1 0 . . . 0     Σ =  0 σ  n× m  2 . . . 0 ∈   R    . .   .   .. . . ..  is a diagonal matrix containing the singular values σ ≥ · · · ≥ ≥ 1 σ min( n,m) 0 along the diagonal and h i h i U = u n× n m× m 1 , . . . , u ∈ ∈ n R and V = v1 ,..., v m R are orthogonal matrices. Another useful way of writing the SVD is the sum min( n,m) X A = σi u i vT. i i=1 Each term in this sum is a rank 1 matrix (every column in a matrix of the form σ uvT is clearly a multiple of u), these are the ‘singular’ matrices that give the ‘singular value decomposition’ its name. Note also that the Frobenius norm of a ‘singular’ matrix in the decomposition equals ∥ σ uvT∥2F = tr(( σ uvT)T σ uvT) = σ 2tr(vuTuvT) = σ 2tr(vvT) = σ 2tr(vTv) = σ 2 since the vectors u are v are normed, uTu = vTv = 1. Hence, the Frobenius norm of any matrix A can be expressed in terms of its singular values as ∥ A∥ F = CHAPTER 4. SOLUTIONS 38 q σ 2 + + · · · + . The Eckart–Young theorem states that in order to 1 σ 2 2 σ 2 min( n,m) find the best rank k approximation to a matrix A we simply pick out k of those rank 1 ‘singular’ matrices corresponding to k largest singular values. 
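Numerically, this recipe is one truncated SVD away. A minimal sketch, assuming numpy is available (shown on the matrix from part (b) below, whose best rank 1 approximation works out to the matrix with all entries equal to 2):

```python
import numpy as np

def best_rank_k(A, k):
    # Keep only the k largest singular values and rebuild the matrix.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)  # s is non-increasing
    s[k:] = 0.0
    return (U * s) @ Vt   # equals U @ np.diag(s) @ Vt

B = np.array([[1.0, 3.0],
              [3.0, 1.0]])
print(best_rank_k(B, 1))  # [[2. 2.] [2. 2.]]
```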
In other words, we form the diagonal matrix Σ_k which contains only the first k singular values (assuming σ_1 ≥ σ_2 ≥ ···) along the diagonal and compute M = U Σ_k V^T. Among all rank k matrices, M will then be the matrix that minimises the value ∥A − M∥_F.

A difficulty in applying the Eckart–Young theorem without numerical computation software is that computing the SVD of a general matrix by hand can be quite tedious. For the examples below we do not use any general algorithm for computing the SVD, because for these special matrices the decomposition can often be found by simpler means.

(a) For a diagonal matrix D we almost already have the SVD D = U Σ V^T with Σ = D and U = V = I, but not quite. We can write
\[ A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \end{bmatrix} = 2\,\mathbf{e}_1\mathbf{e}_1^T - 3\,\mathbf{e}_2\mathbf{e}_2^T + 1\,\mathbf{e}_3\mathbf{e}_3^T = 2\,\mathbf{e}_1\mathbf{e}_1^T + 3\,\mathbf{e}_2(-\mathbf{e}_2^T) + 1\,\mathbf{e}_3\mathbf{e}_3^T, \]
i.e., we need to change the signs of the negative diagonal elements and change the sign of the corresponding column of either U = I or V = I. The best rank 1 approximation to A contains the 'largest' term (by absolute value) in that sum,
\[ M = 3\,\mathbf{e}_2(-\mathbf{e}_2^T) = -3\,\mathbf{e}_2\mathbf{e}_2^T = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
So clearly, for a diagonal matrix D, we only need to pick the diagonal entry which is largest by absolute value. Similarly, the best rank 2 approximation contains the two largest diagonal entries by absolute value,
\[ M = 2\,\mathbf{e}_1\mathbf{e}_1^T - 3\,\mathbf{e}_2\mathbf{e}_2^T = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 0 \end{bmatrix}. \]
Of course, the decomposition D = IDI is not actually an SVD with positive and ordered singular values along the diagonal of D. But we could obtain one with permutation matrices (with an additional change of sign because of the −3) for U and V. For instance, we could also write
\[ \begin{bmatrix} 2 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} =: U \Sigma V^T \]
to obtain the standard SVD and then compute
\[ M = U \Sigma_1 V^T = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \]
which is the same result.

(b) Noticing that B is symmetric, we remember that a symmetric matrix can be diagonalised in an orthonormal basis, meaning that we have an eigenvalue decomposition B = Q D Q^T with Q an orthogonal matrix. Comparing such an eigenvalue decomposition with the SVD, we notice that we can almost write B = U Σ V^T with Σ = D and U = V = Q. Depending on the order of the eigenvalues in D we may need to permute the columns of U and V, and change the signs of those columns in either U or V which correspond to a negative eigenvalue, in order to obtain the standard SVD. But as seen in the previous example, this is not essential for the application of the Eckart–Young theorem. To find the eigenvalue decomposition of B we first compute the characteristic polynomial
\[ \det(B - \lambda I) = \begin{vmatrix} 1-\lambda & 3 \\ 3 & 1-\lambda \end{vmatrix} = (1-\lambda)^2 - 9 = (-2-\lambda)(4-\lambda). \]
In order to obtain the best rank 1 approximation we actually only need the largest eigenvalue (by absolute value), λ = 4, along with an appropriately normalised eigenvector. Gaussian elimination
\[ B - 4I = \begin{bmatrix} -3 & 3 \\ 3 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} \]
reduces the equations for the coordinates of the eigenvector v = [x_1, x_2]^T to x_1 − x_2 = 0.
A normalised solution is 1 "1# q = √2 1 and the best rank 1 approximation can be expressed by 1 "1# 1 " # h i 2 2 M = 4 · qqT = 4 · √ √ 1 1 = . 2 1 2 2 2 (c) Since C is a diagonal matrix we can immediately write at least two different best rank 1 approximations to C. "2 0# "0 0# M = or M = 0 0 0 2 because we have two equal ‘largest’ singular values σ 1 = σ 2 = 2. In fact, the are many more solutions, because in the case of two or more equal singular values there are also infinitely many valid singular value decompositions. Indeed, for any orthogonal matrix Q = U = V we can write a SVD for C as C = 2 I = Q(2 I) Q T For instance, if we choose Q to be a rotation matrix "cos( t) −sin( t)# Q = sin( t) cos( t) CHAPTER 4. SOLUTIONS 40 for some t ∈ R, we can even explicitly compute "2 0# "cos( t) −sin( t)#"2 0#" cos( t) sin( t)# M = Q Q T = 0 0 sin( t) cos( t) 0 0 − sin( t) cos( t) " cos2( t) cos( t) sin( t)# = 2 cos( t)sin( t) sin2( t) to get an infinite set of different best rank 1 approximations to C (which also includes the two previously mentioned solutions by choosing t = 0 and t = π ). One can also explicitly compute that we have ∥ C − M∥ 2 F = 2 for any choice of t in the expression for M above. Solution to problem 1.11, page 7: (a) Let’s recall, linear independence of the set {u } 1 , . . . , u q is the validity of the statement (for β ∈ j R): q X Whenever βj u j = 0, we have that β 1 = β 2 = ··· = βq = 0. j=1 As the reader will readily check, the implication we need to prove is now really just a direct consequence of the definition of the Kronecker product and linear independence of {u } 1 , . . . , u q . (b) Consider a trivial linear combination of vectors v ⊗ i u j, i.e., p q X X α ⊗ ij v i u j = 0. i=1 j=1 Notice that p q p  q  q  p  ! ! X X X  X  X  X  α ⊗    ⊗   ⊗  ij v i u j = v α  =  α u  .  i ij u j   ij v i j        i=1 j=1 i=1  j=1  j=1 i=1 (We really just notice that middle expression, we will not use it.) Now it follows from linear independence of u j’s and part (a) that for every j = 1 , . . . , q we have that p X αij v i = 0. i=1 As the set {v } 1 , . . . , v p is also linearly independent, we must have that αij = 0 for every i = 1 ,...,p and every j = 1 ,...,q, i.e., the set {v ⊗ i u j : i = 1 ,...,p ; j = 1 ,...,q} is linearly independent. CHAPTER 4. SOLUTIONS 41 Solution to problem 1.12, page 8: Assume we have eigenvalue/eigenvector pairs for A and B, A v = λ v and B u = µ u. By the definition of the Kronecker sum and properties of the Kronecker product we have ( A ⊕ B)(v ⊗ u) = ( A ⊗ I ⊗ n + Im B)(v ⊗ u) = ( A ⊗ I ⊗ n)(v ⊗ u) + ( Im B)(v ⊗ u) = A v ⊗ In u + Im v ⊗ B u = λ(v ⊗ u) + µ(v ⊗ u) = ( λ + µ)(v ⊗ u). This shows that the sum of eigenvalues λ + µ is an eigenvalue for A ⊕ B with eigenvector v ⊗ u which proves the claim. This means it is possible to compute the eigensystem of A ⊕ B without explicitly computing the Kronecker sum simply by computing the eigensystems of A and B separately and then computing all possible sums λ + µ of eigenvalues together with corresponding eigenvectors v ⊗ u. (As seen in problem 1.11, linearly independent sets of eigenvectors of A and B will give us corresponding linearly independent sets of eigenvectors of A ⊕ B.) For A the characteristic polynomial is −1 − λ 2 det( A − λI) = = (−1 − λ)(3 − λ). 0 3 − λ An eigenvector for λ 1 = −1 is v1 = [1 , 0]T and an eigenvector for λ 2 = 3 is v2 = [1 , 2]T. 
For B we have 1 − µ 0 det( B − µI) = = (1 − µ)(2 − µ) 2 2 − µ so the eigenvalues are µ 1 = 1 and µ 2 = 2 with eigenvectors u1 = [−1 , 2]T and u2 = [0 , 1]T. We can organize the pairs λ ⊗ i + µj , v i u j into a table. " # " # / 1 1 µ − j , u j λi , v i 1 , 3 , 0 2 −1 −1 "     −1#      2   2  1 , 0 ,       4 ,   2       −   0   2          0   4  0 0 "     0#     1 1 2 , 1 ,       5 ,   1         0 0         0 2 Solution to problem 1.13, page 8: (a) Since A is a symmetric matrix, we know it can be diagonalised with an orthogonal matrix U . The characteristic polynomial is 2 − λ 2 det( A− λI) = = (2− λ)(−1− λ)−4 = λ 2 − λ−6 = ( λ−3)( λ+2). 2 −1 − λ CHAPTER 4. SOLUTIONS 42 For λ 1 = 3 a normalized eigenvector is u1 = 1 √ [2 , 1]T and for λ 5 2 = −2 we have for example u2 = 1 √ [−1 , 2]T. The matrices D and U are therefore 5 "3 0 # 1 "2 −1# D = and U = √ . 0 −2 5 1 2 Note that since A is a symmetric matrix this diagonalization readily gives us the SVD which we can write as A = 3 · u − 1uT 1 2 · u2uT2 = 3 · u1uT1 + 2 · u2(−uT2). (b) We can directly verify that if we have U T U = I then we also have ( U ⊗ U )T( U ⊗ U ) = ( U T ⊗ U T)( U ⊗ U ) = U T U ⊗ U T U = I ⊗ I which shows this is an orthogonal matrix. Similarly, we can see that if we have a diagonalisation A = U DU T then ( U ⊗ U )( D ⊗ D)( U ⊗ U )T is a diagonalisation for A ⊗ A since ( U ⊗ U )( D ⊗ D)( U ⊗ U )T = ( U ⊗ U )( D ⊗ D)( U T ⊗ U T) = U DU T ⊗ U DU T = A ⊗ A. (c) For the best rank 1 approximation of A ⊗ A we need the largest singular value and corresponding singular vectors. Since A is symmetric, so is A ⊗ A, and its SVD can be obtained from the eigenvalue decomposition which was given in (b). We only need the largest (by absolute value) eigenvalue of A ⊗ A. We know that for matrices A and B the eigenvalues of A ⊗ B are the products λiµj (where λi and µj are the eigenvalues of A and B, respectively) and the eigenvectors are v ⊗ i u j (where v i and u j are eigenvectors of A and B, respectively). The largest eigenvalue of A ⊗ A is therefore λ · 1 λ 1 = 9 and the corresponding eigenvector is 4   1   2 q ⊗   1 = u1 u1 =     . 5   2     1 The best rank 1 approximation is therefore 16 8 8 4   9    8 4 4 2 M   1 = 9 · q1qT   1 =   . 25    8 4 4 2      4 2 2 1 To obtain the best rank 2 approximation to A ⊗ A we can add the second largest singular matrix in the SVD to M 1. The second largest eigenvalue (by absolute value) of A ⊗ A is λ 1 λ 2 = λ 2 λ 1 = −6. The eigenspace for the eigenvalue −6 is two-dimensional, so for the eigenvector we can take −2 −2     1      4  1 −1 q ⊗   ⊗   2 = u1 u2 =       or q u   5 −  3 = u2 1 =    1 5  4           2   2  CHAPTER 4. SOLUTIONS 43 or any normalised linear combination of q2 and q3, e.g. sin( t)q2+cos( t)q3 for any choice of t ∈ R. If we choose q2 we get  4 −8 2 −4 24 24 12 12     6     −8 16 −4 8  1 24 −12 12 −6 M − −     2 = M 1 6 · q2qT     2 = M 1   =   25      2 −4 1 −2 5 12 12 6 6          −4 8 −2 4  12 −6 6 −3 but as in the case of Exercise 1.10 (c) we could actually construct infinitely many rank 2 matrices M − 2 with minimal value of ∥ M 2 A ⊗ A∥F. Solution to problem 1.14, page 8: (a) Let’s start with the matrix A. 
Its characteristic polynomial is −2 − λ 2 det( A − λI) = = λ 2 + λ − 6 = ( λ + 3)( λ − 2), 2 1 − λ which gives λ 1 = −3 and λ 2 = 2 as eigenvalues of A. The corresponding eigenvectors are u′ = [−2 = [1 1 , 1]T and u′2 , 2]T, which both have norm √ 5. Since we’re aiming for an orthogonal transition matrix, we’ll pick 1 1 "−2 1# Q = [u1 , u2] = √ [u′ √ 1 , u′2] = . 5 5 1 2 The corresponding diagonal matrix is "−3 0# D = 0 2 and we have A = QDQ T. Note that B = A − I, hence B = Q( D − I) Q T and B has eigenvalues µ − − 1 = λ 1 1 = −4 and µ 2 = λ 2 1 = 1 (and same eigenvectors as A). (b) Write A u = λ u and B v = µ v. Then ( A ⊗ B + A ⊗ I + I ⊗ B)(u ⊗ v) = ( A ⊗ B)(u ⊗ v) + ( A ⊗ I)(u ⊗ v) + ( I ⊗ B)(u ⊗ v) = ( A u) ⊗ ( B v) + ( A u) ⊗ v + u ⊗ ( B u) = ( λ u) ⊗ ( µ v) + ( λ u) ⊗ v + u ⊗ ( µ u) = ( λµ + λ + µ)(u ⊗ v), hence u ⊗ v is an eigenvector of A ⊗ B + A ⊗ I + I ⊗ B corresponding to the eigenvalue λµ + λ + µ. For actual eigenvalues of matrices A and B we obtain λ 1 µ 1 + λ 1 + µ 1 = 5, λ 1 µ 2 + λ 1 + µ 2 = −5, λ 2 µ 1 + λ 2 + µ 1 = −10, λ 2 µ 2 + λ 2 + µ 2 = 5. CHAPTER 4. SOLUTIONS 44 (c) Note that the matrices A and B are symmetric, hence the matrices A ⊗ B, A ⊗ I and I ⊗ B are also symmetric, and so is their sum, i.e., the matrix A ⊗ B + A ⊗ I + I ⊗ B. Singular value decomposition of this matrix can therefore be completely determined from its eigendecomposition. In particular, the singular values of A ⊗ B + A ⊗ I + I ⊗ B are (in decreasing order) 10 , 5 , 5 , 5. The singular value 10 corresponds to the eigenvalue λ 2 µ 1 + λ 2 + µ 1 = −10 with corresponding eigenvector −2   1 "1# "−2# 1    1  u ⊗ ⊗   2 v1 = =     . 5 2 1 5 −   4      2  By the Eckart–Young theorem the rank 1 matrix M which is closest to A ⊗ B + A ⊗ I + I ⊗ B with respect to the Frobenius norm is −2   1 1    1  M = −10(u ⊗ ⊗ ·   2 v1)(u2 v1)T = −10 ·     [−2 , 1 , −4 , 2] 5 5 −   4      2   4 −2 8 −4   2   −2 1 −4 2  = −       . 5    8 −4 16 −8     −4 2 −8 4  Solution to problem 1.15, page 8: Let us check what happens if we multiply the matrix B = A⊗ A+ A 2 ⊗ I and the vector v⊗u where v and u are eigenvectors of A, A v = λ v and A u = µ u: B(v ⊗ u) = ( A ⊗ A + A 2 ⊗ I)(v ⊗ u) = A v ⊗ A u + A 2v ⊗ I u = λµ v ⊗ u + λ 2v ⊗ u = ( λµ + λ 2)v ⊗ u. This shows that we can obtain the eigensystem of B by computing the eigenvalues λi and eigenvectors v i of A and then compute λiλj + λ 2 to obtain all four i eigenvalues of B together with their eigenvectors v ⊗ i v j, i,j = 1 , 2. The characteristic polynomial of A is −1 − λ 3 det( A − λI) = = (1 + λ)2 − 9 = ( λ + 4)( λ − 2). 3 −1 − λ Eigenvectors for λ 1 = −4 and λ 2 = 2 are v1 = [−1 , 1]T and v2 = [1 , 1]T. Then the eigenvalues for B are µ 1 = λ 1( λ 1 + λ 1) = 32, µ 2 = λ 1( λ 2 + λ 1) = 8, µ 3 = λ 2( λ 1 + λ 2) = −4 and µ 4 = λ 2( λ 2 + λ 2) = 8 with eigenvectors  1  −1 −1 1                 −1 −1  1  1 u ⊗   ⊗   ⊗   ⊗   1 = v1 v1 =           , u v   , u v   , u v   . −  2 = v1 2 =   3 = v2 1 = −  4 = v2 2 =    1  1   1 1                  1   1   1  1 CHAPTER 4. SOLUTIONS 45 Solution to problem 1.16, page 9: (a) To prove this, we need the identity vec( ABC) = ( C T ⊗ A)vec( B) which holds for any matrices A, B and C for which the product ABC is defined. 
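This identity is easy to sanity-check numerically before it is applied. The following sketch is not part of the printed solution; it uses NumPy with random matrices of compatible sizes and relies on the column-stacking convention for vec (Fortran-order reshaping in NumPy terms).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 5))
C = rng.standard_normal((5, 2))

def vec(M):
    # vec stacks the columns of M on top of each other (column-major order).
    return M.reshape(-1, order="F")

lhs = vec(A @ B @ C)
rhs = np.kron(C.T, A) @ vec(B)
print(np.allclose(lhs, rhs))   # True: vec(ABC) = (C^T (x) A) vec(B)
```

Running the same check with C = I or A = I reproduces the two special cases vec(AXI) = (I ⊗ A)vec(X) and vec(IXB) = (B^T ⊗ I)vec(X) used in the next step.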
Applying the (linear) operator vec to the expression AX + XB we can write vec( AX + XB) = vec( AXI) + vec( IXB) = ( I ⊗ A)vec( X) + ( B T ⊗ I)vec( X) = ( B T ⊕ A)vec( X) which holds by the definition of the Kronecker sum. (b) We can answer this question quickly by considering the eigenvalues of the matrix B T ⊕ A. Since both B T and A are upper triangular matrices, we can read their eigenvalues off their diagonals. The eigenvalues of the Kronecker sum B T ⊕ A are all the possible sums of the eigenvalues of B T and A (see Exercise 1.12), i.e., 0 , 1 , 4 , 5. Because 0 is an eigenvalue, the matrix B T ⊕ A is singular and the homogeneous system ( B T ⊕ A)vec( X) = 0 has non-trivial solutions. By (a) AX + XB = 0 also has a nontrivial solution. (c) Here we actually compute the Kronecker sum B T ⊕ A to obtain the system matrix of our equation. 1 0 2 0 −1 2 0 0 0 2 2 0             0 1 0 2  0 3 0 0 0 4 0 2 B T ⊕ A = B T ⊗ I + I ⊗ A =               +   =   .       0 0 2 0  0 0 −1 2 0 0 1 2             0 0 0 2  0 0 0 3 0 0 0 5 By the way, we can confirm this matrix does indeed have the eigenvalues as claimed in (b). Gaussian elimination for the system ( B T ⊕ A)vec( X) = vec( C) then yields the reduced form of the system  0 2 2 0 −2   0 1 1 0 −1           0 4 0 2 2   0 2 0 1 1        ∼            0 0 1 2 1   0 0 1 2 1           0 0 0 5 5   0 0 0 1 1   0 1 1 0 −1   0 1 0 0 0           0 1 0 0 0   0 0 1 0 −1  ∼       ∼       .      0 0 1 0 −1   0 0 0 1 1           0 0 0 1 1   0 0 0 0 0  We therefore have the equations x 2 = 0, x 3 = −1 and x 4 = 1 for the entries of the unknown matrix " x # X = 1 x 3 x 2 x 4 with x 1 being the ‘free’ variable of the system. The general solution for our equation can then be written as " t −1# X = 0 1 where t ∈ R. CHAPTER 4. SOLUTIONS 46 Solution to problem 1.17, page 9: (a) A square matrix A is invertible if and only if N ( A) = {0}. This is true if and only if 0 is not an eigenvalue. Since by assumption all eigenvalues of A are not negative, this holds if and only if they are all strictly positive. (b) Let λ be an eigenvalue of A and v a corresponding eigenvector, A v = λ v. Since we are assuming the inverse A−1 exists, we can multiply this equation from the left by A−1 to get 1 A v = λ v ⇒ v = λA−1v ⇒ A−1v = v λ since we know λ , 0 from (a). This shows λ is an (non-zero) eigenvalue of A if and only if λ−1 is an eigenvalue of A−1. Clearly then all the eigenvalues λ 1 ,...,λn of A are positive if and only all the eigenvalues λ−1 1 , . . . , λ−1 n of A−1 are positive since inverting a number does not change its sign. (c) A PSD matrix A (i.e., a symmetric matrix with only nonnegative eigenvalues) can be diagonalised, A = QDQ T, with  λ  1     D =  .   .     .       λ  n where λ 1 ,...,λn are the eigenvalues and Q is an orthogonal matrix. Let us define the square-root of D by √  λ  √  1      D :=  .   .   .     √     λ  n √ and define the matrix S by S = Q DQ T. Then we can verify √ √ √ √ S 2 = Q DQ T Q DQ T = Q D DQ T = QDQ T = A. Since by assumption all the eigenvalues λ √ √ 1 , . . . , λn of A are nonnegative, all the eigenvalues λ 1 ,..., λn of S are also nonnegative, so S is also a positive semiedefinite matrix. Solution to problem 1.18, page 9: (a) This can be done by computing the eigenvalues and verifying that they are nonnegative, as we will do in (b). 
(b) We start by evaluating the characteristic polynomial 2 − λ 3 1 2 − λ 3 λ − 1 det( A − λI) = 3 6 − λ 3 = 3 6 − λ 0 1 3 2 − λ 1 3 1 − λ 2 − λ 3 λ − 1 3 6 − λ = 3 6 − λ 0 = ( λ − 1) 3 − λ 6 3 − λ 6 0 = (1 − λ)(18 − (3 − λ)(6 − λ)) = λ( λ − 1)(9 − λ). CHAPTER 4. SOLUTIONS 47 To obtain the eigenspace for λ 1 = 0 we compute 2 3 1 1 2 1 1 2 1  1 0 −1         A ∼           ∼   ∼   ∼   3 6 3 2 3 1 0 −1 −1 0 1 1  .                 1 3 2 1 3 2 0 1 1  0 0 0  We choose a normalised solution to the system, q1 = 1 √ [1 , −1 , 1]T, so that 3 we will have an orthonormal basis of eigenvectors. For λ 2 = 1 we have 1 3 1 1 3 1 1 0 1       A − I ∼         ∼   ∼   3 5 3 0 −4 0 0 1 0 .             1 3 1 0 0 0 0 0 0 We choose q2 = 1 √ [−1 , 0 , 1]T. For λ 2 3 = 9 we have −7 3 1   1 −1 1  1 −1 1  1 0 −1         A − 9 I ∼           ∼ −  ∼   ∼    3 −3 3   7 3 1  0 −4 8  0 1 −2 .                  1 3 −7  1 3 −7 0 4 −8 0 0 0  We choose q3 = 1 √ [1 , 2 , 1]T. 6 √ (c) As explained in Exercise 1.17 we define D = diag(0 , 1 , 3), Q = [q1 , q2 , q3] and compute       √ √ 1 0 −1 1 2 1 1 1 0 1   1     A = Q DQ T = q       2qT       2 +3q3qT 3 =  0 0 0 + 2 4 2 = 1 2 1 . 2         2     −1 0 1  1 2 1 0 1 1 Solution to problem 1.19, page 10: Following the algorithm, we denote " 2 # "8 2# a 11 = 1 , b = , B = . −1 2 6 Then we compute  1 0 0 " # " # " #   8 2 4 −2 4 4 L   − 1 =    2 1 0 and A = .   2 = −   2 6 2 1 4 5 −1 0 1 This completes the first step. In the next step we repeat the process for A 2. We denote h i h i a 11 = 4 , b = 4 , B = 5 and compute "2 0# L 2 = , A 2 1 3 = 5 − 4 = 1. √ The last step is to compute the square-root A 3 = 1 and to compute L 3 = 1. To get the final result it is actually not necessary to explicitly compute the product CHAPTER 4. SOLUTIONS 48 in (1.2). Namely, it can be verified that the same result can be obtained simply by nesting the first columns of the L 1, L 2 and L 3 matrices into one matrix  1 0 0   L =      2 2 0 .     −1 2 1 Solution to problem 1.20, page 10: Note that a matrix X is negative definite if and only if the matrix − X is positive definite. As the (1 , 1)-entry of − A is −1, the matrix − A is not positive definite by Sylvester’s criterion, so A is not negative definite. On the other hand, for the matrix  4 −6 2    − B =   −   6 10 −5      2 −5 14 we have 4 −6 4 > 0 , = 4 > 0 and det(− B) = 36 > 0. − 6 10 Hence, − B is positive definite by Sylvester’s criterion, which means that B is negative definite. Let’s find the Cholesky decomposition of Y = − B, which we’ll write in block form as  4 −6 2  " #   y Y =   −  11 zT  6 10 −5 = .     z Z  2 −5 14 For the first step we have √   " y # 2 0 0 11 0T   L −  1 = =  3 1 0 . √1 c I     y 2   11  1 0 1 Then 1 " 10 −5# 1 " 36 −12# " 1 −2# Y − 2 = Z − zzT = = , y − − − 11 5 14 4 12 4 2 13 so, for the second step, we get √  1 0 " 1 0# L   2 =    −2  = .  √  −  1 2 1 1 q √ The third and final step gives L 3 = 13 − 1 (−2)(−2) = 9 = 3. ‘Assembling’ L 1 1, L 2 and L 3 we obtain  2 0 0   L =    −   3 1 0 .      1 −2 3 We have the Cholesky decomposition LL T of Y = − B. CHAPTER 4. 
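The factor obtained by hand can be cross-checked against a library routine. The sketch below is not part of the printed solution (NumPy is an arbitrary but convenient choice); it recomputes the Cholesky factor of Y = −B from problem 1.20 and verifies LL^T = Y. Since the Cholesky factor with positive diagonal is unique, the output should match the matrix L assembled above.

```python
import numpy as np

# Y = -B from problem 1.20; Sylvester's criterion showed Y is positive definite.
Y = np.array([[ 4.0, -6.0,  2.0],
              [-6.0, 10.0, -5.0],
              [ 2.0, -5.0, 14.0]])

L = np.linalg.cholesky(Y)          # lower-triangular factor with Y = L @ L.T
print(L)                           # [[2, 0, 0], [-3, 1, 0], [1, -2, 3]]
print(np.allclose(L @ L.T, Y))     # True
```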
SOLUTIONS 49 Solution to problem 2.1, page 11: First, recall that any n × n matrix A can be uniquely expressed as the linear combination n X A = aijEij i,j=1 where aij is the ( i,j)-entry of A and Eij is the matrix which contains only zeros except for the ( i, j)-entry which equals 1. As the set { Eij ; i,j = 1 ,...,n} is also clearly linearly independent, it forms a basis of the vector space ( n× n R , + , ·) which we will call the standard basis. Since we have n 2 basis vectors, this demonstrates that ( n× n R , + , ·) is n 2 dimensional. (a) Denote this subset by U 1 and assume A,B ∈ U 1. Since n n n X X X αA + βB = α aijEij + β bijEij = ( αaij + βbij) Eij i,j=1 i,j=1 i,j=1 the (1 , 2)-entry of αA + βB is α · 0 + β · 0 = 0, i.e., αA + βB ∈ U 1. Hence, U 1 is a vector subspace of n× n R . For the basis of U n× n 1 we simply omit E 12 from the standard basis of R . Therefore, a basis of U 1 has one element less than the standard basis of n× n R , i.e., dim( U 1) = n 2 − 1. (b) For a subset of a vector space to be a vector subspace it must contain all scalar multiples of all its elements, including 0. The simplest argument why this subset is not a subspace is therefore because it does not contain the 0 matrix. Another argument is that this subset is not closed for addition since adding two matrices with 1 for their (1 , 2) entries produces a matrix with 2 for its (1 , 2) entry. (c) This subset contains the 0 matrix and is also closed for addition since adding matrices with integer entries results in matrices with integer entries. However, this subset is not closed for scalar multiplication. For instance multiplying any non-zero integer matrix with for instance π will produce a matrix that does not have all integer entries. (d) If we add two upper-triangular matrices, we get an upper-triangular matrix. Similarly, if we multiply an upper-triangular matrix with any scalar, the result will be an upper-triangular matrix. This subset, let us denote it with U 4, is therefore closed for addition and scalar multiplication and is clearly a subspace. To find a basis for U 4 and determine its dimension, we notice that any matrix A ∈ U 4 can be written as X A = aijEij i≤ j where the sum goes over all the pairs i, j = 1 , . . . , n with i ≤ j. So a basis for U 4 is the set { Eij : i,j = 1 ,...,n, i ≤ j}. This set contains 1 n( n + 1) 2 elements which is the dimension of U 4. (e) Adding symmetric matrices produces a symmetric matrix, multiplying a symmetric matrix with any number also results in a symmetric matrix, so the set of symmetric matrices is clearly a vector subspace. CHAPTER 4. SOLUTIONS 50 More formally, denote the subset of symmetric matrices by U 5 and assume A, B ∈ U 5, meaning A T = A and B T = B. We can then easily verify that any linear combination of A and B is also a symmetric matrix since ( αA + βB)T = αA T + βB T = αA + βB. To get an explicit basis for U 5 we can notice that any matrix A ∈ U 5 can be written as n X X A = aiiEii + aij( Eij + Eji). i=1 i 0 , = 108 > 0 , det( H − 2) = −648 < 0, 6 12 i.e., H 2 is indefinite and T 2 is a saddle point. CHAPTER 4. SOLUTIONS 72 (e) Note that the function r is not sensitive to a permutation of variables, so all three partial derivatives can be obtained from ∂r with a suitable ∂x permutation of variables. Let’s start with the stationary points: ∂r = 2 x − 2 yz = 0, ∂x ∂r = 2 y − 2 xz = 0, ∂y ∂r = 2 z − 2 xy = 0. 
∂x From the third equation, we have z = xy, therefore the first two equations become x − y · xy = x(1 − y 2) = 0, y − x · xy = y(1 − x 2) = 0. From the first equation we have that either x = 0 or 1 − y 2 = 0. In case x = 0, we have z = 0 and also y = 0 from the second original equation. The first stationary point is T 1(0 , 0 , 0). In case y = ±1 we get ±1·(1− x 2) = 0, so x = ±1. (N.b.: We have the two possibilities x = ±1 for each of the possibilities y = ±1.) Since z = xy, we get four additional stationary points T 2(−1 , −1 , 1) , T 3(−1 , 1 , −1) , T 4(1 , −1 , −1) , and T 5(1 , 1 , 1). The Hesse matrix is  2 −2 z −2 y   H −  r ( x, y, z) =    2 z 2 −2 x .     −2 y −2 x 2  Evaluating this at stationary points T 1 ,...,T 5 and denoting these matrices by H 1 ,...,H 5 we obtain 2 0 0  2 −2 2  2 2 −2       H   −    1 =       0 2 0 , H  2 2 2 , H  2 2 2  ,   2 =   3 =         0 0 2  2 2 2 −2 2 2  2 2 2   2 −2 −2     H   −  4 =     2 2 −2 , H  2 2 −2 .   5 =       2 −2 2  −2 −2 2  Now, H 1 is clearly positive definite, so T 1 is a local minimum. Note that H 2 ,..., H 5 are indefinite by the Sylvester’s criterion, i.e., they all have the (1 , 1)-entry positive, while det( H 2) = ··· = det( H 5) = −32. (In fact, the matrices H 2 ,...,H 5 all have eigenvalues −2 , 4 , 4.) Hence, T 2 ,...T 5 are saddle points. (f) Stationary points: ∂u = 3 x 2 − 3 y = 0, ∂x ∂u = 3 y 2 − 3 x = 0. ∂y CHAPTER 4. SOLUTIONS 73 So, from the first equation, y = x 2, hence x 4 − x = x( x 3 − 1) = 0 from the second equation. This gives us x 1 = 0 and x 2 = 1, hence y 1 = 0 and y 2 = 1. We have two stationary points, T 1(0 , 0) and T 2(1 , 1). The Hesse matrix is "6 x −3# Hu( x,y) = , −3 6 y and, evaluating at stationary points we get " 0 −3# " 6 −3# H 1 := Hu(0 , 0) = and H . −3 0 2 := Hu (1 , 1) = −3 6 A quick application of Sylvester’s criterion reveals that H 1 is indefinite, while H 2 is positive definite. Therefore, T 1 is a saddle point, and T 2 is a local minimum. (g) Stationary points: ∂v = 6 xy − 6 x = 0, ∂x ∂v = 3 x 2 + 3 y 2 − 6 y = 0. ∂y First equation is equivalent to 6 x( y − 1) = 0 so we have either x = 0 or y = 1. • Plugging x = 0 into the second equation we get 3 y 2 − 6 y = 0, hence y = 0 or y = 2. • Plugging y = 1 into the second equation we get 3 x 2 − 3 = 0, hence x = −1 or x = 1. All in all, we have four stationary points: T 1(0 , 0), T 2(0 , 2), T 3(−1 , 1), and T 4(1 , 1). The Hesse matrix of v is "6 y − 6 6 x # Hv( x,y) = , 6 x 6 y − 6 which, evaluated at T 1 ,...,T 4 and denoted by H 1 ,...,H 4 become "−6 0 # "6 0# " 0 −6# "0 6# H 1 = , H , H , and H . 0 −6 2 = 0 6 3 = −6 0 4 = 6 0 Now, H 1 is clearly negative definite, so T 1 is a local maximum. Also clear is positive definiteness of H 2, so T 2 is a local minimum. The matrices H 3 and H 4 are not definite, so T 3 and T 4 are saddle points. Solution to problem 3.13, page 18: Given a , b ∈ n R let f (x) = (xTa)(xTb). (a) The formula for f (x) can be rewritten as f (x) = (xTa)(bTx) = xT(abT) x. Therefore, from the formula ∂(xT A x) = xT( A + A T), we get ∂ x ∂f = xT(abT + baT), ∂ x CHAPTER 4. SOLUTIONS 74 and, from the formula ∂A x = A, we also get ∂ x ∂ 2 f ∂ ∂f !T = = (abT + baT). ∂ x2 ∂ x ∂ x (b) One stationary point of f is 0, i.e., one of the solutions of ∂f (x) = 0. (All ∂ x stationary points of f are determined by the nullspace N (abT + baT).) Since a and b are orthogonal, i.e., aTb = bTa = 0, we have (abT + baT)a = ∥a∥2b, (abT + baT)b = ∥b∥2a. 
Let U be the vector subspace of n R spanned by a and b. From the two equalities above, we see that, restricted to U with the vector space basis {a , b}, the Hesse matrix of f is represented by " 0 ∥a∥2# , ∥b∥2 0 which has eigenvalues λ 1 , 2 = ±∥a∥∥b∥. Hence, ∂ 2 f is indefinite, since it ∂ x2 has a positive and a negative eigenvalue, so 0 is a saddle point. (Note that, since the Hesse matrix ∂ 2 f is constant, the whole nullspace N (abT +baT) ∂ x2 consists of saddle points.) Solution to problem 3.14, page 19: We are trying to minimize the function f (x) = ∥x − a ∥2 ∥2 ∥2 1 + ∥x − a2 + ··· + ∥x − a k . Note that ∥x − a ∥2 i = (x − a i)T(x − a i), so k ∂f X T = 2(x − a k x − a . ∂ x 1)T + 2(x − a2)T + · · · + 2(x − a k )T = 2 i i=1 The stationary point of f (the solution to ∂f (x) = 0) is therefore ∂ x 1 x = (a k 1 + a2 + · · · + a k ) . Note that the Hesse matrix of f is ∂ 2 f = 2 kI (with n-fold eigenvalue 2 k), which ∂ x2 is positive definite, so our stationary point is in fact a (local) minimum. Solution to problem 3.15, page 19: The problem is to find the extreme values of the function f ( x, y) = 2 x 2 + y 2 constrained to the (closed and bounded) domain given by 4( x − 1)2 + y 2 ≤ 16. We split this into two subtasks. • Find the candidates for extreme values of f in the interior of the domain, i.e., subject to strict inequality 4( x − 1)2 + y 2 < 16. Extreme values of f over the entire domain that are attained at interior points (and not at the boundary points) must be local extrema of f . CHAPTER 4. SOLUTIONS 75 • Find the candidates for extreme values of f on the boundary of the domain, i.e., subject to equality 4( x−1)2 + y 2 = 16. These will be determined using the method of Lagrange multipliers. Let’s start with the interior. Stationary points of f are solutions of the system ∂f = 4 x = 0, ∂x ∂f = 2 y = 0. ∂y Clearly, x 1 = 0 ,y 1 = 0 is the only solution of this system, i.e., T 1(0 , 0) is the only stationary point of f . This point is contained in the interior of our domain (since 4(0 − 1)2 + 02 = 4 < 16 holds). Now the boundary. Let’s rewrite the equation 4( x − 1)2 + y 2 = 16 as 4( x − 1)2 + y 2 − 16 = 0, | {z } g( x,y) i.e., we have rewritten the constraint as g( x, y) = 0. The corresponding Lagrange function is L( x, y, λ) = f ( x, y) − λg( x, y) = 2 x 2 + y 2 − λ(4( x − 1)2 + y 2 − 16). Candidates for extrema on the boundary are the stationary points of this Lagrange function. (Strictly speaking, we only need the x and y coordinates of these stationary points.) Stationary points of L are solutions of the system ∂L = 4 x − 8 λ( x − 1) = 0, ∂x ∂L = 2 y − 2 λy = 0, ∂y ∂L = −(4( x − 1)2 + y 2 − 16) = 0. ∂λ (The last equation is, of course, equivalent to our constraint.) From the second equation we get 2 y(1 − λ) = 0, which implies y = 0 or λ = 1. • The case y 2 , 3 = 0 can be plugged directly into the third equation, and we get ( x − 1)2 = 4 or x 2 = −1 and x 3 = 3. • In case λ = 1 we get −4 x = −8 from the first equation, or x 4 , √ 5 = 2. Plugging this into the third equation we have y 2 = 12 or y 4 , 5 = ±2 3. Summarizing, we have the following five points, which are candidates for global extrema of f on our domain: √ √ T ( x, y) T 1(0 , 0) T 2(−1 , 0) T 3(3 , 0) T 4(2 , −2 3) T 5(2 , 2 3) . f ( x, y) 0 2 18 20 20 The bottom row contains the values of f evaluated at corresponding stationary points. Clearly, the smallest value, 0, is attained at T √ 1(0 , 0), while the largest √ value, 20, is attained at two points, T 4(2 , −2 3) and T 5(2 , 2 3). CHAPTER 4. 
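Candidates found with Lagrange multipliers are easy to double-check by sampling. The sketch below is not part of the printed solution (NumPy chosen for convenience); it parameterizes the boundary ellipse 4(x − 1)^2 + y^2 = 16 by x = 1 + 2 cos t, y = 4 sin t, evaluates f there, and compares with the interior stationary point T1(0, 0).

```python
import numpy as np

f = lambda x, y: 2 * x**2 + y**2

# Boundary 4(x - 1)^2 + y^2 = 16, parameterized by the angle t.
t = np.linspace(0.0, 2.0 * np.pi, 100001)
x = 1.0 + 2.0 * np.cos(t)
y = 4.0 * np.sin(t)
vals = f(x, y)

i = np.argmax(vals)
print(vals[i], (x[i], y[i]))   # ~20, attained near (2, +/-2*sqrt(3)) ~ (2, +/-3.464)
print(vals.min())              # ~2, the boundary minimum near (-1, 0)
print(f(0.0, 0.0))             # 0, the value at the interior stationary point T1
```

The sampled values agree with the table above: the smallest value 0 at T1 and the largest value 20 at T4 and T5.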
SOLUTIONS 76 Solution to problem 3.16, page 19: The first octant is defined by inequalities x ≥ 0, y ≥ 0, and z ≥ 0. So, along with x + y + z = 5, we have four constraints. We’ll split the task into following subtasks: • Find candidates for extrema in the interior of the triangle T . That means extrema of g with respect to one constraint, namely x+ y+ z = 5. Additionally, only candidates with x > 0, y > 0, and z > 0 should be considered. • Find candidates for extrema on the edges of the triangle T . That means extrema of g with respect to two constraints, x + y + z = 5 and one of the planes x = 0, y = 0, or z = 0. Let’s start with the interior of T . The Lagrange function is L( x, y, z, λ) = xy 2 z 2 − λ( x + y + z − 5). Its stationary points are solutions of ∂L = y 2 z 2 − λ = 0, ∂x ∂L = 2 xyz 2 − λ = 0, ∂y ∂L = 2 xy 2 z − λ = 0, ∂z ∂L = −( x + y + z − 5) = 0. ∂λ From the first three equations we have y 2 z 2 = 2 xyz 2 = 2 xy 2 z. Since we only need to consider candidates with x > 0, y > 0, and z > 0, we can safely ignore the solutions with any of the x, y, or z equal to 0. Hence: y 2 z 2 = 2 xyz 2 ∴ y = 2 x and 2 xyz 2 = 2 xy 2 z ∴ z = y ∴ z = 2 x. Plugging this into the equation of the constraint, we get 5 x − 5 = 0 or x = 1. So, T 1(1 , 2 , 2) is our first candidate. We could define Lagrange functions with two Lagrange multipliers for each of the edges of the triagle, but, due to the simplicity of the additional constraints, we don’t have to. Simply plugging x = 0, y = 0, or z = 0 into g will simplify our task. A lot! • If x = 0, then g(0 , y, z) = 0. • If y = 0, then g( x, 0 , z) = 0. • If z = 0, then g( x, y, 0) = 0. That means that g is constant (and 0) along all edges of the triangle T . At T 1(1 , 2 , 2), however, the value of g is g(1 , 2 , 2) = 16. So, the largest value of g on T is 16. CHAPTER 4. SOLUTIONS 77 Solution to problem 3.17, page 19: The equation x 2 − xy + y 2 = 3 represents the constraint in this task. The expression we’re trying to maximize is ∥x∥ = p x 2 + y 2. Dealing with square roots can be cumbersome, and we’ll see that it’s preferable to maximize f ( x, y) = x 2 + y 2 instead. (N.b.: If v is a multivariate function, then every stationary point of v is also a stationary point of v 2. Why?) So we rewrite the constraint x 2 − xy + y 2 − 3 = 0 and form the corresponding Lagrange function L( x, y, λ) = x 2 + y 2 − λ( x 2 − xy + y 2 − 3). Its stationary points are solutions of ∂L = 2 x − λ(2 x − y) = 0, ∂x ∂L = 2 y − λ(2 y − x) = 0, ∂y ∂L = −( x 2 − xy + y 2 − 3) = 0. ∂λ This system is a bit trickier. We can assume λ , 0 (since λ = 0 would imply x = 0 and y = 0 from first two equations) and rewrite the first two equations as 1 2 x − y 1 2 y − x = and = . λ 2 x λ 2 y Therefore 2 x − y 2 y − x = ∴ 2 xy − 2 y 2 = 2 xy − 2 x 2 ∴ x 2 = y 2 or y = ± x. 2 x 2 y Now we plug both cases of y = ± x into the constraint (or the third equation). • In case y = − x we have 3 x 2 − 3 = 0 or x 1 , 2 = ±1. √ • In case y = x we have x 2 − 3 = 0 or x 3 , 4 = ± 3. Let’s summarize with a table √ √ √ √ T ( x, x) T 1(−1 , 1) T 2(1 , −1) T 3(− 3 , − 3) T 4( 3 , 3) . f ( x, y) 2 2 6 6 The largest value in the bottom row is 6, so the points T √ 3 and T 4 are farthest away from the origin (at a distance 6). CHAPTER 4. SOLUTIONS 78 Solution to problem 3.18, page 19: As usual we prefer to maximize the square of the distance from the origin rather than the distance itself. The Lagrange function is L( x, y, λ) = x 2 + y 2 − λ ( x 2 + y 2)2 − x 3 − y 3 . 
The system for the stationary points of L is: ∂L = 2 x − λ 4 x( x 2 + y 2) − 3 x 2 = 0, ∂x ∂L = 2 y − λ 4 y( x 2 + y 2) − 3 y 2 = 0, ∂y ∂L = − ( x 2 + y 2)2 − x 3 − y 3 = 0. ∂λ One obvious solution is ( x, y) = (0 , 0). We can discard this solution (since this is the origin itself) by crossing out one of the x and y factors in the first two equations and then express the λ variable in two ways 2 = 4( x 2 + y 2)−3 x, λ 2 = 4( x 2 + y 2)−3 y. λ By identifying these two equations we get x = y (all this assuming neither x nor y equals 0), and then the constraint gives us the equation 4 x 4 = 2 x 3 with x = 1 as the only non-zero solution. 2 However, it is possible that only one of the variables equals zero and we still get a valid solution. Assume x = 0 and y , 0. The constraint then reduces to the equation y 4 = y 3 which yields y = 1. We can then directly verify that ( x, y) = (0 , 1) (with λ = 2) solves our system and is therefore a stationary point. Similarly, assuming x , 0 and y = 0 leads to stationary point ( x, y) = (1 , 0). Out of the three non-trivial stationary points ( 1 , 1 ), (0 , 1) and (1 , 0) we see 2 2 that the latter two attain the maximal distance from the origin. Solution to problem 3.19, page 19: (a) The extremal values of f can be situated either in the (strict) interior of the disk or on its boundary and we consider both possibilities separately. First, any extremal point in the interior of the disc must be a stationary point for f . The system for the stationary points is ∂f = y +1 = 0, ∂x ∂f = x −1 = 0, ∂y CHAPTER 4. SOLUTIONS 79 with solution ( x, y) = (1 , −1). We can verify that this point lies in the interior of our disk but if we compute the Hesse matrix "0 1# Hf ( x,y) = 1 0 we notice this is a saddle point not an extremal point (the eigenvalues λ 1 , 2 = ±1 are of opposite signs). So the extremal points must be on the boundary of the disk, on the circle x 2 + y 2 = 2. The relevant Lagrange function for the problem is L( x, y, λ) = xy − y + x − 1 − λ( x 2 + y 2 − 2) and the system for the stationary points is ∂L = y + 1− 2 λx = 0, ∂x ∂L = x − 1 − 2 λy = 0, ∂y ∂L = −( x 2 + y 2 − 2) = 0. ∂λ From the first two equations we express 2 λ y + 1 x − 1 2 λ = and 2 λ = . x y This leads to y + 1 x − 1 = ∴ y 2 + y = x 2 − x. x y By adding x 2 to both sides we get the identity x 2 + y 2 + y = 2 x 2 − x and considering the constraint x 2 + y 2 = 2 this gives the expression y = 2 x 2 − x − 2. By plugging this expression for y back into the constraint equation we get an equation for x. x 2 + (2 x 2 − x − 2)2 = 2 ∴ 2 x 4 − 2 x 3 − 3 x 2 + 2 x + 1 = 0. This is a quartic equation which in general has quite complicated solutions. Fortunately, we can guess two integer roots x 1 = 1 and x 2 = −1. This means the polynomial is divisible by the factor x 2 − 1 and computing the polynomial quotient gives 2 x 4 − 2 x 3 − 3 x 2 + 2 x + 1 = 2 x 2 −2 x −1, x 2 − 1 which is the factor from which we can compute the remaining two roots √ 1 ± 3 x 3 , 4 = . 2 We summarize all four stationary points along with the function values in a table CHAPTER 4. SOLUTIONS 80 √ √ √ √ T ( x, y) T 3 3 3 3 1(−1 , 1) T 2(1 , −1) T 3( 1+ , −1+ ) T , −1− ) 2 2 4( 1−2 2 . f ( x, y) −4 0 1 1 2 2 The minimum value of f is therefore attained at T 1, while the maximal value is attained at both, T 3 and T 4. (b) Comparing to Exercise 3.19 (a), it is clear that any extremal points computed in (a) that happen to lie in the half-plane x ≥ 0 are also extremal points for f on the half disk. 
We notice that T 3 does indeed lie in the half-plane x ≥ 0 which means it is where the maximum of f on our half-disk. The point T 1 however does not lie in the half-disk. Since the minimum is not attained in the interior of the half disk nor the edge x 2 + y 2 = 2, it must be on the edge with x = 0. The Lagrangian function for f with the constraint x = 0 is L( x, y, λ) = xy − y + x − 1 − λx The system for the stationary points is ∂L = y + 1− λ = 0, ∂x ∂L = x − 1 = 0, ∂y ∂L = − x = 0 ∂λ This is a contradictory system which means f has no stationary points on the (whole) line x = 0 (this is also clear directly since f (0 , y) = − y). The last possibility is that the minimum is attained on the edge of both √ the circle x 2 + y 2 = 2 and line x = 0, i.e., the ‘corner’ points T √ √ 5(0 , − 2) √ and T 6(0 , 2). The values of f at these points are 2 − 1 and − 2 − 1 respectively, which means the minimum value of f on our half-disk is achieved at the T 6 ‘corner’ point of the half-disk. Solution to problem 3.20, page 19: In this task, the equation of the ellipsoid, rewritten as x 2 y 2 z 2 + + − 1 = 0, a 2 b 2 c 2 represents the constraint. An inscribed box with vertices on this ellipsoid has edges of length 2 x, 2 y, and 2 z. (a) The inscribed box has volume V ( x, y, z) = 2 x · 2 y · 2 z = 8 xyz and this is the function we’d like to maximize with respect to the constraint above. The corresponding Lagrange function is x 2 y 2 z 2 ! L( x, y, z, λ) = 8 xyz − λ + + − 1 . a 2 b 2 c 2 CHAPTER 4. SOLUTIONS 81 The system that determines the stationary points of L is: ∂L 2 λx = 8 yz − = 0, ∂x a 2 ∂L 2 λy = 8 xz − = 0, ∂y b 2 ∂L 2 λz = 8 xy − = 0, ∂z c 2 ∂L x 2 y 2 z 2 ! = − + + − 1 = 0. ∂λ a 2 b 2 c 2 Multiplying the first three equations with yz, xz, and xy respectively, and then rearranging we obtain 4 a 2 y 2 z 2 = λxyz, 4 b 2 x 2 z 2 = λxyz, 4 c 2 x 2 y 2 = λxyz, and, ignoring cases with x = 0, y = 0, or z = 0 (Why?), we deduce x 2 y 2 z 2 = = . a 2 b 2 c 2 Pluging this into the constraint we get x 2 y 2 z 2 3 · = 1 , 3 · = 1 , and 3 · = 1, a 2 b 2 c 2 i.e., x = ± a √ , y = ± b √ , and z = ± c √ . Ignoring the signs (and solutions 3 3 3 with any of the edge lengths equal to 0), we deduce that a b c 8 abc V √ , √ , √ = √ 3 3 3 3 3 is the largest possible volume of the inscribed rectangular box. (b) We just need to replace the function to maximize, in this subtask it is S( x, y, z) = 2 x · 2 y + 2 x · 2 z + 2 y · 2 z = 4( xy + xz + yz). As we will see, we are now presented with a slightly more formidable problem. Anyway, let’s see how long we can consider this in full generality. The Lagrange function becomes x 2 y 2 z 2 ! L( x, y, z, λ) = 4( xy + xz + yz) − λ + + − 1 . a 2 b 2 c 2 CHAPTER 4. SOLUTIONS 82 Its stationary points are solutions of ∂L 2 λx = 4( y + z) − = 0, ∂x a 2 ∂L 2 λy = 4( x + z) − = 0, ∂y b 2 ∂L 2 λz = 4( x + y) − = 0, ∂z c 2 ∂L x 2 y 2 z 2 ! = − + + − 1 = 0. ∂λ a 2 b 2 c 2 Now, we multiply the first three equations with a 2 , b 2 , and c 2 , respec-2 2 2 tively, to obtain 2 a 2( y + z) = λx, 2 b 2( x + z) = λy, 2 c 2( x + y) = λz. This is now an eigenvalue problem. Namely, setting  0 2 a 2 2 a 2  x     A =         2 b 2 0 2 b 2 and x =  y ,         2 c 2 2 c 2 0   z the above system becomes A x = λ x. The characteristic polynomial of A is det( A − λI) = − λ 3 + 4( a 2 b 2 + a 2 c 2 + b 2 c 2) λ + 16 a 2 b 2 c 2. 
This is a so-called depressed cubic (it has no λ 2 term) and while its zeros can be directly expressed via Cardano’s formulae, we rather consider an ellipsoid of revolution, i.e., simplified case c = b. The above characteristic polynomial then becomes det( A − λI) = − λ 3 + 4(2 a 2 b 2 + b 4) λ + 16 a 2 b 4. It is now (almost) obvious that λ 1 = −2 b 2 is one of its zeros. To obtain remaining two zeros we first divide det( A − λI) by λ + 2 b 2: − λ 3 + 4(2 a 2 b 2 + b 4) λ + 16 a 2 b 4 = − λ 2 + 2 b 2 λ + 8 a 2 b 2. λ + 2 b 2 The zeros of this quadratic polynomial are √ λ 2 , 3 = b 2 ± 8 a 2 b 2 + b 4. Now that the possible λ’s are known, let’s determine x = [ x, y, z]T. Those would be the eigenvectors of A subject to the ‘normalization constraint’ x 2 y 2 z 2 + + = 1. a 2 b 2 b 2 CHAPTER 4. SOLUTIONS 83 • Start with λ 1 = −2 b 2. We have (assuming a , b) 2 b 2 2 a 2 2 a 2 1 0 0     A − λ   ∼   1 I =     2 b 2 2 b 2 2 b 2 0 1 1 ,         2 b 2 2 b 2 2 b 2 0 0 0 hence x = z[0 , 1 , −1]T for arbitrary (free variable) z ∈ R. That won’t produce a solution of interest, since x = 0. √ • Next one is λ 2 = b 2 + 8 a 2 b 2 + b 4. We get √  b 2 − 8 a 2 b 2 + b 4 2 a 2 2 a 2   √    A − λ   2 I =    2 b 2 − b 2 − 8 a 2 b 2 + b 4 2 b 2   √       2 b 2 2 b 2 − b 2 − 8 a 2 b 2 + b 4  q  1 0 1 1 − 8 a 2 + 1   2   b 2  ∼     .   0 1 −1      0 0 0  Therefore, a general eigenvector corresponding to λ 2 is  q 1   8 a 2 + 1 − 1   2   b 2  x = z        1       1  for arbitrary z ∈ R. A little more work shows that z which respects the constraint x 2 + y 2 + z 2 = 1 is a 2 b 2 b 2 √ ab 8 a 2 b 2 + b 4 − b 2 z = , where γ = . p γ 2 b 2 + 2 a 2 2 b 2 We will not continue to express x in full, but it is already clear that x = [ γz, z, z]T. Hence (8 γ + 4) a 2 b 2 S( γz, z, z) = 4 γz 2 + 4 γz 2 + 4 z 2 = z 2(8 γ + 4) = γ 2 b 2 +2 a 2 √ = b 2 + 8 a 2 b 2 + b 4. √ • The third and final case is λ 3 = b 2 − 8 a 2 b 2 + b 4. Now we obtain √  b 2 + 8 a 2 b 2 + b 4 2 a 2 2 a 2   √    A − λ   3 I =    2 b 2 − b 2 + 8 a 2 b 2 + b 4 2 b 2   √       2 b 2 2 b 2 − b 2 + 8 a 2 b 2 + b 4  q  1 0 1 1 + 8 a 2 + 1   2   b 2  ∼     .   0 1 −1      0 0 0  Repeating the rest of analogous steps as for λ 2 one would obtain √ S( γz, z, z) = b 2 − 8 a 2 b 2 + b 4 √ which is clearly a negative number. (This time γ = − 8 a 2 b 2+ b 4+ b 2 .) 2 b 2 CHAPTER 4. SOLUTIONS 84 In conclusion, the largest attainable surface area for a rectangular box inscribed inside an ellipsoid of revolution is √ S = b 2 + 8 a 2 b 2 + b 4. (We will not explicitly express the side lengths of this rectangular box, but an enthusiastic reader may well do so.) A comment: Relying on intuition can sometimes fail. A sphere is a special case of an ellipsoid (the one with c = b = a) and intuition would say that the inscribed box with the largest volume or surface area is a cube with side length 2 a √ . (Which is correct!) Stretching the sphere 3 produces an ellipsoid and stretching the cube produces a rectangular box. Now, intuition might say that we simply need to stretch the cube which maximizes volume and surface area. The resulting volume and surface area for an ellipsoid of revolution ( c = b) would then be 8 ab 2 8 ab + 4 b 2 V = √ and S = . 3 3 3 While the formula for the maximal attainable volume is correct, the one for maximal attainable area is false! 
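Since the conclusion here is counter-intuitive, a crude numerical check is reassuring. The sketch below is not part of the printed solution; it picks arbitrary illustrative semi-axes a = 2, b = c = 1, grid-searches the objective 4(xy + xz + yz) used in this solution over first-octant vertices (x, y, z) of boxes inscribed in the ellipsoid of revolution, and compares the result with the closed form b^2 + √(8a^2 b^2 + b^4) and with the 'stretched cube' guess (8ab + 4b^2)/3.

```python
import numpy as np

a, b = 2.0, 1.0   # illustrative semi-axes; c = b (ellipsoid of revolution)

# First-octant vertex (x, y, z) on x^2/a^2 + y^2/b^2 + z^2/b^2 = 1,
# written in spherical angles.
theta, phi = np.meshgrid(np.linspace(0.0, np.pi / 2, 1001),
                         np.linspace(0.0, np.pi / 2, 1001))
x = a * np.sin(theta) * np.cos(phi)
y = b * np.sin(theta) * np.sin(phi)
z = b * np.cos(theta)

S = 4.0 * (x * y + x * z + y * z)   # the quantity maximized in the solution

print(S.max())                                   # numerical maximum, ~6.744
print(b**2 + np.sqrt(8 * a**2 * b**2 + b**4))    # closed form, 1 + sqrt(33) ~ 6.745
print((8 * a * b + 4 * b**2) / 3)                # 'stretched cube' guess, ~6.667
```

The grid maximum essentially matches the closed form and exceeds the 'stretched cube' value, in line with the remark above.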
Solution to problem 3.21, page 20: Denote the lengths of the edges of this box by a, b, and c. Since we assembled the box frame from a rod of length ℓ, we must have 4 a + 4 b + 4 c = ℓ. This is our constraint. (a) The volume of the box is V ( a, b, c) = abc, and this is precisely the function we must maximize, subject to the constraint above. Let’s rewrite the constraint as 4 a + 4 b + 4 c − ℓ = 0 and form the Lagrange function L( a, b, c, λ) = abc − λ(4 a + 4 b + 4 c − ℓ). The stationary point of L are solutions of the system ∂L = bc − 4 λ = 0, ∂a ∂L = ac − 4 λ = 0, ∂b ∂L = ab − 4 λ = 0, ∂c ∂L = −(4 a + 4 b + 4 c − ℓ) = 0. ∂λ We quickly gather from the first three equations that bc = ac = ab holds. While we could safely ignore the solutions with any of the a, b, or c equal to 0, let’s be strict (once) and find all solutions to this system. If a = 0, then we must have λ = 0. (Consider either the second or the third equation.) Now, from the first equation (and since λ = 0) we must have either b = 0 or c = 0. If b = 0, we have c = ℓ , if c = 0, we have b = ℓ . 4 4 CHAPTER 4. SOLUTIONS 85 There’s nothing special about starting the reasoning above with ‘if a = 0’. The conclusion has the same form: Any solution with one of the a, b, or c equal to 0, will force λ = 0, two of the a, b, and c equal to 0, and the remaining one equal to ℓ . These ( a, b, c, λ)-solutions are 4 ( ℓ , 0 , 0 , 0) , (0 , ℓ , 0 , 0) , and (0 , 0 , ℓ , 0). 4 4 4 They are also not the ones that interest us—the volume of the resulting (degenerate) box is 0, i.e., these solutions represent the minima. So let’s assume that none of the a, b, c are equal to 0. Then, from bc = ac = ab, we deduce that b = a and c = b, i.e., a = b = c. Plugging this into the constraint, we get a = ℓ and 12 ℓ ℓ ℓ ℓ 3 V , , = 12 12 12 12 is the largest possible volume of such a box frame. So the box frame with largest possible volume we can assemble from a rod of length ℓ is in fact a frame of a cube. (b) The additional restriction is in fact an additional constraint, ab = A must hold. We’ll rewrite this as ab − A = 0. The Lagrange function will now depend on two Lagrange multipliers, we’ll denote them by λ and µ: L( a, b, c, λ, µ) = abc − λ(4 a + 4 b + 4 c − ℓ) − µ( ab − A). We now solve the system: ∂L = bc − 4 λ − µb = 0, ∂a ∂L = ac − 4 λ − µa = 0, ∂b ∂L = ab − 4 λ = 0, ∂c ∂L = −(4 a + 4 b + 4 c − ℓ) = 0, ∂λ ∂L = −( ab − A) = 0. ∂µ From the third equation we have 4 λ = ab, and replacing 4 λ with ab in the first two equations we get bc − ab − µb = 0 ∴ b( c − a − µ) = 0, ac − ab − µa = 0 ∴ a( c − b − µ) = 0. We’re assuming that A > 0, and, since ab = A (from the fifth equation), a , 0 and b , 0. Hence, c − a − µ = 0, c − b − µ = 0, which implies a = b. From the second constraint we now get a 2 = A or √ a = A. (We ignore the negative solution, since the length cannot be CHAPTER 4. SOLUTIONS 86 negative.) To finish, we use the first constraint (the fourth equation in √ √ our ‘Lagrange system’) to get 8 A+4 c = ℓ or c = ℓ −2 A. The maximum 4 possible volume in this case is therefore √ √ √ √ ℓ ℓ V A, A, − 2 A = A · − 2 A . 4 4 The solution is, in a sense, expected. We obtained a box with base rect- √ angle a square with side length A, the remainder of the rod length is then used for four vertical edges. Solution to problem 3.22, page 20: This is similar to the previous exercise. Denote by a the side length of the base equilateral triangle and by h the prism’s height. 
The volume of such a prism is √ a 2 h 3 V ( a, h) = , 4 and its surface area is √ a 2 3 A( a, h) = + 3 ah. 2 Having ℓ meters of the rod available, means that 6 a + 3 h = ℓ. That will be our constraint. (a) We need to maximize V with respect to the constraint. The Lagrange function is √ a 2 h 3 L( a, h, λ) = − λ(6 a + 3 h − ℓ). 4 It stationary points are solutions of √ ∂L ah 3 = − 6 λ = 0, ∂a 2√ ∂L a 2 3 = − 3 λ = 0, ∂h 4 ∂L = −(6 a + 3 h − ℓ) = 0. ∂λ Let’s multiply the first equation with 2 and the second one with 4 to get √ ah 3 − 12 λ = 0, √ a 2 3 − 12 λ = 0. From this we get ah = a 2 ∴ ah − a 2 = 0 ∴ a( h − a) = 0. Ignoring the solutions with a = 0, we get h = a. (The reader is invited to consider the solutions we just ignored. What do they represent?) Hence, from the constraint, we get 9 a − ℓ = 0 or a = h = ℓ . That means that we 9 need to cut up the rod into 9 pieces of equal lenght, and the resulting maximal volume is √ ℓ ℓ ℓ 3 3 V , = . 9 9 4 · 93 CHAPTER 4. SOLUTIONS 87 (b) We now need to maximize A with respect to our constraint. The Lagrange function is √ a 2 3 L( a, h, λ) = + 3 ah − λ(6 a + 3 h − ℓ). 2 It stationary points are now solutions of ∂L √ = a 3 + 3 h − 6 λ = 0, ∂a ∂L = 3 a − 3 λ = 0, ∂b ∂L = −(6 a + 3 h − ℓ) = 0. ∂λ Note that λ = a from the second equation, and, plugging this into the first equation, we get √ √ a 3 + 3 h − 6 a = 0 ∴ 3 h = (6 − 3) a. Substituting that instead of 3 h into the third equation, we have √ √ ℓ 12 + 3 6 a + (6 − 3) a − ℓ = 0 ∴ a = √ = ℓ 0 . 09739 ℓ, 12 − 3 141 and √ √ √ (6 − 3) ℓ (6 − 3) ℓ 23 − 2 3 3 h = √ ∴ h = √ = ℓ 0 . 13855 ℓ. 12 − 3 3(12 − 3) 141 The reader is invited to evaluate the resulting maximal attainable area. Solution to problem 3.23, page 20: (a) We’d like to find extreme values of the function f (x) = aTx subject to the constraint ∥x∥ = d. We first rewrite the constraint as ∥x∥2 = d 2 ∴ ∥x∥2 − d 2 = 0 ∴ xTx − d 2 = 0 | {z } g(x) and set the Lagrange function as L(x , λ) = f (x) − λg(x) = aTx − λ(xTx − d 2). The stationary points of L are again the solutions of the system ∂L = aT − 2 λ xT = 0, ∂ x ∂L = −(xTx − d 2) = 0. ∂λ It follows from the first equation that a = 2 λ x, i.e., x and a are parallel. Let’s write this as x = α a and plug it into the second equation: d 2 d α 2aTa − d 2 = 0 ∴ α 2 = ∴ α = ± . ∥a∥2 ∥a∥ CHAPTER 4. SOLUTIONS 88 Hence, vectors x, at which extreme values of f are attained, are d x = α a = ± a, ∥a∥ and these extreme values are d d f ± a = aT ± a = ± d∥a∥. ∥a∥ ∥a∥ (b) Extreme value of the expression aTx (the dot product of a and x) on the sphere with equation ∥x∥ = d will be attained precisely when x is parallel to a. Solution to problem 3.24, page 20: (a) Let’s rewrite the constraint as ∥x∥2 = d 2 or xTx − d 2 = 0. The Lagrange function corresponding to our problem is L(x , λ) = xT A x − λ(xTx − d 2). Its stationary points are, as usual, the solutions to the system ∂L = xT( A + A T) − 2 λ xT = 0, ∂ x ∂L = −(xTx − d 2) = 0. ∂λ Note that the first (system of) equation(s) can be rewritten as A + A T x = λ x, 2 i.e., x is an eigenvector of A+ A T (and λ is the corresponding eigenvalue). 2 (Since A+ A T is symmetric, its eigenvalues and its eigenvectors are real, 2 i.e., λ ∈ n R and x ∈ R .) Another thing to notice is A + A T 1 1 xT x = xT A x + xT A Tx = · 2xT A x = xT A x = f (x). 2 2 2 So, for an eigenvector x of A+ A T with ∥x∥ = d, we have 2 A + A T f (x) = xT x = xT λ x = λd 2. 
2 Finally, the extreme values of f subject to ∥x∥ = d can be identified as λ max d 2 and λ min d 2, where λ max and λ min are the largest and the smallest eigenvalues of A+ A T , respectively. 2 (b) Now the constraint is xT A x = d 2, which we rewrite as xT A x − d 2 = 0. For the Lagrange function we have L(x , µ) = xTx − µ(xT A x − d 2). CHAPTER 4. SOLUTIONS 89 (The decision to denote the Lagrange multiplier by µ will become clear later.) Its stationary points are solutions of ∂L = 2xT − 2 µ xT A = 0, ∂ x ∂L = −(xT A x − d 2) = 0. ∂µ (We used the fact that A is symmetric when evaluating the first derivative.) Rewriting the first equation as 1 A x = x, µ we see that 1 must be an eigenvalue of A, with x the corresponding µ eigenvector. (Note that 1 makes sense as an eigenvalue of A, since A is µ a definite matrix.) From the second equation (the constraint), we now obtain 1 ! xT A x = d 2 ∴ xT x = d 2 ∴ ∥x∥2 = µd 2 or f (x) = µd 2. µ Hence, denoting the smallest and the largest eigenvalues of A by λ min and λ max, respectively, we have that d 2 is the largest value attained by f , and λ min d 2 is the smallest value attained by f . λ max (We used that λ = 1 for an eigenvalue λ of A, and also the fact that the µ eigenvalues of A are positive.) Solution to problem 3.25, page 20: (a) The inequality ∥x − p∥ ≤ d determines the closed n R ball of radius d centered at p. We split the solution into two subtasks: extrema in the interior (deter- ∥ mined by ∥x − p∥ < d) and extrema on the boundary x − p∥ ≤ d (the sphere determined by ∥x − p∥ = d). • The interior: This is easy, the only stationary point of f is x = 0 and this is a candidate if and only if ∥p∥ < d, i.e., 0 is actually contained in the interior. • The boundary: Write the constraint as ∥x − p∥ = d ∴ ∥x − p∥2 − d 2 = 0 ∴ (x − p)T(x − p) − d 2 = 0, and let’s set up the corresponding Lagrange function L(x , λ) = xTx − λ (x − p)T(x − p) − d 2 . CHAPTER 4. SOLUTIONS 90 Now, ∂L = 2xT − λ(2xT − 2pT) = 0T, ∂ x ∂L = − ∥x − p∥2 − d 2 = 0. ∂λ The first equation implies that (2 − 2 λ)x = −2 λ p, i.e., x and p must be parallel. Let’s write this as x = α p for some α ∈ R and plug this into the second equation d d ∥ α p−p∥ = d ∴ ∥( α−1)p∥ = d ∴ | α−1| = ∴ α = 1± . ∥p∥ ∥p∥ Hence, d ! x = 1 ± p, ∥p∥ which should be the expected solution. Finally, if ∥p∥ < d, then the minimal value of f is attained in the interior at 0, and is f (0) = 0. In case ∥p∥ ≥ d, the minimal value of f is attained on the boaundary at 1 − d p, and is equal to ∥p∥2 + d 2 − 2 d∥p∥. (As a ∥p∥ sanity check, notice that in the boundary case ∥p∥ = d that last expression is 0, as it should be.) (b) The equation A x = b determines an affine subspace n R of n R . The appropriate Lagrange function now is L(x , λ) = xTx − λ T( A x − b). (Note that λ is now a column vector, with each A x = b component corresponding to one of the boundary conditions!) Its stationary points are solutions of ∂L = 2xT − λ T A = 0T, ∂ x ∂L = −( A x− b) = 0. ∂λ From the first equation we have x = 1 A T λ. We plug this into the second 2 equation to get 1 AA T λ = b. 2 If we assume that A is of full rank (so that AA T is invertible), we can express λ = 2 AA T−1 b, and therefore 1 x = A T λ = A T AA T−1 b. 2 (In general, i.e., for non-full rank matrices A, the solution is x = A+b, where A+ is the Moore–Penrose inverse of A.) CHAPTER 4. SOLUTIONS 91 Finally, the minimal value of f on that affine subspace is f A T AA T−1 b = bT AA T−1 AA T AA T−1 b = bT AA T−1 b for the case when A is of full rank. 
(That −1 superscript needs to be replaced by a + superscript for A which is not of full rank.) (c) Let’s consider the boundary case first, i.e., we min-n R imize f (x) = ∥x∥2 with respect to ∥x − p∥ = d and A x = b. The Lagrange function is L(x , µ, λ) = xTx − µ(∥x − p∥2 − d 2) − λ T( A x − b). ∥x − p∥ ≤ d A x = b Its stationary points are solutions of the system ∂L = 2xT − 2 µ(x − p)T − λ T A = 0T, ∂ x ∂L = −(∥x − p∥2 − d 2) = 0, ∂µ ∂L = −( A x − b) = 0. ∂λ Let’s start: From the first equation we get 1 (2 − 2 µ)x = A T λ − 2 µ p ∴ x = A T λ − 2 µ p . 2 − 2 µ A side note: This last expression is not well defined if µ = 1. The case µ = 1 implies that the first equation of the ‘Lagrange system’ does not depend on x. (The first equation simplifies to 2p− A T λ = 0.) That means that x is only determined by the second and the third equation which describe a sphere in the affine subspace given by A x = b. If that whole sphere consists of stationary points of f (x) = ∥x∥2, i.e., f is constant on that sphere, then the origin of that sphere is the orthogonal projection of 0 onto the affine subspace A x = b. (Namely, the origin of the sphere is given by p = A+b = A T AA T−1 b.) It is clear that in this case the minimal value of f is attained at that origin, i.e., in the interior of the ball bounded by that sphere and not on the boundary sphere itself. So, assuming µ , 1 we plug x from above into the third equation to get A A T λ − 2 µ p = (2 − 2 µ)b ∴ AA T λ = (2 − 2 µ)b + 2 µA p, hence λ = AA T−1 ((2 − 2 µ)b + 2 µA p) . CHAPTER 4. SOLUTIONS 92 Now we plug this into the expression for x and obtain 1 x = A T AA T−1 ((2 − 2 µ)b + 2 µA p) − 2 µ p 2 − 2 µ 1 = (2 − 2 µ) A T AA T−1 b + 2 µ( A T A p − p) 2 − 2 µ µ = A T AA T−1 b + ( A T A − I)p 1 − µ = A T AA T−1 b + α( A T A − I)p, where we introduced α = µ in the last line. Finally, we use the con-1− µ straint ∥x − p∥2 = d 2. Plugging the above expression for x into it and rearranging we obtain a quadratic equation for α, namely α 2 pT( A T A − I)2p + 2 α pT A Tb − pT A T( AA T)−1b − pT( A T A − I)p + + bT( AA T)−1b − 2pT A T( AA T)−1b + pTp − d 2 = 0 Solving this equation for α and plugging the solution into our expression for x, we finally obtain the points, at which extreme values of f (x) = xTx are attained. Conveniently, we leave that task to the reader. A final case to consider: f attains its minimal value at a point in the interior of the ball ∥x−p∥ ≤ d within the affine subspace A x = b. This can only happen if the orthogonal projection of 0 onto A x = b is contained in that ball, i.e., if ∥ A+b − p∥ < d. In that case, the minimal value is the same as in part (b) of this task. Document Outline Linear algebra A recollection of basic concepts Schur, Frobenius, Eckart–Young Vector spaces and linear maps Vector spaces Linear maps Functions of several variables Multiple integrals Local extrema of real multivariate functions Constrained extrema Solutions