Calculus 3 Notes
(Shahab Shahabi)
Dawson College
In this set of notes, we shall cover a number of topics from Multivariable Calculus. No attempt has been made to be comprehensive; on the contrary, our account will be (very) telegraphic!
Convention: As we know, the Euclidean n-space R^n is the set of all n-tuples

    R^n = { X = (x_1, ..., x_n) : x_1, ..., x_n ∈ R },

equipped with the standard addition and scalar multiplication operations. However, it is sometimes convenient (or even necessary) to write such an n-tuple in a vertical style, as the column vector

    X = (x_1, ..., x_n)^T.

When the necessity arises (in the sequel), we shall switch from one style to the other, without further indication!
§0. Terminology & Notation:
Calculus 1 deals with single-variable single-valued functions (also known as scalar functions):

    f : R → R, x ↦ y = f(x).

Calculus 3 deals with:
• single-variable vector-valued functions (aka vector functions):

    r : R → R^n, t ↦ r(t) = (x_1(t), ..., x_n(t));

• multivariable single-valued functions (aka scalar fields):

    f : R^m → R, (x_1, ..., x_m) ↦ w = f(x_1, x_2, ..., x_m);

• multivariable vector-valued functions (aka vector fields):

    F : R^m → R^n, (x_1, ..., x_m) ↦ (w_1, ..., w_n) = F(x_1, x_2, ..., x_m).

This means that each w_j is a function of the x_i's:

    w_j = f_j(x_1, x_2, ..., x_m)  (j = 1, 2, ..., n).
§1. Level Sets:
For any (continuous) function f : R^m → R, and any real number c, the collection of points P(a_1, ..., a_m) for which f(P) = c (a constant value) is called a level set:

    S = { P ∈ R^m : f(P) = c }.

The cases m = 2, 3 are the most important for us; thus:
• For any two-variable continuous function z = f(x, y), the set of pairs (x, y) for which f(x, y) = constant defines a curve (in R^2) called a level curve.
• For any three-variable continuous function w = f(x, y, z), the locus of points (x, y, z) for which f(x, y, z) = constant defines a surface (in R^3) called a level surface.
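As a quick sanity check (a toy example of our own, not from the notes): for the scalar field f(x, y) = x^2 + y^2, the level set f(x, y) = 4 is the circle of radius 2 centered at the origin, so every point of that circle yields the same value of f.

```python
import math

# Our own toy example (not from the notes): for f(x, y) = x^2 + y^2,
# the level set f(x, y) = 4 is the circle of radius 2 about the origin.
def f(x, y):
    return x**2 + y**2

# Sample twelve points on the circle of radius 2; all lie on one level curve.
points = [(2*math.cos(k*math.pi/6), 2*math.sin(k*math.pi/6)) for k in range(12)]
values = [f(px, py) for (px, py) in points]
print(all(math.isclose(v, 4.0) for v in values))  # True
```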
§2. Differentiation:
• Scalar Functions (Cal 1): y = f(x) ⟹ y' = y'_x = dy/dx = f'(x) = df/dx;
• Vector Functions: r(t) = (x_1(t), ..., x_n(t)) ⟹ (d/dt)(r) = r'(t) = (x'_1(t), ..., x'_n(t));
• Scalar Fields: w = f(x_1, ..., x_m) ⟹ ∇f = (∂f/∂x_1, ..., ∂f/∂x_m). (For this, which is called the "gradient" of f, see §5.)
The derivative

    (d/dt)( f(r(t)) )

is easily seen to be also equal to the following dot product:

    (d/dt)( f(r(t)) ) = ( x'_1(t), ..., x'_m(t) ) · ( ∂f/∂x_1(r(t)), ..., ∂f/∂x_m(r(t)) ) = r'(t) · ∇f(r(t)).

(This case is known as the Chain Rule for paths, for differentiating f(r(t)).) Note furthermore that the last expression above may be viewed as differentiating the scalar field f "along the path" determined by r(t).
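The Chain Rule for paths is easy to verify symbolically. The following sketch (the function and path are our own illustrative choices, assuming the sympy library is available) compares d/dt f(r(t)) with r'(t) · ∇f(r(t)):

```python
import sympy as sp

# Verify d/dt f(r(t)) = r'(t) . grad f(r(t)) on an example of our own:
# f(x, y) = x**2 + x*y and the path r(t) = (cos t, sin t).
t, x, y = sp.symbols('t x y')
f = x**2 + x*y
r = (sp.cos(t), sp.sin(t))
on_path = {x: r[0], y: r[1]}

lhs = sp.diff(f.subs(on_path), t)                      # d/dt of f(r(t))
grad = [sp.diff(f, v) for v in (x, y)]                 # (2x + y, x)
rhs = sum(sp.diff(comp, t) * g.subs(on_path) for comp, g in zip(r, grad))

print(sp.simplify(lhs - rhs))  # 0
```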
A scalar function composed with a vector field: if

    R^m —G→ R^n —f→ R,  (x_1, ..., x_m) ↦ (u_1, ..., u_n) ↦ y,

then

    ∂y/∂x_1 = ∂y/∂u_1 · ∂u_1/∂x_1 + ∂y/∂u_2 · ∂u_2/∂x_1 + ··· + ∂y/∂u_n · ∂u_n/∂x_1,
    ∂y/∂x_2 = ∂y/∂u_1 · ∂u_1/∂x_2 + ∂y/∂u_2 · ∂u_2/∂x_2 + ··· + ∂y/∂u_n · ∂u_n/∂x_2,
    ············
and
    ∂y/∂x_m = ∂y/∂u_1 · ∂u_1/∂x_m + ∂y/∂u_2 · ∂u_2/∂x_m + ··· + ∂y/∂u_n · ∂u_n/∂x_m.

The set of all these equalities may also be expressed in a single matrix equality as follows:

    ( ∂y/∂x_1, ..., ∂y/∂x_m )_{1×m} = ( ∂y/∂u_1, ..., ∂y/∂u_n )_{1×n} × [ ∂u_j/∂x_i ]_{n×m},

where the (j, i) entry of the n×m matrix is ∂u_j/∂x_i, or even more concisely as ∇(f∘G) = ∇(f) × D(G).
A vector field composed with a vector field (the most general case): if

    R^m —G→ R^n —F→ R^p,  (x_1, ..., x_m) ↦ (u_1, ..., u_n) ↦ (y_1, ..., y_p),

then for each 1 ≤ i ≤ m and each 1 ≤ k ≤ p, we have

    ∂y_k/∂x_i = Σ_{j=1}^{n} ∂y_k/∂u_j · ∂u_j/∂x_i = ∂y_k/∂u_1 · ∂u_1/∂x_i + ∂y_k/∂u_2 · ∂u_2/∂x_i + ··· + ∂y_k/∂u_n · ∂u_n/∂x_i.

Having all these equalities (collectively) amounts to having the following matrix equality (of the Jacobian matrices):

    [ ∂y_k/∂x_i ]_{p×m} = [ ∂y_k/∂u_j ]_{p×n} × [ ∂u_j/∂x_i ]_{n×m},

or in short: D(F∘G) = D(F) × D(G).
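A small symbolic check of D(F∘G) = D(F) × D(G); the maps G and F below are our own illustrative choices, assuming the sympy library is available:

```python
import sympy as sp

# Check D(F o G) = D(F) x D(G) on illustrative maps of our own:
# G(x1, x2) = (x1*x2, x1 + x2) and F(u1, u2) = (u1**2, u1*u2).
x1, x2, u1, u2 = sp.symbols('x1 x2 u1 u2')
G = sp.Matrix([x1*x2, x1 + x2])
F = sp.Matrix([u1**2, u1*u2])

DG = G.jacobian([x1, x2])                              # n x m, here 2 x 2
DF_at_G = F.jacobian([u1, u2]).subs({u1: G[0], u2: G[1]})
FoG = F.subs({u1: G[0], u2: G[1]})
D_FoG = FoG.jacobian([x1, x2])

print(sp.simplify(D_FoG - DF_at_G * DG))  # the 2 x 2 zero matrix
```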
§4. An Application of the Chain Rule: Implicit Differentiation
One learns in Cal 1 how to find y' = y'_x when x and y are "tied" together via an implicit equation: implicit differentiation, a process which is often lengthy in practice! By means of partial derivatives, one has a "shortcut":
If the equation F(x, y) = 0 defines y (implicitly) as a function of x, then

    dy/dx = − (∂F/∂x)/(∂F/∂y) = − F_x/F_y.

(In the implicit equation, all the terms must be on the left side for this to work!)
Example: If x^3 y − 2x^2 y^2 + 5y^4 − 7 = 0, the left side being F(x, y), then

    y' = − (3x^2 y − 4xy^2)/(x^3 − 4x^2 y + 20y^3).
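The shortcut can be double-checked symbolically: assuming the sympy library is available, its idiff function performs implicit differentiation of F(x, y) = 0 directly, and the two answers agree on the example above.

```python
import sympy as sp

# The notes' example: F(x, y) = x**3*y - 2*x**2*y**2 + 5*y**4 - 7 = 0.
x, y = sp.symbols('x y')
F = x**3*y - 2*x**2*y**2 + 5*y**4 - 7

shortcut = -sp.diff(F, x) / sp.diff(F, y)   # y' = -Fx/Fy
builtin = sp.idiff(F, y, x)                 # sympy's implicit differentiation

print(sp.simplify(shortcut - builtin))  # 0
```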
If the equation F(x, y, z) = 0 defines z (implicitly) as a function of x and y, then

    ∂z/∂x = − (∂F/∂x)/(∂F/∂z) = − F_x/F_z   and   ∂z/∂y = − (∂F/∂y)/(∂F/∂z) = − F_y/F_z.

Example: If x^3 + y^3 + xyz = 0, the left side being F(x, y, z), then

    ∂z/∂x = − (3x^2 + yz)/(xy)   and   ∂z/∂y = − (3y^2 + xz)/(xy).
Fix a unit vector u (i.e., ‖u‖ = 1). Given a function w = f(x_1, ..., x_m) and a point P(a_1, ..., a_m), the directional derivative of f at P in the direction of u is the value of the following limit (assuming it exists):

    D_u f(P) = lim_{t→0} [ f(P + tu) − f(P) ] / t.
Example. Let f(x, y) = x^2 + xy + y^2, P(1, 2) and u = (1/2, √3/2) (note that u is indeed a unit vector!). Then

    D_u f(1, 2) = lim_{t→0} [ f(1 + (1/2)t, 2 + (√3/2)t) − f(1, 2) ] / t
                = lim_{t→0} [ (1 + (1/2)t)^2 + (1 + (1/2)t)(2 + (√3/2)t) + (2 + (√3/2)t)^2 − 7 ] / t
                = lim_{t→0} [ 1 + t + (1/4)t^2 + 2 + (√3/2)t + t + (√3/4)t^2 + 4 + 2√3 t + (3/4)t^2 − 7 ] / t
                = 2 + √3/2 + 2√3 = 2 + (5/2)√3.
Special Cases: If z = f(x, y), and if u = i = (1, 0) is the standard unit vector along the x-axis, then the directional derivative of f in the direction of i is none other than the partial derivative of f with respect to x:

    D_i f(P) = ∂f/∂x (P).

We also have D_j f(P) = ∂f/∂y (P), where j = (0, 1) is the standard unit vector along the y-axis.
Likewise, if w = f(x, y, z), then the directional derivatives of f in the directions of the standard unit vectors i, j and k are, respectively, the partial derivatives of f with respect to x, to y, and to z:

    D_i f(P) = ∂f/∂x (P),   D_j f(P) = ∂f/∂y (P),   and   D_k f(P) = ∂f/∂z (P).
Geometric Insight: For the geometric interpretation of the directional derivative, see page 947 of the textbook.
Computing the Directional Derivative Using the Gradient:
An important observation is that the directional derivative of f at P in the direction of the (unit) vector u is equal to the rate of change of f along the straight path through the point P determined by

    r(t) = P + tu.

That is to say,

    D_u f(P) = (d/dt)( f(P + tu) ) |_{t=0}.
Thus, thanks to the Chain Rule for paths (see the section on the Chain Rule), we have:
Theorem. For u a unit vector, and w = f(x_1, ..., x_m) a differentiable function at a point P(a_1, ..., a_m), the directional derivative D_u f(P) is the dot product of u with the gradient vector of f at P. That is to say, we have

    D_u f(P) = u · ∇f(P).

If the vector u is written in a vertical style, then our statement takes the form

    D_u f(P) = ∇f(P) × u,
the right-hand side being viewed as a matrix multiplication! [ The case of two components: if u = (α, β), then

    D_u f(P) = α · ∂f/∂x (a_1, a_2) + β · ∂f/∂y (a_1, a_2).

Likewise, in the case of three components, we have

    D_u f(P) = α · ∂f/∂x (a_1, a_2, a_3) + β · ∂f/∂y (a_1, a_2, a_3) + γ · ∂f/∂z (a_1, a_2, a_3). ]
Example. Redoing the same example, where f(x, y) = x^2 + xy + y^2, P(1, 2) and u = (1/2, √3/2): using the formula above, we get

    D_u f(1, 2) = (1/2) · (2x + y)|_(1,2) + (√3/2) · (x + 2y)|_(1,2) = (1/2)(4) + (√3/2)(5) = 2 + (5/2)√3,

the same answer as before!
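A numerical sketch of the same example, comparing the gradient formula against the limit definition (the small step t below is our own choice):

```python
import math

# The notes' example: f(x, y) = x^2 + x*y + y^2, P = (1, 2), u = (1/2, sqrt(3)/2).
def f(x, y):
    return x**2 + x*y + y**2

P = (1.0, 2.0)
u = (0.5, math.sqrt(3)/2)
expected = 2 + 2.5*math.sqrt(3)

# Gradient formula: grad f = (2x + y, x + 2y) = (4, 5) at P.
via_gradient = u[0]*(2*P[0] + P[1]) + u[1]*(P[0] + 2*P[1])

# Limit definition, approximated with a small step t.
t = 1e-6
via_limit = (f(P[0] + t*u[0], P[1] + t*u[1]) - f(*P)) / t

print(abs(via_gradient - expected) < 1e-12, abs(via_limit - expected) < 1e-4)  # True True
```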
Maximum Rate of Increase/Decrease: Geometrically speaking, the gradient vector points in the direction of maximum increase of f at P. This means that if one is interested in observing the maximum increase in the value of f when moving away from the position P, one must move in the direction determined by the gradient vector at P. More generally, we have
Theorem. Assume that ∇f(P) ≠ 0, and that u is a unit vector: ‖u‖ = 1.
(i) If 0 ≤ θ ≤ π is the angle between the two vectors ∇f(P) and u, then

    D_u f(P) = ‖∇f(P)‖ · cos(θ).

(Remember from Linear Algebra: v_1 · v_2 = ‖v_1‖ · ‖v_2‖ · cos(θ).)
(ii) We always have the double inequality

    −‖∇f(P)‖ ≤ D_u f(P) ≤ ‖∇f(P)‖.

(This is because −1 ≤ cos(θ) ≤ 1.)
(iii) ∇f(P) points in the direction of maximum rate of increase of f at P; this is achieved when θ = 0, because cos(0) = 1.
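Part (ii) can be illustrated numerically on the running example f(x, y) = x^2 + xy + y^2 at P(1, 2), where ∇f(P) = (4, 5): sampling many unit vectors u, the values u · ∇f(P) never leave the interval [−‖∇f(P)‖, ‖∇f(P)‖], and the maximum is attained when u points along the gradient.

```python
import math

# grad f(1, 2) = (4, 5) for f(x, y) = x^2 + x*y + y^2.
grad = (4.0, 5.0)
norm = math.hypot(*grad)

directional = []
for k in range(360):                      # unit vectors u = (cos t, sin t)
    t = math.radians(k)
    u = (math.cos(t), math.sin(t))
    directional.append(u[0]*grad[0] + u[1]*grad[1])

print(max(directional) <= norm + 1e-12 and min(directional) >= -norm - 1e-12)  # True
```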
Likewise, P(a, b) is said to be a local (or relative) maximum for f if for all (x, y) in some neighborhood N of P, we have

    f(x, y) ≤ f(a, b).

The surface below has (at least) two maxima (and one minimum):
If either of the two inequalities above holds for all (x, y) in the domain of f, then P will be called an absolute maximum / absolute minimum of f:

    f(x, y) ≥ f(P), ∀(x, y) ∈ D_f ⟹ P(a, b) is an absolute minimum,

and

    f(x, y) ≤ f(P), ∀(x, y) ∈ D_f ⟹ P(a, b) is an absolute maximum.
Proposition. Assume that z = f(x, y) has either a maximum (local or global) or a minimum (relative or absolute) at a point P. Assume further that f is differentiable at P. Then
(a) ∂f/∂x (P) = ∂f/∂y (P) = 0;
(b) More generally, for any unit vector u, the directional derivative of f in the direction of u is zero:

    D_u f(P) = 0.
Recall from Cal 1 that a number c is said to be a critical number for a function y = f(x) if either f'(c) does not exist, or (in case it exists) f'(c) = 0. Motivated by this, we say a point P(a, b) is a critical (or stationary) point for a two-variable function z = f(x, y) if either

    ∂f/∂x (a, b)   or   ∂f/∂y (a, b)

does not exist, or (in case both partial derivatives at P exist),

    ∂f/∂x (a, b) = ∂f/∂y (a, b) = 0.
Here are two examples:
• Set f(x, y) = x^2 + y^2 − 2x − 6y + 14. Obviously, f is everywhere differentiable, so to search for critical point(s), if any, we have to solve

    f_x(x, y) = 0, f_y(x, y) = 0 ⟹ 2x − 2 = 0, 2y − 6 = 0 ⟹ x = 1, y = 3,

hence the (only) critical point P(1, 3). It is easily shown that P is an absolute minimum:

    f(x, y) = (x − 1)^2 + (y − 3)^2 + 4 ≥ 4 = f(1, 3).
• Set g(x, y) = x^2 − y^2. This is also an everywhere differentiable function. As for critical points, we have

    g_x(x, y) = 0, g_y(x, y) = 0 ⟹ 2x = 0, 2y = 0 ⟹ x = 0, y = 0,

hence the (only) critical point O(0, 0). The point O, however, is neither a local max. nor a local min.: in every neighborhood of O(0, 0), one can find points like (x, 0) (with x ≠ 0) for which

    g(x, 0) = x^2 > g(0, 0),

and points like (0, y) (with y ≠ 0) for which

    g(0, y) = −y^2 < g(0, 0).

In fact the point O is an example of a saddle point:
A point S(a, b) is said to be a saddle point for z = f(x, y), if
• f is differentiable in a neighborhood of (a, b);
• f_x(a, b) = f_y(a, b) = 0 (i.e., S is a stationary point); but
• S is not a (local) extremum of f.
In the surface below, the red point is an example of a saddle point:
Two Examples: (1) Consider f(x, y) = x^2 + y^2 − 2x − 6y + 14 (we already studied f). We have

    f_x = f_y = 0 ⟹ x = 1, y = 3.

At P(1, 3), we have

    Δ = det [ f_xx(1,3)  f_xy(1,3) ]  =  det [ 2  0 ]  = 4 > 0   &   f_xx(1, 3) = 2 > 0.
            [ f_xy(1,3)  f_yy(1,3) ]         [ 0  2 ]

Thus, according to the Second Derivative Test, the point P(1, 3) must be a minimum, which it is!
(2) Consider g(x, y) = x^4 + y^4 − 4xy + 1. We have

    g_x = g_y = 0 ⟹ 4x^3 − 4y = 4y^3 − 4x = 0 ⟹ y = x^3 and x = y^3 ⟹ x = x^9 ⟹ x = 0, ±1,

thus the three points P(1, 1), Q(−1, −1) and O(0, 0) are the critical points of g.
The second derivatives of g are

    g_xx = 12x^2,  g_xy = −4,  &  g_yy = 12y^2,

thus

    Δ(x, y) = det [ 12x^2   −4    ]  = 144 x^2 y^2 − 16.
                  [ −4      12y^2 ]

(i) At P: we have

    Δ(P) = Δ(1, 1) = 128 > 0  &  g_xx(P) = 12 > 0 ⟹ P is a (local) minimum;

(ii) At Q: we have

    Δ(Q) = Δ(−1, −1) = 128 > 0  &  g_xx(Q) = 12 > 0 ⟹ Q is a (local) minimum too;

(iii) At O: we have

    Δ(O) = Δ(0, 0) = −16 < 0 ⟹ O is a saddle point.
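The classification of the critical points of g can be re-derived mechanically. The following sketch (assuming the sympy library is available) solves g_x = g_y = 0 and applies the Second Derivative Test at each real solution:

```python
import sympy as sp

# g(x, y) = x**4 + y**4 - 4*x*y + 1, as in example (2).
x, y = sp.symbols('x y')
g = x**4 + y**4 - 4*x*y + 1
gx, gy = sp.diff(g, x), sp.diff(g, y)

crit = [s for s in sp.solve([gx, gy], [x, y], dict=True)
        if s[x].is_real and s[y].is_real]

def classify(s):
    delta = (sp.diff(g, x, 2)*sp.diff(g, y, 2) - sp.diff(g, x, y)**2).subs(s)
    if delta > 0:
        return 'min' if sp.diff(g, x, 2).subs(s) > 0 else 'max'
    return 'saddle' if delta < 0 else 'inconclusive'

results = {(s[x], s[y]): classify(s) for s in crit}
print(results)  # minima at (1, 1) and (-1, -1); saddle at (0, 0)
```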
Applications:
• Find the shortest distance (in R^3) between the point P(1, 0, −2) and the plane Π : x + 2y + z = 4.
Solution: Let Q(x, y, z) be an arbitrary point on Π. This means that x + 2y + z = 4, or equivalently that z = 4 − x − 2y. On the other hand, the distance between P and Q is

    D = √[ (x − 1)^2 + (y − 0)^2 + (z − (−2))^2 ]
      = √[ (x − 1)^2 + y^2 + (6 − x − 2y)^2 ]
      = √[ 2x^2 − 14x + 5y^2 − 24y + 4xy + 37 ].
Note that the distance D = D(x, y) may be viewed as a function of the two variables x and y. Thus, the shortest distance is indeed the minimum value of D(x, y). It is, however, easier to find the absolute minimum of

    f(x, y) = D^2(x, y) = 2x^2 − 14x + 5y^2 − 24y + 4xy + 37,

and then take its square root. As for the stationary point(s) of f, we get

    f_x = 0, f_y = 0 ⟹ 4x − 14 + 4y = 0, 10y − 24 + 4x = 0 ⟹ x = 11/6, y = 10/6.
So, we need to determine the nature of the point Q(11/6, 5/3). The second derivatives of f, evaluated at Q, are

    f_xx(Q) = 4 > 0,  f_xy(Q) = 4,  &  f_yy(Q) = 10.

Thus

    Δ = det [ 4  4  ]  = 24 > 0.
            [ 4  10 ]

It follows from the Second Derivative Test that Q(11/6, 5/3) is a minimum point of f, which might be local or global. We leave it to the reader to convince him/herself that in fact f assumes its absolute minimum at Q:

    f(x, y) ≥ f(11/6, 5/3) = 25/6,  ∀(x, y).

Hence, the (shortest) distance from P to Π is D = √(25/6) = 5/√6.
Comment. As a byproduct, we also deduce that this minimum distance occurs at the point Q(11/6, 10/6, −7/6).
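As a cross-check, the standard point-to-plane distance formula (not derived in these notes) gives the same answer 5/√6, which the minimizing point Q realizes:

```python
import math

# Point P and plane x + 2y + z = 4 from the worked example.
a, b, c, d = 1.0, 2.0, 1.0, 4.0
x0, y0, z0 = 1.0, 0.0, -2.0

# Standard point-to-plane distance |a*x0 + b*y0 + c*z0 - d| / ||(a, b, c)||.
dist_formula = abs(a*x0 + b*y0 + c*z0 - d) / math.sqrt(a*a + b*b + c*c)

# The minimizing point Q(11/6, 10/6, -7/6) found above realizes that distance.
qx, qy, qz = 11/6, 10/6, -7/6
dist_Q = math.sqrt((qx - x0)**2 + (qy - y0)**2 + (qz - z0)**2)

print(abs(dist_formula - 5/math.sqrt(6)) < 1e-12,
      abs(dist_Q - dist_formula) < 1e-12)  # True True
```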
• Find the maximum volume of a rectangular box, without a lid, with surface area = 12 (m^2).
Solution: Denoting the dimensions of the box by x (the length), y (the width) and z (the height), we must maximize

    V = xyz,

where x, y, z > 0, and xy + 2xz + 2yz = 12. From this relation, one gets

    z = (12 − xy) / (2(x + y)) ⟹ V = V(x, y) = (12xy − x^2 y^2) / (2(x + y)).

Thus, we wish to maximize the function

    f(x, y) = (12xy − x^2 y^2) / (2(x + y)),

under the conditions x > 0, y > 0, and 12 − xy > 0. It turns out that of the several solutions of f_x = f_y = 0, only one solution is acceptable: x = y = 2. And the Second Derivative Test yields a maximum! Thus the maximum possible volume of the box is V = V(2, 2) = 4.
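A symbolic sketch of the same maximization (assuming the sympy library is available), eliminating z exactly as in the solution above:

```python
import sympy as sp

# V(x, y) = (12*x*y - x**2*y**2) / (2*(x + y)) with x, y > 0, as in the notes.
x, y = sp.symbols('x y', positive=True)
V = (12*x*y - x**2*y**2) / (2*(x + y))

# Stationary points of V under the positivity assumptions.
crit = sp.solve([sp.diff(V, x), sp.diff(V, y)], [x, y], dict=True)
print(crit)
print(V.subs({x: 2, y: 2}))  # maximum volume 4 at x = y = 2
```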
§8. Maxima/Minima With Constraint(s): Lagrange Multipliers
From the first two equations, we get x = 3/(2λ) and y = 1/(2λ) (note that λ ≠ 0; why?). By inserting these into the last equation, we get

    (3/(2λ))^2 + (1/(2λ))^2 − 10 = 0 ⟹ 40λ^2 = 9 + 1 ⟹ λ_{1,2} = ±1/2.

One readily verifies that the solution λ_1 = 1/2 leads to the point (3, 1) of maximum value of f, which is

    Max_{x^2+y^2=10}(f) = 3·(3/(2λ_1)) + (1/(2λ_1)) = 3(3) + 1 = 10,

whereas the solution λ_2 = −1/2 leads to the point (−3, −1) of minimum value of f, which is

    Min_{x^2+y^2=10}(f) = 3·(3/(2λ_2)) + (1/(2λ_2)) = 3(−3) + (−1) = −10.
(2) Find the extreme values of f(x, y, z) = e^{xyz}, subject to the constraint 2x^2 + y^2 + z^2 = 24.
Solution: With g(x, y, z) = 2x^2 + y^2 + z^2 − 24, we must solve the system

    ∂f/∂x = λ·∂g/∂x,  ∂f/∂y = λ·∂g/∂y,  ∂f/∂z = λ·∂g/∂z,  g(x, y, z) = 0

    ⟹ yz e^{xyz} = λ·4x,  zx e^{xyz} = λ·2y,  xy e^{xyz} = λ·2z,  2x^2 + y^2 + z^2 − 24 = 0.
Let us assume that xyz ≠ 0. We have made this assumption since we would like to divide with no fear! Also note that the assumption xyz ≠ 0 forces λ ≠ 0 too (why?). Now, by dividing the two sides of the first equation by those of the second equation, we get

    (yz e^{xyz}) / (zx e^{xyz}) = (4λx) / (2λy) ⟹ y^2 = 2x^2.

Doing the same thing to the second and the third equations, we get

    (zx e^{xyz}) / (xy e^{xyz}) = (2λy) / (2λz) ⟹ z^2 = y^2.

(And if we do the same thing with the third and the first equations, we get 2x^2 = z^2.) Exploiting these relations in the fourth equation, the constraint condition will take the form

    2x^2 + y^2 + z^2 − 24 = 0 ⟹ 2x^2 + 2x^2 + 2x^2 = 24 ⟹ x = ±2.
Thus, there are eight special points to look at:

    (±2, ±2√2, ±2√2),

where the ± signs are independent of each other. At four of these points we have a positive product xyz = (2)(2√2)(2√2) = 16, whereas at the other four points we have a negative product xyz = −16. Finally, note that if xyz = 0, then the value of f is e^0 = 1, and that 1/e^16 < 1 < e^16. Hence

    Max_{2x^2+y^2+z^2=24} e^{xyz} = e^{16}   &   Min_{2x^2+y^2+z^2=24} e^{xyz} = e^{−16}.
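A quick numerical spot-check of example (2): the candidate point (2, 2√2, 2√2) satisfies the constraint, and the Lagrange condition ∇f = λ∇g holds there with λ = e^16 (the value forced by the first equation).

```python
import math

# Candidate point with all plus signs: (2, 2*sqrt(2), 2*sqrt(2)), where xyz = 16.
x, y, z = 2.0, 2*math.sqrt(2), 2*math.sqrt(2)
assert abs(2*x**2 + y**2 + z**2 - 24) < 1e-9     # the constraint holds

e = math.exp(x*y*z)                              # f at the point: e**16
lam = e                                          # multiplier from the first equation

grad_f = (y*z*e, x*z*e, x*y*e)                   # gradient of exp(x*y*z)
grad_g = (4*x, 2*y, 2*z)                         # gradient of 2x^2 + y^2 + z^2 - 24
print(all(abs(gf - lam*gg) < 1e-3 for gf, gg in zip(grad_f, grad_g)))  # True
```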
Next we turn our attention to the case of two constraints:
Suppose f(x_1, ..., x_n), g(x_1, ..., x_n) and h(x_1, ..., x_n) are differentiable. The goal is to locate (and evaluate) the extreme values of f over the set of points where g and h (both) vanish:

    C = { (x_1, ..., x_n) : g(x_1, ..., x_n) = 0 & h(x_1, ..., x_n) = 0 }.

According to Lagrange, if f acquires an extreme value at a point P, under the two constraints g = h = 0, then at such a point the gradient vector of f must be a linear combination of the gradient vectors of g and h:

    ∇f(P) = λ·∇g(P) + μ·∇h(P),

for some scalars λ and μ. This means that we should deal with the following system:

    ∂f/∂x_1 = λ·∂g/∂x_1 + μ·∂h/∂x_1,
    ∂f/∂x_2 = λ·∂g/∂x_2 + μ·∂h/∂x_2,
    ······
    ∂f/∂x_n = λ·∂g/∂x_n + μ·∂h/∂x_n,
    g(x_1, ..., x_n) = 0,
    h(x_1, ..., x_n) = 0

(an (n + 2)×(n + 2) system).
As in the case of one-constraint problems, solving this system often leads to a list of (several) points among which there live the point(s) of extreme value(s) of f, subject to the constraints g = h = 0. We shall illustrate the method in two examples:
Two Examples: (1) Find the extreme values of f(x, y, z) = x + y + z subject to the conditions x + y = 1 and x^2 + z^2 = 2.
Solution: Introducing g(x, y, z) = x + y − 1 and h(x, y, z) = x^2 + z^2 − 2, we must solve the system

    ∂f/∂x = λ·∂g/∂x + μ·∂h/∂x,
    ∂f/∂y = λ·∂g/∂y + μ·∂h/∂y,
    ∂f/∂z = λ·∂g/∂z + μ·∂h/∂z,
    g(x, y, z) = 0,
    h(x, y, z) = 0,

    ⟹ 1 = λ·1 + μ·2x,  1 = λ·1 + μ·0,  1 = λ·0 + μ·2z,  x + y − 1 = 0,  x^2 + z^2 − 2 = 0.

The second equation gives λ = 1; the third equation then gives μ ≠ 0, so the first equation forces

    μ·2x = 0 ⟹ x = 0 ⟹ y = 1 & z = ±√2 ⟹ μ = ±1/(2√2),

with identical signs for z and μ. It is thus obvious that
    Min_{g=h=0} f = 0 + 1 + (−√2) = 1 − √2   &   Max_{g=h=0} f = 0 + 1 + √2 = 1 + √2.
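The whole system of example (1) can also be handed to a solver; this sketch (assuming the sympy library is available) recovers both extreme values 1 ± √2:

```python
import sympy as sp

# f = x + y + z, g = x + y - 1, h = x**2 + z**2 - 2, as in example (1).
x, y, z, lam, mu = sp.symbols('x y z lam mu', real=True)
eqs = [
    1 - (lam + 2*mu*x),   # f_x = lam*g_x + mu*h_x
    1 - lam,              # f_y = lam*g_y + mu*h_y
    1 - 2*mu*z,           # f_z = lam*g_z + mu*h_z
    x + y - 1,            # g = 0
    x**2 + z**2 - 2,      # h = 0
]
sols = sp.solve(eqs, [x, y, z, lam, mu], dict=True)
values = sorted(s[x] + s[y] + s[z] for s in sols)
print(values)  # the minimum 1 - sqrt(2) and the maximum 1 + sqrt(2)
```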
(2) Find the extreme values of f(x, y, z) = x^2 + y^2 + z^2, subject to the constraints x − y = 1 and y^2 − z^2 = 1.
Solution: