These video lectures of Professor Gilbert Strang teaching 18.06 were recorded in Fall 1999 and do not correspond precisely to the current edition of the textbook. However, this book is still the best reference for more information on the topics covered in each lecture.
Strang, Gilbert. Introduction to Linear Algebra. 4th ed. Wellesley, MA: Wellesley-Cambridge Press, February 2009. ISBN: 9780980232714.
Instructor/speaker: Prof. Gilbert Strang
Lecture 24: Markov matrices...
Here is a lecture on the applications of eigenvalues, and the first one will be Markov matrices.
I'll tell you what a Markov matrix is -- this matrix A will be a Markov matrix -- and I'll explain how they come up in applications.
And then if I have time, I would like to say a little bit about Fourier series, which is a fantastic application of the projection chapter.
Okay.
What's a Markov matrix?
Can I just write down a typical Markov matrix? Say .1, .2, .7, then .01, .99, 0, and let's say .3, .3, .4.
Okay. There's a totally just-invented Markov matrix.
What makes it a Markov matrix?
Two properties that this matrix has.
So two properties are -- one, every entry is greater than or equal to zero.
All entries greater than or equal to zero.
And, of course, when I square the matrix, the entries will still be greater than or equal to zero.
I'm going to be interested in the powers of this matrix.
And this property, of course, is going to stay there.
Really, Markov matrices, you'll see, are connected to probability ideas, and probabilities are never negative.
The other property -- do you see the other property in there?
If I add down the columns, what answer do I get?
One.
So all columns add to one.
All columns add to one.
And actually when I square the matrix, that will be true again.
So that the powers of my matrix are all Markov matrices, and I'm interested in, always, the eigenvalues and the eigenvectors.
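Here is a minimal numerical sketch of those two properties, and of the claim that squaring preserves them. It assumes the example on the board is entered column by column as below, and uses Python with NumPy purely for illustration.

import numpy as np

A = np.array([[0.1, 0.01, 0.3],
              [0.2, 0.99, 0.3],
              [0.7, 0.00, 0.4]])   # columns (.1,.2,.7), (.01,.99,0), (.3,.3,.4)

print((A >= 0).all())              # True: every entry is greater than or equal to zero
print(A.sum(axis=0))               # [1. 1. 1.]: every column adds to one

A2 = A @ A                         # the square is again a Markov matrix
print((A2 >= 0).all(), np.allclose(A2.sum(axis=0), 1.0))   # True True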
And this question of steady state will come up.
You remember we had steady state for differential equations last time?
What was the eigenvalue in the differential equation case that led to a steady state?
It was lambda equals zero.
You remember we did an example where one of the eigenvalues was lambda equals zero, so we had an e to the zero t, a constant one -- as time went on, that part stayed steady.
Now what -- in the powers case, it's not a zero eigenvalue.
Actually with powers of a matrix, a zero eigenvalue, that part is going to die right away.
It's an eigenvalue of one that's all important.
So this steady state will correspond -- will be totally connected with an eigenvalue of one and its eigenvector.
In fact, the steady state will be the eigenvector for that eigenvalue.
Okay. So that's what's coming.
Now, for a reason that we have to see, this matrix has an eigenvalue of one.
This property, that the columns all add to one, turns out to guarantee that one is an eigenvalue, so you can actually find that eigenvalue of a Markov matrix without computing any determinants of A minus lambda I -- that matrix will have an eigenvalue of one, and we want to see why.
And then the other thing is -- so the key points -- let me write these underneath.
The key points are: lambda equal one is an eigenvalue.
And key point two: all the other eigenvalues are, in absolute value, smaller than one.
Well, there could be some exceptional case where another eigenvalue has magnitude exactly one.
It never has an eigenvalue larger than one.
So these two facts -- somehow we ought to -- linear algebra ought to tell us.
And then, of course, linear algebra is going to tell us what happens if I multiply by A time after time -- the k-th thing is A to the k-th power times u0 -- and I'm asking what's special about these powers of A, and very likely the quiz will have a problem to compute some powers of A, or powers of A applied to an initial vector.
So, you remember the general form?
The general form is that there's some amount of the first eigenvalue to the K-th power times the first eigenvector, and another amount of the second eigenvalue to the K-th power times the second eigenvector and so on.
My conscience always makes me say at least once per lecture that this requires a complete set of eigenvectors; otherwise we might not be able to expand u0 in the eigenvectors and we couldn't get started.
But once we're started with u0 when K is zero, then every A brings in these lambdas.
And now you can see what the steady state is going to be.
If lambda one is one, then lambda one to the k-th power stays at one, and these other eigenvalues are smaller than one -- so I've sort of scratched over the equation there. We had this term, but what happens to the other term? If the lambda is smaller than one, then as we take powers, as we iterate, as we go forward in time, it goes to zero.
Having scratched over it, I might as well scratch further, right?
That term and all the other terms are going to zero, because all the other eigenvalues are smaller than one, and the steady state that we're approaching is just whatever there was of the x1 part of the initial condition u0.
This much we know from what we've already done.
So I want to see why -- let's at least see number one, why one is an eigenvalue.
And then there's actually -- in this chapter we're interested not only in eigenvalues, but also eigenvectors.
And there's something special about the eigenvector.
Let me write down what that is.
The eigenvector x1 -- x1 is the eigenvector -- and all its components are positive, so the steady state is positive, if the start was.
Well, actually, in general this might have some component zero, but there are no negative components in that eigenvector.
Okay. Can I come to that point?
How can I look at that matrix -- so that was just an example.
How can I see that, for a matrix whose columns add to one, this property means that lambda equal one is an eigenvalue?
Okay. So let's just think that through.
Let me look at A. If I believe that one is an eigenvalue, then I should be able to subtract off one times the identity, and then I would get a matrix whose diagonal is, what, -.9, -.01 and -.6 -- I took the ones away, and the other parts, of course, are still what they were; this is still .2 and .7. Okay, what's up with this matrix now?
I've shifted the matrix, this Markov matrix, by one, by the identity, and what do I want to prove?
What is it that I believe about this matrix?
I believe it's singular.
Singular will -- if A minus I is singular, that tells me that one is an eigenvalue, right?
The eigenvalues are the numbers that I subtract off -- the shifts -- the numbers that I subtract from the diagonal -- to make it singular.
Now why is that matrix singular?
I -- we could compute its determinant, but we want to see a reason that would work for every Markov matrix not just this particular random example.
So what is it about that matrix?
Well, I guess you could look at its columns now -- what do they add up to?
Zero.
The columns add to zero -- all columns of A minus I add to zero -- and then I want to realize that this means A minus I is singular.
Okay. Why?
You know, that could be a quiz question, a sort of theoretical quiz question.
If I give you a matrix and tell you all its columns add to zero, give me a reason -- because it is true -- that the matrix is singular.
Okay.
Actually, I'm thinking of two or three ways to see that.
How would you do it?
We don't want to take its determinant somehow.
For the matrix to be singular, well, it means that these three columns are dependent, right?
The determinant will be zero when those three columns are dependent.
You see, we're -- we're at a point in this course, now, where we have several ways to look at an idea.
We can take the determinant -- here we don't want to.
B- but we met singular before that -- those columns are dependent.
So how do I see that those columns are dependent?
They all add to zero.
Let's see -- actually, another thing I would like to be able to show is that the rows are dependent.
Maybe that's easier.
If I know that all the columns add to zero, that's my information, how do I see that those three rows are linearly dependent?
What -- what combination of those rows gives the zero row?
How -- how could I combine those three rows -- those three row vectors to produce the zero row vector?
And that would tell me those rows are dependent, therefore the columns are dependent, the matrix is singular, the determinant is zero -- well, you see it.
I just add the rows.
One times that row plus one times that row plus one times that row -- it's the zero row.
The rows are dependent.
In a way, that one one one, because it's multiplying the rows, is like an eigenvector in the -- it's in the left null space, right?
One one one is in the left null space.
It's singular because the rows are dependent -- and can I just keep the reasoning going?
Because this vector one one one is not in the null space of the matrix, but it's in the null space of the transpose.
And that's good enough.
If we have a square matrix -- if we have a square matrix and the rows are dependent, that matrix is singular.
So it turned out that the immediate guy we could identify was one one one.
Of course, there will be somebody in the null space, too. And actually, who will it be?
So now I want to ask about the null space of the matrix itself.
What combination of the columns gives zero?
I don't want to compute it, because I just made up this matrix and it would take me a while -- it looks sort of doable because it's three by three -- but my point is: once we've found it, what have we got that's in the null space of A minus I?
It's the eigenvector, right?
That's where we find x1.
x1, the eigenvector, is in the null space of A minus I.
That's the eigenvector corresponding to the eigenvalue one. Right?
That's how we find eigenvectors.
So those three columns must be dependent -- some combination of those three columns is the zero column -- and the three components in that combination are the eigenvector.
And that guy is the steady state.
Okay.
So I'm happy about the thinking here, but I haven't completed it, because I haven't found x1.
But it's there.
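A short sketch of the argument just given, under the same assumption about the board matrix: the columns of A minus I add to zero, so the all-ones vector is in the left null space, A minus I is singular, and one is an eigenvalue of A.

import numpy as np

A = np.array([[0.1, 0.01, 0.3],
              [0.2, 0.99, 0.3],
              [0.7, 0.00, 0.4]])
B = A - np.eye(3)                        # the shifted matrix A - I

print(np.allclose(B.sum(axis=0), 0))     # True: columns of A - I add to zero
print(np.allclose(np.ones(3) @ B, 0))    # True: (1,1,1) is in the left null space
print(np.isclose(np.linalg.det(B), 0))   # True: A - I is singular, so 1 is an eigenvalue of A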
Another thought came to me as I was doing this, another little comment about eigenvalues and eigenvectors, because of A and A transpose.
What can you tell me about eigenvalues of A -- of A and eigenvalues of A transpose?
Whoops.
They're the same.
So this is a little comment -- since eigenvalues are generally not easy to find, it's always useful to know some cases where you've got them -- and this is one: if you know the eigenvalues of A, then you know the eigenvalues of A transpose.
The eigenvalues of A transpose are the same.
And can I just, like, review why that is?
So to find the eigenvalues of A, this would be the determinant of A minus lambda I equals zero; that gives me an eigenvalue of A. Now how can I get A transpose into the picture here?
I'll use the fact that the determinant of a matrix and the determinant of its transpose are the same.
The determinant of a matrix equals the determinant of the transpose.
That was property ten, the very last guy in our determinant list.
So I'll transpose that matrix.
This leads to -- I just take the matrix and transpose it, but now what do I get when I transpose lambda I?
I just get lambda I.
So that's -- that's all there was to the reasoning.
The reasoning is that the eigenvalues of A solve that equation.
The determinant of a matrix is the determinant of its transpose, so that gives me this equation and that tells me that the same lambdas are eigenvalues of A transpose.
So, backing up to the Markov case, one is an eigenvalue of A transpose and we actually found its eigenvector, one one one, and that tells us that one is also an eigenvalue of A -- but, of course, it has a different eigenvector; the left null space isn't the same as the null space, and we would have to find it.
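A quick check of that comment, again with the assumed example matrix: A and A transpose have the same eigenvalues, and one of them is one.

import numpy as np

A = np.array([[0.1, 0.01, 0.3],
              [0.2, 0.99, 0.3],
              [0.7, 0.00, 0.4]])

print(np.sort(np.linalg.eigvals(A)))                 # one eigenvalue is 1, the others are smaller in magnitude
print(np.allclose(np.sort(np.linalg.eigvals(A)),
                  np.sort(np.linalg.eigvals(A.T))))  # True: A transpose has the same eigenvalues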
So there's some vector here, which is x1, that produces zero zero zero. Actually, it wouldn't be that hard to find -- as I'm talking I'm thinking, okay, am I going to follow through and actually find it?
Well, I can tell from this one -- look, if I put a point six there and a point seven there, then I'll be okay in the last row, right?
Now it only remains to find one guy.
And let me take the first row, then.
Minus point 54 plus point 21 -- there's some big number going in there, right?
So just to make the first row come out zero, I'm getting minus point 54 plus point 21, so that was minus point 33 -- and what do I want?
Like thirty three hundred?
This is the first time in the history of linear algebra that an eigenvector has ever had a component of thirty-three hundred.
But I guess it's true.
Because then I multiply by minus one over a hundred -- oh no, it was point 33.
So it's just -- oh, shoot.
Only 33. Okay, so there's the eigenvector.
Oh, and notice that it did turn out, at least, to be all positive.
So that was, like, the theory -- predicts that part, too.
I won't give the proof of that part.
So point six, 33, point seven.
Okay.
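Checking that hand computation, still assuming the board matrix above: x1 = (0.6, 33, 0.7) should be in the null space of A minus I, and, rescaled so its components add to one, it is the steady-state distribution.

import numpy as np

A = np.array([[0.1, 0.01, 0.3],
              [0.2, 0.99, 0.3],
              [0.7, 0.00, 0.4]])
x1 = np.array([0.6, 33.0, 0.7])

print(np.allclose((A - np.eye(3)) @ x1, 0))   # True: A x1 = x1, so x1 is the eigenvector for lambda = 1
print(x1 / x1.sum())                          # the steady state, scaled so the components add to one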
Now that's the linear algebra part.
Can I get to the applications?
Where do these Markov matrices come from?
Because that's -- that's part of this course and absolutely part of this lecture.
Okay. So where's -- what's an application of Markov matrices?
Okay. Markov matrices -- so my equation, then, that I'm solving and studying is this equation u(k+1) = A u(k).
And now A is a Markov matrix.
A is Markov.
And I want to give an example.
Can I just create an example?
It'll be two by two.
And it's one I've used before because it seems to me to bring out the idea.
It's -- because we have two by two, we have two states, let's say California and Massachusetts.
And I'm looking at the populations in those two states, the people in those two states, California and Massachusetts.
And my matrix A is going to tell me in a -- in a year, some movement has happened.
Some people stayed in Massachusetts, some people moved to California, some smart people moved from California to Massachusetts, some people stayed in California and made a billion.
Okay. So there's a matrix there with four entries, and those tell me the fractions of my population -- I'm going to use fractions, so they won't be negative, of course, because only positive people are involved here -- and they'll add up to one, because I'm accounting for all people.
So that's why I have these two key properties.
The entries are greater than or equal to zero because I'm looking at probabilities.
Do they move, do they stay?
Those probabilities are all between zero and one.
And the probabilities add to one because everybody's accounted for.
I'm not losing anybody, gaining anybody in this Markov chain.
It's -- it conserves the total population.
Okay. So what would be a typical matrix, then?
So this would be u California and u Massachusetts at time k+1.
And it's some matrix, which we'll think of, times u California and u Massachusetts at time k.
And notice this matrix is going to stay the same, you know, forever.
So that's a severe limitation on the example.
The example has a -- the same Markov matrix, the same probabilities act at every time.
Okay. So what's reasonable? Say point nine of the people in California at time k stay there.
And point one of the people in California move to Massachusetts.
Notice why that column added to one, because we've now accounted for all the people in California at time k.
Nine tenths of them are still in California, one tenth are here at time k+1.
Okay.
What about the people who are in Massachusetts?
This is going to multiply column two, right -- by our fundamental rule of multiplying matrix times vector -- it's the population in Massachusetts.
Shall we say that after the Red Sox fail again, only 80 percent of the people in Massachusetts stay and 20 percent move to California?
Okay. So again, this adds to one, which accounts for all the people in Massachusetts and where they are.
So there is a Markov matrix.
Non-negative entries adding to one.
What's the steady state?
If everybody started in Massachusetts, say, when the Pilgrims showed up or something, then where are they now?
Where are they at time 100, let's say, or maybe -- I don't know, how many years since the Pilgrims?
300 and something.
Or -- and actually where will they be, like, way out a million years from now?
I could take the powers of this matrix.
In fact, you ought to be able to figure out: what is the hundredth power of that matrix?
Why don't we do that?
But let me follow the steady state.
So what's my starting u Cal, u Mass at time zero? Shall we put anybody in California?
Let's make it zero there, and say the population of Massachusetts is a thousand. Okay.
So the population is -- so the populations are zero and a thousand at the start.
What can you tell me about this population after -- after k steps?
What will u Cal plus u Mass add to?
A thousand.
Those thousand people are always accounted for.
But -- so u Mass will start dropping from a thousand and u Cal will start growing.
Actually, we could see -- why don't we figure out what it is after one?
After one time step, what are the populations at time one?
So what happens in one step?
You multiply once by that matrix and, let's see, zero times this column -- so it's just a thousand times this column, so I think we're getting 200 and 800.
So after the first step, 200 people are in California.
Now at the following step, I'll multiply again by this matrix -- more people will move to California.
Some people will move back.
Twenty people will come back, and the net result will be that the California population will be above 200 and the Massachusetts population below 800, and they'll still add up to a thousand.
Okay.
I do that a few times.
I do that 100 times.
What's the population?
Well, okay, to answer any question like that, I need the eigenvalues and eigenvectors, right?
I've created an example, but as soon as I want to solve anything, I have to find eigenvalues and eigenvectors of that matrix.
Okay.
So let's do it.
So there's the matrix .9, .2, .1, .8 and tell me its eigenvalues.
Lambda equals -- so tell me one eigenvalue?
One, thanks.
And tell me the other one.
What's the other eigenvalue? From the trace or the determinant -- the trace is, like, easier.
So the trace of that matrix is one point seven.
So the other eigenvalue is point seven.
And it -- notice that it's less than one.
And notice that the determinant is point 72 minus point 02, which is point seven.
Right. Okay.
Now to find the eigenvectors.
So that's lambda one, and for the eigenvector, I'll subtract one from the diagonal, right?
So can I do that lightly here?
Subtract one from the diagonal: I have minus point one and minus point two, and of course these are still there. And I'm looking for -- this is going to be x1.
It's the null space of A minus I.
Okay, everybody sees that it's two and one.
Okay?
And notice that that eigenvector is positive.
And actually, we can jump to infinity right now.
What's the population at infinity?
It's a multiple -- this is -- this eigenvector is giving the steady state.
It's some multiple of this, and how is that multiple decided?
By adding up to a thousand people.
So the steady state, the c1 x1 -- this is the x1, but that adds up to three and I want a thousand, so it's going to be two thirds of a thousand and one third of a thousand, making a total of the thousand people. That'll be the steady state.
That's really all I need to know at infinity.
But if I want to know what's happened after just a finite number like 100 steps, I'd better find this eigenvector.
So I'll subtract point seven times the identity from the diagonal and I'll get that, and I'll look at the null space of that one -- this is going to give me x2 now -- and what is it?
So what's in the null space of that? It's certainly singular, so I know my calculation is right -- and it's one and minus one.
One and minus one.
So I'm prepared now to write down the solution after 100 time steps -- the populations after 100 time steps, right?
Can we remember the one with this two one eigenvector and the point seven with the minus one one eigenvector?
So I'll -- let me -- I'll just write it above here.
u after k steps is some multiple of one to the k times the two one eigenvector and some multiple of point seven to the k times the minus one one eigenvector.
Right?
That's how powers of a matrix work.
When I apply those powers to a u0 -- so it's u0, which was zero, a thousand -- that has to be correct at k=0.
So I'm plugging in k=0 and I get c1 times two one and c2 times minus one one.
Two equations, two constants, certainly independent eigenvectors, so there's a solution -- and do you see what it is?
Let's see, I guess we already figured that c1 was a thousand over three, I think -- did we think that had to be a thousand over three?
And maybe c2 would be -- excuse me, let me get an eraser -- I think we can get it here.
c2, we want to get a zero here, so maybe we need plus two thousand over three.
I think that has to work.
Two times a thousand over three minus two thousand over three, that'll give us the zero, and a thousand over three plus two thousand over three will give us three thousand over three, the thousand. So this is what we approach -- the part with the point seven to the k-th power is the part that's disappearing.
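Here is a small sketch of this two-state example, with California listed first (that ordering is an assumption): one step from (0, 1000) gives (200, 800), and the closed form with c1 = 1000/3 and c2 = 2000/3 matches repeated multiplication and settles at the steady state (2000/3, 1000/3).

import numpy as np

A  = np.array([[0.9, 0.2],
               [0.1, 0.8]])           # columns add to one
u0 = np.array([0.0, 1000.0])          # everybody starts in Massachusetts

print(A @ u0)                         # [200. 800.] after one step

def u(k):
    # c1 = 1000/3 on eigenvector (2,1) with lambda = 1; c2 = 2000/3 on (-1,1) with lambda = 0.7
    return 1000/3 * np.array([2.0, 1.0]) + 2000/3 * 0.7**k * np.array([-1.0, 1.0])

print(np.allclose(np.linalg.matrix_power(A, 100) @ u0, u(100)))   # True: closed form agrees with A^100 u0
print(u(100))                         # approximately [666.67 333.33], the steady state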
That's -- that's Markov matrices.
Okay.
That's an example of where they come from, from modeling movement of people with no gain or loss, with total -- total count conserved.
Okay.
Just if I can add one more comment, because you'll see Markov matrices in electrical engineering courses, and often you'll see them -- here's my little comment.
Sometimes -- in a lot of applications they prefer to work with row vectors.
So they -- instead of -- this was natural for us, right?
For all the eigenvectors to be column vectors.
So our columns added to one in the Markov matrix.
Just so you don't think, well, what -- what's going on?
If we work with row vectors and we multiply vector times matrix -- so we're multiplying from the left -- then we'll be using the transpose of this matrix, and it'll be the rows that add to one.
So in other textbooks, instead of columns adding to one, you'll see rows adding to one.
Okay. Fine.
Okay, that's what I wanted to say about Markov matrices; now I want to say something about projections, even leading a little into Fourier series.
But before any Fourier stuff, let me make a comment about projections.
So this is a comment about projections with an orthonormal basis.
So, of course, the basis vectors are q1 up to qn.
Okay. I have a vector v.
Let me imagine this is a basis.
Let's say I'm in n by n: I've got n orthonormal vectors, I'm in n-dimensional space, so they're a complete basis -- any vector v can be expanded in this basis.
So any vector v is some combination: some amount x1 of q1 plus some amount x2 of q2, and so on, up to some amount xn of qn.
So -- so any v.
I just want you to tell me what those amounts are.
What are x1 -- what's x1, for example?
So I'm looking for the expansion.
This is -- this is really our projection.
I could -- I could really use the word expansion.
I'm expanding the vector in the basis.
And the special thing about the basis is that it's orthonormal.
So that should give me a special formula for the answer, for the coefficients.
So how do I get x1?
What -- what's a formula for x1?
I could go through the projection -- the Q transpose Q, all that, the normal equations -- but I'll come out with this nice answer that I think I can see right away.
How can I get hold of x1 and get these other x's out of the equation?
So how can I get a nice, simple formula for x1?
And then we want to see, sure, we knew that all the time.
Okay. So what's x1?
The good way is take the inner product of everything with q1.
Take the inner product of that whole equation, every term, with q1.
What will happen to that last term?
If I take the dot product of that last term with q1, I get zero, right?
Because this basis is orthonormal.
If I take the dot product of the q2 term with q1, I get zero.
If I take the dot product of the q1 term with q1, I get one.
So that tells me what x1 is: q1 transpose v -- that's taking the dot product -- is x1 times q1 transpose q1 plus a bunch of zeroes.
And this is a one, so I can forget that.
I get x1 immediately.
So do you see what I'm saying? If I have an orthonormal basis, then the coefficient that I need for each basis vector is a cinch to find.
Let me put this into matrix language, too, so you'll see it there also.
If I write that first equation in matrix language, what is it?
In matrix language, this equation says I'm taking these columns -- are you guys good at this now?
I'm taking those columns times the x's and getting v, right?
That's the matrix form.
Okay, that's the matrix Q.
Qx is v.
What's the solution to that equation?
It's -- of course, it's x equal Q inverse v.
So x is Q inverse v, but what's the point?
Q inverse in this case is simple.
I don't have to work to invert this matrix Q: because these columns are orthonormal, I know the inverse.
And it is Q transpose.
When you see a square matrix with that letter Q, that just triggers: Q inverse is the same as Q transpose.
So the first component of x is the first row of Q transpose times v, and what's that?
The first row of Q transpose is just q1 transpose.
So the first component is q1 transpose times v.
So that's what we concluded here, too.
Okay.
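A small sketch of this point; the particular orthonormal basis and the vector below are made up for illustration. With orthonormal columns in Q, the expansion coefficients are simply x = Q transpose v.

import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # columns q1..q4 form an orthonormal basis
v = rng.standard_normal(4)

x = Q.T @ v                                  # x_i = q_i transpose v, no system of equations to solve
print(np.allclose(Q @ x, v))                 # True: v = x1 q1 + x2 q2 + x3 q3 + x4 q4
print(np.allclose(np.linalg.inv(Q), Q.T))    # True: Q inverse equals Q transpose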
So, nothing Fourier here.
The key ingredient was that the q's are orthonormal.
And now that's what Fourier series are built on.
So now, in the remaining time, let me say something about Fourier series.
Okay.
So Fourier series is -- well, we've got a function f of x.
And we want to write it as a combination of -- maybe it has a constant term a0.
And then it has some amount a1 of cos(x) in it.
And it has some amount b1 of sin(x) in it.
And it has some a2 cos(2x) in it.
And some b2 sin(2x), and forever.
So what's -- what's the difference between this type problem and the one above it?
This one's infinite, but the key property of things being orthogonal is still true for sines and cosines, so it's the property that makes Fourier series work.
So that's called a Fourier series.
Better write his name up.
Fourier series.
So it was Joseph Fourier who realized that, hey, I could work in function space.
Instead of a vector v, I could have a function f of x.
Instead of orthogonal vectors q1, q2, q3, I could have orthogonal functions: the constant, the cos(x), the sin(x), the cos(2x) -- but infinitely many of them.
I need infinitely many, because my space is infinite dimensional.
So this is, like, the moment in which we leave finite dimensional vector spaces and go to infinite dimensional vector spaces. The vectors are now functions -- and of course, there are so many functions that we've got an infinite dimensional space -- and the basis vectors are functions, too: the constant function one. So my basis is one, cos(x), sin(x), cos(2x), sin(2x), and so on.
And the reason Fourier series is a success is that those are orthogonal.
Okay. Now what do I mean by orthogonal?
I know what it means for two vectors to be orthogonal -- y transpose x equals zero, right?
Dot product equals zero.
But what's the dot product of functions?
I'm claiming that whatever it is, the dot product -- or we would more likely use the word inner product of, say, cos(x) with sin(x) is zero.
And cos(x) with cos(2x), also zero.
So I -- let me tell you what I mean by that, by that dot product.
Well, how do I compute a dot product?
So let's just remember, for vectors, v transpose w was v1w1 + ... + vnwn.
Okay.
Now functions.
Now I have two functions, let's call them f and g.
What do I do with them now?
The vectors had n components, but the functions have a whole, like, continuum.
To graph the function, I don't just have n points, I've got this whole graph.
So I have functions -- I'm really asking you, what's the inner product of this function f with another function g? And I want to make it parallel to this the best I can.
So the best parallel is to multiply f(x) times g(x) at every x -- there I just had n multiplications, but here I'm going to have a whole range of x's -- and there I added the results.
What do I do here?
So what's the analog of addition when you have -- when you're in a continuum?
It's integration.
So the dot product of two functions will be the integral of the product of those functions, dx.
Now I have to say, well, what are the limits of integration?
And for this Fourier series -- if that right-hand side is going to be f(x) -- all those sines and cosines that I'm seeing on the right are periodic, with period two pi.
So that's what f(x) had better be.
So I'll integrate from zero to two pi.
Everything is on the interval zero to two pi now, because if I'm going to use these sines and cosines, then f(x) is equal to f(x+2pi).
This is periodic -- periodic functions.
Okay.
So now I know what -- I've got all the right words now.
I've got a vector space, but the vectors are functions.
I've got inner products and -- and the inner product gives a number, all right.
It just happens to be an integral instead of a sum.
And then I have the idea of orthogonality -- actually, let's just check.
Orthogonality -- let me do the integral of sin(x) times cos(x) dx from zero to two pi -- I think we get zero.
sin(x) cos(x) dx is the differential of one half sine squared x, so the antiderivative is one half sine squared x -- was that right?
Between zero and two pi -- and, of course, we get zero.
And the same would be true -- with some trig identities to help us out -- of every other pair.
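A numerical check of those orthogonality statements over one period, zero to two pi; the pairs tested below are just examples, using SciPy's quadrature for the integrals.

import numpy as np
from scipy.integrate import quad

def inner(f, g):
    # inner product of two functions on the interval [0, 2*pi]
    return quad(lambda x: f(x) * g(x), 0, 2 * np.pi)[0]

print(abs(inner(np.sin, np.cos)) < 1e-12)                    # True: (sin x, cos x) = 0
print(abs(inner(np.cos, lambda x: np.cos(2 * x))) < 1e-12)   # True: (cos x, cos 2x) = 0
print(np.isclose(inner(np.cos, np.cos), np.pi))              # True: (cos x, cos x) = pi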
So we have now an orthonormal infinite basis for function space, and all we want to do is express a function in that basis.
And so the end of my lecture is: okay, what is a1?
What's the coefficient -- how much cos(x) is there in a function compared to the other harmonics?
How much constant is in that function?
That'll -- that would be an easy question.
The answer a0 will come out to be the average value of f.
That's the amount of the constant that's in there, its average value.
But let's take a1 as more typical.
How will I get -- here's the end of the lecture, then -- how do I get a1?
The first Fourier coefficient.
Okay.
I do just as I did in the vector case.
I take the inner product of everything with cos(x).
Then on the left, I have the integral of f(x) times cos(x) dx.
And on the right, what do I have?
When I say take the inner product with cos(x), let me put it in ordinary calculus words.
Multiply by cos(x) and integrate.
That's what inner products are.
So if I multiply that whole thing by cos(x) and I integrate, I get a whole lot of zeroes.
The only thing that survives is that term.
All the others disappear.
And that term is a1 times the integral of cos(x) squared dx from zero to 2pi -- so this was the left side, and this is all that's left on the right-hand side.
And this is not zero, of course, because it's the length of the function squared -- it's the inner product with itself -- and a simple calculation gives that answer to be pi.
So that's an easy integral and it turns out to be pi, so a1 is one over pi times this integral.
So there, actually, is Euler's famous formula -- or maybe Fourier found it -- for the coefficients in a Fourier series.
And you see that it's exactly an expansion in an orthonormal basis.
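One last sketch: recovering coefficients numerically with exactly that formula, for a test function whose expansion is known in advance (the choice of f here is my own).

import numpy as np
from scipy.integrate import quad

f = lambda x: 3 + 5 * np.cos(x) - 2 * np.sin(2 * x)          # known coefficients: a0 = 3, a1 = 5, b2 = -2

a0 = quad(f, 0, 2 * np.pi)[0] / (2 * np.pi)                  # average value of f
a1 = quad(lambda x: f(x) * np.cos(x), 0, 2 * np.pi)[0] / np.pi       # a1 = (1/pi) (f, cos x)
b2 = quad(lambda x: f(x) * np.sin(2 * x), 0, 2 * np.pi)[0] / np.pi   # b2 = (1/pi) (f, sin 2x)

print(round(a0, 6), round(a1, 6), round(b2, 6))              # 3.0 5.0 -2.0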
Okay, thanks.
So I'll do a quiz review on Monday and then the quiz itself in Walker on Wednesday.
Okay, see you Monday.
Thanks.