Least squares

Posted by sheepapple 
Least squares
March 14, 2011 05:18PM
OK, next problem.

Does anyone have a full example with step-by-step workings for me to check my logic against? I need some basis of correctness to compare against.
Re: Least squares
March 14, 2011 10:26PM
I did some playing with the example on page 205:

>> A = [11 6.01 4.6545 5.905; 6.01 4.6545 4.1150 2.1839; 4.6545 4.1150 3.9161 1.3557]

A =

11.0000 6.0100 4.6545 5.9050
6.0100 4.6545 4.1150 2.1839
4.6545 4.1150 3.9161 1.3557

>> rref(A)

ans =

1.0000 0 0 1.0262
0 1.0000 0 -1.1772
0 0 1.0000 0.3635


Notice that the coefficients the book supplies are a0=0.998, a1=-1.018, a2=0.225.

Why then does Matlab yield a0=1.0262, a1=-1.1772, a2=0.3635?
Re: Least squares
March 14, 2011 10:44PM
Octave gives me the following:

octave:1> A = [11 6.01 4.6545 5.905; 6.01 4.6545 4.1150 2.1839; 4.6545 4.115 3.9161 1.3357]
A =

   11.0000    6.0100    4.6545    5.9050
    6.0100    4.6545    4.1150    2.1839
    4.6545    4.1150    3.9161    1.3357

octave:2> rref(A)
ans =

   1.00000   0.00000   0.00000   0.99796
   0.00000   1.00000   0.00000  -1.01801
   0.00000   0.00000   1.00000   0.22467

IOW, it agrees with the textbook. Looking at what you've done in Matlab for differences, all I see is that your last row differs in the last column (1.3557 where I have 1.3357). I can't imagine that making such a big difference, so we have a mystery on our hands here!
Re: Least squares
March 14, 2011 10:56PM
DOH! Past my bedtime I guess. Better rest before I break something.
Re: Least squares
March 15, 2011 01:17PM
Was it really all caused by that tiny typo?

I wouldn't have thought so. I thought maybe your copy of Matlab was doing something mysterious, and that it would be interesting to see whether Octave replicated it.

If it was the typo, please let me know. That's a big change in the system for one small change in the constants!
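
If you want to check whether the typo alone could do it, here's a quick sketch, splitting the page-205 augmented matrix into its coefficient part and the two right-hand sides (the book's 1.3357 and the mistyped 1.3557):

M = [11     6.0100 4.6545;
     6.0100 4.6545 4.1150;
     4.6545 4.1150 3.9161];
b_book = [5.9050; 2.1839; 1.3357];   % last entry as the book has it
b_typo = [5.9050; 2.1839; 1.3557];   % last entry as originally typed

M \ b_book    % should land near the book's a0=0.998, a1=-1.018, a2=0.225
M \ b_typo    % should land near the 1.0262, -1.1772, 0.3635 that rref gave
cond(M)       % if this comes out large, a small change in b can move the a's a lot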
Re: Least squares
March 17, 2011 02:35AM
OK, so here's my minimalist/provisional summary of least squares (basically just the bits that struck me as most important)... to be continued from p206 onward a bit later...

The thing I need most to remind myself of is that x is known. All the x's are like the Y's: they're data points, so they're known ("exactly", we'll say). What's unknown here is what's usually known. The ai are unknown.

Right, so in the long run we're going to be solving for a, not solving for x, as the reflex would have us do.

So what's with the "least squares" thing then? It's "least" because we're minimizing (the error), and we do that to squares just because they behave nicely. (See the very beginning of the section if that's a problem.) I'll get back to "least/minimizing" in a minute.
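
As a tiny illustration of the "least" bit, with some made-up data (not from the book) and Octave's polyfit doing the fitting:

x = [0 0.5 1 1.5 2 2.5];
Y = [1.02 0.58 0.35 0.27 0.35 0.56];    % invented data, purely for illustration

a = polyfit(x, Y, 2);                   % best-fit quadratic (descending powers)
S = @(a) sum((Y - polyval(a, x)).^2);   % sum of squared errors for a given set of a's

S(a)                  % the minimum
S(a + [0 0 0.05])     % nudge a coefficient and the sum of squares grows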

<<< Edited. Best I do more of that on the original postings, ja? >>>
Re: Least squares
March 17, 2011 02:41AM
Oh Scheiße!

Seems half the font has gone minuscule. (Note the u there. Lots of people mistakenly spell it with an i).

I'll try a cut and paste repair?

AH. I think at that point I was doing the tags by hand instead of scrolling back up to the edit bar. The problem is a missing "[sub...]"

<<< I had some problems that produced some really ugly output. I've deleted it.>>>
Re: Least squares
March 17, 2011 02:46AM
Nee man!

<<< I got a bit annoyed, and then said a few more annoyed things here. Irrelevant. Edited. By clicking "Edit" on the RHS below the posting.>>>
Re: Least squares
March 17, 2011 02:47AM
<<< Fiddled about a bit, because I'm not too good at just leaving things be.>>>
Re: Least squares
March 17, 2011 02:48AM
<<< Fiddled a bit more. No luck >>>
Re: Least squares
March 17, 2011 02:52AM
Now the data points (the "Yi[\sub]" are all correct values for whatever function (if any) fits them. The Y["sub"]i[\sub] don't lie. This means that for any of these correct Y["sub"]i[\sub] that the smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the correct Y["sub"]i[\sub] and the incorrect y["sub"]i["\"sub] above the "data-x" for that Y.


er... Yi ... AH!

OK. I went and used a forward slash instead of a backslash to "escape" the tags.

<<< Blushing colour edited in >>>
With apologies, the repaired version will follow...
Re: Least squares
March 17, 2011 02:55AM
Now the data points (the "Yi") are all correct values for whatever function (if any) fits them. The Yi don't lie. This means that for any of these correct Yi that the smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the correct Yi and the incorrect yi above the "data-x" for that Y.

The error in simple terms is Yi - yi.

Now of course that value could be either negative or positive. Square it and the distinction between positive and negative error goes away. Whee hah.

So we have N data points, N errors (if we allow zero to count as an "error" too), and for various reasons we work with the squares of those N errors, and not simple error.

Now obviously the best curve will have the least error, right? (Squared, but let's drop that from now on to keep things simple).

So you minimise. Ting. A little calculus bell rings. You think of derivatives. Only here we have lots of unknowns, so we use partials instead.

We have "lots" of unknowns? Imprecise. We have n+1 of them. Each of the n x's has an a, and then there's the constant term to add to those. We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns. (The x's just helped with counting them, that's all).

Right, so we do some calculus (partial derivatives of all n+1 equations), as part of that we do some algebra (divide away the 2's etc), and we end up with a system of equations, each of which is a summation of x's multiplied by one of the unknown a values. Put that in matrix form, look at it, notice that the matrix is symmetric.

We are then warned that the matrix is ill conditioned, but that this isn't usually a problem with degree-3 or degree-4 polynomials (and I think that's about the degree you'd get in any exam question on these). To deal with polynomials of greater degree (which always have the problem of having all those zeros, crossing and recrossing the axis like that), one uses orthogonal polynomials etc. For the purposes of just getting by threadbarely in COS233, it's probably better to temporarily forget that fact immediately after being told it.
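
Purely to illustrate that warning (the x's below are made up, and I make no claim about the exact numbers), the condition number of the matrix of sums grows quickly with the degree:

x = linspace(0, 1, 20)';
for n = [2 4 6 8]
  A = ones(numel(x), n+1);
  for k = 1:n
    A(:, k+1) = x .^ k;      % columns 1, x, x^2, ..., x^n
  end
  fprintf('degree %d: cond of the matrix of sums = %g\n', n, cond(A' * A));
end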

Erm.. then there's all that design matrix stuff. A good excuse for bailing out of this early would be to say I've already used up too much space...

One can quite easily see what the design matrix is all about. Just do the transpose of a few rows (take a reduced A^T), and do dot products of the form R1.Cn. The possibility that in fact A·A^T = B will soon seem reasonable enough for you to press on...
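
A quick numeric check of that, with made-up x's (I'm reading A as having one row per power of x; swap the transpose if the book orients it the other way):

x = [0 0.5 1 1.5 2 2.5];     % invented x's
A = [x.^0; x.^1; x.^2];      % 3 x N, rows are the powers of x
B = A * A';                  % 3 x 3, symmetric: entries are sums of x^(j+k)

B(1,2), sum(x)               % e.g. the R1.C2 dot product is just the sum of the x's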

OK. That'll do for the time being.

The idea is that this summary is a kind of "collective working document". It's available to be crossed out, rephrased, amplified, simplified, etc. No need to write the whole lot over. Just copy, paste and then attack with the "strikeout" font, the delete key, etc.

The 5 of us should be able to produce a reasonable summary of the nitty gritty involved in "surviving least squares" from it.
Re: Least squares
March 17, 2011 02:55AM
Well I've used up so many posts on this now, I might as well add just one more to say "YeeeeHaH!"
Re: Least squares
March 17, 2011 10:55AM
How does B get to be a square matrix?

We know N must be greater than n+1 (otherwise the error can be made zero everywhere, since some polynomial of degree at most n will go through every data point).

We have N data points, so N equations.

We have fewer unknowns than that. This must mean we always have an overdetermined system, surely?
Re: Least squares
March 17, 2011 11:00AM
Ah.. it's because we have N errors, isn't it?

That might be ...

No.

No, the inner products are with the vector [a].

The number of columns must match the number of rows in that vector.

Question still open.
Re: Least squares
March 17, 2011 03:18PM
I can see this last little bit needs editing, so let me do some deleting then...

Now of course that value could be either negative or positive. Square it and the distinction between positive and negative error goes away. Whee hah.

So we have N data points, N errors (if we allow zero to count as an "error" too), and for various reasons we work with the squares of those N errors, and not simple error.

Now obviously the best curve will have the least error, right? (Squared, but let's drop that from now on to keep things simple).

So you minimise. Ting. A little calculus bell rings. That involves derivatives. Only here we have lots of unknowns, so we use partials instead. <<< At the minimum, each partial derivative of the (summed, squared) error is equal to zero, so there's an equation to use for every a. >>>

We have "lots" of unknowns? Imprecise. We have n+1 of them. Each of the n x's has an a, and then there's the constant term to add to those. We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns. (The x's just helped with counting them, that's all).

Right, so we do some calculus (partial derivatives of all n+1 equations), as part of that we do some algebra (divide away the 2's etc), and we end up with a system of equations, each of which is a summation of x's multiplied by one of the unknown a values. Put that in matrix form, look at it, notice that the matrix is symmetric. (I don't want to get bogged down in the tiny details of the calculation, but this is just too cavalier. Somewhere in the way we solve lies the reason that B is somehow a square matrix, instead of an "overdetermined" one.)
Re: Least squares
March 17, 2011 03:48PM
We minimise the sum of the squared errors. That's what we call "best fit".

Expand any error term and you have
Yi - yi =
Yi - (a0 + a1xi + a2xi^2 + ... + anxi^n)

Put the right kind of sigma in front of that, and you're talking of the sum of all errors.

Minimising S, the sum of all these errors, squared, is done through partials.

The matrix is square because, firstly, there are n+1 partials (one for every a), so n+1 equations. And secondly, there are n+1 ai in every "sigma". (You'll take the partial of the whole sum with respect to each ai.)

I think from there on it's good and proper to just say "We do some calculus and some algebra" to get the set of equations.
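
For the sceptical, a quick numeric check (same sort of made-up data as before) that at the least-squares a's each of the n+1 partials really does come out as zero, up to roundoff:

x = [0 0.5 1 1.5 2 2.5];
Y = [1.02 0.58 0.35 0.27 0.35 0.56];   % invented data
a = polyfit(x, Y, 2);                  % the least-squares a's (descending powers)
r = Y - polyval(a, x);                 % the residuals Yi - yi

% dS/da_k works out to -2 * sum( x.^k .* r ); each one should print ~0:
for k = 0:2
  fprintf('dS/da%d ~ %g\n', k, -2 * sum(x.^k .* r));
end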
Re: Least squares
March 22, 2011 01:32PM
Being stubborn, I go on...

We solve for a, not for x.

However, these ai are (in principle) parameters, not "constants of the polynomial". In our particular case you might as well think of them as the constants to solve for, but it would appear that one can do all sorts of weird things with those ai. Small point, then, so I'll delete it later.

Now the data points (the "Yi" are all correct values for whatever function (if any) fits them. The Yi don't lie... Really? ...

Turns out I've got that pretty back-to-front. No. The data has a certain amount of error. The correct value is faith-based. We choose to believe that without the experimental error, those raggedy data points would've landed exactly on some nice smooth curve with a closed form solution. I'll skip the philosophical aspects of that. Also it doesn't make a huge practical difference if you make this mistake.

This means that for any of these incorrect Yi that the (true) smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the incorrect Yi and the correct yi above the "data-x" for that Y.

The error in simple terms is Yi - yi. However if you minimise the square of that error you still have all the ai you want to solve for in the resulting equations, and the squares just behave more nicely than do the simple errors. (I'm falling asleep here, so this aside about the squares must die in the end, too).

So we have N data points, N errors.

Now obviously the best curve will have the least error, right? (And when the square of that error is at a minimum, so is the error). We minimise by taking partials of the sum of all errors - one partial per parameter.

The number of unknown a's is n+1. Each of the n powers x^1 up to x^n has an a, and one must not forget that a0 (the x^0 term) adds one more.

We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns.

The matrix is square because, firstly, there are n+1 partials (one for every a), so n+1 equations. And secondly, there are n+1 ai in every "sigma". (You'll take the partial of the whole sum with respect to each ai.)

The original system was overdetermined. Solve for the ai using partials, and you get a nice square matrix. Actually one of the motivations for producing this system is precisely this: Take an awkward overdetermined system. Find some function on it that outputs a nice square system.
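
One way to see that shrink numerically (made-up data again; Octave's backslash on a rectangular system happens to return the least-squares answer, which makes a handy cross-check):

x = [0 0.5 1 1.5 2 2.5]';
Y = [1.02 0.58 0.35 0.27 0.35 0.56]';   % invented data

A = [ones(size(x)) x x.^2];   % N x (n+1): the overdetermined system A*a ~ Y
size(A)                       % 6 x 3
size(A' * A)                  % 3 x 3: square, and ready for an exact solve

(A' * A) \ (A' * Y)           % solve the square normal equations
A \ Y                         % least-squares solve of the tall system -- same a's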

That should be neat enough for readability now? It's still a very imperfect summary (much too long, far too many digressions), but it's getting better, innit?
Re: Least squares
March 22, 2011 01:36PM
Urp! I still have a lot of things completely back to front in that.

Never mind. That gives the rest of you something to get your critical teeth into, doesn't it? There's nothing like a good fundamental error for bringing clarification of the whole problem when it's revealed.
Re: Least squares
March 24, 2011 04:16PM
My goodness. Sorry for the lapse in dialog. I had to catch up with work the past few days. Car's been broken, etc too. Anyhow, let me read through your rants and ramblings tonight and give you some feedback. BTW that small change in the matrix made all the difference.
Re: Least squares
March 24, 2011 04:32PM
Ah, sheepapple! Good to see you!

You can safely ignore all but the last two postings if you like. I thought I'd work through a summary right from the rough. (Leave it open to criticism - always a good way of learning something).

Excellent. Look forward to hearing your comments. Please nitpick. :-) (Honestly. The idea is to understand as best as possible, not to have a nice tea party all the time).
Re: Least squares
March 25, 2011 10:41AM
Yeah, look, on the least squares side it all boils down to the equations supplied on page 204 and the matrix form given in 3.28.

Couple that with Table 3.14 and the accompanying worked example and it makes sense. The problem, though, is that like all this stuff it's quite computationally intensive and time-consuming. I ended up writing a Matlab program to help do some of the legwork.
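
Something along these lines will do that sort of legwork (a sketch only, with made-up data: building the page-204-style sums for a quadratic and handing the augmented matrix to rref, as earlier in the thread):

x = [0 0.5 1 1.5 2 2.5];                 % made-up data points
Y = [1.02 0.58 0.35 0.27 0.35 0.56];
n = 2;                                   % degree of the fitted polynomial

A = zeros(n+1, numel(x));
for j = 0:n
  A(j+1, :) = x .^ j;                    % row j+1 holds the x^j values
end
B = A * A';                              % the square, symmetric matrix of sums
c = A * Y';                              % right-hand side: sums of x^j .* Y

rref([B c])                              % last column gives a0, a1, ..., an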
Re: Least squares
March 25, 2011 10:41AM
By the way I started another topic regarding assignment 3, q1.