Least squares

Posted by sheepapple 
Least squares
March 14, 2011 05:18PM
OK, next problem.

Does anyone have a full example with step-by-step workings for me to check my logic against? I need some basis of correctness to compare against.
Re: Least squares
March 14, 2011 10:26PM
I did some playing with the example on page 205:

>> A = [11 6.01 4.6545 5.905; 6.01 4.6545 4.1150 2.1839; 4.6545 4.1150 3.9161 1.3557]

A =

11.0000 6.0100 4.6545 5.9050
6.0100 4.6545 4.1150 2.1839
4.6545 4.1150 3.9161 1.3557

>> rref(A)

ans =

1.0000 0 0 1.0262
0 1.0000 0 -1.1772
0 0 1.0000 0.3635


Notice that the coefficients the book supplies are a0=0.998, a1=-1.018, a2=0.225.

Why then does Matlab yield a0=1.0262, a1=-1.1772, a2=0.3635?
Re: Least squares
March 14, 2011 10:44PM
Octave gives me the following:

octave:1> A = [11 6.01 4.6545 5.905; 6.01 4.6545 4.1150 2.1839; 4.6545 4.115 3.9161 1.3357]
A =

   11.0000    6.0100    4.6545    5.9050
    6.0100    4.6545    4.1150    2.1839
    4.6545    4.1150    3.9161    1.3357

octave:2> rref(A)
ans =

   1.00000   0.00000   0.00000   0.99796
   0.00000   1.00000   0.00000  -1.01801
   0.00000   0.00000   1.00000   0.22467

IOW, it agrees with the textbook. Looking at what you've done in Matlab for differences, all I see is that your last row differs in the last column (1.3557 where I have 1.3357). I can't imagine that making such a big difference, so we have a mystery on our hands here!
Re: Least squares
March 14, 2011 10:56PM
DOH! Past my bedtime I guess. Better rest before I break something.
Re: Least squares
March 15, 2011 01:17PM
Was it really all caused by that tiny typo?

I wouldn't have thought so. I thought maybe your copy of Matlab was doing something mysterious, and that it would be interesting to see whether Octave replicated it.

If it was the typo, please let me know. That's a big change in the system for one small change in the constants!
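
If you want to check whether the typo alone could do it, here's a quick sketch, splitting the page-205 augmented matrix into its coefficient part and the two right-hand sides (the book's 1.3357 and the mistyped 1.3557):

M = [11     6.0100 4.6545;
     6.0100 4.6545 4.1150;
     4.6545 4.1150 3.9161];
b_book = [5.9050; 2.1839; 1.3357];   % last entry as the book has it
b_typo = [5.9050; 2.1839; 1.3557];   % last entry as originally typed

M \ b_book    % should land near the book's a0=0.998, a1=-1.018, a2=0.225
M \ b_typo    % should land near the 1.0262, -1.1772, 0.3635 that rref gave
cond(M)       % if this comes out large, a small change in b can move the a's a lot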
Re: Least squares
March 17, 2011 02:35AM
OK, so here's my minimalist/provisional summary of least squares (basically just the bits that struck me as most important)... to be continued from p206 onward a bit later...

The thing I need most to remind myself of is that x is known. All the x's are like the Y's: they're data points, so they're known ("exactly", we'll say). What's unknown here is what's usually known. The ai are unknown.

Right, so in the long run we're going to be solving for a, not solving for x, as the reflex would have us do.

So what's with the "least squares" thing then? It's "least" because we're minimizing (the error), and we do that to squares just because they behave nicely. (See the very beginning of the section if that's a problem.) I'll get back to "least/minimizing" in a minute.
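
As a tiny illustration of the "least" bit, with some made-up data (not from the book) and Octave's polyfit doing the fitting:

x = [0 0.5 1 1.5 2 2.5];
Y = [1.02 0.58 0.35 0.27 0.35 0.56];    % invented data, purely for illustration

a = polyfit(x, Y, 2);                   % best-fit quadratic (descending powers)
S = @(a) sum((Y - polyval(a, x)).^2);   % sum of squared errors for a given set of a's

S(a)                  % the minimum
S(a + [0 0 0.05])     % nudge a coefficient and the sum of squares grows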

<<< Edited. Best I do more of that on the original postings, ja? >>>
Re: Least squares
March 17, 2011 02:41AM
Oh Scheiße!

Seems half the font has gone minuscule. (Note the u there. Lots of people mistakenly spell it with an i).

I'll try a cut and paste repair?

AH. I think at that point I was doing the tags by hand instead of scrolling back up to the edit bar. The problem is a missing "[sub...]"

<<< I had some problems that produced some really ugly output. I've deleted it.>>>
Re: Least squares
March 17, 2011 02:46AM
Nee man!

<<< I got a bit annoyed, and then said a few more annoyed things here. Irrelevant. Edited. By clicking "Edit" on the RHS below the posting.>>>
Re: Least squares
March 17, 2011 02:47AM
<<< Fiddled about a bit, because I'm not too good at just leaving things be.>>>
Re: Least squares
March 17, 2011 02:48AM
<<< Fiddled a bit more. No luck >>>
Re: Least squares
March 17, 2011 02:52AM
Now the data points (the "Yi[\sub]" are all correct values for whatever function (if any) fits them. The Y["sub"]i[\sub] don't lie. This means that for any of these correct Y["sub"]i[\sub] that the smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the correct Y["sub"]i[\sub] and the incorrect y["sub"]i["\"sub] above the "data-x" for that Y.


er... Yi ... AH!

OK. I went and used a forward slash instead of a backslash to "escape" the tags.

<<< Blushing colour edited in >>>
With apologies, the repaired version will follow...
Re: Least squares
March 17, 2011 02:55AM
Now the data points (the "Yi") are all correct values for whatever function (if any) fits them. The Yi don't lie. This means that for any of these correct Yi that the smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the correct Yi and the incorrect yi above the "data-x" for that Y.

The error in simple terms is Yi - yi.

Now of course that value could be either negative or positive. Square it and the distinction between positive and negative error goes away. Whee hah.

So we have N data points, N errors (if we allow zero to count as an "error" too), and for various reasons we work with the squares of those N errors, and not simple error.

Now obviously the best curve will have the least error, right? (Squared, but let's drop that from now on to keep things simple).

So you minimise. Ting. A little calculus bell rings. You think of derivatives. Only here we have lots of unknowns, so we use partials instead.

We have "lots" of unknowns? Imprecise. We have n+1 of them. Each of the n x's has an a, and then there's the constant term to add to those. We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns. (The x's just helped with counting them, that's all).

Right, so we do some calculus (partial derivatives of all n+1 equations), as part of that we do some algebra (divide away the 2's etc), and we end up with a system of equations, each of which is a summation of x's multiplied by one of the unknown a values. Put that in matrix form, look at it, notice that the matrix is symmetric.

We are then warned that the matrix is ill conditioned, but that this isn't usually a problem with degree-3 or degree-4 polynomials (and I think that's about the degree you'd get in any exam question on these). To deal with polynomials of greater degree (which always have the problem of having all those zeros, crossing and recrossing the axis like that), one uses orthogonal polynomials etc. For the purposes of just getting by threadbarely in COS233, it's probably better to temporarily forget that fact immediately after being told it.
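
Purely to illustrate that warning (the x's below are made up, and I make no claim about the exact numbers), the condition number of the matrix of sums grows quickly with the degree:

x = linspace(0, 1, 20)';
for n = [2 4 6 8]
  A = ones(numel(x), n+1);
  for k = 1:n
    A(:, k+1) = x .^ k;      % columns 1, x, x^2, ..., x^n
  end
  fprintf('degree %d: cond of the matrix of sums = %g\n', n, cond(A' * A));
end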

Erm.. then there's all that design matrix stuff. A good excuse for bailing out of this early would be to say I've already used up too much space...

One can quite easily see what the design matrix is all about. Just do the transpose of a few rows (take a reduced A^T), and do dot products of the form R1.Cn. The possibility that in fact A·A^T = B will soon seem reasonable enough for you to press on...
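
A quick numeric check of that, with made-up x's (I'm reading A as having one row per power of x; swap the transpose if the book orients it the other way):

x = [0 0.5 1 1.5 2 2.5];     % invented x's
A = [x.^0; x.^1; x.^2];      % 3 x N, rows are the powers of x
B = A * A';                  % 3 x 3, symmetric: entries are sums of x^(j+k)

B(1,2), sum(x)               % e.g. the R1.C2 dot product is just the sum of the x's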

OK. That'll do for the time being.

The idea is that this summary is a kind of "collective working document". It's available to be crossed out, rephrased, amplified, simplified, etc. No need to write the whole lot over. Just copy, paste and then attack with the "strikeout" font, the delete key, etc.

The 5 of us should be able to produce a reasonable summary of the nitty gritty involved in "surviving least squares" from it.
Re: Least squares
March 17, 2011 02:55AM
Well I've used up so many posts on this now, I might as well add just one more to say "YeeeeHaH!"
Re: Least squares
March 17, 2011 10:55AM
How does B get to be a square matrix?

We know N must be greater than n+1 (otherwise the error can be made zero everywhere, since some polynomial of degree at most n will go through every data point).

We have N data points, so N equations.

We have fewer unknowns than that. This must mean we always have an overdetermined system, surely?
Re: Least squares
March 17, 2011 11:00AM
Ah.. it's because we have N errors, isn't it?

That might be ...

No.

No, the inner products are with the vector [a].

The number of columns must match the number of rows in that vector.

Question still open.
Re: Least squares
March 17, 2011 03:18PM
I can see this last little bit needs editing, so let me do some deleting then...

Now of course that value could be either negative or positive. Square it and the distinction between positive and negative error goes away. Whee hah.

So we have N data points, N errors (if we allow zero to count as an "error" too), and for various reasons we work with the squares of those N errors, and not simple error.

Now obviously the best curve will have the least error, right? (Squared, but let's drop that from now on to keep things simple).

So you minimise. Ting. A little calculus bell rings. That involves derivatives. Only here we have lots of unknowns, so we use partials instead. <<< At the minimum, each partial derivative of the (summed, squared) error is equal to zero, so there's an equation to use for every a. >>>

We have "lots" of unknowns? Imprecise. We have n+1 of them. Each of the n x's has an a, and then there's the constant term to add to those. We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns. (The x's just helped with counting them, that's all).

Right, so we do some calculus (partial derivatives of all n+1 equations), as part of that we do some algebra (divide away the 2's etc), and we end up with a system of equations, each of which is a summation of x's multiplied by one of the unknown a values. Put that in matrix form, look at it, notice that the matrix is symmetric. (I don't want to get bogged down in the tiny details of the calculation, but this is just too cavalier. Somewhere in the way we solve lies the reason that B is somehow a square matrix, instead of an "overdetermined" one.)
Re: Least squares
March 17, 2011 03:48PM
We minimise the sum of the squared errors. That's what we call "best fit".

Expand any error term and you have
Yi - yi =
Yi - (a0 + a1xi + a2xi^2 + ... + anxi^n)

Put the right kind of sigma in front of that, and you're talking of the sum of all errors.

Minimising S, the sum of all these errors, squared, is done through partials.

The matrix is square because, firstly, there are n+1 partials (one for every a), so n+1 equations. And secondly, there are n+1 ai in every "sigma". (You'll take the partial of the whole sum with respect to each ai.)

I think from there on it's good and proper to just say "We do some calculus and some algebra" to get the set of equations.
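
For the sceptical, a quick numeric check (same sort of made-up data as before) that at the least-squares a's each of the n+1 partials really does come out as zero, up to roundoff:

x = [0 0.5 1 1.5 2 2.5];
Y = [1.02 0.58 0.35 0.27 0.35 0.56];   % invented data
a = polyfit(x, Y, 2);                  % the least-squares a's (descending powers)
r = Y - polyval(a, x);                 % the residuals Yi - yi

% dS/da_k works out to -2 * sum( x.^k .* r ); each one should print ~0:
for k = 0:2
  fprintf('dS/da%d ~ %g\n', k, -2 * sum(x.^k .* r));
end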
Re: Least squares
March 22, 2011 01:32PM
Being stubborn, I go on...

We solve for a, not for x.

However, these ai are (in principle) parameters, not "constants of the polynomial". In our particular case you might as well think of them as the constants to solve for, but it would appear that one can do all sorts of weird things with those ai. Small point, then, so I'll delete it later.

Now the data points (the "Yi" are all correct values for whatever function (if any) fits them. The Yi don't lie... Really? ...

Turns out I've got that pretty back-to-front. No. The data has a certain amount of error. The correct value is faith-based. We choose to believe that without the experimental error, those raggedy data points would've landed exactly on some nice smooth curve with a closed form solution. I'll skip the philosophical aspects of that. Also it doesn't make a huge practical difference if you make this mistake.

This means that for any of these incorrect Yi that the (true) smooth polynomial curve doesn't go through, the polynomial is in error by some function of the distance between the incorrect Yi and the correct yi above the "data-x" for that Y.

The error in simple terms is Yi - yi. However if you minimise the square of that error you still have all the ai you want to solve for in the resulting equations, and the squares just behave more nicely than do the simple errors. (I'm falling asleep here, so this aside about the squares must die in the end, too).

So we have N data points, N errors.

Now obviously the best curve will have the least error, right? (And when the square of that error is at a minimum, so is the error). We minimise by taking partials of the sum of all errors - one partial per parameter.

The number of unknown a's is n+1. Each of the n powers x^1 up to x^n has an a, and one must not forget that a0 (the x^0 term) adds one more.

We know the x's. It's "x-the-well-known". We have n+1 ai here. Those are the unknowns.

The matrix is square because, firstly, there are n+1 partials (one for every a), so n+1 equations. And secondly, there are n+1 ai in every "sigma". (You'll take the partial of the whole sum with respect to each ai.)

The original system was overdetermined. Solve for the ai using partials, and you get a nice square matrix. Actually one of the motivations for producing this system is precisely this: Take an awkward overdetermined system. Find some function on it that outputs a nice square system.
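
One way to see that shrink numerically (made-up data again; Octave's backslash on a rectangular system happens to return the least-squares answer, which makes a handy cross-check):

x = [0 0.5 1 1.5 2 2.5]';
Y = [1.02 0.58 0.35 0.27 0.35 0.56]';   % invented data

A = [ones(size(x)) x x.^2];   % N x (n+1): the overdetermined system A*a ~ Y
size(A)                       % 6 x 3
size(A' * A)                  % 3 x 3: square, and ready for an exact solve

(A' * A) \ (A' * Y)           % solve the square normal equations
A \ Y                         % least-squares solve of the tall system -- same a's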

That should be neat enough for readability now? It's still a very imperfect summary (much too long, far too many digressions), but it's getting better, innit?
Re: Least squares
March 22, 2011 01:36PM
Urp! I still have a lot of things completely back to front in that.

Never mind. That gives the rest of you something to get your critical teeth into, doesn't it? There's nothing like a good fundamental error for bringing clarification of the whole problem when it's revealed.
Re: Least squares
March 24, 2011 04:16PM
My goodness. Sorry for the lapse in dialog. I had to catch up with work the past few days. Car's been broken, etc too. Anyhow, let me read through your rants and ramblings tonight and give you some feedback. BTW that small change in the matrix made all the difference.
Re: Least squares
March 24, 2011 04:32PM
Ah, sheepapple! Good to see you!

You can safely ignore all but the last two postings if you like. I thought I'd work through a summary right from the rough. (Leave it open to criticism - always a good way of learning something).

Excellent. Look forward to hearing your comments. Please nitpick. :-) (Honestly. The idea is to understand as best as possible, not to have a nice tea party all the time).
Re: Least squares
March 25, 2011 10:41AM
Yeah, look, on the least squares side it all boils down to the equations supplied on page 204 and the matrix form given in 3.28.

Couple that with Table 3.14 and the accompanying worked example and it makes sense. The problem, though, is that like all this stuff it's quite computationally intensive and time-consuming. I ended up writing a Matlab program to help do some of the legwork.
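
Something along these lines will do that sort of legwork (a sketch only, with made-up data: building the page-204-style sums for a quadratic and handing the augmented matrix to rref, as earlier in the thread):

x = [0 0.5 1 1.5 2 2.5];                 % made-up data points
Y = [1.02 0.58 0.35 0.27 0.35 0.56];
n = 2;                                   % degree of the fitted polynomial

A = zeros(n+1, numel(x));
for j = 0:n
  A(j+1, :) = x .^ j;                    % row j+1 holds the x^j values
end
B = A * A';                              % the square, symmetric matrix of sums
c = A * Y';                              % right-hand side: sums of x^j .* Y

rref([B c])                              % last column gives a0, a1, ..., an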
Re: Least squares
March 25, 2011 10:41AM
By the way I started another topic regarding assignment 3, q1.