Welcome! Log In Create A New Profile

Advanced

Threading challenge

Posted by andrevh 
Announcements Last Post
Announcement myUnisa availability 21 to 24 March 2019 03/17/2019 02:24PM
Announcement SoC Curricula 09/30/2017 01:08PM
Announcement Demarcation or scoping of examinations and assessment 02/13/2017 07:59AM
Announcement School of Computing Short Learning Programmes 11/24/2014 08:37AM
Announcement Unisa contact information 07/28/2011 01:28PM
Threading challenge
November 05, 2007 01:19PM
So, are you bored and have nothing to after the exams.

Well here is something to use your newly aquired skills.

http://softwarecontests.intel.com/threadingchallenge/?cid=sw:multicore041
avatar Re: Threading challenge
November 05, 2007 02:26PM
Strangely enough, I routinely get the urge
to whack developers upside the head when
they use multi-threading.

Out of the hundred or so applications I've seen
that used multi-threading, I think only one resulted
in any sort of improvement from multiple threads.
In any modern OS, there is no need to multi-thread.

OTOH, if you really want to do parallel programming
then check out the Erlang programming language.
Re: Threading challenge
November 06, 2007 02:28PM
I'm sure it doesn't make much of a difference on a single core machine except if it's a GUI program. But now with multi core processors it's more relevant.
Re: Threading challenge
November 06, 2007 03:47PM
I think if you are an old unix fan, then you might prefer the forks()

As a Windows user, not fan, I prefer threads because you get more controls over threads, an it is easier to get those control than to use multiple processes.
The disadvantage of threads is, if one thread crash, everything is gone.
avatar Re: Threading challenge
November 21, 2007 11:05PM
goose Wrote:
-------------------------------------------------------
> Strangely enough, I routinely get the urge
> to whack developers upside the head when
> they use multi-threading.
>
> Out of the hundred or so applications I've seen
> that used multi-threading, I think only one
> resulted
> in any sort of improvement from multiple threads.
> In any modern OS, there is no need to
> multi-thread.

my renderer sees an almost exact 4x improvement when using 4 threads on my quadcore cpu. multicore revolution here i come!

> OTOH, if you really want to do parallel
> programming
> then check out the Erlang programming language.

java's multithreading support is really good too.
avatar Re: Threading challenge
November 22, 2007 06:57AM
lycium Wrote:
-------------------------------------------------------
> goose Wrote:
> --------------------------------------------------
> -----
> > Strangely enough, I routinely get the urge
> > to whack developers upside the head when
> > they use multi-threading.
> >
> > Out of the hundred or so applications I've seen
> > that used multi-threading, I think only one
> > resulted
> > in any sort of improvement from multiple
> threads.
> > In any modern OS, there is no need to
> > multi-thread.
>
> my renderer sees an almost exact 4x improvement
> when using 4 threads on my quadcore cpu. multicore
> revolution here i come!
>
<snipped>

Congratulations.
avatar Re: Threading challenge
November 22, 2007 07:13AM
jeanbodemer Wrote:
-------------------------------------------------------
> I'm sure it doesn't make much of a difference on a
> single core machine except if it's a GUI program.

I'm not too sure there either; the system call select()
(or equivalent) is much better suited to an event
handling framework than starting up different threads
to do different things.

> But now with multi core processors it's more
> relevant.

Which is why I mentioned Erlang; you scale up to 64
processor (not just cores, mind you) and you see a
suitable improvement. A threaded program won't scale
efficiently past a single processor.

(<to lycium as well>winking smiley
The reason for having multiple cores on a processor is
to enable better performance for multiple-threads - this
is why you see double the performance when doubling the
number of cores.

However, the number of cores per processor has a very
small number as a limit. Multiple threads running across
multiple processors run into the age-old shared-memory
problems.

Using an approach such as that adopted by Erlang rather
than threads means that that problem goes away, and the
program can be properly distributed across however many
cores *or* processors there are.

The problems with attempting to scale multi-threaded
applications become apparent when you attempt to
create a new threads implementation for a particular
OS. The point is moot at any rate - Lycium pointed
out an application of threads that makes sense, but I
find very few applications (which are multi-threaded)
that cannot be written more simply (and will run faster)
by redesigning the program to run in an event loop that
uses the system select() call instead.

In short, use multiple-threads the way you would use
GOTOs: if you find yourself using it, ask yourself what
is wrong with the faster-to-code and faster-to-run
solution. If you are writing the multi-threaded
equivalent of Duffs Device, then you can be reasonably
sure that you are doing the correct thing - OTOH if
you think "Well, I need to do these two things at the
same time", then you are probably going to be
incorrect.

(WARNING: the above is my opinion only - take with a
load of salt smile
Re: Threading challenge
November 22, 2007 08:21AM
So Goose

Am I correct if I say that you prefer single threaded processes but a lot of processes?
avatar Re: Threading challenge
November 22, 2007 09:30AM
andrevh Wrote:
-------------------------------------------------------
> So Goose
>
> Am I correct if I say that you prefer single
> threaded processes but a lot of processes?

Er, no ... I prefer that the simplest manner
suitable for the task be adopted. Attempting
to scale a MT program across many programs
is a difficult thing to do, and any bugs that
exist may be close to impossible to track down.

IIRC, this is why most SMP implementations
do not even attempt to distribute threads across
processors, and simply stick to distributing
full processes instead; the pain is not worth
the (sometimes negative) gain.

For example, rendering like Lycium loves doing
is a good match for multi-threading - the task
is not IO-bound but instead computation-bound,
so having more cores speeds up the computation.

OTOH, if you wanted to make sure that the
computation is always sped up when throwing
hardware at the problem, using separate
processes allows the distributed computation to
very nicely scale up when you throw a few dozen
more computers at the problem, although on a
single computer you are likely going to be running
slower than the MT approach.

For tasks that are IO-bound (Primarily user-io
(both text and GUI's), or file-servers (file-io
bound)) it does not make sense to handle each task
in a different thread, as the system provides a
perfectly good and simple method to do this ->
call the system select().

The reason I made my initial (somewhat rash and
stupid smile statement is because I very often see
MT approaches used for tasks that are bound by
IO, and not by the computation itself. An event
loop does not benefit from being run in a separate
thread anymore than streaming data from two
different URL's need two different threads[1].
Both tasks are easier to code, read and maintain
when using the facilities provided by the system.

Anyway, this was ... interesting, but I have to
go and pretend to do some work now (they may
pretend to give me a december bonus smile.

Footnotes:
[1] I may be wrong, but I was told Firefox very
stupidly starts a separate thread for concurrent
tcp requests; all this does is add overhead to
any computation that Firefox does without actually
speeding up the process (the requests are not bound
by the speed of the processor, but by the network
connection which is usually many tens of
thousands of times slower anyway). The concurrent
download can be done in a single execution thread,
and can still saturate the link anyway.

I can see a need for concurrent execution in FF
with the execution of extensions (extensions in a
browser should crash without taking the other
tabs/pages in the browser down). However this could
be done using separate processes for each extension,
thereby ensuring that one broken extension does not
crash the browser. If I were to guess, I would guess
that threads are used here again, as a bad extension
will, without a doubt, hang the browser for certain.
Re: Threading challenge
November 22, 2007 10:15AM
I think threads, especialy with incoming tcp sockets, makes code much easier to write and maintain.

I do not think speed is always the issue, but rather readable and maintainable, else we need to go back to assembler for pure speed.


When writing games, it seems that threads is a must, although I am sure you can do it in a single thread that will be very complicated to follow?


http://www.intel.com/cd/ids/developer/asmo-na/eng/264351.htm

Has anybody tried this game and completed ?

http://softwarecommunity.intel.com/gamecontest/EntryDetails.aspx?entryID=26
avatar Re: Threading challenge
November 22, 2007 12:06PM
I said:
-------------------------------------------------------
Er, no ... I prefer that the simplest manner
suitable for the task be adopted. Attempting
to scale a MT program across many programs
is a difficult thing to do, and any bugs that
exist may be close to impossible to track down.


You said:
-------------------------------------------------------
> I think threads, especialy with incoming tcp
> sockets, makes code much easier to write and
> maintain.
>
> I do not think speed is always the issue, but
> rather readable and maintainable, else we need to
> go back to assembler for pure speed.
>

It seems that we agree in principle; we disagree
about the actual implementations. I very rarely
care about execution speed when writing concurrent
execution code (I'd rather make the code simple
and let extra hardware take up the load in concurrent
executions).

I still maintain, like I did in my previous posts,
that threads add complexity and should not be used
everywhere that concurrency is desired - there are
simpler ways to manage pools of IO.


>
> When writing games, it seems that threads is a
> must, although I am sure you can do it in a single
> thread that will be very complicated to follow?

Not at all; next time you are tempted to use
threads to handle IO (like tcp connections),
try writing a state-machine around the select
system call.
avatar Re: Threading challenge
November 23, 2007 02:23AM
goose Wrote:
-------------------------------------------------------
> The reason for having multiple cores on a
> processor is to enable better performance for multiple-threads
> - this is why you see double the performance when
> doubling the number of cores.

thing is, it's not that simple, which is why getting almost exactly 4x speedup from 4 cores isn't too bad. no "congratulations" are necessary, however i think you might not be aware of the system-level issues which can mitigate the n-times-for-n-cores speedup...

> However, the number of cores per processor has a
> very small number as a limit. Multiple threads running
> across multiple processors run into the age-old
> shared-memory problems.

... this is one of them, but even if you have a nice computing problem which can execute more or less independently (so-called "embarassingly parallel"winking smiley, such as rendering, there are still problems such as limited memory bandwidth <insert long discussion about NUMA vs UMA architectures and how intel is late to the party in introducing their NUMA arch only next year>.

however, they can be overcome through careful algorithmic tuning and hardware support for SMT (intel will introduce a decent implementation to replace hyperthreading next year). ironically, one of the best solutions to limited memory bandwidth is just having more threads on tap; this is exactly what GPUs use to hide memory access latency, and they have literally thousands of threads in flight at once.

these solutions don't only apply to graphics (i've been a bit heavy on examples dealing with graphics because i have them handy), but the multithreading + memory subsystem picture is just as important in business applications: http://techreport.com/articles.x/13176/4 (amd's latest chips are generally outperformed by the intel chips, and in this case amd wins because of its superior system architecture).

> Using an approach such as that adopted by Erlang
> rather than threads means that that problem goes away,
> and the program can be properly distributed across however
> many cores *or* processors there are.

no, programming languages don't make the multithreading problem go away. there are inherently sequential algorithms which you just can't throw a language at and hope for a miracle...

> The problems with attempting to scale
> multi-threaded applications become apparent when you attempt to
> create a new threads implementation for a particular
> OS. The point is moot at any rate - Lycium pointed
> out an application of threads that makes sense,

and was "congratulated" for it, lol

> but I find very few applications (which are
> multi-threaded) that cannot be written more simply (and will run
> faster) by redesigning the program to run in an event loop
> that uses the system select() call instead.

often multithreading makes a lot of sense, even from a non-performance perspective. for example, suppose you have a program which needs to perform work in large batches to be efficient. you want to do that in a thread seperate from the user interface so that it can remain responsive to the system (my programs don't do this, and when you click on them etc you have a lag of several seconds until it responds...).

> In short, use multiple-threads the way you would
> use GOTOs:

omg

> (WARNING: the above is my opinion only - take with
> a load of salt smile


avatar Re: Threading challenge
November 23, 2007 02:37AM
andrevh Wrote:
-------------------------------------------------------
> I do not think speed is always the issue, but
> rather readable and maintainable, else we need to
> go back to assembler for pure speed.

btw, compilers beat humans essentially all the time these days, unless the coder has a good "plan of action" involving lots of tight register usage, bit tricks and cache management.

the problem is essentially that cpus have hundreds if not thousands of instructions in flight at any given moment, and it's essentially impossible for the programmer to have a mental model of execution (since execution is massively out-of-order and predicted to death by non-disclosed algorithms etc.). the compiler attacks this with complex heuristics, often designed by the processor manufacturers themselves, and combinatorial search algorithms in many many passes to determine which approach is fastest.

i'm no assembler expert, but i've coded plenty of tight asm loops back in the pentium days (and pentium2, when optimisation started to get hairy due to OOE); i do know some real asm experts and they all concede to letting the compiler optimise; the way to do low level programming these days is through the use of compiler intrinsics: http://www.codeproject.com/cpp/sseintro.asp

using intrinsics has a ton of great advantages:

1. you can code at a low level, dictating which instructions (which may be complex composite ones) to use and how memory fetching is done.
2. you don't have to worry about register allocation, which is a complex problem best suited to computer optimisation.
3. because you don't specify which registers to use, the compiler can produce highly optimised code for both 32bit and 64bit (twice the number of registers) platforms.
4. the compiler can work its scheduling magic to make it fast, targeting a particular processor architecture without any changes in your code.

in other words, it's fast and it's easy smiling smiley it's actually even a pleasure.

but i digress...
Re: Threading challenge
November 23, 2007 08:25AM
On the issue of writing in assembler


// Debug mode

i = 0;
0041138E mov dword ptr [i (417160h)],0
while (i < 10)
00411398 cmp dword ptr [i (417160h)],0Ah
0041139F jge wmain+40h (4113B0h)
i++;
004113A1 mov eax,dword ptr [i (417160h)]
004113A6 add eax,1
004113A9 mov dword ptr [i (417160h)],eax
004113AE jmp wmain+28h (411398h)
return 0;
004113B0 xor eax,eax
}


// Release mode

int _tmain(int argc, _TCHAR* argv[])
{

i = 0;
while (i < 10)
00401000 mov dword ptr [i (403374h)],0Ah
i++;
return 0;
0040100A xor eax,eax
}
0040100C ret


OK , I admit, the compiler would have beaten me, I would have stored i in a registry and increment till ten. The compiler just set it to 10 and finish the job.

But with careful thinking, you can still beat the compiler, with the final run time, although it is not realy worth it in time it takes to develope.

After all it is humans that wrote compilers.
It is like playing chess against a good chess computer. It is hard to win but you can.
avatar Re: Threading challenge
November 23, 2007 09:16AM
andrevh Wrote:
-------------------------------------------------------
> But with careful thinking, you can still beat the
> compiler, with the final run time, although it is
> not realy worth it in time it takes to develope.

hit the nail on the head there. it requires a truly herculian effort these days, and a lot of luck even (it may be by coincidence that a particular instruction sequence is faster, because of complex architectural reasons).

an effective way to optimise performance critical sections of code is to see what the compiler produced and work with that rather than trying from scratch.

> After all it is humans that wrote compilers.

it's not always true that we're better than tools we make.

> It is like playing chess against a good chess
> computer. It is hard to win but you can.

the question is, are you prepared to write truly optimal code for many different combinations of instruction sets (mmx, sse, sse2, sse3, ssse3, altivec, ...) on many different platforms (x86, x86-64, power-pc)? compilers hit near-optimality on all those, just by tweaking compile options with the same source code.

i'm not an authority on assembly optimisation, but i have spoken to several people who are and the general consensus is that it's not worth writing assembly directly. intrinsics are the way to go when they can be used (eg. intel's simd instruction sets), but for general non-numerical code just don't bother.
avatar Re: Threading challenge
November 23, 2007 10:24AM
lycium Wrote:
-------------------------------------------------------
> goose Wrote:
> --------------------------------------------------
> -----
> > The reason for having multiple cores on a
> > processor is to enable better performance for
> multiple-threads
> > - this is why you see double the performance
> when
> > doubling the number of cores.
>
> thing is, it's not that simple, which is why
> getting almost exactly 4x speedup from 4 cores
> isn't too bad. no "congratulations" are necessary,
> however i think you might not be aware of the
> system-level issues which can mitigate the
> n-times-for-n-cores speedup...
>

I'm probably not aware of most things (ask my
wife smile. However, the problems in concurrency
are not exactly new and unresearched. Attempting
to fasten all concurrency pins using only a
hammer results in only the 'nail' concurrency
pins being fastened, if you get my drift.

The reason I responded was because the "hammer"
(threads) is all that most developers know about
WRT concurrency.

In fact, the replies to this thread have
convinced me that no one replying here have
even read up on alternatives to the shared
memory state model used by threads.

> > However, the number of cores per processor has
> a
> > very small number as a limit. Multiple threads
> running
> > across multiple processors run into the age-old
> > shared-memory problems.
>
> ... this is one of them, but even if you have a
> nice computing problem which can execute more or
> less independently (so-called "embarassingly
> parallel"winking smiley, such as rendering, there are still
> problems such as limited memory bandwidth .
>

Of course - there are trade-offs: you will reach a
break-even point where the gains from concurrent
execution are lower than the costs of consolidating
the results of the concurrency.

> however, they can be overcome through careful
> algorithmic tuning and hardware support for SMT
> (intel will introduce a decent implementation to
> replace hyperthreading next year). ironically, one
> of the best solutions to limited memory bandwidth
> is just having more threads on tap; this is
> exactly what GPUs use to hide memory access
> latency, and they have literally thousands of
> threads in flight at once.
>

Hardware isn't my place to judge - I generally do not
care. What I have noticed is that hardware generally
follows what the developer asks for; we got multi-cores
and hyperthreading merely because there is a significant
demand for that from developers.

Developers, OTOH, are asking for better thread support
in hardware merely because they are not fully informed.

> these solutions don't only apply to graphics (i've
> been a bit heavy on examples dealing with graphics
> because i have them handy), but the multithreading
> + memory subsystem picture is just as important in
> business applications:
> http://techreport.com/articles.x/13176/4 (amd's
> latest chips are generally outperformed by the
> intel chips, and in this case amd wins because of
> its superior system architecture).
>
> > Using an approach such as that adopted by
> Erlang
> > rather than threads means that that problem goes
> away,
> > and the program can be properly distributed
> across however
> > many cores *or* processors there are.
>
> no, programming languages don't make the
> multithreading problem go away.

In this case, yes it does. Erlang does not
have threads, hence Erlang does not have
any of the MT problems. Certainly some concurrency
problems that are not specific to threads will
remain, but it is certain that the problems
inherent in MT will go away.

> there are
> inherently sequential algorithms which you just
> can't throw a language at and hope for a
> miracle...
>

You are confusing multi-threading and concurrency,
which is most certainly not the same thing. Using
threads brings many problems - using a different
approach to concurrency makes problems that are
special to MT go away.

For sure, sequential algorithms will remain
sequential ... however many algorithms written
in a certain style (think "functional programming"winking smiley
can be parallel-ised (is that even a word?) without
the programmer knowing about it.


> > The problems with attempting to scale
> > multi-threaded applications become apparent when
> you attempt to
> > create a new threads implementation for a
> particular
> > OS. The point is moot at any rate - Lycium
> pointed
> > out an application of threads that makes sense,
>
> and was "congratulated" for it, lol
>

Sorry - I did not mean to sound condescending; I
was congratulating you on using threads in a manner
that makes sense.

> > but I find very few applications (which are
> > multi-threaded) that cannot be written more
> simply (and will run
> > faster) by redesigning the program to run in an
> event loop
> > that uses the system select() call instead.
>
> often multithreading makes a lot of sense, even
> from a non-performance perspective.

No. Often *concurrency* makes a lot of sense; MT is
just one way of implementing concurrency.

> for example,
> suppose you have a program which needs to perform
> work in large batches to be efficient. you want to
> do that in a thread seperate from the user
> interface so that it can remain responsive to the
> system (my programs don't do this, and when you
> click on them etc you have a lag of several
> seconds until it responds...).

Once again I feel obliged to point out that you
are assuming that MT = Concurrency and Concurrent = MT.
In this specific example you are both correct and
incorrect; it is desirable to have concurrency in this
program, but I fail to see why this concurrency *must*
be implemented using threads.

>
> > In short, use multiple-threads the way you
> would
> > use GOTOs:
>
> omg

(lookup Duffs Device smile

>
> > (WARNING: the above is my opinion only - take
> with
> > a load of salt smile
>
>
> http://www.maps-of-austria.co.uk/images/salzburg-m
> ap.gif

Or a metric Kilo-ton, your choice smile
avatar Re: Threading challenge
November 23, 2007 10:32AM
lycium Wrote:
-------------------------------------------------------
> andrevh Wrote:
> --------------------------------------------------
> -----
> > I do not think speed is always the issue, but
> > rather readable and maintainable, else we need
> to
> > go back to assembler for pure speed.
>
> btw, compilers beat humans essentially all the
> time these days, unless the coder has a good "plan
> of action" involving lots of tight register usage,
> bit tricks and cache management.
>

Even then, I seriously doubt a human can beat a language
implementation that hosts it's own evaluator and profiler
*at compilation time*. In a language that does this,
even though the compiler may have to make several hundred
passes and take days to compile even small programs, I'm
pretty certain that it will *always* emit code that will
beat a human.

However you may not want to wait for 2 days after typing
"make" smile.

> the problem is essentially that cpus have hundreds
> if not thousands of instructions in flight at any
> given moment, and it's essentially impossible for
> the programmer to have a mental model of execution
> (since execution is massively out-of-order and
> predicted to death by non-disclosed algorithms
> etc.). the compiler attacks this with complex
> heuristics, often designed by the processor
> manufacturers themselves, and combinatorial search
> algorithms in many many passes to determine which
> approach is fastest.
>

Complex heuristics help, but may not even be required.
Using Churches thesis as an axiom, we can eventually
infer that the compiler will *always* emit code better
than or equal to the code written by a human, no matter
how smart the human or how slow the computer.

(ps. I'm too lazy to do the proof know, but if you
really must know, say so and I'll post a pretty
convincing informal proof on Monday.)

<snipped>
avatar Re: Threading challenge
November 23, 2007 10:44AM
andrevh Wrote:
-------------------------------------------------------
> On the issue of writing in assembler
>
>
> // Debug mode
>
> i = 0;
> 0041138E mov dword ptr ,0
> while (i < 10)
> 00411398 cmp dword ptr ,0Ah
> 0041139F jge wmain+40h (4113B0h)
> i++;
> 004113A1 mov eax,dword ptr
> 004113A6 add eax,1
> 004113A9 mov dword ptr ,eax
> 004113AE jmp wmain+28h (411398h)
> return 0;
> 004113B0 xor eax,eax
> }
>
>
> // Release mode
>
> int _tmain(int argc, _TCHAR* argv[])
> {
>
> i = 0;
> while (i < 10)
> 00401000 mov dword ptr ,0Ah
> i++;
> return 0;
> 0040100A xor eax,eax
> }
> 0040100C ret
>
>
> OK , I admit, the compiler would have beaten me,
> I would have stored i in a registry and increment
> till ten. The compiler just set it to 10 and
> finish the job.
>
> But with careful thinking, you can still beat the
> compiler,

Actually you can never beat a compiler[1] at this job,
only match it, which makes low-level programming
pointless[1].

OTOH, if you have a stupid compiler, instead of ignoring
the stupid compiler and writing the optimal code in assembler,
you may as well use the stupid compiler to write a smart
compiler, and then use your smart compiler to generate the
code.

For example, let's say your compiler will not optimise
a certain sequence of operations (call them 'foo'); instead
of inserting assembly instructions within the program, you
use stupid-compiler to write smart-compiler which reads in
your program code, and outputs the program exactly as is with the
exception of replacing 'foo' with the correct assembly.

Then you run the command like this:
smart-compiler input-file.src | stupid-compiler


Note:
[1] Assuming that a compiler is developed to the
very limits of what is computable.

> with the final run time, although it is
> not realy worth it in time it takes to develope.
>
> After all it is humans that wrote compilers.

Which is why we can never best them, only match
them. If you haven't already done so, do COS 301
in 2008.

> It is like playing chess against a good chess
> computer. It is hard to win but you can.

It is nothing like that at all, however in the game
of chess it is provably possible to write a chess engine
that never loses against a human. Which means that we
can match them, but never best them.
avatar Re: Threading challenge
November 23, 2007 10:53AM
goose Wrote:
-------------------------------------------------------
> Attempting
> to fasten all concurrency pins using only a
> hammer results in only the 'nail' concurrency
> pins being fastened, if you get my drift.

yes. i interpreted your comment rather negatively, something like "all this newfangled threading stuff must go away", and am pleased to see that was my judgement error not yours smiling smiley

> we got multi-cores and hyperthreading merely
> because there is a significant demand for
> that from developers.

this, however, isn't true. we are going down the multicore route because our gigaherz "free lunch" is over.

suppose one cpu vendor went dualcore, and another one went with 2x the clockspeed. obviously the one selling clockspeed would utterly dominate in a market comprised mostly of singlethreaded programs and lazy programmers who don't like fixing indeterminstic bugs...

i read this article when it came out and remembered saying to myself, "learn as much as you can about multithreaded programming and your future as a programmer is assured."

> Developers, OTOH, are asking for better thread
> support in hardware merely because they are not fully
> informed.

that's a bit unfair, a little hardware support goes a LONG way to helping concurrency issues, look at atomic compare and swap instructions for example.

intel's hyperthreading sucked, but word on the street is that they'll be doing it right (among other things like p2p interconnect and ondie memory controllers) for nehalem, and that will indeed help with the multicore bandwidth problem. at that stage amd will be in even deeper shit than they are now, if they're even around next year...

> In this case, yes it does. Erlang does not
> have threads, hence Erlang does not have
> any of the MT problems. Certainly some
> concurrency
> problems that are not specific to threads will
> remain, but it is certain that the problems
> inherent in MT will go away.

erlang is hardly unique in its ability to not multithread winking smiley

threading can be complex, and almost always needs tuning so automatic approaches can't compete with careful implementations. this kind of tradeoff can already be seen with openmp vs os-native implementations, where the simplicity vs efficiency picture is less extreme than it would be if the programmer had zero control over concurrent execution.

> You are confusing multi-threading and concurrency,

threading is the most lightweight approach to shared-memory concurrency; are you promoting split processes and message passing?!

> Using threads brings many problems

sure, but we use it for its benefits, not its problems winking smiley

> For sure, sequential algorithms will remain
> sequential ... however many algorithms written
> in a certain style (think "functional
> programming"winking smiley can be parallel-ised (is that even a word?)
> without the programmer knowing about it.

i'd love some references on this, it seems like an incredibly difficult compiler problem!

> No. Often *concurrency* makes a lot of sense; MT
> is just one way of implementing concurrency.

agreed. the thing is, it's the smallest unit of execution, so it's in most cases the best choice, so programmers have to "suck it up" if they want to tap the benefits on offer.

this is hardly new btw; not long ago the programmers who could master the harshness of assembly language found themselves in a better position than those who couldn't. nowadays, that's exactly the case with mulithreaded programming.

> Once again I feel obliged to point out that you
> are assuming that MT = Concurrency and Concurrent
> = MT.

most often multithreading is the way to implement concurrency, so i tend to use the term interchangibly from having discussed it in practical situations and almost never in broad academic terms.

> In this specific example you are both correct and
> incorrect; it is desirable to have concurrency in
> this program, but I fail to see why this concurrency
> *must* be implemented using threads.

of course it doesn't, and at the highest levels one resorts to the "tightly coupled network" model, which is indeed where multicore processors are heading.

however, right now it's the concurrency primitive of choice when using shared memory.

> (lookup Duffs Device smile

i've heard of it, and seen some example code, but what does a scary sequential code construct have to do with multiple threads?

both should be used with caution?! tongue sticking out smiley

> Or a metric Kilo-ton, your choice smile

sorry, i couldn't resist smiling smiley i've been there once, and licked the walls to prove to myself that it really is made of salt. then i considered how many other people may have been equally unconvinced and did the same... but i digress...
avatar Re: Threading challenge
November 23, 2007 10:57AM
about beating a computer at chess, i'd like to take a moment to pimp the best board game on earth. it has the dual distinctions of being the oldest game on earth still played in its original form, and of being the only game at which humans are still vastly superior (9 year old children can beat any go computer/program). it's called go, and it completely dwarfs chess in game complexity (and elegance). moreover, it's "fuzzy" in the sense that it's difficult to quanitfy the state of the board - something that humans are very good at doing.

please, ignore everything i've written and learn this game now: http://playgo.to/interactive

it, together with mathematics and arguably physics, is one of the few perfect things in this universe.
avatar Re: Threading challenge
November 23, 2007 11:08AM
quick reply to goose's last huge post (sorry, my last post was serious enough).

compilers are very, very good: yes, really they are. i have nothing but a bunch of conversations with experts and web links to show you; please know that i have no personal investment in the truth of this assertion (i.e. i do not own stock in intel), it's just a consistent thing i've found in my own experiments and from others. look around online for gcc 4.2 and especially 4.3 kicking the snot out of programmers everywhere, especially in 64bit mode as its register allocator isn't great.


the proof: that's way too academic for me to read during holidays winking smiley
Re: Threading challenge
November 23, 2007 11:16AM
avatar Re: Threading challenge
November 23, 2007 11:18AM
ehm that article is more than a year old (june 2006), 4x4 was a complete failure and essentially didn't happen. even its sequel, FASN8, was a flop.
Re: Threading challenge
November 23, 2007 11:37AM
avatar Re: Threading challenge
November 23, 2007 11:43AM
lycium Wrote:
-------------------------------------------------------
> about beating a computer at chess, i'd like to
> take a moment to pimp the best board game on
> earth. it has the dual distinctions of being the
> oldest game on earth still played in its original
> form, and of being the only game at which humans
> are still vastly superior (9 year old children can
> beat any go computer/program). it's called go, and
> it completely dwarfs chess in game complexity (and
> elegance). moreover, it's "fuzzy" in the sense
> that it's difficult to quanitfy the state of the
> board - something that humans are very good at
> doing.
>
> please, ignore everything i've written and learn
> this game now: http://playgo.to/interactive
>
> it, together with mathematics and arguably
> physics, is one of the few perfect things in this
> universe.

Which is why I wrote:
-----------------------
It is nothing like that at all, however in the game
of chess it is provably possible to write a chess engine
that never loses against a human.
-----------------------

However, the difficulties in writing a go engine
is in the decision-space (too big). Since it is
possible to build a TM that exhausts the decision space
in Go, it is theoretically possible to write a Go
engine on a computer - the rot sets in when we find
that out computer does not have an infinite memory
like the TM ...

... Unless I was horribly mis-informed about Go smile
avatar Re: Threading challenge
November 23, 2007 11:53AM
> > it, together with mathematics and arguably
> > physics, is one of the few perfect things in
> > this universe.

just to back this up a bit, a quote by the famous chess player and mathematician edward lasker:

Quote

"While the Baroque rules of Chess could only have been created by humans, the rules of Go are so elegant, organic, and rigorously logical that if intelligent life forms exist elsewhere in the universe, they almost certainly play Go."

goose Wrote:
-------------------------------------------------------
> It is nothing like that at all, however in the
> game of chess it is provably possible to write a chess
> engine that never loses against a human.

and i certainly agree, even if i didn't pos to say so smiling smiley

> However, the difficulties in writing a go engine
> is in the decision-space (too big).

nope, it's in scoring too. classifying a group of stones as "alive", "dead" or "dunno" is a hugely fuzzy thing, as is quantifying feely things like "thickness", "heaviness", "slowness", aggression, ... a single number cannot even begin to quantify the worth of a particular play, it has many dimensions.

if it were just about the decision space then standard methods would already work quite well, but at this stage the very best computer programs are at the level of your beginning club player.

> Unless I was horribly mis-informed about Go smile

the best thing that could ever result of our discussions here on osprey is if you developed an interest in go. at the very least, it has done wonders for my poor memory over the years (in fact it is currently the best treatment for people who've had strokes).

go is SO unbelievably rich, stupefyingly so if you consider that you can learn its few rules in under 5 minutes. in playing this wonderfully abstract game you'll see many analogies applicable to life, learn deep proverbs and eastern philosophy/history (comes with being over 4000 years old), and meet many interesting people smiling smiley

one last time, and sorry for the zealous activism: http://playgo.to/interactive

there's also a really excellent online server where people are happy to teach (for free!), if you ever check it out let me know, i'm also lycium on that server: http://www.gokgs.com/index.xhtml
avatar Re: Threading challenge (Very Long)
November 23, 2007 12:04PM
Okay, I finally found what I was looking for; I'm not
going to post the authors name and credentials, but
bear in mind that *I* am not the author of this particular
piece of wisdom.

I figure that anyone still awake at this point will
find this interesting. Also note that this has some
bearing on a phorum for a module called "advanced programming".

------------------------------------------------------------------
If you program using strictly functional programming, you can not only verify that your code is 100% secure, but you can even automate the process. (Preferably in a functional programming language such as Scheme, caml, Haskel, LISP, or Erlang; imperative languages make it very difficult/slow to do with functions what functional languages do very naturally and easily.) Purely functional code can be subjected to automated code auditing easily, whereas code auditing imperative code cannot be guaranteed to catch every bug and unintentionally available abuse.

Here's why, and why just about any computational problem can be solved using FP (functional programming):
Functional languages conform to lambda calculus, which has been shown to be Turing equivalent, which means that any program that can be computed on a Turing machine can be solved using Lambda calculus. So long as you program using strictly functions, your program can be verified according to the rules of lambda calculus, and the verification would be as sure as a mathematical proof. This is the only sure way I know of really knowing with mathematical certainty that your application is secure.

Pure functional programming has no assignment statements; there are no state changes for you to keep track of in your program, and in many cases abuses resulting unintended changes of state are the root of security problems. This is not to say that there is no state in functional programming; the state is maintained through function call parameters. (For example, in an imperative programming language, iteration loops keep track of a state variable that guides the running of the loop, whereas a functional program never actually keeps track of state with a variable that changes value; a functional program would carry out iteration by recursion, and the state is simply kept as a parameter passed to each call of the function. No variable with changing state is ever coded.)

Since functional programs lack assignment statements, and assignment statements make up a large fraction of the code in imperative programs, functional programs tend to be a lot shorter for the same problem solved. (I can't give you a hard ratio, but depending on the problem, your code can be up to 90% shorter when described functionally.) Shorter code is easier to debug, which helps in securing code. The reason functional code is so much shorter is that functional programing describes the problem in terms of functions and composition of functions, whereas imperative code describes a step by step solution to the problem. Descriptions of problems in terms of functions tend to be far shorter than algorithmic descriptions of solving them, which is required in imperative code.

Here's the biggest benefit of managing complexity with functional programming: as a coder, you NEVER have to worry about state being messed with. The outcome of each function is always the same so long as the function is called with the same parameters. In imperative programming as done in OOP, you can't depend on that. Unit testing each part doesn't guarantee that your code is bug free and secure because bugs can arise from the interaction of the parts even if every part is tested and passed. In functional programming, however, you never have to deal with that kind of problem because if you test that the range of each function is correct given the proper domain, and pre-screen the parameters being passed to each function to reject any out-of-domain parameters, you can know with certainty where your bugs come from by unit testing each function.

If you need to guarantee the order of evaluation (something that critics of FP advocates sometimes use to dismiss FP advocacy), you can still use FP and benefit: in functional programming, order of evaluation can be enforced using monads. Explaining how is beyond the scope of a mere comment though, but in any case, if you need really reliable code, consider using a functional programming style.

I can't do justice to the matter here; for more information, see this:
http://www.defmacro.org/ramblings/fp.html/ [defmacro.org]
A quote from the article I linked above, to whet your interest:

A functional program is ready for concurrency without any further modifications. You never have to worry about deadlocks and race conditions because you don't need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought!

If this is the case, why doesn't anybody use functional programs for highly concurrent applications? Well, it turns out that they do. Ericsson designed a functional language called Erlang for use in its highly tolerant and scalable telecommunication switches. Many others recognized the benefits provided by Erlang and started using it. We're talking about telecommunication and traffic control systems that are far more scalable and reliable than typical systems designed on Wall Street. Actually, Erlang systems are not scalable and reliable. Java systems are. Erlang systems are simply rock solid.

The concurrency story doesn't stop here. . .

There are times where you cannot simply use FP because your code must interface with other code that does care about state. In those cases, use functional programming as much as possible, and limit the parts that use imperative code and state to a minimum. That way, you limit the places where your code can have state related problems, and improve your chances of securing the code, since there is far less of it for you to sift through and secure.

If this has caught your curiosity, I urge you to read the link above. As for what functional language I advocate, my personal favorite is Scheme, because of its simplicity, but if you want to do OOP, CLOS, OCaml, and various Scheme object systems can implement just about any object based program you need.

------------------------------------------------------------------
avatar Re: Threading challenge
November 23, 2007 12:13PM
Sorry, only registered users may post in this forum.

Click here to login