comp.lang.ruby

Re: Seven new VMs, all in a row

flaig

4/12/2005 9:44:00 AM

On Monday, 11 April 2005 at 17:02, ruby-talk-admin@ruby-lang.org wrote:
> flaig@sanctacaris.net wrote:
> > I think the "wal-mart argument" is quite an important one.
>
> I'm not sure exactly what the "wal-mart argument" is.  Wal-Mart can be
> seen as a big U.S. conglomerate that moves into a town and drives all
> the mom-and-pop stores out of business, eventually drying up downtown
> business districts.  Or it can be seen as a big discount retailer that
> provides cheap imported goods using its massive warehousing and
> distribution networks, while undercutting domestic manufacturers.

I was alluding to someone's dictum that even Wal-Mart is selling multiprocessor systems nowadays, with the implication that hardware is growing trashier every day :-) . The philosophy of increasing processing power simply by raising clock speeds seems to have reached its limits. Any further advance will require an intelligent combination of hardware and software, one that even Wal-Mart customers can handle ;-) .

> Obviously, I'm not a big Wal-Mart fan, but maybe their brutal retail
> success strategy has lessons for Ruby?  :)

Neither am I, but still I like your expression "undercutting domestic manufacturers" (= low-level programmers)!

> > Apart from explicitly creating threads, it would be nice if
> > the Ruby system could be taught to automatically recognize
> > parallelizable code and optimally distribute it across a
> > multiprocessor system -- implicitly. That would be a big
> > advantage for high-level programming in general! I do not know
> > the state of the art in this, I only remember that the
> > Atari/Inmos guys failed to do this in Occam, back in the 1980s.
> > Do you think there is a serious chance to get such a thing working?
>
> The only programming environment I'm familiar with where somebody
> implemented automatic parallel optimization is Fortran (although I'm
> sure there are others).  Fortran's branching and memory models are
> constrained enough to allow for some clever analysis.  Loops where each
> iteration has no impact on the next can be discovered and converted into
> short-term fine-grained parallel execution.  In that case, the original
> code has no concept of threading, it just runs faster during the inner
> loops.
>
> None of that would carry over to a thread-aware language with a dynamic
> type system.

Do you really think so?
Fortran has a pretty simple enumerative loop which can be optimized for parallelization, provided your compiler is smart enough.
Higher-level languages, by contrast, contain (or at least may contain) structures such as MAP which tell the compiler/interpreter: "This refers to an entire block of data" and may be extended to: "So distribute the workload as you think fit." This would not even require any analysis but just a fistful of code in the part of the compiler that handles the respective statement.
Of course, it might break backward compatibility as it does away with the tacit assumption that the iterations are executed in any guaranteed order...
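The MAP idea above can be sketched in plain Ruby, with explicit threads standing in for the compiler's implicit distribution (a toy illustration, not actual compiler support; `pmap` is an invented name):

```ruby
# Toy sketch of a "parallel map": each iteration runs in its own thread.
# The *results* keep their original order, but the *iterations* no longer
# run in any guaranteed order -- exactly the compatibility point above.
module Enumerable
  def pmap(&block)                                   # hypothetical name
    map { |e| Thread.new { block.call(e) } }.map(&:value)
  end
end

squares = (1..4).to_a.pmap { |n| n * n }
# squares == [1, 4, 9, 16] regardless of which thread finished first
```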

> Ilmari Heikkinen wrote:
> > Re: green threads vs native threads, if a green threads implementation
> > is 30 times faster, that's like having 29 extra cpus, no?
>
> "You can't get blood from a stone."  The only thing that is "like having
> 29 extra cpus" is actually having 29 extra CPUs. :)

You can, if and only if ("iff") the net workload of each thread is negligible compared to the administrative overhead. But in that case the whole affair is probably not very demanding in terms of CPU power, so there is little point in using a multiprocessor system at all. If, on the other hand, your threads have to do really heavy-duty work, there is no real gain in optimizing the thread model, so what's the buzz...
For bioinformatics work, I'd opt for the 29 extra cpus.

-- Ruediger Marcus



--
Chevalier Dr Dr Ruediger Marcus Flaig
Institute for Immunology
University of Heidelberg
INF 305, D-69121 Heidelberg

"Drain you of your sanity,
Face the Thing That Should Not Be."





2 Answers

Luke Graham

4/13/2005 1:06:00 AM


On 4/12/05, flaig@sanctacaris.net <flaig@sanctacaris.net> wrote:
> On Monday, 11 April 2005 at 17:02, ruby-talk-admin@ruby-lang.org wrote:
> > flaig@sanctacaris.net wrote:

> > > Apart from explicitly creating threads, it would be nice if
> > > the Ruby system could be taught to automatically recognize
> > > parallelizable code and optimally distribute it across a
> > > multiprocessor system -- implicitly. That would be a big
> > > advantage for high-level programming in general! I do not know
> > > the state of the art in this, I only remember that the
> > > Atari/Inmos guys failed to do this in Occam, back in the 1980s.
> > > Do you think there is a serious chance to get such a thing working?
> >
> > The only programming environment I'm familiar with where somebody
> > implemented automatic parallel optimization is Fortran (although I'm
> > sure there are others).  Fortran's branching and memory models are
> > constrained enough to allow for some clever analysis.  Loops where each
> > iteration has no impact on the next can be discovered and converted into
> > short-term fine-grained parallel execution.  In that case, the original
> > code has no concept of threading, it just runs faster during the inner
> > loops.
> >
> > None of that would carry over to a thread-aware language with a dynamic
> > type system.
>
> Do you really think so?
> Fortran has a pretty simple enumerative loop which can be optimized for parallelization, provided your compiler is smart enough.
> Higher-level languages, by contrast, contain (or at least may contain) structures such as MAP which tell the compiler/interpreter: "This refers to an entire block of data" and may be extended to: "So distribute the workload as you think fit." This would not even require any analysis but just a fistful of code in the part of the compiler that handles the respective statement.
> Of course, it might break backward compatibility as it does away with the tacit assumption that the iterations are executed in any guaranteed order...

Parallel programming, both local and distributed, is one of the great
research topics of this decade. Some good google keywords - orca,
amoeba, erlang. For things more complicated than a do-loop, it still
takes a human to break the problem into message-passing or an
equivalent. There are plenty of extensions to C/Fortran/whatever to
help in these things.
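In Ruby terms, that human decomposition into message passing might look like independent work items flowing over queues (a minimal sketch using threads and Queue; the squaring "work" is arbitrary):

```ruby
# Minimal message-passing sketch: independent work items go to a pool of
# workers over one queue, results come back over another. Deciding that
# the items really are independent is the part still left to the human.
jobs    = Queue.new
results = Queue.new

workers = 4.times.map do
  Thread.new do
    while (n = jobs.pop) != :done
      results << n * n          # the "work" for this sketch: squaring
    end
  end
end

(1..8).each { |n| jobs << n }
4.times { jobs << :done }       # one poison pill per worker
workers.each(&:join)

total = 0
total += results.pop until results.empty?
# total is 1**2 + 2**2 + ... + 8**2 = 204
```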

--
spooq



Glenn Parker

4/13/2005 1:54:00 AM


flaig@sanctacaris.net wrote:
>
> Higher-level languages, by contrast, contain (or at least may contain)
> structures such as MAP which tell the compiler/interpreter: "This
> refers to an entire block of data" and may be extended to: "So distribute
> the workload as you think fit." This would not even require any analysis
> but just a fistful of code in the part of the compiler that handles the
> respective statement.

Would that it were this simple. A "map" function operates on a list,
but what are the potential relationships between members of any list?
In most cases it's very hard to be sure. And if you can't prove
mathematically, at compile (or even execution) time, that there are zero
interactions between the list members, then your compiler had best not
inject any threaded code.
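A concrete Ruby case of such an interaction: when the block closes over mutable state, the list members do interact, and the result is only correct under strictly sequential execution (a deliberately unsafe example):

```ruby
# Each iteration reads and writes `running`, so iteration k depends on
# iteration k-1. Run these block calls in parallel and the prefix sums
# come out wrong -- precisely the interaction a compiler must prove
# absent before injecting threaded code.
running = 0
prefix_sums = [1, 2, 3, 4].map { |n| running += n }
# sequential execution yields [1, 3, 6, 10]
```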

>>"You can't get blood from a stone." The only thing that is "like
>> having 29 extra cpus" is actually having 29 extra CPUs. :)
>
> You can, if and only if ("iff") the net workload of the thread is
> negligible when compared to the administrative overhead, ...

I read this as saying: "if your threading overhead is so ridiculous that
it consumes 29/30ths of your CPU time, then you are better off not using
threads." This is certainly true, but it misses the point. Good
threaded software design minimizes threading overhead, thus a
(potential) 30x speedup indicates a poor design, not a poor threading
system.
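The arithmetic behind that reading, under the (generous) assumption that the whole 30x gap is administrative overhead:

```ruby
# If a fraction f of each thread's time is overhead, eliminating the
# overhead speeds things up by 1 / (1 - f). A 30x speedup therefore
# implies f = 29/30: the threading machinery, not the actual work, was
# consuming almost all of the CPU time.
def speedup_from_removing_overhead(f)
  1.0 / (1.0 - f)
end

speedup_from_removing_overhead(29.0 / 30.0)  # ~30
```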

> For bioinformatics work, I'd opt for the 29 extra cpus.

Of course, but can you get 30x work out of them? I suspect that in
bioinformatics, many of the "big" problems are easily isolated into
non-communicating work units. I'm running the World Community Grid
client, which is currently cranking on the Human Proteome Folding
Project. My computer, along with thousands more, handles its job with
relatively little coordination from the central work server at IBM. For
this system, threading would be a waste.

But there are other problems, not purely compute-bound, that require more
intricate and fine-grained scheduling to achieve decent scaling.

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoi...