Asp Forum - Banging around a few threads - microsoft.public.vb.general.discussion

Karl E. Peterson

2/9/2012 7:37:00 PM

I'm fishing for thoughts and/or ideas here. I have a pretty complex
set of calculations that I'd really like to spread out amongst a few
cores. The general structure of the situation is like this...

Everything's currently contained within a single console mode
application. Output is directed to both the console window and a log
file.

The preliminary phase takes about 4 minutes to calculate and write to
disk approximately 4GB of disposable working data.

This is currently followed by seven sequential processing phases that
use this working data to produce various results based on different
models. Each of these phases can take between 3 and 12 minutes! In
addition to CPU, they are obviously very disk i/o intensive.

Total model run (prelim + 7) takes about an hour, run sequentially.

So it seems to me the logical direction to take it is to perform that
preliminary phase, then spawn off the seven other phases in parallel,
monitoring each, and deleting the working data upon completion of the
last.

I'm not opposed to each parallel phase being its own process. And I
wonder if that might be not only the simplest, but effectively darn
near the most elegant, approach to take? (This is my current
inclination.)

If threads seem to anyone to be the better way to go, I'd like to hear
why? And by what mechanism?

To me, the big issues are:

* I need a console for each phase,
* The console would ideally remain open after completion, to be
closed by the user after review,
* the main "controller" needs to know when all is done, necessitating
a series of signals from the workers.

If it matters, the whole thing will ultimately be initiated through a
batch file. And the code is currently segregated so that any of the
seven calc phases can be triggered from command line switches. (I'd
just need to get the part about the prelim data precalc'd.)
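
For what it's worth, the controller side of that plan can be sketched in a few lines. This is Python rather than VB6, purely to show the shape. `run_phases` and the stand-in worker commands are hypothetical names, not anything from the real model code. It launches every phase, blocks until each process has exited, and only then deletes the working data.

```python
import shutil
import subprocess
import sys
import tempfile


def run_phases(commands, work_dir):
    """Launch every command in parallel, wait for all, then delete work_dir.

    Returns the exit codes in the same order as `commands`.
    """
    # Start all workers first so they run concurrently.
    workers = [subprocess.Popen(cmd) for cmd in commands]
    # wait() blocks until that worker exits; after the loop, all are done.
    exit_codes = [w.wait() for w in workers]
    # The working data is disposable, so remove it once nobody needs it.
    shutil.rmtree(work_dir, ignore_errors=True)
    return exit_codes


if __name__ == "__main__":
    # Stand-in workers (hypothetical): each one simply exits with code 0.
    work = tempfile.mkdtemp(prefix="prelim_")
    cmds = [[sys.executable, "-c", "pass"] for _ in range(7)]
    print(run_phases(cmds, work))
```

Waiting on the child process itself doubles as the completion signal, so in this arrangement the workers would not need to send anything back to the controller.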

I guess the idea of threads just seems *oh so slightly* less fragile
when it comes to knowing about the completion of each phase.

Thoughts?

--
.NET: It's About Trust!
http://vfre...

28 Answers

Tom Shelton

2/9/2012 9:10:00 PM

Karl E. Peterson formulated the question :
> Thoughts?

In VB6? I would most likely use multiple processes and sync them using
a semaphore or other process sync method. Just my initial thought :)

--
Tom Shelton

Karl E. Peterson

2/9/2012 9:18:00 PM

Tom Shelton wrote :
> Karl E. Peterson formulated the question :
>> Thoughts?
>
> In VB6?

Sorry, yeah, considered that implied. <g>

> I would most likely use multiple processes and sync them using a
> semaphore or other process sync method. Just my initial thought :)

Me too. All I see is "work" to be gained by pursuing threads, rather
than processes, here. Especially as it's all batch file driven.

--
.NET: It's About Trust!
http://vfre...

Jim Mack

2/9/2012 9:55:00 PM

> Tom Shelton wrote :
>> Karl E. Peterson formulated the question :
>>> Thoughts?
>>
>> In VB6?
>
> Sorry, yeah, considered that implied. <g>
>
>> I would most likely use multiple processes and sync them using a
>> semaphore or other process sync method. Just my initial thought :)
>
> Me too. All I see is "work" to be gained by pursuing threads, rather
> than processes, here. Especially as it's all batch file driven.

Processes don't run -- only threads run. So if you spawn separate
processes, you're just spawning threads with a tad more memory
overhead.

I wouldn't create more threads than you have cores, unless there's
significant idle time as the process is now.
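
A rough sketch of that cap, in Python for illustration only (not VB6). `run_capped` is a hypothetical helper. It assumes each phase is an independent command line and uses a small thread pool only to hold the waits, so the pool size is what bounds how many worker processes are alive at once.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_capped(commands, max_parallel=None):
    """Run commands as child processes, at most max_parallel at a time."""
    limit = max_parallel or os.cpu_count() or 1
    # Each pool thread just blocks on its child process, so the number of
    # pool threads limits how many workers run concurrently.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        codes = pool.map(lambda cmd: subprocess.run(cmd).returncode, commands)
        return list(codes)


if __name__ == "__main__":
    # Stand-in workers (hypothetical): seven trivial child processes.
    cmds = [[sys.executable, "-c", "pass"] for _ in range(7)]
    print(run_capped(cmds))
```

Leaving `max_parallel` unset falls back to the reported core count. Passing a smaller number would leave a core free for the user.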

--
Jim

BeeJ

2/9/2012 10:28:00 PM

Karl E. Peterson laid this down on his screen :
> Thoughts?

From this junior coding guy.
I have an app that starts multiple ActiveX EXEs.
AxEXE1 does intensive graphic screen work. I usually have four AxEXE1
running at a time.
AxEXE2 does movies (running at the same time as AxEXE1) multiple
instantiation. Had two of these and four of AxEXE1 running together.
AxEXE3 talks to USB device. AxEXE3 has time sensitive DAC step
outputs.
AxEXE4 talks to DMX device. This one is pretty relaxed compared to the
others.
AxEXE5 talks to X10 device. Kind of slow stuff here.
All run together.
They all talk to the main app at regular intervals: get new commands to
run and return data and status.

I was amazed that they all ran smoothly together on a dual core laptop.
I have not tried it on my quad core desktop yet.

Previously I tried running all methods in the main app and they all
stuttered and sputtered. Looked terrible.

The ActiveX EXEs really work.

What is amazing to me is that with all of that and only two cores it
was all smooth.

Once I got the hang of an ActiveX EXE I rapidly built them all.
Unfortunately I was never able to find an all encompassing book about
ActiveX EXE. Usually it is just a chapter in the middle of ActiveX
Controls so I found it difficult to get all the stuff together but now
I have and it is just too much fun. And thanks again to some in this
group that gave me valuable inputs.

Karl E. Peterson

2/9/2012 10:57:00 PM

Jim Mack explained :
>> Tom Shelton wrote :
>>> Karl E. Peterson formulated the question :
>>>> Thoughts?
>>>
>>> In VB6?
>>
>> Sorry, yeah, considered that implied. <g>
>>
>>> I would most likely use multiple processes and sync them using a semaphore
>>> or other process sync method. Just my initial thought :)
>>
>> Me too. All I see is "work" to be gained by pursuing threads, rather than
>> processes, here. Especially as it's all batch file driven.
>
> Processes don't run -- only threads run. So if you spawn separate processes,
> you're just spawning threads with a tad more memory overhead.

Semantics. <g> Yes, a tad more overhead. But, when viewed in the
context of 10 minutes of CPU time, not so much. Then if I compare that
"overhead" to what it takes (in terms of *my* time) to bring all the
threads in-process... Just not finding any justification (other than a
bit simpler EOT signaling?) for that effort.

And, I wonder, a single process can only attach to a single console,
no? That alone might make the design decision. Hmmmm.
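
On Windows a process can indeed be attached to at most one console, which is one reason separate processes fit here. A minimal sketch of giving each worker its own console follows, in Python for illustration only. `spawn_with_console` is a hypothetical helper, and the fallback to 0 exists only so the sketch still runs on non-Windows systems, where the flag does not exist.

```python
import subprocess
import sys

# CREATE_NEW_CONSOLE is only defined in the subprocess module on Windows;
# fall back to 0 (no special flags) so the sketch still runs elsewhere.
NEW_CONSOLE = getattr(subprocess, "CREATE_NEW_CONSOLE", 0)


def spawn_with_console(cmd):
    """Start cmd as a separate process with its own console window (Windows)."""
    return subprocess.Popen(cmd, creationflags=NEW_CONSOLE)


if __name__ == "__main__":
    # Stand-in worker (hypothetical): prints one line and exits.
    worker = spawn_with_console([sys.executable, "-c", "print('phase done')"])
    print(worker.wait())
```

One trade-off to keep in mind: a console window goes away when the last process attached to it exits. Keeping it open for review would therefore mean the worker (or a wrapper such as `cmd /k`) stays alive, and the controller could no longer treat process exit as the "done" signal.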

> I wouldn't create more threads than you have cores, unless there's
> significant idle time as the process is now.

Agreed. The target machines are mostly dual-quads, so seven ought to
leave the users one to "do email" and such. <g> I was also looking at
stacking a couple of the shorter ones to sort of even out the load-per.
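
The stacking idea can be sketched as a greedy "longest phase first" assignment. This is Python for illustration only. `stack_phases` and the minute estimates in the example are hypothetical, picked from the 3 to 12 minute range mentioned earlier. Each phase goes to whichever lane currently has the least work.

```python
import heapq


def stack_phases(durations, slots):
    """Greedy longest-first assignment of phases to `slots` parallel lanes.

    durations: dict of phase name -> estimated minutes.
    Returns one (total_minutes, [phase names]) tuple per lane.
    """
    # Min-heap keyed on current lane load; the lane index breaks ties.
    lanes = [(0, i, []) for i in range(slots)]
    heapq.heapify(lanes)
    for name, minutes in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, i, names = heapq.heappop(lanes)
        heapq.heappush(lanes, (load + minutes, i, names + [name]))
    return [(load, names) for load, _, names in sorted(lanes, key=lambda t: t[1])]


if __name__ == "__main__":
    # Hypothetical estimates, in minutes, for seven phases on five lanes.
    estimates = {"p1": 12, "p2": 10, "p3": 8, "p4": 6, "p5": 4, "p6": 3, "p7": 3}
    for total, names in stack_phases(estimates, 5):
        print(total, names)
```

This is only a heuristic. It ignores disk contention, which may matter more than CPU balance for phases this I/O-heavy.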

--
.NET: It's About Trust!
http://vfre...

Karl E. Peterson

2/9/2012 11:03:00 PM

BeeJ explained :
> Karl E. Peterson laid this down on his screen :
>> Thoughts?
>
> From this junior coding guy.
> I have an app that starts multiple ActiveX EXEs.
> AxEXE1 does intensive graphic screen work. I usually have four AxEXE1
> running at a time.
> AxEXE2 does movies (running at the same time as AxEXE1) multiple
> instantiation. Had two of these and four of AxEXE1 running together.
> AxEXE3 talks to USB device. AxEXE3 has time sensitive DAC step outputs.
> AxEXE4 talks to DMX device. This one is pretty relaxed compared to the
> others.
> AxEXE5 talks to X10 device. Kind of slow stuff here.
> All run together.
> They all talk to the main app at regular intervals: get new commands to run
> and return data and status.
>
> I was amazed that they all ran smoothly together on a dual core laptop.
> I have not tried it on my quad core desktop yet.
>
> Previously I tried running all methods in the main app and they all stuttered
> and sputtered. Looked terrible.
>
> The ActiveX EXEs really work.
>
> What is amazing to me is that with all of that and only two cores it was all
> smooth.

Heh, the wonders of pre-emptive multitasking! :-)

> Once I got the hang of an ActiveX EXE I rapidly built them all.
> Unfortunately I was never able to find an all encompassing book about ActiveX
> EXE. Usually it is just a chapter in the middle of ActiveX Controls so I
> found it difficult to get all the stuff together but now I have and it is
> just too much fun. And thanks again to some in this group that gave me
> valuable inputs.

Yeah, they (axEXE) are one of the Greater Mysteries of COM, it seems.
I'm not all that sure the model code I'm dealing with is all that
amenable to being refactored like that, though. Each of the seven main
phases are extremely linear in nature. Although they are written to
manipulate a fairly complex (axDLL) object model that abstracts the
model building blocks.

--
.NET: It's About Trust!
http://vfre...

BeeJ

2/9/2012 11:34:00 PM

Karl E. Peterson submitted this idea :
> BeeJ explained :
>> Karl E. Peterson laid this down on his screen :
>>> Thoughts?
>>
>> From this junior coding guy.
>> I have an app that starts multiple ActiveX EXEs.
>> AxEXE1 does intensive graphic screen work. I usually have four AxEXE1
>> running at a time.
>> AxEXE2 does movies (running at the same time as AxEXE1) multiple
>> instantiation. Had two of these and four of AxEXE1 running together.
>> AxEXE3 talks to USB device. AxEXE3 has time sensitive DAC step outputs.
>> AxEXE4 talks to DMX device. This one is pretty relaxed compared to the
>> others.
>> AxEXE5 talks to X10 device. Kind of slow stuff here.
>> All run together.
>> They all talk to the main app at regular intervals: get new commands to run
>> and return data and status.
>>
>> I was amazed that they all ran smoothly together on a dual core laptop.
>> I have not tried it on my quad core desktop yet.
>>
>> Previously I tried running all methods in the main app and they all
>> stuttered and sputtered. Looked terrible.
>>
>> The ActiveX EXEs really work.
>>
>> What is amazing to me is that with all of that and only two cores it was
>> all smooth.
>
> Heh, the wonders of pre-emptive multitasking! :-)
>
>> Once I got the hang of an ActiveX EXE I rapidly built them all.
>> Unfortunately I was never able to find an all encompassing book about
>> ActiveX EXE. Usually it is just a chapter in the middle of ActiveX
>> Controls so I found it difficult to get all the stuff together but now I
>> have and it is just too much fun. And thanks again to some in this group
>> that gave me valuable inputs.
>
> Yeah, they (axEXE) are one of the Greater Mysteries of COM, it seems. I'm
> not all that sure the model code I'm dealing with is all that amenable to
> being refactored like that, though. Each of the seven main phases are
> extremely linear in nature. Although they are written to manipulate a fairly
> complex (axDLL) object model that abstracts the model building blocks.

Have you fiddled with Affinity and Priority?
I tried Priority on one of my apps but did not see any difference.
Might just be that the processing it was doing was not all that
intensive to see any difference.
Could be useful on a quad.
I would like to know if you have seen any improvement using either.
I have not been able to devote any time to Affinity and don't know
where to start on that one. Maybe some day.

Karl E. Peterson

2/10/2012 12:08:00 AM

BeeJ explained on 2/9/2012 :
> Have you fiddled with Affinity and Priority?

Priority, sure. Affinity, no reason to.

> I tried Priority on one of my apps but did not see any difference.

You really won't until the processor is pegged. Was it? More often,
on "long" processes, I like to shift mine into lower gear, so that the
user can do other things more comfortably in the foreground. At least
that was my strategy back in the single-core days. Not hardly as
needed now.
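
Shifting a worker into lower gear can be done at launch time. Here is a hedged sketch in Python, for illustration only. `spawn_low_priority` is a hypothetical helper and the niceness increment of 10 is an arbitrary choice. On Windows it requests the below-normal priority class. Elsewhere it raises the child's niceness before the command starts.

```python
import os
import subprocess
import sys


def spawn_low_priority(cmd):
    """Start cmd at reduced CPU priority so foreground work stays responsive."""
    if os.name == "nt":
        # Windows: ask for the below-normal priority class at creation time.
        return subprocess.Popen(
            cmd, creationflags=subprocess.BELOW_NORMAL_PRIORITY_CLASS)
    # POSIX: raise the child's niceness just before the command is executed.
    return subprocess.Popen(cmd, preexec_fn=lambda: os.nice(10))


if __name__ == "__main__":
    # Stand-in worker (hypothetical): does nothing and exits.
    print(spawn_low_priority([sys.executable, "-c", "pass"]).wait())
```

As noted above, this only makes a visible difference once the processor is pegged.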

> Might just be that the processing it was doing was not all that intensive to
> see any difference.

Yep.

> Could be useful on a quad.

Not too likely, in the general case. Possibly in specific ones.
Easiest way to try is just to twiddle in TaskMgr, and see if you can
tell the difference.

> I would like to know if you have seen any improvement using either.
> I have not been able to devote any time to Affinity and don't know where to
> start on that one. Maybe some day.

I sure wouldn't bother. I know guys who've worked on the kernel
devteam. If they can't schedule the cores better than me, we're in
trouble.

--
.NET: It's About Trust!
http://vfre...

Jim Mack

2/10/2012 12:47:00 AM

>>
>> Processes don't run -- only threads run. So if you spawn separate
>> processes, you're just spawning threads with a tad more memory
>> overhead.
>
> Semantics. <g> Yes, a tad more overhead. But, when viewed in the
> context of 10 minutes of CPU time, not so much. Then if I compare that
> "overhead" to what it takes (in terms of *my* time) to bring all the
> threads in-process... Just not finding any justification (other than a
> bit simpler EOT signaling?) for that effort.

If you have access to a threading library, there may be no extra
work. But only you can decide how much is too much. And BTW, this is
one of the key areas where a real VB7 could have gone. Big sigh.

If the phases are not tightly coupled, and mostly linear, I don't think
you'd lose anything by just launching processes -- always assuming that
you have a way of communicating and synchronizing. Tightly coupled
things tend not to be improved at all by threading, since so much time
is spent just spinning the plates.

I second the recommendation for AX EXE, if the work permits that sort
of division, especially if there are operations that can be split into
parallel tracks without too much sync overhead. I don't use them much
but they do solve a certain class of problem neatly.

> And, I wonder, a single process can only attach to a single console, no?
> That alone might make the design decision. Hmmmm.

Not knowing your requirements, I can't speak to that at all. Does each
step need a console to itself? Can you simulate a console with a form?

>> I wouldn't create more threads than you have cores, unless there's
>> significant idle time as the process is now.
>
> Agreed. The target machines are mostly dual-quads, so seven ought to
> leave the users one to "do email" and such. <g> I was also looking at
> stacking a couple of the shorter ones to sort of even out the load-per.

Good luck. (-:

--
Jim
