comp.lang.c++

Functional Local Static Zero Initialization - When?

Brian Cole

12/5/2008 2:38:00 AM

A working draft of the C++ standard I was able to obtain says the
following in section 6.7.4:
The zero-initialization (8.5) of all local objects with static storage
duration (3.7.1) or thread storage duration (3.7.2) is performed
before any other initialization takes place.

First, the only addition for C++0x is the thread storage duration, so
I assume the sentence was the following for previous versions of the
standard:
The zero-initialization (8.5) of all local objects with static storage
duration (3.7.1) is performed before any other initialization takes
place.

The criterion "before any other initialization" is a little ambiguous
here. Does this mean any other initialization inside the function where
the static resides, or any other initialization the entire program may
perform?

Basically, I'm trying to implement something like the following to
allow for thread safe function local static initialization while
maintaining proper destructor ordering atexit.

template<class T>
struct Once
{
  T   *_obj;
  long _once;  // 0 = untouched, 1 = being initialized, 2 = ready

  Once()
  {
    while (1)
    {
      long prev = InterlockedCompareExchange(&_once, 1, 0);
      if (0 == prev) // got the lock
        break;
      else if (2 == prev) // The singleton has been initialized.
        return;           // a constructor cannot return a value
      else {
        // Another thread is initializing the singleton: must wait.
        assert(1 == prev);
        sleep(1); // sleep 1 millisecond
      }
    }
    assert(_obj == 0);
    _obj = new T;
    InterlockedExchange(&_once, 2);
  }

  ~Once() { delete _obj; }
  inline T& operator *() { return *_obj; }
  inline T* operator ->() { return _obj; }
  inline operator T* () { return operator ->(); }
};

If I can guarantee that the memory of the object is zero-initialized
during "static initialization", then I can safely use that zero value
to do mutual exclusion in the constructor of the object using atomic
operations. And the following code is then safe during either "dynamic
initialization" or multi-threaded execution.

Foo *GetMeyersSingletonFoo()
{
  static Once<Foo> foo;
  return foo;
}

Thanks, I've been trying to tackle this for months now, and I think
I'm finally on the last steps.
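
For readers on a C++11 or later toolchain, the same three-state protocol (0 = untouched, 1 = under construction, 2 = ready) can be sketched portably with `std::atomic`. This is a minimal sketch under stated assumptions, not the poster's code: `OnceSlot`, `Counted`, and `get_counted` are hypothetical names, the slot has no user-written constructor so that a static instance starts with `state == 0`, and the object is never deleted, so destructor ordering at exit is left out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical payload that counts how many times it was constructed.
struct Counted {
    static std::atomic<int> constructions;
    Counted() { constructions.fetch_add(1); }
};
std::atomic<int> Counted::constructions{0};

// No user-written constructor touches `state`, so a static OnceSlot
// starts out with state == 0 and obj == nullptr.
template <class T>
struct OnceSlot {
    std::atomic<int> state; // 0 = untouched, 1 = under construction, 2 = ready
    T* obj;

    T* get() {
        for (;;) {
            int expected = 0;
            if (state.compare_exchange_strong(expected, 1,
                                              std::memory_order_acquire)) {
                obj = new T;                               // we won the race
                state.store(2, std::memory_order_release); // publish obj
                return obj;
            }
            if (expected == 2)     // already published by some thread
                return obj;
            assert(expected == 1); // another thread is still constructing
            std::this_thread::yield();
        }
    }
};

Counted* get_counted() {
    static OnceSlot<Counted> slot; // static storage duration
    return slot.get();
}
```

Every call to `get_counted()` returns the same pointer, and `Counted::constructions` stays at 1 however many callers race on the first call.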
17 Answers

Chris M. Thomasson

12/5/2008 4:17:00 AM

"Brian Cole" <coleb2@gmail.com> wrote in message
news:98cb5faf-67ff-496b-8a69-b5113b9b3ea2@p2g2000prf.googlegroups.com...
>A working draft of the C++ standard I was able to obtain says the
> following in section 6.7.4:
> The zero-initialization (8.5) of all local objects with static storage
> duration (3.7.1) or thread storage duration (3.7.2) is performed
> before any other initialization takes place.
>
> First, the only addition for C++0x is the thread storage duration, so
> I assume the sentence was the following for previous versions of the
> standard:
> The zero-initialization (8.5) of all local objects with static storage
> duration (3.7.1) is performed before any other initialization takes
> place.
>
> The criteria "before any other initialization" is a little ambiguous
> here. Does this mean any other initialization inside the function the
> static resides, or any other initialization the entire program may
> perform.
>
> Basically, I'm trying to implement something like the following to
> allow for thread safe function local static initialization while
> maintaining proper destructor ordering atexit.
>
> template<class T>
> struct Once
> {
> T *_obj;
> long _once;
> Once()
> {
> while (1)
> {
> long prev = InterlockedCompareExchange(&_once, 1, 0);

[...]

classic problem with CAS docs on Windows: Does
`InterlockedCompareExchange()' always execute a memory barrier when it
encounters a failed operation? If so, where is it _explicitly_ documented?
Perhaps it is implied somewhere in their documentation. Who knows for sure?
Humm...

Marcel Müller

12/5/2008 7:38:00 AM

Hi,

Chris M. Thomasson wrote:
> classic problem with CAS docs on Windows: Does
> `InterlockedCompareExchange()' always execute a memory barrier when it
> encounters a failed operation?

no.

It only synchronizes this one word. It is an exact mapping of the x86
LOCK CMPXCHG instruction. No more, no less.
And on other platforms it is emulated somehow.


Marcel

James Kanze

12/5/2008 9:23:00 AM

On Dec 5, 3:38 am, Brian Cole <col...@gmail.com> wrote:
> A working draft of the C++ standard I was able to obtain says
> the following in section 6.7.4:
> The zero-initialization (8.5) of all local objects with static
> storage duration (3.7.1) or thread storage duration (3.7.2) is
> performed before any other initialization takes place.

> First, the only addition for C++0x is the thread storage
> duration, so I assume the sentence was the following for
> previous versions of the standard:
> The zero-initialization (8.5) of all local objects with static
> storage duration (3.7.1) is performed before any other
> initialization takes place.

> The criteria "before any other initialization" is a little
> ambiguous here. Does this mean any other initialization inside
> the function the static resides, or any other initialization
> the entire program may perform.

I don't see any ambiguity. "Before any other initialization"
means "before any other initialization".

Of course, if the compiler can determine that a conformant
program cannot see the difference... I rather suspect that no
implementation actually initializes the thread local storage
before the thread using it is created.

> Basically, I'm trying to implement something like the
> following to allow for thread safe function local static
> initialization while maintaining proper destructor ordering
> atexit.

> template<class T>
> struct Once
> {
> T *_obj;
> long _once;
> Once()
> {
> while (1)
> {
> long prev = InterlockedCompareExchange(&_once, 1, 0);
> if (0 == prev) // got the lock
> break;
> else if (2 == prev) // The singleton has been initialized.
> return _obj;
> else {
> // Another thread is initializing the singleton: must wait.
> assert(1 == prev);
> sleep(1); // sleep 1 millisecond

That's one second, not one millisecond. At least on Posix
platforms, and I'm pretty sure Windows as well. (There is no
C++ standard function sleep.)

> }
> }
> assert(_obj == 0);
> _obj = new T;
> InterlockedExchange(&_once, 2);
> return _obj;
> }

> ~Once() { delete _obj; }
> inline T& operator *() { return *_obj; }
> inline T* operator ->() { return _obj; }
> inline operator T* () { return operator ->(); }
> };

> If I can guarantee that the memory of the object is
> zero-initialized during "static initialization",

It will be if the object has static storage duration. Otherwise
not.
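
A minimal sketch of that distinction, assuming a C++11 compiler (the names `Probe` and `flag_seen_by_constructor` are hypothetical). The constructor reads a member before writing it, which is the same thing `Once` relies on. For an object with static storage duration the read observes the zero left by zero-initialization; for an automatic or heap object the same read would yield an indeterminate value, so the sketch never creates one.

```cpp
#include <cassert>

struct Probe {
    long flag;  // deliberately not initialized by the constructor
    long seen;  // what the constructor found in `flag`
    Probe() {
        seen = flag;  // only meaningful for static storage duration
        flag = 1;
    }
};

long flag_seen_by_constructor() {
    static Probe p;       // storage is zeroed before the constructor runs
    assert(p.flag == 1);  // the constructor has run by this point
    return p.seen;
}
```

Calling `flag_seen_by_constructor()` returns 0 on every call, because the constructor runs once and saw the zero-initialized `flag`.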

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze

12/5/2008 9:25:00 AM

On Dec 5, 8:38 am, Marcel Müller <news.5.ma...@spamgourmet.com> wrote:
> Chris M. Thomasson wrote:

> > classic problem with CAS docs on Windows: Does
> > `InterlockedCompareExchange()' always execute a memory
> > barrier when it encounters a failed operation?

> no.

> It only synchronizes this one word. It is an exact mapping of
> the x86 LOCK CMPXCHG instruction. No more, no less. And on
> other platforms it is emulated somehow.

Doesn't the X86 lock prefix force a memory synchronization? I
was under the impression that it did (but I could easily be
mistaken; I've only recently started to do any significant work
on the platform, and have not yet studied this aspect in
detail).


Chris M. Thomasson

12/5/2008 9:34:00 AM

"James Kanze" <james.kanze@gmail.com> wrote in message
news:b0f36ddb-e872-4812-a84b-13c8408036e7@w34g2000yqm.googlegroups.com...
On Dec 5, 8:38 am, Marcel Müller <news.5.ma...@spamgourmet.com> wrote:
> > Chris M. Thomasson wrote:

> > > classic problem with CAS docs on Windows: Does
> > > `InterlockedCompareExchange()' always execute a memory
> > > barrier when it encounters a failed operation?

> > no.

> > It only synchronizes this one word. It is an exact mapping of
> > the x86 LOCK CMPXCHG instruction. No more, no less. And on
> > other platforms it is emulated somehow.

> Doesn't the X86 lock prefix force a memory synchronization?

Yeah:

http://developer.intel.com/products/processor/manuals/...




> I
> was under the impression that it did (but I could easily be
> mistaken; I've only recently started to do any significant work
> on the platform, and have not yet studied this aspect in
> detail).


jason.cipriani@gmail.com

12/5/2008 10:19:00 AM

On Dec 4, 11:16 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
> "Brian Cole" <col...@gmail.com> wrote in message
>
> news:98cb5faf-67ff-496b-8a69-b5113b9b3ea2@p2g2000prf.googlegroups.com...
>
>
>
> >A working draft of the C++ standard I was able to obtain says the
> > following in section 6.7.4:
> > The zero-initialization (8.5) of all local objects with static storage
> > duration (3.7.1) or thread storage duration (3.7.2) is performed
> > before any other initialization takes place.
>
> > First, the only addition for C++0x is the thread storage duration, so
> > I assume the sentence was the following for previous versions of the
> > standard:
> > The zero-initialization (8.5) of all local objects with static storage
> > duration (3.7.1) is performed before any other initialization takes
> > place.
>
> > The criteria "before any other initialization" is a little ambiguous
> > here. Does this mean any other initialization inside the function the
> > static resides, or any other initialization the entire program may
> > perform.
>
> > Basically, I'm trying to implement something like the following to
> > allow for thread safe function local static initialization while
> > maintaining proper destructor ordering atexit.
>
> > template<class T>
> > struct Once
> > {
> >  T   *_obj;
> >  long _once;
> >  Once()
> >  {
> >    while (1)
> >    {
> >      long prev = InterlockedCompareExchange(&_once, 1, 0);
>
> [...]
>
> classic problem with CAS docs on Windows: Does
> `InterlockedCompareExchange()' always execute a memory barrier when it
> encounters a failed operation? If so, where is it _explicitly_ documented?
> Perhaps it is implied somewhere in their documentation. Who knows for sure?
> Humm...

It does. It is explicitly documented here:

http://msdn.microsoft.com/en-us/library/ms6...

"This function generates a full memory barrier (or fence) to ensure
that memory operations are completed in order."

Starting with Vista (or Server 2003) you can also choose "acquire" vs.
"release" semantics with InterlockedCompareExchangeAcquire/
InterlockedCompareExchangeRelease (and same with the other interlocked
functions):

http://msdn.microsoft.com/en-us/librar...(VS.85).aspx

Jason
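
Portable C++11 code can ask for the same acquire/release split through `std::atomic`. The following is a minimal sketch, assuming C++11; `TryLock` is a hypothetical name and it is not a drop-in replacement for the Interlocked functions.

```cpp
#include <atomic>
#include <cassert>

class TryLock {
public:
    // Acquire on success: later reads and writes cannot move above the lock.
    // Relaxed on failure: a failed attempt publishes and observes nothing.
    bool try_lock() {
        long expected = 0;
        return word_.compare_exchange_strong(expected, 1,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed);
    }

    // Release: earlier reads and writes cannot move below the unlock.
    void unlock() {
        assert(word_.load(std::memory_order_relaxed) == 1);
        word_.store(0, std::memory_order_release);
    }

private:
    std::atomic<long> word_{0};
};
```

A second `try_lock()` fails until `unlock()` has been called by the holder.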

jason.cipriani@gmail.com

12/5/2008 10:24:00 AM

On Dec 5, 4:22 am, James Kanze <james.ka...@gmail.com> wrote:
> On Dec 5, 3:38 am, Brian Cole <col...@gmail.com> wrote:
>
> > A working draft of the C++ standard I was able to obtain says
> > the following in section 6.7.4:
> > The zero-initialization (8.5) of all local objects with static
> > storage duration (3.7.1) or thread storage duration (3.7.2) is
> > performed before any other initialization takes place.
> > First, the only addition for C++0x is the thread storage
> > duration, so I assume the sentence was the following for
> > previous versions of the standard:
> > The zero-initialization (8.5) of all local objects with static
> > storage duration (3.7.1) is performed before any other
> > initialization takes place.
> > The criteria "before any other initialization" is a little
> > ambiguous here. Does this mean any other initialization inside
> > the function the static resides, or any other initialization
> > the entire program may perform.
>
> I don't see any ambiguity.  "Before any other initialization"
> means "before any other initialization".
>
> Of course, if the compiler can determine that a conformant
> program cannot see the difference... I rather suspect that no
> implementation actually initializes the thread local storage
> before the thread using it is created.
>
>
>
> > Basically, I'm trying to implement something like the
> > following to allow for thread safe function local static
> > initialization while maintaining proper destructor ordering
> > atexit.
> > template<class T>
> > struct Once
> > {
> >   T   *_obj;
> >   long _once;
> >   Once()
> >   {
> >     while (1)
> >     {
> >       long prev = InterlockedCompareExchange(&_once, 1, 0);
> >       if (0 == prev) // got the lock
> >         break;
> >       else if (2 == prev) // The singleton has been initialized.
> >         return _obj;
> >       else {
> >         // Another thread is initializing the singleton: must wait.
> >         assert(1 == prev);
> >         sleep(1); // sleep 1 millisecond
>
> That's one second, not one millisecond.  At least on Posix
> platforms, and I'm pretty sure Windows as well.  (There is no
> C++ standard function sleep.)

There is no "sleep" on Windows. If he meant "Sleep", then it's 1
millisecond (well, more like 50 or so, realistically, depending on the
platform).


> >       }
> >     }
> >     assert(_obj == 0);
> >     _obj = new T;
> >     InterlockedExchange(&_once, 2);
> >     return _obj;
> >   }
> >   ~Once() { delete _obj; }
> >   inline T& operator *() { return *_obj; }
> >   inline T* operator ->() { return _obj; }
> >   inline operator T* () { return operator ->(); }
> > };
> > If I can guarantee that the memory of the object is
> > zero-initialized during "static initialization",
>
> It will be if the object has static storage duration.  Otherwise
> not.

Brian Cole

12/5/2008 3:32:00 PM

On Dec 5, 3:23 am, "jason.cipri...@gmail.com"
<jason.cipri...@gmail.com> wrote:
> On Dec 5, 4:22 am, James Kanze <james.ka...@gmail.com> wrote:
>
>
>
> > On Dec 5, 3:38 am, Brian Cole <col...@gmail.com> wrote:
>
> > > A working draft of the C++ standard I was able to obtain says
> > > the following in section 6.7.4:
> > > The zero-initialization (8.5) of all local objects with static
> > > storage duration (3.7.1) or thread storage duration (3.7.2) is
> > > performed before any other initialization takes place.
> > > First, the only addition for C++0x is the thread storage
> > > duration, so I assume the sentence was the following for
> > > previous versions of the standard:
> > > The zero-initialization (8.5) of all local objects with static
> > > storage duration (3.7.1) is performed before any other
> > > initialization takes place.
> > > The criteria "before any other initialization" is a little
> > > ambiguous here. Does this mean any other initialization inside
> > > the function the static resides, or any other initialization
> > > the entire program may perform.
>
> > I don't see any ambiguity.  "Before any other initialization"
> > means "before any other initialization".
>
> > Of course, if the compiler can determine that a conformant
> > program cannot see the difference... I rather suspect that no
> > implementation actually initializes the thread local storage
> > before the thread using it is created.
>
> > > Basically, I'm trying to implement something like the
> > > following to allow for thread safe function local static
> > > initialization while maintaining proper destructor ordering
> > > atexit.
> > > template<class T>
> > > struct Once
> > > {
> > >   T   *_obj;
> > >   long _once;
> > >   Once()
> > >   {
> > >     while (1)
> > >     {
> > >       long prev = InterlockedCompareExchange(&_once, 1, 0);
> > >       if (0 == prev) // got the lock
> > >         break;
> > >       else if (2 == prev) // The singleton has been initialized.
> > >         return _obj;
> > >       else {
> > >         // Another thread is initializing the singleton: must wait.
> > >         assert(1 == prev);
> > >         sleep(1); // sleep 1 millisecond
>
> > That's one second, not one millisecond.  At least on Posix
> > platforms, and I'm pretty sure Windows as well.  (There is no
> > C++ standard function sleep.)
>
> There is no "sleep" on Windows. If he meant "Sleep", then it's 1
> millisecond (well, more like 50 or so, realistically, depending on the
> platform).
>
> > >       }
> > >     }
> > >     assert(_obj == 0);
> > >     _obj = new T;
> > >     InterlockedExchange(&_once, 2);
> > >     return _obj;
> > >   }
> > >   ~Once() { delete _obj; }
> > >   inline T& operator *() { return *_obj; }
> > >   inline T* operator ->() { return _obj; }
> > >   inline operator T* () { return operator ->(); }
> > > };
> > > If I can guarantee that the memory of the object is
> > > zero-initialized during "static initialization",
>
> > It will be if the object has static storage duration.  Otherwise
> > not.

I did mean millisecond sleep. The code originally called an internal
cross-platform millisecond sleep function. I changed it to just
"sleep" so that everyone else got the gist of what was going on
there. In fact, there are probably better things to do instead of
"sleep", exponential back-off and such, but this condition is so
rarely encountered that it doesn't seem to warrant anything complex.

Thanks for being meticulous though. :-)
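
For reference, a millisecond sleep with exponential back-off can be written portably in C++11 with `std::this_thread::sleep_for`. This is a minimal sketch, not the internal function mentioned above; `wait_with_backoff`, the 1 ms starting delay, and the 64 ms cap are all assumptions chosen for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Polls `state` until it equals `ready`, doubling the sleep between polls.
// Returns how many times it slept.
inline int wait_with_backoff(const std::atomic<int>& state, int ready) {
    using std::chrono::milliseconds;
    milliseconds delay(1);       // first sleep: 1 millisecond
    const milliseconds cap(64);  // never sleep longer than this per poll
    int sleeps = 0;
    while (state.load(std::memory_order_acquire) != ready) {
        std::this_thread::sleep_for(delay);
        delay = std::min(delay * 2, cap);
        assert(delay <= cap);
        ++sleeps;
    }
    return sleeps;
}
```

In the `Once` constructor this would replace the bare `sleep(1)` in the branch where another thread is still initializing; if the state is already ready, the function returns without sleeping at all.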

Brian Cole

12/5/2008 6:02:00 PM

On Dec 5, 2:22 am, James Kanze <james.ka...@gmail.com> wrote:
> On Dec 5, 3:38 am, Brian Cole <col...@gmail.com> wrote:
>
> > A working draft of the C++ standard I was able to obtain says
> > the following in section 6.7.4:
> > The zero-initialization (8.5) of all local objects with static
> > storage duration (3.7.1) or thread storage duration (3.7.2) is
> > performed before any other initialization takes place.
> > First, the only addition for C++0x is the thread storage
> > duration, so I assume the sentence was the following for
> > previous versions of the standard:
> > The zero-initialization (8.5) of all local objects with static
> > storage duration (3.7.1) is performed before any other
> > initialization takes place.
> > The criteria "before any other initialization" is a little
> > ambiguous here. Does this mean any other initialization inside
> > the function the static resides, or any other initialization
> > the entire program may perform.
>
> I don't see any ambiguity.  "Before any other initialization"
> means "before any other initialization".

I guess the ambiguity is in my own mind, fueled by the rest of the
paragraph:
"A local object of trivial or literal type (3.9) with static or thread
storage duration initialized with constant-expressions is initialized
before its block is first entered."

This hints that the zero-initialization could occur after main is
invoked, as long as it's before the function is entered. The next
sentence only says the implementation is "permitted" to perform
initialization before main; it doesn't seem to require it:
"An implementation is permitted to perform early initialization of
other local objects with static or thread storage duration under the
same conditions that an implementation is permitted to statically
initialize an object with static or thread storage duration in
namespace scope (3.6.2)."

I am willing to accept that any decent compiler implementation would
zero out all the memory defined for function local statics during
"zero-initialization" since that would be cheaper than doing it during
main. Just wanted to be sure. Any idea what standard this guarantee
first appeared in? I deal with some rather old compilers sometimes.

> Of course, if the compiler can determine that a conformant
> program cannot see the difference... I rather suspect that no
> implementation actually initializes the thread local storage
> before the thread using it is created.
>
>
>
> > Basically, I'm trying to implement something like the
> > following to allow for thread safe function local static
> > initialization while maintaining proper destructor ordering
> > atexit.
> > template<class T>
> > struct Once
> > {
> >   T   *_obj;
> >   long _once;
> >   Once()
> >   {
> >     while (1)
> >     {
> >       long prev = InterlockedCompareExchange(&_once, 1, 0);
> >       if (0 == prev) // got the lock
> >         break;
> >       else if (2 == prev) // The singleton has been initialized.
> >         return _obj;
> >       else {
> >         // Another thread is initializing the singleton: must wait.
> >         assert(1 == prev);
> >         sleep(1); // sleep 1 millisecond
>
> That's one second, not one millisecond.  At least on Posix
> platforms, and I'm pretty sure Windows as well.  (There is no
> C++ standard function sleep.)
>
> >       }
> >     }
> >     assert(_obj == 0);
> >     _obj = new T;
> >     InterlockedExchange(&_once, 2);
> >     return _obj;
> >   }
> >   ~Once() { delete _obj; }
> >   inline T& operator *() { return *_obj; }
> >   inline T* operator ->() { return _obj; }
> >   inline operator T* () { return operator ->(); }
> > };
> > If I can guarantee that the memory of the object is
> > zero-initialized during "static initialization",
>
> It will be if the object has static storage duration.  Otherwise
> not.

So the next obvious question is if there is a way I can force users of
the class to always declare it "static" since the implementation will
depend on this condition. Since static is a storage class specifier
and has nothing to do with the type there is no fancy typedef trickery
I could do to catch the following misuse of the class:
Foo *GetMeyersSingletonFoo()
{
  Once<Foo> foo;
  return foo;
}

The only hope is that during testing foo would get placed in some
memory on the stack that wasn't already zeroed out, triggering an
assertion in the constructor. Since memory is often zeroed out for
various reasons, it seems way too easy for this to slip through
testing and only appear in production down the road.

Can any C++ wizards think of a way to catch this at compile or run
time?

Thanks
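
One possible way to rule out the stack declaration at compile time, offered as a sketch rather than as an answer from the thread: keep the state in static data members of a class that cannot be instantiated at all. The names `StaticOnce` and `Widget` are hypothetical, the code assumes C++11 `std::atomic`, and it drops the destructor-at-exit behaviour that `Once` was meant to preserve.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <type_traits>

// State lives in static data members, which always have static storage
// duration; the class itself cannot be instantiated anywhere.
template <class T>
class StaticOnce {
public:
    StaticOnce() = delete;

    static T* get() {
        for (;;) {
            int expected = 0;
            if (state_.compare_exchange_strong(expected, 1,
                                               std::memory_order_acquire)) {
                obj_ = new T;
                state_.store(2, std::memory_order_release);
                return obj_;
            }
            if (expected == 2)
                return obj_;
            assert(expected == 1); // another thread is constructing
            std::this_thread::yield();
        }
    }

private:
    static std::atomic<int> state_; // 0 = untouched, 1 = busy, 2 = ready
    static T* obj_;
};

template <class T> std::atomic<int> StaticOnce<T>::state_{0};
template <class T> T* StaticOnce<T>::obj_ = nullptr;

struct Widget { int value = 42; }; // hypothetical payload

static_assert(!std::is_default_constructible<StaticOnce<Widget>>::value,
              "StaticOnce cannot be declared as a local variable");
```

With this shape, `StaticOnce<Widget> w;` inside a function is a compile error, and callers write `StaticOnce<Widget>::get()` instead.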

Chris M. Thomasson

12/5/2008 11:59:00 PM

<jason.cipriani@gmail.com> wrote in message
news:cc35304e-4d3e-4821-a8b6-9cdcc3e60a53@41g2000yqf.googlegroups.com...
On Dec 4, 11:16 pm, "Chris M. Thomasson" <n...@spam.invalid> wrote:
[...]
> > classic problem with CAS docs on Windows: Does
> > `InterlockedCompareExchange()' always execute a memory barrier when it
> > encounters a failed operation? If so, where is it _explicitly_
> > documented?
> > Perhaps it is implied somewhere in their documentation. Who knows for
> > sure?
> > Humm...

> It does. It is explicitly documented here:

> http://msdn.microsoft.com/en-us/library/ms6...

> "This function generates a full memory barrier (or fence) to ensure
> that memory operations are completed in order."

You need to read this post from Neill Clift who works/worked on
synchronization issues within the Windows Kernel:


http://groups.google.com/group/comp.programming.threads/msg/c3cdcd...

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/29ea51...


He cannot just point to the documentation because it is rather vague. It
says it generates a full barrier, but when? On success for sure, but what
about failure? I am coming from the point of view that atomic RMW
instructions are naked, and one needs to explicitly add memory barriers
exactly where they're needed. Are they saying that the operation is 100%
fully fenced in:


word CAS(word* pdest, word cmp, word xchg) {
  MEMBAR #StoreLoad | #StoreStore | #LoadStore | #LoadLoad;
  word cur;  // the value read from *pdest, not a pointer
  atomic {
    cur = *pdest;
    if (cur == cmp) {
      *pdest = xchg;
    }
  }
  MEMBAR #StoreLoad | #StoreStore | #LoadStore | #LoadLoad;
  return cur;
}


Or is it like:

word CAS(word* pdest, word cmp, word xchg) {
  MEMBAR #LoadStore | #StoreStore;
  word cur;  // the value read from *pdest, not a pointer
  atomic {
    cur = *pdest;
    if (cur == cmp) {
      *pdest = xchg;
    }
  }
  MEMBAR #StoreLoad | #StoreStore;
  return cur;
}



Or perhaps optimize for failure case...:

word CAS(word* pdest, word cmp, word xchg) {
  MEMBAR #LoadStore | #StoreStore;
  word cur;  // the value read from *pdest, not a pointer
  atomic {
    cur = *pdest;
    if (cur == cmp) {
      *pdest = xchg;
      MEMBAR #StoreLoad | #StoreStore;  // trailing barrier only on success
    }
  }
  return cur;
}



Anyway, according to Neill, it would be a bug if
`InterlockedCompareExchange()' did not execute a full memory barrier on the
failure case.
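
For comparison, the C++11 atomics interface makes the failure case explicit: `compare_exchange_strong` takes one memory order for success and a separate one for failure. The sketch below asks for `memory_order_seq_cst` in both cases and returns the value found, in the style of `InterlockedCompareExchange`. The name `cas_seq_cst` is hypothetical, and the sketch says nothing about what the Windows function actually does on failure.

```cpp
#include <atomic>
#include <cassert>

// Returns the value found in `dest`, whether or not the swap happened.
long cas_seq_cst(std::atomic<long>& dest, long exchange, long comparand) {
    long found = comparand;
    bool swapped = dest.compare_exchange_strong(
        found, exchange,
        std::memory_order_seq_cst,   // ordering when the swap succeeds
        std::memory_order_seq_cst);  // ordering when the comparison fails
    // On success `found` is untouched; on failure it holds the current value.
    assert(swapped == (found == comparand));
    return found;
}
```

A caller that only needs ordering on success could pass `std::memory_order_relaxed` as the failure order instead, which corresponds to the "optimize for failure" variant in the pseudocode above.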




> Starting with Vista (or Server 2003) you can also choose "acquire" vs.
> "release" semantics with InterlockedCompareExchangeAcquire/
> InterlockedCompareExchangeRelease (and same with the other interlocked
> functions):

> http://msdn.microsoft.com/en-us/librar...(VS.85).aspx

That's good, but they can get way more fine-grain wrt memory barrier
support, IMVHO of course...