James Kanze
10/20/2008 7:47:00 PM
On Oct 20, 8:38 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
> blargg wrote:
> > Does ~0 yield undefined behavior? C++03 section 5 paragraph 5 seems to
> > suggest so:
> >> If during the evaluation of an expression, the result is not
> >> mathematically defined or not in the range of representable values
> >> for its type, the behavior is undefined [...]
> > The description of unary ~ (C++03 section 5.3.1 paragraph 8):
> >> The operand of ~ shall have integral or enumeration type; the
> >> result is the one's complement of its operand. Integral promotions
> >> are performed. The type of the result is the type of the promoted
> >> operand. [...]
> > But perhaps "one's complement" means the value that type would have with
> > all bits inverted, rather than the mathematical result of inverting all
> > bits in the binary representation. For example, on a machine with 32-bit
> > int, does one's complement of 0 (attempt to) have the value 2^31-1, which
> > can't be represented in a signed int and is thus undefined,
> Uh... Sorry, could you perhaps elaborate, why (2^31 - 1) can't be
> represented? Or did you mean (2^32 - 1)?
> If the resulting value is greater than can be represented in
> 'int', the compiler will create the code to promote it first
> to 'unsigned', then to 'long', then to 'unsigned long', IIRC.
> So, if ~0 cannot for some reason be represented in an int, it
> might become the (unsigned){all bits set} value.
No. That's how the compiler chooses the type of an integral
literal written as an octal or hexadecimal constant. (An
unsuffixed decimal constant never has an unsigned type.) In
this case, the integral literal is 0, which can't possibly
overflow anything, and so has type int. What we have here is an
expression, with an operator applied to an int. What blargg is
doubtlessly
referring to is the statement in §5 that "If during the
evaluation of an expression, the result is not mathematically
defined or not in the range of representable values for its
type, the behavior is undefined, unless such an expression
appears where an integral constant expression is required
(5.19), in which case the program is ill-formed."
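A minimal sketch of the distinction between the literal and the expression, assuming a C++11 compiler (decltype, static_assert and constexpr are not in C++03). The helper name complement_is_signed is made up for illustration:

```cpp
#include <type_traits>

// The literal 0 fits in int, so its type is int.
static_assert(std::is_same<decltype(0), int>::value, "0 is an int");

// ~ performs the integral promotions and keeps the promoted type,
// so the expression ~0 is still an int: nothing is widened to unsigned.
static_assert(std::is_same<decltype(~0), int>::value, "~0 is an int");

// Only an unsigned operand gives an unsigned result.
static_assert(std::is_same<decltype(~0U), unsigned int>::value,
              "~0U is an unsigned int");

// A short operand is promoted to int before ~ is applied.
static_assert(std::is_same<decltype(~static_cast<short>(0)), int>::value,
              "~short is an int");

// Hypothetical helper (not a standard facility): reports whether
// complementing a value of type T yields a signed type.
template <typename T>
constexpr bool complement_is_signed() {
    return std::is_signed<decltype(~T())>::value;
}
```

The static_asserts are checked at compile time, so a translation unit containing them either compiles or points at the assumption that failed.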
The problem here is that the "one's complement" operation
doesn't really define a numeric result, but rather a
manipulation on the underlying representation. So I don't think
that this statement can be applied: the ~ operator changes the
bits in the representation, and the "result" is whatever value
the changed bits happen to represent. Except that it's not
really too clear what that means, either: what happens if the
changed bits would be a trap representation? (E.g. a 1's
complement machine that traps on negative 0's.)
Because of such issues, I tend to avoid using ~, | or & on
signed integral types.
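A minimal sketch of that habit, applied to the two masking cases from the original question. The function names are invented for illustration, and constexpr assumes C++11; the point is only that both the value and the mask are unsigned, so every bit pattern is an ordinary value:

```cpp
// Hypothetical helpers, not taken from any library.

// Clear the low 4 bits of n. The mask comes from an unsigned literal,
// so ~0x0FU is an unsigned value with every bit set except the low four.
constexpr unsigned clear_low_nibble(unsigned n) {
    return n & ~0x0FU;
}

// Return the inverted low 4 bits of n, with all higher bits cleared.
constexpr unsigned inverted_low_nibble(unsigned n) {
    return ~n & 0x0FU;
}
```

For example, clear_low_nibble(0x1234U) is 0x1230U and inverted_low_nibble(0x5U) is 0xAU, on any representation of signed integers.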
> > or does it
> > have the value of whatever a signed int with all set bits
> > would have (-1 on a two's complement machine)?
> That's what I'd expect.
That's doubtlessly what was intended, on a two's complement
machine. Now try it on a one's complement machine which traps
on negative 0's.
The C standard has cleared this up considerably. According to
the C99 standard (§6.2.6.2, paragraphs 3 and 4):

    If the implementation supports negative zeros, they
    shall be generated only by:
      -- the &, |, ^, ~, <<, and >> operators with arguments
         that produce such a value;
      -- the +, -, *, /, and % operators where one argument
         is a negative zero and the result is zero;
      -- compound assignment operators based on the above
         cases.
    It is unspecified whether these cases actually generate
    a negative zero or a normal zero, and whether a negative
    zero becomes a normal zero when stored in an object.

    If the implementation does not support negative zeros,
    the behavior of the &, |, ^, ~, <<, and >> operators
    with arguments that would produce such a value is
    undefined.

The second paragraph above is particularly significant: ~0
*is* undefined behavior on a one's complement implementation
which doesn't support negative zeros, since all bits set is
exactly the negative zero representation there. (Note that the
text immediately preceding the above makes it clear that it is
talking about negative zero representations in one's complement
or signed magnitude; the "doesn't support negative zeros" only
applies in the case where they exist in the representation.)
> > I used the ~0 case for simplicity; in practice, this
> > issue might occur when ANDing with the complement of a
> > mask, for example n&=~0x0F to clear the low 4 bits of n,
> > or ~n&0x0F to find the inverted low 4 bits of n.
> Actually, on 2's complement, we use -1 for the "all bits
> set"... Perhaps we should switch to ~0 (more portable?)
If you're worried about bits, the *only* way you can be sure
of anything where the highest bit might not be 0 is to use
unsigned types. For signed types, ~0 can result in
undefined behavior. (In other words, ~0 is not portable, ~0U
is. As is -1, if that's what you want.)
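A short sketch of the portable spellings of "all bits set", assuming C++11 for constexpr; the function names are hypothetical. Both rely only on unsigned arithmetic: ~0U complements an unsigned value, and converting -1 to unsigned is defined modulo 2^N, whatever the signed representation:

```cpp
#include <limits>

// Hypothetical helpers, named for this example only.

// All bits set, obtained by complementing an unsigned zero.
constexpr unsigned all_bits_by_complement() {
    return ~0U;
}

// All bits set, obtained by converting -1 to unsigned. The conversion
// is defined as reduction modulo 2^N, so the result is the maximum
// unsigned value on every implementation.
constexpr unsigned all_bits_from_minus_one() {
    return static_cast<unsigned>(-1);
}

// Both spellings agree with the largest unsigned value.
static_assert(all_bits_by_complement() == std::numeric_limits<unsigned>::max(),
              "~0U is the maximum unsigned value");
static_assert(all_bits_from_minus_one() == std::numeric_limits<unsigned>::max(),
              "unsigned(-1) is the maximum unsigned value");
```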
--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34