David Brown
12/30/2015 1:56:00 PM
On 30/12/15 14:28, Richard Heathfield wrote:
> On 30/12/15 08:56, David Brown wrote:
>> On 29/12/15 23:19, Richard Heathfield wrote:
>
> <snip>
>
>>> 2) mixing of signed and unsigned expressions is very often a sign that
>>> the program hasn't been thought through properly. This, too, is (I
>>> think) a discussion that might also turn out to be profitable, despite
>>> its origins.
>>
>> Perhaps that could usefully be discussed here, since it is an issue that
>> can affect many languages.
>
> Oddly, the signed/unsigned dilemma is one that can bite us even if we
> are only dealing exclusively (or so we think) with unsigned values.
> Recently, in Another Place, I suggested capturing a value that is
> inherently unsigned, and small (it may have been an array size - I don't
> remember the exact details), by using strtoul.
>
> Here is a sketch of the mechanism I suggested:
>
> #include <stdio.h>
> #include <stdlib.h>
>
> int main(int argc, char **argv)
> {
>   if(argc > 1)
>   {
>     char *endptr = NULL;
>     unsigned long n = strtoul(argv[1], &endptr, 10);
>     if(endptr > argv[1])
>     {
>       printf("%lu\n", n);
>     }
>     else
>     {
>       puts("That's a terrible argument!");
>     }
>   }
>   else
>   {
>     puts("Argue more.");
>   }
>   return 0;
> }
>
> Looks innocuous enough, doesn't it? But if I call it like this:
>
> ./foo -1
>
> I get an output of 18446744073709551615 (!)
>
> This surprised me for two reasons - firstly, I genuinely wasn't aware
> that unsigned long is 64 bits on this system. (Why should I care, after
> all? As long as it is at least 32 bits, I'm happy.) But more
> importantly, what I was expecting was an output of "That's a terrible
> argument!" But what I actually got was a value that is difficult to
> envisage as an array size.
>
> (Obviously it's easy to fix this - we can simply check that the first
> non-whitespace character in the string is a digit in the relevant number
> base.)
>
> But that isn't, quite, what we were talking about.
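Indeed not - this is simply a misunderstanding of a detail of the
strtoul function, which is working exactly as specified: it accepts an
optional leading '+' or '-', and a '-' negates the converted value in
the unsigned return type, so "-1" comes back as ULONG_MAX.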
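A minimal sketch of the digit check suggested in the quoted text might look like the following. The name parse_size and its return convention are invented for illustration, and it is deliberately stricter than the quoted suggestion: it rejects leading whitespace as well as a sign, and also rejects trailing junk and out-of-range input.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* parse_size is a hypothetical helper, not part of the quoted program.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_size(const char *s, unsigned long *out)
{
  char *endptr = NULL;
  unsigned long n;

  assert(s != NULL && out != NULL);

  /* strtoul skips whitespace and accepts '+' or '-' by itself, so insist
     that the text starts with a digit before handing it over. */
  if(!isdigit((unsigned char)s[0]))
  {
    return 0;
  }

  errno = 0;
  n = strtoul(s, &endptr, 10);
  if(errno == ERANGE)   /* value does not fit in unsigned long */
  {
    return 0;
  }
  if(*endptr != '\0')   /* trailing junk such as "12abc" */
  {
    return 0;
  }

  *out = n;
  return 1;
}
```

Called as parse_size(argv[1], &n), this turns "-1" into a rejected argument rather than a very large number.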
>
> The following code is perhaps a warning of the perils of arbitrarily
> mixing up signed and unsigned quantities:
>
> #include <stdio.h>
>
> int main(void)
> {
>   int i = -6;
>   unsigned int j = 42;
>   if(i < j)
>   {
>     puts("-6 is less than 42");
>   }
>   else
>   {
>     puts("-6 is greater than or equal to 42");
>   }
>   return 0;
> }
>
> If it weren't for the fact that we were talking about the perils of
> signed and unsigned arithmetic, one might reasonably expect the code to
> print "-6 is less than 42", but of course it doesn't do that.
>
> This is a consequence of one of the "usual arithmetic conversions" -
> when the int value -6 is compared to the unsigned int value 42, it is
> first converted to unsigned int, and of course this means that it is not
> only greater than 42 but in fact very, very, very much greater!
Yes indeed, and this standard conversion is key to many of the problems
people get when mixing signed and unsigned values. Basically, it is
only safe to do so when the signed variable is non-negative.
Of course, this in itself is just a special case of the difference
between "normal" mathematics and numbers in C. Replace the two
declarations with "int16_t i = 1000000; int16_t j = 4000000;" (and
include <stdint.h>), and on typical implementations, where the
out-of-range values wrap to 16960 and 2304, you'll find that four
million is apparently smaller than one million.
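The "only safe when the signed value is non-negative" rule can be sketched as a comparison that tests the sign first. The function names here are invented for illustration; naive_less_than just spells out the conversion that the plain "i < j" performs implicitly.

```c
#include <assert.h>

/* Hypothetical helper: true when a is mathematically less than b. */
int int_less_than_uint(int a, unsigned int b)
{
  /* A negative int is below every unsigned value; only a non-negative
     one converts to unsigned int without changing its value. */
  return a < 0 || (unsigned int)a < b;
}

/* What "a < b" does by itself: a is converted to unsigned int first. */
int naive_less_than(int a, unsigned int b)
{
  return (unsigned int)a < b;
}

/* Small self-check using the values from the quoted example. */
void check_comparisons(void)
{
  assert(int_less_than_uint(-6, 42u));   /* the mathematical answer */
  assert(!naive_less_than(-6, 42u));     /* the surprising answer */
  assert(int_less_than_uint(6, 42u));    /* non-negative: both agree */
  assert(naive_less_than(6, 42u));
}
```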
>
> I think it is unfortunate that many C programmers tend to gloss over the
> "usual arithmetic conversions" on the grounds that, almost all the time,
> they make perfect sense and behave in exactly the way we expect and do
> everything we want. It is certainly a tendency to which I have succumbed
> on occasion. But *sometimes* they bite.
>
Indeed.
In my line of work, unsigned data turns up all the time, as does the
risk of overflows of various sorts - it pays to be careful and use
appropriate casts, so that you know exactly what size and signedness
you have at any given point.
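As a rough illustration of the kind of cast meant here (the function and parameter names are made up, and the 16-bit readings are only an assumed scenario), widening one operand before a multiplication fixes both the size and the signedness of the arithmetic, and a narrowing conversion can be guarded by a range check:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply two 16-bit values into a 32-bit result. */
uint32_t scale_reading(uint16_t raw, uint16_t gain)
{
  /* Without the cast both operands are promoted to int; where int is
     32 bits, 65535 * 65535 would overflow a signed int. The cast makes
     the multiplication unsigned and at least 32 bits wide. */
  return (uint32_t)raw * gain;
}

/* Hypothetical helper: narrow to 16 bits only when the value fits. */
int16_t to_int16(int32_t v)
{
  assert(v >= INT16_MIN && v <= INT16_MAX);
  return (int16_t)v;
}
```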