microsoft.public.dotnet.framework

Char datatype does some hocus pocus

Arne Garvander

1/25/2008 8:32:00 PM

Dim B As Byte = 150
Dim C As Char
C = Convert.ToChar(B)
B = Asc(C)
And the value of B is 63
Why?
--
Arne Garvander
Certified Geek
Professional Data Dude
6 Answers

Jon Skeet

1/25/2008 9:05:00 PM

Arne Garvander <ArneGarvander@discussions.microsoft.com> wrote:
> Dim B As Byte = 150
> Dim C As Char
> C = Convert.ToChar(B)
> B = Asc(C)
> And the value of B is 63
> Why?

Convert.ToChar will use Unicode - so you end up with Unicode character
150 (U+0096). That character (the control character "start of guarded
area") almost certainly isn't in your ANSI character encoding.

Moral: avoid Asc and Chr, which implicitly use ANSI. Use the Encoding
class instead, where you explicitly specify the encoding.
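
The round trip described above can be reproduced outside VB. What follows is a Python sketch rather than the VB runtime itself, and it assumes the default ANSI code page is Windows-1252 ("cp1252"), which the thread never states. The name ANSI is just an illustrative constant.

```python
# Sketch of the original round trip, assuming the ANSI code page is
# Windows-1252 (an assumption; the thread never names the code page).
ANSI = "cp1252"

b = 150

# Like Convert.ToChar(B): the number is taken as a Unicode code point.
c = chr(b)  # U+0096, a C1 control character

# Like Asc(C): encode the character in the ANSI code page. U+0096 has
# no cp1252 byte, so the replacement character '?' is emitted instead.
back = c.encode(ANSI, errors="replace")[0]

print(hex(ord(c)), back)
```

The second printed value is the code of '?', which matches the 63 reported in the question.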

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.... Blog: http://www.msmvps.com...
World class .NET training in the UK: http://iterativetrai...

Peter Duniho

1/25/2008 9:14:00 PM

On Fri, 25 Jan 2008 12:32:01 -0800, Arne Garvander
<ArneGarvander@discussions.microsoft.com> wrote:

> Dim B As Byte = 150
> Dim C As Char
> C = Convert.ToChar(B)
> B = Asc(C)
> And the value of B is 63
> Why?

Because 150 isn't a valid ASCII value (ASCII is 0 to 127). 63 is the
ASCII code for a question mark ('?'), and that's what Asc is returning
when you pass it a character that doesn't have a valid ASCII code.

Pete

Jon Skeet

1/25/2008 9:23:00 PM

Peter Duniho <NpOeStPeAdM@nnowslpianmk.com> wrote:
> On Fri, 25 Jan 2008 12:32:01 -0800, Arne Garvander
> <ArneGarvander@discussions.microsoft.com> wrote:
>
> > Dim B As Byte = 150
> > Dim C As Char
> > C = Convert.ToChar(B)
> > B = Asc(C)
> > And the value of B is 63
> > Why?
>
> Because 150 isn't a valid ASCII value (ASCII is 0 to 127). 63 is the
> ASCII code for a question mark ('?'), and that's what Asc is returning
> when you pass it a character that doesn't have a valid ASCII code.

To be strict about it, Asc is misnamed - it returns 63 when you pass it
a character which doesn't have a valid *ANSI* code for the default ANSI
code page on your system. Unfortunately there's a bad history of
assuming that ASCII==ANSI (and that ANSI is a specific encoding, rather
than a whole collection of them).
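
To make the "whole collection" point concrete, here is a small Python sketch that encodes one character under several single-byte code pages. The code pages chosen and the helper name ansi_code are illustrative assumptions; none of them come from the thread.

```python
# One character, several single-byte encodings. The list of code pages
# is illustrative; none of these are named in the thread.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE


def ansi_code(char, codepage):
    """Return the single-byte code for char, or 63 ('?') if unmapped."""
    return char.encode(codepage, errors="replace")[0]


codes = {cp: ansi_code(ch, cp) for cp in ("ascii", "cp1252", "cp1251", "cp437")}
print(codes)
```

The same character comes back as 233 under cp1252, 130 under cp437, and 63 under both ASCII and the Cyrillic cp1251, so a result of 63 depends on which code page is in effect.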

<shudders>

--
Jon Skeet - <skeet@pobox.com>

Peter Duniho

1/25/2008 10:35:00 PM

On Fri, 25 Jan 2008 13:22:50 -0800, Jon Skeet [C# MVP] <skeet@pobox.com>
wrote:

> To be strict about it, Asc is misnamed - it returns 63 when you pass it
> a character which doesn't have a valid *ANSI* code for the default ANSI
> code page on your system. Unfortunately there's a bad history of
> assuming that ASCII==ANSI (and that ANSI is a specific encoding, rather
> than a whole collection of them).

Ah, I see. I changed the code to use Chr(B) instead of Convert.ToChar()
and it does actually convert to the appropriate Unicode character
('\u2013' in this case). I made the error of assuming that the call to
Convert.ToChar() was doing something that the OP expected, but now I see
that it expects Unicode values and so passing what's an ANSI value does
actually allow Asc() to return the right thing.

I had assumed from the name that Asc() would return only ASCII values, but
looking at the doc page I see that it does return characters from the ANSI
range, just as you said.

I guess the moral is either to follow your original advice (use an
explicit Encoding for conversion), or in VB if you are just using the
default ANSI encoding then just use the appropriate function, Chr(),
instead of passing an ANSI value to a function that expects Unicode
(Convert.ToChar()).

Of course, yet another fix is to just stop using ANSI and switch over to
Unicode. :) Then if you want to specify that character by value, use
hex 2013 (&H2013, decimal 8211) instead of 150 and everything works fine.
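
The Chr()/Asc() pairing described here can be sketched in Python as a decode followed by an encode in the same code page. As before, Windows-1252 is an assumed stand-in for the default ANSI code page, and the variable names are illustrative.

```python
ANSI = "cp1252"  # assumed code page; the thread does not name one

raw = bytes([150])

# Like Chr(150): interpret the byte in the ANSI code page.
c = raw.decode(ANSI)  # U+2013 EN DASH

# Like Asc(c): encode the character back into the ANSI code page.
back = c.encode(ANSI)[0]

print(f"U+{ord(c):04X}", back, 0x2013)
```

Starting from the matching Unicode character, the round trip returns 150. The last printed value shows that hex 2013 is 8211 in decimal.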

Pete

Peter Duniho

1/25/2008 10:54:00 PM

On Fri, 25 Jan 2008 14:35:11 -0800, Peter Duniho
<NpOeStPeAdM@nnowslpianmk.com> wrote:

> [...] I made the error of assuming that the call to Convert.ToChar()
> was doing something that the OP expected, but now I see that it expects
> Unicode values and so passing what's an ANSI value does actually allow
> Asc() to return the right thing.

I wrote that, and I'm not even sure what it means. Obviously, passing an
ANSI value to a method that expects a Unicode value isn't going to do the
right thing. Not then, not later.

If you pass the correct Unicode value to Convert.ToChar(), you can later
get the expected ANSI value from Asc(). The Unicode will be converted to
ANSI for you. But only if you start with the correct, corresponding
Unicode character in the first place.

But this is a VB-specific thing. I agree that if you're writing .NET
code, and doing character encoding conversions, that using the actual
Encoding class is the appropriate solution. It makes it much more clear
what's going on, and isn't tied to a specific character encoding (the code
can be easily changed or generalized to use a different encoding than
ANSI).
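
As a rough illustration of why an explicit encoding generalizes well, here is a Python sketch of two helpers that take the encoding as a parameter. The names to_byte and from_byte are hypothetical and only loosely stand in for what the .NET Encoding class would do.

```python
def to_byte(char, encoding):
    """Encode one character to one byte; raise if it has no mapping."""
    data = char.encode(encoding)  # strict mode: raises UnicodeEncodeError
    if len(data) != 1:
        raise ValueError(f"{char!r} is not a single byte in {encoding}")
    return data[0]


def from_byte(value, encoding):
    """Decode one byte value to the character it denotes in encoding."""
    return bytes([value]).decode(encoding)


# The same byte value means different characters in different encodings.
print(repr(from_byte(150, "cp1252")), repr(from_byte(150, "latin-1")))
```

Because the strict encode raises an error instead of silently substituting '?', a mismatch such as to_byte("\u0096", "cp1252") fails loudly, and switching to another encoding is a one-argument change.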

Of course, it's not clear from the OP's post that he wanted or expected
_any_ conversion. In which case, it's really just a matter of avoiding
the .NET methods in the first place, and sticking to the VB functions that
deal only in ANSI.

Anyway, sorry for writing that confusing sentence. I wish I knew what
thought I was trying to express at the time. :)

Pete

Arne Garvander

1/28/2008 2:27:00 PM

If I try to put non-ANSI data into a string then I am screwed.
Before I got my current contract, someone started to put EBCDIC data into
strings.
Now it will take too much effort to clean it up.
--
Arne Garvander
Certified Geek
Professional Data Dude