dk wrote:
> @BugBear: yeah the xml [sic] is a well formed and properly validated xml [sic].
>
That didn't answer his question. Answer his question.
"Have you checked that your data IS valid UTF-8 ?"
Clearly there is an improperly-encoded character in your XML file.
Find that and fix it.
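One way to find it is to let a strict decoder tell you where it chokes. A minimal sketch, assuming you can read the whole file into a byte array; the class and method names (Utf8Check, firstMalformedOffset) are made up for illustration and are not part of any library:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a library class.
class Utf8Check {
    /** Returns the offset of the first byte that is not valid UTF-8, or -1 if every byte decodes. */
    static int firstMalformedOffset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        // UTF-8 never yields more chars than it consumes bytes, so this buffer cannot overflow.
        CharBuffer out = CharBuffer.allocate(data.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On an error the input buffer's position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        byte[] good = "caf\u00E9".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(firstMalformedOffset(good)); // -1
        System.out.println(firstMalformedOffset(bad));  // 3
    }
}
```

Open the file in hex mode at the reported offset and you are looking at the offending byte.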
> @Roedy: write now I'm using ultraEdit and inserting the characters
> from the ASCII table that it has. I have even tried seeing it in hex
> mode and I got the same value from both the places.
>
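ASCII != UTF-8. ASCII defines only the values 0x00 through 0x7F.
Anything above 0x7F in an editor's table comes from some 8-bit code
page, and a single byte above 0x7F is never a complete UTF-8 character.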
That hex value for the bad character, does it match the UTF-8 byte
sequence for that character? Is it four bytes long? What character is
it, and what is the hex value you observe? (Note: that's four
questions, so there ought to be four answers.)
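If you don't know what the bytes ought to be, ask Java. A small sketch, using U+00E9 (e with acute accent) purely as a stand-in for whatever your character is; the class and method names are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a library class.
class HexBytes {
    /** Encodes text in the given charset and returns the bytes as space-separated hex. */
    static String hex(String text, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes do not print as FFFFFFxx.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\u00E9"; // stand-in character, an assumption for this example
        System.out.println(hex(s, StandardCharsets.ISO_8859_1)); // E9
        System.out.println(hex(s, StandardCharsets.UTF_8));      // C3 A9
    }
}
```

Compare that output with what the hex view of your file shows. If the file has the one-byte form, it is not UTF-8.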
> Meanwhile I have found something more interesting while reading the
> input stream from my xml [sic] if I exclusively define it to be formatted to
> UTF-8 in getByteStream it is working fine. Now here is this a Java bug
> (1.5.0.12)? or something else?
>
It's not a Java bug.
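Telling the parser which encoding to use just replaces its guess with yours. A rough sketch of that, assuming a JAXP DOM parser and a document that carries no encoding declaration of its own; I can't see your code, so the class and method names here are invented and the real setup may differ:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not a library class.
class ExplicitEncoding {
    /** Parses the XML bytes with the named encoding and returns the root element's text. */
    static String rootText(byte[] xml, String encoding) throws Exception {
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        // Without this call the parser has to work the encoding out from the bytes themselves.
        source.setEncoding(encoding);
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(source);
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The same text, written to bytes in ISO-8859-1, with no XML declaration.
        byte[] xml = "<a>caf\u00E9</a>".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(xml, "ISO-8859-1").equals("caf\u00E9"));
    }
}
```

The name you pass has to match how the bytes were actually written. Pass the wrong one and you get either a parse failure or the wrong characters.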
> Now this has led to a confusion. I thought ISO-8859-1 is a charset
Did you mean "encoding"?
> which is subset of UTF-8. Then why didn't UTF-8 work whereas
> ISO-8859-1 worked?
>
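Because you were wrong. ISO-8859-1 is not a subset of UTF-8. The two
encodings agree only on the ASCII range, 0x00 through 0x7F. Above that,
ISO-8859-1 uses one byte per character where UTF-8 uses two or more.
If you have an assumption, let's call it a hypothesis, and the
evidence contradicts the hypothesis, then the hypothesis is wrong.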
Simple.
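You can watch the two encodings disagree. A minimal sketch, assuming a file that was really saved as ISO-8859-1; the byte values and the class name are made up for illustration:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not a library class.
class DecodeBoth {
    /** Decodes the same bytes under the given charset. */
    static String decode(byte[] data, Charset charset) {
        // new String(...) substitutes U+FFFD for input it cannot decode instead of throwing.
        return new String(data, charset);
    }

    public static void main(String[] args) {
        // 'c' 'a' 'f' followed by 0xE9, which is e-acute in ISO-8859-1.
        byte[] data = {0x63, 0x61, 0x66, (byte) 0xE9};
        String asLatin1 = decode(data, StandardCharsets.ISO_8859_1);
        String asUtf8 = decode(data, StandardCharsets.UTF_8);
        System.out.println(asLatin1.equals("caf\u00E9")); // true
        System.out.println(asUtf8.equals("caf\uFFFD"));   // true: the lone 0xE9 is not valid UTF-8
    }
}
```

ISO-8859-1 assigns a character to every one of the 256 byte values, so decoding with it can never fail. That is not the same as it being right.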
--
Lew