comp.lang.ruby

JSON.parse and unicode escape?

Jonathan Rochkind

8/26/2008 11:26:00 PM

The documentation for the Ruby JSON classes (http://json.ruby...)
implies that it handles Unicode escaping fine. But I'm having trouble
parsing JSON with a Unicode escape sequence in it. I am using the
'ext' parser (JSON::Ext::Parser), not the 'pure' parser, at version
1.1.2, which appears to still be the latest.

Here is some test JSON, an excerpt from JSON returned to me by a
third-party web service, boiled down to the simplest demonstration
case. I saved it in a file; here is what's in the text file:

=====
{ "key": 'something \x26 more' }
=====

I believe that is valid JSON containing an escaped Unicode character.
But JSON.parse on that string raises, complaining:

JSON::ParserError: unexpected token at '{ "summary": ' \u0026 ' }


I have verified it is the \x26 that's doing it. It doesn't like
\x-escaped Unicode.

Am I doing something wrong? Is the JSON I am receiving from the third
party bad somehow? This is such a widely used library that I'd be
surprised if it's broken and can't parse input including unicode escape
sequences... but that's what it looks like to me. Feedback?
--
Posted via http://www.ruby-....
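
For reference, the JSON grammar only defines the string escapes \" \\ \/ \b \f \n \r \t and \uXXXX, so \x26 is not one of them, and string values must use double quotes rather than single quotes. The following is a minimal sketch of one possible workaround, assuming the input is otherwise well-formed and double-quoted: it rewrites each \xHH as the equivalent \u00HH before parsing. The helper name normalize_x_escapes and the X_ESCAPE pattern are made up for illustration and are not part of the json library.

```ruby
require 'json'

# Matches either an escaped backslash (kept as-is, so "\\x26" is not
# mistaken for an escape) or a \xHH sequence with two hex digits.
X_ESCAPE = /\\\\|\\x(\h{2})/

# Hypothetical helper: rewrite every \xHH as the JSON escape \u00HH.
def normalize_x_escapes(json_text)
  json_text.gsub(X_ESCAPE) { $1 ? "\\u00#{$1}" : $& }
end

# Single quotes keep the backslash literal in the Ruby source.
raw   = '{ "key": "something \x26 more" }'
fixed = normalize_x_escapes(raw)

puts fixed                     # { "key": "something \u0026 more" }
puts JSON.parse(fixed)["key"]  # something & more
```

This only repairs the escape sequence. Input that wraps string values in single quotes, as in the excerpt above, would still need to be fixed at the source or by a more careful rewrite.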

3 Answers

pwever

10/1/2008 4:17:00 AM

I am running into what seems to be a related problem with the
following code:

irb
>> require 'json'
=> true
>> JSON.parse('{"s":"\uddb0"}')
JSON::ParserError: source sequence is illegal/malformed near uddb0"}
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from /Library/Ruby/Gems/1.8/gems/json-1.1.3/lib/json/common.rb:122:in `parse'
from (irb):2
from :0
>>

I don't know enough about unicode to really understand what is being
escaped here, but the following unicode characters, very close in
range (I assume) do not throw an error:
"\ucdb0", "\uedb0", "\ud7b0"

I also validated the JSON string ('{"s":"\uddb0"}') successfully at
http://www.jso... and in Python.

Any ideas of what might be the problem?
Are there any alternative JSON parsers for ruby?

Thank you very much // pascal

Rob Biedenharn

10/1/2008 4:29:00 AM

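That's not a valid Unicode character on its own: U+DDB0 falls in the
low-surrogate range (U+DC00 to U+DFFF). See:
http://www.unicode.org/charts/PDF...

A code point in that range only has meaning in UTF-16, as the second
half of a surrogate pair.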

-Rob
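
To make the range argument concrete, here is a small sketch in plain Ruby (no JSON library involved) that classifies a four-digit \u escape and shows how a high and a low surrogate combine into one code point. The names classify_escape and combine_surrogates are illustrative only. The ranges and the combining formula are the ones defined for UTF-16.

```ruby
HIGH_SURROGATES = (0xD800..0xDBFF)  # first half of a UTF-16 pair
LOW_SURROGATES  = (0xDC00..0xDFFF)  # second half of a UTF-16 pair

# Hypothetical helper: classify the four hex digits of a \uXXXX escape.
def classify_escape(hex_digits)
  code_point = hex_digits.hex
  if HIGH_SURROGATES.cover?(code_point)
    :high_surrogate
  elsif LOW_SURROGATES.cover?(code_point)
    :low_surrogate
  else
    :scalar  # a code point that can stand on its own
  end
end

# Combine a high and a low surrogate into the code point they encode.
def combine_surrogates(high, low)
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

%w[ddb0 cdb0 edb0 d7b0].each do |digits|
  puts "\\u#{digits}: #{classify_escape(digits)}"
end

# \ud83d\ude00 is a well-formed pair encoding U+1F600.
puts format('U+%X', combine_surrogates(0xD83D, 0xDE00))
```

Of the four escapes tried in the question, only \uddb0 lands in a surrogate range, which matches the observation that the other three parse without error.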



Pascal Wever

10/1/2008 5:56:00 PM

That makes a lot of sense. Thanks for the clarification regarding the
unicode range.

Since I don't have control over the JSON source, I would like to try to
parse the JSON even if it results in a malformed unicode string. So
today I tried switching from 'json' to the 'ruby-json' library. After
some searching online, I didn't find any documentation on how to use it
though. Primarily I don't know how to include or require it.

require 'ruby-json'
require 'rubyjson'

don't seem to work.
Any ideas are appreciated.
Thank you very much
// pascal
--
Posted via http://www.ruby-....
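
One approach that stays with the 'json' library, offered here as a sketch and not something proposed in the thread, is to replace each unpaired surrogate escape with \ufffd (the Unicode replacement character) before parsing. The helper name scrub_lone_surrogates and the SURROGATE_SCAN pattern are hypothetical. The sketch assumes the surrogates appear as \uXXXX escapes in otherwise well-formed JSON text.

```ruby
require 'json'

# Alternatives are tried in order at each position:
#   1. an escaped backslash, kept untouched
#   2. a well-formed high + low surrogate pair, kept untouched
#   3. any remaining surrogate escape, which must be unpaired
SURROGATE_SCAN = /
    \\\\
  | \\u[dD][89abAB]\h{2}\\u[dD][c-fC-F]\h{2}
  | (?<lone>\\u[dD][89a-fA-F]\h{2})
/x

# Hypothetical helper: swap unpaired surrogate escapes for \ufffd.
def scrub_lone_surrogates(json_text)
  json_text.gsub(SURROGATE_SCAN) { $~[:lone] ? '\ufffd' : $& }
end

# Single quotes keep the backslashes literal in the Ruby source.
broken = '{"s":"\uddb0"}'
paired = '{"s":"\ud83d\ude00"}'

puts scrub_lone_surrogates(broken)  # {"s":"\ufffd"}
puts scrub_lone_surrogates(paired)  # {"s":"\ud83d\ude00"}

parsed = JSON.parse(scrub_lone_surrogates(broken))
puts parsed["s"] == "\uFFFD"        # true
```

The trade-off is that the original, malformed value is lost. If the exact code unit matters, it would have to be captured from the raw text before scrubbing.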