comp.lang.ruby

parsing literals

konsu

12/14/2005 2:26:00 AM

hello,

i need to write a function that would parse a string literal in another
language. a string literal in this language is:

STRING = "CHAR*"
CHAR = any character except for " and \
     | \"
     | \\
     | \/
     | \u four hexadecimal digits

the \u sequence specifies a character in UTF-16 encoding.

for example: "abc", "", "a\"bc", "a\\b", "a\u12bfc"

below is the code that i wrote. is this Ruby enough? can someone
suggest improvements? a better style?

thanks
konstantin

def parselit(s)
  r = %r{\\"|\\/|\\\\|\\u[\da-f][\da-f][\da-f][\da-f]}i
  s =~ /^"((?:[^"\\]|#{r})*)"$/ && $1.gsub(r) { |x|
    x =~ /\\u(.*)/ ? [$1.hex].pack('U*') : x[1..-1]
  }
end

puts parselit('"\u004e\"a"')
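
A side note on the \u escapes: since the post describes them as UTF-16, a character outside the Basic Multilingual Plane would presumably arrive as two consecutive \u escapes (a surrogate pair), and packing each escape separately with pack('U*') would not join them. The sketch below shows one way to decode a run of escapes as UTF-16 code units. The helper name decode_u_escapes is made up for illustration, and whether the source language really uses surrogate pairs is an assumption.

```ruby
# Hypothetical helper, not part of the original post: decodes a run of
# \uXXXX escapes as UTF-16 code units and returns a UTF-8 string.
def decode_u_escapes(run)
  # Collect the four hex digits of every escape as integers.
  units = run.scan(/\\u([\da-f]{4})/i).flatten.map(&:hex)
  # Pack them as big-endian 16-bit units and let Ruby transcode from UTF-16,
  # which joins a high/low surrogate pair into a single character.
  units.pack('n*').force_encoding('UTF-16BE').encode('UTF-8')
end

puts decode_u_escapes('\u004e')        # prints N
puts decode_u_escapes('\ud83d\ude00')  # one character outside the BMP
```

To use something like this inside parselit, the regex would have to match a whole run of consecutive \u escapes rather than one escape at a time. A lone surrogate is not valid UTF-16, so encode is expected to raise an error for it rather than return a broken string.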

1 Answer

William James

12/14/2005 11:09:00 AM


ako... wrote:
> hello,
>
> i need to write a function that would parse a string literal in another
> language. a string literal in this language is:
>
> STRING = "CHAR*"
> CHAR = any character except for " and \
>      | \"
>      | \\
>      | \/
>      | \u four hexadecimal digits
>
> the \u sequence specifies a character in UTF-16 encoding.
>
> for example: "abc", "", "a\"bc", "a\\b", "a\u12bfc"
>
> below is the code that i wrote. is this Ruby enough? can someone
> suggest improvements? a better style?
>
> thanks
> konstantin
>
> def parselit(s)
>   r = %r{\\"|\\/|\\\\|\\u[\da-f][\da-f][\da-f][\da-f]}i
>   s =~ /^"((?:[^"\\]|#{r})*)"$/ && $1.gsub(r) { |x|
>     x =~ /\\u(.*)/ ? [$1.hex].pack('U*') : x[1..-1]
>   }
> end
>
> puts parselit('"\u004e\"a"')

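def parselit(s)

  # One escape sequence: \" \/ \\ or \u followed by four hex digits.
  re = %r{
      \\"
    | \\/
    | \\\\
    | \\u [\da-f]{4}
  }xoi

  return nil if s !~ /^".*"$/

  body = s[1..-2]
  out = ""
  consumed = 0

  # \G anchors each match where the previous one ended, so the scan
  # stops at the first character that is neither plain text nor an escape.
  body.scan( /\G (?: ( [^"\\]+ ) | ( #{re} ) )/x ) { |plain, esc|
    consumed += (plain || esc).length
    out <<
      if plain
        plain
      elsif esc[0,2] == '\u'
        [esc[2..-1].hex].pack('U*')
      else
        esc[1..-1]
      end
  }

  # Fail if whole string didn't match. Counting consumed characters
  # also covers the empty literal "", where scan never matches.
  consumed == body.length ? out : nil

end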

puts parselit('"\u004e\"a"')
puts parselit('"\u004e\""a"')
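
For comparison, the same token-by-token idea can be sketched with StringScanner from the standard library, which keeps track of the scan position itself so there is no need to check for leftover input by hand. This is only a sketch under the grammar given in the question; the name parselit_ss is made up for this example and is not from either post.

```ruby
require 'strscan'

# Hypothetical alternative to the scan-based version above.
# Returns the decoded string, or nil if s is not a valid literal.
def parselit_ss(s)
  ss = StringScanner.new(s)
  return nil unless ss.scan(/"/)          # opening quote

  out = +""
  until ss.scan(/"/)                      # closing quote ends the loop
    if ss.scan(/[^"\\]+/)                 # run of ordinary characters
      out << ss.matched
    elsif ss.scan(/\\u([\da-f]{4})/i)     # \uXXXX escape
      out << [ss[1].hex].pack('U')
    elsif ss.scan(%r{\\(["\\/])})         # \" \\ \/
      out << ss[1]
    else
      return nil                          # bad escape or missing closing quote
    end
  end

  ss.eos? ? out : nil                     # nothing may follow the closing quote
end

p parselit_ss('"\u004e\"a"')    # => "N\"a"
p parselit_ss('"\u004e\""a"')   # => nil
p parselit_ss('""')             # => ""
```

Each branch consumes at least one character or returns, so the loop always terminates. Unlike the /^".*"$/ pre-check, this version also accepts newlines inside the literal, which matches "any character except for " and \" in the grammar but may or may not be what the other language allows.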