Jano Svitok
2/3/2007 8:11:00 PM
On 2/3/07, Wido Menhardt <a@menhardt.com> wrote:
>
> I am sorrrrrry, but I am banging my head against this, and can't seem to
> find the answer!
>
> Text gets displayed in an input field in a web page with “
> prepended and ” appended to the string (needs to be inside the
> string otherwise it looks funny). The user edits it, and when it comes
> back to the (Rails) backend, the new string with (possibly) these quotes
> attached comes back, but in unicode.
>
> So the string possibly starts with UTF “ and possibly ends with
> UTF ”
>
> I want to do a regexp removal. Here is what works (but I am embarrassed):
>
> ldquo = '123'; ldquo[0] = 226; ldquo[1] = 128; ldquo[2] = 156
> rdquo = '123'; rdquo[0] = 226; rdquo[1] = 128; rdquo[2] = 157
> string.gsub!(/(\A#{ldquo}|#{rdquo}\Z)/,'')
>
> There must be a better way.
1. It's possible to insert the characters directly as escapes, either in octal (226 =
"\342") or in hex (226 = "\xe2"):
string.gsub!(/\A\xe2\x80\x9c|\xe2\x80\x9d\Z/, '')
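2. | has low precedence, so your regex is equivalent to /(\Aldquo)|(rdquo\Z)/:
each anchor applies to only one alternative, which is what you want here. The
outer capturing group is not needed. If you want to make the grouping explicit,
use non-capturing groups (?:...). Note that /\A(?:ldquo|rdquo)\Z/ would match
only a string that consists of a single quote character.
string.gsub!(/(?:\A\xe2\x80\x9c)|(?:\xe2\x80\x9d\Z)/, '')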
3. There's also the iconv library, which will convert between encodings for you.
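
Points 1 and 2 above can be put together in a small helper. This is only a sketch: the names LDQUO, RDQUO and strip_curly_quotes are made up for illustration, and it assumes the string coming back from the browser is UTF-8 and that the source file is read as UTF-8 (the default in current Ruby versions).

```ruby
# Hypothetical helper, not part of the original thread.
# Assumes UTF-8 input and a UTF-8 source file.

# UTF-8 byte sequences for the curly quotes, written as hex escapes.
LDQUO = "\xe2\x80\x9c"  # left double quotation mark  (bytes 226 128 156)
RDQUO = "\xe2\x80\x9d"  # right double quotation mark (bytes 226 128 157)

# Returns a copy of text without a leading left quote and without a
# trailing right quote. Quotes elsewhere in the string are left alone.
def strip_curly_quotes(text)
  # | binds loosely, so \A applies only to the left quote and
  # \Z only to the right quote.
  text.gsub(/\A#{LDQUO}|#{RDQUO}\Z/, '')
end

puts strip_curly_quotes("#{LDQUO}hello#{RDQUO}")
```

The sketch uses gsub rather than gsub! because gsub! returns nil when nothing was replaced, which is easy to trip over when the result is used directly.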