comp.lang.ruby

Using string.slice for unicode chars

Ilya V. Sabanin

5/27/2005 12:47:00 AM

Hi,

I am trying to cut a Unicode string with the "slice" method, but it looks like
Ruby is unable to do this.
Of course, I am requiring 'jcode':

$KCODE = 'u'
require 'jcode'

So, is Ruby unable to cut UTF-8 strings?

--
Sorry for my bad English!
With best regards,
Ilya Sabanin


1 Answer

Mark Hubbart

5/27/2005 3:55:00 AM

On 5/26/05, Ilya V. Sabanin <ilya.sabanin@gmail.com> wrote:
> Hi,
>
> I try to cut unicode string with "slice" method but looks like Ruby
> unable to do this.
> Of course I requiring 'jcode':
>
> $KCODE = 'u'
> require 'jcode'
>
> So, Ruby unable to cut utf-8 strings?

Not yet. I think it's slated for 2.0 (or 1.9 first). Some possible workarounds:

    ruby 1.8.2 (2004-12-25) on powerpc-darwin8.0.0
    Welcome to Interactive Ruby
    us = 'Ṧȶȑïŋɠ'
        ==>"Ṧȶȑïŋɠ"
    us.scan(/./)
        ==>["Ṧ", "ȶ", "ȑ", "ï", "ŋ", "ɠ"]
    us.unpack('U*')
        ==>[7782, 566, 529, 239, 331, 608]

Regexen will abide by the kcode, or you can unpack the UTF-8 string as
an array of integer code points.

It shouldn't be too hard to wrap the regexp idea into a u_slice
method. Here's an untested lightweight version:

    class String
      # Return `size` characters starting at character offset `index`.
      # \A anchors the match at the start of the string, and /m lets "."
      # match newlines, so the offset is always counted from character 0.
      def u_slice(index, size = 1)
        self[/\A.{#{index}}(.{#{size}})/m, 1]
      end

      # Like u_slice, but also removes the returned characters from self.
      def u_slice!(index, size = 1)
        pattern = /\A.{#{index}}(.{#{size}})/m
        str = self[pattern, 1]
        self[pattern, 1] = "" if str   # skip the assignment when nothing matched
        str
      end
    end
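The other workaround mentioned in the reply, unpacking the string into integer code points, can be turned into a slice in much the same way. The following is only a minimal sketch: `codepoint_slice` is a hypothetical helper name (it is not part of Ruby or jcode), and it assumes the input is valid UTF-8. It relies only on `String#unpack('U*')`, ordinary `Array#[]` slicing, and `Array#pack('U*')`.

```ruby
# Hypothetical helper, not from the thread: slice by code point, not by byte.
def codepoint_slice(str, index, size = 1)
  points = str.unpack('U*')      # UTF-8 bytes -> array of integer code points
  piece  = points[index, size]   # plain Array slicing; nil if index is past the end
  piece && piece.pack('U*')      # code points -> UTF-8 string (nil stays nil)
end

# Build the sample string from the code points shown in the irb session,
# so the example does not depend on the source file's encoding.
us = [7782, 566, 529, 239, 331, 608].pack('U*')
puts codepoint_slice(us, 1, 3)
```

Because this works on an array of integers, it does not depend on `$KCODE` or on how the regexp engine treats multibyte characters. The trade-off is that it builds a full array of code points on every call, which may be wasteful for long strings.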