comp.lang.ruby

reading multibyte characters

Vladimir Agafonkin

6/21/2006 11:44:00 AM

To read, for example, 20 bytes from a file, I can use the following
code:

block = input_file.read( 20 )

How can I do the same with multibyte characters instead of bytes? E.g.
I have a UTF-8-encoded file and want to read exactly 20 characters from
it, do something with them, then read another 20 characters, and so on.
Is it possible?

Thanks in advance.

1 Answer

Tim Hoolihan

6/21/2006 2:19:00 PM

Should look something like this:

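$KCODE = 'UTF8'   # Ruby 1.8: treat strings as UTF-8
require 'jcode'   # adds each_char and jlength to String

# Yields the text in chunks of 20 characters (not bytes).
def parse_utf_string(text)
  tempstr = ""
  text.each_char { |ch|
    tempstr += ch
    # jlength counts characters; length would count bytes here
    if tempstr.jlength == 20
      yield tempstr
      tempstr = ""
    end
  }
  yield tempstr if tempstr.jlength > 0  # last, shorter chunk of the file
end

parse_utf_string(File.read("test.txt")) { |chunk|
  puts "#{chunk.jlength} characters: #{chunk}"
}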
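
For anyone reading this on Ruby 1.9 or later: $KCODE and jcode are no longer needed there, because strings and IO objects carry an encoding and IO#each_char yields whole characters. The following is only a rough sketch under that assumption, not a drop-in replacement for the code above. The helper name each_char_chunk is made up for illustration, and a StringIO stands in for the file so the snippet runs on its own.

```ruby
require 'stringio'

# Hypothetical helper (not part of Ruby): yields successive strings of
# at most `size` characters read from an encoding-aware IO object.
def each_char_chunk(io, size)
  # each_char yields characters, not bytes; each_slice groups them
  # into arrays of up to `size` elements, which are joined back into strings.
  io.each_char.each_slice(size) { |chars| yield chars.join }
end

# Stand-in for a UTF-8 file: 45 two-byte characters (90 bytes).
input = StringIO.new("\u00e9" * 45)

each_char_chunk(input, 20) { |chunk|
  puts "#{chunk.length} characters, #{chunk.bytesize} bytes"
}
```

With a real file, something like File.open("test.txt", "r:UTF-8") { |f| each_char_chunk(f, 20) { |chunk| ... } } should behave the same way, and it reads the file incrementally instead of loading it all into memory first.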
