comp.lang.ruby

Question on reading text files in Windows

Jim Knowlton

2/2/2009 5:39:00 PM

I am running Ruby 1.8.6 on Windows, and having trouble reading in some
text files. For some text files, if I do something simple like:

myfile = File.open("logfile.log")
contents = myfile.read()
puts contents

I get each character separated by a space, such as:

¦= = = V e r b o s e l o g g i n g s t a r t e d : 1 / 2 8 / 2
0 0 9
1 3 : 4 5 : 0 6 B u i l d t y p e : S H I P U N I C O D E

If I bring up the file in even a bare-bones editor (such as VIM), I
get the file as it normally is (without any extraneous spaces). Does
anyone know why this would be, or how I can work around it? It's
causing issues as I am trying to write a script to search for a
particular string of text, and obviously it isn't found, even though
it should be.

Thanks,

Jim
5 Answers

Stefan Lang

2/2/2009 7:09:00 PM

2009/2/2 Jim Knowlton <jknowlton525@gmail.com>:
> I am running Ruby 1.8.6 on Windows, and having trouble reading in some
> text files. For some text files, if I do something simple like:
>
> myfile = File.open("logfile.log")
> contents = myfile.read()
> puts contents
>
> I get each character separated by a space, such as:
>
> ¦= = = V e r b o s e l o g g i n g s t a r t e d : 1 / 2 8 / 2
> 0 0 9
> 1 3 : 4 5 : 0 6 B u i l d t y p e : S H I P U N I C O D E
>
> If I bring up the file in even a bare-bones editor (such as VIM), I
> get the file as it normally is (without any extraneous spaces). Does
> anyone know why this would be, or how I can work around it? It's
> causing issues as I am trying to write a script to search for a
> particular string of text, and obviously it isn't found, even though
> it should be.

The file is probably UTF-16 encoded and starts with a BOM.
Try to convert the string to UTF-8, or switch to Ruby 1.9.

Stefan
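
A quick way to check this guess is to look at the first two bytes of the file. In UTF-16 every ASCII character is followed (or preceded) by a zero byte, which a console tends to show as a blank, and that would explain the "spaces". The sketch below is only an illustration: `utf16_bom` is a hypothetical helper name, not something from this thread. It reports whether a file starts with the little-endian mark (bytes FF FE) or the big-endian mark (FE FF).

```ruby
# Hypothetical helper: report which UTF-16 byte order mark, if any,
# a file starts with. Returns :utf16le, :utf16be, or nil.
def utf16_bom(path)
  # Open in binary mode so that no newline translation happens on Windows.
  head = File.open(path, "rb") { |f| f.read(2) }

  # f.read(2) returns nil for an empty file, hence the to_s.
  case head.to_s.unpack("C*")
  when [0xFF, 0xFE] then :utf16le   # little-endian, the usual case on Windows
  when [0xFE, 0xFF] then :utf16be   # big-endian
  else nil                          # no UTF-16 BOM found
  end
end
```

Calling `utf16_bom("logfile.log")` on the file from the question would be expected to return `:utf16le` if the guess is right. The helper only inspects the BOM; a UTF-16 file written without one would come back as `nil`.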

Stefan Lang

2/2/2009 7:26:00 PM

2009/2/2 Stefan Lang <perfectly.normal.hacker@gmail.com>:
> 2009/2/2 Jim Knowlton <jknowlton525@gmail.com>:
>> I am running Ruby 1.8.6 on Windows, and having trouble reading in some
>> text files. For some text files, if I do something simple like:
>>
>> myfile = File.open("logfile.log")
>> contents = myfile.read()
>> puts contents
>>
>> I get each character separated by a space, such as:
>>
>> ¦= = = V e r b o s e l o g g i n g s t a r t e d : 1 / 2 8 / 2
>> 0 0 9
>> 1 3 : 4 5 : 0 6 B u i l d t y p e : S H I P U N I C O D E
>>
>> If I bring up the file in even a bare-bones editor (such as VIM), I
>> get the file as it normally is (without any extraneous spaces). Does
>> anyone know why this would be, or how I can work around it? It's
>> causing issues as I am trying to write a script to search for a
>> particular string of text, and obviously it isn't found, even though
>> it should be.
>
> The file is probably UTF-16 encoded and starts with a BOM.
> Try to convert the string to UTF-8, or switch to Ruby 1.9.

Sorry, I meant to say "Try to convert the string to UTF-8 WITH Iconv"

Stefan

Jim Knowlton

2/2/2009 9:06:00 PM

Thanks...so if I upgraded to Ruby 1.9, would it convert it
automatically?

Jim Knowlton

2/2/2009 9:42:00 PM

Thanks for the pointer! I actually ended up using the iconv module,
and it worked like a charm. Incidentally, in case anyone else is
curious about this, Windows .REG files get saved as UTF-16 by default.
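
For anyone who wants to see roughly what the Iconv route looks like, here is a minimal sketch. It is not the script actually used in this thread: `utf16_to_utf8` is a hypothetical helper name, and the fallback branch is an addition for Ruby versions where iconv is no longer in the standard library (it was removed in Ruby 2.0). The fallback assumes little-endian input, which is the usual case on Windows.

```ruby
# Hypothetical helper: turn raw UTF-16 bytes (starting with a BOM)
# into a UTF-8 string.
def utf16_to_utf8(raw)
  require "iconv"   # standard library in Ruby 1.8; a separate gem since 2.0
  # Iconv.conv takes (to, from, string). The plain "UTF-16" name lets
  # iconv pick the byte order from the BOM.
  Iconv.conv("UTF-8", "UTF-16", raw)
rescue LoadError
  # No iconv available: use String#encode (Ruby 1.9 and later) instead.
  # Assumption: the bytes are little-endian.
  text = raw.dup.force_encoding("UTF-16LE").encode("UTF-8")
  text.sub(/\A\uFEFF/, "")   # drop the BOM character if present
end
```

The raw bytes would come from a binary read, for example `File.open("logfile.log", "rb") { |f| f.read }`. Once the text is UTF-8, an ordinary `include?` or regular expression search for the string should behave as expected.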

Stefan Lang

2/2/2009 9:53:00 PM

2009/2/2 Jim Knowlton <jknowlton525@gmail.com>:
> Thanks...so if I upgraded to Ruby 1.9, would it convert it
> automatically?

You'd have to tell it that you want to work with UTF-8
internally by putting this at the top of your application:

Encoding.default_internal = Encoding::UTF_8

and then tell the read or open function that the file
is UTF-16 encoded, e.g.:

content = File.read("logfile.log", encoding: "utf-16")

Though I don't know how many gems already work for
Ruby 1.9.1 on Windows.

Stefan
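
As a variation on the Ruby 1.9 approach above, the external and internal encodings can also be given together in the mode string, so no global default has to be set. This is a sketch under assumptions, not a tested recipe for every log file: `read_utf16le` is a hypothetical helper name, and it assumes the file is little-endian UTF-16, as is typical for files written by Windows tools.

```ruby
# Hypothetical helper: read a UTF-16LE file and return its text as UTF-8.
def read_utf16le(path)
  # "rb" reads in binary mode; "UTF-16LE:UTF-8" is external:internal
  # encoding, so Ruby transcodes to UTF-8 while reading.
  text = File.open(path, "rb:UTF-16LE:UTF-8") { |f| f.read }

  # After transcoding, the BOM is still there as the character U+FEFF.
  text.sub(/\A\uFEFF/, "")
end
```

With this in place, something like `read_utf16le("logfile.log").include?("Build type")` should find the string. Because the file is read in binary mode, Windows line endings stay as `\r\n` in the result.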