comp.lang.ruby

Check for text file

Alin Popa

6/19/2007 7:00:00 AM

Hi guys,

After some research I still cannot find a way to tell whether a file is
plain text or binary. I want to check whether a file is plain text, no
matter what characters it contains.
Is this possible in Ruby?

Thanks,

Alin

--
Posted via http://www.ruby-....

13 Answers

Alex Young

6/19/2007 7:06:00 AM


Alin Popa wrote:
> Hi guys,
>
> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain text no
> matter what characters are in it.
> This thing may be possible by using ruby ?

I think so, but it's a little unclear exactly what you're trying to
achieve. Do you have an example?

--
Alex


Alin Popa

6/19/2007 7:19:00 AM


Alex Young wrote:
> Alin Popa wrote:
>> Hi guys,
>>
>> After some research I still cannot find a way how to see if a file is
>> plain text or binary. In fact I want to check if a file is plain text no
>> matter what characters are in it.
>> This thing may be possible by using ruby ?
>
> I think so, but it's a little unclear exactly what you're trying to
> achieve. Do you have an example?

I'm trying to do a search-and-replace on some text in files, but I want to
skip archives and other binary files.

--
Posted via http://www.ruby-....

Alin Popa

6/19/2007 7:33:00 AM


Alin Popa wrote:
> Alex Young wrote:
>> Alin Popa wrote:
>>> Hi guys,
>>>
>>> After some research I still cannot find a way how to see if a file is
>>> plain text or binary. In fact I want to check if a file is plain text no
>>> matter what characters are in it.
>>> This thing may be possible by using ruby ?
>>
>> I think so, but it's a little unclear exactly what you're trying to
>> achieve. Do you have an example?
>
> I'm trying to do a replace in file for some text but I don't want to
> consider files like archives or other binary files.

Of course, on Windows I can look at the file extension and
ignore specific ones (e.g. .exe, .zip, .jar, .rar, .anything_i_want),
but I don't know how to do this on Linux/Unix, where a file extension is
not mandatory.

--
Posted via http://www.ruby-....

Robert Klemme

6/19/2007 7:54:00 AM


On 19.06.2007 09:33, Alin Popa wrote:
> Alin Popa wrote:
>> Alex Young wrote:
>>> Alin Popa wrote:
>>>> Hi guys,
>>>>
>>>> After some research I still cannot find a way how to see if a file is
>>>> plain text or binary. In fact I want to check if a file is plain text no
>>>> matter what characters are in it.
>>>> This thing may be possible by using ruby ?
>>> I think so, but it's a little unclear exactly what you're trying to
>>> achieve. Do you have an example?
>> I'm trying to do a replace in file for some text but I don't want to
>> consider files like archives or other binary files.
>
> Of course, when I'm on windows I can go after the file extension and try
> to ignore some specific (eg. .exe, .zip, .jar, .rar, .anything_i_want)
> but I don't know how to do it on Linux/Unix OS where file extension is
> not mandatory.

You could read the file (or a portion of it), create a histogram of
byte (or byte-group) occurrences, and compare that to what you
expect for text files (e.g. most characters are "0-9a-zA-Z" and punctuation).

You could also use the "file" command and parse its output.
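
A minimal sketch of the histogram idea, for illustration only. The names byte_histogram, text_ratio and looks_like_text? are hypothetical, and the 4096-byte sample size, the 0.9 threshold, and the set of "texty" bytes are assumptions to tune rather than established values:

```ruby
# Assumed set of "texty" bytes: tab, LF, CR, and printable ASCII (space through ~).
TEXT_BYTES = ([9, 10, 13] + (32..126).to_a).freeze

# Count how often each byte value occurs in the sample.
def byte_histogram(data)
  data.each_byte.with_object(Hash.new(0)) { |byte, hist| hist[byte] += 1 }
end

# Fraction of the sample made up of "texty" bytes (1.0 for an empty sample).
def text_ratio(data)
  return 1.0 if data.empty?
  hist = byte_histogram(data)
  TEXT_BYTES.sum { |byte| hist[byte] }.to_f / data.bytesize
end

# Hypothetical helper: sample the start of the file and apply the threshold.
def looks_like_text?(path, sample_size: 4096, threshold: 0.9)
  data = File.binread(path, sample_size) || ""
  text_ratio(data) >= threshold
end
```

This only recognizes ASCII-dominated text; files in other encodings would need a different expected histogram.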

Kind regards

robert

George Malamidis

6/19/2007 8:02:00 AM


Hello,

On a *nix system, you can do

file_type = `file my_file`
puts file_type

but this will not work on Windows.
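
One way the output could be parsed, as a rough sketch. It assumes a reasonably recent file(1) that supports the -b and --mime-type options, and the helper names text_mime? and text_file? are made up for this example:

```ruby
require "open3"

# Hypothetical helper: true if a MIME type string such as "text/plain\n"
# belongs to the text/ family.
def text_mime?(mime)
  mime.strip.start_with?("text/")
end

# Hypothetical helper: ask file(1) for the MIME type of +path+.
# -b omits the file name from the output; "--" guards against paths
# that start with a dash. Arguments are passed without a shell.
def text_file?(path)
  out, status = Open3.capture2("file", "-b", "--mime-type", "--", path)
  status.success? && text_mime?(out)
end
```

Some textual formats are reported under application/ (JSON, for example), so the text/ prefix check may need extending for a given set of files.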

George


Robert Klemme

6/19/2007 8:13:00 AM


On 19.06.2007 10:01, George Malamidis wrote:
> Hello,
>
> On a *nix system, you can do
>
> file_type = `file my_file`
> puts file_type
>
> but this will not work on Windows.

robert@fussel ~
$ file .inputrc
.inputrc: ASCII English text

robert@fussel ~
$ uname -a
CYGWIN_NT-5.1 fussel 1.5.24(0.156/4/2) 2007-01-31 10:57 i686 Cygwin

:-)

robert

Alin Popa

6/19/2007 8:16:00 AM


George Malamidis wrote:
> Hello,
>
> On a *nix system, you can do
>
> file_type = `file my_file`
> puts file_type
>
> but this will not work on Windows.
>
> George

Thanks guys, the problem is solved thanks to your suggestions ;)

Regarding the file command, I can use it on Windows too, since there are
gnuwin32 tools :)

Best regards,

Alin

--
Posted via http://www.ruby-....

Ryan Davis

6/19/2007 5:11:00 PM



On Jun 18, 2007, at 23:59 , Alin Popa wrote:

> After some research I still cannot find a way how to see if a file is
> plain text or binary. In fact I want to check if a file is plain
> text no
> matter what characters are in it.

http://blog.zenspider.com/archives/2006/08/i_miss_pe...



Alin Popa

6/19/2007 5:34:00 PM


Ryan Davis wrote:
> On Jun 18, 2007, at 23:59 , Alin Popa wrote:
>
>> After some research I still cannot find a way how to see if a file is
>> plain text or binary. In fact I want to check if a file is plain
>> text no
>> matter what characters are in it.
>
> http://blog.zenspider.com/archives/2006/08/i_miss_pe...

Nice, thanks.

--
Posted via http://www.ruby-....

Nobuyoshi Nakada

6/19/2007 6:17:00 PM


Hi,

At Wed, 20 Jun 2007 02:10:57 +0900,
Ryan Davis wrote in [ruby-talk:256206]:
> > After some research I still cannot find a way how to see if a file is
> > plain text or binary. In fact I want to check if a file is plain
> > text no
> > matter what characters are in it.
>
> http://blog.zenspider.com/archives/2006/08/i_miss_pe...

You can use String#count:

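def File.binary?(path)
  # Sample the first 4096 bytes; treat the file as binary if the sample
  # contains a NUL byte or at most 70% of it is tab, newline, or printable ASCII.
  s = read(path, 4096) and
    !s.empty? and
    (/\0/n =~ s or s.count("\t\n -~").to_f/s.size <= 0.7)
end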

In any case, it doesn't work for non-ASCII text files.
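
For non-ASCII text, one possible variation is to accept any sample that is valid UTF-8 and has no NUL bytes. This is only a sketch under that assumption (it says nothing about other encodings such as UTF-16 or Shift_JIS), and utf8_text? is a hypothetical name, not part of Ruby:

```ruby
# Hypothetical helper: true if +bytes+ looks like UTF-8 text.
# A fixed-size sample may end in the middle of a multi-byte character,
# so up to three trailing bytes are ignored when validating.
def utf8_text?(bytes)
  return true if bytes.empty?
  return false if bytes.include?("\0")
  (0..3).any? do |cut|
    cut < bytes.bytesize &&
      bytes.byteslice(0, bytes.bytesize - cut)
           .force_encoding(Encoding::UTF_8)
           .valid_encoding?
  end
end

# Possible usage: utf8_text?(File.binread(path, 4096) || "")
```

Because the last three bytes are not checked strictly, and because control characters other than NUL are valid UTF-8, this is a lenient heuristic rather than a proof that a file is text.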

--
Nobu Nakada