comp.lang.ruby

How to include zip in a program.

Daniel Carrera

10/27/2003 2:28:00 PM

Hello all,

Yay! I'm now subscribed to this list. =)

As I mentioned earlier, I'm working on a program to gather data from
OpenOffice.org files. It's coming along well. It is a sort of grep-like
program, with one difference: it can match not only text content but
also styles.

I think this is very neat. You can take advantage of the logical
structure of OpenOffice.org documents. Say you have a document full of
poems, and you create a style called "PoemAuthor". With this program you
can extract all the authors from the document.

Now, I am finding one significant stumbling block: Zip.

OOo files are zip archives. Ruby doesn't come with zip functionality by
default, and I don't want to ask people to compile anything to get this
program to run. Most of them won't have a compiler.

Also, I haven't been able to compile the ruby-zlib package from RAA, so
I'm not sure I want to rely on it.

Can anyone suggest a good way of solving this? All I need is to extract
one file from a zip archive. Currently, the program runs the external
program, unzip. Perhaps I should ship unzip along with the program? I
think I'm leaning towards this direction.

Does this sound like a good approach?

I am on a Solaris system, so I would need some help gathering command-line
zip programs for the three main platforms: Linux, Mac OS X and Windows.

Any thoughts, comments and ideas would be much appreciated.

Cheers,
--
Daniel Carrera | OpenPGP KeyID: 9AF77A88
PhD grad student. |
Mathematics Dept. | "To understand recursion, you must first
UMD, College Park | understand recursion".
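
The style-matching idea in the post above can be illustrated with a rough sketch. It assumes content.xml stores each paragraph as a text:p element whose text:style-name attribute holds the style name directly; documents with direct formatting may point at an automatic style instead, which this sketch does not resolve. The helper name paragraphs_with_style and the sample data are made up for illustration, and a real tool would use an XML parser rather than a regular expression.

```ruby
# Hypothetical helper: collect the text of every paragraph that uses the
# given style. Crude on purpose: regex-based, no entity decoding, and
# self-closing (empty) paragraphs are skipped.
def paragraphs_with_style(content_xml, style)
  pattern = /<text:p\b[^>]*\btext:style-name="#{Regexp.escape(style)}"[^>]*(?<!\/)>(.*?)<\/text:p>/m
  content_xml.scan(pattern).map do |match|
    match.first.gsub(/<[^>]+>/, '') # drop nested tags such as text:span
  end
end

# Made-up sample resembling the body of a content.xml file.
sample = <<~XML
  <office:body>
    <text:p text:style-name="PoemTitle">Ozymandias</text:p>
    <text:p text:style-name="PoemAuthor">Percy Bysshe Shelley</text:p>
    <text:p text:style-name="PoemAuthor">Emily <text:span text:style-name="T1">Dickinson</text:span></text:p>
  </office:body>
XML

authors = paragraphs_with_style(sample, 'PoemAuthor')
puts authors
```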

17 Answers

Andrew Walrond

10/27/2003 3:04:00 PM

Daniel Carrera wrote:

>
> As I mentioned earlier, I'm working on a program to gather data from
> OpenOffice.org files. It's coming along well. It is sort of a grep-like
>
> Now, I am finding one significant stumbling block: Zip.
>
> OOo files are zip archives. Ruby doesn't come with zip functionality by
> default, and I don't want to ask people to compile anything to get this
> program to run. Most of them won't have a compiler.

Just a thought; Does OOo install an unzip library/binary you can get at?
Since OOo is guaranteed to be installed if your tool is being used, you
wouldn't need to ship anything extra.

--
Andrew Walrond

Sergei Olonichev

10/27/2003 3:46:00 PM

Hello,

I have found a problem with the character class definitions in Unicode
regular expressions. It seems \s isn't defined properly.

See the following simple program which ought to change space symbols
into "line feed":
cat test.utf8 | ruby-1.8.0 -Ku -ne '$_.gsub(/[\s]+/u,"\n"); puts $_;'

test.utf8 contains the following in hex:
C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A

which is UTF8 code for:
00A0 NS no-break space
0032 2 digit two
0033 3 digit three
0020 SP space
0031 1 digit one
000A LF line feed (lf)
00A0 NS no-break space
0032 2 digit two
0034 4 digit four
0020 SP space
0031 1 digit one
000A LF line feed (lf)

But Ruby does not make any changes (does not change "no-break space"
into "line feed")!
Is that a bug?


Best wishes,
Sergei




Simon Strandgaard

10/27/2003 4:29:00 PM

On Tue, 28 Oct 2003 00:45:48 +0900, Sergei Olonichev wrote:

> Hello,
>
> I have found a problem with character classes definition in unicoded
> regular expressions. It seems \s isn't defined properly.
>
> See the following simple program which ought to change space symbols
> into "line feed":
> cat test.utf8 | ruby-1.8.0 -Ku -ne '$_.gsub(/[\s]+/u,"\n"); puts $_;'
>
> test.utf8 contains the following in hex:
> C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A
>
> which is UTF8 code for:
> 00A0 NS no-break space
> 0032 2 digit two
> 0033 3 digit three
> 0020 SP space
> 0031 1 digit one
> 000A LF line feed (lf)
> 00A0 NS no-break space
> 0032 2 digit two
> 0034 4 digit four
> 0020 SP space
> 0031 1 digit one
> 000A LF line feed (lf)
>
> But Ruby does not make any changes (does not change "no-break space"
> into "line feed")!
> Is that a bug?

No..


server> ruby u.rb
"+)!! !\036+)!! !\036"
"+)!!\n!\036+)!!\n!\036"
server> cat u.rb
input = %w(C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A)
str = input.map{|i| i.unpack('H2')[0].to_i.chr}.join
p str
p str.gsub(/[\s]+/u,"\n")
server>

I see no problems with regexp \s..

--
Simon Strandgaard

Thomas Sondergaard

10/27/2003 5:00:00 PM

Daniel,

zlib is included in ruby 1.8.0 so you don't need to compile it yourself.
zlib in itself does not support the zip archive format. I have written a
small ruby module called rubyzip that groks the zip archive format and uses
zlib to do the (de)compression. If you search for 'zip' on
http://raa.ruby... it's right there at the top of the list; I wonder
how you missed it.

http://rubyzip.sourc...

Thomas
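
To make the split described above concrete (the archive layout is one thing, the compression is zlib's job), here is a minimal sketch that pulls a single member out of a zip archive using only the zlib extension. It is not rubyzip's code or API; read_zip_member is a made-up name, and the sketch assumes a current Ruby, a single-disk archive without zip64 or encryption, and only the "stored" and "deflate" methods.

```ruby
require 'zlib'

# Hypothetical helper (not rubyzip's API): return the bytes of one member
# of a zip archive held in memory, or nil if no member has that name.
def read_zip_member(data, wanted)
  data = data.b # work on raw bytes so that offsets are byte offsets
  # The end-of-central-directory record sits near the end of the archive.
  eocd = data.rindex("PK\x05\x06".b)
  raise ArgumentError, 'not a zip archive' unless eocd
  count, _cd_size, pos = data[eocd + 10, 10].unpack('vVV')

  count.times do
    raise ArgumentError, 'bad central directory' unless data[pos, 4] == "PK\x01\x02".b
    method = data[pos + 10, 2].unpack1('v')
    crc, csize, _usize, nlen, elen, clen = data[pos + 16, 18].unpack('VVVvvv')
    local = data[pos + 42, 4].unpack1('V')
    if data[pos + 46, nlen] == wanted.b
      # The local header repeats the name and has its own extra-field length.
      lnlen, lelen = data[local + 26, 4].unpack('vv')
      raw = data[local + 30 + lnlen + lelen, csize]
      body = case method
             when 0 then raw # stored, no compression
             when 8 then Zlib::Inflate.new(-Zlib::MAX_WBITS).inflate(raw) # raw deflate
             else raise NotImplementedError, "compression method #{method}"
             end
      raise 'CRC mismatch' unless Zlib.crc32(body) == crc
      return body
    end
    pos += 46 + nlen + elen + clen # skip to the next central directory entry
  end
  nil
end
```

Usage would look something like read_zip_member(File.binread('poems.sxw'), 'content.xml'), where poems.sxw is a placeholder file name. The negative window size tells zlib to expect a raw deflate stream with no zlib header, which is how zip stores compressed members.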


Robert Klemme

10/27/2003 5:19:00 PM

"Thomas Sondergaard" <thomas@FirstNameGoesHereSondergaard.com> wrote in
message news:3f9d4e81$0$94868$edfadb0f@dtext02.news.tele.dk...
> Daniel,
>
> zlib is included in ruby 1.8.0 so you don't need to compile it yourself.
> zlib in itself does not support the zip archive format. I have written a
> small ruby module called rubyzip that groks the zip archive format and uses
> zlib to do the (de)compression. If you search for 'zip' on
> http://raa.ruby... it's right there at the top of the list, I wonder
> how you missed it.
>
> http://rubyzip.sourc...

+1 for inclusion of this in the std distribution.

(another +1 for eventually including bzip2 into std distribution - if
that's legally ok.)

robert

Simon Strandgaard

10/27/2003 5:59:00 PM

On Mon, 27 Oct 2003 17:28:36 +0100, Simon Strandgaard wrote:

> On Tue, 28 Oct 2003 00:45:48 +0900, Sergei Olonichev wrote:
>
>> Hello,
>>
>> I have found a problem with character classes definition in unicoded
>> regular expressions. It seems \s isn't defined properly.
>>
>> See the following simple program which ought to change space symbols
>> into "line feed":
>> cat test.utf8 | ruby-1.8.0 -Ku -ne '$_.gsub(/[\s]+/u,"\n"); puts $_;'
>>
>> test.utf8 contains the following in hex:
>> C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A
>>
>> which is UTF8 code for:
>> 00A0 NS no-break space
>> 0032 2 digit two
>> 0033 3 digit three
>> 0020 SP space
>> 0031 1 digit one
>> 000A LF line feed (lf)
>> 00A0 NS no-break space
>> 0032 2 digit two
>> 0034 4 digit four
>> 0020 SP space
>> 0031 1 digit one
>> 000A LF line feed (lf)
>>
>> But Ruby does not make any changes (does not change "no-break space"
>> into "line feed")!
>> Is that a bug?
>
> No..
>
>
> server> ruby u.rb
> "+)!! !\036+)!! !\036"
> "+)!!\n!\036+)!!\n!\036"
> server> cat u.rb
> input = %w(C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A)
> str = input.map{|i| i.unpack('H2')[0].to_i.chr}.join
> p str
> p str.gsub(/[\s]+/u,"\n")
> server>
>
> I see no problems with regexp \s..

Hmm, there is something wrong with my code. Sorry, I was too quick.
My hex2utf8 conversion is buggy. Does anyone know a smarter way to do
this?

--
Simon Strandgaard
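
One possible answer to the conversion question, as a sketch for a current Ruby (1.9 or later, so not the 1.8.0 discussed in this thread): pack('H*') turns a string of hex digits into raw bytes. The sketch also compares \s with the POSIX class [[:space:]], since in current Ruby \s covers only ASCII whitespace. Note that the original one-liner has a separate issue: gsub returns a new string and leaves $_ untouched, so puts $_ prints the input unchanged regardless of what the pattern matches.

```ruby
# Hex byte values from the post: UTF-8 encoded input.
hex = %w(C2 A0 32 33 20 31 0A C2 A0 32 34 20 31 0A)

# pack('H*') converts a string of hex digits into the bytes they denote;
# the result is binary, so it is tagged as UTF-8 afterwards.
str = [hex.join].pack('H*').force_encoding('UTF-8')

# \s matches ASCII whitespace only, so U+00A0 survives this substitution.
ascii_only = str.gsub(/\s+/, "\n")

# [[:space:]] is Unicode-aware for UTF-8 strings and also matches U+00A0.
unicode_aware = str.gsub(/[[:space:]]+/, "\n")

p ascii_only
p unicode_aware
```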

Rodrigo Bermejo

10/27/2003 7:58:00 PM

Daniel,

>OOo files are zip archives.

Have you taken a look at the OOo source code to see how OOo deals with it?



>Perhaps I should ship unzip along with the program? I
>think I'm leaning towards this direction.
>
>Does this sound like a good approach?
>
>
Sounds good ...
or you could simply declare the dependency and, if unzip is not found on
the system, raise an error that includes the download URL for the user's
platform.
I hate having duplicated files on my system.

I suspect most Linux distros come with zip/unzip.

>I am on a Solaris system, so I would need some help gathering command-line
>zip programs for the three main platforms: Linux, Mac OS X and Windows.
>
I can help you with the linux box.

-r.
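
The "check for unzip and fail with a helpful message" approach suggested above might look roughly like this. It is a sketch for a current Ruby: find_executable and extract_with_unzip are made-up names, the error text is a placeholder (no real download URLs are given here), and it assumes an Info-ZIP style unzip whose -p option writes the extracted member to standard output.

```ruby
# Hypothetical helper: look for an executable along a PATH-style string.
def find_executable(name, path = ENV['PATH'].to_s)
  # On Windows, PATHEXT lists extensions such as .EXE; elsewhere it is unset.
  exts = [''] + ENV['PATHEXT'].to_s.split(';')
  path.split(File::PATH_SEPARATOR).each do |dir|
    exts.each do |ext|
      candidate = File.join(dir, name + ext)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end
  nil
end

# Hypothetical helper: extract one member by running the external unzip.
def extract_with_unzip(archive, member)
  unzip = find_executable('unzip')
  unless unzip
    # Placeholder message; a real program would name a download location
    # appropriate to the user's platform.
    raise 'unzip was not found on your PATH; please install it and retry'
  end
  output = IO.popen([unzip, '-p', archive, member], 'rb', &:read)
  raise "unzip failed on #{archive}" unless $?.success?
  output
end
```

Passing the command as an array avoids going through a shell, so file names with spaces or quotes need no escaping.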

gabriele renzi

10/27/2003 8:25:00 PM

On Mon, 27 Oct 2003 18:19:03 +0100, "Robert Klemme" <bob.news@gmx.net>
wrote:


>+1 for inclusion of this in the std distribution.

for what my opinion is worth,
$vote.succ!

Daniel Carrera

10/27/2003 8:30:00 PM

On Mon, Oct 27, 2003 at 01:16:18PM -0500, walter@mwsewall.com wrote:
>
> have you tried using rubyzip at
> http://sourceforge.net/projec...
> it uses the ruby/zlib that is included in most recent ruby
> distributions. It is easy to use and works pretty well.

Recent ruby distributions come with ruby/zlib?
I am running 1.8.0 and I don't have ruby/zlib. I checked the Ruby website
and it looks like 1.8.0 is the newest.

There is an RAA project called ruby-zlib but I can't get it to compile.
(and even if I could, I'd still have the problem that I can't expect the
recipients of this program to have a compiler).

Cheers,
--
Daniel Carrera | OpenPGP KeyID: 9AF77A88
PhD grad student. |
Mathematics Dept. | "To understand recursion, you must first
UMD, College Park | understand recursion".

Daniel Carrera

10/27/2003 8:52:00 PM

I think that my Ruby is broken. I did compile it from source, but I'm
missing the zlib module.

$ ruby -v
ruby 1.8.0 (2002-12-24) [sparc-solaris2.8]
$
$ ruby -e " require 'zlib' "
-e:1:in `require': No such file to load -- zlib (LoadError)
$
$ ls -R lib/ruby/ | grep zlib
$

I've tried compiling the ruby-zlib module from RAA but without success
(see my earlier post).

Can anyone see anything I'm doing wrong? I don't recall any particular
errors when I compiled ruby 1.8.0.

If I can't find anything I'll try compiling ruby again.

Cheers,
Daniel.


On Mon, Oct 27, 2003 at 02:57:06PM -0600, Lyle Johnson wrote:
> Daniel Carrera wrote:
>
> >Recent ruby distributions come with ruby/zlib?
>
> The "zlib" extension module is one of the standard extensions shipped
> with ruby 1.8.0. If you built ruby 1.8.0 from source, you should find
> the code for it in the ruby-1.8.0/ext/zlib directory of the source tree.
> If you've already built and installed ruby 1.8.0, you should be able to do:
>
> require 'zlib'
>
> to load that extension.
>
> Hope this helps,
>
> Lyle

--
Daniel Carrera | OpenPGP KeyID: 9AF77A88
PhD grad student. |
Mathematics Dept. | "To understand recursion, you must first
UMD, College Park | understand recursion".
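
The manual require 'zlib' check above could also be done by the program itself at startup, so that it degrades gracefully on installations where the extension was not built. A minimal sketch, with zip_backend as a made-up name and the two symbols as arbitrary labels for "use zlib" and "fall back to the external unzip program":

```ruby
# Hypothetical helper: decide how zip archives will be read on this system.
def zip_backend
  require 'zlib' # raises LoadError if the extension is missing
  :zlib
rescue LoadError
  :external_unzip
end

puts "zip backend: #{zip_backend}"
```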