[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.ruby

Just for fun...

matthew.moss.coder

3/15/2007 5:43:00 PM

I posted this to another forum as part of a challenge... thought to
post it here just for fun. +5 pts if you know what algorithm is being
used here...

Anyone care to golf? Note that the breakdown of the string TXT was
just to prevent it from line-wrapping...

TXT = ".,~,,;\ntee,yeeewgeendhoeheddrhdfteeenn" +
"\n\n\n\n\nr cl u enneiii ddd hhhlhhhh" +
"elllphsvHfo ousctTTttTttgdfddwl tddd " +
"uaooAat doopmp cmsse o i ari t isa" +
"joo e\n"

dat = [""] * TXT.size
TXT.size.times do
dat = TXT.split(//).zip(dat).map{ |a| a.join }.sort
end
puts dat.find{ |x| x[-1] == ?~ }[0...-1]

39 Answers

Stephen Lewis

3/15/2007 6:24:00 PM

0

Matthew Moss wrote:
> +5 pts if you know what algorithm is being used here...

I'm not going to try to golf it right now, but it looks like an inverse
Burrows-Wheeler Transform.

Stephen

--
Stephen Lewis

matthew.moss.coder

3/15/2007 6:34:00 PM

0

On 3/15/07, Stephen Lewis <stephen@sock.org.uk> wrote:
> Matthew Moss wrote:
> > +5 pts if you know what algorithm is being used here...
>
> I'm not going to try to golf it right now, but it looks like an inverse
> Burrows-Wheeler Transform.

+5 for Stephen. =)

Christian Neukirchen

3/19/2007 7:19:00 PM

0

"Matthew Moss" <matthew.moss.coder@gmail.com> writes:

> On 3/15/07, Stephen Lewis <stephen@sock.org.uk> wrote:
>> Matthew Moss wrote:
>> > +5 pts if you know what algorithm is being used here...
>>
>> I'm not going to try to golf it right now, but it looks like an inverse
>> Burrows-Wheeler Transform.
>
> +5 for Stephen. =)

Whoa, I just thought of that, without even reading the question... magic.

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneuk...

Trans

3/19/2007 7:51:00 PM

0



On Mar 15, 2:33 pm, "Matthew Moss" <matthew.moss.co...@gmail.com>
wrote:
> On 3/15/07, Stephen Lewis <step...@sock.org.uk> wrote:
>
> > Matthew Moss wrote:
> > > +5 pts if you know what algorithm is being used here...
>
> > I'm not going to try to golf it right now, but it looks like an inverse
> > Burrows-Wheeler Transform.
>
> +5 for Stephen. =)

Looks interesting. Care to elaborate on "inverse Burrows-Wheeler
Transform"?


Thanks,
T.


David Kastrup

3/19/2007 8:16:00 PM

0

"Trans" <transfire@gmail.com> writes:

> On Mar 15, 2:33 pm, "Matthew Moss" <matthew.moss.co...@gmail.com>
> wrote:
>> On 3/15/07, Stephen Lewis <step...@sock.org.uk> wrote:
>>
>> > Matthew Moss wrote:
>> > > +5 pts if you know what algorithm is being used here...
>>
>> > I'm not going to try to golf it right now, but it looks like an inverse
>> > Burrows-Wheeler Transform.
>>
>> +5 for Stephen. =)
>
> Looks interesting. Care to elaborate on "inverse Burrows-Wheeler
> Transform"?

The Burrows-Wheeler transform takes some arbitrary string, say,
gigantomania and sorts the permutations: we start with

gigantomania
igantomaniag
gantomaniagi
antomaniagig
ntomaniagiga
tomaniagigan
omaniagigant
maniagiganto
aniagigantom
niagigantoma
iagigantoman
agigantomani

and then sort the lines:

agigantomani
aniagigantom
antomaniagig
gantomaniagi
gigantomania
iagigantoman
igantomaniag
maniagiganto
niagigantoma
ntomaniagiga
omaniagigant
tomaniagigan

Then we take the last column and assign numbers to it:

imgiangoaatn
0123456789AB

Now we sort the letters with the numbers alongside.

aaaggiimnnot
489260315B7A

And now we walk off a linked list starting at 4 (we need to remember
that starting position since all cyclic strings have the same BWT
transform otherwise).

That puts out the original sequence.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum

Trans

3/20/2007 3:13:00 AM

0



On Mar 19, 4:20 pm, David Kastrup <d...@gnu.org> wrote:
> "Trans" <transf...@gmail.com> writes:
> > On Mar 15, 2:33 pm, "Matthew Moss" <matthew.moss.co...@gmail.com>
> > wrote:
> >> On 3/15/07, Stephen Lewis <step...@sock.org.uk> wrote:
>
> >> > Matthew Moss wrote:
> >> > > +5 pts if you know what algorithm is being used here...
>
> >> > I'm not going to try to golf it right now, but it looks like an inverse
> >> > Burrows-Wheeler Transform.
>
> >> +5 for Stephen. =)
>
> > Looks interesting. Care to elaborate on "inverse Burrows-Wheeler
> > Transform"?
>
> The Burrows-Wheeler transform takes some arbitrary string, say,
> gigantomania and sorts the permutations: we start with
>
> gigantomania
> igantomaniag
> gantomaniagi
> antomaniagig
> ntomaniagiga
> tomaniagigan
> omaniagigant
> maniagiganto
> aniagigantom
> niagigantoma
> iagigantoman
> agigantomani
>
> and then sort the lines:
>
> agigantomani
> aniagigantom
> antomaniagig
> gantomaniagi
> gigantomania
> iagigantoman
> igantomaniag
> maniagiganto
> niagigantoma
> ntomaniagiga
> omaniagigant
> tomaniagigan
>
> Then we take the last column and assign numbers to it:
>
> imgiangoaatn
> 0123456789AB
>
> Now we sort the letters with the numbers alongside.
>
> aaaggiimnnot
> 489260315B7A
>
> And now we walk off a linked list starting at 4 (we need to remember
> that starting position since all cyclic strings have the same BWT
> transform otherwise).
>
> That puts out the original sequence.

That's nuts.

T.


Clifford Heath

3/20/2007 3:16:00 AM

0

Trans wrote:
> On Mar 19, 4:20 pm, David Kastrup <d...@gnu.org> wrote:
>> aaaggiimnnot
>> 489260315B7A
>> And now we walk off a linked list starting at 4 (we need to remember
>> that starting position since all cyclic strings have the same BWT
>> transform otherwise).
>> That puts out the original sequence.
> That's nuts.

Nuts it may be, but it's the reason why "bzip" compresses tighter
than deflate (zip, gzip, etc). It does a BWT first, then compresses
that.

David, I followed your explanation right up to the quoted text about
how to read the original string back out. Can you elaborate a bit
please?

Clifford Heath.

Trans

3/20/2007 3:52:00 AM

0

On Mar 19, 11:15 pm, Clifford Heath <no.s...@please.net> wrote:

> Nuts it may be, but it's the reason why "bzip" compresses tighter
> than deflate (zip, gzip, etc). It does a BWT first, then compresses
> that.

Really? I find that surprising actually. I would not think a sorting
algorithm could lead to any significant compression. It just doesn't
make sense under conservation of information. I assume it's due to
deficiencies in deflate then.

T.

Raj Sahae

3/20/2007 5:08:00 AM

0

Clifford Heath wrote:
> David, I followed your explanation right up to the quoted text about
> how to read the original string back out. Can you elaborate a bit
> please?
>
> Clifford Heath.
Just read the Wikipedia article. It has tables representing the loops
the algorithm goes through. It's very easy to understand.

Clifford Heath

3/20/2007 6:52:00 AM

0

Trans wrote:
> Really? I find that surprising actually. I would not think a sorting
> algorithm could lead to any significant compression. It just doesn't
> make sense under conservation of information. I assume it's due to
> deficiencies in deflate then.

deflate works by substituting short codes for repeated text, then
Huffman packing the codes and the first instance of each text.
BWT creates more repetitions by grouping similar text, so the
compression has more to chew on.