comp.lang.ruby

[ANN] XHTMLDiff 1.0.0

Aredridel

10/21/2004 9:17:00 PM

Since today seems to be the day for document diffing tools, here's mine.

I'd like to announce XHTMLDiff 1.0.0, available at
http://theinternetco.net/projects/ruby... for your consumption.

XHTMLDiff takes valid XHTML as input, and generates valid XHTML with
redlining tags (<ins> and <del>) as output. Valid input documents
should generate valid output.

It diffs down to the paragraph level at the moment. A future version
will search down to the word.

Prerequisites are REXML, Diff::LCS, and delegate.rb.

Bug reports are welcome.

Aredridel.
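
To make the paragraph-level redlining idea concrete, here is a minimal sketch in plain Ruby. It is not XHTMLDiff's code and does not use REXML or Diff::LCS: the names lcs_table and redline are hypothetical, the paragraphs are assumed to be already split into an array of strings, and the diff is a hand-rolled longest-common-subsequence walk.

```ruby
# Sketch only: paragraph-level redlining with a hand-rolled LCS.
# NOT XHTMLDiff's implementation; lcs_table and redline are hypothetical names.

# Build the dynamic-programming table where table[i][j] is the LCS length
# of a[i..-1] and b[j..-1]. Elements only need to respond to ==.
def lcs_table(a, b)
  table = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      table[i][j] = if a[i] == b[j]
                      table[i + 1][j + 1] + 1
                    else
                      [table[i + 1][j], table[i][j + 1]].max
                    end
    end
  end
  table
end

# Walk the table from the top-left corner. Unchanged paragraphs pass through,
# removed ones are wrapped in <del>, added ones in <ins>.
def redline(old_paras, new_paras)
  table = lcs_table(old_paras, new_paras)
  out = []
  i = j = 0
  while i < old_paras.size && j < new_paras.size
    if old_paras[i] == new_paras[j]
      out << new_paras[j]
      i += 1
      j += 1
    elsif table[i + 1][j] >= table[i][j + 1]
      out << "<del>#{old_paras[i]}</del>"
      i += 1
    else
      out << "<ins>#{new_paras[j]}</ins>"
      j += 1
    end
  end
  # Whatever is left on either side is a pure deletion or insertion.
  old_paras[i..-1].each { |para| out << "<del>#{para}</del>" }
  new_paras[j..-1].each { |para| out << "<ins>#{para}</ins>" }
  out
end

old_doc = ["<p>A</p>", "<p>B</p>", "<p>C</p>"]
new_doc = ["<p>A</p>", "<p>X</p>", "<p>C</p>"]
puts redline(old_doc, new_doc)
# <p>A</p>
# <del><p>B</p></del>
# <ins><p>X</p></ins>
# <p>C</p>
```

Wrapping whole paragraphs keeps the markers at block level, which is one of the places the XHTML DTD allows <ins> and <del>. A word-level version would need the markers inside the paragraph instead.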


10 Answers

Austin Ziegler

10/22/2004 3:07:00 PM

On Fri, 22 Oct 2004 06:17:08 +0900, Aredridel <aredridel@gmail.com> wrote:
> Since today seems to be the day for document diffing tools, here's mine.
>
> I'd like to announce XHTMLDiff 1.0.0, available at
> http://theinternetco.net/projects/ruby... for your consumption.
>
> XHTMLDiff takes valid XHTML as input, and generates valid XHTML with
> redlining tags (<ins> and <del>) as output. Valid input documents
> should generate valid output.
>
> It diffs down to the paragraph level at the moment. A future version
> will search down to the word.
>
> Prerequisites are REXML, Diff::LCS, and delegate.rb

Cool. Always nice to see people using something I wrote :)

Are there any requests for improvements with Diff::LCS, Aredridel?

-austin
--
Austin Ziegler * halostatue@gmail.com
* Alternate: austin@halostatue.ca
: as of this email, I have [ 5 ] Gmail invitations


Aredridel

10/22/2004 5:01:00 PM

> Cool. Always nice to see people using something I wrote :)
>
> Are there any requests for improvements with Diff::LCS, Aredridel?

None at the moment -- the functional interface is very pleasant, and
the library makes no assumptions about type, so I could ducktype to my
heart's content. Nicely and solidly written, and it's good to see the
McIlroy-Hunt algorithm spelled out in Ruby where I totally grok it,
rather than locked up in Perl or Smalltalk (which I've read, but was
never sure I really got).

Ari
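
The "no assumptions about type" point can be illustrated with a small standalone sketch. This is a textbook dynamic-programming LCS written for illustration, not Diff::LCS's implementation or API. The function name lcs and the Para struct are hypothetical. The only thing the code asks of the elements is that they respond to ==.

```ruby
# Sketch only: a plain dynamic-programming LCS, not the Diff::LCS gem's code.
# Any element type works as long as it defines ==.
def lcs(a, b)
  # lengths[i][j] holds the LCS length of the first i items of a
  # and the first j items of b.
  lengths = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  a.each_with_index do |x, i|
    b.each_with_index do |y, j|
      lengths[i + 1][j + 1] = if x == y
                                lengths[i][j] + 1
                              else
                                [lengths[i][j + 1], lengths[i + 1][j]].max
                              end
    end
  end

  # Backtrack from the bottom-right corner to recover one longest subsequence.
  result = []
  i = a.size
  j = b.size
  while i > 0 && j > 0
    if a[i - 1] == b[j - 1]
      result.unshift(a[i - 1])
      i -= 1
      j -= 1
    elsif lengths[i - 1][j] >= lengths[i][j - 1]
      i -= 1
    else
      j -= 1
    end
  end
  result
end

# Hypothetical element type; Struct supplies a field-by-field ==.
Para = Struct.new(:text)

p lcs(%w[a b c d], %w[b d e])   # => ["b", "d"]
p lcs([Para.new("one"), Para.new("two")],
      [Para.new("two"), Para.new("three")]).map(&:text)   # => ["two"]
```

The same function handles strings and structs without knowing anything about either, which is the duck-typing property described above.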


Francis Hwang

10/24/2004 4:15:00 PM

This looks quite cool. Are there any plans to generalize this to XML in
general? I can think of lots of good ways to use this if it's more
broadly applicable to XML.

On Oct 21, 2004, at 5:17 PM, Aredridel wrote:

> Since today seems to be the day for document diffing tools, here's
> mine.
>
> I'd like to announce XHTMLDiff 1.0.0, available at
> http://theinternetco.net/projects/ruby... for your consumption.
>
> XHTMLDiff takes valid XHTML as input, and generates valid XHTML with
> redlining tags (<ins> and <del>) as output. Valid input documents
> should generate valid output.
>
> It diffs down to the paragraph level at the moment. A future version
> will search down to the word.
>
> Prerequisites are REXML, Diff::LCS, and delegate.rb
>
> Bug reports are welcome.
>
> Aredridel.
>



Aredridel

10/25/2004 6:08:00 PM

On Mon, 25 Oct 2004 01:14:37 +0900, Francis Hwang <sera@fhwang.net> wrote:
> This looks quite cool. Are there any plans to generalize this to XML in
> general? I can think of lots of good ways to use this if it's more
> broadly applicable to XML.

Not at the moment, since it satisfies my need. Differencing on XML is a
slightly different task, and much easier or much harder depending on
how changes get marked. XHTML has to satisfy the XHTML DTD, so there
are specific places and specific tags to use to mark changes.

With XML, it would either have to be arbitrarily defined (easy), or
according to each flavor's DTD (hard).

I'm up for it when I get some free time, if someone wants to specify
what they need.

Ari


Francis Hwang

10/25/2004 11:43:00 PM

Well, in my case I wanted to compare two different RSS 2.0 feeds. Which
doesn't seem to have a DTD, harrumph. I'll be quite happy when we all
move to Atom ...

On Oct 25, 2004, at 2:07 PM, Aredridel wrote:

> On Mon, 25 Oct 2004 01:14:37 +0900, Francis Hwang <sera@fhwang.net>
> wrote:
>> This looks quite cool. Are there any plans to generalize this to XML
>> in
>> general? I can think of lots of good ways to use this if it's more
>> broadly applicable to XML.
>
> Not at the moment, since it satisfies my need, and differencing on XML
> is a slightly different task, and much easier, or much harder
> depending. XHTML has to satisfy the XHTML DTD, and so there's specific
> places and specific tags to use to mark changes.
>
> With XML, it would either have to be arbitrarily defined (easy), or
> according to each flavor's DTD (hard).
>
> I'm up for it when I get some free time, if someone wanted to specify
> what they needed.
>
> Ari
>



Aredridel

10/26/2004 3:55:00 AM

On Tue, 26 Oct 2004 08:43:09 +0900, Francis Hwang <sera@fhwang.net> wrote:
> Well, in my case I wanted to compare two different RSS 2.0 feeds. Which
> doesn't seem to have a DTD, harrumph. I'll be quite happy when we all
> move to Atom ...

Ah, yes -- Atom or RSS 1.0 (for comparison, RSS 1.0 would be even nicer).

What sort of interface would you want to compare RSS data? A list of
things that have changed since a previous run? A highlighted list? An
XML sort of patch? Would you want it at the API level, or a textually
annotated set of changes?


Francis Hwang

10/26/2004 12:03:00 PM


On Oct 25, 2004, at 11:55 PM, Aredridel wrote:

> On Tue, 26 Oct 2004 08:43:09 +0900, Francis Hwang <sera@fhwang.net>
> wrote:
>> Well, in my case I wanted to compare two different RSS 2.0 feeds.
>> Which
>> doesn't seem to have a DTD, harrumph. I'll be quite happy when we all
>> move to Atom ...
>
> Ah, yes -- atom or RSS 1.0 (for comparison, RSS 1.0 would be even
> nicer)
>
> What sort of interface would you want to compare RSS data? A list of
> things that have changed since a previous run? A highlighted list? An
> XML sort of patch? Would you want it at the API level, or a textually
> annotated set of changes?
>

What I was doing was refactoring some RSS code that didn't have enough
tests, so I wanted to compare pretty much every element of the
resulting RSS. I ended up just eyeballing it in an aggregator, which
seems to have worked out okay but still wasn't ideal.

I'd want a pretty granular comparison, and at the API level would be
ideal. I don't mind having to do a little work to format the changes
into readable output. Also, maybe having API-level information would
make it easier for me to filter out certain differences.

F.



Aredridel

10/27/2004 6:17:00 AM

> What I was doing was refactoring some RSS code that didn't have enough
> tests, so I wanted to compare pretty much every element of the
> resulting RSS. I ended up just eyeballing it in an aggregator, which
> seems to have worked out okay but still wasn't ideal.
>
> I'd want a pretty granular comparison, and at the API level would be
> ideal. I don't mind having to do a little work to format the changes
> into readable output. Also, maybe having API-level information would
> make it easier for me to filter out certain differences.

Hm. Sounds like <ins> and <del> equivalents might be perfect, though
really, you want exact, full-tree diffs. That's a simpler task,
really. Honestly, it sounds like raw Diff::LCS might be the tool you
want -- parse both with REXML, and then hit the trees with Diff::LCS
-- the gotcha being the way REXML deals with containers. Steal the
proxy class from XHTMLDiff and that should be all you need.

Ari
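
A rough sketch of the proxy idea follows, under loudly stated assumptions. Node is a hypothetical stand-in for a parsed element (a real run would wrap REXML nodes instead). NodeProxy is not the proxy class from XHTMLDiff, just a guess at its general shape using delegate.rb. change_list is a hand-rolled LCS walk standing in for Diff::LCS. The sketch assumes the underlying problem is that parsed nodes compare by identity rather than by content.

```ruby
require 'delegate'

# Hypothetical stand-in for a parsed element. It is a plain class on purpose,
# so two nodes with the same content are NOT == to each other.
class Node
  attr_reader :name, :text, :children

  def initialize(name, text = nil, children = [])
    @name = name
    @text = text
    @children = children
  end
end

# Forwards everything to the wrapped node, but compares by content:
# same name, same text, and recursively equal children.
class NodeProxy < SimpleDelegator
  def ==(other)
    other.is_a?(NodeProxy) &&
      name == other.name &&
      text == other.text &&
      proxied_children == other.proxied_children
  end

  def proxied_children
    children.map { |child| NodeProxy.new(child) }
  end
end

# Return [[:same | :del | :ins, element], ...] for two sequences.
def change_list(old_seq, new_seq)
  len = Array.new(old_seq.size + 1) { Array.new(new_seq.size + 1, 0) }
  (old_seq.size - 1).downto(0) do |i|
    (new_seq.size - 1).downto(0) do |j|
      len[i][j] = if old_seq[i] == new_seq[j]
                    len[i + 1][j + 1] + 1
                  else
                    [len[i + 1][j], len[i][j + 1]].max
                  end
    end
  end

  changes = []
  i = j = 0
  while i < old_seq.size || j < new_seq.size
    if i < old_seq.size && j < new_seq.size && old_seq[i] == new_seq[j]
      changes << [:same, old_seq[i]]
      i += 1
      j += 1
    elsif j == new_seq.size || (i < old_seq.size && len[i + 1][j] >= len[i][j + 1])
      changes << [:del, old_seq[i]]
      i += 1
    else
      changes << [:ins, new_seq[j]]
      j += 1
    end
  end
  changes
end

# Two tiny feed-like trees that differ in the second item's title.
item = ->(title) { Node.new("item", nil, [Node.new("title", title)]) }
old_feed = Node.new("channel", nil, [item.call("First"), item.call("Second")])
new_feed = Node.new("channel", nil, [item.call("First"), item.call("Second, edited")])

changes = change_list(NodeProxy.new(old_feed).proxied_children,
                      NodeProxy.new(new_feed).proxied_children)

# API-level result: filter out what you don't care about, format the rest.
interesting = changes.reject { |kind, _node| kind == :same }
p interesting.map { |kind, node| [kind, node.children.first.text] }
# => [[:del, "Second"], [:ins, "Second, edited"]]
```

Because the result is an array of tagged elements rather than text, filtering out uninteresting differences is an ordinary reject or select over the list.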


Francis Hwang

10/27/2004 11:49:00 AM


On Oct 27, 2004, at 2:17 AM, Aredridel wrote:
> Hm. Sounds like <ins> and <del> equivalents might be perfect, though
> really, you want exact, full-tree diffs. That's a simpler task,
> really. Honestly, it sounds like raw Diff::LCS might be the tool you
> want -- parse both with REXML, and then hit the trees with Diff::LCS
> -- the gotcha being the way REXML deals with containers. Steal the
> proxy class from XHTMLDiff and that should be all you need.

Sounds good! I'll give that a try sometime and maybe write a tiny
how-to on my blog.

F.