Gabriel Genellina
1/30/2008 2:01:00 PM
On Jan 29, 22:47, Zbigniew Braniecki <zbigniew.branie...@gmail.com>
wrote:
> The new one is of course much better and cleaner (the old one is
> bloated), but I'm wondering if there is a faster way to compare two
> lists and find out what was added, what was removed, what was changed.
> I can't simply iterate through the two lists because I need to keep the
> order (so it's important that the removed line comes after the 3rd line,
> which was not changed, etc.)
>
> ndiff plays well here, but it seems to be extremely slow (1000
> iterations of diffToObject takes 10 sec, 7sec of this is in ndiff).
ndiff is a quadratic process: it first determines the matching lines using
a SequenceMatcher, then looks for near-matching lines and, for each
candidate pair, compares the two lines using another SequenceMatcher.
You don't appear to be interested in what changed inside a line, just
that it changed, so a simple SequenceMatcher would be enough.
Output from SequenceMatcher is quite different from ndiff's, but you'd
only have to reimplement the _compareLists method.
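
Something along these lines might be a starting point. It is only a rough
sketch: compare_lines is a made-up name, and since I haven't seen what
_compareLists has to return, the (tag, old_lines, new_lines) tuples are
an assumption you would adapt to your own objects. The only difflib call
used is SequenceMatcher.get_opcodes().

```python
from difflib import SequenceMatcher

def compare_lines(old, new):
    """Line-level diff of two lists of strings, in document order.

    Hypothetical helper: returns (tag, old_lines, new_lines) tuples, where
    tag is 'equal', 'replace', 'delete' or 'insert'. No attempt is made
    to find out what changed *inside* a line.
    """
    matcher = SequenceMatcher(None, old, new)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # old[i1:i2] is the affected slice of the old list,
        # new[j1:j2] the corresponding slice of the new one.
        result.append((tag, old[i1:i2], new[j1:j2]))
    return result

if __name__ == "__main__":
    old = ["a", "b", "c", "d"]
    new = ["a", "x", "c", "e", "d"]
    for tag, removed, added in compare_lines(old, new):
        print(tag, removed, added)
```

The opcodes come out in order, so a deleted block stays positioned between
the unchanged blocks around it, which is the ordering you said you need.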
--
Gabriel Genellina