comp.lang.python

difflib confusion

krishnakant Mane

1/22/2008 6:57:00 PM

Hello all,
I have a slightly confusing question.
I want a library that can do an svn-like diff between two files.
Say I have file1 and file2, where file2 contains something that
file1 does not have. If I call readlines() on both files, I
get a list of all the lines in each.
I now want to diff them and find out which word was added, deleted, or changed,
ideally down to the character; failing that, I at least want to know which
word changed.
Any ideas, please?
kk
3 Answers

Paul Hankin

1/22/2008 10:34:00 PM

On Jan 22, 6:57 pm, "krishnakant Mane" <researchb...@gmail.com> wrote:
> hello all,
> I have a bit of a confusing question.
> firstly I wanted a library which can do an svn like diff with two files.
> let's say I have file1 and file2 where file2 contains some thing which
> file1 does not have.  now if I do readlines() on both the files, I
> have a list of all the lines.
> I now want to do a diff and find out which word is added or deleted or changed.
> and that too on which character, if not at least want to know the word
> that has the change.
> any ideas please?

Have a look at difflib in the standard library.

--
Paul Hankin

krishnakant Mane

1/23/2008 9:00:00 AM

On 23/01/2008, Paul Hankin <paul.hankin@gmail.com> wrote:
> Have a look at difflib in the standard library.
>
I am aware of the difflib library but still can't figure it out.
I know that the differences between two lines can be obtained, but how do I
get the differences between words?
Regards,
Krishna

Gabriel Genellina

1/23/2008 5:38:00 PM

On 23 Jan, 06:59, "krishnakant Mane" <researchb...@gmail.com> wrote:
> On 23/01/2008, Paul Hankin <paul.han...@gmail.com> wrote:
> > Have a look at difflib in the standard library.
>
> I am aware of the difflib library but still can't figure out.
> I know that differences in two lines can be got but how to get it between words?

The base functionality is in SequenceMatcher; this class takes a
pair of sequences of any type (as long as the elements are hashable)
and tries to match them. The sequences may be lists of lines, single
lines (each seen as a sequence of characters), or lists of words
(perhaps obtained with line.split()).
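
As a rough illustration of feeding SequenceMatcher a list of words, here is a minimal sketch. The helper name word_changes and the sample sentences are made up for this example; autojunk=False is passed only to switch off the heuristic that discards very frequent elements in long sequences.

```python
from difflib import SequenceMatcher


def word_changes(old_line, new_line):
    """Return (tag, old_words, new_words) for every run of words that differs."""
    old_words = old_line.split()
    new_words = new_line.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    # get_opcodes() describes how to turn old_words into new_words; the tag is
    # one of "equal", "replace", "delete" or "insert".
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, old_words[i1:i2], new_words[j1:j2]))
    return changes


for change in word_changes("the quick brown fox", "the slow brown fox jumps"):
    print(change)
# ('replace', ['quick'], ['slow'])
# ('insert', [], ['jumps'])
```

Because split() discards whitespace, this reports changed words only, not their column positions.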
Built on top of SequenceMatcher, there is a text Differ. It takes two
sequences of lines and does its work in two steps: first it tries to
match blocks of lines (using a SequenceMatcher), and then the unmatched
blocks are analyzed further to show intraline differences (with
another SequenceMatcher, treating each line as a sequence of
characters). See the example at http://docs.python.org/lib/differ-exa...
- perhaps this is what you want.
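
For a quick feel of what Differ produces, here is a small sketch with made-up sample lines (this is not the example from the linked documentation page). The lists stand in for what readlines() would return for the two files.

```python
from difflib import Differ

# Hypothetical file contents, as readlines() would return them.
old_lines = ["the quick brown fox\n", "jumps over the dog\n"]
new_lines = ["the quick brown fox\n", "jumps over the lazy dog\n"]

# compare() yields lines prefixed with "  " (unchanged), "- " (only in old),
# "+ " (only in new) or "? " (a guide line marking which characters differ).
diff_lines = list(Differ().compare(old_lines, new_lines))
print("".join(diff_lines), end="")
```

The "? " lines are the intraline hints: they point at the character positions where a similar pair of lines differs.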
Note that Differ has no concept of "word"; if you want to report only
whole-word differences, take a look at its _fancy_replace method (a
private method of Differ, where the intraline comparison is done).
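
One way to get whole-word reports without touching _fancy_replace is to run the same two steps by hand: match lines first, then compare the words of the lines that were replaced. The sketch below is only an approximation under stated assumptions: the function name word_level_diff is made up, replaced lines are paired by position (Differ instead searches for the most similar pair), and lines that were purely inserted or deleted are skipped.

```python
from difflib import SequenceMatcher


def word_level_diff(old_lines, new_lines):
    """Return (old_line_no, new_line_no, tag, old_words, new_words) tuples."""
    results = []
    line_matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in line_matcher.get_opcodes():
        if tag != "replace":
            continue  # only changed lines are examined word by word
        # Assumption: pair replaced lines by position within the block.
        pairs = zip(range(i1, i2), range(j1, j2))
        for i, j in pairs:
            old_words = old_lines[i].split()
            new_words = new_lines[j].split()
            word_matcher = SequenceMatcher(None, old_words, new_words,
                                           autojunk=False)
            for wtag, a1, a2, b1, b2 in word_matcher.get_opcodes():
                if wtag != "equal":
                    # Line numbers are reported 1-based.
                    results.append((i + 1, j + 1, wtag,
                                    old_words[a1:a2], new_words[b1:b2]))
    return results


old = ["alpha beta\n", "gamma delta\n"]
new = ["alpha beta\n", "gamma epsilon delta\n"]
print(word_level_diff(old, new))
# [(2, 2, 'insert', [], ['epsilon'])]
```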

--
Gabriel Genellina