comp.lang.python

Re: extract occurrence of regular expression from elements of XML documents

Steve Holden

3/15/2010 5:30:00 PM

Martin Schmidt wrote:
> Hi,
>
> I started using Python a few weeks ago, and until last week I
> had no knowledge of XML.
> Obviously my programming knowledge is pretty basic.
> Now I would like to use Python to search about 2000 XML
> documents (about 30 KB each) for a certain regular expression
> within specific elements of these documents.
> I would then like to record the number of occurrences of the regular
> expression within these elements.
> Moreover I would like to count the total number of words contained
> within these, and record the attribute of a higher level element that
> contains them.
> I was trying to figure out the best way to do this, but got
> overwhelmed by the available information (e.g. posts using different
> approaches based on DOM, SAX, XPath, ElementTree, expat).
> The outcome should be a file that lists the extracted attribute, the
> number of occurrences of the regular expression, and the total number of
> words.
> I did not find a post that addresses my problem.
> If someone could help me with this I would really appreciate it.
>
I suspect you would get more specific help if you posted an example of
the XML and described the regex search you want to do in a little more
detail.
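
In the meantime, here is a minimal sketch of one possible approach, using
xml.etree.ElementTree from the standard library. Everything specific in it
is an assumption, since no sample XML was posted: the tag names "section"
and "para", the attribute "id", and the pattern itself are placeholders to
be replaced with whatever the real documents use.

```python
import re
import xml.etree.ElementTree as ET

# Placeholders only: none of these names come from the original documents.
PATTERN = re.compile(r"\bcolou?r\b", re.IGNORECASE)
PARENT_TAG = "section"   # higher-level element carrying the attribute
PARENT_ATTR = "id"       # attribute to record
TARGET_TAG = "para"      # elements whose text is searched


def summarize(xml_text):
    """Return one (attribute, regex hits, word count) tuple per parent element."""
    root = ET.fromstring(xml_text)
    rows = []
    for parent in root.iter(PARENT_TAG):
        hits = words = 0
        for elem in parent.iter(TARGET_TAG):
            # itertext() also collects text inside nested child tags.
            text = "".join(elem.itertext())
            hits += len(PATTERN.findall(text))
            words += len(text.split())  # words = whitespace-separated tokens
        rows.append((parent.get(PARENT_ATTR), hits, words))
    return rows


SAMPLE = """<doc>
  <section id="a1"><para>The color red.</para><para>Colour and <b>color</b> again.</para></section>
  <section id="b2"><para>Nothing here.</para></section>
</doc>"""

for row in summarize(SAMPLE):
    print(*row, sep="\t")
```

For the real data, the same function could be fed the contents of each of
the 2000 files in a loop, with the rows written out through the csv module.
Note that the sketch assumes the parent and target elements are not nested
inside elements of the same name; if they are, text would be counted twice.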

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
See PyCon Talks from Atlanta 2010 http://pyco...
Holden Web LLC http://www.hold...
UPCOMING EVENTS: http://holdenweb.event...