Asp Forum - ElementTree should parse string and file in the same way

Peter Pei

12/31/2007 3:42:00 AM

One bad design decision in ElementTree is that it parses a string and a
file in different ways; even worse, the two return different objects:
1) When you parse a file, you can simply call parse, which returns an
ElementTree, on which you can then apply xpath;
2) To parse a string (xml section), you can call XML or fromstring, but both
return an Element instead of an ElementTree. This alone is bad. To make it
worse, you have to create an ElementTree from this Element before you can
use xpath.
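
For readers following along, here is a minimal sketch of the difference being described. It assumes Python 3 and the standard-library xml.etree.ElementTree; the sample document and variable names are made up for illustration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

xml_text = "<root><item id='1'/><item id='2'/></root>"  # made-up sample

# Write the sample to a temporary file so parse() has something to open.
with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(xml_text)
    path = handle.name

try:
    tree = ET.parse(path)        # file name in, ElementTree out
finally:
    os.remove(path)

root = ET.fromstring(xml_text)   # XML text in, root Element out

print(type(tree).__name__)       # the wrapper object around the document
print(ET.iselement(root))        # fromstring() hands back an Element
```

The ElementTree returned by parse() wraps the same kind of root Element that fromstring() returns directly; tree.getroot() gets you from one to the other.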

17 Answers

Paddy

12/31/2007 7:18:00 AM

On Dec 31, 3:42 am, "Peter Pei" <yan...@telus.com> wrote:
> One bad design about elementtree is that it has different ways parsing a
> string and a file, even worse they return different objects:
> 1) When you parse a file, you can simply call parse, which returns a
> elementtree, on which you can then apply xpath;
> 2) To parse a string (xml section), you can call XML or fromstring, but both
> return element instead of elementtree. This alone is bad. To make it worse,
> you have to create an elementtree from this element before you can utilize
> xpath.

I haven't tried this, but you should be able to wrap your text string
so that it looks like a file using the StringIO module and pass that
to ElementTree:

http://blog.doughellmann.com/2007/04/pymotw-stringio-and-cstr...
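
A minimal sketch of that suggestion, assuming Python 3, where the class lives in io (at the time of this thread it was in the StringIO and cStringIO modules). The sample XML is made up.

```python
import io
import xml.etree.ElementTree as ET

xml_text = "<root><child name='x'/></root>"  # made-up sample

# StringIO gives the text a file-like read() interface, which parse() accepts.
tree = ET.parse(io.StringIO(xml_text))

child_name = tree.getroot().find("child").get("name")
print(child_name)
```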

- Paddy.

Stefan Behnel

12/31/2007 8:13:00 AM

Peter Pei wrote:
> One bad design about elementtree is that it has different ways parsing a
> string and a file, even worse they return different objects:
> 1) When you parse a file, you can simply call parse, which returns a
> elementtree, on which you can then apply xpath;

ElementTree doesn't support XPath. In case you mean the simpler ElementPath
language that is supported by the find*() methods, I do not see a reason why
you can't use it on elements.
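
To illustrate the point about the find*() methods, here is a small sketch run directly on an Element, with no ElementTree wrapper involved. It assumes Python 3 and the standard-library module; the document is invented for the example.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<library><shelf><book title='A'/><book title='B'/></shelf></library>"
)

# Path expressions are evaluated relative to the element they are called on.
first_book = root.find("shelf/book")
all_books = root.findall(".//book")
titles = [book.get("title") for book in all_books]
print(titles)
```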

> 2) To parse a string (xml section), you can call XML or fromstring, but
> both return element instead of elementtree. This alone is bad. To make
> it worse, you have to create an elementtree from this element before you
> can utilize xpath.

a) how hard is it to write a wrapper function around fromstring() that wraps
the result Element in an ElementTree object and returns it?

b) the same as above applies: I can't see the problem you are talking about.
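
The wrapper in (a) could look roughly like this. The name tree_fromstring is hypothetical and not part of ElementTree; the sketch assumes Python 3 and the standard-library module.

```python
import xml.etree.ElementTree as ET


def tree_fromstring(text):
    """Hypothetical helper: parse XML text and return an ElementTree."""
    return ET.ElementTree(ET.fromstring(text))


tree = tree_fromstring("<root><child/></root>")
child = tree.find("child")  # ElementTree.find() searches from the root
print(child.tag)
```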

Stefan

Peter Pei

1/1/2008 1:54:00 AM

You are talking shit. It is never about whether it is hard to write a
wrapper. It is about bad design. I should be able to parse a string and a
file in exactly same way, and that should be provided as part of the
package.

Looks like you are just a code monkey not a designer, so I forgive you. You
didn't understand the issue I described? That's your issue. You are not at
the same level to talk to me, so chill.
===================================================================

"Stefan Behnel" <stefan.behnel-n05pAM@web.de> wrote in message
news:4778A47B.9020201@web.de...
> Peter Pei wrote:
>> One bad design about elementtree is that it has different ways parsing a
>> string and a file, even worse they return different objects:
>> 1) When you parse a file, you can simply call parse, which returns a
>> elementtree, on which you can then apply xpath;
>
> ElementTree doesn't support XPath. In case you mean the simpler
> ElementPath
> language that is supported by the find*() methods, I do not see a reason
> why
> you can't use it on elements.
>
>
>> 2) To parse a string (xml section), you can call XML or fromstring, but
>> both return element instead of elementtree. This alone is bad. To make
>> it worse, you have to create an elementtree from this element before you
>> can utilize xpath.
>
> a) how hard is it to write a wrapper function around fromstring() that
> wraps
> the result Element in an ElementTree object and returns it?
>
> b) the same as above applies: I can't see the problem you are talking
> about.
>
> Stefan

Peter Pei

1/1/2008 1:56:00 AM

To be preise, XPath is not fully supported. Don't be a smart asshole.
=====================================================================
"Stefan Behnel" <stefan.behnel-n05pAM@web.de> wrote in message
news:4778A47B.9020201@web.de...
> Peter Pei wrote:
>> One bad design about elementtree is that it has different ways parsing a
>> string and a file, even worse they return different objects:
>> 1) When you parse a file, you can simply call parse, which returns a
>> elementtree, on which you can then apply xpath;
>
> ElementTree doesn't support XPath. In case you mean the simpler
> ElementPath
> language that is supported by the find*() methods, I do not see a reason
> why
> you can't use it on elements.
>
>
>> 2) To parse a string (xml section), you can call XML or fromstring, but
>> both return element instead of elementtree. This alone is bad. To make
>> it worse, you have to create an elementtree from this element before you
>> can utilize xpath.
>
> a) how hard is it to write a wrapper function around fromstring() that
> wraps
> the result Element in an ElementTree object and returns it?
>
> b) the same as above applies: I can't see the problem you are talking
> about.
>
> Stefan

Steven D'Aprano

1/1/2008 3:02:00 AM

On Tue, 01 Jan 2008 01:53:47 +0000, Peter Pei wrote:

> You are talking shit. It is never about whether it is hard to write a
> wrapper. It is about bad design. I should be able to parse a string and
> a file in exactly same way, and that should be provided as part of the
> package.

Oh my, somebody decided to start the new year with all guns blazing.

Before abusing anyone else, have you considered asking *why* ElementTree
does not treat files and strings the same way? I believe the writer of
ElementTree, Fredrik Lundh, frequents this newsgroup.

It may be that Fredrik doesn't agree with you that you should be able to
parse a string and a file the same way, in which case there's nothing you
can do but work around it. On the other hand, perhaps he just hasn't had
a chance to implement that functionality, and would welcome a patch.

Fredrik, if you're reading this, I'm curious what your reason is. I don't
have an opinion on whether you should or shouldn't treat files and
strings the same way. Over to you...

--
Steven

Stefan Behnel

1/1/2008 10:06:00 AM

Peter Pei wrote:
> To be preise
[...]

Preise the lord, not me. :)

Happy New Year!

Stefan

Diez B. Roggisch

1/1/2008 12:37:00 PM

Steven D'Aprano schrieb:
> On Tue, 01 Jan 2008 01:53:47 +0000, Peter Pei wrote:
>
>> You are talking shit. It is never about whether it is hard to write a
>> wrapper. It is about bad design. I should be able to parse a string and
>> a file in exactly same way, and that should be provided as part of the
>> package.
>
> Oh my, somebody decided to start the new year with all guns blazing.
>
> Before abusing anyone else, have you considered asking *why* ElementTree
> does not treat files and strings the same way? I believe the writer of
> ElementTree, Fredrik Lundh, frequents this newsgroup.
>
> It may be that Fredrik doesn't agree with you that you should be able to
> parse a string and a file the same way, in which case there's nothing you
> can do but work around it. On the other hand, perhaps he just hasn't had
> a chance to implement that functionality, and would welcome a patch.
>
> Fredrik, if you're reading this, I'm curious what your reason is. I don't
> have an opinion on whether you should or shouldn't treat files and
> strings the same way. Over to you...

I think the decision is pretty clear to everybody who is a code-monkey
and not a Peter-Pei-School-of-Excellent-And-Decent-Designers-attendant:

When building an XML document, you start from an Element or ElementTree
and often do things like

    root_element = <some_element>
    for child in some_objects:
        root_element.append(XML("""<child attribute="%i"/>""" %
                                child.attribute))

Which is such a common usage-pattern that it would be extremely annoying
to get a document from XML/fromstring and then needing to extract the
root-element from it.
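
A runnable version of that pattern, as a sketch only: it assumes Python 3 and the standard-library module, and the "children" root element and the integer values stand in for the placeholders in the snippet above.

```python
import xml.etree.ElementTree as ET

root_element = ET.Element("children")  # stand-in for <some_element>

for attribute in (1, 2, 3):            # stand-in for some_objects
    # XML() returns an Element, so it can be appended directly.
    fragment = ET.XML('<child attribute="%i"/>' % attribute)
    root_element.append(fragment)

serialized = ET.tostring(root_element, encoding="unicode")
print(serialized)
```

If XML() returned an ElementTree instead, each iteration would need an extra getroot() call before the append.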

And codemonkeys know that in python

doc = et.parse(StringIO(string))

is just one import away, which people who attend to
Peter-Pei-School-of-Excellent-And-Decent-Designers may have not learned
yet - because they are busy praising themselves and coating each other
in edible substances before stepping out into the world and having all
code-monkeys lick off their greatness in awe.

http://www.youtube.com/watch?v=F...

Diez

Steven D'Aprano

1/1/2008 2:46:00 PM

On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:

> And codemonkeys know that in python
>
> doc = et.parse(StringIO(string))
>
> is just one import away

Yes, but to play devil's advocate for a moment,

doc = et.parse(string_or_file)

would be even simpler.

Is there any reason why it should not behave that way? It could be as
simple as adding a couple of lines to the parse method:

    if isinstance(arg, str):
        # Import the class, not the module (Python 2: from StringIO import StringIO)
        from io import StringIO
        arg = StringIO(arg)

I'm not saying it *should*, I'm asking if there's a reason it *shouldn't*.

"I find it aesthetically distasteful" would be a perfectly acceptable
answer -- not one I would agree with, but I could accept it.

--
Steven

Steven Bethard

1/1/2008 8:00:00 PM

Steven D'Aprano wrote:
> On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:
>
>> And codemonkeys know that in python
>>
>> doc = et.parse(StringIO(string))
>>
>> is just one import away
>
> Yes, but to play devil's advocate for a moment,
>
> doc = et.parse(string_or_file)
>
> would be even simpler.

I assume the problem with this is that it would be ambiguous. You can
already use either a string or a file with ``et.parse``. A string is
interpreted as a file name, while a file object is used directly.

How would you differentiate between a string that's supposed to be a
file name, and a string that's supposed to be XML?
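
The ambiguity can be seen in a short sketch, assuming Python 3 and the standard-library module; the temporary file and sample text are made up for the demonstration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

xml_text = "<root/>"  # made-up sample

with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
    handle.write(xml_text)
    path = handle.name

try:
    # A string argument is treated as a file name...
    from_name = ET.parse(path).getroot().tag
    # ...while a file object is read directly.
    with open(path, "rb") as stream:
        from_stream = ET.parse(stream).getroot().tag
finally:
    os.remove(path)

# Passing the XML text itself makes parse() look for a file with that name.
try:
    ET.parse(xml_text)
    text_was_parsed = True
except OSError:
    text_was_parsed = False

print(from_name, from_stream, text_was_parsed)
```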

Steve

Steven D'Aprano

1/1/2008 10:02:00 PM

On Tue, 01 Jan 2008 12:59:44 -0700, Steven Bethard wrote:

> Steven D'Aprano wrote:
>> On Tue, 01 Jan 2008 13:36:57 +0100, Diez B. Roggisch wrote:
>>
>>> And codemonkeys know that in python
>>>
>>> doc = et.parse(StringIO(string))
>>>
>>> is just one import away
>>
>> Yes, but to play devil's advocate for a moment,
>>
>> doc = et.parse(string_or_file)
>>
>> would be even simpler.
>
> I assume the problem with this is that it would be ambiguous. You can
> already use either a string or a file with ``et.parse``. A string is
> interpreted as a file name, while a file object is used directly.

Ah! I wasn't aware that parse() operated on either an open file object or
a string file name. That's an excellent reason for not treating strings
the same as files in ElementTree.

> How would you differentiate between a string that's supposed to be a
> file name, and a string that's supposed to be XML?

Well, naturally I wouldn't.

I *could*, if I assumed that a multi-line string that started with "<"
was XML, and a single-line string with the path separator character or
ending in ".xml" was a file name, but that sort of Do What I Mean coding
is foolish in a library function that can't afford to occasionally Do The
Wrong Thing.
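
A rough sketch of that kind of guessing, to show where it goes wrong. The function guess_kind is hypothetical and only follows the heuristic as worded above; nothing like it exists in ElementTree.

```python
import os


def guess_kind(text):
    """Hypothetical guesser following the heuristic described above."""
    multi_line = "\n" in text
    if multi_line and text.lstrip().startswith("<"):
        return "xml"
    if not multi_line and (os.sep in text or text.endswith(".xml")):
        return "filename"
    return "unknown"


guess_multiline = guess_kind("<root>\n  <child/>\n</root>")
guess_path = guess_kind(os.path.join("data", "feed.xml"))

# A one-line document is valid XML but is never classified as XML here;
# on POSIX the "/" in it even matches the file-name rule.
guess_one_liner = guess_kind("<root/>")

print(guess_multiline, guess_path, guess_one_liner)
```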

--
Steven

comp.lang.python
