
comp.lang.ruby

REXML question

Chris McMahon

1/24/2006 6:54:00 PM


Hi...

I cargo-culted the following REXML statement, and it's working fine:

elements = Document.new( my_xml ).elements.to_a( "//*[text()]" ).map { |e|
  e.text.strip.empty? ? nil : e.text.strip
}.compact

but I'm no expert at this. I want this expression to return an
array containing every element of any given XML document, in a reliable
order. It seems to do so.

Is there any XML with elements that would not be captured by this
expression?

1 Answer

Mark Volkmann

1/24/2006 7:55:00 PM


On 1/24/06, Chris McMahon <christopher.mcmahon@gmail.com> wrote:
>
> Hi...
>
> I cargo-culted the following REXML statement, and it's working fine:
>
> elements = Document.new( my_xml ).elements.to_a( "//*[text()]" ).map { |e|
>   e.text.strip.empty? ? nil : e.text.strip
> }.compact
>
> but I'm no expert at this. I want for this expression to return an
> array containing every element of any given XML Document in a reliable
> order. It seems to do so.
>
> Is there any XML with elements that would not be captured by this
> expression?

Are you trying to find only elements that contain text in them that is
not just whitespace?

I can't comment on your use of REXML, but I'll comment on your XPath expression.
"//*[text()]" selects only elements that have at least one text node as a
direct child.
Consider the following.
<car>
<make>Saturn</make>
<model>SC2</model>
<colors exterior="purple" interior="tan"/>
</car>

Which of these elements have text in them?
Clearly make and model do. Clearly colors does not.
Somewhat surprisingly, car does. It has whitespace inside it, in
addition to child elements. Not only that, it has four pieces of text
inside. A DOM parser would say that the car element has four text
child nodes. I'm not sure how REXML treats this.
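
One way to check how REXML treats this is to run the car example through it. The sketch below is not from the original thread. It assumes REXML's default settings, under which whitespace-only text nodes are kept, and it uses Element#texts to count text children. The second document (a <p> with mixed content) is a made-up illustration of a case where Element#text, which returns only the first text child, would leave part of the text out.

```ruby
require "rexml/document"

xml = <<~XML
  <car>
  <make>Saturn</make>
  <model>SC2</model>
  <colors exterior="purple" interior="tan"/>
  </car>
XML

doc = REXML::Document.new(xml)

# Names of the elements selected by the XPath expression.
matched = REXML::XPath.match(doc, "//*[text()]").map(&:name)

# Number of text-node children directly under <car>
# (the whitespace between its child elements).
car_text_nodes = doc.root.texts.size

# The expression from the question, applied to the same document.
values = doc.elements.to_a("//*[text()]").map { |e|
  e.text.strip.empty? ? nil : e.text.strip
}.compact

# Hypothetical mixed-content example: text on both sides of a child element.
mixed = REXML::Document.new("<p>Hello <b>big</b> world</p>")
first_only = mixed.root.text                      # first text child only
all_text   = mixed.root.texts.map(&:value).join   # every text child of <p>

puts "matched:        #{matched.inspect}"
puts "car text nodes: #{car_text_nodes}"
puts "values:         #{values.inspect}"
puts "first_only:     #{first_only.inspect}"
puts "all_text:       #{all_text.inspect}"
```

If REXML behaves as assumed, car is matched by the XPath but dropped by the strip/empty? filter, so the expression works for this document. With mixed content, however, any text that follows a child element would not appear in the result, because only the first text child of each element is read.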

--
R. Mark Volkmann
Partner, Object Computing, Inc.