comp.lang.ruby

Parsing xml

Arun Kumar

3/25/2009 5:05:00 PM

Hi,
Is there any way in Ruby to parse an XML file without using REXML or any
other libraries?



Regards
Arun Kumar
--
Posted via http://www.ruby-....

24 Answers

Peter Zotov

3/25/2009 5:43:00 PM

0

Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:

> Hi,
> Is there any way in Ruby to parse an xml file without using REXML or any
> other libraries.

Of course. You can write a finite state machine, read the XML from a file,
and parse it however you want.

--
WBR, Peter Zotov

Arun Kumar

3/25/2009 5:46:00 PM

0

Peter Zotov wrote:
> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>
>> Hi,
>> Is there any way in Ruby to parse an xml file without using REXML or any
>> other libraries.
>
> Of course. You can write a finite state machine, read XML from file
> and parse as you want.

Thanks. Can you please give me the details?

Regards
Arun Kumar

Peter Zotov

3/25/2009 5:52:00 PM

0

Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:

> Peter Zotov wrote:
>> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>>
>>> Hi,
>>> Is there any way in Ruby to parse an xml file without using REXML or any
>>> other libraries.
>>
>> Of course. You can write a finite state machine, read XML from file
>> and parse as you want.
>
> Thanks. Can u please give me details of it.

It is described nicely in Wikipedia:
http://en.wikipedia.org/wiki/Finite_sta...

As a hint, I recommend defining the following states: "text",
"opening tag", "tag attribute", "tag attribute value", "closing tag".
E.g. when you are in the "text" state and get a "<" symbol in the input
sequence, you change state to "opening tag" or "closing tag"...
I still have one question: why don't you use REXML?
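
A minimal sketch of such a state machine, for illustration only. The method name `tokenize_xml`, the event format, and the two helper states `:tag_start` and `:declaration` are assumptions added here, not part of the suggestion above. It assumes small, well-formed input with no comments, CDATA sections, entities, or self-closing tags.

```ruby
# Character-by-character state machine that turns a small XML subset into events.
def tokenize_xml(source)
  events    = []
  state     = :text
  buffer    = +""   # characters collected for the current token
  tag       = nil   # name of the opening tag whose attributes are being read
  attrs     = {}    # attributes of the current opening tag
  attr_name = nil   # attribute name waiting for its value
  quote     = nil   # quote character that opened the current attribute value

  source.each_char do |ch|
    case state
    when :text
      if ch == "<"
        events << [:text, buffer] unless buffer.strip.empty?
        buffer = +""
        state = :tag_start
      else
        buffer << ch
      end
    when :tag_start                 # helper state: decide what follows "<"
      case ch
      when "/" then state = :closing_tag
      when "?" then state = :declaration
      else
        buffer << ch
        attrs = {}
        state = :opening_tag
      end
    when :declaration               # skip <?xml ... ?> up to its ">"
      state = :text if ch == ">"
    when :opening_tag               # reading the tag name
      if ch == ">"
        events << [:open, buffer, attrs]
        buffer = +""
        state = :text
      elsif ch.match?(/\s/)
        tag = buffer
        buffer = +""
        state = :attribute
      else
        buffer << ch
      end
    when :attribute                 # reading an attribute name, or the end of the tag
      if ch == ">"
        events << [:open, tag, attrs]
        buffer = +""
        state = :text
      elsif ch == "="
        attr_name = buffer.strip
        buffer = +""
        state = :attribute_value
      else
        buffer << ch
      end
    when :attribute_value           # reading a quoted attribute value
      if quote.nil?
        quote = ch if ch == '"' || ch == "'"
      elsif ch == quote
        attrs[attr_name] = buffer
        buffer = +""
        quote = nil
        state = :attribute
      else
        buffer << ch
      end
    when :closing_tag
      if ch == ">"
        events << [:close, buffer.strip]
        buffer = +""
        state = :text
      else
        buffer << ch
      end
    end
  end
  events
end

tokenize_xml('<root><item id="1">text</item></root>').each { |event| p event }
```

Each event is an array such as `[:open, name, attributes]`, `[:text, content]` or `[:close, name]`; building a tree from these events would be a separate step.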

--
WBR, Peter Zotov

Arun Kumar

3/25/2009 5:58:00 PM

0

Peter Zotov wrote:
> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>
>> Thanks. Can u please give me details of it.
> It is described nicely in Wikipedia:
> http://en.wikipedia.org/wiki/Finite_sta...
>
> As a clue, I can recommend you define following states: "text",
> "opening tag", "tag attribute", "tag attribute value", "closing tag".
> E. g. when you are in "text" state and get "<" symbol at input
> sequence, you change state to "opening tag" or "closing tag"...
> I still have one question: why you don't use REXML?

Hi,

The problem is that my boss does not want me to use any libraries to parse
XML. He also said to use regular expressions to extract the contents of
an XML tag. Can you please tell me how to do it? I'll be really grateful.

Regards
Arun Kumar

Jason Roelofs

3/25/2009 5:58:00 PM

0

On Wed, Mar 25, 2009 at 1:52 PM, Peter Zotov <whitequark@whitequark.ru> wrote:
> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>
>> Peter Zotov wrote:
>>>
>>> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>>>
>>>> Hi,
>>>> Is there any way in Ruby to parse an xml file without using REXML or any
>>>> other libraries.
>>>
>>> Of course. You can write a finite state machine, read XML from file
>>> and parse as you want.
>>
>> Thanks. Can u please give me details of it.
>
> It is described nicely in Wikipedia:
> http://en.wikipedia.org/wiki/Finite_sta...
>
> As a clue, I can recommend you define following states: "text", "opening
> tag", "tag attribute", "tag attribute value", "closing tag". E. g. when you
> are in "text" state and get "<" symbol at input sequence, you change state
> to "opening tag" or "closing tag"...
> I still have one question: why you don't use REXML?
>
> --
> WBR, Peter Zotov
>
>

Better question: Why *wouldn't* you want to use an existing library?
You'd have to spend months on your own before a custom solution even
starts to make sense over an existing, tested, and heavily used library
like libxml or Nokogiri (and, to be fair, Hpricot::XML, though it's more
for HTML parsing than XML).

Jason

Arun Kumar

3/25/2009 6:03:00 PM

0

Jason Roelofs wrote:
> On Wed, Mar 25, 2009 at 1:52 PM, Peter Zotov <whitequark@whitequark.ru>
> wrote:
>>>> Of course. You can write a finite state machine, read XML from file
>> to "opening tag" or "closing tag"...
>> I still have one question: why you don't use REXML?
>>
>> --
>> WBR, Peter Zotov
>>
>>
>
> Better question: Why *wouldn't* you want to use an existing library?
> You'd have to spend months on your own before it even starts to make
> sense to use such a custom solution over an existing, tested, and
> heavily used library like libxml or Nokogiri (and to be fair,
> Hpricot::XML, though it's more for HTML parsing than XML).
>
> Jason

Hi,
One problem is compatibility. I'm developing an application that
extracts the XML tags from a URL like 'http://www.shoe-g.com/inde...
and displays the contents within them. So compatibility is an issue. My
boss is strict about not using any complex libraries. Can you please help me?
Thanks once again.

Regards
Arun Kumar

Peter Zotov

3/25/2009 6:05:00 PM

0

Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:

> Peter Zotov wrote:
>> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>>
>>> Thanks. Can u please give me details of it.
>> It is described nicely in Wikipedia:
>> http://en.wikipedia.org/wiki/Finite_sta...
>>
>> As a clue, I can recommend you define following states: "text",
>> "opening tag", "tag attribute", "tag attribute value", "closing tag".
>> E. g. when you are in "text" state and get "<" symbol at input
>> sequence, you change state to "opening tag" or "closing tag"...
>> I still have one question: why you don't use REXML?
>
> Hi,
>
> The problem is that my boss does not want me to use any libraries to parse
> XML. He also said to use regular expressions to extract the contents of
> an XML tag. Can you please tell me how to do it? I'll be really grateful.

If you have, for example, this document:
----8<----
<?xml version="1.0" encoding="utf-8"?>
<root>
<some-tag>some text</some-tag>
</root>
----8<----

you can extract the contents of the tag "some-tag" with this (the code assumes
that the document is in the "document" variable):

document.match(/<some-tag>(.+?)<\/some-tag>/)[1]

But this will fail if one "some-tag" is nested inside another "some-tag", or if
the tag has attributes. Of course, these variants can be anticipated
and added to the regexp too, but that will make it very complicated.
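
As a rough sketch of what "adding these variants" could look like, the pattern below tolerates attributes and content that spans several lines. The helper name `tag_contents` is made up for illustration. It still assumes that elements of the same name are never nested, that attribute values never contain ">", and that the tag is never written in self-closing form.

```ruby
# Hypothetical helper: returns the inner text of every <name ...>...</name> element.
def tag_contents(document, name)
  tag = Regexp.escape(name)
  # (?:\s[^>]*)? allows optional attributes after the tag name;
  # the /m flag lets "." match newlines, so multi-line content is captured too.
  pattern = /<#{tag}(?:\s[^>]*)?>(.*?)<\/#{tag}>/m
  document.scan(pattern).flatten
end

document = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <root>
  <some-tag id="1">some text</some-tag>
  <some-tag>more text</some-tag>
  </root>
XML

p tag_contents(document, "some-tag")
```

Because `scan` collects every match, this returns an array with one entry per element rather than only the first match.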

Anyway, REXML is not an _external_ library for Ruby. It's in the stdlib!
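
For comparison, a short sketch of the same extraction with REXML, assuming REXML is available in your Ruby installation. The helper name `some_tag_texts` is illustrative only.

```ruby
require "rexml/document"

# Hypothetical helper: text of every <some-tag> element, located with XPath.
def some_tag_texts(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//some-tag").map(&:text)
end

document = <<~XML
  <?xml version="1.0" encoding="utf-8"?>
  <root>
  <some-tag>some text</some-tag>
  </root>
XML

p some_tag_texts(document)
```

Unlike the regexp, this keeps working when the tag gains attributes or its content spans several lines.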

--
WBR, Peter Zotov

Phlip

3/25/2009 6:07:00 PM

0

Arun Kumar wrote:
> Peter Zotov wrote:
>> Quoting "Arun Kumar" <arunkumar@innovaturelabs.com>:
>>
>>> Hi,
>>> Is there any way in Ruby to parse an xml file without using REXML or any
>>> other libraries.
>> Of course. You can write a finite state machine, read XML from file
>> and parse as you want.
>
> Thanks. Can u please give me details of it.

He told you to write a parser. That's the same as using REXML or any other library.

Why can't you use a library? REXML comes for free with Ruby, and is good enough
in a pinch.

If the XML input is very stable, and it never changes, you can parse some of it
with Regexp. That will break very easily, but it might be good enough for your
needs.

Phlip

3/25/2009 6:10:00 PM

0

Arun Kumar wrote:

> The problem is that my boss does not want me to use any libraries to parse
> XML. He also said to use regular expressions to extract the contents of
> an XML tag. Can you please tell me how to do it? I'll be really grateful.

Your boss is micromanaging you, and does not understand the relationship between
Ruby, its libraries, and its programmers. Bosses generally should not prohibit
valid techniques for bogus reasons.

That said, you could use "malicious compliance", and show her or him how fragile
regular expressions are. (Write unit tests that fail for the wrong XML, for
example.)
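
A small sketch of that kind of check, using the one-line regexp posted earlier in this thread wrapped in a made-up helper, `naive_extract`. The sample inputs are invented for illustration.

```ruby
# The regexp from earlier in the thread, wrapped so that a failed match returns nil.
def naive_extract(document)
  match = document.match(/<some-tag>(.+?)<\/some-tag>/)
  match && match[1]
end

# Invented sample inputs, each a legitimate way of writing the same kind of element.
cases = {
  "plain"      => "<some-tag>some text</some-tag>",
  "attribute"  => '<some-tag id="1">some text</some-tag>',
  "multi-line" => "<some-tag>some\ntext</some-tag>",
  "nested"     => "<some-tag>a<some-tag>b</some-tag>c</some-tag>"
}

cases.each do |label, xml|
  puts "#{label}: #{naive_extract(xml).inspect}"
end
```

Only the "plain" case returns the expected text. The attribute and multi-line cases return nil, and the nested case returns a truncated fragment.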

Or you could explain that REXML is not an _external_ library. It comes with
Ruby, so it's "free" to use. You never need to download and install it...

Phlip

3/25/2009 6:11:00 PM

0

> One problem is compatibility. I'm developing an application that
> extracts the XML tags from a URL like 'http://www.shoe-g.com/inde...
> and displays the contents within them. So compatibility is an issue. My
> boss is strict about not using any complex libraries. Can you please help me?
> Thanks once again.

I had a boss once who wouldn't let us use keyboards, because we might use them
to type bugs in.

Sheesh...