Asp Forum - Need help on File parsing

Maxx

3/21/2011 8:35:00 PM

I'm writing a C program that will parse an XML file as its input and
perform specific operations on it.
What I have in mind is to declare a two-dimensional array and store
the XML file in it,

for example: char country[][]={<countries>,
<country>,
<text>Norway</text>,
<value>N</value>,
</country>}, and so on.

My question is: is there a better way to do this, i.e. is there a
better way to store the XML input?

Thanks

40 Answers

Ian Collins

3/21/2011 8:44:00 PM

On 03/22/11 09:35 AM, Maxx wrote:
> I'm writing a C program which would parse a xml file as its input and
> perform specific operations...
> Now what i have in my mind is that i should declare a two dimensional
> array and store the xml file in it
>
> for example::: char country[][]={<countries>,
> <country>,
> <text>Norway</text>,
> <value>N</value>,
> </country>}, and so on
>
>
> My question is... is there any better way to do this, i.e. is there
> any better way to store the xml input input..

That's more of a generic programming question than a C one. Have a look
at a common XML parser like libxml, the documentation will give you
ideas even if you choose not to use the library.

--
Ian Collins

Ben Bacarisse

3/21/2011 9:54:00 PM

Maxx <grungeddd.maxx@gmail.com> writes:

> I'm writing a C program which would parse a xml file as its input and
> perform specific operations...

What specific operations? See below...

> Now what i have in my mind is that i should declare a two dimensional
> array and store the xml file in it
>
> for example::: char country[][]={<countries>,
> <country>,
> <text>Norway</text>,
> <value>N</value>,
> </country>}, and so on
>
>
> My question is... is there any better way to do this, i.e. is there
> any better way to store the xml input input..

It's almost impossible to say without knowing how a piece of data is
going to be accessed (or manipulated).

A good place to post would be comp.programming. If you say what you
propose to do with the XML you should get good help there. Be prepared
to be told that you should use an existing XML parsing library (because
that is almost always the right answer).

--
Ben.

John Doe

3/22/2011 3:42:00 AM

On Mon, 21 Mar 2011 13:35:01 -0700, Maxx wrote:

> I'm writing a C program which would parse a xml file as its input and
> perform specific operations...
> Now what i have in my mind is that i should declare a two dimensional
> array and store the xml file in it

> My question is... is there any better way to do this, i.e. is there any
> better way to store the xml input input..

Yes. In fact, it would be hard to imagine a worse way.

First, I wouldn't recommend trying to actually parse the XML yourself, as
you're practically bound to get it wrong. Use an XML parsing library
instead.

XML parsing libraries come in two main flavours: DOM and SAX. DOM
constructs a parse tree for the entire file, which the application can
then query. SAX generates events (reported via callbacks) as it parses the
file; it's up to the application to actually store the data.

Which flavour to use and exactly how to do it depend upon the details of
the application.
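
To make the SAX idea concrete, here is a minimal sketch of the event/callback model in plain C. It is not a real XML parser: the `scan` function is a toy stand-in for what a library would provide, and it only understands `<name>`, `</name>` and plain text. All names here (`Handlers`, `scan`, `Country`, `parse_country`) are hypothetical and exist only for illustration.

```c
#include <string.h>

/* Hypothetical SAX-style handler set: the scanner reports events,
   the application decides what to store. */
typedef struct {
    void (*start)(void *user, const char *name);
    void (*text)(void *user, const char *chars, size_t len);
    void (*end)(void *user, const char *name);
} Handlers;

/* Toy scanner for a tiny subset: <name>, </name> and plain text only.
   No attributes, comments, entities or well-formedness checks.
   Returns 0 on success, -1 if a tag is unterminated or too long. */
static int scan(const char *s, const Handlers *h, void *user)
{
    char name[64];
    while (*s) {
        if (*s == '<') {
            int closing = (s[1] == '/');
            const char *p = s + 1 + closing;
            const char *gt = strchr(p, '>');
            size_t n;
            if (!gt) return -1;
            n = (size_t)(gt - p);
            if (n == 0 || n >= sizeof name) return -1;
            memcpy(name, p, n);
            name[n] = '\0';
            if (closing) h->end(user, name); else h->start(user, name);
            s = gt + 1;
        } else {
            const char *lt = strchr(s, '<');
            size_t n = lt ? (size_t)(lt - s) : strlen(s);
            h->text(user, s, n);
            s += n;
        }
    }
    return 0;
}

/* Application state: remembers which element we are inside and
   keeps only the two fields it cares about. */
typedef struct {
    char current[64];
    char text[64];
    char value[8];
} Country;

/* Bounded copy that always NUL-terminates (truncates if too long). */
static void copy_field(char *dst, size_t cap, const char *src, size_t len)
{
    if (len >= cap) len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

static void on_start(void *user, const char *name)
{
    Country *c = user;
    copy_field(c->current, sizeof c->current, name, strlen(name));
}

static void on_text(void *user, const char *chars, size_t len)
{
    Country *c = user;
    if (strcmp(c->current, "text") == 0)
        copy_field(c->text, sizeof c->text, chars, len);
    else if (strcmp(c->current, "value") == 0)
        copy_field(c->value, sizeof c->value, chars, len);
}

static void on_end(void *user, const char *name)
{
    Country *c = user;
    (void)name;
    c->current[0] = '\0';   /* text between elements is ignored */
}

/* Runs the toy scanner with the handlers above. */
static int parse_country(const char *xml, Country *out)
{
    static const Handlers h = { on_start, on_text, on_end };
    memset(out, 0, sizeof *out);
    return scan(xml, &h, out);
}
```

Calling `parse_country` on a single `<country>` element fills in `text` and `value` without ever holding the whole document in memory, which is the trade-off against DOM: less memory, but the application has to track its own context.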

Malcolm McLean

3/22/2011 6:41:00 AM

On Mar 21, 10:35 pm, Maxx <grungeddd.m...@gmail.com> wrote:
>
> My question is... is there any better way to do this, i.e. is there
> any better way to store the xml input input..
>
Think of the XML as a tree, and build what is known as a recursive
descent parser.

Basically it's the same problem as a mathematical expression with
deeply nested parentheses, in a slightly different form. You need one
token of lookahead.

Once you've converted the XML to a tree, you'll usually want to walk
the tree to convert to a set of nested arrays, but sometimes it will
be better to keep the data in tree form.
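
A rough sketch of that approach, assuming a deliberately tiny grammar (elements that contain either child elements or text, with no attributes, comments or entities). The names `Node`, `parse_element`, `child` and `free_tree` are made up for this example, and a real document would need far more than this handles.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* One tree node per element: a name, optional text, and a child list. */
typedef struct Node {
    char name[32];
    char text[64];
    struct Node *first_child;
    struct Node *next_sibling;
} Node;

static void skip_ws(const char **s)
{
    while (isspace((unsigned char)**s)) ++*s;
}

/* Frees a node, its children, and its following siblings. */
static void free_tree(Node *n)
{
    while (n) {
        Node *next = n->next_sibling;
        free_tree(n->first_child);
        free(n);
        n = next;
    }
}

/* element := '<' name '>' (element* | text) '</' name '>'
   Returns NULL on malformed input or allocation failure. */
static Node *parse_element(const char **s)
{
    Node *n, **tail;
    size_t len;
    const char *end;

    skip_ws(s);
    if (**s != '<' || (*s)[1] == '/') return NULL;
    end = strchr(*s, '>');
    if (!end) return NULL;
    len = (size_t)(end - (*s + 1));
    if (len == 0 || len >= sizeof n->name) return NULL;
    n = calloc(1, sizeof *n);
    if (!n) return NULL;
    memcpy(n->name, *s + 1, len);
    *s = end + 1;
    tail = &n->first_child;

    skip_ws(s);
    /* Lookahead: "<x" starts a child, "</" closes this element,
       anything else is character data. */
    while (**s == '<' && (*s)[1] != '/') {
        *tail = parse_element(s);          /* recurse for each child */
        if (!*tail) { free_tree(n); return NULL; }
        tail = &(*tail)->next_sibling;
        skip_ws(s);
    }
    if (**s != '<') {
        end = strchr(*s, '<');
        if (!end) { free_tree(n); return NULL; }
        len = (size_t)(end - *s);
        if (len >= sizeof n->text) { free_tree(n); return NULL; }
        memcpy(n->text, *s, len);
        *s = end;
    }
    /* Expect the matching close tag. */
    len = strlen(n->name);
    if (strncmp(*s, "</", 2) != 0 || strncmp(*s + 2, n->name, len) != 0
        || (*s)[2 + len] != '>') { free_tree(n); return NULL; }
    *s += len + 3;
    return n;
}

/* Walk helper: first direct child with the given name, or NULL. */
static const Node *child(const Node *n, const char *name)
{
    const Node *c;
    for (c = n->first_child; c; c = c->next_sibling)
        if (strcmp(c->name, name) == 0) return c;
    return NULL;
}
```

With the tree built, `child(child(root, "country"), "text")->text` reaches the country name in the example document. The fixed-size `name` and `text` buffers keep the sketch short; a serious version would allocate them dynamically.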

Nick

3/22/2011 7:01:00 AM

Malcolm McLean <malcolm.mclean5@btinternet.com> writes:

> On Mar 21, 10:35 pm, Maxx <grungeddd.m...@gmail.com> wrote:
>>
>> My question is... is there any better way to do this, i.e. is there
>> any better way to store the xml input input..
>>
> Think of the XML as a tree, and build what is known as a recursive
> descent parser.
>
> Basically it's the same problem as a mathematical expression with
> deeply nested parentheses, in a slightly different form. You need one
> token of lookahead.
>
> Once you've converted the XML to a tree, you'll usually want to walk
> the tree to convert to a set of nested arrays, but sometimes it will
> be better to keep the data in tree form.

I did it the other way round. First I wrote a good generic "values"
handling system that allowed me to have named strings, integers, lists,
string-indexed-arrays, all as recursive as you like. That was the
difficult bit.

Then I just hooked xmlparse up to it and it sucked the XML in nicely.
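
For readers wondering what such a "values" system might look like, here is one possible shape in C. This is only a guess at the general idea, not the actual code described above: every type and function name (`Value`, `value_new`, `list_append`, `list_get`, and so on) is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { V_INT, V_STRING, V_LIST } Kind;

/* A value is an int, a string, or a list of further values. The optional
   name lets a list double as a string-indexed array. */
typedef struct Value {
    Kind kind;
    char *name;            /* NULL for unnamed values */
    long i;                /* used when kind == V_INT */
    char *s;               /* used when kind == V_STRING */
    struct Value *items;   /* used when kind == V_LIST: first item */
    struct Value *next;    /* next item in the enclosing list */
} Value;

static char *dup_str(const char *s)
{
    char *p;
    if (!s) return NULL;
    p = malloc(strlen(s) + 1);
    if (p) strcpy(p, s);
    return p;
}

static Value *value_new(Kind kind, const char *name)
{
    Value *v = calloc(1, sizeof *v);
    if (v) { v->kind = kind; v->name = dup_str(name); }
    return v;
}

static Value *value_int(const char *name, long i)
{
    Value *v = value_new(V_INT, name);
    if (v) v->i = i;
    return v;
}

static Value *value_string(const char *name, const char *s)
{
    Value *v = value_new(V_STRING, name);
    if (v) v->s = dup_str(s);
    return v;
}

/* Appends item to the end of list; returns list for chaining. */
static Value *list_append(Value *list, Value *item)
{
    Value **tail = &list->items;
    while (*tail) tail = &(*tail)->next;
    *tail = item;
    return list;
}

/* Looks up a direct member of a list by name, or NULL if absent. */
static Value *list_get(const Value *list, const char *name)
{
    Value *v;
    for (v = list->items; v; v = v->next)
        if (v->name && strcmp(v->name, name) == 0) return v;
    return NULL;
}

/* Frees a value, everything nested inside it, and its later siblings. */
static void value_free(Value *v)
{
    while (v) {
        Value *next = v->next;
        value_free(v->items);
        free(v->name);
        free(v->s);
        free(v);
        v = next;
    }
}
```

Because a list can hold other lists, a `country` list holding named `text` and `value` strings can itself be appended to a `countries` list, which is the recursive nesting an XML document needs.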

Think hard about what you want, if anything, to distinguish between:

<stuff>
<item>fred</item>
</stuff>

<stuff item="fred"/>

To summarise - you need more of a specification of the problem before
starting to find a solution.
--
Online waterways route planner | http://ca...
Plan trips, see photos, check facilities | http://canalp...

Maxx

3/22/2011 8:10:00 PM

On Mar 21, 1:43 pm, Ian Collins <ian-n...@hotmail.com> wrote:
> On 03/22/11 09:35 AM, Maxx wrote:
>
> > I'm writing a C program which would parse a xml file as its input and
> > perform specific operations...
> > Now what i have in my mind is that i should declare a two dimensional
> > array and store the xml file in it
>
> > for example::: char country[][]={<countries>,
> > <country>,
> > <text>Norway</text>,
> > <value>N</value>,
> > </country>}, and so on
>
> > My question is... is there any better way to do this, i.e. is there
> > any better way to store the xml input input..
>
> That's more of a generic programming question than a C one. Have a look
> at a common XML parser like libxml, the documentation will give you
> ideas even if you choose not to use the library.
>
> --
> Ian Collins

Alright, I've looked up libxml and seem to have hit the jackpot. It
contains the functions I need.
Thanks

Maxx

3/22/2011 8:14:00 PM

On Mar 21, 8:41 pm, Nobody <nob...@nowhere.com> wrote:
> On Mon, 21 Mar 2011 13:35:01 -0700, Maxx wrote:
> > I'm writing a C program which would parse a xml file as its input and
> > perform specific operations...
> > Now what i have in my mind is that i should declare a two dimensional
> > array and store the xml file in it
> > My question is... is there any better way to do this, i.e. is there any
> > better way to store the xml input input..
>
> Yes. In fact, it would be hard to imagine a worse way.
>
> First, I wouldn't recommend trying to actually parse the XML yourself, as
> you're practically bound to get it wrong. Use an XML parsing library
> instead.
>
> XML parsing libraries come in two main flavours: DOM and SAX. DOM
> constructs a parse tree for the entire file, which the application can
> then query. SAX generates events (reported via callbacks) as it parses the
> file; it's up to the application to actually store the data.
>
> Which flavour to use and exactly how to do it depend upon the details of
> the application.

Actually, the XML file that I am going to give the program will
always have a predefined format, like the example I gave above. The
program will always parse the same format, simply extract the values
from the fields, and write another XML file with the same template. So
I was looking for the easiest way to solve it, without having to call
extensive library functions.

Anyway, thanks

Maxx

3/22/2011 8:17:00 PM

On Mar 21, 11:40 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:
> On Mar 21, 10:35 pm, Maxx <grungeddd.m...@gmail.com> wrote:
>
> > My question is... is there any better way to do this, i.e. is there
> > any better way to store the xml input input..
>
> Think of the XML as a tree, and build what is known as a recursive
> descent parser.
>
> Basically it's the same problem as a mathematical expression with
> deeply nested parentheses, in a slightly different form. You need one
> token of lookahead.
>
> Once you've converted the XML to a tree, you'll usually want to walk
> the tree to convert to a set of nested arrays, but sometimes it will
> be better to keep the data in tree form.

Yeah, I had this concept in mind at first, but since I was going to
write a simple program that just extracts values from a set of
predefined fields, I avoided going into trees. I reckon a tree would be
the best solution, but I'm still quite inexperienced with trees.

Thanks

Maxx

3/22/2011 8:20:00 PM

On Mar 22, 12:01 am, Dr Nick <3-nos...@temporary-address.org.uk>
wrote:
> Malcolm McLean <malcolm.mcle...@btinternet.com> writes:
> > On Mar 21, 10:35 pm, Maxx <grungeddd.m...@gmail.com> wrote:
>
> >> My question is... is there any better way to do this, i.e. is there
> >> any better way to store the xml input input..
>
> > Think of the XML as a tree, and build what is known as a recursive
> > descent parser.
>
> > Basically it's the same problem as a mathematical expression with
> > deeply nested parentheses, in a slightly different form. You need one
> > token of lookahead.
>
> > Once you've converted the XML to a tree, you'll usually want to walk
> > the tree to convert to a set of nested arrays, but sometimes it will
> > be better to keep the data in tree form.
>
> I did it the other way round. First I wrote a good generic "values"
> handling system that allowed me to have named strings, integers, lists,
> string-indexed-arrays, all as recursive as you like. That was the
> difficult bit.
>
> Then I just hooked xmlparse up to it and it sucked the XML in nicely.
>
> Think hard about what you want, if anything, to distinguish between:
>
> <stuff>
> <item>fred</item>
> </stuff>
>
> <stuff item="fred"/>
>
> To summarise - you need more of a specification of the problem before
> starting to find a solution.
> --
> Online waterways route planner |http://ca...
> Plan trips, see photos, check facilities |http://canalp...

Yes, a generic list of values would be helpful, but I need more
ideas on how to implement it. I'm trying to avoid library functions in
this program, as it will always parse the same fields over and over
again.

David Resnick

3/23/2011 6:46:00 PM

On Mar 22, 4:13 pm, Maxx <grungeddd.m...@gmail.com> wrote:
> On Mar 21, 8:41 pm, Nobody <nob...@nowhere.com> wrote:
>
>
>
> > On Mon, 21 Mar 2011 13:35:01 -0700, Maxx wrote:
> > > I'm writing a C program which would parse a xml file as its input and
> > > perform specific operations...
> > > Now what i have in my mind is that i should declare a two dimensional
> > > array and store the xml file in it
> > > My question is... is there any better way to do this, i.e. is there any
> > > better way to store the xml input input..
>
> > Yes. In fact, it would be hard to imagine a worse way.
>
> > First, I wouldn't recommend trying to actually parse the XML yourself, as
> > you're practically bound to get it wrong. Use an XML parsing library
> > instead.
>
> > XML parsing libraries come in two main flavours: DOM and SAX. DOM
> > constructs a parse tree for the entire file, which the application can
> > then query. SAX generates events (reported via callbacks) as it parses the
> > file; it's up to the application to actually store the data.
>
> > Which flavour to use and exactly how to do it depend upon the details of
> > the application.
>
> Actually the xml file that i was going to provide the program will
> always have a predefined format, like the one example i gave above.It
> will always parse the same format and simply extract the values from
> the fields and write another xml file having the same template... so i
> was looking for the easiest way to solve it, instead of requiring to
> call extensive library functions...

Note that it always starts this way. It is easy to hand parse the XML
if it is in a truly fixed format, so why use a real parser? But then
there are modifications, extensions, etc. People hand edit the file and
add white space, which won't confuse a parser but messes up your less
flexible hand parse. People write a mixture of <element></element>
and <element/>, which should parse as equivalent and somehow
doesn't when hand parsing. People suddenly want validation, etc.
Going with a real parser is very much the way to go in a real
application: it is much more future friendly even if not apparently
needed up front.
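
To illustrate the kind of breakage meant here, below is a sketch of a typical hand-rolled extractor that searches for the literal bytes `<tag>` and `</tag>`. The function `naive_field` is hypothetical and is shown as an example of what goes wrong, not as a recommendation.

```c
#include <stdio.h>
#include <string.h>

/* Naive extractor: copies whatever lies between the exact byte sequences
   "<tag>" and "</tag>". Returns 0 on success, -1 if either sequence is
   missing or the content does not fit in out. */
static int naive_field(const char *xml, const char *tag, char *out, size_t cap)
{
    char open_tag[40], close_tag[40];
    const char *start, *end;
    size_t len;

    if (strlen(tag) > 32 || cap == 0) return -1;
    sprintf(open_tag, "<%s>", tag);
    sprintf(close_tag, "</%s>", tag);

    start = strstr(xml, open_tag);
    if (!start) return -1;
    start += strlen(open_tag);
    end = strstr(start, close_tag);
    if (!end) return -1;

    len = (size_t)(end - start);
    if (len >= cap) return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

On `<value>N</value>` this returns "N" as intended. If someone reformats the file so the content sits on its own indented line, the newlines and spaces come back as part of the value. If the tag is written as `<value >` or the element is empty and written as `<value/>`, both of which are legal XML, the function reports the field as missing, while `<value></value>` yields an empty string. A real parser treats these spellings consistently.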

comp.lang.c
