comp.lang.python

Python help for a C++ programmer

mlimber

1/16/2008 2:23:00 PM

I'm writing a text processing program to process some survey results.
I'm familiar with C++ and could write it in that, but I thought I'd
try out Python. I've got a handle on the file I/O and regular
expression processing, but I'm wondering about building my array of
classes (I'd probably use a struct in C++ since there are no methods,
just data).

I want something like (C++ code):

struct Response
{
    std::string name;
    int age;
    int iData[ 10 ];
    std::string sData;
};

// Prototype
void Process( const std::vector<Response>& );

int main()
{
    std::vector<Response> responses;

    while( /* not end of file */ )
    {
        Response r;

        // Fill struct from file
        r.name = /* get the data from the file */;
        r.age = /* ... */;
        r.iData[0] = /* ... */;
        // ...
        r.sData = /* ... */;
        responses.push_back( r );
    }

    // Do some processing on the responses
    Process( responses );
}

What is the preferred way to do this sort of thing in Python?

Thanks in advance! --M
4 Answers

Lutz Horn

1/16/2008 2:39:00 PM


Hi,

On Wed, 16 Jan 2008 06:23:10 -0800 (PST), "mlimber" <mlimber@gmail.com>
said:
> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing, but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).

You could try something like this.

#!/usr/bin/env python

class Response:
    def __init__(self, name, age, iData, sData):
        self.name = name
        self.age = age
        self.iData = iData
        self.sData = sData

def sourceOfResponses():
    # Stand-in for the real input: each record is [name, age, iData, sData].
    return [["you", 42, [1, 2, 3], ["foo", "bar", "baz"]],
            ["me", 23, [1, 2, 3], ["ham", "spam", "eggs"]]]

if __name__ == "__main__":
    responses = []
    # Call the function, then unpack each record into its four fields.
    for name, age, iData, sData in sourceOfResponses():
        response = Response(name, age, iData, sData)
        responses.append(response)

Lutz

Neil Cerutti

1/16/2008 2:43:00 PM


On Jan 16, 2008 9:23 AM, mlimber <mlimber@gmail.com> wrote:
> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing, but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).
>
> I want something like (C++ code):
>
> struct Response
> {
> std::string name;
> int age;
> int iData[ 10 ];
> std::string sData;
> };
>
> // Prototype
> void Process( const std::vector<Response>& );
>
> int main()
> {
> std::vector<Response> responses;
>
> while( /* not end of file */ )
> {
> Response r;
>
> // Fill struct from file
> r.name = /* get the data from the file */;
> r.age = /* ... */;
> r.iData[0] = /* ... */;
> // ...
> r.sData = /* ... */;
> responses.push_back( r );
> }
>
> // Do some processing on the responses
> Process( responses );
> }
>
> What is the preferred way to do this sort of thing in Python?

It depends on the format of your data (Python provides lots of
shortcuts for handling lots of kinds of data), but perhaps something
like this, if you do all the parsing manually:

class Response(object):
    def __init__(self, extern_rep):
        # parse or translate extern_rep into ...
        self.name = ...
        self.age = ...
        # Use a dictionary instead of parallel lists.
        self.data = {...}

    def process(self):
        # Do what you need to do.
        pass

fstream = open('thedatafile')

for line in fstream:
    # This assumes each line is one response.
    Response(line).process()
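
To make the "format-specific shortcuts" point concrete, here is a minimal sketch for one possible format. It assumes each response is a single tab-separated line holding a name, an age, and then any number of integer answers; that layout, the `parse_responses` helper, the `q1`, `q2`, ... keys, and the sample text are all assumptions for illustration, not something stated in the original post.

```python
import csv
import io

class Response(object):
    def __init__(self, row):
        # Assumed layout: name, age, then any number of integer answers.
        self.name = row[0]
        self.age = int(row[1])
        # One dictionary keyed by question label instead of parallel lists.
        self.data = {"q%d" % i: int(v) for i, v in enumerate(row[2:], 1)}

def parse_responses(stream):
    # csv.reader does the splitting; delimiter="\t" selects tab-separated input.
    # Blank lines come back as empty rows and are skipped.
    return [Response(row) for row in csv.reader(stream, delimiter="\t") if row]

# io.StringIO stands in for an open file so the sketch is self-contained.
sample = io.StringIO("alice\t34\t1\t5\t3\nbob\t27\t2\t2\t4\n")
responses = parse_responses(sample)
print(responses[0].name, responses[0].age, responses[0].data)
```

With a real file, an `open(...)` file object would take the place of `sample`.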

--
Neil Cerutti <mr.cerutti+python@gmail.com>

Tim Chase

1/16/2008 2:56:00 PM


> I want something like (C++ code):
>
> struct Response
> {
> std::string name;
> int age;
> int iData[ 10 ];
> std::string sData;
> };
>
> // Prototype
> void Process( const std::vector<Response>& );
>
> int main()
> {
> std::vector<Response> responses;
>
> while( /* not end of file */ )
> {
> Response r;
>
> // Fill struct from file
> r.name = /* get the data from the file */;
> r.age = /* ... */;
> r.iData[0] = /* ... */;
> // ...
> r.sData = /* ... */;
> responses.push_back( r );
> }
>
> // Do some processing on the responses
> Process( responses );
> }
>
> What is the preferred way to do this sort of thing in Python?

Without knowing more about the details involved with parsing the
file, here's a first-pass whack at it:

class Response(object):
    def __init__(self, name, age, iData, sData):
        self.name = name
        self.age = age
        self.iData = iData
        self.sData = sData

    def __repr__(self):
        # Two placeholders need a two-item tuple on the right-hand side.
        return '%s (%s)' % (self.name, self.age)

def parse_response_from_line(line):
    # Assumes exactly four tab-separated fields per line; all stay strings.
    name, age, iData, sData = line.rstrip('\n').split('\t')
    return Response(name, age, iData, sData)

def process(response):
    print('Processing %r' % response)

responses = [parse_response_from_line(line)
             for line in open('input.txt')]

for response in responses:
    process(response)


That last pair might be condensed to just

for line in open('input.txt'):
    process(parse_response_from_line(line))

Things get a bit hairier if your input is multi-line. You might
have to do something like

def getline(fp):
    return fp.readline().rstrip('\n')

def response_generator(fp):
    # Each response spans four consecutive lines; an empty name
    # (end of file or a blank line) stops the loop.
    name = None
    while name != '':
        name = getline(fp)
        age = getline(fp)
        iData = getline(fp)
        sData = getline(fp)
        if name and age and iData and sData:
            yield Response(name, age, iData, sData)

fp = open('input.txt')
for response in response_generator(fp):
    process(response)

which you can modify accordingly.
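
One modification a C++ programmer is likely to want: `split` hands back strings, so `age` and `iData` have to be converted explicitly to match the `int` fields of the original struct. A minimal sketch, assuming `iData` is stored as comma-separated integers such as `1,2,3` (the post does not say how it is stored):

```python
class Response(object):
    def __init__(self, name, age, iData, sData):
        self.name = name
        self.age = age
        self.iData = iData
        self.sData = sData

def parse_response_from_line(line):
    name, age, iData, sData = line.rstrip("\n").split("\t")
    # Assumption: iData looks like "1,2,3"; int() raises ValueError on bad input.
    numbers = [int(x) for x in iData.split(",")]
    return Response(name, int(age), numbers, sData)

r = parse_response_from_line("alice\t34\t1,2,3\tsome free text\n")
print(r.age + 1, sum(r.iData))
```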

-tkc




Bruno Desthuilliers

1/16/2008 4:06:00 PM


mlimber wrote:
> I'm writing a text processing program to process some survey results.
> I'm familiar with C++ and could write it in that, but I thought I'd
> try out Python. I've got a handle on the file I/O and regular
> expression processing,

FWIW, and depending on your text format, there may be better solutions
than regexps.
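
For instance, when fields are separated by a fixed delimiter, plain string methods are often enough. A small sketch, assuming a hypothetical `key: value` line format and a made-up `parse_field` helper, neither of which comes from the original post:

```python
def parse_field(line):
    # str.partition splits on the first occurrence of the separator only,
    # so the value may itself contain ':' characters.
    key, sep, value = line.partition(":")
    if not sep:
        raise ValueError("no ':' separator in %r" % line)
    return key.strip(), value.strip()

print(parse_field("name: Alice"))
print(parse_field("comment: time was 10:30"))
```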

> but I'm wondering about building my array of
> classes (I'd probably use a struct in C++ since there are no methods,
> just data).

If you have no methods and you're sure you won't need any, then
just use a dict (name-indexed record) or a tuple (position-indexed record).
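
A quick sketch of what the two record styles look like in practice, with made-up sample values. The `namedtuple` at the end is not part of this reply; it is shown only as one possible middle ground, a tuple whose fields can also be read by name.

```python
from collections import namedtuple

# Name-indexed record: a plain dict.
as_dict = {"name": "alice", "age": 34, "iData": [1, 2, 3], "sData": "text"}

# Position-indexed record: a plain tuple, unpacked by position.
as_tuple = ("alice", 34, [1, 2, 3], "text")
name, age, iData, sData = as_tuple

# Not from the reply: a namedtuple is a tuple with named fields.
Response = namedtuple("Response", ["name", "age", "iData", "sData"])
as_named = Response(*as_tuple)

print(as_dict["age"], as_tuple[1], as_named.age)
```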

> I want something like (C++ code):
>
> struct Response
> {
> std::string name;
> int age;
> int iData[ 10 ];
> std::string sData;
> };
>
> // Prototype
> void Process( const std::vector<Response>& );
>
> int main()
> {
> std::vector<Response> responses;
>
> while( /* not end of file */ )
> {
> Response r;
>
> // Fill struct from file
> r.name = /* get the data from the file */;
> r.age = /* ... */;
> r.iData[0] = /* ... */;
> // ...
> r.sData = /* ... */;
> responses.push_back( r );
> }
>
> // Do some processing on the responses
> Process( responses );
> }
>
> What is the preferred way to do this sort of thing in Python?

# assuming you're using a line-oriented format, and not
# worrying about exception handling etc...

def extract(line):
    data = dict()
    data['name'] = None  # get the name
    data['age'] = None   # get the age
    data['data'] = None  # etc...
    return data


def process(responses):
    pass  # code here

if __name__ == '__main__':
    import sys
    path = sys.argv[1]
    responses = [extract(line) for line in open(path)]
    process(responses)

If you have a very large dataset, you may want to use tuples
instead of dicts (less overhead) and/or take a more stream-oriented
approach using generators, if applicable of course (that is, if you
don't need to extract all the results before processing).
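
A rough sketch of that stream-oriented variant, assuming a tab-separated name and age per line and a processing step that only needs a running total. The `iter_responses` and `average_age` names and the sample data are invented for illustration.

```python
import io

def extract(line):
    # Assumed format: tab-separated name and age; returns a tuple record.
    name, age = line.rstrip("\n").split("\t")
    return name, int(age)

def iter_responses(stream):
    # A generator: each record is produced only when the consumer asks
    # for it, so the whole file is never held in memory at once.
    for line in stream:
        if line.strip():
            yield extract(line)

def average_age(responses):
    # Consumes the records one at a time, keeping only running totals.
    total = count = 0
    for _name, age in responses:
        total += age
        count += 1
    return total / count if count else 0.0

# io.StringIO stands in for an open file so the sketch is self-contained.
sample = io.StringIO("alice\t34\nbob\t26\n")
print(average_age(iter_responses(sample)))
```

The trade-off is that a generator can be consumed only once; if the processing needs several passes over the data, building the list first is the simpler choice.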

HTH