[lnkForumImage]
TotalShareware - Download Free Software

Confronta i prezzi di migliaia di prodotti.
Asp Forum
 Home | Login | Register | Search 


 

Forums >

comp.lang.python

Re: database query - logic question

Tim Chase

1/2/2008 9:43:00 PM

Israel Carr wrote:
> Thanks for anyone who takes the time to read this. If I posted to the
> wrong list, I apologize and you can disregard.
>
> I need help with a script to pull data from a postgres database. I'm ok
> with the database connection just not sure how to parse the data to get
> the results I need.
>
> I'm running Python 2.4.4. For what it's worth, once I can get my logic
> correct I'll be publishing the reports mentioned below via zope for web
> clients.
>
> Here is a small sample of the records in the table:
>
> name date time status
> machine1 01/01/2008 13:00:00 system ok
> machine1 01/01/2008 13:05:00 system ok
> machine1 01/01/2008 13:10:00 status1
> machine1 01/01/2008 13:10:30 status1
> machine1 01/01/2008 13:11:00 system ok
> machine1 01/01/2008 13:16:30 status2
> machine1 01/01/2008 13:17:00 status2
> machine1 01/01/2008 13:17:30 status2
> machine1 01/01/2008 13:18:00 status2
> machine1 01/01/2008 13:18:30 status2
> machine1 01/01/2008 13:19:00 system ok
> machine1 01/01/2008 13:24:00 status2
> machine1 01/01/2008 13:24:30 status2
> machine1 01/01/2008 13:25:00 system ok
>
> I need to report from this data.
> The detail report needs to be something like:
> machine1 01/01/2008 13:10:00 status1 00:01:30
> machine1 01/01/2008 13:16:30 status2 00:02:30
> machine1 01/01/2008 13:24:00 status2 00:01:00

Well, just for fun of the SQL challenge, I tossed together the
following (using sqlite3)

SELECT name, Min(ts) as ts, next_ts, status
FROM (
SELECT *, (
SELECT ts
FROM test
WHERE
test.name = t.name
AND test.ts > t.ts
AND test.status = 'system ok'
ORDER BY test.ts ASC
LIMIT 1) AS next_ts
FROM test t
WHERE status <> 'system ok'
) with_next
GROUP BY name, status, next_ts

where my table has "name", "ts" (a timestamp field combo of your
"date" and "time" fields, and for sqlite, formatting in
"YYYY-MM-DD mm:ss" format)

which yields rows with the machine name, the non "system ok"
status, the timestamp of the initial event, and the timestamp of
the subsequent "system ok" stamp. There's a bit of an
underdefined case where you have more than one non-OK status
before OK gets reset:

00:10 status1
00:20 status1
00:30 status2
00:40 status ok

If this can't happen, it should work fine. If the above can
happen, you'll get odd overlaps in your reporting. Since I
couldn't find an Interval data type in sqlite, you'd just have to
take the "ts" and "next_ts" columns and subtract them to get the
interval you want.

> and the summary needs to be
> machine1 01/01/2008 total 'status1' time = 00:01:30
> machine1 01/01/2008 total 'status2' time = 00:03:30
> _____
> machine1 01/01/2008 total 'non-OK' time = 00:05:00 #this is the
> sum of status1 and status2 times

While the below doesn't track the changing of the machine, you
can follow the basic framework given here. I threw in a couple
helper functions to normalize whatever data types
("normalize_status()" and "make_timestamp()")

NO_TIME = datetime.datetime(datetime.MINYEAR, 1, 1)
OK = 'system ok'
normalize_status = lambda s: s.lower()

def log(s):
print s
print '=' * len(s)

def make_timestamp(date, time):
d = datetime.datetime(*(int(s) for s in
date.split('-') +
time.split(':')))
return d

status_tally = {}
last_status = OK
last_ts = NO_TIME
log('Intervals (your first request)')
for i, (machine, date, time, status) in enumerate(fetchall()):
ts = make_timestamp(date, time)
status = normalize_status(status)
if status == OK and last_status <> OK:
interval = ts - last_ts
print machine, last_status, last_ts, interval
if last_status in status_tally:
status_tally[last_status] += interval
else:
status_tally[last_status] = interval
last_status = status
elif status <> OK and last_status == OK:
last_ts = ts
last_status = status

log('Summary (your 2nd request)')
for k,v in status_tally.iteritems():
print k, v

log('Grand Total (your 3rd request)')
print sum(status_tally.values(), datetime.timedelta(0))

Thanks for the mental exercise. :)

-tkc