Alex Young
9/17/2006 1:41:00 PM
ara.t.howard@noaa.gov wrote:
>
> anyone out there have good ways of dealing with multithreaded logging? my
> current approach uses a meta programming hack/mutex/etc so that, if the
> calling thread is the main one, a tid is not written, otherwise it is.
> thus,
>
> I, [2006-09-14T15:41:58.394355 #14588] INFO -- : aquired lock.
>
> I, [2006-09-14T15:41:59.225737 #14588] INFO -- : t[-609956564] command
> <DDB_GEN_DIR=/dev/shm/fod/gen
> DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out
> DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/de v/shm/fod/tmp
> DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
> DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
> ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
> -0.420875420875421 -S fod 2000.F12_0010909_DS.DAT
> 2000.F12_0011055_DS.DAT>
>
>
> shows the first line being logged in the main thread, while the next
> line is
> logged by tid -609956564. this is ok. but, obviously, log lines become
> interleaved at a certian point leading to sections like
>
> I, [2006-09-14T15:43:03.486280 #14588] INFO -- : t[-610194194] command
> <DDB_GEN_DIR=/dev/shm/foa/gen
> DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/foa/out
> DDB_LOG_DIR=/dev/shm/foa/log DDB_TMP_DIR=/dev/shm/foa/tmp
> DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
> DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
> ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
> -0.420875420875421 -S foa 2000.F12_0011232_DS.DAT 2000.F12_0011414_DS.DAT>
> I, [2006-09-14T15:43:03.487003 #14588] INFO -- : t[-610194894] tmp_dir
> </raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/tmp>
> I, [2006-09-14T15:43:03.489989 #14588] INFO -- : t[-610194894] command
> <DDB_GEN_DIR=/dev/shm/fod/gen
> DDB_OUT_DIR=/raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out
> DDB_LOG_DIR=/dev/shm/fod/log DDB_TMP_DIR=/dev/shm/fod/tmp
> DDB_EPH_DIR=/raid/lv1/dmsp_reprocessing/ddb/eph.d
> DDB_BVL_DIR=/raid/lv1/dmsp_reprocessing/ddb/bvl.d DDB_BVL_EXE=false
> ddbasic -M nrt -n 2 -P ofdt -0.252525252525252 -P osdt
> -0.420875420875421 -S fod 2000.F12_0011232_DS.DAT
> 2000.F12_0011414_DS.DAT>
> I, [2006-09-14T15:44:18. 45782 #14588] INFO -- : t[-610194894] status <0>
> I, [2006-09-14T15:44:18. 46468 #14588] INFO -- : t[-610297924] command
> <empty_ols
> /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010754.OIS>
> I, [2006-09-14T15:44:18. 47187 #14588] INFO -- : t[-610299344] command
> <empty_ols
> /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001010936.OIS>
> I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610298484] command
> <empty_ols
> /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011118.OIS>
> I, [2006-09-14T15:44:18. 53356 #14588] INFO -- : t[-610301254] command
> <empty_ols
> /raid/lv1/dmsp_reprocessing/ddb/zwd.d/F12/fod/out/F12200001011300.OIS>
> I, [2006-09-14T15:44:25.361170 #14588] INFO -- : t[-610194194] status <0>
> I, [2006-09-14T15:44:18. 50663 #14588] INFO -- : t[-610297924] status <0>
>
>
> here, we can certainly track the status of commands back the the
> line/thread
> that issued them, but it's pretty painful to to by hand. of course, one
> could
> easily post-process in order to analyze...
>
> so - has anyone else solved this problem neatly?
I handle this by having each thread *always* drop its thread id into the
logs, whether it's the main thread or not. However, I don't do it as a
number - I write the entry at /usr/share/dict/words corresponding to the
thread id modulo the length of the file. That way, the threads are
identified by a memorable word rather than an unmemorable number, and
grep does the rest. It might not seem ground-breaking, but it makes the
whole thing much more friendly.
I've toyed in the past with assigning each thread a sequential ID, and
altering the log output so that it's actually a CSV file with the log
output for each thread appearing in its own column, but that ended up
being more trouble than it was worth at the time. I can dig the code up
when I'm in the office tomorrow if you're interested, but I'm sure
you're more than capable of replicating it without my help :-)
--
Alex