
comp.lang.ruby

Logger rotation problem in multi-process/Rails environment

Guy Rosen

10/5/2006 9:19:00 AM

Hi,

There seems to be a bug (or just unexpected behavior...) in logger.rb's
rotation that causes it to either misbehave or fail the application
when running in a multi-process environment - specifically Rails.

I have Ruby on Rails under FastCGI, with multiple dispatch.fcgi
processes alive. My logger is configured for daily rotation. The
problematic scenario is this:
- Come midnight, the _first_ Rails process that happens to get a
request renames the log file (production.log =>
production.log.yyyymmdd)
- Subsequently, the remaining Rails processes attempt to rename the
file as well, but fail completely because production.log.yyyymmdd
already exists. Logger raises an exception, and users receive "500
Internal Server Error".
- Until all Rails processes are restarted, only the first one will
work and all the rest will continue to fail violently.

I'm including a patch for logger.rb that fixes Logger#shift_log_period
so that it doesn't fail violently if log.yyyymmdd already exists, but
rather just reopens the logfile and keeps on working.

Note there is probably a similar bug when rotating by size
(Logger#shift_log_age) in the same environment (I haven't tested it, but
reading the code suggests it will happen). Nothing will fail violently,
but each Rails process, ignorant of its peers, will rotate the logs, and
the end result will be multiple rotations!

- Guy.


-----8<-----

Index: logger.rb
===================================================================
RCS file: /src/ruby/lib/logger.rb,v
retrieving revision 1.5.2.9
diff -r1.5.2.9 logger.rb
568,570d567
< if FileTest.exist?(age_file)
< raise RuntimeError.new("'#{ age_file }' already exists.")
< end
572c569,571
< File.rename("#{@filename}", age_file)
---
> if not FileTest.exist?(age_file)
> File.rename("#{@filename}", age_file)
> end

7 Answers

Alex Young

10/5/2006 9:37:00 AM


Guy Rosen wrote:
<snip>
> Note there is probably a bug when rotating by size
> (Logger#shift_log_age) in the same environment (haven't tested it, but
> reading the code seems to indicate it will happen). Although nothing
> will fail violently, each Rails process, ignorant of its peers, will
> rotate the logs and the end result will be multiple rotations!
There's definitely something screwy going on. Try this code:

require 'logger'

logger = Logger.new('test.log', 100, 70 * 1024)

100.times do
  fork {
    1000.times do
      logger.info("This is a test line.")
    end
  }
end

I know, 100 is a ridiculous number. It runs, though. It should result
in 100 full logfiles. However, running this on my machine (Ubuntu 6.06,
ruby 1.8.4) results in Logger::ShiftingError exceptions being thrown,
and the logfile list when it's finished is incomplete (the last run left
only 84 logfiles, with apparently random gaps in the sequence), so
there's been data loss - I presume from when the processes fell over
having failed to log.
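A variant that waits for the children before counting files may make the
numbers easier to reproduce. This is only a sketch with much smaller
counts to keep the run short, not the exact experiment above:

```ruby
require 'logger'
require 'tmpdir'

# Same experiment in miniature: many forked writers sharing one Logger
# with size-based rotation, but the parent collects every child before
# counting logfiles, so nothing is measured while writers are running.
logfile_count = Dir.mktmpdir do |dir|
  logger = Logger.new(File.join(dir, 'test.log'), 10, 4 * 1024)
  10.times do
    fork do
      100.times { logger.info('This is a test line.') }
      exit!  # skip at_exit/ensure handlers in the child
    end
  end
  Process.waitall  # reap every child before inspecting the directory
  Dir.glob(File.join(dir, 'test.log*')).length
end
puts logfile_count
```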

--
Alex

Guy Rosen

10/5/2006 2:38:00 PM


The solution for your rotate-by-size configuration is probably just a
line or two in shift_log_age that checks whether the file *on disk* has
already been rotated (by checking its size) and, if so, just reopens it
instead of doing all the renames.

Although, I should say that both this and my patch to shift_log_period
have a possible race condition in the "check then rotate" flow. Since
it's not atomic, some kind of basic IPC is needed to synchronize that
bit of code.

- Guy.

Alex Young wrote:
> There's definitely something screwy going on. Try this code:
>
> require 'logger'
>
> logger = Logger.new('test.log', 100, 70 * 1024)
>
> 100.times do
>   fork {
>     1000.times do
>       logger.info("This is a test line.")
>     end
>   }
> end
>
> I know, 100 is a ridiculous number. It runs, though. It should result
> in 100 full logfiles. However, running this on my machine (Ubuntu 6.06,
> ruby 1.8.4) results in Logger::ShiftingErrors being thrown, and the
> logfile list when it's finished is incomplete (the last run only left 84
> logfiles, with apparently random gaps in the sequence), so there's been
> data loss - I presume from when the processes fell over having failed to
> log.
>
> --
> Alex

Ara.T.Howard

10/5/2006 3:06:00 PM


Alex Young

10/5/2006 4:00:00 PM


ara.t.howard@noaa.gov wrote:
> On Thu, 5 Oct 2006, Guy Rosen wrote:
>
>> -----8<-----
>>
>> Index: logger.rb
>> ===================================================================
>> RCS file: /src/ruby/lib/logger.rb,v
>> retrieving revision 1.5.2.9
>> diff -r1.5.2.9 logger.rb
>> 568,570d567
>> < if FileTest.exist?(age_file)
>> < raise RuntimeError.new("'#{ age_file }' already exists.")
>> < end
>> 572c569,571
>> < File.rename("#{@filename}", age_file)
>> ---
>>> if not FileTest.exist?(age_file)
>>> File.rename("#{@filename}", age_file)
>>> end
>>
>
> this code has a serious race condition too
>
> if not FileTest.exist?(age_file)
> #
> # but suddenly here it does!!
> #
> File.rename("#{@filename}", age_file)
> end
>
> this will be especially bad on nfs and shared filesystems, where the
> exists? check only reads the inode, and inode caching may mean a lag
> of 1-3 minutes (with normal setups) before the check reflects what's
> on the remote server. invalidating an inode is tricky and platform/fs
> dependent as well.
>
> additionally, fixing this doesn't make logger process safe: writes
> will sometimes still be overlapped, because logger doesn't use any
> sort of ipc-safe locking for writing. if it did, it would incur a
> massive speed penalty to boot - esp if locking is on a shared
> filesystem. also, creation of a new logfile uses (File::WRONLY |
> File::APPEND | File::CREAT), which also races on any shared fs... in
> summary, there's almost nothing about logger.rb which makes it
> process safe ;-(

It sounds like you're confirming my suspicion - that the only way to get
process-safe logging is by having a separate logging process which
everything else can connect to over some suitable RPC protocol... True?

--
Alex

Ara.T.Howard

10/5/2006 4:26:00 PM


Alex Young

10/5/2006 5:11:00 PM


ara.t.howard@noaa.gov wrote:
> On Fri, 6 Oct 2006, Alex Young wrote:
>
>> It sounds like you're confirming my suspicion - that the only way to get
>> process-safe logging is by having a separate logging process which
>> everything else can connect to over some suitable RPC protocol... True?
>
> no. you could use locking of an external file (otherwise creation
> still has a race). but that solution would certainly work too...
For my application, I'm looking at being able to scale to more than one
machine, so it's at least worth looking into - and I'd really rather
avoid learning how to set up NFS for the moment if I can :-)
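As a sketch of that separate-process idea over plain TCP: one process
owns the logfile, so writes and rotation can never race across
processes, and clients on other machines just send it lines. (No
framing, auth, or reconnect handling; 'central.log' is a made-up
filename.)

```ruby
require 'socket'

LOG_PATH = 'central.log'  # hypothetical central logfile

server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]     # OS-assigned port

writer = Thread.new do
  File.open(LOG_PATH, 'a') do |log|
    client = server.accept
    while (line = client.gets)
      log.puts(line.chomp)  # only this thread ever touches the file
      log.flush
    end
  end
end

client = TCPSocket.new('127.0.0.1', port)
client.puts('hello from a worker process')
client.close
writer.join
server.close
```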

--
Alex

Ara.T.Howard

10/5/2006 5:23:00 PM
