comp.lang.ruby

Re: Sorting a logfile, how would you write it?

Andrew Savige

8/11/2007 11:16:00 PM

--- William James <w_a_x_man@yahoo.com> wrote:
> It's my understanding that when you use -i, a temporary file
> is created, the original file is deleted, and the temporary
> file is renamed. Doesn't this cause unnecessary disk
> fragmentation?

To do this safely you'll need a temporary file.
Slurping a file into memory, sorting it, then writing it back to the same
file is an unsound practice, i.e. not "rerunnable-safe". Suppose, for
example, you suffer a power failure half-way through writing back the file,
or the write fails due to "disk full" or "user disk quota exceeded" or for
any other reason. Oops, you've just corrupted your input file. Worse, if
this program is run automatically or by a naive user, you risk permanent
data loss without knowing about it: when the program is re-run after
the write failure, it will take the now-corrupted file as its input
and appear to "work".

The simple remedy is to first write to a temporary file; once you're sure
the temporary file has been written without error (and after the permissions
on the temporary are updated to match the original) you then (atomically)
rename the temporary file to the original. In that way, if writing the new
file is interrupted for any reason, you can simply re-run the program
without losing any data. Though you could roll all this code for yourself,
the -i switch is more convenient, and safer than rewriting a file in place.
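
If you did want to roll it yourself, the steps above might look roughly like
the following. This is only a minimal sketch, not what -i does internally:
the helper name sort_file_safely is made up for illustration, and it assumes
the file is small enough to slurp and that the temporary file lives in the
same directory as the original (so the rename stays on one filesystem).

```ruby
require "tempfile"

# Hypothetical helper (name invented for this example): sort the lines of
# +path+ and replace the file by way of a temporary file plus a rename.
def sort_file_safely(path)
  # Assumption: put the temporary file next to the original so that the
  # rename never has to cross a filesystem boundary.
  tmp = Tempfile.create([File.basename(path), ".tmp"], File.dirname(path))
  begin
    lines = File.readlines(path, chomp: true).sort
    tmp.write(lines.map { |line| line + "\n" }.join)
    tmp.flush
    tmp.fsync                                  # ask the OS to put the data on disk
    tmp.chmod(File.stat(path).mode & 0o7777)   # match the original's permissions
    tmp.close
    File.rename(tmp.path, path)                # swap the new file in under the old name
  rescue StandardError
    # Any failure leaves the original untouched; just discard the temporary.
    tmp.close unless tmp.closed?
    File.unlink(tmp.path) if File.exist?(tmp.path)
    raise
  end
end
```

If the write fails part-way, the original file is still intact and the
program can simply be re-run. Note that this sketch always ends the output
with a newline, even if the input's last line did not have one.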

Cheers,
/-





William James

8/12/2007 7:06:00 AM


On Aug 11, 6:16 pm, Andrew Savige <ajsav...@yahoo.com.au> wrote:
> --- William James <w_a_x_...@yahoo.com> wrote:
>
> > It's my understanding that when you use -i, a temporary file
> > is created, the original file is deleted, and the temporary
> > file is renamed. Doesn't this cause unnecessary disk
> > fragmentation?
>
> To do this safely you'll need a temporary file.
> Slurping a file into memory, sorting it, then writing it back to the same
> file is an unsound practice, i.e. not "rerunnable-safe". Suppose, for
> example, you suffer a power failure half-way through writing back the file,
> or the write fails due to "disk full" or "user disk quota exceeded" or for
> any other reason. Oops, you've just corrupted your input file.

Of course. But I'm willing to take that minuscule chance when
I'm doing a write to a small file that takes a fraction of a
second.

The question remains: doesn't using a temp file cause more
disk fragmentation than writing directly to the original file?