Tom
1/15/2008 11:11:00 PM
i suppose a lot of this depends on how heavy-weight you want this to
be. something more reliable would be a SHA1 of the contents of the
file-
require 'digest/sha1'

def content_hash(path)
  digest = Digest::SHA1.new
  File.open(path, 'rb') do |file|
    # read in 1 MB chunks so large files are never loaded whole
    while buffer = file.read(1048576)
      digest << buffer
    end
  end
  digest.hexdigest
end
it also depends on what exactly you want to be doing with it- this
will create a signature for the file contents that you can use to
compare against another. if all you care about is checking whether a
file has changed or not, and the files are located on an NTFS
partition and you really don't like the .folder, you could use
alternate data streams (ADS):
def write_signature(path, signature)
  # "path:stream_name" addresses an alternate data stream (NTFS only)
  File.open(path + ":signature_stream", "w") { |f| f.write(signature) }
end
this will effectively hide the signature. you could potentially
duplicate the contents into a stream, but that would not be an
effective use- if the duplication is for versioning or backups, loss
of the file will also result in loss of the stream and, thus, the
backup. if you are running this as a service, you could just use a
simple db to contain the path + signature and use that as your check.
i would recommend using something like SHA1 to check the contents- far
more reliable than size, ctime, or mtime. if the files are fairly
small, it won't be too costly. regardless, caching the signature of
the copy in either an ADS or a database will save time- you'll only
need to create the signature of the original file and compare it to
the cached sig of the copy:
require 'digest/sha1'

class FileCompObj
  attr_reader :path, :signature

  # pass a cached signature to skip hashing the file at path
  def initialize(path, signature = nil)
    @path = path
    @signature = signature || content_hash(path)
  end

  # two objects are equal when their content signatures match
  def ==(file)
    @signature == file.signature
  end

  def content_hash(path)
    digest = Digest::SHA1.new
    File.open(path, 'rb') do |file|
      while buffer = file.read(1048576)
        digest << buffer
      end
    end
    digest.hexdigest
  end
end
a simple convenience class like that would allow you to do something
like this:
# include methods defined above
path, backup_path = ARGV[0], ARGV[1]
original = FileCompObj.new(path)
...
# grab sig from ads or db, then pass it along
backup = FileCompObj.new(backup_path, signature)
if original == backup
  puts "Same"
else
  puts "Different"
end
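
for the "simple db" option, here is a minimal sketch using PStore from the standard library, assuming a single-file store is enough for your service. the helper names (store_signature, cached_signature, unchanged?) and the db path are hypothetical, made up for illustration, and content_hash is repeated only so the snippet runs on its own:

```ruby
require 'digest/sha1'
require 'pstore'

# same chunked SHA1 as content_hash above, repeated so this sketch is standalone
def content_hash(path)
  digest = Digest::SHA1.new
  File.open(path, 'rb') do |file|
    while buffer = file.read(1048576)
      digest << buffer
    end
  end
  digest.hexdigest
end

# hypothetical helper: remember the signature of a backup copy under its path
def store_signature(db_path, path, signature)
  store = PStore.new(db_path)
  store.transaction { store[path] = signature }
end

# hypothetical helper: look up a cached signature, nil if none was stored
def cached_signature(db_path, path)
  store = PStore.new(db_path)
  store.transaction(true) { store[path] }
end

# hypothetical helper: true only when a cached signature exists for the
# backup and it matches a fresh hash of the original
def unchanged?(db_path, original_path, backup_path)
  cached = cached_signature(db_path, backup_path)
  !cached.nil? && cached == content_hash(original_path)
end
```

you would call store_signature once when the snapshot is taken, then unchanged? on each check, so only the original gets hashed again. note that a cached signature only says what the copy looked like when it was hashed; if the copy itself can change, it needs re-hashing too.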
a rather lengthy reply. while no animals were harmed in the creation
of this message, i haven't tested the code yet, so no guarantees. scan
for bugs first!
tom
On Jan 15, 2008, at 2:54 PM, Vell wrote:
> I am attempting to write a small utility that checks one file against
> an automated snapshot of itself in a different location to make sure
> that they are the same. I have not had any luck doing this and thought
> I would present this to the group to see if you can help me out.
>
> So far when I execute my code everything that is not an actual file
> ("." and "..") shows as OK but, anything else shows as not equal.
> I have tried this using several different things like File1.size ==
> File2.size and also what you see in the code listed below. Neither
> seem to give me a result showing that these files are equal.
>
> If I were to look in Windows Explorer at these 2 files and look at the
> file sizes, they both look identical so I am not sure what it is that
> *.size is looking at to show that they aren't the same size.
>
> Can anyone advise me on how I can better implement this? Also how I
> can get rid of the "." files in the directory
>
> My code so far is as below:
>
> require 'fileutils'
> require 'ftools'
>
> y = Dir.entries("\\\\C:\\BNADataFile\\")
> z = Dir.entries("\\\\F\\BNADataFile\\.snapshot\\hourly.0\\")
>
> y.each do |y|
>   z.each do |z|
>     if(FileUtils.uptodate?(y,z))
>       puts y + ": Snapshot Successful"
>     else
>       puts y + ": Not the same"
>     end
>   end
> end
>
> My results: (Not sure why ".." showed up so many times)
>
> C:\Documents and Settings\lmcilwain\My Documents\scripts\work>ruby
> check.rb
> .: Not the same
> .: Snapshot Successful
> .: Snapshot Successful
> .: Snapshot Successful
> .: Snapshot Successful
> .: Snapshot Successful
> .: Snapshot Successful
> .: Snapshot Successful
> ..: Snapshot Successful
> backup: Not the same
> Archived..cdb: Not the same
> OPInc..cdb: Not the same
> ArchivedAlt.cdb: Not the same
> Alts.cdb: Not the same
> event.log: Not the same
>
>