Asp Forum - Ensuring files are copied correctly - microsoft.public.vb.general.discussion

Tony Toews

5/23/2010 3:56:00 AM

Folks

I've had a request to double check that files which my utility has
copied from the network server to the client PC have been copied
correctly. Presumably by using a CRC, although it looks like I could
also use SHA or MD5.

Any comments on Calculating CRC32 With VB
http://www.vbaccelerator.com/home/vb/code/libraries/CRC32/a...

Any better algorithms out there?

I'm not at all sure this is required because I would've thought Windows
networking would double check that the files coming across the network
were intact. But then this would be the standard belt and suspenders
I guess. Just in case the hard drive is failing or something goofy
like that.

Tony
--
Tony Toews, Microsoft Access MVP
Tony's Main MS Access pages - http://www.granite.ab.ca/ac...
Tony's Microsoft Access Blog - http://msmvps.com/blo...
For a convenient utility to keep your users FEs and other files
updated see http://www.autofeup...
Granite Fleet Manager http://www.granite...

17 Answers

(Mike Mitchell)

5/23/2010 9:09:00 AM

On Sat, 22 May 2010 21:56:19 -0600, "Tony Toews [MVP]"
<ttoews@telusplanet.net> wrote:

>Folks
>
>I've had a request to double check that files which my utility has
>copied from the network server to the client PC have been copied
>correctly. Presumably by using a CRC, although it looks like I could
>also use SHA or MD5.
>
>Any comments on Calculating CRC32 With VB
>http://www.vbaccelerator.com/home/vb/code/libraries/CRC32/a...
>
>Any better algorithms out there?
>
>I'm not at all sure this is required because I would've thought Windows
>networking would double check that the files coming across the network
>were intact. But then this would be the standard belt and suspenders
>I guess. Just in case the hard drive is failing or something goofy
>like that.
>
>Tony

I use the free bincomp.exe. Here's how I call it from my VB6 front-end
app:

lRet = ExecCmd("c:\utils\bincomp\bincomp " & Chr$(34) & sSourceFile & _
    Chr$(34) & " " & Chr$(34) & sDestinationFile & Chr$(34) & " /Q")

Public Function ExecCmd(cmdline$) As Long
    Dim proc As PROCESS_INFORMATION
    Dim start As STARTUPINFO
    Dim ret As Long

    ' Initialize the STARTUPINFO structure:
    start.cb = Len(start)
    start.dwFlags = STARTF_USESHOWWINDOW
    start.wShowWindow = SW_HIDE ' SW_MINIMIZE

    ' Start the shelled application:
    ret = CreateProcessA(vbNullString, cmdline$, 0&, 0&, 1&, _
        NORMAL_PRIORITY_CLASS, 0&, vbNullString, start, proc)
    If ret = 0 Then
        ' Process could not be started, so there is no handle to wait on.
        ' -1 is an arbitrary failure value, not a bincomp exit code.
        ExecCmd = -1
        Exit Function
    End If

    ' Wait for the shelled application to finish:
    ret = WaitForSingleObject(proc.hProcess, INFINITE)
    ' Fetch the exit code of the finished process into ret:
    Call GetExitCodeProcess(proc.hProcess, ret)
    Call CloseHandle(proc.hThread)
    Call CloseHandle(proc.hProcess)
    ExecCmd = ret
End Function
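
ExecCmd as posted won't compile on its own, because the types, constants and API declarations it uses aren't shown. A minimal sketch of what they would typically look like, assuming the usual kernel32 declarations and the documented Win32 constant values (none of this block appears in the original post; it goes at the top of the same module as ExecCmd):

```vb
' Assumed declarations for ExecCmd; not part of the original post.
Private Type STARTUPINFO
    cb As Long
    lpReserved As String
    lpDesktop As String
    lpTitle As String
    dwX As Long
    dwY As Long
    dwXSize As Long
    dwYSize As Long
    dwXCountChars As Long
    dwYCountChars As Long
    dwFillAttribute As Long
    dwFlags As Long
    wShowWindow As Integer
    cbReserved2 As Integer
    lpReserved2 As Long
    hStdInput As Long
    hStdOutput As Long
    hStdError As Long
End Type

Private Type PROCESS_INFORMATION
    hProcess As Long
    hThread As Long
    dwProcessID As Long
    dwThreadID As Long
End Type

Private Const NORMAL_PRIORITY_CLASS As Long = &H20&
Private Const STARTF_USESHOWWINDOW As Long = &H1&
Private Const SW_HIDE As Integer = 0
Private Const INFINITE As Long = -1&

Private Declare Function CreateProcessA Lib "kernel32" ( _
    ByVal lpApplicationName As String, _
    ByVal lpCommandLine As String, _
    ByVal lpProcessAttributes As Long, _
    ByVal lpThreadAttributes As Long, _
    ByVal bInheritHandles As Long, _
    ByVal dwCreationFlags As Long, _
    ByVal lpEnvironment As Long, _
    ByVal lpCurrentDirectory As String, _
    lpStartupInfo As STARTUPINFO, _
    lpProcessInformation As PROCESS_INFORMATION) As Long

Private Declare Function WaitForSingleObject Lib "kernel32" ( _
    ByVal hHandle As Long, ByVal dwMilliseconds As Long) As Long

Private Declare Function GetExitCodeProcess Lib "kernel32" ( _
    ByVal hProcess As Long, lpExitCode As Long) As Long

Private Declare Function CloseHandle Lib "kernel32" ( _
    ByVal hObject As Long) As Long
```

What a given exit code means depends on the bincomp version in use, so check its own documentation before treating any particular value as "files identical".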

Get it here:
http://www.softlist.net/company/steven_wet...

I haven't updated my version of bincomp for ages, so you may see a
much later version on his website.

I've used my app to compare all sorts of files, including massive VOBs
and large True Image image files.

MM

Jason Keats

5/23/2010 10:11:00 AM

Tony Toews [MVP] wrote:
> Folks
>
> I've had a request to double check that files which my utility has
> copied from the network server to the client PC have been copied
> correctly. Presumably by using a CRC, although it looks like I could
> also use SHA or MD5.
>
> Any comments on Calculating CRC32 With VB
> http://www.vbaccelerator.com/home/vb/code/libraries/CRC32/a...
>
> Any better algorithms out there?
>
> I'm not at all sure this is required because I would've thought Windows
> networking would double check that the files coming across the network
> were intact. But then this would be the standard belt and suspenders
> I guess. Just in case the hard drive is failing or something goofy
> like that.
>
> Tony

That seems to be a very fast implementation of CRC32 - at least, it's a
lot faster than the one I've been using. :-)

I've used the following for SHA1:
http://vb.wikia.com/wiki...

It seems to be faster than the CRC32 implementation you're using, is
less likely to produce duplicate values (if that's important) and it's
certainly more secure (only important if you're going to use it with
passwords, etc.).

Jim Mack

5/23/2010 1:13:00 PM

Tony Toews [MVP] wrote:
> Folks
>
> I've had a request to double check that files which my utility has
> copied from the network server to the client PC have been copied
> correctly. Presumably by using a CRC, although it looks like I could
> also use SHA or MD5.
>
> Any comments on Calculating CRC32 With VB
> http://www.vbaccelerator.com/home/vb/code/libraries/CRC32/a...
>
> Any better algorithms out there?

CRC32 has the benefit of being very fast to calculate, and if you're
already reasonably sure that the file is correct -- you've done the
obvious, like check byte length -- then CRC32 is probably enough. Even
for very long files, the odds of a mis-hit are minuscule. The reason
is that it's millions of times more likely to generate a false
negative than a false positive.
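
For anyone who wants to see what that amounts to in VB6, here is a minimal sketch of a table-driven CRC32 over a file, read in 64 KB blocks. It is not the vbAccelerator code linked above; the names FileCrc32Hex and BuildCrcTable are made up for illustration, and it assumes files under 2 GB (the limit of a VB6 Long) and leaves out error handling:

```vb
Option Explicit

Private m_Table(0 To 255) As Long
Private m_TableReady As Boolean

' Build the 256-entry lookup table for the reflected CRC-32 polynomial.
Private Sub BuildCrcTable()
    Dim i As Long, bit As Long, c As Long
    For i = 0 To 255
        c = i
        For bit = 1 To 8
            If (c And 1) <> 0 Then
                ' Unsigned shift right by one bit, then apply the polynomial.
                c = (((c And &HFFFFFFFE) \ 2) And &H7FFFFFFF) Xor &HEDB88320
            Else
                c = ((c And &HFFFFFFFE) \ 2) And &H7FFFFFFF
            End If
        Next bit
        m_Table(i) = c
    Next i
    m_TableReady = True
End Sub

' Returns the CRC32 of a file as an 8-digit hex string.
Public Function FileCrc32Hex(ByVal path As String) As String
    Const CHUNK As Long = 65536
    Dim f As Integer
    Dim remaining As Long, n As Long, i As Long
    Dim crc As Long
    Dim buf() As Byte

    If Not m_TableReady Then BuildCrcTable

    crc = &HFFFFFFFF
    f = FreeFile
    Open path For Binary Access Read Shared As #f
    remaining = LOF(f)
    Do While remaining > 0
        n = IIf(remaining < CHUNK, remaining, CHUNK)
        ReDim buf(0 To n - 1)
        Get #f, , buf
        For i = 0 To n - 1
            ' Unsigned shift right by 8 bits, then fold in the next byte.
            crc = (((crc And &HFFFFFF00) \ &H100) And &HFFFFFF) _
                  Xor m_Table((crc And &HFF) Xor buf(i))
        Next i
        remaining = remaining - n
    Loop
    Close #f

    ' Final inversion, then pad to 8 hex digits.
    FileCrc32Hex = Right$("0000000" & Hex$(Not crc), 8)
End Function
```

A quick sanity check before trusting any CRC32 routine: a file holding just the nine ASCII characters 123456789 should give the standard check value CBF43926.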

MD5 and the other secure hashes are much better at guaranteeing that a
byte sequence is unique, but may be overkill, and they're
correspondingly slower. But if you have the time, they're the best you
can do. I'd say SHA-256 is the gold standard right now.

It's common to pre-compute hashes for all files in a directory and
keep them in a separate text file. Then you need to hash only on the
target side, and compare to what's already done on the source side.
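
A minimal sketch of the client-side half of that idea in VB6. The manifest format (one "name<TAB>hash" pair per line) and the function names are assumptions made up for illustration, and the local hash is passed in as a string from whichever CRC or hash routine is used:

```vb
Option Explicit

' Looks up the precomputed hash for fileName in a manifest text file.
' Assumed (hypothetical) format: one "name<TAB>hash" pair per line.
' Returns an empty string when the file name is not listed.
Public Function ExpectedHash(ByVal manifestPath As String, _
                             ByVal fileName As String) As String
    Dim f As Integer
    Dim textLine As String
    Dim tabPos As Long

    f = FreeFile
    Open manifestPath For Input As #f
    Do While Not EOF(f)
        Line Input #f, textLine
        tabPos = InStr(textLine, vbTab)
        If tabPos > 1 Then
            If StrComp(Left$(textLine, tabPos - 1), fileName, _
                       vbTextCompare) = 0 Then
                ExpectedHash = Trim$(Mid$(textLine, tabPos + 1))
                Exit Do
            End If
        End If
    Loop
    Close #f
End Function

' True only when the manifest lists the file and the hashes agree.
Public Function MatchesManifest(ByVal manifestPath As String, _
                                ByVal fileName As String, _
                                ByVal localHash As String) As Boolean
    Dim expected As String

    expected = ExpectedHash(manifestPath, fileName)
    If Len(expected) > 0 Then
        MatchesManifest = (StrComp(expected, localHash, vbTextCompare) = 0)
    End If
End Function
```

A file missing from the manifest is treated as a failed check here, which seems the safer default for a copy-verification step.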

--
Jim Mack
Twisted tees at http://www.cafepress.c...
"We sew confusion"

Tony Toews

5/23/2010 8:43:00 PM

"Jim Mack" <jmack@mdxi.nospam.com> wrote:

>It's common to pre-compute hashes for all files in a directory and
>keep them in a separate text file. Then you need to hash only on the
>target side, and compare to what's already done on the source side.

Yes, that was exactly what I was going to do. Especially given
that the "from" file is on a network and the "to" file is usually on
a local hard drive.

Tony

Tony Toews

5/23/2010 8:43:00 PM

MM <kylix_is@yahoo.co.uk> wrote:

>I use the free bincomp.exe. Here's how I call it from my VB6 front-end
>app:

Thanks for the suggestion; however, I want to contain everything within
my single VB 6 exe. It uses a drag and drop deploy, so I'm trying to
keep things exceedingly simple.

Tony

Tony Toews

5/23/2010 8:51:00 PM

Jason Keats <jkeats@melbpcDeleteThis.org.au> wrote:

>I've used the following for SHA1:
>http://vb.wikia.com/wiki...
>
>It seems to be faster than the CRC32 implementation you're using, is
>less likely to produce duplicate values (if that's important) and it's
>certainly more secure (only important if you're going to use it with
>passwords, etc.).

Ah, interesting. I'll do some timing tests. I've got a 300 MB Access
file which would be suitable for such. <smile>

The concern is about unspecified and unknown oddities happening
between the file server and the client PC when copying down files.
Which is the primary purpose of my utility. These oddities should
never happen. Of course. <smile> If the file server, network, and
client PC are working just fine. And they probably will 99.99999% of
the time.

Tony

(nobody)

5/24/2010 5:39:00 AM

Tested routine, but slow for large files:

' Returns 0 when identical in size and contents
Public Function CompareFiles(fn1 As String, fn2 As String) As Long
    Dim i As Long
    Dim fs1 As Long
    Dim fs2 As Long
    Dim f1 As Long
    Dim f2 As Long
    Dim b1 As Byte
    Dim b2 As Byte

    CompareFiles = 0 ' Assume identical

    On Error Resume Next

    fs1 = FileLen(fn1)
    If Err.Number <> 0 Then
        Err.Clear
        CompareFiles = 1 ' File 1 not found
        Exit Function
    End If
    fs2 = FileLen(fn2)
    If Err.Number <> 0 Then
        Err.Clear
        CompareFiles = 2 ' File 2 not found
        Exit Function
    End If
    If fs1 <> fs2 Then
        CompareFiles = 3 ' File size is not the same
        Exit Function
    End If

    f1 = FreeFile
    Open fn1 For Binary Access Read Shared As f1
    If Err.Number <> 0 Then
        Err.Clear
        CompareFiles = 4 ' Error opening file 1
        Exit Function
    End If

    f2 = FreeFile
    Open fn2 For Binary Access Read Shared As f2
    If Err.Number <> 0 Then
        Err.Clear
        CompareFiles = 5 ' Error opening file 2
        Close f1
        Exit Function
    End If

    On Error GoTo 0

    For i = 1 To fs1
        Get f1, , b1
        Get f2, , b2
        If b1 <> b2 Then
            CompareFiles = 6 ' Contents don't match
            Debug.Print "CompareFiles: Mismatch at byte position " & _
                Seek(f1) - 1
            Exit For
        End If
    Next

    Close f2
    Close f1

End Function
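
Most of the slowness comes from reading one byte per Get call. A minimal sketch of the same comparison reading 64 KB blocks into byte arrays instead; FilesMatch is a made-up name, the block size is an arbitrary choice, error handling is left out, and files are assumed to be under 2 GB:

```vb
Option Explicit

' Returns True when both files have the same length and contents.
Public Function FilesMatch(ByVal fn1 As String, ByVal fn2 As String) As Boolean
    Const CHUNK As Long = 65536
    Dim f1 As Integer, f2 As Integer
    Dim remaining As Long, n As Long, i As Long
    Dim buf1() As Byte, buf2() As Byte

    ' Different sizes can never match, so skip reading altogether.
    If FileLen(fn1) <> FileLen(fn2) Then Exit Function

    f1 = FreeFile
    Open fn1 For Binary Access Read Shared As #f1
    f2 = FreeFile
    Open fn2 For Binary Access Read Shared As #f2

    FilesMatch = True
    remaining = LOF(f1)
    Do While remaining > 0 And FilesMatch
        n = IIf(remaining < CHUNK, remaining, CHUNK)
        ReDim buf1(0 To n - 1)
        ReDim buf2(0 To n - 1)
        Get #f1, , buf1
        Get #f2, , buf2
        ' Compare the two blocks byte by byte, stopping at the first mismatch.
        For i = 0 To n - 1
            If buf1(i) <> buf2(i) Then
                FilesMatch = False
                Exit For
            End If
        Next i
        remaining = remaining - n
    Loop

    Close #f2
    Close #f1
End Function
```

This still reads both files in full, so it doesn't avoid pulling the network copy across the wire again.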

Tony Toews

5/24/2010 7:17:00 PM

"Nobody" <nobody@nobody.com> wrote:

>Tested routine, but slow for large files:

One of the files to be compared will always be on a network file
share. Thus I'd prefer to compute the CRC/hash once when it changes.

Tony

Dee Earley

5/25/2010 11:45:00 AM

On 23/05/2010 04:56, Tony Toews [MVP] wrote:
> Folks
>
> I've had a request to double check that files which my utility has
> copied from the network server to the client PC have been copied
> correctly. Presumably by using a CRC, although it looks like I could
> also use SHA or MD5.
>
> Any comments on Calculating CRC32 With VB
> http://www.vbaccelerator.com/home/vb/code/libraries/CRC32/a...
>
> Any better algorithms out there?

If it works... :)

Bear in mind that doing the calculations on the copying end means it
needs to transfer the data twice, possibly showing the same or different
corruption each time.

You would really need something else (ideally the server) to generate
the checksums that you then check against on the client.

--
Dee Earley (dee.earley@icode.co.uk)
i-Catcher Development Team

iCode Systems

(Replies direct to my email address will be ignored.
Please reply to the group.)

Dee Earley

5/25/2010 11:48:00 AM

On 23/05/2010 14:12, Jim Mack wrote:
> CRC32 has the benefit of being very fast to calculate, and if you're
> already reasonably sure that the file is correct -- you've done the
> obvious, like check byte length -- then CRC32 is probably enough. Even
> for very long files, the odds of a mis-hit are minuscule. The reason
> is that it's millions of times more likely to generate a false
> negative than a false positive.

Err, how?

An algorithm should always give the same output for the same input,
especially when used as a hash, but different input data may well
lead to the same output, giving the opposite of your statement.

