Arne Vajhøj
5/24/2008 9:58:00 PM
Arne Vajhøj wrote:
> John Smith wrote:
>> Arne Vajhøj wrote:
>>> John Smith wrote:
>>>> I am very new to C# and the .NET Framework. I am trying to hash (using
>>>> MD5CryptoServiceProvider) a source that is split into several files.
>>>>
>>>> Now when the source is in one file I can produce the correct md5 hash.
>>>>
>>>> My issue is how can I reproduce the correct hash when the file is
>>>> split into different files.
>>>
>>> A hash is calculated based on the byte content.
>>>
>>> Why does it make a difference whether those bytes are read
>>> from a single file or from multiple files?
>
>> I think best way is to show you my problem with quick example code:
>
> Example code is always good.
>
>> MD5CryptoServiceProvider oMD5 = new MD5CryptoServiceProvider();
>> string sRet;
>>
>> string s1 = "First String Sample";
>> string s2 = "Second String Sample";
>> string s3 = s1 + s2;
>>
>>
>> byte[] bBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(s1);
>> sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-",
>> string.Empty);
>> System.Diagnostics.Debug.WriteLine(sRet);
>>
>> bBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(s2);
>> sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-",
>> string.Empty);
>> System.Diagnostics.Debug.WriteLine(sRet);
>>
>> bBytes = System.Text.ASCIIEncoding.ASCII.GetBytes(s3);
>> sRet = BitConverter.ToString(oMD5.ComputeHash(bBytes)).Replace("-",
>> string.Empty);
>> System.Diagnostics.Debug.WriteLine(sRet);
>> -----------------------------------------------------------------
>>
>> The output hash is as follows:
>> s1 = 1EC25881AD012D4CA6E73D1986AE93FB
>> s2 = D8D46AC432C7251F863C2D5B91FE48FC
>> s3 = 9E158DDEE697EBAEC2A036F459B02448
>>
>> Now what I want is basically to be able to hash s1 get the
>> result and then continue hashing s2 and get the final s3 result.
>>
>> Right now the only way I know of getting s3 hash is by first
>> concatenating the strings then running it through ComputeHash.
>>
>> This isn't much of an issue when the input is a small string, however
>> if I am trying to hash several files then that is a different matter.
>> These files can be large, and the only way I know of doing it is to
>> combine all the files into a single temporary file and then
>> pass the stream to ComputeHash.
>
> You cannot "add" MD5 checksums.
>
> But if you use TransformBlock and TransformFinalBlock instead
> of ComputeHash, then you should be able to process small
> chunks (like 1 MB or 10 MB) at a time - even coming from
> multiple files.

Example:

using System;
using System.Text;
using System.Security.Cryptography;

namespace E
{
    public class Program
    {
        public static void Main(string[] args)
        {
            MD5CryptoServiceProvider md5 = new MD5CryptoServiceProvider();

            // Hash s1, s2 and the concatenation s3 separately with ComputeHash.
            string s1 = "First String Sample";
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s1))).Replace("-", ""));
            string s2 = "Second String Sample";
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s2))).Replace("-", ""));
            string s3 = s1 + s2;
            Console.WriteLine(BitConverter.ToString(md5.ComputeHash(Encoding.UTF8.GetBytes(s3))).Replace("-", ""));

            // Now feed s1 and s2 as two consecutive blocks. The hash printed
            // here is the same as the ComputeHash result for s3 above.
            md5.Initialize();
            // TransformBlock copies its input into this buffer; it is not used otherwise.
            byte[] garbage = new byte[1000000];
            md5.TransformBlock(Encoding.UTF8.GetBytes(s1), 0,
                               Encoding.UTF8.GetByteCount(s1), garbage, 0);
            md5.TransformFinalBlock(Encoding.UTF8.GetBytes(s2), 0,
                                    Encoding.UTF8.GetByteCount(s2));
            Console.WriteLine(BitConverter.ToString(md5.Hash).Replace("-", ""));

            Console.ReadKey();
        }
    }
}

(it may be possible to optimize it a bit, but it should
show the concept)
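
For the multi-file case, the same idea could look roughly like the sketch below. It is only a sketch under a few assumptions: the class name MultiFileHash, the helper HashFiles and the chunkSize parameter are made up for illustration, the files are assumed to be passed in the order in which they were split, and the two temp files in Main just stand in for the real parts. Each file is read in small chunks, so nothing has to be concatenated on disk or held in memory as a whole.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class MultiFileHash
{
    // Hypothetical helper: runs every file, in the given order, through a
    // single MD5 instance, reading at most chunkSize bytes at a time.
    public static string HashFiles(IEnumerable<string> paths, int chunkSize)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] buffer = new byte[chunkSize];
            foreach (string path in paths)
            {
                using (FileStream fs = File.OpenRead(path))
                {
                    int read;
                    while ((read = fs.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        // Input and output are the same buffer at the same
                        // offset, so no separate output buffer is needed.
                        md5.TransformBlock(buffer, 0, read, buffer, 0);
                    }
                }
            }
            // An empty final block completes the hash.
            md5.TransformFinalBlock(buffer, 0, 0);
            return BitConverter.ToString(md5.Hash).Replace("-", "");
        }
    }

    public static void Main()
    {
        string s1 = "First String Sample";
        string s2 = "Second String Sample";
        string p1 = Path.GetTempFileName();
        string p2 = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(p1, Encoding.ASCII.GetBytes(s1));
            File.WriteAllBytes(p2, Encoding.ASCII.GetBytes(s2));

            // Hash of the two files, read 8 bytes at a time.
            string chunked = HashFiles(new string[] { p1, p2 }, 8);

            // Hash of the concatenated content in one call, for comparison.
            string oneShot;
            using (MD5 md5 = MD5.Create())
            {
                oneShot = BitConverter.ToString(
                    md5.ComputeHash(Encoding.ASCII.GetBytes(s1 + s2))).Replace("-", "");
            }

            Console.WriteLine(chunked);
            Console.WriteLine(oneShot);
            Console.WriteLine(chunked == oneShot);
        }
        finally
        {
            File.Delete(p1);
            File.Delete(p2);
        }
    }
}
```

The chunk size of 8 is deliberately tiny so that even the short sample strings are split across several TransformBlock calls. For real files something like the 1 MB mentioned above is more sensible.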
Arne