[Gambas-user] Binary compare of files?

Kari Laine klaine8 at ...626...
Sun Oct 19 18:40:45 CEST 2008


On Sun, Oct 19, 2008 at 7:02 AM, nando <nando_f at ...951...> wrote:

> You want to use the code below,

Hi nando,

OK, which code do you mean?



> but use a large block size
> like 8192 or 32768.  It doesn't have to be a perfect binary size.
> I suggest not to use SHA or MD5 because if you're reading in
> the files to compute SHA or MD5, you might as well forget
> wasting the time to compute and simply compare the strings.
> It will be faster.  Plus MD5 is a one-way digest and it is
> possible to get one identical answer digesting two different
> strings - although highly unlikely.  SHA and MD5 are candidates
> for scenario where the two files cannot be compared directly, so
> the copy is digested and compared with a copy of the SHA or MD5.
>
>
>
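
For reference, the block-wise comparison suggested in the quoted text could
look roughly like the sketch below. It is written in Python rather than
Gambas, and the function name files_equal and the default block size are
illustrative assumptions, not code from this thread.

```python
import os


def files_equal(path_a, path_b, block_size=32768):
    """Return True if the two files contain exactly the same bytes."""
    # Files of different sizes can never match, so skip reading them.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(block_size)
            block_b = fb.read(block_size)
            if block_a != block_b:
                return False  # first differing block ends the comparison
            if not block_a:
                return True   # both files reached end-of-file together
```

The comparison stops at the first differing block, so unequal files are
usually rejected without reading them to the end.
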
Thanks for the comment. At the moment I am testing in practice how well MD5SUM
and SHA512SUM distribute fingerprints. I have had problems with the testing,
so it is taking time. Right now I am trying to checksum 1000 GB of files to
see whether I get any collisions. As I said before, I cannot compare the files
themselves, because I have many hard disks containing backups that are not
connected to the machine all the time, so I must use some kind of checksumming.
I was also thinking that I could store small snapshots of the files in the
database and use them in addition to the checksums. The whole idea of this
project is to move backups from hard disks to DVDs so that I can reuse the
hard disks. The idea is also to have a database so I know what I have and where.
There is also a lot of duplication on the disks, and I don't want the same data
backed up many times on DVDs. By the way, does anyone have an idea how long a
DVD lasts?
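
A rough sketch of the checksum-plus-snapshot idea, in Python rather than
Gambas. The names fingerprint and find_duplicates, the 64-byte snapshot
length, and the use of an in-memory dictionary in place of a real database
are all assumptions made for illustration only.

```python
import hashlib
import os


def fingerprint(path, block_size=32768, snapshot_len=64):
    """Return (size, SHA-512 hex digest, first snapshot_len bytes) of a file."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # The "snapshot" is simply the first few bytes of the file.
        snapshot = f.read(snapshot_len)
        digest.update(snapshot)
        # Hash the rest in large blocks so big files never sit in memory.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return (os.path.getsize(path), digest.hexdigest(), snapshot)


def find_duplicates(paths):
    """Group paths whose size, digest and snapshot all match."""
    index = {}  # stands in for the database table
    for path in paths:
        index.setdefault(fingerprint(path), []).append(path)
    return [group for group in index.values() if len(group) > 1]
```

Because the fingerprint is stored rather than the file, a disk only needs to
be connected once for indexing; duplicates across disks can then be found
from the stored values alone.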

Best Regards
Kari Laine


