Checksum for files

Each file can have a fingerprint derived from its content. This fingerprint is called a checksum. If two files have exactly the same content, they have the same checksum. We can calculate one with md5 on Mac and md5sum on Linux.

I have a file “~/tmp/data.txt”. From inside ~/tmp, its checksum can be calculated with the following command.

md5 data.txt

Result

MD5 (data.txt) = e1e1415d2433143ff43103b384df7402

I’m going to copy the file as data2.txt and calculate the checksum of the copy as well.

To copy…

cp data.txt data2.txt

Then I’m going to pass both files to the md5 command.

md5 data.txt data2.txt

Result

MD5 (data.txt) = e1e1415d2433143ff43103b384df7402
MD5 (data2.txt) = e1e1415d2433143ff43103b384df7402

As you can see, the hashes are exactly the same. This is really useful for checking whether two files have exactly the same content.

You could implement a process that calculates the hash locally before the files are uploaded to the production systems, and another process on the server side that makes sure the uploaded files have the same hash.
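Here is a rough sketch of that idea. It assumes both sides have GNU md5sum, so on a Mac you would need to install it or convert the md5 output yourself. The function names write_manifest and verify_manifest and the file name checksums.md5 are made up for this example.

```shell
# Assumption: GNU md5sum is available on both machines.
# write_manifest and verify_manifest are hypothetical helper names.

# Record the checksums of the given files in a manifest file.
# Usage: write_manifest MANIFEST FILE...
write_manifest() {
  manifest=$1
  shift
  md5sum "$@" > "$manifest"
}

# Recompute the checksums listed in the manifest and compare them.
# Returns non-zero if any file is missing or its hash differs.
# Usage: verify_manifest MANIFEST
verify_manifest() {
  md5sum -c "$1"
}
```

Locally you would run something like write_manifest checksums.md5 data.txt data2.txt, upload the manifest together with the files, and then run verify_manifest checksums.md5 in the upload directory on the server.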

Alternatively, you could combine the find command and the md5 command to calculate the checksums of selected files, like the following.

find . -type f -name 'data*' -print0 | xargs -0 md5

Result

MD5 (./data2.txt) = e1e1415d2433143ff43103b384df7402
MD5 (./data.txt) = e1e1415d2433143ff43103b384df7402

And when the same file is copied from the Mac to another system running Linux (I have a Linux Mint machine) and I execute md5sum on it, I get exactly the same MD5 hash.
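One catch when comparing across the two systems is that md5 and md5sum print their results in different formats. The small wrapper below is only a sketch (file_md5 is a name I made up): it prints just the hash, using md5sum if it is installed and falling back to md5 -q otherwise, so the output can be compared directly.

```shell
# file_md5 is a hypothetical helper name.
# Print only the MD5 hex digest of the file given as $1.
file_md5() {
  if command -v md5sum >/dev/null 2>&1; then
    # Linux: reading from stdin gives "<hash>  -", so keep the first field.
    md5sum < "$1" | awk '{print $1}'
  else
    # Mac: -q prints only the hash.
    md5 -q "$1"
  fi
}
```

With this, file_md5 data.txt should print the same single line on both machines if the file arrived intact.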

Author: admin

A software engineer in greater Seattle area
