File Size comparing enough to ensure copy ok or we need MD5?


Postby debian121212 » 2020-07-14 05:19

I'm copying large amounts of data from my cell phone to my PC and have experienced data corruption before. I'd like to make sure that the copied files will open correctly in the future, with zero corruption.

Is it necessary to generate and check MD5 checksums for every file, or is checking the file size enough?

For example:
I copy everything from the Android source folder into the destination PC folder.
How do I make sure the copied files will always open fine with no issues in the future?

I figured generating an MD5 checksum with md5deep would be correct, but it seems a little overkill and I'd like to know if it's really worth it. Shouldn't simply checking the file size do the trick if the files open correctly from the source? MD5, or just exact file byte sizes?
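
To illustrate the question, here is a minimal Python sketch (not from the thread) that contrasts a size-only check with a content hash. The helper names `file_digest`, `same_size`, and `same_content` are made up for this example, and SHA-256 is the default here where the thread talks about MD5; passing `"md5"` should behave the same way for catching accidental corruption. A matching byte count only shows that the lengths agree, so two files can pass `same_size` and still fail `same_content`.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_size(source, copy):
    """Cheap check: compares byte counts only."""
    return Path(source).stat().st_size == Path(copy).stat().st_size


def same_content(source, copy, algorithm="sha256"):
    """Stronger check: compares size first, then a digest of every byte."""
    return same_size(source, copy) and (
        file_digest(source, algorithm) == file_digest(copy, algorithm)
    )
```

A single flipped byte leaves the size unchanged, which is the case where the size check passes and the digest check does not.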
debian121212
 
Posts: 79
Joined: 2019-01-03 01:34

Re: File Size comparing enough to ensure copy ok or we need

Postby cuckooflew » 2020-07-14 13:41

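How do you copy? Since you did not tell us what method or command you use, maybe you are already using this:
Code:
rsync -rnci original-dir/ copied-dir/
(-r recurses into the directories, -n makes it a dry run that changes nothing, -c compares by checksum instead of size and time, and -i lists each file that differs.)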

If not, then perhaps some other variation of the 'rsync' command; see 'man rsync'. There are also many other ways to copy or transfer large amounts of data. You might also look at the 'diff' command, it could help you. From 'man diff':
DESCRIPTION
The diff utility compares the contents of file1 and file2 and writes to
the standard output the list of changes necessary to convert one file
into the other. No output is produced if the files are identical
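
As a rough illustration of what a checksum-based comparison of two directory trees involves, here is a small Python sketch. It is not how rsync or diff are implemented; the function names `sha256_of` and `compare_trees` are invented for this example, and it only looks at regular files found under the source directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_dir, copy_dir):
    """Return (missing, different) lists of paths relative to source_dir."""
    source_dir, copy_dir = Path(source_dir), Path(copy_dir)
    missing, different = [], []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue  # directories themselves are not compared
        relative = source_file.relative_to(source_dir)
        copied_file = copy_dir / relative
        if not copied_file.is_file():
            missing.append(relative.as_posix())
        elif sha256_of(source_file) != sha256_of(copied_file):
            different.append(relative.as_posix())
    return missing, different
```

Two empty lists mean every source file has a byte-identical counterpart in the copy; extra files that exist only in the copy are deliberately ignored in this sketch.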

The key words:
Code:
on Linux, File Size comparing enough to ensure copy ok
copy/pasted into a search engine will give you results that go into detail. I just scratched the surface here; there are many ways to transfer files from one device to another on Linux. I generally use the above 'rsync' command, but at times another command is needed.
One of many detailed instructions : https://www.networkworld.com/article/3190407/nine-ways-to-compare-files-on-unix.html
Please Read What we expect you have already Done
Search Engines know a lot, and
"If God had wanted computers to work all the time, He wouldn't have invented RESET buttons"
and
Just say NO to help vampires!
cuckooflew
 
Posts: 683
Joined: 2018-05-10 19:34
Location: Some where out west

Re: File Size comparing enough to ensure copy ok or we need

Postby CwF » 2020-07-14 15:09

I generally never check a particular file for corruption but I do check hardware and methods. Once a corruption happens, something will be different by the end of the day! So, get the methodology down, test it, go with it...

Is there a reason why some DE and its GUI file manager doesn't work right?
CwF
 
Posts: 789
Joined: 2018-06-20 15:16

Re: File Size comparing enough to ensure copy ok or we need

Postby cuckooflew » 2020-07-14 15:39

Ahh, yes, that would be relevant: which DE and file manager? I always forget that some people do not use the CLI... :mrgreen:

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 20:27

cuckooflew wrote:How do you copy ?

It's full folder copies (and individual files as well), and I figured that the three best options at the moment for simple lossless copying of folders (and individual files) are:

a) md5deep hashing
Seems overkill.

b) rsync -r
Code:
rsync -r /home/USER/compare1/ /home/USER/Pictures/compare1copy/

Seems rsync reliably does all the MD5 checking automatically so I don't have to do it manually with md5deep.

c) exact byte counted regular GTK Copy (XFCE Thunar Ctrl+v or right click copy pasting)
Figured regular XFCE Thunar copy paste and check exact byte count afterwards should technically be enough since data issues at transfer should result in a Thunar warning. HOWEVER I'VE EXPERIENCED CORRUPTION AFTER REGULAR BYTE CHECKING copy pasting like this using external hard drives so I can't really tell if it was the Seagate External Hard Drive or the PC.

Does regular GTK Copy and exact byte check afterwards ensure a lossless copy or should we still be wary about files not opening after an exact byte count check with regular right click copy paste?
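
One way to picture the "copy, then verify" idea behind options a) and b) is the sketch below. It is only an illustration under stated assumptions: `copy_and_verify` and `sha256_of` are hypothetical names, `shutil.copy2` stands in for whatever the file manager does, and SHA-256 replaces MD5. One caveat worth keeping in mind is that reading a file back immediately after writing it may be served from the operating system's cache rather than from the drive itself, so a check like this mostly confirms the copy operation and says little about how the media will behave later.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy one file, then re-read both sides and compare their digests."""
    source, destination = Path(source), Path(destination)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies data plus timestamps
    if sha256_of(source) != sha256_of(destination):
        raise OSError(f"checksum mismatch after copying {source}")
    return destination
```

Raising an error on mismatch, rather than returning a flag, makes it harder to overlook a bad copy when the function is called in a loop over many files.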
Last edited by debian121212 on 2020-07-14 20:36, edited 1 time in total.

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 20:34

CwF wrote:Is there a reason why some DE and its GUI file manager doesn't work right?

Well to be more specific the exact issue is that I store my data on Seagate External Hard Drives and even though the byte count was right, after some time the files won't open due to data corruption errors.

This is not something I want to keep dealing with, so I'm looking for ways to prevent it in the future. I've lost priceless data like this, and any insight is more than appreciated.

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 20:39

So it's really an issue of preventing backup data corruption, then.

Files open normally at first, and then, after being stored for some time on every external hard drive I've owned, they just stop opening due to data corruption warnings.

So I figure the corruption happens on the external hard drive, since the byte count was exact at the initial copy paste. In that case all 3 methods should work for the copy itself, as it's an external hard drive issue.

It keeps happening no matter which external hard drive I use. It seems that perfectly kept CD-Rs, or keeping multiple copies of the same hard drive, are the only way to ensure stuff opens normally after a while?

Having to keep pumping money and effort into new supposed-to-work-seagate external hard drives because they all just suck after a while is so annoying!!!! Haven't been able to ever trust online storage.

Any non basic recommendations on the best way to ensure stuff just opens normally after a while?
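
Since the corruption described here shows up long after a good copy, one common approach is to record a checksum for every file when the backup is made and re-check the list later, which is roughly what a hash list from a tool like md5deep is for. The Python sketch below is only an illustration of that idea: the function names and the JSON manifest format are assumptions made for this example, not anything the thread or md5deep prescribes.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_path):
    """Record a digest for every file under root at backup time."""
    root, manifest_path = Path(root), Path(manifest_path)
    entries = {
        item.relative_to(root).as_posix(): sha256_of(item)
        for item in sorted(root.rglob("*"))
        # Skip the manifest itself in case it is stored inside root.
        if item.is_file() and item.resolve() != manifest_path.resolve()
    }
    manifest_path.write_text(json.dumps(entries, indent=2))
    return entries


def verify_manifest(root, manifest_path):
    """Return relative paths that are missing or no longer match."""
    root = Path(root)
    recorded = json.loads(Path(manifest_path).read_text())
    damaged = []
    for relative, expected in recorded.items():
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(relative)
    return damaged
```

Running `verify_manifest` every few months on each backup drive would flag a damaged file while another copy is still good, which only helps if more than one copy exists.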

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 20:59

Looks like I'll stick to the happy medium, rsync (I still don't know if it's strictly necessary, though), and it was just an issue of better cold storage all along.

So 100 dollar basic consumer cold storage isn't doing it.

Paying more for better External Hard Drives seems to be the real way out of this data corruption issue.

Any insight more than appreciated.

Re: File Size comparing enough to ensure copy ok or we need

Postby cuckooflew » 2020-07-14 21:14

So, you are saying that when you first transfer them to the drive, they open OK and are good?
But after a long period of time they become corrupted.
Mostly I use Western Digital; I am not sure if I have any Seagate, but I can't say I have ever had anything like this happen. I have heard of this happening with CD/DVD, but there again, I have not experienced it, and I have some that are 15 years old and still good. How long is a "long period"? E.g. months, years, etc.
The only thing I can think of would be where they are stored. For example, if they were close to electric motors, generators, or any magnetic fields, that might cause some damage?
Last edited by cuckooflew on 2020-07-14 21:51, edited 2 times in total.

Re: File Size comparing enough to ensure copy ok or we need

Postby cuckooflew » 2020-07-14 21:20

Maybe read this: https://www.securedatarecovery.com/services/hard-drive-recovery/what-causes-hard-drive-data-corruption
Serious data corruption is more likely with larger files than with smaller files, since larger files take up more physical space on a hard drive's platters. If a hard drive has tracking issues or read/write head problems, corruption may affect several files or folders simultaneously. The physical hard disk issues that contribute to corruption are often caused by poor operating conditions, but all hard drives eventually fail due to mechanical stress and wear.

However, that site is really an ad for a data recovery company, and it seems to be referring to HDs. But really, USB portable hard drives are basically the same; mine are anyway, they do have disks, etc.
============ edited ==========
Another says basically the same:
Every big brand has its issues after a long term use, particularly with frequently improper use, such as incompatible bundled software with a newer operating system, a connection on multiple computers, unsafe ejection, physical vibration, etc. As a consequence, the Seagate external hard drive is not working anymore.
Last edited by cuckooflew on 2020-07-14 21:51, edited 1 time in total.

Re: File Size comparing enough to ensure copy ok or we need

Postby cuckooflew » 2020-07-14 21:43

debian121212 wrote:Any non basic recommendations on the best way to ensure stuff just opens normally after a while?

I found this one to be interesting: https://superuser.com/questions/284427/how-much-time-until-an-unused-hard-drive-loses-its-data
To periodically refresh the data on the drive, simply transfer it to another location, and re-writing it back to the drive. That way, the magnetic domains in the physical disk surface will be renewed with their original strength (because you just re-wrote the files back to the disk). If you're concerned about filesystem corruption, you can also format the disk before transferring the data back.

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 22:16

cuckooflew wrote:So, you are saying that when you first transfer them to the drive, they open OK and are good?
But after a long period of time they become corrupted.

Exactly.
cuckooflew wrote:The only thing I can think of would be where they are stored. For example, if they were close to electric motors, generators, or any magnetic fields, that might cause some damage?

Thanks for pointing that out, but I don't think this is the case, since it's just a regular room with an AC in one corner and a PC in another. I never place them on top of the PC case, for this reason.
Last edited by debian121212 on 2020-07-14 22:58, edited 1 time in total.

Re: File Size comparing enough to ensure copy ok or we need

Postby CwF » 2020-07-14 22:26

I do put large check files in the mix if concerned with the particular media. I use video files with the SHA-1 sum as the file name, and a Thunar custom action gives a zenity dialog for a quick compare, i.e. filename=sha1. I keep 1G, 4G, 10G and 64G files handy.

Other than that, yes, cycle often.
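
The check-file trick above, where a file is named after its own SHA-1 sum, can be sketched in a few lines of Python. This is a guess at the general idea and not CwF's actual Thunar custom action; the function names are hypothetical, and the sketch assumes the file name (without its extension) is the lowercase or uppercase hex digest.

```python
import hashlib
from pathlib import Path


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the SHA-1 hex digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def name_matches_sha1(path):
    """True if the file name, minus its extension, equals the file's SHA-1."""
    path = Path(path)
    return sha1_of(path) == path.stem.lower()
```

Copying such a file through a suspect cable, drive, or card and calling `name_matches_sha1` on the result tests the whole path without needing a separate checksum list.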

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 22:33

cuckooflew wrote:Maybe read this: https://www.securedatarecovery.com/services/hard-drive-recovery/what-causes-hard-drive-data-corruption
Serious data corruption is more likely with larger files than with smaller files, since larger files take up more physical space on a hard drive's platters. If a hard drive has tracking issues or read/write head problems, corruption may affect several files or folders simultaneously. The physical hard disk issues that contribute to corruption are often caused by poor operating conditions, but all hard drives eventually fail due to mechanical stress and wear.

However, that site is really an ad for a data recovery company, and it seems to be referring to HDs. But really, USB portable hard drives are basically the same; mine are anyway, they do have disks, etc.
============ edited ==========
Another says basically the same:
Every big brand has its issues after a long term use, particularly with frequently improper use, such as incompatible bundled software with a newer operating system, a connection on multiple computers, unsafe ejection, physical vibration, etc. As a consequence, the Seagate external hard drive is not working anymore.


This looks like the exact case. They ALL EVENTUALLY FAIL WITH TIME; sometimes in as little as 5 years, as is the case with my "cheap consumer 100 USD" Seagate USB External HD.

To periodically refresh the data on the drive, simply transfer it to another location, and re-writing it back to the drive. That way, the magnetic domains in the physical disk surface will be renewed with their original strength (because you just re-wrote the files back to the disk). If you're concerned about filesystem corruption, you can also format the disk before transferring the data back.


Great to know.

Is using rsync (thanks to its MD5 before-and-after check with every transfer, as per its man page) *actually necessary* at all just to make sure the file copied correctly and opens normally right after copying? How necessary is it? Is it really worth it?

Is a regular byte count good enough to make sure files open correctly immediately after a basic Ctrl+C and Ctrl+V copy (GTK copy, assuming the source file is OK and not corrupted itself)?

Re: File Size comparing enough to ensure copy ok or we need

Postby debian121212 » 2020-07-14 22:46

CwF wrote:I do put large check files in the mix if concerned with the particular media. I use video files with the sha1 sum as the file name, and a thunar custom action gives a zenity dialog for a quick compare, ie filename=sha1. I keep 1G,4G,10G and 64G files handy.

Other than that, yes, cycle often.


How'd you copy the files? Can you confirm that this has ever actually been necessary right after the initial copy, to make sure the media opens correctly? As in, have you ever copied something only to realize the MD5/SHA-1 or whatever is off right after a successful copy paste operation?

After some time a checksum would be able to reveal that data has gone bad, but that is due to data corruption after a good initial copy. Once the file has copied right, we're golden. However, is a checksum on the spot, right after a normal Ctrl+C Ctrl+V (GTK copy paste), really necessary at all once an exact byte count check shows the byte count to be correct?

I can't decide if rsync is actually worth it or not. For now I'm using it because it seems the safer option, until someone can confirm that it's unnecessary just to make sure the file opens right just after copying (as opposed to corruption that happens after a successful regular copy).
