UPLOADING/splitting file process
1- First, we compute a checksum of the large source file, in order to detect errors that may have been introduced during its transmission or storage.
We keep the hash to compare against the copy we later restore/download from the cloud: if the hashes match, the big file was downloaded without any data corruption.
md5sum ubuntu-11.10-dvd-i386.iso
md5sum should then print out a single line after calculating the hash:
8044d756b7f00b695ab8dce07dce43e5 ubuntu-11.10-dvd-i386.iso
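A minimal sketch of recording the hash for later verification; `big.iso` here is a tiny stand-in file, not the real image, and `md5sum -c` re-checks a previously saved hash:

```shell
# Sketch: save the checksum now so it can be verified after download.
# "big.iso" is a small stand-in for the real ISO image.
echo "demo payload" > big.iso
md5sum big.iso > big.iso.md5     # record the hash alongside the file
md5sum -c big.iso.md5            # later: prints "big.iso: OK" if intact
```

Keeping the `.md5` file next to the pieces in the cloud makes the final comparison in the download process a one-liner.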
2- We split the huge file into smaller pieces with any tool that can do this (split, 7z, etc).
I'd recommend slice sizes between 25 MB and 50 MB; the bigger the slice, the higher the chance that a piece gets corrupted.
In this case I split a 4.2 GB ISO image file into around 88 pieces of 50 MB each.
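The split step can be sketched like this with GNU split; a 4 MB dummy file and 1 MB slices stand in for the 4.2 GB ISO and 50 MB slices (for the real file you would use `split -b 50M`):

```shell
# Create a 4 MB dummy file standing in for the big ISO.
dd if=/dev/zero of=image.iso bs=1M count=4 status=none
# Split it into 1 MB numbered pieces: image.iso.part.00 ... image.iso.part.03
split -b 1M -d image.iso image.iso.part.
ls image.iso.part.*
```

The `-d` flag gives numeric suffixes, so a later `cat image.iso.part.*` concatenates the pieces back in the right order.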
3- We will use a parity archive to repair data corruption if needed: https://en.wikipedia.org/wiki/Parchive
The more pieces you split a huge file into (the bigger the file), the more likely it is that some piece gets corrupted by the time you try to join them.
3.1 We install the par2 command-line tool (https://github.com/Parchive/par2cmdline):
apt-get install par2
3.2 We apply par2 to the directory holding the split pieces, to generate the parity files needed to repair the split file if it becomes necessary. It will create a few parity files that guarantee recovery in case of data corruption:
par2 c -R archive.par2 PATHtosplitedfile
4. Sync the files to the cloud, including the parity files.
-------------------------------------------------------------
DOWNLOADING/merging file process
1. We download all the pieces of the huge file from our cloud server.
2. We verify the integrity of all the downloaded pieces with the par2 command line:
par2 v archive.par2
If the result is OK, we can merge our files.
3. We repair the pieces if the check reported data corruption:
par2 r archive.par2
4. Once the pieces are repaired, we join them to rebuild the big file.
5. Finally, we compute the checksum again and compare the hash with the source, to make sure there was no data corruption anywhere in the process. This step should pass, since the parity repair guarantees the data is restored intact.