Upload huge files to cloud (keeping integrity)


If we want to upload/back up large files to Dropbox, MEGA, etc., keeping data integrity and being able to restore them successfully, we need to apply data parity in the process (https://en.wikipedia.org/wiki/Parchive). We could do it as follows:

UPLOADING/splitting file process
1- First, we make a checksum of the source big file, with the purpose of detecting errors that may be introduced during its transmission or storage.
md5sum ubuntu-11.10-dvd-i386.iso
md5sum should then print out a single line after calculating the hash:
8044d756b7f00b695ab8dce07dce43e5 ubuntu-11.10-dvd-i386.iso
We keep the hash to compare once we want to restore/download the big file from the cloud: if the two hashes match each other, it means the big file was downloaded without any data corruption.
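
A handy way to keep the hash around is to redirect it into a small file that gets uploaded together with the pieces (the .md5 file name is just a convention I'm assuming here):
md5sum ubuntu-11.10-dvd-i386.iso > ubuntu-11.10-dvd-i386.iso.md5
We'll use this file again in the final verification step below.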

2- We split the huge file into smaller pieces with any tool that does this (split, 7z, etc.).
I'd recommend slice sizes between 25MB and 50MB: the bigger the slice, the bigger the possibility of data corruption inside it.

In this case I split a 4.2GB image iso file into around 88 pieces of 50MB each.
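
For example, with GNU split (reusing the iso from the checksum step; the .part- suffix is just my naming choice):
split -b 50M ubuntu-11.10-dvd-i386.iso ubuntu-11.10-dvd-i386.iso.part-
This produces pieces named ubuntu-11.10-dvd-i386.iso.part-aa, .part-ab, and so on, in an order that a shell glob will reproduce later when we join them.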

3- We will use a data parity archive to repair data corruption in case of need: https://en.wikipedia.org/wiki/Parchive
The more files you split a huge file into (the bigger the file), the more possibilities there are for the huge file to get corrupted when you try to join them.

3.1 We'll install the par2 command line tool: https://github.com/Parchive/par2cmdline
apt-get install par2
3.2 We apply par2 to our split file's directory to generate the parity files needed to repair the split file if it becomes necessary:
par2 c -R archive.par2 PATHtosplitfiles
It will create a few parity files that will guarantee the recovery in case of data corruption.
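
Following the split example above, a concrete invocation could look like this (the -r10 option asks par2 for roughly 10% redundancy; that percentage is just my assumption, tune it to how much corruption you want to survive):
par2 c -r10 archive.par2 ubuntu-11.10-dvd-i386.iso.part-*
This writes archive.par2 plus several archive.vol*.par2 recovery volumes; all of them should be uploaded along with the pieces.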

4. Sync the files to the cloud, including the parity files.

-------------------------------------------------------------
DOWNLOADING/merging file process

1. We download all the pieces of the huge file from our cloud server.
2. We check/verify the integrity of all the downloaded pieces by using parity checking with the par2 command line:
par2 v archive.par2
If the result is OK, we can merge our files.

3. We repair the pieces if the check reported data corruption:
par2 r archive.par2
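
Since par2 exits with a non-zero status when verification fails, steps 2 and 3 can be scripted as a one-liner (a minimal sketch):
par2 v archive.par2 || par2 r archive.par2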
4. Once we've repaired the pieces, we join them to merge back the big file.
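
With pieces named as in the split example, joining is just a concatenation; the shell expands the glob in sorted order, which matches the order split created them in:
cat ubuntu-11.10-dvd-i386.iso.part-* > ubuntu-11.10-dvd-i386.iso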

5. Finally, we apply the checksum again and compare the hash with the source's, to make sure there wasn't any data corruption anywhere in the process. This step should be OK, as the parity repair should have restored the data's integrity.
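
If we saved the hash into a .md5 file in step 1 of the upload, md5sum can do the comparison for us:
md5sum -c ubuntu-11.10-dvd-i386.iso.md5
It prints "ubuntu-11.10-dvd-i386.iso: OK" when the hashes match.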