I did some research on this issue - only because I was curious if this is reproducible - the results are interesting, so I've decided to write this post.
I've simulated the situation described by the OP, that is, I've made many tests by hard-disconnecting the USB drive while the rsync session was running.
The tests was made by rsync-ing a directory with a 280 files of size ranging from 2.5MB to 10MB.
Connected USB device: [log]
- Code: Select all
Sep 29 18:09:38 d7-01 kernel: [ 790.612117] usb 4-5: new high-speed USB device number 5 using ehci_hcd
Sep 29 18:09:38 d7-01 kernel: [ 790.750444] usb 4-5: New USB device found, idVendor=13fe, idProduct=3623
Sep 29 18:09:38 d7-01 kernel: [ 790.750454] usb 4-5: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Sep 29 18:09:38 d7-01 kernel: [ 790.750462] usb 4-5: Product: STORE N GO
Sep 29 18:09:38 d7-01 kernel: [ 790.750467] usb 4-5: Manufacturer: Verbatim
Sep 29 18:09:38 d7-01 kernel: [ 790.750472] usb 4-5: SerialNumber: <cut>
Sep 29 18:09:38 d7-01 kernel: [ 790.751256] scsi13 : usb-storage 4-5:1.0
Sep 29 18:09:38 d7-01 mtp-probe: checking bus 4, device 5: "/sys/devices/pci0000:00/0000:00:13.2/usb4/4-5"
Sep 29 18:09:38 d7-01 mtp-probe: bus: 4, device: 5 was not an MTP device
Sep 29 18:09:39 d7-01 kernel: [ 791.785586] scsi 13:0:0:0: Direct-Access Verbatim STORE N GO 5.00 PQ: 0 ANSI: 0 CCS
Sep 29 18:09:39 d7-01 kernel: [ 791.787023] sd 13:0:0:0: Attached scsi generic sg11 type 0
Sep 29 18:09:40 d7-01 kernel: [ 792.668407] sd 13:0:0:0: [sdj] 15638528 512-byte logical blocks: (8.00 GB/7.45 GiB)
Sep 29 18:09:40 d7-01 kernel: [ 792.668898] sd 13:0:0:0: [sdj] Write Protect is off
Sep 29 18:09:40 d7-01 kernel: [ 792.668901] sd 13:0:0:0: [sdj] Mode Sense: 23 00 00 00
Sep 29 18:09:40 d7-01 kernel: [ 792.669398] sd 13:0:0:0: [sdj] No Caching mode page found
Sep 29 18:09:40 d7-01 kernel: [ 792.669401] sd 13:0:0:0: [sdj] Assuming drive cache: write through
Sep 29 18:09:40 d7-01 kernel: [ 792.672793] sd 13:0:0:0: [sdj] No Caching mode page found
Sep 29 18:09:40 d7-01 kernel: [ 792.672804] sd 13:0:0:0: [sdj] Assuming drive cache: write through
Sep 29 18:09:40 d7-01 kernel: [ 792.701762] sdj: sdj1
Sep 29 18:09:40 d7-01 kernel: [ 792.705177] sd 13:0:0:0: [sdj] No Caching mode page found
Sep 29 18:09:40 d7-01 kernel: [ 792.705186] sd 13:0:0:0: [sdj] Assuming drive cache: write through
Sep 29 18:09:40 d7-01 kernel: [ 792.705194] sd 13:0:0:0: [sdj] Attached SCSI removable disk
The rsync command used was just like in the example provided by the OP, but with execution time measurement:
- Code: Select all
$time rsync --progress ./test_reports /media/VERBATIM8G
This is the typical log in case when the USB device is disconnected during the rsync session:
- Code: Select all
Sep 29 20:31:16 d7-01 kernel: [ 9289.122714] usb 4-5: USB disconnect, device number 13
Sep 29 20:31:16 d7-01 kernel: [ 9289.136107] sd 21:0:0:0: [sdj] Unhandled error code
Sep 29 20:31:16 d7-01 kernel: [ 9289.136117] sd 21:0:0:0: [sdj] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Sep 29 20:31:16 d7-01 kernel: [ 9289.136129] sd 21:0:0:0: [sdj] CDB: Write(10): 2a 00 00 1e ee f6 00 00 f0 00
Sep 29 20:31:16 d7-01 kernel: [ 9289.136150] end_request: I/O error, dev sdj, sector 2027254
Sep 29 20:31:16 d7-01 kernel: [ 9289.138970] sd 21:0:0:0: [sdj] Unhandled error code
Sep 29 20:31:16 d7-01 kernel: [ 9289.138978] sd 21:0:0:0: [sdj] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Sep 29 20:31:16 d7-01 kernel: [ 9289.138988] sd 21:0:0:0: [sdj] CDB: Write(10): 2a 00 00 1e ef e6 00 00 f0 00
Sep 29 20:31:16 d7-01 kernel: [ 9289.139007] end_request: I/O error, dev sdj, sector 2027494
Sep 29 20:31:16 d7-01 kernel: [ 9289.148115] FAT-fs (sdj1): FAT read failed (blocknr 2013)
Sep 29 20:31:16 d7-01 kernel: [ 9289.148192] FAT-fs (sdj1): unable to read inode block for updating (i_pos 488210)
Sep 29 20:31:16 d7-01 kernel: [ 9289.148413] FAT-fs (sdj1): Directory bread(block 30513) failed
Sep 29 20:31:16 d7-01 kernel: [ 9289.148439] FAT-fs (sdj1): Directory bread(block 30514) failed
This is the typical rsync output after the USB device is disconnected:
- Code: Select all
$time rsync --progress ./test_reports /media/VERBATIM8G/
100726-100802.csv
5391670 100% 53.80MB/s 0:00:00 (xfer#1, to-check=279/280)
100802_100805.csv
4577268 100% 24.25MB/s 0:00:00 (xfer#2, to-check=278/280)
100805-100810.csv
4321116 100% 16.89MB/s 0:00:00 (xfer#3, to-check=277/280)
100810-100816.csv
6021194 100% 12.88MB/s 0:00:00 (xfer#4, to-check=276/280)
100816-100820.csv
4498966 100% 6.99MB/s 0:00:00 (xfer#5, to-check=275/280)
100820-100826.csv
6938522 100% 5.00MB/s 0:00:01 (xfer#6, to-check=274/280)
100826-100903.csv
32768 0% 264.46kB/s 0:00:39
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/media/VERBATIM8G/100826-100903.csv": Input/output error (5)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (142 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
real 0m2.064s
user 0m0.256s
sys 0m0.124s
*ZERO* delay - I've removed the USB stick about 2 seconds after starting the rsync.
However, by disconnecting the USB drive
I've just damaged the FAT filesystem.
NOW:
With damaged filesystem, typically every ~10th test something wrong is happening:
- Code: Select all
Sep 29 18:27:49 d7-01 kernel: [ 1881.529456] usb 4-5: USB disconnect, device number 6
Sep 29 18:27:49 d7-01 kernel: [ 1881.544104] sd 14:0:0:0: [sdj] Unhandled error code
Sep 29 18:27:49 d7-01 kernel: [ 1881.544113] sd 14:0:0:0: [sdj] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Sep 29 18:27:49 d7-01 kernel: [ 1881.544122] sd 14:0:0:0: [sdj] CDB: Write(10): 2a 00 00 19 df 36 00 00 f0 00
Sep 29 18:27:49 d7-01 kernel: [ 1881.544141] end_request: I/O error, dev sdj, sector 1695542
Sep 29 18:27:49 d7-01 kernel: [ 1881.546431] sd 14:0:0:0: [sdj] Unhandled error code
Sep 29 18:27:49 d7-01 kernel: [ 1881.546437] sd 14:0:0:0: [sdj] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
Sep 29 18:27:49 d7-01 kernel: [ 1881.546446] sd 14:0:0:0: [sdj] CDB: Write(10): 2a 00 00 19 e0 26 00 00 f0 00
Sep 29 18:27:49 d7-01 kernel: [ 1881.546463] end_request: I/O error, dev sdj, sector 1695782
Sep 29 18:27:49 d7-01 kernel: [ 1881.652151] FAT-fs (sdj1): Directory bread(block 30512) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652164] FAT-fs (sdj1): Directory bread(block 30513) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652173] FAT-fs (sdj1): Directory bread(block 30514) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652180] FAT-fs (sdj1): Directory bread(block 30515) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652187] FAT-fs (sdj1): Directory bread(block 30516) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652194] FAT-fs (sdj1): Directory bread(block 30517) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652201] FAT-fs (sdj1): Directory bread(block 30518) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652208] FAT-fs (sdj1): Directory bread(block 30519) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652233] FAT-fs (sdj1): Directory bread(block 30512) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652240] FAT-fs (sdj1): Directory bread(block 30513) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652247] FAT-fs (sdj1): Directory bread(block 30514) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652254] FAT-fs (sdj1): Directory bread(block 30515) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652261] FAT-fs (sdj1): Directory bread(block 30516) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652268] FAT-fs (sdj1): Directory bread(block 30517) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652275] FAT-fs (sdj1): Directory bread(block 30518) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652282] FAT-fs (sdj1): Directory bread(block 30519) failed
Sep 29 18:27:49 d7-01 kernel: [ 1881.652480] FAT-fs (sdj1): Directory bread(block 30512) failed
.....
and so on, and so on....
Apparently some kinds of errors in the FAT are ignored by the kernel, and the rsync continues to run untill it processes all the files in the source directory !This is obviously a BUG - but not in the rsync - it's the kernel.
However, if I fix the filesystem with the fsck.vfat:
- Code: Select all
$fsck.vfat -a -v /dev/sdj1
dosfsck 3.0.13 (30 Jun 2012)
dosfsck 3.0.13, 30 Jun 2012, FAT32, LFN
Checking we can access the last sector of the filesystem
Boot sector contents:
System ID "mkdosfs"
Media byte 0xf8 (hard disk)
512 bytes per logical sector
4096 bytes per cluster
32 reserved sectors
First FAT starts at byte 16384 (sector 32)
2 FATs, 32 bit entries
7802880 bytes per FAT (= 15240 sectors)
Root directory start at cluster 2 (arbitrary size)
Data area starts at byte 15622144 (sector 30512)
1950627 data clusters (7989768192 bytes)
62 sectors/track, 247 heads
0 hidden sectors
15635532 sectors total
Orphaned long file name part ".csv"
Auto-deleting.
/.cm0013
Bad short file name ( \000\004\010\015\015\034\007./\000\000).
Auto-renaming it.
Renamed to FSCK0000.000
/.100915-100917_2200.csv.xsymaG
File size is 4165616 bytes, cluster chain length is 1818624 bytes.
Truncating file to 1818624 bytes.
/.100810-100816.csv.9U9Usp
File size is 6021194 bytes, cluster chain length is 3543040 bytes.
Truncating file to 3543040 bytes.
/.100816-100820.csv.JgjPTT
Contains a free cluster (213120). Assuming EOF.
/.100816-100820.csv.JgjPTT
File size is 4498966 bytes, cluster chain length is 253952 bytes.
Truncating file to 253952 bytes.
/.100816-100820.csv.LF7jnB
File size is 4498966 bytes, cluster chain length is > 4501504 bytes.
Truncating file to 4498966 bytes.
/.100816-100820.csv.rYXlI9
Contains a free cluster (241792). Assuming EOF.
/.100816-100820.csv.rYXlI9
File size is 4498966 bytes, cluster chain length is 126976 bytes.
Truncating file to 126976 bytes.
/.101004_1054-101011.csv.iycvJV
File size is 8516242 bytes, cluster chain length is 5828608 bytes.
Truncating file to 5828608 bytes.
Reclaiming unconnected clusters.
Unable to create unique name
The problem
*never* shows up
when working with consistent FAT filesystem.
The conclusion: The OP have made the tests with severely damaged FAT filesystem(s).
Possible solutions:
1. fsck the USB drive filesystem before running the rsync (risky! - but less risky than cumulating the damages)
2. Inform the user to not remove the pendrive until the rsync finishes its work.
3. use
--whole-file option for rsync -> this disables delta copies, but it highly reduces the risk of breaking the consistency of the filesystem's low-level structures.
*Do not* kill the rsync - this is plainly stupid idea - instead, You can just run it as a background process - this is far more safe, because this way You will avoid races between disconnecting event and normal rsync termination.