My system configuration:
Code:
Linux BigMutt 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) x86_64 GNU/Linux
Motherboard: Gigabyte 970A-D3P
CPU: AMD FX-8350 8-Core Processor @4000.000 MHz
cache: 2048 KB
RAM: 32GB (4x8GB) Unbuffered (Unregistered)
HP EH957SB StorageWorks LTO-5 Ultrium 3000 SAS Internal Tape Drive
QUANTUM LTO 5 TAPE CARTRIDGE (MR-L5MQN-01)
Video: GeForce 8400 GS
Monitor: VIZIO E320VA
Is the error with the cartridge, the LTO-5 (SAS) drive, or something else? How do I find the exact cause of an I/O error that is initially reported only as:
Code:
TAPE=/dev/nst0
tar --create --file $TAPE --verbose --totals ./*
./2019-07-21_SDA1.img
./2019-07-21_SDA2.img
Total bytes written: 39025121280 (37GiB, ?/s)
tar: /dev/nst0: cannot write: Input/output error
tar: /dev/nst0: cannot close: Input/output error
tar: Error is not recoverable: exiting now
/bin/mt: /dev/nst0: rmtopen failed: No such file or directory
Code:
Jul 27 09:18:17 BigMutt kernel: [39221.766994] st 6:0:3:0: [st0] Block limits 1 - 16777215 bytes.
[39576.563080] st 6:0:3:0: device_block, handle(0x0009)
[39576.563205] st 6:0:3:0: [st0] Error e0000 (driver bt 0x0, host bt 0xe).
[39578.062876] st 6:0:3:0: device_unblock and setting to running, handle(0x0009)
[39578.062963] st 6:0:3:0: [st0] Error 10000 (driver bt 0x0, host bt 0x1).
[39578.062966] st 6:0:3:0: [st0] Error on write filemark.
[39578.064281] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x500110a001622ed0)
[39578.064283] mpt2sas_cm0: enclosure logical id(0x500605b00341cef0), slot(0)
[39582.825144] scsi 6:0:4:0: Sequential-Access HP Ultrium 5-SCSI Z6ED PQ: 0 ANSI: 6
[39582.825152] scsi 6:0:4:0: SSP: handle(0x0009), sas_addr(0x500110a001622ed0), phy(3), device_name(0x500110a001622ed2)
[39582.825153] scsi 6:0:4:0: enclosure logical id (0x500605b00341cef0), slot(0)
[39582.827036] scsi 6:0:4:0: TLR Enabled
[39582.829132] st 6:0:4:0: Attached scsi tape st0
[39582.829134] st 6:0:4:0: st0: try direct i/o: yes (alignment 4 B)
[39582.829207] st 6:0:4:0: Attached scsi generic sg2 type 1
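The two st errors in the log above carry a result word (e0000, then 10000) whose bits 16-23 are the SCSI host byte, which the kernel also prints as "host bt". As a rough sketch, the function below maps that byte to the DID_* names that, to my knowledge, the Linux SCSI headers define. The function name host_byte_name is made up for illustration, and the table should be checked against the headers of the kernel actually running.

```shell
# Sketch: name the host byte of an st "Error <hex>" result word.
# Assumed layout of the word: bits 0-7 status, 8-15 message,
# 16-23 host byte, 24-31 driver byte.
# host_byte_name is a hypothetical helper, not part of any package.
host_byte_name() {
    # $1 is the hex word as printed by the kernel, without "0x" (e.g. e0000)
    host=$(( (0x$1 >> 16) & 0xff ))
    case $host in
        0)  echo DID_OK ;;
        1)  echo DID_NO_CONNECT ;;
        2)  echo DID_BUS_BUSY ;;
        3)  echo DID_TIME_OUT ;;
        4)  echo DID_BAD_TARGET ;;
        5)  echo DID_ABORT ;;
        6)  echo DID_PARITY ;;
        7)  echo DID_ERROR ;;
        8)  echo DID_RESET ;;
        9)  echo DID_BAD_INTR ;;
        10) echo DID_PASSTHROUGH ;;
        11) echo DID_SOFT_ERROR ;;
        12) echo DID_IMM_RETRY ;;
        13) echo DID_REQUEUE ;;
        14) echo DID_TRANSPORT_DISRUPTED ;;
        15) echo DID_TRANSPORT_FAILFAST ;;
        *)  echo "UNKNOWN($host)" ;;
    esac
}

# The two result words from the log above
host_byte_name e0000
host_byte_name 10000
```

For the two words in the log this prints DID_TRANSPORT_DISRUPTED and DID_NO_CONNECT. Both are set on the host side of the SCSI stack rather than returned by the drive as sense data, which may be worth keeping in mind when weighing cable or HBA against cartridge.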
I can put a different LTO-5 cartridge in the drive and try again, but it would be nice not to overwrite tapes that hold good backups. More to the point, I want to find the CAUSE of the error so I can fix it.
Code:
# mt -f /dev/nst0 status
drive type = 114
drive status = 1476395008
sense key error = 0
residue count = 0
file number = 0
block number = 0
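The "drive status" number above is easier to read once split into fields. As a sketch, assuming this mt prints the raw mt_dsreg value and that the Linux st driver packs the density code into bits 24-31 and the block size into bits 0-23, it can be decoded as follows. The function name decode_mt_dsreg is hypothetical.

```shell
# Sketch: split the "drive status" value printed by mt into its fields.
# Assumption: the value is mt_dsreg as filled in by the Linux st driver,
# with the density code in bits 24-31 and the block size in bits 0-23.
decode_mt_dsreg() {
    dsreg=$1
    density=$(( (dsreg >> 24) & 0xff ))
    blksize=$(( dsreg & 0xffffff ))
    printf 'density=0x%02x blocksize=%d\n' "$density" "$blksize"
}

# Value taken from the mt output above
decode_mt_dsreg 1476395008
```

This prints density=0x58 blocksize=0. 0x58 is the density code commonly listed for LTO-5, and a block size of 0 means variable-length blocks, so on this reading the status line describes the drive's mode and records no fault. Likewise "drive type = 114" is 0x72, the generic SCSI-2 tape type.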
Also, I ran a second command, dmesg. I edited out the entries pertaining to "office" and a spreadsheet I had open, which noted a large memory block of almost 4 GB. Nothing but the backup was running during the failed backup attempt, though, and the system had been freshly booted beforehand:
Code:
dmesg
st 6:0:3:0: [st0] Block limits 1 - 16777215 bytes.
[39576.563080] st 6:0:3:0: device_block, handle(0x0009)
[39576.563205] st 6:0:3:0: [st0] Error e0000 (driver bt 0x0, host bt 0xe).
[39578.062876] st 6:0:3:0: device_unblock and setting to running, handle(0x0009)
[39578.062963] st 6:0:3:0: [st0] Error 10000 (driver bt 0x0, host bt 0x1).
[39578.062966] st 6:0:3:0: [st0] Error on write filemark.
[39578.064281] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x500110a001622ed0)
[39578.064283] mpt2sas_cm0: enclosure logical id(0x500605b00341cef0), slot(0)
[39582.825144] scsi 6:0:4:0: Sequential-Access HP Ultrium 5-SCSI Z6ED PQ: 0 ANSI: 6
[39582.825152] scsi 6:0:4:0: SSP: handle(0x0009), sas_addr(0x500110a001622ed0), phy(3), device_name(0x500110a001622ed2)
[39582.825153] scsi 6:0:4:0: enclosure logical id (0x500605b00341cef0), slot(0)
[39582.827036] scsi 6:0:4:0: TLR Enabled
[39582.829132] st 6:0:4:0: Attached scsi tape st0
[39582.829134] st 6:0:4:0: st0: try direct i/o: yes (alignment 4 B)
[39582.829207] st 6:0:4:0: Attached scsi generic sg2 type 1
[122176.617444] st 6:0:4:0: [st0] Block limits 1 - 16777215 bytes
My issue seems very simple - to me, that is...
I use "partclone.[vfat|ext4] -c -s /dev/sda1|4 -o $OUTF" or "dd if=/dev/sda2 of=$OUTF conv=sparse,sync,noerror bs=4096" to create an image of disk partitions, and then use "tar -cvf /dev/st0 /backup_images" to write the images to LTO-5 tape.
I am trying to determine whether the errors that occur during "tar" are related to the drive, the cartridge, or the cable (some of my searching found instances of a cable fault causing an LTO write to fail). Replacing the cable or the cartridges can be done, but randomly substituting new for old in hopes of having it work violates EVERYTHING I have learned in 60+ years of programming and using computers. (Yes, I am old: the first computer I was paid to write a program for was a hybrid discrete solid-state and vacuum tube machine!) I want to know WHAT is causing the error, and in WHAT piece of equipment, so I can develop a solution that doesn't involve random pecking in hopes of finding a seed.
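One way to start separating drive, cartridge, and cable is to sort the kernel log lines into those showing the device being blocked or dropped by the HBA and those carrying sense data from the drive itself. The sketch below does only that counting. The patterns are assumptions drawn from the log excerpts in this post plus the "Sense Key" wording the kernel normally uses when it prints sense data, and classify_tape_log is a made-up name.

```shell
# Sketch: count kernel log lines that look like transport-level events
# (device blocked, handle removed, non-zero host byte) versus lines that
# carry SCSI sense data reported by the drive.
# classify_tape_log is a hypothetical helper; the patterns are assumptions.
classify_tape_log() {
    # Reads kernel log text on stdin
    awk '
        /device_block|removing handle|host bt 0x[1-9a-f]/ { transport++ }
        /Sense Key/                                      { sense++ }
        END { printf "transport=%d sense=%d\n", transport, sense }
    '
}

# Typical use (needs permission to read the kernel log):
#   dmesg | classify_tape_log
```

A log with only transport-level lines and no sense data would suggest looking at the link (cable, HBA, drive power) before the cartridge, though that is an inference to confirm with the drive's own logs and not a diagnosis.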