I/O Errors using TAR to LTO-5 tape AFTER CLEANING

w4kh
Posts: 98
Joined: 2006-09-09 19:10
Location: Alabama, USA

#1 Post by w4kh »

This question involves some "Old School" activity, but as an old dude, I think I'm entitled to ask... And, this ran fine three times before upgrading to Buster and twice since.
My system configuration:

Code:

Linux BigMutt 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u1 (2019-07-19) x86_64 GNU/Linux
Motherboard: Gigabyte 970A-D3P
CPU: AMD FX-8350 8-Core Processor @4000.000 MHz
cache: 2048 KB
RAM: 32GB (4x8GB)  Unbuffered (Unregistered)
HP EH957SB StorageWorks LTO-5 Ultrium 3000 SAS Internal Tape Drive
QUANTUM LTO 5 TAPE CARTRIDGE (MR-L5MQN-01)
Video: GeForce 8400 GS
Monitor: VIZIO E320VA
I clean (cleaner cartridge) the tape drive once a month, and at this point, I am only starting tar seven times a month. The LTO-5 tape cartridges have only been run through 6-10 times...
Is the error with the cartridge, the LTO-5 (SAS) drive, or something else? How do I find the exact cause of an I/O error that is initially reported only as:

Code:

TAPE=/dev/nst0
tar --create --file $TAPE --verbose --totals ./*
./2019-07-21_SDA1.img
./2019-07-21_SDA2.img
Total bytes written: 39025121280 (37GiB, ?/s)
tar: /dev/nst0: cannot write: Input/output error
tar: /dev/nst0: cannot close: Input/output error
tar: Error is not recoverable: exiting now
/bin/mt: /dev/nst0: rmtopen failed: No such file or directory
I looked in syslog for clues and syslog says:

Code:

Jul 27 09:18:17 BigMutt kernel: [39221.766994] st 6:0:3:0: [st0] Block limits 1 - 16777215 bytes.
[39576.563080] st 6:0:3:0: device_block, handle(0x0009)
[39576.563205] st 6:0:3:0: [st0] Error e0000 (driver bt 0x0, host bt 0xe).
[39578.062876] st 6:0:3:0: device_unblock and setting to running, handle(0x0009)
[39578.062963] st 6:0:3:0: [st0] Error 10000 (driver bt 0x0, host bt 0x1).
[39578.062966] st 6:0:3:0: [st0] Error on write filemark.
[39578.064281] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x500110a001622ed0)
[39578.064283] mpt2sas_cm0: enclosure logical id(0x500605b00341cef0), slot(0)
[39582.825144] scsi 6:0:4:0: Sequential-Access HP       Ultrium 5-SCSI   Z6ED PQ: 0 ANSI: 6
[39582.825152] scsi 6:0:4:0: SSP: handle(0x0009), sas_addr(0x500110a001622ed0), phy(3), device_name(0x500110a001622ed2)
[39582.825153] scsi 6:0:4:0: enclosure logical id (0x500605b00341cef0), slot(0)
[39582.827036] scsi 6:0:4:0: TLR Enabled
[39582.829132] st 6:0:4:0: Attached scsi tape st0
[39582.829134] st 6:0:4:0: st0: try direct i/o: yes (alignment 4 B)
[39582.829207] st 6:0:4:0: Attached scsi generic sg2 type 1
I tried looking for "Error Codes" in the support pages, but nothing came up for either "000E0000" or "00010000".
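
A note on those two values: "e0000" and "10000" do not look like drive error codes that a vendor support page would list. As far as I can tell they are the raw 32-bit result word of the Linux SCSI layer, which packs four bytes: status (bits 0-7), message (bits 8-15), host (bits 16-23), and driver (bits 24-31). The log's own "host bt 0xe" and "host bt 0x1" agree with that layout. Below is a minimal sketch that splits such a word and names the host byte. The function name decode_scsi_result is made up for illustration, and the DID_* names are taken from my reading of include/scsi/scsi.h in 4.19-era kernels, so treat the table as an assumption to check against your own kernel source.

```shell
#!/bin/sh
# Split the hex result word printed by st ("Error e0000") into its bytes.
# Assumed layout: status | msg << 8 | host << 16 | driver << 24.
decode_scsi_result() {
    result=$(( 0x$1 ))
    status=$(( result & 0xff ))           # SCSI status byte from the target
    msg=$(( (result >> 8) & 0xff ))       # message byte
    host=$(( (result >> 16) & 0xff ))     # host byte, set by the HBA/transport side
    driver=$(( (result >> 24) & 0xff ))   # driver byte
    # Host byte names (assumption: values as in 4.19 include/scsi/scsi.h).
    case $host in
        0)  name=DID_OK ;;
        1)  name=DID_NO_CONNECT ;;
        2)  name=DID_BUS_BUSY ;;
        3)  name=DID_TIME_OUT ;;
        4)  name=DID_BAD_TARGET ;;
        5)  name=DID_ABORT ;;
        6)  name=DID_PARITY ;;
        7)  name=DID_ERROR ;;
        8)  name=DID_RESET ;;
        9)  name=DID_BAD_INTR ;;
        10) name=DID_PASSTHROUGH ;;
        11) name=DID_SOFT_ERROR ;;
        12) name=DID_IMM_RETRY ;;
        13) name=DID_REQUEUE ;;
        14) name=DID_TRANSPORT_DISRUPTED ;;
        15) name=DID_TRANSPORT_FAILFAST ;;
        *)  name=unknown ;;
    esac
    printf 'status=0x%02x msg=0x%02x host=0x%02x (%s) driver=0x%02x\n' \
        "$status" "$msg" "$host" "$name" "$driver"
}

decode_scsi_result e0000   # first error in the log
decode_scsi_result 10000   # second error in the log
```

If that reading is right, the two errors decode to DID_TRANSPORT_DISRUPTED and DID_NO_CONNECT. Both are set on the host adapter side rather than coming from sense data returned by the drive, which fits the mpt2sas lines showing the device handle being removed and the drive reappearing as 6:0:4:0. That points toward the HBA, the cable, or the drive's SAS link, though by itself it does not rule out the drive.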
I can put a different LTO-5 cartridge in the drive and try again, but it would be nice not to overwrite what are good backups to tape, and more to the point, I want to find the CAUSE of the error so I can fix it.

Code:

# mt -f /dev/nst0 status
drive type = 114
drive status = 1476395008
sense key error = 0
residue count = 0
file number = 0
block number = 0
While the results show a blank tape, I did erase the failed backup on that cartridge, hence the zeros. I need to know how to decode the "drive status = 1476395008".
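
On decoding "drive status": assuming /bin/mt here is the mt that prints the raw mt_dsreg field from the MTIOCGET ioctl, the Linux st driver packs two things into that number. The low 24 bits hold the block size and the top 8 bits hold the density code (MT_ST_BLKSIZE_MASK and MT_ST_DENSITY_MASK in linux/mtio.h). The sketch below just applies those masks; decode_mt_dsreg is a hypothetical helper name.

```shell
#!/bin/sh
# Decode the "drive status" number shown by mt, assuming it is the raw
# mt_dsreg value as filled in by the Linux st driver.
decode_mt_dsreg() {
    dsreg=$1
    density=$(( (dsreg >> 24) & 0xff ))   # top 8 bits: density code
    blksize=$(( dsreg & 0xffffff ))       # low 24 bits: block size in bytes
    printf 'density=0x%02x blocksize=%d\n' "$density" "$blksize"
}

decode_mt_dsreg 1476395008
```

For 1476395008 (0x58000000) this gives density code 0x58 and block size 0. To my knowledge 0x58 is the density code an LTO-5 drive reports, and a block size of 0 means variable-length block mode, so the value describes the drive's configuration rather than an error. Similarly, "drive type = 114" is 0x72, which mtio.h lists as a generic SCSI-2 tape unit.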
Also, I ran a second command, dmesg. I edited out the entries pertaining to "office" and a spreadsheet I had open, which noted a large memory block of almost 4 GB. Nothing but the backup was open during the failed backup attempt, and the system had been freshly booted beforehand:

Code:

dmesg
st 6:0:3:0: [st0] Block limits 1 - 16777215 bytes.
[39576.563080] st 6:0:3:0: device_block, handle(0x0009)
[39576.563205] st 6:0:3:0: [st0] Error e0000 (driver bt 0x0, host bt 0xe).
[39578.062876] st 6:0:3:0: device_unblock and setting to running, handle(0x0009)
[39578.062963] st 6:0:3:0: [st0] Error 10000 (driver bt 0x0, host bt 0x1).
[39578.062966] st 6:0:3:0: [st0] Error on write filemark.
[39578.064281] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x500110a001622ed0)
[39578.064283] mpt2sas_cm0: enclosure logical id(0x500605b00341cef0), slot(0)
[39582.825144] scsi 6:0:4:0: Sequential-Access HP       Ultrium 5-SCSI   Z6ED PQ: 0 ANSI: 6
[39582.825152] scsi 6:0:4:0: SSP: handle(0x0009), sas_addr(0x500110a001622ed0), phy(3), device_name(0x500110a001622ed2)
[39582.825153] scsi 6:0:4:0: enclosure logical id (0x500605b00341cef0), slot(0)
[39582.827036] scsi 6:0:4:0: TLR Enabled
[39582.829132] st 6:0:4:0: Attached scsi tape st0
[39582.829134] st 6:0:4:0: st0: try direct i/o: yes (alignment 4 B)
[39582.829207] st 6:0:4:0: Attached scsi generic sg2 type 1
[122176.617444] st 6:0:4:0: [st0] Block limits 1 - 16777215 bytes
I can easily replace a no good tape cartridge, but the drive is about 15 or more times the cartridge cost, so I'll have to work on a workaround unless I can identify the problem as the drive, and swing a new drive. And at this point, I don't really know if it is a cartridge (easy fix) or drive issue, so throwing money at this isn't a good solution (for me - I am old and retired)...

My issue seems very simple - to me, that is...
I use "partclone.[vfat|ext4] -c -s /dev/sda1|4 -o $OUTF" or "dd if=/dev/sda2 of=$OUTF conv=sparse,sync,noerror bs=4096" to create an image of disk partitions, and then use "tar -cvf /dev/st0 /backup_images" to write the images to LTO-5 tape.

I am trying to determine if the errors that occur during "tar" are related to the drive, the cartridge, or the cable (some of my searching found instances of a cable fault causing an LTO write to fail)... replacing the cable or the cartridges can be done, but randomly substituting new for older in hopes of having it work violates EVERYTHING I have learned in 60+ years (yes, I am old, and the first computer that I wrote a program for - and was paid - was a hybrid discrete solid-state and vacuum tube machine!) of programming and using computers. I want to know WHAT is causing the error and to WHAT piece of equipment, so I can develop a solution that doesn't involve random pecking in hopes of finding a seed.
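
One way to start separating "drive or cartridge" from "cable or HBA" without swapping parts is to count what kind of evidence the kernel log actually contains. The sketch below is only a rough heuristic under my own assumptions: classify_tape_log is a hypothetical helper, and it assumes that media or drive faults usually show up as lines containing "Sense Key" (printed when the drive returns sense data), while link problems show up as mpt2sas "removing handle" lines and st errors that carry only host/driver bytes.

```shell
#!/bin/sh
# Count kernel log lines that hint at transport trouble versus
# drive-reported (sense data) trouble. Reads log text on stdin.
classify_tape_log() {
    awk '
        /mpt2sas.*removing handle/                   { removed++ }  # HBA dropped the device
        /\[st[0-9]+\] Error [0-9a-f]+ \(driver bt/   { hosterr++ }  # host/driver byte errors
        /Sense Key/                                  { sense++ }    # drive returned sense data
        END {
            printf "hba_removals=%d host_errors=%d sense_reports=%d\n",
                   removed, hosterr, sense
        }
    '
}

# Example with three lines copied from the log in this thread.
# On a live system the input would instead be: dmesg | classify_tape_log
classify_tape_log <<'EOF'
[39576.563205] st 6:0:3:0: [st0] Error e0000 (driver bt 0x0, host bt 0xe).
[39578.062963] st 6:0:3:0: [st0] Error 10000 (driver bt 0x0, host bt 0x1).
[39578.064281] mpt2sas_cm0: removing handle(0x0009), sas_addr(0x500110a001622ed0)
EOF
```

For the lines posted here it reports one HBA removal, two host-level errors, and no sense reports. If that pattern holds across failures, the cartridge looks like the less likely suspect, but a drive that resets itself mid-write could produce the same picture, so this narrows the search rather than settling it.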
4.19.0-9-amd64 #1 SMP Debian 4.19.118-2+deb10u1
CPU: AMD FX(tm)-8350 Eight-Core Processor
RAM: 32GB (8x8GB) DDR3 DRAM
Video: GeForce 8400 GS to VIZIO E320VA Monitor

trinidad
Posts: 297
Joined: 2016-08-04 14:58
Been thanked: 15 times

Re: I/O Errors using TAR to LTO-5 tape AFTER CLEANING

#2 Post by trinidad »

I've followed both of your posts on this subject. A couple of things jump out at me. If SSL did not report a clearer explanation of the code then there isn't one. 000E0000 and 00010000 can express a variety of issues: startpoint, mount point, drive offset errors, inaccessible or spurious RAM sectors, API extension errors, an almost endless list of designations. You must realize that not many people would be running your hardware with Debian 10. Because of your I/O errors, RAM permissions and/or RAM failure seem likely, which can also be linked to board performance itself if you were dealing with an all-disk setup. However, because you are using a tape drive, several other things can randomly occur, not the least of which is random sector misalignments during copying, thus startpoint/mountpoint errors. My first question is: do you have all the OEM Ultrium utilities installed and available to you? Secondly, is all firmware up to date and unbroken? If you do have the utilities, you should use them first to completely check the tape drive regularly, and preferably right after cooking it a while. You might have better luck with your question on StackExchange. It's difficult for people to address such a question when they have no access to your particular hardware situation. I do wish you good luck, and hope you identify the issue.

TC
You can't believe your eyes if your imagination is out of focus.

w4kh
Posts: 98
Joined: 2006-09-09 19:10
Location: Alabama, USA

Re: I/O Errors using TAR to LTO-5 tape AFTER CLEANING

#3 Post by w4kh »

Thank you, TC

I will do some more research, looking for Ultrium utilities and firmware updates...
Then I'll check StackExchange...

I appreciate the leads...

And, as an old timer with loads of mainframe time, I did choose tape as my local backup medium because I can do an overnight backup daily (or on some other schedule that I choose) to a medium that is easy to store off-site or in a fire-proof safe. Tape cartridges cost much less than disks of comparable capacity, and their physical size makes storage much easier.