[SOLVED] repairing a system with bad blocks on the hard disk

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

[SOLVED] repairing a system with bad blocks on the hard disk

#1 Post by Soul Singin' »

EDIT: The members of this forum just helped me repair a nasty hard disk error. I had run file system checks before, but I never knew that the default check does not update the bad block inode list.
p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.
With that one sentence, p.H saved my computer. And the advice that he and L_V gave me in this thread was priceless.

What ultimately worked for me was checking both my / (root) and /home partitions with the non-destructive read-write test, -cc, from a Live CD:

Code: Select all

e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7
That check identified and repaired the affected inodes. It also wrote over the damaged files. Keep a list of those files. You will have to replace them (as explained below).

Next, I ran the checks again with the read-only option -c:

Code: Select all

e2fsck -f -y -c -C0 /dev/sda5
e2fsck -f -y -c -C0 /dev/sda7
Running the check a second time was an important step because it added a few more blocks to the bad blocks list.

Having repaired the file system, the next step was to repair the affected files:
p.H wrote:Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.
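For readers who do not have a matching Live CD, the package route that p.H describes could look roughly like this. It is only a sketch: it assumes a Debian system where `dpkg -S` prints the owning package as `package: /path`, and the helper name `owner_of` and the sample line are made up for illustration.

```shell
#!/bin/sh
# owner_of: reads `dpkg -S` style output on stdin and prints the package name(s).
# A line looks like "coreutils: /bin/ls"; everything from ": /" onward is dropped.
owner_of() {
    sed -n 's|: /.*$||p' | sort -u
}

# Illustrative sample line in the format dpkg -S prints (not taken from this thread):
printf '%s\n' 'coreutils: /bin/ls' | owner_of

# On the repaired system, as root, the helper could feed apt-get
# (the file name below is a placeholder):
#   pkg=$(dpkg -S /usr/bin/somefile | owner_of)
#   apt-get install --reinstall "$pkg"
```

Reinstalling the package replaces every file it ships, so the damaged file is rewritten onto healthy blocks along with the rest.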
In my case, I had a fresh install of Debian Buster and a Debian Buster Live CD, so I just copied them from the Live CD:

Code: Select all

mkdir /media/inspiron
mount /dev/sda5 /media/inspiron
cp /usr/bin/$FILE01  /media/inspiron/usr/bin/$FILE01
cp /usr/bin/$FILE02  /media/inspiron/usr/bin/$FILE02
...
umount /dev/sda5
After that, the computer booted like a charm. Importantly, it shut down like a charm too. There were no priority 0 or 1 messages in my journalctl output.

Thank you to p.H and L_V for helping me rescue this old machine! :D

----------------------------------------

ORIGINAL POST:

After a fresh installation of Debian Buster on an old machine, the partition that contains /home does not unmount at shutdown. The problem seems to be caused by an I/O error. At first glance, smartctl does not show any errors, but a deeper look shows that the disk experienced a few errors on the / (root) partition a few years ago.

If I followed Linux Admins' "Fixing disk problems" guide, would that resolve the issue?

Thanks in advance,
- Soul

Code: Select all

$ journalctl -r -b -1 -p3

-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 15:26:53 EDT. --
May 19 14:51:02 inspiron systemd[1]: Failed unmounting /home.
May 19 14:51:02 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:51:02 inspiron kernel: ata1.00: error: { UNC }
May 19 14:51:02 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:51:02 inspiron kernel: ata1.00: cmd 60/08:88:c8:a3:b6/00:00:09:00:00/40 tag 17 ncq dma 4096 in
                                          res 41/40:08:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:51:02 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:51:02 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:51:02 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20000 SErr 0x0 action 0x0
May 19 14:50:59 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 14:50:59 inspiron kernel: ata1.00: error: { UNC }
May 19 14:50:59 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 14:50:59 inspiron kernel: ata1.00: cmd 60/20:a8:c0:a3:b6/00:00:09:00:00/40 tag 21 ncq dma 16384 in
                                          res 41/40:20:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 14:50:59 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 14:50:59 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 14:50:59 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x200000 SErr 0x0 action 0x0
May 19 14:50:43 inspiron wpa_supplicant[509]: dbus: wpa_dbus_property_changed: no property SessionLength in object /fi/w1/wpa_supplicant1/Interfaces/1
May 19 14:47:06 inspiron root[7585]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:40:19 inspiron root[7277]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:34:55 inspiron root[7129]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:27:20 inspiron root[6970]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:19:26 inspiron root[6425]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:13:30 inspiron root[6164]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:07:00 inspiron root[4631]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 14:01:11 inspiron root[3004]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:53:40 inspiron root[2451]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:46:29 inspiron root[2355]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:41:26 inspiron root[2260]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:33:31 inspiron root[1803]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:28:13 inspiron root[1633]: /etc/dhcp/dhclient-exit-hooks.d/zzz_avahi-autoipd returned non-zero exit status 1
May 19 13:26:48 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:48 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:c8:f0:00:08/00:00:0c:00:00/40 tag 25 ncq dma 4096 in
                                          res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:98:90:f6:3c/00:00:0a:00:00/40 tag 19 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:90:08:3e:d1/00:00:30:00:00/40 tag 18 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:88:08:2d:8d/00:00:15:00:00/40 tag 17 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:80:c0:3e:59/00:00:09:00:00/40 tag 16 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 61/18:30:18:7f:c5/00:00:2f:00:00/40 tag 6 ncq dma 12288 out
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:48 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:48 inspiron kernel: ata1.00: cmd 60/08:28:c8:6a:71/00:00:15:00:00/40 tag 5 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:48 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:48 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:48 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x20f0060 SErr 0x0 action 0x0
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 361573512
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 156843584
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:45 inspiron kernel: print_req_error: I/O error, dev sda, sector 359754432
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/d0:f0:88:2c:8d/00:00:15:00:00/40 tag 30 ncq dma 106496 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/00:e8:40:3e:59/01:00:09:00:00/40 tag 29 ncq dma 131072 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:58:08:3e:d1/00:00:30:00:00/40 tag 11 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:48:f0:00:08/00:00:0c:00:00/40 tag 9 ncq dma 4096 in
                                          res 41/40:08:f6:00:08/00:00:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/08:18:90:f6:3c/00:00:0a:00:00/40 tag 3 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 60/40:10:c0:6a:71/00:00:15:00:00/40 tag 2 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:45 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:45 inspiron kernel: ata1.00: cmd 61/08:00:00:70:cc/00:00:31:00:00/40 tag 0 ncq dma 4096 out
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:45 inspiron kernel: ata1.00: failed command: WRITE FPDMA QUEUED
May 19 13:26:45 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:45 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x60000a0d SErr 0x0 action 0x0
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 804169080
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 361572440
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851904
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201851126
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 813441024
May 19 13:26:40 inspiron kernel: print_req_error: I/O error, dev sda, sector 201848320
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:e0:78:a5:ee/00:00:2f:00:00/40 tag 28 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c8:30:59:d1/00:00:30:00:00/40 tag 25 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/08:c0:58:60:92/00:00:31:00:00/40 tag 24 ncq dma 4096 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/58:60:58:28:8d/00:00:15:00:00/40 tag 12 ncq dma 45056 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/d8:58:00:04:08/06:00:0c:00:00/40 tag 11 ncq dma 897024 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { UNC }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:50:00:00:08/04:00:0c:00:00/40 tag 10 ncq dma 524288 in
                                          res 41/40:00:f6:00:08/00:04:0c:00:00/00 Emask 0x409 (media error) <F>
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/40:48:00:20:7c/00:00:30:00:00/40 tag 9 ncq dma 32768 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: error: { ABRT }
May 19 13:26:39 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 13:26:39 inspiron kernel: ata1.00: cmd 60/00:40:00:f6:07/06:00:0c:00:00/40 tag 8 ncq dma 786432 in
                                          res 41/04:00:f6:00:08/00:00:0c:00:00/00 Emask 0x1 (device error)
May 19 13:26:39 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 13:26:39 inspiron kernel: ata1.00: irq_stat 0x40000001
May 19 13:26:39 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x13001f00 SErr 0x0 action 0x0
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: FW version command failed -5
May 19 13:22:09 inspiron kernel: mei mei::55213584-9a29-4916-badf-0fb7ed682aeb:01: Could not read FW version
May 19 13:22:05 inspiron kernel: ACPI: SPCR: Unexpected SPCR Access Width.  Defaulting to byte size

Code: Select all

# fdisk -l

Disk /dev/sda: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Disk model: ST9500325AS     
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x07f2837e

Device     Boot     Start       End   Sectors   Size Id Type
/dev/sda1              63    208844    208782   102M de Dell Utility
/dev/sda2  *       208845  30928844  30720000  14.7G  7 HPFS/NTFS/exFAT
/dev/sda3        30928845 155775023 124846179  59.5G  7 HPFS/NTFS/exFAT
/dev/sda4       155782305 976768064 820985760 391.5G  5 Extended
/dev/sda5  *    155782368 177305599  21523232  10.3G 83 Linux
/dev/sda6       177307648 199903231  22595584  10.8G 82 Linux swap / Solaris
/dev/sda7       199905280 976766975 776861696 370.4G 83 Linux
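The failing sectors in the kernel log can be matched against the partition boundaries above with a little arithmetic. The sketch below is only an illustration: the function name `locate_sector` is invented, the partition table is copied by hand from the fdisk output, and the 4096-byte filesystem block size is an assumption (the real value can be read with `tune2fs -l /dev/sda5 | grep 'Block size'`).

```shell
#!/bin/sh
# Partition boundaries (name, first sector, last sector), copied from fdisk -l.
PARTS="sda5 155782368 177305599
sda6 177307648 199903231
sda7 199905280 976766975"

# locate_sector SECTOR [FS_BLOCK_SIZE]
# Prints the partition that contains SECTOR and the filesystem block
# (counted from the start of that partition) the sector falls into.
# Sectors are 512 bytes, as reported by fdisk; block size defaults to 4096.
locate_sector() {
    sector=$1
    bs=${2:-4096}
    printf '%s\n' "$PARTS" | while read -r name start end; do
        if [ "$sector" -ge "$start" ] && [ "$sector" -le "$end" ]; then
            echo "$name $(( (sector - start) * 512 / bs ))"
        fi
    done
}

locate_sector 162964427   # sector reported at 14:51:02
locate_sector 201851126   # sector reported at 13:26:48
```

With these numbers the first sector lands in sda5 (the root partition) and the second in sda7 (/home). The block number can then be handed to debugfs, whose icheck and ncheck commands map a block to an inode and a path.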

Code: Select all

# smartctl -l selftest /dev/sda

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -

Code: Select all

# smartctl -a /dev/sda

smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Momentus 5400.6
Device Model:     ST9500325AS
Serial Number:    6VEGMVRP
LU WWN Device Id: 5 000c50 03067dd6f
Firmware Version: D005DEM1
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Sun May 19 15:05:07 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(    0) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 139) minutes.
Conveyance self-test routine
recommended polling time: 	 (   3) minutes.
SCT capabilities: 	       (0x103f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   101   089   006    Pre-fail  Always       -       29958806
  3 Spin_Up_Time            0x0003   099   099   085    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   091   091   020    Old_age   Always       -       9917
  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
  7 Seek_Error_Rate         0x000f   083   060   030    Pre-fail  Always       -       207791365
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       23876
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   094   094   020    Old_age   Always       -       6861
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1097
188 Command_Timeout         0x0032   100   096   000    Old_age   Always       -       3759
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   051   036   045    Old_age   Always   In_the_past 49 (Min/Max 49/49 #998)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       20
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       578157
194 Temperature_Celsius     0x0022   049   064   000    Old_age   Always       -       49 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   053   045   000    Old_age   Always       -       29958806
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       22868 (153 213 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       3790333358
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       1937597633
254 Free_Fall_Sensor        0x0032   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 987 (device log contains only the most recent five errors)
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 987 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb a3 b6 09  Error: UNC at LBA = 0x09b6a3cb = 162964427

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 c8 a3 b6 49 00      05:50:04.649  READ FPDMA QUEUED
  60 00 28 e0 a3 b6 49 00      05:50:04.617  READ FPDMA QUEUED
  60 00 08 c0 a3 b6 49 00      05:50:04.515  READ FPDMA QUEUED
  27 00 00 00 00 00 e0 00      05:50:04.513  READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
  ec 00 00 00 00 00 a0 00      05:50:04.512  IDENTIFY DEVICE

Error 986 occurred at disk power-on lifetime: 23876 hours (994 days + 20 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 cb a3 b6 09  Error: UNC at LBA = 0x09b6a3cb = 162964427

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 20 c0 a3 b6 49 00      05:50:02.009  READ FPDMA QUEUED
  60 00 08 10 50 bb 49 00      05:50:01.961  READ FPDMA QUEUED
  ea 00 00 00 00 00 a0 00      05:49:55.547  FLUSH CACHE EXT
  61 00 08 a0 33 4a 49 00      05:49:55.547  WRITE FPDMA QUEUED
  ea 00 00 00 00 00 a0 00      05:49:55.538  FLUSH CACHE EXT

Error 985 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      04:25:50.279  READ FPDMA QUEUED
  60 00 08 f0 00 08 4c 00      04:25:50.257  READ FPDMA QUEUED
  61 00 08 ff ff ff 4f 00      04:25:50.256  WRITE FPDMA QUEUED
  60 00 08 90 f6 3c 4a 00      04:25:50.256  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      04:25:50.256  READ FPDMA QUEUED

Error 984 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 40 ff ff ff 4f 00      04:25:47.981  READ FPDMA QUEUED
  60 00 80 28 9e 57 49 00      04:25:47.954  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.953  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.945  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00      04:25:47.941  READ FPDMA QUEUED

Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 f6 00 08 0c  Error: UNC at LBA = 0x0c0800f6 = 201851126

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00      04:25:42.214  READ FPDMA QUEUED
  60 00 d8 00 04 08 4c 00      04:25:42.209  READ FPDMA QUEUED
  60 00 00 00 00 08 4c 00      04:25:42.207  READ FPDMA QUEUED
  60 00 00 00 f6 07 4c 00      04:25:42.207  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00      04:25:42.163  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
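The overall health line says PASSED, but a few raw values in the attribute table tell a different story. A small filter like the one below pulls them out. Treat it as a sketch: the function name `smart_raw` is made up, and it assumes the ten-column attribute layout shown above, with the raw value in the last column. The sample rows are copied from this output so the snippet runs without the drive.

```shell
#!/bin/sh
# smart_raw NAME: reads a smartctl attribute table on stdin and prints
# the RAW_VALUE (column 10) of the attribute called NAME.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Rows copied from the smartctl -a output in this post.
sample='  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
187 Reported_Uncorrect      0x0032   001   001   000    Old_age   Always       -       1097
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       4'

for attr in Reallocated_Sector_Ct Reported_Uncorrect Current_Pending_Sector Offline_Uncorrectable; do
    printf '%s = %s\n' "$attr" "$(printf '%s\n' "$sample" | smart_raw "$attr")"
done

# Against the live drive (as root):
#   smartctl -A /dev/sda | smart_raw Current_Pending_Sector
```

Non-zero counts for pending and uncorrectable sectors mean the drive has sectors it could not read, which fits the UNC errors in the kernel log.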
Last edited by Soul Singin' on 2019-05-21 22:11, edited 6 times in total.

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#2 Post by L_V »

You can try disabling NCQ (Native Command Queuing):

Code: Select all

echo 1 > /sys/block/sda/device/queue_depth
And investigate this:

Code: Select all

sudo hdparm -I /dev/sda | grep TRIM
find /etc/cron* -name fstrim
and check this non-Linux partition:

Code: Select all

/dev/sda2  *       208845  30928844  30720000  14.7G  7 HPFS/NTFS/exFAT
Can you unmount it manually?

Code: Select all

mount | grep sda
grep -v \# /etc/fstab
Soul Singin' wrote:a deeper look shows that the disk experienced a few errors on the / (root) partition a few years ago
All of the above assumes you have already checked your disk (e2fsck) with all partitions unmounted (from a Live CD, for example).

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#3 Post by Soul Singin' »

Thanks for the quick response! :D
L_V wrote:You can try disabling NCQ (Native Command Queuing):

Code: Select all

echo 1 > /sys/block/sda/device/queue_depth
I'll look into that. Thanks for the tip.
L_V wrote:And investigate this:

Code: Select all

sudo hdparm -I /dev/sda | grep TRIM
find /etc/cron* -name fstrim
Both commands return nothing at all.
L_V wrote:and check this non-Linux partition:

Code: Select all

/dev/sda2  *       208845  30928844  30720000  14.7G  7 HPFS/NTFS/exFAT
That's my MS Windows partition. I never use it, never mount it, but I'll take a look.
L_V wrote:Can you unmount it manually?
The last time I tried, I could not.
L_V wrote:

Code: Select all

mount | grep sda

Code: Select all

/dev/sda5 on / type ext3 (rw,relatime,errors=remount-ro)
/dev/sda7 on /home type ext3 (rw,relatime)
L_V wrote:

Code: Select all

grep -v \# /etc/fstab
UUIDs confuse me. Let's do it this way:

Code: Select all

# / was on /dev/sda5 during installation
UUID=0b79e803-8f3f-4a0b-843b-e065e03a90c3 /               ext3    errors=remount-ro 0       1

# /home was on /dev/sda7 during installation
UUID=7e85e962-bd1b-4892-9f39-9a25d7c8ec3d /home           ext3    defaults        0       2

# swap was on /dev/sda6 during installation
UUID=dd5ce90f-a661-4f0d-9553-4e1df17c362d none            swap    sw              0       0

/dev/sr0        /media/cdrom0   udf,iso9660 user,noauto     0       0
L_V wrote:All of the above assumes you have already checked your disk (e2fsck) with all partitions unmounted (from a Live CD, for example).
Yes. I already did that.

Question: What does that file system check do? According to Wikipedia, e2fsck keeps a list of bad blocks in the "bad block inode," so that those blocks do not get allocated.

Does that mean that a file system check updates that list? If so, should I do nothing at all? Obviously, I should back up my important files and consider buying a new machine, but ...

Should I just leave this old hard disk alone? Or should I try to repair the bad blocks? I ran badblocks and saw a very long list.

Thanks again for your help!

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#4 Post by L_V »

Code: Select all

man e2fsck
DESCRIPTION
e2fsck is used to check the ext2/ext3/ext4 family of file systems.
For ext3 and ext4 filesystems that use a journal, if the system has been shut down uncleanly without any errors, normally,
after replaying the committed transactions in the journal, the file system should be marked as clean.
Hence, for filesystems that use journalling, e2fsck will normally replay the journal and exit,
unless its superblock indicates that further checking is required.
Note that in general it is not safe to run e2fsck on mounted filesystems.

If not sure, you should verify your disk again.
Any good reason not to use ext4?

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#5 Post by Soul Singin' »

Yes. I read that page, but I did not see anything about how e2fsck handles bad blocks.
L_V wrote:If not sure, you should verify your disk again.
Great minds think alike. :) I put the Live CD back in the machine and I am checking both partitions with:

Code: Select all

e2fsck -f -p /dev/sda5
e2fsck -f -p /dev/sda7
Let's see what happens now.
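The man page excerpt quoted above warns that running e2fsck on a mounted file system is not safe, so a guard before those two commands may be worthwhile. The following is only a sketch: is_mounted is a hypothetical helper name, and it assumes the device appears under its /dev/sdaN name in /proc/mounts (the root device is sometimes listed differently).

```shell
# Minimal sketch: refuse to check a partition that is still mounted.
# is_mounted is a hypothetical helper; the optional second argument exists
# only so the check can be tried against a saved copy of the mount table.
is_mounted() {
    dev=$1
    table=${2:-/proc/mounts}
    # Field 1 of each line in /proc/mounts is the device node.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$table"
}

for part in /dev/sda5 /dev/sda7; do
    if is_mounted "$part"; then
        echo "$part is mounted; boot a Live CD or unmount it first" >&2
    else
        echo "ok to run: e2fsck -f -p $part"
    fi
done
```

From a Live CD neither partition should be listed, so both lines should report that the check can go ahead.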
L_V wrote:Any good reason not to use ext4?
One horribly corrupted file system. :(

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#6 Post by L_V »

e2fsck flags the bad sectors as "not to be used by the system".
Soul Singin' wrote:One horribly corrupted file system.
Or corrupted disk sectors... in which case that may be the wrong conclusion about ext4.
I have never had any problem with ext4.

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#7 Post by Soul Singin' »

L_V wrote:e2fsck flags the bad sectors as "not to be used by the system".
That's perfect! :)

I took another look at the logs and noticed that systemd performs some kind of file system check on /home after its (unsuccessful) attempt to unmount the /home partition. Here are the lines that caught my attention:

Code: Select all

$ journalctl -r -b -1 -p7

-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 19:05:33 EDT. --
< trimmed >
May 19 19:03:27 inspiron systemd[1]: Stopped File System Check on /dev/disk/by-uuid/7e85e962-bd1b-4892-9f39-9a25d7c8ec3d.
May 19 19:03:27 inspiron systemd[1]: systemd-fsck@dev-disk-by\x2duuid-7e85e962\x2dbd1b\x2d4892\x2d9f39\x2d9a25d7c8ec3d.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Reached target Unmount All Filesystems.
May 19 19:03:27 inspiron systemd[1]: Failed unmounting /home.
May 19 19:03:27 inspiron systemd[1]: home.mount: Mount process exited, code=killed, status=7/BUS
< trimmed >
And here is the complete context in which those five lines appear:

Code: Select all

$ journalctl -r -b -1 -p7

-- Logs begin at Sun 2019-05-19 13:22:05 EDT, end at Sun 2019-05-19 19:05:33 EDT. --
May 19 19:03:28 inspiron systemd-journald[238]: Journal stopped
May 19 19:03:28 inspiron systemd-shutdown[1]: Sending SIGTERM to remaining processes...
May 19 19:03:27 inspiron systemd-shutdown[1]: Syncing filesystems and block devices.
May 19 19:03:27 inspiron kernel: systemd-shutdow: 25 output lines suppressed due to ratelimiting
May 19 19:03:27 inspiron systemd[1]: Shutting down.
May 19 19:03:27 inspiron systemd[1]: Reached target Power-Off.
May 19 19:03:27 inspiron systemd[1]: Started Power-Off.
May 19 19:03:27 inspiron systemd[1]: systemd-poweroff.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Reached target Final Step.
May 19 19:03:27 inspiron systemd[1]: Reached target Shutdown.
May 19 19:03:27 inspiron systemd[1]: Stopped Remount Root and Kernel File Systems.
May 19 19:03:27 inspiron systemd[1]: systemd-remount-fs.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Stopped Create System Users.
May 19 19:03:27 inspiron systemd[1]: systemd-sysusers.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Stopped Create Static Device Nodes in /dev.
May 19 19:03:27 inspiron systemd[1]: systemd-tmpfiles-setup-dev.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Stopped target Local File Systems (Pre).
May 19 19:03:27 inspiron systemd[1]: Removed slice system-systemd\x2dfsck.slice.
May 19 19:03:27 inspiron systemd[1]: Stopped File System Check on /dev/disk/by-uuid/7e85e962-bd1b-4892-9f39-9a25d7c8ec3d.
May 19 19:03:27 inspiron systemd[1]: systemd-fsck@dev-disk-by\x2duuid-7e85e962\x2dbd1b\x2d4892\x2d9f39\x2d9a25d7c8ec3d.service: Succeeded.
May 19 19:03:27 inspiron systemd[1]: Reached target Unmount All Filesystems.
May 19 19:03:27 inspiron systemd[1]: Failed unmounting /home.
May 19 19:03:27 inspiron systemd[1]: home.mount: Mount process exited, code=killed, status=7/BUS
May 19 19:03:27 inspiron kernel: ata1: EH complete
May 19 19:03:27 inspiron kernel: print_req_error: I/O error, dev sda, sector 162964427
May 19 19:03:27 inspiron kernel: sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 09 b6 a3 c8 00 00 08 00
May 19 19:03:27 inspiron kernel: sd 0:0:0:0: [sda] tag#14 Add. Sense: Unrecovered read error - auto reallocate failed
May 19 19:03:27 inspiron kernel: sd 0:0:0:0: [sda] tag#14 Sense Key : Medium Error [current] 
May 19 19:03:27 inspiron kernel: sd 0:0:0:0: [sda] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
May 19 19:03:27 inspiron kernel: ata1.00: configured for UDMA/133
May 19 19:03:27 inspiron kernel: ata1.00: error: { UNC }
May 19 19:03:27 inspiron kernel: ata1.00: status: { DRDY ERR }
May 19 19:03:27 inspiron kernel: ata1.00: cmd 60/08:70:c8:a3:b6/00:00:09:00:00/40 tag 14 ncq dma 4096 in
                                          res 41/40:08:cb:a3:b6/00:00:09:00:00/00 Emask 0x409 (media error) <F>
May 19 19:03:27 inspiron kernel: ata1.00: failed command: READ FPDMA QUEUED
May 19 19:03:27 inspiron kernel: ata1.00: irq_stat 0x40000008
May 19 19:03:27 inspiron kernel: ata1.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x0
< trimmed >
L_V wrote:
Soul Singin' wrote:One horribly corrupted file system.
Or corrupted disk sectors... in which case that may be the wrong conclusion about ext4.
I have never had any problem with ext4.
I could be mistaken. I just had a bad experience and after that: "Once bitten, twice shy."

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#8 Post by L_V »

Did you check for any BIOS updates?

I have no new ideas, so try the NCQ check from my first message.
Next time you reinstall, maybe try ext4, which is the default format.
The disk problems you have now are with ext3, so what would your conclusions about ext3 be?

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#9 Post by Soul Singin' »

L_V wrote:Did you check for any BIOS updates?
There is one BIOS update. It patches a LoJack vulnerability.
L_V wrote:I have no new ideas, so try the NCQ check from my first message.
I just tried it. No luck. :( But thank you for the suggestion!
L_V wrote:Next time you reinstall, maybe try ext4, which is the default format.
The disk problems you have now are with ext3, so what would your conclusions about ext3 be?
That's a good point. I will consider ext4 in the future. In the meantime, these ext3 problems are not causing me any trouble, apart from the worry that a /home partition which does not unmount properly will cause trouble one day.

And thank you for your help! :D

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#10 Post by L_V »

If it is easy to do, and depending on your available disk capacity: boot the Live CD, move /home to another partition, reformat sda7, move your home back to sda7, and reassign the correct ownership with chown -R.
[ + Check that fstab has the correct UUID (reformatting assigns a new one), or change the /home entry to /dev/sda7, and check the ext(?) type field. ]
This is what I would do.
Last edited by L_V on 2019-05-20 15:00, edited 1 time in total.

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: /home fails to unmount at shutdown

#11 Post by p.H »

The drive has UNCorrectable sector errors, i.e. bad blocks. I doubt disabling NCQ or updating the BIOS helps against this.

Interesting SMART attributes:

Code: Select all

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
190 Airflow_Temperature_Cel 0x0022   051   036   045    Old_age   Always   In_the_past 49 (Min/Max 49/49 #998)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4
The drive currently has 4 known unreadable sectors (attribute 197). 246 sectors have already been reallocated internally because they were defective (attribute 5). A defective sector cannot be reallocated until it is either read successfully or overwritten with new data.
The drive is quite hot and has suffered from over-heating at least once. This may explain the bad blocks.
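To keep an eye on those two counters over time, the RAW_VALUE column can be pulled out of the attribute table with awk. This is a rough sketch: smart_raw is a hypothetical helper name, and the sample rows are copied from the table above rather than read from the drive.

```shell
# smart_raw is a hypothetical helper: print the RAW_VALUE column (field 10)
# of one attribute, selected by its ID in field 1, from `smartctl -A` output.
smart_raw() {
    awk -v id="$1" '$1 == id { print $10 }'
}

# Sample rows copied from the table above; on the real machine the input
# would come from: smartctl -A /dev/sda
sample='  5 Reallocated_Sector_Ct   0x0033   088   088   036    Pre-fail  Always       -       246
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       4'

reallocated=$(printf '%s\n' "$sample" | smart_raw 5)
pending=$(printf '%s\n' "$sample" | smart_raw 197)
echo "reallocated=$reallocated pending=$pending"
```

If either number keeps growing between runs, that would suggest the drive is still deteriorating.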

The kernel logs and SMART error logs show that sectors at LBA 162964427 and 201851126 were recently (not one year ago) found unreadable. According to the partition table, the former belongs to the root partition sda5 and the latter belongs to the home partition sda7.
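The first of those LBAs can be cross-checked against the Read(10) command in the kernel log quoted earlier in the thread. A minimal sketch, assuming the standard Read(10) layout (bytes 2-5 hold the starting LBA, bytes 7-8 the transfer length) and 512-byte logical sectors; the variable names are only illustrative:

```shell
# Read(10) CDB from the kernel log: 28 00 09 b6 a3 c8 00 00 08 00
# Bytes 2-5 are the starting LBA (big-endian), bytes 7-8 the sector count.
start=$((0x09b6a3c8))
count=$((0x0008))
failed=162964427    # from "print_req_error: I/O error, dev sda, sector ..."

echo "request covers LBA $start to $((start + count - 1))"
echo "failed sector is at offset $((failed - start)) in that request"
```

The request is 8 sectors long, i.e. 4096 bytes, which agrees with the "dma 4096 in" figure in the same log entry, and the reported sector 162964427 falls inside it.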

e2fsck detects and marks bad blocks only when run with the -c option.

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#12 Post by L_V »

Bad sectors are the easy part of the problem.
This one is a bit trickier: "failed command: WRITE FPDMA QUEUED" ATA errors,
which are not always caused by bad sectors.

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#13 Post by Soul Singin' »

L_V wrote:move /home to another partition, reformat sda7, and move your home back to sda7
That's a nice tip. Thank you. And because I'm running out of storage space on the disk, it would force me to back up my entire /home partition to an external device. Thanks.
p.H wrote:The drive is quite hot and has suffered from over-heating at least once.
I have to start running my machine learning projects on another machine or this one is going to melt. :oops:

And thank you for reading through the logs for me. Your explanation is very helpful.
p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.
Thank you!


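Searching around, I see that other people are experiencing similar issues:
"failed to unmount /home, /var/tmp, /tmp at shutdown" -- the author later said that KDE is the culprit
"failed to unmount /var" -- the author blamed systemd and reinstalled
"'Failed to unmount /oldroot/...' when shutting down" -- the author found that his graphics card was causing the trouble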

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: /home fails to unmount at shutdown

#14 Post by p.H »

L_V wrote:Bad sectors are the easy part of the problem.
This one is a bit trickier: "failed command: WRITE FPDMA QUEUED" ATA errors,
which are not always caused by bad sectors.
The system log is not easy to read in reverse order (-r), but it seems that both errors happen at the same time, so I suspect that the bad sectors may be the initial cause. I have seen drives going completely offline when trying hard to read bad sectors.
Soul Singin' wrote:"failed to unmount /home, /var/tmp, /tmp at shutdown" -- the author later said that KDE is the culprit
"failed to unmount /var" -- the author blamed systemd and reinstalled
"'Failed to unmount /oldroot/...' when shutting down" -- the author found that his graphics card was causing the trouble
KDE and systemd cannot be blamed for bad sectors. Neither can a graphics card, unless it overheats or disrupts the power supply.

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#15 Post by L_V »

Soul Singin' wrote: And because I'm running out of storage space on the disk
Then it is risky. If it is not too late, what does this say:

Code: Select all

df -h /dev/sda5 /dev/sda7

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#16 Post by Soul Singin' »

L_V wrote:
Soul Singin' wrote: And because I'm running out of storage space on the disk
Then it is risky.

I thought it would be the opposite of risky because following your suggestion would require me to back up ALL of my data to an external device. Do you think that all of those read operations would harm the disk?
L_V wrote:If it is not too late, what does this say:

Code: Select all

df -h /dev/sda5 /dev/sda7

Code: Select all

$ df -h /dev/sda5 /dev/sda7

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda5        11G  6.5G  3.1G  69% /
/dev/sda7       365G  323G   24G  94% /home
Too many compiled software packages. :oops:

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#17 Post by Soul Singin' »

p.H wrote:but it seems that both errors happen at the same time, so I suspect that the bad sectors may be the initial cause. I have seen drives going completely offline when trying hard to read bad sectors.
Your words link together several issues that I have been experiencing.

The output of smartctl indicates that the disk errors occurred on 28 Aug 2016, but I did not notice any problems until I upgraded from Wheezy to Stretch, two years later. smartctl also indicates that there were bad sectors on both the / (root) and /home partitions.

So I wonder if the bad sectors on / (root) caused the "black screen of death" issues that I had with Debian Stretch. I wonder if the drive was "going completely offline" just like you described.

It's too late to know now, because I reformatted the / (root) partition when I installed Buster. The reinstall resolved the "black screen of death" issue, but now I am having trouble with the /home partition, which I did not reformat.
p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.
I did not use that option when I ran the file system checks from a Live CD. Could that have been my mistake? Is that why I'm still having trouble with the /home partition?

We'll find out soon enough! :)

p.H wrote:KDE and systemd cannot be blamed for bad sectors. Neither can a graphics card
Please excuse me. I was trying to summarize several posts in a handful of words.

I was just trying to say that several other people are seeing their partitions not unmount during shutdown. The issues that they experienced were similar, but they all pointed their fingers at different software packages, graphics cards, etc. I should have expressed that point more clearly.


I'm going to try e2fsck -c and see what happens. Wish me luck!

Thanks again for your help! I will let you know what I find.

L_V
Posts: 1477
Joined: 2007-03-19 09:04
Been thanked: 11 times

Re: /home fails to unmount at shutdown

#18 Post by L_V »

This will give you more info:

Code: Select all

sudo tune2fs -l /dev/sda7
If you take into account the x% of capacity that the file system reserves, I wonder whether your partition is very close to being full.
I have had bad experiences with full partitions.
Also see:

Code: Select all

man badblocks
badblocks - search a device for bad blocks
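The reserved share mentioned above can be estimated from the df -h figures posted earlier in the thread. This is only a back-of-the-envelope sketch: the inputs are df's rounded values, and the exact numbers would come from the "Block count" and "Reserved block count" lines of tune2fs -l.

```shell
# Figures taken from the df -h output for /dev/sda7 (GiB, rounded by df).
size=365; used=323; avail=24

# Space that is neither used nor offered to ordinary users is mostly
# the root-reserved area of the ext3 file system.
reserved=$((size - used - avail))
pct=$(awk -v r="$reserved" -v s="$size" 'BEGIN { printf "%.1f", r * 100 / s }')
echo "reserved: about ${reserved}G (${pct}% of ${size}G)"
```

That is close to the 5% that mke2fs reserves by default, so the 24G shown as available already has the reserve subtracted.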

p.H
Global Moderator
Posts: 3049
Joined: 2017-09-17 07:12
Has thanked: 5 times
Been thanked: 132 times

Re: /home fails to unmount at shutdown

#19 Post by p.H »

Soul Singin' wrote:The output of smartctl indicates that the disk errors occurred on 28 Aug 2016
Where do you see that date?

Soul Singin'
Posts: 1605
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

#20 Post by Soul Singin' »

p.H wrote:
Soul Singin' wrote:The output of smartctl indicates that the disk errors occurred on 28 Aug 2016
Where do you see that date?
Soul Singin' wrote:

Code: Select all

Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)
994 days ago corresponds to 28 Aug 2016. I remember the event too because I had to teach my first class of the semester the very next day and I was frightened.
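The hours-to-days conversion in that smartctl line is easy to verify with shell arithmetic. One caveat, stated here as an assumption rather than something the thread settles: the figure counts powered-on hours at the moment of the error, so treating it as calendar days is only an approximation.

```shell
# Power-on lifetime reported by smartctl for error 983.
hours=23874
days=$((hours / 24))
rest=$((hours % 24))
echo "$hours hours = $days days + $rest hours"
```

Counting 994 calendar days back from the log date of 19 May 2019 does land on 28 Aug 2016, which matches the date given above.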

Back in the present, I can confirm that bad blocks are causing my troubles. When I ran e2fsck from a Live CD, it found a bad block in /usr/bin/umount (and several other files).

Right now, I'm trying to clean them up with:

Code: Select all

e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7
I'll keep you posted. In the meantime, thank you! Just knowing what happened is a relief.
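For anyone following along, here is what each flag in those two commands does, according to the e2fsck man page. The sketch below is a dry run that only prints the commands; bb_check_cmds is a hypothetical helper name, and the device names are the ones from this thread.

```shell
# Dry run: print the commands instead of executing them.
# -f   force a check even if the file system is marked clean
# -y   answer "yes" to every question
# -cc  run badblocks with a non-destructive read-write test
#      (a single -c does a read-only scan instead)
# -C0  show a progress bar
bb_check_cmds() {
    for part in /dev/sda5 /dev/sda7; do
        echo "e2fsck -f -y -cc -C0 $part"
    done
}

bb_check_cmds
```

The printed lines are meant to be run as root from a Live CD, with both partitions unmounted.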
