[SOLVED] repairing a system with bad blocks on the hard disk


Re: /home fails to unmount at shutdown

Postby Soul Singin' » 2019-05-20 18:49

L_V wrote:
Soul Singin' wrote: And because I'm running out of storage space on the disk
Then risky.

I thought it would be the opposite of risky because following your suggestion would require me to back up ALL of my data to an external device. Do you think that all of those read operations would harm the disk?

L_V wrote: If it is not too late, what does this say?
Code: Select all
df -h /dev/sda5 /dev/sda7

Code: Select all
$ df -h /dev/sda5 /dev/sda7

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda5        11G  6.5G  3.1G  69% /
/dev/sda7       365G  323G   24G  94% /home

Too many compiled software packages. :oops:
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

Postby Soul Singin' » 2019-05-20 18:55

p.H wrote:but it seems that both errors happen at the same time, so I suspect that the bad sectors may be the initial cause. I have seen drives going completely offline when trying hard to read bad sectors.

Your words link together several issues that I have been experiencing.

The output of smartctl indicates that the disk errors occurred on 28 Aug 2016, but I did not notice any problems until I upgraded from Wheezy to Stretch, two years later. smartctl also indicates that there were bad sectors on both the / (root) and /home partitions.

So I wonder if the bad sectors on / (root) caused the "black screen of death" issues that I had with Debian Stretch. I wonder if the drive was "going completely offline" just like you described.

It's too late to know now, because I reformatted the / (root) partition when I installed Buster. The reinstall resolved the "black screen of death" issue, but now I am having trouble with the /home partition, which I did not reformat.

p.H wrote:e2fsck detects and marks bad blocks only when run with the -c option.

I did not use that option when I ran the file system checks from a Live CD. Could that have been my mistake? Is that why I'm still having trouble with the /home partition?

We'll find out soon enough! :)


p.H wrote: KDE and systemd cannot be blamed for bad sectors. Neither can a graphics card

Please excuse me. I was trying to summarize several posts in a handful of words.

I was just trying to say that several other people are seeing their partitions not unmount during shutdown. The issues that they experienced were similar, but they all pointed their fingers at different software packages, graphics cards, etc. I should have expressed that point more clearly.


I'm going to try e2fsck -c and see what happens. Wish me luck!

Thanks again for your help! I will let you know what I find.
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

Postby L_V » 2019-05-20 19:23

This will give you more info:
Code: Select all
sudo tune2fs -l /dev/sda7
If you take the x% reserved capacity of the file system into account, I wonder whether your partition is very close to being full.
I have had bad experiences with full partitions....
+ see
Code: Select all
man badblocks
badblocks - search a device for bad blocks
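
For context on the reserved capacity mentioned above: ext file systems set aside a share of blocks for root (5% is the mke2fs default), and df leaves that share out of the "Avail" column. In the df output earlier in the thread, 365G - 323G - 24G leaves roughly 18G, which is about 5% of the partition. A minimal sketch for reading the actual percentage from the tune2fs listing follows; the helper name reserved_pct is made up for illustration, and it assumes the "Block count:" and "Reserved block count:" fields that tune2fs -l prints.

```shell
# reserved_pct (hypothetical helper): reads `tune2fs -l` output on stdin and
# prints the share of blocks reserved for root, as a percentage.
reserved_pct() {
    awk -F: '
        /^Block count:/          { total = $2 + 0 }      # total blocks in the file system
        /^Reserved block count:/ { reserved = $2 + 0 }   # blocks held back for root
        END {
            if (total > 0) printf "%.1f\n", 100 * reserved / total
        }'
}

# Usage (needs root, reads the superblock only):
#   sudo tune2fs -l /dev/sda7 | reserved_pct
```

The function only parses text, so it can also be run against a saved copy of the tune2fs output.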
L_V
 
Posts: 1270
Joined: 2007-03-19 09:04

Re: /home fails to unmount at shutdown

Postby p.H » 2019-05-20 20:46

Soul Singin' wrote:The output of smartctl indicates that the disk errors occurred on 28 Aug 2016

Where do you see that date?
p.H
 
Posts: 1521
Joined: 2017-09-17 07:12

Re: /home fails to unmount at shutdown

Postby Soul Singin' » 2019-05-20 21:55

p.H wrote:
Soul Singin' wrote:The output of smartctl indicates that the disk errors occurred on 28 Aug 2016

Where do you see that date?
Soul Singin' wrote:
Code: Select all
Error 983 occurred at disk power-on lifetime: 23874 hours (994 days + 18 hours)

994 days ago corresponds to 28 Aug 2016. I remember the event too because I had to teach my first class of the semester the very next day and I was frightened.

Back in the present, I can confirm that bad blocks are causing my troubles. When I ran e2fsck from a Live CD, it found a bad block in /usr/bin/umount (and several other files).

Right now, I'm trying to clean them up with:

Code: Select all
e2fsck -f -y -cc -C0 /dev/sda5
e2fsck -f -y -cc -C0 /dev/sda7

I'll keep you posted. In the meantime, thank you! Just knowing what happened is a relief.
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

Postby Soul Singin' » 2019-05-21 03:32

Soul Singin' wrote:
Code: Select all
at disk power-on lifetime

Whoops. Looks like I misread that one. :oops:
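
For anyone who reads that line the same way: the figure in the smartctl error log is the disk's cumulative power-on time at the moment the error was logged, not the time elapsed since the error. So it cannot be turned into a calendar date on its own. The "994 days + 18 hours" part is just the hour count converted, as this small sketch shows (the function name hours_to_days is made up for illustration).

```shell
# hours_to_days (hypothetical helper): convert a power-on hour count into
# whole days plus leftover hours, the same way smartctl prints it.
hours_to_days() {
    hours=$1
    echo "$((hours / 24)) days + $((hours % 24)) hours"
}

hours_to_days 23874   # the value from "Error 983" above
```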


I'm still waiting for the /home partition to finish checking. That's going to run all night.

I checked the / (root) partition first. From the e2fsck output below, the first damage appears to have occurred on:
Code: Select all
(inode #219834, mod time Sat Sep 15 19:12:39 2018)

But I do remember a bad day on 28 Aug 2016. :lol:

Here's the list of affected files:

Code: Select all
/usr/lib/x86_64-linux-gnu/security/pam_debug.so
/usr/share/locale/da/LC_MESSAGES/adduser.mo
/usr/lib/x86_64-linux-gnu/libgcrypt.so.20.2.4
/usr/bin/stdbuf
/usr/bin/umount
/usr/share/perl5/Debconf/Element/Web/String.pm
/usr/share/locale/ro/LC_MESSAGES/avahi.mo
/usr/lib/x86_64-linux-gnu/pspp/libpspp-1.2.0.so
/var/lib/dpkg/info/util-linux.conffiles
/var/lib/dpkg/info/keyboard-configuration.templates
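
One way to pull a list like this out of a saved e2fsck log is sketched here. It assumes the log contains lines of the form "File PATH (inode #N, mod time ...)", as e2fsck 1.44.5 prints them in this thread; the function name affected_files and the log file name e2fsck.log are made up for illustration.

```shell
# affected_files (hypothetical helper): reads an e2fsck log on stdin and
# prints the path from every "File ... (inode #N, mod time ...)" line.
affected_files() {
    sed -n 's/^File \(.*\) (inode #[0-9]*, mod time .*)$/\1/p'
}

# Usage, assuming the e2fsck output was saved to e2fsck.log:
#   affected_files < e2fsck.log
```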

And here's the full e2fsck output:

Code: Select all
root@debian:~# e2fsck -f -y -cc -C0 /dev/sda5
e2fsck 1.44.5 (15-Dec-2018)
Checking for bad blocks (non-destructive read-write test)
Testing with random pattern: done                                                 
/dev/sda5: Updating bad block inode.
Pass 1: Checking inodes, blocks, and sizes
                                                                               
Running additional passes to resolve blocks claimed by more than one inode...
Pass 1B: Rescanning for multiply-claimed blocks
Multiply-claimed block(s) in inode 219103: 899957
Multiply-claimed block(s) in inode 219834: 896206
Multiply-claimed block(s) in inode 219962: 896387
Multiply-claimed block(s) in inode 220385: 901267
Multiply-claimed block(s) in inode 221791: 897757
Multiply-claimed block(s) in inode 222233: 899741
Multiply-claimed block(s) in inode 225407: 900354
Multiply-claimed block(s) in inode 337232: 864543
Multiply-claimed block(s) in inode 544329: 897733
Multiply-claimed block(s) in inode 544414: 860550
Pass 1C: Scanning directories for inodes with multiply-claimed blocks
Pass 1D: Reconciling multiply-claimed blocks
(There are 10 inodes containing multiply-claimed blocks.)

File /usr/lib/x86_64-linux-gnu/security/pam_debug.so (inode #219103, mod time Thu Feb 14 07:08:47 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 899957 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/share/locale/da/LC_MESSAGES/adduser.mo (inode #219834, mod time Sat Sep 15 19:12:39 2018)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 896206 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/lib/x86_64-linux-gnu/libgcrypt.so.20.2.4 (inode #219962, mod time Sun Jan 20 13:47:23 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 896387 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/bin/stdbuf (inode #220385, mod time Thu Feb 28 15:30:31 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 901267 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/bin/umount (inode #221791, mod time Thu Jan 10 08:30:43 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 897757 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/share/perl5/Debconf/Element/Web/String.pm (inode #222233, mod time Tue Feb 26 09:30:35 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 899741 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/share/locale/ro/LC_MESSAGES/avahi.mo (inode #225407, mod time Wed Oct 10 08:17:36 2018)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 900354 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /usr/lib/x86_64-linux-gnu/pspp/libpspp-1.2.0.so (inode #337232, mod time Tue Apr 23 11:59:03 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 864543 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /var/lib/dpkg/info/util-linux.conffiles (inode #544329, mod time Thu Jan 10 08:30:43 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 897733 (Input/output error).  Ignore error? yes

Force rewrite? yes

File /var/lib/dpkg/info/keyboard-configuration.templates (inode #544414, mod time Sat Mar 23 20:13:24 2019)
  has 1 multiply-claimed block(s), shared with 1 file(s):
   <The bad blocks inode> (inode #1, mod time Mon May 20 17:36:13 2019)
Clone multiply-claimed blocks? yes

Error reading block 860550 (Input/output error).  Ignore error? yes

Force rewrite? yes

Pass 2: Checking directory structure
Pass 3: Checking directory connectivity                                       
Pass 4: Checking reference counts
Pass 5: Checking group summary information                                     
Free blocks count wrong for group #0 (5258, counted=5248).
Fix? yes

Free blocks count wrong for group #26 (5522, counted=5524).
Fix? yes

Free blocks count wrong for group #27 (12395, counted=12403).
Fix? yes

                                                                               
/dev/sda5: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda5: 218757/673296 files (6.6% non-contiguous), 1757988/2690404 blocks


Wish me luck!
- Soul
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02

Re: /home fails to unmount at shutdown

Postby p.H » 2019-05-21 06:47

Not surprisingly, the disk had more bad sectors than initially reported by SMART. Expect some more in the other partitions too and consider replacing the disk.

Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.
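
A rough sketch of how the owning packages could be identified: dpkg -S prints lines of the form "package: /path" (or "package:arch: /path" for multi-arch packages), and the package name is everything before the final ": /". The helper name pkg_names is made up for illustration. On a system with a merged /usr, dpkg -S may not find a path such as /usr/bin/umount if the package registered it under /bin, so treat the lookup as a starting point.

```shell
# pkg_names (hypothetical helper): reads `dpkg -S` output on stdin and
# prints each owning package once.
pkg_names() {
    sed 's/: \/.*$//' | sort -u   # strip ": /path", then de-duplicate
}

# Usage on the repaired system (or in a chroot of it):
#   dpkg -S /usr/bin/stdbuf | pkg_names
#   apt-get install --reinstall PACKAGE   # PACKAGE as printed above
```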
p.H
 
Posts: 1521
Joined: 2017-09-17 07:12

Re: repairing a system with bad blocks on the hard disk

Postby Soul Singin' » 2019-05-21 17:51

p.H wrote:Expect some more in the other partitions too and consider replacing the disk.

The machine is 10 years old. It ran Debian Lenny. Replacing the disk means replacing the computer.

p.H wrote:Note that e2fsck can remap bad blocks but cannot restore the unreadable contents of the affected files, so these files must be reinstalled from their respective packages.

The machine did not reboot, so I'm going to try copying those files from the Live CD.

If that does not work, then I will have to reconsider my options. Any attempt at reinstallation would have to account for those bad blocks and also account for the installer's assumption of no bad blocks. The installer would reformat the disk without checking for bad blocks. It might work, but the system would be prone to all kinds of errors.
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02

Re: repairing a system with bad blocks on the hard disk

Postby p.H » 2019-05-21 18:28

Soul Singin' wrote: Any attempt at reinstallation would have to account for those bad blocks and also account for the installer's assumption of no bad blocks. The installer would reformat the disk without checking for bad blocks.

You can choose to not format the partitions during the installation, to preserve the bad block list. This does not apply to the swap partition, so check that it has no bad sectors with badblocks.

Or you can try to force the embedded disk controller to reallocate all bad sectors by writing them, for example with badblocks -w (it will destroy all data on the disk). You may be lucky. But SMART shows that the disk already has many reallocated sectors, so I am afraid that bad sectors may just grow over time.
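
To make the difference between the two badblocks runs explicit: without -n or -w, badblocks only reads the device, while -w overwrites every block with test patterns. The sketch below just prints the command for review rather than running it. The function name badblocks_cmd is made up, and /dev/sdaX is a placeholder because the thread does not name the swap partition.

```shell
# badblocks_cmd (hypothetical helper): print the badblocks command for a
# given mode so it can be checked before being run by hand.
#   read  -> read-only scan (the default badblocks mode), safe for data
#   write -> destructive write test (-w), erases everything on the device
badblocks_cmd() {
    mode=$1
    dev=$2
    case $mode in
        read)  echo "badblocks -sv $dev" ;;    # -s progress, -v verbose
        write) echo "badblocks -wsv $dev" ;;   # -w DESTROYS all data on $dev
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

# Usage (placeholder device name):
#   badblocks_cmd read /dev/sdaX
```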
p.H
 
Posts: 1521
Joined: 2017-09-17 07:12

Re: repairing a system with bad blocks on the hard disk

Postby Soul Singin' » 2019-05-21 18:49

p.H wrote:You can choose ... Or you can try to force ...

I'm back in! Thank you!

Copying the files worked. It makes perfect sense, but I still cannot believe it. Thank you!
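
For readers in the same situation, here is a minimal sketch of that kind of copy. It is not necessarily the exact procedure used here. It assumes the damaged root partition is mounted somewhere like /mnt from the live system, and that the live system carries the same versions of the affected files. The function name restore_files and the file name affected.txt are made up for illustration.

```shell
# restore_files (hypothetical helper): for each path read from stdin, copy the
# file from a known-good root (src_root) over the damaged one (dst_root).
restore_files() {
    src_root=$1   # e.g. "" for the live system's own root
    dst_root=$2   # e.g. /mnt, where the damaged partition is mounted
    while IFS= read -r path; do
        if [ -f "$src_root$path" ]; then
            # -a keeps mode, ownership and timestamps of the good copy
            cp -a "$src_root$path" "$dst_root$path" && echo "restored $path"
        else
            echo "missing in source: $path" >&2
        fi
    done
}

# Usage, with one affected path per line in affected.txt:
#   mount /dev/sda5 /mnt
#   restore_files "" /mnt < affected.txt
```

Files that the live system does not have are reported on stderr and skipped, so they can be reinstalled from their packages afterwards.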

I will post more soon! :D

EDIT: I placed a full description of the repair that I ran in the original post.
Soul Singin'
 
Posts: 1607
Joined: 2008-12-21 07:02
