HDD errors


Ltlbkofjim · #1 Post by **Ltlbkofjim** » 2018-11-19 19:30

Hi

Not sure if this is the right place to request help with this, if not please direct me to a more appropriate group.

I had a small incident with my Raspberry Pi running Raspbian the other day: a power interruption left the external HDD (ext4) unable to mount on boot, which dropped the Pi into single-user emergency mode.
After I repaired the superblock by replacing it with one of the backups, the drive now mounts and the Pi boots correctly.
However, it’s displaying some odd behaviour. The files are still on the drive and all still accessible, but du reports 186GB under the mount point while df reports the filesystem as only 77MB in size. I’ve seen df report higher usage than du before, due to deleted files still held open by running processes, but I’ve never seen it report a practically empty drive that has plenty of data on it.
Running fsck on the drive initially reports a file system with errors, but it gets to “Pass 5: Checking group summary information” and then just reports that fsck exited with signal 9, or simply “Killed”. I am not killing the process, nor is any other user.
I’m currently running badblocks on the drive, but given its size this looks like it could take a few days; no errors found so far at 54%.
As I said, I still have access to all the files and the data looks intact, but I worry about carrying on using the drive if there’s something wrong underneath that I’m missing.

Any ideas welcome

Thanks
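
For anyone wanting to reproduce the du/df comparison in a form that is easy to paste into a reply, here is a minimal Python sketch. It is not what was run in this thread; the function names (`apparent_size`, `filesystem_usage`, `compare`) are made up for illustration. It assumes the path given is the mount point itself with no other filesystems mounted underneath, and it sums apparent file sizes, so it only roughly matches what du prints.

```python
import os


def apparent_size(path):
    """Sum the sizes of all files under path (roughly `du --apparent-size`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                # lstat: count a symlink as a link, do not follow it
                total += os.lstat(full).st_size
            except OSError:
                # unreadable or vanished entries are skipped, not fatal
                continue
    return total


def filesystem_usage(path):
    """Return (total, used, available) bytes as the filesystem reports them."""
    st = os.statvfs(path)
    total = st.f_blocks * st.f_frsize
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    available = st.f_bavail * st.f_frsize
    return total, used, available


def compare(path):
    """Put the walked size next to the filesystem's own counters."""
    walked = apparent_size(path)
    total, used, available = filesystem_usage(path)
    return {
        "walked": walked,
        "fs_total": total,
        "fs_used": used,
        "fs_available": available,
        # Files adding up to more than the filesystem claims to hold at all
        # suggests the on-disk summary counters are inconsistent.
        "suspicious": walked > total,
    }
```

Calling `compare()` on the mount point and posting the resulting dictionary shows both views side by side. A `suspicious` value of `True` would match the symptom described above, where the files add up to far more than the size df reports.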

milomak · #2 Post by **milomak** » 2018-11-20 17:32

it is often easier for us reading your posts if you actually post the output of the commands you have run. so we can see the actual results rather than trying to build them in our minds.

you may also find that someone may spot something else you have missed.

also post the dmesg output for when that drive is being mounted

Segfault · #3 Post by **Segfault** » 2018-11-20 17:46

First, you should check the SMART data. However, SMART may not report errors it is not aware of. To make it aware, you need to force a write to the blocks. This is where badblocks -n comes in. After running it, run smartctl -t long /dev/<device>. After that finishes, run smartctl -a and see if there are any errors. OTOH, if the test errors out and does not finish, then it is time to replace the drive.
The next layer is the filesystem. There is not much point repairing it if there are indeed bad blocks and I/O errors.
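
The sequence above can be written down as a small Python sketch that only builds the command lines and runs nothing. The helper names (`smart_check_plan`, `render`) are hypothetical, and the `-s` and `-v` flags on badblocks (progress and verbose output) are an addition not mentioned in the post. The device name is whatever the drive is on the system in question; `/dev/sda` in the usage note is only a placeholder.

```python
import shlex


def smart_check_plan(device):
    """Return the commands from the post, in order, as argument lists."""
    if not device.startswith("/dev/"):
        raise ValueError("expected a device node such as /dev/sdX")
    return [
        # Non-destructive read-write pass, so the drive touches every block.
        # -s and -v only add progress and verbose output (my addition).
        ["badblocks", "-n", "-s", "-v", device],
        # Start the drive's extended self-test.
        ["smartctl", "-t", "long", device],
        # Afterwards, dump all SMART information and look for errors.
        ["smartctl", "-a", device],
    ]


def render(plan):
    """Turn the argument lists into shell-ready strings for copy and paste."""
    return [shlex.join(cmd) for cmd in plan]
```

For example, `render(smart_check_plan("/dev/sda"))` gives three lines to run as root, one after the other. Two things to keep in mind: badblocks refuses a read-write test on a mounted device, so the filesystem has to be unmounted first, and smartctl -t long only starts the self-test and returns straight away, so smartctl -a is worth running only once the drive reports the test as finished. Some USB enclosures may also need an extra device-type option for smartctl; that depends on the hardware.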

debiman · #4 Post by **debiman** » 2018-11-20 18:23

i take it the drive does not contain the operating system?
but couldn't that also have been damaged?
maybe you should check the root partition as well.

apart from that the answers to your problem (data loss through power outage) are just a few web searches away.
but we are here to help, regardless, so please do provide what was requested & answer our questions.

milomak · #5 Post by **milomak** » 2018-11-20 18:37

i used to run raspbian (on the original pi) some years ago and this was very common when a power failure happened

my external was an lvm setup (ext4) which required a bit more work to fix. but usually once i had recovered the lvm, an e2fsck would work.

Ltlbkofjim · #6 Post by **Ltlbkofjim** » 2018-11-20 21:06

Thanks for the responses, all - hopefully I'll answer all your questions.

milomak wrote:it is often easier for us reading your posts if you actually post the output of the commands you have run. so we can see the actual results rather than trying to build them in our minds.

you may also find that someone may spot something else you have missed.

also post the dmesg output for when that drive is being mounted

Here are the original df and du commands that led me to see something was wrong