FSCK code 4


Re: FSCK code 4

Postby alexankius » 2019-01-17 12:34

It's relatively easy for me to reinstall Debian.

The issue is that I have a ZFS pool (iSCSI SCST) that I cannot lose, and I am not sure whether I can just import it without first exporting it.

I would appreciate your feedback.
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14

Re: FSCK code 4

Postby Segfault » 2019-01-17 12:51

Aren't you doing it the hard way?
I'd boot from external media (I have a SystemRescueCD USB stick for this purpose) and diagnose and repair as needed. SystemRescueCD allows you to run a GUI and post on forums, and to copy and paste from a terminal window, so there is no need for clumsy images. If you need to know what filesystem is on your device, use 'file -s /dev/sdh1'.
Segfault
 
Posts: 895
Joined: 2005-09-24 12:24

Re: FSCK code 4

Postby p.H » 2019-01-17 22:12

Looks like a defective disk or SATA link.
p.H
 
Posts: 1011
Joined: 2017-09-17 07:12

Re: FSCK code 4

Postby bw123 » 2019-01-18 00:38

I usually use kernel parameter fsck.mode=force appended to grub 'linux' line to check root device. Sometimes checks are skipped if the journal says the fs is clean.
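A minimal sketch of how that parameter reads on a kernel command line, assuming boot-time checks are handled by systemd-fsck (which documents fsck.mode=auto, force and skip, with auto as the default). The sample command line is hypothetical, not taken from this machine.

```python
def fsck_mode(cmdline):
    """Return the fsck.mode value found on a kernel command line.

    Falls back to "auto", the documented systemd-fsck default, when the
    parameter is absent. If it appears more than once, the last one wins
    here (an assumption of this sketch).
    """
    mode = "auto"
    for token in cmdline.split():
        if token.startswith("fsck.mode="):
            mode = token.split("=", 1)[1]
    return mode


# Hypothetical command line, as it might look after editing the grub 'linux' line.
example = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sdh1 ro quiet fsck.mode=force"
print(fsck_mode(example))
```

On a running system the same function can be fed the contents of /proc/cmdline to confirm the parameter actually reached the kernel.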

Are you sure the fs you are checking is the bad one? Try fsck on them all. Make sure you have a decent backup before doing too much, though. Using a live system is a good idea.

Is it possible the fs is full? Or, if you don't have a recovery USB/CD, can you run any other commands from busybox that might help gather more information? Do a web search and read a lot before doing possibly destructive things. On rescue media like SystemRescueCD, tune2fs -l would be the first thing I'd look at.

Also check the manual page fsck.ext4(8) for the -p option, and the -v flag to gather more info.
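For reference, the "code 4" in the thread title is presumably an fsck exit status. According to fsck(8) the status is a sum of bit flags, so a small decoder like the sketch below can spell out what a given number means. The function name is made up for illustration.

```python
# Bit flags documented in fsck(8); the exit status is the bitwise OR of these.
FSCK_FLAGS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the meaning of every flag set in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in FSCK_FLAGS.items() if status & bit]


for status in (0, 4, 12):
    print(status, decode_fsck_status(status))
```

A status of 4 on its own therefore means fsck found errors and could not fix them.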
bw123
 
Posts: 3787
Joined: 2011-05-09 06:02

Re: FSCK code 4

Postby llivv » 2019-01-18 02:53

The flush cache error on an SSD is a bugger. Live boot, or install to another disk
if possible, so you have a full system to check the disk with.
It would be great if you could recover the ssd and document it here.
Although what I'm seeing seems to point to disk failure.
And I can't find much regarding the flush cache error.

Also get the make and model of the ssd and see if it includes hardware trim.
Or post the ssd make and model here.
In memory of Ian Ashley Murdock (1973 - 2015) founder of the Debian project.
llivv
 
Posts: 5488
Joined: 2007-02-14 18:10
Location: cold storage

Re: FSCK code 4

Postby alexankius » 2019-01-18 12:08

Booted from a RescueCD that includes ZFS. Here's the outcome of fsck and the filesystem type check:

root@sysresccd /root % fsck /dev/sdh1
fsck from util-linux 2.30.2
e2fsck 1.43.6 (29-Aug-2017)
/dev/sdh1: recovering journal
Superblock needs_recovery flag is clear, but journal has data.
Run journal anyway<y>? yes
fsck.ext4: Unknown code ____ 251 while recovering journal of /dev/sdh1
fsck.ext4: unable to set superblock flags on /dev/sdh1


/dev/sdh1: ********** WARNING: Filesystem still has errors **********

root@sysresccd /root % file -s /dev/sdh1
/dev/sdh1: Linux rev 1.0 ext4 filesystem data, UUID=8c202284-1a8f-47a2-bafa-cd07e901f5a7 (needs journal recovery) (extents) (64bit) (large files) (huge files)


And also the output of hdparm:
root@sysresccd /root % hdparm -I /dev/sdh

/dev/sdh:

ATA device, with non-removable media
Model Number: SATAFIRM S11
Serial Number: 16467027000122390215
Firmware Revision: SBFM50W8
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
Supported: 11 10 9 8 7 6 5
Likely used: 11
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 234441648
LBA48 user addressable sectors: 234441648
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 114473 MBytes
device size with M = 1000*1000: 120034 MBytes (120 GB)
cache/buffer size = unknown
Form Factor: 2.5 inch
Nominal Media Rotation Rate: Solid State Device
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Gen3 signaling speed (6.0Gb/s)
* Native Command Queueing (NCQ)
* Phy event counters
* READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* DOWNLOAD MICROCODE DMA command
* SET MAX SETPASSWORD/UNLOCK DMA commands
* WRITE BUFFER DMA command
* READ BUFFER DMA command
* DEVICE CONFIGURATION SET/IDENTIFY DMA commands
* Data Set Management TRIM supported (limit 8 blocks)
Security:
Master password revision code = 65534
supported
not enabled
not locked
frozen
not expired: security count
supported: enhanced erase
6min for SECURITY ERASE UNIT. 60min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 5000000000000000
NAA : 5
IEEE OUI : 000000
Unique ID : 000000000
Checksum: correct
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14

Re: FSCK code 4

Postby bw123 » 2019-01-18 17:32

Have you confirmed the drive's health by checking smart status? Look for anything odd, even if it passes. In general, higher VALUEs are better and anything under 100 (or something out of place like 132) is suspicious unless you know what it is. If you have saved smart info on a regular schedule, it's easy to spot problems.
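As a rough illustration of "look for anything odd, even if it passes", the sketch below scans smartctl attribute rows and flags two things: a normalized VALUE at or below its THRESH, and a nonzero raw count on a few attributes that usually indicate media trouble. The watched ID list (5, 187, 196, 197, 198) is my own assumption, not something smartctl defines, and the sample rows are copied from the smartctl output posted later in this thread.

```python
# Rows copied from the "smartctl --all" output posted later in this thread.
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000b 100 100 050 Pre-fail Always  - 75842
  5 Reallocated_Sector_Ct   0x0013 100 100 050 Pre-fail Always  - 446
187 Reported_Uncorrect      0x0012 100 100 000 Old_age  Always  - 446
196 Reallocated_Event_Count 0x0000 100 100 000 Old_age  Offline - 446
"""

# Assumption: attribute IDs whose raw value should normally stay at zero.
WATCHED_IDS = {5, 187, 196, 197, 198}


def suspicious_attributes(report):
    """Return (id, name, reason) tuples for rows that deserve a closer look."""
    findings = []
    for line in report.splitlines():
        fields = line.split()
        # Expect: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name = int(fields[0]), fields[1]
        value, thresh, raw = int(fields[3]), int(fields[5]), fields[9]
        if thresh > 0 and value <= thresh:
            findings.append((attr_id, name, "VALUE at or below THRESH"))
        elif attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > 0:
            findings.append((attr_id, name, "nonzero raw count: " + raw))
    return findings


for finding in suspicious_attributes(SAMPLE):
    print(finding)
```

Note that the drive still reports PASSED overall, because none of the normalized values has dropped to its threshold. The raw counts are what give it away.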

If you have another partition on the same disk and can write normally to it, then I'd think it's only a superblock/fs error. There are ways to use another superblock for fsck, but I won't go into it. There are a lot of threads on the net with the error messages you posted.

hdparm says *udma2 that doesn't look right, my ssd is running *udma6
also, hdparm should report 'not' frozen? Is there a bios pw or something?
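For anyone unsure how to read that line: hdparm marks the currently selected transfer mode with an asterisk. A small sketch that pulls the active and the highest listed mode out of the DMA line from the output above (function name is hypothetical):

```python
DMA_LINE = "DMA: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4 udma5 udma6"


def dma_modes(line):
    """Return (active_mode, highest_listed_mode) from an hdparm -I DMA line."""
    modes = line.split(":", 1)[1].split()
    active = None
    for mode in modes:
        if mode.startswith("*"):
            active = mode[1:]
    highest = modes[-1].lstrip("*")
    return active, highest


active, highest = dma_modes(DMA_LINE)
print("active:", active, "highest listed:", highest)
```

Here the drive lists modes up to udma6 but has udma2 selected, which is the mismatch being pointed out.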

If the thing is dead, it's dead. You can probably still read from it fine, though?

p.s. code blocks make output easier to read.
bw123
 
Posts: 3787
Joined: 2011-05-09 06:02

Re: FSCK code 4

Postby alexankius » 2019-01-21 07:41

Apologies, this is a lengthy post. Here's some additional output:

FDISK
root@sysresccd / % fdisk -l /dev/sdh
Disk /dev/sdh: 111.8 GiB, 120034123776 bytes, 234441648 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0xb8f134ba

Device Boot Start End Sectors Size Id Type
/dev/sdh1 * 2048 167956479 167954432 80.1G 83 Linux
/dev/sdh2 167958526 234440703 66482178 31.7G 5 Extended
/dev/sdh5 167958528 234440703 66482176 31.7G 82 Linux swap / Solaris
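As a sanity check on the numbers above, the sizes follow directly from the sector counts and the 512-byte sector size. This is only a worked-arithmetic sketch using the figures from the fdisk and hdparm output. It says nothing about the health of the disk.

```python
SECTOR_SIZE = 512  # bytes, from "Sector size (logical/physical)"

# (start, end) sector pairs copied from the fdisk listing.
PARTITIONS = {
    "sdh1": (2048, 167956479),
    "sdh2": (167958526, 234440703),
    "sdh5": (167958528, 234440703),
}
DISK_SECTORS = 234441648


def sector_count(start, end):
    """Number of sectors in an inclusive start..end range."""
    return end - start + 1


def gib(sectors):
    """Size in GiB (2**30 bytes) of a run of sectors."""
    return sectors * SECTOR_SIZE / 2**30


print("disk bytes:", DISK_SECTORS * SECTOR_SIZE)
print("disk MiB:  ", DISK_SECTORS * SECTOR_SIZE // 2**20)
for name, (start, end) in PARTITIONS.items():
    count = sector_count(start, end)
    print(name, count, "sectors,", round(gib(count), 1), "GiB")
```

The results match what fdisk and hdparm report, so the partition table itself is at least self-consistent.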


Below is the SMART info:

root@sysresccd /root % smartctl -t long /dev/sdh
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.15-std520-amd64] (local build)

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Command "Execute SMART Extended self-test routine immediately in off-line mode" failed: scsi error badly formed scsi parameters

root@sysresccd / % smartctl --health /dev/sdh
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.15-std520-amd64] (local build)

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

root@sysresccd / % smartctl --all /dev/sdh
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.14.15-std520-amd64] (local build)

=== START OF INFORMATION SECTION ===
Device Model: SATAFIRM S11
Serial Number: 16467027000122390215
LU WWN Device Id: 5 000000 000000000
Firmware Version: SBFM50W8
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: 2.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-4 (minor revision not indicated)
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 1.5 Gb/s)
Local Time is: Mon Jan 21 08:21:05 2019 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (65535) seconds.
Offline data collection
capabilities: (0x79) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 4) minutes.
Extended self-test routine
recommended polling time: ( 32) minutes.
Conveyance self-test routine
recommended polling time: ( 8) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 050 Pre-fail Always - 75842
5 Reallocated_Sector_Ct 0x0013 100 100 050 Pre-fail Always - 446
9 Power_On_Hours 0x0012 100 100 000 Old_age Always - 1585
12 Power_Cycle_Count 0x0012 100 100 000 Old_age Always - 262
162 Unknown_Attribute 0x0003 031 031 000 Pre-fail Always - 10
170 Unknown_Attribute 0x0002 100 100 000 Old_age Always - 75
172 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0
173 Unknown_Attribute 0x0000 100 100 000 Old_age Offline - 196614
174 Unknown_Attribute 0x0012 100 100 000 Old_age Always - 46
181 Program_Fail_Cnt_Total 0x0012 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0012 100 100 000 Old_age Always - 446
192 Power-Off_Retract_Count 0x0012 100 100 000 Old_age Always - 46
194 Temperature_Celsius 0x0023 067 067 000 Pre-fail Always - 33
196 Reallocated_Event_Count 0x0000 100 100 000 Old_age Offline - 446
218 Unknown_Attribute 0x0000 100 100 000 Old_age Offline - 0
231 Temperature_Celsius 0x0013 100 100 000 Pre-fail Always - 99
241 Total_LBAs_Written 0x0012 100 100 000 Old_age Always - 202
242 Total_LBAs_Read 0x0012 100 100 000 Old_age Always - 84

SMART Error Log Version: 1
ATA Error Count: 65535 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 65535 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 70 7c 54 e0 Error: UNC 8 sectors at LBA = 0x00547c70 = 5536880

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 70 7c 54 e0 08 00:00:00.000 READ DMA
c8 00 60 b0 7c 54 e0 08 00:00:00.000 READ DMA
c8 00 08 f0 7b 54 e0 08 00:00:00.000 READ DMA
c8 00 20 90 7b 54 e0 08 00:00:00.000 READ DMA
c8 00 08 30 7c 54 e0 08 00:00:00.000 READ DMA

Error 65534 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 b0 7b 54 e0 Error: UNC at LBA = 0x00547bb0 = 5536688

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 00 b0 7b 54 e0 08 00:00:00.000 READ DMA
ef 10 02 00 00 00 a0 08 00:00:00.000 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 08 00:00:00.000 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 00:00:00.000 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 00:00:00.000 SET FEATURES [Set transfer mode]

Error 65533 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 38 13 c0 e9 Error: UNC 8 sectors at LBA = 0x09c01338 = 163582776

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 38 13 c0 e9 08 00:00:00.000 READ DMA
ef 10 02 00 00 00 a0 08 00:00:00.000 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 08 00:00:00.000 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 00:00:00.000 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 00:00:00.000 SET FEATURES [Set transfer mode]

Error 65532 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 38 13 c0 e9 Error: UNC 8 sectors at LBA = 0x09c01338 = 163582776

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 38 13 c0 e9 08 00:00:00.000 READ DMA
c8 00 08 30 13 c0 e9 08 00:00:00.000 READ DMA
c8 00 08 28 13 c0 e9 08 00:00:00.000 READ DMA
c8 00 08 20 13 c0 e9 08 00:00:00.000 READ DMA
c8 00 08 18 13 c0 e9 08 00:00:00.000 READ DMA

Error 65531 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 08 10 13 c0 e9 Error: UNC 8 sectors at LBA = 0x09c01310 = 163582736

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
c8 00 08 10 13 c0 e9 08 00:00:00.000 READ DMA
ef 10 02 00 00 00 a0 08 00:00:00.000 SET FEATURES [Enable SATA feature]
27 00 00 00 00 00 e0 08 00:00:00.000 READ NATIVE MAX ADDRESS EXT [OBS-ACS-3]
ec 00 00 00 00 00 a0 08 00:00:00.000 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 08 00:00:00.000 SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
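The LBA that smartctl prints for each UNC error can be rebuilt from the register bytes in the same row, which makes it easy to see where on the disk the unreadable sectors are. The sketch below assumes the standard 28-bit layout (low nibble of DH, then CH, CL, SN) and uses the partition boundaries from the fdisk listing. The helper names are made up for illustration.

```python
def lba28(sn, cl, ch, dh):
    """Rebuild a 28-bit LBA from the SN, CL, CH and DH register bytes."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Inclusive sector ranges from the fdisk output for /dev/sdh.
PARTITIONS = {
    "sdh1": (2048, 167956479),
    "sdh5": (167958528, 234440703),
}


def locate(lba):
    """Return the partition containing this LBA, or None if it is outside all of them."""
    for name, (start, end) in PARTITIONS.items():
        if start <= lba <= end:
            return name
    return None


# (SN, CL, CH, DH) bytes from the three distinct UNC entries in the error log.
ERRORS = [
    (0x70, 0x7C, 0x54, 0xE0),
    (0x38, 0x13, 0xC0, 0xE9),
    (0x10, 0x13, 0xC0, 0xE9),
]

for regs in ERRORS:
    lba = lba28(*regs)
    print(hex(lba), lba, locate(lba))
```

All of the logged errors fall inside sdh1, the ext4 root partition, which fits with fsck and mount failing on it.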


And Superblock:
root@sysresccd / % dumpe2fs /dev/sdh1 | grep superblock
dumpe2fs 1.43.6 (29-Aug-2017)
Primary superblock at 0, Group descriptors at 1-11
Backup superblock at 32768, Group descriptors at 32769-32779
Backup superblock at 98304, Group descriptors at 98305-98315
Backup superblock at 163840, Group descriptors at 163841-163851
Backup superblock at 229376, Group descriptors at 229377-229387
Backup superblock at 294912, Group descriptors at 294913-294923
Backup superblock at 819200, Group descriptors at 819201-819211
Backup superblock at 884736, Group descriptors at 884737-884747
Backup superblock at 1605632, Group descriptors at 1605633-1605643
Backup superblock at 2654208, Group descriptors at 2654209-2654219
Backup superblock at 4096000, Group descriptors at 4096001-4096011
Backup superblock at 7962624, Group descriptors at 7962625-7962635
Backup superblock at 11239424, Group descriptors at 11239425-11239435
Backup superblock at 20480000, Group descriptors at 20480001-20480011

I can't even mount the drive at this point, getting the below:
root@sysresccd / % mount /dev/sdh1 Temp_SDH1
mount: /Temp_SDH1: can't read superblock on /dev/sdh1.
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14

Re: FSCK code 4

Postby bw123 » 2019-01-21 10:52

It doesn't look encouraging, but you never know. Is it really the 8th drive? Why sdh? Have you considered a power supply issue, a bad cable, or a BIOS reset plus enabling AHCI?

Some things that might help you understand:

From what I can tell, some drives revert to something like this as their name when they fail.
https://duckduckgo.com/html/?q=SATAFIRM+S11

There are so many S.M.A.R.T errors in your report they have maxed out.
https://en.wikipedia.org/wiki/65535_(number)#In_computing
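To make the "maxed out" point concrete: 65535 is the largest value an unsigned 16-bit counter can hold. The sketch below assumes the drive's error counter saturates at that value rather than wrapping around to zero. The real total could be anything at or above it.

```python
UINT16_MAX = 2**16 - 1  # 65535, largest unsigned 16-bit value


def bump(count):
    """Increment a 16-bit error counter that saturates instead of wrapping (assumption)."""
    return min(count + 1, UINT16_MAX)


count = 65533
for _ in range(5):
    count = bump(count)
print(count)
```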

man mount | grep -A1 -m1 sb=
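One catch with the sb= option that the man page entry describes: mount expects the superblock location in 1 KiB units, while dumpe2fs and e2fsck -b use filesystem block numbers. The sketch below converts the first few backup locations from the dumpe2fs output, assuming a 4 KiB block size (not shown in the thread, but 32768 blocks per group is what a 4 KiB block size gives by default). On a drive failing at the hardware level this is unlikely to help. It only illustrates the unit conversion.

```python
# First three backup superblocks from the dumpe2fs output, in filesystem blocks.
BACKUP_SUPERBLOCKS = [32768, 98304, 163840]

# Assumption: 4 KiB filesystem blocks (confirm with "dumpe2fs -h" or "tune2fs -l").
BLOCK_SIZE = 4096


def mount_sb_value(fs_block, block_size=BLOCK_SIZE):
    """Convert a filesystem block number to the 1 KiB units used by mount's sb= option."""
    return fs_block * block_size // 1024


for block in BACKUP_SUPERBLOCKS:
    print("e2fsck: -b", block, "-B", BLOCK_SIZE, "| mount: sb=" + str(mount_sb_value(block)))
```

So the backup at block 32768 becomes sb=131072 for mount, while e2fsck takes the block number as-is.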
bw123
 
Posts: 3787
Joined: 2011-05-09 06:02

Re: FSCK code 4

Postby alexankius » 2019-01-21 13:14

Thanks for your replies.

It seems that the disk is dead. I have contacted Corsair and it seems that this is a known bug. Waiting for their reply.

To your questions:

This is indeed the 8th drive. This is a Debian + ZFS + SCST NAS server, and AHCI is of course enabled (3+3 drives for the mirror, 1 drive for cache and 1 for boot). I plugged another drive in where the Corsair drive was and it works fine, so the cable and port are not the problem.

When Corsair Tech Support reply, I'll post here for the benefit of others.
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14

Re: FSCK code 4

Postby llivv » 2019-01-21 15:36

I think the partition table looks a bit odd, but hey,
it's not my setup and I don't need to know your reasons why you set it up that way,
Just looks odd to me.
llivv
 
Posts: 5488
Joined: 2007-02-14 18:10
Location: cold storage

Re: FSCK code 4

Postby alexankius » 2019-01-21 18:33

What would you say is odd about the partition table? Not being an expert, I opted for the "use entire disk" option when installing. That is how the partitions were created on /dev/sdh.
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14

Re: FSCK code 4

Postby p.H » 2019-01-21 19:54

I think llivv finds it odd that there is an extended partition containing only one logical partition. Usually extended partitions are used to overcome the limit of 4 primary partitions, but here there are only two primary/logical partitions, so an extended partition is useless.

This is the result of the Debian installer guided partitioning. It always creates one primary partition, an extended partition and logical partitions regardless of the total number of partitions.
p.H
 
Posts: 1011
Joined: 2017-09-17 07:12

Re: FSCK code 4

Postby llivv » 2019-01-22 06:41

alexankius wrote: What would you say is odd about the partition table? Not being an expert when installing I opted for the "use entire disk" option. That is how the partitions were created on /dev/sdh

p.H explained it nicely above.
I was thinking that 31 GB of swap within an extended partition is really strange. And the few times I looked over the installer's Guided Partitioning schemes, I'd always choose to go back in the installer and set up my own.

The way it's set up does make for an easy repartition of the disk, though, and 30 GB is a nice, solid amount of usable space to repartition when needed. It is something you might consider as a last-resort forensic procedure: see whether it can be repartitioned, and whether that repartitioned space is usable on that SSD.
llivv
 
Posts: 5488
Joined: 2007-02-14 18:10
Location: cold storage

Re: FSCK code 4

Postby alexankius » 2019-01-23 20:46

Finally solved it, though not as I had expected. Below is the reply from Corsair Tech Support:

QUOTE
Hi Alexander

Unfortunately once you get that error there is nothing that can be done for the drive. If you can provide me with some information I can see if we can get it replaced under warranty though. Please attach a picture of the drive that includes the Product and Serial numbers to the support ticket. Also please attach a copy/screenshot/pdf of the invoice/receipt for the product purchase.
UNQUOTE

The error they are talking about is the drive showing up as SATAFIRM S11 and not as CORSAIR Force LE.

Thank you all for your suggestions. In the meantime I got a Samsung SSD and reinstalled everything.
alexankius
 
Posts: 15
Joined: 2019-01-17 10:14
