boot hangs on infinite fsck upgrading to stretch
-
- Posts: 6
- Joined: 2018-12-27 01:34
boot hangs on infinite fsck upgrading to stretch
I upgraded to Stretch (stable) from oldstable and now boot hangs with messages about "a start job is running". Boot never reaches a login prompt or an emergency shell ... nothing. I have to reboot. I can only get into my system using recovery mode. At first I couldn't examine boot logs and I had to set up journalctl to actually keep a record of prior boots. My log files show that my /boot/efi, /home and swap partitions fail to mount. Relevant portions below. It seems an fsck check is timing out. I changed two timeout settings during boot for systemd to infinity within /etc/systemd/systemd.conf and left it overnight; 13 hours later and no progress. The total drive size is 500GB, not full, and partitioned into root, boot, home and swap.
I also already ensured that the uuids in fstab match those of the devices. And I already checked that the kernel config includes CONFIG_FHANDLE=y. I then commented out the swap partition from fstab. I added nobootwait, option, and nofail to the fstab options for a remote nfs mount and swap (before I completely commented swap out and the nfs mount). Nothing helped. I did the same for some pre-defined usb stick mounts before commenting those out too. I still cannot boot.
Then I booted off a live CD and ran fsck and e2fsck upon all the partitions that I could (swap was the only one for which no fsck-like tool exists I think). The result was that the filesystems are ok. No errors found.
I'm getting desperate looking for a solution and some help. I don't want to have to blow away this install and re-install, back up and potentially restore all data. I should note that this upgrade to Stretch has been one of the worst upgrade experiences I have ever gone through. I wasted a lot of time figuring out that plymouth is a worthless good-for-nothing broken package -- at least for me -- and I had to purge it since it prevented a normal boot sequence to X after upgrading to stretch. I was going to investigate the cause of its failure, but for a graphical splash screen I said to myself, my time is not worth it. I don't need a graphical boot sequence, nor should something that is bells and whistles stand in the way of critical path components toward booting up--thus I purged it.
Anyway, enough of my frustration. Any help or tips would be welcome. Thank you.
P.S: Further below is the relevant portion of my log. Elsewhere I also have errors that may or may not be relevant--but these occur during recovery mode boot without further issue:
kernel: tpm tpm0: A TPM error (6) occurred attempting to read a pcr value
blkmapd[328]: open pipe file /run/rpc_pipefs/nfs/blocklayout failed: No such file or directory
PPS: I also removed some pre-defined NFS exports, which also didn't help. I'm starting to throw everything at this problem. If fsck is timing out but fsck from a live cd shows no errors...I'm at a dead end here.
Here is some of my log:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
systemd[1]: Started Flush Journal to Persistent Storage.
-- Subject: Unit systemd-journal-flush.service has finished start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit systemd-journal-flush.service has finished starting up.
--
-- The start-up result is done.
systemd[1]: dev-sda3.device: Job dev-sda3.device/start timed out.
systemd[1]: Timed out waiting for device dev-sda3.device.
-- Subject: Unit dev-sda3.device has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit dev-sda3.device has failed.
--
-- The result is timeout.
systemd[1]: Dependency failed for Swap Partition.
-- Subject: Unit dev-sda3.swap has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit dev-sda3.swap has failed.
--
-- The result is dependency.
systemd[1]: dev-sda3.swap: Job dev-sda3.swap/start failed with result 'dependency'.
systemd[1]: dev-sda3.device: Job dev-sda3.device/start failed with result 'timeout'.
systemd[1]: dev-disk-by\x2duuid-C206\x2dD9FD.device: Job dev-disk-by\x2duuid-C206\x2dD9FD.device/start timed out.
systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-C206\x2dD9FD.device.
-- Subject: Unit dev-disk-by\x2duuid-C206\x2dD9FD.device has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit dev-disk-by\x2duuid-C206\x2dD9FD.device has failed.
--
-- The result is timeout.
systemd[1]: Dependency failed for File System Check on /dev/disk/by-uuid/C206-D9FD.
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service has failed.
-- The result is dependency.
systemd[1]: Dependency failed for /boot/efi.
-- Subject: Unit boot-efi.mount has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit boot-efi.mount has failed.
--
-- The result is dependency.
systemd[1]: Dependency failed for Local File Systems.
-- Subject: Unit local-fs.target has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit local-fs.target has failed.
--
-- The result is dependency.
systemd[1]: local-fs.target: Job local-fs.target/start failed with result 'dependency'.
systemd[1]: local-fs.target: Triggering OnFailure= dependencies.
systemd[1]: boot-efi.mount: Job boot-efi.mount/start failed with result 'dependency'.
systemd[1]: systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service: Job systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service/start failed with result
systemd[1]: dev-disk-by\x2duuid-C206\x2dD9FD.device: Job dev-disk-by\x2duuid-C206\x2dD9FD.device/start failed with result 'timeout'.
systemd[1]: dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.device: Job dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2
systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.device.
-- Subject: Unit dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.device has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.device has failed.
--
-- The result is timeout.
systemd[1]: Dependency failed for /home.
-- Subject: Unit home.mount has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit home.mount has failed.
--
-- The result is dependency.
systemd[1]: home.mount: Job home.mount/start failed with result 'dependency'.
systemd[1]: Dependency failed for File System Check on /dev/disk/by-uuid/25ed3900-5a41-4a69-adfd-2bddacc30dc5.
-- Subject: Unit systemd-fsck@dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.service has failed
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit systemd-fsck@dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.service has failed.
--
-- The result is dependency.
systemd[1]: systemd-fsck@dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.service: Job systemd-fsck@dev-disk-by\x2duuid-25ed3900\x
systemd[1]: dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2bddacc30dc5.device: Job dev-disk-by\x2duuid-25ed3900\x2d5a41\x2d4a69\x2dadfd\x2d2
systemd[1]: Starting Preprocess NFS configuration...
-- Subject: Unit nfs-config.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit nfs-config.service has begun starting up.
systemd[1]: Starting Enable support for additional executable binary formats...
-- Subject: Unit binfmt-support.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
--
-- Unit binfmt-support.service has begun starting up.
systemd[1]: Starting Raise network interfaces...
-- Subject: Unit networking.service has begun start-up
-- Defined-By: systemd
-- Support: https://www.debian.org/support
-
- Global Moderator
- Posts: 3049
- Joined: 2017-09-17 07:12
- Has thanked: 5 times
- Been thanked: 132 times
Re: boot hangs on infinite fsck upgrading to stretch
papatangonyc wrote: boot hangs with messages about "a start job is running".
What start job ? Please post the full lines.
papatangonyc wrote: I can only get into my system using recovery mode.
Do you mean the "recovery mode" available in GRUB's "Advanced" menu (which removes "quiet" and adds "single" in the kernel command line) ? If yes, please post the output of
Code: Select all
df
ls /dev/sd*
ls /dev/disk/by-uuid
By default the initramfs performs fsck on the root filesystem before mounting it. Later, the init system performs fsck on other filesystems according to /etc/fstab.
papatangonyc wrote: It seems an fsck check is timing out
"It seems" ? Facts please. I do not see in your post any evidence that fsck is timing out. All I can see is that devices cannot be found, so fsck and mount cannot even happen.
papatangonyc wrote: I already checked that the kernel config includes CONFIG_FHANDLE=y
If you feel the need to mention it, I suspect that you are not using a Debian stock kernel. What kernel are you using ? Does it use an initrd/initramfs ?
papatangonyc wrote: systemd[1]: dev-sda3.device: Job dev-sda3.device/start timed out.
Looks like your fstab uses /dev/sd* device names. Such use is not reliable because these names are not persistent. Consider using UUID or LABEL instead.
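For anyone reading the log above: the long unit names are device paths in systemd's escaped form, where "/" becomes "-" and a literal "-" becomes "\x2d". The function below is a minimal sketch that decodes them by hand. The name unit_to_path is made up for illustration, and it only handles the "\x2d" escape seen in this thread; on a systemd host, systemd-escape --unescape --path (given the name without the .device suffix) is the proper tool.

```shell
# Hypothetical helper: turn a systemd .device unit name back into a device path.
# Only the \x2d escape (a literal '-') is handled, which is all this thread's log uses.
unit_to_path() {
    name=${1%.device}                      # drop the unit type suffix
    printf '/%s\n' "$(printf '%s' "$name" |
        sed -e 's|-|/|g' -e 's|\\x2d|-|g')"  # '-' separates path parts, \x2d is a real '-'
}

# Example call (single quotes keep the backslashes literal):
# unit_to_path 'dev-disk-by\x2duuid-C206\x2dD9FD.device'
```

Decoded this way, the three units that time out correspond to the swap partition (/dev/sda3), the /boot/efi UUID and the /home UUID.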
- Head_on_a_Stick
- Posts: 14114
- Joined: 2014-06-01 17:46
- Location: London, England
- Has thanked: 81 times
- Been thanked: 133 times
Re: boot hangs on infinite fsck upgrading to stretch
papatangonyc wrote: systemd[1]: dev-sda3.device: Job dev-sda3.device/start timed out.
p.H wrote: Looks like your fstab uses /dev/sd* device names. Such use is not reliable because these names are not persistent. Consider using UUID or LABEL instead.
The OP said that they have checked the UUIDs in /etc/fstab, I think those units are auto-generated.[1]
Looks like the filesystem modules are missing, perhaps rebuild the initramfs? And check the bootloader configuration, ofc.
deadbang
-
- Global Moderator
- Posts: 3049
- Joined: 2017-09-17 07:12
- Has thanked: 5 times
- Been thanked: 132 times
Re: boot hangs on infinite fsck upgrading to stretch
Head_on_a_Stick wrote: The OP said that they have checked the UUIDs in /etc/fstab, I think those units are auto-generated.
My point was not whether the UUIDs in /etc/fstab are correct or not (I have no reason not to trust the OP about this) but the presence of /dev/sd* in /etc/fstab. AFAIK, dev-sd* systemd units are auto-generated when /etc/fstab contains /dev/sd* device names.
This specific point in my post was not related to the issue described in the OP.
I doubt the issue is caused by missing modules in the initramfs because IIUC it happens after the initramfs has handed over to the init system.
-
- Posts: 6
- Joined: 2018-12-27 01:34
Re: boot hangs on infinite fsck upgrading to stretch
Thank you for the quick replies.
On screen during boot I see:
Code: Select all
tpm tpm(0): A TPM error (6) occurred attempting to read a pcr value
tpm tpm(0): A TPM error (6) occurred attempting to read a pcr value
/dev/sda2: clean ... files ... blocks
[ *** ] (1 of 3) A start job is running for ddacc30dc5.device (Xs / <no limit or timeout value>
Then that last line alternates with two more versions of itself:
Code: Select all
[ *** ] (2 of 3) A start job is running for 06\x2dD9FD (Xs / <no limit or timeout value>
[ *** ] (3 of 3) A start job is running for dev-sda3.device (Xs / <no limit or timeout value>
Where X is the time elapsed and the timeout value used to be 1:30 but then I changed it to infinity.
Yes, I did mean "recovery mode" available in GRUB's "Advanced" menu that removes "quiet" and adds "single" in the kernel command line. As I had mentioned I also tried removing quiet from a non-recovery mode boot option in GRUB's "Advanced" menu to see if I could get better logging, but I don't think it made a difference.
Output of df:
Code: Select all
Filesystem 1K-blocks Used Available Use% Mounted on
udev 1953696 0 1953696 0% /dev
tmpfs 392940 1764 391176 1% /run
/dev/sda2 67154552 36723624 26996588 58% /
tmpfs 1964684 44364 1920320 3% /dev/shm
tmpfs 5120 4 5116 1% /run/lock
tmpfs 1964684 0 1964684 0% /sys/fs/cgroup
/dev/sda1 523248 132 523116 1% /boot/efi
/dev/sda4 403587808 193191236 189872388 51% /home
tmpfs 392936 16 392920 1% /run/user/119
tmpfs 392936 28 392908 1% /run/user/1000
Output of ls /dev/sd* is
Code: Select all
/dev/sda /dev/sda1 /dev/sda2 /dev/sda3 /dev/sda4
Output of ls /dev/disk/by-uuid
Code: Select all
25ed3900-5a41-4a69-adfd-2bddacc30dc5 c0403138-8cdb-4412-b9a6-de4375ab60ff C206-D9FD d56d5024-9570-4066-b706-ddcd676612cb
When I read the log and saw this:
Code: Select all
systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service: Job systemd-fsck@dev-disk-by\x2duuid-C206\x2dD9FD.service/start failed with result
systemd[1]: dev-disk-by\x2duuid-C206\x2dD9FD.device: Job dev-disk-by\x2duuid-C206\x2dD9FD.device/start failed with result 'timeout'.
...I interpreted that as a problem with fsck. Perhaps I was wrong. If you say that devices are not being found, I guess that would also make sense. I don't know. All I can do is read the log but I'm not an expert at interpreting these logs. Never had a problem like this before.
I only mention CONFIG_FHANDLE=y because I was casting about for possible causes of my problem and I read some posts on user forums that mention this option as being required or else some persons in the past have experienced similar symptoms. Also I did not change that option. I found it set already. I have not customized my kernel at all. Way over my head. All I did was an upgrade from oldstable to stable. One such example post: https://stackoverflow.com/questions/233 ... during-sta
As for my /etc/fstab, I was already using UUIDs for a long time. The only non-uuid identified drive is the cdrom with /dev/sr0.
Here it is as it is today and still cannot boot:
Code: Select all
# <file system> <mount point> <type> <options> <dump> <pass>
# / was on /dev/sda2 during installation
UUID=c0403138-8cdb-4412-b9a6-de4375ab60ff / ext4 discard,noatime,nodiratime,errors=remount-ro 0 1
# /boot/efi was on /dev/sda1 during installation
UUID=C206-D9FD /boot/efi vfat defaults 0 1
# /home was on /dev/sda4 during installation
UUID=25ed3900-5a41-4a69-adfd-2bddacc30dc5 /home ext4 discard,noatime,nodiratime 0 2
# swap was on /dev/sda3 during installation
# UUID=d56d5024-9570-4066-b706-ddcd676612cb none swap sw 0 0
/dev/sr0 /media/cdrom0 udf,iso9660 user,noauto 0 0
# /dev/sdb1 /media/usb0 auto rw,user,noauto 0 0
# /dev/sdb2 /media/usb1 auto rw,user,noauto 0 0
#192.168.1.3 mnt nfs4 rw,option,noauto,nofail,nobootwait 0 0
#UUID=F765-51E3 /media/usb1 vfat rw,option,noauto,user,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0077,codepage=437,iocharset=utf8,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2w,nofail,nobootwait
#UUID=B2FB-1E2C /media/usb0 vfat rw,option,noauto,user,nosuid,nodev,relatime,uid=1000,gid=1000,fmask=0022,dmask=0077,codepage=437,iocharset=utf8,shortname=mixed,showexec,utf8,flush,errors=remount-ro,uhelper=udisks2w,nofail,nobootwait
Thanks again.
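The manual comparison of fstab UUIDs against /dev/disk/by-uuid can also be scripted. The following is only a sketch: the function name missing_fstab_uuids is hypothetical, it ignores LABEL= and quoted UUID="..." entries, and it can only confirm what udev has created at the moment it runs (for example from recovery mode), not what systemd saw during the hung boot.

```shell
# Hypothetical helper: list UUID= entries in an fstab that have no matching
# name under a by-uuid directory. Commented-out lines are skipped because
# their first field starts with '#', not 'UUID='.
missing_fstab_uuids() {
    fstab=${1:-/etc/fstab}
    dir=${2:-/dev/disk/by-uuid}
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        # Print only the UUIDs that are absent from the directory
        [ -e "$dir/$uuid" ] || printf '%s\n' "$uuid"
    done
}

# Example call with the default paths:
# missing_fstab_uuids
```

No output means every active UUID= line has a matching entry. With the fstab and the by-uuid listing posted here, all three active UUIDs are present, which is consistent with the OP's statement that they match.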
-
- Posts: 6
- Joined: 2018-12-27 01:34
Re: boot hangs on infinite fsck upgrading to stretch
I realize now that perhaps I did "customize" my kernel, but I'm not sure what that term means.
I should mention that well over a year ago I did add modules to the kernel under /etc/modules after installing a package that provided the modules. It was tp_smapi and coretemp, from the tp-smapi-dkms package. But I commented those out. The package is still installed but I should purge it. I have no idea if that might be relevant. I doubt it but I mention it just in case. Home page: http://www.thinkwiki.org/wiki/Tp_smapi
Thanks again.
Re: boot hangs on infinite fsck upgrading to stretch
you have to fix fstab field 5 for filesystems that need to be checked. see: man fstab
Code: Select all
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=c0403138-8cdb-4412-b9a6-de4375ab60ff / ext4 discard,noatime,nodiratime,errors=remount-ro 0 1
UUID=25ed3900-5a41-4a69-adfd-2bddacc30dc5 /home ext4 discard,noatime,nodiratime 0 2
Personally, your ext4 options make little sense to me. But if you have special reasons to use those options that is your choice.
# <file system> <mount point> <type> <options> <dump> <pass>
UUID=c0403138-8cdb-4412-b9a6-de4375ab60ff / ext4 discard,noatime,nodiratime,errors=remount-ro 1 1
UUID=25ed3900-5a41-4a69-adfd-2bddacc30dc5 /home ext4 discard,noatime,nodiratime 1 2
If fsck still bails after changing the dump entries for both / and /home
I'd first change the dump entry for /boot/efi reboot checking for the fsck bail
If still getting the fsck errors I'd then start changing the mount options to defaults
for each hard disk filesystem and disconnecting the usb devices
to see which fs(s) are at fault.
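Since the post points to man fstab, it may help to see the last two fields side by side. Per fstab(5), the fifth field (fs_freq) is read by the dump backup tool, and the sixth (fs_passno) sets the fsck order: 1 for the root filesystem, 2 for others, 0 to skip the check. The sketch below prints both for every active entry; fstab_check_fields is a made-up name, and missing fields are shown as 0, which is the documented default.

```shell
# Hypothetical helper: show mount point, dump flag (field 5) and
# fsck pass number (field 6) for each non-comment, non-blank fstab line.
fstab_check_fields() {
    awk 'NF && $1 !~ /^#/ {
        printf "%s dump=%s pass=%s\n", $2, ($5 == "" ? 0 : $5), ($6 == "" ? 0 : $6)
    }' "${1:-/etc/fstab}"
}

# Example call:
# fstab_check_fields /etc/fstab
```

Run against the OP's original file, this would report pass=1 for / and /boot/efi and pass=2 for /home, with dump=0 throughout.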
In memory of Ian Ashley Murdock (1973 - 2015) founder of the Debian project.
- Head_on_a_Stick
- Posts: 14114
- Joined: 2014-06-01 17:46
- Location: London, England
- Has thanked: 81 times
- Been thanked: 133 times
Re: boot hangs on infinite fsck upgrading to stretch
p.H wrote: AFAIK, dev-sd* systemd units are auto-generated when /etc/fstab contains /dev/sd* device names.
Nope
@OP: with which kernel are you booted?
Code: Select all
uname -a
deadbang
-
- Posts: 6
- Joined: 2018-12-27 01:34
Re: boot hangs on infinite fsck upgrading to stretch
Output of uname -a is
Code: Select all
Linux <hostname> 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux
I changed the 5th fstab fields for / and /home, and the options to defaults, and that failed. Then I also changed the 5th field for /boot/efi and that didn't help either.
Code: Select all
UUID=c0403138-8cdb-4412-b9a6-de4375ab60ff / ext4 defaults 1 1
# /boot/efi was on /dev/sda1 during installation
UUID=C206-D9FD /boot/efi vfat defaults 1 1
# /home was on /dev/sda4 during installation
UUID=25ed3900-5a41-4a69-adfd-2bddacc30dc5 /home ext4 defaults 1 2
I did not yet try setting the 5th field just for efi and not for / and /home. I could try that right after posting and if it solves anything, or gives different results in any way, I can post again.
The boot options were due to my purchase long ago of an ssd and read someplace that to optimize for ssd drives I could use certain ext4 options. Perhaps that was a mistake. I'm no expert. I reverted to defaults, like I said, without any effect. And as for that 5th field...I don't have dump installed, so could it really make a difference?
Here is the entire log (available for the next 14 days) of my most recent boot from journalctl wherein / and /home/ and /boot/efi have the 5th field changed to 1, and all options are set to defaults, just in case I am missing something relevant.
https://pastebin.com/2uNt8SKE
Disconnecting the usb devices? Just to be clear there are no usb drives. There are no usb devices at all of any kind at boot time. I sometimes plug them in well after boot, but there are no external usb devices bearing any kind of fs at boot time. And I commented out those lines in fstab anyway.
Everybody have a Happy New Year in the meantime and again, thanks for the help.
Re: boot hangs on infinite fsck upgrading to stretch
A couple of last ditch efforts here keeping in mind that if you have made the changes posted here before you upgraded - I beleive your upgrade would have been a much different experience.
I say that becasue after reading the journal there are still entries from the old config before you made changes to it still trying to load and failing. Especially when systemd starts running the bootup.
It looks as if systemd is loading in parallel making it really difficult to follow entries in the journal.
from the journal:
edit from the grub menu at boot using key to edit the boot parameters in the grub menu.
scroll down to the linux entry and remove
from that line.
than from the edited grub menu
to boot
if that helps at all please save another journal to disk
You could also try rebuilding the initrd.img
not sure it will help at all at this stage but also worth a try seeing how boot looks to me like it hasn't picked up most of the config changes you've made so far.
ie: systemd is still trying to load swap and failing - there are still entries for NFS - etc etc
reboot
If? either if these adjustments make any noticeable improvement in booting please repost another journal.
If no noticeable improvement,
I'd suggest you restore from jessie backup
than make all the changes to jessie pre-upgrade to stretch
than post a journal from restored jessie with config changes you've made
before the upgrade.
Just my view from reading the journal.
Edit: if you still have jessie 3.16 kernel installed you could see if that still boots to a usable system.
edit2: for ssd the recommended option isand run fstrim weekly
the 5th entry in fstab [dump] calls fsck at mounting the filesystem so for an ssd the 0 is correct (don't run fsck of the filesystem) ie: use periodic trim instead.
edit3: the journal also shows you have LVM which just adds another layer of management to deal with especially during upgrades. Do you remember how and when you set it up?
edit4: keep in mind that restoring from backup without cleaning the ssd my make an even bigger mess on disk,,, it's hard to tell what other issues my be lurking in the filesystem.
I say that because, after reading the journal, there are still entries from the old config (from before you made changes to it) still trying to load and failing. Especially when systemd starts running the bootup.
It looks as if systemd is loading in parallel making it really difficult to follow entries in the journal.
from the journal:
Code: Select all
/boot/vmlinuz-4.9.0-8-amd64 root=UUID=c0403138-8cdb-4412-b9a6-de4375ab60ff ro i915.i915_enable_rc6=1 pcie_aspm=force acpi=noirq quiet splash
Edit from the grub menu at boot, using the
Code: Select all
e
key to edit the boot parameters in the grub menu.
Scroll down to the linux entry and remove
Code: Select all
ro i915.i915_enable_rc6=1 pcie_aspm=force acpi=noirq quiet splash
from that line.
Then, from the edited grub menu, press
Code: Select all
CTRL+X
to boot.
If that helps at all, please save another journal to disk.
You could also try rebuilding the initrd.img.
Not sure it will help at this stage, but it's worth a try, seeing how boot looks to me like it hasn't picked up most of the config changes you've made so far,
ie: systemd is still trying to load swap and failing - there are still entries for NFS - etc etc
Code: Select all
# update-initramfs -u -k "$(uname -r)"
then reboot.
If either of these adjustments makes any noticeable improvement in booting, please repost another journal.
If there is no noticeable improvement,
I'd suggest you restore from the jessie backup,
then make all the changes to jessie pre-upgrade to stretch,
then post a journal from the restored jessie with the config changes you've made
before the upgrade.
Just my view from reading the journal.
Edit: if you still have jessie 3.16 kernel installed you could see if that still boots to a usable system.
edit2: for ssd the recommended option is
Code: Select all
UUID / ext4 defaults,discard 0 1
and run fstrim weekly
the 5th entry in fstab [dump] calls fsck at mounting the filesystem so for an ssd the 0 is correct (don't run fsck of the filesystem) ie: use periodic trim instead.
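For reference when reading that line: fstab(5) counts the fifth field as the dump flag and the sixth as the fsck pass number, so in the example the `0` is dump and the `1` is the pass number for `/`. Below is a small sketch that prints both fields for each entry so they can be checked at a glance. The function name `fstab_fsck_fields` is made up for illustration, and it assumes plain whitespace-separated fields.

```shell
#!/bin/sh
# Print "<mountpoint> dump=<5th field> pass=<6th field>" for each fstab entry.
# Missing fields are reported as 0, which is how fstab(5) says they are treated.
fstab_fsck_fields() {
    awk '
        /^[[:space:]]*(#|$)/ { next }          # skip comments and blank lines
        {
            dump = (NF >= 5) ? $5 : 0
            pass = (NF >= 6) ? $6 : 0
            printf "%s dump=%s pass=%s\n", $2, dump, pass
        }
    ' "${1:-/etc/fstab}"
}
```

Run as `fstab_fsck_fields /etc/fstab`; a `pass=0` entry is one that fsck skips at boot.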
edit3: the journal also shows you have LVM which just adds another layer of management to deal with especially during upgrades. Do you remember how and when you set it up?
edit4: keep in mind that restoring from backup without cleaning the ssd may make an even bigger mess on disk; it's hard to tell what other issues may be lurking in the filesystem.
In memory of Ian Ashley Murdock (1973 - 2015) founder of the Debian project.
-
- Posts: 6
- Joined: 2018-12-27 01:34
Re: boot hangs on infinite fsck upgrading to stretch
Booted twice as per your suggestion without all the additional args via grub command editing, with both the old and new kernels. Specifically I removed
Code: Select all
ro i915.i915_enable_rc6=1 pcie_aspm=force acpi=noirq quiet splash
Logs are here for the next 2 weeks:
Code: Select all
https://pastebin.com/r1MZtLiH
https://pastebin.com/grPL0Zhm
But still timing out on boot up. Logs look pretty much the same.
llivv wrote: I say that because, after reading the journal, there are still entries from the old config (from before you made changes to it) still trying to load and failing. Especially when systemd starts running the bootup.
I was wondering about that. It seems that way to me too by reading the logs, but how can that be possible?! Since when are config files ignored during boot, such as fstab? What then is the point of fstab? This is news to me and somewhat troubling. I don't know much about the boot process but I thought fstab is supposed to be consulted during boot, granted it resides on a fs that needs to be mounted during boot, so I'm a bit unclear on when and how that was done during boot in the past, and how it is being done today...something I should read up on. But still, something is seriously wrong here if I cannot remove a bad fs from boot, regardless.
llivv wrote: It looks as if systemd is loading in parallel making it really difficult to follow entries in the journal.
I thought systemd by design and default performs boot tasks in parallel, so that should not be a surprise. I was also wondering if there is any way to force a serialized boot sequence to make reading logs easier. But so far I have only found posts on other user forums that state it is impossible. If you know otherwise, please share.
If I included LVM during install that was a mistake, since this is a single ssd machine. There's little point to using LVM, or at least I never intended to use it anyway, and I have never taken advantage of it.
I will attempt
Code: Select all
update-initramfs -u -k "$(uname -r)"
later today.
This seems to be heading for no solution.
llivv wrote: if you had made the changes posted here before you upgraded, I believe your upgrade would have been a much different experience.
I seriously doubt that. Or at least I never got the memo about a few fstab options being able to totally screw up an upgrade. That would seem a bit brittle.
llivv wrote: I'd suggest you restore from the jessie backup, then make all the changes to jessie pre-upgrade to stretch, then post a journal from the restored jessie with the config changes you've made before the upgrade.
I don't have the time to go back to Jessie, etc., not to mention I never back up the OS and binaries, just my user data, ie, that which cannot be reinstalled. If update-initramfs fails I will have to back up user data again and remove this installation of Debian.
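On the question of reading a parallel boot: rather than serialising the boot, the journal can be filtered one unit at a time, which may be easier to follow than the interleaved log. A minimal sketch, assuming mount points made only of letters, digits, "_" and "/" (systemd-escape(1) covers the general case). The helper name `mount_unit` is hypothetical.

```shell
#!/bin/sh
# mount_unit PATH: print the systemd mount unit name for a simple mount point,
# e.g. /boot/efi -> boot-efi.mount. Paths containing "-" or other special
# characters need the escaping that systemd-escape(1) performs; not handled here.
mount_unit() {
    case $1 in
        /) printf '%s\n' '-.mount' ;;
        *)
            p=${1#/}      # drop the leading slash
            p=${p%/}      # drop a trailing slash, if any
            printf '%s.mount\n' "$(printf '%s' "$p" | tr '/' '-')"
            ;;
    esac
}
```

With a persistent journal, something like `journalctl -b -1 -u "$(mount_unit /home)"` should then show only the previous boot's messages for that mount.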
- Head_on_a_Stick
- Posts: 14114
- Joined: 2014-06-01 17:46
- Location: London, England
- Has thanked: 81 times
- Been thanked: 133 times
Re: boot hangs on infinite fsck upgrading to stretch
If the journal is recording then the initramfs stage is passed, as p.H noted earlier.
The root partition is mounted but /home, swap & the ESP fail.
@OP: try adding x-systemd.automount or nofail to the fstab lines for the /home, swap & EFI partitions.
https://bbs.archlinux.org/viewtopic.php?id=238554&p=3
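A minimal sketch of the nofail suggestion, assuming a plain whitespace-separated fstab. The helper name `add_nofail` and the choice to print the result for review instead of editing in place are assumptions made here, not something from this thread. It appends nofail to the options of the listed mount points and of any swap entry, unless the option is already present.

```shell
#!/bin/sh
# add_nofail FSTAB MOUNTPOINT...
# Print FSTAB to stdout with ",nofail" appended to the options (4th field)
# of every entry whose mount point is listed, plus every swap entry.
# Rewritten lines are re-joined with single spaces; other lines pass through.
add_nofail() {
    file=$1; shift
    awk -v targets="$*" '
        BEGIN { n = split(targets, t, " "); for (i = 1; i <= n; i++) want[t[i]] = 1 }
        /^[[:space:]]*(#|$)/ { print; next }   # keep comments and blank lines
        (($2 in want) || $3 == "swap") && NF >= 4 && $4 !~ /(^|,)nofail(,|$)/ {
            $4 = $4 ",nofail"
        }
        { print }
    ' "$file"
}
```

For example, `add_nofail /etc/fstab /home /boot/efi > /tmp/fstab.new`, then diff the two files before copying anything over /etc/fstab.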
Your posted journal seems to show that systemd is attempting to mount UUID 25ed3900-5a41-4a69-adfd-2bddacc30dc5 as both /home and swap.
Is this a GPT disk?
Can we please see the output of
Code: Select all
# gdisk -l /dev/sda
I think we need further details of your upgrade procedure from jessie, check the official guide and see if you missed anything:
https://www.debian.org/releases/stable/ ... ng.en.html
deadbang
- sunrat
- Administrator
- Posts: 6476
- Joined: 2006-08-29 09:12
- Location: Melbourne, Australia
- Has thanked: 118 times
- Been thanked: 474 times
Re: boot hangs on infinite fsck upgrading to stretch
llivv wrote: edit2: for ssd the recommended option is
Code: Select all
UUID / ext4 defaults,discard 0 1
and run fstrim weekly
the 5th entry in fstab [dump] calls fsck at mounting the filesystem so for an ssd the 0 is correct (don't run fsck of the filesystem) ie: use periodic trim instead.
Just a side note - discard and fstrim are not both needed. The current recommendations I have read are to use fstrim weekly (fstrim.service) with a systemd timer and not use discard at all.
I'm not in Stretch right now but fstrim.timer may be set up and just needs to be started. You can check with:
Code: Select all
systemctl status fstrim.timer
“ computer users can be divided into 2 categories:
Those who have lost data
...and those who have not lost data YET ” Remember to BACKUP!
-
- Posts: 6
- Joined: 2018-12-27 01:34
Re: boot hangs on infinite fsck upgrading to stretch
I could have sworn I already tried nofail, perhaps in combination with some other options. I definitely already tried nobootwait.
In any event, with defaults,discard,nofail on /home, /boot/efi and swap, I could not believe my eyes...boot up actually worked.
Here's the log:
Code: Select all
https://pastebin.com/vnj6rqH0
So as per https://bbs.archlinux.org/viewtopic.php?id=238554&p=3 this is all caused by an LVM bug? And I have nothing more to worry about? I don't suppose I could remove LVM at this point? Too late?
I never installed fstrim. My next step.
It seems moot now but here is the output of gdisk:
Code: Select all
GPT fdisk (gdisk) version 1.0.1
Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present
Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 1000215216 sectors, 476.9 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 94E91A1A-22F1-4971-B032-4F2E4B6F01EB
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1000215182
Partitions will be aligned on 2048-sector boundaries
Total free space is 24558189 sectors (11.7 GiB)
Number  Start (sector)  End (sector)  Size       Code  Name
   1            2048       1050623    512.0 MiB  8300
   2         1050624     137768959    65.2 GiB   8300
   3       137768960     155346943    8.4 GiB    8200
   4       155346944     975659007    391.2 GiB  8300
Can we call this case closed? Thank you all.
Re: boot hangs on infinite fsck upgrading to stretch
papatangonyc wrote:
Code: Select all
https://pastebin.com/vnj6rqH0
see lines #173 and #1106 in the log for cleaning up your custom boot parameters....
papatangonyc wrote: I never installed fstrim. My next step.
fstrim is part of the util-linux package
I'd encourage you to read both manpages, for fstrim and fsfreeze
papatangonyc wrote: Can we call this case closed? Thank you all.
It boots - if you are happy so am I
In memory of Ian Ashley Murdock (1973 - 2015) founder of the Debian project.
- Head_on_a_Stick
- Posts: 14114
- Joined: 2014-06-01 17:46
- Location: London, England
- Has thanked: 81 times
- Been thanked: 133 times
Re: boot hangs on infinite fsck upgrading to stretch
papatangonyc wrote:
Code: Select all
1 2048 1050623 512.0 MiB 8300
^ This is wrong, an EFI partition should show the ef00 code in gdisk, I am amazed that disk can boot at all in UEFI mode
It should be possible to use gdisk to change the partition code but be sure to back up the contents first.
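A hedged sketch of checking that code before touching anything, assuming the `gdisk -l` table layout posted above (partition number in the first column, type code right after the size and its unit). The function name `part_code` is invented for illustration.

```shell
#!/bin/sh
# part_code N: read `gdisk -l` output on stdin and print the type code
# of partition number N (for example 8300, 8200 or EF00).
part_code() {
    awk -v n="$1" '
        # Table rows look like: <num> <start> <end> <size> <unit> <code> [name]
        $1 == n && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ && NF >= 6 {
            print toupper($6)
            exit
        }
    '
}
```

As root, `gdisk -l /dev/sda | part_code 1` would print 8300 for the table above. If it is not EF00, the non-interactive sgdisk from the same package takes `--typecode=1:ef00` to change it, but only after the backup mentioned above.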
deadbang
Re: boot hangs on infinite fsck upgrading to stretch
papatangonyc wrote: So as per https://bbs.archlinux.org/viewtopic.php?id=238554&p=3 this is all caused by an LVM bug? And I have nothing more to worry about? I don't suppose I could remove LVM at this point? Too late?
Since we don't know for sure how it was set up (which package pulled it in as a dep or recommends),
I'd be wary of removing it at this point in time, anyway.
Here is more from the lvm debian wiki https://wiki.debian.org/LVM
I have a few other suggestions for consideration, the most important (in my opinion) being to double-check the upgrade status
Code: Select all
~$ apt list --upgradeable
Code: Select all
~$ aptitude -s safe-upgrade
Code: Select all
~$ aptitude -s full-upgrade
Code: Select all
Note: Using 'Simulate' mode.
Do you want to continue? [Y/n/?]
so choosing the default Y [ENTER] key just returns to the bash prompt without doing any package management.
as I believe you mentioned having swap NFS in one of the posts above.
best of luck
and very glad you got boot again....
In memory of Ian Ashley Murdock (1973 - 2015) founder of the Debian project.