systemd issue between zfs and others (docker, smbd, netatalk)

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

systemd issue between zfs and others (docker, smbd, netatalk)

#1 Post by pierrepaap »

Dear all,

First of all, this is my first post, so thank you to everyone here who helps others.
I've read the "first read this" posts, so hopefully I'll be OK.

I've used Unix and Linux, Debian in particular, for many years (around 25).
While that means I know a fair amount, I don't know everything, especially some of the newer things.


Introductions done, let's get to the problem at hand.

I've been using ZFS (ZoL) on Debian for about 5 years. Everything now runs under systemd, and I've always had ordering issues there.
I believe I had fixed them in stretch, but when I upgraded to buster the systemd unit files were presumably replaced, and I'm back to square one.

The issue is that services using files on one of the ZFS pools/filesystems start before the pool is fully imported and mounted. They create files and directories under the not-yet-mounted mountpoints, which then prevents the import and mount from completing correctly.

Debian version: Buster 10.7
zfs version: zfs-dkms/buster-backports,now 0.8.6-1~bpo10+1 all [installed]


Here's an example of the wrong state:

Code: Select all

# df -H /zdocs/*
Filesystem         Size  Used Avail Use% Mounted on
zdocs              308G  525k  308G   1% /zdocs
zdocs/docker_data  313G  4.3G  308G   2% /zdocs/docker_data
zdocs              308G  525k  308G   1% /zdocs
zdocs              308G  525k  308G   1% /zdocs
zdocs              308G  525k  308G   1% /zdocs
zdocs/server       311G  2.5G  309G   1% /zdocs/server
zdocs              308G  525k  308G   1% /zdocs
zdocs/temp         311G  2.6G  308G   1% /zdocs/temp
zdocs              308G  525k  308G   1% /zdocs
From there, if I try to export/import that pool after stopping the services using those filesystems (docker, smbd, and netatalk), it goes like this:

Code: Select all

# zpool export zdocs 
# zpool import zdocs 
cannot mount '/zdocs/ourcloud': directory is not empty
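With the pool exported, the stray entries sit on the underlying root filesystem at the mountpoint, so they can be listed there (using the path from the error above):

Code: Select all

# ls -A /zdocs/ourcloud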
Then I remove all the extra files that are not supposed to be there, do another export/import, and get the expected result:

Code: Select all

# df -H /zdocs/*
Filesystem         Size  Used Avail Use% Mounted on
zdocs/dev          322G   13G  310G   4% /zdocs/dev
zdocs/docker_data  315G  5.8G  310G   2% /zdocs/docker_data
zdocs/docs         374G   33G  342G   9% /zdocs/docs
zdocs/ourcloud     310G   48M  310G   1% /zdocs/ourcloud
zdocs/photos       594G  285G  310G  48% /zdocs/photos
zdocs/server       312G  2.5G  310G   1% /zdocs/server
zdocs/software     379G   70G  310G  19% /zdocs/software
zdocs/temp         312G  2.6G  310G   1% /zdocs/temp
zdocs/TM_Brice     215G  211G  4.0G  99% /zdocs/TM_Brice
This is apparently caused by an ordering issue in the systemd startup, meaning some dependencies in the ZFS services (or in the three consumers: docker, smbd, and netatalk) are set incorrectly.

I don't know whether a bug should be filed with one of the maintainers; for now I'd be happy with a fix that means I don't have to repair this manually at every startup or reboot...

Note: I believe the mounts happen during the import, not via zfs-mount.service, and I believe that's because mountpoints are set on the datasets:

Code: Select all

# zfs get mountpoint zdocs
NAME   PROPERTY    VALUE       SOURCE
zdocs  mountpoint  /zdocs      local
# zfs get mountpoint zdocs/docs
NAME        PROPERTY    VALUE        SOURCE
zdocs/docs  mountpoint  /zdocs/docs  inherited from zdocs
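If it helps to confirm that, the pool can be imported without mounting and the mounts triggered explicitly afterwards (zpool import -N skips the mounts):

Code: Select all

# zpool import -N zdocs
# zfs mount -a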
Thanks in advance for any help!

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 133 times

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#2 Post by Head_on_a_Stick »

Have you created a systemd service to import the bpool? See point 11 of Step 4 in the Debian buster installation guide on the ZoL wiki: https://openzfs.github.io/openzfs-docs/ ... figuration
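From memory, the unit that guide has you create looks roughly like this (a sketch only, for the bpool case; check the guide for the exact contents):

Code: Select all

# /etc/systemd/system/zfs-import-bpool.service
[Unit]
DefaultDependencies=no
Before=zfs-import-scan.service
Before=zfs-import-cache.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import -N -o cachefile=none bpool

[Install]
WantedBy=zfs-import.target
It is then enabled with systemctl enable zfs-import-bpool.service.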

This is just a shot in the dark; I have installed Debian buster on ZFS but only used it very briefly, so I don't know much about it.

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#3 Post by pierrepaap »

I appreciate your trying to help!

Here is some additional info following your input:
I did not do that step.
I use ZFS only for data storage, not for the root FS.

The import itself happens without problem with the existing services.
When the mounts start for each ZFS filesystem, other services begin writing before all of them are mounted, causing some of the mounts to fail.
That's why I believe it's a systemd ordering or dependency issue.

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#4 Post by pierrepaap »

Hopefully, here is something more precise:

Code: Select all

# systemd-analyze critical-chain docker.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

docker.service +18.427s
└─containerd.service @18.197s +617ms
  └─network.target @18.194s
    └─wpa_supplicant.service @18.072s +121ms
      └─dbus.service @18.063s
        └─basic.target @18.031s
          └─sockets.target @18.030s
            └─acpid.socket @18.017s
              └─sysinit.target @18.011s
                └─systemd-timesyncd.service @17.873s +136ms
                  └─systemd-tmpfiles-setup.service @17.835s +33ms
                    └─local-fs.target @17.830s
                      └─run-user-1001.mount @27.271s
                        └─local-fs-pre.target @2.355s
                          └─lvm2-monitor.service @272ms +2.082s
                            └─systemd-journald.socket @267ms
                              └─system.slice @258ms
                                └─-.slice @258ms

Code: Select all

# systemd-analyze critical-chain zdocs-docker_data.mount
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

zdocs-docker_data.mount @1min 29.615s
└─local-fs-pre.target @2.355s
  └─lvm2-monitor.service @272ms +2.082s
    └─systemd-journald.socket @267ms
      └─system.slice @258ms
        └─-.slice @258ms
docker starts about 18 seconds after boot, but the mount that contains its data only becomes active after 1 min 29 s.
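To see what ordering docker.service actually declares, systemctl show can print the relevant dependency properties:

Code: Select all

# systemctl show -p After -p Requires -p Wants docker.service
If zdocs-docker_data.mount does not appear in any of them, docker has no reason to wait for the mount.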

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 133 times

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#5 Post by Head_on_a_Stick »

pierrepaap wrote:I use ZFS only for data storage, not for the root FS.
Ah, I see.

We could try changing zfs-mount.service:

Code: Select all

# systemctl edit zfs-mount.service
Then enter this:

Code: Select all

[Unit]
Before=
After=local-fs.target
Save the file and reboot to test.

If it doesn't work then delete the /etc/systemd/system/zfs-mount.service.d/ directory (which contains override.conf) to return to the stock configuration.
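To check that the override is actually picked up after the reboot, systemctl cat prints the unit together with its drop-ins:

Code: Select all

# systemctl cat zfs-mount.service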

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#6 Post by pierrepaap »

OK, I'll try that.
In fact, I had tried adding local-fs.target to the "After" line of /etc/systemd/system/docker.service.d/startup_options.conf, but that did not cut it.

I had the impression that the mounts were already a prerequisite of local-fs.target.
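For reference, that drop-in looked roughly like this (startup_options.conf is just the file name I chose):

Code: Select all

# /etc/systemd/system/docker.service.d/startup_options.conf
[Unit]
After=local-fs.target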

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#7 Post by pierrepaap »

That did not work.

I also tried moving that override.conf to docker.service.d; still no joy.
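One thing I have not tried yet: according to the systemd.unit man page, RequiresMountsFor= adds both requirement and ordering dependencies on the mount units for the given paths, so a drop-in like this might be the cleaner route (untested on my side):

Code: Select all

# /etc/systemd/system/docker.service.d/override.conf
[Unit]
RequiresMountsFor=/zdocs/docker_data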

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#8 Post by pierrepaap »

OK, I need to see how it goes over the next while, but I believe I solved my issue, albeit not through a neat systemd setup...

The hack I came up with is to disable the docker service and add a @reboot entry in the crontab that starts it...
So, yes, a hack, but it appears to do the job, and I expect it won't be much of a bother to me.
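Concretely, the entry in root's crontab is roughly this (a sleep could be prepended if it still races the mounts):

Code: Select all

@reboot /bin/systemctl start docker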

Head_on_a_Stick
Posts: 14114
Joined: 2014-06-01 17:46
Location: London, England
Has thanked: 81 times
Been thanked: 133 times

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#9 Post by Head_on_a_Stick »

It is possible to add a delay to the service:

Code: Select all

# /etc/systemd/system/zfs-mount.service.d/override.conf
[Service]
ExecStartPre=/bin/sleep 30
^ That example would delay zfs-mount.service by 30 seconds.

pierrepaap
Posts: 7
Joined: 2021-01-05 16:01

Re: systemd issue between zfs and others (docker, smbd, netatalk)

#10 Post by pierrepaap »

Nice, thanks!
I'll try that, because my hack did not fully fix the issue on a cold startup: there it has to do with the disk array and device IDs not yet being ready (for the import, I guess), which slows the whole thing down.

On a reboot, though, the cron hack works.
