systemd issue between zfs and others (docker, smbd, netatalk)


systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-05 19:49

Dear all,

First of all, this is my first post, so thank you to everyone here who helps others.
I've read the "first read this" posts, so hopefully I'll be OK.

I've been using Unix and Linux, Debian in particular, for many years (around 25).
While that means I know quite a bit, I certainly don't know everything, especially some of the newer things.


With the introductions done, let's get to the problem at hand.

I've been using ZFS (ZoL) on Debian for about 5 years. Everything now goes through systemd, and I've always had ordering issues there.
I believe I had fixed them on stretch, but when I upgraded to buster the systemd unit files were presumably replaced and I'm back to square one.

The issue is that services which use files on one of the ZFS pools/filesystems start before the pool is fully imported and mounted. They create files under the still-empty mountpoints, which then prevents the import and mount from completing correctly.

Debian version : Buster 10.7
zfs version : zfs-dkms/buster-backports,now 0.8.6-1~bpo10+1 all [installed]


Here's an example of the WRONG state:

Code: Select all
# df -H /zdocs/*
Filesystem         Size  Used Avail Use% Mounted on
zdocs              308G  525k  308G   1% /zdocs
zdocs/docker_data  313G  4.3G  308G   2% /zdocs/docker_data
zdocs              308G  525k  308G   1% /zdocs
zdocs              308G  525k  308G   1% /zdocs
zdocs              308G  525k  308G   1% /zdocs
zdocs/server       311G  2.5G  309G   1% /zdocs/server
zdocs              308G  525k  308G   1% /zdocs
zdocs/temp         311G  2.6G  308G   1% /zdocs/temp
zdocs              308G  525k  308G   1% /zdocs


From there, if I try to export/import that pool after stopping the services using those filesystems (docker, smbd and netatalk), it goes like this:

Code: Select all
# zpool export zdocs
# zpool import zdocs
cannot mount '/zdocs/ourcloud': directory is not empty


Then I remove all the extra files that are not supposed to be there, do another export/import, and get the expected result:

Code: Select all
# df -H /zdocs/*
Filesystem         Size  Used Avail Use% Mounted on
zdocs/dev          322G   13G  310G   4% /zdocs/dev
zdocs/docker_data  315G  5.8G  310G   2% /zdocs/docker_data
zdocs/docs         374G   33G  342G   9% /zdocs/docs
zdocs/ourcloud     310G   48M  310G   1% /zdocs/ourcloud
zdocs/photos       594G  285G  310G  48% /zdocs/photos
zdocs/server       312G  2.5G  310G   1% /zdocs/server
zdocs/software     379G   70G  310G  19% /zdocs/software
zdocs/temp         312G  2.6G  310G   1% /zdocs/temp
zdocs/TM_Brice     215G  211G  4.0G  99% /zdocs/TM_Brice


This is apparently caused by an ordering issue in the systemd startup, meaning some dependencies in the ZFS services (or in the three consumers: docker, smbd and netatalk) are set incorrectly.

I don't know whether a bug should be filed with some of the maintainers; for now I'd be happy with a fix that means I don't have to clean this up manually at every startup or reboot...

Note: I believe the mounts happen during the import, not via zfs-mount.service, and I believe that's because mountpoints are set on the pool/datasets:

Code: Select all
# zfs get mountpoint zdocs
NAME   PROPERTY    VALUE       SOURCE
zdocs  mountpoint  /zdocs      local
# zfs get mountpoint zdocs/docs
NAME        PROPERTY    VALUE        SOURCE
zdocs/docs  mountpoint  /zdocs/docs  inherited from zdocs
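
In case it's useful for diagnosing, I suppose one can compare what the mount service reports with what ZFS itself reports, something like:

Code: Select all
# Did zfs-mount.service actually run, and what did it log?
systemctl status zfs-mount.service
# Which ZFS filesystems does ZFS itself consider mounted?
zfs mount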


Thanks in advance for any help!

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby Head_on_a_Stick » 2021-01-05 20:32

Have you created a systemd service to import the bpool? See point 11 of Step 4 in the Debian buster installation guide on the ZoL wiki: https://openzfs.github.io/openzfs-docs/ ... figuration

This is just a shot in the dark; I have installed Debian buster on ZFS but only used it very briefly, so I don't know much about it.
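
For reference, the unit in that guide looks roughly like the sketch below; I've swapped in your zdocs pool instead of bpool, so treat it as untested:

Code: Select all
# /etc/systemd/system/zfs-import-zdocs.service -- untested sketch adapted from the guide
[Unit]
DefaultDependencies=no
Before=zfs-import-scan.service
Before=zfs-import-cache.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/sbin/zpool import -N zdocs

[Install]
WantedBy=zfs-import.target

If you go that route, enable it with systemctl enable zfs-import-zdocs.service.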

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-06 17:58

I appreciate you trying to help!

Here is some additional info following your input:
I did not do that step.
I use ZFS only for data storage, not for the root FS.

The import itself happens without problems, even with the existing services running.
But when the individual ZFS filesystems start to mount, other services begin writing before all of them are mounted, which causes some of the mounts to fail.
That's why I believe it's a systemd scheduling or dependency issue.

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-06 18:17

Hopefully this is something more precise:

Code: Select all
# systemd-analyze critical-chain docker.service
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

docker.service +18.427s
└─containerd.service @18.197s +617ms
  └─network.target @18.194s
    └─wpa_supplicant.service @18.072s +121ms
      └─dbus.service @18.063s
        └─basic.target @18.031s
          └─sockets.target @18.030s
            └─acpid.socket @18.017s
              └─sysinit.target @18.011s
                └─systemd-timesyncd.service @17.873s +136ms
                  └─systemd-tmpfiles-setup.service @17.835s +33ms
                    └─local-fs.target @17.830s
                      └─run-user-1001.mount @27.271s
                        └─local-fs-pre.target @2.355s
                          └─lvm2-monitor.service @272ms +2.082s
                            └─systemd-journald.socket @267ms
                              └─system.slice @258ms
                                └─-.slice @258ms



Code: Select all
# systemd-analyze critical-chain zdocs-docker_data.mount
The time after the unit is active or started is printed after the "@" character.
The time the unit takes to start is printed after the "+" character.

zdocs-docker_data.mount @1min 29.615s
└─local-fs-pre.target @2.355s
  └─lvm2-monitor.service @272ms +2.082s
    └─systemd-journald.socket @267ms
      └─system.slice @258ms
        └─-.slice @258ms


Docker takes 18 s to start, but the mount which contains its data only becomes active after 1 min 29 s.
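
So if I read that correctly, nothing orders docker.service after zdocs-docker_data.mount. What I have in mind as a cleaner fix (untested) is a drop-in along these lines, using the mount unit that shows up in the output above:

Code: Select all
# /etc/systemd/system/docker.service.d/wait-for-zfs.conf -- untested idea
[Unit]
# Adds Requires= and After= dependencies on the mount units needed
# for this path, i.e. zdocs-docker_data.mount
RequiresMountsFor=/zdocs/docker_data

and I suppose similar drop-ins for smbd and netatalk pointing at the paths they serve.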

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby Head_on_a_Stick » 2021-01-07 09:45

pierrepaap wrote: I use ZFS only for data storage, not for the root FS.

Ah, I see.

We could try changing zfs-mount.service:
Code: Select all
# systemctl edit zfs-mount.service

Then enter this:
Code: Select all
[Unit]
Before=
After=local-fs.target

Save the file and reboot to test.

If it doesn't work then delete the /etc/systemd/system/zfs-mount.service.d/ directory (which contains override.conf) to return to the stock configuration.
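
You can check whether the override actually took effect with something like:

Code: Select all
# Show the unit together with any drop-in files that modify it
systemctl cat zfs-mount.service
# After the reboot, confirm where the mount now sits in the boot ordering
systemd-analyze critical-chain zfs-mount.service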

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-13 17:22

OK, I'll try that.
In fact, I had already tried adding local-fs.target to the "After=" line of /etc/systemd/system/docker.service.d/startup_options.conf, but that did not cut it.
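
Roughly, that drop-in looked like this (from memory, so the exact content may differ):

Code: Select all
# /etc/systemd/system/docker.service.d/startup_options.conf (roughly, from memory)
[Unit]
After=local-fs.target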

I had the impression that the mounting was already a prerequisite of local-fs.target.

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-13 17:41

That did not work.

I also tried moving that override.conf to docker.service.d, but still no joy.

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-16 12:42

OK, I'll need to see how it behaves over the coming days, but I believe I've solved my issue, albeit not through a neat systemd setup...

The hack I came up with is to disable the docker service and add an @reboot entry to the crontab which starts it.
So yes, a hack, but it appears to do the job, and I expect it won't be much of a bother to me.
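
Concretely, the hack is nothing more than this (a sketch of what I described, not a copy-paste of my exact setup):

Code: Select all
# Stop systemd from starting docker itself at boot
systemctl disable docker.service

# Root crontab entry (added with "crontab -e" as root):
@reboot /bin/systemctl start docker.service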

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby Head_on_a_Stick » 2021-01-16 18:39

It is possible to add a delay to the service:
Code: Select all
# /etc/systemd/system/zfs-mount.service.d/override.conf
[Service]
ExecStartPre=/bin/sleep 30

^ That example would delay zfs-mount.service by 30 seconds.

Re: systemd issue between zfs and others (docker, smbd, netatalk)

Postby pierrepaap » 2021-01-18 18:37

Nice, thanks!
I'll try that, since my hack did not completely fix the issue.
On startup it seems to come down to the disk array and device IDs not being ready in time (for the import, I guess), which slows the whole thing down.

On a reboot, the cron hack works.
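
If the startup problem really is the devices not being ready in time for the import, maybe a drop-in on the import service would help; pure speculation on my part, untested:

Code: Select all
# /etc/systemd/system/zfs-import-cache.service.d/override.conf -- untested idea
[Service]
# Wait for udev to finish processing device events before the pool import runs
ExecStartPre=/bin/udevadm settle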

