Finally, TIL what can cause systemd services to hang indefinitely. The internet is flooded with reports on this topic, but offers no clear answers. So no more uselessly suggested workarounds like:
systemctl daemon-reload
and
systemctl daemon-reexec
for this scenario.
The scene would be something along the lines of:
rrs 6467 0.0 0.0 23088 15852 pts/1 Ss 12:53 0:00 \_ /bin/bash
rrs 11512 0.0 0.0 14876 4608 pts/1 S+ 13:18 0:00 \_ systemctl restart snapper-timeline.timer
rrs 11513 0.0 0.0 14984 3076 pts/1 S+ 13:18 0:00 \_ /bin/systemd-tty-ask-password-agent --watch
rrs 11514 0.0 0.0 234756 6752 pts/1 Sl+ 13:18 0:00 \_ /usr/bin/pkttyagent --notify-fd 5 --fallback
The
snapper-timeline
service is important to me, and having it not run for months is a complete failure. Disappointingly, commands like
systemctl --failed
do not report this oddity. The overall system status is reported as fine, which is completely incorrect.
Thankfully, a kind soul's comment gave the hint. The problem is that you could have certain services stuck in
Activating
status, which then quietly blocks the services ordered after them. So much for the unnecessary fun.
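Since nothing is reported as failed in this situation, one way to notice it is to look for units sitting in the activating state. Below is a minimal sketch, not a polished monitoring check. It assumes input in the column layout printed by systemctl list-units --all --plain --no-legend (unit, load, active, sub, description), and the function name check_no_activating is made up for illustration.

```shell
# Reads unit listings on stdin, one unit per line, in the assumed layout:
#   UNIT LOAD ACTIVE SUB DESCRIPTION...
# Prints every unit whose ACTIVE column is "activating" and returns 1
# if at least one was found, 0 otherwise.
check_no_activating() {
    stuck=$(awk '$3 == "activating" { print $1 }')
    if [ -n "$stuck" ]; then
        # Unquoted on purpose: printf repeats the format once per unit name.
        printf 'still activating: %s\n' $stuck
        return 1
    fi
    return 0
}
```

It could be fed with: systemctl list-units --all --plain --no-legend | check_no_activating. A unit that is legitimately mid-start (during boot, for example) would also be flagged, so a hit is only suspicious if it persists.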
Looking further, in my case, it was:
rrs@priyasi:~$ systemctl list-jobs
JOB UNIT TYPE STATE
81 timers.target start waiting
85 man-db.timer start waiting
88 fstrim.timer start waiting
3832 snapper-timeline.service start waiting
83 snapper-timeline.timer start waiting
39 systemd-time-wait-sync.service start running
87 logrotate.timer start waiting
84 debspawn-clear-caches.timer start waiting
89 plocate-updatedb.timer start waiting
91 dpkg-db-backup.timer start waiting
93 e2scrub_all.timer start waiting
40 time-sync.target start waiting
86 apt-listbugs.timer start waiting
13 jobs listed.
13:12
That was it. I knew the
systemd-timesyncd
service had given me enough headaches in the past. And so it was this time, just quietly doing it all again.
rrs@priyasi:~$ systemctl status systemd-time-wait-sync.service
systemd-time-wait-sync.service - Wait Until Kernel Time Synchronized
Loaded: loaded (/lib/systemd/system/systemd-time-wait-sync.service; enabled; vendor preset>
Active: activating (start) since Fri 2022-04-22 13:14:25 IST; 1min 38s ago
Docs: man:systemd-time-wait-sync.service(8)
Main PID: 11090 (systemd-time-wa)
Tasks: 1 (limit: 37051)
Memory: 836.0K
CPU: 7ms
CGroup: /system.slice/systemd-time-wait-sync.service
11090 /lib/systemd/systemd-time-wait-sync
Apr 22 13:14:25 priyasi systemd[1]: Starting Wait Until Kernel Time Synchronized...
Apr 22 13:14:25 priyasi systemd-time-wait-sync[11090]: adjtime state 5 status 40 time Fri 2022->
13:16 => 3
Dear LazyWeb, does anybody know why the
systemd-time-wait-sync
service would hang indefinitely? I have identical setups on many machines on the same network, and the others don't exhibit this problem.
rrs@priyasi:~$ systemctl cat systemd-time-wait-sync.service
...snipped...
[Service]
Type=oneshot
ExecStart=/lib/systemd/systemd-time-wait-sync
TimeoutStartSec=infinity
RemainAfterExit=yes
[Install]
WantedBy=sysinit.target
The
TimeoutStartSec=infinity
is definitely a setting that shouldn't be shipped in any system service. There are use cases for it, but that should be left for local admins to decide explicitly. Hanging for
infinity
is not a desired behavior for a system service.
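For a local admin who would rather have a finite limit, a drop-in file can override the shipped value without touching the unit under /lib. The following is only a sketch under assumptions: the helper name write_timeout_dropin is hypothetical, the 90s value in the usage note is an arbitrary pick and not a recommendation, and /etc/systemd/system is assumed to be the place for local overrides.

```shell
# Writes a drop-in that overrides TimeoutStartSec for one unit.
#   $1  unit name, e.g. systemd-time-wait-sync.service
#   $2  timeout value, e.g. 90s (illustrative, pick your own)
#   $3  base directory for local unit overrides (assumed default below)
write_timeout_dropin() {
    unit=$1
    timeout=$2
    root=${3:-/etc/systemd/system}
    mkdir -p "$root/$unit.d" || return 1
    printf '[Service]\nTimeoutStartSec=%s\n' "$timeout" \
        > "$root/$unit.d/override.conf"
}
```

Run as root, for example: write_timeout_dropin systemd-time-wait-sync.service 90s, followed by systemctl daemon-reload so the drop-in is picked up. With a finite timeout, a start that never completes should end up as a failed unit instead of waiting forever, which at least makes it visible in systemctl --failed.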
In figuring all this out, today I learnt the handy
systemctl list-jobs
command, which lists the active
running/blocked/waiting
jobs.
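In a listing like the one above, the interesting line is the single job in the running state, since everything else is waiting. A small sketch to pull that out, assuming the four-column layout shown earlier (job id, unit, type, state); the function name running_jobs is made up for illustration.

```shell
# Reads systemctl list-jobs output on stdin and prints the unit name of
# every job whose state is "running". Requiring a numeric job id in the
# first column skips the header and the "N jobs listed." footer.
running_jobs() {
    awk '$1 ~ /^[0-9]+$/ && $4 == "running" { print $2 }'
}
```

Usage would be: systemctl list-jobs | running_jobs. On the output above this would print systemd-time-wait-sync.service, which is the unit to inspect next with systemctl status.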