Guix Home leaves user shepherd on logout, starts new instance on login

Status: Open
Participants: bokr, Danny Milosavljevic, Julian Flake, Jake, Ludovic Courtès, Tomas Volf
Owner: unassigned
Submitted by: Jake
Severity: important

Shepherd: Growing number of user shepherds when relogging
(address . bug-guix@gnu.org)(address . ludovic.courtes@inria.fr)
CAJqVjv_yNT19Svyd_xNVduNduuwZoWRrcGYRuQJ6=g4cmWDSaQ@mail.gmail.com
Hi

I think I'm experiencing a bug in Shepherd since version 1.0.
Whenever I log out and log back in again, my user shepherd from the
previous login session is still present, and a new user shepherd spawns for
the current login session.
So relogging N times results in N+1 user shepherds.

For example, I have relogged 5 times since I last rebooted:

$ herd status root
Status of root:
It is running since 00:30:02 (10 minutes ago).
Main PID: 23450
Command:
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
...

$ pgrep shepherd
1
9891
10777
16417
18510
21960
23450

$ ps aux | grep shepherd
root 1 0.0 0.9 222872 74456 ? Sl Dec15 0:08
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--config /gnu/store/p7al8wd1inwk8f5di2q4llcpd64mjn5q-shepherd.conf
jake 9891 0.0 0.2 75816 23624 ? Ss Dec15 0:04
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 10777 0.0 0.3 76224 24752 ? Ss Dec16 0:03
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 16417 0.0 0.3 75752 24004 ? Ss Dec16 0:02
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 18510 0.0 0.2 75752 23760 ? Ss Dec16 0:01
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 21960 0.0 0.2 114608 22124 ? Ss Dec16 0:00
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 23450 0.0 0.2 114204 21328 ? Ss 00:30 0:00
/gnu/store/mfkz7fvlfpv3ppwbkv0imb19nrf95akf-guile-3.0.9/bin/guile
--no-auto-compile
/gnu/store/nl0w5c7pxxdczqiv4r9iq44al7nd5y5g-shepherd-1.0.0/bin/shepherd
--silent --config /gnu/store/w3l6dmap815mm3qzx77xdazky853adda-shepherd.conf
jake 23672 0.0 0.0 6636 2552 pts/1 S+ 00:32 0:00 grep
--color=auto shepherd

In addition, any daemons managed by the zombie shepherds also persist!

I'm experiencing this on both of my Guix System machines. One is running
GDM and XFCE. The other is running GDM and CWM.
Please let me know if I can provide more information.

Thanks
Jake
Attachment: file
Ludovic Courtès wrote on 18 Dec 2024 14:35
(name . Jake)(address . jforst.mailman@gmail.com)(address . 74912@debbugs.gnu.org)
87r064ippt.fsf@gnu.org
Hello,

Jake <jforst.mailman@gmail.com> skribis:

> I think I'm experiencing a bug in Shepherd since version 1.0.
> Whenever I log out and log back in again, my user shepherd from the
> previous login session is still present, and a new user shepherd spawns for
> the current login session.
> So relogging N times results in N+1 user shepherds.

I have a user shepherd via Guix Home and I experience the same problem
(though because I rarely log out it’s not really annoying :-)).

I suspect the problem has to do with how Guix Home determines whether or
not it should launch shepherd, but I haven’t checked yet.

Thanks for reporting the issue,
Ludo’.
Tomas Volf wrote on 18 Dec 2024 16:29
(name . Ludovic Courtès)(address . ludo@gnu.org)
877c7w7bxi.fsf@wolfsden.cz
Ludovic Courtès <ludo@gnu.org> writes:

> Hello,
>
> Jake <jforst.mailman@gmail.com> skribis:
>
>> I think I'm experiencing a bug in Shepherd since version 1.0.
>> Whenever I log out and log back in again, my user shepherd from the
>> previous login session is still present, and a new user shepherd spawns for
>> the current login session.
>> So relogging N times results in N+1 user shepherds.
>
> I have a user shepherd via Guix Home and I experience the same problem
> (though because I rarely log out it’s not really annoying :-)).
>
> I suspect the problem has to do with how Guix Home determines whether or
> not it should launch shepherd, but I haven’t checked yet.

When you have another login session active while you log out and in
again, a new shepherd is *not* spawned. I am guessing here, but the
last logout probably causes XDG_RUNTIME_DIR to be removed (by elogind in
my case), so on login there is no /run/user/$UID/on-first-login-executed,
and the on-first-login step runs again and starts the shepherd.

But even if that were solved, since the runtime directory was nuked,
there is no shepherd socket around anymore, so the (still running)
shepherd from the previous login session cannot be contacted by herd.
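
A minimal model of the mechanism described above, written in Python rather than taken from Guix Home (whose real on-first-login script is generated Guile code). Only the flag file name comes from this thread; the function names and the temporary directory standing in for /run/user/$UID are assumptions made for illustration.

```python
import os
import shutil
import tempfile

FLAG_NAME = "on-first-login-executed"  # file name quoted in the thread


def on_first_login(runtime_dir, launch):
    """Call LAUNCH only if the flag file is absent, then create the flag."""
    flag = os.path.join(runtime_dir, FLAG_NAME)
    if os.path.exists(flag):
        return False
    launch()
    with open(flag, "w"):
        pass  # an empty file is enough to mark "already executed"
    return True


def count_launches(logins, runtime_dir_survives_logout):
    """Simulate LOGINS consecutive login/logout cycles of a single session."""
    base = tempfile.mkdtemp()
    runtime_dir = os.path.join(base, "run-user")  # stand-in for /run/user/$UID
    launches = []
    try:
        for _ in range(logins):
            os.makedirs(runtime_dir, exist_ok=True)  # login
            on_first_login(runtime_dir, lambda: launches.append("shepherd"))
            if not runtime_dir_survives_logout:
                shutil.rmtree(runtime_dir)  # last logout removes the directory
    finally:
        shutil.rmtree(base, ignore_errors=True)
    return len(launches)
```

With the directory removed at every logout, six logins (one initial login plus five relogins, as in the original report) give six launches; if the directory survived, the flag would suppress all but the first.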

Off the top of my head, I can think of two possible solutions:

1. Stop the shepherd on logout. Just as we have on-first-login, we would
have on-last-logout. I have no idea how to implement that. Maybe we
could use ~/.bash_logout? Or some PAM thing?

2. Shepherd could shut down gracefully when the control socket is deleted
from the file system. It is arguable how useful a running shepherd is
without the socket anyway.

Any other ideas?

Tomas

--
There are only two hard things in Computer Science:
cache invalidation, naming things and off-by-one errors.
Ludovic Courtès wrote on 26 Dec 2024 02:50
(name . Tomas Volf)(address . ~@wolfsden.cz)
87o70yzpk7.fsf@gnu.org
Hi!

Tomas Volf <~@wolfsden.cz> skribis:

> When you have another login session active when you log out and in
> again, new shepherd is *not* spawned. I am guessing here but probably
> last log out causes XDG_RUNTIME_DIR to be removed (by elogind in my
> case), so on log in there is no /run/user/$UID/on-first-login-executed,
> so it runs again and starts the shepherd.
>
> But even if that would be solved, since the runtime directory was nuked,
> there is no shepherd socket around anymore, so the (still running)
> shepherd from previous login session cannot be contacted by herd.

Hmm, when is /run/user/UID deleted?

> Off the top of my head I can think of two possible solutions:
>
> 1. Stop the shepherd on log out. So as we have on-first-login, we would
> have on-last-logout. I have no idea how to implement that. Maybe we
> could use ~/.bash_logout? Or some PAM thing?

Or some elogind thing, rather?

But then, how do we make it work on other distros? Maybe on systemd
distros shepherd receives SIGTERM or something, in which case it
terminates properly.

> 2. Shepherd could shutdown gracefully when the control socket is deleted
> from the file system. It is arguable how useful running shepherd is
> without the socket anyway.

I don’t think that’s workable: you’d need to poll/inotify for the
existence of that socket, but even if it exists on the file system, you
cannot tell whether it matches the socket you’re accepting on.

Ludo’.
(name . Ludovic Courtès)(address . ludo@gnu.org)
Z22RflvtBpyOHG14@BRL14v1
On +2024-12-26 11:50:00 +0100, Ludovic Courtès wrote:
> Hi!
>
> Tomas Volf <~@wolfsden.cz> skribis:
>
> > When you have another login session active when you log out and in
> > again, new shepherd is *not* spawned. I am guessing here but probably
> > last log out causes XDG_RUNTIME_DIR to be removed (by elogind in my
> > case), so on log in there is no /run/user/$UID/on-first-login-executed,
> > so it runs again and starts the shepherd.
> >
> > But even if that would be solved, since the runtime directory was nuked,
> > there is no shepherd socket around anymore, so the (still running)
> > shepherd from previous login session cannot be contacted by herd.
>
> Hmm, when is /run/user/UID deleted?
>
> > Of the top of my head I can think of two possible solutions:
> >
> > 1. Stop the shepherd on log out. So as we have on-first-login, we would
> > have on-last-logout. I have no idea how to implement that. Maybe we
> > could use ~/.bash_logout? Or some PAM thing?
>
> Or some elogind thing, rather?
>
> But then, how do we make it work on other distros? Maybe on systemd
> distros shepherd receives SIGTERM or something, in which case it
> terminates properly.
>
> > 2. Shepherd could shutdown gracefully when the control socket is deleted
> > from the file system. It is arguable how useful running shepherd is
> > without the socket anyway.
>
> I don’t think that’s workable: you’d need to poll/inotify for the
> existence of that socket, but even if it exists on the file system, you
> cannot tell whether it matches the socket you’re accepting on.
>
> Ludo’.
>
>
>

I wonder how many guix-daemon-process-relationship type problems would be simplified
if (radical vision) one let wayland's inner event-driven loop/protocol be the dispatcher
for guix processes instead of the current guix daemon switching between its collection of threads.
I.e., all the guix threads would be individual login or spawned user processes securely communicating
virtualizably (shared memory or networked rendezvous buffers etc) for offloading?
Tomas Volf wrote on 27 Dec 2024 15:19
(name . Ludovic Courtès)(address . ludo@gnu.org)
875xn44suw.fsf@wolfsden.cz
Ludovic Courtès <ludo@gnu.org> writes:

> Hi!
>
> Tomas Volf <~@wolfsden.cz> skribis:
>
>> When you have another login session active when you log out and in
>> again, new shepherd is *not* spawned. I am guessing here but probably
>> last log out causes XDG_RUNTIME_DIR to be removed (by elogind in my
>> case), so on log in there is no /run/user/$UID/on-first-login-executed,
>> so it runs again and starts the shepherd.
>>
>> But even if that would be solved, since the runtime directory was nuked,
>> there is no shepherd socket around anymore, so the (still running)
>> shepherd from previous login session cannot be contacted by herd.
>
> Hmm, when is /run/user/UID deleted?

I believe it is done by elogind (in my setup) when the last user session
(for the given UID) logs out. If I grepped right, it is done by the
user_finalize function in logind-user.c.

AFAIUT, it should be performed when the last session of the seat
terminates. So if you log in to only a single TTY, XDG_RUNTIME_DIR
will be removed on every logout.

>
>> Off the top of my head I can think of two possible solutions:
>>
>> 1. Stop the shepherd on log out. So as we have on-first-login, we would
>> have on-last-logout. I have no idea how to implement that. Maybe we
>> could use ~/.bash_logout? Or some PAM thing?
>
> Or some elogind thing, rather?

I looked through the manual page but did not find anything. There is
KillUserProcesses, but that feels like a fairly big hammer, and something
that should *not* be enabled by default.

We could patch elogind to add a new RemoveRuntimeDirectory boolean flag
that allows keeping XDG_RUNTIME_DIR even after the last logout (I
personally would prefer that behavior anyway). I am not sure what our
policy regarding patches is here.

>
> But then, how do we make it work on other distros? Maybe on systemd
> distros shepherd receives SIGTERM or something, in which case it
> terminates properly.

No idea here. ~/.bash_logout?

>
>> 2. Shepherd could shutdown gracefully when the control socket is deleted
>> from the file system. It is arguable how useful running shepherd is
>> without the socket anyway.
>
> I don’t think that’s workable: you’d need to poll/inotify for the
> existence of that socket, but even if it exists on the file system, you
> cannot tell whether it matches the socket you’re accepting on.

For files, I would suggest checking whether both `stat:dev' and `stat:ino'
match in order to detect whether it is the same file. I am not sure whether
the same strategy can be used for Unix sockets.
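
As a rough illustration of the device/inode comparison suggested above, here is a Python sketch (Shepherd itself is written in Guile, where the equivalent fields are `stat:dev' and `stat:ino'). The helper names and the temporary socket path are hypothetical. On Linux, a bound Unix socket does have a file system entry that can be passed to stat, so the comparison works mechanically; whether it is robust enough for Shepherd is a separate question.

```python
import os
import socket
import tempfile


def file_identity(path):
    """Return (st_dev, st_ino) for PATH, or None if PATH does not exist."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino)


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "socket")

        first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        first.bind(path)  # creates the socket file
        original = file_identity(path)
        unchanged = file_identity(path) == original

        # Move the original aside and bind a different socket at the same
        # path. Keeping the old file alive guarantees a distinct inode.
        os.rename(path, path + ".old")
        second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        second.bind(path)
        replaced = file_identity(path) != original

        os.unlink(path)
        missing = file_identity(path) is None

        first.close()
        second.close()
        return unchanged, replaced, missing
```

One caveat: once the old socket file has actually been deleted, the file system is free to reuse its inode number, so a matching (dev, ino) pair is strong evidence but not proof that the path still refers to the original socket.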

Tomas

Tomas Volf wrote on 27 Dec 2024 15:20
(address . bokr@bokr.com)
871pxs4srs.fsf@wolfsden.cz
I am not sure how this relates to this specific bug report, but

bokr@bokr.com writes:

> I wonder how many guix-daemon-process-relationship type problems would be simplified
> if (radical vision) one let wayland's inner event-driven loop/protocol
> be the dispatcher

not everyone uses wayland.

> for guix processes instead of the current guix daemon switching between its collection of threads.
> I.e., all the guix threads would be individual login or spawned user processes securely communicating
> virtualizably (shared memory or networked rendezvous buffers etc) for offloading?

Ludovic Courtès wrote on 7 Jan 14:58 -0800
control message for bug #74912
(address . control@debbugs.gnu.org)
87ikqqxmbz.fsf@gnu.org
severity 74912 important
quit
Julian Flake wrote on 11 Jan 13:54 -0800
merge 67863 74912
(name . GNU bug tracker automated control server)(address . control@debbugs.gnu.org)
87bjwdaue5.fsf@uni-koblenz.de
severity 67863 important
merge 67863 74912
thankyou
Ludovic Courtès wrote on 15 Mar 03:56 -0700
control message for bug #74912
(address . control@debbugs.gnu.org)
87r02yr28a.fsf@gnu.org
merge 74912 76998
quit
Ludovic Courtès wrote on 17 Mar 12:37 -0700
control message for bug #76998
(address . control@debbugs.gnu.org)
877c4nmosf.fsf@gnu.org
retitle 76998 Guix Home leaves user shepherd on logout, starts new instance on login
quit
Danny Milosavljevic wrote on 26 Mar 05:18 -0700
Re: bug#74912: Shepherd: Growing number of user shepherds when relogging
(name . Tomas Volf)(address . ~@wolfsden.cz)
871pukdlyo.fsf@friendly-machines.com
Hi,

>KillUserProcesses

Warning: That actually runs on every session logout (if enabled at all),
not just once per user. Also, I think session_stop_scope is commented
out in our elogind, so it won't actually kill anything. If it hadn't been
commented out, it would have used dbus to communicate with systemd to
stop a special (session) scope unit (see "manager_stop_unit"). That is
a good idea--to have only one guy managing all the user processes
(in order to prevent races).

>We could patch elogind to add new RemoveRuntimeDirectory boolean flag to
>allow keeping the XDG_RUNTIME_DIR even after last log out (I personally
>would prefer that behavior anyway).

About the implication:
I would prefer that random user processes not linger after I have logged
out. What possible good can come from that? And I definitely do not want
my user services to linger after I have logged out.

> ~/.bash_logout?

I think first we have to decide whether shepherd should run per user or
per session. These are not the same. This is a design decision--and it
HAS to be decided--otherwise nothing will work right. There is a risk
of data loss (backups run by shepherd step on each other's toes etc)
until that's decided.

I think shepherd should be run once per user, not per session.

I also think the on-first-login handling in guix home means that at
least guix home has already decided on shepherd once per user.

There used to be a check in shepherd to ensure that it can only run at
most once per user at the same time. It wasn't perfect--but I mean that
even shepherd itself apparently had decided on shepherd once per user.

>>> 2. Shepherd could shutdown gracefully when the control socket is deleted
>> from the file system. It is arguable how useful running shepherd is
>> without the socket anyway.

I recommend against magic like this. I don't think it's possible to do this
in a way that is atomic.

Also, in an ideal world this would have been the way things worked in the
first place--but we aren't in that world. So I don't think it would be
wise to single out just one UNIX program, shepherd, and do it just for
that.
If you want to do stuff like that, add it to the POSIX standard.
Otherwise it's too surprising.

I would suggest the following:

(1) For Guix native, patch elogind[i] to also kill -TERM shepherd
(See user_stop_service--which is for that).
How does it find the shepherd process, specifically?

So elogind probably could also start

/run/current-system/profile/bin/shepherd
(with which config?)

on first user session login (and remember its pid)
(See user_start_service--which is for that, anyway).

elogind also has control over the directory with the socket file, so
I think it's the best place to also control the process.

Alternatively, we'd tell system shepherd to do it.
If shepherd could do dbus, dbus is already hooked up in elogind.

elogind's "sd_event_source" already has "child": "process_owned",
"exited", "waited"; and "sd_event_add_child" exists and is used for
"brightness_writer_fork"--haha totally random functionality.
But that means there's already a process manager hooked up in elogind.
It also has "kill_and_sigcont" and/or "sigterm_wait"--which we'd
probably use.

(2) When a foreign distro uses systemd (there's a very high chance it
does), then we can just install shepherd as a systemd user unit
(from guix-install.sh). systemd will do the right thing, the end.

(3) Maybe use .bash_logout and have it invoke "w" (or "loginctl") to see
whether we are the last session of that user (that would have a race...).
If we are, then kill shepherd.

I have seen bugs that it doesn't add an entry to "w" even though you
logged in. Then we'd be out of luck for (3).

Also, it would have a race anyway--even otherwise.

So maybe let's not do (3)--although it was a good find (cool that that
exists!).
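
For what option (3) might involve, here is a hedged Python sketch of the "am I the last session of this user?" test. It assumes that the first two columns of `loginctl list-sessions --no-legend` are the session ID and the UID; the sample listing is made up for illustration, and the function names are hypothetical. The race mentioned above remains: another session can start between the check and the moment shepherd is stopped.

```python
def sessions_for_uid(listing, uid):
    """Return the session IDs owned by UID in a `loginctl list-sessions` listing."""
    ids = []
    for line in listing.splitlines():
        fields = line.split()
        # Assumed layout: SESSION UID USER ... (remaining columns ignored).
        if len(fields) >= 2 and fields[1] == str(uid):
            ids.append(fields[0])
    return ids


def is_last_session(listing, uid, my_session):
    """True if MY_SESSION is the only remaining session of UID."""
    return sessions_for_uid(listing, uid) == [my_session]


# Made-up sample output, for illustration only.
SAMPLE = """\
 2 1000 jake  seat0 tty2
 5 1000 jake  -     pts/3
 7 1001 alice seat0 tty3
"""
```

In a real ~/.bash_logout hook, the listing would come from running loginctl (for example through `subprocess.run`), and the current session ID would have to be obtained from the environment; both are left out here because they depend on the login manager in use.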

------

What about shepherd's child processes (for example services)?
Will shepherd clean those up on shepherd termination?

There are also abstract UNIX domain sockets (think URN) that don't have
or need a filesystem entry.
It might be a good idea to use those for shepherd and prevent the problems
stemming from the /run/user/xxx deletion. But in my opinion, stopping the
user shepherd (once the user has logged out of all their sessions) is more
important than that, anyway.
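
To make the abstract-socket idea concrete, here is a small Linux-only Python sketch. It is not how Shepherd works today; the socket name is hypothetical. An abstract address is selected by a leading NUL byte, has no file system entry, and cannot be bound twice while the first socket is open.

```python
import os
import socket


def bind_abstract(name):
    """Listen on an abstract-namespace Unix socket (Linux only)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(b"\0" + name.encode())  # leading NUL selects the abstract namespace
    sock.listen(1)
    return sock


def demo():
    name = "shepherd-demo-%d" % os.getpid()  # hypothetical, per-process name
    server = bind_abstract(name)
    try:
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(b"\0" + name.encode())
        client.close()
        connected = True

        try:
            bind_abstract(name).close()
            second_bind_refused = False
        except OSError:  # address already in use
            second_bind_refused = True

        on_disk = os.path.exists(name)  # nothing is created in the file system
    finally:
        server.close()
    return connected, second_bind_refused, on_disk
```

A design point to weigh: abstract sockets are not protected by file permissions, so any local process in the same network namespace can attempt to connect, whereas a socket under /run/user/$UID inherits the directory's access restrictions.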

[i] Would cause 3571 dependents to rebuild

P.S. in elogind, almost the entire cgroup handling in src/core/cgroup.c
has been disabled. That's disappointing. Someday, we should have cgroup
support as well!
Ludovic Courtès wrote on 1 Apr 03:13 -0700
(name . Danny Milosavljevic)(address . dannym@friendly-machines.com)
87plhwfaug.fsf@gnu.org
Hi Danny,

Danny Milosavljevic <dannym@friendly-machines.com> skribis:

> I would suggest the following:
>
> (1) For Guix native, patch elogind[i] to also kill -TERM shepherd
> (See user_stop_service--which is for that).
> How does it find the shepherd process, specifically?

I think ‘user_stop_service’ could run:

herd stop root -s /run/user/$UID/shepherd/socket

> So elogind probably could also start
>
> /run/current-system/profile/bin/shepherd
> (with which config?)
>
> on first user session login (and remember its pid)
> (See user_start_service--which is for that, anyway).

Oh yes, that too.

> (2) When a foreign distro uses systemd (there's a very high chance it
> does), then we can just install shepherd as a systemd user unit
> (from guix-install.sh). systemd will do the right thing, the end.

I wouldn’t do it from ‘guix-install.sh’ because it only makes sense if
you’re going to use Guix Home; and if you use Guix Home, it has its own
way of starting shepherd.

> (3) Maybe use .bash_logout and have it invoke "w" (or "loginctl") to see
> whether we are the last session of that user (that would have a race...).
> If we are, then kill shepherd.

Yes.

The question is how to keep Home portable between Guix and foreign distros.
Neither the elogind nor the systemd approach is portable; the
‘.bash_logout’ thing may be portable, but it’s probably more fragile.

Maybe we shouldn’t try to be portable, and first start by fixing the
problem on Guix System?

> What about shepherd's child processes (for example services)?
> Will shepherd clean those up on shepherd termination?

Yes: if you ‘herd stop root’ or send SIGTERM to shepherd, it will shut
down all the services properly.

Thanks,
Ludo’.
Ludovic Courtès wrote 3 days ago
(name . Danny Milosavljevic)(address . dannym@friendly-machines.com)
874iyrkvx7.fsf@gnu.org
Hi Danny and all,

Following reports by Daniel Littlewood, who talked about involuntarily
running a second shepherd instance shadowing the previous one (this time
not in a Guix Home context), I realized shepherd itself could avoid this
entirely.

So shepherd will now refuse to start when it determines that an instance
is already listening on its socket:


Feedback welcome!

Ludo’.
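
The kind of check described in this last message (refusing to start when an instance already listens on the socket) can be sketched as follows. This is an illustration in Python, not the actual Shepherd change, which is written in Guile and is not reproduced in this thread; the function names and the temporary path are assumptions.

```python
import os
import socket
import tempfile


def instance_listening(path):
    """Return True if something accepts connections on the Unix socket at PATH."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
        return True
    except (FileNotFoundError, ConnectionRefusedError):
        # No socket file at all, or a stale file that nobody listens on.
        return False
    finally:
        probe.close()


def demo():
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "socket")
        before = instance_listening(path)  # no socket file yet

        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        during = instance_listening(path)  # a live listener answers

        server.close()  # the file stays behind, the listener is gone
        after = instance_listening(path)
        return before, during, after
```

Note that such a check only prevents two instances from sharing one socket path. It does not by itself cover the case reported here, where the runtime directory (and with it the socket file) has been deleted while the old shepherd keeps running.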