GNU bug report logs

#23170 Shepherd doesn't restart previously running dependent services

PackageSource(s)Maintainer(s)
guix PTS Buildd Popcon
Reply or subscribe to this bug. View this bug as an mbox, status mbox, or maintainer mbox

Report forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Thu, 31 Mar 2016 13:24:01 GMT) (full text, mbox, link).


Acknowledgement sent to "Thompson, David" <dthompson2@worcester.edu>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org. (Thu, 31 Mar 2016 13:24:01 GMT) (full text, mbox, link).


Message #5 received at submit@debbugs.gnu.org (full text, mbox, reply):

From: "Thompson, David" <dthompson2@worcester.edu>
To: bug-guix@gnu.org
Subject: Shepherd doesn't restart previously running dependent services
Date: Thu, 31 Mar 2016 09:23:24 -0400
If service 'foo' is depended on by service 'bar', and both are
running, and one runs 'herd restart foo', both 'foo' and 'bar' are
stopped, but then only 'foo' is restarted.  I think that the dependent
services that were stopped should also be restarted, otherwise the
user has to manually start them back up one at a time.

- Dave




Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sat, 25 Aug 2018 11:35:02 GMT) (full text, mbox, link).


Message #8 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Carlo Zancanaro <carlo@zancanaro.id.au>
To: 23170@debbugs.gnu.org
Subject: [PATCH shepherd] Restart dependent services on service restart
Date: Sat, 25 Aug 2018 21:33:47 +1000
[Message part 1 (text/plain, inline)]
I've written a patch to fix this. It's not super smart, but it 
should do the job.

It currently targets the branch after my patch in #32408[1], but 
it's technically an independent change.

[1]: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=32408

[0001-service-Restart-dependent-services-on-service-restar.patch (text/x-patch, inline)]
From 50dd3ef4888b04ea3b869da893b23ad69fad8971 Mon Sep 17 00:00:00 2001
From: Carlo Zancanaro <carlo@zancanaro.id.au>
Date: Sat, 25 Aug 2018 20:32:11 +1000
Subject: [PATCH] service: Restart dependent services on service restart

* modules/shepherd/service.scm (required-by?): New procedure.
(stop): Return a list of canonical-names for stopped dependent services,
including transitive dependencies.
(action)[restart]: Start services based on the return value of stop.
(fold-services): New procedure.
* tests/restart.sh: New file.
* Makefile.am (TESTS): Add tests/restart.sh.
---
 Makefile.am                  |  1 +
 modules/shepherd/service.scm | 90 ++++++++++++++++++++++--------------
 tests/restart.sh             | 77 ++++++++++++++++++++++++++++++
 3 files changed, 133 insertions(+), 35 deletions(-)
 create mode 100644 tests/restart.sh

diff --git a/Makefile.am b/Makefile.am
index 4322d7f..d9e21e9 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -187,6 +187,7 @@ TESTS =						\
   tests/replacement.sh				\
   tests/respawn.sh				\
   tests/respawn-throttling.sh			\
+  tests/restart.sh				\
   tests/misbehaved-client.sh			\
   tests/no-home.sh				\
   tests/pid-file.sh				\
diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 006309c..510a5ea 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -358,61 +358,72 @@ NEW-SERVICE."
     (for-each remove-service (provided-by old-service))
     (register-services new-service)))
 
+(define (required-by? service dependent)
+  "Returns #t if DEPENDENT directly requires SERVICE in order to run.  Returns
+#f otherwise."
+  (and (find (lambda (dependency)
+               (memq dependency (provided-by service)))
+             (required-by dependent))
+       #t))
+
 ;; Stop the service, including services that depend on it.  If the
 ;; latter fails, continue anyway.  Return `#f' if it could be stopped.
-(define-method (stop (obj <service>) . args)
+(define-method (stop (service <service>) . args)
+  "Stop SERVICE, and any services which depend on it.  Returns a list of
+canonical names for all of the services which have been stopped (including
+transitive dependent services).  This method will print a warning if SERVICE
+is not already running, and will return SERVICE's canonical name in a list."
   ;; Block asyncs so the SIGCHLD handler doesn't execute concurrently.
-  ;; Notably, that makes sure the handler processes the SIGCHLD for OBJ's
-  ;; process once we're done; otherwise, it could end up respawning OBJ.
+  ;; Notably, that makes sure the handler processes the SIGCHLD for SERVICE's
+  ;; process once we're done; otherwise, it could end up respawning SERVICE.
   (call-with-blocked-asyncs
    (lambda ()
-     (if (not (running? obj))
-         (local-output "Service ~a is not running." (canonical-name obj))
-         (if (slot-ref obj 'stop-delay?)
+     (if (not (running? service))
+         (begin
+           (local-output "Service ~a is not running." (canonical-name service))
+           (list (canonical-name service)))
+         (if (slot-ref service 'stop-delay?)
              (begin
-               (slot-set! obj 'waiting-for-termination? #t)
+               (slot-set! service 'waiting-for-termination? #t)
                (local-output "Service ~a pending to be stopped."
-                             (canonical-name obj)))
-             (begin
-               ;; Stop services that depend on it.
-               (for-each-service
-                (lambda (serv)
-                  (and (running? serv)
-                       (for-each (lambda (sym)
-                                   (and (memq sym (provided-by obj))
-                                        (stop serv)))
-                                 (required-by serv)))))
-
+                             (canonical-name service))
+               (list (canonical-name service)))
+             (let ((name (canonical-name service))
+                   (stopped-dependents (fold-services (lambda (other acc)
+                                                        (if (and (running? other)
+                                                                 (required-by? service other))
+                                                            (append (stop other) acc)
+                                                            acc))
+                                                      '())))
                ;; Stop the service itself.
                (catch #t
                  (lambda ()
-                   (apply (slot-ref obj 'stop)
-                          (slot-ref obj 'running)
+                   (apply (slot-ref service 'stop)
+                          (slot-ref service 'running)
                           args))
                  (lambda (key . args)
                    ;; Special case: 'root' may quit.
-                   (and (eq? root-service obj)
+                   (and (eq? root-service service)
                         (eq? key 'quit)
                         (apply quit args))
                    (caught-error key args)))
 
-               ;; OBJ is no longer running.
-               (slot-set! obj 'running #f)
+               ;; SERVICE is no longer running.
+               (slot-set! service 'running #f)
 
                ;; Reset the list of respawns.
-               (slot-set! obj 'last-respawns '())
+               (slot-set! service 'last-respawns '())
 
                ;; Replace the service with its replacement, if it has one
-               (let ((replacement (slot-ref obj 'replacement)))
+               (let ((replacement (slot-ref service 'replacement)))
                  (when replacement
-                   (replace-service obj replacement)))
+                   (replace-service service replacement)))
 
                ;; Status message.
-               (let ((name (canonical-name obj)))
-                 (if (running? obj)
-                     (local-output "Service ~a could not be stopped." name)
-                     (local-output "Service ~a has been stopped." name))))))
-     (slot-ref obj 'running))))
+               (if (running? service)
+                   (local-output "Service ~a could not be stopped." name)
+                   (local-output "Service ~a has been stopped." name))
+               (cons name stopped-dependents)))))))
 
 ;; Call action THE-ACTION with ARGS.
 (define-method (action (obj <service>) the-action . args)
@@ -423,10 +434,9 @@ NEW-SERVICE."
       ;; Restarting is done in the obvious way.
       ((restart)
        (lambda (running . args)
-         (if running
-             (stop obj)
-             (local-output "~a was not running." (canonical-name obj)))
-         (apply start obj args)))
+         (let ((stopped-services (stop obj)))
+           (for-each start stopped-services)
+           #t)))
       ((status)
        ;; Return the service itself.  It is automatically converted to an sexp
        ;; via 'result->sexp' and sent to the client.
@@ -953,6 +963,16 @@ Return #f if service is not found."
           (eq? name (canonical-name service)))
         services))
 
+(define (fold-services proc init)
+  "Apply PROC to the registered services to build a result, and return that
+result.  Works in a manner akin to `fold' from SRFI-1."
+  (hash-fold (lambda (name services acc)
+               (let ((service (lookup-canonical-service name services)))
+                 (if service
+                     (proc service acc)
+                     acc)))
+             init %services))
+
 (define (for-each-service proc)
   "Call PROC for each registered service."
   (hash-for-each (lambda (name services)
diff --git a/tests/restart.sh b/tests/restart.sh
new file mode 100644
index 0000000..92a1f79
--- /dev/null
+++ b/tests/restart.sh
@@ -0,0 +1,77 @@
+# GNU Shepherd --- Test restarting services.
+# Copyright © 2013, 2014, 2016 Ludovic Courtès <ludo@gnu.org>
+# Copyright © 2018 Carlo Zancanaro <carlo@zancanaro.id.au>
+#
+# This file is part of the GNU Shepherd.
+#
+# The GNU Shepherd is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or (at
+# your option) any later version.
+#
+# The GNU Shepherd is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with the GNU Shepherd.  If not, see <http://www.gnu.org/licenses/>.
+
+shepherd --version
+herd --version
+
+socket="t-socket-$$"
+conf="t-conf-$$"
+log="t-log-$$"
+pid="t-pid-$$"
+
+herd="herd -s $socket"
+
+trap "cat $log || true ;
+  rm -f $socket $conf $log;
+  test -f $pid && kill \`cat $pid\` || true ; rm -f $pid" EXIT
+
+cat > "$conf"<<EOF
+(register-services
+ (make <service>
+   #:provides '(test1)
+   #:start (const #t)
+   #:stop  (const #t))
+ (make <service>
+   #:provides '(test2)
+   #:requires '(test1)
+   #:start (const #t)
+   #:stop  (const #t))
+ (make <service>
+   #:provides '(test3)
+   #:requires '(test2)
+   #:start (const #t)
+   #:stop  (const #t)))
+EOF
+
+rm -f "$pid"
+shepherd -I -s "$socket" -c "$conf" -l "$log" --pid="$pid" &
+
+while ! test -f "$pid" ; do sleep 0.3 ; done
+
+# Start some test services, and make sure they behave how we expect
+$herd start test1
+$herd start test2
+$herd status test1 | grep started
+$herd status test2 | grep started
+
+# Restart test1 and make sure that both services are still running (ie. that
+# test2 hasn't been stopped)
+$herd restart test1
+$herd status test1 | grep started
+$herd status test2 | grep started
+
+# Now let's test with a transitive dependency
+$herd start test3
+$herd status test3 | grep started
+
+# After restarting test1 we want test3 to still be running
+$herd restart test1
+$herd status test1 | grep started
+$herd status test2 | grep started
+$herd status test3 | grep started
-- 
2.18.0

[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sat, 25 Aug 2018 14:42:02 GMT) (full text, mbox, link).


Message #11 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Carlo Zancanaro <carlo@zancanaro.id.au>
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Sat, 25 Aug 2018 16:41:03 +0200
Hi again!  :-)

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> I've written a patch to fix this. It's not super smart, but it should
> do the job.

I wonder if there are cases where one might want to restart a service
without restarting its dependent services.  We can probably ignore it
for now, but perhaps we’ll need to add a flag or a separate action later.

Thoughts?

> From 50dd3ef4888b04ea3b869da893b23ad69fad8971 Mon Sep 17 00:00:00 2001
> From: Carlo Zancanaro <carlo@zancanaro.id.au>
> Date: Sat, 25 Aug 2018 20:32:11 +1000
> Subject: [PATCH] service: Restart dependent services on service restart
>
> * modules/shepherd/service.scm (required-by?): New procedure.
> (stop): Return a list of canonical-names for stopped dependent services,
> including transitive dependencies.
> (action)[restart]: Start services based on the return value of stop.
> (fold-services): New procedure.
> * tests/restart.sh: New file.
> * Makefile.am (TESTS): Add tests/restart.sh.

[...]

> +# Restart test1 and make sure that both services are still running (ie. that
> +# test2 hasn't been stopped)
> +$herd restart test1
> +$herd status test1 | grep started
> +$herd status test2 | grep started
> +
> +# Now let's test with a transitive dependency
> +$herd start test3
> +$herd status test3 | grep started
> +
> +# After restarting test1 we want test3 to still be running
> +$herd restart test1
> +$herd status test1 | grep started
> +$herd status test2 | grep started
> +$herd status test3 | grep started

For clarity, should we do an explicit “herd stop test1” followed by
“herd start test1”?  I know it’s currently equivalent under the hood,
but it might be slightly clearer.  WDYT?

Otherwise it LGTM.  Thanks for addressing these longstanding issues!

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sat, 25 Aug 2018 22:50:02 GMT) (full text, mbox, link).


Message #14 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Carlo Zancanaro <carlo@zancanaro.id.au>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Sun, 26 Aug 2018 08:48:47 +1000
[Message part 1 (text/plain, inline)]
On Sun, Aug 26 2018, Ludovic Courtès wrote:
> I wonder if there are cases where one might want to restart a 
> service without restarting its dependent services.  We can 
> probably ignore it for now, but perhaps we’ll need to add a flag 
> or a separate action later.
>
> Thoughts?

I think this is best served by 'herd stop', followed by 'herd 
start'. This patch just special-cases the 'restart' action, so 
manually stopping then starting a service will behave as the old 
restart used to.

> For clarity, should we do an explicit “herd stop test1” followed 
> by “herd start test1”?  I know it’s currently equivalent under 
> the hood, but it might be slightly clearer.  WDYT?

Hopefully the above also answers this, too. I did consider whether 
it was worth adding a test for 'herd stop' to make sure it still 
stops dependent services, and 'herd start' to make sure it doesn't 
start dependent services, but in the end I decided not to. I'm 
happy to send through another patch to test these cases, though, 
if you think it would be worthwhile.

Carlo
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sun, 26 Aug 2018 21:09:01 GMT) (full text, mbox, link).


Message #17 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Carlo Zancanaro <carlo@zancanaro.id.au>
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Sun, 26 Aug 2018 23:08:34 +0200
Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> On Sun, Aug 26 2018, Ludovic Courtès wrote:
>> I wonder if there are cases where one might want to restart a
>> service without restarting its dependent services.  We can probably
>> ignore it for now, but perhaps we’ll need to add a flag or a
>> separate action later.
>>
>> Thoughts?
>
> I think this is best served by 'herd stop', followed by 'herd
> start'. This patch just special-cases the 'restart' action, so
> manually stopping then starting a service will behave as the old
> restart used to.

Great, I had overlooked this.

>> For clarity, should we do an explicit “herd stop test1” followed by
>> “herd start test1”?  I know it’s currently equivalent under the
>> hood, but it might be slightly clearer.  WDYT?
>
> Hopefully the above also answers this, too.

It does, thanks!

> I did consider whether it was worth adding a test for 'herd stop' to
> make sure it still stops dependent services, and 'herd start' to make
> sure it doesn't start dependent services, but in the end I decided not
> to. I'm happy to send through another patch to test these cases,
> though, if you think it would be worthwhile.

No, that’s fine.

I forgot if this was already done, but perhaps you can add a bit in the
manual to insist that ‘restart’ is not quite the same as ‘stop’ +
‘start’.

Anyway, it all LGTM, thanks!

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sun, 26 Aug 2018 22:06:02 GMT) (full text, mbox, link).


Message #20 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Carlo Zancanaro <carlo@zancanaro.id.au>
To: ludo@gnu.org
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Mon, 27 Aug 2018 08:05:45 +1000
[Message part 1 (text/plain, inline)]
Hey Ludo’!


On 27 August 2018 7:08:34 am AEST, ludo@gnu.org wrote:
>I forgot if this was already done, but perhaps you can add a bit in the manual to insist that ‘restart’ is not quite the same as ‘stop’ + ‘start’.

I hadn't done that, but I have now. There aren't many mentions of restart in the manual, but I changed the one that seemed most relevant.

>Anyway, it all LGTM, thanks!

Pushed! Thanks for the review.

Carlo
[Message part 2 (text/html, inline)]

Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Sun, 26 Aug 2018 22:06:02 GMT) (full text, mbox, link).


Message #23 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Carlo Zancanaro <carlo@zancanaro.id.au>
To: ludo@gnu.org
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Mon, 27 Aug 2018 08:05:17 +1000
Hey Ludo’!


On 27 August 2018 7:08:34 am AEST, ludo@gnu.org wrote:
>I forgot if this was already done, but perhaps you can add a bit in the manual to insist that ‘restart’ is not quite the same as ‘stop’ + ‘start’.

I hadn't done that, but I have now. There aren't many mentions of restart in the manual, but I changed the one that seemed most relevant.

>Anyway, it all LGTM, thanks!

Pushed! Thanks for the review.

Carlo




Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Mon, 27 Aug 2018 11:06:01 GMT) (full text, mbox, link).


Message #26 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Carlo Zancanaro <carlo@zancanaro.id.au>
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Mon, 27 Aug 2018 13:05:07 +0200
Hi,

Carlo Zancanaro <carlo@zancanaro.id.au> skribis:

> Pushed! Thanks for the review.

Great!

I see that you also reverted the patch that removed the ‘EINTR-safe’
workaround.  Could you explain why that was necessary?  (It should not
be necessary with current Guile versions.)

Thank you,
Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Mon, 27 Aug 2018 12:43:02 GMT) (full text, mbox, link).


Message #29 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Carlo Zancanaro <carlo@zancanaro.id.au>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 23170@debbugs.gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Mon, 27 Aug 2018 22:42:40 +1000
[Message part 1 (text/plain, inline)]
On Mon, Aug 27 2018, Ludovic Courtès wrote:
> I see that you also reverted the patch that removed the 
> ‘EINTR-safe’ workaround.  Could you explain why that was 
> necessary?  (It should not be necessary with current Guile 
> versions.)

I'm not really sure of the details, but as I mentioned on IRC, 
that commit stopped me from being able to boot. I grafted a new 
version of the Shepherd and reconfigured my system, and when it 
came up my screen was filled with messages complaining that it 
couldn't create things in /dev because they already existed. I 
tested the commits since the previous release and that was the one 
that caused my problem.

I'd like to investigate further, but I have no idea what could be 
causing it. All I have to go on so far is that I think the udev 
service is getting respawned repeatedly, but I don't know why that 
commit would cause that problem. After reverting that commit I 
could successfully boot back into my system and everything is 
working properly again.
[signature.asc (application/pgp-signature, inline)]

Information forwarded to bug-guix@gnu.org:
bug#23170; Package guix. (Mon, 27 Aug 2018 17:54:01 GMT) (full text, mbox, link).


Message #32 received at 23170@debbugs.gnu.org (full text, mbox, reply):

From: Clément Lassieur <clement@lassieur.org>
To: Carlo Zancanaro <carlo@zancanaro.id.au>
Cc: 23170@debbugs.gnu.org, ludo@gnu.org
Subject: Re: bug#23170: [PATCH shepherd] Restart dependent services on service restart
Date: Mon, 27 Aug 2018 19:53:01 +0200
Carlo Zancanaro <carlo@zancanaro.id.au> writes:

>>Anyway, it all LGTM, thanks!
>
> Pushed! Thanks for the review.
>
> Carlo

Awesome!  Thanks a lot Carlo for working on this :-)




Added tag(s) fixed. Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Sat, 05 Jan 2019 13:54:01 GMT) (full text, mbox, link).


bug closed, send any further explanations to 23170@debbugs.gnu.org and "Thompson, David" <dthompson2@worcester.edu> Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Sat, 05 Jan 2019 13:54:02 GMT) (full text, mbox, link).


bug archived. Request was from Debbugs Internal Request <help-debbugs@gnu.org> to internal_control@debbugs.gnu.org. (Sun, 03 Feb 2019 12:24:05 GMT) (full text, mbox, link).


Send a report that this bug log contains spam.


debbugs.gnu.org maintainers <help-debbugs@gnu.org>. Last modified: Mon Nov 4 22:06:36 2024; Machine Name: wallace-server

GNU bug tracking system

Debbugs is free software and licensed under the terms of the GNU Public License version 2. The current version can be obtained from https://bugs.debian.org/debbugs-source/.

Copyright © 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson, 2005-2017 Don Armstrong, and many other contributors.