GNU bug report logs

#66684 [shepherd] Altering system time renders herd unresponsive


Report forwarded to bug-guix@gnu.org:
bug#66684; Package guix. (Sun, 22 Oct 2023 16:42:02 GMT).


Acknowledgement sent to Vladilen Kozin <vladilen.kozin@gmail.com>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org. (Sun, 22 Oct 2023 16:42:02 GMT).


Message #5 received at submit@debbugs.gnu.org:

From: Vladilen Kozin <vladilen.kozin@gmail.com>
To: bug-guix@gnu.org
Subject: [shepherd] Altering system time renders herd unresponsive
Date: Sun, 22 Oct 2023 14:43:28 +0100

Hello guix.

My server would consistently run with system time 1h ahead of actual.
Both `date` and `hwclock` would show the same time off by 1hr, while
BIOS showed me correct time. I'm not sure why, but some services won't
run if time difference is e.g. over 15min or smth, so.

$ sudo date -s '-1 hour'

fixes time but causes `herd` to become unresponsive as in you type a
command, any command and stare at tty stuck. Also ssh'ing into the
system becomes impossible. Every attempt gets logged in
/var/log/messages, which I can see, but again you just stare at an
unresponsive terminal. Initially I thought it had fried shepherd
completely, so I power-cycled the system to get it back. `sudo reboot`,
being an alias for a `herd` command, will of course not work, so you
have to do it physically. Annoying but feasible on a desktop system; a
complete nightmare on a physical server, which may take up to 20 min to
reboot due to inventory lifecycle and such.

By chance, I got distracted this time and just left it hanging. Lo and
behold, it unfroze some 15-20 min later. Why, I have no clue.

I hope I won't be seeing this particular issue again, because I
followed the system clock change with:
$ sudo hwclock -w
and after a reboot the system shows the correct time.

In general my experience with shepherd has been less than stellar.
IMO, this just shouldn't ever happen with PID 1, because there isn't
anything you can do at that point. This is not the first time it has
become unresponsive. On occasion, after a pull that changes some user
service code followed by a system reconfigure, those services start
failing to find their binaries. My best guess is that those specific
services depend on the user-home service or some such, and something
prevents discovery of said binaries in PATH; the binaries in those
services aren't referenced by absolute path in the GNU store.
Separate issue.

Generation 8 Oct 14 2023 00:22:53 (current)
  file name: /var/guix/profiles/system-8-link
  canonical file name: /gnu/store/j9i2w1zacw7sl8vlb7k1g7p0vnd58ns7-system
  label: GNU with Linux 6.4.16
  bootloader: grub
  root device: label: "r720-guix-0"
  kernel: /gnu/store/cbc7x9in2dnjrnh840c21ivgygnndp1c-linux-6.4.16/bzImage
  channels:
    guix:
      repository URL: https://git.savannah.gnu.org/git/guix.git
      branch: master
      commit: 3963fa1a465708690cd1554d911613f1c92f5eef

Thank you

-- 
Best regards
Vlad Kozin




Information forwarded to bug-guix@gnu.org:
bug#66684; Package guix. (Mon, 23 Oct 2023 20:00:01 GMT).


Message #8 received at 66684@debbugs.gnu.org:

From: Ludovic Courtès <ludo@gnu.org>
To: Vladilen Kozin <vladilen.kozin@gmail.com>
Cc: 66684@debbugs.gnu.org
Subject: Re: bug#66684: [shepherd] Altering system time renders herd unresponsive
Date: Mon, 23 Oct 2023 21:58:33 +0200

Hi Vladilen,

Vladilen Kozin <vladilen.kozin@gmail.com> skribis:

> My server would consistently run with system time 1h ahead of actual.
> Both `date` and `hwclock` would show the same time off by 1hr, while
> BIOS showed me correct time. I'm not sure why, but some services won't
> run if time difference is e.g. over 15min or smth, so.
>
> $ sudo date -s '-1 hour'
>
> fixes time but causes `herd` to become unresponsive as in you type a
> command, any command and stare at tty stuck. Also ssh'ing into the
> system becomes impossible.

Thanks for your report.  This issue comes from Fibers 1.3.1:

  https://github.com/wingo/fibers/issues/89

There’s currently no bug-fix in sight though.

Ludo’.
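
For readers wondering how a backwards clock jump can stall a program at
all, here is a minimal sketch of one general hazard: a timeout deadline
computed from wall-clock time. Whether this is the actual mechanism
behind the Fibers issue linked above is an assumption; the thread does
not say. The `FakeClocks` class and the `remaining` and `demo` functions
are hypothetical names made up for this illustration, and the clocks are
simulated so that nothing touches the real system time.

```python
from dataclasses import dataclass


@dataclass
class FakeClocks:
    """Simulated clocks (hypothetical helper, not part of shepherd or Fibers)."""
    wall: float = 1_000_000.0  # wall-clock seconds; an admin can change this
    mono: float = 500.0        # monotonic seconds; only ever moves forward

    def advance(self, seconds: float) -> None:
        # Real time passing moves both clocks forward together.
        self.wall += seconds
        self.mono += seconds

    def set_wall_back(self, seconds: float) -> None:
        # Models something like `date -s '-1 hour'`: only the wall clock jumps.
        self.wall -= seconds


def remaining(deadline: float, now: float) -> float:
    """Seconds a sleeper would still wait before its deadline fires."""
    return max(0.0, deadline - now)


def demo() -> tuple[float, float]:
    clocks = FakeClocks()
    timeout = 5.0

    # The same 5-second timeout, recorded against each clock.
    wall_deadline = clocks.wall + timeout
    mono_deadline = clocks.mono + timeout

    clocks.set_wall_back(3600.0)  # wall clock moved one hour into the past
    clocks.advance(1.0)           # one second of real time passes

    return (remaining(wall_deadline, clocks.wall),
            remaining(mono_deadline, clocks.mono))


if __name__ == "__main__":
    wall_left, mono_left = demo()
    print(f"wall-clock deadline: {wall_left:.0f} s left")
    print(f"monotonic deadline:  {mono_left:.0f} s left")
```

In this sketch the wall-clock deadline ends up about an hour away after
the jump, while the monotonic one is unaffected. It is not meant to
reproduce the 15-20 min delay described in the report.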




Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Thu, 23 Nov 2023 11:45:02 GMT).


Merged 66684 68476. Request was from Sergey Trofimov <sarg@sarg.org.ru> to control@debbugs.gnu.org. (Tue, 16 Jan 2024 06:56:01 GMT).


Merged 66684 68476 70848. Request was from Sergey Trofimov <sarg@sarg.org.ru> to control@debbugs.gnu.org. (Sat, 11 May 2024 16:58:02 GMT).




debbugs.gnu.org maintainers <help-debbugs@gnu.org>. Last modified: Sun Dec 22 11:38:06 2024; Machine Name: wallace-server
