shepherd lacks error reporting
(name . Guix Bugs)(address . bug-guix@gnu.org)
I'm writing a one-shot shepherd-service that expands the (ext4) root
file system on first boot. I used the hostname service as a template:
the script is passed directly as a G-expression in the `start` field,
instead of going through make-forkexec-constructor.
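For reference, the pattern I am copying looks roughly like the sketch below. It is written from memory of the hostname service in (gnu services base), so details may differ between Guix revisions. The point is only that the G-expression in `start` evaluates to a procedure.

```scheme
;; Sketch (from memory) of the one-shot pattern used by the hostname
;; service; the exact upstream definition may differ.
(use-modules (gnu services shepherd)
             (guix gexp))

(define (hostname-shepherd-service name)
  (list (shepherd-service
         (documentation "Initialize the machine's host name.")
         (provision '(host-name))
         ;; `start' must evaluate to a procedure, hence the lambda.
         (start #~(lambda _ (sethostname #$name)))
         (one-shot? #t))))
```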
Of course there is a bug in it. Trouble is, I have no idea what it is,
because Shepherd won't tell me. :)
The VM boots and completes the ssh initialization phase and then
apparently just gets stuck. Doesn't even show a login prompt.
It's... not a great debugging experience.
I'm going to attempt to at the very least add some error reporting.
It would also be really nice if the failure modes for Shepherd services
were better documented, like what happens when the procedure passed in
the `start` field fails, or is not even a procedure, etc.
Since I never touched Shepherd internals, help would be greatly
appreciated.
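To make the request concrete, here is a minimal sketch in plain Guile of the kind of reporting I mean. The helper name `call-with-error-logging` is made up for illustration and is not part of Shepherd; it relies only on Guile's `catch`. I have not verified where the message would end up when run under Shepherd.

```scheme
;; Hypothetical helper, not part of Shepherd: run THUNK and, if it
;; throws, print the exception key and arguments instead of dropping them.
(define (call-with-error-logging name thunk)
  (catch #t
    thunk
    (lambda (key . args)
      (format (current-error-port)
              "service ~a: start failed: ~s ~s~%" name key args)
      ;; A false value tells the caller that starting failed.
      #f)))

;; Usage: a failing thunk is reported and yields #f ...
(call-with-error-logging 'demo (lambda () (error "boom")))
;; ... while a successful one returns its value unchanged.
(call-with-error-logging 'demo (lambda () 42))
```

Something along these lines wrapped around the `start` procedure would at least have told me which call blew up.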
P.S.: I'm attaching the system definition for completeness's sake and so
that someone might point out where the error is, but honestly the exact
bug in my code does not matter for the feature. All that matters is
there is an error and it should be logged but isn't.
(define-module (raingloom machines cloud-deploy-bootstrap))

(use-modules
 (gnu)
 (gnu system nss)
 (guix channels)
 (guix modules))

(use-service-modules
 admin
 networking
 shepherd
 ssh)

(use-package-modules
 admin
 bootloaders
 certs
 gnome
 linux
 networking
 ssh
 tmux
 tls
 version-control)

(define disk "/dev/vda")
(define partition "2")
(define ext-autoexpand-service-type
  (let ((name 'ext-autoexpand)
        (desc "Automatically expand ext2 root")
        (extra-modules '((ice-9 popen))))
    (shepherd-service-type
     name
     (lambda (config)
       (shepherd-service
        (documentation desc)
        (provision (list name))
        (requirement '(file-systems))
        (one-shot? #t)
        ;; Bring (ice-9 popen) into scope for the `start' G-expression.
        (modules (append extra-modules %default-modules))
        (start
         ;; `start' must evaluate to a procedure, so the script is wrapped
         ;; in a lambda rather than a bare `begin'.  Host-side variables
         ;; (disk, partition) have to be inserted with #$.
         #~(lambda _
             (let ((port (open-pipe*
                          OPEN_WRITE
                          #$(file-append util-linux "/sbin/sfdisk")
                          ;; Don't check whether the device is in use:
                          ;; it is, and we don't care.
                          "--no-reread"
                          #$disk
                          "-N" #$partition)))
               (display ",+" port)
               ;; close-pipe waits for sfdisk to exit before continuing.
               (close-pipe port))
             (system* #$(file-append util-linux "/sbin/partx")
                      "--update" #$disk)
             ;; Return #t only if resize2fs exited successfully.
             (zero? (system*
                     #$(file-append e2fsprogs "/sbin/resize2fs")
                     #$(string-append disk partition)))))))
     (description desc))))
(define-public %system
  (operating-system
    (host-name "cloud-deploy-bootstrap")
    (timezone "Europe/Budapest")
    (locale "en_US.utf8")
    (keyboard-layout (keyboard-layout "us"))
    (bootloader (bootloader-configuration
                  (bootloader grub-bootloader)
                  (targets '("/dev/vda"))
                  (keyboard-layout keyboard-layout)))
    (file-systems (append
                   (list (file-system
                           (device (file-system-label "cloudimg-rootfs"))
                           (mount-point "/")
                           (type "btrfs")))
                   %base-file-systems))
    ;; This is where we specify system-wide packages.
    (packages (append (list
                       nss-certs
                       tmux)
                      %base-packages))
    (services
     (append
      (list
       (service ext-autoexpand-service-type #f)
       (service dhcp-client-service-type)
       (service openssh-service-type
                (openssh-configuration
                  (openssh openssh-sans-x)
                  (permit-root-login #t)
                  (authorized-keys
                   `(("root" ,(local-file (string-append (getenv "HOME") "/.ssh/id_ed25519.pub"))))))))
      %base-services))
    ;; Allow resolution of '.local' host names with mDNS.
    (name-service-switch %mdns-host-lookup-nss)))

%system