Redundant library grafts lead to breakage

  • Open
Details
6 participants
  • Simen Endsjø
  • Jack Hill
  • Leo Famulari
  • Ludovic Courtès
  • Mark H Weaver
  • Richard Sent
Owner
unassigned
Submitted by
Jack Hill
Severity
important

Jack Hill wrote on 12 Mar 2021 14:58
Failure building grub-img.png when reconfiguring
(address . bug-guix@gnu.org)
alpine.DEB.2.21.2103121747190.8138@marsh.hcoop.net
Hi Guix,

When reconfiguring my system, the build of
/gnu/store/0yf1b1l19h7c3jj1zkhxjmq4sb3yysjq-grub-image.png.drv failed with
the following:

```
Backtrace:
2 (primitive-load "/gnu/store/larqpc2wjhnc6jmj4885k8lynd1?")
In gnu/build/svg.scm:
53:6 1 (svg->png _ "/gnu/store/xadbzis4pvmxib4fk55jrag4fmn55w?" ?)
In unknown file:
0 (rsvg-handle-render-cairo #<rsvg-handle 7ffff5b60150> #)

ERROR: In procedure rsvg-handle-render-cairo:
Wrong type (expecting finalized smob): #<cairo-context 7ffff5b60090>
```

This is with Guix bb5d84a0489a629d30bc2e978807caf20f46e329. My last
successful reconfigure was with 80739ea480a7db667b83b45e3a08be740449f689.
The output of the reconfigure run is attached. Reconfiguring without
grafts succeeds.

Best,
Jack
Attachment: reconfigure-log
Attachment: no-grafts
;; This is an operating system configuration for a VM image.
;; Modify it as you see fit and instantiate the changes by running:
;;
;; guix system reconfigure /etc/config.scm
;;

(use-modules (gnu) (guix))
(use-service-modules networking ssh)
(use-package-modules bootloaders certs linux
                     package-management)

(define vm-image-motd (plain-file "motd" "
\x1b[1;37mThis is the GNU system. Welcome!\x1b[0m

This instance of Guix is a template for virtualized environments.
You can reconfigure the whole system by adjusting /etc/config.scm
and running:

guix system reconfigure /etc/config.scm

Run '\x1b[1;37minfo guix\x1b[0m' to browse documentation.

\x1b[1;33mConsider setting a password for the 'root' and 'guest' \
accounts.\x1b[0m
"))

(operating-system
  (host-name "kalessin")
  (timezone "America/New_York")
  (locale "en_US.utf8")
  (initrd-modules (cons "virtio_scsi" %base-initrd-modules))

  ;; Label for the GRUB boot menu.
  (label (string-append "GNU Guix " (package-version guix)))

  (firmware '())

  ;; Below we assume /dev/vda is the VM's hard disk.
  ;; Adjust as needed.
  (bootloader (bootloader-configuration
               (bootloader grub-bootloader)
               (target "/dev/vda")
               (terminal-outputs '(console))))
  (file-systems (cons (file-system
                        (mount-point "/")
                        (device (file-system-label "kalessin-btrfs"))
                        (type "btrfs")
                        (options "compress=zstd"))
                      %base-file-systems))

  (users (cons* (user-account
                 (name "jackhill")
                 (comment "Jack Hill")
                 (group "users")
                 (supplementary-groups '("wheel" "netdev")))
                %base-user-accounts))

  ;; Our /etc/sudoers file. Since 'guest' initially has an empty password,
  ;; allow for password-less sudo.
  (sudoers-file (plain-file "sudoers" "\
root ALL=(ALL) ALL
%wheel ALL=NOPASSWD: ALL\n"))

  (packages (append (list btrfs-progs nss-certs)
                    %base-packages))

  (services
   (append (list (service openssh-service-type
                          (openssh-configuration
                           (password-authentication? #f)
                           (authorized-keys
                            `(("jackhill" ,(local-file "/id_ed25519.pub")
                                          ,(local-file "/home/jackhill/tamago.ssh-key")
                                          ,(local-file "/home/jackhill/id_ed25519.pub"))))))

                 ;; Use the DHCP client service rather than NetworkManager.
                 (service dhcp-client-service-type))
           (modify-services %base-services
             (guix-service-type config =>
                                (guix-configuration
                                 (inherit config)
                                 (extra-options
                                  '("--disable-deduplication"))))))))
Leo Famulari wrote on 12 Mar 2021 15:05
(name . Jack Hill)(address . jackhill@jackhill.us)
YEvznooRt4wUjDtA@jasmine.lan
On Fri, Mar 12, 2021 at 05:58:27PM -0500, Jack Hill wrote:
> This is with Guix bb5d84a0489a629d30bc2e978807caf20f46e329. My last
> successful reconfigure was with 80739ea480a7db667b83b45e3a08be740449f689.
> The output of the reconfigure run is attached. Reconfiguring without grafts
> succeeds.

I wonder if it's related to the changes in the recent Cairo graft, from
commit bc16eacc99e801ac30cbe2aa649a2be3ca5c102a?
Jack Hill wrote on 12 Mar 2021 16:25
(name . Leo Famulari)(address . leo@famulari.name)
alpine.DEB.2.21.2103121924000.8138@marsh.hcoop.net
On Fri, 12 Mar 2021, Leo Famulari wrote:

> I wonder if it's related to the changes in the recent Cairo graft, from
> commit bc16eacc99e801ac30cbe2aa649a2be3ca5c102a?

Yes, that seems to be it. The previous commit

sudo -E guix time-machine --commit=453e101fc3f7dac9aabcd6122cf05fb7925103c7 -- system reconfigure /config.scm

works, but

sudo -E guix time-machine --commit=bc16eacc99e801ac30cbe2aa649a2be3ca5c102a -- system reconfigure /config.scm

does not.
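
A minimal sketch of automating this kind of commit-by-commit check. The helper names `first_failing_commit` and `try_reconfigure` are hypothetical (they are not part of Guix), and the sketch assumes the commits are passed oldest first:

```shell
#!/bin/sh
# Hypothetical helper (not part of Guix): print the first commit for which
# the given command fails. The command is invoked as: CMD COMMIT.
first_failing_commit() {
    cmd=$1; shift
    for commit in "$@"; do
        if ! "$cmd" "$commit" >/dev/null 2>&1; then
            printf '%s\n' "$commit"
            return 0
        fi
    done
    return 1   # every commit succeeded
}

# Assumed wrapper around the command used above; adjust the config path.
try_reconfigure() {
    sudo -E guix time-machine --commit="$1" -- system reconfigure /config.scm
}

# Usage (not run here):
# first_failing_commit try_reconfigure \
#     453e101fc3f7dac9aabcd6122cf05fb7925103c7 \
#     bc16eacc99e801ac30cbe2aa649a2be3ca5c102a
```

With the two commits above, this would be expected to print the second one, matching the manual result.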

Best,
Jack
Mark H Weaver wrote on 12 Mar 2021 16:24
(address . 47115@debbugs.gnu.org)
87a6r7294k.fsf@netris.org
Leo Famulari <leo@famulari.name> writes:

> On Fri, Mar 12, 2021 at 05:58:27PM -0500, Jack Hill wrote:
>> This is with Guix bb5d84a0489a629d30bc2e978807caf20f46e329. My last
>> successful reconfigure was with 80739ea480a7db667b83b45e3a08be740449f689.
>> The output of the reconfigure run is attached. Reconfiguring without grafts
>> succeeds.
>
> I wonder if it's related to the changes in the recent Cairo graft, from
> commit bc16eacc99e801ac30cbe2aa649a2be3ca5c102a?

Is anyone else seeing this? FWIW, I tested reconfiguring my Guix system
with the grafts I recently pushed, and grub-img.png built successfully
for me. I'm using the resulting system now.

Also, the changes between the original cairo and its replacement are
quite minimal. I'm having trouble imagining how this graft could have
led to that error, but of course my imagination is limited. :)

Jack: is the problem reproducible, or could it have been a sporadic
failure?

Mark
Jack Hill wrote on 12 Mar 2021 16:33
(name . Mark H Weaver)(address . mhw@netris.org)
alpine.DEB.2.21.2103121931030.8138@marsh.hcoop.net
On Fri, 12 Mar 2021, Mark H Weaver wrote:

> Jack: is the problem reproducible, or could it have been a sporadic
> failure?

So far I've only reconfigured beyond the graft on the one VM, but with
multiple commits. I'll try it on another host shortly.

Best,
Jack
Jack Hill wrote on 12 Mar 2021 20:05
(name . Mark H Weaver)(address . mhw@netris.org)
alpine.DEB.2.21.2103122301220.8138@marsh.hcoop.net
On Fri, 12 Mar 2021, Jack Hill wrote:

> On Fri, 12 Mar 2021, Mark H Weaver wrote:
>
>> Jack: is the problem reproducible, or could it have been a sporadic
>> failure?
>
> So far I've only reconfigured beyond the graft on the one VM, but with
> multiple commits. I'll try it on another host shortly.

I was not able to reproduce the problem on my desktop. Both systems are
using the bios grub-bootloader.

Best,
Jack
Mark H Weaver wrote on 12 Mar 2021 23:41
(name . Jack Hill)(address . jackhill@jackhill.us)
877dmb1owa.fsf@netris.org
Hi Jack,

Jack Hill <jackhill@jackhill.us> writes:

> On Fri, 12 Mar 2021, Jack Hill wrote:
>
>> On Fri, 12 Mar 2021, Mark H Weaver wrote:
>>
>>> Jack: is the problem reproducible, or could it have been a sporadic
>>> failure?
>>
>> So far I've only reconfigured beyond the graft on the one VM, but with
>> multiple commits. I'll try it on another host shortly.
>
> I was not able to reproduce the problem on my desktop. Both systems are
> using the bios grub-bootloader.

Thanks. Given this, and the lack of similar reports from others, my
guess is that you hit a non-deterministic Guile bug, possibly the same
one as https://bugs.gnu.org/46879 (Non-deterministic failures while
building Guix with Guile 3.0.5).

If the problem happens reproducibly on that one VM only, that suggests
that the bug might have led to a corrupted store item, i.e. a store item
containing a .go file with bad code. If so, running "guix gc" might be
sufficient to clear the corrupted items.

Regards,
Mark
Jack Hill wrote on 13 Mar 2021 12:08
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103131506140.8138@marsh.hcoop.net
On Sat, 13 Mar 2021, Mark H Weaver wrote:

> Thanks. Given this, and the lack of similar reports from others, my
> guess is that you hit a non-deterministic Guile bug, possibly the same
> one as <https://bugs.gnu.org/46879> (Non-deterministic failures while
> building Guix with Guile 3.0.5).
>
> If the problem happens reproducibly on that one VM only, that suggests
> that the bug might have led to a corrupted store item, i.e. a store item
> containing a .go file with bad code. If so, running "guix gc" might be
> sufficient to clear the corrupted items.

Running guix gc and then reconfiguring again (now with
373c7b5791acd8f377455be47260948b843dd5db) still results in the same error.
Of course, if the miscompilation was in something that couldn't get
collected… However, I have been able to reproduce it on that one VM across
multiple Guix commits.

Best,
Jack
Jack Hill wrote on 13 Mar 2021 20:27
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103132302490.8138@marsh.hcoop.net
In an effort to clear out more of the potentially problematic store items,
I switched to an older generation of the system as well as guix pull and
user profiles. I then ran guix gc. At this point, I was running guix from
commit 373e5fc96724fd38bb1263e4af90932ea36f596b and the system profile was
created with guix f3eecfd36cb537a1febc30eea1f6aa448203ba40.

I then pulled, bringing me up to guix
8154beffd8c121e953a7c4cd75c3eebfcc073a9a. Reconfiguring results in the
same error. Any thoughts on how to recover? Should I try building guix
against an older guile version?

Best,
Jack
Jack Hill wrote on 14 Mar 2021 13:49
(address . 47115@debbugs.gnu.org)(name . Mark H Weaver)(address . mhw@netris.org)
alpine.DEB.2.21.2103141624080.8138@marsh.hcoop.net
Still on the same VM, I've been able to reproduce the problem while
building a different derivation:
/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv from Guix
commit d4e29f3628ad0c7576d7cab659d7fcc19d21999a. I can still build the new
derivation on my desktop.

Hrm, it's a pretty spooky problem.

Best,
Jack
Jack Hill wrote on 14 Mar 2021 14:14
(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103141657310.8138@marsh.hcoop.net
Trying to document some more information in the hopes that others can
reproduce this bug.

On the host that fails:

$ guix describe
Generation 7 Mar 14 2021 16:14:58 (current)
guix d4e29f3
branch: master
commit: d4e29f3628ad0c7576d7cab659d7fcc19d21999a

jackhill@kalessin ~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge, IBRS)
stepping : 9
microcode : 0x1
cpu MHz : 2599.990
cache size : 16384 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq
vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes
xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb
tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms
xsaveopt arat md_clear
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only
ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid
unrestricted_guest vapic_reg vid
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
l1tf mds swapgs itlb_multihit srbds
bogomips : 5199.98
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel Xeon E3-12xx v2 (Ivy Bridge, IBRS)
stepping : 9
microcode : 0x1
cpu MHz : 2599.990
cache size : 16384 KB
physical id : 1
siblings : 1
core id : 0
cpu cores : 1
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm
constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq
vmx ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes
xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti ssbd ibrs ibpb
tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust smep erms
xsaveopt arat md_clear
vmx flags : vnmi preemption_timer posted_intr invvpid ept_x_only
ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid
unrestricted_guest vapic_reg vid
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
l1tf mds swapgs itlb_multihit srbds
bogomips : 5199.98
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:

jackhill@kalessin ~$ cat /config.scm
;; This is an operating system configuration for a VM image.
;; Modify it as you see fit and instantiate the changes by running:
;;
;; guix system reconfigure /etc/config.scm
;;

(use-modules (gnu) (guix))
(use-service-modules networking ssh)
(use-package-modules bootloaders certs linux
package-management)

(define vm-image-motd (plain-file "motd" "
\x1b[1;37mThis is the GNU system. Welcome!\x1b[0m

This instance of Guix is a template for virtualized environments.
You can reconfigure the whole system by adjusting /etc/config.scm
and running:

guix system reconfigure /etc/config.scm

Run '\x1b[1;37minfo guix\x1b[0m' to browse documentation.

\x1b[1;33mConsider setting a password for the 'root' and 'guest' \
accounts.\x1b[0m
"))

(operating-system
(host-name "kalessin")
(timezone "America/New_York")
(locale "en_US.utf8")
(initrd-modules (cons "virtio_scsi" %base-initrd-modules))

;; Label for the GRUB boot menu.
(label (string-append "GNU Guix " (package-version guix)))

(firmware '())

;; Below we assume /dev/vda is the VM's hard disk.
;; Adjust as needed.
(bootloader (bootloader-configuration
(bootloader grub-bootloader)
(target "/dev/vda")
(terminal-outputs '(console))))
(file-systems (cons (file-system
(mount-point "/")
(device (file-system-label "kalessin-btrfs"))
(type "btrfs")
(options "compress=zstd"))
%base-file-systems))

(users (cons* (user-account
(name "jackhill")
(comment "Jack Hill")
(group "users")
(supplementary-groups '("wheel" "netdev")))
%base-user-accounts))

;; Our /etc/sudoers file. Since 'guest' initially has an empty password,
;; allow for password-less sudo.
(sudoers-file (plain-file "sudoers" "\
root ALL=(ALL) ALL
%wheel ALL=NOPASSWD: ALL\n"))

(packages (append (list btrfs-progs nss-certs)
%base-packages))

(services
(append (list (service openssh-service-type
(openssh-configuration
(password-authentication? #f)
(authorized-keys
`(("jackhill" ,(local-file "/id_ed25519.pub")
,(local-file "/home/jackhill/tamago.ssh-key")
,(local-file "/home/jackhill/id_ed25519.pub"))))))

;; Use the DHCP client service rather than NetworkManager.
(service dhcp-client-service-type))
(modify-services %base-services
(guix-service-type config =>
(guix-configuration
(inherit config)
(extra-options
'("--disable-deduplication"))
(authorized-keys
(cons
(local-file "/home/jackhill/alperton-guix-key.pub")
%default-authorized-guix-keys))))))))

jackhill@kalessin ~$ sudo -E guix system -v3 reconfigure /config.scm
The following derivations will be built:
/gnu/store/cnl0pbld58rq4zn0l347ssawdxpcs2hg-grub.cfg.drv
/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv
building /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv...
Backtrace:
2 (primitive-load "/gnu/store/larqpc2wjhnc6jmj4885k8lynd1?")
In gnu/build/svg.scm:
53:6 1 (svg->png _ "/gnu/store/vmldvxllh07k641wmbnlz3migga29r?" ?)
In unknown file:
0 (rsvg-handle-render-cairo #<rsvg-handle 7ffff5b60150> #)

ERROR: In procedure rsvg-handle-render-cairo:
Wrong type (expecting finalized smob): #<cairo-context 7ffff5b60090>
builder for `/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv' failed with exit code 1
build of /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv failed
View build log at '/var/log/guix/drvs/07/xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv.bz2'.
cannot build derivation `/gnu/store/cnl0pbld58rq4zn0l347ssawdxpcs2hg-grub.cfg.drv': 1 dependencies couldn't be built
guix system: error: build of `/gnu/store/cnl0pbld58rq4zn0l347ssawdxpcs2hg-grub.cfg.drv' failed

jackhill@kalessin ~$ free -m
total used free shared buff/cache available
Mem: 2994 122 505 0 2366 2783
Swap: 0 0 0

Now on the host where it succeeds:

$ guix describe
Generation 112 Mar 14 2021 16:30:34 (current)
guix-at-duke 2a57b7c
branch: master
commit: d4e29f3628ad0c7576d7cab659d7fcc19d21999a
nonguix 54b8358
branch: master
commit: 54b83587669b5df5fe36bce058f4f2cf34d8a63c
guix d4e29f3
branch: master
commit: d4e29f3628ad0c7576d7cab659d7fcc19d21999a

jackhill@alperton ~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i7 CPU L 640 @ 2.13GHz
stepping : 5
microcode : 0x7
cpu MHz : 2623.174
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
vmx flags : vnmi preemption_timer invvpid ept_x_only flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 4256.16
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i7 CPU L 640 @ 2.13GHz
stepping : 5
microcode : 0x7
cpu MHz : 2416.981
cache size : 4096 KB
physical id : 0
siblings : 2
core id : 2
cpu cores : 2
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt aes lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida arat flush_l1d
vmx flags : vnmi preemption_timer invvpid ept_x_only flexpriority tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 4256.16
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

jackhill@alperton ~$ cat repos/guix-configs/alperton/config.scm
;; This is an operating system configuration template
;; for a "desktop" setup with GNOME and Xfce where the
;; root partition is encrypted with LUKS.

(use-modules (gnu) (gnu system nss) (gnu services xorg)
(gnu packages linux)
(nongnu packages linux)
(nongnu system linux-initrd)
(srfi srfi-1))
(use-service-modules ;; afs
cups desktop docker kerberos sddm)
(use-package-modules certs gnome kerberos printers
scanner security-token
wm)

(operating-system
(host-name "alperton")
(timezone "America/New_York")
(locale "en_US.utf8")

(bootloader (bootloader-configuration
(bootloader grub-bootloader)
(target "/dev/sda")))

;; Specify a mapped device for the encrypted root partition.
;; The UUID is that returned by 'cryptsetup luksUUID'.
(mapped-devices
(list (mapped-device
(source (uuid "9cfdc1d9-d062-4269-9cbb-9cb518c4cf4c"))
(target "alperton_btrfs")
(type luks-device-mapping))))

(file-systems (cons
(file-system
(device (uuid "179969de-85a9-4e95-ba44-79566c492eb5"))
(mount-point "/")
(type "btrfs")
(flags '(no-atime))
(options "compress=zstd")
(dependencies mapped-devices))
%base-file-systems))

(swap-devices (list "/root/swap"))

(users (cons (user-account
(name "jackhill")
(comment "Jack Hill")
(group "users")
(supplementary-groups '("wheel" "netdev"
;"docker"
"audio" "lp" "video"))
(home-directory "/home/jackhill"))
%base-user-accounts))

;; This is where we specify system-wide packages.
(packages (cons* nss-certs ;for HTTPS access
btrfs-progs
fuse-exfat
bluez
mit-krb5
sway
gvfs ;for user mounts
%base-packages))

;; Add GNOME and/or Xfce---we can choose at the log-in
;; screen with F1. Use the "desktop" services, which
;; include the X11 log-in service, networking with
;; NetworkManager, and more.
(services (cons* (service gnome-desktop-service-type)
(bluetooth-service)
;; (service docker-service-type)
(simple-service 'custom-udev-rules udev-service-type (list sane-backends libu2f-host))
(screen-locker-service swaylock)
(service sddm-service-type)
(service cups-service-type
(cups-configuration
(web-interface? #true)))
(service krb5-service-type
(krb5-configuration
(default-realm "HCOOP.NET")
(forwardable? #t)
))
;; (service afs-client-service-type)
(modify-services (remove (lambda (service)
(eq? (service-kind service) gdm-service-type))
%desktop-services)
(guix-service-type
config =>
(guix-configuration
(inherit config)
(authorized-keys
(cons*
(local-file "../keys/libre-01-guix-key.pub")
(local-file "../keys/libre-02-guix-key.pub")
%default-authorized-guix-keys)))))))

(kernel linux)
(kernel-arguments '("quite"
"zswap.enabled=1" "zswap.compressor=zstd"
"zswap.max_pool_percent=50" "zswap.zpool=z3fold"))
(initrd-modules (cons* "zstd" "z3fold"
%base-initrd-modules))
(initrd microcode-initrd)
(firmware (append (list linux-firmware iwlwifi-firmware broadcom-bt-firmware)
%base-firmware))

;; Allow resolution of '.local' host names with mDNS.
(name-service-switch %mdns-host-lookup-nss))

jackhill@alperton ~$ free -m
total used free shared buff/cache available
Mem: 3735 2087 209 368 1438 1076
Swap: 8191 2696 5495

Clearly there are some differences between these hosts. However, the same
derivation and png file:

jackhill@alperton ~$ guix build /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv
/gnu/store/vmldvxllh07k641wmbnlz3migga29rfn-grub-image.png
jackhill@alperton ~$ cat /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv
Derive([("out","/gnu/store/vmldvxllh07k641wmbnlz3migga29rfn-grub-image.png","","")],[("/gnu/store/6k0b8k9cl9gcrg603cxva0qnwbxv55xs-guile-rsvg-2.18.1-0.05c6a2f.drv",["out"]),("/gnu/store/b5nnbpgkvgdpzgvj67539ylcaqacj90l-guile-3.0.2.drv",["out"]),("/gnu/store/hb2q1683r8x8n28dyvr4gvdgkhmssq8q-guix-artwork-2f2fe74-checkout.drv",["out"]),("/gnu/store/kvpdmjknxqjm9k6gi2c9bijkrmk9n944-module-import-compiled.drv",["out"]),("/gnu/store/rcl324yiq7a56rwkqwgqx097dwc5mgni-guile-cairo-1.11.2.drv",["out"])],["/gnu/store/ih9cbxl2qwn9bn2yfmr2g40w7p7yafic-module-import","/gnu/store/larqpc2wjhnc6jmj4885k8lynd19fl4m-grub-image.png-builder"],"x86_64-linux","/gnu/store/0m0vd873jp61lcm4xa3ljdgx381qa782-guile-3.0.2/bin/guile",["--no-auto-compile","-L","/gnu/store/ih9cbxl2qwn9bn2yfmr2g40w7p7yafic-module-import","-L","/gnu/store/0b39xp6kndr95k6rccbp8ijwvsrkygvd-guile-rsvg-2.18.1-0.05c6a2f/share/guile/site/3.0","-L","/gnu/store/vjn7ygzzqshvsfzck8hq5lp5pfrr2xp5-guile-cairo-1.11.2/share/guile/site/3.0","-C","/gnu/store/pk1r70b4gxn9fsd53glr8alqz5h1kk65-module-import-compiled","-C","/gnu/store/0b39xp6kndr95k6rccbp8ijwvsrkygvd-guile-rsvg-2.18.1-0.05c6a2f/lib/guile/3.0/site-ccache","-C","/gnu/store/vjn7ygzzqshvsfzck8hq5lp5pfrr2xp5-guile-cairo-1.11.2/lib/guile/3.0/site-ccache","/gnu/store/larqpc2wjhnc6jmj4885k8lynd19fl4m-grub-image.png-builder"],[("out","/gnu/store/vmldvxllh07k641wmbnlz3migga29rfn-grub-image.png"),("preferLocalBuild","1")])

is used on both systems, and after copying the successfully built png file
to the "bad" host, it is used successfully.
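
One way to compare what a derivation refers to on each host is to list the store paths named in the .drv file and diff the lists. This is only a sketch: `drv_store_paths` is a hypothetical helper, and the pattern assumes store items look like /gnu/store/ followed by a 32-character hash and a dash, as in the derivation above:

```shell
#!/bin/sh
# Hypothetical helper: print the distinct store paths mentioned in a .drv
# file, one per line. Paths with a trailing sub-directory (such as
# .../share/guile/site/3.0) are printed in full.
drv_store_paths() {
    grep -o '/gnu/store/[0-9a-z]\{32\}-[^"]*' "$1" | LC_ALL=C sort -u
}

# Usage (not run here): produce one list per host, then compare them.
# drv_store_paths /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv > paths.txt
```

Since the derivation itself is identical on both hosts, the lists should match; the interesting follow-up is whether each listed item has the same contents on both machines.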

This has been a difficult problem for me to track down, since the only way
I know to reproduce it is with guix system reconfigure (guix system build
isn't even enough). I think it would make troubleshooting easier if I
could generate the problematic derivations outside of guix system
reconfigure. At this point, I'm not sure what additional information would
be relevant. Are there any additional places I should look?

Thanks!
Jack
Jack Hill wrote on 14 Mar 2021 15:00
(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103141753300.8138@marsh.hcoop.net
Okay, I've started looking at the builder a little more:

jackhill@alperton ~$ cat /gnu/store/larqpc2wjhnc6jmj4885k8lynd19fl4m-grub-image.png-builder
(if (string-suffix? ".svg" "/gnu/store/83qplqmavzphd30hm1maxwlh166ylpwr-guix-artwork-2f2fe74-checkout/grub/GuixSD-fully-black-4-3.svg") (begin (use-modules (gnu build svg)) (svg->png "/gnu/store/83qplqmavzphd30hm1maxwlh166ylpwr-guix-artwork-2f2fe74-checkout/grub/GuixSD-fully-black-4-3.svg" ((@ (guile) getenv) "out") #:width 1024 #:height 768)) (copy-file "/gnu/store/83qplqmavzphd30hm1maxwlh166ylpwr-guix-artwork-2f2fe74-checkout/grub/GuixSD-fully-black-4-3.svg" ((@ (guile) getenv) "out")))

The problem appears to be in the svg->png procedure or at least in the
svg.scm file. On the "bad" system:

jackhill@kalessin ~$ guix environment --ad-hoc guile guile-rsvg guile-readline
jackhill@kalessin ~ [env]$ guile
GNU Guile 3.0.5
Copyright (C) 1995-2021 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use (gnu build svg)
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /run/current-system/profile/share/guile/site/3.0/gnu/build/svg.scm
;;; WARNING: compilation of /run/current-system/profile/share/guile/site/3.0/gnu/build/svg.scm failed:
;;; failed to create path for auto-compiled file "/run/current-system/profile/share/guile/site/3.0/gnu/build/svg.scm"
scheme@(guile-user)> (svg->png "/gnu/store/83qplqmavzphd30hm1maxwlh166ylpwr-guix-artwork-2f2fe74-checkout/grub/GuixSD-fully-black-4-3.svg" "/tmp/test.png")
ice-9/boot-9.scm:1669:16: In procedure raise-exception:
Wrong type (expecting finalized smob): #<cairo-context 7fb032a3e6b0>

Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.

On the good system

jackhill@alperton ~$ guix environment --ad-hoc guile guile-rsvg guile-readline
jackhill@alperton ~ [env]$ guile
GNU Guile 3.0.5
Copyright (C) 1995-2021 Free Software Foundation, Inc.

Guile comes with ABSOLUTELY NO WARRANTY; for details type `,show w'.
This program is free software, and you are welcome to redistribute it
under certain conditions; type `,show c' for details.

Enter `,help' for help.
scheme@(guile-user)> ,use (gnu build svg)
;;; note: auto-compilation is enabled, set GUILE_AUTO_COMPILE=0
;;; or pass the --no-auto-compile argument to disable.
;;; compiling /run/current-system/profile/share/guile/site/3.0/gnu/build/svg.scm
;;; compiled /home/jackhill/.cache/guile/ccache/3.0-LE-8-4.4/gnu/store/0j6w61vjjvp4zqzrqvyhqm6254ppzh8y-guix-1.2.0-16.c8887a5/share/guile/site/3.0/gnu/build/svg.scm.go
scheme@(guile-user)> (svg->png "/gnu/store/83qplqmavzphd30hm1maxwlh166ylpwr-guix-artwork-2f2fe74-checkout/grub/GuixSD-fully-black-4-3.svg" "/tmp/test.png")

and a png file is produced. The auto-compilation failure seems
particularly relevant.

To be continued…

Best,
Jack
Mark H Weaver wrote on 14 Mar 2021 15:05
(name . Jack Hill)(address . jackhill@jackhill.us)(address . 47115@debbugs.gnu.org)
874khds84o.fsf@netris.org
Hi Jack,

Jack Hill <jackhill@jackhill.us> writes:

> In an effort to clear out more of the potentially problematic store items,
> I switched to an older generation of the system as well as guix pull and
> user profiles. I then ran guix gc. At this point, I was running guix from
> commit 373e5fc96724fd38bb1263e4af90932ea36f596b and the system profile was
> created with guix f3eecfd36cb537a1febc30eea1f6aa448203ba40.
>
> I then pulled, bringing me up to guix
> 8154beffd8c121e953a7c4cd75c3eebfcc073a9a. Reconfiguring results in the
> same error. Any thoughts on how to recover? Should I try building guix
> against an older guile version?

Rolling back to an earlier system generation and running "guix gc" again
was a good idea, but you might have missed one or two crucial steps in
between:

(1) You must *delete* the "older" system generations and user profiles
e.g. by running "guix system delete-generations" and "guix package
--delete-generations", or else "guix gc" won't clear them from your
store. It is not enough to merely switch to an older system
generation and profiles.

(2) You'll also need to actually reboot into the older system
generation, because /run/booted-system will continue to protect
(from GC) the system that you last booted into, even after you
switch systems.

Did you do those things before running "guix gc"?
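
The steps above can be sketched as a small script. The `run` and `clean_store` helpers and the `DRY_RUN` switch are hypothetical conveniences; only the guix commands themselves come from the steps above. By default the commands are printed rather than executed:

```shell
#!/bin/sh
# With DRY_RUN=1 (the default) commands are only printed, so nothing is
# deleted by accident. Set DRY_RUN=0 to actually run them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

clean_store() {
    # (1) Delete the other system generations and the other generations of
    #     the user's profile; merely switching away from them is not enough.
    run sudo guix system delete-generations
    run guix package --delete-generations
    # (2) A reboot into the older system generation has to happen by hand
    #     before this point, because /run/booted-system keeps the booted
    #     system alive.
    run guix gc
}
```

Calling `clean_store` with the default settings prints the three commands in order, which makes it easy to review them before running anything for real.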

I'm sorry that you've hit this nasty bug.

Regards,
Mark
Leo Famulari wrote on 14 Mar 2021 15:18
(name . Mark H Weaver)(address . mhw@netris.org)
YE6Loqq6ugZ+RrQT@jasmine.lan
On Fri, Mar 12, 2021 at 07:24:16PM -0500, Mark H Weaver wrote:
> Is anyone else seeing this? FWIW, I tested reconfiguring my Guix system
> with the grafts I recently pushed, and grub-img.png built successfully
> for me. I'm using the resulting system now.

I wasn't able to reproduce this on either a Core 2 Duo or a newer AMD
EPYC machine.
Jack Hill wrote on 14 Mar 2021 16:18
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103141914430.8138@marsh.hcoop.net
On Sun, 14 Mar 2021, Mark H Weaver wrote:

> (1) You must *delete* the "older" system generations and user profiles
> e.g. by running "guix system delete-generations" and "guix package
> --delete-generations", or else "guix gc" won't clear them from your
> store. It is not enough to merely switch to an older system
> generation and profiles.
>
> (2) You'll also need to actually reboot into the older system
> generation, because /run/booted-system will continue to protect
> (from GC) the system that you last booted into, even after you
> switch systems.
>
> Did you do those things before running "guix gc"?

Oops, I left out those details. Yes, I did both those things.

> I'm sorry that you've hit this nasty bug.

Thanks. For me, being the only one who can reproduce or experience a
problem can be a frustrating and lonely experience, so I really appreciate
the time you and Leo have spent looking at it.

Best,
Jack
Mark H Weaver wrote on 14 Mar 2021 17:11
(name . Jack Hill)(address . jackhill@jackhill.us)(address . 47115@debbugs.gnu.org)
87y2epqnq8.fsf@netris.org
Hi Jack,

Jack Hill <jackhill@jackhill.us> writes:

> On Sun, 14 Mar 2021, Mark H Weaver wrote:
>
>> (1) You must *delete* the "older" system generations and user profiles
>> e.g. by running "guix system delete-generations" and "guix package
>> --delete-generations", or else "guix gc" won't clear them from your
>> store. It is not enough to merely switch to an older system
>> generation and profiles.
>>
>> (2) You'll also need to actually reboot into the older system
>> generation, because /run/booted-system will continue to protect
>> (from GC) the system that you last booted into, even after you
>> switch systems.
>>
>> Did you do those things before running "guix gc"?
>
> Oops, I left out those details. Yes, I did both those things.

It occurs to me that we missed something: the profiles in
~/.config/guix/current that are managed by "guix pull". It might be
that code within Guix itself was miscompiled (e.g. gnu/build/svg.scm),
or else that a profile in ~/.config/guix/current is still holding a
reference to something else that was miscompiled, (e.g. guile-cairo).

I suggest "guix pull --commit=453e101fc3f7dac9aabcd6122cf05fb7925103c7",
and then "guix package -p ~/.config/guix/current --delete-generations"
to delete any generations of Guix at commits that came after the Cairo
graft (use "guix pull --list-generations" to list them). Do this for
all user accounts (including root) that have a ~/.config/guix/current
directory. Then, try "guix gc" again.
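
Put together, the sequence above might look like the following. This is only a sketch: the `run` helper, the `cleanup_steps` function and the `RUN` variable are hypothetical additions that print each command instead of executing it unless RUN=1 is set. The guix commands themselves are the ones named above.

```shell
#!/bin/sh
# Dry-run sketch of the cleanup sequence; repeat it for every account
# (root included) that has a ~/.config/guix/current directory.
# 'run', 'cleanup_steps' and RUN are illustrative, not part of Guix.
commit=453e101fc3f7dac9aabcd6122cf05fb7925103c7

# Execute the command when RUN=1, otherwise just print it.
run() {
  if [ "${RUN:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

cleanup_steps() {
  run guix pull --commit="$commit"
  run guix pull --list-generations
  run guix package -p "$HOME/.config/guix/current" --delete-generations
  run guix gc
}

cleanup_steps
```

Reviewing the printed commands before setting RUN=1 seems prudent, since deleting generations cannot be undone.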

Thanks,
Mark
Jack Hill wrote on 14 Mar 2021 20:38
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103142334250.8138@marsh.hcoop.net
On Sun, 14 Mar 2021, Mark H Weaver wrote:

> It occurs to me that we missed something: the profiles in
> ~/.config/guix/current that are managed by "guix pull". It might be
> that code within Guix itself was miscompiled (e.g. gnu/build/svg.scm),
> or else that a profile in ~/.config/guix/current is still holding a
> reference to something else that was miscompiled, (e.g. guile-cairo).
>
> I suggest "guix pull --commit=453e101fc3f7dac9aabcd6122cf05fb7925103c7",
> and then "guix package -p ~/.config/guix/current --delete-generations"
> to delete any generations of Guix at commits that came after the Cairo
> graft (use "guix pull --list-generations" to list them). Do this for
> all user accounts (including root) that have a ~/.config/guix/current
> directory. Then, try "guix gc" again.

Thanks Mark. I've done the dance to gc as much as possible again. This
time, I also checked in /var/guix/gcroots to make sure I hadn't missed
anything. In fact I had missed some extra manual roots that I had created,
and I cleaned those up as well before running guix gc.

After running guix gc, I rebooted, ran guix pull, followed by a
reconfigure. The first reconfigure failed because of the substitute
networking problem, but when I ran it again, it failed in the same way
building the grub png. After it failed, I ran it again to capture the
following output:

jackhill@kalessin ~$ guix describe
Generation 9 Mar 14 2021 23:24:43 (current)
guix d059485
branch: master
commit: d059485257bbe5b4f4d903b357ec99a3af2d4f39
jackhill@kalessin ~$ sudo -E guix system -v3 reconfigure /config.scm
The following derivations will be built:
/gnu/store/xqdm3fslr3n0jyxh6i3nsn237lygjfwf-system.drv
/gnu/store/2p1s41kwh9w7w8cijg3r4zplc9f9i6fw-activate.scm.drv
/gnu/store/jgagsl2m5x5vi63s3hdwg6lb58m8qiz1-activate-service.scm.drv
/gnu/store/dsv31bkl2vwqhqgrqvz59wir009ix3kb-etc.drv
/gnu/store/9f2rvmk0xii50smi8dwn0q9556y7qc94-rottlog.drv
/gnu/store/ky3yw75v55g06ggi4i0xk155i7knn10f-sudoers.drv
/gnu/store/b2h0nkrd03zff082lg7y149aw3j9yfxg-profile.drv
/gnu/store/hlr9ypdb841sz2w949mxi5kqhvv2dd22-boot.drv
/gnu/store/y8s53y9irwbsy1pc07vbczbp7jwsrsw4-shepherd.conf.drv
/gnu/store/6zk7p1iljyayb5hyafgbzik06cq0f00j-shepherd-ssh-daemon-ssh-sshd.go.drv
/gnu/store/p89f6qy78yarsjrmq8mkrjihnk4hpm25-shepherd-ssh-daemon-ssh-sshd.scm.drv
/gnu/store/kscdry7kq4izr7nyzs6gq3kg0hqcjffx-shepherd-guix-daemon.go.drv
/gnu/store/aa4wgjx3625m5k71i5rzb0ywx9z6a0i3-shepherd-guix-daemon.scm.drv
/gnu/store/qy2sl92bqnzahvpzb6imgspp6llpz0cj-shepherd-mcron.go.drv
/gnu/store/xdxd5gfvzk4g0m2idbfcrp3d32gm0vz6-shepherd-mcron.scm.drv
/gnu/store/q8ampzxsdkibl15jhlvq30gic5qgm0wi-mcron-job.drv
/gnu/store/qj9nqyhci6zhkfprpwch90ry5hkhwvbx-mcron-job.drv
/gnu/store/6gx45db5mwraihq1qv8c9vmxhdskjk1a-grub.cfg.drv
/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv
The following grafts will be made:
/gnu/store/fwwwnlzhckvi4wmw89m9az9y9wb9v6q9-rottlog-0.72.2.drv
/gnu/store/26z2lhnqhzr5b88axv7b38fgqjl3w2h8-usbutils-013.drv
The following profile hooks will be built:
/gnu/store/5c19y82k9pw297w0b5gn8j6p7g7c6h60-ca-certificate-bundle.drv
/gnu/store/j5plp2k4bkjilqx1yw9mkavy37ipp29h-fonts-dir.drv
/gnu/store/lcilg958v3adfl8jljkjwpwihbzsyr6c-info-dir.drv
/gnu/store/z5m7ra9zd3vhqbp5hg4695s2jgsggr6q-manual-database.drv
building /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv...
Backtrace:
2 (primitive-load "/gnu/store/larqpc2wjhnc6jmj4885k8lynd1?")
In gnu/build/svg.scm:
53:6 1 (svg->png _ "/gnu/store/vmldvxllh07k641wmbnlz3migga29r?" ?)
In unknown file:
0 (rsvg-handle-render-cairo #<rsvg-handle 7ffff5b60150> #)

ERROR: In procedure rsvg-handle-render-cairo:
Wrong type (expecting finalized smob): #<cairo-context 7ffff5b60090>
builder for `/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv' failed with exit code 1
build of /gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv failed
View build log at '/var/log/guix/drvs/07/xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv.bz2'.
cannot build derivation `/gnu/store/6gx45db5mwraihq1qv8c9vmxhdskjk1a-grub.cfg.drv': 1 dependencies couldn't be built
guix system: error: build of `/gnu/store/6gx45db5mwraihq1qv8c9vmxhdskjk1a-grub.cfg.drv' failed

Do you think it is worth creating another VM to see if it's a problem with
the VM configuration?

Best,
Jack
Jack Hill wrote on 14 Mar 2021 20:52
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103142350500.8138@marsh.hcoop.net
On Sun, 14 Mar 2021, Jack Hill wrote:

> After running guix gc, I rebooted, ran guix pull

Er, I wrote it backwards here, but I ran them in the correct order: delete
roots, reboot, gc, pull, …
Ludovic Courtès wrote on 15 Mar 2021 06:43
control message for bug #47115
(address . control@debbugs.gnu.org)
87k0q8h6qx.fsf@gnu.org
severity 47115 important
quit
Jack Hill wrote on 15 Mar 2021 13:48
Re: bug#47115: Failure building grub-img.png when reconfiguring
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103151642290.8138@marsh.hcoop.net
I was able to reproduce this on a new VM with the same hosting provider
(Ramnode), but in a different data center. Therefore, I conclude that it
is not a fault in the particular hardware the VMs are running on, but that
it could be a general problem with the hardware used by Ramnode or their
virtualization software.

Apologies for not mentioning the provider before. At the time, I didn't
see the need to advertise them, and it didn't seem likely to me that the
problem was particular to them.

The good news is that I now have a VM dedicated to reproducing this
problem, so if anyone would like access to help with investigation, please
let me know (and include your preferred username and ssh public key).

For what it's worth, I noticed that guile-rsvg was substituted from
ci.guix.gnu.org during the failed reconfigure.

Best,
Jack
Jack Hill wrote on 15 Mar 2021 18:41
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103152139000.8138@marsh.hcoop.net
On Mon, 15 Mar 2021, Jack Hill wrote:

> I was able to reproduce this on a new VM with the same hosting provider
> (Ramnode), but in a different data center. Therefore, I conclude that it is
> not a fault in the particular hardware the VMs are running on, but that it
> could be a general problem with the hardware used by Ramnode or their
> virtualization software.
>
> Apologies for not mentioning the provider before. At the time, I didn't see
> the need to advertise them, and it didn't seem likely to me that the problem
> was particular to them.

I've now reproduced this problem at a different VM provider (Linode), so I
strongly suspect that it doesn't have anything to do with the hardware or
virtualization configuration. There must be something about my
operating system configuration that triggers this bug.

Best,
Jack
Jack Hill wrote on 15 Mar 2021 19:40
(name . Mark H Weaver)(address . mhw@netris.org)(address . 47115@debbugs.gnu.org)
alpine.DEB.2.21.2103152233590.8138@marsh.hcoop.net
I believe that I have identified the problematic difference in my
operating system config between my working and non-working hosts. After
applying the following patch to my operating system config (good and bad
versions attached), I was able to successfully reconfigure with guix
8ec0ca8faff62f19426f22aeb1bd59a8950ca05a (I was able to reproduce the
failure with that commit on another VM):

--- bad.scm 2021-03-15 22:36:36.000000001 -0400
+++ good.scm 2021-03-15 22:37:01.000000001 -0400
@@ -79,8 +79,6 @@
(guix-service-type config =>
(guix-configuration
(inherit config)
- (extra-options
- '("--disable-deduplication"))
(authorized-keys
(cons
(local-file "/home/jackhill/alperton-guix-key.pub")

I am forced to conclude that running the guix-daemon with deduplication
disabled causes this build failure. Spooky!

Best,
Jack
Attachment: bad.scm
Attachment: good.scm
Mark H Weaver wrote on 16 Mar 2021 01:26
Re: bug#47115: Grafts without deduplication can lead to breakage in Guile (was: Failure building grub-img.png when reconfiguring)
(name . Jack Hill)(address . jackhill@jackhill.us)(address . 47115@debbugs.gnu.org)
875z1rjyfn.fsf@netris.org
retitle 47115 Grafts without deduplication can lead to breakage in Guile
thanks

Hi Jack,

Jack Hill <jackhill@jackhill.us> writes:

> I believe that I have identified the problematic difference in my
> operating system config between my working and non-working hosts.

Thanks very much for your investigation.

> I am forced to conclude that running the guix-daemon with deduplication
> disabled causes this build failure. Spooky!

Very interesting!

The only difference deduplication makes is that it (usually) causes
identical files in the store to be hard links to the same inode.

I have a new hypothesis:

Suppose that a reference to the original (ungrafted) version of some
library (e.g. libcairo or librsvg) has survived unchanged by the
grafting process. This could lead to two copies of the same library
being loaded. For example, I guess that libcairo is loaded by both
librsvg and by guile-cairo. One of them might load the original
libcairo, and the other might load the replacement libcairo.

If the library is loaded twice, that could lead to each instance of the
library having its own dynamically-allocated type tags, which could lead
to this kind of error.

Here's the code where the error occurred:


Guile uses 'cairo-create' (via guile-cairo) to create a cairo-context,
and then passes it to 'rsvg-handle-render-cairo', a 'librsvg' function,
which complains that the context argument has the wrong type.

If 'guile-cairo' was somehow using a different instance of 'libcairo'
than the one that 'librsvg' is linked to, that could explain what we're
seeing, because the two instances of 'libcairo' would have different
ideas of what the cairo-context tag should be.

However, *if* you have deduplication enabled, and *if* the library in
question doesn't contain any references that require rewrites due to
grafts, then these two copies of the library would most likely[*] be
hard links to the same inode. Perhaps in that case, the run-time loader
recognizes that these are in fact the same library, and suppresses the
redundant load.
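
The hard-link point can be seen with a small experiment. The sketch below uses temporary files as stand-ins for store items (the file names are made up), and it assumes the loader tells files apart by inode, which is part of the hypothesis rather than something verified here.

```shell
#!/bin/sh
# Temporary files stand in for two store copies of one library.
tmp=$(mktemp -d)
printf 'identical contents\n' > "$tmp/lib-original"
ln "$tmp/lib-original" "$tmp/lib-deduplicated"   # hard link, as deduplication would create
cp "$tmp/lib-original" "$tmp/lib-separate"       # same bytes, but a distinct file

# Print the inode number of a file.
inode() { ls -i "$1" | awk '{print $1}'; }

ino_original=$(inode "$tmp/lib-original")
ino_deduplicated=$(inode "$tmp/lib-deduplicated")
ino_separate=$(inode "$tmp/lib-separate")

[ "$ino_original" = "$ino_deduplicated" ] && echo "hard link: same inode"
[ "$ino_original" != "$ino_separate" ] && echo "copy: different inode"
cmp -s "$tmp/lib-original" "$tmp/lib-separate" && echo "copy: same bytes"

rm -rf "$tmp"
```

With deduplication disabled, two byte-identical libraries would be in the situation of the copy: equal contents, but nothing at the file-system level that marks them as the same file.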

I don't know if this is what's happening, but it seems plausible.
Thoughts?

Regards,
Mark
Mark H Weaver wrote on 16 Mar 2021 02:18
Re: bug#47115: Redundant library grafts leads to breakage (was: Failure building grub-img.png when reconfiguring)
(name . Jack Hill)(address . jackhill@jackhill.us)(address . 47115@debbugs.gnu.org)
8735wvjw25.fsf@netris.org
retitle 47115 Redundant library grafts leads to breakage
thanks

Hi,

I looked a bit deeper, and now I think I finally know what's going on.
It turns out that the grafting process is creating two redundant
variants of the replacement guile-cairo.

All of the relevant information is in
/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv and its
dependents, which fails to build if deduplication is disabled.

If you look through the output of "guix size
/gnu/store/07xw2pp63xin4c4y8ndrcdn3n8z1vmx2-grub-image.png.drv", you'll
find three distinct guile-cairo derivations:

(1) /gnu/store/vz4yw7zkm73diy95mdzywgixal3nf2s2-guile-cairo-1.11.2.drv,
=> /gnu/store/5nmzfnxk8kp85xwma2r585fgyz3jfw56-guile-cairo-1.11.2
(the original, ungrafted, cairo)

(2) /gnu/store/rcl324yiq7a56rwkqwgqx097dwc5mgni-guile-cairo-1.11.2.drv,
=> /gnu/store/vjn7ygzzqshvsfzck8hq5lp5pfrr2xp5-guile-cairo-1.11.2
(the first graft)

(3) /gnu/store/9mha4bzbji8iql50prmq9br4j1c51sjn-guile-cairo-1.11.2.drv,
=> /gnu/store/j69k9n0g3h9ppqi7dmqypwy3lrhxvb97-guile-cairo-1.11.2
(the second graft)

In the 'guile-builder' files referenced from the two graft derivations,
we see that they have the same inputs and perform the same rewrites, but
list them in a different order. Graft 1 has this guile-builder:

(begin
  (use-modules (guix build graft) (guix build utils) (ice-9 match))
  (define %output (getenv "out"))
  (define %outputs (map (lambda (o) (cons o (getenv o))) (quote ("out"))))
  (define %build-inputs
    (quote
     (("x" . "/gnu/store/5nmzfnxk8kp85xwma2r585fgyz3jfw56-guile-cairo-1.11.2")
      ("x" . "/gnu/store/fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10")
      ("x" . "/gnu/store/na0x00biq02fm5cyj5a8r67qwsnsskw8-cairo-1.16.0")
      ("x" . "/gnu/store/cw69is9wbbllwx95wky4pmbcsk4vvbpd-libxrender-0.9.10")
      ("x" . "/gnu/store/qrs0p8j3wq6q5a4dm0ndjdavk9gyal5q-libxext-1.3.4")
      ("x" . "/gnu/store/rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12")
      ("x" . "/gnu/store/p51dv37zj24q8001zghc3wxhxz8i3c50-cairo-1.16.0")
      ("x" . "/gnu/store/pzj036f1nmxdrbza6cqy419ddsn9bydp-libxrender-0.9.10")
      ("x" . "/gnu/store/3rmazp46f6g8w9qs8n3w7qcg8hhs1lig-libxext-1.3.4"))))
  (unsetenv "GUILE_LOAD_COMPILED_PATH")
  (unsetenv "LD_LIBRARY_PATH"))
(exit
 (begin
   (let* ((old-outputs
           (quote
            (("out" . "/gnu/store/5nmzfnxk8kp85xwma2r585fgyz3jfw56-guile-cairo-1.11.2"))))
          (mapping
           (append (quote
                    (("/gnu/store/fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10"
                      . "/gnu/store/rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12")
                     ("/gnu/store/na0x00biq02fm5cyj5a8r67qwsnsskw8-cairo-1.16.0"
                      . "/gnu/store/p51dv37zj24q8001zghc3wxhxz8i3c50-cairo-1.16.0")
                     ("/gnu/store/cw69is9wbbllwx95wky4pmbcsk4vvbpd-libxrender-0.9.10"
                      . "/gnu/store/pzj036f1nmxdrbza6cqy419ddsn9bydp-libxrender-0.9.10")
                     ("/gnu/store/qrs0p8j3wq6q5a4dm0ndjdavk9gyal5q-libxext-1.3.4"
                      . "/gnu/store/3rmazp46f6g8w9qs8n3w7qcg8hhs1lig-libxext-1.3.4")))
                   (map (match-lambda ((name . file)
                                       (cons (assoc-ref old-outputs name) file)))
                        %outputs))))
     (graft old-outputs %outputs mapping))))

Graft 2 has this guile-builder:

(begin
  (use-modules (guix build graft) (guix build utils) (ice-9 match))
  (define %output (getenv "out"))
  (define %outputs (map (lambda (o) (cons o (getenv o))) (quote ("out"))))
  (define %build-inputs
    (quote (("x" . "/gnu/store/5nmzfnxk8kp85xwma2r585fgyz3jfw56-guile-cairo-1.11.2")
            ("x" . "/gnu/store/na0x00biq02fm5cyj5a8r67qwsnsskw8-cairo-1.16.0")
            ("x" . "/gnu/store/fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10")
            ("x" . "/gnu/store/cw69is9wbbllwx95wky4pmbcsk4vvbpd-libxrender-0.9.10")
            ("x" . "/gnu/store/qrs0p8j3wq6q5a4dm0ndjdavk9gyal5q-libxext-1.3.4")
            ("x" . "/gnu/store/p51dv37zj24q8001zghc3wxhxz8i3c50-cairo-1.16.0")
            ("x" . "/gnu/store/rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12")
            ("x" . "/gnu/store/pzj036f1nmxdrbza6cqy419ddsn9bydp-libxrender-0.9.10")
            ("x" . "/gnu/store/3rmazp46f6g8w9qs8n3w7qcg8hhs1lig-libxext-1.3.4"))))
  (unsetenv "GUILE_LOAD_COMPILED_PATH")
  (unsetenv "LD_LIBRARY_PATH"))
(exit
 (begin
   (let* ((old-outputs
           (quote
            (("out" . "/gnu/store/5nmzfnxk8kp85xwma2r585fgyz3jfw56-guile-cairo-1.11.2"))))
          (mapping
           (append (quote
                    (("/gnu/store/na0x00biq02fm5cyj5a8r67qwsnsskw8-cairo-1.16.0"
                      . "/gnu/store/p51dv37zj24q8001zghc3wxhxz8i3c50-cairo-1.16.0")
                     ("/gnu/store/fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10"
                      . "/gnu/store/rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12")
                     ("/gnu/store/cw69is9wbbllwx95wky4pmbcsk4vvbpd-libxrender-0.9.10"
                      . "/gnu/store/pzj036f1nmxdrbza6cqy419ddsn9bydp-libxrender-0.9.10")
                     ("/gnu/store/qrs0p8j3wq6q5a4dm0ndjdavk9gyal5q-libxext-1.3.4"
                      . "/gnu/store/3rmazp46f6g8w9qs8n3w7qcg8hhs1lig-libxext-1.3.4")))
                   (map (match-lambda ((name . file)
                                       (cons (assoc-ref old-outputs name) file)))
                        %outputs))))
     (graft old-outputs %outputs mapping))))
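
To see why list order alone is enough to produce two distinct grafts, here is a rough analogy rather than Guix's actual hashing: two of the mappings from the builders are written in both orders and hashed. `digest` and `canonical_digest` are hypothetical helpers, `sha256sum` from GNU coreutils is assumed to be available, and sorting is just one conceivable normalization, not a fix proposed in this thread.

```shell
#!/bin/sh
# Two of the rewrite mappings from the guile-builder files (old path, new path).
x11='/gnu/store/fx3979c88s9yxdbchyf36qryawgzpwb5-libx11-1.6.10 /gnu/store/rwkqxykm91a75w9afhb41saj0dmf30hw-libx11-1.6.12'
cairo='/gnu/store/na0x00biq02fm5cyj5a8r67qwsnsskw8-cairo-1.16.0 /gnu/store/p51dv37zj24q8001zghc3wxhxz8i3c50-cairo-1.16.0'

# Same entries, listed in the two orders seen in the thread.
graft1=$(printf '%s\n%s\n' "$x11" "$cairo")
graft2=$(printf '%s\n%s\n' "$cairo" "$x11")

# Hash the list exactly as written: order matters.
digest() { printf '%s\n' "$1" | sha256sum | cut -d' ' -f1; }

# Hash the list after sorting it: order no longer matters.
canonical_digest() { printf '%s\n' "$1" | LC_ALL=C sort | sha256sum | cut -d' ' -f1; }

echo "as written: $(digest "$graft1") vs $(digest "$graft2")"
echo "sorted:     $(canonical_digest "$graft1") vs $(canonical_digest "$graft2")"
```

The first pair of digests differs while the second pair matches, which mirrors how two builders that perform the same rewrites in a different order end up as separate derivations with separate outputs.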

I think that my last hypothesis was on the right track, but not quite
right:

* Instead of 'libcairo' being loaded twice, I now suspect that
"libguile-cairo.so" is being loaded twice.

* Instead of the original and replacement libraries being loaded, I now
suspect that two different variants of the replacement "guile-cairo"
are being loaded.

* Instead of libcairo type tags being duplicated, I now suspect that
duplicated smob tags are being allocated.

However, *if* deduplication is enabled, two redundant replacements
created by grafting _should_ occupy the same inodes, assuming that the
replacement mappings are the same (modulo ordering), and assuming that
/gnu/store/.links doesn't hit a directory size limit (which can happen
on ext3/4, leading to missed deduplication opportunities).

I've known about these redundant replacements in Guix for many years,
but was not aware of any significant practical problems arising from
them until now.

Mark
Ludovic Courtès wrote on 20 Mar 2021 04:01
Re: bug#47115: Redundant library grafts leads to breakage
(name . Mark H Weaver)(address . mhw@netris.org)
87r1ka9ji3.fsf@gnu.org
Hi Mark,

Mark H Weaver <mhw@netris.org> skribis:

> I think that my last hypothesis was on the right track, but not quite
> right:
>
> * Instead of 'libcairo' being loaded twice, I now suspect that
> "libguile-cairo.so" is being loaded twice.
>
> * Instead of the original and replacement libraries being loaded, I now
> suspect that two different variants of the replacement "guile-cairo"
> are being loaded.
>
> * Instead of libcairo type tags being duplicated, I now suspect that
> duplicated smob tags are being allocated.
>
> However, *if* deduplication is enabled, two redundant replacements
> created by grafting _should_ occupy the same inodes, assuming that the
> replacement mappings are the same (modulo ordering), and assuming that
> /gnu/store/.links doesn't hit a directory size limit (which can happen
> on ext3/4, leading to missed deduplication opportunities).

Woow, thanks for the investigation! You wouldn’t think that
deduplication can have an effect on this kind of bug.

> I've known about these redundant replacements in Guix for many years,
> but was not aware of any significant practical problems arising from
> them until now.

Do you know why the two guile-cairo grafts differ in this case?

I’m aware of one case that can lead to that: the grafting code can
create grafts for just one output of the original derivation, or for all
of them (commit 482fda2729c3e76999892cb8f9a0391a7bd37119). Maybe that’s
what’s happening here?

Thank you!

Ludo’.
Jack Hill wrote on 17 Apr 2021 16:39
Re: bug#47853: guix system reconfigure build of `/gnu/store/inppfcz5yk5a20cwhv1dwqn8zq6jcdxl-grub.cfg.drv' failed
(name . Dr. Arne Babenhauserheide)(address . arne_bab@web.de)
alpine.DEB.2.21.2104171934190.8414@marsh.hcoop.net
On Sun, 18 Apr 2021, Dr. Arne Babenhauserheide wrote:

> On my system building grub fails because grub-image.png fails.

[…]

> Backtrace:
> 2 (primitive-load "/gnu/store/larqpc2wjhnc6jmj4885k8lynd1?")
> In gnu/build/svg.scm:
> 53:6 1 (svg->png _ "/gnu/store/0f2bpqpgflza414sk0hwms3rdizg1x?" ?)
> In unknown file:
> 0 (rsvg-handle-render-cairo #<rsvg-handle 7ffff0b56280> #)
>
> ERROR: In procedure rsvg-handle-render-cairo:
> Wrong type (expecting finalized smob): #<cairo-context 7ffff0b561c0>

Oh dear. I ran into a similar problem (at least that resulted in the same
Guile error message in #47115) to do with grafts and a store that was not
using the hard links for de-duplication. Essentially the de-duplication
masked an issue with differing files being included in the closure.

Can you provide more information about your system? I'm particularly
interested in whether you're using store de-duplication, but other
specifics about your system might be interesting. Do you know when you
last successfully reconfigured?

Best,
Jack
Ludovic Courtès wrote on 18 Apr 2021 03:17
control message for bug #47115
(address . control@debbugs.gnu.org)
877dkznbh9.fsf@gnu.org
merge 47115 47853
quit
Richard Sent wrote on 26 May 23:57 -0700
Another occurrence in the wild
(address . 47115@debbugs.gnu.org)
87msobkaij.fsf@freakingpenguin.com
Hi Guix!

I believe I found another instance of this bug coming back to haunt
unfortunate, wayward souls. (including me! 😭).


--
Take it easy,
Richard Sent
Making my computer weirder one commit at a time.
Simen Endsjø wrote on 20 Jun 11:00 -0700
Yet another occurrence in the wild
(address . 47115@debbugs.gnu.org)
87tthncx52.fsf@simendsjo.me
After updating to the "lisp-team" branch which included an updated sbcl
and more (commit 2d5a7bfed5ccae6ce8adbef3ae1017d6ce8512be), my system broke
down when reading ~/.stumpwm.d/init.lisp with a message like "failed
AVER: (= (ASD (LENGTH KV-VECTOR) -1 ... This is probably a bug in SBCL
itself. ..."

references https://issues.guix.gnu.org/62890. I tried with --no-grafts
as that repo does, and my system works (commit
e32e3d0a03dc17c4c54a91aad053c9036998b601).