Report forwarded
to bug-guix@gnu.org
:
bug#31785
; Package guix
.
(Mon, 11 Jun 2018 14:07:02 GMT) (full text, mbox, link).
Acknowledgement sent
to ludo@gnu.org (Ludovic Courtès)
:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org
.
(Mon, 11 Jun 2018 14:07:03 GMT) (full text, mbox, link).
Message #5 received at submit@debbugs.gnu.org (full text, mbox, reply):
Hello, I tried running this: guix build --max-jobs=200 $(guix gc -R $(guix build -d inkscape --no-grafts) | sort) & \ guix build --max-jobs=200 $(guix gc -R $(guix build -d inkscape --no-grafts) | sort -r) … also in parallel with this (for good measure): guix build --max-jobs=200 $(guix gc -R $(guix build -d inkscape --no-grafts) | sort -R) Since we have 3 clients, that leads to 3 guix-daemon processes, and those are stuck in a deadlock: --8<---------------cut here---------------start------------->8--- $ ps aux | grep guix-daemon […] root 20501 0.6 0.0 39292 12012 ? Ss 15:51 0:01 guix-daemon 20455 root 20503 4.3 0.0 39420 12096 ? Ss 15:51 0:12 guix-daemon 20491 root 22154 0.0 0.0 39028 11016 ? Ss 15:52 0:00 guix-daemon 22148 $ sudo strace -p 22154 -t strace: Process 22154 attached 15:57:03 select(0, 0x7ffdca30b610, NULL, NULL, {tv_sec=1, tv_usec=314189}) = 0 (Timeout) 15:57:04 openat(AT_FDCWD, "/gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binutils-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-coreutils-8.29-debug.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0.19.8.1.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glibc-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnutls-3.5.18-doc.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-groff-1.22.3-doc.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libstdc++-5.5.0.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1.4.18.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-module-import-compiled.lock", O_RDWR|O_CREAT, 0600) = 8 15:57:04 fcntl(8, F_GETFD) = 0 15:57:04 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:57:04 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:57:04 close(8) = 0 15:57:04 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-openssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) = 8 ^C $ sudo strace -p 20503 -t strace: Process 20503 attached 15:58:23 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=1, tv_usec=595302}) = 0 (Timeout) 15:58:25 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:25 fcntl(14, F_GETFD) = 0 15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:25 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:25 close(14) = 0 15:58:25 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-module-import-compiled.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:25 fcntl(14, F_GETFD) = 0 15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:25 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:25 close(14) = 0 15:58:25 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-openssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:25 fcntl(14, F_GETFD) = 0 15:58:25 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:25 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:25 close(14) = 0 15:58:25 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=5, tv_usec=0}) = 0 (Timeout) 15:58:30 openat(AT_FDCWD, "/gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:30 fcntl(14, F_GETFD) = 0 15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:30 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:30 close(14) = 0 15:58:30 openat(AT_FDCWD, "/gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-module-import-compiled.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:30 fcntl(14, F_GETFD) = 0 15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:30 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:30 close(14) = 0 15:58:30 openat(AT_FDCWD, "/gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-openssl-1.0.2o-doc.lock", O_RDWR|O_CREAT, 0600) = 14 15:58:30 fcntl(14, F_GETFD) = 0 15:58:30 fcntl(14, F_SETFD, FD_CLOEXEC) = 0 15:58:30 fcntl(14, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:58:30 close(14) = 0 15:58:30 select(445, [15 54 159 177 321 411 414 444], NULL, NULL, {tv_sec=5, tv_usec=0}^Cstrace: Process 20503 detached $ sudo strace -p 20501 -t strace: Process 20501 attached 15:59:09 select(121, [20 30 120], NULL, NULL, {tv_sec=0, tv_usec=535316}) = 0 (Timeout) 15:59:10 openat(AT_FDCWD, "/gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binutils-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-coreutils-8.29-debug.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0.19.8.1.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glibc-bootstrap-0.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnutls-3.5.18-doc.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-groff-1.22.3-doc.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libstdc++-5.5.0.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 openat(AT_FDCWD, "/gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1.4.18.lock", O_RDWR|O_CREAT, 0600) = 8 15:59:10 fcntl(8, F_GETFD) = 0 15:59:10 fcntl(8, F_SETFD, FD_CLOEXEC) = 0 15:59:10 fcntl(8, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Risurco dumtempe ne disponeblas) 15:59:10 close(8) = 0 15:59:10 select(121, [20 30 120], NULL, NULL, {tv_sec=5, tv_usec=0}^Cstrace: Process 20501 detached $ sudo lsof | grep '/gnu/store/.*\.lock' guix-daem 20501 root 14uW REG 253,0 0 3150879 /gnu/store/zvvgr301hz95ac15rp0wsjy5j4cdgkg4-module-import-compiled.lock guix-daem 20501 root 26uW REG 253,0 0 3212618 /gnu/store/760kj698c9x1vcmlzk1qr8i9hqw961n3-openssl-1.0.2o-doc.lock guix-daem 20501 root 116uW REG 253,0 0 3212684 /gnu/store/amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock guix-daem 20503 root 8uW REG 253,0 0 3212520 /gnu/store/7rs9qx86f2w4hjbf9wjm10k4m2vvlbyv-groff-1.22.3-doc.lock guix-daem 20503 root 50uW REG 253,0 0 3212514 /gnu/store/ghc9vcc8ydv305jz0spd73qrrf30gxik-binutils-bootstrap-0.lock guix-daem 20503 root 155uW REG 253,0 0 3212547 /gnu/store/j1kpzcpi1m1ihzwyqb60zd4s74yp85g8-glibc-bootstrap-0.lock guix-daem 20503 root 173uW REG 253,0 0 3212552 /gnu/store/4llnxsgr6mxxj6pk21d9vbwaxnkbn99k-gnutls-3.5.18-doc.lock guix-daem 20503 root 317uW REG 253,0 0 3212602 /gnu/store/d645d919gp1zk8v65qa2r4ia5x1c5n6k-libstdc++-5.5.0.lock guix-daem 20503 root 407uW REG 253,0 0 3212655 /gnu/store/5hdpffkgkfszpsin2ir50zscz06x0mb5-coreutils-8.29-debug.lock guix-daem 20503 root 410uW REG 253,0 0 3212659 /gnu/store/aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0.19.8.1.lock guix-daem 20503 root 440uW REG 253,0 0 3212686 /gnu/store/p262qk7zsd2wnlal6x1jax2xbqvdddqq-m4-1.4.18.lock --8<---------------cut here---------------end--------------->8--- So for example: • 20501 tries to acquire aasfcliv85frq3aawpa4rw18nvlvl735-gettext-minimal-0.19.8.1.lock, which is held by 20503; • 20503 tries to acquire amcs7l0ynj1qg6fp9ll3asiamd4zsq75-m4-1.4.18.lock, which is held by 20501. This comes from the fact that ‘LocalStore::buildPaths’ takes the user-supplied derivation list as is, without sorting it, and then acquires locks in that order in ‘Worker::run’. A topological sort (or maybe an alphanumeric sort?) should allow us to guarantee that guix-daemon processes take locks in the same order, and then don’t end up in a deadlock. I discovered this bug while monitoring Cuirass on berlin: several sessions submit batches of 200 derivations in ‘build-paths’ RPCs, and sometimes most of the corresponding guix-daemon processes would end up being stuck in a lock-acquiring loop. Ludo’.
Severity set to 'important' from 'normal'
Request was from ludo@gnu.org (Ludovic Courtès)
to control@debbugs.gnu.org
.
(Mon, 11 Jun 2018 15:43:02 GMT) (full text, mbox, link).
Information forwarded
to bug-guix@gnu.org
:
bug#31785
; Package guix
.
(Fri, 06 Sep 2019 09:06:01 GMT) (full text, mbox, link).
Message #10 received at 31785@debbugs.gnu.org (full text, mbox, reply):
Here’s another example from berlin today: --8<---------------cut here---------------start------------->8--- ludo@berlin$ sudo guix processes SessionPID: 38649 ClientPID: 38611 ClientCommand: /gnu/store/7x9lrzs2l00mfcr3ya7hg1i06i7lq9pd-profile/bin/guile \ /home/mbakke/guix/master/scripts/guix build --no-grafts --keep-going --max-silent-time=36000 --timeout=216000 -s armhf-linux nmoldyn r-msnid r-yamss netcdf-fortran python-netcdf4 LockHeld: /gnu/store/v6hcm1gvv3gdfbnq6cd4kn1i6ip3y74s-netcdf-fortran-4.4.4.lock LockHeld: /gnu/store/ydxl0ws9kk6ix58m99ib7qhi157yq49h-r-msnid-1.18.1.lock ChildProcess: 33181: /gnu/store/9alic3caqhay3h8mx4iihpmyj6ymqpcx-guile-2.2.4/bin/guile \ /gnu/store/vyhfp0gj4rx74yw0ybalzj8bvmqgp9pr-guix-1.0.1-4.c902458/bin/.guix-real offload x86_64-linux 36000 1 216000 [...] SessionPID: 44043 ClientPID: 44023 ClientCommand: /gnu/store/7x9lrzs2l00mfcr3ya7hg1i06i7lq9pd-profile/bin/guile \ /home/mbakke/guix/master/scripts/guix build --no-grafts --keep-going --max-silent-time=36000 --timeout=216000 -s armhf-linux netcdf-fortran r-msnid nmoldyn r-yamss LockHeld: /gnu/store/69jjabypcpx83482rh3yp4a9ncjimw6b-r-yamss-1.10.0.lock LockHeld: /gnu/store/ibdbwxb1i8yal8ilqxxgnzh6bxpinic1-nmoldyn-3.0.11.lock ChildProcess: 48997: /gnu/store/9alic3caqhay3h8mx4iihpmyj6ymqpcx-guile-2.2.4/bin/guile \ /gnu/store/vyhfp0gj4rx74yw0ybalzj8bvmqgp9pr-guix-1.0.1-4.c902458/bin/.guix-real offload x86_64-linux 36000 1 216000 ludo@berlin ~$ sudo strace -p 38649 -e openat,fcntl strace: Process 38649 attached openat(AT_FDCWD, "/gnu/store/ibdbwxb1i8yal8ilqxxgnzh6bxpinic1-nmoldyn-3.0.11.lock", O_RDWR|O_CREAT, 0600) = 10 fcntl(10, F_GETFD) = 0 fcntl(10, F_SETFD, FD_CLOEXEC) = 0 fcntl(10, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Resource temporarily unavailable) openat(AT_FDCWD, "/gnu/store/69jjabypcpx83482rh3yp4a9ncjimw6b-r-yamss-1.10.0.lock", O_RDWR|O_CREAT, 0600) = 10 fcntl(10, F_GETFD) = 0 fcntl(10, F_SETFD, FD_CLOEXEC) = 0 fcntl(10, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Resource temporarily unavailable) ^Cstrace: Process 38649 detached ludo@berlin ~$ sudo strace -p 44043 -e openat,fcntl strace: Process 44043 attached openat(AT_FDCWD, "/gnu/store/ydxl0ws9kk6ix58m99ib7qhi157yq49h-r-msnid-1.18.1.lock", O_RDWR|O_CREAT, 0600) = 10 fcntl(10, F_GETFD) = 0 fcntl(10, F_SETFD, FD_CLOEXEC) = 0 fcntl(10, F_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EAGAIN (Resource temporarily unavailable) ^Cstrace: Process 44043 detached --8<---------------cut here---------------end--------------->8--- A very palpable deadlock. Ludo’.
Information forwarded
to bug-guix@gnu.org
:
bug#31785
; Package guix
.
(Wed, 04 Nov 2020 15:22:02 GMT) (full text, mbox, link).
Message #13 received at 31785@debbugs.gnu.org (full text, mbox, reply):
Hi, ludo@gnu.org (Ludovic Courtès) skribis: > This comes from the fact that ‘LocalStore::buildPaths’ takes the > user-supplied derivation list as is, without sorting it, and then > acquires locks in that order in ‘Worker::run’. This diagnostic is incorrect: ‘Goals’ is a set sorted according to ‘CompareGoalPtrs’, which is lexical sort that arranges so substitution goals come before derivation goals. Thus, ‘_topGoals’ and ‘awake2’ in Worker::run are sorted in a deterministic fashion. The problem is that ‘Worker::waitForAWhile’ reshuffles the order of goals by temporarily moving goals out of the way. This can happen when offloading replies “postpone”, which is inherently non-deterministic (which goals are put to sleep will vary from one session to another session.) When those goals are eventually woken up from ‘Worker::waitForInput’, they’re reprocessed, in sorted order, but potentially with “holes” compared to other ‘guix-daemon’ processes. That’s only a partial explanation; we need to go further to come up with an actual deadlock scenario. Ludo’.
Send a report that this bug log contains spam.
Debbugs is free software and licensed under the terms of the GNU Public License version 2. The current version can be obtained from https://bugs.debian.org/debbugs-source/.
Copyright © 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson, 2005-2017 Don Armstrong, and many other contributors.