Report forwarded to janneke@gnu.org, samplet@ngyro.com, bug-guix@gnu.org: bug#75658; Package guix. (Sat, 18 Jan 2025 22:09:01 GMT)
Acknowledgement sent to Ludovic Courtès <ludovic.courtes@inria.fr>: New bug report received and forwarded. Copy sent to janneke@gnu.org, samplet@ngyro.com, bug-guix@gnu.org. (Sat, 18 Jan 2025 22:09:02 GMT)
Message #5 received at submit@debbugs.gnu.org:
Hello,

I stumbled upon this interesting non-deterministic failure while building ‘gcc-mesboot-4.9.4.drv’ on current ‘core-packages-team’ (which is unchanged compared to ‘master’):

--8<---------------cut here---------------start------------->8---
source directory: "/tmp/guix-build-gcc-mesboot-4.9.4.drv-0/gcc-4.9.4" (relative from build: ".")
build directory: "/tmp/guix-build-gcc-mesboot-4.9.4.drv-0/gcc-4.9.4"
configure flags: ("CONFIG_SHELL=/gnu/store/bhmkf29xki04mmydpm0axpbh35md4vfb-gash-boot-0.3.0/bin/bash" "SHELL=/gnu/store/bhmkf29xki04mmydpm0axpbh35md4vfb-gash-boot-0.3.0/bin/bash" "--prefix=/gnu/store/mgbd56zvid129vkk8l9zir7pf46r5038-gcc-mesboot-4.9.4" "--enable-fast-install" "--build=x86_64-unknown-linux-gnu" "--prefix=/gnu/store/mgbd56zvid129vkk8l9zir7pf46r5038-gcc-mesboot-4.9.4" "--build=i686-unknown-linux-gnu" "--host=i686-unknown-linux-gnu" "--with-host-libstdcxx=-lsupc++" "--with-native-system-header-dir=/gnu/store/qxp7icgwbn1hqqwvkan7aljgzfn439zh-glibc-mesboot-2.16.0/include" "--with-build-sysroot=/gnu/store/qxp7icgwbn1hqqwvkan7aljgzfn439zh-glibc-mesboot-2.16.0/include" "--disable-bootstrap" "--disable-decimal-float" "--disable-libatomic" "--disable-libcilkrts" "--disable-libgomp" "--disable-libitm" "--disable-libmudflap" "--disable-libquadmath" "--disable-libsanitizer" "--disable-libssp" "--disable-libvtv" "--disable-lto" "--disable-lto-plugin" "--disable-multilib" "--disable-plugin" "--disable-threads" "--enable-languages=c,c++" "--enable-static" "--enable-shared" "--enable-threads=single" "--disable-libstdcxx-pch" "--disable-build-with-cxx")
Backtrace:
In gash/eval.scm:
 221: 19 [eval-sh (<sh-set!> ("ac_useropt" (<sh-cmd-sub> #)))]
In srfi/srfi-1.scm:
 642: 18 [for-each #<procedure 1502320 at gash/eval.scm:221:17 (name word)> # #]
In gash/eval.scm:
 222: 17 [#<procedure 1502320 at gash/eval.scm:221:17 (name word)> "ac_useropt" #]
 131: 16 [eval-word (<sh-cmd-sub> (<sh-pipeline> # #)) #:output string ...]
 121: 15 [expand-word (<sh-cmd-sub> (<sh-pipeline> # #)) #:output string ...]
In gash/shell.scm:
 289: 14 [sh:substitute-command #<procedure 15022a0 at gash/eval.scm:129:35 ()>]
 270: 13 [%subshell #<procedure v ()>]
In ice-9/boot-9.scm:
 157: 12 [catch quit #<procedure v ()> ...]
In ice-9/r4rs.scm:
 176: 11 [with-output-to-port #<variable 13a02e0 value: #<output: file 39>> ...]
In srfi/srfi-1.scm:
 619: 10 [for-each #<procedure eval-sh (exp)> ((<sh-pipeline> # #))]
In gash/shell.scm:
 344: 9 [sh:pipeline #<procedure 1506f40 at gash/eval.scm:149:6 ()> ...]
 310: 8 [plumb #<input: #{read pipe}# 36> #f ...]
 270: 7 [%subshell #<procedure thunk* ()>]
In ice-9/boot-9.scm:
 157: 6 [catch quit #<procedure thunk* ()> ...]
In gash/shell.scm:
 316: 5 [thunk*]
 129: 4 [sh:exec-let () "sed" "s/[-+.]/_/g"]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 616: 2 [for-each #<procedure ec3b20 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<input: file 38> "r" 7]
In unknown file:
   ?: 0 [fdopen 7 "r"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
Backtrace:
In gash/eval.scm:
 221: 19 [eval-sh (<sh-set!> ("ac_useropt" (<sh-cmd-sub> #)))]
In srfi/srfi-1.scm:
 642: 18 [for-each #<procedure 1502320 at gash/eval.scm:221:17 (name word)> # #]
In gash/eval.scm:
 222: 17 [#<procedure 1502320 at gash/eval.scm:221:17 (name word)> "ac_useropt" #]
 131: 16 [eval-word (<sh-cmd-sub> (<sh-pipeline> # #)) #:output string ...]
 121: 15 [expand-word (<sh-cmd-sub> (<sh-pipeline> # #)) #:output string ...]
In gash/shell.scm:
 289: 14 [sh:substitute-command #<procedure 15022a0 at gash/eval.scm:129:35 ()>]
 270: 13 [%subshell #<procedure v ()>]
In ice-9/boot-9.scm:
 157: 12 [catch quit #<procedure v ()> ...]
In ice-9/r4rs.scm:
 176: 11 [with-output-to-port #<variable 13a02e0 value: #<output: file 39>> ...]
In srfi/srfi-1.scm:
 619: 10 [for-each #<procedure eval-sh (exp)> ((<sh-pipeline> # #))]
In gash/shell.scm:
 347: 9 [sh:pipeline #<procedure 1506f40 at gash/eval.scm:149:6 ()> ...]
 310: 8 [plumb #f #<output: #{write pipe}# 38> ...]
 270: 7 [%subshell #<procedure thunk* ()>]
In ice-9/boot-9.scm:
 157: 6 [catch quit #<procedure thunk* ()> ...]
In gash/shell.scm:
 316: 5 [thunk*]
 129: 4 [sh:exec-let () "printf" "%s\\n" "libsanitizer"]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 616: 2 [for-each #<procedure ec3b20 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<input: file 36> "r" 7]
In unknown file:
   ?: 0 [fdopen 7 "r"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
checking build system type... i686-unknown-linux-gnu
checking host system type... i686-unknown-linux-gnu
checking target system type... i686-unknown-linux-gnu
checking for a BSD-compatible install... ./install-sh -c
checking whether ln works... yes
checking whether ln -s works... yes
checking for a sed that does not truncate output... /gnu/store/i61mvrw30k8ng8hxym8s180nydnsbji6-gash-utils-boot-0.2.0/bin/sed
checking for gawk... gawk
checking for libsanitizer support... yes
--8<---------------cut here---------------end--------------->8---

What happens is that Gash crashes in the middle of a substitution on $ac_useropt. As a result, ‘--disable-libsanitizer’ (and other options, it seems) are discarded, hence the “libsanitizer support... yes” line. Hours later, the build fails while trying to build libsanitizer.

Any idea what could cause EBADF?

Thanks,
Ludo’.
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Sun, 19 Jan 2025 18:25:01 GMT)
Message #8 received at 75658@debbugs.gnu.org:
Ludovic Courtès <ludovic.courtes@inria.fr> writes:

> I stumbled upon this interesting non-deterministic failure while
> building ‘gcc-mesboot-4.9.4.drv’ on current ‘core-packages-team’ (which
> is unchanged compared to ‘master’):

Just got another one:

--8<---------------cut here---------------start------------->8---
checking for struct sigaction.sa_sigaction... yes
checking for volatile sig_atomic_t... yes
checking for sighandler_t... yes
checking for sigprocmask... (cached) yes
checking whether sleep is declared... yes
checking for working sleep... yes
checking for socklen_t... Backtrace:
In gash/shell.scm:
 129: 19 [sh:exec-let () "ac_fn_c_try_compile" "2817"]
In gash/environment.scm:
 215: 18 [save-variables-excursion () ...]
 292: 17 [with-arguments # #<procedure 2210f00 at gash/shell.scm:145:25 ()>]
 389: 16 [call-with-return #<procedure 2210e40 at gash/shell.scm:147:28 ()>]
In srfi/srfi-1.scm:
 619: 15 [for-each #<procedure eval-sh (exp)> ((<sh-begin> # # # ...))]
 619: 14 [for-each #<procedure eval-sh (exp)> (# # # # ...)]
In gash/shell.scm:
 441: 13 [sh:cond # #]
  55: 12 [without-errexit #<procedure 13185e0 at gash/eval.scm:149:6 ()>]
 372: 11 [sh:and #<procedure 1318560 at gash/eval.scm:149:6 ()> ...]
  55: 10 [without-errexit #<procedure 1318560 at gash/eval.scm:149:6 ()>]
 372: 9 [sh:and #<procedure 1318500 at gash/eval.scm:149:6 ()> ...]
  55: 8 [without-errexit #<procedure 1318500 at gash/eval.scm:149:6 ()>]
In srfi/srfi-1.scm:
 616: 7 [for-each #<procedure eval-sh (exp)> (# # # # ...)]
 619: 6 [for-each #<procedure eval-sh (exp)> (# # #)]
In gash/shell.scm:
 245: 5 [#<procedure 1f63030 at gash/shell.scm:239:17 ()>]
 129: 4 [sh:exec-let () "grep" "-v" "^ *+" "conftest.err"]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 616: 2 [for-each #<procedure ea9a60 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<input: file 20> "r" 7]
In unknown file:
   ?: 0 [fdopen 7 "r"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
yes
checking whether symlink handles trailing slash correctly... yes
checking whether <sys/ioctl.h> declares ioctl... yes
checking for unsetenv... yes
checking for unsetenv() return type... int
--8<---------------cut here---------------end--------------->8---

That one likely doesn’t change the build outcome since it still determines that ‘socklen_t’ is defined, but it sounds a bit like a dice roll.

Ludo’.
Added indication that bug 75658 blocks 75518. Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Thu, 06 Feb 2025 15:18:02 GMT)
Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Mon, 17 Feb 2025 21:18:04 GMT)
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Tue, 11 Mar 2025 21:43:02 GMT)
Message #15 received at 75658@debbugs.gnu.org:
Ludovic Courtès <ludo@gnu.org> writes:

> Ludovic Courtès <ludovic.courtes@inria.fr> writes:
>
>> I stumbled upon this interesting non-deterministic failure while
>> building ‘gcc-mesboot-4.9.4.drv’ on current ‘core-packages-team’ (which
>> is unchanged compared to ‘master’):
>
> Just got another one:

A few more, obtained by running the start of the ‘configure’ script in a loop (added an ‘exit’ on line 2562, which is after the first 4 lines of output).

while ./configure CONFIG_SHELL=/gnu/store/98bd49rhyia49y0b9d7sk8phsq14g3nk-gash-boot-0.3.0/bin/bash SHELL=/gnu/store/98bd49rhyia49y0b9d7sk8phsq14g3nk-gash-boot-0.3.0/bin/bash --prefix=/gnu/store/awkbdj5j41pv5kiy9ifs0zl40jamwfw4-gcc-mesboot-4.9.4 --enable-fast-install --build=x86_64-unknown-linux-gnu --prefix=/gnu/store/awkbdj5j41pv5kiy9ifs0zl40jamwfw4-gcc-mesboot-4.9.4 --build=i686-unknown-linux-gnu --host=i686-unknown-linux-gnu --with-host-libstdcxx=-lsupc++ --with-native-system-header-dir=/gnu/store/gc91zbacrk6prhvm91cj3x9rr3v2k17q-glibc-mesboot-2.16.0/include --with-build-sysroot=/gnu/store/gc91zbacrk6prhvm91cj3x9rr3v2k17q-glibc-mesboot-2.16.0/include --disable-bootstrap --disable-decimal-float --disable-libatomic --disable-libcilkrts --disable-libgomp --disable-libitm --disable-libmudflap --disable-libquadmath --disable-libsanitizer --disable-libssp --disable-libvtv --disable-lto --disable-lto-plugin --disable-multilib --disable-plugin --disable-threads --enable-languages=c,c++ --enable-static --enable-shared --enable-threads=single --disable-libstdcxx-pch --disable-build-with-cxx ; do : ;done

--8<---------------cut here---------------start------------->8---
warning: failed to install locale: Invalid argument
Backtrace:
In gash/environment.scm:
 371: 19 [call-with-break #<procedure 2dda9450 at gash/shell.scm:400:6 ()>]
In srfi/srfi-1.scm:
 619: 18 [for-each #<procedure 2dda9420 at gash/shell.scm:401:18 (value)> #]
In gash/environment.scm:
 353: 17 [call-with-continue #<procedure 2de13460 at gash/eval.scm:158:14 ()>]
In srfi/srfi-1.scm:
 616: 16 [for-each #<procedure eval-sh (exp)> (# # #)]
 619: 15 [for-each #<procedure eval-sh (exp)> ((<sh-set!> ("ac_optarg" #)))]
In gash/eval.scm:
 221: 14 [eval-sh (<sh-set!> ("ac_optarg" (<sh-cmd-sub> #)))]
In srfi/srfi-1.scm:
 642: 13 [for-each #<procedure 2da0f5e0 at gash/eval.scm:221:17 (name word)> # #]
In gash/eval.scm:
 222: 12 [#<procedure 2da0f5e0 at gash/eval.scm:221:17 (name word)> "ac_optarg" #]
 131: 11 [eval-word (<sh-cmd-sub> (<sh-exec> "expr" # ":" ...)) #:output string ...]
 121: 10 [expand-word (<sh-cmd-sub> (<sh-exec> "expr" # ...)) #:output ...]
In gash/shell.scm:
 289: 9 [sh:substitute-command #<procedure 2da0f560 at gash/eval.scm:129:35 ()>]
 270: 8 [%subshell #<procedure v ()>]
In ice-9/boot-9.scm:
 157: 7 [catch quit #<procedure v ()> ...]
In ice-9/r4rs.scm:
 176: 6 [with-output-to-port #<variable 2de5dc00 value: #<output: file /dev/pts/19>> ...]
In srfi/srfi-1.scm:
 619: 5 [for-each #<procedure eval-sh (exp)> ((<sh-exec> "expr" # ":" ...))]
In gash/shell.scm:
 129: 4 [sh:exec-let () "expr" ...]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 616: 2 [for-each #<procedure 2d60f0a0 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<input: file /dev/pts/19> "r0" 7]
In unknown file:
   ?: 0 [fdopen 7 "r0"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
--8<---------------cut here---------------end--------------->8---

And:

--8<---------------cut here---------------start------------->8---
Backtrace:
In ice-9/boot-9.scm:
 157: 17 [catch #t #<catch-closure 25cdf0a0> ...]
In unknown file:
   ?: 16 [apply-smob/1 #<catch-closure 25cdf0a0>]
In ice-9/boot-9.scm:
  63: 15 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 14 [eval # #]
In ice-9/boot-9.scm:
 793: 13 [call-with-input-file "./configure" ...]
In gash/gash.scm:
 121: 12 [#<procedure 262f7700 at gash/gash.scm:120:19 (port)> #<input: ./configure 5>]
In gash/repl.scm:
  38: 11 [run-repl #<input: ./configure 5> #f]
In gash/environment.scm:
 371: 10 [call-with-break #<procedure 26335c00 at gash/shell.scm:400:6 ()>]
In srfi/srfi-1.scm:
 616: 9 [for-each #<procedure 26335bd0 at gash/shell.scm:401:18 (value)> #]
In gash/environment.scm:
 353: 8 [call-with-continue #<procedure 26315260 at gash/eval.scm:158:14 ()>]
In srfi/srfi-1.scm:
 619: 7 [for-each #<procedure eval-sh (exp)> (# # #)]
In gash/shell.scm:
 441: 6 [sh:cond #]
  55: 5 [without-errexit #<procedure 26861c80 at gash/eval.scm:149:6 ()>]
 129: 4 [sh:exec-let () "test" "-n" ""]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 619: 2 [for-each #<procedure 26272b60 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<output: file /dev/pts/19> "w0" 6]
In unknown file:
   ?: 0 [fdopen 6 "w0"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
--8<---------------cut here---------------end--------------->8---

And:

--8<---------------cut here---------------start------------->8---
Backtrace:
In ice-9/boot-9.scm:
 157: 13 [catch #t #<catch-closure 1879d00> ...]
In unknown file:
   ?: 12 [apply-smob/1 #<catch-closure 1879d00>]
In ice-9/boot-9.scm:
  63: 11 [call-with-prompt prompt0 ...]
In ice-9/eval.scm:
 432: 10 [eval # #]
In ice-9/boot-9.scm:
 793: 9 [call-with-input-file "./configure" ...]
In gash/gash.scm:
 121: 8 [#<procedure 1e905e0 at gash/gash.scm:120:19 (port)> #<input: ./configure 5>]
In gash/repl.scm:
  38: 7 [run-repl #<input: ./configure 5> #f]
In gash/shell.scm:
 441: 6 [sh:cond #]
  55: 5 [without-errexit #<procedure 2192680 at gash/eval.scm:149:6 ()>]
 129: 4 [sh:exec-let () "test" "xi686-unknown-linux-gnu" "!=" "x"]
  92: 3 [exec-utility () ...]
In srfi/srfi-1.scm:
 619: 2 [for-each #<procedure 1e06920 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
In ice-9/boot-9.scm:
1473: 1 [dup->port #<output: file /dev/pts/19> "w0" 6]
In unknown file:
   ?: 0 [fdopen 6 "w0"]

ERROR: In procedure fdopen:
ERROR: In procedure scm_fdes_to_port: Bad file descriptor
--8<---------------cut here---------------end--------------->8---

All these happen before the line:

--8<---------------cut here---------------start------------->8---
checking build system type... i686-unknown-linux-gnu
--8<---------------cut here---------------end--------------->8---

Good news: I was able to reproduce with Gash over Guile 3.0.9:

--8<---------------cut here---------------start------------->8---
ludo@ribbon /tmp/guix-build-gcc-mesboot-4.9.4.drv-0/gcc-4.9.4$ guix build gash
/gnu/store/mz5swdf35iwplrgdvm4z256py585nxi6-gash-0.3.0
ludo@ribbon /tmp/guix-build-gcc-mesboot-4.9.4.drv-0/gcc-4.9.4$ while /gnu/store/mz5swdf35iwplrgdvm4z256py585nxi6-gash-0.3.0/bin/gash ./configure CONFIG_SHELL=/gnu/store/98bd49rhyia49y0b9d7sk8phsq14g3nk-gash-boot-0.3.0/bin/bash SHELL=/gnu/store/98bd49rhyia49y0b9d7sk8phsq14g3nk-gash-boot-0.3.0/bin/bash --prefix=/gnu/store/awkbdj5j41pv5kiy9ifs0zl40jamwfw4-gcc-mesboot-4.9.4 --enable-fast-install --build=x86_64-unknown-linux-gnu --prefix=/gnu/store/awkbdj5j41pv5kiy9ifs0zl40jamwfw4-gcc-mesboot-4.9.4 --build=i686-unknown-linux-gnu --host=i686-unknown-linux-gnu --with-host-libstdcxx=-lsupc++ --with-native-system-header-dir=/gnu/store/gc91zbacrk6prhvm91cj3x9rr3v2k17q-glibc-mesboot-2.16.0/include --with-build-sysroot=/gnu/store/gc91zbacrk6prhvm91cj3x9rr3v2k17q-glibc-mesboot-2.16.0/include --disable-bootstrap --disable-decimal-float --disable-libatomic --disable-libcilkrts --disable-libgomp --disable-libitm --disable-libmudflap --disable-libquadmath --disable-libsanitizer --disable-libssp --disable-libvtv --disable-lto --disable-lto-plugin --disable-multilib --disable-plugin --disable-threads --enable-languages=c,c++ --enable-static --enable-shared --enable-threads=single --disable-libstdcxx-pch --disable-build-with-cxx ; do : ;done

[…]

Backtrace:
In ice-9/boot-9.scm:
  1752:10 18 (with-exception-handler _ _ #:unwind? _ #:unwind-for-type _)
In unknown file:
          17 (apply-smob/0 #<thunk 7f8e78d15300>)
In ice-9/boot-9.scm:
    724:2 16 (call-with-prompt _ _ #<procedure default-prompt-handler (k proc)>)
In ice-9/eval.scm:
    619:8 15 (_ #(#(#<directory (guile-user) 7f8e78d18c80>)))
In ice-9/ports.scm:
   433:17 14 (call-with-input-file _ _ #:binary _ #:encoding _ #:guess-encoding _)
In gash/gash.scm:
   121:27 13 (_ _)
In gash/repl.scm:
    38:14 12 (run-repl _ _)
In gash/environment.scm:
    375:8 11 (call-with-break _)
In srfi/srfi-1.scm:
    634:9 10 (for-each #<procedure 7f8e75612420 at gash/shell.scm:401:18 (value)> _)
In gash/environment.scm:
    357:8 9 (call-with-continue _)
In srfi/srfi-1.scm:
    634:9 8 (for-each #<procedure eval-sh (exp)> _)
    634:9 7 (for-each #<procedure eval-sh (exp)> _)
In gash/shell.scm:
    55:39 6 (sh:and _ #<procedure 7f8e75656da0 at gash/eval.scm:149:6 ()>)
   245:24 5 (_)
   159:10 4 (sh:exec-let _ "expr" . _)
     92:9 3 (exec-utility _ "/run/current-system/profile/bin/expr" "expr" ("xliba…" …))
In srfi/srfi-1.scm:
    634:9 2 (for-each #<procedure 7f8e760654c0 at gash/shell.scm:70:12 (i)> _)
In ice-9/ports.scm:
   317:17 1 (dup->port _ _ _)
In unknown file:
          0 (fdopen 6 "w0")

ERROR: In procedure fdopen:
In procedure scm_fdes_to_port: Bad file descriptor
--8<---------------cut here---------------end--------------->8---

Enough backtraces for now. To be continued…

Ludo’.
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Fri, 14 Mar 2025 00:01:02 GMT)
Message #18 received at 75658@debbugs.gnu.org:
Ludovic Courtès <ludo@gnu.org> writes:

> In gash/shell.scm:
>  289: 9 [sh:substitute-command #<procedure 2da0f560 at gash/eval.scm:129:35 ()>]
>  270: 8 [%subshell #<procedure v ()>]
> In ice-9/boot-9.scm:
>  157: 7 [catch quit #<procedure v ()> ...]
> In ice-9/r4rs.scm:
>  176: 6 [with-output-to-port #<variable 2de5dc00 value: #<output: file /dev/pts/19>> ...]
> In srfi/srfi-1.scm:
>  619: 5 [for-each #<procedure eval-sh (exp)> ((<sh-exec> "expr" # ":" ...))]
> In gash/shell.scm:
>  129: 4 [sh:exec-let () "expr" ...]
>   92: 3 [exec-utility () ...]
> In srfi/srfi-1.scm:
>  616: 2 [for-each #<procedure 2d60f0a0 at gash/shell.scm:70:12 (i)> (0 1 2 ...)]
> In ice-9/boot-9.scm:
> 1473: 1 [dup->port #<input: file /dev/pts/19> "r0" 7]
> In unknown file:
>    ?: 0 [fdopen 7 "r0"]
>
> ERROR: In procedure fdopen:
> ERROR: In procedure scm_fdes_to_port: Bad file descriptor

I was able to capture an strace log of this:

--8<---------------cut here---------------start------------->8---
15837 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb10dad7850) = 15838
15838 set_robust_list(0x7fb10dad7860, 24) = 0
15837 wait4(15838, <unfinished ...>
15838 close(3) = 0
15838 close(4) = 0
15838 pipe2([3, 4], O_CLOEXEC) = 0
[...]
15838 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fb10beaa990, parent_tid=0x7fb10beaa990, exit_signal=0, stack=0x7fb10b51b000, stack_size=0x98ef80, tls=0x7fb10beaa6c0} => {parent_tid=[15839]}, 88) = 15839
15839 rseq(0x7fb10beaafe0, 0x20, 0, 0x53053053 <unfinished ...>
15838 rt_sigprocmask(SIG_SETMASK, [], <unfinished ...>
[...]
15838 lseek(2, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
15839 close(10) = 0
15839 close(17 <unfinished ...>
15838 dup2(22, 6 <unfinished ...>
15839 <... close resumed>) = 0
15838 <... dup2 resumed>) = 6
15839 close(6 <unfinished ...>
15838 fcntl(6, F_GETFL <unfinished ...>
15839 <... close resumed>) = 0
15838 <... fcntl resumed>) = -1 EBADF (Bad file descriptor)
15839 close(7) = 0
15839 close(18) = 0
15839 close(15) = 0
15839 close(12) = 0
15839 close(9) = 0
15839 close(16) = 0
15838 write(2, "Backtrace:\n", 11) = 11
--8<---------------cut here---------------end--------------->8---

The sequence goes like this:

  1. A child process (15838) corresponding to the subshell is created;

  2. That process creates a finalization thread (15839);

  3. Main thread does dup2(22, 6); finalization does close(6); main thread does fcntl(6, F_GETFL), which fails with EBADF.

I suspect something like a wrong revealed count on the relevant ports, possibly those created in ‘install-current-ports!’.

Ludo’.
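
The three-step interleaving above can be replayed in a single thread, which makes the failure mode easier to see than in the strace log. This is only a minimal sketch, not code from Gash: ‘replay-interleaving’ is a hypothetical name, and an explicit ‘close-fdes’ call stands in for the finalization thread. It assumes, as the backtraces above suggest, that ‘fdopen’ raises ‘system-error’ when its fcntl(F_GETFL) call fails.

```scheme
;; Illustrative sketch only; not part of Gash or Guile.
(define (replay-interleaving)
  "Replay dup2/close/fcntl in one thread.  Return the errno raised by
'fdopen', or #f if no error is raised."
  (let* ((source (open-input-file "/dev/null"))
         ;; Obtain a free descriptor number to play the role of FD 6.
         (target (dup->fdes source)))
    ;; Main thread: dup2(source, target).
    (dup->fdes source target)
    ;; Finalization thread (simulated here): close(target).
    (close-fdes target)
    ;; Main thread: 'fdopen' queries TARGET with fcntl(F_GETFL).
    (let ((result (catch 'system-error
                    (lambda ()
                      (fdopen target "r")
                      #f)
                    (lambda args
                      (system-error-errno args)))))
      (close-port source)
      result)))

;; Compare the errno against EBADF, the error seen in the reports.
(display (eqv? (replay-interleaving) EBADF))
(newline)
```

In the real bug the close comes from a concurrent finalizer rather than an explicit call, which is why the failure only shows up occasionally.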
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Sat, 15 Mar 2025 05:10:02 GMT)
Message #21 received at 75658@debbugs.gnu.org:
Ludovic Courtès <ludo@gnu.org> writes:

> I was able to capture an strace log of this:
>
> 15837 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fb10dad7850) = 15838
> 15838 set_robust_list(0x7fb10dad7860, 24) = 0
> 15837 wait4(15838, <unfinished ...>
> 15838 close(3) = 0
> 15838 close(4) = 0
> 15838 pipe2([3, 4], O_CLOEXEC) = 0
> [...]
> 15838 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7fb10beaa990, parent_tid=0x7fb10beaa990, exit_signal=0, stack=0x7fb10b51b000, stack_size=0x98ef80, tls=0x7fb10beaa6c0} => {parent_tid=[15839]}, 88) = 15839
> 15839 rseq(0x7fb10beaafe0, 0x20, 0, 0x53053053 <unfinished ...>
> 15838 rt_sigprocmask(SIG_SETMASK, [], <unfinished ...>
> [...]
> 15838 lseek(2, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
> 15839 close(10) = 0
> 15839 close(17 <unfinished ...>
> 15838 dup2(22, 6 <unfinished ...>
> 15839 <... close resumed>) = 0
> 15838 <... dup2 resumed>) = 6
> 15839 close(6 <unfinished ...>
> 15838 fcntl(6, F_GETFL <unfinished ...>
> 15839 <... close resumed>) = 0
> 15838 <... fcntl resumed>) = -1 EBADF (Bad file descriptor)
> 15839 close(7) = 0
> 15839 close(18) = 0
> 15839 close(15) = 0
> 15839 close(12) = 0
> 15839 close(9) = 0
> 15839 close(16) = 0
> 15838 write(2, "Backtrace:\n", 11) = 11
>
> The sequence goes like this:
>
>   1. A child process (15838) corresponding to the subshell is created;
>
>   2. That process creates a finalization thread (15839);
>
>   3. Main thread does dup2(22, 6); finalization does close(6); main
>      thread does fcntl(6, F_GETFL), which fails with EBADF.
>
> I suspect something like a wrong revealed count on the relevant ports,
> possibly those created in ‘install-current-ports!’.

In “boot-9.scm”, we have

    (define dup->port
      (case-lambda
        ((port/fd mode)
         (fdopen (dup->fdes port/fd) mode))
        ((port/fd mode new-fd)
         (let ((port (fdopen (dup->fdes port/fd new-fd) mode)))
           (set-port-revealed! port 1)
           port))))

It looks like the system calls on the main thread correspond to this code (which is called from ‘install-current-ports!’ via ‘dup’). Specifically, ‘dup2’ is called from ‘dup->fdes’ and ‘fcntl’ is called from ‘fdopen’.

The way that ‘dup->fdes’ works is that it first makes sure that no existing port has the desired file descriptor (‘scm_evict_ports’), and then calls ‘dup2’. This should mean that the requested file descriptor is up for grabs.

Here’s my guess as to what’s happening. For brevity let’s call the port with file descriptor 6 “P”.

  1. The GC runs, nullifying the entry for P in the port table (weak key hash table), and queuing its finalizer.

  2. The evict ports loop runs, missing P because it was nullified (see ‘scm_internal_hash_fold’).

  3. ‘dup2’ turns 22 to 6.

  4. The finalizer for P runs, closing 6.

  5. ‘fdopen’ calls ‘fcntl’ on 6, which results in EBADF.

And here’s a reproducer:

    (let loop ()
      (define fd #f)
      (let ((P (open-input-file "/dev/null")))
        ;; Does not change the revealed count of P.
        (set! fd (fileno P)))
      (let ((port (open-input-file "/dev/null")))
        (dup->port port "r" fd)
        (close-port port)
        (loop)))

This results in EBADF in seemingly exactly the same way. (I had to run it a few times: sometimes it runs out of file descriptors first.) This happens on bootstrap Guile (2.0.9) and modern Guile.

That’s all I have for now. I’m not sure how to avoid this without resorting to calling “(gc)” to synchronously run the finalizers before trying to mess with the file descriptors.

-- Tim
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Sun, 16 Mar 2025 15:02:03 GMT)
Message #24 received at 75658@debbugs.gnu.org:
Hello Timothy,

Thanks for chiming in.

Timothy Sample <samplet@ngyro.com> writes:

> And here’s a reproducer:
>
>     (let loop ()
>       (define fd #f)
>       (let ((P (open-input-file "/dev/null")))
>         ;; Does not change the revealed count of P.
>         (set! fd (fileno P)))
>       (let ((port (open-input-file "/dev/null")))
>         (dup->port port "r" fd)
>         (close-port port)
>         (loop)))
>
> This results in EBADF in seemingly exactly the same way. (I had to run
> it a few times: sometimes it runs out of file descriptors first.) This
> happens on bootstrap Guile (2.0.9) and modern Guile.

Nice reproducer; I fully agree with your analysis.

Seeing that ‘install-current-ports!’ creates temporary ports (via ‘dup’) for no reason, since nobody captures their reference and they get GC’d soon after, I rewrote it like this:

--8<---------------cut here---------------start------------->8---
(define (install-current-ports!)
  "Install all current ports into their usual file descriptors. For
example, if @code{current-input-port} is a @code{file-port?}, make the
process file descriptor 0 refer to the file open for
@code{current-input-port}. If any current port is a @code{port?} but
not a @code{file-port?}, its corresponding file descriptor will refer
to @file{/dev/null}."
  ;; XXX: Input/output ports? Closing other FDs?
  (for-each (lambda (i)
              (gc)                      ;to trigger bugs
              (let ((current-port (fd->current-port i)))
                (match (current-port)
                  ((? file-port? port)
                   (dup->fdes port i))
                  (#f #t))))
            (iota *fd-count*)))
--8<---------------cut here---------------end--------------->8---

But this illustrates another problem: in the child process, right before ‘execve’, the finalization thread may be restarted, in which case it creates a new pipe.

In the example below, the finalization pipe is on FDs 9 and 7, but ‘install-current-ports!’ blindly dups to FD 7, thereby closing one end of the finalization pipe that was just created:

--8<---------------cut here---------------start------------->8---
23647 pipe2([7, 9], O_CLOEXEC) = 0
23647 rt_sigprocmask(SIG_BLOCK, ~[], [], 8) = 0
23647 clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7f84204b3990, parent_tid=0x7f84204b3990, exit_signal=0, stack=0x7f841fb24000, stack_size=0x98ef80, tls=0x7f84204b36c0} => {parent_tid=[23648]}, 88) = 23648
[…]
23647 write(9, "\0", 1) = 1
23648 <... read resumed>"\0", 1) = 1
23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23648 read(7, <unfinished ...>
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35845839}) = 0
23647 dup2(12, 7) = 7
23647 fcntl(7, F_GETFL) = 0x8002 (flags O_RDWR|O_LARGEFILE)
23647 lseek(7, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
23647 dup2(12, 7) = 7
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=35899320}) = 0
23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0
23647 write(9, "\0", 1) = 1
23648 <... read resumed>"\0", 1) = 1
23648 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23648 read(7, <unfinished ...>
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39539830}) = 0
23647 clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=39555997}) = 0
23647 rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0
23647 madvise(0x7f842207c000, 12288, MADV_DONTNEED) = 0
23647 madvise(0x7f8421d74000, 8192, MADV_DONTNEED) = 0
23647 write(9, "\0", 1) = -1 EPIPE (Broken pipe)
23647 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=23647, si_uid=1000} ---
--8<---------------cut here---------------end--------------->8---

After that dup2(12, 7) call, writing to the finalization pipe yields SIGPIPE, which terminates the process (here it corresponds to a subshell running ‘expr’).

Since we’re going to exec right after fork, we could turn off finalization around ‘primitive-fork’ such that the child doesn’t attempt to restart the finalization thread before exec. The Shepherd has code like this:

--8<---------------cut here---------------start------------->8---
(define %set-automatic-finalization-enabled?!
  ;; When using a statically-linked Guile, for instance in the initrd, we
  ;; cannot resolve this symbol, but most of the time we don't need it
  ;; anyway. Thus, delay it.
  (let ((proc (delay
                (pointer->procedure int
                                    (dynamic-func
                                     "scm_set_automatic_finalization_enabled"
                                     (dynamic-link))
                                    (list int)))))
    (lambda (enabled?)
      "Switch on or off automatic finalization in a separate thread.
Turning finalization off shuts down the finalization thread as a side effect."
      (->bool ((force proc) (if enabled? 1 0))))))

(define-syntax-rule (without-automatic-finalization exp ...)
  "Turn off automatic finalization within the dynamic extent of EXP."
  (let ((enabled? #t))
    (dynamic-wind
      (lambda ()
        (set! enabled? (%set-automatic-finalization-enabled?! #f)))
      (lambda ()
        exp ...)
      (lambda ()
        (%set-automatic-finalization-enabled?! enabled?)))))
--8<---------------cut here---------------end--------------->8---

Problem is, we cannot use the FFI on the statically-linked Guile. We could implement fork+exec in C, but we don’t have a C compiler at this early bootstrap stage.

Thoughts?

Ludo’.
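
One detail behind the rewritten ‘install-current-ports!’ is that ‘dup->fdes’ returns a bare integer rather than a port, so it leaves no temporary port object for the collector to finalize. The following is a minimal sketch of that property alone; ‘fd-open?’ and ‘install-fd!’ are hypothetical helpers, not Gash procedures. It does not address stale ports that already own the target descriptor, nor the clash with the finalization pipe described in this message.

```scheme
;; Illustrative sketch only; the helper names are made up for this example.
(define (fd-open? fd)
  "Return #t if FD is an open file descriptor, #f otherwise."
  (catch 'system-error
    (lambda ()
      (fcntl fd F_GETFL)
      #t)
    (lambda args #f)))

(define (install-fd! port target)
  "Make descriptor TARGET refer to PORT's file without creating a port."
  (dup->fdes port target))

(define (demo-install)
  "Install /dev/null on a scratch descriptor, run the GC, and report
whether the descriptor is still open."
  (let* ((source (open-input-file "/dev/null"))
         (target (dup->fdes source)))   ;a descriptor number we control
    (install-fd! source target)
    (gc)                                ;no port wraps TARGET
    (let ((still-open? (fd-open? target)))
      (close-fdes target)
      (close-port source)
      still-open?)))

(display (demo-install))
(newline)
```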
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Sun, 16 Mar 2025 21:33:03 GMT)
Message #27 received at 75658@debbugs.gnu.org:
[Message part 1 (text/plain, inline)]
Ludovic Courtès <ludo@gnu.org> writes:

> But this illustrates another problem: in the child process, right before
> ‘execve’, the finalization thread may be restarted, in which case it
> creates a new pipe.
>
> In the example below, the finalization pipe is on FDs 9 and 7, but
> ‘install-current-ports!’ blindly dups to FD 7, thereby closing one end
> of the finalization pipe that was just created:

The hack below addresses that (mostly) by reserving low-numbered file descriptors before the signal and finalization threads create their pipes. (In practice, we can only reserve FDs above 5; FDs 3 and 4 are the “sleep pipe” I believe.)

It seems to be good enough though.

Thoughts?

Ludo’.
[Message part 2 (text/x-patch, inline)]
diff --git a/gash/shell.scm b/gash/shell.scm
index 3611067..68e74e7 100644
--- a/gash/shell.scm
+++ b/gash/shell.scm
@@ -68,14 +68,13 @@ not a @code{file-port?}, its corresponding file descriptor will refer
 to @file{/dev/null}."
   ;; XXX: Input/output ports?  Closing other FDs?
   (for-each (lambda (i)
-              (match ((fd->current-port i))
-                ((? file-port? port)
-                 (dup port i))
-                ((? input-port? port)
-                 (dup (open-file "/dev/null" "r") i))
-                ((? output-port? port)
-                 (dup (open-file "/dev/null" "w") i))
-                (_ #t)))
+              (gc)
+              (let ((current-port (fd->current-port i)))
+                (match (current-port)
+                  ((? file-port? port)
+                   (let ((new (dup port i)))
+                     (redirect-port port new)))
+                  (#f #t))))
             (iota *fd-count*)))
 
 (define (exec-utility bindings path name args)
@@ -89,8 +88,14 @@ to @file{/dev/null}."
     ;; the buffer) produces its output.
     (flush-all-ports)
     (match (primitive-fork)
-      (0 (install-current-ports!)
-         (apply execle path utility-env name args))
+      (0
+       (dynamic-wind
+         (lambda ()
+           (install-current-ports!))
+         (lambda ()
+           (apply execle path utility-env name args))
+         (lambda ()
+           (primitive-exit 127))))
       (pid (match-let (((pid . status) (waitpid pid)))
              (set-status! (status:exit-val status)))))))
 
@@ -182,7 +187,10 @@ if it is our responsibility to close the port."
 (define* (make-processed-redir fd target #:optional (open-flags 0))
   (let ((port (match target
                 ((? port?) target)
-                ((? string?) (open target open-flags))
+                ((? string?)
+                 (let ((port (open target open-flags)))
+                   (set-port-revealed! port 10)
+                   port))
                 ;; TODO: Verify open-flags.
                 ((? integer?) ((fd->current-port target)))
                 (#f #f))))
@@ -213,6 +221,7 @@ if it is our responsibility to close the port."
      (make-processed-redir fd #f))
     (('<< (? integer? fd) text)
      (let ((port (tmpfile)))
+       (set-port-revealed! port 10)
        (display text port)
        (seek port 0 SEEK_SET)
        (make-processed-redir fd port)))))
@@ -264,6 +273,7 @@ process."
       (lambda () #t)
       (lambda ()
         (restore-signals)
+        (gc)
         (set-atexit! #f)
         ;; We need to preserve the status given to 'exit', so we
         ;; catch the 'quit' key here.
diff --git a/scripts/gash.in b/scripts/gash.in
index f851c1d..57506ba 100644
--- a/scripts/gash.in
+++ b/scripts/gash.in
@@ -21,5 +21,13 @@
 ;;; along with Gash.  If not, see <http://www.gnu.org/licenses/>.
 
 (define (main args)
+  ;; Reserve file descriptors 5 to 12 (roughly) before the signal and
+  ;; finalization threads grab them so that a script willing to use
+  ;; them can do so without breaking Guile.
+  (let loop ((i 3))
+    (when (<= i 10)
+      (open-fdes "/dev/null" (logior O_RDONLY O_CLOEXEC))
+      (loop (+ i 1))))
+
   (setenv "SHELL" ((compose canonicalize-path car command-line)))
-  ((@ (gash gash) main) (command-line)))
+  ((module-ref (resolve-interface '(gash gash)) 'main) (command-line)))
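
For illustration only (this is not part of the patch): the hazard described above is that duplicating onto a file descriptor number that is already in use silently closes whatever that number referred to. That is how ‘install-current-ports!’ could close one end of the finalization pipe. Below is a minimal POSIX shell sketch of that behavior, using plain files instead of a pipe. The temporary directory, the file names, and the choice of FD 7 are arbitrary assumptions.

```shell
#!/bin/sh
# Hypothetical illustration: file names and FD 7 are arbitrary choices.
tmpdir=$(mktemp -d)

exec 7>"$tmpdir/first"      # FD 7 now refers to "first"
echo "to first" >&7

# Redirecting FD 7 again is a dup2-style operation: whatever FD 7
# referred to before is closed implicitly, and no error is reported.
exec 7>"$tmpdir/second"
echo "to second" >&7

exec 7>&-                   # close FD 7

first=$(cat "$tmpdir/first")
second=$(cat "$tmpdir/second")
echo "first:  $first"
echo "second: $second"
rm -rf "$tmpdir"
```

The second ‘echo’ lands in the second file because FD 7 was silently repointed. Reserving the low-numbered descriptors at startup, as the ‘scripts/gash.in’ hunk does, is meant to keep the pipes that Guile's threads create later away from the numbers a script is likely to redirect.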
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Wed, 19 Mar 2025 21:21:01 GMT)
Message #30 received at 75658@debbugs.gnu.org:
Hello,

This fixes issues reported at <https://issues.guix.gnu.org/75658>
and related ones I noticed while looking at the code.

Feedback welcome!

Thanks,
Ludo'.

Ludovic Courtès (4):
  shell: Exit child process when ‘execle’ fails.
  shell: Remove dead code in ‘install-current-ports!’.
  shell: ‘install-current-ports!’ opens file descriptors, not ports.
  Open low-numbered file descriptors for use by the shell.

 gash/shell.scm    | 29 +++++++++++++++++++++--------
 scripts/gash.in   | 14 +++++++++++++-
 tests/exiting.org | 27 +++++++++++++++++++++++++++
 3 files changed, 61 insertions(+), 9 deletions(-)

base-commit: ec9f0313190e380687da387b4207469a0a0a8cd8
-- 
2.48.1
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Wed, 19 Mar 2025 21:28:02 GMT)
Message #33 received at 75658@debbugs.gnu.org:
[Message part 1 (text/plain, inline)]
Hello Timothy,

Ludovic Courtès <ludo@gnu.org> skribis:

> The hack below addresses that (mostly) by reserving low-number file
> descriptors before the signal and finalization threads create their
> pipe.  (In practice, we can only reserve FDs above 5; FDs 3 and 4 are
> the “sleep pipe” I believe.)

I’ve just sent cleaned-up patches to gash-devel including this
fix/workaround.

It passes my tests, meaning that I cannot reproduce the original bug in
a timely fashion when running:

  ./pre-inst-env gash -c 'exec 2>/dev/null; while true; do echo $(sh --version) > /dev/null; done'

or when running part of the GCC 4.9.4 ‘configure’ script in a loop
(attached is the helper script I used for that; not shown here is a
manual modification of said script so that it exits after “checking for
a sed that does not truncate output”, which was sufficient to reproduce
the bug, possibly after many iterations).

It would be great to cut a Gash release soonish as this bug has been
blocking the ‘core-packages-team’ branch for a while already.

Thanks,
Ludo’.
[gash-redirect-EBADF-reproducer.sh (text/plain, inline)]
#!/bin/sh

set -x

export COLUMNS=200

# Uncomment to trace system calls; the 'grep' below reads log.strace.
#STRACE="strace -s 100 -f -o log.strace"
PATCH=--with-patch=gash=$PWD/gash-redirect-EBADF.patch
export SHELL=$(guix build gash $PATCH)/bin/gash
export CONFIG_SHELL=$SHELL

OPTIONS="--prefix=/wherever --disable-bootstrap --disable-decimal-float \
 --disable-libatomic --disable-libcilkrts --disable-libgomp --disable-libitm \
 --disable-libmudflap --disable-libquadmath --disable-libsanitizer \
 --disable-libssp --disable-libvtv --disable-lto --disable-lto-plugin \
 --disable-multilib --disable-plugin --disable-threads \
 --enable-languages=c,c++ --enable-static --enable-shared \
 --enable-threads=single --disable-libstdcxx-pch --disable-build-with-cxx"

cd /data/src/gcc-4.9.4 || exit 1

# Re-run 'configure' until it fails or the trace shows a failed 'fcntl'.
while $STRACE $SHELL -e ./configure $OPTIONS $OPTIONS $OPTIONS
do
    grep 'fcntl.*EBADF' log.strace && break
done
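
The script's stop condition relies on finding a failed ‘fcntl’ call in ‘log.strace’, which only exists when the ‘STRACE’ line is uncommented. Below is a small sketch of that check as a reusable function. ‘has_ebadf’ is a hypothetical name, and the sample log lines are synthetic, made up to resemble strace output rather than taken from a real run.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given strace log records an
# 'fcntl' call that failed with EBADF.
has_ebadf() {
    grep -q 'fcntl.*EBADF' "$1"
}

# Synthetic sample logs, for demonstration only.
log_dir=$(mktemp -d)
printf '%s\n' \
    '1234 dup2(10, 7) = 7' \
    '1234 fcntl(9, F_GETFD) = -1 EBADF (Bad file descriptor)' \
    > "$log_dir/bad.strace"
printf '%s\n' \
    '1234 dup2(10, 7) = 7' \
    '1234 fcntl(9, F_GETFD) = 0' \
    > "$log_dir/good.strace"

if has_ebadf "$log_dir/bad.strace"; then bad_result=found; else bad_result=clean; fi
if has_ebadf "$log_dir/good.strace"; then good_result=found; else good_result=clean; fi

echo "bad.strace:  $bad_result"
echo "good.strace: $good_result"
rm -rf "$log_dir"
```

Quoting the pattern keeps the shell from treating ‘fcntl.*EBADF’ as a glob, and ‘grep -q’ avoids printing the matching lines when only the exit status matters.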
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Thu, 20 Mar 2025 14:19:02 GMT)
Message #36 received at 75658@debbugs.gnu.org:
Ludovic Courtès <ludo@gnu.org> skribis:

>   shell: Exit child process when ‘execle’ fails.
>   shell: Remove dead code in ‘install-current-ports!’.
>   shell: ‘install-current-ports!’ opens file descriptors, not ports.
>   Open low-numbered file descriptors for use by the shell.

For the record, I also built this series with Guile 2.0.9, by modifying
‘guix.scm’ to refer to it instead of ‘guile-3.0’ and turning off tests
(since they require (srfi srfi-64), which 2.0.9 doesn’t have).

It appears to work fine and passes this test:

  timeout 10m \
    /gnu/store/3ylfablfwsdaapgk2y3x8yjchmapasxs-gash-0.3.0.6-f988cb-dirty/bin/gash -c 'exec 7>/dev/null; while true; do echo $(sh --version) > /dev/null; done'

Ludo’.
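
For a check that terminates on its own, the endless loop above can be bounded. The following is a sketch under assumptions, not the test used in this thread. It exercises whichever shell executes it, so it would need to be run with the Gash binary under test. ‘ITERATIONS’ is a hypothetical knob, and ‘expr’ stands in for ‘sh --version’ as a utility with predictable output.

```shell
#!/bin/sh
# Bounded variant of the stress loop (hypothetical; ITERATIONS is an
# assumed knob, not something defined by Gash).
iterations=${ITERATIONS:-200}
failures=0
i=0

exec 7>/dev/null            # occupy FD 7, as in the test above

while [ "$i" -lt "$iterations" ]; do
    # Each command substitution runs 'expr', normally an external
    # utility, and captures its output through a pipe.
    out=$(expr 20 + 22)
    [ "$out" = 42 ] || failures=$((failures + 1))
    i=$((i + 1))
done

exec 7>&-                   # release FD 7
echo "iterations=$i failures=$failures"
```

Since the original failure was non-deterministic, a clean run of a bounded loop only lowers the odds that the bug is still present; it cannot rule it out.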
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Thu, 27 Mar 2025 07:17:02 GMT)
Message #39 received at 75658@debbugs.gnu.org:
Ludovic Courtès writes:

Hi!

> This fixes issues reported at <https://issues.guix.gnu.org/75658>
> and related ones I noticed while looking at the code.
>
> Feedback welcome!

That's awesome, what a terrible puzzle that was!

I'm hoping Timothy finds the time to review/merge/release Gash.  We
could carry these patches in Guix, but yeah.

Greetings,
Janneke

-- 
Janneke Nieuwenhuizen <janneke@gnu.org>  | GNU LilyPond https://LilyPond.org
Freelance IT https://www.JoyOfSource.com | Avatar® https://AvatarAcademy.com
Information forwarded to bug-guix@gnu.org: bug#75658; Package guix. (Wed, 02 Apr 2025 14:30:03 GMT)
Message #42 received at 75658@debbugs.gnu.org:
Hi there!

Janneke Nieuwenhuizen <janneke@gnu.org> skribis:

> That's awesome, what a terrible puzzle that was!
>
> I'm hoping Timothy finds the time to review/merge/release Gash.  We
> could carry these patches in Guix, but yeah.

Yup, it would be great if one of you could do that.  :-)

Especially since ‘core-packages-team’ has been queued for a while now
and the latest attempts to evaluate the branch have all failed due to
this bug, as in <https://ci.guix.gnu.org/eval/2050457/log/raw> (I guess
we were lucky on the previous ‘core-updates’ cycle, or just retried
until it would eventually work!).

Cheers,
Ludo’.