Report forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Wed, 08 Jun 2022 15:32:02 GMT)
Acknowledgement sent
to Mathieu Othacehe <othacehe@gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org.
(Wed, 08 Jun 2022 15:32:02 GMT)
Hello,
The aarch64 workers were all idle whereas 70k builds were
available. Once restarted, they started building again.
The problem might be that when the server is unavailable for a while the
worker connections expire and cannot be resumed once the server is
available again.
Thanks,
Mathieu
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Wed, 08 Jun 2022 19:08:01 GMT)
On Wed, Jun 8, 2022 at 11:32 AM Mathieu Othacehe <othacehe@gnu.org> wrote:
>
>
> Hello,
>
> The aarch64 workers were all idle whereas 70k builds were
> available. Once restarted, they started building again.
>
> The problem might be that when the server is unavailable for a while the
> worker connections expire and cannot be resumed once the server is
> available again.
>
> Thanks,
>
> Mathieu
The recent aarch64 builds all appear to be failing with the following message:
===== <cut> =====
substitute:
substitute: updating substitutes from 'https://ci.guix.gnu.org'...
0.0%guix substitute: error: TLS error in procedure 'handshake': Error
in the pull function.
===== </cut> =====
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Sat, 11 Jun 2022 10:45:02 GMT)
Greg Hogan <code@greghogan.com> writes:
> On Wed, Jun 8, 2022 at 11:32 AM Mathieu Othacehe <othacehe@gnu.org> wrote:
>> The aarch64 workers were all idle whereas 70k builds were
>> available. Once restarted, they started building again.
From following the builds on http://ci.guix.gnu.org/workers , many
(all?) builds are failing on the following workers:
* grunewald
* kreuzberg
* pankow
The builds are failing with the same error:
"substitute: updating substitutes from 'https://ci.guix.gnu.org'...
0.0%guix substitute: error: TLS error in procedure 'handshake': Error in
the pull function."
Here are some examples:
* http://ci.guix.gnu.org/build/998403/details
* http://ci.guix.gnu.org/build/978678/details
* http://ci.guix.gnu.org/build/978243/details
On worker overdrive1, in the raw log of
http://ci.guix.gnu.org/build/875908/details we can see this
rust-async-mutex build managing to pull substitutes, but it
seems to be compiling rust-1.57 itself.
Severity set to 'important' from 'normal'
Request was from Ludovic Courtès <ludo@gnu.org>
to control@debbugs.gnu.org.
(Sat, 11 Jun 2022 20:34:02 GMT)
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Sun, 12 Jun 2022 13:34:02 GMT)
Hi,
(+Cc: guix-sysadmin)
Tom Fitzhenry <tom@tom-fitzhenry.me.uk> writes:
> From following the builds on http://ci.guix.gnu.org/workers , many
> (all?) builds are failing on the following workers:
>
> * grunewald
> * kreuzberg
> * pankow
>
> The builds are failing with the same error:
>
> "substitute: updating substitutes from 'https://ci.guix.gnu.org'...
> 0.0%guix substitute: error: TLS error in procedure 'handshake': Error in
> the pull function."
On these machines, https://ci.guix.gnu.org (among others) is unavailable
for some reason (firewall I guess):
--8<---------------cut here---------------start------------->8---
ludo@grunewald ~$ wget --debug -O/dev/null https://ci.guix.gnu.org
Setting --output-document (outputdocument) to /dev/null
DEBUG output created by Wget 1.21.1 on linux-gnu.
Reading HSTS entries from /home/ludo/.wget-hsts
URI encoding = ‘UTF-8’
--2022-06-11 22:38:59-- https://ci.guix.gnu.org/
Certificates loaded: 444
Resolving ci.guix.gnu.org (ci.guix.gnu.org)... 141.80.181.40
Caching ci.guix.gnu.org => 141.80.181.40
Connecting to ci.guix.gnu.org (ci.guix.gnu.org)|141.80.181.40|:443... connected.
Created socket 4.
Releasing 0x000000001fd26b50 (new refcount 1).
[Sits there forever…]
--8<---------------cut here---------------end--------------->8---
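To tell a hang like this apart from an outright failure without waiting forever, the request can be wrapped in a time limit. A minimal sketch, assuming timeout(1) from GNU coreutils is available on the machine; the `probe` helper name is hypothetical and not part of any Guix tooling:

```shell
# probe SECONDS COMMAND...: run COMMAND, but give up after SECONDS.
# Relies on timeout(1) from GNU coreutils, which exits with status 124
# when the time limit is hit.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s"
    else
        echo "failed with status $status"
    fi
    return "$status"
}

# Example (needs network access, so left commented out):
# probe 15 wget -O/dev/null https://ci.guix.gnu.org
```

A "timed out" result would match the stalled TLS handshake seen above, whereas a plain failure would point to something else, such as DNS or a refused connection.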
These machines are configured using ‘honeycomb-system’ from (sysadmin
honeycomb) in maintenance.git.
guix-daemon is configured to use the default substitute URLs,
https://ci.guix.gnu.org and https://bordeaux.guix.gnu.org, which we know
are unreachable.
I’ve theoretically addressed this here:
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=99bd9dc9001d6bea7480a7ce0e0e10ff78adb787
https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=b0661cc7d6dd74b0aeac3b052a80a8a2fef2af9c
I tried to reconfigure those boxes with ‘guix deploy’, but this is
currently on hold because ci.guix has run out of inodes…
To be continued!
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Sun, 12 Jun 2022 16:16:01 GMT)
Ludovic Courtès <ludo@gnu.org> writes:
> Hi,
>
> (+Cc: guix-sysadmin)
>
> Tom Fitzhenry <tom@tom-fitzhenry.me.uk> writes:
>
>> From following the builds on http://ci.guix.gnu.org/workers , many
>> (all?) builds are failing on the following workers:
>>
>> * grunewald
>> * kreuzberg
>> * pankow
>>
>> The builds are failing with the same error:
>>
>> "substitute: updating substitutes from 'https://ci.guix.gnu.org'...
>> 0.0%guix substitute: error: TLS error in procedure 'handshake': Error in
>> the pull function."
>
> On these machines, https://ci.guix.gnu.org (among others) is unavailable
> for some reason (firewall I guess):
They should be using the local IP instead of routing through the
internet, so /etc/hosts should contain an entry for
141.80.167.131 ci.guix.gnu.org
(We have the same entry on the other build nodes hosted at the MDC.)
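A quick way to confirm that a node carries such an entry is sketched below. The `hosts_maps` helper is hypothetical (not part of Guix), and it only inspects the file; it does not check how the resolver is configured.

```shell
# hosts_maps FILE NAME IP: succeed if FILE has a non-comment line
# that maps NAME to IP (hosts(5) format: IP first, then names).
hosts_maps() {
    awk -v name="$2" -v ip="$3" '
        { sub(/#.*/, "") }            # drop trailing comments
        $1 == ip {
            for (i = 2; i <= NF; i++)
                if ($i == name) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example, on a build node:
# hosts_maps /etc/hosts ci.guix.gnu.org 141.80.167.131 && echo "entry present"
```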
“guix deploy” did not work on these nodes due to a serious problem: they
were given *some* x86_64 binaries to execute, so deployed systems were
unbootable. Since we don’t have a serial interface through which you
could debug this remotely, please make sure not to deploy a broken
system. I’d like to avoid trips to the data centre.
--
Ricardo
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Sun, 12 Jun 2022 20:23:03 GMT)
Ricardo Wurmus <rekado@elephly.net> writes:
> They should be using the local IP instead of routing through the
> internet, so /etc/hosts should contain an entry for
>
> 141.80.167.131 ci.guix.gnu.org
Good idea.
> “guix deploy” did not work on these nodes due to a serious problem: they
> were given *some* x86_64 binaries to execute, so deployed systems were
> unbootable. Since we don’t have a serial interface through which you
> could debug this remotely, please make sure not to deploy a broken
> system. I’d like to avoid trips to the data centre.
Ooooh right, thanks for the reminder!
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Sun, 19 Jun 2022 02:08:01 GMT)
Mathieu Othacehe <othacehe@gnu.org> writes:
Substitutes for aarch64 are a lot healthier now. Thanks Ludovic!
* kreuzberg is now successfully building and has been for a while.
* ci.guix.gnu.org has 41% of substitutes (a low percentage, but likely a
high percentage of toolchains). 0 jobs are queued, presumably because Cuirass
believes it's up to date. This should increase over time, as packages
are updated.
* bordeaux has 83.8% of substitutes.
A few issues remain for aarch64:
* grunewald and kreuzberg are not on <https://ci.guix.gnu.org/workers>.
Perhaps they were taken down while the substitute ratio was low to
avoid each worker independently recompiling expensive toolchains?
* rust@1.39.0 (and thus all of Rust) is missing from ci and bordeaux. I
had expected this would have been working. I'll take a look and raise
a separate issue.
--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix weather -s aarch64-linux -c2000
computing 15514 package derivations for aarch64-linux...
looking for 16265 store items on https://ci.guix.gnu.org...
https://ci.guix.gnu.org
41.0% substitutes available (6668 out of 16265)
at least 34188.1 MiB of nars (compressed)
45362.5 MiB on disk (uncompressed)
0.015 seconds per request (144.9 seconds in total)
66.2 requests per second
0.0% (0 out of 9597) of the missing items are queued
at least 1000 queued builds
aarch64-linux: 110 (11.0%)
powerpc64le-linux: 890 (89.0%)
build rate: 36.81 builds per hour
aarch64-linux: 17.23 builds per hour
x86_64-linux: 14.25 builds per hour
powerpc64le-linux: 1.01 builds per hour
i686-linux: 4.83 builds per hour
1871 packages are missing from 'https://ci.guix.gnu.org' for 'aarch64-linux', among which:
3479 rust@1.39.0 /gnu/store/xxlgndidxvhdd391k35vcmviixq5d9b0-rust-1.39.0-cargo /gnu/store/cfy1p8q4bwwy1i01cjfssfry21kpljz3-rust-1.39.0
2111 cairomm@1.14.2 /gnu/store/bxknxn3nbmmvavf537k0pggrynhrgsaf-cairomm-1.14.2-doc /gnu/store/3sn66mgr29v73zpp93c2v09a0rj87l3w-cairomm-1.14.2
2101 texlive-latex-pgf@59745 /gnu/store/l6jr7v8ygn3ybj4gxcwskf8ifsjcj6x1-texlive-latex-pgf-59745
looking for 16265 store items on https://bordeaux.guix.gnu.org...
https://bordeaux.guix.gnu.org
83.8% substitutes available (13624 out of 16265)
35138.6 MiB of nars (compressed)
109501.6 MiB on disk (uncompressed)
0.060 seconds per request (699.4 seconds in total)
16.7 requests per second
(continuous integration information unavailable)
579 packages are missing from 'https://bordeaux.guix.gnu.org' for 'aarch64-linux', among which:
3479 rust@1.39.0 /gnu/store/xxlgndidxvhdd391k35vcmviixq5d9b0-rust-1.39.0-cargo /gnu/store/cfy1p8q4bwwy1i01cjfssfry21kpljz3-rust-1.39.0
--8<---------------cut here---------------end--------------->8---
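To track these ratios over time, the availability figures can be pulled out of the `guix weather` output with a small filter. A sketch, assuming the output layout shown above (a bare URL line followed by a "% substitutes available" line); `weather_ratio` is a hypothetical helper:

```shell
# weather_ratio: read `guix weather` output on stdin and print one
# "URL PERCENT" line per substitute server.
weather_ratio() {
    awk '
        # A line holding only a URL names the server being reported on.
        /^[ \t]*https?:\/\/[^ ]+$/ { url = $1 }
        # The availability line starts with the percentage, e.g. "41.0%".
        /% substitutes available/ { sub(/%$/, "", $1); print url, $1 }
    '
}

# Example:
# guix weather -s aarch64-linux | weather_ratio
```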
> Hello,
>
> The aarch64 workers were all idle whereas 70k builds were
> available. Once restarted, they started building again.
>
> The problem might be that when the server is unavailable for a while the
> worker connections expire and cannot be resumed once the server is
> available again.
>
> Thanks,
>
> Mathieu
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Mon, 20 Jun 2022 02:40:02 GMT)
Hi Mathieu!
[...]
> A few issues remain for aarch64:
>
> * grunewald and kreuzberg are not on <https://ci.guix.gnu.org/workers>.
> Perhaps they were taken down while the substitute ratio was low to
> avoid each worker independently recompiling expensive toolchains?
> * rust@1.39.0 (and thus all of Rust) is missing from ci and bordeaux. I
> had expected this would have been working. I'll take a look and raise
> a separate issue.
That's a known issue with mrustc; it only succeeds with x86_64; the
other architectures have problems. That's a bug the mrustc author would
like to fix, so perhaps in time it will improve (especially if
interested parties can lend a hand).
There was also an attempt to cross-compile a rust/cargo bootstrap seed
for other architectures (branch: wip-cross-built-rust) but due to
complications with building rust as a static archive (it relies on
dynamic linking for its macro expand crates), the effort stalled.
Thanks,
Maxim
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Mon, 20 Jun 2022 02:46:02 GMT)
On Mon, 20 Jun 2022, at 12:39 PM, Maxim Cournoyer wrote:
> That's a known issue with mrustc; it only succeeds with x86_64; the
> other architectures have problems. That's a bug the mrustc author would
> like to fix, so perhaps in time it will improve (especially if
> interested parties can lend a hand).
mrustc was fixed on aarch64 in https://issues.guix.gnu.org/54580 on staging, which was recently merged to master.
I had tested mrustc and rust-1.39 to compile on aarch64 on staging, but now I observe rust-1.39 failing.
I'll take a closer look, maybe I'm missing something.
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Mon, 20 Jun 2022 13:04:02 GMT)
Maxim Cournoyer wrote on Sun 19-06-2022 at 22:39 [-0400]:
> There was also an attempt to cross-compile a rust/cargo bootstrap seed
> for other architectures (branch: wip-cross-built-rust) but due to
> complications with building rust as a static archive (it relies on
> dynamic linking for its macro expand crates), the effort stalled.
FWIW, has it been considered to cross-compile rust non-statically
(not as a seed, just as an input cross-compiled from another system)?
Doesn't help for people that cannot offload to x86_64 and don't have
substitutes from ci.guix.gnu.org or such enabled, but could still be an
improvement.
Greetings,
Maxime.
Cc: Mathieu Othacehe <othacehe@gnu.org>, 55848@debbugs.gnu.org,
Tom Fitzhenry <tom@tom-fitzhenry.me.uk>
Subject: Re: bug#55848: [cuirass] workers stalled
Date: Tue, 21 Jun 2022 01:32:33 -0400
Hi Maxime,
Maxime Devos <maximedevos@telenet.be> writes:
> Maxim Cournoyer wrote on Sun 19-06-2022 at 22:39 [-0400]:
>> There was also an attempt to cross-compile a rust/cargo bootstrap seed
>> for other architectures (branch: wip-cross-built-rust) but due to
>> complications with building rust as a static archive (it relies on
>> dynamic linking for its macro expand crates), the effort stalled.
>
> FWIW, has it been considered to cross-compile rust non-statically
> (not as a seed, just as an input cross-compiled from another system)?
> Doesn't help for people that cannot offload to x86_64 and don't have
> substitutes from ci.guix.gnu.org or such enabled, but could still be an
> improvement.
This already works, on the branch. One of the patches carried there
that made it possible has been merged upstream too. The issue is that
to offer a useful cross-compiled rust on non-x86_64 systems, you need to
move it from system domains; the clean way to do this is to archive a
static binary that depends on nothing else somewhere, and extract it in
a package for the target architecture.
Currently it's not cleanly self-contained because it still references
GCC libraries.
Maxim
Changed bug title to 'AArch64 Honeycomb builders are inactive' from '[cuirass] workers stalled'
Request was from Ludovic Courtès <ludo@gnu.org>
to control@debbugs.gnu.org.
(Wed, 10 Aug 2022 13:29:01 GMT)
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Wed, 10 Aug 2022 13:47:01 GMT)
Subject: AArch64 honeycomb machines aren’t building stuff
Date: Wed, 10 Aug 2022 15:46:34 +0200
Hi!
Ludovic Courtès <ludo@gnu.org> writes:
> guix-daemon is configured to use the default substitute URLs,
> https://ci.guix.gnu.org and https://bordeaux.guix.gnu.org, which we know
> are unreachable.
>
> I’ve theoretically addressed this here:
>
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=99bd9dc9001d6bea7480a7ce0e0e10ff78adb787
> https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=b0661cc7d6dd74b0aeac3b052a80a8a2fef2af9c
>
> I tried to reconfigure those boxes with ‘guix deploy’, but this is
> currently on hold because ci.guix has run out of inodes…
Time passed and I had kinda forgotten about it, but the problem remains.
I’m currently reconfiguring pankow and grunewald⁰ from berlin with ‘guix
deploy’ to include the fix above¹, but it’s gonna take a while as it’s
currently building GCC…
To do that, I had to ‘herd stop guix-daemon’ (thereby stopping
‘cuirass-remote worker’ as well) and run guix-daemon by hand with
‘--substitute-urls=http://10.0.0.1’.
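Spelled out as commands, the manual procedure might look like the following sketch. The `run_daemon_with_substitutes` wrapper and its `DRY_RUN` switch are hypothetical, and `--build-users-group=guixbuild` is an assumption (the conventional group name from the Guix manual), not something stated in this thread. Both commands need root, and guix-daemon stays in the foreground.

```shell
# run_daemon_with_substitutes URL: stop the Shepherd-managed daemon and
# start guix-daemon by hand with URL as its only substitute server.
# With DRY_RUN=1 the commands are printed instead of executed.
run_daemon_with_substitutes() {
    url=$1
    for cmd in \
        "herd stop guix-daemon" \
        "guix-daemon --build-users-group=guixbuild --substitute-urls=$url"
    do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$cmd"
        else
            $cmd || return   # word splitting of $cmd is intended here
        fi
    done
}

# Example: print what would be run, without touching the daemon.
# DRY_RUN=1 run_daemon_with_substitutes http://10.0.0.1
```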
While doing that with Guix 9e4632081ff31bf0d1715edd66f514614c6dc4bb, I
found another bug² (yup, it does look like an endless quest, all the more
so since I’ll soon be going AFK and it’s not clear that things will be
settled by then!).
Cheers,
Ludo’, aka. el Quijote.
⁰ More on kreuzberg in a separate message…
¹ For the record, previously ‘guix deploy’ had a bug whereby running it
from an x86_64 box like berlin would lead it to send x86_64 binaries
(instead of AArch64 binaries) to the machines. This was fixed in
7046e777212233b89df68379c270b448c45195ce:
<https://issues.guix.gnu.org/55951>.
² https://issues.guix.gnu.org/57117
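Regarding footnote ¹: one way to double-check that a binary about to be sent to these machines really targets AArch64 is to read the machine field of its ELF header. A minimal sketch, unrelated to how the fix in Guix works; the helper names are hypothetical, and it assumes a little-endian ELF file (e_machine is 183 for AArch64 and 62 for x86_64):

```shell
# elf_machine FILE: print the ELF e_machine value of FILE in decimal.
# e_machine is the 16-bit field at byte offset 18 of the ELF header;
# the two bytes are combined as little-endian.
elf_machine() {
    set -- $(od -An -tu1 -j18 -N2 "$1")
    echo $(( $1 + 256 * $2 ))
}

# is_aarch64_elf FILE: succeed if FILE's e_machine is EM_AARCH64 (183).
is_aarch64_elf() {
    [ "$(elf_machine "$1")" -eq 183 ]
}

# Example, with a placeholder path:
# is_aarch64_elf /path/to/binary || echo "not an AArch64 binary"
```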
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Wed, 10 Aug 2022 17:58:02 GMT)
Hi,
Ricardo Wurmus <rekado@elephly.net> writes:
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> guix-daemon is configured to use the default substitute URLs,
>>> https://ci.guix.gnu.org and https://bordeaux.guix.gnu.org, which we know
>>> are unreachable.
>>>
>>> I’ve theoretically addressed this here:
>>>
>>> https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=99bd9dc9001d6bea7480a7ce0e0e10ff78adb787
>>> https://git.savannah.gnu.org/cgit/guix/maintenance.git/commit/?id=b0661cc7d6dd74b0aeac3b052a80a8a2fef2af9c
>>>
>>> I tried to reconfigure those boxes with ‘guix deploy’, but this is
>>> currently on hold because ci.guix has run out of inodes…
>>
>> Time passed and I had kinda forgotten about it, but the problem remains.
>
> I wrote this earlier:
>
>> They should be using the local IP instead of routing through the
>> internet, so /etc/hosts should contain an entry for
>>
>> 141.80.167.131 ci.guix.gnu.org
>
> So running the daemon with “--substitute-urls=http://10.0.0.1” should
> not be necessary.
Oh my bad, sorry for overlooking your message.
Explicitly going through http://10.0.0.1 is still desirable I think
because we avoid HTTPS altogether.
‘guix deploy’ is still running on berlin.guix and building things;
unfortunately I’m going AFK for a bit. I’ll pick it up later unless
someone takes care of it by then.
Thanks,
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#55848; Package guix.
(Mon, 29 Aug 2022 13:31:01 GMT)
Subject: Re: bug#55848: AArch64 Honeycomb builders are inactive
Date: Mon, 29 Aug 2022 15:30:07 +0200
Hello!
Ludovic Courtès <ludo@gnu.org> writes:
> I’m currently reconfiguring pankow and grunewald⁰ from berlin with ‘guix
> deploy’ to include the fix above¹, but it’s gonna take a while as it’s
> currently building GCC…
That eventually succeeded and pankow is now reconfigured with the right
daemon settings. \o/
In the meantime, grunewald went off-line so I can’t tell if it’s
properly reconfigured, and kreuzberg is still running the old config I
believe (I cannot log in).
Ricardo, do you have access to these two?
Cheers,
Ludo’.
bug closed, send any further explanations to
55848@debbugs.gnu.org and Mathieu Othacehe <othacehe@gnu.org>
Request was from Ludovic Courtès <ludo@gnu.org>
to control@debbugs.gnu.org.
(Fri, 06 Sep 2024 12:20:01 GMT)