Report forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Tue, 25 Jun 2019 18:13:02 GMT) (full text, mbox, link).
Acknowledgement sent
to Robert Vollmert <rob@vllmrt.net>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org.
(Tue, 25 Jun 2019 18:13:02 GMT) (full text, mbox, link).
On my VPS, booting takes forever (long enough that for a long
time I thought the install had failed). I just rebooted again,
and it took over 7 minutes, see attached screenshot.
I would suggest skipping the seeding from /dev/hwrng by default
if /var/lib/random-seed is available. I’m assuming here that my
problem is not too rare — if it is, an option to disable the
seeding from /dev/hwrng seems like a good idea.
From: Alex Sassmannshausen <alex.sassmannshausen@gmail.com>
To: bug-guix@gnu.org
Cc: 36380@debbugs.gnu.org
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Wed, 26 Jun 2019 10:41:47 +0100
Hi Robert,
Robert Vollmert <rob@vllmrt.net> writes:
> On my VPS, booting takes forever (long enough that for a long
> time I thought the install had failed). I just rebooted again,
> and it took over 7 minutes, see attached screenshot.
>
> I would suggest skipping the seeding from /dev/hwrng by default
> if /var/lib/random-seed is available. I’m assuming here that my
> problem is not too rare — if it is, an option to disable the
> seeding from /dev/hwrng seems like a good idea.
I'm not sure I'm qualified on best practices with regard to urandom, but
anecdotally, my servers are booting pretty fast, within a minute,
consistently. This is even when booting a qemu virtual image rather
than a VPS.
Perhaps something else is going on here?
Alex
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Wed, 26 Jun 2019 09:42:02 GMT) (full text, mbox, link).
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Wed, 26 Jun 2019 15:48:02 GMT) (full text, mbox, link).
On Tue, Jun 25, 2019 at 08:12:28PM +0200, Robert Vollmert wrote:
> On my VPS, booting takes forever (long enough that for a long
> time I thought the install had failed). I just rebooted again,
> and it took over 7 minutes, see attached screenshot.
Yikes, that's way too long. Can you say what VPS it is?
> I would suggest skipping the seeding from /dev/hwrng by default
> if /var/lib/random-seed is available. I’m assuming here that my
> problem is not too rare — if it is, an option to disable the
> seeding from /dev/hwrng seems like a good idea.
Originally I added the HWRNG read specifically the for VM / VPS use case
[0], where the first boot environment is relatively deterministic. I
agree it's superfluous if the random-seed file is handled properly but
it's nice to unconditionally have this other entropy source that avoids
the pitfalls of file-based entropy seeding.
Ideally the hypervisor would seed the guest's HWRNG interface with the
host's /dev/urandom, which would avoid significant delays. It seems they
are using some other more limited resource instead.
Does anyone else have an opinion or experience with this issue? It would
be great to know how widespread it is.
[0]
https://git.savannah.gnu.org/cgit/guix.git/commit/?id=9a56cf2b5b4970843c215091ea9823a67e077310https://lists.gnu.org/archive/html/guix-devel/2017-12/msg00096.html
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Wed, 26 Jun 2019 18:02:03 +0200
> On 26. Jun 2019, at 17:47, Leo Famulari <leo@famulari.name> wrote:
>
> On Tue, Jun 25, 2019 at 08:12:28PM +0200, Robert Vollmert wrote:
>> On my VPS, booting takes forever (long enough that for a long
>> time I thought the install had failed). I just rebooted again,
>> and it took over 7 minutes, see attached screenshot.
>
> Yikes, that's way too long. Can you say what VPS it is?
It’s with arpnetworks.com, their default “small" VPS. I don’t
really know more; here’s some dmesg output that might be relevant,
happy to provide more info.
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1.1 04/01/2014
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 2a782001, primary cpu clock
[ 0.000000] kvm-clock: using sched offset of 1160634602574609 cycles
[ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000004] tsc: Detected 3066.774 MHz processor
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Thu, 27 Jun 2019 15:21:02 GMT) (full text, mbox, link).
Cc: 36380@debbugs.gnu.org, Robert Vollmert <rob@vllmrt.net>
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Thu, 27 Jun 2019 17:20:21 +0200
Hi Leo,
Leo Famulari <leo@famulari.name> skribis:
> On Tue, Jun 25, 2019 at 08:12:28PM +0200, Robert Vollmert wrote:
>> On my VPS, booting takes forever (long enough that for a long
>> time I thought the install had failed). I just rebooted again,
>> and it took over 7 minutes, see attached screenshot.
>
> Yikes, that's way too long. Can you say what VPS it is?
We had a “bug report” at
<https://distrowatch.com/weekly.php?issue=20190624#guixsd>, which may be
due to the same issue:
The first time I loaded Guix the boot process took an unusually long
time. At one point the system appeared to lock up for about five
minutes before continuing. In the end, from boot menu to graphical
login screen, the start-up time totalled about ten minutes.
The author says they were running Guix in VirtualBox. I’m glad Robert’s
bug report has more info than that one. :-)
What should we do?
Ludo’.
Severity set to 'important' from 'normal'
Request was from Ludovic Courtès <ludo@gnu.org>
to control@debbugs.gnu.org.
(Thu, 27 Jun 2019 15:21:02 GMT) (full text, mbox, link).
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Thu, 27 Jun 2019 19:04:02 GMT) (full text, mbox, link).
On Thu, Jun 27, 2019 at 05:20:21PM +0200, Ludovic Courtès wrote:
> We had a “bug report” at
> <https://distrowatch.com/weekly.php?issue=20190624#guixsd>, which may be
> due to the same issue:
>
> The first time I loaded Guix the boot process took an unusually long
> time. At one point the system appeared to lock up for about five
> minutes before continuing. In the end, from boot menu to graphical
> login screen, the start-up time totalled about ten minutes.
Perhaps, but if the reason for the slowness on their first boot was a
suboptimal /dev/hwrng source, I would expect it to be equally slow for
each boot, since we unconditionally read 64 bytes each time.
However, their next sentence says, "Curiously, after the first boot,
Guix started up considerably faster, generally taking less than a minute
to arrive at the login screen."
They are using an old machine with a spinning disk, so who knows...
On Wed, Jun 26, 2019 at 06:02:03PM +0200, Robert Vollmert wrote:
> It’s with arpnetworks.com, their default “small" VPS. I don’t
> really know more; here’s some dmesg output that might be relevant,
> happy to provide more info.
Okay, I've asked them about it on IRC:
http://irclogger.arpnetworks.com/irclogger_log/arpnetworks?date=2019-06-27,Thu
Let's wait and see what they think about the issue.
Cc: 36380@debbugs.gnu.org, Robert Vollmert <rob@vllmrt.net>
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Thu, 27 Jun 2019 22:00:37 +0200
Leo Famulari <leo@famulari.name> skribis:
> On Thu, Jun 27, 2019 at 05:20:21PM +0200, Ludovic Courtès wrote:
>> We had a “bug report” at
>> <https://distrowatch.com/weekly.php?issue=20190624#guixsd>, which may be
>> due to the same issue:
>>
>> The first time I loaded Guix the boot process took an unusually long
>> time. At one point the system appeared to lock up for about five
>> minutes before continuing. In the end, from boot menu to graphical
>> login screen, the start-up time totalled about ten minutes.
>
> Perhaps, but if the reason for the slowness on their first boot was a
> suboptimal /dev/hwrng source, I would expect it to be equally slow for
> each boot, since we unconditionally read 64 bytes each time.
Perhaps VirtualBox behaves this way? For instance, the VM was rebooted
but VirtualBox itself was still running, and thus it had a good random
seed to start from on the second boot (does that make sense?). I guess
we should try.
> They are using an old machine with a spinning disk, so who knows...
Still, what else could be taking this long?
Thanks,
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Fri, 28 Jun 2019 06:48:01 GMT) (full text, mbox, link).
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Fri, 28 Jun 2019 08:47:35 +0200
> On 27. Jun 2019, at 21:03, Leo Famulari <leo@famulari.name> wrote:
> Perhaps, but if the reason for the slowness on their first boot was a
> suboptimal /dev/hwrng source, I would expect it to be equally slow for
> each boot, since we unconditionally read 64 bytes each time.
It’s 512 bytes, not that that should fundamentally change anything.
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Fri, 28 Jun 2019 17:25:01 GMT) (full text, mbox, link).
On Fri, Jun 28, 2019 at 08:47:35AM +0200, Robert Vollmert wrote:
> > On 27. Jun 2019, at 21:03, Leo Famulari <leo@famulari.name> wrote:
> > Perhaps, but if the reason for the slowness on their first boot was a
> > suboptimal /dev/hwrng source, I would expect it to be equally slow for
> > each boot, since we unconditionally read 64 bytes each time.
>
> It’s 512 bytes, not that that should fundamentally change anything.
Oh right, my bad. It's been a while...
Anyways, this should either work immediately or fail. Aside from
getrandom(2), which we aren't using here, nothing related to this stuff
should ever block, and if it does then it's a bug we need to work
around.
So, I suggest we add a 1 second timeout to this read.
I can work on that next week.
On Fri, Jun 28, 2019 at 01:24:01PM -0400, Leo Famulari wrote:
> So, I suggest we add a 1 second timeout to this read.
>
> I can work on that next week.
I did try working on this, after reading the code in (guix scripts
offload (call-with-timeout)).
But, I couldn't make it work at all — it always fails, with all the
services depending on urandom-seed-service failing, breaking the boot.
I don't know how to debug or work interactively in this part of the
system.
Can anybody help?
Cc: 36380@debbugs.gnu.org, Robert Vollmert <rob@vllmrt.net>
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Thu, 11 Jul 2019 23:33:07 +0200
Hi Leo,
Leo Famulari <leo@famulari.name> skribis:
> On Fri, Jun 28, 2019 at 01:24:01PM -0400, Leo Famulari wrote:
>> So, I suggest we add a 1 second timeout to this read.
>>
>> I can work on that next week.
>
> I did try working on this, after reading the code in (guix scripts
> offload (call-with-timeout)).
The ‘start’ method of the ‘urandom-seed’ Shepherd service runs in PID 1,
so we certainly don’t want to fiddle with SIGALRM in that context, which
is what ‘call-with-timeout’ does.
Instead, I think we should use ‘select’ with a timeout:
(call-with-input-file "/dev/hwrng"
(lambda (port)
(match (select (list port) '() '() 3)
…)))
I think we can then use ‘get-bytevector-n!’, assuming it doesn’t block
if less than COUNT bytes are available (I’m not sure this is the case.)
HTH!
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Wed, 17 Jul 2019 21:06:01 GMT) (full text, mbox, link).
Just ran across this article about Debian dealing with similar issues.
https://daniel-lange.com/archives/152-hello-buster.html
One of the suggested work-arounds is to rely on virtio_rng on
virtual machines, but it seems Guix already uses that on my machine,
so perhaps it doesn’t help:
rob@garp ~$ cat /sys/devices/virtual/misc/hw_random/rng_available
virtio_rng.0
rob@garp ~$ cat /sys/devices/virtual/misc/hw_random/rng_current
virtio_rng.0
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Sun, 22 Mar 2020 08:44:01 GMT) (full text, mbox, link).
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Sun, 22 Mar 2020 16:19:19 -0400
On Sun, Mar 22, 2020 at 08:43:33AM +0000, Brice Waegeneire wrote:
> This article has been updated since then with a section[0] about a fix
> authored by Linus[1][2] and merged in Linux 5.4. The gist of it that now
> `getrandom()' will actively try to collect entropy in early boot, if it
> is missing, by using the CPU jitter. The Debian wiki is saying the same[3].
The issue here is not related to getrandom() or our kernel. I think the
bug is still relevant.
The Guix system unconditionally reads from /dev/hwrng if it exists, and
there is no reason for that to take a noticeable amount of time.
But this bug report revealed that some VPS providers have a broken
deployment that does cause delays. Who knows how they are feeding
/dev/hwrng... they would not reply to my questions.
It doesn't really matter though, the problem is ours to fix.
We need to make this read time out after a second, but in the past I
could not figure out how to do this without crashing the system (I'm not
a strong Schemer).
Help is still wanted!
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Sun, 27 Dec 2020 15:01:02 GMT) (full text, mbox, link).
Subject: service urandom-seed takes too long on boot
Date: Sun, 27 Dec 2020 16:00:23 +0100
Hi!
I’m running Guix in qemu on a NAS. The boot takes sometimes more than 30 minutes, probably waiting to start the urandom-seed service.
Guix is using virtio_rng:
stefan@guix ~$ cat /sys/devices/virtual/misc/hw_random/rng_available
virtio_rng.0
stefan@guix ~$ cat /sys/devices/virtual/misc/hw_random/rng_current
virtio_rng.0
stefan@guix ~$ guix describe
Generation 1 26. Dezember 2020 15:06:11 (aktuell)
guix 4969b51
Repository-URL: https://git.savannah.gnu.org/git/guix.git
Branch: master
Commit: 4969b51d175497bfcc354c91803e9d70542b7113
This may be relevant information from dmesg:
[ 0.194324] random: get_random_u64 called from __kmem_cache_create+0x30/0x460 with crng_init=0
…
[ 3.271767] random: fast init done
…
[ 3.497369] random: crng init done
…
[ 21.228829] shepherd[1]: Service file-systems has been started.
[ 21.243838] shepherd[1]: Service user-homes has been started.
[ 2182.735965] shepherd[1]: Service urandom-seed has been started.
[ 2182.737229] shepherd[1]: Service user-processes has been started.
…
Sometimes the urandom-seed service takes “just” 200 seconds – still a lot.
Interestingly during this time-out the system can’t be pinged, the networking with dhclient doesn't seem to be done.
The Guix installer iso is not using the urandom-seed service and does not suffer from this delay.
The host kernel and qemu are rather old:
~$ uname -a
Linux aaaaaaaa 4.4.59+ #25426 SMP PREEMPT Mon Dec 14 18:48:50 CST 2020 x86_64 GNU/Linux synology_apollolake
~$ /usr/local/bin/qemu-system-x86_64 --version
QEMU emulator version 2.12.1 (-dirty)
Copyright (c) 2003-2017 Fabrice Bellard and the QEMU Project developers
Qemu is invoked with these arguments concerning random:
-cpu host,+smap,+rdseed,+erms,+smep,+fsgsbase,+3dnowprefetch,+rdtscp,+pdpe1gb,+rdrand,+osxsave,+xsave,+tsc-deadline,+movbe,+x2apic,+pdcm,+xtpr,+tm2,+est,+vmx,+ds_cpl,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x1c
The full qemu command is huge. I can provide it on request.
Bye
Stefan
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Sun, 27 Dec 2020 23:11:01 GMT) (full text, mbox, link).
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Sun, 27 Dec 2020 18:09:56 -0500
On Sun, Dec 27, 2020 at 04:00:23PM +0100, Stefan wrote:
> Qemu is invoked with these arguments concerning random:
>
> -cpu host,+smap,+rdseed,+erms,+smep,+fsgsbase,+3dnowprefetch,+rdtscp,+pdpe1gb,+rdrand,+osxsave,+xsave,+tsc-deadline,+movbe,+x2apic,+pdcm,+xtpr,+tm2,+est,+vmx,+ds_cpl,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -object rng-random,id=objrng0,filename=/dev/random -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x1c
What does the argument "filename=/dev/random" do? Does it use the host's
/dev/random to provide entropy to the QEMU guest?
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Sun, 27 Dec 2020 23:29:02 GMT) (full text, mbox, link).
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Mon, 28 Dec 2020 00:28:09 +0100
Hi Leo!
> What does the argument "filename=/dev/random" do? Does it use the host's
> /dev/random to provide entropy to the QEMU guest?
Yes, see [1]:
“-object rng-random,id=id,filename=/dev/random
Creates a random number generator backend which obtains entropy from a device on the host. The id parameter is a unique ID that will be used to reference this entropy backend from the virtio-rng device. The filename parameter specifies which file to obtain entropy from and if omitted defaults to /dev/urandom.”
Bye
Stefan
[1] https://www.qemu.org/docs/master/system/invocation.html#hxtool-10
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Tue, 29 Dec 2020 02:52:01 GMT) (full text, mbox, link).
Subject: Re: bug#36380: service urandom-seed takes too long on boot
Date: Mon, 28 Dec 2020 21:51:19 -0500
On Mon, Dec 28, 2020 at 12:28:09AM +0100, Stefan wrote:
> “-object rng-random,id=id,filename=/dev/random
> Creates a random number generator backend which obtains entropy from a device on the host. The id parameter is a unique ID that will be used to reference this entropy backend from the virtio-rng device. The filename parameter specifies which file to obtain entropy from and if omitted defaults to /dev/urandom.”
I see, so there is the problem.
It should either be changed to "filename=/dev/urandom", or the QEMU
rng-random mechanism should be disabled.
We still need to implement a timeout to workaround this kind of
misconfiguration... help wanted :)
Information forwarded
to bug-guix@gnu.org: bug#36380; Package guix.
(Sun, 07 Feb 2021 15:24:02 GMT) (full text, mbox, link).
To: "36380@debbugs.gnu.org" <36380@debbugs.gnu.org>
Subject: service urandom-seed takes too long on boot
Date: Sun, 07 Feb 2021 15:23:13 +0000
```scheme
(define (get-bytevector-n-timed port count max-time)
"Read COUNT octets from PORT, blocking as necessary and return a
bytevector containing the octets read, and taking no more than
MAX-TIME seconds. If fewer bytes are available, a bytevector
smaller than COUNT is returned."
(define (get-time)
(let* ((pair (gettimeofday))
(secs (car pair))
(usecs (cdr pair)))
(+ secs (* 0.000001 usecs))))
(let* ((start-time (get-time))
(end-time (+ start-time max-time))
(buf (make-bytevector count)))
(let loop ((offset 0))
(let ((current-time (get-time)))
(cond
((= offset count)
buf)
((>= current-time end-time)
(let ((newbuf (make-bytevector offset)))
(bytevector-copy! buf 0 newbuf 0 offset)
newbuf))
(else
(let* ((result (select (list port) '() '() (- end-time current-time)))
(readable? (not (null? (car result)))))
(if readable?
(begin
;; read only one byte at a time, as we cannot be sure
;; that the given port will have more than one byte
;; available and that ports will not block if we ask
;; for more than one byte.
(get-bytevector-n! port buf offset 1)
(loop (+ offset 1)))
(loop offset)))))))))
```
Debbugs is free software and licensed under the terms of the
GNU Public License version 2. The current version can be
obtained from https://bugs.debian.org/debbugs-source/.