Report forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Thu, 24 Nov 2016 14:08:02 GMT).
Acknowledgement sent
to ludo@gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guix@gnu.org.
(Thu, 24 Nov 2016 14:08:02 GMT).
Hello,
The ‘readTempRoots’ function in gc.cc has this:
    /* Try to acquire a write lock without blocking. This can
       only succeed if the owning process has died. In that case
       we don't care about its temporary roots. */
    if (lockFile(*fd, ltWrite, false)) {
        printMsg(lvlError, format("removing stale temporary roots file `%1%'") % path);
        unlink(path.c_str());
There’s a thinko here: locking the file also succeeds when the lock is
already held by the calling process.
In that case, this code ends up removing the temporary root file of
the calling process, which is bad. Here’s a sample session:
--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> ,use(guix)
scheme@(guile-user)> (define s (open-connection))
scheme@(guile-user)> (current-build-output-port (current-error-port))
$2 = #<output: file /dev/pts/9>
scheme@(guile-user)> (set-build-options s #:verbosity 10)
$3 = #t
scheme@(guile-user)> (add-text-to-store s "foo" "bar!")
acquiring global GC lock `/var/guix/gc.lock'
acquiring read lock on `/var/guix/temproots/4259'
acquiring write lock on `/var/guix/temproots/4259'
downgrading to read lock on `/var/guix/temproots/4259'
locking path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
lock acquired on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock'
`/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' has hash `c756ef12a70bad10c9ac276ecd1213ea7cc3a2e6c462ba47e4f9a88756a055d0'
lock released on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock'
$4 = "/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo"
scheme@(guile-user)> (delete-paths s (list $4))
acquiring global GC lock `/var/guix/gc.lock'
finding garbage collector roots...
executing `/gnu/store/l99rkv2713nl53kr3gn4akinvifsx19h-guix-0.11.0-3.7ca3/libexec/guix/list-runtime-roots' to find additional roots
[…]
reading temporary root file `/var/guix/temproots/4259'
removing stale temporary roots file `/var/guix/temproots/4259'
[…]
considering whether to delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
| invalidating path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
| deleting `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
| recursively deleting path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
| | /gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo
deleting `/gnu/store/trash'
recursively deleting path `/gnu/store/trash'
| /gnu/store/trash
deleting unused links...
deleting unused link `/gnu/store/.links/1l2ml1b8ga7rwi3vlqn4wsic6z7a2c9csvi7mk4i1b8blw9fymn7'
note: currently hard linking saves 6699.22 MiB
$5 = ("/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo")
$6 = 4096
--8<---------------cut here---------------end--------------->8---
Notice the “removing stale temporary roots file” message.
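The locking behaviour behind that message can be reproduced outside the daemon. What follows is a minimal sketch, assuming that ‘lockFile’ is built on POSIX fcntl record locks taken with F_SETLK; the helper names (tryLock, ownerCanRelock, otherProcessIsRefused) are hypothetical and do not exist in gc.cc. It shows that the process already holding a read lock is granted the write lock, while a different process is refused.

```cpp
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <string>

// Hypothetical helper: try to take a whole-file fcntl lock without blocking.
static bool tryLock(int fd, short type) {
    assert(fd >= 0);
    struct flock lock{};
    lock.l_type = type;        // F_RDLCK or F_WRLCK
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;            // 0 means "up to the end of the file"
    return fcntl(fd, F_SETLK, &lock) == 0;
}

// True if a process holding a read lock is also granted a write lock,
// even through a second file descriptor for the same file.
bool ownerCanRelock(const std::string & path) {
    int fd1 = open(path.c_str(), O_CREAT | O_RDWR, 0600);
    if (fd1 == -1) return false;
    int fd2 = open(path.c_str(), O_RDWR);
    if (fd2 == -1) { close(fd1); return false; }
    bool ok = tryLock(fd1, F_RDLCK) && tryLock(fd2, F_WRLCK);
    close(fd2);
    close(fd1);
    return ok;
}

// True if a *different* process is refused the write lock while we
// hold a read lock on the file.
bool otherProcessIsRefused(const std::string & path) {
    int fd = open(path.c_str(), O_CREAT | O_RDWR, 0600);
    if (fd == -1) return false;
    if (!tryLock(fd, F_RDLCK)) { close(fd); return false; }
    pid_t child = fork();
    if (child == -1) { close(fd); return false; }
    if (child == 0) {
        // fcntl locks are not inherited across fork, so the child
        // competes with the parent like any unrelated process.
        int cfd = open(path.c_str(), O_RDWR);
        if (cfd == -1) _exit(2);
        _exit(tryLock(cfd, F_WRLCK) ? 1 : 0);
    }
    int status = 0;
    waitpid(child, &status, 0);
    close(fd);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Under these assumptions, a successful non-blocking write lock does not by itself prove that the owner has died; it is equally consistent with the caller being the owner.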
Eelco: shouldn’t it be changed along the lines of the attached patch?
Thanks,
Ludo’.
diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
index 72eff52..d92388f 100644
--- a/nix/libstore/gc.cc
+++ b/nix/libstore/gc.cc
@@ -2,6 +2,7 @@
#include "misc.hh"
#include "local-store.hh"
+#include <string>
#include <functional>
#include <queue>
#include <algorithm>
@@ -225,10 +226,10 @@ static void readTempRoots(PathSet & tempRoots, FDs & fds)
//FDPtr fd(new AutoCloseFD(openLockFile(path, false)));
//if (*fd == -1) continue;
- /* Try to acquire a write lock without blocking. This can
- only succeed if the owning process has died. In that case
- we don't care about its temporary roots. */
- if (lockFile(*fd, ltWrite, false)) {
+ /* Try to acquire a write lock without blocking. This can only
+ succeed if the owning process has died, in which case we don't care
+ about its temporary roots, or if we are the owning process. */
+ if (i.name != std::to_string(getpid()) && lockFile(*fd, ltWrite, false)) {
printMsg(lvlError, format("removing stale temporary roots file `%1%'") % path);
unlink(path.c_str());
writeFull(*fd, "d");
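The condition introduced by the patch can be isolated as a small sketch. The helper names below (isOwnTempRootsFile, shouldRemoveAsStale) are hypothetical and only illustrate the idea; the one thing taken from the patch is the comparison of the file name with std::to_string(getpid()). The lock attempt is passed in as a callable so that, as in the patched `&&` expression, it is never made for the caller's own file.

```cpp
#include <unistd.h>
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: temproots files are named after the PID of the
// process that owns them, so compare the name with our own PID.
bool isOwnTempRootsFile(const std::string & name) {
    return name == std::to_string(getpid());
}

// Hypothetical helper mirroring the patched condition: a file is
// treated as stale only if it is not ours AND the non-blocking write
// lock succeeds. The lock is not attempted at all for our own file.
bool shouldRemoveAsStale(const std::string & name,
                         const std::function<bool()> & tryWriteLock) {
    assert(tryWriteLock);  // the caller must supply a lock attempt
    return !isOwnTempRootsFile(name) && tryWriteLock();
}
```

Skipping the lock attempt for the caller's own file also avoids silently turning its read lock into a write lock, assuming fcntl-style locks where a new request replaces the lock already held.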
Severity set to 'important' from 'normal'
Request was from ludo@gnu.org (Ludovic Courtès)
to control@debbugs.gnu.org.
(Mon, 23 Jan 2017 22:16:01 GMT).
Reply sent
to Maxim Cournoyer <maxim.cournoyer@gmail.com>:
You have taken responsibility.
(Fri, 07 Oct 2022 21:00:02 GMT).
Notification sent
to ludo@gnu.org (Ludovic Courtès):
bug acknowledged by developer.
(Fri, 07 Oct 2022 21:00:02 GMT).
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Fri, 07 Oct 2022 16:59:20 -0400
Hi Ludo,
ludo@gnu.org (Ludovic Courtès) writes:
> Hello,
>
> The ‘readTempRoots’ function in gc.cc has this:
>
> /* Try to acquire a write lock without blocking. This can
> only succeed if the owning process has died. In that case
> we don't care about its temporary roots. */
> if (lockFile(*fd, ltWrite, false)) {
> printMsg(lvlError, format("removing stale temporary roots file `%1%'") % path);
> unlink(path.c_str());
>
> There’s a thinko here: locking the file also succeeds when the lock is
> already held by the calling process.
>
> In that case, this code ends up removing the temporary root file of
> the calling process, which is bad. Here’s a sample session:
>
> scheme@(guile-user)> ,use(guix)
> scheme@(guile-user)> (define s (open-connection))
> scheme@(guile-user)> (current-build-output-port (current-error-port))
> $2 = #<output: file /dev/pts/9>
> scheme@(guile-user)> (set-build-options s #:verbosity 10)
> $3 = #t
> scheme@(guile-user)> (add-text-to-store s "foo" "bar!")
> acquiring global GC lock `/var/guix/gc.lock'
> acquiring read lock on `/var/guix/temproots/4259'
> acquiring write lock on `/var/guix/temproots/4259'
> downgrading to read lock on `/var/guix/temproots/4259'
> locking path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> lock acquired on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock'
> `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' has hash `c756ef12a70bad10c9ac276ecd1213ea7cc3a2e6c462ba47e4f9a88756a055d0'
> lock released on `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo.lock'
> $4 = "/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo"
> scheme@(guile-user)> (delete-paths s (list $4))
> acquiring global GC lock `/var/guix/gc.lock'
> finding garbage collector roots...
> executing `/gnu/store/l99rkv2713nl53kr3gn4akinvifsx19h-guix-0.11.0-3.7ca3/libexec/guix/list-runtime-roots' to find additional roots
> […]
> reading temporary root file `/var/guix/temproots/4259'
> removing stale temporary roots file `/var/guix/temproots/4259'
> […]
> considering whether to delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> | invalidating path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> | deleting `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> | recursively deleting path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> | | /gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo
> deleting `/gnu/store/trash'
> recursively deleting path `/gnu/store/trash'
> | /gnu/store/trash
> deleting unused links...
> deleting unused link `/gnu/store/.links/1l2ml1b8ga7rwi3vlqn4wsic6z7a2c9csvi7mk4i1b8blw9fymn7'
> note: currently hard linking saves 6699.22 MiB
> $5 = ("/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo")
> $6 = 4096
>
> Notice the “removing stale temporary roots file” message.
>
> Eelco: shouldn’t it be changed along the lines of the attached patch?
>
>
> Thanks,
> Ludo’.
>
> diff --git a/nix/libstore/gc.cc b/nix/libstore/gc.cc
> index 72eff52..d92388f 100644
> --- a/nix/libstore/gc.cc
> +++ b/nix/libstore/gc.cc
> @@ -2,6 +2,7 @@
> #include "misc.hh"
> #include "local-store.hh"
>
> +#include <string>
> #include <functional>
> #include <queue>
> #include <algorithm>
> @@ -225,10 +226,10 @@ static void readTempRoots(PathSet & tempRoots, FDs & fds)
> //FDPtr fd(new AutoCloseFD(openLockFile(path, false)));
> //if (*fd == -1) continue;
>
> - /* Try to acquire a write lock without blocking. This can
> - only succeed if the owning process has died. In that case
> - we don't care about its temporary roots. */
> - if (lockFile(*fd, ltWrite, false)) {
> + /* Try to acquire a write lock without blocking. This can only
> + succeed if the owning process has died, in which case we don't care
> + about its temporary roots, or if we are the owning process. */
> + if (i.name != std::to_string(getpid()) && lockFile(*fd, ltWrite, false)) {
> printMsg(lvlError, format("removing stale temporary roots file `%1%'") % path);
> unlink(path.c_str());
> writeFull(*fd, "d");
>
I'm not Eelco, but your change LGTM. Note that the upstream version
still uses the original code [0].
I've installed the change, tested that it had the expected result:
--8<---------------cut here---------------start------------->8---
reading temporary root file `/var/guix/temproots/8386'
waiting for read lock on `/var/guix/temproots/8386'
got temporary root `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
considering whether to delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
| cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's a root
| cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's still reachable
ice-9/boot-9.scm:1685:16: In procedure raise-exception:
ERROR:
1. &store-protocol-error:
message: "cannot delete path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' since it is still alive"
status: 1
--8<---------------cut here---------------end--------------->8---
and pushed!
Closing.
[0] https://github.com/NixOS/nix/blob/master/src/libstore/gc.cc#L194
--
Thanks,
Maxim
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Mon, 10 Oct 2022 08:02:02 GMT).
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Mon, 10 Oct 2022 10:01:01 +0200
Hi Maxim,
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> I'm not Eelco, but your change LGTM. Note that the upstream version
> still uses the original code [0].
Right.
> I've installed the change, tested that it had the expected result:
>
> reading temporary root file `/var/guix/temproots/8386'
> waiting for read lock on `/var/guix/temproots/8386'
> got temporary root `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> considering whether to delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo'
> | cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's a root
> | cannot delete `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' because it's still reachable
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> ERROR:
> 1. &store-protocol-error:
> message: "cannot delete path `/gnu/store/0siy93lggjw7sfdg8gsvrzafaa974h2d-foo' since it is still alive"
> status: 1
>
> and pushed!
Thank you! (Your bug triage work is much appreciated!) We could turn
the example here into a unit test; the only downside is that running the
GC in a test is expensive.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Mon, 10 Oct 2022 10:31:01 GMT).
On 10-10-2022 10:01, Ludovic Courtès wrote:
> Hi Maxim,
>
> [...]
>> and pushed!
>
> Thank you! (Your bug triage work is much appreciated!) We could turn
> the example here into a unit test; the only downside is that running the
> GC in a test is expensive.
It should be possible to run the GC on the test store instead of
/gnu/store, no? If that's still too expensive, how about creating an
additional temporary test store only for the GC test?
Greetings,
Maxime.
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Mon, 10 Oct 2022 16:53:32 +0200
Maxime Devos <maximedevos@telenet.be> writes:
> On 10-10-2022 10:01, Ludovic Courtès wrote:
>> Hi Maxim,
>> [...]
>>> and pushed!
>> Thank you! (Your bug triage work is much appreciated!) We could turn
>> the example here into a unit test; the only downside is that running the
>> GC in a test is expensive.
>
> It should be possible to run the GC on the test store instead of
> /gnu/store, no?
Yes, that’s what I meant and several tests already do this, but it’s
quite expensive.
> If that's still too expensive, how about creating an additional
> temporary test store only for the GC test?
Bah, that sounds complicated to me.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Mon, 10 Oct 2022 17:25:02 GMT).
Hello, this patch seems to have broken the test suite in
tests/store.scm. My test log file is attached.
./test-env make check TESTS=tests/store.scm
Cuirass did not detect the changes since they are at such a low level:
(https://ci.guix.gnu.org/eval/700414)
--
Sincerely,
Ryan Sundberg
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Fri, 14 Oct 2022 22:30:28 +0200
Hi Maxim,
(Stripping Cc:.)
Ludovic Courtès <ludo@gnu.org> writes:
> Thank you! (Your bug triage work is much appreciated!) We could turn
> the example here into a unit test; the only downside is that running the
> GC in a test is expensive.
Actually, there are tests that most likely relied on the previous
behavior and are now failing in
tests/{derivations,nar,publish,pypi,store}.scm. We’ll have to look at
each one to make sure they are indeed making the wrong assumption and to
fix them.
What about reverting the change first so we can do that without
pressure and come up with a self-contained patch?
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Mon, 17 Oct 2022 01:26:02 GMT).
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Sun, 16 Oct 2022 21:25:44 -0400
Hi Ludovic!
Ludovic Courtès <ludo@gnu.org> writes:
> Hi Maxim,
>
> (Stripping Cc:.)
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Thank you! (Your bug triage work is much appreciated!) We could turn
>> the example here into a unit test; the only downside is that running the
>> GC in a test is expensive.
>
> Actually, there are tests that most likely relied on the previous
> behavior and are now failing in
> tests/{derivations,nar,publish,pypi,store}.scm. We’ll have to look at
> each one to make sure they are indeed making the wrong assumption and to
> fix them.
Hmm, I hadn't seen that coming.
> What about reverting the change first so we can do that without
> pressure and come up with a self-contained patch?
Sounds reasonable, if we can't think of an immediate fix.
--
Thanks,
Maxim
Did not alter fixed versions and reopened.
Request was from Debbugs Internal Request <help-debbugs@gnu.org>
to internal_control@debbugs.gnu.org.
(Mon, 17 Oct 2022 07:37:01 GMT).
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Mon, 17 Oct 2022 08:52:02 GMT).
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Mon, 17 Oct 2022 10:51:19 +0200
Hi Maxim,
Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:
> Ludovic Courtès <ludo@gnu.org> writes:
>
>> Hi Maxim,
>>
>> (Stripping Cc:.)
>>
>> Ludovic Courtès <ludo@gnu.org> writes:
>>
>>> Thank you! (Your bug triage work is much appreciated!) We could turn
>>> the example here into a unit test; the only downside is that running the
>>> GC in a test is expensive.
>>
>> Actually, there are tests that most likely relied on the previous
>> behavior and are now failing in
>> tests/{derivations,nar,publish,pypi,store}.scm. We’ll have to look at
>> each one to make sure they are indeed making the wrong assumption and to
>> fix them.
>
> Hmm, I hadn't seen that coming.
>
>> What about reverting the change first so we can do that without
>> pressure and come up with a self-contained patch?
>
> Sounds reasonable, if we can't think of an immediate fix.
I reverted it in eec920ba93ecb086366576e31b785962fbaf81c2.
The way forward will be to review those tests one by one, make sure they
were making the “wrong” assumption, adjust them accordingly, and
possibly add new tests. It’s not necessarily difficult but takes a bit
of time. (In the coming weeks I’m going to try and focus on more urgent
matters but I’m happy to review if you or someone else gets to it!)
Thanks,
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#25018; Package guix.
(Tue, 18 Oct 2022 15:34:02 GMT).
Cc: 25018@debbugs.gnu.org, GNU Debbugs <control@debbugs.gnu.org>
Subject: Re: bug#25018: GC incorrectly removes the temporary root file of
the calling process
Date: Tue, 18 Oct 2022 11:33:32 -0400
reopen 25018
quit
Hi,
Ludovic Courtès <ludo@gnu.org> writes:
[...]
> I reverted it in eec920ba93ecb086366576e31b785962fbaf81c2.
>
> The way forward will be to review those tests one by one, make sure they
> were making the “wrong” assumption, adjust them accordingly, and
> possibly add new tests. It’s not necessarily difficult but takes a bit
> of time. (In the coming weeks I’m going to try and focus on more urgent
> matters but I’m happy to review if you or someone else gets to it!)
Thanks, and sorry for not noticing the breakage. I'm reopening the bug
so that we don't forget about it.
--
Thanks,
Maxim