GNU bug report logs

#28659 Content-addressed mirror is not used upon invalid hash


Report forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sun, 01 Oct 2017 10:17:02 GMT).


Acknowledgement sent to Jan Nieuwenhuizen <janneke@gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org. (Sun, 01 Oct 2017 10:17:02 GMT).


Message #5 received at submit@debbugs.gnu.org:

From: Jan Nieuwenhuizen <janneke@gnu.org>
To: bug-guix@gnu.org
Subject: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Sun, 01 Oct 2017 12:16:07 +0200
Hi!

As reported by laertus on irc[0]: guix pull on 0.13 without substitutes fails

      guix pull

    Starting download of /tmp/guix-file.3r6cH0
    From https://git.savannah.gnu.org/cgit/guix.git/snapshot/master.tar.gz...
     ….tar.gz                                   5.7MiB/s 00:02 | 13.6MiB transferred
    unpacking '/gnu/store/sginfwnrcfqn1far31gmzlaffd8xlxyy-guix-latest.tar.gz'...

    Starting download of /gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz
    From https://github.com/libgit2/libgit2/archive/v0.25.1.tar.gz...
    following redirection to `https://codeload.github.com/libgit2/libgit2/tar.gz/v0.25.1'...
     v0.25.1                                     6.1MiB/s 00:01 | 4.1MiB transferred
    output path `/gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz' should have sha256 hash `1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s', instead has `0ywcxw1mwd56c8qc14hbx31bf198gxck3nja3laxyglv7l57qp26'
    cannot build derivation `/gnu/store/z1ky970mnamnbairnpyxxb72qnc485zq-libgit2-0.25.1.drv': 1 dependencies couldn't be built
    cannot build derivation `/gnu/store/rl7ms8rmbywvydy4qf656g1sdfxafb7r-guile-git-0.0-2.06f9fc3.drv': 1 dependencies couldn't be built
    guix pull: error: build failed: build of `/gnu/store/rl7ms8rmbywvydy4qf656g1sdfxafb7r-guile-git-0.0-2.06f9fc3.drv' failed

because the libgit2-0.25.1 content hash does not check out.
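
Editorial note: the 52-character hashes in the log above are SHA-256 digests printed in the Nix-style base32 alphabet rather than in hex. For anyone wanting to compare a downloaded tarball against the expected value by hand, the following is a minimal Python sketch of that encoding. The names `nix_base32` and `sha256_nix_base32` are made up for this note, and this is an illustration, not the daemon's own code.

```python
import hashlib

# Alphabet used by the Nix-style base32 encoding (no e, o, t, u).
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode DIGEST in Nix-style base32 (5-bit groups, emitted last group first)."""
    length = (len(digest) * 8 - 1) // 5 + 1
    chars = []
    for n in range(length - 1, -1, -1):
        byte_index, bit_index = divmod(n * 5, 8)
        value = digest[byte_index] >> bit_index
        if byte_index + 1 < len(digest):
            # Pull in the bits that spill over into the next byte.
            value |= digest[byte_index + 1] << (8 - bit_index)
        chars.append(NIX_BASE32_ALPHABET[value & 0x1F])
    return "".join(chars)


def sha256_nix_base32(data: bytes) -> str:
    """Return the SHA-256 of DATA in the same textual form as the log messages."""
    return nix_base32(hashlib.sha256(data).digest())
```

Feeding the bytes of each downloaded file to `sha256_nix_base32` should give strings directly comparable with the "should have" and "instead has" values quoted above.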

I verified this on version-0.13.  The same goes for 0.26.0 on master

    $ guix build -S libgit2 --no-substitutes
    The following derivations will be built:
       /gnu/store/5szrmzmfgxk6pylk5fh9bk8apj4x8axf-libgit2-0.26.0.tar.xz.drv
       /gnu/store/mgh4yjxkxfyqmc7c61vwq4vs8v837602-libgit2-0.26.0.tar.gz.drv
    @ build-started /gnu/store/mgh4yjxkxfyqmc7c61vwq4vs8v837602-libgit2-0.26.0.tar.gz.drv - x86_64-linux /var/log/guix/drvs/mg//h4yjxkxfyqmc7c61vwq4vs8v837602-libgit2-0.26.0.tar.gz.drv.bz2

    Starting download of /gnu/store/53lj4z9cavl7n27r89zjnvyd8fk854kj-libgit2-0.26.0.tar.gz
    From https://github.com/libgit2/libgit2/archive/v0.26.0.tar.gz...
    following redirection to `https://codeload.github.com/libgit2/libgit2/tar.gz/v0.26.0'...
     v0.26.0  4.5MiB                    3.1MiB/s 00:01 [####################] 100.0%
    sha256 hash mismatch for output path `/gnu/store/53lj4z9cavl7n27r89zjnvyd8fk854kj-libgit2-0.26.0.tar.gz'
      expected: 1fdk9yhwvl1w1z71ykzcvgh4nsf8scxcbclz5anh98zpplmhmisa
      actual:   1b3figbhp5l83vd37vq6j2narrq4yl9pfw6mw0px0dzb1hz3jqka
    @ build-failed /gnu/store/mgh4yjxkxfyqmc7c61vwq4vs8v837602-libgit2-0.26.0.tar.gz.drv - 1 sha256 hash mismatch for output path `/gnu/store/53lj4z9cavl7n27r89zjnvyd8fk854kj-libgit2-0.26.0.tar.gz'
      expected: 1fdk9yhwvl1w1z71ykzcvgh4nsf8scxcbclz5anh98zpplmhmisa
      actual:   1b3figbhp5l83vd37vq6j2narrq4yl9pfw6mw0px0dzb1hz3jqka
    cannot build derivation `/gnu/store/5szrmzmfgxk6pylk5fh9bk8apj4x8axf-libgit2-0.26.0.tar.xz.drv': 1 dependencies couldn't be built
    guix build: error: build failed: build of `/gnu/store/5szrmzmfgxk6pylk5fh9bk8apj4x8axf-libgit2-0.26.0.tar.xz.drv' failed

I found no apparent difference in the content

    -r--r--r-- 1 janneke janneke  4252130 Oct  1 09:08 c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz
    -rw-r--r-- 1 janneke janneke  4252139 Oct  1 09:09 NEW-c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz
    -rw-r--r-- 1 janneke janneke 16363520 Oct  1 09:14 c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar
    -rw-r--r-- 1 janneke janneke 16363520 Oct  1 09:14 NEW-c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar

but there's this difference between the tarballs...

    12:13:57 janneke@dundal:~/src/guix-0.13 
    $ cmp -l c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar NEW-c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar
    13122049   0 157
    13122050   0 162
    13122051   0 151
    13122052   0 147
    13122053   0 151
    13122054   0 156
    13122055   0  57
    13122490  57   0
    13122491 157   0
    13122492 162   0
    13122493 151   0
    13122494 147   0
    13122495 151   0
    13122496 156   0
    13270529   0 157
    13270530   0 162
    13270531   0 151
    13270532   0 147
    13270533   0 151
    13270534   0 156
    13270535   0  57
    13270972  57   0
    13270973 157   0
    13270974 162   0
    13270975 151   0
    13270976 147   0
    13270977 151   0
    13270978 156   0
    13294081   0 157
    13294082   0 162
    13294083   0 151
    13294084   0 147
    13294085   0 151
    13294086   0 156
    13294087   0  57
    13294519  57   0
    13294520 157   0
    13294521 162   0
    13294522 151   0
    13294523 147   0
    13294524 151   0
    13294525 156   0
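
Editorial note: `cmp -l` prints a 1-based decimal offset followed by the two differing bytes in octal. A small Python sketch for decoding the first group of lines above follows. It assumes the standard ustar header layout (512-byte blocks, `name` field at bytes 0-99, `prefix` field at bytes 345-499); the helper names are made up for this note. With those assumptions, the bytes spell `origin/` at the start of a `name` field in the new file and `/origin` inside a `prefix` field in the old one, which would be consistent with a long path being split differently between the two fields. That reading is an interpretation, not something stated in the message.

```python
# First group of differences from the `cmp -l` output above:
# 1-based offset, byte in the old file (octal), byte in the new file (octal).
CMP_LINES = """\
13122049   0 157
13122050   0 162
13122051   0 151
13122052   0 147
13122053   0 151
13122054   0 156
13122055   0  57
13122490  57   0
13122491 157   0
13122492 162   0
13122493 151   0
13122494 147   0
13122495 151   0
13122496 156   0
"""

BLOCK_SIZE = 512                # tar header/data block size
NAME_FIELD = range(0, 100)      # ustar 'name' field (assumed layout)
PREFIX_FIELD = range(345, 500)  # ustar 'prefix' field (assumed layout)


def parse(text):
    """Return (zero-based offset, old byte, new byte) tuples."""
    rows = []
    for line in text.splitlines():
        offset, old, new = line.split()
        rows.append((int(offset) - 1, int(old, 8), int(new, 8)))
    return rows


def field_of(offset):
    """Name the header field OFFSET falls in, assuming it lies in a header block."""
    position = offset % BLOCK_SIZE
    if position in NAME_FIELD:
        return "name"
    if position in PREFIX_FIELD:
        return "prefix"
    return "other"


def summarize(rows):
    """Group consecutive offsets and decode the non-NUL bytes on each side."""
    runs = []
    for offset, old, new in rows:
        if not runs or runs[-1]["end"] + 1 != offset:
            runs.append({"start": offset, "end": offset, "old": "", "new": ""})
        run = runs[-1]
        run["end"] = offset
        run["old"] += chr(old) if old else ""
        run["new"] += chr(new) if new else ""
    return [(field_of(r["start"]), r["old"], r["new"]) for r in runs]


for field, old_text, new_text in summarize(parse(CMP_LINES)):
    print(f"{field}: old={old_text!r} new={new_text!r}")
```

The two later groups in the listing carry the same byte values at other block offsets, so they decode the same way.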

janneke

[0] https://gnunet.org/bot/log/guix/2017-10-01#T1517584

-- 
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sun, 01 Oct 2017 19:22:01 GMT).


Message #8 received at 28659@debbugs.gnu.org:

From: Jan Nieuwenhuizen <janneke@gnu.org>
To: 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Sun, 01 Oct 2017 21:20:42 +0200
Jan Nieuwenhuizen writes:

The changing of the libgit2-0.26.0 checksum was already reported about 3
weeks ago (github seems to only show relative dates)

    https://github.com/libgit2/libgit2/issues/4343

and the bug is still open.  It seems to be a github thing.  As I
understand it, currently our options are to update the hash and pray it
won't happen again or host libgit2 tarballs ourselves.

-- 
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sun, 01 Oct 2017 20:44:02 GMT).


Message #11 received at 28659@debbugs.gnu.org:

From: Leo Famulari <leo@famulari.name>
To: Jan Nieuwenhuizen <janneke@gnu.org>
Cc: 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Sun, 1 Oct 2017 16:42:37 -0400
On Sun, Oct 01, 2017 at 09:20:42PM +0200, Jan Nieuwenhuizen wrote:
> Jan Nieuwenhuizen writes:
> 
> The changing of the libgit2-0.26.0 checksum was already reported about 3
> weeks ago (github seems to only show relative dates)
> 
>     https://github.com/libgit2/libgit2/issues/4343
> 
> and the bug is still open.  It seems to be a github thing.  As I
> understand it, currently our options are to update the hash and pray it
> won't happen again or host libgit2 tarballs ourselves.

I contacted GitHub about this issue a few weeks ago and they said that:

1) They do not guarantee bit-reproducibility of the snapshots they
generate automatically for each release tag, and they wish that people
would not rely on them as we do. However, since people *are* relying on
them, they are discussing this issue internally.
2) This is the relevant code change:
https://git.kernel.org/pub/scm/git/git.git/commit/?id=22f0dcd9634a818a0c83f23ea1a48f2d620c0546

In the meantime, we can add this to the list of reasons that
reproducibility is difficult in the long term.

I don't have any solutions in mind besides keeping substitutes available
for as long as possible and, for users, using substitutes. We might also
petition upstream projects to offer a "real" release tarball.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sun, 01 Oct 2017 21:06:02 GMT).


Message #14 received at 28659@debbugs.gnu.org:

From: ng0 <ng0@infotropique.org>
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Sun, 1 Oct 2017 21:05:27 +0000
Leo Famulari transcribed 2.3K bytes:
> On Sun, Oct 01, 2017 at 09:20:42PM +0200, Jan Nieuwenhuizen wrote:
> > Jan Nieuwenhuizen writes:
> > 
> > The changing of the libgit2-0.26.0 checksum was already reported about 3
> > weeks ago (github seems to only show relative dates)
> > 
> >     https://github.com/libgit2/libgit2/issues/4343
> > 
> > and the bug is still open.  It seems to be a github thing.  As I
> > understand it, currently our options are to update the hash and pray it
> > won't happen again or host libgit2 tarballs ourselves.
> 
> I contacted GitHub about this issue a few weeks ago and they said that:
> 
> 1) They do not guarantee bit-reproducibility of the snapshots they
> generate automatically for each release tag, and they wish that people
> would not rely on them as we do. However, since people *are* relying on
> them, they are discussing this issue internally.
> 2) This is the relevant code change:
> https://git.kernel.org/pub/scm/git/git.git/commit/?id=22f0dcd9634a818a0c83f23ea1a48f2d620c0546
> 
> In the meantime, we can add this to the list of reasons that
> reproducibility is difficult in the long term.
> 
> I don't have any solutions in mind besides keeping substitutes available
> for as long as possible and, for users, using substitutes. We might also
> petition upstream projects to offer a "real" release tarball.

Given that we depend on this for our core functionality,
can't we just keep this on our ftp directory at gnu.org
as a fall-back source in a list?

-- 
ng0
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://krosos.org/dist/keys/
https://www.infotropique.org https://krosos.org

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 14:58:01 GMT).


Message #17 received at 28659@debbugs.gnu.org:

From: ludo@gnu.org (Ludovic Courtès)
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 16:57:38 +0200
Hi!

Leo Famulari <leo@famulari.name> skribis:

> I contacted GitHub about this issue a few weeks ago and they said that:
>
> 1) They do not guarantee bit-reproducibility of the snapshots they
> generate automatically for each release tag, and they wish that people
> would not rely on them as we do. However, since people *are* relying on
> them, they are discussing this issue internally.

Oh?!  Then we’re in trouble.

Perhaps we should start using ‘git-fetch’ more, with Software Heritage
as a fallback content-addressed mirror?  Though again the difficulty is
that SWH uses Git’s method to hash directory contents, so we’d end up
having to provide both a Nix hash and a Git hash in ‘origin’.  :-/

> In the meantime, we can add this to the list of reasons that
> reproducibility is difficult in the long term.

Heh.

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 15:11:02 GMT).


Message #20 received at 28659@debbugs.gnu.org:

From: ludo@gnu.org (Ludovic Courtès)
To: Jan Nieuwenhuizen <janneke@gnu.org>
Cc: 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 17:09:39 +0200
Hello,

Jan Nieuwenhuizen <janneke@gnu.org> skribis:

> As reported by laertus on irc[0]: guix pull on 0.13 without substitutes fails

I just checked and we do have substitutes, but I understand it doesn’t
help here.

>       guix pull
>
>     Starting download of /tmp/guix-file.3r6cH0
>     From https://git.savannah.gnu.org/cgit/guix.git/snapshot/master.tar.gz...
>      ….tar.gz                                   5.7MiB/s 00:02 | 13.6MiB transferred
>     unpacking '/gnu/store/sginfwnrcfqn1far31gmzlaffd8xlxyy-guix-latest.tar.gz'...
>
>     Starting download of /gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz
>     From https://github.com/libgit2/libgit2/archive/v0.25.1.tar.gz...
>     following redirection to `https://codeload.github.com/libgit2/libgit2/tar.gz/v0.25.1'...
>      v0.25.1                                     6.1MiB/s 00:01 | 4.1MiB transferred
>     output path `/gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz' should have sha256 hash `1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s', instead has `0ywcxw1mwd56c8qc14hbx31bf198gxck3nja3laxyglv7l57qp26'

What’s sad here is that we do have the right tarball at:

  https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s

The problem is that the hash check is performed by guix-daemon itself,
not by “guix perform-download”.  So when guix-daemon diagnoses a hash
mismatch, it’s too late and we cannot try again and use the
content-addressed mirror.

A crude but helpful fix would be to have perform-download compute the
hash by itself and act accordingly.  It’s crude because that means that
we’d be computing the hash twice: once in ‘guix perform-download’ and a
second time in guix-daemon.  For archives below ~20 MiB it’s probably OK
though.
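
Editorial note: the "compute the hash in the downloader and act accordingly" idea can be sketched as a loop that tries each candidate URL in turn and accepts only data whose digest matches. The Python below is a minimal illustration under stated assumptions: `fetch_verified` is a hypothetical name, the expected hash is given in hex rather than in the base32 form Guix prints, and the `fetch` callable stands in for whatever actually performs the download. It is not how `guix perform-download` is written.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fetch_verified(urls: Iterable[str],
                   expected_sha256_hex: str,
                   fetch: Callable[[str], bytes]) -> Optional[bytes]:
    """Return the first download whose SHA-256 matches, or None.

    URLS is tried in order, so a content-addressed mirror listed after
    the upstream URL is used only when upstream fails or serves
    different bytes.
    """
    for url in urls:
        try:
            data = fetch(url)
        except OSError:
            # Network failure or missing file: move on to the next candidate.
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256_hex:
            return data
        # Hash mismatch: treat it like a failed download and keep going.
    return None
```

The cost noted in the message is visible here: the data is hashed once in this loop and would be hashed again by the daemon when it checks the output.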

Thoughts?

In the future, with the daemon written in Guile, it’s one area where we
could achieve better integration and coordination among the various
pieces.

Ludo’.




Changed bug title to 'Content-addressed mirror is not used upon invalid hash' from 'v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail'. Request was from ludo@gnu.org (Ludovic Courtès) to control@debbugs.gnu.org. (Mon, 02 Oct 2017 15:17:02 GMT).


Severity set to 'important' from 'normal'. Request was from ludo@gnu.org (Ludovic Courtès) to control@debbugs.gnu.org. (Mon, 02 Oct 2017 15:17:03 GMT).


Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 17:06:02 GMT).


Message #27 received at 28659@debbugs.gnu.org:

From: Jan Nieuwenhuizen <janneke@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 19:05:00 +0200
Ludovic Courtès writes:

> What’s sad here is that we do have the right tarball at:
>
>   https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s

Sad indeed!

> The problem is that the hash check is performed by guix-daemon itself,
> not by “guix perform-download”.  So when guix-daemon diagnoses a hash
> mismatch, it’s too late and we cannot try again and use the
> content-addressed mirror.

Why don't we try our content-addressed mirror first?

> A crude but helpful fix would be to have perform-download compute the
> hash by itself and act accordingly.  It’s crude because that means that
> we’d be computing the hash twice: once in ‘guix perform-download’ and a
> second time in guix-daemon.  For archives below ~20 MiB it’s probably OK
> though.
>
> Thoughts?

We may want more guix hackers' viewpoints here, I don't feel very
qualified...As this would be a temporary workaround only until we have

> In the future, with the daemon written in Guile, it’s one area where we
> could achieve better integration and coordination among the various
> pieces.

...it might be fine?

Do we want/need to bring out a new release for this, e.g. 0.13.1, or
even 0.14?  I'm not sure how bad it is that --no-substitutes does not
work.  I think working on guix pull to not compile everything locally
may have priority?

janneke

-- 
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 18:20:01 GMT).


Message #30 received at 28659@debbugs.gnu.org:

From: Leo Famulari <leo@famulari.name>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 2 Oct 2017 14:19:29 -0400
On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
> Hi!
> 
> Leo Famulari <leo@famulari.name> skribis:
> 
> > I contacted GitHub about this issue a few weeks ago and they said that:
> >
> > 1) They do not guarantee bit-reproducibility of the snapshots they
> > generate automatically for each release tag, and they wish that people
> > would not rely on them as we do. However, since people *are* relying on
> > them, they are discussing this issue internally.
> 
> Oh?!  Then we’re in trouble.

I wonder, are there really that many affected packages? My sense is that
most GitHub-hosted projects offer their own release tarballs in addition
to the problematic auto-generated snapshots, and we tend to prefer the
upstream-provided tarballs in this case.

We'd need to survey our package sources to know what sort of reaction is
most appropriate.

In general, we should try to make Guix as resilient as possible to
unstable upstream sources, since the problem is not limited to GitHub.

> Perhaps we should start using ‘git-fetch’ more, with Software Heritage
> as a fallback content-addressed mirror?  Though again the difficulty is
> that SWH uses Git’s method to hash directory contents, so we’d end up
> having to provide both a Nix hash and a Git hash in ‘origin’.  :-/

And the Git hashes will change from SHA1 to SHA256 sooner or later, and
SHA1 hashes will become less reliable as CPUs get faster (collision
attacks), compounding the problem...

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 18:23:02 GMT).


Message #33 received at 28659@debbugs.gnu.org:

From: Leo Famulari <leo@famulari.name>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 2 Oct 2017 14:22:08 -0400
On Mon, Oct 02, 2017 at 05:09:39PM +0200, Ludovic Courtès wrote:
> What’s sad here is that we do have the right tarball at:
> 
>   https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s

It seems to me that there are several reasons someone may choose not to
use substitutes. Some of those reasons (reproducibility and security
concerns) are obviated for fixed-output derivations like upstream
sources, and I think it would be fine to still use substitutes for these
derivations.

But the motivations of privacy, self-sufficiency, etc are not addressed
by that idea.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 20:01:01 GMT).


Message #36 received at 28659@debbugs.gnu.org:

From: ludo@gnu.org (Ludovic Courtès)
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 22:00:33 +0200
Leo Famulari <leo@famulari.name> skribis:

> On Mon, Oct 02, 2017 at 05:09:39PM +0200, Ludovic Courtès wrote:
>> What’s sad here is that we do have the right tarball at:
>> 
>>   https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s

Just to be clear: this URL is not that of a substitute, but that of a
content-addressed file (corresponding to the output of a fixed-output
derivation.)

> It seems to me that there are several reasons someone may choose not to
> use substitutes. Some of those reasons (reproducibility and security
> concerns) are obviated for fixed-output derivations like upstream
> sources, and I think it would be fine to still use substitutes for these
> derivations.
>
> But the motivations of privacy, self-sufficiency, etc are not addressed
> by that idea.

Right.  Jan suggested checking the content-addressed mirrors *before*
the real upstream address.  That would address the problem of upstream
sources modified in-place, but at the cost of privacy/self-sufficiency
as you note.  (Though it’s not really making “privacy” any worse in this
case: it’s gnu.org vs. github.com.)

Perhaps we should make content-addressed mirrors configurable in a way
that’s orthogonal to derivations, something similar in spirit to
--substitute-urls?  The difficulty is that content-addressed mirrors are
not just URLs; see (guix download).

Thoughts?

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 20:24:02 GMT).


Message #39 received at 28659@debbugs.gnu.org:

From: Jan Nieuwenhuizen <janneke@gnu.org>
To: ludo@gnu.org (Ludovic Courtès)
Cc: 28659@debbugs.gnu.org, Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 22:22:33 +0200
Ludovic Courtès writes:

> Right.  Jan suggested checking the content-addressed mirrors *before*
> the real upstream address.  That would address the problem of upstream
> sources modified in-place, but at the cost of privacy/self-sufficiency
> as you note.  (Though it’s not really making “privacy” any worse in this
> case: it’s gnu.org vs. github.com.)

Yes, that may not be preferable in general without an override.

> Perhaps we should make content-addressed mirrors configurable in a way
> that’s orthogonal to derivations, something similar in spirit to
> --substitute-urls?  The difficulty is that content-addressed mirrors are
> not just URLs; see (guix download).

Hmm.  I'm not sure what problem we are solving.  Should we only do this
for github(-like) tarballs?  Do we see this problem with other sources,
should we prevent it?  Possibly github will never do something like this
again.  Or we could banish github/gitlab(?) auto-generated tarballs and
go for git checkouts+commits?

janneke

-- 
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 20:30:01 GMT).


Message #42 received at 28659@debbugs.gnu.org:

From: Leo Famulari <leo@famulari.name>
To: Jan Nieuwenhuizen <janneke@gnu.org>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 2 Oct 2017 16:29:07 -0400
On Mon, Oct 02, 2017 at 10:22:33PM +0200, Jan Nieuwenhuizen wrote:
> Hmm.  I'm not sure what problem we are solving.  Should we only do this
> for github(-like) tarballs?  Do we see this problem with other sources,
> should we prevent it?  Possibly github will never do something like this
> again.  Or we could banish github/gitlab(?) auto-generated tarballs and
> go for git checkouts+commits?

Files referenced by URL (location-addressing vs content-addressing) have
been changed in place by a variety of hosters and upstream projects
since I've started paying attention to these issues. I don't think we
need to do anything special regarding GitHub.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 02 Oct 2017 22:48:01 GMT).


Message #45 received at 28659@debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Leo Famulari <leo@famulari.name>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Mon, 02 Oct 2017 18:47:06 -0400
Leo Famulari <leo@famulari.name> writes:

> On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
>> Hi!
>> 
>> Leo Famulari <leo@famulari.name> skribis:
>> 
>> > I contacted GitHub about this issue a few weeks ago and they said that:
>> >
>> > 1) They do not guarantee bit-reproducibility of the snapshots they
>> > generate automatically for each release tag, and they wish that people
>> > would not rely on them as we do. However, since people *are* relying on
>> > them, they are discussing this issue internally.
>> 
>> Oh?!  Then we’re in trouble.
>
> I wonder, are there really that many affected packages?

There's a list here:
https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
of the homebrew project's maintainers.

Maxim




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Tue, 03 Oct 2017 12:31:01 GMT).


Message #48 received at 28659@debbugs.gnu.org:

From: ludo@gnu.org (Ludovic Courtès)
To: Jan Nieuwenhuizen <janneke@gnu.org>
Cc: 28659@debbugs.gnu.org, Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Tue, 03 Oct 2017 14:30:26 +0200
Jan Nieuwenhuizen <janneke@gnu.org> skribis:

> Ludovic Courtès writes:

[...]

>> Perhaps we should make content-addressed mirrors configurable in a way
>> that’s orthogonal to derivations, something similar in spirit to
>> --substitute-urls?  The difficulty is that content-addressed mirrors are
>> not just URLs; see (guix download).
>
> Hmm.  I'm not sure what problem we are solving.  Should we only do this
> for github(-like) tarballs?  Do we see this problem with other sources,
> should we prevent it?  Possibly github will never do something like this
> again.  Or we could banish github/gitlab(?) auto-generated tarballs and
> go for git checkouts+commits?

Content-addressed mirrors help with disappearing and modified tarballs
in general; it’s not just GitHub.

Occasionally we see that problem with tarballs coming from elsewhere:
404 is quite frequent, and in-place modification happens from time to
time (even on ftp.gnu.org…).

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Tue, 03 Oct 2017 12:32:02 GMT).


Message #51 received at 28659@debbugs.gnu.org:

From: ludo@gnu.org (Ludovic Courtès)
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: 28659@debbugs.gnu.org, Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Tue, 03 Oct 2017 14:31:19 +0200
Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:

> Leo Famulari <leo@famulari.name> writes:
>
>> On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
>>> Hi!
>>> 
>>> Leo Famulari <leo@famulari.name> skribis:
>>> 
>>> > I contacted GitHub about this issue a few weeks ago and they said that:
>>> >
>>> > 1) They do not guarantee bit-reproducibility of the snapshots they
>>> > generate automatically for each release tag, and they wish that people
>>> > would not rely on them as we do. However, since people *are* relying on
>>> > them, they are discussing this issue internally.
>>> 
>>> Oh?!  Then we’re in trouble.
>>
>> I wonder, are there really that many affected packages?
>
> There's a list here:
> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
> of the homebrew project's maintainers.

Interesting.  Thanks for the link!

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Tue, 03 Oct 2017 14:25:02 GMT).


Message #54 received at 28659@debbugs.gnu.org:

From: Leo Famulari <leo@famulari.name>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Tue, 3 Oct 2017 10:24:49 -0400
On Mon, Oct 02, 2017 at 06:47:06PM -0400, Maxim Cournoyer wrote:
> Leo Famulari <leo@famulari.name> writes:
> > I wonder, are there really that many affected packages?
> 
> There's a list here:
> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
> of the homebrew project's maintainers.

I meant, how many Guix packages use the auto-generated GitHub snapshots?

I believe the tell-tale sign is that the download link will have the
link text 'Source code', as for this release:

https://github.com/libgit2/libgit2/releases/tag/v0.26.0

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Wed, 04 Oct 2017 04:23:01 GMT).


Message #57 received at 28659@debbugs.gnu.org:

From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Leo Famulari <leo@famulari.name>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Wed, 04 Oct 2017 00:22:34 -0400
[Message part 1 (text/plain, inline)]
Leo Famulari <leo@famulari.name> writes:

> On Mon, Oct 02, 2017 at 06:47:06PM -0400, Maxim Cournoyer wrote:
>> Leo Famulari <leo@famulari.name> writes:
>> > I wonder, are there really that many affected packages?
>> 
>> There's a list here:
>> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
>> of the homebrew project's maintainers.
>
> I meant, how many Guix packages use the auto-generated GitHub snapshots?
>
> I believe the tell-tale sign is that the download link will have the
> link text 'Source code', as for this release:
>
> https://github.com/libgit2/libgit2/releases/tag/v0.26.0

The following script:
[Message part 2 (text/plain, inline)]
;;; A script to find packages possibly affected by GitHub
;;; infrastructure update that caused minor changes in the
;;; automatically generated tarballs.

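(use-modules (ice-9 match)
             (ice-9 regex)              ;for 'string-match'
             (srfi srfi-1)              ;for 'any'
             (gnu packages)
             (guix download)
             (guix packages))

(define (problematic-uri? uri)
  "Return true if URI, a string or a list of strings, points to an
automatically generated GitHub archive."
  (define (contains-github-archive? uri)
    (string-match "github.com/.*/archive/" uri))

  ;; URI can be a string or a list of strings.
  (match uri
    ((uris ...)                         ;match list of strings
     ;; Use 'any' rather than 'filter': 'filter' returns a list, and even
     ;; the empty list counts as true in Scheme.
     (any contains-github-archive? uris))
    ((? string? uri)                    ;match string
     (contains-github-archive? uri))
    (_ #f)))                            ;anything else is not a plain URL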

(define (problematic-github-package? package)
  (let ((source (package-source package)))
    (and (origin? source)
	 (eq? (origin-method source) url-fetch)
	 (problematic-uri? (origin-uri source)))))

(define (problematic-github-packages)
  "List of all the potentially problematic GitHub packages."
  (fold-packages (lambda (p r)
		   (if (problematic-github-package? p)
		       (cons p r)
		       r))
		 '()))
(define (main)
  "Find and print the names of the potentially problematic GitHub packages."
  (let ((packages (problematic-github-packages)))
    (format #t "Number of potentially problematic GitHub packages:~a~%"
	    (length packages))
    (for-each (lambda (p)
		(format #t "~a~%" (package-name p)))
	      packages)))

;;; Run the program.
(main)
[Message part 3 (text/plain, inline)]
outputs that there could be up to 1011 affected packages.

The script checks for a url-fetch uri of the form
"github.com/.*/archive/", which seems to be the one used for the
dynamically generated archives.

Here are the first 10 lines of the output:
--8<---------------cut here---------------start------------->8---
Number of potentially problematic GitHub packages:1011
fdupes
cbatticon
sedsed
cpulimit
autojump
sudo
thermald
progress
dstat
[...]
--8<---------------cut here---------------end--------------->8---

I've checked the first few with for example:
--8<---------------cut here---------------start------------->8---
guix build --source --no-substitutes sedsed
--8<---------------cut here---------------end--------------->8---

and they were OK though.

Maxim

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Wed, 04 Oct 2017 16:55:02 GMT) (full text, mbox, link).


Message #60 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Leo Famulari <leo@famulari.name>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Wed, 4 Oct 2017 12:54:13 -0400
[Message part 1 (text/plain, inline)]
On Wed, Oct 04, 2017 at 12:22:34AM -0400, Maxim Cournoyer wrote:
> Here are the first 10 lines of the output:
> --8<---------------cut here---------------start------------->8---
> Number of potentially problematic GitHub packages:1011
> fdupes
> cbatticon
> sedsed
> cpulimit
> autojump
> sudo

I think the script is buggy; sudo's source is not downloaded from GitHub
as far as I can tell.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Wed, 04 Oct 2017 23:54:01 GMT) (full text, mbox, link).


Message #63 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Leo Famulari <leo@famulari.name>
Cc: Ludovic Courtès <ludo@gnu.org>, 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Wed, 04 Oct 2017 19:53:30 -0400
[Message part 1 (text/plain, inline)]
Leo Famulari <leo@famulari.name> writes:

> On Wed, Oct 04, 2017 at 12:22:34AM -0400, Maxim Cournoyer wrote:
>> Here are the first 10 lines of the output:
>> --8<---------------cut here---------------start------------->8---
>> Number of potentially problematic GitHub packages:1011
>> fdupes
>> cbatticon
>> sedsed
>> cpulimit
>> autojump
>> sudo
>
> I think the script is buggy; sudo's source is not downloaded from GitHub
> as far as I can tell.

Good catch! I was assuming empty lists were falsy, but that's not the
case! I've ensured purely boolean predicates now and it gets the list
down to 650.
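
A minimal sketch of the pitfall described above, assuming plain Guile with (ice-9 regex) and (srfi srfi-1) loaded. The URIs are made up for illustration, and 'github-archive?' is only a stand-in for the predicate in the script:

```scheme
(use-modules (ice-9 regex)   ; string-match, regexp-match?
             (srfi srfi-1))  ; any

(define (github-archive? uri)
  ;; string-match returns a match object or #f; wrap it to get a boolean.
  (regexp-match? (string-match "github.com/.*/archive/" uri)))

;; Hypothetical URI list that contains no GitHub archive URL.
(define uris '("https://example.org/foo-1.0.tar.gz"
               "ftp://example.org/foo-1.0.tar.gz"))

;; 'filter' returns the empty list here.  In Scheme only #f is false,
;; so the empty list counts as true in a conditional.
(define buggy (if (filter github-archive? uris) #t #f))

;; Two ways to get a real boolean instead.
(define fixed (not (null? (filter github-archive? uris))))
(define fixed-any (any github-archive? uris))

(format #t "buggy: ~a, fixed: ~a, any: ~a~%" buggy fixed fixed-any)
;; prints "buggy: #t, fixed: #f, any: #f"
```

With 'any' the scan also stops at the first matching URI instead of building an intermediate list.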

Here's the corrected script:
[Message part 2 (text/plain, inline)]
;;; A script to find packages possibly affected by GitHub
;;; infrastructure update that caused minor changes in the
;;; automatically generated tarballs.

(use-modules (ice-9 match)
	     (gnu packages)
	     (guix download)
	     (guix packages))

(define (problematic-uri? uri)

  (define (contains-github-archive? uri)
    (regexp-match? (string-match "github.com/.*/archive/" uri)))

  ;; URI can be a string or a list of string.
  (match uri
    ((uri1 uri2 ...)			;match list of strings
     (not (null? (filter contains-github-archive? uri))))
    (uri1				;match string
     (contains-github-archive? uri1))))

(define (problematic-github-package? package)
  (let ((source (package-source package)))
    (and (origin? source)
	 (eq? (origin-method source) url-fetch)
	 (problematic-uri? (origin-uri source)))))

(define (problematic-github-packages)
  "List of all the potentially problematic GitHub packages."
  (fold-packages (lambda (p r)
		   (if (problematic-github-package? p)
		       (cons p r)
		       r))
		 '()))
(define (main)
  "Find and print the names of the potentially problematic GitHub packages."
  (let ((packages (problematic-github-packages)))
    (format #t "Number of potentially problematic GitHub packages: ~a~%"
	    (length packages))
    (for-each (lambda (p)
		(format #t "~a~%" (package-name p)))
	      packages)))

;;; Run the program.
(main)
[Message part 3 (text/plain, inline)]
And sample output:
--8<---------------cut here---------------start------------->8---
Number of potentially problematic GitHub packages: 650
fdupes
cbatticon
cpulimit
thefuck
thermald
neofetch
autojump
progress
nnn
[...]
wxwidgets
xclip
xcape
sxhkd
maim
slop
tinyxml2
xlsx2csv
--8<---------------cut here---------------end--------------->8---

Maxim

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Thu, 05 Oct 2017 04:54:02 GMT) (full text, mbox, link).


Message #66 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Maxim Cournoyer <maxim.cournoyer@gmail.com>
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Thu, 05 Oct 2017 00:52:57 -0400
I've modified the script to sort the packages it prints:
--8<---------------cut here---------------start------------->8---
-    (for-each (lambda (p)
-		(format #t "~a~%" (package-name p)))
-	      packages)))
+    (for-each (lambda (name)
+		(format #t "~a~%" name))
+	      (sort (map package-name packages) string<?))))
--8<---------------cut here---------------end--------------->8---

and compared it to the list here: https://github.com/Homebrew/homebrew-core/issues/18044

If we can trust the Homebrew list to be extensive, it seems we got
lucky; there's only one affected package that we share which is
yaml-cpp. Here's how it fails on our side:

--8<---------------cut here---------------start------------->8---
guix build -S --no-substitutes yaml-cpp
The following derivation will be built:
   /gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv
@ build-started /gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv - x86_64-linux /var/log/guix/drvs/ml//ap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv.bz2

Starting download of /gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz
From https://github.com/jbeder/yaml-cpp/archive/yaml-cpp-0.5.3.tar.gz...
following redirection to `https://codeload.github.com/jbeder/yaml-cpp/tar.gz/yaml-cpp-0.5.3'...
 ...p-0.5.3                                  1.7MiB/s 00:01 | 1.9MiB transferred
sha256 hash mismatch for output path `/gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz'
  expected: 1vk6pjh0f5k6jwk2sszb9z5169whmiha9ainbdpa1arxlkq7v3b6
  actual:   1ck7jk0wjfigrf4cgcjqsir4yp1s6vamhhxhpsgfvs46pgm5pk6y
@ build-failed /gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv - 1 sha256 hash mismatch for output path `/gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz'
  expected: 1vk6pjh0f5k6jwk2sszb9z5169whmiha9ainbdpa1arxlkq7v3b6
  actual:   1ck7jk0wjfigrf4cgcjqsir4yp1s6vamhhxhpsgfvs46pgm5pk6y
guix build: error: build failed: build of
`/gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv'
failed
--8<---------------cut here---------------end--------------->8---

Maxim




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Thu, 05 Oct 2017 06:09:01 GMT) (full text, mbox, link).


Message #69 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Jan Nieuwenhuizen <janneke@gnu.org>
To: Maxim Cournoyer <maxim.cournoyer@gmail.com>
Cc: 28659@debbugs.gnu.org, Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Thu, 05 Oct 2017 08:08:06 +0200
Maxim Cournoyer writes:

> If we can trust the Homebrew list to be extensive, it seems we got
> lucky; there's only one affected package that we share which is
> yaml-cpp. Here's how it fails on our side:

I needed to also use (ice-9 regex) and then I found these to fail

    antlr3
    csound
    erlang
    font-google-material-design-icons
    fritzing
    libgit2
    lxqt-common
    ogre
    plexus-interpolation
    red-eclipse
    yaml-cpp

out of 646 packages it's not many but it includes our core dependency
libgit2 which breaks guix pull --no-substitutes; that's hardly being
lucky?

janneke

-- 
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Fri, 20 Oct 2017 21:18:02 GMT) (full text, mbox, link).


Message #72 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Leo Famulari <leo@famulari.name>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Fri, 20 Oct 2017 17:17:00 -0400
[Message part 1 (text/plain, inline)]
On Mon, Oct 02, 2017 at 10:00:33PM +0200, Ludovic Courtès wrote:
> Right.  Jan suggested checking the content-addressed mirrors *before*
> the real upstream address.  That would address the problem of upstream
> sources modified in-place, but at the cost of privacy/self-sufficiency
> as you note.  (Though it’s not really making “privacy” any worse in this
> case: it’s gnu.org vs. github.com.)

Yeah, I don't personally think there is a privacy issue with fetching
sources from our mirrors at gnu.org, or other domains we control.

> Perhaps we should make content-addressed mirrors configurable in a way
> that’s orthogonal to derivations, something similar in spirit to
> --substitute-urls?  The difficulty is that content-addressed mirrors are
> not just URLs; see (guix download).
>
> Thoughts?

I do think we should make it so that users don't suffer from unreliable
upstream sources when we know the sources are available on our servers
(or the Nix mirror), even with --no-substitutes.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Tue, 28 Nov 2017 13:32:01 GMT) (full text, mbox, link).


Message #75 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Tue, 28 Nov 2017 14:30:59 +0100
[Message part 1 (text/plain, inline)]
Leo Famulari <leo@famulari.name> skribis:

> On Mon, Oct 02, 2017 at 10:00:33PM +0200, Ludovic Courtès wrote:
>> Right.  Jan suggested checking the content-addressed mirrors *before*
>> the real upstream address.  That would address the problem of upstream
>> sources modified in-place, but at the cost of privacy/self-sufficiency
>> as you note.  (Though it’s not really making “privacy” any worse in this
>> case: it’s gnu.org vs. github.com.)
>
> Yeah, I don't personally think there is a privacy issue with fetching
> sources from our mirrors at gnu.org, or other domains we control.
>
>> Perhaps we should make content-addressed mirrors configurable in a way
>> that’s orthogonal to derivations, something similar in spirit to
>> --substitute-urls?  The difficulty is that content-addressed mirrors are
>> not just URLs; see (guix download).
>>
>> Thoughts?
>
> I do think we should make it so that users don't suffer from unreliable
> upstream sources when we know the sources are available on our servers
> (or the Nix mirror), even with --no-substitutes.

The more I think about it, the more I’m inclined to simply move
content-addressed mirrors to the front of the list.  This means that
users, in practice, would be fetching all the source from
mirror.hydra.gnu.org.

The main issue is making it configurable.  Currently the
content-addressed mirror configuration for regular files in (guix
download) looks like this:

--8<---------------cut here---------------start------------->8---
(define %content-addressed-mirrors
  ;; List of content-addressed mirrors.  Each mirror is represented as a
  ;; procedure that takes a file name, an algorithm (symbol) and a hash
  ;; (bytevector), and returns a URL or #f.
  ;; Note: Avoid 'https' to mitigate <http://bugs.gnu.org/22774>.
  ;; TODO: Add more.
  '(list (lambda (file algo hash)
           ;; Files served by 'guix publish' are accessible under a single
           ;; hash algorithm.
           (string-append "http://mirror.hydra.gnu.org/file/"
                          file "/" (symbol->string algo) "/"
                          (bytevector->nix-base32-string hash)))
         (lambda (file algo hash)
           ;; 'tarballs.nixos.org' supports several algorithms.
           (string-append "http://tarballs.nixos.org/"
                          (symbol->string algo) "/"
                          (bytevector->nix-base32-string hash)))))
--8<---------------cut here---------------end--------------->8---

That for VCS checkouts in (guix build download-nar) looks like this:

--8<---------------cut here---------------start------------->8---
(define (urls-for-item item)
  "Return the fallback nar URL for ITEM--e.g.,
\"/gnu/store/cabbag3…-foo-1.2-checkout\"."
  ;; Here we hard-code nar URLs without checking narinfos.  That's probably OK
  ;; though.
  ;; TODO: Use HTTPS?  The downside is the extra dependency.
  (let ((bases '("http://mirror.hydra.gnu.org/guix"
                 "http://berlin.guixsd.org"))
        (item  (basename item)))
    (append (map (cut string-append <> "/nar/gzip/" item) bases)
            (map (cut string-append <> "/nar/" item) bases))))
--8<---------------cut here---------------end--------------->8---

The latter could be expressed by a command-line flag.  In fact it’s the
same as --substitute-urls.
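
To make the "mirrors at the front of the list" idea concrete, here is a rough sketch in plain Guile. It is not the code in (guix download): 'candidate-urls' and '%mirrors' are hypothetical names, and the hash is assumed to be passed as an already-encoded nix-base32 string rather than a bytevector. The file name, hash and upstream URL are the yaml-cpp ones quoted earlier in this thread.

```scheme
(define %mirrors
  ;; Same shape as the procedures quoted above, except that HASH32 is
  ;; assumed to be the nix-base32 string, not a bytevector.
  (list (lambda (file algo hash32)
          (string-append "http://mirror.hydra.gnu.org/file/"
                         file "/" (symbol->string algo) "/" hash32))
        (lambda (file algo hash32)
          (string-append "http://tarballs.nixos.org/"
                         (symbol->string algo) "/" hash32))))

(define (candidate-urls file algo hash32 upstream-urls)
  "Return the URLs to try, in order: mirrors first, then UPSTREAM-URLS."
  (append (map (lambda (mirror) (mirror file algo hash32)) %mirrors)
          upstream-urls))

;; Print the download order for one source tarball.
(for-each (lambda (url) (format #t "~a~%" url))
          (candidate-urls
           "yaml-cpp-0.5.3.tar.gz" 'sha256
           "1vk6pjh0f5k6jwk2sszb9z5169whmiha9ainbdpa1arxlkq7v3b6"
           '("https://github.com/jbeder/yaml-cpp/archive/yaml-cpp-0.5.3.tar.gz")))
```

The sketch only builds URLs; whether a given mirror actually serves the file is a separate question. Because the result is checked against the expected hash anyway, the order affects availability and privacy, not integrity.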

(Time passes…)

Thinking more about it, why not simply always enable substitutes for
fixed-output derivations, like this:

[Message part 2 (text/x-patch, inline)]
diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
index d68e8b2bc..03a8f5080 100644
--- a/nix/libstore/build.cc
+++ b/nix/libstore/build.cc
@@ -1034,8 +1034,10 @@ void DerivationGoal::haveDerivation()
 
     /* We are first going to try to create the invalid output paths
        through substitutes.  If that doesn't work, we'll build
-       them. */
-    if (settings.useSubstitutes && substitutesAllowed(drv))
+       them.  Always enable substitutes for fixed-output derivations to
+       protect against disappearing files and in-place modifications on
+       upstream sites.  */
+    if ((fixedOutput || settings.useSubstitutes) && substitutesAllowed(drv))
         foreach (PathSet::iterator, i, invalidOutputs)
             addWaitee(worker.makeSubstitutionGoal(*i, buildMode == bmRepair));
 
[Message part 3 (text/plain, inline)]
This solves all our problems and makes download-nar.scm useless.

As an added bonus, it improves the UI since we now always
see:

--8<---------------cut here---------------start------------->8---
0.1 MB will be downloaded:
   /gnu/store/plx9848n6waj6zghn3d54ybx8ihcn23k-guile-git-0.0-4.951a32c-checkout
--8<---------------cut here---------------end--------------->8---

… instead of:

--8<---------------cut here---------------start------------->8---
The following derivation will be built:
   /gnu/store/y86rlb6pdm35im7q02y6479ca84zwylz-guile-git-0.0-4.951a32c-checkout.drv
--8<---------------cut here---------------end--------------->8---

The downside is that it still requires one to authorize the server’s
key, although it’s in theory unnecessary since it’s content addressed.
I’m not sure how to solve that because ‘guix substitute’ doesn’t know
that it’s substituting a fixed-output derivation.  I suppose we’d need
to modify the “protocol” between guix-daemon and ‘guix substitute’.

Thoughts?

Ludo’.

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Thu, 14 Dec 2017 16:54:02 GMT) (full text, mbox, link).


Message #78 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail
Date: Thu, 14 Dec 2017 17:53:37 +0100
ludo@gnu.org (Ludovic Courtès) skribis:

> Thinking more about it, why not simply always enable substitutes for
> fixed-output derivations, like this:
>
> diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
> index d68e8b2bc..03a8f5080 100644
> --- a/nix/libstore/build.cc
> +++ b/nix/libstore/build.cc
> @@ -1034,8 +1034,10 @@ void DerivationGoal::haveDerivation()
>  
>      /* We are first going to try to create the invalid output paths
>         through substitutes.  If that doesn't work, we'll build
> -       them. */
> -    if (settings.useSubstitutes && substitutesAllowed(drv))
> +       them.  Always enable substitutes for fixed-output derivations to
> +       protect against disappearing files and in-place modifications on
> +       upstream sites.  */
> +    if ((fixedOutput || settings.useSubstitutes) && substitutesAllowed(drv))
>          foreach (PathSet::iterator, i, invalidOutputs)
>              addWaitee(worker.makeSubstitutionGoal(*i, buildMode == bmRepair));

[...]

> The downside is that it still requires one to authorize the server’s
> key, although it’s in theory unnecessary since it’s content addressed.
> I’m not sure how to solve that because ‘guix substitute’ doesn’t know
> that it’s substituting a fixed-output derivation.  I suppose we’d need
> to modify the “protocol” between guix-daemon and ‘guix substitute’.

I looked at how to address this by having ‘guix substitute’
automatically determine whether it’s being asked for a content-addressed
item or not.  The guts of it is this procedure:

  (define* (content-addressed-item? item hash
                                    #:key (hash-algo 'sha256))
    "Return true if ITEM, a store file name, is definitely a content-addressed
  item (result of a fixed-output derivation) with the given HASH of type
  HASH-ALGO, false otherwise.

  Note: This procedure is useful when the deriver of ITEM is unknown.  In other
  cases, the recommended approach is to check 'fixed-output-derivation?' on the
  deriver."
    ;; XXX: This returns #f for "text" items produced by 'add-text-to-store'.
    ;; There's not much we can do because the file name for these is a function
    ;; of their content.
    (let ((name (store-path-package-name item)))
      (or (string=? item (fixed-output-path name hash #:recursive? #f
                                            #:hash-algo hash-algo))
          (string=? item (fixed-output-path name hash #:recursive? #t
                                            #:hash-algo hash-algo)))))

It works as expected for the result of “recursive fixed-output
derivations”—i.e., fixed-output derivations that produce a directory,
such as VCS checkouts.

However, it doesn’t work for fixed-output derivations that produce a flat
file, such as origins with the ‘url-fetch’ method.  The reason is that,
in the case of non-recursive derivations, the store file name is
computed as a function of the file hash, not as a function of the nar
hash, whereas narinfos only contain the nar hash (the thing that ‘guix
hash -r’ computes).

So I think we have to communicate more info from the daemon to ‘guix
substitute’.

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Fri, 15 Dec 2017 09:31:02 GMT) (full text, mbox, link).


Message #81 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: ludo@gnu.org (Ludovic Courtès)
To: Leo Famulari <leo@famulari.name>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Always enable substitutes for fixed-output derivations
Date: Fri, 15 Dec 2017 10:30:39 +0100
[Message part 1 (text/plain, inline)]
ludo@gnu.org (Ludovic Courtès) skribis:

> So I think we have to communicate more info from the daemon to ‘guix
> substitute’.

The attached patch addresses that by simply calling out to the daemon to
determine whether we’re dealing with a content-addressed item.

To summarize, the new behavior is that substitutes are always enabled
for fixed-output derivations.  That way, people willing to build
everything from source can still use ‘--no-substitutes’ and yet be able
to retrieve source code without being penalized compared to someone
enabling substitutes wholesale.

Of course, when substitutes are missing, we fall back to regular
downloads or VCS checkouts.  It is also still possible to choose where
substitutes are downloaded from, using ‘--substitute-urls’, or even to
pass an empty list of URLs.
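
The decision rule can be sketched in a few lines of Guile. This only mirrors the condition from the daemon diff posted earlier in this thread; 'try-substitutes?' is a hypothetical name, and the attached patches may differ in detail:

```scheme
(define (try-substitutes? fixed-output? use-substitutes? substitutes-allowed?)
  "Return true if substitution should be attempted before building."
  (and (or fixed-output? use-substitutes?)
       substitutes-allowed?))

;; With '--no-substitutes', USE-SUBSTITUTES? is #f: only fixed-output
;; derivations (source downloads, VCS checkouts) still go through
;; substitution first.
(for-each (lambda (fixed-output?)
            (format #t "fixed-output: ~a -> try substitutes: ~a~%"
                    fixed-output?
                    (try-substitutes? fixed-output? #f #t)))
          '(#t #f))
```

Regular derivations are unaffected: with substitutes disabled they are still built locally.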

Feedback welcome!

Ludo’.

[0001-substitute-Always-allow-substitutes-for-fixed-output.patch (text/x-patch, attachment)]
[0002-Revert-download-Download-a-nar-when-a-VCS-checkout-f.patch (text/x-patch, attachment)]

Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Fri, 14 Feb 2020 21:35:01 GMT) (full text, mbox, link).


Message #84 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Ludovic Courtès <ludo@gnu.org>
To: Jan Nieuwenhuizen <janneke@gnu.org>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, zimoun <zimon.toutoune@gmail.com>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Fri, 14 Feb 2020 22:34:13 +0100
Jan Nieuwenhuizen <janneke@gnu.org> skribis:

> Ludovic Courtès writes:

[...]

>> The problem here is really that we fall back to content-addressed
>> mirrors instead of using them directly:
>>
>>   https://issues.guix.gnu.org/issue/28659
>
> Wait, what happened here; you finally proposed a patch two years ago and
> nothing happened/we all forgot to follow up?

I think we forgot, indeed.

One thing I don’t quite like about the patch is the fact that ‘guix
substitutes’ connects to the daemon in ‘content-addressed-item?’.

Also, one could argue that we’d steer users towards downloading from our
server, which could be a privacy concern (probably not a strong argument
since one can easily change the substitute URLs.)

Thoughts?

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sat, 15 Feb 2020 15:45:02 GMT) (full text, mbox, link).


Message #87 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: zimoun <zimon.toutoune@gmail.com>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Sat, 15 Feb 2020 16:43:46 +0100
Hi,

On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:

> Also, one could argue that we’d steer users towards downloading from our
> server, which could be a privacy concern (probably not a strong argument
> since one can easily change the substitute URLs.)

I am not following the privacy concern.
What do you mean?

Cheers,
simon




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Sun, 16 Feb 2020 11:00:01 GMT) (full text, mbox, link).


Message #90 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Ludovic Courtès <ludo@gnu.org>
To: zimoun <zimon.toutoune@gmail.com>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Sun, 16 Feb 2020 11:59:01 +0100
Hi!

zimoun <zimon.toutoune@gmail.com> skribis:

> On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Also, one could argue that we’d steer users towards downloading from our
>> server, which could be a privacy concern (probably not a strong argument
>> since one can easily change the substitute URLs.)
>
> I am not following the privacy concern.
> What do you mean?

I mean that by default, someone who’s disabled substitutes (presumably
out of security or privacy concerns) would find themself downloading
source code from ci.guix.gnu.org instead of various upstream sites.

Ludo’.





Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 17 Feb 2020 10:19:02 GMT) (full text, mbox, link).


Message #93 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: zimoun <zimon.toutoune@gmail.com>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Mon, 17 Feb 2020 11:18:22 +0100
Hi Ludo,

On Sun, 16 Feb 2020 at 11:59, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
> > On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:

> >> Also, one could argue that we’d steer users towards downloading from our
> >> server, which could be a privacy concern (probably not a strong argument
> >> since one can easily change the substitute URLs.)
> >
> > I am not following the privacy concern.
> > What do you mean?
>
> I mean that by default, someone who’s disabled substitutes (presumably
> out of security or privacy concerns) would find themself downloading
> source code from ci.guix.gnu.org instead of various upstream sites.

I do not see the difference between mirroring and traveling back in
time with missing upstream sources.
And because it is content-addressed, it seems even more secure than
downloading from an upstream URL, IMHO.
If one trusts Guix, then an attacker needs to corrupt, at the same time,
both the Guix history and Berlin (and/or any other farm).
If one does not trust Guix, why do they use the recipes coming from
Guix? To be precise, this person would have to check all the recipes of
all the dependencies.

Well, I do not see a security concern because we are talking about
serving the sources.
It is another story when the substitutes serve the results of the
build (binaries); because one does not have any strong guarantee that
the substitute serves the expected binaries.

By privacy concern, do you mean that Guix could collect who downloads
what; in a central fashion? Which is not the case when one downloads
from several distributed upstream sources. Right?
Well, I am not convinced because the case of a missing upstream source
is rare. And it is easy to protect against such data collection.
In paranoid mode, traveling back in time is becoming difficult because
of the reliability of the sources; I mean if the sources were
reliable, SWH would not exist. ;-) The solution should be an IPFS /
GNUnet / full distributed archive... which is not ready... yet! :-)


Well, maybe for the TODO list of the time-machine: add an option to
allow substitutes *only* for the sources (substitutes meaning
ci.guix.gnu.org and/or SWH). If this option does not exist yet. ;-)


Cheers,
simon




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 17 Feb 2020 14:41:02 GMT) (full text, mbox, link).


Message #96 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Ludovic Courtès <ludo@gnu.org>
To: zimoun <zimon.toutoune@gmail.com>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Mon, 17 Feb 2020 15:40:13 +0100
Hi,

zimoun <zimon.toutoune@gmail.com> skribis:

> On Sun, 16 Feb 2020 at 11:59, Ludovic Courtès <ludo@gnu.org> wrote:
>> zimoun <zimon.toutoune@gmail.com> skribis:
>> > On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> >> Also, one could argue that we’d steer users towards downloading from our
>> >> server, which could be a privacy concern (probably not a strong argument
>> >> since one can easily change the substitute URLs.)
>> >
>> > I am not following the privacy concern.
>> > What do you mean?
>>
>> I mean that by default, someone who’s disabled substitutes (presumably
>> out of security or privacy concerns) would find themself downloading
>> source code from ci.guix.gnu.org instead of various upstream sites.

[...]

> By privacy concern, do you mean that Guix could collect who downloads
> what; in a central fashion? Which is not the case when one downloads
> from several distributed upstream sources. Right?

Exactly.  But like I wrote above, I don’t think it’s a strong argument.

What remains is the issue with ‘content-addressed-item?’, then.

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Mon, 17 Feb 2020 15:05:02 GMT) (full text, mbox, link).


Message #99 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: zimoun <zimon.toutoune@gmail.com>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified in-place
Date: Mon, 17 Feb 2020 16:04:09 +0100
On Mon, 17 Feb 2020 at 15:40, Ludovic Courtès <ludo@gnu.org> wrote:

> Exactly.  But like I wrote above, I don’t think it’s a strong argument.

I agree, and the big picture depends on the audience.
Scientific communities would be fine with centralized archives such as
SWH. And only centralized archives, IMHO, can provide reliable "long
term" support, which is the point for those communities. (In quotes
because it is not clearly defined what that means. :-))
Other communities would prefer a distributed archive such as IPFS or
GNUnet, but 1. it still needs some work and 2. the "long term" is not
guaranteed by nature, IMHO. But that is probably not an issue for those
communities.


> What remains is the issue with ‘content-addressed-item?’, then.

I agree.
The bridge with SWH is in good shape, IMHO.
And the pending IPFS patch would deserve more love. :-) Maybe soon...



Cheers,
simon




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Wed, 09 Sep 2020 14:32:02 GMT) (full text, mbox, link).


Message #102 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: zimoun <zimon.toutoune@gmail.com>
To: Ludovic Courtès <ludo@gnu.org>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: Content-addressed mirror is not used upon invalid hash
Date: Wed, 09 Sep 2020 16:31:27 +0200
Hi,

On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:

> One thing I don’t quite like about the patch is the fact that ‘guix
> substitutes’ connects to the daemon in ‘content-addressed-item?’.

What is the status of this patch [1] following the recent discussion about
tar “disarchive” and SWH?

Related:
 - http://issues.guix.gnu.org/issue/39575
 - http://issues.guix.gnu.org/42162
 - https://git.ngyro.com/disarchive/
 
All the best,
simon

[1] http://issues.guix.gnu.org/issue/28659#26




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Thu, 10 Sep 2020 08:16:02 GMT) (full text, mbox, link).


Message #105 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: Ludovic Courtès <ludo@gnu.org>
To: zimoun <zimon.toutoune@gmail.com>
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: Content-addressed mirror is not used upon invalid hash
Date: Thu, 10 Sep 2020 10:14:44 +0200
Hello,

zimoun <zimon.toutoune@gmail.com> skribis:

> On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> One thing I don’t quite like about the patch is the fact that ‘guix
>> substitutes’ connects to the daemon in ‘content-addressed-item?’.
>
> What is the status of this patch [1] following the recent discussion about
> tar “disarchive” and SWH?
>
> Related:
>  - http://issues.guix.gnu.org/issue/39575
>  - http://issues.guix.gnu.org/42162
>  - https://git.ngyro.com/disarchive/

Thanks for the reminder.  I don’t think Timothy’s work changes anything
wrt. this issue: it would still need to be addressed.

Ludo’.




Information forwarded to bug-guix@gnu.org:
bug#28659; Package guix. (Thu, 03 Feb 2022 03:02:01 GMT) (full text, mbox, link).


Message #108 received at 28659@debbugs.gnu.org (full text, mbox, reply):

From: zimoun <zimon.toutoune@gmail.com>
To: ludo@gnu.org (Ludovic Courtès)
Cc: Jan Nieuwenhuizen <janneke@gnu.org>, 28659@debbugs.gnu.org, Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: Content-addressed mirror is not used upon invalid hash
Date: Thu, 03 Feb 2022 03:58:26 +0100
Hi Ludo,

On Fri, 15 Dec 2017 at 10:30, ludo@gnu.org (Ludovic Courtès) wrote:

>> So I think we have to communicate more info from the daemon to ‘guix
>> substitute’.
>
> The attached patch addresses that by simply calling out to the daemon to
> determine whether we’re dealing with a content-addressed item.

WDYT about rebasing this patch [1] and resubmitting it to guix-patches in
order to get more attention, and so potential feedback and/or review?

1: <https://issues.guix.gnu.org/issue/28659#26>


Cheers,
simon




Merged 28659 70588. Request was from Ludovic Courtès <ludo@gnu.org> to control@debbugs.gnu.org. (Wed, 01 May 2024 10:37:04 GMT) (full text, mbox, link).


debbugs.gnu.org maintainers <help-debbugs@gnu.org>. Last modified: Sun Sep 8 03:48:38 2024; Machine Name: wallace-server
