Report forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Sun, 01 Oct 2017 10:17:02 GMT).
Acknowledgement sent
to Jan Nieuwenhuizen <janneke@gnu.org>:
New bug report received and forwarded. Copy sent to bug-guix@gnu.org.
(Sun, 01 Oct 2017 10:17:02 GMT).
Jan Nieuwenhuizen writes:
The changing of the libgit2-0.26.0 checksum was already reported about 3
weeks ago (github seems to only show relative dates)
https://github.com/libgit2/libgit2/issues/4343
and the bug is still open. It seems to be a github thing. As I
understand it, currently our options are to update the hash and pray it
won't happen again or host libgit2 tarballs ourselves.
--
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Sun, 01 Oct 2017 20:44:02 GMT).
On Sun, Oct 01, 2017 at 09:20:42PM +0200, Jan Nieuwenhuizen wrote:
> Jan Nieuwenhuizen writes:
>
> The changing of the libgit2-0.26.0 checksum was already reported about 3
> weeks ago (github seems to only show relative dates)
>
> https://github.com/libgit2/libgit2/issues/4343
>
> and the bug is still open. It seems to be a github thing. As I
> understand it, currently our options are to update the hash and pray it
> won't happen again or host libgit2 tarballs ourselves.
I contacted GitHub about this issue a few weeks ago and they said that:
1) They do not guarantee bit-reproducibility of the snapshots they
generate automatically for each release tag, and they wish that people
would not rely on them as we do. However, since people *are* relying on
them, they are discussing this issue internally.
2) This is the relevant code change:
https://git.kernel.org/pub/scm/git/git.git/commit/?id=22f0dcd9634a818a0c83f23ea1a48f2d620c0546
In the meantime, we can add this to the list of reasons that
reproducibility is difficult in the long term.
I don't have any solutions in mind besides keeping substitutes available
for as long as possible and, for users, using substitutes. We might also
petition upstream projects to offer a "real" release tarball.
Leo Famulari transcribed 2.3K bytes:
> On Sun, Oct 01, 2017 at 09:20:42PM +0200, Jan Nieuwenhuizen wrote:
> > Jan Nieuwenhuizen writes:
> >
> > The changing of the libgit2-0.26.0 checksum was already reported about 3
> > weeks ago (github seems to only show relative dates)
> >
> > https://github.com/libgit2/libgit2/issues/4343
> >
> > and the bug is still open. It seems to be a github thing. As I
> > understand it, currently our options are to update the hash and pray it
> > won't happen again or host libgit2 tarballs ourselves.
>
> I contacted GitHub about this issue a few weeks ago and they said that:
>
> 1) They do not guarantee bit-reproducibility of the snapshots they
> generate automatically for each release tag, and they wish that people
> would not rely on them as we do. However, since people *are* relying on
> them, they are discussing this issue internally.
> 2) This is the relevant code change:
> https://git.kernel.org/pub/scm/git/git.git/commit/?id=22f0dcd9634a818a0c83f23ea1a48f2d620c0546
>
> In the meantime, we can add this to the list of reasons that
> reproducibility is difficult in the long term.
>
> I don't have any solutions in mind besides keeping substitutes available
> for as long as possible and, for users, using substitutes. We might also
> petition upstream projects to offer a "real" release tarball.
Given that we depend on this for our core functionality,
can't we just keep this on our ftp directory at gnu.org
as a fall-back source in a list?
--
ng0
GnuPG: A88C8ADD129828D7EAC02E52E22F9BBFEE348588
GnuPG: https://krosos.org/dist/keys/
https://www.infotropique.org https://krosos.org
Hi!
Leo Famulari <leo@famulari.name> skribis:
> I contacted GitHub about this issue a few weeks ago and they said that:
>
> 1) They do not guarantee bit-reproducibility of the snapshots they
> generate automatically for each release tag, and they wish that people
> would not rely on them as we do. However, since people *are* relying on
> them, they are discussing this issue internally.
Oh?! Then we’re in trouble.
Perhaps we should start using ‘git-fetch’ more, with Software Heritage
as a fallback content-addressed mirror? Though again the difficulty is
that SWH uses Git’s method to hash directory contents, so we’d end up
having to provide both a Nix hash and a Git hash in ‘origin’. :-/
> In the meantime, we can add this to the list of reasons that
> reproducibility is difficult in the long term.
Heh.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 02 Oct 2017 15:11:02 GMT).
Hello,
Jan Nieuwenhuizen <janneke@gnu.org> skribis:
> As reported by laertus on irc[0]: guix pull on 0.13 without substitutes fails
I just checked and we do have substitutes, but I understand it doesn’t
help here.
> guix pull
>
> Starting download of /tmp/guix-file.3r6cH0
> From https://git.savannah.gnu.org/cgit/guix.git/snapshot/master.tar.gz...
> ….tar.gz 5.7MiB/s 00:02 | 13.6MiB transferred
> unpacking '/gnu/store/sginfwnrcfqn1far31gmzlaffd8xlxyy-guix-latest.tar.gz'...
>
> Starting download of /gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz
> From https://github.com/libgit2/libgit2/archive/v0.25.1.tar.gz...
> following redirection to `https://codeload.github.com/libgit2/libgit2/tar.gz/v0.25.1'...
> v0.25.1 6.1MiB/s 00:01 | 4.1MiB transferred
> output path `/gnu/store/c3npgqn9ag2ypi9bda1g779wwwlcqqrf-libgit2-0.25.1.tar.gz' should have sha256 hash `1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s', instead has `0ywcxw1mwd56c8qc14hbx31bf198gxck3nja3laxyglv7l57qp26'
What’s sad here is that we do have the right tarball at:
https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s
The problem is that the hash check is performed by guix-daemon itself,
not by “guix perform-download”. So when guix-daemon diagnoses a hash
mismatch, it’s too late and we cannot try again and use the
content-addressed mirror.
A crude but helpful fix would be to have perform-download compute the
hash by itself and act accordingly. It’s crude because that means that
we’d be computing the hash twice: once in ‘guix perform-download’ and a
second time in guix-daemon. For archives below ~20 MiB it’s probably OK
though.
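To make the proposed fallback concrete, here is a minimal sketch. It is written in Python rather than Guile and is not the actual 'guix perform-download' code: the function name `fetch_verified` and the in-memory "sources" are hypothetical stand-ins. The only point it illustrates is that if the downloader hashes the file itself, a mismatch can move on to the next source (for instance a content-addressed mirror) instead of failing the build.

```python
import hashlib
from typing import Callable, Iterable, Optional


def fetch_verified(sources: Iterable[Callable[[], bytes]],
                   expected_sha256_hex: str) -> Optional[bytes]:
    """Return the first payload whose SHA-256 matches, or None if none does."""
    for fetch in sources:
        try:
            data = fetch()
        except OSError:
            # A failed download (404, timeout, ...) is handled like a
            # mismatch: just try the next source.
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256_hex:
            return data
        # Hash mismatch: the file was probably modified in place upstream.
    return None


# Hypothetical stand-ins for an upstream URL and a content-addressed mirror.
good = b"original tarball bytes"
expected = hashlib.sha256(good).hexdigest()


def upstream() -> bytes:
    return b"regenerated tarball bytes"  # changed in place upstream


def mirror() -> bytes:
    return good  # the mirror still serves the original bytes


result = fetch_verified([upstream, mirror], expected)
```

As noted above, the daemon would still check the hash a second time; the sketch only covers the downloader's side.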
Thoughts?
In the future, with the daemon written in Guile, it’s one area where we
could achieve better integration and coordination among the various
pieces.
Ludo’.
Changed bug title to 'Content-addressed mirror is not used upon invalid hash' from 'v0.13: guix pull fails; libgit2-0.26.0 and 0.25.1 content hashes fail'
Request was from ludo@gnu.org (Ludovic Courtès)
to control@debbugs.gnu.org.
(Mon, 02 Oct 2017 15:17:02 GMT).
Severity set to 'important' from 'normal'
Request was from ludo@gnu.org (Ludovic Courtès)
to control@debbugs.gnu.org.
(Mon, 02 Oct 2017 15:17:03 GMT).
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 02 Oct 2017 17:06:02 GMT).
Ludovic Courtès writes:
> What’s sad here is that we do have the right tarball at:
>
> https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s
Sad indeed!
> The problem is that the hash check is performed by guix-daemon itself,
> not by “guix perform-download”. So when guix-daemon diagnoses a hash
> mismatch, it’s too late and we cannot try again and use the
> content-addressed mirror.
Why don't we try our content-addressed mirror first?
> A crude but helpful fix would be to have perform-download compute the
> hash by itself and act accordingly. It’s crude because that means that
> we’d be computing the hash twice: once in ‘guix perform-download’ and a
> second time in guix-daemon. For archives below ~20 MiB it’s probably OK
> though.
>
> Thoughts?
We may want more guix hackers' viewpoints here, I don't feel very
qualified...As this would be a temporary workaround only until we have
> In the future, with the daemon written in Guile, it’s one area where we
> could achieve better integration and coordination among the various
> pieces.
...it might be fine?
Do we want/need to bring out a new release for this, e.g. 0.13.1, or
even 0.14? I'm not sure how bad it is that --no-substitutes does not
work. I think working on guix pull to not compile everything locally
may have priority?
janneke
--
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 02 Oct 2017 18:20:01 GMT).
On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
> Hi!
>
> Leo Famulari <leo@famulari.name> skribis:
>
> > I contacted GitHub about this issue a few weeks ago and they said that:
> >
> > 1) They do not guarantee bit-reproducibility of the snapshots they
> > generate automatically for each release tag, and they wish that people
> > would not rely on them as we do. However, since people *are* relying on
> > them, they are discussing this issue internally.
>
> Oh?! Then we’re in trouble.
I wonder, are there really that many affected packages? My sense is that
most GitHub-hosted projects offer their own release tarballs in addition
to the problematic auto-generated snapshots, and we tend to prefer the
upstream-provided tarballs in this case.
We'd need to survey our package sources to know what sort of reaction is
most appropriate.
In general, we should try to make Guix as resilient as possible to
unstable upstream sources, since the problem is not limited to GitHub.
> Perhaps we should start using ‘git-fetch’ more, with Software Heritage
> as a fallback content-addressed mirror? Though again the difficulty is
> that SWH uses Git’s method to hash directory contents, so we’d end up
> having to provide both a Nix hash and a Git hash in ‘origin’. :-/
And the Git hashes will change from SHA1 to SHA256 sooner or later, and
SHA1 hashes will become less reliable as CPUs get faster (collision
attacks), compounding the problem...
On Mon, Oct 02, 2017 at 05:09:39PM +0200, Ludovic Courtès wrote:
> What’s sad here is that we do have the right tarball at:
>
> https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s
It seems to me that there are several reasons someone may choose not to
use substitutes. Some of those reasons (reproducibility and security
concerns) are obviated for fixed-output derivations like upstream
sources, and I think it would be fine to still use substitutes for these
derivations.
But the motivations of privacy, self-sufficiency, etc are not addressed
by that idea.
Leo Famulari <leo@famulari.name> skribis:
> On Mon, Oct 02, 2017 at 05:09:39PM +0200, Ludovic Courtès wrote:
>> What’s sad here is that we do have the right tarball at:
>>
>> https://mirror.hydra.gnu.org/file/libgit2-0.25.1.tar.gz/sha256/1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s
Just to be clear: this URL is not that of a substitute, but that of a
content-addressed file (corresponding to the output of a fixed-output
derivation.)
> It seems to me that there are several reasons someone may choose not to
> use substitutes. Some of those reasons (reproducibility and security
> concerns) are obviated for fixed-output derivations like upstream
> sources, and I think it would be fine to still use substitutes for these
> derivations.
>
> But the motivations of privacy, self-sufficiency, etc are not addressed
> by that idea.
Right. Jan suggested checking the content-addressed mirrors *before*
the real upstream address. That would address the problem of upstream
sources modified in-place, but at the cost of privacy/self-sufficiency
as you note. (Though it’s not really making “privacy” any worse in this
case: it’s gnu.org vs. github.com.)
Perhaps we should make content-addressed mirrors configurable in a way
that’s orthogonal to derivations, something similar in spirit to
--substitute-urls? The difficulty is that content-addressed mirrors are
not just URLs; see (guix download).
Thoughts?
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 02 Oct 2017 20:24:02 GMT).
Ludovic Courtès writes:
> Right. Jan suggested checking the content-addressed mirrors *before*
> the real upstream address. That would address the problem of upstream
> sources modified in-place, but at the cost of privacy/self-sufficiency
> as you note. (Though it’s not really making “privacy” any worse in this
> case: it’s gnu.org vs. github.com.)
Yes, that may not be preferable in general without an override.
> Perhaps we should make content-addressed mirrors configurable in a way
> that’s orthogonal to derivations, something similar in spirit to
> --substitute-urls? The difficulty is that content-addressed mirrors are
> not just URLs; see (guix download).
Hmm. I'm not sure what problem we are solving. Should we only do this
for github(-like) tarballs? Do we see this problem with other sources,
should we prevent it? Possibly github will never do something like this
again. Or we could banish github/gitlab(?) auto-generated tarballs and
go for git checkouts+commits?
janneke
--
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 02 Oct 2017 20:30:01 GMT).
On Mon, Oct 02, 2017 at 10:22:33PM +0200, Jan Nieuwenhuizen wrote:
> Hmm. I'm not sure what problem we are solving. Should we only do this
> for github(-like) tarballs? Do we see this problem with other sources,
> should we prevent it? Possibly github will never do something like this
> again. Or we could banish github/gitlab(?) auto-generated tarballs and
> go for git checkouts+commits?
Files referenced by URL (location-addressing vs content-addressing) have
been changed in place by a variety of hosters and upstream projects
since I've started paying attention to these issues. I don't think we
need to do anything special regarding GitHub.
Leo Famulari <leo@famulari.name> writes:
> On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
>> Hi!
>>
>> Leo Famulari <leo@famulari.name> skribis:
>>
>> > I contacted GitHub about this issue a few weeks ago and they said that:
>> >
>> > 1) They do not guarantee bit-reproducibility of the snapshots they
>> > generate automatically for each release tag, and they wish that people
>> > would not rely on them as we do. However, since people *are* relying on
>> > them, they are discussing this issue internally.
>>
>> Oh?! Then we’re in trouble.
>
> I wonder, are there really that many affected packages?
There's a list here:
https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
of the homebrew project's maintainers.
Maxim
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Tue, 03 Oct 2017 12:31:01 GMT).
Jan Nieuwenhuizen <janneke@gnu.org> skribis:
> Ludovic Courtès writes:
[...]
>> Perhaps we should make content-addressed mirrors configurable in a way
>> that’s orthogonal to derivations, something similar in spirit to
>> --substitute-urls? The difficulty is that content-addressed mirrors are
>> not just URLs; see (guix download).
>
> Hmm. I'm not sure what problem we are solving. Should we only do this
> for github(-like) tarballs? Do we see this problem with other sources,
> should we prevent it? Possibly github will never do something like this
> again. Or we could banish github/gitlab(?) auto-generated tarballs and
> go for git checkouts+commits?
Content-addressed mirrors help with disappearing and modified tarballs
in general; it’s not just GitHub.
Occasionally we see that problem with tarballs coming from elsewhere:
404 is quite frequent, and in-place modification happens from time to
time (even on ftp.gnu.org…).
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Tue, 03 Oct 2017 12:32:02 GMT).
Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
> Leo Famulari <leo@famulari.name> writes:
>
>> On Mon, Oct 02, 2017 at 04:57:38PM +0200, Ludovic Courtès wrote:
>>> Hi!
>>>
>>> Leo Famulari <leo@famulari.name> skribis:
>>>
>>> > I contacted GitHub about this issue a few weeks ago and they said that:
>>> >
>>> > 1) They do not guarantee bit-reproducibility of the snapshots they
>>> > generate automatically for each release tag, and they wish that people
>>> > would not rely on them as we do. However, since people *are* relying on
>>> > them, they are discussing this issue internally.
>>>
>>> Oh?! Then we’re in trouble.
>>
>> I wonder, are there really that many affected packages?
>
> There's a list here:
> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
> of the homebrew project's maintainers.
Interesting. Thanks for the link!
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Tue, 03 Oct 2017 14:25:02 GMT).
On Mon, Oct 02, 2017 at 06:47:06PM -0400, Maxim Cournoyer wrote:
> Leo Famulari <leo@famulari.name> writes:
> > I wonder, are there really that many affected packages?
>
> There's a list here:
> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
> of the homebrew project's maintainers.
I meant, how many Guix packages use the auto-generated GitHub snapshots?
I believe the tell-tale sign is that the download link will have the
link text 'Source code', as for this release:
https://github.com/libgit2/libgit2/releases/tag/v0.26.0
Leo Famulari <leo@famulari.name> writes:
> On Mon, Oct 02, 2017 at 06:47:06PM -0400, Maxim Cournoyer wrote:
>> Leo Famulari <leo@famulari.name> writes:
>> > I wonder, are there really that many affected packages?
>>
>> There's a list here:
>> https://github.com/Homebrew/homebrew-core/issues/18044, compiled by one
>> of the homebrew project's maintainers.
>
> I meant, how many Guix packages use the auto-generated GitHub snapshots?
>
> I believe the tell-tale sign is that the download link will have the
> link text 'Source code', as for this release:
>
> https://github.com/libgit2/libgit2/releases/tag/v0.26.0
The following script:
outputs that there could be up to 1011 affected packages.
The script checks for a url-fetch uri of the form
"github.com/.*/archive/", which seems to be the one used for the
dynamically generated archives.
Here are the first 10 lines of the output:
--8<---------------cut here---------------start------------->8---
Number of potentially problematic GitHub packages:1011
fdupes
cbatticon
sedsed
cpulimit
autojump
sudo
thermald
progress
dstat
[...]
--8<---------------cut here---------------end--------------->8---
I've checked the first few with for example:
--8<---------------cut here---------------start------------->8---
guix build --source --no-substitutes sedsed
--8<---------------cut here---------------end--------------->8---
and they were OK though.
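For readers without the script at hand, the URL check described above can be sketched as follows. This is Python, not the Guile script itself, and it does not read Guix package definitions: the `sources` table is made up for illustration (two of its URLs are the GitHub archive URLs that appear in the download logs in this thread), and `uses_generated_archive` is a hypothetical name. The only thing taken from the message is the pattern, with the dot escaped.

```python
import re

# Pattern from the message above ("github.com/.*/archive/"), dot escaped.
GITHUB_ARCHIVE = re.compile(r"github\.com/.*/archive/")


def uses_generated_archive(urls):
    """Return True if any URL points at a GitHub auto-generated archive.

    The result is always a real boolean, including for an empty URL list.
    """
    return any(GITHUB_ARCHIVE.search(url) is not None for url in urls)


# Illustrative data only; a real survey would walk the package set.
sources = {
    "libgit2": ["https://github.com/libgit2/libgit2/archive/v0.25.1.tar.gz"],
    "yaml-cpp": ["https://github.com/jbeder/yaml-cpp/archive/yaml-cpp-0.5.3.tar.gz"],
    "example-release": ["https://example.org/releases/example-1.0.tar.gz"],
    "example-no-urls": [],
}

suspects = sorted(name for name, urls in sources.items()
                  if uses_generated_archive(urls))
```

A match only means the source is potentially affected; whether the hash really changed still has to be checked by fetching the source, as with the 'guix build --source --no-substitutes' command above.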
Maxim
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Wed, 04 Oct 2017 16:55:02 GMT).
On Wed, Oct 04, 2017 at 12:22:34AM -0400, Maxim Cournoyer wrote:
> Here are the first 10 lines of the output:
> --8<---------------cut here---------------start------------->8---
> Number of potentially problematic GitHub packages:1011
> fdupes
> cbatticon
> sedsed
> cpulimit
> autojump
> sudo
I think the script is buggy; sudo's source is not downloaded from GitHub
as far as I can tell.
Leo Famulari <leo@famulari.name> writes:
> On Wed, Oct 04, 2017 at 12:22:34AM -0400, Maxim Cournoyer wrote:
>> Here are the first 10 lines of the output:
>> --8<---------------cut here---------------start------------->8---
>> Number of potentially problematic GitHub packages:1011
>> fdupes
>> cbatticon
>> sedsed
>> cpulimit
>> autojump
>> sudo
>
> I think the script is buggy; sudo's source is not downloaded from GitHub
> as far as I can tell.
Good catch! I was assuming empty lists were falsy, but that's not the
case! I've ensured purely boolean predicates now and it gets the list
down to 650.
Here's the corrected script:
I've modified the script to sort the packages it prints:
--8<---------------cut here---------------start------------->8---
-    (for-each (lambda (p)
-                (format #t "~a~%" (package-name p)))
-              packages)))
+    (for-each (lambda (name)
+                (format #t "~a~%" name))
+              (sort (map package-name packages) string<?))))
--8<---------------cut here---------------end--------------->8---
and compared it to the list here: https://github.com/Homebrew/homebrew-core/issues/18044
If we can trust the Homebrew list to be extensive, it seems we got
lucky; there's only one affected package that we share which is
yaml-cpp. Here's how it fails on our side:
--8<---------------cut here---------------start------------->8---
guix build -S --no-substitutes yaml-cpp
The following derivation will be built:
/gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv
@ build-started /gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv - x86_64-linux /var/log/guix/drvs/ml//ap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv.bz2
Starting download of /gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz
From https://github.com/jbeder/yaml-cpp/archive/yaml-cpp-0.5.3.tar.gz...
following redirection to `https://codeload.github.com/jbeder/yaml-cpp/tar.gz/yaml-cpp-0.5.3'...
...p-0.5.3 1.7MiB/s 00:01 | 1.9MiB transferred
sha256 hash mismatch for output path `/gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz'
expected: 1vk6pjh0f5k6jwk2sszb9z5169whmiha9ainbdpa1arxlkq7v3b6
actual: 1ck7jk0wjfigrf4cgcjqsir4yp1s6vamhhxhpsgfvs46pgm5pk6y
@ build-failed /gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv - 1 sha256 hash mismatch for output path `/gnu/store/qwflwafrzjbr2b7dy4nv18nxykghhmnk-yaml-cpp-0.5.3.tar.gz'
expected: 1vk6pjh0f5k6jwk2sszb9z5169whmiha9ainbdpa1arxlkq7v3b6
actual: 1ck7jk0wjfigrf4cgcjqsir4yp1s6vamhhxhpsgfvs46pgm5pk6y
guix build: error: build failed: build of
`/gnu/store/mlap8jmadirnbii6sppb6vj9x56s8azw-yaml-cpp-0.5.3.tar.gz.drv'
failed
--8<---------------cut here---------------end--------------->8---
Maxim
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Thu, 05 Oct 2017 06:09:01 GMT).
Maxim Cournoyer writes:
> If we can trust the Homebrew list to be extensive, it seems we got
> lucky; there's only one affected package that we share which is
> yaml-cpp. Here's how it fails on our side:
I needed to also use (ice-9 regex) and then I found these to fail
antlr3
csound
erlang
font-google-material-design-icons
fritzing
libgit2
lxqt-common
ogre
plexus-interpolation
red-eclipse
yaml-cpp
out of 646 packages it's not many but it includes our core dependency
libgit2 which breaks guix pull --no-substitutes; that's hardly being
lucky?
janneke
--
Jan Nieuwenhuizen <janneke@gnu.org> | GNU LilyPond http://lilypond.org
Freelance IT http://JoyofSource.com | Avatar® http://AvatarAcademy.com
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Fri, 20 Oct 2017 21:18:02 GMT).
On Mon, Oct 02, 2017 at 10:00:33PM +0200, Ludovic Courtès wrote:
> Right. Jan suggested checking the content-addressed mirrors *before*
> the real upstream address. That would address the problem of upstream
> sources modified in-place, but at the cost of privacy/self-sufficiency
> as you note. (Though it’s not really making “privacy” any worse in this
> case: it’s gnu.org vs. github.com.)
Yeah, I don't personally think there is a privacy issue with fetching
sources from our mirrors at gnu.org, or other domains we control.
> Perhaps we should make content-addressed mirrors configurable in a way
> that’s orthogonal to derivations, something similar in spirit to
> --substitute-urls? The difficulty is that content-addressed mirrors are
> not just URLs; see (guix download).
>
> Thoughts?
I do think we should make it so that users don't suffer from unreliable
upstream sources when we know the sources are available on our servers
(or the Nix mirror), even with --no-substitutes.
Leo Famulari <leo@famulari.name> skribis:
> On Mon, Oct 02, 2017 at 10:00:33PM +0200, Ludovic Courtès wrote:
>> Right. Jan suggested checking the content-addressed mirrors *before*
>> the real upstream address. That would address the problem of upstream
>> sources modified in-place, but at the cost of privacy/self-sufficiency
>> as you note. (Though it’s not really making “privacy” any worse in this
>> case: it’s gnu.org vs. github.com.)
>
> Yeah, I don't personally think there is a privacy issue with fetching
> sources from our mirrors at gnu.org, or other domains we control.
>
>> Perhaps we should make content-addressed mirrors configurable in a way
>> that’s orthogonal to derivations, something similar in spirit to
>> --substitute-urls? The difficulty is that content-addressed mirrors are
>> not just URLs; see (guix download).
>>
>> Thoughts?
>
> I do think we should make it so that users don't suffer from unreliable
> upstream sources when we know the sources are available on our servers
> (or the Nix mirror), even with --no-substitutes.
The more I think about it, the more I’m inclined to simply move
content-addressed mirrors to the front of the list. This means that
users, in practice, would be fetching all the source from
mirror.hydra.gnu.org.
The main issue is making it configurable. Currently the
content-addressed mirror configuration for regular files in (guix
download) looks like this:
--8<---------------cut here---------------start------------->8---
(define %content-addressed-mirrors
  ;; List of content-addressed mirrors. Each mirror is represented as a
  ;; procedure that takes a file name, an algorithm (symbol) and a hash
  ;; (bytevector), and returns a URL or #f.
  ;; Note: Avoid 'https' to mitigate <http://bugs.gnu.org/22774>.
  ;; TODO: Add more.
  '(list (lambda (file algo hash)
           ;; Files served by 'guix publish' are accessible under a single
           ;; hash algorithm.
           (string-append "http://mirror.hydra.gnu.org/file/"
                          file "/" (symbol->string algo) "/"
                          (bytevector->nix-base32-string hash)))
         (lambda (file algo hash)
           ;; 'tarballs.nixos.org' supports several algorithms.
           (string-append "http://tarballs.nixos.org/"
                          (symbol->string algo) "/"
                          (bytevector->nix-base32-string hash)))))
--8<---------------cut here---------------end--------------->8---
That for VCS checkouts in (guix build download-nar) looks like this:
--8<---------------cut here---------------start------------->8---
(define (urls-for-item item)
  "Return the fallback nar URL for ITEM--e.g.,
\"/gnu/store/cabbag3…-foo-1.2-checkout\"."
  ;; Here we hard-code nar URLs without checking narinfos. That's probably OK
  ;; though.
  ;; TODO: Use HTTPS? The downside is the extra dependency.
  (let ((bases '("http://mirror.hydra.gnu.org/guix"
                 "http://berlin.guixsd.org"))
        (item (basename item)))
    (append (map (cut string-append <> "/nar/gzip/" item) bases)
            (map (cut string-append <> "/nar/" item) bases))))
--8<---------------cut here---------------end--------------->8---
The latter could be expressed by a command-line flag. In fact it’s the
same as --substitute-urls.
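To show what these two configurations expand to, here is a rough Python transcription of the URL construction in the two snippets above. It is only a sketch: the function names `mirror_urls` and `nar_urls` are hypothetical, and the hash is passed in as an already-encoded nix-base32 string rather than a bytevector, so the encoding step is left out.

```python
import posixpath
from typing import List


def mirror_urls(file: str, algo: str, base32_hash: str) -> List[str]:
    """Content-addressed URLs for a regular file, one per mirror."""
    return [
        f"http://mirror.hydra.gnu.org/file/{file}/{algo}/{base32_hash}",
        f"http://tarballs.nixos.org/{algo}/{base32_hash}",
    ]


def nar_urls(item: str) -> List[str]:
    """Fallback nar URLs for a store item: gzip variants first, then plain."""
    bases = ["http://mirror.hydra.gnu.org/guix", "http://berlin.guixsd.org"]
    name = posixpath.basename(item)
    return ([f"{base}/nar/gzip/{name}" for base in bases]
            + [f"{base}/nar/{name}" for base in bases])


# The libgit2 tarball and hash from the failing download earlier in the thread.
file_urls = mirror_urls("libgit2-0.25.1.tar.gz", "sha256",
                        "1cdwcw38frc1wf28x5ppddazv9hywc718j92f3xa3ybzzycyds3s")
checkout_urls = nar_urls("/gnu/store/cabbag3-foo-1.2-checkout")  # made-up item
```

The first kind of URL depends on the file name, algorithm and hash, while the second depends only on the store item name, which is why only the second maps directly onto a list of base URLs.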
(Time passes…)
Thinking more about it, why not simply always enable substitutes for
fixed-output derivations, like this:
diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
index d68e8b2bc..03a8f5080 100644
--- a/nix/libstore/build.cc
+++ b/nix/libstore/build.cc
@@ -1034,8 +1034,10 @@ void DerivationGoal::haveDerivation()
     /* We are first going to try to create the invalid output paths
        through substitutes. If that doesn't work, we'll build
-       them. */
-    if (settings.useSubstitutes && substitutesAllowed(drv))
+       them. Always enable substitutes for fixed-output derivations to
+       protect against disappearing files and in-place modifications on
+       upstream sites. */
+    if ((fixedOutput || settings.useSubstitutes) && substitutesAllowed(drv))
         foreach (PathSet::iterator, i, invalidOutputs)
             addWaitee(worker.makeSubstitutionGoal(*i, buildMode == bmRepair));
This solves all our problems and makes download-nar.scm useless.
As an added bonus, it improves the UI since we now always see:
--8<---------------cut here---------------start------------->8---
0.1 MB will be downloaded:
/gnu/store/plx9848n6waj6zghn3d54ybx8ihcn23k-guile-git-0.0-4.951a32c-checkout
--8<---------------cut here---------------end--------------->8---
… instead of:
--8<---------------cut here---------------start------------->8---
The following derivation will be built:
/gnu/store/y86rlb6pdm35im7q02y6479ca84zwylz-guile-git-000.0-4.951a32c-checkout.drv
--8<---------------cut here---------------end--------------->8---
The downside is that it still requires one to authorize the server’s
key, although it’s in theory unnecessary since it’s content addressed.
I’m not sure how to solve that because ‘guix substitute’ doesn’t know
that it’s substituting a fixed-output derivation. I suppose we’d need
to modify the “protocol” between guix-daemon and ‘guix substitute’.
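For reference, the effect of the one-line change in the patch above can be spelled out as a small truth-table check. This is a Python sketch, not daemon code: the three inputs correspond to `fixedOutput`, `settings.useSubstitutes` and `substitutesAllowed(drv)` in the diff, while the function names are hypothetical.

```python
from itertools import product


def old_condition(fixed_output: bool, use_substitutes: bool,
                  substitutes_allowed: bool) -> bool:
    """Condition before the patch."""
    return use_substitutes and substitutes_allowed


def new_condition(fixed_output: bool, use_substitutes: bool,
                  substitutes_allowed: bool) -> bool:
    """Condition after the patch."""
    return (fixed_output or use_substitutes) and substitutes_allowed


# Every input combination for which the patch changes the outcome.
changed = [case for case in product([False, True], repeat=3)
           if old_condition(*case) != new_condition(*case)]
```

Under these assumptions the only combination that changes is a fixed-output derivation with substitutes disabled and substitution allowed for that derivation, which is the '--no-substitutes' case this bug is about.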
Thoughts?
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Thu, 14 Dec 2017 16:54:02 GMT).
ludo@gnu.org (Ludovic Courtès) skribis:
> Thinking more about it, why not simply always enable substitutes for
> fixed-output derivations, like this:
>
> diff --git a/nix/libstore/build.cc b/nix/libstore/build.cc
> index d68e8b2bc..03a8f5080 100644
> --- a/nix/libstore/build.cc
> +++ b/nix/libstore/build.cc
> @@ -1034,8 +1034,10 @@ void DerivationGoal::haveDerivation()
>
> /* We are first going to try to create the invalid output paths
> through substitutes. If that doesn't work, we'll build
> - them. */
> - if (settings.useSubstitutes && substitutesAllowed(drv))
> + them. Always enable substitutes for fixed-output derivations to
> + protect against disappearing files and in-place modifications on
> + upstream sites. */
> + if ((fixedOutput || settings.useSubstitutes) && substitutesAllowed(drv))
> foreach (PathSet::iterator, i, invalidOutputs)
> addWaitee(worker.makeSubstitutionGoal(*i, buildMode == bmRepair));
[...]
> The downside is that it still requires one to authorize the server’s
> key, although it’s in theory unnecessary since it’s content addressed.
> I’m not sure how to solve that because ‘guix substitute’ doesn’t know
> that it’s substituting a fixed-output derivation. I suppose we’d need
> to modify the “protocol” between guix-daemon and ‘guix substitute’.
I looked at how to address this by having ‘guix substitute’
automatically determine whether it’s being asked for a content-addressed
item or not. The guts of it is this procedure:
(define* (content-addressed-item? item hash
                                  #:key (hash-algo 'sha256))
  "Return true if ITEM, a store file name, is definitely a content-addressed
item (result of a fixed-output derivation) with the given HASH of type
HASH-ALGO, false otherwise.
Note: This procedure is useful when the deriver of ITEM is unknown. In other
cases, the recommended approach is to check 'fixed-output-derivation?' on the
deriver."
  ;; XXX: This returns #f for "text" items produced by 'add-text-to-store'.
  ;; There's not much we can do because the file name for these is a function
  ;; of their content.
  (let ((name (store-path-package-name item)))
    (or (string=? item (fixed-output-path name hash #:recursive? #f
                                          #:hash-algo hash-algo))
        (string=? item (fixed-output-path name hash #:recursive? #t
                                          #:hash-algo hash-algo)))))
It works as expected for the result of “recursive fixed-output
derivations”—i.e., fixed-output derivations that produce a directory,
such as VCS checkouts.
However, it doesn’t work for fixed-output derivations that produce a flat
file, such as origins with the ‘url-fetch’ method. The reason is that,
in the case of non-recursive derivations, the store file name is
computed as a function of the file hash, not as a function of the nar
hash, whereas narinfos only contain the nar hash (the thing that ‘guix
hash -r’ computes).
So I think we have to communicate more info from the daemon to ‘guix
substitute’.
Ludo’.
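To make the flat-file limitation above concrete, here is a minimal Python sketch of the store file name computation. It is an illustration under stated assumptions, not the code from the patch: the helper names (store_path, fixed_output_path, content_addressed_item, nar_of_regular_file) are made up for this note, the scheme is transcribed from my understanding of how the daemon and (guix store) derive fixed-output file names, and the store prefix is assumed to be /gnu/store.

```python
import hashlib
import struct

STORE = "/gnu/store"                       # assumed default store prefix
B32 = "0123456789abcdfghijklmnpqrsvwxyz"   # Nix-style base32 alphabet

def nix_base32(data: bytes) -> str:
    """Encode DATA in the Nix flavour of base32 (last 5-bit group first)."""
    n_chars = (len(data) * 8 - 1) // 5 + 1
    out = []
    for n in range(n_chars - 1, -1, -1):
        i, j = divmod(n * 5, 8)
        lo = data[i] >> j
        hi = data[i + 1] << (8 - j) if i + 1 < len(data) else 0
        out.append(B32[(lo | hi) & 0x1F])
    return "".join(out)

def compress(h: bytes, size: int = 20) -> bytes:
    """XOR-fold hash H down to SIZE bytes."""
    out = bytearray(size)
    for i, b in enumerate(h):
        out[i % size] ^= b
    return bytes(out)

def store_path(kind: str, digest: bytes, name: str) -> str:
    s = f"{kind}:sha256:{digest.hex()}:{STORE}:{name}"
    h = hashlib.sha256(s.encode()).digest()
    return f"{STORE}/{nix_base32(compress(h))}-{name}"

def fixed_output_path(name, digest, recursive, algo="sha256"):
    """File name of a fixed-output item; DIGEST is the nar hash when
    RECURSIVE is true and the plain file hash otherwise."""
    if recursive and algo == "sha256":
        return store_path("source", digest, name)
    tag = f"fixed:out:{'r:' if recursive else ''}{algo}:{digest.hex()}:"
    return store_path("output:out", hashlib.sha256(tag.encode()).digest(), name)

def content_addressed_item(item, digest, algo="sha256"):
    """Python analogue of 'content-addressed-item?' quoted above."""
    name = item[len(STORE) + 1:].split("-", 1)[1]
    return item in (fixed_output_path(name, digest, False, algo),
                    fixed_output_path(name, digest, True, algo))

def nar_string(s: bytes) -> bytes:
    """Length-prefixed, zero-padded string as used in the nar format."""
    return struct.pack("<Q", len(s)) + s + b"\0" * ((8 - len(s) % 8) % 8)

def nar_of_regular_file(contents: bytes) -> bytes:
    """Nar serialization of a single non-executable regular file."""
    parts = [b"nix-archive-1", b"(", b"type", b"regular",
             b"contents", contents, b")"]
    return b"".join(nar_string(p) for p in parts)

# A flat file, as 'url-fetch' would produce (made-up contents and name).
data = b"hello, tarball\n"
file_hash = hashlib.sha256(data).digest()                      # known to the daemon
nar_hash = hashlib.sha256(nar_of_regular_file(data)).digest()  # what a narinfo has
item = fixed_output_path("foo-1.0.tar.gz", file_hash, recursive=False)

print(content_addressed_item(item, file_hash))  # check with the file hash
print(content_addressed_item(item, nar_hash))   # check with the nar hash only
```

With only the nar hash at hand, neither candidate file name matches a flat-file item, whereas a recursive item (whose file name is derived from the nar hash itself) is recognized. That is the gap described above.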
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Fri, 15 Dec 2017 09:31:02 GMT)
ludo@gnu.org (Ludovic Courtès) skribis:
> So I think we have to communicate more info from the daemon to ‘guix
> substitute’.
The attached patch addresses that by simply calling out to the daemon to
determine whether we’re dealing with a content-addressed item.
To summarize, the new behavior is that substitutes are always enabled
for fixed-output derivations. That way, people willing to build
everything from source can still use ‘--no-substitutes’ and yet be able
to retrieve source code without being penalized compared to someone
enabling substitutes wholesale.
Of course, when substitutes are missing, we fall back to regular
downloads or VCS checkouts. It is also still possible to choose where
substitutes are downloaded from, using ‘--substitute-urls’, or even to
pass an empty list of URLs.
Feedback welcome!
Ludo’.
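As a rough illustration of the behavior summarized above, the following Python sketch models the decision only. The function and parameter names are hypothetical and do not come from the patch or the daemon; `allowed` stands in for the daemon's substitutesAllowed(drv) check from the diff quoted earlier in the thread.

```python
def try_substitutes(fixed_output: bool, use_substitutes: bool,
                    allowed: bool) -> bool:
    """Mirror the proposed daemon condition:
    (fixedOutput || settings.useSubstitutes) && substitutesAllowed(drv)."""
    return (fixed_output or use_substitutes) and allowed

def fetch_plan(fixed_output, use_substitutes, allowed, substitute_urls):
    """Return the ordered list of things to attempt for one derivation."""
    plan = []
    # An empty URL list leaves nothing to substitute from.
    if try_substitutes(fixed_output, use_substitutes, allowed) and substitute_urls:
        plan.append("substitute")
    # Fallback: run the derivation's own builder, which for a fixed-output
    # derivation is the regular download or VCS checkout.
    plan.append("run-builder")
    return plan

# '--no-substitutes' with a fixed-output derivation (e.g. a source tarball):
print(fetch_plan(True, False, True, ["https://ci.guix.gnu.org"]))
# '--no-substitutes' with an ordinary package build:
print(fetch_plan(False, False, True, ["https://ci.guix.gnu.org"]))
# '--substitute-urls' set to an empty list:
print(fetch_plan(True, False, True, []))
```

The point of the sketch is that the ‘--no-substitutes’ setting stops mattering for fixed-output derivations only, while the choice of substitute URLs remains in the user's hands.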
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Fri, 14 Feb 2020 22:34:13 +0100
Jan Nieuwenhuizen <janneke@gnu.org> skribis:
> Ludovic Courtès writes:
[...]
>> The problem here is really that we fall back to content-addressed
>> mirrors instead of using them directly:
>>
>> https://issues.guix.gnu.org/issue/28659
>
> Wait, what happened here; you finally proposed a patch two years ago and
> nothing happened/we all forgot to follow up?
I think we forgot, indeed.
One thing I don’t quite like about the patch is the fact that ‘guix
substitute’ connects to the daemon in ‘content-addressed-item?’.
Also, one could argue that we’d steer users towards downloading from our
server, which could be a privacy concern (probably not a strong argument
since one can easily change the substitute URLs.)
Thoughts?
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Sat, 15 Feb 2020 15:45:02 GMT)
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org,
Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Sat, 15 Feb 2020 16:43:46 +0100
Hi,
On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
> Also, one could argue that we’d steer users towards downloading from our
> server, which could be a privacy concern (probably not a strong argument
> since one can easily change the substitute URLs.)
I am not following the privacy concern.
What do you mean?
Cheers,
simon
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Sun, 16 Feb 2020 11:00:01 GMT)
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org,
Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Sun, 16 Feb 2020 11:59:01 +0100
Hi!
zimoun <zimon.toutoune@gmail.com> skribis:
> On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> Also, one could argue that we’d steer users towards downloading from our
>> server, which could be a privacy concern (probably not a strong argument
>> since one can easily change the substitute URLs.)
>
> I am not following the privacy concern.
> What do you mean?
I mean that by default, someone who’s disabled substitutes (presumably
out of security or privacy concerns) would find themself downloading
source code from ci.guix.gnu.org instead of various upstream sites.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 17 Feb 2020 10:19:02 GMT)
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org,
Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Mon, 17 Feb 2020 11:18:22 +0100
Hi Ludo,
On Sun, 16 Feb 2020 at 11:59, Ludovic Courtès <ludo@gnu.org> wrote:
> zimoun <zimon.toutoune@gmail.com> skribis:
> > On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
> >> Also, one could argue that we’d steer users towards downloading from our
> >> server, which could be a privacy concern (probably not a strong argument
> >> since one can easily change the substitute URLs.)
> >
> > I am not following the privacy concern.
> > What do you mean?
>
> I mean that by default, someone who’s disabled substitutes (presumably
> out of security or privacy concerns) would find themself downloading
> source code from ci.guix.gnu.org instead of various upstream sites.
I do not see the difference between mirroring and traveling back in
time with missing upstream sources.
And because it is content-addressed, it seems even more secure than
downloading from an upstream URL, IMHO.
If one trusts Guix, then an attacker needs to corrupt at the same time
the Guix history and Berlin (and/or any other farm).
If one does not trust Guix, why do they use the recipe coming from
Guix? To be precise, this person has to check all the recipes of all
the dependencies.
Well, I do not see a security concern because we are talking about
serving the sources.
It is another story when the substitutes serve the results of the
build (binaries); because one does not have any strong guarantee that
the substitute serves the expected binaries.
By privacy concern, do you mean that Guix could collect who downloads
what; in a central fashion? Which is not the case when one downloads
from several distributed upstream sources. Right?
Well, I am not convinced because the case of a missing upstream source
is rare. And it is easy to protect against such data collection.
In paranoid mode, traveling back in time becomes difficult because
of the reliability of the sources; I mean if the sources were
reliable, SWH would not exist. ;-) The solution should be an IPFS /
GNUnet / fully distributed archive... which is not ready... yet! :-)
Well, maybe for the TODO list of the time-machine: add an option to
allow substitutes *only* for the sources (substitutes meaning
ci.guix.gnu.org and/or SWH). If this option does not exist yet. ;-)
Cheers,
simon
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 17 Feb 2020 14:41:02 GMT)
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org,
Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Mon, 17 Feb 2020 15:40:13 +0100
Hi,
zimoun <zimon.toutoune@gmail.com> skribis:
> On Sun, 16 Feb 2020 at 11:59, Ludovic Courtès <ludo@gnu.org> wrote:
>> zimoun <zimon.toutoune@gmail.com> skribis:
>> > On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> >> Also, one could argue that we’d steer users towards downloading from our
>> >> server, which could be a privacy concern (probably not a strong argument
>> >> since one can easily change the substitute URLs.)
>> >
>> > I am not following the privacy concern.
>> > What do you mean?
>>
>> I mean that by default, someone who’s disabled substitutes (presumably
>> out of security or privacy concerns) would find themself downloading
>> source code from ci.guix.gnu.org instead of various upstream sites.
[...]
> By privacy concern, do you mean that Guix could collect who downloads
> what; in a central fashion? Which is not the case when one downloads
> from several distributed upstream sources. Right?
Exactly. But like I wrote above, I don’t think it’s a strong argument.
What remains is the issue with ‘content-addressed-item?’, then.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Mon, 17 Feb 2020 15:05:02 GMT)
Cc: 39575@debbugs.gnu.org, 28659@debbugs.gnu.org,
Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#39575: guix time-machine fails when a tarball was modified
in-place
Date: Mon, 17 Feb 2020 16:04:09 +0100
On Mon, 17 Feb 2020 at 15:40, Ludovic Courtès <ludo@gnu.org> wrote:
> Exactly. But like I wrote above, I don’t think it’s a strong argument.
I agree and the big picture depends on the audience.
Scientific communities would be fine with centralized archives such as
SWH. And only centralized archives IMHO can provide reliable "long
term" support, which is the point for those communities. (In quotes
because it is not clearly defined. :-))
Other communities would prefer distributed archives such as IPFS or
GNUnet but 1. they still need some work and 2. the "long term" is not
guaranteed by nature, IMHO. But it is probably not an issue for those
communities.
> What remains is the issue with ‘content-addressed-item?’, then.
I agree.
The bridge with SWH is in good shape, IMHO.
And the pending IPFS patch would deserve more love. :-) Maybe soon...
Cheers,
simon
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Wed, 09 Sep 2020 14:32:02 GMT)
Cc: 28659@debbugs.gnu.org, Jan Nieuwenhuizen <janneke@gnu.org>
Subject: Re: bug#28659: Content-addressed mirror is not used upon invalid hash
Date: Thu, 10 Sep 2020 10:14:44 +0200
Hello,
zimoun <zimon.toutoune@gmail.com> skribis:
> On Fri, 14 Feb 2020 at 22:34, Ludovic Courtès <ludo@gnu.org> wrote:
>
>> One thing I don’t quite like about the patch is the fact that ‘guix
>> substitute’ connects to the daemon in ‘content-addressed-item?’.
>
> What is the status of this patch [1] following the recent discussion about
> tar “disarchive” and SWH?
>
> Related:
> - http://issues.guix.gnu.org/issue/39575
> - http://issues.guix.gnu.org/42162
> - https://git.ngyro.com/disarchive/
Thanks for the reminder. I don’t think Timothy’s work changes anything
wrt. this issue: it would still need to be addressed.
Ludo’.
Information forwarded
to bug-guix@gnu.org: bug#28659; Package guix.
(Thu, 03 Feb 2022 03:02:01 GMT)
Cc: Jan Nieuwenhuizen <janneke@gnu.org>, 28659@debbugs.gnu.org,
Leo Famulari <leo@famulari.name>
Subject: Re: bug#28659: Content-addressed mirror is not used upon invalid hash
Date: Thu, 03 Feb 2022 03:58:26 +0100
Hi Ludo,
On Fri, 15 Dec 2017 at 10:30, ludo@gnu.org (Ludovic Courtès) wrote:
>> So I think we have to communicate more info from the daemon to ‘guix
>> substitute’.
>
> The attached patch addresses that by simply calling out to the daemon to
> determine whether we’re dealing with a content-addressed item.
WDYT about rebasing this patch [1] and resubmitting it to guix-patches in
order to get more attention, and thus potential feedback and/or review?
1: <https://issues.guix.gnu.org/issue/28659#26>
Cheers,
simon
Merged 28659 70588.
Request was from Ludovic Courtès <ludo@gnu.org>
to control@debbugs.gnu.org.
(Wed, 01 May 2024 10:37:04 GMT)