Maybe partly undo the patch on Elisp comp-el-to-eln-filename

  • Open
  • quality assurance status badge
Details
2 participants
  • Liliana Marie Prikler
  • Martin Edström
Owner
unassigned
Submitted by
Martin Edström
Severity
normal

Debbugs page

M
M
Martin Edström wrote on 7 Oct 07:56 -0700
(name . bug-guix)(address . bug-guix@gnu.org)(name . liliana.prikler)(address . liliana.prikler@gmail.com)
E1sxpAI-0003hS-Gg@rmmprod07.runbox
Hi, I suggest to maybe amend one of the things done by this patchset: https://issues.guix.gnu.org/67260

It undoes the hashing effect of the Elisp function `comp-el-to-eln-filename`, and that seems likely to cause issues downstream, for example in my Emacs package: https://github.com/meedstrom/org-node/issues/60.

To summarize: that function is supposed to generate a filename with a hash based not only the filename but the contents of the file. While it makes sense in Guix to ignore the contribution of the filename, I believe it should still output a new filename when the contents change.

Otherwise there seems to be no way for a downstream package to ensure that it is using an up-to-date .eln variant of an .el file.

I may have missed something though. Can someone in the know tell me what happens if you have not updated Emacs (which if I understand correctly, means ELN-DIR does not change), but you do update an Elisp package, whether through Guix or through Emacs' own package managers. Will Emacs then possibly load an old .eln?

I do not believe that user options like `load-prefer-newer` would affect it. It would just rely on running the aforementioned function and counting on it to output an .eln filename that does not exist if the source is newer.
L
L
Liliana Marie Prikler wrote on 7 Oct 11:02 -0700
f228248ece04c00f78590df2ee7bb90e4302090c.camel@gmail.com
Am Montag, dem 07.10.2024 um 16:56 +0200 schrieb Martin Edström:
Toggle quote (13 lines)
> Hi, I suggest to maybe amend one of the things done by this
> patchset:  https://issues.guix.gnu.org/67260
>
> It undoes the hashing effect of the Elisp function `comp-el-to-eln-
> filename`, and that seems likely to cause issues downstream, for
> example in my Emacs package:
> https://github.com/meedstrom/org-node/issues/60.
>
> To summarize: that function is supposed to generate a filename with a
> hash based not only the filename but the contents of the file.  While
> it makes sense in Guix to ignore the contribution of the filename, I
> believe it should still output a new filename when the contents
> change.
There are opposite goals to "make sure that the file hasn't been
tampered with" (upstream) and "to keep files patchable" (Guix). I
don't think we can easily satisfy both. Perhaps we could use the
original store path as some kind of key to match the files (since we
compile them in-store IIRC), but that wouldn't work for the "let's
compile our init.el" use case.

As a matter of fact, we've disabled JIT compilation for the very reason
that stuff can break ;)

Toggle quote (2 lines)
> Otherwise there seems to be no way for a downstream package to ensure
> that it is using an up-to-date .eln variant of an .el file.
What about aggressive-recompilation-on-write?

Toggle quote (5 lines)
> I may have missed something though.  Can someone in the know tell me
> what happens if you have not updated Emacs (which if I understand
> correctly, means ELN-DIR does not change), but you do update an Elisp
> package, whether through Guix or through Emacs' own package managers.
> Will Emacs then possibly load an old .eln? 
We write store paths to a subdirs.el – unless specifically prompted to
reload that, Emacs will keep using old libraries. This is by design,
so that updating Emacs does not cause any issues with (byte) compiled
files.

Toggle quote (4 lines)
> I do not believe that user options like `load-prefer-newer` would
> affect it. It would just rely on running the aforementioned function
> and counting on it to output an .eln filename that does not exist if
> the source is newer.
Since all timestamps point to 1970, you are right, `load-prefer-newer'
does nothing.

Cheers
M
M
Martin Edström wrote on 7 Oct 13:46 -0700
(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)(name . 73681)(address . 73681@debbugs.gnu.org)
E1sxuch-0007aP-NN@rmmprod07.runbox
Hi, thanks for the prompt response!

On Mon, 07 Oct 2024 20:02:17 +0200, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:
Toggle quote (2 lines)
> What about aggressive-recompilation-on-write?

That works in the init.el case, because the user would be the one writing the file.

In my package, the use case is instead that it starts sub-processes, each of which should load a file "org-node-parser.eln". I could ahve made them just load the output of (locate-library "org-node-parser"), but for unrelated reasons, that seems it would only ever locate an .elc and not an .eln.

So I need to use `comp-el-to-eln-filename` to find the up-to-date .eln (or force it to be compiled). Regardless of whether or not the rest of the package has already been native-compiled.

The worst-case scenario is using package version 1.4.1 in the main Emacs process, but 1.4.0 of org-node-parser.eln in the subprocesses. That leads to unintended bugs.

I suppose the other way around could also happen - using package version 1.4.0 but 1.4.1 in the subprocesses - but that'll be my own mess to figure out :)

Toggle quote (5 lines)
> We write store paths to a subdirs.el – unless specifically prompted to
> reload that, Emacs will keep using old libraries. This is by design,
> so that updating Emacs does not cause any issues with (byte) compiled
> files.

Let me know if I understand this correctly: updating a package compiles it at the same time, so we can have store paths like

/gnu/store/package-1.4.0/...{.elc,.el,.eln}
/gnu/store/package-1.4.1/...{.elc,.el,.eln}

which sounds like it will be all consistent. An .eln in the second directory would never secretly be 1.4.0, for example.

But since you said we disabled the JIT compilation, and store dirs are read-only, where do the .eln actually end up, after the user starts Emacs and it does its async-native-compile thing?

(or does it not do any JIT compilation at all?)

I don't have a Guix machine at the moment, but is it a deterministic path like ~/.emacs.d/eln-cache/29.4-HASH/package-HASH.eln ? The user in my GitHub issue gets no such path from running `comp-el-to-eln-filename`, but maybe they're on an old Guix Emacs.
L
L
Liliana Marie Prikler wrote on 7 Oct 21:32 -0700
(name . Martin Edström)(address . meedstrom@runbox.eu)(name . 73681)(address . 73681@debbugs.gnu.org)
58598114857dce8a25e3b4d0477d212467a0173f.camel@gmail.com
Am Montag, dem 07.10.2024 um 22:46 +0200 schrieb Martin Edström:
Toggle quote (5 lines)
> In my package, the use case is instead that it starts sub-processes,
> each of which should load a file "org-node-parser.eln".  I could ahve
> made them just load the output of (locate-library "org-node-parser"),
> but for unrelated reasons, that seems it would only ever locate an
> .elc and not an .eln.
Could you keep track of modifications to org-node-parser and recompile
that on change? Or is it part of your package already – if the latter,
then we should already have it compiled as a package.

Toggle quote (11 lines)
> So I need to use `comp-el-to-eln-filename` to find the up-to-date
> .eln (or force it to be compiled).  Regardless of whether or not the
> rest of the package has already been native-compiled.
>
> The worst-case scenario is using package version 1.4.1 in the main
> Emacs process, but 1.4.0 of org-node-parser.eln in the subprocesses. 
> That leads to unintended bugs.
>
> I suppose the other way around could also happen - using package
> version 1.4.0 but 1.4.1 in the subprocesses - but that'll be my own
> mess to figure out :)
I think it's your own mess either way – can't you downgrade to a model
where you just ask for "org-node-parser" to be loaded, regardless of
compilation? Then, you wouldn't have to native-compile it.

Toggle quote (19 lines)
> > We write store paths to a subdirs.el – unless specifically prompted
> > to reload that, Emacs will keep using old libraries.  This is by
> > design, so that updating Emacs does not cause any issues with
> > (byte) compiled files.
>
> Let me know if I understand this correctly: updating a package
> compiles it at the same time, so we can have store paths like
>
> /gnu/store/package-1.4.0/...{.elc,.el,.eln}
> /gnu/store/package-1.4.1/...{.elc,.el,.eln}
>
> which sounds like it will be all consistent.  An .eln in the second
> directory would never secretly be 1.4.0, for example.
>
> But since you said we disabled the JIT compilation, and store dirs
> are read-only, where do the .eln actually end up, after the user
> starts Emacs and it does its async-native-compile thing?
>
> (or does it not do any JIT compilation at all?)
We don't do JIT, but even if we did, we should find the one in the
store (and the correct one, as per Emacs load paths).

Toggle quote (4 lines)
> I don't have a Guix machine at the moment, but is it a deterministic
> path like ~/.emacs.d/eln-cache/29.4-HASH/package-HASH.eln ?  The user
> in my GitHub issue gets no such path from running `comp-el-to-eln-
> filename`, but maybe they're on an old Guix Emacs.
It's `~/.emacs.d/eln-cache/29.4-HASH/some/module.eln`, with some/module
being the module that you'd load.

Cheers
M
M
Martin Edström wrote on 8 Oct 03:41 -0700
(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)(name . 73681)(address . 73681@debbugs.gnu.org)
E1sy7eg-0005XF-AC@rmmprod07.runbox
On Tue, 08 Oct 2024 06:32:32 +0200, Liliana Marie Prikler <liliana.prikler@gmail.com> wrote:

Toggle quote (4 lines)
> Could you keep track of modifications to org-node-parser and recompile
> that on change? Or is it part of your package already – if the latter,
> then we should already have it compiled as a package.

It comes as part of the package. I don't want to assume that it has been compiled, since it's fairly performance-sensitive. That's why I'll either use a previously existing compiled object or make a new one.

What I'm getting from this is that it might be safer to just compile the object into /tmp and use that, regardless of what may already exist. At the moment, that's necessary for keeping the option open of using an .eln on Guix.

So let's ignore my package, it is just an example of a downstream use of `comp-el-to-eln-filename` that relied on its hashing functionality.

Let's just discuss that function.

I have to point out that the emacs `load-path` does not include any native paths. When I inspect the value on my non-Guix emacs, I see no references to .../eln-cache/..., just references to directories where there are .elc and .el files.

I infer that Emacs starts with finding a library in load-path, then converts the path with `comp-el-to-eln-filename`, and checks if that file exists, then loads it.

And crucially, it is not just about the filepath, the function hashes the file contents as well. That ensures that the output path is always different if the source file changes.

Since Guix has a patch that removes this effect, it seems like a package could be upgraded many, many times, without the .eln path ever changing, and so the user would stay on that very outdated file.

Is that not a regression/bug?
L
L
Liliana Marie Prikler wrote on 8 Oct 10:33 -0700
(name . Martin Edström)(address . meedstrom@runbox.eu)(name . 73681)(address . 73681@debbugs.gnu.org)
a96bf29e15108292c31e9cf29f81a7a5150eaeca.camel@gmail.com
Am Dienstag, dem 08.10.2024 um 12:41 +0200 schrieb Martin Edström:
Toggle quote (4 lines)
> It comes as part of the package. I don't want to assume that it has
> been compiled, since it's fairly performance-sensitive. That's why
> I'll either use a previously existing compiled object or make a new
> one.
Could you leave that decision to the user?

Toggle quote (12 lines)
> […]
>
> So let's ignore my package, it is just an example of a downstream use
> of `comp-el-to-eln-filename` that relied on its hashing
> functionality.
>
> Let's just discuss that function.
>
> I have to point out that the emacs `load-path` does not include any
> native paths.  When I inspect the value on my non-Guix emacs, I see
> no references to .../eln-cache/..., just references to directories
> where there are .elc and .el files.
There is a separate load path for natively compiled files, called
`native-comp-eln-load-path'.

Toggle quote (7 lines)
> I infer that Emacs starts with finding a library in load-path, then
> converts the path with `comp-el-to-eln-filename`, and checks if that
> file exists, then loads it.
>
> And crucially, it is not just about the filepath, the function hashes
> the file contents as well. That ensures that the output path is
> always different if the source file changes.
I think relying on such implementation details is perhaps permitted if
it's inside of Emacs itself, but even then it clashes with our
expectation that Emacs be graftable.

Toggle quote (5 lines)
> Since Guix has a patch that removes this effect, it seems like a
> package could be upgraded many, many times, without the .eln path
> ever changing, and so the user would stay on that very outdated file.
>
> Is that not a regression/bug?
The way our load paths are set up, it is actually the opposite (which
still is a bug, just not the one reported). While `guix upgrade` or a
command to the similar effect will swap out the .eln under the hood,
the `.el` and `.elc` files stay stable – remember what I wrote in the
previous message about that having caused issues with byte compilation?

We also get a similar-looking bug if our packages aren't actually
native-compiled, but Emacs itself vendors them. That is resolved by
dropping those .eln-files from the Emacs package.

Cheers
M
M
Martin Edström wrote 6 days ago
(name . Liliana Marie Prikler)(address . liliana.prikler@gmail.com)(name . 73681)(address . 73681@debbugs.gnu.org)
E1syYPq-0005iN-A6@rmmprod05.runbox
Attachment: file
L
L
Liliana Marie Prikler wrote 6 days ago
(name . Martin Edström)(address . meedstrom@runbox.eu)(name . 73681)(address . 73681@debbugs.gnu.org)
518988807953a1b77acb5f9833992fa9ced883c1.camel@gmail.com
Am Mittwoch, dem 09.10.2024 um 17:15 +0200 schrieb Martin Edström:
Toggle quote (39 lines)
> Hi, thanks for info! This email will be a bit long because I'm quite
> confused and thinking aloud, so no need to reply to all of it.  But I
> appreciate any attempt to shed clarity!
>
>
> On Tue, 08 Oct 2024 19:33:06 +0200, Liliana Marie Prikler
> <liliana.prikler@gmail.com> wrote:
>
> > > It comes as part of the package. I don't want to assume that it
> > > has been compiled, since it's fairly performance-sensitive.
> > > That's why I'll either use a previously existing compiled object
> > > or make a new one.
> > Could you leave that decision to the user?
>
> I'm considering it, but ran into essentially the same problem.  I
> need a way to tell which one was loaded, of the .elc, .eln or .el.
>
> This takes the thread a bit off-topic but ... Do you know how?
>
> Currently I'm leaning towards an algorithm like
>
> #+begin_src elisp
> (defun guess-loaded-lib (basename)
>   (let* ((el (locate-library (concat basename ".el")))
>          (elc (locate-library (concat basename ".elc")))
>          (eln (comp-el-to-eln-filename el)))
>     (if load-prefer-newer
>         (car (sort (delq 'nil (list el elc eln)) #'file-newer-than-
> file-p))
>       (if (and (file-newer-than-file-p elc el)
>                (file-newer-than-file-p eln el))
>           ;; If elc is newer than el, assume it was compiled from el,
> and
>           ;; therefore is safe to replace with eln
>           eln
>         elc))))
> #+end_src
>
> I don't know how to "leave it to the user" any better than that. 
The user can just use the output of the help function, it will tell
them whether a function is natively-compiled, byte-compiled, or just
interpreted. Emacs 30 even gives you `native-comp-function-p'. See
`gnu/packages/aux-files/emacs/comp-integrity[-next].el' for how we
assert that our Emacs loads natively-compiled libraries.

Toggle quote (11 lines)
> On Guix, even in the future when we re-enable JIT compilation, this
> algorithm could never return an .eln, since =file-newer-than-file-p=
> returns nil when both paths have identical timestamps from 1970.
>
> I found a pretty neat built-in:
>
>      (symbol-file 'SOME-FUNCTION-FROM-THE-LIB nil t)
>
> but its internals also use =file-newer-than-file-p= and =comp-el-to-
> eln-rel-filename=, so not sure it is any more reliable than what I
> have above.
I'm not sure these things work the way you think. IIUC, `load-prefer-
newer' should guard against the case where the .el is newer than the
.el[cn], not the other way round, so it should still load the
compiled/native-compiled variant if they have the same stamp.

Toggle quote (26 lines)
> > There is a separate load path for natively compiled files, called
> > `native-comp-eln-load-path'.
>
> Good to know! I don't know how Emacs consults it though.  I can't
> e.g. locate an .eln of a given library by calling something like
>
> #+begin_src elisp
> (locate-eln-file "org-node-parser")
> #+end_src
>
> so it seems I must simply do:
>
> #+begin_src elisp
> (let ((el (locate-library "org-node-parser.el")))
>   (when el
>     (locate-eln-file (comp-el-to-eln-rel-filename el))))
> #+end_src
>
> which is functionally equivalent to:
>
> #+begin_src elisp
> (let ((eln (comp-el-to-eln-filename (locate-library "org-node-
> parser.el"))))
>   (when (file-exists-p eln)
>     eln))
> #+end_src
I mean, the C code is there to read (along with the patch we've made),
but I can't fault you for not going that deep.

Toggle quote (22 lines)
> > > I infer that Emacs starts with finding a library in load-path,
> > > then converts the path with `comp-el-to-eln-filename`, and checks
> > > if that file exists, then loads it.
> > >
> > > And crucially, it is not just about the filepath, the function
> > > hashes the file contents as well. That ensures that the output
> > > path is always different if the source file changes.
> > I think relying on such implementation details is perhaps permitted
> > if it's inside of Emacs itself, but even then it clashes with our
> > expectation that Emacs be graftable.
>
> OK, so if passing "file.el" to =comp-el-to-eln-filename= produces a
> path with a hash of the content, like:
>
>     .../eln-cache/29.4-HASH/module/file-HASH-BASED-ON-CONTENT.eln
>
> this would make Emacs non-graftable.  Because the hash would change
> after file.el is upgraded.  I suppose that makes sense, thanks!
>
> Maybe, as a hack, the graft process could symlink the pre-graft
> filename to the post-graft filename... so we don't have to alter the
> behavior of =comp-el-to-eln-filename=.
The grafting process doesn't look that deep into file names. On paper,
that could work, since the length of the name is preserved, but it'd
also make the grafting process take longer and not really add all that
much to its reliability.

Toggle quote (7 lines)
> But yea... it's less dev effort to just not pre-compile any .eln
> files.
>
> Out of curiosity: if Guix does no JIT compilation at all anyway, why
> does it not let Emacs do it the usual way into ~/.emacs.d/eln-cache,
> post-installation?  (Aside from the fact it would necessitate
> reverting the behavior of =comp-el-to-eln-filename=.)
There is only one `comp-el-to-eln-filename' to call, you can't (without
a much more ugly hack) have two strategies, and even if you did have
them, it'd be even more error-prone.

Toggle quote (15 lines)
> Actually... (with reservation that I may be wasting your time) I'm
> starting to wonder why the filename needs to stay the same for Emacs
> to be graftable?  Realistically, if all Emacs packages get upgraded,
> and if =comp-el-to-eln-filename= works like upstream, we have some
> paths like
>
>     /gnu/store/emacs-29.4/bin/emacs
>     /gnu/store/emacs-magit-OLD/lisp/magit.el
>     /gnu/store/emacs-magit-NEW/lisp/magit.el
>     .../wherever/magit-OLDHASH.eln
>     .../wherever/magit-NEWHASH.eln
>
> Then we start Emacs, and run its =comp-el-to-eln-filename= on the new
> magit.el, it would correctly return .../wherever/magit-NEWHASH.eln.
> No need for static filenames to swap out.
I haven't linked at the shared objects themselves, but I have strong
hunch that they use dynamic linking between the libraries. Since we
only update the /gnu/store/emacs-whatever-HERE-WE-HAVE-A-HASH portion
and nothing post that, we get file names that don't exist :)

If you dig into the patch series, it references the bug it fixes.

Toggle quote (10 lines)
> […]
>
> No, grafting is about avoiding a world re-compile, right?  So like if
> package-A was built with an old dependency package-B, it does not get
> re-built when package-B is upgraded. But does grafting actually ever
> change anything in Emacs?  When you start emacs and it loads package-
> A which in turn loads package-B because there's a =require= statement
> for package-B, Emacs will actually load the fresh package-B from
> /gnu/store/emacs-package-B, won't it?  So grafting would seem to be
> meaningless.
For example, GTK is a package deep in the world, that gets grafted a
lot and causes Emacs to be grafted as well.

Toggle quote (11 lines)
> Not sure how to feel if it's like this instead: a compiled function
> from package A, containing a call to some function "package-B-
> frobnicate", will actually call some bytecode originating from
> /gnu/store/emacs-package-A/share/emacs/site-lisp/package-B.elc?
> That's... not what happens, right?  OK, probably yes if it was a
> defsubst or defmacro.
>
> So we can have package A using a macro that was defined in an old
> version of package-B.  I like that not, but the reason I ask is I'm
> trying to find out where it is that the file path would influence the
> graft.
I haven't looked in the ABI stability of Emacs Lisp packages, but most
of them would actually not need a graft. It's Emacs itself that needs
the grafting :)

Toggle quote (20 lines)
> > The way our load paths are set up, it is actually the opposite
> > (which still is a bug, just not the one reported).  While `guix
> > upgrade` or a command to the similar effect will swap out the .eln
> > under the hood, the `.el` and `.elc` files stay stable – remember
> > what I wrote in the previous message about that having caused
> > issues with byte compilation?
>
> I confess I'm puzzled.  Does this just apply to a case like the
> following?
>
> - Emacs itself is upgraded
> - A package emacs-magit is NOT upgraded
>
> so that
>
> - magit.eln is rebuilt
> - magit.el and magit.elc stay the same
>
> That would make sense.  Do you mean it's a bug because .elc files are
> also not portable between different Emacsen?
We had issues, where a major upgrade between Emacsen would cause
invalid bytecode to be loaded, yes. The issue in Guix is not so much
that a package is not upgrade, but Emacs being upgraded while Emacs is
running. You'd get a functioning Emacs once you exit the process and
start a new one, but that feels very dirty. (Imagine exiting Emacs,
because your package manager tells you so, ugh!)

Toggle quote (3 lines)
> Just checking: is it correct to expect that if you actually upgrade
> the emacs-magit package, then its .el, .elc and .eln are all
> upgraded?
Yes, the .el, .elc, and .eln files are built as *one* package. Any
package manager you use with Emacs should provide you with that
invariant.

Cheers
?
Your comment

Commenting via the web interface is currently disabled.

To comment on this conversation send an email to 73681@patchwise.org

To respond to this issue using the mumi CLI, first switch to it
mumi current 73681
Then, you may apply the latest patchset in this issue (with sign off)
mumi am -- -s
Or, compose a reply to this issue
mumi compose
Or, send patches to this issue
mumi send-email *.patch