Login fail after core-update without reboot

  • Open
Details
3 participants
  • Ludovic Courtès
  • Maxim Cournoyer
  • Pierre-Antoine Rouby
Owner
unassigned
Submitted by
Pierre-Antoine Rouby
Severity
important

Pierre-Antoine Rouby wrote on 17 Jul 2018 01:30
(address . bug-guix@gnu.org)
655514906.8428771.1531816208634.JavaMail.zimbra@inria.fr
Hi Guix,

I found a problem with 'guix reconfigure' and core-update. After a
reconfigure it is impossible to log in on a tty: 'login' segfaults
with this error:

----------------------------------------------------------------------
login[30083]: segfault at 968 ip 00007f6ae6168ec8 sp 00007ffc7bd0f420 error 4 in libpthread-2.27.so[7f6ae6163000+19000]
----------------------------------------------------------------------

I think login tries to use glibc-2.27 while it is still configured to use
glibc-2.26. It's possible this issue comes from '/etc/pam.d/login'.

I had to reboot my system.

gdb trace:
----------------------------------------------------------------------
process 24717 is executing new program: /gnu/store/31qbd404pmlm5bmb0l0r147mnjxzpq3y-shadow-4.6/bin/login
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007fb95cabaec8 in __pthread_initialize_minimal_internal () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
(gdb) bt
#0 0x00007fb95cabaec8 in __pthread_initialize_minimal_internal () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
#1 0x00007fb95caba621 in _init () from /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
#2 0x00007fb95d8dcaa0 in ?? () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_env.so
#3 0x00007fb95eb1f33a in call_init.part () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#4 0x00007fb95eb1f4f5 in _dl_init () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#5 0x00007fb95eb23980 in dl_open_worker () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#6 0x00007fb95e058901 in _dl_catch_error () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
#7 0x00007fb95eb23127 in _dl_open () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
#8 0x00007fb95e4f9f96 in dlopen_doit () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#9 0x00007fb95e058901 in _dl_catch_error () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
#10 0x00007fb95e4fa5a9 in _dlerror_run () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#11 0x00007fb95e4fa021 in dlopen@@GLIBC_2.2.5 () from /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
#12 0x00007fb95e701f4d in _pam_load_module () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#13 0x00007fb95e7025d9 in _pam_add_handler () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#14 0x00007fb95e702cd6 in _pam_parse_conf_file () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#15 0x00007fb95e7033d7 in _pam_init_handlers () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#16 0x00007fb95e704bc1 in pam_start () from /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
#17 0x0000000000402f2c in main ()
----------------------------------------------------------------------

----------------------------------------------------------------------
(gdb) info sharedlibrary
From To Syms Read Shared Object Library
0x00007fb95eb10cc0 0x00007fb95eb2c990 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/ld-linux-x86-64.so.2
No linux-vdso.so.1
0x00007fb95e90d1b0 0x00007fb95e90de59 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam_misc.so.0
0x00007fb95e6ff9d0 0x00007fb95e706ae5 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/libpam.so.0
0x00007fb95e4f9de0 0x00007fb95e4faa27 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libdl.so.2
0x00007fb95e2e4a40 0x00007fb95e2f4775 Yes (*) /gnu/store/2ifmksc425qcysl5rkxkbv6yrgc1w9cs-gcc-5.5.0-lib/lib/libgcc_s.so.1
0x00007fb95df50750 0x00007fb95e088fac Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libc.so.6
0x00007fb95dd1a6a0 0x00007fb95dd20af8 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_unix.so
0x00007fb95dae0ba0 0x00007fb95dae5f47 Yes (*) /gnu/store/4sqaib7c2dfjv62ivrg9b8wa7bh226la-glibc-2.26.105-g0890d5379c/lib/libcrypt.so.1
0x00007fb95cabab40 0x00007fb95cac8657 Yes (*) /gnu/store/l4lr0f5cjd0nbsaaf8b5dmcw1a1yypr3-glibc-2.27/lib/libpthread.so.0
0x00007fb95d8dcf30 0x00007fb95d8de421 Yes (*) /gnu/store/gwyb3679v49ljisgkvzay2xa3njgq4ii-linux-pam-1.3.0/lib/security/pam_env.so
(*): Shared library is missing debugging information.
----------------------------------------------------------------------
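
The mixed-libc situation visible in the listing above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it scans `info sharedlibrary` output for glibc store paths and reports whether more than one glibc version is mapped into the process. The function names and the regular expression are assumptions made for illustration.

```python
import re

# Matches Guix store paths of glibc, such as
# /gnu/store/<hash>-glibc-2.27/lib/libpthread.so.0
# (pattern chosen for this sketch; it is not taken from Guix itself).
GLIBC_PATH = re.compile(r"/gnu/store/[0-9a-z]+-glibc-([^/\s]+)/lib/(\S+)")


def glibc_versions(sharedlibrary_output):
    """Map each glibc version to the sorted list of libraries loaded from it."""
    versions = {}
    for match in GLIBC_PATH.finditer(sharedlibrary_output):
        version, library = match.groups()
        versions.setdefault(version, set()).add(library)
    return {version: sorted(libs) for version, libs in versions.items()}


def is_mixed(sharedlibrary_output):
    """Return True when libraries from more than one glibc are mapped."""
    return len(glibc_versions(sharedlibrary_output)) > 1
```

Fed the listing above, `glibc_versions` would group libpthread.so.0 under 2.27 and the remaining glibc libraries under 2.26.105-g0890d5379c, which matches the diagnosis that two libcs ended up in the same process.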

--
Pierre-Antoine Rouby
Ludovic Courtès wrote on 23 Jul 2018 06:17
(name . Pierre-Antoine Rouby)(address . pierre-antoine.rouby@inria.fr)(address . 32182@debbugs.gnu.org)
87r2juqegc.fsf@gnu.org
Hello!

Pierre-Antoine Rouby <pierre-antoine.rouby@inria.fr> skribis:

> I think login tries to use glibc-2.27 while it is still configured to use
> glibc-2.26. It's possible this issue comes from '/etc/pam.d/login'.

Indeed. The problem here is that ‘reconfigure’ updates /etc/pam.d, but
does not change the service definition of ‘login’, etc. Thus, when
‘login’ restarts, it reads the new /etc/pam.d/login, which contains a
line like:

session required /gnu/store/…-elogind-232.4/lib/security/pam_elogind.so

Consequently, the running ‘login’, which is still linked against the old
libc, dlopens pam_elogind.so, which is linked against the new libc;
mixing the two libcs in one process eventually causes it to crash.
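
To see which modules a given service file would make ‘login’ dlopen, one can list the absolute module paths it names. This is a rough Python sketch, assuming each rule follows the usual `type control module-path arguments` layout; `pam_module_paths` is a hypothetical helper, and comment handling is deliberately simplistic.

```python
import re

# A rule is "type control module-path [arguments]". The control field is
# either a single word or a bracketed list such as
# "[success=1 default=ignore]". A leading "-" on the type is allowed.
PAM_RULE = re.compile(r"^\s*-?(\w+)\s+(\[[^\]]*\]|\S+)\s+(\S+)")


def pam_module_paths(config_text):
    """Return the absolute module paths named in a pam.d service file."""
    paths = []
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # drop comments (simplified)
        match = PAM_RULE.match(line)
        if match and match.group(3).startswith("/"):
            paths.append(match.group(3))
    return paths


if __name__ == "__main__":
    # The line quoted above, with its store hash still elided.
    example = "session required /gnu/store/…-elogind-232.4/lib/security/pam_elogind.so\n"
    print(pam_module_paths(example))
```

Each returned path can then be passed to `ldd` to see which libc it is linked against.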

It’s a real issue on headless servers because you could lock yourself
out (‘sshd’ could have the same problem).

I can think of several solutions:

1. Arrange for services to refer to /gnu/store/…-pam.d instead of
/etc/pam.d. This can maybe be achieved by modifying PAM such that
these applications honor $PAM_DIRECTORY or something like that.

2. Add support for “service chain-loading” in the Shepherd and/or
GuixSD. The idea is that, for services that cannot be restarted
right away because they are currently running, register code to
upgrade the service next time it is restarted (see
https://bugs.gnu.org/30706). That way, when ‘login’ restarts
after ‘reconfigure’, it’s the new ‘login’ service that would be
restarted.
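
Solution 1 amounts to a lookup like the one below. This is only an illustrative Python sketch: `PAM_DIRECTORY` is the hypothetical variable floated above, which this sketch does not assume Linux-PAM supports, and `pam_config_path` is an invented helper name.

```python
import os

DEFAULT_PAM_DIRECTORY = "/etc/pam.d"


def pam_config_path(service, environ=None):
    """Pick the configuration file for SERVICE.

    PAM_DIRECTORY is a hypothetical override: when set, the service reads
    its rules from that directory (for example a store item) instead of
    the mutable /etc/pam.d.
    """
    if environ is None:
        environ = os.environ
    directory = environ.get("PAM_DIRECTORY") or DEFAULT_PAM_DIRECTORY
    return os.path.join(directory, service)
```

With such an override, a long-running service would keep reading the PAM configuration it was started with, so a later ‘reconfigure’ could not point it at modules built against a different libc.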

Thoughts?

Ludo’.
Ludovic Courtès wrote on 8 Sep 2018 14:05
control message for bug #32182
(address . control@debbugs.gnu.org)
87h8izem3o.fsf@gnu.org
severity 32182 important
Ludovic Courtès wrote on 2 May 2020 07:37
Re: bug#32182: Login fail after core-update without reboot
(name . Pierre-Antoine Rouby)(address . pierre-antoine.rouby@inria.fr)(address . 32182@debbugs.gnu.org)
87mu6q5rnn.fsf@gnu.org
Hi, old bug! :-)

ludo@gnu.org (Ludovic Courtès) skribis:

> I can think of several solutions:
>
> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
> /etc/pam.d. This can maybe be achieved by modifying PAM such that
> these applications honor $PAM_DIRECTORY or something like that.

We should look into that.

> 2. Add support for “service chain-loading” in the Shepherd and/or
> GuixSD. The idea is that, for services that cannot be restarted
> right away because they are currently running, register code to
> upgrade the service next time it is restarted (see
> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
> after ‘reconfigure’, it’s the new ‘login’ service that would be
> restarted.

That bit was implemented long ago with Shepherd service replacements.
So at least, now, one can run ‘herd start term-tty1’ or similar to get a
working login:


Ludo’.
Maxim Cournoyer wrote on 16 Dec 2021 07:56
(name . Ludovic Courtès)(address . ludo@gnu.org)
878rwksfvt.fsf@gmail.com
Hello,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

>> I can think of several solutions:
>>
>> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
>> /etc/pam.d. This can maybe be achieved by modifying PAM such that
>> these applications honor $PAM_DIRECTORY or something like that.
>>
>> 2. Add support for “service chain-loading” in the Shepherd and/or
>> GuixSD. The idea is that, for services that cannot be restarted
>> right away because they are currently running, register code to
>> upgrade the service next time it is restarted (see
>> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
>> after ‘reconfigure’, it’s the new ‘login’ service that would be
>> restarted.
>
> That bit was implemented long ago with Shepherd service replacements.
> So at least, now, one can run ‘herd start term-tty1’ or similar to get a
> working login:

Point 2 doesn't seem to help in working around or fixing the related
#52533 though, correct? Restarting the remote elogind or even
ssh-daemon doesn't work there, perhaps because 'guix deploy' wasn't able
to complete in the first place.

I guess that means we should look into fixing point 1., as you already
suggested. On top of that I'd propose disabling PAM unless there's a
good reason to have it on by default; as I wrote in the other issue,
`man sshd_config' documents that by default in OpenSSH it is disabled.

Thanks,

Maxim
Maxim Cournoyer wrote on 16 Dec 2021 08:15
(name . Ludovic Courtès)(address . ludo@gnu.org)
875yrosez3.fsf@gmail.com
Hi again,

Maxim Cournoyer <maxim.cournoyer@gmail.com> writes:

> Hello,
>
> Ludovic Courtès <ludo@gnu.org> writes:
>
> [...]
>
>>> I can think of several solutions:
>>>
>>> 1. Arrange for services to refer to /gnu/store/…-pam.d instead of
>>> /etc/pam.d. This can maybe be achieved by modifying PAM such that
>>> these applications honor $PAM_DIRECTORY or something like that.
>>>
>>> 2. Add support for “service chain-loading” in the Shepherd and/or
>>> GuixSD. The idea is that, for services that cannot be restarted
>>> right away because they are currently running, register code to
>>> upgrade the service next time it is restarted (see
>>> <https://bugs.gnu.org/30706>). That way, when ‘login’ restarts
>>> after ‘reconfigure’, it’s the new ‘login’ service that would be
>>> restarted.
>>
>> That bit was implemented long ago with Shepherd service replacements.
>> So at least, now, one can run ‘herd start term-tty1’ or similar to get a
>> working login:
>
> Point 2 doesn't seem to help in working around or fixing the related
> #52533 though, correct? Restarting the remote elogind or even
> ssh-daemon doesn't work there, perhaps because 'guix deploy' wasn't able
> to complete in the first place.

Another bit that probably played a role here: the above failure to
complete is perhaps caused or made worse by #41238 (guix deploy close ssh
session after each store items sent), which doesn't reuse the same
stable SSH session to do the whole of what it needs to do.

Maxim