Hurd VM fails to boot on AMD EPYC (kvm-amd)

Done

Details

2 participants

Ludovic Courtès
Samuel Thibault

Owner: unassigned

Submitted by: Ludovic Courtès

Severity: normal

Debbugs page

Ludovic Courtès wrote on 5 Oct 2022 14:01

Recipients:(address . bug-guix@gnu.org)

Message-ID:87k05eouh8.fsf@inria.fr

Hello!

On AMD EPYC processors, as found on the build nodes of ci.guix.gnu.org,

childhurd VMs fail to boot when running with ‘qemu-system-i386

-enable-kvm’ (the kvm-amd Linux kernel module is used), with the Hurd

startup process hanging before /hurd/exec has been started:

Toggle snippet (18 lines)module 0: ext2fs --multiboot-command-line=${kernel-command-line} --host-priv-por                    
t=${host-port} --device-master-port=${device-port} --exec-server-task=${exec-tas                    
k} --store-type=typed --x-xattr-translator-records ${root} $(task-create) $(task                    
-resume)                                                                                            
module 1: exec /gnu/store/99sqiayswrxxb80331pl7jxin18wv28b-hurd-0.9-1.91a5167/hu                    
rd/exec $(exec-task=task-create)                                                                    
2 multiboot modules                                                                                 
task loaded: ext2fs --multiboot-command-line=root=device:hd0s1 root=3367134b-cfb                    
d-1e90-2f38-dfd13367134b gnu.system=/gnu/store/m66ccpdzdbcd3k2fdvyaj8cgmk23lybn-                    
system gnu.load=/gnu/store/m66ccpdzdbcd3k2fdvyaj8cgmk23lybn-system/boot --host-p                    
riv-port=1 --device-master-port=2 --exec-server-task=3 --store-type=typed --x-xa                    
ttr-translator-records device:hd0s1                                                                 
task loaded: exec /gnu/store/99sqiayswrxxb80331pl7jxin18wv28b-hurd-0.9-1.91a5167                    
/hurd/exec                                                                                          
                                                                                                    
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] exec                                      

Kdb shows these two tasks:

Toggle snippet (28 lines)Stopped   at  mach                    
ine_idle+0x26:  nop                                                                                 
machine_idle(0,c102f380,f5f7bcd0,c102e3fa)+0x26                                                     
idle_thread_continue(f5f7ade0,36366000,f5f73868,f5f73890,0)+0x73                                    
>>>>> user space <<<<<                                                                              
db> show all threads                                                                                
    TASK        THREADS                                                                             
  0 gnumach (f5f7cf00): 7 threads:                                                                  
              0 (f5f7be18) .W..N. 0xc11dac04                                                        
              1 (f5f7bcd0) R.....                                                                   
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4                                
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c                                
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0                                          
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74                               
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8                                   
  1 ext2fs (f5f7ce40): 6 threads:                                                                   
              0 (f5f7b520) .W.O.F(mach_msg_continue) 0                                              
              1 (f5f7b290) .W.O.F(mach_msg_receive_continue) 0                                      
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0                                      
              3 (f5f7b000) .W.O..(mach_msg_continue) 0                                              
              4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0                                      
              5 (f67d3cd8) .W.O..(mach_msg_continue) 0                                              
db> trace/t 0xf5f7b520                                                                              
Continuation mach_msg_continue                                                                      
>>>>> user space <<<<<                                                                              
0x80ccaec()                                                                                         

For ext2fs.static, that just means thread 0 is here:

Toggle snippet (4 lines)

$ addr2line -e /gnu/store/99sqiayswrxxb80331pl7jxin18wv28b-hurd-0.9-1.91a5167/hurd/ext2fs.static 0x80ccaec

/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/mach/mach_msg_trap.S:2

That doesn’t tell us much.

The same image boots fine on the same CPU without ‘-enable-kvm’.

However, keeping ‘-enable-kvm’ and adding ‘--cpu pentium’ and other

variants of this option don’t make any difference, AFAICS.

Ideas on how to debug this further, and/or ways to work around it

without giving up on KVM?

Thanks,

Ludo’.

Ludovic Courtès wrote on 6 Oct 2022 06:14

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:8735c1nlga.fsf@gnu.org

Hi!

As suggested by Samuel on IRC, I did that early on in kdb:

debug traps /on

such that it would stop on each trap, hopefully allowing me to see why

exec is not starting.

Toggle snippet (42 lines)module 0: ext2fs --multiboot-command-line=${kernel-command-line} --host-priv-por                    
t=${host-port} --device-master-port=${device-port} --exec-server-task=${exec-tas                    
k} --store-type=typed --x-xattr-translator-records ${root} $(task-create) $(task                    
-resume)                                                                                            
module 1: exec /gnu/store/99sqiayswrxxb80331pl7jxin18wv28b-hurd-0.9-1.91a5167/hu                    
rd/exec $(exec-task=task-create)                                                                    
2 multiboot modules                                                                                 
task loaded: ext2fs --multiboot-command-line=root=device:hd0s1 root=3367134b-cfb                    
d-1e90-2f38-dfd13367134b gnu.system=/gnu/store/m66ccpdzdbcd3k2fdvyaj8cgmk23lybn-                    
system gnu.load=/gnu/store/m66ccpdzdbcd3k2fdvyaj8cgmk23lybn-system/boot --host-p                    
riv-port=1 --device-master-port=2 --exec-server-task=3 --store-type=typed --x-xa                    
ttr-translator-records device:hd0s1                                                                 
task loaded: exec /gnu/store/99sqiayswrxxb80331pl7jxin18wv28b-hurd-0.9-1.91a5167                    
/hurd/exec                                                                                          
                                                                                                    
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] execkernel: Page fault                    
 (14), code=6                                                                                       
Stopped at  0x1000:     pushl   0x4(%ebx)                                                           
>>>>> user space <<<<<                                                                              
0x1000(bfffff24,0,0,1160b,0)                                                                        
0x11627(bfffff9c,0,0,0,2)                                                                           
0x11bb()                                                                                            
db> show all threads                                                                                
    TASK        THREADS                                                                             
  0 gnumach (f5f7cf00): 7 threads:                                                                  
              0 (f5f7be18) .W..N. 0xc11dac04                                                        
              1 (f5f7bcd0) R..O..(idle_thread_continue)                                             
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4                                
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c                                
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0                                          
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74                               
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8                                   
  1 ext2fs (f5f7ce40): 6 threads:                                                                   
              0 (f5f7b520) .W.O.F(mach_msg_continue) 0                                              
              1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0                                      
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0                                      
              3 (f5f7b000) .W.O..(mach_msg_continue) 0                                              
              4 (f67d4e20) .W.O..(mach_msg_receive_continue) 0                                      
              5 (f67d4cd8) .W.O..(mach_msg_continue) 0                                              
  2 exec (f5f7cd80): (f5f7b3d8) R.....                                                              

Then lots of page faults with the same stack trace, seemingly endlessly:

Toggle snippet (9 lines)

db> c

kernel: Page fault (14), code=6

Stopped at 0x1000: pushl 0x4(%ebx)

>>>>> user space <<<<<

0x1000(bfffff24,0,0,1160b,0)

0x11627(bfffff9c,0,0,0,2)

0x11bb()

When I “debug traps /off” and continue, the startup process hangs as

normal, and at that point ‘show all threads’ no longer shows exec.

On a “working” VM, with traps enabled early on in the same way, I don’t

see any page fault until after exec, proc, auth, etc. have been started.

Thoughts?

Ludo’.

Samuel Thibault wrote on 6 Oct 2022 06:53

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20221006135316.ijevz5ddnet4aqkr@begin

Ludovic Courtès, le jeu. 06 oct. 2022 15:14:13 +0200, a ecrit:

Toggle quote (3 lines)

> such that it would stop on each trap, hopefully allowing me to see why

> exec is not starting.

Also, better use exec.static to have static addresses.

We use the dynamic version of exec "just because we can", but that makes

debugging difficult.

Samuel

Ludovic Courtès wrote on 6 Oct 2022 15:10

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:87r0zkfvso.fsf@gnu.org

Samuel Thibault <samuel.thibault@gnu.org> skribis:

Toggle quote (6 lines)

> Ludovic Courtès, le jeu. 06 oct. 2022 15:14:13 +0200, a ecrit:

>> such that it would stop on each trap, hopefully allowing me to see why

>> exec is not starting.

> Also, better use exec.static to have static addresses.

Thanks for the hint.

Of course, the thing boots just fine on that machine when using

‘exec.static’.

So the issue might be somewhere in ld.so, or triggered by ld.so.

Any debugging tricks here?

Ludo’.

Samuel Thibault wrote on 6 Oct 2022 15:42

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20221006224219.mn7zp7lhzxwlyrpx@begin

Ludovic Courtès, le ven. 07 oct. 2022 00:10:15 +0200, a ecrit:

Toggle quote (13 lines)> Samuel Thibault <samuel.thibault@gnu.org> skribis:
> 
> > Ludovic Courtès, le jeu. 06 oct. 2022 15:14:13 +0200, a ecrit:
> >> such that it would stop on each trap, hopefully allowing me to see why
> >> exec is not starting.
> >
> > Also, better use exec.static to have static addresses.
> 
> Thanks for the hint.
> 
> Of course, the thing boots just fine on that machine when using
> ‘exec.static’.

Uh. At least you have a workaround :)

Toggle quote (3 lines)

> So the issue might be somewhere in ld.so, or triggered by ld.so.

> Any debugging tricks here?

maybe for a start check with

show map $map2

what is actually mapped, whether it's just ld.so, or also exec, etc.

Perhaps you can also try to run other programs than exec, like small

dumb programs.

Samuel

Ludovic Courtès wrote on 7 Oct 2022 01:24

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:8735c0f3d5.fsf@gnu.org

Hi!

Samuel Thibault <samuel.thibault@gnu.org> skribis:

Toggle quote (2 lines)

> Ludovic Courtès, le ven. 07 oct. 2022 00:10:15 +0200, a ecrit:

[...]

Toggle quote (5 lines)

>> Of course, the thing boots just fine on that machine when using

>> ‘exec.static’.

> Uh. At least you have a workaround :)

Yup. :-)

Toggle quote (9 lines)

>> So the issue might be somewhere in ld.so, or triggered by ld.so.

>> Any debugging tricks here?

> maybe for a start check with

> show map $map2

> what is actually mapped, whether it's just ld.so, or also exec, etc.

On the first trap (page fault) I see:

Toggle snippet (46 lines)db> show all tasks                                                                                              
 ID TASK     NAME [THREADS]                                                                                     
  0 f5f7cf00 gnumach [7]                                                                                        
  1 f5f7ce40 ext2fs [6]                                                                                         
  2 f5f7cd80 exec [1]                                                                                           
db> show map $map2                                                                                              
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5                                                   
size=290816,resident:290816,wired=0                                                                             
version=14                                                                                                      
 map entry 0xf625ec08: start=0x0, end=0x1000                                                                    
 prot=1/7/copy, object=0xf5f6a7d0, offset=0x0                                                                   
  Object 0xf5f6a7d0: size=0x1000, 1 references                                                                  
  1 resident pages, 0 absent pages, 0 paging ops                                                                
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5938968                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             
 map entry 0xf625ebb0: start=0x1000, end=0x26000                                                                
 prot=5/7/copy, object=0xf5f6ad70, offset=0x0                                                                   
  Object 0xf5f6ad70: size=0x25000, 1 references                                                                 
  37 resident pages, 0 absent pages, 0 paging ops                                                               
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             
 map entry 0xf625eb58: start=0x26000, end=0x34000                                                               
 prot=1/7/copy, object=0xf5f6ad20, offset=0x0                                                                   
  Object 0xf5f6ad20: size=0xe000, 1 references                                                                  
  14 resident pages, 0 absent pages, 0 paging ops                                                               
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             
 map entry 0xf625eb00: start=0x34000, end=0x37000                                                               
 prot=3/7/copy, object=0xf5f6acd0, offset=0x0                                                                   
  Object 0xf5f6acd0: size=0x3000, 1 references                                                                  
  3 resident pages, 0 absent pages, 0 paging ops                                                                
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             
 map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000                                                         
 prot=3/7/copy, object=0xf5f6ac80, offset=0x0                                                                   
  Object 0xf5f6ac80: size=0x10000, 1 references                                                                 
  16 resident pages, 0 absent pages, 0 paging ops                                                               
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             

The mappings appear to match the PT_LOAD sections of ld.so:

Toggle snippet (19 lines)

$ objdump -x /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1|head -16

/gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1: file format elf32-i386

/gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1

architecture: i386, flags 0x00000150:

HAS_SYMS, DYNAMIC, D_PAGED

start address 0x000011b0

Program Header:

LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12

filesz 0x00000dd8 memsz 0x00000dd8 flags r--

LOAD off 0x00001000 vaddr 0x00001000 paddr 0x00001000 align 2**12

filesz 0x000244a1 memsz 0x000244a1 flags r-x

LOAD off 0x00026000 vaddr 0x00026000 paddr 0x00026000 align 2**12

filesz 0x0000d5e8 memsz 0x0000d5e8 flags r--

LOAD off 0x00033f60 vaddr 0x00034f60 paddr 0x00034f60 align 2**12

filesz 0x00001910 memsz 0x00001a6c flags rw-

… so ‘exec_load’ is doing its job, it seems.

I also tried to set up a breakpoint on ‘task_terminate’ to see what’s

going on when the exec task vanishes:

Toggle snippet (65 lines)module 0: ext2fs --multiboot-command-line=${kernel-command-line} --host-priv-por                                
t=${host-port} --device-master-port=${device-port} --exec-server-task=${exec-tas                                
k} --store-type=typed --x-xattr-translator-records ${root} $(task-create) $(task                                                               -resume)                                                                                                        
module 1: exec /gnu/store/0na7ic689glydngxyb7pazjixz9b6629-hurd-0.9-1.91a5167/hu                                
rd/exec $(exec-task=task-create)                                                                                
2 multiboot modules                                                                                             
task loaded: ext2fs --multiboot-command-line=root=device:hd0s1 root=3367134b-cf6                                
6-e06b-ee4b-4b933367134b gnu.system=/gnu/store/g7kssjfrxqjpbr6r31idiasgglfph2y5-                                
system gnu.load=/gnu/store/g7kssjfrxqjpbr6r31idiasgglfph2y5-system/boot --host-p                                
riv-port=1 --device-master-port=2 --exec-server-task=3 --store-type=typed --x-xa                                
ttr-translator-records device:hd0s1                                                                             
task loaded: exec /gnu/store/0na7ic689glydngxyb7pazjixz9b6629-hurd-0.9-1.91a5167                                
/hurd/exec                                                                                                      
                                                                                                                
start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1] execKernel Breakpoint                                 
trap, eip 0xc10305c1                                                                                            
Breakpoint at  task_terminate:  pushl   %ebp                                                                    
db> show all threads                                                                                            
    TASK        THREADS                                                                                         
  0 gnumach (f5f7cf00): 7 threads:                                                                              
              0 (f5f7be18) .W..N. 0xc11dac04                                                                    
              1 (f5f7bcd0) R..O..(idle_thread_continue)                                                         
              2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4                                            
              3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c                                            
              4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0                                                      
              5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74                                           
              6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8                                               
  1 ext2fs (f5f7ce40): 6 threads:                                                                               
              0 (f5f7b520) .W.O.F(mach_msg_continue) 0                                                          
              1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0                                                  
              2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0                                                  
              3 (f5f7b000) .W.O..(mach_msg_continue) 0                                                          
              4 (f67d4e20) .W.O..(mach_msg_receive_continue) 0                                                  
              5 (f67d4cd8) .W.O..(mach_msg_continue) 0                                                          
  2 exec (f5f7cd80): (f5f7b3d8) R.....                                                                          
db> trace/t 0xf5f7b3d8                                                                                          
task_terminate(f625eb10,0,f5f7cd80,f5f7b3d8,c11da940)                                                           
exception_try_task(1,1,bffefffc,ffffffff,c1202b4c)+0x58                                                         
exception(1,1,bffefffc,c10096da,f5957fbc)+0x7a                                                                  
interrupted_pc(1,1,bffefffc,c102ce99,c1202b40)                                                                  
trap_name(1,f5957f80,f5f73f4c,f5f73f58)                                                                         
vm_fault(f5f6ff30,bffef000,3,0,0,c1008ee4,f5f82550,fb7d9000)+0x74a                                              
user_trap(f5f7a718)+0x2df                                                                                       
>>>>> Page fault (14) at 0x1000 <<<<<                                                                           
>>>>> user space <<<<<                                                                                          
db> show map $map2                                                                                              
Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5                                                   
size=290816,resident:290816,wired=0                                                                             
version=14                                                                                                      
 map entry 0xf625ec08: start=0x0, end=0x1000                                                                    
 prot=1/7/copy, object=0xf5f6a7d0, offset=0x0                                                                   
  Object 0xf5f6a7d0: size=0x1000, 1 references                                                                  
  1 resident pages, 0 absent pages, 0 paging ops                                                                
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5938968                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             
 map entry 0xf625ebb0: start=0x1000, end=0x26000                                                                
 prot=5/7/copy, object=0xf5f6ad70, offset=0x0                                                                   
  Object 0xf5f6ad70: size=0x25000, 1 references                                                                 
  37 resident pages, 0 absent pages, 0 paging ops                                                               
   memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780                                                  
   uninitialized,temporary      internal,copy_strategy=0                                                        
   shadow=0x0 (offset=0x0),copy=0x0                                                                             

It says “page fault at 0x1000” but there is apparently a valid mapping

at that address.

Funny thing: if I set a breakpoint on ‘read_exec’ and continue each time

it’s hit, the ‘exec’ process starts just fine.

Could it be a synchronization issue somewhere?

Thanks,

Ludo’.

Samuel Thibault wrote on 7 Oct 2022 14:16

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20221007211643.bma6b5yfaj7a2d4i@begin

Attachment: file

Ludovic Courtès wrote on 8 Oct 2022 08:52

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:87zge671p0.fsf@gnu.org

Hi Samuel,

Samuel Thibault <samuel.thibault@gnu.org> skribis:

Toggle quote (10 lines)> About the backtrace:
>
>>>>>> user space <<<<<
> 0x1000(bfffff24,0,0,1160b,0)
> 0x11627(bfffff9c,0,0,0,2)
> 0x11bb()
>
> That is quite surprising actually: in my ld.so there is nothing useful
> at 0x1000. Perhaps you can check what 0x11627 is all about?

Sure:

Toggle snippet (6 lines)

$ addr2line -e /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x1000 0x11627 0x11bb

??:0

/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333

That’s ‘_dl_fatal_printf’ calling ‘_exit’; it’s trying to tell us

something.

I’ll try and rebuild the system with the debugging patches at

https://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00038.html, to

get early ld.so output, for lack of a better solution…

Toggle quote (7 lines)

>> Program Header:

>> LOAD off 0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12

>> filesz 0x00000dd8 memsz 0x00000dd8 flags r--

> We don't have this section in the Debian glibc. It'd probably be useful

> to know what this is about.

Address 0 is for the ‘_begin’ symbol, passed by -Wl,-defsym:

Toggle snippet (12 lines)i586-pc-gnu-gcc   -nostdlib -nostartfiles -r -o /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/librtld.os '-Wl,-(' /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/dl-allobjs.os /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/rtld-libc.a -lgcc '-Wl,-)' \
	  -Wl,-Map,/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/librtld.os.map
i586-pc-gnu-gcc   -nostdlib -nostartfiles -shared -o /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/ld.so.new		\
	  -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both -Wl,-z,defs 	\
	  /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/librtld.os -Wl,--version-script=/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/ld.map		\
	  -Wl,-soname=ld.so.1			\
	  -Wl,-defsym=_begin=0
i586-pc-gnu-readelf -s /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/ld.so.new \
  | gawk '($7 ~ /^UND(|EF)$/ && $1 != "0:" && $4 != "REGISTER") { print; p=1 } END { exit p != 0 }'
mv -f /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/ld.so.new /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/build/elf/ld.so

And indeed:

Toggle snippet (4 lines)

$ objdump -t /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1|grep _begin

00000000 l *ABS* 00000000 _begin

That ‘-Wl,-defsym=_begin=0’ flag was removed in glibc commit

6f043e0ee7e477f50a44024ed0cb579d5e3f511d (April 2022).

On darnassus it’s different but then it’s Debian’s glibc 2.35, natively

built, so I don’t what conclusions can be drawn:

Toggle snippet (25 lines)ludo@darnassus:~$ /lib/ld.so.1 --version
ld.so (Debian GLIBC 2.35-1) stable release version 2.35.
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
ludo@darnassus:~$ objdump -x /lib/ld.so.1 |head -40

/lib/ld.so.1:     file format elf32-i386
/lib/ld.so.1
architecture: i386, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x0001cc40

Program Header:
    LOAD off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**12
         filesz 0x00038494 memsz 0x00038494 flags r-x
    LOAD off    0x00038c00 vaddr 0x00039c00 paddr 0x00039c00 align 2**12
         filesz 0x00001ca8 memsz 0x00001e34 flags rw-
 DYNAMIC off    0x00039f24 vaddr 0x0003af24 paddr 0x0003af24 align 2**2
         filesz 0x000000b8 memsz 0x000000b8 flags rw-
    NOTE off    0x00000114 vaddr 0x00000114 paddr 0x00000114 align 2**2
         filesz 0x00000024 memsz 0x00000024 flags r--

Thanks for your feedback!

Ludo’.

Ludovic Courtès wrote on 9 Oct 2022 09:09

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:8735bx6kt8.fsf@gnu.org

Hi!

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (13 lines)> $ addr2line -e  /gnu/store/m8afvcgwmrfhvjpd7b0xllk8vv5isd6j-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x1000 0x11627 0x11bb
> ??:0
> /tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333
> :?
>
>
> That’s ‘_dl_fatal_printf’ calling ‘_exit’; it’s trying to tell us
> something.
>
> I’ll try and rebuild the system with the debugging patches at
> <https://lists.gnu.org/archive/html/bug-hurd/2011-11/msg00038.html>, to
> get early ld.so output, for lack of a better solution…

I tried adapted the patches above and tried them, but it seems that

‘_dl_sysdep_start’ isn’t even reached. For example, I set a breakpoint

on ‘mach_task_self’ (called from ‘__mach_init’, called from

‘_dl_sysdep_start’), but that’s never reached (I’m assuming ‘break/tu’

is reliable, is it?).

The user-space backtrace upon trap remains unhelpful:

Toggle snippet (16 lines)start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap,                                                        
 eip 0xc1030d5b                                                                                                                         
Breakpoint at  task_resume:     pushl   %ebp                                                                                            
db> debug traps /on                                                                                                                     
db> b task_terminate                                                                                                                    
set breakpoint #2                                                                                                                       
db> c                                                                                                                                   
Kernel Debug trap trap, eip 0xc1030d5b                                                                                                  
 execkernel: Page fault (14), code=6                                                                                                    
Stopped at  0x1000:     pushl   0x4(%ebx)                                                                                               
>>>>> user space <<<<<                                                                                                                  
0x1000(bfffff24,0,0,1160b,0)                                                                                                            
0x11627(bfffff9c,0,0,0,2)                                                                                                               
0x11bb()                                                                                                                                

… where:

Toggle snippet (18 lines)

$ addr2line -e /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 0x11627 0x11bb

/tmp/guix-build-glibc-cross-i586-pc-gnu-2.33.drv-0/glibc-2.33/elf/dl-misc.c:333

$ objdump -S /gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1 --start-address=0x000011b0 |head -40

/gnu/store/4p1kab1c4h7h3kvgcm1hbjja4y5k9x4p-glibc-cross-i586-pc-gnu-2.33/lib/ld.so.1: file format elf32-i386

Disassembly of section .text:

000011b0 <_start>:

11b0: 89 e0 mov %esp,%eax

11b2: 83 ec 0c sub $0xc,%esp

11b5: 50 push %eax

11b6: e8 b5 0a 00 00 call 1c70 <_dl_start>

11bb: 83 c4 10 add $0x10,%esp

So it would seem that ‘_dl_start’ is called and somehow then a tail-call

to ‘_dl_fatal_printf’ is made.

Through a dichotomy I tried to see how far it goes. The info I have so

far is that ld.so errors out from elf/rtld.c:563 (line 565 is not

reached):

Toggle snippet (12 lines)558:  if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
   {
     /* Relocate ourselves so we can do normal function calls and
        data access using the global offset table.  */

     ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
   }
 bootstrap_map.l_relocated = 1;
...
 __rtld_malloc_init_stubs ();

It’s hard to be more precise because ELF_DYNAMIC_RELOCATE is a macro

that expands to quite a lot of code.

I don’t see the code path that would lead to a ‘_dl_fatal_printf’ call

though.

Ideas? :-)

Ludo’.

Samuel Thibault wrote on 9 Oct 2022 12:09

Recipients:(name . Ludovic Courtès)(address . ludo@gnu.org)

Message-ID:20221009190933.tdozwl6bjap6mrru@begin

Ludovic Courtès, le dim. 09 oct. 2022 18:09:07 +0200, a ecrit:

Toggle quote (3 lines)

> So it would seem that ‘_dl_start’ is called and somehow then a tail-call

> to ‘_dl_fatal_printf’ is made.

Perhaps you can build glibc without tail-call optimization?

(-fno-optimize-sibling-calls)

Samuel

Ludovic Courtès wrote on 10 Oct 2022 14:14

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:87pmezxty0.fsf@gnu.org

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (15 lines)> Through a dichotomy I tried to see how far it goes.  The info I have so
> far is that ld.so errors out from elf/rtld.c:563 (line 565 is not
> reached):
>
> 558:  if (bootstrap_map.l_addr || ! bootstrap_map.l_info[VALIDX(DT_GNU_PRELINKED)])
> 559:    {
> 560:      /* Relocate ourselves so we can do normal function calls and
> 561:         data access using the global offset table.  */
> 562:
> 563:      ELF_DYNAMIC_RELOCATE (&bootstrap_map, 0, 0, 0);
> 564:    }
> 565:  bootstrap_map.l_relocated = 1;
> ...
> 578:  __rtld_malloc_init_stubs ();

Via brute force¹, I found that ‘__assert_fail’ is hit, with its first

argument in $eax being:

Toggle snippet (6 lines)

db> x/c 0x28604,80

ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE\000\000map->l_in

fo[VERSYMIDX (DT_VERSYM)] != NULL\000\000Fatal glibc error: Too

many audit mo

This comes from i386/dl-machine.h:

Toggle snippet (11 lines)auto inline void
__attribute ((always_inline))
elf_machine_rel_relative (Elf32_Addr l_addr, const Elf32_Rel *reloc,
			  void *const reloc_addr_arg)
{
  Elf32_Addr *const reloc_addr = reloc_addr_arg;
  assert (ELF32_R_TYPE (reloc->r_info) == R_386_RELATIVE);
  *reloc_addr += l_addr;
}

How can we get there? Looking at ‘_dl_start’, it could be that

‘elf_machine_load_address’ returns a bogus value and we end up reading

wrong ELF data? Or it could be memory corruption somewhere. Or…?

Thing is, it’s not fully deterministic (happens 9 times out of 10 with

KVM, never happens without KVM).

Ideas? :-)

Ludo’.

¹ Building with ‘-fno-optimize-sibling-calls’ didn’t help get nicer

backtraces, but that’s prolly because all that early relocation code

is inlined.

Ludovic Courtès wrote on 17 Oct 2022 05:51

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:87czaqsjey.fsf@gnu.org

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (2 lines)

> … so ‘exec_load’ is doing its job, it seems.

Turns out that may not be the case.

Here’s a *bad* mapping on the second ‘task_resume’ breakpoint (when

‘exec’ is about to start):

Toggle snippet (53 lines)  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d3e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d3cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
   db> trace
  task_resume(f593e010,fb7d9010,f5f73e80,c106972a)
  ipc_kobject_server(f593e000,3,18,0)+0x1eb
  mach_msg_trap(bffff4c0,3,18,20,8)+0x1703
  >>>>> user space <<<<<
  db> x/tbx 0xcbc 0xf5f7b3d8

  no memory is assigned to address 00000cbc
  0
  db> show map $map2
    Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:225280,wired=0
  version=13
     map entry 0xf625ec08: start=0x0, end=0x1000
     prot=1/7/copy, object=0x0, offset=0x0
     map entry 0xf625ebb0: start=0x1000, end=0x26000
     prot=5/7/copy, object=0xf5f6ad70, offset=0x0
      Object 0xf5f6ad70: size=0x25000, 1 references
      37 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb58: start=0x26000, end=0x34000
     prot=1/7/copy, object=0xf5f6ad20, offset=0x0
      Object 0xf5f6ad20: size=0xe000, 1 references
      14 resident pages, 0 absent pages, 0 paging ops
       memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
       uninitialized,temporary    internal,copy_strategy=0
       shadow=0x0 (offset=0x0),copy=0x0
     map entry 0xf625eb00: start=0x34000, end=0x37000
     prot=3/7/copy, object=0xf5f6acd0, offset=0x0
      Object 0xf5f6acd0: size=0x3000, 1 references
      3 resident pages,--db_more--

Compare with what a “good” mapping looks like at that same moment:

Toggle snippet (64 lines)  start ext2fs: Hurd server bootstrap: ext2fs[device:hd0s1]Kernel Breakpoint trap,
   eip 0xc1030d5b
  Breakpoint at  task_resume:     pushl   %ebp
  db> show all threads
      TASK        THREADS
    0 gnumach (f5f7cf00): 7 threads:
                0 (f5f7be18) .W..N. 0xc11dac04
                1 (f5f7bcd0) R..O..(idle_thread_continue)
                2 (f5f7bb88) .W.ON.(reaper_thread_continue) 0xc12015d4
                3 (f5f7ba40) .W.ON.(swapin_thread_continue) 0xc11f8e2c
                4 (f5f7b8f8) .W.ON.(sched_thread_continue) 0
                5 (f5f7b7b0) .W.ON.(io_done_thread_continue) 0xc1201f74
                6 (f5f7b668) .W.ON.(net_thread_continue) 0xc11db0a8
    1 ext2fs (f5f7ce40): 6 threads:
                0 (f5f7b520) R....F
                1 (f5f7b290) .W.O..(mach_msg_receive_continue) 0
                2 (f5f7b148) .W.O..(mach_msg_receive_continue) 0
                3 (f5f7b000) .W.O..(mach_msg_continue) 0
                4 (f67d2e20) .W.O..(mach_msg_receive_continue) 0
                5 (f67d2cd8) .W.O..(mach_msg_continue) 0
    2 exec (f5f7cd80): (f5f7b3d8) ..SO..(thread_bootstrap_return)
  db> x/tbx 0xcbc 0xf5f7b3d8
                  8
  db> show map $map2
  Map 0xf5f6ff30: name="exec", pmap=0xf5f71fa8,ref=1,nentries=5
  size=290816,resident:229376,wired=0
  version=14
   map entry 0xf625ec08: start=0x0, end=0x1000
   prot=1/7/copy, object=0xf5f6ad70, offset=0x0
    Object 0xf5f6ad70: size=0x1000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82780
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625ebb0: start=0x1000, end=0x26000
   prot=5/7/copy, object=0xf5f6ad20, offset=0x0
    Object 0xf5f6ad20: size=0x25000, 1 references
    37 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82730
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb58: start=0x26000, end=0x34000
   prot=1/7/copy, object=0xf5f6acd0, offset=0x0
    Object 0xf5f6acd0: size=0xe000, 1 references
    14 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f826e0
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eb00: start=0x34000, end=0x37000
   prot=3/7/copy, object=0xf5f6ac80, offset=0x0
    Object 0xf5f6ac80: size=0x3000, 1 references
    3 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82690
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0
   map entry 0xf625eaa8: start=0xbfff0000, end=0xc0000000
   prot=3/7/copy, object=0xf5f6ac30, offset=0x0
    Object 0xf5f6ac30: size=0x10000, 1 references
    1 resident pages, 0 absent pages, 0 paging ops
     memory object=0x0 (offset=0x0),control=0x0, name=0xf5f82640
     uninitialized,temporary      internal,copy_strategy=0
     shadow=0x0 (offset=0x0),copy=0x0

Notice that 0xcbc reads a valid relocation, where 8 = R_386_RELATIVE.

In the “bad” case, the first map entry is empty, with no associated

memory object and zero resident pages.

My reading of ‘read_exec’ is that the page is supposed to be populated

eagerly by the ‘copyout’ call here:

Toggle snippet (26 lines)

static int

read_exec(void *handle, vm_offset_t file_ofs, vm_size_t file_size,

vm_offset_t mem_addr, vm_size_t mem_size,

exec_sectype_t sec_type)

{

struct multiboot_module *mod = handle;

[...]

err = vm_allocate(user_map, &start_page, end_page - start_page, FALSE);

assert(err == 0);

assert(start_page == trunc_page(mem_addr));

if (file_size > 0)

{

err = copyout((char *)phystokv (mod->mod_start) + file_ofs,

(void *)mem_addr, file_size);

assert(err == 0);

}

[...]

return 0;

}

There are interesting tricks in ‘copyout_retry’ to fake a page fault so

the copy can actually be made, IIUC.

Could it be that this bit isn’t quite working?

Ideas?

Problem with debugging this is that setting a breakpoint on ‘exec_load’

causes the system to boot fine (breaking on ‘task_resume’ is fine tough,

go figure…).

Ludo’.

Ludovic Courtès wrote on 23 Oct 2022 06:58

Recipients:(address . 58320@debbugs.gnu.org)(address . bug-hurd@gnu.org)

Message-ID:87pmei3alk.fsf@gnu.org

Hi,

Ludovic Courtès <ludo@gnu.org> skribis:

Toggle quote (3 lines)

> Of course, the thing boots just fine on that machine when using

> ‘exec.static’.

It’s frustrating I did not get to the bottom of it, but time passes, so

I pushed this workaround in Guix commit

3fb3bd3da530a5f82a169b1fa451474f9d90c3b6.

Ludo’.

Ludovic Courtès wrote on 29 Aug 2024 02:40

control message for bug #58320

Recipients:(address . control@debbugs.gnu.org)

Message-ID:87seunfyzh.fsf@gnu.org

tags 58320 wontfix
close 58320
quit

Your comment

This issue is archived.

To comment on this conversation send an email to 58320@patchwise.org

To respond to this issue using the mumi CLI, first switch to it

mumi current 58320

Then, you may apply the latest patchset in this issue (with sign off)

mumi am -- -s

Or, compose a reply to this issue

mumi compose

Or, send patches to this issue

mumi send-email *.patch

is:open	open issues
is:done	closed issues
submitter:<who>	search issue submitter
author:<who>	search by message author
date:yesterday..now	search by issue date
mdate:3m..2d	search by message date