2006-05-14 18:25:46

by Alberto Bertogli

[permalink] [raw]
Subject: [UML] Problems building and running 2.6.17-rc4 on x86-64


Hi!

I'm having some problems building and running UML using vanilla
2.6.17-rc4 under x86-64 with glibc 2.4.

First of all, it won't build because some constants used in
arch/um/os-Linux/sys-x86_64/registers.c are not defined. After some
digging, I found that these were defined in the public setjmp headers of
previous glibc versions, but were removed in 2.4.

So I copied them from glibc's sysdeps/x86_64/jmpbuf-offsets.h, and the
build proceeded. The same probably happens on i386.


Then, it built fine, but at the end several warnings showed up:

-----------------8<-----------------8<-----------------8<-----------------

SYSMAP .tmp_System.map
LINK linux
Building modules, stage 2.
MODPOST
WARNING: vmlinux - Section mismatch: reference to .init.text:do_mount_root from .bss between '__guard@@GLIBC_2.3.2' (at offset 0x603c5688) and 'stdout@@GLIBC_2.2.5'
WARNING: vmlinux - Section mismatch: reference to .init.text:parse_header from .bss between 'stdout@@GLIBC_2.2.5' (at offset 0x603c5690) and 'completed.1'
WARNING: vmlinux - Section mismatch: reference to .init.text: from .plt after '' (at offset 0x603c5278)
WARNING: vmlinux - Section mismatch: reference to .init.ramfs: from .plt after '' (at offset 0x603c5370)
WARNING: vmlinux - Section mismatch: reference to .init.text:nosmp from .plt after '' (at offset 0x603c5418)
WARNING: vmlinux - Section mismatch: reference to .init.text:maxcpus from .plt after '' (at offset 0x603c5428)
WARNING: vmlinux - Section mismatch: reference to .init.text:obsolete_checksetup from .plt after '' (at offset 0x603c5440)
WARNING: vmlinux - Section mismatch: reference to .init.text:debug_kernel from .plt after '' (at offset 0x603c5450)
WARNING: vmlinux - Section mismatch: reference to .init.text:quiet_kernel from .plt after '' (at offset 0x603c5458)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_debug_kernel from .plt after '' (at offset 0x603c5460)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_quiet_kernel from .plt after '' (at offset 0x603c5470)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_loglevel from .plt after '' (at offset 0x603c5478)
WARNING: vmlinux - Section mismatch: reference to .init.text:unknown_bootoption from .plt after '' (at offset 0x603c5488)
WARNING: vmlinux - Section mismatch: reference to .init.text:init_setup from .plt after '' (at offset 0x603c5490)
WARNING: vmlinux - Section mismatch: reference to .init.text:rdinit_setup from .plt after '' (at offset 0x603c54a8)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_rdinit_setup from .plt after '' (at offset 0x603c54c0)
WARNING: vmlinux - Section mismatch: reference to .init.text:do_early_param from .plt after '' (at offset 0x603c54d8)
WARNING: vmlinux - Section mismatch: reference to .init.text:boot_cpu_init from .plt after '' (at offset 0x603c54f0)
WARNING: vmlinux - Section mismatch: reference to .init.text:initcall_debug_setup from .plt after '' (at offset 0x603c54f8)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_initcall_debug_setup from .plt after '' (at offset 0x603c5510)
WARNING: vmlinux - Section mismatch: reference to .init.text:do_initcalls from .plt after '' (at offset 0x603c5518)
WARNING: vmlinux - Section mismatch: reference to .init.text:load_ramdisk from .plt after '' (at offset 0x603c5540)
WARNING: vmlinux - Section mismatch: reference to .init.text:readwrite from .plt after '' (at offset 0x603c5550)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_readonly from .plt after '' (at offset 0x603c5560)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_readwrite from .plt after '' (at offset 0x603c5568)
WARNING: vmlinux - Section mismatch: reference to .init.text:root_dev_setup from .plt after '' (at offset 0x603c5578)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_root_dev_setup from .plt after '' (at offset 0x603c5590)
WARNING: vmlinux - Section mismatch: reference to .init.text:root_data_setup from .plt after '' (at offset 0x603c5598)
WARNING: vmlinux - Section mismatch: reference to .init.text:fs_names_setup from .plt after '' (at offset 0x603c55a8)
WARNING: vmlinux - Section mismatch: reference to .init.text:root_delay_setup from .plt after '' (at offset 0x603c55b8)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_root_data_setup from .plt after '' (at offset 0x603c55d0)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_fs_names_setup from .plt after '' (at offset 0x603c55e0)
WARNING: vmlinux - Section mismatch: reference to .init.setup:__setup_root_delay_setup from .plt after '' (at offset 0x603c55f0)
WARNING: vmlinux - Section mismatch: reference to .init.text:get_fs_names from .plt after '' (at offset 0x603c55f8)
WARNING: vmlinux - Section mismatch: reference to .init.text:malloc from .plt after '' (at offset 0x603c5608)
WARNING: vmlinux - Section mismatch: reference to .init.text:free from .plt after '' (at offset 0x603c5610)
WARNING: vmlinux - Section mismatch: reference to .init.text:find_link from .plt after '' (at offset 0x603c5618)
WARNING: vmlinux - Section mismatch: reference to .exit.text: from .plt after '' (at offset 0x603c5368)

-----------------8<-----------------8<-----------------8<-----------------


However, the linux image was there, and I tried it.

It begins to boot, but panics right after mounting root:

[42949373.800000] kjournald starting. Commit interval 5 seconds
[42949373.800000] EXT3-fs: mounted filesystem with ordered data mode.
[42949373.800000] VFS: Mounted root (ext3 filesystem) readonly.
[42949373.800000] Kernel panic - not syncing: handle_trap - failed to wait at end of syscall, errno = 0, status = 2943
[42949373.800000]
[42949373.800000]
[42949373.800000] Modules linked in:
[42949373.800000] Pid: 1, comm: init Not tainted 2.6.17-rc4
[42949373.800000] RIP: 0033:[<000000004000f349>]
[42949373.800000] RSP: 0000007f7fbfbfc8 EFLAGS: 00000246
[42949373.800000] RAX: 0000000000000000 RBX: 0000007f7fbfbfe0 RCX: ffffffffffffffff
[42949373.800000] RDX: 0000007f7fbfc2a0 RSI: 0000000040010900 RDI: 0000007f7fbfbfe0
[42949373.800000] RBP: 0000000000402240 R08: 0000000000000000 R09: 0000000000000000
[42949373.800000] R10: 0000000000000064 R11: 0000000000000246 R12: 0000007f7fbfc170
[42949373.800000] R13: 0000000040001530 R14: 0000000000400040 R15: 0000000000000008
[42949373.800000] Call Trace:
[42949373.800000] 6042bc38: [<6001a10a>] panic_exit+0x2a/0x50
[42949373.800000] 6042bc48: [<60044a8c>] notifier_call_chain+0x1c/0x30
[42949373.800000] 6042bc68: [<6003488f>] panic+0xcf/0x170
[42949373.800000] 6042bcc8: [<6027b4b1>] __down_read+0xa1/0xb0
[42949373.800000] 6042bce8: [<6013f0fe>] __up_read+0x1e/0xc0
[42949373.800000] 6042bcf8: [<600285b4>] set_signals+0x14/0x30
[42949373.800000] 6042bd08: [<6002f0a1>] sys_uname64+0x31/0x90
[42949373.800000] 6042bd18: [<6002acf2>] move_registers+0x42/0x80
[42949373.800000] 6042bd48: [<6002bf65>] userspace+0x255/0x2d0
[42949373.800000] 6042bdc0: [<60014010>] init+0x0/0x170
[42949373.800000] 6042bdd8: [<6001a7a2>] new_thread_handler+0x102/0x140
[42949373.800000]

And I couldn't get past that. I found the error comes from
arch/um/os-Linux/skas/process.c, but I'm not sure what causes it or if
it's related to the constants defined above.

Any ideas?

Thanks,
Alberto


2006-05-15 03:38:52

by Jeff Dike

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Sun, May 14, 2006 at 03:25:41PM -0300, Alberto Bertogli wrote:
> So I copied them from sysdeps/x86_64/jmpbuf-offsets.h, and building went
> on. Probably, the same happens under i386.

The current patch for this is http://user-mode-linux.sourceforge.net/work/current/2.6/2.6.17-rc4/patches/jmpbuf

I need to redo it, but that works for now.

> Then, it built fine, but at the end several warnings showed up:
> MODPOST
> WARNING: vmlinux - Section mismatch: reference to .init.text:do_mount_root from .bss between '__guard@@GLIBC_2.3.2' (at offset 0x603c5688) and 'stdout@@GLIBC_2.2.5'

I have no idea what these mean, but they seem not to affect the
viability of the resulting kernel.

> It begins to boot, but panics right after mounting root:
>
> [42949373.800000] kjournald starting. Commit interval 5 seconds
> [42949373.800000] EXT3-fs: mounted filesystem with ordered data mode.
> [42949373.800000] VFS: Mounted root (ext3 filesystem) readonly.
> [42949373.800000] Kernel panic - not syncing: handle_trap - failed to wait at end of syscall, errno = 0, status = 2943

This is a segfault happening when it shouldn't.

Can you disassemble stub_segv_handler and send me the output? If
you're unfamiliar with gdb, it works like this:

% gdb linux
GNU gdb Red Hat Linux (6.3.0.0-1.122rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/libthread_db.so.1".

(gdb) disas stub_segv_handler
Dump of assembler code for function stub_segv_handler:
0x00000000601610c8 <stub_segv_handler+0>: push %rbp
0x00000000601610c9 <stub_segv_handler+1>: mov %rsp,%rbp
0x00000000601610cc <stub_segv_handler+4>: mov %rdx,%r8
...

There was a bug like this a month or so ago, but the fix has been in
mainline for a while, so this should be something different.

Jeff

2006-05-15 15:30:06

by Alberto Bertogli

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Sun, May 14, 2006 at 11:39:19PM -0400, Jeff Dike wrote:
> On Sun, May 14, 2006 at 03:25:41PM -0300, Alberto Bertogli wrote:
> > So I copied them from sysdeps/x86_64/jmpbuf-offsets.h, and building went
> > on. Probably, the same happens under i386.
>
> The current patch for this is http://user-mode-linux.sourceforge.net/work/current/2.6/2.6.17-rc4/patches/jmpbuf
>
> I need to redo it, but that works for now.

Thanks!


> > It begins to boot, but panics right after mounting root:
> >
> > [42949373.800000] kjournald starting. Commit interval 5 seconds
> > [42949373.800000] EXT3-fs: mounted filesystem with ordered data mode.
> > [42949373.800000] VFS: Mounted root (ext3 filesystem) readonly.
> > [42949373.800000] Kernel panic - not syncing: handle_trap - failed to wait at end of syscall, errno = 0, status = 2943
>
> This is a segfault happening when it shouldn't.
>
> Can you disassemble stub_segv_handler and send me the output?

Sure, here it is:
(gdb) disas stub_segv_handler
Dump of assembler code for function stub_segv_handler:
0x000000006027c0e0 <stub_segv_handler+0>: mov %rdx,%r8
0x000000006027c0e3 <stub_segv_handler+3>: mov 0xd8(%r8),%rax
0x000000006027c0ea <stub_segv_handler+10>: mov $0x7fbffff000,%rdx
0x000000006027c0f4 <stub_segv_handler+20>: mov %rax,0x8(%rdx)
0x000000006027c0f8 <stub_segv_handler+24>: mov 0xc0(%r8),%rax
0x000000006027c0ff <stub_segv_handler+31>: mov %eax,(%rdx)
0x000000006027c101 <stub_segv_handler+33>: mov 0xc8(%r8),%rax
0x000000006027c108 <stub_segv_handler+40>: mov %eax,0x10(%rdx)
0x000000006027c10b <stub_segv_handler+43>: mov $0x27,%eax
0x000000006027c110 <stub_segv_handler+48>: syscall
0x000000006027c112 <stub_segv_handler+50>: mov $0x3e,%edx
0x000000006027c117 <stub_segv_handler+55>: movslq %eax,%rdi
0x000000006027c11a <stub_segv_handler+58>: mov $0xa,%esi
0x000000006027c11f <stub_segv_handler+63>: mov %rdx,%rax
0x000000006027c122 <stub_segv_handler+66>: syscall
0x000000006027c124 <stub_segv_handler+68>: mov %r8,%rsp
0x000000006027c127 <stub_segv_handler+71>: mov $0xf,%rax
0x000000006027c12e <stub_segv_handler+78>: syscall
0x000000006027c130 <stub_segv_handler+80>: retq
End of assembler dump.


Please let me know if you need anything else.

Thanks,
Alberto


2006-05-16 19:12:28

by Jeff Dike

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Mon, May 15, 2006 at 12:29:58PM -0300, Alberto Bertogli wrote:
> Sure, here it is:
> (gdb) disas stub_segv_handler

Sorry, I misread the error message and asked for the wrong thing.
Your UML is seeing a process segfault during a system call, before the
SIGTRAP expected at the end of the system call. I don't know what's
happening there.

Can you apply the following patch, which will just give you a register
dump of the process, and send me the output?

Index: linux-2.6.16/arch/um/os-Linux/skas/process.c
===================================================================
--- linux-2.6.16.orig/arch/um/os-Linux/skas/process.c
+++ linux-2.6.16/arch/um/os-Linux/skas/process.c
@@ -45,6 +45,22 @@ int is_skas_winch(int pid, int fd, void
return(1);
}

+static int ptrace_dump_regs(int pid)
+{
+ unsigned long regs[HOST_FRAME_SIZE];
+ int i;
+
+ if(ptrace(PTRACE_GETREGS, pid, 0, regs) < 0)
+ return -errno;
+ else {
+ printk("Stub registers -\n");
+ for(i = 0; i < HOST_FRAME_SIZE; i++)
+ printk("\t%d - %lx\n", i, regs[i]);
+ }
+
+ return 0;
+}
+
void wait_stub_done(int pid, int sig, char * fname)
{
int n, status, err;
@@ -68,18 +84,10 @@ void wait_stub_done(int pid, int sig, ch

if((n < 0) || !WIFSTOPPED(status) ||
(WSTOPSIG(status) != SIGUSR1 && WSTOPSIG(status) != SIGTRAP)){
- unsigned long regs[HOST_FRAME_SIZE];
-
- if(ptrace(PTRACE_GETREGS, pid, 0, regs) < 0)
- printk("Failed to get registers from stub, "
- "errno = %d\n", errno);
- else {
- int i;
-
- printk("Stub registers -\n");
- for(i = 0; i < HOST_FRAME_SIZE; i++)
- printk("\t%d - %lx\n", i, regs[i]);
- }
+ err = ptrace_dump_regs(pid);
+ if(err)
+ printk("Failed to get registers from stub, "
+ "errno = %d\n", -err);
panic("%s : failed to wait for SIGUSR1/SIGTRAP, "
"pid = %d, n = %d, errno = %d, status = 0x%x\n",
fname, pid, n, errno, status);
@@ -146,9 +154,14 @@ static void handle_trap(int pid, union u

CATCH_EINTR(err = waitpid(pid, &status, WUNTRACED));
if((err < 0) || !WIFSTOPPED(status) ||
- (WSTOPSIG(status) != SIGTRAP + 0x80))
+ (WSTOPSIG(status) != SIGTRAP + 0x80)){
+ err = ptrace_dump_regs(pid);
+ if(err)
+ printk("Failed to get registers from process, "
+ "errno = %d\n", -err);
panic("handle_trap - failed to wait at end of syscall, "
"errno = %d, status = %d\n", errno, status);
+ }
}

handle_syscall(regs);

2006-05-17 02:39:49

by Alberto Bertogli

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Tue, May 16, 2006 at 03:12:44PM -0400, Jeff Dike wrote:
> On Mon, May 15, 2006 at 12:29:58PM -0300, Alberto Bertogli wrote:
> > Sure, here it is:
> > (gdb) disas stub_segv_handler
>
> Sorry, I misread the error message and asked for the wrong thing.
> Your UML is seeing a process segfault during a system call, before the
> SIGTRAP expected at the end of the system call. I don't know what's
> happening there.
>
> Can you apply the following patch, which will just give you a register
> dump of the process, and send me the output?

Here it is. While the patch worked, it was for 2.6.16, and I'm using
2.6.17-rc4, I hope that's not a problem.


[42949373.940000] EXT3-fs: mounted filesystem with ordered data mode.
[42949373.940000] VFS: Mounted root (ext3 filesystem) readonly.
[42949374.050000] Stub registers -
[42949374.050000] 0 - 8
[42949374.050000] 1 - 400040
[42949374.050000] 2 - 40001530
[42949374.050000] 3 - 2
[42949374.050000] 4 - fffffffd
[42949374.050000] 5 - 7
[42949374.050000] 6 - 5
[42949374.050000] 7 - 37
[42949374.050000] 8 - 3
[42949374.050000] 9 - 20611
[42949374.050000] 10 - 0
[42949374.050000] 11 - 2d
[42949374.050000] 12 - 11
[42949374.050000] 13 - 7f7f8d4539
[42949374.050000] 14 - 0
[42949374.050000] 15 - ffffffffffffffff
[42949374.050000] 16 - 4000eae0
[42949374.050000] 17 - 33
[42949374.050000] 18 - 10246
[42949374.050000] 19 - 7f7f8d4498
[42949374.050000] 20 - 2b
[42949374.050000] Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x4000f349
[42949374.050000]
[42949374.050000] Modules linked in:
[42949374.050000] Pid: 1, comm: init Not tainted 2.6.17-rc4
[42949374.050000] RIP: 0033:[<000000004000f349>]
[42949374.050000] RSP: 0000007f7f8d4498 EFLAGS: 00000246
[42949374.050000] RAX: 0000000000000000 RBX: 0000007f7f8d44b0 RCX: ffffffffffffffff
[42949374.050000] RDX: 0000007f7f8d4770 RSI: 0000000040010900 RDI: 0000007f7f8d44b0
[42949374.050000] RBP: 0000000000402240 R08: 0000000000000000 R09: 0000000000000000
[42949374.050000] R10: 0000000000000064 R11: 0000000000000246 R12: 0000007f7f8d4640
[42949374.050000] R13: 0000000040001530 R14: 0000000000400040 R15: 0000000000000008
[42949374.050000] Call Trace:
[42949374.050000] 60433888: [<6001a10a>] panic_exit+0x2a/0x50
[42949374.050000] 60433898: [<60044acc>] notifier_call_chain+0x1c/0x30
[42949374.050000] 604338b8: [<600348cf>] panic+0xcf/0x170
[42949374.050000] 60433918: [<600285b4>] set_signals+0x14/0x30
[42949374.050000] 60433928: [<6001947b>] handle_page_fault+0x1bb/0x270
[42949374.050000] 60433998: [<600197b8>] segv+0x208/0x300
[42949374.050000] 60433a80: [<60019530>] segv_handler+0x0/0x80
[42949374.050000] 60433a98: [<600195ab>] segv_handler+0x7b/0x80
[42949374.050000] 60433ab8: [<6002ca18>] sig_handler_common_skas+0xe8/0x140
[42949374.050000] 60433ae8: [<6002873f>] sig_handler+0x5f/0x80
[42949374.050000] 60433c20: [<6001b450>] copy_chunk_to_user+0x0/0x40
[42949374.050000] 60433c88: [<6002b877>] ptrace_dump_regs+0x47/0x70
[42949374.050000] 60433dc0: [<60014010>] init+0x0/0x170
[42949374.050000] 60433dd8: [<6001a7a2>] new_thread_handler+0x102/0x140
[42949374.050000]

Please let me know if there's anything else you want me to try.

Thanks,
Alberto


2006-05-17 06:36:43

by Blaisorblade

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Wednesday 17 May 2006 04:39, Alberto Bertogli wrote:
> On Tue, May 16, 2006 at 03:12:44PM -0400, Jeff Dike wrote:
> > On Mon, May 15, 2006 at 12:29:58PM -0300, Alberto Bertogli wrote:
> > > Sure, here it is:
> > > (gdb) disas stub_segv_handler
> >
> > Sorry, I misread the error message and asked for the wrong thing.
> > Your UML is seeing a process segfault during a system call, before the
> > SIGTRAP expected at the end of the system call. I don't know what's
> > happening there.
> >
> > Can you apply the following patch, which will just give you a register
> > dump of the process, and send me the output?
>
> Here it is. While the patch worked, it was for 2.6.16, and I'm using
> 2.6.17-rc4, I hope that's not a problem.

I guess not. I'll test this patch soon because I have the same problem.
However, are you running a 2.6.16 host?

If so, can you verify whether on a 2.6.15 host kernel the same binary runs
fine (as is the case for me)?
--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade





2006-05-17 18:12:35

by Jeff Dike

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Tue, May 16, 2006 at 11:39:42PM -0300, Alberto Bertogli wrote:
> [42949374.050000] Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x4000f349

Err, there was a rather serious bug in that last patch. Can you
replace it with the version below and boot UML again?

Jeff

Index: linux-2.6.16/arch/um/os-Linux/skas/process.c
===================================================================
--- linux-2.6.16.orig/arch/um/os-Linux/skas/process.c
+++ linux-2.6.16/arch/um/os-Linux/skas/process.c
@@ -45,6 +45,22 @@ int is_skas_winch(int pid, int fd, void
return(1);
}

+static int ptrace_dump_regs(int pid)
+{
+ unsigned long regs[HOST_FRAME_SIZE];
+ int i;
+
+ if(ptrace(PTRACE_GETREGS, pid, 0, regs) < 0)
+ return -errno;
+ else {
+ printk("Stub registers -\n");
+ for(i = 0; i < HOST_FRAME_SIZE; i++)
+ printk("\t%d - %lx\n", i, regs[i]);
+ }
+
+ return 0;
+}
+
void wait_stub_done(int pid, int sig, char * fname)
{
int n, status, err;
@@ -68,18 +84,10 @@ void wait_stub_done(int pid, int sig, ch

if((n < 0) || !WIFSTOPPED(status) ||
(WSTOPSIG(status) != SIGUSR1 && WSTOPSIG(status) != SIGTRAP)){
- unsigned long regs[HOST_FRAME_SIZE];
-
- if(ptrace(PTRACE_GETREGS, pid, 0, regs) < 0)
- printk("Failed to get registers from stub, "
- "errno = %d\n", errno);
- else {
- int i;
-
- printk("Stub registers -\n");
- for(i = 0; i < HOST_FRAME_SIZE; i++)
- printk("\t%d - %lx\n", i, regs[i]);
- }
+ err = ptrace_dump_regs(pid);
+ if(err)
+ printk("Failed to get registers from stub, "
+ "errno = %d\n", -err);
panic("%s : failed to wait for SIGUSR1/SIGTRAP, "
"pid = %d, n = %d, errno = %d, status = 0x%x\n",
fname, pid, n, errno, status);
@@ -146,9 +154,14 @@ static void handle_trap(int pid, union u

CATCH_EINTR(err = waitpid(pid, &status, WUNTRACED));
if((err < 0) || !WIFSTOPPED(status) ||
- (WSTOPSIG(status) != SIGTRAP + 0x80))
+ (WSTOPSIG(status) != SIGTRAP + 0x80)){
+ err = ptrace_dump_regs(pid);
+ if(err)
+ printk("Failed to get registers from process, "
+ "errno = %d\n", -err);
panic("handle_trap - failed to wait at end of syscall, "
"errno = %d, status = %d\n", errno, status);
+ }
}

handle_syscall(regs);
Index: linux-2.6.16/arch/um/sys-x86_64/user-offsets.c
===================================================================
--- linux-2.6.16.orig/arch/um/sys-x86_64/user-offsets.c
+++ linux-2.6.16/arch/um/sys-x86_64/user-offsets.c
@@ -57,7 +57,7 @@ void foo(void)
OFFSET(HOST_SC_SS, sigcontext, ss);
#endif

- DEFINE_LONGS(HOST_FRAME_SIZE, FRAME_SIZE);
+ DEFINE_LONGS(HOST_FRAME_SIZE, sizeof(struct user_regs_struct));
DEFINE(HOST_FP_SIZE, sizeof(struct _fpstate) / sizeof(unsigned long));
DEFINE(HOST_XFP_SIZE, 0);
DEFINE_LONGS(HOST_RBX, RBX);

2006-05-17 18:38:12

by Alberto Bertogli

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Wed, May 17, 2006 at 02:12:52PM -0400, Jeff Dike wrote:
> On Tue, May 16, 2006 at 11:39:42PM -0300, Alberto Bertogli wrote:
> > [42949374.050000] Kernel panic - not syncing: Kernel mode fault at addr 0x0, ip 0x4000f349
>
> Err, there was a rather serious bug in that last patch. Can you
> replace it with the version below and boot UML again?

Sure, here's the output.

Thanks,
Alberto


[42949373.790000] VFS: Mounted root (ext3 filesystem) readonly.
[42949373.790000] Stub registers -
[42949373.790000] 0 - 8
[42949373.790000] 1 - 400040
[42949373.790000] 2 - 40001530
[42949373.790000] 3 - 2
[42949373.790000] 4 - fffffffd
[42949373.790000] 5 - 7
[42949373.790000] 6 - 5
[42949373.790000] 7 - 37
[42949373.790000] 8 - 3
[42949373.790000] 9 - 20611
[42949373.790000] 10 - 0
[42949373.790000] 11 - 2d
[42949373.790000] 12 - 11
[42949373.790000] 13 - 7f7fd0d179
[42949373.790000] 14 - 0
[42949373.790000] 15 - ffffffffffffffff
[42949373.790000] 16 - 4000eae0
[42949373.790000] 17 - 33
[42949373.790000] 18 - 10246
[42949373.790000] 19 - 7f7fd0d0d8
[42949373.790000] 20 - 2b
[42949373.790000] 21 - 2b4ba664a6d0
[42949373.790000] 22 - 0
[42949373.790000] 23 - 0
[42949373.790000] 24 - 0
[42949373.790000] 25 - 0
[42949373.790000] 26 - 0
[42949373.790000] Kernel panic - not syncing: handle_trap - failed to wait at end of syscall, errno = 0, status = 2943
[42949373.790000]
[42949373.790000]
[42949373.790000] Modules linked in:
[42949373.790000] Pid: 1, comm: init Not tainted 2.6.17-rc4
[42949373.790000] RIP: 0033:[<000000004000f349>]
[42949373.790000] RSP: 0000007f7fd0d0d8 EFLAGS: 00000246
[42949373.790000] RAX: 0000000000000000 RBX: 0000007f7fd0d0f0 RCX: ffffffffffffffff
[42949373.790000] RDX: 0000007f7fd0d3b0 RSI: 0000000040010900 RDI: 0000007f7fd0d0f0
[42949373.790000] RBP: 0000000000402240 R08: 0000000000000000 R09: 0000000000000000
[42949373.790000] R10: 0000000000000064 R11: 0000000000000246 R12: 0000007f7fd0d280
[42949373.790000] R13: 0000000040001530 R14: 0000000000400040 R15: 0000000000000008
[42949373.790000] Call Trace:
[42949373.790000] 61197c38: [<6001a10a>] panic_exit+0x2a/0x50
[42949373.790000] 61197c48: [<60044acc>] notifier_call_chain+0x1c/0x30
[42949373.790000] 61197c68: [<600348cf>] panic+0xcf/0x170
[42949373.790000] 61197d48: [<6002bf90>] userspace+0x260/0x2f0
[42949373.790000] 61197dc0: [<60014010>] init+0x0/0x170
[42949373.790000] 61197dd8: [<6001a7a2>] new_thread_handler+0x102/0x140
[42949373.790000]


2006-05-17 18:57:38

by Alberto Bertogli

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Wed, May 17, 2006 at 08:36:40AM +0200, Blaisorblade wrote:
> On Wednesday 17 May 2006 04:39, Alberto Bertogli wrote:
> > On Tue, May 16, 2006 at 03:12:44PM -0400, Jeff Dike wrote:
> > Here it is. While the patch worked, it was for 2.6.16, and I'm using
> > 2.6.17-rc4, I hope that's not a problem.
>
> Guess not - I'll test this patch soon because I have the same problem, however
> are you running a 2.6.16 host?
>
> If so, can you verify whether on a 2.6.15 host kernel the same binary runs
> fine (as is the case for me)?

I'm running a 2.6.17-rc4 host.

I'll boot 2.6.16 and 2.6.15 later tonight and I'll let you know. Is
there any additional kernel you want me to try as host/uml?

Thanks,
Alberto


2006-05-18 19:49:13

by Alberto Bertogli

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Wed, May 17, 2006 at 08:36:40AM +0200, Blaisorblade wrote:
> On Wednesday 17 May 2006 04:39, Alberto Bertogli wrote:
> > On Tue, May 16, 2006 at 03:12:44PM -0400, Jeff Dike wrote:
> > > On Mon, May 15, 2006 at 12:29:58PM -0300, Alberto Bertogli wrote:
> > Here it is. While the patch worked, it was for 2.6.16, and I'm using
> > 2.6.17-rc4, I hope that's not a problem.
>
> Guess not - I'll test this patch soon because I have the same problem, however
> are you running a 2.6.16 host?
>
> If so, can you verify whether on a 2.6.15 host kernel the same binary runs
> fine (as is the case for me)?

I tried running 2.6.15.1 and 2.6.17-rc1 on the host. Both got further
than 2.6.17-rc4, but failed in different ways.

2.6.15.1 was the most successful one: it managed to boot and it kinda
worked, but I got semi-random segmentation faults when running some apps
like apt-get. I reported this some time ago to Jeff Dike.

2.6.17-rc1 didn't get that far, and panicked when starting runlevel 2;
I attach the output below.

Do you want me to try anything else?

Thanks,
Alberto


Initializing random number generator...done.
INIT: Entering runlevel: 2
INIT: PANIC: segmentation violation! sleeping for 30 seconds.
[42949380.750000] BUG: failure at /usr/src/linux-2.6.17-rc4/include/linux/elfcore.h:95/elf_core_copy_regs()!
[42949380.750000] Kernel panic - not syncing: BUG!
[42949380.750000]
[42949380.750000] Modules linked in:
[42949380.750000] Pid: 444, comm: init Not tainted 2.6.17-rc4
[42949380.750000] RIP: 0033:[<0000000040145896>]
[42949380.750000] RSP: 0000007f7f811f30 EFLAGS: 00000246
[42949380.750000] RAX: 0000000000000000 RBX: 0000000040017600 RCX: ffffffffffffffff
[42949380.750000] RDX: 0000000000000000 RSI: 0000007f7f811f48 RDI: 0000000000000002
[42949380.750000] RBP: 0000000000000000 R08: 00000000000001bc R09: 000000000000000b
[42949380.750000] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000002
[42949380.750000] R13: 00000000005096bc R14: 0000000000406b21 R15: 0000000000406b21
[42949380.750000] Call Trace:
[42949380.750000] 67efb768: [<6001a10a>] panic_exit+0x2a/0x50
[42949380.750000] 67efb778: [<60044acc>] notifier_call_chain+0x1c/0x30
[42949380.750000] 67efb798: [<600348cf>] panic+0xcf/0x170
[42949380.750000] 67efb7b8: [<600c336f>] ext3_mark_iloc_dirty+0xf/0x20
[42949380.750000] 67efb7f8: [<600c7000>] ext3_orphan_del+0x1d0/0x280
[42949380.750000] 67efb808: [<600c3528>] ext3_dirty_inode+0x78/0xa0
[42949380.750000] 67efb838: [<6009d1c2>] __mark_inode_dirty+0x102/0x1b0
[42949380.750000] 67efb848: [<600285b4>] set_signals+0x14/0x30
[42949380.750000] 67efb858: [<600742c8>] kmem_cache_alloc+0x48/0x70
[42949380.750000] 67efb878: [<600aa4e4>] elf_core_dump+0x264/0x2f0
[42949380.750000] 67efb938: [<6007554d>] do_truncate+0x5d/0x70
[42949380.750000] 67efb9a8: [<600842e7>] do_coredump+0x2c7/0x370
[42949380.750000] 67efba48: [<60040bd9>] recalc_sigpending+0x19/0x20
[42949380.750000] 67efba58: [<60041064>] __dequeue_signal+0x84/0xf0
[42949380.750000] 67efbad8: [<600432e3>] get_signal_to_deliver+0x3e3/0x400
[42949380.750000] 67efbb68: [<60065b6d>] __handle_mm_fault+0x23d/0x280
[42949380.750000] 67efbb88: [<6013f13e>] __up_read+0x1e/0xc0
[42949380.750000] 67efbbe8: [<60017866>] kern_do_signal+0x86/0x230
[42949380.750000] 67efbc00: [<6001b370>] copy_chunk_from_user+0x0/0x40
[42949380.750000] 67efbc18: [<600197cb>] segv+0x21b/0x300
[42949380.750000] 67efbcb8: [<6002b8f8>] wait_stub_done+0x58/0x110
[42949380.750000] 67efbce8: [<600434de>] sys_rt_sigprocmask+0x7e/0x120
[42949380.750000] 67efbd28: [<60017a30>] do_signal+0x20/0x30
[42949380.750000] 67efbd38: [<60016795>] interrupt_end+0x45/0x60
[42949380.750000] 67efbd58: [<6002be8c>] userspace+0x15c/0x2f0
[42949380.750000] 67efbd78: [<6001b502>] copy_to_user_skas+0x72/0x90
[42949380.750000] 67efbde8: [<6001a8fe>] fork_handler+0xee/0x100
[42949380.750000]
[42949380.750000] * route del -host 192.168.0.2 dev tap0
[42949380.750000] * bash -c echo 0 > /proc/sys/net/ipv4/conf/tap0/proxy_arp


2006-05-19 01:20:50

by Blaisorblade

[permalink] [raw]
Subject: Re: [uml-devel] [UML] Problems building and running 2.6.17-rc4 on x86-64

On Thursday 18 May 2006 21:48, Alberto Bertogli wrote:
> On Wed, May 17, 2006 at 08:36:40AM +0200, Blaisorblade wrote:
> > On Wednesday 17 May 2006 04:39, Alberto Bertogli wrote:
> > > On Tue, May 16, 2006 at 03:12:44PM -0400, Jeff Dike wrote:
> > > > On Mon, May 15, 2006 at 12:29:58PM -0300, Alberto Bertogli wrote:
> > >
> > > Here it is. While the patch worked, it was for 2.6.16, and I'm using
> > > 2.6.17-rc4, I hope that's not a problem.
> >
> > Guess not - I'll test this patch soon because I have the same problem,
> > however are you running a 2.6.16 host?
> >
> > If so, can you verify whether on a 2.6.15 host kernel the same binary
> > runs fine (as is the case for me)?
>
> I tried running 2.6.15.1 and 2.6.17-rc1 on the host. Both got further
> than 2.6.17-rc4, but failed in different ways.
>
> 2.6.15.1 was the most successful one: it managed to boot and it kinda
> worked, but I got semi-random segmentation faults when running some apps
> like apt-get. I reported this some time ago to Jeff Dike.
>
> 2.6.17-rc1 didn't get that far, and panicked when starting runlevel 2;
> I attach the output below.

Interesting, I'll test ASAP. Meanwhile, please verify whether a 2.6.16 guest
kernel works fine at least on a 2.6.15 host (as it does here). Then we'll know
whether the semi-random segfaults are a bug in 2.6.17-rc guests.

I've built all the 2.6.16-rc releases so I can start a binary search to
isolate where the host change was introduced.

> Initializing random number generator...done.
> INIT: Entering runlevel: 2
> INIT: PANIC: segmentation violation! sleeping for 30 seconds.
> [42949380.750000] BUG: failure at /usr/src/linux-2.6.17-rc4/include/linux/elfcore.h:95/elf_core_copy_regs()!
> [42949380.750000] Kernel panic - not syncing: BUG!
> [42949380.750000]
> [42949380.750000] Modules linked in:
> [42949380.750000] Pid: 444, comm: init Not tainted 2.6.17-rc4
> [42949380.750000] RIP: 0033:[<0000000040145896>]
> [42949380.750000] RSP: 0000007f7f811f30 EFLAGS: 00000246
> [42949380.750000] RAX: 0000000000000000 RBX: 0000000040017600 RCX: ffffffffffffffff
> [42949380.750000] RDX: 0000000000000000 RSI: 0000007f7f811f48 RDI: 0000000000000002
> [42949380.750000] RBP: 0000000000000000 R08: 00000000000001bc R09: 000000000000000b
> [42949380.750000] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000002
> [42949380.750000] R13: 00000000005096bc R14: 0000000000406b21 R15: 0000000000406b21
> [42949380.750000] Call Trace:
> [42949380.750000] 67efb768: [<6001a10a>] panic_exit+0x2a/0x50
> [42949380.750000] 67efb778: [<60044acc>] notifier_call_chain+0x1c/0x30
> [42949380.750000] 67efb798: [<600348cf>] panic+0xcf/0x170
> [42949380.750000] 67efb7b8: [<600c336f>] ext3_mark_iloc_dirty+0xf/0x20
> [42949380.750000] 67efb7f8: [<600c7000>] ext3_orphan_del+0x1d0/0x280
> [42949380.750000] 67efb808: [<600c3528>] ext3_dirty_inode+0x78/0xa0
> [42949380.750000] 67efb838: [<6009d1c2>] __mark_inode_dirty+0x102/0x1b0
> [42949380.750000] 67efb848: [<600285b4>] set_signals+0x14/0x30
> [42949380.750000] 67efb858: [<600742c8>] kmem_cache_alloc+0x48/0x70
> [42949380.750000] 67efb878: [<600aa4e4>] elf_core_dump+0x264/0x2f0
> [42949380.750000] 67efb938: [<6007554d>] do_truncate+0x5d/0x70
> [42949380.750000] 67efb9a8: [<600842e7>] do_coredump+0x2c7/0x370
> [42949380.750000] 67efba48: [<60040bd9>] recalc_sigpending+0x19/0x20
> [42949380.750000] 67efba58: [<60041064>] __dequeue_signal+0x84/0xf0
> [42949380.750000] 67efbad8: [<600432e3>] get_signal_to_deliver+0x3e3/0x400
> [42949380.750000] 67efbb68: [<60065b6d>] __handle_mm_fault+0x23d/0x280
> [42949380.750000] 67efbb88: [<6013f13e>] __up_read+0x1e/0xc0
> [42949380.750000] 67efbbe8: [<60017866>] kern_do_signal+0x86/0x230
> [42949380.750000] 67efbc00: [<6001b370>] copy_chunk_from_user+0x0/0x40
> [42949380.750000] 67efbc18: [<600197cb>] segv+0x21b/0x300
> [42949380.750000] 67efbcb8: [<6002b8f8>] wait_stub_done+0x58/0x110
> [42949380.750000] 67efbce8: [<600434de>] sys_rt_sigprocmask+0x7e/0x120
> [42949380.750000] 67efbd28: [<60017a30>] do_signal+0x20/0x30
> [42949380.750000] 67efbd38: [<60016795>] interrupt_end+0x45/0x60
> [42949380.750000] 67efbd58: [<6002be8c>] userspace+0x15c/0x2f0
> [42949380.750000] 67efbd78: [<6001b502>] copy_to_user_skas+0x72/0x90
> [42949380.750000] 67efbde8: [<6001a8fe>] fork_handler+0xee/0x100
> [42949380.750000]
> [42949380.750000] * route del -host 192.168.0.2 dev tap0
> [42949380.750000] * bash -c echo 0 > /proc/sys/net/ipv4/conf/tap0/proxy_arp

--
Inform me of my mistakes, so I can keep imitating Homer Simpson's "Doh!".
Paolo Giarrusso, aka Blaisorblade (Skype ID "PaoloGiarrusso", ICQ 215621894)
http://www.user-mode-linux.org/~blaisorblade



