LinuxLists.cc - 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

> I never get uml to compile around here, 2.6.12-rc3 + that patchkit from
> the link you sent blows up with defconfig and my minimal config. Please
> attach your guest .config and if you can you might as well put your guest
> vmlinux somewhere where I can download it too.
ok, here it is:
http://213.228.237.37/uml/2.6.12-rc3/
kernel and config.

Note: you will need a pure 64-bit root fs; e.g. Gentoo 2005 works fine.

2005-05-07 18:08:48

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sat, May 07, 2005 at 05:57:48PM +0200, Alexander Nyberg wrote:
> I never get uml to compile around here, 2.6.12-rc3 + that patchkit from
> the link you sent blows up with defconfig and my minimal config. Please
> attach your guest .config and if you can you might as well put your guest
> vmlinux somewhere where I can download it too.

Start with -rc3, and all the patches from
http://user-mode-linux.sf.net/patches.html
up to and including skas0. You'll see a note to x86_64 users on that patch.

Jeff

2005-05-08 00:18:39

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sat, May 07, 2005 at 02:03:56PM -0400, Jeff Dike wrote:
> On Sat, May 07, 2005 at 05:57:48PM +0200, Alexander Nyberg wrote:
> > I never get uml to compile around here, 2.6.12-rc3 + that patchkit from
> > the link you sent blows up with defconfig and my minimal config. Please
> > attach your guest .config and if you can you might as well put your guest
> > vmlinux somewhere where I can download it too.
>
> Start with -rc3, and all the patches from
> http://user-mode-linux.sf.net/patches.html
> up to and including skas0. You'll see a note to x86_64 users on that patch.

Hrm...
a) stub.S handling breaks on O= builds. Actually, your unprofile
breaks there - it's bypassing the machinery that deals with include path.
b) stub_segv.c on amd64 includes <signal.h>. Not a good idea...
c) sysdep-x86_64/checksum.h in -rc4 has csum_partial_copy_from_user()
that needs updating (AFAICS, you have that in your patchset, but it hadn't
reached Linus)
d) ip_compute_csum() prototype is missing (same file)
e) #define UPT_SYSCALL_RET(r) UPT_RAX(r) is needed in amd64 ptrace.h
f) take a good look at UPT_SET() in the same file ;-)
g) CFLAGS_csum-partial.o := -Dcsum_partial=arch_csum_partial in
sys-x86_64/Makefile needs to be removed.
h) Makefile.rules should be included _after_ SYMLINKS = in the same
file.
i) sys-x86_64/delay.c needs exports of __udelay() and __const_udelay(),
include of linux/module.h and barriers in your delay loop bodies (or games
with volatile - anything that would guarantee that gcc won't decide to optimize
the entire loop away). The last part applies to i386 as well.
j) ip_compute_csum should be exported on amd64.
k) sys-x86_64/syscalls.c needs include "kern.h"
l) elf-i386.h should include <asm/user.h>, not "user.h"
m) elf-x86_64.h lacks R_X86_64_... definitions
n) WTF _is_ that #ifdef TIF_IA32 in there? Aside of the trailing \,
we could as well put #error there - free-floating clear_thread_flag(TIF_IA32);
outside of any function body will have that effect anyway.
o) in drivers/chan_kern.c we have several printf(KERN_ERR "...");
these should become printk, as they clearly had been intended. As it is,
they give instant panic if we ever call them.
p) TOP_ADDR in Kconfig_x86_64 got lost in transmission - your patchset
has it, but same patch in Linus' tree does not.

I've got patches for everything except (a); that one is really nasty. I hope
to sort it out by tonight; if not, I'll just send what I've got by now.
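
Regarding item (i), a minimal sketch of a delay loop that gcc cannot optimize away. The names delay_loops(), udelay_sketch() and the loops_per_usec parameter are hypothetical and are not taken from sys-x86_64/delay.c; the barrier() macro mirrors the usual empty volatile asm with a "memory" clobber.

```c
/* Compiler barrier: an empty volatile asm with a "memory" clobber.
 * gcc must emit it once per iteration, so the loop below cannot be
 * proven side-effect free and deleted. */
#define barrier() __asm__ __volatile__("" : : : "memory")

/* Hypothetical busy-wait helper (illustration only, not the UML code).
 * Returns the number of iterations actually executed. */
static unsigned long delay_loops(unsigned long loops)
{
    unsigned long i;

    for (i = 0; i < loops; i++)
        barrier();
    return i;
}

/* Hypothetical microsecond delay built on the loop above;
 * loops_per_usec is an assumed calibration value. */
static unsigned long udelay_sketch(unsigned long usecs,
                                   unsigned long loops_per_usec)
{
    return delay_loops(usecs * loops_per_usec);
}
```

Without the barrier() call the loop body is empty and the compiler is free to drop the whole loop at -O2; a volatile loop counter would be the other option mentioned above.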

2005-05-08 06:10:47

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, May 08, 2005 at 01:18:32AM +0100, Al Viro wrote:
> a) stub.S handling breaks on O= builds. Actually, your unprofile
> breaks there - it's bypassing the machinery that deals with include path.

Solved.

> p) TOP_ADDR in Kconfig_x86_64 got lost in transmission - your patchset
> has it, but same patch in Linus' tree does not.

q) skas/mmu.c is calling pte_alloc_map() without ->page_table_lock.
Trivially fixed, needed if you want spinlock debugging to produce something
useful.
r) when built static, kernel dies ugly death with
#0 0x00000000601e4178 in ptmalloc_init () at swab.h:134
#1 0x00000000601e4034 in malloc_hook_ini () at swab.h:134
#2 0x00000000601e1698 in malloc () at swab.h:134
#3 0x00000000602068ee in _dl_init_paths () at swab.h:134
#4 0x00000000601eba45 in _dl_non_dynamic_init () at swab.h:134
#5 0x00000000601ebc60 in __libc_init_first () at swab.h:134
#6 0x00000000601cfa4f in __libc_start_main () at swab.h:134
#7 0x000000006001202a in _start () at proc_fs.h:183
as stack trace. Buggered offsets in uml.lds, perhaps?

Dynamically built it works; for i386 the same tree works with both
static and dynamic. It _might_ be libc difference, in theory (i386 libc
is 2.3.2.ds1-18, amd64 - 2.3.2.ds1-21, both from sarge), but I wouldn't
bet on it. Anyway, I'm going down right now; carving the fixes into
sane patch series + experimenting with static/amd64 breakage will have
to wait until the morning...

2005-05-08 14:12:40

by Andi Kleen

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

Antoine Martin writes:
>
> general protection fault: 0000 [1]
> CPU 0
> Pid: 26926, comm: kernel-4 Not tainted 2.6.11.8
> RIP: 0010:[<ffffffff8010ca47>] <ffffffff8010ca47>{__switch_to+311}
> RSP: 0018:ffff8100a7635d48 EFLAGS: 00010016
> RAX: 0000c8e816000002 RBX: ffff8100b895f320 RCX: 00000000c0000102
> RDX: 000000000000c8e8 RSI: 0000000000000000 RDI: 0000000000000000
> RBP: 0000000000000000 R08: ffff810090db3a00 R09: 0000000000006933
> R10: 0000000000000000 R11: 0000000000000202 R12: ffff8100a827b890
> R13: ffff8100b895f010 R14: ffff8100a827b580 R15: ffff8100a827b7f8
> FS: 000000006025212c(0000) GS:ffffffff80785a00(0000)
> knlGS:0000000000000d7e
> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 000000006880d010 CR3: 00000000a2321000 CR4: 00000000000006e0
> Process kernel-4 (pid: 26926, threadinfo ffff810060884000, task
> ffff8100a827b580)
> Stack: ffff8100dc37e180 ffff8100b895f010 ffffffff806b6d50
> ffff8100b895f010
> 000003595b049ed6 ffffffff804f4de4 ffff8100a7635de8
> 0000000000000086
> 0000007500000020 ffff8100b895f010
> Call Trace:<ffffffff804f4de4>{thread_return+0}
> <ffffffff8013cb08>{ptrace_stop+280}
> <ffffffff8013cde6>{get_signal_to_deliver+358}
> <ffffffff8010d4e3>{do_signal+163}
> <ffffffff8010e905>{error_exit+0}
> <ffffffff8010de67>{sys_rt_sigreturn+535}
> <ffffffff8010dee9>{sys_rt_sigreturn+665}
> <ffffffff8010e2b6>{int_signal+18}
>
>
> Code: 0f 30 66 41 89 6c 24 2e 65 48 8b 04 25 20 00 00 00 49 89 44

That is a wrmsr to 0x00000000c0000102 (KERNEL_GS_BASE), the code
is trying to write 0x0000c8e816000002 into it. That is a non canonical
address, which causes the GPF.

The strange thing is that the kernel should have rejected it in
the first place. The code to allow user space to set kernel gs
checks for the address being > TASK_SIZE and TASK_SIZE is 0x800000000000.
It should have rejected it in the first place.
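
A minimal sketch of the canonical-address test described above, assuming 48-bit virtual addresses (bits 63..47 must all be equal). The helper names and the TASK_SIZE_SKETCH constant are illustrative and are not the kernel's own code; the constant simply repeats the TASK_SIZE value quoted above.

```c
#include <stdint.h>

/* Value of TASK_SIZE as quoted in this message (assumption for the sketch). */
#define TASK_SIZE_SKETCH 0x800000000000ULL

/* An address is canonical when bits 63..47 are all zero or all one. */
static int is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;      /* the 17 uppermost bits */

    return top == 0 || top == 0x1ffff;
}

/* Hypothetical check for a base value supplied by user space:
 * anything at or above TASK_SIZE is refused. */
static int user_base_ok(uint64_t addr)
{
    return addr < TASK_SIZE_SKETCH;
}
```

With these definitions the value from the oops, 0x0000c8e816000002, fails both is_canonical() and user_base_ok(): bit 47 is set while bits 48..63 are clear.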

Are you sure you did not apply any strange UML related patches
to the host kernel? Maybe those are buggy.

-Andi

2005-05-08 15:03:47

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

> (..)
> That is a wrmsr to 0x00000000c0000102 (KERNEL_GS_BASE), the code
> is trying to write 0x0000c8e816000002 into it. That is a non canonical
> address, which causes the GPF.
>
> The strange thing is that the kernel should have rejected it in
> the first place. The code to allow user space to set kernel gs
> checks for the address being > TASK_SIZE and TASK_SIZE is 0x800000000000.
> It should have rejected it in the first place.
>
> Are you sure you did not apply any strange UML related patches
> to the host kernel? Maybe those are buggy.
The only extra patch applied on top of what is on the web page (as per
Jeff's instructions) is the mconsole-exec patch, and AFAIK it wouldn't
affect the code above.

Alexander Nyberg is also experiencing crashes, aren't you?
Just un-compressing portage (20MB .tar.bz2) on the root_fs image posted
earlier caused a different kind of mis-behaviour, the guest lost network
connectivity, cpu usage shot up on the host (load > 10 now), and I found
this in the host log:
kernel: segfault at 00000000df2948d0 rip 00000000601e5beb rsp 00000000df2948d0 error 4
(same kernel as earlier crashes...)
That's on a different box, running a different host kernel (FC3 2.6.9-?)

The really weird thing is that the processes are still running, but ps
-ef shows an empty string in place of the process name:
(and the terminal which launched the instance got control back)
I am now rebuilding a new kernel on another test box, let me know what
to do to provide better debug information.

# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 13:27 ? 00:00:00 init [3]
root 2 1 0 13:27 ? 00:00:00 [ksoftirqd/0]
root 3 1 0 13:27 ? 00:00:00 [events/0]
root 4 3 0 13:27 ? 00:00:00 [khelper]
root 5 3 0 13:27 ? 00:00:00 [kacpid]
root 27 3 0 13:27 ? 00:00:00 [kblockd/0]
root 28 1 0 13:27 ? 00:00:00 [khubd]
root 40 3 0 13:27 ? 00:00:00 [pdflush]
root 41 3 0 13:27 ? 00:00:00 [pdflush]
root 43 3 0 13:27 ? 00:00:00 [aio/0]
root 42 1 0 13:27 ? 00:00:00 [kswapd0]
root 115 1 0 13:27 ? 00:00:00 [kseriod]
root 183 3 0 13:27 ? 00:00:00 [ata/0]
root 185 1 0 13:27 ? 00:00:00 [scsi_eh_0]
root 186 1 0 13:27 ? 00:00:00 [scsi_eh_1]
root 202 1 0 13:27 ? 00:00:00 [kjournald]
root 1025 1 0 13:27 ? 00:00:00 udevd
root 1261 1 0 13:27 ? 00:00:00 [khpsbpkt]
root 1290 1 0 13:27 ? 00:00:00 [knodemgrd_0]
root 1453 1 0 13:27 ? 00:00:00 [kjournald]
root 1454 1 0 13:27 ? 00:00:00 [kjournald]
root 1693 1 0 13:28 ? 00:00:00 syslogd -m 0
root 1697 1 0 13:28 ? 00:00:00 klogd -x
nobody 1737 1 0 13:28 ? 00:00:00 mDNSResponder
root 1755 1 0 13:28 ? 00:00:00 /usr/sbin/acpid
root 1770 1 0 13:28 ? 00:00:00 /usr/sbin/sshd
root 1779 1 0 13:28 ? 00:00:00 gpm -m /dev/input/mice -t imps2
root 1809 1 0 13:28 ? 00:00:00 crond
dbus 1833 1 0 13:28 ? 00:00:00 dbus-daemon-1 --system
root 1842 1 0 13:28 ? 00:00:00 hald
root 1849 1 0 13:28 ? 00:00:00 login -- root
root 1850 1 0 13:28 tty2 00:00:00 /sbin/mingetty tty2
root 1851 1 0 13:28 tty3 00:00:00 /sbin/mingetty tty3
root 1852 1 0 13:28 tty4 00:00:00 /sbin/mingetty tty4
root 1853 1 0 13:28 tty5 00:00:00 /sbin/mingetty tty5
root 1854 1 0 13:28 tty6 00:00:00 /sbin/mingetty tty6
root 2270 1849 0 13:31 tty1 00:00:00 -bash
root 2312 1770 0 13:31 ? 00:00:00 sshd: root@pts/0
root 2314 2312 0 13:32 pts/0 00:00:00 -bash
root 2363 2314 19 13:36 pts/0 00:30:19 ./setiathome
root 4346 1770 0 15:03 ? 00:00:00 sshd: root@pts/1
root 4350 4346 0 15:03 pts/1 00:00:00 -bash
root 5291 4350 0 15:14 pts/1 00:00:00 tail -f /var/log/messages /var/log/secure
root 6179 1 0 15:16 pts/0 00:00:35
root 7231 1 0 15:16 pts/0 00:00:35
root 7252 1 0 15:16 pts/0 00:00:35
root 7345 1 0 15:17 ? 00:00:00 /usr/sbin/dhcpd
root 7352 1 0 15:17 pts/0 00:00:34
root 7485 1 0 15:17 pts/0 00:00:34
root 7509 1 0 15:17 pts/0 00:00:34
root 7520 1770 0 15:18 ? 00:00:00 sshd: root@pts/2
root 7522 7520 0 15:18 pts/2 00:00:00 -bash
root 7561 7522 0 15:18 pts/2 00:00:00 ssh -v [email protected]
root 7563 1 1 15:18 pts/0 00:00:35
root 7571 1 1 15:18 pts/0 00:00:35
root 10302 1770 0 15:57 ? 00:00:00 sshd: root@pts/3
root 10304 10302 0 15:57 pts/3 00:00:00 -bash
root 10422 1 8 16:08 pts/0 00:00:38
root 10424 1 9 16:08 pts/0 00:00:42
root 10445 2314 0 16:16 pts/0 00:00:00 ps -ef

2005-05-08 15:15:46

by Andi Kleen

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

Antoine Martin writes:

>> (..)
>> That is a wrmsr to 0x00000000c0000102 (KERNEL_GS_BASE), the code
>> is trying to write 0x0000c8e816000002 into it. That is a non canonical
>> address, which causes the GPF.
>>
>> The strange thing is that the kernel should have rejected it in
>> the first place. The code to allow user space to set kernel gs
>> checks for the address being > TASK_SIZE and TASK_SIZE is 0x800000000000.
>> It should have rejected it in the first place.
>>
>> Are you sure you did not apply any strange UML related patches
>> to the host kernel? Maybe those are buggy.
> The only extra patch applied on top of what is on the web page (as per
> Jeff's instructions) is the mconsole-exec patch, and AFAIK it wouldn't
> affect the code above.
>
> Alexander Nyberg is also experiencing crashes, aren't you?

Ok, the bug is found now. It is a kernel bug: the kernel allows
non-canonical addresses to be set in 64-bit segment registers through ptrace.

But even if I fix that, it will not help you run UML, because
UML needs to set correct addresses of course, not illegal ones.

I will submit a patch later for the crash problem.

-Andi

2005-05-08 16:07:20

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

> Ok, the bug is found now. It is a kernel bug: the kernel allows
> non-canonical addresses to be set in 64-bit segment registers through ptrace.
Is this going to be part of 2.6.11.9 or just 2.6.12?
I've got a good test environment now if you need testers.

Antoine

2005-05-08 16:35:37

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, May 08, 2005 at 01:18:32AM +0100, Al Viro wrote:
> Hrm...

I had a lot of these fixed already. These will be mostly in the fixlets
patch.

> a) stub.S handling breaks on O= builds. Actually, your unprofile
> breaks there - it's bypassing the machinery that deals with include path.

This?
$(patsubst -pg,,$(patsubst -fprofile-arcs -ftest-coverage,,$(1)))

I don't see any connection to include paths there.

> b) stub_segv.c on amd64 includes <signal.h>. Not a good idea...

x86_64 doesn't, but i386 does. Fixed now.

> c) sysdep-x86_64/checksum.h in -rc4 has csum_partial_copy_from_user()
> that needs updating (AFAICS, you have that in your patchset, but it hadn't
> reached Linus)

Fixed.

> d) ip_compute_csum() prototype is missing (same file)
> j) ip_compute_csum should be exported on amd64.

OK, I need to look at this a bit more.

> e) #define UPT_SYSCALL_RET(r) UPT_RAX(r) is needed in amd64 ptrace.h

Fixed.

> f) take a good look at UPT_SET() in the same file ;-)

Sigh. Fixed now.

> g) CFLAGS_csum-partial.o := -Dcsum_partial=arch_csum_partial in
> sys-x86_64/Makefile needs to be removed.

Fixed.

> h) Makefile.rules should be included _after_ SYMLINKS = in the same
> file.

Fixed.

> i) sys-x86_64/delay.c needs exports of __udelay() and __const_udelay(),
> include of linux/module.h and barriers in your delay loop bodies (or games
> with volatile - anything that would guarantee that gcc won't decide to optimize
> the entire loop away). The last part applies to i386 as well.

Looks to me like it all applies to i386 too, except that __delay looks
unoptimizable.

It also looks to me like I could implement __udelay as n=... ; __delay(n);

And also never mind the fact that __udelay and __const_udelay are identical.

> k) sys-x86_64/syscalls.c needs include "kern.h"

Fixed now.

> l) elf-i386.h should include <asm/user.h>, not "user.h"

Fixed now.

> m) elf-x86_64.h lacks R_X86_64_... definitions

Fixed now.

> n) WTF _is_ that #ifdef TIF_IA32 in there? Aside of the trailing \,
> we could as well put #error there - free-floating clear_thread_flag(TIF_IA32);
> outside of any function body will have that effect anyway.

The trailing \ aside, which is fixed, that's a reminder for me when I add
the 32-bit compatibility code.

> o) in drivers/chan_kern.c we have several printf(KERN_ERR "...");
> these should become printk, as they clearly had been intended. As it is,
> they give instant panic if we ever call them.

Oops. This requires a bit of thought. Offhand, I think they need to be
printf, because that early in boot, printk'd stuff may not reach the screen
if UML exits then.
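
A small sketch of what printf(KERN_ERR "...") does at the string level. It assumes the 2.6-era definition of KERN_ERR as the literal "<3>"; format_err() and its message text are made up for illustration and do not come from drivers/chan_kern.c.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed 2.6-era definition: log levels are plain string literals
 * that get pasted onto the front of the format string. */
#define KERN_ERR "<3>"

/* Hypothetical helper: formats a message exactly as
 * printf(KERN_ERR "...") would, but into a caller-supplied buffer. */
static int format_err(char *buf, size_t len, const char *what)
{
    return snprintf(buf, len, KERN_ERR "failed to open %s\n", what);
}
```

printk strips the "<3>" prefix and uses it as the log level; plain printf has no idea what it means and prints it verbatim, so if these calls stay as printf the KERN_ERR prefix should probably go.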

> p) TOP_ADDR in Kconfig_x86_64 got lost in transmission - your patchset
> has it, but same patch in Linus' tree does not.

Fixed

>
> I've got patches for everything except (a); that one is really nasty. I hope
> to sort it out by tonight; if not, I'll just send what I've got by now.

OK, send me what you have, and if we've fixed the same thing differently,
I'll choose one or the other.

Jeff

2005-05-08 16:46:59

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, May 08, 2005 at 05:15:36PM +0200, Andi Kleen wrote:
> Antoine Martin writes:
> Ok, the bug is found now. It is a kernel bug: the kernel allows
> non-canonical addresses to be set in 64-bit segment registers through ptrace.
>
> But even if I fix that, it will not help you run UML, because
> UML needs to set correct addresses of course, not illegal ones.

True, but if the host stays up, and maybe printks something (or even returns
-EIO), that would help track down the UML problem.

Jeff

2005-05-08 17:35:25

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

> Are you sure you did not apply any strange UML related patches
> to the host kernel? Maybe those are buggy.

No, stock x86_64 kernel is horribly unstable running UML. I haven't seen
anything but output-free hangs, so I haven't had much information to
contribute. Antoine is actually getting capturable oopses.

I've tried every recent FC3 kernel, plus stock 2.6.10, and none of them
survive very long with UML running. Haven't tried stock 2.6.11 or anything
later yet.

Jeff

2005-05-08 17:35:31

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, May 08, 2005 at 05:35:02PM +0100, Antoine Martin wrote:
> The only extra patch applied on top of what is on the web page (as per
> Jeff's instructions) is the mconsole-exec patch, and AFAIK it wouldn't
> affect the code above.

mconsole-exec, if it's the patch I'm thinking of, is a patch to the UML
kernel, not to the host.

> The really weird thing is that the processes are still running, but ps
> -ef shows an empty string in place of the process name:
> (and the terminal which launched the instance got control back)
> I am now rebuilding a new kernel on another test box, let me know what
> to do to provide better debug information.

It's not unusual for UML processes to have strange names (including empty
ones) on the host.

Jeff

2005-05-08 18:19:24

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, 2005-05-08 at 12:45 -0400, Jeff Dike wrote:
> On Sun, May 08, 2005 at 05:35:02PM +0100, Antoine Martin wrote:
> > The only extra patch applied on top of what is on the web page (as per
> > Jeff's instructions) is the mconsole-exec patch, and AFAIK it wouldn't
> > affect the code above.
>
> mconsole-exec, if it's the patch I'm thinking of, is a patch to the UML
> kernel, not to the host.
Yep, that's the one, I thought the question was about the guest.
The host is running 2.6.11.8 - no extra patches at all.

> > The really weird thing is that the processes are still running, but ps
> > -ef shows an empty string in place of the process name:
> > (and the terminal which launched the instance got control back)
> > I am now rebuilding a new kernel on another test box, let me know what
> > to do to provide better debug information.
>
> It's not unusual for UML processes to have strange names (including empty
> ones) on the host.
Strange thing is, they had names up to the point where I got the
segfault.

Antoine

2005-05-09 21:08:02

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Sun, May 08, 2005 at 07:10:44AM +0100, Al Viro wrote:
> r) when built static, kernel dies ugly death with
> #0 0x00000000601e4178 in ptmalloc_init () at swab.h:134
> #1 0x00000000601e4034 in malloc_hook_ini () at swab.h:134
> #2 0x00000000601e1698 in malloc () at swab.h:134
> #3 0x00000000602068ee in _dl_init_paths () at swab.h:134
> #4 0x00000000601eba45 in _dl_non_dynamic_init () at swab.h:134
> #5 0x00000000601ebc60 in __libc_init_first () at swab.h:134
> #6 0x00000000601cfa4f in __libc_start_main () at swab.h:134
> #7 0x000000006001202a in _start () at proc_fs.h:183
> as stack trace. Buggered offsets in uml.lds, perhaps?

Apparently solved by adding .tdata and .tbss to uml.lds.S. That change does
not give any visible regression on i386.

s) i386 TT-only won't compile, due to misplaced include in
sysdep/ptrace.h (under ifdef for skas). Trivial fix apparently gets it
to work correctly. Which is surprising - when running that sucker on
amd64 we get zero from
*host_size_out = ROUND_4M((unsigned long) &arg);
Of course, the real size rounded up to 4M is 4Gb there - 32bit tasks do not
have to share the lower 4Gb of address space with the kernel. Looks like
we survive, though - it boots and apparently works both on i386 and (as
32bit process) on amd64.
t) amd64 TT-only builds just fine, but gets buggered due to
mismatch between CONFIG_TOP_ADDR and start address - we get *both* set
to 0x60000000, which is obviously b0rken and doesn't match the old
code, while we are at it. We want TOP_ADDR at 0x80000000 to match start
address.
u) amd64 TT is _still_ buggered due to unmap_fin.o attempts at
magic. errno sits in TLS for amd64, so unmap_fin.o gets very interesting
stuff leaking from libc and messing the link. IMO that should be dealt
with by brute force; namely, unmap-$(SUBARCH).S instead of trying to
play games with pulling stuff from libc.a. For fsck sake, we are just
making 3 syscalls there and switcheroo() is as low-level as it gets...
Will post once that's done...
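
To illustrate the zero in item (s): a sketch of rounding a stack address up to 4M in 32-bit versus 64-bit arithmetic. The functions below are a generic round-up, not necessarily the exact ROUND_4M macro from the UML tree, and the sample address 0xffffd000 is only an assumed stack location for a 32-bit task on an amd64 host.

```c
#include <stdint.h>

#define SZ_4M_MASK 0x3fffffu    /* 4M - 1 */

/* Round up to a 4M boundary using 32-bit arithmetic, as a 32-bit
 * process sees it when unsigned long is 32 bits wide. */
static uint32_t round_up_4m_32(uint32_t n)
{
    return (n + SZ_4M_MASK) & ~(uint32_t)SZ_4M_MASK;
}

/* Same computation with enough bits to hold the real answer. */
static uint64_t round_up_4m_64(uint64_t n)
{
    return (n + SZ_4M_MASK) & ~(uint64_t)SZ_4M_MASK;
}
```

For an address just below 4Gb the 64-bit result is 0x100000000, which does not fit in 32 bits, so the 32-bit version wraps around to 0. On native i386, where the stack sits lower (0xbffff000 is used as an assumed example), the same rounding gives 0xc0000000.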

2005-05-10 02:26:34

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Mon, May 09, 2005 at 10:07:53PM +0100, Al Viro wrote:
> u) amd64 TT is _still_ buggered due to unmap_fin.o attempts at
> magic. errno sits in TLS for amd64, so unmap_fin.o gets very interesting
> stuff leaking from libc and messing the link. IMO that should be dealt
> with by brute force; namely, unmap-$(SUBARCH).S instead of trying to
> play games with pulling stuff from libc.a. For fsck sake, we are just
> making 3 syscalls there and switcheroo() is as low-level as it gets...
> Will post once that's done...
OK, actually - C with use of _syscall(); still, per-architecture
due to different calling conventions (mmap() has enough arguments to
trigger irregularities). That deals with errno / __libc_errno getting
screwed, but there's more...

v) phys_mappings rbtree gets screwed in fixrange_init() - no
surprise, seeing what it does in
for ( ; (i < PTRS_PER_PGD) && (vaddr < end); pgd++, i++) {
pmd = (pmd_t *)pgd;
for (; (j < PTRS_PER_PMD) && (vaddr != end); pmd++, j++) {
Note that here PTRS_PER_PGD and PTRS_PER_PMD are both 512. Fun... Liberal
stealing from arch/i386/mm/init.c deals with that one, AFAICS. Now we have
the following:
uml/i386 - all variants work
uml/amd64 TT-only - panics in execve() on /sbin/init (hey, a progress)
uml/amd64 other variants - work
Now to figure out WTF is happening in that execve()...
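
For item (v), a sketch of why treating the pgd as the pmd cannot work with three real levels. The shifts below (12/21/30) and 512 entries per level are assumptions about the amd64 layout under discussion; the *_SK names and helper functions are hypothetical.

```c
#include <stdint.h>

/* Assumed three-level layout: 512 entries per level,
 * 4K pages, 2M per pmd entry, 1G per pgd entry. */
#define PAGE_SHIFT_SK   12
#define PMD_SHIFT_SK    21
#define PGDIR_SHIFT_SK  30
#define PTRS_PER_LVL_SK 512

/* Index into the top-level table for a virtual address. */
static unsigned pgd_index_sk(uint64_t vaddr)
{
    return (vaddr >> PGDIR_SHIFT_SK) & (PTRS_PER_LVL_SK - 1);
}

/* Index into the middle-level table for the same address. */
static unsigned pmd_index_sk(uint64_t vaddr)
{
    return (vaddr >> PMD_SHIFT_SK) & (PTRS_PER_LVL_SK - 1);
}

/* Index into the last-level table for the same address. */
static unsigned pte_index_sk(uint64_t vaddr)
{
    return (vaddr >> PAGE_SHIFT_SK) & (PTRS_PER_LVL_SK - 1);
}
```

With two-level i386 tables the pmd is folded into the pgd, so the cast pmd = (pmd_t *)pgd is harmless there. Under the layout assumed here, an address such as 0x60000000 has pgd index 1 but pmd index 256, so walking the pgd page as if it were a pmd page touches unrelated entries.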

2005-05-10 04:40:37

Subject: Re: 2.6.11.8 + UML/x86_64 (2.6.12-rc3+) = oops

On Tue, May 10, 2005 at 03:26:31AM +0100, Al Viro wrote:
> Now we have
> the following:
> uml/i386 - all variants work
> uml/amd64 TT-only - panics in execve() on /sbin/init (hey, a progress)
> uml/amd64 other variants - work

Nice, send patches when you get a chance?

Jeff

2005-05-10 10:02:20