2010-02-04 04:03:01

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Thu, 2010-02-04 at 01:10 +0000, Ben Hutchings wrote:
> Commit 7ab02af428c2d312c0cf8fb0b01cc1eb21131a3d upstream fixes a
> regression caused by 221af7f87b97431e3ee21ce4b0e77d5411cf1549.

Having said that, it doesn't really fix it for me.

I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
After applying commit 221af7f to Debian's kernel source (approximately
equivalent to 2.6.32.7), the kernel fails to exec init. After commit
7ab02af it can exec init but that immediately segfaults:

[ 0.684493] init[1]: segfault at 6d241850 ip 000000006d241850 sp 00000000403fca30 error 14
[ 0.686204] Kernel panic - not syncing: Attempted to kill init!
[ 0.687081] Pid: 1, comm: init Not tainted 2.6.32-2-amd64 #1
[ 0.687968] Call Trace:
[ 0.688716] [<ffffffff812e8c21>] ? panic+0x86/0x141
[ 0.689540] [<ffffffff8104a63a>] ? __cond_resched+0x1d/0x26
[ 0.690390] [<ffffffff81056aed>] ? exit_ptrace+0x30/0x126
[ 0.692344] [<ffffffff81050cd3>] ? do_exit+0x72/0x6b5
[ 0.693261] [<ffffffff8105138c>] ? do_group_exit+0x76/0x9d
[ 0.694203] [<ffffffff8105da4c>] ? get_signal_to_deliver+0x310/0x33c
[ 0.695719] [<ffffffff8101000f>] ? do_notify_resume+0x87/0x73f
[ 0.696754] [<ffffffff81188de8>] ? __down_read_trylock+0x3e/0x44
[ 0.697717] [<ffffffff81188f75>] ? __up_read+0x13/0x8e
[ 0.698748] [<ffffffff8101159c>] ? retint_signal+0x48/0x8c

Ben.

--
Ben Hutchings
friends: People who know you well, but like you anyway.


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2010-02-04 05:40:38

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split



On Thu, 4 Feb 2010, Ben Hutchings wrote:
>
> I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
> After applying commit 221af7f to Debian's kernel source (approximately
> equivalent to 2.6.32.7), the kernel fails to exec init. After commit
> 7ab02af it can exec init but that immediately segfaults:

It sounds like you have picked individual commits.

But you don't mention commit 05d43ed8a, which is also a required part of
the series.

So you _should_ have a combination of
- 221af7f87 ("Split 'flush_old_exec' into two functions")
- 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
- 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")

(and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
bit removal, but they shouldn't matter on your system)

Linus

2010-02-04 08:29:39

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Wed, 2010-02-03 at 21:39 -0800, Linus Torvalds wrote:
>
> On Thu, 4 Feb 2010, Ben Hutchings wrote:
> >
> > I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
> > After applying commit 221af7f to Debian's kernel source (approximately
> > equivalent to 2.6.32.7), the kernel fails to exec init. After commit
> > 7ab02af it can exec init but that immediately segfaults:
>
> It sounds like you have picked individual commits.

Yes - I'm one of the kernel package maintainers and we're sticking with
2.6.32-stable.

> But you don't mention commit 05d43ed8a, which is also a required part of
> the series.
>
> So you _should_ have a combination of
> - 221af7f87 ("Split 'flush_old_exec' into two functions")
> - 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
> - 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")
>
> (and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
> bit removal, but they shouldn't matter on your system)

Thanks. If all the necessary patches are all in the stable queue then
we can pick them from there.

Ben.

--
Ben Hutchings
friends: People who know you well, but like you anyway.


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2010-02-04 14:41:32

by Greg KH

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Thu, Feb 04, 2010 at 08:29:34AM +0000, Ben Hutchings wrote:
> On Wed, 2010-02-03 at 21:39 -0800, Linus Torvalds wrote:
> >
> > On Thu, 4 Feb 2010, Ben Hutchings wrote:
> > >
> > > I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
> > > After applying commit 221af7f to Debian's kernel source (approximately
> > > equivalent to 2.6.32.7), the kernel fails to exec init. After commit
> > > 7ab02af it can exec init but that immediately segfaults:
> >
> > It sounds like you have picked individual commits.
>
> Yes - I'm one of the kernel package maintainers and we're sticking with
> 2.6.32-stable.
>
> > But you don't mention commit 05d43ed8a, which is also a required part of
> > the series.
> >
> > So you _should_ have a combination of
> > - 221af7f87 ("Split 'flush_old_exec' into two functions")
> > - 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
> > - 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")
> >
> > (and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
> > bit removal, but they shouldn't matter on your system)
>
> Thanks. If all the necessary patches are all in the stable queue then
> we can pick them from there.

They should all be there already, if not, please let me know.

thanks,

greg k-h

2010-02-04 15:58:50

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split



On Thu, 4 Feb 2010, Ben Hutchings wrote:
> >
> > So you _should_ have a combination of
> > - 221af7f87 ("Split 'flush_old_exec' into two functions")
> > - 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
> > - 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")
> >
> > (and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
> > bit removal, but they shouldn't matter on your system)
>
> Thanks. If all the necessary patches are all in the stable queue then
> we can pick them from there.

Yeah, they are all there. Please do report if that fixes your 64-bit
kernel with 32-bit user space issues (I tested that case, but I don't have
a full 32-bit environment, so I only tested it on a fairly simple
test-case that showed the pre-patch problem that the series fixes).

Btw, that 221af7f87 commit (even with the fix) is kind of nasty in that it
changes semantics without then fixing up the users in the same commit.
Normally we wouldn't accept anything like that, but it was supposed to
only change semantics for a case that was already broken, and is pretty
rare (the transition from 32-bit to 64-bit and vice versa).

Splitting them up was supposed to make it clearer what was going on and
tint he original version the first patch didn't change semantics. And in
fact, the split-up did indeed then help me chase down the bug that showed
up on Microblaze, because it broke an architecture that shouldn't have
been affected at all ;)

But pretty it wasn't. My bad. It would have been much better if we'd have
fixed this earlier than -rc6, but the bugreport that reported this came in
around -rc5. Unlucky timing (because the problem has been around for a
looong time).

Linus

2010-02-04 18:46:41

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-04 15:38 +0100, Greg KH wrote:

> On Thu, Feb 04, 2010 at 08:29:34AM +0000, Ben Hutchings wrote:
>> On Wed, 2010-02-03 at 21:39 -0800, Linus Torvalds wrote:
>> >
>> > On Thu, 4 Feb 2010, Ben Hutchings wrote:
>> > >
>> > > I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
>> > > After applying commit 221af7f to Debian's kernel source (approximately
>> > > equivalent to 2.6.32.7), the kernel fails to exec init. After commit
>> > > 7ab02af it can exec init but that immediately segfaults:
>> >
>> > It sounds like you have picked individual commits.
>>
>> Yes - I'm one of the kernel package maintainers and we're sticking with
>> 2.6.32-stable.
>>
>> > But you don't mention commit 05d43ed8a, which is also a required part of
>> > the series.
>> >
>> > So you _should_ have a combination of
>> > - 221af7f87 ("Split 'flush_old_exec' into two functions")
>> > - 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
>> > - 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")
>> >
>> > (and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
>> > bit removal, but they shouldn't matter on your system)
>>
>> Thanks. If all the necessary patches are all in the stable queue then
>> we can pick them from there.
>
> They should all be there already, if not, please let me know.

It seems they are all there, but on my system with 64-bit kernel and
32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.

Sven

2010-02-04 18:57:42

by Greg KH

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Thu, Feb 04, 2010 at 07:46:30PM +0100, Sven Joachim wrote:
> On 2010-02-04 15:38 +0100, Greg KH wrote:
>
> > On Thu, Feb 04, 2010 at 08:29:34AM +0000, Ben Hutchings wrote:
> >> On Wed, 2010-02-03 at 21:39 -0800, Linus Torvalds wrote:
> >> >
> >> > On Thu, 4 Feb 2010, Ben Hutchings wrote:
> >> > >
> >> > > I'm using Debian i386 (i.e. 32-bit userland) with a 64-bit kernel.
> >> > > After applying commit 221af7f to Debian's kernel source (approximately
> >> > > equivalent to 2.6.32.7), the kernel fails to exec init. After commit
> >> > > 7ab02af it can exec init but that immediately segfaults:
> >> >
> >> > It sounds like you have picked individual commits.
> >>
> >> Yes - I'm one of the kernel package maintainers and we're sticking with
> >> 2.6.32-stable.
> >>
> >> > But you don't mention commit 05d43ed8a, which is also a required part of
> >> > the series.
> >> >
> >> > So you _should_ have a combination of
> >> > - 221af7f87 ("Split 'flush_old_exec' into two functions")
> >> > - 05d43ed8a ("x86: get rid of the insane TIF_ABI_PENDING bit")
> >> > - 7ab02af42 ("Fix 'flush_old_exec()/setup_new_exec()' split")
> >> >
> >> > (and there are also additional sparc/ppc versions of that TIF_ABI_PENDING
> >> > bit removal, but they shouldn't matter on your system)
> >>
> >> Thanks. If all the necessary patches are all in the stable queue then
> >> we can pick them from there.
> >
> > They should all be there already, if not, please let me know.
>
> It seems they are all there, but on my system with 64-bit kernel and
> 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.

Does 2.6.33-rc6 also cause you the same problem?

thanks,

greg k-h

2010-02-04 19:12:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split



On Thu, 4 Feb 2010, Sven Joachim wrote:
>
> It seems they are all there, but on my system with 64-bit kernel and
> 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.

Ok. Greg - please skip these patches from stable for now. I'll try to
figure out what's up.

Sven/Ben: is /sbin/init (or wherever debian puts it) a regular ELF file?
Shared libraries? Anything at all special about it? I wonder why it seems
to have issues, when other 32-bit programs don't.

Of course, it's entirely possible that other Debian 32-bit programs do
not, but I haven't heard about problems with (for example) firefox from
people who run 64-bit distros but with a 32-bit browser (which at least
used to be very common due to the whole flash plugin issue). So I wonder
if there is soemthing special about init.

Linus

2010-02-04 19:32:51

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-04 19:57 +0100, Greg KH wrote:

> On Thu, Feb 04, 2010 at 07:46:30PM +0100, Sven Joachim wrote:
>> It seems they are all there, but on my system with 64-bit kernel and
>> 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.
>
> Does 2.6.33-rc6 also cause you the same problem?

A build from Linus' current git tree (commit 7ab02af42 was added after
2.6.33-rc6, so I skipped that version) does not show the problem.
Actually, I'm using it right now.

Sven

2010-02-04 19:39:17

by Linus Torvalds

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split



On Thu, 4 Feb 2010, Sven Joachim wrote:

> On 2010-02-04 19:57 +0100, Greg KH wrote:
>
> > On Thu, Feb 04, 2010 at 07:46:30PM +0100, Sven Joachim wrote:
> >> It seems they are all there, but on my system with 64-bit kernel and
> >> 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.
> >
> > Does 2.6.33-rc6 also cause you the same problem?
>
> A build from Linus' current git tree (commit 7ab02af42 was added after
> 2.6.33-rc6, so I skipped that version) does not show the problem.
> Actually, I'm using it right now.

Ok, then that's just really odd. I don't think there are any other changes
in this area, and the whole init problem sure as hell smells like execve
setup problems. And afaik, 2.6.32.8-rc1 should have all the commits from
current -git.

Maybe we're looking at a separate issue after all.

Linus

2010-02-04 19:46:31

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-04 20:11 +0100, Linus Torvalds wrote:

> Sven/Ben: is /sbin/init (or wherever debian puts it) a regular ELF file?
> Shared libraries? Anything at all special about it? I wonder why it seems
> to have issues, when other 32-bit programs don't.

,----
| % file /sbin/init
| /sbin/init: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
| % ldd /sbin/init
| linux-gate.so.1 => (0xf770d000)
| libsepol.so.1 => /lib/libsepol.so.1 (0xf76c0000)
| libselinux.so.1 => /lib/libselinux.so.1 (0xf76a6000)
| libc.so.6 => /lib/i686/cmov/libc.so.6 (0xf755e000)
| libdl.so.2 => /lib/i686/cmov/libdl.so.2 (0xf755a000)
| /lib/ld-linux.so.2 (0xf770e000)
`----

Apparently nothing special, and booting with init=/bin/bash does not
change anything.

Sven

2010-02-04 19:55:53

by Linus Torvalds

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split



On Thu, 4 Feb 2010, Sven Joachim wrote:

> On 2010-02-04 20:11 +0100, Linus Torvalds wrote:
>
> > Sven/Ben: is /sbin/init (or wherever debian puts it) a regular ELF file?
> > Shared libraries? Anything at all special about it? I wonder why it seems
> > to have issues, when other 32-bit programs don't.
>
> ,----
> | % file /sbin/init
> | /sbin/init: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped
> | % ldd /sbin/init
> | linux-gate.so.1 => (0xf770d000)
> | libsepol.so.1 => /lib/libsepol.so.1 (0xf76c0000)
> | libselinux.so.1 => /lib/libselinux.so.1 (0xf76a6000)
> | libc.so.6 => /lib/i686/cmov/libc.so.6 (0xf755e000)
> | libdl.so.2 => /lib/i686/cmov/libdl.so.2 (0xf755a000)
> | /lib/ld-linux.so.2 (0xf770e000)
> `----
>
> Apparently nothing special, and booting with init=/bin/bash does not
> change anything.

Yeah, well, if my current -git tree works, then there is something else
going on.

But I don't really see anything relevant _except_ for those commits. I do
note that mainline also has the do_wait() thread optimizations since
2.6.32, and I guess those aren't in stable, but I really don't see them
interacting with that whole flush_old_exec thing in any way.

Very odd.

Linus

2010-02-04 22:26:07

by Ben Hutchings

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Thu, 2010-02-04 at 11:11 -0800, Linus Torvalds wrote:
>
> On Thu, 4 Feb 2010, Sven Joachim wrote:
> >
> > It seems they are all there, but on my system with 64-bit kernel and
> > 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.
>
> Ok. Greg - please skip these patches from stable for now. I'll try to
> figure out what's up.
>
> Sven/Ben: is /sbin/init (or wherever debian puts it) a regular ELF file?
> Shared libraries? Anything at all special about it? I wonder why it seems
> to have issues, when other 32-bit programs don't.

In a Debian initramfs, /init is a script interpreted by /bin/sh.

I've now tested i386 userland on an x86_64 kernel with all the relevant
patches on top of 2.6.32.7:

fdpic-respect-pt_gnu_stack-exec-protection-markings-when-creating-nommu-stack.patch
split-flush_old_exec-into-two-functions.patch
sparc-tif_abi_pending-bit-removal.patch
x86-get-rid-of-the-insane-tif_abi_pending-bit.patch
fix-flush_old_exec-setup_new_exec-split.patch
powerpc-tif_abi_pending-bit-removal.patch

This works perfectly, so far as I can see.

Ben.

--
Ben Hutchings
friends: People who know you well, but like you anyway.


Attachments:
(No filename) (1.14 kB)
signature.asc (827.00 B)
Digital signature
Download all attachments

2010-02-05 05:44:26

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-04 20:37 +0100, Linus Torvalds wrote:

> Ok, then that's just really odd. I don't think there are any other changes
> in this area, and the whole init problem sure as hell smells like execve
> setup problems. And afaik, 2.6.32.8-rc1 should have all the commits from
> current -git.
>
> Maybe we're looking at a separate issue after all.

Possibly, because a 32-bit 2.6.32.8-rc1 kernel also panics very early.

Sven

2010-02-05 10:23:31

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-04 23:25 +0100, Ben Hutchings wrote:

> In a Debian initramfs, /init is a script interpreted by /bin/sh.

Yes, I tested without an initramfs though. The crash happens actually
before init is run.

> I've now tested i386 userland on an x86_64 kernel with all the relevant
> patches on top of 2.6.32.7:
>
> fdpic-respect-pt_gnu_stack-exec-protection-markings-when-creating-nommu-stack.patch
> split-flush_old_exec-into-two-functions.patch
> sparc-tif_abi_pending-bit-removal.patch
> x86-get-rid-of-the-insane-tif_abi_pending-bit.patch
> fix-flush_old_exec-setup_new_exec-split.patch
> powerpc-tif_abi_pending-bit-removal.patch
>
> This works perfectly, so far as I can see.

It seems so. The faulty patch is actually the cherry-pick of commit
28f6aeea (net: restore ip source validation), reverting that leads to a
kernel that runs fine. And this has nothing to do with 64-bit vs
32-bit.

Sven

2010-02-06 08:49:34

by Greg KH

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Thu, Feb 04, 2010 at 08:32:32PM +0100, Sven Joachim wrote:
> On 2010-02-04 19:57 +0100, Greg KH wrote:
>
> > On Thu, Feb 04, 2010 at 07:46:30PM +0100, Sven Joachim wrote:
> >> It seems they are all there, but on my system with 64-bit kernel and
> >> 32-bit userland, 2.6.32.8-rc1 still panics in the way noticed by Ben.
> >
> > Does 2.6.33-rc6 also cause you the same problem?
>
> A build from Linus' current git tree (commit 7ab02af42 was added after
> 2.6.33-rc6, so I skipped that version) does not show the problem.
> Actually, I'm using it right now.

I think this was the sysctl issue, and have released a 2.6.32.8-rc2.
Can you test that to verify if it fixes your crash or not?

thanks,

greg k-h

2010-02-06 09:21:54

by Sven Joachim

[permalink] [raw]
Subject: Re: [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On 2010-02-06 09:49 +0100, Greg KH wrote:

> I think this was the sysctl issue, and have released a 2.6.32.8-rc2.
> Can you test that to verify if it fixes your crash or not?

Yes, it fixes the issue. I'm using -rc2 right now, did not notice any
problems so far.

Sven

2010-02-06 09:33:14

by Willy Tarreau

[permalink] [raw]
Subject: Re: [Stable-review] [stable] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Sat, Feb 06, 2010 at 10:21:48AM +0100, Sven Joachim wrote:
> On 2010-02-06 09:49 +0100, Greg KH wrote:
>
> > I think this was the sysctl issue, and have released a 2.6.32.8-rc2.
> > Can you test that to verify if it fixes your crash or not?
>
> Yes, it fixes the issue. I'm using -rc2 right now, did not notice any
> problems so far.

FWIW, it's running fine here too. I've tried to boot on a miniature
init launcher I have (statically linked one) and did not have any issue.

Willy

2010-02-06 09:55:26

by Greg KH

[permalink] [raw]
Subject: Re: [stable] [Stable-review] [PATCH] Fix 'flush_old_exec()/setup_new_exec()' split

On Sat, Feb 06, 2010 at 10:31:44AM +0100, Willy Tarreau wrote:
> On Sat, Feb 06, 2010 at 10:21:48AM +0100, Sven Joachim wrote:
> > On 2010-02-06 09:49 +0100, Greg KH wrote:
> >
> > > I think this was the sysctl issue, and have released a 2.6.32.8-rc2.
> > > Can you test that to verify if it fixes your crash or not?
> >
> > Yes, it fixes the issue. I'm using -rc2 right now, did not notice any
> > problems so far.
>
> FWIW, it's running fine here too. I've tried to boot on a miniature
> init launcher I have (statically linked one) and did not have any issue.

Wonderful, thanks to both of you for letting me know. I feel a lot
better about this release now...

thanks,

greg k-h