It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
(fs/exec.c: restrict initial stack space expansion to rlimit) broke my
user mode Linux setup by somehow preventing system setup from running
properly (or killing some processes that try to mount things, etc.).
This commit turned up as the reason based on git bisect and reverting it
fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
arch for both). I have no idea what exactly would be the main cause for
this issue, but this looks like a somewhat unfortunately timed
regression in 2.6.33-rc8.
The failed run shows like this (with current linux-2.6.git):
...
EXT3-fs (ubda): mounted filesystem with writeback data mode
VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
mountall: mount /sys/kernel/debug [218] killed by KILL signal
mountall: Filesystem could not be mounted: /sys/kernel/debug
mountall: mount /dev [219] killed by KILL signal
mountall: Filesystem could not be mounted: /dev
mountall: mount /tmp [220] killed by KILL signal
mountall: Filesystem could not be mounted: /tmp
mountall: mount /var/lock [222] killed by KILL signal
mountall: Filesystem could not be mounted: /var/lock
...
With 803bf5ec reverted, UML comes up and the output looks like this:
...
EXT3-fs (ubda): mounted filesystem with writeback data mode
VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
init: procps main process (226) terminated with status 255
fsck from util-linux-ng 2.16
...
--
Jouni Malinen PGP id EFC895FA
In message <[email protected]> you wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
>
> The failed run shows like this (with current linux-2.6.git):
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
>
>
> With 803bf5ec reverted, UML comes up and the output looks like this:
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...
Crud, the "killed" is definitely something this patch could cause.
I'm not familiar with UML. Is this the guest and the host booting rc8,
or just the host? Does UML use stack protection at all?
Can you try booting the guest to init=/bin/sh and try running some tests
to see what you can set 'ulimit -s' to and still be able to run a simple
command like '/bin/ls'?
Mikey
>
> --
> Jouni Malinen PGP id EFC895FA
>
On Sunday 14 February 2010, Jouni Malinen wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
>
> The failed run shows like this (with current linux-2.6.git):
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
>
>
> With 803bf5ec reverted, UML comes up and the output looks like this:
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...
I have created the Bugzilla entry at http://bugzilla.kernel.org/show_bug.cgi?id=15308
for your report.
Thanks,
Rafael
>
>
> In message <[email protected]> you wrote:
> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > user mode Linux setup by somehow preventing system setup from running
> > properly (or killing some processes that try to mount things, etc.).
> > This commit turned up as the reason based on git bisect and reverting it
> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > arch for both). I have no idea what exactly would be the main cause for
> > this issue, but this looks like a somewhat unfortunately timed
> > regression in 2.6.33-rc8.
> >
> > The failed run shows like this (with current linux-2.6.git):
> >
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > mountall: mount /dev [219] killed by KILL signal
> > mountall: Filesystem could not be mounted: /dev
> > mountall: mount /tmp [220] killed by KILL signal
> > mountall: Filesystem could not be mounted: /tmp
> > mountall: mount /var/lock [222] killed by KILL signal
> > mountall: Filesystem could not be mounted: /var/lock
> > ...
Wow. It seems very strange. Usually stack overflow makes SIGSEGV, not SIGKILL.
plus, In my environment (x86_64 non-uml), mount command doesn't use
the stack so much.
% /usr/bin/time --format="mem %M" ls
/usr/bin/time --format="mem %M" ls
GPATH INSTALL README configure fdisk lib misc-utils schedutils
mem 3232
% sudo /usr/bin/time --format="mem %M" mount -a
sudo /usr/bin/time --format="mem %M" mount -a
mem 2992
Hmmm...
I have no idea.
> >
> >
> > With 803bf5ec reverted, UML comes up and the output looks like this:
> >
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > init: procps main process (226) terminated with status 255
> > fsck from util-linux-ng 2.16
> > ...
>
> Crud, the "killed" is definitely something this patch could cause.
>
> I'm not familiar with UML. Is this the guest and the host booting rc8,
> or just the host? Does UML use stack protection at all?
>
> Can you try booting the guest to init=/bin/sh and try running some tests
> to see what you can set 'ulimit -s' to and still be able to run a simple
> command like '/bin/ls'?
>
> Mikey
>
> >
> > --
> > Jouni Malinen PGP id EFC895FA
> >
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
In message <[email protected]> you wrote:
> It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> user mode Linux setup by somehow preventing system setup from running
> properly (or killing some processes that try to mount things, etc.).
> This commit turned up as the reason based on git bisect and reverting it
> fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> arch for both). I have no idea what exactly would be the main cause for
> this issue, but this looks like a somewhat unfortunately timed
> regression in 2.6.33-rc8.
>
> The failed run shows like this (with current linux-2.6.git):
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> mountall: mount /sys/kernel/debug [218] killed by KILL signal
> mountall: Filesystem could not be mounted: /sys/kernel/debug
> mountall: mount /dev [219] killed by KILL signal
> mountall: Filesystem could not be mounted: /dev
> mountall: mount /tmp [220] killed by KILL signal
> mountall: Filesystem could not be mounted: /tmp
> mountall: mount /var/lock [222] killed by KILL signal
> mountall: Filesystem could not be mounted: /var/lock
> ...
>
>
> With 803bf5ec reverted, UML comes up and the output looks like this:
>
> ...
> EXT3-fs (ubda): mounted filesystem with writeback data mode
> VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> init: procps main process (226) terminated with status 255
> fsck from util-linux-ng 2.16
> ...
Jouni,
I can reproduce this now.
We got the logic wrong in one of the cleanups and hence we aren't
actually changing the stack reservation ever, when we intended on
allocating up to 20 new pages.
The:
rlim_stack = min(rlim_stack, stack_size);
always chooses stack_size hence we end up not changing the stack at all.
This seems to cause fatal problems on UML, but is obviously not what was
intended for archs as well.
The following works for me on PPC64 64k and 4k pages and UML on x86_64.
Let me know if it fixes it for you also.
Mikey
exec/fs: fix initial stack reservation
803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
stack space expansion to rlimit) attempts to limit the initial stack to
20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
reduced in size, we ended up not changing the stack at all.
This caused a regression in UML resulting in most guest processes to be
killed.
Signed-off-by: Michael Neuling <[email protected]>
cc: <[email protected]>
diff --git a/fs/exec.c b/fs/exec.c
index e95c692..e0e7b3c 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
* will align it up.
*/
rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
- rlim_stack = min(rlim_stack, stack_size);
#ifdef CONFIG_STACK_GROWSUP
if (stack_size + stack_expand > rlim_stack)
- stack_base = vma->vm_start + rlim_stack;
+ /* Expand only to rlimit, making sure not to shrink it */
+ stack_base = vma->vm_start + max(rlim_stack,stack_size);
else
stack_base = vma->vm_end + stack_expand;
#else
if (stack_size + stack_expand > rlim_stack)
- stack_base = vma->vm_end - rlim_stack;
+ /* Expand only to rlimit, making sure not to shrink it */
+ stack_base = vma->vm_end - max(rlim_stack,stack_size);
else
stack_base = vma->vm_start - stack_expand;
#endif
On Mon, Feb 15, 2010 at 09:03:39AM +1100, Michael Neuling wrote:
> Crud, the "killed" is definitely something this patch could cause.
>
> I'm not familiar with UML. Is this the guest and the host booting rc8,
> or just the host? Does UML use stack protection at all?
Only the guest was booting rc8 (host is Ubuntu 2.6.31-19-generic), i.e.,
the issue shows up in the guest kernel after rc8 (and is removed by
reverting 803bf5). I do not know whether stack protection is used.
> Can you try booting the guest to init=/bin/sh and try running some tests
> to see what you can set 'ulimit -s' to and still be able to run a simple
> command like '/bin/ls'?
With 803bf5 included, init=/bin/sh seems to fail the boot quite
frequently, but not always. Once I get to shell on the UML guest,
/bin/ls fails about half of the time (e.g., running /bin/ls in /root
failed six times out of ten). This was before touching ulimit -s at all
("ulimit -s" shows 8192).
The behavior of 'ls' run with various ulimit -s values:
24-10000: OK or Killed
16-23: OK or Segmentation fault or Killed
1-15: Killed or Segmentation fault
I have no idea what is causing the random behavior in the results, but
anyway, it looks like 'ulimit -s 23' is the first point where
segmentation faults start showing up and 'ulimit -s 15' is the point
where ls fails every time.
If I start the guest (normal boot with multiple virtual consoles) with
803bf5 reverted, 'ulimit -s 84' has /bin/ls working every time and
'ulimit -s 83' makes it fail every time.
--
Jouni Malinen PGP id EFC895FA
>
>
> In message <[email protected]> you wrote:
> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > user mode Linux setup by somehow preventing system setup from running
> > properly (or killing some processes that try to mount things, etc.).
> > This commit turned up as the reason based on git bisect and reverting it
> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > arch for both). I have no idea what exactly would be the main cause for
> > this issue, but this looks like a somewhat unfortunately timed
> > regression in 2.6.33-rc8.
> >
> > The failed run shows like this (with current linux-2.6.git):
> >
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > mountall: mount /dev [219] killed by KILL signal
> > mountall: Filesystem could not be mounted: /dev
> > mountall: mount /tmp [220] killed by KILL signal
> > mountall: Filesystem could not be mounted: /tmp
> > mountall: mount /var/lock [222] killed by KILL signal
> > mountall: Filesystem could not be mounted: /var/lock
> > ...
> >
> >
> > With 803bf5ec reverted, UML comes up and the output looks like this:
> >
> > ...
> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > init: procps main process (226) terminated with status 255
> > fsck from util-linux-ng 2.16
> > ...
>
> Jouni,
>
> I can reproduce this now.
>
> We got the logic wrong in one of the cleanups and hence we aren't
> actually changing the stack reservation ever, when we intended on
> allocating up to 20 new pages.
>
> The:
> rlim_stack = min(rlim_stack, stack_size);
> always chooses stack_size hence we end up not changing the stack at all.
> This seems to cause fatal problems on UML, but is obviously not what was
> intended for archs as well.
>
> The following works for me on PPC64 64k and 4k pages and UML on x86_64.
>
> Let me know if it fixes it for you also.
>
> Mikey
>
>
> exec/fs: fix initial stack reservation
>
> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> stack space expansion to rlimit) attempts to limit the initial stack to
> 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
> reduced in size, we ended up not changing the stack at all.
>
> This caused a regression in UML resulting in most guest processes to be
> killed.
>
> Signed-off-by: Michael Neuling <[email protected]>
> cc: <[email protected]>
>
> diff --git a/fs/exec.c b/fs/exec.c
> index e95c692..e0e7b3c 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> * will align it up.
> */
> rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> - rlim_stack = min(rlim_stack, stack_size);
> #ifdef CONFIG_STACK_GROWSUP
> if (stack_size + stack_expand > rlim_stack)
> - stack_base = vma->vm_start + rlim_stack;
> + /* Expand only to rlimit, making sure not to shrink it */
> + stack_base = vma->vm_start + max(rlim_stack,stack_size);
> else
> stack_base = vma->vm_end + stack_expand;
> #else
> if (stack_size + stack_expand > rlim_stack)
> - stack_base = vma->vm_end - rlim_stack;
> + /* Expand only to rlimit, making sure not to shrink it */
> + stack_base = vma->vm_end - max(rlim_stack,stack_size);
> else
> stack_base = vma->vm_start - stack_expand;
> #endif
- rlim_stack = min(rlim_stack, stack_size);
+ /* Expand only to rlimit, making sure not to shrink it */
+ rlim_stack = max(rlim_stack, stack_size);
is better fix?
On Mon, Feb 15, 2010 at 11:38:52AM +0900, KOSAKI Motohiro wrote:
> Wow. It seems very strange. Usually stack overflow makes SIGSEGV, not SIGKILL.
> plus, In my environment (x86_64 non-uml), mount command doesn't use
> the stack so much.
The SIGKILL part is the odd one, I would assume.. It looks like about
half of the commands are failing with SIGKILL even with ulimit -s 8192
and the ulimit -s value did not really change the failure rate that
much if at all. SIGSEGV starts showing up only after having limited
ulimit -s to quite samll number (see my previous email).
> % /usr/bin/time --format="mem %M" ls
> /usr/bin/time --format="mem %M" ls
> GPATH INSTALL README configure fdisk lib misc-utils schedutils
> mem 3232
mem 3296
on my UML setup
> % sudo /usr/bin/time --format="mem %M" mount -a
> sudo /usr/bin/time --format="mem %M" mount -a
> mem 2992
mem 2928
--
Jouni Malinen PGP id EFC895FA
On Mon, Feb 15, 2010 at 05:30:20PM +1100, Michael Neuling wrote:
> We got the logic wrong in one of the cleanups and hence we aren't
> actually changing the stack reservation ever, when we intended on
> allocating up to 20 new pages.
>
> The:
> rlim_stack = min(rlim_stack, stack_size);
> always chooses stack_size hence we end up not changing the stack at all.
> This seems to cause fatal problems on UML, but is obviously not what was
> intended for archs as well.
>
> The following works for me on PPC64 64k and 4k pages and UML on x86_64.
That fixes my UML setup, too.
--
Jouni Malinen PGP id EFC895FA
On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:
> - rlim_stack = min(rlim_stack, stack_size);
> + /* Expand only to rlimit, making sure not to shrink it */
> + rlim_stack = max(rlim_stack, stack_size);
>
> is better fix?
Assuming I understood correctly that that was to replace the patch from
Michael completely and not just a part of it (i.e., just this one-liner
on top of linux-2.6.git), I can confirm that this, too, resolves the
issue I was seeing with UML.
--
Jouni Malinen PGP id EFC895FA
In message <[email protected]> you wrote:
> >
> >
> > In message <[email protected]> you wrote:
> > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > > user mode Linux setup by somehow preventing system setup from running
> > > properly (or killing some processes that try to mount things, etc.).
> > > This commit turned up as the reason based on git bisect and reverting it
> > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > > arch for both). I have no idea what exactly would be the main cause for
> > > this issue, but this looks like a somewhat unfortunately timed
> > > regression in 2.6.33-rc8.
> > >
> > > The failed run shows like this (with current linux-2.6.git):
> > >
> > > ...
> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > > mountall: mount /dev [219] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /dev
> > > mountall: mount /tmp [220] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /tmp
> > > mountall: mount /var/lock [222] killed by KILL signal
> > > mountall: Filesystem could not be mounted: /var/lock
> > > ...
> > >
> > >
> > > With 803bf5ec reverted, UML comes up and the output looks like this:
> > >
> > > ...
> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > init: procps main process (226) terminated with status 255
> > > fsck from util-linux-ng 2.16
> > > ...
> >
> > Jouni,
> >
> > I can reproduce this now.
> >
> > We got the logic wrong in one of the cleanups and hence we aren't
> > actually changing the stack reservation ever, when we intended on
> > allocating up to 20 new pages.
> >
> > The:
> > rlim_stack = min(rlim_stack, stack_size);
> > always chooses stack_size hence we end up not changing the stack at all.
> > This seems to cause fatal problems on UML, but is obviously not what was
> > intended for archs as well.
> >
> > The following works for me on PPC64 64k and 4k pages and UML on x86_64.
> >
> > Let me know if it fixes it for you also.
> >
> > Mikey
> >
> >
> > exec/fs: fix initial stack reservation
> >
> > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> > stack space expansion to rlimit) attempts to limit the initial stack to
> > 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
> > reduced in size, we ended up not changing the stack at all.
> >
> > This caused a regression in UML resulting in most guest processes to be
> > killed.
> >
> > Signed-off-by: Michael Neuling <[email protected]>
> > cc: <[email protected]>
> >
> > diff --git a/fs/exec.c b/fs/exec.c
> > index e95c692..e0e7b3c 100644
> > --- a/fs/exec.c
> > +++ b/fs/exec.c
> > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> > * will align it up.
> > */
> > rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> > - rlim_stack = min(rlim_stack, stack_size);
> > #ifdef CONFIG_STACK_GROWSUP
> > if (stack_size + stack_expand > rlim_stack)
> > - stack_base = vma->vm_start + rlim_stack;
> > + /* Expand only to rlimit, making sure not to shrink it */
> > + stack_base = vma->vm_start + max(rlim_stack,stack_size);
> > else
> > stack_base = vma->vm_end + stack_expand;
> > #else
> > if (stack_size + stack_expand > rlim_stack)
> > - stack_base = vma->vm_end - rlim_stack;
> > + /* Expand only to rlimit, making sure not to shrink it */
> > + stack_base = vma->vm_end - max(rlim_stack,stack_size);
> > else
> > stack_base = vma->vm_start - stack_expand;
> > #endif
>
> - rlim_stack = min(rlim_stack, stack_size);
> + /* Expand only to rlimit, making sure not to shrink it */
> + rlim_stack = max(rlim_stack, stack_size);
>
> is better fix?
Actually, I think we can just get rid of min() line altogether.
expand_stack checks to make sure the stack is getting bigger, otherwise
it does nothing. We don't need to bother with this check.
The below works for me on UML x86_64 and ppc64 64k and 4k pages.
Mikey
exec/fs: fix initial stack reservation
803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
stack space expansion to rlimit) attempts to limit the initial stack to
20*PAGE_SIZE. Unfortunately, in attempting ensure the stack is not
reduced in size, we ended up not changing the stack at all.
This size reduction check is not necessary as the expand_stack call does
this already.
This caused a regression in UML resulting in most guest processes being
killed.
Signed-off-by: Michael Neuling <[email protected]>
cc: <[email protected]>
---
fs/exec.c | 1 -
1 file changed, 1 deletion(-)
Index: linux-2.6-ozlabs/fs/exec.c
===================================================================
--- linux-2.6-ozlabs.orig/fs/exec.c
+++ linux-2.6-ozlabs/fs/exec.c
@@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm
* will align it up.
*/
rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
- rlim_stack = min(rlim_stack, stack_size);
#ifdef CONFIG_STACK_GROWSUP
if (stack_size + stack_expand > rlim_stack)
stack_base = vma->vm_start + rlim_stack;
On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:
>>
>>
>> In message <[email protected]> you wrote:
>> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
>> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
>> > user mode Linux setup by somehow preventing system setup from running
>> > properly (or killing some processes that try to mount things, etc.).
>> > This commit turned up as the reason based on git bisect and reverting it
>> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
>> > arch for both). I have no idea what exactly would be the main cause for
>> > this issue, but this looks like a somewhat unfortunately timed
>> > regression in 2.6.33-rc8.
>> >
>> > The failed run shows like this (with current linux-2.6.git):
>> >
>> > ...
>> > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /sys/kernel/debug
>> > mountall: mount /dev [219] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /dev
>> > mountall: mount /tmp [220] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /tmp
>> > mountall: mount /var/lock [222] killed by KILL signal
>> > mountall: Filesystem could not be mounted: /var/lock
>> > ...
>> >
>> >
>> > With 803bf5ec reverted, UML comes up and the output looks like this:
>> >
>> > ...
>> > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > init: procps main process (226) terminated with status 255
>> > fsck from util-linux-ng 2.16
>> > ...
>>
>> Jouni,
>>
>> I can reproduce this now.
>>
>> We got the logic wrong in one of the cleanups and hence we aren't
>> actually changing the stack reservation ever, when we intended on
>> allocating up to 20 new pages.
>>
>> The:
>> rlim_stack = min(rlim_stack, stack_size);
>> always chooses stack_size hence we end up not changing the stack at all.
>> This seems to cause fatal problems on UML, but is obviously not what was
>> intended for archs as well.
>>
>> The following works for me on PPC64 64k and 4k pages and UML on x86_64.
>>
>> Let me know if it fixes it for you also.
>>
>> Mikey
>>
>>
>> exec/fs: fix initial stack reservation
>>
>> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>> stack space expansion to rlimit) attempts to limit the initial stack to
>> 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
>> reduced in size, we ended up not changing the stack at all.
>>
>> This caused a regression in UML resulting in most guest processes to be
>> killed.
>>
>> Signed-off-by: Michael Neuling <[email protected]>
>> cc: <[email protected]>
>>
>> diff --git a/fs/exec.c b/fs/exec.c
>> index e95c692..e0e7b3c 100644
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
>> * will align it up.
>> */
>> rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>> - rlim_stack = min(rlim_stack, stack_size);
>> #ifdef CONFIG_STACK_GROWSUP
>> if (stack_size + stack_expand > rlim_stack)
>> - stack_base = vma->vm_start + rlim_stack;
>> + /* Expand only to rlimit, making sure not to shrink it */
>> + stack_base = vma->vm_start + max(rlim_stack,stack_size);
>> else
>> stack_base = vma->vm_end + stack_expand;
>> #else
>> if (stack_size + stack_expand > rlim_stack)
>> - stack_base = vma->vm_end - rlim_stack;
>> + /* Expand only to rlimit, making sure not to shrink it */
>> + stack_base = vma->vm_end - max(rlim_stack,stack_size);
>> else
>> stack_base = vma->vm_start - stack_expand;
>> #endif
>
>- rlim_stack = min(rlim_stack, stack_size);
>+ /* Expand only to rlimit, making sure not to shrink it */
>+ rlim_stack = max(rlim_stack, stack_size);
>
>is better fix?
>
Odd. If this is the right fix, 'stack_size" will be able to exceed
stack rlimit, then Michael's previous rlimit patch will be useless.
Am I missing something?
> On Mon, Feb 15, 2010 at 03:59:26PM +0900, KOSAKI Motohiro wrote:
> >>
> >>
> >> In message <[email protected]> you wrote:
> >> > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> >> > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> >> > user mode Linux setup by somehow preventing system setup from running
> >> > properly (or killing some processes that try to mount things, etc.).
> >> > This commit turned up as the reason based on git bisect and reverting it
> >> > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> >> > arch for both). I have no idea what exactly would be the main cause for
> >> > this issue, but this looks like a somewhat unfortunately timed
> >> > regression in 2.6.33-rc8.
> >> >
> >> > The failed run shows like this (with current linux-2.6.git):
> >> >
> >> > ...
> >> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> >> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> >> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /sys/kernel/debug
> >> > mountall: mount /dev [219] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /dev
> >> > mountall: mount /tmp [220] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /tmp
> >> > mountall: mount /var/lock [222] killed by KILL signal
> >> > mountall: Filesystem could not be mounted: /var/lock
> >> > ...
> >> >
> >> >
> >> > With 803bf5ec reverted, UML comes up and the output looks like this:
> >> >
> >> > ...
> >> > EXT3-fs (ubda): mounted filesystem with writeback data mode
> >> > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> >> > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> >> > init: procps main process (226) terminated with status 255
> >> > fsck from util-linux-ng 2.16
> >> > ...
> >>
> >> Jouni,
> >>
> >> I can reproduce this now.
> >>
> >> We got the logic wrong in one of the cleanups and hence we aren't
> >> actually changing the stack reservation ever, when we intended on
> >> allocating up to 20 new pages.
> >>
> >> The:
> >> rlim_stack = min(rlim_stack, stack_size);
> >> always chooses stack_size hence we end up not changing the stack at all.
> >> This seems to cause fatal problems on UML, but is obviously not what was
> >> intended for archs as well.
> >>
> >> The following works for me on PPC64 64k and 4k pages and UML on x86_64.
> >>
> >> Let me know if it fixes it for you also.
> >>
> >> Mikey
> >>
> >>
> >> exec/fs: fix initial stack reservation
> >>
> >> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> >> stack space expansion to rlimit) attempts to limit the initial stack to
> >> 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
> >> reduced in size, we ended up not changing the stack at all.
> >>
> >> This caused a regression in UML resulting in most guest processes to be
> >> killed.
> >>
> >> Signed-off-by: Michael Neuling <[email protected]>
> >> cc: <[email protected]>
> >>
> >> diff --git a/fs/exec.c b/fs/exec.c
> >> index e95c692..e0e7b3c 100644
> >> --- a/fs/exec.c
> >> +++ b/fs/exec.c
> >> @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> >> * will align it up.
> >> */
> >> rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> >> - rlim_stack = min(rlim_stack, stack_size);
> >> #ifdef CONFIG_STACK_GROWSUP
> >> if (stack_size + stack_expand > rlim_stack)
> >> - stack_base = vma->vm_start + rlim_stack;
> >> + /* Expand only to rlimit, making sure not to shrink it */
> >> + stack_base = vma->vm_start + max(rlim_stack,stack_size);
> >> else
> >> stack_base = vma->vm_end + stack_expand;
> >> #else
> >> if (stack_size + stack_expand > rlim_stack)
> >> - stack_base = vma->vm_end - rlim_stack;
> >> + /* Expand only to rlimit, making sure not to shrink it */
> >> + stack_base = vma->vm_end - max(rlim_stack,stack_size);
> >> else
> >> stack_base = vma->vm_start - stack_expand;
> >> #endif
> >
> >- rlim_stack = min(rlim_stack, stack_size);
> >+ /* Expand only to rlimit, making sure not to shrink it */
> >+ rlim_stack = max(rlim_stack, stack_size);
> >
> >is better fix?
> >
>
> Odd. If this is the right fix, 'stack_size" will be able to exceed
> stack rlimit, then Michael's previous rlimit patch will be useless.
> Am I missing something?
>
This function is in exec processing, IOW user process doesn't start yet,
and stack_size is always PAGE_SIZE.
No problem.
This expression only mean we parse "ulimit -s 1" as "ulimit -s 4".
(round up to one-page)
> In message <[email protected]> you wrote:
> > >
> > >
> > > In message <[email protected]> you wrote:
> > > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
> > > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
> > > > user mode Linux setup by somehow preventing system setup from running
> > > > properly (or killing some processes that try to mount things, etc.).
> > > > This commit turned up as the reason based on git bisect and reverting it
> > > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
> > > > arch for both). I have no idea what exactly would be the main cause for
> > > > this issue, but this looks like a somewhat unfortunately timed
> > > > regression in 2.6.33-rc8.
> > > >
> > > > The failed run shows like this (with current linux-2.6.git):
> > > >
> > > > ...
> > > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /sys/kernel/debug
> > > > mountall: mount /dev [219] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /dev
> > > > mountall: mount /tmp [220] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /tmp
> > > > mountall: mount /var/lock [222] killed by KILL signal
> > > > mountall: Filesystem could not be mounted: /var/lock
> > > > ...
> > > >
> > > >
> > > > With 803bf5ec reverted, UML comes up and the output looks like this:
> > > >
> > > > ...
> > > > EXT3-fs (ubda): mounted filesystem with writeback data mode
> > > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
> > > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
> > > > init: procps main process (226) terminated with status 255
> > > > fsck from util-linux-ng 2.16
> > > > ...
> > >
> > > Jouni,
> > >
> > > I can reproduce this now.
> > >
> > > We got the logic wrong in one of the cleanups and hence we aren't
> > > actually changing the stack reservation ever, when we intended on
> > > allocating up to 20 new pages.
> > >
> > > The:
> > > rlim_stack = min(rlim_stack, stack_size);
> > > always chooses stack_size hence we end up not changing the stack at all.
> > > This seems to cause fatal problems on UML, but is obviously not what was
> > > intended for archs as well.
> > >
> > > The following works for me on PPC64 64k and 4k pages and UML on x86_64.
> > >
> > > Let me know if it fixes it for you also.
> > >
> > > Mikey
> > >
> > >
> > > exec/fs: fix initial stack reservation
> > >
> > > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> > > stack space expansion to rlimit) attempts to limit the initial stack to
> > > 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
> > > reduced in size, we ended up not changing the stack at all.
> > >
> > > This caused a regression in UML resulting in most guest processes to be
> > > killed.
> > >
> > > Signed-off-by: Michael Neuling <[email protected]>
> > > cc: <[email protected]>
> > >
> > > diff --git a/fs/exec.c b/fs/exec.c
> > > index e95c692..e0e7b3c 100644
> > > --- a/fs/exec.c
> > > +++ b/fs/exec.c
> > > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
> > > * will align it up.
> > > */
> > > rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> > > - rlim_stack = min(rlim_stack, stack_size);
> > > #ifdef CONFIG_STACK_GROWSUP
> > > if (stack_size + stack_expand > rlim_stack)
> > > - stack_base = vma->vm_start + rlim_stack;
> > > + /* Expand only to rlimit, making sure not to shrink it */
> > > + stack_base = vma->vm_start + max(rlim_stack,stack_size);
> > > else
> > > stack_base = vma->vm_end + stack_expand;
> > > #else
> > > if (stack_size + stack_expand > rlim_stack)
> > > - stack_base = vma->vm_end - rlim_stack;
> > > + /* Expand only to rlimit, making sure not to shrink it */
> > > + stack_base = vma->vm_end - max(rlim_stack,stack_size);
> > > else
> > > stack_base = vma->vm_start - stack_expand;
> > > #endif
> >
> > - rlim_stack = min(rlim_stack, stack_size);
> > + /* Expand only to rlimit, making sure not to shrink it */
> > + rlim_stack = max(rlim_stack, stack_size);
> >
> > is better fix?
>
> Actually, I think we can just get rid of min() line altogether.
> expand_stack checks to make sure the stack is getting bigger, otherwise
> it does nothing. We don't need to bother with this check.
>
> The below works for me on UML x86_64 and ppc64 64k and 4k pages.
OK, Right you are.
Reviewed-by: KOSAKI Motohiro <[email protected]>
>
> Mikey
>
> exec/fs: fix initial stack reservation
>
> 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
> stack space expansion to rlimit) attempts to limit the initial stack to
> 20*PAGE_SIZE. Unfortunately, in attempting ensure the stack is not
> reduced in size, we ended up not changing the stack at all.
>
> This size reduction check is not necessary as the expand_stack call does
> this already.
>
> This caused a regression in UML resulting in most guest processes being
> killed.
>
> Signed-off-by: Michael Neuling <[email protected]>
> cc: <[email protected]>
> ---
> fs/exec.c | 1 -
> 1 file changed, 1 deletion(-)
>
> Index: linux-2.6-ozlabs/fs/exec.c
> ===================================================================
> --- linux-2.6-ozlabs.orig/fs/exec.c
> +++ linux-2.6-ozlabs/fs/exec.c
> @@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm
> * will align it up.
> */
> rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
> - rlim_stack = min(rlim_stack, stack_size);
> #ifdef CONFIG_STACK_GROWSUP
> if (stack_size + stack_expand > rlim_stack)
> stack_base = vma->vm_start + rlim_stack;
On Mon, Feb 15, 2010 at 07:57:11PM +1100, Michael Neuling wrote:
>In message <[email protected]> you wrote:
>> >
>> >
>> > In message <[email protected]> you wrote:
>> > > It looks like the commit 803bf5ec259941936262d10ecc84511b76a20921
>> > > (fs/exec.c: restrict initial stack space expansion to rlimit) broke my
>> > > user mode Linux setup by somehow preventing system setup from running
>> > > properly (or killing some processes that try to mount things, etc.).
>> > > This commit turned up as the reason based on git bisect and reverting it
>> > > fixes my UML test setup (Ubuntu 9.10 on both host and in UML and AMD64
>> > > arch for both). I have no idea what exactly would be the main cause for
>> > > this issue, but this looks like a somewhat unfortunately timed
>> > > regression in 2.6.33-rc8.
>> > >
>> > > The failed run shows like this (with current linux-2.6.git):
>> > >
>> > > ...
>> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > mountall: mount /sys/kernel/debug [218] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /sys/kernel/debug
>> > > mountall: mount /dev [219] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /dev
>> > > mountall: mount /tmp [220] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /tmp
>> > > mountall: mount /var/lock [222] killed by KILL signal
>> > > mountall: Filesystem could not be mounted: /var/lock
>> > > ...
>> > >
>> > >
>> > > With 803bf5ec reverted, UML comes up and the output looks like this:
>> > >
>> > > ...
>> > > EXT3-fs (ubda): mounted filesystem with writeback data mode
>> > > VFS: Mounted root (ext3 filesystem) readonly on device 98:0.
>> > > IRQ 3/console-write: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 2/console: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > IRQ 10/winch: IRQF_DISABLED is not guaranteed on shared IRQs
>> > > init: procps main process (226) terminated with status 255
>> > > fsck from util-linux-ng 2.16
>> > > ...
>> >
>> > Jouni,
>> >
>> > I can reproduce this now.
>> >
>> > We got the logic wrong in one of the cleanups and hence we aren't
>> > actually changing the stack reservation ever, when we intended on
>> > allocating up to 20 new pages.
>> >
>> > The:
>> > rlim_stack = min(rlim_stack, stack_size);
>> > always chooses stack_size hence we end up not changing the stack at all.
>> > This seems to cause fatal problems on UML, but is obviously not what was
>> > intended for archs as well.
>> >
>> > The following works for me on PPC64 64k and 4k pages and UML on x86_64.
>> >
>> > Let me know if it fixes it for you also.
>> >
>> > Mikey
>> >
>> >
>> > exec/fs: fix initial stack reservation
>> >
>> > 803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>> > stack space expansion to rlimit) attempts to limit the initial stack to
>> > 20*PAGE_SIZE. Unfortunately, in also attempting ensure the stack is not
>> > reduced in size, we ended up not changing the stack at all.
>> >
>> > This caused a regression in UML resulting in most guest processes to be
>> > killed.
>> >
>> > Signed-off-by: Michael Neuling <[email protected]>
>> > cc: <[email protected]>
>> >
>> > diff --git a/fs/exec.c b/fs/exec.c
>> > index e95c692..e0e7b3c 100644
>> > --- a/fs/exec.c
>> > +++ b/fs/exec.c
>> > @@ -637,15 +637,16 @@ int setup_arg_pages(struct linux_binprm *bprm,
>> > * will align it up.
>> > */
>> > rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>> > - rlim_stack = min(rlim_stack, stack_size);
>> > #ifdef CONFIG_STACK_GROWSUP
>> > if (stack_size + stack_expand > rlim_stack)
>> > - stack_base = vma->vm_start + rlim_stack;
>> > + /* Expand only to rlimit, making sure not to shrink it */
>> > + stack_base = vma->vm_start + max(rlim_stack,stack_size);
>> > else
>> > stack_base = vma->vm_end + stack_expand;
>> > #else
>> > if (stack_size + stack_expand > rlim_stack)
>> > - stack_base = vma->vm_end - rlim_stack;
>> > + /* Expand only to rlimit, making sure not to shrink it */
>> > + stack_base = vma->vm_end - max(rlim_stack,stack_size);
>> > else
>> > stack_base = vma->vm_start - stack_expand;
>> > #endif
>>
>> - rlim_stack = min(rlim_stack, stack_size);
>> + /* Expand only to rlimit, making sure not to shrink it */
>> + rlim_stack = max(rlim_stack, stack_size);
>>
>> is better fix?
>
>Actually, I think we can just get rid of min() line altogether.
>expand_stack checks to make sure the stack is getting bigger, otherwise
>it does nothing. We don't need to bother with this check.
>
Right...
Above change makes me confused. :-( But now, everything is clear.
>The below works for me on UML x86_64 and ppc64 64k and 4k pages.
>
>Mikey
>
>exec/fs: fix initial stack reservation
>
>803bf5ec259941936262d10ecc84511b76a20921 (fs/exec.c: restrict initial
>stack space expansion to rlimit) attempts to limit the initial stack to
>20*PAGE_SIZE. Unfortunately, in attempting ensure the stack is not
>reduced in size, we ended up not changing the stack at all.
>
>This size reduction check is not necessary as the expand_stack call does
>this already.
>
>This caused a regression in UML resulting in most guest processes being
>killed.
>
>Signed-off-by: Michael Neuling <[email protected]>
>cc: <[email protected]>
This one definitely better.
Acked-by: WANG Cong <[email protected]>
>---
> fs/exec.c | 1 -
> 1 file changed, 1 deletion(-)
>
>Index: linux-2.6-ozlabs/fs/exec.c
>===================================================================
>--- linux-2.6-ozlabs.orig/fs/exec.c
>+++ linux-2.6-ozlabs/fs/exec.c
>@@ -637,7 +637,6 @@ int setup_arg_pages(struct linux_binprm
> * will align it up.
> */
> rlim_stack = rlimit(RLIMIT_STACK) & PAGE_MASK;
>- rlim_stack = min(rlim_stack, stack_size);
> #ifdef CONFIG_STACK_GROWSUP
> if (stack_size + stack_expand > rlim_stack)
> stack_base = vma->vm_start + rlim_stack;
>--
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
--
Live like a child, think like the god.