2005-12-15 21:24:48

by Adrian Bunk

Subject: [2.6 patch] i386: always use 4k stacks

It seems most problems with 4k stacks are already resolved at least
in -mm.

I'd like to see this patch to always use 4k stacks in -mm now for
finding any remaining problems before submitting this patch for Linus'
tree.


Signed-off-by: Adrian Bunk <[email protected]>
Acked-by: Arjan van de Ven <[email protected]>

---

This patch was already sent on:
- 11 Dec 2005
- 5 Dec 2005
- 30 Nov 2005
- 23 Nov 2005
- 14 Nov 2005

arch/i386/Kconfig.debug | 10 ----------
arch/i386/kernel/irq.c | 10 ----------
include/asm-i386/irq.h | 11 +++--------
include/asm-i386/module.h | 8 +-------
include/asm-i386/thread_info.h | 6 +-----
5 files changed, 5 insertions(+), 40 deletions(-)

--- linux-2.6.14-mm2-full/arch/i386/Kconfig.debug.old 2005-11-14 01:30:54.000000000 +0100
+++ linux-2.6.14-mm2-full/arch/i386/Kconfig.debug 2005-11-14 01:31:06.000000000 +0100
@@ -52,16 +52,6 @@
portion of the kernel code won't be covered by a 2MB TLB anymore.
If in doubt, say "N".

-config 4KSTACKS
- bool "Use 4Kb for kernel stacks instead of 8Kb"
- depends on DEBUG_KERNEL
- help
- If you say Y here the kernel will use a 4Kb stacksize for the
- kernel stack attached to each process/thread. This facilitates
- running more threads on a system and also reduces the pressure
- on the VM subsystem for higher order allocations. This option
- will also use IRQ stacks to compensate for the reduced stackspace.
-
config X86_FIND_SMP_CONFIG
bool
depends on X86_LOCAL_APIC || X86_VOYAGER
--- linux-2.6.14-mm2-full/include/asm-i386/irq.h.old 2005-11-14 01:31:18.000000000 +0100
+++ linux-2.6.14-mm2-full/include/asm-i386/irq.h 2005-11-14 01:31:29.000000000 +0100
@@ -27,14 +27,9 @@
# define ARCH_HAS_NMI_WATCHDOG /* See include/linux/nmi.h */
#endif

-#ifdef CONFIG_4KSTACKS
- extern void irq_ctx_init(int cpu);
- extern void irq_ctx_exit(int cpu);
-# define __ARCH_HAS_DO_SOFTIRQ
-#else
-# define irq_ctx_init(cpu) do { } while (0)
-# define irq_ctx_exit(cpu) do { } while (0)
-#endif
+extern void irq_ctx_init(int cpu);
+extern void irq_ctx_exit(int cpu);
+#define __ARCH_HAS_DO_SOFTIRQ

#ifdef CONFIG_IRQBALANCE
extern int irqbalance_disable(char *str);
--- linux-2.6.14-mm2-full/include/asm-i386/thread_info.h.old 2005-11-14 01:31:45.000000000 +0100
+++ linux-2.6.14-mm2-full/include/asm-i386/thread_info.h 2005-11-14 01:32:11.000000000 +0100
@@ -53,11 +53,7 @@
#endif

#define PREEMPT_ACTIVE 0x10000000
-#ifdef CONFIG_4KSTACKS
-#define THREAD_SIZE (4096)
-#else
-#define THREAD_SIZE (8192)
-#endif
+#define THREAD_SIZE (4096)

#define STACK_WARN (THREAD_SIZE/8)
/*
--- linux-2.6.14-mm2-full/include/asm-i386/module.h.old 2005-11-14 01:32:18.000000000 +0100
+++ linux-2.6.14-mm2-full/include/asm-i386/module.h 2005-11-14 01:32:42.000000000 +0100
@@ -64,12 +64,6 @@
#define MODULE_REGPARM ""
#endif

-#ifdef CONFIG_4KSTACKS
-#define MODULE_STACKSIZE "4KSTACKS "
-#else
-#define MODULE_STACKSIZE ""
-#endif
-
-#define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY MODULE_REGPARM MODULE_STACKSIZE
+#define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY MODULE_REGPARM

#endif /* _ASM_I386_MODULE_H */

--- linux-2.6.15-rc5-mm2-full/arch/i386/kernel/irq.c.old 2005-12-11 15:10:27.000000000 +0100
+++ linux-2.6.15-rc5-mm2-full/arch/i386/kernel/irq.c 2005-12-11 15:11:29.000000000 +0100
@@ -33,7 +33,6 @@
}
#endif

-#ifdef CONFIG_4KSTACKS
/*
* per-CPU IRQ handling contexts (thread information and stack)
*/
@@ -44,7 +43,6 @@

static union irq_ctx *hardirq_ctx[NR_CPUS];
static union irq_ctx *softirq_ctx[NR_CPUS];
-#endif

/*
* do_IRQ handles all normal device IRQ's (the special
@@ -55,10 +53,8 @@
{
/* high bits used in ret_from_ code */
int irq = regs->orig_eax & 0xff;
-#ifdef CONFIG_4KSTACKS
union irq_ctx *curctx, *irqctx;
u32 *isp;
-#endif

irq_enter();
#ifdef CONFIG_DEBUG_STACKOVERFLOW
@@ -76,8 +72,6 @@
}
#endif

-#ifdef CONFIG_4KSTACKS
-
curctx = (union irq_ctx *) current_thread_info();
irqctx = hardirq_ctx[smp_processor_id()];

@@ -104,7 +98,6 @@
: "memory", "cc", "ecx"
);
} else
-#endif
__do_IRQ(irq, regs);

irq_exit();
@@ -114,8 +107,6 @@
return 1;
}

-#ifdef CONFIG_4KSTACKS
-
/*
* These should really be __section__(".bss.page_aligned") as well, but
* gcc's 3.0 and earlier don't handle that correctly.
@@ -200,7 +191,6 @@
}

EXPORT_SYMBOL(do_softirq);
-#endif

/*
* Interrupt statistics:


2005-12-15 22:02:04

by Andrew Morton

Subject: Re: [2.6 patch] i386: always use 4k stacks

Adrian Bunk <[email protected]> wrote:
>
> This patch was already sent on:
> - 11 Dec 2005
> - 5 Dec 2005
> - 30 Nov 2005
> - 23 Nov 2005
> - 14 Nov 2005

Sigh. I saw the volume of email last time and thought "gee, glad I wasn't
cc'ed on that lot".

Supporting 8k stacks is a small amount of code and nobody has seen a need
to make changes in there for quite a long time. So there's little cost to
keeping the existing code.

And the existing code is useful:

a) people can enable it to confirm that their weird crash was due to a
stack overflow.

b) If I was going to put together a maximally-stable kernel for a
complex server machine, I'd select 8k stacks. We're still just too
squeezy, and we've had too many relatively-recent overflows, and there
are still some really deep callpaths in there.

2005-12-15 22:11:47

by Jeffrey V. Merkey

Subject: Re: [2.6 patch] i386: always use 4k stacks



Andrew,

Thanks. I concur.

Jeff

Andrew Morton wrote:

>Adrian Bunk <[email protected]> wrote:
>
>
>>This patch was already sent on:
>>- 11 Dec 2005
>>- 5 Dec 2005
>>- 30 Nov 2005
>>- 23 Nov 2005
>>- 14 Nov 2005
>>
>>
>
>Sigh. I saw the volume of email last time and thought "gee, glad I wasn't
>cc'ed on that lot".
>
>Supporting 8k stacks is a small amount of code and nobody has seen a need
>to make changes in there for quite a long time. So there's little cost to
>keeping the existing code.
>
>And the existing code is useful:
>
>a) people can enable it to confirm that their weird crash was due to a
> stack overflow.
>
>b) If I was going to put together a maximally-stable kernel for a
> complex server machine, I'd select 8k stacks. We're still just too
> squeezy, and we've had too many relatively-recent overflows, and there
> are still some really deep callpaths in there.
>

2005-12-15 22:30:00

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 02:00:13PM -0800, Andrew Morton wrote:
> Adrian Bunk <[email protected]> wrote:
> >
> > This patch was already sent on:
> > - 11 Dec 2005
> > - 5 Dec 2005
> > - 30 Nov 2005
> > - 23 Nov 2005
> > - 14 Nov 2005
>
> Sigh. I saw the volume of email last time and thought "gee, glad I wasn't
> cc'ed on that lot".

If you subtract the "this breaks my binary-only M$ Windows driver"
emails there's not much volume left.

> Supporting 8k stacks is a small amount of code and nobody has seen a need
> to make changes in there for quite a long time. So there's little cost to
> keeping the existing code.
>
> And the existing code is useful:
>
> a) people can enable it to confirm that their weird crash was due to a
> stack overflow.
>
> b) If I was going to put together a maximally-stable kernel for a
> complex server machine, I'd select 8k stacks. We're still just too
> squeezy, and we've had too many relatively-recent overflows, and there
> are still some really deep callpaths in there.

a1) People turn off 4k stacks and never report the problem / no one
really debugs and fixes the reported problem.

Me threatening people with enabling 4k stacks for everyone already
resulted in several fixes.

And how many weird crashes with _different_ causes have you seen?
It could be that there are only _very_ few problems that no one really
debugs because disabling 4k stacks fixes the issue.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-15 22:36:55

by Jeffrey V. Merkey

Subject: Re: [2.6 patch] i386: always use 4k stacks

Adrian Bunk wrote:

>On Thu, Dec 15, 2005 at 02:00:13PM -0800, Andrew Morton wrote:
>
>
>>Adrian Bunk <[email protected]> wrote:
>>
>>
>>>This patch was already sent on:
>>>- 11 Dec 2005
>>>- 5 Dec 2005
>>>- 30 Nov 2005
>>>- 23 Nov 2005
>>>- 14 Nov 2005
>>>
>>>
>>Sigh. I saw the volume of email last time and thought "gee, glad I wasn't
>>cc'ed on that lot".
>>
>>
>
>If you subtract the "this breaks my binary-only M$ Windows driver"
>emails there's not much volume left.
>
>
>
>>Supporting 8k stacks is a small amount of code and nobody has seen a need
>>to make changes in there for quite a long time. So there's little cost to
>>keeping the existing code.
>>
>>And the existing code is useful:
>>
>>a) people can enable it to confirm that their weird crash was due to a
>> stack overflow.
>>
>>b) If I was going to put together a maximally-stable kernel for a
>> complex server machine, I'd select 8k stacks. We're still just too
>> squeezy, and we've had too many relatively-recent overflows, and there
>> are still some really deep callpaths in there.
>>
>>
>
>a1) People turn off 4k stacks and never report the problem / no one
> really debugs and fixes the reported problem.
>
>Me threatening people with enabling 4k stacks for everyone already
>resulted in several fixes.
>
>And how many weird crashes with _different_ causes have you seen?
>It could be that there are only _very_ few problems that no one really
>debugs because disabling 4k stacks fixes the issue.
>
>

When you are on the phone with an irate customer at 2:00 am in the
morning, and just turning off your broken 4K stack fix and getting the
customer running matters. 4K stacks are a BAD idea. I have even found
USER SPACE apps that crash linux without the 8K option. Andrew has
spoken. Suck it up and deal with it. It's not a problem limited to
Windows drivers.

Jeff

>cu
>Adrian
>
>
>

2005-12-15 23:15:26

by Lee Revell

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
> When you are on the phone with an irate customer at 2:00 am in the
> morning, and just turning off your broken 4K stack fix
> and getting the customer running matters.

Bugzilla link please. Otherwise STFU.

Lee

2005-12-15 23:16:16

by Jeffrey V. Merkey

Subject: Re: [2.6 patch] i386: always use 4k stacks

Lee Revell wrote:

>On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
>
>
>>When you are on the phone with an irate customer at 2:00 am in the
>>morning, and just turning off your broken 4K stack fix
>>and getting the customer running matters.
>>
>>
>
>Bugzilla link please. Otherwise STFU.
>
>

??????

Jeff

>Lee
>
>
>
>

2005-12-15 23:17:14

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 11:30:00PM +0100, Adrian Bunk wrote:

> And how many weird crashes with _different_ causes have you seen?
> It could be that there are only _very_ few problems that no one really
> debugs because disabling 4k stacks fixes the issue.

the block layer issue that Neil had patches for was the only one
that rings any bells for me[*] (and the only one in Fedora bugzilla
that anyone has actually hit -- and that's 2-3 people out of
a *lot* of users).

Dave

[*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
for a long time -- people were reporting overflows there before we
enabled 4K stacks.

2005-12-15 23:23:51

by Lee Revell

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 2005-12-15 at 14:46 -0700, Jeff V. Merkey wrote:
> Lee Revell wrote:
>
> >On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
> >
> >
> >>When you are on the phone with an irate customer at 2:00 am in the
> >>morning, and just turning off your broken 4K stack fix
> >>and getting the customer running matters.
> >>
> >>
> >
> >Bugzilla link please. Otherwise STFU.
> >
> >
>
> ??????
>
> Jeff

You imply that your customer's problem was due to a kernel bug triggered
by CONFIG_4KSTACKS. I am asking you to provide a link to the bug report
or get lost.

Lee

2005-12-15 23:33:57

by Jeffrey V. Merkey

Subject: Re: [2.6 patch] i386: always use 4k stacks

Lee Revell wrote:

>On Thu, 2005-12-15 at 14:46 -0700, Jeff V. Merkey wrote:
>
>
>>Lee Revell wrote:
>>
>>
>>
>>>On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
>>>
>>>
>>>
>>>
>>>>When you are on the phone with an irate customer at 2:00 am in the
>>>>morning, and just turning off your broken 4K stack fix
>>>>and getting the customer running matters.
>>>>
>>>>
>>>>
>>>>
>>>Bugzilla link please. Otherwise STFU.
>>>
>>>
>>>
>>>
>>??????
>>
>>Jeff
>>
>>
>
>You imply that your customer's problem was due to a kernel bug triggered
>by CONFIG_4KSTACKS. I am asking you to provide a link to the bug report
>or get lost.
>
>Lee
>
>

You hack on this code base (hack is the right word) -- I sell it,
service and support it with customers in a dozen countries. I don't report
company level issues in "bugzilla" or anywhere else public unless they
apply to kernel code. Calls from several of our apps (which use
larger than 4K kernel space on a stack) from user space crash -- so do
wireless drivers -- and kdb crashes as well with some bugs with 4K stacks
turned on when you are trying to debug something.

Hope that addresses your concerns "joe job".

Jeff


2005-12-15 23:36:52

by Ismail Donmez

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday 16 December 2005 00:04, you wrote:
> Lee Revell wrote:
> >On Thu, 2005-12-15 at 14:46 -0700, Jeff V. Merkey wrote:
> >>Lee Revell wrote:
> >>>On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
> >>>>When you are on the phone with an irate customer at 2:00 am in the
> >>>>morning, and just turning off your broken 4K stack fix
> >>>>and getting the customer running matters.
> >>>
> >>>Bugzilla link please. Otherwise STFU.
> >>
> >>??????
> >>
> >>Jeff
> >
> >You imply that your customer's problem was due to a kernel bug triggered
> >by CONFIG_4KSTACKS. I am asking you to provide a link to the bug report
> >or get lost.
> >
> >Lee
>
> You hack on this code base (hack is the right word) -- I sell it,
> service and support it with customers in a dozen countries. I don't report
> company level issues in "bugzilla" or anywhere else public unless they
> apply to kernel code. calls from several of our apps (which use
> larger than 4K kernel space on a stack) from user space crash -- so do
> wireless drivers -- and kdb crashes as well with some bugs with 4K stacks
> turned on when you are trying to debug something.
>
> Hope that addresses your concerns "joe job".

You are supposed to report those bugs in a manner that won't conflict with the
privacy of your customer(s). Linux distros do this already.

/ismail

2005-12-16 00:07:58

by grundig

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 15 Dec 2005 15:04:38 -0700,
"Jeff V. Merkey" <[email protected]> wrote:

> apply to kernel code. calls from several of our apps (which use
> larger than 4K kernel space on a stack) from user space crash -- so do
> wireless drivers -- and kdb crashes as well with some bugs with 4K stacks
> turned on when you are trying to debug something.

If you (or other people) don't report those bugs, nobody else except
you will care about them, I'm afraid.

"My customer says it crashes but I don't want to report it publicly".
What kind of excuse is that? O_o

2005-12-16 00:20:39

by Jeffrey V. Merkey

Subject: Re: [2.6 patch] i386: always use 4k stacks

Fri, 16 Dec 2005 01:08:02 +0100 wrote:

>On Thu, 15 Dec 2005 15:04:38 -0700,
>"Jeff V. Merkey" <[email protected]> wrote:
>
>
>
>>apply to kernel code. calls from several of our apps (which use
>>larger than 4K kernel space on a stack) from user space crash -- so do
>>wireless drivers -- and kdb crashes as well with some bugs with 4K stacks
>>turned on when you are trying to debug something.
>>
>>
>
>If you (or other people) don't report those bugs, nobody else except
>you will care about them, I'm afraid.
>
>"My customer says it crashes but I don't want to report it publicly".
>What kind of excuse is that? O_o
>
You need to go back and read the whole thread. These bugs were reported
by me weeks ago.

Jeff

2005-12-16 00:37:11

by Ray Lee

Subject: Re: [2.6 patch] i386: always use 4k stacks

(Man, I've been holding my tongue on this conversation for a while,
but it seems my better angels have deserted me.)

On 12/15/05, Lee Revell <[email protected]> wrote:
> Bugzilla link please.

No, that's not how failure engineering is done. A guy designing a
bridge doesn't cut all the supports back to the bare minimum just to
save money because his design says that the remaining metal should be
strong enough. If you can't prove it, and it's a safety issue
(continuing my analogy in the physical world), then you engineer for
failure. Can you handle all occurrences? No, a hurricane Katrina comes
along every once in a while. Can you weather more than you did before?
Yes. In the meantime, there are fewer poor sods falling off the bridge
that have to open a bugzilla report.

The world of software is no different. If someone wants to remove the
8k stacks option, they'd better prove that they're making my servers
more reliable. I've seen zero arguments for why 8k stacks is unviable.
(I've also wondered why we can't just have IRQ stacks plus 8k thread
stacks -- seemingly the best of both worlds) Instead, what I've seen
is that we have coders who don't like the idea of any non-order-zero
allocations taking place, because big systems running poorly coded
Java apps with massive threading can hit problems with allocations
from time to time.

The answer for that is the same answer the kernel community usually
gives about poorly designed userspace applications: rewrite them.

I'm quite open to being proved wrong. If someone has a counter case
they can toss forth, please do so. Systems taking lots of interrupts?
Then how about 8k + IRQ stacks? With a counterexample I'll gladly
concede that I'm an ignorant slut[*] -- excuse me, Saturday Night Live
flashbacks -- an ignorant git, and shut up. ([*] is only half right,
I'm not all that ignorant).

If someone doesn't show a counter case, then may I suggest people
consider the possibility that this is not proper engineering. Prove
it, or provide a safety blanket. But don't yank the blanket without
proving the lack of problem.

Ray

2005-12-16 00:40:54

by Alan

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 2005-12-15 at 15:04 -0700, Jeff V. Merkey wrote:
> apply to kernel code. calls from several of our apps (which use
> larger than 4K kernel space on a stack)

Then you've got bugs anyway. In 8K stack mode that stack is shared with
the IRQ/BH/etc stack so you've only got 4K to play with. It's just more
random whether your box explodes.
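
The arithmetic behind this point can be sketched as a small model. This is not kernel code: the function name, the enum, and the 4096-byte worst-case interrupt figure are assumptions chosen to match the "only got 4K" number above, and the space taken by thread_info at the bottom of the stack is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the two layouts discussed in this thread. */
enum stack_layout {
	SHARED_WITH_IRQ,	/* 8K mode: IRQ/BH handlers run on the task stack */
	SEPARATE_IRQ_STACKS	/* 4K mode: IRQ/softirq have their own stacks */
};

/*
 * Bytes of stack a task can rely on if interrupt handlers may need up
 * to irq_worst_case bytes at any moment (an assumed figure).
 */
static size_t usable_task_stack(size_t thread_size, size_t irq_worst_case,
				enum stack_layout layout)
{
	assert(thread_size > 0);

	if (layout == SEPARATE_IRQ_STACKS)
		return thread_size;	/* interrupts do not eat into it */
	if (irq_worst_case >= thread_size)
		return 0;		/* nothing can be guaranteed */
	return thread_size - irq_worst_case;
}
```

Under these assumptions, usable_task_stack(8192, 4096, SHARED_WITH_IRQ) and usable_task_stack(4096, 4096, SEPARATE_IRQ_STACKS) both give 4096, which is why code needing more than 4K of kernel stack is at risk in either configuration.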

2005-12-16 00:47:40

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 06:15:38PM -0500, Dave Jones wrote:
> On Thu, Dec 15, 2005 at 11:30:00PM +0100, Adrian Bunk wrote:
>
> > And how many weird crashes with _different_ causes have you seen?
> > It could be that there are only _very_ few problems that no one really
> > debugs because disabling 4k stacks fixes the issue.
>
> the block layer issue that Neil had patches for was the only one
> that rings any bells for me[*] (and the only one in Fedora bugzilla
> that anyone has actually hit -- and that's 2-3 people out of
> a *lot* of users).

Neil's patch is required, and since it's not in 2.6.15-rc we might still
get bug reports with 4k stacks that are fixed by his patch.

Do we have any bug reports due to 4k stacks against -mm since Neil's
patch was included?

People were able to convince me in the past to delay my patch to always
use 4k stacks by pointing to unsolved problems (or I pointed them out, as
in the reiser4 case) - and these were constructive delays since the code
was fixed. So if someone wants to convince me that it's too early for my
patch, simply send me some pointers to 4k stack issues still present in
a recent -mm. :-)

Hm, I just found two serious stack usage regressions in 2.6.15-rc (bug
report will be in a separate email), but allocating arrays with more
than 2000 elements on the stack is always wrong in the kernel
independent of the stack size...

> Dave
>
> [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> for a long time -- people were reporting overflows there before we
> enabled 4K stacks.

I remember someone from the XFS maintainers (Nathan?) saying they
believe they have solved all XFS stack issues.

If there are any XFS issues left, do you have a pointer to them?

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 00:52:17

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 01:47:40AM +0100, Adrian Bunk wrote:

> > [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> > for a long time -- people were reporting overflows there before we
> > enabled 4K stacks.
>
> I remember someone from the XFS maintainers (Nathan?) saying they
> believe having solved all XFS stack issues.
>
> If there are any XFS issues left, do you have a pointer to them?

The last one I saw may actually have been more related
to the block layer problem. iirc that was a user NFS exporting
XFS on a raid1 array.

Dave

2005-12-16 00:58:17

by Michael Buesch

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday 16 December 2005 01:08, you wrote:
> On Thu, 15 Dec 2005 15:04:38 -0700,
> "Jeff V. Merkey" <[email protected]> wrote:
>
> > apply to kernel code. calls from several of our apps (which use
> > larger than 4K kernel space on a stack) from user space crash -- so do
> > wireless drivers -- and kdb crashes as well with some bugs with 4K stacks
> > turned on when you are trying to debug something.
>
> If you (or other people) don't report those bugs, nobody else except
> you will care about them, I'm afraid.
>
> "My customer says it crashes but I don't want to report it publicly".
> What kind of excuse is that? O_o

Your customer runs an -mm kernel on his production systems?
Smash him.
This is about removing 8k support in the -mm kernel, to
find the remaining bugs (if there are any).

--
Greetings Michael.



2005-12-16 01:16:14

by Nathan Scott

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 01:47:40AM +0100, Adrian Bunk wrote:
> ...
> > [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> > for a long time -- people were reporting overflows there before we
> > enabled 4K stacks.
>
> I remember someone from the XFS maintainers (Nathan?) saying they
> believe having solved all XFS stack issues.

We don't know of any remaining issues...

> If there are any XFS issues left, do you have a pointer to them?

...so I was curious to see these too, since we've never had any
reported from Dave / anyone else @RH.

cheers.

--
Nathan

2005-12-16 01:38:33

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 12:15:19PM +1100, Nathan Scott wrote:
> On Fri, Dec 16, 2005 at 01:47:40AM +0100, Adrian Bunk wrote:
> > ...
> > > [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> > > for a long time -- people were reporting overflows there before we
> > > enabled 4K stacks.
> >
> > I remember someone from the XFS maintainers (Nathan?) saying they
> > believe having solved all XFS stack issues.
>
> We don't know of any remaining issues...
>
> > If there are any XFS issues left, do you have a pointer to them?
>
> ...so I was curious to see these too, since we've never had any
> reported from Dave / anyone else @RH.

When they've come up in bugzilla, I've pointed them at the xfs
mailing lists. As these folks can reproduce the problems, it seemed
pointless to play middle-man.

Dave

2005-12-16 01:44:25

by Zwane Mwaikambo

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 15 Dec 2005, Adrian Bunk wrote:

> On Thu, Dec 15, 2005 at 02:00:13PM -0800, Andrew Morton wrote:
>
> > Supporting 8k stacks is a small amount of code and nobody has seen a need
> > to make changes in there for quite a long time. So there's little cost to
> > keeping the existing code.
> >
> > And the existing code is useful:
> >
> > a) people can enable it to confirm that their weird crash was due to a
> > stack overflow.
> >
> > b) If I was going to put together a maximally-stable kernel for a
> > complex server machine, I'd select 8k stacks. We're still just too
> > squeezy, and we've had too many relatively-recent overflows, and there
> > are still some really deep callpaths in there.
>
> > a1) People turn off 4k stacks and never report the problem / no one
> really debugs and fixes the reported problem.
>
> Me threatening people with enabling 4k stacks for everyone already
> resulted in several fixes.

How about this, we apply this patch and perhaps add some debug option to
enable 8k by changing THREAD_SIZE. This way we have the separate interrupt
stacks and 8k stacks for when someone suspects a stack overflow.
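
A minimal sketch of what such a debug switch might look like follows. CONFIG_DEBUG_8KSTACKS and stack_low() are hypothetical names invented for this illustration and appear in neither the patch nor any kernel tree; STACK_WARN is copied from the context line in the thread_info.h hunk of the patch.

```c
#include <assert.h>

/*
 * CONFIG_DEBUG_8KSTACKS is a hypothetical symbol used only for this
 * sketch. The IRQ stack code would stay unconditional, as in the patch.
 */
#ifdef CONFIG_DEBUG_8KSTACKS
#define THREAD_SIZE (8192)
#else
#define THREAD_SIZE (4096)
#endif

/* Same definition as in the thread_info.h context of the patch. */
#define STACK_WARN (THREAD_SIZE/8)

/* The stack must stay a whole number of 4K pages in either mode. */
static_assert(THREAD_SIZE % 4096 == 0, "THREAD_SIZE must be page sized");

/* Hypothetical helper: has the free stack dropped below the warning level? */
static int stack_low(unsigned long bytes_left)
{
	return bytes_left < STACK_WARN;
}
```

With the hypothetical symbol unset, THREAD_SIZE is 4096 and STACK_WARN is 512; defining it would double both, so a crash that disappears only in that mode would point at a stack overflow.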

2005-12-16 02:57:16

by NeilBrown

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thursday December 15, [email protected] wrote:
> On Fri, Dec 16, 2005 at 01:47:40AM +0100, Adrian Bunk wrote:
>
> > > [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> > > for a long time -- people were reporting overflows there before we
> > > enabled 4K stacks.
> >
> > I remember someone from the XFS maintainers (Nathan?) saying they
> > believe having solved all XFS stack issues.
> >
> > If there are any XFS issues left, do you have a pointer to them?
>
> The last one I saw may actually have been more related
> to the block layer problem. iirc that was a user NFS exporting
> XFS on a raid1 array.

Yeh, I've noticed that nfsd seems to figure often in these. As nfsd
lives on the same (in-kernel) stack as the filesystem and device
drivers, it will add a couple of hundred bytes to the call trace.

A typical nfsd call trace is
nfsd -> svc_process -> nfsd_dispatch -> nfsd3_proc_write ->
nfsd_write -> nfsd_vfs_write -> vfs_writev

(errr. nfsd_vfs_write is inline, large, and called twice, that ain't
good)

These add up to over 300 bytes on the stack.
Looking at each of these, I see that nfsd_write (which includes
nfsd_vfs_write) contributes 0x8c to stack usage itself!!

It turns out this is because it puts a 'struct iattr' on the stack so
it can kill suid if needed. The following patch saves about 50 bytes
off the stack in this call path.

I sometimes wish that gcc could be told to optimise for stack usage -
a lot of variables on the stack are dead at some call points, yet they
stay there using space anyway. The only way to save this space seems
to be to move the code which uses those variables into a separate
function, but we really shouldn't *have* to do these optimisations by
hand!

NeilBrown

Signed-off-by: Neil Brown <[email protected]>

### Diffstat output
./fs/nfsd/vfs.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff ./fs/nfsd/vfs.c~current~ ./fs/nfsd/vfs.c
--- ./fs/nfsd/vfs.c~current~ 2005-12-12 16:00:40.000000000 +1100
+++ ./fs/nfsd/vfs.c 2005-12-16 13:48:31.000000000 +1100
@@ -869,6 +869,16 @@ out:
return err;
}

+static void kill_suid(struct dentry *dentry)
+{
+ struct iattr ia;
+ ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID;
+
+ down(&dentry->d_inode->i_sem);
+ notify_change(dentry, &ia);
+ up(&dentry->d_inode->i_sem);
+}
+
static inline int
nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
loff_t offset, struct kvec *vec, int vlen,
@@ -922,14 +932,8 @@ nfsd_vfs_write(struct svc_rqst *rqstp, s
}

/* clear setuid/setgid flag after write */
- if (err >= 0 && (inode->i_mode & (S_ISUID | S_ISGID))) {
- struct iattr ia;
- ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID;
-
- down(&inode->i_sem);
- notify_change(dentry, &ia);
- up(&inode->i_sem);
- }
+ if (err >= 0 && (inode->i_mode & (S_ISUID | S_ISGID)))
+ kill_suid(dentry);

if (err >= 0 && stable) {
static ino_t last_ino;
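
The technique used by the patch above can be sketched in user space. Everything here is a stand-in: struct big_attr plays the role of struct iattr, apply_change() the role of notify_change(), and the flag values are invented. The noinline attribute (a GCC/Clang extension) is an addition that the patch itself does not use; it is assumed here so the compiler cannot fold the helper's frame back into the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct iattr: a short-lived but sizeable local. */
struct big_attr {
	unsigned int valid;
	char pad[64];
};

#define KILL_SUID 0x1u	/* invented values, for illustration only */
#define KILL_SGID 0x2u

static unsigned int last_valid;	/* records what the helper applied */

/* Stand-in for notify_change(). */
static void apply_change(const struct big_attr *attr)
{
	assert(attr != NULL);
	last_valid = attr->valid;
}

/*
 * The large local lives only in this frame, so it occupies stack just
 * for the duration of this call instead of for the whole write path.
 */
__attribute__((noinline))
static void kill_suid_sketch(void)
{
	struct big_attr ia = { .valid = KILL_SUID | KILL_SGID };

	apply_change(&ia);
}

/* Caller: its own frame no longer contains a struct big_attr. */
static int write_path_sketch(int err, int is_setid)
{
	if (err >= 0 && is_setid)
		kill_suid_sketch();
	return err;	/* deeper calls would follow here in a real write path */
}
```

Whether a given compiler actually shrinks the caller's frame depends on its inlining decisions, which is the complaint about doing these optimisations by hand.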

2005-12-16 03:08:01

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 01:56:58PM +1100, Neil Brown wrote:

> It turns out this is because it puts a 'struct iattr' on the stack so
> it can kill suid if needed. The following patch saves about 50 bytes
> off the stack in this call path.

See! it *was* worth Adrian bringing up the "kill 8kb stacks" patch again :-)

Dave

2005-12-16 05:20:58

by Alex Davis

Subject: Re: [2.6 patch] i386: always use 4k stacks

The problem is that, with laptops, most of the time you DON'T have a choice:
HP and Dell primarily use a Broadcom integrated wireless card in their products.
As of yet, there is no open source driver for Broadcom wireless.

>If 8k stacks get removed, yes. So if you have a chance to choose don't buy a
>wifi card which doesn't have a native linux driver.
>
>Regards,
>ismail

I code, therefore I am


2005-12-16 05:29:16

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 09:20:54PM -0800, Alex Davis wrote:
> The problem is that, with laptops, most of the time you DON'T have a choice:
> HP and Dell primarily use a Broadcom integrated wireless card in their products.
> As of yet, there is no open source driver for Broadcom wireless.

We've already been through all this the previous times this came up.

http://bcm43xx.berlios.de

Whilst it's in early stages, it's making progress.

Dave

2005-12-16 06:16:09

by Alex Davis

Subject: Re: [2.6 patch] i386: always use 4k stacks



--- Dave Jones <[email protected]> wrote:

> On Thu, Dec 15, 2005 at 09:20:54PM -0800, Alex Davis wrote:
> > The problem is that, with laptops, most of the time you DON'T have a choice:
> > HP and Dell primarily use a Broadcom integrated wireless card in their products.
> > As of yet, there is no open source driver for Broadcom wireless.
>
> We've already been through all this the previous times this came up.
>
> http://bcm43xx.berlios.de
>
> Whilst it's in early stages, it's making progress.
>
> Dave
>
>
I understand that, and am grateful for the effort, but the point is it's not ready. Are you
expecting people to lose an important feature of their
laptop until you get the driver ready?


I code, therefore I am


2005-12-16 07:41:23

by Pekka Enberg

Subject: Re: [2.6 patch] i386: always use 4k stacks

Hi Alex,

On 12/16/05, Alex Davis <[email protected]> wrote:
> I understand that, and am grateful for the effort, but the point is it's not ready. Are you
> expecting people to lose an important feature of their
> laptop until you get the driver ready?

Hey, it's the price you pay for using binary only drivers. Why not
complain to Broadcom instead for not releasing the hardware
documentation? Besides, you can still maintain 8 KB stacks as an
out-of-tree patch or fix ndiswrapper to work with 4 KB ones.

Pekka

2005-12-16 07:49:12

by Kyle Moffett

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Dec 16, 2005, at 01:16, Alex Davis wrote:
> [flamewar]

Enough already! These concerns have been raised already, and found
to be insufficient. There are several points:

1) ndiswrapper is broken already, and works sheerly by luck anyways;
NT stacks are 12kb, so you're already asking for stack overflows by
using it.
2) ndiswrapper encourages use of binary drivers instead of the open-
source ones that need the testers, so you're only hurting yourselves
in the long run.
3) All the in-kernel problems have been fixed, and this makes a lot
of stuff less fragmentation-prone and more reliable.

Does anybody have any _in_kernel_ bugreports which are unaddressed,
or maybe something out-of-kernel that is not handled by the above
points?

Cheers,
Kyle Moffett

--
There is no way to make Linux robust with unreliable memory
subsystems, sorry. It would be like trying to make a human more
robust with an unreliable O2 supply. Memory just has to work.
-- Andi Kleen


2005-12-16 08:02:58

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks


> I understand that, and am grateful for the effort, but the point is it's not ready. Are you
> expecting people to lose an important feature of their
> laptop until you get the driver ready?
>
>
> I code, therefore I am

If you code, why don't you go and help the people writing the
Broadcom drivers? How is this ONLY our problem? Linux is a cooperative
thing: you take but you also give back. If you're a coder, this is the
perfect opportunity to give something back and help the bcm43xx guys
with debugging, coding and testing.


2005-12-16 08:10:11

by Matt Domsch

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 09:20:54PM -0800, Alex Davis wrote:
> The problem is that, with laptops, most of the time you DON'T have a choice:
> HP and Dell primarily use a Broadcom integrated wireless card in their products.
> As of yet, there is no open source driver for Broadcom wireless.
>
> >If 8k stacks get removed, yes. So if you have a chance to choose don't buy a
> >wifi card which doesn't have a native linux driver.

Dell "Software & Peripherals" sells "customer kits" of the Intel
ipw2915 for $59 US, so even if you bought the "wrong" wireless NIC
when you bought the laptop, this can be remedied.

--
Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com & http://www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

2005-12-16 08:48:20

by Alex Davis

Subject: Re: [2.6 patch] i386: always use 4k stacks



--- Kyle Moffett <[email protected]> wrote:

> On Dec 16, 2005, at 03:20, Alex Davis wrote:
> > Maybe nobody YOU know cares. I know a few people who do!!
>
> I care a lot about having working wireless on my laptop (a
> PowerBook). The only way that's going to happen is the bcm43xx
> project, which I have supported as much as possible from the very
> beginning. The reluctance of many people to try out the now WORKING
> bcm43xx driver or help out with development before it was working has
> meant that I've had to wait a lot longer than I probably otherwise
> would have. ndiswrapper is NOT an answer, although if you think so,
> you're welcome to fix it to work with 4k stacks (although it's not
> like it really ever "worked" with 8k stacks either, NT has 12k).

> So go f*** off and quit trolling the LKML. If your argument isn't
> reasonable with valid open-source *technical* basis, I'm sure you
> have a wide variety of orifices in which you can shove it, because we
> don't care at all!

So profanity and getting emotional are technical arguments/reasons to
stop stating my opinions??

Hmmm...


> Cheers,
> Kyle Moffett
>
> --
> Premature optimization is the root of all evil in programming
> -- C.A.R. Hoare
>
>
>
>


I code, therefore I am


2005-12-16 09:38:51

by Kyle Moffett

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Dec 16, 2005, at 03:48, Alex Davis wrote:
> So profanity and getting emotional are technical arguments/reasons
> to stop stating my opinions??

No, but given that you've blatantly ignored the [1] previous [2]
several [3] emails [4] containing [5] technical [6] reasons [7] only
to repeat the same multiply-rejected nontechnical issue for the forty-
second time, I figured there was no harm in trying something else.
(BTW: Way to CC a private thread to a public list)

[1] http://lkml.org/lkml/2005/11/15/228
[2] http://lkml.org/lkml/2005/11/15/320
[3] http://lkml.org/lkml/2005/11/15/401
[4] http://lkml.org/lkml/2005/11/16/43
[5] http://lkml.org/lkml/2005/11/16/76
[6] http://lkml.org/lkml/2005/11/16/86
[7] http://lkml.org/lkml/2005/12/16/24

Cheers,
Kyle Moffett

--
I didn't say it would work as a defense, just that they can spin that
out for years in court if it came to it.
-- Rob Landley



2005-12-16 11:06:47

by Bodo Eggert

Subject: Re: [2.6 patch] i386: always use 4k stacks

Kyle Moffett <[email protected]> wrote:

> Enough already! These concerns have been raised already, and found
> to be insufficient. There are several points:
>
> 1) ndiswrapper is broken already, and works sheerly by luck anyways;
> NT stacks are 12kb, so you're already asking for stack overflows by
> using it.
> 2) ndiswrapper encourages use of binary drivers instead of the open-
> source ones that need the testers, so you're only hurting yourselves
> in the long run.

ACK. So where is the driver for the Netgear WG511 Softmac card I'm supposed
to test? I bought this card because it was labeled as being supported, and it
turned out that it wasn't, and just nobody cared to update the list of
supported cards with the warning about the unsupported variant.

> 3) All the in-kernel problems have been fixed, and this makes a lot
> of stuff less fragmentation-prone and more reliable.

BTW: Is there any bug report related to 8K stacks?

--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

2005-12-16 12:18:05

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 01:56:58PM +1100, Neil Brown wrote:
> On Thursday December 15, [email protected] wrote:
> > On Fri, Dec 16, 2005 at 01:47:40AM +0100, Adrian Bunk wrote:
> >
> > > > [*] Plus a few XFS ones, but that's been a lost cause wrt stack usage
> > > > for a long time -- people were reporting overflows there before we
> > > > enabled 4K stacks.
> > >
> > > I remember someone from the XFS maintainers (Nathan?) saying they
> > > believe having solved all XFS stack issues.
> > >
> > > If there are any XFS issues left, do you have a pointer to them?
> >
> > The last one I saw may actually have been more related
> > to the block layer problem. iirc that was a user NFS exporting
> > XFS on a raid1 array.
>
> Yeh, I've noticed that nfsd seems to figure often in these. As nfsd
> lives on the same (in-kernel) stack as the filesystem and device
> drivers, it will add a couple of hundred bytes to the call trace.
>
> A typical nfsd call trace is
> nfsd -> svc_process -> nfsd_dispatch -> nfsd3_proc_write ->
> nfsd_write ->nfsd_vfs_write -> vfs_writev
>
> (errr. nfsd_vfs_write is inline, large, and called twice, that ain't
> good)

The nfsd code uses inline in too many places.

gcc can figure out itself that static functions called only once should
be inline (except currently on i386 due to no-unit-at-a-time, see
below).

> These add up to over 300 bytes on the stack.
> Looking at each of these, I see that nfsd_write (which includes
> nfsd_vfs_write) contributes 0x8c to stack usage itself!!
>
> It turns out this is because it puts a 'struct iattr' on the stack so
> it can kill suid if needed. The following patch saves about 50 bytes
> off the stack in this call path.
>...

This works currently on i386 (and only on i386) because we are using
-fno-unit-at-a-time there.

In the medium term, we want to get rid of no-unit-at-a-time because this
makes the code both bigger and slower, and I'm therefore not a big fan
of this kind of workaround.

If this struct is really a problem (which I doubt considering its
size), I'd prefer it being kmalloc'ed.
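
The trade-off between the two placements can be sketched in user space.
This is only an illustration under stated assumptions: malloc()/free()
stand in for kmalloc()/kfree(), and `struct fake_iattr` and both helper
functions are hypothetical names, not the kernel's struct iattr or
anything from nfsd.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a largish attribute structure
 * (NOT the kernel's struct iattr). */
struct fake_iattr {
    unsigned int valid;
    unsigned int mode;
    long pad[14];
};

/* Variant A: the whole structure lives in this function's stack frame. */
static unsigned int clear_suid_on_stack(unsigned int mode)
{
    struct fake_iattr ia;

    memset(&ia, 0, sizeof(ia));
    ia.valid = 1;
    ia.mode = mode & ~04000u;   /* drop the set-user-ID bit */
    return ia.mode;
}

/* Variant B: the structure is heap-allocated, so the stack frame only
 * holds a pointer. Returns 0 on success, -1 if the allocation fails. */
static int clear_suid_on_heap(unsigned int mode, unsigned int *out)
{
    struct fake_iattr *ia = malloc(sizeof(*ia));

    if (!ia)
        return -1;
    memset(ia, 0, sizeof(*ia));
    ia->valid = 1;
    ia->mode = mode & ~04000u;  /* drop the set-user-ID bit */
    *out = ia->mode;
    free(ia);
    return 0;
}
```

Variant B trades stack bytes for an allocation that costs time and can
fail, which is why it only pays off if the structure's size is really a
problem.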

> NeilBrown
>...

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 12:23:21

by Denis Vlasenko

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday 16 December 2005 13:05, Bodo Eggert wrote:
> Kyle Moffett <[email protected]> wrote:
>
> > Enough already! These concerns have been raised already, and found
> > to be insufficient. There are several points:
> >
> > 1) ndiswrapper is broken already, and works sheerly by luck anyways;
> > NT stacks are 12kb, so you're already asking for stack overflows by
> > using it.
> > 2) ndiswrapper encourages use of binary drivers instead of the open-
> > source ones that need the testers, so you're only hurting yourselves
> > in the long run.
>
> ACK. So where is the driver for the Netgear WG511 Softmac card I'm supposed
> to test? I bought this card because it was labeled as being supported, and it
> turned out that it wasn't, and just nobody cared to update the list of
> supported cards with the warning about the unsupported variant.

We do need more people working on the wireless front.
OTOH, more people bitching about the bad situation on the wireless front
doesn't make it any better.
--
vda

2005-12-16 13:10:03

by Diego Calleja

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, 15 Dec 2005 14:00:13 -0800,
Andrew Morton <[email protected]> wrote:


> Supporting 8k stacks is a small amount of code and nobody has seen a need
> to make changes in there for quite a long time. So there's little cost to
> keeping the existing code.
>
> And the existing code is useful:

Maybe this slightly different approach is better?



Signed-off-by: Diego Calleja <[email protected]>

Index: test/arch/i386/Kconfig.debug
===================================================================
--- test.orig/arch/i386/Kconfig.debug 2005-12-16 13:59:54.000000000 +0100
+++ test/arch/i386/Kconfig.debug 2005-12-16 14:03:27.000000000 +0100
@@ -42,15 +42,16 @@
This results in a large slowdown, but helps to find certain types
of memory corruptions.

-config 4KSTACKS
- bool "Use 4Kb for kernel stacks instead of 8Kb"
+config 8KSTACKS
+ bool "Use 8Kb for kernel stacks instead of 4Kb"
depends on DEBUG_KERNEL
help
- If you say Y here the kernel will use a 4Kb stacksize for the
- kernel stack attached to each process/thread. This facilitates
- running more threads on a system and also reduces the pressure
+ If you say Y here the kernel will use a 8Kb stacksize for the
+ kernel stack attached to each process/thread. This makes harder
+ to overflow the stack, and it's used to debug possible stack
+ overflow problems. Notice that this increases the pressure
on the VM subsystem for higher order allocations. This option
- will also use IRQ stacks to compensate for the reduced stackspace.
+ will also disable IRQ stacks.

config X86_FIND_SMP_CONFIG
bool
Index: test/arch/i386/kernel/irq.c
===================================================================
--- test.orig/arch/i386/kernel/irq.c 2005-12-16 13:59:54.000000000 +0100
+++ test/arch/i386/kernel/irq.c 2005-12-16 14:01:24.000000000 +0100
@@ -33,7 +33,7 @@
}
#endif

-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS
/*
* per-CPU IRQ handling contexts (thread information and stack)
*/
@@ -55,7 +55,7 @@
{
/* high bits used in ret_from_ code */
int irq = regs->orig_eax & 0xff;
-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS
union irq_ctx *curctx, *irqctx;
u32 *isp;
#endif
@@ -76,7 +76,7 @@
}
#endif

-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS

curctx = (union irq_ctx *) current_thread_info();
irqctx = hardirq_ctx[smp_processor_id()];
@@ -112,7 +112,7 @@
return 1;
}

-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS

/*
* These should really be __section__(".bss.page_aligned") as well, but
Index: test/include/asm-i386/irq.h
===================================================================
--- test.orig/include/asm-i386/irq.h 2005-12-16 13:59:54.000000000 +0100
+++ test/include/asm-i386/irq.h 2005-12-16 14:04:05.000000000 +0100
@@ -27,7 +27,7 @@
# define ARCH_HAS_NMI_WATCHDOG /* See include/linux/nmi.h */
#endif

-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS
extern void irq_ctx_init(int cpu);
extern void irq_ctx_exit(int cpu);
# define __ARCH_HAS_DO_SOFTIRQ
Index: test/include/asm-i386/module.h
===================================================================
--- test.orig/include/asm-i386/module.h 2005-12-16 13:59:54.000000000 +0100
+++ test/include/asm-i386/module.h 2005-12-16 14:04:36.000000000 +0100
@@ -64,8 +64,8 @@
#define MODULE_REGPARM ""
#endif

-#ifdef CONFIG_4KSTACKS
-#define MODULE_STACKSIZE "4KSTACKS "
+#ifdef CONFIG_8KSTACKS
+#define MODULE_STACKSIZE "8KSTACKS "
#else
#define MODULE_STACKSIZE ""
#endif
Index: test/include/asm-i386/thread_info.h
===================================================================
--- test.orig/include/asm-i386/thread_info.h 2005-12-16 13:59:54.000000000 +0100
+++ test/include/asm-i386/thread_info.h 2005-12-16 14:04:57.000000000 +0100
@@ -53,7 +53,7 @@
#endif

#define PREEMPT_ACTIVE 0x10000000
-#ifdef CONFIG_4KSTACKS
+#ifndef CONFIG_8KSTACKS
#define THREAD_SIZE (4096)
#else
#define THREAD_SIZE (8192)
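
The effect of the last hunk can be checked in isolation. The following
is a minimal stand-alone sketch, not part of the patch: `thread_size()`
is a hypothetical helper added only for illustration, and
CONFIG_8KSTACKS is deliberately left undefined to model the proposed
default.

```c
/* In a real build CONFIG_8KSTACKS would come from the kernel
 * configuration; here it is intentionally left undefined. */
#ifndef CONFIG_8KSTACKS
#define THREAD_SIZE (4096)
#else
#define THREAD_SIZE (8192)
#endif

/* Hypothetical helper: reports the stack size selected above. */
static unsigned long thread_size(void)
{
    return THREAD_SIZE;
}
```

With the option unset the helper reports 4096; compiling the same file
with -DCONFIG_8KSTACKS selects the 8192 branch instead.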

2005-12-16 14:04:25

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 02:10:02PM +0100, Diego Calleja wrote:
> On Thu, 15 Dec 2005 14:00:13 -0800,
> Andrew Morton <[email protected]> wrote:
>
>
> > Supporting 8k stacks is a small amount of code and nobody has seen a need
> > to make changes in there for quite a long time. So there's little cost to
> > keeping the existing code.
> >
> > And the existing code is useful:
>
> Maybe this slightly different approach is better?
>...

My count of bug reports for problems with 4k stacks after Neil's patch
went into -mm is still at 0.

Either there are no problems left or no one pays attention to them since
disabling 4k stacks "fixed" the problem.

In both cases there's no reason against applying my patch.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 14:12:07

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 05:49:45PM -0800, Zwane Mwaikambo wrote:
> On Thu, 15 Dec 2005, Adrian Bunk wrote:
>
> > On Thu, Dec 15, 2005 at 02:00:13PM -0800, Andrew Morton wrote:
> >
> > > Supporting 8k stacks is a small amount of code and nobody has seen a need
> > > to make changes in there for quite a long time. So there's little cost to
> > > keeping the existing code.
> > >
> > > And the existing code is useful:
> > >
> > > a) people can enable it to confirm that their weird crash was due to a
> > > stack overflow.
> > >
> > > b) If I was going to put together a maximally-stable kernel for a
> > > complex server machine, I'd select 8k stacks. We're still just too
> > > squeezy, and we've had too many relatively-recent overflows, and there
> > > are still some really deep callpaths in there.
> >
> > a1) People turn off 4k stacks and never report the problem / no one
> > really debugs and fixes the reported problem.
> >
> > Me threatening people with enabling 4k stacks for everyone already
> > resulted in several fixes.
>
> How about this, we apply this patch and perhaps add some debug option to
> enable 8k by changing THREAD_SIZE. This way we have the separate interrupt
> stacks and 8k stacks for when someone suspects a stack overflow.

You can always manually change THREAD_SIZE using a text editor.

My count of bug reports for problems with in-kernel code with 4k stacks
after Neil's patch went into -mm is still at 0.

Either there are no problems left or no one pays attention to them since
disabling 4k stacks "fixed" the problem. And not having an option makes
it more likely that we get reports for and people interested in fixes
for the latter.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 14:40:16

by linux-os (Dick Johnson)

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Thu, 15 Dec 2005, Lee Revell wrote:

> On Thu, 2005-12-15 at 14:46 -0700, Jeff V. Merkey wrote:
>> Lee Revell wrote:
>>
>>> On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
>>>
>>>
>>>> When you are on the phone with an irate customer at 2:00 am in the
>>>> morning, and just turning off your broken 4K stack fix
>>>> and getting the customer running matters.
>>>>
>>>>
>>>
>>> Bugzilla link please. Otherwise STFU.
>>>
>>>
>>
>> ??????
>>
>> Jeff
>
> You imply that your customer's problem was due to a kernel bug triggered
> by CONFIG_4KSTACKS. I am asking you to provide a link to the bug report
> or get lost.
>
> Lee

Throughout the past two years of 4k stack-wars, I never heard why
such a small stack was needed (not wanted, needed). It seems that
everybody "knows" that smaller is better and most everybody thinks
that one page in ix86 land is "optimum". However I don't think
anybody ever even tried to analyze what was better from a technical
perspective. Instead it's been analyzed as religious dogma, i.e.,
keep the stack small, it will prevent idiots from doing bad things.

I'm fairly sure that if you started from scratch and decided to
write a new operating system, your choice of a stack-size would
probably be something like 64k. I have no clue why somebody
decided to use a 4k stack and force their choice upon others.
And, yes, I am well aware that each system-call requires a
separate stack upon entry and it even needs to keep that stack
while sleeping.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.

2005-12-16 14:45:16

by Alex Davis

Subject: Re: [2.6 patch] i386: always use 4k stacks



--- Kyle Moffett <[email protected]> wrote:

> (BTW: Way to CC a private thread to a public list)
>
Ahhh!!! So when you get upset and show a less pleasant side
of your personality, you don't want others to see?

Classic!!



I code, therefore I am


2005-12-16 14:49:57

by Xavier Bestel

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 15:39, linux-os (Dick Johnson) wrote:

> Throughout the past two years of 4k stack-wars, I never heard why
> such a small stack was needed (not wanted, needed).

Because after some prolonged uptime, memory can be heavily fragmented.
In this case an order-0 allocation will always succeed (as long as some
memory is free), whereas an order-1 allocation may easily fail.
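
The order arithmetic behind this can be sketched in a few lines of C.
This is a user-space illustration, not kernel code: `PAGE_SIZE_BYTES`
and `stack_alloc_order()` are names made up here, and a 4 KB page size
(as on i386) is assumed. The page allocator hands out blocks of
2^order physically contiguous pages, so a 4 KB stack fits in a single
page (order 0) while an 8 KB stack needs two adjacent free pages
(order 1).

```c
#include <stddef.h>

/* Assumed page size: 4 KB, as on i386. */
#define PAGE_SIZE_BYTES 4096u

/* Hypothetical helper: smallest order such that
 * (PAGE_SIZE_BYTES << order) >= size. */
static unsigned int stack_alloc_order(size_t size)
{
    unsigned int order = 0;
    size_t block = PAGE_SIZE_BYTES;

    while (block < size) {
        block <<= 1;
        order++;
    }
    return order;
}
```

Calling stack_alloc_order(4096) gives order 0 and
stack_alloc_order(8192) gives order 1; on a fragmented machine, only
the second depends on finding two adjacent free pages.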

Xav


2005-12-16 15:01:04

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 09:39:56AM -0500, linux-os (Dick Johnson) wrote:
>
> Throughout the past two years of 4k stack-wars, I never heard why
> such a small stack was needed (not wanted, needed). It seems that
> everybody "knows" that smaller is better and most everybody thinks
> that one page in ix86 land is "optimum". However I don't think
> anybody ever even tried to analyze what was better from a technical
> perspective. Instead it's been analyzed as religious dogma, i.e.,
> keep the stack small, it will prevent idiots from doing bad things.
>...

It seems you missed the discussion of this issue last month.

Arjan had a good list of all technical advantages of 4k stacks:
http://www.ussg.iu.edu/hypermail/linux/kernel/0511.2/0042.html

> Cheers,
> Dick Johnson

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 15:11:04

by Kyle Moffett

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Dec 16, 2005, at 09:45, Alex Davis wrote:
> --- Kyle Moffett <[email protected]> wrote:
>> (BTW: Way to CC a private thread to a public list)
>>
> Ahhh!!! So when you get upset and show a less pleasant side of
> your personality, you don't want others to see?

Personally, I don't care much, however it does reflect rather poorly
on _your_ netiquette. Might I remind you that _you_ were the one who
made the thread private in the first place? It is generally
considered poor form to reply privately to someone (indicating that
you want to continue off-list) and then, as soon as they continue the
discussion off-list, to bring it back on-list, whining
about how they called you names in a private thread.

I've had enough of this nonsense, and I'm just beginning to realize
that I've been feeding the troll and lowering the S/N ratio. PLONK!

On a slightly nicer note, I would like to formally apologize to the
list for the noise that has resulted, and will attempt to be more
reserved with my replies in the future.

Cheers,
Kyle Moffett

--
Unix was not designed to stop people from doing stupid things,
because that would also stop them from doing clever things.
-- Doug Gwyn


2005-12-16 15:35:00

by Diego Calleja

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 16 Dec 2005 15:04:25 +0100,
Adrian Bunk <[email protected]> wrote:

> My count of bug reports for problems with 4k stacks after Neil's patch
> went into -mm is still at 0.
>
> Either there are no problems left or no one pays attention to them since
> disabling 4k stacks "fixed" the problem.
>
> In both cases there's no reason against applying my patch.

I know, but there's too much resistance to the "pure" 4kb patch. The
8 KB patch does the same thing (enables 4kb stacks) and at the same
time the 8kb groupies can't flamewar you for it, it covers akpm's
concerns, it puts some pressure on the ndiswrapper guys and leaves
time for the broadcom driver developers to finish, merge and push
to the distributions their driver. The 8kb config option can be
removed in the future when we're sure that it's 100% safe (Neil
Brown's patch isn't a good sign). It makes everyone happy IMO ;)

2005-12-16 15:35:42

by Matthew D. Reuther

Subject: Re: [2.6 patch] i386: always use 4k stacks

Isn't this patch specifically for the -mm kernel and not mainline? How can
anyone using a binary driver provide any feedback on an -mm kernel? The
binary driver taints the kernel, so bug reports are useless.

Furthermore, if you aren't interested in debugging the kernel, why would you
run the -mm tree or why can't you hack/patch the 8k stacks back in?

2005-12-16 15:49:24

by Kyle Moffett

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Dec 16, 2005, at 10:35, Diego Calleja wrote:
> I know, but there's too much resistance to the "pure" 4kb patch.

I have yet to see any resistance to the 4Kb patch this time around
that was not "*whine* don't break my ndiswrapper plz". There are
extremely few remaining 4kb stack bugs, and 4k stacks have a sizeable
list of technical advantages.


> The 8 KB patch does the same thing (enables 4kb stacks)

The point is to force it in -mm so most people can't just disable it
because it fixes their problem. We want 8k stacks to go away, and
the only way to get out the last issues before sending to mainline is
by forcing it in -mm.


> and at the same time the 8kb groupies can't flamewar you for it

This matters how?


> it covers akpm's concerns

I get the impression that they are already sufficiently addressed,
although Andrew should feel free to correct me if I'm wrong :-D.


> it puts some pressure on the ndiswrapper guys

And removing 8kb altogether doesn't do this?


> and leaves time for the broadcom driver developers to finish, merge
> and push to the distributions their driver

It's working partially now. This is the time when we should really
try to force ndiswrapper junkies over to the driver to get it tested
and bugfixed for inclusion.


> The 8kb config option can be removed in the future when we're sure
> that it's 100% safe (Neil Brown's patch isn't a good sign). It
> makes everyone happy IMO ;)

We're not trying to get it removed from mainline (yet), just from
-mm, which has never been anywhere close to 100% safe _anyways_. If
it breaks lots of things horribly, it will get reverted and
development will continue as normal, but that's probably not going to
be the case.

Cheers,
Kyle Moffett

--
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
-- Brian Kernighan


2005-12-16 15:58:24

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 04:35:03PM +0100, Diego Calleja wrote:
> On Fri, 16 Dec 2005 15:04:25 +0100,
> Adrian Bunk <[email protected]> wrote:
>
> > My count of bug reports for problems with 4k stacks after Neil's patch
> > went into -mm is still at 0.
> >
> > Either there are no problems left or no one pays attention to them since
> > disabling 4k stacks "fixed" the problem.
> >
> > In both cases there's no reason against applying my patch.
>
> I know, but there's too much resistance to the "pure" 4kb patch. The
> 8 KB patch does the same thing (enables 4kb stacks) and at the same
> time the 8kb groupies can't flamewar you for it, it covers akpm's

I have no problems with people flaming me.

I would have a problem if people actually found technical reasons why my
patch breaks in-kernel code. ;-)

> concerns, it puts some pressure on the ndiswrapper guys and leaves
> time for the broadcom driver developers to finish, merge and push
> to the distributions their driver. The 8kb config option can be
> removed in the future when we're sure that it's 100% safe (Neil
> Brown's patch isn't a good sign). It makes everyone happy IMO ;)

Neil's patch fixes the last known problems.

My count of bug reports for problems with 4k stacks after Neil's patch
went into -mm is still at 0.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 16:23:50

by Michael Buesch

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday 16 December 2005 16:49, you wrote:
> > and leaves time for the broadcom driver developers to finish, merge
> > and push to the distributions their driver
>
> It's working partially now. This is the time when we should really
> try to force ndiswrapper junkies over to the driver to get it tested
> and bugfixed for inclusion.

Partially means:
- Connections on CCK Rates (802.11b) are stable,
if the signal quality is good (you are close to the AP).
- No encryption support, yet (I am working on it, but it's
a bit difficult)

Now, I want to test bcm43xx on 4k stacks. But I only have a
ppc32 machine with such a Broadcom card. ppc32 has 8k stacks.
How am I supposed to test the driver for 4kstack conformance?
Given this, why aren't there people working on 4kstacks for
ppc32? Is it not needed there, or did simply nobody care to
do this now?
Thanks for your nonflaming suggestions. ;)

--
Greetings Michael.



2005-12-16 16:46:46

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 14:10 +0100, Diego Calleja wrote:
> On Thu, 15 Dec 2005 14:00:13 -0800,
> Andrew Morton <[email protected]> wrote:
>
>
> > Supporting 8k stacks is a small amount of code and nobody has seen a need
> > to make changes in there for quite a long time. So there's little cost to
> > keeping the existing code.
> >
> > And the existing code is useful:
>
> Maybe this slightly different approach is better?
>
>
>
> Signed-off-by: Diego Calleja <[email protected]>


I like this one; it makes the default 4K while leaving the 8K option for
those who really want it...

Acked-by: Arjan van de Ven <[email protected]>

2005-12-16 18:08:35

by Dave Jones

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 12:05:18PM +0100, Bodo Eggert wrote:

> ACK. So where is the driver for the Netgear WG511 Softmac card I'm supposed
> to test? I bought this card because it was labeled as being supported, and it
> turned out that it wasn't, and just nobody cared to update the list of
> supported cards with the warning about the unsupported variant.

There are two models of that card with the same name.
The one made in Taiwan is a prism54, the one made in China is
something else. I guess yours is made in China?

Dave

2005-12-16 18:22:43

by Giridhar Pemmasani

Subject: Re: [2.6 patch] i386: always use 4k stacks

Kyle Moffett wrote:

> I have yet to see any resistance to the 4Kb patch this time around
> that was not "*whine* don't break my ndiswrapper plz". There are

I haven't seen anyone demanding that others not have 4k stacks; only requests
to leave the 4k/8k stack option as it is. If _you_ want to have 4k stacks, you
already have that option. You are only pushing what you want on others and
bad-mouthing people who are requesting the option to have either 4k or 8k
stacks.

> It's working partially now. This is the time when we should really

ndiswrapper is used not just for Broadcom. There are plenty of other
chipsets that don't even have a project started to write an open source
driver.

> try to force ndiswrapper junkies over to the driver to get it tested
^^^^^^^
Shame on you. Your last mail was a promise to be "more reserved". Even
otherwise, such profanities against a group of people are unwarranted.

To kernel developers: I asked earlier whether it is possible to create
threads with different stack sizes (e.g., 4k/8k/16k etc.) at run time, à la
FreeBSD. In that case, one could choose whatever fits their
needs. Any comments on this idea?

To those who depend on ndiswrapper to have wireless in Linux: A few days
ago I started working on an NDIS implementation in user space. However, it
will take considerable time before this is usable. Moreover, I only have
hope for USB drivers. PCI/mini-PCI/PCMCIA drivers need to run interrupt
service routines, which can't be run in user space, so they won't work in
user space.

Giri

2005-12-16 18:37:04

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 13:18 -0500, Giridhar Pemmasani wrote:
> Kyle Moffett wrote:
>
> > I have yet to see any resistance to the 4Kb patch this time around
> > that was not "*whine* don't break my ndiswrapper plz". There are
>
> I haven't seen anyone demanding others not to have 4k stacks; only requests
> to leave 4k/8k stack option as it is.

In hindsight, making this a config option was a mistake. Why? Because
we're not making every single patch we add to the kernel a config
option, nor should we. Config options for drivers or expensive debug
options are fine; debug options for random patches... aren't really. To
be fair, the config option was intended to be really temporary, like 1
kernel release, until it was sure there were no kinks. Oh well, there are
too many people moaning now about ndiswrapper, so I fear we'll never
get rid of it.

And no, I do not think a kernel with 9000 config options is still useful;
not every single trivial thing should be a config option.


2005-12-16 18:45:32

by Horst H. von Brand

Subject: Re: [2.6 patch] i386: always use 4k stacks

linux-os \(Dick Johnson\) <[email protected]> wrote:

[...]

> Throughout the past two years of 4k stack-wars, I never heard why
> such a small stack was needed (not wanted, needed). It seems that
> everybody "knows" that smaller is better and most everybody thinks
> that one page in ix86 land is "optimum". However I don't think
> anybody ever even tried to analyze what was better from a technical
> perspective. Instead it's been analyzed as religious dogma, i.e.,
> keep the stack small, it will prevent idiots from doing bad things.

OK, so here goes again...

The kernel stack has to be contiguous in /physical/ memory. Keep the stack
/one/ page, that way you can always get a new stack when needed (== each
fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
find (or create) a multi-page free area, and (fragmentation being what it
is, and Linux routinely running for months at a time) you are in a whole
new world of pain.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
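
The order argument above can be made concrete with a minimal sketch. This is plain C, not kernel code: the helper names stack_alloc_order() and contiguous_pages() and the 4096-byte page size are assumptions chosen for illustration. It only shows that a 4k stack is a single-page (order-0) request while an 8k stack needs two physically contiguous pages (order 1).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed i386 page size; illustrative constant, not taken from kernel headers. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical helper: smallest order such that
 * (SKETCH_PAGE_SIZE << order) >= size, i.e. how many times a single
 * page has to be doubled before the stack fits.
 */
static unsigned int stack_alloc_order(size_t size)
{
    unsigned int order = 0;
    size_t span = SKETCH_PAGE_SIZE;

    assert(size > 0);
    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Number of physically contiguous pages a request of that order needs. */
static unsigned int contiguous_pages(unsigned int order)
{
    return 1u << order;
}
```

With these definitions, stack_alloc_order(4096) is 0 and stack_alloc_order(8192) is 1, so every fork(2) or clone(2) with 8k stacks has to find two adjacent free pages rather than any one free page.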

2005-12-16 18:53:15

by Brian Gerst

Subject: Re: [2.6 patch] i386: always use 4k stacks

Horst von Brand wrote:
> linux-os \(Dick Johnson\) <[email protected]> wrote:
>
> [...]
>
>
>>Throughout the past two years of 4k stack-wars, I never heard why
>>such a small stack was needed (not wanted, needed). It seems that
>>everybody "knows" that smaller is better and most everybody thinks
>>that one page in ix86 land is "optimum". However I don't think
>>anybody ever even tried to analyze what was better from a technical
>>perspective. Instead it's been analyzed as religious dogma, i.e.,
>>keep the stack small, it will prevent idiots from doing bad things.
>
>
> OK, so here goes again...
>
> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
> /one/ page, that way you can always get a new stack when needed (== each
> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
> find (or create) a multi-page free area, and (fragmentation being what it
> is, and Linux routinely running for months at a time) you are in a whole
> new world of pain.

So what about arches where single-page stacks aren't viable (for example
x86_64)? Are we just screwed?

--
Brian Gerst

2005-12-16 18:52:59

by Oliver Neukum

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday, 16 December 2005 19:42, Horst von Brand wrote:
> linux-os \(Dick Johnson\) <[email protected]> wrote:
>
> [...]
>
> > Throughout the past two years of 4k stack-wars, I never heard why
> > such a small stack was needed (not wanted, needed). It seems that
> > everybody "knows" that smaller is better and most everybody thinks
> > that one page in ix86 land is "optimum". However I don't think
> > anybody ever even tried to analyze what was better from a technical
> > perspective. Instead it's been analyzed as religious dogma, i.e.,
> > keep the stack small, it will prevent idiots from doing bad things.
>
> OK, so here goes again...
>
> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
> /one/ page, that way you can always get a new stack when needed (== each
> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
> find (or create) a multi-page free area, and (fragmentation being what it
> is, and Linux routinely running for months at a time) you are in a whole
> new world of pain.

How about ignoring physical pages and going to virtual, say, 16K pages?
After all, 4K is 15 years old. Disks and RAM have grown enormously.

Regards
Oliver

2005-12-16 18:56:21

by Steven Rostedt

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 15:42 -0300, Horst von Brand wrote:
> linux-os \(Dick Johnson\) <[email protected]> wrote:
>
> [...]
>
> > Throughout the past two years of 4k stack-wars, I never heard why
> > such a small stack was needed (not wanted, needed). It seems that
> > everybody "knows" that smaller is better and most everybody thinks
> > that one page in ix86 land is "optimum". However I don't think
> > anybody ever even tried to analyze what was better from a technical
> > perspective. Instead it's been analyzed as religious dogma, i.e.,
> > keep the stack small, it will prevent idiots from doing bad things.
>
> OK, so here goes again...
>
> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
> /one/ page, that way you can always get a new stack when needed (== each
> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
> find (or create) a multi-page free area, and (fragmentation being what it
> is, and Linux routinely running for months at a time) you are in a whole
> new world of pain.

So people should really be asking for a PAGE_SIZE = 8k option ;)

Sorry,

-- Steve


2005-12-16 19:02:14

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks


>
> So what about arches where single-page stacks aren't viable (for example
> x86_64)? Are we just screwed?


x86 is especially handicapped because the stacks need to be
in the lowmem zone. Even if you have 8GB of RAM, the lowmem zone is still
800MB and a bit, and it comes under high pressure, such as
hyper-fragmentation. The same goes for bounce buffers etc.

Note that the order thing is by far not the only advantage; pure memory
usage and cache locality are wins too. The memory usage for kernel
stacks halves, after all (which means you can run more threads in
Java, or use the memory for disk cache ;)
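
The memory-usage point is simple arithmetic, sketched here under stated assumptions: total_stack_bytes() is a hypothetical helper, the thread count is whatever the caller picks, and the per-CPU IRQ stacks that come with the 4k configuration are ignored.

```c
#include <assert.h>

/*
 * Hypothetical helper: total bytes of kernel stack for nr_threads
 * threads, one stack of stack_size bytes each. Only the two sizes
 * discussed in this thread are accepted.
 */
static unsigned long long total_stack_bytes(unsigned long long nr_threads,
                                            unsigned long long stack_size)
{
    assert(stack_size == 4096ULL || stack_size == 8192ULL);
    return nr_threads * stack_size;
}
```

For an arbitrary figure of 10,000 threads this gives 81,920,000 bytes (about 78 MiB) with 8k stacks and 40,960,000 bytes (about 39 MiB) with 4k stacks, all of which has to come out of lowmem on i386.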

2005-12-16 19:06:35

by Horst H. von Brand

Subject: Re: [2.6 patch] i386: always use 4k stacks

Adrian Bunk <[email protected]> wrote:
> On Fri, Dec 16, 2005 at 01:56:58PM +1100, Neil Brown wrote:

[...]

> gcc can figure out itself that static functions called only once should
> be inline (except currently on i386 due to no-unit-at-a-time, see
> below).
>
> > These add up to over 300 bytes on the stack.
> > Looking at each of these, I see that nfsd_write (which includes
> > nfsd_vfs_write) contributes 0x8c to stack usage itself!!
> >
> > It turns out this is because it puts a 'struct iattr' on the stack so
> > it can kill suid if needed. The following patch saves about 50 bytes
> > off the stack in this call path.
> >...

And what if you set up a compound literal for the task? It is just used to
shove data into the called function.

My short test case (attached) has a smaller stack with the compound
literal (gcc-4.1, Fedora rawhide on i686), and IMHO it is clearer what is
going on here.
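
For readers unfamiliar with the construct, here is a minimal sketch of the two styles. It is not the attached test case: struct attr_change, apply_change() and the SKETCH_* constants are hypothetical stand-ins, and whether the compound literal actually shrinks the stack frame depends on the compiler and its version.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_KILL_SUID 0x1u    /* illustrative flag, not a kernel constant */
#define SKETCH_S_ISUID   04000u  /* set-user-ID bit of a file mode */

/* Hypothetical stand-in for a struct that is only passed to a callee. */
struct attr_change {
    unsigned int valid;  /* bitmask of requested changes */
    unsigned int mode;   /* unused in this sketch */
};

/* Hypothetical callee: reads the request and returns the new mode. */
static unsigned int apply_change(unsigned int mode, const struct attr_change *ac)
{
    assert(ac != NULL);
    if (ac->valid & SKETCH_KILL_SUID)
        mode &= ~SKETCH_S_ISUID;
    return mode;
}

/* Named local: the struct is declared up front and filled in by hand. */
static unsigned int kill_suid_named(unsigned int mode)
{
    struct attr_change ac;

    ac.valid = SKETCH_KILL_SUID;
    ac.mode = 0;
    return apply_change(mode, &ac);
}

/* C99 compound literal: the struct is built at the call site. */
static unsigned int kill_suid_literal(unsigned int mode)
{
    return apply_change(mode, &(struct attr_change){ .valid = SKETCH_KILL_SUID });
}
```

Both functions behave identically; the second only makes it explicit that the struct exists to carry arguments into the call.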

> This works currently on i386 (and only on i386) because we are using
> -fno-unit-at-a-time there.
>
> In the medium-term, we want to get rid of no-unit-at-a-time because this
> makes the code both bigger and slower, and I'm therefore not a big fan
> of this kind of workarounds.
>
> If this struct is really a problem (which I doubt considering its
> size), I'd prefer it being kmalloc'ed.

Nodz.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2005-12-16 19:14:20

by Oliver Neukum

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday, 16 December 2005 20:02, Arjan van de Ven wrote:
>
> >
> > So what about arches where single-page stacks aren't viable (for example
> > x86_64)? Are we just screwed?
>
>
> x86 is specially handicapped due to the fact that the stacks need to be
> in the lowmem zone. Even if you have 8Gb ram, the lowmem zone is still
> 800Mb and a bit, and this gets to be under a high pressure, like
> hyper-fragmentation. Same for bounce buffers etc etc.
>
> note that the order thing is by far not the only advantage, pure memory
> usage alone and cache locality also are wins. The memory usage halves
> for kernel stacks after all (which means you can do more threads in
> java, or use the memory for disk cache ;)

1. Cache usage depends on actual stack usage. How much you allocate
doesn't matter.
2. You are surely getting a cache effect by using interrupt stacks. Which
is larger?
3. When you use kmalloc instead of the stack, you are reducing locality.

Regards
Oliver

2005-12-16 19:23:39

by Lee Revell

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 12:05 +0100, Bodo Eggert wrote:
> Kyle Moffett <[email protected]> wrote:
>
> > Enough already! These concerns have been raised already, and found
> > to be insufficient. There are several points:
> >
> > 1) ndiswrapper is broken already, and works sheerly by luck anyways;
> > NT stacks are 12kb, so you're already asking for stack overflows by
> > using it.
> > 2) ndiswrapper encourages use of binary drivers instead of the open-
> > source ones that need the testers, so you're only hurting yourselves
> > in the long run.
>
> ACK. So where is the driver for the Netgear WG511 Softmac card I'm supposed
> to test? I bought this card because it was labeled as being supported, and it
> turned out that it wasn't, and just nobody cared to update the list of
> supported cards with the warning about the unsupported variant.

Um, this is not the developers' fault. Do you think the vendors call the
driver developers to tell them "hey, we just released a new product,
with a name confusingly similar to the one your driver supports, but we
changed the chipset a tiny bit so it won't work with your driver"?
Dream on.

Driver developers are not psychic. If no USER reported that the new
FooBar1002X is completely different from the FooBar1002, there's no way
for us to know. Sorry you were unfortunate enough to be the first user
to learn the hard way. Complain to the vendor, not LKML.

Lee

2005-12-16 19:26:20

by linux-os (Dick Johnson)

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Fri, 16 Dec 2005, Horst von Brand wrote:

> linux-os \(Dick Johnson\) <[email protected]> wrote:
>
> [...]
>
>> Throughout the past two years of 4k stack-wars, I never heard why
>> such a small stack was needed (not wanted, needed). It seems that
>> everybody "knows" that smaller is better and most everybody thinks
>> that one page in ix86 land is "optimum". However I don't think
>> anybody ever even tried to analyze what was better from a technical
>> perspective. Instead it's been analyzed as religious dogma, i.e.,
>> keep the stack small, it will prevent idiots from doing bad things.
>
> OK, so here goes again...
>
> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
> /one/ page, that way you can always get a new stack when needed (== each
> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
> find (or create) a multi-page free area, and (fragmentation being what it
> is, and Linux routinely running for months at a time) you are in a whole
> new world of pain.

The interrupt stack needs to be non-paged. Are you sure the user-stacks
need to be 'physical', non-paged too? If so, that's probably the
problem. All addresses accessed by the CPUs in the kernel are virtual
which means one needs some mapping anyway.

So, why can't one map non-contiguous pages into the kernel-user-stack?
That entry-into-the-kernel stack is the one that's giving everybody
fits because it needs to remain allocated, even for sleeping tasks.

If it was virtual, built just like other data, it could be made up
from any available RAM and, in the case of a preemptive kernel,
even swapped!

In that case, the total amount of real RAM you actually need
is defined only by the number of concurrent tasks in a preemptive
kernel, plus the page(s) for the interrupt stack. That's far lower
than the 'N' pages times everybody who forked and slept, which is
what we seem to have now.

FYI, there is nothing wrong with a 2-level stack, i.e., the
system-call occurs on the interrupt stack, then the user-kernel
stack gets allocated from paged RAM.

Now I know this isn't VMS, but in VMS we didn't have anything
that really needed to be contiguous except for the interrupt
stack (which was arbitrarily 64k). And on VAXen the pages were
tiny 512-byte things, so you really needed to use paged RAM for
just about everything.

> --
> Dr. Horst H. von Brand User #22616 counter.li.org
> Departamento de Informatica Fono: +56 32 654431
> Universidad Tecnica Federico Santa Maria +56 32 654239
> Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.
.

****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.

Thank you.

2005-12-16 19:32:15

by Mike Snitzer

Subject: Re: [2.6 patch] i386: always use 4k stacks

On 12/16/05, Arjan van de Ven <[email protected]> wrote:
> On Fri, 2005-12-16 at 13:18 -0500, Giridhar Pemmasani wrote:
> > Kyle Moffett wrote:
> >
> > > I have yet to see any resistance to the 4Kb patch this time around
> > > that was not "*whine* don't break my ndiswrapper plz". There are
> >
> > I haven't seen anyone demanding others not to have 4k stacks; only requests
> > to leave 4k/8k stack option as it is.
>
> in hindsight making this a config option was a mistake. Why? Because
> we're not making every single patch we add to the kernel a config
> option, nor should it be. Config options for drivers or expensive debug
> options are fine, debug options for random patches... aren't really. To
> be fair the config option was intended to be really temporary, like 1
> kernel release, until it was sure there were no kinks. Oh well, there's
> too many people moaning now about ndiswrapper that I fear we'll never
> get rid of it.
>
> And no I do not think a kernel with 9000 config options is still useful;
> not every single trivial thing should be a config option.

You're using overly generalized assertions to try to convince others
that the configurability of a particularly important (to some, albeit
not you) config option is unnecessary. 4K vs 8K is hardly a "trivial"
configuration option of the Linux kernel. At this point in time it
has not been sufficiently demonstrated that 4K "just works".

Taking a step back, I'm all for -mm being a 4K only tree to force the
issue; but even once all in-tree code is deemed 4K clean people still
may want to be extremely cautious by enabling 8K stacks (possibly
_with_ IRQ stacks). It's merely a question of can/will Linux (or some
vendor) provide this level of stack overflow safety as is; or does one
need to patch the kernel to get the desired safety? IF upstream
kernel.org doesn't even provide the knobs to ensure safety at all
costs (and vendors like Redhat have people at the helm who are
advocating 4K stacks in the "Enterprise" Linux kernel configurations
of the world) how does one get a Linux kernel that provides a sizable
safety net that is _SUPPORTED_ for true enterprise-grade applications?

Simply put 4K vs 8K is not as trivial a decision as you'd have people believe.

Mike

2005-12-16 19:46:27

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks


> You're using overly generalized assertions to try to convince others

I'm no longer interested in arguing for removing this thing. Too much
whining by ndiswrapper addicts [1].

> that the configurability of a particularly important (to some, albeit
> not you) config option is unnecessary. 4K vs 8K is hardly a "trivial"
> configuration option of the Linux kernel.

Compared to many of the other changes that went in since 2.6.0? 4K
stacks is a minor change. There's no config option for 4 level
pagetables for example, and that's a far more invasive change in many
ways.

> At this point in time it
> has not been sufficiently demonstrated that 4K "just works".

Excuse me?
Fedora released 3 distributions with it enabled, and Red Hat uses it in
an enterprise distribution. That's a whopping lot of users right there
with a very wide range of workloads.

> kernel to get the desired safety? IF upstream
> kernel.org doesn't even provide the knobs to ensure safety at all
> costs (and vendors like Redhat have people at the helm who are
> advocating 4K stacks in the "Enterprise" Linux kernel configurations
> of the world) how does one get a Linux kernel that provides a sizable
> safety net that is _SUPPORTED_ for true enterprise-grade applications?

eh I don't know if you paid attention, but Red Hat Enterprise Linux 4
only has 4Kb stacks kernels... so that covers your supported true
enterprise-grade application thing.

> Simply put 4K vs 8K is not as trivial a decision as you'd have people believe.

That's too simply put in fact, especially if you look at it
historically. It's a bit ironic that part of the reason 4K stacks were
developed was that the 2.4 kernels occasionally ran out of stack space for
customers (as an example, look at lkml this week; there was a report
about such an overflow there as well). Remember that 4K+4K has more
stack space than the 4K+2K that 2.4 kernels have. Sure, 2.6 bumped this to
roughly 5.5k/2.5k in the "8K" case, but fundamentally the change to
4k/4k isn't all that big even inside 2.6.

You can go on and keep painting this as a cowboy development, but it
really isn't....



[1] Yes addicts; binary drivers are in many ways similar to heroin;
they're really hard to get rid of for example and highly addictive, they
also cause some people to act like junkies-in-withdrawal when their
binary driver breaks, or when someone suggests breaking it.

2005-12-16 20:05:13

by Alan

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 17:23 +0100, Michael Buesch wrote:
> Now, I want to test bcm43xx on 4k stacks. But only have a
> ppc32 machine with such a broadcom card. ppc32 has 8k stacks.
> How am I supposed to test the driver for 4kstack conformance?

Unless you've been writing fairly careless code that puts a lot of objects
on the stack, a driver is going to work fine with 4K stacks.

> Given this, why aren't there people working on 4kstacks for
> ppc32? Is it not needed there, or did simply nobody care to
> do this now?

AFAIK nobody is working on 4K stacks for PPC32. I've no idea myself if it
is needed or useful there. In terms of debugging, if your code exceeds a
4K stack you'll find out quite rapidly from x86 users. One thing the
separate IRQ stacks mean is that stack overflows generally show up
consistently, as overflows, rather than as weird crashes when your heavy
stack usage happens to coincide with heavy IRQ stack usage, at
which point the mess is rarely repeatable or debuggable.
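
As an illustration of what "a lot of objects on the stack" looks like in practice, here is a minimal userspace sketch. The function names and the 2048-byte buffer size are assumptions made for this example, and malloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUF_SIZE 2048u  /* illustrative: half of a 4K stack */

/* Careless pattern: one local buffer eats 2 KiB of the stack. */
static size_t checksum_on_stack(const unsigned char *src, size_t len)
{
    unsigned char buf[SKETCH_BUF_SIZE];
    size_t i, sum = 0;

    assert(len <= sizeof(buf));
    memcpy(buf, src, len);
    for (i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/*
 * Stack-friendly pattern: the buffer comes from the heap, so the frame
 * holds only a pointer and a few scalars. *ok is set to 0 if the
 * allocation fails, in which case the return value is 0.
 */
static size_t checksum_on_heap(const unsigned char *src, size_t len, int *ok)
{
    unsigned char *buf;
    size_t i, sum = 0;

    assert(len <= SKETCH_BUF_SIZE);
    buf = malloc(SKETCH_BUF_SIZE);
    *ok = (buf != NULL);
    if (buf == NULL)
        return 0;
    memcpy(buf, src, len);
    for (i = 0; i < len; i++)
        sum += buf[i];
    free(buf);
    return sum;
}
```

The two functions compute the same result; the difference is only where the 2 KiB live, and the heap version has to handle allocation failure.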

2005-12-16 20:09:04

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 20:02 +0000, Alan Cox wrote:
> On Fri, 2005-12-16 at 17:23 +0100, Michael Buesch wrote:
> > Now, I want to test bcm43xx on 4k stacks. But only have a
> > ppc32 machine with such a broadcom card. ppc32 has 8k stacks.
> > How am I supposed to test the driver for 4kstack conformance?
>
> Unless you've been writing fairly careless code putting a lot of objects
> on stack a driver is going to work fine with 4K stacks.

There is also "make checkstack", which works on many architectures and
lists offenders. If you're clean there, it's very likely you're very
ok ;)

2005-12-16 21:13:23

by David Lang

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 16 Dec 2005, linux-os (Dick Johnson) wrote:

> On Fri, 16 Dec 2005, Horst von Brand wrote:
>
>> linux-os \(Dick Johnson\) <[email protected]> wrote:
>>
>> [...]
>>
>>> Throughout the past two years of 4k stack-wars, I never heard why
>>> such a small stack was needed (not wanted, needed). It seems that
>>> everybody "knows" that smaller is better and most everybody thinks
>>> that one page in ix86 land is "optimum". However I don't think
>>> anybody ever even tried to analyze what was better from a technical
>>> perspective. Instead it's been analyzed as religious dogma, i.e.,
>>> keep the stack small, it will prevent idiots from doing bad things.
>>
>> OK, so here goes again...
>>
>> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
>> /one/ page, that way you can always get a new stack when needed (== each
>> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
>> find (or create) a multi-page free area, and (fragmentation being what it
>> is, and Linux routinely running for months at a time) you are in a whole
>> new world of pain.
>
> The interrupt stack needs to be non-paged. Are you sure the user-stacks
> need to be 'physical', non-paged too? If so, that's probably the
> problem. All addresses accessed by the CPUs in the kernel are virtual
> which means one needs some mapping anyway.

Actually, the kernel always uses real addresses; userspace uses virtual
addresses.

This came up recently with the page tables. Linus said that he was
absolutely opposed to adding the complication and overhead of changing the
kernel to use virtual addresses instead of real addresses for its data
structures. It would add an extra level of redirection to just about every
memory access (which also means an additional load on the cache to store
the mapping info to resolve this redirection). The performance hit for
this would be considerable.

David Lang

2005-12-16 21:28:17

by Mike Snitzer

Subject: Re: [2.6 patch] i386: always use 4k stacks

Arjan,

I'm aware that RedHat has been shipping FC and RHEL kernels with 4K
stacks for some time. The recent 4K stack fixes pretty much establish
that RedHat's early adoption of 4K stacks was a "cowboy development"
no? I don't think RedHat kept similar 4K stack fixes from upstream,
so a bit of luck maybe? I do agree that at this point the 4K-only
proposal is NOT an overly rash decision given continued adequate
vetting in -mm. But even then there may be untested workloads that
expose stack issues. Which is perfectly fine if users have the option
to use _supported_ alternate configs.

I got the sense that you were trying to paint me as something of a
closet "ndiswrapper addict". Amusing and all but my motivations for
requesting continued options on the stack size are rooted in concerns
that long call chains can and still do result when running kernel.org
and RHEL4 kernels under particular workloads. An example workload
being cluster filesystems that in a single call chain historically
_could_ leverage iptables + RPC (tcp) + DM (LVM) + etc.

Given Neil Brown's fix for the block layer these stack-heavy workloads
that included DM in the call chain need to be revisited. However, the
savings associated with those particular fixes still may not leave
sufficient breathing room. The logic that all users must NOW provide
workloads which undermine 4K stack viability otherwise the 8K option
will be completely removed _seems_ quite irrational (even though we
are _supposedly_ just talking about doing so in -mm).

All of us appreciate the desire to have Linux be more efficient and 4K
stacks will get us that. If it comes with the cost of instability
under more exotic workloads then the bad outweighs the perceived good
of imposed 4K stacks. With RHEL4 it would seem we're past the point
of no-return for supported 8K stacks. I'm merely advocating upstream
give users the 8K+IRQ stack _options_ and set the default to 4K.

Mike


On 12/16/05, Arjan van de Ven <[email protected]> wrote:
>
> > You're using overly generalized assertions to try to convince others
>
> I'm no longer interested in arguing removing this thing. Too much
> whining by ndiswrapper addicts [1].
>
> > that the configurability of a particularly important (to some, albeit
> > not you) config option is unnecessary. 4K vs 8K is hardly a "trivial"
> > configuration option of the Linux kernel.
>
> Compared to many of the other changes that went in since 2.6.0? 4K
> stacks is a minor change. There's no config option for 4 level
> pagetables for example, and that's a far more invasive change in many
> ways.
>
> > At this point in time it
> > has not been sufficiently demonstrated that 4K "just works".
>
> Excuse me?
> Fedora released 3 distributions with it enabled, and Red Hat uses it in
> an enterprise distribution. That's a whopping lot of users right there
> with a very wide range of workloads.
>
> > kernel to get the desired safety? IF upstream
> > kernel.org doesn't even provide the knobs to ensure safety at all
> > costs (and vendors like Redhat have people at the helm who are
> > advocating 4K stacks in the "Enterprise" Linux kernel configurations
> > of the world) how does one get a Linux kernel that provides a sizable
> > safety net that is _SUPPORTED_ for true enterprise-grade applications?
>
> eh I don't know if you paid attention, but Red Hat Enterprise Linux 4
> only has 4Kb stacks kernels... so that covers your supported true
> enterprise-grade application thing.
>
> > Simply put 4K vs 8K is not as trivial a decision as you'd have people believe.
>
> that's too simply put in fact, especially if you look at it
> historically. It's a bit of irony that part of the reason 4K stacks was
> developed was that the 2.4 kernels ran out of stack space for customers
> occasionally (just as example look at lkml this week, there was a report
> about such an overflow there as well). Remember that 4K+4K has more
> stack space than the 4K+2K as 2.4 kernels have. Sure 2.6 bumped this to
> 5.5k/2.5k roughly in the "8K" case, but fundamentally the change to
> 4k/4k isn't all that big even inside 2.6.
>
> You can go on and keep painting this as a cowboy development, but it
> really isn't....
>
>
>
> [1] Yes addicts; binary drivers are in many ways similar to heroin;
> they're really hard to get rid of for example and highly addictive, they
> also cause some people to act like junkies-in-withdrawl when their
> binary driver breaks, or when someone suggests breaking it.
>
>

2005-12-16 21:35:24

by linux-os (Dick Johnson)

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Fri, 16 Dec 2005, David Lang wrote:

> On Fri, 16 Dec 2005, linux-os (Dick Johnson) wrote:
>
>> On Fri, 16 Dec 2005, Horst von Brand wrote:
>>
>>> linux-os \(Dick Johnson\) <[email protected]> wrote:
>>>
>>> [...]
>>>
>>>> Throughout the past two years of 4k stack-wars, I never heard why
>>>> such a small stack was needed (not wanted, needed). It seems that
>>>> everybody "knows" that smaller is better and most everybody thinks
>>>> that one page in ix86 land is "optimum". However I don't think
>>>> anybody ever even tried to analyze what was better from a technical
>>>> perspective. Instead it's been analyzed as religious dogma, i.e.,
>>>> keep the stack small, it will prevent idiots from doing bad things.
>>>
>>> OK, so here goes again...
>>>
>>> The kernel stack has to be contiguous in /physical/ memory. Keep the stack
>>> /one/ page, that way you can always get a new stack when needed (== each
>>> fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
>>> find (or create) a multi-page free area, and (fragmentation being what it
>>> is, and Linux routinely running for months at a time) you are in a whole
>>> new world of pain.
>>
>> The interrupt stack needs to be non-paged. Are you sure the user-stacks
>> need to be 'physical', non-paged too? If so, that's probably the
>> problem. All addresses accessed by the CPUs in the kernel are virtual
>> which means one needs some mapping anyway.
>
> actually, the kernel always uses real addresses, userspace uses virtual
> addresses.
>

No. Hint: What is PAGE_OFFSET?
Every address the CPU executes from, reads, or writes is a virtual address,
which is translated to a physical (bus) address.

What you may have heard or read is that the address-space that the
kernel uses for its code and resident data has fixed translation tables
which means one doesn't have to scan a bunch of tables to locate the
bus address, given the virtual address.

physical_address = virtual_address - PAGE_OFFSET;
virtual_address = physical_address + PAGE_OFFSET;

This is totally an artifact of how the page-tables are set up.
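
The fixed translation can be written out as a small sketch. This is plain C rather than kernel code: the sketch_* names are hypothetical, and the 0xC0000000 value assumes the usual i386 3G/1G split; the real value comes from the kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the usual i386 3G/1G split. */
#define SKETCH_PAGE_OFFSET UINT32_C(0xC0000000)

/* Direct-mapped kernel virtual address -> physical address. */
static uint32_t sketch_virt_to_phys(uint32_t vaddr)
{
    assert(vaddr >= SKETCH_PAGE_OFFSET);
    return vaddr - SKETCH_PAGE_OFFSET;
}

/* Physical address -> direct-mapped kernel virtual address. */
static uint32_t sketch_phys_to_virt(uint32_t paddr)
{
    /* Only physical addresses below 1 GiB fit in 32 bits at this offset. */
    assert(paddr < (UINT32_MAX - SKETCH_PAGE_OFFSET + 1u));
    return paddr + SKETCH_PAGE_OFFSET;
}
```

With this offset, physical address 0x00100000 (1 MiB) appears at kernel virtual address 0xC0100000. The access is still translated by the MMU; the offset only makes the mapping computable without walking the tables in software.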

> This came up recently with the page tables, Linus said that he was
> absolutly opposed to adding the complication and overhead of changine the
> kernel to user virtual addresses instead of real addresses for it's data
> structures. it would add an extra level of redirection to just about every
> memory access (which also means an additional load on the cache to store
> the mapping info to resolve this redirection). The performance hit for
> this would be considerable.
>
> David Lang
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5589.56 BogoMips).
Warning : 98.36% of all statistics are fiction.
.

2005-12-16 21:49:44

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 2005-12-16 at 16:28 -0500, Mike Snitzer wrote:
> Arjan,
>
> I'm aware that RedHat has been shipping FC and RHEL kernels with 4K
> stacks for some time. The recent 4K stack fixes pretty much establish
> that RedHat's early adoption of 4K stacks was a "cowboy development"
> no?

(disclaimer: I don't work for Red Hat)

actually no. The only overflows seen in reality are with XFS, and Red
Hat doesn't support XFS anyway. The recent fix (not plural) was more a
precaution to let XFS work more reliably.


> I don't think RedHat kept similar 4K stack fixes from upstream,
> so a bit of luck maybe? I do agree that at this point the 4K-only
> proposal is NOT an overly rash decision given continued adequate
> vetting in -mm. But even then there may be untested workloads that
> expose stack issues. Which is perfectly fine if users have the option
> to use _supported_ alternate configs.

the problem to some degree is that those people if they hit this, won't
report it most likely. This is a problem because the same overflow can
hit with 8K stacks, just less frequently, so it really needs to be fixed
regardless of what anyone thinks of 4k-vs-8k stacks.


> I got the sense that you were trying to paint me as something of a
> closet "ndiswrapper addict".

Actually I don't even know if you use ndiswrapper, but some others are
behaving like that; you're one of the "others" who actually try to use
real arguments.

> Amusing and all but my motivations for
> requesting continued options on the stack size are rooted in concerns
> that long call chains can and still do result when running kernel.org
> and RHEL4 kernels under particular workloads. An example workload
> being cluster filesystems that in a single call chain historically
> _could_ leverage iptables + RPC (tcp) + DM (LVM) + etc.

funny you mention this one: iptables/RPC(tcp) actually run in the OTHER
4K stack; this workload has actually LESS chance of an overflow than
before... due to having more space in irq context.


> Given Neil Brown's fix for the block layer these stack-heavy workloads

which was purely preemptive and not based on actually observed problems
btw. it's a good fix nevertheless.

> All of us appreciate the desire to have Linux be more efficient and 4K
> stacks will get us that. If it comes with the cost of instability
> under more exotic workloads then the bad outweighs the perceived good
> of imposed 4K stacks. With RHEL4 it would seem we're past the point
> of no-return for supported 8K stacks. I'm merely advocating upstream
> give users the 8K+IRQ stack _options_ and set the default to 4K.

note that I no longer care about this option going away or not; it's not
worth the silly flames (present company excluded ;)

options are good, to a degree. The extreme would be a kernel with 40.000
different config options, one for each patch that goes in. That's of
course silly! The other extreme is the gnome idea that preferences are
bad period. To a degree, Havoc and co are right in that there shouldn't
be a "unbreak my app" option, just like there shouldn't be a "unbreak my
kernel" config option. There needs to be some sort of reason and
proportion in all this.

Where to draw the line is tricky I suppose and to a large degree a
matter of taste of the individual developer; but to be realistic; things
like 4-level pagetables or objrmap have a similar or higher risk of
breakage when they got in, and those didn't get config options
(something I agree with fwiw). In fact each 2.6.X release so far has had
2 or 3 major changes more risky/invasive than 4K stacks.

All that people say about 4k/4k stacks vs unified 8k stacks is
smoke-and-daggers to be honest (yes there were some problems in the past
with XFS, XFS got mostly fixed and Neil's patch fixes that for real, but
the basic things got fixed long ago). They forget that 2.4 got along
fine for a long time with a 4k/2k stack, or maybe chose to forget. All
those overflows you mention should hit double on 2.4 (and to be honest,
2.4 does hit some overflows, mostly due to the effective 2k irq side).
The 4k/4k approach is an extension on top of the 2.4 situation. 2.6 with
a unified 8k stack has a bit of extra space, sure, but not dramatically
more either.

2005-12-16 21:50:54

by Adrian Bunk

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 04:28:15PM -0500, Mike Snitzer wrote:
>...
> Given Neil Brown's fix for the block layer these stack-heavy workloads
> that included DM in the call chain need to be revisited. However, the
> savings associated with those particular fixes still may not leave
> sufficient breathing room. The logic that all users must NOW provide
> workloads which undermine 4K stack viability otherwise the 8K option
> will be completely removed _seems_ quite irrational (even though we
> are _supposedly_ just talking about doing so in -mm).
>
> All of us appreciate the desire to have Linux be more efficient and 4K
> stacks will get us that. If it comes with the cost of instability
> under more exotic workloads then the bad outweighs the perceived good
> of imposed 4K stacks. With RHEL4 it would seem we're past the point
> of no-return for supported 8K stacks. I'm merely advocating upstream
> give users the 8K+IRQ stack _options_ and set the default to 4K.

My count of bug reports for problems with in-kernel code with 4k stacks
after Neil's patch went into -mm is still at 0. That's amazing
considering how many people have claimed in this thread how unstable
4k stacks were...

Enabling 4k stacks unconditionally for all -mm users will give us a
wider testing coverage and will tell us whether we have really fixed all
bugs that become visible with 4k stacks or whether there are still bugs
left.

-mm kernels contain many experimental features, and "completely removed"
isn't really true because we can expect people running the
experimental -mm kernels to know how to un-apply a patch.

> Mike

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-16 22:16:14

by Dave Jones

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, Dec 16, 2005 at 10:50:54PM +0100, Adrian Bunk wrote:
> On Fri, Dec 16, 2005 at 04:28:15PM -0500, Mike Snitzer wrote:
> >...
> > Given Neil Brown's fix for the block layer these stack-heavy workloads
> > that included DM in the call chain need to be revisited. However, the
> > savings associated with those particular fixes still may not leave
> > sufficient breathing room. The logic that all users must NOW provide
> > workloads which undermine 4K stack viability otherwise the 8K option
> > will be completely removed _seems_ quite irrational (even though we
> > are _supposedly_ just talking about doing so in -mm).
> >
> > All of us appreciate the desire to have Linux be more efficient and 4K
> > stacks will get us that. If it comes with the cost of instability
> > under more exotic workloads then the bad outweighs the perceived good
> > of imposed 4K stacks. With RHEL4 it would seem we're past the point
> > of no-return for supported 8K stacks. I'm merely advocating upstream
> > give users the 8K+IRQ stack _options_ and set the default to 4K.
>
> My count of bug reports for problems with in-kernel code with 4k stacks
> after Neil's patch went into -mm is still at 0. That's amazing
> considering how many people have claimed in this thread how unstable
> 4k stacks were...
>
> Enabling 4k stacks unconditionally for all -mm users will give us a
> wider testing coverage and will tell us whether we have really fixed all
> bugs that become visible with 4k stacks or whether there are still bugs
> left.

As another anecdotal point, the number of oomkill/page alloc failure
related bugs that get filed against Fedora these days you can count
on one hand. Before we switched over to 4K stacks, we were
getting regular reports from users having quite sane workloads on
capable machines getting jobs killed left and right.

Now the only cases we're seeing is usually loonies trying to
put silly amounts of RAM in 32bit systems, and the occasional
bug which turns out to be a slab leak or something similar.
(There's one oddball right now where someone sees his 5GB
x86-64 run out of DMA zone, but that might 'go away' when
we push out a kernel with the DMA32 zone).

Dave

2005-12-17 03:46:46

by Bodo Eggert

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 16 Dec 2005, Lee Revell wrote:
> On Fri, 2005-12-16 at 12:05 +0100, Bodo Eggert wrote:

> > So where is the driver for the Netgear WG511 Softmac card I'm supposed
> > to test? I bought this card because it was labeled as being supported, and it
> > turned out that it wasn't, and just nobody cared to update the list of
> > supported cards with the warning about the unsupported variant.
>
> Um, this is not the developers fault. Do you think the vendors call the
> driver developers to tell them "hey, we just released a new product,
> with a name confusingly similar to the one your driver supports, but we
> changed the chipset a tiny bit so it won't work with your driver"?
> Dream on.

> Driver developers are not psychic. If no USER reported that the new
> FooBar1002X is completely different from the FooBar1002, there's no way
> for us to know. Sorry you were unfortunate enough to be the first user
> to learn the hard way. Complain to the vendor not LKML.

I found the information hidden on the developer's website, IIRC in the
developer forum and in several threads. I think it's reasonable to believe
that the devteam knew.

--
Top 100 things you don't want the sysadmin to say:
90. Wow....that seemed _fast_.....

2005-12-17 06:53:33

by Alex Davis

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

At some time in the not-too-distant past, Giridhar Pemmasani stated:

>ndiswrapper is used not just for broadcom. There are plenty of other
>chipsets that don't even have a project started to write open source
>driver.
Amen!!


>> try to force ndiswrapper junkies over to the driver to get it tested
^^^^^^^
>Shame on you. Your last mail was a promise to be "more reserved". Even
>otherwise, such profanities against a group of people are unwarranted.
Again, AMEN!!!

I code, therefore I am

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

2005-12-17 10:26:54

by Bodo Eggert

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 16 Dec 2005, Dave Jones wrote:
> On Fri, Dec 16, 2005 at 12:05:18PM +0100, Bodo Eggert wrote:

> > ACK. So where is the driver for the Netgear WG511 Softmac card I'm supposed
> > to test? I bought this card because it was labeled as being supported, and it
> > turned out that it wasn't, and just nobody cared to update the list of
> > supported cards with the warning about the unsupported variant.
>
> There are two models of that card with the same name.
> The one made in taiwan is a prism54, the one made in china is
> something else. I guess yours is made in China ?

Yes.
--
Saying your system is secure should be considered the same as saying your food
is too hot. Its a temporary condition which is going away even as you speak.
-- Gandalf Parker

2005-12-17 17:44:20

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Kyle Moffett <[email protected]> writes:

> On Dec 16, 2005, at 10:35, Diego Calleja wrote:
> > I know, but there's too much resistance to the "pure" 4kb patch.
>
> I have yet to see any resistance to the 4Kb patch this time around
> that was not "*whine* don't break my ndiswrapper plz".

My comment from last time about the missing safety net still applies 100%.

Kernel code is getting more complex all the time and running with
very tight stack is just risky.

> The point is to force it in -mm so most people can't just disable it
> because it fixes their problem. We want 8k stacks to go away, and

Who is we? And why?

About the only half way credible arguments I've seen for it were:

- "it might reduce stalls in the VM with order 1". Didn't quite
convince me because there were no numbers presented and at least on
x86-64 I've never noticed or got reported significant stalls because
of this.

- "it allows more threads for 32bit which might run out of lowmem" - i
think everybody agrees that the 10k threads case is not really
something to encourage. And even when you want to add it then only a factor
two increase (which this patch brings) is not really too helpful.

The main argument thrown around seems to be "but it will break
binary only modules" - while I'm not fully unsympathetic I don't
think technical issues in the kernel should be guided by
such political considerations.

I suspect you will be reposting it so often till the voices
of reason get tired?

-Andi

2005-12-17 20:16:51

by Kyle Moffett

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Dec 17, 2005, at 12:44, Andi Kleen wrote:
> Kernel code is getting more complex all the time and running with
> very tight stack is just risky.

IMPORTANT POINT: The 4k-stacks code does *NOT* reduce overall
available stack!!! With the old code we have 8k of _total_ stack.
With the new code we have 4k of interrupt stack and 4k of per-process
stack. This makes stack-overflows a _LOT_ more debuggable, because
it's not a coincidence of high process-stack-usage and high interrupt-
stack-usage.

>> The point is to force it in -mm so most people can't just disable
>> it because it fixes their problem. We want 8k stacks to go away
>
> Who is we? And why?
>
> About the only half way credible arguments I've seen for it were:

I posted a list of links to the archives of various reasons a day or
so ago, but for summary:

This helps for some NUMA systems because single pages can come out of
a per-cpu pool instead of requiring global allocator locks.

> - "it might reduce stalls in the VM with order 1". Didn't quite
> convince me because there were no numbers presented and at least on
> x86-64 I've never noticed or got reported significant stalls
> because of this.

One comment on x86-64 vs. x86: There are restrictions on where in
memory your process stacks can be located on a 32-bit platform. They
need to reside in lowmem, which means under certain circumstances
your lowmem can get too fragmented to create new processes even
though you still have a lot of available RAM.

> - "it allows more threads for 32bit which might run out of lowmem"
> - i think everybody agrees that the 10k threads case is not really
> something to encourage.

Who is this "everybody" of whom you speak? :-D. Personally I agree
that we shouldn't _encourage_ 10k threads, but there are existing
userspace programs which do that, and I think we should support them
as much as possible.

> And even when you want to add it then only a factor two increase
> (which this patch brings) is not really too helpful.

The fragmentation behavior and optimizations for order-1 vs. order-0
_is_ significant. You can _always_ allocate order-0 pages if you
have any free memory in that zone, which is _not_ necessarily true
for order-N pages. (even if N==1). Also, I think some of the
fragmentation avoidance attempts get significantly easier and produce
much better results if all the kernel stacks are order-0.
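The order-0 versus order-1 point can be illustrated with a toy model. This is a sketch only, not the kernel's buddy allocator: it assumes a flat array of free/used page frames and treats an order-n request as needing 2^n free frames that are contiguous and aligned to 2^n. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model: free_map[i] is true when page frame i is free.
 * Returns true if some aligned run of 2^order free frames exists.
 */
static bool sketch_can_alloc(const bool *free_map, size_t nframes,
                             unsigned order)
{
    size_t span = (size_t)1 << order;

    for (size_t base = 0; base + span <= nframes; base += span) {
        bool all_free = true;

        for (size_t i = 0; i < span; i++) {
            if (!free_map[base + i]) {
                all_free = false;
                break;
            }
        }
        if (all_free)
            return true;
    }
    return false;
}
```

With a map where every second frame is in use, half the memory is free, yet an order-1 request fails while any order-0 request succeeds. That is the situation a two-page kernel stack runs into on a fragmented lowmem zone.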

Cheers,
Kyle Moffett

--
If you don't believe that a case based on [nothing] could potentially
drag on in court for _years_, then you have no business playing with
the legal system at all.
-- Rob Landley





2005-12-17 20:25:09

by Paul Rolland

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Hello,

> One comment on x86-64 vs. x86: There are restrictions on where in
> memory your process stacks can be located on a 32-bit
> platform. They
> need to reside in lowmem, which means under certain circumstances
> your lowmem can get too fragmented to create new processes even
> though you still have a lot of available RAM.

But where do these restrictions come from ? As far as I know, stack
is referenced by SS:ESP registers, and nothing in the x86 architecture
prevents them from pointing outside of lowmem... Isn't this simply a
Linux design restriction ?

Regards,
Paul


Paul Rolland, rol(at)as2917.net
ex-AS2917 Network administrator and Peering Coordinator

--

Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur
"Some people dream of success... while others wake up and work hard at it"

"I worry about my child and the Internet all the time, even though she's
too young to have logged on yet. Here's what I worry about. I worry that
10 or 15 years from now, she will come to me and say 'Daddy, where were
you when they took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation

2005-12-17 20:47:35

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sat, 2005-12-17 at 21:23 +0100, Paul Rolland wrote:
> Hello,
>
> > One comment on x86-64 vs. x86: There are restrictions on where in
> > memory your process stacks can be located on a 32-bit
> > platform. They
> > need to reside in lowmem, which means under certain circumstances
> > your lowmem can get too fragmented to create new processes even
> > though you still have a lot of available RAM.
>
> But where do these restrictions come from ? As far as I know, stack
> is referenced by SS:ESP registers, and nothing in the x86 architecture
> prevents them from pointing outside of lowmem... Isn't this simply a
> Linux design restriction ?

lowmem is a linux design restriction; only lowmem is directly
addressable.

(also remember that you can have 36 bits of physical ram, but only 32
bit in a pointer, so even if lowmem wasn't 870Mb it'd be limited to 4Gb)
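The arithmetic behind the 36-bit versus 32-bit remark, as a small sketch. The helper name is hypothetical; it simply computes how many bytes an address of a given width can distinguish.

```c
#include <stdint.h>

/* Bytes addressable with an address of the given width (width < 64). */
static uint64_t sketch_addressable_bytes(unsigned width_bits)
{
    return UINT64_C(1) << width_bits;
}
```

A 32-bit pointer covers 4 GiB, while 36 physical address bits cover 64 GiB, so at most one sixteenth of such a machine's RAM can be reached through a plain pointer at any one time.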

2005-12-17 20:52:38

by Adrian Bunk

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sat, Dec 17, 2005 at 06:44:07PM +0100, Andi Kleen wrote:
> Kyle Moffett <[email protected]> writes:
>
> > On Dec 16, 2005, at 10:35, Diego Calleja wrote:
> > > I know, but there's too much resistance to the "pure" 4kb patch.
> >
> > I have yet to see any resistance to the 4Kb patch this time around
> > that was not "*whine* don't break my ndiswrapper plz".
>
> My comment from last time about the missing safety net still applies 100%.
>
> Kernel code is getting more complex all the time and running with
> very tight stack is just risky.

My patch reduces it from roughly 6kB to 4kB.

I'm with you that we need a safety net, but I don't see a problem with
this being between 3kB and 4kB. The goal should be to _never_ use more
than 3kB stack having a 1kB safety net.

And in my experience, many stack problems don't come from code getting
more complex but from people allocating 1kB structs or arrays of
> 2k chars on the stack. In these cases, the code has to be fixed and
"make checkstack" makes it easy to find such cases.
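As an illustration of the kind of fix meant here, a user-space sketch under stated assumptions: malloc() and free() stand in for kmalloc() and kfree(), the 1 kB size is taken from the example above, and the function names are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUF_SIZE 1024

/* Stack-hungry variant: every call costs 1 kB of stack frame. */
static size_t sketch_on_stack(const char *src)
{
    char buf[SKETCH_BUF_SIZE];

    strncpy(buf, src, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';
    return strlen(buf);
}

/* Heap variant: only a pointer lives on the stack. Returns 0 on success. */
static int sketch_on_heap(const char *src, size_t *len_out)
{
    char *buf = malloc(SKETCH_BUF_SIZE);    /* stands in for kmalloc() */

    if (!buf)
        return -1;
    strncpy(buf, src, SKETCH_BUF_SIZE - 1);
    buf[SKETCH_BUF_SIZE - 1] = '\0';
    *len_out = strlen(buf);
    free(buf);                              /* stands in for kfree() */
    return 0;
}
```

The heap variant trades a possible allocation failure, which the caller must handle, for a quarter of a 4k stack.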

And as a data point, my count of bug reports for problems with in-kernel
code with 4k stacks after Neil's patch went into -mm is still at 0.

> > The point is to force it in -mm so most people can't just disable it
> > because it fixes their problem. We want 8k stacks to go away, and
>
> Who is we? And why?
>
> About the only half way credible arguments I've seen for it were:
>
> - "it might reduce stalls in the VM with order 1". Didn't quite
> convince me because there were no numbers presented and at least on
> x86-64 I've never noticed or got reported significant stalls because
> of this.
>
> - "it allows more threads for 32bit which might run out of lowmem" - i
> think everybody agrees that the 10k threads case is not really
> something to encourage. And even when you want to add it then only a factor
> two increase (which this patch brings) is not really too helpful.
>...

Unfortunately, "is not really something to encourage" doesn't make
"happens in real-life applications" impossible...

Reducing the stack by one third brings a factor two reduction in the
memory usage of threads - I wouldn't say this sounds too bad.

> -Andi

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 02:35:51

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

> > - "it allows more threads for 32bit which might run out of lowmem" - i
> > think everybody agrees that the 10k threads case is not really
> > something to encourage. And even when you want to add it then only a factor
> > two increase (which this patch brings) is not really too helpful.
> >...
>
> Unfortunately, "is not really something to encourage" doesn't make
> "happens in real-life applications" impossible...

real-life applications can either use user space threads or 64bit
machines. The days when Linux did otherwise unjustifiable ha^w^wdesign
changes just to work around the 900MB lowmem on weird loads on
extremely big 32bit machines are pretty much over I think...

-Andi

2005-12-18 04:21:27

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513


Attachments:
tst.c (151.00 B)
Example code with temporary struct on stack

2005-12-18 04:22:32

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Steven Rostedt <[email protected]> wrote:
> On Fri, 2005-12-16 at 15:42 -0300, Horst von Brand wrote:
> > linux-os \(Dick Johnson\) <[email protected]> wrote:
> >
> > [...]
> >
> > > Throughout the past two years of 4k stack-wars, I never heard why
> > > such a small stack was needed (not wanted, needed). It seems that
> > > everybody "knows" that smaller is better and most everybody thinks
> > > that one page in ix86 land is "optimum". However I don't think
> > > anybody ever even tried to analyze what was better from a technical
> > > perspective. Instead it's been analyzed as religious dogma, i.e.,
> > > keep the stack small, it will prevent idiots from doing bad things.
> >
> > OK, so here goes again...
> >
> > The kernel stack has to be contiguous in /physical/ memory. Keep the stack
> > /one/ page, that way you can always get a new stack when needed (== each
> > fork(2) or clone(2)). If the stack is 2 (or more) pages, you'll have to
> > find (or create) a multi-page free area, and (fragmentation being what it
> > is, and Linux routinely running for months at a time) you are in a whole
> > new world of pain.

> So people should really be asking for a PAGE_SIZE = 8k option ;)

On i386 it is either 4KiB or 4MiB. Guess what I prefer...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2005-12-18 05:03:53

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 17, 2005, at 3:52 PM, Adrian Bunk wrote:

> And in my experience, many stack problems don't come from code getting
> more complex but from people allocating 1kB structs or arrays of

And we catch this type of problems fairly easily in the patch review
itself, even before accepting the code in mainline. Plus there is
make checkstack to help find and fix any such issues, isn't it? So
it's not like forcing the stack to 4Kb and making the offending code
to crash is the best solution to force people to write code which
plays nice with the stack.

I think on i386 most people do fine with the 8Kb stack - whoever
benefits from 4Kb stack, can always choose the 4Kb stack config
option and recompile.

Alternatively, default to 4Kb and let people choose 8Kb and recompile
if that's what suits their workloads.

In any case having options doesn't hurt anything and we don't benefit
in any way from taking away the 8Kb option.

My 2 cents.

Parag


2005-12-18 05:43:35

by Andi Kleen

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 12:03:39AM -0500, Parag Warudkar wrote:
>
> On Dec 17, 2005, at 3:52 PM, Adrian Bunk wrote:
>
> >And in my experience, many stack problems don't come from code getting
> >more complex but from people allocating 1kB structs or arrays of
>
> And we catch this type of problems fairly easily in the patch review
> itself, even before accepting the code in mainline. Plus there is

You can catch the obvious ones, but the really hard ones
that only occur under high load in obscure exceptional
circumstances with large configurations and suitable nesting you won't.
These would be only found at real world users.

-Andi

2005-12-18 06:06:30

by Bodo Eggert

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Adrian Bunk <[email protected]> wrote:

> I'm with you that we need a safety net, but I don't see a problem with
> this being between 3kB and 4kB. The goal should be to _never_ use more
> than 3kB stack having a 1kB safety net.
>
> And in my experience, many stack problems don't come from code getting
> more complex but from people allocating 1kB structs or arrays of
> > 2k chars on the stack. In these cases, the code has to be fixed and
> "make checkstack" makes it easy to find such cases.
>
> And as a data point, my count of bug reports for problems with in-kernel
> code with 4k stacks after Neil's patch went into -mm is still at 0.

Would you run a desktop with an nfs server on xfs on lvm on dm on SCSI?
Or a productive server on -mm?

IMO it's OK to push 4K stacks in -mm, but one week of no error reports from
a few testers don't make a reliable system.

[...]

> Unfortunately, "is not really something to encourage" doesn't make
> "happens in real-life applications" impossible...

The same applies to using kernel stack. Therefore I'll want to choose
a bigger stack for my server, which runs less than 100 processes.
--
Ich danke GMX dafür, die Verwendung meiner Adressen mittels per SPF
verbreiteten Lügen zu sabotieren.

2005-12-18 06:05:56

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 18, 2005, at 12:43 AM, Andi Kleen wrote:

> You can catch the obvious ones, but the really hard ones
> that only occur under high load in obscure exceptional
> circumstances with large configurations and suitable nesting you
> won't.
> These would be only found at real world users.

Yep, as it all depends on code complexity, some of these cases might
not be "errors" at all - instead for that kind of functionality they
might _require_ bigger stacks.

If you have 64 bit machines common place and memory a lot cheaper I
don't see how it is beneficial to force smaller stack sizes without
giving consideration to the code complexity, architecture and
requirements.
(Solaris for example, seems to be going to have 16Kb kernel stacks on
64 bit machines.)

So, please let's leave stack size as an option for users to choose
and stop this 4Kb stack war. Maybe after a little rest I will start
another one demanding 16Kb stacks :)

Parag

2005-12-18 10:48:03

by Stefan Rompf

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Andi Kleen wrote:

> Kernel code is getting more complex all the time and running with
> very tight stack is just risky.

Btw., has anyone yet *measured* maximum stack usage for some weeks on several
machines, e.g. desktop system with one NIC, reiserfs; server with several
NICs, stacked device-mapper targets, fiber channel, appletalk...; web server
with SQL database running on it etc?

Right now I have the impression that the 4k stack flamewars are based on make
checkstack output, waiting for bugreports and other guesswork. Removing the
safety net on such a basis is just *very bad engineering*.

Stefan

2005-12-18 11:21:28

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k/4k stacks


> Btw., has anyone yet *measured* maximum stack usage for some weeks on several
> machines, e.g. desktop system with one NIC, reiserfs; server with several
> NICs, stacked device-mapper targets, fiber channel, appletalk...; web server
> with SQL database running on it etc?

partially, see below

> Right now I have the impression that the 4k stack flamewars are based on make
> checkstack output, waiting for bugreports and other guesswork. Removing the
> safety net on such a basis is just *very bad engineering*.

your impression is wrong.

the kernel has a stack overflow detector, which checks at irq entry time
if the stack is "rather high" (7kb into the stack on a 8kb stack, 3.5kb
on a 4k stack). When this warning hits there's still runway left (like
12.5 percent), but let's say the end comes into sight. If the stack usage
would be really tight, this "early warning" detector would be hitting a
lot of people, right? Well the good news is that it isn't being hit in
the distributions that use 4Kb stacks (at least the fedora releases and
RHEL, maybe others), with a few exceptions related to XFS use several
months ago (which got fixed since).

While this isn't a measure of how deep things ACTUALLY go, it's a
measure that they don't go past the 3.5Kb limit, let alone go past 4Kb
limit.
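The early-warning check described above can be sketched as follows. This is a simplified illustration, not the kernel's code: it assumes the stack area is thread_size bytes, aligned to thread_size, and grows downward, so the low bits of the stack pointer give the bytes still free. The real kernel also keeps struct thread_info at the bottom of that area, which is ignored here. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when less than one eighth (12.5 percent) of the stack
 * is left, i.e. more than 3.5kb used of 4k, or more than 7kb of 8k.
 * thread_size must be a power of two.
 */
static bool sketch_stack_low(uintptr_t sp, uintptr_t thread_size)
{
    uintptr_t bytes_left = sp & (thread_size - 1);

    return bytes_left < thread_size / 8;
}
```

Because the check only runs at irq entry, it samples the stack depth at interrupt time and does not record the deepest point ever reached.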

In addition someone did a chain analysis (which no doubt isn't 100%
complete but still a pretty good effort) and that didn't show major
problems either.

The guesswork in this thread is all from the people on the other side of
the argument, with lots of fear and doubt but with no data ;)

(and the "safety net" is a bit of misnomer, since it's not really safe,
just "statistically different" if the shit hits the fan)

2005-12-18 12:03:54

by Stefan Rompf

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k/4k stacks

Am Sonntag 18 Dezember 2005 12:21 schrieb Arjan van de Ven:

> the kernel has a stack overflow detector, which checks at irq entry time
> if the stack is "rather high" (7kb into the stack on a 8kb stack, 3.5kb
> on a 4k stack). When this warning hits there's still runway left (like
> 12.5 percent), but let's say the end comes into sight. If the stack usage
> would be really tight, this "early warning" detector would be hitting a
> lot of people, right?

Wrong. The probability that an interrupt happens just during the codepath with
highest stack usage is very small. Anyway CONFIG_DEBUG_STACKOVERFLOW is not
enabled in 2.6.14.4 i386 defconfig. Don't know about vendor kernels
though.

I thought more about filling the stack with some arbitrary value on thread
startup and checking how much has been overwritten on a regular basis. Part
of it is already there, hidden under CONFIG_DEBUG_STACK_USAGE. The
verification should just happen timer-controlled, not only on sysrq-whatever.
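A user-space sketch of the fill-and-check idea. The fill byte 0xA5 and the sketch_* names are hypothetical, and a plain byte array stands in for a thread's stack area; this is not how CONFIG_DEBUG_STACK_USAGE is implemented, only the same principle.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_POISON 0xA5  /* hypothetical fill byte */

/* Fill a fresh stack area with a known pattern at thread start. */
static void sketch_poison_stack(unsigned char *stack, size_t size)
{
    memset(stack, SKETCH_POISON, size);
}

/*
 * The stack grows down from stack[size - 1] toward stack[0], so the
 * untouched part is the run of poison bytes starting at stack[0].
 * Returns the high-water mark: the most bytes ever in use.
 */
static size_t sketch_stack_used(const unsigned char *stack, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && stack[untouched] == SKETCH_POISON)
        untouched++;
    return size - untouched;
}
```

Unlike the check at irq entry, this records the deepest point reached regardless of when it happened. It can undercount slightly if the deepest bytes written happen to equal the fill byte.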

> (and the "safety net" is a bit of misnomer, since it's not really safe,
> just "statistically different" if the shit hits the fan)

If you can't even guarantee that 8k (or 6k) is enough, how can you vote for 4k
then ;-) Just a little provocation, I don't plan on getting too involved in
this discussion, hell, this is just about a ridiculously small amount of self
contained #ifdef'd code ;-)

Stefan

2005-12-18 12:06:08

by Alan

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, 2005-12-18 at 11:49 +0100, Stefan Rompf wrote:
> Btw., has anyone yet *measured* maximum stack usage for some weeks on several
> machines, e.g. desktop system with one NIC, reiserfs; server with several
> NICs, stacked device-mapper targets, fiber channel, appletalk...; web server
> with SQL database running on it etc?

Some vendors have shipped distributions configured with 4K stacks for a
long time and monitored bug reports.

2005-12-18 12:08:59

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 12:03:39AM -0500, Parag Warudkar wrote:
>
> On Dec 17, 2005, at 3:52 PM, Adrian Bunk wrote:
>
> >And in my experience, many stack problems don't come from code getting
> >more complex but from people allocating 1kB structs or arrays of
>
> And we catch this type of problems fairly easily in the patch review
> itself, even before accepting the code in mainline. Plus there is
> make checkstack to help find and fix any such issues, isn't it? So
> it's not like forcing the stack to 4Kb and making the offending code
> to crash is the best solution to force people to write code which
> plays nice with the stack.

4kB stacks have been an option for some time already. There were some
problems in the beginning, but as far as we know we have fixed all of them.

There are so many possible bugs that people writing kernel code could
introduce and that cause crashes. The solution is not to add
workarounds for programming bugs at every possible place, but to use
code review and "make checkstack" before accepting code.

As a data point, my count of bug reports for problems with in-kernel
code with 4k stacks after Neil's patch went into -mm is still at 0.

> I think on i386 most people do fine with the 8Kb stack - whoever
> benefits from 4Kb stack, can always choose the 4Kb stack config
> option and recompile.
>
> Alternatively, default to 4Kb and let people choose 8Kb and recompile
> if that's what suits their workloads.
>...

There is no workload where 8kB suits better.

> My 2 cents.
>
> Parag

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 12:09:28

by Arjan van de Ven

Subject: Re: [2.6 patch] i386: always use 4k/4k stacks

On Sun, 2005-12-18 at 13:04 +0100, Stefan Rompf wrote:
> On Sunday 18 December 2005 12:21, Arjan van de Ven wrote:
>
> > the kernel has a stack overflow detector, which checks at irq entry time
> > if the stack is "rather high" (7kb into the stack on a 8kb stack, 3.5kb
> > on a 4k stack). When this warning hits there's still runway left (like
> > 12.5 percent), but lets say the end becomes in sight. If the stack usage
> > would be really tight, this "early warning" detector would be hitting a
> > lot of people, right?
>
> Wrong. The probability that an interrupt happens just during the codepath with
> highest stack usage is very small

so it samples over 1000 times per second, more when busy. Multiplied
over a very large number of users, and 2 years of time. "very small"...
I don't quite agree there.


> Anyway CONFIG_DEBUG_STACKOVERFLOW is not
> enabled in 2.6.14.4 i386 defconfig. Don't know about vendor kernels
> though.

the RH/Fedora ones have this enabled

> > (and the "safety net" is a bit of a misnomer, since it's not really safe,
> > just "statistically different" if the shit hits the fan)
>
> If you can't even guarantee that 8k (or 6k) is enough, how can you vote for 4k
> then ;-)

it's not 4k it is 4k+4k btw. And my argument is that it's not less
safe.. nor unsafe


2005-12-18 12:28:17

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 06:57:44AM +0100, Bodo Eggert wrote:
> Adrian Bunk <[email protected]> wrote:
>
> > I'm with you that we need a safety net, but I don't see a problem with
> > this being between 3kB and 4kB. The goal should be to _never_ use more
> > than 3kB stack having a 1kB safety net.
> >
> > And in my experience, many stack problems don't come from code getting
> > more complex but from people allocating 1kB structs or arrays of
> > > 2k chars on the stack. In these cases, the code has to be fixed and
> > "make checkstack" makes it easy to find such cases.
> >
> > And as a data point, my count of bug reports for problems with in-kernel
> > code with 4k stacks after Neil's patch went into -mm is still at 0.
>
> Would you run a desktop with an nfs server on xfs on lvm on dm on SCSI?
> Or a productive server on -mm?
>
> IMO it's OK to push 4K stacks in -mm, but one week of no error reports from
> a few testers don't make a reliable system.
> [...]

It isn't that 4k stacks were completely untested.

Fedora has enabled it for a long time.

Even RHEL4 always uses 4k stacks - and RHEL is a distribution many
people use on their production servers.

> > Unfortunately, "is not really something to encourage" doesn't make
> > "happens in real-life applications" impossible...
>
> The same applies to using kernel stack. Therefore I'll want to choose
> a bigger stack for my server, which runs less than 100 processes.

You can always manually adjust THREAD_SIZE if you really want to, but
there should be no reason to do so.

There are so many possible programming errors in kernel code, and stack
usage problems are amongst the ones you can find relatively easily...

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 13:44:58

by Michael Poole

Subject: Re: [2.6 patch] i386: always use 4k stacks

Adrian Bunk writes:

> On Sun, Dec 18, 2005 at 06:57:44AM +0100, Bodo Eggert wrote:
> >
> > Would you run a desktop with an nfs server on xfs on lvm on dm on SCSI?
> > Or a productive server on -mm?
> >
> > IMO it's OK to push 4K stacks in -mm, but one week of no error reports from
> > a few testers don't make a reliable system.
> > [...]
>
> It isn't that 4k stacks were completely untested.
>
> Fedora has enabled it for a long time.
>
> Even RHEL4 always uses 4k stacks - and RHEL is a distribution many
> people use on their production servers.

As was pointed out previously in this thread, at least one
configuration that is known to have problems with 4k stacks is simply
not supported by RHEL. How many more are like that?

Michael Poole

2005-12-18 14:12:06

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 08:44:56AM -0500, Michael Poole wrote:
> Adrian Bunk writes:
>
> > On Sun, Dec 18, 2005 at 06:57:44AM +0100, Bodo Eggert wrote:
> > >
> > > Would you run a desktop with an nfs server on xfs on lvm on dm on SCSI?
> > > Or a productive server on -mm?
> > >
> > > IMO it's OK to push 4K stacks in -mm, but one week of no error reports from
> > > a few testers don't make a reliable system.
> > > [...]
> >
> > It isn't that 4k stacks were completely untested.
> >
> > Fedora has enabled it for a long time.
> >
> > Even RHEL4 always uses 4k stacks - and RHEL is a distribution many
> > people use on their production servers.
>
> As was pointed out previously in this thread, at least one
> configuration that is known to have problems with 4k stacks is simply
> not supported by RHEL. How many more are like that?

s/is known/was known/

XFS got fixed and Neil's patch should fix the rest of the problem.

My count of bug reports for problems with 4k stacks after Neil's patch
went into -mm is still at 0.

4k stacks are always used by Fedora.
4k stacks are always used by RHEL4.

Granted, there might be some small areas that are not covered by such
distributions.

You ask "How many more are like that?".
That's exactly the question I want answers for by always enabling it
in -mm.

-mm is a pretty experimental kernel and everyone using it knows about
this. Many -mm kernels contain more than a hundred new patches compared to
the previous -mm kernel, and some of these patches are of the quality
"compiles with some specific options set and might not always crash your
kernel". A patch like always enabling 4k stacks, which is essentially
already used by at least one popular desktop distribution (Fedora) and
at least one popular server distribution (RHEL4), has already had _far_ more
than the average testing coverage for patches in -mm, so WTF is the
problem?

> Michael Poole

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 15:44:05

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 06:43:23AM +0100, Andi Kleen wrote:
> On Sun, Dec 18, 2005 at 12:03:39AM -0500, Parag Warudkar wrote:
> >
> > On Dec 17, 2005, at 3:52 PM, Adrian Bunk wrote:
> >
> > >And in my experience, many stack problems don't come from code getting
> > >more complex but from people allocating 1kB structs or arrays of
> >
> > And we catch this type of problems fairly easily in the patch review
> > itself, even before accepting the code in mainline. Plus there is
>
> You can catch the obvious ones, but the really hard ones
> that only occur under high load in obscure exceptional
> circumstances with large configurations and suitable nesting you won't.
> These would be only found at real world users.

You miss the fact that many of these problems can be detected by static
analysis.

We know that we don't have any non-recursive paths with > 3 kB stack
usage anymore since the beginning of this year, and the known recursive
problems should be attacked by Neil's patch.

> -Andi

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 15:49:36

by Parag Warudkar

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 18, 2005, at 7:09 AM, Adrian Bunk wrote:

> There is no workload where 8kB suits better.

People have pointed out that there is currently at least one
incompatibility introduced by 4K stacks and there may be many others
which are corner cases, that only occur under high load in obscure
exceptional circumstances with large configurations and suitable
nesting.

Moreover for 64 bit architectures there is no proven point that 4Kb
stacks are solving a specific problem there (Like the lowmem
fragmentation on i386 for e.g.). Nor can we predict for sure that in
future no type of functionality will require more stack. So taking
away 8Kb stack size on such arches solves no known problems and
introduces artificial limitations on code complexity.

All I am asking is what is wrong with having options? You can even
default to 4Kb and let people choose 8Kb when they absolutely benefit
from it. Does having options introduce code bloat or what is it that
is pressing so hard to remove the 8Kb "option"?

Parag

2005-12-18 15:51:08

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 01:05:52AM -0500, Parag Warudkar wrote:
>
> On Dec 18, 2005, at 12:43 AM, Andi Kleen wrote:
>
> >You can catch the obvious ones, but the really hard ones
> >that only occur under high load in obscure exceptional
> >circumstances with large configurations and suitable nesting you
> >won't.
> >These would be only found at real world users.
>
> Yep, as it all depends on code complexity, some of these cases might
> not be "errors" at all - instead for that kind of functionality they
> might _require_ bigger stacks.

Is this just FUD or can you give an example where this is a real
problem that can't be solved by using kmalloc()?

> If you have 64 bit machines common place and memory a lot cheaper I
> don't see how it is beneficial to force smaller stack sizes without
> giving consideration to the code complexity, architecture and
> requirements.
>...

Note that we are talking about reducing the stack size _by one third_.

Therefore, your point that it would make code much more complex sounds
strange.

> Parag

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 15:57:35

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Sun, Dec 18, 2005 at 10:49:31AM -0500, Parag Warudkar wrote:
>
> On Dec 18, 2005, at 7:09 AM, Adrian Bunk wrote:
>
> >There is no workload where 8kB suits better.
>
> People have pointed out that there is currently at least one
> incompatibility introduced by 4K stacks and there may be many others

That's wrong.

My count of bug reports for problems with 4k stacks with in-kernel code
after Neil's patch went into -mm is still at 0.

> which are corner cases, that only occur under high load in obscure
> exceptional circumstances with large configurations and suitable
> nesting.

And this is not that much of an issue since most of these cases can and
have already been analyzed by static analysis to be below 3 kB stack
usage.

> Moreover for 64 bit architectures there is no proven point that 4Kb
> stacks are solving a specific problem there (Like the lowmem
> fragmentation on i386 for e.g.). Nor can we predict for sure that in
> future no type of functionality will require more stack. So taking
> away 8Kb stack size on such arches solves no known problems and
> introduces artificial limitations on code complexity.
>...

That's complete bullshit considering that we are talking about an
i386-only patch.

> Parag

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-18 15:59:37

by Parag Warudkar

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 18, 2005, at 10:51 AM, Adrian Bunk wrote:

> Is this just FUD or can you give an example where this is a real
> problem that can't be solved by using kmalloc()?

Can you prove "Under all uses, circumstances and code requirements we
will do fine with 4K stacks today and tomorrow" ? How will deeply
nested function calls, longer call chains etc. be solved by kmalloc()?

> Therefore, your point that it would make code much more complex sounds
> strange.

My point wasn't that reducing stack will make code much more complex.
My point was some type of functionality might validly require complex
code which requires more stack - there are capable and affordable
machines to solve such problems and all we are doing with 4kB stacks
is preventing that.

Again - what is the pressing need to remove the "8Kb Stack _Option_"?
What problem does it solve on 64 bit arches?

Parag

2005-12-18 16:04:46

by Parag Warudkar

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 18, 2005, at 10:57 AM, Adrian Bunk wrote:

> That's complete bullshit considering that we are talking about an
> i386-only patch.

Ok, ignore my rants, I wrongly assumed there is a drive to make all
arches live with 4Kb! But if it's taking away the 8Kb option on i386
please consider keeping the 8Kb option and making 4Kb default.

Parag

2005-12-18 23:11:05

by NeilBrown

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday December 16, [email protected] wrote:
> Horst von Brand <[email protected]> wrote:
> [Forgot the attachment]

Thanks...
Based on that, I tried the following patch, and it didn't change the
amount of space that is reserved on the stack.
gcc version 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)

Further, earlier versions of gcc miscompile this construct.
They effectively treat that in-line structure as a 'static', and
since notify_change changes .ia_valid, the next time it is called
the contents of the structure will be wrong.

NeilBrown


### Diffstat output
./fs/nfsd/vfs.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)

diff ./fs/nfsd/vfs.c~current~ ./fs/nfsd/vfs.c
--- ./fs/nfsd/vfs.c~current~ 2005-12-19 09:44:20.000000000 +1100
+++ ./fs/nfsd/vfs.c 2005-12-19 09:56:46.000000000 +1100
@@ -923,11 +923,9 @@ nfsd_vfs_write(struct svc_rqst *rqstp, s

/* clear setuid/setgid flag after write */
if (err >= 0 && (inode->i_mode & (S_ISUID | S_ISGID))) {
- struct iattr ia;
- ia.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID;
-
down(&inode->i_sem);
- notify_change(dentry, &ia);
+ notify_change(dentry, &((struct iattr)
+ {.ia_valid = ATTR_KILL_SUID | ATTR_KILL_SGID}));
up(&inode->i_sem);
}
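
For reference, a small user-space sketch of what the construct in the
patch is supposed to do. The struct and function names are stand-ins
invented for illustration, not kernel code. Inside a function body a
C99 compound literal is an unnamed automatic object, initialized afresh
each time the expression is evaluated, so a callee that modifies it
must not affect the next call.

```c
/* Stand-in for struct iattr; only the field the patch touches. */
struct attr_model {
	unsigned int valid;
};

/* Stand-in for notify_change(): reads the flags, then modifies them. */
static inline unsigned int consume_attr(struct attr_model *a)
{
	unsigned int seen = a->valid;

	a->valid |= 0x8000u;	/* callee scribbles on the caller's struct */
	return seen;
}

/* The compound literal has automatic storage here and is re-initialized
 * on every evaluation, so each call passes .valid == 0x3. */
static inline unsigned int call_with_literal(void)
{
	return consume_attr(&((struct attr_model){ .valid = 0x3u }));
}
```

With a conforming compiler, calling call_with_literal() twice returns
0x3 both times. A compiler that treats the literal as a 'static', as
described above, would return a different value on the second call.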

2005-12-19 00:45:41

by NeilBrown

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Friday December 16, [email protected] wrote:
>
> The nfsd code uses inline in too many places.

Does it?
Most of the uses are either
- truly tiny bits of code
- code that is used only once which, as you say, will not currently
be auto-inlined on i386, so we do it by hand.

An exception is some of the xdr code.
If I
#define inline
in nfs3xdr.c, the nfsd.o changes from
text data bss dec hex filename
76132 3464 2408 82004 14054 ../mm-i386/fs/nfsd/nfsd.o
to
text data bss dec hex filename
72452 3464 2408 78324 131f4 ../mm-i386/fs/nfsd/nfsd.o
which is probably a win.

Is that what you were referring to?

>
> If this struct is really a problem (which I doubt considering its
> size), I'd prefer it being kmalloc'ed.

It's hard to *know* if it is a problem, but I am conscious that nfsd
adds measurably to stack depth for filesystem paths, and probably
isn't measured nearly as often.
It's true that 50 bytes out of 4K isn't a lot, but wastage that can be
avoided, should be avoided.

NeilBrown

2005-12-19 01:34:28

by Adrian Bunk

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, Dec 19, 2005 at 11:45:24AM +1100, Neil Brown wrote:
> On Friday December 16, [email protected] wrote:
> >
> > The nfsd code uses inline in too many places.
>
> Does it?
> Most of the uses are either
> - truly tiny bits of code

That's OK if they stay tiny and don't grow as time passes by.

> - code that is used only once which, as you say, will not currently
> be auto-inlined on i386, so we do it by hand.

That's OK if nobody forgets to un-inline them when they get more
callers.

Unfortunately, people often don't check whether an "inline" is still
appropriate when the code evolves.

Unless this is an extreme hot path, it's therefore IMHO not a good idea
to use "inline" in such cases.

Additionally, it's a medium-term goal for me to re-enable unit-at-a-time
on i386 for recent gcc's.

> An exception is some of the xdr code.
> If I
> #define inline
> in nfs3xdr.c, the nfsd.o changes from
> text data bss dec hex filename
> 76132 3464 2408 82004 14054 ../mm-i386/fs/nfsd/nfsd.o
> to
> text data bss dec hex filename
> 72452 3464 2408 78324 131f4 ../mm-i386/fs/nfsd/nfsd.o
> which is probably a win.
>
> Is that what you were referring to?

I didn't have one specific example in mind, but yes, this seems to be an
example of inlines that might have been reasonable at one time in the
past, but no longer are today.

> > If this struct is really a problem (which I doubt considering its
> > size), I'd prefer it being kmalloc'ed.
>
> It's hard to *know* if it is a problem, but I am conscious that nfsd
> adds measurably to stack depth for filesystem paths, and probably
> isn't measured nearly as often.
> It's true that 50 bytes out of 4K isn't a lot, but wastage that can be
> avoided, should be avoided.

"make checkstack" tells that nfsd_vfs_write is below 100 bytes of stack
usage. So even calling 30 such functions would not get you above
3 kB stack usage.

It's also interesting that according to Jörn Engel's static analysis of
call paths in kernel 2.6.11 [1], the string "nfs" occurs neither in
any of the functions involved in call paths with > 2 kB stack usage, nor
in any recursive call paths.

It's OK to use some bytes from the stack, and you haven't yet convinced
me that the code you are responsible for is using too much stack. ;-)

> NeilBrown

cu
Adrian

[1] http://wh.fh-wedel.de/~joern/stackcheck.2.6.11

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-19 08:51:11

by Helge Hafting

Subject: Re: [2.6 patch] i386: always use 4k stacks

Ray Lee wrote:

>(Man, I've been holding my tongue on this conversation for a while,
>but it seems my better angels have deserted me.)
>
>On 12/15/05, Lee Revell <[email protected]> wrote:
>
>
>>Bugzilla link please.
>>
>>
>
>No, that's not how failure engineering is done. A guy designing a
>bridge doesn't cut all the supports back to the bare minimum just to
>save money because his design says that the remaining metal should be
>strong enough. If you can't prove it, and it's a safety issue
>(continuing my analogy in the physical world), then you engineer for
>failure. Can you handle all occurrences? No, a hurricane Katrina comes
>along every once in a while. Can you weather more than you did before?
>Yes. In the meantime, there are fewer poor sods falling off the bridge
>that have to open a bugzilla report.
>
>
This is quite different - you know much better what stack loads
the kernel may get into. A bridge gets all sorts of weather,
with the very extreme cases occurring now and then. With the
kernel, you can look at the code and determine the maximum
possible stack depth that can ever occur. It won't get deeper
even in some very rare case.


>The world of software is no different. If someone wants to remove the
>8k stacks option, they'd better prove that they're making my servers
>more reliable. I've seen zero arguments for why 8k stacks is unviable.
>
>
Well, would you like the kernel to kill your webserver (or whatever
important app) because it attempted to fork at a time where
no two consecutive pages could be found? That happens
occasionally in real life - with 8k stacks. Going to 12k stacks, 16k stacks,
or (shudder) 64k stacks would make that much worse than it
is today.

>(I've also wondered why we can't just have IRQ stacks plus 8k thread
>stacks -- seemingly the best of both worlds) Instead, what I've seen
>is that we have coders who don't like the idea of any non-order-zero
>allocations taking place, because big systems running poorly coded
>Java apps with massive threading can hit problems with allocations
>from time to time.
>
>
You don't need big threaded apps for this to happen. All
you need is a handful of forking apps and memory
fragmentation. Many common server apps (web, mail, fileserver,...)
tend to use forking. Massively threaded apps like 4k stacks simply
because that saves 4k for each of the many threads.

>The answer for that is the same answer the kernel community usually
>gives about poorly designed userspace applications: rewrite them.
>
>I'm quite open to being proved wrong. If someone has a counter case
>they can toss forth, please do so. Systems taking lots of interrupts?
>Then how about 8k + IRQ stacks? With a counterexample I'll gladly
>concede that I'm an ignorant slut[*] -- excuse me, Saturday Night Live
>flashbacks -- an ignorant git, and shut up. ([*] is only half right,
>I'm not all that ignorant).
>
>If someone doesn't show a counter case, then may I suggest people
>consider the possibility that this is not proper engineering. Prove
>it, or provide a safety blanket. But don't yank the blanket without
>proving the lack of problem.
>
>
Well, failing order-1 allocations is _the_ counterexample. It
never happens with 4k stacks unless you run totally out of
memory, and then nothing can save you.
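
To make the "order" terminology concrete, here is a small user-space
sketch that maps a stack size to the allocation order it needs with 4k
pages. It is written to mirror what the kernel's get_order() computes,
but the function and macro names here are made up for illustration.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u /* i386 page size */

/* Smallest order such that (MODEL_PAGE_SIZE << order) >= size.
 * Order n means 2^n physically consecutive pages. */
static inline unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;

	while (((size_t)MODEL_PAGE_SIZE << order) < size)
		order++;
	return order;
}
```

A 4k stack is order 0 (any single free page will do), an 8k stack is
order 1 (two consecutive pages), 16k is order 2 and 64k is order 4.
A 12k stack would also need order 2, since orders are powers of two.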

Helge Hafting

2005-12-19 08:59:03

by Helge Hafting

Subject: Re: [2.6 patch] i386: always use 4k stacks

linux-os (Dick Johnson) wrote:

>On Thu, 15 Dec 2005, Lee Revell wrote:
>
>>On Thu, 2005-12-15 at 14:46 -0700, Jeff V. Merkey wrote:
>>
>>>Lee Revell wrote:
>>>
>>>>On Thu, 2005-12-15 at 14:07 -0700, Jeff V. Merkey wrote:
>>>>
>>>>>When you are on the phone with an irate customer at 2:00 am in the
>>>>>morning, and just turning off your broken 4K stack fix
>>>>>and getting the customer running matters.
>>>>
>>>>Bugzilla link please. Otherwise STFU.
>>>
>>>??????
>>>
>>>Jeff
>>
>>You imply that your customer's problem was due to a kernel bug triggered
>>by CONFIG_4KSTACKS. I am asking you to provide a link to the bug report
>>or get lost.
>>
>>Lee
>
>Throughout the past two years of 4k stack-wars, I never heard why
>such a small stack was needed (not wanted, needed). It seems that
>everybody "knows" that smaller is better and most everybody thinks
>that one page in ix86 land is "optimum". However I don't think
>anybody ever even tried to analyze what was better from a technical
>perspective. Instead it's been analyzed as religious dogma, i.e.,
>keep the stack small, it will prevent idiots from doing bad things.
>
>I'm fairly sure that if you started from scratch and decided to
>write a new operating system, your choice of a stack-size would
>probably be something like 64k. I have no clue why somebody
>decided to use a 4k stack and force their choice upon others.
>And, yes, I am well aware that each system-call requires a
>separate stack upon entry and it even needs to keep that stack
>while sleeping.
>
>
No. If writing an os from scratch, then using 4k stacks would
be absolutely trivial. The problems now happen only because
we're switching away from the 8k stacks people were used to having.
Design with 4k stacks from the start, and you'll see people writing
code with the assumption that they can't stick more than a handful
of ints/pointers on the stack.

If you design with 64k stacks then people _use_ that memory, and
soon you hear someone wanting 128k stacks to be "safe". That
way you end up with windows. Note that 64k is 16 pages, and they
have to be _consecutive_. It doesn't take much fragmentation before
you can't get that many consecutive pages - you can easily have 3/4 of your
memory unused, ready to be taken, but still be unable to get 16
consecutive pages.
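
The point about free-but-fragmented memory can be illustrated with a
toy model in user space. The page map and helper names below are
assumptions made for this sketch and have nothing to do with the
kernel's real allocator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Number of free pages in a used/free map (true = page in use). */
static inline size_t count_free_pages(const bool *used, size_t npages)
{
	size_t nfree = 0;

	for (size_t i = 0; i < npages; i++)
		if (!used[i])
			nfree++;
	return nfree;
}

/* Longest run of consecutive free pages in the same map. */
static inline size_t longest_free_run(const bool *used, size_t npages)
{
	size_t best = 0, run = 0;

	for (size_t i = 0; i < npages; i++) {
		run = used[i] ? 0 : run + 1;
		if (run > best)
			best = run;
	}
	return best;
}
```

If every fourth page of a 1024-page map is in use, 768 pages (3/4 of
memory) are free, yet the longest free run is only 3 pages. A 16-page
request fails in that state, and so does a 4-page one.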

To see this - create a small kernel module that tries to allocate 64k
of consecutive memory when loaded. Try loading it after a few days
of normal use, and see how often your server fails to do it.

Helge Hafting

2005-12-19 09:36:23

by Helge Hafting

Subject: Re: [2.6 patch] i386: always use 4k stacks

Horst von Brand wrote:

>>So people should really be asking for a PAGE_SIZE = 8k option ;)
>>
>>
>
>On i386 it is either 4KiB or 4MiB. Guess what I prefer...
>
>
Well, you can always use 8k pages - by setting PAGE_SIZE to 8k
and always set up the real 4k pages in pairs. Whether we want to do this
is another issue - but it is simple enough and avoids fragmentation.

Helge Hafting

2005-12-19 09:39:24

by Helge Hafting

Subject: Re: [2.6 patch] i386: always use 4k/4k stacks

Stefan Rompf wrote:

>Wrong. The probability that an interrupt happens just during the codepath with
>highest stack usage is very small. Anyway CONFIG_DEBUG_STACKOVERFLOW is not
>enabled in 2.6.14.4 i386 defconfig. Don't know about vendor kernels
>though.
>
>
Well, the interrupts have their own stack (if using 4k stacks) so
the interrupt timing shouldn't matter.

Helge Hafting

2005-12-19 11:05:30

by Helge Hafting

Subject: Re: [2.6 patch] i386: always use 4k stacks

Parag Warudkar wrote:

>
> On Dec 18, 2005, at 12:43 AM, Andi Kleen wrote:
>
>> You can catch the obvious ones, but the really hard ones
>> that only occur under high load in obscure exceptional
>> circumstances with large configurations and suitable nesting you won't.
>> These would be only found at real world users.
>
>
> Yep, as it all depends on code complexity, some of these cases might
> not be "errors" at all - instead for that kind of functionality they
> might _require_ bigger stacks.
>
No complex problem ever requires a big stack. It may require a large amount
of memory - which can be allocated explicitly outside the stack.
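
A user-space sketch of that trade, with malloc()/free() standing in for
kmalloc()/kfree(). The buffer size and function names are illustrative
assumptions, not taken from any code discussed in this thread.

```c
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 2048 /* illustrative size, not from the thread */

/* Puts 2 kB on the stack: half of a 4k stack in a single frame. */
static inline unsigned int checksum_on_stack(const unsigned char *src, size_t len)
{
	unsigned char scratch[SCRATCH_SIZE];
	unsigned int sum = 0;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, src, len);
	for (size_t i = 0; i < len; i++)
		sum += scratch[i];
	return sum;
}

/* Same work with the buffer allocated explicitly; the frame shrinks to
 * a few words, at the price of an allocation that can fail. */
static inline int checksum_on_heap(const unsigned char *src, size_t len,
				   unsigned int *sum)
{
	unsigned char *scratch = malloc(SCRATCH_SIZE);

	if (!scratch)
		return -1;
	if (len > SCRATCH_SIZE)
		len = SCRATCH_SIZE;
	memcpy(scratch, src, len);
	*sum = 0;
	for (size_t i = 0; i < len; i++)
		*sum += scratch[i];
	free(scratch);
	return 0;
}
```

Both variants compute the same result. The second one needs an error
path for a failed allocation, which is the usual cost of moving large
buffers off the stack.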

> If you have 64 bit machines common place and memory a lot cheaper I
> don't see how it is beneficial to force smaller stack sizes without
> giving consideration to the code complexity, architecture and
> requirements.
> (Solaris for example, seems to be going to have 16Kb kernel stacks on
> 64 bit machines.)
>
> So, please let's leave stack size as an option for users to choose
> and stop this 4Kb stack war. May be after a little rest I will start
> another one demanding 16Kb stacks :)

I suggest a little experiment for you. Make a kernel module which does
nothing except try to allocate 16k of _contiguous_ kernel memory, and
printk whether it succeeded or failed before exiting. Have cron run that
every 5 minutes. After a few weeks of running this low-impact test on
a busy loaded server, look at statistics about how often the 16k allocation
worked - and how often it failed.

Whatever failure rate you get, expect the same failure rate for server
processes forking to handle new connections while running with 16k stacks.
Failing one out of a hundred times would probably not be tolerated
for a webserver, and I suspect the failure rate for this will be higher - if
the machine has a reasonable memory load and the usual fragmentation.

On the other hand, if you can surprise us about how this works very
well - then you have a strong argument!

Helge Hafting

2005-12-19 11:41:37

by Jörn Engel

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, 19 December 2005 02:34:29 +0100, Adrian Bunk wrote:
> On Mon, Dec 19, 2005 at 11:45:24AM +1100, Neil Brown wrote:
> >
> > It's hard to *know* if it is a problem, but I am conscious that nfsd
> > adds measurably to stack depth for filesystem paths, and probably
> > isn't measured nearly as often.
> > It's true that 50 bytes out of 4K isn't a lot, but wastage that can be
> > avoided, should be avoided.
>
> "make checkstack" tells that nfsd_vfs_write is below 100 bytes of stack
> usage. So even calling 30 such functions would not get you above
> 3 kB stack usage.
>
> It's also interesting that according to Jörn Engel's static analysis of
> call paths in kernel 2.6.11 [1], the string "nfs" occurs neither in
> any of the functions involved in call paths with > 2 kB stack usage, nor
> in any recursive call paths.
>
> It's OK to use some bytes from the stack, and you haven't yet convinced
> me that the code you are responsible for is using too much stack. ;-)

Well, my metrics show the worst non-recursive paths and recursions
only. The case at hand is a relatively innocent path on its own, but
is stacked on top of one of the recursions.

Therefore, if my tool could make more sense of recursions and f.e. see
that raid over raid is unlikely, but nfsd over xfs over raid over
block is likely, nfsd would definitely show up. Recursions are the
hard problem to worry about.

Don't blame Neil for my tool being stupid. :)

Jörn

--
Don't patch bad code, rewrite it.
-- Kernighan and Pike, according to Rusty

2005-12-19 16:23:04

by Parag Warudkar

Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 19, 2005, at 6:09 AM, Helge Hafting wrote:

> I suggest a little experiment for you. Make a kernel module which
> does nothing except try to allocate 16k of _contiguous_ kernel memory,
> and printk whether it succeeded or failed before exiting. Have cron
> run that every 5 minutes. After a few weeks of running this low-impact
> test on a busy loaded server, look at statistics about how often the
> 16k allocation worked - and how often it failed.

I am aware of the limitations of Linux MM and the problems associated
with anything more than zero order allocations over a period of time.

My argument was that it's not as if a ton of i386 users are affected by
having a choice of stack sizes (I have read LKML quite frequently for a
long time and I don't remember seeing allocation failure errors - either
people moved to 64 bits without LOWMEM and that helped, or people
just do fine with the current 8K stack on i386), and even if some are,
let's leave the stack size as an option - it's not like it causes a
lot of code bloat or other problems (I read your argument about VM
developers being bogged down by having to deal with 8K stacks, but quite
frankly I don't understand how).

Whoever benefits can use the 4K stacks; others who feel it is risky to
have 4K stacks, for whatever reason, can be happy too. We can even
make 4K the default, but having a supported 8K option is important,
and almost all operating systems have >4K stacks on i386
machines, so there is some reason for having it.

But I rest my argument, I no longer use i386 and I am being told this
patch only affects i386! ;)

Parag

2005-12-19 17:46:42

by Dumitru Ciobarcianu

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, 19-12-2005 at 11:22 -0500, Parag Warudkar wrote:
> Whoever benefits can use the 4K stacks, others who feel it risky to
> have 4K stacks for whatever reason, can be happy too. We can even
> make the 4K default, but having supported option of 8K is important
> and almost all operating systems are having >4K stacks on i386
> machines, so there is some reason for having it.

Sloppy coding? As long as you don't have the source, you can't be sure.
Point to open source code (which doesn't taint you just by reading it)
that uses stacks larger than 4k + IRQ stack.

Millions of flies eat shit.
There must be a reason for it...

--
Cioby


2005-12-19 19:10:29

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 19, 2005, at 12:43 PM, Dumitru Ciobarcianu wrote:

> Sloppy coding ? As long you don't have the source you can't be sure.
> Point to an open source (and not tainting by just reading it) code
> which
> uses >4k+IRQstack stacks.
>

First you gotta understand that I am not arguing to take away the 4K
stacks - I am arguing about keeping both options and defaulting to 4K.

How do you determine how much stack space a piece of code is going to
need without knowing what functionality it needs to build? There
might be deeply nested, long call chains etc. which certain types of
functionality might warrant. How do you prove "4K otta be enough
stack for everyone doing everything", on what basis? (Reminds me of
old DOS days and the famous statement relating to 640K)

> Millions of flyes eat shit.
> It must be a reason for having it...
>

Yeah, compare that same thing to FORCING 4K stacks - it sounds as
illogical as the above statement.

No one is answering what are we gaining from removing the 8K stack
"_option_" - few bytes of code size, reason to not fix the VM, for
fun, for screwing over? Why not let people choose 8K if they need it?

Parag

2005-12-19 19:49:22

by Dumitru Ciobarcianu

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, 19-12-2005 at 14:10 -0500, Parag Warudkar wrote:
> On Dec 19, 2005, at 12:43 PM, Dumitru Ciobarcianu wrote:
>
> > Sloppy coding ? As long you don't have the source you can't be sure.
> > Point to an open source (and not tainting by just reading it) code
> > which
> > uses >4k+IRQstack stacks.
> >
>
> First you gotta understand that I am not arguing to take away the 4K
> stacks - I am arguing about keeping both options and defaulting to 4K.

Yes, and I agree with you on that, but you didn't answer my question
regarding _which_ os you mentioned needing more stack space and why.

--
Cioby


2005-12-19 20:17:22

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


On Dec 19, 2005, at 2:27 PM, Dumitru Ciobarcianu wrote:

> but you didn't answer my question
> regarding _which_ os you mentioned needing more stack space and why.

The two other commercially successful OSes - Windows and Solaris have
12Kb and 8Kb default kernel stack sizes. And both seem to do well
(hold on :) with the large stack sizes - meaning there is no
commercially observed problem created by the 8K stack size. Solaris
even lets you change the kernel stack size at runtime.

Even if we keep aside the impending argument about both OS'es being
crap and we shouldn't be imitating them, we could still derive one
conclusion from them - it is possible to have larger stack on i386
without problems (albeit with some drawbacks) which could be used
under certain circumstances.

Parag

2005-12-19 20:37:21

by Dumitru Ciobarcianu

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, 19-12-2005 at 15:17 -0500, Parag Warudkar wrote:
> On Dec 19, 2005, at 2:27 PM, Dumitru Ciobarcianu wrote:
>
> > but you didn't answer my question
> > regarding _which_ os you mentioned needing more stack space and why.
>
> The two other commercially successful OSes - Windows and Solaris have
> 12Kb and 8Kb default kernel stack sizes. And both seem to do well
> (hold on :) with the large stack sizes - meaning there is no
> commercially observed problem created by the 8K stack size. Solaris
> even lets you change the kernel stack size at runtime.

My point was that you don't know why those two OS have such a large
stack. Just because you can't look at the source without being
contaminated.


--
Cioby - "I'll just stop feeding the troll now"


2005-12-20 01:24:27

by grundig

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, 19 Dec 2005 22:34:57 +0200,
Dumitru Ciobarcianu <[email protected]> wrote:

> My point was that you don't know why those two OS have such a large
> stack. Just because you can't look at the source without being
> contaminated.

OpenSolaris is open source, you can look at their code.

But I don't think you'll find an answer there. My bet is: because
it would make things more difficult for their customers (even if
OpenSolaris is open source, it was born as a proprietary OS), because
doing it doesn't buy you performance or customers, because there are
not a lot of reasons for doing it, etc.

As I understand it, linux is "different". I'd say that the main
"philosophic" (not technical) reason for going 4K is: "because
we have the balls to write a 4k-stack-safe kernel". Quoting Linus:

"I hold open source people to higher standards. They are supposed to be
the people who do programming because it's an art-form, not because it's
their job."

2005-12-20 12:58:33

by Adrian Bunk

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Mon, Dec 19, 2005 at 02:10:17PM -0500, Parag Warudkar wrote:
>...
> How do you determine how much stack space a piece of code is going to
> need without knowing what functionality it needs to build? There
> might be deeply nested, long call chains etc. which certain types of
> functionality might warrant.

Static analysis of this problem is possible.

"make checkstack" is a good starting point.

And the automatic analysis of all possible call chains can and has
already found problems.

> How do you prove "4K otta be enough
> stack for everyone doing everything", on what basis? (Reminds me of
> old DOS days and the famous statement relating to 640K)
>...

We are talking about reducing the stack size by one third which doesn't
result in a fundamental difference.

There is no technical reason why 4 kB shouldn't be enough - I don't
count sloppy coding as a reason for it since in such cases we better
correct the code.

> Parag

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-20 14:32:44

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Parag Warudkar <[email protected]> wrote:
> On Dec 19, 2005, at 2:27 PM, Dumitru Ciobarcianu wrote:
> > but you didn't answer my question
> > regarding _which_ os you mentioned needing more stack space and why.

> The two other commercially successful OSes - Windows and Solaris have
> 12Kb and 8Kb default kernel stack sizes. And both seem to do well
> (hold on :) with the large stack sizes - meaning there is no
> commercially observed problem created by the 8K stack size. Solaris
> even lets you change the kernel stack size at runtime.

> Even if we keep aside the impending argument about both OS'es being
> crap

Right.

> and we shouldn't be imitating them,

Nodz. That doesn't mean they don't have their strong points, which we
should consider carefully.

> we could still derive one
> conclusion from them - it is possible to have larger stack on i386
> without problems (albeit with some drawbacks) which could be used
> under certain circumstances.

"With some drawbacks" is the point: It has been determined that the
drawbacks are heavy enough that the 8KiB stack option should go. Given
there is /no/ compelling argument /against/ 4KiB stacks, even very minor
drawbacks are important. So first make 4KiB the standard (popular
distributions work that way for /years/ now, with no measurable downsides),
then axe 8KiB stack as an option. Also note that 8KiB stacks really only
give 4KiB to the process (plus (or minus!) a random amount depending on
the interrupts being serviced ATM), and this has been so forever.

Oh, well, one of the larger drawbacks of 4KiB stacks is the inevitable
flamewar, each time with /less/ data (this round I've seen none) supporting
the need for larger stacks, into which all kinds of idiots* are suckered.

* This certainly includes myself
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2005-12-20 14:35:53

by Felix Oxley

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


On 19 Dec 2005, at 19:10, Parag Warudkar wrote:

>
> No one is answering what are we gaining from removing the 8K stack
> "_option_" - few bytes of code size, reason to not fix the VM, for
> fun, for screwing over? Why not let people choose 8K if they need it?

The proposed patch is for mm only. What you are gaining is wider
testing of 4K stacks.

I am just a lurker but, having read the entire thread, it seems to me
that:
1) the patch should be applied to mm.
2) ndiswrapper should be modified to work with 4K stacks.

It seems unlikely to me that this patch will be pushed to Linus just
because it has been in mm.
If that possibility comes up in 6-12 months then the flamewar can
begin again.

regards,
Felix

2005-12-20 18:01:18

by David Lang

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, 20 Dec 2005, Horst von Brand wrote:

>
> "With some drawbacks" is the point: It has been determined that the
> drawbacks are heavy enough that the 8KiB stack option should go. Given
> there is /no/ compelling argument /against/ 4KiB stacks, even very minor
> drawbacks are important. So first make 4KiB the standard (popular
> distributions work that way for /years/ now, with no measurable downsides),

at least one of the 'popular distributions' that switched to 4k stacks
years ago worked around the problems that it generated by simply labeling
the portions that didn't work with 4k stacks as 'unsupported by this
distro' (XFS has been explicitly stated to be in this category in these
discussions)

how many other corner cases are there that these distros just choose not
to support, but need to be supported and tested for the vanilla kernel?

also for those who are arguing that it's only dropping from 6k to 4k, you
are forgetting that the patches to move the interrupts to a separate stack
have already gone into the kernel, so today it is really 8k+4k and the
talk is to move it to 4k+4k.

I think it's a good idea to change the default (especially in -mm) to 4k
stacks and to schedule a change of the default in mainline for a few
versions out, but there needs to be a safety net other than telling people
to downgrade to a prior kernel if they run into problems when the switch
is made

David Lang

--
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare

2005-12-20 18:10:03

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks


>
> how many other corner cases are there that these distros just choose not
> to support, but need to be supported and tested for the vanilla kernel?

as someone who was at that distro at the time... none other than XFS and
reiserfs4.

> also for those who are arguing that it's only dropping from 6k to 4k, you
> are forgetting that the patches to move the interrupts to a separate stack
> have already gone into the kernel, so today it is really 8k+4k and the
> talk is to move it to 4k+4k.

actually irq stacks aren't enabled with 8K stacks right now, so your
statement isn't correct.


2005-12-20 18:12:39

by Sean

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, December 20, 2005 12:56 pm, David Lang said:
>
> at least one of the 'popular distributions' that switched to 4k stacks
> years ago worked around the problems that it generated by simply labeling
> the portions that didn't work with 4k stacks as 'unsupported by this
> distro' (XFS has been explicitly stated to be in this category in these
> discussions)

Times change, the XFS issues have been resolved.

> how many other corner cases are there that these distros just choose not
> to support, but need to be supported and tested for the vanilla kernel?

This is called FUD. If you have examples of a problem go ahead and post
them. *EVERY* change made to the kernel has the potential to cause a
problem. But this one has been carefully scrutinized and it seems that
only people with general notions of FUD remain to object. Nobody seems
to have any real objections any more. YAY!

> also for those who are arguing that it's only dropping from 6k to 4k, you
> are forgetting that the patches to move the interrupts to a separate stack
> have already gone into the kernel, so today it is really 8k+4k and the
> talk is to move it to 4k+4k.

So what? The point is that if you compare the world from what it was a
few years back with just 8K, now there is just as much stack space,
although it happens to be split in half.

> I think it's a good idea to change the default (especially in -mm) to 4k
> stacks and to schedule a change of the default in mainline for a few
> versions out, but there needs to be a safety net other than telling people
> to downgrade to a prior kernel if they run into problems when the switch
> is made

Of course. If there are problems discovered in -mm with 4K stacks (highly
unlikely, since it's been in production use on several vendor kernels for a
few years) they will be dealt with. But once a reasonable period of time
has passed with no issues, it's time to move it into mainline.

Sean

2005-12-20 18:12:15

by Adrian Bunk

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, Dec 20, 2005 at 09:56:10AM -0800, David Lang wrote:
> On Tue, 20 Dec 2005, Horst von Brand wrote:
>
> >
> >"With some drawbacks" is the point: It has been determined that the
> >drawbacks are heavy enough that the 8KiB stack option should go. Given
> >there is /no/ compelling argument /against/ 4KiB stacks, even very minor
> >drawbacks are important. So first make 4KiB the standard (popular
> >distributions work that way for /years/ now, with no measurable downsides),
>
> at least one of the 'popular distributions' that switched to 4k stacks
> years ago worked around the problems that it generated by simply labeling
> the portions that didn't work with 4k stacks as 'unsupported by this
> distro' (XFS has been explicitly stated to be in this category in these
> discussions)

AFAIK, XFS is the only example.
And the XFS related problems have already been fixed.

> how many other corner cases are there that these distros just choose not
> to support, but need to be supported and tested for the vanilla kernel?

My count of bug reports for problems with in-kernel code with 4k stacks
after Neil's patch went into -mm is still at 0.

If 4k stacks were as unstable as you imply, why has no one been able to
point to _one single_ problem with 4k stacks that is still present
after Neil's patch went into -mm?

> also for those who are arguing that it's only dropping from 6k to 4k, you
> are forgetting that the patches to move the interrupts to a separate stack
> have already gone into the kernel, so today it is really 8k+4k and the
> talk is to move it to 4k+4k.
>...

That's complete bullshit.

Currently, separate irq stacks are only used with
CONFIG_4KSTACKS=y.

> David Lang

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2005-12-20 18:25:21

by Dave Jones

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, Dec 20, 2005 at 09:56:10AM -0800, David Lang wrote:

> at least one of the 'popular distributions' that switched to 4k stacks
> years ago worked around the problems that it generated by simply labeling
> the portions that didn't work with 4k stacks as 'unsupported by this
> distro' (XFS has been explicitly stated to be in this category in these
> discussions)

Actually there are several reasons why certain parts are
unsupported in RHEL kernels. The fact that XFS blew up with
4k stacks was purely coincidental to it being unsupported.

Dave

2005-12-20 18:49:42

by David Lang

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, 20 Dec 2005, Arjan van de Ven wrote:

>> how many other corner cases are there that these distros just choose not
>> to support, but need to be supported and tested for the vanilla kernel?
>
> as someone who was at that distro in the time.. none other than XFS and
> reiserfs4.

good to hear, outsiders don't know these details. all we know is that some
things aren't supported, but (without a lot of effort) don't know what
things.

>> also for those who are arguing that it's only dropping from 6k to 4k, you
>> are forgetting that the patches to move the interrupts to a separate stack
>> have already gone into the kernel, so today it is really 8k+4k and the
>> talk is to move it to 4k+4k.
>
> actually irq stacks aren't enabled with 8K stacks right now, so your
> statement isn't correct.

Ok, I stand corrected, I didn't look at the code, I was going on the
memories of the discussions on l-k where the advocates were pushing to
enable the interrupt stacks unconditionally, and I was remembering that a
patch to do so had gone in.

David Lang

--
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare

2005-12-20 19:08:39

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

> Oh, well, one of the larger drawbacks of 4KiB stacks is the inevitable
> flamewar, each time with /less/ data (this round I've seen none) supporting
> the need for larger stacks, into which all kinds of idiots* are suckered.

At the same time, I haven't seen any data showing what we gain by losing the 8K
stack option. Where are the links to posts where people are claiming en masse
that 8K stacks are causing screwups, halting VM development etc.?

If 8K stacks are something that works, is not default, what do we gain by losing
it in total? If people need ndiswrapper (I hate it as much as anyone else, but come on,
for some people it's the only option) or any other functionality that requires
bigger stack, let them choose it if they are ready to take whatever risks that come with it.

To the ndiswrapper users - Do you guys have any real data showing 4K stacks
result in problems for you? (Since it is dedicated 4K against shared 8K, it
might as well not cause problems.) If you do then it's clear that 8K shared
gives more room than 4k dedicated.

Parag

2005-12-20 19:31:05

by David Lang

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, 20 Dec 2005, Parag Warudkar wrote:

>> Oh, well, one of the larger drawbacks of 4KiB stacks is the inevitable
>> flamewar, each time with /less/ data (this round I've seen none) supporting
>> the need for larger stacks, into which all kinds of idiots* are suckered.
>
> At the same time, I haven't seen any data showing what we gain by losing the 8K
> stack option. Where are the links to posts where people are claiming en masse
> that 8K stacks are causing screwups, halting VM development etc.?

by going to 4k stacks they are able to be allocated even when memory is
badly fragmented, which is not the case while they are 8k.

David Lang


--
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
-- C.A.R. Hoare
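
David's point about fragmentation can be made concrete with a toy model. The sketch below is a userspace illustration only, not the kernel's buddy allocator; the helper names and the 8-page free map are assumptions made up for the example, and a 4096-byte page is assumed as on i386. A 4K stack needs one free page (order 0), while an 8K stack needs two adjacent, naturally aligned free pages (order 1).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy view of physical memory: free_map[i] is 1 when page i is free.
 * Returns the index of the first free, naturally aligned block of
 * (1 << order) pages, or -1 if there is none.
 */
static int find_free_block(const unsigned char *free_map, size_t npages,
                           unsigned int order)
{
    size_t block = (size_t)1 << order;

    assert(order < 8);  /* keep the toy model small */
    for (size_t start = 0; start + block <= npages; start += block) {
        size_t i;

        for (i = 0; i < block; i++)
            if (!free_map[start + i])
                break;
        if (i == block)
            return (int)start;
    }
    return -1;
}

/* Smallest order whose block covers `bytes`, assuming 4096-byte pages. */
static unsigned int order_for(size_t bytes)
{
    unsigned int order = 0;

    while (((size_t)4096 << order) < bytes)
        order++;
    return order;
}
```

With a map in which every other page is free, half of memory is available and every order-0 request succeeds, yet no order-1 request can be satisfied. That is the situation in which an 8K stack allocation fails while a 4K one still works.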

2005-12-20 19:53:53

by Parag Warudkar

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

>
> by going to 4k stacks they are able to be allocated even when memory is
> badly fragmented, which is not the case while they are 8k.
>
> David Lang
>

It's hard to believe all i386 people have a problem with 8K stacks. What you said may be a problem domain bound to a specific workload on i386 with insane amounts of memory and fragmented LOWMEM. - These people can certainly use 4K stacks and no one is preventing that.

But normal people with <=1Gb RAM and using i386 on desktop (I am sure there are many of them) may do OK with 8K stacks if they had a need to do so. (Like running ndiswrapper, or some other thing which requires bigger stacks for that matter.)

Why take away the 8K option which already exists and works for people who need it? Let people choose what suits their needs. Forcing 4K stacks on people and asking them to sacrifice functionality while *gaining nothing* - sure sounds illogical. (You gain from 4K stacks - you have it as default, but technically you gain NOTHING from taking away the 8k option.)

Parag

2005-12-20 20:04:01

by Sean

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tue, December 20, 2005 2:53 pm, Parag Warudkar said:

> Why take away the 8K option which already exists and works for people who
> need it? Let people choose what suits their needs. Forcing 4K stacks on
> people and asking them to sacrifice functionality while *gaining nothing*
> - sure sounds illogical. (You gain from 4K stacks - you have it as
> default, but technically you gain NOTHING from taking away the 8k option.)

Listen, for anyone who "needs" 8K stacks they can maintain the patch
themselves, they don't need it in the mainline kernel. One of the points
of removing the 8K stack option is to signal to vendors and everyone else
that bugs arising from using 8K stacks and ** ANY ** sloppy code that
_needs_ 8K stacks is no longer appropriate for mainline. The kernel
doesn't just carry around a bunch of crappy options because someone
somewhere thinks he needs it.

Sean

2005-12-20 20:27:35

by Jesper Juhl

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

On 12/20/05, Parag Warudkar <[email protected]> wrote:
> >
> > by going to 4k stacks they are able to be allocated even when memory is
> > badly fragmented, which is not the case while they are 8k.
> >
> > David Lang
> >
>
> It's hard to believe all i386 people have a problem with 8K stacks. What you said may be a problem domain bound to a specific workload on i386 with insane amounts of memory and fragmented LOWMEM. - These people can certainly use 4K stacks and no one is preventing that.
>

There are more benefits to 4K stacks than just that.
Arjan posted a nice list a while back :
http://www.ussg.iu.edu/hypermail/linux/kernel/0511.2/0042.html


> But normal people with <=1Gb RAM and using i386 on desktop (I am sure there are many of them) may do OK with 8K stacks if they had a need to do so. (Like running ndiswrapper,

ndiswrapper is not safe even with 8K stacks since Windows allows more
than that, so ndiswrapper can still break with an 8K stack - the
ndiswrapper people would be a lot better off biting the bullet and
implementing their own large stack for the drivers they run, rather
than depending on the size of the Linux kernel's stack.


>or some other thing which requires bigger stacks for that matter.)
>
If that something is in the mainline kernel it should be fixed, if it
is not in mainline then mainline doesn't need to care.


> Why take away the 8K option which already exists and works for people who need it? Let people choose what suits their needs. Forcing 4K stacks on people and asking them to sacrifice functionality while *gaining nothing* - sure sounds illogical. (You gain from 4K stacks - you have it as default, but technically you gain NOTHING from taking away the 8k option.)
>

By taking away the 8K stack option (after a while, we need to make
damn sure all in-kernel code is safe first) I can think of these
benefits in addition to the technical benefits of 4K stacks :

- less code bloat.
- fewer config options (there are IMHO way too many already).
- more testing of 4K stacks (since it's the only option everyone will
be using it).
- pressure on vendors to get their drivers merged into mainline.


--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html
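
Jesper's suggestion that a wrapper should bring its own large stack for the foreign code it runs, instead of relying on the caller's stack size, can be sketched in userspace. This is only an analogy using the POSIX ucontext calls (getcontext, makecontext, swapcontext) as provided by glibc; it is not how ndiswrapper or the kernel would do it, and the names driver_fn, stack_hungry_driver and call_on_private_stack are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

/* Hypothetical callback type standing in for a foreign driver entry point. */
typedef int (*driver_fn)(int);

static ucontext_t caller_ctx, driver_ctx;
static driver_fn pending_fn;
static int pending_arg, pending_result;

/* Invented example of a callee that wants about 12 KiB of stack. */
static int stack_hungry_driver(int x)
{
    volatile char scratch[12 * 1024];

    scratch[0] = (char)x;
    scratch[sizeof(scratch) - 1] = 1;
    return scratch[0] + scratch[sizeof(scratch) - 1];
}

/* Runs on the private stack; returning resumes caller_ctx via uc_link. */
static void trampoline(void)
{
    pending_result = pending_fn(pending_arg);
}

/*
 * Run fn(arg) on a freshly allocated stack of stack_bytes bytes.
 * Returns 0 and stores the callee's return value in *result on success,
 * -1 if the stack or context could not be set up.
 * Not reentrant: the contexts live in file-scope variables.
 */
static int call_on_private_stack(driver_fn fn, int arg, size_t stack_bytes,
                                 int *result)
{
    void *stack;

    assert(fn != NULL && result != NULL);
    stack = malloc(stack_bytes);
    if (!stack)
        return -1;
    if (getcontext(&driver_ctx) == -1) {
        free(stack);
        return -1;
    }
    driver_ctx.uc_stack.ss_sp = stack;
    driver_ctx.uc_stack.ss_size = stack_bytes;
    driver_ctx.uc_link = &caller_ctx;   /* where to continue when fn returns */
    pending_fn = fn;
    pending_arg = arg;
    makecontext(&driver_ctx, trampoline, 0);
    if (swapcontext(&caller_ctx, &driver_ctx) == -1) {
        free(stack);
        return -1;
    }
    *result = pending_result;
    free(stack);
    return 0;
}
```

The caller's own stack depth no longer matters to the callee: the 12 KiB buffer lives on the private 64 KiB stack passed by the caller, at the cost of one allocation and two context switches per call.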

2005-12-21 11:12:22

by Sander

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Arjan van de Ven wrote (ao):
> > how many other corner cases are there that these distros just choose
> > not to support, but need to be supported and tested for the vanilla
> > kernel?
>
> as someone who was at that distro in the time.. none other than XFS
> and reiserfs4.

FWIW, I have a few servers and my workstation running Reiser4 and
CONFIG_4KSTACKS=y for several months now, and haven't encountered
problems yet. One server also runs Reiser4 on top of lvm2, and another
Reiser4 on top of sw raid1.

I know -mm + Reiser4 + 4kstacks is bleeding edge in more than one way,
but I like that for my workstations and the servers are
test/non-critical.

All systems do have real-life load though. I'd be happy to provide data
from these systems. Just mail me the commands.

--
Humilis IT Services and Solutions
http://www.humilis.net

2005-12-21 11:59:28

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Parag Warudkar <[email protected]> wrote:
> > Oh, well, one of the larger drawbacks of 4KiB stacks is the inevitable
> > flamewar, each time with /less/ data (this round I've seen none) supporting
> > the need for larger stacks, into which all kinds of idiots* are suckered.

> At the same time, I haven't seen any data showing what we gain by losing
> the 8K stack option.

Code simplification (don't need both versions). Simpler kernel configuration.
Even smaller .config files ;-)

A useful byproduct is more reproducible crashes when the stack overruns (as
8KiB stands, it will crash the same, but only sometimes; probably even
more, as it really is 6KiB for process + IRQ, and with 4KiB they are 4KiB
each). Yes, more crashes is a feature, as it gets fixed faster.
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
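
Horst's remark about reproducibility comes down to simple arithmetic, sketched below. The model is deliberately crude and rests on assumptions: the space taken by thread_info at the bottom of the stack is ignored, interrupt nesting is folded into a single irq_bytes figure, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_BYTES 4096

/* 8K configuration: process context and interrupts share one 8 KiB stack. */
static bool fits_shared_8k(int process_bytes, int irq_bytes)
{
    assert(process_bytes >= 0 && irq_bytes >= 0);
    return process_bytes + irq_bytes <= 2 * PAGE_BYTES;
}

/* 4K configuration: process context and interrupts each get their own 4 KiB. */
static bool fits_split_4k(int process_bytes, int irq_bytes)
{
    assert(process_bytes >= 0 && irq_bytes >= 0);
    return process_bytes <= PAGE_BYTES && irq_bytes <= PAGE_BYTES;
}
```

A code path that needs 5000 bytes overflows the split 4K stack every time it runs, with or without an interrupt. On the shared 8K stack the same path survives on its own and only overflows when an interrupt needing 3500 bytes happens to arrive at the deepest point, which is the "crashes only sometimes" case.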

2005-12-21 11:59:30

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Parag Warudkar <[email protected]> wrote:
> It's hard to believe all i386 people have a problem with 8K stacks. What
> you said may be a problem domain bound to a specific workload on i386
> with insane amounts of memory and fragmented LOWMEM. - These people can
> certainly use 4K stacks and no one is preventing that.

> But normal people with <=1Gb RAM and using i386 on desktop (I am sure
> there are many of them) may do OK with 8K stacks if they had a need to do
> so. (Like running ndiswrapper, or some other thing which requires bigger
> stacks for that matter.)

But those normal people are most of the users, running non-critical stuff,
and thus are /excellent/ guinea pigs for the "real world users" you
mentioned above ;-)

/me ducks and runs like all LKML is loose
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2005-12-21 20:03:44

by Jeffrey Hundstad

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Sander wrote:

>Arjan van de Ven wrote (ao):
>
>
>>>how many other corner cases are there that these distros just choose
>>>not to support, but need to be supported and tested for the vanilla
>>>kernel?
>>>
>>>
>>as someone who was at that distro in the time.. none other than XFS
>>and reiserfs4.
>>
>>
>
>FWIW, I have a few servers and my workstation running Reiser4 and
>CONFIG_4KSTACKS=y for several months now, and haven't encountered
>problems yet. One server also runs Reiser4 on top of lvm2, and another
>Reiser4 on top of sw raid1.
>
>I know -mm + Reiser4 + 4kstacks is bleeding edge in more than one way,
>but I like that for my workstations and the servers are
>test/non-critical.
>
>All systems do have real-life load though. I'd be happy to provide data
>from these systems. Just mail me the commands.
>
>
>

I would like to add to this. I've been using XFS+LVM+SCSI in a 14,000+
user University email server with 4k stacks since it became available.
I didn't even have problems BEFORE the XFS stuff was stack-dieted. I
would also be happy to provide more data.

--
Jeffrey Hundstad

2005-12-23 10:12:32

by Bodo Eggert

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Horst von Brand <[email protected]> wrote:

> "With some drawbacks" is the point: It has been determined that the
> drawbacks are heavy enough that the 8KiB stack option should go.

Determined by voodoo and wild guessing.

Let's detect the need for 4K stacks: (I hope I found the correct place)

(Maybe the printk should be completely ifdefed, but I'm not sure)


Signed-off-by: Bodo Eggert <[email protected]>

--- 2.6.14/kernel/fork.c.ori 2005-12-21 19:06:24.000000000 +0100
+++ 2.6.14/kernel/fork.c 2005-12-21 19:15:23.000000000 +0100
@@ -168,4 +168,9 @@ static struct task_struct *dup_task_stru
if (!ti) {
free_task_struct(tsk);
+ printk(KERN_WARNING "Can't allocate new task structure"
+#ifndef CONFIG_4KSTACKS
+ ". Maybe you could benefit from 4K stacks."
+#endif
+ "\n");
return NULL;
}

--
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

2005-12-23 10:33:44

by Eric Dumazet

[permalink] [raw]
Subject: Re: [2.6 patch] i386: always use 4k stacks

Bodo Eggert wrote:
> Horst von Brand <[email protected]> wrote:
>
>> "With some drawbacks" is the point: It has been determined that the
>> drawbacks are heavy enough that the 8KiB stack option should go.
>
> Determined by voodoo and wild guessing.
>
> Let's detect the need for 4K stacks: (I hope I found the correct place)
>
> (Maybe the printk should be completely ifdefed, but I'm not sure)
>
>
> Signed-off-by: Bodo Eggert <[email protected]>
>
> --- 2.6.14/kernel/fork.c.ori 2005-12-21 19:06:24.000000000 +0100
> +++ 2.6.14/kernel/fork.c 2005-12-21 19:15:23.000000000 +0100
> @@ -168,4 +168,9 @@ static struct task_struct *dup_task_stru
> if (!ti) {
> free_task_struct(tsk);
> + printk(KERN_WARNING, "Can't allocate new task structure"
> +#ifndef CONFIG_4KSTACKS
> + ". Maybe you could benefit from 4K stacks.\n"
> +#endif
> + "\n");
> return NULL;
> }
>

This patch is only OK for the i386 architecture.

For example, x86_64 cannot use a 4K stack; it needs an 8KB stack (so an order-1
allocation that may fail)
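
To make the order-1 remark concrete: the page allocator hands out blocks of 2^order contiguous pages, so with 4 KiB pages a 4K stack is an order-0 request and an 8K stack is an order-1 request. The following is only a user-space sketch under that page-size assumption; `stack_order` and `PAGE_SIZE_BYTES` are illustrative names, not kernel symbols (the kernel has its own helper, get_order()).

```c
#include <assert.h>

/* Assumption for illustration: 4 KiB pages, as on i386 and x86_64. */
#define PAGE_SIZE_BYTES 4096UL

/*
 * Return the smallest order such that (PAGE_SIZE_BYTES << order) >= size,
 * i.e. how many times a single page must be doubled to hold the stack.
 */
static unsigned int stack_order(unsigned long size)
{
    unsigned int order = 0;

    assert(size > 0);
    while ((PAGE_SIZE_BYTES << order) < size)
        order++;
    return order;
}
```

An order-0 request needs any single free page; an order-1 request needs two physically contiguous free pages, which is the allocation that can fail under fragmentation.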

Eric

2005-12-23 10:44:55

by Sean

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, December 23, 2005 5:12 am, Bodo Eggert said:
> Horst von Brand <[email protected]> wrote:
>
>> "With some drawbacks" is the point: It has been determined that the
>> drawbacks are heavy enough that the 8KiB stack option should go.
>
> Determined by voodoo and wild guessing.

Bullshit. There's no more guessing involved than you thinking 8K stacks
are sufficient. How do you know that 8K stacks are enough? If you
don't understand the testing and common sense that has gone into 4K+4K
stacks you should really be putting in a patch for 128K stacks, because
you don't have any proof that 8K stacks are sufficient either (except by
voodoo and wild guessing). However, if you _do_ understand the testing
and coding methods then you'll see that 4K + 4K stacks are sufficient
(modulo any bugs, which should be fixed).

Sean



>
> Let's detect the need for 4K stacks: (I hope I found the correct place)
>
> (Maybe the printk should be completely ifdefed, but I'm not sure)
>
> Signed-off-by: Bodo Eggert <[email protected]>
>
> --- 2.6.14/kernel/fork.c.ori 2005-12-21 19:06:24.000000000 +0100
> +++ 2.6.14/kernel/fork.c 2005-12-21 19:15:23.000000000 +0100
> @@ -168,4 +168,9 @@ static struct task_struct *dup_task_stru
> if (!ti) {
> free_task_struct(tsk);
> + printk(KERN_WARNING, "Can't allocate new task structure"
> +#ifndef CONFIG_4KSTACKS
> + ". Maybe you could benefit from 4K stacks.\n"
> +#endif
> + "\n");
> return NULL;
> }
>

>

2005-12-23 13:59:31

by Diego Calleja

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri, 23 Dec 2005 11:12:38 +0100,
Bodo Eggert <[email protected]> wrote:

> + printk(KERN_WARNING, "Can't allocate new task structure"
> +#ifndef CONFIG_4KSTACKS
> + ". Maybe you could benefit from 4K stacks.\n"
> +#endif
> + "\

A sarcastic patch, nice. So, let's try to get something useful
from this flamewar...sigh.


The 4k patch is being proposed for -mm. Personally I'm _shocked_ that so
many people are trying to avoid _testing_ (-mm is for testing, isn't it)
this feature so hard. Which is surprising, since merging it into -mm
may prove that they're right (people will report bugs caused by 4k
stacks, etc). Maybe 8k groupies are not willing to be proved that
they're right, or they're afraid of being proven that they're
wrong? </sarcasm>

Now, seriously:
I think that most of the 8k groupies don't like 4k not because it
doesn't work in the common case, but because it could cause hangs
that are not easy to reproduce (ie: they are paranoid). The combination
of code paths is too big and complex. I can understand that.

What I don't know is why you think that 8k will be "safe". As far
as I know, there have been stack overflows with 8KB stacks in
the past (ie, "hangs that are not easy to reproduce") before the 4k
stack patch was proposed, and the _one_ reason why now it's very
safe to run with 8k stacks is because the 4k stack patch has forced
people to do stack diets, not because 8k is the best option.

We have *NO* *WAY* of proving that it's safe to run either 4k or
8k stacks. Face it. And since such bugs can exist no matter what
stack size you use, the best option (IMO) is to choose the option that
will allow us to hit those bugs _faster_, ie: 4k stacks. From an
engineering point of view, I can't understand why hiding the problem
is better than choosing the path that will allow us to hit and fix those
bugs faster. It reminds me of "security through obscurity". What
will we do when we have too many overflows with 8K? 16K stacks? Oh,
let me guess: "we'll fix it"?. Well, and why can't we fix 4k stacks???

Now, the code is easy to maintain and some people depend on
8k stacks, as akpm pointed out in http://lkml.org/lkml/2005/12/15/336
This patch (http://lkml.org/lkml/2005/12/16/89) stolen from^W^Winspired
by Adrian Bunk defaults to 4k, makes the 8k people happy and it should
make akpm happy too.

Can someone tell me a reason why all this stupid flamewar can't be
solved with that patch?

2005-12-23 23:08:59

by Pavel Machek

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Fri 23-12-05 11:12:38, Bodo Eggert wrote:
> Horst von Brand <[email protected]> wrote:
>
> > "With some drawbacks" is the point: It has been determined that the
> > drawbacks are heavy enough that the 8KiB stack option should go.
>
> Determined by voodoo and wild guessing.
>
> Let's detect the need for 4K stacks: (I hope I found the correct place)
>
> (Maybe the printk should be completely ifdefed, but I'm not sure)
>
>
> Signed-off-by: Bodo Eggert <[email protected]>
>
> --- 2.6.14/kernel/fork.c.ori 2005-12-21 19:06:24.000000000 +0100
> +++ 2.6.14/kernel/fork.c 2005-12-21 19:15:23.000000000 +0100
> @@ -168,4 +168,9 @@ static struct task_struct *dup_task_stru
> if (!ti) {
> free_task_struct(tsk);
> + printk(KERN_WARNING, "Can't allocate new task structure"
> +#ifndef CONFIG_4KSTACKS
> + ". Maybe you could benefit from 4K stacks.\n"
> +#endif
> + "\n");
> return NULL;
> }

Two newlines in case of 4Kstacks...
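
As posted, the hunk emits two newlines whenever the hint is compiled in (that is, when CONFIG_4KSTACKS is not set), and it passes KERN_WARNING as a separate argument, whereas the log level is a string prefix that is concatenated with the format string. A possible cleanup is sketched below as a user-space stand-in, not as a replacement patch: `STACK_HINT`, `alloc_fail_msg` and `count_newlines` are hypothetical names, and the `KERN_WARNING` definition here only mimics the kernel's macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's log-level prefix (a string literal). */
#define KERN_WARNING "<4>"

/*
 * Hypothetical helper macro: the hint is present only when
 * CONFIG_4KSTACKS is not defined, and carries no newline of its own.
 */
#ifndef CONFIG_4KSTACKS
#define STACK_HINT ". Maybe you could benefit from 4K stacks."
#else
#define STACK_HINT ""
#endif

/* Adjacent string literals are concatenated; exactly one '\n' at the end. */
static const char *alloc_fail_msg(void)
{
    return KERN_WARNING "Can't allocate new task structure" STACK_HINT "\n";
}

/* Count '\n' characters, to check the message ends the line only once. */
static size_t count_newlines(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (*s == '\n')
            n++;
    return n;
}
```

In the kernel the same string would simply be handed to printk() as its single format argument.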

Pavel

--
Thanks, Sharp!

2005-12-24 02:22:28

by Horst von Brand

Subject: Re: [2.6 patch] i386: always use 4k stacks

Bodo Eggert <[email protected]> wrote:
> Horst von Brand <[email protected]> wrote:

[...]

> > "With some drawbacks" is the point: It has been determined that the
> > drawbacks are heavy enough that the 8KiB stack option should go.

> Determined by voodoo

Did you see them sticking needles into waxen stacks?

> and wild guessing.

More like long experience with the kernel, and sifting many, many bug
reports we've not looked over (and many we probably didn't see).
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2005-12-27 21:04:03

by David Weinehall

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Thu, Dec 15, 2005 at 10:16:05PM -0800, Alex Davis wrote:
>
>
> --- Dave Jones <[email protected]> wrote:
>
> > On Thu, Dec 15, 2005 at 09:20:54PM -0800, Alex Davis wrote:
> > > The problem is that, with laptops, most of the time you DON'T
> > > have a choice: HP and Dell primarily use a Broadcom integrated
> > > wireless card in their products. As of yet, there is no open
> > > source driver for Broadcom wireless.
> >
> > We've already been through all this the previous times this came up.
> >
> > http://bcm43xx.berlios.de
> >
> > Whilst it's in early stages, it's making progress.
> >
> > Dave
> >
> >
> I understand that, and am grateful for the effort, but the point is
> it's not ready. Are you expecting people to lose an important feature
> of their laptop until you get the driver ready?

Yeah, it must be oh so important for the laptop owners with that
particular chipset to run the -mm experimental kernels instead of their
distro kernel or the stable 2.6-kernel series or Linus' latest
installment (or even a git-snapshot or checkout...)


Regards: David Weinehall
--
/) David Weinehall <[email protected]> /) Northern lights wander (\
// Maintainer of the v2.0 kernel // Dance across the winter sky //
\) http://www.acc.umu.se/~tao/ (/ Full colour fire (/

2005-12-27 22:27:07

by Michael Buesch

Subject: Re: [2.6 patch] i386: always use 4k stacks

On Tuesday 27 December 2005 22:03, David Weinehall wrote:
> On Thu, Dec 15, 2005 at 10:16:05PM -0800, Alex Davis wrote:
> >
> >
> > --- Dave Jones <[email protected]> wrote:
> >
> > > On Thu, Dec 15, 2005 at 09:20:54PM -0800, Alex Davis wrote:
> > > > The problem is that, with laptops, most of the time you DON'T
> > > > have a choice: HP and Dell primarily use a Broadcom integrated
> > > > wireless card in their products. As of yet, there is no open
> > > > source driver for Broadcom wireless.
> > >
> > > We've already been through all this the previous times this came up.
> > >
> > > http://bcm43xx.berlios.de
> > >
> > > Whilst it's in early stages, it's making progress.
> > >
> > > Dave
> > >
> > >
> > I understand that, and am grateful for the effort, but the point is
> > it's not ready. Are you expecting people to lose an important feature
> > of their laptop until you get the driver ready?
>
> Yeah, it must be oh so important for the laptop owners with that
> particular chipset to run the -mm experimental kernels instead of their
> distro kernel or the stable 2.6-kernel series or Linus' latest
> installment (or even a git-snapshot or checkout...)

Well, the devicescape port of the bcm43xx driver works
very reliably on my Apple PowerBook with WPA encryption.
(WEP also works.)
I don't think there are lots of issues left for non-AccessPoint
modes. I simply assume you want to run the card in STA mode,
instead of turning your expensive notebook into an AP. ;)

It's been a long time since I last plugged my ethernet cable into
the notebook...

--
Greetings Michael.



2006-02-11 23:26:55

by Joshua Hudson

Subject: Re: [2.6 patch] i386: always use 4k stacks

I feel like putting my two cents in.

Suppose you just made 4K stacks the default. Since users of
ndiswrapper already have to recompile the kernel, making one
configuration change as well can't be that hard, especially since
ndiswrapper checks kernel options when compiling.