Yesterday I sent a patch to add stack-poison so the stack usage
could be observed.
Today I wrote a small program and tested the stack usage. Both
the program and the patch are attached. The result is:
Offset : 2ec8f000 Available Stack bytes = 3104
Offset : 2ecb1000 Available Stack bytes = 3104
Offset : 2ee5f000 Available Stack bytes = 20
Offset : 2f36d000 Available Stack bytes = 3104
Offset : 2fd09000 Available Stack bytes = 3012
Offset : 2fd0b000 Available Stack bytes = 3312
Offset : 2fd0f000 Available Stack bytes = 2132
Offset : 2fd2f000 Available Stack bytes = 2744
Offset : 2fd57000 Available Stack bytes = 2900
Offset : 2fdd5000 Available Stack bytes = 1400
Offset : 2fe35000 Available Stack bytes = 2832
Offset : 2ff3f000 Available Stack bytes = 776
Offset : 2ff45000 Available Stack bytes = 3188
This was after compiling the kernel. I did not have 4k stacks
enabled for this test, so any stack use that runs past
one page will not hurt the system. This was on linux-2.6.13.4.
Anyway, I tried to enable 4k stacks and the machine would
not boot past trying to install the first module. It just
stopped with the interrupts disabled. So, I am now rebuilding
the kernel back as I write this. That's why I am using 2.6.13
at the moment.
Anyway, getting down to 20 bytes of stack-space available
seems to be pretty scary.
Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.54 BogoMips).
Warning : 98.36% of all statistics are fiction.
****************************************************************
The information transmitted in this message is confidential and may be privileged. Any review, retransmission, dissemination, or other use of this information by persons or entities other than the intended recipient is prohibited. If you are not the intended recipient, please notify Analogic Corporation immediately - by replying to this message or by sending an email to [email protected] - and destroy all copies of this information, including any attachments, without reading or disclosing them.
Thank you.
On Thu, 22 Dec 2005 16:53:25 -0500, "linux-os \(Dick Johnson\)" <[email protected]> wrote:
>
>
>Yesterday I sent a patch to add stack-poison so the stack usage
>could be observed.
>
>Today I wrote a small program and tested the stack usage. Both
>the program and the patch is attached. The result is:
>
>Offset : 2ec8f000 Available Stack bytes = 3104
>Offset : 2ecb1000 Available Stack bytes = 3104
>Offset : 2ee5f000 Available Stack bytes = 20
Hmm:
# ./stack
Offset : 003fb000 Available Stack bytes = 3348
Offset : 0195d000 Available Stack bytes = 3620
Offset : 01961000 Available Stack bytes = 3828
Offset : 01963000 Available Stack bytes = 3088
Offset : 01b7d000 Available Stack bytes = 2952
Offset : 01b9f000 Available Stack bytes = 2616
Offset : 3753d000 Available Stack bytes = 3628
Offset : 3755d000 Available Stack bytes = 3604
Offset : 3755f000 Available Stack bytes = 3608
Offset : 37561000 Available Stack bytes = 3608
Offset : 37563000 Available Stack bytes = 3608
Offset : 37585000 Available Stack bytes = 3608
Offset : 37655000 Available Stack bytes = 3756
Offset : 37657000 Available Stack bytes = 3592
Offset : 37659000 Available Stack bytes = 3304
Offset : 37753000 Available Stack bytes = 3608
Offset : 37755000 Available Stack bytes = 3756
Offset : 377b3000 Available Stack bytes = 3880
Offset : 378e5000 Available Stack bytes = 3648
Offset : 37977000 Available Stack bytes = 3608
Offset : 37979000 Available Stack bytes = 3604
Offset : 3799d000 Available Stack bytes = 3820
Offset : 37c07000 Available Stack bytes = 3376
Offset : 37c27000 Available Stack bytes = 3652
Offset : 37dcf000 Available Stack bytes = 2580
Offset : 37def000 Available Stack bytes = 3556
Offset : 37df1000 Available Stack bytes = 3732
Offset : 37f13000 Available Stack bytes = 3612
Offset : 37f15000 Available Stack bytes = 3604
Offset : 37f35000 Available Stack bytes = 3608
I get the crash on startup when 4k + 4k is set too :(
http://bugsplatter.mine.nu/test/boxen/sempro/ with 8k 2.6.14.4.
Grant.
"linux-os \(Dick Johnson\)" <[email protected]> writes:
> Anyway, I tried to enable 4k stacks and the machine would
> not boot past trying to install the first module. It just
> stopped with the interrupts disabled.
Does that happen without your patch as well?
> Anyway, getting down to 20 bytes of stack-space available
> seems to be pretty scary.
More details maybe? .config | grep ^C ? What's on the stack above
the poison?
--
Krzysztof Halasa
On Friday 23 December 2005 01:11, Grant Coady wrote:
> On Thu, 22 Dec 2005 16:53:25 -0500, "linux-os \(Dick Johnson\)" <[email protected]> wrote:
> >Yesterday I sent a patch to add stack-poison so the stack usage
> >could be observed.
> >
> >Today I wrote a small program and tested the stack usage. Both
> >the program and the patch is attached. The result is:
> >
> >Offset : 2ec8f000 Available Stack bytes = 3104
> >Offset : 2ecb1000 Available Stack bytes = 3104
> >Offset : 2ee5f000 Available Stack bytes = 20
>
> Hmm:
> # ./stack
> Offset : 003fb000 Available Stack bytes = 3348
> Offset : 0195d000 Available Stack bytes = 3620
Please do these tests once you repair the bug preventing the 4K stacks kernel
from booting. The results are meaningless on an 8K stacks kernel.
--
Cheers,
Alistair.
'No sense being pessimistic, it probably wouldn't work anyway.'
Third year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.
On Thursday 22 December 2005 23:53, linux-os (Dick Johnson) wrote:
>
> Yesterday I sent a patch to add stack-poison so the stack usage
> could be observed.
>
> Today I wrote a small program and tested the stack usage. Both
> the program and the patch is attached. The result is:
>
> Offset : 2ec8f000 Available Stack bytes = 3104
> Offset : 2ecb1000 Available Stack bytes = 3104
> Offset : 2ee5f000 Available Stack bytes = 20
> Offset : 2f36d000 Available Stack bytes = 3104
> Offset : 2fd09000 Available Stack bytes = 3012
> Offset : 2fd0b000 Available Stack bytes = 3312
> Offset : 2fd0f000 Available Stack bytes = 2132
> Offset : 2fd2f000 Available Stack bytes = 2744
> Offset : 2fd57000 Available Stack bytes = 2900
> Offset : 2fdd5000 Available Stack bytes = 1400
> Offset : 2fe35000 Available Stack bytes = 2832
> Offset : 2ff3f000 Available Stack bytes = 776
> Offset : 2ff45000 Available Stack bytes = 3188
>
> This, after compiling the kernel. I did not have 4k stacks
> enabled for this test so any crashing of the stack beyond
> one page will not hurt the system. This was on linux-2.6.13.4.
>
> Anyway, I tried to enable 4k stacks and the machine would
> not boot past trying to install the first module. It just
> stopped with the interrupts disabled. So, I am now rebuilding
> the kernel back as I write this. That's why I am using 2.6.13
> at the moment.
>
> Anyway, getting down to 20 bytes of stack-space available
> seems to be pretty scary.
+ movl %esp, %edi
+ movl %edi, %ecx
+ andl $~0x1000, %edi
+ subl %edi, %ecx
ecx will be equal to ?
+ movb $'Q', %al
+ rep stosb
--
vda
On Saturday 24 December 2005 07:03, Denis Vlasenko wrote:
> + movl %esp, %edi
> + movl %edi, %ecx
> + andl $~0x1000, %edi
> + subl %edi, %ecx
>
> ecx will be equal to ?
0x1000 with 8k stacks, so long as %esp is in the top page of the 2 page
stack. 0x0 otherwise. Which explains why the poisoning crashes the kernel
with 4k stacks.
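A minimal sketch of the arithmetic above, assuming 32-bit registers. It emulates the four quoted instructions in C; the function name and the sample addresses in the usage note are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Emulates the four quoted instructions on a 32-bit register value:
 *   movl %esp, %edi
 *   movl %edi, %ecx
 *   andl $mask, %edi
 *   subl %edi, %ecx
 * Returns the final %ecx, the byte count that "rep stosb" would use.
 */
static uint32_t poison_count(uint32_t esp, uint32_t mask)
{
    uint32_t edi = esp;   /* movl %esp, %edi */
    uint32_t ecx = edi;   /* movl %edi, %ecx */

    edi &= mask;          /* andl $mask, %edi */
    ecx -= edi;           /* subl %edi, %ecx */

    /* Only the bits cleared by the mask can survive the subtraction. */
    assert(ecx == (esp & ~mask));
    return ecx;
}
```

With the mask from the patch, poison_count(0xc1235f00, ~0x1000) gives 0x1000 because bit 12 of the address is set, while poison_count(0xc1234f00, ~0x1000) gives 0.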
But there's another problem with Dick Johnson's approach, and that is that
he doesn't clear the poison when a kernel stack is freed. (I don't believe
the kernel does this automatically, though I could be mistaken). And that
means that the results can't be trusted: if you have a string of 20 Qs,
_something's_ overwritten the rest, but that something wasn't necessarily
using the memory as a stack at the time. More than that, with the Qs
spread over two pages it's quite possible for one page to be overwritten
and the other still free with its 20 or so Qs.
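To illustrate why a short run of Qs proves little, here is a user-space sketch. It assumes the scanner simply counts the unbroken run of 'Q' bytes from the low end of a page; stack.c itself is not shown in this thread, so count_poison and both scenarios are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096
#define POISON     'Q'

/* Count the unbroken run of poison bytes from the low end of a page. */
static size_t count_poison(const unsigned char *page, size_t len)
{
    size_t n = 0;

    assert(page != NULL);
    while (n < len && page[n] == POISON)
        n++;
    return n;
}

/* Case 1: a live stack that grew down to within 20 bytes of the low end. */
static size_t deep_stack_case(void)
{
    unsigned char page[PAGE_BYTES];

    memset(page, POISON, sizeof page);
    memset(page + 20, 0xAA, sizeof page - 20);   /* stands in for stack frames */
    return count_poison(page, sizeof page);
}

/* Case 2: a page freed with its poison intact, then reused as an ordinary
 * buffer whose owner wrote everything past byte 20. */
static size_t reused_page_case(void)
{
    unsigned char page[PAGE_BYTES];

    memset(page, POISON, sizeof page);           /* stale poison */
    memset(page + 20, 0x55, sizeof page - 20);   /* unrelated data */
    return count_poison(page, sizeof page);
}
```

Both cases return 20, so the count alone cannot distinguish deep stack use from stale poison on a reused page.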
HTH,
Andrew Wade
Ok, I've come up with a patch to "poison"/mark the kernel stacks with Qs
when they're allocated. (I don't think it'll mark the IRQ stacks though).
I clear the marking before the stacks are freed. The patch should work
with any-sized stacks.
There is one wrinkle though: linux has struct thread_info at the bottom of
the kernel stacks, overwriting some of the Qs. stack.c needs to be modified
to skip the first sizeof(struct thread_info) bytes of a page.
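A sketch of the adjustment described above, under stated assumptions: the scanner sees one whole stack as a byte array, and the caller supplies the size of struct thread_info, since that value is only known from the kernel headers. The name stack_bytes_free and the 52-byte figure in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON 'Q'

/*
 * Free bytes on one stack: the run of poison that starts just above the
 * thread_info area at the low end. ti_bytes stands in for
 * sizeof(struct thread_info).
 */
static size_t stack_bytes_free(const unsigned char *stack,
                               size_t stack_bytes, size_t ti_bytes)
{
    size_t n = ti_bytes;

    assert(stack != NULL && ti_bytes <= stack_bytes);
    while (n < stack_bytes && stack[n] == POISON)
        n++;
    return n - ti_bytes;
}
```

For a 4096-byte stack whose first 52 bytes hold thread_info and whose frames reach down to offset 3000, this reports 2948 free bytes. Scanning from offset 0 instead would report 0, because thread_info has already overwritten the first Qs.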
DISCLAIMER: I am a novice kernel hacker: this patch may not perform as
advertised.
signed-off-by: <[email protected]>
diff -uprN 2.6.15-rc5-mm3/kernel/fork.c ajw/kernel/fork.c
--- 2.6.15-rc5-mm3/kernel/fork.c 2005-12-26 01:07:57.087518486 -0500
+++ ajw/kernel/fork.c 2005-12-26 01:12:24.281198483 -0500
@@ -43,6 +43,7 @@
 #include <linux/rmap.h>
 #include <linux/acct.h>
 #include <linux/cn_proc.h>
+#include <linux/string.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -102,6 +103,7 @@ static kmem_cache_t *mm_cachep;
 
 void free_task(struct task_struct *tsk)
 {
+	memset(tsk->thread_info, 0, THREAD_SIZE);
 	free_thread_info(tsk->thread_info);
 	free_task_struct(tsk);
 }
@@ -171,6 +173,8 @@ static struct task_struct *dup_task_stru
 		return NULL;
 	}
 
+	memset(ti, 'Q', THREAD_SIZE);
+
 	*tsk = *orig;
 	tsk->thread_info = ti;
 	setup_thread_stack(tsk, orig);
I've modified stack.c to handle 4k stacks. It can also provide information
for 8k stacks (fwiw) by changing STACK_GRANULARITY.
It found one stack with only 756 bytes left. I hope it's just due to a
greedy boot-time function as I'm not running anything particularly exotic.
(CIFS & Reiser4).
Unfortunately I don't have any more time to experiment: I'm leaving for
a week.
Andrew Wade
On Mon, 26 Dec 2005 02:42:51 -0500,
Andrew James Wade <[email protected]> wrote:
> Ok, I've come up with a patch to "poison"/mark the kernel stacks with Qs
> when they're allocated. (I don't think it'll mark the IRQ stacks though).
How does this differ from CONFIG_DEBUG_STACKOVERFLOW?
KVER := $(shell uname -r)
KSRC := /lib/modules/$(KVER)/build
PWD = $(shell pwd)

obj-m += stack_avail.o

all: modules

modules:
	$(MAKE) -C $(KSRC) SUBDIRS=$(PWD) BUILD_DIR=$(PWD) modules

ioctl: ioctl.c
	$(CC) $< -o $@

clean:
	@find . \
	\( -name '*.ko' -o -name '.*.cmd' \
	-o -name '*.o' -o -name '*.mod.c' \) \
	-type f -print | xargs rm -f
On Sat, 24 Dec 2005, Denis Vlasenko wrote:
> On Thursday 22 December 2005 23:53, linux-os (Dick Johnson) wrote:
>>
>> Yesterday I sent a patch to add stack-poison so the stack usage
>> could be observed.
>>
>> Today I wrote a small program and tested the stack usage. Both
>> the program and the patch is attached. The result is:
>>
>> Offset : 2ec8f000 Available Stack bytes = 3104
>> Offset : 2ecb1000 Available Stack bytes = 3104
>> Offset : 2ee5f000 Available Stack bytes = 20
>> Offset : 2f36d000 Available Stack bytes = 3104
>> Offset : 2fd09000 Available Stack bytes = 3012
>> Offset : 2fd0b000 Available Stack bytes = 3312
>> Offset : 2fd0f000 Available Stack bytes = 2132
>> Offset : 2fd2f000 Available Stack bytes = 2744
>> Offset : 2fd57000 Available Stack bytes = 2900
>> Offset : 2fdd5000 Available Stack bytes = 1400
>> Offset : 2fe35000 Available Stack bytes = 2832
>> Offset : 2ff3f000 Available Stack bytes = 776
>> Offset : 2ff45000 Available Stack bytes = 3188
>>
>> This, after compiling the kernel. I did not have 4k stacks
>> enabled for this test so any crashing of the stack beyond
>> one page will not hurt the system. This was on linux-2.6.13.4.
>>
>> Anyway, I tried to enable 4k stacks and the machine would
>> not boot past trying to install the first module. It just
>> stopped with the interrupts disabled. So, I am now rebuilding
>> the kernel back as I write this. That's why I am using 2.6.13
>> at the moment.
>>
>> Anyway, getting down to 20 bytes of stack-space available
>> seems to be pretty scary.
>
> + movl %esp, %edi
> + movl %edi, %ecx
> + andl $~0x1000, %edi
> + subl %edi, %ecx
>
> ecx will be equal to ?
Whatever the stack was minus that value ANDed with NOT 0x1000,
i.e. 0x1000 minus the stack already in use. The code assumes
that the stack starts and ends on a 0x1000 (page) boundary.
If that's not true, then all bets are off.
>
> + movb $'Q', %al
> + rep stosb
> --
> vda
>
Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5591.11 BogoMips).
Warning : 98.36% of all statistics are fiction.
On Wednesday 28 December 2005 15:14, linux-os (Dick Johnson) wrote:
> >> Anyway, getting down to 20 bytes of stack-space available
> >> seems to be pretty scary.
> >
> > + movl %esp, %edi
> > + movl %edi, %ecx
> > + andl $~0x1000, %edi
> > + subl %edi, %ecx
> >
> > ecx will be equal to ?
>
> Whatever the stack was minus that value ANDed with NOT 0x1000,
> i.e. 0x1000 minus the stack already in use. The code assumes
> that the stack starts and ends on a 0x1000 (page) boundary.
> If that's not true, then all bets are off.
Hmm. I must be thick today. (esp - (esp & 0xffffefff))
is always equal to (esp & 0x00001000). Which is either 0 or 0x1000.
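The identity above can be checked mechanically. A small sketch, assuming 32-bit unsigned arithmetic; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if esp - (esp & 0xffffefff) == (esp & 0x00001000) for this value. */
static bool identity_holds(uint32_t esp)
{
    return (uint32_t)(esp - (esp & 0xffffefffu)) == (esp & 0x00001000u);
}

/* Check every value in [first, last]. The exit test sits inside the loop
 * so that last == UINT32_MAX does not wrap around forever. */
static bool identity_holds_in_range(uint32_t first, uint32_t last)
{
    assert(first <= last);
    for (uint32_t v = first; ; v++) {
        if (!identity_holds(v))
            return false;
        if (v == last)
            return true;
    }
}
```

The reason it holds for every value: ANDing with 0xffffefff clears only bit 12, so the subtraction leaves exactly that bit, which is 0x1000 when the bit was set and 0 when it was not.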
--
vda
On Tue, 27 Dec 2005 14:12:03 -0700, Frank Sorenson <[email protected]> wrote:
>Andrew James Wade wrote:
>> I've modified stack.c to handle 4k stacks. It can also provide information
>> for 8k stacks (fwiw) by changing STACK_GRANULARITY.
>>
>> It found one stack with only 756 bytes left. I hope it's just due to a
>> greedy boot-time function as I'm not running anything particularly exotic.
>> (CIFS & Reiser4).
>
>Yes, it does appear to be a boot-time function. It eventually becomes
>PID 1, and the stack usage shrinks considerably.
>
Problem I have is the stack poison patch stops box from booting when set
to 4k stacks. Seems to imply boot is within 5 pushl's and a return from
4k? Reiser3 + SATA on K7 + VIA chipset
Grant.
On Wed, 28 Dec 2005, Denis Vlasenko wrote:
> On Wednesday 28 December 2005 15:14, linux-os (Dick Johnson) wrote:
>>>> Anyway, getting down to 20 bytes of stack-space available
>>>> seems to be pretty scary.
>>>
>>> + movl %esp, %edi
>>> + movl %edi, %ecx
>>> + andl $~0x1000, %edi
>>> + subl %edi, %ecx
>>>
>>> ecx will be equal to ?
>>
>> Whatever the stack was minus that value ANDed with NOT 0x1000,
>> i.e. 0x1000 minus the stack already in use. The code assumes
>> that the stack starts and ends on a 0x1000 (page) boundary.
>> If that's not true, then all bets are off.
>
> Hmm. I must be thick today. (esp - (esp & 0xffffefff))
> is always equal to (esp & 0x00001000). Which is either 0 or 0x1000.
> --
> vda
>
Yes, it's supposed to be ~0xfff.
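A sketch of what the corrected mask computes, assuming the stack is aligned to its own power-of-two size. The helper names are hypothetical, and the generalization to sizes other than 0x1000 is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Lowest address of the stack containing esp, for a power-of-two,
 * size-aligned stack. With stack_bytes == 0x1000 the mask is ~0xfff. */
static uint32_t stack_base(uint32_t esp, uint32_t stack_bytes)
{
    assert(stack_bytes != 0 && (stack_bytes & (stack_bytes - 1)) == 0);
    return esp & ~(stack_bytes - 1);
}

/* Bytes between the base and esp: the count "rep stosb" would receive. */
static uint32_t unused_below(uint32_t esp, uint32_t stack_bytes)
{
    return esp - stack_base(esp, stack_bytes);
}
```

For esp = 0xc1235f00 and a 0x1000-byte stack, the base is 0xc1235000 and the count is 0xf00, rather than the 0 or 0x1000 that the ~0x1000 mask produces.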
Cheers,
Dick Johnson
Penguin : Linux version 2.6.13.4 on an i686 machine (5591.11 BogoMips).
Warning : 98.36% of all statistics are fiction.