2004-01-14 09:06:51

by Andi Kleen

Subject: [PATCH] Add CONFIG for -mregparm=3


Using -mregparm=3 shrinks the kernel further:

(compiled with gcc 3.4, without -funit-at-a-time; using the latter
together with -Os shrinks .text even more, making over 700KB difference)

   text    data     bss     dec     hex filename
4129346  708629  207240 5045215  4cfbdf vmlinux
3892905  708629  207240 4808774  496046 vmlinux-regparm

This one helps even more, >236KB .text difference. Clearly worth
the effort.

This patch adds an option to use -mregparm=3 while compiling the kernel.
I did an LTP run and it showed no additional failures over a non-regparm
kernel.

According to some gcc developers it should be safe to use in all
gccs that are still supported (2.95 and up).

I didn't make it the default because it will break all binary only
modules (although they can be fixed by adding a wrapper that
calls them with "asmlinkage"). Actually it may be a good idea to
make this default with 2.7.1 or somesuch.

diff -u linux-34/arch/i386/Kconfig-o linux-34/arch/i386/Kconfig
--- linux-34/arch/i386/Kconfig-o 2004-01-09 09:27:09.000000000 +0100
+++ linux-34/arch/i386/Kconfig 2004-01-14 08:43:29.815530072 +0100
@@ -820,6 +820,14 @@
depends on (((X86_SUMMIT || X86_GENERICARCH) && NUMA) || (X86 && EFI))
default y

+config REGPARM
+ bool "Use register arguments (EXPERIMENTAL)"
+ default n
+ help
+ Compile the kernel with -mregparm=3. This uses a different ABI
+ and passes the first three arguments of a function call in registers.
+ This will probably break binary only modules.
+
endmenu


diff -u linux-34/arch/i386/Makefile-o linux-34/arch/i386/Makefile
--- linux-34/arch/i386/Makefile-o 2003-09-28 10:53:14.000000000 +0200
+++ linux-34/arch/i386/Makefile 2004-01-13 20:16:32.000000000 +0100
@@ -47,6 +47,8 @@
cflags-$(CONFIG_MCYRIXIII) += $(call check_gcc,-march=c3,-march=i486) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
cflags-$(CONFIG_MVIAC3_2) += $(call check_gcc,-march=c3-2,-march=i686)

+cflags-$(CONFIG_REGPARM) += -mregparm=3
+
CFLAGS += $(cflags-y)

# Default subarch .c files




2004-01-14 09:18:40

by Russell King

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 10:06:03AM +0100, Andi Kleen wrote:
> Using -mregparm=3 shrinks the kernel further:

Note that there is a dependence on this patch - as highlighted by Arjan,
CardServices() breaks when built on x86 with -mregparm.

Therefore, the CardServices() patches need to be merged into any tree
prior to this patch.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 PCMCIA - http://pcmcia.arm.linux.org.uk/
2.6 Serial core

2004-01-14 09:41:45

by Andi Kleen

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 10:34:59AM +0100, Arjan van de Ven wrote:
> On Wed, 2004-01-14 at 10:06, Andi Kleen wrote:
>
> >
> > According to some gcc developers it should be safe to use in all
> > gccs that are still supported (2.95 and up)
>
> it is not safe for the kernel until the cardbus CardServices patches get
> merged (is in -mm), for the same reason CardServices() is broken on
> amd64.

Just mark them asmlinkage then.

It would be a shame to leave that much space saving on the table just
for a single misdesigned API that can be easily fixed.

-Andi

2004-01-14 09:38:28

by Andi Kleen

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 01:29:28AM -0800, Andrew Morton wrote:
> Andi Kleen wrote:
> >
> > I didn't make it the default because it will break all binary only
> > modules (although they can be fixed by adding a wrapper that
> > calls them with "asmlinkage"). Actually it may be a good idea to
> > make this default with 2.7.1 or somesuch.
>
> yes, that is a hassle. But for these sorts of gains, it's worth pursuing
> it a bit further.
>
> How _much_ of a hassle it will be I can not say - I'd be looking to vendors

I think the popular modules like nvidia or ATI could be fixed
relatively easily. They usually consist of a glue layer with source and a
binary blob that is only called from the glue layer. Basically all you
have to do is mark the prototypes for the binary blob in the glue layer
as "asmlinkage". In addition this can be done without any ifdefs
because asmlinkage does the right thing on a non-regparm kernel.

Of course true binary-only modules without a glue layer would be more
difficult, but for those the vendors just have to recompile. Conceivably
it would be possible to write a glue layer even for them.

> to advise before merging this into mainline.

I'm not sure why vendors should care as long as it's only a CONFIG_*.

The option is clearly more aimed at "kernel self compiler operators" for
now.

-Andi

2004-01-14 09:41:46

by Arjan van de Ven

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, 2004-01-14 at 10:06, Andi Kleen wrote:

>
> According to some gcc developers it should be safe to use in all
> gccs that are still supported (2.95 and up)

it is not safe for the kernel until the cardbus CardServices patches get
merged (is in -mm), for the same reason CardServices() is broken on
amd64.




2004-01-14 09:32:38

by Andrew Morton

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

Andi Kleen wrote:
>
> I didn't make it the default because it will break all binary only
> modules (although they can be fixed by adding a wrapper that
> calls them with "asmlinkage"). Actually it may be a good idea to
> make this default with 2.7.1 or somesuch.

yes, that is a hassle. But for these sorts of gains, it's worth pursuing
it a bit further.

How _much_ of a hassle it will be I can not say - I'd be looking to vendors
to advise before merging this into mainline.

I changed your patch to make it dependent on CONFIG_EXPERIMENTAL.

2004-01-14 09:32:40

by Andi Kleen

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 09:16:30AM +0000, Russell King wrote:
> On Wed, Jan 14, 2004 at 10:06:03AM +0100, Andi Kleen wrote:
> > Using -mregparm=3 shrinks the kernel further:
>
> Note that there is a dependence on this patch - as highlighted by Arjan,
> CardServices() breaks when built on x86 with -mregparm.
>
> Therefore, the CardServices() patches need to be merged into any tree
> prior to this patch.

Ah, because of the broken prototypes again?

You could just mark them all asmlinkage to force stack arguments.

Or just fix the prototypes.

-Andi

2004-01-14 09:45:30

by Arjan van de Ven

Subject: Re: [PATCH] Add CONFIG for -mregparm=3


On Wed, Jan 14, 2004 at 10:39:40AM +0100, Andi Kleen wrote:
> On Wed, Jan 14, 2004 at 10:34:59AM +0100, Arjan van de Ven wrote:
> > On Wed, 2004-01-14 at 10:06, Andi Kleen wrote:
> >
> > >
> > > According to some gcc developers it should be safe to use in all
> > > gccs that are still supported (2.95 and up)
> >
> > it is not safe for the kernel until the cardbus CardServices patches get
> > merged (is in -mm), for the same reason CardServices() is broken on
> > amd64.
>
> Just mark them asmlinkage then.
>
> It would be a shame to leave that much space saving on the table just
> for a single misdesigned API that can be easily fixed.

Oh, I'd rather just fix the API, period... :)
Patches exist and work excellently for me.



2004-01-14 09:51:11

by Andrew Morton

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

Arjan van de Ven wrote:
>
> On Wed, 2004-01-14 at 10:06, Andi Kleen wrote:
>
> >
> > According to some gcc developers it should be safe to use in all
> > gccs that are still supported (2.95 and up)
>
> it is not safe for the kernel until the cardbus CardServices patches get
> merged (is in -mm), for the same reason CardServices() is broken on
> amd64.

The CardServices API migration work is complete. It'll probably appear in
2.6.2.

2004-01-14 09:54:39

by Arjan van de Ven

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, 2004-01-14 at 10:29, Andrew Morton wrote:

> How _much_ of a hassle it will be I can not say - I'd be looking to vendors
> to advise before merging this into mainline.

I am compiling my kernel rpms with this already and have the full
intention to keep doing that into production.



2004-01-14 19:28:47

by Adrian Bunk

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 10:35:56AM +0100, Andi Kleen wrote:
>...
> I think the popular modules like nvidia or ATI could be fixed
> relatively easily. They usually consist of a glue layer with source and a
> binary blob that is only called from the glue layer. Basically all you
> have to do is mark the prototypes for the binary blob in the glue layer
> as "asmlinkage". In addition this can be done without any ifdefs
> because asmlinkage does the right thing on a non-regparm kernel.
>
> Of course true binary-only modules without a glue layer would be more
> difficult, but for those the vendors just have to recompile. Conceivably
> it would be possible to write a glue layer even for them.
>...

Did I miss Linus announcing a stable ABI between kernel versions?

If some binary module vendor tries to support more than one kernel
version it's his problem - this is nothing that is officially supported
by the Linux kernel.

> -Andi

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2004-01-15 01:39:25

by Rusty Russell

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, 14 Jan 2004 10:06:03 +0100
Andi Kleen wrote:
> I didn't make it the default because it will break all binary only
> modules (although they can be fixed by adding a wrapper that
> calls them with "asmlinkage"). Actually it may be a good idea to
> make this default with 2.7.1 or somesuch.

Who cares. Anyway, if kept as a config option, this should probably be
added to MODULE_ARCH_VERMAGIC in include/asm-i386/module.h.

Thanks,
Rusty.
--
there are those who do and those who hang on and you don't see too
many doers quoting their contemporaries. -- Larry McVoy

2004-01-15 09:21:37

by Andi Kleen

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Thu, Jan 15, 2004 at 11:40:11AM +1100, Rusty Russell wrote:
> On Wed, 14 Jan 2004 10:06:03 +0100
> Andi Kleen wrote:
> > I didn't make it the default because it will break all binary only
> > modules (although they can be fixed by adding a wrapper that
> > calls them with "asmlinkage"). Actually it may be a good idea to
> > make this default with 2.7.1 or somesuch.
>
> Who cares. Anyway, if kept as a config option, this should probably be
> added to MODULE_ARCH_VERMAGIC in include/asm-i386/module.h.

Ok. Good point.

On second thought I'm actually not opposed to making it the default,
but Linus/Andrew have to decide if they want this. It certainly
would be a good strategy longer term (even though it would eliminate
some of the advantages x86-64 currently enjoys over i386 ;-)

New patch appended.

-Andi

----------------------------------------

Add CONFIG_REGPARM option to enable compilation with -mregparm=3.
This shrinks the kernel .text considerably.

This could be made default later when it has been more tested.


diff -u linux-34/arch/i386/Kconfig-o linux-34/arch/i386/Kconfig
--- linux-34/arch/i386/Kconfig-o 2004-01-09 09:27:09.000000000 +0100
+++ linux-34/arch/i386/Kconfig 2004-01-14 08:43:29.000000000 +0100
@@ -820,6 +820,14 @@
depends on (((X86_SUMMIT || X86_GENERICARCH) && NUMA) || (X86 && EFI))
default y

+config REGPARM
+ bool "Use register arguments (EXPERIMENTAL)"
+ default n
+ help
+ Compile the kernel with -mregparm=3. This uses a different ABI
+ and passes the first three arguments of a function call in registers.
+ This will probably break binary only modules.
+
endmenu


diff -u linux-34/arch/i386/Makefile-o linux-34/arch/i386/Makefile
--- linux-34/arch/i386/Makefile-o 2003-09-28 10:53:14.000000000 +0200
+++ linux-34/arch/i386/Makefile 2004-01-13 20:16:32.000000000 +0100
@@ -47,6 +47,8 @@
cflags-$(CONFIG_MCYRIXIII) += $(call check_gcc,-march=c3,-march=i486) $(align)-functions=0 $(align)-jumps=0 $(align)-loops=0
cflags-$(CONFIG_MVIAC3_2) += $(call check_gcc,-march=c3-2,-march=i686)

+cflags-$(CONFIG_REGPARM) += -mregparm=3
+
CFLAGS += $(cflags-y)

# Default subarch .c files
diff -u linux-34/include/asm-i386/module.h-o linux-34/include/asm-i386/module.h
--- linux-34/include/asm-i386/module.h-o 2003-05-27 03:00:24.000000000 +0200
+++ linux-34/include/asm-i386/module.h 2004-01-15 10:15:15.686788608 +0100
@@ -52,6 +52,12 @@
#error unknown processor family
#endif

-#define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY
+#ifdef CONFIG_REGPARM
+#define MODULE_REGPARM "REGPARM "
+#else
+#define MODULE_REGPARM ""
+#endif
+
+#define MODULE_ARCH_VERMAGIC MODULE_PROC_FAMILY MODULE_REGPARM

#endif /* _ASM_I386_MODULE_H */



2004-01-15 19:56:26

by Andrea Arcangeli

Subject: Re: [PATCH] Add CONFIG for -mregparm=3

On Wed, Jan 14, 2004 at 08:25:06PM +0100, Adrian Bunk wrote:
> On Wed, Jan 14, 2004 at 10:35:56AM +0100, Andi Kleen wrote:
> >...
> > I think the popular modules like nvidia or ATI could be fixed
> > relatively easily. They usually consist of a glue layer with source and a
> > binary blob that is only called from the glue layer. Basically all you
> > have to do is mark the prototypes for the binary blob in the glue layer
> > as "asmlinkage". In addition this can be done without any ifdefs
> > because asmlinkage does the right thing on a non-regparm kernel.
> >
> > Of course true binary-only modules without a glue layer would be more
> > difficult, but for those the vendors just have to recompile. Conceivably
> > it would be possible to write a glue layer even for them.
> >...
>
> Did I miss Linus announcing a stable ABI between kernel versions?
>
> If some binary module vendor tries to support more than one kernel
> version it's his problem - this is nothing that is officially supported
> by the Linux kernel.

agreed.

this is the sort of stuff that shouldn't be a config option, it
exercises new paths in gcc as well, _all_ the userbase has to use it, or
it's not worth the risk/pain IMHO.

2004-01-17 20:16:43

by Sander

Subject: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

Hi Andi and Andrew,

Andi Kleen wrote (ao):
> Using -mregparm=3 shrinks the kernel further:

...

> This patch adds an option to use -mregparm=3 while compiling the kernel.
> I did an LTP run and it showed no additional failures over a non-regparm
> kernel.
>
> According to some gcc developers it should be safe to use in all
> gccs that are still supported (2.95 and up)
>
> I didn't make it the default because it will break all binary only
> modules (although they can be fixed by adding a wrapper that
> calls them with "asmlinkage"). Actually it may be a good idea to
> make this default with 2.7.1 or somesuch.

...

> +config REGPARM
> + bool "Use register arguments (EXPERIMENTAL)"
> + default n
> + help
> + Compile the kernel with -mregparm=3. This uses a different ABI
> + and passes the first three arguments of a function call in registers.
> + This will probably break binary only modules.

This gives several oopses on my system during boot, after which it seems
dead.

2.6.1-mm4
VIA C3 Ezra

It mounts its root filesystem over nfs and has netconsole compiled in.

Without the REGPARM option the system boots and runs fine.

Should I post the oopses, the result of ksymoops, a dmesg and kernel
config or is this an already known issue?

Kind regards, Sander

--
Humilis IT Services and Solutions
http://www.humilis.net

2004-01-17 20:52:08

by Andi Kleen

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

> 2.6.1-mm4

Note that this kernel is broken on gcc 3.4 and on 3.3-hammer. If you're
using that disable the -funit-at-a-time setting in the main Makefile.

> VIA C3 Ezra
>
> It mounts its root filesystem over nfs and has netconsole compiled in.
>
> Without the REGPARM option the system boots and runs fine.
>
> Should I post the oopses, the result of ksymoops, a dmesg and kernel
> config or is this an already known issue?

Not known. Please post the decoded oopses. Also give your compiler
version.

-Andi

2004-01-17 21:07:22

by Sander

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

Andi Kleen wrote (ao):
> > 2.6.1-mm4
>
> Note that this kernel is broken on gcc 3.4 and on 3.3-hammer. If
> you're using that disable the -funit-at-a-time setting in the main
> Makefile.

> > VIA C3 Ezra
> >
> > It mounts its root filesystem over nfs and has netconsole compiled
> > in.
> >
> > Without the REGPARM option the system boots and runs fine.
> >
> > Should I post the oopses, the result of ksymoops, a dmesg and kernel
> > config or is this an already known issue?
>
> Not known. Please post the decoded oopses. Also give your compiler
> version.

Hope this helps. The system runs fine with the option disabled.

gcc (GCC) 3.3.3 20040110 (prerelease) (Debian)

I ran ksymoops on another system, but used the vmlinux and System.map
from the mentioned oopsing system.

The full output is very long (3657 lines), so I only post the first 100
or so lines. Do you need them all?


ksymoops 2.4.9 on i686 2.6.0-test11. Options used
-v /tmp/vmlinux (specified)
-K (specified)
-L (specified)
-O (specified)
-m /tmp/System.map (specified)

Unable to handle kernel paging request at virtual address 249579f8
c012c19d
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010046
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000d edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 00002323 23232323 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000202
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb
Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
Warning (Oops_code): trailing garbage ignored on Code: line
Text: 'Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8'
Garbage: 'Unable to handle kernel paging request at virtual address 249579f8'
Error (Oops_code_values): invalid value 0x1 in Code line, must be 2, 4, 8 or 16 digits, value ignored


>>EIP; c012c19d <sys_setsid+7d/a0> <=====

>>eax; c1557940 <_end+119d158/3fc42818>
>>ebx; c1554000 <_end+1199818/3fc42818>
>>ecx; c1557a4c <_end+119d264/3fc42818>
>>edx; c1557990 <_end+119d1a8/3fc42818>
>>ebp; c1554000 <_end+1199818/3fc42818>
>>esp; c1555fb8 <_end+119b7d0/3fc42818>

Trace; c02a01d7 <syscall_call+7/b>

This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.

Code; c012c18a <sys_setsid+6a/a0>
00000000 <_EIP>:
Code; c012c18a <sys_setsid+6a/a0>
0: 00 00 add %al,(%eax)
Code; c012c18c <sys_setsid+6c/a0>
2: 52 push %edx
Code; c012c18d <sys_setsid+6d/a0>
3: 8b 80 88 00 00 00 mov 0x88(%eax),%eax
Code; c012c193 <sys_setsid+73/a0>
9: 50 push %eax
Code; c012c194 <sys_setsid+74/a0>
a: e8 0f 49 ff ff call ffff491e <_EIP+0xffff491e>
Code; c012c199 <sys_setsid+79/a0>
f: 5e pop %esi
Code; c012c19a <sys_setsid+7a/a0>
10: 58 pop %eax
Code; c012c19b <sys_setsid+7b/a0>
11: 8b 03 mov (%ebx),%eax

This decode from eip onwards should be reliable

Code; c012c19d <sys_setsid+7d/a0>
00000000 <_EIP>:

c012c19d
*pde = 00000000
Oops: 0000 [#2]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000e edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb
Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
Warning (Oops_code): trailing garbage ignored on Code: line
Text: 'Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8'
Garbage: 'Unable to handle kernel paging request at virtual address 249579f8'
Error (Oops_code_values): invalid value 0x1 in Code line, must be 2, 4, 8 or 16 digits, value ignored


>>EIP; c012c19d <sys_setsid+7d/a0> <=====

>>eax; c1557940 <_end+119d158/3fc42818>
>>ebx; c1554000 <_end+1199818/3fc42818>
>>ecx; c1557a4c <_end+119d264/3fc42818>
>>edx; c1557990 <_end+119d1a8/3fc42818>
>>ebp; c1554000 <_end+1199818/3fc42818>
>>esp; c1555fb8 <_end+119b7d0/3fc42818>

Trace; c02a01d7 <syscall_call+7/b>

This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.

Code; c012c18a <sys_setsid+6a/a0>
00000000 <_EIP>:
Code; c012c18a <sys_setsid+6a/a0>
0: 00 00 add %al,(%eax)
Code; c012c18c <sys_setsid+6c/a0>
2: 52 push %edx
Code; c012c18d <sys_setsid+6d/a0>
3: 8b 80 88 00 00 00 mov 0x88(%eax),%eax
Code; c012c193 <sys_setsid+73/a0>
9: 50 push %eax
Code; c012c194 <sys_setsid+74/a0>
a: e8 0f 49 ff ff call ffff491e <_EIP+0xffff491e>
Code; c012c199 <sys_setsid+79/a0>
f: 5e pop %esi
Code; c012c19a <sys_setsid+7a/a0>
10: 58 pop %eax
Code; c012c19b <sys_setsid+7b/a0>
11: 8b 03 mov (%ebx),%eax

This decode from eip onwards should be reliable

Code; c012c19d <sys_setsid+7d/a0>
00000000 <_EIP>:

c012c19d
*pde = 00000000
Oops: 0000 [#3]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000f edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb
Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
Warning (Oops_code): trailing garbage ignored on Code: line
Text: 'Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8'
Garbage: 'Unable to handle kernel paging request at virtual address 249579f8'
Error (Oops_code_values): invalid value 0x1 in Code line, must be 2, 4, 8 or 16 digits, value ignored


>>EIP; c012c19d <sys_setsid+7d/a0> <=====

>>eax; c1557940 <_end+119d158/3fc42818>
>>ebx; c1554000 <_end+1199818/3fc42818>
>>ecx; c1557a4c <_end+119d264/3fc42818>
>>edx; c1557990 <_end+119d1a8/3fc42818>
>>ebp; c1554000 <_end+1199818/3fc42818>
>>esp; c1555fb8 <_end+119b7d0/3fc42818>

Trace; c02a01d7 <syscall_call+7/b>

This architecture has variable length instructions, decoding before eip
is unreliable, take these instructions with a pinch of salt.

Code; c012c18a <sys_setsid+6a/a0>
00000000 <_EIP>:
Code; c012c18a <sys_setsid+6a/a0>
0: 00 00 add %al,(%eax)
Code; c012c18c <sys_setsid+6c/a0>
2: 52 push %edx
Code; c012c18d <sys_setsid+6d/a0>
3: 8b 80 88 00 00 00 mov 0x88(%eax),%eax
Code; c012c193 <sys_setsid+73/a0>
9: 50 push %eax
Code; c012c194 <sys_setsid+74/a0>
a: e8 0f 49 ff ff call ffff491e <_EIP+0xffff491e>
Code; c012c199 <sys_setsid+79/a0>
f: 5e pop %esi
Code; c012c19a <sys_setsid+7a/a0>
10: 58 pop %eax
Code; c012c19b <sys_setsid+7b/a0>
11: 8b 03 mov (%ebx),%eax


--
Humilis IT Services and Solutions
http://www.humilis.net

2004-01-17 21:28:06

by Andi Kleen

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

On Sat, Jan 17, 2004 at 10:07:15PM +0100, Sander wrote:
> Andi Kleen wrote (ao):
> > > 2.6.1-mm4
> >
> > Note that this kernel is broken on gcc 3.4 and on 3.3-hammer. If
> > you're using that disable the -funit-at-a-time setting in the main
> > Makefile.
>
> > > VIA C3 Ezra
> > >
> > > It mounts its root filesystem over nfs and has netconsole compiled
> > > in.
> > >
> > > Without the REGPARM option the system boots and runs fine.
> > >
> > > Should I post the oopses, the result of ksymoops, a dmesg and kernel
> > > config or is this an already known issue?
> >
> > Not known. Please post the decoded oopses. Also give your compiler
> > version.
>
> Hope this helps. The system runs fine with the option disabled.

Can you perhaps save your .config, do a make distclean and try
to compile the kernel from scratch again? Maybe you had some stale object
files around.

-Andi

2004-01-17 22:01:30

by Mike Fedyk

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

On Sat, Jan 17, 2004 at 10:28:57PM +0100, Andi Kleen wrote:
> On Sat, Jan 17, 2004 at 10:07:15PM +0100, Sander wrote:
> > Andi Kleen wrote (ao):
> > > > 2.6.1-mm4
> > >
> > > Note that this kernel is broken on gcc 3.4 and on 3.3-hammer. If
> > > you're using that disable the -funit-at-a-time setting in the main
> > > Makefile.
> >
> > > > VIA C3 Ezra
> > > >
> > > > It mounts its root filesystem over nfs and has netconsole compiled
> > > > in.
> > > >
> > > > Without the REGPARM option the system boots and runs fine.
> > > >
> > > > Should I post the oopses, the result of ksymoops, a dmesg and kernel
> > > > config or is this an already known issue?
> > >
> > > Not known. Please post the decoded oopses. Also give your compiler
> > > version.
> >
> > Hope this helps. The system runs fine with the option disabled.
>
> Can you perhaps save your .config, do a make distclean and try
> to compile the kernel from scratch again? Maybe you had some stale object
> files around.

Also, turn on kksymoops so that you'll get symbols in your oops reports, and
no need for ksymoops in userspace.

2004-01-18 05:44:48

by Sander

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

Andi Kleen wrote (ao):
> On Sat, Jan 17, 2004 at 10:07:15PM +0100, Sander wrote:
> > > > Without the REGPARM option the system boots and runs fine.
> > > >
> > > > Should I post the oopses, the result of ksymoops, a dmesg and
> > > > kernel config or is this an already known issue?
> > >
> > > Not known. Please post the decoded oopses. Also give your
> > > compiler version.
> >
> > Hope this helps. The system runs fine with the option disabled.
>
> Can you perhaps save your .config, do a make distclean and try
> to compile the kernel from scratch again? Maybe you had some stale
> object files around.

I'm terribly sorry to say that I can't reproduce the oopses
anymore .. :-( Maybe something went wrong during the tftp of the
kernel, or I made a mistake ..

Sorry for wasting your time ..

sander

--
Humilis IT Services and Solutions
http://www.humilis.net

2004-01-18 20:35:41

by Sander

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

Sander wrote (ao):
> Andi Kleen wrote (ao):
> > On Sat, Jan 17, 2004 at 10:07:15PM +0100, Sander wrote:
> > > > > Without the REGPARM option the system boots and runs fine.
> > > > >
> > > > > Should I post the oopses, the result of ksymoops, a dmesg and
> > > > > kernel config or is this an already known issue?
> > > >
> > > > Not known. Please post the decoded oopses. Also give your
> > > > compiler version.
> > >
> > > Hope this helps. The system runs fine with the option disabled.
> >
> > Can you perhaps save your .config, do a make distclean and try
> > to compile the kernel from scratch again? Maybe you had some stale
> > object files around.
>
> I'm terribly sorry to say that I can't reproduce the oopses
> anymore .. :-( Maybe something went wrong during the tftp of the
> kernel, or I made a mistake ..
>
> Sorry for wasting your time ..

I got another oops now, and discovered that I had kksymoops (KALLSYMS)
already enabled as Mike requested.

kernel 2.6.1-mm4, C3 cpu (via mini-itx mobo)
gcc (GCC) 3.3.3 20040110 (prerelease) (Debian)

I have no idea if it is hardware or software related, and if it has got
anything to do with the REGPARM option, but I entered this thread
because the kernel oopsed the first time I booted it and the first time
I enabled this option.

Btw, I always do rm -rf linux-2.6.1 ; tar jxf linux-2.6.1.tar.bz2 etc
before a compile.

I'll provide the .config if needed, but thought this message is long
enough already. Is this report of any use?

This is the oops I got now when doing an apt-get install. It was idle for
some hours after an afternoon of stress testing:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c029fa14
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c029fa14>] Not tainted VLI
EFLAGS: 00010282
EIP is at proc_dodebug+0xa74/0x10c0
eax: fffffed0 ebx: 00000000 ecx: 00000003 edx: db098d20
esi: db098d3a edi: cc380010 ebp: 000004f8 esp: c0357ce4
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0356000 task=c02eba60)
Stack: 000000d0 000005c8 000000d0 c0248c60 db098d2a cc380000 000000d0 00000000
00000000 00000000 00000001 00000000 c0357da0 c0126bba 00000000 00000001
000000d0 00001000 00000b90 00001088 c0248dec dacd1be0 000004f8 cc380000
Call Trace:
[<c0248c60>] skb_copy_and_csum_bits+0x50/0x2a0
[<c0126bba>] update_process_times+0x2a/0x30
[<c0248dec>] skb_copy_and_csum_bits+0x1dc/0x2a0
[<c0291828>] skb_read_and_csum_bits+0x28/0x60
[<c029b025>] xdr_partial_copy_from_skb+0x105/0x150
[<c02918b4>] csum_partial_copy_to_xdr+0x54/0x100
[<c0291800>] skb_read_and_csum_bits+0x0/0x60
[<c0291a75>] udp_data_ready+0x115/0x1f0
[<c027db48>]

(ends here)

And these are the first few oopses from yesterday's report during startup:

Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000d edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 13, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 00002323 23232323 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000202
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#2]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000e edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 14, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#3]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 0000000f edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 15, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#4]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 00000010 edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 16, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#5]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 00000011 edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 17, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#6]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 00000012 edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 18, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#7]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 00000013 edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 19, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb

Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
printing eip:
c012c19d
*pde = 00000000
Oops: 0000 [#8]
CPU: 0
EIP: 0060:[<c012c19d>] Not tainted VLI
EFLAGS: 00010046
EIP is at sys_setsid+0x7d/0xa0
eax: c1557940 ebx: c1554000 ecx: c1557a4c edx: c1557990
esi: 00000014 edi: bffffaa0 ebp: c1554000 esp: c1555fb8
ds: 007b es: 007b ss: 0068
Process init (pid: 20, threadinfo=c1554000 task=c1557940)
Stack: bffffba0 bffffb20 c02a01d7 bffffba0 61630053 32323232 bffffb20 bffffaa0
bffffcc8 00000042 0000007b 0000007b 00000042 400c8b17 00000073 00000246
bffff9fc 0000007b
Call Trace:
[<c02a01d7>] syscall_call+0x7/0xb
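
Every oops above reports the same faulting address (249579f8) and the same EIP (sys_setsid+0x7d/0xa0); only the pid changes. When comparing a boot with the option against one without it, that is easier to check mechanically than by eye. The following is a minimal sketch, assuming the console output has been saved as text. The function name summarize_oopses and the SAMPLE_LOG excerpt are hypothetical and only for illustration; the regular expressions assume the i386 oops layout shown above.

```python
import re
from collections import Counter

# Hypothetical excerpt cut down from the oopses above; in practice,
# read the saved console log from a file instead.
SAMPLE_LOG = """\
Unable to handle kernel paging request at virtual address 249579f8
Oops: 0000 [#1]
EIP is at sys_setsid+0x7d/0xa0
Process init (pid: 13, threadinfo=c1554000 task=c1557940)
Code: 00 00 52 8b 80 88 00 00 00 50 e8 0f 49 ff ff 5e 58 8b 03 <1>Unable to handle kernel paging request at virtual address 249579f8
Oops: 0000 [#2]
EIP is at sys_setsid+0x7d/0xa0
Process init (pid: 14, threadinfo=c1554000 task=c1557940)
"""

# Patterns are searched anywhere in the text, so a header fused onto the
# end of a "Code:" line (as in the log above) is still found.
ADDR_RE = re.compile(
    r"Unable to handle kernel paging request at virtual address ([0-9a-f]+)"
)
EIP_RE = re.compile(r"EIP is at (\S+)")
PID_RE = re.compile(r"Process \S+ \(pid: (\d+),")


def summarize_oopses(log_text):
    """Return (Counter of (fault address, EIP symbol) pairs, list of pids).

    Assumes each oops in the log is complete, so that the n-th address
    belongs with the n-th EIP line.
    """
    addrs = ADDR_RE.findall(log_text)
    eips = EIP_RE.findall(log_text)
    pids = [int(pid) for pid in PID_RE.findall(log_text)]
    return Counter(zip(addrs, eips)), pids


if __name__ == "__main__":
    sites, pids = summarize_oopses(SAMPLE_LOG)
    for (addr, eip), count in sites.items():
        print(f"{count} oops(es) at {eip}, fault address {addr}")
    print("pids:", pids)
```

A single (address, EIP) pair across all oopses, as here, means the same instruction faults each time; running the same summary on a log from a kernel built without the option gives a direct comparison.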


--
Humilis IT Services and Solutions
http://www.humilis.net

2004-01-18 20:59:31

by Andi Kleen

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

> I have no idea if it is hardware or software related, and if it has got
> anything to do with the REGPARM option, but I entered this thread
> because the kernel oopsed the first time I booted it and the first time
> I enabled this option.

Do the oopses go away when you disable the option? And do they come back
when you re-enable it?

You could run memtest86 to make sure your RAM is ok.

-Andi

2004-01-19 06:47:59

by Sander

Subject: Re: several oopses during boot (was: Re: [PATCH] Add CONFIG for -mregparm=3)

Andi Kleen wrote (ao):
> > I have no idea if it is hardware or software related, and if it has
> > got anything to do with the REGPARM option, but I entered this
> > thread because the kernel oopsed the first time I booted it and the
> > first time I enabled this option.
>
> Do the oopses go away when you disable the option? And do they come
> back when you re-enable it?

I'll have to try that, but I have no reliable way to reproduce the oopses
yet, which makes that a bit hard.

> You could run memtest86 to make sure your RAM is ok.

I'll do that.

--
Humilis IT Services and Solutions
http://www.humilis.net