2004-11-17 10:13:18

by Sumit Pandya

Subject: OOPS - APIC or othere?

Hi All,
At one of our clients I ran into a timer problem with kernel 2.4.26 and tried
to fix it by patching "arch/i386/kernel/mpparse.c" with the version taken from
patch-2.4.27.
... ... ...
Mikael Pettersson:
o i386 and x86_64 ACPI mpparse timer bug
... ... ...
After booting the system I now get an OOPS. Did I apply only a partial patch
by taking just the mpparse.c changes from the whole bunch? Does that break a
dependency on some other functionality? I have ACPI support enabled in the
kernel.
Would the following patch from Len solve my OOPS?
ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.4.26-rc4/20040422153228-irq2.patch

Here is the output of ksymoops.

Unable to handle Kernel NULL pointer dereference at virtual address 00000001
c02c4d80
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c02c4d80>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010293
eax: 00000000 ebx: 00000011 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00000000 ebp: c02bbf38 esp: c02bbf2c
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c02bb000)
Stack: C02bbf78 fffffffb c02bbf80 c02bbf9c c02c50de 00000000 00000000 00000000
       00001000 00000011 00000010 c02bbf78 00001000 00000000 00000246 00000001
       00000286 00000000 00000000 00000900 01000000 00000016 c02bbf98 c0117930
Call Trace: [<c0117930>] [<c0105000>] [<c0117930>] [<c0105000>] [<c0105000>]
Code: 0f b6 44 d1 01 3b 45 10 75 26 0f b6 44 d7 06 3a 86 01 dc 2d

>>EIP; c02c4d80 <find_irq_entry+30/70> <=====
Trace; c0117930 <printk+100/110>
Trace; c0105000 <_stext+0/0>
Trace; c0117930 <printk+100/110>
Trace; c0105000 <_stext+0/0>
Trace; c0105000 <_stext+0/0>
Code; c02c4d80 <find_irq_entry+30/70>
00000000 <_EIP>:
Code; c02c4d80 <find_irq_entry+30/70> <=====
0: 0f b6 44 d1 01 movzbl 0x1(%ecx,%edx,8),%eax <=====
Code; c02c4d85 <find_irq_entry+35/70>
5: 3b 45 10 cmp 0x10(%ebp),%eax
Code; c02c4d88 <find_irq_entry+38/70>
8: 75 26 jne 30 <_EIP+0x30> c02c4db0 <find_irq_entry+60/70>
Code; c02c4d8a <find_irq_entry+3a/70>
a: 0f b6 44 d7 06 movzbl 0x6(%edi,%edx,8),%eax
Code; c02c4d8f <find_irq_entry+3f/70>
f: 3a 86 01 dc 2d 00 cmp 0x2ddc01(%esi),%al

<0>Kernel Panic: Attempted to kill the idle task!

I can see the following messages before the OOPS:
Enabled ExtINT on CPU#0
ESR Value before enabling vector : 00000000
ESR Value after enabling vector : 00000000
ENABLING IO-APIC IRQs
... the OOPS starts here ...

One more point: with the same kernel source and configuration, but with SMP
enabled and 2 processors, the kernel boots. The oops only shows up on
uniprocessor.


2004-11-17 10:54:12

by Mikael Pettersson

Subject: Re: OOPS - APIC or othere?

Sumit Pandya writes:
> Hi All,
> At one of our clients I ran into a timer problem with kernel 2.4.26 and tried
> to fix it by patching "arch/i386/kernel/mpparse.c" with the version taken from
> patch-2.4.27.
> ... ... ...
> Mikael Pettersson:
> o i386 and x86_64 ACPI mpparse timer bug
> ... ... ...
> After booting the system I now get an OOPS. Did I apply only a partial patch
> by taking just the mpparse.c changes from the whole bunch? Does that break a
> dependency on some other functionality? I have ACPI support enabled in the
> kernel.

The effect of the bug was that the timer generated twice as
many interrupts, making the kernel's wall-clock timer twice
as fast.

There were no OOPS issues related to that patch. Therefore,
your OOPS indicates dependencies on other changes in mpparse
and/or the ACPI code. Why hack a 2.4.26 kernel in this way?
Just put a 2.4.27 or 2.4.28-rc4 in there and be done with it :-)

/Mikael

2004-11-17 16:47:53

by Zwane Mwaikambo

Subject: Re: OOPS - APIC or othere?

On Wed, 17 Nov 2004, Sumit Pandya wrote:

> At one of our clients I ran into a timer problem with kernel 2.4.26 and tried
> to fix it by patching "arch/i386/kernel/mpparse.c" with the version taken from
> patch-2.4.27.
> ... ... ...
> Mikael Pettersson:
> o i386 and x86_64 ACPI mpparse timer bug
> ... ... ...
> After booting the system I now get an OOPS. Did I apply only a partial patch
> by taking just the mpparse.c changes from the whole bunch? Does that break a
> dependency on some other functionality? I have ACPI support enabled in the
> kernel.
> Would the following patch from Len solve my OOPS?
> ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.4.26-rc4/20040422153228-irq2.patch
>
> Here is the output of ksymoops.

Sending bug reports for partially patched kernels isn't easy for us to
debug, is there no way for you to simply try booting 2.4.27?

Thanks,
Zwane

2004-11-26 19:50:46

by Zwane Mwaikambo

Subject: Re: OOPS - APIC or othere?

On Fri, 26 Nov 2004, Marcelo Tosatti wrote:

> On Fri, Nov 26, 2004 at 02:23:57PM +0530, Sumit Pandya wrote:
> > Marcelo,
> > No other message except my name "Sumit" ?
>
> Yes, because Zwane has written the message
>
> "Sending bug report for partially patched kernel isn't easy for us to
> debug, is there no way for you to simply try booting 2.4.27?"
>
> And I was assuming you read that. Did you?
>
> The bugzilla entry indicates that Len fixed the
> problem in 2.4.27:
>
> Seems to be doing its job then, doesn't it?

I agree with Marcelo. It looks like a fairly isolated change; if you can
get by with applying only that patch then you'll be fine.

Thanks,
Zwane

2004-11-27 05:59:09

by Marcelo Tosatti

Subject: Re: OOPS - APIC or othere?

On Fri, Nov 26, 2004 at 02:23:57PM +0530, Sumit Pandya wrote:
> Marcelo,
> No other message except my name "Sumit" ?

Yes, because Zwane has written the message

"Sending bug report for partially patched kernel isn't easy for us to
debug, is there no way for you to simply try booting 2.4.27?"

And I was assuming you read that. Did you?

The bugzilla entry indicates that Len fixed the
problem in 2.4.27:

------- Additional Comment #2 From Len Brown 2004-11-04 13:32 -------

shipped in 2.4.27 - closing


> Any update on my problem? My difficulty is that we ship an embedded
> Linux OS, and changing the kernel version is very critical for us.

Zwane's statement is valid here.

> I am hoping for just a quick answer from anyone. Could the following
> patch break any other functionality if applied on top of 2.4.26?
> http://bugzilla.kernel.org/show_bug.cgi?id=2834
> I took the attachment from the above link and applied the patch. It applies
> cleanly and runs without any problem, but I wanted a final confirmation
> from the authors.

Seems to be doing its job then, doesn't it?

> Thanks for your time,
> -- Sumit
>
> > -----Original Message-----
> > From: Marcelo Tosatti
> > Sent: Thursday, November 25, 2004 5:46 PM
> >
> >
> > On Wed, Nov 17, 2004 at 09:42:58AM -0700, Zwane Mwaikambo wrote:
> > > On Wed, 17 Nov 2004, Sumit Pandya wrote:
> > >
> > > > At one of our clients I ran into a timer problem with kernel 2.4.26
> > > > and tried to fix it by patching "arch/i386/kernel/mpparse.c" with
> > > > the version taken from patch-2.4.27.
> > > > ... ... ...
> > > > Mikael Pettersson:
> > > > o i386 and x86_64 ACPI mpparse timer bug
> > > > ... ... ...
> > > > After booting the system I now get an OOPS. Did I apply only a
> > > > partial patch by taking just the mpparse.c changes from the whole
> > > > bunch? Does that break a dependency on some other functionality?
> > > > I have ACPI support enabled in the kernel.
> > > > Would the following patch from Len solve my OOPS?
> > > > ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/test/2.4.26-rc4/20040422153228-irq2.patch
> > > >
> > > > Here is the output of ksymoops.
> > >
> > > Sending bug reports for partially patched kernels isn't easy for us to
> > > debug, is there no way for you to simply try booting 2.4.27?
> >
> > Sumit?