2003-05-15 01:51:02

by Felipe Alfaro Solana

Subject: 2.5.69-mm5: pccard oops while booting: resolved

Andrew,

I was having the following Oops when booting 2.5.69-mm5:

Unable to handle kernel paging request at virtual address febf0000
printing eip:
c0192498
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c0192498>] Not tainted VLI
EFLAGS: 00010286
EIP is at pci_bus_match+0x18/0xb0
eax: 00000000 ebx: c13c1000 ecx: febf0000 edx: 00000000
esi: c13c104c edi: ffffffed ebp: cff3944c esp: cfde7ed0
ds: 007b es: 007b ss: 0068
Process pccardd (pid: 10, threadinfo=cfde6000 task=c1390060)
Stack: cff46390 c01d044f c13c104c cff46390 cff463c0 c13c104c c03207dc c01d04ef
c13c104c cff46390 c13c104c c0320780 c13c1084 c01d06a4 c13c104c c02c35e3
c03270a0 c13c104c 00000000 c13c1084 c01cf874 c13c104c cffc3a40 c13c1000
Call Trace:
[<c01d044f>] bus_match+0x2f/0x80
[<c01d04ef>] device_attach+0x4f/0x90
[<c01d06a4>] bus_add_device+0x64/0xb0
[<c01cf874>] device_add+0xd4/0x110
[<c018eb5e>] pci_bus_add_devices+0xae/0xe0
[<c020339b>] cb_alloc+0xab/0xf0
[<c02001d9>] socket_insert+0x69/0x80
[<c01ff78a>] get_socket_status+0x1a/0x20
[<c020041d>] pccardd+0x13d/0x1f0
[<c0115e90>] default_wake_function+0x0/0x20
[<c0109272>] ret_from_fork+0x6/0x14
[<c0115e90>] default_wake_function+0x0/0x20
[<c02002e0>] pccardd+0x0/0x1f0
[<c010722d>] kernel_thread_helper+0x5/0x18

Code: 83 fa 06 7e f1 31 c0 c3 b8 e0 06 32 c0 c3 90 8d 74 26 00 53 8b 44 24 0c 8b 5c 24 08 83 e8 28 8b 48 0c 83 eb 4c 31 c0 85 c9 74 30 <8b> 11 85 d2 74 7a 89 f6 83 fa ff 74 2b 0f b7 43 24 39 c2 74 23

This oops made me unable to use my 3Com CardBus NIC.

I've been able to pinpoint the culprit: it's the
"make-KOBJ_NAME-match-BUS_ID_SIZE.patch" patch that is causing the
oops for me when booting 2.5.69-mm5.

Reverting this patch solves the oops for me.

I don't have the resources to investigate why this patch is causing the
oops for me, but I'm willing to help you, if you need it :-)

Thanks!



2003-05-15 02:03:25

by Andrew Morton

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

Felipe Alfaro Solana <[email protected]> wrote:
>
> I've been able to pinpoint the culprit of this: it's the
> "make-KOBJ_NAME-match-BUS_ID_SIZE.patch" patch that it's causing the
> oops for me when booting 2.5.69.mm5.
>
> Reverting this patch solves the oops for me.

I might have screwed that patch up.

This is the second half of it. When it crashed, did you have the below
change in place as well?

Index: include/linux/device.h
===================================================================
RCS file: /home/scm/linux-2.5/include/linux/device.h,v
retrieving revision 1.48
diff -u -u -r1.48 device.h
--- include/linux/device.h 29 Apr 2003 17:30:20 -0000 1.48
+++ include/linux/device.h 13 May 2003 07:47:39 -0000
@@ -35,7 +35,7 @@
#define DEVICE_NAME_SIZE 50
#define DEVICE_NAME_HALF __stringify(20) /* Less than half to accommodate slop */
#define DEVICE_ID_SIZE 32
-#define BUS_ID_SIZE 20
+#define BUS_ID_SIZE KOBJ_NAME_LEN


enum {
-


2003-05-15 11:24:21

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, 2003-05-15 at 04:17, Andrew Morton wrote:
> Felipe Alfaro Solana <[email protected]> wrote:
> >
> > I've been able to pinpoint the culprit of this: it's the
> > "make-KOBJ_NAME-match-BUS_ID_SIZE.patch" patch that it's causing the
> > oops for me when booting 2.5.69.mm5.
> >
> > Reverting this patch solves the oops for me.
>
> I might have screwed that patch up.
>
> This is the second half of it. When it crashed, did you have the below
> change in place as well?
>
> Index: include/linux/device.h
> ===================================================================
> RCS file: /home/scm/linux-2.5/include/linux/device.h,v
> retrieving revision 1.48
> diff -u -u -r1.48 device.h
> --- include/linux/device.h 29 Apr 2003 17:30:20 -0000 1.48
> +++ include/linux/device.h 13 May 2003 07:47:39 -0000
> @@ -35,7 +35,7 @@
> #define DEVICE_NAME_SIZE 50
> #define DEVICE_NAME_HALF __stringify(20) /* Less than half to accommodate slop */
> #define DEVICE_ID_SIZE 32
> -#define BUS_ID_SIZE 20
> +#define BUS_ID_SIZE KOBJ_NAME_LEN
>
>
> enum {
> -

I applied the second half patch on top of 2.5.69-mm5 (the original
2.5.69-mm5 defined BUS_ID_SIZE as 20), but the "pccard" kernel task
keeps crashing as before.

Anything else for me to try? :-)

Thanks!

2003-05-15 11:47:37

by Russell King

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, May 15, 2003 at 01:36:41PM +0200, Felipe Alfaro Solana wrote:
> I applied the second half patch on top of 2.5.69-mm5 (the original
> 2.5.69-mm5 defined BUS_ID_SIZE as 20), but the "pccard" kernel task
> keeps crashing as before.
>
> Anything else for me to try? :-)

I don't believe this problem is being caused by PCMCIA/Cardbus (until
someone proves me wrong.)

This came up a few weeks ago, and it looked like the device model's
driver lists became corrupted somehow. Unfortunately it wasn't proven
back then, and I haven't been able to reproduce this behaviour here.

We seem to be failing in pci_bus_match(), with pci_drv->id_table
containing an invalid address. Could you apply this patch and see
what happens? It'll be rather noisy during boot though.

The interesting one should be immediately prior to the oops.

--- orig/drivers/pci/pci-driver.c Sun Nov 24 10:12:24 2002
+++ linux/drivers/pci/pci-driver.c Thu May 15 12:58:56 2003
@@ -6,6 +6,7 @@
#include <linux/pci.h>
#include <linux/module.h>
#include <linux/init.h>
+#include <linux/kallsyms.h>
#include "pci.h"

/*
@@ -183,7 +184,9 @@
struct pci_dev * pci_dev = to_pci_dev(dev);
struct pci_driver * pci_drv = to_pci_driver(drv);
const struct pci_device_id * ids = pci_drv->id_table;
-
+printk("pci_bus_match: pci_drv = %p", pci_drv);
+print_symbol(" (%s)", pci_drv);
+printk("\n");
if (!ids)
return 0;



--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-05-15 12:00:47

by Russell King

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

When you send me the results of that patch, could you also include:

- /proc/modules (from before the crash)
- all kernel messages

please?

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-05-15 13:04:46

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, 2003-05-15 at 14:00, Russell King wrote:
> On Thu, May 15, 2003 at 01:36:41PM +0200, Felipe Alfaro Solana wrote:
> > I applied the second half patch on top of 2.5.69-mm5 (the original
> > 2.5.69-mm5 defined BUS_ID_SIZE as 20), but the "pccard" kernel task
> > keeps crashing as before.
> >
> > Anything else for me to try? :-)
>
> I don't believe this problem is being caused by PCMCIA/Cardbus (until
> someone proves me wrong.)
>
> This came up a few weeks ago, and it looked like the device models
> driver lists became corrupted somehow. Unfortunately it wasn't proven
> back then, and I haven't been able to reproduce this behaviour here.
>
> We seem to be failing in pci_bus_match(), with pci_drv->id_table
> containing an invalid address. Could you apply this patch and see
> what happens? It'll be rather noisy during boot though.
>
> The interesting one should be immediately prior to the oops.
>
> --- orig/drivers/pci/pci-driver.c Sun Nov 24 10:12:24 2002
> +++ linux/drivers/pci/pci-driver.c Thu May 15 12:58:56 2003
> @@ -6,6 +6,7 @@
> #include <linux/pci.h>
> #include <linux/module.h>
> #include <linux/init.h>
> +#include <linux/kallsyms.h>
> #include "pci.h"
>
> /*
> @@ -183,7 +184,9 @@
> struct pci_dev * pci_dev = to_pci_dev(dev);
> struct pci_driver * pci_drv = to_pci_driver(drv);
> const struct pci_device_id * ids = pci_drv->id_table;
> -
> +printk("pci_bus_match: pci_drv = %p", pci_drv);
> +print_symbol(" (%s)", pci_drv);
> +printk("\n");
> if (!ids)
> return 0;
>
>

OK, attached to this message:

"dmesg" contains the kernel messages when booting up 2.5.69-mm5 at run
level 1 with the patch applied.

"config" contains options used to configure the kernel. Mostly, the
cardbus stuff is built-in, so no modules were loaded when booting into
single-user mode.

Hope this helps!


Attachments:
dmesg (11.17 kB)
config (18.73 kB)

2003-05-15 13:31:55

by Russell King

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, May 15, 2003 at 03:16:55PM +0200, Felipe Alfaro Solana wrote:
> OK, attached to this message:
>
> "dmesg" contains the kernel messages when booting up 2.5.69-mm5 at tun
> level 1 with the patch applied.
>
> "config" contains options used to configure the kernel. Mostly, the
> cardbus stuff is built-in, so no modules were loaded when booting into
> single-user mode.
>
> Hope this helps!

Indeed it does. This patch should solve the problem.

--- orig/drivers/char/agp/intel-agp.c Sun Apr 20 16:31:48 2003
+++ linux/drivers/char/agp/intel-agp.c Thu May 15 14:41:45 2003
@@ -1635,7 +1635,7 @@

MODULE_DEVICE_TABLE(pci, agp_intel_pci_table);

-static struct __initdata pci_driver agp_intel_pci_driver = {
+static struct pci_driver agp_intel_pci_driver = {
.name = "agpgart-intel",
.id_table = agp_intel_pci_table,
.probe = agp_intel_probe,


--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-05-15 13:34:52

by Dave Jones

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, May 15, 2003 at 02:44:39PM +0100, Russell King wrote:
> Indeed it does. This patch should solve the problem.
>
> --- orig/drivers/char/agp/intel-agp.c Sun Apr 20 16:31:48 2003
> +++ linux/drivers/char/agp/intel-agp.c Thu May 15 14:41:45 2003
> @@ -1635,7 +1635,7 @@
>
> MODULE_DEVICE_TABLE(pci, agp_intel_pci_table);
>
> -static struct __initdata pci_driver agp_intel_pci_driver = {
> +static struct pci_driver agp_intel_pci_driver = {
> .name = "agpgart-intel",
> .id_table = agp_intel_pci_table,
> .probe = agp_intel_probe,

Yup. Same stupid bug in all the other AGP drivers too.
I'll fix them up and push that along in a few hours.

Dave


--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2003-05-15 22:19:25

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Thu, 2003-05-15 at 15:44, Russell King wrote:
> On Thu, May 15, 2003 at 03:16:55PM +0200, Felipe Alfaro Solana wrote:
> > OK, attached to this message:
> >
> > "dmesg" contains the kernel messages when booting up 2.5.69-mm5 at tun
> > level 1 with the patch applied.
> >
> > "config" contains options used to configure the kernel. Mostly, the
> > cardbus stuff is built-in, so no modules were loaded when booting into
> > single-user mode.
> >
> > Hope this helps!
>
> Indeed it does. This patch should solve the problem.
>
> --- orig/drivers/char/agp/intel-agp.c Sun Apr 20 16:31:48 2003
> +++ linux/drivers/char/agp/intel-agp.c Thu May 15 14:41:45 2003
> @@ -1635,7 +1635,7 @@
>
> MODULE_DEVICE_TABLE(pci, agp_intel_pci_table);
>
> -static struct __initdata pci_driver agp_intel_pci_driver = {
> +static struct pci_driver agp_intel_pci_driver = {
> .name = "agpgart-intel",
> .id_table = agp_intel_pci_table,
> .probe = agp_intel_probe,
>

I've applied this patch, but "pccard" keeps oopsing. The test kernel is
a 2.5.69-mm5 with the "i8259-shutdown.patch" reverted, plus the above
patch and your previous "verbose" patch. Attached to this message is the
new "dmesg" from this patched kernel.

As I told Andrew, reverting "make-KOBJ_NAME-match-BUS_ID_SIZE.patch"
solves the oops.


Attachments:
dmesg (11.17 kB)

2003-05-15 22:52:11

by Andrew Morton

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

Felipe Alfaro Solana <[email protected]> wrote:
>
> The test kernel is
> a 2.5.69-mm5 with the "i8259-shutdown.patch" reverted, plus the above
> patch and your previous "verbose" patch. Attached to this message is the
> new "dmesg" from this patched kernel.
>
> As I told Andrew, reverting "make-KOBJ_NAME-match-BUS_ID_SIZE.patch"
> solves the oops.

The weird thing is that this patch really doesn't do anything apart from
increasing KOBJ_NAME_LEN from 16 to 20.





From: Ben Collins <[email protected]>

This was causing me all sorts of problems with linux1394's 16-18 byte long
bus_id lengths. The sysfs names were all broken.

This not only makes KOBJ_NAME_LEN match BUS_ID_SIZE, but fixes the
strncpy's in drivers/base/ so that it can't happen again (at least the
strings will be null terminated).



drivers/base/bus.c | 2 ++
drivers/base/class.c | 2 ++
drivers/base/core.c | 1 +
include/linux/device.h | 2 +-
include/linux/kobject.h | 2 +-
5 files changed, 7 insertions(+), 2 deletions(-)

diff -puN drivers/base/bus.c~make-KOBJ_NAME-match-BUS_ID_SIZE drivers/base/bus.c
--- 25/drivers/base/bus.c~make-KOBJ_NAME-match-BUS_ID_SIZE 2003-05-14 19:18:09.000000000 -0700
+++ 25-akpm/drivers/base/bus.c 2003-05-14 19:18:09.000000000 -0700
@@ -432,6 +432,7 @@ int bus_add_driver(struct device_driver
pr_debug("bus %s: add driver %s\n",bus->name,drv->name);

strncpy(drv->kobj.name,drv->name,KOBJ_NAME_LEN);
+ drv->kobj.name[KOBJ_NAME_LEN-1] = '\0';
drv->kobj.kset = &bus->drivers;

if ((error = kobject_register(&drv->kobj))) {
@@ -541,6 +542,7 @@ struct bus_type * find_bus(char * name)
int bus_register(struct bus_type * bus)
{
strncpy(bus->subsys.kset.kobj.name,bus->name,KOBJ_NAME_LEN);
+ bus->subsys.kset.kobj.name[KOBJ_NAME_LEN-1] = '\0';
subsys_set_kset(bus,bus_subsys);
subsystem_register(&bus->subsys);

diff -puN drivers/base/class.c~make-KOBJ_NAME-match-BUS_ID_SIZE drivers/base/class.c
--- 25/drivers/base/class.c~make-KOBJ_NAME-match-BUS_ID_SIZE 2003-05-14 19:18:09.000000000 -0700
+++ 25-akpm/drivers/base/class.c 2003-05-14 19:18:09.000000000 -0700
@@ -89,6 +89,7 @@ int class_register(struct class * cls)
INIT_LIST_HEAD(&cls->interfaces);

strncpy(cls->subsys.kset.kobj.name,cls->name,KOBJ_NAME_LEN);
+ cls->subsys.kset.kobj.name[KOBJ_NAME_LEN-1] = '\0';
subsys_set_kset(cls,class_subsys);
subsystem_register(&cls->subsys);

@@ -259,6 +260,7 @@ int class_device_add(struct class_device

/* first, register with generic layer. */
strncpy(class_dev->kobj.name, class_dev->class_id, KOBJ_NAME_LEN);
+ class_dev->kobj.name[KOBJ_NAME_LEN-1] = '\0';
kobj_set_kset_s(class_dev, class_obj_subsys);
if (parent)
class_dev->kobj.parent = &parent->subsys.kset.kobj;
diff -puN drivers/base/core.c~make-KOBJ_NAME-match-BUS_ID_SIZE drivers/base/core.c
--- 25/drivers/base/core.c~make-KOBJ_NAME-match-BUS_ID_SIZE 2003-05-14 19:18:09.000000000 -0700
+++ 25-akpm/drivers/base/core.c 2003-05-14 19:18:09.000000000 -0700
@@ -214,6 +214,7 @@ int device_add(struct device *dev)

/* first, register with generic layer. */
strncpy(dev->kobj.name,dev->bus_id,KOBJ_NAME_LEN);
+ dev->kobj.name[KOBJ_NAME_LEN-1] = '\0';
kobj_set_kset_s(dev,devices_subsys);
if (parent)
dev->kobj.parent = &parent->kobj;
diff -puN include/linux/kobject.h~make-KOBJ_NAME-match-BUS_ID_SIZE include/linux/kobject.h
--- 25/include/linux/kobject.h~make-KOBJ_NAME-match-BUS_ID_SIZE 2003-05-14 19:18:09.000000000 -0700
+++ 25-akpm/include/linux/kobject.h 2003-05-14 19:18:09.000000000 -0700
@@ -12,7 +12,7 @@
#include <linux/rwsem.h>
#include <asm/atomic.h>

-#define KOBJ_NAME_LEN 16
+#define KOBJ_NAME_LEN 20

struct kobject {
char name[KOBJ_NAME_LEN];
diff -puN include/linux/device.h~make-KOBJ_NAME-match-BUS_ID_SIZE include/linux/device.h
--- 25/include/linux/device.h~make-KOBJ_NAME-match-BUS_ID_SIZE 2003-05-14 19:18:09.000000000 -0700
+++ 25-akpm/include/linux/device.h 2003-05-14 19:18:16.000000000 -0700
@@ -35,7 +35,7 @@
#define DEVICE_NAME_SIZE 50
#define DEVICE_NAME_HALF __stringify(20) /* Less than half to accommodate slop */
#define DEVICE_ID_SIZE 32
-#define BUS_ID_SIZE 20
+#define BUS_ID_SIZE KOBJ_NAME_LEN


enum {

_

2003-05-16 12:50:23

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Fri, 2003-05-16 at 01:00, Andrew Morton wrote:
> Felipe Alfaro Solana <[email protected]> wrote:
> >
> > The test kernel is
> > a 2.5.69-mm5 with the "i8259-shutdown.patch" reverted, plus the above
> > patch and your previous "verbose" patch. Attached to this message is the
> > new "dmesg" from this patched kernel.
> >
> > As I told Andrew, reverting "make-KOBJ_NAME-match-BUS_ID_SIZE.patch"
> > solves the oops.
>
> The weird thing is that this patch really doesn't do anything apart from
> increasing KOBJ_NAME_LEN from 16 to 20.

OK, this is what I guessed by playing with 2.5.69-mm6:

1. Simply changing KOBJ_NAME_LEN from 20 back to 16 fixes the problem.
This leads me to think there is some part of the kernel (a driver, to
be more exact) that is corrupting memory or doing something really
nasty that affects the PCI ID tables and the pci_bus_match() function.

2. Disabling or enabling preemptible kernel does not help.

3. Now, changing KOBJ_NAME_LEN back to 20, and then disabling support
for the ALSA Yamaha PCI driver (YMFPCI) fixes the problem. I have tried
disabling other drivers, like USB-UHCI and AGPGART, but it doesn't help.
However, disabling YMFPCI solves the problem. So I guess we've got a
problem in the alsa_card_ymfpci_init() function. Note that YMFPCI was
built into the kernel, and not as a module. However, building YMFPCI
as a module still produces an oops. I'll post more information when I
have investigated this a little more.

Any ideas on what could be going on here? It's driving me nuts!

Thanks!

2003-05-16 18:22:18

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting

On Fri, 2003-05-16 at 15:03, Felipe Alfaro Solana wrote:
> On Fri, 2003-05-16 at 01:00, Andrew Morton wrote:
> > Felipe Alfaro Solana <[email protected]> wrote:
> > >
> > > The test kernel is
> > > a 2.5.69-mm5 with the "i8259-shutdown.patch" reverted, plus the above
> > > patch and your previous "verbose" patch. Attached to this message is the
> > > new "dmesg" from this patched kernel.
> > >
> > > As I told Andrew, reverting "make-KOBJ_NAME-match-BUS_ID_SIZE.patch"
> > > solves the oops.
> >
> > The weird thing is that this patch really doesn't do anything apart from
> > increasing KOBJ_NAME_LEN from 16 to 20.
>
> OK, this is what I guessed by playing with 2.5.69-mm6:
>
> 1. Simply by changing KOBJ_NAME_LEN from 20 to 16 fixes the problem.
> This leads me to think there are some parts of the kernel (a driver, to
> be more exact) that are corrupting memory or doing something really
> nasty that is affecting PCI ID's tables and pci_bus_match() function.
>
> 2. Disabling or enabling preemptible kernel does not help.
>
> 3. Now, changing KOBJ_NAME_LEN back to 20, and then disabling support
> for the ALSA Yamaha PCI driver (YMFPCI) fixes the problem. I have tried
> disabling other drivers, like USB-UHCI, AGPGART, but it doesn't help.
> However, disabling YMFPCI solves the problem. So I guess, we've got a
> problem at alsa_card_ymfpci_init() function. Note that the YMFPCI was
> built-in into the kernel, and not as a module. However, building YMFPCI
> as a module still produces an oops. I'll post more information when I
> investigate a little more about this.
>
> Any ideas on what's could be going on here? It's driving me nutts!
>
> Thanks!

OK, this is the oops caused when trying to modprobe snd-ymfpci on
2.5.69-mm6:

Linux version 2.5.69-mm6 ([email protected]) (gcc version
3.2.3 20030422 (Red Hat Linux 3.2.3-4)) #25 Fri May 16 15:04:26 CEST
2003
Unable to handle kernel paging request at virtual address 50464d59
printing eip:
c01d07bf
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c01d07bf>] Not tainted VLI
EFLAGS: 00010202
EIP is at driver_attach+0x3f/0x60
eax: 50464d59 ebx: 50464d59 ecx: 59004943 edx: d08a821e
esi: cfdec780 edi: d08aaa48 ebp: c030d100 esp: cfa0bf40
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 152, threadinfo=cfa0a000 task=cfde2690)
Stack: cff3884c d08aaa48 c030d14c 00000000 d08aaa76 c01d0a98 d08aaa48 00000009
c02e1830 00000000 c02e1818 cfa0a000 c01d0f1f d08aaa48 00000015 00000017
d0894c4c cfdec8a0 c0192527 d08aaa48 d087f019 d08aaa20 c02e1830 d08b1200
Call Trace:
[<d08aaa48>] driver+0x28/0xa0 [snd_ymfpci]
[<d08aaa76>] driver+0x56/0xa0 [snd_ymfpci]
[<c01d0a98>] bus_add_driver+0xa8/0xc0
[<d08aaa48>] driver+0x28/0xa0 [snd_ymfpci]
[<c01d0f1f>] driver_register+0x2f/0x40
[<d08aaa48>] driver+0x28/0xa0 [snd_ymfpci]
[<c0192527>] pci_register_driver+0x47/0x60
[<d08aaa48>] driver+0x28/0xa0 [snd_ymfpci]
[<d087f019>] +0x19/0x5b [snd_ymfpci]
[<d08aaa20>] driver+0x0/0xa0 [snd_ymfpci]
[<d08b1200>] +0x0/0xe0 [snd_ymfpci]
[<c012eb2c>] sys_init_module+0x12c/0x240
[<d08b1200>] +0x0/0xe0 [snd_ymfpci]
[<c0109349>] sysenter_past_esp+0x52/0x71

Code: db 74 32 8b 9a a8 00 00 00 8b 03 0f 18 00 90 8d b2 a8 00 00 00 39
f3 74 1c 8d 76 00 8d 53 f8 8b 8a a4 00 00 00 85 c9 74 13 89 c3 <8b> 00
0f 18 00 90 39 f3 75 e7 83 c4 08 5b 5e 5f c3 89 7c 24 04


2003-05-16 18:40:24

by Ben Collins

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

> 1. Simply by changing KOBJ_NAME_LEN from 20 to 16 fixes the problem.
> This leads me to think there are some parts of the kernel (a driver, to
> be more exact) that are corrupting memory or doing something really
> nasty that is affecting PCI ID's tables and pci_bus_match() function.

Are you sure you have a pristine source and everything is rebuilt
against the new header? It'd be very easy for one object file to have the
incorrect name len and cause this problem.

--
Debian - http://www.debian.org/
Linux 1394 - http://www.linux1394.org/
Subversion - http://subversion.tigris.org/
Deqo - http://www.deqo.com/

2003-05-16 18:51:03

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm5: pccard oops while booting: resolved

On Fri, 2003-05-16 at 20:13, Ben Collins wrote:
> > 1. Simply by changing KOBJ_NAME_LEN from 20 to 16 fixes the problem.
> > This leads me to think there are some parts of the kernel (a driver, to
> > be more exact) that are corrupting memory or doing something really
> > nasty that is affecting PCI ID's tables and pci_bus_match() function.

> Are you sure you have a pristine source and everything is rebuilt
> against the new header? It'd be very easy for one object file to have the
> incorrect name len and cause this problem.

I'm 200% sure:

1. tar jxvf linux-2.5.69.tar.bz2
2. zcat 2.5.69-mm6.gz | patch -p1
3. make menuconfig

There is something in the Yamaha DS-XG PCI driver (I think the problem
lies in alsa_card_ymfpci_init()) that is somehow corrupting the PCI
ID table, or something else, and that's causing the pci_bus_match()
function to oops.

I'm no kernel expert, so I've been unable to guess much more than this.

2003-05-16 20:14:38

by Andrew Morton

Subject: Re: 2.5.69-mm6: pccard oops while booting

Felipe Alfaro Solana <[email protected]> wrote:
>
> Unable to handle kernel paging request at virtual address 50464d59

hm, that address is "YMFP". Please try generating the oops
again with the below patch applied:

./sound/pci/ymfpci/ymfpci.c | 8 ++++----
./sound/pci/ymfpci/ymfpci_main.c | 22 +++++++++++-----------
2 files changed, 15 insertions(+), 15 deletions(-)

diff -puN ./sound/pci/ymfpci/ymfpci_main.c~a ./sound/pci/ymfpci/ymfpci_main.c
--- 25/./sound/pci/ymfpci/ymfpci_main.c~a 2003-05-16 13:26:26.000000000 -0700
+++ 25-akpm/./sound/pci/ymfpci/ymfpci_main.c 2003-05-16 13:27:27.000000000 -0700
@@ -1093,7 +1093,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c

if (rpcm)
*rpcm = NULL;
- if ((err = snd_pcm_new(chip->card, "YMFPCI", device, 32, 1, &pcm)) < 0)
+ if ((err = snd_pcm_new(chip->card, "1YMFPCI", device, 32, 1, &pcm)) < 0)
return err;
pcm->private_data = chip;
pcm->private_free = snd_ymfpci_pcm_free;
@@ -1103,7 +1103,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c

/* global setup */
pcm->info_flags = 0;
- strcpy(pcm->name, "YMFPCI");
+ strcpy(pcm->name, "2YMFPCI");
chip->pcm = pcm;

snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
@@ -1138,7 +1138,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *

if (rpcm)
*rpcm = NULL;
- if ((err = snd_pcm_new(chip->card, "YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
+ if ((err = snd_pcm_new(chip->card, "3YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
return err;
pcm->private_data = chip;
pcm->private_free = snd_ymfpci_pcm2_free;
@@ -1147,7 +1147,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *

/* global setup */
pcm->info_flags = 0;
- strcpy(pcm->name, "YMFPCI - AC'97");
+ strcpy(pcm->name, "4YMFPCI - AC'97");
chip->pcm2 = pcm;

snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
@@ -1182,7 +1182,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc

if (rpcm)
*rpcm = NULL;
- if ((err = snd_pcm_new(chip->card, "YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
+ if ((err = snd_pcm_new(chip->card, "5YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
return err;
pcm->private_data = chip;
pcm->private_free = snd_ymfpci_pcm_spdif_free;
@@ -1191,7 +1191,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc

/* global setup */
pcm->info_flags = 0;
- strcpy(pcm->name, "YMFPCI - IEC958");
+ strcpy(pcm->name, "6YMFPCI - IEC958");
chip->pcm_spdif = pcm;

snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
@@ -1226,7 +1226,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_

if (rpcm)
*rpcm = NULL;
- if ((err = snd_pcm_new(chip->card, "YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
+ if ((err = snd_pcm_new(chip->card, "7YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
return err;
pcm->private_data = chip;
pcm->private_free = snd_ymfpci_pcm_4ch_free;
@@ -1235,7 +1235,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_

/* global setup */
pcm->info_flags = 0;
- strcpy(pcm->name, "YMFPCI - Rear PCM");
+ strcpy(pcm->name, "8YMFPCI - Rear PCM");
chip->pcm_4ch = pcm;

snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
@@ -1831,7 +1831,7 @@ static void snd_ymfpci_proc_read(snd_inf
{
// ymfpci_t *chip = snd_magic_cast(ymfpci_t, private_data, return);

- snd_iprintf(buffer, "YMFPCI\n\n");
+ snd_iprintf(buffer, "9YMFPCI\n\n");
}

static int __devinit snd_ymfpci_proc_init(snd_card_t * card, ymfpci_t *chip)
@@ -2226,12 +2226,12 @@ int __devinit snd_ymfpci_create(snd_card
chip->reg_area_virt = (unsigned long)ioremap_nocache(chip->reg_area_phys, 0x8000);
pci_set_master(pci);

- if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "YMFPCI")) == NULL) {
+ if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "AYMFPCI")) == NULL) {
snd_ymfpci_free(chip);
snd_printk("unable to grab memory region 0x%lx-0x%lx\n", chip->reg_area_phys, chip->reg_area_phys + 0x8000 - 1);
return -EBUSY;
}
- if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "YMFPCI", (void *) chip)) {
+ if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "BYMFPCI", (void *) chip)) {
snd_ymfpci_free(chip);
snd_printk("unable to grab IRQ %d\n", pci->irq);
return -EBUSY;
diff -puN ./sound/pci/ymfpci/ymfpci.c~a ./sound/pci/ymfpci/ymfpci.c
--- 25/./sound/pci/ymfpci/ymfpci.c~a 2003-05-16 13:26:26.000000000 -0700
+++ 25-akpm/./sound/pci/ymfpci/ymfpci.c 2003-05-16 13:27:49.000000000 -0700
@@ -122,7 +122,7 @@ static int __devinit snd_card_ymfpci_pro
fm_port[dev] = addr;
}
if (fm_port[dev] >= 0 &&
- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
+ (chip->fm_res = request_region(fm_port[dev], 4, "CYMFPCI OPL3")) != NULL) {
legacy_ctrl |= YMFPCI_LEGACY_FMEN;
pci_write_config_word(pci, PCIR_DSXG_FMBASE, fm_port[dev]);
}
@@ -133,7 +133,7 @@ static int __devinit snd_card_ymfpci_pro
mpu_port[dev] = addr;
}
if (mpu_port[dev] >= 0 &&
- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
+ (chip->mpu_res = request_region(mpu_port[dev], 2, "DYMFPCI MPU401")) != NULL) {
legacy_ctrl |= YMFPCI_LEGACY_MEN;
pci_write_config_word(pci, PCIR_DSXG_MPU401BASE, mpu_port[dev]);
}
@@ -146,7 +146,7 @@ static int __devinit snd_card_ymfpci_pro
default: fm_port[dev] = -1; break;
}
if (fm_port[dev] > 0 &&
- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
+ (chip->fm_res = request_region(fm_port[dev], 4, "EYMFPCI OPL3")) != NULL) {
legacy_ctrl |= YMFPCI_LEGACY_FMEN;
} else {
legacy_ctrl2 &= ~YMFPCI_LEGACY2_FMIO;
@@ -160,7 +160,7 @@ static int __devinit snd_card_ymfpci_pro
default: mpu_port[dev] = -1; break;
}
if (mpu_port[dev] > 0 &&
- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
+ (chip->mpu_res = request_region(mpu_port[dev], 2, "FYMFPCI MPU401")) != NULL) {
legacy_ctrl |= YMFPCI_LEGACY_MEN;
} else {
legacy_ctrl2 &= ~YMFPCI_LEGACY2_MPUIO;

_

2003-05-16 21:29:52

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting

On Fri, 2003-05-16 at 22:29, Andrew Morton wrote:
> Felipe Alfaro Solana <[email protected]> wrote:
> >
> > Unable to handle kernel paging request at virtual address 50464d59
>
> hm, that address is "YMFP". Please try generating the oops
> again with the below patch applied:
>
> ./sound/pci/ymfpci/ymfpci.c | 8 ++++----
> ./sound/pci/ymfpci/ymfpci_main.c | 22 +++++++++++-----------
> 2 files changed, 15 insertions(+), 15 deletions(-)
>
> diff -puN ./sound/pci/ymfpci/ymfpci_main.c~a ./sound/pci/ymfpci/ymfpci_main.c
> --- 25/./sound/pci/ymfpci/ymfpci_main.c~a 2003-05-16 13:26:26.000000000 -0700
> +++ 25-akpm/./sound/pci/ymfpci/ymfpci_main.c 2003-05-16 13:27:27.000000000 -0700
> @@ -1093,7 +1093,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
>
> if (rpcm)
> *rpcm = NULL;
> - if ((err = snd_pcm_new(chip->card, "YMFPCI", device, 32, 1, &pcm)) < 0)
> + if ((err = snd_pcm_new(chip->card, "1YMFPCI", device, 32, 1, &pcm)) < 0)
> return err;
> pcm->private_data = chip;
> pcm->private_free = snd_ymfpci_pcm_free;
> @@ -1103,7 +1103,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
>
> /* global setup */
> pcm->info_flags = 0;
> - strcpy(pcm->name, "YMFPCI");
> + strcpy(pcm->name, "2YMFPCI");
> chip->pcm = pcm;
>
> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> @@ -1138,7 +1138,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
>
> if (rpcm)
> *rpcm = NULL;
> - if ((err = snd_pcm_new(chip->card, "YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
> + if ((err = snd_pcm_new(chip->card, "3YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
> return err;
> pcm->private_data = chip;
> pcm->private_free = snd_ymfpci_pcm2_free;
> @@ -1147,7 +1147,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
>
> /* global setup */
> pcm->info_flags = 0;
> - strcpy(pcm->name, "YMFPCI - AC'97");
> + strcpy(pcm->name, "4YMFPCI - AC'97");
> chip->pcm2 = pcm;
>
> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> @@ -1182,7 +1182,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
>
> if (rpcm)
> *rpcm = NULL;
> - if ((err = snd_pcm_new(chip->card, "YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
> + if ((err = snd_pcm_new(chip->card, "5YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
> return err;
> pcm->private_data = chip;
> pcm->private_free = snd_ymfpci_pcm_spdif_free;
> @@ -1191,7 +1191,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
>
> /* global setup */
> pcm->info_flags = 0;
> - strcpy(pcm->name, "YMFPCI - IEC958");
> + strcpy(pcm->name, "6YMFPCI - IEC958");
> chip->pcm_spdif = pcm;
>
> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> @@ -1226,7 +1226,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
>
> if (rpcm)
> *rpcm = NULL;
> - if ((err = snd_pcm_new(chip->card, "YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
> + if ((err = snd_pcm_new(chip->card, "7YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
> return err;
> pcm->private_data = chip;
> pcm->private_free = snd_ymfpci_pcm_4ch_free;
> @@ -1235,7 +1235,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
>
> /* global setup */
> pcm->info_flags = 0;
> - strcpy(pcm->name, "YMFPCI - Rear PCM");
> + strcpy(pcm->name, "8YMFPCI - Rear PCM");
> chip->pcm_4ch = pcm;
>
> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> @@ -1831,7 +1831,7 @@ static void snd_ymfpci_proc_read(snd_inf
> {
> // ymfpci_t *chip = snd_magic_cast(ymfpci_t, private_data, return);
>
> - snd_iprintf(buffer, "YMFPCI\n\n");
> + snd_iprintf(buffer, "9YMFPCI\n\n");
> }
>
> static int __devinit snd_ymfpci_proc_init(snd_card_t * card, ymfpci_t *chip)
> @@ -2226,12 +2226,12 @@ int __devinit snd_ymfpci_create(snd_card
> chip->reg_area_virt = (unsigned long)ioremap_nocache(chip->reg_area_phys, 0x8000);
> pci_set_master(pci);
>
> - if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "YMFPCI")) == NULL) {
> + if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "AYMFPCI")) == NULL) {
> snd_ymfpci_free(chip);
> snd_printk("unable to grab memory region 0x%lx-0x%lx\n", chip->reg_area_phys, chip->reg_area_phys + 0x8000 - 1);
> return -EBUSY;
> }
> - if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "YMFPCI", (void *) chip)) {
> + if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "BYMFPCI", (void *) chip)) {
> snd_ymfpci_free(chip);
> snd_printk("unable to grab IRQ %d\n", pci->irq);
> return -EBUSY;
> diff -puN ./sound/pci/ymfpci/ymfpci.c~a ./sound/pci/ymfpci/ymfpci.c
> --- 25/./sound/pci/ymfpci/ymfpci.c~a 2003-05-16 13:26:26.000000000 -0700
> +++ 25-akpm/./sound/pci/ymfpci/ymfpci.c 2003-05-16 13:27:49.000000000 -0700
> @@ -122,7 +122,7 @@ static int __devinit snd_card_ymfpci_pro
> fm_port[dev] = addr;
> }
> if (fm_port[dev] >= 0 &&
> - (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
> + (chip->fm_res = request_region(fm_port[dev], 4, "CYMFPCI OPL3")) != NULL) {
> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
> pci_write_config_word(pci, PCIR_DSXG_FMBASE, fm_port[dev]);
> }
> @@ -133,7 +133,7 @@ static int __devinit snd_card_ymfpci_pro
> mpu_port[dev] = addr;
> }
> if (mpu_port[dev] >= 0 &&
> - (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
> + (chip->mpu_res = request_region(mpu_port[dev], 2, "DYMFPCI MPU401")) != NULL) {
> legacy_ctrl |= YMFPCI_LEGACY_MEN;
> pci_write_config_word(pci, PCIR_DSXG_MPU401BASE, mpu_port[dev]);
> }
> @@ -146,7 +146,7 @@ static int __devinit snd_card_ymfpci_pro
> default: fm_port[dev] = -1; break;
> }
> if (fm_port[dev] > 0 &&
> - (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
> + (chip->fm_res = request_region(fm_port[dev], 4, "EYMFPCI OPL3")) != NULL) {
> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
> } else {
> legacy_ctrl2 &= ~YMFPCI_LEGACY2_FMIO;
> @@ -160,7 +160,7 @@ static int __devinit snd_card_ymfpci_pro
> default: mpu_port[dev] = -1; break;
> }
> if (mpu_port[dev] > 0 &&
> - (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
> + (chip->mpu_res = request_region(mpu_port[dev], 2, "FYMFPCI MPU401")) != NULL) {
> legacy_ctrl |= YMFPCI_LEGACY_MEN;
> } else {
> legacy_ctrl2 &= ~YMFPCI_LEGACY2_MPUIO;
>

I've applied the patch above to a pristine 2.5.69-mm6. Curiously, if I
build snd-ymfpci as a module, I can't reproduce the oops anymore.
However, if I build snd-ymfpci into the kernel, I can *still* reproduce
the oops.

Attached is the dmesg of a 2.5.69-mm6 plus the above patch with ymfpci
integrated into the kernel.

Thanks!


Attachments:
dmesg-kernel (8.55 kB)

2003-05-16 21:56:14

by Carl-Daniel Hailfinger

Subject: Re: 2.5.69-mm6: pccard oops while booting

Felipe Alfaro Solana wrote:
> On Fri, 2003-05-16 at 22:29, Andrew Morton wrote:
>
>>Felipe Alfaro Solana <[email protected]> wrote:
>>
>>>Unable to handle kernel paging request at virtual address 50464d59
>>
>>hm, that address is "YMFP". Please try generating the oops
>>again with the below patch applied:
>>
>> ./sound/pci/ymfpci/ymfpci.c | 8 ++++----
>> ./sound/pci/ymfpci/ymfpci_main.c | 22 +++++++++++-----------
>> 2 files changed, 15 insertions(+), 15 deletions(-)
>>
>>diff -puN ./sound/pci/ymfpci/ymfpci_main.c~a ./sound/pci/ymfpci/ymfpci_main.c
>>--- 25/./sound/pci/ymfpci/ymfpci_main.c~a 2003-05-16 13:26:26.000000000 -0700
>>+++ 25-akpm/./sound/pci/ymfpci/ymfpci_main.c 2003-05-16 13:27:27.000000000 -0700
>>@@ -1093,7 +1093,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
>>
>> if (rpcm)
>> *rpcm = NULL;
>>- if ((err = snd_pcm_new(chip->card, "YMFPCI", device, 32, 1, &pcm)) < 0)
>>+ if ((err = snd_pcm_new(chip->card, "1YMFPCI", device, 32, 1, &pcm)) < 0)
>> return err;
>> pcm->private_data = chip;
>> pcm->private_free = snd_ymfpci_pcm_free;
>>@@ -1103,7 +1103,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
>>
>> /* global setup */
>> pcm->info_flags = 0;
>>- strcpy(pcm->name, "YMFPCI");
>>+ strcpy(pcm->name, "2YMFPCI");
>> chip->pcm = pcm;
>>
>> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
>>@@ -1138,7 +1138,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
>>
>> if (rpcm)
>> *rpcm = NULL;
>>- if ((err = snd_pcm_new(chip->card, "YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
>>+ if ((err = snd_pcm_new(chip->card, "3YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
>> return err;
>> pcm->private_data = chip;
>> pcm->private_free = snd_ymfpci_pcm2_free;
>>@@ -1147,7 +1147,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
>>
>> /* global setup */
>> pcm->info_flags = 0;
>>- strcpy(pcm->name, "YMFPCI - AC'97");
>>+ strcpy(pcm->name, "4YMFPCI - AC'97");
>> chip->pcm2 = pcm;
>>
>> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
>>@@ -1182,7 +1182,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
>>
>> if (rpcm)
>> *rpcm = NULL;
>>- if ((err = snd_pcm_new(chip->card, "YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
>>+ if ((err = snd_pcm_new(chip->card, "5YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
>> return err;
>> pcm->private_data = chip;
>> pcm->private_free = snd_ymfpci_pcm_spdif_free;
>>@@ -1191,7 +1191,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
>>
>> /* global setup */
>> pcm->info_flags = 0;
>>- strcpy(pcm->name, "YMFPCI - IEC958");
>>+ strcpy(pcm->name, "6YMFPCI - IEC958");
>> chip->pcm_spdif = pcm;
>>
>> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
>>@@ -1226,7 +1226,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
>>
>> if (rpcm)
>> *rpcm = NULL;
>>- if ((err = snd_pcm_new(chip->card, "YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
>>+ if ((err = snd_pcm_new(chip->card, "7YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
>> return err;
>> pcm->private_data = chip;
>> pcm->private_free = snd_ymfpci_pcm_4ch_free;
>>@@ -1235,7 +1235,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
>>
>> /* global setup */
>> pcm->info_flags = 0;
>>- strcpy(pcm->name, "YMFPCI - Rear PCM");
>>+ strcpy(pcm->name, "8YMFPCI - Rear PCM");
>> chip->pcm_4ch = pcm;
>>
>> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
>>@@ -1831,7 +1831,7 @@ static void snd_ymfpci_proc_read(snd_inf
>> {
>> // ymfpci_t *chip = snd_magic_cast(ymfpci_t, private_data, return);
>>
>>- snd_iprintf(buffer, "YMFPCI\n\n");
>>+ snd_iprintf(buffer, "9YMFPCI\n\n");
>> }
>>
>> static int __devinit snd_ymfpci_proc_init(snd_card_t * card, ymfpci_t *chip)
>>@@ -2226,12 +2226,12 @@ int __devinit snd_ymfpci_create(snd_card
>> chip->reg_area_virt = (unsigned long)ioremap_nocache(chip->reg_area_phys, 0x8000);
>> pci_set_master(pci);
>>
>>- if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "YMFPCI")) == NULL) {
>>+ if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "AYMFPCI")) == NULL) {
>> snd_ymfpci_free(chip);
>> snd_printk("unable to grab memory region 0x%lx-0x%lx\n", chip->reg_area_phys, chip->reg_area_phys + 0x8000 - 1);
>> return -EBUSY;
>> }
>>- if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "YMFPCI", (void *) chip)) {
>>+ if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "BYMFPCI", (void *) chip)) {
>> snd_ymfpci_free(chip);
>> snd_printk("unable to grab IRQ %d\n", pci->irq);
>> return -EBUSY;
>>diff -puN ./sound/pci/ymfpci/ymfpci.c~a ./sound/pci/ymfpci/ymfpci.c
>>--- 25/./sound/pci/ymfpci/ymfpci.c~a 2003-05-16 13:26:26.000000000 -0700
>>+++ 25-akpm/./sound/pci/ymfpci/ymfpci.c 2003-05-16 13:27:49.000000000 -0700
>>@@ -122,7 +122,7 @@ static int __devinit snd_card_ymfpci_pro
>> fm_port[dev] = addr;
>> }
>> if (fm_port[dev] >= 0 &&
>>- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
>>+ (chip->fm_res = request_region(fm_port[dev], 4, "CYMFPCI OPL3")) != NULL) {
>> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
>> pci_write_config_word(pci, PCIR_DSXG_FMBASE, fm_port[dev]);
>> }
>>@@ -133,7 +133,7 @@ static int __devinit snd_card_ymfpci_pro
>> mpu_port[dev] = addr;
>> }
>> if (mpu_port[dev] >= 0 &&
>>- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
>>+ (chip->mpu_res = request_region(mpu_port[dev], 2, "DYMFPCI MPU401")) != NULL) {
>> legacy_ctrl |= YMFPCI_LEGACY_MEN;
>> pci_write_config_word(pci, PCIR_DSXG_MPU401BASE, mpu_port[dev]);
>> }
>>@@ -146,7 +146,7 @@ static int __devinit snd_card_ymfpci_pro
>> default: fm_port[dev] = -1; break;
>> }
>> if (fm_port[dev] > 0 &&
>>- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
>>+ (chip->fm_res = request_region(fm_port[dev], 4, "EYMFPCI OPL3")) != NULL) {
>> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
>> } else {
>> legacy_ctrl2 &= ~YMFPCI_LEGACY2_FMIO;
>>@@ -160,7 +160,7 @@ static int __devinit snd_card_ymfpci_pro
>> default: mpu_port[dev] = -1; break;
>> }
>> if (mpu_port[dev] > 0 &&
>>- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
>>+ (chip->mpu_res = request_region(mpu_port[dev], 2, "FYMFPCI MPU401")) != NULL) {
>> legacy_ctrl |= YMFPCI_LEGACY_MEN;
>> } else {
>> legacy_ctrl2 &= ~YMFPCI_LEGACY2_MPUIO;
>>
>
>
> I've applied the patch above to a pristine 2.5.69-mm6. Curiously, if I
> build snd-ymfpci as a module, I can't reproduce the oops anymore.
> However, if I build snd-ymfpci into the kernel, I can *still* reproduce
> the oops.
>
> Attached is the dmesg of a 2.5.69-mm6 plus the above patch with ymfpci
> integrated into the kernel.
>
> Thanks!
>
> Unable to handle kernel paging request at virtual address 25007367

Unfortunately, now the address is "gs\0%".
This does not help that much. Could you please back out the above patch,
hand-edit it so that each YMFPCI -> 1YMFPCI, YMFPCI -> 2YMFPCI etc. change
looks instead like
YMFPCI -> 1MFPCI, YMFPCI -> 2MFPCI, so that the string length and the
first 3 bytes of the address stay constant, and apply it again? That may
give us better results.
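
For reference, this is how the fault addresses in this thread map to
text. It is only a minimal sketch, assuming a little-endian machine such
as x86, where the lowest byte of the address is the first character in
memory. `addr_to_ascii` is a hypothetical helper written for
illustration, not something from the kernel tree:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Render a 32-bit fault address as the four bytes it would occupy in
 * memory on a little-endian machine (lowest byte first).
 * Non-printable bytes are shown as '.'; out must hold 5 chars.
 */
void addr_to_ascii(uint32_t addr, char out[5])
{
    size_t i;

    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)((addr >> (8 * i)) & 0xff);

        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}
```

With this, 0x50464d59 reads as "YMFP" and 0x25007367 reads as "gs.%",
the '.' standing in for the NUL byte.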

Thanks,
Carl-Daniel
--
http://www.hailfinger.org/

2003-05-16 23:28:07

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting

On Sat, 2003-05-17 at 00:08, Carl-Daniel Hailfinger wrote:
> Felipe Alfaro Solana wrote:
> > On Fri, 2003-05-16 at 22:29, Andrew Morton wrote:
> >
> >>Felipe Alfaro Solana <[email protected]> wrote:
> >>
> >>>Unable to handle kernel paging request at virtual address 50464d59
> >>
> >>hm, that address is "YMFP". Please try generating the oops
> >>again with the below patch applied:
> >>
> >> ./sound/pci/ymfpci/ymfpci.c | 8 ++++----
> >> ./sound/pci/ymfpci/ymfpci_main.c | 22 +++++++++++-----------
> >> 2 files changed, 15 insertions(+), 15 deletions(-)
> >>
> >>diff -puN ./sound/pci/ymfpci/ymfpci_main.c~a ./sound/pci/ymfpci/ymfpci_main.c
> >>--- 25/./sound/pci/ymfpci/ymfpci_main.c~a 2003-05-16 13:26:26.000000000 -0700
> >>+++ 25-akpm/./sound/pci/ymfpci/ymfpci_main.c 2003-05-16 13:27:27.000000000 -0700
> >>@@ -1093,7 +1093,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
> >>
> >> if (rpcm)
> >> *rpcm = NULL;
> >>- if ((err = snd_pcm_new(chip->card, "YMFPCI", device, 32, 1, &pcm)) < 0)
> >>+ if ((err = snd_pcm_new(chip->card, "1YMFPCI", device, 32, 1, &pcm)) < 0)
> >> return err;
> >> pcm->private_data = chip;
> >> pcm->private_free = snd_ymfpci_pcm_free;
> >>@@ -1103,7 +1103,7 @@ int __devinit snd_ymfpci_pcm(ymfpci_t *c
> >>
> >> /* global setup */
> >> pcm->info_flags = 0;
> >>- strcpy(pcm->name, "YMFPCI");
> >>+ strcpy(pcm->name, "2YMFPCI");
> >> chip->pcm = pcm;
> >>
> >> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> >>@@ -1138,7 +1138,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
> >>
> >> if (rpcm)
> >> *rpcm = NULL;
> >>- if ((err = snd_pcm_new(chip->card, "YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
> >>+ if ((err = snd_pcm_new(chip->card, "3YMFPCI - AC'97", device, 0, 1, &pcm)) < 0)
> >> return err;
> >> pcm->private_data = chip;
> >> pcm->private_free = snd_ymfpci_pcm2_free;
> >>@@ -1147,7 +1147,7 @@ int __devinit snd_ymfpci_pcm2(ymfpci_t *
> >>
> >> /* global setup */
> >> pcm->info_flags = 0;
> >>- strcpy(pcm->name, "YMFPCI - AC'97");
> >>+ strcpy(pcm->name, "4YMFPCI - AC'97");
> >> chip->pcm2 = pcm;
> >>
> >> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> >>@@ -1182,7 +1182,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
> >>
> >> if (rpcm)
> >> *rpcm = NULL;
> >>- if ((err = snd_pcm_new(chip->card, "YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
> >>+ if ((err = snd_pcm_new(chip->card, "5YMFPCI - IEC958", device, 1, 0, &pcm)) < 0)
> >> return err;
> >> pcm->private_data = chip;
> >> pcm->private_free = snd_ymfpci_pcm_spdif_free;
> >>@@ -1191,7 +1191,7 @@ int __devinit snd_ymfpci_pcm_spdif(ymfpc
> >>
> >> /* global setup */
> >> pcm->info_flags = 0;
> >>- strcpy(pcm->name, "YMFPCI - IEC958");
> >>+ strcpy(pcm->name, "6YMFPCI - IEC958");
> >> chip->pcm_spdif = pcm;
> >>
> >> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> >>@@ -1226,7 +1226,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
> >>
> >> if (rpcm)
> >> *rpcm = NULL;
> >>- if ((err = snd_pcm_new(chip->card, "YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
> >>+ if ((err = snd_pcm_new(chip->card, "7YMFPCI - Rear", device, 1, 0, &pcm)) < 0)
> >> return err;
> >> pcm->private_data = chip;
> >> pcm->private_free = snd_ymfpci_pcm_4ch_free;
> >>@@ -1235,7 +1235,7 @@ int __devinit snd_ymfpci_pcm_4ch(ymfpci_
> >>
> >> /* global setup */
> >> pcm->info_flags = 0;
> >>- strcpy(pcm->name, "YMFPCI - Rear PCM");
> >>+ strcpy(pcm->name, "8YMFPCI - Rear PCM");
> >> chip->pcm_4ch = pcm;
> >>
> >> snd_pcm_lib_preallocate_pci_pages_for_all(chip->pci, pcm, 64*1024, 256*1024);
> >>@@ -1831,7 +1831,7 @@ static void snd_ymfpci_proc_read(snd_inf
> >> {
> >> // ymfpci_t *chip = snd_magic_cast(ymfpci_t, private_data, return);
> >>
> >>- snd_iprintf(buffer, "YMFPCI\n\n");
> >>+ snd_iprintf(buffer, "9YMFPCI\n\n");
> >> }
> >>
> >> static int __devinit snd_ymfpci_proc_init(snd_card_t * card, ymfpci_t *chip)
> >>@@ -2226,12 +2226,12 @@ int __devinit snd_ymfpci_create(snd_card
> >> chip->reg_area_virt = (unsigned long)ioremap_nocache(chip->reg_area_phys, 0x8000);
> >> pci_set_master(pci);
> >>
> >>- if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "YMFPCI")) == NULL) {
> >>+ if ((chip->res_reg_area = request_mem_region(chip->reg_area_phys, 0x8000, "AYMFPCI")) == NULL) {
> >> snd_ymfpci_free(chip);
> >> snd_printk("unable to grab memory region 0x%lx-0x%lx\n", chip->reg_area_phys, chip->reg_area_phys + 0x8000 - 1);
> >> return -EBUSY;
> >> }
> >>- if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "YMFPCI", (void *) chip)) {
> >>+ if (request_irq(pci->irq, snd_ymfpci_interrupt, SA_INTERRUPT|SA_SHIRQ, "BYMFPCI", (void *) chip)) {
> >> snd_ymfpci_free(chip);
> >> snd_printk("unable to grab IRQ %d\n", pci->irq);
> >> return -EBUSY;
> >>diff -puN ./sound/pci/ymfpci/ymfpci.c~a ./sound/pci/ymfpci/ymfpci.c
> >>--- 25/./sound/pci/ymfpci/ymfpci.c~a 2003-05-16 13:26:26.000000000 -0700
> >>+++ 25-akpm/./sound/pci/ymfpci/ymfpci.c 2003-05-16 13:27:49.000000000 -0700
> >>@@ -122,7 +122,7 @@ static int __devinit snd_card_ymfpci_pro
> >> fm_port[dev] = addr;
> >> }
> >> if (fm_port[dev] >= 0 &&
> >>- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
> >>+ (chip->fm_res = request_region(fm_port[dev], 4, "CYMFPCI OPL3")) != NULL) {
> >> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
> >> pci_write_config_word(pci, PCIR_DSXG_FMBASE, fm_port[dev]);
> >> }
> >>@@ -133,7 +133,7 @@ static int __devinit snd_card_ymfpci_pro
> >> mpu_port[dev] = addr;
> >> }
> >> if (mpu_port[dev] >= 0 &&
> >>- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
> >>+ (chip->mpu_res = request_region(mpu_port[dev], 2, "DYMFPCI MPU401")) != NULL) {
> >> legacy_ctrl |= YMFPCI_LEGACY_MEN;
> >> pci_write_config_word(pci, PCIR_DSXG_MPU401BASE, mpu_port[dev]);
> >> }
> >>@@ -146,7 +146,7 @@ static int __devinit snd_card_ymfpci_pro
> >> default: fm_port[dev] = -1; break;
> >> }
> >> if (fm_port[dev] > 0 &&
> >>- (chip->fm_res = request_region(fm_port[dev], 4, "YMFPCI OPL3")) != NULL) {
> >>+ (chip->fm_res = request_region(fm_port[dev], 4, "EYMFPCI OPL3")) != NULL) {
> >> legacy_ctrl |= YMFPCI_LEGACY_FMEN;
> >> } else {
> >> legacy_ctrl2 &= ~YMFPCI_LEGACY2_FMIO;
> >>@@ -160,7 +160,7 @@ static int __devinit snd_card_ymfpci_pro
> >> default: mpu_port[dev] = -1; break;
> >> }
> >> if (mpu_port[dev] > 0 &&
> >>- (chip->mpu_res = request_region(mpu_port[dev], 2, "YMFPCI MPU401")) != NULL) {
> >>+ (chip->mpu_res = request_region(mpu_port[dev], 2, "FYMFPCI MPU401")) != NULL) {
> >> legacy_ctrl |= YMFPCI_LEGACY_MEN;
> >> } else {
> >> legacy_ctrl2 &= ~YMFPCI_LEGACY2_MPUIO;
> >>
> >
> >
> > I've applied the patch above to a pristine 2.5.69-mm6. Curiously, if I
> > build snd-ymfpci as a module, I can't reproduce the oops anymore.
> > However, if I build snd-ymfpci into the kernel, I can *still* reproduce
> > the oops.
> >
> > Attached is the dmesg of a 2.5.69-mm6 plus the above patch with ymfpci
> > integrated into the kernel.
> >
> > Thanks!
> >
> > Unable to handle kernel paging request at virtual address 25007367
>
> Unfortunately, now the address is gs\0%
> This does not help that much. Could you please backout above patch, hand
> edit it so that each YMFPCI -> 1YMFPCI, YMFPCI -> 2YMFPCI etc. change
> looks instead like
> YMFPCI -> 1MFPCI, YMFPCI -> 2MFPCI so that the string length and the
> first 3 bytes of the address stay constant and apply it again? That may
> give us better results.

This is getting tricky. How about this one?
Attached is "ymfpci2.patch" with your suggested changes, and "dmesg"
with the new oops info.

Hope this helps. Thanks again!


Attachments:
ymfpci2.patch (5.55 kB)
dmesg (8.55 kB)

2003-05-16 23:42:55

by Russell King

Subject: Re: 2.5.69-mm6: pccard oops while booting

On Sat, May 17, 2003 at 01:40:26AM +0200, Felipe Alfaro Solana wrote:
> This is getting tricky. How about this one?
> Attached is "ymfpci2.patch" with your suggested changes, and "dmesg"
> with the new oops info.

You need to reproduce the oops you get when you modprobe the module.
The oops with this driver built in is different, and akpm's changes
won't tell us which one causes the problem.

Instead of adding a character to each of those strings, could you
remove the 'Y' character so the strings remain the same length as
the original - that may cause the oops to reappear.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-05-16 23:50:24

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting

On Sat, 2003-05-17 at 01:55, Russell King wrote:
> On Sat, May 17, 2003 at 01:40:26AM +0200, Felipe Alfaro Solana wrote:
> > This is getting tricky. How about this one?
> > Attached is "ymfpci2.patch" with your suggested changes, and "dmesg"
> > with the new oops info.
>
> You need to reproduce the oops you get when you modprobe the module.
> The oops with this driver built in is different, and akpm's changes
> won't tell us which one causes the problem.
>
> Instead of adding a character to each of those strings, could you
> remove the 'Y' character so the strings remain the same length as
> the original - that may cause the oops to reappear.

Yeah! That's exactly what Carl proposed in a previous message. So I
did, but now I can't reproduce the oops with ymfpci compiled as a
module. I can only reproduce the oops if ymfpci is built into the
kernel.

Whoops! I'm lost. I'm tired and it's too late, so I'd better get some
sleep and try to guess a little bit more tomorrow.

Thanks!

2003-05-17 00:14:44

by Carl-Daniel Hailfinger

Subject: Re: 2.5.69-mm6: pccard oops while booting

Russell King wrote:
> On Sat, May 17, 2003 at 01:40:26AM +0200, Felipe Alfaro Solana wrote:
>
>>This is getting tricky. How about this one?
>>Attached is "ymfpci2.patch" with your suggested changes, and "dmesg"
>>with the new oops info.
>
>
> You need to reproduce the oops you get when you modprobe the module.
> The oops with this driver built in is different, and akpm's changes
> won't tell us which one causes the problem.

True. Just a stab in the dark - leaving KOBJ_NAME_LEN == 20 and
initializing the first four and last four bytes of the KOBJ_NAME_LEN
sized buffer with a counter starting at 0 might also prove very
interesting and could help resolve the Oops with the driver built in.
Motivation: Somehow, the disaster smells like somebody uses a hardcoded
offset designed to work only if KOBJ_NAME_LEN == 16.
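
One possible reading of the tagging idea, as a rough sketch only: stamp
both ends of each name buffer with a per-buffer counter, so that a bogus
pointer picked up from either end shows up in the oops as a recognisable
small number. `NAME_LEN`, `tag_name_buffer` and `read_tag` are
hypothetical names for illustration; the real buffer lives inside
struct kobject and this is not a patch against it.

```c
#include <stdint.h>
#include <string.h>

#define NAME_LEN 20  /* hypothetical stand-in for KOBJ_NAME_LEN */

/*
 * Stamp the first four and last four bytes of a name buffer with a
 * counter value, so a stray load from either end of the buffer can be
 * traced back to the buffer that carries that value.
 */
void tag_name_buffer(char buf[NAME_LEN], uint32_t counter)
{
    memcpy(buf, &counter, sizeof(counter));
    memcpy(buf + NAME_LEN - sizeof(counter), &counter, sizeof(counter));
}

/* Read back the tag at the start (from_end == 0) or at the end. */
uint32_t read_tag(const char buf[NAME_LEN], int from_end)
{
    uint32_t tag;

    memcpy(&tag, from_end ? buf + NAME_LEN - sizeof(tag) : buf, sizeof(tag));
    return tag;
}
```

The caller would bump the counter for every buffer it tags. Note that
this overwrites the name itself, so it is only good for a throwaway
debugging kernel.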

I would provide a patch, but I don't have the source handy right now due
to disk space constraints.


Regards,
Carl-Daniel
--
http://www.hailfinger.org/

2003-05-17 09:58:57

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

I've been working on this a little this morning and these are the new
conclusions I've drawn:

1. The oops is definitely being caused by the YMFPCI driver. I have
built a mini-kernel by squashing the config to a minimum (disabled
modules and preemption, removed USB, IDE, AGP, networking, CardBus, and
nearly everything else possible) and the kernel still faults.

2. Compiling the kernel with frame pointers causes a panic during boot
instead of an oops. Thus, I've changed my config to use frame pointers.
I don't know if this will make debugging a little harder or not, but I
think it's interesting to use them as the call trace is a little
different this time.

Attached to this message are the following files:

"ymfpci2.patch" which replaces the "YMFPCI" string with "1MFPCI",
"2MFPCI" and so on. This patch should be applied against a pristine,
clean 2.5.69-mm6 kernel tree.

"config" contains the configuration I used to build the kernel. Nearly
everything has been left off: no swapping, no modules, no preemption, no
ACPI, no APM, no power management, no IDE, no USB, no AGP, no
networking, etc. I did this to keep the number of additional drivers to
a minimum and to be able to concentrate on the faulting driver.

"oops" contains the kernel oops when booting 2.5.69-mm6 + ymfpci2.patch
using the above config file.

Hope this makes things easier for you guys, because this thing is
getting bigger than me.

Thanks and have a nice weekend!


Attachments:
config (7.91 kB)
oops (1.07 kB)
ymfpci2.patch (5.55 kB)

2003-05-17 10:04:00

by Andrew Morton

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

Felipe Alfaro Solana <[email protected]> wrote:
>
> "oops" contains the kernel oops when booting 2.5.69-mm6 + ymfpci2.patch
> using the above config file.

Bummer. Vital info is chopped off the top of the oops output.

2003-05-17 10:53:22

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

On Sat, 2003-05-17 at 12:18, Andrew Morton wrote:
> Bummer. Vital info is chopped off the top of the oops output.

Well, I'm an idiot. Please disregard my previous mail. This one contains
the full oops, config and patch:

---
I've been working on this a little this morning and these are the new
conclusions I've drawn:

1. The oops is definitely being caused by the YMFPCI driver. I have
built a mini-kernel by squashing the config to a minimum (disabled
modules and preemption, removed USB, IDE, AGP, networking, CardBus, and
nearly everything else possible) and the kernel still faults.

2. Compiling the kernel with frame pointers causes a panic during boot
instead of an oops. Thus, I've changed my config to use frame pointers.
I don't know if this will make debugging a little harder or not, but I
think it's interesting to use them as the call trace is a little
different this time.

Attached to this message are the following files:

"ymfpci2.patch" which replaces the "YMFPCI" string with "1MFPCI",
"2MFPCI" and so on. This patch should be applied against a pristine,
clean 2.5.69-mm6 kernel tree.

"config" contains the configuration I used to build the kernel. Nearly
everything has been left off: no swapping, no modules, no preemption, no
ACPI, no APM, no power management, no IDE, no USB, no AGP, no
networking, etc. I did this to keep the number of additional drivers to
a minimum and to be able to concentrate on the faulting driver.

"oops" contains the kernel oops when booting 2.5.69-mm6 + ymfpci2.patch
using the above config file.

Hope this makes things easier for you guys, because this thing is
getting bigger than me.
---


Attachments:
config (7.91 kB)
oops (1.42 kB)
ymfpci2.patch (5.55 kB)

2003-05-17 11:09:31

by Carl-Daniel Hailfinger

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

Felipe Alfaro Solana wrote:
> On Sat, 2003-05-17 at 12:18, Andrew Morton wrote:
>
>>Bummer. Vital info is chopped off the top of the oops output.
>
> I've been working on this a little this morning and these are the new
> conclusions I've drawn:
>
> 1. Definitely, the oops is being caused by the YMFPCI driver. I have
> built a mini-kernel by squashing the config to a minimum (disabled
> modules, preemptive, removed USB, IDE, AGP, Networking, CardBus, and

Could you please enable modules again and load ymfpci as module? This is
supposed to give the best results with ymfpci2.patch. For ymfpci built
in, the patch unfortunately does not help much.

> nearly everything possible) and the kernel still faults.

Thanks for your patience,
Carl-Daniel

2003-05-17 11:25:06

by Andrew Morton

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

Felipe Alfaro Solana <[email protected]> wrote:
>
> Unable to handle kernel paging request at virtual address fceec0d7
> printing eip:
> c016954f
> *pde = 00000000
> Oops: 0000 [#1]
> CPU: 0
> EIP: 0060:[<c016954f>] Not tainted VLI
> EFLAGS: 00010246
> EIP is at sys_create_link+0xcf/0x130

bah. That's totally different :(

But there seems to be a bug in there.

--- 25/fs/sysfs/symlink.c~sysfs_create_link-fix 2003-05-17 04:34:50.000000000 -0700
+++ 25-akpm/fs/sysfs/symlink.c 2003-05-17 04:34:56.000000000 -0700
@@ -80,7 +80,7 @@ int sysfs_create_link(struct kobject * k
char * s;

depth = object_depth(kobj);
- size = object_path_length(target) + depth * 3 - 1;
+ size = object_path_length(target) + depth * 3 + 1;
if (size > PATH_MAX)
return -ENAMETOOLONG;
pr_debug("%s: depth = %d, size = %d\n",__FUNCTION__,depth,size);




That probably won't fix it though.
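
To see why a size that comes out two bytes short matters here, consider
the arithmetic for a relative link made of `depth` "../" components
followed by a target path. The sketch below does not reproduce the
sysfs code (object_path_length() may already account for some of these
bytes); `relative_link_size` and `build_relative_link` are hypothetical
helpers that only illustrate the counting, including the terminating
NUL:

```c
#include <stddef.h>
#include <string.h>

/*
 * Bytes needed for a relative link of the form "../../<target>":
 * three characters per "../" component, the target path itself, and
 * one more byte for the terminating NUL.
 */
size_t relative_link_size(size_t depth, const char *target)
{
    return depth * 3 + strlen(target) + 1;
}

/* Build the link into out; returns 0 on success, -1 if out is too small. */
int build_relative_link(char *out, size_t out_size, size_t depth,
                        const char *target)
{
    char *s = out;
    size_t i;

    if (out_size < relative_link_size(depth, target))
        return -1;
    for (i = 0; i < depth; i++, s += 3)
        memcpy(s, "../", 3);
    strcpy(s, target);  /* also writes the terminating NUL */
    return 0;
}
```

A buffer that is even one byte smaller than relative_link_size() would
have the final strcpy() write past its end, which is the kind of
overrun an undersized allocation invites.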

2003-05-17 12:24:03

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 2

On Sat, 2003-05-17 at 13:39, Andrew Morton wrote:
> Felipe Alfaro Solana <[email protected]> wrote:
> >
> > Unable to handle kernel paging request at virtual address fceec0d7
> > printing eip:
> > c016954f
> > *pde = 00000000
> > Oops: 0000 [#1]
> > CPU: 0
> > EIP: 0060:[<c016954f>] Not tainted VLI
> > EFLAGS: 00010246
> > EIP is at sys_create_link+0xcf/0x130
>
> bah. That's totally different :(
>
> But there seems to be a bug in there.
>
> --- 25/fs/sysfs/symlink.c~sysfs_create_link-fix 2003-05-17 04:34:50.000000000 -0700
> +++ 25-akpm/fs/sysfs/symlink.c 2003-05-17 04:34:56.000000000 -0700
> @@ -80,7 +80,7 @@ int sysfs_create_link(struct kobject * k
> char * s;
>
> depth = object_depth(kobj);
> - size = object_path_length(target) + depth * 3 - 1;
> + size = object_path_length(target) + depth * 3 + 1;
> if (size > PATH_MAX)
> return -ENAMETOOLONG;
> pr_debug("%s: depth = %d, size = %d\n",__FUNCTION__,depth,size);
>
> That probably won't fix it though.

I'm sorry to say the above patch doesn't fix the problem :-(


2003-05-17 12:39:10

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting: round 3

On Sat, 2003-05-17 at 13:22, Carl-Daniel Hailfinger wrote:

> Could you please enable modules again and load ymfpci as module? This is
> supposed to give the best results with ymfpci2.patch. For ymfpci built
> in, the patch unfortunately does not help much.

OK, we're getting closer, I think :-)

Attached to the message are: (a) "dmesg" with the oops caused by
modprobing snd-ymfpci, (b) the new "config" file used to build the
kernel, (c) "ymfpci2.patch" and (d) "sysfs.patch" with Andrew's latest
patch to sysfs_create_link function.

As always, from a pristine 2.5.69-mm6 kernel, apply "ymfpci2.patch",
then "sysfs.patch" and then use the supplied "config" to build. Boot
this test kernel into runlevel 1 and then "modprobe snd-ymfpci" causes
the oops described in the "dmesg" file.

Man, this is a hard one to chase. Thank you guys for all your hard work!
Thanks^2!


Attachments:
config (9.44 kB)
dmesg (5.88 kB)
sysfs.patch (475.00 B)
ymfpci2.patch (5.55 kB)

2003-05-18 19:26:07

by Felipe Alfaro Solana

Subject: Re: 2.5.69-mm6: pccard oops while booting: gcc bug?

I've read the announcement of gcc 3.3 and saw that gcc 3.2 is not yet
supported for linux kernel compilations (I've been using Red Hat's
gcc-3.2.3-4 to compile 2.5.69-mm6). So I thought, what would happen if I
use gcc 2.96 to compile the kernel instead?

And voilà... I've compiled 2.5.69-mm6 with Red Hat's 2.96.118 and now,
I'm unable to reproduce the pccard oops you've been trying to chase
down. Does this mean the pccard oops was caused by a compiler bug?

RFC is welcome.
Thanks!

2003-05-22 13:11:08

by Carl-Daniel Hailfinger

Subject: [RFC] Disallow compilation with gcc 3.2.3 (was: Re: 2.5.69-mm6: pccard oops while booting:)

Felipe Alfaro Solana wrote:
> I've read the announcement of gcc 3.3 and saw that gcc 3.2 is not yet
> supported for linux kernel compilations (I've been using Red Hat's
> gcc-3.2.3-4 to compile 2.5.69-mm6). So I thought, what would happen if I
> use gcc 2.96 to compile the kernel instead?
>
> And voilà... I've compiled 2.5.69-mm6 with Red Hat's 2.96.118 and now,
> I'm unable to reproduce the pccard oops you've been trying to chase
> down. Does this mean the pccard oops was caused by a compiler bug?

Nobody has found an error in the code we talked about, so a compiler bug
in gcc 3.2.3 seems to be the only explanation.

Thoughts?
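
As a starting point for discussion, a check along these lines could flag
the suspect compiler. This is only a sketch, not a proposed patch: it
relies on the version macros gcc predefines (__GNUC__, __GNUC_MINOR__,
__GNUC_PATCHLEVEL__), `gcc_version_is_suspect` is a hypothetical helper,
and it warns rather than refusing to build, since the compiler bug is
still unconfirmed.

```c
/*
 * Returns 1 if the given compiler version is the one under suspicion
 * in this thread (gcc 3.2.3), 0 otherwise.
 */
int gcc_version_is_suspect(int major, int minor, int patchlevel)
{
    return major == 3 && minor == 2 && patchlevel == 3;
}

/*
 * Compile-time form of the same check. Swapping #warning for #error
 * would disallow the build outright.
 */
#if defined(__GNUC__) && defined(__GNUC_PATCHLEVEL__)
#if __GNUC__ == 3 && __GNUC_MINOR__ == 2 && __GNUC_PATCHLEVEL__ == 3
#warning "gcc 3.2.3 detected: suspected of miscompiling this code"
#endif
#endif
```

Note that a vendor compiler such as Red Hat's gcc-3.2.3-4 reports the
same three numbers as a stock 3.2.3, so a check like this cannot tell
them apart.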

2003-05-22 13:52:42

by Valdis Klētnieks

Subject: Re: [RFC] Disallow compilation with gcc 3.2.3 (was: Re: 2.5.69-mm6: pccard oops while booting:)

On Thu, 22 May 2003 15:24:06 +0200, Carl-Daniel Hailfinger said:

> Nobody has found an error in the code we talked about, so a compiler bug
> in gcc 3.2.3 seems to be the only explanation.

In the last 20 years, I've come across lots of cases where optimizers did
foolish things that broke code. I've also come across the odd case or three
where the optimizer merely exposed a bug. Favorite cases here are where
the optimizer removes what it thinks is a dead/redundant load/store, and
exposes a race condition on a variable that should have been 'volatile' but
wasn't, odd corner cases where sequence points actually matter (one of these was
just posted here the other day, in fact)... stuff like that.

So yes. It's probably something borked in gcc 3.2.3 - but we probably won't
know for sure till somebody goes over the assembler output with a fine tooth
comb...
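
A small userspace illustration of the 'volatile' case, not taken from
the kernel and with hypothetical names throughout: a flag written by a
signal handler and polled by normal code. Without the volatile
qualifier the optimizer is entitled to load the flag once and hoist the
read out of the loop, and the code only appears to work until the
optimizer gets smart enough to do so.

```c
#include <signal.h>

/*
 * Flag written by a signal handler and polled by normal code.
 * "volatile" forces a fresh read on every loop iteration.
 */
static volatile sig_atomic_t flag_set = 0;

static void on_signal(int signo)
{
    (void)signo;
    flag_set = 1;
}

/* Returns 1 once the handler has run, -1 if it could not be set up. */
int wait_for_flag(void)
{
    flag_set = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (raise(SIGINT) != 0)
        return -1;
    while (!flag_set)
        ;  /* re-reads flag_set each time because it is volatile */
    signal(SIGINT, SIG_DFL);
    return 1;
}
```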


Attachments:
(No filename) (226.00 B)

2003-05-22 14:13:56

by Russell King

Subject: Re: 2.5.69-mm6: pccard oops while booting: gcc bug?

On Sun, May 18, 2003 at 09:38:53PM +0200, Felipe Alfaro Solana wrote:
> I've read the announcement of gcc 3.3 and saw that gcc 3.2 is not yet
> supported for linux kernel compilations (I've been using Red Hat's
> gcc-3.2.3-4 to compile 2.5.69-mm6). So I thought, what would happen if I
> use gcc 2.96 to compile the kernel instead?
>
> And voilà... I've compiled 2.5.69-mm6 with Red Hat's 2.96.118 and now,
> I'm unable to reproduce the pccard oops you've been trying to chase
> down. Does this mean the pccard oops was caused by a compiler bug?

Interesting. We know GCC 3.2.x produces wrong code on ARM without
the patch in PR8896 (http://gcc.gnu.org/cgi-bin/gnatsweb.pl?cmd=view&pr=8896)
being applied. GCC people aren't happy about applying this patch
because it touches the generic reload code, which apparently is
sacred voodoo.

I'm wondering if this problem isn't only ARM, but affects others as
well. (The result is that gcc pokes '4' instead of '3' into the ELF
AUX entries as the value of AT_PHDR.)

Obviously, if x86 is affected and this patch fixes the problem, there's
more motivation for the GCC people to include this fix. Someone needs
to track down what's going wrong, and then gcc people need to comment.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2003-05-22 18:21:33

by Felipe Alfaro Solana

Subject: Re: [RFC] Disallow compilation with gcc 3.2.3 (was: Re: 2.5.69-mm6: pccard oops while booting:)

On Thu, 2003-05-22 at 15:24, Carl-Daniel Hailfinger wrote:
> Felipe Alfaro Solana wrote:
> > I've read the announcement of gcc 3.3 and saw that gcc 3.2 is not yet
> > supported for linux kernel compilations (I've been using Red Hat's
> > gcc-3.2.3-4 to compile 2.5.69-mm6). So I thought, what would happen if I
> > use gcc 2.96 to compile the kernel instead?
> >
> > And voilà... I've compiled 2.5.69-mm6 with Red Hat's 2.96.118 and now,
> > I'm unable to reproduce the pccard oops you've been trying to chase
> > down. Does this mean the pccard oops was caused by a compiler bug?
>
> Nobody has found an error in the code we talked about, so a compiler bug
> in gcc 3.2.3 seems to be the only explanation.

I would say it's indeed a bug with gcc 3.2.3. I'm now compiling using
2.96 and the problem has disappeared completely. Maybe I'll give gcc 3.3
a try to see how it works.

2003-05-22 19:08:50

by Andrew Morton

Subject: Re: [RFC] Disallow compilation with gcc 3.2.3 (was: Re: 2.5.69-mm6: pccard oops while booting:)

Carl-Daniel Hailfinger <[email protected]> wrote:
>
> Felipe Alfaro Solana wrote:
> > I've read the announcement of gcc 3.3 and saw that gcc 3.2 is not yet
> > supported for linux kernel compilations (I've been using Red Hat's
> > gcc-3.2.3-4 to compile 2.5.69-mm6). So I thought, what would happen if I
> > use gcc 2.96 to compile the kernel instead?
> >
> > And voilà... I've compiled 2.5.69-mm6 with Red Hat's 2.96.118 and now,
> > I'm unable to reproduce the pccard oops you've been trying to chase
> > down. Does this mean the pccard oops was caused by a compiler bug?
>
> Nobody has found an error in the code we talked about, so a compiler bug
> in gcc 3.2.3 seems to be the only explanation.
>

It could be due to some incorrect code in the kernel, but we got away with
it when using earlier compilers. We need to work out precisely where and
why it went wrong.