2004-10-06 22:16:22

by Richard B. Johnson

[permalink] [raw]
Subject: Probable module bug in linux-2.6.5-1.358


The attached script shows that an attempt to open a device
after its module was removed, will seg-fault the kernel.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.


Attachments:
typescript (7.54 kB)

2004-10-06 23:27:52

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Wed, 2004-10-06 at 18:08 -0400, Richard B. Johnson wrote:
> The attached script shows that an attempt to open a device
> after its module was removed, will seg-fault the kernel.
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
> Note 96.31% of all statistics are fiction.



Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module
removed

The bug is in that driver. It needs to unregister the character device
in it's module remove routine. It doesn't appear to be in the main
kernel source tree so bug Redhat or the vendor.

2004-10-06 23:27:51

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Wed, 2004-10-06 at 18:08 -0400, Richard B. Johnson wrote:
> The attached script shows that an attempt to open a device
> after its module was removed, will seg-fault the kernel.
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
> Note 96.31% of all statistics are fiction.



Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module
removed

The bug is in that driver. It needs to unregister the character device
in it's module remove routine. It doesn't appear to be in the main
kernel source tree so bug Redhat or the vendor.

2004-10-06 23:44:37

by Alan

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Iau, 2004-10-07 at 00:20, Stephen Hemminger wrote:
> Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module
> removed
>
> The bug is in that driver. It needs to unregister the character device
> in it's module remove routine. It doesn't appear to be in the main
> kernel source tree so bug Redhat or the vendor.

Not Red Hat shipped, appears to be his own code.

2004-10-07 10:01:07

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 2004-10-07 at 00:08, Richard B. Johnson wrote:
> The attached script shows that an attempt to open a device
> after its module was removed, will seg-fault the kernel.

Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module
removed

Oct 6 17:03:38 chaos kernel: EIP: 0060:[<021556ad>] Not tainted
Oct 6 17:03:38 chaos kernel: EFLAGS: 00010206 (2.6.5-1.358-noreg)

since your module apparently is gpl (the kernel isn't tainted), can you
post a URL to the sourcecode so that we can point the bug out to you ?


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-10-07 11:50:45

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Wed, 6 Oct 2004, Stephen Hemminger wrote:

> On Wed, 2004-10-06 at 18:08 -0400, Richard B. Johnson wrote:
>> The attached script shows that an attempt to open a device
>> after its module was removed, will seg-fault the kernel.
>>
>> Cheers,
>> Dick Johnson
>> Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
>> Note 96.31% of all statistics are fiction.
>
>
>
> Oct 6 17:03:30 chaos kernel: Analogic Corp Datalink Driver : Module
> removed
>
> The bug is in that driver. It needs to unregister the character device
> in it's module remove routine. It doesn't appear to be in the main
> kernel source tree so bug Redhat or the vendor.
>


It certainly does unregister the device.........

if((ret = unregister_chrdev(MAJOR_NR, devname)) < 0)
{
printk(KERN_ALERT"%s : Can't unregister major number %d (%d)\n",
devname, MAJOR_NR, ret);
return;
}
free_resources();
printk(KERN_INFO"%s : Module removed\n", devname);
}

.... and this works fine in 2.4.x




Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 12:32:58

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, Oct 07, 2004 at 08:26:22AM -0400, Richard B. Johnson wrote:
> On Thu, 7 Oct 2004, Arjan van de Ven wrote:
>
> >On Thu, Oct 07, 2004 at 08:01:47AM -0400, Richard B. Johnson wrote:
> >>Also, when this driver is running, transferring large volumes
> >>of data, the kernel decides that there have been too many interrupts, and
> >>does:
> >>
> >>Message from syslogd@chaos at Wed Oct 6 21:22:57 2004 ...
> >>chaos kernel: Disabling IRQ #18
> >>
> >>This, in spite of the fact that interrupts occur only when
> >>DMA completion happens and new data are available, i.e.,
> >>one interrupt every 16 megabytes of data transferred.
> >>
> >>Who decided that it had a right to disable my interrupt????
> >
> >the kernel did because you don't return the proper value for "I handled the
> >IRQ" from your ISR.
>
> Do you know what that value is? I can't find it. I just returned 0
> and it worked for awhile.

IRQ_HANDLED is you handled the irq, IRQ_NONE if you didn't


> The kernel calls cleanup_module() and the printk() shows that it
> was truly called.

I fail to find where you declare module_exit() in your sources


Attachments:
(No filename) (1.09 kB)
(No filename) (189.00 B)
Download all attachments

2004-10-07 12:43:07

by Arjan van de Ven

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, Oct 07, 2004 at 08:01:47AM -0400, Richard B. Johnson wrote:
> Also, when this driver is running, transferring large volumes
> of data, the kernel decides that there have been too many interrupts, and
> does:
>
> Message from syslogd@chaos at Wed Oct 6 21:22:57 2004 ...
> chaos kernel: Disabling IRQ #18
>
> This, in spite of the fact that interrupts occur only when
> DMA completion happens and new data are available, i.e.,
> one interrupt every 16 megabytes of data transferred.
>
> Who decided that it had a right to disable my interrupt????

the kernel did because you don't return the proper value for "I handled the
IRQ" from your ISR.

Also I don't see where you call cleanup_module(), the function that does the
deregistration of the chardev... where do you call that ????

2004-10-07 12:41:56

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Arjan van de Ven wrote:

> On Thu, Oct 07, 2004 at 08:01:47AM -0400, Richard B. Johnson wrote:
>> Also, when this driver is running, transferring large volumes
>> of data, the kernel decides that there have been too many interrupts, and
>> does:
>>
>> Message from syslogd@chaos at Wed Oct 6 21:22:57 2004 ...
>> chaos kernel: Disabling IRQ #18
>>
>> This, in spite of the fact that interrupts occur only when
>> DMA completion happens and new data are available, i.e.,
>> one interrupt every 16 megabytes of data transferred.
>>
>> Who decided that it had a right to disable my interrupt????
>
> the kernel did because you don't return the proper value for "I handled the
> IRQ" from your ISR.

Do you know what that value is? I can't find it. I just returned 0
and it worked for awhile.

>
> Also I don't see where you call cleanup_module(), the function that does the
> deregistration of the chardev... where do you call that ????
>

The kernel calls cleanup_module() and the printk() shows that it
was truly called.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 12:47:43

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Arjan van de Ven wrote:

> On Thu, Oct 07, 2004 at 08:26:22AM -0400, Richard B. Johnson wrote:
>> On Thu, 7 Oct 2004, Arjan van de Ven wrote:
>>
>>> On Thu, Oct 07, 2004 at 08:01:47AM -0400, Richard B. Johnson wrote:
>>>> Also, when this driver is running, transferring large volumes
>>>> of data, the kernel decides that there have been too many interrupts, and
>>>> does:
>>>>
>>>> Message from syslogd@chaos at Wed Oct 6 21:22:57 2004 ...
>>>> chaos kernel: Disabling IRQ #18
>>>>
>>>> This, in spite of the fact that interrupts occur only when
>>>> DMA completion happens and new data are available, i.e.,
>>>> one interrupt every 16 megabytes of data transferred.
>>>>
>>>> Who decided that it had a right to disable my interrupt????
>>>
>>> the kernel did because you don't return the proper value for "I handled the
>>> IRQ" from your ISR.
>>
>> Do you know what that value is? I can't find it. I just returned 0
>> and it worked for awhile.
>
> IRQ_HANDLED is you handled the irq, IRQ_NONE if you didn't
>
>
>> The kernel calls cleanup_module() and the printk() shows that it
>> was truly called.
>
> I fail to find where you declare module_exit() in your sources
>

Well I don't. Is it now required? What does it do? If I put
in module_exit() and have it execute cleanup_module(), it
barfs badly. Do I just make a dummy module_exit() that
does nothing? Does this mean that unregister_chrdev() didn't
really happen until module_exit() is called?


Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 13:05:04

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Richard B. Johnson wrote:

> On Thu, 7 Oct 2004, Arjan van de Ven wrote:
>
>> On Thu, Oct 07, 2004 at 08:26:22AM -0400, Richard B. Johnson wrote:
>>> On Thu, 7 Oct 2004, Arjan van de Ven wrote:
>>>
>>>> On Thu, Oct 07, 2004 at 08:01:47AM -0400, Richard B. Johnson wrote:
>>>>> Also, when this driver is running, transferring large volumes
>>>>> of data, the kernel decides that there have been too many interrupts,
>>>>> and
>>>>> does:
>>>>>
>>>>> Message from syslogd@chaos at Wed Oct 6 21:22:57 2004 ...
>>>>> chaos kernel: Disabling IRQ #18
>>>>>
>>>>> This, in spite of the fact that interrupts occur only when
>>>>> DMA completion happens and new data are available, i.e.,
>>>>> one interrupt every 16 megabytes of data transferred.
>>>>>
>>>>> Who decided that it had a right to disable my interrupt????
>>>>
>>>> the kernel did because you don't return the proper value for "I handled
>>>> the
>>>> IRQ" from your ISR.
>>>
>>> Do you know what that value is? I can't find it. I just returned 0
>>> and it worked for awhile.
>>
>> IRQ_HANDLED is you handled the irq, IRQ_NONE if you didn't
>>


Okay. I'll fix that.


>>
>>> The kernel calls cleanup_module() and the printk() shows that it
>>> was truly called.
>>
>> I fail to find where you declare module_exit() in your sources
>>
>
> Well I don't. Is it now required? What does it do? If I put
> in module_exit() and have it execute cleanup_module(), it

I found it..........

module_exit() is just a wrapper around cleanup_module().

cleanup_module() gets called. I could rename cleanup_module to
foo() and then use module_exit(foo);, which seems obtuse
since this has to compile on 2.4.x also, clearly not the
right thing to do.

`dmesg` clearly shown that the exit routine was called....

PCI: Enabling device 0000:02:04.0 (0106 -> 0107)
Analogic Corp Datalink Driver : Installed 12d6:8004 IRQ18 slot:0204 DMA:1f401000
Analogic Corp Datalink Driver : Initialization complete
Analogic Corp Datalink Driver : Module removed


Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 17:28:56

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

Still haven't full source so this is still guess work.
But assuming it is a character device, did you forget to add an owner
field to the file ops structure?

static struct file_operations xxx_fops = {
.owner = THIS_MODULE,
.read = my_read,
...

The owner field is used by the character device layer to maintain module
ref counts in 2.6.



2004-10-07 19:12:52

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

Since you want to play fast and loose with the GPL and modules
licensing, I see no reason to help you debug your problem.

If you can reproduce the same problem with some GPL version of
standalone (even dummy) code, then come back and see us sometime.

--------------
/*
* Since some in the Linux-kernel development group want to play
* lawyer, and require that a GPL License exist for every kernel
* module, I provide the following:
*
* Everything in this file (only) is released under the so-called
* GNU Public License, incorporated herein by reference.
*
* Now, we just link this with any proprietary code and everybody
* but the lawyers are happy.
*/
#ifndef __KERNEL__
#define __KERNEL__
#endif
#ifndef MODULE
#define MODULE
#endif
#include <linux/module.h>
#if defined(MODULE_LICENSE)
MODULE_LICENSE("GPL");
#endif


2004-10-07 19:15:30

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Stephen Hemminger wrote:

> Since you want to play fast and loose with the GPL and modules
> licensing, I see no reason to help you debug your problem.
>
> If you can reproduce the same problem with some GPL version of
> standalone (even dummy) code, then come back and see us sometime.
>

Well it isn't my problem. It's a kernel problem that only exists
in 2.6.n kernels. Versions 2.4.x work fine. And what are you
bitching for? You have the entire source-code.

Further, it exists in a dummy driver that impliments a dummy
open(), close(), and ioctl().

Once a character-device module is installed, then removed. If
you attempt to open() the module again, instead of the kernel
reloading it, it crashes. The last thing executed is
sys_open(). It should have stopped at chrdev_open(), but
the code is broken.


> --------------
> /*
> * Since some in the Linux-kernel development group want to play
> * lawyer, and require that a GPL License exist for every kernel
> * module, I provide the following:
> *
> * Everything in this file (only) is released under the so-called
> * GNU Public License, incorporated herein by reference.
> *
> * Now, we just link this with any proprietary code and everybody
> * but the lawyers are happy.
> */
> #ifndef __KERNEL__
> #define __KERNEL__
> #endif
> #ifndef MODULE
> #define MODULE
> #endif
> #include <linux/module.h>
> #if defined(MODULE_LICENSE)
> MODULE_LICENSE("GPL");
> #endif
>
>

Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 19:31:12

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Stephen Hemminger wrote:

[SNIPPED...]


> If you can reproduce the same problem with some GPL version of
> standalone (even dummy) code, then come back and see us sometime.
>

Best I can see in the sources, the only place cd_forget() (for
removing a character device) is called is from clear_inode()
in inode.c. This is never called by unregister_chrdev().



Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 20:05:34

by Alan

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Iau, 2004-10-07 at 20:05, Stephen Hemminger wrote:
> --------------
> /*
> * Since some in the Linux-kernel development group want to play
> * lawyer, and require that a GPL License exist for every kernel
> * module, I provide the following:
> *
> * Everything in this file (only) is released under the so-called
> * GNU Public License, incorporated herein by reference.
> *
> * Now, we just link this with any proprietary code and everybody
> * but the lawyers are happy.
> */

What a fascinating object. I hope thats not reflective of OSDL policy 8)

Is fascinating because my first thought was that if they sign the Induce
act it would be a criminal offence to have it in the USA since its
clearly an incitement and my second thought was that the Bernstein case
appears to argue its protected speech. Interesting times 8)

More seriously the goal of MODULE_LICENSE has never been to -enforce-
GPL licensing. It provides help in understanding what symbols are
definitely off limits and it allows people to identify proprietary stuff
loaded into a system to filter bug reports.

The law on derivative works and copyright in general, murkly alas as it
is, does the enforcing, and unfortunately it is becoming apparent that
the free software world is going to have to go out soon and crack down
hard on abusers, especially those simply shipping Linux binaries with no
source or GPL information.

Alan

2004-10-07 20:31:13

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 2004-10-07 at 19:59 +0100, Alan Cox wrote:
> On Iau, 2004-10-07 at 20:05, Stephen Hemminger wrote:
> > --------------
> > /*
> > * Since some in the Linux-kernel development group want to play
> > * lawyer, and require that a GPL License exist for every kernel
> > * module, I provide the following:
> > *
> > * Everything in this file (only) is released under the so-called
> > * GNU Public License, incorporated herein by reference.
> > *
> > * Now, we just link this with any proprietary code and everybody
> > * but the lawyers are happy.
> > */
>
> What a fascinating object. I hope thats not reflective of OSDL policy 8)

That's from Richard's license.c in the source he attached.
All copies of that code have been deleted from here because the rest of
it is proprietary.

2004-10-07 21:11:21

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Alan Cox wrote:

> On Iau, 2004-10-07 at 20:05, Stephen Hemminger wrote:
>> --------------
>> /*
>> * Since some in the Linux-kernel development group want to play
>> * lawyer, and require that a GPL License exist for every kernel
>> * module, I provide the following:
>> *
>> * Everything in this file (only) is released under the so-called
>> * GNU Public License, incorporated herein by reference.
>> *
>> * Now, we just link this with any proprietary code and everybody
>> * but the lawyers are happy.
>> */
>
> What a fascinating object. I hope thats not reflective of OSDL policy 8)
>
> Is fascinating because my first thought was that if they sign the Induce
> act it would be a criminal offence to have it in the USA since its
> clearly an incitement and my second thought was that the Bernstein case
> appears to argue its protected speech. Interesting times 8)
>
> More seriously the goal of MODULE_LICENSE has never been to -enforce-
> GPL licensing. It provides help in understanding what symbols are
> definitely off limits and it allows people to identify proprietary stuff
> loaded into a system to filter bug reports.
>
> The law on derivative works and copyright in general, murkly alas as it
> is, does the enforcing, and unfortunately it is becoming apparent that
> the free software world is going to have to go out soon and crack down
> hard on abusers, especially those simply shipping Linux binaries with no
> source or GPL information.
>
> Alan
>

Naaah. I included it in my module as a joke. Steve didn't take
it as a joke and forwarded it to you. It shows that the whole
MODULE_LICENSE("Whatever") is a joke. Not only that, I can
simply change /proc/sys/kernel/tainted to 0 before submitting
a bug report. This whole GPL thing has taken a real stupid
turn. I guess GNU has really taken over Linux. Stallman will
be real proud of all the work you guys did for him on
GNU/Linux .........

And, yes, we still have free speech in the US, even though
the current administration won't allow us to say anything
bad about it (treason, you know)!


Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.

2004-10-07 21:37:37

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 07 Oct 2004 16:48:42 EDT, "Richard B. Johnson" said:

> Naaah. I included it in my module as a joke. Steve didn't take
> it as a joke and forwarded it to you. It shows that the whole
> MODULE_LICENSE("Whatever") is a joke. Not only that, I can
> simply change /proc/sys/kernel/tainted to 0 before submitting
> a bug report.

Hey. It's your conscience. Personally, I rank it right up there with
lying to your physician regarding your symptoms, and for the same reasons.

As for shipping code like that - that's just downright dishonest.


Attachments:
(No filename) (226.00 B)

2004-10-07 22:15:58

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

I can't reproduce your problem with a simple dummy character device.

-------------
/*
* Minimum driver to test character device stuff.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License version 2 as published by the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* General Public License for more details.
*
* You should have received a copy of the GNU General Public
* License along with this program; if not, write to the
* Free Software Foundation, Inc., 59 Temple Place - Suite 330,
* Boston, MA 021110-1307, USA.
*/

#include <linux/module.h>
#include <linux/init.h>
#include <linux/config.h>
#include <linux/kernel.h>
#include <linux/major.h>
#include <linux/fs.h>
#include <linux/device.h>

#define MYMAJOR 204

static int dummy_open(struct inode *inode, struct file *file)
{
pr_info("dummy_open (%d)\n", iminor(inode));
return 0;
}

static int dummy_release(struct inode *inode, struct file *file)
{
pr_info("dummy_release (%d)\n", iminor(inode));
return 0;
}

static struct file_operations dummy_ops = {
.owner = THIS_MODULE,
.open = dummy_open,
.release = dummy_release,
};

static int __init dummy_init(void)
{
pr_info("dummy module init\n");
return register_chrdev(MYMAJOR, "dummy", &dummy_ops);
}

static void __exit dummy_exit(void)
{
pr_info("dummy module unloaded\n");
unregister_chrdev(MYMAJOR, "dummy");
}

module_init(dummy_init);
module_exit(dummy_exit);

MODULE_LICENSE("GPL");


2004-10-08 03:11:54

by Richard B. Johnson

[permalink] [raw]
Subject: Re: Probable module bug in linux-2.6.5-1.358

On Thu, 7 Oct 2004, Stephen Hemminger wrote:

> Still haven't full source so this is still guess work.
> But assuming it is a character device, did you forget to add an owner
> field to the file ops structure?
>
> static struct file_operations xxx_fops = {
> .owner = THIS_MODULE,
> .read = my_read,
> ...
>
> The owner field is used by the character device layer to maintain module
> ref counts in 2.6.
>

No. The owner field is properly filled in and I did provide
the complete source code and build files for anybody who didn't
delete the email. Scanner.tar.gz was attached.

Further, unregister_chrdev() is called. Its return value is
checked. Everything is fine. I can manually remove and
reinstall the module multiple times and the resources are
always released and re-acquired.

The problem is that if you install the module and then
remove it, if you attempt to open the device-file, the
kernel will crash because it calls where the properly-
removed module used to be. It isn't there anymore.


Cheers,
Dick Johnson
Penguin : Linux version 2.6.5-1.358-noreg on an i686 machine (5537.79 BogoMips).
Note 96.31% of all statistics are fiction.