2003-07-17 12:41:32

by Ricardo Bugalho

Subject: SET_MODULE_OWNER

Hi all,
most net device drivers have replaced MOD_INC/DEC_USE_COUNT with
SET_MODULE_OWNER, but SET_MODULE_OWNER doesn't do anything.
Therefore, those modules (though I can only vouch for 8139too) always
report 0 use. Some people who had "modprobe -r" in their crontab found
it quite annoying.
I'd guess that there's a good reason why struct net_device doesn't
have an .owner field and why this happens. Can someone be so kind as to
point it out?

--
Ricardo

PS: Previous post seemed to fail, hope this one isn't a duplicate.


2003-07-17 15:17:55

by Jeff Garzik

Subject: Re: SET_MODULE_OWNER

Ricardo Bugalho wrote:
> Hi all,
> most net device drivers have replaced MOD_INC/DEC_USE_COUNT with
> SET_MODULE_OWNER, but SET_MODULE_OWNER doesn't do anything.
> Therefore, those modules (though I can only vouch for 8139too) always
> report 0 use. Some people who had "modprobe -r" in their crontab found
> it quite annoying.
> I'd guess that there's a good reason why struct net_device doesn't
> have an .owner field and why this happens. Can someone be so kind as to
> point it out?


struct net_device does have an owner field, and SET_MODULE_OWNER
obviously _does_ do something.

If your interface is up, your net driver's module refcount is greater
than zero.

Jeff



2003-07-17 15:41:34

by Thomas Schlichter

Subject: Re: SET_MODULE_OWNER

On Thursday 17 July 2003 17:32, Jeff Garzik wrote:
> Ricardo Bugalho wrote:
> > Hi all,
> > most net device drivers have replaced MOD_INC/DEC_USE_COUNT with
> > SET_MODULE_OWNER, but SET_MODULE_OWNER doesn't do anything.
> > Therefore, those modules (though I can only vouch for 8139too) always
> > report 0 use. Some people who had "modprobe -r" in their crontab found
> > it quite annoying.
> > I'd guess that there's a good reason why struct net_device doesn't
> > have an .owner field and why this happens. Can someone be so kind as to
> > point it out?
>
> struct net_device does have an owner field, and SET_MODULE_OWNER
> obviously _does_ do something.

That's not correct for 2.5.x anymore...
Have a look at Changeset 1.1167 from davem.

It removed the owner field about 9 weeks ago. That was the time where
SET_MODULE_OWNER became a NOP...

> If your interface is up, your net driver's module refcount is greater
> than zero.

Well, as I looked now my netdevice is up, but its reference count is at 0,
too!

Thomas Schlichter
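
The effect described in this message can be illustrated with a toy model in
Python. This is not kernel code, only a sketch under stated assumptions: that
bringing an interface up used to take a reference on the module named by the
device's owner field, and that with the field removed nothing bumps the
module's count. All class and function names below are hypothetical.

```python
class Module:
    """Toy stand-in for a kernel module; it only tracks a use count."""
    def __init__(self, name):
        self.name = name
        self.use_count = 0


class NetDevice:
    """Toy stand-in for struct net_device."""
    def __init__(self, name):
        self.name = name
        self.owner = None   # None models a device with no owner recorded
        self.up = False


def set_module_owner(dev, module, is_nop):
    """Model of SET_MODULE_OWNER: records the owner unless it is a no-op."""
    if not is_nop:
        dev.owner = module


def dev_open(dev):
    """Bring the interface up; pin the owning module if one is recorded."""
    if dev.up:
        return
    if dev.owner is not None:
        dev.owner.use_count += 1
    dev.up = True


def dev_close(dev):
    """Bring the interface down; drop the module reference taken on open."""
    if not dev.up:
        return
    if dev.owner is not None:
        dev.owner.use_count -= 1
    dev.up = False


def use_count_while_up(is_nop):
    """Return the module use count observed while the interface is up."""
    mod = Module("8139too")
    dev = NetDevice("eth0")
    set_module_owner(dev, mod, is_nop)
    dev_open(dev)
    return mod.use_count


if __name__ == "__main__":
    print("owner recorded:", use_count_while_up(is_nop=False))
    print("macro is a NOP:", use_count_while_up(is_nop=True))
```

In this model the count is 1 while the interface is up when the owner is
recorded, and stays at 0 when the macro does nothing, which matches the
"interface up, use count 0" observation above.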

2003-07-17 15:46:17

by Jeff Garzik

Subject: Re: SET_MODULE_OWNER

Thomas Schlichter wrote:
> On Thursday 17 July 2003 17:32, Jeff Garzik wrote:
>
>>Ricardo Bugalho wrote:
>>
>>>Hi all,
>>> most net device drivers have replaced MOD_INC/DEC_USE_COUNT with
>>> SET_MODULE_OWNER, but SET_MODULE_OWNER doesn't do anything.
>>> Therefore, those modules (though I can only vouch for 8139too) always
>>> report 0 use. Some people who had "modprobe -r" in their crontab found
>>> it quite annoying.
>>> I'd guess that there's a good reason why struct net_device doesn't
>>> have an .owner field and why this happens. Can someone be so kind as to
>>> point it out?
>>
>>struct net_device does have an owner field, and SET_MODULE_OWNER
>>obviously _does_ do something.
>
>
> That's not correct for 2.5.x anymore...
> Have a look at Changeset 1.1167 from davem.
>
> It removed the owner field about 9 weeks ago. That was the time where
> SET_MODULE_OWNER became a NOP...
>
>
>>If your interface is up, your net driver's module refcount is greater
>>than zero.
>
>
> Well, as I looked now my netdevice is up, but its reference count is at 0,
> too!

Doh. I missed that.

David? Does Rusty have a plan here or something?

Jeff



2003-07-17 19:44:55

by David Miller

Subject: Re: SET_MODULE_OWNER

On Thu, 17 Jul 2003 12:00:58 -0400
Jeff Garzik <[email protected]> wrote:

> David? Does Rusty have a plan here or something?

It just works how it works and that's it.

Net devices are reference counted, anything more is superfluous.
They may be yanked out of the kernel whenever you want.

2003-07-17 20:08:32

by Jeff Garzik

Subject: Re: SET_MODULE_OWNER

David S. Miller wrote:
> On Thu, 17 Jul 2003 12:00:58 -0400
> Jeff Garzik <[email protected]> wrote:
>
>
>>David? Does Rusty have a plan here or something?
>
>
> It just works how it works and that's it.
>
> Net devices are reference counted, anything more is superfluous.
> They may be yanked out of the kernel whenever you want.


(I'm obviously just realizing the implications of this... missed it
completely during the earlier discussions)

Object lifetime is just part of the story.

This change is a major behavior change. The whole point of removing a
module is knowing it's gone ;-) And that is completely changed now.
Modules are very often used by developers in a "modprobe ; test ; rmmod"
cycle, and that's now impossible (you don't know when the net device,
and thus your code, is really gone). It's already breaking userland,
which does sweeps for zero-refcount modules among other things.

I can't believe I missed this.

Jeff
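
The kind of userland sweep mentioned above can be sketched in Python. This is
a minimal sketch, not any particular distribution's script. It assumes only
that the first three whitespace-separated fields of each /proc/modules line
are the module name, its size, and its use count. The function name and the
sample text are made up for illustration.

```python
def zero_refcount_modules(proc_modules_text):
    """Return the names of modules whose use count field is 0."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue            # skip blank or malformed lines
        name, count = fields[0], fields[2]
        try:
            if int(count) == 0:
                names.append(name)
        except ValueError:
            continue            # use count field was not a number
    return names


# Made-up sample text in the assumed layout, not real output.
SAMPLE = """\
8139too 20480 0 - Live 0xe08a4000
ipv6 229600 10 - Live 0xe0900000
"""

if __name__ == "__main__":
    for name in zero_refcount_modules(SAMPLE):
        print("candidate for removal:", name)
```

With net drivers reporting a use count of 0 even while their interface is up,
a sweep like this would select them for removal, which is the breakage
described in this message.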



2003-07-17 20:13:58

by David Miller

Subject: Re: SET_MODULE_OWNER

On Thu, 17 Jul 2003 16:22:33 -0400
Jeff Garzik <[email protected]> wrote:

> This change is a major behavior change. The whole point of removing a
> module is knowing it's gone ;-) And that is completely changed now.
> Modules are very often used by developers in a "modprobe ; test ; rmmod"
> cycle, and that's now impossible (you don't know when the net device,
> and thus your code, is really gone). It's already breaking userland,
> which does sweeps for zero-refcount modules among other things.
>
> I can't believe I missed this.

Umm, Jeff, for years if you rmmod netfilter it will very well
do this for you even if you have firewall rules installed.
This behavior exists in all of 2.4.x.

People who do modprobe -r in their crontabs are asking
for trouble; losing their netdevice is the least of their
troubles, especially if they have firewall rules installed.

Module reference counting added complications to net device
handling, and once I killed it off we could begin addressing
all of the real bugs that exist with network devices. For example,
now that we're forced to make net devices dynamic memory in all
cases we can deal with dangling procfs/sysfs references to the device
sanely. Fixing that was not possible with module refcounting.

2003-07-17 20:34:13

by Jeff Garzik

Subject: Re: SET_MODULE_OWNER

David S. Miller wrote:
> Module reference counting added complications to net device
> handling, and once I killed it off we could begin addressing

Coding is tough, let's go shopping.


> all of the real bugs that exist with network devices. For example,
> now that we're forced to make net devices dynamic memory in all
> cases we can deal with dangling procfs/sysfs references to the device
> sanely. Fixing that was not possible with module refcounting.

rmmod is now completely pointless, and developers now have one less
useful tool in their toolbox.

I code all the time doing "modprobe ; test ; rmmod", and that's now
impossible.

Jeff



2003-07-17 21:21:57

by Ricardo Bugalho

Subject: Re: SET_MODULE_OWNER

On Thu, 2003-07-17 at 20:59, David S. Miller wrote:
> On Thu, 17 Jul 2003 12:00:58 -0400
> Jeff Garzik <[email protected]> wrote:
>
> > David? Does Rusty have a plan here or something?
>
> It just works how it works and that's it.
>
> Net devices are reference counted, anything more is superfluous.
> They may be yanked out of the kernel whenever you want.

Just noticed it: I can't unload the module even after bringing the
interface down.
In either case, modprobe hangs and I start getting this message in
syslog:

Jul 17 21:50:44 ezquiel kernel: unregister_netdevice: waiting for eth0 to become free. Usage count = -4

Can't shut down the system either. Init hangs waiting for modprobe to
die.

--
Ricardo

2003-07-17 21:29:20

by David Miller

Subject: Re: SET_MODULE_OWNER

On Thu, 17 Jul 2003 16:48:55 -0400
Jeff Garzik <[email protected]> wrote:

> rmmod is now completely pointless, and developers now have one less
> useful tool in their toolbox.
>
> I code all the time doing "modprobe ; test ; rmmod", and that's now
> impossible.

I fail to see the problem with having rmmod do exactly
what you ask it to do.

If there is some refcounting bug, you will see it, because rmmod will
spin sleeping and waiting for all the net_dev refcounts to go away.
This will spit out kernel messages and only occur when there is a bug
in the kernel somewhere.
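
The wait described above can be modelled with a toy loop in Python. This is a
sketch, not the kernel's implementation: it assumes the unregister path
repeatedly checks the device's reference count and logs a message, in the
format quoted elsewhere in this thread, until the count reaches zero. The
function name and the max_polls bound are hypothetical; the bound exists only
so the model terminates when references have leaked.

```python
def wait_for_refs(dev_name, refcount, release_per_poll, max_polls):
    """Poll until refcount reaches zero.

    Returns (freed, messages): whether the count reached zero within
    max_polls, and the log lines emitted while waiting.
    """
    messages = []
    polls = 0
    while refcount != 0 and polls < max_polls:
        messages.append(
            "unregister_netdevice: waiting for %s to become free. "
            "Usage count = %d" % (dev_name, refcount)
        )
        # Holders drop references; never release more than are held.
        released = min(release_per_poll, max(refcount, 0))
        refcount -= released
        polls += 1
    return refcount == 0, messages


if __name__ == "__main__":
    # Healthy case: the remaining holders let go after one poll.
    print(wait_for_refs("eth0", 4, 4, 10))
    # Buggy case: an unbalanced count never returns to zero.
    print(wait_for_refs("eth0", -4, 1, 3))
```

In the healthy case the model logs one message and then reports the device as
free. With an unbalanced count it keeps logging until the bound is hit, which
corresponds to rmmod appearing to hang.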

2003-07-17 21:36:50

by David Miller

Subject: Re: SET_MODULE_OWNER

On 17 Jul 2003 22:36:43 +0100
Ricardo Bugalho <[email protected]> wrote:

> Just noticed it: I can't unload the module even after bringing the
> interface down.
> In either case, modprobe hangs and I start getting this message in
> syslog:
>
> Jul 17 21:50:44 ezquiel kernel: unregister_netdevice: waiting for eth0 to become free. Usage count = -4
>
> Can't shut down the system either. Init hangs waiting for modprobe to
> die.

That's a bug we need to fix.

What driver are you using?
Are you using ipv6?
Any netfilter modules?
Anything else interesting or "unique" about your particular setup?

2003-07-17 22:09:10

by Ricardo Bugalho

Subject: Re: SET_MODULE_OWNER

On Thu, 2003-07-17 at 22:40, David S. Miller wrote:
> That's a bug we need to fix.
>
> What driver are you using?
> Are you using ipv6?
> Any netfilter modules?
> Anything else interesting or "unique" about your particular setup?

Sorry, I forgot. Cooking and mailing just don't mix.

Debian SID, Kernel v2.6.0-test1, compiled with GCC 3.3, on an Athlon 850.
The driver is 8139too, for the RealTek8139 (no special options).
Netfilter no, IPv6 yes.

XFS, ALSA (emu10k1), OSS emulation, SCSI emulation are also in the mix.
And I did check this behaviour without running X (yet another kernel
tainted by nvidia here).
Can't remember anything else in particular.

--
Ricardo

2003-07-17 22:14:04

by David Miller

Subject: Re: SET_MODULE_OWNER

On 17 Jul 2003 23:23:57 +0100
Ricardo Bugalho <[email protected]> wrote:

> Netfilter no, IPv6 yes.

There are some heavy device leaks in 2.6.0-test1 as released,
if you could test something more current it would be appreciated
as we believe we have fixed this.

Alternatively, you could test if ipv6 is the culprit by removing
it from your setup somehow.

2003-07-18 00:36:47

by Ricardo Bugalho

Subject: Re: SET_MODULE_OWNER

On Thu, 2003-07-17 at 23:16, David S. Miller wrote:
> There are some heavy device leaks in 2.6.0-test1 as released,
> if you could test something more current it would be appreciated
> as we believe we have fixed this.

Indeed you have. Without IPv6, it unloads without a glitch.

> Alternatively, you could test if ipv6 is the culprit by removing
> it from your setup somehow.

With IPv6, it takes a couple of seconds before unloading and I get this
message (just ONE message):
Jul 18 01:36:00 ezquiel kernel: unregister_netdevice: waiting for eth0 to become free. Usage count = 4

Unless this is a symptom of something that shouldn't happen, I can live
with it.

--
Ricardo

2003-07-18 02:15:33

by James Cloos

Subject: Re: SET_MODULE_OWNER

>>>>> "David" == David S Miller <[email protected]> writes:

>> Can't shut down the system either. Init hangs waiting for
>> modprobe to die.

As this happened to me too, when I wanted to try out the new e100
module w/o rebooting, I'll pipe in below.

David> That's a bug we need to fix.

David> What driver are you using?

e100.ko

David> Are you using ipv6?

Yes, compiled in.

David> Any netfilter modules?

Yes, a basic firewall + nat, with eth0 as the inside and ppp0 outside.
It is a mix of compiled in and modular.

David> Anything else interesting or "unique" about your
David> particular setup?

It used to be that rmmod would fail until all of the sockets were closed.
Instead it just hung. Attempting to shutdown network services to
free any eth0 sockets didn't help. A reboot attempt also hung as
per above. Had to use SysRq to sync/umount/boot.

The exact .config is available in bk://cloos.bkbits.net/linux-2.5-jhc
as of the revision dated 2003/06/28. That was between 2.5.73 and .74.

-JimC

2003-07-18 05:21:56

by David Miller

Subject: Re: SET_MODULE_OWNER

On 17 Jul 2003 22:27:57 -0400
"James H. Cloos Jr." <[email protected]> wrote:

> >>>>> "David" == David S Miller <[email protected]> writes:
>
> David> Are you using ipv6?
>
> Yes, compiled in.

I really think this is the issue, try to eliminate
it from your environment to verify.

2003-07-18 07:16:50

by Duncan Sands

Subject: Re: SET_MODULE_OWNER

On Thursday 17 July 2003 22:22, Jeff Garzik wrote:
> David S. Miller wrote:
> > On Thu, 17 Jul 2003 12:00:58 -0400
> >
> > Jeff Garzik <[email protected]> wrote:
> >>David? Does Rusty have a plan here or something?
> >
> > It just works how it works and that's it.
> >
> > Net devices are reference counted, anything more is superfluous.
> > They may be yanked out of the kernel whenever you want.
>
> (I'm obviously just realizing the implications of this... missed it
> completely during the earlier discussions)
>
> Object lifetime is just part of the story.
>
> This change is a major behavior change. The whole point of removing a
> module is knowing it's gone ;-) And that is completely changed now.
> Modules are very often used by developers in a "modprobe ; test ; rmmod"
> cycle, and that's now impossible (you don't know when the net device,
> and thus your code, is really gone). It's already breaking userland,
> which does sweeps for zero-refcount modules among other things.

Most USB drivers can be unloaded at any time, so this problem already
existed elsewhere.

Duncan.

2003-07-18 08:49:26

by Martin Diehl

Subject: Re: SET_MODULE_OWNER

On Fri, 18 Jul 2003, Duncan Sands wrote:

> On Thursday 17 July 2003 22:22, Jeff Garzik wrote:
> > David S. Miller wrote:
> > > On Thu, 17 Jul 2003 12:00:58 -0400
> > >
> > > Jeff Garzik <[email protected]> wrote:
> > >>David? Does Rusty have a plan here or something?
> > >
> > > It just works how it works and that's it.
> > >
> > > Net devices are reference counted, anything more is superfluous.
> > > They may be yanked out of the kernel whenever you want.
> >
> > (I'm obviously just realizing the implications of this... missed it
> > completely during the earlier discussions)
> >
> > Object lifetime is just part of the story.
> >
> > This change is a major behavior change. The whole point of removing a
> > module is knowing it's gone ;-) And that is completely changed now.
> > Modules are very often used by developers in a "modprobe ; test ; rmmod"
> > cycle, and that's now impossible (you don't know when the net device,
> > and thus your code, is really gone). It's already breaking userland,
> > which does sweeps for zero-refcount modules among other things.
>
> Most USB drivers can be unloaded at any time, so this problem already
> existed elsewhere.

Most? Since when?

For me neither usb-storage nor usbserial (pl2303 f.e.) can be unloaded
when in use (storage being mounted or /dev/usb/ttyUSBX opened).

True, irda-usb (and probably usbnet) have been unloadable while the interface
is up for a few weeks now - but this is due to the networking not bumping
the module use counter anymore, nothing to do with usb.

To do something comparable to networking with usb in general, one would need
to change usb-storage to report use-count==0 while the disk is mounted!
Only then could one rmmod, and the fs would (hopefully) get synced and
unmounted (or staled) automagically.

Personally I believe it all comes down to the semantics of the module use
counter. If it's taken to indicate the module cannot be unloaded while
!=0, it might (or should) stay ==0 if the underlying subsystem can handle
module removal safely at any time - like it is done for network now.

In contrast, if the module use count is taken to indicate a module is in
use (interface up, fs mounted, chardev open, ...) I'd expect it to be >0.
Being unable to rmmod in this situation is just a consequence of the fact
it's being used then, regardless whether we could rmmod anyway.

OTOH David has a point as the current situation with network helps to
identify bugs there - YMMV.

Martin

2003-07-18 09:54:47

by Roman Zippel

Subject: Re: SET_MODULE_OWNER

Hi,

On Thu, 17 Jul 2003, David S. Miller wrote:

> People who do modprobe -r in their crontabs are asking
> for trouble; losing their netdevice is the least of their
> troubles, especially if they have firewall rules installed.
>
> Module reference counting added complications to net device
> handling, and once I killed it off we could begin addressing
> all of the real bugs that exist with network devices. For example,
> now that we're forced to make net devices dynamic memory in all
> cases we can deal with dangling procfs/sysfs references to the device
> sanely. Fixing that was not possible with module refcounting.

I wouldn't say impossible, but definitely not nice.
OTOH a usage indicator is useful and the network driver knows if one of
its devices is in use, but the module count is the only way to tell this
to module code and there is currently no sane way to have a use count and
force the removal of a module.
The network device cleanup is without doubt needed, but I would have
preferred to accompany it with module layer cleanup. The current module
code continues a lot of the old design mistakes.

bye, Roman

2003-07-18 09:58:28

by Duncan Sands

Subject: Re: SET_MODULE_OWNER

On Friday 18 July 2003 11:05, Martin Diehl wrote:
> On Fri, 18 Jul 2003, Duncan Sands wrote:
> > On Thursday 17 July 2003 22:22, Jeff Garzik wrote:
> > > David S. Miller wrote:
> > > > On Thu, 17 Jul 2003 12:00:58 -0400
> > > >
> > > > Jeff Garzik <[email protected]> wrote:
> > > >>David? Does Rusty have a plan here or something?
> > > >
> > > > It just works how it works and that's it.
> > > >
> > > > Net devices are reference counted, anything more is superfluous.
> > > > They may be yanked out of the kernel whenever you want.
> > >
> > > (I'm obviously just realizing the implications of this... missed it
> > > completely during the earlier discussions)
> > >
> > > Object lifetime is just part of the story.
> > >
> > > This change is a major behavior change. The whole point of removing a
> > > module is knowing it's gone ;-) And that is completely changed now.
> > > Modules are very often used by developers in a "modprobe ; test ;
> > > rmmod" cycle, and that's now impossible (you don't know when the net
> > > device, and thus your code, is really gone). It's already breaking
> > > userland, which does sweeps for zero-refcount modules among other
> > > things.
> >
> > Most USB drivers can be unloaded at any time, so this problem already
> > existed elsewhere.
>
> Most? Since when?
>
> For me neither usb-storage nor usbserial (pl2303 f.e.) can be unloaded
> when in use (storage being mounted or /dev/usb/ttyUSBX opened).

OK, maybe most was exaggerated. :)

If you want to see what I have in mind, do
rmmod uhci-hcd
(or whatever your hcd is). This shows that there was already a problem with
unloading modules with zero refcount.

> True, irda-usb (and probably usbnet) have been unloadable while the
> interface is up for a few weeks now - but this is due to the networking
> not bumping the module use counter anymore, nothing to do with usb.
>
> To do something comparable to networking with usb in general, one would
> need to change usb-storage to report use-count==0 while the disk is
> mounted! Only then could one rmmod, and the fs would (hopefully) get
> synced and unmounted (or staled) automagically.
>
> Personally I believe it all comes down to the semantics of the module use
> counter. If it's taken to indicate the module cannot be unloaded while
> !=0, it might (or should) stay ==0 if the underlying subsystem can handle
> module removal safely at any time - like it is done for network now.
>
> In contrast, if the module use count is taken to indicate a module is in
> use (interface up, fs mounted, chardev open, ...) I'd expect it to be >0.
> Being unable to rmmod in this situation is just a consequence of the fact
> it's being used then, regardless whether we could rmmod anyway.
>
> OTOH David has a point as the current situation with network helps to
> identify bugs there - YMMV.

Duncan.

2003-07-18 12:31:39

by Martin Diehl

Subject: Re: SET_MODULE_OWNER

On Fri, 18 Jul 2003, Duncan Sands wrote:

> If you want to see what I have in mind, do
> rmmod uhci-hcd
> (or whatever your hcd is). This shows that there was already a problem with
> unloading modules with zero refcount.

Ok, for the hcd you are right. And yes, I'd prefer to see the hcd's use
count getting increased with any interface claimed by a usb client
driver...

With lsmod reporting use count ==0 people might assume the module is
unused and thus rmmod - not much fun if this is beneath a mounted fs.

Martin

PS: I've just tried this - it's even worse: not only would I expect fs
damage, the box is OOPSing and BUGging like hell :-(
I'll take this to linux-usb-devel.

2003-07-20 01:56:25

by James Cloos

Subject: Re: SET_MODULE_OWNER

>>>>> "David" == David S Miller <[email protected]> writes:

David> I really think [ipv6] is the issue, try to eliminate it from your
David> environment to verify.

Verified. e100.ko rmmod(8)ed fine when ipv6 was not compiled in.

The netfilters and ppp0 were up; v6 was the only variable.

-JimC