2018-08-08 08:17:05

by piaojun

Subject: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

I found that 9pnet_virtio.ko could not be removed by the rmmod command, and I
could still find it with lsmod. The reason is that we forgot to decrease the
refcount of the 9p virtio device with kobject_put. So we should put the
refcount in p9_virtio_remove.

Signed-off-by: Jun Piao <[email protected]>
---
net/9p/trans_virtio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index 46a5ab2..a00e992 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -710,7 +710,8 @@ static void p9_virtio_remove(struct virtio_device *vdev)
 	vdev->config->del_vqs(vdev);

 	sysfs_remove_file(&(vdev->dev.kobj), &dev_attr_mount_tag.attr);
-	kobject_uevent(&(vdev->dev.kobj), KOBJ_CHANGE);
+	kobject_uevent(&(vdev->dev.kobj), KOBJ_REMOVE);
+	kobject_put(&(vdev->dev.kobj));
 	kfree(chan->tag);
 	kfree(chan->vc_wq);
 	kfree(chan);
--


2018-08-08 08:38:23

by Dominique Martinet

Subject: Re: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

piaojun wrote on Wed, Aug 08, 2018:
> I found that 9pnet_virtio.ko could not be removed by the rmmod command, and I
> could still find it with lsmod. The reason is that we forgot to decrease the
> refcount of the 9p virtio device with kobject_put. So we should put the
> refcount in p9_virtio_remove.


Hmm, I cannot seem to reproduce this. Can you give more details on how
to get into the stuck state?
I tried mounting something, accessing the sysfs files, etc., to no avail.

lsmod shows a count of how many references there are to the module;
you can use that to debug a bit.
For example here I get this line just after loading the module:
9pnet_virtio 32768 0

Then after mounting something there is one reference:
9pnet_virtio 32768 1

Then unmounting puts that back to 0 and 'modprobe -r' (or rmmod) works.
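
The counter discussed here is the third column of lsmod output ("Used by").
As a minimal sketch, the helper below extracts that count for one module from
lsmod-formatted text on stdin. The name module_refcount is hypothetical and
not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print the "Used by" count (third lsmod column)
# for the module named in $1. Input is lsmod-formatted text on stdin,
# so the function can be exercised without touching the running kernel.
# Returns non-zero if the module does not appear in the input.
module_refcount() {
    awk -v mod="$1" '
        $1 == mod { print $3; found = 1 }
        END       { exit !found }
    '
}
```

A typical invocation would be `lsmod | module_refcount 9pnet_virtio`, which
should print 0 for a freshly loaded, unused module.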




I dislike saying the next part but I think form also is important,
please bear with me:

- shorter subject line, please. For example, you can lose 20 characters
by reordering words so there is no need for pronouns:
"net/9p/virtio: decrease 9p virtio device refcount on removal"

- I personally dislike commit messages that are "novelized" (that is,
where you put yourself as an actor and describe what you were doing).
That seems to be somewhat accepted: looking at the kernel's git log, I
see some (few) commits using "I ..." that are not pull request messages.
But if possible, please avoid this style and try to describe facts: how
things are wrong, what got fixed and, if required, how.
To give an example again, this says the same thing:
"The 9pnet_virtio module could not be unloaded because we forgot to
decrease the refcount of the 9p virtio device with kobject_put.

Put the reference in p9_virtio_remove"


Thanks,
--
Dominique

2018-08-08 09:00:30

by piaojun

Subject: Re: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

Hi Dominique,

On 2018/8/8 16:36, Dominique Martinet wrote:
> piaojun wrote on Wed, Aug 08, 2018:
>> I found that 9pnet_virtio.ko could not be removed by the rmmod command, and I
>> could still find it with lsmod. The reason is that we forgot to decrease the
>> refcount of the 9p virtio device with kobject_put. So we should put the
>> refcount in p9_virtio_remove.
>
>
> Hmm, I cannot seem to reproduce this. Can you give more details on how
> to get into stuck state?
> I tried mounting something, accessing the sysfs files, etc to no avail.
>
> lsmod gives a counter to how many references there are to the module,
> you can use that to debug a bit.
> For example here I get this line just after loading the module:
> 9pnet_virtio 32768 0
>
> Then after mounting something there is one reference:
> 9pnet_virtio 32768 1
>
> Then unmounting puts that back to 0 and 'modprobe -r' (or rmmod) works.
>
I am trying to remove 9pnet_virtio.ko with 'rmmod 9pnet_virtio' because I
want to replace it without rebooting the system. I have not mounted 9pfs
yet, so the refcount is still 0.

Before rmmod:
# lsmod | grep 9p
9pnet_virtio 20480 0
9pnet 106496 1 9pnet_virtio
virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net

After rmmod:
# lsmod | grep 9p
9pnet_virtio 20480 0
9pnet 106496 1 9pnet_virtio
virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net

Normally 9pnet_virtio should be invisible after rmmod like this:
# lsmod | grep 9p
9pnet 106496 0

>
>
>
> I dislike saying the next part but I think form also is important,
> please bear with me:
>
> - shorter subject line, please. For example, you can lose 20 characters
> by reordering words so there is no need for pronouns
> "net/9p/virtio: decrease 9p virtio device refcount on removal"
>
> - I personally dislike commit messages that are "novelized" (that is,
> put yourself as an actor and describe what you were doing)
> That seems to be somewhat accepted as looking at the kernel's git log I
> see some (few) commits using "I ..." that are not pull request messages
> but if possible please avoid this style and try to describe facts, how
> things are wrong, what got fixed and if required how.
> To give an example again, this says the same thing:
> "The 9pnet_virtio module could not be unloaded because we forgot to
> decrease the refcount of the 9p virtio device with kobject_put.
>
> Put the reference in p9_virtio_remove"
>
Your suggestion really makes sense, and I will make some improvements later.

Thanks,
Jun
>
> Thanks,
>

2018-08-08 09:41:35

by Dominique Martinet

Subject: Re: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

piaojun wrote on Wed, Aug 08, 2018:
> I am trying to remove 9pnet_virtio.ko with 'rmmod 9pnet_virtio' because I
> want to replace it without rebooting the system.

I do that all the time when testing, it works for me.
What exact kernel commit are you running?

> I have not mounted 9pfs yet, so the refcount is still 0.
>
> Before rmmod:
> # lsmod | grep 9p
> 9pnet_virtio 20480 0
> 9pnet 106496 1 9pnet_virtio
> virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
> virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>
> After rmmod:
> # lsmod | grep 9p
> 9pnet_virtio 20480 0
> 9pnet 106496 1 9pnet_virtio
> virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
> virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>
> Normally 9pnet_virtio should be invisible after rmmod like this:
> # lsmod | grep 9p
> 9pnet 106496 0

Right, that obviously didn't work...

But on the other hand, if I apply your commit and load/unload
9pnet_virtio 5-10 times (I ran it in a loop) I get KASAN errors because
we put too many of these refs; that doesn't happen without your patch,
so the patch is apparently wrong.
I'm also curious how that could make modprobe work better for you; it
shouldn't depend on that...

Maybe `modprobe -r` might give a better error, or something in dmesg?
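
A load/unload loop of the kind described above could be sketched as follows.
This is only an illustration under stated assumptions, not the exact commands
that were run: reload_module and the MODPROBE override are hypothetical names
introduced here, and the default assumes modprobe is run through sudo.

```shell
#!/bin/sh
# Hypothetical helper: load and then unload module $1, repeated $2 times
# (default 10), stopping at the first command that fails.
# MODPROBE is an assumed override hook: set it to e.g. "echo" for a dry
# run. When unset, the helper runs "sudo modprobe".
reload_module() {
    mod=$1
    count=${2:-10}
    i=1
    while [ "$i" -le "$count" ]; do
        # Left unquoted on purpose so "sudo modprobe" splits into two words.
        ${MODPROBE:-sudo modprobe} "$mod" || return 1
        ${MODPROBE:-sudo modprobe} -r "$mod" || return 1
        i=$((i + 1))
    done
}
```

After something like `reload_module 9pnet_virtio 10`, checking dmesg for
KASAN reports would show whether the extra kobject_put causes trouble.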

--
Dominique

2018-08-09 00:47:58

by piaojun

Subject: Re: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

Hi Dominique,

On 2018/8/8 17:40, Dominique Martinet wrote:
> piaojun wrote on Wed, Aug 08, 2018:
>> I am trying to remove 9pnet_virtio.ko with 'rmmod 9pnet_virtio' because I
>> want to replace it without rebooting the system.
>
> I do that all the time when testing, it works for me.
> What exact kernel commit are you running?
>
My kernel commit id is 6edf1d4cb0acde, and I replaced the 9p code with
9p-next. I wonder whether this combination should work well?

>> I have not mounted 9pfs yet, so the refcount is still 0.
>>
>> Before rmmod:
>> # lsmod | grep 9p
>> 9pnet_virtio 20480 0
>> 9pnet 106496 1 9pnet_virtio
>> virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>> virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>>
>> After rmmod:
>> # lsmod | grep 9p
>> 9pnet_virtio 20480 0
>> 9pnet 106496 1 9pnet_virtio
>> virtio_ring 28672 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>> virtio 16384 5 virtio_scsi,9pnet_virtio,virtio_pci,virtio_blk,virtio_net
>>
>> Normally 9pnet_virtio should be invisible after rmmod like this:
>> # lsmod | grep 9p
>> 9pnet 106496 0
>
> Right, that obviously didn't work...
>
> But on the other hand, if I apply your commit and load/unload
> 9pnet_virtio 5-10 times (I ran it in a loop) I get KASAN errors because
> we put too many of these refs ; that doesn't happen without your patch
> so it's apparently wrong.
> I'm curious how that could make modprobe work better for you as well, it
> shouldn't depend on that...
>
> Maybe `modprobe -r` might give a better error, or something in dmesg?
>
In my testing, `modprobe -r` behaves the same as rmmod.

2018-08-09 01:19:50

by Dominique Martinet

Subject: Re: [PATCH] net/9p/trans_virtio.c: decrease the refcount of 9p virtio device when removing it

piaojun wrote on Thu, Aug 09, 2018:
> > What exact kernel commit are you running?
>
> My kernel commit id is 6edf1d4cb0acde, and I replaced the 9p code with
> 9p-next. I wonder whether this combination should work well?

That is somewhere on top of 4.18-rc1 and got merged in 4.18-rc4, which
are close enough, so while I can question the practice, I don't see why
not.

I've just tried the following:
$ git checkout 6edf1d4cb0acde
$ git checkout martinetd/9p-next net/9p fs/9p include/net/9p
(martinetd/9p-next is 9f961802a7 as of this mail)
<make, install, reboot>
$ uname -r
4.18.0-rc1+
$ lsmod | grep -E '^9pnet_virtio' || echo "not loaded"
9pnet_virtio 32768 0
$ sudo modprobe -r 9pnet_virtio
$ lsmod | grep -E '^9pnet_virtio' || echo "not loaded"
not loaded
$ sudo modprobe 9pnet_virtio
$ sudo mount -t 9p -o debug=1,trans=virtio shm /mnt
$ ls /mnt
<stuff>
$ cat /sys/module/9pnet_virtio/drivers/virtio\:9pnet_virtio/*/mount_tag
tmpshm (these could use a new line...)
$ sudo umount /mnt
$ sudo modprobe -r 9pnet_virtio
$ lsmod | grep -E '^9pnet_virtio' || echo "not loaded"
not loaded

The /sys/devices/pci*/*/virtio*/mount_tag files are also removed
properly; I don't see any problem.
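
The check for leftover mount_tag files can be scripted along these lines.
This is a sketch that reuses the glob quoted above; list_mount_tags and its
optional sysfs root argument are hypothetical, added so the function can be
pointed at a scratch directory instead of /sys.

```shell
#!/bin/sh
# Hypothetical helper: print every mount_tag attribute file found under
# a sysfs root (default /sys), using the path pattern from the mail.
# Prints nothing when no such file exists.
list_mount_tags() {
    root=${1:-/sys}
    for f in "$root"/devices/pci*/*/virtio*/mount_tag; do
        # An unmatched glob stays literal, so only report real files.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

Running `list_mount_tags` after `modprobe -r 9pnet_virtio` should print
nothing if the attribute files were removed properly.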


Not being able to reproduce is fine in general, but I also get problems
when applying the patch and unloading the module multiple times, so I
can't help but question this patch and think your problem lies somewhere
else.

--
Dominique