2008-08-17 15:11:34

by Hans Verkuil

[permalink] [raw]
Subject: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

Hi all,

As part of my ongoing cleanup of the v4l subsystem I've been looking
into converting v4l from register_chrdev to register_chrdev_region. The
latter is more flexible and allows for a larger range of minor numbers.
In addition it allows us to intercept the release callback when the
char device's refcount reaches 0.

This is very useful for hotpluggable devices like USB webcams. Currently
usb video drivers need to do the refcounting themselves, but with this
patch they can rely on the release callback since it will only be
called when the last user has closed the device. Since current usb
drivers do the refcounting in varying degrees of competency (from 'not'
to 'if you're lucky' to 'buggy' to 'perfect') it would be nice to have
the v4l framework take care of this.

So on a disconnect the driver can call video_unregister_device() even if
an application still has the device open. Only when the application
closes as well will the release be called and the driver can do the
final cleanup.

In fact, I think with this change it should even be possible to
reconnect the webcam even while some application is still using the old
char device. In that case a new minor number will be chosen since the
old one is still in use, but otherwise the webcam should just work as
usual. This is untested, though.

Note that right now I basically copy the old release callback as
installed by cdev_init() and install our own v4l callback instead (to
be precise, I replace the ktype pointer with our own kobj_type).

It would be much cleaner if chardev.c would allow one to set a callback
explicitly. It's not difficult to do that, but before doing that I
first have to know whether my approach is working correctly.

The v4l-dvb repository with my changes is here:

http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/

To see the diff in question:

http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/98acd2c1dea1

I have tested myself with the quickcam_messenger webcam. For this driver
this change actually fixed a bug: disconnecting while a capture was in
progress and then trying to use /dev/video0 would lock that second
application.

I also tested with gspca: I could find no differences here, it all
worked as before.

There are a lot more USB video devices and it would be great if people
could test with their devices to see if this doesn't break anything.
Having a release callback that is called when it is really safe to free
everything should make life a lot easier I think.

Regards,

Hans


2008-08-17 18:20:17

by Hans de Goede

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

Hans Verkuil wrote:
> Hi all,
>
> As part of my ongoing cleanup of the v4l subsystem I've been looking
> into converting v4l from register_chrdev to register_chrdev_region. The
> latter is more flexible and allows for a larger range of minor numbers.
> In addition it allows us to intercept the release callback when the
> char device's refcount reaches 0.
>

Hans,

Thanks for doing this! You rock! I've been planning on cleaning up gspca's
somewhat archaic disconnect handling for a while now and I was sorta waiting
for something like this :) But I guess that that cleanup might be 2.6.28 material.

Anyways I've reviewed your patch and in general I like it, I only see one problem:

@@ -99,7 +130,8 @@ static void video_release(struct device
{
struct video_device *vfd = container_of(cd, struct video_device, dev);
-#if 1 /* keep */
+ return;
+#if 1 /* keep */
/* needed until all drivers are fixed */
if (!vfd->release)
return;
@@ -107,6 +139,7 @@ static void video_release(struct device
vfd->release(vfd);
}
+
static struct class video_class = {
.name = VIDEO_NAME,
#if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 19)


Here you basicly make the release callback of the video class device a no-op.
First of all I think it would be better to just delete it then to add a return,
which sort of hides its an empty function now.

More importantly, its wrong to make this a no-op. When a driver unregisters a
v4l device, and all cdev usage has stopped there can still be open references
to sysfs files of the video class device, but in this case currently the
video_unregister_device call will lead to the vfd->release callback getting
called freeing the vfd struct, which contains the class device.

I believe the way to fix this is todo a get on the kobj contained in the cdev
in video_register_device before registering the class device, and then in the
class device release callback do a put on this kobj.

Other then that it looks good!

Regards,

Hans

2008-08-17 19:48:22

by Hans Verkuil

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

Hi Hans,

On Sunday 17 August 2008 20:30:19 Hans de Goede wrote:
> Hans Verkuil wrote:
> > Hi all,
> >
> > As part of my ongoing cleanup of the v4l subsystem I've been
> > looking into converting v4l from register_chrdev to
> > register_chrdev_region. The latter is more flexible and allows for
> > a larger range of minor numbers. In addition it allows us to
> > intercept the release callback when the char device's refcount
> > reaches 0.
>
> Hans,
>
> Thanks for doing this! You rock!

Thanks! :-)

> I've been planning on cleaning up
> gspca's somewhat archaic disconnect handling for a while now and I
> was sorta waiting for something like this :) But I guess that that
> cleanup might be 2.6.28 material.
>
> Anyways I've reviewed your patch and in general I like it, I only see
> one problem:
>
> @@ -99,7 +130,8 @@ static void video_release(struct device
> {
> struct video_device *vfd = container_of(cd, struct video_device,
> dev); -#if 1 /* keep */
> + return;
> +#if 1 /* keep */
> /* needed until all drivers are fixed */
> if (!vfd->release)
> return;
> @@ -107,6 +139,7 @@ static void video_release(struct device
> vfd->release(vfd);
> }
> +
> static struct class video_class = {
> .name = VIDEO_NAME,
> #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 19)
>
>
> Here you basicly make the release callback of the video class device
> a no-op. First of all I think it would be better to just delete it
> then to add a return, which sort of hides its an empty function now.

I thought so as well, but without a release function the low-level class
code in the kernel starts complaining about the missing release.

> More importantly, its wrong to make this a no-op. When a driver
> unregisters a v4l device, and all cdev usage has stopped there can
> still be open references to sysfs files of the video class device,
> but in this case currently the video_unregister_device call will lead
> to the vfd->release callback getting called freeing the vfd struct,
> which contains the class device.

You might have gotten confused here: video_release() didn't call the
main release callback: there was a return in front. I'd forgotten to
remove that test code.

I've also tested what happens when you keep a sysfs file open: it seems
to work OK in that video_release is called even though the sysfs file
is still open. That said, I've moved the cdev_del call from
video_unregister_device() to the video_release() function. It makes
sense not to delete the char device until that callback is called.

This patch is here:
http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/575997018499

> I believe the way to fix this is todo a get on the kobj contained in
> the cdev in video_register_device before registering the class
> device, and then in the class device release callback do a put on
> this kobj.

There is no need to do an additional get: cdev_init does that already,
so the char dev stays alive at least until the cdev_del (longer if apps
still keep it open).

> Other then that it looks good!

Thanks, I've been wanting to do this for some time now and I finally had
the time today to go in and dig through all the refcounting and how
chardev handles things so that I could come up with a proper solution.
What's nice is that this approach works also fine in older kernels
(well, it compiles, I guess I need to do a real test on an older kernel
before I can commit it in v4l-dvb). And that is very nice since the
v4l-dvb repository is supposed to support any kernel >= 2.6.16.

I would be very curious to hear how well it works with the gspca driver.
In particular if you can indeed reconnect while an application still
has a char device open from before the disconnect. Currently the gspca
own locking seems to disallow that (the new device doesn't appear until
all applications stopped using the old one).

Regards,

Hans

2008-08-17 21:18:53

by Hans de Goede

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

Hans Verkuil wrote:
>> Anyways I've reviewed your patch and in general I like it, I only see
>> one problem:
>>
>> @@ -99,7 +130,8 @@ static void video_release(struct device
>> {
>> struct video_device *vfd = container_of(cd, struct video_device,
>> dev); -#if 1 /* keep */
>> + return;
>> +#if 1 /* keep */
>> /* needed until all drivers are fixed */
>> if (!vfd->release)
>> return;
>> @@ -107,6 +139,7 @@ static void video_release(struct device
>> vfd->release(vfd);
>> }
>> +
>> static struct class video_class = {
>> .name = VIDEO_NAME,
>> #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 19)
>>
>>
>> Here you basicly make the release callback of the video class device
>> a no-op. First of all I think it would be better to just delete it
>> then to add a return, which sort of hides its an empty function now.
>
> I thought so as well, but without a release function the low-level class
> code in the kernel starts complaining about the missing release.
>

I wasn't clear with delete I only meant the body.

>> More importantly, its wrong to make this a no-op. When a driver
>> unregisters a v4l device, and all cdev usage has stopped there can
>> still be open references to sysfs files of the video class device,
>> but in this case currently the video_unregister_device call will lead
>> to the vfd->release callback getting called freeing the vfd struct,
>> which contains the class device.
>
> You might have gotten confused here: video_release() didn't call the
> main release callback: there was a return in front. I'd forgotten to
> remove that test code.
>

I'm not talking about video_release, I'm talking about the following call chain:
video_device_unregister
cdev_del
kobj_put
v4l2_chardev_release
vfd->release

Which could happen in your old version (before the cdev_del was moved) even
when a class device sysfs file was still open, and thus sysfs read / write
callbacks which may use the (now released) vfd could still be made after the
release.

> I've also tested what happens when you keep a sysfs file open: it seems
> to work OK in that video_release is called even though the sysfs file
> is still open.

That should not happen, if a process holds a sysfs file open the release
callback for the device which owns the sysfs file should not get called, are
you sure this is happening, if the device then does a read / write on this file
mayhem could happen, or does the kernel catch this now a days and just returns
-ENODEV?

> That said, I've moved the cdev_del call from
> video_unregister_device() to the video_release() function. It makes
> sense not to delete the char device until that callback is called.
>

Yes, that will fix the problem I was trying to describe too.

> This patch is here:
> http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/575997018499
>
>> I believe the way to fix this is todo a get on the kobj contained in
>> the cdev in video_register_device before registering the class
>> device, and then in the class device release callback do a put on
>> this kobj.
>
> There is no need to do an additional get: cdev_init does that already,
> so the char dev stays alive at least until the cdev_del (longer if apps
> still keep it open).
>

Well since the code was registering a class device which shared the same in
memory struct, we needed an additional put on the cdev kobj, as once the
refcount for that got to 0 the entire vfd struct including the class device
would get released.

But now that you've moved the cdev_del this is moot, as now the ref_count won't
reach zero until all users of the class device are done with it.

> I would be very curious to hear how well it works with the gspca driver.
> In particular if you can indeed reconnect while an application still
> has a char device open from before the disconnect. Currently the gspca
> own locking seems to disallow that (the new device doesn't appear until
> all applications stopped using the old one).
>

This is on my todo, but not very high atm.

Regards,

Hans

2008-08-17 21:42:21

by Hans Verkuil

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

On Sunday 17 August 2008 23:29:02 Hans de Goede wrote:
> Hans Verkuil wrote:
> >> Anyways I've reviewed your patch and in general I like it, I only
> >> see one problem:
> >>
> >> @@ -99,7 +130,8 @@ static void video_release(struct device
> >> {
> >> struct video_device *vfd = container_of(cd, struct video_device,
> >> dev); -#if 1 /* keep */
> >> + return;
> >> +#if 1 /* keep */
> >> /* needed until all drivers are fixed */
> >> if (!vfd->release)
> >> return;
> >> @@ -107,6 +139,7 @@ static void video_release(struct device
> >> vfd->release(vfd);
> >> }
> >> +
> >> static struct class video_class = {
> >> .name = VIDEO_NAME,
> >> #if LINUX_VERSION_CODE < KERNEL_VERSION(2, 6, 19)
> >>
> >>
> >> Here you basicly make the release callback of the video class
> >> device a no-op. First of all I think it would be better to just
> >> delete it then to add a return, which sort of hides its an empty
> >> function now.
> >
> > I thought so as well, but without a release function the low-level
> > class code in the kernel starts complaining about the missing
> > release.
>
> I wasn't clear with delete I only meant the body.
>
> >> More importantly, its wrong to make this a no-op. When a driver
> >> unregisters a v4l device, and all cdev usage has stopped there can
> >> still be open references to sysfs files of the video class device,
> >> but in this case currently the video_unregister_device call will
> >> lead to the vfd->release callback getting called freeing the vfd
> >> struct, which contains the class device.
> >
> > You might have gotten confused here: video_release() didn't call
> > the main release callback: there was a return in front. I'd
> > forgotten to remove that test code.
>
> I'm not talking about video_release, I'm talking about the following
> call chain: video_device_unregister
> cdev_del
> kobj_put
> v4l2_chardev_release
> vfd->release
>
> Which could happen in your old version (before the cdev_del was
> moved) even when a class device sysfs file was still open, and thus
> sysfs read / write callbacks which may use the (now released) vfd
> could still be made after the release.
>
> > I've also tested what happens when you keep a sysfs file open: it
> > seems to work OK in that video_release is called even though the
> > sysfs file is still open.
>
> That should not happen, if a process holds a sysfs file open the
> release callback for the device which owns the sysfs file should not
> get called, are you sure this is happening, if the device then does a
> read / write on this file mayhem could happen, or does the kernel
> catch this now a days and just returns -ENODEV?

I have a simple test prog that opens a file and then just sleeps. I did
that with some of the sysfs attribute files,
e.g.: /sys/class/video4linux/video0/name. The video_release is called
even though I have the file still open. And reading from it after
video_release was called results in EOF, which is correct.

I think my first version was probably OK, but perhaps more through luck
than wisdom. Moving the cdev_del call definitely feels a lot safer.

Regards,

Hans

>
> > That said, I've moved the cdev_del call from
> > video_unregister_device() to the video_release() function. It makes
> > sense not to delete the char device until that callback is called.
>
> Yes, that will fix the problem I was trying to describe too.
>
> > This patch is here:
> > http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/575997018499
> >
> >> I believe the way to fix this is todo a get on the kobj contained
> >> in the cdev in video_register_device before registering the class
> >> device, and then in the class device release callback do a put on
> >> this kobj.
> >
> > There is no need to do an additional get: cdev_init does that
> > already, so the char dev stays alive at least until the cdev_del
> > (longer if apps still keep it open).
>
> Well since the code was registering a class device which shared the
> same in memory struct, we needed an additional put on the cdev kobj,
> as once the refcount for that got to 0 the entire vfd struct
> including the class device would get released.
>
> But now that you've moved the cdev_del this is moot, as now the
> ref_count won't reach zero until all users of the class device are
> done with it.
>
> > I would be very curious to hear how well it works with the gspca
> > driver. In particular if you can indeed reconnect while an
> > application still has a char device open from before the
> > disconnect. Currently the gspca own locking seems to disallow that
> > (the new device doesn't appear until all applications stopped using
> > the old one).
>
> This is on my todo, but not very high atm.
>
> Regards,
>
> Hans

2008-08-17 22:27:27

by David Ellingsworth

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

I like what you've done with this, it's very close to the patch I had
in mind. I especially like the way you hijacked cdev's kobj release..
I hadn't considered that. I would have sent you my version, but it was
based on the old videodev.c and needed to be redone after the videodev
split. The associated patch which moves cdev_del to video_release is
definitely the right thing to do. I don't know if it means much, but I
ACK this patch with the associated patch that moves cdev_del to
video_release.

ACKed by David Ellingsworth

2008-08-23 14:01:45

by Laurent Pinchart

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

Hi Hans,

On Sunday 17 August 2008, Hans Verkuil wrote:
> Hi all,
>
> As part of my ongoing cleanup of the v4l subsystem I've been looking
> into converting v4l from register_chrdev to register_chrdev_region. The
> latter is more flexible and allows for a larger range of minor numbers.
> In addition it allows us to intercept the release callback when the
> char device's refcount reaches 0.
>
> This is very useful for hotpluggable devices like USB webcams. Currently
> usb video drivers need to do the refcounting themselves, but with this
> patch they can rely on the release callback since it will only be
> called when the last user has closed the device. Since current usb
> drivers do the refcounting in varying degrees of competency (from 'not'
> to 'if you're lucky' to 'buggy' to 'perfect') it would be nice to have
> the v4l framework take care of this.
>
> So on a disconnect the driver can call video_unregister_device() even if
> an application still has the device open. Only when the application
> closes as well will the release be called and the driver can do the
> final cleanup.
>
> In fact, I think with this change it should even be possible to
> reconnect the webcam even while some application is still using the old
> char device. In that case a new minor number will be chosen since the
> old one is still in use, but otherwise the webcam should just work as
> usual. This is untested, though.
>
> Note that right now I basically copy the old release callback as
> installed by cdev_init() and install our own v4l callback instead (to
> be precise, I replace the ktype pointer with our own kobj_type).
>
> It would be much cleaner if chardev.c would allow one to set a callback
> explicitly. It's not difficult to do that, but before doing that I
> first have to know whether my approach is working correctly.
>
> The v4l-dvb repository with my changes is here:
>
> http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/
>
> To see the diff in question:
>
> http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/98acd2c1dea1
>
> I have tested myself with the quickcam_messenger webcam. For this driver
> this change actually fixed a bug: disconnecting while a capture was in
> progress and then trying to use /dev/video0 would lock that second
> application.
>
> I also tested with gspca: I could find no differences here, it all
> worked as before.
>
> There are a lot more USB video devices and it would be great if people
> could test with their devices to see if this doesn't break anything.
> Having a release callback that is called when it is really safe to free
> everything should make life a lot easier I think.

I've given your patch a try on 2.6.27-rc4 with the uvcvideo driver. Nothing
seems to have been broken so far, and the uvcvideo driver got a bit simpler
as I've been able to get rid of the refcounting logic. Good job.

Do we really need the double refcounting (class device and character device) ?
As far as I can tell, the class device is only used for the name and index
sysfs attributes. Its release callback, video_release, is called as soon as
video_unregister_device drops its reference to the class device, even when
sysfs files are still opened.

Best regards,

Laurent Pinchart

2008-08-23 14:33:38

by Hans Verkuil

[permalink] [raw]
Subject: Re: V4L2: switch to register_chrdev_region: needs testing/review of release() handling

On Saturday 23 August 2008 16:01:32 Laurent Pinchart wrote:
> Hi Hans,
>
> On Sunday 17 August 2008, Hans Verkuil wrote:
> > Hi all,
> >
> > As part of my ongoing cleanup of the v4l subsystem I've been
> > looking into converting v4l from register_chrdev to
> > register_chrdev_region. The latter is more flexible and allows for
> > a larger range of minor numbers. In addition it allows us to
> > intercept the release callback when the char device's refcount
> > reaches 0.
> >
> > This is very useful for hotpluggable devices like USB webcams.
> > Currently usb video drivers need to do the refcounting themselves,
> > but with this patch they can rely on the release callback since it
> > will only be called when the last user has closed the device. Since
> > current usb drivers do the refcounting in varying degrees of
> > competency (from 'not' to 'if you're lucky' to 'buggy' to
> > 'perfect') it would be nice to have the v4l framework take care of
> > this.
> >
> > So on a disconnect the driver can call video_unregister_device()
> > even if an application still has the device open. Only when the
> > application closes as well will the release be called and the
> > driver can do the final cleanup.
> >
> > In fact, I think with this change it should even be possible to
> > reconnect the webcam even while some application is still using the
> > old char device. In that case a new minor number will be chosen
> > since the old one is still in use, but otherwise the webcam should
> > just work as usual. This is untested, though.
> >
> > Note that right now I basically copy the old release callback as
> > installed by cdev_init() and install our own v4l callback instead
> > (to be precise, I replace the ktype pointer with our own
> > kobj_type).
> >
> > It would be much cleaner if chardev.c would allow one to set a
> > callback explicitly. It's not difficult to do that, but before
> > doing that I first have to know whether my approach is working
> > correctly.
> >
> > The v4l-dvb repository with my changes is here:
> >
> > http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/
> >
> > To see the diff in question:
> >
> > http://linuxtv.org/hg/~hverkuil/v4l-dvb-cdev2/rev/98acd2c1dea1
> >
> > I have tested myself with the quickcam_messenger webcam. For this
> > driver this change actually fixed a bug: disconnecting while a
> > capture was in progress and then trying to use /dev/video0 would
> > lock that second application.
> >
> > I also tested with gspca: I could find no differences here, it all
> > worked as before.
> >
> > There are a lot more USB video devices and it would be great if
> > people could test with their devices to see if this doesn't break
> > anything. Having a release callback that is called when it is
> > really safe to free everything should make life a lot easier I
> > think.
>
> I've given your patch a try on 2.6.27-rc4 with the uvcvideo driver.
> Nothing seems to have been broken so far, and the uvcvideo driver got
> a bit simpler as I've been able to get rid of the refcounting logic.
> Good job.

Thanks for testing this!

> Do we really need the double refcounting (class device and character
> device) ? As far as I can tell, the class device is only used for the
> name and index sysfs attributes. Its release callback, video_release,
> is called as soon as video_unregister_device drops its reference to
> the class device, even when sysfs files are still opened.

I think that in practice it probably isn't necessary, but I do know that
it is the correct way to do it. It doesn't matter for the driver, since
that has only one release to deal with.

I want to do a few additional tests next weekend and if they all pass,
then I'll ask Mauro to merge this patch.

Regards,

Hans