2010-01-03 19:13:52

by Maxim Levitsky

Subject: [block subsystem] Need help to prevent races on unexpected device removal

While developing hotplug support for the MTD translation layer, I seem
to be unable to figure out a way to prevent the following race:

First of all, a block device is registered. I attach a private structure
to that device to save all internal information.

Then, out of the blue (when the user pulls out the card), I receive a
request to remove the device.

In the function that handles such removal, I do:

del_gendisk(...
blk_start_queue

stop thread that processes the requests

blk_cleanup_queue(old->rq);


The problem is that I don't know where/when to free the private
structure.

I thought about adding a field named 'invalid' to the structure, so
that release will not attempt to go further and will just free the
structure, but what happens if release is never called?
In other words, this will work only as long as there is a user of the
block device.

I then thought that I could detect that condition and free the structure
in the removal function itself, but then I get a race with ->open
running at the same time. A mutex will not prevent it: I will have to
release it at some point, and then ->open will access a freed structure.


Best regards,
Maxim Levitsky


2010-01-04 19:57:13

by Maxim Levitsky

Subject: Re: [block subsystem] Need help to prevent races on unexpected device removal

I still don't know how to do this properly.

Best regards,
Maxim Levitsky

2010-01-20 02:47:49

by Tejun Heo

Subject: Re: [block subsystem] Need help to prevent races on unexpected device removal

(cc'ing Jens) Hello,

Sorry about the late reply. I tagged this while I was watching block
related mails a couple of weeks ago but forgot about this.

On 01/04/2010 04:13 AM, Maxim Levitsky wrote:
> While developing hotplug support for the MTD translation layer, I seem
> to be unable to figure out a way to prevent the following race:
>
> First of all, a block device is registered. I attach a private structure
> to that device to save all internal information.

I suppose you're talking about struct gendisk and using
gendisk->private_data for the private data, right?

> Then, out of the blue (when the user pulls out the card), I receive a
> request to remove the device.
>
> In the function that handles such removal, I do:
>
> del_gendisk(...
> blk_start_queue
>
> stop thread that processes the requests
>
> blk_cleanup_queue(old->rq);
>
>
> The problem is that I don't know where/when to free the private
> structure.
>
> I thought about adding a field named 'invalid' to the structure, so
> that release will not attempt to go further and will just free the
> structure, but what happens if release is never called?
> In other words, this will work only as long as there is a user of the
> block device.
>
> I then thought that I could detect that condition and free the structure
> in the removal function itself, but then I get a race with ->open
> running at the same time. A mutex will not prevent it: I will have to
> release it at some point, and then ->open will access a freed structure.

On hot-unplug, the driver should mark the device dead so that all
further operations coming from existing opens fail, and then put the
base reference. On the final put, which may happen either as part of
device destruction or in release, the private data structure can be
destroyed while holding a mutex. Open can be protected by grabbing
the same mutex before dereferencing private_data.
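
A rough sketch of that scheme, written as a userspace model with
pthreads rather than as kernel code. Every name here (my_priv, my_disk,
my_add, my_open, my_release, my_remove, my_io, table_lock, priv_frees)
is hypothetical and only stands in for the driver's own structures and
callbacks. It also assumes the disk object itself outlives the private
data, which the real block layer would have to guarantee separately.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical driver-private state (userspace model, not kernel API). */
struct my_priv {
    int refcnt;   /* base reference + one per successful open */
    bool dead;    /* set once the hardware has been pulled */
};

/* Stand-in for the object that carries the private_data pointer. */
struct my_disk {
    struct my_priv *private_data;
};

/* One mutex guards private_data, refcnt and dead. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static int priv_frees;  /* counts frees, for demonstration only */

/* Drop one reference; the caller must hold table_lock. */
static void priv_put_locked(struct my_disk *disk)
{
    struct my_priv *p = disk->private_data;

    assert(p && p->refcnt > 0);
    if (--p->refcnt == 0) {
        /* Final put: clear the pointer under the lock, then free. */
        disk->private_data = NULL;
        free(p);
        priv_frees++;
    }
}

/* Device registration: private data starts with the base reference. */
int my_add(struct my_disk *disk)
{
    struct my_priv *p = calloc(1, sizeof(*p));

    if (!p)
        return -ENOMEM;
    p->refcnt = 1;
    pthread_mutex_lock(&table_lock);
    disk->private_data = p;
    pthread_mutex_unlock(&table_lock);
    return 0;
}

/* Models ->open: take the mutex before touching private_data. */
int my_open(struct my_disk *disk)
{
    int ret = 0;

    pthread_mutex_lock(&table_lock);
    struct my_priv *p = disk->private_data;
    if (!p || p->dead)
        ret = -ENODEV;      /* device gone or going away */
    else
        p->refcnt++;        /* this opener now pins the structure */
    pthread_mutex_unlock(&table_lock);
    return ret;
}

/* Models ->release: only valid after a successful my_open(). */
void my_release(struct my_disk *disk)
{
    pthread_mutex_lock(&table_lock);
    priv_put_locked(disk);
    pthread_mutex_unlock(&table_lock);
}

/* Models hot-unplug: mark dead, then drop the base reference. */
void my_remove(struct my_disk *disk)
{
    pthread_mutex_lock(&table_lock);
    disk->private_data->dead = true;
    priv_put_locked(disk);
    pthread_mutex_unlock(&table_lock);
}

/* Models I/O from an existing opener: fails once the device is dead. */
int my_io(struct my_disk *disk)
{
    int ret;

    pthread_mutex_lock(&table_lock);
    ret = disk->private_data->dead ? -ENODEV : 0;
    pthread_mutex_unlock(&table_lock);
    return ret;
}
```

With this layout, whichever of my_remove() or the last my_release()
runs second performs the free, so the structure is released exactly
once whether or not anybody had the device open at unplug time.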

Thanks.

--
tejun