2006-11-14 07:39:51

by Pierre Ossman

Subject: device_del() and references

Hi Russell,

I'm trying to wrap my head around the dependencies in the MMC layer and
there are some gaps I am suspicious about. Since you've been looking at
this a lot longer than I have, I thought you could have some valuable
insight.

When a card driver has obtained a reference to a card, what makes sure
we do not destroy that card from under its feet? The reference count on
the device structure in the card only makes sure the structure itself is
in memory, not that the data is valid. E.g. the host might be removed
leaving the host pointer invalid.
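
To make the concern concrete, here is a minimal userspace sketch, not actual MMC code. The names struct host, struct card, card_get(), card_put(), host_remove() and card_usable() are hypothetical stand-ins. In the real case the host pointer would simply dangle; the sketch clears it to NULL instead so the stale state can be observed safely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real MMC structures. */
struct host {
    int id;
};

struct card {
    int refcount;       /* keeps the struct card itself allocated */
    struct host *host;  /* NOT kept alive by the card's refcount */
};

static struct card *card_alloc(struct host *h)
{
    struct card *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->refcount = 1;
    c->host = h;
    return c;
}

/* Take an extra reference, as a card driver would. */
static struct card *card_get(struct card *c)
{
    assert(c->refcount > 0);
    c->refcount++;
    return c;
}

/* Drop a reference; returns 1 if this freed the structure. */
static int card_put(struct card *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        free(c);
        return 1;
    }
    return 0;
}

/*
 * Model of the host going away. The card's refcount is untouched, so
 * the structure stays in memory, but the data in it is no longer valid.
 * (Assumption: we clear the pointer; real code would leave it dangling.)
 */
static void host_remove(struct card *c)
{
    c->host = NULL;
}

static int card_usable(const struct card *c)
{
    return c->host != NULL;
}
```

After host_remove() the driver's reference still pins the memory, yet card_usable() reports that the contents can no longer be trusted, which is exactly the gap I am worried about.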

I suspect that device_del() doesn't return until remove() has been
called and that our requirement is that the card driver must have
released all references to the card before its remove routine exits.

If so, then there is the risk of a race in mmc_block. What guarantees
that the request handler isn't running in parallel with the remove
function? Again, I suspect that del_gendisk() might grab the queue lock,
but as there might be stuff left in the queue, this seems insufficient.

Perhaps there is an is_gendisk_valid() we can stick at the top of the
request handler?
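
Roughly what I have in mind, as a userspace model only. Everything here is an assumption for illustration: struct fake_disk, its removed flag, disk_remove() and request_handler() are hypothetical, and a pthread mutex stands in for the queue lock. It does not claim to reflect what del_gendisk() or the block layer actually do.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a disk; not the real gendisk or request queue. */
struct fake_disk {
    pthread_mutex_t queue_lock; /* stands in for the queue lock */
    bool removed;               /* set once removal has started */
    int pending;                /* requests still sitting in the queue */
    int completed;              /* requests handled normally */
    int failed;                 /* requests rejected after removal */
};

static void disk_init(struct fake_disk *d, int pending)
{
    pthread_mutex_init(&d->queue_lock, NULL);
    d->removed = false;
    d->pending = pending;
    d->completed = 0;
    d->failed = 0;
}

/*
 * Request handler: drains the queue under the lock and checks validity
 * before touching each request, so leftovers are failed instead of
 * being issued to a device that is gone.
 */
static void request_handler(struct fake_disk *d)
{
    pthread_mutex_lock(&d->queue_lock);
    while (d->pending > 0) {
        d->pending--;
        if (d->removed)
            d->failed++;
        else
            d->completed++;
    }
    pthread_mutex_unlock(&d->queue_lock);
}

/*
 * Remove path: taking the same lock means it cannot run in parallel
 * with the handler, and the flag covers whatever is still queued.
 */
static void disk_remove(struct fake_disk *d)
{
    pthread_mutex_lock(&d->queue_lock);
    d->removed = true;
    pthread_mutex_unlock(&d->queue_lock);
}
```

The lock alone only serialises the two paths; it is the flag that deals with requests left in the queue after removal.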

Rgds
--
-- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer            http://www.rdesktop.org


2006-11-14 11:24:16

by Russell King

Subject: Re: device_del() and references

On Tue, Nov 14, 2006 at 08:40:00AM +0100, Pierre Ossman wrote:
> When a card driver has obtained a reference to a card, what makes sure
> we do not destroy that card from under its feet?

Essentially, the driver model (see the answer to your paragraph below).

> I suspect that device_del() doesn't return until remove() has been
> called and that our requirement is that the card driver must have
> released all references to the card before its remove routine exits.

Your sentence is confusing - which "remove()" are you talking about
here? If you're talking about mmc_blk_remove() then that's correct.

> If so, then there is the risk of a race in mmc_block. What guarantees
> that the request handler isn't running in parallel with the remove
> function? Again, I suspect that del_gendisk() might grab the queue lock,
> but as there might be stuff left in the queue, this seems insufficient.

Hmm, not sure here. I think you might be right, but the block layer is
*extremely* finicky when it comes to removing stuff.

In short, I don't know - I've forgotten quite a bit about the low-level
block interface with MMC since it's something I did once and only once.

Maybe Jens has some ideas?

--
Russell King
 Linux kernel    2.6 ARM Linux   - http://www.arm.linux.org.uk/
 maintainer of:  2.6 Serial core