When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
enabled then race condition between mmc_rescan() and
mmc_resume()/mmc_sd_resume() appeared.
Resume functions can sleep into mmc_remove_card() and at this time
mmc_rescan() can be called by delayed work handler. Double-free of
kobject or double-remove of host->card can be result of this.
This patch adds an mutex which deny simultaneous executing of
mmc_sd_resume()/mmc_resume() and mmc_rescan() functions. Probably, it is
not right way.
Signed-off-by: Yauhen Kharuzhy <[email protected]>
---
drivers/mmc/core/core.c | 7 +++++++
drivers/mmc/core/host.c | 3 +++
drivers/mmc/core/mmc.c | 3 +++
drivers/mmc/core/sd.c | 3 +++
include/linux/mmc/host.h | 3 +++
5 files changed, 19 insertions(+), 0 deletions(-)
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 044d84e..838fee0 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -657,6 +657,9 @@ void mmc_rescan(struct work_struct *work)
u32 ocr;
int err;
+#ifdef CONFIG_MMC_UNSAFE_RESUME
+ mutex_lock(&host->carddetect_lock);
+#endif
mmc_bus_get(host);
if (host->bus_ops == NULL) {
@@ -717,6 +720,10 @@ void mmc_rescan(struct work_struct *work)
out:
if (host->caps & MMC_CAP_NEEDS_POLL)
mmc_schedule_delayed_work(&host->detect, HZ);
+
+#ifdef CONFIG_MMC_UNSAFE_RESUME
+ mutex_unlock(&host->carddetect_lock);
+#endif
}
void mmc_start_host(struct mmc_host *host)
diff --git a/drivers/mmc/core/host.c b/drivers/mmc/core/host.c
index 6da80fd..90a7218 100644
--- a/drivers/mmc/core/host.c
+++ b/drivers/mmc/core/host.c
@@ -82,6 +82,9 @@ struct mmc_host *mmc_alloc_host(int extra, struct device *dev)
device_initialize(&host->class_dev);
spin_lock_init(&host->lock);
+#ifdef CONFIG_MMC_UNSAFE_RESUME
+ mutex_init(&host->carddetect_lock);
+#endif
init_waitqueue_head(&host->wq);
INIT_DELAYED_WORK(&host->detect, mmc_rescan);
diff --git a/drivers/mmc/core/mmc.c b/drivers/mmc/core/mmc.c
index fdd7c76..e335b50 100644
--- a/drivers/mmc/core/mmc.c
+++ b/drivers/mmc/core/mmc.c
@@ -523,6 +523,8 @@ static void mmc_resume(struct mmc_host *host)
{
int err;
+ mutex_lock(&host->carddetect_lock);
+
BUG_ON(!host);
BUG_ON(!host->card);
@@ -538,6 +540,7 @@ static void mmc_resume(struct mmc_host *host)
mmc_release_host(host);
}
+ mutex_unlock(&host->carddetect_lock);
}
#else
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index 26fc098..40a1404 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -585,6 +585,8 @@ static void mmc_sd_resume(struct mmc_host *host)
{
int err;
+ mutex_lock(&host->carddetect_lock);
+
BUG_ON(!host);
BUG_ON(!host->card);
@@ -600,6 +602,7 @@ static void mmc_sd_resume(struct mmc_host *host)
mmc_release_host(host);
}
+ mutex_unlock(&host->carddetect_lock);
}
#else
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 9c288c9..049598c 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -129,6 +129,9 @@ struct mmc_host {
/* private data */
spinlock_t lock; /* lock for claim and bus ops */
+#ifdef CONFIG_MMC_UNSAFE_RESUME
+ struct mutex carddetect_lock;
+#endif
struct mmc_ios ios; /* current io bus settings */
u32 ocr; /* the current OCR setting */
--
1.5.6.5
On Thu, 16 Oct 2008 19:09:36 +0300
Yauhen Kharuzhy <[email protected]> wrote:
> When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
> enabled then race condition between mmc_rescan() and
> mmc_resume()/mmc_sd_resume() appeared.
>
> Resume functions can sleep into mmc_remove_card() and at this time
> mmc_rescan() can be called by delayed work handler. Double-free of
> kobject or double-remove of host->card can be result of this.
>
> This patch adds an mutex which deny simultaneous executing of
> mmc_sd_resume()/mmc_resume() and mmc_rescan() functions. Probably, it is
> not right way.
>
> Signed-off-by: Yauhen Kharuzhy <[email protected]>
> ---
Can't we just ask the PM layer if this device is currently resuming,
and if so ignore card notifications from the driver?
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.
On Sat, Oct 18, 2008 at 07:11:47PM +0200, Pierre Ossman wrote:
> On Thu, 16 Oct 2008 19:09:36 +0300
> Yauhen Kharuzhy <[email protected]> wrote:
>
> > When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
> > enabled then race condition between mmc_rescan() and
> > mmc_resume()/mmc_sd_resume() appeared.
> >
> > Resume functions can sleep into mmc_remove_card() and at this time
> > mmc_rescan() can be called by delayed work handler. Double-free of
> > kobject or double-remove of host->card can be result of this.
> >
> > This patch adds an mutex which deny simultaneous executing of
> > mmc_sd_resume()/mmc_resume() and mmc_rescan() functions. Probably, it is
> > not right way.
> >
> > Signed-off-by: Yauhen Kharuzhy <[email protected]>
> > ---
>
> Can't we just ask the PM layer if this device is currently resuming,
> and if so ignore card notifications from the driver?
What about another idea: mmc_sd_resume() checks if host->detect is
scheduled and if true then it don't try to reinitialize card.
host->detect can be scheduled at this moment only in one case: if device has been waked up by
card change interrupt.
--
Yauhen Kharuzhy jekhor _at_ gmail.com
JID: [email protected]
A: No
Q: Should I quote below my post?
2008/10/19 Yauhen Kharuzhy <[email protected]>:
> On Sat, Oct 18, 2008 at 07:11:47PM +0200, Pierre Ossman wrote:
>> On Thu, 16 Oct 2008 19:09:36 +0300
>> Yauhen Kharuzhy <[email protected]> wrote:
>>
>> > When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
>> > enabled then race condition between mmc_rescan() and
>> > mmc_resume()/mmc_sd_resume() appeared.
>> >
>> > Resume functions can sleep into mmc_remove_card() and at this time
>> > mmc_rescan() can be called by delayed work handler. Double-free of
>> > kobject or double-remove of host->card can be result of this.
>> >
>> > This patch adds an mutex which deny simultaneous executing of
>> > mmc_sd_resume()/mmc_resume() and mmc_rescan() functions. Probably, it is
>> > not right way.
>> >
>> > Signed-off-by: Yauhen Kharuzhy <[email protected]>
>> > ---
>>
>> Can't we just ask the PM layer if this device is currently resuming,
>> and if so ignore card notifications from the driver?
>
> What about another idea: mmc_sd_resume() checks if host->detect is
> scheduled and if true then it don't try to reinitialize card.
> host->detect can be scheduled at this moment only in one case: if device has been waked up by
> card change interrupt.
Hmm... But it will cause a race condition if SD card change interrupt
will be raised during resume process.
--
Best regards,
Yauhen Kharuzhy jekhor_(at)_gmail.com
A: No
Q: Should I quote below my post?
When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
enabled then race condition between mmc_rescan() and
mmc_resume()/mmc_sd_resume() appeared.
Resume functions can sleep into mmc_remove_card() and at this time
mmc_rescan() can be called by delayed work handler. Double-free of
kobject or double-remove of host->card can be result of this.
This patch adds an host->suspended flag which indicated that host is in
suspend state. mmc_rescan() checks it and returned when
host->suspended == 1. It's safe because resume code calls
mmc_detect_change() at end of resume process.
Signed-off-by: Yauhen Kharuzhy <[email protected]>
---
drivers/mmc/core/core.c | 7 +++++++
include/linux/mmc/host.h | 3 +++
2 files changed, 10 insertions(+), 0 deletions(-)
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 044d84e..427f283 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -657,6 +657,9 @@ void mmc_rescan(struct work_struct *work)
u32 ocr;
int err;
+ if (host->suspended)
+ return;
+
mmc_bus_get(host);
if (host->bus_ops == NULL) {
@@ -780,6 +783,8 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
mmc_power_off(host);
+ host->suspended = 1;
+
return 0;
}
@@ -805,6 +810,8 @@ int mmc_resume_host(struct mmc_host *host)
*/
mmc_detect_change(host, 1);
+ host->suspended = 0;
+
return 0;
}
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index 9c288c9..a584239 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -139,6 +139,9 @@ struct mmc_host {
#ifdef CONFIG_MMC_DEBUG
unsigned int removed:1; /* host is being removed */
#endif
+#ifdef CONFIG_MMC_UNSAFE_RESUME
+ unsigned int suspended:1;
+#endif
struct mmc_card *card; /* device attached to this host */
--
1.5.6.5
On Mon, 20 Oct 2008 22:41:48 +0300, Yauhen Kharuzhy <[email protected]> wrote:
> When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
> enabled then race condition between mmc_rescan() and
> mmc_resume()/mmc_sd_resume() appeared.
>
> Resume functions can sleep into mmc_remove_card() and at this time
> mmc_rescan() can be called by delayed work handler. Double-free of
> kobject or double-remove of host->card can be result of this.
>
> This patch adds an host->suspended flag which indicated that host is in
> suspend state. mmc_rescan() checks it and returned when
> host->suspended == 1. It's safe because resume code calls
> mmc_detect_change() at end of resume process.
>
> Signed-off-by: Yauhen Kharuzhy <[email protected]>
> ---
> drivers/mmc/core/core.c | 7 +++++++
> include/linux/mmc/host.h | 3 +++
> 2 files changed, 10 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 044d84e..427f283 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -657,6 +657,9 @@ void mmc_rescan(struct work_struct *work)
> u32 ocr;
> int err;
>
> + if (host->suspended)
> + return;
> +
> mmc_bus_get(host);
>
> if (host->bus_ops == NULL) {
> @@ -780,6 +783,8 @@ int mmc_suspend_host(struct mmc_host *host, pm_message_t state)
>
> mmc_power_off(host);
>
> + host->suspended = 1;
> +
> return 0;
> }
>
> @@ -805,6 +810,8 @@ int mmc_resume_host(struct mmc_host *host)
> */
> mmc_detect_change(host, 1);
>
> + host->suspended = 0;
> +
> return 0;
> }
>
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index 9c288c9..a584239 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -139,6 +139,9 @@ struct mmc_host {
> #ifdef CONFIG_MMC_DEBUG
> unsigned int removed:1; /* host is being removed */
> #endif
> +#ifdef CONFIG_MMC_UNSAFE_RESUME
> + unsigned int suspended:1;
> +#endif
>
> struct mmc_card *card; /* device attached to this host */
>
Shouldn't you also bracket any acces to ->suspended with the ifdefs ?
Compilation failed on my box...and I had to enable MMC_UNSAFE_RESUME.
--
J.A. Magallon <jamagallon()ono!com> \ Software is like sex:
\ It's better when it's free
Mandriva Linux release 2009.0 (Cooker) for i586
Linux 2.6.25-jam18 (gcc 4.3.1 20080626 (GCC) #1 SMP
On Mon, 20 Oct 2008 22:41:48 +0300
Yauhen Kharuzhy <[email protected]> wrote:
> When device wakes up by card change interrupt and MMC_UNSAFE_RESUME is
> enabled then race condition between mmc_rescan() and
> mmc_resume()/mmc_sd_resume() appeared.
>
Having thought a bit more about this, I'm not sure where the race is.
mmc_sd_resume() will be called before mmc_detect_change() is.
There is a race if the drivers call mmc_detect_change() before
mmc_resume_host() has returned, but that is a driver bug. I won't
object to adding a safe guard against that, but the commit message
should reflect that scenario and not something else. There should also
be some printk() to indicate that the driver is up to no good.
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 044d84e..427f283 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -657,6 +657,9 @@ void mmc_rescan(struct work_struct *work)
> u32 ocr;
> int err;
>
> + if (host->suspended)
> + return;
> +
> mmc_bus_get(host);
>
> if (host->bus_ops == NULL) {
Was there no way to query the PM layer for this information?
> @@ -805,6 +810,8 @@ int mmc_resume_host(struct mmc_host *host)
> */
> mmc_detect_change(host, 1);
>
> + host->suspended = 0;
> +
> return 0;
> }
>
You've added a new race here. ;)
You should set suspended to 0 before calling mmc_detect_change(), not
after. The point was to protect bus_ops->resume(), nothing else.
> diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
> index 9c288c9..a584239 100644
> --- a/include/linux/mmc/host.h
> +++ b/include/linux/mmc/host.h
> @@ -139,6 +139,9 @@ struct mmc_host {
> #ifdef CONFIG_MMC_DEBUG
> unsigned int removed:1; /* host is being removed */
> #endif
> +#ifdef CONFIG_MMC_UNSAFE_RESUME
> + unsigned int suspended:1;
> +#endif
>
> struct mmc_card *card; /* device attached to this host */
>
No #ifdef for this as there are no where this variable is referenced.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
rdesktop, core developer http://www.rdesktop.org
WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.