2024-03-05 10:54:28

by Wolfram Sang

Subject: [PATCH v2] mmc: tmio: avoid concurrent runs of mmc_request_done()

With the to-be-fixed commit, the reset_work handler cleared 'host->mrq'
outside of the spinlock protected critical section. That leaves a small
race window during execution of 'tmio_mmc_reset()' where the done_work
handler could grab a pointer to the now invalid 'host->mrq'. Both would
use it to call mmc_request_done() causing problems (see link below).

However, 'host->mrq' cannot simply be cleared earlier inside the
critical section. That would allow new mrqs to come in asynchronously
while the actual reset of the controller still needs to be done. So,
like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from
coming in but still avoiding concurrency between work handlers.

Reported-by: Dirk Behme <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race")
Signed-off-by: Wolfram Sang <[email protected]>
Tested-by: Dirk Behme <[email protected]>
Reviewed-by: Dirk Behme <[email protected]>
Cc: [email protected] # 3.0+
---

Change since v1/RFT: added Dirk's tags and stable tag

@Ulf: this is nasty, subtle stuff. Would be awesome to have it in 6.8
already!

drivers/mmc/host/tmio_mmc_core.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index be7f18fd4836..c253d176db69 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -259,6 +259,8 @@ static void tmio_mmc_reset_work(struct work_struct *work)
 	else
 		mrq->cmd->error = -ETIMEDOUT;
 
+	/* No new calls yet, but disallow concurrent tmio_mmc_done_work() */
+	host->mrq = ERR_PTR(-EBUSY);
 	host->cmd = NULL;
 	host->data = NULL;

--
2.43.0
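
For readers following the thread, the sentinel idea from the commit message
can be sketched in userspace C. This is only a simplified model under stated
assumptions, not the driver code: a pthread mutex stands in for the host
spinlock, and err_ptr(), is_err_or_null(), struct request, struct host and
the host_*() functions are hypothetical names made up for illustration. The
slot has three states: NULL (idle), an error pointer (reset in progress) and
a real pointer (request in flight).

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR() and IS_ERR_OR_NULL(). */
#define MAX_ERRNO 4095

static void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int is_err_or_null(const void *p)
{
	return !p || (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request: done_calls counts completions, it must end up 1. */
struct request {
	int error;
	int done_calls;
};

/* Hypothetical host: the mutex models the spinlock around 'mrq'. */
struct host {
	pthread_mutex_t lock;
	struct request *mrq; /* NULL: idle, error pointer: busy, else in flight */
};

/* Submit path: only an idle host (mrq == NULL) accepts a new request. */
static int host_submit(struct host *h, struct request *rq)
{
	int ret = 0;

	pthread_mutex_lock(&h->lock);
	if (h->mrq)
		ret = -EBUSY;
	else
		h->mrq = rq;
	pthread_mutex_unlock(&h->lock);
	return ret;
}

/* Take ownership of the in-flight request and leave 'next' in the slot. */
static struct request *host_claim(struct host *h, struct request *next)
{
	struct request *rq;

	pthread_mutex_lock(&h->lock);
	rq = h->mrq;
	if (is_err_or_null(rq))
		rq = NULL; /* nothing to claim, or another handler owns it */
	else
		h->mrq = next;
	pthread_mutex_unlock(&h->lock);
	return rq;
}

/* Done handler: completes the request unless someone else claimed it. */
static int host_done(struct host *h)
{
	struct request *rq = host_claim(h, NULL);

	if (!rq)
		return 0;
	rq->done_calls++;
	return 1;
}

/* Reset handler, step 1: claim the request, park the sentinel in the slot. */
static struct request *host_reset_begin(struct host *h)
{
	struct request *rq = host_claim(h, err_ptr(-EBUSY));

	if (rq)
		rq->error = -ETIMEDOUT;
	return rq;
}

/* Reset handler, step 2: runs after the (slow, unlocked) controller reset. */
static void host_reset_finish(struct host *h, struct request *rq)
{
	pthread_mutex_lock(&h->lock);
	assert(h->mrq == err_ptr(-EBUSY)); /* sentinel kept everyone out */
	h->mrq = NULL; /* only now may new requests come in */
	pthread_mutex_unlock(&h->lock);
	rq->done_calls++;
}
```

Between host_reset_begin() and host_reset_finish(), a host_done() call finds
the sentinel and backs off, and host_submit() is refused, so in this model
the request is completed exactly once and no new request arrives while the
reset is still pending.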



2024-03-05 12:19:30

by Ulf Hansson

Subject: Re: [PATCH v2] mmc: tmio: avoid concurrent runs of mmc_request_done()

On Tue, 5 Mar 2024 at 11:44, Wolfram Sang
<[email protected]> wrote:
>
> With the to-be-fixed commit, the reset_work handler cleared 'host->mrq'
> outside of the spinlock protected critical section. That leaves a small
> race window during execution of 'tmio_mmc_reset()' where the done_work
> handler could grab a pointer to the now invalid 'host->mrq'. Both would
> use it to call mmc_request_done() causing problems (see link below).
>
> However, 'host->mrq' cannot simply be cleared earlier inside the
> critical section. That would allow new mrqs to come in asynchronously
> while the actual reset of the controller still needs to be done. So,
> like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from
> coming in but still avoiding concurrency between work handlers.
>
> Reported-by: Dirk Behme <[email protected]>
> Closes: https://lore.kernel.org/all/[email protected]/
> Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race")
> Signed-off-by: Wolfram Sang <[email protected]>
> Tested-by: Dirk Behme <[email protected]>
> Reviewed-by: Dirk Behme <[email protected]>
> Cc: [email protected] # 3.0+

Applied for fixes, thanks!

Kind regards
Uffe


> ---
>
> Change since v1/RFT: added Dirk's tags and stable tag
>
> @Ulf: this is nasty, subtle stuff. Would be awesome to have it in 6.8
> already!
>
> drivers/mmc/host/tmio_mmc_core.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
> index be7f18fd4836..c253d176db69 100644
> --- a/drivers/mmc/host/tmio_mmc_core.c
> +++ b/drivers/mmc/host/tmio_mmc_core.c
> @@ -259,6 +259,8 @@ static void tmio_mmc_reset_work(struct work_struct *work)
>  	else
>  		mrq->cmd->error = -ETIMEDOUT;
>
> +	/* No new calls yet, but disallow concurrent tmio_mmc_done_work() */
> +	host->mrq = ERR_PTR(-EBUSY);
>  	host->cmd = NULL;
>  	host->data = NULL;
>
> --
> 2.43.0
>

2024-03-05 13:48:38

by Geert Uytterhoeven

Subject: Re: [PATCH v2] mmc: tmio: avoid concurrent runs of mmc_request_done()

On Tue, Mar 5, 2024 at 11:54 AM Wolfram Sang
<[email protected]> wrote:
> With the to-be-fixed commit, the reset_work handler cleared 'host->mrq'
> outside of the spinlock protected critical section. That leaves a small
> race window during execution of 'tmio_mmc_reset()' where the done_work
> handler could grab a pointer to the now invalid 'host->mrq'. Both would
> use it to call mmc_request_done() causing problems (see link below).
>
> However, 'host->mrq' cannot simply be cleared earlier inside the
> critical section. That would allow new mrqs to come in asynchronously
> while the actual reset of the controller still needs to be done. So,
> like 'tmio_mmc_set_ios()', an ERR_PTR is used to prevent new mrqs from
> coming in but still avoiding concurrency between work handlers.
>
> Reported-by: Dirk Behme <[email protected]>
> Closes: https://lore.kernel.org/all/[email protected]/
> Fixes: df3ef2d3c92c ("mmc: protect the tmio_mmc driver against a theoretical race")
> Signed-off-by: Wolfram Sang <[email protected]>
> Tested-by: Dirk Behme <[email protected]>
> Reviewed-by: Dirk Behme <[email protected]>
> Cc: [email protected] # 3.0+

Thanks, I gave it a boot run on all boards in my farm, no issues seen.
Tested-by: Geert Uytterhoeven <[email protected]>

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds