2020-03-13 18:38:25

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH] PCI/switchtec: Fix init_completion race condition with poll_wait()

The call to init_completion() in mrpc_queue_cmd() can theoretically
race with the call to poll_wait() in switchtec_dev_poll().

poll() write()
switchtec_dev_poll() switchtec_dev_write()
poll_wait(&s->comp.wait); mrpc_queue_cmd()
init_completion(&s->comp)
init_waitqueue_head(&s->comp.wait)

To my knowledge, no one has hit this bug, but we should fix it for
correctness.

Fix this by using reinit_completion() instead of init_completion() in
mrpc_queue_cmd().

Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
Reported-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/pci/switch/switchtec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index a823b4b8ef8a..81dc7ac01381 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -175,7 +175,7 @@ static int mrpc_queue_cmd(struct switchtec_user *stuser)
kref_get(&stuser->kref);
stuser->read_len = sizeof(stuser->data);
stuser_set_state(stuser, MRPC_QUEUED);
- init_completion(&stuser->comp);
+ reinit_completion(&stuser->comp);
list_add_tail(&stuser->list, &stdev->mrpc_queue);

mrpc_cmd_submit(stdev);
--
2.20.1


2020-03-17 01:08:30

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] PCI/switchtec: Fix init_completion race condition with poll_wait()

Logan,

Logan Gunthorpe <[email protected]> writes:

> The call to init_completion() in mrpc_queue_cmd() can theoretically
> race with the call to poll_wait() in switchtec_dev_poll().
>
> poll() write()
> switchtec_dev_poll() switchtec_dev_write()
> poll_wait(&s->comp.wait); mrpc_queue_cmd()
> init_completion(&s->comp)
> init_waitqueue_head(&s->comp.wait)

just a nitpick. As you took the liberty to copy the description of the
race, which was btw. disovered by me, verbatim from a changelog written
by someone else w/o providing the courtesy of linking to that original
analysis, allow me the liberty to add the missing link:

Link: https://lore.kernel.org/lkml/[email protected]

> To my knowledge, no one has hit this bug, but we should fix it for
> correctness.

s/,but we should fix/.Fix/ ?

> Fix this by using reinit_completion() instead of init_completion() in
> mrpc_queue_cmd().
>
> Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
> Reported-by: Sebastian Andrzej Siewior <[email protected]>
> Signed-off-by: Logan Gunthorpe <[email protected]>

Acked-by: Thomas Gleixner <[email protected]>

@Bjorn: Can you please hold off on this for a few days until we sorted
out the remaining issues to avoid potential merge conflicts
vs. the completion series?

Thanks,

tglx

2020-03-17 02:06:08

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] PCI/switchtec: Fix init_completion race condition with poll_wait()



On 2020-03-16 6:56 p.m., Thomas Gleixner wrote:
> Logan,
>
> Logan Gunthorpe <[email protected]> writes:
>
>> The call to init_completion() in mrpc_queue_cmd() can theoretically
>> race with the call to poll_wait() in switchtec_dev_poll().
>>
>> poll() write()
>> switchtec_dev_poll() switchtec_dev_write()
>> poll_wait(&s->comp.wait); mrpc_queue_cmd()
>> init_completion(&s->comp)
>> init_waitqueue_head(&s->comp.wait)
>
> just a nitpick. As you took the liberty to copy the description of the
> race, which was btw. disovered by me, verbatim from a changelog written
> by someone else w/o providing the courtesy of linking to that original
> analysis, allow me the liberty to add the missing link:
>
> Link: https://lore.kernel.org/lkml/[email protected]

Well, I just copied the call chain. I had no way to know you were the
one who discovered the bug given the way it was presented to me. And the
original patch didn't include much in the way of analysis of the bug,
just "It's Racy".

I didn't deliberately omit the link, it just never occurred to me to add
it. In retrospect, I should have included it, sorry about that.

>> To my knowledge, no one has hit this bug, but we should fix it for
>> correctness.
>
> s/,but we should fix/.Fix/ ?

Yes, that's an improvement.

>> Fix this by using reinit_completion() instead of init_completion() in
>> mrpc_queue_cmd().
>>
>> Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")
>> Reported-by: Sebastian Andrzej Siewior <[email protected]>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>
> Acked-by: Thomas Gleixner <[email protected]>

Thanks.

> @Bjorn: Can you please hold off on this for a few days until we sorted
> out the remaining issues to avoid potential merge conflicts
> vs. the completion series?

I'd suggest simply rebasing the completion patch on this patch, or a
patch like it. Then we'll have the proper bug fix commit and there won't
be a conflict.

Logan

2020-03-17 09:06:06

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] PCI/switchtec: Fix init_completion race condition with poll_wait()

Logan Gunthorpe <[email protected]> writes:
> On 2020-03-16 6:56 p.m., Thomas Gleixner wrote:
>> @Bjorn: Can you please hold off on this for a few days until we sorted
>> out the remaining issues to avoid potential merge conflicts
>> vs. the completion series?
>
> I'd suggest simply rebasing the completion patch on this patch, or a
> patch like it. Then we'll have the proper bug fix commit and there won't
> be a conflict.

The conflict is not a question of rebasing or not. The conflict arises
when this patch is routed through PCI simply because then the rest of
the completion work is stuck until this hits mainline.

Thanks,

tglx

Subject: [tip: locking/core] PCI/switchtec: Fix init_completion race condition with poll_wait()

The following commit has been merged into the locking/core branch of tip:

Commit-ID: efbdc769601f4d50018bf7ca50fc9f7c67392ece
Gitweb: https://git.kernel.org/tip/efbdc769601f4d50018bf7ca50fc9f7c67392ece
Author: Logan Gunthorpe <[email protected]>
AuthorDate: Sat, 21 Mar 2020 12:25:45 +01:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Sat, 21 Mar 2020 16:00:20 +01:00

PCI/switchtec: Fix init_completion race condition with poll_wait()

The call to init_completion() in mrpc_queue_cmd() can theoretically
race with the call to poll_wait() in switchtec_dev_poll().

poll() write()
switchtec_dev_poll() switchtec_dev_write()
poll_wait(&s->comp.wait); mrpc_queue_cmd()
init_completion(&s->comp)
init_waitqueue_head(&s->comp.wait)

To my knowledge, no one has hit this bug.

Fix this by using reinit_completion() instead of init_completion() in
mrpc_queue_cmd().

Fixes: 080b47def5e5 ("MicroSemi Switchtec management interface driver")

Reported-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
drivers/pci/switch/switchtec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index a823b4b..81dc7ac 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -175,7 +175,7 @@ static int mrpc_queue_cmd(struct switchtec_user *stuser)
kref_get(&stuser->kref);
stuser->read_len = sizeof(stuser->data);
stuser_set_state(stuser, MRPC_QUEUED);
- init_completion(&stuser->comp);
+ reinit_completion(&stuser->comp);
list_add_tail(&stuser->list, &stdev->mrpc_queue);

mrpc_cmd_submit(stdev);