2017-12-13 11:28:18

by Xinming Hu

[permalink] [raw]
Subject: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler

The last command used to shutdown firmware might be timeout,
and trigger firmware dump in asynchronous pcie/sdio work.

The remove/shutdown handler will continue free core data
structure private/adapter, which might be dereferenced in
pcie/sdio work, finally crash the kernel.

Sync and Cancel pcie/sdio work, could be a fix for above
cornel case. In this way, the last command timeout could
be handled properly.

Signed-off-by: Xinming Hu <[email protected]>
---
drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
drivers/net/wireless/marvell/mwifiex/sdio.c | 2 ++
2 files changed, 4 insertions(+)

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index f666cb2..23209c5 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
}

+ cancel_work_sync(&card->work);
+
mwifiex_remove_card(adapter);
}

diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
index a828801..2488587 100644
--- a/drivers/net/wireless/marvell/mwifiex/sdio.c
+++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
@@ -399,6 +399,8 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
}

+ cancel_work_sync(&card->work);
+
mwifiex_remove_card(adapter);
}

--
1.9.1


2018-01-08 18:11:22

by Brian Norris

[permalink] [raw]
Subject: Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler

Hi,

On Wed, Dec 13, 2017 at 07:27:53PM +0800, Xinming Hu wrote:
> The last command used to shutdown firmware might be timeout,
> and trigger firmware dump in asynchronous pcie/sdio work.
>
> The remove/shutdown handler will continue free core data
> structure private/adapter, which might be dereferenced in
> pcie/sdio work, finally crash the kernel.
>
> Sync and Cancel pcie/sdio work, could be a fix for above
> cornel case. In this way, the last command timeout could

s/cornel/corner/

> be handled properly.
>
> Signed-off-by: Xinming Hu <[email protected]>
> ---
> drivers/net/wireless/marvell/mwifiex/pcie.c | 2 ++
> drivers/net/wireless/marvell/mwifiex/sdio.c | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
> index f666cb2..23209c5 100644
> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> @@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
> mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> }
>
> + cancel_work_sync(&card->work);
> +

Just FYI, this "fix" is not a real fix. It will likely paper over some
of your bugs (where, e.g., the FW shutdown command times out in the
previous couple of lines), but this highlights the fact that there are
other races that could trigger the same behavior. You're not fixing
those.

For example, what if somebody initiates a scan or other nl80211 command
between the above line and mwifiex_remove_card()? That command could
potentially time out too.

The proper fix would be to institute some kind of mutual exclusion
(locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
that they can't occur at the same time.

Unfortunately, I only paid attention to this after Kalle already applied
this patch. Personally, I'd prefer this patch not get applied, since
it's a bad solution to an obvious problem, which instead leaves a subtle
problem that perhaps no one will bother fixing.

Brian

> mwifiex_remove_card(adapter);
> }
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/sdio.c b/drivers/net/wireless/marvell/mwifiex/sdio.c
> index a828801..2488587 100644
> --- a/drivers/net/wireless/marvell/mwifiex/sdio.c
> +++ b/drivers/net/wireless/marvell/mwifiex/sdio.c
> @@ -399,6 +399,8 @@ static int mwifiex_check_winner_status(struct mwifiex_adapter *adapter)
> mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> }
>
> + cancel_work_sync(&card->work);
> +
> mwifiex_remove_card(adapter);
> }
>
> --
> 1.9.1
>

2018-01-09 22:01:32

by Brian Norris

[permalink] [raw]
Subject: Re: Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler

+ Christopher

Hi Simon and Kalle,

On Tue, Jan 09, 2018 at 11:42:21AM +0000, Xinming Hu wrote:
> Hi,
>
> > -----Original Message-----
> > From: Kalle Valo [mailto:[email protected]]
> > Sent: 2018年1月9日 15:40
> > To: Brian Norris <[email protected]>
> > Cc: Xinming Hu <[email protected]>; Linux Wireless
> > <[email protected]>; Dmitry Torokhov <[email protected]>;
> > [email protected]; Zhiyuan Yang <[email protected]>; Tim Song
> > <[email protected]>; Cathy Luo <[email protected]>; James Cao
> > <[email protected]>; Ganapathi Bhat <[email protected]>; Ellie Reeves
> > <[email protected]>
> > Subject: [EXT] Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown
> > handler
> >
> > External Email
> >
> > ----------------------------------------------------------------------
> > Brian Norris <[email protected]> writes:
> >
> > >> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
> > >> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
> > >> @@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev
> > *pdev)
> > >> mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
> > >> }
> > >>
> > >> + cancel_work_sync(&card->work);
> > >> +
> > >
> > > Just FYI, this "fix" is not a real fix. It will likely paper over some
> > > of your bugs (where, e.g., the FW shutdown command times out in the
> > > previous couple of lines), but this highlights the fact that there are
> > > other races that could trigger the same behavior. You're not fixing
> > > those.
> > >
> > > For example, what if somebody initiates a scan or other nl80211
> > > command between the above line and mwifiex_remove_card()? That
> > command
> > > could potentially time out too.
> > >
>
> The hardware status have been reset before downloading the last
> command(FUNC SHUTDOWN), in this way, follow commands download will be
> ignored and warned.

Hmm, I suppose that's true. So the race I'm talking about probably can't
happen usually. What about in manufacturing mode or !FIRMWARE_READY_PCIE
though? Those cases don't shut down the firmware. Can we still have
outstanding timeouts in those cases?

Anyway, I still think there's a problem though, and this patch is just
going to make things worse. See below.

> > > The proper fix would be to institute some kind of mutual exclusion
> > > (locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
> > > that they can't occur at the same time.
> > >
>
> I am not sure whether there is any mutual exclusion protect between pcie_reset and pcie_remove in pcie core.
> But it looks a different race.
> We still need this fix, right?

Good point. Previously, there wasn't any such exclusion, and that's why
races like the above were even more likely. But as of 4.13, now there
*is* exclusion. See commit b014e96d1abb ("PCI: Protect
pci_error_handlers->reset_notify() usage with device_lock()"). That
incidentally means that you're creating a deadlock with this patch! [1]

If we start a timeout/reset sequence in mwifiex_init_shutdown_fw()
(called from remove()), then you'll eventually have pci_reset_function()
waiting on the device lock, but mwifiex_pcie_remove() will be holding
the device lock already, and now (with your patch), remove() will be
waiting on the worker thread to finish pci_reset_function()...deadlock!

I actually think that the above patch (adding device_lock()) resolves
most of the race but introduces a possible deadlock. I believe an easy
solution is just to switch to pci_try_reset_function() instead? That
will just abort a reset attempt if we're in the middle of removing the
device. Problem solved? Diff appended, but I'll send out a real version
if that looks right. Can you test your original problem with the above
commit from Christopher, as well as the appended diff?

> Regards,
> Simon
> > > Unfortunately, I only paid attention to this after Kalle already
> > > applied this patch. Personally, I'd prefer this patch not get applied,
> > > since it's a bad solution to an obvious problem, which instead leaves
> > > a subtle problem that perhaps no one will bother fixing.
> >
> > I can revert it, that's not a problem. Can I use the text below as explanation for
> > the revert?
> >
> > ----------------------------------------------------------------------
> > Brian Norris <[email protected]> says:
> >
> > Just FYI, this "fix" is not a real fix. It will likely paper over some of your bugs
> > (where, e.g., the FW shutdown command times out in the previous couple of
> > lines), but this highlights the fact that there are other races that could trigger
> > the same behavior. You're not fixing those.
> >
> > For example, what if somebody initiates a scan or other nl80211 command
> > between the above line and mwifiex_remove_card()? That command could
> > potentially time out too.
> >
> > The proper fix would be to institute some kind of mutual exclusion
> > (locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so that
> > they can't occur at the same time.
> >
> > ----------------------------------------------------------------------

Hi Kalle, thanks for the above. With the new information from above, I
think that there's a more accurate description, like the following:

---
Brian Norris <[email protected]> says:

The "fix" in question might not actually fix all related problems,
and it also looks like it can cause a deadlock. Since commit
b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage
with device_lock()"), the race in question is actually resolved (PCIe
reset cannot happen at the same time as remove()). Instead, this "fix"
just introduces a deadlock where mwifiex_pcie_card_reset_work() is
waiting on device_lock, which is held by PCIe device remove(), which is
waiting on...mwifiex_pcie_card_reset_work().

The proper thing to do is just to fix the deadlock. Patch for this will
come separately.
---

Brian

[1] Technically, the deadlock is already there, since
mwifiex_remove_card() eventually calls pcie.c's ->cleanup_if(), which
also calls cancel_work_sync(). But your patch doesn't help...

Untested diff:

diff --git a/drivers/net/wireless/marvell/mwifiex/pcie.c b/drivers/net/wireless/marvell/mwifiex/pcie.c
index cd314946452c..5da3d6ccf5f2 100644
--- a/drivers/net/wireless/marvell/mwifiex/pcie.c
+++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
@@ -2781,7 +2781,7 @@ static void mwifiex_pcie_card_reset_work(struct mwifiex_adapter *adapter)
{
struct pcie_service_card *card = adapter->card;

- pci_reset_function(card->dev);
+ pci_try_reset_function(card->dev);
}

static void mwifiex_pcie_work(struct work_struct *work)

2018-01-09 11:42:28

by Xinming Hu

[permalink] [raw]
Subject: Re: Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler

SGksDQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogS2FsbGUgVmFsbyBb
bWFpbHRvOmt2YWxvQGNvZGVhdXJvcmEub3JnXQ0KPiBTZW50OiAyMDE4xOox1MI5yNUgMTU6NDAN
Cj4gVG86IEJyaWFuIE5vcnJpcyA8YnJpYW5ub3JyaXNAY2hyb21pdW0ub3JnPg0KPiBDYzogWGlu
bWluZyBIdSA8aHV4bUBtYXJ2ZWxsLmNvbT47IExpbnV4IFdpcmVsZXNzDQo+IDxsaW51eC13aXJl
bGVzc0B2Z2VyLmtlcm5lbC5vcmc+OyBEbWl0cnkgVG9yb2tob3YgPGR0b3JAZ29vZ2xlLmNvbT47
DQo+IHJhamF0amFAZ29vZ2xlLmNvbTsgWmhpeXVhbiBZYW5nIDx5YW5nenlAbWFydmVsbC5jb20+
OyBUaW0gU29uZw0KPiA8c29uZ3Rhb0BtYXJ2ZWxsLmNvbT47IENhdGh5IEx1byA8Y2x1b0BtYXJ2
ZWxsLmNvbT47IEphbWVzIENhbw0KPiA8amNhb0BtYXJ2ZWxsLmNvbT47IEdhbmFwYXRoaSBCaGF0
IDxnYmhhdEBtYXJ2ZWxsLmNvbT47IEVsbGllIFJlZXZlcw0KPiA8ZWxsaWVyZXZ2ZXNAZ21haWwu
Y29tPg0KPiBTdWJqZWN0OiBbRVhUXSBSZTogW1BBVENIXSBtd2lmaWV4OiBjYW5jZWwgcGNpZS9z
ZGlvIHdvcmsgaW4gcmVtb3ZlL3NodXRkb3duDQo+IGhhbmRsZXINCj4gDQo+IEV4dGVybmFsIEVt
YWlsDQo+IA0KPiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQo+IEJyaWFuIE5vcnJpcyA8YnJpYW5ub3JyaXNAY2hy
b21pdW0ub3JnPiB3cml0ZXM6DQo+IA0KPiA+PiAtLS0gYS9kcml2ZXJzL25ldC93aXJlbGVzcy9t
YXJ2ZWxsL213aWZpZXgvcGNpZS5jDQo+ID4+ICsrKyBiL2RyaXZlcnMvbmV0L3dpcmVsZXNzL21h
cnZlbGwvbXdpZmlleC9wY2llLmMNCj4gPj4gQEAgLTMxMCw2ICszMTAsOCBAQCBzdGF0aWMgdm9p
ZCBtd2lmaWV4X3BjaWVfcmVtb3ZlKHN0cnVjdCBwY2lfZGV2DQo+ICpwZGV2KQ0KPiA+PiAgCQlt
d2lmaWV4X2luaXRfc2h1dGRvd25fZncocHJpdiwgTVdJRklFWF9GVU5DX1NIVVRET1dOKTsNCj4g
Pj4gIAl9DQo+ID4+DQo+ID4+ICsJY2FuY2VsX3dvcmtfc3luYygmY2FyZC0+d29yayk7DQo+ID4+
ICsNCj4gPg0KPiA+IEp1c3QgRllJLCB0aGlzICJmaXgiIGlzIG5vdCBhIHJlYWwgZml4LiBJdCB3
aWxsIGxpa2VseSBwYXBlciBvdmVyIHNvbWUNCj4gPiBvZiB5b3VyIGJ1Z3MgKHdoZXJlLCBlLmcu
LCB0aGUgRlcgc2h1dGRvd24gY29tbWFuZCB0aW1lcyBvdXQgaW4gdGhlDQo+ID4gcHJldmlvdXMg
Y291cGxlIG9mIGxpbmVzKSwgYnV0IHRoaXMgaGlnaGxpZ2h0cyB0aGUgZmFjdCB0aGF0IHRoZXJl
IGFyZQ0KPiA+IG90aGVyIHJhY2VzIHRoYXQgY291bGQgdHJpZ2dlciB0aGUgc2FtZSBiZWhhdmlv
ci4gWW91J3JlIG5vdCBmaXhpbmcNCj4gPiB0aG9zZS4NCj4gPg0KPiA+IEZvciBleGFtcGxlLCB3
aGF0IGlmIHNvbWVib2R5IGluaXRpYXRlcyBhIHNjYW4gb3Igb3RoZXIgbmw4MDIxMQ0KPiA+IGNv
bW1hbmQgYmV0d2VlbiB0aGUgYWJvdmUgbGluZSBhbmQgbXdpZmlleF9yZW1vdmVfY2FyZCgpPyBU
aGF0DQo+IGNvbW1hbmQNCj4gPiBjb3VsZCBwb3RlbnRpYWxseSB0aW1lIG91dCB0b28uDQo+ID4N
Cg0KVGhlIGhhcmR3YXJlIHN0YXR1cyBoYXZlIGJlZW4gcmVzZXQgYmVmb3JlIGRvd25sb2FkaW5n
IHRoZSBsYXN0IGNvbW1hbmQoRlVOQyBTSFVURE9XTiksIGluIHRoaXMgd2F5LCBmb2xsb3cgY29t
bWFuZHMgIGRvd25sb2FkIHdpbGwgYmUgaWdub3JlZCBhbmQgd2FybmVkLg0KDQo+ID4gVGhlIHBy
b3BlciBmaXggd291bGQgYmUgdG8gaW5zdGl0dXRlIHNvbWUga2luZCBvZiBtdXR1YWwgZXhjbHVz
aW9uDQo+ID4gKGxvY2tpbmcpIGJldHdlZW4gbXdpZmlleF9zaHV0ZG93bl9zdygpIGFuZCBtd2lm
aWV4X3JlbW92ZV9jYXJkKCksIHNvDQo+ID4gdGhhdCB0aGV5IGNhbid0IG9jY3VyIGF0IHRoZSBz
YW1lIHRpbWUuDQo+ID4NCg0KSSBhbSBub3Qgc3VyZSB3aGV0aGVyIHRoZXJlIGlzIGFueSBtdXR1
YWwgZXhjbHVzaW9uIHByb3RlY3QgYmV0d2VlbiBwY2llX3Jlc2V0IGFuZCBwY2llX3JlbW92ZSBp
biBwY2llIGNvcmUuDQpCdXQgaXQgbG9va3MgYSBkaWZmZXJlbnQgcmFjZS4NCldlIHN0aWxsIG5l
ZWQgdGhpcyBmaXgsIHJpZ2h0Pw0KDQpSZWdhcmRzLA0KU2ltb24NCj4gPiBVbmZvcnR1bmF0ZWx5
LCBJIG9ubHkgcGFpZCBhdHRlbnRpb24gdG8gdGhpcyBhZnRlciBLYWxsZSBhbHJlYWR5DQo+ID4g
YXBwbGllZCB0aGlzIHBhdGNoLiBQZXJzb25hbGx5LCBJJ2QgcHJlZmVyIHRoaXMgcGF0Y2ggbm90
IGdldCBhcHBsaWVkLA0KPiA+IHNpbmNlIGl0J3MgYSBiYWQgc29sdXRpb24gdG8gYW4gb2J2aW91
cyBwcm9ibGVtLCB3aGljaCBpbnN0ZWFkIGxlYXZlcw0KPiA+IGEgc3VidGxlIHByb2JsZW0gdGhh
dCBwZXJoYXBzIG5vIG9uZSB3aWxsIGJvdGhlciBmaXhpbmcuDQo+IA0KPiBJIGNhbiByZXZlcnQg
aXQsIHRoYXQncyBub3QgYSBwcm9ibGVtLiBDYW4gSSB1c2UgdGhlIHRleHQgYmVsb3cgYXMgZXhw
bGFuYXRpb24gZm9yDQo+IHRoZSByZXZlcnQ/DQo+IA0KPiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQo+IEJyaWFu
IE5vcnJpcyA8YnJpYW5ub3JyaXNAY2hyb21pdW0ub3JnPiBzYXlzOg0KPiANCj4gSnVzdCBGWUks
IHRoaXMgImZpeCIgaXMgbm90IGEgcmVhbCBmaXguIEl0IHdpbGwgbGlrZWx5IHBhcGVyIG92ZXIg
c29tZSBvZiB5b3VyIGJ1Z3MNCj4gKHdoZXJlLCBlLmcuLCB0aGUgRlcgc2h1dGRvd24gY29tbWFu
ZCB0aW1lcyBvdXQgaW4gdGhlIHByZXZpb3VzIGNvdXBsZSBvZg0KPiBsaW5lcyksIGJ1dCB0aGlz
IGhpZ2hsaWdodHMgdGhlIGZhY3QgdGhhdCB0aGVyZSBhcmUgb3RoZXIgcmFjZXMgdGhhdCBjb3Vs
ZCB0cmlnZ2VyDQo+IHRoZSBzYW1lIGJlaGF2aW9yLiBZb3UncmUgbm90IGZpeGluZyB0aG9zZS4N
Cj4gDQo+IEZvciBleGFtcGxlLCB3aGF0IGlmIHNvbWVib2R5IGluaXRpYXRlcyBhIHNjYW4gb3Ig
b3RoZXIgbmw4MDIxMSBjb21tYW5kDQo+IGJldHdlZW4gdGhlIGFib3ZlIGxpbmUgYW5kIG13aWZp
ZXhfcmVtb3ZlX2NhcmQoKT8gVGhhdCBjb21tYW5kIGNvdWxkDQo+IHBvdGVudGlhbGx5IHRpbWUg
b3V0IHRvby4NCj4gDQo+IFRoZSBwcm9wZXIgZml4IHdvdWxkIGJlIHRvIGluc3RpdHV0ZSBzb21l
IGtpbmQgb2YgbXV0dWFsIGV4Y2x1c2lvbg0KPiAobG9ja2luZykgYmV0d2VlbiBtd2lmaWV4X3No
dXRkb3duX3N3KCkgYW5kIG13aWZpZXhfcmVtb3ZlX2NhcmQoKSwgc28gdGhhdA0KPiB0aGV5IGNh
bid0IG9jY3VyIGF0IHRoZSBzYW1lIHRpbWUuDQo+IA0KPiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tDQo+IA0KPiAt
LQ0KPiBLYWxsZSBWYWxvDQo=

2018-01-09 07:39:45

by Kalle Valo

[permalink] [raw]
Subject: Re: [PATCH] mwifiex: cancel pcie/sdio work in remove/shutdown handler

Brian Norris <[email protected]> writes:

>> --- a/drivers/net/wireless/marvell/mwifiex/pcie.c
>> +++ b/drivers/net/wireless/marvell/mwifiex/pcie.c
>> @@ -310,6 +310,8 @@ static void mwifiex_pcie_remove(struct pci_dev *pdev)
>> mwifiex_init_shutdown_fw(priv, MWIFIEX_FUNC_SHUTDOWN);
>> }
>>
>> + cancel_work_sync(&card->work);
>> +
>
> Just FYI, this "fix" is not a real fix. It will likely paper over some
> of your bugs (where, e.g., the FW shutdown command times out in the
> previous couple of lines), but this highlights the fact that there are
> other races that could trigger the same behavior. You're not fixing
> those.
>
> For example, what if somebody initiates a scan or other nl80211 command
> between the above line and mwifiex_remove_card()? That command could
> potentially time out too.
>
> The proper fix would be to institute some kind of mutual exclusion
> (locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
> that they can't occur at the same time.
>
> Unfortunately, I only paid attention to this after Kalle already applied
> this patch. Personally, I'd prefer this patch not get applied, since
> it's a bad solution to an obvious problem, which instead leaves a subtle
> problem that perhaps no one will bother fixing.

I can revert it, that's not a problem. Can I use the text below as
explanation for the revert?

----------------------------------------------------------------------
Brian Norris <[email protected]> says:

Just FYI, this "fix" is not a real fix. It will likely paper over some
of your bugs (where, e.g., the FW shutdown command times out in the
previous couple of lines), but this highlights the fact that there are
other races that could trigger the same behavior. You're not fixing
those.

For example, what if somebody initiates a scan or other nl80211 command
between the above line and mwifiex_remove_card()? That command could
potentially time out too.

The proper fix would be to institute some kind of mutual exclusion
(locking) between mwifiex_shutdown_sw() and mwifiex_remove_card(), so
that they can't occur at the same time.

----------------------------------------------------------------------

--
Kalle Valo

2018-01-08 17:38:33

by Kalle Valo

[permalink] [raw]
Subject: Re: mwifiex: cancel pcie/sdio work in remove/shutdown handler

Xinming Hu <[email protected]> wrote:

> The last command used to shutdown firmware might be timeout,
> and trigger firmware dump in asynchronous pcie/sdio work.
>
> The remove/shutdown handler will continue free core data
> structure private/adapter, which might be dereferenced in
> pcie/sdio work, finally crash the kernel.
>
> Sync and Cancel pcie/sdio work, could be a fix for above
> cornel case. In this way, the last command timeout could
> be handled properly.
>
> Signed-off-by: Xinming Hu <[email protected]>

Patch applied to wireless-drivers-next.git, thanks.

b713bbf1471b mwifiex: cancel pcie/sdio work in remove/shutdown handler

--
https://patchwork.kernel.org/patch/10109773/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches