2021-07-31 07:06:55

by Chris Murphy

Subject: 5.14-rc3 lockdep warning, iwlwifi 9560

[ 8.353133] iwlwifi 0000:00:14.3: loaded firmware version
46.6b541b68.0 9000-pu-b0-jf-b0-46.ucode op_mode iwlmvm
...
[ 8.737865] iwlwifi 0000:00:14.3: Detected Intel(R) Wireless-AC
9560 160MHz, REV=0x318
...
[ 17.384143] ======================================================
[ 17.384144] WARNING: possible circular locking dependency detected
[ 17.384146] 5.14.0-0.rc3.29.fc35.x86_64+debug #1 Not tainted
[ 17.384147] ------------------------------------------------------
[ 17.384148] kworker/u16:8/554 is trying to acquire lock:
[ 17.384149] ffff91e2af7d80e8 (&sta->rate_ctrl_lock){+.-.}-{2:2},
at: rate_control_get_rate+0xc0/0x140 [mac80211]
[ 17.384197]
but task is already holding lock:
[ 17.384198] ffff91e2af7d8130 (&sta->lock){+.-.}-{2:2}, at:
ieee80211_stop_tx_ba_cb+0x2e/0x1e0 [mac80211]
[ 17.384229]

dmesg:
https://drive.google.com/file/d/1A5aZNMf_qXFU6arthGyu_FEIYV08V7wv/view?usp=sharing

lspci -vvnn
00:14.3 Network controller [0280]: Intel Corporation Cannon Point-LP
CNVi [Wireless-AC] [8086:9df0] (rev 11)
Subsystem: Intel Corporation Device [8086:0030]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
IOMMU group: 6
Region 0: Memory at ea238000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [c8] Power Management version 3
Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE- FLReset+
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp- 10BitTagReq- OBFF Via WAKE#, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis- LTR+
OBFF Disabled,
AtomicOpsCtl: ReqEn-
Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [100 v0] Null
Capabilities: [14c v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [164 v1] Vendor Specific Information: ID=0010 Rev=0
Len=014 <?>
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi




--
Chris Murphy


2021-08-20 09:37:52

by Johannes Berg

Subject: Re: 5.14-rc3 lockdep warning, iwlwifi 9560

Hi Chris,

Sorry for the late reply.

> [ 17.384143] ======================================================
> [ 17.384144] WARNING: possible circular locking dependency detected
> [ 17.384146] 5.14.0-0.rc3.29.fc35.x86_64+debug #1 Not tainted
> [ 17.384147] ------------------------------------------------------
> [ 17.384148] kworker/u16:8/554 is trying to acquire lock:
> [ 17.384149] ffff91e2af7d80e8 (&sta->rate_ctrl_lock){+.-.}-{2:2},
> at: rate_control_get_rate+0xc0/0x140 [mac80211]
> [ 17.384197]
>                but task is already holding lock:
> [ 17.384198] ffff91e2af7d8130 (&sta->lock){+.-.}-{2:2}, at:
> ieee80211_stop_tx_ba_cb+0x2e/0x1e0 [mac80211]
> [ 17.384229]
>
> dmesg:
> https://drive.google.com/file/d/1A5aZNMf_qXFU6arthGyu_FEIYV08V7wv/view?usp=sharing

I can confirm this is a problem. Basically, we have:

aggregation start:
* take sta's rate lock
* call driver's rate control
* that decides to start aggregation
* calls ieee80211_start_tx_ba_session()
* takes sta lock

aggregation stop:
* ieee80211_stop_tx_ba_cb() is called
* takes sta lock
* releases TXQ from aggregation-related stop by
calling ieee80211_agg_start_txq()
* calls schedule_and_wake_txq()
* calls driver
* driver retrieves frames from the TXQ, which calls rate control
* takes sta's rate lock
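
The two paths above take the same pair of locks in opposite order, which
is the classic ABBA pattern. As a rough illustration only, here is a toy
model of that ordering check. It is a sketch, not lockdep's actual
algorithm and not mac80211 code: the LockOrderTracker class and its
methods are hypothetical names invented for this example, and only the
two lock names come from the warning.

```python
# Toy model of a lock-ordering check. Hypothetical helper, not kernel code.
from collections import defaultdict

RATE = "&sta->rate_ctrl_lock"   # lock names as printed in the warning
STA = "&sta->lock"


class LockOrderTracker:
    def __init__(self):
        # after[a] = set of locks that were acquired while a was held
        self.after = defaultdict(set)
        self.held = []

    def acquire(self, name):
        """Record taking 'name'; return a cycle as a list, or None."""
        cycle = None
        for h in self.held:
            # If 'name' already leads back to 'h', the new edge h -> name
            # closes a loop in the ordering graph.
            path = self._path(name, h, set())
            if path and cycle is None:
                cycle = [h] + path
            self.after[h].add(name)
        self.held.append(name)
        return cycle

    def release(self, name):
        self.held.remove(name)

    def _path(self, src, dst, seen):
        """Depth-first search for a recorded ordering chain src ... dst."""
        if src == dst:
            return [src]
        seen.add(src)
        for nxt in self.after.get(src, ()):
            if nxt not in seen:
                rest = self._path(nxt, dst, seen)
                if rest:
                    return [src] + rest
        return None


tracker = LockOrderTracker()

# Aggregation start: rate lock first, then sta lock
# (rate control ends up in ieee80211_start_tx_ba_session()).
first = tracker.acquire(RATE)
second = tracker.acquire(STA)
tracker.release(STA)
tracker.release(RATE)

# Aggregation stop: sta lock first (ieee80211_stop_tx_ba_cb()),
# then the rate lock via the driver's TX path.
tracker.acquire(STA)
cycle = tracker.acquire(RATE)

print(" => ".join(cycle))
```

In this model, the second path is the one that gets flagged, because the
first path already recorded the opposite order. That matches the warning,
where the task holds &sta->lock and then tries to take
&sta->rate_ctrl_lock.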


If you could, I'd appreciate it if you filed a bug report on
bugzilla.kernel.org (see also
https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging#how_to_report
for the product/component).

It's not obvious to me immediately how to fix it, so I'd prefer to track
it as a bug report.

johannes

2021-08-23 13:07:56

by Johannes Berg

Subject: Re: 5.14-rc3 lockdep warning, iwlwifi 9560

On Sat, 2021-08-21 at 00:22 -0600, Chris Murphy wrote:
> Hi Johannes,
>
> Filed a bug
> https://bugzilla.kernel.org/show_bug.cgi?id=214123

Thanks, I looked at this. If you want to try a patch, try this:

https://p.sipsolutions.net/d27dfc58efe3313c.txt

johannes