2018-06-15 11:37:14

by McGinn, Dan

Subject: [BUG] mac80211: Using smp_processor_id() in preemptible code: iwd

Hi, I'm newly trying out Intel iwd daemon but I experience regular kernel errors in 4.17, although WPA2-PSK connection remains stable. These errors don't seem to be experienced with wpa_supplicant. The errors reliably appear around the following events:
netdev_unicast_notify()
netdev_control_port_frame_event()
netdev_set_rekey_offload()
netdev_set_gtk()

@Denkenz in IRC helpfully suggests Johannes could follow the finger of suspicion to this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=911806491425d79107cadddbde11b42bbdfe38c8

dmesg:
BUG: using smp_processor_id() in preemptible [00000000] code: iwd/517
caller is __ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
CPU: 9 PID: 517 Comm: iwd Tainted: P O 4.17.0-1-custom #1
Hardware name: BIOS 05/14/2018
Call Trace:
dump_stack+0x5c/0x80
check_preemption_disabled.cold.0+0x46/0x51
__ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
ieee80211_tx_control_port+0x116/0x140 [mac80211]
nl80211_tx_control_port+0x13c/0x270 [cfg80211]
genl_family_rcv_msg+0x1c4/0x3a0
? _raw_spin_lock_irqsave+0x25/0x50
? _raw_spin_unlock_irqrestore+0x20/0x40
? ep_poll_callback+0x212/0x290
genl_rcv_msg+0x47/0x90
? __kmalloc_node_track_caller+0x210/0x2b0
? genl_family_rcv_msg+0x3a0/0x3a0
netlink_rcv_skb+0x4c/0x120
genl_rcv+0x24/0x40
netlink_unicast+0x196/0x240
netlink_sendmsg+0x1fd/0x3c0
sock_sendmsg+0x33/0x40
__sys_sendto+0xee/0x160
? do_epoll_wait+0xb0/0xd0
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x5b/0x170
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f94f41c98cd
RSP: 002b:00007ffd224eab48 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000557648053250 RCX: 00007f94f41c98cd
RDX: 0000000000000098 RSI: 00005576480615e0 RDI: 0000000000000006
RBP: 0000557648062560 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd224eabb0
R13: 00007ffd224eabac R14: 0000000000000000 R15: 0000000000000000

Kernel: 4.17 mainline with iwlwifi patched for recent Intel9560/Killer1552 hardware VID/PIDs:
https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git/commit/?id=a3ef483ec5002b7af5a2ad04cb7a77366cd23b9f
uCode: iwlwifi-9000-pu-b0-jf-b0-38

Tried iwd with both the 0.2 tag and master, with the same result.

Appears minor, but please let me know if more info is required tracking this down.


2018-06-20 03:05:00

by Denis Kenzior

Subject: Re: [BUG] mac80211: Using smp_processor_id() in preemptible code: iwd

Hi Johannes,

> On Jun 15, 2018, at 7:30 AM, Johannes Berg <[email protected]> wrote:
>
> On Fri, 2018-06-15 at 11:09 +0000, McGinn, Dan wrote:
>> Hi, I'm newly trying out Intel iwd daemon but I experience regular kernel errors in 4.17, although WPA2-PSK connection remains stable. These errors don't seem to be experienced with wpa_supplicant. The errors reliably appear around the following events:
>> netdev_unicast_notify()
>> netdev_control_port_frame_event()
>> netdev_set_rekey_offload()
>> netdev_set_gtk()
>>
>> @Denkenz in IRC helpfully suggests Johannes could follow the finger of suspicion to this commit:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=911806491425d79107cadddbde11b42bbdfe38c8
>
> It's his code ;-)

Right, but you’re much more aware of all the locking issues than I am.

>
> Clearly this comes from cfg80211 without any locking other than rtnl, so
> you don't have preemption disabled. That's the minimum needed to get rid
> of the warning you found.

In my defense, I did ask you whether there are any potential locking
issues in the RFC and you didn’t think there were any.

I posted a fix for this. Could you please review?

Regards,
-Denis

2018-06-15 12:30:46

by Johannes Berg

Subject: Re: [BUG] mac80211: Using smp_processor_id() in preemptible code: iwd

On Fri, 2018-06-15 at 11:09 +0000, McGinn, Dan wrote:
> Hi, I'm newly trying out Intel iwd daemon but I experience regular kernel errors in 4.17, although WPA2-PSK connection remains stable. These errors don't seem to be experienced with wpa_supplicant. The errors reliably appear around the following events:
> netdev_unicast_notify()
> netdev_control_port_frame_event()
> netdev_set_rekey_offload()
> netdev_set_gtk()
>
> @Denkenz in IRC helpfully suggests Johannes could follow the finger of suspicion to this commit:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=911806491425d79107cadddbde11b42bbdfe38c8

It's his code ;-)

Clearly this comes from cfg80211 without any locking other than rtnl, so
you don't have preemption disabled. That's the minimum needed to get rid
of the warning you found.
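
To illustrate the point, here is a rough userspace model in C, not kernel
code and not the fix that was posted. Every model_* name is hypothetical:
the counter stands in for the kernel's preempt count, and
model_smp_processor_id() imitates the debug check that produced the splat.
It only sketches why wrapping the transmit call in a preempt-disabled
section silences the warning.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: these stand in for the kernel's preempt count
 * and helpers. All names are hypothetical. */
static int model_preempt_count;	/* 0 means "preemptible" */
static int model_warnings;	/* number of simulated BUG splats */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Imitates the debug check: complain when the caller could be moved
 * to another CPU while it is still using the returned CPU number. */
static int model_smp_processor_id(void)
{
	if (model_preempt_count == 0) {
		model_warnings++;
		fprintf(stderr,
			"BUG: using smp_processor_id() in preemptible code\n");
	}
	return 0;	/* single simulated CPU */
}

/* Stand-in for the transmit path, which asks for the current CPU. */
static void model_subif_start_xmit(void)
{
	(void)model_smp_processor_id();
}

/* Process-context caller as in the trace: preemption is still enabled. */
static void model_tx_control_port_unprotected(void)
{
	model_subif_start_xmit();
}

/* Same caller with preemption disabled around the transmit call. */
static void model_tx_control_port_protected(void)
{
	model_preempt_disable();
	model_subif_start_xmit();
	model_preempt_enable();
}
```

Calling model_tx_control_port_unprotected() bumps model_warnings, while
model_tx_control_port_protected() leaves it unchanged. Whether the real
path needs more than disabled preemption is a separate question.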

I was thinking this is also wrong because of locking assumptions, but I
don't see that in the code now, so I guess it's fine.

johannes

2018-06-20 08:45:19

by Johannes Berg

Subject: Re: [BUG] mac80211: Using smp_processor_id() in preemptible code: iwd

On Tue, 2018-06-19 at 22:04 -0500, Denis Kenzior wrote:
>
> Right, but you’re much more aware of all the locking issues than I am.

:-)

> > Clearly this comes from cfg80211 without any locking other than rtnl, so
> > you don't have preemption disabled. That's the minimum needed to get rid
> > of the warning you found.
>
> In my defense, I did ask you whether there are any potential locking
> issues in the RFC and you didn’t think there were any.

Yep, I missed that too. More precisely, ISTR actually thinking about it
and deciding it was fine, so ... my bad for sure.

> I posted a fix for this. Could you please review?

I think it's fine. I'll make a pass later today/this week and send
patches upstream.

johannes