Hi,
On 21/05/2024 14:15, Remi Pommarel wrote:
> On Tue, May 21, 2024 at 09:43:56AM +0200, Antonio Quartulli wrote:
>> Hi,
>>
>> On 18/05/2024 17:50, Remi Pommarel wrote:
>>> Wiphy should be locked before calling rdev_get_station() (see lockdep
>>> assert in ieee80211_get_station()).
>>
>> Adding the lock is fine as nowadays it is taken in pre_doit and released in
>> post_doit (with some exceptions). Therefore when invoking .get_station from
>> a side path the lock should be taken too.
>>
>> It was actually a05829a7222e9d10c416dd2dbbf3929fe6646b89 that introduced
>> this requirement AFAICS.
>
> IIUC lock requirement was already there before this commit, only it was on
> rtnl lock to be taken instead of wiphy one.
You're right.
>
>>
>>>
>>> This fixes the following kernel NULL dereference:
>>
>> As already said by Johannes, I am not sure it truly fixes this NULL
>> dereference though.
>>
>> Have you checked where in ath10k_sta_statistics this is exactly happening?
>> Do you think some sta was partly released and thus fields were NULLified?
>
> ath10k_sta_statistics+0x10 is associated with the arsta->arvif->ar
> statement. It crashes because arsta->arvif is NULL.
>
> Here is a scenario that explains the crash where the same sta
> disconnects then reconnects quickly (e.g. OPEN network):
>
>
> CPU0: CPU1:
>
> batadv_v_elp_periodic_work()
> queue_work(batadv_v_elp_get_throughput)
>
> ieee80211_del_station()
> ieee80211_add_station()
> sta_info_insert()
> list_add_tail_rcu()
> ath10k_sta_state()
> memset(arsta, 0, sizeof(arsta))
> batadv_v_elp_get_throughput()
> cfg80211_get_station()
> ieee80211_get_station() <-- Find sta with its MAC in list
> ath10k_sta_statistics()
> arsta->arvif->ar <-- arsta is still zeroed
>
>
> In other words if the same sta has enough time to disconnect then
> reconnect before batadv_v_elp_get_throughput get scheduled, sta could be
> partially initialized when ath10k_sta_statistics() is called. Locking
> the wiphy ensure sta is fully initialized if found.
>
We were just wondering how you could get the arvif being NULL and your
explanation makes sense.
This said, holding the lock is required when invoking get_station via
netlink, therefore it's meaningful to hold it even when side invoking it
from another module.
Acked-by: Antonio Quartulli <[email protected]>
--
Antonio Quartulli