2017-11-21 12:23:23

by Benjamin Beichler

[permalink] [raw]
Subject: [PATCH v2 1/5] mac80211_hwsim: wait for deferred radio deletion on mod unload

When closing multiple wmediumd instances with many radios and try to
unload the mac80211_hwsim module it may happen that the work items live
longer than the module. This patch does not mitigate completely the
problem, since we need to delete hwsim_data struct from delete queue
(since afterwards the reference is not valid anymore) at the beginning of
the work function and it may be interrupted in between. But this problem
only occurs for the last (or only) item of the delete list. We could either
create a dedicated work queue or call flush_scheduled_work, but both
introduce unnecessary overhead.

Signed-off-by: Benjamin Beichler <[email protected]>
---
drivers/net/wireless/mac80211_hwsim.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index ec2f4c31425a..d35dc6b2a733 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -489,6 +489,8 @@ static const struct ieee80211_iface_combination hwsim_if_comb_p2p_dev[] = {

static spinlock_t hwsim_radio_lock;
static LIST_HEAD(hwsim_radios);
+static spinlock_t hwsim_delete_lock;
+static LIST_HEAD(delete_radios);
static int hwsim_radio_idx;

static struct platform_driver mac80211_hwsim_driver = {
@@ -3326,7 +3328,11 @@ static void destroy_radio(struct work_struct *work)
struct mac80211_hwsim_data *data =
container_of(work, struct mac80211_hwsim_data, destroy_work);

+ spin_lock_bh(&hwsim_delete_lock);
+ list_del(&data->list);
+ spin_unlock_bh(&hwsim_delete_lock);
mac80211_hwsim_del_radio(data, wiphy_name(data->hw->wiphy), NULL);
+
}

static void remove_user_radios(u32 portid)
@@ -3337,6 +3343,11 @@ static void remove_user_radios(u32 portid)
list_for_each_entry_safe(entry, tmp, &hwsim_radios, list) {
if (entry->destroy_on_close && entry->portid == portid) {
list_del(&entry->list);
+
+ spin_lock_bh(&hwsim_delete_lock);
+ list_add(&entry->list, &delete_radios);
+ spin_unlock_bh(&hwsim_delete_lock);
+
INIT_WORK(&entry->destroy_work, destroy_radio);
schedule_work(&entry->destroy_work);
}
@@ -3412,6 +3423,11 @@ static void __net_exit hwsim_exit_net(struct net *net)
continue;

list_del(&data->list);
+
+ spin_lock_bh(&hwsim_delete_lock);
+ list_add(&data->list, &delete_radios);
+ spin_unlock_bh(&hwsim_delete_lock);
+
INIT_WORK(&data->destroy_work, destroy_radio);
schedule_work(&data->destroy_work);
}
@@ -3444,6 +3460,7 @@ static int __init init_mac80211_hwsim(void)
return -EINVAL;

spin_lock_init(&hwsim_radio_lock);
+ spin_lock_init(&hwsim_delete_lock);

err = register_pernet_device(&hwsim_net_ops);
if (err)
@@ -3583,6 +3600,19 @@ static void __exit exit_mac80211_hwsim(void)
hwsim_exit_netlink();

mac80211_hwsim_free();
+
+ /*wait for radios with deferred delete*/
+ spin_lock_bh(&hwsim_delete_lock);
+ while (!list_empty(&delete_radios)) {
+ pr_debug("mac80211_hwsim: wait for deferred radio remove\n");
+ spin_unlock_bh(&hwsim_delete_lock);
+ flush_work(&list_entry(&delete_radios,
+ struct mac80211_hwsim_data, list)
+ ->destroy_work);
+ spin_lock_bh(&hwsim_delete_lock);
+ }
+ spin_unlock_bh(&hwsim_delete_lock);
+
unregister_netdev(hwsim_mon);
platform_driver_unregister(&mac80211_hwsim_driver);
unregister_pernet_device(&hwsim_net_ops);
--
2.15.0


2017-12-11 12:53:43

by Benjamin Beichler

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] mac80211_hwsim: wait for deferred radio deletion on mod unload

Am 11.12.2017 um 12:46 schrieb Johannes Berg:
>
>> + spin_lock_bh(&hwsim_delete_lock);
>> + while (!list_empty(&delete_radios)) {
>> + pr_debug("mac80211_hwsim: wait for deferred radio remove\n");
>> + spin_unlock_bh(&hwsim_delete_lock);
>> + flush_work(&list_entry(&delete_radios,
>> + struct mac80211_hwsim_data, list)
>> + ->destroy_work);
> This can't possibly be right ... you're locking the list_empty which is=

> a trivial pointer comparison, but not the actual list_entry() ...
Maybe the first spin_lock is not needed, but since flush_work wait for
the completion of the task, which at the end also deletes the item from
the list, holding any spin_lock would be wrong.
>
> I'd also prefer you actually didn't leave the problem in part as you
> describe - and a new workqueue probably isn't that much overhead and
> should introduce *less* new code than this, so IMHO that's worth it.
I totally agree with you, but I don't know the actual policy for
creating workqeues. I could also simply call flush_scheduled_work,
because we may have enough time at module unload. Some modules also do
it, but the description an some mailing threads mark it like
evil/deprecated. Which solution do you prefer?

kind regards

Benjamin

--=20
M.Sc. Benjamin Beichler

Universit=C3=A4t Rostock, Fakult=C3=A4t f=C3=BCr Informatik und Elektrote=
chnik
Institut f=C3=BCr Angewandte Mikroelektronik und Datentechnik

University of Rostock, Department of CS and EE
Institute of Applied Microelectronics and CE

Richard-Wagner-Stra=C3=9Fe 31
18119 Rostock
Deutschland/Germany

phone: +49 (0) 381 498 - 7278
email: [email protected]
www: http://www.imd.uni-rostock.de/

2017-12-11 11:46:13

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] mac80211_hwsim: wait for deferred radio deletion on mod unload

On Tue, 2017-11-21 at 13:17 +0100, Benjamin Beichler wrote:
>
> + /*wait for radios with deferred delete*/

please add spaces there

> + spin_lock_bh(&hwsim_delete_lock);
> + while (!list_empty(&delete_radios)) {
> + pr_debug("mac80211_hwsim: wait for deferred radio remove\n");
> + spin_unlock_bh(&hwsim_delete_lock);
> + flush_work(&list_entry(&delete_radios,
> + struct mac80211_hwsim_data, list)
> + ->destroy_work);

This can't possibly be right ... you're locking the list_empty which is
a trivial pointer comparison, but not the actual list_entry() ...

I'd also prefer you actually didn't leave the problem in part as you
describe - and a new workqueue probably isn't that much overhead and
should introduce *less* new code than this, so IMHO that's worth it.

johannes

2017-12-11 12:57:35

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] mac80211_hwsim: wait for deferred radio deletion on mod unload

On Mon, 2017-12-11 at 13:54 +0100, Benjamin Beichler wrote:
> Am 11.12.2017 um 12:46 schrieb Johannes Berg:
> >
> > > + spin_lock_bh(&hwsim_delete_lock);
> > > + while (!list_empty(&delete_radios)) {
> > > + pr_debug("mac80211_hwsim: wait for deferred radio remove\n");
> > > + spin_unlock_bh(&hwsim_delete_lock);
> > > + flush_work(&list_entry(&delete_radios,
> > > + struct mac80211_hwsim_data, list)
> > > + ->destroy_work);
> >
> > This can't possibly be right ... you're locking the list_empty which is
> > a trivial pointer comparison, but not the actual list_entry() ...
>
> Maybe the first spin_lock is not needed, but since flush_work wait for
> the completion of the task, which at the end also deletes the item from
> the list, holding any spin_lock would be wrong.

But not holding it while taking things that are on the list also seems
wrong.

> > I'd also prefer you actually didn't leave the problem in part as you
> > describe - and a new workqueue probably isn't that much overhead and
> > should introduce *less* new code than this, so IMHO that's worth it.
>
> I totally agree with you, but I don't know the actual policy for
> creating workqeues. I could also simply call flush_scheduled_work,
> because we may have enough time at module unload. Some modules also do
> it, but the description an some mailing threads mark it like
> evil/deprecated. Which solution do you prefer?

Let's go with a new workqueue.

johannes

2017-12-11 13:07:21

by Benjamin Beichler

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] mac80211_hwsim: wait for deferred radio deletion on mod unload

Am 11.12.2017 um 13:57 schrieb Johannes Berg:
> On Mon, 2017-12-11 at 13:54 +0100, Benjamin Beichler wrote:
>> Am 11.12.2017 um 12:46 schrieb Johannes Berg:
>>>> + spin=5Flock=5Fbh(&hwsim=5Fdelete=5Flock);
>>>> + while (!list=5Fempty(&delete=5Fradios)) {
>>>> + pr=5Fdebug("mac80211=5Fhwsim: wait for deferred radio remove\n");
>>>> + spin=5Funlock=5Fbh(&hwsim=5Fdelete=5Flock);
>>>> + flush=5Fwork(&list=5Fentry(&delete=5Fradios,
>>>> + struct mac80211=5Fhwsim=5Fdata, list)
>>>> + ->destroy=5Fwork);
>>> This can't possibly be right ... you're locking the list=5Fempty which is
>>> a trivial pointer comparison, but not the actual list=5Fentry() ...
>> Maybe the first spin=5Flock is not needed, but since flush=5Fwork wait for
>> the completion of the task, which at the end also deletes the item from
>> the list, holding any spin=5Flock would be wrong.
> But not holding it while taking things that are on the list also seems
> wrong.
Since at this place (Netlink is already unregistered so no new deletion
requests) the only user of the delete list is the code deferred work
items (which take itself the lock). Nonetheless, with a separate
workqueue this is no problem anymore.
>
>>> I'd also prefer you actually didn't leave the problem in part as you
>>> describe - and a new workqueue probably isn't that much overhead and
>>> should introduce *less* new code than this, so IMHO that's worth it.
>> I totally agree with you, but I don't know the actual policy for
>> creating workqeues. I could also simply call flush=5Fscheduled=5Fwork,
>> because we may have enough time at module unload. Some modules also do
>> it, but the description an some mailing threads mark it like
>> evil/deprecated. Which solution do you prefer=3F
> Let's go with a new workqueue.
Ok, I prepare a patch in the next few days.

>
> johannes
>

--
M.Sc. Benjamin Beichler

Universit=C3=A4t Rostock, Fakult=C3=A4t f=C3=BCr Informatik und Elektrotechnik
Institut f=C3=BCr Angewandte Mikroelektronik und Datentechnik

University of Rostock, Department of CS and EE
Institute of Applied Microelectronics and CE

Richard-Wagner-Stra=C3=9Fe 31
18119 Rostock
Deutschland/Germany

phone: +49 (0) 381 498 - 7278
email: [email protected]
www: http://www.imd.uni-rostock.de/