2009-10-13 08:56:54

by Holger Schurig

[permalink] [raw]
Subject: cfg80211 / libertas: an unusual race

So far I can connect via cfg80211, and therefore I now want to
disconnect as well.

The code in my driver is:


static int lbs_cfg_deauth(struct wiphy *wiphy, struct net_device *dev,
struct cfg80211_deauth_request *req,
void *cookie)
{
struct lbs_private *priv = wiphy_priv(wiphy);
struct cmd_ds_802_11_deauthenticate cmd;
lbs_deb_enter(LBS_DEB_CFG80211);

memset(&cmd, 0, sizeof(cmd));
cmd.hdr.size = cpu_to_le16(sizeof(cmd));
memcpy(cmd.macaddr, &req->bss->bssid, ETH_ALEN);
cmd.reasoncode = cpu_to_le16(req->reason_code);

__lbs_cmd_async(priv, CMD_802_11_DEAUTHENTICATE,
&cmd.hdr, sizeof(cmd),
lbs_cfg_ret_deauth, 0);

return 0;
}

static int lbs_cfg_ret_deauth(struct lbs_private *priv, unsigned long dummy,
struct cmd_header *resp)
{
struct cmd_ds_802_11_deauthenticate *deauth_resp = (void *)resp;
struct ieee80211_mgmt mgmt;

lbs_deb_enter(LBS_DEB_CFG80211);

/* Fake a management frame */
memset(&mgmt, 0, sizeof(mgmt));
memcpy(mgmt.bssid, deauth_resp->macaddr, ETH_ALEN);
/* Note: .sa / .da swapped */
memcpy(mgmt.da, deauth_resp->macaddr, ETH_ALEN);
memcpy(mgmt.sa, priv->current_addr, ETH_ALEN);
mgmt.u.deauth.reason_code = le16_to_cpu(deauth_resp->reasoncode);
cfg80211_send_deauth(priv->dev, (u8 *)&mgmt, sizeof(mgmt),
(void *)dummy);

lbs_deb_leave(LBS_DEB_CFG80211);
return 0;
}


A reader could *think* this happens:

1. lbs_cfg_deauth() enters
2. lbs_cfg_deauth sends CMD_802_11_DEAUTHENTICATE to firmware
asynchronously via __lbs_cmd_async() and specifies
a lbs_cfg_ret_deauth() as callback
3. a) lbs_cfg_deauth() leaves
3. b) cfg80211 does something related to disconnecting
4. firmware responds with an IRQ
5. libertas get's response and calls our callback
6. lbs_cfg_ret_deauth() enters
7. a) cfg80211_send_deauth() get's called
7. b) cfg80211 does something related to disconnecting

But that is not what happens, and steps "3. b)" and
"7. b)" are executed in this sequence with an ath5k
driver.


In step "2.", __lbs_cmd_async() causes a preemption on my device
(CONFIG_PREEMPT). That changes the sequence of cfg80211
jobs. This is what really happens:

1. lbs_cfg_deauth() enters
2. a) lbs_cfg_deauth sends CMD_802_11_DEAUTHENTICATE to firmware
via __lbs_cmd_async()
2. b) IMPORTANT CHANGE: Linux preempts!
4. firmware responds with an IRQ
5. libertas get's response and calls our callback
6. lbs_cfg_ret_deauth() enters
7. a) cfg80211_send_deauth() get's called
7. b) cfg80211 does something related to disconnecting
7. C) IMPORTANT CHANGE: preemption now schedules the
original task, and thus this happens only now:
3. a) lbs_cfg_deauth() leaves
3. b) cfg80211 does something related to disconnecting

Now at step "7. b)" the function __cfg80211_disconnected()
clears wdev->current_bss to NULL.

But at step "3. b)", which now happens at a later time,
wdev_current_bss is no longer set. This triggers a
WARN_ON(!done) warning in __cfg80211_send_deauth(),
file net/wireless/mlme.c



Would it be O.K. to simply remove the WARN_ON(!done) ?
After all, wdev>current_bss was put'ted and cleared correctly.

--
http://www.holgerschurig.de


2009-10-16 12:05:04

by Holger Schurig

[permalink] [raw]
Subject: Re: cfg80211 / libertas: an unusual race

> I don't see this happening that way -- how are you even getting
to
> __cfg80211_disconnected() if not through _send_deauth()?

See step 7a in the second table. We got there via
_send_deauth().

But later, step 3b in the second table (I kept the original
numbers, so 3b happens after 7a time-wise!) we got into
WARN_ON().

--
http://www.holgerschurig.de

2009-10-23 13:35:08

by Johannes Berg

[permalink] [raw]
Subject: Re: cfg80211 / libertas: an unusual race

On Fri, 2009-10-16 at 14:03 +0200, Holger Schurig wrote:
> > I don't see this happening that way -- how are you even getting
> to
> > __cfg80211_disconnected() if not through _send_deauth()?
>
> See step 7a in the second table. We got there via
> _send_deauth().
>
> But later, step 3b in the second table (I kept the original
> numbers, so 3b happens after 7a time-wise!) we got into
> WARN_ON().

Still not making sense to me, since cfg80211_mlme_deauth() never does
anything after calling you.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part

2009-10-27 11:37:19

by Johannes Berg

[permalink] [raw]
Subject: Re: cfg80211 / libertas: an unusual race

On Thu, 2009-10-22 at 10:47 +0200, Johannes Berg wrote:
> On Fri, 2009-10-16 at 14:03 +0200, Holger Schurig wrote:
> > > I don't see this happening that way -- how are you even getting
> > to
> > > __cfg80211_disconnected() if not through _send_deauth()?
> >
> > See step 7a in the second table. We got there via
> > _send_deauth().
> >
> > But later, step 3b in the second table (I kept the original
> > numbers, so 3b happens after 7a time-wise!) we got into
> > WARN_ON().
>
> Still not making sense to me, since cfg80211_mlme_deauth() never does
> anything after calling you.

Ok, so I can see that this might happen as a result of a call to
__cfg80211_disconnect(), and I think the patch is correct.

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part

2009-10-16 11:39:02

by Johannes Berg

[permalink] [raw]
Subject: Re: cfg80211 / libertas: an unusual race

On Tue, 2009-10-13 at 10:55 +0200, Holger Schurig wrote:

> But at step "3. b)", which now happens at a later time,
> wdev_current_bss is no longer set. This triggers a
> WARN_ON(!done) warning in __cfg80211_send_deauth(),
> file net/wireless/mlme.c
>

> Would it be O.K. to simply remove the WARN_ON(!done) ?
> After all, wdev>current_bss was put'ted and cleared correctly.

I don't see this happening that way -- how are you even getting to
__cfg80211_disconnected() if not through _send_deauth()?

johannes


Attachments:
signature.asc (801.00 B)
This is a digitally signed message part