On Monday 24 November 2008 17:51:45 Stefan Steuerwald wrote:
> Thanks again Christian and Johannes,
>
> from a first quick check
> - set-and-clear.diff doesn't seem to change anything
> - both patches together freeze my system
>
> I don't have a serial console on this embedded thing, so I don't know
> its death poem yet. Let me set up my debug environment properly and
> report back to you.
This might not be necessary.
Try this updated patch. And let us know if we no do the right thing.
Regards,
Chr
> But I don't know what to do with set-and-clear.diff
> (I guess it's still one of the three patches, or?, I've attached it again,
> in case johannes missed it? ).
> Is it really necessary or does your application work without it?
I have done a very quick test of that (exactly what I had before MINUS
the set-and-clear.diff).
I don't see any problems.
I recommend not to include this patch at this time.
I am without the hardware until Dec 8, but will continue testing after that.
CCMP is on my list (have only done unencrypted stuff yet).
Best regards,
Stefan.
Seems solved! Maybe a little premature, but I need to blurt this out...
Don't know why, but I think I was not thorough enough in my kernel config:
- I have a so-called alix board featuring a Geode LX 800
- I had tried to set processor type to Geode GX/LX, but that did not
boot (hangs somewhere)
- I didn't bother to find out why, but compiled for 486 instead (worked)
- In the meantime, I copied/merged a kernel .config from another
branch with processor type = 586/686/etc, which went unnoticed by me,
but seemed to work all the time (except maybe for that last kernel
BUG)
- Now I compiled for Geode GX/LX again, and set
CONFIG_GEODE_MFGPT_TIMER=n (as per this info here:
https://kerneltrap.org/mailarchive/linux-kernel/2008/1/20/585236)
which makes my kernel boot and SEEMS TO MAKE THAT CRASH GO AWAY!!!
At least I haven't observed the crash in the last 60 minutes, whereas
before it took only 1-2 minutes every time to turn it up.
Will test this all day.
The three patches mentioned before are applied, and my app-level
timeout is still gone, and the "dropped filtered TX" messages are gone
as well.
Christian, should I actually test your p54-sta-flags-v3 patch?
Regards,
Stefan.
2008/11/26 Christian Lamparter <[email protected]>:
> On Wednesday 26 November 2008 14:38:59 Stefan Steuerwald wrote:
>> console [netcon0] enabled
>> netconsole: network logging started
>> BUG: unable to handle kernel NULL pointer dereference at 00000038
>> IP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common]
>> *pde = 00000000
>> Oops: 0000 [#1]
>> last sysfs file: /sys/class/net/lo/operstate
>> Modules linked in: netconsole ipv6 loop evdev ehci_hcd ohci_hcd
>> rtc_cmos rtc_core pcspkr rtc_lib p54pci usbcore via_rhine p54common
>> geode_aes mii [last unloaded: netconsole]
>>
>> Pid: 0, comm: swapper Not tainted (2.6.28-rc6-wl #16)
>> EIP: 0060:[<d08260fa>] EFLAGS: 00010002 CPU: 0
>> EIP is at p54_assign_address+0x67/0x14b [p54common]
>> EAX: cf98b178 EBX: cf86ee40 ECX: 00000000 EDX: 00000000
>> ESI: 000000f8 EDI: 00000000 EBP: 0002027c ESP: c03f9c4c
>> DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
>> Process swapper (pid: 0, ti=c03f8000 task=c03c4380 task.ti=c03f8000)
>> Stack:
>> 00000002 ce4d5880 ce4c48b4 cf86e1a0 00000000 00000038 00020200 00000286
>> cf86ee40 00000004 ce4d58b2 ce4d588c d0826fd7 00000090 014c48d4 ce4c48b4
>> cf86e1a0 0086ee40 00000004 02000282 ce4c48d4 cf86ef10 cf86ee40 ce4d5880
>> Call Trace:
>> [<d0826fd7>] p54_tx+0x416/0x482 [p54common]
>> [<c02fb7c2>] __ieee80211_tx+0x35/0xf8
>> [<c02fc235>] ieee80211_master_start_xmit+0x2ab/0x396
>> [<c01048d3>] common_interrupt+0x23/0x30
>> [<c0297368>] dev_hard_start_xmit+0x16e/0x1c9
>> [<c02a3518>] __qdisc_run+0xa2/0x15c
>> [<c0297796>] dev_queue_xmit+0x2f5/0x3c5
>> [<c02f8608>] ieee80211_invoke_rx_handlers+0x488/0x1486
>> [<c02d9d14>] bictcp_cong_avoid+0x10/0x160
>> [<c02bd904>] tcp_ack+0x16f0/0x1850
>> [<c01170f0>] enqueue_task_fair+0x12a/0x16b
>> [<c02c0c37>] tcp_current_mss+0x6b/0xe4
>> [<c02f9b50>] __ieee80211_rx_handle_packet+0x54a/0x56d
>> [<c02fa1fe>] __ieee80211_rx+0x491/0x4e3
>> [<c02ec95d>] ieee80211_tasklet_handler+0x60/0xd6
>> [<c011cfae>] tasklet_action+0x3e/0x64
>> [<c011d305>] __do_softirq+0x4a/0xbc
>> [<c011d399>] do_softirq+0x22/0x26
>> [<c011d44f>] irq_exit+0x25/0x55
>> [<c0105996>] do_IRQ+0x5a/0x6c
>> [<c01048d3>] common_interrupt+0x23/0x30
>> [<c0108743>] default_idle+0x25/0x38
>> [<c0102926>] cpu_idle+0x41/0x5b
>> Code: 0f 84 01 01 00 00 9c 8f 44 24 1c fa 8b 53 10 31 ff 89 6c 24 18
>> 89 14 24 31 d2 eb 3f 8b 4c 24 10 83 c1 38 89 4c 24 14 8b 4c 24 10 <8b>
>> 41 38 29 e8 85 d2 75 0d 39 f0 72 09 8b 51 04 29 f0 89 6c 24
>> EIP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common] SS:ESP 0068:c03f9c4c
>> Kernel panic - not syncing: Fatal exception in interrupt
>>
> wt*, this bug is "impossible":
>
> The bug happens when p54_assign_address looks for a free space for a new frame:
> here's the code:
> [...]
> if (!skb)
> return -EINVAL; <--- we don't accept "null" skbs
>
> spin_lock_irqsave(&priv->tx_queue.lock, flags); <--- we are under a spin_lock with irq disabled
> left = skb_queue_len(&priv->tx_queue);
> while (left--) {
> u32 hole_size;
> info = IEEE80211_SKB_CB(entry); <--- Here it BUGs,
> [...]
>
> your binary module said that skb->cb is at 0x38,
> so our "entry" is really NULL right when it BUGS.
> And this only happens means that the queue was
> modified "outside" of our driver.
>
> Since we always take the spin_lock_irqsave (of course,
> only of "our" tx_queue). if we need to do anything with the data in the queue,
>
> Of course, since the package as queued while the station was sleeping
> somewhere mac80211, so maybe it still holds a reference to, but then
> other drivers would have already spotted this misbehaviour long time ago...
>
> So? back to square one... I guess.
>
> Regards,
> Chr
>
On Fri, 2008-11-28 at 21:43 +0100, Christian Lamparter wrote:
> On Friday 28 November 2008 21:09:35 Christian Lamparter wrote:
> > Ahh... I knew it!
> >
> > Alright, I looks like I have to change the mac80211 stack for this.
> > What I need is a callback form ap_sta_ps_end & (ap_sta_ps_start).
> >
> > It's because (p54_)set_tim - and therefore p54_sta_unlock as well - won't
> > be executed if the station changes its power state very quickly/or if no package comes in
> > So we have no change to notify the firmware about the stations new power state
> > and then the firmware won't let us send anything to the station.
> >
> > here is my proposal for mac80211:
> > ---
> Updates:
> - integrate sta_notify_ps into sta_notify.
> - added trivial switch cases for mac80211_hwsim.c (or else gcc complains)
Looks fine to me.
> And BTW: can someone please check the spelling?
And that too.
> ---
> diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
> index f43da1c..e2c50ed 100644
> --- a/drivers/net/wireless/mac80211_hwsim.c
> +++ b/drivers/net/wireless/mac80211_hwsim.c
> @@ -524,6 +524,10 @@ static void mac80211_hwsim_sta_notify(struct ieee80211_hw *hw,
> case STA_NOTIFY_REMOVE:
> hwsim_clear_sta_magic(sta);
> break;
> + case STA_NOTIFY_AWAKE:
> + case STA_NOTIFY_SLEEP:
> + /* TODO: make good use of these callbacks */
> + break;
> }
> }
>
> diff --git a/include/net/mac80211.h b/include/net/mac80211.h
> index 6a1d4ea..7bd8edc 100644
> --- a/include/net/mac80211.h
> +++ b/include/net/mac80211.h
> @@ -768,14 +768,18 @@ struct ieee80211_sta {
> /**
> * enum sta_notify_cmd - sta notify command
> *
> - * Used with the sta_notify() callback in &struct ieee80211_ops, this
> - * indicates addition and removal of a station to station table.
> + * Used with the sta_notify() callback in &struct ieee80211_ops.
> + * this command indicates addition and removal of a station to
> + * station table, or if a station made a power state transition.
> *
> * @STA_NOTIFY_ADD: a station was added to the station table
> * @STA_NOTIFY_REMOVE: a station being removed from the station table
> + * @STA_NOTIFY_SLEEP: a station is now sleeping
> + * @STA_NOTIFY_AWAKE: a sleeping station woke up
> */
> enum sta_notify_cmd {
> - STA_NOTIFY_ADD, STA_NOTIFY_REMOVE
> + STA_NOTIFY_ADD, STA_NOTIFY_REMOVE,
> + STA_NOTIFY_SLEEP, STA_NOTIFY_AWAKE,
> };
>
> /**
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 5a1a60f..2d311a1 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -654,10 +654,14 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx)
> static void ap_sta_ps_start(struct sta_info *sta)
> {
> struct ieee80211_sub_if_data *sdata = sta->sdata;
> + struct ieee80211_local *local = sdata->local;
> DECLARE_MAC_BUF(mac);
>
> atomic_inc(&sdata->bss->num_sta_ps);
> set_and_clear_sta_flags(sta, WLAN_STA_PS, WLAN_STA_PSPOLL);
> + if (local->ops->sta_notify)
> + local->ops->sta_notify(local_to_hw(local), &sdata->vif,
> + STA_NOTIFY_SLEEP, &sta->sta);
> #ifdef CONFIG_MAC80211_VERBOSE_PS_DEBUG
> printk(KERN_DEBUG "%s: STA %s aid %d enters power save mode\n",
> sdata->dev->name, print_mac(mac, sta->sta.addr), sta->sta.aid);
> @@ -675,6 +679,9 @@ static int ap_sta_ps_end(struct sta_info *sta)
> atomic_dec(&sdata->bss->num_sta_ps);
>
> clear_sta_flags(sta, WLAN_STA_PS | WLAN_STA_PSPOLL);
> + if (local->ops->sta_notify)
> + local->ops->sta_notify(local_to_hw(local), &sdata->vif,
> + STA_NOTIFY_AWAKE, &sta->sta);
>
> if (!skb_queue_empty(&sta->ps_tx_buf))
> sta_info_clear_tim_bit(sta);
> ---
>
> p54 updates:
> - update to new api
> ---
> diff -Nurp a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
> --- a/drivers/net/wireless/p54/p54common.c 2008-11-28 20:18:53.000000000 +0100
> +++ b/drivers/net/wireless/p54/p54common.c 2008-11-28 21:37:45.000000000 +0100
> @@ -653,6 +653,10 @@ static void p54_rx_frame_sent(struct iee
> __skb_unlink(entry, &priv->tx_queue);
> spin_unlock_irqrestore(&priv->tx_queue.lock, flags);
>
> + entry_hdr = (struct p54_hdr *) entry->data;
> + entry_data = (struct p54_tx_data *) entry_hdr->data;
> + priv->tx_stats[entry_data->hw_queue].len--;
> +
> if (unlikely(entry == priv->cached_beacon)) {
> kfree_skb(entry);
> priv->cached_beacon = NULL;
> @@ -669,8 +673,6 @@ static void p54_rx_frame_sent(struct iee
> BUILD_BUG_ON(offsetof(struct ieee80211_tx_info,
> status.ampdu_ack_len) != 23);
>
> - entry_hdr = (struct p54_hdr *) entry->data;
> - entry_data = (struct p54_tx_data *) entry_hdr->data;
> if (entry_hdr->flags & cpu_to_le16(P54_HDR_FLAG_DATA_ALIGN))
> pad = entry_data->align[0];
>
> @@ -688,7 +690,6 @@ static void p54_rx_frame_sent(struct iee
> }
> }
>
> - priv->tx_stats[entry_data->hw_queue].len--;
> if (!(info->flags & IEEE80211_TX_CTL_NO_ACK) &&
> (!payload->status))
> info->flags |= IEEE80211_TX_STAT_ACK;
> @@ -1005,6 +1006,26 @@ static int p54_sta_unlock(struct ieee802
> return 0;
> }
>
> +static void p54_sta_notify(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
> + enum sta_notify_cmd notify_cmd,
> + struct ieee80211_sta *sta)
> +{
> + switch (notify_cmd) {
> + case STA_NOTIFY_ADD:
> + case STA_NOTIFY_REMOVE:
> + case STA_NOTIFY_AWAKE:
> + /*
> + * Notify the firmware that we don't want or we don't
> + * need to buffer frames for this station anymore.
> + */
> +
> + p54_sta_unlock(dev, sta->addr);
> + break;
> + default:
> + break;
> + }
> +}
> +
> static int p54_tx_cancel(struct ieee80211_hw *dev, struct sk_buff *entry)
> {
> struct p54_common *priv = dev->priv;
> @@ -1070,7 +1091,7 @@ static int p54_tx_fill(struct ieee80211_
> if (info->control.sta)
> *aid = info->control.sta->aid;
> else
> - *flags = P54_HDR_FLAG_DATA_OUT_NOCANCEL;
> + *flags |= P54_HDR_FLAG_DATA_OUT_NOCANCEL;
> }
> return ret;
> }
> @@ -1083,7 +1104,7 @@ static int p54_tx(struct ieee80211_hw *d
> struct p54_hdr *hdr;
> struct p54_tx_data *txhdr;
> size_t padding, len, tim_len = 0;
> - int i, j, ridx;
> + int i, j, ridx, ret;
> u16 hdr_flags = 0, aid = 0;
> u8 rate, queue;
> u8 cts_rate = 0x20;
> @@ -1093,30 +1114,18 @@ static int p54_tx(struct ieee80211_hw *d
>
> queue = skb_get_queue_mapping(skb);
>
> - if (p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid)) {
> - current_queue = &priv->tx_stats[queue];
> - if (unlikely(current_queue->len > current_queue->limit))
> - return NETDEV_TX_BUSY;
> - current_queue->len++;
> - current_queue->count++;
> - if (current_queue->len == current_queue->limit)
> - ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
> - }
> + ret = p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid);
> + current_queue = &priv->tx_stats[queue];
> + if (unlikely((current_queue->len > current_queue->limit) && ret))
> + return NETDEV_TX_BUSY;
> + current_queue->len++;
> + current_queue->count++;
> + if ((current_queue->len == current_queue->limit) && ret)
> + ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
>
> padding = (unsigned long)(skb->data - (sizeof(*hdr) + sizeof(*txhdr))) & 3;
> len = skb->len;
>
> - if (info->flags & IEEE80211_TX_CTL_CLEAR_PS_FILT) {
> - if (info->control.sta)
> - if (p54_sta_unlock(dev, info->control.sta->addr)) {
> - if (current_queue) {
> - current_queue->len--;
> - current_queue->count--;
> - }
> - return NETDEV_TX_BUSY;
> - }
> - }
> -
> txhdr = (struct p54_tx_data *) skb_push(skb, sizeof(*txhdr) + padding);
> hdr = (struct p54_hdr *) skb_push(skb, sizeof(*hdr));
>
> @@ -1835,6 +1844,7 @@ static const struct ieee80211_ops p54_op
> .add_interface = p54_add_interface,
> .remove_interface = p54_remove_interface,
> .set_tim = p54_set_tim,
> + .sta_notify = p54_sta_notify,
> .config = p54_config,
> .config_interface = p54_config_interface,
> .bss_info_changed = p54_bss_info_changed,
> diff -Nurp a/drivers/net/wireless/p54/p54common.h b/drivers/net/wireless/p54/p54common.h
> --- a/drivers/net/wireless/p54/p54common.h 2008-11-28 20:18:53.000000000 +0100
> +++ b/drivers/net/wireless/p54/p54common.h 2008-11-28 20:27:59.000000000 +0100
> @@ -302,7 +302,7 @@ enum p54_frame_sent_status {
> P54_TX_OK = 0,
> P54_TX_FAILED,
> P54_TX_PSM,
> - P54_TX_PSM_CANCELLED
> + P54_TX_PSM_CANCELLED = 4
> };
>
> struct p54_frame_sent {
>
On Thursday 27 November 2008 09:57:11 Stefan Steuerwald wrote:
> At least I haven't observed the crash in the last 60 minutes, whereas
> before it took only 1-2 minutes every time to turn it up.
> Will test this all day.
Finally!
> The three patches mentioned before are applied, and my app-level
> timeout is still gone, and the "dropped filtered TX" messages are gone
> as well.
>
> Christian, should I actually test your p54-sta-flags-v3 patch?
>
Yes, of course... v1 and v2 had a stupid "full queue" bug in p54_tx/p54_tx_fill.
So let me know if your ipod still works with v3.
Regards,
Chr
Ahh... I knew it!
Alright, I looks like I have to change the mac80211 stack for this.
What I need is a callback form ap_sta_ps_end & (ap_sta_ps_start).
It's because (p54_)set_tim - and therefore p54_sta_unlock as well - won't
be executed if the station changes its power state very quickly/or if no package comes in
So we have no change to notify the firmware about the stations new power state
and then the firmware won't let us send anything to the station.
here is my proposal for mac80211:
---
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 6a1d4ea..341ddb1 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -779,6 +779,20 @@ enum sta_notify_cmd {
};
/**
+ * enum sta_notify_ps_cmd - sta powersave notify command
+ *
+ * Used with the sta_notify_ps() callback in &struct ieee80211_ops to
+ * notify the driver if a station made a power state transition.
+ *
+ * @STA_WILL_SLEEP: the station will sleep
+ * @STA_WOKE_UP: the sleeping station woke up
+ */
+enum sta_notify_ps_cmd {
+ STA_WILL_SLEEP,
+ STA_WOKE_UP,
+};
+
+/**
* enum ieee80211_tkip_key_type - get tkip key
*
* Used by drivers which need to get a tkip key for skb. Some drivers need a
@@ -1248,6 +1262,9 @@ enum ieee80211_ampdu_mlme_action {
* @sta_notify: Notifies low level driver about addition or removal
* of associated station or AP.
*
+ * @sta_notify_ps: Notifies low level driver about the power state transition
+ * of a associated station.
+ *
* @conf_tx: Configure TX queue parameters (EDCF (aifs, cw_min, cw_max),
* bursting) for a hardware TX queue.
*
@@ -1314,6 +1331,8 @@ struct ieee80211_ops {
int (*set_frag_threshold)(struct ieee80211_hw *hw, u32 value);
void (*sta_notify)(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
enum sta_notify_cmd, struct ieee80211_sta *sta);
+ void (*sta_notify_ps)(struct ieee80211_hw *hw,
+ enum sta_notify_ps_cmd, struct ieee80211_sta *sta);
int (*conf_tx)(struct ieee80211_hw *hw, u16 queue,
const struct ieee80211_tx_queue_params *params);
int (*get_tx_stats)(struct ieee80211_hw *hw,
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 5a1a60f..3bf056e 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -654,10 +654,14 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx)
static void ap_sta_ps_start(struct sta_info *sta)
{
struct ieee80211_sub_if_data *sdata = sta->sdata;
+ struct ieee80211_local *local = sdata->local;
DECLARE_MAC_BUF(mac);
atomic_inc(&sdata->bss->num_sta_ps);
set_and_clear_sta_flags(sta, WLAN_STA_PS, WLAN_STA_PSPOLL);
+ if (local->ops->sta_notify_ps)
+ local->ops->sta_notify_ps(local_to_hw(local), STA_WILL_SLEEP,
+ &sta->sta);
#ifdef CONFIG_MAC80211_VERBOSE_PS_DEBUG
printk(KERN_DEBUG "%s: STA %s aid %d enters power save mode\n",
sdata->dev->name, print_mac(mac, sta->sta.addr), sta->sta.aid);
@@ -675,6 +679,9 @@ static int ap_sta_ps_end(struct sta_info *sta)
atomic_dec(&sdata->bss->num_sta_ps);
clear_sta_flags(sta, WLAN_STA_PS | WLAN_STA_PSPOLL);
+ if (local->ops->sta_notify_ps)
+ local->ops->sta_notify_ps(local_to_hw(local), STA_WOKE_UP,
+ &sta->sta);
if (!skb_queue_empty(&sta->ps_tx_buf))
sta_info_clear_tim_bit(sta);
---
and here is p54 new code:
---
diff -Nurp a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
--- a/drivers/net/wireless/p54/p54common.c 2008-11-28 20:18:53.000000000 +0100
+++ b/drivers/net/wireless/p54/p54common.c 2008-11-28 20:40:02.000000000 +0100
@@ -653,6 +653,10 @@ static void p54_rx_frame_sent(struct iee
__skb_unlink(entry, &priv->tx_queue);
spin_unlock_irqrestore(&priv->tx_queue.lock, flags);
+ entry_hdr = (struct p54_hdr *) entry->data;
+ entry_data = (struct p54_tx_data *) entry_hdr->data;
+ priv->tx_stats[entry_data->hw_queue].len--;
+
if (unlikely(entry == priv->cached_beacon)) {
kfree_skb(entry);
priv->cached_beacon = NULL;
@@ -669,8 +673,6 @@ static void p54_rx_frame_sent(struct iee
BUILD_BUG_ON(offsetof(struct ieee80211_tx_info,
status.ampdu_ack_len) != 23);
- entry_hdr = (struct p54_hdr *) entry->data;
- entry_data = (struct p54_tx_data *) entry_hdr->data;
if (entry_hdr->flags & cpu_to_le16(P54_HDR_FLAG_DATA_ALIGN))
pad = entry_data->align[0];
@@ -688,7 +690,6 @@ static void p54_rx_frame_sent(struct iee
}
}
- priv->tx_stats[entry_data->hw_queue].len--;
if (!(info->flags & IEEE80211_TX_CTL_NO_ACK) &&
(!payload->status))
info->flags |= IEEE80211_TX_STAT_ACK;
@@ -1005,6 +1006,38 @@ static int p54_sta_unlock(struct ieee802
return 0;
}
+static void p54_sta_notify_ps(struct ieee80211_hw *dev,
+ enum sta_notify_ps_cmd notify_cmd,
+ struct ieee80211_sta *sta)
+{
+ switch (notify_cmd) {
+ case STA_WOKE_UP:
+ p54_sta_unlock(dev, sta->addr);
+ break;
+ default:
+ break;
+ }
+}
+
+static void p54_sta_notify(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
+ enum sta_notify_cmd notify_cmd,
+ struct ieee80211_sta *sta)
+{
+ switch (notify_cmd) {
+ case STA_NOTIFY_ADD:
+ case STA_NOTIFY_REMOVE:
+ /*
+ * Notify the firmware that we don't want or we don't
+ * need to buffer frames for this station anymore.
+ */
+
+ p54_sta_unlock(dev, sta->addr);
+ break;
+ default:
+ break;
+ }
+}
+
static int p54_tx_cancel(struct ieee80211_hw *dev, struct sk_buff *entry)
{
struct p54_common *priv = dev->priv;
@@ -1070,7 +1103,7 @@ static int p54_tx_fill(struct ieee80211_
if (info->control.sta)
*aid = info->control.sta->aid;
else
- *flags = P54_HDR_FLAG_DATA_OUT_NOCANCEL;
+ *flags |= P54_HDR_FLAG_DATA_OUT_NOCANCEL;
}
return ret;
}
@@ -1083,7 +1116,7 @@ static int p54_tx(struct ieee80211_hw *d
struct p54_hdr *hdr;
struct p54_tx_data *txhdr;
size_t padding, len, tim_len = 0;
- int i, j, ridx;
+ int i, j, ridx, ret;
u16 hdr_flags = 0, aid = 0;
u8 rate, queue;
u8 cts_rate = 0x20;
@@ -1093,30 +1126,18 @@ static int p54_tx(struct ieee80211_hw *d
queue = skb_get_queue_mapping(skb);
- if (p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid)) {
- current_queue = &priv->tx_stats[queue];
- if (unlikely(current_queue->len > current_queue->limit))
- return NETDEV_TX_BUSY;
- current_queue->len++;
- current_queue->count++;
- if (current_queue->len == current_queue->limit)
- ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
- }
+ ret = p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid);
+ current_queue = &priv->tx_stats[queue];
+ if (unlikely((current_queue->len > current_queue->limit) && ret))
+ return NETDEV_TX_BUSY;
+ current_queue->len++;
+ current_queue->count++;
+ if ((current_queue->len == current_queue->limit) && ret)
+ ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
padding = (unsigned long)(skb->data - (sizeof(*hdr) + sizeof(*txhdr))) & 3;
len = skb->len;
- if (info->flags & IEEE80211_TX_CTL_CLEAR_PS_FILT) {
- if (info->control.sta)
- if (p54_sta_unlock(dev, info->control.sta->addr)) {
- if (current_queue) {
- current_queue->len--;
- current_queue->count--;
- }
- return NETDEV_TX_BUSY;
- }
- }
-
txhdr = (struct p54_tx_data *) skb_push(skb, sizeof(*txhdr) + padding);
hdr = (struct p54_hdr *) skb_push(skb, sizeof(*hdr));
@@ -1835,6 +1856,8 @@ static const struct ieee80211_ops p54_op
.add_interface = p54_add_interface,
.remove_interface = p54_remove_interface,
.set_tim = p54_set_tim,
+ .sta_notify_ps = p54_sta_notify_ps,
+ .sta_notify = p54_sta_notify,
.config = p54_config,
.config_interface = p54_config_interface,
.bss_info_changed = p54_bss_info_changed,
diff -Nurp a/drivers/net/wireless/p54/p54common.h b/drivers/net/wireless/p54/p54common.h
--- a/drivers/net/wireless/p54/p54common.h 2008-11-28 20:18:53.000000000 +0100
+++ b/drivers/net/wireless/p54/p54common.h 2008-11-28 20:27:59.000000000 +0100
@@ -302,7 +302,7 @@ enum p54_frame_sent_status {
P54_TX_OK = 0,
P54_TX_FAILED,
P54_TX_PSM,
- P54_TX_PSM_CANCELLED
+ P54_TX_PSM_CANCELLED = 4
};
struct p54_frame_sent {
On Friday 28 November 2008 21:09:35 Christian Lamparter wrote:
> Ahh... I knew it!
>
> Alright, I looks like I have to change the mac80211 stack for this.
> What I need is a callback form ap_sta_ps_end & (ap_sta_ps_start).
>
> It's because (p54_)set_tim - and therefore p54_sta_unlock as well - won't
> be executed if the station changes its power state very quickly/or if no package comes in
> So we have no change to notify the firmware about the stations new power state
> and then the firmware won't let us send anything to the station.
>
> here is my proposal for mac80211:
> ---
Updates:
- integrate sta_notify_ps into sta_notify.
- added trivial switch cases for mac80211_hwsim.c (or else gcc complains)
And BTW: can someone please check the spelling?
---
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index f43da1c..e2c50ed 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -524,6 +524,10 @@ static void mac80211_hwsim_sta_notify(struct ieee80211_hw *hw,
case STA_NOTIFY_REMOVE:
hwsim_clear_sta_magic(sta);
break;
+ case STA_NOTIFY_AWAKE:
+ case STA_NOTIFY_SLEEP:
+ /* TODO: make good use of these callbacks */
+ break;
}
}
diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index 6a1d4ea..7bd8edc 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -768,14 +768,18 @@ struct ieee80211_sta {
/**
* enum sta_notify_cmd - sta notify command
*
- * Used with the sta_notify() callback in &struct ieee80211_ops, this
- * indicates addition and removal of a station to station table.
+ * Used with the sta_notify() callback in &struct ieee80211_ops.
+ * this command indicates addition and removal of a station to
+ * station table, or if a station made a power state transition.
*
* @STA_NOTIFY_ADD: a station was added to the station table
* @STA_NOTIFY_REMOVE: a station being removed from the station table
+ * @STA_NOTIFY_SLEEP: a station is now sleeping
+ * @STA_NOTIFY_AWAKE: a sleeping station woke up
*/
enum sta_notify_cmd {
- STA_NOTIFY_ADD, STA_NOTIFY_REMOVE
+ STA_NOTIFY_ADD, STA_NOTIFY_REMOVE,
+ STA_NOTIFY_SLEEP, STA_NOTIFY_AWAKE,
};
/**
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 5a1a60f..2d311a1 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -654,10 +654,14 @@ ieee80211_rx_h_decrypt(struct ieee80211_rx_data *rx)
static void ap_sta_ps_start(struct sta_info *sta)
{
struct ieee80211_sub_if_data *sdata = sta->sdata;
+ struct ieee80211_local *local = sdata->local;
DECLARE_MAC_BUF(mac);
atomic_inc(&sdata->bss->num_sta_ps);
set_and_clear_sta_flags(sta, WLAN_STA_PS, WLAN_STA_PSPOLL);
+ if (local->ops->sta_notify)
+ local->ops->sta_notify(local_to_hw(local), &sdata->vif,
+ STA_NOTIFY_SLEEP, &sta->sta);
#ifdef CONFIG_MAC80211_VERBOSE_PS_DEBUG
printk(KERN_DEBUG "%s: STA %s aid %d enters power save mode\n",
sdata->dev->name, print_mac(mac, sta->sta.addr), sta->sta.aid);
@@ -675,6 +679,9 @@ static int ap_sta_ps_end(struct sta_info *sta)
atomic_dec(&sdata->bss->num_sta_ps);
clear_sta_flags(sta, WLAN_STA_PS | WLAN_STA_PSPOLL);
+ if (local->ops->sta_notify)
+ local->ops->sta_notify(local_to_hw(local), &sdata->vif,
+ STA_NOTIFY_AWAKE, &sta->sta);
if (!skb_queue_empty(&sta->ps_tx_buf))
sta_info_clear_tim_bit(sta);
---
p54 updates:
- update to new api
---
diff -Nurp a/drivers/net/wireless/p54/p54common.c b/drivers/net/wireless/p54/p54common.c
--- a/drivers/net/wireless/p54/p54common.c 2008-11-28 20:18:53.000000000 +0100
+++ b/drivers/net/wireless/p54/p54common.c 2008-11-28 21:37:45.000000000 +0100
@@ -653,6 +653,10 @@ static void p54_rx_frame_sent(struct iee
__skb_unlink(entry, &priv->tx_queue);
spin_unlock_irqrestore(&priv->tx_queue.lock, flags);
+ entry_hdr = (struct p54_hdr *) entry->data;
+ entry_data = (struct p54_tx_data *) entry_hdr->data;
+ priv->tx_stats[entry_data->hw_queue].len--;
+
if (unlikely(entry == priv->cached_beacon)) {
kfree_skb(entry);
priv->cached_beacon = NULL;
@@ -669,8 +673,6 @@ static void p54_rx_frame_sent(struct iee
BUILD_BUG_ON(offsetof(struct ieee80211_tx_info,
status.ampdu_ack_len) != 23);
- entry_hdr = (struct p54_hdr *) entry->data;
- entry_data = (struct p54_tx_data *) entry_hdr->data;
if (entry_hdr->flags & cpu_to_le16(P54_HDR_FLAG_DATA_ALIGN))
pad = entry_data->align[0];
@@ -688,7 +690,6 @@ static void p54_rx_frame_sent(struct iee
}
}
- priv->tx_stats[entry_data->hw_queue].len--;
if (!(info->flags & IEEE80211_TX_CTL_NO_ACK) &&
(!payload->status))
info->flags |= IEEE80211_TX_STAT_ACK;
@@ -1005,6 +1006,26 @@ static int p54_sta_unlock(struct ieee802
return 0;
}
+static void p54_sta_notify(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
+ enum sta_notify_cmd notify_cmd,
+ struct ieee80211_sta *sta)
+{
+ switch (notify_cmd) {
+ case STA_NOTIFY_ADD:
+ case STA_NOTIFY_REMOVE:
+ case STA_NOTIFY_AWAKE:
+ /*
+ * Notify the firmware that we don't want or we don't
+ * need to buffer frames for this station anymore.
+ */
+
+ p54_sta_unlock(dev, sta->addr);
+ break;
+ default:
+ break;
+ }
+}
+
static int p54_tx_cancel(struct ieee80211_hw *dev, struct sk_buff *entry)
{
struct p54_common *priv = dev->priv;
@@ -1070,7 +1091,7 @@ static int p54_tx_fill(struct ieee80211_
if (info->control.sta)
*aid = info->control.sta->aid;
else
- *flags = P54_HDR_FLAG_DATA_OUT_NOCANCEL;
+ *flags |= P54_HDR_FLAG_DATA_OUT_NOCANCEL;
}
return ret;
}
@@ -1083,7 +1104,7 @@ static int p54_tx(struct ieee80211_hw *d
struct p54_hdr *hdr;
struct p54_tx_data *txhdr;
size_t padding, len, tim_len = 0;
- int i, j, ridx;
+ int i, j, ridx, ret;
u16 hdr_flags = 0, aid = 0;
u8 rate, queue;
u8 cts_rate = 0x20;
@@ -1093,30 +1114,18 @@ static int p54_tx(struct ieee80211_hw *d
queue = skb_get_queue_mapping(skb);
- if (p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid)) {
- current_queue = &priv->tx_stats[queue];
- if (unlikely(current_queue->len > current_queue->limit))
- return NETDEV_TX_BUSY;
- current_queue->len++;
- current_queue->count++;
- if (current_queue->len == current_queue->limit)
- ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
- }
+ ret = p54_tx_fill(dev, skb, info, &queue, &tim_len, &hdr_flags, &aid);
+ current_queue = &priv->tx_stats[queue];
+ if (unlikely((current_queue->len > current_queue->limit) && ret))
+ return NETDEV_TX_BUSY;
+ current_queue->len++;
+ current_queue->count++;
+ if ((current_queue->len == current_queue->limit) && ret)
+ ieee80211_stop_queue(dev, skb_get_queue_mapping(skb));
padding = (unsigned long)(skb->data - (sizeof(*hdr) + sizeof(*txhdr))) & 3;
len = skb->len;
- if (info->flags & IEEE80211_TX_CTL_CLEAR_PS_FILT) {
- if (info->control.sta)
- if (p54_sta_unlock(dev, info->control.sta->addr)) {
- if (current_queue) {
- current_queue->len--;
- current_queue->count--;
- }
- return NETDEV_TX_BUSY;
- }
- }
-
txhdr = (struct p54_tx_data *) skb_push(skb, sizeof(*txhdr) + padding);
hdr = (struct p54_hdr *) skb_push(skb, sizeof(*hdr));
@@ -1835,6 +1844,7 @@ static const struct ieee80211_ops p54_op
.add_interface = p54_add_interface,
.remove_interface = p54_remove_interface,
.set_tim = p54_set_tim,
+ .sta_notify = p54_sta_notify,
.config = p54_config,
.config_interface = p54_config_interface,
.bss_info_changed = p54_bss_info_changed,
diff -Nurp a/drivers/net/wireless/p54/p54common.h b/drivers/net/wireless/p54/p54common.h
--- a/drivers/net/wireless/p54/p54common.h 2008-11-28 20:18:53.000000000 +0100
+++ b/drivers/net/wireless/p54/p54common.h 2008-11-28 20:27:59.000000000 +0100
@@ -302,7 +302,7 @@ enum p54_frame_sent_status {
P54_TX_OK = 0,
P54_TX_FAILED,
P54_TX_PSM,
- P54_TX_PSM_CANCELLED
+ P54_TX_PSM_CANCELLED = 4
};
struct p54_frame_sent {
Almost there, almost there!!! :-)
Using all of the attached three patches my problem seems gone:
- I no longer see any "dropped TX filtered" messages in syslog
- My app no longer times out
p54_sta_flags_v2.diff alone does not change anything for me.
Johannes' patch alone MAY reduce the frequency of my app-level
timeouts, but I have only gut feeling to support that.
All three together do it.
Patches attached in original form for reference.
However, after some time, the kernel oopses. Console output below.
I watched this several times, and it always happens after seeing these
last two lines in syslog:348 :
Nov 26 14:16:32 alix kernel: STA 00:22:41:91:8e:96 aid 1: PS buffer
(entries before 0)
Nov 26 14:16:32 alix kernel: STA 00:22:41:91:8e:96 aid 1: PS buffer
(entries before 1)
Does this help? Anything you need me to try, I have this hardware only
until tomorrow. It gets back to me after Dec 7.
Thanks! Great work so far!
Stefan.
---
Console:
console [netcon0] enabled
netconsole: network logging started
BUG: unable to handle kernel NULL pointer dereference at 00000038
IP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common]
*pde = 00000000
Oops: 0000 [#1]
last sysfs file: /sys/class/net/lo/operstate
Modules linked in: netconsole ipv6 loop evdev ehci_hcd ohci_hcd
rtc_cmos rtc_core pcspkr rtc_lib p54pci usbcore via_rhine p54common
geode_aes mii [last unloaded: netconsole]
Pid: 0, comm: swapper Not tainted (2.6.28-rc6-wl #16)
EIP: 0060:[<d08260fa>] EFLAGS: 00010002 CPU: 0
EIP is at p54_assign_address+0x67/0x14b [p54common]
EAX: cf98b178 EBX: cf86ee40 ECX: 00000000 EDX: 00000000
ESI: 000000f8 EDI: 00000000 EBP: 0002027c ESP: c03f9c4c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c03f8000 task=c03c4380 task.ti=c03f8000)
Stack:
00000002 ce4d5880 ce4c48b4 cf86e1a0 00000000 00000038 00020200 00000286
cf86ee40 00000004 ce4d58b2 ce4d588c d0826fd7 00000090 014c48d4 ce4c48b4
cf86e1a0 0086ee40 00000004 02000282 ce4c48d4 cf86ef10 cf86ee40 ce4d5880
Call Trace:
[<d0826fd7>] p54_tx+0x416/0x482 [p54common]
[<c02fb7c2>] __ieee80211_tx+0x35/0xf8
[<c02fc235>] ieee80211_master_start_xmit+0x2ab/0x396
[<c01048d3>] common_interrupt+0x23/0x30
[<c0297368>] dev_hard_start_xmit+0x16e/0x1c9
[<c02a3518>] __qdisc_run+0xa2/0x15c
[<c0297796>] dev_queue_xmit+0x2f5/0x3c5
[<c02f8608>] ieee80211_invoke_rx_handlers+0x488/0x1486
[<c02d9d14>] bictcp_cong_avoid+0x10/0x160
[<c02bd904>] tcp_ack+0x16f0/0x1850
[<c01170f0>] enqueue_task_fair+0x12a/0x16b
[<c02c0c37>] tcp_current_mss+0x6b/0xe4
[<c02f9b50>] __ieee80211_rx_handle_packet+0x54a/0x56d
[<c02fa1fe>] __ieee80211_rx+0x491/0x4e3
[<c02ec95d>] ieee80211_tasklet_handler+0x60/0xd6
[<c011cfae>] tasklet_action+0x3e/0x64
[<c011d305>] __do_softirq+0x4a/0xbc
[<c011d399>] do_softirq+0x22/0x26
[<c011d44f>] irq_exit+0x25/0x55
[<c0105996>] do_IRQ+0x5a/0x6c
[<c01048d3>] common_interrupt+0x23/0x30
[<c0108743>] default_idle+0x25/0x38
[<c0102926>] cpu_idle+0x41/0x5b
Code: 0f 84 01 01 00 00 9c 8f 44 24 1c fa 8b 53 10 31 ff 89 6c 24 18
89 14 24 31 d2 eb 3f 8b 4c 24 10 83 c1 38 89 4c 24 14 8b 4c 24 10 <8b>
41 38 29 e8 85 d2 75 0d 39 f0 72 09 8b 51 04 29 f0 89 6c 24
EIP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common] SS:ESP 0068:c03f9c4c
Kernel panic - not syncing: Fatal exception in interrupt
2008/11/24 Christian Lamparter <[email protected]>:
> On Monday 24 November 2008 17:51:45 Stefan Steuerwald wrote:
>> Thanks again Christian and Johannes,
>>
>> from a first quick check
>> - set-and-clear.diff doesn't seem to change anything
>> - both patches together freeze my system
>>
>> I don't have a serial console on this embedded thing, so I don't know
>> its death poem yet. Let me set up my debug environment properly and
>> report back to you.
>
> This might not be necessary.
>
> Try this updated patch. And let us know if we no do the right thing.
>
> Regards,
> Chr
>
On Thursday 27 November 2008 15:13:54 Johannes Berg wrote:
> On Thu, 2008-11-27 at 15:05 +0100, Stefan Steuerwald wrote:
> > I'm unfamiliar with kernel dev procedure. I assume you guys are trying
> > to get confirmation for that and/or submit patches?
>
> I've already submitted my patch to John, I suppose Christian will submit
> his and add you as Tested-by.
p54-sta-flags-v3.diff will be submitted on friday/saturday (probably together
with the WEP/TKIP & CCMP offload patch, which could be very useful for
embedded systems accesspoints too)...
But I don't know what to do with set-and-clear.diff
(I guess it's still one of the three patches, or?, I've attached it again,
in case johannes missed it? ).
Is it really necessary or does your application work without it?
> > If you need me for
> > any of that, I'll be there anytime. Also, a hint on whether to send
> > pizzas or iPods would be appreciated ;-) .
Johannes,
Isn't there a donation page on the wiki yet? I thought there is such
a thing already?!
Regards,
Chr
On Thu, 2008-11-27 at 15:42 +0100, Christian Lamparter wrote:
> Isn't there a donation page on the wiki yet? I thought there is such
> a thing already?!
We have something for b43 in particular, and madwifi manages funds too,
but since we don't have anything "entity" wireless in general, no. I
don't even know what we'd use funds for.
johannes
The v3 patch (together with the other two) works flawlessly for 1.5
hours now, under app-level traffic.
I will remove 80211 debugging from the kernel and try again for a
longer period, but looking good.
I'm unfamiliar with kernel dev procedure. I assume you guys are trying
to get confirmation for that and/or submit patches? If you need me for
any of that, I'll be there anytime. Also, a hint on whether to send
pizzas or iPods would be appreciated ;-) .
Thank you all for this marathon of a helpdesk session!!!
Stefan.
(and - ahem - I'm one of those customers coming back for more...)
2008/11/27 Christian Lamparter <[email protected]>:
> On Thursday 27 November 2008 09:57:11 Stefan Steuerwald wrote:
>> At least I haven't observed the crash in the last 60 minutes, whereas
>> before it took only 1-2 minutes every time to turn it up.
>> Will test this all day.
>
> Finally!
>
>> The three patches mentioned before are applied, and my app-level
>> timeout is still gone, and the "dropped filtered TX" messages are gone
>> as well.
>>
>> Christian, should I actually test your p54-sta-flags-v3 patch?
>>
> Yes, of course... v1 and v2 had a stupid "full queue" bug in p54_tx/p54_tx_fill.
> So let me know if your ipod still works with v3.
>
> Regards,
> Chr
>
A-ha!
O-kay ;-)
I assume then that the last patch set that removes the problem for me
is perfectly valid, right?
Just that something changes the contents of my RAM. Let me memtest,
maybe recompile my kernel for 486. Other ideas?
Thank you for this!
I very much appreciate the time you people are putting into this thing.
Stefan.
2008/11/26 Christian Lamparter <[email protected]>:
> On Wednesday 26 November 2008 14:38:59 Stefan Steuerwald wrote:
>> console [netcon0] enabled
>> netconsole: network logging started
>> BUG: unable to handle kernel NULL pointer dereference at 00000038
>> IP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common]
>> *pde = 00000000
>> Oops: 0000 [#1]
>> last sysfs file: /sys/class/net/lo/operstate
>> Modules linked in: netconsole ipv6 loop evdev ehci_hcd ohci_hcd
>> rtc_cmos rtc_core pcspkr rtc_lib p54pci usbcore via_rhine p54common
>> geode_aes mii [last unloaded: netconsole]
>>
>> Pid: 0, comm: swapper Not tainted (2.6.28-rc6-wl #16)
>> EIP: 0060:[<d08260fa>] EFLAGS: 00010002 CPU: 0
>> EIP is at p54_assign_address+0x67/0x14b [p54common]
>> EAX: cf98b178 EBX: cf86ee40 ECX: 00000000 EDX: 00000000
>> ESI: 000000f8 EDI: 00000000 EBP: 0002027c ESP: c03f9c4c
>> DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
>> Process swapper (pid: 0, ti=c03f8000 task=c03c4380 task.ti=c03f8000)
>> Stack:
>> 00000002 ce4d5880 ce4c48b4 cf86e1a0 00000000 00000038 00020200 00000286
>> cf86ee40 00000004 ce4d58b2 ce4d588c d0826fd7 00000090 014c48d4 ce4c48b4
>> cf86e1a0 0086ee40 00000004 02000282 ce4c48d4 cf86ef10 cf86ee40 ce4d5880
>> Call Trace:
>> [<d0826fd7>] p54_tx+0x416/0x482 [p54common]
>> [<c02fb7c2>] __ieee80211_tx+0x35/0xf8
>> [<c02fc235>] ieee80211_master_start_xmit+0x2ab/0x396
>> [<c01048d3>] common_interrupt+0x23/0x30
>> [<c0297368>] dev_hard_start_xmit+0x16e/0x1c9
>> [<c02a3518>] __qdisc_run+0xa2/0x15c
>> [<c0297796>] dev_queue_xmit+0x2f5/0x3c5
>> [<c02f8608>] ieee80211_invoke_rx_handlers+0x488/0x1486
>> [<c02d9d14>] bictcp_cong_avoid+0x10/0x160
>> [<c02bd904>] tcp_ack+0x16f0/0x1850
>> [<c01170f0>] enqueue_task_fair+0x12a/0x16b
>> [<c02c0c37>] tcp_current_mss+0x6b/0xe4
>> [<c02f9b50>] __ieee80211_rx_handle_packet+0x54a/0x56d
>> [<c02fa1fe>] __ieee80211_rx+0x491/0x4e3
>> [<c02ec95d>] ieee80211_tasklet_handler+0x60/0xd6
>> [<c011cfae>] tasklet_action+0x3e/0x64
>> [<c011d305>] __do_softirq+0x4a/0xbc
>> [<c011d399>] do_softirq+0x22/0x26
>> [<c011d44f>] irq_exit+0x25/0x55
>> [<c0105996>] do_IRQ+0x5a/0x6c
>> [<c01048d3>] common_interrupt+0x23/0x30
>> [<c0108743>] default_idle+0x25/0x38
>> [<c0102926>] cpu_idle+0x41/0x5b
>> Code: 0f 84 01 01 00 00 9c 8f 44 24 1c fa 8b 53 10 31 ff 89 6c 24 18
>> 89 14 24 31 d2 eb 3f 8b 4c 24 10 83 c1 38 89 4c 24 14 8b 4c 24 10 <8b>
>> 41 38 29 e8 85 d2 75 0d 39 f0 72 09 8b 51 04 29 f0 89 6c 24
>> EIP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common] SS:ESP 0068:c03f9c4c
>> Kernel panic - not syncing: Fatal exception in interrupt
>>
> wt*, this bug is "impossible":
>
> The bug happens when p54_assign_address looks for a free space for a new frame:
> here's the code:
> [...]
> if (!skb)
> return -EINVAL; <--- we don't accept "null" skbs
>
> spin_lock_irqsave(&priv->tx_queue.lock, flags); <--- we are under a spin_lock with irq disabled
> left = skb_queue_len(&priv->tx_queue);
> while (left--) {
> u32 hole_size;
> info = IEEE80211_SKB_CB(entry); <--- Here it BUGs,
> [...]
>
> your binary module said that skb->cb is at 0x38,
> so our "entry" is really NULL right when it BUGS.
> And this only happens means that the queue was
> modified "outside" of our driver.
>
> Since we always take the spin_lock_irqsave (of course,
> only of "our" tx_queue). if we need to do anything with the data in the queue,
>
> Of course, since the package as queued while the station was sleeping
> somewhere mac80211, so maybe it still holds a reference to, but then
> other drivers would have already spotted this misbehaviour long time ago...
>
> So? back to square one... I guess.
>
> Regards,
> Chr
>
On Thu, 2008-11-27 at 15:05 +0100, Stefan Steuerwald wrote:
> The v3 patch (together with the other two) works flawlessly for 1.5
> hours now, under app-level traffic.
> I will remove 80211 debugging from the kernel and try again for a
> longer period, but looking good.
:)
> I'm unfamiliar with kernel dev procedure. I assume you guys are trying
> to get confirmation for that and/or submit patches?
I've already submitted my patch to John, I suppose Christian will submit
his and add you as Tested-by.
> If you need me for
> any of that, I'll be there anytime. Also, a hint on whether to send
> pizzas or iPods would be appreciated ;-) .
>
> Thank you all for this marathon of a helpdesk session!!!
> Stefan.
>
> (and - ahem - I'm one of those customers coming back for more...)
heh :)
johannes
On Wednesday 26 November 2008 14:38:59 Stefan Steuerwald wrote:
> console [netcon0] enabled
> netconsole: network logging started
> BUG: unable to handle kernel NULL pointer dereference at 00000038
> IP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common]
> *pde = 00000000
> Oops: 0000 [#1]
> last sysfs file: /sys/class/net/lo/operstate
> Modules linked in: netconsole ipv6 loop evdev ehci_hcd ohci_hcd
> rtc_cmos rtc_core pcspkr rtc_lib p54pci usbcore via_rhine p54common
> geode_aes mii [last unloaded: netconsole]
>
> Pid: 0, comm: swapper Not tainted (2.6.28-rc6-wl #16)
> EIP: 0060:[<d08260fa>] EFLAGS: 00010002 CPU: 0
> EIP is at p54_assign_address+0x67/0x14b [p54common]
> EAX: cf98b178 EBX: cf86ee40 ECX: 00000000 EDX: 00000000
> ESI: 000000f8 EDI: 00000000 EBP: 0002027c ESP: c03f9c4c
> DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
> Process swapper (pid: 0, ti=c03f8000 task=c03c4380 task.ti=c03f8000)
> Stack:
> 00000002 ce4d5880 ce4c48b4 cf86e1a0 00000000 00000038 00020200 00000286
> cf86ee40 00000004 ce4d58b2 ce4d588c d0826fd7 00000090 014c48d4 ce4c48b4
> cf86e1a0 0086ee40 00000004 02000282 ce4c48d4 cf86ef10 cf86ee40 ce4d5880
> Call Trace:
> [<d0826fd7>] p54_tx+0x416/0x482 [p54common]
> [<c02fb7c2>] __ieee80211_tx+0x35/0xf8
> [<c02fc235>] ieee80211_master_start_xmit+0x2ab/0x396
> [<c01048d3>] common_interrupt+0x23/0x30
> [<c0297368>] dev_hard_start_xmit+0x16e/0x1c9
> [<c02a3518>] __qdisc_run+0xa2/0x15c
> [<c0297796>] dev_queue_xmit+0x2f5/0x3c5
> [<c02f8608>] ieee80211_invoke_rx_handlers+0x488/0x1486
> [<c02d9d14>] bictcp_cong_avoid+0x10/0x160
> [<c02bd904>] tcp_ack+0x16f0/0x1850
> [<c01170f0>] enqueue_task_fair+0x12a/0x16b
> [<c02c0c37>] tcp_current_mss+0x6b/0xe4
> [<c02f9b50>] __ieee80211_rx_handle_packet+0x54a/0x56d
> [<c02fa1fe>] __ieee80211_rx+0x491/0x4e3
> [<c02ec95d>] ieee80211_tasklet_handler+0x60/0xd6
> [<c011cfae>] tasklet_action+0x3e/0x64
> [<c011d305>] __do_softirq+0x4a/0xbc
> [<c011d399>] do_softirq+0x22/0x26
> [<c011d44f>] irq_exit+0x25/0x55
> [<c0105996>] do_IRQ+0x5a/0x6c
> [<c01048d3>] common_interrupt+0x23/0x30
> [<c0108743>] default_idle+0x25/0x38
> [<c0102926>] cpu_idle+0x41/0x5b
> Code: 0f 84 01 01 00 00 9c 8f 44 24 1c fa 8b 53 10 31 ff 89 6c 24 18
> 89 14 24 31 d2 eb 3f 8b 4c 24 10 83 c1 38 89 4c 24 14 8b 4c 24 10 <8b>
> 41 38 29 e8 85 d2 75 0d 39 f0 72 09 8b 51 04 29 f0 89 6c 24
> EIP: [<d08260fa>] p54_assign_address+0x67/0x14b [p54common] SS:ESP 0068:c03f9c4c
> Kernel panic - not syncing: Fatal exception in interrupt
>
wt*, this bug is "impossible":
The bug happens when p54_assign_address looks for a free space for a new frame:
here's the code:
[...]
if (!skb)
return -EINVAL; <--- we don't accept "null" skbs
spin_lock_irqsave(&priv->tx_queue.lock, flags); <--- we are under a spin_lock with irq disabled
left = skb_queue_len(&priv->tx_queue);
while (left--) {
u32 hole_size;
info = IEEE80211_SKB_CB(entry); <--- Here it BUGs,
[...]
your binary module said that skb->cb is at 0x38,
so our "entry" is really NULL right when it BUGS.
And this only happens means that the queue was
modified "outside" of our driver.
Since we always take the spin_lock_irqsave (of course,
only of "our" tx_queue). if we need to do anything with the data in the queue,
Of course, since the package as queued while the station was sleeping
somewhere mac80211, so maybe it still holds a reference to, but then
other drivers would have already spotted this misbehaviour long time ago...
So? back to square one... I guess.
Regards,
Chr