2008-11-06 15:34:17

by Frank Seidel

Subject: Problem with Kernel Oops in ipw2200

Hi,

after getting this bug report
https://bugzilla.novell.com/show_bug.cgi?id=397390
I tried to reproduce it and got a kernel Oops
in ipw2200 (ipw_tx_skb) after about 15 minutes of pinging
through a WPA enterprise connection.

After some tests I came up with the attached patch, which
fixes the issue for me here but is probably the wrong
way to go, as this is my first attempt in the wireless network
area.
Every suggestion is much appreciated.

Thanks,
Frank


Attachments:
ipw2200_panic_fix.patch (2.07 kB)

2008-11-18 19:31:16

by John W. Linville

Subject: Re: [PATCH] Re: Problem with Kernel Oops in ipw2200

On Mon, Nov 17, 2008 at 11:48:19AM +0100, Frank Seidel wrote:
> From: Zhu Yi <[email protected]>
>
> Fixes Oops in ipw2200:ipw_tx_skb when pinging through
> a WPA enterprise connection.
>
> Signed-off-by: Zhu Yi <[email protected]>
> Tested-by: Frank Seidel <[email protected]>
> Signed-off-by: Frank Seidel <[email protected]>
>
> ---
> drivers/net/wireless/ipw2200.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> --- a/drivers/net/wireless/ipw2200.c
> +++ b/drivers/net/wireless/ipw2200.c
> @@ -10190,6 +10190,11 @@ static int ipw_tx_skb(struct ipw_priv *p
> u16 remaining_bytes;
> int fc;
>
> + if (!(priv->status & STATUS_ASSOCIATED)) {
> + IPW_DEBUG_TX("Tx attempt while not associated.\n");
> + goto drop;
> + }
> +
> hdr_len = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl));
> switch (priv->ieee->iw_mode) {
> case IW_MODE_ADHOC:
>

Well, I'm sorry to be a PITA...but the changelog doesn't really
explain how the patch works or what it is doing. Also, this still
seems to me more like a band-aid than a real fix...?

John
--
John W. Linville Linux should be at the core
[email protected] of your literate lifestyle.

2008-11-14 10:07:48

by Frank Seidel

Subject: Re: Problem with Kernel Oops in ipw2200

Zhu Yi wrote:
> May I have your oops log?

Yes, sure, see below

> Please check whether the attached patch fixes the problem.

IPW_DEBUG is undeclared, but apart from that I'm currently testing
your patch, thanks a lot!

Kernel Oops message:
BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<f92f35df>] :ipw2200:ipw_tx_skb+0x1b8/0x5df
*pde = 00000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BATA/energy_full
Modules linked in: michael_mic arc4 ecb ieee80211_crypt_tkip aes_i586
crypto_blkcipher aes_generic ieee80211_crypt_ccmp af_packet binfmt_misc
snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device i915 drm ipv6
cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq
speedstep_lib fuse loop dm_mod pcmcia snd_intel8x0 ipw2200(N)
snd_ac97_codec yenta_socket ac97_bus ieee80211 iTCO_wdt rsrc_nonstatic
ohci1394 snd_pcm snd_timer ieee80211_crypt iTCO_vendor_support
pcmcia_core r8169 ieee1394 snd ide_cd_mod i2c_i801 intel_agp video
container soundcore rtc_cmos output rtc_core i2c_core button battery ac
snd_page_alloc rtc_lib agpgart pcspkr cdrom shpchp serio_raw joydev
pci_hotplug ide_disk uhci_hcd ehci_hcd usbcore edd ext3 mbcache jbd fan
ide_pci_generic ata_generic ata_piix pata_acpi libata scsi_mod dock piix
ide_core thermal processor thermal_sys hwmon
Supported: No

Pid: 3405, comm: ping Tainted: G (2.6.27.4-2-default #1)
EIP: 0060:[<f92f35df>] EFLAGS: 00010046 CPU: 0
EIP is at ipw_tx_skb+0x1b8/0x5df [ipw2200]
EAX: 00000000 EBX: 00000000 ECX: f781a400 EDX: 00000000
ESI: f781a400 EDI: 00000000 EBP: 00000000 ESP: ecbadaec
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process ping (pid: 3405, ti=ecbac000 task=ecab5180 task.ti=ecbac000)
Stack: 00000000 00000000 41081600 c2860b00 9f24d2e7 52d2bbd9 f01bc1c0 0000000c
f037d400 f781a400 f78b8c5c 00000000 f7795010 f01bc1f4 01f3b860 0000005c
0000c1f4 00000001 00000018 00000000 0000000c f78b8c5c f781a400 00000000
Call Trace:
[<f92f441d>] ipw_net_hard_start_xmit+0x47/0x66 [ipw2200]
[<f92b5d69>] ieee80211_xmit+0x939/0x99c [ieee80211]
[<c02dd75d>] dev_hard_start_xmit+0xf0/0x156
[<c02ebf42>] __qdisc_run+0xa6/0x1ad
[<c02ddab8>] dev_queue_xmit+0x20b/0x343
[<c02fd90b>] ip_finish_output+0x1d6/0x20f
[<c02fcc75>] ip_local_out+0x15/0x17
[<c02fceca>] ip_push_pending_frames+0x253/0x2ae
[<c0313ba7>] raw_sendmsg+0x37f/0x3f3
[<c031ad82>] inet_sendmsg+0x3b/0x45
[<c02d2482>] sock_sendmsg+0xc9/0xe4
[<c02d2627>] sys_sendmsg+0x18a/0x1eb
[<c02d392b>] sys_socketcall+0x241/0x290
[<c0103cbb>] sysenter_do_call+0x12/0x2f
[<ffffe430>] 0xffffe430
=======================
Code: 24 44 8b 4c 24 24 48 89 44 24 2c 6b c0 28 03 44 24 28 8b 90 ac 04 00 00 89 d5 c1 e5 07 03 a8 c8 04 00 00 8b 80 cc 04 00 00 89 ef <89> 0c 90 b9 20 00 00 00 31 c0 f3 ab 88 5d 08 c6 45 00 00 c6 45
EIP: [<f92f35df>] ipw_tx_skb+0x1b8/0x5df [ipw2200] SS:ESP 0068:ecbadaec
Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------
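
The "Oops: 0002" line in the trace above carries the x86 page-fault
error code, and its low three bits describe the access that faulted.
The following is a minimal userspace sketch of how those bits are read.
The names pf_error, decode_pf_error and describe_pf_error are made up
for illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Low three bits of the x86 page-fault error code shown after "Oops:". */
struct pf_error {
    bool protection; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = fault raised in user mode, 0 = kernel mode */
};

/* Split the error code into its three low flag bits. */
struct pf_error decode_pf_error(unsigned int code)
{
    struct pf_error e;

    e.protection = (code & 0x1u) != 0;
    e.write = (code & 0x2u) != 0;
    e.user = (code & 0x4u) != 0;
    return e;
}

/* Write a one-line human-readable summary of the error code into buf. */
void describe_pf_error(unsigned int code, char *buf, size_t len)
{
    struct pf_error e = decode_pf_error(code);

    assert(buf != NULL);
    snprintf(buf, len, "%s access, %s, %s mode",
             e.write ? "write" : "read",
             e.protection ? "protection violation" : "page not present",
             e.user ? "user" : "kernel");
}
```

Passing 0x0002 to decode_pf_error() gives a write to a not-present page
from kernel mode, which is consistent with the NULL pointer dereference
at address 00000000 reported in the first line of the trace.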

2008-11-14 14:46:11

by Frank Seidel

Subject: Re: Problem with Kernel Oops in ipw2200

Zhu Yi wrote:
> Please check whether the attached patch fixes the problem.

Unfortunately this bug isn't that easy to reproduce, which
is why my tests took so long.
But I can finally say that this if clause really is hit and
prevents the oops!
Thank you very much!
So, can this patch please be taken?

Thanks,
Frank

2008-11-18 19:34:17

by Reinette Chatre

Subject: Re: [PATCH] Re: Problem with Kernel Oops in ipw2200

On Tue, 2008-11-18 at 11:23 -0800, John W. Linville wrote:
> On Mon, Nov 17, 2008 at 11:48:19AM +0100, Frank Seidel wrote:
> > From: Zhu Yi <[email protected]>
> >
> > Fixes Oops in ipw2200:ipw_tx_skb when pinging through
> > a WPA enterprise connection.
> >
> > Signed-off-by: Zhu Yi <[email protected]>
> > Tested-by: Frank Seidel <[email protected]>
> > Signed-off-by: Frank Seidel <[email protected]>
> >
> > ---
> > drivers/net/wireless/ipw2200.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > --- a/drivers/net/wireless/ipw2200.c
> > +++ b/drivers/net/wireless/ipw2200.c
> > @@ -10190,6 +10190,11 @@ static int ipw_tx_skb(struct ipw_priv *p
> > u16 remaining_bytes;
> > int fc;
> >
> > + if (!(priv->status & STATUS_ASSOCIATED)) {
> > + IPW_DEBUG_TX("Tx attempt while not associated.\n");
> > + goto drop;
> > + }
> > +
> > hdr_len = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl));
> > switch (priv->ieee->iw_mode) {
> > case IW_MODE_ADHOC:
> >
>
> Well, I'm sorry to be a PITA...but the changelog doesn't really
> explain how the patch works or what it is doing. Also, this still
> seems to me more like a band-aid than a real fix...?

John,

The commit message could be changed to:

"This patch fixes the ipw2200 kernel oops caused by sending frames
while not in associated state."

Will this be sufficient?

Reinette


2008-11-14 02:59:05

by Zhu Yi

Subject: Re: Problem with Kernel Oops in ipw2200

On Thu, 2008-11-13 at 17:16 +0800, Frank Seidel wrote:
> The problem is that before this patch ipw2200 did not have this problem,
> and now with it ipw2200 constantly runs into a kernel oops on various
> machines, at ipw_tx_skb called by ipw_net_hard_start_xmit.
> So IMHO that patch introduced the bug somehow, and even partly
> reverting it fixes the problem here on my test machine.

May I have your oops log?

Please check whether the attached patch fixes the problem.

Thanks,
-yi


Attachments:
ipw2200-send-noassoc.patch (548.00 B)

2008-11-06 15:47:04

by Frank Seidel

Subject: Re: Problem with Kernel Oops in ipw2200

Frank Seidel schrieb:
> Every suggestion is much appreciated.

My first search through the ipw2200 logs pointed
me to David's last commit (521c4d96e0840ecce25b956e00f416ed499ef2ba),
and indeed, after reverting it the problem disappeared here.
The patch from my last post was the smallest subset of
the revert that still works for me here.
Any comments?

2008-11-13 16:16:19

by John W. Linville

Subject: Re: Problem with Kernel Oops in ipw2200

On Thu, Nov 13, 2008 at 10:16:18AM +0100, Frank Seidel wrote:
> Hi,
>
> John W. Linville schrieb:
> > On Thu, Nov 06, 2008 at 04:46:59PM +0100, Frank Seidel wrote:
> >> Frank Seidel schrieb:
> >>> Every suggestion is much appreciated.
> >> My first search through the ipw2200 logs pointed
> >> me to David's last commit (521c4d96e0840ecce25b956e00f416ed499ef2ba),
> >> and indeed, after reverting it the problem disappeared here.
> >> The patch from my last post was the smallest subset of
> >> the revert that still works for me here.
> >> Any comments?
> >
> > I think it would be better to make sure that ipw2200 (and/or
> > ieee80211) is properly using netif_carrier_{on,off}() instead?
>
> Thanks for your answer, but I am not sure I understand you correctly.
> Do you propose leaving the patch in place as it is?
> The problem is that before this patch ipw2200 did not have this problem,
> and now with it ipw2200 constantly runs into a kernel oops on various
> machines, at ipw_tx_skb called by ipw_net_hard_start_xmit.
> So IMHO that patch introduced the bug somehow, and even partly reverting
> it fixes the problem here on my test machine.

Well, what I am suggesting is that it would be better to fix the
intended purpose of the original patch rather than to partially
revert it, thereby partially reintroducing the problem it was trying
to fix. :-)
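
The carrier-based approach suggested here can be pictured with a small
userspace model. This is only a sketch under assumptions: struct
model_dev and the model_* functions are hypothetical stand-ins, not the
kernel's struct net_device or netif_carrier_{on,off}(), and the real
stack reacts to a carrier change asynchronously rather than in the same
call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device; not struct net_device. */
struct model_dev {
    bool associated;          /* driver's own view of the link */
    bool carrier_ok;          /* what the driver has reported to the stack */
    unsigned long tx_sent;    /* frames handed to the driver */
    unsigned long tx_dropped; /* frames the stack dropped before the driver */
};

/* Association completed: update driver state and report carrier on. */
void model_link_up(struct model_dev *dev)
{
    assert(dev != NULL);
    dev->associated = true;
    dev->carrier_ok = true;
}

/* Association lost: update driver state and report carrier off. */
void model_link_down(struct model_dev *dev)
{
    assert(dev != NULL);
    dev->associated = false;
    dev->carrier_ok = false;
}

/*
 * Models the stack side: a frame only reaches the driver's transmit
 * path while the carrier is reported on. Returns true if it was sent.
 */
bool model_stack_xmit(struct model_dev *dev)
{
    assert(dev != NULL);
    if (!dev->carrier_ok) {
        dev->tx_dropped++;
        return false;
    }
    dev->tx_sent++;
    return true;
}
```

The point of the model is that the two flags are always changed
together, so the transmit path is never entered while the device is
not associated and no extra check is needed inside it.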

John
--
John W. Linville Linux should be at the core
[email protected] of your literate lifestyle.

2008-11-21 19:39:12

by Reinette Chatre

Subject: Re: [PATCH] Re: Problem with Kernel Oops in ipw2200

On Mon, 2008-11-17 at 02:48 -0800, Frank Seidel wrote:
> From: Zhu Yi <[email protected]>
>
> Fixes Oops in ipw2200:ipw_tx_skb when pinging through
> a WPA enterprise connection.
>
> Signed-off-by: Zhu Yi <[email protected]>
> Tested-by: Frank Seidel <[email protected]>
> Signed-off-by: Frank Seidel <[email protected]>
>
> ---
> drivers/net/wireless/ipw2200.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> --- a/drivers/net/wireless/ipw2200.c
> +++ b/drivers/net/wireless/ipw2200.c
> @@ -10190,6 +10190,11 @@ static int ipw_tx_skb(struct ipw_priv *p
> u16 remaining_bytes;
> int fc;
>
> + if (!(priv->status & STATUS_ASSOCIATED)) {
> + IPW_DEBUG_TX("Tx attempt while not associated.\n");
> + goto drop;
> + }
> +
> hdr_len = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl));
> switch (priv->ieee->iw_mode) {
> case IW_MODE_ADHOC:

Frank,

The patch above was not accepted. Would it be possible for you to try
the patch below instead? Thank you very much.

diff --git a/drivers/net/wireless/ipw2x00/ipw2200.c b/drivers/net/wireless/ipw2x00/ipw2200.c
index c73173a..768bbde 100644
--- a/drivers/net/wireless/ipw2x00/ipw2200.c
+++ b/drivers/net/wireless/ipw2x00/ipw2200.c
@@ -189,6 +189,7 @@ static int ipw_up(struct ipw_priv *);
static void ipw_bg_up(struct work_struct *work);
static void ipw_down(struct ipw_priv *);
static void ipw_bg_down(struct work_struct *work);
+static void ipw_link_down(struct ipw_priv *priv);
static int ipw_config(struct ipw_priv *);
static int init_supported_rates(struct ipw_priv *priv,
struct ipw_supported_rates *prates);
@@ -3897,6 +3898,7 @@ static int ipw_disassociate(void *data)
if (!(priv->status & (STATUS_ASSOCIATED | STATUS_ASSOCIATING)))
return 0;
ipw_send_disassociate(data, 0);
+ ipw_link_down(data);
return 1;
}






2008-11-24 12:45:12

by Frank Seidel

Subject: Re: [PATCH] Re: Problem with Kernel Oops in ipw2200

Hi,

Reinette Chatre wrote:
> Frank,
>
> The patch above was not accepted. Would it be possible for you to try
> the patch below instead? Thank you very much.

Sadly, this patch doesn't seem to fix the issue the way the last one did.
With it applied I run into the following Oops:

BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<f92405bf>] :ipw2200:ipw_net_hard_start_xmit+0x428/0x875
*pde = 00000000
Oops: 0002 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:02:03.0/rf_kill
Modules linked in: aes_i586(N) crypto_blkcipher(N) aes_generic(N) ieee80211_crypt_ccmp(N) af_packet(N) microcode(N) binfmt_misc(N) cpufreq_conservative(N) cpufreq_userspace(N) cpufreq_powersave(N) acpi_cpufreq(N) speedstep_lib(N) snd_pcm_oss(N) snd_mixer_oss(N) snd_seq(N) snd_seq_device(N) ipv6(N) fuse(N) loop(N) dm_mod(N) snd_intel8x0(N) pcmcia(N) ipw2200(N) snd_ac97_codec(N) ieee80211(N) yenta_socket(N) ohci1394(N) ac97_bus(N) ieee80211_crypt(N) r8169(N) rsrc_nonstatic(N) ieee1394(N) snd_pcm(N) pcmcia_core(N) snd_timer(N) shpchp(N) snd(N) video(N) rtc_cmos(N) pci_hotplug(N) i2c_i801(N) intel_agp(N) container(N) soundcore(N) sr_mod(N) iTCO_wdt(N) rtc_core(N) output(N) battery(N) ac(N) rtc_lib(N) serio_raw(N) joydev(N) pcspkr(N) i2c_core(N) agpgart(N) cdrom(N) button(N) snd_page_alloc(N) iTCO_vendor_support(N) sg(N) sd_mod(N) crc_t10dif(N) ehci_hcd(N) uhci_hcd(N) usbcore(N) edd(N) ext3(N) mbcache(N) jbd(N) fan(N) ide_pci_generic(N) piix(N) ide_core(N) ata_generic(N) ata_piix(N) libata(N) scsi_mod(N) dock(N) thermal(N) processor(N) thermal_sys(N) hwmon(N)
Supported: No

Pid: 3538, comm: ping Tainted: G (2.6.27.5-2-default #1)
EIP: 0060:[<f92405bf>] EFLAGS: 00010046 CPU: 0
EIP is at ipw_net_hard_start_xmit+0x428/0x875 [ipw2200]
EAX: 00000000 EBX: 00000000 ECX: f7ac7c60 EDX: 00000000
ESI: f79ba480 EDI: 00000000 EBP: 00000000 ESP: ecef1ab0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process ping (pid: 3538, ti=ecef0000 task=f7507000 task.ti=ecef0000)
Stack: f7534b04 ecd5fc00 f8e986f6 ecd5fc00 f7534b24 f7534af4 f8eb9181 00529280
00000000 00000000 f79ba000 f7ac7c60 f79bac60 f79bac5c 00000282 0000000c
ecd5fc00 f7abd080 00000018 00000001 00004b14 f7534af4 f8eb9860 0001005c
Call Trace:
[<f91e1d59>] ieee80211_xmit+0x9f0/0xa63 [ieee80211]
[<c02e3474>] dev_hard_start_xmit+0x1e0/0x247
[<c02f1e91>] __qdisc_run+0xa6/0x1ad
[<c02e3920>] dev_queue_xmit+0x35b/0x493
[<c03037b1>] ip_finish_output+0x1d6/0x20f
[<c0302b1b>] ip_local_out+0x15/0x17
[<c0302d70>] ip_push_pending_frames+0x253/0x2ae
[<c03198ad>] raw_sendmsg+0x5ff/0x677
[<c0320a82>] inet_sendmsg+0x3b/0x45
[<c02d855a>] sock_sendmsg+0xc9/0xe4
[<c02d86ff>] sys_sendmsg+0x18a/0x1eb
[<c02d9a03>] sys_socketcall+0x241/0x290
[<c0104c8b>] sysenter_do_call+0x12/0x2f
[<ffffe430>] 0xffffe430
=======================
Code: 24 4c 8b 4c 24 2c 48 89 44 24 64 6b c0 28 03 44 24 34 8b 90 ac 04 00 00 89 d5 c1 e5 07 03 a8 c8 04 00 00 8b 80 cc 04 00 00 89 ef <89> 0c 90 b9 20 00 00 00 31 c0 f3 ab 88 5d 08 c6 45 00 00 c6 45
EIP: [<f92405bf>] ipw_net_hard_start_xmit+0x428/0x875 [ipw2200] SS:ESP 0068:ecef1ab0
Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------
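
In the Code: line of the trace above, the byte in angle brackets marks
where the faulting instruction starts, so the instruction begins with
the bytes 89 0c 90. The sketch below decodes just that one encoding
form (opcode 89 with a SIB byte and no displacement) in userspace C. The
names mov_store, decode_mov_store_sib and effective_address are
hypothetical helpers written for this illustration, not part of any
existing disassembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit register numbering used by ModRM and SIB fields. */
const char *const reg32_names[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* A decoded "mov %src,(%base,%index,scale)" store. */
struct mov_store {
    int src_reg;   /* ModRM.reg: register being stored */
    int base_reg;  /* SIB.base */
    int index_reg; /* SIB.index */
    int scale;     /* 1, 2, 4 or 8 */
};

/*
 * Decode only "89 /r" with mod=00 and rm=100 (SIB follows, no
 * displacement). Returns 0 on success, -1 for anything else.
 */
int decode_mov_store_sib(const uint8_t *insn, size_t len, struct mov_store *out)
{
    uint8_t modrm, sib;

    assert(insn != NULL && out != NULL);
    if (len < 3 || insn[0] != 0x89)
        return -1;
    modrm = insn[1];
    sib = insn[2];
    if ((modrm >> 6) != 0 || (modrm & 7) != 4)
        return -1;
    /* base=101 (disp32, no base) and index=100 (no index) are not handled */
    if ((sib & 7) == 5 || ((sib >> 3) & 7) == 4)
        return -1;
    out->src_reg = (modrm >> 3) & 7;
    out->scale = 1 << (sib >> 6);
    out->index_reg = (sib >> 3) & 7;
    out->base_reg = sib & 7;
    return 0;
}

/* Compute base + index * scale from a register snapshot. */
uint32_t effective_address(const struct mov_store *m, const uint32_t regs[8])
{
    assert(m != NULL && regs != NULL);
    return regs[m->base_reg] + regs[m->index_reg] * (uint32_t)m->scale;
}
```

For the bytes 89 0c 90 this yields a store of ECX to (EAX + EDX*4). With
EAX and EDX both 00000000 in the register dump, the target address is 0,
which matches the reported NULL pointer write. Which driver field the
NULL base pointer corresponds to is not established in this thread.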



2008-11-12 21:46:12

by John W. Linville

Subject: Re: Problem with Kernel Oops in ipw2200

On Thu, Nov 06, 2008 at 04:46:59PM +0100, Frank Seidel wrote:
> Frank Seidel schrieb:
> > Every suggestion is much appreciated.
>
> My first search through the ipw2200 logs pointed
> me to David's last commit (521c4d96e0840ecce25b956e00f416ed499ef2ba),
> and indeed, after reverting it the problem disappeared here.
> The patch from my last post was the smallest subset of
> the revert that still works for me here.
> Any comments?

I think it would be better to make sure that ipw2200 (and/or
ieee80211) is properly using netif_carrier_{on,off}() instead?

Thanks,

John
--
John W. Linville Linux should be at the core
[email protected] of your literate lifestyle.

2008-11-13 09:16:30

by Frank Seidel

Subject: Re: Problem with Kernel Oops in ipw2200

Hi,

John W. Linville schrieb:
> On Thu, Nov 06, 2008 at 04:46:59PM +0100, Frank Seidel wrote:
>> Frank Seidel schrieb:
>>> Every suggestion is much appreciated.
>> My first search through the ipw2200 logs pointed
>> me to David's last commit (521c4d96e0840ecce25b956e00f416ed499ef2ba),
>> and indeed, after reverting it the problem disappeared here.
>> The patch from my last post was the smallest subset of
>> the revert that still works for me here.
>> Any comments?
>
> I think it would be better to make sure that ipw2200 (and/or
> ieee80211) is properly using netif_carrier_{on,off}() instead?

Thanks for your answer, but I am not sure I understand you correctly.
Do you propose leaving the patch in place as it is?
The problem is that before this patch ipw2200 did not have this problem,
and now with it ipw2200 constantly runs into a kernel oops on various
machines, at ipw_tx_skb called by ipw_net_hard_start_xmit.
So IMHO that patch introduced the bug somehow, and even partly reverting
it fixes the problem here on my test machine.

Thanks for any help,
Frank

2008-11-17 10:48:37

by Frank Seidel

Subject: [PATCH] Re: Problem with Kernel Oops in ipw2200

From: Zhu Yi <[email protected]>

Fixes Oops in ipw2200:ipw_tx_skb when pinging through
a WPA enterprise connection.

Signed-off-by: Zhu Yi <[email protected]>
Tested-by: Frank Seidel <[email protected]>
Signed-off-by: Frank Seidel <[email protected]>

---
drivers/net/wireless/ipw2200.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/drivers/net/wireless/ipw2200.c
+++ b/drivers/net/wireless/ipw2200.c
@@ -10190,6 +10190,11 @@ static int ipw_tx_skb(struct ipw_priv *p
u16 remaining_bytes;
int fc;

+ if (!(priv->status & STATUS_ASSOCIATED)) {
+ IPW_DEBUG_TX("Tx attempt while not associated.\n");
+ goto drop;
+ }
+
hdr_len = ieee80211_get_hdrlen(le16_to_cpu(hdr->frame_ctl));
switch (priv->ieee->iw_mode) {
case IW_MODE_ADHOC:
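
The guard added by the patch above is a plain bitmask test on the
driver's status word, followed by an early drop. A minimal userspace
sketch of that pattern follows. Everything prefixed model_ or MODEL_ is
hypothetical, and the bit positions are assumed for illustration only;
the real values live in the driver's header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MODEL_STATUS_ASSOCIATING (1u << 6)
#define MODEL_STATUS_ASSOCIATED  (1u << 7)

enum model_tx_result { MODEL_TX_QUEUED = 0, MODEL_TX_DROPPED = 1 };

/* Hypothetical stand-in for the driver's private state. */
struct model_priv {
    uint32_t status;       /* bitmask of MODEL_STATUS_* flags */
    unsigned long queued;  /* frames accepted for transmission */
    unsigned long dropped; /* frames rejected by the guard */
};

/*
 * Mirrors the shape of the patched transmit path: drop the frame
 * before touching any queue state unless the ASSOCIATED bit is set.
 */
enum model_tx_result model_tx_skb(struct model_priv *priv)
{
    assert(priv != NULL);
    if (!(priv->status & MODEL_STATUS_ASSOCIATED)) {
        priv->dropped++;
        return MODEL_TX_DROPPED;
    }
    priv->queued++;
    return MODEL_TX_QUEUED;
}
```

Note that a device that is still associating (only the ASSOCIATING bit
set) is also rejected by this test, since the check looks at the
ASSOCIATED bit alone.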