2009-01-13 03:13:00

by Deuce

Subject: Re: kernel BUG at drivers/net/wireless/iwlwifi/iwl3945-base.c:3127!





----- Original Message ----
> From: Samuel Ortiz <[email protected]>
> To: Deuce <[email protected]>
> Cc: reinette chatre <[email protected]>; "[email protected]" <[email protected]>
> Sent: Monday, January 12, 2009 1:38:08 PM
> Subject: Re: kernel BUG at drivers/net/wireless/iwlwifi/iwl3945-base.c:3127!
>
> Hi Jason,
>
> On Fri, Jan 09, 2009 at 03:07:01PM -0800, Deuce wrote:
> > > From: reinette chatre
> >
> > > On Thu, 2009-01-08 at 19:28 -0800, Deuce wrote:
> > > > > Kernel BUG in iwl3945 with 20090107 wireless-testing and firmware 15.28.2.8
> > > > > The Microcode SW error detected seems to be the beginning of the end. An
> > > > > attempt with Ubuntu's distributed iwlwifi-3945-1.ucode firmware was not
> > > > > successful either (I do not know the version).
> > > > >
> > > > > The BUG happens a short period after logging in when Netmanager starts to
> > > > > scan and attempt to associate. Association never completes.
> > > > >
> > > > > Curiously, the bug was not triggered the first time I finally booted up
> > > > > with iwl3945 debug=0x43fff and netconsole functioning. However it was
> > > > > immediately triggered on a subsequent reboot. The first try may have been
> > > > > a warm reboot vs. a cold reboot.
> > > > >
> > > > > Below is the dmesg output without debug. Attached is a full dmesg output
> > > > > with debug=0x43fff.
> > >
> > > There appears to be a few things going on here. I am still investigating
> > > the firmware error, but we could start with something that will not let
> > > your machine crash and get us some more information about one of the
> > > issues.
> > >
> > > Could you please try with this patch? Please do run your test with
> > > debugging enabled as you have done before. Thank you very much.
> > >
> > > > diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
> > > > index a23d51d..09c1c8d 100644
> > > > --- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
> > > > +++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
> > > > @@ -3118,7 +3118,14 @@ static void iwl3945_tx_cmd_complete(struct iwl_priv *priv,
> > > >          int cmd_index;
> > > >          struct iwl_cmd *cmd;
> > > >
> > > > -        BUG_ON(txq_id != IWL_CMD_QUEUE_NUM);
> > > > +        if (WARN(txq_id != IWL_CMD_QUEUE_NUM,
> > > > +                 "wrong command queue %d, sequence 0x%X readp=%d writep=%d\n",
> > > > +                 txq_id, sequence,
> > > > +                 priv->txq[IWL_CMD_QUEUE_NUM].q.read_ptr,
> > > > +                 priv->txq[IWL_CMD_QUEUE_NUM].q.write_ptr)) {
> > > > +                iwl_print_hex_dump(priv, IWL_DL_INFO , rxb, 32);
> > > > +                return;
> > > > +        }
> > > >
> > > >          cmd_index = get_cmd_index(&priv->txq[IWL_CMD_QUEUE_NUM].q, index, huge);
> > > >          cmd = priv->txq[IWL_CMD_QUEUE_NUM].cmd[cmd_index];
> >
> > New log attached. Only the above patch was applied to the previous code base.
> It seems you can easily reproduce this bug, but unfortunately we can't.
> Would you be able to run a code bisection on this one ? Can you try if commit
> cbd8b90ffd8a321ffb2a705733729f0d5ebb20f9 is working for you ? If that's so,
> that should let you bisect quite quickly.

First off, cbd8b90ffd8a321ffb2a705733729f0d5ebb20f9 worked fine.

I ran through the git bisect, as you can see below. However, the original BUG_ON was never triggered. Instead, I hit authentication timeouts with my unencrypted AP, and I marked those commits bad (skip in one case). This made the nominally straightforward good/bad bisect question a little more interesting. When the BUG_ON was triggered, authentication did not complete either, so take the result for what it's worth.

Was restricting git bisect to drivers/net/wireless/iwlwifi/ a bad idea?

I'll look into actually triggering the BUG_ON proper.
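
Since the bisect mixed two symptoms (the BUG_ON and the authentication timeouts), here is a rough sketch of how each boot's log could be classified consistently on a re-run. It is only an illustration: the function name classify_log and the match strings are my assumptions, not anything from the driver or from git, and the strings would need adjusting to the exact dmesg text. The exit codes follow git bisect's documented convention (0 = good, 125 = skip, other values from 1 to 127 = bad).

```sh
#!/bin/sh
# Classify one boot's kernel log using git-bisect's exit-code convention:
#   0 = good, 125 = skip (cannot judge), any other code in 1..127 = bad.
# ASSUMPTION: the grep patterns below are guesses based on the symptoms
# described in this thread; check them against your own dmesg output.

classify_log() {
    log="$1"

    # No readable log means no verdict for this commit.
    if [ ! -r "$log" ]; then
        return 125
    fi

    # Original symptom: the BUG_ON in iwl3945-base.c fired.
    if grep -q 'kernel BUG at .*iwl3945-base\.c' "$log"; then
        return 1
    fi

    # Different symptom: authentication never completes. Treated as
    # "skip" here so it does not get mixed up with the BUG_ON hunt.
    if grep -qi 'authentication.*timed out' "$log"; then
        return 125
    fi

    # Neither symptom seen.
    return 0
}

# When run as a script with a log path argument, exit with the verdict.
if [ "$#" -gt 0 ]; then
    classify_log "$1"
    exit $?
fi
```

Saved as, say, classify.sh (a hypothetical name), it would be run by hand after each test boot as `sh classify.sh dmesg.log; echo $?`, and the result fed to git bisect good/bad/skip. Treating the timeout as skip rather than bad is a design choice: it keeps the search pointed at the BUG_ON, at the cost of a less precise answer if many commits time out.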

Jason

$ git bisect log
# bad: [10bc72100559eae0e27f111be96b5e4afd07a1dc] p54: power save management
# good: [cbd8b90ffd8a321ffb2a705733729f0d5ebb20f9] iwl3945: iwl3945_queue and iwl3945_channel_info replacement
git-bisect start 'master-2009-01-06' 'cbd8b90ffd8a321ffb2a705733729f0d5ebb20f9' '--' 'drivers/net/wireless/iwlwifi/'
# good: [a0dedce20b4db9e6a7200eacb4d10fc3ee4c1b6b] iwlwifi: replace IWL_ERROR with IWL_ERR
git-bisect good a0dedce20b4db9e6a7200eacb4d10fc3ee4c1b6b
# skip: [9bfb965e2826a3cc2f6abbefe59b9c3e8f0a8294] iwl3945: use iwl_get_hw_mode
# *** authentication timed out
git-bisect skip 9bfb965e2826a3cc2f6abbefe59b9c3e8f0a8294
# bad: [ba95810656873203d14f6274f1aa73ad0b42cffe] iwl3945: release resources before shutting down
# *** authentication timed out
git-bisect bad ba95810656873203d14f6274f1aa73ad0b42cffe
# bad: [b5e33e433937e7525c6e6102ff9b2c47c0bf8d5a] iwl3945: add load ucode op
# *** authentication timed out
git-bisect bad b5e33e433937e7525c6e6102ff9b2c47c0bf8d5a
# bad: [259fc66b6afeb2f10bb86c7fbef541981a3216c9] iwl3945: use iwl_mod_params for 3945
# *** authentication timed out
git-bisect bad 259fc66b6afeb2f10bb86c7fbef541981a3216c9
# good: [bb64785ad94d575fe4f5f9e69f4f6c0b24e9905d] iwlwifi: use iwl_cmd instead of iwl3945_cmd
git-bisect good bb64785ad94d575fe4f5f9e69f4f6c0b24e9905d


259fc66b6afeb2f10bb86c7fbef541981a3216c9 is first bad commit
commit 259fc66b6afeb2f10bb86c7fbef541981a3216c9
Author: Kolekar, Abhijeet <[email protected]>
Date: Fri Dec 19 10:37:35 2008 +0800

    iwl3945: use iwl_mod_params for 3945

    Use iwl_mod_params for 3945.

    Signed-off-by: Abhijeet Kolekar <[email protected]>
    Signed-off-by: Zhu Yi <[email protected]>
    Signed-off-by: John W. Linville <[email protected]>

:040000 040000 13a738ff0a9b89db4838c70937f32a6434a0384f 32eabfca375a07a548cf3b4323d1edb550474d4f M drivers