2011-06-08 11:08:05

by Johannes Berg

Subject: Re: iwlagn aggregation problem when stations are removed/re-added quickly

On Fri, 2011-05-13 at 15:56 -0700, Daniel Halperin wrote:

> I'm running an experiment where a client connects to the AP (both HT
> iwlagn devices), starts a large transfer that gets aggregation going,
> disassociates, and then reassociates and restarts the transfer.
>
> When mac80211 stops the queue (as part of the client's disassociation
> process), it goes into the following code:
>
> IWL_DEBUG_HT(priv, "Stopping a non empty AGG HW QUEUE\n");
> priv->stations[sta_id].tid[tid].agg.state =
> IWL_EMPTYING_HW_QUEUE_DELBA;
> spin_unlock_irqrestore(&priv->sta_lock, flags);
>
> but if the station is removed right away the packets stay in the
> queue. Indeed, when the client reconnects, the packets are then
> delivered! But then the queue gets stuck and the AP issues a firmware
> reset, which doesn't actually get traffic flowing again. Below,
> there's a log with IWL_DL_HT set. It may be something racy; adding
> DL_INFO and DL_MAC80211 I haven't been able to reproduce the bug yet
> in a few tries.
>
> I suspect this will also be a problem with P2P, and not just my klugey
> use of AP mode. Any suggestions as to how to fix?

Sorry I'm replying this late. I'm not sure what the best way to fix it
would be, but it makes sense that this would happen. Maybe we can flush
the aggregation queue (asking the ucode to drop all frames) when the
station is removed, but I'm not sure how we'd do that -- Wey-Yi do you
know if that's possible?

johannes



2011-06-08 14:53:12

by Johannes Berg

Subject: Re: iwlagn aggregation problem when stations are removed/re-added quickly

On Wed, 2011-06-08 at 07:37 -0700, wwguy wrote:
> On Wed, 2011-06-08 at 04:07 -0700, Johannes Berg wrote:
> > On Fri, 2011-05-13 at 15:56 -0700, Daniel Halperin wrote:
> >
> > > I'm running an experiment where a client connects to the AP (both HT
> > > iwlagn devices), starts a large transfer that gets aggregation going,
> > > disassociates, and then reassociates and restarts the transfer.
> > >
> > > > When mac80211 stops the queue (as part of the client's disassociation
> > > process), it goes into the following code:
> > >
> > > IWL_DEBUG_HT(priv, "Stopping a non empty AGG HW QUEUE\n");
> > > priv->stations[sta_id].tid[tid].agg.state =
> > > IWL_EMPTYING_HW_QUEUE_DELBA;
> > > spin_unlock_irqrestore(&priv->sta_lock, flags);
> > >
> > > but if the station is removed right away the packets stay in the
> > > queue. Indeed, when the client reconnects, the packets are then
> > > delivered! But then the queue gets stuck and the AP issues a firmware
> > > reset, which doesn't actually get traffic flowing again. Below,
> > > there's a log with IWL_DL_HT set. It may be something racy; adding
> > > DL_INFO and DL_MAC80211 I haven't been able to reproduce the bug yet
> > > in a few tries.
> > >
> > > I suspect this will also be a problem with P2P, and not just my klugey
> > > use of AP mode. Any suggestions as to how to fix?
> >
> > Sorry I'm replying this late. I'm not sure what the best way to fix it
> > would be, but it makes sense that this would happen. Maybe we can flush
> > the aggregation queue (asking the ucode to drop all frames) when the
> > station is removed, but I'm not sure how we'd do that -- Wey-Yi do you
> > know if that's possible?
> >
> Flushing the queue might be a good solution. I was told (I don't
> remember by whom or when, which is bad) that the "tx flush" command is
> needed, especially for P2P.
>
> Btw, there are two types of "tx flush": flush all the frames in the
> uCode, or flush only the frames in a specified queue.

Right, but the flush seems to be implemented per FIFO and queue, so it's
a bit confusing. In this case we should drop all frames out of the
aggregation queue.

johannes


2011-06-08 14:40:53

by Wey-Yi Guy

Subject: Re: iwlagn aggregation problem when stations are removed/re-added quickly

On Wed, 2011-06-08 at 04:07 -0700, Johannes Berg wrote:
> On Fri, 2011-05-13 at 15:56 -0700, Daniel Halperin wrote:
>
> > I'm running an experiment where a client connects to the AP (both HT
> > iwlagn devices), starts a large transfer that gets aggregation going,
> > disassociates, and then reassociates and restarts the transfer.
> >
> > When mac80211 stops the queue (as part of the client's disassociation
> > process), it goes into the following code:
> >
> > IWL_DEBUG_HT(priv, "Stopping a non empty AGG HW QUEUE\n");
> > priv->stations[sta_id].tid[tid].agg.state =
> > IWL_EMPTYING_HW_QUEUE_DELBA;
> > spin_unlock_irqrestore(&priv->sta_lock, flags);
> >
> > but if the station is removed right away the packets stay in the
> > queue. Indeed, when the client reconnects, the packets are then
> > delivered! But then the queue gets stuck and the AP issues a firmware
> > reset, which doesn't actually get traffic flowing again. Below,
> > there's a log with IWL_DL_HT set. It may be something racy; adding
> > DL_INFO and DL_MAC80211 I haven't been able to reproduce the bug yet
> > in a few tries.
> >
> > I suspect this will also be a problem with P2P, and not just my klugey
> > use of AP mode. Any suggestions as to how to fix?
>
> Sorry I'm replying this late. I'm not sure what the best way to fix it
> would be, but it makes sense that this would happen. Maybe we can flush
> the aggregation queue (asking the ucode to drop all frames) when the
> station is removed, but I'm not sure how we'd do that -- Wey-Yi do you
> know if that's possible?
>
Flushing the queue might be a good solution. I was told (I don't
remember by whom or when, which is bad) that the "tx flush" command is
needed, especially for P2P.

Btw, there are two types of "tx flush": flush all the frames in the
uCode, or flush only the frames in a specified queue.

Thanks
Wey





2011-06-27 15:42:20

by Daniel Halperin

Subject: Re: iwlagn aggregation problem when stations are removed/re-added quickly

On Wed, Jun 8, 2011 at 8:11 AM, wwguy wrote:
> On Wed, 2011-06-08 at 07:53 -0700, Johannes Berg wrote:
>> On Wed, 2011-06-08 at 07:37 -0700, wwguy wrote:
>> > On Wed, 2011-06-08 at 04:07 -0700, Johannes Berg wrote:
>> > > On Fri, 2011-05-13 at 15:56 -0700, Daniel Halperin wrote:
>> > >
>> > > > I'm running an experiment where a client connects to the AP (both HT
>> > > > iwlagn devices), starts a large transfer that gets aggregation going,
>> > > > disassociates, and then reassociates and restarts the transfer.
>> > > >
>> > > > When mac80211 stops the queue (as part of the client's disassociation
>> > > > process), it goes into the following code:
>> > > >
>> > > > IWL_DEBUG_HT(priv, "Stopping a non empty AGG HW QUEUE\n");
>> > > > priv->stations[sta_id].tid[tid].agg.state =
>> > > > IWL_EMPTYING_HW_QUEUE_DELBA;
>> > > > spin_unlock_irqrestore(&priv->sta_lock, flags);
>> > > >
>> > > > but if the station is removed right away the packets stay in the
>> > > > queue. Indeed, when the client reconnects, the packets are then
>> > > > delivered! But then the queue gets stuck and the AP issues a firmware
>> > > > reset, which doesn't actually get traffic flowing again. Below,
>> > > > there's a log with IWL_DL_HT set. It may be something racy; adding
>> > > > DL_INFO and DL_MAC80211 I haven't been able to reproduce the bug yet
>> > > > in a few tries.
>> > > >
>> > > > I suspect this will also be a problem with P2P, and not just my klugey
>> > > > use of AP mode. Any suggestions as to how to fix?
>> > >
>> > > Sorry I'm replying this late. I'm not sure what the best way to fix it
>> > > would be, but it makes sense that this would happen. Maybe we can flush
>> > > the aggregation queue (asking the ucode to drop all frames) when the
>> > > station is removed, but I'm not sure how we'd do that -- Wey-Yi do you
>> > > know if that's possible?
>> > >
>> > Flushing the queue might be a good solution. I was told (I don't
>> > remember by whom or when, which is bad) that the "tx flush" command is
>> > needed, especially for P2P.
>> >
>> > Btw, there are two types of "tx flush": flush all the frames in the
>> > uCode, or flush only the frames in a specified queue.
>>
>> Right, but the flush seems to be implemented per FIFO and queue, so it's
>> a bit confusing. In this case we should drop all frames out of the
>> aggregation queue.
>>
>
> It is confusing; it will drop all the frames on the requested queues:
>
> flush_cmd.fifo_control = IWL_TX_FIFO_VO_MSK | IWL_TX_FIFO_VI_MSK |
>                          IWL_TX_FIFO_BE_MSK | IWL_TX_FIFO_BK_MSK;
> if (priv->cfg->sku & EEPROM_SKU_CAP_11N_ENABLE)
>         flush_cmd.fifo_control |= IWL_AGG_TX_QUEUE_MSK;
>
> But it does not take the different contexts into account; I will
> submit a separate patch to fix it.

Hi Wey-yi and Johannes,

You must have some docs I don't have because I can't quite figure out
what you're talking about. Is there any progress on this issue? I'd be
happy to test any changes for you.

Dan

2011-06-08 15:14:58

by Wey-Yi Guy

Subject: Re: iwlagn aggregation problem when stations are removed/re-added quickly

On Wed, 2011-06-08 at 07:53 -0700, Johannes Berg wrote:
> On Wed, 2011-06-08 at 07:37 -0700, wwguy wrote:
> > On Wed, 2011-06-08 at 04:07 -0700, Johannes Berg wrote:
> > > On Fri, 2011-05-13 at 15:56 -0700, Daniel Halperin wrote:
> > >
> > > > I'm running an experiment where a client connects to the AP (both HT
> > > > iwlagn devices), starts a large transfer that gets aggregation going,
> > > > disassociates, and then reassociates and restarts the transfer.
> > > >
> > > > > When mac80211 stops the queue (as part of the client's disassociation
> > > > process), it goes into the following code:
> > > >
> > > > IWL_DEBUG_HT(priv, "Stopping a non empty AGG HW QUEUE\n");
> > > > priv->stations[sta_id].tid[tid].agg.state =
> > > > IWL_EMPTYING_HW_QUEUE_DELBA;
> > > > spin_unlock_irqrestore(&priv->sta_lock, flags);
> > > >
> > > > but if the station is removed right away the packets stay in the
> > > > queue. Indeed, when the client reconnects, the packets are then
> > > > delivered! But then the queue gets stuck and the AP issues a firmware
> > > > reset, which doesn't actually get traffic flowing again. Below,
> > > > there's a log with IWL_DL_HT set. It may be something racy; adding
> > > > DL_INFO and DL_MAC80211 I haven't been able to reproduce the bug yet
> > > > in a few tries.
> > > >
> > > > I suspect this will also be a problem with P2P, and not just my klugey
> > > > use of AP mode. Any suggestions as to how to fix?
> > >
> > > Sorry I'm replying this late. I'm not sure what the best way to fix it
> > > would be, but it makes sense that this would happen. Maybe we can flush
> > > the aggregation queue (asking the ucode to drop all frames) when the
> > > station is removed, but I'm not sure how we'd do that -- Wey-Yi do you
> > > know if that's possible?
> > >
> > Flushing the queue might be a good solution. I was told (I don't
> > remember by whom or when, which is bad) that the "tx flush" command is
> > needed, especially for P2P.
> >
> > Btw, there are two types of "tx flush": flush all the frames in the
> > uCode, or flush only the frames in a specified queue.
>
> Right, but the flush seems to be implemented per FIFO and queue, so it's
> a bit confusing. In this case we should drop all frames out of the
> aggregation queue.
>

It is confusing; it will drop all the frames on the requested queues:

flush_cmd.fifo_control = IWL_TX_FIFO_VO_MSK | IWL_TX_FIFO_VI_MSK |
                         IWL_TX_FIFO_BE_MSK | IWL_TX_FIFO_BK_MSK;
if (priv->cfg->sku & EEPROM_SKU_CAP_11N_ENABLE)
        flush_cmd.fifo_control |= IWL_AGG_TX_QUEUE_MSK;

But it does not take the different contexts into account; I will
submit a separate patch to fix it.


Wey
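
[Note: a minimal, standalone sketch of the two kinds of flush mask discussed
in this thread: one that covers every FIFO plus the aggregation queues, and
one limited to a single aggregation queue. It is not driver code. All
SKETCH_* names, the bit positions, and the queue range 10..19 are
assumptions made up for illustration; the real values live in the iwlagn
headers and may differ.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments, NOT taken from the iwlagn headers. */
#define SKETCH_TX_FIFO_BK_MSK  (UINT32_C(1) << 0)
#define SKETCH_TX_FIFO_BE_MSK  (UINT32_C(1) << 1)
#define SKETCH_TX_FIFO_VI_MSK  (UINT32_C(1) << 2)
#define SKETCH_TX_FIFO_VO_MSK  (UINT32_C(1) << 3)

/* Hypothetical range of hardware queues reserved for aggregation. */
#define SKETCH_AGG_QUEUE_FIRST 10u
#define SKETCH_AGG_QUEUE_LAST  19u

/* Build a mask with one bit set per hypothetical aggregation queue. */
static inline uint32_t sketch_agg_queue_mask(void)
{
	uint32_t mask = 0;
	unsigned int q;

	for (q = SKETCH_AGG_QUEUE_FIRST; q <= SKETCH_AGG_QUEUE_LAST; q++)
		mask |= UINT32_C(1) << q;
	return mask;
}

/*
 * "Flush everything": the four access-category FIFOs, plus all
 * aggregation queues when HT (11n) is enabled. This mirrors the shape
 * of the snippet quoted above, with placeholder constants.
 */
static inline uint32_t sketch_flush_all_mask(bool ht_enabled)
{
	uint32_t mask = SKETCH_TX_FIFO_VO_MSK | SKETCH_TX_FIFO_VI_MSK |
			SKETCH_TX_FIFO_BE_MSK | SKETCH_TX_FIFO_BK_MSK;

	if (ht_enabled)
		mask |= sketch_agg_queue_mask();
	return mask;
}

/*
 * "Flush one queue": only the aggregation queue that belonged to the
 * removed station/TID, leaving traffic for other stations untouched.
 */
static inline uint32_t sketch_flush_one_agg_queue_mask(unsigned int queue)
{
	assert(queue >= SKETCH_AGG_QUEUE_FIRST &&
	       queue <= SKETCH_AGG_QUEUE_LAST);
	return UINT32_C(1) << queue;
}
```

[With a per-queue mask like the second helper, a station-removal path could
ask for just that station's aggregation queue to be dropped instead of
flushing traffic for every station. Whether the uCode honours such a
restricted mask is exactly the open question above.]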