2013-03-19 09:52:43

by Andreas Fenkart

[permalink] [raw]
Subject: mwifiex: infinite loop in mwifiex_main_process

Hi,

I'm working on this, currently testing it
http://www.mail-archive.com/[email protected]/msg17726.html

Within less than 3 days, the module always crashes. I observe
that mwifiex is looping in mwifiex_main_process, not exiting. The
loop is always entered from the sdio_irq_thread.

I put printk statement to find cause for why it's not exiting:

[18017.211513] scan processing 0
[18017.214686] data sent 0
[18017.217269] ps state 0
[18017.219765] cmd sent 0 / curr cmd (null)
[18017.224134] is_command_pending 0
[18017.227548] wmm list empty 0
[18017.230592] tx_lock_flag 0

So it seems the wmm list has packets queued, but they are never
sent out. Adding a few more statements, it seems the problem is
in mwifiex_wmm_get_highest_priolist_ptr:

for (j = adapter->priv_num - 1; j >= 0; --j) {

spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags);
is_list_empty = list_empty(&adapter->bss_prio_tbl[j]
.bss_prio_head);
spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags);
if (is_list_empty)
continue;

.... <snip> ...

do {
priv_tmp = bssprio_node->priv;
hqp = &priv_tmp->wmm.highest_queued_prio;

for (i = atomic_read(hqp); i >= LOW_PRIO_TID;
--i) {
...
... NEVER REACHED ...
...


So there are packets queued, but the highest_queued_prio is too
low, so they are never sent out.

I was never able to crash it without my patches, though trying
harder now. But maybe my patches only trigger the issue more
often.

Is there a known issue, with highest_queued_prio getting out of
sync with the number of packets queued?

rgds,
Andi



2013-03-19 22:38:05

by Bing Zhao

[permalink] [raw]
Subject: RE: mwifiex: infinite loop in mwifiex_main_process

Hi Andi,

> Hi,
>
> I'm working on this, currently testing it
> http://www.mail-archive.com/[email protected]/msg17726.html
>
> Within less than 3 days, the module always crashes. I observe
> that mwifiex is looping in mwifiex_main_process, not exiting. The
> loop is always entered from the sdio_irq_thread.
>
> I put printk statement to find cause for why it's not exiting:
>
> [18017.211513] scan processing 0
> [18017.214686] data sent 0
> [18017.217269] ps state 0
> [18017.219765] cmd sent 0 / curr cmd (null)
> [18017.224134] is_command_pending 0
> [18017.227548] wmm list empty 0
> [18017.230592] tx_lock_flag 0
>
> So it seems the wmm list has packets queued, but they are never
> sent out. Adding a few more statements, it seems the problem is
> in mwifiex_wmm_get_highest_priolist_ptr:
>
> for (j = adapter->priv_num - 1; j >= 0; --j) {
>
> spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
> flags);
> is_list_empty = list_empty(&adapter->bss_prio_tbl[j]
> .bss_prio_head);
> spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
> flags);
> if (is_list_empty)
> continue;
>
> .... <snip> ...
>
> do {
> priv_tmp = bssprio_node->priv;
> hqp = &priv_tmp->wmm.highest_queued_prio;
>
> for (i = atomic_read(hqp); i >= LOW_PRIO_TID;
> --i) {
> ...
> ... NEVER REACHED ...
> ...
>
>
> So there are packets queued, but the highest_queued_prio is too
> low, so they are never sent out.

Could you apply the debug patch attached to print out hqp number?

>
> I was never able to crash it without my patches, though trying
> harder now. But maybe my patches only trigger the issue more
> often.
>
> Is there a known issue, with highest_queued_prio getting out of
> sync with the number of packets queued?

I'm not aware of any known issue related to highest_queued_prio.

Regards,
Bing




Attachments:
wmm_hqp_debug.diff (1.40 kB)
wmm_hqp_debug.diff

2013-04-11 11:51:33

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH v3 0/2] wmm queues handling simplificatons

v3: minor cleanups for coding style; checkpatch.pl --strict
v2: split 6 patches into 2 patchsets. This is the set 2.

Andreas Fenkart (2):
mwifiex: replace ra_list_curr by list rotation.
mwifiex: rework round robin scheduling of bss nodes.

drivers/net/wireless/mwifiex/11n_aggr.c | 14 +--
drivers/net/wireless/mwifiex/init.c | 21 ++--
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 160 ++++++++++---------------------
drivers/net/wireless/mwifiex/wmm.h | 3 +
5 files changed, 61 insertions(+), 138 deletions(-)

--
1.7.10.4


2013-04-05 08:27:15

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Bing,

On Thu, Apr 04, 2013 at 03:56:28PM -0700, Bing Zhao wrote:
> >
> > set 1:
> > 1/4 mwifiex: rework round robin scheduling of bss nodes.
> > 2/4 mwifiex: replace ra_list_curr by list rotation.
> > 3/4 mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.
> > 4/4 mwifiex: bug: remove NO_PKT_PRIO_TID.
>
> In your patchset 1, the actual patches are:
>
> 1/4 mwifiex: bug: wrong list in list_empty check.
> 2/4 mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.
> 3/4 mwifiex: bug: remove NO_PKT_PRIO_TID.
> 4/4 mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.

thanks for fixing

>
> And I have comments on 2/4 and 4/4. Could you please make changes and resend v3 series?
> Please let me know if you want me to make the change and post the v3.

thanks for fixing

> > set 2:
> > 1/2 mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.
> > 2/2 mwifiex: bug: wrong list in list_empty check.
>
> And the actual patchset 2 is:
>
> 1/2 mwifiex: replace ra_list_curr by list rotation.
> 2/2 mwifiex: rework round robin scheduling of bss nodes.
>
> checkpatch.pl detected a couple warnings. Please check and resend v3 series for patchset 2.

I get only one warning:

WARNING: networking block comments don't use an empty /* line, use /*
Comment...
#191: FILE: drivers/net/wireless/mwifiex/wmm.c:1011:
+
+/*

Probably this issue:

>From Documentation/CodingStyle:
----------------------------------------------------------------------
The preferred style for long (multi-line) comments is:

/*
* This is the preferred style for multi-line
* comments in the Linux kernel source code.
* Please use it consistently.
*
* Description: A column of asterisks on the left side,
* with beginning and ending almost-blank lines.
*/

For files in net/ and drivers/net/ the preferred style for long
(multi-line)
comments is a little different.

/* The preferred comment style for files in net/ and drivers/net
* looks like this.
*
* It is nearly the same as the generally preferred comment
* style,
* but there is no initial almost-blank line.
*/
----------------------------------------------------------------------

If I run: checkpatch.pl -f drivers/net/wireless/mwifiex/wmm.c. I
get about 10-20 similar warnings. In general it seems mwifiex
sticks with type 1 multiline comments. And I wanted to stick with
the de-facto standard.

Should I still fix the warning?
Should I prepare a separate patch on top fixing all multiline
comments?

> >
> > Patch set 2 can be delayed, but since hard to read code probably
> > introduced all the problems, I suggest to apply it promptly.
> > It simplifies the code a lot.
>
> I will ack your latest patchset 2 as soon as the WMM test is passed.
>
> > > We will run stress tests against 4-6.
> >
> > I'm here at 4+ days, still running. This exceeds all previous
> > tests on my particular setup.
>
> Could you share what test cases you are running? I will try to cover the cases you have not done.

Duration test with high load, sender and receiver.

I have dualband router and two DUT. One DUT connects to
2.4GHz-band the other to 5GHz-band. One is running iperf as
server the other iperf as a client in a while true loop.

Eventually I will switch over to set one DUT as uAP, but didn't
play with uAP yet.

rgds,
Andi

2013-04-02 00:08:58

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
lead to a contradictory state, resulting in an infinite loop.
Currently queueing and dequeuing of packets is not synchronized, and can
happen concurrently. While tx_pkts_queued is incremented when adding a
packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
is added right after the check for empty, but before setting max prio to
NO_PKT, that packet is trapped and creates an infinite loop.
Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
lists are not empty. Opposing that max prio is NO_PKT, which means "skip
this wmm queue, it has no packets". The infinite loop results, because the
main loop checks the wmm lists for not empty via tx_pkts_queued, but when
dequeing uses max_prio to see if it can skip a list. This will never end,
unless a new packet is added which will restore max prio to the level of
the trapped packet.
The solution here is to rely on tx_pkts_queued solely for checking wmm
queue to be empty, and drop the NO_PKT define. It does not address the
locking issue.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 12 +++++-------
2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 553adfb..3de4cc7 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -218,7 +218,6 @@ struct mwifiex_tid_tbl {
#define WMM_HIGHEST_PRIORITY 7
#define HIGH_PRIO_TID 7
#define LOW_PRIO_TID 0
-#define NO_PKT_PRIO_TID (-1)

struct mwifiex_wmm_desc {
struct mwifiex_tid_tbl tid_tbl_ptr[MAX_NUM_TID];
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 32adc87..91dee27 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,

do {
priv_tmp = bssprio_node->priv;
- hqp = &priv_tmp->wmm.highest_queued_prio;

+ if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
+ continue;
+
+ /* iterate over the WMM queues of the BSS */
+ hqp = &priv_tmp->wmm.highest_queued_prio;
for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {

tid_ptr = &(priv_tmp)->wmm.
@@ -980,12 +984,6 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
} while (ptr != head);
}

- /* No packet at any TID for this priv. Mark as such
- * to skip checking TIDs for this priv (until pkt is
- * added).
- */
- atomic_set(hqp, NO_PKT_PRIO_TID);
-
/* Get next bss priority node */
bssprio_node = list_first_entry(&bssprio_node->list,
struct mwifiex_bss_prio_node,
--
1.7.10.4


2013-04-02 18:16:47

by Bing Zhao

[permalink] [raw]
Subject: RE: mwifiex: infinite loop in mwifiex_main_process

Hi Andi,

[...]

> + spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
> + BUG_ON(atomic_read(&priv_tmp->wmm.tx_pkts_queued));
> + spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags);
> +
> /* No packet at any TID for this priv. Mark as such
> * to skip checking TIDs for this priv (until pkt is
> * added).
> atomic_set(hqp, NO_PKT_PRIO_TID);
>
>
> Which crashed. Hence searching for queued packets and adding new ones is
> not synchronized, new packets can be added while searching the WMM
> queues. If a packet is added right before setting max prio to NO_PKT,
> that packet is trapped and creates an infinite loop.
>
> Because of the new packet tx_pkts_queued is at least 1, indicating wmm
> lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> this wmm queue, it has no packets".
> The infinite loop results, because the main loop checks the wmm lists
> for not empty (tx_pkts_queued != 0), but then finds no packet since it
> skips the wmm queue where it is located on. This will never end, unless
> a new packet is added which will restore max prio.

Thanks for your analysis.

> One possible solution is is to rely on tx_pkts_queued solely for
> checking wmm queue to be empty, and drop the NO_PKT define.

FYI, Yogesh suggested another fix (attached).

[...]

> seems to be intruduced with this patch:
> 17e8cec 05-16-2011 mwifiex: CPU mips optimization with NO_PKT_PRIO_TID
>
> I was wondering why hasn't happened more frequently. Evtl. if the
> interface is working in bridge mode, new packets might be added to the
> WMM queue with the trapped packet. 2c
>
> I prepared a few patches, fixing above bug as suggested and plus some
> cleanup patches I did while trying to get an understanding. Pls review.

Thanks for the patches. We will review them and run some WMM tests.

Thanks,
Bing


Attachments:
hqp.diff (1.58 kB)
hqp.diff

2013-04-04 20:57:14

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Bing,
On Wed, Apr 03, 2013 at 11:37:43AM -0700, Bing Zhao wrote:
>
[snip]
> > How should I proceed? Can I reorder patches to match my
> > development cycle, which is? 2-5;1;6 or more verbosely
> > cleanup first followed by bug fix and proper locking last

> You can separate 6 patches into two patch sets.
> 1-3 fix the bug and clean up.
> 4-6 implement ra list rotation and proper locking.

I reworked the patchset and split into two.


set 1:
1/4 mwifiex: rework round robin scheduling of bss nodes.
2/4 mwifiex: replace ra_list_curr by list rotation.
3/4 mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.
4/4 mwifiex: bug: remove NO_PKT_PRIO_TID.

3/4 previous 6/6 is bugfix now:

...
Another race condition exists, if a new highest priority
packet is added. If concurrently a packet is dequeued, the newly
set max prio will be overwritten with the value of the dequeued
packet. This can occur, because selecting a packet and modifying
the max prio is not atomic. This results in an infinite loop
unless, a new packet is added that has at least the priority of
the hidden packet.
...


set 2:
1/2 mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.
2/2 mwifiex: bug: wrong list in list_empty check.

Patch set 2 can be delayed, but since hard to read code probably
introduced all the problems, I suggest to apply it promptly.
It simplifies the code a lot.


>
> We will run stress tests against 4-6.

I'm here at 4+ days, still running. This exceeds all previous
tests on my particular setup.

>
> Thanks,
> Bing

rgds,
Andi

2013-04-04 21:08:20

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 1/2] mwifiex: replace ra_list_curr by list rotation.

After a packet is successfully transmitted, ra list is rotated, so the ra
next to the one transmitted, will be the first in the list. This way we
pick the ra' in a round robin fashion. This significantly simplifies
iteration in mwifiex_wmm_get_highest_priolist_ptr to a call to
list_for_each_entry.
List rotation is done via list_move, where the head itself is temporarily
removed and then re-inserted after the item just transferred.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 8 +--
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 100 ++++++++++++-------------------
drivers/net/wireless/mwifiex/wmm.h | 3 +
4 files changed, 43 insertions(+), 69 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index af8fe63..c6d7451 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -296,19 +296,13 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, pra_list, ptrindex)) {
- priv->wmm.packets_out[ptrindex]++;
- priv->wmm.tid_tbl_ptr[ptrindex].ra_list_curr = pra_list;
- }
+ mwifiex_rotate_priolists(priv, pra_list, ptrindex);
/* Now bss_prio_cur pointer points to next node */
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
.bss_prio_cur->list,
struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 082a468..80e26c8 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -210,7 +210,6 @@ struct mwifiex_ra_list_tbl {

struct mwifiex_tid_tbl {
struct list_head ra_list;
- struct mwifiex_ra_list_tbl *ra_list_curr;
};

#define WMM_HIGHEST_PRIORITY 7
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 62d740b..c865e58 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -191,9 +191,6 @@ mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra)
}
list_add_tail(&ra_list->list,
&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- if (!priv->wmm.tid_tbl_ptr[i].ra_list_curr)
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = ra_list;
}
}

@@ -424,7 +421,6 @@ mwifiex_wmm_init(struct mwifiex_adapter *adapter)
priv->aggr_prio_tbl[i].amsdu = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_ap = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_user = tos_to_tid_inv[i];
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}

priv->aggr_prio_tbl[6].amsdu
@@ -533,8 +529,6 @@ static void mwifiex_wmm_delete_all_ralist(struct mwifiex_private *priv)
}

INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}
}

@@ -886,7 +880,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_private **priv, int *tid)
{
struct mwifiex_private *priv_tmp;
- struct mwifiex_ra_list_tbl *ptr, *head;
+ struct mwifiex_ra_list_tbl *ptr;
struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
@@ -929,52 +923,17 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
tid_ptr = &(priv_tmp)->wmm.
tid_tbl_ptr[tos_to_tid[i]];

- /* For non-STA ra_list_curr may be NULL */
- if (!tid_ptr->ra_list_curr)
- goto skip_wmm_queue;
-
- if (list_empty(&tid_ptr->ra_list))
- goto skip_wmm_queue;
-
- /*
- * Always choose the next ra we transmitted
- * last time, this way we pick the ra's in
- * round robin fashion.
- */
- ptr = list_first_entry(
- &tid_ptr->ra_list_curr->list,
- struct mwifiex_ra_list_tbl,
- list);
-
- head = ptr;
- if (ptr == (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list) {
- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl, list);
- head = ptr;
- }
+ /* iterate over receiver addresses */
+ list_for_each_entry(ptr, &tid_ptr->ra_list,
+ list) {

- do {
if (!skb_queue_empty(&ptr->skb_head)) {
/* holds both locks */
goto found;
}

- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- if (ptr ==
- (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list)
- ptr = list_first_entry(
- &ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- } while (ptr != head);
-
-skip_wmm_queue:
+ }
+
spin_unlock_irqrestore(&priv_tmp->wmm.
ra_list_spinlock,
flags_ra);
@@ -1019,6 +978,37 @@ found:
}

/*
+ * This functions rotates ra lists so packets are picked in round robin
+ * fashion.
+ *
+ * After a packet is successfully transmitted, rotate the ra list, so the ra
+ * next to the one transmitted, will come first in the list. This way we pick
+ * the ra in a round robin fashion.
+ *
+ * Function also increments wmm.packets_out counter.
+ */
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid)
+{
+ struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
+ if (mwifiex_is_ralist_valid(priv, ra, tid)) {
+ priv->wmm.packets_out[tid]++;
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it
+ * after curr bss node. imagine list to stay fixed while only
+ * head is moved
+ */
+ list_move(&tid_ptr->ra_list, &ra->list);
+ }
+ spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
+
+}
+
+/*
* This function checks if 11n aggregation is possible.
*/
static int
@@ -1104,11 +1094,7 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
ra_list_flags);
} else {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1116,8 +1102,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

@@ -1221,11 +1205,7 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1233,8 +1213,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

diff --git a/drivers/net/wireless/mwifiex/wmm.h b/drivers/net/wireless/mwifiex/wmm.h
index b92f39d..644d6e0 100644
--- a/drivers/net/wireless/mwifiex/wmm.h
+++ b/drivers/net/wireless/mwifiex/wmm.h
@@ -85,6 +85,9 @@ mwifiex_wmm_is_ra_list_empty(struct list_head *ra_list_hhead)
void mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
struct sk_buff *skb);
void mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra);
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid);

int mwifiex_wmm_lists_empty(struct mwifiex_adapter *adapter);
void mwifiex_wmm_process_tx(struct mwifiex_adapter *adapter);
--
1.7.10.4


2013-04-04 21:02:16

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 3/4] mwifiex: bug: remove NO_PKT_PRIO_TID.

Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
lead to a contradictory state, resulting in an infinite loop.
Currently queueing and dequeuing of packets is not synchronized, and can
happen concurrently. While tx_pkts_queued is incremented when adding a
packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
is added right after the check for empty, but before setting max prio to
NO_PKT, that packet is trapped and creates an infinite loop.
Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
lists are not empty. Opposing that max prio is NO_PKT, which means "skip
this wmm queue, it has no packets". The infinite loop results, because the
main loop checks the wmm lists for not empty via tx_pkts_queued, but for
dequeing it uses max_prio to see if it can skip current list. This will
never end, unless a new packet is added which will restore max prio to the
level of the trapped packet.
The solution here is to rely on tx_pkts_queued solely for checking wmm
queue to be empty, and drop the NO_PKT define. It does not address the
locking issue.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 13 ++++++-------
2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index d94406a..082a468 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -216,7 +216,6 @@ struct mwifiex_tid_tbl {
#define WMM_HIGHEST_PRIORITY 7
#define HIGH_PRIO_TID 7
#define LOW_PRIO_TID 0
-#define NO_PKT_PRIO_TID (-1)

struct mwifiex_wmm_desc {
struct mwifiex_tid_tbl tid_tbl_ptr[MAX_NUM_TID];
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 96eed2f..62b07d3 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,

do {
priv_tmp = bssprio_node->priv;
- hqp = &priv_tmp->wmm.highest_queued_prio;

+ if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
+ goto skip_bss;
+
+ /* iterate over the WMM queues of the BSS */
+ hqp = &priv_tmp->wmm.highest_queued_prio;
for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {

tid_ptr = &(priv_tmp)->wmm.
@@ -979,12 +983,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
} while (ptr != head);
}

- /* No packet at any TID for this priv. Mark as such
- * to skip checking TIDs for this priv (until pkt is
- * added).
- */
- atomic_set(hqp, NO_PKT_PRIO_TID);
-
+skip_bss:
/* Get next bss priority node */
bssprio_node = list_first_entry(&bssprio_node->list,
struct mwifiex_bss_prio_node,
--
1.7.10.4


2013-04-02 00:09:22

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 4/6] mwifiex: replace ra_list_curr by list rotation.

After a packet is successfully transmitted, ra list is rotated, so the ra
next to the one transmitted, will be the first in the list. This way we
pick the ra' in a round robin fashion. This significantly simplifies
iteration in mwifiex_wmm_get_highest_priolist_ptr to a call to
list_for_each_entry.
List rotation is done via list_move, where the head itself is temporarily
removed and then re-inserted after the item just transferred.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 8 +--
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 99 ++++++++++++-------------------
drivers/net/wireless/mwifiex/wmm.h | 3 +
4 files changed, 43 insertions(+), 68 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index af8fe63..c6d7451 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -296,19 +296,13 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, pra_list, ptrindex)) {
- priv->wmm.packets_out[ptrindex]++;
- priv->wmm.tid_tbl_ptr[ptrindex].ra_list_curr = pra_list;
- }
+ mwifiex_rotate_priolists(priv, pra_list, ptrindex);
/* Now bss_prio_cur pointer points to next node */
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
.bss_prio_cur->list,
struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 082a468..80e26c8 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -210,7 +210,6 @@ struct mwifiex_ra_list_tbl {

struct mwifiex_tid_tbl {
struct list_head ra_list;
- struct mwifiex_ra_list_tbl *ra_list_curr;
};

#define WMM_HIGHEST_PRIORITY 7
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index faae344..a9c381a 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -191,9 +191,6 @@ mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra)
}
list_add_tail(&ra_list->list,
&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- if (!priv->wmm.tid_tbl_ptr[i].ra_list_curr)
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = ra_list;
}
}

@@ -424,7 +421,6 @@ mwifiex_wmm_init(struct mwifiex_adapter *adapter)
priv->aggr_prio_tbl[i].amsdu = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_ap = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_user = tos_to_tid_inv[i];
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}

priv->aggr_prio_tbl[6].amsdu
@@ -533,8 +529,6 @@ static void mwifiex_wmm_delete_all_ralist(struct mwifiex_private *priv)
}

INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}
}

@@ -886,7 +880,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_private **priv, int *tid)
{
struct mwifiex_private *priv_tmp;
- struct mwifiex_ra_list_tbl *ptr, *head;
+ struct mwifiex_ra_list_tbl *ptr;
struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
@@ -930,10 +924,6 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
tid_ptr = &(priv_tmp)->wmm.
tid_tbl_ptr[tos_to_tid[i]];

- /* For non-STA ra_list_curr may be NULL */
- if (!tid_ptr->ra_list_curr)
- continue;
-
spin_lock_irqsave(&priv_tmp->wmm.
ra_list_spinlock, flags);
is_list_empty =
@@ -943,44 +933,14 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
if (is_list_empty)
continue;

- /*
- * Always choose the next ra we transmitted
- * last time, this way we pick the ra's in
- * round robin fashion.
- */
- ptr = list_first_entry(
- &tid_ptr->ra_list_curr->list,
- struct mwifiex_ra_list_tbl,
- list);
-
- head = ptr;
- if (ptr == (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list) {
- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl, list);
- head = ptr;
- }
-
- do {
- is_list_empty =
- skb_queue_empty(&ptr->skb_head);
+ /* iterate over receiver addresses */
+ list_for_each_entry(ptr, &tid_ptr->ra_list,
+ list) {

- if (!is_list_empty)
+ if (!skb_queue_empty(&ptr->skb_head))
goto found;

- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- if (ptr ==
- (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list)
- ptr = list_first_entry(
- &ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- } while (ptr != head);
+ }
}

/* Get next bss priority node */
@@ -1013,6 +973,37 @@ found:
}

/*
+ * This functions rotates ra lists so packets are picked in round robin
+ * fashion.
+ *
+ * After a packet is successfully transmitted, rotate the ra list, so the ra
+ * next to the one transmitted, will come first in the list. This way we pick
+ * the ra in a round robin fashion.
+ *
+ * Function also increments wmm.packets_out counter.
+ */
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid)
+{
+ struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
+ if (mwifiex_is_ralist_valid(priv, ra, tid)) {
+ priv->wmm.packets_out[tid]++;
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it
+ * after curr bss node. imagine list to stay fixed while only
+ * head is moved
+ */
+ list_move(&tid_ptr->ra_list, &ra->list);
+ }
+ spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
+
+}
+
+/*
* This function checks if 11n aggregation is possible.
*/
static int
@@ -1098,11 +1089,7 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
ra_list_flags);
} else {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1110,8 +1097,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

@@ -1215,11 +1200,7 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1227,8 +1208,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

diff --git a/drivers/net/wireless/mwifiex/wmm.h b/drivers/net/wireless/mwifiex/wmm.h
index b92f39d..644d6e0 100644
--- a/drivers/net/wireless/mwifiex/wmm.h
+++ b/drivers/net/wireless/mwifiex/wmm.h
@@ -85,6 +85,9 @@ mwifiex_wmm_is_ra_list_empty(struct list_head *ra_list_hhead)
void mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
struct sk_buff *skb);
void mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra);
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid);

int mwifiex_wmm_lists_empty(struct mwifiex_adapter *adapter);
void mwifiex_wmm_process_tx(struct mwifiex_adapter *adapter);
--
1.7.10.4


2013-04-04 22:37:50

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 3/4] mwifiex: bug: remove NO_PKT_PRIO_TID.


> Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
> lead to a contradictory state, resulting in an infinite loop.
> Currently queueing and dequeuing of packets is not synchronized, and can
> happen concurrently. While tx_pkts_queued is incremented when adding a
> packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
> is added right after the check for empty, but before setting max prio to
> NO_PKT, that packet is trapped and creates an infinite loop.
> Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
> lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> this wmm queue, it has no packets". The infinite loop results, because the
> main loop checks the wmm lists for not empty via tx_pkts_queued, but for
> dequeing it uses max_prio to see if it can skip current list. This will
> never end, unless a new packet is added which will restore max prio to the
> level of the trapped packet.
> The solution here is to rely on tx_pkts_queued solely for checking wmm
> queue to be empty, and drop the NO_PKT define. It does not address the
> locking issue.
>
> Signed-off-by: Andreas Fenkart <[email protected]>

Acked-by: Bing Zhao <[email protected]>

Thanks,
Bing

> ---
> drivers/net/wireless/mwifiex/main.h | 1 -
> drivers/net/wireless/mwifiex/wmm.c | 13 ++++++-------
> 2 files changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
> index d94406a..082a468 100644
> --- a/drivers/net/wireless/mwifiex/main.h
> +++ b/drivers/net/wireless/mwifiex/main.h
> @@ -216,7 +216,6 @@ struct mwifiex_tid_tbl {
> #define WMM_HIGHEST_PRIORITY 7
> #define HIGH_PRIO_TID 7
> #define LOW_PRIO_TID 0
> -#define NO_PKT_PRIO_TID (-1)
>
> struct mwifiex_wmm_desc {
> struct mwifiex_tid_tbl tid_tbl_ptr[MAX_NUM_TID];
> diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
> index 96eed2f..62b07d3 100644
> --- a/drivers/net/wireless/mwifiex/wmm.c
> +++ b/drivers/net/wireless/mwifiex/wmm.c
> @@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
>
> do {
> priv_tmp = bssprio_node->priv;
> - hqp = &priv_tmp->wmm.highest_queued_prio;
>
> + if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
> + goto skip_bss;
> +
> + /* iterate over the WMM queues of the BSS */
> + hqp = &priv_tmp->wmm.highest_queued_prio;
> for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {
>
> tid_ptr = &(priv_tmp)->wmm.
> @@ -979,12 +983,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
> } while (ptr != head);
> }
>
> - /* No packet at any TID for this priv. Mark as such
> - * to skip checking TIDs for this priv (until pkt is
> - * added).
> - */
> - atomic_set(hqp, NO_PKT_PRIO_TID);
> -
> +skip_bss:
> /* Get next bss priority node */
> bssprio_node = list_first_entry(&bssprio_node->list,
> struct mwifiex_bss_prio_node,
> --
> 1.7.10.4


2013-04-02 19:36:55

by Andreas Fenkart

[permalink] [raw]
Subject: Re: mwifiex: infinite loop in mwifiex_main_process

On Tue, Apr 02, 2013 at 11:16:26AM -0700, Bing Zhao wrote:
> Hi Andi,
>
> [...]
>
> > + spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
> > + BUG_ON(atomic_read(&priv_tmp->wmm.tx_pkts_queued));
> > + spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags);
> > +
> > /* No packet at any TID for this priv. Mark as such
> > * to skip checking TIDs for this priv (until pkt is
> > * added).
> > atomic_set(hqp, NO_PKT_PRIO_TID);
> >
> >
> > Which crashed. Hence searching for queued packets and adding new ones is
> > not synchronized, new packets can be added while searching the WMM
> > queues. If a packet is added right before setting max prio to NO_PKT,
> > that packet is trapped and creates an infinite loop.
> >
> > Because of the new packet tx_pkts_queued is at least 1, indicating wmm
> > lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> > this wmm queue, it has no packets".
> > The infinite loop results, because the main loop checks the wmm lists
> > for not empty (tx_pkts_queued != 0), but then finds no packet since it
> > skips the wmm queue where it is located on. This will never end, unless
> > a new packet is added which will restore max prio.
>
> Thanks for your analysis.
>
> > One possible solution is is to rely on tx_pkts_queued solely for
> > checking wmm queue to be empty, and drop the NO_PKT define.
>
> FYI, Yogesh suggested another fix (attached).

Started with similar patch, but then learned that NO_PKT_PRIO_TID
is not needed at all. It only adds complexity, rely on
tx_pkts_queued!

On top, bss_prio_tbl should be locked as well.

>
> [...]
>
> > seems to be intruduced with this patch:
> > 17e8cec 05-16-2011 mwifiex: CPU mips optimization with NO_PKT_PRIO_TID
> >
> > I was wondering why hasn't happened more frequently. Evtl. if the
> > interface is working in bridge mode, new packets might be added to the
> > WMM queue with the trapped packet. 2c
> >
> > I prepared a few patches, fixing above bug as suggested and plus some
> > cleanup patches I did while trying to get an understanding. Pls review.
>
> Thanks for the patches. We will review them and run some WMM tests.

Thanks, looking forward to that.

Andi



2013-04-11 18:43:01

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/2] mwifiex: replace ra_list_curr by list rotation.

Hi Andi,

Thanks for the patch.

> After a packet is successfully transmitted, ra list is rotated, so the ra
> next to the one transmitted, will be the first in the list. This way we
> pick the ra' in a round robin fashion. This significantly simplifies
> iteration in mwifiex_wmm_get_highest_priolist_ptr to a call to
> list_for_each_entry.
> List rotation is done via list_move, where the head itself is temporarily
> removed and then re-inserted after the item just transferred.
>
> Signed-off-by: Andreas Fenkart <[email protected]>

Acked-by: Bing Zhao <[email protected]>

Thanks,
Bing

> ---
> drivers/net/wireless/mwifiex/11n_aggr.c | 8 +--
> drivers/net/wireless/mwifiex/main.h | 1 -
> drivers/net/wireless/mwifiex/wmm.c | 97 ++++++++++++-------------------
> drivers/net/wireless/mwifiex/wmm.h | 3 +
> 4 files changed, 40 insertions(+), 69 deletions(-)


2013-04-04 21:02:20

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 4/4] mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.

Not locking ra_list when dequeuing packets creates race conditions.
When adding a packet 'tx_pkts_queued' is modified before setting
highest_priority_queue. If in-between the main loop starts, it will see
a packet queued (tx_pkts_queued > 0) but will not find it, since max prio
is not set yet. Depending on the scheduling, the thread trying to add the
packet could complete and restore the situation. But this is not something
to rely on.
Another race condition exists, if a new packet, exceeding current max prio
is added. If concurrently a packet is dequeued, the newly set max
prio will be overwritten with the value of the dequeued packet.
This can occur, because selecting a packet and modifying the max prio
is not atomic. The result in an infinite loop unless, a new packet is
added that has at least the priority of the hidden packet.

Same applies to bss_prio_tbl. Forward iteration is no proper lock-free
technique and provides no protection from calls to list_del. Although
BSS are currently not added/removed dynamically, this must not be the
case in the future. Hence always hold proper locks when accessing those
lists.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/wmm.c | 57 ++++++++++++++++++++----------------
1 file changed, 31 insertions(+), 26 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 62b07d3..62d740b 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -688,13 +688,13 @@ mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
ra_list->total_pkts_size += skb->len;
ra_list->pkt_count++;

- atomic_inc(&priv->wmm.tx_pkts_queued);
-
if (atomic_read(&priv->wmm.highest_queued_prio) <
tos_to_tid_inv[tid_down])
atomic_set(&priv->wmm.highest_queued_prio,
tos_to_tid_inv[tid_down]);

+ atomic_inc(&priv->wmm.tx_pkts_queued);
+
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
}

@@ -890,19 +890,15 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
- int is_list_empty;
- unsigned long flags;
+ unsigned long flags_bss, flags_ra;
int i, j;

for (j = adapter->priv_num - 1; j >= 0; --j) {
spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
- flags);
- is_list_empty = list_empty(&adapter->bss_prio_tbl[j]
- .bss_prio_head);
- spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
- flags);
- if (is_list_empty)
- continue;
+ flags_bss);
+
+ if (list_empty(&adapter->bss_prio_tbl[j].bss_prio_head))
+ goto skip_prio_tbl;

if (adapter->bss_prio_tbl[j].bss_prio_cur ==
(struct mwifiex_bss_prio_node *)
@@ -927,21 +923,18 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
hqp = &priv_tmp->wmm.highest_queued_prio;
for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {

+ spin_lock_irqsave(&priv_tmp->wmm.
+ ra_list_spinlock, flags_ra);
+
tid_ptr = &(priv_tmp)->wmm.
tid_tbl_ptr[tos_to_tid[i]];

/* For non-STA ra_list_curr may be NULL */
if (!tid_ptr->ra_list_curr)
- continue;
+ goto skip_wmm_queue;

- spin_lock_irqsave(&priv_tmp->wmm.
- ra_list_spinlock, flags);
- is_list_empty =
- list_empty(&tid_ptr->ra_list);
- spin_unlock_irqrestore(&priv_tmp->wmm.
- ra_list_spinlock, flags);
- if (is_list_empty)
- continue;
+ if (list_empty(&tid_ptr->ra_list))
+ goto skip_wmm_queue;

/*
* Always choose the next ra we transmitted
@@ -963,11 +956,10 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
}

do {
- is_list_empty =
- skb_queue_empty(&ptr->skb_head);
-
- if (!is_list_empty)
+ if (!skb_queue_empty(&ptr->skb_head)) {
+ /* holds both locks */
goto found;
+ }

/* Get next ra */
ptr = list_first_entry(&ptr->list,
@@ -981,6 +973,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_ra_list_tbl,
list);
} while (ptr != head);
+
+skip_wmm_queue:
+ spin_unlock_irqrestore(&priv_tmp->wmm.
+ ra_list_spinlock,
+ flags_ra);
+
}

skip_bss:
@@ -998,14 +996,21 @@ skip_bss:
struct mwifiex_bss_prio_node,
list);
} while (bssprio_node != bssprio_head);
+
+skip_prio_tbl:
+ spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
+ flags_bss);
+
}
return NULL;

found:
- spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
+ /* holds bss_prio_lock / ra_list_spinlock */
if (atomic_read(hqp) > i)
atomic_set(hqp, i);
- spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags);
+ spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags_ra);
+ spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
+ flags_bss);

*priv = priv_tmp;
*tid = tos_to_tid[i];
--
1.7.10.4


2013-04-02 00:09:13

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 3/6] mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.

ra_list_spinlock is used to protect struct mwifiex_wmm_desc and embedded
structures such as ra_list. tid_tbl_lock while more fine grained, is not
used but in one function. That function is not called reentrantly. To
protect ra_list from concurrent modification ra_list_spinlock must be held.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/init.c | 1 -
drivers/net/wireless/mwifiex/main.h | 2 --
drivers/net/wireless/mwifiex/wmm.c | 8 ++++----
3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index e38aa9b..ba9c5a5 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -535,7 +535,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)
priv = adapter->priv[i];
for (j = 0; j < MAX_NUM_TID; ++j) {
INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[j].ra_list);
- spin_lock_init(&priv->wmm.tid_tbl_ptr[j].tid_tbl_lock);
}
INIT_LIST_HEAD(&priv->tx_ba_stream_tbl_ptr);
INIT_LIST_HEAD(&priv->rx_reorder_tbl_ptr);
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 3de4cc7..082a468 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -210,8 +210,6 @@ struct mwifiex_ra_list_tbl {

struct mwifiex_tid_tbl {
struct list_head ra_list;
- /* spin lock for tid table */
- spinlock_t tid_tbl_lock;
struct mwifiex_ra_list_tbl *ra_list_curr;
};

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 5b454d6..faae344 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -934,12 +934,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
if (!tid_ptr->ra_list_curr)
continue;

- spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
- flags);
+ spin_lock_irqsave(&priv_tmp->wmm.
+ ra_list_spinlock, flags);
is_list_empty =
list_empty(&tid_ptr->ra_list);
- spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
- flags);
+ spin_unlock_irqrestore(&priv_tmp->wmm.
+ ra_list_spinlock, flags);
if (is_list_empty)
continue;

--
1.7.10.4


2013-04-23 18:37:09

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH v3 0/2] wmm queues handling simplificatons

Hi John,

>
> v3: minor cleanups for coding style; checkpatch.pl --strict

Could you please apply the v3 series from Andreas Fenkart?
I've ACKed both patches individually.

Thanks,
Bing

> v2: split 6 patches into 2 patchsets. This is the set 2.
>
> Andreas Fenkart (2):
> mwifiex: replace ra_list_curr by list rotation.
> mwifiex: rework round robin scheduling of bss nodes.
>
> drivers/net/wireless/mwifiex/11n_aggr.c | 14 +--
> drivers/net/wireless/mwifiex/init.c | 21 ++--
> drivers/net/wireless/mwifiex/main.h | 1 -
> drivers/net/wireless/mwifiex/wmm.c | 160 ++++++++++---------------------
> drivers/net/wireless/mwifiex/wmm.h | 3 +
> 5 files changed, 61 insertions(+), 138 deletions(-)


2013-04-04 21:01:45

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 2/4] mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.

ra_list_spinlock is used to protect struct mwifiex_wmm_desc and embedded
structures such as ra_list. tid_tbl_lock while more fine grained, is not
used but in one function. That function is not called reentrantly. To
protect ra_list from concurrent modification ra_list_spinlock must be held.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/init.c | 1 -
drivers/net/wireless/mwifiex/main.h | 2 --
drivers/net/wireless/mwifiex/wmm.c | 8 ++++----
3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index e38aa9b..ba9c5a5 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -535,7 +535,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)
priv = adapter->priv[i];
for (j = 0; j < MAX_NUM_TID; ++j) {
INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[j].ra_list);
- spin_lock_init(&priv->wmm.tid_tbl_ptr[j].tid_tbl_lock);
}
INIT_LIST_HEAD(&priv->tx_ba_stream_tbl_ptr);
INIT_LIST_HEAD(&priv->rx_reorder_tbl_ptr);
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 553adfb..d94406a 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -210,8 +210,6 @@ struct mwifiex_ra_list_tbl {

struct mwifiex_tid_tbl {
struct list_head ra_list;
- /* spin lock for tid table */
- spinlock_t tid_tbl_lock;
struct mwifiex_ra_list_tbl *ra_list_curr;
};

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index b132c42..96eed2f 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -930,12 +930,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
if (!tid_ptr->ra_list_curr)
continue;

- spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
- flags);
+ spin_lock_irqsave(&priv_tmp->wmm.
+ ra_list_spinlock, flags);
is_list_empty =
list_empty(&tid_ptr->ra_list);
- spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
- flags);
+ spin_unlock_irqrestore(&priv_tmp->wmm.
+ ra_list_spinlock, flags);
if (is_list_empty)
continue;

--
1.7.10.4


2013-04-02 00:09:07

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 2/6] mwifiex: bug: wrong list in list_empty check.

adapter->bss_prio_tbl list has already been checked in outer loop. The
inner loop works with priv_tmp->wmm.tid_tbl_ptr list. Also the lock taken,
gives hint that this is likely a copy-paste error.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/wmm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 91dee27..5b454d6 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -937,8 +937,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
flags);
is_list_empty =
- list_empty(&adapter->bss_prio_tbl[j]
- .bss_prio_head);
+ list_empty(&tid_ptr->ra_list);
spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
flags);
if (is_list_empty)
--
1.7.10.4


2013-04-11 11:52:22

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 2/2] mwifiex: rework round robin scheduling of bss nodes.

Rotate bss prio list, so the bss next to the one served, will come first
in the list of bss' with equal priority. This way we pick bss nodes in a
round robin fashion. Using list rotation instead of a cur ptr simplifies
iteration to calling list_for_each_entry. List rotation is done via
list_move, where the head itself is temporarily removed and then
re-inserted after the bss just served.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 6 ---
drivers/net/wireless/mwifiex/init.c | 21 +++-----
drivers/net/wireless/mwifiex/wmm.c | 79 +++++++++----------------------
3 files changed, 29 insertions(+), 77 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index c6d7451..a78e065 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -297,12 +297,6 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, pra_list, ptrindex);
- /* Now bss_prio_cur pointer points to next node */
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node, list);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index 716a68b..ff6b7ec 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -44,8 +44,6 @@ static int mwifiex_add_bss_prio_tbl(struct mwifiex_private *priv)

bss_prio->priv = priv;
INIT_LIST_HEAD(&bss_prio->list);
- if (!tbl[priv->bss_priority].bss_prio_cur)
- tbl[priv->bss_priority].bss_prio_cur = bss_prio;

spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
list_add_tail(&bss_prio->list, &tbl[priv->bss_priority].bss_prio_head);
@@ -525,7 +523,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)

for (i = 0; i < adapter->priv_num; ++i) {
INIT_LIST_HEAD(&adapter->bss_prio_tbl[i].bss_prio_head);
- adapter->bss_prio_tbl[i].bss_prio_cur = NULL;
spin_lock_init(&adapter->bss_prio_tbl[i].bss_prio_lock);
}

@@ -625,42 +622,36 @@ static void mwifiex_delete_bss_prio_tbl(struct mwifiex_private *priv)
{
int i;
struct mwifiex_adapter *adapter = priv->adapter;
- struct mwifiex_bss_prio_node *bssprio_node, *tmp_node, **cur;
+ struct mwifiex_bss_prio_node *bssprio_node, *tmp_node;
struct list_head *head;
spinlock_t *lock; /* bss priority lock */
unsigned long flags;

for (i = 0; i < adapter->priv_num; ++i) {
head = &adapter->bss_prio_tbl[i].bss_prio_head;
- cur = &adapter->bss_prio_tbl[i].bss_prio_cur;
lock = &adapter->bss_prio_tbl[i].bss_prio_lock;
dev_dbg(adapter->dev, "info: delete BSS priority table,"
" bss_type = %d, bss_num = %d, i = %d,"
- " head = %p, cur = %p\n",
- priv->bss_type, priv->bss_num, i, head, *cur);
- if (*cur) {
+ " head = %p\n",
+ priv->bss_type, priv->bss_num, i, head);
+
+ {
spin_lock_irqsave(lock, flags);
if (list_empty(head)) {
spin_unlock_irqrestore(lock, flags);
continue;
}
- bssprio_node = list_first_entry(head,
- struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(lock, flags);
-
list_for_each_entry_safe(bssprio_node, tmp_node, head,
list) {
if (bssprio_node->priv == priv) {
dev_dbg(adapter->dev, "info: Delete "
"node %p, next = %p\n",
bssprio_node, tmp_node);
- spin_lock_irqsave(lock, flags);
list_del(&bssprio_node->list);
- spin_unlock_irqrestore(lock, flags);
kfree(bssprio_node);
}
}
- *cur = (struct mwifiex_bss_prio_node *)head;
+ spin_unlock_irqrestore(lock, flags);
}
}
}
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index c0d7952..21ae7c2 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -881,37 +881,25 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
{
struct mwifiex_private *priv_tmp;
struct mwifiex_ra_list_tbl *ptr;
- struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
unsigned long flags_bss, flags_ra;
int i, j;

+ /* check the BSS with highest priority first */
for (j = adapter->priv_num - 1; j >= 0; --j) {
spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags_bss);

- if (list_empty(&adapter->bss_prio_tbl[j].bss_prio_head))
- goto skip_prio_tbl;
-
- if (adapter->bss_prio_tbl[j].bss_prio_cur ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head) {
- adapter->bss_prio_tbl[j].bss_prio_cur =
- list_first_entry(&adapter->bss_prio_tbl[j]
- .bss_prio_head,
- struct mwifiex_bss_prio_node,
- list);
- }
-
- bssprio_node = adapter->bss_prio_tbl[j].bss_prio_cur;
- bssprio_head = bssprio_node;
+ /* iterate over BSS with the equal priority */
+ list_for_each_entry(adapter->bss_prio_tbl[j].bss_prio_cur,
+ &adapter->bss_prio_tbl[j].bss_prio_head,
+ list) {

- do {
- priv_tmp = bssprio_node->priv;
+ priv_tmp = adapter->bss_prio_tbl[j].bss_prio_cur->priv;

if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
- goto skip_bss;
+ continue;

/* iterate over the WMM queues of the BSS */
hqp = &priv_tmp->wmm.highest_queued_prio;
@@ -936,24 +924,8 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
ra_list_spinlock,
flags_ra);
}
+ }

-skip_bss:
- /* Get next bss priority node */
- bssprio_node = list_first_entry(&bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
-
- if (bssprio_node ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head)
- /* Get next bss priority node */
- bssprio_node = list_first_entry(
- &bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
- } while (bssprio_node != bssprio_head);
-
-skip_prio_tbl:
spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags_bss);
}
@@ -974,12 +946,12 @@ found:
return ptr;
}

-/* This functions rotates ra lists so packets are picked in round robin
- * fashion.
+/* This functions rotates ra and bss lists so packets are picked round robin.
*
* After a packet is successfully transmitted, rotate the ra list, so the ra
* next to the one transmitted, will come first in the list. This way we pick
- * the ra in a round robin fashion.
+ * the ra' in a round robin fashion. Same applies to bss nodes of equal
+ * priority.
*
* Function also increments wmm.packets_out counter.
*/
@@ -987,17 +959,24 @@ void mwifiex_rotate_priolists(struct mwifiex_private *priv,
struct mwifiex_ra_list_tbl *ra,
int tid)
{
+ struct mwifiex_adapter *adapter = priv->adapter;
+ struct mwifiex_bss_prio_tbl *tbl = adapter->bss_prio_tbl;
struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
unsigned long flags;

+ spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it after
+ * curr bss node. imagine list to stay fixed while head is moved
+ */
+ list_move(&tbl[priv->bss_priority].bss_prio_head,
+ &tbl[priv->bss_priority].bss_prio_cur->list);
+ spin_unlock_irqrestore(&tbl[priv->bss_priority].bss_prio_lock, flags);
+
spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
if (mwifiex_is_ralist_valid(priv, ra, tid)) {
priv->wmm.packets_out[tid]++;
- /*
- * dirty trick: we remove 'head' temporarily and reinsert it
- * after curr bss node. imagine list to stay fixed while only
- * head is moved
- */
+ /* same as above */
list_move(&tid_ptr->ra_list, &ra->list);
}
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
@@ -1090,12 +1069,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
ra_list_flags);
} else {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
@@ -1201,12 +1174,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
--
1.7.10.4


2013-04-04 22:38:44

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 4/4] mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.

Hi Andi,

> @@ -981,6 +973,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
> struct mwifiex_ra_list_tbl,
> list);
> } while (ptr != head);
> +
> +skip_wmm_queue:
> + spin_unlock_irqrestore(&priv_tmp->wmm.
> + ra_list_spinlock,
> + flags_ra);
> +

Please remove this blank line.

> }
>
> skip_bss:
> @@ -998,14 +996,21 @@ skip_bss:
> struct mwifiex_bss_prio_node,
> list);
> } while (bssprio_node != bssprio_head);
> +
> +skip_prio_tbl:
> + spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
> + flags_bss);
> +

Please remove this blank line.

> }
> return NULL;
>
> found:
> - spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
> + /* holds bss_prio_lock / ra_list_spinlock */

Thanks,
Bing


2013-04-04 22:56:38

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Andi,

> I reworked the patchset and split into two.

Thanks for the v2 series.

>
>
> set 1:
> 1/4 mwifiex: rework round robin scheduling of bss nodes.
> 2/4 mwifiex: replace ra_list_curr by list rotation.
> 3/4 mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.
> 4/4 mwifiex: bug: remove NO_PKT_PRIO_TID.

In your patchset 1, the actual patches are:

1/4 mwifiex: bug: wrong list in list_empty check.
2/4 mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.
3/4 mwifiex: bug: remove NO_PKT_PRIO_TID.
4/4 mwifiex: bug: hold proper locks when accessing ra_list / bss_prio lists.

And I have comments on 2/4 and 4/4. Could you please make changes and resend v3 series?
Please let me know if you want me to make the change and post the v3.

[...]

> set 2:
> 1/2 mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.
> 2/2 mwifiex: bug: wrong list in list_empty check.

And the actual patchset 2 is:

1/2 mwifiex: replace ra_list_curr by list rotation.
2/2 mwifiex: rework round robin scheduling of bss nodes.

checkpatch.pl detected a couple warnings. Please check and resend v3 series for patchset 2.

>
> Patch set 2 can be delayed, but since hard to read code probably
> introduced all the problems, I suggest to apply it promptly.
> It simplifies the code a lot.

I will ack your latest patchset 2 as soon as the WMM test is passed.

> > We will run stress tests against 4-6.
>
> I'm here at 4+ days, still running. This exceeds all previous
> tests on my particular setup.

Could you share what test cases you are running? I will try to cover the cases you have not done.

Thanks,
Bing

2013-04-11 11:52:00

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 1/2] mwifiex: replace ra_list_curr by list rotation.

After a packet is successfully transmitted, ra list is rotated, so the ra
next to the one transmitted, will be the first in the list. This way we
pick the ra' in a round robin fashion. This significantly simplifies
iteration in mwifiex_wmm_get_highest_priolist_ptr to a call to
list_for_each_entry.
List rotation is done via list_move, where the head itself is temporarily
removed and then re-inserted after the item just transferred.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 8 +--
drivers/net/wireless/mwifiex/main.h | 1 -
drivers/net/wireless/mwifiex/wmm.c | 97 ++++++++++++-------------------
drivers/net/wireless/mwifiex/wmm.h | 3 +
4 files changed, 40 insertions(+), 69 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index af8fe63..c6d7451 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -296,19 +296,13 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, pra_list, ptrindex)) {
- priv->wmm.packets_out[ptrindex]++;
- priv->wmm.tid_tbl_ptr[ptrindex].ra_list_curr = pra_list;
- }
+ mwifiex_rotate_priolists(priv, pra_list, ptrindex);
/* Now bss_prio_cur pointer points to next node */
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
.bss_prio_cur->list,
struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
index 11f4127..4aeb6ea 100644
--- a/drivers/net/wireless/mwifiex/main.h
+++ b/drivers/net/wireless/mwifiex/main.h
@@ -210,7 +210,6 @@ struct mwifiex_ra_list_tbl {

struct mwifiex_tid_tbl {
struct list_head ra_list;
- struct mwifiex_ra_list_tbl *ra_list_curr;
};

#define WMM_HIGHEST_PRIORITY 7
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index b8146f0..c0d7952 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -191,9 +191,6 @@ mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra)
}
list_add_tail(&ra_list->list,
&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- if (!priv->wmm.tid_tbl_ptr[i].ra_list_curr)
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = ra_list;
}
}

@@ -424,7 +421,6 @@ mwifiex_wmm_init(struct mwifiex_adapter *adapter)
priv->aggr_prio_tbl[i].amsdu = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_ap = tos_to_tid_inv[i];
priv->aggr_prio_tbl[i].ampdu_user = tos_to_tid_inv[i];
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}

priv->aggr_prio_tbl[6].amsdu
@@ -533,8 +529,6 @@ static void mwifiex_wmm_delete_all_ralist(struct mwifiex_private *priv)
}

INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[i].ra_list);
-
- priv->wmm.tid_tbl_ptr[i].ra_list_curr = NULL;
}
}

@@ -886,7 +880,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_private **priv, int *tid)
{
struct mwifiex_private *priv_tmp;
- struct mwifiex_ra_list_tbl *ptr, *head;
+ struct mwifiex_ra_list_tbl *ptr;
struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
@@ -929,51 +923,15 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
tid_ptr = &(priv_tmp)->wmm.
tid_tbl_ptr[tos_to_tid[i]];

- /* For non-STA ra_list_curr may be NULL */
- if (!tid_ptr->ra_list_curr)
- goto skip_wmm_queue;
-
- if (list_empty(&tid_ptr->ra_list))
- goto skip_wmm_queue;
-
- /*
- * Always choose the next ra we transmitted
- * last time, this way we pick the ra's in
- * round robin fashion.
- */
- ptr = list_first_entry(
- &tid_ptr->ra_list_curr->list,
- struct mwifiex_ra_list_tbl,
- list);
+ /* iterate over receiver addresses */
+ list_for_each_entry(ptr, &tid_ptr->ra_list,
+ list) {

- head = ptr;
- if (ptr == (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list) {
- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl, list);
- head = ptr;
- }
-
- do {
if (!skb_queue_empty(&ptr->skb_head))
/* holds both locks */
goto found;
+ }

- /* Get next ra */
- ptr = list_first_entry(&ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- if (ptr ==
- (struct mwifiex_ra_list_tbl *)
- &tid_ptr->ra_list)
- ptr = list_first_entry(
- &ptr->list,
- struct mwifiex_ra_list_tbl,
- list);
- } while (ptr != head);
-
-skip_wmm_queue:
spin_unlock_irqrestore(&priv_tmp->wmm.
ra_list_spinlock,
flags_ra);
@@ -1016,6 +974,35 @@ found:
return ptr;
}

+/* This functions rotates ra lists so packets are picked in round robin
+ * fashion.
+ *
+ * After a packet is successfully transmitted, rotate the ra list, so the ra
+ * next to the one transmitted, will come first in the list. This way we pick
+ * the ra in a round robin fashion.
+ *
+ * Function also increments wmm.packets_out counter.
+ */
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid)
+{
+ struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
+ unsigned long flags;
+
+ spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
+ if (mwifiex_is_ralist_valid(priv, ra, tid)) {
+ priv->wmm.packets_out[tid]++;
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it
+ * after curr bss node. imagine list to stay fixed while only
+ * head is moved
+ */
+ list_move(&tid_ptr->ra_list, &ra->list);
+ }
+ spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
+}
+
/*
* This function checks if 11n aggregation is possible.
*/
@@ -1102,11 +1089,7 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
ra_list_flags);
} else {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1114,8 +1097,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

@@ -1219,11 +1200,7 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
break;
}
if (ret != -EBUSY) {
- spin_lock_irqsave(&priv->wmm.ra_list_spinlock, ra_list_flags);
- if (mwifiex_is_ralist_valid(priv, ptr, ptr_index)) {
- priv->wmm.packets_out[ptr_index]++;
- priv->wmm.tid_tbl_ptr[ptr_index].ra_list_curr = ptr;
- }
+ mwifiex_rotate_priolists(priv, ptr, ptr_index);
adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
list_first_entry(
&adapter->bss_prio_tbl[priv->bss_priority]
@@ -1231,8 +1208,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
struct mwifiex_bss_prio_node,
list);
atomic_dec(&priv->wmm.tx_pkts_queued);
- spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock,
- ra_list_flags);
}
}

diff --git a/drivers/net/wireless/mwifiex/wmm.h b/drivers/net/wireless/mwifiex/wmm.h
index b92f39d..644d6e0 100644
--- a/drivers/net/wireless/mwifiex/wmm.h
+++ b/drivers/net/wireless/mwifiex/wmm.h
@@ -85,6 +85,9 @@ mwifiex_wmm_is_ra_list_empty(struct list_head *ra_list_hhead)
void mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
struct sk_buff *skb);
void mwifiex_ralist_add(struct mwifiex_private *priv, u8 *ra);
+void mwifiex_rotate_priolists(struct mwifiex_private *priv,
+ struct mwifiex_ra_list_tbl *ra,
+ int tid);

int mwifiex_wmm_lists_empty(struct mwifiex_adapter *adapter);
void mwifiex_wmm_process_tx(struct mwifiex_adapter *adapter);
--
1.7.10.4


2013-04-23 18:55:22

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH v3 0/2] wmm queues handling simplificatons

Hi John,

> > > v3: minor cleanups for coding style; checkpatch.pl --strict
> >
> > Could you please apply the v3 series from Andreas Fenkart?
> > I've ACKed both patches individually.
>
> Hmmm....this series seems to have slipped through my fingers.
> Could you or Andreas re-send it (preferably with Bing's ACKs included)?

I will re-send it shortly.

Thanks,
Bing

2013-04-03 11:36:03

by Andreas Fenkart

[permalink] [raw]
Subject: Re: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Bing,

On Tue, Apr 02, 2013 at 07:40:53PM -0700, Bing Zhao wrote:
>
> > Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
> > lead to a contradictory state, resulting in an infinite loop.
> > Currently queueing and dequeuing of packets is not synchronized, and can
> > happen concurrently. While tx_pkts_queued is incremented when adding a
> > packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
> > is added right after the check for empty, but before setting max prio to
> > NO_PKT, that packet is trapped and creates an infinite loop.
> > Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
> > lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> > this wmm queue, it has no packets". The infinite loop results, because the
> > main loop checks the wmm lists for not empty via tx_pkts_queued, but when
> > dequeing uses max_prio to see if it can skip a list. This will never end,
> > unless a new packet is added which will restore max prio to the level of
> > the trapped packet.
> > The solution here is to rely on tx_pkts_queued solely for checking wmm
> > queue to be empty, and drop the NO_PKT define. It does not address the
> > locking issue.
> >
> > Signed-off-by: Andreas Fenkart <[email protected]>
>
> With this patch (1/6) applied, I'm getting soft-lockup watchdog:
>
> BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:37]

My bad here, should be like this when patch is applied first:

@@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct
mwifiex_adapter *adapter,

do {
priv_tmp = bssprio_node->priv;
- hqp = &priv_tmp->wmm.highest_queued_prio;

+ if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
+ goto skip_bss;
+
+ /* iterate over the WMM queues of the BSS */
+ hqp = &priv_tmp->wmm.highest_queued_prio;
for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {

tid_ptr = &(priv_tmp)->wmm.
@@ -980,12 +984,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
} while (ptr != head);
}

- /* No packet at any TID for this priv. Mark as
such
- * to skip checking TIDs for this priv (until
pkt is
- * added).
- */
- atomic_set(hqp, NO_PKT_PRIO_TID);
-
+skip_bss:
/* Get next bss priority node */
bssprio_node = list_first_entry(&bssprio_node->list,
struct mwifiex_bss_prio_node,

That said, yes I developed the pathset the other way round. First
cleaned up, until I knew how to fix the bug best. Then pulled the fix
in front of the cleanup patches and -- mea culpa -- didn't test the
patches individually. Sorry again.

Also found issue here, which could be a problem without patch 6/6:

--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -688,13 +688,13 @@ mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
ra_list->total_pkts_size += skb->len;
ra_list->pkt_count++;

- atomic_inc(&priv->wmm.tx_pkts_queued);
-
if (atomic_read(&priv->wmm.highest_queued_prio) <
tos_to_tid_inv[tid_down])
atomic_set(&priv->wmm.highest_queued_prio,
tos_to_tid_inv[tid_down]);

+ atomic_inc(&priv->wmm.tx_pkts_queued);
+


How should I proceed? Can I reorder patches to match my development
cycle, which is? 2-5;1;6 or more verbosely cleanup first followed
by bug fix and proper locking last

Or should keep the order as is, but fix patch 1, and propagate changes
through patch 2 till 6?

rgds,
Andi

2013-04-04 22:33:46

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 2/4] mwifiex: remove unused tid_tbl_lock from mwifiex_tid_tbl.

Hi Andi,

> ra_list_spinlock is used to protect struct mwifiex_wmm_desc and embedded
> structures such as ra_list. tid_tbl_lock while more fine grained, is not
> used but in one function. That function is not called reentrantly. To
> protect ra_list from concurrent modification ra_list_spinlock must be held.
>
> Signed-off-by: Andreas Fenkart <[email protected]>
> ---
> drivers/net/wireless/mwifiex/init.c | 1 -
> drivers/net/wireless/mwifiex/main.h | 2 --
> drivers/net/wireless/mwifiex/wmm.c | 8 ++++----
> 3 files changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
> index e38aa9b..ba9c5a5 100644
> --- a/drivers/net/wireless/mwifiex/init.c
> +++ b/drivers/net/wireless/mwifiex/init.c
> @@ -535,7 +535,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)
> priv = adapter->priv[i];
> for (j = 0; j < MAX_NUM_TID; ++j) {
> INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[j].ra_list);
> - spin_lock_init(&priv->wmm.tid_tbl_ptr[j].tid_tbl_lock);
> }

Could you remove the braces {} too?

for (j = 0; j < MAX_NUM_TID; ++j)
INIT_LIST_HEAD(&priv->wmm.tid_tbl_ptr[j].ra_list);

Thanks,
Bing

> INIT_LIST_HEAD(&priv->tx_ba_stream_tbl_ptr);
> INIT_LIST_HEAD(&priv->rx_reorder_tbl_ptr);
> diff --git a/drivers/net/wireless/mwifiex/main.h b/drivers/net/wireless/mwifiex/main.h
> index 553adfb..d94406a 100644
> --- a/drivers/net/wireless/mwifiex/main.h
> +++ b/drivers/net/wireless/mwifiex/main.h
> @@ -210,8 +210,6 @@ struct mwifiex_ra_list_tbl {
>
> struct mwifiex_tid_tbl {
> struct list_head ra_list;
> - /* spin lock for tid table */
> - spinlock_t tid_tbl_lock;
> struct mwifiex_ra_list_tbl *ra_list_curr;
> };
>
> diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
> index b132c42..96eed2f 100644
> --- a/drivers/net/wireless/mwifiex/wmm.c
> +++ b/drivers/net/wireless/mwifiex/wmm.c
> @@ -930,12 +930,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
> if (!tid_ptr->ra_list_curr)
> continue;
>
> - spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
> - flags);
> + spin_lock_irqsave(&priv_tmp->wmm.
> + ra_list_spinlock, flags);
> is_list_empty =
> list_empty(&tid_ptr->ra_list);
> - spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
> - flags);
> + spin_unlock_irqrestore(&priv_tmp->wmm.
> + ra_list_spinlock, flags);
> if (is_list_empty)
> continue;
>
> --
> 1.7.10.4


2013-04-04 21:08:26

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 2/2] mwifiex: rework round robin scheduling of bss nodes.

Rotate bss prio list, so the bss next to the one served, will come first
in the list of bss' with equal priority. This way we pick bss nodes in a
round robin fashion. Using list rotation instead of a cur ptr simplifies
iteration to calling list_for_each_entry. List rotation is done via
list_move, where the head itself is temporarily removed and then
re-inserted after the bss just served.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 6 ---
drivers/net/wireless/mwifiex/init.c | 21 +++-----
drivers/net/wireless/mwifiex/wmm.c | 81 +++++++++----------------------
3 files changed, 30 insertions(+), 78 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index c6d7451..a78e065 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -297,12 +297,6 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, pra_list, ptrindex);
- /* Now bss_prio_cur pointer points to next node */
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node, list);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index ba9c5a5..90ed3f2 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -44,8 +44,6 @@ static int mwifiex_add_bss_prio_tbl(struct mwifiex_private *priv)

bss_prio->priv = priv;
INIT_LIST_HEAD(&bss_prio->list);
- if (!tbl[priv->bss_priority].bss_prio_cur)
- tbl[priv->bss_priority].bss_prio_cur = bss_prio;

spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
list_add_tail(&bss_prio->list, &tbl[priv->bss_priority].bss_prio_head);
@@ -525,7 +523,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)

for (i = 0; i < adapter->priv_num; ++i) {
INIT_LIST_HEAD(&adapter->bss_prio_tbl[i].bss_prio_head);
- adapter->bss_prio_tbl[i].bss_prio_cur = NULL;
spin_lock_init(&adapter->bss_prio_tbl[i].bss_prio_lock);
}

@@ -626,42 +623,36 @@ static void mwifiex_delete_bss_prio_tbl(struct mwifiex_private *priv)
{
int i;
struct mwifiex_adapter *adapter = priv->adapter;
- struct mwifiex_bss_prio_node *bssprio_node, *tmp_node, **cur;
+ struct mwifiex_bss_prio_node *bssprio_node, *tmp_node;
struct list_head *head;
spinlock_t *lock; /* bss priority lock */
unsigned long flags;

for (i = 0; i < adapter->priv_num; ++i) {
head = &adapter->bss_prio_tbl[i].bss_prio_head;
- cur = &adapter->bss_prio_tbl[i].bss_prio_cur;
lock = &adapter->bss_prio_tbl[i].bss_prio_lock;
dev_dbg(adapter->dev, "info: delete BSS priority table,"
" bss_type = %d, bss_num = %d, i = %d,"
- " head = %p, cur = %p\n",
- priv->bss_type, priv->bss_num, i, head, *cur);
- if (*cur) {
+ " head = %p\n",
+ priv->bss_type, priv->bss_num, i, head);
+
+ {
spin_lock_irqsave(lock, flags);
if (list_empty(head)) {
spin_unlock_irqrestore(lock, flags);
continue;
}
- bssprio_node = list_first_entry(head,
- struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(lock, flags);
-
list_for_each_entry_safe(bssprio_node, tmp_node, head,
list) {
if (bssprio_node->priv == priv) {
dev_dbg(adapter->dev, "info: Delete "
"node %p, next = %p\n",
bssprio_node, tmp_node);
- spin_lock_irqsave(lock, flags);
list_del(&bssprio_node->list);
- spin_unlock_irqrestore(lock, flags);
kfree(bssprio_node);
}
}
- *cur = (struct mwifiex_bss_prio_node *)head;
+ spin_unlock_irqrestore(lock, flags);
}
}
}
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index c865e58..4995ff6 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -881,37 +881,25 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
{
struct mwifiex_private *priv_tmp;
struct mwifiex_ra_list_tbl *ptr;
- struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
unsigned long flags_bss, flags_ra;
int i, j;

+ /* check the BSS with highest priority first */
for (j = adapter->priv_num - 1; j >= 0; --j) {
spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags_bss);

- if (list_empty(&adapter->bss_prio_tbl[j].bss_prio_head))
- goto skip_prio_tbl;
-
- if (adapter->bss_prio_tbl[j].bss_prio_cur ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head) {
- adapter->bss_prio_tbl[j].bss_prio_cur =
- list_first_entry(&adapter->bss_prio_tbl[j]
- .bss_prio_head,
- struct mwifiex_bss_prio_node,
- list);
- }
-
- bssprio_node = adapter->bss_prio_tbl[j].bss_prio_cur;
- bssprio_head = bssprio_node;
+ /* iterate over BSS with the equal priority */
+ list_for_each_entry(adapter->bss_prio_tbl[j].bss_prio_cur,
+ &adapter->bss_prio_tbl[j].bss_prio_head,
+ list) {

- do {
- priv_tmp = bssprio_node->priv;
+ priv_tmp = adapter->bss_prio_tbl[j].bss_prio_cur->priv;

if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
- goto skip_bss;
+ continue;

/* iterate over the WMM queues of the BSS */
hqp = &priv_tmp->wmm.highest_queued_prio;
@@ -940,23 +928,8 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,

}

-skip_bss:
- /* Get next bss priority node */
- bssprio_node = list_first_entry(&bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
-
- if (bssprio_node ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head)
- /* Get next bss priority node */
- bssprio_node = list_first_entry(
- &bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
- } while (bssprio_node != bssprio_head);
-
-skip_prio_tbl:
+ }
+
spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags_bss);

@@ -978,12 +951,12 @@ found:
}

/*
- * This functions rotates ra lists so packets are picked in round robin
- * fashion.
+ * This functions rotates ra and bss lists so packets are picked round robin.
*
* After a packet is successfully transmitted, rotate the ra list, so the ra
* next to the one transmitted, will come first in the list. This way we pick
- * the ra in a round robin fashion.
+ * the ra' in a round robin fashion. Same applies to bss nodes of equal
+ * priority.
*
* Function also increments wmm.packets_out counter.
*/
@@ -991,21 +964,27 @@ void mwifiex_rotate_priolists(struct mwifiex_private *priv,
struct mwifiex_ra_list_tbl *ra,
int tid)
{
+ struct mwifiex_adapter *adapter = priv->adapter;
+ struct mwifiex_bss_prio_tbl *tbl = adapter->bss_prio_tbl;
struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
unsigned long flags;

+ spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it after
+ * curr bss node. imagine list to stay fixed while head is moved
+ */
+ list_move(&tbl[priv->bss_priority].bss_prio_head,
+ &tbl[priv->bss_priority].bss_prio_cur->list);
+ spin_unlock_irqrestore(&tbl[priv->bss_priority].bss_prio_lock, flags);
+
spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
if (mwifiex_is_ralist_valid(priv, ra, tid)) {
priv->wmm.packets_out[tid]++;
- /*
- * dirty trick: we remove 'head' temporarily and reinsert it
- * after curr bss node. imagine list to stay fixed while only
- * head is moved
- */
+ /* same as above */
list_move(&tid_ptr->ra_list, &ra->list);
}
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
-
}

/*
@@ -1095,12 +1074,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
ra_list_flags);
} else {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
@@ -1206,12 +1179,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
--
1.7.10.4


2013-04-03 02:43:48

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Andi,

Thanks for the patchset.

> Using NO_PKT_PRIO_TID and tx_pkts_queued to check for an empty state, can
> lead to a contradictory state, resulting in an infinite loop.
> Currently queueing and dequeuing of packets is not synchronized, and can
> happen concurrently. While tx_pkts_queued is incremented when adding a
> packet, max prio is set to NO_PKT when the WMM list is empty. If a packet
> is added right after the check for empty, but before setting max prio to
> NO_PKT, that packet is trapped and creates an infinite loop.
> Because of the new packet, tx_pkts_queued is at least 1, indicating wmm
> lists are not empty. Opposing that max prio is NO_PKT, which means "skip
> this wmm queue, it has no packets". The infinite loop results, because the
> main loop checks the wmm lists for not empty via tx_pkts_queued, but when
> dequeing uses max_prio to see if it can skip a list. This will never end,
> unless a new packet is added which will restore max prio to the level of
> the trapped packet.
> The solution here is to rely on tx_pkts_queued solely for checking wmm
> queue to be empty, and drop the NO_PKT define. It does not address the
> locking issue.
>
> Signed-off-by: Andreas Fenkart <[email protected]>

With this patch (1/6) applied, I'm getting soft-lockup watchdog:

BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:37]

I'm running 64-bit Ubuntu 12.04 (latest wireless-testing.git) with SD8787.
The BUG is hit when I enter "dhclient" command after association.

# iw mlan0 scan
# iw mlan0 connect MY_AP
# dhclient mlan0

BTW, if I apply the first 5 patches (1/6-5/6) or all 6 patches together, the soft-lockup BUG is gone.
Any ideas?

Thanks,
Bing


2013-04-02 00:09:36

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 5/6] mwifiex: rework round robin scheduling of bss nodes.

Rotate bss prio list, so the bss next to the one served, will come first in
the list of bss' with equal priority. This way we pick bss nodes in a round
robin fashion. Using list rotation instead of a cur ptr simplifies
iteration to calling list_for_each_entry. List rotation is done via
list_move, where the head itself is temporarily removed and then
re-inserted after the bss just served.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/11n_aggr.c | 6 ---
drivers/net/wireless/mwifiex/init.c | 21 +++------
drivers/net/wireless/mwifiex/wmm.c | 72 ++++++++++---------------------
3 files changed, 28 insertions(+), 71 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/11n_aggr.c b/drivers/net/wireless/mwifiex/11n_aggr.c
index c6d7451..a78e065 100644
--- a/drivers/net/wireless/mwifiex/11n_aggr.c
+++ b/drivers/net/wireless/mwifiex/11n_aggr.c
@@ -297,12 +297,6 @@ mwifiex_11n_aggregate_pkt(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, pra_list, ptrindex);
- /* Now bss_prio_cur pointer points to next node */
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node, list);
}

return 0;
diff --git a/drivers/net/wireless/mwifiex/init.c b/drivers/net/wireless/mwifiex/init.c
index ba9c5a5..90ed3f2 100644
--- a/drivers/net/wireless/mwifiex/init.c
+++ b/drivers/net/wireless/mwifiex/init.c
@@ -44,8 +44,6 @@ static int mwifiex_add_bss_prio_tbl(struct mwifiex_private *priv)

bss_prio->priv = priv;
INIT_LIST_HEAD(&bss_prio->list);
- if (!tbl[priv->bss_priority].bss_prio_cur)
- tbl[priv->bss_priority].bss_prio_cur = bss_prio;

spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
list_add_tail(&bss_prio->list, &tbl[priv->bss_priority].bss_prio_head);
@@ -525,7 +523,6 @@ int mwifiex_init_lock_list(struct mwifiex_adapter *adapter)

for (i = 0; i < adapter->priv_num; ++i) {
INIT_LIST_HEAD(&adapter->bss_prio_tbl[i].bss_prio_head);
- adapter->bss_prio_tbl[i].bss_prio_cur = NULL;
spin_lock_init(&adapter->bss_prio_tbl[i].bss_prio_lock);
}

@@ -626,42 +623,36 @@ static void mwifiex_delete_bss_prio_tbl(struct mwifiex_private *priv)
{
int i;
struct mwifiex_adapter *adapter = priv->adapter;
- struct mwifiex_bss_prio_node *bssprio_node, *tmp_node, **cur;
+ struct mwifiex_bss_prio_node *bssprio_node, *tmp_node;
struct list_head *head;
spinlock_t *lock; /* bss priority lock */
unsigned long flags;

for (i = 0; i < adapter->priv_num; ++i) {
head = &adapter->bss_prio_tbl[i].bss_prio_head;
- cur = &adapter->bss_prio_tbl[i].bss_prio_cur;
lock = &adapter->bss_prio_tbl[i].bss_prio_lock;
dev_dbg(adapter->dev, "info: delete BSS priority table,"
" bss_type = %d, bss_num = %d, i = %d,"
- " head = %p, cur = %p\n",
- priv->bss_type, priv->bss_num, i, head, *cur);
- if (*cur) {
+ " head = %p\n",
+ priv->bss_type, priv->bss_num, i, head);
+
+ {
spin_lock_irqsave(lock, flags);
if (list_empty(head)) {
spin_unlock_irqrestore(lock, flags);
continue;
}
- bssprio_node = list_first_entry(head,
- struct mwifiex_bss_prio_node, list);
- spin_unlock_irqrestore(lock, flags);
-
list_for_each_entry_safe(bssprio_node, tmp_node, head,
list) {
if (bssprio_node->priv == priv) {
dev_dbg(adapter->dev, "info: Delete "
"node %p, next = %p\n",
bssprio_node, tmp_node);
- spin_lock_irqsave(lock, flags);
list_del(&bssprio_node->list);
- spin_unlock_irqrestore(lock, flags);
kfree(bssprio_node);
}
}
- *cur = (struct mwifiex_bss_prio_node *)head;
+ spin_unlock_irqrestore(lock, flags);
}
}
}
diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index a9c381a..4d8a5ca 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -881,13 +881,13 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
{
struct mwifiex_private *priv_tmp;
struct mwifiex_ra_list_tbl *ptr;
- struct mwifiex_bss_prio_node *bssprio_node, *bssprio_head;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
int is_list_empty;
unsigned long flags;
int i, j;

+ /* check the BSS with highest priority first */
for (j = adapter->priv_num - 1; j >= 0; --j) {
spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
flags);
@@ -898,21 +898,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
if (is_list_empty)
continue;

- if (adapter->bss_prio_tbl[j].bss_prio_cur ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head) {
- adapter->bss_prio_tbl[j].bss_prio_cur =
- list_first_entry(&adapter->bss_prio_tbl[j]
- .bss_prio_head,
- struct mwifiex_bss_prio_node,
- list);
- }
-
- bssprio_node = adapter->bss_prio_tbl[j].bss_prio_cur;
- bssprio_head = bssprio_node;
+ /* iterate over BSS with the equal priority */
+ list_for_each_entry(adapter->bss_prio_tbl[j].bss_prio_cur,
+ &adapter->bss_prio_tbl[j].bss_prio_head,
+ list) {

- do {
- priv_tmp = bssprio_node->priv;
+ priv_tmp = adapter->bss_prio_tbl[j].bss_prio_cur->priv;

if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
continue;
@@ -943,20 +934,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
}
}

- /* Get next bss priority node */
- bssprio_node = list_first_entry(&bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
-
- if (bssprio_node ==
- (struct mwifiex_bss_prio_node *)
- &adapter->bss_prio_tbl[j].bss_prio_head)
- /* Get next bss priority node */
- bssprio_node = list_first_entry(
- &bssprio_node->list,
- struct mwifiex_bss_prio_node,
- list);
- } while (bssprio_node != bssprio_head);
+ }
}
return NULL;

@@ -973,12 +951,12 @@ found:
}

/*
- * This functions rotates ra lists so packets are picked in round robin
- * fashion.
+ * This functions rotates ra and bss lists so packets are picked round robin.
*
* After a packet is successfully transmitted, rotate the ra list, so the ra
* next to the one transmitted, will come first in the list. This way we pick
- * the ra in a round robin fashion.
+ * the ra' in a round robin fashion. Same applies to bss nodes of equal
+ * priority.
*
* Function also increments wmm.packets_out counter.
*/
@@ -986,21 +964,27 @@ void mwifiex_rotate_priolists(struct mwifiex_private *priv,
struct mwifiex_ra_list_tbl *ra,
int tid)
{
+ struct mwifiex_adapter *adapter = priv->adapter;
+ struct mwifiex_bss_prio_tbl *tbl = adapter->bss_prio_tbl;
struct mwifiex_tid_tbl *tid_ptr = &priv->wmm.tid_tbl_ptr[tid];
unsigned long flags;

+ spin_lock_irqsave(&tbl[priv->bss_priority].bss_prio_lock, flags);
+ /*
+ * dirty trick: we remove 'head' temporarily and reinsert it after
+ * curr bss node. imagine list to stay fixed while head is moved
+ */
+ list_move(&tbl[priv->bss_priority].bss_prio_head,
+ &tbl[priv->bss_priority].bss_prio_cur->list);
+ spin_unlock_irqrestore(&tbl[priv->bss_priority].bss_prio_lock, flags);
+
spin_lock_irqsave(&priv->wmm.ra_list_spinlock, flags);
if (mwifiex_is_ralist_valid(priv, ra, tid)) {
priv->wmm.packets_out[tid]++;
- /*
- * dirty trick: we remove 'head' temporarily and reinsert it
- * after curr bss node. imagine list to stay fixed while only
- * head is moved
- */
+ /* same as above */
list_move(&tid_ptr->ra_list, &ra->list);
}
spin_unlock_irqrestore(&priv->wmm.ra_list_spinlock, flags);
-
}

/*
@@ -1090,12 +1074,6 @@ mwifiex_send_single_packet(struct mwifiex_private *priv,
ra_list_flags);
} else {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
@@ -1201,12 +1179,6 @@ mwifiex_send_processed_packet(struct mwifiex_private *priv,
}
if (ret != -EBUSY) {
mwifiex_rotate_priolists(priv, ptr, ptr_index);
- adapter->bss_prio_tbl[priv->bss_priority].bss_prio_cur =
- list_first_entry(
- &adapter->bss_prio_tbl[priv->bss_priority]
- .bss_prio_cur->list,
- struct mwifiex_bss_prio_node,
- list);
atomic_dec(&priv->wmm.tx_pkts_queued);
}
}
--
1.7.10.4


2013-04-04 22:30:26

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/4] mwifiex: bug: wrong list in list_empty check.

Hi Andi,

Thanks for the v2 patch series.

> adapter->bss_prio_tbl list has already been checked in outer loop. The
> inner loop works with priv_tmp->wmm.tid_tbl_ptr list. Also the lock taken,
> gives hint that this is likely a copy-paste error.
>
> Signed-off-by: Andreas Fenkart <[email protected]>

Acked-by: Bing Zhao <[email protected]>

Thanks,
Bing

> ---
> drivers/net/wireless/mwifiex/wmm.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
> index 32adc87..b132c42 100644
> --- a/drivers/net/wireless/mwifiex/wmm.c
> +++ b/drivers/net/wireless/mwifiex/wmm.c
> @@ -933,8 +933,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
> spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
> flags);
> is_list_empty =
> - list_empty(&adapter->bss_prio_tbl[j]
> - .bss_prio_head);
> + list_empty(&tid_ptr->ra_list);
> spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
> flags);
> if (is_list_empty)
> --
> 1.7.10.4


2013-04-04 21:01:36

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 1/4] mwifiex: bug: wrong list in list_empty check.

adapter->bss_prio_tbl list has already been checked in outer loop. The
inner loop works with priv_tmp->wmm.tid_tbl_ptr list. Also the lock taken,
gives hint that this is likely a copy-paste error.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/wmm.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 32adc87..b132c42 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -933,8 +933,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
spin_lock_irqsave(&tid_ptr->tid_tbl_lock,
flags);
is_list_empty =
- list_empty(&adapter->bss_prio_tbl[j]
- .bss_prio_head);
+ list_empty(&tid_ptr->ra_list);
spin_unlock_irqrestore(&tid_ptr->tid_tbl_lock,
flags);
if (is_list_empty)
--
1.7.10.4


2013-04-03 18:38:36

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Andi,

> > With this patch (1/6) applied, I'm getting soft-lockup watchdog:
> >
> > BUG: soft lockup - CPU#3 stuck for 22s! [kworker/3:1:37]
>
> My bad here, should be like this when patch is applied first:
>
> @@ -919,8 +919,12 @@ mwifiex_wmm_get_highest_priolist_ptr(struct
> mwifiex_adapter *adapter,
>
> do {
> priv_tmp = bssprio_node->priv;
> - hqp = &priv_tmp->wmm.highest_queued_prio;
>
> + if (atomic_read(&priv_tmp->wmm.tx_pkts_queued) == 0)
> + goto skip_bss;
> +
> + /* iterate over the WMM queues of the BSS */
> + hqp = &priv_tmp->wmm.highest_queued_prio;
> for (i = atomic_read(hqp); i >= LOW_PRIO_TID; --i) {
>
> tid_ptr = &(priv_tmp)->wmm.
> @@ -980,12 +984,7 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
> } while (ptr != head);
> }
>
> - /* No packet at any TID for this priv. Mark as
> such
> - * to skip checking TIDs for this priv (until
> pkt is
> - * added).
> - */
> - atomic_set(hqp, NO_PKT_PRIO_TID);
> -
> +skip_bss:
> /* Get next bss priority node */
> bssprio_node = list_first_entry(&bssprio_node->list,
> struct mwifiex_bss_prio_node,

Thanks for looking into it. This change fixes the soft-lockup.

>
> That said, yes I developed the pathset the other way round. First
> cleaned up, until I knew how to fix the bug best. Then pulled the fix
> in front of the cleanup patches and -- mea culpa -- didn't test the
> patches individually. Sorry again.
>
> Also found issue here, which could be a problem without patch 6/6:
>
> --- a/drivers/net/wireless/mwifiex/wmm.c
> +++ b/drivers/net/wireless/mwifiex/wmm.c
> @@ -688,13 +688,13 @@ mwifiex_wmm_add_buf_txqueue(struct mwifiex_private *priv,
> ra_list->total_pkts_size += skb->len;
> ra_list->pkt_count++;
>
> - atomic_inc(&priv->wmm.tx_pkts_queued);
> -
> if (atomic_read(&priv->wmm.highest_queued_prio) <
> tos_to_tid_inv[tid_down])
> atomic_set(&priv->wmm.highest_queued_prio,
> tos_to_tid_inv[tid_down]);
>
> + atomic_inc(&priv->wmm.tx_pkts_queued);
> +
>
>
> How should I proceed? Can I reorder patches to match my development
> cycle, which is? 2-5;1;6 or more verbosely cleanup first followed
> by bug fix and proper locking last

You can separate 6 patches into two patch sets.
1-3 fix the bug and clean up.
4-6 implement ra list rotation and proper locking.

We will run stress tests against 4-6.

Thanks,
Bing

>
> Or should keep the order as is, but fix patch 1, and propagate changes
> through patch 2 till 6?


2013-04-02 00:09:40

by Andreas Fenkart

[permalink] [raw]
Subject: [PATCH 6/6] mwifiex: hold proper locks when accessing ra_list / bss_prio lists.

Using forward iteration only, is no protection from calls to list_del.
Also due to list rotation, lists are modified frequently, and having
removed the head could lead to an infinite loop when calling
list_for_each. Currently this is not a problem, since list rotation
and dequeue always happen in the same thread.
But this is no proper lockfree technique, and hence proper locks
should be held when accessing those lists.
Patch also remove list_empty calls, relying on the embedded check in
list_for_each_entry.

Signed-off-by: Andreas Fenkart <[email protected]>
---
drivers/net/wireless/mwifiex/wmm.c | 36 +++++++++++++++++-------------------
1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/net/wireless/mwifiex/wmm.c b/drivers/net/wireless/mwifiex/wmm.c
index 4d8a5ca..0a22656 100644
--- a/drivers/net/wireless/mwifiex/wmm.c
+++ b/drivers/net/wireless/mwifiex/wmm.c
@@ -883,20 +883,13 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
struct mwifiex_ra_list_tbl *ptr;
struct mwifiex_tid_tbl *tid_ptr;
atomic_t *hqp;
- int is_list_empty;
- unsigned long flags;
+ unsigned long flags_bss, flags_ra;
int i, j;

/* check the BSS with highest priority first */
for (j = adapter->priv_num - 1; j >= 0; --j) {
spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
- flags);
- is_list_empty = list_empty(&adapter->bss_prio_tbl[j]
- .bss_prio_head);
- spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
- flags);
- if (is_list_empty)
- continue;
+ flags_bss);

/* iterate over BSS with the equal priority */
list_for_each_entry(adapter->bss_prio_tbl[j].bss_prio_cur,
@@ -916,33 +909,38 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
tid_tbl_ptr[tos_to_tid[i]];

spin_lock_irqsave(&priv_tmp->wmm.
- ra_list_spinlock, flags);
- is_list_empty =
- list_empty(&tid_ptr->ra_list);
- spin_unlock_irqrestore(&priv_tmp->wmm.
- ra_list_spinlock, flags);
- if (is_list_empty)
- continue;
+ ra_list_spinlock, flags_ra);

/* iterate over receiver addresses */
list_for_each_entry(ptr, &tid_ptr->ra_list,
list) {

- if (!skb_queue_empty(&ptr->skb_head))
+ if (!skb_queue_empty(&ptr->skb_head)) {
+ /* holds both locks */
goto found;
+ }

}
+
+ spin_unlock_irqrestore(&priv_tmp->wmm.
+ ra_list_spinlock,
+ flags_ra);
}

}
+
+ spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
+ flags_bss);
}
return NULL;

found:
- spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
+ /* holds bss_prio_lock / ra_list_spinlock */
if (atomic_read(hqp) > i)
atomic_set(hqp, i);
- spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags);
+ spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags_ra);
+ spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
+ flags_bss);

*priv = priv_tmp;
*tid = tos_to_tid[i];
--
1.7.10.4


2013-04-11 18:45:31

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 2/2] mwifiex: rework round robin scheduling of bss nodes.


> Rotate bss prio list, so the bss next to the one served, will come first
> in the list of bss' with equal priority. This way we pick bss nodes in a
> round robin fashion. Using list rotation instead of a cur ptr simplifies
> iteration to calling list_for_each_entry. List rotation is done via
> list_move, where the head itself is temporarily removed and then
> re-inserted after the bss just served.
>
> Signed-off-by: Andreas Fenkart <[email protected]>

Acked-by: Bing Zhao <[email protected]>

Thanks,
Bing

> ---
> drivers/net/wireless/mwifiex/11n_aggr.c | 6 ---
> drivers/net/wireless/mwifiex/init.c | 21 +++-----
> drivers/net/wireless/mwifiex/wmm.c | 79 +++++++++----------------------
> 3 files changed, 29 insertions(+), 77 deletions(-)


2013-04-02 00:05:18

by Andreas Fenkart

[permalink] [raw]
Subject: Re: mwifiex: infinite loop in mwifiex_main_process

Hi Bing,

On Tue, Mar 19, 2013 at 03:37:52PM -0700, Bing Zhao wrote:
[snip]
> >
> > [18017.214686] data sent 0
> > [18017.227548] wmm list empty 0
> > [18017.230592] tx_lock_flag 0
> >
> > So it seems the wmm list has packets queued, but they are never
> > sent out. Adding a few more statements, it seems the problem is
> > in mwifiex_wmm_get_highest_priolist_ptr:
> >
> > for (j = adapter->priv_num - 1; j >= 0; --j) {
> >
> > spin_lock_irqsave(&adapter->bss_prio_tbl[j].bss_prio_lock,
> > flags);
> > is_list_empty = list_empty(&adapter->bss_prio_tbl[j]
> > .bss_prio_head);
> > spin_unlock_irqrestore(&adapter->bss_prio_tbl[j].bss_prio_lock,
> > flags);
> > if (is_list_empty)
> > continue;
> >
> > .... <snip> ...
> >
> > do {
> > priv_tmp = bssprio_node->priv;
> > hqp = &priv_tmp->wmm.highest_queued_prio;
> >
> > for (i = atomic_read(hqp); i >= LOW_PRIO_TID;
> > --i) {
> > ...
> > ... NEVER REACHED ...
> > ...
> >
> >
> > So there are packets queued, but the highest_queued_prio is too
> > low, so they are never sent out.
>
> Could you apply the debug patch attached to print out hqp number?

I tried the following patch with lesser impact on performance.

@@ -928,6 +947,10 @@ mwifiex_wmm_get_highest_priolist_ptr(struct mwifiex_adapter *adapter,
}
}

+ spin_lock_irqsave(&priv_tmp->wmm.ra_list_spinlock, flags);
+ BUG_ON(atomic_read(&priv_tmp->wmm.tx_pkts_queued));
+ spin_unlock_irqrestore(&priv_tmp->wmm.ra_list_spinlock, flags);
+
/* No packet at any TID for this priv. Mark as such
* to skip checking TIDs for this priv (until pkt is
* added).
atomic_set(hqp, NO_PKT_PRIO_TID);


Which crashed. Hence searching for queued packets and adding new ones is
not synchronized, new packets can be added while searching the WMM
queues. If a packet is added right before setting max prio to NO_PKT,
that packet is trapped and creates an infinite loop.

Because of the new packet tx_pkts_queued is at least 1, indicating wmm
lists are not empty. Opposing that max prio is NO_PKT, which means "skip
this wmm queue, it has no packets".
The infinite loop results, because the main loop checks the wmm lists
for not empty (tx_pkts_queued != 0), but then finds no packet since it
skips the wmm queue where it is located on. This will never end, unless
a new packet is added which will restore max prio.

One possible solution is is to rely on tx_pkts_queued solely for
checking wmm queue to be empty, and drop the NO_PKT define.

> >
> > Is there a known issue, with highest_queued_prio getting out of
> > sync with the number of packets queued?
>
> I'm not aware of any known issue related to highest_queued_prio.

seems to be intruduced with this patch:
17e8cec 05-16-2011 mwifiex: CPU mips optimization with NO_PKT_PRIO_TID

I was wondering why hasn't happened more frequently. Evtl. if the
interface is working in bridge mode, new packets might be added to the
WMM queue with the trapped packet. 2c

I prepared a few patches, fixing above bug as suggested and plus some
cleanup patches I did while trying to get an understanding. Pls review

rgds,
Andi


drivers/net/wireless/mwifiex/11n_aggr.c | 14 +----------
drivers/net/wireless/mwifiex/init.c | 22 +++++------------
drivers/net/wireless/mwifiex/main.h | 4 ---
drivers/net/wireless/mwifiex/wmm.c | 200 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------
drivers/net/wireless/mwifiex/wmm.h | 3 +++
5 files changed, 83 insertions(+), 160 deletions(-)




2013-04-23 18:48:43

by John W. Linville

[permalink] [raw]
Subject: Re: [PATCH v3 0/2] wmm queues handling simplificatons

On Tue, Apr 23, 2013 at 11:33:13AM -0700, Bing Zhao wrote:
> Hi John,
>
> >
> > v3: minor cleanups for coding style; checkpatch.pl --strict
>
> Could you please apply the v3 series from Andreas Fenkart?
> I've ACKed both patches individually.

Hmmm....this series seems to have slipped through my fingers.
Could you or Andreas re-send it (preferably with Bing's ACKs included)?

John
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2013-04-08 18:20:45

by Bing Zhao

[permalink] [raw]
Subject: RE: [PATCH 1/6] mwifiex: bug: remove NO_PKT_PRIO_TID.

Hi Andi,

> If I run: checkpatch.pl -f drivers/net/wireless/mwifiex/wmm.c. I
> get about 10-20 similar warnings. In general it seems mwifiex
> sticks with type 1 multiline comments. And I wanted to stick with
> the de-facto standard.
>
> Should I still fix the warning?

Yes. Please fix the warning in these 2 patches.
Also there are some other warnings:

CHECK: Blank lines aren't necessary before a close brace '}'
#154: FILE: drivers/net/wireless/mwifiex/wmm.c:935:

You have to use "--strict" option to spot these messages.

> Should I prepare a separate patch on top fixing all multiline
> comments?

I think that will be too much in the entire driver code.
Let's keep them for now.

[...]

> > Could you share what test cases you are running? I will try to cover the cases you have not done.
>
> Duration test with high load, sender and receiver.
>
> I have dualband router and two DUT. One DUT connects to
> 2.4GHz-band the other to 5GHz-band. One is running iperf as
> server the other iperf as a client in a while true loop.
>
> Eventually I will switch over to set one DUT as uAP, but didn't
> play with uAP yet.

Thanks for the information.

Regards,
Bing