2023-07-25 00:10:39

by Alex Elder

[permalink] [raw]
Subject: [PATCH net] net: ipa: only reset hashed tables when supported

Last year, the code that manages GSI channel transactions switched
from using spinlock-protected linked lists to using indexes into the
ring buffer used for a channel. Recently, Google reported seeing
transaction reference count underflows occasionally during shutdown.

Doug Anderson found a way to reproduce the issue reliably, and
bisected the issue to the commit that eliminated the linked lists
and the lock. The root cause was ultimately determined to be
related to unused transactions being committed as part of the modem
shutdown cleanup activity. Unused transactions are not normally
expected (except in error cases).

The modem uses some ranges of IPA-resident memory, and whenever it
shuts down we zero those ranges. In ipa_filter_reset_table() a
transaction is allocated to zero modem filter table entries. If
hashing is not supported, hashed table memory should not be zeroed.
But currently nothing prevents that, and the result is an unused
transaction. Something similar occurs when we zero routing table
entries for the modem.

By preventing any attempt to clear hashed tables when hashing is not
supported, the reference count underflow is avoided in this case.

Note that there likely remains an issue with properly freeing unused
transactions (if they occur due to errors). This patch addresses
only the underflows that Google originally reported.

Fixes: d338ae28d8a8 ("net: ipa: kill all other transaction lists")
Cc: <[email protected]> # 6.1.x
Tested-by: Douglas Anderson <[email protected]>
Signed-off-by: Alex Elder <[email protected]>
---
drivers/net/ipa/ipa_table.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ipa/ipa_table.c b/drivers/net/ipa/ipa_table.c
index 510ff2dc8999a..cd81dd916c29e 100644
--- a/drivers/net/ipa/ipa_table.c
+++ b/drivers/net/ipa/ipa_table.c
@@ -311,16 +311,15 @@ static int ipa_filter_reset(struct ipa *ipa, bool modem)
if (ret)
return ret;

- ret = ipa_filter_reset_table(ipa, IPA_MEM_V4_FILTER_HASHED, modem);
- if (ret)
- return ret;
-
ret = ipa_filter_reset_table(ipa, IPA_MEM_V6_FILTER, modem);
+ if (ret || !ipa_table_hash_support(ipa))
+ return ret;
+
+ ret = ipa_filter_reset_table(ipa, IPA_MEM_V4_FILTER_HASHED, modem);
if (ret)
return ret;
- ret = ipa_filter_reset_table(ipa, IPA_MEM_V6_FILTER_HASHED, modem);

- return ret;
+ return ipa_filter_reset_table(ipa, IPA_MEM_V6_FILTER_HASHED, modem);
}

/* The AP routes and modem routes are each contiguous within the
@@ -329,11 +328,12 @@ static int ipa_filter_reset(struct ipa *ipa, bool modem)
* */
static int ipa_route_reset(struct ipa *ipa, bool modem)
{
+ bool hash_support = ipa_table_hash_support(ipa);
struct gsi_trans *trans;
u16 first;
u16 count;

- trans = ipa_cmd_trans_alloc(ipa, 4);
+ trans = ipa_cmd_trans_alloc(ipa, hash_support ? 4 : 2);
if (!trans) {
dev_err(&ipa->pdev->dev,
"no transaction for %s route reset\n",
@@ -350,12 +350,14 @@ static int ipa_route_reset(struct ipa *ipa, bool modem)
}

ipa_table_reset_add(trans, false, first, count, IPA_MEM_V4_ROUTE);
- ipa_table_reset_add(trans, false, first, count,
- IPA_MEM_V4_ROUTE_HASHED);
-
ipa_table_reset_add(trans, false, first, count, IPA_MEM_V6_ROUTE);
- ipa_table_reset_add(trans, false, first, count,
- IPA_MEM_V6_ROUTE_HASHED);
+
+ if (hash_support) {
+ ipa_table_reset_add(trans, false, first, count,
+ IPA_MEM_V4_ROUTE_HASHED);
+ ipa_table_reset_add(trans, false, first, count,
+ IPA_MEM_V6_ROUTE_HASHED);
+ }

gsi_trans_commit_wait(trans);

--
2.34.1



2023-07-25 07:17:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH net] net: ipa: only reset hashed tables when supported

On Mon, Jul 24, 2023 at 05:41:06PM -0500, Alex Elder wrote:
> Last year, the code that manages GSI channel transactions switched
> from using spinlock-protected linked lists to using indexes into the
> ring buffer used for a channel. Recently, Google reported seeing
> transaction reference count underflows occasionally during shutdown.
>
> Doug Anderson found a way to reproduce the issue reliably, and
> bisected the issue to the commit that eliminated the linked lists
> and the lock. The root cause was ultimately determined to be
> related to unused transactions being committed as part of the modem
> shutdown cleanup activity. Unused transactions are not normally
> expected (except in error cases).
>
> The modem uses some ranges of IPA-resident memory, and whenever it
> shuts down we zero those ranges. In ipa_filter_reset_table() a
> transaction is allocated to zero modem filter table entries. If
> hashing is not supported, hashed table memory should not be zeroed.
> But currently nothing prevents that, and the result is an unused
> transaction. Something similar occurs when we zero routing table
> entries for the modem.
>
> By preventing any attempt to clear hashed tables when hashing is not
> supported, the reference count underflow is avoided in this case.
>
> Note that there likely remains an issue with properly freeing unused
> transactions (if they occur due to errors). This patch addresses
> only the underflows that Google originally reported.
>
> Fixes: d338ae28d8a8 ("net: ipa: kill all other transaction lists")
> Cc: <[email protected]> # 6.1.x
> Tested-by: Douglas Anderson <[email protected]>
> Signed-off-by: Alex Elder <[email protected]>
> ---
> drivers/net/ipa/ipa_table.c | 26 ++++++++++++++------------
> 1 file changed, 14 insertions(+), 12 deletions(-)

You sent 2 different versions of this patch? Which one is for what
tree? Is this in Linus's tree already? If so, what's the git id?

confused,

greg k-h

2023-07-25 13:07:44

by Alex Elder

[permalink] [raw]
Subject: Re: [PATCH net] net: ipa: only reset hashed tables when supported

On 7/25/23 2:08 AM, Greg KH wrote:
> You sent 2 different versions of this patch? Which one is for what
> tree? Is this in Linus's tree already? If so, what's the git id?

It was a mistake. I reached out to the netdev maintainers
yesterday to explain and I'm sorry I didn't do the same for
you/stable.

One of those patches will be brought upstream the normal
netdev way. Back-porting to 6.1 won't work cleanly--and
once it's upstream I'll provide the other one if required.

I'm really sorry to have caused the confusion.

-Alex