2016-06-14 06:18:20

by Rajkumar Manoharan

[permalink] [raw]
Subject: [PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform

commit b057886524be ("ath10k: do not use coherent memory for allocated
device memory chunks") replaced coherent memory allocation for memory
chunks to fix low memory platforms. Unfortunately this is causing system
freeze on x86 platform while bringing up qca99x0 device. The system
hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
this by limiting maximum memory chunk size to 256 KiB per request.

Cc: Felix Fietkau <[email protected]>
Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
Signed-off-by: Rajkumar Manoharan <[email protected]>
---
drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
drivers/net/wireless/ath/ath10k/wmi.h | 1 +
2 files changed, 7 insertions(+)

diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
index 6279ab4a760e..7c15f65fe5ed 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.c
+++ b/drivers/net/wireless/ath/ath10k/wmi.c
@@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
if (!pool_size)
return -EINVAL;

+ if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
+ num_units = WMI_MAX_MEM_CHUNK_SIZE /
+ round_up(unit_len, 4);
+ pool_size = num_units * round_up(unit_len, 4);
+ }
+
vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
if (!vaddr)
num_units /= 2;
diff --git a/drivers/net/wireless/ath/ath10k/wmi.h b/drivers/net/wireless/ath/ath10k/wmi.h
index 90f594e89f94..dea1f235a54d 100644
--- a/drivers/net/wireless/ath/ath10k/wmi.h
+++ b/drivers/net/wireless/ath/ath10k/wmi.h
@@ -6184,6 +6184,7 @@ struct wmi_roam_ev {
#define ATH10K_DEFAULT_ATIM 0

#define WMI_MAX_MEM_REQS 16
+#define WMI_MAX_MEM_CHUNK_SIZE (256 * 1024) /* 256 KB */

struct wmi_scan_ev_arg {
__le32 event_type; /* %WMI_SCAN_EVENT_ */
--
2.8.3



2016-07-22 22:43:22

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform

On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
> commit b057886524be ("ath10k: do not use coherent memory for allocated
> device memory chunks") replaced coherent memory allocation for memory
> chunks to fix low memory platforms. Unfortunately this is causing system
> freeze on x86 platform while bringing up qca99x0 device. The system
> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
> this by limiting maximum memory chunk size to 256 KiB per request.
>
> Cc: Felix Fietkau <[email protected]>
> Fixes: b057886524be ("ath10k: do not use coherent memory for allocated device memory chunks")
> Signed-off-by: Rajkumar Manoharan <[email protected]>
> ---
> drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
> drivers/net/wireless/ath/ath10k/wmi.h | 1 +
> 2 files changed, 7 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
> index 6279ab4a760e..7c15f65fe5ed 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct ath10k *ar, u32 req_id,
> if (!pool_size)
> return -EINVAL;
>
> + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
> + num_units = WMI_MAX_MEM_CHUNK_SIZE /
> + round_up(unit_len, 4);
> + pool_size = num_units * round_up(unit_len, 4);
> + }

I started testing my 9980 x86-64 system with VT/d enabled today.
With this patch in my tree, it crashes on bootup (with my firmware).
Works fine without this patch.

I don't see the exact place it is crashing in the firmware, though I could
probably narrow it down with some effort. It is in the early startup code,
at least, which makes debugging more difficult.

This patch works fine with a slightly newer firmware compiled for 9984. That
same firmware compiled for 9980 crashes, but I am not certain it is the same issue
as the older 9980. It appears similar, at least.

Looks to me like there are lots of variances in how firmware and chip revisions
deal with this particular code, so we are going to have to test on lots of chips
and platforms to know if a 'fix' is really a fix or not.

Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2016-07-23 09:45:25

by Rajkumar Manoharan

[permalink] [raw]
Subject: Re: [PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform

> from my point of view this patch is just shit. it trunkates the maximum
> allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound
> memory access is guaranteed at all.
>
Even with current logic, If the memory chunk allocation fails for bigger size, then it tries
to allocate smaller chunks. So anyway it is guaranteed for oob access. no?

while (!vaddr && num_units) {
pool_size = num_units * round_up(unit_len, 4);
if (!pool_size)
return -EINVAL;

vaddr = kzalloc(pool_size, GFP_KERNEL | __GFP_NOWARN);
if (!vaddr)
num_units /= 2;
}

Actually the commit "ath10k: do not use coherent memory for allocated device memory chunks"
is causing system hang on non VT/d x86 platform. Better to revert the commit until it is properly
root caused

-Rajkumar

Am 23.07.2016 um 00:43 schrieb Ben Greear:
> On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
>> commit b057886524be ("ath10k: do not use coherent memory for allocated
>> device memory chunks") replaced coherent memory allocation for memory
>> chunks to fix low memory platforms. Unfortunately this is causing system
>> freeze on x86 platform while bringing up qca99x0 device. The system
>> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
>> this by limiting maximum memory chunk size to 256 KiB per request.
>>
>> Cc: Felix Fietkau <[email protected]>
>> Fixes: b057886524be ("ath10k: do not use coherent memory for
>> allocated device memory chunks")
>> Signed-off-by: Rajkumar Manoharan <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>> drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>> 2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index 6279ab4a760e..7c15f65fe5ed 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct
>> ath10k *ar, u32 req_id,
>> if (!pool_size)
>> return -EINVAL;
>>
>> + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
>> + num_units = WMI_MAX_MEM_CHUNK_SIZE /
>> + round_up(unit_len, 4);
>> + pool_size = num_units * round_up(unit_len, 4);
>> + }
>
> I started testing my 9980 x86-64 system with VT/d enabled today.
> With this patch in my tree, it crashes on bootup (with my firmware).
> Works fine without this patch.
>
> I don't see the exact place it is crashing in the firmware, though I
> could
> probably narrow it down with some effort. It is in the early startup
> code,
> at least, which makes debugging more difficult.
>
> This patch works fine with a slightly newer firmware compiled for
> 9984. That
> same firmware compiled for 9980 crashes, but I am not certain it is
> the same issue
> as the older 9980. It appears similar, at least.
>
> Looks to me like there are lots of variances in how firmware and chip
> revisions
> deal with this particular code, so we are going to have to test on
> lots of chips
> and platforms to know if a 'fix' is really a fix or not.
>
> Thanks,
> Ben
>
>


--
Mit freundlichen Gr?ssen / Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Firmensitz: Berliner Ring 101, 64625 Bensheim
Registergericht: Amtsgericht Darmstadt, HRB 25473
Gesch?ftsf?hrer: Peter Steinh?user, Christian Scheele
http://www.dd-wrt.com
email: [email protected]
Tel.: +496251-582650 / Fax: +496251-5826565


2016-07-23 21:55:39

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform



On 07/23/2016 02:11 AM, Sebastian Gottschall wrote:
> from my point of view this patch is just shit. it trunkates the maximum allocated memory to a certain value.
> so firmware requests 800 kb memory but just gets 256kb. so out of bound memory access is guaranteed at all.

At least some of the firmware logic attempts to deal with this segmentation. 9984 firmware had bugs in one area that I fixed,
but probably there are other bugs and that is why it fails for 9980 firmware.

Even if I fix my firmware, that doesn't help everyone else with stock firmware.

At best, maybe enable this type of logic for only exact firmware with proper feature
flag showing that it is known to work with it.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2016-07-23 09:12:15

by Sebastian Gottschall

[permalink] [raw]
Subject: Re: [PATCH] ath10k: fix system hang at qca99x0 probe on x86 platform

from my point of view this patch is just shit. it trunkates the maximum
allocated memory to a certain value.
so firmware requests 800 kb memory but just gets 256kb. so out of bound
memory access is guaranteed at all.


Am 23.07.2016 um 00:43 schrieb Ben Greear:
> On 06/13/2016 11:17 PM, Rajkumar Manoharan wrote:
>> commit b057886524be ("ath10k: do not use coherent memory for allocated
>> device memory chunks") replaced coherent memory allocation for memory
>> chunks to fix low memory platforms. Unfortunately this is causing system
>> freeze on x86 platform while bringing up qca99x0 device. The system
>> hangs while DMA mapping bigger memory chunks (689816/865444 bytes). Fix
>> this by limiting maximum memory chunk size to 256 KiB per request.
>>
>> Cc: Felix Fietkau <[email protected]>
>> Fixes: b057886524be ("ath10k: do not use coherent memory for
>> allocated device memory chunks")
>> Signed-off-by: Rajkumar Manoharan <[email protected]>
>> ---
>> drivers/net/wireless/ath/ath10k/wmi.c | 6 ++++++
>> drivers/net/wireless/ath/ath10k/wmi.h | 1 +
>> 2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c
>> b/drivers/net/wireless/ath/ath10k/wmi.c
>> index 6279ab4a760e..7c15f65fe5ed 100644
>> --- a/drivers/net/wireless/ath/ath10k/wmi.c
>> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
>> @@ -4411,6 +4411,12 @@ static int ath10k_wmi_alloc_chunk(struct
>> ath10k *ar, u32 req_id,
>> if (!pool_size)
>> return -EINVAL;
>>
>> + if (pool_size > WMI_MAX_MEM_CHUNK_SIZE) {
>> + num_units = WMI_MAX_MEM_CHUNK_SIZE /
>> + round_up(unit_len, 4);
>> + pool_size = num_units * round_up(unit_len, 4);
>> + }
>
> I started testing my 9980 x86-64 system with VT/d enabled today.
> With this patch in my tree, it crashes on bootup (with my firmware).
> Works fine without this patch.
>
> I don't see the exact place it is crashing in the firmware, though I
> could
> probably narrow it down with some effort. It is in the early startup
> code,
> at least, which makes debugging more difficult.
>
> This patch works fine with a slightly newer firmware compiled for
> 9984. That
> same firmware compiled for 9980 crashes, but I am not certain it is
> the same issue
> as the older 9980. It appears similar, at least.
>
> Looks to me like there are lots of variances in how firmware and chip
> revisions
> deal with this particular code, so we are going to have to test on
> lots of chips
> and platforms to know if a 'fix' is really a fix or not.
>
> Thanks,
> Ben
>
>


--
Mit freundlichen Gr?ssen / Regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Firmensitz: Berliner Ring 101, 64625 Bensheim
Registergericht: Amtsgericht Darmstadt, HRB 25473
Gesch?ftsf?hrer: Peter Steinh?user, Christian Scheele
http://www.dd-wrt.com
email: [email protected]
Tel.: +496251-582650 / Fax: +496251-5826565