2017-03-23 16:47:13

by Christian Lamparter

Subject: QCA9984 bmi identification failure

Hannu Nyman reported an issue with the QCA9984 in his Netgear R7800
and LEDE's ath10k (this is with 936-ath10k_skip_otp_check.patch removed):

[ 12.999550] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[ 12.999637] ath10k_pci 0000:01:00.0: enabling bus mastering
[ 13.000105] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 13.130838] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[ 13.130888] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 13.183995] firmware ath10k!pre-cal-pci-0000:01:00.0.bin: firmware_loading_store: map pages failed
[ 13.184338] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/cal-pci-0000:01:00.0.bin failed with error -2
[ 13.191960] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 13.673417] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 13.673451] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 13.684393] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
[ 15.728598] ath10k_pci 0000:01:00.0: unable to read from the device
[ 15.728621] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
[ 15.733663] ath10k_pci 0000:01:00.0: failed to get board id from otp: -110
[ 15.741474] ath10k_pci 0000:01:00.0: could not probe fw (-110)

I asked him to test what happens if the driver ignores the -ETIMEDOUT error from
ath10k_core_get_board_id_from_otp(), and the device initialized successfully:

[ 16.163318] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[ 16.163401] ath10k_pci 0000:01:00.0: enabling bus mastering
[ 16.163850] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 16.337294] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[ 16.337351] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 22.837360] firmware ath10k!pre-cal-pci-0000:01:00.0.bin: firmware_loading_store: map pages failed
[ 23.212157] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 23.212211] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 23.226748] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
[ 25.259266] ath10k_pci 0000:01:00.0: unable to read from the device
[ 25.259288] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
[ 25.277326] ath10k_pci 0000:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafem...from ath10k/QCA9984/hw1.0/board-2.bin
[ 25.277588] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 dd636801
[ 26.800717] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1

<https://forum.lede-project.org/t/netgear-r7800-exploration-ipq8065-qca9984/285/277>

What seems strange is that only the bmi_execute call with
BMI_PARAM_GET_EEPROM_BOARD_ID is timing out. So by just
ignoring the -ETIMEDOUT result from:

	ret = ath10k_bmi_execute(ar, address, BMI_PARAM_GET_EEPROM_BOARD_ID,
				 &result);

in ath10k_core_get_board_id_from_otp(), the device will initialize and work.
This raises the question: what is so special about BMI_PARAM_GET_EEPROM_BOARD_ID
at that time for the QCA9984? Does the device need some extra msleep time after
the OTP has been uploaded? Or is BMI_PARAM_GET_EEPROM_BOARD_ID not
implemented, or does it have a different ID, etc.?
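
For illustration only, the experiment can be modelled outside the kernel roughly as follows. This is a Python sketch of the control flow, not the driver code: get_board_id_from_otp, tolerate_timeout and the bmi_execute callable are hypothetical stand-ins, and the stub simply returns the same -ETIMEDOUT the logs show.

```python
import errno


def get_board_id_from_otp(bmi_execute, tolerate_timeout=False):
    """Model of the experiment: optionally swallow -ETIMEDOUT from the OTP query.

    bmi_execute is a hypothetical callable returning (ret, result), where ret
    is 0 on success or a negative errno, as kernel functions usually do.
    """
    ret, result = bmi_execute()
    if ret == -errno.ETIMEDOUT and tolerate_timeout:
        # No board id available; the caller has to fall back to another
        # way of selecting board data.
        return None
    if ret < 0:
        raise OSError(-ret, "could not execute otp for board id check")
    return result


def timing_out_execute():
    # Stub that behaves like the failing device in the log above.
    return -errno.ETIMEDOUT, 0


print(get_board_id_from_otp(timing_out_execute, tolerate_timeout=True))
```

With tolerate_timeout left at False the same stub raises, which corresponds to the "could not probe fw" outcome in the first log.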

Thanks,
Christian


2017-03-24 15:03:18

by Christian Lamparter

Subject: Re: QCA9984 bmi identification failure

On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
> i have a r7800 running. consider to use the board.bin file which is
> stored in flash memory of the r7800.
Well, this is a bit beside the point. But what makes you think that
what is stored in the flash memory of the R7800 is the "board.bin"?
I know that Netgear provided a myriad of different board data files
within their GPL drop:

Here's a link:
<https://github.com/paul-chambers/netgear-r7800/tree/master/git_home/madwifi-11n.git/halphy_tools/host/eepromUtil/release_qca9984/hw1>

So, does the data in your flash match any of those files 1:1 or not?

(Note: From what I know, it's the caldata that's in the flash.
caldata ≈ cal+board. But I'm asking why ath10k's bmi identification
isn't working for those chips right now. And judging from your logs,
you are probably using a workaround similar to the
936-ath10k_skip_otp_check.patch out of necessity as well.)

> there are 2 stored for both cards. you need to patch ath10k to use
> different board.bin files for each card.
Exactly. Why do you (or anyone for that matter) need to patch ath10k?
The driver is supposed to support the QCA9984 out of the box, right?

And I know that the bmi identification is supposed to work, as
somebody posted the following log:
<http://lists.infradead.org/pipermail/lede-dev/2016-December/004987.html>

[ 379.392210] ath10k_pci 0002:01:00.0: boot upload otp to 0x1234 len 9000 for board id
[ 379.399945] ath10k_pci 0002:01:00.0: bmi fast download address 0x1234 buffer 0xe1676038 length 9000
[ 379.408977] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x1234
[ 379.415603] ath10k_pci 0002:01:00.0: bmi lz data buffer 0xe1676038 length 9000
[ 379.451626] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x0
[ 379.457985] ath10k_pci 0002:01:00.0: bmi execute address 0x1234 param 0x10
[ 380.857006] ath10k_pci 0002:01:00.0: bmi execute result 0x400
[ 380.862749] ath10k_pci 0002:01:00.0: boot get otp board id result 0x00000400 board_id 1 chip_id 0
[ 380.871603] ath10k_pci 0002:01:00.0: boot using board name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
[ 380.880468] ath10k_pci 0002:01:00.0: board name
[ 380.884999] ath10k_pci 0002:01:00.0: 00000000: 62 75 73 3d 70 63 69 2c 62 6d 69 2d 63 68 69 70 bus=pci,bmi-chip
[ 380.895159] ath10k_pci 0002:01:00.0: 00000010: 2d 69 64 3d 30 2c 62 6d 69 2d 62 6f 61 72 64 2d -id=0,bmi-board-
[ 380.905317] ath10k_pci 0002:01:00.0: 00000020: 69 64 3d 31 id=1
[ 380.914436] ath10k_pci 0002:01:00.0: boot found match for name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
[ 380.923640] ath10k_pci 0002:01:00.0: boot found board data for 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
[ 380.932845] ath10k_pci 0002:01:00.0: using board api 2
...
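
For readers following the log: the value printed as "bmi execute result" packs both IDs into one word. Below is a minimal sketch of the unpacking, assuming the mask and shift values match the ATH10K_BMI_*_FROM_OTP definitions in ath10k's bmi.h (please check them against your tree). The Python names are made up for this example.

```python
# Assumed to mirror ath10k's bmi.h; verify against the kernel tree in use.
BOARD_ID_FROM_OTP_MASK = 0x7C00
BOARD_ID_FROM_OTP_LSB = 10
CHIP_ID_FROM_OTP_MASK = 0x18000
CHIP_ID_FROM_OTP_LSB = 15


def decode_otp_result(result):
    """Split the OTP execute result into (board_id, chip_id)."""
    board_id = (result & BOARD_ID_FROM_OTP_MASK) >> BOARD_ID_FROM_OTP_LSB
    chip_id = (result & CHIP_ID_FROM_OTP_MASK) >> CHIP_ID_FROM_OTP_LSB
    return board_id, chip_id


board_id, chip_id = decode_otp_result(0x400)
print(f"board_id {board_id} chip_id {chip_id}")
```

Under these assumptions the result 0x400 from the log decodes to board_id 1 and chip_id 0, which agrees with the "boot get otp board id" line.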

The board name for the QCA9984 is supposed to look like
"'bus=pci,bmi-chip-id=0,bmi-board-id=1'"

and not like (from your log):
> bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe
> from ath10k/QCA9984/hw1.0/board-2.bin
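
The two naming schemes can be reproduced with a short sketch. The format strings are taken from the names printed in the logs in this thread; the function names are hypothetical and are not ath10k identifiers.

```python
def bmi_board_name(bus, chip_id, board_id):
    """Name used when the BMI/OTP board id query succeeded."""
    return f"bus={bus},bmi-chip-id={chip_id},bmi-board-id={board_id}"


def pci_board_name(bus, vendor, device, subsys_vendor, subsys_device):
    """Name built from the PCI IDs when no BMI board id is available."""
    return (
        f"bus={bus},vendor={vendor:04x},device={device:04x},"
        f"subsystem-vendor={subsys_vendor:04x},"
        f"subsystem-device={subsys_device:04x}"
    )


print(bmi_board_name("pci", 0, 1))
print(pci_board_name("pci", 0x168C, 0x0046, 0x168C, 0xCAFE))
```

The first call yields the name from the working log, the second the name from the R7800 logs, which has no match in board-2.bin there.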

> the failed to fetch board data error is normal.
I don't think it is. I think it's a regression.

Thanks,
Christian

2017-03-28 16:19:22

by Christian Lamparter

Subject: Re: QCA9984 bmi identification failure

On Monday, March 27, 2017 1:33:54 PM CEST Sebastian Gottschall wrote:
> i dont know how to prove you that the firmware format is identical without
> simply showing you the hexdump.
We sort of know what is encoded in these calibration files in the flash and
in the board files. I've told you about the BoardData Files on github. If
you look into the directory, you'll notice that for each of the board.bin
files there, there's a .txt with the same name:

e.g.:
<https://github.com/paul-chambers/netgear-r7800/blob/master/git_home/madwifi-11n.git/halphy_tools/host/eepromUtil/release_qca9984/hw1/boardData_QCA9984_CUS238_5G_v1_003.txt>

If you use the identifiers from the file to compare the data from the flash
with the boardData file, you'll notice that there are differences. And those
differences are what caused all these problems. We ran into this last year
with the IPQ40XX. You can read all about it on the ath10k ML. This was one
of the threads:
<https://www.mail-archive.com/[email protected]/msg06154.html>

> > which is what I said in the response as well. we both knew that
> > (from the beginning). If you want you can go on about it:
> > Please do. However, you should provide some data to back up your
> > claims and statements (logs, links to code or patches are fine I think).
> > Furthermore, let's keep the discussion civil and not go off on a
> > tangent and start a pissing contest. And finally, let's not forget
> > that the discussion is about the "QCA9984 bmi identification failure".
> ahm. sorry. we stop here. [...]
> i do not claim anything and i dont have you proof anything. [...]
> its up to you if you believe me or not.

Ok. Then let's stop here.

> > If so, then there's the Aerohive HiveAP 121.
> > <https://github.com/riptidewave93/LEDE-HiveAP-121>. It has an AR934x SoC
> > and the internal WMAC is storing its calibration data in the SoC's OTP area.
> > The device is supported by ath9k. The device does have a wifi-cal/art
> > partition but it was empty.
> need to take a look into a flash memory dump so see if i find the
> calibration data. the partition you see is created by lede as
> preconfigured layout.
Well, if you are still interested and want to take a look at it,
I can ask Chris Blake if he's willing to send the complete flash dump
of the Aerohive HiveAP 121. It's one of those enterprise APs; it has
a NAND and a NOR chip, so from what I remember the image is like 20MiB+
zipped.

Thanks,
Christian

2017-03-25 07:25:08

by Sebastian Gottschall

Subject: Re: QCA9984 bmi identification failure

On 24.03.2017 at 16:01, Christian Lamparter wrote:
> On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
>> i have a r7800 running. consider to use the board.bin file which is
>> stored in flash memory of the r7800.
> Well, this is a bit beside the point. But what makes you think that
> what is stored in the flash memory of R7800 is the "board.bin"?
i dont know how to answer this question without getting rude. i'm
developing dd-wrt, which is an alternative firmware like lede/openwrt for
many hundreds of different routers. this has been my job for more than 10 years.
so my primary job is reverse engineering all the vendor stuff they wont
tell you in an easy way, but on the other hand vendors do also share such
information with me if i kindly ask them.
the board.bin has a specific file format which can be easily detected.
the flash memory contains 2 board.bin files, one for each of the two cards.
the mtd partition where all this is stored is named art, which is
another indicator. you may not know it, but the name of the
qca/atheros calibration software is "art".

> I know that Netgear provided a myriad of different board data files
> with in there GPL drop:
you can ignore them. use the files stored in flash memory. this board
data is the calibration data, which is different for each device you buy.
its precalibrated and stored in flash memory.
a normal wifi card has its own eeprom which stores this data. but on
embedded devices the router's own flash memory is used for storing this data.
this case is mainly ignored by drivers like ath10k, so patches are
required right now until ath10k officially supports these conditions.
> Here's a link:
> <https://github.com/paul-chambers/netgear-r7800/tree/master/git_home/madwifi-11n.git/halphy_tools/host/eepromUtil/release_qca9984/hw1>
i know the gpl tree
> So, does the data in your flash matches any of those files 1:1 or not?
nope. these files are just default files shipped with the driver by qca
> (Note: From what I know, it's the caldata that's in the flash.
> caldata ≈ cal+board. But I'm asking why ath10k's bmi identification
> isn't working for those chips right now. And judging from your logs,
> you are using probably a similar WA to the
> 936-ath10k_skip_otp_check.patch out of necessity as well.)
the board.bin is the caldata and configuration set for the card.
the skip otp check patch is required in any case since the cards have no
on-board eeprom.
if bmi identification fails, the api 1 board-2.bin file is loaded as
an alternative and this is exactly the data which is stored in flash
memory.
>> there are 2 stored for both cards. you need to patch ath10k to use
>> different board.bin files for each card.
> Exactly. Why do you (or anyone for that matter) need to patch ath10k?
> The driver is supposed to support the QCA9984 out of the box, right?
it is supposed to, but thats not the case. ath10k mainly supports cards
with their own on-board eeprom.
embedded routers can be very specific and nonstandard, so customizations
are sometimes required.
you may not like it, but i know this device very well and also others.
consider how the skip otp patch was created.
its a workaround which is only required for routers.
> And I know, that the bmi identification is supposed to work, as
> somebody posted the following log:
> <http://lists.infradead.org/pipermail/lede-dev/2016-December/004987.html>
>
> [ 379.392210] ath10k_pci 0002:01:00.0: boot upload otp to 0x1234 len 9000 for board id
> [ 379.399945] ath10k_pci 0002:01:00.0: bmi fast download address 0x1234 buffer 0xe1676038 length 9000
> [ 379.408977] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x1234
> [ 379.415603] ath10k_pci 0002:01:00.0: bmi lz data buffer 0xe1676038 length 9000
> [ 379.451626] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x0
> [ 379.457985] ath10k_pci 0002:01:00.0: bmi execute address 0x1234 param 0x10
> [ 380.857006] ath10k_pci 0002:01:00.0: bmi execute result 0x400
> [ 380.862749] ath10k_pci 0002:01:00.0: boot get otp board id result 0x00000400 board_id 1 chip_id 0
> [ 380.871603] ath10k_pci 0002:01:00.0: boot using board name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> [ 380.880468] ath10k_pci 0002:01:00.0: board name
> [ 380.884999] ath10k_pci 0002:01:00.0: 00000000: 62 75 73 3d 70 63 69 2c 62 6d 69 2d 63 68 69 70 bus=pci,bmi-chip
> [ 380.895159] ath10k_pci 0002:01:00.0: 00000010: 2d 69 64 3d 30 2c 62 6d 69 2d 62 6f 61 72 64 2d -id=0,bmi-board-
> [ 380.905317] ath10k_pci 0002:01:00.0: 00000020: 69 64 3d 31 id=1
> [ 380.914436] ath10k_pci 0002:01:00.0: boot found match for name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> [ 380.923640] ath10k_pci 0002:01:00.0: boot found board data for 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> [ 380.932845] ath10k_pci 0002:01:00.0: using board api 2
> ...
>
> The board name for the QCA9984 is supposed to look like
> "'bus=pci,bmi-chip-id=0,bmi-board-id=1'"
>
> and not like (from your log):
sure. you can load wrong board data into your card. this may result in
the following situation: a 5 ghz card is shown as a 2.4 ghz card and vice
versa, or even as dualband although the card does not support this. this may
not be the case on the r7800 but i know another device where this is the
behaviour. the power calibration set is wrong in any case, so output power may
be lower than expected,
and the resulting operation will not work in the best way. lede does
not support this device in the correct way.
wrong board data can also destroy the hardware. this may not happen
fast, but using wrong power calibration data may destroy the phy over
time by overheating.

take a look at your router, especially at mtd3:
mtd3: 00140000 00020000 "art"

dump this partition, take a look inside and take a guess what you
will find. offset 0x1000 is the first board data. offset 0x5000 is the second
board data.
this is common for almost all wireless routers on the market. i dont
know a single router with a qca or atheros chipset on the market which
does not store the board data in flash memory.
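
To make the numbers concrete: a /proc/mtd entry lists the partition size and the erase size in hex, followed by the quoted name. The sketch below (the helper name is made up for this example) parses the line quoted above and checks that both offsets mentioned here lie inside the partition.

```python
def parse_proc_mtd_line(line):
    """Parse one /proc/mtd entry of the form: mtdN: <size> <erasesize> "name"."""
    dev, rest = line.split(":", 1)
    size_hex, erase_hex, name = rest.split(None, 2)
    return {
        "dev": dev.strip(),
        "size": int(size_hex, 16),       # partition size in bytes
        "erasesize": int(erase_hex, 16),  # erase block size in bytes
        "name": name.strip().strip('"'),
    }


art = parse_proc_mtd_line('mtd3: 00140000 00020000 "art"')
for offset in (0x1000, 0x5000):
    # Both board data offsets named above must fit into the partition.
    assert offset < art["size"]
print(art)
```

So the "art" partition here is 0x140000 bytes (1280 KiB) with 128 KiB erase blocks, and both offsets are well within it.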


>> bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe
>> from ath10k/QCA9984/hw1.0/board-2.bin
>> the failed to fetch board data error is normal.
> I don't think it is. I think it's a regression.
ahm no. you dont know what you're dealing with.
>
> Thanks,
> Christian
>


--
Kind regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Berliner Ring 101, 64625 Bensheim
Court of registration: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: [email protected]
Tel.: +496251-582650 / Fax: +496251-5826565

2017-03-27 11:55:29

by Sebastian Gottschall

Subject: Re: QCA9984 bmi identification failure


>
>>> On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
>>> i have a r7800 running. consider to use the board.bin file which is
>>> stored in flash memory of the r7800.
>>> there are 2 stored for both cards. you need to patch ath10k to use
>>> different board.bin files for each card.
>>> this is a log from a working r7800 running dd-wrt. the failed to fetch
>>> board data error is normal. it will use api1 board files then which i
>>> provide on fs based on the board data stored in flash memory
> What makes you think that what you do with the data in the flash is
> the "board.bin"? Do you have any documentation to back up your statement?
> I know that you have been reporting about the QCA99x0. there even
> is a patch with your "reported-by" tag:
> "ath10k: fix calibration init sequence of qca99x0".
> <http://lists.infradead.org/pipermail/ath10k/2016-March/007173.html>
this question has already been answered. take it as a fact or find it
out by yourself. i dont know how to prove to you that the firmware format
is identical without simply showing you the hexdump, which you can do
yourself.
use the correct data and insert it into the driver using hotplug. thats it.
the mail you are referring to is not related to the source of the
calibration data.
>
> Clearly, you must have read it. So you know that:
> "[...] whereas calibration file is used by ar9888 and qca99x0 that contains
> both board data and caldata. [...]"
and? the mail is about a fix in a structure but says nothing about the
content of a board.bin file
>
> which is what I said in the response as well. we both knew that
> (from the beginning). If you want you can go on about it:
> Please do. However, you should provide some data to back up your
> claims and statements (logs, links to code or patches are fine I think).
> Furthermore, let's keep the discussion civil and not go off on a
> tangent and start a pissing contest. And finally, let's not forget
> that the discussion is about the "QCA9984 bmi identification failure".
ahm. sorry. we stop here.
you asked a question. i was kind enough to answer it. i do not claim
anything and i dont have to prove anything to you. i have my working firmware
using the correct data on multiple devices.
its up to you if you believe me or not.
>>> the failed to fetch board data error is normal.
> Why do you need to patch the ath10k driver? And why is it that, even though
> you have patched it, ath10k is still producing an error message? And
> why is it normal (safe?) to ignore it?
i patched it. but i also found out that you may use the hotplug event of
the driver to load it per card.
in addition i talked with nbd from lede about the caldata problem on the
r7800 and he is aware of it and told me it will be fixed soon
>
>>> I know that Netgear provided a myriad of different board data files
>>> with in there GPL drop:
>> you can ignore them. use the files stored in flash memory. this board
>> data is the calibration data which is different for each device you buy.
>> its precalibrated and stored in flash memory.
> If this was the case, then why does the ath10k-firmware and codeaurora.org
> provide the board files in the firmware repositories?
its a default set and "again" not related to embedded devices. and second
please ask qca if you want to know why something is done by qca
> <https://github.com/kvalo/ath10k-firmware/tree/master/QCA9984/hw1.0>
> <https://source.codeaurora.org/quic/qsdk/oss/firmware/ath10k-firmware/plain/ath10k/QCA9984/hw1.0/>
>
> Also, why does ath10k insist on requesting the board-2.bin files, even after
> it has loaded the calibration file which is supposed to contain both
> (for the QCA9984 and others)?
its chipset specific. the file is different for each chipset, and so is
the content.
but board-2.bin doesnt matter here. this is api 2. the routers are only
using api 1 files, which are named board.bin and are stored in the
flash memory.
the board-2.bin can contain multiple board data sets for different card
configurations like 2.4 and 5 ghz or both. the board.bin is a single
calibration data file for a chipset.
it also stores the mac address for your card. without using the correct
file your card wont get the correct mac address. lede does patch the
board.bin on some devices to correct this.
> Would it be easier just to request "one file" and not two different files
> with the same content?
its not the same content. the flash contains 2 files which are
different. one is made for the 2.4 ghz 9984 and the second is made for the 5
ghz 9984.
>
> Note: I know that LEDE currently puts the data from the flash
> into the cal-pci-0000:01:00.0.bin and creates a symlink to "board.bin".
> <https://git.lede-project.org/?p=source.git;a=blob;f=package/firmware/ath10k-firmware/Makefile;h=febf7d26794bd8c5b34bd6703e88cf8070e213a1;hb=HEAD#l323>
> I expect to see something similar in DD-WRT. If not, please tell us about your
> method.
similar method, yes, but different implementation. the result is the same.
you just need the correct data extracted from the "art" partition at
offset 0x1000 and offset 0x5000 (make sure to use the correct filesize)
and then recreate the links to point to the correct data.
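
A rough sketch of the extraction step, working on a dump of the partition that has already been read into memory. The function name is hypothetical, and length is deliberately a required argument: the correct file size depends on the chipset and is not stated in this thread. The 16 bytes used below only fit the synthetic test data.

```python
def extract_board_blobs(art_dump, length, offsets=(0x1000, 0x5000)):
    """Cut one blob of `length` bytes out of `art_dump` at each offset."""
    blobs = []
    for offset in offsets:
        end = offset + length
        if end > len(art_dump):
            raise ValueError(f"blob at {offset:#x} runs past the end of the dump")
        blobs.append(art_dump[offset:end])
    return blobs


# Synthetic stand-in for a real "art" dump, with marker bytes at both offsets.
fake_dump = bytearray(0x6000)
fake_dump[0x1000:0x1010] = b"A" * 16
fake_dump[0x5000:0x5010] = b"B" * 16

first, second = extract_board_blobs(bytes(fake_dump), length=16)
print(first, second)
```

On a real device each blob would then be written to its own file, and the links the driver looks for would be pointed at those files.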
>
>> a normal wifi card has a own eeprom on it which stores this data. but on
>> embedded devices the routers own flash memory is used for storing this data.
>> this case is mainly ignored by drivers like ath10k, so patches are
>> required right now until ath10k does officially support these conditions
> The QCA9984 support was added by this patch back in August 2016:
> "ath10k: enable support for QCA9984"
> https://patchwork.kernel.org/patch/8890981/
this is chipset support. i was talking about firmware loading methods
>
> At this point, I think ath10k should support the QCA9984 in the R7800
> "out of the box" without any need for special or custom patches.
> LEDE's compat-wireless is dated 2017-01-31.
its possible by using the firmware hotplug event and the way you
described with symbolic links
>> the board.bin is the caldata and configuration set for the card.
>> the skip otp check patch is required in any way since the cards has no
>> on board eeprom
>> if bmi identification fails, the api 1 board-2.bin file is loaded as
>> alternate variant and this is exactly the data which is stored in flash
>> memory.
> Two points:
>
> - skip_otp is mentioned in a patch from 2014 (long before
> the QCA9984) <https://patchwork.kernel.org/patch/5252351/>
> The commit log says: "It is useful for initial calibration."
> This does sound like this feature is used for "product development"
> if it is used for something else (QCA6174, QCA9377?),
> the description:
> | MODULE_PARM_DESC(skip_otp, "Skip otp failure for calibration in testmode");
> should be upgraded.
this patch was made a long time ago by me and i provided the info
to lede and they made their own variant.
some cards like yours do not have an otp, so the result of the otp
calibration needs to be ignored.
the reason is that your card has no eeprom of its own, which minipci
express cards with such chipsets normally have.
it contains the calibration data, mac address, chipset configuration, etc.
on the r7800, like on most other routers, its stored in the router's own
flash memory. i told you this already.
so the otp calibration will fail. if the result is ignored, the
calibration data will be loaded from file and all is fine.
>
> Note: skip_otp doesn't skip over the BMI identification.
> Instead it just instructs the code to ignore the result from it.
> So setting skip_otp will not help here, since bmi_execute is
> returning -ETIMEDOUT.
> <http://lxr.free-electrons.com/source/drivers/net/wireless/ath/ath10k/core.c#L742>
>
> - board api 1 was not intended to be used for QCA99X0+.
> In fact, the patch:"ath10k: add board 2 API support".
> <https://patchwork.kernel.org/patch/7334851/> which added it
> lists the 99X0 (predecessor of the 9984?) as a target (although
> the ath10k-firmware repository still has the old file).
> The board api 1 was just left as a fallback.
it must be used. its the only available format on this router. the
original qca driver which is used by netgear (they do not use ath10k)
does not know anything about bmi / api 2

>
>> sure. you can load wrong board data into your card this may result in
>> the following situation a 5 ghz card is shown as 2.4 ghz card and vice
>> versa or even dualband also if the card is not supporting this. this may
>> not be the case on the r7800 but i know another device where this is the
>> bevaviour. power calibration set is wrong in any way so output power may
>> be lower as expected.
>> and the resulting operation state will not work in best way. lede does
>> not support this device in correct way.
>> wrong board data can also destroy the hardware. this may not happen
>> fast, but using wrong power calibration data may destroy the phy over
>> time by overheating
>>
>> take a look on your router. especially on mtd3
>> mtd3: 00140000 00020000 "art"
>>
>> dump this partition and take a look inside and take a guess what you
>> will find. offset 0x1000 is first board data. offset 0x5000 is second
>> board data
>> this is common for almost all wireless routers on the market. i dont
>> know a single router with a qca or atheros chipset on the market which
>> does not have stored the board data in flash memory
> Well, does "almost all wireless routers on the market" include older ath9k too?
yes also ath9k uses calibration data from flash memory. its stored in
platform data and provided by the kernel implementation for the device
> If so, then there's the Aerohive HiveAP 121.
> <https://github.com/riptidewave93/LEDE-HiveAP-121>. It has an AR934x SoC
> and the internal WMAC is storing its calibration data in the SoC's OTP area.
> The device is supported by ath9k. The device does have a wifi-cal/art
> partition but it was empty.
i need to take a look into a flash memory dump to see if i find the
calibration data. the partition you see is created by lede as a
preconfigured layout.
it doesnt mean that the offsets are correct. and the kernel doesnt need
the partition to read data from flash memory. look into the lede
sourcecode at the various mach-.....c files to see how its handled.
> There are simply too many devices, to keep track of each one and
> each is a little bit different.
most store the data in the last flash sector. not much difference for
many vendors
>
>>>> bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe
>>>> from ath10k/QCA9984/hw1.0/board-2.bin
>>>> the failed to fetch board data error is normal.
>>> I don't think it is. I think it's a regression.
>> ahm no. you dont know what you're dealing with.
> I reported that Hannu Nyman has an issue with the QCA9984 in his Netgear R7800.
> The BMI Identification isn't working as expected... I asked on the ML what's
> going on and if they could take a look. Let's see if anyone has a good idea or
> suggestion. I know that people are able to get the correct bmi identification
> too (see 5G_success_log.txt from Mani), but it is not working for everyone.
>
> Thanks,
> Christian



2017-03-24 10:09:12

by Sebastian Gottschall

Subject: Re: QCA9984 bmi identification failure

i have a r7800 running. consider using the board.bin file which is
stored in flash memory of the r7800.
there are 2 stored, one for each card. you need to patch ath10k to use
different board.bin files for each card.
this is a log from a working r7800 running dd-wrt. the failed to fetch
board data error is normal. it will then use api1 board files, which i
provide on the fs based on the board data stored in flash memory.

[ 6.661388] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
[ 6.662017] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 6.806366] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[ 6.806418] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 7.049557] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 7.049592] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 1 tracing 0 dfs 0 testmode 0
[ 7.060569] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
[ 7.071165] ath10k_pci 0000:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe from ath10k/QCA9984/hw1.0/board-2.bin
[ 7.080216] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 dc2f9863
[ 8.061877] ipq806x-gmac-dwmac 37200000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
[ 8.151876] ipq806x-gmac-dwmac 37400000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[ 8.429005] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1
[ 8.500297] ath: EEPROM regdomain: 0x0
[ 8.500319] ath: EEPROM indicates default country code should be used
[ 8.500336] ath: doing EEPROM country->regdmn map search
[ 8.500360] ath: country maps to regdmn code: 0x3a
[ 8.500380] ath: Country alpha2 being used: US
[ 8.500395] ath: Regpair used: 0x3a
[ 8.515285] ath10k_pci 0001:01:00.0: enabling device (0140 -> 0142)
[ 8.515971] ath10k_pci 0001:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 8.643800] ath10k_pci 0001:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0001:01:00.0.bin failed with error -2
[ 8.643837] ath10k_pci 0001:01:00.0: Falling back to user helper
[ 8.655447] ath10k_pci 0001:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 8.659561] ath10k_pci 0001:01:00.0: kconfig debug 1 debugfs 1 tracing 0 dfs 0 testmode 0
[ 8.673498] ath10k_pci 0001:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
[ 8.677862] ath10k_pci 0001:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe from ath10k/QCA9984/hw1.0/board-2.bin
[ 8.691311] ath10k_pci 0001:01:00.0: board_file api 1 bmi_id N/A crc32 3483f7cb
[ 10.039146] ath10k_pci 0001:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1


On 23.03.2017 at 17:47, Christian Lamparter wrote:
> Hannu Nyman reported a issue with the QCA9984 in his Netgear R7800
> and LEDE's ath10k: (This is with 936-ath10k_skip_otp_check.patch removed):
>
> [ 12.999550] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
> [ 12.999637] ath10k_pci 0000:01:00.0: enabling bus mastering
> [ 13.000105] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> [ 13.130838] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
> [ 13.130888] ath10k_pci 0000:01:00.0: Falling back to user helper
> [ 13.183995] firmware ath10k!pre-cal-pci-0000:01:00.0.bin: firmware_loading_store: map pages failed
> [ 13.184338] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/cal-pci-0000:01:00.0.bin failed with error -2
> [ 13.191960] ath10k_pci 0000:01:00.0: Falling back to user helper
> [ 13.673417] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
> [ 13.673451] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
> [ 13.684393] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
> [ 15.728598] ath10k_pci 0000:01:00.0: unable to read from the device
> [ 15.728621] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
> [ 15.733663] ath10k_pci 0000:01:00.0: failed to get board id from otp: -110
> [ 15.741474] ath10k_pci 0000:01:00.0: could not probe fw (-110)
>
> I asked him to test what happens if the driver ignores the -ETIMEDOUT error from
> ath10k_core_get_board_id_from_otp(). With that change, the device initialized successfully:
>
> [ 16.163318] ath10k_pci 0000:01:00.0: enabling device (0140 -> 0142)
> [ 16.163401] ath10k_pci 0000:01:00.0: enabling bus mastering
> [ 16.163850] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> [ 16.337294] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
> [ 16.337351] ath10k_pci 0000:01:00.0: Falling back to user helper
> [ 22.837360] firmware ath10k!pre-cal-pci-0000:01:00.0.bin: firmware_loading_store: map pages failed
> [ 23.212157] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
> [ 23.212211] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
> [ 23.226748] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00074 api 5 features no-p2p,mfp,peer-flow-ctrl,btcoex-param,allows-mesh-bcast crc32 fa32e88e
> [ 25.259266] ath10k_pci 0000:01:00.0: unable to read from the device
> [ 25.259288] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
> [ 25.277326] ath10k_pci 0000:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafem...from ath10k/QCA9984/hw1.0/board-2.bin
> [ 25.277588] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 dd636801
> [ 26.800717] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1
>
> <https://forum.lede-project.org/t/netgear-r7800-exploration-ipq8065-qca9984/285/277>
>
> What seems strange is that only the call bmi_execute with
> BMI_PARAM_GET_EEPROM_BOARD_ID is timing out. So by just
> ignoring the -ETIMEDOUT result from:
>
> ret = ath10k_bmi_execute(ar, address, BMI_PARAM_GET_EEPROM_BOARD_ID,
> &result);
>
> in ath10k_core_get_board_id_from_otp() the device will initialize and work.
> This begs the question, what is so special about the BMI_PARAM_GET_EEPROM_BOARD_ID
> at that time for the QCA9984? Does the device need some extra msleep time after
> the OTP has been uploaded? Or is the BMI_PARAM_GET_EEPROM_BOARD_ID not
> implemented/has a different ID, etc... ?
>
> Thanks,
> Christian
>


--
Kind regards

Sebastian Gottschall / CTO

NewMedia-NET GmbH - DD-WRT
Registered office: Berliner Ring 101, 64625 Bensheim
Register court: Amtsgericht Darmstadt, HRB 25473
Managing directors: Peter Steinhäuser, Christian Scheele
http://www.dd-wrt.com
email: [email protected]
Tel.: +496251-582650 / Fax: +496251-5826565

2017-03-25 18:21:12

by Christian Lamparter

Subject: Re: QCA9984 bmi identification failure

On Saturday, March 25, 2017 8:24:59 AM CET Sebastian Gottschall wrote:
> On 24.03.2017 at 16:01, Christian Lamparter wrote:
> > On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
> >> i have a r7800 running. consider to use the board.bin file which is
> >> stored in flash memory of the r7800.
> > Well, this is a bit beside the point. But what makes you think that
> > what is stored in the flash memory of R7800 is the "board.bin"?
> i don't know how to answer this question without getting rude.
> i'm developing dd-wrt which is an alternative firmware like lede/openwrt for
> many hundreds of different routers. this has been my job for more than 10 years.
> [...]

Well. What you're posting to the ML, is entirely your own decision.
But let's focus again on:

> > On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
> > i have a r7800 running. consider to use the board.bin file which is
> > stored in flash memory of the r7800.
> > there are 2 stored for both cards. you need to patch ath10k to use
> > different board.bin files for each card.
> > this is a log from a working r7800 running dd-wrt. the failed to fetch
> > board data error is normal. it will use api1 board files then which i
> > provide on fs based on the board data stored in flash memory

What makes you think that what you do with the data in the flash is
the "board.bin"? Do you have any documentation to back up your statement?
I know that you have reported issues with the QCA99x0 before. There is
even a patch with your "reported-by" tag:
"ath10k: fix calibration init sequence of qca99x0".
<http://lists.infradead.org/pipermail/ath10k/2016-March/007173.html>

Clearly, you must have read it. So you know that:
"[...] whereas calibration file is used by ar9888 and qca99x0 that contains
both board data and caldata. [...]"

That is what I said in my response as well; we both knew that
from the beginning. If you want to go on about it, please do.
However, you should provide some data to back up your
claims and statements (logs, links to code or patches are fine I think).
Furthermore, let's keep the discussion civil and not go off on a
tangent and start a pissing contest. And finally, let's not forget
that the discussion is about the "QCA9984 bmi identification failure".

That failure also seems to affect your unit, since you described the same
issue in your first mail:
> > On Friday, March 24, 2017 11:09:03 AM CET Sebastian Gottschall wrote:
> > you need to patch ath10k to use different board.bin files for each card.

and

> > the failed to fetch board data error is normal.

Why do you need to patch the ath10k driver? And why is it that, even
though you have patched it, ath10k still produces an error message? And
why is it normal (safe?) to ignore it?

> > I know that Netgear provided a myriad of different board data files
> > within their GPL drop:
> you can ignore them. use the files stored in flash memory. this board
> data is the calibration data which is different for each device you buy.
> its precalibrated and stored in flash memory.
If that were the case, then why do ath10k-firmware and codeaurora.org
provide the board files in their firmware repositories?
<https://github.com/kvalo/ath10k-firmware/tree/master/QCA9984/hw1.0>
<https://source.codeaurora.org/quic/qsdk/oss/firmware/ath10k-firmware/plain/ath10k/QCA9984/hw1.0/>

Also, why does ath10k insist on requesting the board-2.bin files, even after
it has loaded the calibration file which is supposed to contain both
(for the QCA9984 and others)?
Wouldn't it be easier to request just "one file" instead of two different
files with the same content?

Note: I know that LEDE currently puts the data from the flash
into the cal-pci-0000:01:00.0.bin and creates a symlink to "board.bin".
<https://git.lede-project.org/?p=source.git;a=blob;f=package/firmware/ath10k-firmware/Makefile;h=febf7d26794bd8c5b34bd6703e88cf8070e213a1;hb=HEAD#l323>
I expect to see something similar in DD-WRT. If not, please tell us about your
method.
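
As a rough illustration of that approach, the sketch below copies one calibration blob out of a dump of the flash partition into a separate file. The two offsets are the ones quoted elsewhere in this thread for the R7800 "art" partition. The function name, the file names and the blob length are assumptions made for illustration; they are not taken from LEDE's or DD-WRT's scripts, and the length is left to the caller because it is not stated in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Offsets quoted in this thread for the R7800 "art" partition. */
#define ART_CAL_OFFSET_FIRST  0x1000L
#define ART_CAL_OFFSET_SECOND 0x5000L

/*
 * Hypothetical helper: copy `len` bytes starting at `offset` from the
 * partition dump `src_path` into `dst_path`.
 * Returns 0 on success and -1 on any I/O error or short read.
 */
int extract_cal_blob(const char *src_path, long offset, size_t len,
                     const char *dst_path)
{
    FILE *src, *dst;
    unsigned char *buf;
    int ret = -1;

    assert(src_path && dst_path && len > 0);

    buf = malloc(len);
    if (!buf)
        return -1;

    src = fopen(src_path, "rb");
    if (!src)
        goto out_free;

    /* A short read means the dump does not contain the whole blob. */
    if (fseek(src, offset, SEEK_SET) != 0 ||
        fread(buf, 1, len, src) != len)
        goto out_close_src;

    dst = fopen(dst_path, "wb");
    if (!dst)
        goto out_close_src;

    if (fwrite(buf, 1, len, dst) == len)
        ret = 0;
    if (fclose(dst) != 0)
        ret = -1;

out_close_src:
    fclose(src);
out_free:
    free(buf);
    return ret;
}
```

A call such as `extract_cal_blob("art.bin", ART_CAL_OFFSET_FIRST, len, "cal-pci-0000:01:00.0.bin")` would then produce the file the driver asks for in the logs, provided `len` matches the blob size for the chip in question.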

> a normal wifi card has a own eeprom on it which stores this data. but on
> embedded devices the routers own flash memory is used for storing this data.
> this case is mainly ignored by drivers like ath10k, so patches are
> required right now until ath10k does officially support these conditions
The QCA9984 support was added by this patch back in August 2016:
"ath10k: enable support for QCA9984"
https://patchwork.kernel.org/patch/8890981/

At this point, I think ath10k should support the QCA9984 in the R7800
"out of the box" without any need for special or custom patches.
LEDE's compat-wireless is dated 2017-01-31.
> > Here's a link:
> > <https://github.com/paul-chambers/netgear-r7800/tree/master/git_home/madwifi-11n.git/halphy_tools/host/eepromUtil/release_qca9984/hw1>
> i know the gpl tree
> > So, does the data in your flash matches any of those files 1:1 or not?
> nope. these files are just default files shipped with the driver by qca
> > (Note: From what I know, it's the caldata that's in the flash.
> > caldata ≈ cal+board. But I'm asking why ath10k's bmi identification
> > isn't working for those chips right now. And judging from your logs,
> > you are probably using a workaround similar to the
> > 936-ath10k_skip_otp_check.patch out of necessity as well.)
> the board.bin is the caldata and configuration set for the card.
> the skip otp check patch is required in any way since the cards has no
> on board eeprom
> if bmi identification fails, the api 1 board-2.bin file is loaded as
> alternate variant and this is exactly the data which is stored in flash
> memory.
Two points:

- skip_otp is mentioned in a patch from 2014 (long before
the QCA9984) <https://patchwork.kernel.org/patch/5252351/>
The commit log says: "It is useful for initial calibration."
This sounds like the feature is meant for "product development".
If it is used for something else (QCA6174, QCA9377?),
the description:
| MODULE_PARM_DESC(skip_otp, "Skip otp failure for calibration in testmode");
should be updated.

Note: skip_otp doesn't skip over the BMI identification.
Instead it just instructs the code to ignore the result from it.
So setting skip_otp will not help here, since bmi_execute is
returning -ETIMEDOUT.
<http://lxr.free-electrons.com/source/drivers/net/wireless/ath/ath10k/core.c#L742>

- board api 1 was not intended to be used for QCA99X0+.
In fact, the patch "ath10k: add board 2 API support"
<https://patchwork.kernel.org/patch/7334851/>, which added board api 2,
lists the 99X0 (predecessor of the 9984?) as a target (although
the ath10k-firmware repository still has the old file).
The board api 1 was just left as a fallback.

> >> there are 2 stored for both cards. you need to patch ath10k to use
> >> different board.bin files for each card.
> > Exactly. Why do you (or anyone for that matter) need to patch ath10k?
> > The driver is supposed to support the QCA9984 out of the box, right?
> it is supposed, but thats not the case. ath10k mainly supports cards
> with own on board eeprom.
> embedded routers can be very specific and nonstandard, so customizations
> are required sometimes.
> you may not like it, but i know this device very well and also others.
> consider how the skip otp patch was created.
> its a workaround which is only required for routers.
> > And I know, that the bmi identification is supposed to work, as
> > somebody posted the following log:
> > <http://lists.infradead.org/pipermail/lede-dev/2016-December/004987.html>
> >
> > [ 379.392210] ath10k_pci 0002:01:00.0: boot upload otp to 0x1234 len 9000 for board id
> > [ 379.399945] ath10k_pci 0002:01:00.0: bmi fast download address 0x1234 buffer 0xe1676038 length 9000
> > [ 379.408977] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x1234
> > [ 379.415603] ath10k_pci 0002:01:00.0: bmi lz data buffer 0xe1676038 length 9000
> > [ 379.451626] ath10k_pci 0002:01:00.0: bmi lz stream start address 0x0
> > [ 379.457985] ath10k_pci 0002:01:00.0: bmi execute address 0x1234 param 0x10
> > [ 380.857006] ath10k_pci 0002:01:00.0: bmi execute result 0x400
> > [ 380.862749] ath10k_pci 0002:01:00.0: boot get otp board id result 0x00000400 board_id 1 chip_id 0
> > [ 380.871603] ath10k_pci 0002:01:00.0: boot using board name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> > [ 380.880468] ath10k_pci 0002:01:00.0: board name
> > [ 380.884999] ath10k_pci 0002:01:00.0: 00000000: 62 75 73 3d 70 63 69 2c 62 6d 69 2d 63 68 69 70 bus=pci,bmi-chip
> > [ 380.895159] ath10k_pci 0002:01:00.0: 00000010: 2d 69 64 3d 30 2c 62 6d 69 2d 62 6f 61 72 64 2d -id=0,bmi-board-
> > [ 380.905317] ath10k_pci 0002:01:00.0: 00000020: 69 64 3d 31 id=1
> > [ 380.914436] ath10k_pci 0002:01:00.0: boot found match for name 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> > [ 380.923640] ath10k_pci 0002:01:00.0: boot found board data for 'bus=pci,bmi-chip-id=0,bmi-board-id=1'
> > [ 380.932845] ath10k_pci 0002:01:00.0: using board api 2
> > ...
> >
> > The board name for the QCA9984 is supposed to look like
> > "'bus=pci,bmi-chip-id=0,bmi-board-id=1'"
> >
> > and not like (from your log):
> sure. you can load wrong board data into your card this may result in
> the following situation a 5 ghz card is shown as 2.4 ghz card and vice
> versa or even dualband also if the card is not supporting this. this may
> not be the case on the r7800 but i know another device where this is the
> behaviour. power calibration set is wrong in any way so output power may
> be lower as expected.
> and the resulting operation state will not work in best way. lede does
> not support this device in correct way.
> wrong board data can also destroy the hardware. this may not happen
> fast, but using wrong power calibration data may destroy the phy over
> time by overheating
>
> take a look on your router. especially on mtd3
> mtd3: 00140000 00020000 "art"
>
> dump this partition and take a look inside and take a guess what you
> will find. offset 0x1000 is first board data. offset 0x5000 is second
> board data
> this is common for almost all wireless routers on the market. i dont
> know a single router with a qca or atheros chipset on the market which
> does not have stored the board data in flash memory
Well, does "almost all wireless routers on the market" include older ath9k devices too?
If so, then there's the Aerohive HiveAP 121.
<https://github.com/riptidewave93/LEDE-HiveAP-121>. It has an AR934x SoC
and the internal WMAC is storing its calibration data in the SoC's OTP area.
The device is supported by ath9k. The device does have a wifi-cal/art
partition but it was empty.

There are simply too many devices to keep track of, and
each one is a little bit different.

> >> bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafe
> >> from ath10k/QCA9984/hw1.0/board-2.bin
> >> the failed to fetch board data error is normal.
> > I don't think it is. I think it's a regression.
> ahm no. you dont know what you're dealing with.
I reported that Hannu Nyman has an issue with the QCA9984 in his Netgear R7800.
The BMI Identification isn't working as expected... I asked on the ML what's
going on and if they could take a look. Let's see if anyone has a good idea or
suggestion. I know that people are able to get the correct bmi identification
too (see 5G_success_log.txt from Mani), but it is not working for everyone.
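
For reference, the successful log quoted earlier ("bmi execute result 0x400", then "board_id 1 chip_id 0") can be reproduced by hand. The following is a minimal sketch, assuming the bit-field layout from ath10k's bmi.h as far as it can be recalled here (board id in bits 10-14, chip id in bits 15-16); the macro names, struct and function names are shortened or invented for illustration. The name format is copied from the log line "boot using board name".

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>   /* strcmp() in the usage note below */

/* Assumed bit-field layout of the BMI execute result (see lead-in). */
#define BOARD_ID_FROM_OTP_MASK 0x7c00u
#define BOARD_ID_FROM_OTP_LSB  10
#define CHIP_ID_FROM_OTP_MASK  0x18000u
#define CHIP_ID_FROM_OTP_LSB   15

struct bmi_ids {
    unsigned int board_id;
    unsigned int chip_id;
};

/* Split the 32-bit BMI execute result into board id and chip id. */
struct bmi_ids decode_otp_result(uint32_t result)
{
    struct bmi_ids ids;

    ids.board_id = (result & BOARD_ID_FROM_OTP_MASK) >> BOARD_ID_FROM_OTP_LSB;
    ids.chip_id = (result & CHIP_ID_FROM_OTP_MASK) >> CHIP_ID_FROM_OTP_LSB;
    return ids;
}

/*
 * Format the board-2.bin lookup key in the style seen in the log.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int format_bmi_board_name(char *buf, size_t len, const char *bus,
                          struct bmi_ids ids)
{
    int n;

    assert(buf && bus);

    n = snprintf(buf, len, "bus=%s,bmi-chip-id=%u,bmi-board-id=%u",
                 bus, ids.chip_id, ids.board_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Decoding 0x00000400 this way gives board_id 1 and chip_id 0, and formatting those with bus "pci" yields a string for which `strcmp(name, "bus=pci,bmi-chip-id=0,bmi-board-id=1")` returns 0. When the identification times out, no such ids exist, which is why the failing logs fall back to the longer "bus=pci,vendor=...,subsystem-device=cafe" name instead.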

Thanks,
Christian


Attachments:
5G_succes_log.txt (37.09 kB)

2017-06-09 17:22:38

by Christian Lamparter

Subject: Re: QCA9984 bmi identification failure (fixed)

On Thursday, March 23, 2017 5:47:08 PM CEST Christian Lamparter wrote:
> Hannu Nyman reported an issue with the QCA9984 in his Netgear R7800
> and LEDE's ath10k: (This is with 936-ath10k_skip_otp_check.patch removed):
>
> [ 25.259266] ath10k_pci 0000:01:00.0: unable to read from the device
> [ 25.259288] ath10k_pci 0000:01:00.0: could not execute otp for board id check: -110
> [ 25.277326] ath10k_pci 0000:01:00.0: failed to fetch board data for bus=pci,vendor=168c,device=0046,subsystem-vendor=168c,subsystem-device=cafem...from ath10k/QCA9984/hw1.0/board-2.bin
> [ 25.277588] ath10k_pci 0000:01:00.0: board_file api 1 bmi_id N/A crc32 dd636801
> [ 26.800717] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal file max-sta 512 raw 0 hwcrypto 1
> [...]
>
> <https://forum.lede-project.org/t/netgear-r7800-exploration-ipq8065-qca9984/285/277>
>
> What seems strange is that only the call bmi_execute with
> BMI_PARAM_GET_EEPROM_BOARD_ID is timing out. [...]
>
> This begs the question, what is so special about the BMI_PARAM_GET_EEPROM_BOARD_ID
> at that time for the QCA9984? Does the device need some extra msleep time after
> the OTP has been uploaded? Or is the BMI_PARAM_GET_EEPROM_BOARD_ID not
> implemented/has a different ID, etc... ?

The issue regarding the BMI_PARAM_GET_EEPROM_BOARD_ID has
been addressed by the following patch:
"[PATCH] ath10k: Add BMI parameters to fix calibration from DT/pre-cal"
<https://patchwork.kernel.org/patch/9748097/>
|QCA99X0, QCA9888, QCA9984 supports calibration data in
|either OTP or DT/pre-cal file. Current ath10k supports
|Calibration data from OTP only.
|
|If caldata is loaded from DT/pre-cal file, fetching board id
|and applying calibration parameters like tx power gets failed.
|
|error log:
|[ 15.733663] ath10k_pci 0000:01:00.0: failed to fetch board file: -2
|[ 15.741474] ath10k_pci 0000:01:00.0: could not probe fw (-2)
|
|This patch adds calibration data support from DT/pre-cal
|file. Below parameters are used to get board id and
|applying calibration parameters from cal data.
|
|              EEPROM[OTP]    FLASH[DT/pre-cal file]
|Cal param     0x700          0x10000
|Board id      0x10           0x8000
|
|Tested on QCA9888 with pre-cal file.
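
The table in the commit message boils down to picking a different pair of BMI parameters depending on where the calibration data comes from. A minimal sketch of that selection follows; the four values are taken from the table, while the macro names, the enum and the function are illustrative and do not match the identifiers used in the actual patch.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the table in the commit message quoted above. */
#define PARAM_CAL_EEPROM      0x700u
#define PARAM_CAL_FLASH       0x10000u
#define PARAM_BOARD_ID_EEPROM 0x10u
#define PARAM_BOARD_ID_FLASH  0x8000u

/* Illustrative names; the driver has its own calibration-mode enum. */
enum cal_source {
    CAL_SOURCE_OTP,      /* calibration data lives in the chip's OTP/EEPROM */
    CAL_SOURCE_PRE_CAL   /* calibration data comes from DT or a pre-cal file */
};

struct bmi_params {
    uint32_t cal_param;       /* parameter for applying calibration data */
    uint32_t board_id_param;  /* parameter for the board id query */
};

/* Pick the BMI execute parameters that match the calibration source. */
struct bmi_params select_bmi_params(enum cal_source src)
{
    struct bmi_params p;

    assert(src == CAL_SOURCE_OTP || src == CAL_SOURCE_PRE_CAL);

    if (src == CAL_SOURCE_PRE_CAL) {
        p.cal_param = PARAM_CAL_FLASH;
        p.board_id_param = PARAM_BOARD_ID_FLASH;
    } else {
        p.cal_param = PARAM_CAL_EEPROM;
        p.board_id_param = PARAM_BOARD_ID_EEPROM;
    }
    return p;
}
```

The EEPROM board id value 0x10 is the same "param 0x10" that shows up in the "bmi execute address 0x1234 param 0x10" line of the debug log posted earlier in this thread, which is the request that timed out on devices whose calibration data is not in OTP.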

Several developers and users have reported success with Pavel's PR:
<https://github.com/lede-project/source/pull/1153>

[ 19.132296] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 19.314182] ath10k_pci 0000:01:00.0: Direct firmware load for ath10k/pre-cal-pci-0000:01:00.0.bin failed with error -2
[ 19.314235] ath10k_pci 0000:01:00.0: Falling back to user helper
[ 32.827197] ath10k_pci 0000:01:00.0: qca9984/qca9994 hw1.0 target 0x01000000 chip_id 0x00000000 sub 168c:cafe
[ 32.827233] ath10k_pci 0000:01:00.0: kconfig debug 0 debugfs 1 tracing 0 dfs 1 testmode 1
[ 32.839487] ath10k_pci 0000:01:00.0: firmware ver 10.4-3.4-00082 api 5 features no-p2p,mfp,peer-flow-ctrl,[...]
[ 35.116999] ath10k_pci 0000:01:00.0: *board_file api 2 bmi_id 0:1* crc32 751efba1
[ 40.981190] ath10k_pci 0000:01:00.0: htt-ver 2.2 wmi-op 6 htt-op 4 cal pre-cal-file max-sta 512 raw 0 hwcrypto 1

All existing users of 936-ath10k_skip_otp_check.patch should be
able to drop the 936-patch entirely and switch to the pre-cal
file calibration method for their devices.

Regards,
Christian