2024-05-09 12:10:53

by James Prestwood

Subject: Ath10k "hardware restarting" all clients at specific location

Hi,

I've got a really odd one here. We have about 50 wifi clients all
experiencing a problem where, seemingly out of the blue, the ath10k
driver times out on a WMI command and restarts the hardware. I've pored
through logs and cannot find any pattern as to when it happens.
Sometimes during a scan, sometimes after a roam, and sometimes when the
client has a stable connection. It appears completely random.

The issue looks to be isolated to a single physical location. We have
many other locations with clients running an identical software stack
(kernel version and ath10k firmware) and the same AP/network deployments,
and we see ZERO instances of this in the last month of logs. It's only one
physical location where clients are frequently seeing this happen. For
this specific version of our software we are running:

Ubuntu 22.04 5.15.0-72-generic
QCA6174 HW 3.2
Ath10k firmware WLAN.RM.4.4.1-00288-QCARMSWPZ-1

I also updated a single client to the latest firmware
(WLAN.RM.4.4.1-00309-) yesterday, and it still saw 3 instances of this
as of this morning.

Due to the isolated nature of this, and since other locations with
identical configurations/software are unaffected, it seems to come down
to some external/environmental difference at this location. Is there any
external event that could cause this (rogue frame, interference, ???)?

May 09 00:31:06 kernel: wlan0: disconnect from AP aa:46:8d:37:8f:48 for
new assoc to aa:46:8d:37:80:b5
May 09 00:31:06 kernel: wlan0: associate with aa:46:8d:37:80:b5 (try 1/3)
May 09 00:31:06 kernel: wlan0: RX ReassocResp from aa:46:8d:37:80:b5
(capab=0x1511 status=0 aid=4)
May 09 00:31:06 kernel: wlan0: associated
May 09 00:31:06 kernel: ath: EEPROM regdomain: 0x82d4
May 09 00:31:06 kernel: ath: EEPROM indicates we should expect a country
code
May 09 00:31:06 kernel: ath: doing EEPROM country->regdmn map search
May 09 00:31:06 kernel: ath: country maps to regdmn code: 0x37
May 09 00:31:06 kernel: ath: Country alpha2 being used: ES
May 09 00:31:06 kernel: ath: Regpair used: 0x37
May 09 00:31:06 kernel: ath: regdomain 0x82d4 dynamically updated by
country element
May 09 00:31:06 kernel: wlan0: Limiting TX power to 30 (30 - 0) dBm as
advertised by aa:46:8d:37:80:b5
May 09 00:31:42 kernel: ath10k_pci 0000:02:00.0: timed out waiting peer
stats info
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: wmi command 90113
timeout, restarting hardware
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: could not request stats
(-11)
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: could not request peer
stats info: -108
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: failed to read
hi_board_data address: -16
May 09 00:31:45 kernel: ieee80211 phy0: Hardware restart was requested
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: could not request stats
(-108)
May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: device successfully
recovered
May 09 00:31:46 kernel: wlan0: deauthenticated from aa:46:8d:37:80:b5
(Reason: 6=CLASS2_FRAME_FROM_NONAUTH_STA)
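
For anyone wanting to hunt for a pattern in similar logs, here is a
minimal sketch of how the restart events could be pulled out and paired
with the last wlan0 event before each one. It assumes one unwrapped
journal line per entry in the "May 09 00:31:45 kernel: ..." form shown
above (the excerpt is wrapped by the mail client). The function name
find_restarts, the SAMPLE list and the year default are made up for
illustration; nothing here is part of ath10k or any existing tool.

```python
import re
from datetime import datetime

# Journal prefix as it appears in the excerpt: "May 09 00:31:45 kernel: <msg>"
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) kernel: (.*)$")
# The ath10k message that announces the hardware restart.
RESTART_RE = re.compile(r"wmi command (\d+) timeout, restarting hardware")


def find_restarts(lines, year=2024):
    """Return one tuple per restart:
    (timestamp, wmi_cmd_id, seconds_since_last_wlan0_msg, last_wlan0_msg).

    The log lines carry no year, so it is supplied by the caller (assumption).
    """
    events = []
    last_wlan = None  # (timestamp, message) of the most recent "wlan0: ..." line
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a kernel journal line
        ts = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
        msg = m.group(2)
        if msg.startswith("wlan0: "):
            last_wlan = (ts, msg)
        r = RESTART_RE.search(msg)
        if r:
            gap = (ts - last_wlan[0]).total_seconds() if last_wlan else None
            prev = last_wlan[1] if last_wlan else None
            events.append((ts, int(r.group(1)), gap, prev))
    return events


# Lines taken from the excerpt above, joined back into single lines.
SAMPLE = [
    "May 09 00:31:06 kernel: wlan0: associated",
    "May 09 00:31:06 kernel: wlan0: Limiting TX power to 30 (30 - 0) dBm as advertised by aa:46:8d:37:80:b5",
    "May 09 00:31:42 kernel: ath10k_pci 0000:02:00.0: timed out waiting peer stats info",
    "May 09 00:31:45 kernel: ath10k_pci 0000:02:00.0: wmi command 90113 timeout, restarting hardware",
]

for ts, cmd, gap, prev in find_restarts(SAMPLE):
    # The hex form of the command ID is easier to search for in driver headers.
    print(f"{ts} cmd={cmd} ({hex(cmd)}) {gap}s after: {prev}")
```

Run over a full journal export, the per-restart gap and preceding event
could then be tallied to check whether restarts cluster after roams,
scans, or neither.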

Thanks,

James