Subject: Re: ath9k_htc - Division by zero in kernel (as well as firmware panic)
To: Nathan Royce, QCA ath9k Development, Kalle Valo, linux-wireless@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, ath9k_htc_fw
From: Oleksij Rempel
Message-ID: <71818afe-9075-5582-bb6c-650dfa8a5363@rempel-privat.de>
Date: Sat, 3 Jun 2017 09:57:00 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 03.06.2017 at 00:02, Nathan Royce wrote:
> ODroid XU4
>
> $ uname -a
> Linux computer 4.12.0-rc3-dirty #1 SMP Wed May 31 15:02:05 CDT 2017
> armv7l GNU/Linux
>
> $ lsusb
> ...
> Bus 001 Device 002: ID 2109:2813 VIA Labs, Inc.
> Bus 001 Device 010: ID 0cf3:7015 Qualcomm Atheros Communications
> TP-Link TL-WN821N v3 / TL-WN822N v2 802.11n [Atheros AR7010+AR9287]
> ...
>
> *****
> Jun 02 16:20:11 computer hostapd[14954]: vwlan0: interface state
> COUNTRY_UPDATE->HT_SCAN
> Jun 02 16:20:17 computer hostapd[14954]: 20/40 MHz operation not
> permitted on channel pri=7 sec=3 based on overlapping BSSes
> Jun 02 16:20:18 computer kernel: Division by zero in kernel.
> Jun 02 16:20:18 computer kernel: CPU: 1 PID: 14507 Comm: kworker/u16:2
> Tainted: G W 4.12.0-rc3-dirty #1
> Jun 02 16:20:18 computer kernel: Hardware name: SAMSUNG EXYNOS
> (Flattened Device Tree)
> Jun 02 16:20:18 computer kernel: Workqueue: phy5 ieee80211_scan_work [mac80211]
> Jun 02 16:20:18 computer kernel: [] (unwind_backtrace) from
> [] (show_stack+0x10/0x14)
> Jun 02 16:20:18 computer kernel: [] (show_stack) from
> [] (dump_stack+0x88/0x9c)
> Jun 02 16:20:18 computer kernel: [] (dump_stack) from
> [] (Ldiv0_64+0x8/0x18)
> Jun 02 16:20:18 computer kernel: [] (Ldiv0_64) from
> [] (ath9k_get_next_tbtt+0x58/0x5c [ath9k_common])

Hm... this function and its file,
linux/drivers/net/wireless/ath/ath9k/common-beacon.c, have not changed
since 2015, so the cause should be something else. Can you run git bisect
to find the exact patch that caused this regression?

> Jun 02 16:20:18 computer kernel: [] (ath9k_get_next_tbtt
> [ath9k_common]) from [] (ath9k_cmn_beacon_config
> Jun 02 16:20:18 computer kernel: []
> (ath9k_cmn_beacon_config_ap [ath9k_common]) from []
> (ath9k_htc_beacon
> Jun 02 16:20:18 computer kernel: []
> (ath9k_htc_beacon_config_ap [ath9k_htc]) from []
> (ath9k_htc_vif_recon
> Jun 02 16:20:18 computer kernel: [] (ath9k_htc_vif_reconfig
> [ath9k_htc]) from [] (ath9k_htc_sw_scan_compl
> Jun 02 16:20:18 computer kernel: []
> (ath9k_htc_sw_scan_complete [ath9k_htc]) from []
> (__ieee80211_scan_co
> Jun 02 16:20:18 computer kernel: []
> (__ieee80211_scan_completed [mac80211]) from []
> (ieee80211_scan_work+
> Jun 02 16:20:18 computer kernel: [] (ieee80211_scan_work
> [mac80211]) from [] (process_one_work+0x1d8/0x40
> Jun 02 16:20:18 computer kernel: [] (process_one_work) from
> [] (worker_thread+0x4c/0x564)
> Jun 02 16:20:18 computer kernel: [] (worker_thread) from
> [] (kthread+0x14c/0x154)
> Jun 02 16:20:18 computer kernel: [] (kthread) from
> [] (ret_from_fork+0x14/0x3c)
> Jun 02 16:20:18 computer hostapd[14954]: Using interface wlan0 with
> hwaddr and ssid ""
> Jun 02 16:20:18 computer kernel: IPv6: ADDRCONF(NETDEV_CHANGE):
> vwlan0: link becomes ready
> *****
> This is a new one on me.
>
> The "normal" problem (search shows to be a very old issue) I
> consistently (daily or multiple times/day) encounter is:

Yes, this is the "normal" problem. The firmware has no error handler for
PCI bus related exceptions. So if we fail to read the PCI bus the first
time, our choice is to Oops and stall, or to Oops and reboot ASAP. So we
reboot and provide a "firmware panic!" message in the kernel log. Anyone
who is able and willing to fix this is welcome.

> *****
> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath: firmware panic!
> exccause: 0x0000000d; pc: 0x0090ae81; badvaddr: 0x10ff4038.
> Jun 02 14:55:30 computer kernel: usb 1-1.1: USB disconnect, device number 9
> Jun 02 14:55:30 computer systemd-networkd[11959]: vwlan0: Lost carrier
> Jun 02 14:55:30 computer kernel: br0: port 2(vwlan0) entered disabled state
> Jun 02 14:55:30 computer kernel: wlan0: deauthenticating from
> by local choice (Reason: 3=DEAUTH_LEAVING)
> Jun 02 14:55:30 computer kernel: ath: phy4: Failed to wakeup in 500us
> Jun 02 14:55:30 computer kernel: ath: phy4: Failed to wakeup in 500us
> Jun 02 14:55:30 computer kernel: ath: phy4: Failed to wakeup in 500us
> Jun 02 14:55:30 computer kernel: ath: phy4: Failed to wakeup in 500us
> Jun 02 14:55:30 computer systemd-networkd[11959]: wlan0: Lost carrier
> Jun 02 14:55:30 computer systemd[1]: Stopping A simple WPA encrypted
> wireless connection using a static IP...
> -- Subject: Unit netctl@wlan0.service has begun shutting down
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit netctl@wlan0.service has begun shutting down.
> Jun 02 14:55:30 computer kernel: device vwlan0 left promiscuous mode
> Jun 02 14:55:30 computer kernel: br0: port 2(vwlan0) entered disabled state
> Jun 02 14:55:30 computer audit: ANOM_PROMISCUOUS dev=vwlan0 prom=0
> old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
> Jun 02 14:55:30 computer hostapd[13218]: vwlan0: AP-STA-DISCONNECTED
> Jun 02 14:55:30 computer hostapd[13218]: Failed to set beacon parameters
> Jun 02 14:55:30 computer hostapd[13218]: vwlan0: INTERFACE-DISABLED
> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath9k_htc: USB layer deinitialized
> Jun 02 14:55:30 computer systemd[1]: Starting Load/Save RF Kill Switch Status...
> -- Subject: Unit systemd-rfkill.service has begun start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit systemd-rfkill.service has begun starting up.
> Jun 02 14:55:30 computer systemd[1]: Started Load/Save RF Kill Switch Status.
> -- Subject: Unit systemd-rfkill.service has finished start-up
> -- Defined-By: systemd
> -- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> --
> -- Unit systemd-rfkill.service has finished starting up.
> --
> -- The start-up result is done.
> Jun 02 14:55:30 computer network[13261]: Stopping network profile 'wlan0'...
> Jun 02 14:55:30 computer kernel: usb 1-1.1: new high-speed USB device
> number 10 using exynos-ehci
> Jun 02 14:55:30 computer kernel: usb 1-1.1: New USB device found,
> idVendor=0cf3, idProduct=7015
> Jun 02 14:55:30 computer kernel: usb 1-1.1: New USB device strings:
> Mfr=16, Product=32, SerialNumber=48
> Jun 02 14:55:30 computer kernel: usb 1-1.1: Product: USB WLAN
> Jun 02 14:55:30 computer kernel: usb 1-1.1: Manufacturer: ATHEROS
> Jun 02 14:55:30 computer kernel: usb 1-1.1: SerialNumber: 12345
> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath9k_htc: Firmware
> ath9k_htc/htc_7010-1.4.0.fw requested
> Jun 02 14:55:30 computer kernel: usb 1-1.1: ath9k_htc: Transferred FW:
> ath9k_htc/htc_7010-1.4.0.fw, size: 72812
> Jun 02 14:55:30 computer kernel: ath9k_htc 1-1.1:1.0: ath9k_htc: HTC
> initialized with 45 credits
> Jun 02 14:55:31 computer kernel: ath9k_htc 1-1.1:1.0: ath9k_htc: FW Version: 1.4
> Jun 02 14:55:31 computer kernel: ath9k_htc 1-1.1:1.0: FW RMW support: On
> Jun 02 14:55:31 computer kernel: ath: EEPROM regdomain: 0x809c
> Jun 02 14:55:31 computer kernel: ath: EEPROM indicates we should
> expect a country code
> Jun 02 14:55:31 computer kernel: ath: doing EEPROM country->regdmn map search
> Jun 02 14:55:31 computer kernel: ath: country maps to regdmn code: 0x52
> Jun 02 14:55:31 computer kernel: ath: Country alpha2 being used: CN
> Jun 02 14:55:31 computer kernel: ath: Regpair used: 0x52
> Jun 02 14:55:31 computer kernel: ieee80211 phy5: Atheros AR9287 Rev:2
> Jun 02 14:55:31 computer kernel: IPv6: ADDRCONF(NETDEV_UP): vwlan0:
> link is not ready
> Jun 02 14:55:31 computer hostapd[13218]: vwlan0: INTERFACE-ENABLED
> Jun 02 14:55:31 computer network[13261]: Stopped network profile 'wlan0'
> *****
> I don't know the particular reason for this one.
> At first it would happen every time I compiled anything (all cpu
> used). Then I added the ZTE Mobley to the USB hub.
> Even after removing the Mobley, the panic would still happen often.
> I then recompiled the kernel so only the 4 LITTLE cpus were used
> (big.LITTLE support+switcher), but the panic still happens sometimes.
> Now the consistency seems to come from the wireless adapter used as
> both AP and managed client.

That is possible. If the adapter is used in AP mode, a lot of WiFi noise
is dumped over this interface. I assume the reproducibility depends on the
external environment, not on anything internal.

-- 
Regards,
Oleksij
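
For reference on the first trace: it ends in Ldiv0_64, reached from
ath9k_get_next_tbtt, which points to a 64-bit division with a zero divisor.
Below is a minimal user-space sketch of that failure mode. It assumes (the
logs do not confirm this) that the divisor is a beacon interval that was
still zero when the scan completed. The name next_tbtt_us and the
microsecond units are hypothetical; this is not the driver's code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model, NOT the kernel's ath9k_get_next_tbtt().
 *
 * Computes the first multiple of interval_us that lies strictly after
 * tsf_us and stores it in *next_us. Returns false, leaving *next_us
 * untouched, when interval_us is zero; without that check the modulo
 * below would be a division by zero.
 */
static bool next_tbtt_us(uint64_t tsf_us, uint32_t interval_us,
                         uint64_t *next_us)
{
    if (interval_us == 0)
        return false;

    /* Time elapsed since the last interval boundary. */
    uint64_t offset = tsf_us % interval_us;

    *next_us = tsf_us - offset + interval_us;
    return true;
}
```

With tsf_us = 250000 and interval_us = 102400 the function stores 307200;
with interval_us = 0 it returns false instead of dividing. Whether a guard
like this belongs in the driver or in its caller is a separate question
that the bisect result should help answer.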