Return-path:
Received: from mail-wg0-f49.google.com ([74.125.82.49]:51616 "EHLO mail-wg0-f49.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752216AbaAOSln (ORCPT ); Wed, 15 Jan 2014 13:41:43 -0500
Received: by mail-wg0-f49.google.com with SMTP id a1so2135773wgh.28 for ; Wed, 15 Jan 2014 10:41:41 -0800 (PST)
MIME-Version: 1.0
Reply-To: andrea.merello@gmail.com
In-Reply-To: <52D6C871.7020302@lwfinger.net>
References: <522F584E.6000806@lwfinger.net> <52D6B31F.8080007@lwfinger.net> <52D6C871.7020302@lwfinger.net>
Date: Wed, 15 Jan 2014 19:41:41 +0100
Message-ID: (sfid-20140115_194147_221451_C24B4B4B)
Subject: Re: RTL8187SE staging Linux driver
From: Andrea Merello
To: Larry Finger
Cc: Bernhard Schiffner, John Linville, Greg Kroah-Hartman, linux-wireless@vger.kernel.org
Content-Type: text/plain; charset=ISO-8859-1
Sender: linux-wireless-owner@vger.kernel.org
List-ID:

Thank you.

The driver tries to set this in rtl8180_probe, line 2140

dev->max_signal = 65;

I have no idea yet whether this gets overwritten somewhere else or
something similar..
In that case maybe the value becomes corrupted later on, after mac80211
initialization.

BTW, what about making mac80211 robust against at least a wrong
initialization?
Simulating an initialization to zero, the following patch also triggers
a lot of other WARN_ONs because of broken signal information, but it
should avoid the panic..

BTW, currently I'm not able to reproduce the rtl8187se bug anymore :(

From cdc000007a1226b9daaab2d8354aab55127c1fb4 Mon Sep 17 00:00:00 2001
From: andrea merello
Date: Wed, 15 Jan 2014 19:17:25 +0100
Subject: [PATCH] MAC80211: Issue a WARN and prevent divide by zero when max_signal is not set

If the driver sets IEEE80211_HW_SIGNAL_UNSPEC, then mac80211 tries to
perform a division by max_signal while scanning.
Print a warning and set a dummy value. This should result in wrong signal
information but avoid a crash.
---
 net/mac80211/main.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index d767cfb..449c417 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -753,6 +753,11 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
 	netdev_features_t feature_whitelist;
 	struct cfg80211_chan_def dflt_chandef = {};
 
+	if (WARN((hw->flags & IEEE80211_HW_SIGNAL_UNSPEC) &&
+		 (hw->max_signal < 0),
+		 "max_signal not set while set IEEE80211_HW_SIGNAL_UNSPEC\n"))
+		hw->max_signal = 1;
+
 	if (hw->flags & IEEE80211_HW_QUEUE_CONTROL &&
 	    (local->hw.offchannel_tx_hw_queue == IEEE80211_INVAL_HW_QUEUE ||
 	     local->hw.offchannel_tx_hw_queue >= local->hw.queues))
-- 
1.8.3.2

On Wed, Jan 15, 2014 at 6:42 PM, Larry Finger wrote:
> On 01/15/2014 11:22 AM, Andrea Merello wrote:
>>
>> Hello,
>> Thank you for testing!
>>
>> This is interesting:
>> I ever worked on this patch on an older wireless-testing tree, that
>> gave me no oops after lot of time.
>>
>> Yesterday, before sending you my patch, I ported it to a newer
>> wireless-testing, and I did just a quick compile/load test.
>> But today I got panic me too with the new kernel...
>>
>> I have a serial console over I could capture the the oops..
>> I will look at this issue in next days...
>>
>> [86509.384436] divide error: 0000 [#1] PREEMPT SMP
>> [86509.387743] Modules linked in: rtl8180(O) mac80211 cfbfillrect cfbimgblt cfbcopyarea drm_kms_helper cfg80211 ttm [last unloaded: rtl8180]
>> [86509.399253] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.13.0-rc7-wl+ #16
>> [86509.399253] Hardware name: System manufacturer System Product Name/M3N78 PRO, BIOS ASUS M3N78 PRO ACPI BIOS Revision 1402 12/04/2009
>> [86509.399253] task: ffffffff81c10460 ti: ffffffff81c00000 task.ti: ffffffff81c00000
>> [86509.428032] RIP: 0010:[] [] ieee80211_bss_info_update+0x1c2/0x350 [mac80211]
>> [86509.433405] RSP: 0018:ffff88006fc03cb8 EFLAGS: 00010202
>> [86509.441368] RAX: 00000000000003e8 RBX: ffff88006fc03d08 RCX: 0000000000000077
>> [86509.451441] RDX: 0000000000000000 RSI: ffff880068aa97b0 RDI: 0000000000000000
>> [86509.451441] RBP: ffff88006fc03cf8 R08: ffff88006fc03d08 R09: 000000000000000a
>> [86509.464969] R10: ffff88006fc03d08 R11: 0000000000000000 R12: ffff880068a93300
>> [86509.464969] R13: ffff88006c2db628 R14: ffff880068a93301 R15: ffff880068aa84c0
>> [86509.478134] FS: 00007f413fab8800(0000) GS:ffff88006fc00000(0000) knlGS:0000000000000000
>> [86509.488032] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [86509.489400] CR2: 00007f413eedbfc9 CR3: 0000000001c0b000 CR4: 00000000000007f0
>> [86509.494704] Stack:
>> [86509.494704] 0000000000000000 0000000000000000 0000000020000000 ffff88006c2db600
>> [86509.494704] ffff880068a93300 ffff880068aa84c0 ffff880068a93300 ffff880068aa9ac0
>> [86509.494704] ffff88006fc03e38 ffffffffa00fdd8e ffff880068a93324 0000000000000053
>> [86509.494704] Call Trace:
>> [86509.494704]
>> [86509.494704] [] ieee80211_scan_rx+0x13e/0x1a0 [mac80211]
>> [86509.494704] [] ieee80211_rx+0x700/0x7c0 [mac80211]
>> [86509.494704] [] ieee80211_tasklet_handler+0xb9/0xc0 [mac80211]
>> [86509.494704] [] tasklet_action+0xa7/0xb0
>> [86509.494704] [] __do_softirq+0xcd/0x1d0
>> [86509.494704] [] irq_exit+0x76/0xa0
>> [86509.494704] [] do_IRQ+0x5e/0xd0
>> [86509.494704] [] common_interrupt+0x6a/0x6a
>> [86509.494704]
>> [86509.494704] [] ? amd_e400_idle+0x68/0xe0
>> [86509.494704] [] arch_cpu_idle+0x16/0x20
>> [86509.494704] [] cpu_startup_entry+0x11d/0x170
>> [86509.494704] [] rest_init+0x7f/0x90
>> [86509.494704] [] start_kernel+0x307/0x313
>> [86509.494704] [] ? repair_env_string+0x5c/0x5c
>> [86509.494704] [] x86_64_start_reservations+0x2a/0x2c
>> [86509.494704] [] x86_64_start_kernel+0xc7/0xca
>> [86509.494704] Code: 5e 41 5f 5d c3 0f 1f 40 00 45 31 c9 83 e7 20 0f 84 9f fe ff ff 45 0f be 4d 21 bf 64 00 00 00 44 89 c8 0f af c7 41 0f be 7f 74 99 ff 41 89 c1 e9 7f fe ff ff 0
>> [86509.494704] RIP [] ieee80211_bss_info_update+0x1c2/0x350 [mac80211]
>> [86509.494704] RSP
>> [86509.654701] ---[ end trace 08e0a7abe35b1caf ]---
>> [86509.661368] Kernel panic - not syncing: Fatal exception in interrupt
>
>
> The divide fault occurs because hw.max_signal was not set in line 75 of
> net/mac80211/scan.c. The failing line is
>
> signal = (rx_status->signal * 100) / local->hw.max_signal;
>
> I have not yet looked to see where that info comes from in the driver.
>
> Larry
>
>
>
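
For illustration, here is a minimal user-space sketch of the scaling done by the failing line quoted above. It assumes max_signal is a signed 8-bit field that reads as zero when a driver never sets it. The function name scale_signal_unspec and the -1 return value are hypothetical and not part of mac80211. The sketch rejects any non-positive max_signal, because a test for "< 0" alone would still let a zero reach the division.

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone model of
 *     signal = (rx_status->signal * 100) / local->hw.max_signal;
 * Returns the signal scaled to 0..100, or -1 when max_signal is
 * zero or negative and the division cannot be done safely.
 */
int scale_signal_unspec(int8_t signal, int8_t max_signal)
{
	/* Assumption: an unset max_signal is 0, so check <= 0, not < 0. */
	if (max_signal <= 0)
		return -1;

	return (signal * 100) / max_signal;
}
```

With the value the driver sets in rtl8180_probe (65), a reading of 65 scales to 100, while a max_signal of 0 returns -1 instead of raising a divide error.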