Received: by 2002:a05:6a11:4021:0:0:0:0 with SMTP id ky33csp287988pxb; Wed, 15 Sep 2021 01:58:53 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwUc6b5oAiWAuxEn6vNs86NPs+aMbdpQNI/IS8FfvZcoUPsXdCNHhB0TCzl3ni2geVxVk0D X-Received: by 2002:ac2:528f:: with SMTP id q15mr17178624lfm.507.1631696332961; Wed, 15 Sep 2021 01:58:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1631696332; cv=none; d=google.com; s=arc-20160816; b=MWwadvO5y/qdIRPcXSPYm5n1P/6GiVuwx8kLC3cyXZKhcSutHdLDarsR5hRpOT46HZ IYbqWaBIXEUg3o/mWSqQihdemEQPVOERVucKM06HwpsoIvh9VZxyQ72rdtbVwb/3ZHMi 8I99v2/q3AHJKXeBEqO8y8NPtkFY+1KmgDYJG/jJcrZ0eJ3n+M1eEUH97tiFztVBrX75 tYJYkbI7CQZ2ZsW2xly8Tpb80JO8U1aONfpBybxafy28awPjizK1op88JxVIQ0r9PIg4 4ayLfuIm9vAEDzEB2FE+zG23q/YuoCuIBhRCO1ZRO/DKUGWFW35+Wg4aJKLJHcxp7e9G Uhsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:references :in-reply-to:subject:cc:to:dkim-signature:dkim-signature:from; bh=oO9OrB1G0+YxTYn/JMvLtBpJYNEF69i2CP96bjIn7ik=; b=FeCnGBpABd/+Wokqu/WvhyDLfi0Vuw5mG8qMZzYiUptpkUcJ6ZdHFUOw30nVQqnTvb YSB1kCHhkxvSvXMcbMYtjVGKPX9Qp5TuZmp1oeqLLiof44Qk4vc5iJqNulgUhQWOuiLO vYtAqQyXbBot9ppUxpYdHuj6z5y5EysNnElkI6cgF23DaAAFCTSF9vv2ZfHJoLhovg+l Ajh2ntx06P51DXc8rG4rdYeYKpSo0VzchecMlkLFRC0xDZ4hPV3d4AL052qBZsDkM0sD TAF1vZaVr+1p2o31d52gI/sFpk22Q6mIi1JwLPvU3shYr7Qos1Iooa5j9U0lmBEeYp6r 4X/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=vgC8kAEu; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 2si19658066ljh.185.2021.09.15.01.58.20; Wed, 15 Sep 2021 01:58:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=vgC8kAEu; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-wireless-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-wireless-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236972AbhIOI7N (ORCPT + 99 others); Wed, 15 Sep 2021 04:59:13 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:39420 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232676AbhIOI7M (ORCPT ); Wed, 15 Sep 2021 04:59:12 -0400 From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1631696272; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=oO9OrB1G0+YxTYn/JMvLtBpJYNEF69i2CP96bjIn7ik=; b=vgC8kAEuGghuxRk6oLs9YOA4m+/CPyok0IXqSq/RQwAliqNSFG7N2sfhvA4flz/w+16dXS ze4J6Z3GFT8N2+Poo/b8OnWUIoqAp/7Rj5XlkcSHHqI+mEpxILlT5DUMy4rYiEdxTPzziT YkifNWR599qlh4v77bwqXaUfuGeqjHsAA4d0kOR2HboBNQeFb0PYmWvPyZ4/N55aQ9NF4V X9J0mdaE05krLDUmsixWfqH41ZVe9a0eK+Uo90phs6GArDDPp1+eD8B76DrhrKIEO0xQdy hgHq45m3u9aAIsM7SzSuL8m4PDncS8inBg9BGHgvqqJBFWF8iukLNWgLMHn6Mg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1631696272; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=oO9OrB1G0+YxTYn/JMvLtBpJYNEF69i2CP96bjIn7ik=; b=9kyvvPSSWvn71+QrnJy9ImxxpNW73+vs4dy6JlrZE54kCOPRwX3FE0TRUfD1/VEtvke3SQ HbLcye+tf86ue0BA== To: Dmitry Vyukov Cc: Hillf Danton , syzbot , linux-kernel@vger.kernel.org, paulmck@kernel.org, syzkaller-bugs@googlegroups.com, Peter Zijlstra , kasan-dev , Johannes Berg , Kalle Valo , linux-wireless@vger.kernel.org Subject: Re: [syzbot] INFO: rcu detected stall in syscall_exit_to_user_mode In-Reply-To: References: <000000000000eaacf005ca975d1a@google.com> <20210831074532.2255-1-hdanton@sina.com> <20210914123726.4219-1-hdanton@sina.com> <87v933b3wf.ffs@tglx> Date: Wed, 15 Sep 2021 10:57:52 +0200 Message-ID: <87mtoeb4hb.ffs@tglx> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-wireless@vger.kernel.org On Tue, Sep 14 2021 at 20:00, Dmitry Vyukov wrote: > On Tue, 14 Sept 2021 at 16:58, Thomas Gleixner wrote: >> Now what happens when the mac80211 callback rearms the timer so it >> expires immediately again: >> >> hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer), >> ns_to_ktime(bcn_int * NSEC_PER_USEC)); >> >> bcn is a user space controlled value. Now lets assume that bcn_int is <=1, >> which would certainly cause the loop in hrtimer_run_queues() to keeping >> looping forever. >> >> That should be easy to verify by implementing a simple test which >> reschedules a hrtimer from the callback with a expiry time close to now. >> >> Not today as I'm about to head home to fire up the pizza oven. > > This question definitely shouldn't take priority over the pizza. But I > think I saw this "rearm a timer with a user-controlled value without > any checks" pattern lots of times and hangs are inherently harder to > localize and reproduce. So I wonder if it makes sense to add a debug > config that would catch such cases right when the timer is set up > (issue a WARNING)? Yes and no. It's hard to differentiate between a valid short expiry rearm and something which is caused by unchecked values. I have some ideas but all of them are expensive and therefore probably debug only. Which is actually better than nothing :) > However, for automated testing there is the usual question of > balancing between false positives and false negatives. The check > should not produce false positives, but at the same time it should > catch [almost] all actual stalls so that they don't manifest as > duplicate stall reports. Right. The problem could be even there with checked values: start_timer(1ms) timer_expires() callback() forward_timer(timer, now, period(1ms)); which might be perfectly fine with a production kernel as it leaves enough time to make overall progress. Now with a full debug kernel with all bells and whistels that callback might just run into this situation: start_timer(1ms) T0 timer_expires() T1 callback() do_stuff() forward_timer(timer, TNOW, period(1ms)); T1 - T0 = 1.001ms TNOW - T1 = 0.998 ms So the forward will just rearm it to T0 + 2ms which means it expires in 1us. > If I understand it correctly the timer is not actually set up as > periodic, but rather each callback invocation arms it again. Setting > up a timer for 1 ns _once_ (or few times) is probably fine (right?), > so the check needs to be somewhat more elaborate and detect "infinite" > rearming. Yes. That made me actually look at that mac80211_hwsim callback again. hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer), ns_to_ktime(bcn_int * NSEC_PER_USEC)); So what this does is really wrong because it tries to schedule the timer on the theoretical periodic timeline. Which goes really south once the timer is late or the callback execution took longer than the period. Hypervisors scheduling out a VCPU at the wrong place will do that for you nicely. What this actually should use is hrtimer_forward_now() which prevents that problem because it will forward the timer in the periodic schedule beyond now. That won't prevent the above corner case, but I doubt you can create an endless loop with that scenario as easy as you can with trying to catch up on your theoretical timeline by using the previous expiry time as a base for the forward. Patch below. /me goes off to audit hrtimer_forward() usage. Sigh... After that figure out ways to debug or even prevent this. More sigh... Thanks, tglx --- drivers/net/wireless/mac80211_hwsim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -1867,8 +1867,8 @@ mac80211_hwsim_beacon(struct hrtimer *ti bcn_int -= data->bcn_delta; data->bcn_delta = 0; } - hrtimer_forward(&data->beacon_timer, hrtimer_get_expires(timer), - ns_to_ktime(bcn_int * NSEC_PER_USEC)); + hrtimer_forward_now(&data->beacon_timer, + ns_to_ktime(bcn_int * NSEC_PER_USEC)); return HRTIMER_RESTART; }