From: Dmitry Vyukov
Date: Sat, 27 Oct 2018 12:16:52 +0100
Subject: Re: INFO: rcu detected stall in do_idle
To: Juri Lelli
Cc: Peter Zijlstra, luca abeni, Thomas Gleixner, Juri Lelli, syzbot,
    Borislav Petkov, "H. Peter Anvin", LKML, Ingo Molnar, nstange@suse.de,
    syzkaller-bugs@googlegroups.com, henrik@austad.us, Tommaso Cucinotta,
    Claudio Scordino, Daniel Bristot de Oliveira
In-Reply-To: <20181024120335.GE29272@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Oct 24, 2018 at 1:03 PM, Juri Lelli wrote:
>
> On 19/10/18 22:50, luca abeni wrote:
> > On Fri, 19 Oct 2018 13:39:42 +0200
> > Peter Zijlstra wrote:
> >
> > > On Thu, Oct 18, 2018 at 01:08:11PM +0200, luca abeni wrote:
> > > > Ok, I see the issue now: the problem is that the "while
> > > > (dl_se->runtime <= 0)" loop is executed at replenishment time, but
> > > > the deadline should be postponed at enforcement time.
> > > >
> > > > I mean: in update_curr_dl() we do:
> > > >     dl_se->runtime -= scaled_delta_exec;
> > > >     if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
> > > >         ...
> > > >         enqueue replenishment timer at dl_next_period(dl_se)
> > > > But dl_next_period() is based on a "wrong" deadline!
> > > >
> > > >
> > > > I think that inserting a
> > > >     while (dl_se->runtime <= -pi_se->dl_runtime) {
> > > >         dl_se->deadline += pi_se->dl_period;
> > > >         dl_se->runtime += pi_se->dl_runtime;
> > > >     }
> > > > immediately after "dl_se->runtime -= scaled_delta_exec;" would fix
> > > > the problem, no?
> > >
> > > That certainly makes sense to me.
> >
> > Good; I'll try to work on this idea in the weekend.
>
> So, we (me and Luca) managed to spend some more time on this and found a
> few more things worth sharing. I'll try to summarize what we have got so
> far (including what already discussed) because impression is that each
> point might deserve a fix or at least consideration (just amazing how a
> simple random fuzzer thing can highlight all that :).

1. Fuzzing finds bugs in any code. Always. If code wasn't fuzzed, there
are bugs in it.
2. This fuzzer is not so simple ;)

> Apologies for the
> long email.
>
> Reproducer runs on a CONFIG_HZ=100, CONFIG_IRQ_TIME_ACCOUNTING kernel
> and does something like this (only the bits that seem to matter here)
>
> int main(void)
> {
>   [...]
>   [setup stuff at 0x2001d000]
>   syscall(__NR_perf_event_open, 0x2001d000, 0, -1, -1, 0);
>   *(uint32_t*)0x20000000 = 0;
>   *(uint32_t*)0x20000004 = 6;
>   *(uint64_t*)0x20000008 = 0;
>   *(uint32_t*)0x20000010 = 0;
>   *(uint32_t*)0x20000014 = 0;
>   *(uint64_t*)0x20000018 = 0x9917; <-- ~40us
>   *(uint64_t*)0x20000020 = 0xffff; <-- ~65us (~60% bandwidth)
>   *(uint64_t*)0x20000028 = 0;
>   syscall(__NR_sched_setattr, 0, 0x20000000, 0);
>   [busy loop]
>   return 0;
> }
>
> And this causes problems because the task is actually never throttled.
>
> Pain points:
>
>  1. Granularity of enforcement (at each tick) is huge compared with
>     the task runtime. This makes starting the replenishment timer,
>     when runtime is depleted, always fail (because the old deadline
>     is way in the past). So, the task is fully replenished and put
>     back to run.
>
>     - Luca's proposal should help here, since the deadline is postponed
>       at throttling time, and replenishment timer set to that (and it
>       should be in the future)
>
>  1.1 Even if we fix 1. in a configuration like this, the task would
>      still be able to run for ~10ms (worst case) and potentially starve
>      other tasks. It doesn't seem a too big interval maybe, but there
>      might be other very short activities that might miss an occasion
>      to run "quickly".
>
>      - Might be fixed by imposing (via sysctl) reasonable defaults for
>        minimum runtime (w.r.t. HZ, like HZ/2) and maximum for period
>        (as also a very small bandwidth task can have a big runtime if
>        period is big as well)
>
>  (1.2) When runtime becomes very negative (because delta_exec was big)
>        we seem to spend a lot of time inside the replenishment loop.
>
>        - Not sure it's such a big problem, might need more profiling.
>          Feeling is that once the other points will be addressed this
>          won't matter anymore
>
>  2. This is related to the perf_event_open syscall the reproducer does
>     before becoming DEADLINE and entering the busy loop. Enabling of perf
>     swevents generates a lot of hrtimers load that happens in the
>     reproducer task context. Now, DEADLINE uses rq_clock() for setting
>     deadlines, but rq_clock_task() for doing runtime enforcement.
>     In a situation like this it seems that the amount of irq pressure
>     becomes pretty big (I'm seeing this on kvm, real hw should maybe do
>     better, pain point remains I guess), so rq_clock() and
>     rq_clock_task() might become more and more skewed w.r.t. each other.
>     Since rq_clock() is only used when setting absolute deadlines for
>     the first time (or when resetting them in certain cases), after a
>     bit the replenishment code will start to see postponed deadlines
>     always in the past w.r.t. rq_clock(). And this brings us back to the
>     fact that the task is never stopped, since it can't keep up with
>     rq_clock().
>
>     - Not sure yet how we want to address this [1]. We could use
>       rq_clock() everywhere, but tasks might be penalized by irq
>       pressure (theoretically this would mandate that irqs are
>       explicitly accounted for I guess). I tried to use the skew between
>       the two clocks to "fix" deadlines, but that puts us at risk of
>       de-synchronizing userspace and kernel views of deadlines.
>
>  3. HRTICK is not started for new entities.
>
>     - Already got a patch for it.
>
> This should be it, I hope. Luca (thanks a lot for your help), please
> add to this or correct me if I was wrong.
>
> Thoughts?
>
> Best,
>
> - Juri
>
> 1 - https://elixir.bootlin.com/linux/latest/source/kernel/sched/deadline.c#L1162
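
To make pain point 1 and the quoted "while (dl_se->runtime <= -pi_se->dl_runtime)" loop concrete, here is a minimal userspace sketch. It is an illustration under stated assumptions, not kernel code: struct dl_model and charge_and_postpone() are hypothetical names, the fields only mirror runtime, deadline, dl_runtime and dl_period, everything is plain signed/unsigned 64-bit nanoseconds, and bandwidth scaling, PI (pi_se) and throttling are left out. It charges one 10ms tick (CONFIG_HZ=100) against the reproducer's parameters (0x9917 ns runtime, 0xffff ns period) and counts how many periods the deadline is pushed forward.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical model of the relevant fields; NOT struct sched_dl_entity. */
struct dl_model {
	int64_t  runtime;    /* remaining budget in ns, may go negative */
	uint64_t deadline;   /* absolute deadline in ns */
	int64_t  dl_runtime; /* budget granted per period, in ns */
	uint64_t dl_period;  /* period length in ns */
};

/*
 * Charge delta_exec ns of execution, then postpone the deadline one
 * period at a time while the overrun is at least one full budget,
 * as in the loop proposed in the quoted text.
 * Returns the number of periods the deadline was moved forward.
 */
static unsigned long charge_and_postpone(struct dl_model *se, int64_t delta_exec)
{
	unsigned long skipped = 0;

	se->runtime -= delta_exec;
	while (se->runtime <= -se->dl_runtime) {
		se->deadline += se->dl_period;
		se->runtime  += se->dl_runtime;
		skipped++;
	}
	return skipped;
}

#ifdef DL_MODEL_DEMO
int main(void)
{
	/* Reproducer parameters; assume a fresh budget and deadline at t=0. */
	struct dl_model se = {
		.runtime = 0x9917, .deadline = 0xffff,
		.dl_runtime = 0x9917, .dl_period = 0xffff,
	};
	const int64_t tick_ns = 10000000; /* one tick at CONFIG_HZ=100 */
	unsigned long skipped = charge_and_postpone(&se, tick_ns);

	/* The postponed deadline now lies after the end of the tick. */
	assert(se.deadline > (uint64_t)tick_ns);
	printf("skipped=%lu runtime=%" PRId64 " deadline=%" PRIu64 "\n",
	       skipped, se.runtime, se.deadline);
	return 0;
}
#endif
```

Built with -DDL_MODEL_DEMO, the model moves the deadline 254 periods ahead, to roughly 16.7ms, which is past the 10ms tick, so a timer armed relative to that deadline would land in the future. The 254 iterations for a single tick also give a feel for the loop cost mentioned in (1.2).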