Date: Thu, 18 Oct 2018 10:28:38 +0200
From: Juri Lelli
To: Thomas Gleixner
Cc: Juri Lelli, Peter Zijlstra, syzbot, Borislav Petkov,
    "H. Peter Anvin", LKML, mingo@redhat.com, nstange@suse.de,
    syzkaller-bugs@googlegroups.com, Luca Abeni, henrik@austad.us,
    Tommaso Cucinotta, Claudio Scordino, Daniel Bristot de Oliveira
Subject: Re: INFO: rcu detected stall in do_idle
Message-ID: <20181018082838.GA21611@localhost.localdomain>
References: <000000000000a4ee200578172fde@google.com>
    <20181016140322.GB3121@hirez.programming.kicks-ass.net>
    <20181016144045.GF9130@localhost.localdomain>
    <20181016153608.GH9130@localhost.localdomain>
In-Reply-To: <20181016153608.GH9130@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On 16/10/18 16:03, Peter Zijlstra wrote:
> On Tue, Oct 16, 2018 at 03:24:06PM +0200, Thomas Gleixner wrote:
> > It does reproduce here but with a kworker stall. Looking at the reproducer:
> >
> >   *(uint32_t*)0x20000000 = 0;
> >   *(uint32_t*)0x20000004 = 6;
> >   *(uint64_t*)0x20000008 = 0;
> >   *(uint32_t*)0x20000010 = 0;
> >   *(uint32_t*)0x20000014 = 0;
> >   *(uint64_t*)0x20000018 = 0x9917;
> >   *(uint64_t*)0x20000020 = 0xffff;
> >   *(uint64_t*)0x20000028 = 0;
> >   syscall(__NR_sched_setattr, 0, 0x20000000, 0);
> >
> > which means:
> >
> >   struct sched_attr {
> >       .size     = 0,
> >       .policy   = 6,
> >       .flags    = 0,
> >       .nice     = 0,
> >       .priority = 0,
> >       .deadline = 0x9917,
> >       .runtime  = 0xffff,
> >       .period   = 0,
> >   }
> >
> > policy 6 is SCHED_DEADLINE
> >
> > That makes the thread hog the CPU and prevents all kind of stuff to run.
> >
> > Peter, is that expected behaviour?
>
> Sorta, just like FIFO-99 while(1);.
> Except we should be rejecting the above configuration, because of the
> rule:
>
>   runtime <= deadline <= period
>
> Juri, where were we supposed to check that?

OK, looks like the "which means" part above had me fooled, as we
actually have ([1], where the comment is wrong)

  struct sched_attr {
      .size     = 0,
      .policy   = 6,
      .flags    = 0,
      .nice     = 0,
      .priority = 0,
      .runtime  = 0x9917,
      .deadline = 0xffff,
      .period   = 0,
  }

So, we seem to be correctly (in theory, see below) accepting the task.

What seems to generate the problem here is that CONFIG_HZ=100 and the
reproducer task has "tiny" runtime (~40us) and deadline (~66us)
parameters, a combination that "bypasses" the enforcing mechanism
(performed at each tick).

Another side problem also seems to be that with such tiny parameters we
spend a lot of time in the while (dl_se->runtime <= 0) loop of
replenish_dl_entity() (actually uselessly, as the deadline is most
probably still going to be in the past when runtime eventually becomes
positive again), as delta_exec is huge w.r.t. runtime and runtime has to
catch up in tiny increments of dl_runtime. I guess we could ameliorate
things here by limiting the number of times we execute the loop before
bailing out.

Enabling HRTICK makes a difference [2]. I played a bit with several
combinations and could verify that parameters in the ~50us range seem
usable. However, it is still worth mentioning that when runtime gets
close to deadline (very high bandwidth) enforcement could be tricked
again, as hrtick overheads might make the task effectively execute for
more than its runtime, going past the replenish instant (old deadline),
so that the replenish timer is not set and the task is left to continue
executing after a replenishment. This is all however very much platform
and config dependent, of course.

So, I tend to think that we might want to play safe and put some higher
minimum value for dl_runtime (it's currently at 1ULL << DL_SCALE).
Guess the problem is to pick a reasonable value, though. Maybe link it
somehow to HZ?
Then we might add a sysctl (or similar) thing with which knowledgeable
users can do whatever they think their platform/config can support?

Thoughts?

I'm adding more people on Cc as I'm not sure they are following this.

Thread starts here [3].

1 - https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/sched/types.h#L70
2 - noticed that we don't actually start hrtick on setup_new_dl_entity()
    and think we should
3 - https://lore.kernel.org/lkml/000000000000a4ee200578172fde@google.com/