Date: Thu, 18 Oct 2018 11:48:50 +0200
From: Peter Zijlstra
To: Juri Lelli
Cc: Thomas Gleixner, Juri Lelli, syzbot, Borislav Petkov,
    "H. Peter Anvin", LKML, mingo@redhat.com, nstange@suse.de,
    syzkaller-bugs@googlegroups.com, Luca Abeni, henrik@austad.us,
    Tommaso Cucinotta, Claudio Scordino, Daniel Bristot de Oliveira
Subject: Re: INFO: rcu detected stall in do_idle
Message-ID: <20181018094850.GW3121@hirez.programming.kicks-ass.net>
References: <000000000000a4ee200578172fde@google.com>
    <20181016140322.GB3121@hirez.programming.kicks-ass.net>
    <20181016144045.GF9130@localhost.localdomain>
    <20181016153608.GH9130@localhost.localdomain>
    <20181018082838.GA21611@localhost.localdomain>
In-Reply-To: <20181018082838.GA21611@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 18, 2018 at 10:28:38AM +0200, Juri Lelli wrote:

> Another side problem seems also to be that with such tiny parameters we
> spend lot of time in the while (dl_se->runtime <= 0) loop of
> replenish_dl_entity() (actually uselessly, as deadline is most probably
> going to still be in the past when eventually runtime becomes positive
> again), as delta_exec is huge w.r.t. runtime and runtime has to keep up
> with tiny increments of dl_runtime. I guess we could ameliorate things
> here by limiting the number of time we execute the loop before bailing
> out.

That's the "DL replenish lagged too much" case, right? Yeah, there is only
so much we can recover from.

Funny that GCC actually emits that loop; sometimes we've had to fight
GCC not to turn that into a division.
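
For illustration only, here is a userspace sketch of the loop with the bail-out Juri suggests. It is not the kernel's code: struct dl_entity_sketch, replenish_bounded() and MAX_LAG_PERIODS are made-up names, the cap of 1000 is an arbitrary assumption, and the reset assumes the relative deadline equals the period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's deadline entity; only the
 * fields the loop touches are modelled here. */
struct dl_entity_sketch {
	int64_t  runtime;     /* remaining runtime in ns, may be far below 0 */
	uint64_t deadline;    /* absolute deadline in ns */
	uint64_t dl_runtime;  /* runtime granted per period in ns */
	uint64_t dl_period;   /* period length in ns */
};

/* Assumed cap on how many periods we try to catch up; not from the thread. */
#define MAX_LAG_PERIODS 1000u

/*
 * Push the deadline forward one period at a time until runtime is positive
 * again, but give up after MAX_LAG_PERIODS iterations. Returns true if the
 * loop caught up, false if it bailed out and restarted the entity at 'now'.
 */
static bool replenish_bounded(struct dl_entity_sketch *se, uint64_t now)
{
	unsigned int lagged = 0;

	assert(se->dl_runtime > 0 && se->dl_period > 0);

	while (se->runtime <= 0) {
		if (lagged++ >= MAX_LAG_PERIODS) {
			/* Lagged too much: start afresh from the current time. */
			se->deadline = now + se->dl_period;
			se->runtime = (int64_t)se->dl_runtime;
			return false;
		}
		se->deadline += se->dl_period;
		se->runtime += (int64_t)se->dl_runtime;
	}
	return true;
}
```

With runtime at -1s and dl_runtime at 1024ns, the unbounded loop would need close to a million iterations; the sketch stops after 1000 and resets instead.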
But yes, I suppose we can put a limit on how many periods we can lag
before just giving up.

> So, I tend to think that we might want to play safe and put some higher
> minimum value for dl_runtime (it's currently at 1ULL << DL_SCALE).
> Guess the problem is to pick a reasonable value, though. Maybe link it
> someway to HZ? Then we might add a sysctl (or similar) thing with which
> knowledgeable users can do whatever they think their platform/config can
> support?

Yes, a HZ-related limit sounds like something we'd want. But if we're
going to do a minimum sysctl, we should also consider adding a maximum;
if you set a massive period/deadline, you can, even with a relatively
low u, incur significant delays.

And do we want to put the limit on runtime or on period?

That is, something like:

  TICK_NSEC/2 < period < 10*TICK_NSEC

and/or

  TICK_NSEC/2 < runtime < 10*TICK_NSEC

Hmm, for HZ=1000 that ends up with a max period of 10ms, which is far too
low; 24Hz needs ~41ms. We can of course also limit the runtime by capping
u for users (as we should anyway).
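
The arithmetic behind that last objection can be checked with a small userspace sketch. The helper names tick_nsec(), within_tick_bounds() and period_for_rate() are hypothetical; the rounding in tick_nsec() is meant to approximate the kernel's TICK_NSEC and should be read as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Length of one tick in ns for a given HZ, rounded to nearest.
 * Approximates TICK_NSEC; hz is a runtime argument only for the sketch. */
static uint64_t tick_nsec(unsigned int hz)
{
	assert(hz > 0);
	return (NSEC_PER_SEC + hz / 2) / hz;
}

/* Candidate check from the mail: TICK_NSEC/2 < value < 10*TICK_NSEC. */
static bool within_tick_bounds(uint64_t value_ns, unsigned int hz)
{
	uint64_t tick = tick_nsec(hz);

	return value_ns > tick / 2 && value_ns < 10 * tick;
}

/* Period in ns of a task that has to run rate_hz times per second. */
static uint64_t period_for_rate(unsigned int rate_hz)
{
	assert(rate_hz > 0);
	return NSEC_PER_SEC / rate_hz;
}
```

For HZ=1000 the upper bound is 10 * 1000000ns = 10ms, while period_for_rate(24) is 41666666ns, so within_tick_bounds() rejects a 24Hz period. The same period would pass with HZ=100, where the upper bound is 100ms, which shows how strongly a pure tick multiple depends on the config.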