From: Xi Wang
Date: Fri, 6 Mar 2020 14:34:20 -0800
Subject: Re: [PATCH] sched: watchdog: Touch kernel watchdog in sched code
To: Peter Zijlstra
Cc: Paul Turner, Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Josh Don, LKML, linux-fsdevel@vger.kernel.org
References: <20200304213941.112303-1-xii@google.com> <20200305075742.GR2596@hirez.programming.kicks-ass.net> <20200306084039.GC12561@hirez.programming.kicks-ass.net>
In-Reply-To: <20200306084039.GC12561@hirez.programming.kicks-ass.net>

On Fri, Mar 6, 2020 at 12:40 AM Peter Zijlstra wrote:
>
> On Thu, Mar 05, 2020 at 02:11:49PM -0800, Paul Turner wrote:
> > The goal is to improve jitter since we're constantly periodically
> > preempting other classes to run the watchdog. Even on a single CPU
> > this is measurable as jitter in the us range. But, what increases the
> > motivation is this disruption has been recently magnified by CPU
> > "gifts" which require evicting the whole core when one of the siblings
> > schedules one of these watchdog threads.
> >
> > The majority outcome being asserted here is that we could actually
> > exercise pick_next_task if required -- there are other potential
> > things this will catch, but they are much more braindead generally
> > speaking (e.g. a bug in pick_next_task itself).
>
> I still utterly hate what the patch does though; there is no way I'll
> have watchdog code hook in the scheduler like this. That's just asking
> for trouble.
>
> Why isn't it sufficient to sample the existing context switch counters
> from the watchdog? And why can't we fix that?

We could go to pick next and repick the same task. There won't be a
context switch but we still want to hold the watchdog. I assume such a
counter also needs to be per cpu and inside the rq lock. There doesn't
seem to be an existing one that fits this purpose.
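
To make the distinction above concrete, here is a minimal userspace sketch of a per-CPU "pick" counter that advances even when the same task is repicked, next to a context switch counter that does not. This is not code from the patch. Every name in it (sketch_rq, nr_picks, sketch_pick_next, sketch_watchdog_progress, SKETCH_NR_CPUS) is hypothetical, and the rq lock is only mentioned in comments rather than modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 4 /* hypothetical CPU count, for illustration only */

/* Hypothetical per-CPU state; in the kernel this would sit in the runqueue. */
struct sketch_rq {
    uint64_t nr_picks;      /* bumped on every pass through the pick path */
    uint64_t nr_switches;   /* bumped only when the running task changes */
    int curr;               /* id of the currently running task */
    uint64_t wd_last_picks; /* watchdog's previous sample of nr_picks */
};

static struct sketch_rq sketch_rqs[SKETCH_NR_CPUS];

/* Stand-in for the pick path; the real one would run with the rq lock held. */
static void sketch_pick_next(int cpu, int next)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    struct sketch_rq *rq = &sketch_rqs[cpu];

    rq->nr_picks++; /* counts even a repick of the same task */
    if (next != rq->curr) {
        rq->nr_switches++;
        rq->curr = next;
    }
}

/* Watchdog sample: true if the pick path ran since the previous sample. */
static bool sketch_watchdog_progress(int cpu)
{
    assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
    struct sketch_rq *rq = &sketch_rqs[cpu];
    bool progressed = rq->nr_picks != rq->wd_last_picks;

    rq->wd_last_picks = rq->nr_picks;
    return progressed;
}
```

In this sketch, repicking the same task leaves nr_switches unchanged while nr_picks still moves, so a watchdog sampling nr_switches alone would report no progress on a CPU whose scheduler is in fact running.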