From: Paul Turner
Date: Thu, 5 Mar 2020 14:07:37 -0800
Subject: Re: [PATCH] sched: watchdog: Touch kernel watchdog in sched code
To: Thomas Gleixner
Cc: Peter Zijlstra, Xi Wang, Ingo Molnar, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Josh Don, LKML,
    linux-fsdevel@vger.kernel.org
In-Reply-To: <87blpad6b2.fsf@nanos.tec.linutronix.de>
References: <20200304213941.112303-1-xii@google.com>
    <20200305075742.GR2596@hirez.programming.kicks-ass.net>
    <87blpad6b2.fsf@nanos.tec.linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 5, 2020 at 10:07 AM Thomas Gleixner wrote:
>
> Peter Zijlstra writes:
>
> > On Wed, Mar 04, 2020 at 01:39:41PM -0800, Xi Wang wrote:
> >> The main purpose of kernel watchdog is to test whether scheduler can
> >> still schedule tasks on a cpu. In order to reduce latency from
> >> periodically invoking watchdog reset in thread context, we can simply
> >> touch watchdog from pick_next_task in scheduler. Compared to actually
> >> resetting watchdog from cpu stop / migration threads, we lose coverage
> >> on: a migration thread actually get picked and we actually context
> >> switch to the migration thread. Both steps are heavily protected by
> >> kernel locks and unlikely to silently fail. Thus the change would
> >> provide the same level of protection with less overhead.
> >>
> >> The new way vs the old way to touch the watchdogs is configurable
> >> from:
> >>
> >> /proc/sys/kernel/watchdog_touch_in_thread_interval
> >>
> >> The value means:
> >> 0: Always touch watchdog from pick_next_task
> >> 1: Always touch watchdog from migration thread
> >> N (N>0): Touch watchdog from migration thread once in every N
> >>          invocations, and touch watchdog from pick_next_task for
> >>          other invocations.
> >>
> >
> > This is configurable madness. What are we really trying to do here?
>
> Create yet another knob which will be advertised in random web blogs to
> solve all problems of the world and some more. Like the one which got
> silently turned into a NOOP ~10 years ago :)
>

The knob can obviously be removed, it's vestigial and reflects caution
from when we were implementing / rolling things over to it. We have
default values that we know work at scale. I don't think this actually
needs or wants to be tunable beyond on or off (and even that could be
strictly compile or boot time only).
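
For reference, the interval semantics quoted above (0, 1, N) can be
modelled in a few lines of user-space C. This is only a sketch of the
decision rule as described in the patch text, not code from the patch
itself: the helper names touch_from_thread() and count_thread_touches(),
and the use of a 0-based invocation counter, are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of watchdog_touch_in_thread_interval (names are
 * illustrative, not taken from the patch).
 *
 * interval == 0: never touch from the migration thread, i.e. always
 *                touch from pick_next_task.
 * interval == N: touch from the migration thread once in every N
 *                invocations; N == 1 therefore means "always".
 *
 * 'invocation' is assumed to be a 0-based count of watchdog periods.
 * Returns true when this invocation should use the migration thread.
 */
bool touch_from_thread(unsigned int interval, unsigned long invocation)
{
	if (interval == 0)
		return false;
	return invocation % interval == 0;
}

/* Count how many of the first 'invocations' periods use the thread path. */
unsigned long count_thread_touches(unsigned int interval,
				   unsigned long invocations)
{
	unsigned long i, n = 0;

	for (i = 0; i < invocations; i++)
		if (touch_from_thread(interval, i))
			n++;
	return n;
}

/* Small self-check of the three documented cases. */
void watchdog_policy_selfcheck(void)
{
	assert(count_thread_touches(0, 12) == 0);   /* always pick_next_task */
	assert(count_thread_touches(1, 12) == 12);  /* always migration thread */
	assert(count_thread_touches(4, 12) == 3);   /* once in every 4 */
}
```

Under this model the "on or off" reduction suggested above corresponds
to keeping only the interval == 0 and interval == 1 cases.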