References: <20210317045949.1584952-1-joshdon@google.com>
 <20210317083141.GB3881262@gmail.com>
In-Reply-To: <20210317083141.GB3881262@gmail.com>
From: Josh Don
Date: Wed, 17 Mar 2021 17:17:37 -0700
Subject: Re: [PATCH] sched: Warn on long periods of pending need_resched
To: Ingo Molnar
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
 Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
 Daniel Bristot de Oliveira, Luis Chamberlain, Kees Cook, Iurii Zaikin,
 linux-kernel, linux-fsdevel@vger.kernel.org, David Rientjes,
 Oleg Rombakh, Paul Turner
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 17, 2021 at 1:31 AM Ingo Molnar wrote:
>
>
> * Josh Don wrote:
>
> > +static inline u64 resched_latency_check(struct rq *rq)
> > +{
> > +	int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
> > +	bool warn_only_once = (latency_warn_ms == RESCHED_DEFAULT_WARN_LATENCY_MS);
> > +	u64 need_resched_latency, now = rq_clock(rq);
> > +	static bool warned_once;
> > +
> > +	if (warn_only_once && warned_once)
> > +		return 0;
> > +
> > +	if (!need_resched() || latency_warn_ms < 2)
> > +		return 0;
> > +
> > +	/* Disable this warning for the first few mins after boot */
> > +	if (now < RESCHED_BOOT_QUIET_SEC * NSEC_PER_SEC)
> > +		return 0;
> > +
> > +	if (!rq->last_seen_need_resched_ns) {
> > +		rq->last_seen_need_resched_ns = now;
> > +		rq->ticks_without_resched = 0;
> > +		return 0;
> > +	}
> > +
> > +	rq->ticks_without_resched++;
>
> So AFAICS this will only really do something useful on full-nohz
> kernels with sufficiently long scheduler ticks, right?

Not quite sure what you mean; it is actually the inverse? Since we rely
on the tick to detect the resched latency, on nohz-full we won't have
detection on cpus running a single thread. The ideal scenario is
!nohz-full and tick interval << warn_ms.

> On other kernels the scheduler tick interrupt, when it returns to
> user-space, will trigger a reschedule if it sees a need_resched.

True for the case where we return to userspace, but we could instead be
executing in a non-preemptible region of the kernel. This is where we've
seen/fixed kernel bugs.

Best,
Josh
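
To make the "tick interval << warn_ms" point concrete, here is a rough
user-space sketch of tick-driven detection. It is not the kernel code:
struct sim_rq, sim_latency_check() and sim_detect_ms() are hypothetical
names, only the two rq fields come from the quoted hunk, and both the
threshold comparison and the reset on a cleared flag are assumptions,
since the quote is cut off before that part of the function.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/* Hypothetical stand-in for struct rq; models only the two fields
 * that appear in the quoted hunk. */
struct sim_rq {
	uint64_t last_seen_need_resched_ns;
	uint64_t ticks_without_resched;
};

/*
 * Called once per simulated tick. Returns the pending latency in ns
 * once it exceeds warn_ms, otherwise 0.
 * Assumption: the comparison against warn_ms and the reset when the
 * flag is clear are guesses at logic the quoted hunk does not show.
 */
static uint64_t sim_latency_check(struct sim_rq *rq, uint64_t now_ns,
				  bool need_resched, uint64_t warn_ms)
{
	uint64_t latency;

	if (!need_resched) {
		rq->last_seen_need_resched_ns = 0;
		rq->ticks_without_resched = 0;
		return 0;
	}

	/* First tick that observes the flag only records a timestamp. */
	if (!rq->last_seen_need_resched_ns) {
		rq->last_seen_need_resched_ns = now_ns;
		rq->ticks_without_resched = 0;
		return 0;
	}

	rq->ticks_without_resched++;
	latency = now_ns - rq->last_seen_need_resched_ns;
	return latency > warn_ms * NSEC_PER_MSEC ? latency : 0;
}

/*
 * need_resched is set at set_ms and never cleared (as if stuck in a
 * non-preemptible region). Ticks fire every tick_ms. Returns how many
 * ms after set_ms the first warning fires, or 0 if none fires by
 * horizon_ms.
 */
static uint64_t sim_detect_ms(uint64_t tick_ms, uint64_t warn_ms,
			      uint64_t set_ms, uint64_t horizon_ms)
{
	struct sim_rq rq = { 0, 0 };
	uint64_t t;

	if (tick_ms == 0)
		return 0;

	for (t = tick_ms; t <= horizon_ms; t += tick_ms) {
		bool pending = t >= set_ms;

		if (sim_latency_check(&rq, t * NSEC_PER_MSEC, pending, warn_ms))
			return t - set_ms;
	}
	return 0;
}
```

With warn_ms = 100 and the flag set at 5 ms, a 1 ms tick reports after
101 ms and a 4 ms tick after 107 ms, while a 60 ms tick does not report
until 175 ms. If ticks stop arriving altogether, nothing is reported,
which is the nohz-full case described above.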