From: Vincent Guittot
Date: Fri, 25 Jun 2021 10:50:12 +0200
Subject: Re: [PATCH] sched/fair: Rate limit calls to update_blocked_averages() for NOHZ
To: Tim Chen
Cc: Qais Yousef, Joel Fernandes, linux-kernel, Paul McKenney,
    Frederic Weisbecker, Dietmar Eggeman, Ben Segall,
    Daniel Bristot de Oliveira, Ingo Molnar, Juri Lelli, Mel Gorman,
    Peter Zijlstra, Steven Rostedt, "Uladzislau Rezki (Sony)",
    Neeraj upadhyay, Aubrey Li
In-Reply-To: <366aa93b-ecbf-ac0f-cd9e-3376b20d4929@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 18 Jun 2021 at 18:14, Tim Chen wrote:
>
>
>
> On 6/18/21 3:28 AM, Vincent Guittot wrote:
>
> >>
> >> The current logic is when a CPU becomes idle, next_balance occur very
> >> shortly (usually in the next jiffie) as get_sd_balance_interval returns
> >> the next_balance in the next jiffie if the CPU is idle.
> >> However, in reality, I saw most CPUs are 95% busy on average for my
> >> workload and a task will wake up on an idle CPU shortly. So having
> >> frequent idle balancing towards shortly idle CPUs is counter productive
> >> and simply increase overhead and does not improve performance.
> >
> > Just to make sure that I understand your problem correctly: Your problem is:
> > - that we have an ilb happening on the idle CPU and consume cycle
>
> That's right. The cycles are consumed heavily in update_blocked_averages()
> when cgroup is enabled.

But they are normally consumed on an idle CPU, and the ILB checks
need_resched() before running load balance for the next idle CPU.

Does it mean that your problem is coming from update_blocked_averages()
spending a long time with rq_lock_irqsave held and increasing the wakeup
latency of your short running task?

>
> > - or that the ilb will pull a task on an idle CPU on which a task will
> > shortly wakeup which ends to 2 tasks competing for the same CPU.
> >
>
> Because for the OLTP workload I'm looking at, we have tasks that sleep
> for a short while and wake again very shortly (i.e. the CPU actually
> is ~95% busy on average), pulling tasks to such a CPU is really not
> helpful to improve overall CPU utilization in the system. So my
> intuition is for such almost fully busy CPU, we should defer load
> balancing to it (see prototype patch 3).

Note that this is the opposite of what you said earlier:
"
Though in our test environment, sysctl_sched_migration_cost was kept
much lower (25000) compared to the default (500000), to encourage
migrations to idle cpu and reduce latency.
"

But it will be quite hard to find a value that fits everybody's
requirements, and some will have use cases for which they want to pull
tasks even if the CPU is 95% busy. You can have 2ms of idle time while
still having a utilization above 95%, and an ILB inside a core or at LLC
is somewhat cheap and would take advantage of those 2ms.

>
> Tim
>
> >
>
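
To put rough numbers on the 2ms / 95% remark, here is a back-of-the-envelope sketch. It assumes utilization is a plain busy/(busy + idle) ratio over one period, which is a simplification: the scheduler tracks a decaying average instead. The helper name busy_time_ms is made up for illustration, and the two constants are just the sysctl_sched_migration_cost values quoted above, used only for a size comparison.

```python
def busy_time_ms(idle_ms: float, utilization: float) -> float:
    """Busy time per period for a CPU that idles idle_ms per period.

    Assumes utilization = busy / (busy + idle), a simplified duty-cycle
    model rather than the scheduler's own utilization signal.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return idle_ms * utilization / (1.0 - utilization)


idle_ms = 2.0
busy_ms = busy_time_ms(idle_ms, 0.95)
period_ms = busy_ms + idle_ms

# sysctl_sched_migration_cost values quoted in the thread, in nanoseconds.
MIGRATION_COST_DEFAULT_NS = 500_000
MIGRATION_COST_TUNED_NS = 25_000
idle_ns = idle_ms * 1_000_000

print(f"busy per period:   {busy_ms:.1f} ms")
print(f"period:            {period_ms:.1f} ms")
print(f"idle / default:    {idle_ns / MIGRATION_COST_DEFAULT_NS:.0f}x")
print(f"idle / tuned:      {idle_ns / MIGRATION_COST_TUNED_NS:.0f}x")
```

Under that model, a CPU with 2ms idle windows at 95% utilization runs for roughly 38ms out of every 40ms, and each idle window is still several times longer than either migration cost value mentioned in the thread.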