From: Hao Luo
Date: Tue, 7 Mar 2023 13:06:57 -0800
Subject: Re: [PATCH v3] sched: cpuset: Don't rebuild root domains on suspend-resume
To: Waiman Long
Cc: Qais Yousef, Peter Zijlstra, Ingo Molnar, Juri Lelli, Steven Rostedt, tj@kernel.org, linux-kernel@vger.kernel.org, luca.abeni@santannapisa.it, claudio@evidence.eu.com, tommaso.cucinotta@santannapisa.it, bristot@redhat.com, mathieu.poirier@linaro.org, Dietmar Eggemann, cgroups@vger.kernel.org, Vincent Guittot, Wei Wang, Rick Yiu, Quentin Perret, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, Sudeep Holla, Zefan Li, linux-s390@vger.kernel.org, x86@kernel.org
References: <20230206221428.2125324-1-qyousef@layalina.io>

On Tue, Mar 7, 2023 at 12:09 PM Waiman Long wrote:
>
> On 3/7/23 14:56, Hao Luo wrote:
> > On Mon, Feb 6, 2023 at 2:15 PM Qais Yousef wrote:
> >> Commit f9a25f776d78 ("cpusets: Rebuild root domain deadline accounting information")
> >> enabled rebuilding root domain on cpuset and hotplug operations to
> >> correct deadline accounting.
> >>
> >> Rebuilding root domain is a slow operation and we see 10+ ms delays
> >> on suspend-resume because of that (worst case captures 20ms which
> >> happens often).
> >>
> >> Since nothing is expected to change on suspend-resume operation; skip
> >> rebuilding the root domains to regain some of the time lost.
> >>
> >> Achieve this by refactoring the code to pass whether dl accounting needs
> >> an update to rebuild_sched_domains(). And while at it, rename
> >> rebuild_root_domains() to update_dl_rd_accounting() which I believe is
> >> a more representative name since we are not really rebuilding the root
> >> domains, but rather updating dl accounting at the root domain.
> >>
> >> Some users of rebuild_sched_domains() will skip dl accounting update
> >> now:
> >>
> >>         * Update sched domains when relaxing the domain level in cpuset
> >>           which only impacts searching level in load balance
> >>         * update sched domains when cpufreq governor changes and we need
> >>           to create the perf domains
> >>
> >> Users in arch/x86 and arch/s390 are left with the old behavior.
> >>
> >> Debugged-by: Rick Yiu
> >> Signed-off-by: Qais Yousef (Google)
> >> ---
> > Hi Qais,
> >
> > Thank you for reporting this. We observed the same issue in our
> > production environment. rebuild_root_domains() is also called under
> > cpuset_write_resmask, which handles writing to cpuset.cpus. Under
> > production workloads, on a 4.15 kernel, we observed the median latency
> > of writing cpuset.cpus at 3ms, p99 at 7ms. Now the median is
> > 60ms and p99 is >100ms. Writing cpuset.cpus is a fairly frequent and
> > critical path in production, but blindly traversing every task in the
> > system is not scalable. And its cost is really unnecessary for users
> > who don't use deadline tasks at all.
>
> The rebuild_root_domains() function shouldn't be called when updating
> cpuset.cpus unless it is a partition root. Is it?
>

I think it's because we were using the legacy hierarchy.
I'm not familiar with cpuset partitions, though.

Hao