Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D31A4C433EF for ; Wed, 5 Jan 2022 01:48:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236971AbiAEBsV (ORCPT ); Tue, 4 Jan 2022 20:48:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236962AbiAEBsU (ORCPT ); Tue, 4 Jan 2022 20:48:20 -0500 Received: from mail-yb1-xb2e.google.com (mail-yb1-xb2e.google.com [IPv6:2607:f8b0:4864:20::b2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 713BAC061784 for ; Tue, 4 Jan 2022 17:48:20 -0800 (PST) Received: by mail-yb1-xb2e.google.com with SMTP id w13so79575435ybs.13 for ; Tue, 04 Jan 2022 17:48:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=S3MV8EDA/gvv4OPelttWmulQGEHAkpGwQXtCt5sJZFc=; b=IjnvSApgYqoCeUI/96H6ZFMo8jneVsZ6oT5EroZA96IcN5y4uOe8HrLGXDKJxB1yk6 d5vLYIZFUBUTrW0gABnB1eBHPxLyuRZvLdad3hmu9KQzzZf0qcHYK0P39nBGPE6Ij2+j fw5mlcY27kIOS8nDgHB4bFddbXLmtNRc6H0SFu/kSlRhCWV5/T0q4ajAb1MsRdFoP8+w qT7KTGH5hJpmyPT/we5xRA8bzDDzTdQfQc2LINB8GkvEKhYZqxuAPIssLYGivb0tkMiZ ZMi/Qa5ji/dXu5AxNvUVL2hw6HGZaI9uPYjWETdQX26M7SzUOdx7v0YODSlqllan5NVf HH7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=S3MV8EDA/gvv4OPelttWmulQGEHAkpGwQXtCt5sJZFc=; b=mnBsG7y1fvtBAr9zN/md/XZR6pC2KGGtm+1okhuu5zjxG490TktxcX/EQ4QSISvxZm BIxVNpEs+wcohn2SoahF2V3ulzXcA8Jmw5LEwRqLiWd4mp3tQ085kSnnLEzU2DegACk4 OhLJfpKWFIkfDB3aqkqCZ7d3+m2YXDbmcrEk2Ll6myFKPUKIwyjQ7iQ3sN5aXTN3UjZa qRs4tyS5jTNOwwd6KrkkAZqzZzkUSmVurWaYg4QhtgGoTAuw3tKlO/ek/MeFx2jNetGT lhJa8b7DjOG/jxPcsGTzAWysiPJxLUvYzVQfpdidobSxLxKAlkz3SKu0C97vmPjRVDYg uBOg== X-Gm-Message-State: AOAM533QVUxXZDAZIlW1J/SXwDhV5P/Z448AXaR5izj4QOSIi7jVc5cR 2aY0N61RqwwCuP1EQnvYshsaHbgHcbe52b6x90zqvg== X-Google-Smtp-Source: ABdhPJzgiDx9c56h3HvklU/xl5FU9OHdfrLBiBFcrJF3XAgLqNgRZIx7IBYHVpRBrRyzXDnZFaj8MeQCOxGaEnmrLB0= X-Received: by 2002:a25:3745:: with SMTP id e66mr21815632yba.208.1641347299207; Tue, 04 Jan 2022 17:48:19 -0800 (PST) MIME-Version: 1.0 References: <1640262603-19339-1-git-send-email-CruzZhao@linux.alibaba.com> <1640262603-19339-2-git-send-email-CruzZhao@linux.alibaba.com> In-Reply-To: <1640262603-19339-2-git-send-email-CruzZhao@linux.alibaba.com> From: Josh Don Date: Tue, 4 Jan 2022 17:48:08 -0800 Message-ID: Subject: Re: [PATCH 1/2] sched/core: Cookied forceidle accounting per cpu To: Cruz Zhao Cc: Ingo Molnar , Peter Zijlstra , Juri Lelli , Vincent Guittot , Dietmar Eggemann , Steven Rostedt , Benjamin Segall , Mel Gorman , Daniel Bristot de Oliveira , Alexey Dobriyan , Eric Dumazet , linux-kernel , linux-fsdevel Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Cruz, Could you add a bit more background to help me understand what case this patch solves? Is your issue that you want to be able to attribute forced idle time to the specific cpus it happens on, or do you simply want a more convenient way of summing forced idle without iterating your cookie'd tasks and summing the schedstat manually? > @@ -190,6 +202,9 @@ static int show_stat(struct seq_file *p, void *v) > seq_put_decimal_ull(p, " ", nsec_to_clock_t(steal)); > seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest)); > seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest_nice)); > +#ifdef CONFIG_SCHED_CORE > + seq_put_decimal_ull(p, " ", nsec_to_clock_t(cookied_forceidle)); > +#endif IMO it would be better to always print this stat, otherwise it sets a weird precedent for new stats added in the future (much more difficult for userspace to reason about which column corresponds with which field, since it would depend on kernel config). Also, did you intend to put this in /proc/stat instead of /proc/schedstat (the latter of which would be more attractive to prevent calculation of these stats unless schestat was enabled)? > diff --git a/kernel/sched/core_sched.c b/kernel/sched/core_sched.c > @@ -260,6 +261,21 @@ void __sched_core_account_forceidle(struct rq *rq) > > rq->core->core_forceidle_start = now; > > + for_each_cpu(i, smt_mask) { > + rq_i = cpu_rq(i); > + p = rq_i->core_pick ?: rq_i->curr; > + > + if (!rq->core->core_cookie) > + continue; I see this is temporary given your other patch, but just a note that if your other patch is dropped, this check can be pulled outside the loop. > + if (p == rq_i->idle && rq_i->nr_running) { > + cpustat = kcpustat_cpu(i).cpustat; > + cpustat[CPUTIME_COOKIED_FORCEIDLE] += delta; > + } > + } I don't think this is right. If a cpu was simply idle while some other SMT sibling on its core was forced idle, and then a task happens to wake on the idle cpu, that cpu will now be charged the full delta here as forced idle (when actually it was never forced idle, we just haven't been through pick_next_task yet). One workaround would be to add a boolean to struct rq to cache whether the rq was in forced idle state. Best, Josh