Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp3728109pxb; Mon, 24 Jan 2022 16:30:18 -0800 (PST) X-Google-Smtp-Source: ABdhPJwX8X+1wMvmW/MkxNnjdPWqVtFZlxlUZllM1QTGfysg6zi1po8SM1BLdEyUNRzqsz+DNoeA X-Received: by 2002:a17:90b:1bc7:: with SMTP id oa7mr745751pjb.149.1643070617945; Mon, 24 Jan 2022 16:30:17 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1643070617; cv=none; d=google.com; s=arc-20160816; b=tYt+RLCR5wa1QWqRrzxMfvCAHsosaYZ21oFpHIEq9yUygsaw92F6Narsh5eyqd19Xa dPKh4NfW2tq2i0+iypGnQ4km4U23DGrxoU3gq9bT71eArehCWtqrVgsdqlEpHGY8n4BM rBrkTqFy7fRFpO6b/hB25QddhrPHs61W5TxdLv0hn/ZdugYwM7bLSorBhsmuYR8//4XU jeabUA93Q1gmlajFna1skOaGOskh7MnYLOGUlSo9HQOVGBmIKRkBPxLcfL7SY0ykNS0G JNHjEdPB5IA2NqcGu4ord0CQH4ZINDUtbaTz/PCDtC9PDcWJubj15XryReglralkbVti BQGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=ar0bPmvppL1i4PvAehivyQwFnzLuTCagcfEnwxsgFOI=; b=sdcWI9Tdf3v+SPR3bTvhWhPdhZUQLUXKrzaDB7MyrLmJ7JChGki+N/sfNwg0b9Pm/K heXMt/7FVPv5TTdiCQ5iFtVK9fKpoerzKREPtd6wa/5ga6qIU2bREOE3JNnQ2/UMLKU1 XuczPFAhpMdbgMpOHT5sGktf5J2PnyJd58EhRec1DzvKU6ov9nzbZziAqgEnI6ABkslt 3Sfy1jdEuLSEocr0SQ+nMS/6KqGceClmMyti/kLftvXxOgJkD7wnv/WECHKNejOu/YNn 1MMqP3zGpJ8IPeFsJDYlDWzvC4uy8oEYnMrD0TQodIVSkkj9fUAWXHw+3lXByfNsKDqO JHMQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=iGncU4nn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id f4si7157276pfd.61.2022.01.24.16.30.04; Mon, 24 Jan 2022 16:30:17 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linuxfoundation.org header.s=korg header.b=iGncU4nn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S3408237AbiAYAWG (ORCPT + 99 others); Mon, 24 Jan 2022 19:22:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53862 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1381874AbiAXXeJ (ORCPT ); Mon, 24 Jan 2022 18:34:09 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC343C07596F; Mon, 24 Jan 2022 13:35:37 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4C0A1612D6; Mon, 24 Jan 2022 21:35:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 313BFC340E4; Mon, 24 Jan 2022 21:35:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1643060136; bh=lwmo3Kp9Lf9a9yqto9dD7w/To7cLK85iygXK9u4mjjU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iGncU4nnstaW734ZP1+sROy4zT2ybnsOkCfADf6BbMCjS6q+lBRk3Gj+VmgIU7CjR 9CJQ+4Wrx6zn8sNseO+bZ63mLxRfuv5M3ohB5qYCvNBR4/BmBIRIa818gqVsC67fon RZ7aUKL8Psu4V93n4pRFPnI28EktljofFTzcd9/U= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Andrey Ryabinin , "Peter Zijlstra (Intel)" , Daniel Jordan , Tejun Heo Subject: [PATCH 5.16 0851/1039] sched/cpuacct: Fix user/system in shown cpuacct.usage* Date: Mon, 24 Jan 2022 19:44:00 +0100 Message-Id: <20220124184153.885119284@linuxfoundation.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20220124184125.121143506@linuxfoundation.org> References: <20220124184125.121143506@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Andrey Ryabinin commit dd02d4234c9a2214a81c57a16484304a1a51872a upstream. cpuacct has 2 different ways of accounting and showing user and system times. The first one uses cpuacct_account_field() to account times and cpuacct.stat file to expose them. And this one seems to work ok. The second one is uses cpuacct_charge() function for accounting and set of cpuacct.usage* files to show times. Despite some attempts to fix it in the past it still doesn't work. Sometimes while running KVM guest the cpuacct_charge() accounts most of the guest time as system time. This doesn't match with user&system times shown in cpuacct.stat or proc//stat. Demonstration: # git clone https://github.com/aryabinin/kvmsample # make # mkdir /sys/fs/cgroup/cpuacct/test # echo $$ > /sys/fs/cgroup/cpuacct/test/tasks # ./kvmsample & # for i in {1..5}; do cat /sys/fs/cgroup/cpuacct/test/cpuacct.usage_sys; sleep 1; done 1976535645 2979839428 3979832704 4983603153 5983604157 Use cpustats accounted in cpuacct_account_field() as the source of user/sys times for cpuacct.usage* files. Make cpuacct_charge() to account only summary execution time. Fixes: d740037fac70 ("sched/cpuacct: Split usage accounting into user_usage and sys_usage") Signed-off-by: Andrey Ryabinin Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Daniel Jordan Acked-by: Tejun Heo Cc: Link: https://lore.kernel.org/r/20211115164607.23784-3-arbn@yandex-team.com Signed-off-by: Greg Kroah-Hartman --- kernel/sched/cpuacct.c | 79 +++++++++++++++++++------------------------------ 1 file changed, 32 insertions(+), 47 deletions(-) --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -21,15 +21,11 @@ static const char * const cpuacct_stat_d [CPUACCT_STAT_SYSTEM] = "system", }; -struct cpuacct_usage { - u64 usages[CPUACCT_STAT_NSTATS]; -}; - /* track CPU usage of a group of tasks and its child groups */ struct cpuacct { struct cgroup_subsys_state css; /* cpuusage holds pointer to a u64-type object on every CPU */ - struct cpuacct_usage __percpu *cpuusage; + u64 __percpu *cpuusage; struct kernel_cpustat __percpu *cpustat; }; @@ -49,7 +45,7 @@ static inline struct cpuacct *parent_ca( return css_ca(ca->css.parent); } -static DEFINE_PER_CPU(struct cpuacct_usage, root_cpuacct_cpuusage); +static DEFINE_PER_CPU(u64, root_cpuacct_cpuusage); static struct cpuacct root_cpuacct = { .cpustat = &kernel_cpustat, .cpuusage = &root_cpuacct_cpuusage, @@ -68,7 +64,7 @@ cpuacct_css_alloc(struct cgroup_subsys_s if (!ca) goto out; - ca->cpuusage = alloc_percpu(struct cpuacct_usage); + ca->cpuusage = alloc_percpu(u64); if (!ca->cpuusage) goto out_free_ca; @@ -99,7 +95,8 @@ static void cpuacct_css_free(struct cgro static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu, enum cpuacct_stat_index index) { - struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + u64 *cpustat = per_cpu_ptr(ca->cpustat, cpu)->cpustat; u64 data; /* @@ -115,14 +112,17 @@ static u64 cpuacct_cpuusage_read(struct raw_spin_rq_lock_irq(cpu_rq(cpu)); #endif - if (index == CPUACCT_STAT_NSTATS) { - int i = 0; - - data = 0; - for (i = 0; i < CPUACCT_STAT_NSTATS; i++) - data += cpuusage->usages[i]; - } else { - data = cpuusage->usages[index]; + switch (index) { + case CPUACCT_STAT_USER: + data = cpustat[CPUTIME_USER] + cpustat[CPUTIME_NICE]; + break; + case CPUACCT_STAT_SYSTEM: + data = cpustat[CPUTIME_SYSTEM] + cpustat[CPUTIME_IRQ] + + cpustat[CPUTIME_SOFTIRQ]; + break; + case CPUACCT_STAT_NSTATS: + data = *cpuusage; + break; } #ifndef CONFIG_64BIT @@ -132,10 +132,14 @@ static u64 cpuacct_cpuusage_read(struct return data; } -static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val) +static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu) { - struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); - int i; + u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + u64 *cpustat = per_cpu_ptr(ca->cpustat, cpu)->cpustat; + + /* Don't allow to reset global kernel_cpustat */ + if (ca == &root_cpuacct) + return; #ifndef CONFIG_64BIT /* @@ -143,9 +147,10 @@ static void cpuacct_cpuusage_write(struc */ raw_spin_rq_lock_irq(cpu_rq(cpu)); #endif - - for (i = 0; i < CPUACCT_STAT_NSTATS; i++) - cpuusage->usages[i] = val; + *cpuusage = 0; + cpustat[CPUTIME_USER] = cpustat[CPUTIME_NICE] = 0; + cpustat[CPUTIME_SYSTEM] = cpustat[CPUTIME_IRQ] = 0; + cpustat[CPUTIME_SOFTIRQ] = 0; #ifndef CONFIG_64BIT raw_spin_rq_unlock_irq(cpu_rq(cpu)); @@ -196,7 +201,7 @@ static int cpuusage_write(struct cgroup_ return -EINVAL; for_each_possible_cpu(cpu) - cpuacct_cpuusage_write(ca, cpu, 0); + cpuacct_cpuusage_write(ca, cpu); return 0; } @@ -243,25 +248,10 @@ static int cpuacct_all_seq_show(struct s seq_puts(m, "\n"); for_each_possible_cpu(cpu) { - struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); - seq_printf(m, "%d", cpu); - - for (index = 0; index < CPUACCT_STAT_NSTATS; index++) { -#ifndef CONFIG_64BIT - /* - * Take rq->lock to make 64-bit read safe on 32-bit - * platforms. - */ - raw_spin_rq_lock_irq(cpu_rq(cpu)); -#endif - - seq_printf(m, " %llu", cpuusage->usages[index]); - -#ifndef CONFIG_64BIT - raw_spin_rq_unlock_irq(cpu_rq(cpu)); -#endif - } + for (index = 0; index < CPUACCT_STAT_NSTATS; index++) + seq_printf(m, " %llu", + cpuacct_cpuusage_read(ca, cpu, index)); seq_puts(m, "\n"); } return 0; @@ -339,16 +329,11 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; - int index = CPUACCT_STAT_SYSTEM; - struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk); - - if (regs && user_mode(regs)) - index = CPUACCT_STAT_USER; rcu_read_lock(); for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) - __this_cpu_add(ca->cpuusage->usages[index], cputime); + __this_cpu_add(*ca->cpuusage, cputime); rcu_read_unlock(); }