Received: by 2002:a25:7ec1:0:0:0:0:0 with SMTP id z184csp2676057ybc; Wed, 20 Nov 2019 18:50:04 -0800 (PST) X-Google-Smtp-Source: APXvYqywoEl8Hgtvq3DRqzuKRE0gwKFJQrTz7koayFKFLrgt1/2p1CcLHmp+n4feKUP2+0wv7kFM X-Received: by 2002:a17:906:5246:: with SMTP id y6mr10364158ejm.330.1574304604706; Wed, 20 Nov 2019 18:50:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1574304604; cv=none; d=google.com; s=arc-20160816; b=wEqFqahq5u3dkquPBOShQmS/YJ39B/ULZcZShEjTBP7QzXX5QMEtpiUmaLQ1NvPxl1 V0GQONtRKbZOnFEiY2ldml8n8RRF7TqKu7aNbKz2JOeFW5KyLxoXX3yic9pW72BpLaUW Ffz0JOyXxFvD9WHgbCw2OBQDdO7HzIm8juFRdXbrgtEuO2nAy0X7mO3I1rd58VKnzCCE LBl3E5k99wCWoAG+NmU2CYpAU62fMBKEOWj6CQuq1vzSMRw3mSvnu4e4RJEXnUzKhDc7 FCyHyr03FqeF+MUKm/dcH7n+Lll0bC5Tb9IcMi2WS2OClENbW86bS2tR6+ptTGB/ZAAW s3aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=/SwBWs5q66edq7+rnCXLbZ3jHRIJEgn/KoE7Pgf6btU=; b=g4dgIzoT8R+9HEkxT0uPcB+jmHOEviiBTzmRn5nSRsrs598s7w5utVb4aI4WMNjUqq 9KKeT+SRe2pbAYV7FZU5PwbfEXSTDQynYBPpP+O4jXgKpKacpHTvSPwNKTepAEzzZRaf 94f9e9p+kALaTKZY7LejP5z9U58YBB87Wxpb/jC6Xqt/XUzrNRNtNTZQm7AKoQyZd+D1 g1j/qZerjboH0PvbYpgTCc/GsmKakIJEJwQnmPCMlYa7NOl7g8VEtJ1wI7PFQig14zPH dojo5VEKKptGHClc6MtwcbBuhd5OmeRq7YzdqT5EEvfFSNcVvEjO+411BOJOvp6rKHMn V5fQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wIOZuAtR; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id g53si1113179edb.313.2019.11.20.18.49.40; Wed, 20 Nov 2019 18:50:04 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=wIOZuAtR; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726792AbfKUCoo (ORCPT + 99 others); Wed, 20 Nov 2019 21:44:44 -0500 Received: from mail.kernel.org ([198.145.29.99]:54250 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726695AbfKUCon (ORCPT ); Wed, 20 Nov 2019 21:44:43 -0500 Received: from lenoir.home (lfbn-ncy-1-150-155.w83-194.abo.wanadoo.fr [83.194.232.155]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id B62BA20885; Thu, 21 Nov 2019 02:44:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1574304282; bh=Ue1B0LLPG3CxGJuFT54MuyKwfQ0rwUv5ZkF2IWs0GyI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=wIOZuAtRntu96iK2XmiK7ELzMp9Yvj3x6ZN2zZAhjgPDwrilPGrTN9PndqNA09Dqj R7xcVdy1JPJahNtlWwQQCLv+vc/7b5ieOT+0IVLi7rUeDc9BxbzvllKje88sy/qZrB hdosbSy4wnEfgE21RZDp2nrcz+KKnIrUjb+4c1tQ= From: Frederic Weisbecker To: Peter Zijlstra , Ingo Molnar Cc: LKML , Frederic Weisbecker , Jacek Anaszewski , Wanpeng Li , "Rafael J . Wysocki" , Benjamin Herrenschmidt , Rik van Riel , Thomas Gleixner , Yauheni Kaliuta , Viresh Kumar , Pavel Machek Subject: [PATCH 2/6] sched/vtime: Bring up complete kcpustat accessor Date: Thu, 21 Nov 2019 03:44:26 +0100 Message-Id: <20191121024430.19938-3-frederic@kernel.org> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20191121024430.19938-1-frederic@kernel.org> References: <20191121024430.19938-1-frederic@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Many callsites want to fetch the values of system, user, user_nice, guest or guest_nice kcpustat fields altogether or at least a pair of these. In that case calling kcpustat_field() for each requested field brings unecessary overhead when we could fetch all of them in a row. So provide kcpustat_cpu_fetch() that fetches the whole kcpustat array in a vtime safe way under the same RCU and seqcount block. Signed-off-by: Frederic Weisbecker Cc: Yauheni Kaliuta Cc: Thomas Gleixner Cc: Rik van Riel Cc: Peter Zijlstra Cc: Wanpeng Li Cc: Ingo Molnar --- include/linux/kernel_stat.h | 7 ++ kernel/sched/cputime.c | 136 ++++++++++++++++++++++++++++++------ 2 files changed, 123 insertions(+), 20 deletions(-) diff --git a/include/linux/kernel_stat.h b/include/linux/kernel_stat.h index 79781196eb25..89f0745c096d 100644 --- a/include/linux/kernel_stat.h +++ b/include/linux/kernel_stat.h @@ -81,12 +81,19 @@ static inline unsigned int kstat_cpu_irqs_sum(unsigned int cpu) #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN extern u64 kcpustat_field(struct kernel_cpustat *kcpustat, enum cpu_usage_stat usage, int cpu); +extern void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu); #else static inline u64 kcpustat_field(struct kernel_cpustat *kcpustat, enum cpu_usage_stat usage, int cpu) { return kcpustat->cpustat[usage]; } + +static inline void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu) +{ + *dst = kcpustat_cpu(cpu); +} + #endif extern void account_user_time(struct task_struct *, u64); diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 27b5406222fc..d43318a489f2 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -912,6 +912,30 @@ void task_cputime(struct task_struct *t, u64 *utime, u64 *stime) } while (read_seqcount_retry(&vtime->seqcount, seq)); } +static int vtime_state_check(struct vtime *vtime, int cpu) +{ + /* + * We raced against a context switch, fetch the + * kcpustat task again. + */ + if (vtime->cpu != cpu && vtime->cpu != -1) + return -EAGAIN; + + /* + * Two possible things here: + * 1) We are seeing the scheduling out task (prev) or any past one. + * 2) We are seeing the scheduling in task (next) but it hasn't + * passed though vtime_task_switch() yet so the pending + * cputime of the prev task may not be flushed yet. + * + * Case 1) is ok but 2) is not. So wait for a safe VTIME state. + */ + if (vtime->state == VTIME_INACTIVE) + return -EAGAIN; + + return 0; +} + static u64 kcpustat_user_vtime(struct vtime *vtime) { if (vtime->state == VTIME_USER) @@ -933,26 +957,9 @@ static int kcpustat_field_vtime(u64 *cpustat, do { seq = read_seqcount_begin(&vtime->seqcount); - /* - * We raced against context switch, fetch the - * kcpustat task again. - */ - if (vtime->cpu != cpu && vtime->cpu != -1) - return -EAGAIN; - - /* - * Two possible things here: - * 1) We are seeing the scheduling out task (prev) or any past one. - * 2) We are seeing the scheduling in task (next) but it hasn't - * passed though vtime_task_switch() yet so the pending - * cputime of the prev task may not be flushed yet. - * - * Case 1) is ok but 2) is not. So wait for a safe VTIME state. - */ - if (vtime->state == VTIME_INACTIVE) - return -EAGAIN; - - err = 0; + err = vtime_state_check(vtime, cpu); + if (err < 0) + return err; *val = cpustat[usage]; @@ -1025,4 +1032,93 @@ u64 kcpustat_field(struct kernel_cpustat *kcpustat, } } EXPORT_SYMBOL_GPL(kcpustat_field); + +static int kcpustat_cpu_fetch_vtime(struct kernel_cpustat *dst, + const struct kernel_cpustat *src, + struct task_struct *tsk, int cpu) +{ + struct vtime *vtime = &tsk->vtime; + unsigned int seq; + int err; + + do { + u64 *cpustat; + u64 delta; + + seq = read_seqcount_begin(&vtime->seqcount); + + err = vtime_state_check(vtime, cpu); + if (err < 0) + return err; + + *dst = *src; + cpustat = dst->cpustat; + + /* Task is sleeping, dead or idle, nothing to add */ + if (vtime->state < VTIME_SYS) + continue; + + delta = vtime_delta(vtime); + + /* + * Task runs either in user (including guest) or kernel space, + * add pending nohz time to the right place. + */ + if (vtime->state == VTIME_SYS) { + cpustat[CPUTIME_SYSTEM] += vtime->stime + delta; + } else if (vtime->state == VTIME_USER) { + if (task_nice(tsk) > 0) + cpustat[CPUTIME_NICE] += vtime->utime + delta; + else + cpustat[CPUTIME_USER] += vtime->utime + delta; + } else { + WARN_ON_ONCE(vtime->state != VTIME_GUEST); + if (task_nice(tsk) > 0) { + cpustat[CPUTIME_GUEST_NICE] += vtime->gtime + delta; + cpustat[CPUTIME_NICE] += vtime->gtime + delta; + } else { + cpustat[CPUTIME_GUEST] += vtime->gtime + delta; + cpustat[CPUTIME_USER] += vtime->gtime + delta; + } + } + } while (read_seqcount_retry(&vtime->seqcount, seq)); + + return err; +} + +void kcpustat_cpu_fetch(struct kernel_cpustat *dst, int cpu) +{ + const struct kernel_cpustat *src = &kcpustat_cpu(cpu); + struct rq *rq; + int err; + + if (!vtime_accounting_enabled_cpu(cpu)) { + *dst = *src; + return; + } + + rq = cpu_rq(cpu); + + for (;;) { + struct task_struct *curr; + + rcu_read_lock(); + curr = rcu_dereference(rq->curr); + if (WARN_ON_ONCE(!curr)) { + rcu_read_unlock(); + *dst = *src; + return; + } + + err = kcpustat_cpu_fetch_vtime(dst, src, curr, cpu); + rcu_read_unlock(); + + if (!err) + return; + + cpu_relax(); + } +} +EXPORT_SYMBOL_GPL(kcpustat_cpu_fetch); + #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */ -- 2.23.0