From: David Woodhouse
To: kvm@vger.kernel.org
Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Thomas Gleixner,
    Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org,
    "H. Peter Anvin", Paul Durrant, Peter Zijlstra, Juri Lelli,
    Vincent Guittot, Dietmar Eggemann, Steven Rostedt, Ben Segall,
    Mel Gorman, Daniel Bristot de Oliveira, Valentin Schneider, Shuah Khan,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    jalliste@amazon.co.uk, sveith@amazon.de, zide.chen@intel.com,
    Dongli Zhang, Chenyi Qiang
Subject: [RFC PATCH v3 21/21] sched/cputime: Cope with steal time going backwards or negative
Date: Wed, 22 May 2024 01:17:16 +0100
Message-ID: <20240522001817.619072-22-dwmw2@infradead.org>
In-Reply-To: <20240522001817.619072-1-dwmw2@infradead.org>
References: <20240522001817.619072-1-dwmw2@infradead.org>

From: David Woodhouse

In steal_account_process_time(), a delta is calculated between the value
returned by paravirt_steal_clock(), and this_rq()->prev_steal_time which
is assumed to be the *previous* value returned by paravirt_steal_clock().

However, instead of just assigning the newly-read value directly into
->prev_steal_time for use in the next iteration, ->prev_steal_time is
*incremented* by the calculated delta.

This used to be roughly the same, modulo conversion to jiffies and back,
until commit 807e5b80687c0 ("sched/cputime: Add steal time support to
full dynticks CPU time accounting") started clamping that delta to a
maximum of the actual time elapsed.

So now, if the value returned by paravirt_steal_clock() jumps by a large
amount, instead of a *single* period of reporting 100% steal time, the
system will report 100% steal time for as long as it takes to "catch up"
with the reported value. Which is up to 584 years.

But there is a benefit to advancing ->prev_steal_time only by the time
which was *accounted* as having been stolen. It means that any extra time
truncated by the clamping will be accounted in the next sample period
rather than lost. Given the stochastic nature of the sampling, that is
more accurate overall.

So, continue to advance ->prev_steal_time by the accounted value as long
as the delta isn't egregiously large (for which, use maxtime * 2). If the
delta is more than that, just set ->prev_steal_time directly to the value
returned by paravirt_steal_clock().

Fixes: 807e5b80687c0 ("sched/cputime: Add steal time support to full dynticks CPU time accounting")
Signed-off-by: David Woodhouse
---
 kernel/sched/cputime.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index af7952f12e6c..3a8a8b38966d 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -254,13 +254,21 @@ static __always_inline u64 steal_account_process_time(u64 maxtime)
 {
 #ifdef CONFIG_PARAVIRT
 	if (static_key_false(&paravirt_steal_enabled)) {
-		u64 steal;
-
-		steal = paravirt_steal_clock(smp_processor_id());
-		steal -= this_rq()->prev_steal_time;
-		steal = min(steal, maxtime);
+		u64 steal, abs_steal;
+
+		abs_steal = paravirt_steal_clock(smp_processor_id());
+		steal = abs_steal - this_rq()->prev_steal_time;
+		if (unlikely(steal > maxtime)) {
+			/*
+			 * If the delta isn't egregious, it can be counted
+			 * in the next time period. Only advance by maxtime.
+			 */
+			if (steal < maxtime * 2)
+				abs_steal = this_rq()->prev_steal_time + maxtime;
+			steal = maxtime;
+		}
 		account_steal_time(steal);
-		this_rq()->prev_steal_time += steal;
+		this_rq()->prev_steal_time = abs_steal;
 
 		return steal;
 	}
-- 
2.44.0
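
Illustration only, not part of the patch: a minimal userspace sketch of the
behaviour described in the commit message, comparing the pre-patch clamping
with the patched logic. The names struct rq_sim, steal_old(), steal_new(),
periods_at_full_steal() and catch_up_years() are hypothetical and exist only
for this sketch. It assumes the steal clock counts nanoseconds in a u64 and
that the clock moves once and then stays constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the one runqueue field the patch touches. */
struct rq_sim {
	uint64_t prev_steal_time;
};

typedef uint64_t (*steal_fn)(struct rq_sim *rq, uint64_t abs_steal,
			     uint64_t maxtime);

/* Pre-patch logic: clamp the delta, then advance by the accounted amount. */
uint64_t steal_old(struct rq_sim *rq, uint64_t abs_steal, uint64_t maxtime)
{
	uint64_t steal = abs_steal - rq->prev_steal_time;

	if (steal > maxtime)
		steal = maxtime;
	rq->prev_steal_time += steal;
	return steal;
}

/* Patched logic: carry over a small excess, resynchronise on a large one. */
uint64_t steal_new(struct rq_sim *rq, uint64_t abs_steal, uint64_t maxtime)
{
	uint64_t steal = abs_steal - rq->prev_steal_time;

	if (steal > maxtime) {
		if (steal < maxtime * 2)
			abs_steal = rq->prev_steal_time + maxtime;
		steal = maxtime;
	}
	rq->prev_steal_time = abs_steal;
	return steal;
}

/*
 * The steal clock moves by "jump" once and then stays put. Count how many
 * consecutive sample periods report 100% steal, giving up after "limit".
 * A backwards step is modelled by unsigned wraparound, e.g. jump = -1.
 */
unsigned int periods_at_full_steal(steal_fn fn, uint64_t jump,
				   uint64_t maxtime, unsigned int limit)
{
	struct rq_sim rq = { .prev_steal_time = 1000 };
	uint64_t clock = rq.prev_steal_time + jump;
	unsigned int n = 0;

	assert(maxtime > 0);
	while (n < limit && fn(&rq, clock, maxtime) == maxtime)
		n++;
	return n;
}

/* Years of wall time needed to catch up, assuming nanosecond units. */
double catch_up_years(uint64_t delta_ns)
{
	return (double)delta_ns / 1e9 / (365.25 * 24 * 3600);
}
```

With a maxtime of 4000000 (4 ms, an assumed sample period) and a jump of ten
periods, steal_old() reports ten consecutive periods of 100% steal while
steal_new() reports one. For a clock that steps backwards by one unit, the
old logic is still saturated when the iteration limit is reached, and
catch_up_years(UINT64_MAX) lands between 584 and 585, which is where the
"584 years" figure comes from under the nanosecond assumption.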