Date: Wed, 8 May 2024 08:35:03 -0700
X-Mailing-List: linux-kernel@vger.kernel.org
References: <3b2c222b-9ef7-43e2-8ab3-653a5ee824d4@paulmck-laptop> <663a659d-3a6f-4bec-a84b-4dd5fd16c3c1@paulmck-laptop> <0e239143-65ed-445a-9782-e905527ea572@paulmck-laptop>
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
From: Sean Christopherson
To: "Paul E.
McKenney"
Cc: Leonardo Bras, Paolo Bonzini, Frederic Weisbecker, Neeraj Upadhyay,
 Joel Fernandes, Josh Triplett, Boqun Feng, Steven Rostedt,
 Mathieu Desnoyers, Lai Jiangshan, Zqiang, Marcelo Tosatti,
 kvm@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Content-Type: text/plain; charset="us-ascii"

On Tue, May 07, 2024, Paul E. McKenney wrote:
> On Tue, May 07, 2024 at 05:08:54PM -0700, Sean Christopherson wrote:
> > > > This is admittedly a bit indirect, but then again this is Linux-kernel
> > > > RCU that we are talking about.
> > > >
> > > > > And I'm arguing that, since the @user check isn't bombproof, there's no reason to
> > > > > try to harden against every possible edge case in an equivalent @guest check,
> > > > > because it's unnecessary for kernel safety, thanks to the guardrails.
> > > >
> > > > And the same argument above would also apply to an equivalent check for
> > > > execution in guest mode at the time of the interrupt.
> > >
> > > This is partly why I was off in the weeds. KVM cannot guarantee that the
> > > interrupt that leads to rcu_pending() actually interrupted the guest. And the
> > > original patch didn't help at all, because a time-based check doesn't come
> > > remotely close to the guarantees that the @user check provides.
>
> Nothing in the registers from the interrupted context permits that
> determination?

No, because the interrupt/call chain that reaches rcu_pending() actually
originates in KVM host code, not guest code. I.e. the eventual IRET will return
control to KVM, not to the guest.

On AMD, the interrupt quite literally interrupts the host, not the guest. AMD
CPUs don't actually acknowledge/consume the physical interrupt when the guest is
running, the CPU simply generates a VM-Exit that says "there's an interrupt
pending". It's up to software, i.e. KVM, to enable IRQs and handle (all!) pending
interrupts.
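
The AMD sequence described above can be sketched as a small user-space toy model. This is not KVM code: `struct cpu_model`, `interrupt_arrives_in_guest()`, `host_enable_irqs()`, and the other names are hypothetical and exist only to show why the handler ends up seeing host context as the thing it interrupted.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical; this is a toy model, not KVM code. */
enum cpu_context { CTX_GUEST, CTX_HOST };

struct cpu_model {
	enum cpu_context ctx;	/* what the CPU is currently executing */
	bool irq_pending;	/* physical interrupt latched, not yet consumed */
	bool irqs_enabled;	/* host IRQ enable state */
};

/* Records which context the interrupt handler found itself interrupting. */
static enum cpu_context handler_saw;

void irq_handler(const struct cpu_model *cpu)
{
	handler_saw = cpu->ctx;
}

/*
 * A physical interrupt arrives while the guest runs. In this model the CPU
 * does not consume it; it only forces a VM-Exit back to the host.
 */
void interrupt_arrives_in_guest(struct cpu_model *cpu)
{
	cpu->irq_pending = true;
	cpu->ctx = CTX_HOST;	/* VM-Exit: "there's an interrupt pending" */
}

/* Host software enables IRQs; only now is the pending interrupt taken. */
void host_enable_irqs(struct cpu_model *cpu)
{
	cpu->irqs_enabled = true;
	if (cpu->irq_pending) {
		cpu->irq_pending = false;
		irq_handler(cpu);
	}
}

/* Runs the whole sequence and reports what the handler observed. */
enum cpu_context run_amd_style_exit(void)
{
	struct cpu_model cpu = {
		.ctx = CTX_GUEST,
		.irq_pending = false,
		.irqs_enabled = false,
	};

	interrupt_arrives_in_guest(&cpu);
	host_enable_irqs(&cpu);
	return handler_saw;
}
```

In this model `run_amd_style_exit()` returns `CTX_HOST`, which mirrors the point that nothing in the interrupted context identifies the guest as the origin.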

Intel CPUs have a mode where the CPU fully acknowledges the interrupt and reports
the exact vector that caused the VM-Exit, but it's still up to software to invoke
the interrupt handler, i.e. the interrupt trampolines through KVM.

And before handling/forwarding the interrupt, KVM exits its quiescent state,
leaves its no-instrumentation region, invokes tracepoints, etc. So even my PF_VCPU
idea is _very_ different than the user/idle scenarios, where the interrupt really
truly does originate from an extended quiescent state.

> > > > But if we do need RCU to be more aggressive about treating guest execution as
> > > > an RCU quiescent state within the host, that additional check would be an
> > > > excellent way of making that happen.
> > >
> > > It's not clear to me that being more aggressive is warranted. If my understanding
> > > of the existing @user check is correct, we _could_ achieve similar functionality
> > > for vCPU tasks by defining a rule that KVM must never enter an RCU critical section
> > > with PF_VCPU set and IRQs enabled, and then rcu_pending() could check PF_VCPU.
> > > On x86, this would be relatively straightforward (hack-a-patch below), but I've
> > > no idea what it would look like on other architectures.
>
> At first glance, this looks plausible. I would guess that a real patch
> would have to be architecture dependent, and that could simply involve
> a Kconfig option (perhaps something like CONFIG_RCU_SENSE_GUEST), so
> that the check you add to rcu_pending is conditioned on something like
> IS_ENABLED(CONFIG_RCU_SENSE_GUEST).
>
> There would also need to be a similar check in rcu_sched_clock_irq(),
> or maybe in rcu_flavor_sched_clock_irq(), to force a call to rcu_qs()
> in this situation.
>
> > > But the value added isn't entirely clear to me, probably because I'm still missing
> > > something. KVM will have *very* recently called __ct_user_exit(CONTEXT_GUEST) to
> > > note the transition from guest to host kernel.
> > > Why isn't that a sufficient hook for RCU to infer grace period completion?
>
> Agreed, unless we are sure we need the change, we should not make it.

+1. And your comments about tracepoints, instrumentation, etc. make me think
that trying to force the issue with PF_VCPU would be a bad idea.
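
For reference, the shape of the check discussed in the quoted text can be sketched in user space. This is a rough model under stated assumptions, not a kernel patch: `struct task_model`, `tick_may_report_qs()`, and `RCU_SENSE_GUEST_MODEL` are hypothetical stand-ins for `current`, the tick-time RCU check, and `IS_ENABLED(CONFIG_RCU_SENSE_GUEST)`, and the flag value is illustrative. The model is only meaningful if the rule from the thread holds, namely that KVM never enters an RCU critical section with PF_VCPU set and IRQs enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-in for the kernel's PF_VCPU task flag. */
#define PF_VCPU_MODEL 0x00000001u

/* Hypothetical stand-in for IS_ENABLED(CONFIG_RCU_SENSE_GUEST). */
#ifndef RCU_SENSE_GUEST_MODEL
#define RCU_SENSE_GUEST_MODEL 1
#endif

/* Hypothetical, heavily reduced model of a task. */
struct task_model {
	unsigned int flags;
};

/*
 * Returns true if the tick may treat the interrupted context as quiescent.
 * @user models the existing user-mode check; the PF_VCPU branch models the
 * guest check floated in the thread, compiled out when the option is off.
 */
bool tick_may_report_qs(const struct task_model *curr, bool user)
{
	if (user)
		return true;
	if (RCU_SENSE_GUEST_MODEL && (curr->flags & PF_VCPU_MODEL))
		return true;
	return false;
}
```

With the option enabled, a task carrying the vCPU flag is treated like the @user case, while an ordinary kernel task interrupted in kernel mode is not. As noted above, the flag only says the task is a vCPU task, not that the interrupt originated in an extended quiescent state.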