Date: Tue, 7 May 2024 11:05:55 -0700
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
From: Sean Christopherson
To: Marcelo Tosatti
Cc: Leonardo Bras, "Paul E. McKenney", Paolo Bonzini, Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org

On Mon, May 06, 2024, Marcelo Tosatti wrote:
> On Fri, May 03, 2024 at 05:44:22PM -0300, Leonardo Bras wrote:
> > > And that race exists in general, i.e. any IRQ that arrives just as the idle task
> > > is being scheduled in will unnecessarily wakeup rcuc.
> >
> > That's a race could be solved with the timeout (snapshot) solution, if we
> > don't zero last_guest_exit on kvm_sched_out(), right?
>
> Yes.

And if KVM doesn't zero last_guest_exit on kvm_sched_out(), then we're right
back in the situation where RCU can get false positives (see below).

> > > > > > /* Is the RCU core waiting for a quiescent state from this CPU? */
> > > > > >
> > > > > > The problem is:
> > > > > >
> > > > > > 1) You should only set that flag, in the VM-entry path, after the point
> > > > > > where no use of RCU is made: close to guest_state_enter_irqoff call.
> > > > >
> > > > > Why? As established above, KVM essentially has 1 second to enter the guest after
> > > > > setting in_guest_run_loop (or whatever we call it). In the vast majority of cases,
> > > > > the time before KVM enters the guest can probably be measured in microseconds.
> > > >
> > > > OK.
> > > >
> > > > > Snapshotting the exit time has the exact same problem of depending on KVM to
> > > > > re-enter the guest soon-ish, so I don't understand why this would be considered
> > > > > a problem with a flag to note the CPU is in KVM's run loop, but not with a
> > > > > snapshot to say the CPU recently exited a KVM guest.
> > > >
> > > > See the race above.
> > >
> > > Ya, but if kvm_last_guest_exit is zeroed in kvm_sched_out(), then the snapshot
> > > approach ends up with the same race.
> > > And not zeroing kvm_last_guest_exit is
> > > arguably much more problematic as encountering a false positive doesn't require
> > > hitting a small window.
> >
> > For the false positive (only on nohz_full) the maximum delay for the
> > rcu_core() to be run would be 1s, and that would be in case we don't
> > schedule out for some userspace task or idle thread, in which case we have
> > a quiescent state without the need of rcu_core().
> >
> > Now, for not being an userspace nor idle thread, it would need to be one or
> > more kernel threads, which I suppose aren't usually many, nor usually take
> > that long for completing, if we consider to be running on an isolated
> > (nohz_full) cpu.
> >
> > So, for the kvm_sched_out() case, I don't actually think we are
> > statistically introducing that much of a delay in the RCU mechanism.
> >
> > (I may be missing some point, though)

My point is that if kvm_last_guest_exit is left as-is on kvm_sched_out() and
vcpu_put(), then from a kernel/RCU safety perspective there is no meaningful
difference between KVM setting kvm_last_guest_exit and userspace being allowed
to mark a task as exempt from preemption by rcuc. Userspace can simply do
KVM_RUN once to gain exemption from rcuc until the 1 second timeout expires.

And if KVM does zero kvm_last_guest_exit on kvm_sched_out()/vcpu_put(), then the
approach has the exact same window as my in_guest_run_loop idea, i.e. rcuc can be
unnecessarily awakened in the window between KVM putting the vCPU and the CPU
exiting to userspace.
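
To make the two cases concrete, here is a rough userspace model of the snapshot check. This is not kernel code and not taken from the RFC: the struct, the model_* helpers, and GRACE_NS are all hypothetical names, and the 1 second window is just the timeout discussed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model only: every name here is hypothetical, and the 1 s window
 * is simply the timeout discussed in the thread.
 */
#define GRACE_NS 1000000000ULL

struct cpu_model {
	uint64_t last_guest_exit_ns;	/* 0 == no snapshot recorded */
};

/* KVM snapshots the time of the most recent VM-exit. */
static void model_guest_exit(struct cpu_model *cpu, uint64_t now_ns)
{
	cpu->last_guest_exit_ns = now_ns;
}

/* vCPU is scheduled out or put; the snapshot is optionally cleared. */
static void model_sched_out(struct cpu_model *cpu, bool zero_on_sched_out)
{
	if (zero_on_sched_out)
		cpu->last_guest_exit_ns = 0;
}

/*
 * Would RCU skip waking rcuc, betting that the CPU re-enters the guest
 * (a quiescent state) soon?  Assumes now_ns >= last_guest_exit_ns.
 */
static bool model_rcu_skips_rcuc(const struct cpu_model *cpu, uint64_t now_ns)
{
	return cpu->last_guest_exit_ns != 0 &&
	       now_ns - cpu->last_guest_exit_ns < GRACE_NS;
}
```

With zero_on_sched_out == false, a single guest exit keeps model_rcu_skips_rcuc() true for the full window no matter what the task does afterwards, which is the "KVM_RUN once" exemption. With zero_on_sched_out == true, the check goes false as soon as the vCPU is put, which is the same window as with the in_guest_run_loop flag.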