From: Leonardo Bras
To: "Paul E. McKenney"
Cc: Leonardo Bras, Sean Christopherson, Paolo Bonzini, Frederic Weisbecker,
    Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng, Steven Rostedt,
    Mathieu Desnoyers, Lai Jiangshan, Zqiang, Marcelo Tosatti,
    kvm@vger.kernel.org, linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
Date: Wed, 8 May 2024 03:19:01 -0300
References: <3b2c222b-9ef7-43e2-8ab3-653a5ee824d4@paulmck-laptop> <663a659d-3a6f-4bec-a84b-4dd5fd16c3c1@paulmck-laptop> <0e239143-65ed-445a-9782-e905527ea572@paulmck-laptop>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 07, 2024 at 08:22:42PM -0700, Paul E. McKenney wrote:
> On Tue, May 07, 2024 at 11:51:15PM -0300, Leonardo Bras wrote:
> > On Tue, May 07, 2024 at 05:08:54PM -0700, Sean Christopherson wrote:
> > > On Tue, May 07, 2024, Sean Christopherson wrote:
> > > > On Tue, May 07, 2024, Paul E. McKenney wrote:
>
> [ . . . ]
>
> > > > > But if we do need RCU to be more aggressive about treating guest execution as
> > > > > an RCU quiescent state within the host, that additional check would be an
> > > > > excellent way of making that happen.
> > > >
> > > > It's not clear to me that being more agressive is warranted.
> > > > If my understanding of the existing @user check is correct, we _could_ achieve
> > > > similar functionality for vCPU tasks by defining a rule that KVM must never enter
> > > > an RCU critical section with PF_VCPU set and IRQs enabled, and then rcu_pending()
> > > > could check PF_VCPU. On x86, this would be relatively straightforward (hack-a-patch
> > > > below), but I've no idea what it would look like on other architectures.
> > > >
> > > > But the value added isn't entirely clear to me, probably because I'm still missing
> > > > something. KVM will have *very* recently called __ct_user_exit(CONTEXT_GUEST) to
> > > > note the transition from guest to host kernel. Why isn't that a sufficient hook
> > > > for RCU to infer grace period completion?
> >
> > This is one of the solutions I tested when I was trying to solve the bug:
> > - Report quiescent state both in guest entry & guest exit.
> >
> > It improves the bug, but has 2 issues compared to the timing alternative:
> > 1 - Saving jiffies to a per-cpu local variable is usually cheaper than
> >     reporting a quiescent state
> > 2 - If we report it on guest_exit() and some other cpu requests a grace
> >     period in the next few cpu cycles, there is chance a timer interrupt
> >     can trigger rcu_core() before the next guest_entry, which would
> >     introduce unnecessary latency, and cause be the issue we are trying to
> >     fix.
> >
> > I mean, it makes the bug reproduce less, but do not fix it.
>
> OK, then it sounds like something might be needed, but again, I must
> defer to you guys on the need.
>
> If there is a need, what are your thoughts on the approach that Sean
> suggested?

Something just hit me, and maybe I need to propose something more generic.

But I need some help with a question first:
- Let's forget about KVM for a few seconds, and focus on host userspace:
  Suppose we have a high-priority (user) task running on a nohz_full CPU,
  and it gets interrupted (by an IRQ, let's say).
  Is it possible that the interrupting task gets interrupted by the timer
  interrupt, which will check rcu_pending() and get true? (1)
  (Or is there any protection against that kind of scenario?) (2)

1)
If there is any possibility of this happening, maybe we could consider fixing
it by adding some kind of generic timeout in the RCU code, to be used in
nohz_full, so that it keeps track of the last time a quiescent state happened
on this CPU, and makes rcu_pending() return false if one happened in the last
N jiffies.

In this case, we could also report a quiescent state in guest_exit, and make
use of the above generic RCU timeout to avoid having any rcu_core() running in
those switching moments.

2)
On the other hand, if there are mechanisms in place for avoiding such a
scenario, it could justify adding a similar mechanism to KVM guest_exit /
guest_entry. If adding such a mechanism is hard or expensive, we could use
the KVM-only timeout previously suggested to avoid what we are currently
hitting.

Could we use both a timeout & context tracking in this scenario? Yes.
But why do that, if the timeout would work just as well?

If I missed something, please let me know. :)

Thanks!
Leo
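
For illustration only, the generic timeout described in (1) could be modeled
roughly as below. This is a userspace sketch, not kernel code: every name in
it (sketch_jiffies, sketch_note_qs(), sketch_rcu_pending(), SKETCH_QS_WINDOW)
is hypothetical, the window of 2 jiffies is an arbitrary stand-in for "N",
and the real rcu_pending() checks considerably more than this. It only shows
the "a quiescent state happened in the last N jiffies, so report nothing
pending" rule.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS   4
#define SKETCH_QS_WINDOW 2UL	/* hypothetical "N jiffies" window */

/* Stand-in for the kernel's jiffies counter (assumption for this model). */
static unsigned long sketch_jiffies;

/* Per-CPU record of when the last quiescent state was noted. */
static unsigned long sketch_last_qs[SKETCH_NR_CPUS];
static bool sketch_qs_seen[SKETCH_NR_CPUS];

/* Would be called wherever a quiescent state is noted, e.g. at guest exit. */
void sketch_note_qs(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	sketch_last_qs[cpu] = sketch_jiffies;
	sketch_qs_seen[cpu] = true;
}

/*
 * Simplified model of the check done from the timer tick.
 * gp_waiting_on_cpu stands for "a grace period needs something from this
 * CPU"; everything else the real check looks at is left out.
 */
bool sketch_rcu_pending(int cpu, bool gp_waiting_on_cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);

	if (!gp_waiting_on_cpu)
		return false;

	/* Unsigned subtraction keeps the comparison valid across wraparound. */
	if (sketch_qs_seen[cpu] &&
	    sketch_jiffies - sketch_last_qs[cpu] < SKETCH_QS_WINDOW)
		return false;	/* recent quiescent state: skip rcu_core() */

	return true;
}
```

In this model, guest exit would call sketch_note_qs() for the current CPU,
and a tick arriving within the window would see sketch_rcu_pending() return
false; once the window has passed without a new quiescent state, it returns
true again.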