Date: Fri, 3 May 2024 14:29:57 -0700
References: <20240328171949.743211-1-leobras@redhat.com>
Subject: Re: [RFC PATCH v1 0/2] Avoid rcu_core() if CPU just left guest vcpu
From: Sean Christopherson
To: Leonardo Bras
Cc: Paolo Bonzini, "Paul E. McKenney", Frederic Weisbecker,
	Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
	Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan, Zqiang,
	Marcelo Tosatti, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	rcu@vger.kernel.org

On Fri, May 03, 2024, Leonardo Bras wrote:
> > KVM can provide that information with much better precision, e.g. KVM
> > knows when it's in the core vCPU run loop.
>
> That would not be enough.
> I need to present the application/problem to make a point:
>
> - There are multiple isolated physical CPUs (nohz_full) on which we want to
>   run KVM_RT vcpus, which will be running a real-time (low latency) task.
> - This task should not miss deadlines (RT), so we test the VM to make sure
>   the maximum latency on a long run does not exceed the latency requirement.
> - This vcpu will run on SCHED_FIFO, but has to run on lower priority than
>   rcuc, so we can avoid stalling other cpus.
> - There may be some scenarios where the vcpu will go back to userspace
>   (from KVM_RUN ioctl), and that does not mean it's good to interrupt
>   this to run other stuff (like rcuc).
>
> Now, I understand it will cover most of our issues if we have context
> tracking around the vcpu_run loop, since we can use that to decide not to
> run rcuc on the cpu if the interruption happened inside the loop.
>
> But IIUC we can have a thread that "just got out of the loop" getting
> interrupted by the timer, and asked to run rcu_core, which will be bad for
> latency.
>
> I understand that the chance may be statistically low, but happening once
> may be enough to crush the latency numbers.
>
> Now, I can't think of a place to put these context trackers in kvm code that
> would avoid the chance of rcuc running improperly, that's why the suggested
> timeout, even though it's ugly.
>
> About the false-positive, IIUC we could reduce it if we reset the per-cpu
> last_guest_exit on kvm_put.

Which then opens up the window that you're trying to avoid (IRQ arriving just
after the vCPU is put, before the CPU exits to userspace).

If you want the "entry to guest is imminent" status to be preserved across an
exit to userspace, then it seems like the flag really should be a property of
the task, not a property of the physical CPU.  Similar to how
rcu_is_cpu_rrupt_from_idle() detects that an idle task was interrupted, the
goal is to detect if a vCPU task was interrupted.

PF_VCPU is already "taken" for similar tracking, but if we want to track "this
task will soon enter an extended quiescent state", I don't see any reason to
make it specific to vCPU tasks.  Unless the kernel/KVM dynamically manages the
flag, which as above will create windows for false negatives, the kernel needs
to trust userspace to a certain extent no matter what.  E.g. even if KVM sets a
PF_xxx flag on the first KVM_RUN, nothing would prevent userspace from calling
into KVM to get KVM to set the flag, and then doing something else entirely
with the task.

So if we're comfortable relying on the 1 second timeout to guard against a
misbehaving userspace, IMO we might as well fully rely on that guardrail.  I.e.
add a generic PF_xxx flag (or whatever flag location is most appropriate) to
let userspace communicate to the kernel that it's a real-time task that spends
the overwhelming majority of its time in userspace or guest context, i.e.
should be given extra leniency with respect to rcuc if the task happens to be
interrupted while it's in kernel context.
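
To make the shape of that check concrete, here is a rough, standalone model of
the decision in plain C.  It is only a sketch under assumptions, not kernel
code: TASK_MOSTLY_EQS, struct task_model, last_eqs_exit_ns and rcuc_may_defer()
are all made-up names, and the real flag location and timestamp source would
need to be worked out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit: "task mostly runs in userspace/guest". Not a real PF_ flag. */
#define TASK_MOSTLY_EQS   (1u << 0)

/* The 1 second guardrail discussed above, in nanoseconds. */
#define RCUC_LENIENCY_NS  1000000000ULL

/* Toy stand-in for the relevant task state (this is NOT struct task_struct). */
struct task_model {
	unsigned int flags;
	uint64_t last_eqs_exit_ns;	/* assumed: when the task last left userspace/guest */
};

/*
 * Return true if running rcuc could be skipped for now when this task is
 * interrupted in kernel context at now_ns.  The flag is only a hint from
 * userspace; the timeout is what bounds the damage if userspace misbehaves.
 */
static bool rcuc_may_defer(const struct task_model *t, uint64_t now_ns)
{
	if (!(t->flags & TASK_MOSTLY_EQS))
		return false;

	/* Treat a clock that appears to go backwards as "no leniency". */
	if (now_ns < t->last_eqs_exit_ns)
		return false;

	return now_ns - t->last_eqs_exit_ns < RCUC_LENIENCY_NS;
}
```

The point of keying off the task rather than the CPU is that the state follows
the vCPU thread across an exit to userspace, while the timeout alone bounds how
long a task that sets the flag and then sits in the kernel can hold off rcuc.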