From: Thomas Gleixner
To: "Vineeth Pillai (Google)", Ben Segall, Borislav Petkov,
	Daniel Bristot de Oliveira, Dave Hansen, Dietmar Eggemann,
	"H. Peter Anvin", Ingo Molnar, Juri Lelli, Mel Gorman, Paolo Bonzini,
	Andy Lutomirski, Peter Zijlstra, Sean Christopherson, Steven Rostedt,
	Valentin Schneider, Vincent Guittot, Vitaly Kuznetsov, Wanpeng Li
Cc: "Vineeth Pillai (Google)", Suleiman Souhlal, Masami Hiramatsu,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, x86@kernel.org,
	Joel Fernandes
Subject: Re: [RFC PATCH 8/8] irq: boost/unboost in irq/nmi entry/exit and softirq
In-Reply-To: <20231214024727.3503870-9-vineeth@bitbyteword.org>
References: <20231214024727.3503870-1-vineeth@bitbyteword.org>
	<20231214024727.3503870-9-vineeth@bitbyteword.org>
Date: Fri, 15 Dec 2023 18:26:51 +0100
Message-ID: <87zfybml5w.ffs@tglx>

On Wed, Dec 13 2023 at 21:47, Vineeth Pillai (Google) wrote:
> The host proactively boosts the VCPU threads during irq/nmi injection.
> However, the host is unaware of posted interrupts, and therefore, the
> guest should request a boost if it has not already been boosted.
>
> Similarly, guest should request an unboost on irq/nmi/softirq exit if
> the vcpu doesn't need the boost any more.

That's giving a hint but no context for someone who is not familiar with
the problem you are trying to solve here.

> @@ -327,6 +327,13 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
>  		.exit_rcu = false,
>  	};
>
> +#ifdef CONFIG_PARAVIRT_SCHED
> +	instrumentation_begin();

Slapping instrumentation_begin() at it silences the objtool checker, but
that does not make it correct in any way.

You _cannot_ call random code _before_ the kernel has established
context. It's clearly documented:

  https://www.kernel.org/doc/html/latest/core-api/entry.html

No?

> +	if (pv_sched_enabled())
> +		pv_sched_boost_vcpu_lazy();
> +	instrumentation_end();
> +#endif
> +
>  	if (user_mode(regs)) {
>  		irqentry_enter_from_user_mode(regs);
>  		return ret;
> @@ -452,6 +459,18 @@ noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
>  		if (state.exit_rcu)
>  			ct_irq_exit();
>  	}
> +
> +#ifdef CONFIG_PARAVIRT_SCHED
> +	instrumentation_begin();

Broken too.

> +	/*
> +	 * On irq exit, request a deboost from hypervisor if no softirq pending
> +	 * and current task is not RT and !need_resched.
> +	 */
> +	if (pv_sched_enabled() && !local_softirq_pending() &&
> +	    !need_resched() && !task_is_realtime(current))
> +		pv_sched_unboost_vcpu();
> +	instrumentation_end();
> +#endif
>  }
>
>  irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
> @@ -469,6 +488,11 @@ irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
>  	kmsan_unpoison_entry_regs(regs);
>  	trace_hardirqs_off_finish();
>  	ftrace_nmi_enter();
> +
> +#ifdef CONFIG_PARAVIRT_SCHED
> +	if (pv_sched_enabled())
> +		pv_sched_boost_vcpu_lazy();
> +#endif
>  	instrumentation_end();
>
>  	return irq_state;
> @@ -482,6 +506,12 @@ void noinstr irqentry_nmi_exit(struct pt_regs *regs, irqentry_state_t irq_state)
>  		trace_hardirqs_on_prepare();
>  		lockdep_hardirqs_on_prepare();
>  	}
> +
> +#ifdef CONFIG_PARAVIRT_SCHED
> +	if (pv_sched_enabled() && !in_hardirq() && !local_softirq_pending() &&
> +	    !need_resched() && !task_is_realtime(current))
> +		pv_sched_unboost_vcpu();
> +#endif

Symmetry is overrated. Just pick a spot and slap your hackery in.

Aside from that, this whole #ifdeffery is tasteless at best.

>  	instrumentation_end();
> +#ifdef CONFIG_PARAVIRT_SCHED
> +	if (pv_sched_enabled())
> +		pv_sched_boost_vcpu_lazy();
> +#endif

But what's worse is that this is just another approach of sprinkling
random hints all over the place and seeing what sticks.

Abusing KVM as the man in the middle to communicate between guest and
host scheduler is creative, but ill-defined.

From the host scheduler POV the vCPU is just a user space thread, and
making the guest special is just the wrong approach.

The kernel has no proper general interface to convey information from a
user space task to the scheduler.

So instead of adding some ad-hoc KVM hackery, the right thing is to
solve this problem from the ground up in a generic way. KVM can then
just utilize that instead of having the special snow-flake hackery,
which is just a maintainability nightmare.

Thanks,

        tglx