Received: by 2002:a25:e74b:0:0:0:0:0 with SMTP id e72csp2417098ybh; Fri, 24 Jul 2020 12:13:02 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzcAJ6sxGsLT1oxM3ytEJS3BWIZzMyzwTzdP7fbzxWz1YxRwkDshnxdP218Y+n0im4kGN7Y X-Received: by 2002:a17:906:cc13:: with SMTP id ml19mr10441748ejb.288.1595617982586; Fri, 24 Jul 2020 12:13:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1595617982; cv=none; d=google.com; s=arc-20160816; b=BM0IBwvi29xogcKQZ2oPNkUx18iO228D/DTQ2m8hPdV5Krwg5KglgsaoqCe6BQVjWG gGoZt2J7lKRbdq5aAygob2wR1SFTA2OD1kgqXjDsUsYquRPgktNPOh0xzqor767OYTlC 4iOh5GLUfCBaLaPBm+hkJh0DsM3GZ2uext/r/0bm5fqloZuX7mP8mmdOviN5FrEWW5ZD /tPGaIJ+Azh26KznX5yOUWfFJpXoC3M9psxvXRppZz6QAVz03cayQIXnh4a7aXiy98Bq Q4HgKI8Q7WBxBo7LhKLEt02+gRuoD5iLzxfWy2F2JsBM56M/wcek4ls0hqZmaQiVOlAI G/BA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :robot-unsubscribe:robot-id:message-id:mime-version:references :in-reply-to:cc:subject:to:reply-to:from:dkim-signature :dkim-signature:date; bh=xb7gkOw79hTP+VUseD+J3vUtjmg9zwEsI6NtW/Q9tlM=; b=OaYjZp4pM7dzasDedQt0PWilWOdJMSmG6GA+2p4clw6X7e5QcuZxR1t1bw0uptuU9z 6ZoL7fu7+rKlZDMx1ce9BHJzEMNvAn55FWghMlVQ8q9oFAoRlJlXc3c+4/9Mpn+PKXtF CdNrKsxH8syCDieXEgEr9lNaH4Bz8582SAK+wdBxDx1SMLXAxcn/a9BXB9Yk33Czi9bY lcyJ5MQZ42sy1F15wW6jnUJfKRD1FNdlg0vbmlLfAa1noMI6R7Ff2NIRgL/JSlFEVtD2 zGy9seOhZDMQ+4Uayw4Jgj/xOTeYZyZ6Syq+goWjlR0yPg84acL+Y73LFlqINVSz859R dapA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linutronix.de header.s=2020 header.b=yvqkxOjs; dkim=neutral (no key) header.i=@vger.kernel.org; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id 12si1046784edx.70.2020.07.24.12.12.40; Fri, 24 Jul 2020 12:13:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@linutronix.de header.s=2020 header.b=yvqkxOjs; dkim=neutral (no key) header.i=@vger.kernel.org; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726984AbgGXTI7 (ORCPT + 99 others); Fri, 24 Jul 2020 15:08:59 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:39880 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726381AbgGXTI6 (ORCPT ); Fri, 24 Jul 2020 15:08:58 -0400 Date: Fri, 24 Jul 2020 19:08:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1595617735; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xb7gkOw79hTP+VUseD+J3vUtjmg9zwEsI6NtW/Q9tlM=; b=yvqkxOjsvYzog+g5FrZ3olWQ4k9WjhMi/Jgz82wfNmaw5oGtxqesJWB1DHHKTmQwjObNc4 v03BZn92cYqPz9Y7NflwA1ZCNCDgklxpUBuUnuQQ92QdYw3SU5lYwFfXM0M0gAblXHAclW YJjumJ0wluqg0UdME05DetWdbQ6CIuDePxdumJW4uGj9UdUpvIwD8sYn1jvPiFL2T15eUR 6W6t4YWiADQq5Z7FceOOW/flnhcEda7BMYdR+6M67lID/M5yyaJ+7/0efI0Ox1xd0+KVfU Tjde7NEKt0S/kDDlyXwsNJ3Tq5iSN3TsHnSF4uc+NXBSCIqj07v8P62H9DkVlg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1595617735; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xb7gkOw79hTP+VUseD+J3vUtjmg9zwEsI6NtW/Q9tlM=; b=ilOWMxXdQG44kkEnfHIrpfCdLbxFjFZk4cdSBVIVpkgc2KLnaTdB8+M+J0suwium1Ut0Bh 2cyywGX2p5IEuMBw== From: "tip-bot2 for Thomas Gleixner" Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: core/entry] entry: Provide infrastructure for work before transitioning to guest mode Cc: Thomas Gleixner , x86 , LKML In-Reply-To: <20200722220519.833296398@linutronix.de> References: <20200722220519.833296398@linutronix.de> MIME-Version: 1.0 Message-ID: <159561773439.4006.6664021782988649777.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The following commit has been merged into the core/entry branch of tip: Commit-ID: 935ace2fb5cc49ae88bd1f1735ddc51cdc2ebfb3 Gitweb: https://git.kernel.org/tip/935ace2fb5cc49ae88bd1f1735ddc51cdc2ebfb3 Author: Thomas Gleixner AuthorDate: Wed, 22 Jul 2020 23:59:59 +02:00 Committer: Thomas Gleixner CommitterDate: Fri, 24 Jul 2020 15:03:42 +02:00 entry: Provide infrastructure for work before transitioning to guest mode Entering a guest is similar to exiting to user space. Pending work like handling signals, rescheduling, task work etc. needs to be handled before that. Provide generic infrastructure to avoid duplication of the same handling code all over the place. The transfer to guest mode handling is different from the exit to usermode handling, e.g. vs. rseq and live patching, so a separate function is used. The initial list of work items handled is: TIF_SIGPENDING, TIF_NEED_RESCHED, TIF_NOTIFY_RESUME Architecture specific TIF flags can be added via defines in the architecture specific include files. The calling convention is also different from the syscall/interrupt entry functions as KVM invokes this from the outer vcpu_run() loop with interrupts and preemption enabled. To prevent missing a pending work item it invokes a check for pending TIF work from interrupt disabled code right before transitioning to guest mode. The lockdep, RCU and tracing state handling is also done directly around the switch to and from guest mode. Signed-off-by: Thomas Gleixner Link: https://lkml.kernel.org/r/20200722220519.833296398@linutronix.de --- include/linux/entry-kvm.h | 80 ++++++++++++++++++++++++++++++++++++++- include/linux/kvm_host.h | 8 ++++- kernel/entry/Makefile | 3 +- kernel/entry/kvm.c | 51 ++++++++++++++++++++++++- virt/kvm/Kconfig | 3 +- 5 files changed, 144 insertions(+), 1 deletion(-) create mode 100644 include/linux/entry-kvm.h create mode 100644 kernel/entry/kvm.c diff --git a/include/linux/entry-kvm.h b/include/linux/entry-kvm.h new file mode 100644 index 0000000..0cef17a --- /dev/null +++ b/include/linux/entry-kvm.h @@ -0,0 +1,80 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __LINUX_ENTRYKVM_H +#define __LINUX_ENTRYKVM_H + +#include + +/* Transfer to guest mode work */ +#ifdef CONFIG_KVM_XFER_TO_GUEST_WORK + +#ifndef ARCH_XFER_TO_GUEST_MODE_WORK +# define ARCH_XFER_TO_GUEST_MODE_WORK (0) +#endif + +#define XFER_TO_GUEST_MODE_WORK \ + (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \ + _TIF_NOTIFY_RESUME | ARCH_XFER_TO_GUEST_MODE_WORK) + +struct kvm_vcpu; + +/** + * arch_xfer_to_guest_mode_handle_work - Architecture specific xfer to guest + * mode work handling function. + * @vcpu: Pointer to current's VCPU data + * @ti_work: Cached TIF flags gathered in xfer_to_guest_mode_handle_work() + * + * Invoked from xfer_to_guest_mode_handle_work(). Defaults to NOOP. Can be + * replaced by architecture specific code. + */ +static inline int arch_xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu, + unsigned long ti_work); + +#ifndef arch_xfer_to_guest_mode_work +static inline int arch_xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu, + unsigned long ti_work) +{ + return 0; +} +#endif + +/** + * xfer_to_guest_mode_handle_work - Check and handle pending work which needs + * to be handled before going to guest mode + * @vcpu: Pointer to current's VCPU data + * + * Returns: 0 or an error code + */ +int xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu); + +/** + * __xfer_to_guest_mode_work_pending - Check if work is pending + * + * Returns: True if work pending, False otherwise. + * + * Bare variant of xfer_to_guest_mode_work_pending(). Can be called from + * interrupt enabled code for racy quick checks with care. + */ +static inline bool __xfer_to_guest_mode_work_pending(void) +{ + unsigned long ti_work = READ_ONCE(current_thread_info()->flags); + + return !!(ti_work & XFER_TO_GUEST_MODE_WORK); +} + +/** + * xfer_to_guest_mode_work_pending - Check if work is pending which needs to be + * handled before returning to guest mode + * + * Returns: True if work pending, False otherwise. + * + * Has to be invoked with interrupts disabled before the transition to + * guest mode. + */ +static inline bool xfer_to_guest_mode_work_pending(void) +{ + lockdep_assert_irqs_disabled(); + return __xfer_to_guest_mode_work_pending(); +} +#endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ + +#endif diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index d564855..ac83e9c 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1439,4 +1439,12 @@ int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn, uintptr_t data, const char *name, struct task_struct **thread_ptr); +#ifdef CONFIG_KVM_XFER_TO_GUEST_WORK +static inline void kvm_handle_signal_exit(struct kvm_vcpu *vcpu) +{ + vcpu->run->exit_reason = KVM_EXIT_INTR; + vcpu->stat.signal_exits++; +} +#endif /* CONFIG_KVM_XFER_TO_GUEST_WORK */ + #endif diff --git a/kernel/entry/Makefile b/kernel/entry/Makefile index c207d20..34c8a3f 100644 --- a/kernel/entry/Makefile +++ b/kernel/entry/Makefile @@ -9,4 +9,5 @@ KCOV_INSTRUMENT := n CFLAGS_REMOVE_common.o = -fstack-protector -fstack-protector-strong CFLAGS_common.o += -fno-stack-protector -obj-$(CONFIG_GENERIC_ENTRY) += common.o +obj-$(CONFIG_GENERIC_ENTRY) += common.o +obj-$(CONFIG_KVM_XFER_TO_GUEST_WORK) += kvm.o diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c new file mode 100644 index 0000000..eb1a8a4 --- /dev/null +++ b/kernel/entry/kvm.c @@ -0,0 +1,51 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include + +static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work) +{ + do { + int ret; + + if (ti_work & _TIF_SIGPENDING) { + kvm_handle_signal_exit(vcpu); + return -EINTR; + } + + if (ti_work & _TIF_NEED_RESCHED) + schedule(); + + if (ti_work & _TIF_NOTIFY_RESUME) { + clear_thread_flag(TIF_NOTIFY_RESUME); + tracehook_notify_resume(NULL); + } + + ret = arch_xfer_to_guest_mode_handle_work(vcpu, ti_work); + if (ret) + return ret; + + ti_work = READ_ONCE(current_thread_info()->flags); + } while (ti_work & XFER_TO_GUEST_MODE_WORK || need_resched()); + return 0; +} + +int xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu) +{ + unsigned long ti_work; + + /* + * This is invoked from the outer guest loop with interrupts and + * preemption enabled. + * + * KVM invokes xfer_to_guest_mode_work_pending() with interrupts + * disabled in the inner loop before going into guest mode. No need + * to disable interrupts here. + */ + ti_work = READ_ONCE(current_thread_info()->flags); + if (!(ti_work & XFER_TO_GUEST_MODE_WORK)) + return 0; + + return xfer_to_guest_mode_work(vcpu, ti_work); +} +EXPORT_SYMBOL_GPL(xfer_to_guest_mode_handle_work); diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index aad9284..1c37ccd 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -60,3 +60,6 @@ config HAVE_KVM_VCPU_RUN_PID_CHANGE config HAVE_KVM_NO_POLL bool + +config KVM_XFER_TO_GUEST_WORK + bool