Date: Fri, 22 Jul 2022 16:02:39 -0700
In-Reply-To: <20220722230241.1944655-1-avagin@google.com>
Message-Id: <20220722230241.1944655-4-avagin@google.com>
References: <20220722230241.1944655-1-avagin@google.com>
Subject: [PATCH 3/5] KVM/x86: add a new hypercall to execute host system calls.
From: Andrei Vagin
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Andrei Vagin,
    Sean Christopherson, Wanpeng Li, Vitaly Kuznetsov, Jianfeng Tan,
    Adin Scannell, Konstantin Bogomolov, Etienne Perot

There is a class of applications that use KVM to manage multiple address
spaces rather than as an isolation boundary. In all other respects, they
are normal processes that execute system calls, handle signals, etc.
Currently, each time such a process needs to interact with the operating
system, it has to switch to the host and back to the guest. These full
switches are expensive and significantly increase the overhead of system
calls. The new hypercall reduces this overhead by more than a factor of
two.

The new hypercall allows a guest to execute host system calls. As with
native calls, seccomp filters are executed before the calls. The
hypercall takes one argument, a pointer to a pt_regs structure in the
host address space, which provides the registers for executing a system
call according to the calling convention: the system call number is
passed in %rax, arguments are passed in %rdi, %rsi, %rdx, %r10, %r8 and
%r9, and the return code is stored in %rax.

The hypercall returns 0 if the system call has been executed. Otherwise,
it returns an error code.

Signed-off-by: Andrei Vagin
---
 Documentation/virt/kvm/x86/hypercalls.rst | 18 +++++++++++++
 arch/x86/kvm/x86.c                        | 33 +++++++++++++++++++++++
 include/uapi/linux/kvm_para.h             |  1 +
 3 files changed, 52 insertions(+)

diff --git a/Documentation/virt/kvm/x86/hypercalls.rst b/Documentation/virt/kvm/x86/hypercalls.rst
index e56fa8b9cfca..eb18f2128bfe 100644
--- a/Documentation/virt/kvm/x86/hypercalls.rst
+++ b/Documentation/virt/kvm/x86/hypercalls.rst
@@ -190,3 +190,21 @@ the KVM_CAP_EXIT_HYPERCALL capability. Userspace must enable that capability
 before advertising KVM_FEATURE_HC_MAP_GPA_RANGE in the guest CPUID. In
 addition, if the guest supports KVM_FEATURE_MIGRATION_CONTROL, userspace
 must also set up an MSR filter to process writes to MSR_KVM_MIGRATION_CONTROL.
+
+9. KVM_HC_HOST_SYSCALL
+----------------------
+:Architecture: x86
+:Status: active
+:Purpose: Execute a specified system call.
+
+- a0: pointer to a pt_regs structure in the host address space.
+
+This hypercall lets a guest execute host system calls. The first and only
+argument represents process registers that are used as input and output
+parameters.
+
+Returns 0 if the requested syscall has been executed. Otherwise, it returns an
+error code.
+
+**Implementation note**: The KVM_CAP_PV_HOST_SYSCALL capability has to be set
+to use this hypercall.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 19e634768161..aa54e180c9d4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -81,6 +81,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -9253,6 +9254,27 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu)
 	return kvm_skip_emulated_instruction(vcpu);
 }
 
+static int kvm_pv_host_syscall(unsigned long a0)
+{
+	struct pt_regs pt_regs = {};
+	unsigned long sysno;
+
+	if (copy_from_user(&pt_regs, (void *)a0, sizeof(pt_regs)))
+		return -EFAULT;
+
+	sysno = pt_regs.ax;
+	pt_regs.orig_ax = pt_regs.ax;
+	pt_regs.ax = -ENOSYS;
+
+	do_ksyscall_64(sysno, &pt_regs);
+
+	pt_regs.orig_ax = -1;
+	if (copy_to_user((void *)a0, &pt_regs, sizeof(pt_regs)))
+		return -EFAULT;
+
+	return 0;
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -9318,6 +9340,7 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		kvm_sched_yield(vcpu, a0);
 		ret = 0;
 		break;
+
 	case KVM_HC_MAP_GPA_RANGE: {
 		u64 gpa = a0, npages = a1, attrs = a2;
 
@@ -9340,6 +9363,16 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 		vcpu->arch.complete_userspace_io = complete_hypercall_exit;
 		return 0;
 	}
+
+	case KVM_HC_HOST_SYSCALL:
+		if (!guest_pv_has(vcpu, KVM_FEATURE_PV_HOST_SYSCALL))
+			break;
+
+		kvm_vcpu_srcu_read_unlock(vcpu);
+		ret = kvm_pv_host_syscall(a0);
+		kvm_vcpu_srcu_read_lock(vcpu);
+		break;
+
 	default:
 		ret = -KVM_ENOSYS;
 		break;
diff --git a/include/uapi/linux/kvm_para.h b/include/uapi/linux/kvm_para.h
index 960c7e93d1a9..3fcfb3241f35 100644
--- a/include/uapi/linux/kvm_para.h
+++ b/include/uapi/linux/kvm_para.h
@@ -30,6 +30,7 @@
 #define KVM_HC_SEND_IPI		10
 #define KVM_HC_SCHED_YIELD		11
 #define KVM_HC_MAP_GPA_RANGE		12
+#define KVM_HC_HOST_SYSCALL		13
 
 /*
  * hypercalls use architecture specific
-- 
2.37.1.359.gd136c6c3e2-goog
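
The register convention used by kvm_pv_host_syscall() (system call
number read from ax, arguments in %rdi, %rsi, %rdx, %r10, %r8 and %r9,
result written back to ax, orig_ax reset to -1) can be modeled in plain
userspace C. The sketch below is neither kernel nor guest code: struct
demo_regs, demo_host_syscall() and demo_call() are hypothetical names
introduced for illustration, syscall(2) stands in for do_ksyscall_64(),
and the copy_from_user()/copy_to_user() steps are left out. It assumes
an x86-64 Linux host.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the x86-64 struct pt_regs. Only the fields
 * that the calling convention touches are modeled.
 */
struct demo_regs {
	unsigned long r10, r9, r8;
	unsigned long ax, dx, si, di;
	unsigned long orig_ax;
};

/*
 * Userspace model of kvm_pv_host_syscall(): read the syscall number
 * from ax, dispatch the call, and store the result back in ax.
 * Returns 0 once the call has been dispatched.
 */
int demo_host_syscall(struct demo_regs *regs)
{
	unsigned long sysno;
	long ret;

	assert(regs != NULL);

	sysno = regs->ax;
	regs->orig_ax = regs->ax;
	regs->ax = (unsigned long)-ENOSYS;

	/*
	 * syscall(2) reports failure as -1 plus errno, while the kernel
	 * ABI reports it as -errno in %rax, so convert here.
	 */
	errno = 0;
	ret = syscall((long)sysno, regs->di, regs->si, regs->dx,
		      regs->r10, regs->r8, regs->r9);
	if (ret == -1 && errno != 0)
		ret = -(long)errno;
	regs->ax = (unsigned long)ret;

	regs->orig_ax = (unsigned long)-1;
	return 0;
}

/* Hypothetical caller-side helper: marshal up to three arguments. */
long demo_call(long nr, unsigned long a0, unsigned long a1,
	       unsigned long a2)
{
	struct demo_regs regs = {
		.ax = (unsigned long)nr,
		.di = a0,
		.si = a1,
		.dx = a2,
	};

	if (demo_host_syscall(&regs))
		return -EFAULT;

	return (long)regs.ax;
}
```

In this model, demo_call(SYS_getpid, 0, 0, 0) yields the same value as
getpid(), and a failing call such as demo_call(SYS_close, -1UL, 0, 0)
yields a negative errno value in the result, mirroring how the
hypercall's own return value (0 or an error code) is kept separate from
the system call's return code in ax.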