Date: Fri, 11 Nov 2022 08:26:02 +0000
Message-ID: <87fsepvqw5.wl-maz@kernel.org>
From: Marc Zyngier
To: Oliver Upton
Cc: James Morse, Alexandru Elisei, Suzuki K Poulose, Catalin Marinas,
    Will Deacon, Paolo Bonzini, Raghavendra Rao Ananta,
    linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    kvmarm@lists.cs.columbia.edu, linux-kernel@vger.kernel.org,
    kvm@vger.kernel.org
Subject: Re: [RFC PATCH 2/3] KVM: arm64: Allow userspace to trap SMCCC sub-ranges
References: <20221110015327.3389351-1-oliver.upton@linux.dev>
    <20221110015327.3389351-3-oliver.upton@linux.dev>
    <86o7tfov7v.wl-maz@kernel.org>

On Thu, 10 Nov 2022 21:13:54 +0000, Oliver Upton wrote:
>
> On Thu, Nov 10, 2022 at 12:22:12PM +0000, Marc Zyngier wrote:
> > > +static bool kvm_hvc_call_user_trapped(struct kvm_vcpu *vcpu, u32 func_id)
> > > +{
> > > +	struct kvm *kvm = vcpu->kvm;
> > > +	unsigned long *bmap = &kvm->arch.smccc_feat.user_trap_bmap;
> > > +
> > > +	switch (ARM_SMCCC_OWNER_NUM(func_id)) {
> > > +	case ARM_SMCCC_OWNER_ARCH:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_ARCH, bmap);
> > > +	case ARM_SMCCC_OWNER_CPU:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_CPU, bmap);
> > > +	case ARM_SMCCC_OWNER_SIP:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_SIP, bmap);
> > > +	case ARM_SMCCC_OWNER_OEM:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_OEM, bmap);
> > > +	case ARM_SMCCC_OWNER_STANDARD:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_STANDARD, bmap);
> > > +	case ARM_SMCCC_OWNER_STANDARD_HYP:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_STANDARD_HYP, bmap);
> > > +	case ARM_SMCCC_OWNER_VENDOR_HYP:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_VENDOR_HYP, bmap);
> > > +	case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_TRUSTED_APP, bmap);
> > > +	case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END:
> > > +		return test_bit(KVM_ARM_USER_HYPERCALL_OWNER_TRUSTED_OS, bmap);
> > > +	default:
> > > +		return false;
> > > +	}
> >
> > You have multiple problems here:
> >
> > - the granularity is way too coarse. You want to express arbitrary
> >   ranges, and not necessarily grab a whole owner range.
> >
> > - you have now an overlap between ranges that are handled in the
> >   kernel (PSCI, spectre mitigations) and ranges that userspace wants
> >   to observe. Not good.
>
> We need to come to agreement on what degree of mix-and-match should be
> supported.
>
> Spectre really ought to be in the kernel, and I don't think anyone is
> particularly excited about reimplementing PSCI. Right now my interest
> in this starts and ends with forwarding the vendor-specific hypercall
> range to userspace, allowing something like Hyper-V PV on KVM.
>
> > If we are going down this road, this can only be done at the
> > *function* level. And userspace must know that the kernel will refuse
> > to forward some ranges.
>
> The goal of what I was trying to get at is that either the kernel or
> userspace takes ownership of a range that has an ABI, but not both. i.e.
> you really wouldn't want some VMM or cloud provider trapping portions of
> KVM's vendor-specific range while still reporting a 'vanilla' ABI at the
> time of discovery. Same goes for PSCI, TRNG, etc.

But I definitely think this is one of the major use cases. For
example, there is value in taking PSCI to userspace in order to
implement a newer version of the spec, or to support sub-features that
KVM doesn't (want to) implement. I don't think this changes the ABI
from the guest perspective.

pKVM also has a use case for this, where userspace gets a notification
of the hypercall that a guest has performed to share memory.

Communication with a TEE is also on the cards, as would be an FF-A
implementation. All of this could be implemented in KVM, or in
userspace, depending on what users of these misfeatures want to do.

>
> > So obviously, this cannot be a simple bitmap. Making it a radix tree
> > (or an xarray, which is basically the same thing) could work. And the
> > filtering request from userspace can be similar to what we have for
> > the PMU filters.
>
> Right, we'll need a more robust data structure for all this.
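
For illustration only, a minimal userspace sketch of what a
function-level, range-based filter could look like, loosely modeled on
the idea behind the PMU event filter (a range of IDs plus an action).
Everything here is hypothetical: `struct hvc_filter_range`,
`enum hvc_action` and the helper names do not exist in the KVM UAPI,
and the linear scan stands in for the xarray an in-kernel version would
more likely use. The validation step reflects the point that the kernel
may refuse to forward some ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical actions; none of these names exist in the KVM UAPI. */
enum hvc_action {
	HVC_HANDLE_IN_KERNEL = 0,	/* default when no range matches */
	HVC_FORWARD_TO_USER,
};

/* Hypothetical filter entry: an inclusive range of SMCCC function IDs. */
struct hvc_filter_range {
	uint32_t first;
	uint32_t last;
	enum hvc_action action;
};

/* True if [a_first, a_last] and [b_first, b_last] share any function ID. */
static inline bool ranges_overlap(uint32_t a_first, uint32_t a_last,
				  uint32_t b_first, uint32_t b_last)
{
	return a_first <= b_last && b_first <= a_last;
}

/*
 * Validate a request against the ranges the kernel refuses to forward.
 * Returns false if any requested range is malformed or touches a
 * reserved range.
 */
static inline bool hvc_filter_valid(const struct hvc_filter_range *req,
				    size_t nreq,
				    const struct hvc_filter_range *reserved,
				    size_t nres)
{
	for (size_t i = 0; i < nreq; i++) {
		if (req[i].first > req[i].last)
			return false;
		for (size_t j = 0; j < nres; j++)
			if (ranges_overlap(req[i].first, req[i].last,
					   reserved[j].first, reserved[j].last))
				return false;
	}
	return true;
}

/* Per-function lookup: first matching range wins, else stay in the kernel. */
static inline enum hvc_action
hvc_filter_lookup(const struct hvc_filter_range *f, size_t n, uint32_t func_id)
{
	for (size_t i = 0; i < n; i++)
		if (func_id >= f[i].first && func_id <= f[i].last)
			return f[i].action;
	return HVC_HANDLE_IN_KERNEL;
}
```

With a reserved range of 0x80007FFF to 0x80008000 (ARCH_WORKAROUND_2
and ARCH_WORKAROUND_1), a request for the whole arch owner range would
be rejected, while a request for 0x86000000 to 0x8600FFFF (vendor
hypervisor, SMC32 fast calls) would pass.
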
>
> My only concern is that communicating the hypercall filter between
> user/kernel with a set of ranges or function numbers is that we could be
> mutating what KVM *doesn't* already implement into an ABI of sorts.
>
> i.e. suppose that userspace wants to filter function(s) in an
> unallocated/unused range of function numbers. Later down the line KVM
> adds support for a new shiny thing and the filter becomes a subset of a
> now allocated range of calls. We then reject the filter due to the
> incongruence.

But isn't the problem to ask for ranges that are unallocated in the
first place? What semantics can userspace give to such a thing other
than replying "not implemented", which is what the kernel would do
anyway?

The more interesting problem is when you want to emulate another
hypervisor, and the vendor spaces overlap (a very likely outcome).
Somehow, this means overriding all the KVM-specific hypercalls, and
letting userspace deal with it. But again, this can be done on a
per-function basis.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
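
For illustration only, a small sketch of how an SMCCC function ID
splits into an owner field and a function number, which is why an
owner-level bitmap is coarser than a per-function filter. The `EX_`
macros and `ex_` helpers are names local to this sketch; the bit
positions follow the SMCCC function ID layout (owner in bits [29:24],
function number in bits [15:0]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout of an SMCCC function ID (names local to this sketch). */
#define EX_SMCCC_FAST_CALL	(UINT32_C(1) << 31)
#define EX_SMCCC_SMC64		(UINT32_C(1) << 30)
#define EX_SMCCC_OWNER_SHIFT	24
#define EX_SMCCC_OWNER_MASK	UINT32_C(0x3F)
#define EX_SMCCC_FUNC_MASK	UINT32_C(0xFFFF)

/* Owner field: 4 is the standard service range, 6 is vendor hypervisor. */
static inline uint32_t ex_smccc_owner(uint32_t func_id)
{
	return (func_id >> EX_SMCCC_OWNER_SHIFT) & EX_SMCCC_OWNER_MASK;
}

/* Function number within the owner's range. */
static inline uint32_t ex_smccc_func_num(uint32_t func_id)
{
	return func_id & EX_SMCCC_FUNC_MASK;
}

/* When this returns true, an owner-level filter cannot tell a from b. */
static inline bool ex_same_owner(uint32_t a, uint32_t b)
{
	return ex_smccc_owner(a) == ex_smccc_owner(b);
}
```

For example, 0x84000000 (PSCI_VERSION) and 0x84000050 (TRNG_VERSION)
both decode to owner 4, so trapping by owner would take both or
neither, whereas a per-function filter can separate them.
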