From: Dan Williams
Date: Fri, 29 Apr 2022 11:47:31 -0700
Subject: Re: [PATCH v3 00/21] TDX host kernel support
To: Dave Hansen
Cc: Kai Huang, Linux Kernel Mailing List, KVM list, Sean Christopherson,
    Paolo Bonzini, "Brown, Len", "Luck, Tony", Rafael J Wysocki,
    Reinette Chatre, Peter Zijlstra, Andi Kleen, "Kirill A.
    Shutemov", Kuppuswamy Sathyanarayanan, Isaku Yamahata
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 29, 2022 at 11:34 AM Dave Hansen wrote:
>
> On 4/29/22 10:48, Dan Williams wrote:
> >> But, neither of those really help with, say, a device-DAX mapping of
> >> TDX-*IN*capable memory handed to KVM. The "new syscall" would just
> >> throw up its hands and leave users with the same result: TDX can't be
> >> used. The new sysfs ABI for NUMA nodes wouldn't clearly apply to
> >> device-DAX because they don't respect the NUMA policy ABI.
> > They do have "target_node" attributes to associate node specific
> > metadata, and could certainly express target_node capabilities in its
> > own ABI. Then it's just a matter of making pfn_to_nid() do the right
> > thing so KVM kernel side can validate the capabilities of all inbound
> > pfns.
>
> Let's walk through how this would work with today's kernel on tomorrow's
> hardware, without KVM validating PFNs:
>
> 1. daxaddr mmap("/dev/dax1234")
> 2. kvmfd = open("/dev/kvm")
> 3. ioctl(KVM_SET_USER_MEMORY_REGION, { daxaddr };

At least for a file backed mapping the capability lookup could be done
here, no need to wait for the fault.

> 4. guest starts running
> 5. guest touches 'daxaddr'
> 6. Page fault handler maps 'daxaddr'
> 7. KVM finds new 'daxaddr' PTE
> 8. TDX code tries to add physical address to Secure-EPT
> 9. TDX "SEAMCALL" fails because page is not convertible
> 10. Guest dies
>
> All we can do to improve on that is call something that pledges to only
> map convertible memory at 'daxaddr'.
> We can't *actually* validate the
> physical addresses at mmap() time or even
> KVM_SET_USER_MEMORY_REGION-time because the memory might not have been
> allocated.
>
> Those pledges are hard for anonymous memory though. To fulfill the
> pledge, we not only have to validate that the NUMA policy is compatible
> at KVM_SET_USER_MEMORY_REGION, we also need to decline changes to the
> policy that might undermine the pledge.

I think it's less that the kernel needs to enforce a pledge and more
that an interface is needed to communicate the guest death reason.
I.e. "here is the impossible thing you asked for, next time set this
policy to avoid this problem".
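
To make the two ideas above concrete, here is a rough userspace model in C,
not kernel code. It sketches (a) doing the capability lookup at
region-registration time when the mapping is file backed, (b) deferring to
fault time for anonymous memory, and (c) turning the failure into a reason
the user can act on. Every name in it (struct region, set_memory_region(),
guest_fault(), region_status_reason()) is hypothetical and invented for
illustration; none of them is a KVM, TDX, or device-DAX interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model only: these are not kernel or KVM definitions. */
enum backing { BACKING_ANON, BACKING_FILE };

enum region_status {
    REGION_OK = 0,
    REGION_NOT_CONVERTIBLE, /* memory cannot be added to the Secure-EPT */
};

struct region {
    enum backing backing; /* anonymous vs. file backed (e.g. device-DAX) */
    bool convertible;     /* assumed per-region "TDX convertible" flag */
};

/*
 * Models step 3 (registering the memory region). For a file-backed
 * mapping the backing is assumed to be known already, so the capability
 * can be checked here. Anonymous memory may not be allocated yet, so
 * nothing can be rejected at this point.
 */
enum region_status set_memory_region(const struct region *r)
{
    assert(r != NULL);
    if (r->backing == BACKING_FILE && !r->convertible)
        return REGION_NOT_CONVERTIBLE;
    return REGION_OK;
}

/*
 * Models steps 5-9 (guest touches the address, the page is mapped, the
 * add to the Secure-EPT is attempted). This is the late failure point.
 */
enum region_status guest_fault(const struct region *r)
{
    assert(r != NULL);
    return r->convertible ? REGION_OK : REGION_NOT_CONVERTIBLE;
}

/* Models the "communicate the guest death reason" interface. */
const char *region_status_reason(enum region_status s)
{
    switch (s) {
    case REGION_OK:
        return "ok";
    case REGION_NOT_CONVERTIBLE:
        return "memory is not TDX-convertible; set a policy that "
               "selects convertible memory";
    }
    return "unknown";
}
```

In this model a non-convertible file-backed region is rejected by
set_memory_region() before the guest ever runs, while a non-convertible
anonymous region passes registration and only fails in guest_fault(), at
which point region_status_reason() supplies the explanation.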