Date: Wed, 26 May 2021 19:46:52 +0000
From: Sean Christopherson
To: "Kirill A. Shutemov"
Cc: "Kirill A. Shutemov", Dave Hansen, Andy Lutomirski, Peter Zijlstra,
 Jim Mattson, David Rientjes, "Edgecombe, Rick P", "Kleen, Andi",
 "Yamahata, Isaku", Erdem Aktas, Steve Rutherford, Peter Gonda,
 David Hildenbrand, Chao Peng, x86@kernel.org, kvm@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFCv2 13/13] KVM: unmap guest memory using poisoned pages
References: <20210419142602.khjbzktk5tk5l6lk@box.shutemov.name>
 <20210419164027.dqiptkebhdt5cfmy@box.shutemov.name>
 <20210419185354.v3rgandtrel7bzjj@box>
 <20210419225755.nsrtjfvfcqscyb6m@box.shutemov.name>
 <20210521123148.a3t4uh4iezm6ax47@box>
In-Reply-To: <20210521123148.a3t4uh4iezm6ax47@box>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 21, 2021, Kirill A. Shutemov wrote:
> Hi Sean,
>
> The core patch of the approach we've discussed before is below. It
> introduce a new page type with the required semantics.
>
> The full patchset can be found here:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git kvm-unmapped-guest-only
>
> but only the patch below is relevant for TDX. QEMU patch is attached.

Can you post the whole series?  The KVM behavior and usage of FOLL_GUEST are
very relevant to TDX.

> CONFIG_HAVE_KVM_PROTECTED_MEMORY has to be changed to what is appropriate
> for TDX and FOLL_GUEST has to be used in hva_to_pfn_slow() when running
> TDX guest.

This behavior in particular is relevant; KVM should provide FOLL_GUEST iff the
access is private or the VM type doesn't differentiate between private and
shared.

> When page get inserted into private sept we must make sure it is
> PageGuest() or SIGBUS otherwise.

More KVM feedback :-)

Ideally, KVM will synchronously exit to userspace with detailed information on
the bad behavior, not do SIGBUS.  Hopefully that infrastructure will be in place
sooner rather than later.

https://lkml.kernel.org/r/YKxJLcg/WomPE422@google.com

> Inserting PageGuest() into shared is fine, but the page will not be accessible
> from userspace.

Even if it can be functionally fine, I don't think we want to allow KVM to map
PageGuest() as shared memory.  The only reason to map memory shared is to share
it with something, e.g. the host, that doesn't have access to private memory, so
I can't envision a use case.

On the KVM side, it's trivially easy to omit FOLL_GUEST for shared memory, while
always passing FOLL_GUEST would require manually zapping.  Manual zapping isn't
a big deal, but I do think it can be avoided if userspace must either remap the
hva or define a new KVM memslot (new gpa->hva), both of which will automatically
zap any existing translations.

Aha, thought of a concrete problem.  If KVM maps PageGuest() into shared memory,
then KVM must ensure that the page is not mapped private via a different
hva/gpa, and is not mapped into _any_ other guest, because the TDX-Module's 1:1
PFN:TD+GPA enforcement only applies to private memory.  The explicit "VM_WRITE |
VM_SHARED" requirement below makes me think this wouldn't be prevented.

Oh, and the other nicety is that I think it would avoid having to explicitly
handle PageGuest() memory that is being accessed from kernel/KVM, i.e. if all
memory exposed to KVM must be !PageGuest(), then it is also eligible for
copy_{to,from}_user().

> Any feedback is welcome.
>
> -------------------------------8<-------------------------------------------
>
> From: "Kirill A. Shutemov"
> Date: Fri, 16 Apr 2021 01:30:48 +0300
> Subject: [PATCH] mm: Introduce guest-only pages
>
> PageGuest() pages are only allowed to be used as guest memory. Userspace
> is not allowed read from or write to such pages.
>
> On page fault, PageGuest() pages produce PROT_NONE page table entries.
> Read or write there will trigger SIGBUS. Access to such pages via
> syscall leads to -EIO.
>
> The new mprotect(2) flag PROT_GUEST translates to VM_GUEST. Any page
> fault to VM_GUEST VMA produces PageGuest() page.
>
> Only shared tmpfs/shmem mappings are supported.

Is limiting this to tmpfs/shmem only for the PoC/RFC, or is it also expected to
be the long-term behavior?

> GUP normally fails on such pages. KVM will use the new FOLL_GUEST flag
> to access them.
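
For illustration only, the FOLL_GUEST rule suggested above (pass FOLL_GUEST iff
the access is private or the VM type doesn't differentiate between private and
shared) and the matching page check (private mappings must be PageGuest(),
shared mappings must be !PageGuest()) could be sketched roughly as below.  This
is standalone userspace C, not kernel code: SKETCH_FOLL_GUEST and its value,
enum sketch_vm_type, and all sketch_*() helpers are hypothetical names invented
for this sketch and do not come from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the FOLL_GUEST flag; the value is made up. */
#define SKETCH_FOLL_GUEST 0x1u

/* Hypothetical VM types, for illustration only. */
enum sketch_vm_type {
	SKETCH_VM_NO_SPLIT,       /* no private/shared distinction */
	SKETCH_VM_PRIVATE_SHARED, /* has both private and shared memory, e.g. TDX */
};

/*
 * Extra GUP flags for a guest access: FOLL_GUEST iff the access is private
 * or the VM type does not differentiate between private and shared.
 */
unsigned int sketch_gup_flags(enum sketch_vm_type type, bool access_is_private)
{
	if (type == SKETCH_VM_NO_SPLIT || access_is_private)
		return SKETCH_FOLL_GUEST;
	return 0;
}

/*
 * For a VM with a private/shared split: a private mapping must be backed by
 * a PageGuest() page and a shared mapping must be backed by a !PageGuest()
 * page.  Anything else is the "bad behavior" that would be reported.
 */
bool sketch_page_allowed(bool mapping_is_private, bool page_is_guest)
{
	return mapping_is_private == page_is_guest;
}

/* Exercise the two helpers with the cases discussed in this thread. */
void sketch_self_check(void)
{
	assert(sketch_gup_flags(SKETCH_VM_PRIVATE_SHARED, true) == SKETCH_FOLL_GUEST);
	assert(sketch_gup_flags(SKETCH_VM_PRIVATE_SHARED, false) == 0);
	assert(sketch_gup_flags(SKETCH_VM_NO_SPLIT, false) == SKETCH_FOLL_GUEST);
	assert(sketch_page_allowed(true, true));
	assert(!sketch_page_allowed(false, true));
}
```

With this shape, shared accesses never carry the flag, so GUP would be expected
to fail on PageGuest() pages for them without any extra handling on the KVM
side.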