Date: Wed, 25 Jan 2023 15:53:21 +0300
From: "Kirill A. Shutemov"
To: Sean Christopherson
Cc: Liam Merwick, Chao Peng, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Arnd Bergmann,
	Naoya Horiguchi, Miaohe Lin, x86@kernel.org, "H . Peter Anvin",
	Hugh Dickins, Jeff Layton, "J . Bruce Fields", Andrew Morton,
	Shuah Khan, Mike Rapoport, Steven Price, "Maciej S . Szmigiero",
	Vlastimil Babka, Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov",
	luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
	ak@linux.intel.com, david@redhat.com, aarcange@redhat.com,
	ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret,
	tabba@google.com, Michael Roth, mhocko@suse.com, wei.w.wang@intel.com
Subject: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM
Message-ID: <20230125125321.yvsivupbbaqkb7a5@box.shutemov.name>
References: <20221202061347.1070246-1-chao.p.peng@linux.intel.com>
	<48953bf2-cee9-f818-dc50-5fb5b9b410bf@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 25, 2023 at 12:20:26AM +0000, Sean Christopherson wrote:
> On Tue, Jan 24, 2023, Liam Merwick wrote:
> > On 14/01/2023 00:37, Sean Christopherson wrote:
> > > On Fri, Dec 02, 2022, Chao Peng wrote:
> > > > This patch series implements KVM guest private memory for confidential
> > > > computing scenarios like Intel TDX[1]. If a TDX host accesses
> > > > TDX-protected guest memory, machine check can happen which can further
> > > > crash the running host system, this is terrible for multi-tenant
> > > > configurations. The host accesses include those from KVM userspace like
> > > > QEMU. This series addresses KVM userspace induced crash by introducing
> > > > new mm and KVM interfaces so KVM userspace can still manage guest memory
> > > > via a fd-based approach, but it can never access the guest memory
> > > > content.
> > > >
> > > > The patch series touches both core mm and KVM code. I appreciate
> > > > Andrew/Hugh and Paolo/Sean can review and pick these patches. Any other
> > > > reviews are always welcome.
> > > >   - 01: mm change, target for mm tree
> > > >   - 02-09: KVM change, target for KVM tree
> > >
> > > A version with all of my feedback, plus reworked versions of Vishal's selftest,
> > > is available here:
> > >
> > >   git@github.com:sean-jc/linux.git x86/upm_base_support
> > >
> > > It compiles and passes the selftest, but it's otherwise barely tested.  There are
> > > a few todos (2 I think?) and many of the commits need changelogs, i.e. it's still
> > > a WIP.
> > >
> >
> > When running LTP (https://github.com/linux-test-project/ltp) on the v10
> > bits (and also with Sean's branch above) I encounter the following NULL
> > pointer dereference with testcases/kernel/syscalls/madvise/madvise01
> > (100% reproducible).
> >
> > It appears that in restrictedmem_error_page() inode->i_mapping->private_data
> > is NULL
> > in the list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list)
> > but I don't know why.
>
> Kirill, can you take a look?  Or pass the buck to someone who can? :-)

The patch below should help.

diff --git a/mm/restrictedmem.c b/mm/restrictedmem.c
index 15c52301eeb9..39ada985c7c0 100644
--- a/mm/restrictedmem.c
+++ b/mm/restrictedmem.c
@@ -307,14 +307,29 @@ void restrictedmem_error_page(struct page *page, struct address_space *mapping)
 
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
-		struct restrictedmem *rm = inode->i_mapping->private_data;
 		struct restrictedmem_notifier *notifier;
-		struct file *memfd = rm->memfd;
+		struct restrictedmem *rm;
 		unsigned long index;
+		struct file *memfd;
 
-		if (memfd->f_mapping != mapping)
+		if (atomic_read(&inode->i_count))
 			continue;
 
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		rm = inode->i_mapping->private_data;
+		memfd = rm->memfd;
+
+		if (memfd->f_mapping != mapping) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+		spin_unlock(&inode->i_lock);
+
 		xa_for_each_range(&rm->bindings, index, notifier, start, end)
 			notifier->ops->error(notifier, start, end);
 		break;
-- 
  Kiryl Shutsemau / Kirill A. Shutemov