Message-Id: <762581e4-a6bf-41d1-b0d3-72543153ffb1@www.fastmail.com>
In-Reply-To: <20220909143236.sznwzkpedldrlnn5@box.shutemov.name>
References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com>
 <20220818132421.6xmjqduempmxnnu2@box>
 <20220820002700.6yflrxklmpsavdzi@box.shutemov.name>
 <95bd287b-d17f-fda8-58c9-20700b1e0c72@kernel.org>
 <20220909143236.sznwzkpedldrlnn5@box.shutemov.name>
Date: Fri, 09 Sep 2022 12:11:05 -0700
From: "Andy Lutomirski"
To: "Kirill A. Shutemov"
Cc: "Kirill A . Shutemov", "Hugh Dickins", "Chao Peng", "kvm list",
 "Linux Kernel Mailing List", linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, "Linux API", linux-doc@vger.kernel.org,
 qemu-devel@nongnu.org, linux-kselftest@vger.kernel.org, "Paolo Bonzini",
 "Jonathan Corbet", "Sean Christopherson", "Vitaly Kuznetsov",
 "Wanpeng Li", "Jim Mattson", "Joerg Roedel", "Thomas Gleixner",
 "Ingo Molnar", "Borislav Petkov", "the arch/x86 maintainers",
 "H. Peter Anvin", "Jeff Layton", "J . Bruce Fields", "Andrew Morton",
 "Shuah Khan", "Mike Rapoport", "Steven Price", "Maciej S . Szmigiero",
 "Vlastimil Babka", "Vishal Annapurve", "Yu Zhang", "Nakajima, Jun",
 "Dave Hansen", "Andi Kleen", "David Hildenbrand", aarcange@redhat.com,
 ddutile@redhat.com, dhildenb@redhat.com, "Quentin Perret",
 "Michael Roth", "Michal Hocko", "Muchun Song", "Gupta, Pankaj"
Subject: Re: [PATCH v7 00/14] KVM: mm: fd-based approach for supporting KVM
 guest private memory
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Sep 9, 2022, at 7:32 AM, Kirill A . Shutemov wrote:
> On Thu, Sep 08, 2022 at 09:48:35PM -0700, Andy Lutomirski wrote:
>> On 8/19/22 17:27, Kirill A. Shutemov wrote:
>> > On Thu, Aug 18, 2022 at 08:00:41PM -0700, Hugh Dickins wrote:
>> > > On Thu, 18 Aug 2022, Kirill A . Shutemov wrote:
>> > > > On Wed, Aug 17, 2022 at 10:40:12PM -0700, Hugh Dickins wrote:
>> > > > >
>> > > > > If your memory could be swapped, that would be enough of a good reason
>> > > > > to make use of shmem.c: but it cannot be swapped; and although there
>> > > > > are some references in the mailthreads to it perhaps being swappable
>> > > > > in future, I get the impression that will not happen soon if ever.
>> > > > >
>> > > > > If your memory could be migrated, that would be some reason to use
>> > > > > filesystem page cache (because page migration happens to understand
>> > > > > that type of memory): but it cannot be migrated.
>> > > >
>> > > > Migration support is in pipeline. It is part of TDX 1.5 [1]. And swapping
>> > > > is theoretically possible, but I'm not aware of any plans as of now.
>> > > >
>> > > > [1] https://www.intel.com/content/www/us/en/developer/articles/technical/intel-trust-domain-extensions.html
>> > >
>> > > I always forget, migration means different things to different audiences.
>> > > As an mm person, I was meaning page migration, whereas a virtualization
>> > > person thinks VM live migration (which that reference appears to be about),
>> > > a scheduler person task migration, an ornithologist bird migration, etc.
>> > >
>> > > But you're an mm person too: you may have cited that reference in the
>> > > knowledge that TDX 1.5 Live Migration will entail page migration of the
>> > > kind I'm thinking of. (Anyway, it's not important to clarify that here.)
>> >
>> > TDX 1.5 brings both.
>> >
>> > In TDX speak, mm migration is called relocation. See TDH.MEM.PAGE.RELOCATE.
>> >
>>
>> This seems to be a pretty bad fit for the way that the core mm migrates
>> pages. The core mm unmaps the page, then moves (in software) the contents
>> to a new address, then faults it in. TDH.MEM.PAGE.RELOCATE doesn't fit into
>> that workflow very well. I'm not saying it can't be done, but it won't just
>> work.
>
> Hm. From what I see we have all necessary infrastructure in place.
>
> Unmapping is NOP for inaccessible pages as it is never mapped and we have
> mapping->a_ops->migrate_folio() callback that allows to replace software
> copying with whatever is needed, like TDH.MEM.PAGE.RELOCATE.
>
> What do I miss?

Hmm, maybe this isn't as bad as I thought.

Right now, unless I've missed something, the migration workflow is to
unmap (via try_to_migrate()) all mappings, then migrate the backing store
(with ->migrate_folio(), although it seems like most callers expect the
actual copy to happen outside of ->migrate_folio()), and then make new
mappings. With the *current* (vma-based, not fd-based) model for KVM
memory, this won't work -- we can't unmap before calling
TDH.MEM.PAGE.RELOCATE.

But maybe it's actually okay, with some care or maybe mild modifications,
with the fd-based model. We don't have any mmaps, per se, to unmap for
secret / INACCESSIBLE memory. So maybe we can get all the way to
->migrate_folio() without zapping anything in the secure EPT and just call
TDH.MEM.PAGE.RELOCATE from inside migrate_folio(). And there will be
nothing to fault back in. From the core code's perspective, it's like
migrating a memfd that doesn't happen to have any mappings at the time.

--Andy
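
To make the sequence above concrete, here is a rough userspace model in C.
It is only a sketch under stated assumptions, not kernel code: every
model_* name is hypothetical, the structures merely echo the kernel names
used in this thread, and a memcpy() stands in for whatever
TDH.MEM.PAGE.RELOCATE would really do. It only illustrates the ordering:
unmap, call the migrate_folio-style callback, then remap.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Userspace model only: these types echo kernel names, nothing more. */
struct model_folio {
	int  mapcount;       /* userspace mappings; 0 for fd-based private memory */
	bool in_secure_ept;  /* still present in the guest's secure EPT? */
	char data[16];
};

struct model_aops {
	/* Stand-in for mapping->a_ops->migrate_folio(); returns 0 on success. */
	int (*migrate_folio)(struct model_folio *dst, struct model_folio *src);
};

struct model_stats { int unmapped, remapped, relocated; };
static struct model_stats stats;

/* Stand-in for try_to_migrate(): tear down every userspace mapping. */
static void model_try_to_migrate(struct model_folio *src)
{
	stats.unmapped += src->mapcount;
	src->mapcount = 0;
}

/* Ordinary page cache case: a plain software copy. */
static int model_copy_migrate(struct model_folio *dst, struct model_folio *src)
{
	memcpy(dst->data, src->data, sizeof(dst->data));
	return 0;
}

/* Hypothetical private-memory callback; memcpy() is a placeholder for
 * TDH.MEM.PAGE.RELOCATE, which needs the source still in the secure EPT. */
static int model_relocate_migrate(struct model_folio *dst, struct model_folio *src)
{
	if (!src->in_secure_ept)
		return -1;
	memcpy(dst->data, src->data, sizeof(dst->data));
	dst->in_secure_ept = true;
	src->in_secure_ept = false;
	stats.relocated++;
	return 0;
}

/* The three steps from the mail: unmap, migrate backing store, remap. */
static int model_migrate(const struct model_aops *aops,
			 struct model_folio *dst, struct model_folio *src)
{
	int nr_maps = src->mapcount;
	int ret;

	model_try_to_migrate(src);           /* no-op when mapcount == 0 */
	ret = aops->migrate_folio(dst, src);
	if (ret) {
		src->mapcount = nr_maps;     /* failure: restore old mappings */
		return ret;
	}
	dst->mapcount = nr_maps;             /* nothing to do when nr_maps == 0 */
	stats.remapped += nr_maps;
	return 0;
}
```

In this model, a folio with mapcount == 0 reaches the callback with its
secure EPT entry untouched, and the unmap and remap steps do nothing, which
is the "memfd without mappings" case described above. Whether the real
migration path behaves this way for private memory is exactly the open
question in this thread.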