Date: Fri, 27 Aug 2021 22:28:54 +0000
From: Sean Christopherson
To: Andy Lutomirski
Cc: David Hildenbrand, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, kvm list, Linux Kernel Mailing List,
    Borislav Petkov, Andrew Morton, Joerg Roedel, Andi Kleen,
    David Rientjes, Vlastimil Babka, Tom Lendacky, Thomas Gleixner,
    "Peter Zijlstra (Intel)", Ingo Molnar, Varad Gautam, Dario Faggioli,
    the arch/x86 maintainers, linux-mm@kvack.org,
    linux-coco@lists.linux.dev, "Kirill A. Shutemov",
    "Kirill A . Shutemov", Sathyanarayanan Kuppuswamy, Dave Hansen,
    Yu Zhang
Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
References: <20210824005248.200037-1-seanjc@google.com>
    <307d385a-a263-276f-28eb-4bc8dd287e32@redhat.com>
    <40af9d25-c854-8846-fdab-13fe70b3b279@kernel.org>
    <73319f3c-6f5e-4f39-a678-7be5fddd55f2@www.fastmail.com>
In-Reply-To: <73319f3c-6f5e-4f39-a678-7be5fddd55f2@www.fastmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 27, 2021, Andy Lutomirski wrote:
>
> On Thu, Aug 26, 2021, at 2:26 PM, David Hildenbrand wrote:
> > On 26.08.21 19:05, Andy Lutomirski wrote:
>
> > > Oof. That's quite a requirement. What's the point of the VMA once all
> > > this is done?
> >
> > You can keep using things like mbind(), madvise(), ... and the GUP code
> > with a special flag might mostly just do what you want. You won't have
> > to reinvent too many wheels on the page fault logic side at least.

Ya, Kirill's RFC more or less proved a special GUP flag would indeed Just Work.
However, the KVM page fault side of things would require only a handful of small
changes to send private memslots down a different path. Compared to the rest of
the enabling, it's quite minor.

The counter to that is other KVM architectures would need to learn how to use the
new APIs, though I suspect that there will be a fair bit of arch enabling
regardless of what route we take.

> You can keep calling the functions. The implementations working is a
> different story: you can't just unmap (pte_numa-style or otherwise) a private
> guest page to quiesce it, move it with memcpy(), and then fault it back in.

Ya, I brought this up in my earlier reply. Even the initial implementation
(without real NUMA support) would likely be painful, e.g.
the KVM TDX RFC/PoC adds dedicated logic in KVM to handle the case where NUMA
balancing zaps a _pinned_ page and then KVM faults in the same pfn. It's not
thaaat ugly, but it's arguably more invasive to KVM's page fault flows than a
new fd-based private memslot scheme.