Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
To: Sean Christopherson, Andy Lutomirski
Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
    kvm list, Linux Kernel Mailing List, Borislav Petkov, Andrew Morton,
    Andi Kleen, David Rientjes, Vlastimil Babka, Tom Lendacky,
    Thomas Gleixner, "Peter Zijlstra (Intel)", Ingo Molnar, Varad Gautam,
    Dario Faggioli, the arch/x86 maintainers, linux-mm@kvack.org,
    linux-coco@lists.linux.dev, "Kirill A. Shutemov",
    Sathyanarayanan Kuppuswamy, Dave Hansen, Yu Zhang
References: <20210824005248.200037-1-seanjc@google.com>
    <307d385a-a263-276f-28eb-4bc8dd287e32@redhat.com>
    <40af9d25-c854-8846-fdab-13fe70b3b279@kernel.org>
    <73319f3c-6f5e-4f39-a678-7be5fddd55f2@www.fastmail.com>
From: David Hildenbrand
Organization: Red Hat
Message-ID: <949e6d95-266d-0234-3b86-6bd3c5267333@redhat.com>
Date: Tue, 31 Aug 2021 21:12:44 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

On 28.08.21 00:28, Sean Christopherson wrote:
> On Fri, Aug 27, 2021, Andy Lutomirski wrote:
>>
>> On Thu, Aug 26, 2021, at 2:26 PM, David Hildenbrand wrote:
>>> On 26.08.21 19:05, Andy Lutomirski wrote:
>>
>>>> Oof. That's quite a requirement. What's the point of the VMA once all
>>>> this is done?
>>>
>>> You can keep using things like mbind(), madvise(), ... and the GUP code
>>> with a special flag might mostly just do what you want. You won't have
>>> to reinvent too many wheels on the page fault logic side at least.
>
> Ya, Kirill's RFC more or less proved a special GUP flag would indeed Just Work.
> However, the KVM page fault side of things would require only a handful of small
> changes to send private memslots down a different path. Compared to the rest of
> the enabling, it's quite minor.
>
> The counter to that is other KVM architectures would need to learn how to use the
> new APIs, though I suspect that there will be a fair bit of arch enabling regardless
> of what route we take.
>
>> You can keep calling the functions. The implementations working is a
>> different story: you can't just unmap (pte_numa-style or otherwise) a private
>> guest page to quiesce it, move it with memcpy(), and then fault it back in.
>
> Ya, I brought this up in my earlier reply. Even the initial implementation (without
> real NUMA support) would likely be painful, e.g. the KVM TDX RFC/PoC adds dedicated
> logic in KVM to handle the case where NUMA balancing zaps a _pinned_ page and then
> KVM faults in the same pfn. It's not thaaat ugly, but it's arguably more invasive
> to KVM's page fault flows than a new fd-based private memslot scheme.

I might have a different mindset, but less code churn doesn't necessarily
translate to "better approach".

I'm certainly not pushing for what I proposed (it's a rough, broken
sketch). Rather, I'm trying to come up with alternatives that solve the
same issue while handling the identified requirements.

I have a gut feeling that the list of requirements might not be complete
yet. For example, I wonder if we have to protect against user space
replacing private pages with shared pages or punching random holes into
the encrypted memory fd.

-- 
Thanks,

David / dhildenb