References: <20220519153713.819591-1-chao.p.peng@linux.intel.com>
In-Reply-To: <20220519153713.819591-1-chao.p.peng@linux.intel.com>
From: Vishal Annapurve
Date: Mon, 6 Jun 2022 13:09:50 -0700
Subject: Re: [PATCH v6 0/8] KVM: mm: fd-based approach for supporting KVM guest private memory
To: Chao Peng
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
 linux-doc@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini,
 Jonathan Corbet, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
 Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 x86@kernel.org, "H . Peter Anvin", Hugh Dickins, Jeff Layton,
 "J . Bruce Fields", Andrew Morton, Mike Rapoport, Steven Price,
 "Maciej S . Szmigiero", Vlastimil Babka, Yu Zhang, "Kirill A .
 Shutemov", Andy Lutomirski, Jun Nakajima, dave.hansen@intel.com,
 ak@linux.intel.com, david@redhat.com, aarcange@redhat.com,
 ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret, Michael Roth,
 mhocko@suse.com
X-Mailing-List: linux-kernel@vger.kernel.org

>
> Private memory map/unmap and conversion
> ---------------------------------------
> Userspace's map/unmap operations are done by fallocate() ioctl on the
> backing store fd.
>   - map: default fallocate() with mode=0.
>   - unmap: fallocate() with FALLOC_FL_PUNCH_HOLE.
> The map/unmap will trigger above memfile_notifier_ops to let KVM map/unmap
> secondary MMU page tables.
> ....
> QEMU: https://github.com/chao-p/qemu/tree/privmem-v6
>
> An example QEMU command line for TDX test:
> -object tdx-guest,id=tdx \
> -object memory-backend-memfd-private,id=ram1,size=2G \
> -machine q35,kvm-type=tdx,pic=no,kernel_irqchip=split,memory-encryption=tdx,memory-backend=ram1
>

There should be more discussion of double allocation scenarios when
using the private fd approach. A malicious guest or a buggy userspace
VMM can cause physical memory to be allocated for both the shared
(memory accessible from the host) and the private fds backing the guest
memory.

The userspace VMM will need to unback the shared guest memory while
handling the conversion from shared to private, so that double
allocation is prevented even with malicious guests or bugs in the
userspace VMM.
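
The double allocation concern can be sketched in userspace as follows.
This is only an illustration under stated assumptions: both fds are
plain memfds, with the second one standing in for the series' private
fd, and the names double_allocate() and allocated_bytes() are
hypothetical helpers, not part of the patch set or of QEMU. The
"private" side is populated with fallocate(mode=0), the map operation
described in the quoted cover letter, without first unbacking the
shared side.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Bytes currently allocated for fd (st_blocks is in 512-byte units). */
static long long allocated_bytes(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? (long long)st.st_blocks * 512 : -1;
}

/*
 * Hypothetical illustration: back the same "guest range" of len bytes twice.
 * shared_fd is touched through a MAP_SHARED mapping; private_fd is a plain
 * memfd standing in for the private fd (an assumption, not the real API).
 * Returns the total bytes allocated across both fds, or -1 on error.
 */
static long long double_allocate(size_t len)
{
    long long total = -1;
    char *shared = MAP_FAILED;
    int shared_fd, private_fd;

    assert(len > 0);
    shared_fd = memfd_create("guest-shared", 0);
    private_fd = memfd_create("guest-private-standin", 0);
    if (shared_fd < 0 || private_fd < 0)
        goto out;
    if (ftruncate(shared_fd, (off_t)len) != 0 ||
        ftruncate(private_fd, (off_t)len) != 0)
        goto out;

    shared = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, shared_fd, 0);
    if (shared == MAP_FAILED)
        goto out;

    /* Writes through the shared mapping get the shared pages backed. */
    memset(shared, 0xaa, len);

    /* "Conversion" to private without unbacking the shared range. */
    if (fallocate(private_fd, 0, 0, (off_t)len) != 0)
        goto out;

    total = allocated_bytes(shared_fd) + allocated_bytes(private_fd);
out:
    if (shared != MAP_FAILED)
        munmap(shared, len);
    if (shared_fd >= 0)
        close(shared_fd);
    if (private_fd >= 0)
        close(private_fd);
    return total;
}
```

Calling double_allocate(len) on a tmpfs-backed memfd is expected to
report roughly twice len, since nothing released the shared pages
before the second fd was populated.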

Options to unback shared guest memory seem to be:
1) madvise(.., MADV_DONTNEED/MADV_REMOVE) - This option won't stop the
   kernel from backing the shared memory again on subsequent write
   accesses.
2) fallocate(..., FALLOC_FL_PUNCH_HOLE...) - For file-backed shared
   guest memory, this option is similar to madvise, since it would
   still allow shared memory to get backed on write accesses.
3) munmap - This would give up the contiguous virtual memory region
   reservation and leave holes in the guest backing memory, which might
   make guest memory management difficult.
4) mprotect(... PROT_NONE) - This would preserve the virtual address
   range backing the guest memory.

ram_block_discard_range_fd from the reference implementation
(https://github.com/chao-p/qemu/tree/privmem-v6) seems to rely on
fallocate/madvise.

Any thoughts/suggestions around better ways to unback the shared memory
in order to avoid double allocation scenarios?

Regards,
Vishal
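
The behaviour behind options 2 and 4 can be sketched as below. This is
a minimal userspace illustration, assuming a plain memfd as the shared
backing store. unback_and_rewrite() and allocated_bytes() are
hypothetical helper names and are not taken from QEMU or the patch
series. The sketch punches a hole in a populated shared mapping, writes
one byte again to show the range getting re-backed, and finally applies
PROT_NONE, which keeps the address range reserved.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Bytes currently allocated for fd (st_blocks is in 512-byte units). */
static long long allocated_bytes(int fd)
{
    struct stat st;
    return fstat(fd, &st) == 0 ? (long long)st.st_blocks * 512 : -1;
}

/*
 * Hypothetical helper: out[0], out[1], out[2] receive the allocation
 * after populating, after punching a hole, and after one more write.
 * Returns 0 on success (including the final mprotect), -1 on error.
 */
static int unback_and_rewrite(size_t len, long long out[3])
{
    int rc = -1;
    char *p = MAP_FAILED;
    int fd;

    assert(len > 0);
    fd = memfd_create("guest-shared", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0)
        goto out;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out;

    memset(p, 0xaa, len);              /* fault in every page */
    out[0] = allocated_bytes(fd);

    /* Option 2: free the pages; the mapping itself stays valid. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, (off_t)len) != 0)
        goto out;
    out[1] = allocated_bytes(fd);

    /* A later write through the still-writable mapping re-backs a page. */
    *(volatile char *)p = 1;
    out[2] = allocated_bytes(fd);

    /* Option 4: keep the address range reserved but inaccessible. */
    rc = mprotect(p, len, PROT_NONE);
out:
    if (p != MAP_FAILED)
        munmap(p, len);
    close(fd);
    return rc;
}
```

On a tmpfs-backed memfd the allocation is expected to drop after the
hole punch and grow again after the single write, which is the
re-backing that options 1 and 2 do not prevent on their own. Combining
the hole punch with PROT_NONE is one possible way to close that gap,
though whether it fits the VMM's memory management is an open question.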