Message-ID: <7822c00f-5a2d-b6a2-2f81-cf3330801ad3@maciej.szmigiero.name>
Date: Wed, 23 Feb 2022 19:32:37 +0100
From: "Maciej S. Szmigiero"
To: Chao Peng
Cc: Yu Zhang, Paolo Bonzini, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 Jonathan Corbet, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
 Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
 kvm@vger.kernel.org, Borislav Petkov, x86@kernel.org,
 "H . Peter Anvin", Hugh Dickins, Jeff Layton, "J . Bruce Fields",
 Andrew Morton, "Kirill A . Shutemov", luto@kernel.org,
 jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com,
 david@redhat.com, qemu-devel@nongnu.org
Subject: Re: [PATCH v4 12/12] KVM: Expose KVM_MEM_PRIVATE
In-Reply-To: <20220223120047.GB53733@chaop.bj.intel.com>
References: <20220118132121.31388-1-chao.p.peng@linux.intel.com>
 <20220118132121.31388-13-chao.p.peng@linux.intel.com>
 <20220217134548.GA33836@chaop.bj.intel.com>
 <45148f5f-fe79-b452-f3b2-482c5c3291c4@maciej.szmigiero.name>
 <20220223120047.GB53733@chaop.bj.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 23.02.2022 13:00, Chao Peng wrote:
> On Tue, Feb 22, 2022 at 02:16:46AM +0100, Maciej S. Szmigiero wrote:
>> On 17.02.2022 14:45, Chao Peng wrote:
>>> On Tue, Jan 25, 2022 at 09:20:39PM +0100, Maciej S. Szmigiero wrote:
>>>> On 18.01.2022 14:21, Chao Peng wrote:
>>>>> KVM_MEM_PRIVATE is not exposed by default but architecture code can turn
>>>>> on it by implementing kvm_arch_private_memory_supported().
>>>>>
>>>>> Also private memslot cannot be movable and the same file+offset can not
>>>>> be mapped into different GFNs.
>>>>>
>>>>> Signed-off-by: Yu Zhang
>>>>> Signed-off-by: Chao Peng
>>>>> ---
>>>> (..)
>>>>>   static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id,
>>>>> -                                       gfn_t start, gfn_t end)
>>>>> +                                       struct file *file,
>>>>> +                                       gfn_t start, gfn_t end,
>>>>> +                                       loff_t start_off, loff_t end_off)
>>>>>   {
>>>>>           struct kvm_memslot_iter iter;
>>>>> +         struct kvm_memory_slot *slot;
>>>>> +         struct inode *inode;
>>>>> +         int bkt;
>>>>>           kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
>>>>>                   if (iter.slot->id != id)
>>>>>                           return true;
>>>>>           }
>>>>> +         /* Disallow mapping the same file+offset into multiple gfns. */
>>>>> +         if (file) {
>>>>> +                 inode = file_inode(file);
>>>>> +                 kvm_for_each_memslot(slot, bkt, slots) {
>>>>> +                         if (slot->private_file &&
>>>>> +                             file_inode(slot->private_file) == inode &&
>>>>> +                             !(end_off <= slot->private_offset ||
>>>>> +                               start_off >= slot->private_offset
>>>>> +                                            + (slot->npages >> PAGE_SHIFT)))
>>>>> +                                 return true;
>>>>> +                 }
>>>>> +         }
>>>>
>>>> That's a linear scan of all memslots on each CREATE (and MOVE) operation
>>>> with a fd - we just spent more than a year rewriting similar linear scans
>>>> into more efficient operations in KVM.
>>> (..)
>>> So linear scan is used before I can find a better way.
>>
>> Another option would be to simply not check for overlap at add or move
>> time, declare such configuration undefined behavior under KVM API and
>> make sure in MMU notifiers that nothing bad happens to the host kernel
>> if it turns out somebody actually set up a VM this way (it could be
>> inefficient in this case, since it's not supposed to ever happen
>> unless there is a bug somewhere in the userspace part).
>
> Specific to TDX case, SEAMMODULE will fail the overlapping case and then
> KVM prints a message to the kernel log. It will not cause any other side
> effect, it does look weird however. Yes warn that in the API document
> can help to some extent.

So for the functionality you are adding this code for (TDX) this scan
isn't necessary and the overlapping case (not supported anyway) is safely
handled by the hardware (or firmware)?
Then I would simply remove the scan and, maybe, add a comment instead
that the overlap check is done by the hardware.

By the way, if a kernel log message could be triggered by (misbehaving)
userspace then it should be rate limited (if it isn't already).

> Thanks,
> Chao

Thanks,
Maciej
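
To illustrate the rate limiting point above: in-kernel code would normally
use an existing helper such as pr_warn_ratelimited() rather than open-code
anything. The userspace sketch below only models the interval/burst idea
behind such helpers. struct ratelimit_model, ratelimit_allow() and the tick
values are hypothetical names and numbers chosen for this example; none of
this is KVM or TDX code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of interval/burst rate limiting. */
struct ratelimit_model {
	uint64_t interval;     /* window length, in caller-defined ticks */
	unsigned int burst;    /* messages allowed per window */
	uint64_t window_start; /* tick at which the current window began */
	unsigned int printed;  /* messages let through in this window */
	unsigned int missed;   /* messages suppressed in this window */
};

/* Returns true if a message arriving at tick 'now' may be logged. */
bool ratelimit_allow(struct ratelimit_model *rl, uint64_t now)
{
	/* Assumption of this sketch: 'now' never goes backwards. */
	assert(now >= rl->window_start);

	/* Start a fresh window once the current one has expired. */
	if (now - rl->window_start >= rl->interval) {
		rl->window_start = now;
		rl->printed = 0;
		rl->missed = 0;
	}

	/* Let up to 'burst' messages through per window. */
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}

	/* Over budget: drop the message and count the drop. */
	rl->missed++;
	return false;
}
```

With interval = 5 and burst = 2, calls at ticks 0 and 1 are allowed, a call
at tick 2 is suppressed, and a call at tick 5 opens a new window. A
misbehaving userspace that keeps triggering the message can therefore
produce at most 'burst' log lines per interval.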