Subject: Re: [RFC PATCH] kvm: Use huge pages for DAX-backed files
From: Paolo Bonzini
To: Barret Rhoden, Dan Williams
Cc: Dave Jiang, zwisler@kernel.org, Vishal L Verma, rkrcmar@redhat.com,
    Thomas Gleixner, Ingo Molnar, Borislav Petkov, linux-nvdimm,
    Linux Kernel Mailing List, "H. Peter Anvin", X86 ML, KVM list,
    "Zhang, Yu C", "Zhang, Yi Z"
Date: Wed, 31 Oct 2018 09:49:52 +0100
Message-ID: <71d52e0f-ec40-d423-4dd4-e3aeb3730166@redhat.com>
In-Reply-To: <20181030154524.181b8236@gnomeregan.cam.corp.google.com>
References: <20181029210716.212159-1-brho@google.com>
    <20181029202854.7c924fd3@gnomeregan.cam.corp.google.com>
    <20181030154524.181b8236@gnomeregan.cam.corp.google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 30/10/2018 20:45, Barret Rhoden wrote:
> On 2018-10-29 at 20:10 Dan Williams wrote:
>> The property of DAX pages that requires special coordination is the
>> fact that the device hosting the pages can be disabled at will. The
>> get_dev_pagemap() api is the interface to pin a device-pfn so that you
>> can safely perform a pfn_to_page() operation.
>>
>> Have the pages that kvm uses in this path already been pinned by vfio?

No, VFIO is not involved here.
The pages that KVM uses are never pinned. Soon after we grab them and
we build KVM's page table, we do put_page in mmu_set_spte (via
kvm_release_pfn_clean). From that point on the MMU notifier will take
care of invalidating SPT before the page disappears from the mm's page
table.

> One usage of kvm_is_reserved_pfn() in KVM code is like this:
>
> static struct page *kvm_pfn_to_page(kvm_pfn_t pfn)
> {
> 	if (is_error_noslot_pfn(pfn))
> 		return KVM_ERR_PTR_BAD_PAGE;
>
> 	if (kvm_is_reserved_pfn(pfn)) {
> 		WARN_ON(1);
> 		return KVM_ERR_PTR_BAD_PAGE;
> 	}
>
> 	return pfn_to_page(pfn);
> }
>
> I think there's no guarantee the kvm->mmu_lock is held in the generic
> case.

Indeed, it's not.

> There are probably other rules related to gfn_to_page that keep the
> page alive, maybe just during interrupt/vmexit context? Whatever keeps
> those pages alive for normal memory might grab that devmap reference
> under the hood for DAX mappings.

Nothing keeps the page alive except for the MMU notifier (which of
course cannot run in atomic context, since its callers take the mmap_sem).

Paolo
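
The lifetime rule described above (take a reference, install the
mapping, drop the reference, rely on the notifier to zap the mapping)
can be sketched as a toy userspace model in C. This is not kernel code
and not how KVM is implemented: every toy_* name is hypothetical, and
the comments only point at the kernel function each step is loosely
modeled on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only a reference count. */
struct toy_page {
	int refcount;
};

/* Toy stand-in for one shadow page table entry. */
struct toy_spte {
	struct toy_page *page;	/* NULL means "not present" */
};

static void toy_get_page(struct toy_page *p)
{
	p->refcount++;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Fault path: take a temporary reference, install the mapping, then
 * drop the reference again. The entry itself holds no reference, so
 * the page is not pinned once this function returns.
 */
static void toy_map_page(struct toy_spte *spte, struct toy_page *p)
{
	toy_get_page(p);	/* loosely: grabbing the pfn */
	spte->page = p;		/* loosely: building the page table entry */
	toy_put_page(p);	/* loosely: kvm_release_pfn_clean() */
}

/*
 * Notifier path: called before the page leaves the mm's page table.
 * Zapping the entry here is the only thing that stops the guest from
 * reaching a page that is about to go away.
 */
static void toy_mmu_notifier_invalidate(struct toy_spte *spte,
					struct toy_page *p)
{
	if (spte->page == p)
		spte->page = NULL;
}

static bool toy_spte_present(const struct toy_spte *spte)
{
	return spte->page != NULL;
}
```

In this model, mapping a page whose refcount starts at 1 leaves the
refcount at 1 afterwards, which is the "never pinned" property; only a
call to toy_mmu_notifier_invalidate() makes the entry non-present.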