Date: Fri, 13 Dec 2019 09:47:02 -0800
From: Sean Christopherson
To: Barret Rhoden
Cc: Paolo Bonzini, Dan Williams, David Hildenbrand, Dave Jiang,
	Alexander Duyck, linux-nvdimm@lists.01.org, x86@kernel.org,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org, jason.zeng@intel.com
Subject: Re: [PATCH v5 1/2] mm: make dev_pagemap_mapping_shift() externally visible
Message-ID: <20191213174702.GB31552@linux.intel.com>
References: <20191212182238.46535-1-brho@google.com> <20191212182238.46535-2-brho@google.com>
In-Reply-To: <20191212182238.46535-2-brho@google.com>

On Thu, Dec 12, 2019 at 01:22:37PM -0500, Barret Rhoden wrote:
> KVM has a use case for determining the size of a dax mapping.
>
> The KVM code has easy access to the address and the mm, and
> dev_pagemap_mapping_shift() needs only those parameters.  It was
> deriving them from page and vma.  This commit changes those parameters
> from (page, vma) to (address, mm).
>
> Signed-off-by: Barret Rhoden
> Reviewed-by: David Hildenbrand
> Acked-by: Dan Williams
> ---
>  include/linux/mm.h  |  3 +++
>  mm/memory-failure.c | 38 +++-----------------------------------
>  mm/util.c           | 34 ++++++++++++++++++++++++++++++++++
>  3 files changed, 40 insertions(+), 35 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a2adf95b3f9c..bfd1882dd5c6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1013,6 +1013,9 @@ static inline bool is_pci_p2pdma_page(const struct page *page)
>  #define page_ref_zero_or_close_to_overflow(page) \
>  	((unsigned int) page_ref_count(page) + 127u <= 127u)
>
> +unsigned long dev_pagemap_mapping_shift(unsigned long address,
> +					struct mm_struct *mm);
> +
>  static inline void get_page(struct page *page)
>  {
>  	page = compound_head(page);
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3151c87dff73..bafa464c8290 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -261,40 +261,6 @@ void shake_page(struct page *p, int access)
>  }
>  EXPORT_SYMBOL_GPL(shake_page);
>
> -static unsigned long dev_pagemap_mapping_shift(struct page *page,
> -					       struct vm_area_struct *vma)
> -{
> -	unsigned long address = vma_address(page, vma);
> -	pgd_t *pgd;
> -	p4d_t *p4d;
> -	pud_t *pud;
> -	pmd_t *pmd;
> -	pte_t *pte;
> -
> -	pgd = pgd_offset(vma->vm_mm, address);
> -	if (!pgd_present(*pgd))
> -		return 0;
> -	p4d = p4d_offset(pgd, address);
> -	if (!p4d_present(*p4d))
> -		return 0;
> -	pud = pud_offset(p4d, address);
> -	if (!pud_present(*pud))
> -		return 0;
> -	if (pud_devmap(*pud))
> -		return PUD_SHIFT;
> -	pmd = pmd_offset(pud, address);
> -	if (!pmd_present(*pmd))
> -		return 0;
> -	if (pmd_devmap(*pmd))
> -		return PMD_SHIFT;
> -	pte = pte_offset_map(pmd, address);
> -	if (!pte_present(*pte))
> -		return 0;
> -	if (pte_devmap(*pte))
> -		return PAGE_SHIFT;
> -	return 0;
> -}
> -
>  /*
>   * Failure handling: if we can't find or can't kill a process there's
>   * not much we can do. We just print a message and ignore otherwise.
> @@ -324,7 +290,9 @@ static void add_to_kill(struct task_struct *tsk, struct page *p,
>  	}
>  	tk->addr = page_address_in_vma(p, vma);
>  	if (is_zone_device_page(p))
> -		tk->size_shift = dev_pagemap_mapping_shift(p, vma);
> +		tk->size_shift =
> +			dev_pagemap_mapping_shift(vma_address(page, vma),
> +						  vma->vm_mm);
>  	else
>  		tk->size_shift = compound_order(compound_head(p)) + PAGE_SHIFT;
>
> diff --git a/mm/util.c b/mm/util.c
> index 3ad6db9a722e..59984e6b40ab 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -901,3 +901,37 @@ int memcmp_pages(struct page *page1, struct page *page2)
>  	kunmap_atomic(addr1);
>  	return ret;
>  }
> +
> +unsigned long dev_pagemap_mapping_shift(unsigned long address,
> +					struct mm_struct *mm)
> +{
> +	pgd_t *pgd;
> +	p4d_t *p4d;
> +	pud_t *pud;
> +	pmd_t *pmd;
> +	pte_t *pte;
> +
> +	pgd = pgd_offset(mm, address);
> +	if (!pgd_present(*pgd))
> +		return 0;
> +	p4d = p4d_offset(pgd, address);
> +	if (!p4d_present(*p4d))
> +		return 0;
> +	pud = pud_offset(p4d, address);
> +	if (!pud_present(*pud))
> +		return 0;
> +	if (pud_devmap(*pud))
> +		return PUD_SHIFT;
> +	pmd = pmd_offset(pud, address);
> +	if (!pmd_present(*pmd))
> +		return 0;
> +	if (pmd_devmap(*pmd))
> +		return PMD_SHIFT;
> +	pte = pte_offset_map(pmd, address);
> +	if (!pte_present(*pte))
> +		return 0;
> +	if (pte_devmap(*pte))
> +		return PAGE_SHIFT;
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(dev_pagemap_mapping_shift);

This is basically a rehash of lookup_address_in_pgd(), and doesn't provide
exactly what KVM needs.  E.g. KVM works with levels instead of shifts, and
it would be nice to provide the pte so that KVM can sanity check that the
pfn from this walk matches the pfn it plans on mapping.

Instead of exporting dev_pagemap_mapping_shift(), what about replacing it
with a patch to introduce lookup_address_mm() and export that?
dev_pagemap_mapping_shift() could then wrap the new helper (if you want),
and KVM could do lookup_address_mm() for querying the size of ZONE_DEVICE
pages.