Subject: Re: [PATCH 2/2] mm: remove extra ZONE_DEVICE struct page refcount
To: Christoph Hellwig
CC: Dan Williams, Ira Weiny, Matthew Wilcox, Jerome Glisse, John Hubbard,
    Alistair Popple, Jason Gunthorpe, Bharata B Rao, Zi Yan,
    "Kirill A . Shutemov", Yang Shi, Paul Mackerras, Ben Skeggs, Andrew Morton
References: <20200925204442.31348-1-rcampbell@nvidia.com>
    <20200925204442.31348-3-rcampbell@nvidia.com>
    <20200926064116.GB3540@lst.de>
From: Ralph Campbell
Message-ID: <78746c2f-4bbb-886f-6eb6-0daffab8be3f@nvidia.com>
Date: Mon, 28 Sep 2020 15:29:24 -0700
In-Reply-To: <20200926064116.GB3540@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On 9/25/20 11:41 PM, Christoph Hellwig wrote:
> On Fri, Sep 25, 2020 at 01:44:42PM -0700, Ralph Campbell wrote:
>> ZONE_DEVICE struct pages have an extra reference count that complicates the
>> code for put_page() and several places in the kernel that need to check the
>> reference count to see that a page is not being used (gup, compaction,
>> migration, etc.). Clean up the code so the reference count doesn't need to
>> be treated specially for ZONE_DEVICE.
>>
>> Signed-off-by: Ralph Campbell
>> ---
>>  arch/powerpc/kvm/book3s_hv_uvmem.c     |  2 +-
>>  drivers/gpu/drm/nouveau/nouveau_dmem.c |  2 +-
>>  include/linux/dax.h                    |  2 +-
>>  include/linux/memremap.h               |  7 ++-
>>  include/linux/mm.h                     | 44 --------------
>>  lib/test_hmm.c                         |  2 +-
>>  mm/gup.c                               | 44 --------------
>>  mm/internal.h                          |  8 +++
>>  mm/memremap.c                          | 82 ++++++--------------------
>>  mm/migrate.c                           |  5 --
>>  mm/page_alloc.c                        |  3 +
>>  mm/swap.c                              | 46 +++------------
>>  12 files changed, 44 insertions(+), 203 deletions(-)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
>> index 7705d5557239..e6ec98325fab 100644
>> --- a/arch/powerpc/kvm/book3s_hv_uvmem.c
>> +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
>> @@ -711,7 +711,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
>>
>>  	dpage = pfn_to_page(uvmem_pfn);
>>  	dpage->zone_device_data = pvt;
>> -	get_page(dpage);
>> +	init_page_count(dpage);
>>  	lock_page(dpage);
>>  	return dpage;
>>  out_clear:
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> index 4e8112fde3e6..ca2e3c3edc36 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
>> @@ -323,7 +323,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm)
>>  			return NULL;
>>  	}
>>
>> -	get_page(page);
>> +	init_page_count(page);
>>  	lock_page(page);
>>  	return page;
>>  }
>> diff --git a/include/linux/dax.h b/include/linux/dax.h
>> index 3f78ed78d1d6..8d29f38645aa 100644
>> --- a/include/linux/dax.h
>> +++ b/include/linux/dax.h
>> @@ -240,7 +240,7 @@ static inline bool dax_mapping(struct address_space *mapping)
>>
>>  static inline bool dax_layout_is_idle_page(struct page *page)
>>  {
>> -	return page_ref_count(page) <= 1;
>> +	return page_ref_count(page) == 0;
>>  }
>>
>>  #endif
>> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
>> index e5862746751b..f9224f88e4cd 100644
>> --- a/include/linux/memremap.h
>> +++ b/include/linux/memremap.h
>> @@ -65,9 +65,10 @@ enum memory_type {
>>
>>  struct dev_pagemap_ops {
>>  	/*
>> -	 * Called once the page refcount reaches 1. (ZONE_DEVICE pages never
>> -	 * reach 0 refcount unless there is a refcount bug. This allows the
>> -	 * device driver to implement its own memory management.)
>> +	 * Called once the page refcount reaches 0. The reference count
>> +	 * should be reset to one with init_page_count(page) before reusing
>> +	 * the page. This allows the device driver to implement its own
>> +	 * memory management.
>>  	 */
>>  	void (*page_free)(struct page *page);
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index b2f370f0b420..2159c2477aa3 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -1092,39 +1092,6 @@ static inline bool is_zone_device_page(const struct page *page)
>>  }
>>  #endif
>>
>> -#ifdef CONFIG_DEV_PAGEMAP_OPS
>> -void free_devmap_managed_page(struct page *page);
>> -DECLARE_STATIC_KEY_FALSE(devmap_managed_key);
>> -
>> -static inline bool page_is_devmap_managed(struct page *page)
>> -{
>> -	if (!static_branch_unlikely(&devmap_managed_key))
>> -		return false;
>> -	if (!is_zone_device_page(page))
>> -		return false;
>> -	switch (page->pgmap->type) {
>> -	case MEMORY_DEVICE_PRIVATE:
>> -	case MEMORY_DEVICE_FS_DAX:
>> -		return true;
>> -	default:
>> -		break;
>> -	}
>> -	return false;
>> -}
>> -
>> -void put_devmap_managed_page(struct page *page);
>> -
>> -#else /* CONFIG_DEV_PAGEMAP_OPS */
>> -static inline bool page_is_devmap_managed(struct page *page)
>> -{
>> -	return false;
>> -}
>> -
>> -static inline void put_devmap_managed_page(struct page *page)
>> -{
>> -}
>> -#endif /* CONFIG_DEV_PAGEMAP_OPS */
>> -
>>  static inline bool is_device_private_page(const struct page *page)
>>  {
>>  	return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
>> @@ -1171,17 +1138,6 @@ static inline void put_page(struct page *page)
>>  {
>>  	page = compound_head(page);
>>
>> -	/*
>> -	 * For devmap managed pages we need to catch refcount transition from
>> -	 * 2 to 1, when refcount reach one it means the page is free and we
>> -	 * need to inform the device driver through callback. See
>> -	 * include/linux/memremap.h and HMM for details.
>> -	 */
>> -	if (page_is_devmap_managed(page)) {
>> -		put_devmap_managed_page(page);
>> -		return;
>> -	}
>> -
>>  	if (put_page_testzero(page))
>>  		__put_page(page);
>>  }
>> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
>> index e7dc3de355b7..1033b19c9c52 100644
>> --- a/lib/test_hmm.c
>> +++ b/lib/test_hmm.c
>> @@ -561,7 +561,7 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
>>  	}
>>
>>  	dpage->zone_device_data = rpage;
>> -	get_page(dpage);
>> +	init_page_count(dpage);
>>  	lock_page(dpage);
>>  	return dpage;
>>
>
> Doesn't test_hmm also need to reinitialize the refcount before freeing
> the page in hmm_dmirror_exit?

The dmirror_zero_page is dead code, it isn't used.
There is a patch queued in linux-mm which removes it.
Besides, it was allocated with alloc_page() so it isn't a device private
struct page.

>>  	int error, is_ram;
>> -	bool need_devmap_managed = true;
>>
>>  	switch (pgmap->type) {
>>  	case MEMORY_DEVICE_PRIVATE:
>> @@ -217,11 +171,9 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
>>  		}
>>  		break;
>>  	case MEMORY_DEVICE_GENERIC:
>
> The MEMORY_DEVICE_PRIVATE cases loses the sanity check that the
> page_free method is set.

I'll add that back into memremap_pages() with other sanity checks in v3.

> Otherwise this looks good.

Thanks!
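
For readers following the thread, here is a minimal userspace sketch of the refcount convention described in the updated memremap.h comment. It assumes only what the patch states: the driver sets the count to one when it hands out a page, the page_free callback runs when the count drops to zero, and a pagemap without page_free should be rejected up front. Every model_* name is a hypothetical stand-in rather than a kernel API, and the check is only a guess at the shape of the sanity check promised for v3.

```c
/* Userspace model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stddef.h>

struct model_page {
	int refcount;	/* stands in for the struct page reference count */
	int free_calls;	/* how many times the driver callback has run */
};

struct model_pagemap_ops {
	void (*page_free)(struct model_page *page);
};

/* Driver-side callback: in this model it only records that it ran. */
static void model_page_free(struct model_page *page)
{
	page->free_calls++;
}

/* Hypothetical sanity check: refuse ops that lack a page_free method. */
static int model_check_ops(const struct model_pagemap_ops *ops)
{
	if (ops == NULL || ops->page_free == NULL)
		return -1;
	return 0;
}

/* Driver allocation path: plays the role of init_page_count(page). */
static void model_alloc(struct model_page *page)
{
	page->refcount = 1;
}

/* Take an extra reference on a page that is already in use. */
static void model_get(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount++;
}

/* Drop one reference; the callback fires on the 1 -> 0 transition. */
static void model_put(struct model_page *page,
		      const struct model_pagemap_ops *ops)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0)
		ops->page_free(page);
}
```

In this model a page that has been freed sits at a count of zero until model_alloc() resets it, which is why an idle check can compare against zero instead of one.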