Date: Fri, 9 Oct 2020 09:53:50 -0700
From: Ira Weiny
To: Ralph Campbell
Cc: linux-mm@kvack.org, kvm-ppc@vger.kernel.org,
 nouveau@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 Dan Williams, Matthew Wilcox, Jerome Glisse, John Hubbard,
 Alistair Popple, Christoph Hellwig, Jason Gunthorpe, Bharata
 B Rao, Zi Yan, "Kirill A . Shutemov", Yang Shi, Paul Mackerras,
 Ben Skeggs, Andrew Morton
Subject: Re: [PATCH] mm: make device private reference counts zero based
Message-ID: <20201009165350.GV2046448@iweiny-DESK2.sc.intel.com>
References: <20201008172544.29905-1-rcampbell@nvidia.com>
In-Reply-To: <20201008172544.29905-1-rcampbell@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 08, 2020 at 10:25:44AM -0700, Ralph Campbell wrote:
> ZONE_DEVICE struct pages have an extra reference count that complicates the
> code for put_page() and several places in the kernel that need to check the
> reference count to see that a page is not being used (gup, compaction,
> migration, etc.). Clean up the code so the reference count doesn't need to
> be treated specially for device private pages, leaving DAX as still being
> a special case.

What about the check in mc_handle_swap_pte()?

mm/memcontrol.c:

	/*
	 * MEMORY_DEVICE_PRIVATE means ZONE_DEVICE page and which have
	 * a refcount of 1 when free (unlike normal page)
	 */
	if (!page_ref_add_unless(page, 1, 1))
		return NULL;

... does that need to change?  Perhaps just the comment?

>
> Signed-off-by: Ralph Campbell
> ---
> [snip]
>
>  void put_devmap_managed_page(struct page *page);
> diff --git a/lib/test_hmm.c b/lib/test_hmm.c
> index e151a7f10519..bf92a261fa6f 100644
> --- a/lib/test_hmm.c
> +++ b/lib/test_hmm.c
> @@ -509,10 +509,15 @@ static bool dmirror_allocate_chunk(struct dmirror_device *mdevice,
>  		mdevice->devmem_count * (DEVMEM_CHUNK_SIZE / (1024 * 1024)),
>  		pfn_first, pfn_last);
>
> +	/*
> +	 * Pages are created with an initial reference count of one but should
> +	 * have a reference count of zero while in the free state.
> +	 */
>  	spin_lock(&mdevice->lock);
>  	for (pfn = pfn_first; pfn < pfn_last; pfn++) {
>  		struct page *page = pfn_to_page(pfn);
>
> +		set_page_count(page, 0);

This confuses me.  How do this and init_page_count() not confuse the buddy
allocator?  Don't you have to reset the refcount somewhere after the test?

>  		page->zone_device_data = mdevice->free_pages;
>  		mdevice->free_pages = page;
>  	}
> @@ -561,7 +566,7 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice)
>  	}
>
>  	dpage->zone_device_data = rpage;
> -	get_page(dpage);
> +	init_page_count(dpage);
>  	lock_page(dpage);
>  	return dpage;
>
> diff --git a/mm/internal.h b/mm/internal.h
> index c43ccdddb0f6..e1443b73aa9b 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> [snip]
> diff --git a/mm/swap.c b/mm/swap.c
> index 0eb057141a04..93d880c6f73c 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -116,12 +116,11 @@ static void __put_compound_page(struct page *page)
>  void __put_page(struct page *page)
>  {
>  	if (is_zone_device_page(page)) {
> -		put_dev_pagemap(page->pgmap);
> -
>  		/*
>  		 * The page belongs to the device that created pgmap.  Do
>  		 * not return it to page allocator.
>  		 */
> +		free_zone_device_page(page);

I really like this.

Ira
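
The mc_handle_swap_pte() question above can be sketched in userspace.  This
is not kernel code and not part of the patch: struct model_page and the
model_* helpers are hypothetical names, and the only fact assumed about the
kernel is that page_ref_add_unless(page, nr, u) adds nr to the reference
count unless the count currently equals u.  Under the one-based convention a
free device private page sits at 1, so u = 1 refuses it.  If free pages sit
at 0 instead, the same call would take a reference on a free page, and u = 0
would be the value that refuses it.  Whether the patch really needs that
change is the open question in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct model_page {
	atomic_int refcount;
};

/*
 * Userspace model of the "add unless" operation: add nr to the count
 * unless it currently equals u.  Returns true if the add happened.
 */
bool model_ref_add_unless(struct model_page *page, int nr, int u)
{
	int old = atomic_load(&page->refcount);

	while (old != u) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&page->refcount, &old, old + nr))
			return true;
	}
	return false;
}

int model_ref_count(struct model_page *page)
{
	return atomic_load(&page->refcount);
}

/* Walks through the three cases discussed in the lead-in. */
void model_demo(void)
{
	struct model_page page;

	/* One-based convention: a free page has a count of 1 and is refused. */
	atomic_init(&page.refcount, 1);
	assert(!model_ref_add_unless(&page, 1, 1));

	/* Zero-based free page with the old u = 1 check: NOT refused. */
	atomic_store(&page.refcount, 0);
	assert(model_ref_add_unless(&page, 1, 1));
	assert(model_ref_count(&page) == 1);

	/* Zero-based free page with u = 0: refused again. */
	atomic_store(&page.refcount, 0);
	assert(!model_ref_add_unless(&page, 1, 0));
}
```

Calling model_demo() from a small main() runs all three cases; every assert
is expected to hold.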