Received-SPF: pass (google.com: domain of linux-kernel+bounces-4-linux.lists.archive=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99;
Precedence: bulk
MIME-Version: 1.0
References: <20231130201504.2322355-1-pasha.tatashin@soleen.com>
 <20231130201504.2322355-2-pasha.tatashin@soleen.com> <776e17af-ae25-16a0-f443-66f3972b00c0@google.com>
In-Reply-To: <776e17af-ae25-16a0-f443-66f3972b00c0@google.com>
From: Pasha Tatashin <pasha.tatashin@soleen.com>
Date: Thu, 14 Dec 2023 14:16:44 -0500
Message-ID: <CA+CK2bA8iJ_w8CSx2Ed=d2cVSujrC0-TpO7U9j+Ow-gfk1nyfQ@mail.gmail.com>
Subject: Re: [PATCH v2 01/10] iommu/vt-d: add wrapper functions for page allocations
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, alim.akhtar@samsung.com, 
	alyssa@rosenzweig.io, asahi@lists.linux.dev, baolu.lu@linux.intel.com, 
	bhelgaas@google.com, cgroups@vger.kernel.org, corbet@lwn.net, 
	david@redhat.com, dwmw2@infradead.org, hannes@cmpxchg.org, heiko@sntech.de, 
	iommu@lists.linux.dev, jernej.skrabec@gmail.com, jonathanh@nvidia.com, 
	joro@8bytes.org, krzysztof.kozlowski@linaro.org, linux-doc@vger.kernel.org, 
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, 
	linux-mm@kvack.org, linux-rockchip@lists.infradead.org, 
	linux-samsung-soc@vger.kernel.org, linux-sunxi@lists.linux.dev, 
	linux-tegra@vger.kernel.org, lizefan.x@bytedance.com, marcan@marcan.st, 
	mhiramat@kernel.org, m.szyprowski@samsung.com, paulmck@kernel.org, 
	rdunlap@infradead.org, robin.murphy@arm.com, samuel@sholland.org, 
	suravee.suthikulpanit@amd.com, sven@svenpeter.dev, thierry.reding@gmail.com, 
	tj@kernel.org, tomas.mudrunka@gmail.com, vdumpa@nvidia.com, wens@csie.org, 
	will@kernel.org, yu-cheng.yu@intel.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Dec 14, 2023 at 12:58 PM David Rientjes <rientjes@google.com> wrote:
>
> On Thu, 30 Nov 2023, Pasha Tatashin wrote:
>
> > diff --git a/drivers/iommu/iommu-pages.h b/drivers/iommu/iommu-pages.h
> > new file mode 100644
> > index 000000000000..2332f807d514
> > --- /dev/null
> > +++ b/drivers/iommu/iommu-pages.h
> > @@ -0,0 +1,199 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (c) 2023, Google LLC.
> > + * Pasha Tatashin <pasha.tatashin@soleen.com>
> > + */
> > +
> > +#ifndef __IOMMU_PAGES_H
> > +#define __IOMMU_PAGES_H
> > +
> > +#include <linux/vmstat.h>
> > +#include <linux/gfp.h>
> > +#include <linux/mm.h>
> > +
> > +/*
> > + * All page allocation that are performed in the IOMMU subsystem must use one of
> > + * the functions below.  This is necessary for the proper accounting as IOMMU
> > + * state can be rather large, i.e. multiple gigabytes in size.
> > + */
> > +
> > +/**
> > + * __iommu_alloc_pages_node - allocate a zeroed page of a given order from
> > + * specific NUMA node.
> > + * @nid: memory NUMA node id
>
> NUMA_NO_NODE if no locality requirements?

If no locality is required, there is a better interface:
__iommu_alloc_pages(). It also consults the calling process's
policies to determine the proper NUMA node when none is
specified. However, when policies should be ignored and no locality
is required, NUMA_NO_NODE can be passed.

>
> > + * @gfp: buddy allocator flags
> > + * @order: page order
> > + *
> > + * returns the head struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_pages_node(int nid, gfp_t gfp,
> > +                                                 int order)
> > +{
> > +     struct page *pages;
>
> s/pages/page/ here and later in this file.

In this file, where there is a page with an "order", I refer to it as
"pages"; when there is no order (i.e. order = 0), I refer to it as "page".

I.e.: __iommu_alloc_page vs. __iommu_alloc_pages

>
> > +
> > +     pages = alloc_pages_node(nid, gfp | __GFP_ZERO, order);
> > +     if (!pages)
>
> unlikely()?

Will add it.
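
For reference, a rough userspace sketch of what the hint does. This is
not the patch code: unlikely() is redefined here on top of the GCC/Clang
__builtin_expect() builtin, and demo_alloc_zeroed() is a hypothetical
stand-in for the alloc_pages_node(..., gfp | __GFP_ZERO, ...) call, with
calloc() providing the zeroing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's unlikely() branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper, not part of the patch: returns zeroed memory,
 * with the failure branch marked as the cold path.
 */
static void *demo_alloc_zeroed(size_t size)
{
	void *p = calloc(1, size);

	if (unlikely(!p))
		return NULL;

	return p;
}
```

The hint only affects code layout and branch prediction; the return
value is the same with or without it.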

>
> > +             return NULL;
> > +
> > +     return pages;
> > +}
> > +
> > +/**
> > + * __iommu_alloc_pages - allocate a zeroed page of a given order.
> > + * @gfp: buddy allocator flags
> > + * @order: page order
> > + *
> > + * returns the head struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_pages(gfp_t gfp, int order)
> > +{
> > +     struct page *pages;
> > +
> > +     pages = alloc_pages(gfp | __GFP_ZERO, order);
> > +     if (!pages)
> > +             return NULL;
> > +
> > +     return pages;
> > +}
> > +
> > +/**
> > + * __iommu_alloc_page_node - allocate a zeroed page at specific NUMA node.
> > + * @nid: memory NUMA node id
> > + * @gfp: buddy allocator flags
> > + *
> > + * returns the struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_page_node(int nid, gfp_t gfp)
> > +{
> > +     return __iommu_alloc_pages_node(nid, gfp, 0);
> > +}
> > +
> > +/**
> > + * __iommu_alloc_page - allocate a zeroed page
> > + * @gfp: buddy allocator flags
> > + *
> > + * returns the struct page of the allocated page.
> > + */
> > +static inline struct page *__iommu_alloc_page(gfp_t gfp)
> > +{
> > +     return __iommu_alloc_pages(gfp, 0);
> > +}
> > +
> > +/**
> > + * __iommu_free_pages - free page of a given order
> > + * @pages: head struct page of the page
>
> I think "pages" implies more than one page, this is just a (potentially
> compound) page?

Yes, it can be more than one page: "pages" is used whenever order may be > 0.

> > +/**
> > + * iommu_free_page - free page
> > + * @virt: virtual address of the page to be freed.
> > + */
> > +static inline void iommu_free_page(void *virt)
> > +{
> > +     iommu_free_pages(virt, 0);
> > +}
> > +
> > +/**
> > + * iommu_free_pages_list - free a list of pages.
> > + * @pages: the head of the lru list to be freed.
>
> Document the locking requirements for this?

Thank you for the review. I will add info about the locking requirements;
in fact, they are very relaxed.

These pages are added to the list by unmap or remap operations in the
Intel IOMMU implementation. These calls assume that whoever is doing
those operations has exclusive access to the VA range in the page
table of that operation. The pages in this freelist belong only to the
former page tables from the IOVA range for those operations.
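
To illustrate why no lock is needed once the caller owns the list, here
is a small userspace model of the drain loop. Everything in it is an
assumption made for illustration: the list helpers are minimal
reimplementations of the kernel ones, struct demo_page models only the
lru linkage of struct page, and free() stands in for put_page().

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal userspace stand-ins for the kernel's list_head helpers. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Hypothetical stand-in for struct page: only the lru linkage is modeled. */
struct demo_page {
	struct list_head lru;
};

#define demo_list_entry(ptr) \
	((struct demo_page *)((char *)(ptr) - offsetof(struct demo_page, lru)))

/*
 * Drain the freelist from the tail, as the quoted loop does. No lock is
 * taken: the caller is assumed to be the only user of the list.
 * Returns the number of entries released.
 */
static int demo_free_pages_list(struct list_head *pages)
{
	int freed = 0;

	while (!list_empty(pages)) {
		struct demo_page *p = demo_list_entry(pages->prev);

		list_del(&p->lru);
		free(p);	/* stands in for put_page() */
		freed++;
	}

	return freed;
}
```

The model is only safe under the same assumption as above: nobody else
touches the list while it is being drained.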

> > + */
> > +static inline void iommu_free_pages_list(struct list_head *pages)
> > +{
> > +     while (!list_empty(pages)) {
> > +             struct page *p = list_entry(pages->prev, struct page, lru);
> > +
> > +             list_del(&p->lru);
> > +             put_page(p);
> > +     }
> > +}