2023-09-13 10:50:50

by Raghavendra K T

Subject: Re: [PATCH v2 6/9] x86/clear_huge_page: multi-page clearing

On 8/31/2023 12:19 AM, Ankur Arora wrote:
> clear_pages_rep(), clear_pages_erms() clear using string instructions.
> While clearing extents of more than a single page, we can use these
> more effectively by explicitly advertising the region-size to the
> processor.
>
> This can be used as a hint by the processor-uarch to optimize the
> clearing (ex. to avoid polluting one or more levels of the data-cache.)
>
> As a secondary benefit, string instructions are typically microcoded,
> and so it's a good idea to amortize the cost of the decode across larger
> regions.
>
> Accordingly, clear_huge_page() now does huge-page clearing in three
> parts: the neighbourhood of the faulting address, the left, and the
> right region of the neighbourhood.
>
> The local neighbourhood is cleared last to keep its cachelines hot.
>
[...]
>
> Signed-off-by: Ankur Arora <[email protected]>
> ---
> arch/x86/mm/hugetlbpage.c | 54 +++++++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>

Hello Ankur,

Just thinking out loud here (w.r.t. THP).

The V3 patchset with the uarch changes also had changes in the THP
path, where one could explicitly give hints (including non-caching
hints), and they are passed down to call incoherent clearing.

IMO, those changes logically belong to the uarch optimizations, right?
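
For reference, the clearing order described in the quoted commit message
(left region, then right region, then the neighbourhood of the faulting
address last so that its cachelines stay hot) can be sketched roughly as
below. This is only a userspace illustration, not the code from the
patch: clear_extent(), clear_huge_sketch(), the width parameter and the
4 KiB PAGE_SIZE are all assumptions made up for the example, and memset()
stands in for a multi-page clearing primitive.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB base pages */

/*
 * Hypothetical helper: clear npages contiguous pages starting at page
 * index first with a single call, so the whole extent size is visible
 * to the clearing primitive (memset() here).
 */
static void clear_extent(unsigned char *base, size_t first, size_t npages)
{
	if (npages)
		memset(base + first * PAGE_SIZE, 0, npages * PAGE_SIZE);
}

/*
 * Clear a region of total_pages pages in three parts. fault_idx is the
 * page index of the faulting address (assumed < total_pages) and width
 * is the number of pages on each side of it treated as the
 * neighbourhood (an illustrative parameter).
 */
static void clear_huge_sketch(unsigned char *base, size_t total_pages,
			      size_t fault_idx, size_t width)
{
	size_t lo = fault_idx > width ? fault_idx - width : 0;
	size_t hi = fault_idx + width + 1;

	if (hi > total_pages)
		hi = total_pages;

	clear_extent(base, 0, lo);			/* left region */
	clear_extent(base, hi, total_pages - hi);	/* right region */
	clear_extent(base, lo, hi - lo);		/* neighbourhood, last */
}
```

Clearing each part as one extent, rather than page by page, is what
lets the region size be advertised to the processor; the ordering only
matters for which cachelines are most recently touched.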