A folio is physically and virtually contiguous. If a folio has more
than one page, lru_gen_look_around() runs several times in the while
loop in folio_referenced_one(), but most of those calls are
unnecessary. When these pages belong to the same PMD and VMA,
lru_gen_look_around() scans the same range each time.
So add a variable that records the previous pvmw.address and skip the
repeated scans as long as the pages of the folio do not cross a PMD
boundary.
In most memory management code today a folio has just one page, so
the while loop in folio_referenced_one() runs only once and this
patch may not actually reduce scans. It can become more effective once
other memory management code starts grouping contiguous pages into
larger folios.
Signed-off-by: Jinyu Tang <[email protected]>
---
mm/rmap.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 2ec925e5fa6a..b11fcfe812e8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -809,6 +809,7 @@ static bool folio_referenced_one(struct folio *folio,
struct folio_referenced_arg *pra = arg;
DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
int referenced = 0;
+ unsigned long former_address = 0;
while (page_vma_mapped_walk(&pvmw)) {
address = pvmw.address;
@@ -825,7 +826,13 @@ static bool folio_referenced_one(struct folio *folio,
if (pvmw.pte) {
if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
!(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
- lru_gen_look_around(&pvmw);
+ unsigned long pmd_now = pvmw.address & PMD_MASK;
+ unsigned long pmd_former = former_address & PMD_MASK;
+
+ if ((!former_address) || (pmd_now != pmd_former)) {
+ lru_gen_look_around(&pvmw);
+ former_address = pvmw.address;
+ }
referenced++;
}
--
2.30.2
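For readers following the thread, the condition added by the patch can be tried in userspace. This is only a sketch: TOY_PMD_SHIFT, TOY_PMD_MASK and toy_should_look_around() are hypothetical names, and the 2 MiB PMD size is an assumption (x86-64 with 4 KiB pages), not something the patch states.

```c
#include <stdbool.h>

/* Assumption: one PMD entry covers 2 MiB (PMD_SHIFT == 21). */
#define TOY_PMD_SHIFT 21
#define TOY_PMD_MASK  (~((1UL << TOY_PMD_SHIFT) - 1))

/*
 * Hypothetical helper mirroring the test in the patch: scan when
 * nothing has been scanned yet (former_address == 0) or when the
 * current address falls in a different PMD range than the last
 * scanned address.
 */
static bool toy_should_look_around(unsigned long former_address,
				   unsigned long address)
{
	unsigned long pmd_now = address & TOY_PMD_MASK;
	unsigned long pmd_former = former_address & TOY_PMD_MASK;

	return !former_address || pmd_now != pmd_former;
}
```

With these assumptions, toy_should_look_around(0x200000, 0x201000) is false because both addresses share one PMD range, while toy_should_look_around(0x3ff000, 0x400000) is true because the second address starts a new one. Note that 0 doubles as the "nothing scanned yet" sentinel, exactly as in the patch.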
On Sun, Jan 15, 2023 at 5:57 AM Jinyu Tang <[email protected]> wrote:
>
> The folio is physically and virtually contiguous. If a folio have
> more than one pages, lru_gen_look_around() will run several times in
> the while loop in folio_referenced_one(), but most of times is
> unnecessary. Because these pages always belong to the same pmd and
> vma, lru_gen_look_around() will scan the same range.
Thanks -- the commit message is quite clear, so I think I understand
what you're thinking.
Let me clarify:
1. First of all, there are no repeated scans, because after
lru_gen_look_around() clears the A-bit in a range, the pte_young()
test stops it from going into the same range again.
2. Of course, pte_young() can become true later, but this is not a
problem because it's cache hot.
3. Physically and virtually contiguous mapping existed before folios:
a THP can be mapped by 512 PTEs.
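Point 1 can be illustrated with a small userspace model. It is a sketch, not kernel code: struct toy_range and the toy_*() functions are hypothetical, and the model assumes a single look-around covers every PTE in the modelled range, whereas the real lru_gen_look_around() limits its window.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PTES 16

/* Hypothetical stand-in for the accessed bits of the PTEs mapping one folio. */
struct toy_range {
	bool young[TOY_NR_PTES];
	int scans;	/* number of look-around calls made */
};

/* Clears every accessed bit in the range and counts the scan. */
static void toy_look_around(struct toy_range *r)
{
	for (size_t i = 0; i < TOY_NR_PTES; i++)
		r->young[i] = false;
	r->scans++;
}

/* Walks the PTEs like the unpatched loop: look around only if the PTE is young. */
static int toy_walk(struct toy_range *r)
{
	int referenced = 0;

	for (size_t i = 0; i < TOY_NR_PTES; i++) {
		if (r->young[i]) {
			toy_look_around(r);
			referenced++;
		}
	}
	return referenced;
}

/* Marks every PTE young, walks the range once, and returns the scan count. */
static int toy_scans_for_all_young(void)
{
	struct toy_range r = { .scans = 0 };

	for (size_t i = 0; i < TOY_NR_PTES; i++)
		r.young[i] = true;
	toy_walk(&r);
	return r.scans;
}
```

In this model toy_scans_for_all_young() performs one scan even though all 16 PTEs start out young: the first look-around clears the remaining accessed bits, so the young test fails for every later PTE in the walk.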
> while (page_vma_mapped_walk(&pvmw)) {
> address = pvmw.address;
> @@ -825,7 +826,13 @@ static bool folio_referenced_one(struct folio *folio,
> if (pvmw.pte) {
> if (lru_gen_enabled() && pte_young(*pvmw.pte) &&
> !(vma->vm_flags & (VM_SEQ_READ | VM_RAND_READ))) {
> - lru_gen_look_around(&pvmw);
> + unsigned long pmd_now = pvmw.address & PMD_MASK;
> + unsigned long pmd_former = former_address & PMD_MASK;
> +
> + if ((!former_address) || (pmd_now != pmd_former)) {
> + lru_gen_look_around(&pvmw);
> + former_address = pvmw.address;
> + }
> referenced++;
> }