2023-08-02 16:36:40

by Matthew Wilcox

Subject: [PATCH v6 00/38] New page table range API

For some reason, I didn't get an email when v5 was dropped from linux-next
(I got an email from Andrew saying he was going to drop it, but when the
automated emails never arrived, I assumed he'd changed his mind). So here's
v6, far later than I would otherwise have resent it.

This patchset changes the API used by the MM to set up page table entries.
The four APIs are:
set_ptes(mm, addr, ptep, pte, nr)
update_mmu_cache_range(vmf, vma, addr, ptep, nr)
flush_dcache_folio(folio)
flush_icache_pages(vma, page, nr)

flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
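
Concretely, that compatibility pattern looks roughly like the sketch below
(illustrative only; the literal definitions live in the individual patches
and the generic headers):

#define set_pte_at(mm, addr, ptep, pte) \
	set_ptes(mm, addr, ptep, pte, 1)
#define update_mmu_cache(vma, addr, ptep) \
	update_mmu_cache_range(NULL, vma, addr, ptep, 1)
#define flush_icache_page(vma, page) \
	flush_icache_pages(vma, page, 1)

static inline void flush_dcache_page(struct page *page)
{
	flush_dcache_folio(page_folio(page));
}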

The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA,
so ptep++ is a legitimate operation, and locking is taken care of for
you. Some architectures can do a better job of it than just a loop,
but I have hesitated to make too deep a change to architectures I don't
understand well.
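
For architectures that don't provide a better implementation, the fallback
is essentially the loop below (a simplified sketch based on the
per-architecture loops in this series; the real definition in
include/linux/pgtable.h is authoritative):

/* Write nr consecutive PTEs, advancing the pfn by one page each time. */
static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
		pte_t *ptep, pte_t pte, unsigned int nr)
{
	arch_enter_lazy_mmu_mode();
	for (;;) {
		set_pte(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
	}
	arch_leave_lazy_mmu_mode();
}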

One thing I have changed in every architecture is that PG_arch_1 is
now a per-folio bit instead of a per-page bit when used for dcache
clean/dirty tracking. This was something that would have to happen
eventually, and it makes more sense to do it now than to iterate over
every page involved in a cache flush and work out whether each one still
needs flushing.
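
The per-folio pattern that results is roughly the following (a condensed
sketch; flush_one_page() is a made-up stand-in for each architecture's
real per-page flush primitive):

static void flush_dcache_folio_sketch(struct folio *folio)
{
	unsigned int i, nr = folio_nr_pages(folio);

	/* One deferred-flush bit now covers the whole folio. */
	if (!test_and_clear_bit(PG_arch_1, &folio->flags))
		return;

	for (i = 0; i < nr; i++)
		flush_one_page(folio_page(folio, i));
}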

The point of all this is better performance, and Fengwei Yin has
measured an improvement on x86. I suspect you'll see an improvement on
your architecture too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/[email protected]/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE enabled.

This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the
last few months.

v6:
- Fix other definitions of in_range(); either renaming them to not conflict
or removing them and using the new common definition.
- Use arch_enter_lazy_mmu_mode() and arch_leave_lazy_mmu_mode()
- Fix a compiler warning in the riscv implementation
- Update to next-20230731

v5:
- Add in_range() macro
- Fix numerous compilation problems on minority architectures (thanks LKP!)
- Add the 'vmf' argument to update_mmu_cache_range() to help MIPS
and other architectures that insert TLB entries in software,
rather than using a hardware page table walker
- Get rid of first_map_page() and next_map_page(); use
next_uptodate_folio() directly
- Actually move the mmap_miss accounting in filemap.c
- Add kernel-doc for set_pte_range()
- Correct determination of prefaulting in set_pte_range()
- More Acked & Reviewed tags

v4:
- Fix a few compile errors (mostly Mike Rapoport)
- Incorporate Mike's suggestion to avoid having to define set_ptes()
or set_pte_at() on the majority of architectures
- Optimise m68k's __flush_pages_to_ram (Geert Uytterhoeven)
- Fix sun3 (me)
- Fix sparc32 (me)
- Pick up a few more Ack/Reviewed tags

v3:
- Reinstate flush_dcache_icache_phys() on PowerPC
- Fix folio_flush_mapping(). The documentation was correct and the
implementation was completely wrong
- Change the flush_dcache_page() documentation to describe
flush_dcache_folio() instead
- Split ARM from ARC. I messed up my git commands
- Remove page_mapping_file()
- Rationalise how flush_icache_pages() and flush_icache_page() are defined
- Use flush_icache_pages() in do_set_pmd()
- Pick up Guo Ren's Ack for csky

Matthew Wilcox (Oracle) (34):
minmax: Add in_range() macro
mm: Convert page_table_check_pte_set() to page_table_check_ptes_set()
mm: Add generic flush_icache_pages() and documentation
mm: Add folio_flush_mapping()
mm: Remove ARCH_IMPLEMENTS_FLUSH_DCACHE_FOLIO
mm: Add default definition of set_ptes()
alpha: Implement the new page table range API
arc: Implement the new page table range API
arm: Implement the new page table range API
arm64: Implement the new page table range API
csky: Implement the new page table range API
hexagon: Implement the new page table range API
ia64: Implement the new page table range API
loongarch: Implement the new page table range API
m68k: Implement the new page table range API
microblaze: Implement the new page table range API
mips: Implement the new page table range API
nios2: Implement the new page table range API
openrisc: Implement the new page table range API
parisc: Implement the new page table range API
powerpc: Implement the new page table range API
riscv: Implement the new page table range API
s390: Implement the new page table range API
sh: Implement the new page table range API
sparc32: Implement the new page table range API
sparc64: Implement the new page table range API
um: Implement the new page table range API
x86: Implement the new page table range API
xtensa: Implement the new page table range API
mm: Remove page_mapping_file()
mm: Rationalise flush_icache_pages() and flush_icache_page()
mm: Tidy up set_ptes definition
mm: Use flush_icache_pages() in do_set_pmd()
mm: Call update_mmu_cache_range() in more page fault handling paths

Yin Fengwei (4):
filemap: Add filemap_map_folio_range()
rmap: add folio_add_file_rmap_range()
mm: Convert do_set_pte() to set_pte_range()
filemap: Batch PTE mappings

Documentation/core-api/cachetlb.rst | 55 ++++----
Documentation/filesystems/locking.rst | 2 +-
arch/alpha/include/asm/cacheflush.h | 13 +-
arch/alpha/include/asm/pgtable.h | 10 +-
arch/arc/include/asm/cacheflush.h | 14 +-
arch/arc/include/asm/pgtable-bits-arcv2.h | 12 +-
arch/arc/include/asm/pgtable-levels.h | 1 +
arch/arc/mm/cache.c | 61 +++++----
arch/arc/mm/tlb.c | 18 ++-
arch/arm/include/asm/cacheflush.h | 29 ++---
arch/arm/include/asm/pgtable.h | 5 +-
arch/arm/include/asm/tlbflush.h | 14 +-
arch/arm/mm/copypage-v4mc.c | 5 +-
arch/arm/mm/copypage-v6.c | 5 +-
arch/arm/mm/copypage-xscale.c | 5 +-
arch/arm/mm/dma-mapping.c | 24 ++--
arch/arm/mm/fault-armv.c | 16 +--
arch/arm/mm/flush.c | 99 ++++++++------
arch/arm/mm/mm.h | 2 +-
arch/arm/mm/mmu.c | 14 +-
arch/arm/mm/nommu.c | 6 +
arch/arm/mm/pageattr.c | 6 +-
arch/arm64/include/asm/cacheflush.h | 4 +-
arch/arm64/include/asm/pgtable.h | 26 +++-
arch/arm64/mm/flush.c | 36 ++---
arch/csky/abiv1/cacheflush.c | 32 +++--
arch/csky/abiv1/inc/abi/cacheflush.h | 3 +-
arch/csky/abiv2/cacheflush.c | 32 ++---
arch/csky/abiv2/inc/abi/cacheflush.h | 11 +-
arch/csky/include/asm/pgtable.h | 8 +-
arch/hexagon/include/asm/cacheflush.h | 10 +-
arch/hexagon/include/asm/pgtable.h | 9 +-
arch/ia64/hp/common/sba_iommu.c | 26 ++--
arch/ia64/include/asm/cacheflush.h | 14 +-
arch/ia64/include/asm/pgtable.h | 4 +-
arch/ia64/mm/init.c | 28 ++--
arch/loongarch/include/asm/cacheflush.h | 1 -
arch/loongarch/include/asm/pgtable-bits.h | 4 +-
arch/loongarch/include/asm/pgtable.h | 33 ++---
arch/loongarch/mm/pgtable.c | 2 +-
arch/loongarch/mm/tlb.c | 2 +-
arch/m68k/include/asm/cacheflush_mm.h | 26 ++--
arch/m68k/include/asm/mcf_pgtable.h | 1 +
arch/m68k/include/asm/motorola_pgtable.h | 1 +
arch/m68k/include/asm/pgtable_mm.h | 10 +-
arch/m68k/include/asm/sun3_pgtable.h | 1 +
arch/m68k/mm/motorola.c | 2 +-
arch/microblaze/include/asm/cacheflush.h | 8 ++
arch/microblaze/include/asm/pgtable.h | 15 +--
arch/microblaze/include/asm/tlbflush.h | 4 +-
arch/mips/bcm47xx/prom.c | 2 +-
arch/mips/include/asm/cacheflush.h | 32 +++--
arch/mips/include/asm/pgtable-32.h | 10 +-
arch/mips/include/asm/pgtable-64.h | 6 +-
arch/mips/include/asm/pgtable-bits.h | 6 +-
arch/mips/include/asm/pgtable.h | 63 +++++----
arch/mips/mm/c-r4k.c | 5 +-
arch/mips/mm/cache.c | 56 ++++----
arch/mips/mm/init.c | 21 +--
arch/mips/mm/pgtable-32.c | 2 +-
arch/mips/mm/pgtable-64.c | 2 +-
arch/mips/mm/tlbex.c | 2 +-
arch/nios2/include/asm/cacheflush.h | 6 +-
arch/nios2/include/asm/pgtable.h | 28 ++--
arch/nios2/mm/cacheflush.c | 79 ++++++-----
arch/openrisc/include/asm/cacheflush.h | 8 +-
arch/openrisc/include/asm/pgtable.h | 15 ++-
arch/openrisc/mm/cache.c | 12 +-
arch/parisc/include/asm/cacheflush.h | 14 +-
arch/parisc/include/asm/pgtable.h | 37 ++++--
arch/parisc/kernel/cache.c | 107 ++++++++++-----
arch/powerpc/include/asm/book3s/32/pgtable.h | 5 -
arch/powerpc/include/asm/book3s/64/pgtable.h | 6 +-
arch/powerpc/include/asm/book3s/pgtable.h | 11 +-
arch/powerpc/include/asm/cacheflush.h | 14 +-
arch/powerpc/include/asm/kvm_ppc.h | 10 +-
arch/powerpc/include/asm/nohash/pgtable.h | 16 +--
arch/powerpc/include/asm/pgtable.h | 12 ++
arch/powerpc/mm/book3s64/hash_utils.c | 11 +-
arch/powerpc/mm/cacheflush.c | 40 ++----
arch/powerpc/mm/nohash/e500_hugetlbpage.c | 3 +-
arch/powerpc/mm/pgtable.c | 53 ++++----
arch/riscv/include/asm/cacheflush.h | 19 ++-
arch/riscv/include/asm/pgtable.h | 37 ++++--
arch/riscv/mm/cacheflush.c | 13 +-
arch/s390/include/asm/pgtable.h | 33 +++--
arch/sh/include/asm/cacheflush.h | 21 ++-
arch/sh/include/asm/pgtable.h | 7 +-
arch/sh/include/asm/pgtable_32.h | 5 +-
arch/sh/mm/cache-j2.c | 4 +-
arch/sh/mm/cache-sh4.c | 26 ++--
arch/sh/mm/cache-sh7705.c | 26 ++--
arch/sh/mm/cache.c | 52 ++++----
arch/sh/mm/kmap.c | 3 +-
arch/sparc/include/asm/cacheflush_32.h | 10 +-
arch/sparc/include/asm/cacheflush_64.h | 19 +--
arch/sparc/include/asm/pgtable_32.h | 8 +-
arch/sparc/include/asm/pgtable_64.h | 29 ++++-
arch/sparc/kernel/smp_64.c | 56 +++++---
arch/sparc/mm/init_32.c | 13 +-
arch/sparc/mm/init_64.c | 78 ++++++-----
arch/sparc/mm/tlb.c | 5 +-
arch/um/include/asm/pgtable.h | 7 +-
arch/x86/include/asm/pgtable.h | 14 +-
arch/xtensa/include/asm/cacheflush.h | 11 +-
arch/xtensa/include/asm/pgtable.h | 18 ++-
arch/xtensa/mm/cache.c | 83 +++++++-----
.../drm/arm/display/include/malidp_utils.h | 2 +-
.../display/komeda/komeda_pipeline_state.c | 24 ++--
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 -
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 18 +--
drivers/virt/acrn/ioreq.c | 4 +-
fs/btrfs/misc.h | 2 -
fs/ext2/balloc.c | 2 -
fs/ext4/ext4.h | 2 -
fs/ufs/util.h | 6 -
include/asm-generic/cacheflush.h | 7 -
include/linux/cacheflush.h | 13 +-
include/linux/minmax.h | 27 ++++
include/linux/mm.h | 3 +-
include/linux/page_table_check.h | 13 +-
include/linux/pagemap.h | 28 ++--
include/linux/pgtable.h | 75 ++++++++---
include/linux/rmap.h | 2 +
lib/logic_pio.c | 3 -
mm/filemap.c | 123 ++++++++++--------
mm/memory.c | 56 ++++----
mm/page_table_check.c | 16 ++-
mm/rmap.c | 60 +++++++--
mm/util.c | 2 +-
net/netfilter/nf_nat_core.c | 6 +-
net/tipc/core.h | 2 +-
net/tipc/link.c | 10 +-
.../selftests/bpf/progs/get_branch_snapshot.c | 4 +-
134 files changed, 1520 insertions(+), 1056 deletions(-)

--
2.40.1



2023-08-02 16:37:39

by Matthew Wilcox

Subject: [PATCH v6 29/38] xtensa: Implement the new page table range API

Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and
flush_icache_pages().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: [email protected]
---
arch/xtensa/include/asm/cacheflush.h | 9 ++-
arch/xtensa/include/asm/pgtable.h | 18 +++---
arch/xtensa/mm/cache.c | 83 ++++++++++++++++------------
3 files changed, 63 insertions(+), 47 deletions(-)

diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h
index 7b4359312c25..35153f6725e4 100644
--- a/arch/xtensa/include/asm/cacheflush.h
+++ b/arch/xtensa/include/asm/cacheflush.h
@@ -119,8 +119,14 @@ void flush_cache_page(struct vm_area_struct*,
#define flush_cache_vmap(start,end) flush_cache_all()
#define flush_cache_vunmap(start,end) flush_cache_all()

+void flush_dcache_folio(struct folio *folio);
+#define flush_dcache_folio flush_dcache_folio
+
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
-void flush_dcache_page(struct page *);
+static inline void flush_dcache_page(struct page *page)
+{
+ flush_dcache_folio(page_folio(page));
+}

void local_flush_cache_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
@@ -156,6 +162,7 @@ void local_flush_cache_page(struct vm_area_struct *vma,

/* This is not required, see Documentation/core-api/cachetlb.rst */
#define flush_icache_page(vma,page) do { } while (0)
+#define flush_icache_pages(vma, page, nr) do { } while (0)

#define flush_dcache_mmap_lock(mapping) do { } while (0)
#define flush_dcache_mmap_unlock(mapping) do { } while (0)
diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h
index fc7a14884c6c..ef79cb6c20dc 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -274,6 +274,7 @@ static inline pte_t pte_mkwrite(pte_t pte)
* and a page entry and page directory to the page they refer to.
*/

+#define PFN_PTE_SHIFT PAGE_SHIFT
#define pte_pfn(pte) (pte_val(pte) >> PAGE_SHIFT)
#define pte_same(a,b) (pte_val(a) == pte_val(b))
#define pte_page(x) pfn_to_page(pte_pfn(x))
@@ -301,15 +302,9 @@ static inline void update_pte(pte_t *ptep, pte_t pteval)

struct mm_struct;

-static inline void
-set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pteval)
-{
- update_pte(ptep, pteval);
-}
-
-static inline void set_pte(pte_t *ptep, pte_t pteval)
+static inline void set_pte(pte_t *ptep, pte_t pte)
{
- update_pte(ptep, pteval);
+ update_pte(ptep, pte);
}

static inline void
@@ -407,8 +402,11 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)

#else

-extern void update_mmu_cache(struct vm_area_struct * vma,
- unsigned long address, pte_t *ptep);
+struct vm_fault;
+void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep, unsigned int nr);
+#define update_mmu_cache(vma, address, ptep) \
+ update_mmu_cache_range(NULL, vma, address, ptep, 1)

typedef pte_t *pte_addr_t;

diff --git a/arch/xtensa/mm/cache.c b/arch/xtensa/mm/cache.c
index 19e5a478a7e8..7ec66a79f472 100644
--- a/arch/xtensa/mm/cache.c
+++ b/arch/xtensa/mm/cache.c
@@ -121,9 +121,9 @@ EXPORT_SYMBOL(copy_user_highpage);
*
*/

-void flush_dcache_page(struct page *page)
+void flush_dcache_folio(struct folio *folio)
{
- struct address_space *mapping = page_mapping_file(page);
+ struct address_space *mapping = folio_flush_mapping(folio);

/*
* If we have a mapping but the page is not mapped to user-space
@@ -132,14 +132,14 @@ void flush_dcache_page(struct page *page)
*/

if (mapping && !mapping_mapped(mapping)) {
- if (!test_bit(PG_arch_1, &page->flags))
- set_bit(PG_arch_1, &page->flags);
+ if (!test_bit(PG_arch_1, &folio->flags))
+ set_bit(PG_arch_1, &folio->flags);
return;

} else {
-
- unsigned long phys = page_to_phys(page);
- unsigned long temp = page->index << PAGE_SHIFT;
+ unsigned long phys = folio_pfn(folio) * PAGE_SIZE;
+ unsigned long temp = folio_pos(folio);
+ unsigned int i, nr = folio_nr_pages(folio);
unsigned long alias = !(DCACHE_ALIAS_EQ(temp, phys));
unsigned long virt;

@@ -154,22 +154,26 @@ void flush_dcache_page(struct page *page)
return;

preempt_disable();
- virt = TLBTEMP_BASE_1 + (phys & DCACHE_ALIAS_MASK);
- __flush_invalidate_dcache_page_alias(virt, phys);
+ for (i = 0; i < nr; i++) {
+ virt = TLBTEMP_BASE_1 + (phys & DCACHE_ALIAS_MASK);
+ __flush_invalidate_dcache_page_alias(virt, phys);

- virt = TLBTEMP_BASE_1 + (temp & DCACHE_ALIAS_MASK);
+ virt = TLBTEMP_BASE_1 + (temp & DCACHE_ALIAS_MASK);

- if (alias)
- __flush_invalidate_dcache_page_alias(virt, phys);
+ if (alias)
+ __flush_invalidate_dcache_page_alias(virt, phys);

- if (mapping)
- __invalidate_icache_page_alias(virt, phys);
+ if (mapping)
+ __invalidate_icache_page_alias(virt, phys);
+ phys += PAGE_SIZE;
+ temp += PAGE_SIZE;
+ }
preempt_enable();
}

/* There shouldn't be an entry in the cache for this page anymore. */
}
-EXPORT_SYMBOL(flush_dcache_page);
+EXPORT_SYMBOL(flush_dcache_folio);

/*
* For now, flush the whole cache. FIXME??
@@ -207,45 +211,52 @@ EXPORT_SYMBOL(local_flush_cache_page);

#endif /* DCACHE_WAY_SIZE > PAGE_SIZE */

-void
-update_mmu_cache(struct vm_area_struct * vma, unsigned long addr, pte_t *ptep)
+void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
+ unsigned long addr, pte_t *ptep, unsigned int nr)
{
unsigned long pfn = pte_pfn(*ptep);
- struct page *page;
+ struct folio *folio;
+ unsigned int i;

if (!pfn_valid(pfn))
return;

- page = pfn_to_page(pfn);
+ folio = page_folio(pfn_to_page(pfn));

- /* Invalidate old entry in TLBs */
-
- flush_tlb_page(vma, addr);
+ /* Invalidate old entries in TLBs */
+ for (i = 0; i < nr; i++)
+ flush_tlb_page(vma, addr + i * PAGE_SIZE);
+ nr = folio_nr_pages(folio);

#if (DCACHE_WAY_SIZE > PAGE_SIZE)

- if (!PageReserved(page) && test_bit(PG_arch_1, &page->flags)) {
- unsigned long phys = page_to_phys(page);
+ if (!folio_test_reserved(folio) && test_bit(PG_arch_1, &folio->flags)) {
+ unsigned long phys = folio_pfn(folio) * PAGE_SIZE;
unsigned long tmp;

preempt_disable();
- tmp = TLBTEMP_BASE_1 + (phys & DCACHE_ALIAS_MASK);
- __flush_invalidate_dcache_page_alias(tmp, phys);
- tmp = TLBTEMP_BASE_1 + (addr & DCACHE_ALIAS_MASK);
- __flush_invalidate_dcache_page_alias(tmp, phys);
- __invalidate_icache_page_alias(tmp, phys);
+ for (i = 0; i < nr; i++) {
+ tmp = TLBTEMP_BASE_1 + (phys & DCACHE_ALIAS_MASK);
+ __flush_invalidate_dcache_page_alias(tmp, phys);
+ tmp = TLBTEMP_BASE_1 + (addr & DCACHE_ALIAS_MASK);
+ __flush_invalidate_dcache_page_alias(tmp, phys);
+ __invalidate_icache_page_alias(tmp, phys);
+ phys += PAGE_SIZE;
+ }
preempt_enable();

- clear_bit(PG_arch_1, &page->flags);
+ clear_bit(PG_arch_1, &folio->flags);
}
#else
- if (!PageReserved(page) && !test_bit(PG_arch_1, &page->flags)
+ if (!folio_test_reserved(folio) && !test_bit(PG_arch_1, &folio->flags)
&& (vma->vm_flags & VM_EXEC) != 0) {
- unsigned long paddr = (unsigned long)kmap_atomic(page);
- __flush_dcache_page(paddr);
- __invalidate_icache_page(paddr);
- set_bit(PG_arch_1, &page->flags);
- kunmap_atomic((void *)paddr);
+ for (i = 0; i < nr; i++) {
+ void *paddr = kmap_local_folio(folio, i * PAGE_SIZE);
+ __flush_dcache_page((unsigned long)paddr);
+ __invalidate_icache_page((unsigned long)paddr);
+ kunmap_local(paddr);
+ }
+ set_bit(PG_arch_1, &folio->flags);
}
#endif
}
--
2.40.1


2023-08-02 16:43:50

by Matthew Wilcox

Subject: [PATCH v6 17/38] mips: Implement the new page table range API

Rename _PFN_SHIFT to PFN_PTE_SHIFT. Convert a few places
to call set_pte() instead of set_pte_at(). Add set_ptes(),
update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page
to per-folio.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: [email protected]
---
arch/mips/bcm47xx/prom.c | 2 +-
arch/mips/include/asm/cacheflush.h | 32 +++++++++-----
arch/mips/include/asm/pgtable-32.h | 10 ++---
arch/mips/include/asm/pgtable-64.h | 6 +--
arch/mips/include/asm/pgtable-bits.h | 6 +--
arch/mips/include/asm/pgtable.h | 63 ++++++++++++++++++----------
arch/mips/mm/c-r4k.c | 5 ++-
arch/mips/mm/cache.c | 56 ++++++++++++-------------
arch/mips/mm/init.c | 21 ++++++----
arch/mips/mm/pgtable-32.c | 2 +-
arch/mips/mm/pgtable-64.c | 2 +-
arch/mips/mm/tlbex.c | 2 +-
12 files changed, 121 insertions(+), 86 deletions(-)

diff --git a/arch/mips/bcm47xx/prom.c b/arch/mips/bcm47xx/prom.c
index a9bea411d928..99a1ba5394e0 100644
--- a/arch/mips/bcm47xx/prom.c
+++ b/arch/mips/bcm47xx/prom.c
@@ -116,7 +116,7 @@ void __init prom_init(void)
#if defined(CONFIG_BCM47XX_BCMA) && defined(CONFIG_HIGHMEM)

#define EXTVBASE 0xc0000000
-#define ENTRYLO(x) ((pte_val(pfn_pte((x) >> _PFN_SHIFT, PAGE_KERNEL_UNCACHED)) >> 6) | 1)
+#define ENTRYLO(x) ((pte_val(pfn_pte((x) >> PFN_PTE_SHIFT, PAGE_KERNEL_UNCACHED)) >> 6) | 1)

#include <asm/tlbflush.h>

diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index d8d3f80f9fc0..0f389bc7cb90 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -36,12 +36,12 @@
*/
#define PG_dcache_dirty PG_arch_1

-#define Page_dcache_dirty(page) \
- test_bit(PG_dcache_dirty, &(page)->flags)
-#define SetPageDcacheDirty(page) \
- set_bit(PG_dcache_dirty, &(page)->flags)
-#define ClearPageDcacheDirty(page) \
- clear_bit(PG_dcache_dirty, &(page)->flags)
+#define folio_test_dcache_dirty(folio) \
+ test_bit(PG_dcache_dirty, &(folio)->flags)
+#define folio_set_dcache_dirty(folio) \
+ set_bit(PG_dcache_dirty, &(folio)->flags)
+#define folio_clear_dcache_dirty(folio) \
+ clear_bit(PG_dcache_dirty, &(folio)->flags)

extern void (*flush_cache_all)(void);
extern void (*__flush_cache_all)(void);
@@ -50,15 +50,24 @@ extern void (*flush_cache_mm)(struct mm_struct *mm);
extern void (*flush_cache_range)(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
-extern void __flush_dcache_page(struct page *page);
+extern void __flush_dcache_pages(struct page *page, unsigned int nr);

#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
+static inline void flush_dcache_folio(struct folio *folio)
+{
+ if (cpu_has_dc_aliases)
+ __flush_dcache_pages(&folio->page, folio_nr_pages(folio));
+ else if (!cpu_has_ic_fills_f_dc)
+ folio_set_dcache_dirty(folio);
+}
+#define flush_dcache_folio flush_dcache_folio
+
static inline void flush_dcache_page(struct page *page)
{
if (cpu_has_dc_aliases)
- __flush_dcache_page(page);
+ __flush_dcache_pages(page, 1);
else if (!cpu_has_ic_fills_f_dc)
- SetPageDcacheDirty(page);
+ folio_set_dcache_dirty(page_folio(page));
}

#define flush_dcache_mmap_lock(mapping) do { } while (0)
@@ -73,10 +82,11 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
__flush_anon_page(page, vmaddr);
}

-static inline void flush_icache_page(struct vm_area_struct *vma,
- struct page *page)
+static inline void flush_icache_pages(struct vm_area_struct *vma,
+ struct page *page, unsigned int nr)
{
}
+#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)

extern void (*flush_icache_range)(unsigned long start, unsigned long end);
extern void (*local_flush_icache_range)(unsigned long start, unsigned long end);
diff --git a/arch/mips/include/asm/pgtable-32.h b/arch/mips/include/asm/pgtable-32.h
index ba0016709a1a..0e196650f4f4 100644
--- a/arch/mips/include/asm/pgtable-32.h
+++ b/arch/mips/include/asm/pgtable-32.h
@@ -153,7 +153,7 @@ static inline void pmd_clear(pmd_t *pmdp)
#if defined(CONFIG_XPA)

#define MAX_POSSIBLE_PHYSMEM_BITS 40
-#define pte_pfn(x) (((unsigned long)((x).pte_high >> _PFN_SHIFT)) | (unsigned long)((x).pte_low << _PAGE_PRESENT_SHIFT))
+#define pte_pfn(x) (((unsigned long)((x).pte_high >> PFN_PTE_SHIFT)) | (unsigned long)((x).pte_low << _PAGE_PRESENT_SHIFT))
static inline pte_t
pfn_pte(unsigned long pfn, pgprot_t prot)
{
@@ -161,7 +161,7 @@ pfn_pte(unsigned long pfn, pgprot_t prot)

pte.pte_low = (pfn >> _PAGE_PRESENT_SHIFT) |
(pgprot_val(prot) & ~_PFNX_MASK);
- pte.pte_high = (pfn << _PFN_SHIFT) |
+ pte.pte_high = (pfn << PFN_PTE_SHIFT) |
(pgprot_val(prot) & ~_PFN_MASK);
return pte;
}
@@ -184,9 +184,9 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t prot)
#else

#define MAX_POSSIBLE_PHYSMEM_BITS 32
-#define pte_pfn(x) ((unsigned long)((x).pte >> _PFN_SHIFT))
-#define pfn_pte(pfn, prot) __pte(((unsigned long long)(pfn) << _PFN_SHIFT) | pgprot_val(prot))
-#define pfn_pmd(pfn, prot) __pmd(((unsigned long long)(pfn) << _PFN_SHIFT) | pgprot_val(prot))
+#define pte_pfn(x) ((unsigned long)((x).pte >> PFN_PTE_SHIFT))
+#define pfn_pte(pfn, prot) __pte(((unsigned long long)(pfn) << PFN_PTE_SHIFT) | pgprot_val(prot))
+#define pfn_pmd(pfn, prot) __pmd(((unsigned long long)(pfn) << PFN_PTE_SHIFT) | pgprot_val(prot))
#endif /* defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32) */

#define pte_page(x) pfn_to_page(pte_pfn(x))
diff --git a/arch/mips/include/asm/pgtable-64.h b/arch/mips/include/asm/pgtable-64.h
index 98e24e3e7f2b..20ca48c1b606 100644
--- a/arch/mips/include/asm/pgtable-64.h
+++ b/arch/mips/include/asm/pgtable-64.h
@@ -298,9 +298,9 @@ static inline void pud_clear(pud_t *pudp)

#define pte_page(x) pfn_to_page(pte_pfn(x))

-#define pte_pfn(x) ((unsigned long)((x).pte >> _PFN_SHIFT))
-#define pfn_pte(pfn, prot) __pte(((pfn) << _PFN_SHIFT) | pgprot_val(prot))
-#define pfn_pmd(pfn, prot) __pmd(((pfn) << _PFN_SHIFT) | pgprot_val(prot))
+#define pte_pfn(x) ((unsigned long)((x).pte >> PFN_PTE_SHIFT))
+#define pfn_pte(pfn, prot) __pte(((pfn) << PFN_PTE_SHIFT) | pgprot_val(prot))
+#define pfn_pmd(pfn, prot) __pmd(((pfn) << PFN_PTE_SHIFT) | pgprot_val(prot))

#ifndef __PAGETABLE_PMD_FOLDED
static inline pmd_t *pud_pgtable(pud_t pud)
diff --git a/arch/mips/include/asm/pgtable-bits.h b/arch/mips/include/asm/pgtable-bits.h
index 1c576679aa87..421e78c30253 100644
--- a/arch/mips/include/asm/pgtable-bits.h
+++ b/arch/mips/include/asm/pgtable-bits.h
@@ -182,10 +182,10 @@ enum pgtable_bits {
#if defined(CONFIG_CPU_R3K_TLB)
# define _CACHE_UNCACHED (1 << _CACHE_UNCACHED_SHIFT)
# define _CACHE_MASK _CACHE_UNCACHED
-# define _PFN_SHIFT PAGE_SHIFT
+# define PFN_PTE_SHIFT PAGE_SHIFT
#else
# define _CACHE_MASK (7 << _CACHE_SHIFT)
-# define _PFN_SHIFT (PAGE_SHIFT - 12 + _CACHE_SHIFT + 3)
+# define PFN_PTE_SHIFT (PAGE_SHIFT - 12 + _CACHE_SHIFT + 3)
#endif

#ifndef _PAGE_NO_EXEC
@@ -195,7 +195,7 @@ enum pgtable_bits {
#define _PAGE_SILENT_READ _PAGE_VALID
#define _PAGE_SILENT_WRITE _PAGE_DIRTY

-#define _PFN_MASK (~((1 << (_PFN_SHIFT)) - 1))
+#define _PFN_MASK (~((1 << (PFN_PTE_SHIFT)) - 1))

/*
* The final layouts of the PTE bits are:
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 574fa14ac8b2..cbb93a834f52 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -66,7 +66,7 @@ extern void paging_init(void);

static inline unsigned long pmd_pfn(pmd_t pmd)
{
- return pmd_val(pmd) >> _PFN_SHIFT;
+ return pmd_val(pmd) >> PFN_PTE_SHIFT;
}

#ifndef CONFIG_MIPS_HUGE_TLB_SUPPORT
@@ -105,9 +105,6 @@ do { \
} \
} while(0)

-static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep, pte_t pteval);
-
#if defined(CONFIG_PHYS_ADDR_T_64BIT) && defined(CONFIG_CPU_MIPS32)

#ifdef CONFIG_XPA
@@ -157,7 +154,7 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
null.pte_low = null.pte_high = _PAGE_GLOBAL;
}

- set_pte_at(mm, addr, ptep, null);
+ set_pte(ptep, null);
htw_start();
}
#else
@@ -196,28 +193,41 @@ static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *pt
#if !defined(CONFIG_CPU_R3K_TLB)
/* Preserve global status for the pair */
if (pte_val(*ptep_buddy(ptep)) & _PAGE_GLOBAL)
- set_pte_at(mm, addr, ptep, __pte(_PAGE_GLOBAL));
+ set_pte(ptep, __pte(_PAGE_GLOBAL));
else
#endif
- set_pte_at(mm, addr, ptep, __pte(0));
+ set_pte(ptep, __pte(0));
htw_start();
}
#endif

-static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep, pte_t pteval)
+static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t pte, unsigned int nr)
{
+ unsigned int i;
+ bool do_sync = false;

- if (!pte_present(pteval))
- goto cache_sync_done;
+ for (i = 0; i < nr; i++) {
+ if (!pte_present(pte))
+ continue;
+ if (pte_present(ptep[i]) &&
+ (pte_pfn(ptep[i]) == pte_pfn(pte)))
+ continue;
+ do_sync = true;
+ }

- if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
- goto cache_sync_done;
+ if (do_sync)
+ __update_cache(addr, pte);

- __update_cache(addr, pteval);
-cache_sync_done:
- set_pte(ptep, pteval);
+ for (;;) {
+ set_pte(ptep, pte);
+ if (--nr == 0)
+ break;
+ ptep++;
+ pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
+ }
}
+#define set_ptes set_ptes

/*
* (pmds are folded into puds so this doesn't get actually called,
@@ -486,7 +496,7 @@ static inline int ptep_set_access_flags(struct vm_area_struct *vma,
pte_t entry, int dirty)
{
if (!pte_same(*ptep, entry))
- set_pte_at(vma->vm_mm, address, ptep, entry);
+ set_pte(ptep, entry);
/*
* update_mmu_cache will unconditionally execute, handling both
* the case that the PTE changed and the spurious fault case.
@@ -568,12 +578,21 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
extern void __update_tlb(struct vm_area_struct *vma, unsigned long address,
pte_t pte);

-static inline void update_mmu_cache(struct vm_area_struct *vma,
- unsigned long address, pte_t *ptep)
-{
- pte_t pte = *ptep;
- __update_tlb(vma, address, pte);
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
+{
+ for (;;) {
+ pte_t pte = *ptep;
+ __update_tlb(vma, address, pte);
+ if (--nr == 0)
+ break;
+ ptep++;
+ address += PAGE_SIZE;
+ }
}
+#define update_mmu_cache(vma, address, ptep) \
+ update_mmu_cache_range(NULL, vma, address, ptep, 1)

#define __HAVE_ARCH_UPDATE_MMU_TLB
#define update_mmu_tlb update_mmu_cache
diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 4b6554b48923..187d1c16361c 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -568,13 +568,14 @@ static inline void local_r4k_flush_cache_page(void *args)
if ((mm == current->active_mm) && (pte_val(*ptep) & _PAGE_VALID))
vaddr = NULL;
else {
+ struct folio *folio = page_folio(page);
/*
* Use kmap_coherent or kmap_atomic to do flushes for
* another ASID than the current one.
*/
map_coherent = (cpu_has_dc_aliases &&
- page_mapcount(page) &&
- !Page_dcache_dirty(page));
+ folio_mapped(folio) &&
+ !folio_test_dcache_dirty(folio));
if (map_coherent)
vaddr = kmap_coherent(page, addr);
else
diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index d21cf8c6cf6c..02042100e267 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -99,13 +99,15 @@ SYSCALL_DEFINE3(cacheflush, unsigned long, addr, unsigned long, bytes,
return 0;
}

-void __flush_dcache_page(struct page *page)
+void __flush_dcache_pages(struct page *page, unsigned int nr)
{
- struct address_space *mapping = page_mapping_file(page);
+ struct folio *folio = page_folio(page);
+ struct address_space *mapping = folio_flush_mapping(folio);
unsigned long addr;
+ unsigned int i;

if (mapping && !mapping_mapped(mapping)) {
- SetPageDcacheDirty(page);
+ folio_set_dcache_dirty(folio);
return;
}

@@ -114,25 +116,21 @@ void __flush_dcache_page(struct page *page)
* case is for exec env/arg pages and those are %99 certainly going to
* get faulted into the tlb (and thus flushed) anyways.
*/
- if (PageHighMem(page))
- addr = (unsigned long)kmap_atomic(page);
- else
- addr = (unsigned long)page_address(page);
-
- flush_data_cache_page(addr);
-
- if (PageHighMem(page))
- kunmap_atomic((void *)addr);
+ for (i = 0; i < nr; i++) {
+ addr = (unsigned long)kmap_local_page(page + i);
+ flush_data_cache_page(addr);
+ kunmap_local((void *)addr);
+ }
}
-
-EXPORT_SYMBOL(__flush_dcache_page);
+EXPORT_SYMBOL(__flush_dcache_pages);

void __flush_anon_page(struct page *page, unsigned long vmaddr)
{
unsigned long addr = (unsigned long) page_address(page);
+ struct folio *folio = page_folio(page);

if (pages_do_alias(addr, vmaddr)) {
- if (page_mapcount(page) && !Page_dcache_dirty(page)) {
+ if (folio_mapped(folio) && !folio_test_dcache_dirty(folio)) {
void *kaddr;

kaddr = kmap_coherent(page, vmaddr);
@@ -147,27 +145,29 @@ EXPORT_SYMBOL(__flush_anon_page);

void __update_cache(unsigned long address, pte_t pte)
{
- struct page *page;
+ struct folio *folio;
unsigned long pfn, addr;
int exec = !pte_no_exec(pte) && !cpu_has_ic_fills_f_dc;
+ unsigned int i;

pfn = pte_pfn(pte);
if (unlikely(!pfn_valid(pfn)))
return;
- page = pfn_to_page(pfn);
- if (Page_dcache_dirty(page)) {
- if (PageHighMem(page))
- addr = (unsigned long)kmap_atomic(page);
- else
- addr = (unsigned long)page_address(page);
-
- if (exec || pages_do_alias(addr, address & PAGE_MASK))
- flush_data_cache_page(addr);

- if (PageHighMem(page))
- kunmap_atomic((void *)addr);
+ folio = page_folio(pfn_to_page(pfn));
+ address &= PAGE_MASK;
+ address -= offset_in_folio(folio, pfn << PAGE_SHIFT);
+
+ if (folio_test_dcache_dirty(folio)) {
+ for (i = 0; i < folio_nr_pages(folio); i++) {
+ addr = (unsigned long)kmap_local_folio(folio, i);

- ClearPageDcacheDirty(page);
+ if (exec || pages_do_alias(addr, address))
+ flush_data_cache_page(addr);
+ kunmap_local((void *)addr);
+ address += PAGE_SIZE;
+ }
+ folio_clear_dcache_dirty(folio);
}
}

diff --git a/arch/mips/mm/init.c b/arch/mips/mm/init.c
index 5a8002839550..5dcb525a8995 100644
--- a/arch/mips/mm/init.c
+++ b/arch/mips/mm/init.c
@@ -88,7 +88,7 @@ static void *__kmap_pgprot(struct page *page, unsigned long addr, pgprot_t prot)
pte_t pte;
int tlbidx;

- BUG_ON(Page_dcache_dirty(page));
+ BUG_ON(folio_test_dcache_dirty(page_folio(page)));

preempt_disable();
pagefault_disable();
@@ -169,11 +169,12 @@ void kunmap_coherent(void)
void copy_user_highpage(struct page *to, struct page *from,
unsigned long vaddr, struct vm_area_struct *vma)
{
+ struct folio *src = page_folio(from);
void *vfrom, *vto;

vto = kmap_atomic(to);
if (cpu_has_dc_aliases &&
- page_mapcount(from) && !Page_dcache_dirty(from)) {
+ folio_mapped(src) && !folio_test_dcache_dirty(src)) {
vfrom = kmap_coherent(from, vaddr);
copy_page(vto, vfrom);
kunmap_coherent();
@@ -194,15 +195,17 @@ void copy_to_user_page(struct vm_area_struct *vma,
struct page *page, unsigned long vaddr, void *dst, const void *src,
unsigned long len)
{
+ struct folio *folio = page_folio(page);
+
if (cpu_has_dc_aliases &&
- page_mapcount(page) && !Page_dcache_dirty(page)) {
+ folio_mapped(folio) && !folio_test_dcache_dirty(folio)) {
void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(vto, src, len);
kunmap_coherent();
} else {
memcpy(dst, src, len);
if (cpu_has_dc_aliases)
- SetPageDcacheDirty(page);
+ folio_set_dcache_dirty(folio);
}
if (vma->vm_flags & VM_EXEC)
flush_cache_page(vma, vaddr, page_to_pfn(page));
@@ -212,15 +215,17 @@ void copy_from_user_page(struct vm_area_struct *vma,
struct page *page, unsigned long vaddr, void *dst, const void *src,
unsigned long len)
{
+ struct folio *folio = page_folio(page);
+
if (cpu_has_dc_aliases &&
- page_mapcount(page) && !Page_dcache_dirty(page)) {
+ folio_mapped(folio) && !folio_test_dcache_dirty(folio)) {
void *vfrom = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(dst, vfrom, len);
kunmap_coherent();
} else {
memcpy(dst, src, len);
if (cpu_has_dc_aliases)
- SetPageDcacheDirty(page);
+ folio_set_dcache_dirty(folio);
}
}
EXPORT_SYMBOL_GPL(copy_from_user_page);
@@ -448,10 +453,10 @@ static inline void __init mem_init_free_highmem(void)
void __init mem_init(void)
{
/*
- * When _PFN_SHIFT is greater than PAGE_SHIFT we won't have enough PTE
+ * When PFN_PTE_SHIFT is greater than PAGE_SHIFT we won't have enough PTE
* bits to hold a full 32b physical address on MIPS32 systems.
*/
- BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT) && (_PFN_SHIFT > PAGE_SHIFT));
+ BUILD_BUG_ON(IS_ENABLED(CONFIG_32BIT) && (PFN_PTE_SHIFT > PAGE_SHIFT));

#ifdef CONFIG_HIGHMEM
max_mapnr = highend_pfn ? highend_pfn : max_low_pfn;
diff --git a/arch/mips/mm/pgtable-32.c b/arch/mips/mm/pgtable-32.c
index f57fb69472f8..84dd5136d53a 100644
--- a/arch/mips/mm/pgtable-32.c
+++ b/arch/mips/mm/pgtable-32.c
@@ -35,7 +35,7 @@ pmd_t mk_pmd(struct page *page, pgprot_t prot)
{
pmd_t pmd;

- pmd_val(pmd) = (page_to_pfn(page) << _PFN_SHIFT) | pgprot_val(prot);
+ pmd_val(pmd) = (page_to_pfn(page) << PFN_PTE_SHIFT) | pgprot_val(prot);

return pmd;
}
diff --git a/arch/mips/mm/pgtable-64.c b/arch/mips/mm/pgtable-64.c
index b4386a0e2ef8..c76d21f7dffb 100644
--- a/arch/mips/mm/pgtable-64.c
+++ b/arch/mips/mm/pgtable-64.c
@@ -93,7 +93,7 @@ pmd_t mk_pmd(struct page *page, pgprot_t prot)
{
pmd_t pmd;

- pmd_val(pmd) = (page_to_pfn(page) << _PFN_SHIFT) | pgprot_val(prot);
+ pmd_val(pmd) = (page_to_pfn(page) << PFN_PTE_SHIFT) | pgprot_val(prot);

return pmd;
}
diff --git a/arch/mips/mm/tlbex.c b/arch/mips/mm/tlbex.c
index 8d514a9082c6..b4e1c783e617 100644
--- a/arch/mips/mm/tlbex.c
+++ b/arch/mips/mm/tlbex.c
@@ -253,7 +253,7 @@ static void output_pgtable_bits_defines(void)
pr_define("_PAGE_GLOBAL_SHIFT %d\n", _PAGE_GLOBAL_SHIFT);
pr_define("_PAGE_VALID_SHIFT %d\n", _PAGE_VALID_SHIFT);
pr_define("_PAGE_DIRTY_SHIFT %d\n", _PAGE_DIRTY_SHIFT);
- pr_define("_PFN_SHIFT %d\n", _PFN_SHIFT);
+ pr_define("PFN_PTE_SHIFT %d\n", PFN_PTE_SHIFT);
pr_debug("\n");
}

--
2.40.1


2023-08-02 16:45:44

by Matthew Wilcox

Subject: [PATCH v6 31/38] mm: Rationalise flush_icache_pages() and flush_icache_page()

Move the default (no-op) implementation of flush_icache_pages()
from <asm-generic/cacheflush.h> to <linux/cacheflush.h>.
Remove the flush_icache_page() wrapper from each architecture and
provide a single definition of it in <linux/cacheflush.h>.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
arch/alpha/include/asm/cacheflush.h | 5 +----
arch/arc/include/asm/cacheflush.h | 9 ---------
arch/arm/include/asm/cacheflush.h | 7 -------
arch/csky/abiv1/inc/abi/cacheflush.h | 1 -
arch/csky/abiv2/inc/abi/cacheflush.h | 1 -
arch/hexagon/include/asm/cacheflush.h | 2 +-
arch/loongarch/include/asm/cacheflush.h | 2 --
arch/m68k/include/asm/cacheflush_mm.h | 1 -
arch/mips/include/asm/cacheflush.h | 6 ------
arch/nios2/include/asm/cacheflush.h | 2 +-
arch/parisc/include/asm/cacheflush.h | 2 +-
arch/sh/include/asm/cacheflush.h | 2 +-
arch/sparc/include/asm/cacheflush_32.h | 2 --
arch/sparc/include/asm/cacheflush_64.h | 3 ---
arch/xtensa/include/asm/cacheflush.h | 4 ----
include/asm-generic/cacheflush.h | 12 ------------
include/linux/cacheflush.h | 9 +++++++++
17 files changed, 14 insertions(+), 56 deletions(-)

diff --git a/arch/alpha/include/asm/cacheflush.h b/arch/alpha/include/asm/cacheflush.h
index 3956460e69e2..36a7e924c3b9 100644
--- a/arch/alpha/include/asm/cacheflush.h
+++ b/arch/alpha/include/asm/cacheflush.h
@@ -53,10 +53,6 @@ extern void flush_icache_user_page(struct vm_area_struct *vma,
#define flush_icache_user_page flush_icache_user_page
#endif /* CONFIG_SMP */

-/* This is used only in __do_fault and do_swap_page. */
-#define flush_icache_page(vma, page) \
- flush_icache_user_page((vma), (page), 0, 0)
-
/*
* Both implementations of flush_icache_user_page flush the entire
* address space, so one call, no matter how many pages.
@@ -66,6 +62,7 @@ static inline void flush_icache_pages(struct vm_area_struct *vma,
{
flush_icache_user_page(vma, page, 0, 0);
}
+#define flush_icache_pages flush_icache_pages

#include <asm-generic/cacheflush.h>

diff --git a/arch/arc/include/asm/cacheflush.h b/arch/arc/include/asm/cacheflush.h
index 04f65f588510..bd5b1a9a0544 100644
--- a/arch/arc/include/asm/cacheflush.h
+++ b/arch/arc/include/asm/cacheflush.h
@@ -18,15 +18,6 @@
#include <linux/mm.h>
#include <asm/shmparam.h>

-/*
- * Semantically we need this because icache doesn't snoop dcache/dma.
- * However ARC Cache flush requires paddr as well as vaddr, latter not available
- * in the flush_icache_page() API. So we no-op it but do the equivalent work
- * in update_mmu_cache()
- */
-#define flush_icache_page(vma, page)
-#define flush_icache_pages(vma, page, nr)
-
void flush_cache_all(void);

void flush_icache_range(unsigned long kstart, unsigned long kend);
diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h
index 841e268d2374..f6181f69577f 100644
--- a/arch/arm/include/asm/cacheflush.h
+++ b/arch/arm/include/asm/cacheflush.h
@@ -321,13 +321,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
#define flush_dcache_mmap_lock(mapping) xa_lock_irq(&mapping->i_pages)
#define flush_dcache_mmap_unlock(mapping) xa_unlock_irq(&mapping->i_pages)

-/*
- * We don't appear to need to do anything here. In fact, if we did, we'd
- * duplicate cache flushing elsewhere performed by flush_dcache_page().
- */
-#define flush_icache_page(vma,page) do { } while (0)
-#define flush_icache_pages(vma, page, nr) do { } while (0)
-
/*
* flush_cache_vmap() is used when creating mappings (eg, via vmap,
* vmalloc, ioremap etc) in kernel space for pages. On non-VIPT
diff --git a/arch/csky/abiv1/inc/abi/cacheflush.h b/arch/csky/abiv1/inc/abi/cacheflush.h
index 0d6cb65624c4..908d8b0bc4fd 100644
--- a/arch/csky/abiv1/inc/abi/cacheflush.h
+++ b/arch/csky/abiv1/inc/abi/cacheflush.h
@@ -45,7 +45,6 @@ extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, u
#define flush_cache_vmap(start, end) cache_wbinv_all()
#define flush_cache_vunmap(start, end) cache_wbinv_all()

-#define flush_icache_page(vma, page) do {} while (0);
#define flush_icache_range(start, end) cache_wbinv_range(start, end)
#define flush_icache_mm_range(mm, start, end) cache_wbinv_range(start, end)
#define flush_icache_deferred(mm) do {} while (0);
diff --git a/arch/csky/abiv2/inc/abi/cacheflush.h b/arch/csky/abiv2/inc/abi/cacheflush.h
index 9c728933a776..40be16907267 100644
--- a/arch/csky/abiv2/inc/abi/cacheflush.h
+++ b/arch/csky/abiv2/inc/abi/cacheflush.h
@@ -33,7 +33,6 @@ static inline void flush_dcache_page(struct page *page)

#define flush_dcache_mmap_lock(mapping) do { } while (0)
#define flush_dcache_mmap_unlock(mapping) do { } while (0)
-#define flush_icache_page(vma, page) do { } while (0)

#define flush_icache_range(start, end) cache_wbinv_range(start, end)

diff --git a/arch/hexagon/include/asm/cacheflush.h b/arch/hexagon/include/asm/cacheflush.h
index dc3f500a5a01..bfff514a81c8 100644
--- a/arch/hexagon/include/asm/cacheflush.h
+++ b/arch/hexagon/include/asm/cacheflush.h
@@ -18,7 +18,7 @@
* - flush_cache_range(vma, start, end) flushes a range of pages
* - flush_icache_range(start, end) flush a range of instructions
* - flush_dcache_page(pg) flushes(wback&invalidates) a page for dcache
- * - flush_icache_page(vma, pg) flushes(invalidates) a page for icache
+ * - flush_icache_pages(vma, pg, nr) flushes(invalidates) nr pages for icache
*
* Need to doublecheck which one is really needed for ptrace stuff to work.
*/
diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
index 88a44da50a3b..80bd74106985 100644
--- a/arch/loongarch/include/asm/cacheflush.h
+++ b/arch/loongarch/include/asm/cacheflush.h
@@ -46,8 +46,6 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
#define flush_cache_page(vma, vmaddr, pfn) do { } while (0)
#define flush_cache_vmap(start, end) do { } while (0)
#define flush_cache_vunmap(start, end) do { } while (0)
-#define flush_icache_page(vma, page) do { } while (0)
-#define flush_icache_pages(vma, page) do { } while (0)
#define flush_icache_user_page(vma, page, addr, len) do { } while (0)
#define flush_dcache_page(page) do { } while (0)
#define flush_dcache_mmap_lock(mapping) do { } while (0)
diff --git a/arch/m68k/include/asm/cacheflush_mm.h b/arch/m68k/include/asm/cacheflush_mm.h
index 88eb85e81ef6..ed12358c4783 100644
--- a/arch/m68k/include/asm/cacheflush_mm.h
+++ b/arch/m68k/include/asm/cacheflush_mm.h
@@ -261,7 +261,6 @@ static inline void __flush_pages_to_ram(void *vaddr, unsigned int nr)
#define flush_dcache_mmap_unlock(mapping) do { } while (0)
#define flush_icache_pages(vma, page, nr) \
__flush_pages_to_ram(page_address(page), nr)
-#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)

extern void flush_icache_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long addr, int len);
diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index 0f389bc7cb90..f36c2519ed97 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -82,12 +82,6 @@ static inline void flush_anon_page(struct vm_area_struct *vma,
__flush_anon_page(page, vmaddr);
}

-static inline void flush_icache_pages(struct vm_area_struct *vma,
- struct page *page, unsigned int nr)
-{
-}
-#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
-
extern void (*flush_icache_range)(unsigned long start, unsigned long end);
extern void (*local_flush_icache_range)(unsigned long start, unsigned long end);
extern void (*__flush_icache_user_range)(unsigned long start,
diff --git a/arch/nios2/include/asm/cacheflush.h b/arch/nios2/include/asm/cacheflush.h
index 8624ca83cffe..7c48c5213fb7 100644
--- a/arch/nios2/include/asm/cacheflush.h
+++ b/arch/nios2/include/asm/cacheflush.h
@@ -35,7 +35,7 @@ void flush_dcache_folio(struct folio *folio);
extern void flush_icache_range(unsigned long start, unsigned long end);
void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
unsigned int nr);
-#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1);
+#define flush_icache_pages flush_icache_pages

#define flush_cache_vmap(start, end) flush_dcache_range(start, end)
#define flush_cache_vunmap(start, end) flush_dcache_range(start, end)
diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h
index b77c3e0c37d3..b4006f2a9705 100644
--- a/arch/parisc/include/asm/cacheflush.h
+++ b/arch/parisc/include/asm/cacheflush.h
@@ -60,7 +60,7 @@ static inline void flush_dcache_page(struct page *page)

void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
unsigned int nr);
-#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
+#define flush_icache_pages flush_icache_pages

#define flush_icache_range(s,e) do { \
flush_kernel_dcache_range_asm(s,e); \
diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index 9fceef6f3e00..878b6b551bd2 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -53,7 +53,7 @@ extern void flush_icache_range(unsigned long start, unsigned long end);
#define flush_icache_user_range flush_icache_range
void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
unsigned int nr);
-#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
+#define flush_icache_pages flush_icache_pages
extern void flush_cache_sigtramp(unsigned long address);

struct flusher_data {
diff --git a/arch/sparc/include/asm/cacheflush_32.h b/arch/sparc/include/asm/cacheflush_32.h
index c8dd971f0e88..f3b7270bf71b 100644
--- a/arch/sparc/include/asm/cacheflush_32.h
+++ b/arch/sparc/include/asm/cacheflush_32.h
@@ -16,8 +16,6 @@
#define flush_cache_page(vma,addr,pfn) \
sparc32_cachetlb_ops->cache_page(vma, addr)
#define flush_icache_range(start, end) do { } while (0)
-#define flush_icache_page(vma, pg) do { } while (0)
-#define flush_icache_pages(vma, pg, nr) do { } while (0)

#define copy_to_user_page(vma, page, vaddr, dst, src, len) \
do { \
diff --git a/arch/sparc/include/asm/cacheflush_64.h b/arch/sparc/include/asm/cacheflush_64.h
index a9a719f04d06..0e879004efff 100644
--- a/arch/sparc/include/asm/cacheflush_64.h
+++ b/arch/sparc/include/asm/cacheflush_64.h
@@ -53,9 +53,6 @@ static inline void flush_dcache_page(struct page *page)
flush_dcache_folio(page_folio(page));
}

-#define flush_icache_page(vma, pg) do { } while(0)
-#define flush_icache_pages(vma, pg, nr) do { } while(0)
-
void flush_ptrace_access(struct vm_area_struct *, struct page *,
unsigned long uaddr, void *kaddr,
unsigned long len, int write);
diff --git a/arch/xtensa/include/asm/cacheflush.h b/arch/xtensa/include/asm/cacheflush.h
index 35153f6725e4..785a00ce83c1 100644
--- a/arch/xtensa/include/asm/cacheflush.h
+++ b/arch/xtensa/include/asm/cacheflush.h
@@ -160,10 +160,6 @@ void local_flush_cache_page(struct vm_area_struct *vma,
__invalidate_icache_range(start,(end) - (start)); \
} while (0)

-/* This is not required, see Documentation/core-api/cachetlb.rst */
-#define flush_icache_page(vma,page) do { } while (0)
-#define flush_icache_pages(vma, page, nr) do { } while (0)
-
#define flush_dcache_mmap_lock(mapping) do { } while (0)
#define flush_dcache_mmap_unlock(mapping) do { } while (0)

diff --git a/include/asm-generic/cacheflush.h b/include/asm-generic/cacheflush.h
index 09d51a680765..84ec53ccc450 100644
--- a/include/asm-generic/cacheflush.h
+++ b/include/asm-generic/cacheflush.h
@@ -77,18 +77,6 @@ static inline void flush_icache_range(unsigned long start, unsigned long end)
#define flush_icache_user_range flush_icache_range
#endif

-#ifndef flush_icache_page
-static inline void flush_icache_pages(struct vm_area_struct *vma,
- struct page *page, unsigned int nr)
-{
-}
-
-static inline void flush_icache_page(struct vm_area_struct *vma,
- struct page *page)
-{
-}
-#endif
-
#ifndef flush_icache_user_page
static inline void flush_icache_user_page(struct vm_area_struct *vma,
struct page *page,
diff --git a/include/linux/cacheflush.h b/include/linux/cacheflush.h
index 82136f3fcf54..55f297b2c23f 100644
--- a/include/linux/cacheflush.h
+++ b/include/linux/cacheflush.h
@@ -17,4 +17,13 @@ static inline void flush_dcache_folio(struct folio *folio)
#define flush_dcache_folio flush_dcache_folio
#endif /* ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE */

+#ifndef flush_icache_pages
+static inline void flush_icache_pages(struct vm_area_struct *vma,
+ struct page *page, unsigned int nr)
+{
+}
+#endif
+
+#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
+
#endif /* _LINUX_CACHEFLUSH_H */
--
2.40.1


2023-08-02 16:47:33

by Matthew Wilcox

Subject: [PATCH v6 36/38] mm: Convert do_set_pte() to set_pte_range()

From: Yin Fengwei <[email protected]>

set_pte_range() allows the caller to set up page table entries for a
specific range of pages. It takes advantage of batched rmap updates for
large folios, and it now takes care of calling update_mmu_cache_range()
itself.
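
From a caller's point of view the conversion looks roughly like this
(a simplified sketch, not part of the patch; it mirrors the filemap.c
hunk below):

static void map_folio_range_sketch(struct vm_fault *vmf, struct folio *folio,
		struct page *page, unsigned int nr, unsigned long addr)
{
	/*
	 * Previously: do_set_pte(vmf, page, addr) once per page, each
	 * followed by update_mmu_cache(vmf->vma, addr, vmf->pte).
	 */
	set_pte_range(vmf, folio, page, nr, addr);
	/* No explicit MMU cache update: set_pte_range() handles it. */
}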

Signed-off-by: Yin Fengwei <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
Documentation/filesystems/locking.rst | 2 +-
include/linux/mm.h | 3 ++-
mm/filemap.c | 3 +--
mm/memory.c | 37 +++++++++++++++++----------
4 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/Documentation/filesystems/locking.rst b/Documentation/filesystems/locking.rst
index 89c5ec9e3392..cd032f2324e8 100644
--- a/Documentation/filesystems/locking.rst
+++ b/Documentation/filesystems/locking.rst
@@ -670,7 +670,7 @@ locked. The VM will unlock the page.
Filesystem should find and map pages associated with offsets from "start_pgoff"
till "end_pgoff". ->map_pages() is called with the RCU lock held and must
not block. If it's not possible to reach a page without blocking,
-filesystem should skip it. Filesystem should use do_set_pte() to setup
+filesystem should skip it. Filesystem should use set_pte_range() to setup
page table entry. Pointer to entry associated with the page is passed in
"pte" field in vm_fault structure. Pointers to entries for other offsets
should be calculated relative to "pte".
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2fbc6c631764..19493d6a2bb8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1346,7 +1346,8 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
}

vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page);
-void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr);
+void set_pte_range(struct vm_fault *vmf, struct folio *folio,
+ struct page *page, unsigned int nr, unsigned long addr);

vm_fault_t finish_fault(struct vm_fault *vmf);
vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf);
diff --git a/mm/filemap.c b/mm/filemap.c
index 9dc15af7ab5b..2e7050461a87 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3506,8 +3506,7 @@ static vm_fault_t filemap_map_folio_range(struct vm_fault *vmf,
ret = VM_FAULT_NOPAGE;

ref_count++;
- do_set_pte(vmf, page, addr);
- update_mmu_cache(vma, addr, vmf->pte);
+ set_pte_range(vmf, folio, page, 1, addr);
} while (vmf->pte++, page++, addr += PAGE_SIZE, ++count < nr_pages);

/* Restore the vmf->pte */
diff --git a/mm/memory.c b/mm/memory.c
index e25edd4c24b8..621716109627 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4465,15 +4465,24 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
}
#endif

-void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
+/**
+ * set_pte_range - Set a range of PTEs to point to pages in a folio.
+ * @vmf: Fault description.
+ * @folio: The folio that contains @page.
+ * @page: The first page to create a PTE for.
+ * @nr: The number of PTEs to create.
+ * @addr: The first address to create a PTE for.
+ */
+void set_pte_range(struct vm_fault *vmf, struct folio *folio,
+ struct page *page, unsigned int nr, unsigned long addr)
{
struct vm_area_struct *vma = vmf->vma;
bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
bool write = vmf->flags & FAULT_FLAG_WRITE;
- bool prefault = vmf->address != addr;
+ bool prefault = in_range(vmf->address, addr, nr * PAGE_SIZE);
pte_t entry;

- flush_icache_page(vma, page);
+ flush_icache_pages(vma, page, nr);
entry = mk_pte(page, vma->vm_page_prot);

if (prefault && arch_wants_old_prefaulted_pte())
@@ -4487,14 +4496,18 @@ void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr)
entry = pte_mkuffd_wp(entry);
/* copy-on-write page */
if (write && !(vma->vm_flags & VM_SHARED)) {
- inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
- page_add_new_anon_rmap(page, vma, addr);
- lru_cache_add_inactive_or_unevictable(page, vma);
+ add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr);
+ VM_BUG_ON_FOLIO(nr != 1, folio);
+ folio_add_new_anon_rmap(folio, vma, addr);
+ folio_add_lru_vma(folio, vma);
} else {
- inc_mm_counter(vma->vm_mm, mm_counter_file(page));
- page_add_file_rmap(page, vma, false);
+ add_mm_counter(vma->vm_mm, mm_counter_file(page), nr);
+ folio_add_file_rmap_range(folio, page, nr, vma, false);
}
- set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
+ set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr);
+
+ /* no need to invalidate: a not-present page won't be cached */
+ update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr);
}

static bool vmf_pte_changed(struct vm_fault *vmf)
@@ -4562,11 +4575,9 @@ vm_fault_t finish_fault(struct vm_fault *vmf)

/* Re-check under ptl */
if (likely(!vmf_pte_changed(vmf))) {
- do_set_pte(vmf, page, vmf->address);
-
- /* no need to invalidate: a not-present page won't be cached */
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ struct folio *folio = page_folio(page);

+ set_pte_range(vmf, folio, page, 1, vmf->address);
ret = 0;
} else {
update_mmu_tlb(vma, vmf->address, vmf->pte);
--
2.40.1


2023-08-02 16:50:02

by Matthew Wilcox

Subject: [PATCH v6 33/38] mm: Use flush_icache_pages() in do_set_pmd()

Push the iteration over each page down to the architectures (many
can flush the entire THP without iteration).

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
---
mm/memory.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 4e6e9ec7eaaf..e25edd4c24b8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4400,7 +4400,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
bool write = vmf->flags & FAULT_FLAG_WRITE;
unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
pmd_t entry;
- int i;
vm_fault_t ret = VM_FAULT_FALLBACK;

if (!transhuge_vma_suitable(vma, haddr))
@@ -4433,8 +4432,7 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
if (unlikely(!pmd_none(*vmf->pmd)))
goto out;

- for (i = 0; i < HPAGE_PMD_NR; i++)
- flush_icache_page(vma, page + i);
+ flush_icache_pages(vma, page, HPAGE_PMD_NR);

entry = mk_huge_pmd(page, vma->vm_page_prot);
if (write)
--
2.40.1


2023-08-02 16:50:21

by Matthew Wilcox

Subject: [PATCH v6 23/38] s390: Implement the new page table range API

Add set_ptes() and update_mmu_cache_range().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Gerald Schaefer <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: [email protected]
---
arch/s390/include/asm/pgtable.h | 33 ++++++++++++++++++++++++---------
1 file changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 30909fe27c24..d28d2e5e68ee 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -47,6 +47,7 @@ static inline void update_page_count(int level, long count)
* tables contain all the necessary information.
*/
#define update_mmu_cache(vma, address, ptep) do { } while (0)
+#define update_mmu_cache_range(vmf, vma, addr, ptep, nr) do { } while (0)
#define update_mmu_cache_pmd(vma, address, ptep) do { } while (0)

/*
@@ -1314,20 +1315,34 @@ pgprot_t pgprot_writecombine(pgprot_t prot);
pgprot_t pgprot_writethrough(pgprot_t prot);

/*
- * Certain architectures need to do special things when PTEs
- * within a page table are directly modified. Thus, the following
- * hook is made available.
+ * Set multiple PTEs to consecutive pages with a single call. All PTEs
+ * are within the same folio, PMD and VMA.
*/
-static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep, pte_t entry)
+static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep, pte_t entry, unsigned int nr)
{
if (pte_present(entry))
entry = clear_pte_bit(entry, __pgprot(_PAGE_UNUSED));
- if (mm_has_pgste(mm))
- ptep_set_pte_at(mm, addr, ptep, entry);
- else
- set_pte(ptep, entry);
+ if (mm_has_pgste(mm)) {
+ for (;;) {
+ ptep_set_pte_at(mm, addr, ptep, entry);
+ if (--nr == 0)
+ break;
+ ptep++;
+ entry = __pte(pte_val(entry) + PAGE_SIZE);
+ addr += PAGE_SIZE;
+ }
+ } else {
+ for (;;) {
+ set_pte(ptep, entry);
+ if (--nr == 0)
+ break;
+ ptep++;
+ entry = __pte(pte_val(entry) + PAGE_SIZE);
+ }
+ }
}
+#define set_ptes set_ptes

/*
* Conversion functions: convert a page and protection to a page entry,
--
2.40.1


2023-08-02 16:50:48

by Matthew Wilcox

Subject: [PATCH v6 38/38] mm: Call update_mmu_cache_range() in more page fault handling paths

Pass the vm_fault to the architecture to help it make smarter decisions
about which PTEs to insert into the TLB.
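
As a hedged illustration of what an architecture could do with the extra
argument (this is not taken from any in-tree port, and
arch_preload_tlb_entry() is a made-up helper name), a software-managed-TLB
implementation might preload only the translation for the address that
actually faulted when vmf is available:

/*
 * Sketch only: preload the faulting translation when we know it,
 * otherwise fall back to the first PTE of the batch.
 */
static inline void update_mmu_cache_range(struct vm_fault *vmf,
		struct vm_area_struct *vma, unsigned long address,
		pte_t *ptep, unsigned int nr)
{
	unsigned long faddr = vmf ? vmf->address : address;

	arch_preload_tlb_entry(vma, faddr, ptep + (faddr - address) / PAGE_SIZE);
}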

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
mm/memory.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 621716109627..236c46e85dc2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2997,7 +2997,7 @@ static inline int __wp_page_copy_user(struct page *dst, struct page *src,

entry = pte_mkyoung(vmf->orig_pte);
if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
- update_mmu_cache(vma, addr, vmf->pte);
+ update_mmu_cache_range(vmf, vma, addr, vmf->pte, 1);
}

/*
@@ -3174,7 +3174,7 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
entry = pte_mkyoung(vmf->orig_pte);
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
pte_unmap_unlock(vmf->pte, vmf->ptl);
count_vm_event(PGREUSE);
}
@@ -3298,7 +3298,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
*/
BUG_ON(unshare && pte_write(entry));
set_pte_at_notify(mm, vmf->address, vmf->pte, entry);
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
if (old_folio) {
/*
* Only after switching the pte to the new page may
@@ -4181,7 +4181,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
}

/* No need to invalidate - it was non-present before */
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
unlock:
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -4305,7 +4305,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);

/* No need to invalidate - it was non-present before */
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
unlock:
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
@@ -4994,7 +4994,7 @@ static vm_fault_t do_numa_page(struct vm_fault *vmf)
if (writable)
pte = pte_mkwrite(pte);
ptep_modify_prot_commit(vma, vmf->address, vmf->pte, old_pte, pte);
- update_mmu_cache(vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
pte_unmap_unlock(vmf->pte, vmf->ptl);
goto out;
}
@@ -5165,7 +5165,8 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
entry = pte_mkyoung(entry);
if (ptep_set_access_flags(vmf->vma, vmf->address, vmf->pte, entry,
vmf->flags & FAULT_FLAG_WRITE)) {
- update_mmu_cache(vmf->vma, vmf->address, vmf->pte);
+ update_mmu_cache_range(vmf, vmf->vma, vmf->address,
+ vmf->pte, 1);
} else {
/* Skip spurious TLB flush for retried page fault */
if (vmf->flags & FAULT_FLAG_TRIED)
--
2.40.1


2023-08-02 16:54:10

by Matthew Wilcox

Subject: [PATCH v6 07/38] alpha: Implement the new page table range API

Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_icache_pages().
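
Providing PFN_PTE_SHIFT is what lets alpha drop its own set_pte_at() and
pick up the generic set_ptes(). Roughly, and only as a paraphrased sketch
of the common helper rather than the verbatim include/linux/pgtable.h
code, the generic loop writes one PTE per iteration and advances the
encoded PFN by shifting:

/*
 * Paraphrased sketch of the generic fallback: each step stores a PTE and
 * bumps the pfn encoded in it by 1 << PFN_PTE_SHIFT.
 */
static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
		pte_t *ptep, pte_t pte, unsigned int nr)
{
	arch_enter_lazy_mmu_mode();
	for (;;) {
		set_pte(ptep, pte);
		if (--nr == 0)
			break;
		ptep++;
		pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
	}
	arch_leave_lazy_mmu_mode();
}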

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: [email protected]
---
arch/alpha/include/asm/cacheflush.h | 10 ++++++++++
arch/alpha/include/asm/pgtable.h | 10 ++++++++--
2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/include/asm/cacheflush.h b/arch/alpha/include/asm/cacheflush.h
index 9945ff483eaf..3956460e69e2 100644
--- a/arch/alpha/include/asm/cacheflush.h
+++ b/arch/alpha/include/asm/cacheflush.h
@@ -57,6 +57,16 @@ extern void flush_icache_user_page(struct vm_area_struct *vma,
#define flush_icache_page(vma, page) \
flush_icache_user_page((vma), (page), 0, 0)

+/*
+ * Both implementations of flush_icache_user_page flush the entire
+ * address space, so one call, no matter how many pages.
+ */
+static inline void flush_icache_pages(struct vm_area_struct *vma,
+ struct page *page, unsigned int nr)
+{
+ flush_icache_user_page(vma, page, 0, 0);
+}
+
#include <asm-generic/cacheflush.h>

#endif /* _ALPHA_CACHEFLUSH_H */
diff --git a/arch/alpha/include/asm/pgtable.h b/arch/alpha/include/asm/pgtable.h
index ba43cb841d19..747b5f706c47 100644
--- a/arch/alpha/include/asm/pgtable.h
+++ b/arch/alpha/include/asm/pgtable.h
@@ -26,7 +26,6 @@ struct vm_area_struct;
* hook is made available.
*/
#define set_pte(pteptr, pteval) ((*(pteptr)) = (pteval))
-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)

/* PMD_SHIFT determines the size of the area a second-level page table can map */
#define PMD_SHIFT (PAGE_SHIFT + (PAGE_SHIFT-3))
@@ -189,7 +188,8 @@ extern unsigned long __zero_page(void);
* and a page entry and page directory to the page they refer to.
*/
#define page_to_pa(page) (page_to_pfn(page) << PAGE_SHIFT)
-#define pte_pfn(pte) (pte_val(pte) >> 32)
+#define PFN_PTE_SHIFT 32
+#define pte_pfn(pte) (pte_val(pte) >> PFN_PTE_SHIFT)

#define pte_page(pte) pfn_to_page(pte_pfn(pte))
#define mk_pte(page, pgprot) \
@@ -303,6 +303,12 @@ extern inline void update_mmu_cache(struct vm_area_struct * vma,
{
}

+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
+{
+}
+
/*
* Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
* are !pte_none() && !pte_present().
--
2.40.1


2023-08-02 16:54:10

by Matthew Wilcox

Subject: [PATCH v6 21/38] powerpc: Implement the new page table range API

Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to
per-folio.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: [email protected]
---
arch/powerpc/include/asm/book3s/32/pgtable.h | 5 --
arch/powerpc/include/asm/book3s/64/pgtable.h | 6 +--
arch/powerpc/include/asm/book3s/pgtable.h | 11 ++--
arch/powerpc/include/asm/cacheflush.h | 14 ++++--
arch/powerpc/include/asm/kvm_ppc.h | 10 ++--
arch/powerpc/include/asm/nohash/pgtable.h | 16 ++----
arch/powerpc/include/asm/pgtable.h | 12 +++++
arch/powerpc/mm/book3s64/hash_utils.c | 11 ++--
arch/powerpc/mm/cacheflush.c | 40 +++++----------
arch/powerpc/mm/nohash/e500_hugetlbpage.c | 3 +-
arch/powerpc/mm/pgtable.c | 53 ++++++++++++--------
11 files changed, 88 insertions(+), 93 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 7bf1fe7297c6..5f12b9382909 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -462,11 +462,6 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
pgprot_val(pgprot));
}

-static inline unsigned long pte_pfn(pte_t pte)
-{
- return pte_val(pte) >> PTE_RPN_SHIFT;
-}
-
/* Generic modifiers for PTE bits */
static inline pte_t pte_wrprotect(pte_t pte)
{
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index a8204566cfd0..8269b231c533 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -104,6 +104,7 @@
* and every thing below PAGE_SHIFT;
*/
#define PTE_RPN_MASK (((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
+#define PTE_RPN_SHIFT PAGE_SHIFT
/*
* set of bits not changed in pmd_modify. Even though we have hash specific bits
* in here, on radix we expect them to be zero.
@@ -569,11 +570,6 @@ static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
return __pte(((pte_basic_t)pfn << PAGE_SHIFT) | pgprot_val(pgprot) | _PAGE_PTE);
}

-static inline unsigned long pte_pfn(pte_t pte)
-{
- return (pte_val(pte) & PTE_RPN_MASK) >> PAGE_SHIFT;
-}
-
/* Generic modifiers for PTE bits */
static inline pte_t pte_wrprotect(pte_t pte)
{
diff --git a/arch/powerpc/include/asm/book3s/pgtable.h b/arch/powerpc/include/asm/book3s/pgtable.h
index d18b748ea3ae..3b7bd36a2321 100644
--- a/arch/powerpc/include/asm/book3s/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/pgtable.h
@@ -9,13 +9,6 @@
#endif

#ifndef __ASSEMBLY__
-/* Insert a PTE, top-level function is out of line. It uses an inline
- * low level function in the respective pgtable-* files
- */
-extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
- pte_t pte);
-
-
#define __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
extern int ptep_set_access_flags(struct vm_area_struct *vma, unsigned long address,
pte_t *ptep, pte_t entry, int dirty);
@@ -36,7 +29,9 @@ void __update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t
* corresponding HPTE into the hash table ahead of time, instead of
* waiting for the inevitable extra hash-table miss exception.
*/
-static inline void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
{
if (IS_ENABLED(CONFIG_PPC32) && !mmu_has_feature(MMU_FTR_HPTE_TABLE))
return;
diff --git a/arch/powerpc/include/asm/cacheflush.h b/arch/powerpc/include/asm/cacheflush.h
index 7564dd4fd12b..ef7d2de33b89 100644
--- a/arch/powerpc/include/asm/cacheflush.h
+++ b/arch/powerpc/include/asm/cacheflush.h
@@ -35,13 +35,19 @@ static inline void flush_cache_vmap(unsigned long start, unsigned long end)
* It just marks the page as not i-cache clean. We do the i-cache
* flush later when the page is given to a user process, if necessary.
*/
-static inline void flush_dcache_page(struct page *page)
+static inline void flush_dcache_folio(struct folio *folio)
{
if (cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
return;
/* avoid an atomic op if possible */
- if (test_bit(PG_dcache_clean, &page->flags))
- clear_bit(PG_dcache_clean, &page->flags);
+ if (test_bit(PG_dcache_clean, &folio->flags))
+ clear_bit(PG_dcache_clean, &folio->flags);
+}
+#define flush_dcache_folio flush_dcache_folio
+
+static inline void flush_dcache_page(struct page *page)
+{
+ flush_dcache_folio(page_folio(page));
}

void flush_icache_range(unsigned long start, unsigned long stop);
@@ -51,7 +57,7 @@ void flush_icache_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long addr, int len);
#define flush_icache_user_page flush_icache_user_page

-void flush_dcache_icache_page(struct page *page);
+void flush_dcache_icache_folio(struct folio *folio);

/**
* flush_dcache_range(): Write any modified data cache blocks out to memory and
diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index d16d80ad2ae4..b4da8514af43 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -894,7 +894,7 @@ void kvmppc_init_lpid(unsigned long nr_lpids);

static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
{
- struct page *page;
+ struct folio *folio;
/*
* We can only access pages that the kernel maps
* as memory. Bail out for unmapped ones.
@@ -903,10 +903,10 @@ static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
return;

/* Clear i-cache for new pages */
- page = pfn_to_page(pfn);
- if (!test_bit(PG_dcache_clean, &page->flags)) {
- flush_dcache_icache_page(page);
- set_bit(PG_dcache_clean, &page->flags);
+ folio = page_folio(pfn_to_page(pfn));
+ if (!test_bit(PG_dcache_clean, &folio->flags)) {
+ flush_dcache_icache_folio(folio);
+ set_bit(PG_dcache_clean, &folio->flags);
}
}

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index a6caaaab6f92..56ea48276356 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -101,8 +101,6 @@ static inline bool pte_access_permitted(pte_t pte, bool write)
static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot) {
return __pte(((pte_basic_t)(pfn) << PTE_RPN_SHIFT) |
pgprot_val(pgprot)); }
-static inline unsigned long pte_pfn(pte_t pte) {
- return pte_val(pte) >> PTE_RPN_SHIFT; }

/* Generic modifiers for PTE bits */
static inline pte_t pte_exprotect(pte_t pte)
@@ -166,12 +164,6 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
return __pte(pte_val(pte) & ~_PAGE_SWP_EXCLUSIVE);
}

-/* Insert a PTE, top-level function is out of line. It uses an inline
- * low level function in the respective pgtable-* files
- */
-extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
- pte_t pte);
-
/* This low level function performs the actual PTE insertion
* Setting the PTE depends on the MMU type and other factors. It's
* an horrible mess that I'm not going to try to clean up now but
@@ -282,10 +274,12 @@ static inline int pud_huge(pud_t pud)
* for the page which has just been mapped in.
*/
#if defined(CONFIG_PPC_E500) && defined(CONFIG_HUGETLB_PAGE)
-void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep);
+void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep, unsigned int nr);
#else
-static inline
-void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep) {}
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr) {}
#endif

#endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index a4893b17705a..11675d97e723 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -41,6 +41,12 @@ struct mm_struct;

#ifndef __ASSEMBLY__

+void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+ pte_t pte, unsigned int nr);
+#define set_ptes set_ptes
+#define update_mmu_cache(vma, addr, ptep) \
+ update_mmu_cache_range(NULL, vma, addr, ptep, 1)
+
#ifndef MAX_PTRS_PER_PGD
#define MAX_PTRS_PER_PGD PTRS_PER_PGD
#endif
@@ -48,6 +54,12 @@ struct mm_struct;
/* Keep these as a macros to avoid include dependency mess */
#define pte_page(x) pfn_to_page(pte_pfn(x))
#define mk_pte(page, pgprot) pfn_pte(page_to_pfn(page), (pgprot))
+
+static inline unsigned long pte_pfn(pte_t pte)
+{
+ return (pte_val(pte) & PTE_RPN_MASK) >> PTE_RPN_SHIFT;
+}
+
/*
* Select all bits except the pfn
*/
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index fedffe3ae136..ad2afa08e62e 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -1307,18 +1307,19 @@ void hash__early_init_mmu_secondary(void)
*/
unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
{
- struct page *page;
+ struct folio *folio;

if (!pfn_valid(pte_pfn(pte)))
return pp;

- page = pte_page(pte);
+ folio = page_folio(pte_page(pte));

/* page is dirty */
- if (!test_bit(PG_dcache_clean, &page->flags) && !PageReserved(page)) {
+ if (!test_bit(PG_dcache_clean, &folio->flags) &&
+ !folio_test_reserved(folio)) {
if (trap == INTERRUPT_INST_STORAGE) {
- flush_dcache_icache_page(page);
- set_bit(PG_dcache_clean, &page->flags);
+ flush_dcache_icache_folio(folio);
+ set_bit(PG_dcache_clean, &folio->flags);
} else
pp |= HPTE_R_N;
}
diff --git a/arch/powerpc/mm/cacheflush.c b/arch/powerpc/mm/cacheflush.c
index 0e9b4879c0f9..8760d2223abe 100644
--- a/arch/powerpc/mm/cacheflush.c
+++ b/arch/powerpc/mm/cacheflush.c
@@ -148,44 +148,30 @@ static void __flush_dcache_icache(void *p)
invalidate_icache_range(addr, addr + PAGE_SIZE);
}

-static void flush_dcache_icache_hugepage(struct page *page)
+void flush_dcache_icache_folio(struct folio *folio)
{
- int i;
- int nr = compound_nr(page);
+ unsigned int i, nr = folio_nr_pages(folio);

- if (!PageHighMem(page)) {
+ if (flush_coherent_icache())
+ return;
+
+ if (!folio_test_highmem(folio)) {
+ void *addr = folio_address(folio);
for (i = 0; i < nr; i++)
- __flush_dcache_icache(lowmem_page_address(page + i));
- } else {
+ __flush_dcache_icache(addr + i * PAGE_SIZE);
+ } else if (IS_ENABLED(CONFIG_BOOKE) || sizeof(phys_addr_t) > sizeof(void *)) {
for (i = 0; i < nr; i++) {
- void *start = kmap_local_page(page + i);
+ void *start = kmap_local_folio(folio, i * PAGE_SIZE);

__flush_dcache_icache(start);
kunmap_local(start);
}
- }
-}
-
-void flush_dcache_icache_page(struct page *page)
-{
- if (flush_coherent_icache())
- return;
-
- if (PageCompound(page))
- return flush_dcache_icache_hugepage(page);
-
- if (!PageHighMem(page)) {
- __flush_dcache_icache(lowmem_page_address(page));
- } else if (IS_ENABLED(CONFIG_BOOKE) || sizeof(phys_addr_t) > sizeof(void *)) {
- void *start = kmap_local_page(page);
-
- __flush_dcache_icache(start);
- kunmap_local(start);
} else {
- flush_dcache_icache_phys(page_to_phys(page));
+ unsigned long pfn = folio_pfn(folio);
+ for (i = 0; i < nr; i++)
+ flush_dcache_icache_phys((pfn + i) * PAGE_SIZE);
}
}
-EXPORT_SYMBOL(flush_dcache_icache_page);

void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
{
diff --git a/arch/powerpc/mm/nohash/e500_hugetlbpage.c b/arch/powerpc/mm/nohash/e500_hugetlbpage.c
index 58c8d9849cb1..6b30e40d4590 100644
--- a/arch/powerpc/mm/nohash/e500_hugetlbpage.c
+++ b/arch/powerpc/mm/nohash/e500_hugetlbpage.c
@@ -178,7 +178,8 @@ book3e_hugetlb_preload(struct vm_area_struct *vma, unsigned long ea, pte_t pte)
*
* This must always be called with the pte lock held.
*/
-void update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
+void update_mmu_cache_range(struct vm_fault *vmf, struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep, unsigned int nr)
{
if (is_vm_hugetlb_page(vma))
book3e_hugetlb_preload(vma, address, *ptep);
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index a3dcdb2d5b4b..3f86fd217690 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -58,7 +58,7 @@ static inline int pte_looks_normal(pte_t pte)
return 0;
}

-static struct page *maybe_pte_to_page(pte_t pte)
+static struct folio *maybe_pte_to_folio(pte_t pte)
{
unsigned long pfn = pte_pfn(pte);
struct page *page;
@@ -68,7 +68,7 @@ static struct page *maybe_pte_to_page(pte_t pte)
page = pfn_to_page(pfn);
if (PageReserved(page))
return NULL;
- return page;
+ return page_folio(page);
}

#ifdef CONFIG_PPC_BOOK3S
@@ -84,12 +84,12 @@ static pte_t set_pte_filter_hash(pte_t pte)
pte = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
if (pte_looks_normal(pte) && !(cpu_has_feature(CPU_FTR_COHERENT_ICACHE) ||
cpu_has_feature(CPU_FTR_NOEXECUTE))) {
- struct page *pg = maybe_pte_to_page(pte);
- if (!pg)
+ struct folio *folio = maybe_pte_to_folio(pte);
+ if (!folio)
return pte;
- if (!test_bit(PG_dcache_clean, &pg->flags)) {
- flush_dcache_icache_page(pg);
- set_bit(PG_dcache_clean, &pg->flags);
+ if (!test_bit(PG_dcache_clean, &folio->flags)) {
+ flush_dcache_icache_folio(folio);
+ set_bit(PG_dcache_clean, &folio->flags);
}
}
return pte;
@@ -107,7 +107,7 @@ static pte_t set_pte_filter_hash(pte_t pte) { return pte; }
*/
static inline pte_t set_pte_filter(pte_t pte)
{
- struct page *pg;
+ struct folio *folio;

if (radix_enabled())
return pte;
@@ -120,18 +120,18 @@ static inline pte_t set_pte_filter(pte_t pte)
return pte;

/* If you set _PAGE_EXEC on weird pages you're on your own */
- pg = maybe_pte_to_page(pte);
- if (unlikely(!pg))
+ folio = maybe_pte_to_folio(pte);
+ if (unlikely(!folio))
return pte;

/* If the page clean, we move on */
- if (test_bit(PG_dcache_clean, &pg->flags))
+ if (test_bit(PG_dcache_clean, &folio->flags))
return pte;

/* If it's an exec fault, we flush the cache and make it clean */
if (is_exec_fault()) {
- flush_dcache_icache_page(pg);
- set_bit(PG_dcache_clean, &pg->flags);
+ flush_dcache_icache_folio(folio);
+ set_bit(PG_dcache_clean, &folio->flags);
return pte;
}

@@ -142,7 +142,7 @@ static inline pte_t set_pte_filter(pte_t pte)
static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
int dirty)
{
- struct page *pg;
+ struct folio *folio;

if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
return pte;
@@ -168,17 +168,17 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
#endif /* CONFIG_DEBUG_VM */

/* If you set _PAGE_EXEC on weird pages you're on your own */
- pg = maybe_pte_to_page(pte);
- if (unlikely(!pg))
+ folio = maybe_pte_to_folio(pte);
+ if (unlikely(!folio))
goto bail;

/* If the page is already clean, we move on */
- if (test_bit(PG_dcache_clean, &pg->flags))
+ if (test_bit(PG_dcache_clean, &folio->flags))
goto bail;

/* Clean the page and set PG_dcache_clean */
- flush_dcache_icache_page(pg);
- set_bit(PG_dcache_clean, &pg->flags);
+ flush_dcache_icache_folio(folio);
+ set_bit(PG_dcache_clean, &folio->flags);

bail:
return pte_mkexec(pte);
@@ -187,8 +187,8 @@ static pte_t set_access_flags_filter(pte_t pte, struct vm_area_struct *vma,
/*
* set_pte stores a linux PTE into the linux page table.
*/
-void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
- pte_t pte)
+void set_ptes(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
+ pte_t pte, unsigned int nr)
{
/*
* Make sure hardware valid bit is not set. We don't do
@@ -203,7 +203,16 @@ void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
pte = set_pte_filter(pte);

/* Perform the setting of the PTE */
- __set_pte_at(mm, addr, ptep, pte, 0);
+ arch_enter_lazy_mmu_mode();
+ for (;;) {
+ __set_pte_at(mm, addr, ptep, pte, 0);
+ if (--nr == 0)
+ break;
+ ptep++;
+ pte = __pte(pte_val(pte) + (1UL << PTE_RPN_SHIFT));
+ addr += PAGE_SIZE;
+ }
+ arch_leave_lazy_mmu_mode();
}

void unmap_kernel_page(unsigned long va)
--
2.40.1


2023-08-02 16:56:21

by Matthew Wilcox

Subject: [PATCH v6 12/38] hexagon: Implement the new page table range API

Add PFN_PTE_SHIFT and update_mmu_cache_range().

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Brian Cain <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
---
arch/hexagon/include/asm/cacheflush.h | 8 ++++++--
arch/hexagon/include/asm/pgtable.h | 9 +--------
2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/hexagon/include/asm/cacheflush.h b/arch/hexagon/include/asm/cacheflush.h
index 6eff0730e6ef..dc3f500a5a01 100644
--- a/arch/hexagon/include/asm/cacheflush.h
+++ b/arch/hexagon/include/asm/cacheflush.h
@@ -58,12 +58,16 @@ extern void flush_cache_all_hexagon(void);
* clean the cache when the PTE is set.
*
*/
-static inline void update_mmu_cache(struct vm_area_struct *vma,
- unsigned long address, pte_t *ptep)
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
{
/* generic_ptrace_pokedata doesn't wind up here, does it? */
}

+#define update_mmu_cache(vma, addr, ptep) \
+ update_mmu_cache_range(NULL, vma, addr, ptep, 1)
+
void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long vaddr, void *dst, void *src, int len);
#define copy_to_user_page copy_to_user_page
diff --git a/arch/hexagon/include/asm/pgtable.h b/arch/hexagon/include/asm/pgtable.h
index 59393613d086..dd05dd71b8ec 100644
--- a/arch/hexagon/include/asm/pgtable.h
+++ b/arch/hexagon/include/asm/pgtable.h
@@ -338,6 +338,7 @@ static inline int pte_exec(pte_t pte)
/* __swp_entry_to_pte - extract PTE from swap entry */
#define __swp_entry_to_pte(x) ((pte_t) { (x).val })

+#define PFN_PTE_SHIFT PAGE_SHIFT
/* pfn_pte - convert page number and protection value to page table entry */
#define pfn_pte(pfn, pgprot) __pte((pfn << PAGE_SHIFT) | pgprot_val(pgprot))

@@ -345,14 +346,6 @@ static inline int pte_exec(pte_t pte)
#define pte_pfn(pte) (pte_val(pte) >> PAGE_SHIFT)
#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval))

-/*
- * set_pte_at - update page table and do whatever magic may be
- * necessary to make the underlying hardware/firmware take note.
- *
- * VM may require a virtual instruction to alert the MMU.
- */
-#define set_pte_at(mm, addr, ptep, pte) set_pte(ptep, pte)
-
static inline unsigned long pmd_page_vaddr(pmd_t pmd)
{
return (unsigned long)__va(pmd_val(pmd) & PAGE_MASK);
--
2.40.1


2023-08-02 17:13:43

by Matthew Wilcox

Subject: [PATCH v6 02/38] mm: Convert page_table_check_pte_set() to page_table_check_ptes_set()

Tell the page table check how many PTEs & PFNs we want it to check.
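
The callers converted below still pass nr == 1; the point is that a
batched set_ptes() can check a whole run with one call. As a hedged
sketch (not part of this patch; __set_ptes() is a hypothetical arch
helper), such a caller would look like:

static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
		pte_t *ptep, pte_t pte, unsigned int nr)
{
	/* One check for the whole run instead of one per PTE. */
	page_table_check_ptes_set(mm, ptep, pte, nr);
	__set_ptes(mm, addr, ptep, pte, nr);
}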

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Acked-by: Pasha Tatashin <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
---
arch/arm64/include/asm/pgtable.h | 2 +-
arch/riscv/include/asm/pgtable.h | 2 +-
arch/x86/include/asm/pgtable.h | 2 +-
include/linux/page_table_check.h | 13 +++++++------
mm/page_table_check.c | 16 +++++++++-------
5 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index fe4b913589ee..445b18d7a47c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -348,7 +348,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
- page_table_check_pte_set(mm, ptep, pte);
+ page_table_check_ptes_set(mm, ptep, pte, 1);
return __set_pte_at(mm, addr, ptep, pte);
}

diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 44377f0d7c35..01e4aabc8898 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -499,7 +499,7 @@ static inline void __set_pte_at(struct mm_struct *mm,
static inline void set_pte_at(struct mm_struct *mm,
unsigned long addr, pte_t *ptep, pte_t pteval)
{
- page_table_check_pte_set(mm, ptep, pteval);
+ page_table_check_ptes_set(mm, ptep, pteval, 1);
__set_pte_at(mm, addr, ptep, pteval);
}

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index ada1bbf12961..cd0b6337d03c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1023,7 +1023,7 @@ static inline pud_t native_local_pudp_get_and_clear(pud_t *pudp)
static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte)
{
- page_table_check_pte_set(mm, ptep, pte);
+ page_table_check_ptes_set(mm, ptep, pte, 1);
set_pte(ptep, pte);
}

diff --git a/include/linux/page_table_check.h b/include/linux/page_table_check.h
index 7f6b9bf926c5..6722941c7cb8 100644
--- a/include/linux/page_table_check.h
+++ b/include/linux/page_table_check.h
@@ -17,7 +17,8 @@ void __page_table_check_zero(struct page *page, unsigned int order);
void __page_table_check_pte_clear(struct mm_struct *mm, pte_t pte);
void __page_table_check_pmd_clear(struct mm_struct *mm, pmd_t pmd);
void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud);
-void __page_table_check_pte_set(struct mm_struct *mm, pte_t *ptep, pte_t pte);
+void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+ unsigned int nr);
void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd);
void __page_table_check_pud_set(struct mm_struct *mm, pud_t *pudp, pud_t pud);
void __page_table_check_pte_clear_range(struct mm_struct *mm,
@@ -64,13 +65,13 @@ static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud)
__page_table_check_pud_clear(mm, pud);
}

-static inline void page_table_check_pte_set(struct mm_struct *mm, pte_t *ptep,
- pte_t pte)
+static inline void page_table_check_ptes_set(struct mm_struct *mm,
+ pte_t *ptep, pte_t pte, unsigned int nr)
{
if (static_branch_likely(&page_table_check_disabled))
return;

- __page_table_check_pte_set(mm, ptep, pte);
+ __page_table_check_ptes_set(mm, ptep, pte, nr);
}

static inline void page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp,
@@ -123,8 +124,8 @@ static inline void page_table_check_pud_clear(struct mm_struct *mm, pud_t pud)
{
}

-static inline void page_table_check_pte_set(struct mm_struct *mm, pte_t *ptep,
- pte_t pte)
+static inline void page_table_check_ptes_set(struct mm_struct *mm,
+ pte_t *ptep, pte_t pte, unsigned int nr)
{
}

diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 46e77c12c81e..af69c3c8f7c2 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -182,18 +182,20 @@ void __page_table_check_pud_clear(struct mm_struct *mm, pud_t pud)
}
EXPORT_SYMBOL(__page_table_check_pud_clear);

-void __page_table_check_pte_set(struct mm_struct *mm, pte_t *ptep, pte_t pte)
+void __page_table_check_ptes_set(struct mm_struct *mm, pte_t *ptep, pte_t pte,
+ unsigned int nr)
{
+ unsigned int i;
+
if (&init_mm == mm)
return;

- __page_table_check_pte_clear(mm, ptep_get(ptep));
- if (pte_user_accessible_page(pte)) {
- page_table_check_set(pte_pfn(pte), PAGE_SIZE >> PAGE_SHIFT,
- pte_write(pte));
- }
+ for (i = 0; i < nr; i++)
+ __page_table_check_pte_clear(mm, ptep_get(ptep + i));
+ if (pte_user_accessible_page(pte))
+ page_table_check_set(pte_pfn(pte), nr, pte_write(pte));
}
-EXPORT_SYMBOL(__page_table_check_pte_set);
+EXPORT_SYMBOL(__page_table_check_ptes_set);

void __page_table_check_pmd_set(struct mm_struct *mm, pmd_t *pmdp, pmd_t pmd)
{
--
2.40.1


2023-08-02 17:15:04

by Matthew Wilcox

Subject: [PATCH v6 01/38] minmax: Add in_range() macro

Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also gives useful
behaviour, in some circumstances, when start + len overflows the type.
Convert all the conflicting definitions of in_range() within the kernel;
some can use the generic definition while others need their own.
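
A worked example of the difference, with illustrative u32 values:

#include <linux/minmax.h>

static bool in_range_example(void)
{
	/*
	 * (5 - 0xfffffff0) wraps to 0x15, which is below the length 0x20,
	 * so in_range() says yes. The open-coded test computes
	 * 0xfffffff0 + 0x20 as 0x10 after overflow and its "start <= val"
	 * half fails, so it says no. They answer different questions when
	 * start + len overflows, which is why the conversion below is not
	 * done blindly.
	 */
	bool wrapped = in_range(5u, 0xfffffff0u, 0x20u);		 /* true  */
	bool naive = (0xfffffff0u <= 5u) && (5u < 0xfffffff0u + 0x20u);	 /* false */

	return wrapped && !naive;
}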

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
arch/arm/mm/pageattr.c | 6 ++---
.../drm/arm/display/include/malidp_utils.h | 2 +-
.../display/komeda/komeda_pipeline_state.c | 24 ++++++++---------
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 -----
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 18 ++++++-------
drivers/virt/acrn/ioreq.c | 4 +--
fs/btrfs/misc.h | 2 --
fs/ext2/balloc.c | 2 --
fs/ext4/ext4.h | 2 --
fs/ufs/util.h | 6 -----
include/linux/minmax.h | 27 +++++++++++++++++++
lib/logic_pio.c | 3 ---
net/netfilter/nf_nat_core.c | 6 ++---
net/tipc/core.h | 2 +-
net/tipc/link.c | 10 +++----
.../selftests/bpf/progs/get_branch_snapshot.c | 4 +--
16 files changed, 65 insertions(+), 59 deletions(-)

diff --git a/arch/arm/mm/pageattr.c b/arch/arm/mm/pageattr.c
index c3c34fe714b0..064ad508c149 100644
--- a/arch/arm/mm/pageattr.c
+++ b/arch/arm/mm/pageattr.c
@@ -25,7 +25,7 @@ static int change_page_range(pte_t *ptep, unsigned long addr, void *data)
return 0;
}

-static bool in_range(unsigned long start, unsigned long size,
+static bool range_in_range(unsigned long start, unsigned long size,
unsigned long range_start, unsigned long range_end)
{
return start >= range_start && start < range_end &&
@@ -63,8 +63,8 @@ static int change_memory_common(unsigned long addr, int numpages,
if (!size)
return 0;

- if (!in_range(start, size, MODULES_VADDR, MODULES_END) &&
- !in_range(start, size, VMALLOC_START, VMALLOC_END))
+ if (!range_in_range(start, size, MODULES_VADDR, MODULES_END) &&
+ !range_in_range(start, size, VMALLOC_START, VMALLOC_END))
return -EINVAL;

return __change_memory_common(start, size, set_mask, clear_mask);
diff --git a/drivers/gpu/drm/arm/display/include/malidp_utils.h b/drivers/gpu/drm/arm/display/include/malidp_utils.h
index 49a1d7f3539c..9f83baac6ed8 100644
--- a/drivers/gpu/drm/arm/display/include/malidp_utils.h
+++ b/drivers/gpu/drm/arm/display/include/malidp_utils.h
@@ -35,7 +35,7 @@ static inline void set_range(struct malidp_range *rg, u32 start, u32 end)
rg->end = end;
}

-static inline bool in_range(struct malidp_range *rg, u32 v)
+static inline bool malidp_in_range(struct malidp_range *rg, u32 v)
{
return (v >= rg->start) && (v <= rg->end);
}
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
index 3276a3e82c62..4618687a8f4d 100644
--- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
+++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c
@@ -305,12 +305,12 @@ komeda_layer_check_cfg(struct komeda_layer *layer,
if (komeda_fb_check_src_coords(kfb, src_x, src_y, src_w, src_h))
return -EINVAL;

- if (!in_range(&layer->hsize_in, src_w)) {
+ if (!malidp_in_range(&layer->hsize_in, src_w)) {
DRM_DEBUG_ATOMIC("invalidate src_w %d.\n", src_w);
return -EINVAL;
}

- if (!in_range(&layer->vsize_in, src_h)) {
+ if (!malidp_in_range(&layer->vsize_in, src_h)) {
DRM_DEBUG_ATOMIC("invalidate src_h %d.\n", src_h);
return -EINVAL;
}
@@ -452,14 +452,14 @@ komeda_scaler_check_cfg(struct komeda_scaler *scaler,
hsize_out = dflow->out_w;
vsize_out = dflow->out_h;

- if (!in_range(&scaler->hsize, hsize_in) ||
- !in_range(&scaler->hsize, hsize_out)) {
+ if (!malidp_in_range(&scaler->hsize, hsize_in) ||
+ !malidp_in_range(&scaler->hsize, hsize_out)) {
DRM_DEBUG_ATOMIC("Invalid horizontal sizes");
return -EINVAL;
}

- if (!in_range(&scaler->vsize, vsize_in) ||
- !in_range(&scaler->vsize, vsize_out)) {
+ if (!malidp_in_range(&scaler->vsize, vsize_in) ||
+ !malidp_in_range(&scaler->vsize, vsize_out)) {
DRM_DEBUG_ATOMIC("Invalid vertical sizes");
return -EINVAL;
}
@@ -574,13 +574,13 @@ komeda_splitter_validate(struct komeda_splitter *splitter,
return -EINVAL;
}

- if (!in_range(&splitter->hsize, dflow->in_w)) {
+ if (!malidp_in_range(&splitter->hsize, dflow->in_w)) {
DRM_DEBUG_ATOMIC("split in_w:%d is out of the acceptable range.\n",
dflow->in_w);
return -EINVAL;
}

- if (!in_range(&splitter->vsize, dflow->in_h)) {
+ if (!malidp_in_range(&splitter->vsize, dflow->in_h)) {
DRM_DEBUG_ATOMIC("split in_h: %d exceeds the acceptable range.\n",
dflow->in_h);
return -EINVAL;
@@ -624,13 +624,13 @@ komeda_merger_validate(struct komeda_merger *merger,
return -EINVAL;
}

- if (!in_range(&merger->hsize_merged, output->out_w)) {
+ if (!malidp_in_range(&merger->hsize_merged, output->out_w)) {
DRM_DEBUG_ATOMIC("merged_w: %d is out of the accepted range.\n",
output->out_w);
return -EINVAL;
}

- if (!in_range(&merger->vsize_merged, output->out_h)) {
+ if (!malidp_in_range(&merger->vsize_merged, output->out_h)) {
DRM_DEBUG_ATOMIC("merged_h: %d is out of the accepted range.\n",
output->out_h);
return -EINVAL;
@@ -866,8 +866,8 @@ void komeda_complete_data_flow_cfg(struct komeda_layer *layer,
* input/output range.
*/
if (dflow->en_scaling && scaler)
- dflow->en_split = !in_range(&scaler->hsize, dflow->in_w) ||
- !in_range(&scaler->hsize, dflow->out_w);
+ dflow->en_split = !malidp_in_range(&scaler->hsize, dflow->in_w) ||
+ !malidp_in_range(&scaler->hsize, dflow->out_w);
}

static bool merger_is_available(struct komeda_pipeline *pipe,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index b20ef6c8ea26..57dc601fc95a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -678,12 +678,6 @@ struct block_header {
u32 data[];
};

-/* this should be a general kernel helper */
-static int in_range(u32 addr, u32 start, u32 size)
-{
- return addr >= start && addr < start + size;
-}
-
static bool fw_block_mem(struct a6xx_gmu_bo *bo, const struct block_header *blk)
{
if (!in_range(blk->addr, bo->iova, bo->size))
diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index 9b84c8d8d309..d117022d15d7 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -2126,7 +2126,7 @@ static const struct ethtool_ops cxgb_ethtool_ops = {
.set_link_ksettings = set_link_ksettings,
};

-static int in_range(int val, int lo, int hi)
+static int cxgb_in_range(int val, int lo, int hi)
{
return val < 0 || (val <= hi && val >= lo);
}
@@ -2162,19 +2162,19 @@ static int cxgb_siocdevprivate(struct net_device *dev,
return -EINVAL;
if (t.qset_idx >= SGE_QSETS)
return -EINVAL;
- if (!in_range(t.intr_lat, 0, M_NEWTIMER) ||
- !in_range(t.cong_thres, 0, 255) ||
- !in_range(t.txq_size[0], MIN_TXQ_ENTRIES,
+ if (!cxgb_in_range(t.intr_lat, 0, M_NEWTIMER) ||
+ !cxgb_in_range(t.cong_thres, 0, 255) ||
+ !cxgb_in_range(t.txq_size[0], MIN_TXQ_ENTRIES,
MAX_TXQ_ENTRIES) ||
- !in_range(t.txq_size[1], MIN_TXQ_ENTRIES,
+ !cxgb_in_range(t.txq_size[1], MIN_TXQ_ENTRIES,
MAX_TXQ_ENTRIES) ||
- !in_range(t.txq_size[2], MIN_CTRL_TXQ_ENTRIES,
+ !cxgb_in_range(t.txq_size[2], MIN_CTRL_TXQ_ENTRIES,
MAX_CTRL_TXQ_ENTRIES) ||
- !in_range(t.fl_size[0], MIN_FL_ENTRIES,
+ !cxgb_in_range(t.fl_size[0], MIN_FL_ENTRIES,
MAX_RX_BUFFERS) ||
- !in_range(t.fl_size[1], MIN_FL_ENTRIES,
+ !cxgb_in_range(t.fl_size[1], MIN_FL_ENTRIES,
MAX_RX_JUMBO_BUFFERS) ||
- !in_range(t.rspq_size, MIN_RSPQ_ENTRIES,
+ !cxgb_in_range(t.rspq_size, MIN_RSPQ_ENTRIES,
MAX_RSPQ_ENTRIES))
return -EINVAL;

diff --git a/drivers/virt/acrn/ioreq.c b/drivers/virt/acrn/ioreq.c
index cecdc1c13af7..29e1ef1915fd 100644
--- a/drivers/virt/acrn/ioreq.c
+++ b/drivers/virt/acrn/ioreq.c
@@ -351,7 +351,7 @@ static bool handle_cf8cfc(struct acrn_vm *vm,
return is_handled;
}

-static bool in_range(struct acrn_ioreq_range *range,
+static bool acrn_in_range(struct acrn_ioreq_range *range,
struct acrn_io_request *req)
{
bool ret = false;
@@ -389,7 +389,7 @@ static struct acrn_ioreq_client *find_ioreq_client(struct acrn_vm *vm,
list_for_each_entry(client, &vm->ioreq_clients, list) {
read_lock_bh(&client->range_lock);
list_for_each_entry(range, &client->range_list, list) {
- if (in_range(range, req)) {
+ if (acrn_in_range(range, req)) {
found = client;
break;
}
diff --git a/fs/btrfs/misc.h b/fs/btrfs/misc.h
index 005751a12911..40f2d9f1a17a 100644
--- a/fs/btrfs/misc.h
+++ b/fs/btrfs/misc.h
@@ -8,8 +8,6 @@
#include <linux/math64.h>
#include <linux/rbtree.h>

-#define in_range(b, first, len) ((b) >= (first) && (b) < (first) + (len))
-
/*
* Enumerate bits using enum autoincrement. Define the @name as the n-th bit.
*/
diff --git a/fs/ext2/balloc.c b/fs/ext2/balloc.c
index eca60b747c6b..c8049c90323d 100644
--- a/fs/ext2/balloc.c
+++ b/fs/ext2/balloc.c
@@ -36,8 +36,6 @@
*/


-#define in_range(b, first, len) ((b) >= (first) && (b) <= (first) + (len) - 1)
-
struct ext2_group_desc * ext2_get_group_desc(struct super_block * sb,
unsigned int block_group,
struct buffer_head ** bh)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1e2259d9967d..481491e892df 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3780,8 +3780,6 @@ static inline void set_bitmap_uptodate(struct buffer_head *bh)
set_bit(BH_BITMAP_UPTODATE, &(bh)->b_state);
}

-#define in_range(b, first, len) ((b) >= (first) && (b) <= (first) + (len) - 1)
-
/* For ioend & aio unwritten conversion wait queues */
#define EXT4_WQ_HASH_SZ 37
#define ext4_ioend_wq(v) (&ext4__ioend_wq[((unsigned long)(v)) %\
diff --git a/fs/ufs/util.h b/fs/ufs/util.h
index 4931bec1a01c..89247193d96d 100644
--- a/fs/ufs/util.h
+++ b/fs/ufs/util.h
@@ -11,12 +11,6 @@
#include <linux/fs.h>
#include "swab.h"

-
-/*
- * some useful macros
- */
-#define in_range(b,first,len) ((b)>=(first)&&(b)<(first)+(len))
-
/*
* functions used for retyping
*/
diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 798c6963909f..83aebc244cba 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -3,6 +3,7 @@
#define _LINUX_MINMAX_H

#include <linux/const.h>
+#include <linux/types.h>

/*
* min()/max()/clamp() macros must accomplish three things:
@@ -222,6 +223,32 @@
*/
#define clamp_val(val, lo, hi) clamp_t(typeof(val), val, lo, hi)

+static inline bool in_range64(u64 val, u64 start, u64 len)
+{
+ return (val - start) < len;
+}
+
+static inline bool in_range32(u32 val, u32 start, u32 len)
+{
+ return (val - start) < len;
+}
+
+/**
+ * in_range - Determine if a value lies within a range.
+ * @val: Value to test.
+ * @start: First value in range.
+ * @len: Number of values in range.
+ *
+ * This is more efficient than "if (start <= val && val < (start + len))".
+ * It also gives a different answer if @start + @len overflows the size of
+ * the type by a sufficient amount to encompass @val. Decide for yourself
+ * which behaviour you want, or prove that start + len never overflow.
+ * Do not blindly replace one form with the other.
+ */
+#define in_range(val, start, len) \
+ ((sizeof(start) | sizeof(len) | sizeof(val)) <= sizeof(u32) ? \
+ in_range32(val, start, len) : in_range64(val, start, len))
+
/**
* swap - swap values of @a and @b
* @a: first value
diff --git a/lib/logic_pio.c b/lib/logic_pio.c
index 07b4b9a1f54b..2ea564a40064 100644
--- a/lib/logic_pio.c
+++ b/lib/logic_pio.c
@@ -20,9 +20,6 @@
static LIST_HEAD(io_range_list);
static DEFINE_MUTEX(io_range_mutex);

-/* Consider a kernel general helper for this */
-#define in_range(b, first, len) ((b) >= (first) && (b) < (first) + (len))
-
/**
* logic_pio_register_range - register logical PIO range for a host
* @new_range: pointer to the IO range to be registered.
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index fadbd4ed3dc0..c4e0516a8dfa 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -327,7 +327,7 @@ static bool l4proto_in_range(const struct nf_conntrack_tuple *tuple,
/* If we source map this tuple so reply looks like reply_tuple, will
* that meet the constraints of range.
*/
-static int in_range(const struct nf_conntrack_tuple *tuple,
+static int nf_in_range(const struct nf_conntrack_tuple *tuple,
const struct nf_nat_range2 *range)
{
/* If we are supposed to map IPs, then we must be in the
@@ -376,7 +376,7 @@ find_appropriate_src(struct net *net,
&ct->tuplehash[IP_CT_DIR_REPLY].tuple);
result->dst = tuple->dst;

- if (in_range(result, range))
+ if (nf_in_range(result, range))
return 1;
}
}
@@ -607,7 +607,7 @@ get_unique_tuple(struct nf_conntrack_tuple *tuple,
if (maniptype == NF_NAT_MANIP_SRC &&
!(range->flags & NF_NAT_RANGE_PROTO_RANDOM_ALL)) {
/* try the original tuple first */
- if (in_range(orig_tuple, range)) {
+ if (nf_in_range(orig_tuple, range)) {
if (!nf_nat_used_tuple(orig_tuple, ct)) {
*tuple = *orig_tuple;
return;
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 0a3f7a70a50a..7eccd97e0609 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -197,7 +197,7 @@ static inline int less(u16 left, u16 right)
return less_eq(left, right) && (mod(right) != mod(left));
}

-static inline int in_range(u16 val, u16 min, u16 max)
+static inline int tipc_in_range(u16 val, u16 min, u16 max)
{
return !less(val, min) && !more(val, max);
}
diff --git a/net/tipc/link.c b/net/tipc/link.c
index 2eff1c7949cb..e33b4f29f77c 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -1623,7 +1623,7 @@ static int tipc_link_advance_transmq(struct tipc_link *l, struct tipc_link *r,
last_ga->bgack_cnt);
}
/* Check against the last Gap ACK block */
- if (in_range(seqno, start, end))
+ if (tipc_in_range(seqno, start, end))
continue;
/* Update/release the packet peer is acking */
bc_has_acked = true;
@@ -2251,12 +2251,12 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
strncpy(if_name, data, TIPC_MAX_IF_NAME);

/* Update own tolerance if peer indicates a non-zero value */
- if (in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL)) {
+ if (tipc_in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL)) {
l->tolerance = peers_tol;
l->bc_rcvlink->tolerance = peers_tol;
}
/* Update own priority if peer's priority is higher */
- if (in_range(peers_prio, l->priority + 1, TIPC_MAX_LINK_PRI))
+ if (tipc_in_range(peers_prio, l->priority + 1, TIPC_MAX_LINK_PRI))
l->priority = peers_prio;

/* If peer is going down we want full re-establish cycle */
@@ -2299,13 +2299,13 @@ static int tipc_link_proto_rcv(struct tipc_link *l, struct sk_buff *skb,
l->rcv_nxt_state = msg_seqno(hdr) + 1;

/* Update own tolerance if peer indicates a non-zero value */
- if (in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL)) {
+ if (tipc_in_range(peers_tol, TIPC_MIN_LINK_TOL, TIPC_MAX_LINK_TOL)) {
l->tolerance = peers_tol;
l->bc_rcvlink->tolerance = peers_tol;
}
/* Update own prio if peer indicates a different value */
if ((peers_prio != l->priority) &&
- in_range(peers_prio, 1, TIPC_MAX_LINK_PRI)) {
+ tipc_in_range(peers_prio, 1, TIPC_MAX_LINK_PRI)) {
l->priority = peers_prio;
rc = tipc_link_fsm_evt(l, LINK_FAILURE_EVT);
}
diff --git a/tools/testing/selftests/bpf/progs/get_branch_snapshot.c b/tools/testing/selftests/bpf/progs/get_branch_snapshot.c
index a1b139888048..511ac634eef0 100644
--- a/tools/testing/selftests/bpf/progs/get_branch_snapshot.c
+++ b/tools/testing/selftests/bpf/progs/get_branch_snapshot.c
@@ -15,7 +15,7 @@ long total_entries = 0;
#define ENTRY_CNT 32
struct perf_branch_entry entries[ENTRY_CNT] = {};

-static inline bool in_range(__u64 val)
+static inline bool gbs_in_range(__u64 val)
{
return (val >= address_low) && (val < address_high);
}
@@ -31,7 +31,7 @@ int BPF_PROG(test1, int n, int ret)
for (i = 0; i < ENTRY_CNT; i++) {
if (i >= total_entries)
break;
- if (in_range(entries[i].from) && in_range(entries[i].to))
+ if (gbs_in_range(entries[i].from) && gbs_in_range(entries[i].to))
test1_hits++;
else if (!test1_hits)
wasted_entries++;
--
2.40.1


2023-08-02 17:28:00

by Matthew Wilcox

Subject: [PATCH v6 24/38] sh: Implement the new page table range API

Add PFN_PTE_SHIFT, update_mmu_cache_range(), flush_dcache_folio() and
flush_icache_pages(). Change the PG_dcache_clean flag from being
per-page to per-folio. Flush the entire folio containing the pages in
flush_icache_pages() for ease of implementation.

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: [email protected]
---
arch/sh/include/asm/cacheflush.h | 21 ++++++++-----
arch/sh/include/asm/pgtable.h | 7 +++--
arch/sh/include/asm/pgtable_32.h | 5 ++-
arch/sh/mm/cache-j2.c | 4 +--
arch/sh/mm/cache-sh4.c | 26 +++++++++++-----
arch/sh/mm/cache-sh7705.c | 26 ++++++++++------
arch/sh/mm/cache.c | 52 ++++++++++++++++++--------------
arch/sh/mm/kmap.c | 3 +-
8 files changed, 89 insertions(+), 55 deletions(-)

diff --git a/arch/sh/include/asm/cacheflush.h b/arch/sh/include/asm/cacheflush.h
index 481a664287e2..9fceef6f3e00 100644
--- a/arch/sh/include/asm/cacheflush.h
+++ b/arch/sh/include/asm/cacheflush.h
@@ -13,9 +13,9 @@
* - flush_cache_page(mm, vmaddr, pfn) flushes a single page
* - flush_cache_range(vma, start, end) flushes a range of pages
*
- * - flush_dcache_page(pg) flushes(wback&invalidates) a page for dcache
+ * - flush_dcache_folio(folio) flushes(wback&invalidates) a folio for dcache
* - flush_icache_range(start, end) flushes(invalidates) a range for icache
- * - flush_icache_page(vma, pg) flushes(invalidates) a page for icache
+ * - flush_icache_pages(vma, pg, nr) flushes(invalidates) pages for icache
* - flush_cache_sigtramp(vaddr) flushes the signal trampoline
*/
extern void (*local_flush_cache_all)(void *args);
@@ -23,9 +23,9 @@ extern void (*local_flush_cache_mm)(void *args);
extern void (*local_flush_cache_dup_mm)(void *args);
extern void (*local_flush_cache_page)(void *args);
extern void (*local_flush_cache_range)(void *args);
-extern void (*local_flush_dcache_page)(void *args);
+extern void (*local_flush_dcache_folio)(void *args);
extern void (*local_flush_icache_range)(void *args);
-extern void (*local_flush_icache_page)(void *args);
+extern void (*local_flush_icache_folio)(void *args);
extern void (*local_flush_cache_sigtramp)(void *args);

static inline void cache_noop(void *args) { }
@@ -42,11 +42,18 @@ extern void flush_cache_page(struct vm_area_struct *vma,
extern void flush_cache_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end);
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
-void flush_dcache_page(struct page *page);
+void flush_dcache_folio(struct folio *folio);
+#define flush_dcache_folio flush_dcache_folio
+static inline void flush_dcache_page(struct page *page)
+{
+ flush_dcache_folio(page_folio(page));
+}
+
extern void flush_icache_range(unsigned long start, unsigned long end);
#define flush_icache_user_range flush_icache_range
-extern void flush_icache_page(struct vm_area_struct *vma,
- struct page *page);
+void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
+ unsigned int nr);
+#define flush_icache_page(vma, page) flush_icache_pages(vma, page, 1)
extern void flush_cache_sigtramp(unsigned long address);

struct flusher_data {
diff --git a/arch/sh/include/asm/pgtable.h b/arch/sh/include/asm/pgtable.h
index 3ce30becf6df..729f5c6225fb 100644
--- a/arch/sh/include/asm/pgtable.h
+++ b/arch/sh/include/asm/pgtable.h
@@ -102,13 +102,16 @@ extern void __update_cache(struct vm_area_struct *vma,
extern void __update_tlb(struct vm_area_struct *vma,
unsigned long address, pte_t pte);

-static inline void
-update_mmu_cache(struct vm_area_struct *vma, unsigned long address, pte_t *ptep)
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
{
pte_t pte = *ptep;
__update_cache(vma, address, pte);
__update_tlb(vma, address, pte);
}
+#define update_mmu_cache(vma, addr, ptep) \
+ update_mmu_cache_range(NULL, vma, addr, ptep, 1)

extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
extern void paging_init(void);
diff --git a/arch/sh/include/asm/pgtable_32.h b/arch/sh/include/asm/pgtable_32.h
index 21952b094650..676f3d4ef6ce 100644
--- a/arch/sh/include/asm/pgtable_32.h
+++ b/arch/sh/include/asm/pgtable_32.h
@@ -307,14 +307,13 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
#define set_pte(pteptr, pteval) (*(pteptr) = pteval)
#endif

-#define set_pte_at(mm,addr,ptep,pteval) set_pte(ptep,pteval)
-
/*
* (pmds are folded into pgds so this doesn't get actually called,
* but the define is needed for a generic inline function.)
*/
#define set_pmd(pmdptr, pmdval) (*(pmdptr) = pmdval)

+#define PFN_PTE_SHIFT PAGE_SHIFT
#define pfn_pte(pfn, prot) \
__pte(((unsigned long long)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
#define pfn_pmd(pfn, prot) \
@@ -323,7 +322,7 @@ static inline void set_pte(pte_t *ptep, pte_t pte)
#define pte_none(x) (!pte_val(x))
#define pte_present(x) ((x).pte_low & (_PAGE_PRESENT | _PAGE_PROTNONE))

-#define pte_clear(mm,addr,xp) do { set_pte_at(mm, addr, xp, __pte(0)); } while (0)
+#define pte_clear(mm, addr, ptep) set_pte(ptep, __pte(0))

#define pmd_none(x) (!pmd_val(x))
#define pmd_present(x) (pmd_val(x))
diff --git a/arch/sh/mm/cache-j2.c b/arch/sh/mm/cache-j2.c
index f277862a11f5..9ac960214380 100644
--- a/arch/sh/mm/cache-j2.c
+++ b/arch/sh/mm/cache-j2.c
@@ -55,9 +55,9 @@ void __init j2_cache_init(void)
local_flush_cache_dup_mm = j2_flush_both;
local_flush_cache_page = j2_flush_both;
local_flush_cache_range = j2_flush_both;
- local_flush_dcache_page = j2_flush_dcache;
+ local_flush_dcache_folio = j2_flush_dcache;
local_flush_icache_range = j2_flush_icache;
- local_flush_icache_page = j2_flush_icache;
+ local_flush_icache_folio = j2_flush_icache;
local_flush_cache_sigtramp = j2_flush_icache;

pr_info("Initial J2 CCR is %.8x\n", __raw_readl(j2_ccr_base));
diff --git a/arch/sh/mm/cache-sh4.c b/arch/sh/mm/cache-sh4.c
index 72c2e1b46c08..862046f26981 100644
--- a/arch/sh/mm/cache-sh4.c
+++ b/arch/sh/mm/cache-sh4.c
@@ -107,19 +107,29 @@ static inline void flush_cache_one(unsigned long start, unsigned long phys)
* Write back & invalidate the D-cache of the page.
* (To avoid "alias" issues)
*/
-static void sh4_flush_dcache_page(void *arg)
+static void sh4_flush_dcache_folio(void *arg)
{
- struct page *page = arg;
- unsigned long addr = (unsigned long)page_address(page);
+ struct folio *folio = arg;
#ifndef CONFIG_SMP
- struct address_space *mapping = page_mapping_file(page);
+ struct address_space *mapping = folio_flush_mapping(folio);

if (mapping && !mapping_mapped(mapping))
- clear_bit(PG_dcache_clean, &page->flags);
+ clear_bit(PG_dcache_clean, &folio->flags);
else
#endif
- flush_cache_one(CACHE_OC_ADDRESS_ARRAY |
- (addr & shm_align_mask), page_to_phys(page));
+ {
+ unsigned long pfn = folio_pfn(folio);
+ unsigned long addr = (unsigned long)folio_address(folio);
+ unsigned int i, nr = folio_nr_pages(folio);
+
+ for (i = 0; i < nr; i++) {
+ flush_cache_one(CACHE_OC_ADDRESS_ARRAY |
+ (addr & shm_align_mask),
+ pfn * PAGE_SIZE);
+ addr += PAGE_SIZE;
+ pfn++;
+ }
+ }

wmb();
}
@@ -379,7 +389,7 @@ void __init sh4_cache_init(void)
__raw_readl(CCN_PRR));

local_flush_icache_range = sh4_flush_icache_range;
- local_flush_dcache_page = sh4_flush_dcache_page;
+ local_flush_dcache_folio = sh4_flush_dcache_folio;
local_flush_cache_all = sh4_flush_cache_all;
local_flush_cache_mm = sh4_flush_cache_mm;
local_flush_cache_dup_mm = sh4_flush_cache_mm;
diff --git a/arch/sh/mm/cache-sh7705.c b/arch/sh/mm/cache-sh7705.c
index 9b63a53a5e46..b509a407588f 100644
--- a/arch/sh/mm/cache-sh7705.c
+++ b/arch/sh/mm/cache-sh7705.c
@@ -132,15 +132,20 @@ static void __flush_dcache_page(unsigned long phys)
* Write back & invalidate the D-cache of the page.
* (To avoid "alias" issues)
*/
-static void sh7705_flush_dcache_page(void *arg)
+static void sh7705_flush_dcache_folio(void *arg)
{
- struct page *page = arg;
- struct address_space *mapping = page_mapping_file(page);
+ struct folio *folio = arg;
+ struct address_space *mapping = folio_flush_mapping(folio);

if (mapping && !mapping_mapped(mapping))
- clear_bit(PG_dcache_clean, &page->flags);
- else
- __flush_dcache_page(__pa(page_address(page)));
+ clear_bit(PG_dcache_clean, &folio->flags);
+ else {
+ unsigned long pfn = folio_pfn(folio);
+ unsigned int i, nr = folio_nr_pages(folio);
+
+ for (i = 0; i < nr; i++)
+ __flush_dcache_page((pfn + i) * PAGE_SIZE);
+ }
}

static void sh7705_flush_cache_all(void *args)
@@ -176,19 +181,20 @@ static void sh7705_flush_cache_page(void *args)
* Not entirely sure why this is necessary on SH3 with 32K cache but
* without it we get occasional "Memory fault" when loading a program.
*/
-static void sh7705_flush_icache_page(void *page)
+static void sh7705_flush_icache_folio(void *arg)
{
- __flush_purge_region(page_address(page), PAGE_SIZE);
+ struct folio *folio = arg;
+ __flush_purge_region(folio_address(folio), folio_size(folio));
}

void __init sh7705_cache_init(void)
{
local_flush_icache_range = sh7705_flush_icache_range;
- local_flush_dcache_page = sh7705_flush_dcache_page;
+ local_flush_dcache_folio = sh7705_flush_dcache_folio;
local_flush_cache_all = sh7705_flush_cache_all;
local_flush_cache_mm = sh7705_flush_cache_all;
local_flush_cache_dup_mm = sh7705_flush_cache_all;
local_flush_cache_range = sh7705_flush_cache_all;
local_flush_cache_page = sh7705_flush_cache_page;
- local_flush_icache_page = sh7705_flush_icache_page;
+ local_flush_icache_folio = sh7705_flush_icache_folio;
}
diff --git a/arch/sh/mm/cache.c b/arch/sh/mm/cache.c
index 3aef78ceb820..9bcaa5619eab 100644
--- a/arch/sh/mm/cache.c
+++ b/arch/sh/mm/cache.c
@@ -20,9 +20,9 @@ void (*local_flush_cache_mm)(void *args) = cache_noop;
void (*local_flush_cache_dup_mm)(void *args) = cache_noop;
void (*local_flush_cache_page)(void *args) = cache_noop;
void (*local_flush_cache_range)(void *args) = cache_noop;
-void (*local_flush_dcache_page)(void *args) = cache_noop;
+void (*local_flush_dcache_folio)(void *args) = cache_noop;
void (*local_flush_icache_range)(void *args) = cache_noop;
-void (*local_flush_icache_page)(void *args) = cache_noop;
+void (*local_flush_icache_folio)(void *args) = cache_noop;
void (*local_flush_cache_sigtramp)(void *args) = cache_noop;

void (*__flush_wback_region)(void *start, int size);
@@ -61,15 +61,17 @@ void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long vaddr, void *dst, const void *src,
unsigned long len)
{
- if (boot_cpu_data.dcache.n_aliases && page_mapcount(page) &&
- test_bit(PG_dcache_clean, &page->flags)) {
+ struct folio *folio = page_folio(page);
+
+ if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
+ test_bit(PG_dcache_clean, &folio->flags)) {
void *vto = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(vto, src, len);
kunmap_coherent(vto);
} else {
memcpy(dst, src, len);
if (boot_cpu_data.dcache.n_aliases)
- clear_bit(PG_dcache_clean, &page->flags);
+ clear_bit(PG_dcache_clean, &folio->flags);
}

if (vma->vm_flags & VM_EXEC)
@@ -80,27 +82,30 @@ void copy_from_user_page(struct vm_area_struct *vma, struct page *page,
unsigned long vaddr, void *dst, const void *src,
unsigned long len)
{
+ struct folio *folio = page_folio(page);
+
if (boot_cpu_data.dcache.n_aliases && page_mapcount(page) &&
- test_bit(PG_dcache_clean, &page->flags)) {
+ test_bit(PG_dcache_clean, &folio->flags)) {
void *vfrom = kmap_coherent(page, vaddr) + (vaddr & ~PAGE_MASK);
memcpy(dst, vfrom, len);
kunmap_coherent(vfrom);
} else {
memcpy(dst, src, len);
if (boot_cpu_data.dcache.n_aliases)
- clear_bit(PG_dcache_clean, &page->flags);
+ clear_bit(PG_dcache_clean, &folio->flags);
}
}

void copy_user_highpage(struct page *to, struct page *from,
unsigned long vaddr, struct vm_area_struct *vma)
{
+ struct folio *src = page_folio(from);
void *vfrom, *vto;

vto = kmap_atomic(to);

- if (boot_cpu_data.dcache.n_aliases && page_mapcount(from) &&
- test_bit(PG_dcache_clean, &from->flags)) {
+ if (boot_cpu_data.dcache.n_aliases && folio_mapped(src) &&
+ test_bit(PG_dcache_clean, &src->flags)) {
vfrom = kmap_coherent(from, vaddr);
copy_page(vto, vfrom);
kunmap_coherent(vfrom);
@@ -136,27 +141,28 @@ EXPORT_SYMBOL(clear_user_highpage);
void __update_cache(struct vm_area_struct *vma,
unsigned long address, pte_t pte)
{
- struct page *page;
unsigned long pfn = pte_pfn(pte);

if (!boot_cpu_data.dcache.n_aliases)
return;

- page = pfn_to_page(pfn);
if (pfn_valid(pfn)) {
- int dirty = !test_and_set_bit(PG_dcache_clean, &page->flags);
+ struct folio *folio = page_folio(pfn_to_page(pfn));
+ int dirty = !test_and_set_bit(PG_dcache_clean, &folio->flags);
if (dirty)
- __flush_purge_region(page_address(page), PAGE_SIZE);
+ __flush_purge_region(folio_address(folio),
+ folio_size(folio));
}
}

void __flush_anon_page(struct page *page, unsigned long vmaddr)
{
+ struct folio *folio = page_folio(page);
unsigned long addr = (unsigned long) page_address(page);

if (pages_do_alias(addr, vmaddr)) {
- if (boot_cpu_data.dcache.n_aliases && page_mapcount(page) &&
- test_bit(PG_dcache_clean, &page->flags)) {
+ if (boot_cpu_data.dcache.n_aliases && folio_mapped(folio) &&
+ test_bit(PG_dcache_clean, &folio->flags)) {
void *kaddr;

kaddr = kmap_coherent(page, vmaddr);
@@ -164,7 +170,8 @@ void __flush_anon_page(struct page *page, unsigned long vmaddr)
/* __flush_purge_region((void *)kaddr, PAGE_SIZE); */
kunmap_coherent(kaddr);
} else
- __flush_purge_region((void *)addr, PAGE_SIZE);
+ __flush_purge_region(folio_address(folio),
+ folio_size(folio));
}
}

@@ -215,11 +222,11 @@ void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
}
EXPORT_SYMBOL(flush_cache_range);

-void flush_dcache_page(struct page *page)
+void flush_dcache_folio(struct folio *folio)
{
- cacheop_on_each_cpu(local_flush_dcache_page, page, 1);
+ cacheop_on_each_cpu(local_flush_dcache_folio, folio, 1);
}
-EXPORT_SYMBOL(flush_dcache_page);
+EXPORT_SYMBOL(flush_dcache_folio);

void flush_icache_range(unsigned long start, unsigned long end)
{
@@ -233,10 +240,11 @@ void flush_icache_range(unsigned long start, unsigned long end)
}
EXPORT_SYMBOL(flush_icache_range);

-void flush_icache_page(struct vm_area_struct *vma, struct page *page)
+void flush_icache_pages(struct vm_area_struct *vma, struct page *page,
+ unsigned int nr)
{
- /* Nothing uses the VMA, so just pass the struct page along */
- cacheop_on_each_cpu(local_flush_icache_page, page, 1);
+ /* Nothing uses the VMA, so just pass the folio along */
+ cacheop_on_each_cpu(local_flush_icache_folio, page_folio(page), 1);
}

void flush_cache_sigtramp(unsigned long address)
diff --git a/arch/sh/mm/kmap.c b/arch/sh/mm/kmap.c
index 73fd7cc99430..fa50e8f6e7a9 100644
--- a/arch/sh/mm/kmap.c
+++ b/arch/sh/mm/kmap.c
@@ -27,10 +27,11 @@ void __init kmap_coherent_init(void)

void *kmap_coherent(struct page *page, unsigned long addr)
{
+ struct folio *folio = page_folio(page);
enum fixed_addresses idx;
unsigned long vaddr;

- BUG_ON(!test_bit(PG_dcache_clean, &page->flags));
+ BUG_ON(!test_bit(PG_dcache_clean, &folio->flags));

preempt_disable();
pagefault_disable();
--
2.40.1


2023-08-02 17:30:03

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH v6 35/38] rmap: add folio_add_file_rmap_range()

From: Yin Fengwei <[email protected]>

folio_add_file_rmap_range() allows adding a pte mapping to a specific
range of a file folio. Compared to page_add_file_rmap(), it batches the
__lruvec_stat updates for large folios.
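
A minimal caller sketch, based only on the prototype added below; the
wrapper name map_folio_range() and its call site are illustrative, not
part of this patch:

	/*
	 * Hypothetical helper: map nr pages of a file folio with one rmap
	 * update instead of nr separate page_add_file_rmap() calls.
	 */
	static void map_folio_range(struct vm_area_struct *vma,
			struct folio *folio, struct page *page,
			unsigned int nr, unsigned long addr, pte_t *ptep)
	{
		/* Batched rmap and stat update for the whole range. */
		folio_add_file_rmap_range(folio, page, nr, vma, false);
		/* Then install nr contiguous PTEs covering the same range. */
		set_ptes(vma->vm_mm, addr, ptep,
			 mk_pte(page, vma->vm_page_prot), nr);
	}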

Signed-off-by: Yin Fengwei <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
---
include/linux/rmap.h | 2 ++
mm/rmap.c | 60 +++++++++++++++++++++++++++++++++-----------
2 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index f578975c12c0..d442d1e5425d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -198,6 +198,8 @@ void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *,
unsigned long address);
void page_add_file_rmap(struct page *, struct vm_area_struct *,
bool compound);
+void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr,
+ struct vm_area_struct *, bool compound);
void page_remove_rmap(struct page *, struct vm_area_struct *,
bool compound);
void folio_remove_rmap_range(struct folio *folio, struct page *page,
diff --git a/mm/rmap.c b/mm/rmap.c
index 54124f18e0e4..d82d52ebf3a6 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1294,31 +1294,39 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
}

/**
- * page_add_file_rmap - add pte mapping to a file page
- * @page: the page to add the mapping to
+ * folio_add_file_rmap_range - add pte mapping to page range of a folio
+ * @folio: The folio to add the mapping to
+ * @page: The first page to add
+ * @nr_pages: The number of pages which will be mapped
* @vma: the vm area in which the mapping is added
* @compound: charge the page as compound or small page
*
+ * The page range of folio is defined by [first_page, first_page + nr_pages)
+ *
* The caller needs to hold the pte lock.
*/
-void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
- bool compound)
+void folio_add_file_rmap_range(struct folio *folio, struct page *page,
+ unsigned int nr_pages, struct vm_area_struct *vma,
+ bool compound)
{
- struct folio *folio = page_folio(page);
atomic_t *mapped = &folio->_nr_pages_mapped;
- int nr = 0, nr_pmdmapped = 0;
- bool first;
+ unsigned int nr_pmdmapped = 0, first;
+ int nr = 0;

- VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
+ VM_WARN_ON_FOLIO(compound && !folio_test_pmd_mappable(folio), folio);

/* Is page being mapped by PTE? Is this its first map to be added? */
if (likely(!compound)) {
- first = atomic_inc_and_test(&page->_mapcount);
- nr = first;
- if (first && folio_test_large(folio)) {
- nr = atomic_inc_return_relaxed(mapped);
- nr = (nr < COMPOUND_MAPPED);
- }
+ do {
+ first = atomic_inc_and_test(&page->_mapcount);
+ if (first && folio_test_large(folio)) {
+ first = atomic_inc_return_relaxed(mapped);
+ first = (first < COMPOUND_MAPPED);
+ }
+
+ if (first)
+ nr++;
+ } while (page++, --nr_pages > 0);
} else if (folio_test_pmd_mappable(folio)) {
/* That test is redundant: it's for safety or to optimize out */

@@ -1347,6 +1355,30 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
mlock_vma_folio(folio, vma, compound);
}

+/**
+ * page_add_file_rmap - add pte mapping to a file page
+ * @page: the page to add the mapping to
+ * @vma: the vm area in which the mapping is added
+ * @compound: charge the page as compound or small page
+ *
+ * The caller needs to hold the pte lock.
+ */
+void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
+ bool compound)
+{
+ struct folio *folio = page_folio(page);
+ unsigned int nr_pages;
+
+ VM_WARN_ON_ONCE_PAGE(compound && !PageTransHuge(page), page);
+
+ if (likely(!compound))
+ nr_pages = 1;
+ else
+ nr_pages = folio_nr_pages(folio);
+
+ folio_add_file_rmap_range(folio, page, nr_pages, vma, compound);
+}
+
/**
* __remove_rmap_finish - common operations when taking down a mapping.
* @folio: Folio containing all pages taken down.
--
2.40.1


2023-08-02 17:31:49

by Matthew Wilcox

[permalink] [raw]
Subject: [PATCH v6 19/38] openrisc: Implement the new page table range API

Add PFN_PTE_SHIFT, update_mmu_cache_range() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page
to per-folio.
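
PFN_PTE_SHIFT tells the generic code where the PFN sits inside a PTE, so
the range helpers can step to the next page without an arch-specific
hook. A rough sketch of that stepping loop (simplified; the name
set_ptes_sketch is made up for illustration, and the real helper also
handles lazy MMU mode and page table checks):

	static inline void set_ptes_sketch(struct mm_struct *mm,
			unsigned long addr, pte_t *ptep, pte_t pte,
			unsigned int nr)
	{
		for (;;) {
			set_pte(ptep, pte);
			if (--nr == 0)
				break;
			ptep++;
			/* Advance the PFN field by one page. */
			pte = __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
		}
	}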

Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: [email protected]
---
arch/openrisc/include/asm/cacheflush.h | 8 +++++++-
arch/openrisc/include/asm/pgtable.h | 15 ++++++++++-----
arch/openrisc/mm/cache.c | 12 ++++++++----
3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/arch/openrisc/include/asm/cacheflush.h b/arch/openrisc/include/asm/cacheflush.h
index eeac40d4a854..984c331ff5f4 100644
--- a/arch/openrisc/include/asm/cacheflush.h
+++ b/arch/openrisc/include/asm/cacheflush.h
@@ -56,10 +56,16 @@ static inline void sync_icache_dcache(struct page *page)
*/
#define PG_dc_clean PG_arch_1

+static inline void flush_dcache_folio(struct folio *folio)
+{
+ clear_bit(PG_dc_clean, &folio->flags);
+}
+#define flush_dcache_folio flush_dcache_folio
+
#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
static inline void flush_dcache_page(struct page *page)
{
- clear_bit(PG_dc_clean, &page->flags);
+ flush_dcache_folio(page_folio(page));
}

#define flush_icache_user_page(vma, page, addr, len) \
diff --git a/arch/openrisc/include/asm/pgtable.h b/arch/openrisc/include/asm/pgtable.h
index 3eb9b9555d0d..7bdf1bb0d177 100644
--- a/arch/openrisc/include/asm/pgtable.h
+++ b/arch/openrisc/include/asm/pgtable.h
@@ -46,7 +46,7 @@ extern void paging_init(void);
* hook is made available.
*/
#define set_pte(pteptr, pteval) ((*(pteptr)) = (pteval))
-#define set_pte_at(mm, addr, ptep, pteval) set_pte(ptep, pteval)
+
/*
* (pmds are folded into pgds so this doesn't get actually called,
* but the define is needed for a generic inline function.)
@@ -357,6 +357,7 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
#define __pmd_offset(address) \
(((address) >> PMD_SHIFT) & (PTRS_PER_PMD-1))

+#define PFN_PTE_SHIFT PAGE_SHIFT
#define pte_pfn(x) ((unsigned long)(((x).pte)) >> PAGE_SHIFT)
#define pfn_pte(pfn, prot) __pte((((pfn) << PAGE_SHIFT)) | pgprot_val(prot))

@@ -379,13 +380,17 @@ static inline void update_tlb(struct vm_area_struct *vma,
extern void update_cache(struct vm_area_struct *vma,
unsigned long address, pte_t *pte);

-static inline void update_mmu_cache(struct vm_area_struct *vma,
- unsigned long address, pte_t *pte)
+static inline void update_mmu_cache_range(struct vm_fault *vmf,
+ struct vm_area_struct *vma, unsigned long address,
+ pte_t *ptep, unsigned int nr)
{
- update_tlb(vma, address, pte);
- update_cache(vma, address, pte);
+ update_tlb(vma, address, ptep);
+ update_cache(vma, address, ptep);
}

+#define update_mmu_cache(vma, addr, ptep) \
+ update_mmu_cache_range(NULL, vma, addr, ptep, 1)
+
/* __PHX__ FIXME, SWAP, this probably doesn't work */

/*
diff --git a/arch/openrisc/mm/cache.c b/arch/openrisc/mm/cache.c
index 534a52ec5e66..eb43b73f3855 100644
--- a/arch/openrisc/mm/cache.c
+++ b/arch/openrisc/mm/cache.c
@@ -43,15 +43,19 @@ void update_cache(struct vm_area_struct *vma, unsigned long address,
pte_t *pte)
{
unsigned long pfn = pte_val(*pte) >> PAGE_SHIFT;
- struct page *page = pfn_to_page(pfn);
- int dirty = !test_and_set_bit(PG_dc_clean, &page->flags);
+ struct folio *folio = page_folio(pfn_to_page(pfn));
+ int dirty = !test_and_set_bit(PG_dc_clean, &folio->flags);

/*
* Since icaches do not snoop for updated data on OpenRISC, we
* must write back and invalidate any dirty pages manually. We
* can skip data pages, since they will not end up in icaches.
*/
- if ((vma->vm_flags & VM_EXEC) && dirty)
- sync_icache_dcache(page);
+ if ((vma->vm_flags & VM_EXEC) && dirty) {
+ unsigned int nr = folio_nr_pages(folio);
+
+ while (nr--)
+ sync_icache_dcache(folio_page(folio, nr));
+ }
}

--
2.40.1


2023-08-02 19:13:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v6 00/38] New page table range API

On Wed, 2 Aug 2023 16:13:28 +0100 "Matthew Wilcox (Oracle)" <[email protected]> wrote:

> For some reason, I didn't get an email when v5 was dropped from linux-next
> (I got an email from Andrew saying he was going to, but when I didn't get
> the automated emails, I assumed he'd changed his mind). So here's v6,
> far later than I would have resent it.

Sorry - I try to reduce the amount of email I spray out by reducing it to
a single email.

2023-08-03 14:02:15

by Nguyen Dinh Phi

[permalink] [raw]
Subject: Re: [PATCH v6 01/38] minmax: Add in_range() macro

On 8/2/2023 11:13 PM, Matthew Wilcox (Oracle) wrote:
> diff --git a/include/linux/minmax.h b/include/linux/minmax.h
> index 798c6963909f..83aebc244cba 100644
> --- a/include/linux/minmax.h
> +++ b/include/linux/minmax.h
> @@ -3,6 +3,7 @@
> #define _LINUX_MINMAX_H
>
> #include <linux/const.h>
> +#include <linux/types.h>
>
> /*
> * min()/max()/clamp() macros must accomplish three things:
> @@ -222,6 +223,32 @@
> */
> #define clamp_val(val, lo, hi) clamp_t(typeof(val), val, lo, hi)
>
> +static inline bool in_range64(u64 val, u64 start, u64 len)
> +{
> + return (val - start) < len;
> +}
> +
> +static inline bool in_range32(u32 val, u32 start, u32 len)
> +{
> + return (val - start) < len;
> +}
> +

I think these two functions return the wrong result if val is smaller than
start and len is big enough.

2023-08-03 16:23:43

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v6 01/38] minmax: Add in_range() macro

On Thu, Aug 03, 2023 at 09:00:35PM +0800, Phi Nguyen wrote:
> On 8/2/2023 11:13 PM, Matthew Wilcox (Oracle) wrote:
> > +static inline bool in_range64(u64 val, u64 start, u64 len)
> > +{
> > + return (val - start) < len;
> > +}
> > +
> > +static inline bool in_range32(u32 val, u32 start, u32 len)
> > +{
> > + return (val - start) < len;
> > +}
> > +
>
> I think these two functions return the wrong result if val is smaller than
> start and len is big enough.

How is it that you stopped reading at exactly the point where I explained
that this is intentional?

+/**
+ * in_range - Determine if a value lies within a range.
+ * @val: Value to test.
+ * @start: First value in range.
+ * @len: Number of values in range.
+ *
+ * This is more efficient than "if (start <= val && val < (start + len))".
+ * It also gives a different answer if @start + @len overflows the size of
+ * the type by a sufficient amount to encompass @val. Decide for yourself
+ * which behaviour you want, or prove that start + len never overflow.
+ * Do not blindly replace one form with the other.
+ */
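
To make that concrete, a standalone userspace sketch (not part of the
patch) where the two forms disagree:

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	static bool in_range32(uint32_t val, uint32_t start, uint32_t len)
	{
		return (val - start) < len;	/* unsigned subtraction wraps */
	}

	int main(void)
	{
		uint32_t val = 5, start = 0xfffffff0u, len = 0x20u;

		/* start + len wraps to 0x10, so the naive form rejects val. */
		printf("naive:      %d\n", start <= val && val < start + len);
		/* The wrapping form counts val as inside the wrapped range. */
		printf("in_range32: %d\n", in_range32(val, start, len));
		return 0;
	}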


2023-08-03 21:10:46

by Nguyen Dinh Phi

[permalink] [raw]
Subject: Re: [PATCH v6 01/38] minmax: Add in_range() macro

On 8/3/2023 9:22 PM, Matthew Wilcox wrote:
> On Thu, Aug 03, 2023 at 09:00:35PM +0800, Phi Nguyen wrote:
>> On 8/2/2023 11:13 PM, Matthew Wilcox (Oracle) wrote:
>>> +static inline bool in_range64(u64 val, u64 start, u64 len)
>>> +{
>>> + return (val - start) < len;
>>> +}
>>> +
>>> +static inline bool in_range32(u32 val, u32 start, u32 len)
>>> +{
>>> + return (val - start) < len;
>>> +}
>>> +
>>
>> I think these two functions return the wrong result if val is smaller than
>> start and len is big enough.
>
> How is it that you stopped reading at exactly the point where I explained
> that this is intentional?
>
> +/**
> + * in_range - Determine if a value lies within a range.
> + * @val: Value to test.
> + * @start: First value in range.
> + * @len: Number of values in range.
> + *
> + * This is more efficient than "if (start <= val && val < (start + len))".
> + * It also gives a different answer if @start + @len overflows the size of
> + * the type by a sufficient amount to encompass @val. Decide for yourself
> + * which behaviour you want, or prove that start + len never overflow.
> + * Do not blindly replace one form with the other.
> + */
>

Oh, sorry, I see, my bad.

2023-08-04 01:45:00

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v6 21/38] powerpc: Implement the new page table range API

Hi Matthew,

On Wed, Aug 02, 2023 at 04:13:49PM +0100, Matthew Wilcox (Oracle) wrote:
> Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().
> Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to
> per-folio.
>
> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
> Acked-by: Mike Rapoport (IBM) <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Nicholas Piggin <[email protected]>
> Cc: Christophe Leroy <[email protected]>
> Cc: [email protected]
...
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> index d16d80ad2ae4..b4da8514af43 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -894,7 +894,7 @@ void kvmppc_init_lpid(unsigned long nr_lpids);
>
> static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
> {
> - struct page *page;
> + struct folio *folio;
> /*
> * We can only access pages that the kernel maps
> * as memory. Bail out for unmapped ones.
> @@ -903,10 +903,10 @@ static inline void kvmppc_mmu_flush_icache(kvm_pfn_t pfn)
> return;
>
> /* Clear i-cache for new pages */
> - page = pfn_to_page(pfn);
> - if (!test_bit(PG_dcache_clean, &page->flags)) {
> - flush_dcache_icache_page(page);
> - set_bit(PG_dcache_clean, &page->flags);
> + folio = page_folio(pfn_to_page(pfn));
> + if (!test_bit(PG_dcache_clean, &folio->flags)) {
> + flush_dcache_icache_folio(folio);
> + set_bit(PG_dcache_clean, &folio->flags);
> }
> }
...
> diff --git a/arch/powerpc/mm/cacheflush.c b/arch/powerpc/mm/cacheflush.c
> index 0e9b4879c0f9..8760d2223abe 100644
> --- a/arch/powerpc/mm/cacheflush.c
> +++ b/arch/powerpc/mm/cacheflush.c
> @@ -148,44 +148,30 @@ static void __flush_dcache_icache(void *p)
> invalidate_icache_range(addr, addr + PAGE_SIZE);
> }
>
> -static void flush_dcache_icache_hugepage(struct page *page)
> +void flush_dcache_icache_folio(struct folio *folio)
> {
> - int i;
> - int nr = compound_nr(page);
> + unsigned int i, nr = folio_nr_pages(folio);
>
> - if (!PageHighMem(page)) {
> + if (flush_coherent_icache())
> + return;
> +
> + if (!folio_test_highmem(folio)) {
> + void *addr = folio_address(folio);
> for (i = 0; i < nr; i++)
> - __flush_dcache_icache(lowmem_page_address(page + i));
> - } else {
> + __flush_dcache_icache(addr + i * PAGE_SIZE);
> + } else if (IS_ENABLED(CONFIG_BOOKE) || sizeof(phys_addr_t) > sizeof(void *)) {
> for (i = 0; i < nr; i++) {
> - void *start = kmap_local_page(page + i);
> + void *start = kmap_local_folio(folio, i * PAGE_SIZE);
>
> __flush_dcache_icache(start);
> kunmap_local(start);
> }
> - }
> -}
> -
> -void flush_dcache_icache_page(struct page *page)
> -{
> - if (flush_coherent_icache())
> - return;
> -
> - if (PageCompound(page))
> - return flush_dcache_icache_hugepage(page);
> -
> - if (!PageHighMem(page)) {
> - __flush_dcache_icache(lowmem_page_address(page));
> - } else if (IS_ENABLED(CONFIG_BOOKE) || sizeof(phys_addr_t) > sizeof(void *)) {
> - void *start = kmap_local_page(page);
> -
> - __flush_dcache_icache(start);
> - kunmap_local(start);
> } else {
> - flush_dcache_icache_phys(page_to_phys(page));
> + unsigned long pfn = folio_pfn(folio);
> + for (i = 0; i < nr; i++)
> + flush_dcache_icache_phys((pfn + i) * PAGE_SIZE);
> }
> }
> -EXPORT_SYMBOL(flush_dcache_icache_page);

Apologies if this has already been fixed or reported; I searched lore
and did not find anything. The dropping of this export in combination
with the conversion above appears to cause ARCH=powerpc allmodconfig to
fail with:

ERROR: modpost: "flush_dcache_icache_folio" [arch/powerpc/kvm/kvm-pr.ko] undefined!

I don't know if this should be re-exported or not but that does
obviously resolve the issue.

Cheers,
Nathan

2023-08-04 04:19:18

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH v6 21/38] powerpc: Implement the new page table range API

On Thu, Aug 03, 2023 at 04:38:14PM -0700, Nathan Chancellor wrote:
> > -EXPORT_SYMBOL(flush_dcache_icache_page);
>
> Apologies if this has already been fixed or reported, I searched lore
> and did not find anything. The dropping of this export in combination
> with the conversion above appears to cause ARCH=powerpc allmodconfig to
> fail with:
>
> ERROR: modpost: "flush_dcache_icache_folio" [arch/powerpc/kvm/kvm-pr.ko] undefined!
>
> I don't know if this should be re-exported or not but that does
> obviously resolve the issue.

Well, that was clumsy of me. I (and the Intel build bot) did test several
build combos, but clearly didn't manage to find a config that showed
this problem. Andrew, a fix patch for you to integrate, if you would:

diff --git a/arch/powerpc/mm/cacheflush.c b/arch/powerpc/mm/cacheflush.c
index 8760d2223abe..15189592da09 100644
--- a/arch/powerpc/mm/cacheflush.c
+++ b/arch/powerpc/mm/cacheflush.c
@@ -172,6 +172,7 @@ void flush_dcache_icache_folio(struct folio *folio)
flush_dcache_icache_phys((pfn + i) * PAGE_SIZE);
}
}
+EXPORT_SYMBOL(flush_dcache_icache_folio);

void clear_user_page(void *page, unsigned long vaddr, struct page *pg)
{