2020-01-07 21:01:53

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 0/8] Allow setting caching mode in arch_add_memory() for P2PDMA

Hi,

The main feedback from v1 was around the interface for arch_add_memory().
Per Dan's suggestions I've renamed the mhp_restrictions structure to
mhp_modifiers and put a pgprot_t field in that structure. I've also
included patch to drop the unused flags field.

Thanks,

Logan

--

Changes in v2:
* Rebased onto v5.5-rc5
* Renamed mhp_restrictions to mhp_modifiers and added the pgprot field
to that structure instead of using an argument for
arch_add_memory().
* Add patch to drop the unused flags field in mhp_restrictions

A git branch is available here:

https://github.com/sbates130272/linux-p2pmem remap_pages_cache_v2

--

Currently, the page tables created using memremap_pages() are always
created with the PAGE_KERNEL cacheing mode. However, the P2PDMA code
is creating pages for PCI BAR memory which should never be accessed
through the cache and instead use either WC or UC. This still works in
most cases, on x86, because the MTRR registers typically override the
caching settings in the page tables for all of the IO memory to be
UC-. However, this tends not to work so well on other arches or
some rare x86 machines that have firmware which does not setup the
MTRR registers in this way.

Instead of this, this series proposes a change to arch_add_memory()
to take the pgprot required by the mapping which allows us to
explicitly set pagetable entries for P2PDMA memory to WC.

This changes is pretty routine for most of the arches: x86_64, s390, arm64
and powerpc simply need to thread the pgprot through to where the page
tables are setup. x86_32 unfortunately sets up the page tables at boot so
must use _set_memory_prot() to change their caching mode. ia64 and sh
don't appear to have an easy way to change the page tables so, for now
at least, we just return -EINVAL on such mappings and thus they will
not support P2PDMA memory until the work for this is done.

--

Logan Gunthorpe (8):
mm/memory_hotplug: Drop the flags field from struct mhp_restrictions
mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers
x86/mm: Thread pgprot_t through init_memory_mapping()
x86/mm: Introduce _set_memory_prot()
powerpc/mm: Thread pgprot_t through create_section_mapping()
s390/mm: Thread pgprot_t through vmem_add_mapping()
mm/memory_hotplug: Add pgprot_t to mhp_modifiers
mm/memremap: Set caching mode for PCI P2PDMA memory to WC

arch/arm64/mm/mmu.c | 7 ++--
arch/ia64/mm/init.c | 6 +++-
arch/powerpc/include/asm/book3s/64/hash.h | 3 +-
arch/powerpc/include/asm/book3s/64/radix.h | 3 +-
arch/powerpc/include/asm/sparsemem.h | 3 +-
arch/powerpc/mm/book3s64/hash_utils.c | 5 +--
arch/powerpc/mm/book3s64/pgtable.c | 7 ++--
arch/powerpc/mm/book3s64/radix_pgtable.c | 18 ++++++----
arch/powerpc/mm/mem.c | 10 +++---
arch/s390/include/asm/pgtable.h | 3 +-
arch/s390/mm/extmem.c | 3 +-
arch/s390/mm/init.c | 8 ++---
arch/s390/mm/vmem.c | 10 +++---
arch/sh/mm/init.c | 7 ++--
arch/x86/include/asm/page_types.h | 3 --
arch/x86/include/asm/pgtable.h | 3 ++
arch/x86/include/asm/set_memory.h | 1 +
arch/x86/kernel/amd_gart_64.c | 3 +-
arch/x86/mm/init.c | 9 ++---
arch/x86/mm/init_32.c | 12 +++++--
arch/x86/mm/init_64.c | 40 ++++++++++++----------
arch/x86/mm/mm_internal.h | 3 +-
arch/x86/mm/pageattr.c | 7 ++++
arch/x86/platform/efi/efi_64.c | 3 +-
include/linux/memory_hotplug.h | 16 ++++-----
mm/memory_hotplug.c | 8 ++---
mm/memremap.c | 17 +++++----
27 files changed, 132 insertions(+), 86 deletions(-)

--
2.20.1


2020-01-07 21:25:50

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 6/8] s390/mm: Thread pgprot_t through vmem_add_mapping()

In prepartion to support a pgprot_t argument for arch_add_memory().

Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/s390/include/asm/pgtable.h | 3 ++-
arch/s390/mm/extmem.c | 3 ++-
arch/s390/mm/init.c | 2 +-
arch/s390/mm/vmem.c | 10 +++++-----
4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 7b03037a8475..e667a1a96879 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1640,7 +1640,8 @@ static inline swp_entry_t __swp_entry(unsigned long type, unsigned long offset)

#define kern_addr_valid(addr) (1)

-extern int vmem_add_mapping(unsigned long start, unsigned long size);
+extern int vmem_add_mapping(unsigned long start, unsigned long size,
+ pgprot_t prot);
extern int vmem_remove_mapping(unsigned long start, unsigned long size);
extern int s390_enable_sie(void);
extern int s390_enable_skey(void);
diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
index fd0dae9d10f4..6cf7029a7b35 100644
--- a/arch/s390/mm/extmem.c
+++ b/arch/s390/mm/extmem.c
@@ -313,7 +313,8 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
goto out_free;
}

- rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
+ rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1,
+ PAGE_KERNEL);

if (rc)
goto out_free;
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index a0c88c1c9ad0..ef19522ddad2 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
if (WARN_ON_ONCE(modifiers->altmap))
return -EINVAL;

- rc = vmem_add_mapping(start, size);
+ rc = vmem_add_mapping(start, size, PAGE_KERNEL);
if (rc)
return rc;

diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
index b403fa14847d..8a5e95f184a2 100644
--- a/arch/s390/mm/vmem.c
+++ b/arch/s390/mm/vmem.c
@@ -66,7 +66,7 @@ pte_t __ref *vmem_pte_alloc(void)
/*
* Add a physical memory range to the 1:1 mapping.
*/
-static int vmem_add_mem(unsigned long start, unsigned long size)
+static int vmem_add_mem(unsigned long start, unsigned long size, pgprot_t prot)
{
unsigned long pgt_prot, sgt_prot, r3_prot;
unsigned long pages4k, pages1m, pages2g;
@@ -79,7 +79,7 @@ static int vmem_add_mem(unsigned long start, unsigned long size)
pte_t *pt_dir;
int ret = -ENOMEM;

- pgt_prot = pgprot_val(PAGE_KERNEL);
+ pgt_prot = pgprot_val(prot);
sgt_prot = pgprot_val(SEGMENT_KERNEL);
r3_prot = pgprot_val(REGION3_KERNEL);
if (!MACHINE_HAS_NX) {
@@ -362,7 +362,7 @@ int vmem_remove_mapping(unsigned long start, unsigned long size)
return ret;
}

-int vmem_add_mapping(unsigned long start, unsigned long size)
+int vmem_add_mapping(unsigned long start, unsigned long size, pgprot_t prot)
{
struct memory_segment *seg;
int ret;
@@ -379,7 +379,7 @@ int vmem_add_mapping(unsigned long start, unsigned long size)
if (ret)
goto out_free;

- ret = vmem_add_mem(start, size);
+ ret = vmem_add_mem(start, size, prot);
if (ret)
goto out_remove;
goto out;
@@ -403,7 +403,7 @@ void __init vmem_map_init(void)
struct memblock_region *reg;

for_each_memblock(memory, reg)
- vmem_add_mem(reg->base, reg->size);
+ vmem_add_mem(reg->base, reg->size, PAGE_KERNEL);
__set_memory((unsigned long)_stext,
(unsigned long)(_etext - _stext) >> PAGE_SHIFT,
SET_MEMORY_RO | SET_MEMORY_X);
--
2.20.1

2020-01-07 21:26:26

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 3/8] x86/mm: Thread pgprot_t through init_memory_mapping()

In prepartion to support a pgprot_t argument for arch_add_memory().

It's required to move the prototype of init_memory_mapping() seeing
the original location came before the definition of pgprot_t.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/x86/include/asm/page_types.h | 3 ---
arch/x86/include/asm/pgtable.h | 3 +++
arch/x86/kernel/amd_gart_64.c | 3 ++-
arch/x86/mm/init.c | 9 +++++----
arch/x86/mm/init_32.c | 3 ++-
arch/x86/mm/init_64.c | 32 +++++++++++++++++--------------
arch/x86/mm/mm_internal.h | 3 ++-
arch/x86/platform/efi/efi_64.c | 3 ++-
8 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index c85e15010f48..bf7aa2e290ef 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -73,9 +73,6 @@ static inline phys_addr_t get_max_mapped(void)

bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn);

-extern unsigned long init_memory_mapping(unsigned long start,
- unsigned long end);
-
extern void initmem_init(void);

#endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index ad97dc155195..844635e02da5 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1041,6 +1041,9 @@ static inline void __meminit init_trampoline_default(void)

void __init poking_init(void);

+unsigned long init_memory_mapping(unsigned long start,
+ unsigned long end, pgprot_t prot);
+
# ifdef CONFIG_RANDOMIZE_MEMORY
void __meminit init_trampoline(void);
# else
diff --git a/arch/x86/kernel/amd_gart_64.c b/arch/x86/kernel/amd_gart_64.c
index 4e5f50236048..16133819415c 100644
--- a/arch/x86/kernel/amd_gart_64.c
+++ b/arch/x86/kernel/amd_gart_64.c
@@ -744,7 +744,8 @@ int __init gart_iommu_init(void)

start_pfn = PFN_DOWN(aper_base);
if (!pfn_range_is_mapped(start_pfn, end_pfn))
- init_memory_mapping(start_pfn<<PAGE_SHIFT, end_pfn<<PAGE_SHIFT);
+ init_memory_mapping(start_pfn<<PAGE_SHIFT, end_pfn<<PAGE_SHIFT,
+ PAGE_KERNEL);

pr_info("PCI-DMA: using GART IOMMU.\n");
iommu_size = check_iommu_size(info.aper_base, aper_size);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index e7bb483557c9..1bba16c5742b 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -467,7 +467,7 @@ bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn)
* the physical memory. To access them they are temporarily mapped.
*/
unsigned long __ref init_memory_mapping(unsigned long start,
- unsigned long end)
+ unsigned long end, pgprot_t prot)
{
struct map_range mr[NR_RANGE_MR];
unsigned long ret = 0;
@@ -481,7 +481,8 @@ unsigned long __ref init_memory_mapping(unsigned long start,

for (i = 0; i < nr_range; i++)
ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
- mr[i].page_size_mask);
+ mr[i].page_size_mask,
+ prot);

add_pfn_range_mapped(start >> PAGE_SHIFT, ret >> PAGE_SHIFT);

@@ -521,7 +522,7 @@ static unsigned long __init init_range_memory_mapping(
*/
can_use_brk_pgt = max(start, (u64)pgt_buf_end<<PAGE_SHIFT) >=
min(end, (u64)pgt_buf_top<<PAGE_SHIFT);
- init_memory_mapping(start, end);
+ init_memory_mapping(start, end, PAGE_KERNEL);
mapped_ram_size += end - start;
can_use_brk_pgt = true;
}
@@ -661,7 +662,7 @@ void __init init_mem_mapping(void)
#endif

/* the ISA range is always mapped regardless of memory holes */
- init_memory_mapping(0, ISA_END_ADDRESS);
+ init_memory_mapping(0, ISA_END_ADDRESS, PAGE_KERNEL);

/* Init the trampoline, possibly with KASLR memory offset */
init_trampoline();
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index e6fce2d547f0..630d8a36fcd7 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -252,7 +252,8 @@ static inline int is_kernel_text(unsigned long addr)
unsigned long __init
kernel_physical_mapping_init(unsigned long start,
unsigned long end,
- unsigned long page_size_mask)
+ unsigned long page_size_mask,
+ pgprot_t prot)
{
int use_pse = page_size_mask == (1<<PG_LEVEL_2M);
unsigned long last_map_addr = end;
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 516133f555e9..17ea0bfc0b83 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -585,7 +585,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
*/
static unsigned long __meminit
phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
- unsigned long page_size_mask, bool init)
+ unsigned long page_size_mask, pgprot_t _prot, bool init)
{
unsigned long pages = 0, paddr_next;
unsigned long paddr_last = paddr_end;
@@ -595,7 +595,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
pud_t *pud;
pmd_t *pmd;
- pgprot_t prot = PAGE_KERNEL;
+ pgprot_t prot = _prot;

vaddr = (unsigned long)__va(paddr);
pud = pud_page + pud_index(vaddr);
@@ -644,9 +644,12 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
if (page_size_mask & (1<<PG_LEVEL_1G)) {
pages++;
spin_lock(&init_mm.page_table_lock);
+
+ prot = __pgprot(pgprot_val(prot) | __PAGE_KERNEL_LARGE);
+
set_pte_init((pte_t *)pud,
pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
- PAGE_KERNEL_LARGE),
+ prot),
init);
spin_unlock(&init_mm.page_table_lock);
paddr_last = paddr_next;
@@ -669,7 +672,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,

static unsigned long __meminit
phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
- unsigned long page_size_mask, bool init)
+ unsigned long page_size_mask, pgprot_t prot, bool init)
{
unsigned long vaddr, vaddr_end, vaddr_next, paddr_next, paddr_last;

@@ -679,7 +682,7 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,

if (!pgtable_l5_enabled())
return phys_pud_init((pud_t *) p4d_page, paddr, paddr_end,
- page_size_mask, init);
+ page_size_mask, prot, init);

for (; vaddr < vaddr_end; vaddr = vaddr_next) {
p4d_t *p4d = p4d_page + p4d_index(vaddr);
@@ -702,13 +705,13 @@ phys_p4d_init(p4d_t *p4d_page, unsigned long paddr, unsigned long paddr_end,
if (!p4d_none(*p4d)) {
pud = pud_offset(p4d, 0);
paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
- page_size_mask, init);
+ page_size_mask, prot, init);
continue;
}

pud = alloc_low_page();
paddr_last = phys_pud_init(pud, paddr, __pa(vaddr_end),
- page_size_mask, init);
+ page_size_mask, prot, init);

spin_lock(&init_mm.page_table_lock);
p4d_populate_init(&init_mm, p4d, pud, init);
@@ -722,7 +725,7 @@ static unsigned long __meminit
__kernel_physical_mapping_init(unsigned long paddr_start,
unsigned long paddr_end,
unsigned long page_size_mask,
- bool init)
+ pgprot_t prot, bool init)
{
bool pgd_changed = false;
unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
@@ -743,13 +746,13 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
paddr_last = phys_p4d_init(p4d, __pa(vaddr),
__pa(vaddr_end),
page_size_mask,
- init);
+ prot, init);
continue;
}

p4d = alloc_low_page();
paddr_last = phys_p4d_init(p4d, __pa(vaddr), __pa(vaddr_end),
- page_size_mask, init);
+ page_size_mask, prot, init);

spin_lock(&init_mm.page_table_lock);
if (pgtable_l5_enabled())
@@ -778,10 +781,10 @@ __kernel_physical_mapping_init(unsigned long paddr_start,
unsigned long __meminit
kernel_physical_mapping_init(unsigned long paddr_start,
unsigned long paddr_end,
- unsigned long page_size_mask)
+ unsigned long page_size_mask, pgprot_t prot)
{
return __kernel_physical_mapping_init(paddr_start, paddr_end,
- page_size_mask, true);
+ page_size_mask, prot, true);
}

/*
@@ -796,7 +799,8 @@ kernel_physical_mapping_change(unsigned long paddr_start,
unsigned long page_size_mask)
{
return __kernel_physical_mapping_init(paddr_start, paddr_end,
- page_size_mask, false);
+ page_size_mask, PAGE_KERNEL,
+ false);
}

#ifndef CONFIG_NUMA
@@ -864,7 +868,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;

- init_memory_mapping(start, start + size);
+ init_memory_mapping(start, start + size, PAGE_KERNEL);

return add_pages(nid, start_pfn, nr_pages, modifiers);
}
diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h
index eeae142062ed..3f37b5c80bb3 100644
--- a/arch/x86/mm/mm_internal.h
+++ b/arch/x86/mm/mm_internal.h
@@ -12,7 +12,8 @@ void early_ioremap_page_table_range_init(void);

unsigned long kernel_physical_mapping_init(unsigned long start,
unsigned long end,
- unsigned long page_size_mask);
+ unsigned long page_size_mask,
+ pgprot_t prot);
unsigned long kernel_physical_mapping_change(unsigned long start,
unsigned long end,
unsigned long page_size_mask);
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 08ce8177c3af..3d0acfdb58c3 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -499,7 +499,8 @@ void __iomem *__init efi_ioremap(unsigned long phys_addr, unsigned long size,
if (type == EFI_MEMORY_MAPPED_IO)
return ioremap(phys_addr, size);

- last_map_pfn = init_memory_mapping(phys_addr, phys_addr + size);
+ last_map_pfn = init_memory_mapping(phys_addr, phys_addr + size,
+ PAGE_KERNEL);
if ((last_map_pfn << PAGE_SHIFT) < phys_addr + size) {
unsigned long top = last_map_pfn << PAGE_SHIFT;
efi_ioremap(top, size - (top - phys_addr), type, attribute);
--
2.20.1

2020-01-07 21:26:29

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 1/8] mm/memory_hotplug: Drop the flags field from struct mhp_restrictions

This variable is not used anywhere and should therefore be removed
from the structure.

Signed-off-by: Logan Gunthorpe <[email protected]>
---
include/linux/memory_hotplug.h | 2 --
1 file changed, 2 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index ba0dca6aac6e..e47a29761088 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -55,11 +55,9 @@ enum {

/*
* Restrictions for the memory hotplug:
- * flags: MHP_ flags
* altmap: alternative allocator for memmap array
*/
struct mhp_restrictions {
- unsigned long flags;
struct vmem_altmap *altmap;
};

--
2.20.1

2020-01-07 21:26:30

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers

The mhp_restrictions struct really doesn't specify anything resembling
a restriction anymore so rename it to be mhp_modifiers.

Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/arm64/mm/mmu.c | 4 ++--
arch/ia64/mm/init.c | 4 ++--
arch/powerpc/mm/mem.c | 4 ++--
arch/s390/mm/init.c | 6 +++---
arch/sh/mm/init.c | 4 ++--
arch/x86/mm/init_32.c | 4 ++--
arch/x86/mm/init_64.c | 8 ++++----
include/linux/memory_hotplug.h | 12 ++++++------
mm/memory_hotplug.c | 8 ++++----
mm/memremap.c | 8 ++++----
10 files changed, 31 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 40797cbfba2d..3320406579c3 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1050,7 +1050,7 @@ int p4d_free_pud_page(p4d_t *p4d, unsigned long addr)

#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
int flags = 0;

@@ -1063,7 +1063,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
memblock_clear_nomap(start, size);

return __add_pages(nid, start >> PAGE_SHIFT, size >> PAGE_SHIFT,
- restrictions);
+ modifiers);
}
void arch_remove_memory(int nid, u64 start, u64 size,
struct vmem_altmap *altmap)
diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index b01d68a2d5d9..daf438e08b96 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -670,13 +670,13 @@ mem_init (void)

#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;

- ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+ ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
__func__, ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index f5535eae637f..9dd9c3c1be7f 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -127,7 +127,7 @@ static void flush_dcache_range_chunked(unsigned long start, unsigned long stop,
}

int __ref arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
@@ -143,7 +143,7 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
return -EFAULT;
}

- return __add_pages(nid, start_pfn, nr_pages, restrictions);
+ return __add_pages(nid, start_pfn, nr_pages, modifiers);
}

void __ref arch_remove_memory(int nid, u64 start, u64 size,
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index ac44bd76db4b..a0c88c1c9ad0 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -268,20 +268,20 @@ device_initcall(s390_cma_mem_init);
#endif /* CONFIG_CMA */

int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = PFN_DOWN(start);
unsigned long size_pages = PFN_DOWN(size);
int rc;

- if (WARN_ON_ONCE(restrictions->altmap))
+ if (WARN_ON_ONCE(modifiers->altmap))
return -EINVAL;

rc = vmem_add_mapping(start, size);
if (rc)
return rc;

- rc = __add_pages(nid, start_pfn, size_pages, restrictions);
+ rc = __add_pages(nid, start_pfn, size_pages, modifiers);
if (rc)
vmem_remove_mapping(start, size);
return rc;
diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index d1b1ff2be17a..7e64f42fb570 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -406,14 +406,14 @@ void __init mem_init(void)

#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = PFN_DOWN(start);
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;

/* We only have ZONE_NORMAL, so this is easy.. */
- ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+ ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
if (unlikely(ret))
printk("%s: Failed, __add_pages() == %d\n", __func__, ret);

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 0a74407ef92e..e6fce2d547f0 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -852,12 +852,12 @@ void __init mem_init(void)

#ifdef CONFIG_MEMORY_HOTPLUG
int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;

- return __add_pages(nid, start_pfn, nr_pages, restrictions);
+ return __add_pages(nid, start_pfn, nr_pages, modifiers);
}

void arch_remove_memory(int nid, u64 start, u64 size,
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index bcfede46fe02..516133f555e9 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -844,11 +844,11 @@ static void update_end_of_memory_vars(u64 start, u64 size)
}

int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
int ret;

- ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
+ ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
WARN_ON_ONCE(ret);

/* update max_pfn, max_low_pfn and high_memory */
@@ -859,14 +859,14 @@ int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
}

int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;

init_memory_mapping(start, start + size);

- return add_pages(nid, start_pfn, nr_pages, restrictions);
+ return add_pages(nid, start_pfn, nr_pages, modifiers);
}

#define PAGE_INUSE 0xFD
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index e47a29761088..2152efae2f4b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -57,7 +57,7 @@ enum {
* Restrictions for the memory hotplug:
* altmap: alternative allocator for memmap array
*/
-struct mhp_restrictions {
+struct mhp_modifiers {
struct vmem_altmap *altmap;
};

@@ -107,7 +107,7 @@ extern int restore_online_page_callback(online_page_callback_t callback);
extern int try_online_node(int nid);

extern int arch_add_memory(int nid, u64 start, u64 size,
- struct mhp_restrictions *restrictions);
+ struct mhp_modifiers *modifiers);
extern u64 max_mem_size;

extern bool memhp_auto_online;
@@ -125,17 +125,17 @@ extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,

/* reasonably generic interface to expand the physical pages */
extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
- struct mhp_restrictions *restrictions);
+ struct mhp_modifiers *modifiers);

#ifndef CONFIG_ARCH_HAS_ADD_PAGES
static inline int add_pages(int nid, unsigned long start_pfn,
- unsigned long nr_pages, struct mhp_restrictions *restrictions)
+ unsigned long nr_pages, struct mhp_modifiers *modifiers)
{
- return __add_pages(nid, start_pfn, nr_pages, restrictions);
+ return __add_pages(nid, start_pfn, nr_pages, modifiers);
}
#else /* ARCH_HAS_ADD_PAGES */
int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
- struct mhp_restrictions *restrictions);
+ struct mhp_modifiers *modifiers);
#endif /* ARCH_HAS_ADD_PAGES */

#ifdef CONFIG_NUMA
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a91a072f2b2c..1bb3f92e087d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -299,11 +299,11 @@ static int check_hotplug_memory_addressable(unsigned long pfn,
* add the new pages.
*/
int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
- struct mhp_restrictions *restrictions)
+ struct mhp_modifiers *modifiers)
{
int err;
unsigned long nr, start_sec, end_sec;
- struct vmem_altmap *altmap = restrictions->altmap;
+ struct vmem_altmap *altmap = modifiers->altmap;

err = check_hotplug_memory_addressable(pfn, nr_pages);
if (err)
@@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
*/
int __ref add_memory_resource(int nid, struct resource *res)
{
- struct mhp_restrictions restrictions = {};
+ struct mhp_modifiers modifiers = {};
u64 start, size;
bool new_node = false;
int ret;
@@ -1055,7 +1055,7 @@ int __ref add_memory_resource(int nid, struct resource *res)
new_node = ret;

/* call arch's memory hotadd */
- ret = arch_add_memory(nid, start, size, &restrictions);
+ ret = arch_add_memory(nid, start, size, &modifiers);
if (ret < 0)
goto error;

diff --git a/mm/memremap.c b/mm/memremap.c
index c51c6bd2fe34..e30be8ba706b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -158,7 +158,7 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
{
struct resource *res = &pgmap->res;
struct dev_pagemap *conflict_pgmap;
- struct mhp_restrictions restrictions = {
+ struct mhp_modifiers modifiers = {
/*
* We do not want any optional features only our own memmap
*/
@@ -272,7 +272,7 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
*/
if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
error = add_pages(nid, PHYS_PFN(res->start),
- PHYS_PFN(resource_size(res)), &restrictions);
+ PHYS_PFN(resource_size(res)), &modifiers);
} else {
error = kasan_add_zero_shadow(__va(res->start), resource_size(res));
if (error) {
@@ -281,7 +281,7 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
}

error = arch_add_memory(nid, res->start, resource_size(res),
- &restrictions);
+ &modifiers);
}

if (!error) {
@@ -289,7 +289,7 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)

zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
move_pfn_range_to_zone(zone, PHYS_PFN(res->start),
- PHYS_PFN(resource_size(res)), restrictions.altmap);
+ PHYS_PFN(resource_size(res)), modifiers.altmap);
}

mem_hotplug_done();
--
2.20.1

2020-01-07 21:26:35

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 5/8] powerpc/mm: Thread pgprot_t through create_section_mapping()

In prepartion to support a pgprot_t argument for arch_add_memory().

Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/powerpc/include/asm/book3s/64/hash.h | 3 ++-
arch/powerpc/include/asm/book3s/64/radix.h | 3 ++-
arch/powerpc/include/asm/sparsemem.h | 3 ++-
arch/powerpc/mm/book3s64/hash_utils.c | 5 +++--
arch/powerpc/mm/book3s64/pgtable.c | 7 ++++---
arch/powerpc/mm/book3s64/radix_pgtable.c | 18 +++++++++++-------
arch/powerpc/mm/mem.c | 5 +++--
7 files changed, 27 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index 2781ebf6add4..6fc4520092c7 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -251,7 +251,8 @@ extern int __meminit hash__vmemmap_create_mapping(unsigned long start,
extern void hash__vmemmap_remove_mapping(unsigned long start,
unsigned long page_size);

-int hash__create_section_mapping(unsigned long start, unsigned long end, int nid);
+int hash__create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot);
int hash__remove_section_mapping(unsigned long start, unsigned long end);

#endif /* !__ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index d97db3ad9aae..46799f3c3d1d 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -289,7 +289,8 @@ static inline unsigned long radix__get_tree_size(void)
}

#ifdef CONFIG_MEMORY_HOTPLUG
-int radix__create_section_mapping(unsigned long start, unsigned long end, int nid);
+int radix__create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot);
int radix__remove_section_mapping(unsigned long start, unsigned long end);
#endif /* CONFIG_MEMORY_HOTPLUG */
#endif /* __ASSEMBLY__ */
diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index 3192d454a733..c89b32443cff 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -13,7 +13,8 @@
#endif /* CONFIG_SPARSEMEM */

#ifdef CONFIG_MEMORY_HOTPLUG
-extern int create_section_mapping(unsigned long start, unsigned long end, int nid);
+extern int create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot);
extern int remove_section_mapping(unsigned long start, unsigned long end);

#ifdef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index b30435c7d804..276e353d5264 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -800,7 +800,8 @@ int resize_hpt_for_hotplug(unsigned long new_mem_size)
return 0;
}

-int hash__create_section_mapping(unsigned long start, unsigned long end, int nid)
+int hash__create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot)
{
int rc;

@@ -810,7 +811,7 @@ int hash__create_section_mapping(unsigned long start, unsigned long end, int nid
}

rc = htab_bolt_mapping(start, end, __pa(start),
- pgprot_val(PAGE_KERNEL), mmu_linear_psize,
+ pgprot_val(prot), mmu_linear_psize,
mmu_kernel_ssize);

if (rc < 0) {
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 75483b40fcb1..b60c18d2e5c9 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -171,12 +171,13 @@ void mmu_cleanup_all(void)
}

#ifdef CONFIG_MEMORY_HOTPLUG
-int __meminit create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __meminit create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot)
{
if (radix_enabled())
- return radix__create_section_mapping(start, end, nid);
+ return radix__create_section_mapping(start, end, nid, prot);

- return hash__create_section_mapping(start, end, nid);
+ return hash__create_section_mapping(start, end, nid, prot);
}

int __meminit remove_section_mapping(unsigned long start, unsigned long end)
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 974109bb85db..2a21fb4a22b2 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -253,7 +253,7 @@ static unsigned long next_boundary(unsigned long addr, unsigned long end)

static int __meminit create_physical_mapping(unsigned long start,
unsigned long end,
- int nid)
+ int nid, pgprot_t _prot)
{
unsigned long vaddr, addr, mapping_size = 0;
bool prev_exec, exec = false;
@@ -289,7 +289,7 @@ static int __meminit create_physical_mapping(unsigned long start,
prot = PAGE_KERNEL_X;
exec = true;
} else {
- prot = PAGE_KERNEL;
+ prot = _prot;
exec = false;
}

@@ -333,7 +333,7 @@ static void __init radix_init_pgtable(void)

WARN_ON(create_physical_mapping(reg->base,
reg->base + reg->size,
- -1));
+ -1, PAGE_KERNEL));
}

/* Find out how many PID bits are supported */
@@ -708,8 +708,10 @@ static int __meminit stop_machine_change_mapping(void *data)

spin_unlock(&init_mm.page_table_lock);
pte_clear(&init_mm, params->aligned_start, params->pte);
- create_physical_mapping(__pa(params->aligned_start), __pa(params->start), -1);
- create_physical_mapping(__pa(params->end), __pa(params->aligned_end), -1);
+ create_physical_mapping(__pa(params->aligned_start),
+ __pa(params->start), -1, PAGE_KERNEL);
+ create_physical_mapping(__pa(params->end), __pa(params->aligned_end),
+ -1, PAGE_KERNEL);
spin_lock(&init_mm.page_table_lock);
return 0;
}
@@ -866,14 +868,16 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
radix__flush_tlb_kernel_range(start, end);
}

-int __meminit radix__create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __meminit radix__create_section_mapping(unsigned long start,
+ unsigned long end, int nid,
+ pgprot_t prot)
{
if (end >= RADIX_VMALLOC_START) {
pr_warn("Outside the supported range\n");
return -1;
}

- return create_physical_mapping(__pa(start), __pa(end), nid);
+ return create_physical_mapping(__pa(start), __pa(end), nid, prot);
}

int __meminit radix__remove_section_mapping(unsigned long start, unsigned long end)
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 9dd9c3c1be7f..631ee684721f 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -95,7 +95,8 @@ int memory_add_physaddr_to_nid(u64 start)
}
#endif

-int __weak create_section_mapping(unsigned long start, unsigned long end, int nid)
+int __weak create_section_mapping(unsigned long start, unsigned long end,
+ int nid, pgprot_t prot)
{
return -ENODEV;
}
@@ -136,7 +137,7 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
resize_hpt_for_hotplug(memblock_phys_mem_size());

start = (unsigned long)__va(start);
- rc = create_section_mapping(start, start + size, nid);
+ rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
if (rc) {
pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
start, start + size, rc);
--
2.20.1

2020-01-07 21:26:46

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 4/8] x86/mm: Introduce _set_memory_prot()

For use in the 32bit arch_add_memory() to set the pgprot type of the
memory to add.

Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/x86/include/asm/set_memory.h | 1 +
arch/x86/mm/pageattr.c | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/asm/set_memory.h b/arch/x86/include/asm/set_memory.h
index 2ee8e469dcf5..d728c2f3ad96 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -34,6 +34,7 @@
* The caller is required to take care of these.
*/

+int _set_memory_prot(unsigned long addr, int numpages, pgprot_t prot);
int _set_memory_uc(unsigned long addr, int numpages);
int _set_memory_wc(unsigned long addr, int numpages);
int _set_memory_wt(unsigned long addr, int numpages);
diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 1b99ad05b117..2f1934d7b8d9 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -1781,6 +1781,13 @@ static inline int cpa_clear_pages_array(struct page **pages, int numpages,
CPA_PAGES_ARRAY, pages);
}

+int _set_memory_prot(unsigned long addr, int numpages, pgprot_t prot)
+{
+ return change_page_attr_set_clr(&addr, numpages, prot,
+ __pgprot(~pgprot_val(prot)), 0, 0,
+ NULL);
+}
+
int _set_memory_uc(unsigned long addr, int numpages)
{
/*
--
2.20.1

2020-01-07 21:26:54

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 7/8] mm/memory_hotplug: Add pgprot_t to mhp_modifiers

devm_memremap_pages() is currently used by the PCI P2PDMA code to create
struct page mappings for IO memory. At present, these mappings are created
with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
x86, an mtrr register will typically override this and force the cache
type to be UC-. In the case firmware doesn't set this register it is
effectively WB and will typically result in a machine check exception
when it's accessed.

Other arches are not currently likely to function correctly seeing they
don't have any MTRR registers to fall back on.

To solve this, add an argument to arch_add_memory() to explicitly
set the pgprot value to a specific value.

Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a
simple change to pass the pgprot_t down to their respective functions
which set up the page tables. For x86_32, set the page tables explicitly
using _set_memory_prot() (seeing they are already mapped). For sh, reject
anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
sh doesn't support ZONE_DEVICE anyway.

Cc: Dan Williams <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
arch/arm64/mm/mmu.c | 3 ++-
arch/ia64/mm/init.c | 4 ++++
arch/powerpc/mm/mem.c | 3 ++-
arch/s390/mm/init.c | 2 +-
arch/sh/mm/init.c | 3 +++
arch/x86/mm/init_32.c | 5 +++++
arch/x86/mm/init_64.c | 2 +-
include/linux/memory_hotplug.h | 2 ++
mm/memory_hotplug.c | 2 +-
mm/memremap.c | 6 +++---
10 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 3320406579c3..9b214b0d268f 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;

__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
- size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
+ size, modifiers->pgprot, __pgd_pgtable_alloc,
+ flags);

memblock_clear_nomap(start, size);

diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
index daf438e08b96..5fd6ae4929c9 100644
--- a/arch/ia64/mm/init.c
+++ b/arch/ia64/mm/init.c
@@ -677,6 +677,10 @@ int arch_add_memory(int nid, u64 start, u64 size,
int ret;

ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
+ if (modifiers->pgprot != PAGE_KERNEL)
+ return -EINVAL;
+
+ ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
if (ret)
printk("%s: Problem encountered in __add_pages() as ret=%d\n",
__func__, ret);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 631ee684721f..fddeaee53198 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -137,7 +137,8 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
resize_hpt_for_hotplug(memblock_phys_mem_size());

start = (unsigned long)__va(start);
- rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
+ rc = create_section_mapping(start, start + size, nid,
+ modifiers->pgprot);
if (rc) {
pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
start, start + size, rc);
diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index ef19522ddad2..c65fb33f6a89 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
if (WARN_ON_ONCE(modifiers->altmap))
return -EINVAL;

- rc = vmem_add_mapping(start, size, PAGE_KERNEL);
+ rc = vmem_add_mapping(start, size, modifiers->pgprot);
if (rc)
return rc;

diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
index 7e64f42fb570..7071dc5bd2e4 100644
--- a/arch/sh/mm/init.c
+++ b/arch/sh/mm/init.c
@@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
unsigned long nr_pages = size >> PAGE_SHIFT;
int ret;

+ if (modifiers->pgprot != PAGE_KERNEL)
+ return -EINVAL;
+
/* We only have ZONE_NORMAL, so this is easy.. */
ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
if (unlikely(ret))
diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 630d8a36fcd7..737da0dbc0d5 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -857,6 +857,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
{
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;
+ int ret;
+
+ ret = _set_memory_prot(start, nr_pages, modifiers->pgprot);
+ if (ret)
+ return ret;

return __add_pages(nid, start_pfn, nr_pages, modifiers);
}
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 17ea0bfc0b83..cc9eb45ad120 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -868,7 +868,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
unsigned long start_pfn = start >> PAGE_SHIFT;
unsigned long nr_pages = size >> PAGE_SHIFT;

- init_memory_mapping(start, start + size, PAGE_KERNEL);
+ init_memory_mapping(start, start + size, modifiers->pgprot);

return add_pages(nid, start_pfn, nr_pages, modifiers);
}
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 2152efae2f4b..00dfb2016737 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -56,9 +56,11 @@ enum {
/*
* Restrictions for the memory hotplug:
* altmap: alternative allocator for memmap array
+ * pgprot: page protection flags to apply to newly added page tables
*/
struct mhp_modifiers {
struct vmem_altmap *altmap;
+ pgprot_t pgprot;
};

/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 1bb3f92e087d..0888f821af06 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
*/
int __ref add_memory_resource(int nid, struct resource *res)
{
- struct mhp_modifiers modifiers = {};
+ struct mhp_modifiers modifiers = {.pgprot = PAGE_KERNEL};
u64 start, size;
bool new_node = false;
int ret;
diff --git a/mm/memremap.c b/mm/memremap.c
index e30be8ba706b..45ab4ef0643d 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -163,8 +163,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
* We do not want any optional features only our own memmap
*/
.altmap = pgmap_altmap(pgmap),
+ .pgprot = PAGE_KERNEL,
};
- pgprot_t pgprot = PAGE_KERNEL;
int error, is_ram;
bool need_devmap_managed = true;

@@ -252,8 +252,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
if (nid < 0)
nid = numa_mem_id();

- error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
- resource_size(res));
+ error = track_pfn_remap(NULL, &modifiers.pgprot, PHYS_PFN(res->start),
+ 0, resource_size(res));
if (error)
goto err_pfn_remap;

--
2.20.1

2020-01-07 21:27:27

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 8/8] mm/memremap: Set caching mode for PCI P2PDMA memory to WC

PCI BAR IO memory should never be mapped as WB, however prior to this
the PAT bits were set WB and it was typically overridden by MTRR
registers set by the firmware.

Set PCI P2PDMA memory to be WC (writecombining) as the only current
user (the NVMe CMB) was originally mapped WC before the P2PDMA code
replaced the mapping with devm_memremap_pages().

Future use-cases may need to generalize this by adding flags to
select the caching type, as some P2PDMA cases will not want WC.
However, those use-cases are not upstream yet and this can be changed
when they arrive.

Cc: Dan Williams <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Signed-off-by: Logan Gunthorpe <[email protected]>
---
mm/memremap.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/memremap.c b/mm/memremap.c
index 45ab4ef0643d..d36ff688b768 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -187,7 +187,10 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
}
break;
case MEMORY_DEVICE_DEVDAX:
+ need_devmap_managed = false;
+ break;
case MEMORY_DEVICE_PCI_P2PDMA:
+ modifiers.pgprot = pgprot_writecombine(modifiers.pgprot);
need_devmap_managed = false;
break;
default:
--
2.20.1

2020-01-08 12:28:41

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 1/8] mm/memory_hotplug: Drop the flags field from struct mhp_restrictions

On 07.01.20 21:59, Logan Gunthorpe wrote:
> This variable is not used anywhere and should therefore be removed
> from the structure.
>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> include/linux/memory_hotplug.h | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index ba0dca6aac6e..e47a29761088 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -55,11 +55,9 @@ enum {
>
> /*
> * Restrictions for the memory hotplug:
> - * flags: MHP_ flags
> * altmap: alternative allocator for memmap array
> */
> struct mhp_restrictions {
> - unsigned long flags;
> struct vmem_altmap *altmap;
> };

We wanted to use that for the "vmemmap on added memory" config knob. But
that might still take some time and we might actually realize it using
the altmap instead (hopefully :) ).

Reviewed-by: David Hildenbrand <[email protected]>

--
Thanks,

David / dhildenb

2020-01-08 13:00:09

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v2 7/8] mm/memory_hotplug: Add pgprot_t to mhp_modifiers

On Tue 07-01-20 13:59:58, Logan Gunthorpe wrote:
> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
> struct page mappings for IO memory. At present, these mappings are created
> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
> x86, an mtrr register will typically override this and force the cache
> type to be UC-. In the case firmware doesn't set this register it is
> effectively WB and will typically result in a machine check exception
> when it's accessed.
>
> Other arches are not currently likely to function correctly seeing they
> don't have any MTRR registers to fall back on.
>
> To solve this, add an argument to arch_add_memory() to explicitly
> set the pgprot value to a specific value.
>
> Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a
> simple change to pass the pgprot_t down to their respective functions
> which set up the page tables. For x86_32, set the page tables explicitly
> using _set_memory_prot() (seeing they are already mapped). For sh, reject
> anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
> sh doesn't support ZONE_DEVICE anyway.
>
> Cc: Dan Williams <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Signed-off-by: Logan Gunthorpe <[email protected]>

OK, this is less code churn than I expected. Having pgprot as an implcit
parameter de-facto is a bit fragile though. Should we add a WARN_ON_ONCE
(e.g. into the add_pages to catch all arches) for value 0?

Other than that
Acked-by: Michal Hocko <[email protected]>

> ---
> arch/arm64/mm/mmu.c | 3 ++-
> arch/ia64/mm/init.c | 4 ++++
> arch/powerpc/mm/mem.c | 3 ++-
> arch/s390/mm/init.c | 2 +-
> arch/sh/mm/init.c | 3 +++
> arch/x86/mm/init_32.c | 5 +++++
> arch/x86/mm/init_64.c | 2 +-
> include/linux/memory_hotplug.h | 2 ++
> mm/memory_hotplug.c | 2 +-
> mm/memremap.c | 6 +++---
> 10 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 3320406579c3..9b214b0d268f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>
> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> - size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
> + size, modifiers->pgprot, __pgd_pgtable_alloc,
> + flags);
>
> memblock_clear_nomap(start, size);
>
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index daf438e08b96..5fd6ae4929c9 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -677,6 +677,10 @@ int arch_add_memory(int nid, u64 start, u64 size,
> int ret;
>
> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
> + if (modifiers->pgprot != PAGE_KERNEL)
> + return -EINVAL;
> +
> + ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
> if (ret)
> printk("%s: Problem encountered in __add_pages() as ret=%d\n",
> __func__, ret);
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 631ee684721f..fddeaee53198 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -137,7 +137,8 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
> resize_hpt_for_hotplug(memblock_phys_mem_size());
>
> start = (unsigned long)__va(start);
> - rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
> + rc = create_section_mapping(start, start + size, nid,
> + modifiers->pgprot);
> if (rc) {
> pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
> start, start + size, rc);
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index ef19522ddad2..c65fb33f6a89 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> if (WARN_ON_ONCE(modifiers->altmap))
> return -EINVAL;
>
> - rc = vmem_add_mapping(start, size, PAGE_KERNEL);
> + rc = vmem_add_mapping(start, size, modifiers->pgprot);
> if (rc)
> return rc;
>
> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
> index 7e64f42fb570..7071dc5bd2e4 100644
> --- a/arch/sh/mm/init.c
> +++ b/arch/sh/mm/init.c
> @@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
> unsigned long nr_pages = size >> PAGE_SHIFT;
> int ret;
>
> + if (modifiers->pgprot != PAGE_KERNEL)
> + return -EINVAL;
> +
> /* We only have ZONE_NORMAL, so this is easy.. */
> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
> if (unlikely(ret))
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index 630d8a36fcd7..737da0dbc0d5 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -857,6 +857,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
> {
> unsigned long start_pfn = start >> PAGE_SHIFT;
> unsigned long nr_pages = size >> PAGE_SHIFT;
> + int ret;
> +
> + ret = _set_memory_prot(start, nr_pages, modifiers->pgprot);
> + if (ret)
> + return ret;
>
> return __add_pages(nid, start_pfn, nr_pages, modifiers);
> }
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index 17ea0bfc0b83..cc9eb45ad120 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -868,7 +868,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> unsigned long start_pfn = start >> PAGE_SHIFT;
> unsigned long nr_pages = size >> PAGE_SHIFT;
>
> - init_memory_mapping(start, start + size, PAGE_KERNEL);
> + init_memory_mapping(start, start + size, modifiers->pgprot);
>
> return add_pages(nid, start_pfn, nr_pages, modifiers);
> }
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 2152efae2f4b..00dfb2016737 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -56,9 +56,11 @@ enum {
> /*
> * Restrictions for the memory hotplug:
> * altmap: alternative allocator for memmap array
> + * pgprot: page protection flags to apply to newly added page tables
> */
> struct mhp_modifiers {
> struct vmem_altmap *altmap;
> + pgprot_t pgprot;
> };
>
> /*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 1bb3f92e087d..0888f821af06 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
> */
> int __ref add_memory_resource(int nid, struct resource *res)
> {
> - struct mhp_modifiers modifiers = {};
> + struct mhp_modifiers modifiers = {.pgprot = PAGE_KERNEL};
> u64 start, size;
> bool new_node = false;
> int ret;
> diff --git a/mm/memremap.c b/mm/memremap.c
> index e30be8ba706b..45ab4ef0643d 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -163,8 +163,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
> * We do not want any optional features only our own memmap
> */
> .altmap = pgmap_altmap(pgmap),
> + .pgprot = PAGE_KERNEL,
> };
> - pgprot_t pgprot = PAGE_KERNEL;
> int error, is_ram;
> bool need_devmap_managed = true;
>
> @@ -252,8 +252,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
> if (nid < 0)
> nid = numa_mem_id();
>
> - error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
> - resource_size(res));
> + error = track_pfn_remap(NULL, &modifiers.pgprot, PHYS_PFN(res->start),
> + 0, resource_size(res));
> if (error)
> goto err_pfn_remap;
>
> --
> 2.20.1
>

--
Michal Hocko
SUSE Labs

2020-01-08 13:59:19

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] s390/mm: Thread pgprot_t through vmem_add_mapping()

On 07.01.20 21:59, Logan Gunthorpe wrote:
> In prepartion to support a pgprot_t argument for arch_add_memory().
>
> Cc: Heiko Carstens <[email protected]>
> Cc: Vasily Gorbik <[email protected]>
> Cc: Christian Borntraeger <[email protected]>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> arch/s390/include/asm/pgtable.h | 3 ++-
> arch/s390/mm/extmem.c | 3 ++-
> arch/s390/mm/init.c | 2 +-
> arch/s390/mm/vmem.c | 10 +++++-----
> 4 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 7b03037a8475..e667a1a96879 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1640,7 +1640,8 @@ static inline swp_entry_t __swp_entry(unsigned long type, unsigned long offset)
>
> #define kern_addr_valid(addr) (1)
>
> -extern int vmem_add_mapping(unsigned long start, unsigned long size);
> +extern int vmem_add_mapping(unsigned long start, unsigned long size,
> + pgprot_t prot);
> extern int vmem_remove_mapping(unsigned long start, unsigned long size);
> extern int s390_enable_sie(void);
> extern int s390_enable_skey(void);
> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
> index fd0dae9d10f4..6cf7029a7b35 100644
> --- a/arch/s390/mm/extmem.c
> +++ b/arch/s390/mm/extmem.c
> @@ -313,7 +313,8 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
> goto out_free;
> }
>
> - rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
> + rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1,
> + PAGE_KERNEL);
>
> if (rc)
> goto out_free;
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index a0c88c1c9ad0..ef19522ddad2 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
> if (WARN_ON_ONCE(modifiers->altmap))
> return -EINVAL;
>
> - rc = vmem_add_mapping(start, size);
> + rc = vmem_add_mapping(start, size, PAGE_KERNEL);
> if (rc)
> return rc;
>
> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
> index b403fa14847d..8a5e95f184a2 100644
> --- a/arch/s390/mm/vmem.c
> +++ b/arch/s390/mm/vmem.c
> @@ -66,7 +66,7 @@ pte_t __ref *vmem_pte_alloc(void)
> /*
> * Add a physical memory range to the 1:1 mapping.
> */
> -static int vmem_add_mem(unsigned long start, unsigned long size)
> +static int vmem_add_mem(unsigned long start, unsigned long size, pgprot_t prot)
> {
> unsigned long pgt_prot, sgt_prot, r3_prot;
> unsigned long pages4k, pages1m, pages2g;
> @@ -79,7 +79,7 @@ static int vmem_add_mem(unsigned long start, unsigned long size)
> pte_t *pt_dir;
> int ret = -ENOMEM;
>
> - pgt_prot = pgprot_val(PAGE_KERNEL);
> + pgt_prot = pgprot_val(prot);
> sgt_prot = pgprot_val(SEGMENT_KERNEL);
> r3_prot = pgprot_val(REGION3_KERNEL);

So, if we map as huge/gigantic pages, the protection would be discarded?
That looks wrong.

s390x does not support ZONE_DEVICE yet. Maybe simply bail out for s390x
as you do for sh to make your life easier?

[...]

--
Thanks,

David / dhildenb

2020-01-08 14:08:24

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers

On 07.01.20 21:59, Logan Gunthorpe wrote:
> The mhp_restrictions struct really doesn't specify anything resembling
> a restriction anymore so rename it to be mhp_modifiers.

I wonder if something like "mhp_params" would be even better. It's
essentially just a way to avoid changing call chains rough-out all archs
whenever we want to add a new parameter.


--
Thanks,

David / dhildenb

2020-01-08 14:12:30

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 7/8] mm/memory_hotplug: Add pgprot_t to mhp_modifiers

On 07.01.20 21:59, Logan Gunthorpe wrote:
> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
> struct page mappings for IO memory. At present, these mappings are created
> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
> x86, an mtrr register will typically override this and force the cache
> type to be UC-. In the case firmware doesn't set this register it is
> effectively WB and will typically result in a machine check exception
> when it's accessed.
>
> Other arches are not currently likely to function correctly seeing they
> don't have any MTRR registers to fall back on.
>
> To solve this, add an argument to arch_add_memory() to explicitly
> set the pgprot value to a specific value.

You're adding a parameter indirectly by adding it to the structure.
Maybe "provide a way to specify the pgprot value explicitly to
arch_add_memory()"

>
> Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a

s/is/need/

> simple change to pass the pgprot_t down to their respective functions
> which set up the page tables. For x86_32, set the page tables explicitly

"page table protection" ?

> using _set_memory_prot() (seeing they are already mapped). For sh, reject
> anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
> sh doesn't support ZONE_DEVICE anyway.
>
> Cc: Dan Williams <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> arch/arm64/mm/mmu.c | 3 ++-
> arch/ia64/mm/init.c | 4 ++++
> arch/powerpc/mm/mem.c | 3 ++-
> arch/s390/mm/init.c | 2 +-
> arch/sh/mm/init.c | 3 +++
> arch/x86/mm/init_32.c | 5 +++++
> arch/x86/mm/init_64.c | 2 +-
> include/linux/memory_hotplug.h | 2 ++
> mm/memory_hotplug.c | 2 +-
> mm/memremap.c | 6 +++---
> 10 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 3320406579c3..9b214b0d268f 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>
> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
> - size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
> + size, modifiers->pgprot, __pgd_pgtable_alloc,
> + flags);
>
> memblock_clear_nomap(start, size);
>
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index daf438e08b96..5fd6ae4929c9 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -677,6 +677,10 @@ int arch_add_memory(int nid, u64 start, u64 size,
> int ret;
>
> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
> + if (modifiers->pgprot != PAGE_KERNEL)
> + return -EINVAL;

... maybe better "if (WARN_ON_ONCE(...))"
[...]

> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -56,9 +56,11 @@ enum {
> /*
> * Restrictions for the memory hotplug:
> * altmap: alternative allocator for memmap array
> + * pgprot: page protection flags to apply to newly added page tables
> */
> struct mhp_modifiers {
> struct vmem_altmap *altmap;
> + pgprot_t pgprot;
> };
>
> /*
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 1bb3f92e087d..0888f821af06 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
> */
> int __ref add_memory_resource(int nid, struct resource *res)
> {
> - struct mhp_modifiers modifiers = {};
> + struct mhp_modifiers modifiers = {.pgprot = PAGE_KERNEL};

I think we usually use spaces like

= { .pgprot = PAGE_KERNEL };

t480s: ~/git/linux virtio-mem-v1 $ git grep "= {\." | wc -l
978
t480s: ~/git/linux virtio-mem-v1 $ git grep "= { " | wc -l
35447

> u64 start, size;
> bool new_node = false;
> int ret;
> diff --git a/mm/memremap.c b/mm/memremap.c
> index e30be8ba706b..45ab4ef0643d 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -163,8 +163,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
> * We do not want any optional features only our own memmap
> */
> .altmap = pgmap_altmap(pgmap),
> + .pgprot = PAGE_KERNEL,
> };
> - pgprot_t pgprot = PAGE_KERNEL;
> int error, is_ram;
> bool need_devmap_managed = true;
>
> @@ -252,8 +252,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
> if (nid < 0)
> nid = numa_mem_id();
>
> - error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
> - resource_size(res));
> + error = track_pfn_remap(NULL, &modifiers.pgprot, PHYS_PFN(res->start),
> + 0, resource_size(res));
> if (error)
> goto err_pfn_remap;
>
>

The !arch code looks good to me (besides I would prefer "params" instead
of "modifiers").

Acked-by: David Hildenbrand <[email protected]>


--
Thanks,

David / dhildenb

2020-01-08 17:21:41

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 6/8] s390/mm: Thread pgprot_t through vmem_add_mapping()



On 2020-01-08 5:43 a.m., David Hildenbrand wrote:
> On 07.01.20 21:59, Logan Gunthorpe wrote:
>> In prepartion to support a pgprot_t argument for arch_add_memory().
>>
>> Cc: Heiko Carstens <[email protected]>
>> Cc: Vasily Gorbik <[email protected]>
>> Cc: Christian Borntraeger <[email protected]>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>> ---
>> arch/s390/include/asm/pgtable.h | 3 ++-
>> arch/s390/mm/extmem.c | 3 ++-
>> arch/s390/mm/init.c | 2 +-
>> arch/s390/mm/vmem.c | 10 +++++-----
>> 4 files changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
>> index 7b03037a8475..e667a1a96879 100644
>> --- a/arch/s390/include/asm/pgtable.h
>> +++ b/arch/s390/include/asm/pgtable.h
>> @@ -1640,7 +1640,8 @@ static inline swp_entry_t __swp_entry(unsigned long type, unsigned long offset)
>>
>> #define kern_addr_valid(addr) (1)
>>
>> -extern int vmem_add_mapping(unsigned long start, unsigned long size);
>> +extern int vmem_add_mapping(unsigned long start, unsigned long size,
>> + pgprot_t prot);
>> extern int vmem_remove_mapping(unsigned long start, unsigned long size);
>> extern int s390_enable_sie(void);
>> extern int s390_enable_skey(void);
>> diff --git a/arch/s390/mm/extmem.c b/arch/s390/mm/extmem.c
>> index fd0dae9d10f4..6cf7029a7b35 100644
>> --- a/arch/s390/mm/extmem.c
>> +++ b/arch/s390/mm/extmem.c
>> @@ -313,7 +313,8 @@ __segment_load (char *name, int do_nonshared, unsigned long *addr, unsigned long
>> goto out_free;
>> }
>>
>> - rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1);
>> + rc = vmem_add_mapping(seg->start_addr, seg->end - seg->start_addr + 1,
>> + PAGE_KERNEL);
>>
>> if (rc)
>> goto out_free;
>> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
>> index a0c88c1c9ad0..ef19522ddad2 100644
>> --- a/arch/s390/mm/init.c
>> +++ b/arch/s390/mm/init.c
>> @@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> if (WARN_ON_ONCE(modifiers->altmap))
>> return -EINVAL;
>>
>> - rc = vmem_add_mapping(start, size);
>> + rc = vmem_add_mapping(start, size, PAGE_KERNEL);
>> if (rc)
>> return rc;
>>
>> diff --git a/arch/s390/mm/vmem.c b/arch/s390/mm/vmem.c
>> index b403fa14847d..8a5e95f184a2 100644
>> --- a/arch/s390/mm/vmem.c
>> +++ b/arch/s390/mm/vmem.c
>> @@ -66,7 +66,7 @@ pte_t __ref *vmem_pte_alloc(void)
>> /*
>> * Add a physical memory range to the 1:1 mapping.
>> */
>> -static int vmem_add_mem(unsigned long start, unsigned long size)
>> +static int vmem_add_mem(unsigned long start, unsigned long size, pgprot_t prot)
>> {
>> unsigned long pgt_prot, sgt_prot, r3_prot;
>> unsigned long pages4k, pages1m, pages2g;
>> @@ -79,7 +79,7 @@ static int vmem_add_mem(unsigned long start, unsigned long size)
>> pte_t *pt_dir;
>> int ret = -ENOMEM;
>>
>> - pgt_prot = pgprot_val(PAGE_KERNEL);
>> + pgt_prot = pgprot_val(prot);
>> sgt_prot = pgprot_val(SEGMENT_KERNEL);
>> r3_prot = pgprot_val(REGION3_KERNEL);
>
> So, if we map as huge/gigantic pages, the protection would be discarded?
> That looks wrong.
>
> s390x does not support ZONE_DEVICE yet. Maybe simply bail out for s390x
> as you do for sh to make your life easier?

Yeah, ok, makes sense to me; I'll change it for v3.

Logan

2020-01-08 19:48:32

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers

On Wed, Jan 8, 2020 at 9:17 AM Logan Gunthorpe <[email protected]> wrote:
>
>
>
> On 2020-01-08 5:28 a.m., David Hildenbrand wrote:
> > On 07.01.20 21:59, Logan Gunthorpe wrote:
> >> The mhp_restrictions struct really doesn't specify anything resembling
> >> a restriction anymore so rename it to be mhp_modifiers.
> >
> > I wonder if something like "mhp_params" would be even better. It's
> > essentially just a way to avoid changing call chains rough-out all archs
> > whenever we want to add a new parameter.
>
> Sure, that does sound a bit nicer to me. I can change it for v3.

Oh, I was just about to chime in to support "modifiers" because I
would expect all parameters to folded into a "params" struct. The
modifiers seem to be limited to the set of items that are only
considered in a non-default / expert memory hotplug use cases.

2020-01-08 19:48:35

by David Hildenbrand

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers



> Am 08.01.2020 um 20:00 schrieb Dan Williams <[email protected]>:
>
> On Wed, Jan 8, 2020 at 9:17 AM Logan Gunthorpe <[email protected]> wrote:
>>
>>
>>
>>> On 2020-01-08 5:28 a.m., David Hildenbrand wrote:
>>> On 07.01.20 21:59, Logan Gunthorpe wrote:
>>>> The mhp_restrictions struct really doesn't specify anything resembling
>>>> a restriction anymore so rename it to be mhp_modifiers.
>>>
>>> I wonder if something like "mhp_params" would be even better. It's
>>> essentially just a way to avoid changing call chains rough-out all archs
>>> whenever we want to add a new parameter.
>>
>> Sure, that does sound a bit nicer to me. I can change it for v3.
>
> Oh, I was just about to chime in to support "modifiers" because I
> would expect all parameters to folded into a "params" struct. The
> modifiers seem to be limited to the set of items that are only
> considered in a non-default / expert memory hotplug use cases.
>

It‘s a set of extended parameters I‘d say.

2020-01-08 19:49:08

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers



On 2020-01-08 12:13 p.m., Dan Williams wrote:
> On Wed, Jan 8, 2020 at 11:08 AM David Hildenbrand <[email protected]> wrote:
>>
>>
>>
>>> Am 08.01.2020 um 20:00 schrieb Dan Williams <[email protected]>:
>>>
>>> On Wed, Jan 8, 2020 at 9:17 AM Logan Gunthorpe <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>>> On 2020-01-08 5:28 a.m., David Hildenbrand wrote:
>>>>> On 07.01.20 21:59, Logan Gunthorpe wrote:
>>>>>> The mhp_restrictions struct really doesn't specify anything resembling
>>>>>> a restriction anymore so rename it to be mhp_modifiers.
>>>>>
>>>>> I wonder if something like "mhp_params" would be even better. It's
>>>>> essentially just a way to avoid changing call chains rough-out all archs
>>>>> whenever we want to add a new parameter.
>>>>
>>>> Sure, that does sound a bit nicer to me. I can change it for v3.
>>>
>>> Oh, I was just about to chime in to support "modifiers" because I
>>> would expect all parameters to folded into a "params" struct. The
>>> modifiers seem to be limited to the set of items that are only
>>> considered in a non-default / expert memory hotplug use cases.

>>
>> It‘s a set of extended parameters I‘d say.

> Sure, we can call them "mhp_params" and just clarify that they are
> optional / extended in the kernel-doc.

Well pgprot isn't going to be optional... But I'll add something to the
kernel_doc.

Logan

2020-01-08 19:54:47

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 7/8] mm/memory_hotplug: Add pgprot_t to mhp_modifiers



On 2020-01-08 5:39 a.m., David Hildenbrand wrote:
> On 07.01.20 21:59, Logan Gunthorpe wrote:
>> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
>> struct page mappings for IO memory. At present, these mappings are created
>> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
>> x86, an mtrr register will typically override this and force the cache
>> type to be UC-. In the case firmware doesn't set this register it is
>> effectively WB and will typically result in a machine check exception
>> when it's accessed.
>>
>> Other arches are not currently likely to function correctly seeing they
>> don't have any MTRR registers to fall back on.
>>
>> To solve this, add an argument to arch_add_memory() to explicitly
>> set the pgprot value to a specific value.
>
> You're adding a parameter indirectly by adding it to the structure.
> Maybe "provide a way to specify the pgprot value explicitly to
> arch_add_memory()"
>
>>
>> Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a
>
> s/is/need/
>
>> simple change to pass the pgprot_t down to their respective functions
>> which set up the page tables. For x86_32, set the page tables explicitly
>
> "page table protection" ?
>
>> using _set_memory_prot() (seeing they are already mapped). For sh, reject
>> anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
>> sh doesn't support ZONE_DEVICE anyway.
>>
>> Cc: Dan Williams <[email protected]>
>> Cc: David Hildenbrand <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>> ---
>> arch/arm64/mm/mmu.c | 3 ++-
>> arch/ia64/mm/init.c | 4 ++++
>> arch/powerpc/mm/mem.c | 3 ++-
>> arch/s390/mm/init.c | 2 +-
>> arch/sh/mm/init.c | 3 +++
>> arch/x86/mm/init_32.c | 5 +++++
>> arch/x86/mm/init_64.c | 2 +-
>> include/linux/memory_hotplug.h | 2 ++
>> mm/memory_hotplug.c | 2 +-
>> mm/memremap.c | 6 +++---
>> 10 files changed, 24 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 3320406579c3..9b214b0d268f 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>>
>> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>> - size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
>> + size, modifiers->pgprot, __pgd_pgtable_alloc,
>> + flags);
>>
>> memblock_clear_nomap(start, size);
>>
>> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
>> index daf438e08b96..5fd6ae4929c9 100644
>> --- a/arch/ia64/mm/init.c
>> +++ b/arch/ia64/mm/init.c
>> @@ -677,6 +677,10 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> int ret;
>>
>> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
>> + if (modifiers->pgprot != PAGE_KERNEL)
>> + return -EINVAL;
>
> ... maybe better "if (WARN_ON_ONCE(...))"
> [...]
>
>> --- a/include/linux/memory_hotplug.h
>> +++ b/include/linux/memory_hotplug.h
>> @@ -56,9 +56,11 @@ enum {
>> /*
>> * Restrictions for the memory hotplug:
>> * altmap: alternative allocator for memmap array
>> + * pgprot: page protection flags to apply to newly added page tables
>> */
>> struct mhp_modifiers {
>> struct vmem_altmap *altmap;
>> + pgprot_t pgprot;
>> };
>>
>> /*
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 1bb3f92e087d..0888f821af06 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
>> */
>> int __ref add_memory_resource(int nid, struct resource *res)
>> {
>> - struct mhp_modifiers modifiers = {};
>> + struct mhp_modifiers modifiers = {.pgprot = PAGE_KERNEL};
>
> I think we usually use spaces like
>
> = { .pgprot = PAGE_KERNEL };
>
> t480s: ~/git/linux virtio-mem-v1 $ git grep "= {\." | wc -l
> 978
> t480s: ~/git/linux virtio-mem-v1 $ git grep "= { " | wc -l
> 35447
>
>> u64 start, size;
>> bool new_node = false;
>> int ret;
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index e30be8ba706b..45ab4ef0643d 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -163,8 +163,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
>> * We do not want any optional features only our own memmap
>> */
>> .altmap = pgmap_altmap(pgmap),
>> + .pgprot = PAGE_KERNEL,
>> };
>> - pgprot_t pgprot = PAGE_KERNEL;
>> int error, is_ram;
>> bool need_devmap_managed = true;
>>
>> @@ -252,8 +252,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
>> if (nid < 0)
>> nid = numa_mem_id();
>>
>> - error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
>> - resource_size(res));
>> + error = track_pfn_remap(NULL, &modifiers.pgprot, PHYS_PFN(res->start),
>> + 0, resource_size(res));
>> if (error)
>> goto err_pfn_remap;
>>
>>
>
> The !arch code looks good to me (besides I would prefer "params" instead
> of "modifiers").
>
> Acked-by: David Hildenbrand <[email protected]>

Thanks, I'll make the changes above for v3.

Logan

2020-01-08 19:54:48

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers



On 2020-01-08 5:28 a.m., David Hildenbrand wrote:
> On 07.01.20 21:59, Logan Gunthorpe wrote:
>> The mhp_restrictions struct really doesn't specify anything resembling
>> a restriction anymore so rename it to be mhp_modifiers.
>
> I wonder if something like "mhp_params" would be even better. It's
> essentially just a way to avoid changing call chains rough-out all archs
> whenever we want to add a new parameter.

Sure, that does sound a bit nicer to me. I can change it for v3.

Logan

2020-01-08 19:55:10

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 7/8] mm/memory_hotplug: Add pgprot_t to mhp_modifiers



On 2020-01-08 5:42 a.m., Michal Hocko wrote:
> On Tue 07-01-20 13:59:58, Logan Gunthorpe wrote:
>> devm_memremap_pages() is currently used by the PCI P2PDMA code to create
>> struct page mappings for IO memory. At present, these mappings are created
>> with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on
>> x86, an mtrr register will typically override this and force the cache
>> type to be UC-. In the case firmware doesn't set this register it is
>> effectively WB and will typically result in a machine check exception
>> when it's accessed.
>>
>> Other arches are not currently likely to function correctly seeing they
>> don't have any MTRR registers to fall back on.
>>
>> To solve this, add an argument to arch_add_memory() to explicitly
>> set the pgprot value to a specific value.
>>
>> Of the arches that support MEMORY_HOTPLUG: x86_64, s390 and arm64 is a
>> simple change to pass the pgprot_t down to their respective functions
>> which set up the page tables. For x86_32, set the page tables explicitly
>> using _set_memory_prot() (seeing they are already mapped). For sh, reject
>> anything but PAGE_KERNEL settings -- this should be fine, for now, seeing
>> sh doesn't support ZONE_DEVICE anyway.
>>
>> Cc: Dan Williams <[email protected]>
>> Cc: David Hildenbrand <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Signed-off-by: Logan Gunthorpe <[email protected]>
>
> OK, this is less code churn than I expected. Having pgprot as an implcit
> parameter de-facto is a bit fragile though. Should we add a WARN_ON_ONCE
> (e.g. into the add_pages to catch all arches) for value 0?

Sure, I can add that for v3.

Logan

> Other than that
> Acked-by: Michal Hocko <[email protected]>
>
>> ---
>> arch/arm64/mm/mmu.c | 3 ++-
>> arch/ia64/mm/init.c | 4 ++++
>> arch/powerpc/mm/mem.c | 3 ++-
>> arch/s390/mm/init.c | 2 +-
>> arch/sh/mm/init.c | 3 +++
>> arch/x86/mm/init_32.c | 5 +++++
>> arch/x86/mm/init_64.c | 2 +-
>> include/linux/memory_hotplug.h | 2 ++
>> mm/memory_hotplug.c | 2 +-
>> mm/memremap.c | 6 +++---
>> 10 files changed, 24 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 3320406579c3..9b214b0d268f 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -1058,7 +1058,8 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
>>
>> __create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
>> - size, PAGE_KERNEL, __pgd_pgtable_alloc, flags);
>> + size, modifiers->pgprot, __pgd_pgtable_alloc,
>> + flags);
>>
>> memblock_clear_nomap(start, size);
>>
>> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
>> index daf438e08b96..5fd6ae4929c9 100644
>> --- a/arch/ia64/mm/init.c
>> +++ b/arch/ia64/mm/init.c
>> @@ -677,6 +677,10 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> int ret;
>>
>> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
>> + if (modifiers->pgprot != PAGE_KERNEL)
>> + return -EINVAL;
>> +
>> + ret = __add_pages(nid, start_pfn, nr_pages, restrictions);
>> if (ret)
>> printk("%s: Problem encountered in __add_pages() as ret=%d\n",
>> __func__, ret);
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index 631ee684721f..fddeaee53198 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -137,7 +137,8 @@ int __ref arch_add_memory(int nid, u64 start, u64 size,
>> resize_hpt_for_hotplug(memblock_phys_mem_size());
>>
>> start = (unsigned long)__va(start);
>> - rc = create_section_mapping(start, start + size, nid, PAGE_KERNEL);
>> + rc = create_section_mapping(start, start + size, nid,
>> + modifiers->pgprot);
>> if (rc) {
>> pr_warn("Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
>> start, start + size, rc);
>> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
>> index ef19522ddad2..c65fb33f6a89 100644
>> --- a/arch/s390/mm/init.c
>> +++ b/arch/s390/mm/init.c
>> @@ -277,7 +277,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> if (WARN_ON_ONCE(modifiers->altmap))
>> return -EINVAL;
>>
>> - rc = vmem_add_mapping(start, size, PAGE_KERNEL);
>> + rc = vmem_add_mapping(start, size, modifiers->pgprot);
>> if (rc)
>> return rc;
>>
>> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
>> index 7e64f42fb570..7071dc5bd2e4 100644
>> --- a/arch/sh/mm/init.c
>> +++ b/arch/sh/mm/init.c
>> @@ -412,6 +412,9 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> unsigned long nr_pages = size >> PAGE_SHIFT;
>> int ret;
>>
>> + if (modifiers->pgprot != PAGE_KERNEL)
>> + return -EINVAL;
>> +
>> /* We only have ZONE_NORMAL, so this is easy.. */
>> ret = __add_pages(nid, start_pfn, nr_pages, modifiers);
>> if (unlikely(ret))
>> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
>> index 630d8a36fcd7..737da0dbc0d5 100644
>> --- a/arch/x86/mm/init_32.c
>> +++ b/arch/x86/mm/init_32.c
>> @@ -857,6 +857,11 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> {
>> unsigned long start_pfn = start >> PAGE_SHIFT;
>> unsigned long nr_pages = size >> PAGE_SHIFT;
>> + int ret;
>> +
>> + ret = _set_memory_prot(start, nr_pages, modifiers->pgprot);
>> + if (ret)
>> + return ret;
>>
>> return __add_pages(nid, start_pfn, nr_pages, modifiers);
>> }
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index 17ea0bfc0b83..cc9eb45ad120 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -868,7 +868,7 @@ int arch_add_memory(int nid, u64 start, u64 size,
>> unsigned long start_pfn = start >> PAGE_SHIFT;
>> unsigned long nr_pages = size >> PAGE_SHIFT;
>>
>> - init_memory_mapping(start, start + size, PAGE_KERNEL);
>> + init_memory_mapping(start, start + size, modifiers->pgprot);
>>
>> return add_pages(nid, start_pfn, nr_pages, modifiers);
>> }
>> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
>> index 2152efae2f4b..00dfb2016737 100644
>> --- a/include/linux/memory_hotplug.h
>> +++ b/include/linux/memory_hotplug.h
>> @@ -56,9 +56,11 @@ enum {
>> /*
>> * Restrictions for the memory hotplug:
>> * altmap: alternative allocator for memmap array
>> + * pgprot: page protection flags to apply to newly added page tables
>> */
>> struct mhp_modifiers {
>> struct vmem_altmap *altmap;
>> + pgprot_t pgprot;
>> };
>>
>> /*
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 1bb3f92e087d..0888f821af06 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1027,7 +1027,7 @@ static int online_memory_block(struct memory_block *mem, void *arg)
>> */
>> int __ref add_memory_resource(int nid, struct resource *res)
>> {
>> - struct mhp_modifiers modifiers = {};
>> + struct mhp_modifiers modifiers = {.pgprot = PAGE_KERNEL};
>> u64 start, size;
>> bool new_node = false;
>> int ret;
>> diff --git a/mm/memremap.c b/mm/memremap.c
>> index e30be8ba706b..45ab4ef0643d 100644
>> --- a/mm/memremap.c
>> +++ b/mm/memremap.c
>> @@ -163,8 +163,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
>> * We do not want any optional features only our own memmap
>> */
>> .altmap = pgmap_altmap(pgmap),
>> + .pgprot = PAGE_KERNEL,
>> };
>> - pgprot_t pgprot = PAGE_KERNEL;
>> int error, is_ram;
>> bool need_devmap_managed = true;
>>
>> @@ -252,8 +252,8 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
>> if (nid < 0)
>> nid = numa_mem_id();
>>
>> - error = track_pfn_remap(NULL, &pgprot, PHYS_PFN(res->start), 0,
>> - resource_size(res));
>> + error = track_pfn_remap(NULL, &modifiers.pgprot, PHYS_PFN(res->start),
>> + 0, resource_size(res));
>> if (error)
>> goto err_pfn_remap;
>>
>> --
>> 2.20.1
>>
>

2020-01-08 20:01:37

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v2 2/8] mm/memory_hotplug: Rename mhp_restrictions to mhp_modifiers

On Wed, Jan 8, 2020 at 11:08 AM David Hildenbrand <[email protected]> wrote:
>
>
>
> > Am 08.01.2020 um 20:00 schrieb Dan Williams <[email protected]>:
> >
> > On Wed, Jan 8, 2020 at 9:17 AM Logan Gunthorpe <[email protected]> wrote:
> >>
> >>
> >>
> >>> On 2020-01-08 5:28 a.m., David Hildenbrand wrote:
> >>> On 07.01.20 21:59, Logan Gunthorpe wrote:
> >>>> The mhp_restrictions struct really doesn't specify anything resembling
> >>>> a restriction anymore so rename it to be mhp_modifiers.
> >>>
> >>> I wonder if something like "mhp_params" would be even better. It's
> >>> essentially just a way to avoid changing call chains rough-out all archs
> >>> whenever we want to add a new parameter.
> >>
> >> Sure, that does sound a bit nicer to me. I can change it for v3.
> >
> > Oh, I was just about to chime in to support "modifiers" because I
> > would expect all parameters to folded into a "params" struct. The
> > modifiers seem to be limited to the set of items that are only
> > considered in a non-default / expert memory hotplug use cases.
> >
>
> It‘s a set of extended parameters I‘d say.

Sure, we can call them "mhp_params" and just clarify that they are
optional / extended in the kernel-doc.
>