2018-05-04 12:40:40

by Christophe Leroy

Subject: [PATCH 00/17] Implement use of HW assistance on TLB table walk on 8xx

The purpose of this series is to implement hardware assistance for TLB table walks
on the 8xx.

The first part is to make L1 entries and L2 entries independent.
For that, we need to alter the ioremap functions in order to handle the GUARDED
attribute at the PGD/PMD level.

The last part is to try and reuse the PTE fragment mechanism implemented on PPC64
in order not to waste 16k pages for page tables when only 4k are used. For the time
being it doesn't work, but I include it in the series anyway in order to get feedback.

Tested successfully on the 8xx with the series applied up to the patch before the last one.

I didn't have time to do compile tests on the other configs; I am sending it anyway
before leaving for one week of vacation in order to get feedback.

Christophe Leroy (17):
powerpc/nohash: remove hash related code from nohash headers.
powerpc/nohash: remove _PAGE_BUSY
powerpc/nohash: use IS_ENABLED() to simplify __set_pte_at()
Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for
CONFIG_SWAP"
powerpc: move io mapping functions into ioremap.c
powerpc: common ioremap functions.
powerpc: make ioremap_bot common to PPC32 and PPC64
powerpc: make __iounmap() common to PPC32 and PPC64
powerpc: make __ioremap_caller() common to PPC32 and PPC64
powerpc: use _ALIGN macro
powerpc/nohash32: set GUARDED attribute in the PMD directly
powerpc/8xx: Remove PTE_ATOMIC_UPDATES
powerpc/mm: Use hardware assistance in TLB handlers on the 8xx
powerpc/8xx: reunify TLB handler routines
powerpc/8xx: Free up SPRN_SPRG_SCRATCH2
powerpc/mm: Make pte_fragment_alloc() common to PPC32 and PPC64
powerpc/mm: Use pte_fragment_alloc() on 8xx (Not Working yet)

arch/powerpc/include/asm/book3s/32/pgtable.h | 16 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +
arch/powerpc/include/asm/hugetlb.h | 4 +-
arch/powerpc/include/asm/machdep.h | 2 +-
arch/powerpc/include/asm/mmu-8xx.h | 38 +--
arch/powerpc/include/asm/mmu_context.h | 28 +++
arch/powerpc/include/asm/nohash/32/pgalloc.h | 39 ++-
arch/powerpc/include/asm/nohash/32/pgtable.h | 88 +++----
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 6 +-
arch/powerpc/include/asm/nohash/64/pgtable.h | 26 +-
arch/powerpc/include/asm/nohash/pgtable.h | 61 ++---
arch/powerpc/include/asm/nohash/pte-book3e.h | 6 -
arch/powerpc/include/asm/pgtable-types.h | 4 +
arch/powerpc/kernel/head_8xx.S | 350 ++++++++++-----------------
arch/powerpc/mm/8xx_mmu.c | 12 +-
arch/powerpc/mm/Makefile | 2 +-
arch/powerpc/mm/dma-noncoherent.c | 2 +-
arch/powerpc/mm/dump_linuxpagetables.c | 32 ++-
arch/powerpc/mm/hugetlbpage.c | 12 +
arch/powerpc/mm/init_32.c | 6 +-
arch/powerpc/mm/ioremap.c | 250 +++++++++++++++++++
arch/powerpc/mm/mem.c | 16 +-
arch/powerpc/mm/mmu_context_book3s64.c | 28 ---
arch/powerpc/mm/mmu_context_nohash.c | 4 +
arch/powerpc/mm/pgtable.c | 75 ++++++
arch/powerpc/mm/pgtable_32.c | 167 +++----------
arch/powerpc/mm/pgtable_64.c | 244 -------------------
arch/powerpc/platforms/Kconfig.cputype | 9 +
28 files changed, 730 insertions(+), 799 deletions(-)
create mode 100644 arch/powerpc/mm/ioremap.c

--
2.13.3



2018-05-04 12:35:42

by Christophe Leroy

Subject: [PATCH 13/17] powerpc/mm: Use hardware assistance in TLB handlers on the 8xx

Today, on the 8xx, the TLB handlers do a software tablewalk, doing all
the calculation in assembly in order to match the Linux page
table structure.

The 8xx offers hardware assistance which allows a significant size
reduction of the TLB handlers and hence also reduces the time spent
in the handlers.

However, using this HW assistance implies some constraints on the
page table structure:
- Regardless of the main page size used (4k or 16k), the
level 1 table (PGD) contains 1024 entries and each PGD entry covers
a 4 Mbytes area which is managed by a level 2 table (PTE) also
containing 1024 entries, each describing a 4k page (see the sketch
after this list).
- 16k pages require 4 identical entries in the L2 table.
- 512k page PTEs have to be spread every 128 bytes in the L2 table.
- 8M page PTEs are at the address pointed to by the L1 entry, and each
8M page requires 2 identical entries in the PGD.
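
For illustration only (not part of the patch, the helper names below are
made up), the lookup layout implied by the constraints above boils down to
a 10-bit level 1 index and a 10-bit level 2 index, whatever the main page
size:

	/* Hedged sketch of the HW-assisted two-level lookup on the 8xx */
	#define HW_PGDIR_SHIFT	22	/* each L1 entry covers a 4 Mbytes area */
	#define HW_PTE_SHIFT	12	/* each L2 entry describes a 4k page */

	static inline unsigned int hw_l1_index(unsigned long ea)
	{
		return (ea >> HW_PGDIR_SHIFT) & 0x3ff;	/* 1024 L1 entries */
	}

	static inline unsigned int hw_l2_index(unsigned long ea)
	{
		return (ea >> HW_PTE_SHIFT) & 0x3ff;	/* 1024 L2 entries */
	}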

In order to use hardware assistance, this patch does the following
modifications:
- Make the PGD size independent of the main page size
- In 16k pages mode, redefine pte_t as a struct with 4 elements,
and populate those 4 elements in __set_pte_at() and pte_update()
- Modify the TLB handlers to use HW assistance
- Adapt the size of the hugepage tables.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/hugetlb.h | 4 +-
arch/powerpc/include/asm/nohash/32/pgtable.h | 16 +-
arch/powerpc/include/asm/nohash/pgtable.h | 4 +
arch/powerpc/include/asm/pgtable-types.h | 4 +
arch/powerpc/kernel/head_8xx.S | 227 +++++++++------------------
arch/powerpc/mm/8xx_mmu.c | 10 +-
arch/powerpc/mm/hugetlbpage.c | 12 ++
7 files changed, 112 insertions(+), 165 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h
index 78540c074d70..6d29be6bac74 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -77,7 +77,9 @@ static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr,
unsigned long idx = 0;

pte_t *dir = hugepd_page(hpd);
-#ifndef CONFIG_PPC_FSL_BOOK3E
+#ifdef CONFIG_PPC_8xx
+ idx = (addr & ((1UL << pdshift) - 1)) >> PAGE_SHIFT;
+#elif !defined(CONFIG_PPC_FSL_BOOK3E)
idx = (addr & ((1UL << pdshift) - 1)) >> hugepd_shift(hpd);
#endif

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 009a5b3d3192..3efd616bbc80 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -18,7 +18,11 @@ extern int icache_44x_need_flush;

#endif /* __ASSEMBLY__ */

+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+#define PTE_INDEX_SIZE (PTE_SHIFT - 2)
+#else
#define PTE_INDEX_SIZE PTE_SHIFT
+#endif
#define PMD_INDEX_SIZE 0
#define PUD_INDEX_SIZE 0
#define PGD_INDEX_SIZE (32 - PGDIR_SHIFT)
@@ -47,7 +51,11 @@ extern int icache_44x_need_flush;
* -Matt
*/
/* PGDIR_SHIFT determines what a top-level page table entry can map */
+#ifdef CONFIG_PPC_8xx
+#define PGDIR_SHIFT 22
+#else
#define PGDIR_SHIFT (PAGE_SHIFT + PTE_INDEX_SIZE)
+#endif
#define PGDIR_SIZE (1UL << PGDIR_SHIFT)
#define PGDIR_MASK (~(PGDIR_SIZE-1))

@@ -190,7 +198,13 @@ static inline unsigned long pte_update(pte_t *p,
: "cc" );
#else /* PTE_ATOMIC_UPDATES */
unsigned long old = pte_val(*p);
- *p = __pte((old & ~clr) | set);
+ unsigned long new = (old & ~clr) | set;
+
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+ p->pte = p->pte1 = p->pte2 = p->pte3 = new;
+#else
+ *p = __pte(new);
+#endif
#endif /* !PTE_ATOMIC_UPDATES */

#ifdef CONFIG_44x
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index 077472640b35..e4b6c084be5c 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -165,7 +165,11 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
/* Anything else just stores the PTE normally. That covers all 64-bit
* cases, and 32-bit non-hash with 32-bit PTEs.
*/
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+ ptep->pte = ptep->pte1 = ptep->pte2 = ptep->pte3 = pte_val(pte);
+#else
*ptep = pte;
+#endif

/*
* With hardware tablewalk, a sync is needed to ensure that
diff --git a/arch/powerpc/include/asm/pgtable-types.h b/arch/powerpc/include/asm/pgtable-types.h
index eccb30b38b47..3b0edf041b2e 100644
--- a/arch/powerpc/include/asm/pgtable-types.h
+++ b/arch/powerpc/include/asm/pgtable-types.h
@@ -3,7 +3,11 @@
#define _ASM_POWERPC_PGTABLE_TYPES_H

/* PTE level */
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+typedef struct { pte_basic_t pte, pte1, pte2, pte3; } pte_t;
+#else
typedef struct { pte_basic_t pte; } pte_t;
+#endif
#define __pte(x) ((pte_t) { (x) })
static inline pte_basic_t pte_val(pte_t x)
{
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 85b017c67e11..4855d5a36f70 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -275,7 +275,7 @@ SystemCall:
. = 0x1100
/*
* For the MPC8xx, this is a software tablewalk to load the instruction
- * TLB. The task switch loads the M_TW register with the pointer to the first
+ * TLB. The task switch loads the M_TWB register with the pointer to the first
* level table.
* If we discover there is no second level table (value is zero) or if there
* is an invalid pte, we load that into the TLB, which causes another fault
@@ -285,106 +285,100 @@ SystemCall:
*/

#ifdef CONFIG_8xx_CPU15
-#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr) \
- addi tmp, addr, PAGE_SIZE; \
- tlbie tmp; \
- addi tmp, addr, -PAGE_SIZE; \
- tlbie tmp
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr) \
+ addi addr, addr, PAGE_SIZE; \
+ tlbie addr; \
+ addi addr, addr, -(PAGE_SIZE << 1); \
+ tlbie addr; \
+ addi addr, addr, PAGE_SIZE
#else
-#define INVALIDATE_ADJACENT_PAGES_CPU15(tmp, addr)
+#define INVALIDATE_ADJACENT_PAGES_CPU15(addr)
#endif

InstructionTLBMiss:
mtspr SPRN_SPRG_SCRATCH0, r10
+#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mtspr SPRN_SPRG_SCRATCH1, r11
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
- mtspr SPRN_SPRG_SCRATCH2, r12
+#ifdef ITLB_MISS_KERNEL
+ mfcr r11
+#endif
#endif

/* If we are faulting a kernel address, we have to use the
* kernel page tables.
*/
mfspr r10, SPRN_SRR0 /* Get effective address of fault */
- INVALIDATE_ADJACENT_PAGES_CPU15(r11, r10)
+ INVALIDATE_ADJACENT_PAGES_CPU15(r10)
/* Only modules will cause ITLB Misses as we always
* pin the first 8MB of kernel memory */
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
- mfcr r12
-#endif
+ mtspr SPRN_MD_EPN, r10
#ifdef ITLB_MISS_KERNEL
#if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT)
- andis. r11, r10, 0x8000 /* Address >= 0x80000000 */
+ cmpi cr0, r10, 0 /* Address >= 0x80000000 */
#else
- rlwinm r11, r10, 16, 0xfff8
- cmpli cr0, r11, PAGE_OFFSET@h
+ rlwinm r10, r10, 16, 0xfff8
+ cmpli cr0, r10, PAGE_OFFSET@h
#ifndef CONFIG_PIN_TLB_TEXT
/* It is assumed that kernel code fits into the first 8M page */
_ENTRY(ITLBMiss_cmp)
- cmpli cr7, r11, (PAGE_OFFSET + 0x0800000)@h
+ cmpli cr7, r10, (PAGE_OFFSET + 0x0800000)@h
#endif
#endif
#endif
- mfspr r11, SPRN_M_TW /* Get level 1 table */
+ mfspr r10, SPRN_M_TWB /* Get level 1 table */
#ifdef ITLB_MISS_KERNEL
#if defined(SIMPLE_KERNEL_ADDRESS) && defined(CONFIG_PIN_TLB_TEXT)
- beq+ 3f
+ bge+ 3f
#else
blt+ 3f
#endif
#ifndef CONFIG_PIN_TLB_TEXT
blt cr7, ITLBMissLinear
#endif
- lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
+ rlwinm r10, r10, 0, 20, 31
+ oris r10, r10, (swapper_pg_dir - PAGE_OFFSET)@h
+ ori r10, r10, (swapper_pg_dir - PAGE_OFFSET)@l
3:
#endif
+#ifdef ITLB_MISS_KERNEL
+ mfcr r11
+#endif
/* Insert level 1 index */
- rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
- lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
+ lwz r10, 0(r10) /* Get the level 1 entry */
+ mtspr SPRN_MI_TWC, r10 /* Set segment attributes */
+ mtspr SPRN_MD_TWC, r10

- /* Extract level 2 index */
- rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
-#ifdef CONFIG_HUGETLB_PAGE
- mtcr r11
-#endif
- /* Load the MI_TWC with the attributes for this "segment." */
- mtspr SPRN_MI_TWC, r11 /* Set segment attributes */
-#ifdef CONFIG_HUGETLB_PAGE
- bt- 28, 10f /* bit 28 = Large page (8M) */
- bt- 29, 20f /* bit 29 = Large page (8M or 512k) */
-#endif
- rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */
+ mfspr r10, SPRN_MD_TWC
lwz r10, 0(r10) /* Get the pte */
-4:
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
- mtcr r12
-#endif

#ifdef CONFIG_SWAP
rlwinm r11, r10, 32-5, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
#endif
- li r11, RPN_PATTERN | 0x200
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
* Software indicator bits 22, 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior
* of the MMU.
*/
- rlwimi r11, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */
- rlwimi r10, r11, 0, 0x0ff0 /* Set 22, 24-27, clear 20,23 */
+ rlwimi r10, r10, 0, 0x0f00 /* Clear bits 20-23 */
+ rlwimi r10, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */
+ ori r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */
mtspr SPRN_MI_RPN, r10 /* Update TLB entry */

/* Restore registers */
_ENTRY(itlb_miss_exit_1)
mfspr r10, SPRN_SPRG_SCRATCH0
+#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mfspr r11, SPRN_SPRG_SCRATCH1
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
- mfspr r12, SPRN_SPRG_SCRATCH2
#endif
rfi
#ifdef CONFIG_PERF_EVENTS
_ENTRY(itlb_miss_perf)
+#if !defined(ITLB_MISS_KERNEL) && !defined(CONFIG_SWAP)
+ mtspr SPRN_SPRG_SCRATCH1, r11
+#endif
lis r10, (itlb_miss_counter - PAGE_OFFSET)@ha
lwz r11, (itlb_miss_counter - PAGE_OFFSET)@l(r10)
addi r11, r11, 1
@@ -392,83 +386,42 @@ _ENTRY(itlb_miss_perf)
#endif
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
- mfspr r12, SPRN_SPRG_SCRATCH2
-#endif
rfi

-#ifdef CONFIG_HUGETLB_PAGE
-10: /* 8M pages */
-#ifdef CONFIG_PPC_16K_PAGES
- /* Extract level 2 index */
- rlwinm r10, r10, 32 - (PAGE_SHIFT_8M - PAGE_SHIFT), 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1), 29
- /* Add level 2 base */
- rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1) - 1
-#else
- /* Level 2 base */
- rlwinm r10, r11, 0, ~HUGEPD_SHIFT_MASK
-#endif
- lwz r10, 0(r10) /* Get the pte */
- b 4b
-
-20: /* 512k pages */
- /* Extract level 2 index */
- rlwinm r10, r10, 32 - (PAGE_SHIFT_512K - PAGE_SHIFT), 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1), 29
- /* Add level 2 base */
- rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1) - 1
- lwz r10, 0(r10) /* Get the pte */
- b 4b
-#endif
-
. = 0x1200
DataStoreTLBMiss:
mtspr SPRN_SPRG_SCRATCH0, r10
mtspr SPRN_SPRG_SCRATCH1, r11
- mtspr SPRN_SPRG_SCRATCH2, r12
- mfcr r12
+ mfcr r11

/* If we are faulting a kernel address, we have to use the
* kernel page tables.
*/
mfspr r10, SPRN_MD_EPN
- rlwinm r11, r10, 16, 0xfff8
- cmpli cr0, r11, PAGE_OFFSET@h
- mfspr r11, SPRN_M_TW /* Get level 1 table */
- blt+ 3f
- rlwinm r11, r10, 16, 0xfff8
+ rlwinm r10, r10, 16, 0xfff8
+ cmpli cr0, r10, PAGE_OFFSET@h
#ifndef CONFIG_PIN_TLB_IMMR
- cmpli cr0, r11, VIRT_IMMR_BASE@h
+ cmpli cr6, r10, VIRT_IMMR_BASE@h
#endif
_ENTRY(DTLBMiss_cmp)
- cmpli cr7, r11, (PAGE_OFFSET + 0x1800000)@h
+ cmpli cr7, r10, (PAGE_OFFSET + 0x1800000)@h
+ mfspr r10, SPRN_M_TWB /* Get level 1 table */
+ blt+ 3f
#ifndef CONFIG_PIN_TLB_IMMR
_ENTRY(DTLBMiss_jmp)
- beq- DTLBMissIMMR
+ beq- cr6, DTLBMissIMMR
#endif
blt cr7, DTLBMissLinear
- lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
+ rlwinm r10, r10, 0, 20, 31
+ oris r10, r10, (swapper_pg_dir - PAGE_OFFSET)@h
+ ori r10, r10, (swapper_pg_dir - PAGE_OFFSET)@l
3:
-
- /* Insert level 1 index */
- rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
- lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
-
- /* We have a pte table, so load fetch the pte from the table.
- */
- /* Extract level 2 index */
- rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
-#ifdef CONFIG_HUGETLB_PAGE
+ lwz r10, 0(r10) /* Get the level 1 entry */
mtcr r11
-#endif
- mtspr SPRN_MD_TWC, r11
-#ifdef CONFIG_HUGETLB_PAGE
- bt- 28, 10f /* bit 28 = Large page (8M) */
- bt- 29, 20f /* bit 29 = Large page (8M or 512k) */
-#endif
- rlwimi r10, r11, 0, 0, 32 - PAGE_SHIFT - 1 /* Add level 2 base */
+
+ mtspr SPRN_MD_TWC, r10
+ mfspr r10, SPRN_MD_TWC
lwz r10, 0(r10) /* Get the pte */
-4:
- mtcr r12

/* Both _PAGE_ACCESSED and _PAGE_PRESENT has to be set.
* We also need to know if the insn is a load/store, so:
@@ -498,7 +451,6 @@ _ENTRY(DTLBMiss_jmp)
_ENTRY(dtlb_miss_exit_1)
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
- mfspr r12, SPRN_SPRG_SCRATCH2
rfi
#ifdef CONFIG_PERF_EVENTS
_ENTRY(dtlb_miss_perf)
@@ -509,32 +461,8 @@ _ENTRY(dtlb_miss_perf)
#endif
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
- mfspr r12, SPRN_SPRG_SCRATCH2
rfi

-#ifdef CONFIG_HUGETLB_PAGE
-10: /* 8M pages */
- /* Extract level 2 index */
-#ifdef CONFIG_PPC_16K_PAGES
- rlwinm r10, r10, 32 - (PAGE_SHIFT_8M - PAGE_SHIFT), 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1), 29
- /* Add level 2 base */
- rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1) - 1
-#else
- /* Level 2 base */
- rlwinm r10, r11, 0, ~HUGEPD_SHIFT_MASK
-#endif
- lwz r10, 0(r10) /* Get the pte */
- b 4b
-
-20: /* 512k pages */
- /* Extract level 2 index */
- rlwinm r10, r10, 32 - (PAGE_SHIFT_512K - PAGE_SHIFT), 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1), 29
- /* Add level 2 base */
- rlwimi r10, r11, 0, 0, 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1) - 1
- lwz r10, 0(r10) /* Get the pte */
- b 4b
-#endif
-
/* This is an instruction TLB error on the MPC8xx. This could be due
* to many reasons, such as executing guarded memory or illegal instruction
* addresses. There is nothing to do but handle a big time error fault.
@@ -642,7 +570,7 @@ InstructionBreakpoint:
* not enough space in the DataStoreTLBMiss area.
*/
DTLBMissIMMR:
- mtcr r12
+ mtcr r11
/* Set 512k byte guarded page and mark it valid */
li r10, MD_PS512K | MD_GUARDED | MD_SVALID
mtspr SPRN_MD_TWC, r10
@@ -657,15 +585,14 @@ DTLBMissIMMR:
_ENTRY(dtlb_miss_exit_2)
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
- mfspr r12, SPRN_SPRG_SCRATCH2
rfi

DTLBMissLinear:
- mtcr r12
+ mtcr r11
/* Set 8M byte page and mark it valid */
li r11, MD_PS8MEG | MD_SVALID
mtspr SPRN_MD_TWC, r11
- rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
+ rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
_PAGE_PRESENT
mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
@@ -675,16 +602,15 @@ DTLBMissLinear:
_ENTRY(dtlb_miss_exit_3)
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
- mfspr r12, SPRN_SPRG_SCRATCH2
rfi

#ifndef CONFIG_PIN_TLB_TEXT
ITLBMissLinear:
- mtcr r12
+ mtcr r11
/* Set 8M byte page and mark it valid */
li r11, MI_PS8MEG | MI_SVALID
mtspr SPRN_MI_TWC, r11
- rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
+ rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
_PAGE_PRESENT
mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
@@ -692,7 +618,6 @@ ITLBMissLinear:
_ENTRY(itlb_miss_exit_2)
mfspr r10, SPRN_SPRG_SCRATCH0
mfspr r11, SPRN_SPRG_SCRATCH1
- mfspr r12, SPRN_SPRG_SCRATCH2
rfi
#endif

@@ -706,9 +631,10 @@ FixupDAR:/* Entry point for dcbx workaround. */
mtspr SPRN_SPRG_SCRATCH2, r10
/* fetch instruction from memory. */
mfspr r10, SPRN_SRR0
+ mtspr SPRN_MD_EPN, r10
rlwinm r11, r10, 16, 0xfff8
cmpli cr0, r11, PAGE_OFFSET@h
- mfspr r11, SPRN_M_TW /* Get level 1 table */
+ mfspr r11, SPRN_M_TWB /* Get level 1 table */
blt+ 3f
rlwinm r11, r10, 16, 0xfff8
_ENTRY(FixupDAR_cmp)
@@ -716,19 +642,20 @@ _ENTRY(FixupDAR_cmp)
/* create physical page address from effective address */
tophys(r11, r10)
blt- cr7, 201f
- lis r11, (swapper_pg_dir-PAGE_OFFSET)@ha
- /* Insert level 1 index */
-3: rlwimi r11, r10, 32 - ((PAGE_SHIFT - 2) << 1), (PAGE_SHIFT - 2) << 1, 29
- lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */
+ mfspr r11, SPRN_M_TWB /* Get level 1 table */
+ rlwinm r11, r11, 0, 20, 31
+ oris r11, r11, (swapper_pg_dir - PAGE_OFFSET)@h
+ ori r11, r11, (swapper_pg_dir - PAGE_OFFSET)@l
+3:
+ lwz r11, 0(r11) /* Get the level 1 entry */
+ mtspr SPRN_MD_TWC, r11
mtcr r11
+ mfspr r11, SPRN_MD_TWC
+ lwz r11, 0(r11) /* Get the pte */
bt 28,200f /* bit 28 = Large page (8M) */
bt 29,202f /* bit 29 = Large page (8M or 512K) */
- rlwinm r11, r11,0,0,19 /* Extract page descriptor page address */
- /* Insert level 2 index */
- rlwimi r11, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
- lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
- rlwimi r11, r10, 0, 32 - PAGE_SHIFT, 31
+ rlwimi r11, r10, 0, 20, 31
201: lwz r11,0(r11)
/* Check if it really is a dcbx instruction. */
/* dcbt and dcbtst does not generate DTLB Misses/Errors,
@@ -748,23 +675,12 @@ _ENTRY(FixupDAR_cmp)
141: mfspr r10,SPRN_SPRG_SCRATCH2
b DARFixed /* Nope, go back to normal TLB processing */

- /* concat physical page address(r11) and page offset(r10) */
200:
-#ifdef CONFIG_PPC_16K_PAGES
- rlwinm r11, r11, 0, 0, 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1) - 1
- rlwimi r11, r10, 32 - (PAGE_SHIFT_8M - 2), 32 + PAGE_SHIFT_8M - (PAGE_SHIFT << 1), 29
-#else
- rlwinm r11, r10, 0, ~HUGEPD_SHIFT_MASK
-#endif
- lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
rlwimi r11, r10, 0, 32 - PAGE_SHIFT_8M, 31
b 201b

202:
- rlwinm r11, r11, 0, 0, 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1) - 1
- rlwimi r11, r10, 32 - (PAGE_SHIFT_512K - 2), 32 + PAGE_SHIFT_512K - (PAGE_SHIFT << 1), 29
- lwz r11, 0(r11) /* Get the pte */
/* concat physical page address(r11) and page offset(r10) */
rlwimi r11, r10, 0, 32 - PAGE_SHIFT_512K, 31
b 201b
@@ -898,9 +814,10 @@ start_here:
* init's THREAD like the context switch code does, but this is
* easier......until someone changes init's static structures.
*/
- lis r6, swapper_pg_dir@ha
+ lis r6, swapper_pg_dir@h
+ ori r6, r6, swapper_pg_dir@l
tophys(r6,r6)
- mtspr SPRN_M_TW, r6
+ mtspr SPRN_M_TWB, r6
lis r4,2f@h
ori r4,r4,2f@l
tophys(r4,r4)
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index 5d53684c2ebd..54a02b8e21ec 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -173,8 +173,6 @@ void __init setup_initial_memory_limit(phys_addr_t first_memblock_base,
*/
void set_context(unsigned long id, pgd_t *pgd)
{
- s16 offset = (s16)(__pa(swapper_pg_dir));
-
#ifdef CONFIG_BDI_SWITCH
pgd_t **ptr = *(pgd_t ***)(KERNELBASE + 0xf0);

@@ -184,12 +182,8 @@ void set_context(unsigned long id, pgd_t *pgd)
*(ptr + 1) = pgd;
#endif

- /* Register M_TW will contain base address of level 1 table minus the
- * lower part of the kernel PGDIR base address, so that all accesses to
- * level 1 table are done relative to lower part of kernel PGDIR base
- * address.
- */
- mtspr(SPRN_M_TW, __pa(pgd) - offset);
+ /* Register M_TWB will contain base address of level 1 table */
+ mtspr(SPRN_M_TWB, __pa(pgd));

/* Update context */
mtspr(SPRN_M_CASID, id - 1);
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index f1153f8254e3..6c07d40eed0f 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -61,7 +61,11 @@ static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
cachep = hugepte_cache;
num_hugepd = 1 << (pshift - pdshift);
} else {
+#ifdef CONFIG_PPC_8xx
+ cachep = PGT_CACHE(PTE_SHIFT);
+#else
cachep = PGT_CACHE(pdshift - pshift);
+#endif
num_hugepd = 1;
}

@@ -326,7 +330,11 @@ static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshif
if (shift >= pdshift)
hugepd_free(tlb, hugepte);
else
+#ifdef CONFIG_PPC_8xx
+ pgtable_free_tlb(tlb, hugepte, PTE_SHIFT);
+#else
pgtable_free_tlb(tlb, hugepte, pdshift - shift);
+#endif
}

static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
@@ -689,7 +697,11 @@ static int __init hugetlbpage_init(void)
* use pgt cache for hugepd.
*/
if (pdshift > shift)
+#ifdef CONFIG_PPC_8xx
+ pgtable_cache_add(PTE_SHIFT, NULL);
+#else
pgtable_cache_add(pdshift - shift, NULL);
+#endif
#if defined(CONFIG_PPC_FSL_BOOK3E) || defined(CONFIG_PPC_8xx)
else if (!hugepte_cache) {
/*
--
2.13.3


2018-05-04 12:35:46

by Christophe Leroy

Subject: [PATCH 15/17] powerpc/8xx: Free up SPRN_SPRG_SCRATCH2

We can now use SPRN_M_TW in the DAR fixup code, freeing
SPRN_SPRG_SCRATCH2.

SPRN_SPRG_SCRATCH2 may then be used for something else in the future.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/head_8xx.S | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index c98a4ebb5a4d..8e96b526f109 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -624,7 +624,7 @@ InstructionBreakpoint:
/* define if you don't want to use self modifying code */
#define NO_SELF_MODIFYING_CODE
FixupDAR:/* Entry point for dcbx workaround. */
- mtspr SPRN_SPRG_SCRATCH2, r10
+ mtspr SPRN_M_TW, r10
/* fetch instruction from memory. */
mfspr r10, SPRN_SRR0
mtspr SPRN_MD_EPN, r10
@@ -668,7 +668,7 @@ _ENTRY(FixupDAR_cmp)
beq+ 142f
cmpwi cr0, r10, 1964 /* Is icbi? */
beq+ 142f
-141: mfspr r10,SPRN_SPRG_SCRATCH2
+141: mfspr r10,SPRN_M_TW
b DARFixed /* Nope, go back to normal TLB processing */

200:
@@ -703,7 +703,7 @@ modified_instr:
bne+ 143f
subf r10,r0,r10 /* r10=r10-r0, only if reg RA is r0 */
143: mtdar r10 /* store faulting EA in DAR */
- mfspr r10,SPRN_SPRG_SCRATCH2
+ mfspr r10,SPRN_M_TW
b DARFixed /* Go back to normal TLB handling */
#else
mfctr r10
@@ -757,7 +757,7 @@ modified_instr:
mfdar r11
mtctr r11 /* restore ctr reg from DAR */
mtdar r10 /* save fault EA to DAR */
- mfspr r10,SPRN_SPRG_SCRATCH2
+ mfspr r10,SPRN_M_TW
b DARFixed /* Go back to normal TLB handling */

/* special handling for r10,r11 since these are modified already */
--
2.13.3


2018-05-04 12:36:10

by Christophe Leroy

Subject: [PATCH BAD 17/17] powerpc/mm: Use pte_fragment_alloc() on 8xx

DO NOT APPLY THIS ONE, IT IS BUGGY. But comments are welcome.


In 16k pages mode, the 8xx still needs only 4k for each page table.

This patch makes use of the pte_fragment functions in order
to avoid wasting memory space.
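
As a rough illustration of the expected saving (a sketch only, matching the
definitions added to pgtable.h in the hunk below):

	/* 16k pages mode on the 8xx: one PTE table only needs 4k */
	#define PTE_FRAG_SIZE_SHIFT	12			/* one fragment = 4k */
	#define PTE_FRAG_SIZE		(1UL << PTE_FRAG_SIZE_SHIFT)
	#define PTE_FRAG_NR		4			/* 16k page / 4k fragment */

	/* Without fragments, each PTE table consumes a whole 16k page and
	 * wastes 12k; with fragments, 4 PTE tables share one 16k page. */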

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/mmu-8xx.h | 4 ++++
arch/powerpc/include/asm/nohash/32/pgalloc.h | 29 +++++++++++++++++++++++++++-
arch/powerpc/include/asm/nohash/32/pgtable.h | 5 ++++-
arch/powerpc/mm/mmu_context_nohash.c | 4 ++++
arch/powerpc/mm/pgtable.c | 10 +++++++++-
arch/powerpc/mm/pgtable_32.c | 12 ++++++++++++
arch/powerpc/platforms/Kconfig.cputype | 1 +
7 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
index 193f53116c7a..4f4cb754afd8 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -190,6 +190,10 @@ typedef struct {
struct slice_mask mask_8m;
# endif
#endif
+#ifdef CONFIG_NEED_PTE_FRAG
+ /* for 4K PTE fragment support */
+ void *pte_frag;
+#endif
} mm_context_t;

#define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff80000)
diff --git a/arch/powerpc/include/asm/nohash/32/pgalloc.h b/arch/powerpc/include/asm/nohash/32/pgalloc.h
index 1c6461e7c6aa..1e3b8f580499 100644
--- a/arch/powerpc/include/asm/nohash/32/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/32/pgalloc.h
@@ -93,6 +93,32 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel_g(pmd, address))? \
NULL: pte_offset_kernel(pmd, address))

+#ifdef CONFIG_NEED_PTE_FRAG
+extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int);
+extern void pte_fragment_free(unsigned long *, int);
+
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
+ unsigned long address)
+{
+ return (pte_t *)pte_fragment_alloc(mm, address, 1);
+}
+
+static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
+ unsigned long address)
+{
+ return (pgtable_t)pte_fragment_alloc(mm, address, 0);
+}
+
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+ pte_fragment_free((unsigned long *)pte, 1);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
+{
+ pte_fragment_free((unsigned long *)ptepage, 0);
+}
+#else
extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr);
extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr);

@@ -106,11 +132,12 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
pgtable_page_dtor(ptepage);
__free_page(ptepage);
}
+#endif

static inline void pgtable_free(void *table, unsigned index_size)
{
if (!index_size) {
- free_page((unsigned long)table);
+ pte_free_kernel(NULL, table);
} else {
BUG_ON(index_size > MAX_PGTABLE_INDEX_SIZE);
kmem_cache_free(PGT_CACHE(index_size), table);
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 3efd616bbc80..e2a22c8dc7f6 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -20,6 +20,9 @@ extern int icache_44x_need_flush;

#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
#define PTE_INDEX_SIZE (PTE_SHIFT - 2)
+#define PTE_FRAG_NR 4
+#define PTE_FRAG_SIZE_SHIFT 12
+#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
#else
#define PTE_INDEX_SIZE PTE_SHIFT
#endif
@@ -303,7 +306,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
*/
#ifndef CONFIG_BOOKE
#define pmd_page_vaddr(pmd) \
- ((unsigned long) __va(pmd_val(pmd) & PAGE_MASK))
+ ((unsigned long) __va(pmd_val(pmd) & ~(PTE_TABLE_SIZE - 1)))
#define pmd_page(pmd) \
pfn_to_page(pmd_val(pmd) >> PAGE_SHIFT)
#else
diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c
index e09228a9ad00..8b0ab33673e5 100644
--- a/arch/powerpc/mm/mmu_context_nohash.c
+++ b/arch/powerpc/mm/mmu_context_nohash.c
@@ -390,6 +390,9 @@ int init_new_context(struct task_struct *t, struct mm_struct *mm)
#endif
mm->context.id = MMU_NO_CONTEXT;
mm->context.active = 0;
+#ifdef CONFIG_NEED_PTE_FRAG
+ mm->context.pte_frag = NULL;
+#endif
return 0;
}

@@ -418,6 +421,7 @@ void destroy_context(struct mm_struct *mm)
nr_free_contexts++;
}
raw_spin_unlock_irqrestore(&context_lock, flags);
+ destroy_pagetable_page(mm);
}

#ifdef CONFIG_SMP
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 2d34755ed727..96cc5aa73331 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -23,6 +23,7 @@

#include <linux/kernel.h>
#include <linux/gfp.h>
+#include <linux/memblock.h>
#include <linux/mm.h>
#include <linux/percpu.h>
#include <linux/hardirq.h>
@@ -320,10 +321,17 @@ static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
return (pte_t *)ret;
}

-pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
+__ref pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
{
pte_t *pte;

+ if (kernel && !slab_is_available()) {
+ pte = __va(memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE));
+ if (pte)
+ memset(pte, 0, PTE_FRAG_SIZE);
+
+ return pte;
+ }
pte = get_from_cache(mm);
if (pte)
return pte;
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 3aa0c78db95d..5c8737cf2945 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -40,6 +40,17 @@

extern char etext[], _stext[], _sinittext[], _einittext[];

+#ifdef CONFIG_NEED_PTE_FRAG
+void pte_fragment_free(unsigned long *table, int kernel)
+{
+ struct page *page = virt_to_page(table);
+ if (put_page_testzero(page)) {
+ if (!kernel)
+ pgtable_page_dtor(page);
+ free_unref_page(page);
+ }
+}
+#else
__ref pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
{
pte_t *pte;
@@ -69,6 +80,7 @@ pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long address)
}
return ptepage;
}
+#endif

#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
int __pte_alloc_kernel_g(pmd_t *pmd, unsigned long address)
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 7172b04c91b5..eff6210ad3c0 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -340,6 +340,7 @@ config PPC_MM_SLICES
config NEED_PTE_FRAG
bool
default y if PPC_BOOK3S_64 && PPC_64K_PAGES
+ default y if PPC_8xx && PPC_16K_PAGES
default n

config PPC_HAVE_PMU_SUPPORT
--
2.13.3


2018-05-04 12:37:08

by Christophe Leroy

Subject: [PATCH 10/17] powerpc: use _ALIGN macro
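
A minimal sketch of the intended equivalence (assuming the usual powerpc
definition of _ALIGN_DOWN; the expansion shown below is an illustration,
not part of this patch):

	/*
	 * Assuming _ALIGN_DOWN(addr, size) expands to (addr) & ~((size) - 1),
	 * the new non-PPC_PIN_SIZE definition:
	 *     _ALIGN_DOWN((long)high_memory + VMALLOC_OFFSET, VMALLOC_OFFSET)
	 * yields the same value as the previous open-coded form:
	 *     (((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET - 1))
	 */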

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/nohash/32/pgtable.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index b413abcd5a09..93dc22dbe964 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -99,9 +99,9 @@ extern int icache_44x_need_flush;
*/
#define VMALLOC_OFFSET (0x1000000) /* 16M */
#ifdef PPC_PIN_SIZE
-#define VMALLOC_BASE (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE _ALIGN_DOWN(_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET, VMALLOC_OFFSET)
#else
-#define VMALLOC_BASE ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE _ALIGN_DOWN((long)high_memory + VMALLOC_OFFSET, VMALLOC_OFFSET)
#endif
#define VMALLOC_START ioremap_bot
#define VMALLOC_END IOREMAP_END
--
2.13.3


2018-05-04 12:37:08

by Christophe Leroy

Subject: [PATCH 14/17] powerpc/8xx: reunify TLB handler routines

Each handler must not exceed 64 instructions to fit into the main
exception area.
Following the significant size reduction of TLB handler routines,
the side handlers can be brought back close to the main part.

In the worst case:
Main part of ITLB handler is 45 insn, side part is 9 insn ==> total 54
Main part of DTLB handler is 37 insn, side part is 23 insn ==> total 60

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/kernel/head_8xx.S | 108 ++++++++++++++++++++---------------------
1 file changed, 52 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 4855d5a36f70..c98a4ebb5a4d 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -388,6 +388,23 @@ _ENTRY(itlb_miss_perf)
mfspr r11, SPRN_SPRG_SCRATCH1
rfi

+#ifndef CONFIG_PIN_TLB_TEXT
+ITLBMissLinear:
+ mtcr r11
+ /* Set 8M byte page and mark it valid */
+ li r11, MI_PS8MEG | MI_SVALID
+ mtspr SPRN_MI_TWC, r11
+ rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
+ ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
+ _PAGE_PRESENT
+ mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
+
+_ENTRY(itlb_miss_exit_2)
+ mfspr r10, SPRN_SPRG_SCRATCH0
+ mfspr r11, SPRN_SPRG_SCRATCH1
+ rfi
+#endif
+
. = 0x1200
DataStoreTLBMiss:
mtspr SPRN_SPRG_SCRATCH0, r10
@@ -463,6 +480,41 @@ _ENTRY(dtlb_miss_perf)
mfspr r11, SPRN_SPRG_SCRATCH1
rfi

+DTLBMissIMMR:
+ mtcr r11
+ /* Set 512k byte guarded page and mark it valid */
+ li r10, MD_PS512K | MD_GUARDED | MD_SVALID
+ mtspr SPRN_MD_TWC, r10
+ mfspr r10, SPRN_IMMR /* Get current IMMR */
+ rlwinm r10, r10, 0, 0xfff80000 /* Get 512 kbytes boundary */
+ ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
+ _PAGE_PRESENT | _PAGE_NO_CACHE
+ mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
+
+ li r11, RPN_PATTERN
+ mtspr SPRN_DAR, r11 /* Tag DAR */
+_ENTRY(dtlb_miss_exit_2)
+ mfspr r10, SPRN_SPRG_SCRATCH0
+ mfspr r11, SPRN_SPRG_SCRATCH1
+ rfi
+
+DTLBMissLinear:
+ mtcr r11
+ /* Set 8M byte page and mark it valid */
+ li r11, MD_PS8MEG | MD_SVALID
+ mtspr SPRN_MD_TWC, r11
+ rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
+ ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
+ _PAGE_PRESENT
+ mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
+
+ li r11, RPN_PATTERN
+ mtspr SPRN_DAR, r11 /* Tag DAR */
+_ENTRY(dtlb_miss_exit_3)
+ mfspr r10, SPRN_SPRG_SCRATCH0
+ mfspr r11, SPRN_SPRG_SCRATCH1
+ rfi
+
/* This is an instruction TLB error on the MPC8xx. This could be due
* to many reasons, such as executing guarded memory or illegal instruction
* addresses. There is nothing to do but handle a big time error fault.
@@ -565,62 +617,6 @@ InstructionBreakpoint:

. = 0x2000

-/*
- * Bottom part of DataStoreTLBMiss handlers for IMMR area and linear RAM.
- * not enough space in the DataStoreTLBMiss area.
- */
-DTLBMissIMMR:
- mtcr r11
- /* Set 512k byte guarded page and mark it valid */
- li r10, MD_PS512K | MD_GUARDED | MD_SVALID
- mtspr SPRN_MD_TWC, r10
- mfspr r10, SPRN_IMMR /* Get current IMMR */
- rlwinm r10, r10, 0, 0xfff80000 /* Get 512 kbytes boundary */
- ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
- _PAGE_PRESENT | _PAGE_NO_CACHE
- mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
-
- li r11, RPN_PATTERN
- mtspr SPRN_DAR, r11 /* Tag DAR */
-_ENTRY(dtlb_miss_exit_2)
- mfspr r10, SPRN_SPRG_SCRATCH0
- mfspr r11, SPRN_SPRG_SCRATCH1
- rfi
-
-DTLBMissLinear:
- mtcr r11
- /* Set 8M byte page and mark it valid */
- li r11, MD_PS8MEG | MD_SVALID
- mtspr SPRN_MD_TWC, r11
- rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
- ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
- _PAGE_PRESENT
- mtspr SPRN_MD_RPN, r10 /* Update TLB entry */
-
- li r11, RPN_PATTERN
- mtspr SPRN_DAR, r11 /* Tag DAR */
-_ENTRY(dtlb_miss_exit_3)
- mfspr r10, SPRN_SPRG_SCRATCH0
- mfspr r11, SPRN_SPRG_SCRATCH1
- rfi
-
-#ifndef CONFIG_PIN_TLB_TEXT
-ITLBMissLinear:
- mtcr r11
- /* Set 8M byte page and mark it valid */
- li r11, MI_PS8MEG | MI_SVALID
- mtspr SPRN_MI_TWC, r11
- rlwinm r10, r10, 20, 0x0f800000 /* 8xx supports max 256Mb RAM */
- ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
- _PAGE_PRESENT
- mtspr SPRN_MI_RPN, r10 /* Update TLB entry */
-
-_ENTRY(itlb_miss_exit_2)
- mfspr r10, SPRN_SPRG_SCRATCH0
- mfspr r11, SPRN_SPRG_SCRATCH1
- rfi
-#endif
-
/* This is the procedure to calculate the data EA for buggy dcbx,dcbi instructions
* by decoding the registers used by the dcbx instruction and adding them.
* DAR is set to the calculated address.
--
2.13.3


2018-05-04 12:37:17

by Christophe Leroy

Subject: [PATCH 16/17] powerpc/mm: Make pte_fragment_alloc() common to PPC32 and PPC64

In order to allow the 8xx to handle pte_fragments, this patch
makes pte_fragment_alloc() common to PPC32 and PPC64.
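
For context, a platform selecting CONFIG_NEED_PTE_FRAG is then expected to
hook its pte allocation helpers onto the shared function, roughly as in
the sketch below (an illustration only; the actual 8xx wiring is done in
the last patch of the series):

	static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
						  unsigned long address)
	{
		return pte_fragment_alloc(mm, address, 1);	/* kernel table */
	}

	static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
					      unsigned long address)
	{
		return (pgtable_t)pte_fragment_alloc(mm, address, 0); /* user table */
	}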

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/mmu_context.h | 28 ++++++++++++++
arch/powerpc/mm/mmu_context_book3s64.c | 28 --------------
arch/powerpc/mm/pgtable.c | 67 ++++++++++++++++++++++++++++++++++
arch/powerpc/mm/pgtable_64.c | 67 ----------------------------------
arch/powerpc/platforms/Kconfig.cputype | 5 +++
5 files changed, 100 insertions(+), 95 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 1835ca1505d6..252988f7e219 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -262,5 +262,33 @@ static inline u64 pte_to_hpte_pkey_bits(u64 pteflags)

#endif /* CONFIG_PPC_MEM_KEYS */

+#ifdef CONFIG_NEED_PTE_FRAG
+static inline void destroy_pagetable_page(struct mm_struct *mm)
+{
+ int count;
+ void *pte_frag;
+ struct page *page;
+
+ pte_frag = mm->context.pte_frag;
+ if (!pte_frag)
+ return;
+
+ page = virt_to_page(pte_frag);
+ /* drop all the pending references */
+ count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
+ /* We allow PTE_FRAG_NR fragments from a PTE page */
+ if (page_ref_sub_and_test(page, PTE_FRAG_NR - count)) {
+ pgtable_page_dtor(page);
+ free_unref_page(page);
+ }
+}
+
+#else
+static inline void destroy_pagetable_page(struct mm_struct *mm)
+{
+ return;
+}
+#endif
+
#endif /* __KERNEL__ */
#endif /* __ASM_POWERPC_MMU_CONTEXT_H */
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index b75194dff64c..2f55a4e3c09a 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -192,34 +192,6 @@ static void destroy_contexts(mm_context_t *ctx)
spin_unlock(&mmu_context_lock);
}

-#ifdef CONFIG_PPC_64K_PAGES
-static void destroy_pagetable_page(struct mm_struct *mm)
-{
- int count;
- void *pte_frag;
- struct page *page;
-
- pte_frag = mm->context.pte_frag;
- if (!pte_frag)
- return;
-
- page = virt_to_page(pte_frag);
- /* drop all the pending references */
- count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
- /* We allow PTE_FRAG_NR fragments from a PTE page */
- if (page_ref_sub_and_test(page, PTE_FRAG_NR - count)) {
- pgtable_page_dtor(page);
- free_unref_page(page);
- }
-}
-
-#else
-static inline void destroy_pagetable_page(struct mm_struct *mm)
-{
- return;
-}
-#endif
-
void destroy_context(struct mm_struct *mm)
{
#ifdef CONFIG_SPAPR_TCE_IOMMU
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 9f361ae571e9..2d34755ed727 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -264,3 +264,70 @@ unsigned long vmalloc_to_phys(void *va)
return __pa(pfn_to_kaddr(pfn)) + offset_in_page(va);
}
EXPORT_SYMBOL_GPL(vmalloc_to_phys);
+
+#ifdef CONFIG_NEED_PTE_FRAG
+static pte_t *get_from_cache(struct mm_struct *mm)
+{
+ void *pte_frag, *ret;
+
+ spin_lock(&mm->page_table_lock);
+ ret = mm->context.pte_frag;
+ if (ret) {
+ pte_frag = ret + PTE_FRAG_SIZE;
+ /*
+ * If we have taken up all the fragments mark PTE page NULL
+ */
+ if (((unsigned long)pte_frag & ~PAGE_MASK) == 0)
+ pte_frag = NULL;
+ mm->context.pte_frag = pte_frag;
+ }
+ spin_unlock(&mm->page_table_lock);
+ return (pte_t *)ret;
+}
+
+static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
+{
+ void *ret = NULL;
+ struct page *page;
+
+ if (!kernel) {
+ page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT);
+ if (!page)
+ return NULL;
+ if (!pgtable_page_ctor(page)) {
+ __free_page(page);
+ return NULL;
+ }
+ } else {
+ page = alloc_page(PGALLOC_GFP);
+ if (!page)
+ return NULL;
+ }
+
+ ret = page_address(page);
+ spin_lock(&mm->page_table_lock);
+ /*
+ * If we find pgtable_page set, we return
+ * the allocated page with single fragement
+ * count.
+ */
+ if (likely(!mm->context.pte_frag)) {
+ set_page_count(page, PTE_FRAG_NR);
+ mm->context.pte_frag = ret + PTE_FRAG_SIZE;
+ }
+ spin_unlock(&mm->page_table_lock);
+
+ return (pte_t *)ret;
+}
+
+pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
+{
+ pte_t *pte;
+
+ pte = get_from_cache(mm);
+ if (pte)
+ return pte;
+
+ return __alloc_for_cache(mm, kernel);
+}
+#endif
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index dd1102a246e4..1d8dc37d98a7 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -139,73 +139,6 @@ struct page *pmd_page(pmd_t pmd)
return virt_to_page(pmd_page_vaddr(pmd));
}

-#ifdef CONFIG_PPC_64K_PAGES
-static pte_t *get_from_cache(struct mm_struct *mm)
-{
- void *pte_frag, *ret;
-
- spin_lock(&mm->page_table_lock);
- ret = mm->context.pte_frag;
- if (ret) {
- pte_frag = ret + PTE_FRAG_SIZE;
- /*
- * If we have taken up all the fragments mark PTE page NULL
- */
- if (((unsigned long)pte_frag & ~PAGE_MASK) == 0)
- pte_frag = NULL;
- mm->context.pte_frag = pte_frag;
- }
- spin_unlock(&mm->page_table_lock);
- return (pte_t *)ret;
-}
-
-static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
-{
- void *ret = NULL;
- struct page *page;
-
- if (!kernel) {
- page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT);
- if (!page)
- return NULL;
- if (!pgtable_page_ctor(page)) {
- __free_page(page);
- return NULL;
- }
- } else {
- page = alloc_page(PGALLOC_GFP);
- if (!page)
- return NULL;
- }
-
- ret = page_address(page);
- spin_lock(&mm->page_table_lock);
- /*
- * If we find pgtable_page set, we return
- * the allocated page with single fragement
- * count.
- */
- if (likely(!mm->context.pte_frag)) {
- set_page_count(page, PTE_FRAG_NR);
- mm->context.pte_frag = ret + PTE_FRAG_SIZE;
- }
- spin_unlock(&mm->page_table_lock);
-
- return (pte_t *)ret;
-}
-
-pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
-{
- pte_t *pte;
-
- pte = get_from_cache(mm);
- if (pte)
- return pte;
-
- return __alloc_for_cache(mm, kernel);
-}
-#endif /* CONFIG_PPC_64K_PAGES */
-
void pte_fragment_free(unsigned long *table, int kernel)
{
struct page *page = virt_to_page(table);
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index f860f0326c78..7172b04c91b5 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -337,6 +337,11 @@ config PPC_MM_SLICES
default y if PPC_8xx && HUGETLB_PAGE
default n

+config NEED_PTE_FRAG
+ bool
+ default y if PPC_BOOK3S_64 && PPC_64K_PAGES
+ default n
+
config PPC_HAVE_PMU_SUPPORT
bool

--
2.13.3


2018-05-04 12:37:17

by Christophe Leroy

Subject: [PATCH 08/17] powerpc: make __iounmap() common to PPC32 and PPC64

This patch makes __iounmap() common to PPC32 and PPC64.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/mm/ioremap.c | 26 ++++++++++----------------
1 file changed, 10 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 153657db084e..65d611d44d38 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -120,20 +120,6 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
}

-void __iounmap(volatile void __iomem *addr)
-{
- /*
- * If mapped by BATs then there is nothing to do.
- * Calling vfree() generates a benign warning.
- */
- if (v_block_mapped((unsigned long)addr))
- return;
-
- if ((unsigned long) addr >= ioremap_bot)
- vunmap((void *) (PAGE_MASK & (unsigned long)addr));
-}
-EXPORT_SYMBOL(__iounmap);
-
#else

/**
@@ -225,6 +211,8 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
return ret;
}

+#endif
+
/*
* Unmap an IO region and remove it from imalloc'd list.
* Access to IO memory should be serialized by driver.
@@ -238,6 +226,14 @@ void __iounmap(volatile void __iomem *token)

addr = (void *) ((unsigned long __force)
PCI_FIX_ADDR(token) & PAGE_MASK);
+
+ /*
+ * If mapped by BATs then there is nothing to do.
+ * Calling vfree() generates a benign warning.
+ */
+ if (v_block_mapped((unsigned long)addr))
+ return;
+
if ((unsigned long)addr < ioremap_bot) {
printk(KERN_WARNING "Attempt to iounmap early bolted mapping"
" at 0x%p\n", addr);
@@ -247,8 +243,6 @@ void __iounmap(volatile void __iomem *token)
}
EXPORT_SYMBOL(__iounmap);

-#endif
-
void __iomem * __ioremap(phys_addr_t addr, unsigned long size,
unsigned long flags)
{
--
2.13.3


2018-05-04 12:37:24

by Christophe Leroy

Subject: [PATCH 11/17] powerpc/nohash32: set GUARDED attribute in the PMD directly

On the 8xx, the GUARDED attribute of the pages is managed in the
L1 entry, therefore, to avoid having to copy it into the L1 entry
at each TLB miss, we set it in the PMD.

For this, we split the vmalloc space into two parts, one
for vmalloc and non-guarded IO, and one for guarded IO.
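
Schematically, the resulting split is (an illustrative summary of the hunks
below, not additional code):

	/*
	 * VMALLOC_BASE .. IOREMAP_BASE : vmalloc() and non-guarded ioremap()
	 * IOREMAP_BASE .. IOREMAP_END  : guarded ioremap()
	 *                                (early mappings below ioremap_bot,
	 *                                 vmalloc-based ones above it)
	 *
	 * IOREMAP_BASE is placed at the PGDIR_SIZE-aligned midpoint of the
	 * area, so that a given PMD entry is either entirely guarded or
	 * entirely non-guarded.
	 */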

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/nohash/32/pgalloc.h | 10 ++++++++++
arch/powerpc/include/asm/nohash/32/pgtable.h | 18 ++++++++++++++++--
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 3 ++-
arch/powerpc/kernel/head_8xx.S | 18 +++++++-----------
arch/powerpc/mm/dump_linuxpagetables.c | 26 ++++++++++++++++++++++++--
arch/powerpc/mm/ioremap.c | 11 ++++++++---
arch/powerpc/mm/mem.c | 9 +++++++++
arch/powerpc/mm/pgtable_32.c | 28 +++++++++++++++++++++++++++-
arch/powerpc/platforms/Kconfig.cputype | 3 +++
9 files changed, 106 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgalloc.h b/arch/powerpc/include/asm/nohash/32/pgalloc.h
index 29d37bd1f3b3..1c6461e7c6aa 100644
--- a/arch/powerpc/include/asm/nohash/32/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/32/pgalloc.h
@@ -58,6 +58,12 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp,
*pmdp = __pmd(__pa(pte) | _PMD_PRESENT);
}

+static inline void pmd_populate_kernel_g(struct mm_struct *mm, pmd_t *pmdp,
+ pte_t *pte)
+{
+ *pmdp = __pmd(__pa(pte) | _PMD_PRESENT | _PMD_GUARDED);
+}
+
static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
pgtable_t pte_page)
{
@@ -83,6 +89,10 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmdp,
#define pmd_pgtable(pmd) pmd_page(pmd)
#endif

+#define pte_alloc_kernel_g(pmd, address) \
+ ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel_g(pmd, address))? \
+ NULL: pte_offset_kernel(pmd, address))
+
extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr);
extern pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long addr);

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 93dc22dbe964..009a5b3d3192 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -69,9 +69,14 @@ extern int icache_44x_need_flush;
* virtual space that goes below PKMAP and FIXMAP
*/
#ifdef CONFIG_HIGHMEM
-#define KVIRT_TOP PKMAP_BASE
+#define _KVIRT_TOP PKMAP_BASE
#else
-#define KVIRT_TOP (0xfe000000UL) /* for now, could be FIXMAP_BASE ? */
+#define _KVIRT_TOP (0xfe000000UL) /* for now, could be FIXMAP_BASE ? */
+#endif
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+#define KVIRT_TOP _ALIGN_DOWN(_KVIRT_TOP, PGDIR_SIZE)
+#else
+#define KVIRT_TOP _KVIRT_TOP
#endif

/*
@@ -84,7 +89,11 @@ extern int icache_44x_need_flush;
#else
#define IOREMAP_END KVIRT_TOP
#endif
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+#define IOREMAP_BASE _ALIGN_UP(VMALLOC_BASE + (IOREMAP_END - VMALLOC_BASE) / 2, PGDIR_SIZE)
+#else
#define IOREMAP_BASE VMALLOC_BASE
+#endif

/*
* Just any arbitrary offset to the start of the vmalloc VM area: the
@@ -103,8 +112,13 @@ extern int icache_44x_need_flush;
#else
#define VMALLOC_BASE _ALIGN_DOWN((long)high_memory + VMALLOC_OFFSET, VMALLOC_OFFSET)
#endif
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+#define VMALLOC_START VMALLOC_BASE
+#define VMALLOC_END IOREMAP_BASE
+#else
#define VMALLOC_START ioremap_bot
#define VMALLOC_END IOREMAP_END
+#endif

/*
* Bits in a linux-style PTE. These match the bits in the
diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index f04cb46ae8a1..a9a2919251e0 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -47,10 +47,11 @@
#define _PAGE_RO 0x0600 /* Supervisor RO, User no access */

#define _PMD_PRESENT 0x0001
-#define _PMD_BAD 0x0fd0
+#define _PMD_BAD 0x0fc0
#define _PMD_PAGE_MASK 0x000c
#define _PMD_PAGE_8M 0x000c
#define _PMD_PAGE_512K 0x0004
+#define _PMD_GUARDED 0x0010
#define _PMD_USER 0x0020 /* APG 1 */

/* Until my rework is finished, 8xx still needs atomic PTE updates */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index c3b831bb8bad..85b017c67e11 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -345,6 +345,10 @@ _ENTRY(ITLBMiss_cmp)
rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
#ifdef CONFIG_HUGETLB_PAGE
mtcr r11
+#endif
+ /* Load the MI_TWC with the attributes for this "segment." */
+ mtspr SPRN_MI_TWC, r11 /* Set segment attributes */
+#ifdef CONFIG_HUGETLB_PAGE
bt- 28, 10f /* bit 28 = Large page (8M) */
bt- 29, 20f /* bit 29 = Large page (8M or 512k) */
#endif
@@ -354,8 +358,6 @@ _ENTRY(ITLBMiss_cmp)
#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
mtcr r12
#endif
- /* Load the MI_TWC with the attributes for this "segment." */
- mtspr SPRN_MI_TWC, r11 /* Set segment attributes */

#ifdef CONFIG_SWAP
rlwinm r11, r10, 32-5, _PAGE_PRESENT
@@ -457,6 +459,9 @@ _ENTRY(DTLBMiss_jmp)
rlwinm r10, r10, 32 - (PAGE_SHIFT - 2), 32 - PAGE_SHIFT, 29
#ifdef CONFIG_HUGETLB_PAGE
mtcr r11
+#endif
+ mtspr SPRN_MD_TWC, r11
+#ifdef CONFIG_HUGETLB_PAGE
bt- 28, 10f /* bit 28 = Large page (8M) */
bt- 29, 20f /* bit 29 = Large page (8M or 512k) */
#endif
@@ -465,15 +470,6 @@ _ENTRY(DTLBMiss_jmp)
4:
mtcr r12

- /* Insert the Guarded flag into the TWC from the Linux PTE.
- * It is bit 27 of both the Linux PTE and the TWC (at least
- * I got that right :-). It will be better when we can put
- * this into the Linux pgd/pmd and load it in the operation
- * above.
- */
- rlwimi r11, r10, 0, _PAGE_GUARDED
- mtspr SPRN_MD_TWC, r11
-
/* Both _PAGE_ACCESSED and _PAGE_PRESENT has to be set.
* We also need to know if the insn is a load/store, so:
* Clear _PAGE_PRESENT and load that which will
diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c
index 6022adb899b7..cd3797be5e05 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -74,9 +74,9 @@ struct addr_marker {

static struct addr_marker address_markers[] = {
{ 0, "Start of kernel VM" },
+#ifdef CONFIG_PPC64
{ 0, "vmalloc() Area" },
{ 0, "vmalloc() End" },
-#ifdef CONFIG_PPC64
{ 0, "isa I/O start" },
{ 0, "isa I/O end" },
{ 0, "phb I/O start" },
@@ -85,8 +85,19 @@ static struct addr_marker address_markers[] = {
{ 0, "I/O remap end" },
{ 0, "vmemmap start" },
#else
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+ { 0, "vmalloc() Area" },
+ { 0, "vmalloc() End" },
{ 0, "Early I/O remap start" },
{ 0, "Early I/O remap end" },
+ { 0, "I/O remap start" },
+ { 0, "I/O remap end" },
+#else
+ { 0, "Early I/O remap start" },
+ { 0, "Early I/O remap end" },
+ { 0, "vmalloc() I/O remap start" },
+ { 0, "vmalloc() I/O remap end" },
+#endif
#ifdef CONFIG_NOT_COHERENT_CACHE
{ 0, "Consistent mem start" },
{ 0, "Consistent mem end" },
@@ -437,9 +448,9 @@ static void populate_markers(void)
int i = 0;

address_markers[i++].start_address = PAGE_OFFSET;
+#ifdef CONFIG_PPC64
address_markers[i++].start_address = VMALLOC_START;
address_markers[i++].start_address = VMALLOC_END;
-#ifdef CONFIG_PPC64
address_markers[i++].start_address = ISA_IO_BASE;
address_markers[i++].start_address = ISA_IO_END;
address_markers[i++].start_address = PHB_IO_BASE;
@@ -452,8 +463,19 @@ static void populate_markers(void)
address_markers[i++].start_address = VMEMMAP_BASE;
#endif
#else /* !CONFIG_PPC64 */
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+ address_markers[i++].start_address = VMALLOC_START;
+ address_markers[i++].start_address = VMALLOC_END;
address_markers[i++].start_address = IOREMAP_BASE;
address_markers[i++].start_address = ioremap_bot;
+ address_markers[i++].start_address = ioremap_bot;
+ address_markers[i++].start_address = IOREMAP_END;
+#else
+ address_markers[i++].start_address = IOREMAP_BASE;
+ address_markers[i++].start_address = ioremap_bot;
+ address_markers[i++].start_address = ioremap_bot;
+ address_markers[i++].start_address = IOREMAP_END;
+#endif
#ifdef CONFIG_NOT_COHERENT_CACHE
address_markers[i++].start_address = IOREMAP_END;
address_markers[i++].start_address = IOREMAP_END +
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 59be5dfcb3e9..b8c347077e02 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -132,9 +132,14 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
if (slab_is_available()) {
struct vm_struct *area;

- area = __get_vm_area_caller(size, VM_IOREMAP,
- ioremap_bot, IOREMAP_END,
- caller);
+ if (flags & _PAGE_GUARDED)
+ area = __get_vm_area_caller(size, VM_IOREMAP,
+ ioremap_bot, IOREMAP_END,
+ caller);
+ else
+ area = __get_vm_area_caller(size, VM_IOREMAP,
+ VMALLOC_START, VMALLOC_END,
+ caller);
if (area == NULL)
return NULL;

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index b680aa78a4ac..fd7af7af5b58 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -386,10 +386,19 @@ void __init mem_init(void)
pr_info(" * 0x%08lx..0x%08lx : consistent mem\n",
IOREMAP_END, IOREMAP_END + CONFIG_CONSISTENT_SIZE);
#endif /* CONFIG_NOT_COHERENT_CACHE */
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+ pr_info(" * 0x%08lx..0x%08lx : ioremap\n",
+ ioremap_bot, IOREMAP_END);
pr_info(" * 0x%08lx..0x%08lx : early ioremap\n",
IOREMAP_BASE, ioremap_bot);
+ pr_info(" * 0x%08lx..0x%08lx : vmalloc\n",
+ VMALLOC_START, VMALLOC_END);
+#else
pr_info(" * 0x%08lx..0x%08lx : vmalloc & ioremap\n",
VMALLOC_START, VMALLOC_END);
+ pr_info(" * 0x%08lx..0x%08lx : early ioremap\n",
+ IOREMAP_BASE, ioremap_bot);
+#endif
#endif /* CONFIG_PPC32 */
}

diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 54a5bc0767a9..3aa0c78db95d 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -70,6 +70,27 @@ pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long address)
return ptepage;
}

+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+int __pte_alloc_kernel_g(pmd_t *pmd, unsigned long address)
+{
+ pte_t *new = pte_alloc_one_kernel(&init_mm, address);
+ if (!new)
+ return -ENOMEM;
+
+ smp_wmb(); /* See comment in __pte_alloc */
+
+ spin_lock(&init_mm.page_table_lock);
+ if (likely(pmd_none(*pmd))) { /* Has another populated it ? */
+ pmd_populate_kernel_g(&init_mm, pmd, new);
+ new = NULL;
+ }
+ spin_unlock(&init_mm.page_table_lock);
+ if (new)
+ pte_free_kernel(&init_mm, new);
+ return 0;
+}
+#endif
+
int map_kernel_page(unsigned long va, phys_addr_t pa, int flags)
{
pmd_t *pd;
@@ -79,7 +100,12 @@ int map_kernel_page(unsigned long va, phys_addr_t pa, int flags)
/* Use upper 10 bits of VA to index the first level map */
pd = pmd_offset(pud_offset(pgd_offset_k(va), va), va);
/* Use middle 10 bits of VA to index the second-level map */
- pg = pte_alloc_kernel(pd, va);
+#ifdef CONFIG_PPC_GUARDED_PAGE_IN_PMD
+ if (flags & _PAGE_GUARDED)
+ pg = pte_alloc_kernel_g(pd, va);
+ else
+#endif
+ pg = pte_alloc_kernel(pd, va);
if (pg != 0) {
err = 0;
/* The PTE should never be already set nor present in the
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 67d3125d0610..f860f0326c78 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -319,6 +319,9 @@ config ARCH_ENABLE_HUGEPAGE_MIGRATION
def_bool y
depends on PPC_BOOK3S_64 && HUGETLB_PAGE && MIGRATION

+config PPC_GUARDED_PAGE_IN_PMD
+ def_bool y
+ depends on PPC_8xx

config PPC_MMU_NOHASH
def_bool y
--
2.13.3


2018-05-04 12:38:00

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 09/17] powerpc: make __ioremap_caller() common to PPC32 and PPC64

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 1 +
arch/powerpc/mm/ioremap.c | 126 +++++++--------------------
2 files changed, 34 insertions(+), 93 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c5c6ead06bfb..2bebdd8302cb 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -18,6 +18,7 @@
#define _PAGE_RO 0
#define _PAGE_USER 0
#define _PAGE_HWWRITE 0
+#define _PAGE_COHERENT 0

#define _PAGE_EXEC 0x00001 /* execute permission */
#define _PAGE_WRITE 0x00002 /* write access allowed */
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 65d611d44d38..59be5dfcb3e9 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -33,95 +33,6 @@ unsigned long ioremap_bot;
unsigned long ioremap_bot = IOREMAP_BASE;
#endif

-#ifdef CONFIG_PPC32
-
-void __iomem *
-__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
- void *caller)
-{
- unsigned long v, i;
- phys_addr_t p;
- int err;
-
- /* Make sure we have the base flags */
- if ((flags & _PAGE_PRESENT) == 0)
- flags |= pgprot_val(PAGE_KERNEL);
-
- /* Non-cacheable page cannot be coherent */
- if (flags & _PAGE_NO_CACHE)
- flags &= ~_PAGE_COHERENT;
-
- /*
- * Choose an address to map it to.
- * Once the vmalloc system is running, we use it.
- * Before then, we use space going up from IOREMAP_BASE
- * (ioremap_bot records where we're up to).
- */
- p = addr & PAGE_MASK;
- size = PAGE_ALIGN(addr + size) - p;
-
- /*
- * If the address lies within the first 16 MB, assume it's in ISA
- * memory space
- */
- if (p < 16*1024*1024)
- p += _ISA_MEM_BASE;
-
-#ifndef CONFIG_CRASH_DUMP
- /*
- * Don't allow anybody to remap normal RAM that we're using.
- * mem_init() sets high_memory so only do the check after that.
- */
- if (slab_is_available() && (p < virt_to_phys(high_memory)) &&
- page_is_ram(__phys_to_pfn(p))) {
- printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
- (unsigned long long)p, __builtin_return_address(0));
- return NULL;
- }
-#endif
-
- if (size == 0)
- return NULL;
-
- /*
- * Is it already mapped? Perhaps overlapped by a previous
- * mapping.
- */
- v = p_block_mapped(p);
- if (v)
- goto out;
-
- if (slab_is_available()) {
- struct vm_struct *area;
- area = get_vm_area_caller(size, VM_IOREMAP, caller);
- if (area == 0)
- return NULL;
- area->phys_addr = p;
- v = (unsigned long) area->addr;
- } else {
- v = ioremap_bot;
- ioremap_bot += size;
- }
-
- /*
- * Should check if it is a candidate for a BAT mapping
- */
-
- err = 0;
- for (i = 0; i < size && err == 0; i += PAGE_SIZE)
- err = map_kernel_page(v+i, p+i, flags);
- if (err) {
- if (slab_is_available())
- vunmap((void *)v);
- return NULL;
- }
-
-out:
- return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
-}
-
-#else
-
/**
* __ioremap_at - Low level function to establish the page tables
* for an IO mapping
@@ -135,6 +46,10 @@ void __iomem * __ioremap_at(phys_addr_t pa, void *ea, unsigned long size,
if ((flags & _PAGE_PRESENT) == 0)
flags |= pgprot_val(PAGE_KERNEL);

+ /* Non-cacheable page cannot be coherent */
+ if (flags & _PAGE_NO_CACHE)
+ flags &= ~_PAGE_COHERENT;
+
/* We don't support the 4K PFN hack with ioremap */
if (flags & H_PAGE_4K_PFN)
return NULL;
@@ -187,6 +102,33 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
if ((size == 0) || (paligned == 0))
return NULL;

+ /*
+ * If the address lies within the first 16 MB, assume it's in ISA
+ * memory space
+ */
+ if (IS_ENABLED(CONFIG_PPC32) && paligned < 16*1024*1024)
+ paligned += _ISA_MEM_BASE;
+
+ /*
+ * Don't allow anybody to remap normal RAM that we're using.
+ * mem_init() sets high_memory so only do the check after that.
+ */
+ if (!IS_ENABLED(CONFIG_CRASH_DUMP) &&
+ slab_is_available() && (paligned < virt_to_phys(high_memory)) &&
+ page_is_ram(__phys_to_pfn(paligned))) {
+ printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
+ (u64)paligned, __builtin_return_address(0));
+ return NULL;
+ }
+
+ /*
+ * Is it already mapped? Perhaps overlapped by a previous
+ * mapping.
+ */
+ ret = (void __iomem *)p_block_mapped(paligned);
+ if (ret)
+ goto out;
+
if (slab_is_available()) {
struct vm_struct *area;

@@ -205,14 +147,12 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
if (ret)
ioremap_bot += size;
}
-
+out:
if (ret)
- ret += addr & ~PAGE_MASK;
+ ret += (unsigned long)addr & ~PAGE_MASK;
return ret;
}

-#endif
-
/*
* Unmap an IO region and remove it from imalloc'd list.
* Access to IO memory should be serialized by driver.
--
2.13.3


2018-05-04 12:38:39

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 07/17] powerpc: make ioremap_bot common to PPC32 and PPC64

Today, early ioremap allocates addresses going upwards from IOREMAP_BASE on PPC64
and downwards from IOREMAP_TOP on PPC32.

This patch modifies the PPC32 behaviour to match PPC64.
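
As a quick illustration of the new common behaviour, here is a minimal
userspace sketch (not kernel code; IOREMAP_BASE and PAGE_SIZE are made-up
values and early_ioremap_alloc() is a hypothetical helper standing in for
the early path of __ioremap_caller()):

/*
 * Userspace sketch only, not kernel code.  IOREMAP_BASE and PAGE_SIZE
 * are invented values; early_ioremap_alloc() models the early
 * (pre-vmalloc) allocation path.
 */
#include <stdio.h>

#define IOREMAP_BASE	0xd0000000UL
#define PAGE_SIZE	0x1000UL

static unsigned long ioremap_bot = IOREMAP_BASE;

static unsigned long early_ioremap_alloc(unsigned long size)
{
	unsigned long v = ioremap_bot;	/* old PPC32 code did: v = (ioremap_bot -= size) */

	ioremap_bot += size;		/* new behaviour: grow upwards, like PPC64 */
	return v;
}

int main(void)
{
	printf("first:  0x%lx\n", early_ioremap_alloc(4 * PAGE_SIZE));
	printf("second: 0x%lx\n", early_ioremap_alloc(2 * PAGE_SIZE));
	printf("ioremap_bot is now 0x%lx\n", ioremap_bot);
	return 0;
}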

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/book3s/32/pgtable.h | 16 +++++++++-------
arch/powerpc/include/asm/nohash/32/pgtable.h | 20 ++++++++------------
arch/powerpc/mm/dma-noncoherent.c | 2 +-
arch/powerpc/mm/dump_linuxpagetables.c | 6 +++---
arch/powerpc/mm/init_32.c | 6 +++++-
arch/powerpc/mm/ioremap.c | 22 ++++++++++------------
arch/powerpc/mm/mem.c | 7 ++++---
7 files changed, 40 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index c615abdce119..6cf962ec7a20 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -54,16 +54,17 @@
#else
#define KVIRT_TOP (0xfe000000UL) /* for now, could be FIXMAP_BASE ? */
#endif
+#define IOREMAP_BASE VMALLOC_BASE

/*
- * ioremap_bot starts at that address. Early ioremaps move down from there,
- * until mem_init() at which point this becomes the top of the vmalloc
+ * ioremap_bot starts at IOREMAP_BASE. Early ioremaps move up from there,
+ * until mem_init() at which point this becomes the bottom of the vmalloc
* and ioremap space
*/
#ifdef CONFIG_NOT_COHERENT_CACHE
-#define IOREMAP_TOP ((KVIRT_TOP - CONFIG_CONSISTENT_SIZE) & PAGE_MASK)
+#define IOREMAP_END ((KVIRT_TOP - CONFIG_CONSISTENT_SIZE) & PAGE_MASK)
#else
-#define IOREMAP_TOP KVIRT_TOP
+#define IOREMAP_END KVIRT_TOP
#endif

/*
@@ -85,11 +86,12 @@
*/
#define VMALLOC_OFFSET (0x1000000) /* 16M */
#ifdef PPC_PIN_SIZE
-#define VMALLOC_START (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
#else
-#define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
#endif
-#define VMALLOC_END ioremap_bot
+#define VMALLOC_START ioremap_bot
+#define VMALLOC_END IOREMAP_END

#ifndef __ASSEMBLY__
#include <linux/sched.h>
diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 140f8e74b478..b413abcd5a09 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -80,10 +80,11 @@ extern int icache_44x_need_flush;
* and ioremap space
*/
#ifdef CONFIG_NOT_COHERENT_CACHE
-#define IOREMAP_TOP ((KVIRT_TOP - CONFIG_CONSISTENT_SIZE) & PAGE_MASK)
+#define IOREMAP_END ((KVIRT_TOP - CONFIG_CONSISTENT_SIZE) & PAGE_MASK)
#else
-#define IOREMAP_TOP KVIRT_TOP
+#define IOREMAP_END KVIRT_TOP
#endif
+#define IOREMAP_BASE VMALLOC_BASE

/*
* Just any arbitrary offset to the start of the vmalloc VM area: the
@@ -94,21 +95,16 @@ extern int icache_44x_need_flush;
* area for the same reason. ;)
*
* We no longer map larger than phys RAM with the BATs so we don't have
- * to worry about the VMALLOC_OFFSET causing problems. We do have to worry
- * about clashes between our early calls to ioremap() that start growing down
- * from IOREMAP_TOP being run into the VM area allocations (growing upwards
- * from VMALLOC_START). For this reason we have ioremap_bot to check when
- * we actually run into our mappings setup in the early boot with the VM
- * system. This really does become a problem for machines with good amounts
- * of RAM. -- Cort
+ * to worry about the VMALLOC_OFFSET causing problems.
*/
#define VMALLOC_OFFSET (0x1000000) /* 16M */
#ifdef PPC_PIN_SIZE
-#define VMALLOC_START (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE (((_ALIGN((long)high_memory, PPC_PIN_SIZE) + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
#else
-#define VMALLOC_START ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
+#define VMALLOC_BASE ((((long)high_memory + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1)))
#endif
-#define VMALLOC_END ioremap_bot
+#define VMALLOC_START ioremap_bot
+#define VMALLOC_END IOREMAP_END

/*
* Bits in a linux-style PTE. These match the bits in the
diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 382528475433..d0a8fe74f5a0 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -43,7 +43,7 @@
* can be further configured for specific applications under
* the "Advanced Setup" menu. -Matt
*/
-#define CONSISTENT_BASE (IOREMAP_TOP)
+#define CONSISTENT_BASE (IOREMAP_END)
#define CONSISTENT_END (CONSISTENT_BASE + CONFIG_CONSISTENT_SIZE)
#define CONSISTENT_OFFSET(x) (((unsigned long)(x) - CONSISTENT_BASE) >> PAGE_SHIFT)

diff --git a/arch/powerpc/mm/dump_linuxpagetables.c b/arch/powerpc/mm/dump_linuxpagetables.c
index 876e2a3c79f2..6022adb899b7 100644
--- a/arch/powerpc/mm/dump_linuxpagetables.c
+++ b/arch/powerpc/mm/dump_linuxpagetables.c
@@ -452,11 +452,11 @@ static void populate_markers(void)
address_markers[i++].start_address = VMEMMAP_BASE;
#endif
#else /* !CONFIG_PPC64 */
+ address_markers[i++].start_address = IOREMAP_BASE;
address_markers[i++].start_address = ioremap_bot;
- address_markers[i++].start_address = IOREMAP_TOP;
#ifdef CONFIG_NOT_COHERENT_CACHE
- address_markers[i++].start_address = IOREMAP_TOP;
- address_markers[i++].start_address = IOREMAP_TOP +
+ address_markers[i++].start_address = IOREMAP_END;
+ address_markers[i++].start_address = IOREMAP_END +
CONFIG_CONSISTENT_SIZE;
#endif
#ifdef CONFIG_HIGHMEM
diff --git a/arch/powerpc/mm/init_32.c b/arch/powerpc/mm/init_32.c
index 3e59e5d64b01..7fb9e5a9852a 100644
--- a/arch/powerpc/mm/init_32.c
+++ b/arch/powerpc/mm/init_32.c
@@ -172,7 +172,11 @@ void __init MMU_init(void)
mapin_ram();

/* Initialize early top-down ioremap allocator */
- ioremap_bot = IOREMAP_TOP;
+ if (IS_ENABLED(CONFIG_HIGHMEM))
+ high_memory = (void *) __va(lowmem_end_addr);
+ else
+ high_memory = (void *) __va(memblock_end_of_DRAM());
+ ioremap_bot = IOREMAP_BASE;

if (ppc_md.progress)
ppc_md.progress("MMU:exit", 0x211);
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index f8dc9638c598..153657db084e 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -27,10 +27,13 @@

#include "mmu_decl.h"

-#ifdef CONFIG_PPC32
-
+#if defined(CONFIG_PPC_BOOK3S_64) || defined(CONFIG_PPC32)
unsigned long ioremap_bot;
-EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */
+#else
+unsigned long ioremap_bot = IOREMAP_BASE;
+#endif
+
+#ifdef CONFIG_PPC32

void __iomem *
__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
@@ -51,7 +54,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
/*
* Choose an address to map it to.
* Once the vmalloc system is running, we use it.
- * Before then, we use space going down from IOREMAP_TOP
+ * Before then, we use space going up from IOREMAP_BASE
* (ioremap_bot records where we're up to).
*/
p = addr & PAGE_MASK;
@@ -96,7 +99,8 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
area->phys_addr = p;
v = (unsigned long) area->addr;
} else {
- v = (ioremap_bot -= size);
+ v = ioremap_bot;
+ ioremap_bot += size;
}

/*
@@ -125,19 +129,13 @@ void __iounmap(volatile void __iomem *addr)
if (v_block_mapped((unsigned long)addr))
return;

- if (addr > high_memory && (unsigned long) addr < ioremap_bot)
+ if ((unsigned long) addr >= ioremap_bot)
vunmap((void *) (PAGE_MASK & (unsigned long)addr));
}
EXPORT_SYMBOL(__iounmap);

#else

-#ifdef CONFIG_PPC_BOOK3S_64
-unsigned long ioremap_bot;
-#else /* !CONFIG_PPC_BOOK3S_64 */
-unsigned long ioremap_bot = IOREMAP_BASE;
-#endif
-
/**
* __ioremap_at - Low level function to establish the page tables
* for an IO mapping
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c3c39b02b2ba..b680aa78a4ac 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -345,8 +345,9 @@ void __init mem_init(void)
#ifdef CONFIG_SWIOTLB
swiotlb_init(0);
#endif
-
+#ifdef CONFIG_PPC64
high_memory = (void *) __va(max_low_pfn * PAGE_SIZE);
+#endif
set_max_mapnr(max_pfn);
free_all_bootmem();

@@ -383,10 +384,10 @@ void __init mem_init(void)
#endif /* CONFIG_HIGHMEM */
#ifdef CONFIG_NOT_COHERENT_CACHE
pr_info(" * 0x%08lx..0x%08lx : consistent mem\n",
- IOREMAP_TOP, IOREMAP_TOP + CONFIG_CONSISTENT_SIZE);
+ IOREMAP_END, IOREMAP_END + CONFIG_CONSISTENT_SIZE);
#endif /* CONFIG_NOT_COHERENT_CACHE */
pr_info(" * 0x%08lx..0x%08lx : early ioremap\n",
- ioremap_bot, IOREMAP_TOP);
+ IOREMAP_BASE, ioremap_bot);
pr_info(" * 0x%08lx..0x%08lx : vmalloc & ioremap\n",
VMALLOC_START, VMALLOC_END);
#endif /* CONFIG_PPC32 */
--
2.13.3


2018-05-04 12:38:48

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 06/17] powerpc: common ioremap functions.

__ioremap(), ioremap(), ioremap_wc() and ioremap_prot() are
very similar between PPC32 and PPC64, so they can easily be
made common.

_PAGE_WRITE equals _PAGE_RW on PPC32.
_PAGE_RO and _PAGE_HWWRITE are 0 on PPC64.

iounmap() can also be made common by renaming the PPC32
iounmap() to __iounmap().
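
To see why defining the missing bits to 0 is enough, here is a standalone
sketch (not kernel code, and the flag values below are made up) of the
common "writeable implies dirty" test used by ioremap_prot(); with
_PAGE_RO and _PAGE_HWWRITE defined to 0 it degenerates into the previous
PPC64 code:

/*
 * Standalone sketch, not kernel code.  Flag values are invented.
 * Build with -DSKETCH_PPC32 to model the PPC32 definitions.
 */
#include <stdio.h>

#define _PAGE_WRITE	0x002
#define _PAGE_DIRTY	0x080
#ifdef SKETCH_PPC32
#define _PAGE_RO	0x010
#define _PAGE_HWWRITE	0x100
#else	/* "PPC64": these bits do not exist, so define them to 0 */
#define _PAGE_RO	0
#define _PAGE_HWWRITE	0
#endif

static unsigned long fixup_flags(unsigned long flags)
{
	/* writeable implies dirty for kernel addresses */
	if ((flags & (_PAGE_WRITE | _PAGE_RO)) != _PAGE_RO)
		flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
	return flags;
}

int main(void)
{
	/* With _PAGE_RO == 0 the test reduces to "if (flags & _PAGE_WRITE)" */
	printf("writeable: 0x%lx\n", fixup_flags(_PAGE_WRITE));
	printf("read-only: 0x%lx\n", fixup_flags(_PAGE_RO));
	return 0;
}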

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 1 +
arch/powerpc/include/asm/machdep.h | 2 +-
arch/powerpc/mm/ioremap.c | 95 +++++++++-------------------
3 files changed, 31 insertions(+), 67 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 47b5ffc8715d..c5c6ead06bfb 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -17,6 +17,7 @@
#define _PAGE_NA 0
#define _PAGE_RO 0
#define _PAGE_USER 0
+#define _PAGE_HWWRITE 0

#define _PAGE_EXEC 0x00001 /* execute permission */
#define _PAGE_WRITE 0x00002 /* write access allowed */
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index ffe7c71e1132..84d99ed82d5d 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -33,11 +33,11 @@ struct pci_host_bridge;

struct machdep_calls {
char *name;
-#ifdef CONFIG_PPC64
void __iomem * (*ioremap)(phys_addr_t addr, unsigned long size,
unsigned long flags, void *caller);
void (*iounmap)(volatile void __iomem *token);

+#ifdef CONFIG_PPC64
#ifdef CONFIG_PM
void (*iommu_save)(void);
void (*iommu_restore)(void);
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
index 5d2645193568..f8dc9638c598 100644
--- a/arch/powerpc/mm/ioremap.c
+++ b/arch/powerpc/mm/ioremap.c
@@ -23,6 +23,7 @@
#include <asm/io.h>
#include <asm/setup.h>
#include <asm/sections.h>
+#include <asm/machdep.h>

#include "mmu_decl.h"

@@ -32,44 +33,6 @@ unsigned long ioremap_bot;
EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */

void __iomem *
-ioremap(phys_addr_t addr, unsigned long size)
-{
- return __ioremap_caller(addr, size, _PAGE_NO_CACHE | _PAGE_GUARDED,
- __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap);
-
-void __iomem *
-ioremap_wc(phys_addr_t addr, unsigned long size)
-{
- return __ioremap_caller(addr, size, _PAGE_NO_CACHE,
- __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_wc);
-
-void __iomem *
-ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags)
-{
- /* writeable implies dirty for kernel addresses */
- if ((flags & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO)
- flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
-
- /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
- flags &= ~(_PAGE_USER | _PAGE_EXEC);
- flags |= _PAGE_PRIVILEGED;
-
- return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_prot);
-
-void __iomem *
-__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags)
-{
- return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
-}
-EXPORT_SYMBOL(__ioremap);
-
-void __iomem *
__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
void *caller)
{
@@ -153,7 +116,7 @@ __ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
}

-void iounmap(volatile void __iomem *addr)
+void __iounmap(volatile void __iomem *addr)
{
/*
* If mapped by BATs then there is nothing to do.
@@ -165,7 +128,7 @@ void iounmap(volatile void __iomem *addr)
if (addr > high_memory && (unsigned long) addr < ioremap_bot)
vunmap((void *) (PAGE_MASK & (unsigned long)addr));
}
-EXPORT_SYMBOL(iounmap);
+EXPORT_SYMBOL(__iounmap);

#else

@@ -264,6 +227,30 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
return ret;
}

+/*
+ * Unmap an IO region and remove it from imalloc'd list.
+ * Access to IO memory should be serialized by driver.
+ */
+void __iounmap(volatile void __iomem *token)
+{
+ void *addr;
+
+ if (!slab_is_available())
+ return;
+
+ addr = (void *) ((unsigned long __force)
+ PCI_FIX_ADDR(token) & PAGE_MASK);
+ if ((unsigned long)addr < ioremap_bot) {
+ printk(KERN_WARNING "Attempt to iounmap early bolted mapping"
+ " at 0x%p\n", addr);
+ return;
+ }
+ vunmap(addr);
+}
+EXPORT_SYMBOL(__iounmap);
+
+#endif
+
void __iomem * __ioremap(phys_addr_t addr, unsigned long size,
unsigned long flags)
{
@@ -299,8 +286,8 @@ void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size,
void *caller = __builtin_return_address(0);

/* writeable implies dirty for kernel addresses */
- if (flags & _PAGE_WRITE)
- flags |= _PAGE_DIRTY;
+ if ((flags & (_PAGE_WRITE | _PAGE_RO)) != _PAGE_RO)
+ flags |= _PAGE_DIRTY | _PAGE_HWWRITE;

/* we don't want to let _PAGE_EXEC leak out */
flags &= ~_PAGE_EXEC;
@@ -316,28 +303,6 @@ void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size,
}
EXPORT_SYMBOL(ioremap_prot);

-/*
- * Unmap an IO region and remove it from imalloc'd list.
- * Access to IO memory should be serialized by driver.
- */
-void __iounmap(volatile void __iomem *token)
-{
- void *addr;
-
- if (!slab_is_available())
- return;
-
- addr = (void *) ((unsigned long __force)
- PCI_FIX_ADDR(token) & PAGE_MASK);
- if ((unsigned long)addr < ioremap_bot) {
- printk(KERN_WARNING "Attempt to iounmap early bolted mapping"
- " at 0x%p\n", addr);
- return;
- }
- vunmap(addr);
-}
-EXPORT_SYMBOL(__iounmap);
-
void iounmap(volatile void __iomem *token)
{
if (ppc_md.iounmap)
@@ -346,5 +311,3 @@ void iounmap(volatile void __iomem *token)
__iounmap(token);
}
EXPORT_SYMBOL(iounmap);
-
-#endif
--
2.13.3


2018-05-04 12:38:50

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 12/17] powerpc/8xx: Remove PTE_ATOMIC_UPDATES

Commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss")
introduced non-atomic PTE updates and started the work of removing
PTE updates from the TLB miss handlers, but kept PTE_ATOMIC_UPDATES for
the 8xx with the following comment:
/* Until my rework is finished, 8xx still needs atomic PTE updates */

Commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux
mm expects") removed all PTE updates done in the TLB miss handlers.

Therefore, atomic PTE updates are not needed anymore for the 8xx.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/nohash/32/pte-8xx.h | 3 ---
1 file changed, 3 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
index a9a2919251e0..31401320c1a5 100644
--- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h
@@ -54,9 +54,6 @@
#define _PMD_GUARDED 0x0010
#define _PMD_USER 0x0020 /* APG 1 */

-/* Until my rework is finished, 8xx still needs atomic PTE updates */
-#define PTE_ATOMIC_UPDATES 1
-
#ifdef CONFIG_PPC_16K_PAGES
#define _PAGE_PSIZE _PAGE_HUGE
#endif
--
2.13.3


2018-05-04 12:39:24

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 05/17] powerpc: move io mapping functions into ioremap.c

This patch is the first of a series that intends to make
IO mappings common to PPC32 and PPC64.

It moves the ioremap/iounmap functions into a new file called ioremap.c,
with no other modification to the functions.
For the time being, the PPC32 and PPC64 parts are enclosed in #ifdefs.
Following patches will aim at making those functions as common as
possible between PPC32 and PPC64.

This patch also moves each EXPORT_SYMBOL() to just after the function it exports.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/mm/Makefile | 2 +-
arch/powerpc/mm/ioremap.c | 350 +++++++++++++++++++++++++++++++++++++++++++
arch/powerpc/mm/pgtable_32.c | 139 -----------------
arch/powerpc/mm/pgtable_64.c | 177 ----------------------
4 files changed, 351 insertions(+), 317 deletions(-)
create mode 100644 arch/powerpc/mm/ioremap.c

diff --git a/arch/powerpc/mm/Makefile b/arch/powerpc/mm/Makefile
index f06f3577d8d1..22d54c1d90e1 100644
--- a/arch/powerpc/mm/Makefile
+++ b/arch/powerpc/mm/Makefile
@@ -9,7 +9,7 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)

obj-y := fault.o mem.o pgtable.o mmap.o \
init_$(BITS).o pgtable_$(BITS).o \
- init-common.o mmu_context.o drmem.o
+ init-common.o mmu_context.o drmem.o ioremap.o
obj-$(CONFIG_PPC_MMU_NOHASH) += mmu_context_nohash.o tlb_nohash.o \
tlb_nohash_low.o
obj-$(CONFIG_PPC_BOOK3E) += tlb_low_$(BITS)e.o
diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
new file mode 100644
index 000000000000..5d2645193568
--- /dev/null
+++ b/arch/powerpc/mm/ioremap.c
@@ -0,0 +1,350 @@
+/*
+ * This file contains the routines for mapping IO areas
+ *
+ * Derived from arch/powerpc/mm/pgtable_32.c and
+ * arch/powerpc/mm/pgtable_64.c
+ *
+ * SPDX-License-Identifier: GPL-2.0
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/init.h>
+#include <linux/highmem.h>
+#include <linux/memblock.h>
+#include <linux/slab.h>
+
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/fixmap.h>
+#include <asm/io.h>
+#include <asm/setup.h>
+#include <asm/sections.h>
+
+#include "mmu_decl.h"
+
+#ifdef CONFIG_PPC32
+
+unsigned long ioremap_bot;
+EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */
+
+void __iomem *
+ioremap(phys_addr_t addr, unsigned long size)
+{
+ return __ioremap_caller(addr, size, _PAGE_NO_CACHE | _PAGE_GUARDED,
+ __builtin_return_address(0));
+}
+EXPORT_SYMBOL(ioremap);
+
+void __iomem *
+ioremap_wc(phys_addr_t addr, unsigned long size)
+{
+ return __ioremap_caller(addr, size, _PAGE_NO_CACHE,
+ __builtin_return_address(0));
+}
+EXPORT_SYMBOL(ioremap_wc);
+
+void __iomem *
+ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags)
+{
+ /* writeable implies dirty for kernel addresses */
+ if ((flags & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO)
+ flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
+
+ /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
+ flags &= ~(_PAGE_USER | _PAGE_EXEC);
+ flags |= _PAGE_PRIVILEGED;
+
+ return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(ioremap_prot);
+
+void __iomem *
+__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags)
+{
+ return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(__ioremap);
+
+void __iomem *
+__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
+ void *caller)
+{
+ unsigned long v, i;
+ phys_addr_t p;
+ int err;
+
+ /* Make sure we have the base flags */
+ if ((flags & _PAGE_PRESENT) == 0)
+ flags |= pgprot_val(PAGE_KERNEL);
+
+ /* Non-cacheable page cannot be coherent */
+ if (flags & _PAGE_NO_CACHE)
+ flags &= ~_PAGE_COHERENT;
+
+ /*
+ * Choose an address to map it to.
+ * Once the vmalloc system is running, we use it.
+ * Before then, we use space going down from IOREMAP_TOP
+ * (ioremap_bot records where we're up to).
+ */
+ p = addr & PAGE_MASK;
+ size = PAGE_ALIGN(addr + size) - p;
+
+ /*
+ * If the address lies within the first 16 MB, assume it's in ISA
+ * memory space
+ */
+ if (p < 16*1024*1024)
+ p += _ISA_MEM_BASE;
+
+#ifndef CONFIG_CRASH_DUMP
+ /*
+ * Don't allow anybody to remap normal RAM that we're using.
+ * mem_init() sets high_memory so only do the check after that.
+ */
+ if (slab_is_available() && (p < virt_to_phys(high_memory)) &&
+ page_is_ram(__phys_to_pfn(p))) {
+ printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
+ (unsigned long long)p, __builtin_return_address(0));
+ return NULL;
+ }
+#endif
+
+ if (size == 0)
+ return NULL;
+
+ /*
+ * Is it already mapped? Perhaps overlapped by a previous
+ * mapping.
+ */
+ v = p_block_mapped(p);
+ if (v)
+ goto out;
+
+ if (slab_is_available()) {
+ struct vm_struct *area;
+ area = get_vm_area_caller(size, VM_IOREMAP, caller);
+ if (area == 0)
+ return NULL;
+ area->phys_addr = p;
+ v = (unsigned long) area->addr;
+ } else {
+ v = (ioremap_bot -= size);
+ }
+
+ /*
+ * Should check if it is a candidate for a BAT mapping
+ */
+
+ err = 0;
+ for (i = 0; i < size && err == 0; i += PAGE_SIZE)
+ err = map_kernel_page(v+i, p+i, flags);
+ if (err) {
+ if (slab_is_available())
+ vunmap((void *)v);
+ return NULL;
+ }
+
+out:
+ return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
+}
+
+void iounmap(volatile void __iomem *addr)
+{
+ /*
+ * If mapped by BATs then there is nothing to do.
+ * Calling vfree() generates a benign warning.
+ */
+ if (v_block_mapped((unsigned long)addr))
+ return;
+
+ if (addr > high_memory && (unsigned long) addr < ioremap_bot)
+ vunmap((void *) (PAGE_MASK & (unsigned long)addr));
+}
+EXPORT_SYMBOL(iounmap);
+
+#else
+
+#ifdef CONFIG_PPC_BOOK3S_64
+unsigned long ioremap_bot;
+#else /* !CONFIG_PPC_BOOK3S_64 */
+unsigned long ioremap_bot = IOREMAP_BASE;
+#endif
+
+/**
+ * __ioremap_at - Low level function to establish the page tables
+ * for an IO mapping
+ */
+void __iomem * __ioremap_at(phys_addr_t pa, void *ea, unsigned long size,
+ unsigned long flags)
+{
+ unsigned long i;
+
+ /* Make sure we have the base flags */
+ if ((flags & _PAGE_PRESENT) == 0)
+ flags |= pgprot_val(PAGE_KERNEL);
+
+ /* We don't support the 4K PFN hack with ioremap */
+ if (flags & H_PAGE_4K_PFN)
+ return NULL;
+
+ WARN_ON(pa & ~PAGE_MASK);
+ WARN_ON(((unsigned long)ea) & ~PAGE_MASK);
+ WARN_ON(size & ~PAGE_MASK);
+
+ for (i = 0; i < size; i += PAGE_SIZE)
+ if (map_kernel_page((unsigned long)ea+i, pa+i, flags))
+ return NULL;
+
+ return (void __iomem *)ea;
+}
+EXPORT_SYMBOL(__ioremap_at);
+
+/**
+ * __iounmap_from - Low level function to tear down the page tables
+ * for an IO mapping. This is used for mappings that
+ * are manipulated manually, like partial unmapping of
+ * PCI IOs or ISA space.
+ */
+void __iounmap_at(void *ea, unsigned long size)
+{
+ WARN_ON(((unsigned long)ea) & ~PAGE_MASK);
+ WARN_ON(size & ~PAGE_MASK);
+
+ unmap_kernel_range((unsigned long)ea, size);
+}
+EXPORT_SYMBOL(__iounmap_at);
+
+void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
+ unsigned long flags, void *caller)
+{
+ phys_addr_t paligned;
+ void __iomem *ret;
+
+ /*
+ * Choose an address to map it to.
+ * Once the imalloc system is running, we use it.
+ * Before that, we map using addresses going
+ * up from ioremap_bot. imalloc will use
+ * the addresses from ioremap_bot through
+ * IMALLOC_END
+ *
+ */
+ paligned = addr & PAGE_MASK;
+ size = PAGE_ALIGN(addr + size) - paligned;
+
+ if ((size == 0) || (paligned == 0))
+ return NULL;
+
+ if (slab_is_available()) {
+ struct vm_struct *area;
+
+ area = __get_vm_area_caller(size, VM_IOREMAP,
+ ioremap_bot, IOREMAP_END,
+ caller);
+ if (area == NULL)
+ return NULL;
+
+ area->phys_addr = paligned;
+ ret = __ioremap_at(paligned, area->addr, size, flags);
+ if (!ret)
+ vunmap(area->addr);
+ } else {
+ ret = __ioremap_at(paligned, (void *)ioremap_bot, size, flags);
+ if (ret)
+ ioremap_bot += size;
+ }
+
+ if (ret)
+ ret += addr & ~PAGE_MASK;
+ return ret;
+}
+
+void __iomem * __ioremap(phys_addr_t addr, unsigned long size,
+ unsigned long flags)
+{
+ return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
+}
+EXPORT_SYMBOL(__ioremap);
+
+void __iomem * ioremap(phys_addr_t addr, unsigned long size)
+{
+ unsigned long flags = pgprot_val(pgprot_noncached(__pgprot(0)));
+ void *caller = __builtin_return_address(0);
+
+ if (ppc_md.ioremap)
+ return ppc_md.ioremap(addr, size, flags, caller);
+ return __ioremap_caller(addr, size, flags, caller);
+}
+EXPORT_SYMBOL(ioremap);
+
+void __iomem * ioremap_wc(phys_addr_t addr, unsigned long size)
+{
+ unsigned long flags = pgprot_val(pgprot_noncached_wc(__pgprot(0)));
+ void *caller = __builtin_return_address(0);
+
+ if (ppc_md.ioremap)
+ return ppc_md.ioremap(addr, size, flags, caller);
+ return __ioremap_caller(addr, size, flags, caller);
+}
+EXPORT_SYMBOL(ioremap_wc);
+
+void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size,
+ unsigned long flags)
+{
+ void *caller = __builtin_return_address(0);
+
+ /* writeable implies dirty for kernel addresses */
+ if (flags & _PAGE_WRITE)
+ flags |= _PAGE_DIRTY;
+
+ /* we don't want to let _PAGE_EXEC leak out */
+ flags &= ~_PAGE_EXEC;
+ /*
+ * Force kernel mapping.
+ */
+ flags &= ~_PAGE_USER;
+ flags |= _PAGE_PRIVILEGED;
+
+ if (ppc_md.ioremap)
+ return ppc_md.ioremap(addr, size, flags, caller);
+ return __ioremap_caller(addr, size, flags, caller);
+}
+EXPORT_SYMBOL(ioremap_prot);
+
+/*
+ * Unmap an IO region and remove it from imalloc'd list.
+ * Access to IO memory should be serialized by driver.
+ */
+void __iounmap(volatile void __iomem *token)
+{
+ void *addr;
+
+ if (!slab_is_available())
+ return;
+
+ addr = (void *) ((unsigned long __force)
+ PCI_FIX_ADDR(token) & PAGE_MASK);
+ if ((unsigned long)addr < ioremap_bot) {
+ printk(KERN_WARNING "Attempt to iounmap early bolted mapping"
+ " at 0x%p\n", addr);
+ return;
+ }
+ vunmap(addr);
+}
+EXPORT_SYMBOL(__iounmap);
+
+void iounmap(volatile void __iomem *token)
+{
+ if (ppc_md.iounmap)
+ ppc_md.iounmap(token);
+ else
+ __iounmap(token);
+}
+EXPORT_SYMBOL(iounmap);
+
+#endif
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 120a49bfb9c6..54a5bc0767a9 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -38,9 +38,6 @@

#include "mmu_decl.h"

-unsigned long ioremap_bot;
-EXPORT_SYMBOL(ioremap_bot); /* aka VMALLOC_END */
-
extern char etext[], _stext[], _sinittext[], _einittext[];

__ref pte_t *pte_alloc_one_kernel(struct mm_struct *mm, unsigned long address)
@@ -73,142 +70,6 @@ pgtable_t pte_alloc_one(struct mm_struct *mm, unsigned long address)
return ptepage;
}

-void __iomem *
-ioremap(phys_addr_t addr, unsigned long size)
-{
- return __ioremap_caller(addr, size, _PAGE_NO_CACHE | _PAGE_GUARDED,
- __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap);
-
-void __iomem *
-ioremap_wc(phys_addr_t addr, unsigned long size)
-{
- return __ioremap_caller(addr, size, _PAGE_NO_CACHE,
- __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_wc);
-
-void __iomem *
-ioremap_prot(phys_addr_t addr, unsigned long size, unsigned long flags)
-{
- /* writeable implies dirty for kernel addresses */
- if ((flags & (_PAGE_RW | _PAGE_RO)) != _PAGE_RO)
- flags |= _PAGE_DIRTY | _PAGE_HWWRITE;
-
- /* we don't want to let _PAGE_USER and _PAGE_EXEC leak out */
- flags &= ~(_PAGE_USER | _PAGE_EXEC);
- flags |= _PAGE_PRIVILEGED;
-
- return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
-}
-EXPORT_SYMBOL(ioremap_prot);
-
-void __iomem *
-__ioremap(phys_addr_t addr, unsigned long size, unsigned long flags)
-{
- return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
-}
-
-void __iomem *
-__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
- void *caller)
-{
- unsigned long v, i;
- phys_addr_t p;
- int err;
-
- /* Make sure we have the base flags */
- if ((flags & _PAGE_PRESENT) == 0)
- flags |= pgprot_val(PAGE_KERNEL);
-
- /* Non-cacheable page cannot be coherent */
- if (flags & _PAGE_NO_CACHE)
- flags &= ~_PAGE_COHERENT;
-
- /*
- * Choose an address to map it to.
- * Once the vmalloc system is running, we use it.
- * Before then, we use space going down from IOREMAP_TOP
- * (ioremap_bot records where we're up to).
- */
- p = addr & PAGE_MASK;
- size = PAGE_ALIGN(addr + size) - p;
-
- /*
- * If the address lies within the first 16 MB, assume it's in ISA
- * memory space
- */
- if (p < 16*1024*1024)
- p += _ISA_MEM_BASE;
-
-#ifndef CONFIG_CRASH_DUMP
- /*
- * Don't allow anybody to remap normal RAM that we're using.
- * mem_init() sets high_memory so only do the check after that.
- */
- if (slab_is_available() && (p < virt_to_phys(high_memory)) &&
- page_is_ram(__phys_to_pfn(p))) {
- printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
- (unsigned long long)p, __builtin_return_address(0));
- return NULL;
- }
-#endif
-
- if (size == 0)
- return NULL;
-
- /*
- * Is it already mapped? Perhaps overlapped by a previous
- * mapping.
- */
- v = p_block_mapped(p);
- if (v)
- goto out;
-
- if (slab_is_available()) {
- struct vm_struct *area;
- area = get_vm_area_caller(size, VM_IOREMAP, caller);
- if (area == 0)
- return NULL;
- area->phys_addr = p;
- v = (unsigned long) area->addr;
- } else {
- v = (ioremap_bot -= size);
- }
-
- /*
- * Should check if it is a candidate for a BAT mapping
- */
-
- err = 0;
- for (i = 0; i < size && err == 0; i += PAGE_SIZE)
- err = map_kernel_page(v+i, p+i, flags);
- if (err) {
- if (slab_is_available())
- vunmap((void *)v);
- return NULL;
- }
-
-out:
- return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
-}
-EXPORT_SYMBOL(__ioremap);
-
-void iounmap(volatile void __iomem *addr)
-{
- /*
- * If mapped by BATs then there is nothing to do.
- * Calling vfree() generates a benign warning.
- */
- if (v_block_mapped((unsigned long)addr))
- return;
-
- if (addr > high_memory && (unsigned long) addr < ioremap_bot)
- vunmap((void *) (PAGE_MASK & (unsigned long)addr));
-}
-EXPORT_SYMBOL(iounmap);
-
int map_kernel_page(unsigned long va, phys_addr_t pa, int flags)
{
pmd_t *pd;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 9bf659d5078c..dd1102a246e4 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -109,185 +109,8 @@ unsigned long __pte_frag_nr;
EXPORT_SYMBOL(__pte_frag_nr);
unsigned long __pte_frag_size_shift;
EXPORT_SYMBOL(__pte_frag_size_shift);
-unsigned long ioremap_bot;
-#else /* !CONFIG_PPC_BOOK3S_64 */
-unsigned long ioremap_bot = IOREMAP_BASE;
#endif

-/**
- * __ioremap_at - Low level function to establish the page tables
- * for an IO mapping
- */
-void __iomem * __ioremap_at(phys_addr_t pa, void *ea, unsigned long size,
- unsigned long flags)
-{
- unsigned long i;
-
- /* Make sure we have the base flags */
- if ((flags & _PAGE_PRESENT) == 0)
- flags |= pgprot_val(PAGE_KERNEL);
-
- /* We don't support the 4K PFN hack with ioremap */
- if (flags & H_PAGE_4K_PFN)
- return NULL;
-
- WARN_ON(pa & ~PAGE_MASK);
- WARN_ON(((unsigned long)ea) & ~PAGE_MASK);
- WARN_ON(size & ~PAGE_MASK);
-
- for (i = 0; i < size; i += PAGE_SIZE)
- if (map_kernel_page((unsigned long)ea+i, pa+i, flags))
- return NULL;
-
- return (void __iomem *)ea;
-}
-
-/**
- * __iounmap_from - Low level function to tear down the page tables
- * for an IO mapping. This is used for mappings that
- * are manipulated manually, like partial unmapping of
- * PCI IOs or ISA space.
- */
-void __iounmap_at(void *ea, unsigned long size)
-{
- WARN_ON(((unsigned long)ea) & ~PAGE_MASK);
- WARN_ON(size & ~PAGE_MASK);
-
- unmap_kernel_range((unsigned long)ea, size);
-}
-
-void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
- unsigned long flags, void *caller)
-{
- phys_addr_t paligned;
- void __iomem *ret;
-
- /*
- * Choose an address to map it to.
- * Once the imalloc system is running, we use it.
- * Before that, we map using addresses going
- * up from ioremap_bot. imalloc will use
- * the addresses from ioremap_bot through
- * IMALLOC_END
- *
- */
- paligned = addr & PAGE_MASK;
- size = PAGE_ALIGN(addr + size) - paligned;
-
- if ((size == 0) || (paligned == 0))
- return NULL;
-
- if (slab_is_available()) {
- struct vm_struct *area;
-
- area = __get_vm_area_caller(size, VM_IOREMAP,
- ioremap_bot, IOREMAP_END,
- caller);
- if (area == NULL)
- return NULL;
-
- area->phys_addr = paligned;
- ret = __ioremap_at(paligned, area->addr, size, flags);
- if (!ret)
- vunmap(area->addr);
- } else {
- ret = __ioremap_at(paligned, (void *)ioremap_bot, size, flags);
- if (ret)
- ioremap_bot += size;
- }
-
- if (ret)
- ret += addr & ~PAGE_MASK;
- return ret;
-}
-
-void __iomem * __ioremap(phys_addr_t addr, unsigned long size,
- unsigned long flags)
-{
- return __ioremap_caller(addr, size, flags, __builtin_return_address(0));
-}
-
-void __iomem * ioremap(phys_addr_t addr, unsigned long size)
-{
- unsigned long flags = pgprot_val(pgprot_noncached(__pgprot(0)));
- void *caller = __builtin_return_address(0);
-
- if (ppc_md.ioremap)
- return ppc_md.ioremap(addr, size, flags, caller);
- return __ioremap_caller(addr, size, flags, caller);
-}
-
-void __iomem * ioremap_wc(phys_addr_t addr, unsigned long size)
-{
- unsigned long flags = pgprot_val(pgprot_noncached_wc(__pgprot(0)));
- void *caller = __builtin_return_address(0);
-
- if (ppc_md.ioremap)
- return ppc_md.ioremap(addr, size, flags, caller);
- return __ioremap_caller(addr, size, flags, caller);
-}
-
-void __iomem * ioremap_prot(phys_addr_t addr, unsigned long size,
- unsigned long flags)
-{
- void *caller = __builtin_return_address(0);
-
- /* writeable implies dirty for kernel addresses */
- if (flags & _PAGE_WRITE)
- flags |= _PAGE_DIRTY;
-
- /* we don't want to let _PAGE_EXEC leak out */
- flags &= ~_PAGE_EXEC;
- /*
- * Force kernel mapping.
- */
- flags &= ~_PAGE_USER;
- flags |= _PAGE_PRIVILEGED;
-
- if (ppc_md.ioremap)
- return ppc_md.ioremap(addr, size, flags, caller);
- return __ioremap_caller(addr, size, flags, caller);
-}
-
-
-/*
- * Unmap an IO region and remove it from imalloc'd list.
- * Access to IO memory should be serialized by driver.
- */
-void __iounmap(volatile void __iomem *token)
-{
- void *addr;
-
- if (!slab_is_available())
- return;
-
- addr = (void *) ((unsigned long __force)
- PCI_FIX_ADDR(token) & PAGE_MASK);
- if ((unsigned long)addr < ioremap_bot) {
- printk(KERN_WARNING "Attempt to iounmap early bolted mapping"
- " at 0x%p\n", addr);
- return;
- }
- vunmap(addr);
-}
-
-void iounmap(volatile void __iomem *token)
-{
- if (ppc_md.iounmap)
- ppc_md.iounmap(token);
- else
- __iounmap(token);
-}
-
-EXPORT_SYMBOL(ioremap);
-EXPORT_SYMBOL(ioremap_wc);
-EXPORT_SYMBOL(ioremap_prot);
-EXPORT_SYMBOL(__ioremap);
-EXPORT_SYMBOL(__ioremap_at);
-EXPORT_SYMBOL(iounmap);
-EXPORT_SYMBOL(__iounmap);
-EXPORT_SYMBOL(__iounmap_at);
-
#ifndef __PAGETABLE_PUD_FOLDED
/* 4 level page table */
struct page *pgd_page(pgd_t pgd)
--
2.13.3


2018-05-04 12:39:41

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 04/17] Revert "powerpc/8xx: Use L1 entry APG to handle _PAGE_ACCESSED for CONFIG_SWAP"

This reverts commit 4f94b2c7462d9720b2afa7e8e8d4c19446bb31ce.

That commit was buggy, as it used rlwinm instead of rlwimi.
Instead of fixing that bug, we revert that commit in order to
reduce the dependency between L1 entries and L2 entries.
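
For reference, a rough C model of the two instructions (a deliberate
simplification: the SH/MB/ME operands are collapsed into a shift amount
and a plain 32-bit mask) shows why the mix-up matters: rlwinm overwrites
the whole destination register, while rlwimi only replaces the bits
selected by the mask:

/* Simplified model of the two PowerPC rotate instructions, not real emulation */
#include <stdio.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned int n)
{
	return n ? (x << n) | (x >> (32 - n)) : x;
}

/* rlwinm rD,rS,SH,mask: rD is fully overwritten */
static uint32_t rlwinm(uint32_t rs, unsigned int sh, uint32_t mask)
{
	return rotl32(rs, sh) & mask;
}

/* rlwimi rD,rS,SH,mask: only the mask bits of rD are replaced */
static uint32_t rlwimi(uint32_t rd, uint32_t rs, unsigned int sh, uint32_t mask)
{
	return (rotl32(rs, sh) & mask) | (rd & ~mask);
}

int main(void)
{
	uint32_t rd = 0x44444444, rs = 0x00000020;

	printf("rlwinm: 0x%08x\n", rlwinm(rs, 0, 0x00000060));     /* other rd bits lost */
	printf("rlwimi: 0x%08x\n", rlwimi(rd, rs, 0, 0x00000060)); /* other rd bits kept */
	return 0;
}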

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/mmu-8xx.h | 34 +++++-----------------------
arch/powerpc/kernel/head_8xx.S | 45 +++++++++++++++++++++++---------------
arch/powerpc/mm/8xx_mmu.c | 2 +-
3 files changed, 34 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-8xx.h b/arch/powerpc/include/asm/mmu-8xx.h
index 4f547752ae79..193f53116c7a 100644
--- a/arch/powerpc/include/asm/mmu-8xx.h
+++ b/arch/powerpc/include/asm/mmu-8xx.h
@@ -34,20 +34,12 @@
* respectively NA for All or X for Supervisor and no access for User.
* Then we use the APG to say whether accesses are according to Page rules or
* "all Supervisor" rules (Access to all)
- * We also use the 2nd APG bit for _PAGE_ACCESSED when having SWAP:
- * When that bit is not set access is done iaw "all user"
- * which means no access iaw page rules.
- * Therefore, we define 4 APG groups. lsb is _PMD_USER, 2nd is _PAGE_ACCESSED
- * 0x => No access => 11 (all accesses performed as user iaw page definition)
- * 10 => No user => 01 (all accesses performed according to page definition)
- * 11 => User => 00 (all accesses performed as supervisor iaw page definition)
+ * Therefore, we define 2 APG groups. lsb is _PMD_USER
+ * 0 => No user => 01 (all accesses performed according to page definition)
+ * 1 => User => 00 (all accesses performed as supervisor iaw page definition)
* We define all 16 groups so that all other bits of APG can take any value
*/
-#ifdef CONFIG_SWAP
-#define MI_APG_INIT 0xf4f4f4f4
-#else
#define MI_APG_INIT 0x44444444
-#endif

/* The effective page number register. When read, contains the information
* about the last instruction TLB miss. When MI_RPN is written, bits in
@@ -115,20 +107,12 @@
* Supervisor and no access for user and NA for ALL.
* Then we use the APG to say whether accesses are according to Page rules or
* "all Supervisor" rules (Access to all)
- * We also use the 2nd APG bit for _PAGE_ACCESSED when having SWAP:
- * When that bit is not set access is done iaw "all user"
- * which means no access iaw page rules.
- * Therefore, we define 4 APG groups. lsb is _PMD_USER, 2nd is _PAGE_ACCESSED
- * 0x => No access => 11 (all accesses performed as user iaw page definition)
- * 10 => No user => 01 (all accesses performed according to page definition)
- * 11 => User => 00 (all accesses performed as supervisor iaw page definition)
+ * Therefore, we define 2 APG groups. lsb is _PMD_USER
+ * 0 => No user => 01 (all accesses performed according to page definition)
+ * 1 => User => 00 (all accesses performed as supervisor iaw page definition)
* We define all 16 groups so that all other bits of APG can take any value
*/
-#ifdef CONFIG_SWAP
-#define MD_APG_INIT 0xf4f4f4f4
-#else
#define MD_APG_INIT 0x44444444
-#endif

/* The effective page number register. When read, contains the information
* about the last instruction TLB miss. When MD_RPN is written, bits in
@@ -180,12 +164,6 @@
*/
#define SPRN_M_TW 799

-/* APGs */
-#define M_APG0 0x00000000
-#define M_APG1 0x00000020
-#define M_APG2 0x00000040
-#define M_APG3 0x00000060
-
#ifdef CONFIG_PPC_MM_SLICES
#include <asm/nohash/32/slice.h>
#define SLICE_ARRAY_SIZE (1 << (32 - SLICE_LOW_SHIFT - 1))
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index d8670a37d70c..c3b831bb8bad 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -354,13 +354,14 @@ _ENTRY(ITLBMiss_cmp)
#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_HUGETLB_PAGE)
mtcr r12
#endif
-
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 31, _PAGE_ACCESSED >> 1
-#endif
/* Load the MI_TWC with the attributes for this "segment." */
mtspr SPRN_MI_TWC, r11 /* Set segment attributes */

+#ifdef CONFIG_SWAP
+ rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ and r11, r11, r10
+ rlwimi r10, r11, 0, _PAGE_PRESENT
+#endif
li r11, RPN_PATTERN | 0x200
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
@@ -471,14 +472,22 @@ _ENTRY(DTLBMiss_jmp)
* above.
*/
rlwimi r11, r10, 0, _PAGE_GUARDED
-#ifdef CONFIG_SWAP
- /* _PAGE_ACCESSED has to be set. We use second APG bit for that, 0
- * on that bit will represent a Non Access group
- */
- rlwinm r11, r10, 31, _PAGE_ACCESSED >> 1
-#endif
mtspr SPRN_MD_TWC, r11

+ /* Both _PAGE_ACCESSED and _PAGE_PRESENT has to be set.
+ * We also need to know if the insn is a load/store, so:
+ * Clear _PAGE_PRESENT and load that which will
+ * trap into DTLB Error with store bit set accordinly.
+ */
+ /* PRESENT=0x1, ACCESSED=0x20
+ * r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
+ * r10 = (r10 & ~PRESENT) | r11;
+ */
+#ifdef CONFIG_SWAP
+ rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ and r11, r11, r10
+ rlwimi r10, r11, 0, _PAGE_PRESENT
+#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior
@@ -638,8 +647,8 @@ InstructionBreakpoint:
*/
DTLBMissIMMR:
mtcr r12
- /* Set 512k byte guarded page and mark it valid and accessed */
- li r10, MD_PS512K | MD_GUARDED | MD_SVALID | M_APG2
+ /* Set 512k byte guarded page and mark it valid */
+ li r10, MD_PS512K | MD_GUARDED | MD_SVALID
mtspr SPRN_MD_TWC, r10
mfspr r10, SPRN_IMMR /* Get current IMMR */
rlwinm r10, r10, 0, 0xfff80000 /* Get 512 kbytes boundary */
@@ -657,8 +666,8 @@ _ENTRY(dtlb_miss_exit_2)

DTLBMissLinear:
mtcr r12
- /* Set 8M byte page and mark it valid and accessed */
- li r11, MD_PS8MEG | MD_SVALID | M_APG2
+ /* Set 8M byte page and mark it valid */
+ li r11, MD_PS8MEG | MD_SVALID
mtspr SPRN_MD_TWC, r11
rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MD_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
@@ -676,8 +685,8 @@ _ENTRY(dtlb_miss_exit_3)
#ifndef CONFIG_PIN_TLB_TEXT
ITLBMissLinear:
mtcr r12
- /* Set 8M byte page and mark it valid,accessed */
- li r11, MI_PS8MEG | MI_SVALID | M_APG2
+ /* Set 8M byte page and mark it valid */
+ li r11, MI_PS8MEG | MI_SVALID
mtspr SPRN_MI_TWC, r11
rlwinm r10, r10, 0, 0x0f800000 /* 8xx supports max 256Mb RAM */
ori r10, r10, 0xf0 | MI_SPS16K | _PAGE_PRIVILEGED | _PAGE_DIRTY | \
@@ -960,7 +969,7 @@ initial_mmu:
ori r8, r8, MI_EVALID /* Mark it valid */
mtspr SPRN_MI_EPN, r8
li r8, MI_PS8MEG /* Set 8M byte page */
- ori r8, r8, MI_SVALID | M_APG2 /* Make it valid, APG 2 */
+ ori r8, r8, MI_SVALID /* Make it valid */
mtspr SPRN_MI_TWC, r8
li r8, MI_BOOTINIT /* Create RPN for address 0 */
mtspr SPRN_MI_RPN, r8 /* Store TLB entry */
@@ -987,7 +996,7 @@ initial_mmu:
ori r8, r8, MD_EVALID /* Mark it valid */
mtspr SPRN_MD_EPN, r8
li r8, MD_PS512K | MD_GUARDED /* Set 512k byte page */
- ori r8, r8, MD_SVALID | M_APG2 /* Make it valid and accessed */
+ ori r8, r8, MD_SVALID /* Make it valid */
mtspr SPRN_MD_TWC, r8
mr r8, r9 /* Create paddr for TLB */
ori r8, r8, MI_BOOTINIT|0x2 /* Inhibit cache -- Cort */
diff --git a/arch/powerpc/mm/8xx_mmu.c b/arch/powerpc/mm/8xx_mmu.c
index cf77d755246d..5d53684c2ebd 100644
--- a/arch/powerpc/mm/8xx_mmu.c
+++ b/arch/powerpc/mm/8xx_mmu.c
@@ -79,7 +79,7 @@ void __init MMU_init_hw(void)
for (; i < 32 && mem >= LARGE_PAGE_SIZE_8M; i++) {
mtspr(SPRN_MD_CTR, ctr | (i << 8));
mtspr(SPRN_MD_EPN, (unsigned long)__va(addr) | MD_EVALID);
- mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID | M_APG2);
+ mtspr(SPRN_MD_TWC, MD_PS8MEG | MD_SVALID);
mtspr(SPRN_MD_RPN, addr | flags | _PAGE_PRESENT);
addr += LARGE_PAGE_SIZE_8M;
mem -= LARGE_PAGE_SIZE_8M;
--
2.13.3


2018-05-04 12:40:30

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 02/17] powerpc/nohash: remove _PAGE_BUSY

_PAGE_BUSY is always 0, so remove it.

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/nohash/64/pgtable.h | 10 +++-------
arch/powerpc/include/asm/nohash/pte-book3e.h | 5 -----
2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 4f6f5a27bfb5..c3559d7a94fb 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -186,14 +186,12 @@ static inline unsigned long pte_update(struct mm_struct *mm,

__asm__ __volatile__(
"1: ldarx %0,0,%3 # pte_update\n\
- andi. %1,%0,%6\n\
- bne- 1b \n\
andc %1,%0,%4 \n\
- or %1,%1,%7\n\
+ or %1,%1,%6\n\
stdcx. %1,0,%3 \n\
bne- 1b"
: "=&r" (old), "=&r" (tmp), "=m" (*ptep)
- : "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
+ : "r" (ptep), "r" (clr), "m" (*ptep), "r" (set)
: "cc" );
#else
unsigned long old = pte_val(*ptep);
@@ -290,13 +288,11 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,

__asm__ __volatile__(
"1: ldarx %0,0,%4\n\
- andi. %1,%0,%6\n\
- bne- 1b \n\
or %0,%3,%0\n\
stdcx. %0,0,%4\n\
bne- 1b"
:"=&r" (old), "=&r" (tmp), "=m" (*ptep)
- :"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
+ :"r" (bits), "r" (ptep), "m" (*ptep)
:"cc");
#else
unsigned long old = pte_val(*ptep);
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index 9ff51b4c0cac..12730b81cd98 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -57,13 +57,8 @@
#define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
#define _PAGE_PRIVILEGED (_PAGE_BAP_SR)

-#define _PAGE_BUSY 0
-
#define _PAGE_SPECIAL _PAGE_SW0

-/* Flags to be preserved on PTE modifications */
-#define _PAGE_HPTEFLAGS _PAGE_BUSY
-
/* Base page size */
#ifdef CONFIG_PPC_64K_PAGES
#define _PAGE_PSIZE _PAGE_PSIZE_64K
--
2.13.3


2018-05-04 12:40:47

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 03/17] powerpc/nohash: use IS_ENABLED() to simplify __set_pte_at()

By using IS_ENABLED() we can simplify __set_pte_at() by removing
the redundant *ptep = pte assignment.
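
A minimal userspace sketch of the pattern (CONFIG_FOO and the trivial
IS_ENABLED() definition below are stand-ins for a real Kconfig symbol and
for the real kernel macro): both branches are always compiled, the dead
one is optimised away, and the duplicated store plus the nested
#ifdef/#endif pairs can go:

/* Userspace sketch only; IS_ENABLED() here is a placeholder for the kernel macro */
#include <stdio.h>
#include <stdbool.h>

#define CONFIG_FOO 1			/* would come from the kernel config */
#define IS_ENABLED(option) (option)	/* the real kernel macro is more elaborate */

static void set_entry(unsigned long *entry, unsigned long val, bool percpu)
{
	if (IS_ENABLED(CONFIG_FOO) && !percpu) {
		/* the carefully ordered special-case store would go here */
		*entry = val;
		return;
	}
	/* anything else just stores the value normally */
	*entry = val;
}

int main(void)
{
	unsigned long e = 0;

	set_entry(&e, 42, false);
	printf("%lu\n", e);
	return 0;
}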

Signed-off-by: Christophe Leroy <[email protected]>
---
arch/powerpc/include/asm/nohash/pgtable.h | 23 ++++++++---------------
1 file changed, 8 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index f2fe3cbe90af..077472640b35 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -148,40 +148,33 @@ extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
{
-#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
/* Second case is 32-bit with 64-bit PTE. In this case, we
* can just store as long as we do the two halves in the right order
* with a barrier in between.
* In the percpu case, we also fallback to the simple update
*/
- if (percpu) {
- *ptep = pte;
+ if (IS_ENABLED(CONFIG_PPC32) && IS_ENABLED(CONFIG_PTE_64BIT) && !percpu) {
+ __asm__ __volatile__("\
+ stw%U0%X0 %2,%0\n\
+ eieio\n\
+ stw%U0%X0 %L2,%1"
+ : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
+ : "r" (pte) : "memory");
return;
}
- __asm__ __volatile__("\
- stw%U0%X0 %2,%0\n\
- eieio\n\
- stw%U0%X0 %L2,%1"
- : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
- : "r" (pte) : "memory");
-
-#else
/* Anything else just stores the PTE normally. That covers all 64-bit
* cases, and 32-bit non-hash with 32-bit PTEs.
*/
*ptep = pte;

-#ifdef CONFIG_PPC_BOOK3E_64
/*
* With hardware tablewalk, a sync is needed to ensure that
* subsequent accesses see the PTE we just wrote. Unlike userspace
* mappings, we can't tolerate spurious faults, so make sure
* the new PTE will be seen the first time.
*/
- if (is_kernel_addr(addr))
+ if (IS_ENABLED(CONFIG_PPC_BOOK3E_64) && is_kernel_addr(addr))
mb();
-#endif
-#endif
}


--
2.13.3


2018-05-04 12:41:34

by Christophe Leroy

[permalink] [raw]
Subject: [PATCH 01/17] powerpc/nohash: remove hash related code from nohash headers.

When the nohash and book3s headers were split, some hash-related code
remained in the nohash headers. This patch removes it.

Signed-off-by: Christophe Leroy <[email protected]>
---
Removed the call to pte_young() as it fails; back to using _PAGE_ACCESSED directly.

arch/powerpc/include/asm/nohash/32/pgtable.h | 29 +++------------------
arch/powerpc/include/asm/nohash/64/pgtable.h | 16 ++----------
arch/powerpc/include/asm/nohash/pgtable.h | 38 +++-------------------------
arch/powerpc/include/asm/nohash/pte-book3e.h | 1 -
4 files changed, 10 insertions(+), 74 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
index 03bbd1149530..140f8e74b478 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -133,7 +133,7 @@ extern int icache_44x_need_flush;
#ifndef __ASSEMBLY__

#define pte_clear(mm, addr, ptep) \
- do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
+ do { pte_update(ptep, ~0, 0); } while (0)

#define pmd_none(pmd) (!pmd_val(pmd))
#define pmd_bad(pmd) (pmd_val(pmd) & _PMD_BAD)
@@ -146,21 +146,6 @@ static inline void pmd_clear(pmd_t *pmdp)


/*
- * When flushing the tlb entry for a page, we also need to flush the hash
- * table entry. flush_hash_pages is assembler (for speed) in hashtable.S.
- */
-extern int flush_hash_pages(unsigned context, unsigned long va,
- unsigned long pmdval, int count);
-
-/* Add an HPTE to the hash table */
-extern void add_hash_page(unsigned context, unsigned long va,
- unsigned long pmdval);
-
-/* Flush an entry from the TLB/hash table */
-extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
- unsigned long address);
-
-/*
* PTE updates. This function is called whenever an existing
* valid PTE is updated. This does -not- include set_pte_at()
* which nowadays only sets a new PTE.
@@ -246,12 +231,6 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
{
unsigned long old;
old = pte_update(ptep, _PAGE_ACCESSED, 0);
-#if _PAGE_HASHPTE != 0
- if (old & _PAGE_HASHPTE) {
- unsigned long ptephys = __pa(ptep) & PAGE_MASK;
- flush_hash_pages(context, addr, ptephys, 1);
- }
-#endif
return (old & _PAGE_ACCESSED) != 0;
}
#define ptep_test_and_clear_young(__vma, __addr, __ptep) \
@@ -261,7 +240,7 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
pte_t *ptep)
{
- return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
+ return __pte(pte_update(ptep, ~0, 0));
}

#define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -289,7 +268,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
}

#define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0)
+#define pte_same(A,B) ((pte_val(A) ^ pte_val(B)) == 0)

/*
* Note that on Book E processors, the pmd contains the kernel virtual
@@ -330,7 +309,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
/*
* Encode and decode a swap entry.
* Note that the bits we use in a PTE for representing a swap entry
- * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
+ * must not include the _PAGE_PRESENT bit.
* -- paulus
*/
#define __swp_type(entry) ((entry).val & 0x1f)
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 5c5f75d005ad..4f6f5a27bfb5 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -173,8 +173,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
/* to find an entry in a kernel page-table-directory */
/* This now only contains the vmalloc pages */
#define pgd_offset_k(address) pgd_offset(&init_mm, address)
-extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
- pte_t *ptep, unsigned long pte, int huge);

/* Atomic PTE updates */
static inline unsigned long pte_update(struct mm_struct *mm,
@@ -205,11 +203,6 @@ static inline unsigned long pte_update(struct mm_struct *mm,
if (!huge)
assert_pte_locked(mm, addr);

-#ifdef CONFIG_PPC_BOOK3S_64
- if (old & _PAGE_HASHPTE)
- hpte_need_flush(mm, addr, ptep, old, huge);
-#endif
-
return old;
}

@@ -218,7 +211,7 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
{
unsigned long old;

- if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
+ if ((pte_val(*ptep) & _PAGE_ACCESSED) == 0)
return 0;
old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
return (old & _PAGE_ACCESSED) != 0;
@@ -312,7 +305,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
}

#define __HAVE_ARCH_PTE_SAME
-#define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
+#define pte_same(A,B) ((pte_val(A) ^ pte_val(B)) == 0)

#define pte_ERROR(e) \
pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
@@ -324,11 +317,6 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
/* Encode and de-code a swap entry */
#define MAX_SWAPFILES_CHECK() do { \
BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
- /* \
- * Don't have overlapping bits with _PAGE_HPTEFLAGS \
- * We filter HPTEFLAGS on set_pte. \
- */ \
- BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
} while (0)
/*
* on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
index c56de1e8026f..f2fe3cbe90af 100644
--- a/arch/powerpc/include/asm/nohash/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/pgtable.h
@@ -148,37 +148,16 @@ extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, pte_t pte, int percpu)
{
-#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
- /* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
- * helper pte_update() which does an atomic update. We need to do that
- * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
- * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
- * the hash bits instead (ie, same as the non-SMP case)
- */
- if (percpu)
- *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
- | (pte_val(pte) & ~_PAGE_HASHPTE));
- else
- pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
-
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
/* Second case is 32-bit with 64-bit PTE. In this case, we
* can just store as long as we do the two halves in the right order
- * with a barrier in between. This is possible because we take care,
- * in the hash code, to pre-invalidate if the PTE was already hashed,
- * which synchronizes us with any concurrent invalidation.
- * In the percpu case, we also fallback to the simple update preserving
- * the hash bits
+ * with a barrier in between.
+ * In the percpu case, we also fallback to the simple update
*/
if (percpu) {
- *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
- | (pte_val(pte) & ~_PAGE_HASHPTE));
+ *ptep = pte;
return;
}
-#if _PAGE_HASHPTE != 0
- if (pte_val(*ptep) & _PAGE_HASHPTE)
- flush_hash_entry(mm, ptep, addr);
-#endif
__asm__ __volatile__("\
stw%U0%X0 %2,%0\n\
eieio\n\
@@ -186,15 +165,6 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
: "r" (pte) : "memory");

-#elif defined(CONFIG_PPC_STD_MMU_32)
- /* Third case is 32-bit hash table in UP mode, we need to preserve
- * the _PAGE_HASHPTE bit since we may not have invalidated the previous
- * translation in the hash yet (done in a subsequent flush_tlb_xxx())
- * and see we need to keep track that this PTE needs invalidating
- */
- *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
- | (pte_val(pte) & ~_PAGE_HASHPTE));
-
#else
/* Anything else just stores the PTE normally. That covers all 64-bit
* cases, and 32-bit non-hash with 32-bit PTEs.
diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
index ccee8eb509bb..9ff51b4c0cac 100644
--- a/arch/powerpc/include/asm/nohash/pte-book3e.h
+++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
@@ -57,7 +57,6 @@
#define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
#define _PAGE_PRIVILEGED (_PAGE_BAP_SR)

-#define _PAGE_HASHPTE 0
#define _PAGE_BUSY 0

#define _PAGE_SPECIAL _PAGE_SW0
--
2.13.3


2018-05-04 13:17:40

by Joakim Tjernlund

[permalink] [raw]
Subject: Re: [PATCH 12/17] powerpc/8xx: Remove PTE_ATOMIC_UPDATES

On Fri, 2018-05-04 at 14:34 +0200, Christophe Leroy wrote:
>
> commit 1bc54c03117b9 ("powerpc: rework 4xx PTE access and TLB miss")
> introduced non atomic PTE updates and started the work of removing
> PTE updates in TLB miss handlers, but kept PTE_ATOMIC_UPDATES for the
> 8xx with the following comment:
> /* Until my rework is finished, 8xx still needs atomic PTE updates */
>
> commit fe11dc3f9628e ("powerpc/8xx: Update TLB asm so it behaves as linux
> mm expects") removed all PTE updates done in TLB miss handlers

Is that my 7 year old commit?

>
> Therefore, atomic PTE updates are not needed anymore for the 8xx

About time removing atomic updates then :)

>
> Signed-off-by: Christophe Leroy <[email protected]>
>

2018-05-08 08:27:03

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH 01/17] powerpc/nohash: remove hash related code from nohash headers.

Christophe Leroy <[email protected]> writes:

> When nohash and book3s header were split, some hash related stuff
> remained in the nohash header. This patch removes them.
>

Thanks for doing this. This was on the TODO list for a long time. When
we did the split for book3s, I mostly copied the generic headers to
nohash.

Reviewed-by: Aneesh Kumar K.V <[email protected]>

> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> Removed the call to pte_young() as it fails, back to using PAGE_ACCESSED directly.
>
> arch/powerpc/include/asm/nohash/32/pgtable.h | 29 +++------------------
> arch/powerpc/include/asm/nohash/64/pgtable.h | 16 ++----------
> arch/powerpc/include/asm/nohash/pgtable.h | 38 +++-------------------------
> arch/powerpc/include/asm/nohash/pte-book3e.h | 1 -
> 4 files changed, 10 insertions(+), 74 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h b/arch/powerpc/include/asm/nohash/32/pgtable.h
> index 03bbd1149530..140f8e74b478 100644
> --- a/arch/powerpc/include/asm/nohash/32/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
> @@ -133,7 +133,7 @@ extern int icache_44x_need_flush;
> #ifndef __ASSEMBLY__
>
> #define pte_clear(mm, addr, ptep) \
> - do { pte_update(ptep, ~_PAGE_HASHPTE, 0); } while (0)
> + do { pte_update(ptep, ~0, 0); } while (0)
>
> #define pmd_none(pmd) (!pmd_val(pmd))
> #define pmd_bad(pmd) (pmd_val(pmd) & _PMD_BAD)
> @@ -146,21 +146,6 @@ static inline void pmd_clear(pmd_t *pmdp)
>
>
> /*
> - * When flushing the tlb entry for a page, we also need to flush the hash
> - * table entry. flush_hash_pages is assembler (for speed) in hashtable.S.
> - */
> -extern int flush_hash_pages(unsigned context, unsigned long va,
> - unsigned long pmdval, int count);
> -
> -/* Add an HPTE to the hash table */
> -extern void add_hash_page(unsigned context, unsigned long va,
> - unsigned long pmdval);
> -
> -/* Flush an entry from the TLB/hash table */
> -extern void flush_hash_entry(struct mm_struct *mm, pte_t *ptep,
> - unsigned long address);
> -
> -/*
> * PTE updates. This function is called whenever an existing
> * valid PTE is updated. This does -not- include set_pte_at()
> * which nowadays only sets a new PTE.
> @@ -246,12 +231,6 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
> {
> unsigned long old;
> old = pte_update(ptep, _PAGE_ACCESSED, 0);
> -#if _PAGE_HASHPTE != 0
> - if (old & _PAGE_HASHPTE) {
> - unsigned long ptephys = __pa(ptep) & PAGE_MASK;
> - flush_hash_pages(context, addr, ptephys, 1);
> - }
> -#endif
> return (old & _PAGE_ACCESSED) != 0;
> }
> #define ptep_test_and_clear_young(__vma, __addr, __ptep) \
> @@ -261,7 +240,7 @@ static inline int __ptep_test_and_clear_young(unsigned int context, unsigned lon
> static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep)
> {
> - return __pte(pte_update(ptep, ~_PAGE_HASHPTE, 0));
> + return __pte(pte_update(ptep, ~0, 0));
> }
>
> #define __HAVE_ARCH_PTEP_SET_WRPROTECT
> @@ -289,7 +268,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
> }
>
> #define __HAVE_ARCH_PTE_SAME
> -#define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HASHPTE) == 0)
> +#define pte_same(A,B) ((pte_val(A) ^ pte_val(B)) == 0)
>
> /*
> * Note that on Book E processors, the pmd contains the kernel virtual
> @@ -330,7 +309,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
> /*
> * Encode and decode a swap entry.
> * Note that the bits we use in a PTE for representing a swap entry
> - * must not include the _PAGE_PRESENT bit or the _PAGE_HASHPTE bit (if used).
> + * must not include the _PAGE_PRESENT bit.
> * -- paulus
> */
> #define __swp_type(entry) ((entry).val & 0x1f)
> diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
> index 5c5f75d005ad..4f6f5a27bfb5 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
> @@ -173,8 +173,6 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
> /* to find an entry in a kernel page-table-directory */
> /* This now only contains the vmalloc pages */
> #define pgd_offset_k(address) pgd_offset(&init_mm, address)
> -extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
> - pte_t *ptep, unsigned long pte, int huge);
>
> /* Atomic PTE updates */
> static inline unsigned long pte_update(struct mm_struct *mm,
> @@ -205,11 +203,6 @@ static inline unsigned long pte_update(struct mm_struct *mm,
> if (!huge)
> assert_pte_locked(mm, addr);
>
> -#ifdef CONFIG_PPC_BOOK3S_64
> - if (old & _PAGE_HASHPTE)
> - hpte_need_flush(mm, addr, ptep, old, huge);
> -#endif
> -
> return old;
> }
>
> @@ -218,7 +211,7 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
> {
> unsigned long old;
>
> - if ((pte_val(*ptep) & (_PAGE_ACCESSED | _PAGE_HASHPTE)) == 0)
> + if ((pte_val(*ptep) & _PAGE_ACCESSED) == 0)
> return 0;
> old = pte_update(mm, addr, ptep, _PAGE_ACCESSED, 0, 0);
> return (old & _PAGE_ACCESSED) != 0;
> @@ -312,7 +305,7 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
> }
>
> #define __HAVE_ARCH_PTE_SAME
> -#define pte_same(A,B) (((pte_val(A) ^ pte_val(B)) & ~_PAGE_HPTEFLAGS) == 0)
> +#define pte_same(A,B) ((pte_val(A) ^ pte_val(B)) == 0)
>
> #define pte_ERROR(e) \
> pr_err("%s:%d: bad pte %08lx.\n", __FILE__, __LINE__, pte_val(e))
> @@ -324,11 +317,6 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
> /* Encode and de-code a swap entry */
> #define MAX_SWAPFILES_CHECK() do { \
> BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > SWP_TYPE_BITS); \
> - /* \
> - * Don't have overlapping bits with _PAGE_HPTEFLAGS \
> - * We filter HPTEFLAGS on set_pte. \
> - */ \
> - BUILD_BUG_ON(_PAGE_HPTEFLAGS & (0x1f << _PAGE_BIT_SWAP_TYPE)); \
> } while (0)
> /*
> * on pte we don't need handle RADIX_TREE_EXCEPTIONAL_SHIFT;
> diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h
> index c56de1e8026f..f2fe3cbe90af 100644
> --- a/arch/powerpc/include/asm/nohash/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/pgtable.h
> @@ -148,37 +148,16 @@ extern void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, pte_t pte, int percpu)
> {
> -#if defined(CONFIG_PPC_STD_MMU_32) && defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
> - /* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
> - * helper pte_update() which does an atomic update. We need to do that
> - * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
> - * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
> - * the hash bits instead (ie, same as the non-SMP case)
> - */
> - if (percpu)
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> - else
> - pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
> -
> -#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
> +#if defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
> /* Second case is 32-bit with 64-bit PTE. In this case, we
> * can just store as long as we do the two halves in the right order
> - * with a barrier in between. This is possible because we take care,
> - * in the hash code, to pre-invalidate if the PTE was already hashed,
> - * which synchronizes us with any concurrent invalidation.
> - * In the percpu case, we also fallback to the simple update preserving
> - * the hash bits
> + * with a barrier in between.
> + * In the percpu case, we also fallback to the simple update
> */
> if (percpu) {
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> + *ptep = pte;
> return;
> }
> -#if _PAGE_HASHPTE != 0
> - if (pte_val(*ptep) & _PAGE_HASHPTE)
> - flush_hash_entry(mm, ptep, addr);
> -#endif
> __asm__ __volatile__("\
> stw%U0%X0 %2,%0\n\
> eieio\n\
> @@ -186,15 +165,6 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
> : "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
> : "r" (pte) : "memory");
>
> -#elif defined(CONFIG_PPC_STD_MMU_32)
> - /* Third case is 32-bit hash table in UP mode, we need to preserve
> - * the _PAGE_HASHPTE bit since we may not have invalidated the previous
> - * translation in the hash yet (done in a subsequent flush_tlb_xxx())
> - * and see we need to keep track that this PTE needs invalidating
> - */
> - *ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
> - | (pte_val(pte) & ~_PAGE_HASHPTE));
> -
> #else
> /* Anything else just stores the PTE normally. That covers all 64-bit
> * cases, and 32-bit non-hash with 32-bit PTEs.
> diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
> index ccee8eb509bb..9ff51b4c0cac 100644
> --- a/arch/powerpc/include/asm/nohash/pte-book3e.h
> +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
> @@ -57,7 +57,6 @@
> #define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
> #define _PAGE_PRIVILEGED (_PAGE_BAP_SR)
>
> -#define _PAGE_HASHPTE 0
> #define _PAGE_BUSY 0
>
> #define _PAGE_SPECIAL _PAGE_SW0
> --
> 2.13.3


2018-05-08 08:29:14

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH 02/17] powerpc/nohash: remove _PAGE_BUSY

Christophe Leroy <[email protected]> writes:

> _PAGE_BUSY is always 0, remove it
>

Reviewed-by: Aneesh Kumar K.V <[email protected]>

> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> arch/powerpc/include/asm/nohash/64/pgtable.h | 10 +++-------
> arch/powerpc/include/asm/nohash/pte-book3e.h | 5 -----
> 2 files changed, 3 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
> index 4f6f5a27bfb5..c3559d7a94fb 100644
> --- a/arch/powerpc/include/asm/nohash/64/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
> @@ -186,14 +186,12 @@ static inline unsigned long pte_update(struct mm_struct *mm,
>
> __asm__ __volatile__(
> "1: ldarx %0,0,%3 # pte_update\n\
> - andi. %1,%0,%6\n\
> - bne- 1b \n\
> andc %1,%0,%4 \n\
> - or %1,%1,%7\n\
> + or %1,%1,%6\n\
> stdcx. %1,0,%3 \n\
> bne- 1b"
> : "=&r" (old), "=&r" (tmp), "=m" (*ptep)
> - : "r" (ptep), "r" (clr), "m" (*ptep), "i" (_PAGE_BUSY), "r" (set)
> + : "r" (ptep), "r" (clr), "m" (*ptep), "r" (set)
> : "cc" );
> #else
> unsigned long old = pte_val(*ptep);
> @@ -290,13 +288,11 @@ static inline void __ptep_set_access_flags(struct mm_struct *mm,
>
> __asm__ __volatile__(
> "1: ldarx %0,0,%4\n\
> - andi. %1,%0,%6\n\
> - bne- 1b \n\
> or %0,%3,%0\n\
> stdcx. %0,0,%4\n\
> bne- 1b"
> :"=&r" (old), "=&r" (tmp), "=m" (*ptep)
> - :"r" (bits), "r" (ptep), "m" (*ptep), "i" (_PAGE_BUSY)
> + :"r" (bits), "r" (ptep), "m" (*ptep)
> :"cc");
> #else
> unsigned long old = pte_val(*ptep);
> diff --git a/arch/powerpc/include/asm/nohash/pte-book3e.h b/arch/powerpc/include/asm/nohash/pte-book3e.h
> index 9ff51b4c0cac..12730b81cd98 100644
> --- a/arch/powerpc/include/asm/nohash/pte-book3e.h
> +++ b/arch/powerpc/include/asm/nohash/pte-book3e.h
> @@ -57,13 +57,8 @@
> #define _PAGE_USER (_PAGE_BAP_UR | _PAGE_BAP_SR) /* Can be read */
> #define _PAGE_PRIVILEGED (_PAGE_BAP_SR)
>
> -#define _PAGE_BUSY 0
> -
> #define _PAGE_SPECIAL _PAGE_SW0
>
> -/* Flags to be preserved on PTE modifications */
> -#define _PAGE_HPTEFLAGS _PAGE_BUSY
> -
> /* Base page size */
> #ifdef CONFIG_PPC_64K_PAGES
> #define _PAGE_PSIZE _PAGE_PSIZE_64K
> --
> 2.13.3


2018-05-08 09:58:02

by Aneesh Kumar K.V

[permalink] [raw]
Subject: Re: [PATCH 09/17] powerpc: make __ioremap_caller() common to PPC32 and PPC64

Christophe Leroy <[email protected]> writes:

> Signed-off-by: Christophe Leroy <[email protected]>
> ---
> arch/powerpc/include/asm/book3s/64/pgtable.h | 1 +
> arch/powerpc/mm/ioremap.c | 126 +++++++--------------------
> 2 files changed, 34 insertions(+), 93 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index c5c6ead06bfb..2bebdd8302cb 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -18,6 +18,7 @@
> #define _PAGE_RO 0
> #define _PAGE_USER 0
> #define _PAGE_HWWRITE 0
> +#define _PAGE_COHERENT 0

This is something I was trying to avoid when I split the headers. We do
support _PAGE_USER; it is !_PAGE_PRIVILEGED. It gets really confusing
when we have these conflicting names because we are trying to make code
common across platforms.


>
> #define _PAGE_EXEC 0x00001 /* execute permission */
> #define _PAGE_WRITE 0x00002 /* write access allowed */
> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index 65d611d44d38..59be5dfcb3e9 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -33,95 +33,6 @@ unsigned long ioremap_bot;
> unsigned long ioremap_bot = IOREMAP_BASE;
> #endif
>
> -#ifdef CONFIG_PPC32
> -
> -void __iomem *
> -__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
> - void *caller)
> -{
> - unsigned long v, i;
> - phys_addr_t p;
> - int err;
> -
> - /* Make sure we have the base flags */
> - if ((flags & _PAGE_PRESENT) == 0)
> - flags |= pgprot_val(PAGE_KERNEL);
> -
> - /* Non-cacheable page cannot be coherent */
> - if (flags & _PAGE_NO_CACHE)
> - flags &= ~_PAGE_COHERENT;
> -
> - /*
> - * Choose an address to map it to.
> - * Once the vmalloc system is running, we use it.
> - * Before then, we use space going up from IOREMAP_BASE
> - * (ioremap_bot records where we're up to).
> - */
> - p = addr & PAGE_MASK;
> - size = PAGE_ALIGN(addr + size) - p;
> -
> - /*
> - * If the address lies within the first 16 MB, assume it's in ISA
> - * memory space
> - */
> - if (p < 16*1024*1024)
> - p += _ISA_MEM_BASE;
> -
> -#ifndef CONFIG_CRASH_DUMP
> - /*
> - * Don't allow anybody to remap normal RAM that we're using.
> - * mem_init() sets high_memory so only do the check after that.
> - */
> - if (slab_is_available() && (p < virt_to_phys(high_memory)) &&
> - page_is_ram(__phys_to_pfn(p))) {
> - printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
> - (unsigned long long)p, __builtin_return_address(0));
> - return NULL;
> - }
> -#endif
> -
> - if (size == 0)
> - return NULL;
> -
> - /*
> - * Is it already mapped? Perhaps overlapped by a previous
> - * mapping.
> - */
> - v = p_block_mapped(p);
> - if (v)
> - goto out;
> -
> - if (slab_is_available()) {
> - struct vm_struct *area;
> - area = get_vm_area_caller(size, VM_IOREMAP, caller);
> - if (area == 0)
> - return NULL;
> - area->phys_addr = p;
> - v = (unsigned long) area->addr;
> - } else {
> - v = ioremap_bot;
> - ioremap_bot += size;
> - }
> -
> - /*
> - * Should check if it is a candidate for a BAT mapping
> - */
> -
> - err = 0;
> - for (i = 0; i < size && err == 0; i += PAGE_SIZE)
> - err = map_kernel_page(v+i, p+i, flags);
> - if (err) {
> - if (slab_is_available())
> - vunmap((void *)v);
> - return NULL;
> - }
> -
> -out:
> - return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
> -}
> -
> -#else
> -
> /**
> * __ioremap_at - Low level function to establish the page tables
> * for an IO mapping
> @@ -135,6 +46,10 @@ void __iomem * __ioremap_at(phys_addr_t pa, void *ea, unsigned long size,
> if ((flags & _PAGE_PRESENT) == 0)
> flags |= pgprot_val(PAGE_KERNEL);
>
> + /* Non-cacheable page cannot be coherent */
> + if (flags & _PAGE_NO_CACHE)
> + flags &= ~_PAGE_COHERENT;
> +
> /* We don't support the 4K PFN hack with ioremap */
> if (flags & H_PAGE_4K_PFN)
> return NULL;
> @@ -187,6 +102,33 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
> if ((size == 0) || (paligned == 0))
> return NULL;
>
> + /*
> + * If the address lies within the first 16 MB, assume it's in ISA
> + * memory space
> + */
> + if (IS_ENABLED(CONFIG_PPC32) && paligned < 16*1024*1024)
> + paligned += _ISA_MEM_BASE;
> +
> + /*
> + * Don't allow anybody to remap normal RAM that we're using.
> + * mem_init() sets high_memory so only do the check after that.
> + */
> + if (!IS_ENABLED(CONFIG_CRASH_DUMP) &&
> + slab_is_available() && (paligned < virt_to_phys(high_memory)) &&
> + page_is_ram(__phys_to_pfn(paligned))) {
> + printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
> + (u64)paligned, __builtin_return_address(0));
> + return NULL;
> + }
> +
> + /*
> + * Is it already mapped? Perhaps overlapped by a previous
> + * mapping.
> + */
> + ret = (void __iomem *)p_block_mapped(paligned);
> + if (ret)
> + goto out;
> +
> if (slab_is_available()) {
> struct vm_struct *area;
>
> @@ -205,14 +147,12 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
> if (ret)
> ioremap_bot += size;
> }
> -
> +out:
> if (ret)
> - ret += addr & ~PAGE_MASK;
> + ret += (unsigned long)addr & ~PAGE_MASK;
> return ret;
> }
>
> -#endif
> -
> /*
> * Unmap an IO region and remove it from imalloc'd list.
> * Access to IO memory should be serialized by driver.
> --
> 2.13.3


2018-05-11 06:02:06

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 05/17] powerpc: move io mapping functions into ioremap.c

Christophe Leroy <[email protected]> writes:

> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> new file mode 100644
> index 000000000000..5d2645193568
> --- /dev/null
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -0,0 +1,350 @@
> +/*
> + * This file contains the routines for mapping IO areas
> + *
> + * Derived from arch/powerpc/mm/pgtable_32.c and
> + * arch/powerpc/mm/pgtable_64.c
> + *
> + * SPDX-License-Identifier: GPL-2.0

This goes at the top of the file as:

// SPDX-License-Identifier: GPL-2.0

See:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/license-rules.rst?#n56

> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/types.h>
> +#include <linux/mm.h>
> +#include <linux/vmalloc.h>
> +#include <linux/init.h>
> +#include <linux/highmem.h>
> +#include <linux/memblock.h>
> +#include <linux/slab.h>
> +
> +#include <asm/pgtable.h>
> +#include <asm/pgalloc.h>
> +#include <asm/fixmap.h>
> +#include <asm/io.h>
> +#include <asm/setup.h>
> +#include <asm/sections.h>

I needed:

+#include <asm/machdep.h>

To fix:

../arch/powerpc/mm/ioremap.c: In function ‘ioremap_wc’:
../arch/powerpc/mm/ioremap.c:290:6: error: ‘ppc_md’ undeclared (first use in this function)
if (ppc_md.ioremap)
^~~~~~

cheers

2018-05-11 06:46:19

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 11/17] powerpc/nohash32: set GUARDED attribute in the PMD directly

Christophe Leroy <[email protected]> writes:

> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
> index 59be5dfcb3e9..b8c347077e02 100644
> --- a/arch/powerpc/mm/ioremap.c
> +++ b/arch/powerpc/mm/ioremap.c
> @@ -132,9 +132,14 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
> if (slab_is_available()) {
> struct vm_struct *area;
>
> - area = __get_vm_area_caller(size, VM_IOREMAP,
> - ioremap_bot, IOREMAP_END,
> - caller);
> + if (flags & _PAGE_GUARDED)

On 64-bit configs:

arch/powerpc/mm/ioremap.c:135:15: error: '_PAGE_GUARDED' undeclared (first use in this function)

cheers

2018-05-11 06:49:12

by Michael Ellerman

[permalink] [raw]
Subject: Re: [PATCH 00/17] Implement use of HW assistance on TLB table walk on 8xx

Christophe Leroy <[email protected]> writes:

> The purpose of this serie is to implement hardware assistance for TLB table walk
> on the 8xx.
>
> First part is to make L1 entries and L2 entries independant.
> For that, we need to alter ioremap functions in order to handle GUARD attribute
> at the PGD/PMD level.
>
> Last part is to try and reuse PTE fragment implemented on PPC64 in order to
> not waste 16k Pages for page tables as only 4k are used. For the time being,
> it doesn't work, but I include it in the serie anyway in order to get feedback.
>
> Tested successfully on 8xx up to the one before the last.
>
> Didn't have time to do compilation test on other configs, I send it anyway
> before leaving for one week vacation in order to get feedback.

I replied to a few patches, here's some other build errors:


arch/powerpc/mm/ioremap.c:135:15: error: '_PAGE_GUARDED' undeclared (first use in this function):
pseries_defconfig/powerpc

arch/powerpc/include/asm/book3s/32/pgtable.h:53:19: error: 'PKMAP_BASE' undeclared (first use in this function):
pmac32_defconfig/powerpc-5.3

include/linux/mm.h:533:41: error: 'PKMAP_BASE' undeclared (first use in this function):
pmac32_defconfig/powerpc

ERROR: "ioremap_bot" [net/netfilter/nf_conntrack.ko] undefined!:
linkstation_defconfig/powerpc

ERROR: "ioremap_bot" [fs/xfs/xfs.ko] undefined!:
linkstation_defconfig/powerpc

arch/powerpc/include/asm/nohash/32/pgtable.h:80:20: error: 'PKMAP_BASE' undeclared (first use in this function):
corenet32_smp_defconfig/powerpc-5.3

arch/powerpc/include/asm/nohash/32/pgalloc.h:64:43: error: '_PMD_GUARDED' undeclared (first use in this function):
ppc40x_defconfig/powerpc-5.3

ERROR: "ioremap_bot" [net/packet/af_packet.ko] undefined!:
storcenter_defconfig/powerpc

ERROR: "ioremap_bot" [drivers/usb/core/usbcore.ko] undefined!:
ppc44x_defconfig/powerpc

cheers

2018-05-16 09:59:57

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 09/17] powerpc: make __ioremap_caller() common to PPC32 and PPC64



On 08/05/2018 at 11:56, Aneesh Kumar K.V wrote:
> Christophe Leroy <[email protected]> writes:
>
>> Signed-off-by: Christophe Leroy <[email protected]>
>> ---
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 1 +
>> arch/powerpc/mm/ioremap.c | 126 +++++++--------------------
>> 2 files changed, 34 insertions(+), 93 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index c5c6ead06bfb..2bebdd8302cb 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -18,6 +18,7 @@
>> #define _PAGE_RO 0
>> #define _PAGE_USER 0
>> #define _PAGE_HWWRITE 0
>> +#define _PAGE_COHERENT 0
>
> This is something I was trying to avoid when I split the headers. We do
> support _PAGE_USER; it is !_PAGE_PRIVILEGED. It gets really confusing
> when we have these conflicting names because we are trying to make code
> common across platforms.

Uh ... the patch here adds _PAGE_COHERENT.

_PAGE_USER was added some time ago.

Well, we have three cases:

BOOK3S64 and NOHASH32/8xx have _PAGE_PRIVILEGED and no _PAGE_USER
BOOKE has both _PAGE_PRIVILEGED and _PAGE_USER
Others have _PAGE_USER and no _PAGE_PRIVILEGED

So when giving user rights to a page, some will set _PAGE_USER, some
will unset _PAGE_PRIVILEGED and some will do both.

_PAGE_USER and _PAGE_PRIVILEGED being used outside of the subarch headers,
- either we have to add ugly ifdefs
- or we can just make sure unused flags are defined as 0, so that (x | 0) and
(x & ~0) do nothing and are eliminated by the compiler (see the sketch below).
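
To make that concrete, here is a purely illustrative sketch (the helper
name and the bit value are made up, they are not taken from the kernel
headers): with the unsupported flag defined as 0, a single generic line
compiles down to the right thing on every subarch, with no ifdef.

/* hypothetical subarch that only has _PAGE_PRIVILEGED */
#define _PAGE_PRIVILEGED	0x0008	/* made-up bit value */
#define _PAGE_USER		0	/* flag not supported, so defined as 0 */

static inline unsigned long make_user(unsigned long flags)
{
	/* "| _PAGE_USER" is "| 0" here and is folded away by the compiler */
	return (flags & ~_PAGE_PRIVILEGED) | _PAGE_USER;
}

On a subarch that has _PAGE_USER but no _PAGE_PRIVILEGED, it is the
"& ~0" part that vanishes instead.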

Today, this is done in asm/pte-common.h. Unfortunately, all headers
except book3s64's include pte-common.h, which is why the 0 definitions
have to go into the book3s64 header itself.

Another solution would be to make sure _PAGE_xxx flags are not used
outside of subarch-specific headers. That would mean having specific
helpers defined in each subarch header, but is it really worth it?
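
For comparison, that alternative would look something like the sketch
below, duplicated in every subarch header (the helper and its placement
are hypothetical, not something this series adds):

/* in a header whose subarch only has _PAGE_PRIVILEGED */
static inline pte_t pte_mkuser(pte_t pte)
{
	return __pte(pte_val(pte) & ~_PAGE_PRIVILEGED);
}

/* in a header whose subarch only has _PAGE_USER */
static inline pte_t pte_mkuser(pte_t pte)
{
	return __pte(pte_val(pte) | _PAGE_USER);
}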




Let's take the example of an even trickier one:
- Some subarchs have _PAGE_RW, others have _PAGE_RO instead. In
addition, the 8xx has _PAGE_NA
- Book3s64 has _PAGE_READ and _PAGE_WRITE.
- In some places, _PAGE_RW has been redefined as _PAGE_READ | _PAGE_WRITE

It has really become pretty complex. Why define new flags instead of
using _PAGE_RO for _PAGE_READ and _PAGE_RW for _PAGE_READ | _PAGE_WRITE?
Does it make any sense to be able to set _PAGE_WRITE without _PAGE_READ?
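
Spelled out purely for illustration (these are not the actual
definitions, and the bit values are placeholders), the suggestion
amounts to:

#define _PAGE_WRITE	0x00002	/* placeholder value */
#define _PAGE_READ	0x00004	/* placeholder value */
/* reuse the long-standing names instead of introducing new ones */
#define _PAGE_RO	_PAGE_READ
#define _PAGE_RW	(_PAGE_READ | _PAGE_WRITE)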


I feel like having simple generic code like:

flags = (flags & ~_PAGE_PRIVILEGED) | _PAGE_USER;

is better than having almost the same code duplicated in several places or
ugly ifdefs like:

#if defined(CONFIG_BOOK3S64) || defined(CONFIG_PPC_8xx) || defined(CONFIG_BOOKE)
flags &= ~_PAGE_PRIVILEGED;
#endif
#if !defined(CONFIG_BOOK3S64) && !defined(CONFIG_PPC_8xx)
flags |= _PAGE_USER;
#endif


It looks to me that patch
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=812fadcb941a81d1f3948b10a95a4dce663da3e4
allowed a nice code simplification, don't you feel the same?

Christophe

>
>
>>
>> #define _PAGE_EXEC 0x00001 /* execute permission */
>> #define _PAGE_WRITE 0x00002 /* write access allowed */
>> diff --git a/arch/powerpc/mm/ioremap.c b/arch/powerpc/mm/ioremap.c
>> index 65d611d44d38..59be5dfcb3e9 100644
>> --- a/arch/powerpc/mm/ioremap.c
>> +++ b/arch/powerpc/mm/ioremap.c
>> @@ -33,95 +33,6 @@ unsigned long ioremap_bot;
>> unsigned long ioremap_bot = IOREMAP_BASE;
>> #endif
>>
>> -#ifdef CONFIG_PPC32
>> -
>> -void __iomem *
>> -__ioremap_caller(phys_addr_t addr, unsigned long size, unsigned long flags,
>> - void *caller)
>> -{
>> - unsigned long v, i;
>> - phys_addr_t p;
>> - int err;
>> -
>> - /* Make sure we have the base flags */
>> - if ((flags & _PAGE_PRESENT) == 0)
>> - flags |= pgprot_val(PAGE_KERNEL);
>> -
>> - /* Non-cacheable page cannot be coherent */
>> - if (flags & _PAGE_NO_CACHE)
>> - flags &= ~_PAGE_COHERENT;
>> -
>> - /*
>> - * Choose an address to map it to.
>> - * Once the vmalloc system is running, we use it.
>> - * Before then, we use space going up from IOREMAP_BASE
>> - * (ioremap_bot records where we're up to).
>> - */
>> - p = addr & PAGE_MASK;
>> - size = PAGE_ALIGN(addr + size) - p;
>> -
>> - /*
>> - * If the address lies within the first 16 MB, assume it's in ISA
>> - * memory space
>> - */
>> - if (p < 16*1024*1024)
>> - p += _ISA_MEM_BASE;
>> -
>> -#ifndef CONFIG_CRASH_DUMP
>> - /*
>> - * Don't allow anybody to remap normal RAM that we're using.
>> - * mem_init() sets high_memory so only do the check after that.
>> - */
>> - if (slab_is_available() && (p < virt_to_phys(high_memory)) &&
>> - page_is_ram(__phys_to_pfn(p))) {
>> - printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
>> - (unsigned long long)p, __builtin_return_address(0));
>> - return NULL;
>> - }
>> -#endif
>> -
>> - if (size == 0)
>> - return NULL;
>> -
>> - /*
>> - * Is it already mapped? Perhaps overlapped by a previous
>> - * mapping.
>> - */
>> - v = p_block_mapped(p);
>> - if (v)
>> - goto out;
>> -
>> - if (slab_is_available()) {
>> - struct vm_struct *area;
>> - area = get_vm_area_caller(size, VM_IOREMAP, caller);
>> - if (area == 0)
>> - return NULL;
>> - area->phys_addr = p;
>> - v = (unsigned long) area->addr;
>> - } else {
>> - v = ioremap_bot;
>> - ioremap_bot += size;
>> - }
>> -
>> - /*
>> - * Should check if it is a candidate for a BAT mapping
>> - */
>> -
>> - err = 0;
>> - for (i = 0; i < size && err == 0; i += PAGE_SIZE)
>> - err = map_kernel_page(v+i, p+i, flags);
>> - if (err) {
>> - if (slab_is_available())
>> - vunmap((void *)v);
>> - return NULL;
>> - }
>> -
>> -out:
>> - return (void __iomem *) (v + ((unsigned long)addr & ~PAGE_MASK));
>> -}
>> -
>> -#else
>> -
>> /**
>> * __ioremap_at - Low level function to establish the page tables
>> * for an IO mapping
>> @@ -135,6 +46,10 @@ void __iomem * __ioremap_at(phys_addr_t pa, void *ea, unsigned long size,
>> if ((flags & _PAGE_PRESENT) == 0)
>> flags |= pgprot_val(PAGE_KERNEL);
>>
>> + /* Non-cacheable page cannot be coherent */
>> + if (flags & _PAGE_NO_CACHE)
>> + flags &= ~_PAGE_COHERENT;
>> +
>> /* We don't support the 4K PFN hack with ioremap */
>> if (flags & H_PAGE_4K_PFN)
>> return NULL;
>> @@ -187,6 +102,33 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
>> if ((size == 0) || (paligned == 0))
>> return NULL;
>>
>> + /*
>> + * If the address lies within the first 16 MB, assume it's in ISA
>> + * memory space
>> + */
>> + if (IS_ENABLED(CONFIG_PPC32) && paligned < 16*1024*1024)
>> + paligned += _ISA_MEM_BASE;
>> +
>> + /*
>> + * Don't allow anybody to remap normal RAM that we're using.
>> + * mem_init() sets high_memory so only do the check after that.
>> + */
>> + if (!IS_ENABLED(CONFIG_CRASH_DUMP) &&
>> + slab_is_available() && (paligned < virt_to_phys(high_memory)) &&
>> + page_is_ram(__phys_to_pfn(paligned))) {
>> + printk("__ioremap(): phys addr 0x%llx is RAM lr %ps\n",
>> + (u64)paligned, __builtin_return_address(0));
>> + return NULL;
>> + }
>> +
>> + /*
>> + * Is it already mapped? Perhaps overlapped by a previous
>> + * mapping.
>> + */
>> + ret = (void __iomem *)p_block_mapped(paligned);
>> + if (ret)
>> + goto out;
>> +
>> if (slab_is_available()) {
>> struct vm_struct *area;
>>
>> @@ -205,14 +147,12 @@ void __iomem * __ioremap_caller(phys_addr_t addr, unsigned long size,
>> if (ret)
>> ioremap_bot += size;
>> }
>> -
>> +out:
>> if (ret)
>> - ret += addr & ~PAGE_MASK;
>> + ret += (unsigned long)addr & ~PAGE_MASK;
>> return ret;
>> }
>>
>> -#endif
>> -
>> /*
>> * Unmap an IO region and remove it from imalloc'd list.
>> * Access to IO memory should be serialized by driver.
>> --
>> 2.13.3

2018-05-16 10:14:33

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 05/17] powerpc: move io mapping functions into ioremap.c



On 11/05/2018 at 08:01, Michael Ellerman wrote:
> Christophe Leroy <[email protected]> writes:
>

[...]

>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/types.h>
>> +#include <linux/mm.h>
>> +#include <linux/vmalloc.h>
>> +#include <linux/init.h>
>> +#include <linux/highmem.h>
>> +#include <linux/memblock.h>
>> +#include <linux/slab.h>
>> +
>> +#include <asm/pgtable.h>
>> +#include <asm/pgalloc.h>
>> +#include <asm/fixmap.h>
>> +#include <asm/io.h>
>> +#include <asm/setup.h>
>> +#include <asm/sections.h>
>
> I needed:
>
> +#include <asm/machdep.h>

Oops, yes, it was added in the following patch. OK, I'll move it one patch back.

Thanks
Christophe

>
> To fix:
>
> ../arch/powerpc/mm/ioremap.c: In function ‘ioremap_wc’:
> ../arch/powerpc/mm/ioremap.c:290:6: error: ‘ppc_md’ undeclared (first use in this function)
> if (ppc_md.ioremap)
> ^~~~~~
>
> cheers
>

2018-05-16 10:18:49

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH 00/17] Implement use of HW assistance on TLB table walk on 8xx



On 11/05/2018 at 08:48, Michael Ellerman wrote:
> Christophe Leroy <[email protected]> writes:
>
>> The purpose of this serie is to implement hardware assistance for TLB table walk
>> on the 8xx.
>>
>> First part is to make L1 entries and L2 entries independant.
>> For that, we need to alter ioremap functions in order to handle GUARD attribute
>> at the PGD/PMD level.
>>
>> Last part is to try and reuse PTE fragment implemented on PPC64 in order to
>> not waste 16k Pages for page tables as only 4k are used. For the time being,
>> it doesn't work, but I include it in the serie anyway in order to get feedback.
>>
>> Tested successfully on 8xx up to the one before the last.
>>
>> Didn't have time to do compilation test on other configs, I send it anyway
>> before leaving for one week vacation in order to get feedback.
>
> I replied to a few patches, here's some other build errors:
>
>
> arch/powerpc/mm/ioremap.c:135:15: error: '_PAGE_GUARDED' undeclared (first use in this function):
> pseries_defconfig/powerpc
>
> arch/powerpc/include/asm/book3s/32/pgtable.h:53:19: error: 'PKMAP_BASE' undeclared (first use in this function):
> pmac32_defconfig/powerpc-5.3
>
> include/linux/mm.h:533:41: error: 'PKMAP_BASE' undeclared (first use in this function):
> pmac32_defconfig/powerpc
>
> ERROR: "ioremap_bot" [net/netfilter/nf_conntrack.ko] undefined!:
> linkstation_defconfig/powerpc
>
> ERROR: "ioremap_bot" [fs/xfs/xfs.ko] undefined!:
> linkstation_defconfig/powerpc
>
> arch/powerpc/include/asm/nohash/32/pgtable.h:80:20: error: 'PKMAP_BASE' undeclared (first use in this function):
> corenet32_smp_defconfig/powerpc-5.3
>
> arch/powerpc/include/asm/nohash/32/pgalloc.h:64:43: error: '_PMD_GUARDED' undeclared (first use in this function):
> ppc40x_defconfig/powerpc-5.3
>
> ERROR: "ioremap_bot" [net/packet/af_packet.ko] undefined!:
> storcenter_defconfig/powerpc
>
> ERROR: "ioremap_bot" [drivers/usb/core/usbcore.ko] undefined!:
> ppc44x_defconfig/powerpc
>

Thanks for testing. I have now fixed all of them in v2.

For PKMAP_BASE, I had to move it from asm/highmem.h into
book3s/32/pgtable.h and nohash/32/pgtable.h, because including
asm/highmem.h in the pgtable.h files was introducing a circular dependency.
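
Roughly, the idea is the sketch below; the exact expression and how
LAST_PKMAP becomes visible there are only assumptions here, v2 has the
real form:

/* in book3s/32/pgtable.h and nohash/32/pgtable.h, instead of relying
 * on asm/highmem.h (expression assumed, taken from the usual 32-bit
 * highmem layout)
 */
#ifdef CONFIG_HIGHMEM
#define PKMAP_BASE	((FIXADDR_START - PAGE_SIZE * (LAST_PKMAP + 1)) & PMD_MASK)
#endif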

Christophe