2020-05-11 20:44:37

by Will Deacon

Subject: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

Now that the page table allocator can free page table allocations
smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
to avoid needlessly wasting memory.

Cc: "David S. Miller" <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
---
arch/sparc/include/asm/pgtsrmmu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 58ea8e8c6ee7..7708d015712b 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,8 +17,8 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS 65536

-#define SRMMU_PTE_TABLE_SIZE (PAGE_SIZE)
-#define SRMMU_PMD_TABLE_SIZE (PAGE_SIZE)
+#define SRMMU_PTE_TABLE_SIZE (PTRS_PER_PTE*4)
+#define SRMMU_PMD_TABLE_SIZE (PTRS_PER_PMD*4)
 #define SRMMU_PGD_TABLE_SIZE (PTRS_PER_PGD*4)

 /* Definition of the values in the ET field of PTD's and PTE's */
--
2.26.2.645.ge9eca65c58-goog


Subject: [tip: locking/kcsan] sparc32: mm: Reduce allocation size for PMD and PTE tables

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID: 2443600dc98fdc91661b2e24184f279d1198f8cc
Gitweb: https://git.kernel.org/tip/2443600dc98fdc91661b2e24184f279d1198f8cc
Author: Will Deacon <[email protected]>
AuthorDate: Mon, 11 May 2020 21:41:36 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Tue, 12 May 2020 11:04:09 +02:00

sparc32: mm: Reduce allocation size for PMD and PTE tables

Now that the page table allocator can free page table allocations
smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
to avoid needlessly wasting memory.

Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: "David S. Miller" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]

---
arch/sparc/include/asm/pgtsrmmu.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/sparc/include/asm/pgtsrmmu.h b/arch/sparc/include/asm/pgtsrmmu.h
index 58ea8e8..7708d01 100644
--- a/arch/sparc/include/asm/pgtsrmmu.h
+++ b/arch/sparc/include/asm/pgtsrmmu.h
@@ -17,8 +17,8 @@
 /* Number of contexts is implementation-dependent; 64k is the most we support */
 #define SRMMU_MAX_CONTEXTS 65536

-#define SRMMU_PTE_TABLE_SIZE (PAGE_SIZE)
-#define SRMMU_PMD_TABLE_SIZE (PAGE_SIZE)
+#define SRMMU_PTE_TABLE_SIZE (PTRS_PER_PTE*4)
+#define SRMMU_PMD_TABLE_SIZE (PTRS_PER_PMD*4)
 #define SRMMU_PGD_TABLE_SIZE (PTRS_PER_PGD*4)

 /* Definition of the values in the ET field of PTD's and PTE's */

2020-05-17 00:06:00

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> Now that the page table allocator can free page table allocations
> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> to avoid needlessly wasting memory.
>
> Cc: "David S. Miller" <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Signed-off-by: Will Deacon <[email protected]>

Something in the sparc32 patches in linux-next causes all my sparc32 emulations
to crash. bisect points to this patch, but reverting it doesn't help, and neither
does reverting the rest of the series.

Guenter

---
Bisect log:

# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# bad: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect bad 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# bad: [1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e] Merge remote-tracking branch 'hwmon-staging/hwmon-next'
git bisect bad 1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e
# good: [dccfae3ab84387c94f2efc574d41efae005eeee5] Merge remote-tracking branch 'tegra/for-next'
git bisect good dccfae3ab84387c94f2efc574d41efae005eeee5
# bad: [20f9d1287c9f0047b81497197c9f4893485bbe15] Merge remote-tracking branch 'djw-vfs/vfs-for-next'
git bisect bad 20f9d1287c9f0047b81497197c9f4893485bbe15
# bad: [6537897637b5b91f921cb0ac6c465a593f4a665e] Merge remote-tracking branch 'sparc-next/master'
git bisect bad 6537897637b5b91f921cb0ac6c465a593f4a665e
# good: [bca1583e0693e0ba76450b684c5910f7083eeef4] Merge remote-tracking branch 'mips/mips-next'
git bisect good bca1583e0693e0ba76450b684c5910f7083eeef4
# good: [1f12096aca212af8fad3ef58d5673cde691a1452] Merge the lockless page table walk rework into next
git bisect good 1f12096aca212af8fad3ef58d5673cde691a1452
# good: [23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0] s390: nvme reipl
git bisect good 23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0
# good: [f57f5010c0c3fe2d924a957ddf1d17fbebb54d47] Merge remote-tracking branch 'risc-v/for-next'
git bisect good f57f5010c0c3fe2d924a957ddf1d17fbebb54d47
# good: [1d5fd6c33b04e5d5b665446c3b56f2148f0f1272] sh: add missing DECLARE_EXPORT() for __ashiftrt_r4_xx
git bisect good 1d5fd6c33b04e5d5b665446c3b56f2148f0f1272
# bad: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables
git bisect bad 8c8f3156dd40f8bdc58f2ac461374bc804c28e3b
# good: [8e958839e4b9fb6ea4385ff2c52d1333a3a618de] sparc32: mm: Restructure sparc32 MMU page-table layout
git bisect good 8e958839e4b9fb6ea4385ff2c52d1333a3a618de
# good: [3f407976ac2953116cb8880a7a18b63bcc81829d] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
git bisect good 3f407976ac2953116cb8880a7a18b63bcc81829d
# first bad commit: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables

---
Log messages:

Lots of:

BUG: scheduling while atomic: kthreadd/2/0xffffffff
Modules linked in:
CPU: 0 PID: 2 Comm: kthreadd Tainted: G W 5.7.0-rc5-next-20200515 #1
[f04f2c94 :
here+0x16c/0x250 ]
[f04f2df0 :
schedule+0x78/0x11c ]
[f003f100 :
kthreadd+0x188/0x1a4 ]
[f0008448 :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

followed by:

Kernel panic - not syncing: Aiee, killing interrupt handler!
CPU: 0 PID: 19 Comm: cryptomgr_test Tainted: G W 5.7.0-rc5-next-20200515 #1
[f0024400 :
do_exit+0x7c8/0xa88 ]
[f0075540 :
__module_put_and_exit+0xc/0x18 ]
[f0221428 :
cryptomgr_test+0x28/0x48 ]
[f003edc0 :
kthread+0xf4/0x12c ]
[f0008448 :
ret_from_kernel_thread+0xc/0x38 ]
[00000000 :
0x0 ]

2020-05-17 00:11:53

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > Now that the page table allocator can free page table allocations
> > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > to avoid needlessly wasting memory.
> >
> > Cc: "David S. Miller" <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Signed-off-by: Will Deacon <[email protected]>
>
> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> does reverting the rest of the series.
>
Actually, turns out I see the same pattern (lots of scheduling while atomic
followed by 'killing interrupt handler' in cryptomgr_test) with several
powerpc boot tests. I am currently bisecting those crashes. I'll report
the results here as well, as soon as I have them.

Guenter

> Guenter
>
> ---
> Bisect log:
>
> # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect start 'HEAD' 'v5.7-rc5'
> # bad: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
> git bisect bad 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
> # bad: [1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e] Merge remote-tracking branch 'hwmon-staging/hwmon-next'
> git bisect bad 1ab4d6ff0a3ee4b29441d8b0076bc8d4734bd16e
> # good: [dccfae3ab84387c94f2efc574d41efae005eeee5] Merge remote-tracking branch 'tegra/for-next'
> git bisect good dccfae3ab84387c94f2efc574d41efae005eeee5
> # bad: [20f9d1287c9f0047b81497197c9f4893485bbe15] Merge remote-tracking branch 'djw-vfs/vfs-for-next'
> git bisect bad 20f9d1287c9f0047b81497197c9f4893485bbe15
> # bad: [6537897637b5b91f921cb0ac6c465a593f4a665e] Merge remote-tracking branch 'sparc-next/master'
> git bisect bad 6537897637b5b91f921cb0ac6c465a593f4a665e
> # good: [bca1583e0693e0ba76450b684c5910f7083eeef4] Merge remote-tracking branch 'mips/mips-next'
> git bisect good bca1583e0693e0ba76450b684c5910f7083eeef4
> # good: [1f12096aca212af8fad3ef58d5673cde691a1452] Merge the lockless page table walk rework into next
> git bisect good 1f12096aca212af8fad3ef58d5673cde691a1452
> # good: [23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0] s390: nvme reipl
> git bisect good 23a457b8d57dc8d0cc1dbd1882993dd2fcc4b0c0
> # good: [f57f5010c0c3fe2d924a957ddf1d17fbebb54d47] Merge remote-tracking branch 'risc-v/for-next'
> git bisect good f57f5010c0c3fe2d924a957ddf1d17fbebb54d47
> # good: [1d5fd6c33b04e5d5b665446c3b56f2148f0f1272] sh: add missing DECLARE_EXPORT() for __ashiftrt_r4_xx
> git bisect good 1d5fd6c33b04e5d5b665446c3b56f2148f0f1272
> # bad: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables
> git bisect bad 8c8f3156dd40f8bdc58f2ac461374bc804c28e3b
> # good: [8e958839e4b9fb6ea4385ff2c52d1333a3a618de] sparc32: mm: Restructure sparc32 MMU page-table layout
> git bisect good 8e958839e4b9fb6ea4385ff2c52d1333a3a618de
> # good: [3f407976ac2953116cb8880a7a18b63bcc81829d] sparc32: mm: Change pgtable_t type to pte_t * instead of struct page *
> git bisect good 3f407976ac2953116cb8880a7a18b63bcc81829d
> # first bad commit: [8c8f3156dd40f8bdc58f2ac461374bc804c28e3b] sparc32: mm: Reduce allocation size for PMD and PTE tables
>
> ---
> Log messages:
>
> Lots of:
>
> BUG: scheduling while atomic: kthreadd/2/0xffffffff
> Modules linked in:
> CPU: 0 PID: 2 Comm: kthreadd Tainted: G W 5.7.0-rc5-next-20200515 #1
> [f04f2c94 :
> here+0x16c/0x250 ]
> [f04f2df0 :
> schedule+0x78/0x11c ]
> [f003f100 :
> kthreadd+0x188/0x1a4 ]
> [f0008448 :
> ret_from_kernel_thread+0xc/0x38 ]
> [00000000 :
> 0x0 ]
>
> followed by:
>
> Kernel panic - not syncing: Aiee, killing interrupt handler!
> CPU: 0 PID: 19 Comm: cryptomgr_test Tainted: G W 5.7.0-rc5-next-20200515 #1
> [f0024400 :
> do_exit+0x7c8/0xa88 ]
> [f0075540 :
> __module_put_and_exit+0xc/0x18 ]
> [f0221428 :
> cryptomgr_test+0x28/0x48 ]
> [f003edc0 :
> kthread+0xf4/0x12c ]
> [f0008448 :
> ret_from_kernel_thread+0xc/0x38 ]
> [00000000 :
> 0x0 ]

2020-05-18 08:39:34

by Will Deacon

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > Now that the page table allocator can free page table allocations
> > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > to avoid needlessly wasting memory.
> > >
> > > Cc: "David S. Miller" <[email protected]>
> > > Cc: Peter Zijlstra <[email protected]>
> > > Signed-off-by: Will Deacon <[email protected]>
> >
> > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > does reverting the rest of the series.
> >
> Actually, turns out I see the same pattern (lots of scheduling while atomic
> followed by 'killing interrupt handler' in cryptomgr_test) with several
> powerpc boot tests. I am currently bisecting those crashes. I'll report
> the results here as well, as soon as I have them.

FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
issues. However, linux-next is a different story, where I don't get very far
at all:

BUG: Bad page state in process swapper pfn:005b4

If you're seeing this on powerpc too, I wonder if it's related to:

https://lore.kernel.org/r/[email protected]

since I think it just hit -next and the diffstat is all over the place. I've
added Mike to CC just in case.

Will

2020-05-18 09:20:12

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > > Now that the page table allocator can free page table allocations
> > > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > > to avoid needlessly wasting memory.
> > > >
> > > > Cc: "David S. Miller" <[email protected]>
> > > > Cc: Peter Zijlstra <[email protected]>
> > > > Signed-off-by: Will Deacon <[email protected]>
> > >
> > > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > does reverting the rest of the series.
> > >
> > Actually, turns out I see the same pattern (lots of scheduling while atomic
> > followed by 'killing interrupt handler' in cryptomgr_test) with several
> > powerpc boot tests. I am currently bisecting those crashes. I'll report
> > the results here as well, as soon as I have them.
>
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
>
> BUG: Bad page state in process swapper pfn:005b4
>
> If you're seeing this on powerpc too, I wonder if it's related to:
>
> https://lore.kernel.org/r/[email protected]
>
> since I think it just hit -next and the diffstat is all over the place. I've
> added Mike to CC just in case.

Thanks, Will, I'll take a look.

> Will

--
Sincerely yours,
Mike.

2020-05-18 09:52:33

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/18/20 1:37 AM, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>> Now that the page table allocator can free page table allocations
>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>> to avoid needlessly wasting memory.
>>>>
>>>> Cc: "David S. Miller" <[email protected]>
>>>> Cc: Peter Zijlstra <[email protected]>
>>>> Signed-off-by: Will Deacon <[email protected]>
>>>
>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>> does reverting the rest of the series.
>>>
>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>> the results here as well, as soon as I have them.
>
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
>
> BUG: Bad page state in process swapper pfn:005b4
>
> If you're seeing this on powerpc too, I wonder if it's related to:
>
> https://lore.kernel.org/r/[email protected]
>
> since I think it just hit -next and the diffstat is all over the place. I've
> added Mike to CC just in case.
>

Here are the bisect results for ppc:

# bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [3674d7aa7a8e61d993886c2fb7c896c5ef85e988] Merge remote-tracking branch 'crypto/master'
git bisect good 3674d7aa7a8e61d993886c2fb7c896c5ef85e988
# good: [87f6f21783522e6d62127cf33ae5e95f50874beb] Merge remote-tracking branch 'spi/for-next'
git bisect good 87f6f21783522e6d62127cf33ae5e95f50874beb
# good: [5c428e8277d5d97c85126387d4e00aa5adde4400] Merge remote-tracking branch 'staging/staging-next'
git bisect good 5c428e8277d5d97c85126387d4e00aa5adde4400
# good: [f68de67ed934e7bdef4799fd7777c86f33f14982] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good f68de67ed934e7bdef4799fd7777c86f33f14982
# bad: [54acd2dc52b069da59639eea0d0c92726f32fb01] mm/memblock: fix a typo in comment "implict"->"implicit"
git bisect bad 54acd2dc52b069da59639eea0d0c92726f32fb01
# good: [784a17aa58a529b84f7cc50f351ed4acf3bd11f3] mm: remove the pgprot argument to __vmalloc
git bisect good 784a17aa58a529b84f7cc50f351ed4acf3bd11f3
# good: [6cd8137ff37e9a37aee2d2a8889c8beb8eab192f] khugepaged: replace the usage of system(3) in the test
git bisect good 6cd8137ff37e9a37aee2d2a8889c8beb8eab192f
# bad: [6987da379826ed01b8a1cf046b67cc8cc10117cc] sparc: remove unnecessary includes
git bisect bad 6987da379826ed01b8a1cf046b67cc8cc10117cc
# good: [bc17b545388f64c09e83e367898e28f60277c584] mm/hugetlb: define a generic fallback for is_hugepage_only_range()
git bisect good bc17b545388f64c09e83e367898e28f60277c584
# good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
# bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
# good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
# good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
# first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's

I don't know if that is accurate either. Maybe things are so broken
that bisect gets confused, or the problem is due to interaction
between different patch series.

Guenter

2020-05-18 14:25:49

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> On 5/18/20 1:37 AM, Will Deacon wrote:
> > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>> Now that the page table allocator can free page table allocations
> >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>> to avoid needlessly wasting memory.
> >>>>
> >>>> Cc: "David S. Miller" <[email protected]>
> >>>> Cc: Peter Zijlstra <[email protected]>
> >>>> Signed-off-by: Will Deacon <[email protected]>
> >>>
> >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>> does reverting the rest of the series.
> >>>
> >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >> powerpc boot tests. I am currently bisecting those crashes. I'll report
> >> the results here as well, as soon as I have them.
> >
> > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > issues. However, linux-next is a different story, where I don't get very far
> > at all:
> >
> > BUG: Bad page state in process swapper pfn:005b4

This one seems to be due to commit 24aab577764f ("mm: memmap_init:
iterate over memblock regions rather that check each PFN"); reverting
it and partially reverting the cleanup commits that follow makes those
disappear. sparc32 boot still fails on today's linux-next and mmotm for me with:

Run /sbin/init as init process
with arguments:
/sbin/init
with environment:
HOME=/
TERM=linux
Starting init: /sbin/init exists but couldn't execute it (error -14)

I've tried to bisect mmotm and I've got the first bad commits in
different places in the middle of arch/kmap series [1] so I've added Ira
to CC as well :)

I'll continue to look into "bad page" on sparc32

[1] https://lore.kernel.org/dri-devel/[email protected]/

> Here are the bisect results for ppc:
>
> # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect start 'HEAD' 'v5.7-rc5'

...

> # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
> # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
> # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
> # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
> git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
> # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
>
> I don't know if that is accurate either. Maybe things are so broken
> that bisect gets confused, or the problem is due to interaction
> between different patch series.

My results with the workaround for sparc32 boot look similar:

# bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
# bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
# bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
# good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
# good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
# good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
# good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
# skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
# bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
# bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
# bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
# bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
# first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values


> Guenter

--
Sincerely yours,
Mike.

2020-05-18 16:11:13

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > >>>> Now that the page table allocator can free page table allocations
> > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > >>>> to avoid needlessly wasting memory.
> > >>>>
> > >>>> Cc: "David S. Miller" <[email protected]>
> > >>>> Cc: Peter Zijlstra <[email protected]>
> > >>>> Signed-off-by: Will Deacon <[email protected]>
> > >>>
> > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > >>> does reverting the rest of the series.
> > >>>
> > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > >> powerpc boot tests. I am currently bisecting those crashes. I'll report
> > >> the results here as well, as soon as I have them.
> > >
> > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > issues. However, linux-next is a different story, where I don't get very far
> > > at all:
> > >
> > > BUG: Bad page state in process swapper pfn:005b4
>
> This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> iterate over memblock regions rather that check each PFN"); reverting
> it and partially reverting the cleanup commits that follow makes those
> disappear. sparc32 boot still fails on today's linux-next and mmotm for me with:
>
> Run /sbin/init as init process
> with arguments:
> /sbin/init
> with environment:
> HOME=/
> TERM=linux
> Starting init: /sbin/init exists but couldn't execute it (error -14)
>

Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
which is part of Ira's series.

Today's -next is even worse, unfortunately; now all microblaze boot tests
(both little and big endian) fail, plus everything that failed last
time, plus new compile failures. Another round of bisects ...

Guenter

> I've tried to bisect mmotm and I've got the first bad commits in
> different places in the middle of arch/kmap series [1] so I've added Ira
> to CC as well :)
>
> I'll continue to look into "bad page" on sparc32
>
> [1] https://lore.kernel.org/dri-devel/[email protected]/
>
> > Here are the bisect results for ppc:
> >
> > # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
> > # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> > git bisect start 'HEAD' 'v5.7-rc5'
>
> ...
>
> > # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> > git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
> > # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> > git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
> > # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
> > git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
> > # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
> > git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
> > # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
> >
> > I don't know if that is accurate either. Maybe things are so broken
> > that bisect gets confused, or the problem is due to interaction
> > between different patch series.
>
> My results with the workaround for sparc32 boot look similar:
>
> # bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
> git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
> git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
> # good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
> git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
> # good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
> git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
> # good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
> git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
> # skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
> git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
> # bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
> git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
> # bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
> git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
> # bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
> # bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
> # first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
>
>
> > Guenter
>
> --
> Sincerely yours,
> Mike.

2020-05-18 18:15:16

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/18/20 7:23 AM, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
>> On 5/18/20 1:37 AM, Will Deacon wrote:
>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>> Now that the page table allocator can free page table allocations
>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>> to avoid needlessly wasting memory.
>>>>>>
>>>>>> Cc: "David S. Miller" <[email protected]>
>>>>>> Cc: Peter Zijlstra <[email protected]>
>>>>>> Signed-off-by: Will Deacon <[email protected]>
>>>>>
>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>> does reverting the rest of the series.
>>>>>
>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>>> the results here as well, as soon as I have them.
>>>
>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>> issues. However, linux-next is a different story, where I don't get very far
>>> at all:
>>>
>>> BUG: Bad page state in process swapper pfn:005b4
>
> This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> iterate over memblock regions rather that check each PFN") and reverting
> it and partially reverting the next cleanup commits makes those
> disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
>
> Run /sbin/init as init process
> with arguments:
> /sbin/init
> with environment:
> HOME=/
> TERM=linux
> Starting init: /sbin/init exists but couldn't execute it (error -14)
>
> I've tried to bisect mmotm and I've got the first bad commits in
> different places in the middle of arch/kmap series [1] so I've added Ira
> to CC as well :)
>
> I'll continue to look into "bad page" on sparc32
>
> [1] https://lore.kernel.org/dri-devel/[email protected]/
>
>> Here are the bisect results for ppc:
>>
>> # bad: [bdecf38f228bcca73b31ada98b5b7ba1215eb9c9] Add linux-next specific files for 20200515
>> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
>> git bisect start 'HEAD' 'v5.7-rc5'
>
> ...
>
>> # good: [9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011] arch-kmap_atomic-consolidate-duplicate-code-checkpatch-fixes
>> git bisect good 9b5aa5b43f957f03a1f4a9aff5f7924e2ebbc011
>> # bad: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
>> git bisect bad 89194ba5ee31567eeee9c81101b334c8e3248198
>> # good: [022785d2bea99f8bc2a37b7b6c525eea26f6ac59] arch-kunmap_atomic-consolidate-duplicate-code-checkpatch-fixes
>> git bisect good 022785d2bea99f8bc2a37b7b6c525eea26f6ac59
>> # good: [a13c2f39e3f0519ddee57d26cc66ec70e3546106] arch/kmap: don't hard code kmap_prot values
>> git bisect good a13c2f39e3f0519ddee57d26cc66ec70e3546106
>> # first bad commit: [89194ba5ee31567eeee9c81101b334c8e3248198] arch/kmap: define kmap_atomic_prot() for all arch's
>>
>> I don't know if that is accurate either. Maybe things are so broken
>> that bisect gets confused, or the problem is due to interaction
>> between different patch series.
>
> My results with the workaround for sparc32 boot look similar:
>
> # bad: [2bbf0589bfeb27800c730b76eacf34528eee5418] pci: test for unexpectedly disabled bridges
> git bisect bad 2bbf0589bfeb27800c730b76eacf34528eee5418
> # good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
> git bisect good 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # bad: [e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35] mm-add-debug_wx-support-fix
> git bisect bad e4592f53440c6fd2288e2dcb8c6f5b4d9d40fd35
> # good: [e27369856a2d42ae4d84bc2c4ddac1e696c40d7c] mm: remove the prot argument from vm_map_ram
> git bisect good e27369856a2d42ae4d84bc2c4ddac1e696c40d7c
> # good: [6911f2b29f6daae2c4b51e6a37f794056d8afabd] mm/page_alloc.c: clear out zone->lowmem_reserve[] if the zone is empty
> git bisect good 6911f2b29f6daae2c4b51e6a37f794056d8afabd
> # good: [8cef4726f20ae37c3cf3f7a449f5b8a088247a27] hugetlbfs: clean up command line processing
> git bisect good 8cef4726f20ae37c3cf3f7a449f5b8a088247a27
> # good: [94f38895e0a68ceac3ceece6528123ed3129cedd] arch/kmap: ensure kmap_prot visibility
> git bisect good 94f38895e0a68ceac3ceece6528123ed3129cedd
> # skip: [fcc77c28bf9155c681712b25c0f5e6125d10ba2e] kmap: consolidate kmap_prot definitions
> git bisect skip fcc77c28bf9155c681712b25c0f5e6125d10ba2e
> # bad: [175a67be7ee750b2aa2a4a2fedeff18fdce787ac] kmap-consolidate-kmap_prot-definitions-checkpatch-fixes
> git bisect bad 175a67be7ee750b2aa2a4a2fedeff18fdce787ac
> # bad: [54db8ed321d66a00b6c69bbd5bf7c59809b3fd42] drm: vmwgfx: include linux/highmem.h
> git bisect bad 54db8ed321d66a00b6c69bbd5bf7c59809b3fd42
> # bad: [6671299c829d19c6ceb0fd1a14b690f6115c6d3d] arch/kmap: define kmap_atomic_prot() for all arch's
> git bisect bad 6671299c829d19c6ceb0fd1a14b690f6115c6d3d
> # bad: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
> git bisect bad f800fb6e517710e04391821e4b1908606c8a6b24
> # first bad commit: [f800fb6e517710e04391821e4b1908606c8a6b24] arch/kmap: don't hard code kmap_prot values
>
>

Below is another set of bisect results, from next-20200518. It points to one
of your commits. This is for microblaze (big endian) boot failures.

Guenter

---
# bad: [72bc15d0018ebfbc9c389539d636e2e9a9002b3b] Add linux-next specific files for 20200518
# good: [2ef96a5bb12be62ef75b5828c0aab838ebb29cb8] Linux 5.7-rc5
git bisect start 'HEAD' 'v5.7-rc5'
# good: [b5b9a1a40fcf10db8f140c987b715e6816e1292d] Merge remote-tracking branch 'crypto/master'
git bisect good b5b9a1a40fcf10db8f140c987b715e6816e1292d
# good: [6a349e7cf4cec11b63ca8e3095c990e146f48784] Merge remote-tracking branch 'tip/auto-latest'
git bisect good 6a349e7cf4cec11b63ca8e3095c990e146f48784
# good: [0c5e27cea5e173afc1971ce9a521e022c288548c] Merge remote-tracking branch 'staging/staging-next'
git bisect good 0c5e27cea5e173afc1971ce9a521e022c288548c
# good: [7e90955569a080b17030161db6152917f3b0e061] Merge remote-tracking branch 'hyperv/hyperv-next'
git bisect good 7e90955569a080b17030161db6152917f3b0e061
# good: [c0218a9a3a60cf081f5545302d0fc28a8d68059b] fs/buffer.c: add debug print for __getblk_gfp() stall problem
git bisect good c0218a9a3a60cf081f5545302d0fc28a8d68059b
# good: [bcda3c9d968d3a8b596904fb2ff8009717ffb6ef] Merge branch 'akpm-current/current'
git bisect good bcda3c9d968d3a8b596904fb2ff8009717ffb6ef
# good: [5b271f59a6aee147db3d7137f6132f74977131c1] kernel: use show_stack_loglvl()
git bisect good 5b271f59a6aee147db3d7137f6132f74977131c1
# good: [dec7b12bacc0859e689c4a42714c7bf4d0b98cfd] mm/mmap.c: add more sanity checks to get_unmapped_area()
git bisect good dec7b12bacc0859e689c4a42714c7bf4d0b98cfd
# bad: [feda7bcd5e1846039cc1a999bf4090b1fee890e8] mm: fix build error for mips of process_madvise
git bisect bad feda7bcd5e1846039cc1a999bf4090b1fee890e8
# good: [0533da2f2fa20c28ac5b4573bd6bb0d445638c6a] x86/mm: simplify init_trampoline() and surrounding logic
git bisect good 0533da2f2fa20c28ac5b4573bd6bb0d445638c6a
# bad: [2b166035a0202b90f5860178b8ae43d41a42117f] mm: consolidate pud_index() and pud_offset() definitions
git bisect bad 2b166035a0202b90f5860178b8ae43d41a42117f
# bad: [01f489acfb0783379cc764d503477c0f6df49a0b] mm: consolidate pte_index() and pte_offset_*() definitions
git bisect bad 01f489acfb0783379cc764d503477c0f6df49a0b
# bad: [c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd] mm: pgtable: add shortcuts for accessing kernel PMD and PTE
git bisect bad c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd
# first bad commit: [c57a43e52bf5fdc4152bb17db6e9c5d35569dcfd] mm: pgtable: add shortcuts for accessing kernel PMD and PTE


2020-05-18 18:17:00

by Ira Weiny

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 09:08:11AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > >>>> Now that the page table allocator can free page table allocations
> > > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > >>>> to avoid needlessly wasting memory.
> > > >>>>
> > > >>>> Cc: "David S. Miller" <[email protected]>
> > > >>>> Cc: Peter Zijlstra <[email protected]>
> > > >>>> Signed-off-by: Will Deacon <[email protected]>
> > > >>>
> > > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > >>> does reverting the rest of the series.
> > > >>>
> > > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > > >> powerpc boot tests. I am currently bisecting those crashes. I'll report
> > > >> the results here as well as soon as I have it.
> > > >
> > > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > > issues. However, linux-next is a different story, where I don't get very far
> > > > at all:
> > > >
> > > > BUG: Bad page state in process swapper pfn:005b4
> >
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> >
> > Run /sbin/init as init process
> > with arguments:
> > /sbin/init
> > with environment:
> > HOME=/
> > TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> >
>
> Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
> suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
> which is part of Ira's series.
>
> Today's -next is even worse, unfortunately; now all microblaze boot tests
> (both little and big endian) fail, plus everything that failed last
> time, plus new compile failures. Another round of bisects ...

I've found this bug in microblaze for sure; still looking through the other archs...

commit 82c284b2bb74ca195dfcd35b70a175f010b9fd46 (HEAD -> lm-kmap17)
Author: Ira Weiny <[email protected]>
Date: Mon May 18 11:01:10 2020 -0700

microblaze/kmap: Don't enable pagefault/preempt twice

The kunmap_atomic clean up failed to remove the pagefault/preempt
enables on this path.

Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
Signed-off-by: Ira Weiny <[email protected]>

diff --git a/arch/microblaze/mm/highmem.c b/arch/microblaze/mm/highmem.c
index ee8a422b2b76..92e0890416c9 100644
--- a/arch/microblaze/mm/highmem.c
+++ b/arch/microblaze/mm/highmem.c
@@ -57,11 +57,8 @@ void kunmap_atomic_high(void *kvaddr)
int type;
unsigned int idx;

- if (vaddr < __fix_to_virt(FIX_KMAP_END)) {
- pagefault_enable();
- preempt_enable();
+ if (vaddr < __fix_to_virt(FIX_KMAP_END))
return;
- }

type = kmap_atomic_idx();
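
The imbalance this patch removes can be illustrated with a minimal user-space sketch; it is not kernel code. It assumes, in line with the commit message, that the consolidated kunmap_atomic() wrapper already re-enables page faults and preemption after calling kunmap_atomic_high(). The struct and every model_* name are hypothetical stand-ins for the real per-task counters.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the per-task nesting counters. */
struct model_task {
	int preempt_count;
	int pagefault_disabled;
};

/* Map side: disable page faults and preemption exactly once. */
static void model_kmap_atomic(struct model_task *t)
{
	t->preempt_count++;
	t->pagefault_disabled++;
}

/*
 * Arch hook. stale_enable mimics the leftover
 * pagefault_enable()/preempt_enable() pair on the early-return path.
 */
static void model_kunmap_atomic_high(struct model_task *t, bool stale_enable)
{
	if (stale_enable) {
		t->pagefault_disabled--;
		t->preempt_count--;
	}
}

/* Assumed shape of the consolidated wrapper: it re-enables once itself. */
static void model_kunmap_atomic(struct model_task *t, bool stale_enable)
{
	model_kunmap_atomic_high(t, stale_enable);
	t->pagefault_disabled--;
	t->preempt_count--;
}

/* Returns the preempt count left over after one map/unmap pair. */
int model_balance(bool stale_enable)
{
	struct model_task t = { 0, 0 };

	model_kmap_atomic(&t);
	model_kunmap_atomic(&t, stale_enable);
	/* Both counters move in lockstep in this model. */
	assert(t.preempt_count == t.pagefault_disabled);
	return t.preempt_count;
}
```

In this model, model_balance(false) returns 0, while model_balance(true) returns -1: the early-return path drops the counters one time too many.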

2020-05-18 18:17:26

by Ira Weiny

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 09:08:11AM -0700, Guenter Roeck wrote:
> On Mon, May 18, 2020 at 05:23:10PM +0300, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> > > On 5/18/20 1:37 AM, Will Deacon wrote:
> > > > On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > > >> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > >>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > >>>> Now that the page table allocator can free page table allocations
> > > >>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > >>>> to avoid needlessly wasting memory.
> > > >>>>
> > > >>>> Cc: "David S. Miller" <[email protected]>
> > > >>>> Cc: Peter Zijlstra <[email protected]>
> > > >>>> Signed-off-by: Will Deacon <[email protected]>
> > > >>>
> > > >>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > >>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > >>> does reverting the rest of the series.
> > > >>>
> > > >> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > > >> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > > >> powerpc boot tests. I am currently bisecting those crashes. I'll report
> > > >> the results here as well as soon as I have it.
> > > >
> > > > FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > > > issues. However, linux-next is a different story, where I don't get very far
> > > > at all:
> > > >
> > > > BUG: Bad page state in process swapper pfn:005b4
> >
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> >
> > Run /sbin/init as init process
> > with arguments:
> > /sbin/init
> > with environment:
> > HOME=/
> > TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> >
>
> Interesting; that is also seen on microblazeel:petalogix-ml605. Bisect there
> suggests 'arch/kmap_atomic: consolidate duplicate code' as the culprit,
> which is part of Ira's series.
>
> Today's -next is even worse, unfortunately; now all microblaze boot tests
> (both little and big endian) fail, plus everything that failed last
> time, plus new compile failures. Another round of bisects ...

Sparc had the same problem...


commit 6e5c523370c510f5fae3436b193ab5dabe0fef06 (HEAD -> lm-kmap17)
Author: Ira Weiny <[email protected]>
Date: Mon May 18 11:13:16 2020 -0700

arch/sparc: Don't enable pagefault/preempt twice

The kunmap_atomic clean up failed to remove the pagefault/preempt
enables on this path.

Fixes: bee2128a09e6 ("arch/kunmap_atomic: consolidate duplicate code")
Signed-off-by: Ira Weiny <[email protected]>

diff --git a/arch/sparc/mm/highmem.c b/arch/sparc/mm/highmem.c
index d237d902f9c3..13fb197bb26c 100644
--- a/arch/sparc/mm/highmem.c
+++ b/arch/sparc/mm/highmem.c
@@ -86,11 +86,8 @@ void kunmap_atomic_high(void *kvaddr)
unsigned long vaddr = (unsigned long) kvaddr & PAGE_MASK;
int type;

- if (vaddr < FIXADDR_START) { // FIXME
- pagefault_enable();
- preempt_enable();
+ if (vaddr < FIXADDR_START) // FIXME
return;
- }

type = kmap_atomic_idx();

2020-05-18 18:26:00

by Ira Weiny

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> On 5/18/20 7:23 AM, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 02:48:18AM -0700, Guenter Roeck wrote:
> >> On 5/18/20 1:37 AM, Will Deacon wrote:
> >>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>>> Now that the page table allocator can free page table allocations
> >>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>>> to avoid needlessly wasting memory.
> >>>>>>
> >>>>>> Cc: "David S. Miller" <[email protected]>
> >>>>>> Cc: Peter Zijlstra <[email protected]>
> >>>>>> Signed-off-by: Will Deacon <[email protected]>
> >>>>>
> >>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>>> does reverting the rest of the series.
> >>>>>
> >>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
> >>>> the results here as well as soon as I have it.
> >>>
> >>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >>> issues. However, linux-next is a different story, where I don't get very far
> >>> at all:
> >>>
> >>> BUG: Bad page state in process swapper pfn:005b4
> >
> > This one seems to be due to commit 24aab577764f ("mm: memmap_init:
> > iterate over memblock regions rather that check each PFN") and reverting
> > it and partially reverting the next cleanup commits makes those
> > disappear. sparc32 boot still fails on today's linux-next and mmotm for me with
> >
> > Run /sbin/init as init process
> > with arguments:
> > /sbin/init
> > with environment:
> > HOME=/
> > TERM=linux
> > Starting init: /sbin/init exists but couldn't execute it (error -14)
> >
> > I've tried to bisect mmotm and I've got the first bad commits in
> > different places in the middle of arch/kmap series [1] so I've added Ira
> > to CC as well :)
> >
> > I'll continue to look into "bad page" on sparc32

mips is broken too.

Does anyone know what this FIXME was for?

...
if (vaddr < FIXADDR_START) { // FIXME
...

I'm going to remove it...

Ira

2020-05-18 20:02:24

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> On 5/18/20 7:23 AM, Mike Rapoport wrote:
>
> Below is another set of bisect results, from next-20200518. It points to one
> of your commits. This is for microblaze (big endian) boot failures.

The microblaze one was easy; as for sparc32, I still have no clue about the
root cause :(

Andrew, can you please fold it into "mm: pgtable: add shortcuts for
accessing kernel PMD and PTE"?

From 167250de28aa526342641b2647294a755d234090 Mon Sep 17 00:00:00 2001
From: Mike Rapoport <[email protected]>
Date: Mon, 18 May 2020 22:08:10 +0300
Subject: [PATCH] microblaze: fix page table traversal in setup_rt_frame()

The replacement of long folded page table traversal with the direct access
to PMD entry wrongly used the kernel page table in setup_rt_frame()
function instead of the process (current->mm) page table.

Fix it.

Signed-off-by: Mike Rapoport <[email protected]>
---
arch/microblaze/kernel/signal.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c
index 28b1ec4b4e79..bdd6d0c86e16 100644
--- a/arch/microblaze/kernel/signal.c
+++ b/arch/microblaze/kernel/signal.c
@@ -194,7 +194,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,

address = ((unsigned long)frame->tramp);
#ifdef CONFIG_MMU
- pmdp = pmd_off_k(address);
+ pmdp = pmd_off(current->mm, address);

preempt_disable();
ptep = pte_offset_map(pmdp, address);
--
2.26.2


> Guenter
>
> ---

--
Sincerely yours,
Mike.
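
The difference between the two lookups in the patch above can be sketched in user space. This is a deliberately tiny model under stated assumptions, not the microblaze implementation: the one-level table, the MODEL_* constants and all model_* functions are hypothetical, and only the names pmd_off() and pmd_off_k() come from the patch itself.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PTRS_PER_PGD 4
#define MODEL_PGDIR_SHIFT  2

/* Hypothetical one-level "page table": one slot per address range. */
struct model_mm {
	unsigned long pgd[MODEL_PTRS_PER_PGD];
};

/* Stand-in for the kernel's own table (init_mm); it has no user mappings. */
static struct model_mm model_init_mm;

/* Like pmd_off(mm, addr): look the address up in the given table. */
static unsigned long *model_pmd_off(struct model_mm *mm, unsigned long addr)
{
	assert(mm != NULL);
	return &mm->pgd[(addr >> MODEL_PGDIR_SHIFT) % MODEL_PTRS_PER_PGD];
}

/* Like pmd_off_k(addr): always uses the kernel table. */
static unsigned long *model_pmd_off_k(unsigned long addr)
{
	return model_pmd_off(&model_init_mm, addr);
}

/* Returns 1 if the entry found for a user trampoline address is populated. */
int model_tramp_entry_present(int use_kernel_table)
{
	struct model_mm user_mm = { { 0 } };
	const unsigned long tramp = 0x9; /* hypothetical user address */
	unsigned long *pmdp;

	/* Only the process table maps the trampoline. */
	*model_pmd_off(&user_mm, tramp) = 0x1001;

	pmdp = use_kernel_table ? model_pmd_off_k(tramp)
				: model_pmd_off(&user_mm, tramp);
	return *pmdp != 0;
}
```

With the process table, model_tramp_entry_present(0) finds the populated entry; with the kernel table, model_tramp_entry_present(1) finds an empty slot, which is the kind of mismatch the fix avoids for the signal trampoline address.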

2020-05-19 16:43:02

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 10:15:11PM +0300, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 11:09:46AM -0700, Guenter Roeck wrote:
> > On 5/18/20 7:23 AM, Mike Rapoport wrote:
> >
> > Below is another set of bisect results, from next-20200518. It points to one
> > of your commits. This is for microblaze (big endian) boot failures.
>
> The microblaze one was easy; as for sparc32, I still have no clue about the
> root cause :(
>
> Andrew, can you please fold it into "mm: pgtable: add shortcuts for
> accessing kernel PMD and PTE"?
>
> From 167250de28aa526342641b2647294a755d234090 Mon Sep 17 00:00:00 2001
> From: Mike Rapoport <[email protected]>
> Date: Mon, 18 May 2020 22:08:10 +0300
> Subject: [PATCH] microblaze: fix page table traversal in setup_rt_frame()
>
> The replacement of long folded page table traversal with the direct access
> to PMD entry wrongly used the kernel page table in setup_rt_frame()
> function instead of the process (current->mm) page table.
>
> Fix it.
>
> Signed-off-by: Mike Rapoport <[email protected]>

Tested-by: Guenter Roeck <[email protected]>

> ---
> arch/microblaze/kernel/signal.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/microblaze/kernel/signal.c b/arch/microblaze/kernel/signal.c
> index 28b1ec4b4e79..bdd6d0c86e16 100644
> --- a/arch/microblaze/kernel/signal.c
> +++ b/arch/microblaze/kernel/signal.c
> @@ -194,7 +194,7 @@ static int setup_rt_frame(struct ksignal *ksig, sigset_t *set,
>
> address = ((unsigned long)frame->tramp);
> #ifdef CONFIG_MMU
> - pmdp = pmd_off_k(address);
> + pmdp = pmd_off(current->mm, address);
>
> preempt_disable();
> ptep = pte_offset_map(pmdp, address);
> --
> 2.26.2
>
>
> > Guenter
> >
> > ---
>
> --
> Sincerely yours,
> Mike.

2020-05-20 17:05:25

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > > On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > > > Now that the page table allocator can free page table allocations
> > > > smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > > > to avoid needlessly wasting memory.
> > > >
> > > > Cc: "David S. Miller" <[email protected]>
> > > > Cc: Peter Zijlstra <[email protected]>
> > > > Signed-off-by: Will Deacon <[email protected]>
> > >
> > > Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > > to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > > does reverting the rest of the series.
> > >
> > Actually, turns out I see the same pattern (lots of scheduling while atomic
> > followed by 'killing interrupt handler' in cryptomgr_test) with several
> > powerpc boot tests. I am currently bisecting those crashes. I'll report
> > the results here as well as soon as I have it.
>
> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> issues. However, linux-next is a different story, where I don't get very far
> at all:
>
> BUG: Bad page state in process swapper pfn:005b4

This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
regions rather that check each PFN"). The commit sha is valid for
v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)

As it seems, sparc32 never registered the memory occupied by the kernel
image with memblock_add() and it only reserves this memory with
memblock_reserve().

I don't know what would happen on real HW, but with

qemu-system-sparc -kernel /path/to/kernel

the memory occupied by the kernel is reserved in openbios and removed
from mem.available. The prom setup code in the kernel used mem.available
to set up the memory banks and essentially there is a hole for the
memory occupied by the kernel.

Later in bootmem_init() this memory is memblock_reserve()d.

Before the problematic commit, memmap initialization would call
__init_single_page() for the pages in that hole, then
free_low_memory_core_early() would mark them as reserved and everything
would be Ok.

After the change in memmap initialization, the hole is skipped and the
page structs for it are not inited. And when they are passed from
memblock to page allocator as reserved it gets confused.

Simply registering the memory occupied by the kernel with memblock_add()
resolves this issue, at least for qemu-system-sparc and I cannot see how
it can harm any other setup.

If all that makes sense I'll send a proper patch :)

diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index 906eda1158b4..3cb3dffcbcdc 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
/* Reserve the kernel text/data/bss. */
size = (start_pfn << PAGE_SHIFT) - phys_base;
memblock_reserve(phys_base, size);
+ memblock_add(phys_base, size);

size = memblock_phys_mem_size() - memblock_reserved_size();
*pages_avail = (size >> PAGE_SHIFT) - high_pages;

> Will

--
Sincerely yours,
Mike.
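
The hole described in the message above can be modelled in a few lines of user-space C. This is a sketch under simplifying assumptions, not the memblock or page allocator code: the 16-page memory map, the PFN ranges and all model_* names are hypothetical. It only shows that reserved pages outside every registered memory region never get their page structs initialised when memmap init walks registered regions only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_PAGES 16

/* Half-open PFN range [start, end). */
struct model_range {
	size_t start;
	size_t end;
};

/* Hypothetical, heavily reduced struct page. */
struct model_page {
	bool initialised;
	bool reserved;
};

/* Stand-in for memmap init walking one registered memory region. */
static void model_init_range(struct model_page *map, struct model_range r)
{
	assert(r.start <= r.end && r.end <= MODEL_NR_PAGES);
	for (size_t pfn = r.start; pfn < r.end; pfn++)
		map[pfn].initialised = true;
}

/* Counts reserved pages whose struct page was never initialised. */
size_t model_bad_reserved_pages(bool add_kernel_region)
{
	struct model_page map[MODEL_NR_PAGES] = { { false, false } };
	/* Kernel image in PFNs [0, 4); firmware reports only [4, 16). */
	const struct model_range kernel = { 0, 4 };
	const struct model_range available = { 4, MODEL_NR_PAGES };
	size_t bad = 0;

	/* Memmap init iterates over registered memory regions only. */
	model_init_range(map, available);
	if (add_kernel_region)
		model_init_range(map, kernel); /* models the added memblock_add() */

	/* The reserved kernel range is handed over later. */
	for (size_t pfn = kernel.start; pfn < kernel.end; pfn++) {
		map[pfn].reserved = true;
		if (!map[pfn].initialised)
			bad++;
	}
	return bad;
}
```

Without the extra registration, model_bad_reserved_pages(false) counts all four kernel pages as reserved but uninitialised; with it, model_bad_reserved_pages(true) counts none.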

2020-05-20 19:05:26

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/20/20 10:03 AM, Mike Rapoport wrote:
> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>> Now that the page table allocator can free page table allocations
>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>> to avoid needlessly wasting memory.
>>>>>
>>>>> Cc: "David S. Miller" <[email protected]>
>>>>> Cc: Peter Zijlstra <[email protected]>
>>>>> Signed-off-by: Will Deacon <[email protected]>
>>>>
>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>> does reverting the rest of the series.
>>>>
>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>> the results here as well as soon as I have it.
>>
>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>> issues. However, linux-next is a different story, where I don't get very far
>> at all:
>>
>> BUG: Bad page state in process swapper pfn:005b4
>
> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
> regions rather that check each PFN"). The commit sha is valid for
> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
>
> As it seems, sparc32 never registered the memory occupied by the kernel
> image with memblock_add() and it only reserves this memory with
> memblock_reserve().
>
> I don't know what would happen on real HW, but with
>
> qemu-system-sparc -kernel /path/to/kernel
>
> the memory occupied by the kernel is reserved in openbios and removed
> from mem.available. The prom setup code in the kernel used mem.available
> to set up the memory banks and essentially there is a hole for the
> memory occupied by the kernel.
>
> Later in bootmem_init() this memory is memblock_reserve()d.
>
> Before the problematic commit, memmap initialization would call
> __init_single_page() for the pages in that hole, then
> free_low_memory_core_early() would mark them as reserved and everything
> would be Ok.
>
> After the change in memmap initialization, the hole is skipped and the
> page structs for it are not inited. And when they are passed from
> memblock to page allocator as reserved it gets confused.
>
> Simply registering the memory occupied by the kernel with memblock_add()
> resolves this issue, at least for qemu-system-sparc and I cannot see how
> it can harm any other setup.
>
> If all that makes sense I'll send a proper patch :)
>
> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
> index 906eda1158b4..3cb3dffcbcdc 100644
> --- a/arch/sparc/mm/init_32.c
> +++ b/arch/sparc/mm/init_32.c
> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
> /* Reserve the kernel text/data/bss. */
> size = (start_pfn << PAGE_SHIFT) - phys_base;
> memblock_reserve(phys_base, size);
> + memblock_add(phys_base, size);
>
> size = memblock_phys_mem_size() - memblock_reserved_size();
> *pages_avail = (size >> PAGE_SHIFT) - high_pages;
>
>> Will
>

With above patch applied on top of Ira's patch, I get:

BUG: spinlock recursion on CPU#0, S01syslogd/139
lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
[f0067a64 : do_raw_spin_lock+0xa8/0xd8 ]
[f00d5034 : copy_page_range+0x328/0x804 ]
[f0025be4 : dup_mm+0x334/0x434 ]
[f0027124 : copy_process+0x1224/0x12b0 ]
[f0027344 : _do_fork+0x54/0x30c ]
[f0027670 : do_fork+0x5c/0x6c ]
[f000de44 : sparc_do_fork+0x18/0x38 ]
[f000b7f4 : do_syscall+0x34/0x40 ]
[5010cd4c : 0x5010cd4c ]

Looks like yet another problem.

I can not revert c03584e30534 because it results in a compile failure.

Guenter

2020-05-20 19:53:24

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> > On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> >> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>> Now that the page table allocator can free page table allocations
> >>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>> to avoid needlessly wasting memory.
> >>>>>
> >>>>> Cc: "David S. Miller" <[email protected]>
> >>>>> Cc: Peter Zijlstra <[email protected]>
> >>>>> Signed-off-by: Will Deacon <[email protected]>
> >>>>
> >>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>> does reverting the rest of the series.
> >>>>
> >>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>> powerpc boot tests. I am currently bisecting those crashes. I'll report
> >>> the results here as well as soon as I have it.
> >>
> >> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >> issues. However, linux-next is a different story, where I don't get very far
> >> at all:
> >>
> >> BUG: Bad page state in process swapper pfn:005b4
> >
> > This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
> > regions rather that check each PFN"). The commit sha is valid for
> > v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
> >
> > As it seems, sparc32 never registered the memory occupied by the kernel
> > image with memblock_add() and it only reserves this memory with
> > memblock_reserve().
> >
> > I don't know what would happen on real HW, but with
> >
> > qemu-system-sparc -kernel /path/to/kernel
> >
> > the memory occupied by the kernel is reserved in openbios and removed
> > from mem.available. The prom setup code in the kernel used mem.available
> > to set up the memory banks and essentially there is a hole for the
> > memory occupied by the kernel.
> >
> > Later in bootmem_init() this memory is memblock_reserve()d.
> >
> > Before the problematic commit, memmap initialization would call
> > __init_single_page() for the pages in that hole, then
> > free_low_memory_core_early() would mark them as reserved and everything
> > would be Ok.
> >
> > After the change in memmap initialization, the hole is skipped and the
> > page structs for it are not inited. And when they are passed from
> > memblock to page allocator as reserved it gets confused.
> >
> > Simply registering the memory occupied by the kernel with memblock_add()
> > resolves this issue, at least for qemu-system-sparc and I cannot see how
> > it can harm any other setup.
> >
> > If all that makes sense I'll send a proper patch :)
> >
> > diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
> > index 906eda1158b4..3cb3dffcbcdc 100644
> > --- a/arch/sparc/mm/init_32.c
> > +++ b/arch/sparc/mm/init_32.c
> > @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
> > /* Reserve the kernel text/data/bss. */
> > size = (start_pfn << PAGE_SHIFT) - phys_base;
> > memblock_reserve(phys_base, size);
> > + memblock_add(phys_base, size);
> >
> > size = memblock_phys_mem_size() - memblock_reserved_size();
> > *pages_avail = (size >> PAGE_SHIFT) - high_pages;
> >
> >> Will
> >
>
> With above patch applied on top of Ira's patch, I get:
>
> BUG: spinlock recursion on CPU#0, S01syslogd/139
> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> [f0067a64 : do_raw_spin_lock+0xa8/0xd8 ]
> [f00d5034 : copy_page_range+0x328/0x804 ]
> [f0025be4 : dup_mm+0x334/0x434 ]
> [f0027124 : copy_process+0x1224/0x12b0 ]
> [f0027344 : _do_fork+0x54/0x30c ]
> [f0027670 : do_fork+0x5c/0x6c ]
> [f000de44 : sparc_do_fork+0x18/0x38 ]
> [f000b7f4 : do_syscall+0x34/0x40 ]
> [5010cd4c : 0x5010cd4c ]
>
> Looks like yet another problem.

I've checked the patch above on top of the mmots which already has Ira's
patches and it booted fine. I've used sparc32_defconfig to build the
kernel and qemu-system-sparc with default machine and CPU.

> I can not revert c03584e30534 because it results in a compile failure.

Here's the "revert" of c03584e30534:

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d001d61e64d5..c9d9d3f9ebf4 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5901,6 +5901,42 @@ overlap_memmap_init(unsigned long zone, unsigned long *pfn)
return false;
}

+#ifdef CONFIG_SPARSEMEM
+/* Skip PFNs that belong to non-present sections */
+static inline __meminit unsigned long next_pfn(unsigned long pfn)
+{
+ const unsigned long section_nr = pfn_to_section_nr(++pfn);
+
+ if (present_section_nr(section_nr))
+ return pfn;
+ return section_nr_to_pfn(next_present_section_nr(section_nr));
+}
+#else
+static inline __meminit unsigned long next_pfn(unsigned long pfn)
+{
+ return pfn++;
+}
+#endif
+
+#ifdef CONFIG_NODES_SPAN_OTHER_NODES
+/* Only safe to use early in boot when initialisation is single-threaded */
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
+{
+ int nid;
+
+ nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
+ if (nid >= 0 && nid != node)
+ return false;
+ return true;
+}
+
+#else
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
+{
+ return true;
+}
+#endif
+
/*
* Initially all pages are reserved - free ones are freed
* up by memblock_free_all() once the early boot process is
@@ -5940,6 +5976,14 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
* function. They do not exist on hotplugged memory.
*/
if (context == MEMMAP_EARLY) {
+ if (!early_pfn_valid(pfn)) {
+ pfn = next_pfn(pfn);
+ continue;
+ }
+ if (!early_pfn_in_nid(pfn, nid)) {
+ pfn++;
+ continue;
+ }
if (overlap_memmap_init(zone, &pfn))
continue;
if (defer_init(nid, pfn, end_pfn))
@@ -6055,23 +6099,9 @@ static void __meminit zone_init_free_lists(struct zone *zone)
}

void __meminit __weak memmap_init(unsigned long size, int nid,
- unsigned long zone,
- unsigned long range_start_pfn)
+ unsigned long zone, unsigned long start_pfn)
{
- unsigned long start_pfn, end_pfn;
- unsigned long range_end_pfn = range_start_pfn + size;
- int i;
-
- for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
- start_pfn = clamp(start_pfn, range_start_pfn, range_end_pfn);
- end_pfn = clamp(end_pfn, range_start_pfn, range_end_pfn);
-
- if (end_pfn > start_pfn) {
- size = end_pfn - start_pfn;
- memmap_init_zone(size, nid, zone, start_pfn,
- MEMMAP_EARLY, NULL);
- }
- }
+ memmap_init_zone(size, nid, zone, start_pfn, MEMMAP_EARLY, NULL);
}

static int zone_batchsize(struct zone *zone)

> Guenter

--
Sincerely yours,
Mike.

2020-05-21 23:04:36

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/20/20 12:51 PM, Mike Rapoport wrote:
> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>> On 5/20/20 10:03 AM, Mike Rapoport wrote:
>>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>>> Now that the page table allocator can free page table allocations
>>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>>> to avoid needlessly wasting memory.
>>>>>>>
>>>>>>> Cc: "David S. Miller" <[email protected]>
>>>>>>> Cc: Peter Zijlstra <[email protected]>
>>>>>>> Signed-off-by: Will Deacon <[email protected]>
>>>>>>
>>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>>> does reverting the rest of the series.
>>>>>>
>>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>>>> the results here as well as soon as I have it.
>>>>
>>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>>> issues. However, linux-next is a different story, where I don't get very far
>>>> at all:
>>>>
>>>> BUG: Bad page state in process swapper pfn:005b4
>>>
>>> This is caused by c03584e30534 ("mm: memmap_init: iterate over memblock
>>> regions rather that check each PFN"). The commit sha is valid for
>>> v5.7-rc6-mmots-2020-05-19-21-52, so it will change in a day or so :)
>>>
>>> As it seems, sparc32 never registered the memory occupied by the kernel
>>> image with memblock_add() and it only reserves this memory with
>>> memblock_reserve().
>>>
>>> I don't know what would happen on real HW, but with
>>>
>>> qemu-system-sparc -kernel /path/to/kernel
>>>
>>> the memory occupied by the kernel is reserved in openbios and removed
>>> from mem.available. The prom setup code in the kernel used mem.available
>>> to set up the memory banks and essentially there is a hole for the
>>> memory occupied by the kernel.
>>>
>>> Later in bootmem_init() this memory is memblock_reserve()d.
>>>
>>> Before the problematic commit, memmap initialization would call
>>> __init_single_page() for the pages in that hole, then
>>> free_low_memory_core_early() would mark them as reserved and everything
>>> would be Ok.
>>>
>>> After the change in memmap initialization, the hole is skipped and the
>>> page structs for it are not inited. And when they are passed from
>>> memblock to page allocator as reserved it gets confused.
>>>
>>> Simply registering the memory occupied by the kernel with memblock_add()
>>> resolves this issue, at least for qemu-system-sparc and I cannot see how
>>> it can harm any other setup.
>>>
>>> If all that makes sense I'll send a proper patch :)
>>>
>>> diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
>>> index 906eda1158b4..3cb3dffcbcdc 100644
>>> --- a/arch/sparc/mm/init_32.c
>>> +++ b/arch/sparc/mm/init_32.c
>>> @@ -193,6 +193,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
>>> /* Reserve the kernel text/data/bss. */
>>> size = (start_pfn << PAGE_SHIFT) - phys_base;
>>> memblock_reserve(phys_base, size);
>>> + memblock_add(phys_base, size);
>>>
>>> size = memblock_phys_mem_size() - memblock_reserved_size();
>>> *pages_avail = (size >> PAGE_SHIFT) - high_pages;
>>>
>>>> Will
>>>
>>
>> With above patch applied on top of Ira's patch, I get:
>>
>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>> [f0067a64 :
>> do_raw_spin_lock+0xa8/0xd8 ]
>> [f00d5034 :
>> copy_page_range+0x328/0x804 ]
>> [f0025be4 :
>> dup_mm+0x334/0x434 ]
>> [f0027124 :
>> copy_process+0x1224/0x12b0 ]
>> [f0027344 :
>> _do_fork+0x54/0x30c ]
>> [f0027670 :
>> do_fork+0x5c/0x6c ]
>> [f000de44 :
>> sparc_do_fork+0x18/0x38 ]
>> [f000b7f4 :
>> do_syscall+0x34/0x40 ]
>> [5010cd4c :
>> 0x5010cd4c ]
>>
>> Looks like yet another problem.
>
> I've checked the patch above on top of the mmots which already has Ira's
> patches and it booted fine. I've used sparc32_defconfig to build the
> kernel and qemu-system-sparc with default machine and CPU.
>

Try sparc32_defconfig+SMP.

>> I can not revert c03584e30534 because it results in a compile failure.
>
> Here's the "revert" of c03584e30534:
>

Same problem (spinlock recursion) after applying it.

Guenter

2020-05-24 12:38:13

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> >> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> >>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> >>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> >>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> >>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> >>>>>>> Now that the page table allocator can free page table allocations
> >>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> >>>>>>> to avoid needlessly wasting memory.
> >>>>>>>
> >>>>>>> Cc: "David S. Miller" <[email protected]>
> >>>>>>> Cc: Peter Zijlstra <[email protected]>
> >>>>>>> Signed-off-by: Will Deacon <[email protected]>
> >>>>>>
> >>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> >>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> >>>>>> does reverting the rest of the series.
> >>>>>>
> >>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> >>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> >>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
> >>>>> the results here as well as soon as I have it.
> >>>>
> >>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> >>>> issues. However, linux-next is a different story, where I don't get very far
> >>>> at all:
> >>>>
> >>>> BUG: Bad page state in process swapper pfn:005b4
> >>
> >> With above patch applied on top of Ira's patch, I get:
> >>
> >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> >> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> >> [f0067a64 :
> >> do_raw_spin_lock+0xa8/0xd8 ]
> >> [f00d5034 :
> >> copy_page_range+0x328/0x804 ]
> >> [f0025be4 :
> >> dup_mm+0x334/0x434 ]
> >> [f0027124 :
> >> copy_process+0x1224/0x12b0 ]
> >> [f0027344 :
> >> _do_fork+0x54/0x30c ]
> >> [f0027670 :
> >> do_fork+0x5c/0x6c ]
> >> [f000de44 :
> >> sparc_do_fork+0x18/0x38 ]
> >> [f000b7f4 :
> >> do_syscall+0x34/0x40 ]
> >> [5010cd4c :
> >> 0x5010cd4c ]
> >>
> >> Looks like yet another problem.
> >
> > I've checked the patch above on top of the mmots which already has Ira's
> > patches and it booted fine. I've used sparc32_defconfig to build the
> > kernel and qemu-system-sparc with default machine and CPU.
> >
>
> Try sparc32_defconfig+SMP.

I see a different problem, but this could be related:

INIT: version 2.86 booting
rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
(detected by 0, t=5252 jiffies, g=-935, q=3)
rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
rcu_sched R running task 0 10 2 0x00000000

I'm running a bit old debian [1] with qemu-system-sparc.

My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
allocation size for PMD and PTE tables"). The commit ID is valid for
next-20200522.

If I revert this commit and fixup the page table initialization [2] I've
broken, the build with CONFIG_SMP=n works fine, but the build with
CONFIG_SMP=y does not work even if I add nosmp to the kernel command
line.

[1] https://people.debian.org/~aurel32/qemu/sparc/debian_etch_sparc_small.qcow2
[2] sparc32 meminit fixup:

diff --git a/arch/sparc/mm/init_32.c b/arch/sparc/mm/init_32.c
index e45160839f79..eb2946b1df8a 100644
--- a/arch/sparc/mm/init_32.c
+++ b/arch/sparc/mm/init_32.c
@@ -192,6 +192,7 @@ unsigned long __init bootmem_init(unsigned long *pages_avail)
/* Reserve the kernel text/data/bss. */
size = (start_pfn << PAGE_SHIFT) - phys_base;
memblock_reserve(phys_base, size);
+ memblock_add(phys_base, size);

size = memblock_phys_mem_size() - memblock_reserved_size();
*pages_avail = (size >> PAGE_SHIFT) - high_pages;
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index 75b56bdd38ef..6cb1ea2d2b5c 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -304,7 +304,7 @@ static void __init srmmu_nocache_init(void)
pgd = pgd_offset_k(vaddr);
p4d = p4d_offset(__nocache_fix(pgd), vaddr);
pud = pud_offset(__nocache_fix(p4d), vaddr);
- pmd = pmd_offset(__nocache_fix(pgd), vaddr);
+ pmd = pmd_offset(__nocache_fix(pud), vaddr);
pte = pte_offset_kernel(__nocache_fix(pmd), vaddr);

pteval = ((paddr >> 4) | SRMMU_ET_PTE | SRMMU_PRIV);

> Guenter

--
Sincerely yours,
Mike.

2020-05-24 14:05:56

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/24/20 5:32 AM, Mike Rapoport wrote:
> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>> On 5/20/20 10:03 AM, Mike Rapoport wrote:
>>>>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
>>>>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
>>>>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
>>>>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
>>>>>>>>> Now that the page table allocator can free page table allocations
>>>>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
>>>>>>>>> to avoid needlessly wasting memory.
>>>>>>>>>
>>>>>>>>> Cc: "David S. Miller" <[email protected]>
>>>>>>>>> Cc: Peter Zijlstra <[email protected]>
>>>>>>>>> Signed-off-by: Will Deacon <[email protected]>
>>>>>>>>
>>>>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
>>>>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
>>>>>>>> does reverting the rest of the series.
>>>>>>>>
>>>>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
>>>>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
>>>>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
>>>>>>> the results here as well as soon as I have it.
>>>>>>
>>>>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
>>>>>> issues. However, linux-next is a different story, where I don't get very far
>>>>>> at all:
>>>>>>
>>>>>> BUG: Bad page state in process swapper pfn:005b4
>>>>
>>>> With above patch applied on top of Ira's patch, I get:
>>>>
>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>> [f0067a64 :
>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>> [f00d5034 :
>>>> copy_page_range+0x328/0x804 ]
>>>> [f0025be4 :
>>>> dup_mm+0x334/0x434 ]
>>>> [f0027124 :
>>>> copy_process+0x1224/0x12b0 ]
>>>> [f0027344 :
>>>> _do_fork+0x54/0x30c ]
>>>> [f0027670 :
>>>> do_fork+0x5c/0x6c ]
>>>> [f000de44 :
>>>> sparc_do_fork+0x18/0x38 ]
>>>> [f000b7f4 :
>>>> do_syscall+0x34/0x40 ]
>>>> [5010cd4c :
>>>> 0x5010cd4c ]
>>>>
>>>> Looks like yet another problem.
>>>
>>> I've checked the patch above on top of the mmots which already has Ira's
>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>> kernel and qemu-system-sparc with default machine and CPU.
>>>
>>
>> Try sparc32_defconfig+SMP.
>
> I see a different problem, but this could be related:
>
> INIT: version 2.86 booting
> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> (detected by 0, t=5252 jiffies, g=-935, q=3)
> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> rcu_sched R running task 0 10 2 0x00000000
>
> I'm running a bit old debian [1] with qemu-system-sparc.
>
> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> allocation size for PMD and PTE tables"). The commit ID is valid for
> next-20200522.
>
Here is what I currently get:

next-20200522:
    All builds/tests crash
next-20200522 plus upstream commit 0cfc8a8d70dc ("sparc32: fix page table traversal in srmmu_nocache_init()"):
    nosmp images (sparc32_defconfig) boot fine
    smp images (sparc32_defconfig+SMP) crash with "BUG: Bad page state"
next-20200522 plus 0cfc8a8d70dc plus memblock_add() from below:
    smp images crash with spinlock recursion as above
next-20200522 plus 0cfc8a8d70dc plus revert of 8c8f3156dd40:
    smp images crash with "BUG: Bad page state"
next-20200522 plus 0cfc8a8d70dc plus revert of 8c8f3156dd40 plus memblock_add():
    All builds/tests pass

This is with my root file system. I tried the debian image but I seem to be
missing some command line option needed to make it work.

Guenter

2020-05-26 13:29:28

by Will Deacon

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > >> On 5/20/20 10:03 AM, Mike Rapoport wrote:
> > >>> On Mon, May 18, 2020 at 09:37:15AM +0100, Will Deacon wrote:
> > >>>> On Sat, May 16, 2020 at 05:07:50PM -0700, Guenter Roeck wrote:
> > >>>>> On Sat, May 16, 2020 at 05:00:50PM -0700, Guenter Roeck wrote:
> > >>>>>> On Mon, May 11, 2020 at 09:41:36PM +0100, Will Deacon wrote:
> > >>>>>>> Now that the page table allocator can free page table allocations
> > >>>>>>> smaller than PAGE_SIZE, reduce the size of the PMD and PTE allocations
> > >>>>>>> to avoid needlessly wasting memory.
> > >>>>>>>
> > >>>>>>> Cc: "David S. Miller" <[email protected]>
> > >>>>>>> Cc: Peter Zijlstra <[email protected]>
> > >>>>>>> Signed-off-by: Will Deacon <[email protected]>
> > >>>>>>
> > >>>>>> Something in the sparc32 patches in linux-next causes all my sparc32 emulations
> > >>>>>> to crash. bisect points to this patch, but reverting it doesn't help, and neither
> > >>>>>> does reverting the rest of the series.
> > >>>>>>
> > >>>>> Actually, turns out I see the same pattern (lots of scheduling while atomic
> > >>>>> followed by 'killing interrupt handler' in cryptomgr_test) with several
> > >>>>> powerpc boot tests. I am currently bisecting those crashes. I'll report
> > >>>>> the results here as well as soon as I have it.
> > >>>>
> > >>>> FWIW, I retested my sparc32 patches with PREEMPT=y and I don't see any
> > >>>> issues. However, linux-next is a different story, where I don't get very far
> > >>>> at all:
> > >>>>
> > >>>> BUG: Bad page state in process swapper pfn:005b4
> > >>
> > >> With above patch applied on top of Ira's patch, I get:
> > >>
> > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > >> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > >> [f0067a64 :
> > >> do_raw_spin_lock+0xa8/0xd8 ]
> > >> [f00d5034 :
> > >> copy_page_range+0x328/0x804 ]
> > >> [f0025be4 :
> > >> dup_mm+0x334/0x434 ]
> > >> [f0027124 :
> > >> copy_process+0x1224/0x12b0 ]
> > >> [f0027344 :
> > >> _do_fork+0x54/0x30c ]
> > >> [f0027670 :
> > >> do_fork+0x5c/0x6c ]
> > >> [f000de44 :
> > >> sparc_do_fork+0x18/0x38 ]
> > >> [f000b7f4 :
> > >> do_syscall+0x34/0x40 ]
> > >> [5010cd4c :
> > >> 0x5010cd4c ]
> > >>
> > >> Looks like yet another problem.
> > >
> > > I've checked the patch above on top of the mmots which already has Ira's
> > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > kernel and qemu-system-sparc with default machine and CPU.
> > >
> >
> > Try sparc32_defconfig+SMP.
>
> I see a different problem, but this could be related:
>
> INIT: version 2.86 booting
> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> (detected by 0, t=5252 jiffies, g=-935, q=3)
> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> rcu: RCU grace-period kthread stack dump:
> rcu_sched R running task 0 10 2 0x00000000
>
> I'm running a bit old debian [1] with qemu-system-sparc.
>
> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> allocation size for PMD and PTE tables"). The commit ID is valid for
> next-20200522.

Can you try the diff below please?

Will

--->8

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index c861c0f0df73..7c05c0dea511 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -363,20 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)

if ((ptep = pte_alloc_one_kernel(mm)) == 0)
return NULL;
+
page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
- if (!pgtable_pte_page_ctor(page)) {
- __free_page(page);
+ if (!PageTable(page) && !pgtable_pte_page_ctor(page))
return NULL;
- }
+
return ptep;
}

void pte_free(struct mm_struct *mm, pgtable_t ptep)
{
- struct page *page;
-
- page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
- pgtable_pte_page_dtor(page);
srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
}

diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..97458119cce8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
# Default to 4 for wider testing, though 8 might be more appropriate.
# ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
# PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
+# SPARC32 allocates multiple pte tables within a single page, and therefore
+# a per-page lock leads to problems when multiple tables need to be locked
+# at the same time (e.g. copy_page_range()).
# DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
#
config SPLIT_PTLOCK_CPUS
@@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
default "999999" if !MMU
default "999999" if ARM && !CPU_CACHE_VIPT
default "999999" if PARISC && !PA20
+ default "999999" if SPARC32
default "4"

config ARCH_ENABLE_SPLIT_PMD_PTLOCK
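
The Kconfig comment in the diff above can be illustrated with a little
address arithmetic. What follows is a user-space sketch, not kernel code:
PAGE_SIZE_MODEL, PTE_TABLE_SIZE_MODEL, lock_slot() and the other names are
hypothetical, and the sizes (4 KiB pages, 256-byte tables, i.e.
PTRS_PER_PTE*4 with an assumed PTRS_PER_PTE of 64) are assumptions made for
illustration. It shows that with a per-page lock, two PTE tables carved out
of the same page resolve to the same lock, which would be consistent with
the spinlock recursion reported under copy_page_range().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes, for illustration only; the real values live in the kernel headers. */
#define PAGE_SIZE_MODEL      4096u
#define PTE_TABLE_SIZE_MODEL 256u /* PTRS_PER_PTE * 4, assuming PTRS_PER_PTE == 64 */

/* Hypothetical helper: a per-page lock is identified by the page frame number. */
uintptr_t lock_slot(uintptr_t table_addr)
{
	return table_addr / PAGE_SIZE_MODEL;
}

/* True when two PTE tables would end up sharing one per-page lock. */
bool tables_share_lock(uintptr_t a, uintptr_t b)
{
	return lock_slot(a) == lock_slot(b);
}

/* Number of tables of the assumed size that fit in one page. */
unsigned int tables_per_page(void)
{
	return PAGE_SIZE_MODEL / PTE_TABLE_SIZE_MODEL;
}

/* Small self-check of the arithmetic above. */
void lock_model_self_check(void)
{
	uintptr_t base = 0x10000; /* arbitrary page-aligned address */

	/* Neighbouring tables in the same page map to the same lock. */
	assert(tables_share_lock(base, base + PTE_TABLE_SIZE_MODEL));
	/* Tables in different pages do not. */
	assert(!tables_share_lock(base, base + PAGE_SIZE_MODEL));
}
```

Under these assumed sizes a single page holds 16 tables, so a source and
destination table landing in the same page is not a rare corner case.
Setting SPLIT_PTLOCK_CPUS to "999999" sidesteps this by falling back to the
single mm-wide page_table_lock.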

2020-05-26 14:07:36

by Will Deacon

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> > On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > > >> With above patch applied on top of Ira's patch, I get:
> > > >>
> > > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > > >> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > > >> [f0067a64 :
> > > >> do_raw_spin_lock+0xa8/0xd8 ]
> > > >> [f00d5034 :
> > > >> copy_page_range+0x328/0x804 ]
> > > >> [f0025be4 :
> > > >> dup_mm+0x334/0x434 ]
> > > >> [f0027124 :
> > > >> copy_process+0x1224/0x12b0 ]
> > > >> [f0027344 :
> > > >> _do_fork+0x54/0x30c ]
> > > >> [f0027670 :
> > > >> do_fork+0x5c/0x6c ]
> > > >> [f000de44 :
> > > >> sparc_do_fork+0x18/0x38 ]
> > > >> [f000b7f4 :
> > > >> do_syscall+0x34/0x40 ]
> > > >> [5010cd4c :
> > > >> 0x5010cd4c ]
> > > >>
> > > >> Looks like yet another problem.
> > > >
> > > > I've checked the patch above on top of the mmots which already has Ira's
> > > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > > kernel and qemu-system-sparc with default machine and CPU.
> > > >
> > >
> > > Try sparc32_defconfig+SMP.
> >
> > I see a different problem, but this could be related:
> >
> > INIT: version 2.86 booting
> > rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> > (detected by 0, t=5252 jiffies, g=-935, q=3)
> > rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> > rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > rcu: RCU grace-period kthread stack dump:
> > rcu_sched R running task 0 10 2 0x00000000
> >
> > I'm running a bit old debian [1] with qemu-system-sparc.
> >
> > My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> > allocation size for PMD and PTE tables"). The commit ID is valid for
> > next-20200522.
>
> Can you try the diff below please?

Actually, that's racy. New version below!

Will

--->8

diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index c861c0f0df73..068029471aa4 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)

if ((ptep = pte_alloc_one_kernel(mm)) == 0)
return NULL;
+
page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
- if (!pgtable_pte_page_ctor(page)) {
- __free_page(page);
- return NULL;
+
+ spin_lock(&mm->page_table_lock);
+ if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
+ page_ref_dec(page);
+ ptep = NULL;
}
+ spin_unlock(&mm->page_table_lock);
+
return ptep;
}

@@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
struct page *page;

page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
- pgtable_pte_page_dtor(page);
+
+ spin_lock(&mm->page_table_lock);
+ if (page_ref_dec_return(page) == 1)
+ pgtable_pte_page_dtor(page);
+ spin_unlock(&mm->page_table_lock);
+
srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
}

diff --git a/mm/Kconfig b/mm/Kconfig
index c1acc34c1c35..97458119cce8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
# Default to 4 for wider testing, though 8 might be more appropriate.
# ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
# PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
+# SPARC32 allocates multiple pte tables within a single page, and therefore
+# a per-page lock leads to problems when multiple tables need to be locked
+# at the same time (e.g. copy_page_range()).
# DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
#
config SPLIT_PTLOCK_CPUS
@@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
default "999999" if !MMU
default "999999" if ARM && !CPU_CACHE_VIPT
default "999999" if PARISC && !PA20
+ default "999999" if SPARC32
default "4"

config ARCH_ENABLE_SPLIT_PMD_PTLOCK
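
The reference-count scheme in the srmmu.c hunks above can be modelled in
user space. This is only a sketch under stated assumptions: struct
model_page, model_lock, model_pte_alloc() and model_pte_free() are
hypothetical stand-ins for struct page, mm->page_table_lock, pte_alloc_one()
and pte_free(), and the constructor failure path is left out. The point it
tries to show is that, with the count starting at 1 and every update made
under one lock, the constructor runs only for the first table placed in a
page and the destructor only for the last one removed.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for struct page; one page backs several PTE tables. */
struct model_page {
	int refcount;   /* 1 while the page is allocated but holds no PTE table */
	int ctor_calls; /* how many times the per-page constructor ran */
	int dtor_calls; /* how many times the per-page destructor ran */
};

/* Plays the role of mm->page_table_lock in this model. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/* Model of pte_alloc_one(): only the first table in a page runs the ctor. */
void model_pte_alloc(struct model_page *page)
{
	pthread_mutex_lock(&model_lock);
	if (++page->refcount == 2)
		page->ctor_calls++;
	pthread_mutex_unlock(&model_lock);
}

/* Model of pte_free(): only the last table in a page runs the dtor. */
void model_pte_free(struct model_page *page)
{
	pthread_mutex_lock(&model_lock);
	if (--page->refcount == 1)
		page->dtor_calls++;
	pthread_mutex_unlock(&model_lock);
}

/* Allocate three tables from one page, free them, and check the pairing. */
void model_self_check(void)
{
	struct model_page page = { .refcount = 1 };
	int i;

	for (i = 0; i < 3; i++)
		model_pte_alloc(&page);
	for (i = 0; i < 3; i++)
		model_pte_free(&page);

	assert(page.ctor_calls == 1);
	assert(page.dtor_calls == 1);
	assert(page.refcount == 1);
}
```

Doing the increment and the comparison inside the same critical section is
what distinguishes this from an unlocked check-then-construct, where two
callers could both see an unconstructed page and both run the constructor.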

2020-05-26 15:24:58

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Tue, May 26, 2020 at 03:01:27PM +0100, Will Deacon wrote:
> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> > On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> > > On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> > > > On 5/20/20 12:51 PM, Mike Rapoport wrote:
> > > > > On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> > > > >> With above patch applied on top of Ira's patch, I get:
> > > > >>
> > > > >> BUG: spinlock recursion on CPU#0, S01syslogd/139
> > > > >> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> > > > >> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> > > > >> [f0067a64 :
> > > > >> do_raw_spin_lock+0xa8/0xd8 ]
> > > > >> [f00d5034 :
> > > > >> copy_page_range+0x328/0x804 ]
> > > > >> [f0025be4 :
> > > > >> dup_mm+0x334/0x434 ]
> > > > >> [f0027124 :
> > > > >> copy_process+0x1224/0x12b0 ]
> > > > >> [f0027344 :
> > > > >> _do_fork+0x54/0x30c ]
> > > > >> [f0027670 :
> > > > >> do_fork+0x5c/0x6c ]
> > > > >> [f000de44 :
> > > > >> sparc_do_fork+0x18/0x38 ]
> > > > >> [f000b7f4 :
> > > > >> do_syscall+0x34/0x40 ]
> > > > >> [5010cd4c :
> > > > >> 0x5010cd4c ]
> > > > >>
> > > > >> Looks like yet another problem.
> > > > >
> > > > > I've checked the patch above on top of the mmots which already has Ira's
> > > > > patches and it booted fine. I've used sparc32_defconfig to build the
> > > > > kernel and qemu-system-sparc with default machine and CPU.
> > > > >
> > > >
> > > > Try sparc32_defconfig+SMP.
> > >
> > > I see a different problem, but this could be related:
> > >
> > > INIT: version 2.86 booting
> > > rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> > > (detected by 0, t=5252 jiffies, g=-935, q=3)
> > > rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> > > rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> > > rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> > > rcu: RCU grace-period kthread stack dump:
> > > rcu_sched R running task 0 10 2 0x00000000
> > >
> > > I'm running a bit old debian [1] with qemu-system-sparc.
> > >
> > > My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> > > allocation size for PMD and PTE tables"). The commit ID is valid for
> > > next-20200522.
> >
> > Can you try the diff below please?
>
> Actually, that's racy. New version below!

Well, both versions worked for me with sparc32_defconfig+SMP build when
I ran qemu-system-sparc with default machine (SS-5) that does not allow
SMP.

I could not check with actual SMP because
qemu-system-sparc -M SS-10 -smp 2 and qemu-system-sparc -M SS-20 -smp 2
fail early with an exception even in v5.7-rc7...

> Will
>
> --->8
>
> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> index c861c0f0df73..068029471aa4 100644
> --- a/arch/sparc/mm/srmmu.c
> +++ b/arch/sparc/mm/srmmu.c
> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>
> if ((ptep = pte_alloc_one_kernel(mm)) == 0)
> return NULL;
> +
> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> - if (!pgtable_pte_page_ctor(page)) {
> - __free_page(page);
> - return NULL;
> +
> + spin_lock(&mm->page_table_lock);
> + if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> + page_ref_dec(page);
> + ptep = NULL;
> }
> + spin_unlock(&mm->page_table_lock);
> +
> return ptep;
> }
>
> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
> struct page *page;
>
> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> - pgtable_pte_page_dtor(page);
> +
> + spin_lock(&mm->page_table_lock);
> + if (page_ref_dec_return(page) == 1)
> + pgtable_pte_page_dtor(page);
> + spin_unlock(&mm->page_table_lock);
> +
> srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
> }
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..97458119cce8 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
> # Default to 4 for wider testing, though 8 might be more appropriate.
> # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
> # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> +# SPARC32 allocates multiple pte tables within a single page, and therefore
> +# a per-page lock leads to problems when multiple tables need to be locked
> +# at the same time (e.g. copy_page_range()).
> # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
> #
> config SPLIT_PTLOCK_CPUS
> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
> default "999999" if !MMU
> default "999999" if ARM && !CPU_CACHE_VIPT
> default "999999" if PARISC && !PA20
> + default "999999" if SPARC32
> default "4"
>
> config ARCH_ENABLE_SPLIT_PMD_PTLOCK

--
Sincerely yours,
Mike.
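
The reference counting in the quoted pte_alloc_one()/pte_free() diff can be
sketched in isolation. This is only an illustration under stated assumptions:
struct demo_page and the demo_* helpers are hypothetical stand-ins, not kernel
APIs, the sketch is single-threaded (the real patch holds mm->page_table_lock
around the counter updates), and it leaves out the constructor-failure path.
It assumes the backing page starts with a reference count of 1, so the first
table carved from it raises the count to 2 and the last one freed drops it
back to 1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct demo_page {
	int refcount;    /* assumed to start at 1 (the allocator's reference) */
	bool ctor_done;  /* true while the page-table constructor is in effect */
	int ctor_calls;  /* how many times the constructor ran */
	int dtor_calls;  /* how many times the destructor ran */
};

/* A new PTE table is carved out of this page. */
static void demo_table_alloc(struct demo_page *page)
{
	/* Only the first table (count goes 1 -> 2) runs the constructor. */
	if (++page->refcount == 2) {
		page->ctor_done = true;
		page->ctor_calls++;
	}
}

/* A PTE table living in this page is freed. */
static void demo_table_free(struct demo_page *page)
{
	/* Only the last table (count goes 2 -> 1) runs the destructor. */
	if (--page->refcount == 1) {
		page->ctor_done = false;
		page->dtor_calls++;
	}
}
```

Allocating two tables from the same page and then freeing both should run
the constructor and the destructor exactly once each, however many tables
share the page in between.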

2020-05-26 16:21:20

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/26/20 7:01 AM, Will Deacon wrote:
> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
>> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
>>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>>>> With above patch applied on top of Ira's patch, I get:
>>>>>>
>>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>>>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>>>> [f0067a64 :
>>>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>>>> [f00d5034 :
>>>>>> copy_page_range+0x328/0x804 ]
>>>>>> [f0025be4 :
>>>>>> dup_mm+0x334/0x434 ]
>>>>>> [f0027124 :
>>>>>> copy_process+0x1224/0x12b0 ]
>>>>>> [f0027344 :
>>>>>> _do_fork+0x54/0x30c ]
>>>>>> [f0027670 :
>>>>>> do_fork+0x5c/0x6c ]
>>>>>> [f000de44 :
>>>>>> sparc_do_fork+0x18/0x38 ]
>>>>>> [f000b7f4 :
>>>>>> do_syscall+0x34/0x40 ]
>>>>>> [5010cd4c :
>>>>>> 0x5010cd4c ]
>>>>>>
>>>>>> Looks like yet another problem.
>>>>>
>>>>> I've checked the patch above on top of the mmots which already has Ira's
>>>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>>>> kernel and qemu-system-sparc with default machine and CPU.
>>>>>
>>>>
>>>> Try sparc32_defconfig+SMP.
>>>
>>> I see a different problem, but this could be related:
>>>
>>> INIT: version 2.86 booting
>>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>>> (detected by 0, t=5252 jiffies, g=-935, q=3)
>>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
>>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>>> rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
>>> rcu: RCU grace-period kthread stack dump:
>>> rcu_sched R running task 0 10 2 0x00000000
>>>
>>> I'm running a bit old debian [1] with qemu-img-sparc.
>>>
>>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
>>> allocation size for PMD and PTE tables"). The commit ID is valid for
>>> next-20200522.
>>
>> Can you try the diff below please?
>
> Actually, that's racy. New version below!
>

Applied on top of next-20200526, with defconfig+SMP, I still get:

BUG: Bad page state in process swapper/0 pfn:0069f

many times. Did I need to revert something else? Sorry, I lost track.


Note that "-smp 2" on SS-10 works for me (with the same page state
messages).

Guenter


> Will
>
> --->8
>
> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> index c861c0f0df73..068029471aa4 100644
> --- a/arch/sparc/mm/srmmu.c
> +++ b/arch/sparc/mm/srmmu.c
> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>
> if ((ptep = pte_alloc_one_kernel(mm)) == 0)
> return NULL;
> +
> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> - if (!pgtable_pte_page_ctor(page)) {
> - __free_page(page);
> - return NULL;
> +
> + spin_lock(&mm->page_table_lock);
> + if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> + page_ref_dec(page);
> + ptep = NULL;
> }
> + spin_unlock(&mm->page_table_lock);
> +
> return ptep;
> }
>
> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
> struct page *page;
>
> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> - pgtable_pte_page_dtor(page);
> +
> + spin_lock(&mm->page_table_lock);
> + if (page_ref_dec_return(page) == 1)
> + pgtable_pte_page_dtor(page);
> + spin_unlock(&mm->page_table_lock);
> +
> srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
> }
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index c1acc34c1c35..97458119cce8 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
> # Default to 4 for wider testing, though 8 might be more appropriate.
> # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
> # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> +# SPARC32 allocates multiple pte tables within a single page, and therefore
> +# a per-page lock leads to problems when multiple tables need to be locked
> +# at the same time (e.g. copy_page_range()).
> # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
> #
> config SPLIT_PTLOCK_CPUS
> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
> default "999999" if !MMU
> default "999999" if ARM && !CPU_CACHE_VIPT
> default "999999" if PARISC && !PA20
> + default "999999" if SPARC32
> default "4"
>
> config ARCH_ENABLE_SPLIT_PMD_PTLOCK
>
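
The Kconfig comment in the quoted diff, and the earlier spinlock recursion in
copy_page_range(), come down to simple arithmetic, sketched below. The numbers
are assumptions for illustration: 4 KiB pages and 64 four-byte entries per PTE
table, giving 256-byte tables (the patch itself only says PTRS_PER_PTE*4). The
demo_* names are hypothetical. With split PTE locks the lock lives in the
page's struct page, so the sketch identifies a lock by page frame; two tables
in the same page then map to the same lock, and taking both locks at once
would recurse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE  4096u  /* assumed 4 KiB pages */
#define DEMO_TABLE_SIZE  256u  /* assumed: 64 entries * 4 bytes each */

/* How many PTE tables fit into one page under these assumptions. */
static unsigned int demo_tables_per_page(void)
{
	return DEMO_PAGE_SIZE / DEMO_TABLE_SIZE;
}

/* With a per-page lock, the lock is identified by the page frame. */
static uintptr_t demo_lock_id(uintptr_t table_addr)
{
	return table_addr / DEMO_PAGE_SIZE;
}

/* True when two tables would be protected by the very same lock. */
static bool demo_locks_alias(uintptr_t table_a, uintptr_t table_b)
{
	return demo_lock_id(table_a) == demo_lock_id(table_b);
}
```

Forcing SPLIT_PTLOCK_CPUS to "999999" sidesteps the aliasing by falling back
to the single mm-wide page_table_lock, which is what the quoted Kconfig hunk
does for SPARC32.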

2020-05-26 16:34:14

by Mike Rapoport

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On Tue, May 26, 2020 at 09:18:54AM -0700, Guenter Roeck wrote:
> On 5/26/20 7:01 AM, Will Deacon wrote:
> > On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
> >> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
> >>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
> >>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
> >>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
> >>>>>> With above patch applied on top of Ira's patch, I get:
> >>>>>>
> >>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
> >>>>>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
> >>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
> >>>>>> [f0067a64 :
> >>>>>> do_raw_spin_lock+0xa8/0xd8 ]
> >>>>>> [f00d5034 :
> >>>>>> copy_page_range+0x328/0x804 ]
> >>>>>> [f0025be4 :
> >>>>>> dup_mm+0x334/0x434 ]
> >>>>>> [f0027124 :
> >>>>>> copy_process+0x1224/0x12b0 ]
> >>>>>> [f0027344 :
> >>>>>> _do_fork+0x54/0x30c ]
> >>>>>> [f0027670 :
> >>>>>> do_fork+0x5c/0x6c ]
> >>>>>> [f000de44 :
> >>>>>> sparc_do_fork+0x18/0x38 ]
> >>>>>> [f000b7f4 :
> >>>>>> do_syscall+0x34/0x40 ]
> >>>>>> [5010cd4c :
> >>>>>> 0x5010cd4c ]
> >>>>>>
> >>>>>> Looks like yet another problem.
> >>>>>
> >>>>> I've checked the patch above on top of the mmots which already has Ira's
> >>>>> patches and it booted fine. I've used sparc32_defconfig to build the
> >>>>> kernel and qemu-system-sparc with default machine and CPU.
> >>>>>
> >>>>
> >>>> Try sparc32_defconfig+SMP.
> >>>
> >>> I see a different problem, but this could be related:
> >>>
> >>> INIT: version 2.86 booting
> >>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> >>> (detected by 0, t=5252 jiffies, g=-935, q=3)
> >>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
> >>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
> >>> rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
> >>> rcu: RCU grace-period kthread stack dump:
> >>> rcu_sched R running task 0 10 2 0x00000000
> >>>
> >>> I'm running a bit old debian [1] with qemu-img-sparc.
> >>>
> >>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
> >>> allocation size for PMD and PTE tables"). The commit ID is valid for
> >>> next-20200522.
> >>
> >> Can you try the diff below please?
> >
> > Actually, that's racy. New version below!
> >
>
> Applied on top of next-20200526, with defconfig+SMP, I still get:
>
> BUG: Bad page state in process swapper/0 pfn:0069f
>
> many times. Did I need to revert something else? Sorry, I lost track.

The bad page messages are fixed by [1], but that fix is not yet in mmotm or
linux-next. They are not related to the SMP hangs.

[1] https://lore.kernel.org/lkml/[email protected]/

> Note that "-smp 2" on SS-10 works for me (with the same page state
> messages).
>
> Guenter
>
>
> > Will
> >
> > --->8
> >
> > diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
> > index c861c0f0df73..068029471aa4 100644
> > --- a/arch/sparc/mm/srmmu.c
> > +++ b/arch/sparc/mm/srmmu.c
> > @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
> >
> > if ((ptep = pte_alloc_one_kernel(mm)) == 0)
> > return NULL;
> > +
> > page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> > - if (!pgtable_pte_page_ctor(page)) {
> > - __free_page(page);
> > - return NULL;
> > +
> > + spin_lock(&mm->page_table_lock);
> > + if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
> > + page_ref_dec(page);
> > + ptep = NULL;
> > }
> > + spin_unlock(&mm->page_table_lock);
> > +
> > return ptep;
> > }
> >
> > @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
> > struct page *page;
> >
> > page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
> > - pgtable_pte_page_dtor(page);
> > +
> > + spin_lock(&mm->page_table_lock);
> > + if (page_ref_dec_return(page) == 1)
> > + pgtable_pte_page_dtor(page);
> > + spin_unlock(&mm->page_table_lock);
> > +
> > srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
> > }
> >
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index c1acc34c1c35..97458119cce8 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
> > # Default to 4 for wider testing, though 8 might be more appropriate.
> > # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
> > # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
> > +# SPARC32 allocates multiple pte tables within a single page, and therefore
> > +# a per-page lock leads to problems when multiple tables need to be locked
> > +# at the same time (e.g. copy_page_range()).
> > # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
> > #
> > config SPLIT_PTLOCK_CPUS
> > @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
> > default "999999" if !MMU
> > default "999999" if ARM && !CPU_CACHE_VIPT
> > default "999999" if PARISC && !PA20
> > + default "999999" if SPARC32
> > default "4"
> >
> > config ARCH_ENABLE_SPLIT_PMD_PTLOCK
> >
>

--
Sincerely yours,
Mike.

2020-05-26 17:18:26

by Guenter Roeck

Subject: Re: [PATCH v5 04/18] sparc32: mm: Reduce allocation size for PMD and PTE tables

On 5/26/20 9:29 AM, Mike Rapoport wrote:
> On Tue, May 26, 2020 at 09:18:54AM -0700, Guenter Roeck wrote:
>> On 5/26/20 7:01 AM, Will Deacon wrote:
>>> On Tue, May 26, 2020 at 02:26:35PM +0100, Will Deacon wrote:
>>>> On Sun, May 24, 2020 at 03:32:56PM +0300, Mike Rapoport wrote:
>>>>> On Thu, May 21, 2020 at 04:02:11PM -0700, Guenter Roeck wrote:
>>>>>> On 5/20/20 12:51 PM, Mike Rapoport wrote:
>>>>>>> On Wed, May 20, 2020 at 12:03:31PM -0700, Guenter Roeck wrote:
>>>>>>>> With above patch applied on top of Ira's patch, I get:
>>>>>>>>
>>>>>>>> BUG: spinlock recursion on CPU#0, S01syslogd/139
>>>>>>>> lock: 0xf5448350, .magic: dead4ead, .owner: S01syslogd/139, .owner_cpu: 0
>>>>>>>> CPU: 0 PID: 139 Comm: S01syslogd Not tainted 5.7.0-rc6-next-20200518-00002-gb178d2d56f29-dirty #1
>>>>>>>> [f0067a64 :
>>>>>>>> do_raw_spin_lock+0xa8/0xd8 ]
>>>>>>>> [f00d5034 :
>>>>>>>> copy_page_range+0x328/0x804 ]
>>>>>>>> [f0025be4 :
>>>>>>>> dup_mm+0x334/0x434 ]
>>>>>>>> [f0027124 :
>>>>>>>> copy_process+0x1224/0x12b0 ]
>>>>>>>> [f0027344 :
>>>>>>>> _do_fork+0x54/0x30c ]
>>>>>>>> [f0027670 :
>>>>>>>> do_fork+0x5c/0x6c ]
>>>>>>>> [f000de44 :
>>>>>>>> sparc_do_fork+0x18/0x38 ]
>>>>>>>> [f000b7f4 :
>>>>>>>> do_syscall+0x34/0x40 ]
>>>>>>>> [5010cd4c :
>>>>>>>> 0x5010cd4c ]
>>>>>>>>
>>>>>>>> Looks like yet another problem.
>>>>>>>
>>>>>>> I've checked the patch above on top of the mmots which already has Ira's
>>>>>>> patches and it booted fine. I've used sparc32_defconfig to build the
>>>>>>> kernel and qemu-system-sparc with default machine and CPU.
>>>>>>>
>>>>>>
>>>>>> Try sparc32_defconfig+SMP.
>>>>>
>>>>> I see a different problem, but this could be related:
>>>>>
>>>>> INIT: version 2.86 booting
>>>>> rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
>>>>> (detected by 0, t=5252 jiffies, g=-935, q=3)
>>>>> rcu: All QSes seen, last rcu_sched kthread activity 5252 (-68674--73926), jiffies_till_next_fqs=1, root ->qsmask 0x0
>>>>> rcu: rcu_sched kthread starved for 5252 jiffies! g-935 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
>>>>> rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
>>>>> rcu: RCU grace-period kthread stack dump:
>>>>> rcu_sched R running task 0 10 2 0x00000000
>>>>>
>>>>> I'm running a bit old debian [1] with qemu-img-sparc.
>>>>>
>>>>> My bisect pointed at commit 8c8f3156dd40 ("sparc32: mm: Reduce
>>>>> allocation size for PMD and PTE tables"). The commit ID is valid for
>>>>> next-20200522.
>>>>
>>>> Can you try the diff below please?
>>>
>>> Actually, that's racy. New version below!
>>>
>>
>> Applied on top of next-20200526, with defconfig+SMP, I still get:
>>
>> BUG: Bad page state in process swapper/0 pfn:0069f
>>
>> many times. Did I need to revert something else? Sorry, I lost track.
>
> The bad page messages are fixed by [1], but this is not in mmotm or
> linux-next. This is not related to SMP hangs.
>
> [1] https://lore.kernel.org/lkml/[email protected]/
>

With that applied, all boot tests pass for me (including tests with
"-smp 2" on SS-10).

Guenter

>> Note that "-smp 2" on SS-10 works for me (with the same page state
>> messages).
>>
>> Guenter
>>
>>
>>> Will
>>>
>>> --->8
>>>
>>> diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
>>> index c861c0f0df73..068029471aa4 100644
>>> --- a/arch/sparc/mm/srmmu.c
>>> +++ b/arch/sparc/mm/srmmu.c
>>> @@ -363,11 +363,16 @@ pgtable_t pte_alloc_one(struct mm_struct *mm)
>>>
>>> if ((ptep = pte_alloc_one_kernel(mm)) == 0)
>>> return NULL;
>>> +
>>> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>> - if (!pgtable_pte_page_ctor(page)) {
>>> - __free_page(page);
>>> - return NULL;
>>> +
>>> + spin_lock(&mm->page_table_lock);
>>> + if (page_ref_inc_return(page) == 2 && !pgtable_pte_page_ctor(page)) {
>>> + page_ref_dec(page);
>>> + ptep = NULL;
>>> }
>>> + spin_unlock(&mm->page_table_lock);
>>> +
>>> return ptep;
>>> }
>>>
>>> @@ -376,7 +381,12 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
>>> struct page *page;
>>>
>>> page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>> - pgtable_pte_page_dtor(page);
>>> +
>>> + spin_lock(&mm->page_table_lock);
>>> + if (page_ref_dec_return(page) == 1)
>>> + pgtable_pte_page_dtor(page);
>>> + spin_unlock(&mm->page_table_lock);
>>> +
>>> srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>>> }
>>>
>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>> index c1acc34c1c35..97458119cce8 100644
>>> --- a/mm/Kconfig
>>> +++ b/mm/Kconfig
>>> @@ -192,6 +192,9 @@ config MEMORY_HOTREMOVE
>>> # Default to 4 for wider testing, though 8 might be more appropriate.
>>> # ARM's adjust_pte (unused if VIPT) depends on mm-wide page_table_lock.
>>> # PA-RISC 7xxx's spinlock_t would enlarge struct page from 32 to 44 bytes.
>>> +# SPARC32 allocates multiple pte tables within a single page, and therefore
>>> +# a per-page lock leads to problems when multiple tables need to be locked
>>> +# at the same time (e.g. copy_page_range()).
>>> # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page.
>>> #
>>> config SPLIT_PTLOCK_CPUS
>>> @@ -199,6 +202,7 @@ config SPLIT_PTLOCK_CPUS
>>> default "999999" if !MMU
>>> default "999999" if ARM && !CPU_CACHE_VIPT
>>> default "999999" if PARISC && !PA20
>>> + default "999999" if SPARC32
>>> default "4"
>>>
>>> config ARCH_ENABLE_SPLIT_PMD_PTLOCK
>>>
>>
>