While booting recent linux-next on an IBM Power10 Server LPAR, the
following crash is observed:
[ 0.000000] numa: Partition configured for 32 NUMA nodes.
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] kernel BUG at mm/memblock.c:519!
[ 0.000000] Oops: Exception in kernel mode, sig: 5 [#1]
[ 0.000000] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc3-next-20221104 #1
[ 0.000000] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.00 (NH1030_026) hv:phyp pSeries
[ 0.000000] NIP: c0000000004ba240 LR: c0000000004bb240 CTR: c0000000004ba210
[ 0.000000] REGS: c000000002a8b7b0 TRAP: 0700 Not tainted (6.1.0-rc3-next-20221104)
[ 0.000000] MSR: 8000000000021033 <SF,ME,IR,DR,RI,LE> CR: 24042424 XER: 00000001
[ 0.000000] CFAR: c0000000004ba290 IRQMASK: 1
[ 0.000000] GPR00: c0000000004bb240 c000000002a8ba50 c00000000136ee00 c0000010f3ac00a8
[ 0.000000] GPR04: 0000000000000000 c0000010f3ac0090 00000010f3ac0000 0000000000000d00
[ 0.000000] GPR08: 0000000000000001 0000000000000007 0000000000000001 0000000000000081
[ 0.000000] GPR12: c0000000004ba210 c000000002e10000 0000000000000000 000000000000000d
[ 0.000000] GPR16: 000000000f6be620 000000000f6be8e8 000000000f6be788 000000000f6bed58
[ 0.000000] GPR20: 000000000f6f6d58 c0000000029a8de8 00000010f3ad8800 0000000000000080
[ 0.000000] GPR24: 00000010f3ad7b00 0000000000000000 0000000000000100 0000000000000d00
[ 0.000000] GPR28: 00000010f3ad7b00 c0000000029a8de8 c0000000029a8e00 0000000000000006
[ 0.000000] NIP [c0000000004ba240] memblock_merge_regions.isra.12+0x40/0x130
[ 0.000000] LR [c0000000004bb240] memblock_add_range+0x190/0x300
[ 0.000000] Call Trace:
[ 0.000000] [c000000002a8ba50] [0000000000000100] 0x100 (unreliable)
[ 0.000000] [c000000002a8ba90] [c0000000004bb240] memblock_add_range+0x190/0x300
[ 0.000000] [c000000002a8bb10] [c0000000004bb5e0] memblock_reserve+0x70/0xd0
[ 0.000000] [c000000002a8bba0] [c000000002045234] memblock_alloc_range_nid+0x11c/0x1e8
[ 0.000000] [c000000002a8bc60] [c0000000020453a4] memblock_alloc_internal+0xa4/0x110
[ 0.000000] [c000000002a8bcb0] [c0000000020456cc] memblock_alloc_try_nid+0x94/0xcc
[ 0.000000] [c000000002a8bd40] [c00000000200b570] alloc_paca_data+0x7c/0xcc
[ 0.000000] [c000000002a8bdb0] [c00000000200b770] allocate_paca+0x8c/0x28c
[ 0.000000] [c000000002a8be50] [c00000000200a26c] setup_arch+0x1c4/0x4d8
[ 0.000000] [c000000002a8bed0] [c000000002004378] start_kernel+0xb4/0xa84
[ 0.000000] [c000000002a8bf90] [c00000000000da90] start_here_common+0x1c/0x20
[ 0.000000] Instruction dump:
[ 0.000000] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 7c7d1b78 7c9e2378 3be00000 f8010010
[ 0.000000] f821ffc1 e9230000 3969ffff 4800000c <0b0a0000> 7d3f4b78 393f0001 7fbf5840
[ 0.000000] ---[ end trace 0000000000000000 ]---
[ 0.000000]
[ 0.000000] Kernel panic - not syncing: Fatal exception
[ 0.000000] Rebooting in 180 seconds..
This problem was introduced with next-20221101. Git bisect points to the
following patch:
commit 3f82c9c4ac377082e1230f5299e0ccce07b15e12
Date: Tue Oct 25 15:09:43 2022 +0800
memblock: don't run loop in memblock_add_range() twice
Reverting this patch lets the kernel boot to the login prompt.
I have attached the .config.
- Sachin
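For readers unfamiliar with the check that fired: the trace above ends in memblock_merge_regions(), which walks neighbouring regions of a memblock array. The sketch below is only a simplified user-space illustration of the kind of ordering invariant such a merge step relies on (each region must end at or before the start of the next one). The struct and function names are hypothetical, and this is not the kernel source at mm/memblock.c:519.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region: [base, base + size). */
struct mem_region {
    unsigned long base;
    unsigned long size;
};

/*
 * Return true if no region runs past the start of its successor,
 * i.e. the array is sorted by base and free of overlaps.
 * An insert that breaks this ordering is the sort of state a
 * BUG-style assertion in a merge pass would be expected to catch.
 */
bool regions_are_ordered(const struct mem_region *r, size_t cnt)
{
    for (size_t i = 0; i + 1 < cnt; i++) {
        if (r[i].base + r[i].size > r[i + 1].base)
            return false;
    }
    return true;
}
```

Calling regions_are_ordered() on an array such as {0,32},{16,16} returns false, because the first region ends at 32 while the second starts at 16.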
Hi Sachin,
I don't have a powerpc machine, and I don't know why this happened.
Hi Mike,
Do you have any suggestions? I tested with tools/testing/memblock, and the tests passed.
Hi Yajun,
On Tue, Nov 08, 2022 at 02:27:53AM +0000, Yajun Deng wrote:
> Hi Sachin,
> I didn't have a powerpc architecture machine. I don't know why this happened.
>
> Hi Mike,
> Do you have any suggestions?
You can try reproducing the bug in qemu or work with Sachin to debug the
issue.
> I tested in tools/testing/memblock, and it was successful.
Memblock tests still provide limited coverage and they don't deal with all
possible cases.
For now I'm dropping this patch from the memblock tree until the issue is
fixed.
--
Sincerely yours,
Mike.
November 8, 2022 3:55 PM, "Mike Rapoport" <[email protected]> wrote:
> Hi Yajun,
>
> On Tue, Nov 08, 2022 at 02:27:53AM +0000, Yajun Deng wrote:
>
>> Hi Sachin,
>> I didn't have a powerpc architecture machine. I don't know why this happened.
>>
>> Hi Mike,
>> Do you have any suggestions?
>
> You can try reproducing the bug qemu or work with Sachin to debug the
> issue.
>
Thanks, I'll try it.
November 9, 2022 6:03 PM, "Yajun Deng" <[email protected]> wrote:
> Hey Mike,
>
Sorry, this email should have been sent to Sachin, not Mike.
Please forgive my confusion. So:
Hey Sachin,
Can you help me test the attached file?
Please use this new patch instead of the one in memblock tree.
> On 09-Nov-2022, at 3:55 PM, Yajun Deng <[email protected]> wrote:
>
> November 9, 2022 6:03 PM, "Yajun Deng" <[email protected]> wrote:
>
>> Hey Mike,
>>
> Sorry, this email should be sent to Sachin but not Mike.
> Please forgive my confusion. So:
>
> Hey Sachin,
> Can you help me test the attached file?
> Please use this new patch instead of the one in memblock tree.
Thanks for the fix. With the updated patch, the kernel boots correctly.
Tested-by: Sachin Sant <[email protected]>
- Sachin
Hey Mike,
Can you help me test the attached file?
Please use this new patch instead of the one in memblock tree.
November 9, 2022 6:55 PM, "Sachin Sant" <[email protected]> wrote:
>> On 09-Nov-2022, at 3:55 PM, Yajun Deng <[email protected]> wrote:
>>
>> November 9, 2022 6:03 PM, "Yajun Deng" <[email protected]> wrote:
>>
>>> Hey Mike,
>>
>> Sorry, this email should be sent to Sachin but not Mike.
>> Please forgive my confusion. So:
>>
>> Hey Sachin,
>> Can you help me test the attached file?
>> Please use this new patch instead of the one in memblock tree.
>
> Thanks for the fix. With the updated patch kernel boots correctly.
>
Thanks for your test results.
Hi Mike,
Do you have any other suggestions for this patch? If not, I'll send a v3 patch.
Hi Yajun,
On Wed, Nov 09, 2022 at 11:32:27AM +0000, Yajun Deng wrote:
> November 9, 2022 6:55 PM, "Sachin Sant" <[email protected]> wrote:
>
> >> On 09-Nov-2022, at 3:55 PM, Yajun Deng <[email protected]> wrote:
> >>
> >> November 9, 2022 6:03 PM, "Yajun Deng" <[email protected]> wrote:
> >>
> >>> Hey Mike,
> >>
> >> Sorry, this email should be sent to Sachin but not Mike.
> >> Please forgive my confusion. So:
> >>
> >> Hey Sachin,
> >> Can you help me test the attached file?
> >> Please use this new patch instead of the one in memblock tree.
> >
> > Thanks for the fix. With the updated patch kernel boots correctly.
> >
>
> Thanks for your test results.
>
> Hi Mike,
> Do you have any other suggestions for this patch? If not, I'll send a v3 patch.
Unfortunately I don't think the new version has much value as it does not
really eliminate the second loop in case memory allocation is required.
I'd say the improvement is not worth the churn.
--
Sincerely yours,
Mike.
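To illustrate the point about the second loop: the patch title describes memblock_add_range() as running its loop twice. The following is a simplified user-space sketch of that general pattern, not the kernel code and not the attached patch (which is not shown in this thread). It assumes a first pass that only counts the new regions needed, an array resize if they do not fit, and a second pass that inserts. The capacity shortcut, and all struct and function names, are hypothetical; merging of adjacent regions is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a sorted, non-overlapping region array. */
struct range { unsigned long base, size; };
struct range_list { struct range *r; size_t cnt, max; };

/* Insert a range at index idx, shifting later entries up by one. */
static void insert_at(struct range_list *l, size_t idx,
                      unsigned long base, unsigned long size)
{
    assert(l->cnt < l->max);
    memmove(&l->r[idx + 1], &l->r[idx], (l->cnt - idx) * sizeof(*l->r));
    l->r[idx].base = base;
    l->r[idx].size = size;
    l->cnt++;
}

/*
 * Walk the list once over [base, end). Every gap not yet covered
 * counts as one new region; it is inserted only if insert is true.
 */
static size_t walk(struct range_list *l, unsigned long base,
                   unsigned long end, bool insert)
{
    size_t nr_new = 0, i;

    for (i = 0; i < l->cnt; i++) {
        unsigned long rbase = l->r[i].base;
        unsigned long rend = rbase + l->r[i].size;

        if (rbase >= end)
            break;
        if (rend <= base)
            continue;
        if (rbase > base) {          /* gap in front of this region */
            nr_new++;
            if (insert)
                insert_at(l, i++, base, rbase - base);
        }
        base = rend < end ? rend : end;
    }
    if (base < end) {                /* remaining tail */
        nr_new++;
        if (insert)
            insert_at(l, i, base, end - base);
    }
    return nr_new;
}

int add_range(struct range_list *l, unsigned long base, unsigned long size)
{
    unsigned long end = base + size;

    /*
     * Hypothetical shortcut: a range overlapping all cnt regions needs
     * at most cnt + 1 new slots, so counting is skipped when they fit.
     */
    if (l->cnt * 2 + 1 > l->max) {
        size_t need = l->cnt + walk(l, base, end, false); /* pass 1 */

        if (need > l->max) {
            size_t new_max = l->max ? l->max : 1;
            struct range *tmp;

            while (new_max < need)
                new_max *= 2;
            tmp = realloc(l->r, new_max * sizeof(*tmp));
            if (!tmp)
                return -1;
            l->r = tmp;
            l->max = new_max;
        }
    }
    walk(l, base, end, true);                             /* pass 2 */
    return 0;
}
```

In this sketch the counting pass is skipped only while the array has spare capacity. Whenever the array may have to grow, both passes still run, which is the case the objection above refers to.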
November 9, 2022 7:42 PM, "Mike Rapoport" <[email protected]> wrote:
> Hi Yajun,
>
> On Wed, Nov 09, 2022 at 11:32:27AM +0000, Yajun Deng wrote:
>
> Unfortunately I don't think the new version has much value as it does not
> really eliminate the second loop in case memory allocation is required.
> I'd say the improvement is not worth the churn.
>
OK, I got it.