2016-11-15 15:36:01

by Guenter Roeck

Subject: next: s390 crash due to 's390: move sys_call_table and last_break from thread_info to thread_struct'

Hi Martin,

my s390 qemu boot test crashes in -next as follows.

Kernel stack overflow.
CPU: 0 PID: 923 Comm: modprobe Not tainted 4.9.0-rc5-next-20161115 #1
Hardware name: QEMU QEMU QEMU (KVM)
task: 000000001d805100 task.stack: 000000001d898000
Krnl PSW : 0404e00180000000 0000000000ac2b42 (pgm_check_handler+0xd6/0x1b4)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0000000000000020 0000000000000000 0000000000000004 0000000000000000
0000000000400034 0000000000000000 000000007ff3b350 0000000000000001
0404e00180000000 0000000000ac2b42 0000000000ac2ad0 000000007ff38178
000000001d805100 0000000000ac335e 0000000000000200 000000007ff380d8
Krnl Code: 0000000000ac2b34: a7840005 brc 8,ac2b3e
0000000000ac2b38: d2ffe0e8d000 mvc 232(256,%r14),0(%r13)
#0000000000ac2b3e: 41b0f0a0 la %r11,160(%r15)
>0000000000ac2b42: eb07b0180024 stmg %r0,%r7,24(%r11)
0000000000ac2b48: d23fb0580200 mvc 88(64,%r11),512
0000000000ac2b4e: eb89b0080024 stmg %r8,%r9,8(%r11)
0000000000ac2b54: d203b0a0008c mvc 160(4,%r11),140
0000000000ac2b5a: d207b0a800a8 mvc 168(8,%r11),168
Call Trace:
no locks held by modprobe/923.
Last Breaking-Event-Address:
[<0000000000000000>] (null)
Kernel panic - not syncing: Corrupt kernel stack, can't continue.

Bisect points to commit 1914608db9 ("s390: move sys_call_table and last_break
from thread_info to thread_struct"). Reverting that patch fixes the problem.

Configuration is s390:defconfig with CONFIG_MARCH_Z900=y.

Bisect log is attached.
A complete log is at:
http://kerneltests.org/builders/qemu-s390-next/builds/252/steps/qemubuildcommand/logs/stdio

Guenter

---
# bad: [88a2ced28ffe354132353af73f9429f299b12e4c] Add linux-next specific files for 20161115
# good: [a25f0944ba9b1d8a6813fd6f1a86f1bd59ac25a6] Linux 4.9-rc5
git bisect start 'HEAD' 'v4.9-rc5'
# bad: [4fa7a32011ff952305f571c60384e907915e551c] Merge remote-tracking branch 'drm/drm-next'
git bisect bad 4fa7a32011ff952305f571c60384e907915e551c
# bad: [e3c8127151053b1561287d9f70ad07e45321d5a9] Merge remote-tracking branch 'dlm/next'
git bisect bad e3c8127151053b1561287d9f70ad07e45321d5a9
# good: [0956c4cfc46e3c572990366ad99592a93d0ae450] Merge remote-tracking branch 'renesas/next'
git bisect good 0956c4cfc46e3c572990366ad99592a93d0ae450
# bad: [833cac18bcdd53af7578cfdded58638ffef11be5] Merge remote-tracking branch 'ext4/dev'
git bisect bad 833cac18bcdd53af7578cfdded58638ffef11be5
# good: [f06b259941a664c5d3f388c42d8aea555fa65e9f] Merge remote-tracking branch 'arm64/for-next/core'
git bisect good f06b259941a664c5d3f388c42d8aea555fa65e9f
# bad: [b8e4c75a03709e8509640625efc506c69432a8b2] Merge remote-tracking branch 'tile/master'
git bisect bad b8e4c75a03709e8509640625efc506c69432a8b2
# good: [e56732ed80f07b8bfa7e9e95cb46e9faee3420bc] Merge remote-tracking branch 'powerpc/next'
git bisect good e56732ed80f07b8bfa7e9e95cb46e9faee3420bc
# good: [0729dcf248325db600f232d7b96e76441ea450dd] s390: hotplug: make pci_hpc explicitly non-modular
git bisect good 0729dcf248325db600f232d7b96e76441ea450dd
# good: [f8fc82b47149e3449d23e94d6ecf30af2ffcebff] s390: move system_call field from thread_info to thread_struct
git bisect good f8fc82b47149e3449d23e94d6ecf30af2ffcebff
# good: [ecc8bebe29f5c36e3b7b37f52946f318654a29cb] tile: remove #pragma unroll from finv_buffer_remote()
git bisect good ecc8bebe29f5c36e3b7b37f52946f318654a29cb
# bad: [1914608db9e8974ac9f53efdcf0f00f331f4c0e8] s390: move sys_call_table and last_break from thread_info to thread_struct
git bisect bad 1914608db9e8974ac9f53efdcf0f00f331f4c0e8
# good: [90c53e65806323382e8bff212cc993700a4a62d9] s390: move cputime accounting fields from thread_info to thread_struct
git bisect good 90c53e65806323382e8bff212cc993700a4a62d9
# first bad commit: [1914608db9e8974ac9f53efdcf0f00f331f4c0e8] s390: move sys_call_table and last_break from thread_info to thread_struct


2016-11-15 15:54:29

by Martin Schwidefsky

Subject: Re: next: s390 crash due to 's390: move sys_call_table and last_break from thread_info to thread_struct'

On Tue, 15 Nov 2016 07:35:54 -0800
Guenter Roeck <[email protected]> wrote:

> Hi Martin,
>
> my s390 qemu boot test crashes in -next as follows.
>
> Kernel stack overflow.
> CPU: 0 PID: 923 Comm: modprobe Not tainted 4.9.0-rc5-next-20161115 #1
> Hardware name: QEMU QEMU QEMU (KVM)
> task: 000000001d805100 task.stack: 000000001d898000
> Krnl PSW : 0404e00180000000 0000000000ac2b42 (pgm_check_handler+0xd6/0x1b4)
> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
> Krnl GPRS: 0000000000000020 0000000000000000 0000000000000004 0000000000000000
> 0000000000400034 0000000000000000 000000007ff3b350 0000000000000001
> 0404e00180000000 0000000000ac2b42 0000000000ac2ad0 000000007ff38178
> 000000001d805100 0000000000ac335e 0000000000000200 000000007ff380d8
> Krnl Code: 0000000000ac2b34: a7840005 brc 8,ac2b3e
> 0000000000ac2b38: d2ffe0e8d000 mvc 232(256,%r14),0(%r13)
> #0000000000ac2b3e: 41b0f0a0 la %r11,160(%r15)
> >0000000000ac2b42: eb07b0180024 stmg %r0,%r7,24(%r11)
> 0000000000ac2b48: d23fb0580200 mvc 88(64,%r11),512
> 0000000000ac2b4e: eb89b0080024 stmg %r8,%r9,8(%r11)
> 0000000000ac2b54: d203b0a0008c mvc 160(4,%r11),140
> 0000000000ac2b5a: d207b0a800a8 mvc 168(8,%r11),168
> Call Trace:
> no locks held by modprobe/923.
> Last Breaking-Event-Address:
> [<0000000000000000>] (null)
> Kernel panic - not syncing: Corrupt kernel stack, can't continue.
>
> Bisect points to commit 1914608db9 ("s390: move sys_call_table and last_break
> from thread_info to thread_struct"). Reverting that patch fixes the problem.
>
> Configuration is s390:defconfig with CONFIG_MARCH_Z900=y.
>
> Bisect log is attached.
> A complete log is at:
> http://kerneltests.org/builders/qemu-s390-next/builds/252/steps/qemubuildcommand/logs/stdio

Thanks for the report. Builds for Z900 and Z990 are borked. This hunk

@@ -287,7 +292,13 @@ ENTRY(system_call)
mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
stg %r14,__PT_FLAGS(%r11)
.Lsysc_do_svc:
- lg %r10,__TI_sysc_table(%r12) # address of system call table
+ # load address of system call table
+#ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
+ lg %r10,__TASK_thread+__THREAD_sysc_table(%r12)
+#else
+ lghi %r10,__TASK_thread
+ lg %r10,__THREAD_sysc_table(%r10,%r12)
+#endif
llgh %r8,__PT_INT_CODE+2(%r11)
slag %r8,%r8,2 # shift and test for svc 0
jnz .Lsysc_nr_ok

makes ill use of %r10 in the #else part. Should be fixed now and tomorrow's -next
tree will have the fix. Thanks again.

--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
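
Editorial note on the #ifdef in the hunk above: a plausible reason for the two-instruction #else path is that machines without the long-displacement facility (introduced with the z990) accept only a 12-bit unsigned displacement (0..4095) in lg, so a larger offset has to be supplied through an index register loaded with lghi. The Python sketch below illustrates that selection only. The offset values, the register names and the helper load_sequence() are hypothetical, not taken from the kernel, and the sketch says nothing about which scratch register is safe to use at that point in entry.S.

```python
# Displacement and immediate ranges of the instruction forms involved.
SHORT_DISP = range(0, 1 << 12)             # 12-bit unsigned displacement
LONG_DISP = range(-(1 << 19), 1 << 19)     # 20-bit signed displacement
LGHI_IMM = range(-(1 << 15), 1 << 15)      # 16-bit signed immediate of lghi


def load_sequence(task_thread, field, have_long_disp,
                  dst="%r10", scratch="%r1", base="%r12"):
    """Return the instructions needed to load base[task_thread + field].

    task_thread and field are byte offsets (hypothetical values here).
    """
    total = task_thread + field
    limit = LONG_DISP if have_long_disp else SHORT_DISP
    if total in limit:
        # The combined offset fits into the displacement field directly.
        return [f"lg {dst},{total}({base})"]
    if task_thread not in LGHI_IMM or field not in SHORT_DISP:
        raise ValueError("offsets too large for the lghi + lg sequence")
    # Put the large part into an index register, keep the small part
    # as a short displacement.
    return [f"lghi {scratch},{task_thread}",
            f"lg {dst},{field}({scratch},{base})"]


if __name__ == "__main__":
    # Hypothetical offsets: thread_struct at 0x1200 inside task_struct,
    # the field at 0x40 inside thread_struct.
    for long_disp in (True, False):
        print(long_disp, load_sequence(0x1200, 0x40, long_disp))
```

With the made-up offsets, the long-displacement case needs a single lg, while the short-displacement case falls back to lghi plus an indexed lg. That is the same shape as the two branches of the #ifdef.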

2016-11-24 20:54:42

by Guenter Roeck

Subject: Re: next: s390 crash due to 's390: move sys_call_table and last_break from thread_info to thread_struct'

Martin,

On 11/15/2016 07:54 AM, Martin Schwidefsky wrote:
> On Tue, 15 Nov 2016 07:35:54 -0800
> Guenter Roeck <[email protected]> wrote:
>
>> Hi Martin,
>>
>> my s390 qemu boot test crashes in -next as follows.
>>
>> Kernel stack overflow.
>> CPU: 0 PID: 923 Comm: modprobe Not tainted 4.9.0-rc5-next-20161115 #1
>> Hardware name: QEMU QEMU QEMU (KVM)
>> task: 000000001d805100 task.stack: 000000001d898000
>> Krnl PSW : 0404e00180000000 0000000000ac2b42 (pgm_check_handler+0xd6/0x1b4)
>> R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
>> Krnl GPRS: 0000000000000020 0000000000000000 0000000000000004 0000000000000000
>> 0000000000400034 0000000000000000 000000007ff3b350 0000000000000001
>> 0404e00180000000 0000000000ac2b42 0000000000ac2ad0 000000007ff38178
>> 000000001d805100 0000000000ac335e 0000000000000200 000000007ff380d8
>> Krnl Code: 0000000000ac2b34: a7840005 brc 8,ac2b3e
>> 0000000000ac2b38: d2ffe0e8d000 mvc 232(256,%r14),0(%r13)
>> #0000000000ac2b3e: 41b0f0a0 la %r11,160(%r15)
>> >0000000000ac2b42: eb07b0180024 stmg %r0,%r7,24(%r11)
>> 0000000000ac2b48: d23fb0580200 mvc 88(64,%r11),512
>> 0000000000ac2b4e: eb89b0080024 stmg %r8,%r9,8(%r11)
>> 0000000000ac2b54: d203b0a0008c mvc 160(4,%r11),140
>> 0000000000ac2b5a: d207b0a800a8 mvc 168(8,%r11),168
>> Call Trace:
>> no locks held by modprobe/923.
>> Last Breaking-Event-Address:
>> [<0000000000000000>] (null)
>> Kernel panic - not syncing: Corrupt kernel stack, can't continue.
>>
>> Bisect points to commit 1914608db9 ("s390: move sys_call_table and last_break
>> from thread_info to thread_struct"). Reverting that patch fixes the problem.
>>
>> Configuration is s390:defconfig with CONFIG_MARCH_Z900=y.
>>
>> Bisect log is attached.
>> A complete log is at:
>> http://kerneltests.org/builders/qemu-s390-next/builds/252/steps/qemubuildcommand/logs/stdio
>
> Thanks for the report. Builds for Z900 and Z990 are borked. This hunk
>
> @@ -287,7 +292,13 @@ ENTRY(system_call)
> mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
> stg %r14,__PT_FLAGS(%r11)
> .Lsysc_do_svc:
> - lg %r10,__TI_sysc_table(%r12) # address of system call table
> + # load address of system call table
> +#ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
> + lg %r10,__TASK_thread+__THREAD_sysc_table(%r12)
> +#else
> + lghi %r10,__TASK_thread
> + lg %r10,__THREAD_sysc_table(%r10,%r12)
> +#endif
> llgh %r8,__PT_INT_CODE+2(%r11)
> slag %r8,%r8,2 # shift and test for svc 0
> jnz .Lsysc_nr_ok
>
> makes ill use of %r10 in the #else part. Should be fixed now and tomorrow's -next
> tree will have the fix. Thanks again.
>

This is still crashing in -next with exactly the same message.

Guenter

2016-11-25 09:06:31

by Martin Schwidefsky

Subject: Re: next: s390 crash due to 's390: move sys_call_table and last_break from thread_info to thread_struct'

Hi Guenter,

On Thu, 24 Nov 2016 12:53:52 -0800
Guenter Roeck <[email protected]> wrote:

> > Thanks for the report. Builds for Z900 and Z990 are borked. This hunk
> >
> > @@ -287,7 +292,13 @@ ENTRY(system_call)
> > mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
> > stg %r14,__PT_FLAGS(%r11)
> > .Lsysc_do_svc:
> > - lg %r10,__TI_sysc_table(%r12) # address of system call table
> > + # load address of system call table
> > +#ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
> > + lg %r10,__TASK_thread+__THREAD_sysc_table(%r12)
> > +#else
> > + lghi %r10,__TASK_thread
> > + lg %r10,__THREAD_sysc_table(%r10,%r12)
> > +#endif
> > llgh %r8,__PT_INT_CODE+2(%r11)
> > slag %r8,%r8,2 # shift and test for svc 0
> > jnz .Lsysc_nr_ok
> >
> > makes ill use of %r10 in the #else part. Should be fixed now and tomorrow's -next
> > tree will have the fix. Thanks again.
> >
>
> This is still crashing in -next with exactly the same message.

Yes, it is (note to myself: don't do things in a hurry). The patch below
gets linux-next booting again for CONFIG_MARCH_Z900=y on my test system.
Sorry about the trouble.

--
From 2ec05f7c28963c12e9618e9f7f3b29edcec40482 Mon Sep 17 00:00:00 2001
From: Martin Schwidefsky <[email protected]>
Date: Fri, 25 Nov 2016 09:53:42 +0100
Subject: [PATCH] s390: fix kernel oops for CONFIG_MARCH_Z900=y builds

The LAST_BREAK macro in entry.S uses a different instruction sequence
for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
store of the last breaking event address needs to take the different
length of the code block into account.

Fixes: f8fc82b47149e344 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
Reported-by: Guenter Roeck <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
---
arch/s390/kernel/entry.S | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
index 1cc4578..e2e47f7 100644
--- a/arch/s390/kernel/entry.S
+++ b/arch/s390/kernel/entry.S
@@ -123,10 +123,11 @@ _PIF_WORK	= (_PIF_PER_TRAP)

 	.macro	LAST_BREAK scratch
 	srag	\scratch,%r10,23
-	jz	.+10
 #ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
+	jz	.+10
 	stg	%r10,__TASK_thread+__THREAD_last_break(%r12)
 #else
+	jz	.+14
 	lghi	\scratch,__TASK_thread
 	stg	%r10,__THREAD_last_break(\scratch,%r12)
 #endif
--
2.8.4
--
blue skies,
Martin.

"Reality continues to ruin my life." - Calvin.
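
Editorial note on the branch offsets in the patch above: "jz .+N" is relative to the address of the jz instruction itself, so N has to be the length of jz plus the length of everything that is to be skipped. The Python sketch below redoes that arithmetic. The instruction lengths are the architected ones (jz is an extended mnemonic for brc and is 4 bytes long, lghi is 4 bytes, stg is 6 bytes); the helper names skip_offset() and landing() are made up for this illustration.

```python
# Instruction lengths in bytes.
INSN_LEN = {"jz": 4, "lghi": 4, "stg": 6}

# The code that "jz" is supposed to skip in each configuration.
Z990_BLOCK = ["stg"]            # CONFIG_HAVE_MARCH_Z990_FEATURES
Z900_BLOCK = ["lghi", "stg"]    # #else path, e.g. CONFIG_MARCH_Z900=y


def skip_offset(block):
    """Offset N for 'jz .+N' so that the branch lands just past block."""
    return INSN_LEN["jz"] + sum(INSN_LEN[insn] for insn in block)


def landing(offset, block):
    """Return (instruction, byte offset into it) hit by 'jz .+offset'.

    ("end", 0) means the branch lands exactly behind the block.
    """
    pos = INSN_LEN["jz"]        # the block starts right after the jz
    for insn in block:
        if pos <= offset < pos + INSN_LEN[insn]:
            return insn, offset - pos
        pos += INSN_LEN[insn]
    return "end", offset - pos


if __name__ == "__main__":
    print(skip_offset(Z990_BLOCK))    # offset needed with long displacement
    print(skip_offset(Z900_BLOCK))    # offset needed on the #else path
    print(landing(10, Z900_BLOCK))    # where the old, shared offset ended up
```

The sketch yields 10 for the long-displacement variant and 14 for the #else variant, matching the patch. With the old shared ".+10", the branch on the #else path lands two bytes into the stg instruction, so the CPU would decode from the middle of an instruction, which is consistent with the program check seen in the report.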

2016-11-25 16:20:38

by Guenter Roeck

Subject: Re: next: s390 crash due to 's390: move sys_call_table and last_break from thread_info to thread_struct'

On 11/25/2016 01:05 AM, Martin Schwidefsky wrote:
> Hi Guenter,
>
> On Thu, 24 Nov 2016 12:53:52 -0800
> Guenter Roeck <[email protected]> wrote:
>
>>> Thanks for the report. Builds for Z900 and Z990 are borked. This hunk
>>>
>>> @@ -287,7 +292,13 @@ ENTRY(system_call)
>>> mvc __PT_INT_CODE(4,%r11),__LC_SVC_ILC
>>> stg %r14,__PT_FLAGS(%r11)
>>> .Lsysc_do_svc:
>>> - lg %r10,__TI_sysc_table(%r12) # address of system call table
>>> + # load address of system call table
>>> +#ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
>>> + lg %r10,__TASK_thread+__THREAD_sysc_table(%r12)
>>> +#else
>>> + lghi %r10,__TASK_thread
>>> + lg %r10,__THREAD_sysc_table(%r10,%r12)
>>> +#endif
>>> llgh %r8,__PT_INT_CODE+2(%r11)
>>> slag %r8,%r8,2 # shift and test for svc 0
>>> jnz .Lsysc_nr_ok
>>>
>>> makes ill use of %r10 in the #else part. Should be fixed now and tomorrow's -next
>>> tree will have the fix. Thanks again.
>>>
>>
>> This is still crashing in -next with exactly the same message.
>
> Yes, it is (note to myself: don't do things in a hurry). The patch below
> gets linux-next booting again for CONFIG_MARCH_Z900=y on my test system.
> Sorry about the trouble.
>
No problem. Just trying to make sure it doesn't find its way into mainline.

> --
> From 2ec05f7c28963c12e9618e9f7f3b29edcec40482 Mon Sep 17 00:00:00 2001
> From: Martin Schwidefsky <[email protected]>
> Date: Fri, 25 Nov 2016 09:53:42 +0100
> Subject: [PATCH] s390: fix kernel oops for CONFIG_MARCH_Z900=y builds
>
> The LAST_BREAK macro in entry.S uses a different instruction sequence
> for CONFIG_MARCH_Z900 builds. The branch target offset to skip the
> store of the last breaking event address needs to take the different
> length of the code block into account.
>
> Fixes: f8fc82b47149e344 ("s390: move sys_call_table and last_break from thread_info to thread_struct")
> Reported-by: Guenter Roeck <[email protected]>
> Signed-off-by: Martin Schwidefsky <[email protected]>

Tested-by: Guenter Roeck <[email protected]>

Guenter

> ---
> arch/s390/kernel/entry.S | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/kernel/entry.S b/arch/s390/kernel/entry.S
> index 1cc4578..e2e47f7 100644
> --- a/arch/s390/kernel/entry.S
> +++ b/arch/s390/kernel/entry.S
> @@ -123,10 +123,11 @@ _PIF_WORK = (_PIF_PER_TRAP)
>
> .macro LAST_BREAK scratch
> srag \scratch,%r10,23
> - jz .+10
> #ifdef CONFIG_HAVE_MARCH_Z990_FEATURES
> + jz .+10
> stg %r10,__TASK_thread+__THREAD_last_break(%r12)
> #else
> + jz .+14
> lghi \scratch,__TASK_thread
> stg %r10,__THREAD_last_break(\scratch,%r12)
> #endif
>