2017-11-29 12:41:16

by Jarkko Nikula

Subject: Re: [PATCH] x86/entry/64: Fix native_load_gs_index() SWAPGS handling with IRQ state tracing enabled

On 11/29/2017 09:09 AM, Ingo Molnar wrote:
>
> * Jarkko Nikula wrote:
>
>> Hi
>>
>> Suspend-to-ram and resume stopped working on v4.15-rc1 and I
>> bisected it to commit ca37e57bbe0c ("x86/entry/64: Add missing
>> irqflags tracing to native_load_gs_index()").
>>
>> I noticed it on Intel Kabylake (core) and Apollolake (atom) based
>> prototype machines. Symptoms are that machine appears to enter
>> into suspend but resumes instantly and hangs. Unfortunately no
>> logs.
>>
>> If I revert ca37e57bbe0c on v4.15-rc1 it works as expected.
>
> Hm, that commit looks broken with irq-tracing enabled. Does the
> patch below fix it?
>
No, it makes the machine fail to boot at all :-(

Log below when I used my config (now attached). With x86_64_defconfig it
booted twice but didn't survive suspend/resume. However several other
boot attempts with x86_64_defconfig failed somewhat similarly. Not in
the same place but hanging anyway. With my own config it seems to always
end up failing in trace_hardirqs_off_caller.

Then I noticed that suspend/resume does not work on v4.14 either when I use
x86_64_defconfig. That may be an unrelated issue.

...
[ 1.917851] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x19f2297dd97, max_idle_ns: 440795236593 ns
[ 1.929795] workingset: timestamp_bits=62 max_order=21 bucket_order=0
[ 1.939699] BUG: stack guard page was hit at ffffc90000233ff8 (stack is ffffc90000234000..ffffc90000237fff)
** 606 printk messages dropped **
[ 1.940339] ? native_iret+0x7/0x7
[ 1.940340] ? error_entry+0x6f/0xc0
[ 1.940341] error_entry+0x6f/0xc0
[ 1.940342] RIP: 0010:trace_hardirqs_off_caller+0x8/0xc0
[ 1.940342] RSP: 0000:ffffc90000257048 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
[ 1.940343] RAX: 000000008168ad47 RBX: 0000000000000001 RCX: ffffffff8168ad47
[ 1.940343] RDX: ffff8802b505bb28 RSI: 0000000000000000 RDI: ffffffff8168b5ef
[ 1.940344] RBP: ffffc900002570a0 R08: 00000000ebddc52a R09: 0000000000000001
[ 1.940344] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8802b505a840
[ 1.940345] R13: ffff8802b505b048 R14: ffff8802b39dc800 R15: ffff8802b505a840
[ 1.940346] ? page_fault+0xc/0x30
[ 1.940347] ? native_iret+0x7/0x7
[ 1.940348] ? error_entry+0x6f/0xc0
[ 1.940349] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1.940350] ? native_iret+0x7/0x7
[ 1.940351] ? error_entry+0x6f/0xc0
[ 1.940352] error_entry+0x6f/0xc0
[ 1.940353] RIP: 0010:trace_hardirqs_off_caller+0x8/0xc0
[ 1.940353] RSP: 0000:ffffc90000257168 EFLAGS: 00010093 ORIG_RAX: 0000000000000000
[ 1.940354] RAX: 000000008168ad47 RBX: 0000000000000001 RCX: ffffffff8168ad47
[ 1.940354] RDX: ffff8802b505bb28 RSI: 0000000000000000 RDI: ffffffff8168b5ef
[ 1.940355] RBP: ffffc900002571c0 R08: 00000000ebddc52a R09: 0000000000000001
[ 1.940355] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8802b505a840
[ 1.940356] R13: ffff8802b505b048 R14: ffff8802b39dc800 R15: ffff8802b505a840
[ 1.940357] ? page_fault+0xc/0x30
[ 1.940358] ? native_iret+0x7/0x7
[ 1.940359] ? error_entry+0x6f/0xc0
[ 1.940360] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1.940361] ? native_iret+0x7/0x7
[ 1.940362] ? error_entry+0x6f/0xc0
[ 1.940363] error_entry+0x6f/0xc0

and this continues.

--
Jarkko


Attachments:
x86_64-4.15.config.gz (25.68 kB)

2017-11-29 09:30:51

by Thomas Gleixner

Subject: Re: [PATCH] x86/entry/64: Fix native_load_gs_index() SWAPGS handling with IRQ state tracing enabled

On Wed, 29 Nov 2017, Jarkko Nikula wrote:

> On 11/29/2017 09:09 AM, Ingo Molnar wrote:
> >
> > * Jarkko Nikula wrote:
> >
> > > Hi
> > >
> > > Suspend-to-ram and resume stopped working on v4.15-rc1 and I bisected it
> > > to commit ca37e57bbe0c ("x86/entry/64: Add missing irqflags tracing to
> > > native_load_gs_index()").
> > >
> > > I noticed it on Intel Kabylake (core) and Apollolake (atom) based
> > > prototype machines. Symptoms are that machine appears to enter
> > > into suspend but resumes instantly and hangs. Unfortunately no
> > > logs.
> > >
> > > If I revert ca37e57bbe0c on v4.15-rc1 it works as expected.
> >
> > Hm, that commit looks broken with irq-tracing enabled. Does the
> > patch below fix it?
> >
> No, it makes the machine fail to boot at all :-(
>
> Log below when I used my config (now attached). With x86_64_defconfig it
> booted twice but didn't survive suspend/resume. However several other boot
> attempts with x86_64_defconfig failed somewhat similarly. Not in the same
> place but hanging anyway. With my own config it seems to always end up failing
> in trace_hardirqs_off_caller.

Does it work when you disable all the tracing muck?

Thanks,

tglx
