2019-02-09 08:25:00

by Sander Eikelenboom

Subject: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778

L.S.,


While testing a Linux 5.0-rc5-ish kernel (pulled yesterday) with some additional patches for
other, already reported issues, I came across the issue below, which I haven't seen with 4.20.x.

I haven't got a reproducer, so it might be hard to hit again.
The system is AMD, and this is from the host kernel running under
the Xen hypervisor, in case it matters.

--

Sander


[17035.016433] BUG: unable to handle kernel paging request at ffff888023e26778
[17035.025887] #PF error: [PROT] [WRITE]
[17035.035146] PGD 2a2a067 P4D 2a2a067 PUD 2a2b067 PMD 7fe01067 PTE 8010000023e26065
[17035.044371] Oops: 0003 [#1] SMP NOPTI
[17035.053720] CPU: 3 PID: 28310 Comm: apt-get Not tainted 5.0.0-rc5-20190208-thp-net-florian-rtl8169-eric-doflr+ #1
[17035.063440] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[17035.072635] RIP: e030:move_page_tables+0x7c1/0xae0
[17035.081585] Code: ce 00 48 8b 03 31 ff 48 89 44 24 20 e8 9e 72 e4 ff 66 90 48 89 c6 48 89 df e8 8b 89 e4 ff 66 90 48 8b 44 24 20 b9 0c 00 00 00 <48> 89 45 00 41 f6 46 52 40 0f 85 3f 02 00 00 49 8b 7e 40 45 31 c0
[17035.100225] RSP: e02b:ffffc90000f2bd40 EFLAGS: 00010282
[17035.109208] RAX: 0000000475e42067 RBX: ffff888023e267e0 RCX: 000000000000000c
[17035.118332] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000201
[17035.127378] RBP: ffff888023e26778 R08: aaaaaaaaaaaaaaaa R09: 000000051c1d9000
[17035.136310] R10: deadbeefdeadf00d R11: ffff88807fc17000 R12: 00007fc59fa00000
[17035.145433] R13: ffffea00008f89a8 R14: ffff88801c2286c0 R15: 00007fc59f800000
[17035.154171] FS: 00007fc5a5591100(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[17035.162730] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[17035.171180] CR2: ffff888023e26778 CR3: 000000001c3f6000 CR4: 0000000000000660
[17035.179545] Call Trace:
[17035.187736] move_vma.isra.3+0xd1/0x2d0
[17035.195837] __se_sys_mremap+0x3c6/0x5b0
[17035.203986] do_syscall_64+0x49/0x100
[17035.212109] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[17035.219971] RIP: 0033:0x7fc5a453527a
[17035.227558] Code: 73 01 c3 48 8b 0d 1e fc 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 49 89 ca b8 19 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ee fb 2a 00 f7 d8 64 89 01 48
[17035.243255] RSP: 002b:00007ffda22d96f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000019
[17035.251121] RAX: ffffffffffffffda RBX: 0000557d40923a30 RCX: 00007fc5a453527a
[17035.258986] RDX: 0000000001a00000 RSI: 0000000001900000 RDI: 00007fc59f7ff000
[17035.267127] RBP: 0000000001a00000 R08: 0000000000000020 R09: 0000000000000040
[17035.275259] R10: 0000000000000001 R11: 0000000000000246 R12: 00007fc59f7ff060
[17035.282681] R13: 00007fc59f7ff000 R14: 0000557d40923a30 R15: 0000557d40829aa0
[17035.290322] Modules linked in:
[17035.297875] CR2: ffff888023e26778
[17035.305405] ---[ end trace 6ff49f09286816b6 ]---
[17035.313131] RIP: e030:move_page_tables+0x7c1/0xae0
[17035.320326] Code: ce 00 48 8b 03 31 ff 48 89 44 24 20 e8 9e 72 e4 ff 66 90 48 89 c6 48 89 df e8 8b 89 e4 ff 66 90 48 8b 44 24 20 b9 0c 00 00 00 <48> 89 45 00 41 f6 46 52 40 0f 85 3f 02 00 00 49 8b 7e 40 45 31 c0
[17035.334851] RSP: e02b:ffffc90000f2bd40 EFLAGS: 00010282
[17035.341727] RAX: 0000000475e42067 RBX: ffff888023e267e0 RCX: 000000000000000c
[17035.348838] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000201
[17035.356000] RBP: ffff888023e26778 R08: aaaaaaaaaaaaaaaa R09: 000000051c1d9000
[17035.363623] R10: deadbeefdeadf00d R11: ffff88807fc17000 R12: 00007fc59fa00000
[17035.371454] R13: ffffea00008f89a8 R14: ffff88801c2286c0 R15: 00007fc59f800000
[17035.378958] FS: 00007fc5a5591100(0000) GS:ffff88807d4c0000(0000) knlGS:0000000000000000
[17035.386585] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[17035.393797] CR2: ffff888023e26778 CR3: 000000001c3f6000 CR4: 0000000000000660
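
For readers decoding the oops: the `#PF error: [PROT] [WRITE]` line (error code 0003) and the dumped PTE tell the same story. The sketch below is an illustration, not kernel code. It uses the architectural x86 bit positions to check that a fault was a write to a present but read-only mapping. The function and macro names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PROT  (1u << 0)   /* 1 = protection violation, 0 = page not present */
#define PF_WRITE (1u << 1)   /* 1 = the faulting access was a write */

/* x86-64 page table entry flag bits (architectural). */
#define PTE_PRESENT (1ULL << 0)
#define PTE_RW      (1ULL << 1)

/*
 * Hypothetical helper: true when the error code and the PTE agree that
 * the fault was a write to a mapping that is present but not writable.
 */
bool fault_is_write_to_readonly(unsigned int err, uint64_t pte)
{
    bool write_protection_fault = (err & PF_PROT) && (err & PF_WRITE);
    bool mapping_is_readonly = (pte & PTE_PRESENT) && !(pte & PTE_RW);

    return write_protection_fault && mapping_is_readonly;
}
```

With the values from the log (error code 0x3, PTE 0x8010000023e26065) the check returns true: the PTE has the present bit set and the RW bit clear.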





2019-02-09 18:44:48

by Sander Eikelenboom

Subject: Re: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778 RIP: e030:move_page_tables+0x7c1/0xae0

On 09/02/2019 09:26, Sander Eikelenboom wrote:
> L.S.,
>
>
> While testing a Linux 5.0-rc5-ish kernel (pulled yesterday) with some additional patches for
> other, already reported issues, I came across the issue below, which I haven't seen with 4.20.x.
>
> I haven't got a reproducer, so it might be hard to hit again.
> The system is AMD, and this is from the host kernel running under
> the Xen hypervisor, in case it matters.

> --
>
> Sander

Hi Boris / Juergen,

The commit causing this is:
2c91bd4a4e2e530582d6fd643ea7b86b27907151 mm: speed up mremap by 20x on large regions

Since it seems there haven't been any other reports about this ..
could it be this doesn't specifically work well with a Xen PVH dom0 ?

I was able to do a straight revert and have been testing it for a while now; so far no issues.

I have also attached my .config; these are probably the most relevant options:
CONFIG_HAVE_MOVE_PMD=y
CONFIG_ARCH_ENABLE_THP_MIGRATION=y
CONFIG_ARCH_WANTS_THP_SWAP=y
CONFIG_THP_SWAP=y
CONFIG_CGROUP_HUGETLB=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_HAVE_ARCH_HUGE_VMAP=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y

Thanks in advance.

--
Sander




Attachments:
dotconfig (120.62 kB)

2019-02-09 18:49:18

by Juergen Gross

Subject: Re: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778 RIP: e030:move_page_tables+0x7c1/0xae0

On 09/02/2019 19:45, Sander Eikelenboom wrote:
> On 09/02/2019 09:26, Sander Eikelenboom wrote:
>> L.S.,
>>
>>
>> While testing a Linux 5.0-rc5-ish kernel (pulled yesterday) with some additional patches for
>> other, already reported issues, I came across the issue below, which I haven't seen with 4.20.x.
>>
>> I haven't got a reproducer, so it might be hard to hit again.
>> The system is AMD, and this is from the host kernel running under
>> the Xen hypervisor, in case it matters.
>
>> --
>>
>> Sander
>
> Hi Boris / Juergen,
>
> The commit causing this is:
> 2c91bd4a4e2e530582d6fd643ea7b86b27907151 mm: speed up mremap by 20x on large regions
>
> Since it seems there haven't been any other reports about this ..
> could it be this doesn't specifically work well with a Xen PVH dom0 ?

PVH? Not PV?


Juergen

2019-02-09 18:54:11

by Sander Eikelenboom

Subject: Re: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778 RIP: e030:move_page_tables+0x7c1/0xae0

On 09/02/2019 19:48, Juergen Gross wrote:
> On 09/02/2019 19:45, Sander Eikelenboom wrote:
>> On 09/02/2019 09:26, Sander Eikelenboom wrote:
>>> L.S.,
>>>
>>>
>>> While testing a Linux 5.0-rc5-ish kernel (pulled yesterday) with some additional patches for
>>> other, already reported issues, I came across the issue below, which I haven't seen with 4.20.x.
>>>
>>> I haven't got a reproducer, so it might be hard to hit again.
>>> The system is AMD, and this is from the host kernel running under
>>> the Xen hypervisor, in case it matters.
>>
>>> --
>>>
>>> Sander
>>
>> Hi Boris / Juergen,
>>
>> The commit causing this is:
>> 2c91bd4a4e2e530582d6fd643ea7b86b27907151 mm: speed up mremap by 20x on large regions
>>
>> Since it seems there haven't been any other reports about this ..
>> could it be this doesn't specifically work well with a Xen PVH dom0 ?
>
> PVH? Not PV?

Ah sorry, indeed PV !

>
> Juergen
>


2019-02-09 18:55:58

by Linus Torvalds

Subject: Re: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778

On Sat, Feb 9, 2019 at 12:24 AM Sander Eikelenboom <[email protected]> wrote:
>
> I haven't got a reproducer, so it might be hard to hit again.
> The system is AMD, and this is from the host kernel running under
> the Xen hypervisor, in case it matters.

I think this is a Xen bug.

In particular, there's a few poison values in there that look like
Xen. Like this:

R10: deadbeefdeadf00d

looks like a special poison value that is from Xen itself.

It looks like the oops is around the TLB flushing code, looking at the
code it's the

        arch_leave_lazy_mmu_mode();
        if (force_flush)
                flush_tlb_range(vma, old_end - len, old_end);
        if (new_ptl != old_ptl)
                spin_unlock(new_ptl);

sequence in move_page_tables. The oopsing code sequence is


  28:*  48 89 45 00       mov    %rax,0x0(%rbp)     <-- trapping instruction
  2c:   41 f6 46 52 40    testb  $0x40,0x52(%r14)

and that "testb $0x40" instruction that comes after the trapping
instruction is the

                ((vma)->vm_flags & VM_HUGETLB) \

from the flush_tlb_range() macro:

    #define flush_tlb_range(vma, start, end)                        \
        flush_tlb_mm_range((vma)->vm_mm, start, end,                \
                           ((vma)->vm_flags & VM_HUGETLB)           \
                                ? huge_page_shift(hstate_vma(vma))  \
                                : PAGE_SHIFT, false)

if I read that oops correctly.
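
The match between `testb $0x40,0x52(%r14)` and the VM_HUGETLB check can be verified with a little arithmetic. The sketch below is only an illustration. It takes VM_HUGETLB to be 0x00400000 (its value in include/linux/mm.h) and assumes a little-endian layout, where a single-bit flag in a wide field can be tested with a one-byte `testb` at some byte offset into the field. That `vm_flags` sits at offset 0x50 in `struct vm_area_struct` for this build is an inference from the 0x52 displacement, not something the oops states. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VM_HUGETLB 0x00400000UL

/* Byte offset into a little-endian field, plus the 8-bit mask to test there. */
struct byte_test {
    unsigned int offset;
    uint8_t mask;
};

/*
 * Hypothetical helper: for a flag with exactly one bit set, compute which
 * byte of the little-endian field holds that bit and the mask within it.
 */
struct byte_test byte_test_for_flag(uint64_t flag)
{
    struct byte_test t = { 0, 0 };

    assert(flag != 0 && (flag & (flag - 1)) == 0); /* exactly one bit set */

    while ((flag & 0xff) == 0) {
        flag >>= 8;
        t.offset++;
    }
    t.mask = (uint8_t)flag;
    return t;
}
```

For VM_HUGETLB this gives byte offset 2 and mask 0x40, so a field starting at 0x50 would be tested at 0x52 with `$0x40`, which is consistent with the reading above.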

I have no idea what that store to 0(%rbp) is for, though - I can't
line that up with anything I see with my own kernel config.

We *do* have changes to 5.0 in the move_page_tables() code (mremap on
a pmd level), so I'm cc'ing some of the people involved there, but
that odd poison value does make me wonder about Xen issues. When I
google for that value, all I see is Xen reports (and your report for
this).

Linus

2019-02-09 19:01:34

by Juergen Gross

Subject: Re: Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778 RIP: e030:move_page_tables+0x7c1/0xae0

On 09/02/2019 19:51, Sander Eikelenboom wrote:
> On 09/02/2019 19:48, Juergen Gross wrote:
>> On 09/02/2019 19:45, Sander Eikelenboom wrote:
>>> On 09/02/2019 09:26, Sander Eikelenboom wrote:
>>>> L.S.,
>>>>
>>>>
>>>> While testing a Linux 5.0-rc5-ish kernel (pulled yesterday) with some additional patches for
>>>> other, already reported issues, I came across the issue below, which I haven't seen with 4.20.x.
>>>>
>>>> I haven't got a reproducer, so it might be hard to hit again.
>>>> The system is AMD, and this is from the host kernel running under
>>>> the Xen hypervisor, in case it matters.
>>>
>>>> --
>>>>
>>>> Sander
>>>
>>> Hi Boris / Juergen,
>>>
>>> The commit causing this is:
>>> 2c91bd4a4e2e530582d6fd643ea7b86b27907151 mm: speed up mremap by 20x on large regions
>>>
>>> Since it seems there haven't been any other reports about this ..
>>> could it be this doesn't specifically work well with a Xen PVH dom0 ?
>>
>> PVH? Not PV?
>
> Ah sorry, indeed PV !

Okay, found the problem. set_pmd_at() is used in the above commit for
writing a PMD entry with a page table address. In all other places
it is used only for huge pages, which are not possible in PV guests.
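
The distinction between the two kinds of PMD entry can be illustrated with the register dump from the oops. This is a sketch under an assumption: that RAX (0x0000000475e42067) is the PMD value being stored by the trapping `mov %rax,0x0(%rbp)`, which the thread does not state explicitly. On x86-64 a PMD entry maps a huge page when the PSE bit (bit 7) is set, and otherwise points to a lower-level page table. The macro and function names below are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 page table entry flag bits (architectural). */
#define X86_PAGE_PRESENT (1ULL << 0)
#define X86_PAGE_PSE     (1ULL << 7)  /* at PMD level: entry maps a 2 MiB page */

/* Hypothetical helper: present PMD entry that maps a huge page directly. */
bool pmd_maps_huge_page(uint64_t pmd)
{
    return (pmd & X86_PAGE_PRESENT) && (pmd & X86_PAGE_PSE);
}

/* Hypothetical helper: present PMD entry that points to a page table. */
bool pmd_points_to_page_table(uint64_t pmd)
{
    return (pmd & X86_PAGE_PRESENT) && !(pmd & X86_PAGE_PSE);
}
```

For the logged RAX value the low bits are 0x067 (present, writable, user, accessed, dirty) and bit 7 is clear, so under the stated assumption the entry points to a page table rather than a huge page.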

I'll send a patch on Monday.


Juergen


2019-02-09 19:46:11

by Andrew Cooper

Subject: Re: [Xen-devel] Linux 5.0 regression: BUG: unable to handle kernel paging request at ffff888023e26778

On 09/02/2019 18:54, Linus Torvalds wrote:
> On Sat, Feb 9, 2019 at 12:24 AM Sander Eikelenboom <[email protected]> wrote:
>> I haven't got a reproducer, so it might be hard to hit again.
>> The system is AMD, and this is from the host kernel running under
>> the Xen hypervisor, in case it matters.
> I think this is a Xen bug.
>
> In particular, there's a few poison values in there that look like
> Xen. Like this:
>
> R10: deadbeefdeadf00d
>
> looks like a special poison value that is from Xen itself.

Xen's hypercall ABI states that parameters in registers may be changed
as part of the hypercall. This is used to restart hypercalls midway
through their processing if we had to deliver an interrupt to the vcpu.

As a result, debug builds of Xen deliberately poison all hypercall
parameters, to help catch guest code which doesn't follow the rules.

~Andrew