2020-09-15 18:28:55

by Linus Torvalds

[permalink] [raw]
Subject: Re: Kernel Benchmarking

On Tue, Sep 15, 2020 at 8:34 AM Matthieu Baerts
<[email protected]> wrote:
>
> > But it sounds like it's 100% repeatable with the fair page lock, which
> > is actually a good thing. It means that if you do a "sysrq-w" while
> > it's blocking, you should see exactly what is waiting for what.
> >
> > (Except since it times out nicely eventually, probably at least part
> > of the waiting is interruptible, and then you need to do "sysrq-t"
> > instead and it's going to be _very_ verbose and much harder to
> > pinpoint things, and you'll probably need to have a very big printk
> > buffer).
>
> Thank you for this idea! I was focused on using lockdep and I forgot
> about this simple method. It is not (yet) a reflex for me to use it!
>
> I think I got an interesting trace I took 20 seconds after having
> started packetdrill:

Ok, so everybody there is basically in the same identical situation,
they all seem to be doing mlockall(), which does __mm_populate() ->
populate_vma_page_range() -> __get_user_pages() -> handle_mm_fault()
and then actually tries to fault in the missing pages.

And that does do a lot of "lock_page()" (and, of course, as a result,
a lot of "unlock_page()" too).

Every one of them is in the "io_schedule()" in the filemap_fault()
path, although two of them seem to be in file_fdatawait_range() rather
than in the lock_page() code itself (so they are also waiting on a
page bit, but they are waiting for the writeback bit to clear).

And all of them do it under the mmap_read_lock().

I'm not seeing what else they'd be blocking on, though.

As mentioned, the thing they are blocking on might be something
interruptible that holds the lock, and might not be in 'D' state. Then
it wouldn't show up in sysrq-W, you'd have to do 'sysrq-T' to see
those..

From past experience, that tends to be a _lot_ of data, though, and it
easily overflows the printk buffers etc.

lockdep has made these kinds of sysrq hacks mostly a thing of the
past, and the few non-lockdep locks (and the page lock is definitely
the biggest of them) are an annoying blast to the past..

Linus


2020-09-15 18:49:09

by Linus Torvalds

[permalink] [raw]
Subject: Re: Kernel Benchmarking

On Tue, Sep 15, 2020 at 11:27 AM Linus Torvalds
<[email protected]> wrote:
>
> Every one of them is in the "io_schedule()" in the filemap_fault()
> path, although two of them seem to be in file_fdatawait_range() rather
> than in the lock_page() code itself (so they are also waiting on a
> page bit, but they are waiting for the writeback bit to clear).

No, that seems to be just stale entries on the stack from a previous
system call, rather than a real trace. There's no way to reach
file_fdatawait_range() from mlockall() that I can see.

So I'm not entirely sure why the stack trace for two of the processes
looks a bit different, but they all look like they should be in
__lock_page_killable().

It's possible those two were woken up (by another CPU) and their stack
is in flux. They also have "wake_up_page_bit()" as a stale entry on
their stack, so that's not entirely unlikely.

So that sysrq-W state shows that yes, people are stuck waiting for a
page, but that wasn't exactly unexpected.

Linus

2020-09-15 19:42:58

by Matthieu Baerts

[permalink] [raw]
Subject: Re: Kernel Benchmarking

On 15/09/2020 20:47, Linus Torvalds wrote:
> On Tue, Sep 15, 2020 at 11:27 AM Linus Torvalds
> <[email protected]> wrote:
>>
>> Every one of them is in the "io_schedule()" in the filemap_fault()
>> path, although two of them seem to be in file_fdatawait_range() rather
>> than in the lock_page() code itself (so they are also waiting on a
>> page bit, but they are waiting for the writeback bit to clear).
>
> No, that seems to be just stale entries on the stack from a previous
> system call, rather than a real trace. There's no way to reach
> file_fdatawait_range() from mlockall() that I can see.
>
> So I'm not entirely sure why the stack trace for two of the processes
> looks a bit different, but they all look like they should be in
> __lock_page_killable().
>
> It's possible those two were woken up (by another CPU) and their stack
> is in flux. They also have "wake_up_page_bit()" as a stale entry on
> their stack, so that's not entirely unlikely.

I don't know if this info is useful but I just checked and I can
reproduce the issue with a single CPU. And the trace is very similar to
the previous one:

------------------- 8< -------------------
[ 23.884953] sysrq: Show Blocked State
[ 23.888310] task:packetdrill state:D stack:13952 pid: 177 ppid:
148 flags:0x00004000
[ 23.895619] Call Trace:
[ 23.897885] __schedule+0x3eb/0x680
[ 23.901033] schedule+0x45/0xb0
[ 23.903882] io_schedule+0xd/0x30
[ 23.906868] __lock_page_killable+0x13e/0x280
[ 23.910729] ? file_fdatawait_range+0x20/0x20
[ 23.914648] filemap_fault+0x6b4/0x970
[ 23.918061] ? filemap_map_pages+0x195/0x330
[ 23.921833] __do_fault+0x32/0x90
[ 23.924754] handle_mm_fault+0x8c1/0xe50
[ 23.928011] __get_user_pages+0x25c/0x750
[ 23.931594] populate_vma_page_range+0x57/0x60
[ 23.935518] __mm_populate+0xa9/0x150
[ 23.938467] __x64_sys_mlockall+0x151/0x180
[ 23.942228] do_syscall_64+0x33/0x40
[ 23.945408] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 23.949736] RIP: 0033:0x7f28b847bb3b
[ 23.960960] Code: Bad RIP value.
[ 23.963856] RSP: 002b:00007ffe48d833c8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 23.970157] RAX: ffffffffffffffda RBX: 000055a16594a450 RCX:
00007f28b847bb3b
[ 23.976474] RDX: 00007ffe48d812a2 RSI: 00007f28b84ff3c0 RDI:
0000000000000003
[ 23.982773] RBP: 00007ffe48d833d0 R08: 0000000000000000 R09:
0000000000000000
[ 23.988998] R10: 00007f28b84feac0 R11: 0000000000000246 R12:
000055a16590c0a0
[ 23.995079] R13: 00007ffe48d83810 R14: 0000000000000000 R15:
0000000000000000
[ 24.001143] task:packetdrill state:D stack:13952 pid: 179 ppid:
146 flags:0x00004000
[ 24.008425] Call Trace:
[ 24.010495] __schedule+0x3eb/0x680
[ 24.013378] schedule+0x45/0xb0
[ 24.016053] io_schedule+0xd/0x30
[ 24.018763] __lock_page_killable+0x13e/0x280
[ 24.022663] ? file_fdatawait_range+0x20/0x20
[ 24.026564] filemap_fault+0x6b4/0x970
[ 24.029954] ? filemap_map_pages+0x195/0x330
[ 24.033865] __do_fault+0x32/0x90
[ 24.036773] handle_mm_fault+0x8c1/0xe50
[ 24.040072] __get_user_pages+0x25c/0x750
[ 24.043667] populate_vma_page_range+0x57/0x60
[ 24.047200] __mm_populate+0xa9/0x150
[ 24.050554] __x64_sys_mlockall+0x151/0x180
[ 24.054273] do_syscall_64+0x33/0x40
[ 24.057492] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.061795] RIP: 0033:0x7f3f900f2b3b
[ 24.065048] Code: Bad RIP value.
[ 24.067971] RSP: 002b:00007ffd682b6338 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.074233] RAX: ffffffffffffffda RBX: 0000563239032450 RCX:
00007f3f900f2b3b
[ 24.080303] RDX: 00007ffd682b4212 RSI: 00007f3f901763c0 RDI:
0000000000000003
[ 24.086263] RBP: 00007ffd682b6340 R08: 0000000000000000 R09:
0000000000000000
[ 24.092364] R10: 00007f3f90175ac0 R11: 0000000000000246 R12:
0000563238ff40a0
[ 24.098345] R13: 00007ffd682b6780 R14: 0000000000000000 R15:
0000000000000000
[ 24.104588] task:packetdrill state:D stack:13512 pid: 185 ppid:
153 flags:0x00004000
[ 24.111856] Call Trace:
[ 24.114132] __schedule+0x3eb/0x680
[ 24.117323] schedule+0x45/0xb0
[ 24.120052] io_schedule+0xd/0x30
[ 24.123036] __lock_page_killable+0x13e/0x280
[ 24.126600] ? file_fdatawait_range+0x20/0x20
[ 24.130146] filemap_fault+0x6b4/0x970
[ 24.133264] ? filemap_map_pages+0x195/0x330
[ 24.136846] __do_fault+0x32/0x90
[ 24.139653] handle_mm_fault+0x8c1/0xe50
[ 24.143165] __get_user_pages+0x25c/0x750
[ 24.146439] populate_vma_page_range+0x57/0x60
[ 24.150050] __mm_populate+0xa9/0x150
[ 24.153325] __x64_sys_mlockall+0x151/0x180
[ 24.157089] do_syscall_64+0x33/0x40
[ 24.160181] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.164690] RIP: 0033:0x7f18e0da3b3b
[ 24.167851] Code: Bad RIP value.
[ 24.170516] RSP: 002b:00007ffc3a0d67f8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.177116] RAX: ffffffffffffffda RBX: 0000562d7b1b1450 RCX:
00007f18e0da3b3b
[ 24.183423] RDX: 00007ffc3a0d46d2 RSI: 00007f18e0e273c0 RDI:
0000000000000003
[ 24.189707] RBP: 00007ffc3a0d6800 R08: 0000000000000000 R09:
0000000000000000
[ 24.195977] R10: 00007f18e0e26ac0 R11: 0000000000000246 R12:
0000562d7b1730a0
[ 24.202018] R13: 00007ffc3a0d6c40 R14: 0000000000000000 R15:
0000000000000000
[ 24.208311] task:packetdrill state:D stack:13952 pid: 188 ppid:
151 flags:0x00004000
[ 24.215398] Call Trace:
[ 24.217471] __schedule+0x3eb/0x680
[ 24.220446] schedule+0x45/0xb0
[ 24.223044] io_schedule+0xd/0x30
[ 24.225774] __lock_page_killable+0x13e/0x280
[ 24.229621] ? file_fdatawait_range+0x20/0x20
[ 24.233542] filemap_fault+0x6b4/0x970
[ 24.236868] ? xas_start+0x69/0x90
[ 24.239766] ? filemap_map_pages+0x195/0x330
[ 24.243194] __do_fault+0x32/0x90
[ 24.246252] handle_mm_fault+0x8c1/0xe50
[ 24.249759] ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 24.254378] __get_user_pages+0x25c/0x750
[ 24.257960] populate_vma_page_range+0x57/0x60
[ 24.261868] __mm_populate+0xa9/0x150
[ 24.265114] __x64_sys_mlockall+0x151/0x180
[ 24.268683] do_syscall_64+0x33/0x40
[ 24.271658] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.275772] RIP: 0033:0x7f2d0b01eb3b
[ 24.278691] Code: Bad RIP value.
[ 24.281400] RSP: 002b:00007ffd7b2d9fa8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.287430] RAX: ffffffffffffffda RBX: 000055785f8a4450 RCX:
00007f2d0b01eb3b
[ 24.293273] RDX: 00007ffd7b2d7e82 RSI: 00007f2d0b0a23c0 RDI:
0000000000000003
[ 24.302752] RBP: 00007ffd7b2d9fb0 R08: 0000000000000000 R09:
0000000000000000
[ 24.309115] R10: 00007f2d0b0a1ac0 R11: 0000000000000246 R12:
000055785f8660a0
[ 24.315529] R13: 00007ffd7b2da3f0 R14: 0000000000000000 R15:
0000000000000000
[ 24.321891] task:packetdrill state:D stack:13952 pid: 190 ppid:
157 flags:0x00004000
[ 24.329197] Call Trace:
[ 24.331531] __schedule+0x3eb/0x680
[ 24.334736] schedule+0x45/0xb0
[ 24.337548] io_schedule+0xd/0x30
[ 24.340362] __lock_page_killable+0x13e/0x280
[ 24.344098] ? file_fdatawait_range+0x20/0x20
[ 24.348001] filemap_fault+0x6b4/0x970
[ 24.351427] ? filemap_map_pages+0x195/0x330
[ 24.355292] __do_fault+0x32/0x90
[ 24.358282] handle_mm_fault+0x8c1/0xe50
[ 24.361788] __get_user_pages+0x25c/0x750
[ 24.365427] populate_vma_page_range+0x57/0x60
[ 24.369408] __mm_populate+0xa9/0x150
[ 24.372726] __x64_sys_mlockall+0x151/0x180
[ 24.376496] do_syscall_64+0x33/0x40
[ 24.379522] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.383623] RIP: 0033:0x7fe749a73b3b
[ 24.386534] Code: Bad RIP value.
[ 24.389240] RSP: 002b:00007fff6d4ad268 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.395433] RAX: ffffffffffffffda RBX: 0000556473049450 RCX:
00007fe749a73b3b
[ 24.401307] RDX: 00007fff6d4ab142 RSI: 00007fe749af73c0 RDI:
0000000000000003
[ 24.407426] RBP: 00007fff6d4ad270 R08: 0000000000000000 R09:
0000000000000000
[ 24.413696] R10: 00007fe749af6ac0 R11: 0000000000000246 R12:
000055647300b0a0
[ 24.419919] R13: 00007fff6d4ad6b0 R14: 0000000000000000 R15:
0000000000000000
[ 24.425940] task:packetdrill state:D stack:13952 pid: 193 ppid:
160 flags:0x00004000
[ 24.433269] Call Trace:
[ 24.435568] __schedule+0x3eb/0x680
[ 24.438397] schedule+0x45/0xb0
[ 24.441030] io_schedule+0xd/0x30
[ 24.443760] __lock_page_killable+0x13e/0x280
[ 24.447275] ? file_fdatawait_range+0x20/0x20
[ 24.450919] filemap_fault+0x6b4/0x970
[ 24.453976] ? filemap_map_pages+0x195/0x330
[ 24.457501] __do_fault+0x32/0x90
[ 24.460273] handle_mm_fault+0x8c1/0xe50
[ 24.463397] __get_user_pages+0x25c/0x750
[ 24.466743] populate_vma_page_range+0x57/0x60
[ 24.470728] __mm_populate+0xa9/0x150
[ 24.474084] __x64_sys_mlockall+0x151/0x180
[ 24.477780] do_syscall_64+0x33/0x40
[ 24.480924] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.485302] RIP: 0033:0x7faf08ad9b3b
[ 24.488404] Code: Bad RIP value.
[ 24.491348] RSP: 002b:00007ffec68c61d8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.497549] RAX: ffffffffffffffda RBX: 00005556df2df450 RCX:
00007faf08ad9b3b
[ 24.503686] RDX: 00007ffec68c40b2 RSI: 00007faf08b5d3c0 RDI:
0000000000000003
[ 24.509693] RBP: 00007ffec68c61e0 R08: 0000000000000000 R09:
0000000000000000
[ 24.515919] R10: 00007faf08b5cac0 R11: 0000000000000246 R12:
00005556df2a10a0
[ 24.521919] R13: 00007ffec68c6620 R14: 0000000000000000 R15:
0000000000000000
[ 24.528173] task:packetdrill state:D stack:13952 pid: 199 ppid:
163 flags:0x00004000
[ 24.535290] Call Trace:
[ 24.537348] __schedule+0x3eb/0x680
[ 24.540324] schedule+0x45/0xb0
[ 24.542834] io_schedule+0xd/0x30
[ 24.545594] __lock_page_killable+0x13e/0x280
[ 24.549485] ? file_fdatawait_range+0x20/0x20
[ 24.553266] filemap_fault+0x6b4/0x970
[ 24.556501] ? filemap_map_pages+0x195/0x330
[ 24.560073] __do_fault+0x32/0x90
[ 24.562786] handle_mm_fault+0x8c1/0xe50
[ 24.566304] __get_user_pages+0x25c/0x750
[ 24.569874] populate_vma_page_range+0x57/0x60
[ 24.573844] __mm_populate+0xa9/0x150
[ 24.577054] __x64_sys_mlockall+0x151/0x180
[ 24.580624] do_syscall_64+0x33/0x40
[ 24.583629] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.587866] RIP: 0033:0x7efdca54bb3b
[ 24.590766] Code: Bad RIP value.
[ 24.593713] RSP: 002b:00007ffe160c8be8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.600288] RAX: ffffffffffffffda RBX: 000055668dc19450 RCX:
00007efdca54bb3b
[ 24.606330] RDX: 00007ffe160c6ac2 RSI: 00007efdca5cf3c0 RDI:
0000000000000003
[ 24.612650] RBP: 00007ffe160c8bf0 R08: 0000000000000000 R09:
0000000000000000
[ 24.618715] R10: 00007efdca5ceac0 R11: 0000000000000246 R12:
000055668dbdb0a0
[ 24.625055] R13: 00007ffe160c9030 R14: 0000000000000000 R15:
0000000000000000
[ 24.631336] task:packetdrill state:D stack:13952 pid: 200 ppid:
167 flags:0x00004000
[ 24.638212] Call Trace:
[ 24.640504] __schedule+0x3eb/0x680
[ 24.643730] schedule+0x45/0xb0
[ 24.646616] io_schedule+0xd/0x30
[ 24.649646] __lock_page_killable+0x13e/0x280
[ 24.653420] ? file_fdatawait_range+0x20/0x20
[ 24.657351] filemap_fault+0x6b4/0x970
[ 24.660596] ? filemap_map_pages+0x195/0x330
[ 24.664134] __do_fault+0x32/0x90
[ 24.666936] handle_mm_fault+0x8c1/0xe50
[ 24.670502] __get_user_pages+0x25c/0x750
[ 24.674111] populate_vma_page_range+0x57/0x60
[ 24.678106] __mm_populate+0xa9/0x150
[ 24.681434] __x64_sys_mlockall+0x151/0x180
[ 24.685191] do_syscall_64+0x33/0x40
[ 24.688289] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.692754] RIP: 0033:0x7f4ac3f45b3b
[ 24.696017] Code: Bad RIP value.
[ 24.698963] RSP: 002b:00007ffd159771e8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.705620] RAX: ffffffffffffffda RBX: 00005620eb450450 RCX:
00007f4ac3f45b3b
[ 24.711901] RDX: 00007ffd159750c2 RSI: 00007f4ac3fc93c0 RDI:
0000000000000003
[ 24.718195] RBP: 00007ffd159771f0 R08: 0000000000000000 R09:
0000000000000000
[ 24.724468] R10: 00007f4ac3fc8ac0 R11: 0000000000000246 R12:
00005620eb4120a0
[ 24.730689] R13: 00007ffd15977630 R14: 0000000000000000 R15:
0000000000000000
[ 24.736965] task:packetdrill state:D stack:13952 pid: 202 ppid:
174 flags:0x00004000
[ 24.744128] Call Trace:
[ 24.746188] __schedule+0x3eb/0x680
[ 24.749129] schedule+0x45/0xb0
[ 24.751715] io_schedule+0xd/0x30
[ 24.754430] __lock_page_killable+0x13e/0x280
[ 24.758317] ? file_fdatawait_range+0x20/0x20
[ 24.762160] filemap_fault+0x6b4/0x970
[ 24.765582] ? filemap_map_pages+0x195/0x330
[ 24.769418] __do_fault+0x32/0x90
[ 24.772432] handle_mm_fault+0x8c1/0xe50
[ 24.775700] __get_user_pages+0x25c/0x750
[ 24.779051] populate_vma_page_range+0x57/0x60
[ 24.783009] __mm_populate+0xa9/0x150
[ 24.786309] __x64_sys_mlockall+0x151/0x180
[ 24.790052] do_syscall_64+0x33/0x40
[ 24.793285] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.797821] RIP: 0033:0x7f177627db3b
[ 24.801013] Code: Bad RIP value.
[ 24.803752] RSP: 002b:00007ffcd9c784d8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.810149] RAX: ffffffffffffffda RBX: 0000556bbe193450 RCX:
00007f177627db3b
[ 24.816393] RDX: 00007ffcd9c763b2 RSI: 00007f17763013c0 RDI:
0000000000000003
[ 24.822407] RBP: 00007ffcd9c784e0 R08: 0000000000000000 R09:
0000000000000000
[ 24.828686] R10: 00007f1776300ac0 R11: 0000000000000246 R12:
0000556bbe1550a0
[ 24.834924] R13: 00007ffcd9c78920 R14: 0000000000000000 R15:
0000000000000000
[ 24.840714] task:packetdrill state:D stack:13952 pid: 204 ppid:
182 flags:0x00004000
[ 24.847842] Call Trace:
[ 24.850112] __schedule+0x3eb/0x680
[ 24.853316] schedule+0x45/0xb0
[ 24.856185] io_schedule+0xd/0x30
[ 24.859181] __lock_page_killable+0x13e/0x280
[ 24.862675] ? file_fdatawait_range+0x20/0x20
[ 24.866212] filemap_fault+0x6b4/0x970
[ 24.869525] ? filemap_map_pages+0x195/0x330
[ 24.873240] __do_fault+0x32/0x90
[ 24.876250] handle_mm_fault+0x8c1/0xe50
[ 24.879470] __get_user_pages+0x25c/0x750
[ 24.882680] populate_vma_page_range+0x57/0x60
[ 24.894498] __mm_populate+0xa9/0x150
[ 24.897763] __x64_sys_mlockall+0x151/0x180
[ 24.901363] do_syscall_64+0x33/0x40
[ 24.904437] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 24.908958] RIP: 0033:0x7ff1fe4e2b3b
[ 24.911951] Code: Bad RIP value.
[ 24.914865] RSP: 002b:00007ffc28177598 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 24.921310] RAX: ffffffffffffffda RBX: 0000558d9bbe3450 RCX:
00007ff1fe4e2b3b
[ 24.927427] RDX: 00007ffc28175472 RSI: 00007ff1fe5663c0 RDI:
0000000000000003
[ 24.933468] RBP: 00007ffc281775a0 R08: 0000000000000000 R09:
0000000000000000
[ 24.939722] R10: 00007ff1fe565ac0 R11: 0000000000000246 R12:
0000558d9bba50a0
[ 24.945719] R13: 00007ffc281779e0 R14: 0000000000000000 R15:
0000000000000000
[ 24.951908] task:packetdrill state:D stack:13952 pid: 205 ppid:
187 flags:0x00004000
[ 24.958948] Call Trace:
[ 24.961229] __schedule+0x3eb/0x680
[ 24.964212] schedule+0x45/0xb0
[ 24.967086] io_schedule+0xd/0x30
[ 24.970104] __lock_page_killable+0x13e/0x280
[ 24.974001] ? file_fdatawait_range+0x20/0x20
[ 24.977912] filemap_fault+0x6b4/0x970
[ 24.981228] ? filemap_map_pages+0x195/0x330
[ 24.984924] __do_fault+0x32/0x90
[ 24.987934] handle_mm_fault+0x8c1/0xe50
[ 24.991452] __get_user_pages+0x25c/0x750
[ 24.995022] populate_vma_page_range+0x57/0x60
[ 24.998990] __mm_populate+0xa9/0x150
[ 25.002109] __x64_sys_mlockall+0x151/0x180
[ 25.005873] do_syscall_64+0x33/0x40
[ 25.009084] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 25.013550] RIP: 0033:0x7fc6a6880b3b
[ 25.016548] Code: Bad RIP value.
[ 25.019318] RSP: 002b:00007ffc69437db8 EFLAGS: 00000246 ORIG_RAX:
0000000000000097
[ 25.025516] RAX: ffffffffffffffda RBX: 00005572a33dd450 RCX:
00007fc6a6880b3b
[ 25.031678] RDX: 00007ffc69435c92 RSI: 00007fc6a69043c0 RDI:
0000000000000003
[ 25.037940] RBP: 00007ffc69437dc0 R08: 0000000000000000 R09:
0000000000000000
[ 25.044100] R10: 00007fc6a6903ac0 R11: 0000000000000246 R12:
00005572a339f0a0
[ 25.049950] R13: 00007ffc69438200 R14: 0000000000000000 R15:
0000000000000000

> So that sysrq-W state shows that yes, people are stuck waiting for a
> page, but that wasn't exactly unexpected.

Is there anything else I can do to get more info? I guess a core dump
would start to be really interesting here.

Cheers,
Matt
--
Tessares | Belgium | Hybrid Access Solutions
http://www.tessares.net


Attachments:
sysrq_w_1_core_analysed.txt (18.51 kB)