2021-01-28 15:10:02

by Matthew Wilcox

Subject: Re: [PATCH V5] x86/mm: Tracking linear mapping split events

On Thu, Jan 28, 2021 at 02:49:34AM -0800, Saravanan D wrote:
> One of the many lasting (as we don't coalesce back) sources for huge page
> splits is tracing as the granular page attribute/permission changes would
> force the kernel to split code segments mapped to huge pages to smaller
> ones thereby increasing the probability of TLB miss/reload even after
> tracing has been stopped.

You didn't answer my question.

Is this tracing of userspace programs causing splits, or is it kernel
tracing? Also, we have lots of kinds of tracing these days; are you
referring to kprobes? tracepoints? ftrace? Something else?


2021-01-28 21:04:19

by Saravanan D

Subject: Re: [PATCH V5] x86/mm: Tracking linear mapping split events

Hi Matthew,

> Is this tracing of userspace programs causing splits, or is it kernel
> tracing? Also, we have lots of kinds of tracing these days; are you
> referring to kprobes? tracepoints? ftrace? Something else?

It has to be kernel tracing (kprobes, tracepoints) as we are dealing with
direct mapping splits.

Kernel's direct mapping:
``ffff888000000000 | -119.5 TB | ffffc87fffffffff | 64 TB | direct mapping of all physical memory (page_offset_base)``

The kernel text range:
``ffffffff80000000 | -2 GB | ffffffff9fffffff | 512 MB | kernel text mapping, mapped to physical address 0``

Source : Documentation/x86/x86_64/mm.rst

The kernel code segment points to the same physical addresses that are
already mapped in the direct mapping range (0x20000000 = 512 MB).
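
As a quick check of the sizes quoted above, the following sketch recomputes
both ranges from the mm.rst rows and shows how a kernel text address would
alias into the direct mapping. It assumes the default layout in those rows
(no KASLR, kernel text mapped from physical address 0); range_size and
text_to_direct_map are hypothetical helper names, not kernel functions.

```python
# Range bounds copied from the two mm.rst rows quoted above.
DIRECT_MAP_START = 0xffff888000000000   # page_offset_base without KASLR (assumption)
DIRECT_MAP_END = 0xffffc87fffffffff
KERNEL_TEXT_START = 0xffffffff80000000
KERNEL_TEXT_END = 0xffffffff9fffffff


def range_size(start, end):
    """Size in bytes of an inclusive [start, end] virtual address range."""
    return end - start + 1


def text_to_direct_map(text_vaddr):
    """Hypothetical helper: direct-map alias of a kernel text address.

    Assumes the text mapping starts at physical address 0, as the mm.rst
    row states, so the physical address is just the offset into the range.
    """
    if not KERNEL_TEXT_START <= text_vaddr <= KERNEL_TEXT_END:
        raise ValueError("address is outside the kernel text mapping")
    phys = text_vaddr - KERNEL_TEXT_START
    return DIRECT_MAP_START + phys


print(hex(range_size(KERNEL_TEXT_START, KERNEL_TEXT_END)))   # kernel text size
print(range_size(DIRECT_MAP_START, DIRECT_MAP_END) >> 40)    # direct map size in TB
print(hex(text_to_direct_map(0xffffffff81000000)))           # alias of a text address
```

Under these assumptions, a permission change on a text page is also a
change to the page at the aliased direct-map address.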

When we enable kernel tracing, we have to modify the attributes/permissions
of the direct-mapped text segment pages, which causes them to split.

When we track direct_pages_count[] in arch/x86/mm/pat/set_memory.c,
there are only splits from higher levels. They never coalesce back.

Splits when we turn on dynamic tracing
....
cat /proc/vmstat | grep -i direct_map_level
direct_map_level2_splits 784
direct_map_level3_splits 12
bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @ [pid, comm] = count(); }'
cat /proc/vmstat | grep -i direct_map_level
direct_map_level2_splits 789
direct_map_level3_splits 12
....
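
The before/after comparison above can be automated. The sketch below parses
two /proc/vmstat snapshots and reports the per-counter increase. It assumes
the counter names shown in the output above; parse_split_counters and
split_delta are hypothetical helper names, and the snapshot text is copied
from this mail rather than read from a live system.

```python
def parse_split_counters(vmstat_text):
    """Extract the direct_map_level*_splits counters from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith("direct_map_level"):
            counters[fields[0]] = int(fields[1])
    return counters


def split_delta(before, after):
    """Per-counter increase between two snapshots (missing counters count as 0)."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Snapshots taken from the output quoted above.
before = parse_split_counters(
    "direct_map_level2_splits 784\ndirect_map_level3_splits 12\n"
)
after = parse_split_counters(
    "direct_map_level2_splits 789\ndirect_map_level3_splits 12\n"
)

print(split_delta(before, after))
```

On a kernel carrying this patch, the snapshot text could instead come from
reading /proc/vmstat before and after starting the tracer. For the numbers
above, the bpftrace one-liner accounts for five additional level-2 splits
and no additional level-3 splits.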

Thanks,
Saravanan D