LinuxLists.cc - Re: [PATCH 1/2] [RFC] proc: pagemap: Expose whether a PTE is writable

2024-03-07 11:59:47

Subject: Re: [PATCH 1/2] [RFC] proc: pagemap: Expose whether a PTE is writable

On 07.03.24 12:51, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "David Hildenbrand" <[email protected]>
>>> I'm currently investigating why a real-time application faces unexpected
>>> page faults. Page faults are usually fatal for real-time workloads because
>>> the latency constraints are no longer met.
>>
>> Are you concerned about any type of page fault, or are things like a
>> simple remapping of the same page from "read-only to writable"
>> acceptable? ("very minor fault")
>
> Any page fault has to be avoided.
> To give you more background, the real time application runs on Xenomai,
> a real time extension for Linux.
> Xenomai already applies many tweaks to the kernel to trigger pre-faulting of
> memory areas. But sometimes the application does not use the Xenomai API
> correctly or there is a bug in Xenomai itself.
> Currently I'm suspecting the latter.
>

Thanks for the details!

>>>
>>> So, I wrote a small tool to inspect the memory mappings of a process to find
>>> areas which are not correctly pre-faulted. While doing so I noticed that
>>> there is currently no way to detect CoW mappings.
>>> Exposing the writable property of a PTE seemed like a good start to me.
>>
>> Is it just about "detection" for debugging purposes or about "fixup" in
>> running applications?
>
> It's only about debugging. If an application fails a test I want to have
> a tool which tells me what memory mappings are wonky or could cause a fault
> at runtime.

One destructive way to find out in a writable mapping if the page would
actually get remapped:

a) Read the PFN of a virtual address using pagemap
b) Write to the virtual address using /proc/pid/mem
c) Read the PFN of the same virtual address again using pagemap to see if it changed

If the application can be paused, you could read+write a single byte,
turning it non-destructive.

But that would still "hide" the remap-writable-type faults.
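
Steps a) to c) could be sketched in Python roughly as follows. This is an illustration, not code from this thread: the helper names (decode_entry, pagemap_offset, read_entry, write_would_remap) are made up here, and the bit layout follows the documented pagemap format (bits 0-54 PFN, bit 62 swapped, bit 63 present). It assumes the caller has CAP_SYS_ADMIN (otherwise the kernel reports the PFN as zero) and that the target process is paused while the byte is read and written back.

```python
import os
import struct

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")
PM_PFN_MASK = (1 << 55) - 1  # bits 0-54: PFN, valid only if the page is present
PM_SWAP = 1 << 62            # bit 62: page is swapped out
PM_PRESENT = 1 << 63         # bit 63: page is present in RAM


def decode_entry(entry):
    """Split a 64-bit pagemap entry into (present, swapped, pfn or None)."""
    present = bool(entry & PM_PRESENT)
    swapped = bool(entry & PM_SWAP)
    pfn = (entry & PM_PFN_MASK) if present and not swapped else None
    return present, swapped, pfn


def pagemap_offset(vaddr, page_size=PAGE_SIZE):
    """Byte offset of the 8-byte pagemap entry that covers vaddr."""
    return (vaddr // page_size) * 8


def read_entry(pid, vaddr):
    """Read the raw pagemap entry for vaddr in process pid."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek(pagemap_offset(vaddr))
        data = f.read(8)
    if len(data) != 8:
        raise OSError(f"no pagemap entry for address {vaddr:#x}")
    return struct.unpack("=Q", data)[0]  # native-endian u64


def write_would_remap(pid, vaddr):
    """Steps a) to c): compare the PFN before and after a write via /proc/pid/mem.

    The byte at vaddr is read and written back unchanged, so the data is
    preserved as long as the target does not run in between.
    """
    _, _, pfn_before = decode_entry(read_entry(pid, vaddr))   # a)
    with open(f"/proc/{pid}/mem", "r+b", buffering=0) as mem:  # b)
        mem.seek(vaddr)
        byte = mem.read(1)
        mem.seek(vaddr)
        mem.write(byte)
    _, _, pfn_after = decode_entry(read_entry(pid, vaddr))    # c)
    return pfn_before != pfn_after
```

write_would_remap(pid, addr) returns True when the PFN differs after the write, which suggests the write installed a different page (for example by breaking CoW or faulting in a missing page). It cannot reveal faults that only switch the PTE from read-only to writable, since the PFN stays the same in that case.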

>
> I fully understand that my use case is a corner case and anything but mainline.
> While developing my debug tool I thought that improving the pagemap interface
> might help others too.

I'm fine with this (can be a helpful debugging tool for some other cases
as well, and IIRC we don't have another interface to introspect this),
as long as we properly document the corner case that there could still
be write faults on some architectures when the page has not been
accessed/dirtied yet.

--
Cheers,

David / dhildenb

2024-03-07 12:30:09

by David Hildenbrand

Subject: Re: [PATCH 1/2] [RFC] proc: pagemap: Expose whether a PTE is writable

On 07.03.24 12:59, David Hildenbrand wrote:
> On 07.03.24 12:51, Richard Weinberger wrote:
>> ----- Original Message -----
>>> From: "David Hildenbrand" <[email protected]>
>>>> I'm currently investigating why a real-time application faces unexpected
>>>> page faults. Page faults are usually fatal for real-time workloads because
>>>> the latency constraints are no longer met.
>>>
>>> Are you concerned about any type of page fault, or are things like a
>>> simple remapping of the same page from "read-only to writable"
>>> acceptable? ("very minor fault")
>>
>> Any page fault has to be avoided.
>> To give you more background, the real time application runs on Xenomai,
>> a real time extension for Linux.
>> Xenomai already applies many tweaks to the kernel to trigger pre-faulting of
>> memory areas. But sometimes the application does not use the Xenomai API
>> correctly or there is a bug in Xenomai itself.
>> Currently I'm suspecting the latter.
>>
>
> Thanks for the details!
>
>>>>
>>>> So, I wrote a small tool to inspect the memory mappings of a process to find
>>>> areas which are not correctly pre-faulted. While doing so I noticed that
>>>> there is currently no way to detect CoW mappings.
>>>> Exposing the writable property of a PTE seemed like a good start to me.
>>>
>>> Is it just about "detection" for debugging purposes or about "fixup" in
>>> running applications?
>>
>> It's only about debugging. If an application fails a test I want to have
>> a tool which tells me what memory mappings are wonky or could cause a fault
>> at runtime.
>
> One destructive way to find out in a writable mapping if the page would
> actually get remapped:
>
> a) Read the PFN of a virtual address using pagemap
> b) Write to the virtual address using /proc/pid/mem
> c) Read the PFN of the same virtual address again using pagemap to see if it changed
>
> If the application can be paused, you could read+write a single byte,
> turning it non-destructive.
>
> But that would still "hide" the remap-writable-type faults.
>
>>
>> I fully understand that my use case is a corner case and anything but mainline.
>> While developing my debug tool I thought that improving the pagemap interface
>> might help others too.
>
> I'm fine with this (can be a helpful debugging tool for some other cases
> as well, and IIRC we don't have another interface to introspect this),
> as long as we properly document the corner case that there could still
> be write faults on some architectures when the page has not been
> accessed/dirtied yet.
>

[and I just recalled that there are some other corner cases. For example,
pages in a shadow stack can be pte_write(), but they can only be written
by HW indirectly when modifying the stack, and ordinary write access
would still fault]

--
Cheers,

David / dhildenb

2024-03-07 14:43:11

by Richard Weinberger

Subject: Re: [PATCH 1/2] [RFC] proc: pagemap: Expose whether a PTE is writable

----- Original Message -----
> From: "David Hildenbrand" <[email protected]>
>> One destructive way to find out in a writable mapping if the page would
>> actually get remapped:
>>
>> a) Read the PFN of a virtual address using pagemap
>> b) Write to the virtual address using /proc/pid/mem
>> c) Read the PFN of the same virtual address again using pagemap to see if it changed
>>
>> If the application can be paused, you could read+write a single byte,
>> turning it non-destructive.

I'm not so sure whether this works well if a mapping is device memory or such.

>> But that would still "hide" the remap-writable-type faults.

Xenomai will tell me anyway when there was a page fault while a real time thread
had the CPU.
My idea was to have a tool to check before the application enters the critical phase.
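
A minimal sketch of what such a pre-flight check might look like with today's pagemap interface, assuming a 4 KiB page size by default. It is not the tool discussed in this thread, and the function names are hypothetical. It walks /proc/pid/maps, and for each writable mapping counts pages whose pagemap entry lacks the present bit (bit 63). It reads one pagemap chunk per mapping, so very large sparse mappings would need chunked reads in practice.

```python
import struct

PM_PRESENT = 1 << 63  # bit 63 of a pagemap entry: page is present in RAM


def parse_maps_line(line):
    """Return (start, end, perms) from one line of /proc/<pid>/maps."""
    fields = line.split()
    start_s, end_s = fields[0].split("-")
    return int(start_s, 16), int(end_s, 16), fields[1]


def count_not_present(entries):
    """Count pagemap entries whose present bit is clear."""
    return sum(1 for e in entries if not (e & PM_PRESENT))


def unfaulted_writable_ranges(pid, page_size=4096):
    """List (start, end, missing_pages) for writable mappings with non-present pages."""
    results = []
    with open(f"/proc/{pid}/maps") as maps, \
            open(f"/proc/{pid}/pagemap", "rb") as pagemap:
        for line in maps:
            start, end, perms = parse_maps_line(line)
            if "w" not in perms:
                continue  # only writable mappings are of interest here
            npages = (end - start) // page_size
            pagemap.seek((start // page_size) * 8)
            data = pagemap.read(npages * 8)
            count = len(data) // 8
            entries = struct.unpack(f"={count}Q", data[:count * 8])
            missing = count_not_present(entries)
            if missing:
                results.append((start, end, missing))
    return results
```

A page that is present but mapped read-only (for example a CoW page) is counted as fine by this check, which is the gap that exposing the PTE writable state is meant to close.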

>>> I fully understand that my use case is a corner case and anything but mainline.
>>> While developing my debug tool I thought that improving the pagemap interface
>>> might help others too.
>>
>> I'm fine with this (can be a helpful debugging tool for some other cases
>> as well, and IIRC we don't have another interface to introspect this),
>> as long as we properly document the corner case that there could still
>> be write faults on some architectures when the page has not been
>> accessed/dirtied yet.

Cool. :)

>
> [and I just recalled that there are some other corner cases. For example,
> pages in a shadow stack can be pte_write(), but they can only be written
> by HW indirectly when modifying the stack, and ordinary write access
> would still fault]

Yeah, I noticed this while browsing through various pte_write() implementations.
That's a tradeoff I can live with.

Thanks,
//richard