On 2/23/24 12:39, John Groves wrote:
>> We had similar unit test regression concerns with fsdax where some
>> upstream change silently broke PMD faults. The solution there was trace
>> points in the fault handlers and a basic test that knows a priori that it
>> *should* be triggering a certain number of huge faults:
>>
>> https://github.com/pmem/ndctl/blob/main/test/dax.sh#L31
> Good approach, thanks Dan! My working assumption is that we'll be able to make
> that approach work in the famfs tests. So the fault counters should go away
> in the next version.
I do really suspect there's something more generic that should be done
here. Maybe we need a generic 'huge_faults' perf event to pair up with
the good ol' faults that we already have:
# perf stat -e faults /bin/ls

 Performance counter stats for '/bin/ls':

               104      faults

       0.001499862 seconds time elapsed

       0.001490000 seconds user
       0.000000000 seconds sys
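For context, perf's 'faults' alias maps to the PERF_COUNT_SW_PAGE_FAULTS software event, and mm_account_fault() in mm/memory.c already emits the related major/minor fault events. The fragment below is a hypothetical sketch of how a paired huge-fault event could be emitted at the same spot; PERF_COUNT_SW_HUGE_FAULTS and the predicate are assumptions, not existing kernel code, and the snippet is illustrative rather than buildable on its own.

	/*
	 * Hypothetical sketch, not upstream code.  mm_account_fault() already
	 * does, for example:
	 *
	 *	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
	 *
	 * A paired 'huge_faults' software event could be one more call at the
	 * same spot, gated on the fault handler reporting that it installed
	 * something larger than a base page (how it would report that is
	 * exactly the open question discussed below).
	 */
	if (fault_installed_huge_mapping(ret))	/* placeholder predicate */
		perf_sw_event(PERF_COUNT_SW_HUGE_FAULTS, 1, regs, address);

Userspace usage would then presumably mirror the example above, something like perf stat -e faults,huge_faults, assuming the event were exposed under that name.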
Dave Hansen wrote:
> I do really suspect there's something more generic that should be done
> here. Maybe we need a generic 'huge_faults' perf event to pair up with
> the good ol' faults that we already have:
> [...]
Certainly something like that would have satisfied this sanity test use
case. I will note that mm_account_fault() would need some help to figure
out the size of the page table entry that got installed. Maybe
extensions to vm_fault_reason to add VM_FAULT_P*D? That complements
VM_FAULT_FALLBACK to indicate whether, for example, the fallback went
from PUD to PMD, or all the way back to PTE.
Then use cases like this could just add a dynamic probe in
mm_account_fault(). No real need for a new tracepoint unless there was a
use case for this outside of regression testing fault handlers, right?
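A minimal sketch of what those VM_FAULT_P*D bits could look like, flattened into standalone C for illustration: VM_FAULT_PMD and VM_FAULT_PUD are assumed names and values (only VM_FAULT_FALLBACK exists upstream, in enum vm_fault_reason), the vm_fault_t typedef is a stand-in for the kernel's, and a real change would live in include/linux/mm_types.h.

/*
 * Illustration only: a userspace-compilable stand-in for having huge fault
 * handlers report the granularity they actually installed.  VM_FAULT_PMD
 * and VM_FAULT_PUD are assumed names and bit values, not upstream flags.
 */
typedef unsigned int vm_fault_t;		/* stand-in for the kernel's type */

#define VM_FAULT_FALLBACK	0x000800	/* existing flag: handler fell back */
#define VM_FAULT_PMD		0x100000	/* assumed: installed a PMD-sized entry */
#define VM_FAULT_PUD		0x200000	/* assumed: installed a PUD-sized entry */

/* How mm_account_fault()-style code could classify a completed fault. */
static inline const char *fault_granularity(vm_fault_t ret)
{
	if (ret & VM_FAULT_PUD)
		return "pud";
	if (ret & VM_FAULT_PMD)
		return "pmd";
	if (ret & VM_FAULT_FALLBACK)
		return "fell back to something smaller";
	return "pte";
}

With bits like these in the fault return value, the dynamic-probe route amounts to probing mm_account_fault() and filtering on its ret argument, with no dedicated tracepoint required.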
On Fri, Feb 23, 2024 at 03:50:33PM -0800, Dan Williams wrote:
> Certainly something like that would have satisfied this sanity test use
> case. I will note that mm_account_fault() would need some help to figure
> out the size of the page table entry that got installed. Maybe
> extensions to vm_fault_reason to add VM_FAULT_P*D? That complements
> VM_FAULT_FALLBACK to indicate whether, for example, the fallback went
> from PUD to PMD, or all the way back to PTE.
Ugh, no, it's more complicated than that. Look at the recent changes to
set_ptes(). We can now install PTEs of many different sizes, depending
on the architecture. Someday I look forward to supporting all the page
sizes on parisc (4k, 16k, 64k, 256k, ..., 4G).
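For reference, set_ptes() takes an explicit entry count, roughly set_ptes(mm, addr, ptep, pte, nr), so a single fault may now map anywhere from one base page up to a PUD's worth, in architecture-dependent steps. The small userspace illustration below treats the granularity as an order rather than a fixed PTE/PMD/PUD level; the 4K base page size and the example counts are assumptions chosen for the arithmetic.

/*
 * Userspace illustration only (assumes 4K base pages): once a fault can
 * install an arbitrary number of PTEs via set_ptes(), "fault size" is an
 * order, not a PTE/PMD/PUD trichotomy.
 */
#include <stdio.h>

static unsigned int map_order(unsigned long nr_base_pages)
{
	unsigned int order = 0;

	while ((1UL << order) < nr_base_pages)
		order++;
	return order;
}

int main(void)
{
	/* 4K PTE, 64K contpte batch, 2M PMD, 1G PUD (x86-64/arm64 sizes) */
	unsigned long nr[] = { 1, 16, 512, 262144 };

	for (unsigned int i = 0; i < sizeof(nr) / sizeof(nr[0]); i++)
		printf("%8lu base pages -> order %2u (%lu KiB)\n",
		       nr[i], map_order(nr[i]), nr[i] * 4);
	return 0;
}

On an architecture with a page-size menu like parisc's, the set of possible orders changes again, which is the point: size-aware fault accounting probably has to carry the mapped order rather than a couple of fixed flags.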