Subject: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687
Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
Author: Karol Herbst <[email protected]>
AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 11 Dec 2017 15:35:18 +0100

x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

If something calls ioremap() with an address not aligned to PAGE_SIZE, the
returned address might be not aligned as well. This led to a probe
registered on exactly the returned address, but the entire page was armed
for mmiotracing.

On calling iounmap() the address passed to unregister_kmmio_probe() was
PAGE_SIZE aligned by the caller leading to a complete freeze of the
machine.

We should always page align addresses while (un)registerung mappings,
because the mmiotracer works on top of pages, not mappings. We still keep
track of the probes based on their real addresses and lengths though,
because the mmiotrace still needs to know what are mapped memory regions.

Also move the call to mmiotrace_iounmap() prior page aligning the address,
so that all probes are unregistered properly, otherwise the kernel ends up
failing memory allocations randomly after disabling the mmiotracer.

Tested-by: Lyude <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
Acked-by: Pekka Paalanen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/ioremap.c | 4 ++--
arch/x86/mm/kmmio.c | 12 +++++++-----
2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 6e4573b..c45b6ec 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -404,11 +404,11 @@ void iounmap(volatile void __iomem *addr)
return;
}

+ mmiotrace_iounmap(addr);
+
addr = (volatile void __iomem *)
(PAGE_MASK & (unsigned long __force)addr);

- mmiotrace_iounmap(addr);
-
/* Use the vm area unlocked, assuming the caller
ensures there isn't another iounmap for the same address
in parallel. Reuse of the virtual address is prevented by
diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c
index c21c2ed..58477ec 100644
--- a/arch/x86/mm/kmmio.c
+++ b/arch/x86/mm/kmmio.c
@@ -435,17 +435,18 @@ int register_kmmio_probe(struct kmmio_probe *p)
unsigned long flags;
int ret = 0;
unsigned long size = 0;
+ unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
unsigned int l;
pte_t *pte;

spin_lock_irqsave(&kmmio_lock, flags);
- if (get_kmmio_probe(p->addr)) {
+ if (get_kmmio_probe(addr)) {
ret = -EEXIST;
goto out;
}

- pte = lookup_address(p->addr, &l);
+ pte = lookup_address(addr, &l);
if (!pte) {
ret = -EINVAL;
goto out;
@@ -454,7 +455,7 @@ int register_kmmio_probe(struct kmmio_probe *p)
kmmio_count++;
list_add_rcu(&p->list, &kmmio_probes);
while (size < size_lim) {
- if (add_kmmio_fault_page(p->addr + size))
+ if (add_kmmio_fault_page(addr + size))
pr_err("Unable to set page fault.\n");
size += page_level_size(l);
}
@@ -528,19 +529,20 @@ void unregister_kmmio_probe(struct kmmio_probe *p)
{
unsigned long flags;
unsigned long size = 0;
+ unsigned long addr = p->addr & PAGE_MASK;
const unsigned long size_lim = p->len + (p->addr & ~PAGE_MASK);
struct kmmio_fault_page *release_list = NULL;
struct kmmio_delayed_release *drelease;
unsigned int l;
pte_t *pte;

- pte = lookup_address(p->addr, &l);
+ pte = lookup_address(addr, &l);
if (!pte)
return;

spin_lock_irqsave(&kmmio_lock, flags);
while (size < size_lim) {
- release_kmmio_fault_page(p->addr + size, &release_list);
+ release_kmmio_fault_page(addr + size, &release_list);
size += page_level_size(l);
}
list_del_rcu(&p->list);


2017-12-12 13:50:25

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
> Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687
> Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
> Author: Karol Herbst <[email protected]>
> AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
>
> x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

OK, let me hijack this thread since apparently people use and care about
mmiotrace.

I was recently auditing the x86 tlb flushing and ran across this
'thing'. Can someone please explain to me how this is supposed to work
and how its not completely broken?

2017-12-12 14:04:30

by Ingo Molnar

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses


* Peter Zijlstra <[email protected]> wrote:

> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
> > Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687
> > Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
> > Author: Karol Herbst <[email protected]>
> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
> > Committer: Ingo Molnar <[email protected]>
> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
> >
> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
>
> OK, let me hijack this thread since apparently people use and care about
> mmiotrace.
>
> I was recently auditing the x86 tlb flushing and ran across this
> 'thing'. Can someone please explain to me how this is supposed to work
> and how its not completely broken?

(I have Cc:-ed other gents as well.)

Thanks,

Ingo

2017-12-12 14:21:15

by Ilia Mirkin

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, Dec 12, 2017 at 9:04 AM, Ingo Molnar <[email protected]> wrote:
>
> * Peter Zijlstra <[email protected]> wrote:
>
>> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
>> > Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687
>> > Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
>> > Author: Karol Herbst <[email protected]>
>> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
>> > Committer: Ingo Molnar <[email protected]>
>> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
>> >
>> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
>>
>> OK, let me hijack this thread since apparently people use and care about
>> mmiotrace.
>>
>> I was recently auditing the x86 tlb flushing and ran across this
>> 'thing'. Can someone please explain to me how this is supposed to work
>> and how its not completely broken?

The "thing" being mmiotrace, or the "thing" being page-unaligned addresses?

If the former, its primary use-case is for snooping on the NVIDIA
proprietary GPU driver in order to figure out how to drive the
underlying hardware. The driver does ioremap's to get at PCI space,
which mmiotrace "steals" and provides pages without a present bit set,
along with a fault handler. When the fault handler is hit, it
reinstates the faulting page, and single-steps the faulting
instruction reading the before/after regs to determine what happened
(doesn't work universally, but enough for instructions used for PCI
MMIO accesses). See mmio-mod.c::pre and post (the latter is called
from the debug handler).

You may be interested in reading
Documentation/trace/mmiotrace.txt::How Mmiotrace Works

Cheers,

-ilia

2017-12-12 14:32:46

by Karol Herbst

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

Hi Peter,

the basic idea is to detect if a driver accesses a memory region
mapped through ioremap. This is super usefull for reverse engineering
closed source drivers like the Nvidia GPU driver.

So here is what it does:
1. on ioremap the entire memory region mapped is registered in the
mmiotracer and marked as not presen, which basically leads to page
faults on acces
2. mmiotrace is the registered page fault handler for those pages and
while handling the page (which basically means marking them as presen,
because they were never missing in the first place) it parses the
current instruction to detect if it was a read or write and writes
relevant information into a file. This includes address accessed,
value read/written, type of instruction
3. after single stepping, the page is marked as not present again
4. on unmap time, mmiotrace unregisteres those regions and marks them as present

this is more or less the basic idea.

And to answer your question how it is not completely broken: I don't
know. It works for us (more or less, we can't parse repeat
instructions as one example what does not work) and if we come across
issues we try to fix them on the way.

Anyway, this is a super useful tool to record and debug what a driver
is doing with hardware and helps tracking down a lot of this,
especially for Nouveau.

I hope that helps.

On Tue, Dec 12, 2017 at 3:04 PM, Ingo Molnar <[email protected]> wrote:
>
> * Peter Zijlstra <[email protected]> wrote:
>
>> On Tue, Dec 12, 2017 at 02:55:30AM -0800, tip-bot for Karol Herbst wrote:
>> > Commit-ID: 6d60ce384d1d5ca32b595244db4077a419acc687
>> > Gitweb: https://git.kernel.org/tip/6d60ce384d1d5ca32b595244db4077a419acc687
>> > Author: Karol Herbst <[email protected]>
>> > AuthorDate: Mon, 27 Nov 2017 08:51:39 +0100
>> > Committer: Ingo Molnar <[email protected]>
>> > CommitDate: Mon, 11 Dec 2017 15:35:18 +0100
>> >
>> > x86/mm/kmmio: Fix mmiotrace for page unaligned addresses
>>
>> OK, let me hijack this thread since apparently people use and care about
>> mmiotrace.
>>
>> I was recently auditing the x86 tlb flushing and ran across this
>> 'thing'. Can someone please explain to me how this is supposed to work
>> and how its not completely broken?
>
> (I have Cc:-ed other gents as well.)
>
> Thanks,
>
> Ingo

2017-12-12 14:43:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, Dec 12, 2017 at 09:21:10AM -0500, Ilia Mirkin wrote:
> The "thing" being mmiotrace, or the "thing" being page-unaligned addresses?

mmiotrace

> If the former, its primary use-case is for snooping on the NVIDIA
> proprietary GPU driver in order to figure out how to drive the
> underlying hardware. The driver does ioremap's to get at PCI space,
> which mmiotrace "steals" and provides pages without a present bit set,
> along with a fault handler. When the fault handler is hit, it
> reinstates the faulting page, and single-steps the faulting
> instruction

At which point you have valid page-tables and another CPU can access
that page too.

> reading the before/after regs to determine what happened
> (doesn't work universally, but enough for instructions used for PCI
> MMIO accesses). See mmio-mod.c::pre and post (the latter is called
> from the debug handler).

And after that you only invalidate the TLBs for the CPU that took the
initial fault, leaving possibly stale TLBs on other CPUs.


So this 'thing' has huge gaping SMP holes in.

2017-12-12 14:47:10

by Ilia Mirkin

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, Dec 12, 2017 at 9:43 AM, Peter Zijlstra <[email protected]> wrote:
> On Tue, Dec 12, 2017 at 09:21:10AM -0500, Ilia Mirkin wrote:
>> The "thing" being mmiotrace, or the "thing" being page-unaligned addresses?
>
> mmiotrace
>
>> If the former, its primary use-case is for snooping on the NVIDIA
>> proprietary GPU driver in order to figure out how to drive the
>> underlying hardware. The driver does ioremap's to get at PCI space,
>> which mmiotrace "steals" and provides pages without a present bit set,
>> along with a fault handler. When the fault handler is hit, it
>> reinstates the faulting page, and single-steps the faulting
>> instruction
>
> At which point you have valid page-tables and another CPU can access
> that page too.
>
>> reading the before/after regs to determine what happened
>> (doesn't work universally, but enough for instructions used for PCI
>> MMIO accesses). See mmio-mod.c::pre and post (the latter is called
>> from the debug handler).
>
> And after that you only invalidate the TLBs for the CPU that took the
> initial fault, leaving possibly stale TLBs on other CPUs.
>
>
> So this 'thing' has huge gaping SMP holes in.

Sure does! Probably why the following happens when mmiotrace is enabled:

void enable_mmiotrace(void)
{
mutex_lock(&mmiotrace_mutex);
if (is_enabled())
goto out;

if (nommiotrace)
pr_info("MMIO tracing disabled.\n");
kmmio_init();
enter_uniprocessor();
spin_lock_irq(&trace_lock);
atomic_inc(&mmiotrace_enabled);
spin_unlock_irq(&trace_lock);
pr_info("enabled.\n");
out:
mutex_unlock(&mmiotrace_mutex);
}

2017-12-12 14:51:21

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, Dec 12, 2017 at 09:47:05AM -0500, Ilia Mirkin wrote:
> > So this 'thing' has huge gaping SMP holes in.
>
> Sure does! Probably why the following happens when mmiotrace is enabled:
>
> void enable_mmiotrace(void)
> {
> mutex_lock(&mmiotrace_mutex);
> if (is_enabled())
> goto out;
>
> if (nommiotrace)
> pr_info("MMIO tracing disabled.\n");
> kmmio_init();
> enter_uniprocessor();

^^^^^

Ah! I completely missed that. OK, that makes it much less broken :-)

If I don't forget, I'll add some comments to this file to clarify that.

Thanks!

> spin_lock_irq(&trace_lock);
> atomic_inc(&mmiotrace_enabled);
> spin_unlock_irq(&trace_lock);
> pr_info("enabled.\n");
> out:
> mutex_unlock(&mmiotrace_mutex);
> }

2017-12-13 16:31:42

by Steven Rostedt

[permalink] [raw]
Subject: Re: [tip:x86/urgent] x86/mm/kmmio: Fix mmiotrace for page unaligned addresses

On Tue, 12 Dec 2017 15:51:05 +0100
Peter Zijlstra <[email protected]> wrote:

> On Tue, Dec 12, 2017 at 09:47:05AM -0500, Ilia Mirkin wrote:
> > > So this 'thing' has huge gaping SMP holes in.
> >
> > Sure does! Probably why the following happens when mmiotrace is enabled:
> >
> > void enable_mmiotrace(void)
> > {
> > mutex_lock(&mmiotrace_mutex);
> > if (is_enabled())
> > goto out;
> >
> > if (nommiotrace)
> > pr_info("MMIO tracing disabled.\n");
> > kmmio_init();
> > enter_uniprocessor();
>
> ^^^^^
>
> Ah! I completely missed that. OK, that makes it much less broken :-)

/me just saw this thread.

Once I saw your initial complaint, my first thought was "I think he
doesn't know that it causes the system to turn into a uniprocessor
first".

My ftrace tests have a test that switches to each tracer, and this
mmiotrace catches the most bugs after an rc1 release. The bugs are
triggered by switching to and from uniprocessor mode. Hopefully, with
the new hotplug code, that will be a thing of the past.

>
> If I don't forget, I'll add some comments to this file to clarify that.

Great, thanks!

-- Steve

>