MIME-Version: 1.0
In-Reply-To: <20150724081326.GO25159@twins.programming.kicks-ass.net>
References: <CALCETrUf9s-o-ETMiSxxjMGxVeH7di4O9vTi0Oe7wS-RCiVXLA@mail.gmail.com>
 <CA+55aFwR8mHw=wm+Uecy0ERgrD7WbijBn9kj_ZAd47L4GyG5Xw@mail.gmail.com>
 <CALCETrVAzhE7w3BDjqRack54BLncZALbnAOZyeXHx1cSTryy4g@mail.gmail.com>
 <CA+55aFyxs8Q5WrjN9o4Zmfd_4+muLkcoO8cXyv5Nt+Pf8c0TBQ@mail.gmail.com>
 <20150723173105.6795c0dc@gandalf.local.home> <CA+55aFy0-rj7hp3zOUAZD5y5Zp=v6Cu3TG0SHB-buj3oYTJcZg@mail.gmail.com>
 <CALCETrWMgsrgEYWpzPFapOj+-SvZfadDAZ7SH7O8bFsR2b6F1Q@mail.gmail.com>
 <CA+55aFzma9NgODkzz08zpEKSWVnwxuCvwPt_JnO8HaHwRnBPdQ@mail.gmail.com> <20150724081326.GO25159@twins.programming.kicks-ass.net>
From: Andy Lutomirski <luto@amacapital.net>
Date: Fri, 24 Jul 2015 08:48:57 -0700
Message-ID: <CALCETrWjzU79ASDK+0RJQyCy6qTdM3FPTa4ZM0d5sVW66yhcug@mail.gmail.com>
Subject: Re: Dealing with the NMI mess
To: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
        Steven Rostedt <rostedt@goodmis.org>, X86 ML <x86@kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Willy Tarreau <w@1wt.eu>, Borislav Petkov <bp@alien8.de>,
        Thomas Gleixner <tglx@linutronix.de>, Brian Gerst <brgerst@gmail.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2527
Lines: 63

On Fri, Jul 24, 2015 at 1:13 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Jul 23, 2015 at 02:59:56PM -0700, Linus Torvalds wrote:
>> Hmmm. I thought watchpoints were "before the instruction" too, but
>> that's just because I haven't used them in ages, and I didn't remember
>> the details. I just looked it up.
>>
>> You're right - the memory watchpoints trigger after the instruction
>> has executed, so RF isn't an issue. So yes, the only issue is
>> instruction breakpoints, and those are the only ones we need to clear.
>>
>> And that makes it really easy.
>>
>> So yes, I agree. We only need to clear all kernel breakpoints.
>
> But but but, we can access userspace with !IF, imagine someone doing:
>
>   local_irq_disable();
>   copy_from_user_inatomic();
>
> and as luck would have it, there's a breakpoint on the user memory we
> just touched. And we go and disable a user breakpoint.
>

The Intel SDM says:

17.3.1.2 Data Memory and I/O Breakpoint Exception Conditions

Data memory and I/O breakpoints are reported when the processor attempts
to access a memory or I/O address specified in a breakpoint-address
register (DR0 through DR3) that has been set up to detect data or I/O
accesses (R/W flag is set to 1, 2, or 3). The processor generates the
exception after it executes the instruction that made the access, so
these breakpoint condition causes a trap-class exception to be
generated.

So by the time we detect that we've hit a watchpoint, the instruction
that tripped it is done and we don't need RF.  Furthermore, after
reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if
we hit a watchpoint.  So this might be as simple as:

if ((dr6 & (0xf * DR_TRAP0)) &&
    (regs->flags & (X86_EFLAGS_RF | X86_EFLAGS_IF)) == X86_EFLAGS_RF &&
    !user_mode(regs)) {
  for (i = 0; i < 4; i++)
    if (dr6 & (DR_TRAP0<<i)) {
      /* hit a kernel breakpoint with IF clear */
      dr7 &= ~(DR_GLOBAL_ENABLE << (i * DR_ENABLE_SIZE));
    }
}
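
To make the DR7 arithmetic in that loop concrete, here is a standalone
userspace sketch of just the masking step.  The constant values are
copied from asm/debugreg.h as far as I recall them (two enable bits per
breakpoint, which the header calls DR_ENABLE_SIZE), so treat them as
assumptions; clear_hit_global_enables() is a made-up name for
illustration, not an existing kernel function.

```c
#include <stdint.h>

/* Assumed values, believed to match asm/debugreg.h; double-check them. */
#define DR_TRAP0         0x1u  /* DR6: breakpoint 0 was hit */
#define DR_GLOBAL_ENABLE 0x2u  /* DR7: global enable bit for breakpoint 0 */
#define DR_ENABLE_SIZE   2     /* DR7: two enable bits per breakpoint */

/*
 * Hypothetical helper: return dr7 with the global enable bit cleared
 * for every breakpoint that dr6 reports as hit.  Local enable bits and
 * the rest of dr7 are left alone.
 */
static uint32_t clear_hit_global_enables(uint32_t dr6, uint32_t dr7)
{
    for (int i = 0; i < 4; i++)
        if (dr6 & (DR_TRAP0 << i))
            dr7 &= ~(DR_GLOBAL_ENABLE << (i * DR_ENABLE_SIZE));
    return dr7;
}
```

For example, with breakpoints 0 and 2 globally enabled (dr7 == 0x22)
and only breakpoint 2 reported in dr6, the result keeps breakpoint 0
enabled and drops breakpoint 2.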

I'm not saying that your code is wrong, but I think this is simpler
and avoids poking at yet more per-cpu state from NMI context, which is
kind of nice.

If you don't like the RF games above, it would also be straightforward
to parse dr0..dr3 for each DR_TRAP bit that's set and see if it's a
breakpoint.
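
A rough sketch of that alternative, written as standalone C so it can
be tried outside the kernel.  As far as I can tell from the SDM, the
type of each breakpoint is encoded in the R/W field of DR7 (0 means
instruction execution), so that is what this decodes.  The shift and
size values are what I believe asm/debugreg.h uses; DR_RW_MASK and both
function names are invented here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, believed to match asm/debugreg.h; double-check them. */
#define DR_TRAP0         0x1u  /* DR6: breakpoint 0 was hit */
#define DR_CONTROL_SHIFT 16    /* DR7: first R/W+LEN field starts here */
#define DR_CONTROL_SIZE  4     /* DR7: four control bits per breakpoint */
#define DR_RW_EXECUTE    0x0u  /* R/W field value: instruction execution */
#define DR_RW_MASK       0x3u  /* hypothetical name: low two bits are R/W */

/* True if breakpoint i is configured as an instruction breakpoint. */
static bool is_insn_breakpoint(uint32_t dr7, int i)
{
    uint32_t rw = (dr7 >> (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE))
                  & DR_RW_MASK;
    return rw == DR_RW_EXECUTE;
}

/*
 * Return a 4-bit mask of the breakpoints that dr6 reports as hit and
 * that dr7 describes as instruction breakpoints.
 */
static unsigned int hit_insn_breakpoints(uint32_t dr6, uint32_t dr7)
{
    unsigned int mask = 0;

    for (int i = 0; i < 4; i++)
        if ((dr6 & (DR_TRAP0 << i)) && is_insn_breakpoint(dr7, i))
            mask |= 1u << i;
    return mask;
}
```

This avoids relying on RF in regs->flags at the cost of a little more
decoding; the caller would still check IF and !user_mode(regs) before
disabling anything.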

--Andy