MIME-Version: 1.0
In-Reply-To: <20150724171018.GH3612@1wt.eu>
References: <CALCETrUf9s-o-ETMiSxxjMGxVeH7di4O9vTi0Oe7wS-RCiVXLA@mail.gmail.com>
 <CA+55aFwR8mHw=wm+Uecy0ERgrD7WbijBn9kj_ZAd47L4GyG5Xw@mail.gmail.com>
 <CALCETrVAzhE7w3BDjqRack54BLncZALbnAOZyeXHx1cSTryy4g@mail.gmail.com>
 <CA+55aFyxs8Q5WrjN9o4Zmfd_4+muLkcoO8cXyv5Nt+Pf8c0TBQ@mail.gmail.com>
 <20150723173105.6795c0dc@gandalf.local.home> <CA+55aFy0-rj7hp3zOUAZD5y5Zp=v6Cu3TG0SHB-buj3oYTJcZg@mail.gmail.com>
 <CALCETrWMgsrgEYWpzPFapOj+-SvZfadDAZ7SH7O8bFsR2b6F1Q@mail.gmail.com>
 <CA+55aFzma9NgODkzz08zpEKSWVnwxuCvwPt_JnO8HaHwRnBPdQ@mail.gmail.com>
 <20150724081326.GO25159@twins.programming.kicks-ass.net> <CALCETrWjzU79ASDK+0RJQyCy6qTdM3FPTa4ZM0d5sVW66yhcug@mail.gmail.com>
 <20150724171018.GH3612@1wt.eu>
From: Andy Lutomirski <luto@amacapital.net>
Date: Fri, 24 Jul 2015 10:20:03 -0700
Message-ID: <CALCETrWq0KoBerS5OjoYZvfGNfwHYCtzzNUDuCH=84XQvEoRug@mail.gmail.com>
Subject: Re: Dealing with the NMI mess
To: Willy Tarreau <w@1wt.eu>
Cc: Peter Zijlstra <peterz@infradead.org>,
        Linus Torvalds <torvalds@linux-foundation.org>,
        Steven Rostedt <rostedt@goodmis.org>, X86 ML <x86@kernel.org>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Borislav Petkov <bp@alien8.de>, Thomas Gleixner <tglx@linutronix.de>,
        Brian Gerst <brgerst@gmail.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2040
Lines: 44

On Fri, Jul 24, 2015 at 10:10 AM, Willy Tarreau <w@1wt.eu> wrote:
> On Fri, Jul 24, 2015 at 08:48:57AM -0700, Andy Lutomirski wrote:
>> So by the time we detect that we've hit a watchpoint, the instruction
>> that tripped it is done and we don't need RF.  Furthermore, after
>> reading 17.3.1.1: I *think* that regs->flags will have RF *clear* if
>> we hit a watchpoint.  So this might be as simple as:
>>
>> if ((dr6 & (0xf * DR_TRAP0)) &&
>>     (regs->flags & (X86_EFLAGS_RF | X86_EFLAGS_IF)) == X86_EFLAGS_RF &&
>>     !user_mode(regs))
>>   for (i = 0; i < 4; i++)
>>     if (dr6 & (DR_TRAP0<<i)) {
>>       /* hit a kernel breakpoint with IF clear */
>>       dr7 &= ~(DR_GLOBAL_ENABLE << (i * DR_ENABLE_SHIFT));
>>     }
>>
>> I'm not saying that your code is wrong, but I think this is simpler
>> and avoids poking at yet more per-cpu state from NMI context, which is
>> kind of nice.
>>
>> If you don't like the RF games above, it would also be straightforward
>> to parse dr0..dr3 for each DR_TRAP bit that's set and see if it's a
>> breakpoint.
>
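
For reference, the "parse it" alternative could look roughly like the
sketch below.  This is standalone C, not kernel code.  The function names
are made up for illustration, the constants are redefined locally to
mirror the architectural DR6/DR7/EFLAGS bit layout, and it assumes the
breakpoint type is read from the R/W field in dr7 rather than from
dr0..dr3 themselves.  It also assumes the per-breakpoint stride in the
dr7 enable bits is 2 (DR_ENABLE_SIZE), which is what the DR_ENABLE_SHIFT
in the snippet above was presumably meant to be.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the architectural bit layouts (not the kernel headers). */
#define DR_TRAP0          0x1UL        /* DR6 bit 0: breakpoint 0 triggered */
#define DR_GLOBAL_ENABLE  0x2UL        /* DR7 G0 bit */
#define DR_ENABLE_SIZE    2            /* DR7 enable bits per breakpoint (L, G) */
#define DR_CONTROL_SHIFT  16           /* DR7 R/W0 field starts at bit 16 */
#define DR_CONTROL_SIZE   4            /* R/W + LEN bits per breakpoint */
#define DR_RW_MASK        0x3UL
#define DR_RW_EXECUTE     0x0UL        /* R/W == 00: instruction breakpoint */
#define X86_EFLAGS_IF     0x00000200UL /* EFLAGS bit 9 */

/* Hypothetical helper: is slot i configured as an execute breakpoint? */
static bool is_exec_breakpoint(unsigned long dr7, int i)
{
	unsigned long rw;

	rw = (dr7 >> (DR_CONTROL_SHIFT + i * DR_CONTROL_SIZE)) & DR_RW_MASK;
	return rw == DR_RW_EXECUTE;
}

/*
 * Hypothetical helper: return dr7 with the global enable bit cleared for
 * every execute breakpoint that fired in kernel mode with IF clear.
 * Data watchpoints and user-mode hits are left alone.
 */
static unsigned long filter_kernel_breakpoints(unsigned long dr6,
					       unsigned long dr7,
					       unsigned long flags,
					       bool user)
{
	int i;

	if (user || (flags & X86_EFLAGS_IF))
		return dr7;

	for (i = 0; i < 4; i++) {
		if ((dr6 & (DR_TRAP0 << i)) && is_exec_breakpoint(dr7, i))
			dr7 &= ~(DR_GLOBAL_ENABLE << (i * DR_ENABLE_SIZE));
	}
	return dr7;
}
```

This variant never looks at RF, so it doesn't depend on my reading of
17.3.1.1 being right.
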
> Andy, section 5.8 of the SDM makes me think we could possibly abuse SYSRET
> to emulate IRET, and then possibly simplify the flags processing. It says
> that it takes the CPL3 code segment, but nowhere does it say that the
> target is validated as actually being userland, and further it suggests
> that it doesn't validate anything:
>
>   "It is the responsibility of the OS to ensure the descriptors in
>    the GDT/LDT correspond to the selectors loaded by SYSCALL/SYSRET
>    (consistent with the base, limit, and attribute values forced by
>    the instructions)."

You are an evil bastard.  I seriously doubt that this will work.
SYSRET goes to CPL3 no matter what.  Also, I don't think you want to
start poking at MSRs to return.
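
To make the CPL3 point concrete, here is a small standalone sketch (plain
C, not kernel code; the struct and function names are made up for
illustration) of how a 64-bit SYSRET derives its selectors as I read the
SDM: CS and SS both come from bits 63:48 of the IA32_STAR MSR, and the
RPL bits are forced to 3 regardless of what is stored there.

```c
#include <stdint.h>

/* Hypothetical container for the selectors SYSRET would load. */
struct sysret_selectors {
	uint16_t cs;
	uint16_t ss;
};

/*
 * Hypothetical helper: compute the selectors a 64-bit SYSRET (REX.W set)
 * loads from a given IA32_STAR value (MSR 0xC0000081).
 */
static struct sysret_selectors sysret64_selectors(uint64_t star)
{
	uint16_t base = (uint16_t)(star >> 48);	/* IA32_STAR[63:48] */
	struct sysret_selectors s = {
		.cs = (uint16_t)((base + 16) | 3),	/* RPL forced to 3 */
		.ss = (uint16_t)((base + 8) | 3),	/* RPL forced to 3 */
	};

	return s;
}
```

With an illustrative STAR[63:48] of 0x23 this gives CS = 0x33 and
SS = 0x2b.  Even if you put a kernel selector such as 0x10 in there, the
low two bits of both results are still 3, so returning anywhere else
would at minimum mean rewriting the MSR on the return path.
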

--Andy