Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753476AbcCGSlk (ORCPT ); Mon, 7 Mar 2016 13:41:40 -0500 Received: from mail-oi0-f68.google.com ([209.85.218.68]:35237 "EHLO mail-oi0-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752427AbcCGSld (ORCPT ); Mon, 7 Mar 2016 13:41:33 -0500 MIME-Version: 1.0 In-Reply-To: References: Date: Mon, 7 Mar 2016 13:41:32 -0500 Message-ID: Subject: Re: [PATCH v2 07/10] x86/entry: Vastly simplify SYSENTER TF handling From: Brian Gerst To: Andy Lutomirski Cc: Andy Lutomirski , "the arch/x86 maintainers" , Linux Kernel Mailing List , Borislav Petkov , Oleg Nesterov , Andrew Cooper Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3244 Lines: 69 On Mon, Mar 7, 2016 at 1:03 PM, Andy Lutomirski wrote: > On Mon, Mar 7, 2016 at 9:17 AM, Brian Gerst wrote: >> On Sun, Mar 6, 2016 at 12:52 AM, Andy Lutomirski wrote: >>> Due to a blatant design error, SYSENTER doesn't clear TF. As a result, >>> if a user does SYSENTER with TF set, we will single-step through the >>> kernel until something clears TF. There is absolutely nothing we can >>> do to prevent this short of turning off SYSENTER [1]. >>> >>> Simplify the handling considerably with two changes: >>> >>> 1. We already sanitize EFLAGS in SYSENTER to clear NT and AC. We can >>> add TF to that list of flags to sanitize with no overhead whatsoever. >>> >>> 2. Teach do_debug to ignore single-step traps in the SYSENTER prologue. >> >> What is wrong with the current method of clearing TF and setting >> TIF_SINGLESTEP on the first debug trap? This patch actually increases >> complexity because it has to check for a range of addresses rather >> than just the first instruction, plus it has to singlestep all the way >> through the SYSENTER prologue. >> >> Unless there is an actual issue with TIF_SINGLESTEP, I don't think >> this patch is an improvement. > > TIF_SINGLESTEP has bizarrely overloaded semantics. > > There's this: > > /* > * If we stepped into a sysenter/syscall insn, it trapped in > * kernel mode; do_debug() cleared TF and set TIF_SINGLESTEP. > * If user-mode had set TF itself, then it's still clear from > * do_debug() and we need to set it again to restore the user > * state so we don't wrongly set TIF_FORCED_TF below. > * If enable_single_step() was used last and that is what > * set TIF_SINGLESTEP, then both TF and TIF_FORCED_TF are > * already set and our bookkeeping is fine. > */ > if (unlikely(test_tsk_thread_flag(child, TIF_SINGLESTEP))) > regs->flags |= X86_EFLAGS_TF; > > but TIF_SINGLESTEP is also used for other things. (And I need to > follow up with a patch to remove that comment.) This results in > incomprehensible behavior: if a user program sets TF and does > SYSENTER, then TIF_SINGLESTEP gets set (and stays set!). This does > not happen if a user sets TF and does INT80 or SYSCALL. There was at > least one bug in here that Oleg fixed a while back, and I wouldn't be > at all surprised if there were others. > > I don't know what would happen if TF were set and SYSENTER were used > to do a syscall where the __get_user to load the syscall nr failed. > That happens before the TIF_SINGLESTEP fixup. > > Basically, the overloaded use of TIF_SINGLESTEP was complicated and > hard to understand, and the new behavior is straightforward and > consistent with other entries, even if it's a bit slower. > > We could introduce a new TIF_SYSENTER_TF and use it directly, or we > could accelerate the TF fixup in regs->flags by switching to a > different entry path (I had a draft patch to do that), but I tend to > favor simplicity for things that aren't performance-critical. The alternate entry path wouldn't be very big, just 5 instructions with the OR being the only difference. Another option is to use a per-cpu var instead of a TIF flag. -- Brian Gerst