Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753691AbbDBMcp (ORCPT );
	Thu, 2 Apr 2015 08:32:45 -0400
Received: from terminus.zytor.com ([198.137.202.10]:60729
	"EHLO terminus.zytor.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1753356AbbDBMci (ORCPT );
	Thu, 2 Apr 2015 08:32:38 -0400
Date: Thu, 2 Apr 2015 05:32:10 -0700
From: tip-bot for Andy Lutomirski
Message-ID:
Cc: luto@kernel.org, mingo@kernel.org, bp@suse.de, bp@alien8.de,
	linux-kernel@vger.kernel.org, tglx@linutronix.de, brgerst@gmail.com,
	hpa@zytor.com, torvalds@linux-foundation.org, dvlasenk@redhat.com
Reply-To: bp@suse.de, mingo@kernel.org, luto@kernel.org, tglx@linutronix.de,
	linux-kernel@vger.kernel.org, bp@alien8.de, brgerst@gmail.com,
	hpa@zytor.com, dvlasenk@redhat.com, torvalds@linux-foundation.org
In-Reply-To: <9472f1ca4c19a38ecda45bba9c91b7168135fcfa.1427923514.git.luto@kernel.org>
References: <9472f1ca4c19a38ecda45bba9c91b7168135fcfa.1427923514.git.luto@kernel.org>
To: linux-tip-commits@vger.kernel.org
Subject: [tip:x86/urgent] x86/asm/entry/64: Disable opportunistic SYSRET if regs->flags has TF set
Git-Commit-ID: 7ea24169097d3d3a3eab2dcc5773bc43fd5593e7
X-Mailer: tip-git-log-daemon
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3254
Lines: 84

Commit-ID:  7ea24169097d3d3a3eab2dcc5773bc43fd5593e7
Gitweb:     http://git.kernel.org/tip/7ea24169097d3d3a3eab2dcc5773bc43fd5593e7
Author:     Andy Lutomirski
AuthorDate: Wed, 1 Apr 2015 14:26:34 -0700
Committer:  Ingo Molnar
CommitDate: Thu, 2 Apr 2015 11:09:54 +0200

x86/asm/entry/64: Disable opportunistic SYSRET if regs->flags has TF set

When I wrote the opportunistic SYSRET code, I missed an important
difference between SYSRET and IRET.

Both instructions are capable of setting EFLAGS.TF, but they behave
differently when doing so:

 - IRET will not issue a #DB trap after execution when it sets TF.
   This is critical -- otherwise you'd never be able to make
   forward progress when returning to userspace.

 - SYSRET, on the other hand, will trap with #DB immediately after
   returning to CPL3, and the next instruction will never execute.

This breaks anything that opportunistically SYSRETs to a user
context with TF set.  For example, running this code with TF set
and a SIGTRAP handler loaded never gets past 'post_nop':

	extern unsigned char post_nop[];
	asm volatile ("pushfq\n\t"
		      "popq %%r11\n\t"
		      "nop\n\t"
		      "post_nop:"
		      : : "c" (post_nop) : "r11");

In my defense, I can't find this documented in the AMD or Intel manual.

Fix it by using IRET to restore TF.

Signed-off-by: Andy Lutomirski
Cc: Borislav Petkov
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Thomas Gleixner
Fixes: 2a23c6b8a9c4 ("x86_64, entry: Use sysret to return to userspace when possible")
Link: http://lkml.kernel.org/r/9472f1ca4c19a38ecda45bba9c91b7168135fcfa.1427923514.git.luto@kernel.org
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/entry_64.S | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 2babb39..f0095a7 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -799,7 +799,21 @@ retint_swapgs:		/* return to user-space */
 	cmpq %r11,(EFLAGS-ARGOFFSET)(%rsp)	/* R11 == RFLAGS */
 	jne opportunistic_sysret_failed
 
-	testq $X86_EFLAGS_RF,%r11		/* sysret can't restore RF */
+	/*
+	 * SYSRET can't restore RF.  SYSRET can restore TF, but unlike IRET,
+	 * restoring TF results in a trap from userspace immediately after
+	 * SYSRET.  This would cause an infinite loop whenever #DB happens
+	 * with register state that satisfies the opportunistic SYSRET
+	 * conditions.  For example, single-stepping this user code:
+	 *
+	 *           movq $stuck_here,%rcx
+	 *           pushfq
+	 *           popq %r11
+	 *   stuck_here:
+	 *
+	 * would never get past 'stuck_here'.
+	 */
+	testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
 	jnz opportunistic_sysret_failed
 
 	/* nothing to check for RSP */
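
For readers who want to see the flags part of the check outside of
assembly, here is a minimal user-space sketch in C. It is not kernel
code and not part of the patch: the function name flags_allow_sysret()
and its two parameters are hypothetical, chosen only for illustration.
It models just the two tests visible in the hunk above (R11 must equal
the saved RFLAGS, and neither RF nor TF may be set) and ignores every
other opportunistic SYSRET condition, such as the RCX/RIP and segment
checks. The bit values are the architectural RFLAGS positions (TF is
bit 8, RF is bit 16).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural RFLAGS bits: TF is bit 8, RF is bit 16. */
#define X86_EFLAGS_TF UINT64_C(0x00000100)
#define X86_EFLAGS_RF UINT64_C(0x00010000)

/*
 * Hypothetical model of the flags checks in the hunk above.
 * Returns true when SYSRET would be acceptable as far as the
 * flags are concerned, false when the IRET path must be taken.
 */
bool flags_allow_sysret(uint64_t saved_rflags, uint64_t saved_r11)
{
	/* SYSRET loads RFLAGS from R11, so the two must already match. */
	if (saved_r11 != saved_rflags)
		return false;

	/*
	 * RF cannot be restored by SYSRET, and restoring TF with SYSRET
	 * traps before the next user instruction runs, so either bit
	 * forces the IRET path.
	 */
	if (saved_r11 & (X86_EFLAGS_RF | X86_EFLAGS_TF))
		return false;

	return true;
}
```

With this model, a single-stepped task (TF set in both the saved
RFLAGS and R11) is rejected for SYSRET, which corresponds to the
"jnz opportunistic_sysret_failed" branch after the new testq.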