Message-ID: <551D3D2A.2040802@redhat.com>
Date: Thu, 02 Apr 2015 14:59:22 +0200
From: Denys Vlasenko
To: Ingo Molnar
CC: Brian Gerst, Andy Lutomirski, the arch/x86 maintainers,
    Linux Kernel Mailing List, Borislav Petkov, Linus Torvalds,
    Borislav Petkov
Subject: Re: [PATCH urgent v2] x86, asm: Disable opportunistic SYSRET if regs->flags has TF set
References: <9472f1ca4c19a38ecda45bba9c91b7168135fcfa.1427923514.git.luto@kernel.org>
    <20150402090744.GA26846@gmail.com> <551D14D3.1070907@redhat.com>
    <20150402103735.GA21105@gmail.com> <551D3503.6000508@redhat.com>
    <20150402123159.GA25151@gmail.com>
In-Reply-To: <20150402123159.GA25151@gmail.com>

On 04/02/2015 02:31 PM, Ingo Molnar wrote:
>
> * Denys Vlasenko wrote:
>
>> On 04/02/2015 01:14 PM, Brian Gerst wrote:
>>>>>> So I merged this as it's an obvious bugfix, but in hindsight I'm
>>>>>> really uneasy about the whole opportunistic SYSRET concept: it appears
>>>>>> that the chance that %rcx matches return-%rip is astronomical - this
>>>>>> is why this bug wasn't noticed live so far.
>>>>>>
>>>>>> So should we really be doing this?
>>>>>
>>>>> Andy does this not for the off-chance that userspace's RCX is equal
>>>>> to return address and R11 == RFLAGS. The chances of that are
>>>>> astronomically small.
>>>>>
>>>>> This code path triggers when ptrace/audit/seccomp is active.
>>>>> Instead of torturing ourselves trying to not divert into IRET return,
>>>>> now code is steered that way. But then immediately before actual IRET,
>>>>> we check again: "do we really need IRET?" IOW "did ptrace really
>>>>> touch pt_regs->ss? ->flags? ->rip? ->rcx?" which in vast majority of
>>>>> cases will not be true.
>>>>
>>>> I keep forgetting about that, my test systems have the audit muck
>>>> turned off ;-)
>>>>
>>>> Fair enough - and it's sensible to share the IRET path between
>>>> interrupts and complex-return system calls, even though the check
>>>> is unnecessary overhead for the pure interrupt return path...
>>>
>>>
>>> Maybe we could reintroduce TIF_IRET for this purpose instead of
>>> (ab)using TIF_NOTIFY_RESUME. Then we would only do the opportunistic
>>> check for those cases (ptrace, audit, exec, sigreturn, etc.), and skip
>>> it for interrupts.
>>
>> The very first check in the existing code, pt_regs->cx ==
>> pt_regs->ip, will fail for interrupt returns.
>>
>> You hardly can save anything by placing a (ti->flags &
>> TIF_TRY_SYSRET) check in front of it, it's almost as expensive.
>
> Well, what I was thinking of was to have a pure irq (well, async
> context) return path, not shared with the weird-syscall-IRET return
> path at all ...
>
> It would be open coded, not obfuscated via macros.
>
> That way AFAICS the upsides are:
>
>  - it's easier to read (and maintain) what goes on in which case.
>    '*intr*' labels would truly identify interrupt return related
>    processing, for a change!

Re labels: I fully agree they need cleanup (mass rename).
Something along the lines of

int_ret_from_sys_call  -> return_from_syscall
int_with_check         -> sysret_check_workmask_in_edi
int_careful            -> sysret_check_NEED_RESCHED
int_very_careful       -> sysret_check_SYSCALL_EXIT
int_signal             -> sysret_check_DO_NOTIFY_MASK
int_restore_rest       -> sysret_next_check

ret_from_intr          -> return_from_intr
retint_with_reschedule -> intr_check_WORK_MASK
retint_check           -> intr_check_workmask_in_edi
retint_careful         -> intr_check_NEED_RESCHED
retint_signal          -> intr_check_DO_NOTIFY_MASK
retint_swapgs          -> return_from_syscall_or_intr
irq_return_via_sysret  -> return_via_sysret
retint_kernel          -> intr_check_preempt
restore_args           -> restore_c_regs
irq_return             -> return_via_iret

and then your proposal can be rephrased as
"let's stop merging sysret and intr code paths at retint_swapgs".
Makes sense. It would entail some code duplication, but the code
will be easier to maintain.

>  - we can optimize in a more directed fashion - like here
>
> ... while the downsides are:
>
>  - more code
>  - a (small) chance of a fix going to one path while not the other.
>
> How much extra code would it be?

A screenful or two.
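
The "do we really need IRET?" check discussed above can be sketched in C
roughly as follows. This is a minimal user-space model, not the kernel's
assembly: struct fake_pt_regs, can_use_sysret() and the FAKE_* constants
are hypothetical names made up for illustration, and the selector and
flag values are assumed to match x86-64 Linux. It covers only the
conditions named in this thread (rcx, rip, flags/r11, TF, ss) plus a CS
comparison; the real return path may check more.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved user frame (not struct pt_regs). */
struct fake_pt_regs {
	uint64_t ip;    /* return address IRET would load into RIP */
	uint64_t cx;    /* saved RCX; SYSRET loads RIP from RCX */
	uint64_t flags; /* saved RFLAGS */
	uint64_t r11;   /* saved R11; SYSRET loads RFLAGS from R11 */
	uint64_t cs;    /* saved code segment selector */
	uint64_t ss;    /* saved stack segment selector */
};

/* Assumed values for illustration only. */
#define FAKE_USER_CS   0x33u
#define FAKE_USER_DS   0x2bu
#define FAKE_EFLAGS_TF 0x00000100u /* trap flag (single-step) */
#define FAKE_EFLAGS_RF 0x00010000u /* resume flag */

/*
 * Return true if the saved frame looks like one that SYSRET could
 * restore, i.e. nothing (ptrace, signal delivery, ...) changed the
 * registers that SYSRET derives from RCX/R11 or hard-codes.
 */
static bool can_use_sysret(const struct fake_pt_regs *regs)
{
	/* Cheapest rejection first: almost never equal on interrupt return. */
	if (regs->cx != regs->ip)
		return false;

	/* SYSRET takes RFLAGS from R11, so the two must agree. */
	if (regs->r11 != regs->flags)
		return false;

	/* TF (the subject of this patch) and RF force the IRET path. */
	if (regs->flags & (FAKE_EFLAGS_TF | FAKE_EFLAGS_RF))
		return false;

	/* SYSRET installs fixed selectors; anything else needs IRET. */
	if (regs->cs != FAKE_USER_CS || regs->ss != FAKE_USER_DS)
		return false;

	return true;
}
```

Putting the cx/ip comparison first mirrors the point made above: an
interrupt return fails that one comparison almost always, so the rest of
the check is rarely reached on that path.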