Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752561AbbLIVQQ (ORCPT ); Wed, 9 Dec 2015 16:16:16 -0500 Received: from mail-ob0-f181.google.com ([209.85.214.181]:35750 "EHLO mail-ob0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751321AbbLIVQO convert rfc822-to-8bit (ORCPT ); Wed, 9 Dec 2015 16:16:14 -0500 MIME-Version: 1.0 In-Reply-To: References: <1449666173-15366-1-git-send-email-brgerst@gmail.com> From: Andy Lutomirski Date: Wed, 9 Dec 2015 13:15:54 -0800 Message-ID: Subject: Re: [PATCH] x86/entry/64: Remove duplicate syscall table for fast path To: Brian Gerst Cc: Andy Lutomirski , "the arch/x86 maintainers" , Linux Kernel Mailing List , Borislav Petkov , =?UTF-8?B?RnLDqWTDqXJpYyBXZWlzYmVja2Vy?= , Denys Vlasenko , Linus Torvalds Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8BIT Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3860 Lines: 109 On Wed, Dec 9, 2015 at 1:08 PM, Brian Gerst wrote: > On Wed, Dec 9, 2015 at 1:53 PM, Andy Lutomirski wrote: >> On Wed, Dec 9, 2015 at 5:02 AM, Brian Gerst wrote: >>> Instead of using a duplicate syscall table for the fast path, create stubs for >>> the syscalls that need pt_regs that save the extra registers if a flag for the >>> slow path is not set. >>> >>> Signed-off-by: Brian Gerst >>> To: Andy Lutomirski >>> Cc: Andy Lutomirski >>> Cc: the arch/x86 maintainers >>> Cc: Linux Kernel Mailing List >>> Cc: Borislav Petkov >>> Cc: Frédéric Weisbecker >>> Cc: Denys Vlasenko >>> Cc: Linus Torvalds >>> --- >>> >>> Applies on top of Andy's syscall cleanup series. >> >> A couple questions: >> >>> @@ -306,15 +306,37 @@ END(entry_SYSCALL_64) >>> >>> ENTRY(stub_ptregs_64) >>> /* >>> - * Syscalls marked as needing ptregs that go through the fast path >>> - * land here. We transfer to the slow path. >>> + * Syscalls marked as needing ptregs land here. >>> + * If we are on the fast path, we need to save the extra regs. >>> + * If we are on the slow path, the extra regs are already saved. >>> */ >>> - DISABLE_INTERRUPTS(CLBR_NONE) >>> - TRACE_IRQS_OFF >>> - addq $8, %rsp >>> - jmp entry_SYSCALL64_slow_path >>> + movq PER_CPU_VAR(cpu_current_top_of_stack), %r10 >>> + testl $TS_SLOWPATH, ASM_THREAD_INFO(TI_status, %r10, 0) >>> + jnz 1f >> >> OK (but see below), but why not do: >> >> addq $8, %rsp >> jmp entry_SYSCALL64_slow_path > > I've always been adverse to doing things like that because it breaks > call/return branch prediction. I'd agree with you there except that the syscalls in question really don't matter for performance enough that we should worry about a handful of cycles from a return misprediction. We're still avoiding IRET regardless (to the extent possible), and that was always the major factor. > Also, are there any side effects to calling enter_from_user_mode() > more than once? A warning that invariants are broken if you have an appropriately configured kernel. > >> here instead of the stack munging below? >> >>> + subq $SIZEOF_PTREGS, %r10 >>> + SAVE_EXTRA_REGS base=r10 >>> + movq %r10, %rbx >>> + call *%rax >>> + movq %rbx, %r10 >>> + RESTORE_EXTRA_REGS base=r10 >>> + ret >>> +1: >>> + jmp *%rax >>> END(stub_ptregs_64) > > After some thought, that can be simplified. It's only executed on the > fast path, so pt_regs is at 8(%rsp). > >> Also, can we not get away with keying off rip or rsp instead of >> ti->status? That should be faster and less magical IMO. > > Checking if the return address is the instruction after the fast path > dispatch would work. > > Simplified version: > ENTRY(stub_ptregs_64) > cmpl $fast_path_return, (%rsp) Does that instruction actually work the way you want it to? (Does it link?) I think you might need to use leaq the way I did in my patch. > jne 1f > SAVE_EXTRA_REGS offset=8 > call *%rax > RESTORE_EXTRA_REGS offset=8 > ret > 1: > jmp *%rax > END(stub_ptregs_64) This'll work, I think, but I still think I prefer keeping as much complexity as possible in the slow path. I could be convinced otherwise, though -- this variant is reasonably clean. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/