From: Andy Lutomirski
Date: Mon, 16 Nov 2015 15:57:05 -0800
Subject: Re: [PATCH v3 5/5] x86/entry/64: Bypass enter_from_user_mode on non-context-tracking boots
To: Frederic Weisbecker
Cc: Borislav Petkov, Brian Gerst, "linux-kernel@vger.kernel.org", X86 ML, Peter Zijlstra, Linus Torvalds
In-Reply-To: <20151116225048.GA5212@lerouge>
References: <73ee804fff48cd8c66b65b724f9f728a11a8c686.1447361906.git.luto@kernel.org> <20151113152612.GA14397@lerouge> <20151116225048.GA5212@lerouge>

On Mon, Nov 16, 2015 at 2:50 PM, Frederic Weisbecker wrote:
> On Mon, Nov 16, 2015 at 11:10:55AM -0800, Andy Lutomirski wrote:
>> On Nov 13, 2015 7:26 AM, "Frederic Weisbecker" wrote:
>> >
>> > On Thu, Nov 12, 2015 at 12:59:04PM -0800, Andy Lutomirski wrote:
>> > > On CONFIG_CONTEXT_TRACKING kernels that have context tracking
>> > > disabled at runtime (which includes most distro kernels), we still
>> > > have the overhead of a call to enter_from_user_mode in interrupt and
>> > > exception entries.
>> > >
>> > > If jump labels are available, this uses the jump label
>> > > infrastructure to skip the call.
>> >
>> > Looks good. But why are we still calling context tracking code on IRQs at all?
>>
>> Same reasons as before:
>>
>> 1. This way the IRQ exit path is almost completely shared with all the
>> other exit paths.
>
> I'm all for consolidation in general. Unless it brings bad middle states.

The middle state works fine, though. With these patches, the middle
state should have essentially no performance hit compared to the
previous state in default configurations.

>
> If I knew before that I would have to argue endlessly in order to protest against
> these context tracking changes, I would have NACK'ed the x86 consolidation rework in
> the state it was while it got merged.
>
>>
>> 2. It combines the checks for which context we were in with what CPL
>> we entered from.
>>
>> Part 2 should be complete across the whole x86 kernel soon once the
>> 64-bit syscall code gets fixed up.
>>
>> We should get rid of the duplication in the irq entry hooks. Want to
>> help with that?
>
> Which one? The duplication against irq_enter() and irq_exit()?

Yes.

>
> I think that irq_exit() should be moved to the IRQ very end and perform the
> final signal/schedule/preempt_schedule_irq() loop. But it requires a bit of
> rework on all archs in order to do that. This could be done iteratively though.
>
>> Presumably we should do the massive remote polling speedup to the nohz code,
>
> Hmm, I don't get what you mean here.
>

Currently (4.4-rc1), when an IRQ hits user mode, here's roughly what we do:

 - Tell context tracking that we're in the kernel
   - Switch ct state
   - Wake up RCU
   - Adjust vtime

 - irq_enter
   - Adjust preempt count
   - Wake up RCU
   - Tell vtime accounting that we're in an IRQ

All of the initial stuff should be, in the long term, just a write to
some variable and a possible barrier. Whatever CPU is doing
housekeeping can poll to keep track of user vs system time.

The irq_enter stuff, in turn, could either set some variable telling
the housekeeper that we're in an IRQ or it could continue to directly
adjust time accounting.

In any event, all of this should be extremely fast, which it currently isn't.

>> and we should also teach enter_from_user_mode to transition directly to IRQ state as
>> appropriate. Then irq_enter can be much faster.
>
> I don't get what you mean here either. You mean calling irq_enter() from enter_from_user_mode()?
>

No, I mean teaching irq_enter that, on x86 at least, we promise that
irq_enter is only ever called from CONTEXT_KERNEL so it can do less
redundant work.

Or, even better, we could fold the irq_enter and user->kernel hooks
into a single context tracking call.

--Andy
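
For illustration only, and not part of the patch series: the "write to some
variable, let the housekeeper poll" idea from the message above can be
modelled in userspace C11 roughly as follows. Every name here (ctx_set,
ctx_enter_irq_from_user, housekeeper_poll, the CTX_* states, the fixed
NR_CPUS) is hypothetical and is not a kernel API; a real implementation would
presumably use per-CPU variables and the kernel's own barrier primitives, and
would charge elapsed time rather than abstract ticks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical context states; the names are illustrative, not kernel ones. */
enum ctx_state { CTX_USER = 0, CTX_KERNEL = 1, CTX_IRQ = 2, CTX_NR_STATES = 3 };

#define NR_CPUS 2	/* assumed fixed CPU count for the sketch */

/* Written only by the owning CPU, read by the housekeeping CPU. */
static _Atomic int cpu_ctx[NR_CPUS];

/* Accounting state, touched only by the housekeeping CPU. */
static uint64_t cpu_ticks[NR_CPUS][CTX_NR_STATES];

/* Entry/exit hook: one store plus release ordering, nothing else. */
static inline void ctx_set(int cpu, enum ctx_state s)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	atomic_store_explicit(&cpu_ctx[cpu], (int)s, memory_order_release);
}

/*
 * Folded hook: an IRQ arriving from user mode goes straight to CTX_IRQ
 * instead of passing through CTX_KERNEL and a second irq_enter-style call.
 */
static inline void ctx_enter_irq_from_user(int cpu)
{
	ctx_set(cpu, CTX_IRQ);
}

/* One polling pass: charge one tick to the state each CPU is observed in. */
static void housekeeper_poll(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		int s = atomic_load_explicit(&cpu_ctx[cpu],
					     memory_order_acquire);
		cpu_ticks[cpu][s]++;
	}
}
```

In this model the interrupted CPU pays for a single store on entry, and all
of the accounting cost moves to whichever CPU runs housekeeper_poll(). The
trade-off is sampling granularity: state changes shorter than the polling
interval may not be observed at all.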