Date: Wed, 19 Aug 2015 01:02:37 +0200
From: Frederic Weisbecker
To: Andy Lutomirski
Cc: Andy Lutomirski, X86 ML, Sasha Levin, Brian Gerst, Denys Vlasenko, "linux-kernel@vger.kernel.org", Oleg Nesterov, Borislav Petkov, Rik van Riel
Subject: Re: [PATCH] x86/entry/64: Context-track syscalls before enabling interrupts
Message-ID: <20150818230235.GA13685@lerouge>
References: <20150818221623.GA12858@lerouge>

On Tue, Aug 18, 2015 at 03:35:30PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 18, 2015 at 3:16 PM, Frederic Weisbecker wrote:
> > On Tue, Aug 18, 2015 at 12:11:59PM -0700, Andy Lutomirski wrote:
> >> This fixes a couple minor holes if we took an IRQ very early in syscall
> >> processing:
> >>
> >> - We could enter the IRQ with CONTEXT_USER. Everything worked (RCU
> >>   was fine), but we could warn if all the debugging options were
> >>   set.
> >
> > So this is fixing issues after your changes that call user_exit() from
> > IRQs, right?
>
> Yes. Here's an example splat, courtesy of Sasha:
>
> https://gist.github.com/sashalevin/a006a44989312f6835e7
>
> >
> > But the IRQs aren't supposed to call user_exit(), they have their own hooks.
> > That's where the real issue is.
>
> In -tip, the assumption is that we *always* switch to CONTEXT_KERNEL
> when entering the kernel for a non-NMI reason.

Why? IRQs don't need that!
We already have irq_enter()/irq_exit(). And we don't want to call
rcu_user_*() pairs on IRQs, you're introducing a serious performance
regression here! And I'm talking about the code that's currently in -tip.

> That means that we can
> avoid all of the (expensive!) checks for what context we're in.

If you're referring to context tracking, the context check is a per-cpu
read. Not something that's usually considered expensive.

> It also means that (other than IRQs, which need further cleanup), we only
> switch once per user/kernel switch.

???

>
> The cost for doing should be essentially zero, modulo artifacts from
> poor inlining.

And modulo rcu_user_*() that do multiple costly atomic_add_return()
operations implying full memory barriers. Plus the unnecessary vtime
accounting that doubles the existing one in irq_enter/exit() (those even
imply a lock currently, which will probably be turned to seqcount, but
still, full memory barriers...).

I'm sorry but I'm going to NACK any code that does that in IRQs (and
again that concerns current tip:x86/asm).
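
As a rough illustration of the cost argument in this message, here is a
userspace C sketch. It is not kernel code and not the -tip implementation.
Every model_* name is hypothetical, thread-local storage stands in for
per-cpu data, and a single C11 seq_cst atomic add stands in for the
"multiple" atomic_add_return() calls that rcu_user_*() are said to do above.
The vtime accounting and the work done by irq_enter()/irq_exit() themselves
are deliberately left out. The sketch only counts the extra full-barrier
operations that a user_exit()/user_enter() style pair adds to an IRQ that
interrupts user context, and shows that the context check alone adds none.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for CONTEXT_KERNEL / CONTEXT_USER. */
enum model_ctx { MODEL_CONTEXT_KERNEL, MODEL_CONTEXT_USER };

/* Thread-local storage models per-cpu data (assumption of this sketch). */
static _Thread_local enum model_ctx model_ctx_state = MODEL_CONTEXT_USER;
static _Thread_local int model_full_barrier_ops = 0;

/* Shared counter updated with a full-barrier atomic; models RCU state only loosely. */
static atomic_int model_rcu_counter = 0;

/* The context check: a plain read of thread-local state, no atomics involved. */
static enum model_ctx model_ctx_check(void)
{
    return model_ctx_state;
}

/* One seq_cst read-modify-write per transition (a simplification, see lead-in). */
static void model_rcu_transition(void)
{
    atomic_fetch_add(&model_rcu_counter, 1);
    model_full_barrier_ops++;
}

/* Modelled user -> kernel switch. */
static void model_user_exit(void)
{
    if (model_ctx_check() == MODEL_CONTEXT_USER) {
        model_rcu_transition();
        model_ctx_state = MODEL_CONTEXT_KERNEL;
    }
}

/* Modelled kernel -> user switch. */
static void model_user_enter(void)
{
    if (model_ctx_check() == MODEL_CONTEXT_KERNEL) {
        model_rcu_transition();
        model_ctx_state = MODEL_CONTEXT_USER;
    }
}

/*
 * Model an IRQ that interrupts user context. Returns how many extra
 * full-barrier operations the context-tracking pair added to it.
 */
int model_irq_from_user(int track_context)
{
    int before = model_full_barrier_ops;

    assert(model_ctx_check() == MODEL_CONTEXT_USER);
    if (track_context)
        model_user_exit();
    /* The handler and its own irq_enter()/irq_exit() hooks would run here (not modelled). */
    if (track_context)
        model_user_enter();
    return model_full_barrier_ops - before;
}
```

Starting from the modelled user context, model_irq_from_user(0) reports 0
extra operations and model_irq_from_user(1) reports 2, one per direction.
In this simplified model that is the per-IRQ overhead being objected to.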