Date: Wed, 19 Aug 2015 01:02:37 +0200
From: Frederic Weisbecker <fweisbec@gmail.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>, X86 ML <x86@kernel.org>,
        Sasha Levin <sasha.levin@oracle.com>, Brian Gerst <brgerst@gmail.com>,
        Denys Vlasenko <dvlasenk@redhat.com>,
        "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
        Oleg Nesterov <oleg@redhat.com>, Borislav Petkov <bp@alien8.de>,
        Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH] x86/entry/64: Context-track syscalls before enabling
 interrupts
Message-ID: <20150818230235.GA13685@lerouge>
References: <ad9154dd60f669e94e60d36d23c3267b2ac4c94d.1439924771.git.luto@kernel.org>
 <20150818221623.GA12858@lerouge>
 <CALCETrVQCi_RZqRSTy9bs0V+RB6cLHVfYq4Ouq_JLMoJePg1zA@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CALCETrVQCi_RZqRSTy9bs0V+RB6cLHVfYq4Ouq_JLMoJePg1zA@mail.gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2277
Lines: 57

On Tue, Aug 18, 2015 at 03:35:30PM -0700, Andy Lutomirski wrote:
> On Tue, Aug 18, 2015 at 3:16 PM, Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > On Tue, Aug 18, 2015 at 12:11:59PM -0700, Andy Lutomirski wrote:
> >> This fixes a couple minor holes if we took an IRQ very early in syscall
> >> processing:
> >>
> >>  - We could enter the IRQ with CONTEXT_USER.  Everything worked (RCU
> >>    was fine), but we could warn if all the debugging options were
> >>    set.
> >
> > So this is fixing issues after your changes that call user_exit() from
> > IRQs, right?
> 
> Yes.  Here's an example splat, courtesy of Sasha:
> 
> https://gist.github.com/sashalevin/a006a44989312f6835e7
> 
> >
> > But the IRQs aren't supposed to call user_exit(), they have their own hooks.
> > That's where the real issue is.
> 
> In -tip, the assumption is that we *always* switch to CONTEXT_KERNEL
> when entering the kernel for a non-NMI reason.

Why? IRQs don't need that! We already have irq_enter()/irq_exit().

And we don't want to call rcu_user_*() pairs on IRQs: you're
introducing a serious performance regression here! And I'm talking about
the code that's currently in -tip.

> That means that we can
> avoid all of the (expensive!) checks for what context we're in.

If you're referring to context tracking, the context check is a per-cpu
read, which is not usually considered expensive.

> It also means that (other than IRQs, which need further cleanup), we only
> switch once per user/kernel switch.

???

> 
> The cost for doing so should be essentially zero, modulo artifacts from
> poor inlining.

And modulo the rcu_user_*() calls, which do multiple costly atomic_add_return()
operations implying full memory barriers. Plus the unnecessary vtime accounting that
duplicates what irq_enter()/irq_exit() already do (those even take a lock currently,
which will probably be turned into a seqcount, but still, full memory barriers...).
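
To make the cost difference concrete, here is a rough userspace model of the
two paths discussed above. It is only a sketch, not kernel code: every name in
it (ctx_in_user(), model_user_exit(), model_user_enter(), model_irq_from_user(),
the counter layout) is a hypothetical stand-in, thread-local storage stands in
for per-cpu variables, and a sequentially consistent C11 atomic_fetch_add()
stands in for atomic_add_return(). It shows that a context check is a plain
read, while wrapping an IRQ in a user_exit()/user_enter() style pair adds two
fully ordered read-modify-write operations per interrupt.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; these are NOT the kernel's definitions. */
enum ctx_state { CTX_KERNEL = 0, CTX_USER = 1 };

/* Thread-local storage models per-cpu variables in this sketch. */
static _Thread_local enum ctx_state ctx_state_this_cpu = CTX_KERNEL;
static _Thread_local atomic_int rcu_watch_counter = 1; /* odd: RCU watching */
static _Thread_local int rmw_ops; /* fully ordered RMWs performed so far */

/* Cheap path: the context check is a plain read, with no barrier. */
static int ctx_in_user(void)
{
	return ctx_state_this_cpu == CTX_USER;
}

/* Costly path: a seq_cst read-modify-write, modelling atomic_add_return(). */
static int counter_bump(void)
{
	rmw_ops++;
	return atomic_fetch_add(&rcu_watch_counter, 1) + 1;
}

/* Models a user_enter()-style transition: the counter becomes even. */
static void model_user_enter(void)
{
	int v = counter_bump();

	assert((v & 1) == 0);
	ctx_state_this_cpu = CTX_USER;
}

/* Models a user_exit()-style transition: the counter becomes odd. */
static void model_user_exit(void)
{
	int v = counter_bump();

	assert(v & 1);
	ctx_state_this_cpu = CTX_KERNEL;
}

/*
 * Models an IRQ arriving while in CTX_USER. Returns how many fully ordered
 * RMWs the optional exit/enter pair added on top of the handler itself.
 */
static int model_irq_from_user(int with_user_exit_pair)
{
	int before = rmw_ops;

	assert(ctx_in_user());
	if (with_user_exit_pair) {
		model_user_exit();
		/* the interrupt handler body would run here */
		model_user_enter();
	}
	return rmw_ops - before;
}
```

In this model, model_irq_from_user(0) adds no ordered operations, while
model_irq_from_user(1) adds two per interrupt, on top of whatever the IRQ's
own hooks already do.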

I'm sorry but I'm going to NACK any code that does that in IRQs (and again that
concerns current tip:x86/asm).