Date: Tue, 1 Aug 2017 19:57:17 +1000
From: Nicholas Piggin
To: Peter Zijlstra
Cc: Mathieu Desnoyers, Michael Ellerman, "Paul E. McKenney",
    linux-kernel, Boqun Feng, Andrew Hunter, maged michael, gromer,
    Avi Kivity, Benjamin Herrenschmidt, Palmer Dabbelt, Dave Watson
Subject: Re: [RFC PATCH v2] membarrier: expedited private command
Message-ID: <20170801195717.7a675cc2@roar.ozlabs.ibm.com>
In-Reply-To: <20170801081230.GF6524@worktop.programming.kicks-ass.net>
References: <20170727211314.32666-1-mathieu.desnoyers@efficios.com>
 <20170728085532.ylhuz2irwmgpmejv@hirez.programming.kicks-ass.net>
 <20170728115702.5vgnvwhmbbmyrxbf@hirez.programming.kicks-ass.net>
 <87tw1s4u9w.fsf@concordia.ellerman.id.au>
 <20170731233731.32e68f6d@roar.ozlabs.ibm.com>
 <973223324.694.1501551189603.JavaMail.zimbra@efficios.com>
 <20170801120047.61c59064@roar.ozlabs.ibm.com>
 <20170801081230.GF6524@worktop.programming.kicks-ass.net>

On Tue, 1 Aug 2017 10:12:30 +0200
Peter Zijlstra wrote:

> On Tue, Aug 01, 2017 at 12:00:47PM +1000, Nicholas Piggin wrote:
> > Thanks for this, I'll take a look. This should be a good start as a
> > stress test, but I'd also be interested in some application. The reason
> > being that, for example, using runqueue locks may give reasonable
> > maximum throughput numbers, but could cause some latency or slowdown
> > when it's used in a more realistic scenario.
>
> Given this is an unprivileged interface we have to consider DoS and
> other such lovely things. And since we cannot use mm_cpumask() we're
> stuck with for_each_online_cpu().

I think we *can* make that part of it per-arch, as well as whether or
not to use runqueue locks. It's kind of crazy not to use mm_cpumask()
when it's available. Avoiding CPUs you aren't allowed to run on is also
nice for compartmentalization.

> Combined that means that using rq->lock is completely out of the
> question, some numbnut doing 'for (;;) sys_membarrier();' can
> completely wreck the system.

In what way would it wreck the system? The lock isn't held over the
IPI, only long enough to inspect rq->curr->mm.

> Yes, it might work for 'normal' workloads, but the interference
> potential is just too big.

Well, it's good to be concerned about it, and I do see your point.
Although I don't know if it's all that complicated to use unprivileged
ops to badly hurt QoS on most systems already :)

If the mm cpumask is used, I think it's okay. You can cause a quite
similar iteration over CPUs, with lots of IPIs, TLB flushes, etc.,
using munmap/mprotect/etc., or context switch IPIs, and so on. Are we
reaching the stage where we're controlling those kinds of ops in terms
of their impact on the rest of the system?

> I have the same problem with Paul's synchronize_rcu_expedited() patch,
> that is a machine wide IPI spray and will interfere with unrelated work.

Possibly a global IPI would be a more serious concern.

Thanks,
Nick
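
P.S. To make the above concrete, here is a rough sketch of the kind of
loop being discussed. This is illustrative only, not the RFC patch: the
function names membarrier_private_expedited_sketch and ipi_mb, the use
of GFP_KERNEL, and the exact locking are placeholders.

```c
/*
 * Sketch only: walk a set of CPUs, sample rq->curr->mm with the runqueue
 * lock held just for that brief inspection (never across the IPI), then
 * IPI the matching CPUs.  cpu_rq() and struct rq are scheduler-internal,
 * so this would have to live under kernel/sched/.
 */
#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/sched.h>
#include <linux/smp.h>
#include "sched.h"		/* kernel/sched/sched.h, for cpu_rq() */

static void ipi_mb(void *info)
{
	smp_mb();	/* ordering on the remote CPU is the whole point */
}

static void membarrier_private_expedited_sketch(void)
{
	cpumask_var_t tmpmask;
	int cpu;

	if (!zalloc_cpumask_var(&tmpmask, GFP_KERNEL))
		return;		/* a real implementation needs a fallback here */

	get_online_cpus();
	/*
	 * Option A (always available): for_each_online_cpu(cpu).
	 * Option B (per-arch, where the mask is maintained): restrict the
	 * walk to mm_cpumask(current->mm), as suggested above.
	 */
	for_each_cpu(cpu, mm_cpumask(current->mm)) {
		struct rq *rq = cpu_rq(cpu);
		unsigned long flags;

		if (cpu == raw_smp_processor_id())
			continue;

		/* rq->lock is held only long enough to sample rq->curr->mm. */
		raw_spin_lock_irqsave(&rq->lock, flags);
		if (rq->curr && rq->curr->mm == current->mm)
			__cpumask_set_cpu(cpu, tmpmask);
		raw_spin_unlock_irqrestore(&rq->lock, flags);
	}

	/* The IPIs themselves are sent outside any runqueue lock. */
	preempt_disable();
	smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
	preempt_enable();

	put_online_cpus();
	free_cpumask_var(tmpmask);
}
```

The actual patch may well use RCU rather than rq->lock to sample
rq->curr, which is exactly the trade-off being argued about here.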