Date: Wed, 3 Jan 2018 13:20:00 -0800
From: Andres Freund
To: Willy Tarreau
Cc: Linus Torvalds, Linux Kernel Mailing List
Subject: Re: Linux 4.15-rc6
Message-ID: <20180103212000.zvll6xvgj3idysgd@alap3.anarazel.de>
References: <20180102202859.4fvvrtngnitwzfym@alap3.anarazel.de> <20180103125724.GA2189@1wt.eu>
In-Reply-To: <20180103125724.GA2189@1wt.eu>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-01-03 13:57:25 +0100, Willy Tarreau wrote:
> On Tue, Jan 02, 2018 at 01:09:13PM -0800, Linus Torvalds wrote:
> > On Tue, Jan 2, 2018 at 12:28 PM, Andres Freund wrote:
> > >
> > > I thought it'd be interesting to run a short benchmark to be able to
> > > estimate the impact of the PTI work on postgres workloads (which I work
> > > on). On my skylake laptop, a memory resident, OLTP workload with 16
> > > connections results in:
> >
> > Yeah, that's actually pretty much in line with expectations.
> >
> > Something around 5% performance impact of the isolation is what people
> > are looking at.
> >
> > Obviously it depends on just exactly what you do. Some loads will
> > hardly be affected at all, if they just spend all their time in user
> > space. And if you do a lot of small system calls, you might see
> > double-digit slowdowns.
>
> I can confirm, I've just run some tests on haproxy on a core i7-4790K
> and I'm observing a performance loss of ~17%, making the connection
> rate go down from 245k/s to 204k/s.
> It's indeed quite significant for
> such use cases, even though I think it might reasonably be absorbed by
> usual noise in most use cases.

Yea, I've expanded the postgres benchmarks a bit, and it's not hard to
construct cases with significantly increased slowdowns:
https://www.postgresql.org/message-id/20180102222354.qikjmf7dvnjgbkxe@alap3.anarazel.de
and that's on a laptop, not a large system. I'd assume at least the
nopcid cases get considerably worse on larger systems.

> With that said, I think we should start to think about an option to
> disable this per process. We could imagine for example a prctl()
> requiring CAP_SYS_ADMIN to disable it. This would at least allow
> processes started as root to disable it when they consider themselves
> irrelevant to this kind of protection (mostly I/O intensive or network
> intensive applications).

That might not be a bad idea. If so, it'd be a good idea to keep it
separate from CAP_SYS_ADMIN. E.g. postgres refuses to run as root, but
setcap'ing to allow CAP_SYS_LIVE_AND_LET_LIVE_SYSCALL or such would
work.

But I suspect this isn't something easily done on a capability/prctl
level? Seems not uncomplicated to change this after a process has
already been created - so maybe it'd be easier to force this via
personality()?

> > > This isn't a complaint, I just thought it might be useful
> > > information. If it helps for anything/anybody, I'm happy to run
> > > additional benchmarks / provide additional information.
> >
> > Note that it will depend heavily on the hardware too. Older CPU's
> > without PCID will be impacted more by the isolation.
>
> Interesting. This CPU has PCID, so it's possible that older hardware
> may indeed be hit a bit more.

The post linked above has numbers with nopcid (which disables PCID
use), and indeed, the difference is quite measurable.

Greetings,

Andres Freund