Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753231AbdDJKn4 (ORCPT ); Mon, 10 Apr 2017 06:43:56 -0400
Received: from r00tworld.com ([212.85.137.150]:40947 "EHLO r00tworld.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1753152AbdDJKnz (ORCPT ); Mon, 10 Apr 2017 06:43:55 -0400
From: "PaX Team"
To: Andy Lutomirski
Date: Mon, 10 Apr 2017 12:42:35 +0200
MIME-Version: 1.0
Subject: Re: [kernel-hardening] Re: [RFC v2][PATCH 04/11] x86: Implement __arch_rare_write_begin/unmap()
Reply-to: pageexec@freemail.hu
CC: Andy Lutomirski, Mathias Krause, Thomas Gleixner, Kees Cook,
        "kernel-hardening@lists.openwall.com", Mark Rutland, Hoeun Ryu,
        Emese Revfy, Russell King, X86 ML, "linux-kernel@vger.kernel.org",
        "linux-arm-kernel@lists.infradead.org", Peter Zijlstra
Message-ID: <58EB619B.8144.6F924846@pageexec.freemail.hu>
In-reply-to:
References: <1490811363-93944-1-git-send-email-keescook@chromium.org>,
        <58EA2D58.17782.6ADE22BD@pageexec.freemail.hu>,
X-mailer: Pegasus Mail for Windows (4.72.572)
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Content-description: Mail message body
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-2.1.12
        (r00tworld.com [212.85.137.150]); Mon, 10 Apr 2017 12:42:35 +0200 (CEST)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1938
Lines: 38

On 9 Apr 2017 at 17:10, Andy Lutomirski wrote:

> On Sun, Apr 9, 2017 at 5:47 AM, PaX Team wrote:
> > on x86 the cost of the pax_open/close_kernel primitives comes from the cr0
> > writes and nothing else, use_mm suffers not only from the cr3 writes but
> > also locking/atomic ops and cr4 writes on its path and the inevitable TLB
> > entry costs. and if cpu vendors cared enough, they could make toggling cr0.wp
> > a fast path in the microcode and reduce its overhead by an order of magnitude.
> >
>
> If the CR4 writes happen in for this use case, that's a bug.

that depends on how you plan to handle perf/rdpmc users and how many
alternative mm structs you plan to manage (one global, one per cpu,
one per mm struct, etc).

> > you'll be duplicating TLB entries in the alternative PCID for both code
> > and data, where they will accumulate (=take room away from the normal PCID
> > and expose unwanted memory for access) unless you also flush them when
> > switching back (which then will cost even more cycles). also i'm not sure
> > that processors implement all the 12 PCID bits so depending on how many PCIDs
> > you plan to use, you could be causing even more unnecessary TLB replacements.
> >
>
> Unless the CPU is rather dumber than I expect, the only duplicated
> entries should be for the writable aliases of pages that are written.
> The rest of the pages are global and should be shared for all PCIDs.

well, 4.10.2.4 has language like this (4.10.3.2 implies similar):

    A logical processor may use a global TLB entry to translate a linear
    address, even if the TLB entry is associated with a PCID different
    from the current PCID.

that to me says that global page entries are associated with a PCID and may
(not) be used while in another PCID. in Intel-speak that's not 'dumb' but
"tricks up our sleeve that we don't really want to tell you about in detail,
except perhaps under a NDA".
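
for readers who want the register bits discussed above spelled out, here is a
minimal user-space sketch in C. it only models the architectural bit layout
(CR0.WP is bit 16, the PCID occupies CR3 bits 11:0 when CR4.PCIDE is set, and
CR3 bit 63 requests no flush on load). all names in it are hypothetical
helpers made up for illustration, not kernel or PaX APIs, and it never touches
a real control register (that needs ring 0). it also says nothing about how
many PCID bits a given processor's TLB actually tracks, which is the open
question raised above.

```c
#include <stdint.h>

/* Hypothetical constants for illustration; the bit positions are architectural. */
#define CR0_WP_BIT      (UINT64_C(1) << 16)         /* CR0.WP: write protect */
#define CR3_PCID_BITS   12                          /* PCID field width in CR3 */
#define CR3_PCID_MASK   ((UINT64_C(1) << CR3_PCID_BITS) - 1)
#define CR3_NOFLUSH_BIT (UINT64_C(1) << 63)         /* keep TLB entries on CR3 load */

/* Value CR0 would hold with write protection disabled ("open"). */
static uint64_t cr0_clear_wp(uint64_t cr0)
{
    return cr0 & ~CR0_WP_BIT;
}

/* Value CR0 would hold with write protection restored ("close"). */
static uint64_t cr0_set_wp(uint64_t cr0)
{
    return cr0 | CR0_WP_BIT;
}

/*
 * Compose a CR3 value from a page-table base, a PCID and the no-flush flag.
 * The PCID is truncated to 12 bits, so at most 4096 distinct PCIDs exist.
 */
static uint64_t cr3_compose(uint64_t pgd_phys, unsigned int pcid, int noflush)
{
    uint64_t cr3 = pgd_phys & ~CR3_PCID_MASK;

    cr3 |= (uint64_t)pcid & CR3_PCID_MASK;
    if (noflush)
        cr3 |= CR3_NOFLUSH_BIT;
    return cr3;
}

/* Extract the PCID field from a CR3 value. */
static unsigned int cr3_pcid(uint64_t cr3)
{
    return (unsigned int)(cr3 & CR3_PCID_MASK);
}
```

with a typical CR0 value of 0x80050033, cr0_clear_wp() yields 0x80040033 and
cr0_set_wp() restores the original, so the open/close pair is one bit flipped
twice. the cr3 path has to pick a PCID and decide whether to flush on top of
that.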