From: Andy Lutomirski
Date: Tue, 27 Aug 2019 16:18:35 -0700
Subject: Re: [RFC PATCH 0/3] x86/mm/tlb: Defer TLB flushes with PTI
To: Nadav Amit
Cc: Andy Lutomirski, Dave Hansen, X86 ML, LKML, Peter Zijlstra, Thomas Gleixner, Ingo Molnar
In-Reply-To: <20190823224635.15387-1-namit@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Aug 23, 2019 at 11:07 PM Nadav Amit wrote:
>
> INVPCID is considerably slower than INVLPG of a single PTE, but it is
> currently used to flush PTEs in the user page-table when PTI is used.
>
> Instead, it is possible to defer TLB flushes until after the user
> page-tables are loaded. Preventing speculation over the TLB flushes
> should keep the whole thing safe. In some cases, deferring TLB flushes
> in such a way can result in more full TLB flushes, but arguably this
> behavior is oftentimes beneficial.

I have a somewhat horrible suggestion.

Would it make sense to refactor this so that it works for user *and*
kernel tables? In particular, if we flush a *kernel* mapping (vfree,
vunmap, set_memory_ro, etc), we shouldn't need to send an IPI to a
task that is running user code to flush most kernel mappings or even
to free kernel pagetables. The same trick could be done if we treat
idle like user mode for this purpose.

In code, this could mostly consist of changing all the "user" data
structures involved to something like struct deferred_flush_info and
having one for user and one for kernel.

I think this is horrible because it will enable certain workloads to
work considerably faster with PTI on than with PTI off, and that would
be a barely excusable moral failing. :-p

For what it's worth, other than register clobber issues, the whole
"switch CR3 for PTI" logic ought to be doable in C. I don't know a
priori whether that would end up being an improvement.
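
To make the struct deferred_flush_info idea above concrete, here is a
minimal user-space sketch in C. It is not taken from the patch series:
only the struct name comes from this mail, while the fields, the enum,
the per-CPU wrapper, the helper names and the 32-page cutoff are all
assumptions made for illustration. It only models the bookkeeping (one
deferred range for user mappings, one for kernel mappings) and counts
pages instead of issuing any flush instruction.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE_BYTES 4096UL
/* Hypothetical cutoff: beyond this many bytes' worth of pages, flush all. */
#define FULL_FLUSH_PAGES 32UL

/* Hypothetical: the two kinds of mappings whose flushes can be deferred. */
enum flush_target { FLUSH_USER, FLUSH_KERNEL, FLUSH_NR_TARGETS };

/* Field layout is an assumption; the mail only proposes the struct name. */
struct deferred_flush_info {
	unsigned long start;	/* first byte of the pending range */
	unsigned long end;	/* one past the last byte of the pending range */
	bool pending;		/* is any flush deferred? */
	bool full;		/* pending range too large, flush everything */
};

/* One instance for user mappings and one for kernel mappings, per CPU. */
struct cpu_flush_state {
	struct deferred_flush_info deferred[FLUSH_NR_TARGETS];
};

/* Record [start, end) as needing a flush later instead of flushing now. */
static void defer_flush(struct cpu_flush_state *cpu, enum flush_target t,
			unsigned long start, unsigned long end)
{
	struct deferred_flush_info *info = &cpu->deferred[t];

	assert(start < end);
	if (!info->pending) {
		info->start = start;
		info->end = end;
		info->pending = true;
	} else {
		/* Merge into one range; any gap in between is over-flushed. */
		if (start < info->start)
			info->start = start;
		if (end > info->end)
			info->end = end;
	}
	if (info->end - info->start > FULL_FLUSH_PAGES * PAGE_SIZE_BYTES)
		info->full = true;
}

/*
 * Run the deferred flush at the point where the mappings become live
 * again (for example on kernel entry for FLUSH_KERNEL). Returns the
 * number of pages that would be flushed one by one, 0 if nothing was
 * pending, or -1 for a full flush. A real implementation would issue
 * the flush instructions here; this sketch only resets the state.
 */
static long run_deferred_flush(struct cpu_flush_state *cpu,
			       enum flush_target t)
{
	struct deferred_flush_info *info = &cpu->deferred[t];
	long pages = 0;

	if (info->pending) {
		if (info->full) {
			pages = -1;
		} else {
			unsigned long first = info->start / PAGE_SIZE_BYTES;
			unsigned long last = (info->end - 1) / PAGE_SIZE_BYTES;

			pages = (long)(last - first + 1);
		}
	}
	*info = (struct deferred_flush_info){ 0 };
	return pages;
}
```

In this model, a CPU running user code would have a kernel-mapping
flush recorded with defer_flush(cpu, FLUSH_KERNEL, ...) rather than
receiving an IPI, and run_deferred_flush(cpu, FLUSH_KERNEL) would be
called on its next kernel entry. How the remote CPU's state is updated
safely is deliberately left out of the sketch.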