From: Andy Lutomirski
Date: Tue, 4 Dec 2018 10:56:03 -0800
Subject: Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages
To: Nadav Amit
Cc: Rick Edgecombe, Andrew Morton, Andrew Lutomirski, Will Deacon, Linux-MM, LKML, Kernel Hardening, "Naveen N . Rao", Anil S Keshavamurthy, "David S. Miller", Masami Hiramatsu, Steven Rostedt, Ingo Molnar, Alexei Starovoitov, Daniel Borkmann, jeyu@kernel.org, Network Development, Ard Biesheuvel, Jann Horn, Kristen Carlson Accardi, Dave Hansen, "Dock, Deneen T", Peter Zijlstra

On Mon, Dec 3, 2018 at 5:43 PM Nadav Amit wrote:
>
> > On Nov 27, 2018, at 4:07 PM, Rick Edgecombe wrote:
> >
> > Since vfree will lazily flush the TLB, but not lazily free the underlying pages,
> > it often leaves stale TLB entries to freed pages that could get re-used. This is
> > undesirable for cases where the memory being freed has special permissions such
> > as executable.
>
> So I am trying to finish my patch-set for preventing transient W+X mappings
> from taking space, by handling kprobes & ftrace that I missed (thanks again for
> pointing it out).
>
> But all of the sudden, I don’t understand why we have the problem that this
> (your) patch-set deals with at all. We already change the mappings to make
> the memory writable before freeing the memory, so why can’t we make it
> non-executable at the same time? Actually, why do we make the module memory,
> including its data executable before freeing it???
>

All the code you're looking at is IMO a very awkward and possibly
incorrect way of doing what's actually necessary: putting the direct
map the way it wants to be.

Can't we shove this entire mess into vunmap? Have a flag (as part of
vmalloc like in Rick's patch or as a flag passed to a vfree variant
directly) that makes the vunmap code that frees the underlying pages
also reset their permissions?

Right now, we muck with set_memory_rw() and set_memory_nx(), which
both have very awkward (and inconsistent with each other!) semantics
when called on vmalloc memory. And they have their own flushes, which
is inefficient. Maybe the right solution is for vunmap to remove the
vmap area PTEs, call into a function like set_memory_rw() that resets
the direct maps to their default permissions *without* flushing, and
then to do a single flush for everything. Or, even better, to cause
the change_page_attr code to do the flush and also to flush the vmap
area all at once so that very small free operations can flush single
pages instead of flushing globally.
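
To make the ordering argument concrete, here is a toy userspace model of the two sequences described above. It is only a sketch under loud assumptions: none of this is kernel code, and every type and function name (page_model, tlb_model, model_free_per_call_flush, model_free_single_flush) is hypothetical. It counts flush operations and records whether pages were released while unflushed PTE changes were still pending, which is a simplification of the behaviour described in this thread rather than a trace of the real code paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the ordering discussed above. NOT kernel code:
 * every type and function here is hypothetical. */

struct page_model {
    bool writable;     /* direct-map alias is writable */
    bool executable;   /* direct-map alias is executable */
    bool vmap_mapped;  /* a vmap-area PTE still points at the page */
    bool freed;        /* page has been handed back to the allocator */
};

struct tlb_model {
    unsigned int flushes;    /* flush operations issued so far */
    bool stale;              /* PTE changes that are not yet flushed */
    bool freed_while_stale;  /* pages were freed with changes still pending */
};

static void model_flush(struct tlb_model *tlb)
{
    tlb->flushes++;
    tlb->stale = false;
}

static void model_free_pages(struct page_model *pages, size_t n,
                             struct tlb_model *tlb)
{
    for (size_t i = 0; i < n; i++)
        pages[i].freed = true;
    if (tlb->stale)
        tlb->freed_while_stale = true;
}

/* Path A (assumed): separate permission calls that each flush,
 * followed by an unmap whose flush is deferred. */
static void model_free_per_call_flush(struct page_model *pages, size_t n,
                                      struct tlb_model *tlb)
{
    for (size_t i = 0; i < n; i++)
        pages[i].executable = false;
    model_flush(tlb);                 /* flush owned by the NX change */

    for (size_t i = 0; i < n; i++)
        pages[i].writable = true;
    model_flush(tlb);                 /* flush owned by the RW change */

    for (size_t i = 0; i < n; i++)
        pages[i].vmap_mapped = false;
    tlb->stale = true;                /* vmap flush is lazy in this model */
    model_free_pages(pages, n, tlb);  /* pages released before that flush */
}

/* Path B (the suggestion): unmap, reset permissions without flushing,
 * flush once for everything, and only then free the pages. */
static void model_free_single_flush(struct page_model *pages, size_t n,
                                    struct tlb_model *tlb)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].vmap_mapped = false;
        pages[i].writable = true;     /* default direct-map permissions */
        pages[i].executable = false;
    }
    tlb->stale = true;
    model_flush(tlb);                 /* the single flush */
    assert(!tlb->stale);              /* nothing pending when pages go back */
    model_free_pages(pages, n, tlb);
}
```

In this model, path A issues two flushes and still frees the pages while an unmap is unflushed, whereas path B issues one flush and frees only afterwards. The per-page flush refinement mentioned at the end is not modelled here.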