Subject: Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages
From: Nadav Amit
Date: Tue, 4 Dec 2018 11:44:58 -0800
Cc: Rick Edgecombe, Andrew Morton, Will Deacon, Linux-MM, LKML,
    Kernel Hardening, "Naveen N . Rao", Anil S Keshavamurthy, "David S.
    Miller", Masami Hiramatsu, Steven Rostedt, Ingo Molnar,
    Alexei Starovoitov, Daniel Borkmann, jeyu@kernel.org,
    Network Development, Ard Biesheuvel, Jann Horn,
    Kristen Carlson Accardi, Dave Hansen, "Dock, Deneen T", Peter Zijlstra
Message-Id: <08141F66-F3E6-4CC5-AF91-1ED5F101A54C@gmail.com>
References: <20181128000754.18056-1-rick.p.edgecombe@intel.com>
    <20181128000754.18056-2-rick.p.edgecombe@intel.com>
    <4883FED1-D0EC-41B0-A90F-1A697756D41D@gmail.com>
To: Andy Lutomirski

> On Dec 4, 2018, at 10:56 AM, Andy Lutomirski wrote:
>
> On Mon, Dec 3, 2018 at 5:43 PM Nadav Amit wrote:
>>> On Nov 27, 2018, at 4:07 PM, Rick Edgecombe wrote:
>>>
>>> Since vfree will lazily flush the TLB, but not lazily free the underlying pages,
>>> it often leaves stale TLB entries to freed pages that could get re-used. This is
>>> undesirable for cases where the memory being freed has special permissions such
>>> as executable.
>>
>> So I am trying to finish my patch-set for preventing transient W+X mappings
>> from taking space, by handling kprobes & ftrace that I missed (thanks again for
>> pointing it out).
>>
>> But all of a sudden, I don’t understand why we have the problem that this
>> (your) patch-set deals with at all. We already change the mappings to make
>> the memory writable before freeing the memory, so why can’t we make it
>> non-executable at the same time? Actually, why do we make the module memory,
>> including its data, executable before freeing it???
>
> All the code you're looking at is IMO a very awkward and possibly
> incorrect way of doing what's actually necessary: putting the direct map
> the way it wants to be.
>
> Can't we shove this entire mess into vunmap? Have a flag (as part
> of vmalloc like in Rick's patch or as a flag passed to a vfree variant
> directly) that makes the vunmap code that frees the underlying pages
> also reset their permissions?
>
> Right now, we muck with set_memory_rw() and set_memory_nx(), which
> both have very awkward (and inconsistent with each other!) semantics
> when called on vmalloc memory. And they have their own flushes, which
> is inefficient. Maybe the right solution is for vunmap to remove the
> vmap area PTEs, call into a function like set_memory_rw() that resets
> the direct maps to their default permissions *without* flushing, and
> then to do a single flush for everything. Or, even better, to cause
> the change_page_attr code to do the flush and also to flush the vmap
> area all at once so that very small free operations can flush single
> pages instead of flushing globally.

Thanks for the explanation. I read it just after I realized that indeed
the whole purpose of this code is to get cpa_process_alias() to update
the corresponding direct mapping.

This thing (pageattr.c) indeed seems over-engineered and very
unintuitive. Right now I have a list of patch-sets that I owe, so I
don’t have the time to deal with it.

But, I still think that disable_ro_nx() should not call set_memory_x().
IIUC, this breaks W+X of the direct-mapping that corresponds to the
module memory. Does it ever stop being W+X?? I’ll have another look.
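
For readers following the thread, here is a minimal user-space sketch of the flush-batching idea quoted above. It is a toy model, not kernel code: every name in it (struct model_mm, model_set_rw(), model_free_batched() and so on) is hypothetical and only loosely mirrors set_memory_rw(), set_memory_nx() and vunmap. The assumption that each unbatched step issues exactly one flush is a simplification made for illustration, and the flush counts are properties of the model, not measurements.

```c
/* Toy user-space model only: none of these names are kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPAGES 8

struct model_page {
    bool vmap_mapped;   /* PTE present in the modelled vmalloc area */
    bool direct_write;  /* direct-map alias is writable */
    bool direct_exec;   /* direct-map alias is executable */
};

struct model_mm {
    struct model_page pages[MODEL_NPAGES];
    unsigned int flushes;   /* TLB flushes issued so far */
};

/* Model "module text": mapped, direct alias read-only and executable. */
static void model_map_exec(struct model_mm *mm, size_t n)
{
    assert(n <= MODEL_NPAGES);
    mm->flushes = 0;
    for (size_t i = 0; i < n; i++)
        mm->pages[i] = (struct model_page){ true, false, true };
}

/* Make the alias writable; 'flush' says whether this step flushes itself. */
static void model_set_rw(struct model_mm *mm, size_t n, bool flush)
{
    for (size_t i = 0; i < n; i++)
        mm->pages[i].direct_write = true;
    if (flush)
        mm->flushes++;
}

/* Make the alias non-executable, optionally flushing. */
static void model_set_nx(struct model_mm *mm, size_t n, bool flush)
{
    for (size_t i = 0; i < n; i++)
        mm->pages[i].direct_exec = false;
    if (flush)
        mm->flushes++;
}

/* Remove the modelled vmap PTEs, optionally flushing. */
static void model_unmap(struct model_mm *mm, size_t n, bool flush)
{
    for (size_t i = 0; i < n; i++)
        mm->pages[i].vmap_mapped = false;
    if (flush)
        mm->flushes++;
}

/* Every step flushes on its own: three flushes in this model. */
static void model_free_unbatched(struct model_mm *mm, size_t n)
{
    model_set_rw(mm, n, true);
    model_set_nx(mm, n, true);
    model_unmap(mm, n, true);
}

/* Drop the PTEs, reset the alias permissions, then flush once. */
static void model_free_batched(struct model_mm *mm, size_t n)
{
    model_unmap(mm, n, false);
    model_set_rw(mm, n, false);
    model_set_nx(mm, n, false);
    mm->flushes++;
}
```

Both paths leave the modelled pages unmapped with a writable, non-executable alias; they differ only in how many flushes the model counts along the way, which is the point of the "single flush for everything" suggestion.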