References: <1529532570-21765-1-git-send-email-rick.p.edgecombe@intel.com>
From: Jann Horn
Date: Thu, 21 Jun 2018 15:37:33 +0200
Subject: Re: [PATCH 0/3] KASLR feature to randomize each loadable module
To: Kees Cook, rick.p.edgecombe@intel.com
Cc: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
    "the arch/x86 maintainers", kernel list, Linux-MM, Kernel Hardening,
    kristen.c.accardi@intel.com, Dave Hansen, arjan.van.de.ven@intel.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 21, 2018 at 12:34 AM Kees Cook wrote:
>
> On Wed, Jun 20, 2018 at 3:09 PM, Rick Edgecombe wrote:
> > This patch changes the module loading KASLR algorithm to randomize the position
> > of each module text section allocation with at least 18 bits of entropy in the
> > typical case. It used on x86_64 only for now.
>
> Very cool! Thanks for sending the series. :)
>
> > Today the RANDOMIZE_BASE feature randomizes the base address where the module
> > allocations begin with 10 bits of entropy. From here, a highly deterministic
> > algorithm allocates space for the modules as they are loaded and un-loaded.
> > If an attacker can predict the order and identities for modules that will
> > be loaded, then a single text address leak can give the attacker access to
> > the
>
> nit: "text address" -> "module text address"
>
> > So the defensive strength of this algorithm in typical usage (<800 modules) for
> > x86_64 should be at least 18 bits, even if an address from the random area
> > leaks.
>
> And most systems have <200 modules, really. I have 113 on a desktop
> right now, 63 on a server. So this looks like a trivial win.

But note that the eBPF JIT also uses module_alloc(). Every time a BPF
program (this includes seccomp filters!) is JIT-compiled by the
kernel, another module_alloc() allocation is made. For example, on my
desktop machine, I have a bunch of seccomp-sandboxed processes thanks
to Chrome. If I enable the net.core.bpf_jit_enable sysctl and open a
few Chrome tabs, BPF JIT allocations start showing up between
modules:

# grep -C1 bpf_jit_binary_alloc /proc/vmallocinfo | cut -d' ' -f 2-
20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
--
20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
36864 load_module+0x1326/0x2ab0 pages=8 vmalloc N0=8
--
20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
40960 load_module+0x1326/0x2ab0 pages=9 vmalloc N0=9
--
20480 load_module+0x1326/0x2ab0 pages=4 vmalloc N0=4
12288 bpf_jit_binary_alloc+0x32/0x90 pages=2 vmalloc N0=2
253952 load_module+0x1326/0x2ab0 pages=61 vmalloc N0=61

If you use Chrome with Site Isolation, you have a few dozen open tabs,
and the BPF JIT is enabled, reaching a few hundred allocations might
not be that hard.

Also: What's the impact on memory usage? Is this going to increase the
number of pagetables that need to be allocated by the kernel per
module_alloc() by 4K or 8K or so?
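
To see how many allocations of each kind a particular machine really
has, the /proc/vmallocinfo lines shown above can be tallied per caller.
The following is only a rough sketch: the function name
count_module_space_users and the CALLERS tuple are made up for
illustration, and the line format is assumed to match the grep output
above (a decimal size followed by caller+offset/length).

```python
import re
from collections import Counter

# Caller symbols of interest, taken from the grep output quoted above.
CALLERS = ("load_module", "bpf_jit_binary_alloc")

# A decimal size, whitespace, then "symbol+0xOFFSET/0xLENGTH".
LINE_RE = re.compile(r"(\d+)\s+(\w+)\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")


def count_module_space_users(vmallocinfo_text):
    """Return (counts, sizes): areas per caller and their summed sizes in bytes.

    Works on raw /proc/vmallocinfo lines as well as on lines whose leading
    address range was stripped with `cut -d' ' -f 2-`. Lines that do not
    match (such as the "--" separators printed by grep -C1) are skipped.
    """
    counts = Counter()
    sizes = Counter()
    for line in vmallocinfo_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        size, caller = int(match.group(1)), match.group(2)
        if caller in CALLERS:
            counts[caller] += 1
            sizes[caller] += size
    return counts, sizes
```

Passing it the full contents of /proc/vmallocinfo (reading that file
typically needs root, as the "#" prompt above suggests) gives one count
for load_module and one for bpf_jit_binary_alloc, which can then be
compared against the module counts discussed in this thread.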

> > As for fragmentation, this algorithm reduces the average number of modules that
> > can be loaded without an allocation failure by about 6% (~17000 to ~16000)
> > (p<0.05). It can also reduce the largest module executable section that can be
> > loaded by half to ~500MB in the worst case.
>
> Given that we only have 8312 tristate Kconfig items, I think 16000
> will remain just fine. And even large modules (i915) are under 2MB...
>
> > The new __vmalloc_node_try_addr function uses the existing function
> > __vmalloc_node_range, in order to introduce this algorithm with the least
> > invasive change. The side effect is that each time there is a collision when
> > trying to allocate in the random area a TLB flush will be triggered. There is
> > a more complex, more efficient implementation that can be used instead if
> > there is interest in improving performance.
>
> The only time when module loading speed is noticeable, I would think,
> would be boot time. Have you done any boot time delta analysis? I
> wouldn't expect it to change hardly at all, but it's probably a good
> idea to actually test it. :)

If you have a forking server that applies seccomp filters on each
fork, or something like that, you might care about those TLB flushes.

> Also: can this be generalized for use on other KASLRed architectures?
> For example, I know the arm64 module randomization is pretty similar
> to x86.
>
> -Kees
>
> --
> Kees Cook
> Pixel Security