Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp565066ybl; Fri, 31 Jan 2020 04:06:54 -0800 (PST) X-Google-Smtp-Source: APXvYqzhoWHN8vkYtfF3IzDxI0cz88wjEFGUzzyFFSPYTELqxkJVAzv5pnW8OCFp2B7XuWJffKTc X-Received: by 2002:a05:6830:110a:: with SMTP id w10mr7494827otq.300.1580472413913; Fri, 31 Jan 2020 04:06:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580472413; cv=none; d=google.com; s=arc-20160816; b=SAPDXqlSIz6emmUheTh6xgo/hQ8cX2dYzcYVno4cQOZGrOFRBBG06iojSnJMfP+DFz BRvgGbRwV5ct7CxCUfIlKfvK7B9txXTdU2sOtLB9e+zehQnONNV2Wu1SlaCWgLVUSORo Dy9bpGVEcXRUIqpv8+txhREcscb7siFDxU8149PPQOW1gTyDkzlR3VIEKeYAADOH6ZVd rT71CTdPBa2eBy4qLTuYlSeg1Rmmm8aA86E88Z9CgG1sAK+EA6/OfdyyBx9tXD7q+nGX pisfxw/H8zKUJEB9NRwhXnuMKeKSQdof9qS7nihWZ5sS3D/HaME1PXwtRUNQZcAf+CT3 eVaw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=bPZYbEHgT8QqWy65RuP2LwO/ekDNyecnlbDtriR+rAY=; b=KsvvBWOYoSZ8DJ1s4FMW+QwEmBT/4je5nMDRF7wUMQRBTwknowZLRechsYo+dPnfwg 5h3BHnD2zRtxp5kHplkMfHqljhP6fepMDnZ8NMKY184CLLYrP7cDMZGBeNRqU4FbfDeQ hEBKj5XEfFHZormdvnzK/xScj2XLqccivYO69vXQLtAtQdXYkuzx9gCbxVh+hbkc3eqz oR5wcPBeba4KseikwKk1zTH7oGCTcPV3SSmlQbsVnh1zLgXZJJA3CndIJfvlpRjjtAui ewWGdwDlEBcuAK0oVQoUQ2noryPgtFg3LAMboCc6FhVMXQSWsvab4cHfkdCbLHwcYhWg sV4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=InWWajXp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w24si4019501oic.260.2020.01.31.04.06.41; Fri, 31 Jan 2020 04:06:53 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=InWWajXp; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728588AbgAaMEI (ORCPT + 99 others); Fri, 31 Jan 2020 07:04:08 -0500 Received: from mail-ot1-f67.google.com ([209.85.210.67]:45654 "EHLO mail-ot1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728554AbgAaMEI (ORCPT ); Fri, 31 Jan 2020 07:04:08 -0500 Received: by mail-ot1-f67.google.com with SMTP id 59so6278853otp.12 for ; Fri, 31 Jan 2020 04:04:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bPZYbEHgT8QqWy65RuP2LwO/ekDNyecnlbDtriR+rAY=; b=InWWajXpKjo8nIUsFLbd3+gXkRiyQwKrsN3SuyxJtIAV+5mvx5jWLI+32kU4GxZqH0 RL3AB4Duh8Nu54vbn6Muxq8c7ddYI2vv1j7nsgV4hvKBOc2ykYv8yJY6xRCulQ68HJ9n iMi+kRfnObu5vSYzqNBr6p9OyWwUMYtfZw8chuNpE5PrP9ihITkeTb6GRpUv11oX15Gx hKLXcS74O2uqJiYEjhy6o05QCFBK6KK6y1VA4haWcWqrRWsb+K6OeBV3e3AiqLuW6rBp N4hqHrI7+oEEVCExwmrVqpD/5Cb0z1bSEqvHGmen9xNTQ9Sn5kSIhznFRsrTIHBgnN/y BP4Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bPZYbEHgT8QqWy65RuP2LwO/ekDNyecnlbDtriR+rAY=; b=D1l+0EHC2YPRr8WCwRHHKJVyQbCNOQ2xyDyXR4Co1X2QVB9MfU6zpmREepjz8UWylP 6DEyYxv3kVty6XdJzVANMLwpM7yApT/iuIrNQZ+yHq4LTV3jJSIM/VnoEXn6d2z3SGr+ YYildMajliZi6NcDPRKMHAZ7CtlwmKJ9SaF2+UO1ht2h+AcNOJZI9mbXt6Tyx3z20+NK TCxo23IMC7G35XgQ09hs+QozK/N5v3/KfkaPJjUXRP5HT4ffhoro1XKtCaQO4PBDiMb5 i2vX1tUJlY74SX5BRRGoSsUTQ4kjt6+VtOZy3UJ6gXp0EnkaGe9v8HZp6e1zC2BH4cIP hMCg== X-Gm-Message-State: APjAAAV81mWvjBvUgHbzu+14cKuiBodNFJLj1o4lWqC8K61wGF8SqF68 I1S4rban5aO6PTZBerKcEIueRTl4On+4b5tb8CdZ0A== X-Received: by 2002:a05:6830:22cc:: with SMTP id q12mr7537510otc.110.1580472246617; Fri, 31 Jan 2020 04:04:06 -0800 (PST) MIME-Version: 1.0 References: <201911121313.1097D6EE@keescook> <201911141327.4DE6510@keescook> <202001271519.AA6ADEACF0@keescook> <5861936c-1fe1-4c44-d012-26efa0c8b6e7@de.ibm.com> <202001281457.FA11CC313A@keescook> <6844ea47-8e0e-4fb7-d86f-68046995a749@de.ibm.com> <20200129170939.GA4277@infradead.org> <771c5511-c5ab-3dd1-d938-5dbc40396daa@de.ibm.com> <202001300945.7D465B5F5@keescook> In-Reply-To: <202001300945.7D465B5F5@keescook> From: Jann Horn Date: Fri, 31 Jan 2020 13:03:40 +0100 Message-ID: Subject: Re: [kernel-hardening] [PATCH 09/38] usercopy: Mark kmalloc caches as usercopy caches To: Kees Cook Cc: Christian Borntraeger , Christoph Hellwig , Christopher Lameter , Jiri Slaby , Julian Wiedmann , Ursula Braun , Alexander Viro , kernel list , David Windsor , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , Linux-MM , linux-xfs@vger.kernel.org, Linus Torvalds , Andy Lutomirski , "David S. Miller" , Laura Abbott , Mark Rutland , "Martin K. Petersen" , Paolo Bonzini , Christoffer Dall , Dave Kleikamp , Jan Kara , Luis de Bethencourt , Marc Zyngier , Rik van Riel , Matthew Garrett , linux-fsdevel , linux-arch , Network Development , Kernel Hardening , Vlastimil Babka , Michal Kubecek Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 30, 2020 at 8:23 PM Kees Cook wrote: > On Wed, Jan 29, 2020 at 06:19:56PM +0100, Christian Borntraeger wrote: > > On 29.01.20 18:09, Christoph Hellwig wrote: > > > On Wed, Jan 29, 2020 at 06:07:14PM +0100, Christian Borntraeger wrote: > > >>> DMA can be done to NORMAL memory as well. > > >> > > >> Exactly. > > >> I think iucv uses GFP_DMA because z/VM needs those buffers to reside below 2GB (which is ZONA_DMA for s390). > > > > > > The normal way to allocate memory with addressing limits would be to > > > use dma_alloc_coherent and friends. Any chance to switch iucv over to > > > that? Or is there no device associated with it? > > > > There is not necessarily a device for that. It is a hypervisor interface (an > > instruction that is interpreted by z/VM). We do have the netiucv driver that > > creates a virtual nic, but there is also AF_IUCV which works without a device. > > > > But back to the original question: If we mark kmalloc caches as usercopy caches, > > we should do the same for DMA kmalloc caches. As outlined by Christoph, this has > > nothing to do with device DMA. > > Hm, looks like it's allocated from the low 16MB. Seems like poor naming! The physical address limit of the DMA zone depends on the architecture (and the kernel version); e.g. on Linux 4.4 on arm64 (which is used on the Pixel 2), the DMA zone goes up to 4GiB. Later, arm64 started using the DMA32 zone for that instead (as was already the case on e.g. X86-64); but recently (commit 1a8e1cef7603), arm64 started using the DMA zone again, but now for up to 1GiB. (AFAICS the DMA32 zone can't be used with kmalloc at all, that only works with the DMA zone.) > :) There seems to be a LOT of stuff using GFP_DMA, and it seems unlikely > those are all expecting low addresses? I think there's a bunch of (especially really old) hardware where the hardware can only talk to low physical addresses, e.g. stuff that uses the ISA bus. However, there aren't *that* many users of GFP_DMA that actually cause kmalloc allocations with GFP_DMA - many of the users of GFP_DMA actually just pass that flag to dma_alloc_coherent()/dma_pool_alloc(), where it is filtered away and the allocation ultimately doesn't go through the slab allocator AFAICS, or they pass it directly to the page allocator, where it has no effect on the usercopy stuff. Looking on my workstation, there are zero objects allocated in dma-kmalloc-* slabs: /sys/kernel/slab# for name in dma-kmalloc-*; do echo "$name: $(cat $name/objects)"; done dma-kmalloc-128: 0 dma-kmalloc-16: 0 dma-kmalloc-192: 0 dma-kmalloc-1k: 0 dma-kmalloc-256: 0 dma-kmalloc-2k: 0 dma-kmalloc-32: 0 dma-kmalloc-4k: 0 dma-kmalloc-512: 0 dma-kmalloc-64: 0 dma-kmalloc-8: 0 dma-kmalloc-8k: 0 dma-kmalloc-96: 0 On a Pixel 2, there are a whole five objects allocated across the DMA zone kmalloc caches: walleye:/sys/kernel/slab # for name in dma-kmalloc-*; do echo "$name: $(cat $name/objects)"; done dma-kmalloc-1024: 0 dma-kmalloc-128: 0 dma-kmalloc-2048: 2 dma-kmalloc-256: 0 dma-kmalloc-4096: 3 dma-kmalloc-512: 0 dma-kmalloc-8192: 0 > Since this has only been a problem on s390, should just s390 gain the > weakening of the usercopy restriction? Something like: > > > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 1907cb2903c7..c5bbc141f20b 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -1303,7 +1303,9 @@ void __init create_kmalloc_caches(slab_flags_t flags) > kmalloc_caches[KMALLOC_DMA][i] = create_kmalloc_cache( > kmalloc_info[i].name[KMALLOC_DMA], > kmalloc_info[i].size, > - SLAB_CACHE_DMA | flags, 0, 0); > + SLAB_CACHE_DMA | flags, 0, > + IS_ENABLED(CONFIG_S390) ? > + kmalloc_info[i].size : 0); > } > } > #endif I think dma-kmalloc slabs should be handled the same way as normal kmalloc slabs. When a dma-kmalloc allocation is freshly created, it is just normal kernel memory - even if it might later be used for DMA -, and it should be perfectly fine to copy_from_user() into such allocations at that point, and to copy_to_user() out of them at the end. If you look at the places where such allocations are created, you can see things like kmemdup(), memcpy() and so on - all normal operations that shouldn't conceptually be different from usercopy in any relevant way.