Date: Fri, 5 Jun 2020 11:46:36 -0700
From: Kees Cook
To: Vlastimil Babka
Cc: Vegard Nossum, "Rafael J. Wysocki", Robert Moore, Erik Kaneda,
	"Rafael J. Wysocki", Christoph Lameter, Andrew Morton, Marco Elver,
	Waiman Long, LKML, Linux MM, ACPI Devel Maling List, Len Brown,
	Steven Rostedt, Roman Gushchin
Subject: Re: slub freelist issue / BUG: unable to handle page fault for address: 000000003ffe0018
Message-ID: <202006051053.A61A42374C@keescook>
References: <202006041054.874AA564@keescook>
 <34455dce-6675-1fc2-8d61-45bf56f3f554@suse.cz>
 <6b2b149e-c2bc-f87a-ea2c-3046c5e39bf9@oracle.com>
 <894e8cee-33df-1f63-fb12-72dceb024ea7@oracle.com>
 <202006050828.F85A75D13@keescook>
 <92d994be-e4f5-b186-4ad7-21828de44967@suse.cz>
In-Reply-To: <92d994be-e4f5-b186-4ad7-21828de44967@suse.cz>

On Fri, Jun 05, 2020 at 06:55:27PM +0200, Vlastimil Babka wrote:
>
> On 6/5/20 5:44 PM, Kees Cook wrote:
> > On Fri, Jun 05, 2020 at 04:44:51PM +0200, Vegard Nossum wrote:
> >> On 2020-06-05 16:08, Vlastimil Babka wrote:
> >> > On 6/5/20 3:12 PM, Rafael J. Wysocki wrote:
> >> > > On Fri, Jun 5, 2020 at 2:48 PM Vegard Nossum wrote:
> >> > > >
> >> > > > On 2020-06-05 11:36, Vegard Nossum wrote:
> >> > > > >
> >> > > > > On 2020-06-05 11:11, Vlastimil Babka wrote:
> >> > > > > > So, with Kees' patch reverted, booting with slub_debug=F (or even more
> >> > > > > > specific slub_debug=F,ftrace_event_field) also hits this bug below. I
> >> > > > > > wanted to bisect it, but v5.7 was also bad, and also v5.6. Didn't try
> >> > > > > > further in history. So it's not new at all, and likely very specific to
> >> > > > > > your config+QEMU? (and related to the ACPI error messages that precede
> >> > > > > > it?).
> >> > [...]
> >> > [    0.140408] ------------[ cut here ]------------
> >> > [    0.140837] cache_from_obj: Wrong slab cache. Acpi-Namespace but object is from kmalloc-64
> >> > [    0.141406] WARNING: CPU: 0 PID: 1 at mm/slab.h:524 kmem_cache_free+0x1d3/0x250
> >
> > Ah yes! Good. I had improved this check recently too, and I was worried
> > the freelist pointer patch was somehow blocking it, but I see now that
> > the failing config didn't have CONFIG_SLAB_FREELIST_HARDENED=y. Once
> > SLAB_CONSISTENCY_CHECKS was enabled ("slub_debug=F"), it started
> > tripping. Whew.
> >
> > I wonder if that entire test block should just be removed from
> > cache_from_obj():
> >
> > 	if (!memcg_kmem_enabled() &&
> > 	    !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
> > 	    !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
> > 		return s;
> >
> > and make this test unconditional? It's mostly only called during free(),
> > and shouldn't be too expensive to be made unconditional. Hmm.
>
> Hmm I have a different idea. The whole cache_from_obj() was added because of
> kmemcg (commit b9ce5ef49f00d) where per-memcg cache can be different from the
> root one. And I just realized this usecase can go away with Roman's series [1].
> But cache_from_obj() also kept the original SLUB consistency check case, and you
> added the freelist hardening case. If kmemcg use case went away it would be nice
> to avoid the virt_to_cache() and check completely again, unless in debugging or
> hardened kernel.

Is it that expensive? (I'm fine with it staying behind debug/hardening,
but if we can make it on by default, that'd be safer.)

> Furthermore, the original SLUB debugging case was an unconditional pr_err() plus
> WARN_ON_ONCE(1), which was kept by commit b9ce5ef49f00d.
> With freelist hardening this all changed to WARN_ONCE. So the second and later
> cases are not reported at all for hardening and also not for explicitly enabled
> debugging like in this case, which is IMHO not ideal.

Oh, I have no problem with WARN vs WARN_ONCE -- there's no reason to
split this. And I'd love the hardening side to gain the tracking call
too, if it's available. I had just used WARN_ONCE() since sometimes it
can be very noisy to keep warning for some condition that might not be
correctable.

> So I propose the following - the freelist hardening case keeps the WARN_ONCE,
> but also a one-line pr_err() for each case so they are not silent. The SLUB
> debugging case is always a full warning, and printing the tracking info if
> enabled and available. Pure kmemcg case does virt_to_cache() for now (until
> hopefully removed by Roman's series) but no checking at all. Would that work for
> everyone?
> [...]
> @@ -520,9 +528,18 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
>  		return s;
>
>  	cachep = virt_to_cache(x);
> -	WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
> -		  "%s: Wrong slab cache. %s but object is from %s\n",
> -		  __func__, s->name, cachep->name);
> +	if (unlikely(s->flags & SLAB_CONSISTENCY_CHECKS)) {
> +		if (WARN(cachep && !slab_equal_or_root(cachep, s),
> +			 "%s: Wrong slab cache. %s but object is from %s\n",
> +			 __func__, s->name, cachep->name))
> +			slab_print_tracking(cachep, x);
> +	} else if (IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED)) {
> +		if (unlikely(cachep && !slab_equal_or_root(cachep, s))) {
> +			pr_err("%s: Wrong slab cache. %s but object is from %s\n",
> +			       __func__, s->name, cachep->name);
> +			WARN_ON_ONCE(1);
> +		}
> +	}

How about just this (in addition to your slab_print_tracking() refactor):

diff --git a/mm/slab.h b/mm/slab.h
index 207c83ef6e06..107b7f6db3c3 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -520,9 +520,10 @@ static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
 		return s;
 
 	cachep = virt_to_cache(x);
-	WARN_ONCE(cachep && !slab_equal_or_root(cachep, s),
+	if (WARN(cachep && !slab_equal_or_root(cachep, s),
 		  "%s: Wrong slab cache. %s but object is from %s\n",
-		  __func__, s->name, cachep->name);
+		  __func__, s->name, cachep->name))
+		slab_print_tracking(cachep, x);
 
 	return cachep;
 }

-- 
Kees Cook
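
For reference, cache_from_obj() in mm/slab.h would read roughly as below with
the diff above applied. This is only an illustrative sketch reconstructed from
the pieces quoted in this thread, assuming the slab_print_tracking() helper
from the refactor discussed above exists; the early-return guard quoted
earlier in the thread is left untouched by this diff:

	/*
	 * Sketch: cache_from_obj() after switching WARN_ONCE() to WARN() and
	 * adding the tracking dump. slab_print_tracking() is an assumed
	 * helper from the proposed refactor, not an existing upstream API.
	 */
	static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void *x)
	{
		struct kmem_cache *cachep;

		/* Skip the check unless kmemcg, hardening, or consistency checks apply. */
		if (!memcg_kmem_enabled() &&
		    !IS_ENABLED(CONFIG_SLAB_FREELIST_HARDENED) &&
		    !unlikely(s->flags & SLAB_CONSISTENCY_CHECKS))
			return s;

		/* Look up the cache that actually owns the object... */
		cachep = virt_to_cache(x);

		/* ...and warn (every time, not just once) if it is the wrong one. */
		if (WARN(cachep && !slab_equal_or_root(cachep, s),
			 "%s: Wrong slab cache. %s but object is from %s\n",
			 __func__, s->name, cachep->name))
			slab_print_tracking(cachep, x);

		return cachep;
	}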