Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp10094319ybi; Wed, 10 Jul 2019 23:24:36 -0700 (PDT) X-Google-Smtp-Source: APXvYqyCqMGSJhCftTp4zzDdlqL63upYxD9z0+30SaXaPKhrucLd/PAKpU0WxsqwqD7Mta92tKpO X-Received: by 2002:a17:902:8f81:: with SMTP id z1mr2681265plo.290.1562826276000; Wed, 10 Jul 2019 23:24:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1562826275; cv=none; d=google.com; s=arc-20160816; b=GDfTCvrgZMV6MvF8Gm61AY5+JqefjnhCEzc4RcOAt6tGqXtoB/bWbd496S0QQMC8JT UY5aTwvZinPBtD2MvsTZv5W+JMqqryHPcGCODZawkVxeThak2UGZZfwsNgpRup+Lqsbg XnA+U9DiPpEAoMo9/SZyQ3Q/klsqTvs/on58uvvWV6EJq//Fs1etP/OB4/h4e0S5cEig d9QmiyBmLhOUwnsTiLeMVG0Jv2HuummiQZJKLOjGDsr2gnQYN9RXJQNDoOHjmxEzT4zN dh1gtui9bx3hYVf26GhtvxQwmzBGRL44NKDuH/NdzEkW2d/l67VeaxQ2BKrxKNTReqCu phrw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:date:subject:user-agent:message-id :references:cc:in-reply-to:from:to:content-transfer-encoding :mime-version; bh=DrVcdla93H72oIEgs5iw0VCOqAbIhy+SkVXPV0WQpxk=; b=g8BHZcoLq4U3t93wKGBAu+yj8KCb47EkxPUx3sLTjNJwev1HmoCZjcmlb88Pu6cXeq cDQJdeQKyAqzPEQxTTqdfiXJTA2df1RPOrWh/FmtuQ35M1YEoVU18oWdI/ghwsyUyG4/ nnMoxmuBztD/pYI2SQI2auF4JB8l9oihzYeAihN/O7QuHQqzCX/lvV4MPuVTHMe+/F16 GPu6w03+IcuCYe7jyi2N24MRJ4pPI1uvR+egAOOPz1qDR1wAcIwd8wb6eLPitD860p9n ebwQIMkqGde/89ziY06PptpaNYCUf8zWFfW8Ig5Ay/a4wWc2fviFPvq1w6GFFMTJ2gGm rEQw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d5si3876422pla.17.2019.07.10.23.24.18; Wed, 10 Jul 2019 23:24:35 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727831AbfGKGSA convert rfc822-to-8bit (ORCPT + 99 others); Thu, 11 Jul 2019 02:18:00 -0400 Received: from mail.fireflyinternet.com ([109.228.58.192]:53143 "EHLO fireflyinternet.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1725963AbfGKGSA (ORCPT ); Thu, 11 Jul 2019 02:18:00 -0400 X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from localhost (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP (TLS) id 17214061-1500050 for multiple; Thu, 11 Jul 2019 07:17:55 +0100 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8BIT To: Steven Rostedt , Tejun Heo From: Chris Wilson In-Reply-To: <20190710225720.58246f8e@oasis.local.home> Cc: Christoph Lameter , Pekka Enberg , David Rientjes , Joonsoo Kim , Andrew Morton , linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20190614093914.58f41d8f@gandalf.local.home> <156052491337.7796.17642747687124632554@skylake-alporthouse-com> <20190614153837.GE538958@devbig004.ftw2.facebook.com> <20190710225720.58246f8e@oasis.local.home> Message-ID: <156282587317.12280.11217721447100506162@skylake-alporthouse-com> User-Agent: alot/0.6 Subject: Re: [BUG] lockdep splat with kernfs lockdep annotations and slab mutex from drm patch?? Date: Thu, 11 Jul 2019 07:17:53 +0100 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Quoting Steven Rostedt (2019-07-11 03:57:20) > On Fri, 14 Jun 2019 08:38:37 -0700 > Tejun Heo wrote: > > > Hello, > > > > On Fri, Jun 14, 2019 at 04:08:33PM +0100, Chris Wilson wrote: > > > #ifdef CONFIG_MEMCG > > > if (slab_state >= FULL && err >= 0 && is_root_cache(s)) { > > > struct kmem_cache *c; > > > > > > mutex_lock(&slab_mutex); > > > > > > so it happens to hit the error + FULL case with the additional slabcaches? > > > > > > Anyway, according to lockdep, it is dangerous to use the slab_mutex inside > > > slab_attr_store(). > > > > Didn't really look into the code but it looks like slab_mutex is held > > while trying to remove sysfs files. sysfs file removal flushes > > on-going accesses, so if a file operation then tries to grab a mutex > > which is held during removal, it leads to a deadlock. > > > > Looks like this never got fixed and now this bug is in 5.2. git blame gives commit 107dab5c92d5f9c3afe962036e47c207363255c7 Author: Glauber Costa Date: Tue Dec 18 14:23:05 2012 -0800 slub: slub-specific propagation changes for adding the mutex underneath sysfs read, and I think commit d50d82faa0c964e31f7a946ba8aba7c715ca7ab0 Author: Mikulas Patocka Date: Wed Jun 27 23:26:09 2018 -0700 slub: fix failure when we delete and create a slab cache added the sysfs removal underneath the slab_mutex. > Just got this: > > ====================================================== > WARNING: possible circular locking dependency detected > 5.2.0-test #15 Not tainted > ------------------------------------------------------ > slub_cpu_partia/899 is trying to acquire lock: > 000000000f6f2dd7 (slab_mutex){+.+.}, at: slab_attr_store+0x6d/0xe0 > > but task is already holding lock: > 00000000b23ffe3d (kn->count#160){++++}, at: kernfs_fop_write+0x125/0x230 > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #1 (kn->count#160){++++}: > __kernfs_remove+0x413/0x4a0 > kernfs_remove_by_name_ns+0x40/0x80 > sysfs_slab_add+0x1b5/0x2f0 > __kmem_cache_create+0x511/0x560 > create_cache+0xcd/0x1f0 > kmem_cache_create_usercopy+0x18a/0x240 > kmem_cache_create+0x12/0x20 > is_active_nid+0xdb/0x230 [snd_hda_codec_generic] > snd_hda_get_path_idx+0x55/0x80 [snd_hda_codec_generic] > get_nid_path+0xc/0x170 [snd_hda_codec_generic] > do_one_initcall+0xa2/0x394 > do_init_module+0xfd/0x370 > load_module+0x38c6/0x3bd0 > __do_sys_finit_module+0x11a/0x1b0 > do_syscall_64+0x68/0x250 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > -> #0 (slab_mutex){+.+.}: > lock_acquire+0xbd/0x1d0 > __mutex_lock+0xfc/0xb70 > slab_attr_store+0x6d/0xe0 > kernfs_fop_write+0x170/0x230 > vfs_write+0xe1/0x240 > ksys_write+0xba/0x150 > do_syscall_64+0x68/0x250 > entry_SYSCALL_64_after_hwframe+0x49/0xbe > > other info that might help us debug this: > > Possible unsafe locking scenario: > > CPU0 CPU1 > ---- ---- > lock(kn->count#160); > lock(slab_mutex); > lock(kn->count#160); > lock(slab_mutex); > > *** DEADLOCK *** > > > > Attached is a config and the full dmesg. > > -- Steve >