Date: Wed, 19 Jul 2023 14:40:46 -0400
From: Steven Rostedt
To: Ajay Kaher
Cc: "shuah@kernel.org", "mhiramat@kernel.org", Ching-lin Yu,
    "linux-kernel@vger.kernel.org", "linux-kselftest@vger.kernel.org",
    "linux-trace-kernel@vger.kernel.org", "lkp@intel.com", Nadav Amit,
    "oe-lkp@lists.linux.dev", Alexey Makhalov, "er.ajay.kaher@gmail.com",
    "srivatsa@csail.mit.edu", Tapas Kundu, Vasavi Sirnapalli
Subject: Re: [PATCH v4 00/10] tracing: introducing eventfs
Message-ID: <20230719144046.746af82e@gandalf.local.home>
In-Reply-To: <899D0823-A1B2-4A6F-A5BA-0D707F41C3D4@vmware.com>
References: <1689248004-8158-1-git-send-email-akaher@vmware.com>
    <20230714185824.62556254@gandalf.local.home>
    <883F9774-3E76-4346-9988-2788FAF0D55E@vmware.com>
    <20230718094005.32516161@gandalf.local.home>
    <2CD72098-08E2-4CAA-B74D-D8C44D318117@vmware.com>
    <20230719102310.552d3356@gandalf.local.home>
    <899D0823-A1B2-4A6F-A5BA-0D707F41C3D4@vmware.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 19 Jul 2023 18:37:12 +0000
Ajay Kaher wrote:

> > Here's the reproducer (of both v3 splat and the bug I'm hitting now).
> >
> > ~# echo 'p:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> > ~# ls /sys/kernel/debug/tracing/events/kprobes/sock_getattr/
> > ~# echo '-:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
>
> I tried above steps on v4 but couldn’t reproduce:
>
> root@photon-6 [ ~/sdb/linux ]# echo 'p:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> root@photon-6 [ ~/sdb/linux ]# ls /sys/kernel/debug/tracing/events/kprobes/sock_getattr/
> enable filter format id trigger
> root@photon-6 [ ~/sdb/linux ]# echo '-:sock_getattr 0xffffffff9b55cef0 sk=%di' > /sys/kernel/tracing/kprobe_events
> -bash: echo: write error: No such file or directory
>
> I have doubt on call_srcu(), it may first end the grace period for parent then for child. If this is true then free_list
> will have unordered list and could cause problem.

I modified the srcu portion a bit. Will post soon. I think I got something
working.

I'm having doubt that the dput()s were needed in the eventfs_remove_rec(),
as the d_invalidate() appears to be enough. I'm still testing.

> >
> >
> > v3 gives me (and my updates too)
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.5.0-rc1-test+ #576 Not tainted
> > ------------------------------------------------------
> > trace-cmd/840 is trying to acquire lock:
> > ffff8881007e5de0 (&sb->s_type->i_mutex_key#5){++++}-{3:3}, at: dcache_dir_open_wrapper+0xc1/0x1b0
> >
> > but task is already holding lock:
> > ffff888103ad7e70 (eventfs_rwsem/1){.+.+}-{3:3}, at: dcache_dir_open_wrapper+0x6f/0x1b0
> >
> > which lock already depends on the new lock.
> >
> >
> > the existing dependency chain (in reverse order) is:
> >
> > -> #1 (eventfs_rwsem/1){.+.+}-{3:3}:
> >        down_read_nested+0x41/0x180
> >        eventfs_root_lookup+0x42/0x120
> >        __lookup_slow+0xff/0x1b0
> >        walk_component+0xdb/0x150
> >        path_lookupat+0x67/0x1a0
> >        filename_lookup+0xe4/0x1f0
> >        vfs_statx+0x9e/0x180
> >        vfs_fstatat+0x51/0x70
> >        __do_sys_newfstatat+0x3f/0x80
> >        do_syscall_64+0x3a/0xc0
> >        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> >
> > -> #0 (&sb->s_type->i_mutex_key#5){++++}-{3:3}:
> >        __lock_acquire+0x165d/0x2390
> >        lock_acquire+0xd4/0x2d0
> >        down_write+0x3b/0xd0
> >        dcache_dir_open_wrapper+0xc1/0x1b0
> >        do_dentry_open+0x20c/0x510
> >        path_openat+0x7ad/0xc60
> >        do_filp_open+0xaf/0x160
> >        do_sys_openat2+0xab/0xe0
> >        __x64_sys_openat+0x6a/0xa0
> >        do_syscall_64+0x3a/0xc0
> >        entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> >
> > other info that might help us debug this:
> >
> >  Possible unsafe locking scenario:
> >
> >        CPU0                    CPU1
> >        ----                    ----
> >   rlock(eventfs_rwsem/1);
> >                                lock(&sb->s_type->i_mutex_key#5);
> >                                lock(eventfs_rwsem/1);
> >   lock(&sb->s_type->i_mutex_key#5);
> >
> >  *** DEADLOCK ***
> >
> > 1 lock held by trace-cmd/840:
> >  #0: ffff888103ad7e70 (eventfs_rwsem/1){.+.+}-{3:3}, at: dcache_dir_open_wrapper+0x6f/0x1b0
> >
> > stack backtrace:
> > CPU: 7 PID: 840 Comm: trace-cmd Not tainted 6.5.0-rc1-test+ #576
> > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > Call Trace:
> >
> >  dump_stack_lvl+0x57/0x90
> >  check_noncircular+0x14b/0x160
> >  __lock_acquire+0x165d/0x2390
> >  lock_acquire+0xd4/0x2d0
> >  ? dcache_dir_open_wrapper+0xc1/0x1b0
> >  down_write+0x3b/0xd0
> >  ? dcache_dir_open_wrapper+0xc1/0x1b0
> >  dcache_dir_open_wrapper+0xc1/0x1b0
> >  ? __pfx_dcache_dir_open_wrapper+0x10/0x10
> >  do_dentry_open+0x20c/0x510
> >  path_openat+0x7ad/0xc60
> >  do_filp_open+0xaf/0x160
> >  do_sys_openat2+0xab/0xe0
> >  __x64_sys_openat+0x6a/0xa0
> >  do_syscall_64+0x3a/0xc0
> >  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> > RIP: 0033:0x7f1743267e41
> > Code: 44 24 18 31 c0 41 83 e2 40 75 3e 89 f0 25 00 00 41 00 3d 00 00 41 00 74 30 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 3f 48 8b 54 24 18 64 48 2b 14 25 28 00 00 00
> > RSP: 002b:00007ffec10ff5d0 EFLAGS: 00000287 ORIG_RAX: 0000000000000101
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1743267e41
> > RDX: 0000000000090800 RSI: 00007ffec10ffdb0 RDI: 00000000ffffff9c
> > RBP: 00007ffec10ffda0 R08: 00007ffec11003e0 R09: 0000000000000040
> > R10: 0000000000000000 R11: 0000000000000287 R12: 00007ffec11003e0
> > R13: 0000000000000040 R14: 0000000000000000 R15: 00007ffec110034b
> >
> >
>
> This is expected from v3 (just ignore as of now), if eventfs_set_ef_status_free crash not
> reproduced on v3 then it’s v4 issue.

The issue comes from fixing the above ;-)

-- Steve
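
For anyone following the splat quoted above: it reports a lock-order
inversion. One path takes eventfs_rwsem while the directory inode lock is
already held (the lookup side, chain entry #1), and another takes the inode
lock while holding eventfs_rwsem (the open side, chain entry #0). Below is a
minimal sketch of that idea in Python. It is a toy model and not lockdep's
actual algorithm; the class name LockOrderGraph and the short lock names are
made up for illustration.

```python
from collections import defaultdict


class LockOrderGraph:
    """Toy lock-order tracker (hypothetical; not how lockdep is implemented)."""

    def __init__(self):
        # after[a] holds every lock that was acquired while 'a' was held.
        self.after = defaultdict(set)

    def _reachable(self, src, dst):
        # Depth-first search over the recorded "held -> acquired" edges.
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.after[node])
        return False

    def acquire(self, held, new):
        """Record 'new taken while held'; return True if this closes a cycle."""
        inversion = self._reachable(new, held)
        self.after[held].add(new)
        return inversion


graph = LockOrderGraph()
# Chain entry #1: lookup takes eventfs_rwsem under the directory inode lock.
first = graph.acquire(held="i_mutex", new="eventfs_rwsem")
# Chain entry #0: open takes the inode lock while holding eventfs_rwsem.
second = graph.acquire(held="eventfs_rwsem", new="i_mutex")
print(first, second)  # prints: False True
```

The second acquire is flagged because the reverse ordering was already
recorded, which is the same shape as the CPU0/CPU1 scenario in the report.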