Received: by 2002:a05:6358:d09b:b0:dc:cd0c:909e with SMTP id jc27csp1706874rwb; Mon, 7 Nov 2022 05:01:29 -0800 (PST) X-Google-Smtp-Source: AMsMyM65nLpvZC3w53+bu2XHjb3nm/ff7G3s8/Oz4B58FYgShSLyrJpnXFN3wSRi6pMWWWxxokra X-Received: by 2002:a17:906:d550:b0:7ad:d367:fabb with SMTP id cr16-20020a170906d55000b007add367fabbmr40085412ejc.285.1667826089309; Mon, 07 Nov 2022 05:01:29 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1667826089; cv=none; d=google.com; s=arc-20160816; b=u9ONc16JKZ6e5n198tQLj+aGadh/Dy69x5uJ+14RCQ0CJNcL0q94TNvutSW6IoPTZp vk5Uo1JUKSNfVbTaQGBYI+wH+gtHGxaSpM9qbHdK5RcJjAE8Osbqc42qN8yDs+G8+8Vw YxKeiRJA6CXfvNO4ZBaOTL38SRXOXm/l+YTddH97jTf0EIKw+98zJmiOp2EZISnqRvY7 ZTYp9Ipf0pzA8qxMIwKt+YCpc8UIozyIyMqpjWjJnAOUNPlZYKlT6FhLmhDRN1lB2VAQ sydUisZd/PKDpenaa3ZR18gZJezR+VzKbhorGrfJJF8gP/BR6heJMZT6k+Vc1pdbMMS+ dG0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature :dkim-signature; bh=oAFxQVQuSdUEmDiDci+wNMiu2DcH6WEZgRjsPrlLJSw=; b=tzHLDHHUmO8kOyWWyGRbKSSTSrIml5ns9F6fdGVGgScU/8qFsbiOqrp7XoWSAxpxMg e5Ife18cMfwwOpmXANfL+swKASJS4V4fGOm6FNdfWhVM3yqZHLorKdb9XIwrmazQdAyy bDjLw2xFKiZG2lmM++xD7dzKeNDO1rTt2mcT/w2h49wihcsasNflqrz9VMxGPZdr5Jom pivcTEMZQ0vo/4Vr0i+5/AJZga4TsVieXAfe4CJfQ0Pi3r9vNUi+ZvvOhM+IBWMcS4tv HTtF6tod8Aa/SMIh7wWz1/LKBBidq9tHjDZhZoOxNSrzSEoVqPw/JWLseruoSKtDGtM+ oYVQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=pr0F6EeP; dkim=neutral (no key) header.i=@suse.cz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bh9-20020a170906a0c900b0073dc32ccd9fsi7407958ejb.104.2022.11.07.05.01.05; Mon, 07 Nov 2022 05:01:29 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@suse.cz header.s=susede2_rsa header.b=pr0F6EeP; dkim=neutral (no key) header.i=@suse.cz; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231967AbiKGL4a (ORCPT + 93 others); Mon, 7 Nov 2022 06:56:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230434AbiKGL42 (ORCPT ); Mon, 7 Nov 2022 06:56:28 -0500 Received: from smtp-out2.suse.de (smtp-out2.suse.de [IPv6:2001:67c:2178:6::1d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AD5E919C35; Mon, 7 Nov 2022 03:56:26 -0800 (PST) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out2.suse.de (Postfix) with ESMTPS id 678461F88D; Mon, 7 Nov 2022 11:56:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1667822185; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=oAFxQVQuSdUEmDiDci+wNMiu2DcH6WEZgRjsPrlLJSw=; b=pr0F6EePjJI+cy5LxOWv2WRjIzFjJzyNNwZNeHvB9FBiSaQz+GBPbLAQsbEPBiXUJbObOt XkRaswjUiHOhyegqDVfJ4YUiyuHUI4SO/gKc5e5ZqKDP5RxflUE5rlOMZ7nPPhou03RVw/ jhClf63TLwj2pZoNWT7atRAu7iXW6ek= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1667822185; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=oAFxQVQuSdUEmDiDci+wNMiu2DcH6WEZgRjsPrlLJSw=; b=Xc2MiYNjI9Jnrd0K7ucS+ggZqTgBDBWfXnUZ61adq5qkdQT7XjOKuiNWogpvUz4P7CtK/z 2kRkW8rYTtaPd0Bg== Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 51B2213AC7; Mon, 7 Nov 2022 11:56:25 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id VHI3E2nyaGM+JAAAMHmgww (envelope-from ); Mon, 07 Nov 2022 11:56:25 +0000 Received: by quack3.suse.cz (Postfix, from userid 1000) id 7BE60A0704; Mon, 7 Nov 2022 12:56:24 +0100 (CET) Date: Mon, 7 Nov 2022 12:56:24 +0100 From: Jan Kara To: Stephen Brennan Cc: Jan Kara , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Amir Goldstein , Al Viro Subject: Re: [PATCH v3 0/3] fsnotify: fix softlockups iterating over d_subdirs Message-ID: <20221107115624.j2ilaotaigrqiheu@quack3> References: <20221021010310.29521-1-stephen.s.brennan@oracle.com> <20221028001016.332663-1-stephen.s.brennan@oracle.com> <20221101175144.yu3l5qo5gfwfpatt@quack3> <877d0eh03t.fsf@oracle.com> <20221102175224.iacden3lq4oksmof@quack3> <87y1sqfg75.fsf@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <87y1sqfg75.fsf@oracle.com> X-Spam-Status: No, score=-3.7 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_SOFTFAIL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri 04-11-22 16:33:18, Stephen Brennan wrote: > Jan Kara writes: > > On Tue 01-11-22 13:48:54, Stephen Brennan wrote: > >> Jan Kara writes: > >> > Hi Stephen! > >> > > >> > On Thu 27-10-22 17:10:13, Stephen Brennan wrote: > >> >> Here is v3 of the patch series. I've taken all of the feedback, > >> >> thanks Amir, Christian, Hilf, et al. Differences are noted in each > >> >> patch. > >> >> > >> >> I caught an obvious and silly dentry reference leak: d_find_any_alias() > >> >> returns a reference, which I never called dput() on. With that change, I > >> >> no longer see the rpc_pipefs issue, but I do think I need more testing > >> >> and thinking through the third patch. Al, I'd love your feedback on that > >> >> one especially. > >> >> > >> >> Thanks, > >> >> Stephen > >> >> > >> >> Stephen Brennan (3): > >> >> fsnotify: Use d_find_any_alias to get dentry associated with inode > >> >> fsnotify: Protect i_fsnotify_mask and child flags with inode rwsem > >> >> fsnotify: allow sleepable child flag update > >> > > >> > Thanks for the patches Stephen and I'm sorry for replying somewhat late. > >> > >> Absolutely no worries, these things take time. Thanks for taking a look! > >> > >> > The first patch is a nobrainer. The other two patches ... complicate things > >> > somewhat more complicated than I'd like. I guess I can live with them if we > >> > don't find a better solution but I'd like to discuss a bit more about > >> > alternatives. > >> > >> Understood! > >> > >> > So what would happen if we just clear DCACHE_FSNOTIFY_PARENT_WATCHED in > >> > __fsnotify_parent() for the dentry which triggered the event and does not > >> > have watched parent anymore and never bother with full children walk? I > >> > suppose your contention problems will be gone, we'll just pay the price of > >> > dget_parent() + fsnotify_inode_watches_children() for each child that > >> > falsely triggers instead of for only one. Maybe that's not too bad? After > >> > all any event upto this moment triggered this overhead as well... > >> > >> This is an interesting idea. It came across my mind but I don't think I > >> considered it seriously because I assumed that it was too big a change. > >> But I suppose in the process I created an even bigger change :P > >> > >> The false positive dget_parent() + fsnotify_inode_watches_children() > >> shouldn't be too bad. I could see a situation where there's a lot of > >> random accesses within a directory, where the dget_parent() could cause > >> some contention over the parent dentry. But to be fair, the performance > >> would have been the same or worse while fsnotify was active in that > >> case, and the contention would go away as most of the dentries get their > >> flags cleared. So I don't think this is a problem. > >> > >> > Am I missing something? > >> > >> I think there's one thing missed here. I understand you'd like to get > >> rid of the extra flag in the connector. But the advantage of the flag is > >> avoiding duplicate work by saving a bit of state. Suppose that a mark is > >> added to a connector, which causes fsnotify_inode_watches_children() to > >> become true. Then, any subsequent call to fsnotify_recalc_mask() must > >> call __fsnotify_update_child_dentry_flags(), even though the child > >> dentry flags don't need to be updated: they're already set. For (very) > >> large directories, this can take a few seconds, which means that we're > >> doing a few extra seconds of work each time a new mark is added to or > >> removed from a connector in that case. I can't imagine that's a super > >> common workload though, and I don't know if my customers do that (my > >> guess would be no). > > > > I understand. This basically matters for fsnotify_recalc_mask(). As a side > > note I've realized that your changes to fsnotify_recalc_mask() acquiring > > inode->i_rwsem for updating dentry flags in patch 2/3 are problematic for > > dnotify because that calls fsnotify_recalc_mask() under a spinlock. > > Furthermore it is somewhat worrying also for inotify & fanotify because it > > nests inode->i_rwsem inside fsnotify_group->lock however I'm not 100% sure > > something doesn't force the ordering the other way around (e.g. the removal > > of oneshot mark during modify event generation). Did you run tests with > > lockdep enabled? > > No I didn't. I'll be sure to get the LTP tests running with lockdep > early next week and try this series out, we'll probably get an error > like you say. > > I'll also take a look at the dnotify use case and see if there's > anything to do there. Hopefully there's something we can do to salvage > it :D Yeah, dnotify should be solvable. I'm even willing to accept somewhat unusual methods for dnotify ;). It is ancient and rarely used API these days so I don't want it to block fixes for newer APIs. From a quick look it should be enough to move fsnotify_recalc_mask() call in dnotify_recalc_inode_mask() from under the mark->lock out into callers and perhaps rename dnotify_recalc_inode_mask() to dnotify_recalc_mark_mask() to make the name somewhat less confusing. Honza -- Jan Kara SUSE Labs, CR