Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759645AbcLBKs5 (ORCPT ); Fri, 2 Dec 2016 05:48:57 -0500 Received: from mx2.suse.de ([195.135.220.15]:46901 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758130AbcLBKs4 (ORCPT ); Fri, 2 Dec 2016 05:48:56 -0500 Date: Fri, 2 Dec 2016 11:48:04 +0100 From: Jan Kara To: Miklos Szeredi Cc: Jan Kara , Amir Goldstein , Eric Paris , linux-fsdevel , linux-kernel Subject: Re: fsnotify_mark_srcu wtf? Message-ID: <20161202104804.GC26086@quack2.suse.cz> References: <20161102220851.GA1839@veci.piliscsaba.szeredi.hu> <20161105213411.GA32353@quack2.suse.cz> <20161109111005.GA32353@quack2.suse.cz> <20161110194625.GG31098@quack2.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 1712 Lines: 37 On Fri 02-12-16 09:26:51, Miklos Szeredi wrote: > On Thu, Nov 10, 2016 at 8:46 PM, Jan Kara wrote: > > On Wed 09-11-16 20:26:16, Amir Goldstein wrote: > >> On Wed, Nov 9, 2016 at 1:10 PM, Jan Kara wrote: > > >> > And this does not work as well... Fanotify must notify groups by their > >> > priority so you cannot arbitrarily reorder ordering in which groups get > >> > notified. I'm currently pondering on using mark refcount to pin it when > >> > processing permission event but there are still some details to check. > >> > > >> > >> All right, mark refcount sound like the proper solution. > > > > Except it doesn't quite work. We can pin the current marks by a refcount > > but they can still be removed from the list so after we regain srcu lock, > > we are not sure their ->next pointers still point to still allocated marks > > :-| Sadly I realized this only after implementing all this. > > Hmm, how about this: when removing mark from inode, drop refcount. If > refcount is zero can remove from list. Otherwise mark the mark "dead" > and leave it on the list. > > And fsnotify can just skip dead marks. I had this idea as well and when trying to implement this, I've stumbled over some problems. I think the biggest problem was that destruction of a notification mark is relatively complex operation (doing iput() for example) and quite a few places dropping mark references are in a context where this can cause problems. Also I don't want to defer iput() to a workqueue as that will have unexpected consequences such as unlinked watched inode lingering in the system (possibly colliding with umount etc.). Honza -- Jan Kara SUSE Labs, CR