Subject: Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]
To: Miklos Szeredi, David Howells
Cc: Ian Kent, Christian Brauner, James Bottomley, Miklos Szeredi, viro,
    Christian Brauner, Jann Horn, "Darrick J. Wong", Linux API,
    linux-fsdevel, lkml, Greg Kroah-Hartman
From: Steven Whitehouse
Message-ID: <06d2dbf0-4580-3812-bb14-34c6aa615747@redhat.com>
Date: Tue, 3 Mar 2020 10:21:58 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 03/03/2020 09:48, Miklos Szeredi wrote:
> On Tue, Mar 3, 2020 at 10:26 AM Miklos Szeredi wrote:
>> On Tue, Mar 3, 2020 at 10:13 AM David Howells wrote:
>>> Miklos Szeredi wrote:
>>>
>>>> I'm doing a patch. Let's see how it fares in the face of all these
>>>> preconceptions.
>>> Don't forget the efficiency criterion.
>>> One reason for going with fsinfo(2) is
>>> that scanning /proc/mounts when there are a lot of mounts in the system is
>>> slow (not to mention the global lock that is held during the read).
> BTW, I do feel that there's room for improvement in userspace code as
> well. Even quite big mount table could be scanned for *changes* very
> efficiently. I.e. cache previous contents of /proc/self/mountinfo and
> compare with new contents, line-by-line. Only need to parse the
> changed/added/removed lines.
>
> Also it would be pretty easy to throttle the number of updates so
> systemd et al. wouldn't hog the system with unnecessary processing.
>
> Thanks,
> Miklos
>

At least having patches to compare would allow us to look at the
performance here and gain some numbers, which would be helpful to frame
the discussions. However, I'm not seeing how it would be easy to
throttle updates... they occur at whatever rate they are generated and
this can be fairly high. Also, I'm not sure that I follow how the
notifications and the dumping of the whole table are synchronized in
this case, either.

Al has pointed out before that a single mount operation on a subtree can
generate a large number of changes on that subtree. That kind of
scenario will need to be dealt with efficiently so that we don't miss
things, and we also minimize the possibility of overruns, and additional
overhead on the mount changes themselves, by keeping the notification
messages small.

We should also look at what the likely worst case might be. I seem to
remember from what Ian has said in the past that there can be tens of
thousands of autofs mounts on some large systems. I assume that worst
case might be something like that, but multiplied by however many
containers might be on a system. Can anybody think of a situation which
might require even more mounts?

The network subsystem had a similar problem...
they use rtnetlink for the routing information, and just like the
proposal here it contains a dump mechanism, and a way to listen to
events (add/remove routes) which is synchronized with that dump. Ian did
start looking at netlink some time ago, but it also has some issues (it
is in the network namespace not the fs namespace, it also has various
things accumulated over the years that we don't need for filesystems)
but that was part of the original inspiration for the fs notifications.

There is also, of course, /proc/net/route which can be useful in many
circumstances, but for efficiency and synchronization reasons it is not
the interface of choice for routing protocols.

David's proposal has a number of the important attributes of an
rtnetlink-like (in a conceptual sense) solution, and I remain skeptical
that a sysfs or similar interface would be an efficient solution to the
original problem, even if it might perhaps make a useful addition.

There is also the chicken-and-egg issue, in the sense that if the
interface is via a filesystem (sysfs, proc or whatever), how does one
receive a notification for that filesystem itself being mounted until
after it has been mounted? Maybe that is not a particular problem, but I
think a cleaner solution would not require a mount in order to watch for
other mounts.

Steve.
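
For reference, here is a minimal sketch of the userspace scan Miklos
describes in the quoted text: cache the previous contents of
/proc/self/mountinfo and compare with the new contents line by line. It
is an illustration only, not code from either patch set. The function
names are hypothetical, and keying lines on the first field (the mount
ID) is an assumption of this sketch; mount IDs can be reused after an
unmount, which then shows up here as a "changed" line.

```python
from pathlib import Path

MOUNTINFO = "/proc/self/mountinfo"  # Linux-only path named in the thread


def index_by_mount_id(text):
    """Map mount ID (first field of each mountinfo line) to the raw line."""
    table = {}
    for line in text.splitlines():
        if line.strip():
            table[line.split(maxsplit=1)[0]] = line
    return table


def diff_mountinfo(old_text, new_text):
    """Compare two snapshots without parsing the unchanged lines.

    Returns (added, removed, changed): added and removed are lists of raw
    lines, changed is a list of (old_line, new_line) pairs.
    """
    old = index_by_mount_id(old_text)
    new = index_by_mount_id(new_text)
    added = [line for mid, line in new.items() if mid not in old]
    removed = [line for mid, line in old.items() if mid not in new]
    changed = [(old[mid], line) for mid, line in new.items()
               if mid in old and old[mid] != line]
    return added, removed, changed


def read_mountinfo(path=MOUNTINFO):
    """Read one full snapshot of the mount table (hypothetical helper)."""
    return Path(path).read_text()
```

A caller would keep the text from one read_mountinfo() call and pass it
with the next one to diff_mountinfo(). Note that each snapshot still
reads the whole table, so this only saves parsing in userspace; it does
not address the cost of the read itself that David raises, nor the
synchronization between notifications and the dump.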