Date: Thu, 27 Feb 2020 16:14:21 +0100
From: Karel Zak
To: Miklos Szeredi
Cc: Ian Kent, Miklos Szeredi, James Bottomley, Steven Whitehouse,
    David Howells, viro, Christian Brauner, Jann Horn, "Darrick J. Wong",
    Linux API, linux-fsdevel, lkml, Lennart Poettering,
    Zbigniew Jędrzejewski-Szmek, util-linux@vger.kernel.org
Subject: Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]
Message-ID: <20200227151421.3u74ijhqt6ekbiss@ws.net.home>

On Thu, Feb 27, 2020 at 02:45:27PM +0100, Miklos Szeredi wrote:
> > So the problem I want to see fixed is the effect of very large
> > mount tables on other user space applications, particularly the
> > effect when a large number of mounts or umounts are performed.

Yes, right now you have to generate (in the kernel) and parse (in
userspace) the whole mount table to get information about just one mount
table entry. This is typical for umount or systemd.

> > > - add a notification mechanism - lookup a mount based on path
> > > - and a way to selectively query mount/superblock information
> > based on path ...

For umount-like use-cases we need a mountpoint/ to mount-entry
conversion; I guess something like open(mountpoint/) + fsinfo() should
be good enough.
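
To make the cost concrete, here is a minimal sketch of what the
mountpoint-to-entry conversion looks like today on top of
/proc/self/mountinfo. It is not libmount code: the names MountEntry,
parse_mountinfo_line() and find_mount() are hypothetical, escaped
characters such as \040 in paths are not decoded, and it assumes that the
last matching line is the visible mount. The point is only that every line
has to be parsed to answer a question about one mount point.

```python
from typing import NamedTuple, Optional


class MountEntry(NamedTuple):
    """One line of /proc/self/mountinfo (hypothetical helper type)."""
    mount_id: int
    parent_id: int
    fsroot: str
    target: str
    vfs_options: str
    fstype: str
    source: str
    fs_options: str


def parse_mountinfo_line(line: str) -> MountEntry:
    """Split one mountinfo line; octal escapes (e.g. \\040) are NOT decoded."""
    fields = line.split()
    # Optional fields (shared:N, master:N, ...) end at a single "-" separator,
    # which can only appear after the six fixed leading fields.
    sep = fields.index("-", 6)
    return MountEntry(
        mount_id=int(fields[0]),
        parent_id=int(fields[1]),
        fsroot=fields[3],
        target=fields[4],
        vfs_options=fields[5],
        fstype=fields[sep + 1],
        source=fields[sep + 2],
        fs_options=fields[sep + 3] if len(fields) > sep + 3 else "",
    )


def find_mount(mountinfo_text: str, target: str) -> Optional[MountEntry]:
    """Scan the WHOLE table; assume the last matching entry is the visible one."""
    found = None
    for line in mountinfo_text.splitlines():
        if not line.strip():
            continue
        entry = parse_mountinfo_line(line)
        if entry.target == target:
            found = entry
    return found
```

With thousands of mounts, find_mount() walks all of them for every lookup,
and the kernel has to generate the complete text first.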
For systemd we need the same, but triggered by a notification. The ideal
solution is to get a mount entry ID or FD from the notification and later
use this ID or FD to ask for details about the mount entry (probably
again with fsinfo()). The notification has to be usable within an epoll()
set.

This solves 99% of our performance issues, I guess.

> > So that means mount table info. needs to be maintained, whether that
> > can be achieved using sysfs I don't know. Creating and maintaining
> > the sysfs tree would be a big challenge I think.

It will still be necessary to get the complete mount table sometimes,
but not in performance-sensitive scenarios.

I'm not sure about sysfs/; you need to somehow resolve namespaces, the
order of the mount entries (which one is the last one), etc. IMHO
translating a mountpoint path to a sysfs/ path will be complicated.

> > But before trying to work out how to use a notification mechanism
> > just having a way to get the info provided by the proc tables using
> > a path alone should give initial immediate improvement in libmount.
>
> Adding Karel, Lennart, Zbigniew and util-linux@vger...
>
> At a quick glance at libmount and systemd code, it appears that just
> switching out the implementation in libmount will not be enough:
> systemd is calling functions like mnt_table_parse_*() when it receives
> a notification that the mount table changed.

We're ready to change this stuff in systemd if there is something
better (something per-mount-entry). My plan is to add a new API to
libmount to query information about one mount entry (but I have had no
time to play with fsinfo yet).

> What is the end purpose of parsing the mount tables? Can systemd guys
> comment on that?

If a mount/umount is triggered by systemd, then it needs verification of
success and the final version of the mount options. It also reads
information from libmount to get userspace mount options (e.g. _netdev
-- libmount uses the mount source, target and fsroot to join kernel and
userspace stuff).
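
A rough sketch of that join, for illustration only: it uses plain dicts
with hypothetical keys ("source", "target", "fsroot", "options") rather
than libmount's real data structures, and merge_user_options() is a
made-up name. It only shows the idea that a userspace-only option such as
_netdev is attached to the kernel entry with the same
(source, target, fsroot) triple.

```python
from typing import Dict, List, Tuple

Entry = Dict[str, str]
Key = Tuple[str, str, str]  # (source, target, fsroot)


def _key(entry: Entry) -> Key:
    return (entry["source"], entry["target"], entry["fsroot"])


def merge_user_options(kernel_entries: List[Entry],
                       user_entries: List[Entry]) -> List[Entry]:
    """Append userspace-only options to the matching kernel entries.

    Hypothetical helper: entries are matched on (source, target, fsroot);
    kernel entries without a userspace counterpart are returned unchanged.
    """
    user_by_key = {_key(u): u["options"] for u in user_entries}
    merged = []
    for k in kernel_entries:
        user_opts = user_by_key.get(_key(k), "")
        # Join only the non-empty option strings with a comma.
        options = ",".join(o for o in (k["options"], user_opts) if o)
        merged.append({**k, "options": options})
    return merged
```

The input lists are left untouched; each merged entry is a new dict, so the
kernel view and the combined view can be kept side by side.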
And don't forget that mount units are part of systemd dependencies, so
umount/mount is an important event for systemd, and it needs details
about the changes (what, where, etc.).

    Karel

-- 
 Karel Zak
 http://karelzak.blogspot.com