2020-04-01 08:28:24

by David Howells

Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]

Miklos Szeredi <[email protected]> wrote:

> According to dhowell's measurements processing 100k mounts would take
> about a few seconds of system time (that's the time spent by the
> kernel to retrieve the data,

But the inefficiency of mountfs - at least as currently implemented - scales
up with the number of individual values you want to retrieve, both in terms of
memory usage and time taken.

With fsinfo(), I've tried to batch values together where it makes sense - and
there's no lingering memory overhead - no extra inodes, dentries and files
required.

David


2020-04-01 08:37:56

by Miklos Szeredi

Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]

On Wed, Apr 1, 2020 at 10:27 AM David Howells <[email protected]> wrote:
>
> Miklos Szeredi <[email protected]> wrote:
>
> > According to dhowell's measurements processing 100k mounts would take
> > about a few seconds of system time (that's the time spent by the
> > kernel to retrieve the data,
>
> But the inefficiency of mountfs - at least as currently implemented - scales
> up with the number of individual values you want to retrieve, both in terms of
> memory usage and time taken.

I've taken that into account when guesstimating a "few seconds per
100k entries". My guess is that there's probably an order of
magnitude difference between the performance of a fs based interface
and a binary syscall based interface. That could be reduced somewhat
with a readfile(2) type API.
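
To make the per-value cost concrete, here is a minimal userspace sketch
(not from the thread) of the open/read/close pattern that a file-based
interface implies for every attribute of every mount; the attribute path
is taken as an argument rather than assumed, since the mountfs layout is
not shown here:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* One attribute the file-based way: three syscalls (open, read,
 * close), plus a dentry/inode/struct file instantiated in the kernel
 * for each value read. */
static ssize_t read_attr(const char *path, char *buf, size_t size)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd == -1)
		return -1;
	n = read(fd, buf, size - 1);
	close(fd);
	if (n >= 0)
		buf[n] = '\0';
	return n;
}

int main(int argc, char **argv)
{
	char buf[256];

	if (argc < 2) {
		fprintf(stderr, "usage: %s <attribute-file>\n", argv[0]);
		return 1;
	}
	if (read_attr(argv[1], buf, sizeof(buf)) < 0) {
		perror(argv[1]);
		return 1;
	}
	printf("%s", buf);
	return 0;
}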

But the point is: this does not matter. Whether it's .5s or 5s is
completely irrelevant, as neither is going to take down the system,
and userspace processing is probably going to take as much, if not
more time. And remember, we are talking about stopping and starting
the automount daemon, which is something that happens, but it should
not happen often by any measure.

> With fsinfo(), I've tried to batch values together where it makes sense - and
> there's no lingering memory overhead - no extra inodes, dentries and files
> required.

The dentries, inodes and files in your test are single use (except the
root dentry) and can be made ephemeral if that turns out to be better.
My guess is that dentries belonging to individual attributes should be
deleted on final put, while the dentries belonging to the mount
directory can be reclaimed normally.

Thanks,
Miklos

2020-04-01 12:36:55

by Miklos Szeredi

Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]

On Wed, Apr 1, 2020 at 10:37 AM Miklos Szeredi <[email protected]> wrote:
>
> On Wed, Apr 1, 2020 at 10:27 AM David Howells <[email protected]> wrote:
> >
> > Miklos Szeredi <[email protected]> wrote:
> >
> > > According to dhowell's measurements processing 100k mounts would take
> > > about a few seconds of system time (that's the time spent by the
> > > kernel to retrieve the data,
> >
> > But the inefficiency of mountfs - at least as currently implemented - scales
> > up with the number of individual values you want to retrieve, both in terms of
> > memory usage and time taken.
>
> I've taken that into account when guesstimating a "few seconds per
> 100k entries". My guess is that there's probably an order of
> magnitude difference between the performance of a fs based interface
> and a binary syscall based interface. That could be reduced somewhat
> with a readfile(2) type API.

And to show that I'm not completely off base, I've attached a patch that
adds a limited readfile(2) syscall and uses it in the p2 method.

Results are promising:

./test-fsinfo-perf /tmp/a 30000
--- make mounts ---
--- test fsinfo by path ---
sum(mnt_id) = 930000
--- test fsinfo by mnt_id ---
sum(mnt_id) = 930000
--- test /proc/fdinfo ---
sum(mnt_id) = 930000
--- test mountfs ---
sum(mnt_id) = 930000
For 30000 mounts, f= 146400us f2= 136766us p= 1406569us p2= 221669us; p=9.6*f p=10.3*f2 p=6.3*p2
--- umount ---

This is about a twofold increase in speed compared to open + read + close.
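
For comparison, a rough userspace sketch of what calling such a
readfile()-style interface could look like; the syscall number and exact
signature below are placeholders, not taken from the attached patch:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Placeholder syscall number -- the real one is whatever
 * fsmount-readfile.patch assigns; without the patch applied this
 * simply fails with ENOSYS. */
#ifndef __NR_readfile
#define __NR_readfile 440
#endif

/* Assumed shape: open + read + close collapsed into one call. */
static ssize_t readfile(int dfd, const char *path, char *buf,
			size_t bufsize, int flags)
{
	return syscall(__NR_readfile, dfd, path, buf, bufsize, flags);
}

int main(int argc, char **argv)
{
	char buf[4096];
	ssize_t n;

	if (argc < 2) {
		fprintf(stderr, "usage: %s <file>\n", argv[0]);
		return 1;
	}
	/* One kernel round trip per value instead of three. */
	n = readfile(AT_FDCWD, argv[1], buf, sizeof(buf) - 1, 0);
	if (n < 0) {
		perror("readfile");
		return 1;
	}
	buf[n] = '\0';
	printf("%s", buf);
	return 0;
}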

Is anyone still worried about performance, or can we move on to more
interesting parts of the design?

Thanks,
Miklos


Attachments:
fsmount-readfile.patch (6.18 kB)

2020-04-02 01:39:04

by Ian Kent

Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]

On Wed, 2020-04-01 at 10:37 +0200, Miklos Szeredi wrote:
> On Wed, Apr 1, 2020 at 10:27 AM David Howells <[email protected]> wrote:
> > Miklos Szeredi <[email protected]> wrote:
> >
> > > According to dhowell's measurements processing 100k mounts would take
> > > about a few seconds of system time (that's the time spent by the
> > > kernel to retrieve the data,
> >
> > But the inefficiency of mountfs - at least as currently implemented - scales
> > up with the number of individual values you want to retrieve, both in terms of
> > memory usage and time taken.
>
> I've taken that into account when guesstimating a "few seconds per
> 100k entries". My guess is that there's probably an order of
> magnitude difference between the performance of a fs based interface
> and a binary syscall based interface. That could be reduced somewhat
> with a readfile(2) type API.
>
> But the point is: this does not matter. Whether it's .5s or 5s is
> completely irrelevant, as neither is going to take down the system,
> and userspace processing is probably going to take as much, if not
> more time. And remember, we are talking about stopping and starting
> the automount daemon, which is something that happens, but it should
> not happen often by any measure.

Yes, but don't forget, I'm reporting what I saw when testing during
development.

From previous discussion we know systemd (and probably the other apps
like udisks2, et al.) gets notified on mount and umount activity, so
it's not going to be just starting and stopping autofs that's a problem
with very large mount tables.

To get a feel for the real difference we'd need to make the libmount
changes for both, then compare the two and check behaviour. The mount
and umount lookup case that Karel (and I) talked about should be
sufficient.
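
To make that lookup case concrete: today libmount answers "what is
mounted on this target?" by parsing the whole mountinfo table, roughly
as in the sketch below (standard libmount API, not code from either
patchset); the point of both proposals is to let it query just the one
mount instead:

#include <stdio.h>
#include <libmount/libmount.h>

int main(int argc, char **argv)
{
	struct libmnt_table *tb;
	struct libmnt_fs *fs;
	const char *target = argc > 1 ? argv[1] : "/";

	/* Parses and allocates an entry for every mount in the system,
	 * even though only one of them is wanted. */
	tb = mnt_new_table_from_file("/proc/self/mountinfo");
	if (!tb)
		return 1;

	fs = mnt_table_find_target(tb, target, MNT_ITER_BACKWARD);
	if (fs)
		printf("%s is %s\n", target, mnt_fs_get_source(fs));

	mnt_unref_table(tb);
	return 0;
}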

The biggest problem I had with fsinfo() when I was working with the
earlier series was getting fs-specific options, in particular the
need to use the sb op ->fsinfo(). With this latest series David has
made that part of the generic code and your patch also covers it.

So the thing that was holding me up is done, so we should be getting
on with the libmount improvements; we need to settle this.

I prefer the system call interface and I'm not offering justification
for that other than a general dislike (and on occasion outright
frustration) of pretty much every proc implementation I have had to
look at.

>
> > With fsinfo(), I've tried to batch values together where it makes sense - and
> > there's no lingering memory overhead - no extra inodes, dentries and files
> > required.
>
> The dentries, inodes and files in your test are single use (except the
> root dentry) and can be made ephemeral if that turns out to be better.
> My guess is that dentries belonging to individual attributes should be
> deleted on final put, while the dentries belonging to the mount
> directory can be reclaimed normally.
>
> Thanks,
> Miklos

2020-04-02 14:17:01

by Karel Zak

Subject: Re: [PATCH 00/13] VFS: Filesystem information [ver #19]

On Thu, Apr 02, 2020 at 09:38:20AM +0800, Ian Kent wrote:
> I prefer the system call interface and I'm not offering justification
> for that other than a general dislike (and on occasion outright
> frustration) of pretty much every proc implementation I have had to
> look at.

Frankly, I'm flexible: what about having both interfaces in the kernel --
fsinfo() as well as mountfs? It's nothing unusual; for example, block
devices have attributes accessible via /sys as well as via ioctl().

I can imagine that for complex or performance-sensitive tasks it's
better to use fsinfo(), but in other simple use cases (for example,
converting a mountpoint to a device name in shell) it's better to
read /proc/.../<attr>.

Karel

--
Karel Zak <[email protected]>
http://karelzak.blogspot.com