2017-10-06 21:09:57

by Josef Bacik

Subject: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

Hello,

One thing that comes up a lot at every LSF is the fact that we have no
general way of doing performance testing. Every fs developer has a set of
scripts or things that they run with varying degrees of consistency, but
nothing central that we all use. I for one am getting tired of finding
regressions when we are deploying new kernels internally, so I wired this
thing up to try and address that need.

We all hate convoluted setups; the more brain power we have to put into
setting something up, the less likely we are to use it, so I took the
xfstests approach of making it relatively simple to get running and
relatively easy to add new tests. For right now the only thing this
framework does is run fio scripts. I chose fio because it already gathers
loads of performance data about its runs. We have everything we need
there: latency, bandwidth, cpu time, all broken down by reads, writes,
and trims. I figure most of us are familiar enough with fio and how it
works to make it relatively easy to add new tests to the framework.

I've posted my code up on github, you can get it here

https://github.com/josefbacik/fsperf

All (well, most) of the results from fio are stored in a local sqlite
database. Right now the comparison logic is very crude: it simply checks
against the previous run, and it only checks a few of the keys by default.
You can check latency if you want, but while writing this up it seemed
that latency was too variable from run to run to be useful in a "did my
thing regress or improve" sort of way.
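
To make that concrete, the flow is roughly the following (an illustrative
sketch, not the actual fsperf code; the table layout, the choice of keys
and the 5% threshold are made up here, and it only needs the stdlib json,
sqlite3 and subprocess modules):

import json
import sqlite3
import subprocess

THRESHOLD = 0.05  # flag anything that moved more than 5% (arbitrary)

def run_fio(jobfile):
    # fio can emit all of its per-job results as JSON
    out = subprocess.check_output(["fio", "--output-format=json", jobfile])
    return json.loads(out.decode("utf-8"))

def store_results(db, test, results):
    # keep only a handful of summary keys per job; fio reports many more
    db.execute("CREATE TABLE IF NOT EXISTS runs "
               "(test TEXT, when_run TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
               " read_bw REAL, read_iops REAL, write_bw REAL, write_iops REAL)")
    for job in results["jobs"]:
        db.execute("INSERT INTO runs "
                   "(test, read_bw, read_iops, write_bw, write_iops) "
                   "VALUES (?, ?, ?, ?, ?)",
                   (test, job["read"]["bw"], job["read"]["iops"],
                    job["write"]["bw"], job["write"]["iops"]))
    db.commit()

def compare_to_previous(db, test):
    # naive comparison: latest run vs the run immediately before it
    rows = db.execute("SELECT read_bw, write_bw FROM runs WHERE test = ? "
                      "ORDER BY when_run DESC LIMIT 2", (test,)).fetchall()
    if len(rows) < 2:
        return
    for name, new, old in zip(("read_bw", "write_bw"), rows[0], rows[1]):
        if old and (old - new) / old > THRESHOLD:
            print("%s: %s regressed %.1f%% (%s -> %s)" %
                  (test, name, 100.0 * (old - new) / old, old, new))

db = sqlite3.connect("fsperf.db")
res = run_fio("tests/example.fio")      # hypothetical job file name
store_results(db, "example", res)
compare_to_previous(db, "example")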

The configuration is brain-dead simple; the README has examples. All you
need to do is create your local.cfg, run ./setup, and then run ./fsperf
and you are good to go.

The plan is to add lots of workloads as we discover regressions and such.
We don't want anything that takes too long to run, otherwise people won't
run it, so the existing tests don't take much longer than a few minutes
each. I will be adding some more comparison options so you can compare
against averages of all previous runs and such.
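
Comparing against the average of all previous runs is then mostly a
different query; again an illustrative sketch against the made-up table
above:

def compare_to_average(db, test, metric="write_bw", threshold=0.05):
    # column names can't be bound as parameters, hence the string paste;
    # 'metric' is trusted input here
    hist = db.execute("SELECT AVG(" + metric + ") FROM runs WHERE test = ? "
                      "AND when_run < (SELECT MAX(when_run) FROM runs "
                      "WHERE test = ?)", (test, test)).fetchone()[0]
    row = db.execute("SELECT " + metric + " FROM runs WHERE test = ? "
                     "ORDER BY when_run DESC LIMIT 1", (test,)).fetchone()
    if not hist or row is None:
        return
    latest = row[0]
    if (hist - latest) / hist > threshold:
        print("%s: %s is %.1f%% below the average of previous runs" %
              (test, metric, 100.0 * (hist - latest) / hist))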

Another future goal is to parse the sqlite database and generate graphs
of all runs for each test so we can visualize changes over time. This is
where the latency measurements will be more useful, since we can spot
patterns rather than worrying about test-to-test variance.
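
Graphing is then just a matter of pulling one column per test out of the
database; an illustrative sketch, assuming matplotlib is available (fsperf
itself doesn't need it):

import sqlite3
import matplotlib.pyplot as plt

db = sqlite3.connect("fsperf.db")
rows = db.execute("SELECT when_run, write_bw FROM runs WHERE test = ? "
                  "ORDER BY when_run", ("example",)).fetchall()
dates = [r[0] for r in rows]
values = [r[1] for r in rows]

plt.plot(range(len(values)), values, marker="o")
plt.xticks(range(len(dates)), dates, rotation=45, fontsize=8)
plt.ylabel("write bandwidth (KiB/s, as reported by fio)")
plt.title("example: write bandwidth per run")
plt.tight_layout()
plt.savefig("example-write-bw.png")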

Please let me know if you have any feedback. I'll take github pull
requests for people who like that workflow, but emailed patches work as
well. Thanks,

Josef


2017-10-09 00:51:37

by Dave Chinner

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> Hello,
>
> One thing that comes up a lot every LSF is the fact that we have no general way
> that we do performance testing. Every fs developer has a set of scripts or
> things that they run with varying degrees of consistency, but nothing central
> that we all use. I for one am getting tired of finding regressions when we are
> deploying new kernels internally, so I wired this thing up to try and address
> this need.
>
> We all hate convoluted setups, the more brain power we have to put in to setting
> something up the less likely we are to use it, so I took the xfstests approach
> of making it relatively simple to get running and relatively easy to add new
> tests. For right now the only thing this framework does is run fio scripts. I
> chose fio because it already gathers loads of performance data about it's runs.
> We have everything we need there, latency, bandwidth, cpu time, and all broken
> down by reads, writes, and trims. I figure most of us are familiar enough with
> fio and how it works to make it relatively easy to add new tests to the
> framework.
>
> I've posted my code up on github, you can get it here
>
> https://github.com/josefbacik/fsperf
>
> All (well most) of the results from fio are stored in a local sqlite database.
> Right now the comparison stuff is very crude, it simply checks against the
> previous run and it only checks a few of the keys by default. You can check
> latency if you want, but while writing this stuff up it seemed that latency was
> too variable from run to run to be useful in a "did my thing regress or improve"
> sort of way.
>
> The configuration is brain dead simple, the README has examples. All you need
> to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> to go.

Why re-invent the test infrastructure? Why not just make it a
tests/perf subdir in fstests?

> The plan is to add lots of workloads as we discover regressions and such. We
> don't want anything that takes too long to run otherwise people won't run this,
> so the existing tests don't take much longer than a few minutes each. I will be
> adding some more comparison options so you can compare against averages of all
> previous runs and such.

Yup, that fits exactly into what fstests is for... :P

Integrating into fstests means it will be immediately available to
all fs developers, it'll run on everything that everyone already has
setup for filesystem testing, and it will have familiar mkfs/mount
option setup behaviour so there's no new hoops for everyone to jump
through to run it...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2017-10-09 02:25:10

by Josef Bacik

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > Hello,
> >
> > One thing that comes up a lot every LSF is the fact that we have no general way
> > that we do performance testing. Every fs developer has a set of scripts or
> > things that they run with varying degrees of consistency, but nothing central
> > that we all use. I for one am getting tired of finding regressions when we are
> > deploying new kernels internally, so I wired this thing up to try and address
> > this need.
> >
> > We all hate convoluted setups, the more brain power we have to put in to setting
> > something up the less likely we are to use it, so I took the xfstests approach
> > of making it relatively simple to get running and relatively easy to add new
> > tests. For right now the only thing this framework does is run fio scripts. I
> > chose fio because it already gathers loads of performance data about it's runs.
> > We have everything we need there, latency, bandwidth, cpu time, and all broken
> > down by reads, writes, and trims. I figure most of us are familiar enough with
> > fio and how it works to make it relatively easy to add new tests to the
> > framework.
> >
> > I've posted my code up on github, you can get it here
> >
> > https://github.com/josefbacik/fsperf
> >
> > All (well most) of the results from fio are stored in a local sqlite database.
> > Right now the comparison stuff is very crude, it simply checks against the
> > previous run and it only checks a few of the keys by default. You can check
> > latency if you want, but while writing this stuff up it seemed that latency was
> > too variable from run to run to be useful in a "did my thing regress or improve"
> > sort of way.
> >
> > The configuration is brain dead simple, the README has examples. All you need
> > to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> > to go.
>
> Why re-invent the test infrastructure? Why not just make it a
> tests/perf subdir in fstests?
>

Probably should have led with that, shouldn't I have? There's nothing
keeping me from doing it, but I didn't want to try to shoehorn a python
thing into fstests. I need python for the sqlite side and for the json
parsing that dumps the fio results into the sqlite database.

Now if you (and others) are not opposed to this being dropped into tests/perf
then I'll work that up. But it's definitely going to need to be done in python.
I know you yourself have said you aren't opposed to using python in the past, so
if that's still the case then I can definitely wire it all up.

> > The plan is to add lots of workloads as we discover regressions and such. We
> > don't want anything that takes too long to run otherwise people won't run this,
> > so the existing tests don't take much longer than a few minutes each. I will be
> > adding some more comparison options so you can compare against averages of all
> > previous runs and such.
>
> Yup, that fits exactly into what fstests is for... :P
>
> Integrating into fstests means it will be immediately available to
> all fs developers, it'll run on everything that everyone already has
> setup for filesystem testing, and it will have familiar mkfs/mount
> option setup behaviour so there's no new hoops for everyone to jump
> through to run it...
>

TBF I specifically made it as easy as possible because I know we all hate trying
to learn new shit. I figured this was different enough to warrant a separate
project, especially since I'm going to add block device jobs so Jens can test
block layer things. If we all agree we'd rather see this in fstests then I'm
happy to do that too. Thanks,

Josef

2017-10-09 03:43:35

by Theodore Ts'o

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
>
> Probably should have led with that shouldn't I have? There's nothing keeping me
> from doing it, but I didn't want to try and shoehorn in a python thing into
> fstests. I need python to do the sqlite and the json parsing to dump into the
> sqlite database.

What version of python are you using? From inspection it looks like
some variant of python 3.x (you're using print as a function w/o using
"from __future__ import print_function") but it's not immediately
obvious from the top-level fsperf shell script what version of python
your scripts are dependent upon.

This could potentially be a bit of a portability issue across various
distributions --- RHEL doesn't ship with Python 3.x at all, and on
Debian you need to use python3 to get python 3.x, since
/usr/bin/python still points at Python 2.7 by default. So I could see
this as a potential issue for xfstests.

I'm currently using Python 2.7 in my wrapper scripts to, among other
things, parse the xUnit XML format and create nice summaries like this:

ext4/4k: 337 tests, 6 failures, 21 skipped, 3814 seconds
Failures: generic/232 generic/233 generic/361 generic/388
generic/451 generic/459
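
(Illustrative only, not the actual wrapper scripts: a Python 2.7-compatible
sketch of producing that kind of summary from an xUnit-style results file,
such as the xunit report xfstests can generate, using nothing but the
stdlib XML parser. The file path and element names follow common JUnit
conventions and are assumptions here.)

from __future__ import print_function
import xml.etree.ElementTree as ET

def summarize(xmlfile, label):
    root = ET.parse(xmlfile).getroot()
    suite = root if root.tag == "testsuite" else root.find("testsuite")
    cases = suite.findall("testcase")
    failed = [c.get("name") for c in cases if c.find("failure") is not None]
    skipped = [c for c in cases if c.find("skipped") is not None]
    print("%s: %d tests, %d failures, %d skipped, %s seconds" %
          (label, len(cases), len(failed), len(skipped),
           suite.get("time", "?")))
    if failed:
        print("  Failures: %s" % " ".join(sorted(failed)))

summarize("results/ext4/4k/result.xml", "ext4/4k")   # path is hypothetical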

So I'm not opposed to python, but I will note that if you are using
modules from the Python Package Index, and they are modules which are
not packaged by your distribution (so you're using pip or easy_install
to pull them off the network), it does make doing hermetic builds from
trusted sources a bit trickier.

If you have a secops team who wants to know the provenance of software
which gets thrown into production data centers (and automated downloading
from random external sources w/o any code review makes them break out
in hives), use of PyPI adds a new wrinkle. It's not impossible to
solve, by any means, but it's something to consider.

- Ted

2017-10-09 05:17:31

by Dave Chinner

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> > On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > > Hello,
> > >
> > > One thing that comes up a lot every LSF is the fact that we have no general way
> > > that we do performance testing. Every fs developer has a set of scripts or
> > > things that they run with varying degrees of consistency, but nothing central
> > > that we all use. I for one am getting tired of finding regressions when we are
> > > deploying new kernels internally, so I wired this thing up to try and address
> > > this need.
> > >
> > > We all hate convoluted setups, the more brain power we have to put in to setting
> > > something up the less likely we are to use it, so I took the xfstests approach
> > > of making it relatively simple to get running and relatively easy to add new
> > > tests. For right now the only thing this framework does is run fio scripts. I
> > > chose fio because it already gathers loads of performance data about it's runs.
> > > We have everything we need there, latency, bandwidth, cpu time, and all broken
> > > down by reads, writes, and trims. I figure most of us are familiar enough with
> > > fio and how it works to make it relatively easy to add new tests to the
> > > framework.
> > >
> > > I've posted my code up on github, you can get it here
> > >
> > > https://github.com/josefbacik/fsperf
> > >
> > > All (well most) of the results from fio are stored in a local sqlite database.
> > > Right now the comparison stuff is very crude, it simply checks against the
> > > previous run and it only checks a few of the keys by default. You can check
> > > latency if you want, but while writing this stuff up it seemed that latency was
> > > too variable from run to run to be useful in a "did my thing regress or improve"
> > > sort of way.
> > >
> > > The configuration is brain dead simple, the README has examples. All you need
> > > to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> > > to go.
> >
> > Why re-invent the test infrastructure? Why not just make it a
> > tests/perf subdir in fstests?
> >
>
> Probably should have led with that shouldn't I have? There's nothing keeping me
> from doing it, but I didn't want to try and shoehorn in a python thing into
> fstests. I need python to do the sqlite and the json parsing to dump into the
> sqlite database.
>
> Now if you (and others) are not opposed to this being dropped into tests/perf
> then I'll work that up. But it's definitely going to need to be done in python.
> I know you yourself have said you aren't opposed to using python in the past, so
> if that's still the case then I can definitely wire it all up.

I have no problems with people using python for stuff like this but,
OTOH, I'm not the fstests maintainer anymore :P

> > > The plan is to add lots of workloads as we discover regressions and such. We
> > > don't want anything that takes too long to run otherwise people won't run this,
> > > so the existing tests don't take much longer than a few minutes each. I will be
> > > adding some more comparison options so you can compare against averages of all
> > > previous runs and such.
> >
> > Yup, that fits exactly into what fstests is for... :P
> >
> > Integrating into fstests means it will be immediately available to
> > all fs developers, it'll run on everything that everyone already has
> > setup for filesystem testing, and it will have familiar mkfs/mount
> > option setup behaviour so there's no new hoops for everyone to jump
> > through to run it...
> >
>
> TBF I specifically made it as easy as possible because I know we all hate trying
> to learn new shit.

Yeah, it's also hard to get people to change their workflows to add
a whole new test harness into them. It's easy if it's just a new
command to an existing workflow :P

> I figured this was different enough to warrant a separate
> project, especially since I'm going to add block device jobs so Jens can test
> block layer things. If we all agree we'd rather see this in fstests then I'm
> happy to do that too. Thanks,

I'm not fussed either way - it's a good discussion to have, though.

If I want to add tests (e.g. my time-honoured fsmark tests), where
should I send patches?

Cheers,

Dave.
--
Dave Chinner
[email protected]

2017-10-09 06:54:34

by Eryu Guan

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 04:17:31PM +1100, Dave Chinner wrote:
> On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> > On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> > > On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > > > Hello,
> > > >
> > > > One thing that comes up a lot every LSF is the fact that we have no general way
> > > > that we do performance testing. Every fs developer has a set of scripts or
> > > > things that they run with varying degrees of consistency, but nothing central
> > > > that we all use. I for one am getting tired of finding regressions when we are
> > > > deploying new kernels internally, so I wired this thing up to try and address
> > > > this need.
> > > >
> > > > We all hate convoluted setups, the more brain power we have to put in to setting
> > > > something up the less likely we are to use it, so I took the xfstests approach
> > > > of making it relatively simple to get running and relatively easy to add new
> > > > tests. For right now the only thing this framework does is run fio scripts. I
> > > > chose fio because it already gathers loads of performance data about it's runs.
> > > > We have everything we need there, latency, bandwidth, cpu time, and all broken
> > > > down by reads, writes, and trims. I figure most of us are familiar enough with
> > > > fio and how it works to make it relatively easy to add new tests to the
> > > > framework.
> > > >
> > > > I've posted my code up on github, you can get it here
> > > >
> > > > https://github.com/josefbacik/fsperf
> > > >
> > > > All (well most) of the results from fio are stored in a local sqlite database.
> > > > Right now the comparison stuff is very crude, it simply checks against the
> > > > previous run and it only checks a few of the keys by default. You can check
> > > > latency if you want, but while writing this stuff up it seemed that latency was
> > > > too variable from run to run to be useful in a "did my thing regress or improve"
> > > > sort of way.
> > > >
> > > > The configuration is brain dead simple, the README has examples. All you need
> > > > to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> > > > to go.
> > >
> > > Why re-invent the test infrastructure? Why not just make it a
> > > tests/perf subdir in fstests?
> > >
> >
> > Probably should have led with that shouldn't I have? There's nothing keeping me
> > from doing it, but I didn't want to try and shoehorn in a python thing into
> > fstests. I need python to do the sqlite and the json parsing to dump into the
> > sqlite database.
> >
> > Now if you (and others) are not opposed to this being dropped into tests/perf
> > then I'll work that up. But it's definitely going to need to be done in python.
> > I know you yourself have said you aren't opposed to using python in the past, so
> > if that's still the case then I can definitely wire it all up.
>
> I have no problems with people using python for stuff like this but,
> OTOH, I'm not the fstests maintainer anymore :P

I have no problem either if python is really needed; after all, this is a
very useful infrastructure improvement. But the python version problem
brought up by Ted made me a bit nervous, we need to work around that
carefully.

OTOH, I'm just curious, what is the specific reason that python is a
hard requirement? If we can use perl, that'll be much easier for
fstests.

BTW, opinions from key fs developers/fstests users, like you, are also
very important and welcomed :)

Thanks,
Eryu

>
> > > > The plan is to add lots of workloads as we discover regressions and such. We
> > > > don't want anything that takes too long to run otherwise people won't run this,
> > > > so the existing tests don't take much longer than a few minutes each. I will be
> > > > adding some more comparison options so you can compare against averages of all
> > > > previous runs and such.
> > >
> > > Yup, that fits exactly into what fstests is for... :P
> > >
> > > Integrating into fstests means it will be immediately available to
> > > all fs developers, it'll run on everything that everyone already has
> > > setup for filesystem testing, and it will have familiar mkfs/mount
> > > option setup behaviour so there's no new hoops for everyone to jump
> > > through to run it...
> > >
> >
> > TBF I specifically made it as easy as possible because I know we all hate trying
> > to learn new shit.
>
> Yeah, it's also hard to get people to change their workflows to add
> a whole new test harness into them. It's easy if it's just a new
> command to an existing workflow :P
>
> > I figured this was different enough to warrant a separate
> > project, especially since I'm going to add block device jobs so Jens can test
> > block layer things. If we all agree we'd rather see this in fstests then I'm
> > happy to do that too. Thanks,
>
> I'm not fussed either way - it's a good discussion to have, though.
>
> If I want to add tests (e.g. my time-honoured fsmark tests), where
> should I send patches?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> [email protected]

2017-10-09 12:52:59

by Theodore Ts'o

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 02:54:34PM +0800, Eryu Guan wrote:
> I have no problem either if python is really needed, after all this is a
> very useful infrastructure improvement. But the python version problem
> brought up by Ted made me a bit nervous, we need to work that round
> carefully.
>
> OTOH, I'm just curious, what is the specific reason that python is a
> hard requirement? If we can use perl, that'll be much easier for
> fstests.

Note that perl has its own versioning issues (but it's been easier
given that Perl has been frozen so long waiting for Godot^H^H^H^H^H
Perl6). It's for similar reasons that Python 2.7 is nice and stable:
Python 2 was frozen while Python 3 caught up. The only downside is
that Python 2.7 is now deprecated, and will stop getting security
updates from Python Core in 2020.

I'll note that Python 3.6 has some nice features that aren't in Python
3.5 --- but Debian Stable only has Python 3.5, and Debian Unstable has
Python 3.6. Which is why I said, "it looks like you're using some
variant of Python 3.x", but it wasn't obvious exactly what version was
required by fsperf. This version instability in Python and Perl is
why Larry McVoy ended up using Tcl for Bitkeeper, by the way. That
was the only thing that was guaranteed to work everywhere, exactly the
same, without random changes added by Perl/Python innovations...

It's a little easier for me with gce/kvm-xfstests, since I'm using
Debian Stable images/chroots, and I'm not even going to _pretend_ that
I care about cross-distro portability for the Test Appliance VM's that
xfstests-bld creates. But I suspect this will be more of an issue
with xfstests.

Also, the same issues around versioning / "DLL hell" and provenance of
various high-level packages/modules exist with Perl just as much as with
Python. Just substitute CPAN for PyPI. And again, some of the
popular Perl packages have been packaged by the distros to solve the
versioning / provenance problem, but exactly _which_ packages /
modules are packaged varies from distro to distro. (Hopefully the
most popular ones will be packaged by Red Hat, SuSE and Debian
derivatives, but you'll have to check for each package / module you
want to use.)

One way of solving the problem is just including those Perl / Python
modules in the sources of xfstests itself; that way you're not
depending on which version of a particular module / package is
available on a distro, and you're also not randomly downloading
software over the network and hoping it works / hasn't been taken over
by some hostile state power. (I'd be much happier if PyPI or CPAN
used SHA checksums of what you expect to be downloaded; even if you
use Easy_Install's requirements.txt, you're still trusting PyPI to
give you what it thinks is version 1.0.0 of junitparser.)

This is why I've dropped my own copy of junitparser into the git repo
of xfstests-bld. It's the "ship your own version of the DLL" solution
to the "DLL hell" problem, but it was the best I could come up with,
especially since Debian hadn't packaged the Junitparser python module.
I also figured it was much better at lowering the blood pressure of
the friendly local secops types. :-)

- Ted

2017-10-09 12:54:16

by Josef Bacik

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Sun, Oct 08, 2017 at 11:43:35PM -0400, Theodore Ts'o wrote:
> On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> >
> > Probably should have led with that shouldn't I have? There's nothing keeping me
> > from doing it, but I didn't want to try and shoehorn in a python thing into
> > fstests. I need python to do the sqlite and the json parsing to dump into the
> > sqlite database.
>
> What version of python are you using? From inspection it looks like
> some variant of python 3.x (you're using print as a function w/o using
> "from __future import print_function") but it's not immediately
> obvious from the top-level fsperf shell script what version of python
> your scripts are dependant upon.
>
> This could potentially be a bit of a portability issue across various
> distributions --- RHEL doesn't ship with Python 3.x at all, and on
> Debian you need to use python3 to get python 3.x, since
> /usr/bin/python still points at Python 2.7 by default. So I could see
> this as a potential issue for xfstests.
>
> I'm currently using Python 2.7 in my wrapper scripts for, among other
> things, to parse xUnit XML format and create nice summaries like this:
>
> ext4/4k: 337 tests, 6 failures, 21 skipped, 3814 seconds
> Failures: generic/232 generic/233 generic/361 generic/388
> generic/451 generic/459
>
> So I'm not opposed to python, but I will note that if you are using
> modules from the Python Package Index, and they are modules which are
> not packaged by your distribution (so you're using pip or easy_install
> to pull them off the network), it does make doing hermetic builds from
> trusted sources to be a bit trickier.
>
> If you have a secops team who wants to know the provenance of software
> which get thrown in production data centers (and automated downloading
> from random external sources w/o any code review makes them break out
> in hives), use of PyPI adds a new wrinkle. It's not impossible to
> solve, by any means, but it's something to consider.
>

I purposefully used as little as possible, just json and sqlite, and I
tried to use as few python3-isms as possible. Any rpm-based system should
have these libraries already installed. I agree that using any of the PyPI
stuff is a pain and is a non-starter for me, as I want to make it as easy
as possible to get running.
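
For illustration (not from fsperf itself), the sort of stdlib-only,
2/3-agnostic style I mean, with print as a function and explicit decoding
of subprocess output, so the same file runs unchanged under 2.7 and 3.x:

from __future__ import print_function   # harmless under python 3
import json
import sqlite3
import subprocess

out = subprocess.check_output(["fio", "--version"])
print("fio:", out.decode("utf-8").strip())    # decode bytes explicitly
print("sqlite:", sqlite3.sqlite_version)
print("json:", json.__version__)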

Where do you fall on the question of including it in xfstests? I expect
that the perf stuff would be run more as maintainers put their pull
requests together, to make sure things haven't regressed. To that end I
was going to wire up xfstests-bld to run this as well. Whatever you and
Dave prefer is what I'll do; I'll use it wherever it ends up, so you two
are the ones that get to decide. Thanks,

Josef

2017-10-09 13:00:53

by Josef Bacik

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 04:17:31PM +1100, Dave Chinner wrote:
> On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> > On Mon, Oct 09, 2017 at 11:51:37AM +1100, Dave Chinner wrote:
> > > On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> > > > Hello,
> > > >
> > > > One thing that comes up a lot every LSF is the fact that we have no general way
> > > > that we do performance testing. Every fs developer has a set of scripts or
> > > > things that they run with varying degrees of consistency, but nothing central
> > > > that we all use. I for one am getting tired of finding regressions when we are
> > > > deploying new kernels internally, so I wired this thing up to try and address
> > > > this need.
> > > >
> > > > We all hate convoluted setups, the more brain power we have to put in to setting
> > > > something up the less likely we are to use it, so I took the xfstests approach
> > > > of making it relatively simple to get running and relatively easy to add new
> > > > tests. For right now the only thing this framework does is run fio scripts. I
> > > > chose fio because it already gathers loads of performance data about it's runs.
> > > > We have everything we need there, latency, bandwidth, cpu time, and all broken
> > > > down by reads, writes, and trims. I figure most of us are familiar enough with
> > > > fio and how it works to make it relatively easy to add new tests to the
> > > > framework.
> > > >
> > > > I've posted my code up on github, you can get it here
> > > >
> > > > https://github.com/josefbacik/fsperf
> > > >
> > > > All (well most) of the results from fio are stored in a local sqlite database.
> > > > Right now the comparison stuff is very crude, it simply checks against the
> > > > previous run and it only checks a few of the keys by default. You can check
> > > > latency if you want, but while writing this stuff up it seemed that latency was
> > > > too variable from run to run to be useful in a "did my thing regress or improve"
> > > > sort of way.
> > > >
> > > > The configuration is brain dead simple, the README has examples. All you need
> > > > to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> > > > to go.
> > >
> > > Why re-invent the test infrastructure? Why not just make it a
> > > tests/perf subdir in fstests?
> > >
> >
> > Probably should have led with that shouldn't I have? There's nothing keeping me
> > from doing it, but I didn't want to try and shoehorn in a python thing into
> > fstests. I need python to do the sqlite and the json parsing to dump into the
> > sqlite database.
> >
> > Now if you (and others) are not opposed to this being dropped into tests/perf
> > then I'll work that up. But it's definitely going to need to be done in python.
> > I know you yourself have said you aren't opposed to using python in the past, so
> > if that's still the case then I can definitely wire it all up.
>
> I have no problems with people using python for stuff like this but,
> OTOH, I'm not the fstests maintainer anymore :P
>
> > > > The plan is to add lots of workloads as we discover regressions and such. We
> > > > don't want anything that takes too long to run otherwise people won't run this,
> > > > so the existing tests don't take much longer than a few minutes each. I will be
> > > > adding some more comparison options so you can compare against averages of all
> > > > previous runs and such.
> > >
> > > Yup, that fits exactly into what fstests is for... :P
> > >
> > > Integrating into fstests means it will be immediately available to
> > > all fs developers, it'll run on everything that everyone already has
> > > setup for filesystem testing, and it will have familiar mkfs/mount
> > > option setup behaviour so there's no new hoops for everyone to jump
> > > through to run it...
> > >
> >
> > TBF I specifically made it as easy as possible because I know we all hate trying
> > to learn new shit.
>
> Yeah, it's also hard to get people to change their workflows to add
> a whole new test harness into them. It's easy if it's just a new
> command to an existing workflow :P
>

Agreed, so if you probably won't run this outside of fstests then I'll add it to
xfstests. I envision this tool as being run by maintainers to verify their pull
requests haven't regressed since the last set of patches, as well as by anybody
trying to fix performance problems. So it's way more important to me that you,
Ted, and all the various btrfs maintainers will run it than anybody else.

> > I figured this was different enough to warrant a separate
> > project, especially since I'm going to add block device jobs so Jens can test
> > block layer things. If we all agree we'd rather see this in fstests then I'm
> > happy to do that too. Thanks,
>
> I'm not fussed either way - it's a good discussion to have, though.
>
> If I want to add tests (e.g. my time-honoured fsmark tests), where
> should I send patches?
>

I beat you to that! I wanted to avoid adding fs_mark to the suite because
it means parsing yet another set of outputs, so I added a new ioengine to
fio for this

http://www.spinics.net/lists/fio/msg06367.html

and added a fio job to do 500k files

https://github.com/josefbacik/fsperf/blob/master/tests/500kemptyfiles.fio

The test is disabled by default for now because obviously the fio support hasn't
landed yet.

I'd _like_ to expand fio for the cases we come up with that it can't
currently do, as there's already a ton of measurements taken, especially
around latencies. That said, I'm not opposed to throwing new tools in
there; it just means we have to add code to parse their output and store
it in the database in a consistent way, which seems like more of a pain
than just making fio do what we need it to. Thanks,

Josef

2017-10-09 14:40:15

by David Sterba

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Fri, Oct 06, 2017 at 05:09:57PM -0400, Josef Bacik wrote:
> One thing that comes up a lot every LSF is the fact that we have no general way
> that we do performance testing. Every fs developer has a set of scripts or
> things that they run with varying degrees of consistency, but nothing central
> that we all use. I for one am getting tired of finding regressions when we are
> deploying new kernels internally, so I wired this thing up to try and address
> this need.
>
> We all hate convoluted setups, the more brain power we have to put in to setting
> something up the less likely we are to use it, so I took the xfstests approach
> of making it relatively simple to get running and relatively easy to add new
> tests. For right now the only thing this framework does is run fio scripts. I
> chose fio because it already gathers loads of performance data about it's runs.
> We have everything we need there, latency, bandwidth, cpu time, and all broken
> down by reads, writes, and trims. I figure most of us are familiar enough with
> fio and how it works to make it relatively easy to add new tests to the
> framework.
>
> I've posted my code up on github, you can get it here
>
> https://github.com/josefbacik/fsperf

Let me propose an existing framework that is capable of what is in
fsperf, and much more. It's Mel Gorman's mmtests:
http://github.com/gormanm/mmtests .

I've been using it for a year or so and have built a few scripts on top
of it to help me set up configs for specific machines or run tests in
sequences.

What are the capabilities regarding filesystem tests:

* create and mount filesystems (based on configs)
* start various workloads, possibly adapted to the machine (cpu,
  memory); there are many types, and we'd be interested in those
  touching filesystems
* gather system statistics - cpu, memory, IO, latency; there are scripts
  that understand the output of various benchmarking tools (fio, dbench,
  ffsb, tiobench, bonnie, fs_mark, iozone, blogbench, ...)
* export the results into plain text or html, with tables and graphs
* it is already able to compare results of several runs, with
  statistical indicators

The testsuite is actively used and maintained, which means that the
effort is mostly on the configuration side. From the user's perspective
this means spending time on the setup, and the rest will work as
expected, i.e. you don't have to start debugging the suite because of
version mismatches.

> All (well most) of the results from fio are stored in a local sqlite database.
> Right now the comparison stuff is very crude, it simply checks against the
> previous run and it only checks a few of the keys by default. You can check
> latency if you want, but while writing this stuff up it seemed that latency was
> too variable from run to run to be useful in a "did my thing regress or improve"
> sort of way.
>
> The configuration is brain dead simple, the README has examples. All you need
> to do is make your local.cfg, run ./setup and then run ./fsperf and you are good
> to go.
>
> The plan is to add lots of workloads as we discover regressions and such. We
> don't want anything that takes too long to run otherwise people won't run this,
> so the existing tests don't take much longer than a few minutes each.

Sorry, this is IMO the wrong approach to benchmarking. Can you exercise
the filesystem enough in a few minutes? Can you write at least 2x the
memory size of data to the filesystem? Everything works fine when it's
served from caches and the filesystem is fresh. With that you can simply
start using phoronix-test-suite and be done, with the same quality of
results we all roll our eyes about.

Targeted tests using fio are fine and I understand the need to keep it
minimal. mmtests has support for fio, and any jobfile can be used;
internally this is implemented with the 'fio --cmdline' option, which
transforms the jobfile into a shell variable that's passed to fio in the
end.

As proposed in the thread, why not use xfstests? It could be suitable
for the configs, mkfs/mount and running but I think it would need a lot
of work to enhance the result gathering and presentation. Essentially
duplicating mmtests from that side.

I was positively surprised by various performance monitors that I was
not primarily interested in, like memory allocations or context
switches. This gives deeper insights into the system and may help
analyzing the benchmark results.

Side note: you can run xfstests from mmtests, ie. the machine/options
configuration is shared.

I'm willing to write more about the actual usage of mmtests, but at this
point I'm proposing the whole framework for consideration.

2017-10-09 15:14:49

by Theodore Ts'o

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 08:54:16AM -0400, Josef Bacik wrote:
> I purposefully used as little as possible, just json and sqlite, and I tried to
> use as little python3 isms as possible. Any rpm based systems should have these
> libraries already installed, I agree that using any of the PyPI stuff is a pain
> and is a non-starter for me as I want to make it as easy as possible to get
> running.
>
> Where do you fall on the including it in xfstests question? I expect that the
> perf stuff would be run more as maintainers put their pull requests together to
> make sure things haven't regressed. To that end I was going to wire up
> xfstests-bld to run this as well. Whatever you and Dave prefer is what I'll do,
> I'll use it wherever it ends up so you two are the ones that get to decide.

I'm currently using Python 2.7 mainly because the LTM subsystem in
gce-xfstests was implemented using that version of Python, and my
initial efforts to port it to Python 3 were... not smooth. (Because
it was doing authentication, I got bit by the Python 2 vs Python 3
"bytes vs. strings vs. unicode" transition especially hard.)

So I'm going to be annoyed by needing to package Python 2.7 and Python
3.5 in my test VM's, but I can deal, and this will probably be the
forcing function for me to figure out how to make that jump. To be
honest, the bigger issue I'm going to have to figure out is how to
manage the state in the sqlite database across disposable VM's running
in parallel. And the assumption being made with having a static
sqlite database on a test machine is that the hardware capabilities
of the machine are static, and that's not true with a VM, whether it's
running via Qemu or in some cloud environment.

I'm not going to care that much about Python 3 not being available on
enterprise distro's, but I could see that being annoying for some
folks. I'll let them worry about that.

The main thing I think we'll need to worry about once we let Python
into xfstests is to be pretty careful about specifying what version of
Python the scripts need to be portable against (Python 3.3? 3.4?
3.5?) and what versions of python packages get imported.

The bottom line is that I'm personally supportive of adding Python and
fsperf to xfstests. We just need to be careful about the portability
concerns, not just now, but any time we check in new Python code. And
having some documented Python style guidelines, and adding unit tests
so we can notice potential portability breakages would be a good idea.

- Ted

2017-10-09 21:09:20

by Dave Chinner

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Mon, Oct 09, 2017 at 09:00:51AM -0400, Josef Bacik wrote:
> On Mon, Oct 09, 2017 at 04:17:31PM +1100, Dave Chinner wrote:
> > On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> > > > Integrating into fstests means it will be immediately available to
> > > > all fs developers, it'll run on everything that everyone already has
> > > > setup for filesystem testing, and it will have familiar mkfs/mount
> > > > option setup behaviour so there's no new hoops for everyone to jump
> > > > through to run it...
> > > >
> > >
> > > TBF I specifically made it as easy as possible because I know we all hate trying
> > > to learn new shit.
> >
> > Yeah, it's also hard to get people to change their workflows to add
> > a whole new test harness into them. It's easy if it's just a new
> > command to an existing workflow :P
> >
>
> Agreed, so if you probably won't run this outside of fstests then I'll add it to
> xfstests. I envision this tool as being run by maintainers to verify their pull
> requests haven't regressed since the last set of patches, as well as by anybody
> trying to fix performance problems. So it's way more important to me that you,
> Ted, and all the various btrfs maintainers will run it than anybody else.
>
> > > I figured this was different enough to warrant a separate
> > > project, especially since I'm going to add block device jobs so Jens can test
> > > block layer things. If we all agree we'd rather see this in fstests then I'm
> > > happy to do that too. Thanks,
> >
> > I'm not fussed either way - it's a good discussion to have, though.
> >
> > If I want to add tests (e.g. my time-honoured fsmark tests), where
> > should I send patches?
> >
>
> I beat you to that! I wanted to avoid adding fs_mark to the suite because it
> means parsing another different set of outputs, so I added a new ioengine to fio
> for this
>
> http://www.spinics.net/lists/fio/msg06367.html
>
> and added a fio job to do 500k files
>
> https://github.com/josefbacik/fsperf/blob/master/tests/500kemptyfiles.fio
>
> The test is disabled by default for now because obviously the fio support hasn't
> landed yet.

That seems .... misguided. fio is good, but it's not a universal
solution.

> I'd _like_ to expand fio for cases we come up with that aren't possible, as
> there's already a ton of measurements that are taken, especially around
> latencies.

To be properly useful it needs to support more than just fio to run
tests. Indeed, it's largely useless to me if that's all it can do or
it's a major pain to add support for different tools like fsmark.

e.g. my typical perf regression test that you see as the concurrent
fsmark create workload is actually a lot more than that. It does:

fsmark to create 50m zero length files
umount,
run parallel xfs_repair (excellent mmap_sem/page fault punisher)
mount
run parallel find -ctime (readdir + lookup traversal)
unmount, mount
run parallel ls -R (readdir + dtype traversal)
unmount, mount
parallel rm -rf of 50m files

I have variants that use small 4k files or large files rather than
empty files, that use different fsync patterns to stress the
log, that use grep -R to traverse the data as well as
the directory/inode structure instead of find, etc.
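
For illustration only (these are not the actual scripts): a rough sketch
of how such a sequence might be driven by a harness that can run arbitrary
commands rather than just fio. The device, mount point, concurrency and
fs_mark arguments are placeholder assumptions:

import subprocess

DEV = "/dev/vdb"      # hypothetical scratch device
MNT = "/mnt/scratch"  # hypothetical mount point
NPROC = 16            # hypothetical concurrency

def run(cmd):
    print(" ".join(cmd))
    subprocess.check_call(cmd)

def remount():
    run(["umount", MNT])
    run(["mount", DEV, MNT])

# create phase: lots of zero length files, one directory per thread
# (fs_mark arguments are illustrative, not a tuned invocation)
create = ["fs_mark", "-S0", "-s", "0", "-n", "100000", "-L", "32",
          "-t", str(NPROC)]
for i in range(NPROC):
    create += ["-d", "%s/dir%d" % (MNT, i)]
run(create)

# unmount and repair (mmap_sem / page fault punisher)
run(["umount", MNT])
run(["xfs_repair", DEV])
run(["mount", DEV, MNT])

# parallel traversals, remounting in between to drop caches
run(["bash", "-c",
     "ls -d %s/dir* | xargs -P%d -I{} find {} -ctime +0 > /dev/null"
     % (MNT, NPROC)])
remount()
run(["bash", "-c",
     "ls -d %s/dir* | xargs -P%d -I{} ls -R {} > /dev/null" % (MNT, NPROC)])
remount()

# parallel removal
run(["bash", "-c",
     "ls -d %s/dir* | xargs -P%d -I{} rm -rf {}" % (MNT, NPROC)])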

> That said I'm not opposed to throwing new stuff in there, it just
> means we have to add stuff to parse the output and store it in the database in a
> consistent way, which seems like more of a pain than just making fio do what we
> need it to. Thanks,

fio is not going to be able to replace the sort of perf tests I run
from week to week. If that's all it's going to do then it's not
directly useful to me...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2017-10-09 21:15:02

by Josef Bacik

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Tue, Oct 10, 2017 at 08:09:20AM +1100, Dave Chinner wrote:
> On Mon, Oct 09, 2017 at 09:00:51AM -0400, Josef Bacik wrote:
> > On Mon, Oct 09, 2017 at 04:17:31PM +1100, Dave Chinner wrote:
> > > On Sun, Oct 08, 2017 at 10:25:10PM -0400, Josef Bacik wrote:
> > > > > Integrating into fstests means it will be immediately available to
> > > > > all fs developers, it'll run on everything that everyone already has
> > > > > setup for filesystem testing, and it will have familiar mkfs/mount
> > > > > option setup behaviour so there's no new hoops for everyone to jump
> > > > > through to run it...
> > > > >
> > > >
> > > > TBF I specifically made it as easy as possible because I know we all hate trying
> > > > to learn new shit.
> > >
> > > Yeah, it's also hard to get people to change their workflows to add
> > > a whole new test harness into them. It's easy if it's just a new
> > > command to an existing workflow :P
> > >
> >
> > Agreed, so if you probably won't run this outside of fstests then I'll add it to
> > xfstests. I envision this tool as being run by maintainers to verify their pull
> > requests haven't regressed since the last set of patches, as well as by anybody
> > trying to fix performance problems. So it's way more important to me that you,
> > Ted, and all the various btrfs maintainers will run it than anybody else.
> >
> > > > I figured this was different enough to warrant a separate
> > > > project, especially since I'm going to add block device jobs so Jens can test
> > > > block layer things. If we all agree we'd rather see this in fstests then I'm
> > > > happy to do that too. Thanks,
> > >
> > > I'm not fussed either way - it's a good discussion to have, though.
> > >
> > > If I want to add tests (e.g. my time-honoured fsmark tests), where
> > > should I send patches?
> > >
> >
> > I beat you to that! I wanted to avoid adding fs_mark to the suite because it
> > means parsing another different set of outputs, so I added a new ioengine to fio
> > for this
> >
> > http://www.spinics.net/lists/fio/msg06367.html
> >
> > and added a fio job to do 500k files
> >
> > https://github.com/josefbacik/fsperf/blob/master/tests/500kemptyfiles.fio
> >
> > The test is disabled by default for now because obviously the fio support hasn't
> > landed yet.
>
> That seems .... misguided. fio is good, but it's not a universal
> solution.
>
> > I'd _like_ to expand fio for cases we come up with that aren't possible, as
> > there's already a ton of measurements that are taken, especially around
> > latencies.
>
> To be properly useful it needs to support more than just fio to run
> tests. Indeed, it's largely useless to me if that's all it can do or
> it's a major pain to add support for different tools like fsmark.
>
> e.g. my typical perf regression test that you see the concurrnet
> fsmark create workload is actually a lot more. It does:
>
> fsmark to create 50m zero length files
> umount,
> run parallel xfs_repair (excellent mmap_sem/page fault punisher)
> mount
> run parallel find -ctime (readdir + lookup traversal)
> unmount, mount
> run parallel ls -R (readdir + dtype traversal)
> unmount, mount
> parallel rm -rf of 50m files
>
> I have variants that use small 4k files or large files rather than
> empty files, taht use different fsync patterns to stress the
> log, use grep -R to traverse the data as well as
> the directory/inode structure instead of find, etc.
>
> > That said I'm not opposed to throwing new stuff in there, it just
> > means we have to add stuff to parse the output and store it in the database in a
> > consistent way, which seems like more of a pain than just making fio do what we
> > need it to. Thanks,
>
> fio is not going to be able to replace the sort of perf tests I run
> from week to week. If that's all it's going to do then it's not
> directly useful to me...
>

Agreed. I'm just going to add this stuff to fstests, since I'd like to be
able to use the _require infrastructure to make sure we only run tests we
have support for. I'll wire that up this week and send patches along.
Thanks,

Josef

2017-10-10 09:00:53

by Mel Gorman

Subject: Re: [ANNOUNCE] fsperf: a simple fs/block performance testing framework

On Tue, Oct 10, 2017 at 08:09:20AM +1100, Dave Chinner wrote:
> > I'd _like_ to expand fio for cases we come up with that aren't possible, as
> > there's already a ton of measurements that are taken, especially around
> > latencies.
>
> To be properly useful it needs to support more than just fio to run
> tests. Indeed, it's largely useless to me if that's all it can do or
> it's a major pain to add support for different tools like fsmark.
>
> e.g. my typical perf regression test that you see the concurrnet
> fsmark create workload is actually a lot more. It does:
>
> fsmark to create 50m zero length files
> umount,
> run parallel xfs_repair (excellent mmap_sem/page fault punisher)
> mount
> run parallel find -ctime (readdir + lookup traversal)
> unmount, mount
> run parallel ls -R (readdir + dtype traversal)
> unmount, mount
> parallel rm -rf of 50m files
>
> I have variants that use small 4k files or large files rather than
> empty files, taht use different fsync patterns to stress the
> log, use grep -R to traverse the data as well as
> the directory/inode structure instead of find, etc.
>

FWIW, this is partially implemented in mmtests as
configs/config-global-dhp__io-xfsrepair. It covers the fsmark and
xfs_repair part and an example report is

http://beta.suse.com/private/mgorman/results/home/marvin/openSUSE-LEAP-42.2/global-dhp__io-xfsrepair-xfs/delboy/#xfsrepair

(ignore 4.12.603, it's 4.12.3-stable with some additional patches that were
pending for -stable at the time the test was executed). That config was
added after a discussion with you a few years ago and I've kept it since as
it has been useful in a number of contexts. Adding additional tests to cover
parallel find, parallel ls and parallel rm would be relatively trivial but
it's not there. This test doesn't have proper graphing support, but
that could be added in 10-15 minutes as xfsrepair is the primary metric
and it's simply reported as elapsed time.

fsmark is also covered, albeit not necessarily with the parameters
everyone wants, as configs/config-global-dhp__io-metadata in mmtests. An
example report is

http://beta.suse.com/private/mgorman/results/home/marvin/openSUSE-LEAP-42.2/global-dhp__io-metadata-xfs/delboy/#fsmark-threaded

mmtests has been modified multiple times as methodologies were improved,
and it's far from perfect, but it seems to me that fsperf is going to end
up reimplementing a lot of it.

It's not perfect; there are multiple quality-of-implementation issues,
as it often takes the shortest path to being able to collect data, but it
improves over time. When a test is found to be flawed, it's fixed and the
historical data is discarded. It doesn't store data in sqlite or anything
fancy; just the raw logs are preserved and reports are generated as
required. In terms of tools required, the core is just bash scripts. Some
of the tests require a number of packages to be installed, but not all of
them. It uses a tool to install packages if they are missing, but the
naming is all based on opensuse. It *can* map opensuse package names to
fedora and debian, but the mappings are not up-to-date as I do not
personally run those distributions.

Even with the quality-of-implementation issues, it seems to me that it
covers a lot of the requirements that fsperf aims for.

--
Mel Gorman
SUSE Labs