LinuxLists.cc - Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

2014-06-02 11:14:37

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

On Fri 2014-05-30 19:28:04, Tejun Heo wrote:
> Hello,
>
> On Sat, May 31, 2014 at 12:23:01AM +0200, Paolo Valente wrote:
> > I do agree that bfq has essentially the same purpose as cfq. I am
> > not sure that it is what you are proposing, but, in my opinion,
> > since both the engine and all the new heuristics of bfq differ from
> > those of cfq, a replacement would be most certainly a much easier
> > solution than any other transformation of cfq into bfq (needless to
> > say, leaving the same name for the scheduler would not be a problem
> > for me). Of course, before that we are willing to improve what has
> > to be improved in bfq.
>
> Well, it's all about how to actually route the changes and in general
> whenever avoidable we try to avoid whole-sale code replacement
> especially when most of the structural code is similar like in this
> case. Gradually evolving cfq to bfq is likely to take more work but
> I'm very positive that it'd definitely be a lot easier to merge the
> changes that way and people involved, including the developers and
> reviewers, would acquire a lot clearer picture of what's going on in
> the process. For example, AFAICS, most of the heuristics added by

Would it make sense to merge bfq first, _then_ turn cfq into bfq, then
remove bfq?

That way

1. Users like me would see improvements soon

2. BFQ would get more testing early.

3. If there are some problems in some workload, switching between bfq
and cfq will be easier than playing with git/patches.

Now.. I see it is more work for storage maintainers, because there'll
be more code to maintain in the interim. But perhaps user advantages
are worth it?

Thanks,

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-02 13:02:31

by Pavel Machek

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hi!

> > Well, it's all about how to actually route the changes and in general
> > whenever avoidable we try to avoid whole-sale code replacement
> > especially when most of the structural code is similar like in this
> > case. Gradually evolving cfq to bfq is likely to take more work but
> > I'm very positive that it'd definitely be a lot easier to merge the
> > changes that way and people involved, including the developers and
> > reviewers, would acquire a lot clearer picture of what's going on in
> > the process. For example, AFAICS, most of the heuristics added by
>
> Would it make sense to merge bfq first, _then_ turn cfq into bfq, then
> remove bfq?
>
> That way
>
> 1. Users like me would see improvements soon
>
> 2. BFQ would get more testing early.

Like this: I applied patch over today's git...

I only see last bits of panic...

Call trace:
__bfq_bfqq_expire
bfq_bfqq_expire
bfq_dispatch_requests
sci_request_fn
...
EIP: T.1839+0x26
Kernel panic - not syncing: Fatal exception in interrupt
Shutting down cpus with NMI

...

Will retry.

Any ideas?
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-02 17:33:37

by Tejun Heo

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hello, Pavel.

On Mon, Jun 02, 2014 at 01:14:33PM +0200, Pavel Machek wrote:
> Now.. I see it is more work for storage maintainers, because there'll
> be more code to maintain in the interim. But perhaps user advantages
> are worth it?

I'm quite skeptical about going that route. Not necessarily because
of the extra amount of work but more the higher probability of getting
into situation where we can neither push forward or back out. It's
difficult to define clear deadline and there will likely be unforeseen
challenges in the planned convergence of the two schedulers,
eventually, it isn't too unlikely to be in a situation where we have
to admit defeat and just keep both schedulers. Note that developer
overhead isn't the only factor here. Providing two slightly different
alternatives inevitably makes userland grow dependencies on subtleties
of both and there's a lot less pressure to make judgement calls and
take appropriate trade-offs, which have fairly high chance of
deadlocking progress towards any direction.

Thanks.

--
tejun

2014-06-03 04:12:50

by Mike Galbraith

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

On Mon, 2014-06-02 at 13:33 -0400, Tejun Heo wrote:
> Hello, Pavel.
>
> On Mon, Jun 02, 2014 at 01:14:33PM +0200, Pavel Machek wrote:
> > Now.. I see it is more work for storage maintainers, because there'll
> > be more code to maintain in the interim. But perhaps user advantages
> > are worth it?
>
> I'm quite skeptical about going that route. Not necessarily because
> of the extra amount of work but more the higher probability of getting
> into situation where we can neither push forward or back out. It's
> difficult to define clear deadline and there will likely be unforeseen
> challenges in the planned convergence of the two schedulers,
> eventually, it isn't too unlikely to be in a situation where we have
> to admit defeat and just keep both schedulers. Note that developer
> overhead isn't the only factor here. Providing two slightly different
> alternatives inevitably makes userland grow dependencies on subtleties
> of both and there's a lot less pressure to make judgement calls and
> take appropriate trade-offs, which have fairly high chance of
> deadlocking progress towards any direction.

But OTOH..

This thing (allegedly) fixes issues that have existed for ages, issues
which have (also allegedly) not been fixed in all that time despite a
number of people having done a lot of this and that over the years. If
the claims are true, seems to me that would make BFQ a bit special, and
perhaps worth some extra leeway and effort to ensure that what we are
being offered on a silver plate doesn't molder away out of tree forever.

If it were say put in staging, and it were stated right up front that it
isn't ever going to go further (Jens already said that more or less),
and _will_ drop dead if it stagnates, that would surely increase the
test base to shake out problem spots (surely it has some), and allow
users who meet an issue in either IO scheduler to verify it with the
flick of a switch every step of the way to whichever ending, and maybe
even motivate other IO people to help with the merge and/or to compare
their changes at the flick of that same switch.

-Mike

2014-06-03 16:55:04

by Paolo Valente

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

On 02 Jun 2014, at 15:02, Pavel Machek <[email protected]> wrote:

> Hi!
>
>>> Well, it's all about how to actually route the changes and in general
>>> whenever avoidable we try to avoid whole-sale code replacement
>>> especially when most of the structural code is similar like in this
>>> case. Gradually evolving cfq to bfq is likely to take more work but
>>> I'm very positive that it'd definitely be a lot easier to merge the
>>> changes that way and people involved, including the developers and
>>> reviewers, would acquire a lot clearer picture of what's going on in
>>> the process. For example, AFAICS, most of the heuristics added by
>>
>> Would it make sense to merge bfq first, _then_ turn cfq into bfq, then
>> remove bfq?
>>
>> That way
>>
>> 1. Users like me would see improvements soon
>>
>> 2. BFQ would get more testing early.
>
> Like this: I applied patch over today's git...
>
> I only see last bits of panic...
>
> Call trace:
> __bfq_bfqq_expire
> bfq_bfqq_expire
> bfq_dispatch_requests
> sci_request_fn
> ...
> EIP: T.1839+0x26
> Kernel panic - not syncing: Fatal exception in interrupt
> Shutting down cpus with NMI
>
> ...
>
> Will retry.
>
> Any ideas?
>

We have tried to think of ways to trigger this failure, but in vain. Unfortunately, no user has so far reported any failure with this latest version of bfq either. Finally, we ran a new round of static analysis, but that turned up nothing as well.

So, if you are willing to retry, we have put online a version of the code filled with many BUG_ONs. I hope they can make it easier to track down the bug. The archive is here:
http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.15.0-rc8-v7rc5.tgz

Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.

Thanks,
Paolo

> Pavel
> --
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-03 20:41:07

by Pavel Machek

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hi!

> >>> Well, it's all about how to actually route the changes and in general
> >>> whenever avoidable we try to avoid whole-sale code replacement
> >>> especially when most of the structural code is similar like in this
> >>> case. Gradually evolving cfq to bfq is likely to take more work but
> >>> I'm very positive that it'd definitely be a lot easier to merge the
> >>> changes that way and people involved, including the developers and
> >>> reviewers, would acquire a lot clearer picture of what's going on in
> >>> the process. For example, AFAICS, most of the heuristics added by
> >>
> >> Would it make sense to merge bfq first, _then_ turn cfq into bfq, then
> >> remove bfq?
> >>
> >> That way
> >>
> >> 1. Users like me would see improvements soon
> >>
> >> 2. BFQ would get more testing early.
> >
> > Like this: I applied patch over today's git...
> >
> > I only see last bits of panic...
> >
> > Call trace:
> > __bfq_bfqq_expire
> > bfq_bfqq_expire
> > bfq_dispatch_requests
> > sci_request_fn
> > ...
> > EIP: T.1839+0x26
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Shutting down cpus with NMI
> >
> > ...
> >
> > Will retry.
> >
> > Any ideas?
> >

> We have tried to think about ways to trigger this failure, but in
> vain. Unfortunately, so far no user has reported any failure with
> this last version of bfq either. Finally, we have gone through a new
> static analysis, but also in this case uselessly.

Ok, it is pretty much reproducible here: system just will not finish
booting.

> So, if you are willing to retry, we have put online a version of the code filled with many BUG_ONs. I hope they can make it easier to track down the bug. The archive is here:
> http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.15.0-rc8-v7rc5.tgz
>

Ok, let me try.

> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
>

It is thinkpad x60 notebook, x86-32 machine with 2GB ram.

But I think it died on my x86-32 core duo desktop, too.

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-04 08:40:00

by Pavel Machek

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hi!

> > Like this: I applied patch over today's git...
> >
> > I only see last bits of panic...
> >
> > Call trace:
> > __bfq_bfqq_expire
> > bfq_bfqq_expire
> > bfq_dispatch_requests
> > sci_request_fn
> > ...
> > EIP: T.1839+0x26

> > Any ideas?
> >
>
> We have tried to think about ways to trigger this failure, but in vain. Unfortunately, so far no user has reported any failure with this last version of bfq either. Finally, we have gone through a new static analysis, but also in this case uselessly.
>
> So, if you are willing to retry, we have put online a version of the code filled with many BUG_ONs. I hope they can make it easier to track down the bug. The archive is here:
> http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.15.0-rc8-v7rc5.tgz
>

BUG: Unable to handle kernel paging request at dee22fa0
IP: bfq_del_bfqq_busy+0x4d
...
Tainted: GW 3.15.0-rc8+
...
Call trace:
__bfq_bfqq_expire
bfq_bfqq_expire
? bfq_bfqq_expire
? bfq_bfqq_expire
bfq_idle_slice_timer
call_timer_fn
...

> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
>

See the previous email...

Best regards,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-04 09:09:01

by Pavel Machek

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hi!

> Should this attempt be useless as well, I will, if you do not mind,
> try by asking you more details about your system and reproducing your
> configuration as much as I can.

It fails during boot or shortly after that when clicking in gnome2
desktop. I had BFQ as a default scheduler.

Now I set CFQ as a default and it boots (as expected).

root@duo:~# cat /sys/block/sda/queue/scheduler
noop deadline [cfq] bfq
root@duo:~# echo bfq > /sys/block/sda/queue/scheduler
root@duo:~# dmesg | grep WARN
WARNING: CPU: 1 PID: 1 at net/wireless/reg.c:479
regulatory_init+0x88/0xf5()
root@duo:~#

Hmm, and I seem to have pretty much functional system.
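
The bracketed entry in the sysfs output above marks the scheduler currently in use. A minimal sketch, assuming only the `name [active] name` layout shown in that output, of how a script could extract it; the function name `active_scheduler` is hypothetical and not part of any kernel tooling:

```shell
# Hypothetical helper: print the scheduler shown in [brackets].
# $1 is the text of /sys/block/<dev>/queue/scheduler,
# for example "noop deadline [cfq] bfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

active_scheduler 'noop deadline [cfq] bfq'   # prints: cfq
```

On a live system this would be called as `active_scheduler "$(cat /sys/block/sda/queue/scheduler)"`, which makes it easy to confirm that the `echo bfq > ...` step actually took effect before starting a test.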

I'll try to do some benchmarks now.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-04 10:04:03

by Pavel Machek

Subject: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

Hi!

> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
>

Try making BFQ the default scheduler. That seems to break it for me,
when selected at runtime, it looks stable.

Anyway, here are some speed tests. Background load:

root@duo:/data/tmp# echo cfq > /sys/block/sda/queue/scheduler
root@duo:/data/tmp# echo 3 > /proc/sys/vm/drop_caches
root@duo:/data/tmp# cat /dev/zero > delme; cat /dev/zero > delme;cat
/dev/zero > delme;cat /dev/zero > delme;cat /dev/zero > delme;cat
/dev/zero > delme

(Machine was running out of disk space.)

(I alternate between cfq and bfq).
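
The alternation between schedulers described above can be sketched as a small helper. This is an outline under stated assumptions, not the script actually used for these runs: `run_under`, `SCHED_FILE` and `DROP_FILE` are hypothetical names, and the default paths are simply the ones from the commands above. Writing to those paths requires root.

```shell
# Hypothetical helper: select an I/O scheduler, drop the page cache,
# then run the given command.
# SCHED_FILE and DROP_FILE default to the paths used in the commands above;
# both can be pointed at scratch files for a dry run.
SCHED_FILE=${SCHED_FILE:-/sys/block/sda/queue/scheduler}
DROP_FILE=${DROP_FILE:-/proc/sys/vm/drop_caches}

run_under() {
    sched=$1
    shift
    # Select the scheduler (cfq or bfq) for the device.
    echo "$sched" > "$SCHED_FILE" || return 1
    # Drop page cache, dentries and inodes so each run starts cold.
    echo 3 > "$DROP_FILE" || return 1
    # Run the benchmark command itself.
    "$@"
}
```

In bash, a run would then look like `time run_under bfq git describe` followed by `time run_under cfq git describe`, with the background `cat /dev/zero > delme` loop running in another terminal.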

Benchmark. I chose git describe because it is part of kernel build
sometimes .. and I actually wait for that.

pavel@duo:/data/l/linux-good$ time git describe
warning: refname 'HEAD' is ambiguous.
v3.15-rc8-144-g405dedd

Unfortunately, results are not too good for BFQ. (Can you replicate
the results?)

# BFQ
10.24user 1.62system 467.02 (7m47.028s) elapsed 2.54%CPU
# CFQ
8.55user 1.26system 69.57 (1m9.577s) elapsed 14.11%CPU
# BFQ
11.70user 3.18system 1491.59 (24m51.599s) elapsed 0.99%CPU
# CFQ, no background load
8.51user 0.75system 30.99 (0m30.994s) elapsed 29.91%CPU
# CFQ
8.70user 1.36system 74.72 (1m14.720s) elapsed 13.48%CPU
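
To put the elapsed times above in proportion, the ratios can be worked out directly. A small sketch, assuming awk is available; `slowdown` is a hypothetical helper name, and the inputs are the elapsed seconds reported above:

```shell
# Hypothetical helper: ratio of two elapsed times, to one decimal place.
slowdown() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

slowdown 467.02 69.57    # first BFQ run vs. first CFQ run: prints 6.7
slowdown 1491.59 74.72   # second BFQ run vs. last CFQ run: prints 20.0
slowdown 69.57 30.99     # CFQ under load vs. CFQ without load: prints 2.2
```

So under this background write load, `git describe` took roughly 7 to 20 times longer with BFQ than with CFQ, while CFQ itself slowed down by a bit more than a factor of two relative to the unloaded run.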

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-04 10:24:52

by Paolo Valente

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

2014-06-04 11:59:40

by Takashi Iwai

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

At Wed, 4 Jun 2014 12:24:30 +0200,
Paolo Valente wrote:
>
>
> On 04 Jun 2014, at 12:03, Pavel Machek <[email protected]> wrote:
>
> > Hi!
> >
> >> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
> >>
> >
> > Try making BFQ the default scheduler. That seems to break it for me,
> > when selected at runtime, it looks stable.
> >
> > Anyway, here are some speed tests. Background load:
> >
> > root@duo:/data/tmp# echo cfq > /sys/block/sda/queue/scheduler
> > root@duo:/data/tmp# echo 3 > /proc/sys/vm/drop_caches
> > root@duo:/data/tmp# cat /dev/zero > delme; cat /dev/zero > delme;cat
> > /dev/zero > delme;cat /dev/zero > delme;cat /dev/zero > delme;cat
> > /dev/zero > delme
> >
> > (Machine was running out of disk space.)
> >
> > (I alternate between cfq and bfq).
> >
> > Benchmark. I chose git describe because it is part of kernel build
> > sometimes .. and I actually wait for that.
> >
> > pavel@duo:/data/l/linux-good$ time git describe
> > warning: refname 'HEAD' is ambiguous.
> > v3.15-rc8-144-g405dedd
> >
> > Unfortunately, results are not too good for BFQ. (Can you replicate
> > the results?)
> >
> > # BFQ
> > 10.24user 1.62system 467.02 (7m47.028s) elapsed 2.54%CPU
> > # CFQ
> > 8.55user 1.26system 69.57 (1m9.577s) elapsed 14.11%CPU
> > # BFQ
> > 11.70user 3.18system 1491.59 (24m51.599s) elapsed 0.99%CPU
> > # CFQ, no background load
> > 8.51user 0.75system 30.99 (0m30.994s) elapsed 29.91%CPU
> > # CFQ
> > 8.70user 1.36system 74.72 (1m14.720s) elapsed 13.48%CPU
> >
>
> Definitely bad, we are about to repeat the test …

I've been using BFQ for a while and noticed also some obvious
regression in some operations, notably git, too.
For example, git grep regresses badly.

I ran "time git grep foo > /dev/null" on linux kernel repos on both
rotational disk and SSD.

Rotational disk:
CFQ:
2.32user 3.48system 1:46.97elapsed 5%CPU
2.33user 3.41system 1:48.30elapsed 5%CPU
2.30user 3.54system 1:48.01elapsed 5%CPU

BFQ:
2.41user 3.22system 2:51.96elapsed 3%CPU
2.40user 3.19system 2:50.35elapsed 3%CPU
2.43user 3.11system 2:46.49elapsed 3%CPU

SSD:
CFQ:
2.37user 3.18system 0:04.70elapsed 118%CPU
2.28user 3.26system 0:04.69elapsed 118%CPU
2.21user 3.33system 0:04.69elapsed 118%CPU

BFQ:
2.35user 2.82system 1:07.85elapsed 7%CPU
2.32user 2.90system 0:57.57elapsed 9%CPU
2.39user 2.90system 0:55.03elapsed 9%CPU

It's without background task.

BFQ seems to behave badly when reading many small files.
When I ran "git grep foo HEAD", i.e. searching the packed
repository, the results of both BFQ and CFQ became almost the same, as
expected:

SSD:
CFQ:
7.25user 0.47system 0:09.79elapsed 78%CPU
7.26user 0.43system 0:09.75elapsed 78%CPU
7.26user 0.43system 0:09.76elapsed 78%CPU

BFQ:
7.24user 0.45system 0:09.93elapsed 77%CPU
7.31user 0.42system 0:09.90elapsed 78%CPU
7.28user 0.42system 0:09.86elapsed 78%CPU
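
The runs above are easier to compare once the `M:SS.ss` elapsed fields are converted to seconds and averaged. A minimal sketch, assuming the `...user ...system ...elapsed ...%CPU` line layout shown above; `mean_elapsed` is a hypothetical helper name:

```shell
# Hypothetical helper: read timing lines on stdin, find the field ending in
# "elapsed", convert [H:]M:SS.ss to seconds, and print the mean of all runs.
mean_elapsed() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /elapsed$/) {
                t = $i
                sub(/elapsed$/, "", t)
                n = split(t, part, ":")
                secs = 0
                for (j = 1; j <= n; j++) secs = secs * 60 + part[j]
                total += secs
                runs++
            }
        }
    }
    END { if (runs > 0) printf "%.2f\n", total / runs }
    '
}

mean_elapsed <<'EOF'   # "git grep foo", BFQ on SSD: prints 60.15
2.35user 2.82system 1:07.85elapsed 7%CPU
2.32user 2.90system 0:57.57elapsed 9%CPU
2.39user 2.90system 0:55.03elapsed 9%CPU
EOF

mean_elapsed <<'EOF'   # "git grep foo", CFQ on SSD: prints 4.69
2.37user 3.18system 0:04.70elapsed 118%CPU
2.28user 3.26system 0:04.69elapsed 118%CPU
2.21user 3.33system 0:04.69elapsed 118%CPU
EOF
```

On these numbers the working-tree `git grep` on the SSD averages about 60 seconds with BFQ against under 5 seconds with CFQ, roughly a 13-fold difference.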

thanks,

Takashi

2014-06-04 12:13:06

by Paolo Valente

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

On 04 Jun 2014, at 13:59, Takashi Iwai <[email protected]> wrote:

> At Wed, 4 Jun 2014 12:24:30 +0200,
> Paolo Valente wrote:
>>
>>
>> On 04 Jun 2014, at 12:03, Pavel Machek <[email protected]> wrote:
>>
>>> Hi!
>>>
>>>> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
>>>>
>>>
>>> Try making BFQ the default scheduler. That seems to break it for me,
>>> when selected at runtime, it looks stable.
>>>
>>> Anyway, here are some speed tests. Background load:
>>>
>>> root@duo:/data/tmp# echo cfq > /sys/block/sda/queue/scheduler
>>> root@duo:/data/tmp# echo 3 > /proc/sys/vm/drop_caches
>>> root@duo:/data/tmp# cat /dev/zero > delme; cat /dev/zero > delme;cat
>>> /dev/zero > delme;cat /dev/zero > delme;cat /dev/zero > delme;cat
>>> /dev/zero > delme
>>>
>>> (Machine was running out of disk space.)
>>>
>>> (I alternate between cfq and bfq).
>>>
>>> Benchmark. I chose git describe because it is part of kernel build
>>> sometimes .. and I actually wait for that.
>>>
>>> pavel@duo:/data/l/linux-good$ time git describe
>>> warning: refname 'HEAD' is ambiguous.
>>> v3.15-rc8-144-g405dedd
>>>
>>> Unfortunately, results are not too good for BFQ. (Can you replicate
>>> the results?)
>>>
>>> # BFQ
>>> 10.24user 1.62system 467.02 (7m47.028s) elapsed 2.54%CPU
>>> # CFQ
>>> 8.55user 1.26system 69.57 (1m9.577s) elapsed 14.11%CPU
>>> # BFQ
>>> 11.70user 3.18system 1491.59 (24m51.599s) elapsed 0.99%CPU
>>> # CFQ, no background load
>>> 8.51user 0.75system 30.99 (0m30.994s) elapsed 29.91%CPU
>>> # CFQ
>>> 8.70user 1.36system 74.72 (1m14.720s) elapsed 13.48%CPU
>>>
>>
>> Definitely bad, we are about to repeat the test …
>
> I've been using BFQ for a while and noticed also some obvious
> regression in some operations, notably git, too.
> For example, git grep regresses badly.
>
> I ran "time git grep foo > /dev/null" on linux kernel repos on both
> rotational disk and SSD.
>
> Rotational disk:
> CFQ:
> 2.32user 3.48system 1:46.97elapsed 5%CPU
> 2.33user 3.41system 1:48.30elapsed 5%CPU
> 2.30user 3.54system 1:48.01elapsed 5%CPU
>
> BFQ:
> 2.41user 3.22system 2:51.96elapsed 3%CPU
> 2.40user 3.19system 2:50.35elapsed 3%CPU
> 2.43user 3.11system 2:46.49elapsed 3%CPU
>
> SSD:
> CFQ:
> 2.37user 3.18system 0:04.70elapsed 118%CPU
> 2.28user 3.26system 0:04.69elapsed 118%CPU
> 2.21user 3.33system 0:04.69elapsed 118%CPU
>
> BFQ:
> 2.35user 2.82system 1:07.85elapsed 7%CPU
> 2.32user 2.90system 0:57.57elapsed 9%CPU
> 2.39user 2.90system 0:55.03elapsed 9%CPU
>
> It's without background task.
>
> BFQ seems to behave badly when reading many small files.

We ran this type of test (plus checkout, merge and compilation) a long time ago, and the performance was about the same as or better than with CFQ. Unfortunately, we have not repeated these tests since then.

We are already trying to understand what is going wrong.

Thanks,
Paolo

> When I ran "git grep foo HEAD", i.e. searching the packed
> repository, the results of both BFQ and CFQ became almost the same, as
> expected:
>
> SSD:
> CFQ:
> 7.25user 0.47system 0:09.79elapsed 78%CPU
> 7.26user 0.43system 0:09.75elapsed 78%CPU
> 7.26user 0.43system 0:09.76elapsed 78%CPU
>
> BFQ:
> 7.24user 0.45system 0:09.93elapsed 77%CPU
> 7.31user 0.42system 0:09.90elapsed 78%CPU
> 7.28user 0.42system 0:09.86elapsed 78%CPU
>
>
> thanks,
>
> Takashi

--
Paolo Valente
Algogroup
Dipartimento di Fisica, Informatica e Matematica
Via Campi, 213/B
41125 Modena - Italy
homepage: http://algogroup.unimore.it/people/paolo/

2014-06-04 22:31:56

by Pavel Machek

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

Hi!

On Mon 2014-06-02 13:33:32, Tejun Heo wrote:
> On Mon, Jun 02, 2014 at 01:14:33PM +0200, Pavel Machek wrote:
> > Now.. I see it is more work for storage maintainers, because there'll
> > be more code to maintain in the interim. But perhaps user advantages
> > are worth it?
>
> I'm quite skeptical about going that route. Not necessarily because
> of the extra amount of work but more the higher probability of getting
> into situation where we can neither push forward or back out. It's
> difficult to define clear deadline and there will likely be unforeseen
> challenges in the planned convergence of the two schedulers,
> eventually, it isn't too unlikely to be in a situation where we have
> to admit defeat and just keep both schedulers. Note that developer

Yes, that might happen. But it appears that conditions that would
make us stuck with CFQ&BFQ are the same conditions that would make us
stuck with CFQ alone.

And if BFQ is really better for interactivity under load, I'd really
really like option to use it, even if it leads to regression under
batch loads (or something else)...

> overhead isn't the only factor here. Providing two slightly different
> alternatives inevitably makes userland grow dependencies on subtleties
> of both and there's a lot less pressure to make judgement calls and

Dunno. It is just the scheduler. It makes stuff slower or faster, but
should not affect userland too badly.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2014-06-05 02:14:10

by Jens Axboe

Subject: Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler

On 2014-06-04 16:31, Pavel Machek wrote:
> Hi!
>
> On Mon 2014-06-02 13:33:32, Tejun Heo wrote:
>> On Mon, Jun 02, 2014 at 01:14:33PM +0200, Pavel Machek wrote:
>>> Now.. I see it is more work for storage maintainers, because there'll
>>> be more code to maintain in the interim. But perhaps user advantages
>>> are worth it?
>>
>> I'm quite skeptical about going that route. Not necessarily because
>> of the extra amount of work but more the higher probability of getting
>> into situation where we can neither push forward or back out. It's
>> difficult to define clear deadline and there will likely be unforeseen
>> challenges in the planned convergence of the two schedulers,
>> eventually, it isn't too unlikely to be in a situation where we have
>> to admit defeat and just keep both schedulers. Note that developer
>
> Yes, that might happen. But it appears that conditions that would
> make us stuck with CFQ&BFQ are the same conditions that would make us
> stuck with CFQ alone.

We're not merging BFQ as is. The plan has to be to merge the changes
into CFQ, leaving us with both a single scheduler, and with a clear path
both backwards and forwards. This was all mentioned earlier in this
thread as well. The latter part of the patch series is already nicely
geared towards this, it's just the first part that has to be done as
well. THAT is the way forward for BFQ.

> And if BFQ is really better for interactivity under load, I'd really
> really like option to use it, even if it leads to regression under
> batch loads (or something else)...

The benefit is that BFQ has (most) everything nicely characterized, not
that it is necessarily a lot better for any possible workload out there.
As you saw yourself, there can be (and are) bugs lurking that can cause
crashes. Another instance has been reported where there's a huge
performance regression. Especially the latter would be a lot easier to
debug, if it could be pin-pointed down to a specific single change. And
I'm sure there are other issues as well, similarly to where there's
undoubtedly cases where BFQ works better.

>> overhead isn't the only factor here. Providing two slightly different
>> alternatives inevitably makes userland grow dependencies on subtleties
>> of both and there's a lot less pressure to make judgement calls and
>
> Dunno. It is just the scheduler. It makes stuff slower or faster, but
> should not affect userland too badly.

Until userland starts depending on various sysfs exports to tweak behavior.

--
Jens Axboe

2014-06-11 20:41:35

by Paolo Valente

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

On 04 Jun 2014, at 12:03, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> Should this attempt be useless as well, I will, if you do not mind, try by asking you more details about your system and reproducing your configuration as much as I can.
>>
>
> Try making BFQ the default scheduler. That seems to break it for me,
> when selected at runtime, it looks stable.

As I have already written to you privately, we have fixed the bug. It was a
clerical error, made while turning the original patchset into the series of
patches we have then submitted.

The new patchset is available here:
http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.16.0-rc0-v7rc5.tgz

I'm not submitting this new, fixed patchset by email, because, before doing that,
we want to apply all the changes recommended by Tejun, and try to turn the
new patchset into a 'transformer' of cfq into bfq (of course, should it be better
to proceed in a different way also for this intermediate new version of bfq,
we are willing to).

>
> Anyway, here are some speed tests. Background load:
> […]
> root@duo:/data/tmp# echo cfq > /sys/block/sda/queue/scheduler
> root@duo:/data/tmp# echo 3 > /proc/sys/vm/drop_caches
> root@duo:/data/tmp# cat /dev/zero > delme; cat /dev/zero > delme;cat
> /dev/zero > delme;cat /dev/zero > delme;cat /dev/zero > delme;cat
> /dev/zero > delme
>
> (Machine was running out of disk space.)
>
> (I alternate between cfq and bfq).
>
> Benchmark. I chose git describe because it is part of kernel build
> sometimes .. and I actually wait for that.
> […]

We have also fixed this regression, which was related to both the
queue-merge mechanism and the heuristic for providing low latency to
soft real-time applications. The new patchset contains this fix as well.
We have repeated your tests (and other similar tests) with this fixed
version of bfq. These are our results with your tests.

# Test with background writes

[root@bfq-testbed data]# echo cfq > /sys/block/sda/queue/scheduler
[root@bfq-testbed data]# echo 3 > /proc/sys/vm/drop_caches
[root@bfq-testbed data]# cat /dev/zero > delme; cat /dev/zero > delme;cat
/dev/zero > delme;cat /dev/zero > delme;cat /dev/zero > delme;cat
/dev/zero > delme

[root@bfq-testbed linux-lkml]# time git describe
v3.15-rc8-78-gd531c25

# BFQ
0.24user 0.14system 0:07.42elapsed 5%CPU
# CFQ
0.24user 0.16system 0:08.39elapsed 4%CPU
# BFQ
0.25user 0.15system 0:08.45elapsed 4%CPU
# CFQ
0.26user 0.15system 0:09.11elapsed 4%CPU

# Results without background workload

# BFQ
0.23user 0.12system 0:07.23elapsed 4%CPU
# CFQ
0.25user 0.13system 0:07.36elapsed 5%CPU
# BFQ
0.23user 0.14system 0:07.24elapsed 5%CPU
# CFQ
0.22user 0.14system 0:07.36elapsed 5%CPU

Any feedback on these and other tests is more than welcome.

Thanks,
Paolo

2014-06-11 20:45:23

by Paolo Valente

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

On 04 Jun 2014, at 13:59, Takashi Iwai <[email protected]> wrote:

> […]
> I've been using BFQ for a while and noticed also some obvious
> regression in some operations, notably git, too.
> For example, git grep regresses badly.
>
> I ran "time git grep foo > /dev/null" on linux kernel repos on both
> rotational disk and SSD.
> […]
>
> BFQ seems to behave badly when reading many small files.
>

The fix I described in my last reply to Pavel's speed tests
(https://lkml.org/lkml/2014/6/4/94) apparently solves also this problem.
As I wrote in that reply, the new fixed version of bfq is here:
http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.16.0-rc0-v7rc5.tgz

These are our results, for your test, with this fixed version of bfq.

time git grep foo > /dev/null

Rotational disk:
CFQ:
2.86user 4.87system 0:29.51elapsed 26%CPU
2.87user 4.87system 0:30.30elapsed 25%CPU
2.82user 4.90system 0:29.13elapsed 26%CPU

BFQ:
2.81user 4.97system 0:25.96elapsed 29%CPU
2.83user 5.02system 0:24.79elapsed 31%CPU
2.85user 4.95system 0:24.73elapsed 31%CPU

SSD:
CFQ:
2.04user 3.93system 0:03.88elapsed 153%CPU
2.12user 3.85system 0:03.89elapsed 153%CPU
2.05user 3.92system 0:03.89elapsed 153%CPU

BFQ:
2.10user 3.86system 0:03.89elapsed 153%CPU
2.05user 3.90system 0:03.88elapsed 153%CPU
2.01user 3.95system 0:03.89elapsed 153%CPU

time git grep foo HEAD > /dev/null

SSD:
CFQ:
5.11user 0.38system 0:06.71elapsed 81%CPU
5.21user 0.36system 0:06.78elapsed 82%CPU
5.05user 0.41system 0:06.69elapsed 81%CPU

BFQ:
5.17user 0.39system 0:06.77elapsed 82%CPU
5.13user 0.37system 0:06.73elapsed 81%CPU
5.17user 0.37system 0:06.78elapsed 81%CPU

Should you be willing to provide further feedback on this and other tests,
we would of course really appreciate it.

Thanks again for your report,
Paolo

2014-06-13 16:21:36

by Takashi Iwai

Subject: Re: BFQ speed tests [was Re: [PATCH RFC - TAKE TWO - 00/12] New version of the BFQ I/O Scheduler]

At Wed, 11 Jun 2014 22:45:06 +0200,
Paolo Valente wrote:
>
>
> On 04 Jun 2014, at 13:59, Takashi Iwai <[email protected]> wrote:
>
> > […]
> > I've been using BFQ for a while and noticed also some obvious
> > regression in some operations, notably git, too.
> > For example, git grep regresses badly.
> >
> > I ran "time git grep foo > /dev/null" on linux kernel repos on both
> > rotational disk and SSD.
> > […]
> >
> > BFQ seems to behave badly when reading many small files.
> >
>
> The fix I described in my last reply to Pavel's speed tests
> (https://lkml.org/lkml/2014/6/4/94) apparently solves also this problem.
> As I wrote in that reply, the new fixed version of bfq is here:
> http://algogroup.unimore.it/people/paolo/disk_sched/debugging-patches/3.16.0-rc0-v7rc5.tgz
>
> These are our results, for your test, with this fixed version of bfq.
>
> time git grep foo > /dev/null
>
> Rotational disk:
> CFQ:
> 2.86user 4.87system 0:29.51elapsed 26%CPU
> 2.87user 4.87system 0:30.30elapsed 25%CPU
> 2.82user 4.90system 0:29.13elapsed 26%CPU
>
> BFQ:
> 2.81user 4.97system 0:25.96elapsed 29%CPU
> 2.83user 5.02system 0:24.79elapsed 31%CPU
> 2.85user 4.95system 0:24.73elapsed 31%CPU
>
> SSD:
> CFQ:
> 2.04user 3.93system 0:03.88elapsed 153%CPU
> 2.12user 3.85system 0:03.89elapsed 153%CPU
> 2.05user 3.92system 0:03.89elapsed 153%CPU
>
> BFQ:
> 2.10user 3.86system 0:03.89elapsed 153%CPU
> 2.05user 3.90system 0:03.88elapsed 153%CPU
> 2.01user 3.95system 0:03.89elapsed 153%CPU
>
> time git grep foo HEAD > /dev/null
>
> SSD:
> CFQ:
> 5.11user 0.38system 0:06.71elapsed 81%CPU
> 5.21user 0.36system 0:06.78elapsed 82%CPU
> 5.05user 0.41system 0:06.69elapsed 81%CPU
>
> BFQ:
> 5.17user 0.39system 0:06.77elapsed 82%CPU
> 5.13user 0.37system 0:06.73elapsed 81%CPU
> 5.17user 0.37system 0:06.78elapsed 81%CPU
>
> Should you be willing to provide further feedback on this and other tests,
> we would of course really appreciate it.

Thanks. The new patchset works well now. The results with the new
patchset + latest Linus git tree are below.

The only significant difference is the case with "git grep foo" on
SSD. But I'm not sure whether it's just random variation. I'll need to
get more samples to average out the noise.

Takashi

===

* time git grep foo > /dev/null

rotational disk:
CFQ:
2.34user 4.04system 2:00.12elapsed 5%CPU
2.49user 3.80system 1:56.20elapsed 5%CPU
2.42user 3.68system 1:46.81elapsed 5%CPU

BFQ:
2.44user 3.57system 1:49.65elapsed 5%CPU
2.47user 3.67system 1:55.92elapsed 5%CPU
2.47user 3.63system 1:50.06elapsed 5%CPU

SSD:
CFQ:
1.25user 1.54system 0:04.62elapsed 60%CPU
1.23user 1.67system 0:04.65elapsed 62%CPU
1.22user 1.60system 0:04.61elapsed 61%CPU

BFQ:
1.29user 1.64system 0:06.91elapsed 42%CPU
1.30user 1.66system 0:06.66elapsed 44%CPU
1.27user 1.59system 0:04.73elapsed 60%CPU

* time git grep foo HEAD > /dev/null

rotational disk:
CFQ:
5.12user 0.43system 0:19.86elapsed 28%CPU
5.06user 0.45system 0:19.88elapsed 27%CPU
5.00user 0.41system 0:20.05elapsed 27%CPU

BFQ:
4.82user 0.37system 0:19.56elapsed 26%CPU
5.00user 0.43system 0:19.53elapsed 27%CPU
4.92user 0.45system 0:19.69elapsed 27%CPU

SSD:
CFQ:
4.49user 0.32system 0:07.26elapsed 66%CPU
4.50user 0.31system 0:07.25elapsed 66%CPU
4.40user 0.32system 0:07.16elapsed 65%CPU

BFQ:
4.09user 0.26system 0:06.93elapsed 62%CPU
3.76user 0.23system 0:06.54elapsed 61%CPU
3.65user 0.22system 0:06.40elapsed 60%CPU
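
The remark above about needing more samples can be made concrete by looking at the spread of the three "git grep foo" runs on SSD. A minimal sketch, assuming awk is available; `spread` is a hypothetical helper name, and the inputs are the elapsed seconds from the SSD results above:

```shell
# Hypothetical helper: print "min mean max" for numbers read one per line.
spread() {
    awk '
    NR == 1 { min = max = $1 + 0 }
    {
        v = $1 + 0
        if (v < min) min = v
        if (v > max) max = v
        sum += v
    }
    END { if (NR > 0) printf "%.2f %.2f %.2f\n", min, sum / NR, max }
    '
}

printf '%s\n' 6.91 6.66 4.73 | spread   # BFQ, SSD: prints 4.73 6.10 6.91
printf '%s\n' 4.62 4.65 4.61 | spread   # CFQ, SSD: prints 4.61 4.63 4.65
```

The CFQ runs differ by only 0.04 s, while the BFQ runs range from 4.73 s to 6.91 s, with the fastest BFQ run close to the CFQ times. With three samples that is not enough to tell a real regression from run-to-run variation.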