Hi,
Since the dawn of time, our background buffered writeback has sucked.
When we do background buffered writeback, it should have little impact
on foreground activity. That's the definition of background activity...
But for as long as I can remember, heavy buffered writers have not
behaved like that. For instance, if I do something like this:
$ dd if=/dev/zero of=foo bs=1M count=10k
on my laptop, and then try to start chrome, it basically won't start
before the buffered writeback is done. The same goes for server-oriented
workloads, where installation of a big RPM (or similar) adversely
impacts database reads or sync writes. When that happens, I get people
yelling at me.
I have posted plenty of results previously, so I'll keep it shorter
this time. Here's a run on my laptop, using read-to-pipe-async to
read a 5g file and rewrite it.
4.6-rc3:
$ t/read-to-pipe-async -f ~/5g > 5g-new
Latency percentiles (usec) (READERS)
50.0000th: 2
75.0000th: 3
90.0000th: 5
95.0000th: 7
99.0000th: 43
99.5000th: 77
99.9000th: 9008
99.9900th: 91008
99.9990th: 286208
99.9999th: 347648
Over=1251, min=0, max=358081
Latency percentiles (usec) (WRITERS)
50.0000th: 4
75.0000th: 8
90.0000th: 13
95.0000th: 15
99.0000th: 32
99.5000th: 43
99.9000th: 81
99.9900th: 2372
99.9990th: 104320
99.9999th: 349696
Over=63, min=1, max=358321
Read rate (KB/sec) : 91859
Write rate (KB/sec): 91859
4.6-rc3 + wb-buf-throttle:
Latency percentiles (usec) (READERS)
50.0000th: 2
75.0000th: 3
90.0000th: 5
95.0000th: 8
99.0000th: 48
99.5000th: 79
99.9000th: 5304
99.9900th: 22496
99.9990th: 29408
99.9999th: 33728
Over=860, min=0, max=37599
Latency percentiles (usec) (WRITERS)
50.0000th: 4
75.0000th: 9
90.0000th: 14
95.0000th: 16
99.0000th: 34
99.5000th: 45
99.9000th: 87
99.9900th: 1342
99.9990th: 13648
99.9999th: 21280
Over=29, min=1, max=30457
Read rate (KB/sec) : 95832
Write rate (KB/sec): 95832
Better throughput and tighter latencies, for both reads and writes.
That's hard not to like.
The above was the why. The how is basically throttling background
writeback. We still want to issue big writes from the vm side of things,
so we get nice and big extents on the file system end. But we don't need
to flood the device with THOUSANDS of requests for background writeback.
For most devices, we don't need a whole lot to get decent throughput.
This adds some simple blk-wb code that limits how much buffered
writeback we keep in flight on the device end. It's all about managing
the queues on the hardware side. The big change in this version is that
it should be pretty much auto-tuning - you no longer have to set a
given percentage of writeback bandwidth. I've implemented something
similar to CoDel to manage the writeback queue. See the last patch
for a full description, but the tldr is that we monitor min latencies
over a window of time, and scale up/down the queue based on that. This
needs a minimum of tunables, and it stays out of the way if your device
is fast enough. There's a single tunable now, wb_lat_usec, which simply
sets this latency target. Most people won't have to touch it; it'll
work pretty well just being in the ballpark.
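For reference, the target can be read and changed through sysfs like the
other queue attributes (paths and defaults here are taken from the last
patch; the value is in usec, and 0 disables throttling). For example, on
an nvme device:
$ cat /sys/block/nvme0n1/queue/wb_lat_usec
2000
$ echo 10000 > /sys/block/nvme0n1/queue/wb_lat_usec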
I welcome testing. If you are sick of Linux bogging down when buffered
writes are happening, then this is for you, laptop or server. The
patchset is fully stable; I have not observed any problems. It passes full
xfstests runs, and a variety of benchmarks as well. It works equally well
on blk-mq/scsi-mq and "classic" setups.
You can also find this in a branch in the block git repo:
git://git.kernel.dk/linux-block.git wb-buf-throttle
Note that I rebase this branch when I collapse patches. The
wb-buf-throttle-v4 branch will remain the same as this version. I've folded
the device write cache changes into my 4.7 branches, so they are not
a part of this posting. Get the full wb-buf-throttle branch, or apply
the patches here on top of my for-next. A full patch against Linus'
current tree can also be downloaded here:
http://brick.kernel.dk/snaps/wb-buf-throttle-v4.patch
Changes since v3
- Re-do the mm/ writeback parts. Add REQ_BG for background writes,
and don't overload the wbc 'reason' for writeback decisions.
- Add tracking for when apps are sleeping waiting for a page to complete.
- Change wbc_to_write() to wbc_to_write_cmd().
- Use atomic_t for the balance_dirty_pages() sleep count.
- Add a basic scalable block stats tracking framework.
- Rewrite blk-wb core as described above, to dynamically adapt. This is
a big change, see the last patch for a full description of it.
- Add tracing to blk-wb, instead of using debug printk's.
- Rebased to 4.6-rc3 (ish)
Changes since v2
- Switch from wb_depth to wb_percent, as that's an easier tunable.
- Add the patch to track device depth on the block layer side.
- Cleanup the limiting code.
- Don't use a fixed limit in the wb wait, since it can change
between wakeups.
- Minor tweaks, fixups, cleanups.
Changes since v1
- Drop sync() WB_SYNC_NONE -> WB_SYNC_ALL change
- wb_start_writeback() fills in background/reclaim/sync info in
the writeback work, based on writeback reason.
- Use WRITE_SYNC for reclaim/sync IO
- Split balance_dirty_pages() sleep change into separate patch
- Drop get_request() u64 flag change, set the bit on the request
directly after-the-fact.
- Fix wrong sysfs return value
- Various small cleanups
Documentation/block/queue-sysfs.txt | 9
Documentation/block/writeback_cache_control.txt | 4
arch/um/drivers/ubd_kern.c | 2
block/Makefile | 2
block/blk-core.c | 22 +
block/blk-flush.c | 11
block/blk-mq-sysfs.c | 47 ++
block/blk-mq.c | 45 ++
block/blk-mq.h | 3
block/blk-settings.c | 58 +-
block/blk-stat.c | 184 ++++++++
block/blk-stat.h | 17
block/blk-sysfs.c | 122 +++++
block/blk-wb.c | 495 ++++++++++++++++++++++++
block/blk-wb.h | 42 ++
drivers/block/drbd/drbd_main.c | 2
drivers/block/loop.c | 2
drivers/block/mtip32xx/mtip32xx.c | 6
drivers/block/nbd.c | 4
drivers/block/osdblk.c | 2
drivers/block/ps3disk.c | 2
drivers/block/skd_main.c | 2
drivers/block/virtio_blk.c | 6
drivers/block/xen-blkback/xenbus.c | 2
drivers/block/xen-blkfront.c | 3
drivers/ide/ide-disk.c | 6
drivers/md/bcache/super.c | 2
drivers/md/dm-table.c | 20
drivers/md/md.c | 2
drivers/md/raid5-cache.c | 3
drivers/mmc/card/block.c | 2
drivers/mtd/mtd_blkdevs.c | 2
drivers/nvme/host/core.c | 7
drivers/scsi/scsi.c | 3
drivers/scsi/sd.c | 8
drivers/target/target_core_iblock.c | 6
fs/block_dev.c | 2
fs/buffer.c | 2
fs/f2fs/data.c | 2
fs/f2fs/node.c | 2
fs/gfs2/meta_io.c | 3
fs/mpage.c | 9
fs/xfs/xfs_aops.c | 2
include/linux/backing-dev-defs.h | 2
include/linux/blk_types.h | 14
include/linux/blkdev.h | 27 +
include/linux/fs.h | 4
include/linux/writeback.h | 10
include/trace/events/block.h | 98 ++++
mm/backing-dev.c | 1
mm/filemap.c | 42 +-
mm/page-writeback.c | 2
52 files changed, 1281 insertions(+), 96 deletions(-)
--
Jens Axboe
Test patch that throttles buffered writeback to make it a lot
smoother, with way less impact on other system activity.
Background writeback should be, by definition, background
activity. The fact that we flush huge bundles of it at a time
means that it potentially has a heavy impact on foreground workloads,
which isn't ideal. We can't easily limit the sizes of writes that
we do, since that would impact file system layout in the presence
of delayed allocation. So just throttle back buffered writeback,
unless someone is waiting for it.
The algorithm for when to throttle takes its inspiration from the
CoDel networking scheduling algorithm. Like CoDel, blk-wb monitors
the minimum latencies of requests over a window of time. In that
window of time, if the minimum latency of any request exceeds a
given target, then a scale count is incremented and the queue depth
is shrunk. The next monitoring window is shrunk accordingly. Unlike
CoDel, if we hit a window that exhibits good behavior, then we
simply decrement the scale count and re-calculate the limits for that
scale value. This prevents us from oscillating between a
close-to-ideal value and max all the time, instead remaining in the
windows where we get good behavior.
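As a quick illustration of the scaling math, here's a condensed sketch of
how the depth limits and the monitoring window follow from the scale step
(boiled down from calc_wb_limits() and rwb_arm_timer() in the patch below;
the numbers in the comment assume a device queue depth of 64):
/*
 * Condensed sketch, not the actual patch code. With a queue depth of 64:
 * step 0 -> max=64, normal=32, background=16, window 100 msec
 * step 1 -> max=32, normal=16, background=8,  window ~70 msec
 */
static void wb_scale_sketch(struct rq_wb *rwb, unsigned int queue_depth)
{
	unsigned int depth = min_t(unsigned int, RWB_MAX_DEPTH, queue_depth);

	/* halve the max depth per scale step, derive normal/background */
	rwb->wb_max = 1 + ((depth - 1) >> min(31U, rwb->scale_step));
	rwb->wb_normal = (rwb->wb_max + 1) / 2;
	rwb->wb_background = (rwb->wb_max + 3) / 4;

	/* monitoring window shrinks as 100 msec / sqrt(scale step + 1) */
	rwb->win_nsec = 1000000000ULL / int_sqrt((rwb->scale_step + 1) * 100);
}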
The patch registers two sysfs entries. The first one, 'wb_lat_usec',
sets the latency target for the window. It defaults to 2 msec for
non-rotational storage, and 75 msec for rotational storage. Setting
this value to '0' disables blk-wb.
The second entry, 'wb_stats', is a debug entry that simply shows the
current internal state of the throttling machine:
$ cat /sys/block/nvme0n1/queue/wb_stats
background=16, normal=32, max=64, inflight=0, wait=0, bdp_wait=0
'background' denotes how many requests we will allow in-flight for
idle background buffered writeback, 'normal' for higher priority
writeback, and 'max' for when it's urgent we clean pages.
'inflight' shows how many requests are currently in-flight for
buffered writeback, 'wait' shows if anyone is currently waiting for
access, and 'bdp_wait' shows if someone is currently throttled on this
device in balance_dirty_pages().
blk-wb also registers a few trace events that can be used to monitor
the state changes:
block_wb_lat: Latency 2446318
block_wb_stat: read lat: mean=2446318, min=2446318, max=2446318, samples=1,
write lat: mean=518866, min=15522, max=5330353, samples=57
block_wb_step: step down: step=1, background=8, normal=16, max=32
'block_wb_lat' logs a violation in sync issue latency, 'block_wb_stat'
logs a window violation of latencies and dumps the stats that led to
that, and finally, 'block_wb_step' logs a step up/down and the new
limits associated with that state.
Signed-off-by: Jens Axboe <[email protected]>
---
block/Makefile | 2 +-
block/blk-core.c | 15 ++
block/blk-mq.c | 31 ++-
block/blk-settings.c | 4 +
block/blk-sysfs.c | 57 +++++
block/blk-wb.c | 495 +++++++++++++++++++++++++++++++++++++++++++
block/blk-wb.h | 42 ++++
include/linux/blk_types.h | 2 +
include/linux/blkdev.h | 3 +
include/trace/events/block.h | 98 +++++++++
10 files changed, 746 insertions(+), 3 deletions(-)
create mode 100644 block/blk-wb.c
create mode 100644 block/blk-wb.h
diff --git a/block/Makefile b/block/Makefile
index 3446e0472df0..7e4be7a56a59 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -5,7 +5,7 @@
obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
- blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \
+ blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o blk-wb.o \
blk-mq-sysfs.o blk-mq-cpu.o blk-mq-cpumap.o ioctl.o \
genhd.o scsi_ioctl.o partition-generic.o ioprio.o \
badblocks.o partitions/
diff --git a/block/blk-core.c b/block/blk-core.c
index 40b57bf4852c..d941f69dfb4b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -39,6 +39,7 @@
#include "blk.h"
#include "blk-mq.h"
+#include "blk-wb.h"
EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
@@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
fail:
blk_free_flush_queue(q->fq);
+ blk_wb_exit(q);
return NULL;
}
EXPORT_SYMBOL(blk_init_allocated_queue);
@@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
blk_delete_timer(rq);
blk_clear_rq_complete(rq);
trace_block_rq_requeue(q, rq);
+ blk_wb_requeue(q->rq_wb, rq);
if (rq->cmd_flags & REQ_QUEUED)
blk_queue_end_tag(q, rq);
@@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
/* this is a bio leak */
WARN_ON(req->bio != NULL);
+ blk_wb_done(q->rq_wb, req);
+
/*
* Request may not have originated from ll_rw_blk. if not,
* it didn't come out of our reserved rq pools
@@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
struct request *req;
unsigned int request_count = 0;
+ bool wb_acct;
/*
* low level driver can indicate that it wants pages above a
@@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
}
get_rq:
+ wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
+
/*
* This sync check and mask will be re-done in init_request_from_bio(),
* but we need to set it earlier to expose the sync flag to the
@@ -1781,11 +1789,16 @@ get_rq:
*/
req = get_request(q, rw_flags, bio, GFP_NOIO);
if (IS_ERR(req)) {
+ if (wb_acct)
+ __blk_wb_done(q->rq_wb);
bio->bi_error = PTR_ERR(req);
bio_endio(bio);
goto out_unlock;
}
+ if (wb_acct)
+ req->cmd_flags |= REQ_BUF_INFLIGHT;
+
/*
* After dropping the lock and possibly sleeping here, our request
* may now be mergeable after it had proven unmergeable (above).
@@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
blk_dequeue_request(req);
req->issue_time = ktime_to_ns(ktime_get());
+ blk_wb_issue(req->q->rq_wb, req);
/*
* We are now handing the request to the hardware, initialize
@@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
blk_unprep_request(req);
blk_account_io_done(req);
+ blk_wb_done(req->q->rq_wb, req);
if (req->end_io)
req->end_io(req, error);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 71b4a13fbf94..c0c5207fe7fd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -30,6 +30,7 @@
#include "blk-mq.h"
#include "blk-mq-tag.h"
#include "blk-stat.h"
+#include "blk-wb.h"
static DEFINE_MUTEX(all_q_mutex);
static LIST_HEAD(all_q_list);
@@ -275,6 +276,9 @@ static void __blk_mq_free_request(struct blk_mq_hw_ctx *hctx,
if (rq->cmd_flags & REQ_MQ_INFLIGHT)
atomic_dec(&hctx->nr_active);
+
+ blk_wb_done(q->rq_wb, rq);
+
rq->cmd_flags = 0;
clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
@@ -305,6 +309,7 @@ EXPORT_SYMBOL_GPL(blk_mq_free_request);
inline void __blk_mq_end_request(struct request *rq, int error)
{
blk_account_io_done(rq);
+ blk_wb_done(rq->q->rq_wb, rq);
if (rq->end_io) {
rq->end_io(rq, error);
@@ -414,6 +419,7 @@ void blk_mq_start_request(struct request *rq)
rq->next_rq->resid_len = blk_rq_bytes(rq->next_rq);
rq->issue_time = ktime_to_ns(ktime_get());
+ blk_wb_issue(q->rq_wb, rq);
blk_add_timer(rq);
@@ -450,6 +456,7 @@ static void __blk_mq_requeue_request(struct request *rq)
struct request_queue *q = rq->q;
trace_block_rq_requeue(q, rq);
+ blk_wb_requeue(q->rq_wb, rq);
if (test_and_clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) {
if (q->dma_drain_size && blk_rq_bytes(rq))
@@ -1265,6 +1272,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
struct blk_plug *plug;
struct request *same_queue_rq = NULL;
blk_qc_t cookie;
+ bool wb_acct;
blk_queue_bounce(q, &bio);
@@ -1282,9 +1290,17 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
} else
request_count = blk_plug_queued_count(q);
+ wb_acct = blk_wb_wait(q->rq_wb, bio, NULL);
+
rq = blk_mq_map_request(q, bio, &data);
- if (unlikely(!rq))
+ if (unlikely(!rq)) {
+ if (wb_acct)
+ __blk_wb_done(q->rq_wb);
return BLK_QC_T_NONE;
+ }
+
+ if (wb_acct)
+ rq->cmd_flags |= REQ_BUF_INFLIGHT;
cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
@@ -1361,6 +1377,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
struct blk_map_ctx data;
struct request *rq;
blk_qc_t cookie;
+ bool wb_acct;
blk_queue_bounce(q, &bio);
@@ -1375,9 +1392,17 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
blk_attempt_plug_merge(q, bio, &request_count, NULL))
return BLK_QC_T_NONE;
+ wb_acct = blk_wb_wait(q->rq_wb, bio, NULL);
+
rq = blk_mq_map_request(q, bio, &data);
- if (unlikely(!rq))
+ if (unlikely(!rq)) {
+ if (wb_acct)
+ __blk_wb_done(q->rq_wb);
return BLK_QC_T_NONE;
+ }
+
+ if (wb_acct)
+ rq->cmd_flags |= REQ_BUF_INFLIGHT;
cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
@@ -2111,6 +2136,8 @@ void blk_mq_free_queue(struct request_queue *q)
list_del_init(&q->all_q_node);
mutex_unlock(&all_q_mutex);
+ blk_wb_exit(q);
+
blk_mq_del_queue_tag_set(q);
blk_mq_exit_hw_queues(q, set, set->nr_hw_queues);
diff --git a/block/blk-settings.c b/block/blk-settings.c
index f7e122e717e8..84bcfc22e020 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -13,6 +13,7 @@
#include <linux/gfp.h>
#include "blk.h"
+#include "blk-wb.h"
unsigned long blk_max_low_pfn;
EXPORT_SYMBOL(blk_max_low_pfn);
@@ -840,6 +841,9 @@ EXPORT_SYMBOL_GPL(blk_queue_flush_queueable);
void blk_set_queue_depth(struct request_queue *q, unsigned int depth)
{
q->queue_depth = depth;
+
+ if (q->rq_wb)
+ blk_wb_update_limits(q->rq_wb);
}
EXPORT_SYMBOL(blk_set_queue_depth);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 6e516cc0d3d0..13f325deffa1 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -13,6 +13,7 @@
#include "blk.h"
#include "blk-mq.h"
+#include "blk-wb.h"
struct queue_sysfs_entry {
struct attribute attr;
@@ -347,6 +348,47 @@ static ssize_t queue_poll_store(struct request_queue *q, const char *page,
return ret;
}
+static ssize_t queue_wb_stats_show(struct request_queue *q, char *page)
+{
+ struct rq_wb *rwb = q->rq_wb;
+
+ if (!rwb)
+ return -EINVAL;
+
+ return sprintf(page, "background=%d, normal=%d, max=%d, inflight=%d,"
+ " wait=%d, bdp_wait=%d\n", rwb->wb_background,
+ rwb->wb_normal, rwb->wb_max,
+ atomic_read(&rwb->inflight),
+ waitqueue_active(&rwb->wait),
+ atomic_read(rwb->bdp_wait));
+}
+
+static ssize_t queue_wb_lat_show(struct request_queue *q, char *page)
+{
+ if (!q->rq_wb)
+ return -EINVAL;
+
+ return sprintf(page, "%llu\n", q->rq_wb->min_lat_nsec / 1000ULL);
+}
+
+static ssize_t queue_wb_lat_store(struct request_queue *q, const char *page,
+ size_t count)
+{
+ u64 val;
+ int err;
+
+ if (!q->rq_wb)
+ return -EINVAL;
+
+ err = kstrtou64(page, 10, &val);
+ if (err < 0)
+ return err;
+
+ q->rq_wb->min_lat_nsec = val * 1000ULL;
+ blk_wb_update_limits(q->rq_wb);
+ return count;
+}
+
static ssize_t queue_wc_show(struct request_queue *q, char *page)
{
if (test_bit(QUEUE_FLAG_WC, &q->queue_flags))
@@ -541,6 +583,17 @@ static struct queue_sysfs_entry queue_stats_entry = {
.show = queue_stats_show,
};
+static struct queue_sysfs_entry queue_wb_stats_entry = {
+ .attr = {.name = "wb_stats", .mode = S_IRUGO },
+ .show = queue_wb_stats_show,
+};
+
+static struct queue_sysfs_entry queue_wb_lat_entry = {
+ .attr = {.name = "wb_lat_usec", .mode = S_IRUGO | S_IWUSR },
+ .show = queue_wb_lat_show,
+ .store = queue_wb_lat_store,
+};
+
static struct attribute *default_attrs[] = {
&queue_requests_entry.attr,
&queue_ra_entry.attr,
@@ -568,6 +621,8 @@ static struct attribute *default_attrs[] = {
&queue_poll_entry.attr,
&queue_wc_entry.attr,
&queue_stats_entry.attr,
+ &queue_wb_stats_entry.attr,
+ &queue_wb_lat_entry.attr,
NULL,
};
@@ -721,6 +776,8 @@ int blk_register_queue(struct gendisk *disk)
if (q->mq_ops)
blk_mq_register_disk(disk);
+ blk_wb_init(q);
+
if (!q->request_fn)
return 0;
diff --git a/block/blk-wb.c b/block/blk-wb.c
new file mode 100644
index 000000000000..1b1d80876930
--- /dev/null
+++ b/block/blk-wb.c
@@ -0,0 +1,495 @@
+/*
+ * buffered writeback throttling. loosely based on CoDel. We can't drop
+ * packets for IO scheduling, so the logic is something like this:
+ *
+ * - Monitor latencies in a defined window of time.
+ * - If the minimum latency in the above window exceeds some target, increment
+ * scaling step and scale down queue depth by a factor of 2x. The monitoring
+ *   window is then shrunk to 100 msec / sqrt(scaling step + 1).
+ * - For any window where we don't have solid data on what the latencies
+ * look like, retain status quo.
+ * - If latencies look good, decrement scaling step.
+ *
+ * Copyright (C) 2016 Jens Axboe
+ *
+ * Things that (may) need changing:
+ *
+ * - Different scaling of background/normal/high priority writeback.
+ * We may have to violate guarantees for max.
+ * - We can have mismatches between the stat window and our window.
+ *
+ */
+#include <linux/kernel.h>
+#include <linux/bio.h>
+#include <linux/blkdev.h>
+#include <trace/events/block.h>
+
+#include "blk.h"
+#include "blk-wb.h"
+#include "blk-stat.h"
+
+enum {
+ /*
+ * Might need to be higher
+ */
+ RWB_MAX_DEPTH = 64,
+
+ /*
+ * 100msec window
+ */
+ RWB_WINDOW_NSEC = 100 * 1000 * 1000ULL,
+
+ /*
+ * Disregard stats, if we don't meet these minimums
+ */
+ RWB_MIN_WRITE_SAMPLES = 3,
+ RWB_MIN_READ_SAMPLES = 1,
+
+ /*
+ * Target min latencies, in nsecs
+ */
+ RWB_ROT_LAT = 75000000ULL, /* 75 msec */
+ RWB_NONROT_LAT = 2000000ULL, /* 2 msec */
+};
+
+static inline bool rwb_enabled(struct rq_wb *rwb)
+{
+ return rwb && rwb->wb_normal != 0;
+}
+
+/*
+ * Increment 'v', if 'v' is below 'below'. Returns true if we succeeded,
+ * false if 'v' + 1 would be bigger than 'below'.
+ */
+static bool atomic_inc_below(atomic_t *v, int below)
+{
+ int cur = atomic_read(v);
+
+ for (;;) {
+ int old;
+
+ if (cur >= below)
+ return false;
+ old = atomic_cmpxchg(v, cur, cur + 1);
+ if (old == cur)
+ break;
+ cur = old;
+ }
+
+ return true;
+}
+
+static void wb_timestamp(struct rq_wb *rwb, unsigned long *var)
+{
+ if (rwb_enabled(rwb)) {
+ const unsigned long cur = jiffies;
+
+ if (cur != *var)
+ *var = cur;
+ }
+}
+
+void __blk_wb_done(struct rq_wb *rwb)
+{
+ int inflight, limit = rwb->wb_normal;
+
+ /*
+ * If the device does write back caching, drop further down
+ * before we wake people up.
+ */
+ if (test_bit(QUEUE_FLAG_WC, &rwb->q->queue_flags) &&
+ !atomic_read(rwb->bdp_wait))
+ limit = 0;
+ else
+ limit = rwb->wb_normal;
+
+ /*
+ * Don't wake anyone up if we are above the normal limit. If
+ * throttling got disabled (limit == 0) with waiters, ensure
+ * that we wake them up.
+ */
+ inflight = atomic_dec_return(&rwb->inflight);
+ if (limit && inflight >= limit) {
+ if (!rwb->wb_max)
+ wake_up_all(&rwb->wait);
+ return;
+ }
+
+ if (waitqueue_active(&rwb->wait)) {
+ int diff = limit - inflight;
+
+ if (!inflight || diff >= rwb->wb_background / 2)
+ wake_up_nr(&rwb->wait, 1);
+ }
+}
+
+/*
+ * Called on completion of a request. Note that it's also called when
+ * a request is merged, when the request gets freed.
+ */
+void blk_wb_done(struct rq_wb *rwb, struct request *rq)
+{
+ if (!rwb)
+ return;
+
+ if (!(rq->cmd_flags & REQ_BUF_INFLIGHT)) {
+ if (rwb->sync_cookie == rq) {
+ rwb->sync_issue = 0;
+ rwb->sync_cookie = NULL;
+ }
+
+ wb_timestamp(rwb, &rwb->last_comp);
+ } else {
+ WARN_ON_ONCE(rq == rwb->sync_cookie);
+ __blk_wb_done(rwb);
+ rq->cmd_flags &= ~REQ_BUF_INFLIGHT;
+ }
+}
+
+static void calc_wb_limits(struct rq_wb *rwb)
+{
+ unsigned int depth;
+
+ if (!rwb->min_lat_nsec) {
+ rwb->wb_max = rwb->wb_normal = rwb->wb_background = 0;
+ return;
+ }
+
+ depth = min_t(unsigned int, RWB_MAX_DEPTH, blk_queue_depth(rwb->q));
+
+ /*
+ * Reduce max depth by 50%, and re-calculate normal/bg based on that
+ */
+ rwb->wb_max = 1 + ((depth - 1) >> min(31U, rwb->scale_step));
+ rwb->wb_normal = (rwb->wb_max + 1) / 2;
+ rwb->wb_background = (rwb->wb_max + 3) / 4;
+}
+
+static bool inline stat_sample_valid(struct blk_rq_stat *stat)
+{
+ /*
+ * We need at least one read sample, and a minimum of
+ * RWB_MIN_WRITE_SAMPLES. We require some write samples to know
+ * that it's writes impacting us, and not just some sole read on
+ * a device that is in a lower power state.
+ */
+ return stat[0].nr_samples >= 1 &&
+ stat[1].nr_samples >= RWB_MIN_WRITE_SAMPLES;
+}
+
+static u64 rwb_sync_issue_lat(struct rq_wb *rwb)
+{
+ u64 now, issue = ACCESS_ONCE(rwb->sync_issue);
+
+ if (!issue || !rwb->sync_cookie)
+ return 0;
+
+ now = ktime_to_ns(ktime_get());
+ return now - issue;
+}
+
+enum {
+ LAT_OK,
+ LAT_UNKNOWN,
+ LAT_EXCEEDED,
+};
+
+static int __latency_exceeded(struct rq_wb *rwb, struct blk_rq_stat *stat)
+{
+ u64 thislat;
+
+ if (!stat_sample_valid(stat))
+ return LAT_UNKNOWN;
+
+ /*
+ * If the 'min' latency exceeds our target, step down.
+ */
+ if (stat[0].min > rwb->min_lat_nsec) {
+ trace_block_wb_lat(stat[0].min);
+ trace_block_wb_stat(stat);
+ return LAT_EXCEEDED;
+ }
+
+ /*
+ * If our stored sync issue exceeds the window size, or it
+ * exceeds our min target AND we haven't logged any entries,
+ * flag the latency as exceeded.
+ */
+ thislat = rwb_sync_issue_lat(rwb);
+ if (thislat > rwb->win_nsec ||
+ (thislat > rwb->min_lat_nsec && !stat[0].nr_samples)) {
+ trace_block_wb_lat(thislat);
+ return LAT_EXCEEDED;
+ }
+
+ if (rwb->scale_step)
+ trace_block_wb_stat(stat);
+
+ return LAT_OK;
+}
+
+static int latency_exceeded(struct rq_wb *rwb)
+{
+ struct blk_rq_stat stat[2];
+
+ blk_queue_stat_get(rwb->q, stat);
+
+ return __latency_exceeded(rwb, stat);
+}
+
+static void rwb_trace_step(struct rq_wb *rwb, const char *msg)
+{
+ trace_block_wb_step(msg, rwb->scale_step, rwb->wb_background,
+ rwb->wb_normal, rwb->wb_max);
+}
+
+static void scale_up(struct rq_wb *rwb)
+{
+ /*
+ * If we're at 0, we can't go lower.
+ */
+ if (!rwb->scale_step)
+ return;
+
+ rwb->scale_step--;
+ calc_wb_limits(rwb);
+
+ if (waitqueue_active(&rwb->wait))
+ wake_up_all(&rwb->wait);
+
+ rwb_trace_step(rwb, "step up");
+}
+
+static void scale_down(struct rq_wb *rwb)
+{
+ /*
+ * Stop scaling down when we've hit the limit. This also prevents
+ * ->scale_step from going to crazy values, if the device can't
+ * keep up.
+ */
+ if (rwb->wb_max == 1)
+ return;
+
+ rwb->scale_step++;
+ blk_stat_clear(rwb->q);
+ calc_wb_limits(rwb);
+ rwb_trace_step(rwb, "step down");
+}
+
+static void rwb_arm_timer(struct rq_wb *rwb)
+{
+ unsigned long expires;
+
+ rwb->win_nsec = 1000000000ULL / int_sqrt((rwb->scale_step + 1) * 100);
+ expires = jiffies + nsecs_to_jiffies(rwb->win_nsec);
+ mod_timer(&rwb->window_timer, expires);
+}
+
+static void blk_wb_timer_fn(unsigned long data)
+{
+ struct rq_wb *rwb = (struct rq_wb *) data;
+ int status;
+
+ /*
+ * If we exceeded the latency target, step down. If we did not,
+ * step one level up. If we don't know enough to say either exceeded
+ * or ok, then don't do anything.
+ */
+ status = latency_exceeded(rwb);
+ switch (status) {
+ case LAT_EXCEEDED:
+ scale_down(rwb);
+ break;
+ case LAT_OK:
+ scale_up(rwb);
+ break;
+ default:
+ break;
+ }
+
+ /*
+ * Re-arm timer, if we have IO in flight
+ */
+ if (rwb->scale_step || atomic_read(&rwb->inflight))
+ rwb_arm_timer(rwb);
+}
+
+void blk_wb_update_limits(struct rq_wb *rwb)
+{
+ rwb->scale_step = 0;
+ calc_wb_limits(rwb);
+
+ if (waitqueue_active(&rwb->wait))
+ wake_up_all(&rwb->wait);
+}
+
+static bool close_io(struct rq_wb *rwb)
+{
+ const unsigned long now = jiffies;
+
+ return time_before(now, rwb->last_issue + HZ / 10) ||
+ time_before(now, rwb->last_comp + HZ / 10);
+}
+
+#define REQ_HIPRIO (REQ_SYNC | REQ_META | REQ_PRIO)
+
+static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
+{
+ unsigned int limit;
+
+ /*
+ * At this point we know it's a buffered write. If REQ_SYNC is
+ * set, then it's WB_SYNC_ALL writeback, and we'll use the max
+ * limit for that. If the write is marked as a background write,
+ * then use the idle limit, or go to normal if we haven't had
+ * competing IO for a bit.
+ */
+ if ((rw & REQ_HIPRIO) || atomic_read(rwb->bdp_wait))
+ limit = rwb->wb_max;
+ else if ((rw & REQ_BG) || close_io(rwb)) {
+ /*
+ * If less than 100ms since we completed unrelated IO,
+ * limit us to half the depth for background writeback.
+ */
+ limit = rwb->wb_background;
+ } else
+ limit = rwb->wb_normal;
+
+ return limit;
+}
+
+static inline bool may_queue(struct rq_wb *rwb, unsigned long rw)
+{
+ /*
+ * inc it here even if disabled, since we'll dec it at completion.
+ * this only happens if the task was sleeping in __blk_wb_wait(),
+ * and someone turned it off at the same time.
+ */
+ if (!rwb_enabled(rwb)) {
+ atomic_inc(&rwb->inflight);
+ return true;
+ }
+
+ return atomic_inc_below(&rwb->inflight, get_limit(rwb, rw));
+}
+
+/*
+ * Block if we will exceed our limit, or if we are currently waiting for
+ * the timer to kick off queuing again.
+ */
+static void __blk_wb_wait(struct rq_wb *rwb, unsigned long rw, spinlock_t *lock)
+{
+ DEFINE_WAIT(wait);
+
+ if (may_queue(rwb, rw))
+ return;
+
+ do {
+ prepare_to_wait_exclusive(&rwb->wait, &wait,
+ TASK_UNINTERRUPTIBLE);
+
+ if (may_queue(rwb, rw))
+ break;
+
+ if (lock)
+ spin_unlock_irq(lock);
+
+ io_schedule();
+
+ if (lock)
+ spin_lock_irq(lock);
+ } while (1);
+
+ finish_wait(&rwb->wait, &wait);
+}
+
+/*
+ * Returns true if the IO request should be accounted, false if not.
+ * May sleep, if we have exceeded the writeback limits. Caller can pass
+ * in an irq held spinlock, if it holds one when calling this function.
+ * If we do sleep, we'll release and re-grab it.
+ */
+bool blk_wb_wait(struct rq_wb *rwb, struct bio *bio, spinlock_t *lock)
+{
+ /*
+ * If disabled, or not a WRITE (or a discard), do nothing
+ */
+ if (!rwb_enabled(rwb) || !(bio->bi_rw & REQ_WRITE) ||
+ (bio->bi_rw & REQ_DISCARD))
+ goto no_q;
+
+ /*
+ * Don't throttle WRITE_ODIRECT
+ */
+ if ((bio->bi_rw & (REQ_SYNC | REQ_NOIDLE)) == REQ_SYNC)
+ goto no_q;
+
+ __blk_wb_wait(rwb, bio->bi_rw, lock);
+
+ if (!timer_pending(&rwb->window_timer))
+ rwb_arm_timer(rwb);
+
+ return true;
+
+no_q:
+ wb_timestamp(rwb, &rwb->last_issue);
+ return false;
+}
+
+void blk_wb_issue(struct rq_wb *rwb, struct request *rq)
+{
+ if (!rwb_enabled(rwb))
+ return;
+ if (!(rq->cmd_flags & REQ_BUF_INFLIGHT) && !rwb->sync_issue) {
+ rwb->sync_cookie = rq;
+ rwb->sync_issue = rq->issue_time;
+ }
+}
+
+void blk_wb_requeue(struct rq_wb *rwb, struct request *rq)
+{
+ if (!rwb_enabled(rwb))
+ return;
+ if (rq == rwb->sync_cookie) {
+ rwb->sync_issue = 0;
+ rwb->sync_cookie = NULL;
+ }
+}
+
+void blk_wb_init(struct request_queue *q)
+{
+ struct rq_wb *rwb;
+
+ /*
+ * If this fails, we don't get throttling
+ */
+ rwb = kzalloc(sizeof(*rwb), GFP_KERNEL);
+ if (!rwb)
+ return;
+
+ atomic_set(&rwb->inflight, 0);
+ init_waitqueue_head(&rwb->wait);
+ setup_timer(&rwb->window_timer, blk_wb_timer_fn, (unsigned long) rwb);
+ rwb->last_comp = rwb->last_issue = jiffies;
+ rwb->bdp_wait = &q->backing_dev_info.wb.dirty_sleeping;
+ rwb->q = q;
+
+ if (blk_queue_nonrot(q))
+ rwb->min_lat_nsec = RWB_NONROT_LAT;
+ else
+ rwb->min_lat_nsec = RWB_ROT_LAT;
+
+ blk_wb_update_limits(rwb);
+ q->rq_wb = rwb;
+}
+
+void blk_wb_exit(struct request_queue *q)
+{
+ struct rq_wb *rwb = q->rq_wb;
+
+ if (rwb) {
+ del_timer_sync(&rwb->window_timer);
+ kfree(q->rq_wb);
+ q->rq_wb = NULL;
+ }
+}
diff --git a/block/blk-wb.h b/block/blk-wb.h
new file mode 100644
index 000000000000..6ad47195bc87
--- /dev/null
+++ b/block/blk-wb.h
@@ -0,0 +1,42 @@
+#ifndef BLK_WB_H
+#define BLK_WB_H
+
+#include <linux/atomic.h>
+#include <linux/wait.h>
+#include <linux/timer.h>
+
+struct rq_wb {
+ /*
+ * Settings that govern how we throttle
+ */
+ unsigned int wb_background; /* background writeback */
+ unsigned int wb_normal; /* normal writeback */
+ unsigned int wb_max; /* max throughput writeback */
+ unsigned int scale_step;
+
+ u64 win_nsec;
+
+ struct timer_list window_timer;
+
+ s64 sync_issue;
+ void *sync_cookie;
+
+ unsigned long last_issue; /* last non-throttled issue */
+ unsigned long last_comp; /* last non-throttled comp */
+ unsigned long min_lat_nsec;
+ atomic_t *bdp_wait;
+ struct request_queue *q;
+ atomic_t inflight;
+ wait_queue_head_t wait;
+};
+
+void __blk_wb_done(struct rq_wb *);
+void blk_wb_done(struct rq_wb *, struct request *);
+bool blk_wb_wait(struct rq_wb *, struct bio *, spinlock_t *);
+void blk_wb_init(struct request_queue *);
+void blk_wb_exit(struct request_queue *);
+void blk_wb_update_limits(struct rq_wb *);
+void blk_wb_requeue(struct rq_wb *, struct request *);
+void blk_wb_issue(struct rq_wb *, struct request *);
+
+#endif
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 2b4414fb4d8e..c41f8a303804 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -189,6 +189,7 @@ enum rq_flag_bits {
__REQ_PM, /* runtime pm request */
__REQ_HASHED, /* on IO scheduler merge hash */
__REQ_MQ_INFLIGHT, /* track inflight for MQ */
+ __REQ_BUF_INFLIGHT, /* track inflight for buffered */
__REQ_NR_BITS, /* stops here */
};
@@ -243,6 +244,7 @@ enum rq_flag_bits {
#define REQ_PM (1ULL << __REQ_PM)
#define REQ_HASHED (1ULL << __REQ_HASHED)
#define REQ_MQ_INFLIGHT (1ULL << __REQ_MQ_INFLIGHT)
+#define REQ_BUF_INFLIGHT (1ULL << __REQ_BUF_INFLIGHT)
typedef unsigned int blk_qc_t;
#define BLK_QC_T_NONE -1U
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 87f6703ced71..230c55dc95ae 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -37,6 +37,7 @@ struct bsg_job;
struct blkcg_gq;
struct blk_flush_queue;
struct pr_ops;
+struct rq_wb;
#define BLKDEV_MIN_RQ 4
#define BLKDEV_MAX_RQ 128 /* Default maximum */
@@ -291,6 +292,8 @@ struct request_queue {
int nr_rqs[2]; /* # allocated [a]sync rqs */
int nr_rqs_elvpriv; /* # allocated rqs w/ elvpriv */
+ struct rq_wb *rq_wb;
+
/*
* If blkcg is not used, @q->root_rl serves all requests. If blkcg
* is used, root blkg allocates from @q->root_rl and all other
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index e8a5eca1dbe5..8ae9f47d5287 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -667,6 +667,104 @@ TRACE_EVENT(block_rq_remap,
(unsigned long long)__entry->old_sector, __entry->nr_bios)
);
+/**
+ * block_wb_stat - trace stats for blk_wb
+ * @stat: array of read/write stats
+ */
+TRACE_EVENT(block_wb_stat,
+
+ TP_PROTO(struct blk_rq_stat *stat),
+
+ TP_ARGS(stat),
+
+ TP_STRUCT__entry(
+ __field( s64, rmean )
+ __field( u64, rmin )
+ __field( u64, rmax )
+ __field( s64, rnr_samples )
+ __field( s64, rtime )
+ __field( s64, wmean )
+ __field( u64, wmin )
+ __field( u64, wmax )
+ __field( s64, wnr_samples )
+ __field( s64, wtime )
+ ),
+
+ TP_fast_assign(
+ __entry->rmean = stat[0].mean;
+ __entry->rmin = stat[0].min;
+ __entry->rmax = stat[0].max;
+ __entry->rnr_samples = stat[0].nr_samples;
+ __entry->wmean = stat[1].mean;
+ __entry->wmin = stat[1].min;
+ __entry->wmax = stat[1].max;
+ __entry->wnr_samples = stat[1].nr_samples;
+ ),
+
+ TP_printk("read lat: mean=%llu, min=%llu, max=%llu, samples=%llu,"
+ "write lat: mean=%llu, min=%llu, max=%llu, samples=%llu\n",
+ __entry->rmean, __entry->rmin, __entry->rmax,
+ __entry->rnr_samples, __entry->wmean, __entry->wmin,
+ __entry->wmax, __entry->wnr_samples)
+);
+
+/**
+ * block_wb_lat - trace latency event
+ * @lat: latency trigger
+ */
+TRACE_EVENT(block_wb_lat,
+
+ TP_PROTO(unsigned long lat),
+
+ TP_ARGS(lat),
+
+ TP_STRUCT__entry(
+ __field( unsigned long, lat )
+ ),
+
+ TP_fast_assign(
+ __entry->lat = lat;
+ ),
+
+ TP_printk("Latency %llu\n", (unsigned long long) __entry->lat)
+);
+
+/**
+ * block_wb_step - trace wb event step
+ * @msg: context message
+ * @step: the current scale step count
+ * @bg: the current background queue limit
+ * @normal: the current normal writeback limit
+ * @max: the current max throughput writeback limit
+ */
+TRACE_EVENT(block_wb_step,
+
+ TP_PROTO(const char *msg, unsigned int step, unsigned int bg,
+ unsigned int normal, unsigned int max),
+
+ TP_ARGS(msg, step, bg, normal, max),
+
+ TP_STRUCT__entry(
+ __field( const char *, msg )
+ __field( unsigned int, step )
+ __field( unsigned int, bg )
+ __field( unsigned int, normal )
+ __field( unsigned int, max )
+ ),
+
+ TP_fast_assign(
+ __entry->msg = msg;
+ __entry->step = step;
+ __entry->bg = bg;
+ __entry->normal = normal;
+ __entry->max = max;
+ ),
+
+ TP_printk("%s: step=%u, background=%u, normal=%u, max=%u\n",
+ __entry->msg, __entry->step, __entry->bg, __entry->normal,
+ __entry->max)
+);
+
#endif /* _TRACE_BLOCK_H */
/* This part must be outside protection */
--
2.8.0.rc4.6.g7e4ba36
If we're doing background type writes, then use the appropriate
write command for that.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/writeback.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index aa66fa05ff0d..6e4a35acaa3e 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -104,6 +104,8 @@ static inline int wbc_to_write_cmd(struct writeback_control *wbc)
{
if (wbc->sync_mode == WB_SYNC_ALL)
return WRITE_SYNC;
+ else if (wbc->for_kupdate || wbc->for_background)
+ return WRITE_BG;
return WRITE;
}
--
2.8.0.rc4.6.g7e4ba36
For legacy block, we simply track the request completion stats in the
request queue. For blk-mq, we track them on a per-sw queue basis, which
we can then sum up through the hardware queues and finally to a
per-device state.
The stats are tracked in, roughly, 0.1s interval windows.
Add sysfs files to display the stats.
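For reference, a condensed sketch of how a completion sample gets folded
into the current window (boiled down from blk_stat_add() below; the window
is really 2^27 nsec, i.e. ~134 msec, which is where the "roughly 0.1s"
above comes from):
/*
 * Condensed sketch, not the actual patch code. Samples are bucketed into
 * 2^27 nsec (~134 msec) windows, and the mean is updated incrementally
 * as mean += (value - mean) / (nr_samples + 1).
 */
static void stat_add_sketch(struct blk_rq_stat *stat, s64 issue_time)
{
	s64 now = ktime_to_ns(ktime_get());
	s64 value = now - issue_time;

	/* crossed into a new window? reset min/max/mean/samples */
	if ((now & BLK_STAT_MASK) != (stat->time & BLK_STAT_MASK))
		__blk_stat_init(stat, now);

	stat->min = min(stat->min, (u64) value);
	stat->max = max(stat->max, (u64) value);
	stat->mean += div64_s64(value - stat->mean, stat->nr_samples + 1);
	stat->nr_samples++;
}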
Signed-off-by: Jens Axboe <[email protected]>
---
block/Makefile | 2 +-
block/blk-core.c | 4 +
block/blk-mq-sysfs.c | 47 ++++++++++++
block/blk-mq.c | 14 ++++
block/blk-mq.h | 3 +
block/blk-stat.c | 184 ++++++++++++++++++++++++++++++++++++++++++++++
block/blk-stat.h | 17 +++++
block/blk-sysfs.c | 26 +++++++
include/linux/blk_types.h | 8 ++
include/linux/blkdev.h | 4 +
10 files changed, 308 insertions(+), 1 deletion(-)
create mode 100644 block/blk-stat.c
create mode 100644 block/blk-stat.h
diff --git a/block/Makefile b/block/Makefile
index 9eda2322b2d4..3446e0472df0 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -5,7 +5,7 @@
obj-$(CONFIG_BLOCK) := bio.o elevator.o blk-core.o blk-tag.o blk-sysfs.o \
blk-flush.o blk-settings.o blk-ioc.o blk-map.o \
blk-exec.o blk-merge.o blk-softirq.o blk-timeout.o \
- blk-lib.o blk-mq.o blk-mq-tag.o \
+ blk-lib.o blk-mq.o blk-mq-tag.o blk-stat.o \
blk-mq-sysfs.o blk-mq-cpu.o blk-mq-cpumap.o ioctl.o \
genhd.o scsi_ioctl.o partition-generic.o ioprio.o \
badblocks.o partitions/
diff --git a/block/blk-core.c b/block/blk-core.c
index 74c16fd8995d..40b57bf4852c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2514,6 +2514,8 @@ void blk_start_request(struct request *req)
{
blk_dequeue_request(req);
+ req->issue_time = ktime_to_ns(ktime_get());
+
/*
* We are now handing the request to the hardware, initialize
* resid_len to full count and add the timeout handler.
@@ -2581,6 +2583,8 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
trace_block_rq_complete(req->q, req, nr_bytes);
+ blk_stat_add(&req->q->rq_stats[rq_data_dir(req)], req);
+
if (!req->bio)
return false;
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 4ea4dd8a1eed..2f68015f8616 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -247,6 +247,47 @@ static ssize_t blk_mq_hw_sysfs_cpus_show(struct blk_mq_hw_ctx *hctx, char *page)
return ret;
}
+static void blk_mq_stat_clear(struct blk_mq_hw_ctx *hctx)
+{
+ struct blk_mq_ctx *ctx;
+ unsigned int i;
+
+ hctx_for_each_ctx(hctx, ctx, i) {
+ blk_stat_init(&ctx->stat[0]);
+ blk_stat_init(&ctx->stat[1]);
+ }
+}
+
+static ssize_t blk_mq_hw_sysfs_stat_store(struct blk_mq_hw_ctx *hctx,
+ const char *page, size_t count)
+{
+ blk_mq_stat_clear(hctx);
+ return count;
+}
+
+static ssize_t print_stat(char *page, struct blk_rq_stat *stat, const char *pre)
+{
+ return sprintf(page, "%s samples=%llu, mean=%lld, min=%lld, max=%lld\n",
+ pre, (long long) stat->nr_samples,
+ (long long) stat->mean, (long long) stat->min,
+ (long long) stat->max);
+}
+
+static ssize_t blk_mq_hw_sysfs_stat_show(struct blk_mq_hw_ctx *hctx, char *page)
+{
+ struct blk_rq_stat stat[2];
+ ssize_t ret;
+
+ blk_stat_init(&stat[0]);
+ blk_stat_init(&stat[1]);
+
+ blk_hctx_stat_get(hctx, stat);
+
+ ret = print_stat(page, &stat[0], "read :");
+ ret += print_stat(page + ret, &stat[1], "write:");
+ return ret;
+}
+
static struct blk_mq_ctx_sysfs_entry blk_mq_sysfs_dispatched = {
.attr = {.name = "dispatched", .mode = S_IRUGO },
.show = blk_mq_sysfs_dispatched_show,
@@ -304,6 +345,11 @@ static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_poll = {
.attr = {.name = "io_poll", .mode = S_IRUGO },
.show = blk_mq_hw_sysfs_poll_show,
};
+static struct blk_mq_hw_ctx_sysfs_entry blk_mq_hw_sysfs_stat = {
+ .attr = {.name = "stats", .mode = S_IRUGO | S_IWUSR },
+ .show = blk_mq_hw_sysfs_stat_show,
+ .store = blk_mq_hw_sysfs_stat_store,
+};
static struct attribute *default_hw_ctx_attrs[] = {
&blk_mq_hw_sysfs_queued.attr,
@@ -314,6 +360,7 @@ static struct attribute *default_hw_ctx_attrs[] = {
&blk_mq_hw_sysfs_cpus.attr,
&blk_mq_hw_sysfs_active.attr,
&blk_mq_hw_sysfs_poll.attr,
+ &blk_mq_hw_sysfs_stat.attr,
NULL,
};
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1699baf39b78..71b4a13fbf94 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -29,6 +29,7 @@
#include "blk.h"
#include "blk-mq.h"
#include "blk-mq-tag.h"
+#include "blk-stat.h"
static DEFINE_MUTEX(all_q_mutex);
static LIST_HEAD(all_q_list);
@@ -356,10 +357,19 @@ static void blk_mq_ipi_complete_request(struct request *rq)
put_cpu();
}
+static void blk_mq_stat_add(struct request *rq)
+{
+ struct blk_rq_stat *stat = &rq->mq_ctx->stat[rq_data_dir(rq)];
+
+ blk_stat_add(stat, rq);
+}
+
static void __blk_mq_complete_request(struct request *rq)
{
struct request_queue *q = rq->q;
+ blk_mq_stat_add(rq);
+
if (!q->softirq_done_fn)
blk_mq_end_request(rq, rq->errors);
else
@@ -403,6 +413,8 @@ void blk_mq_start_request(struct request *rq)
if (unlikely(blk_bidi_rq(rq)))
rq->next_rq->resid_len = blk_rq_bytes(rq->next_rq);
+ rq->issue_time = ktime_to_ns(ktime_get());
+
blk_add_timer(rq);
/*
@@ -1761,6 +1773,8 @@ static void blk_mq_init_cpu_queues(struct request_queue *q,
spin_lock_init(&__ctx->lock);
INIT_LIST_HEAD(&__ctx->rq_list);
__ctx->queue = q;
+ blk_stat_init(&__ctx->stat[0]);
+ blk_stat_init(&__ctx->stat[1]);
/* If the cpu isn't online, the cpu is mapped to first hctx */
if (!cpu_online(i))
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 9087b11037b7..e107f700ff17 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -1,6 +1,8 @@
#ifndef INT_BLK_MQ_H
#define INT_BLK_MQ_H
+#include "blk-stat.h"
+
struct blk_mq_tag_set;
struct blk_mq_ctx {
@@ -20,6 +22,7 @@ struct blk_mq_ctx {
/* incremented at completion time */
unsigned long ____cacheline_aligned_in_smp rq_completed[2];
+ struct blk_rq_stat stat[2];
struct request_queue *queue;
struct kobject kobj;
diff --git a/block/blk-stat.c b/block/blk-stat.c
new file mode 100644
index 000000000000..b38776a83173
--- /dev/null
+++ b/block/blk-stat.c
@@ -0,0 +1,184 @@
+/*
+ * Block stat tracking code
+ *
+ * Copyright (C) 2016 Jens Axboe
+ */
+#include <linux/kernel.h>
+#include <linux/blk-mq.h>
+
+#include "blk-stat.h"
+#include "blk-mq.h"
+
+void blk_stat_sum(struct blk_rq_stat *dst, struct blk_rq_stat *src)
+{
+ if (!src->nr_samples)
+ return;
+
+ dst->min = min(dst->min, src->min);
+ dst->max = max(dst->max, src->max);
+
+ if (!dst->nr_samples)
+ dst->mean = src->mean;
+ else {
+ dst->mean = div64_s64((src->mean * src->nr_samples) +
+ (dst->mean * dst->nr_samples),
+ dst->nr_samples + src->nr_samples);
+ }
+ dst->nr_samples += src->nr_samples;
+}
+
+static void blk_mq_stat_get(struct request_queue *q, struct blk_rq_stat *dst)
+{
+ struct blk_mq_hw_ctx *hctx;
+ struct blk_mq_ctx *ctx;
+ int i, j, nr;
+
+ blk_stat_init(&dst[0]);
+ blk_stat_init(&dst[1]);
+
+ nr = 0;
+ do {
+ uint64_t newest = 0;
+
+ queue_for_each_hw_ctx(q, hctx, i) {
+ hctx_for_each_ctx(hctx, ctx, j) {
+ if (!ctx->stat[0].nr_samples &&
+ !ctx->stat[1].nr_samples)
+ continue;
+ if (ctx->stat[0].time > newest)
+ newest = ctx->stat[0].time;
+ if (ctx->stat[1].time > newest)
+ newest = ctx->stat[1].time;
+ }
+ }
+
+ /*
+ * No samples
+ */
+ if (!newest)
+ break;
+
+ queue_for_each_hw_ctx(q, hctx, i) {
+ hctx_for_each_ctx(hctx, ctx, j) {
+ if (ctx->stat[0].time == newest) {
+ blk_stat_sum(&dst[0], &ctx->stat[0]);
+ nr++;
+ }
+ if (ctx->stat[1].time == newest) {
+ blk_stat_sum(&dst[1], &ctx->stat[1]);
+ nr++;
+ }
+ }
+ }
+ /*
+ * If we race on finding an entry, just loop back again.
+ * Should be very rare.
+ */
+ } while (!nr);
+}
+
+void blk_queue_stat_get(struct request_queue *q, struct blk_rq_stat *dst)
+{
+ if (q->mq_ops)
+ blk_mq_stat_get(q, dst);
+ else {
+ memcpy(&dst[0], &q->rq_stats[0], sizeof(struct blk_rq_stat));
+ memcpy(&dst[1], &q->rq_stats[1], sizeof(struct blk_rq_stat));
+ }
+}
+
+void blk_hctx_stat_get(struct blk_mq_hw_ctx *hctx, struct blk_rq_stat *dst)
+{
+ struct blk_mq_ctx *ctx;
+ unsigned int i, nr;
+
+ nr = 0;
+ do {
+ uint64_t newest = 0;
+
+ hctx_for_each_ctx(hctx, ctx, i) {
+ if (!ctx->stat[0].nr_samples &&
+ !ctx->stat[1].nr_samples)
+ continue;
+
+ if (ctx->stat[0].time > newest)
+ newest = ctx->stat[0].time;
+ if (ctx->stat[1].time > newest)
+ newest = ctx->stat[1].time;
+ }
+
+ if (!newest)
+ break;
+
+ hctx_for_each_ctx(hctx, ctx, i) {
+ if (ctx->stat[0].time == newest) {
+ blk_stat_sum(&dst[0], &ctx->stat[0]);
+ nr++;
+ }
+ if (ctx->stat[1].time == newest) {
+ blk_stat_sum(&dst[1], &ctx->stat[1]);
+ nr++;
+ }
+ }
+ /*
+ * If we race on finding an entry, just loop back again.
+ * Should be very rare, as the window is only updated
+ * occasionally
+ */
+ } while (!nr);
+}
+
+static void __blk_stat_init(struct blk_rq_stat *stat, s64 time_now)
+{
+ stat->min = -1ULL;
+ stat->max = stat->nr_samples = stat->mean = 0;
+ stat->time = time_now & BLK_STAT_MASK;
+}
+
+void blk_stat_init(struct blk_rq_stat *stat)
+{
+ __blk_stat_init(stat, ktime_to_ns(ktime_get()));
+}
+
+void blk_stat_add(struct blk_rq_stat *stat, struct request *rq)
+{
+ s64 delta, now, value;
+
+ now = ktime_to_ns(ktime_get());
+ if (now < rq->issue_time)
+ return;
+
+ if ((now & BLK_STAT_MASK) != (stat->time & BLK_STAT_MASK))
+ __blk_stat_init(stat, now);
+
+ value = now - rq->issue_time;
+ if (value > stat->max)
+ stat->max = value;
+ if (value < stat->min)
+ stat->min = value;
+
+ delta = value - stat->mean;
+ if (delta)
+ stat->mean += div64_s64(delta, stat->nr_samples + 1);
+
+ stat->nr_samples++;
+}
+
+void blk_stat_clear(struct request_queue *q)
+{
+ if (q->mq_ops) {
+ struct blk_mq_hw_ctx *hctx;
+ struct blk_mq_ctx *ctx;
+ int i, j;
+
+ queue_for_each_hw_ctx(q, hctx, i) {
+ hctx_for_each_ctx(hctx, ctx, j) {
+ blk_stat_init(&ctx->stat[0]);
+ blk_stat_init(&ctx->stat[1]);
+ }
+ }
+ } else {
+ blk_stat_init(&q->rq_stats[0]);
+ blk_stat_init(&q->rq_stats[1]);
+ }
+}
diff --git a/block/blk-stat.h b/block/blk-stat.h
new file mode 100644
index 000000000000..d77548dbf196
--- /dev/null
+++ b/block/blk-stat.h
@@ -0,0 +1,17 @@
+#ifndef BLK_STAT_H
+#define BLK_STAT_H
+
+/*
+ * ~0.13s window as a power-of-2 (2^27 nsecs)
+ */
+#define BLK_STAT_NSEC 134217728ULL
+#define BLK_STAT_MASK ~(BLK_STAT_NSEC - 1)
+
+void blk_stat_add(struct blk_rq_stat *, struct request *);
+void blk_hctx_stat_get(struct blk_mq_hw_ctx *, struct blk_rq_stat *);
+void blk_queue_stat_get(struct request_queue *, struct blk_rq_stat *);
+void blk_stat_clear(struct request_queue *q);
+void blk_stat_init(struct blk_rq_stat *);
+void blk_stat_sum(struct blk_rq_stat *, struct blk_rq_stat *);
+
+#endif
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 99205965f559..6e516cc0d3d0 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -379,6 +379,26 @@ static ssize_t queue_wc_store(struct request_queue *q, const char *page,
return count;
}
+static ssize_t print_stat(char *page, struct blk_rq_stat *stat, const char *pre)
+{
+ return sprintf(page, "%s samples=%llu, mean=%lld, min=%lld, max=%lld\n",
+ pre, (long long) stat->nr_samples,
+ (long long) stat->mean, (long long) stat->min,
+ (long long) stat->max);
+}
+
+static ssize_t queue_stats_show(struct request_queue *q, char *page)
+{
+ struct blk_rq_stat stat[2];
+ ssize_t ret;
+
+ blk_queue_stat_get(q, stat);
+
+ ret = print_stat(page, &stat[0], "read :");
+ ret += print_stat(page + ret, &stat[1], "write:");
+ return ret;
+}
+
static struct queue_sysfs_entry queue_requests_entry = {
.attr = {.name = "nr_requests", .mode = S_IRUGO | S_IWUSR },
.show = queue_requests_show,
@@ -516,6 +536,11 @@ static struct queue_sysfs_entry queue_wc_entry = {
.store = queue_wc_store,
};
+static struct queue_sysfs_entry queue_stats_entry = {
+ .attr = {.name = "stats", .mode = S_IRUGO },
+ .show = queue_stats_show,
+};
+
static struct attribute *default_attrs[] = {
&queue_requests_entry.attr,
&queue_ra_entry.attr,
@@ -542,6 +567,7 @@ static struct attribute *default_attrs[] = {
&queue_random_entry.attr,
&queue_poll_entry.attr,
&queue_wc_entry.attr,
+ &queue_stats_entry.attr,
NULL,
};
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 223012451c7a..2b4414fb4d8e 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -268,4 +268,12 @@ static inline unsigned int blk_qc_t_to_tag(blk_qc_t cookie)
return cookie & ((1u << BLK_QC_T_SHIFT) - 1);
}
+struct blk_rq_stat {
+ s64 mean;
+ u64 min;
+ u64 max;
+ s64 nr_samples;
+ s64 time;
+};
+
#endif /* __LINUX_BLK_TYPES_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index eee94bd6de52..87f6703ced71 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -153,6 +153,7 @@ struct request {
struct gendisk *rq_disk;
struct hd_struct *part;
unsigned long start_time;
+ s64 issue_time;
#ifdef CONFIG_BLK_CGROUP
struct request_list *rl; /* rl this rq is alloced from */
unsigned long long start_time_ns;
@@ -402,6 +403,9 @@ struct request_queue {
unsigned int nr_sorted;
unsigned int in_flight[2];
+
+ struct blk_rq_stat rq_stats[2];
+
/*
* Number of active block driver functions for which blk_drain_queue()
* must wait. Must be incremented around functions that unlock the
--
2.8.0.rc4.6.g7e4ba36
For blk-mq, ->nr_requests does track queue depth, at least at init
time. But for the older queue paths, it's simply a soft setting.
On top of that, it's generally larger than the hardware setting
on purpose, to allow backup of requests for merging.
Fill a hole in struct request_queue with a 'queue_depth' member, and
add a blk_set_queue_depth() helper that drivers can call to more
closely inform the block layer of the real queue depth.
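As a usage sketch (hypothetical driver code, not part of this patch; the
scsi hookup below is the only real user added here), a driver just passes
its hardware tag count down at init time, and blk_queue_depth() then
prefers that over ->nr_requests:
/*
 * Hypothetical example; only the scsi conversion below is in this patch.
 */
static void mydrv_setup_queue(struct request_queue *q, unsigned int hw_tags)
{
	/* tell the block layer the real hardware queue depth */
	blk_set_queue_depth(q, hw_tags);

	/* blk_queue_depth(q) now returns hw_tags instead of nr_requests */
}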
Signed-off-by: Jens Axboe <[email protected]>
---
block/blk-settings.c | 12 ++++++++++++
drivers/scsi/scsi.c | 3 +++
include/linux/blkdev.h | 11 +++++++++++
3 files changed, 26 insertions(+)
diff --git a/block/blk-settings.c b/block/blk-settings.c
index f679ae122843..f7e122e717e8 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -832,6 +832,18 @@ void blk_queue_flush_queueable(struct request_queue *q, bool queueable)
EXPORT_SYMBOL_GPL(blk_queue_flush_queueable);
/**
+ * blk_set_queue_depth - tell the block layer about the device queue depth
+ * @q: the request queue for the device
+ * @depth: queue depth
+ *
+ */
+void blk_set_queue_depth(struct request_queue *q, unsigned int depth)
+{
+ q->queue_depth = depth;
+}
+EXPORT_SYMBOL(blk_set_queue_depth);
+
+/**
* blk_queue_write_cache - configure queue's write cache
* @q: the request queue for the device
* @wc: write back cache on or off
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 1deb6adc411f..75455d4dab68 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -621,6 +621,9 @@ int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
wmb();
}
+ if (sdev->request_queue)
+ blk_set_queue_depth(sdev->request_queue, depth);
+
return sdev->queue_depth;
}
EXPORT_SYMBOL(scsi_change_queue_depth);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index fc1894996b12..eee94bd6de52 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -315,6 +315,8 @@ struct request_queue {
struct blk_mq_ctx __percpu *queue_ctx;
unsigned int nr_queues;
+ unsigned int queue_depth;
+
/* hw dispatch queues */
struct blk_mq_hw_ctx **queue_hw_ctx;
unsigned int nr_hw_queues;
@@ -681,6 +683,14 @@ static inline bool blk_write_same_mergeable(struct bio *a, struct bio *b)
return false;
}
+static inline unsigned int blk_queue_depth(struct request_queue *q)
+{
+ if (q->queue_depth)
+ return q->queue_depth;
+
+ return q->nr_requests;
+}
+
/*
* q->prep_rq_fn return values
*/
@@ -984,6 +994,7 @@ extern void blk_limits_io_min(struct queue_limits *limits, unsigned int min);
extern void blk_queue_io_min(struct request_queue *q, unsigned int min);
extern void blk_limits_io_opt(struct queue_limits *limits, unsigned int opt);
extern void blk_queue_io_opt(struct request_queue *q, unsigned int opt);
+extern void blk_set_queue_depth(struct request_queue *q, unsigned int depth);
extern void blk_set_default_limits(struct queue_limits *lim);
extern void blk_set_stacking_limits(struct queue_limits *lim);
extern int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
--
2.8.0.rc4.6.g7e4ba36
Add wbc_to_write_cmd(), which returns the write type to use, based on a
struct writeback_control. No functional changes in this patch, but it
prepares us for factoring in other wbc fields when picking the write type.
Signed-off-by: Jens Axboe <[email protected]>
---
fs/block_dev.c | 2 +-
fs/buffer.c | 2 +-
fs/f2fs/data.c | 2 +-
fs/f2fs/node.c | 2 +-
fs/gfs2/meta_io.c | 3 +--
fs/mpage.c | 9 ++++-----
fs/xfs/xfs_aops.c | 2 +-
include/linux/writeback.h | 8 ++++++++
8 files changed, 18 insertions(+), 12 deletions(-)
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 20a2c02b77c4..8662da6aa07c 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -432,7 +432,7 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
struct page *page, struct writeback_control *wbc)
{
int result;
- int rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE;
+ int rw = wbc_to_write_cmd(wbc);
const struct block_device_operations *ops = bdev->bd_disk->fops;
if (!ops->rw_page || bdev_get_integrity(bdev))
diff --git a/fs/buffer.c b/fs/buffer.c
index af0d9a82a8ed..46763c58e786 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -1697,7 +1697,7 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
struct buffer_head *bh, *head;
unsigned int blocksize, bbits;
int nr_underway = 0;
- int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
+ int write_op = wbc_to_write_cmd(wbc);
head = create_page_buffers(page, inode,
(1 << BH_Dirty)|(1 << BH_Uptodate));
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 5dafb9cef12e..e4e81ce663c5 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -1153,7 +1153,7 @@ static int f2fs_write_data_page(struct page *page,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = DATA,
- .rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE,
+ .rw = wbc_to_write_cmd(wbc),
.page = page,
.encrypted_page = NULL,
};
diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 1a33de9d84b1..3b377258dc09 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -1397,7 +1397,7 @@ static int f2fs_write_node_page(struct page *page,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = NODE,
- .rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE,
+ .rw = wbc_to_write_cmd(wbc),
.page = page,
.encrypted_page = NULL,
};
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index 0448524c11bc..3fdfa3848f18 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -37,8 +37,7 @@ static int gfs2_aspace_writepage(struct page *page, struct writeback_control *wb
{
struct buffer_head *bh, *head;
int nr_underway = 0;
- int write_op = REQ_META | REQ_PRIO |
- (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
+ int write_op = REQ_META | REQ_PRIO | wbc_to_write_cmd(wbc);
BUG_ON(!PageLocked(page));
BUG_ON(!page_has_buffers(page));
diff --git a/fs/mpage.c b/fs/mpage.c
index eedc644b78d7..bcbdb61b24f1 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -486,7 +486,6 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
struct buffer_head map_bh;
loff_t i_size = i_size_read(inode);
int ret = 0;
- int wr = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
if (page_has_buffers(page)) {
struct buffer_head *head = page_buffers(page);
@@ -595,7 +594,7 @@ page_is_mapped:
* This page will go to BIO. Do we need to send this BIO off first?
*/
if (bio && mpd->last_block_in_bio != blocks[0] - 1)
- bio = mpage_bio_submit(wr, bio);
+ bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
alloc_new:
if (bio == NULL) {
@@ -622,7 +621,7 @@ alloc_new:
wbc_account_io(wbc, page, PAGE_SIZE);
length = first_unmapped << blkbits;
if (bio_add_page(bio, page, length, 0) < length) {
- bio = mpage_bio_submit(wr, bio);
+ bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
goto alloc_new;
}
@@ -632,7 +631,7 @@ alloc_new:
set_page_writeback(page);
unlock_page(page);
if (boundary || (first_unmapped != blocks_per_page)) {
- bio = mpage_bio_submit(wr, bio);
+ bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
if (boundary_block) {
write_boundary_block(boundary_bdev,
boundary_block, 1 << blkbits);
@@ -644,7 +643,7 @@ alloc_new:
confused:
if (bio)
- bio = mpage_bio_submit(wr, bio);
+ bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
if (mpd->use_writepage) {
ret = mapping->a_ops->writepage(page, wbc);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index e49b2406d15d..e6c721f4153b 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -393,7 +393,7 @@ xfs_submit_ioend_bio(
atomic_inc(&ioend->io_remaining);
bio->bi_private = ioend;
bio->bi_end_io = xfs_end_bio;
- submit_bio(wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE, bio);
+ submit_bio(wbc_to_write_cmd(wbc), bio);
}
STATIC struct bio *
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d0b5ca5d4e08..aa66fa05ff0d 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -100,6 +100,14 @@ struct writeback_control {
#endif
};
+static inline int wbc_to_write_cmd(struct writeback_control *wbc)
+{
+ if (wbc->sync_mode == WB_SYNC_ALL)
+ return WRITE_SYNC;
+
+ return WRITE;
+}
+
/*
* A wb_domain represents a domain that wb's (bdi_writeback's) belong to
* and are measured against each other in. There always is one global
--
2.8.0.rc4.6.g7e4ba36
If we end up waiting on a page that is dirty or marked writeback,
then increment the corresponding bdi_writeback counter.
Signed-off-by: Jens Axboe <[email protected]>
---
mm/filemap.c | 42 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/mm/filemap.c b/mm/filemap.c
index f2479af09da9..a8854a083b71 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -764,37 +764,73 @@ wait_queue_head_t *page_waitqueue(struct page *page)
}
EXPORT_SYMBOL(page_waitqueue);
+static bool inc_dirty_wait(struct page *page)
+{
+ if (!page->mapping || !PageDirty(page) || !PageWriteback(page))
+ return false;
+ else {
+ struct bdi_writeback *wb = inode_to_wb(page->mapping->host);
+
+ atomic_inc(&wb->dirty_sleeping);
+ return true;
+ }
+}
+
+static void dec_dirty_wait(struct page *page)
+{
+ struct bdi_writeback *wb = inode_to_wb(page->mapping->host);
+
+ atomic_dec(&wb->dirty_sleeping);
+}
+
void wait_on_page_bit(struct page *page, int bit_nr)
{
DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
- if (test_bit(bit_nr, &page->flags))
+ if (test_bit(bit_nr, &page->flags)) {
+ bool did_inc = inc_dirty_wait(page);
__wait_on_bit(page_waitqueue(page), &wait, bit_wait_io,
TASK_UNINTERRUPTIBLE);
+ if (did_inc)
+ dec_dirty_wait(page);
+ }
}
EXPORT_SYMBOL(wait_on_page_bit);
int wait_on_page_bit_killable(struct page *page, int bit_nr)
{
DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
+ bool did_inc;
+ int ret;
if (!test_bit(bit_nr, &page->flags))
return 0;
- return __wait_on_bit(page_waitqueue(page), &wait,
+ did_inc = inc_dirty_wait(page);
+ ret = __wait_on_bit(page_waitqueue(page), &wait,
bit_wait_io, TASK_KILLABLE);
+ if (did_inc)
+ dec_dirty_wait(page);
+ return ret;
}
int wait_on_page_bit_killable_timeout(struct page *page,
int bit_nr, unsigned long timeout)
{
DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
+ bool did_inc;
+ int ret;
wait.key.timeout = jiffies + timeout;
if (!test_bit(bit_nr, &page->flags))
return 0;
- return __wait_on_bit(page_waitqueue(page), &wait,
+
+ did_inc = inc_dirty_wait(page);
+ ret = __wait_on_bit(page_waitqueue(page), &wait,
bit_wait_io_timeout, TASK_KILLABLE);
+ if (did_inc)
+ dec_dirty_wait(page);
+ return ret;
}
EXPORT_SYMBOL_GPL(wait_on_page_bit_killable_timeout);
--
2.8.0.rc4.6.g7e4ba36
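As an aside, the same inc_dirty_wait()/dec_dirty_wait() bracket repeats in
all three wait variants in the patch above. A hypothetical wrapper that
centralizes the pattern is sketched below; it is not part of the posted
patch (the timeout variant would additionally need wait.key.timeout set
before calling it), it only makes the inc/dec pairing explicit. The action
and mode arguments would be e.g. bit_wait_io and TASK_UNINTERRUPTIBLE or
TASK_KILLABLE, as in the patch.

static int wait_on_page_bit_accounted(struct page *page, int bit_nr,
				      wait_bit_action_f *action, unsigned mode)
{
	DEFINE_WAIT_BIT(wait, &page->flags, bit_nr);
	bool did_inc;
	int ret;

	if (!test_bit(bit_nr, &page->flags))
		return 0;

	/* Only count sleepers on pages that are dirty and under writeback */
	did_inc = inc_dirty_wait(page);
	ret = __wait_on_bit(page_waitqueue(page), &wait, action, mode);
	if (did_inc)
		dec_dirty_wait(page);
	return ret;
}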
Note in the bdi_writeback structure if a task is currently being
limited in balance_dirty_pages(), waiting for writeback to
proceed.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/backing-dev-defs.h | 2 ++
mm/backing-dev.c | 1 +
mm/page-writeback.c | 2 ++
3 files changed, 5 insertions(+)
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 3f103076d0bf..1212c374b928 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -116,6 +116,8 @@ struct bdi_writeback {
struct list_head work_list;
struct delayed_work dwork; /* work item used for writeback */
+ atomic_t dirty_sleeping; /* waiting on dirty limit exceeded */
+
struct list_head bdi_node; /* anchored at bdi->wb_list */
#ifdef CONFIG_CGROUP_WRITEBACK
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 0c6317b7db38..41db7dff11d0 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -310,6 +310,7 @@ static int wb_init(struct bdi_writeback *wb, struct backing_dev_info *bdi,
spin_lock_init(&wb->work_lock);
INIT_LIST_HEAD(&wb->work_list);
INIT_DELAYED_WORK(&wb->dwork, wb_workfn);
+ atomic_set(&wb->dirty_sleeping, 0);
wb->congested = wb_congested_get_create(bdi, blkcg_id, gfp);
if (!wb->congested)
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 999792d35ccc..028a3d4d7129 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -1746,7 +1746,9 @@ pause:
pause,
start_time);
__set_current_state(TASK_KILLABLE);
+ atomic_inc(&wb->dirty_sleeping);
io_schedule_timeout(pause);
+ atomic_dec(&wb->dirty_sleeping);
current->dirty_paused_when = now + pause;
current->nr_dirtied = 0;
--
2.8.0.rc4.6.g7e4ba36
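The counter is write-only in this patch; the intended consumer is the
block-layer throttling code, which can treat "someone is sleeping in
balance_dirty_pages()" as a signal that writeback is urgent and should not
be scaled back. This is presumably what the bdp_wait field in the wb_stats
sysfs output later in the thread reflects. A minimal sketch of such a check
is below; the helper name and the policy are illustrative, not code from
the series, and it assumes the root wb of the bdi.

static bool bdi_has_dirty_waiters(struct backing_dev_info *bdi)
{
	/*
	 * Non-zero means at least one task is blocked in
	 * balance_dirty_pages() waiting for writeback to make progress,
	 * so background writeback should not be throttled down further.
	 */
	return atomic_read(&bdi->wb.dirty_sleeping) != 0;
}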
This adds a new request flag, REQ_BG, that callers can use to tell
the block layer that this is background (non-urgent) IO.
Signed-off-by: Jens Axboe <[email protected]>
---
include/linux/blk_types.h | 4 +++-
include/linux/fs.h | 4 ++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 86a38ea1823f..223012451c7a 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -161,6 +161,7 @@ enum rq_flag_bits {
__REQ_INTEGRITY, /* I/O includes block integrity payload */
__REQ_FUA, /* forced unit access */
__REQ_FLUSH, /* request for cache flush */
+ __REQ_BG, /* background activity */
/* bio only flags */
__REQ_RAHEAD, /* read ahead, can fail anytime */
@@ -208,7 +209,7 @@ enum rq_flag_bits {
#define REQ_COMMON_MASK \
(REQ_WRITE | REQ_FAILFAST_MASK | REQ_SYNC | REQ_META | REQ_PRIO | \
REQ_DISCARD | REQ_WRITE_SAME | REQ_NOIDLE | REQ_FLUSH | REQ_FUA | \
- REQ_SECURE | REQ_INTEGRITY)
+ REQ_SECURE | REQ_INTEGRITY | REQ_BG)
#define REQ_CLONE_MASK REQ_COMMON_MASK
#define BIO_NO_ADVANCE_ITER_MASK (REQ_DISCARD|REQ_WRITE_SAME)
@@ -235,6 +236,7 @@ enum rq_flag_bits {
#define REQ_COPY_USER (1ULL << __REQ_COPY_USER)
#define REQ_FLUSH (1ULL << __REQ_FLUSH)
#define REQ_FLUSH_SEQ (1ULL << __REQ_FLUSH_SEQ)
+#define REQ_BG (1ULL << __REQ_BG)
#define REQ_IO_STAT (1ULL << __REQ_IO_STAT)
#define REQ_MIXED_MERGE (1ULL << __REQ_MIXED_MERGE)
#define REQ_SECURE (1ULL << __REQ_SECURE)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 70e61b58baaf..bb8f951cc619 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -192,6 +192,9 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate);
* WRITE_FLUSH_FUA Combination of WRITE_FLUSH and FUA. The IO is preceded
* by a cache flush and data is guaranteed to be on
* non-volatile media on completion.
+ * WRITE_BG Background write. This is for background activity like
+ * the periodic flush and background threshold writeback
+ *
*
*/
#define RW_MASK REQ_WRITE
@@ -207,6 +210,7 @@ typedef void (dax_iodone_t)(struct buffer_head *bh_map, int uptodate);
#define WRITE_FLUSH (WRITE | REQ_SYNC | REQ_NOIDLE | REQ_FLUSH)
#define WRITE_FUA (WRITE | REQ_SYNC | REQ_NOIDLE | REQ_FUA)
#define WRITE_FLUSH_FUA (WRITE | REQ_SYNC | REQ_NOIDLE | REQ_FLUSH | REQ_FUA)
+#define WRITE_BG (WRITE | REQ_NOIDLE | REQ_BG)
/*
* Attribute flags. These should be or-ed together to figure out what
--
2.8.0.rc4.6.g7e4ba36
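WRITE_BG only has an effect once the writeback path actually tags its IO
with it. One plausible way to do that is to extend the wbc_to_write_cmd()
helper from the first patch so that background and kupdate-style writeback
map to WRITE_BG, using the existing for_kupdate/for_background fields of
struct writeback_control. The sketch below is an assumption about how the
series wires this up, not a quoted patch.

static inline int wbc_to_write_cmd(struct writeback_control *wbc)
{
	if (wbc->sync_mode == WB_SYNC_ALL)
		return WRITE_SYNC;
	if (wbc->for_kupdate || wbc->for_background)
		return WRITE_BG;

	return WRITE;
}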
On Sun 17-04-16 23:24:41, Jens Axboe wrote:
> Add wbc_to_write_cmd(), which returns the write type to use, based on a
> struct writeback_control. No functional changes in this patch, but it
> prepares us for factoring other wbc fields for write type.
>
> Signed-off-by: Jens Axboe <[email protected]>
Looks good. You can add:
Reviewed-by: Jan Kara <[email protected]>
Honza
> ---
> fs/block_dev.c | 2 +-
> fs/buffer.c | 2 +-
> fs/f2fs/data.c | 2 +-
> fs/f2fs/node.c | 2 +-
> fs/gfs2/meta_io.c | 3 +--
> fs/mpage.c | 9 ++++-----
> fs/xfs/xfs_aops.c | 2 +-
> include/linux/writeback.h | 8 ++++++++
> 8 files changed, 18 insertions(+), 12 deletions(-)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 20a2c02b77c4..8662da6aa07c 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -432,7 +432,7 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
> struct page *page, struct writeback_control *wbc)
> {
> int result;
> - int rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE;
> + int rw = wbc_to_write_cmd(wbc);
> const struct block_device_operations *ops = bdev->bd_disk->fops;
>
> if (!ops->rw_page || bdev_get_integrity(bdev))
> diff --git a/fs/buffer.c b/fs/buffer.c
> index af0d9a82a8ed..46763c58e786 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -1697,7 +1697,7 @@ static int __block_write_full_page(struct inode *inode, struct page *page,
> struct buffer_head *bh, *head;
> unsigned int blocksize, bbits;
> int nr_underway = 0;
> - int write_op = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
> + int write_op = wbc_to_write_cmd(wbc);
>
> head = create_page_buffers(page, inode,
> (1 << BH_Dirty)|(1 << BH_Uptodate));
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index 5dafb9cef12e..e4e81ce663c5 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -1153,7 +1153,7 @@ static int f2fs_write_data_page(struct page *page,
> struct f2fs_io_info fio = {
> .sbi = sbi,
> .type = DATA,
> - .rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE,
> + .rw = wbc_to_write_cmd(wbc),
> .page = page,
> .encrypted_page = NULL,
> };
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 1a33de9d84b1..3b377258dc09 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -1397,7 +1397,7 @@ static int f2fs_write_node_page(struct page *page,
> struct f2fs_io_info fio = {
> .sbi = sbi,
> .type = NODE,
> - .rw = (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : WRITE,
> + .rw = wbc_to_write_cmd(wbc),
> .page = page,
> .encrypted_page = NULL,
> };
> diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
> index 0448524c11bc..3fdfa3848f18 100644
> --- a/fs/gfs2/meta_io.c
> +++ b/fs/gfs2/meta_io.c
> @@ -37,8 +37,7 @@ static int gfs2_aspace_writepage(struct page *page, struct writeback_control *wb
> {
> struct buffer_head *bh, *head;
> int nr_underway = 0;
> - int write_op = REQ_META | REQ_PRIO |
> - (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
> + int write_op = REQ_META | REQ_PRIO | wbc_to_write_cmd(wbc);
>
> BUG_ON(!PageLocked(page));
> BUG_ON(!page_has_buffers(page));
> diff --git a/fs/mpage.c b/fs/mpage.c
> index eedc644b78d7..bcbdb61b24f1 100644
> --- a/fs/mpage.c
> +++ b/fs/mpage.c
> @@ -486,7 +486,6 @@ static int __mpage_writepage(struct page *page, struct writeback_control *wbc,
> struct buffer_head map_bh;
> loff_t i_size = i_size_read(inode);
> int ret = 0;
> - int wr = (wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE);
>
> if (page_has_buffers(page)) {
> struct buffer_head *head = page_buffers(page);
> @@ -595,7 +594,7 @@ page_is_mapped:
> * This page will go to BIO. Do we need to send this BIO off first?
> */
> if (bio && mpd->last_block_in_bio != blocks[0] - 1)
> - bio = mpage_bio_submit(wr, bio);
> + bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
>
> alloc_new:
> if (bio == NULL) {
> @@ -622,7 +621,7 @@ alloc_new:
> wbc_account_io(wbc, page, PAGE_SIZE);
> length = first_unmapped << blkbits;
> if (bio_add_page(bio, page, length, 0) < length) {
> - bio = mpage_bio_submit(wr, bio);
> + bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
> goto alloc_new;
> }
>
> @@ -632,7 +631,7 @@ alloc_new:
> set_page_writeback(page);
> unlock_page(page);
> if (boundary || (first_unmapped != blocks_per_page)) {
> - bio = mpage_bio_submit(wr, bio);
> + bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
> if (boundary_block) {
> write_boundary_block(boundary_bdev,
> boundary_block, 1 << blkbits);
> @@ -644,7 +643,7 @@ alloc_new:
>
> confused:
> if (bio)
> - bio = mpage_bio_submit(wr, bio);
> + bio = mpage_bio_submit(wbc_to_write_cmd(wbc), bio);
>
> if (mpd->use_writepage) {
> ret = mapping->a_ops->writepage(page, wbc);
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index e49b2406d15d..e6c721f4153b 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -393,7 +393,7 @@ xfs_submit_ioend_bio(
> atomic_inc(&ioend->io_remaining);
> bio->bi_private = ioend;
> bio->bi_end_io = xfs_end_bio;
> - submit_bio(wbc->sync_mode == WB_SYNC_ALL ? WRITE_SYNC : WRITE, bio);
> + submit_bio(wbc_to_write_cmd(wbc), bio);
> }
>
> STATIC struct bio *
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d0b5ca5d4e08..aa66fa05ff0d 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -100,6 +100,14 @@ struct writeback_control {
> #endif
> };
>
> +static inline int wbc_to_write_cmd(struct writeback_control *wbc)
> +{
> + if (wbc->sync_mode == WB_SYNC_ALL)
> + return WRITE_SYNC;
> +
> + return WRITE;
> +}
> +
> /*
> * A wb_domain represents a domain that wb's (bdi_writeback's) belong to
> * and are measured against each other in. There always is one global
> --
> 2.8.0.rc4.6.g7e4ba36
>
--
Jan Kara <[email protected]>
SUSE Labs, CR
On 04/18/2016 11:12 AM, Jan Kara wrote:
> On Sun 17-04-16 23:24:41, Jens Axboe wrote:
>> Add wbc_to_write_cmd(), which returns the write type to use, based on a
>> struct writeback_control. No functional changes in this patch, but it
>> prepares us for factoring other wbc fields for write type.
>>
>> Signed-off-by: Jens Axboe <[email protected]>
>
> Looks good. You can add:
>
> Reviewed-by: Jan Kara <[email protected]>
Thanks Jan, added!
--
Jens Axboe
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 40b57bf4852c..d941f69dfb4b 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -39,6 +39,7 @@
>
> #include "blk.h"
> #include "blk-mq.h"
> +#include "blk-wb.h"
>
> EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
> EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
> @@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>
> fail:
> blk_free_flush_queue(q->fq);
> + blk_wb_exit(q);
> return NULL;
> }
> EXPORT_SYMBOL(blk_init_allocated_queue);
> @@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
> blk_delete_timer(rq);
> blk_clear_rq_complete(rq);
> trace_block_rq_requeue(q, rq);
> + blk_wb_requeue(q->rq_wb, rq);
>
> if (rq->cmd_flags & REQ_QUEUED)
> blk_queue_end_tag(q, rq);
> @@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
> /* this is a bio leak */
> WARN_ON(req->bio != NULL);
>
> + blk_wb_done(q->rq_wb, req);
> +
> /*
> * Request may not have originated from ll_rw_blk. if not,
> * it didn't come out of our reserved rq pools
> @@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
> int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
> struct request *req;
> unsigned int request_count = 0;
> + bool wb_acct;
>
> /*
> * low level driver can indicate that it wants pages above a
> @@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
> }
>
> get_rq:
> + wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
> +
> /*
> * This sync check and mask will be re-done in init_request_from_bio(),
> * but we need to set it earlier to expose the sync flag to the
> @@ -1781,11 +1789,16 @@ get_rq:
> */
> req = get_request(q, rw_flags, bio, GFP_NOIO);
> if (IS_ERR(req)) {
> + if (wb_acct)
> + __blk_wb_done(q->rq_wb);
> bio->bi_error = PTR_ERR(req);
> bio_endio(bio);
> goto out_unlock;
> }
>
> + if (wb_acct)
> + req->cmd_flags |= REQ_BUF_INFLIGHT;
> +
> /*
> * After dropping the lock and possibly sleeping here, our request
> * may now be mergeable after it had proven unmergeable (above).
> @@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
> blk_dequeue_request(req);
>
> req->issue_time = ktime_to_ns(ktime_get());
> + blk_wb_issue(req->q->rq_wb, req);
>
> /*
> * We are now handing the request to the hardware, initialize
> @@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
> blk_unprep_request(req);
>
> blk_account_io_done(req);
> + blk_wb_done(req->q->rq_wb, req);
Hi Jens,
It seems the function blk_wb_done() will be executed twice even if the end_io
callback is set.
Maybe the same thing would happen in blk-mq.c.
>
> if (req->end_io)
> req->end_io(req, error);
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 71b4a13fbf94..c0c5207fe7fd 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -30,6 +30,7 @@
> #include "blk-mq.h"
> #include "blk-mq-tag.h"
> #include "blk-stat.h"
> +#include "blk-wb.h"
>
> static DEFINE_MUTEX(all_q_mutex);
> static LIST_HEAD(all_q_list);
> @@ -275,6 +276,9 @@ static void __blk_mq_free_request(struct blk_mq_hw_ctx *hctx,
>
> if (rq->cmd_flags & REQ_MQ_INFLIGHT)
> atomic_dec(&hctx->nr_active);
> +
> + blk_wb_done(q->rq_wb, rq);
> +
> rq->cmd_flags = 0;
>
> clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
> @@ -305,6 +309,7 @@ EXPORT_SYMBOL_GPL(blk_mq_free_request);
> inline void __blk_mq_end_request(struct request *rq, int error)
> {
> blk_account_io_done(rq);
> + blk_wb_done(rq->q->rq_wb, rq);
>
> if (rq->end_io) {
> rq->end_io(rq, error);
> @@ -414,6 +419,7 @@ void blk_mq_start_request(struct request *rq)
> rq->next_rq->resid_len = blk_rq_bytes(rq->next_rq);
>
> rq->issue_time = ktime_to_ns(ktime_get());
> + blk_wb_issue(q->rq_wb, rq);
>
> blk_add_timer(rq);
>
> @@ -450,6 +456,7 @@ static void __blk_mq_requeue_request(struct request *rq)
> struct request_queue *q = rq->q;
>
> trace_block_rq_requeue(q, rq);
> + blk_wb_requeue(q->rq_wb, rq);
>
> if (test_and_clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) {
> if (q->dma_drain_size && blk_rq_bytes(rq))
> @@ -1265,6 +1272,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> struct blk_plug *plug;
> struct request *same_queue_rq = NULL;
> blk_qc_t cookie;
> + bool wb_acct;
>
> blk_queue_bounce(q, &bio);
>
> @@ -1282,9 +1290,17 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> } else
> request_count = blk_plug_queued_count(q);
>
> + wb_acct = blk_wb_wait(q->rq_wb, bio, NULL);
> +
> rq = blk_mq_map_request(q, bio, &data);
> - if (unlikely(!rq))
> + if (unlikely(!rq)) {
> + if (wb_acct)
> + __blk_wb_done(q->rq_wb);
> return BLK_QC_T_NONE;
> + }
> +
> + if (wb_acct)
> + rq->cmd_flags |= REQ_BUF_INFLIGHT;
>
> cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
>
> @@ -1361,6 +1377,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
> struct blk_map_ctx data;
> struct request *rq;
> blk_qc_t cookie;
> + bool wb_acct;
>
> blk_queue_bounce(q, &bio);
>
> @@ -1375,9 +1392,17 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
> blk_attempt_plug_merge(q, bio, &request_count, NULL))
> return BLK_QC_T_NONE;
>
> + wb_acct = blk_wb_wait(q->rq_wb, bio, NULL);
> +
> rq = blk_mq_map_request(q, bio, &data);
> - if (unlikely(!rq))
> + if (unlikely(!rq)) {
> + if (wb_acct)
> + __blk_wb_done(q->rq_wb);
> return BLK_QC_T_NONE;
> + }
> +
> + if (wb_acct)
> + rq->cmd_flags |= REQ_BUF_INFLIGHT;
>
> cookie = blk_tag_to_qc_t(rq->tag, data.hctx->queue_num);
>
> @@ -2111,6 +2136,8 @@ void blk_mq_free_queue(struct request_queue *q)
> list_del_init(&q->all_q_node);
--
Regards
Kaixu Xia
On 04/23/2016 02:21 AM, xiakaixu wrote:
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 40b57bf4852c..d941f69dfb4b 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -39,6 +39,7 @@
>>
>> #include "blk.h"
>> #include "blk-mq.h"
>> +#include "blk-wb.h"
>>
>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
>> @@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>>
>> fail:
>> blk_free_flush_queue(q->fq);
>> + blk_wb_exit(q);
>> return NULL;
>> }
>> EXPORT_SYMBOL(blk_init_allocated_queue);
>> @@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
>> blk_delete_timer(rq);
>> blk_clear_rq_complete(rq);
>> trace_block_rq_requeue(q, rq);
>> + blk_wb_requeue(q->rq_wb, rq);
>>
>> if (rq->cmd_flags & REQ_QUEUED)
>> blk_queue_end_tag(q, rq);
>> @@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
>> /* this is a bio leak */
>> WARN_ON(req->bio != NULL);
>>
>> + blk_wb_done(q->rq_wb, req);
>> +
>> /*
>> * Request may not have originated from ll_rw_blk. if not,
>> * it didn't come out of our reserved rq pools
>> @@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>> int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
>> struct request *req;
>> unsigned int request_count = 0;
>> + bool wb_acct;
>>
>> /*
>> * low level driver can indicate that it wants pages above a
>> @@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>> }
>>
>> get_rq:
>> + wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
>> +
>> /*
>> * This sync check and mask will be re-done in init_request_from_bio(),
>> * but we need to set it earlier to expose the sync flag to the
>> @@ -1781,11 +1789,16 @@ get_rq:
>> */
>> req = get_request(q, rw_flags, bio, GFP_NOIO);
>> if (IS_ERR(req)) {
>> + if (wb_acct)
>> + __blk_wb_done(q->rq_wb);
>> bio->bi_error = PTR_ERR(req);
>> bio_endio(bio);
>> goto out_unlock;
>> }
>>
>> + if (wb_acct)
>> + req->cmd_flags |= REQ_BUF_INFLIGHT;
>> +
>> /*
>> * After dropping the lock and possibly sleeping here, our request
>> * may now be mergeable after it had proven unmergeable (above).
>> @@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
>> blk_dequeue_request(req);
>>
>> req->issue_time = ktime_to_ns(ktime_get());
>> + blk_wb_issue(req->q->rq_wb, req);
>>
>> /*
>> * We are now handing the request to the hardware, initialize
>> @@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
>> blk_unprep_request(req);
>>
>> blk_account_io_done(req);
>> + blk_wb_done(req->q->rq_wb, req);
>
> Hi Jens,
>
> It seems the function blk_wb_done() will be executed twice even if the end_io
> callback is set.
> Maybe the same thing would happen in blk-mq.c.
Yeah, that was a mistake, the current version has it fixed. It was
inadvertently added when I discovered that the flush request didn't work
properly. Now it just duplicates the call inside the check for whether it has
an ->end_io() defined, since we don't use the normal path for that.
--
Jens Axboe
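For reference, the structure being described is roughly the following. This
is only a sketch of the idea, not the committed fix: the accounting call is
made in the ->end_io() branch, because requests with a private completion
handler never reach the normal free path (__blk_mq_free_request()), where
blk_wb_done() already runs.

inline void __blk_mq_end_request(struct request *rq, int error)
{
	blk_account_io_done(rq);

	if (rq->end_io) {
		/* Private completion: won't pass through __blk_mq_free_request() */
		blk_wb_done(rq->q->rq_wb, rq);
		rq->end_io(rq, error);
	} else {
		if (unlikely(blk_bidi_rq(rq)))
			blk_mq_free_request(rq->next_rq);
		blk_mq_free_request(rq);
	}
}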
On 2016/4/24 5:37, Jens Axboe wrote:
> On 04/23/2016 02:21 AM, xiakaixu wrote:
>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>> index 40b57bf4852c..d941f69dfb4b 100644
>>> --- a/block/blk-core.c
>>> +++ b/block/blk-core.c
>>> @@ -39,6 +39,7 @@
>>>
>>> #include "blk.h"
>>> #include "blk-mq.h"
>>> +#include "blk-wb.h"
>>>
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
>>> @@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>>>
>>> fail:
>>> blk_free_flush_queue(q->fq);
>>> + blk_wb_exit(q);
>>> return NULL;
>>> }
>>> EXPORT_SYMBOL(blk_init_allocated_queue);
>>> @@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
>>> blk_delete_timer(rq);
>>> blk_clear_rq_complete(rq);
>>> trace_block_rq_requeue(q, rq);
>>> + blk_wb_requeue(q->rq_wb, rq);
>>>
>>> if (rq->cmd_flags & REQ_QUEUED)
>>> blk_queue_end_tag(q, rq);
>>> @@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
>>> /* this is a bio leak */
>>> WARN_ON(req->bio != NULL);
>>>
>>> + blk_wb_done(q->rq_wb, req);
>>> +
>>> /*
>>> * Request may not have originated from ll_rw_blk. if not,
>>> * it didn't come out of our reserved rq pools
>>> @@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>> int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
>>> struct request *req;
>>> unsigned int request_count = 0;
>>> + bool wb_acct;
>>>
>>> /*
>>> * low level driver can indicate that it wants pages above a
>>> @@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>> }
>>>
>>> get_rq:
>>> + wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
>>> +
>>> /*
>>> * This sync check and mask will be re-done in init_request_from_bio(),
>>> * but we need to set it earlier to expose the sync flag to the
>>> @@ -1781,11 +1789,16 @@ get_rq:
>>> */
>>> req = get_request(q, rw_flags, bio, GFP_NOIO);
>>> if (IS_ERR(req)) {
>>> + if (wb_acct)
>>> + __blk_wb_done(q->rq_wb);
>>> bio->bi_error = PTR_ERR(req);
>>> bio_endio(bio);
>>> goto out_unlock;
>>> }
>>>
>>> + if (wb_acct)
>>> + req->cmd_flags |= REQ_BUF_INFLIGHT;
>>> +
>>> /*
>>> * After dropping the lock and possibly sleeping here, our request
>>> * may now be mergeable after it had proven unmergeable (above).
>>> @@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
>>> blk_dequeue_request(req);
>>>
>>> req->issue_time = ktime_to_ns(ktime_get());
>>> + blk_wb_issue(req->q->rq_wb, req);
>>>
>>> /*
>>> * We are now handing the request to the hardware, initialize
>>> @@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
>>> blk_unprep_request(req);
>>>
>>> blk_account_io_done(req);
>>> + blk_wb_done(req->q->rq_wb, req);
>>
>> Hi Jens,
>>
>> It seems the function blk_wb_done() will be executed twice even if the end_io
>> callback is set.
>> Maybe the same thing would happen in blk-mq.c.
>
> Yeah, that was a mistake, the current version has it fixed. It was inadvertently added when I discovered that the flush request didn't work properly. Now it just duplicates the call inside the check for whether it has an ->end_io() defined, since we don't use the normal path for that.
>
Hi Jens,
I have checked the wb-buf-throttle branch in your block git repo; I am not sure it is the complete version.
It seems the problem is only fixed in blk-mq.c. The function blk_wb_done() would still be executed twice in blk-core.c
(in the functions blk_finish_request() and __blk_put_request()).
Maybe we can add a flag to mark whether blk_wb_done() has already been done.
--
Regards
Kaixu Xia
On 04/25/2016 05:41 AM, xiakaixu wrote:
> 于 2016/4/24 5:37, Jens Axboe 写道:
>> On 04/23/2016 02:21 AM, xiakaixu wrote:
>>>> diff --git a/block/blk-core.c b/block/blk-core.c
>>>> index 40b57bf4852c..d941f69dfb4b 100644
>>>> --- a/block/blk-core.c
>>>> +++ b/block/blk-core.c
>>>> @@ -39,6 +39,7 @@
>>>>
>>>> #include "blk.h"
>>>> #include "blk-mq.h"
>>>> +#include "blk-wb.h"
>>>>
>>>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_bio_remap);
>>>> EXPORT_TRACEPOINT_SYMBOL_GPL(block_rq_remap);
>>>> @@ -880,6 +881,7 @@ blk_init_allocated_queue(struct request_queue *q, request_fn_proc *rfn,
>>>>
>>>> fail:
>>>> blk_free_flush_queue(q->fq);
>>>> + blk_wb_exit(q);
>>>> return NULL;
>>>> }
>>>> EXPORT_SYMBOL(blk_init_allocated_queue);
>>>> @@ -1395,6 +1397,7 @@ void blk_requeue_request(struct request_queue *q, struct request *rq)
>>>> blk_delete_timer(rq);
>>>> blk_clear_rq_complete(rq);
>>>> trace_block_rq_requeue(q, rq);
>>>> + blk_wb_requeue(q->rq_wb, rq);
>>>>
>>>> if (rq->cmd_flags & REQ_QUEUED)
>>>> blk_queue_end_tag(q, rq);
>>>> @@ -1485,6 +1488,8 @@ void __blk_put_request(struct request_queue *q, struct request *req)
>>>> /* this is a bio leak */
>>>> WARN_ON(req->bio != NULL);
>>>>
>>>> + blk_wb_done(q->rq_wb, req);
>>>> +
>>>> /*
>>>> * Request may not have originated from ll_rw_blk. if not,
>>>> * it didn't come out of our reserved rq pools
>>>> @@ -1714,6 +1719,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>>> int el_ret, rw_flags, where = ELEVATOR_INSERT_SORT;
>>>> struct request *req;
>>>> unsigned int request_count = 0;
>>>> + bool wb_acct;
>>>>
>>>> /*
>>>> * low level driver can indicate that it wants pages above a
>>>> @@ -1766,6 +1772,8 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
>>>> }
>>>>
>>>> get_rq:
>>>> + wb_acct = blk_wb_wait(q->rq_wb, bio, q->queue_lock);
>>>> +
>>>> /*
>>>> * This sync check and mask will be re-done in init_request_from_bio(),
>>>> * but we need to set it earlier to expose the sync flag to the
>>>> @@ -1781,11 +1789,16 @@ get_rq:
>>>> */
>>>> req = get_request(q, rw_flags, bio, GFP_NOIO);
>>>> if (IS_ERR(req)) {
>>>> + if (wb_acct)
>>>> + __blk_wb_done(q->rq_wb);
>>>> bio->bi_error = PTR_ERR(req);
>>>> bio_endio(bio);
>>>> goto out_unlock;
>>>> }
>>>>
>>>> + if (wb_acct)
>>>> + req->cmd_flags |= REQ_BUF_INFLIGHT;
>>>> +
>>>> /*
>>>> * After dropping the lock and possibly sleeping here, our request
>>>> * may now be mergeable after it had proven unmergeable (above).
>>>> @@ -2515,6 +2528,7 @@ void blk_start_request(struct request *req)
>>>> blk_dequeue_request(req);
>>>>
>>>> req->issue_time = ktime_to_ns(ktime_get());
>>>> + blk_wb_issue(req->q->rq_wb, req);
>>>>
>>>> /*
>>>> * We are now handing the request to the hardware, initialize
>>>> @@ -2751,6 +2765,7 @@ void blk_finish_request(struct request *req, int error)
>>>> blk_unprep_request(req);
>>>>
>>>> blk_account_io_done(req);
>>>> + blk_wb_done(req->q->rq_wb, req);
>>>
>>> Hi Jens,
>>>
>>> It seems the function blk_wb_done() will be executed twice even if the end_io
>>> callback is set.
>>> Maybe the same thing would happen in blk-mq.c.
>>
>> Yeah, that was a mistake, the current version has it fixed. It was inadvertently added when I discovered that the flush request didn't work properly. Now it just duplicates the call inside the check for whether it has an ->end_io() defined, since we don't use the normal path for that.
>>
> Hi Jens,
>
> I have checked the wb-buf-throttle branch in your block git repo; I am not sure it is the complete version.
> It seems the problem is only fixed in blk-mq.c. The function blk_wb_done() would still be executed twice in blk-core.c
> (in the functions blk_finish_request() and __blk_put_request()).
> Maybe we can add a flag to mark whether blk_wb_done() has already been done.
Good catch, looks like I only patched up the mq bits. It's still not
perfect, since we could potentially double account a request that has a
private end_io(), if it was allocated through the normal block rq
allocator. It'll skew the unrelated-io-timestamp a bit, but it's not a
big deal. The count for inflight will be consistent, which is the
important part.
We currently have just 1 bit to tell if the request is tracked or not,
so we don't know if it was tracked but already seen.
I'll fix up the blk-core part to be identical to the blk-mq fix.
--
Jens Axboe
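A hypothetical way to make the accounting robust against double calls, along
the lines Kaixu suggests, is to clear the tracking bit the first time the
request is accounted, so a second blk_wb_done() becomes a no-op. The wrapper
below only illustrates that idea; REQ_BUF_INFLIGHT is the flag from the
posted patches, but this helper is not from the series. It keeps the
inflight count consistent, though as Jens notes it still cannot tell a
tracked-and-accounted request apart from an untracked one afterwards.

static void blk_wb_done_once(struct request_queue *q, struct request *rq)
{
	/* Not throttled, or already accounted for on a previous call */
	if (!(rq->cmd_flags & REQ_BUF_INFLIGHT))
		return;

	rq->cmd_flags &= ~REQ_BUF_INFLIGHT;
	blk_wb_done(q->rq_wb, rq);
}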
Hi Jens,
I am testing current linux-block.git#wb-buf-throttle on top of Linux
v4.6-rc5 here on my Ubuntu/precise AMD64.
( Was installed as a WUBI "test" system - "testing" since April 2012 :-) .)
Here are some numbers:
# df -T | egrep 'sda|loop'
/dev/sda2 fuseblk 465546236 210981868 254564368 46% /host
/dev/loop0 ext4 17753424 15586612 1241936 93% /
# egrep 'sda|loop|ext4' /etc/fstab
/host/ubuntu/disks/root.disk / ext4
loop,errors=remount-ro 0 1
/host/ubuntu/disks/swap.disk none swap loop,sw
0 0
( Not sure why I cannot do a find on /sys/block/ .)
# find /sys/devices/ -name '*wb_stat*' | egrep 'loop0|sda'
/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_stats
/sys/devices/virtual/block/loop0/queue/wb_stats
# find /sys/devices/ -name '*wb_lat*' | egrep 'loop0|sda'
/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_lat_usec
/sys/devices/virtual/block/loop0/queue/wb_lat_usec
# cat /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_stats
/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_lat_usec
background=8, normal=16, max=31, inflight=0, wait=0, bdp_wait=0
75000
# cat /sys/devices/virtual/block/loop0/queue/wb_stats
/sys/devices/virtual/block/loop0/queue/wb_lat_usec
background=16, normal=32, max=64, inflight=0, wait=0, bdp_wait=0
75000
Questions...
Planning a v5?
Will this go to Linux v4.7 or later?
How should someone test?
Documentation...
Can you add some more docs about getting this info (see the cat outputs
above) under the Documentation/ directory?
( Talking about the stuff you have embedded in the commit messages. )
Unfortunately, I have no data volume left (using a Mobile Broadband
Network @ 56kBps), so I cannot send you my dmesg, linux-config and
patchset.
Regards,
- Sedat -
On 04/26/2016 01:04 AM, Sedat Dilek wrote:
> Hi Jens,
>
> I am testing current linux-block.git#wb-buf-throttle on top of Linux
> v4.6-rc5 here on my Ubuntu/precise AMD64.
> ( Was installed as a WUBI "test" system - "testing" since April 2012 :-) .)
Great! Thanks for testing.
> Here are some numbers:
>
> # df -T | egrep 'sda|loop'
> /dev/sda2 fuseblk 465546236 210981868 254564368 46% /host
> /dev/loop0 ext4 17753424 15586612 1241936 93% /
>
> # egrep 'sda|loop|ext4' /etc/fstab
> /host/ubuntu/disks/root.disk / ext4
> loop,errors=remount-ro 0 1
> /host/ubuntu/disks/swap.disk none swap loop,sw
> 0 0
What kind of device is sda?
> ( Not sure why I cannot do a find on /sys/block/ .)
Probably the symlinks that confuse it.
> # find /sys/devices/ -name '*wb_stat*' | egrep 'loop0|sda'
> /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_stats
> /sys/devices/virtual/block/loop0/queue/wb_stats
>
> # find /sys/devices/ -name '*wb_lat*' | egrep 'loop0|sda'
> /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_lat_usec
> /sys/devices/virtual/block/loop0/queue/wb_lat_usec
>
> # cat /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_stats
> /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/block/sda/queue/wb_lat_usec
> background=8, normal=16, max=31, inflight=0, wait=0, bdp_wait=0
> 75000
>
> # cat /sys/devices/virtual/block/loop0/queue/wb_stats
> /sys/devices/virtual/block/loop0/queue/wb_lat_usec
> background=16, normal=32, max=64, inflight=0, wait=0, bdp_wait=0
> 75000
>
> Questions...
>
> Planning a v5?
Yes, I'll post a v5 today or something like that. Functionally not a
huge amount of changes, but it does have a few important bug fixes that
make it perform better. The biggest part is making it generic, so it can
be plugged into NFS as well, for instance.
> Will this go to Linux v4.7 or later?
> How should someone test?
Probably a bit too tight for 4.7, but one can always hope. 4.8 is
probably a more realistic target.
Testing really means running some readers while you have writes going on.
One example is doing something that reads while you have a dd writing to
your device. Or checking interactive feel while installing a lot of packages,
which tends to generate a ton of writes as well.
If you are so inclined, I'd encourage you to test the v5 I'll post later
today. I've run it through its paces on various devices.
> Documentation...
>
> Can you add some more docs about getting infos (see above cat#s etc.)
> below Documentation/ directory?
> ( Talking about the stuff you have embedded in the commit-messages. )
Yeah, I'll do that, it is a bit light right now. It'll be in the next
release.
--
Jens Axboe