2022-07-22 10:00:02

by Wang You

Subject: [PATCH v2 2/2] block/mq-deadline: Prioritize first request

The function deadline_head_request() selects the request at the head of
the sector-sorted red-black tree of the mq-deadline scheduler.
Dispatching such a request moves the disk access position back to the
lowest pending sector, which helps keep the head from swinging back and
forth across the disk.

- Scheduler request batching may reduce or even eliminate its ability
to merge and sort requests, so I sometimes set nr_sched_batch to 1
(see the sysfs example after this list).

- This patch may increase the risk of requests expiring; I am not sure
whether a stricter expiry check is necessary.

- I tested several disks (mainly rotational disks and some SSDs) with
the fio tool (using sync, direct, etc. parameters). The results show
improved small-block sequential read and write performance; does this
imply that changing nr_sched_batch is reasonable?

- Later testing on different hardware showed that the RAID controller
probably played an important role; the performance of a single disk did
not improve as expected, so I am not sure whether this patch really has
the desired effect.
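
For reference, assuming patch 1/2 of this series exposes nr_sched_batch
as an mq-deadline iosched sysfs attribute (the exact path below is an
assumption, modelled on existing attributes such as fifo_batch),
batching could presumably be disabled like this:

# switch the example device to mq-deadline and disable request batching
echo mq-deadline > /sys/block/sdb/queue/scheduler
echo 1 > /sys/block/sdb/queue/iosched/nr_sched_batch
cat /sys/block/sdb/queue/iosched/nr_sched_batch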

Thanks,

Wang.

The following are all test data:

The test hardware is:
Kunpeng-920, HW-SAS3508+(MG04ACA400N * 2), RAID0.

The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1
-iodepth=1 -thread -bs=512B -size=110g -numjobs=16 -runtime=300
-group_reporting -name=read -filename=/dev/sdb14
-ioscheduler=mq-deadline -rw=read[,write,rw]

The following is the test data:
origin/master:
read iops: 152421 write iops: 136959 rw iops: 54593,54581

nr_sched_batch = 1:
read iops: 166449 write iops: 139477 rw iops: 55363,55355

nr_sched_batch = 1, use deadline_head_request:
read iops: 171177 write iops: 184431 rw iops: 56178,56169


- The test hardware is:
Hygon C86, MG04ACA400N

The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
-thread -bs=512B -size=110g -numjobs=32 -runtime=300 -group_reporting
-name=read -filename=/dev/sdc -ioscheduler=mq-deadline -rw=read[,write,rw]

The following is the test data:
origin/master:
read iops: 15463 write iops: 5949 rw iops: 574,576

nr_sched_batch = 1:
read iops: 15082 write iops: 6283 rw iops: 783,786

nr_sched_batch = 1, use deadline_head_request:
read iops: 15368 write iops: 6575 rw iops: 907,906


- The test hardware is:
Kunpeng-920, HW-SAS3508 + Samsung SSD 780, RAID0.

The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
-thread -bs=512B -size=110g -numjobs=16 -runtime=300 -group_reporting
-name=read -filename=/dev/sda -ioscheduler=mq-deadline -rw=read[,write,rw]

The following is the test data:
origin/master:
read iops: 115399 write iops: 136801 rw iops: 58082,58084

nr_sched_batch = 1, use deadline_head_request:
read iops: 136473 write iops: 184646 rw iops: 56460,56454

Signed-off-by: Wang You <[email protected]>
---
block/mq-deadline.c | 43 ++++++++++++++++++++++++++++++++++++++++---
1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 1a9e835e816c..4660dd4a16f6 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -344,6 +344,36 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	return rq;
 }
 
+static inline struct request *
+deadline_head_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
+		      int data_dir)
+{
+	struct rb_node *node = rb_first(&per_prio->sort_list[data_dir]);
+	struct request *rq;
+	unsigned long flags;
+
+	if (!node)
+		return NULL;
+
+	rq = rb_entry_rq(node);
+	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	spin_lock_irqsave(&dd->zone_lock, flags);
+	while (rq) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			break;
+		rq = deadline_latter_request(rq);
+	}
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
+}
+
 /*
  * Returns true if and only if @rq started after @latest_start where
  * @latest_start is in jiffies.
@@ -429,13 +459,20 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	 * we are not running a batch, find best request for selected data_dir
 	 */
 	next_rq = deadline_next_request(dd, per_prio, data_dir);
-	if (deadline_check_fifo(per_prio, data_dir) || !next_rq) {
+	if (deadline_check_fifo(per_prio, data_dir)) {
 		/*
 		 * A deadline has expired, the last request was in the other
-		 * direction, or we have run out of higher-sectored requests.
-		 * Start again from the request with the earliest expiry time.
+		 * direction. Start again from the request with the earliest
+		 * expiry time.
 		 */
 		rq = deadline_fifo_request(dd, per_prio, data_dir);
+	} else if (!next_rq) {
+		/*
+		 * No request has expired and we have run out of
+		 * higher-sectored requests. Restart from the lowest-sectored
+		 * request, which may reduce disk seek overhead.
+		 */
+		rq = deadline_head_request(dd, per_prio, data_dir);
 	} else {
 		/*
 		 * The last req was the same dir and we have a next request in
--
2.27.0




2022-07-22 17:50:49

by Bart Van Assche

Subject: Re: [PATCH v2 2/2] block/mq-deadline: Prioritize first request

On 7/22/22 02:51, Wang You wrote:
> The test hardware is:
> Kunpeng-920, HW-SAS3508+(MG04ACA400N * 2), RAID0.

What is MG04ACA400N? The test results suggest that it is an SSD but this
is something that should be mentioned explicitly.

> - The test hardware is:
> Hygon C86, MG04ACA400N

What is MG04ACA400N?

> The test command is:
> fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
> -thread -bs=512B -size=110g -numjobs=32 -runtime=300 -group_reporting
> -name=read -filename=/dev/sdc -ioscheduler=mq-deadline -rw=read[,write,rw]
>
> The following is the test data:
> origin/master:
> read iops: 15463 write iops: 5949 rw iops: 574,576
>
> nr_sched_batch = 1:
> read iops: 15082 write iops: 6283 rw iops: 783,786
>
> nr_sched_batch = 1, use deadline_head_request:
> read iops: 15368 write iops: 6575 rw iops: 907,906

The above results are low enough such that these could come from a hard
disk. However, the test results are hard to interpret since the I/O
pattern is neither perfectly sequential nor perfectly random (32
sequential jobs). Please provide separate measurements for sequential
and random I/O.
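
For instance (an illustrative sketch only, reusing the parameters from
the command above; /dev/sdX is a placeholder device), a single-job run
with -rw=read gives one purely sequential stream, and -rw=randread gives
a purely random read workload:

fio -ioengine=psync -direct=1 -time_based=1 -iodepth=1 -thread -bs=512B
-size=110g -numjobs=1 -runtime=300 -name=seqread -filename=/dev/sdX
-ioscheduler=mq-deadline -rw=read

fio -ioengine=psync -direct=1 -time_based=1 -iodepth=1 -thread -bs=512B
-size=110g -numjobs=32 -runtime=300 -group_reporting -name=randread
-filename=/dev/sdX -ioscheduler=mq-deadline -rw=randread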

The above results show that this patch makes reading from a hard disk
slower. Isn't the primary use case of mq-deadline to make reading from
hard disks faster? So why should these two patches be applied if these
slow down reading from a hard disk?

Thanks,

Bart.

2022-07-23 11:05:58

by Wang You

Subject: Re: [PATCH v2 2/2] block/mq-deadline: Prioritize first request

> What is MG04ACA400N?

It is a Toshiba 7200 RPM hard drive.

> The above results are low enough such that these could come from a hard
> disk. However, the test results are hard to interpret since the I/O
> pattern is neither perfectly sequential nor perfectly random (32
> sequential jobs). Please provide separate measurements for sequential
> and random I/O.

> The above results show that this patch makes reading from a hard disk
> slower. Isn't the primary use case of mq-deadline to make reading from
> hard disks faster? So why should these two patches be applied if these
> slow down reading from a hard disk?

The results of the MG04ACA400N behind the RAID controller are clearly
different from those of the single disk, especially the read results. I
did not expect this; the numbers from the RAID controller made me
mistakenly think the same improvement applies to a bare HDD.

I will re-analyze the impact of this patch on HDDs later; please ignore
it for now.

Also, may I ask: when using fio or other tools, how should tests be run
to obtain more accurate and convincing data, such as the perfectly
sequential and perfectly random I/O you mentioned above? (A
multi-threaded fio test produces a pattern that is neither perfectly
sequential nor perfectly random, while a single thread dispatches too
slowly and cannot exercise the merging and sorting abilities of the
elevator.)

Thanks,

Wang.


2022-07-25 02:02:08

by Bart Van Assche

Subject: Re: [PATCH v2 2/2] block/mq-deadline: Prioritize first request

On 7/23/22 03:59, Wang You wrote:
> Also, may I ask: when using fio or other tools, how should tests be run
> to obtain more accurate and convincing data, such as the perfectly
> sequential and perfectly random I/O you mentioned above? (A
> multi-threaded fio test produces a pattern that is neither perfectly
> sequential nor perfectly random, while a single thread dispatches too
> slowly and cannot exercise the merging and sorting abilities of the
> elevator.)

I'm not sure there is agreement about which I/O patterns should be
measured to conclude that a code change improves the performance of an
I/O scheduler.

Thanks,

Bart.