From: Chengming Zhou <[email protected]>
Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
make rq return twice back to the flush state machine for post-flush.
Obviously, non post-flush requests don't need it, they don't need to
end request twice, so they don't need to replace rq->end_io callback.
And the same for requests with the FUA bit on hardware with FUA support.
So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
rq->end_io if it needs post-flush. Otherwise, it can end like normal
request and doesn't need to return back to the flush state machine.
There are also some other good points:
1. all requests on hardware with FUA support won't have post-flush, so
all of them don't need to end twice.
2. non post-flush requests won't have RQF_FLUSH_SEQ rq_flags set, so
they can merge like normal requests.
3. we don't account non post-flush requests in flush_data_in_flight,
since there is no point to defer pending flush for these requests.
Signed-off-by: Chengming Zhou <[email protected]>
---
block/blk-flush.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 094a6adb2718..1b92654e8757 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -93,6 +93,7 @@ enum {
static void blk_kick_flush(struct request_queue *q,
struct blk_flush_queue *fq, blk_opf_t flags);
+static void blk_rq_init_flush(struct request *rq);
static inline struct blk_flush_queue *
blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
@@ -187,7 +188,15 @@ static void blk_flush_complete_seq(struct request *rq,
break;
case REQ_FSEQ_DATA:
- fq->flush_data_in_flight++;
+ /*
+ * Only for requests that need post-flush,
+ * we need to do rq->end_io replacement trick
+ * to return back to the flush state machine.
+ */
+ if (!(rq->flush.seq & REQ_FSEQ_POSTFLUSH)) {
+ blk_rq_init_flush(rq);
+ fq->flush_data_in_flight++;
+ }
spin_lock(&q->requeue_lock);
list_move_tail(&rq->queuelist, &q->flush_list);
spin_unlock(&q->requeue_lock);
@@ -202,7 +211,13 @@ static void blk_flush_complete_seq(struct request *rq,
* normal completion and end it.
*/
list_del_init(&rq->queuelist);
- blk_flush_restore_request(rq);
+ /*
+ * Only for requests that had rq->end_io replaced,
+ * we need to restore rq->end_io and make it a normal
+ * request before the second end.
+ */
+ if (rq->rq_flags & RQF_FLUSH_SEQ)
+ blk_flush_restore_request(rq);
blk_mq_end_request(rq, error);
break;
@@ -389,7 +404,6 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
static void blk_rq_init_flush(struct request *rq)
{
- rq->flush.seq = 0;
rq->rq_flags |= RQF_FLUSH_SEQ;
rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
rq->end_io = mq_flush_data_end_io;
@@ -424,6 +438,7 @@ bool blk_insert_flush(struct request *rq)
* the request accounting.
*/
rq->cmd_flags |= REQ_SYNC;
+ rq->flush.seq = 0;
switch (policy) {
case 0:
@@ -458,7 +473,6 @@ bool blk_insert_flush(struct request *rq)
* Mark the request as part of a flush sequence and submit it
* for further processing to the flush state machine.
*/
- blk_rq_init_flush(rq);
spin_lock_irq(&fq->mq_flush_lock);
blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
spin_unlock_irq(&fq->mq_flush_lock);
--
2.41.0
On Mon, Jul 10, 2023 at 02:47:05PM +0800, [email protected] wrote:
> From: Chengming Zhou <[email protected]>
>
> Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
> make rq return twice back to the flush state machine for post-flush.
>
> Obviously, non post-flush requests don't need it, they don't need to
> end request twice, so they don't need to replace rq->end_io callback.
> And the same for requests with the FUA bit on hardware with FUA support.
>
> So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
> rq->end_io if it needs post-flush. Otherwise, it can end like normal
> request and doesn't need to return back to the flush state machine.
I really like the idea behind this optimization, but I kinda hate
adding more magic to the already way too magic flush sequence.
I wonder if a better idea would be to kill the flush sequence entirely,
and just split the flush_queue into a preflush and a postflush queue.
This would remove a field from struct request and lead to more readable
code as well.
On 2023/7/10 21:33, Christoph Hellwig wrote:
> On Mon, Jul 10, 2023 at 02:47:05PM +0800, [email protected] wrote:
>> From: Chengming Zhou <[email protected]>
>>
>> Now we unconditionally blk_rq_init_flush() to replace rq->end_io to
>> make rq return twice back to the flush state machine for post-flush.
>>
>> Obviously, non post-flush requests don't need it, they don't need to
>> end request twice, so they don't need to replace rq->end_io callback.
>> And the same for requests with the FUA bit on hardware with FUA support.
>>
>> So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace
>> rq->end_io if it needs post-flush. Otherwise, it can end like normal
>> request and doesn't need to return back to the flush state machine.
>
> I really like the idea behind this optimization, but I kinda hate
> adding more magic to the already way too magic flush sequence.
Yes, agree.
>
> I wonder if a better idea would be to kill the flush sequence entirely,
> and just split the flush_queue into a preflush and a postflush queue.
> This would remove a field from struct request and lead to more readable
> code as well.
I have thought about this for some time, it seems feasible. So I implement it
today and test it using blktests, it works.
I will send the patchset soon.
Thanks.