2023-07-25 13:45:53

by Chengming Zhou

Subject: [PATCH v2 0/4] blk-flush: optimize non-postflush requests

From: Chengming Zhou <[email protected]>

Hello,

This series optimizes flush handling for non-postflush requests. Currently
we unconditionally replace rq->end_io so that rq returns to the flush
state machine a second time for the post-flush.

Obviously, non-postflush requests don't need this: they don't need to be
ended twice, so they don't need the rq->end_io callback replaced. The
same holds for requests with the FUA bit on hardware with FUA support.

The previous approach [1] was to move blk_rq_init_flush() to the
REQ_FSEQ_DATA stage and only replace rq->end_io if post-flush is needed.

But that adds more magic to the already way-too-magic flush sequence.
Christoph suggested that we kill the flush sequence entirely and just
split the flush_queue into a preflush and a postflush queue.

So this series implements the suggested approach using two queues:
preflush and postflush requests get separate pending and running
lists, so we know what to do for each request in flush_end_io() and
don't need the flush sequence at all.
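To illustrate the idea, a minimal sketch of the split queues (an
illustration only, not the exact struct; see patch 2 for the real
change to struct blk_flush_queue):

	#include <linux/list.h>

	struct blk_flush_queue_sketch {
		/* double-buffered lists, split by what each request
		 * still needs from the flush machinery */
		struct list_head preflush_queue[2];	/* need a preflush */
		struct list_head postflush_queue[2];	/* need a postflush */
		unsigned int flush_pending_idx:1;	/* collects new requests */
		unsigned int flush_running_idx:1;	/* served by in-flight flush */
	};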

Thanks for comments!

[1] https://lore.kernel.org/lkml/[email protected]/

Chengming Zhou (4):
blk-flush: flush_rq should inherit first_rq's cmd_flags
blk-flush: split queues for preflush and postflush requests
blk-flush: kill the flush state machine
blk-flush: don't need to end rq twice for non postflush

block/blk-flush.c | 181 +++++++++++++++++++++--------------------
block/blk.h | 3 +-
include/linux/blk-mq.h | 1 -
3 files changed, 96 insertions(+), 89 deletions(-)

--
2.41.0



2023-07-25 13:50:49

by Chengming Zhou

Subject: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

From: Chengming Zhou <[email protected]>

The cmd_flags in blk_kick_flush() should inherit the original request's
cmd_flags, but the current code looks buggy to me:

flush_end_io()
  blk_flush_complete_seq()	// requests on flush running list
    blk_kick_flush()

So the request passed to blk_flush_complete_seq() may already have been
ended by the time blk_kick_flush() runs.
On the other hand, flush_rq already inherits first_rq's tag, so it
should use first_rq's cmd_flags too.

This patch is just preparation for the following patches, no bugfix
intended.
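For illustration, a commented sketch of the problematic ordering, using
the current function names (blk_kick_flush() is invoked at the end of
blk_flush_complete_seq(), as the diff below shows):

	/*
	 * flush_end_io(flush_rq)
	 *   blk_flush_complete_seq(rq, fq, seq, error)
	 *     cmd_flags = rq->cmd_flags;        // sampled up front
	 *     case REQ_FSEQ_DONE:
	 *       blk_mq_end_request(rq, error);  // rq may be ended here
	 *     blk_kick_flush(q, fq, cmd_flags); // flags of a possibly-ended rq
	 */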

Signed-off-by: Chengming Zhou <[email protected]>
---
block/blk-flush.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index e73dc22d05c1..fc25228f7bb1 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -92,7 +92,7 @@ enum {
};

static void blk_kick_flush(struct request_queue *q,
- struct blk_flush_queue *fq, blk_opf_t flags);
+ struct blk_flush_queue *fq);

static inline struct blk_flush_queue *
blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx)
@@ -166,11 +166,9 @@ static void blk_flush_complete_seq(struct request *rq,
{
struct request_queue *q = rq->q;
struct list_head *pending = &fq->flush_queue[fq->flush_pending_idx];
- blk_opf_t cmd_flags;

BUG_ON(rq->flush.seq & seq);
rq->flush.seq |= seq;
- cmd_flags = rq->cmd_flags;

if (likely(!error))
seq = blk_flush_cur_seq(rq);
@@ -210,7 +208,7 @@ static void blk_flush_complete_seq(struct request *rq,
BUG();
}

- blk_kick_flush(q, fq, cmd_flags);
+ blk_kick_flush(q, fq);
}

static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
@@ -277,7 +275,6 @@ bool is_flush_rq(struct request *rq)
* blk_kick_flush - consider issuing flush request
* @q: request_queue being kicked
* @fq: flush queue
- * @flags: cmd_flags of the original request
*
* Flush related states of @q have changed, consider issuing flush request.
* Please read the comment at the top of this file for more info.
@@ -286,8 +283,7 @@ bool is_flush_rq(struct request *rq)
* spin_lock_irq(fq->mq_flush_lock)
*
*/
-static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
- blk_opf_t flags)
+static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
{
struct list_head *pending = &fq->flush_queue[fq->flush_pending_idx];
struct request *first_rq =
@@ -336,7 +332,8 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq,
flush_rq->internal_tag = first_rq->internal_tag;

flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
- flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
+ flush_rq->cmd_flags |= (first_rq->cmd_flags & REQ_DRV) |
+ (first_rq->cmd_flags & REQ_FAILFAST_MASK);
flush_rq->rq_flags |= RQF_FLUSH_SEQ;
flush_rq->end_io = flush_end_io;
/*
--
2.41.0


2023-07-25 14:27:27

by Chengming Zhou

Subject: [PATCH v2 2/4] blk-flush: split queues for preflush and postflush requests

From: Chengming Zhou <[email protected]>

We don't need to replace rq->end_io to make a request return to the flush
state machine if it doesn't need a post-flush.

The previous approach [1] was to move blk_rq_init_flush() to the
REQ_FSEQ_DATA stage and only replace rq->end_io if post-flush is needed.
Otherwise, the request can end like a normal request and doesn't need to
return to the flush state machine.

But that adds more magic to the already way-too-magic flush sequence.
Christoph suggested that we kill the flush sequence entirely and just
split the flush_queue into a preflush and a postflush queue.

The reason we need separate queues for preflush and postflush requests
is that flush_end_io() must handle them differently: postflush requests
are ended, while preflush requests are requeued for dispatch.
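For illustration, a sketch of what flush_end_io() can then do with the
split running lists (this mirrors where patch 3 of this series takes it,
including the blk_end_flush() helper introduced there; it is not the
hunk below):

	list_for_each_entry_safe(rq, n, postflush_running, queuelist)
		blk_end_flush(rq, fq, error);	/* postflush done: end it */

	list_for_each_entry_safe(rq, n, preflush_running, queuelist)
		nr_requeue++;			/* preflush done: data is next */

	/* requeue the preflush requests so their data gets dispatched */
	spin_lock(&q->requeue_lock);
	list_splice_init(preflush_running, &q->requeue_list);
	spin_unlock(&q->requeue_lock);
	blk_mq_kick_requeue_list(q);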

This patch is just in preparation for the following patches, no
functional changes intended.

[1] https://lore.kernel.org/lkml/[email protected]/

Signed-off-by: Chengming Zhou <[email protected]>
---
block/blk-flush.c | 50 +++++++++++++++++++++++++++++++++++------------
block/blk.h | 3 ++-
2 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index fc25228f7bb1..4993c3c3b502 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -165,7 +165,7 @@ static void blk_flush_complete_seq(struct request *rq,
unsigned int seq, blk_status_t error)
{
struct request_queue *q = rq->q;
- struct list_head *pending = &fq->flush_queue[fq->flush_pending_idx];
+ struct list_head *pending;

BUG_ON(rq->flush.seq & seq);
rq->flush.seq |= seq;
@@ -177,9 +177,9 @@ static void blk_flush_complete_seq(struct request *rq,

switch (seq) {
case REQ_FSEQ_PREFLUSH:
- case REQ_FSEQ_POSTFLUSH:
+ pending = &fq->preflush_queue[fq->flush_pending_idx];
/* queue for flush */
- if (list_empty(pending))
+ if (!fq->flush_pending_since)
fq->flush_pending_since = jiffies;
list_move_tail(&rq->queuelist, pending);
break;
@@ -192,6 +192,14 @@ static void blk_flush_complete_seq(struct request *rq,
blk_mq_kick_requeue_list(q);
break;

+ case REQ_FSEQ_POSTFLUSH:
+ pending = &fq->postflush_queue[fq->flush_pending_idx];
+ /* queue for flush */
+ if (!fq->flush_pending_since)
+ fq->flush_pending_since = jiffies;
+ list_move_tail(&rq->queuelist, pending);
+ break;
+
case REQ_FSEQ_DONE:
/*
* @rq was previously adjusted by blk_insert_flush() for
@@ -215,7 +223,7 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
blk_status_t error)
{
struct request_queue *q = flush_rq->q;
- struct list_head *running;
+ struct list_head *preflush_running, *postflush_running;
struct request *rq, *n;
unsigned long flags = 0;
struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);
@@ -248,14 +256,22 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
flush_rq->internal_tag = BLK_MQ_NO_TAG;
}

- running = &fq->flush_queue[fq->flush_running_idx];
+ preflush_running = &fq->preflush_queue[fq->flush_running_idx];
+ postflush_running = &fq->postflush_queue[fq->flush_running_idx];
BUG_ON(fq->flush_pending_idx == fq->flush_running_idx);

/* account completion of the flush request */
fq->flush_running_idx ^= 1;

/* and push the waiting requests to the next stage */
- list_for_each_entry_safe(rq, n, running, queuelist) {
+ list_for_each_entry_safe(rq, n, preflush_running, queuelist) {
+ unsigned int seq = blk_flush_cur_seq(rq);
+
+ BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
+ blk_flush_complete_seq(rq, fq, seq, error);
+ }
+
+ list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
unsigned int seq = blk_flush_cur_seq(rq);

BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
@@ -285,13 +301,20 @@ bool is_flush_rq(struct request *rq)
*/
static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
{
- struct list_head *pending = &fq->flush_queue[fq->flush_pending_idx];
- struct request *first_rq =
- list_first_entry(pending, struct request, queuelist);
+ struct list_head *preflush_pending = &fq->preflush_queue[fq->flush_pending_idx];
+ struct list_head *postflush_pending = &fq->postflush_queue[fq->flush_pending_idx];
+ struct request *first_rq = NULL;
struct request *flush_rq = fq->flush_rq;

/* C1 described at the top of this file */
- if (fq->flush_pending_idx != fq->flush_running_idx || list_empty(pending))
+ if (fq->flush_pending_idx != fq->flush_running_idx)
+ return;
+
+ if (!list_empty(preflush_pending))
+ first_rq = list_first_entry(preflush_pending, struct request, queuelist);
+ else if (!list_empty(postflush_pending))
+ first_rq = list_first_entry(postflush_pending, struct request, queuelist);
+ else
return;

/* C2 and C3 */
@@ -305,6 +328,7 @@ static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
* different from running_idx, which means flush is in flight.
*/
fq->flush_pending_idx ^= 1;
+ fq->flush_pending_since = 0;

blk_rq_init(q, flush_rq);

@@ -496,8 +520,10 @@ struct blk_flush_queue *blk_alloc_flush_queue(int node, int cmd_size,
if (!fq->flush_rq)
goto fail_rq;

- INIT_LIST_HEAD(&fq->flush_queue[0]);
- INIT_LIST_HEAD(&fq->flush_queue[1]);
+ INIT_LIST_HEAD(&fq->preflush_queue[0]);
+ INIT_LIST_HEAD(&fq->preflush_queue[1]);
+ INIT_LIST_HEAD(&fq->postflush_queue[0]);
+ INIT_LIST_HEAD(&fq->postflush_queue[1]);

return fq;

diff --git a/block/blk.h b/block/blk.h
index 686712e13835..1a11675152ac 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -20,7 +20,8 @@ struct blk_flush_queue {
unsigned int flush_running_idx:1;
blk_status_t rq_status;
unsigned long flush_pending_since;
- struct list_head flush_queue[2];
+ struct list_head preflush_queue[2];
+ struct list_head postflush_queue[2];
unsigned long flush_data_in_flight;
struct request *flush_rq;
};
--
2.41.0


2023-07-25 14:28:57

by Chengming Zhou

Subject: [PATCH v2 4/4] blk-flush: don't need to end rq twice for non postflush

From: Chengming Zhou <[email protected]>

Now we unconditionally call blk_rq_init_flush() to replace rq->end_io so
that rq returns to the flush state machine a second time for the
post-flush.

Obviously, non-post-flush requests don't need this: they don't need to be
ended twice, so they don't need the rq->end_io callback replaced. The
same holds for requests with the FUA bit on hardware with FUA support.

There are also some other good points (a sketch of the resulting logic
follows this list):
1. Requests on hardware with FUA support never have a post-flush, so
none of them needs to be ended twice.

2. Non-post-flush requests won't have the RQF_FLUSH_SEQ flag set, so
they can merge like normal requests.

3. Non-post-flush requests are not accounted in flush_data_in_flight,
since there is no point in deferring the pending flush for them.
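A minimal sketch of the resulting insert-side logic (an illustration of
the blk_insert_flush() hunk below):

	/* Only requests that still need a post-flush must take the
	 * end_io detour back into the flush machinery. */
	if (policy & REQ_FSEQ_POSTFLUSH)
		blk_rq_init_flush(rq);	/* saves and replaces rq->end_io */
	spin_lock_irq(&fq->mq_flush_lock);
	blk_enqueue_preflush(rq, fq);
	spin_unlock_irq(&fq->mq_flush_lock);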

Signed-off-by: Chengming Zhou <[email protected]>
---
block/blk-flush.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index ed195c760617..a299dae65350 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -178,7 +178,8 @@ static void blk_end_flush(struct request *rq, struct blk_flush_queue *fq,
* normal completion and end it.
*/
list_del_init(&rq->queuelist);
- blk_flush_restore_request(rq);
+ if (rq->rq_flags & RQF_FLUSH_SEQ)
+ blk_flush_restore_request(rq);
blk_mq_end_request(rq, error);

blk_kick_flush(q, fq);
@@ -461,7 +462,8 @@ bool blk_insert_flush(struct request *rq)
* Mark the request as part of a flush sequence and submit it
* for further processing to the flush state machine.
*/
- blk_rq_init_flush(rq);
+ if (policy & REQ_FSEQ_POSTFLUSH)
+ blk_rq_init_flush(rq);
spin_lock_irq(&fq->mq_flush_lock);
blk_enqueue_preflush(rq, fq);
spin_unlock_irq(&fq->mq_flush_lock);
--
2.41.0


2023-07-25 15:01:35

by Chengming Zhou

Subject: [PATCH v2 3/4] blk-flush: kill the flush state machine

From: Chengming Zhou <[email protected]>

Since preflush and postflush requests are now put in separate queues,
we no longer need the flush sequence to record where each request is.

REQ_FSEQ_PREFLUSH: blk_enqueue_preflush()
REQ_FSEQ_POSTFLUSH: blk_enqueue_postflush()
REQ_FSEQ_DONE: blk_end_flush()

In blk_flush_complete(), we have two lists to handle: preflush_running
and postflush_running. Postflush requests are ended directly via
blk_end_flush(), while preflush requests need to be moved to the
requeue_list for dispatch.

This patch just kills the flush state machine and calls these functions
directly, in preparation for the next patch.

Signed-off-by: Chengming Zhou <[email protected]>
---
block/blk-flush.c | 158 ++++++++++++++++++-----------------------
include/linux/blk-mq.h | 1 -
2 files changed, 70 insertions(+), 89 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 4993c3c3b502..ed195c760617 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -117,11 +117,6 @@ static unsigned int blk_flush_policy(unsigned long fflags, struct request *rq)
return policy;
}

-static unsigned int blk_flush_cur_seq(struct request *rq)
-{
- return 1 << ffz(rq->flush.seq);
-}
-
static void blk_flush_restore_request(struct request *rq)
{
/*
@@ -147,75 +142,81 @@ static void blk_account_io_flush(struct request *rq)
part_stat_unlock();
}

-/**
- * blk_flush_complete_seq - complete flush sequence
- * @rq: PREFLUSH/FUA request being sequenced
- * @fq: flush queue
- * @seq: sequences to complete (mask of %REQ_FSEQ_*, can be zero)
- * @error: whether an error occurred
- *
- * @rq just completed @seq part of its flush sequence, record the
- * completion and trigger the next step.
- *
- * CONTEXT:
- * spin_lock_irq(fq->mq_flush_lock)
- */
-static void blk_flush_complete_seq(struct request *rq,
- struct blk_flush_queue *fq,
- unsigned int seq, blk_status_t error)
+static void blk_enqueue_preflush(struct request *rq, struct blk_flush_queue *fq)
{
struct request_queue *q = rq->q;
- struct list_head *pending;
+ struct list_head *pending = &fq->preflush_queue[fq->flush_pending_idx];

- BUG_ON(rq->flush.seq & seq);
- rq->flush.seq |= seq;
+ if (!fq->flush_pending_since)
+ fq->flush_pending_since = jiffies;
+ list_move_tail(&rq->queuelist, pending);

- if (likely(!error))
- seq = blk_flush_cur_seq(rq);
- else
- seq = REQ_FSEQ_DONE;
+ blk_kick_flush(q, fq);
+}

- switch (seq) {
- case REQ_FSEQ_PREFLUSH:
- pending = &fq->preflush_queue[fq->flush_pending_idx];
- /* queue for flush */
- if (!fq->flush_pending_since)
- fq->flush_pending_since = jiffies;
- list_move_tail(&rq->queuelist, pending);
- break;
+static void blk_enqueue_postflush(struct request *rq, struct blk_flush_queue *fq)
+{
+ struct request_queue *q = rq->q;
+ struct list_head *pending = &fq->postflush_queue[fq->flush_pending_idx];

- case REQ_FSEQ_DATA:
- fq->flush_data_in_flight++;
- spin_lock(&q->requeue_lock);
- list_move(&rq->queuelist, &q->requeue_list);
- spin_unlock(&q->requeue_lock);
- blk_mq_kick_requeue_list(q);
- break;
+ if (!fq->flush_pending_since)
+ fq->flush_pending_since = jiffies;
+ list_move_tail(&rq->queuelist, pending);

- case REQ_FSEQ_POSTFLUSH:
- pending = &fq->postflush_queue[fq->flush_pending_idx];
- /* queue for flush */
- if (!fq->flush_pending_since)
- fq->flush_pending_since = jiffies;
- list_move_tail(&rq->queuelist, pending);
- break;
+ blk_kick_flush(q, fq);
+}

- case REQ_FSEQ_DONE:
- /*
- * @rq was previously adjusted by blk_insert_flush() for
- * flush sequencing and may already have gone through the
- * flush data request completion path. Restore @rq for
- * normal completion and end it.
- */
- list_del_init(&rq->queuelist);
- blk_flush_restore_request(rq);
- blk_mq_end_request(rq, error);
- break;
+static void blk_end_flush(struct request *rq, struct blk_flush_queue *fq,
+ blk_status_t error)
+{
+ struct request_queue *q = rq->q;

- default:
- BUG();
+ /*
+ * @rq was previously adjusted by blk_insert_flush() for
+ * flush sequencing and may already have gone through the
+ * flush data request completion path. Restore @rq for
+ * normal completion and end it.
+ */
+ list_del_init(&rq->queuelist);
+ blk_flush_restore_request(rq);
+ blk_mq_end_request(rq, error);
+
+ blk_kick_flush(q, fq);
+}
+
+static void blk_flush_complete(struct request_queue *q,
+ struct blk_flush_queue *fq,
+ blk_status_t error)
+{
+ unsigned int nr_requeue = 0;
+ struct list_head *preflush_running;
+ struct list_head *postflush_running;
+ struct request *rq, *n;
+
+ preflush_running = &fq->preflush_queue[fq->flush_running_idx];
+ postflush_running = &fq->postflush_queue[fq->flush_running_idx];
+
+ list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
+ blk_end_flush(rq, fq, error);
}

+ list_for_each_entry_safe(rq, n, preflush_running, queuelist) {
+ if (unlikely(error || !blk_rq_sectors(rq)))
+ blk_end_flush(rq, fq, error);
+ else
+ nr_requeue++;
+ }
+
+ if (nr_requeue) {
+ fq->flush_data_in_flight += nr_requeue;
+ spin_lock(&q->requeue_lock);
+ list_splice_init(preflush_running, &q->requeue_list);
+ spin_unlock(&q->requeue_lock);
+ blk_mq_kick_requeue_list(q);
+ }
+
+ /* account completion of the flush request */
+ fq->flush_running_idx ^= 1;
blk_kick_flush(q, fq);
}

@@ -223,8 +224,6 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
blk_status_t error)
{
struct request_queue *q = flush_rq->q;
- struct list_head *preflush_running, *postflush_running;
- struct request *rq, *n;
unsigned long flags = 0;
struct blk_flush_queue *fq = blk_get_flush_queue(q, flush_rq->mq_ctx);

@@ -256,27 +255,9 @@ static enum rq_end_io_ret flush_end_io(struct request *flush_rq,
flush_rq->internal_tag = BLK_MQ_NO_TAG;
}

- preflush_running = &fq->preflush_queue[fq->flush_running_idx];
- postflush_running = &fq->postflush_queue[fq->flush_running_idx];
BUG_ON(fq->flush_pending_idx == fq->flush_running_idx);

- /* account completion of the flush request */
- fq->flush_running_idx ^= 1;
-
- /* and push the waiting requests to the next stage */
- list_for_each_entry_safe(rq, n, preflush_running, queuelist) {
- unsigned int seq = blk_flush_cur_seq(rq);
-
- BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
- blk_flush_complete_seq(rq, fq, seq, error);
- }
-
- list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
- unsigned int seq = blk_flush_cur_seq(rq);
-
- BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
- blk_flush_complete_seq(rq, fq, seq, error);
- }
+ blk_flush_complete(q, fq, error);

spin_unlock_irqrestore(&fq->mq_flush_lock, flags);
return RQ_END_IO_NONE;
@@ -401,7 +382,10 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,
* re-initialize rq->queuelist before reusing it here.
*/
INIT_LIST_HEAD(&rq->queuelist);
- blk_flush_complete_seq(rq, fq, REQ_FSEQ_DATA, error);
+ if (likely(!error))
+ blk_enqueue_postflush(rq, fq);
+ else
+ blk_end_flush(rq, fq, error);
spin_unlock_irqrestore(&fq->mq_flush_lock, flags);

blk_mq_sched_restart(hctx);
@@ -410,7 +394,6 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq,

static void blk_rq_init_flush(struct request *rq)
{
- rq->flush.seq = 0;
rq->rq_flags |= RQF_FLUSH_SEQ;
rq->flush.saved_end_io = rq->end_io; /* Usually NULL */
rq->end_io = mq_flush_data_end_io;
@@ -469,7 +452,6 @@ bool blk_insert_flush(struct request *rq)
* the post flush, and then just pass the command on.
*/
blk_rq_init_flush(rq);
- rq->flush.seq |= REQ_FSEQ_PREFLUSH;
spin_lock_irq(&fq->mq_flush_lock);
fq->flush_data_in_flight++;
spin_unlock_irq(&fq->mq_flush_lock);
@@ -481,7 +463,7 @@ bool blk_insert_flush(struct request *rq)
*/
blk_rq_init_flush(rq);
spin_lock_irq(&fq->mq_flush_lock);
- blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0);
+ blk_enqueue_preflush(rq, fq);
spin_unlock_irq(&fq->mq_flush_lock);
return true;
}
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 01e8c31db665..d46fefdacea8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -177,7 +177,6 @@ struct request {
} elv;

struct {
- unsigned int seq;
rq_end_io_fn *saved_end_io;
} flush;

--
2.41.0


2023-07-31 06:31:43

by Christoph Hellwig

Subject: Re: [PATCH v2 3/4] blk-flush: kill the flush state machine

On Tue, Jul 25, 2023 at 09:01:01PM +0800, [email protected] wrote:
> From: Chengming Zhou <[email protected]>
>
> Since preflush and postflush requests are now put in separate queues,
> we no longer need the flush sequence to record where each request is.
>
> REQ_FSEQ_PREFLUSH: blk_enqueue_preflush()
> REQ_FSEQ_POSTFLUSH: blk_enqueue_postflush()
> REQ_FSEQ_DONE: blk_end_flush()
>
> In blk_flush_complete(), we have two lists to handle: preflush_running
> and postflush_running. Postflush requests are ended directly via
> blk_end_flush(), while preflush requests need to be moved to the
> requeue_list for dispatch.
>
> This patch just kills the flush state machine and calls these functions
> directly, in preparation for the next patch.

> +static void blk_enqueue_postflush(struct request *rq, struct blk_flush_queue *fq)

Please avoid the overly long line here. Maybe just rename enqueue to
queue here and for the preflush version, as we don't really use enqueue
in the flush code anyway.

> +{
> + unsigned int nr_requeue = 0;
> + struct list_head *preflush_running;
> + struct list_head *postflush_running;
> + struct request *rq, *n;
> +
> + preflush_running = &fq->preflush_queue[fq->flush_running_idx];
> + postflush_running = &fq->postflush_queue[fq->flush_running_idx];

I'd initialize these at declaration time:

	struct list_head *preflush_running =
		&fq->preflush_queue[fq->flush_running_idx];
	struct list_head *postflush_running =
		&fq->postflush_queue[fq->flush_running_idx];
	unsigned int nr_requeue = 0;
	struct request *rq, *n;

> +
> + list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
> + blk_end_flush(rq, fq, error);
> }

No need for the braces.


2023-07-31 06:43:38

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On Tue, Jul 25, 2023 at 09:00:59PM +0800, [email protected] wrote:
> From: Chengming Zhou <[email protected]>
>
> The cmd_flags in blk_kick_flush() should inherit the original request's
> cmd_flags, but the current code looks buggy to me:

Should it? I know the code is kinda trying to do it, but does it really
make sense? Adding Hannes who originally added this inheritance and
discussing the details below:

> flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
> - flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
> + flush_rq->cmd_flags |= (first_rq->cmd_flags & REQ_DRV) |
> + (first_rq->cmd_flags & REQ_FAILFAST_MASK);

Two cases here:

1) REQ_FAILFAST_MASK: I don't think this is actually set on flush requests
currently, and even if it was, applying it to a flush that serves more
than a single originating command seems wrong to me.
2) REQ_DRV is only set by drivers that have seen a bio. For dm this
is used as REQ_DM_POLL_LIST, which should never be set for a flush/fua
request. For nvme-mpath it is REQ_NVME_MPATH, which is set in the
bio-based driver and used for decision making in the I/O completion
handler. So I guess this one actually does need to get passed
through.


2023-07-31 06:49:25

by Christoph Hellwig

Subject: Re: [PATCH v2 2/4] blk-flush: split queues for preflush and postflush requests

> - list_for_each_entry_safe(rq, n, running, queuelist) {
> + list_for_each_entry_safe(rq, n, preflush_running, queuelist) {
> + unsigned int seq = blk_flush_cur_seq(rq);
> +
> + BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
> + blk_flush_complete_seq(rq, fq, seq, error);
> + }
> +
> + list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
> unsigned int seq = blk_flush_cur_seq(rq);
>
> BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);

Shouldn't the BUG_ON be split into one that only checks for PREFLUSH and
one only for POSTFLUSH?

> + if (fq->flush_pending_idx != fq->flush_running_idx)
> + return;
> +
> + if (!list_empty(preflush_pending))
> + first_rq = list_first_entry(preflush_pending, struct request, queuelist);
> + else if (!list_empty(postflush_pending))
> + first_rq = list_first_entry(postflush_pending, struct request, queuelist);
> + else
> return;

Hmm, I don't think both lists can be empty here?

I'd simplify this and avoid the overly long lines as:

	first_rq = list_first_entry_or_null(preflush_pending, struct request,
					    queuelist);
	if (!first_rq)
		first_rq = list_first_entry_or_null(postflush_pending,
						    struct request, queuelist);


2023-07-31 15:35:37

by Chengming Zhou

Subject: Re: [PATCH v2 2/4] blk-flush: split queues for preflush and postflush requests

On 2023/7/31 14:15, Christoph Hellwig wrote:
>> - list_for_each_entry_safe(rq, n, running, queuelist) {
>> + list_for_each_entry_safe(rq, n, preflush_running, queuelist) {
>> + unsigned int seq = blk_flush_cur_seq(rq);
>> +
>> + BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
>> + blk_flush_complete_seq(rq, fq, seq, error);
>> + }
>> +
>> + list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
>> unsigned int seq = blk_flush_cur_seq(rq);
>>
>> BUG_ON(seq != REQ_FSEQ_PREFLUSH && seq != REQ_FSEQ_POSTFLUSH);
>
> Shouldn't the BUG_ON be split into one that only checks for PREFLUSH and
> one only for POSTFLUSH?

Ah yes, will fix it.

>
>> + if (fq->flush_pending_idx != fq->flush_running_idx)
>> + return;
>> +
>> + if (!list_empty(preflush_pending))
>> + first_rq = list_first_entry(preflush_pending, struct request, queuelist);
>> + else if (!list_empty(postflush_pending))
>> + first_rq = list_first_entry(postflush_pending, struct request, queuelist);
>> + else
>> return;
>
> Hmm, I don't think both lists can be empty here?

Yes, if we check fq->flush_pending_since != 0 beforehand.

>
> I'd simplify this and avoid the overly long lines as:
>
> 	first_rq = list_first_entry_or_null(preflush_pending, struct request,
> 					    queuelist);
> 	if (!first_rq)
> 		first_rq = list_first_entry_or_null(postflush_pending,
> 						    struct request, queuelist);
>

This is better, will change it.

Thanks.


2023-07-31 15:38:53

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On Mon, Jul 31, 2023 at 10:02:39PM +0800, Chengming Zhou wrote:
> The commit 84fca1b0c461 ("block: pass failfast and driver-specific flags to
> flush requests") says:
> If flush requests are being sent to the device we need to inherit the
> failfast and driver-specific flags, too, otherwise I/O will fail.
>
> 1) REQ_FAILFAST_MASK: agree, it shouldn't be set on the flush_rq I think?
> 2) REQ_DRV: I don't get why leaving this flag unset would cause I/O to fail?

I don't think it would fail I/O on its own, but it will cause the
nvme driver to not do the correct handling of an error when a flush
is sent to multipath setups.


2023-07-31 15:39:36

by Chengming Zhou

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On 2023/7/31 14:09, Christoph Hellwig wrote:
> On Tue, Jul 25, 2023 at 09:00:59PM +0800, [email protected] wrote:
>> From: Chengming Zhou <[email protected]>
>>
>> The cmd_flags in blk_kick_flush() should inherit the original request's
>> cmd_flags, but the current code looks buggy to me:
>
> Should it? I know the code is kinda trying to do it, but does it really
> make sense? Adding Hannes who originally added this inheritance and
> discussing the details below:

I'm not sure; actually I don't get what the current code is doing...
Hope Hannes can provide some details.

blk_flush_complete_seq(rq) -> blk_kick_flush(rq->cmd_flags)

flush_rq will use the cmd_flags of the request which just completed a
sequence; there are three cases:

1. blk_insert_flush(rq): rq is pending, waiting for flush
2. flush_end_io(flush_rq): rq's flush seq is done
3. mq_flush_data_end_io(rq): rq's data seq is done

Only in the 1st case is the rq a pending request that waits for flush_rq.
In the 2nd and 3rd cases, the rq has nothing to do with the next flush_rq?

So it's more reasonable for flush_rq to use its pending first_rq's cmd_flags?

>
>> flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
>> - flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
>> + flush_rq->cmd_flags |= (first_rq->cmd_flags & REQ_DRV) |
>> + (first_rq->cmd_flags & REQ_FAILFAST_MASK);
>
> Two cases here:
>
> 1) REQ_FAILFAST_MASK: I don't think this is actually set on flush requests
> currently, and even if it was, applying it to a flush that serves more
> than a single originating command seems wrong to me.
> 2) REQ_DRV is only set by drivers that have seen a bio. For dm this
> is used as REQ_DM_POLL_LIST, which should never be set for a flush/fua
> request. For nvme-mpath it is REQ_NVME_MPATH, which is set in the
> bio-based driver and used for decision making in the I/O completion
> handler. So I guess this one actually does need to get passed
> through.
>

The commit 84fca1b0c461 ("block: pass failfast and driver-specific flags to
flush requests") says:
If flush requests are being sent to the device we need to inherit the
failfast and driver-specific flags, too, otherwise I/O will fail.

1) REQ_FAILFAST_MASK: agree, it shouldn't be set on the flush_rq I think?
2) REQ_DRV: I don't get why leaving this flag unset would cause I/O to fail?

Thanks!

2023-07-31 17:06:56

by Chengming Zhou

Subject: Re: [PATCH v2 3/4] blk-flush: kill the flush state machine

On 2023/7/31 14:19, Christoph Hellwig wrote:
> On Tue, Jul 25, 2023 at 09:01:01PM +0800, [email protected] wrote:
>> From: Chengming Zhou <[email protected]>
>>
>> Since preflush and postflush requests are now put in separate queues,
>> we no longer need the flush sequence to record where each request is.
>>
>> REQ_FSEQ_PREFLUSH: blk_enqueue_preflush()
>> REQ_FSEQ_POSTFLUSH: blk_enqueue_postflush()
>> REQ_FSEQ_DONE: blk_end_flush()
>>
>> In blk_flush_complete(), we have two lists to handle: preflush_running
>> and postflush_running. Postflush requests are ended directly via
>> blk_end_flush(), while preflush requests need to be moved to the
>> requeue_list for dispatch.
>>
>> This patch just kills the flush state machine and calls these functions
>> directly, in preparation for the next patch.
>
>> +static void blk_enqueue_postflush(struct request *rq, struct blk_flush_queue *fq)
>
> Please avoid the overly long here. Maybe just rename enqueue to queue
> here and for the preflush version as we don't really use enqueue in
> the flush code anyway.

Ok, will rename to queue.

>
>> +{
>> + unsigned int nr_requeue = 0;
>> + struct list_head *preflush_running;
>> + struct list_head *postflush_running;
>> + struct request *rq, *n;
>> +
>> + preflush_running = &fq->preflush_queue[fq->flush_running_idx];
>> + postflush_running = &fq->postflush_queue[fq->flush_running_idx];
>
> I'd initialize these ad declaration time:
>
> struct list_head *preflush_running =
> &fq->preflush_queue[fq->flush_running_idx];
> struct list_head *postflush_running =
> &fq->postflush_queue[fq->flush_running_idx];
> unsigned int nr_requeue = 0;
> struct request *rq, *n;
>

LGTM, will change these.

Thanks for your review!

>> +
>> + list_for_each_entry_safe(rq, n, postflush_running, queuelist) {
>> + blk_end_flush(rq, fq, error);
>> }
>
> No need for the braces.
>

2023-07-31 17:11:05

by Hannes Reinecke

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On 7/31/23 08:09, Christoph Hellwig wrote:
> On Tue, Jul 25, 2023 at 09:00:59PM +0800, [email protected] wrote:
>> From: Chengming Zhou <[email protected]>
>>
>> The cmd_flags in blk_kick_flush() should inherit the original request's
>> cmd_flags, but the current code looks buggy to me:
>
> Should it? I know the code is kinda trying to do it, but does it really
> make sense? Adding Hannes who originally added this inheritance and
> discussing the details below:
>
Yeah, it does.
The flush machinery sends flushes before and/or after the original
request (preflush/postflush). For blocked transports (i.e. during FC RSCN
handling) the transport will error out commands depending on the
FAILFAST setting. If FAILFAST is set the SCSI layer gets an
STS_TRANSPORT error (causing the I/O to be retried), but STS_ERROR if
not set (causing the I/O to fail).

So if the FAILFAST setting is _not_ aligned between the flush_rq and the
original, we'll get an error on the flush rq and a retry on the original
rq, causing the entire command to fail.

I guess we need to align them.

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
[email protected] +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Ivo Totev, Andrew
Myers, Andrew McDonald, Martje Boudien Moerman


2023-08-01 12:00:15

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On Tue, Aug 01, 2023 at 01:04:32PM +0200, Christoph Hellwig wrote:
> On Mon, Jul 31, 2023 at 06:28:01PM +0200, Hannes Reinecke wrote:
> > The flush machinery sends flushes before and/or after the original
> > request (preflush/postflush). For blocked transports (i.e. during FC RSCN
> > handling) the transport will error out commands depending on the FAILFAST
> > setting. If FAILFAST is set the SCSI layer gets an STS_TRANSPORT error
> > (causing the I/O to be retried), but STS_ERROR if not set (causing the
> > I/O to fail).
> >
> > So if the FAILFAST setting is _not_ aligned between flush_rq and the
> > original we'll get an error on the flush rq and a retry on the original rq,
> > causing the entire command to fail.
> >
> > I guess we need to align them.
>
> But you can't, because multiple pre/postflushes are coalesced into a
> single outstanding flush request. Quite commonly they can and will
> not match.

And if you mean the REQ_FAILFAST_TRANSPORT added by dm - this will
never even see the flush state machine, as that is run in dm-mpath,
which then inserts the fully built flush request into the lower request
queue. At least for request-based multipath; bio-based multipath could
hit it.

2023-08-01 12:12:08

by Christoph Hellwig

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On Mon, Jul 31, 2023 at 06:28:01PM +0200, Hannes Reinecke wrote:
> The flush machinery sends flushes before and/or after the original
> request (preflush/postflush). For blocked transports (i.e. during FC RSCN
> handling) the transport will error out commands depending on the FAILFAST
> setting. If FAILFAST is set the SCSI layer gets an STS_TRANSPORT error
> (causing the I/O to be retried), but STS_ERROR if not set (causing the
> I/O to fail).
>
> So if the FAILFAST setting is _not_ aligned between flush_rq and the
> original we'll get an error on the flush rq and a retry on the original rq,
> causing the entire command to fail.
>
> I guess we need to align them.

But you can't, because multiple pre/postflushes are coalesced into a
single outstanding flush request. Quite commonly they can and will
not match.

2023-08-03 16:13:08

by Chengming Zhou

Subject: Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

On 2023/8/1 19:06, Christoph Hellwig wrote:
> On Tue, Aug 01, 2023 at 01:04:32PM +0200, Christoph Hellwig wrote:
>> On Mon, Jul 31, 2023 at 06:28:01PM +0200, Hannes Reinecke wrote:
>>> The flush machinery sends flushes before and/or after the original
>>> request (preflush/postflush). For blocked transports (i.e. during FC RSCN
>>> handling) the transport will error out commands depending on the FAILFAST
>>> setting. If FAILFAST is set the SCSI layer gets an STS_TRANSPORT error
>>> (causing the I/O to be retried), but STS_ERROR if not set (causing the
>>> I/O to fail).
>>>
>>> So if the FAILFAST setting is _not_ aligned between flush_rq and the
>>> original we'll get an error on the flush rq and a retry on the original rq,
>>> causing the entire command to fail.
>>>
>>> I guess we need to align them.
>>
>> But you can't, because multiple pre/postflushes are coalesced into a
>> single outstanding flush request. Quite commonly they can and will
>> not match.
>
> And if you mean the REQ_FAILFAST_TRANSPORT added by dm - this will
> never even see the flush state machine, as that is run in dm-mpath,
> which then inserts the fully built flush request into the lower request
> queue. At least for request-based multipath; bio-based multipath could
> hit it.

Yes, multiple pre/postflushes are coalesced into a single flush request,
so we can't figure out which request's flags to use.

From the above explanation, can we just drop this inheritance logic? It
seems strange or wrong here.

Thanks.

2023-08-10 15:07:38

by Oliver Sang

Subject: Re: [PATCH v2 4/4] blk-flush: don't need to end rq twice for non postflush



Hello,

kernel test robot noticed a -54.7% regression of stress-ng.symlink.ops_per_sec on:


commit: b3afbe4f56ec07dd9cbfd59734fe5bc084f4d307 ("[PATCH v2 4/4] blk-flush: don't need to end rq twice for non postflush")
url: https://github.com/intel-lab-lkp/linux/commits/chengming-zhou-linux-dev/blk-flush-flush_rq-should-inherit-first_rq-s-cmd_flags/20230725-212146
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/all/[email protected]/
patch subject: [PATCH v2 4/4] blk-flush: don't need to end rq twice for non postflush

testcase: stress-ng
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-9980XE CPU @ 3.00GHz (Skylake) with 32G memory
parameters:

nr_threads: 10%
disk: 1SSD
testtime: 60s
fs: ext4
class: filesystem
test: symlink
cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <[email protected]>
| Closes: https://lore.kernel.org/oe-lkp/[email protected]


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230810/[email protected]

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
filesystem/gcc-12/performance/1SSD/ext4/x86_64-rhel-8.3/10%/debian-11.1-x86_64-20220510.cgz/lkp-skl-d08/symlink/stress-ng/60s

commit:
9e046e4b9a ("blk-flush: kill the flush state machine")
b3afbe4f56 ("blk-flush: don't need to end rq twice for non postflush")

9e046e4b9a326538 b3afbe4f56ec07dd9cbfd59734f
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
81.33 ± 23% +701.4% 651.83 ± 8% perf-c2c.DRAM.local
385.00 ± 24% -60.2% 153.33 ± 18% perf-c2c.HITM.local
12838 ± 9% -83.6% 2111 ± 19% vmstat.io.bo
4894104 ± 2% -25.0% 3672720 ± 4% vmstat.memory.buff
0.03 ± 9% +2.1 2.12 ± 5% mpstat.cpu.all.iowait%
0.05 ± 6% -0.0 0.03 ± 7% mpstat.cpu.all.soft%
1.01 ± 2% -0.4 0.65 ± 5% mpstat.cpu.all.usr%
91.34 -2.4% 89.12 iostat.cpu.idle
0.04 ± 9% +5309.5% 2.10 ± 5% iostat.cpu.iowait
7.60 +6.7% 8.11 iostat.cpu.system
1.02 ± 2% -34.2% 0.67 ± 5% iostat.cpu.user
2.70 ± 7% +2.0 4.75 ± 4% turbostat.C1E%
0.22 ± 3% -43.2% 0.12 ± 6% turbostat.IPC
1.24 ± 63% -100.0% 0.00 turbostat.Pkg%pc2
82.91 -4.1% 79.50 turbostat.PkgWatt
395.50 -54.7% 179.17 ± 6% stress-ng.symlink.ops
6.58 -54.7% 2.98 ± 6% stress-ng.symlink.ops_per_sec
19194 ± 2% -57.0% 8247 ± 8% stress-ng.time.involuntary_context_switches
280.33 +1.7% 285.00 stress-ng.time.percent_of_cpu_this_job_got
157.58 +7.8% 169.80 stress-ng.time.system_time
16.88 -54.6% 7.67 ± 6% stress-ng.time.user_time
4864550 ± 2% -24.0% 3694800 ± 4% meminfo.Active
4805969 ± 2% -24.3% 3637287 ± 4% meminfo.Active(file)
4807385 ± 2% -24.3% 3638747 ± 4% meminfo.Buffers
137311 ± 8% -93.5% 8969 ± 61% meminfo.Dirty
253893 -8.9% 231389 meminfo.KReclaimable
8558145 -13.9% 7368683 ± 2% meminfo.Memused
253893 -8.9% 231389 meminfo.SReclaimable
8560008 -13.9% 7372705 ± 2% meminfo.max_used_kB
1201519 ± 2% -24.3% 909340 ± 4% proc-vmstat.nr_active_file
146816 ± 4% -90.7% 13672 ± 38% proc-vmstat.nr_dirtied
34328 ± 8% -93.5% 2241 ± 61% proc-vmstat.nr_dirty
1918325 -15.2% 1626003 ± 2% proc-vmstat.nr_file_pages
6000925 +5.0% 6298304 proc-vmstat.nr_free_pages
10094 +1.8% 10278 proc-vmstat.nr_mapped
63476 -8.9% 57854 proc-vmstat.nr_slab_reclaimable
25073 -3.4% 24225 proc-vmstat.nr_slab_unreclaimable
101605 ± 6% -86.6% 13660 ± 38% proc-vmstat.nr_written
1201519 ± 2% -24.3% 909340 ± 4% proc-vmstat.nr_zone_active_file
34329 ± 8% -93.5% 2242 ± 61% proc-vmstat.nr_zone_write_pending
2922100 -39.4% 1771460 ± 4% proc-vmstat.numa_hit
2925326 -39.4% 1771382 ± 4% proc-vmstat.numa_local
182296 -37.3% 114290 ± 4% proc-vmstat.pgactivate
3716779 -41.6% 2171212 ± 5% proc-vmstat.pgalloc_normal
1205341 ± 4% -41.5% 705041 ± 5% proc-vmstat.pgfree
852188 ± 10% -83.7% 138853 ± 20% proc-vmstat.pgpgout
0.01 ± 26% -100.0% 0.00 perf-sched.sch_delay.avg.ms.__cond_resched.jbd2_journal_commit_transaction.kjournald2.kthread.ret_from_fork
0.01 ± 62% -100.0% 0.00 perf-sched.sch_delay.avg.ms.kjournald2.kthread.ret_from_fork.ret_from_fork_asm
0.01 ± 26% -100.0% 0.00 perf-sched.sch_delay.max.ms.__cond_resched.jbd2_journal_commit_transaction.kjournald2.kthread.ret_from_fork
0.01 ± 29% -69.0% 0.00 ±101% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
0.01 ± 21% -40.0% 0.01 ± 23% perf-sched.sch_delay.max.ms.io_schedule.bit_wait_io.__wait_on_bit.out_of_line_wait_on_bit
0.01 ± 62% -100.0% 0.00 perf-sched.sch_delay.max.ms.kjournald2.kthread.ret_from_fork.ret_from_fork_asm
0.03 ± 14% +166.0% 0.09 ± 58% perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
101.49 ± 7% +59.7% 162.10 ± 17% perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
2048 ± 3% -33.5% 1363 ± 19% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.03 ± 6% +40.8% 0.05 ± 13% perf-sched.wait_time.avg.ms.__cond_resched.__ext4_handle_dirty_metadata.ext4_getblk.ext4_bread.ext4_init_symlink_block
0.05 ± 11% +24.3% 0.06 ± 14% perf-sched.wait_time.avg.ms.__cond_resched.__getblk_gfp.ext4_getblk.ext4_bread.__ext4_read_dirblock
0.03 ± 51% -90.6% 0.00 ±150% perf-sched.wait_time.avg.ms.__cond_resched.down_read.__ext4_new_inode.ext4_symlink.vfs_symlink
5.83 ± 73% -100.0% 0.00 perf-sched.wait_time.avg.ms.__cond_resched.jbd2_journal_commit_transaction.kjournald2.kthread.ret_from_fork
0.03 ± 61% -84.7% 0.01 ±183% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
5.73 ±140% -99.2% 0.05 ±223% perf-sched.wait_time.avg.ms.io_schedule.bit_wait_io.__wait_on_bit.out_of_line_wait_on_bit
0.70 ± 18% -100.0% 0.00 perf-sched.wait_time.avg.ms.kjournald2.kthread.ret_from_fork.ret_from_fork_asm
101.46 ± 7% +59.7% 162.08 ± 17% perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
0.04 ± 41% -93.0% 0.00 ±150% perf-sched.wait_time.max.ms.__cond_resched.down_read.__ext4_new_inode.ext4_symlink.vfs_symlink
16.16 ±215% -97.7% 0.38 ± 22% perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.user_statfs.__do_sys_statfs
5.83 ± 73% -100.0% 0.00 perf-sched.wait_time.max.ms.__cond_resched.jbd2_journal_commit_transaction.kjournald2.kthread.ret_from_fork
0.07 ± 82% -90.7% 0.01 ±171% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
25.16 ±122% -99.3% 0.16 ±223% perf-sched.wait_time.max.ms.io_schedule.bit_wait_io.__wait_on_bit.out_of_line_wait_on_bit
0.70 ± 18% -100.0% 0.00 perf-sched.wait_time.max.ms.kjournald2.kthread.ret_from_fork.ret_from_fork_asm
9.77 ± 18% +57.5% 15.38 perf-stat.i.MPKI
1.759e+09 -44.0% 9.852e+08 ± 4% perf-stat.i.branch-instructions
20930583 -32.7% 14087563 ± 4% perf-stat.i.branch-misses
10.78 +4.2 14.98 ± 4% perf-stat.i.cache-miss-rate%
8025531 -4.2% 7684648 perf-stat.i.cache-misses
77754557 ± 2% -25.5% 57916998 ± 3% perf-stat.i.cache-references
1.45 ± 8% +173.5% 3.96 ± 5% perf-stat.i.cpi
1.221e+10 +1.7% 1.241e+10 perf-stat.i.cpu-cycles
1592 +8.1% 1721 perf-stat.i.cycles-between-cache-misses
661039 -47.4% 347558 ± 5% perf-stat.i.dTLB-load-misses
2.38e+09 -45.8% 1.289e+09 ± 4% perf-stat.i.dTLB-loads
10275 ± 9% -13.3% 8913 ± 2% perf-stat.i.dTLB-store-misses
1.15e+09 -48.8% 5.892e+08 ± 5% perf-stat.i.dTLB-stores
45.27 ± 3% -14.9 30.42 ± 5% perf-stat.i.iTLB-load-miss-rate%
796188 ± 8% -47.8% 415706 ± 8% perf-stat.i.iTLB-load-misses
929612 ± 3% -10.6% 830862 perf-stat.i.iTLB-loads
9.024e+09 -45.7% 4.902e+09 ± 5% perf-stat.i.instructions
0.75 -45.0% 0.41 ± 4% perf-stat.i.ipc
0.34 +1.7% 0.34 perf-stat.i.metric.GHz
191.07 ± 8% +60.2% 306.12 ± 11% perf-stat.i.metric.K/sec
149.18 -45.7% 81.00 ± 5% perf-stat.i.metric.M/sec
1273025 ± 2% +178.5% 3545558 ± 5% perf-stat.i.node-loads
3427034 ± 2% -43.4% 1939169 ± 5% perf-stat.i.node-stores
8.60 +37.5% 11.82 perf-stat.overall.MPKI
1.19 +0.2 1.43 ± 4% perf-stat.overall.branch-miss-rate%
10.33 +3.0 13.29 ± 4% perf-stat.overall.cache-miss-rate%
1.35 +87.9% 2.54 ± 4% perf-stat.overall.cpi
1520 +6.2% 1614 perf-stat.overall.cycles-between-cache-misses
0.03 -0.0 0.03 perf-stat.overall.dTLB-load-miss-rate%
0.00 ± 11% +0.0 0.00 ± 5% perf-stat.overall.dTLB-store-miss-rate%
46.13 ± 3% -12.8 33.30 ± 5% perf-stat.overall.iTLB-load-miss-rate%
0.74 -46.7% 0.40 ± 5% perf-stat.overall.ipc
1.734e+09 -44.1% 9.699e+08 ± 4% perf-stat.ps.branch-instructions
20610819 -32.7% 13862952 ± 4% perf-stat.ps.branch-misses
7900440 -4.3% 7563578 perf-stat.ps.cache-misses
76460521 ± 2% -25.4% 57007500 ± 3% perf-stat.ps.cache-references
1.201e+10 +1.7% 1.221e+10 perf-stat.ps.cpu-cycles
652055 -47.5% 342193 ± 5% perf-stat.ps.dTLB-load-misses
2.347e+09 -45.9% 1.269e+09 ± 4% perf-stat.ps.dTLB-loads
10120 ± 9% -13.3% 8770 ± 2% perf-stat.ps.dTLB-store-misses
1.134e+09 -48.8% 5.801e+08 ± 5% perf-stat.ps.dTLB-stores
785838 ± 8% -47.9% 409304 ± 8% perf-stat.ps.iTLB-load-misses
915395 ± 3% -10.7% 817774 perf-stat.ps.iTLB-loads
8.896e+09 -45.8% 4.826e+09 ± 5% perf-stat.ps.instructions
1253851 ± 2% +178.3% 3488888 ± 5% perf-stat.ps.node-loads
3362358 ± 2% -43.2% 1909414 ± 5% perf-stat.ps.node-stores
5.702e+11 -46.4% 3.056e+11 ± 4% perf-stat.total.instructions
18.41 ± 6% -11.3 7.13 ± 37% perf-profile.calltrace.cycles-pp.unlink
18.05 ± 6% -11.0 7.05 ± 38% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.unlink
17.92 ± 6% -10.9 7.02 ± 38% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
17.61 ± 6% -10.7 6.94 ± 38% perf-profile.calltrace.cycles-pp.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
17.45 ± 6% -10.5 6.90 ± 39% perf-profile.calltrace.cycles-pp.do_unlinkat.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe.unlink
12.32 ± 8% -6.7 5.61 ± 48% perf-profile.calltrace.cycles-pp.evict.do_unlinkat.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe
8.70 ± 5% -6.6 2.05 ± 14% perf-profile.calltrace.cycles-pp.__statfs
12.06 ± 8% -6.5 5.56 ± 49% perf-profile.calltrace.cycles-pp.ext4_evict_inode.evict.do_unlinkat.__x64_sys_unlink.do_syscall_64
7.28 ± 5% -5.5 1.74 ± 14% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__statfs
6.78 ± 5% -5.2 1.62 ± 13% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__statfs
6.90 ± 5% -5.1 1.79 ± 14% perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_getblk.ext4_bread.ext4_init_symlink_block.ext4_symlink
6.55 ± 5% -4.8 1.72 ± 14% perf-profile.calltrace.cycles-pp.ext4_ext_map_blocks.ext4_map_blocks.ext4_getblk.ext4_bread.ext4_init_symlink_block
5.91 ± 5% -4.5 1.40 ± 14% perf-profile.calltrace.cycles-pp.__do_sys_statfs.do_syscall_64.entry_SYSCALL_64_after_hwframe.__statfs
5.69 ± 5% -4.3 1.35 ± 14% perf-profile.calltrace.cycles-pp.user_statfs.__do_sys_statfs.do_syscall_64.entry_SYSCALL_64_after_hwframe.__statfs
5.46 ± 5% -4.0 1.44 ± 14% perf-profile.calltrace.cycles-pp.ext4_mb_new_blocks.ext4_ext_map_blocks.ext4_map_blocks.ext4_getblk.ext4_bread
4.23 ± 5% -3.2 1.06 ± 10% perf-profile.calltrace.cycles-pp.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe
3.96 ± 6% -3.0 0.94 ± 12% perf-profile.calltrace.cycles-pp.statfs_by_dentry.user_statfs.__do_sys_statfs.do_syscall_64.entry_SYSCALL_64_after_hwframe
4.16 ± 5% -3.0 1.14 ± 5% perf-profile.calltrace.cycles-pp.ext4_add_nondir.ext4_symlink.vfs_symlink.do_symlinkat.__x64_sys_symlink
3.98 ± 5% -3.0 1.00 ± 11% perf-profile.calltrace.cycles-pp.ext4_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64
3.95 ± 5% -3.0 0.99 ± 11% perf-profile.calltrace.cycles-pp.__ext4_unlink.ext4_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink
3.87 ± 6% -3.0 0.92 ± 13% perf-profile.calltrace.cycles-pp.ext4_statfs.statfs_by_dentry.user_statfs.__do_sys_statfs.do_syscall_64
3.50 ± 6% -2.7 0.83 ± 12% perf-profile.calltrace.cycles-pp.__percpu_counter_sum.ext4_statfs.statfs_by_dentry.user_statfs.__do_sys_statfs
3.57 ± 5% -2.6 0.99 ± 5% perf-profile.calltrace.cycles-pp.ext4_add_entry.ext4_add_nondir.ext4_symlink.vfs_symlink.do_symlinkat
3.45 ± 5% -2.5 0.96 ± 5% perf-profile.calltrace.cycles-pp.ext4_dx_add_entry.ext4_add_entry.ext4_add_nondir.ext4_symlink.vfs_symlink
3.36 ± 5% -2.4 0.97 ± 14% perf-profile.calltrace.cycles-pp.ext4_mb_regular_allocator.ext4_mb_new_blocks.ext4_ext_map_blocks.ext4_map_blocks.ext4_getblk
3.13 ± 5% -2.3 0.78 ± 11% perf-profile.calltrace.cycles-pp.__ext4_new_inode.ext4_symlink.vfs_symlink.do_symlinkat.__x64_sys_symlink
2.53 ± 6% -1.9 0.65 ± 10% perf-profile.calltrace.cycles-pp.ext4_mb_clear_bb.ext4_remove_blocks.ext4_ext_rm_leaf.ext4_ext_remove_space.ext4_ext_truncate
2.48 ± 4% -1.8 0.65 ± 5% perf-profile.calltrace.cycles-pp.add_dirent_to_buf.ext4_dx_add_entry.ext4_add_entry.ext4_add_nondir.ext4_symlink
1.92 ± 5% -1.6 0.35 ± 70% perf-profile.calltrace.cycles-pp.ext4_mb_complex_scan_group.ext4_mb_regular_allocator.ext4_mb_new_blocks.ext4_ext_map_blocks.ext4_map_blocks
36.63 ± 5% +23.9 60.51 ± 5% perf-profile.calltrace.cycles-pp.symlink
36.25 ± 5% +24.2 60.42 ± 5% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.symlink
36.12 ± 5% +24.3 60.40 ± 5% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.symlink
35.81 ± 5% +24.5 60.32 ± 5% perf-profile.calltrace.cycles-pp.__x64_sys_symlink.do_syscall_64.entry_SYSCALL_64_after_hwframe.symlink
35.53 ± 5% +24.7 60.24 ± 5% perf-profile.calltrace.cycles-pp.do_symlinkat.__x64_sys_symlink.do_syscall_64.entry_SYSCALL_64_after_hwframe.symlink
34.80 ± 5% +25.3 60.07 ± 5% perf-profile.calltrace.cycles-pp.vfs_symlink.do_symlinkat.__x64_sys_symlink.do_syscall_64.entry_SYSCALL_64_after_hwframe
34.73 ± 5% +25.3 60.06 ± 5% perf-profile.calltrace.cycles-pp.ext4_symlink.vfs_symlink.do_symlinkat.__x64_sys_symlink.do_syscall_64
4.32 ± 22% +28.7 32.98 ± 8% perf-profile.calltrace.cycles-pp._raw_spin_lock.find_revoke_record.jbd2_journal_cancel_revoke.jbd2_journal_get_create_access.__ext4_journal_get_create_access
4.29 ± 22% +28.7 32.97 ± 8% perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.find_revoke_record.jbd2_journal_cancel_revoke.jbd2_journal_get_create_access
27.14 ± 6% +30.9 58.06 ± 5% perf-profile.calltrace.cycles-pp.ext4_init_symlink_block.ext4_symlink.vfs_symlink.do_symlinkat.__x64_sys_symlink
27.08 ± 6% +31.0 58.05 ± 5% perf-profile.calltrace.cycles-pp.ext4_bread.ext4_init_symlink_block.ext4_symlink.vfs_symlink.do_symlinkat
27.08 ± 6% +31.0 58.04 ± 5% perf-profile.calltrace.cycles-pp.ext4_getblk.ext4_bread.ext4_init_symlink_block.ext4_symlink.vfs_symlink
18.24 ± 8% +37.4 55.63 ± 5% perf-profile.calltrace.cycles-pp.__ext4_journal_get_create_access.ext4_getblk.ext4_bread.ext4_init_symlink_block.ext4_symlink
18.24 ± 8% +37.4 55.63 ± 5% perf-profile.calltrace.cycles-pp.jbd2_journal_get_create_access.__ext4_journal_get_create_access.ext4_getblk.ext4_bread.ext4_init_symlink_block
17.96 ± 8% +37.6 55.56 ± 5% perf-profile.calltrace.cycles-pp.jbd2_journal_cancel_revoke.jbd2_journal_get_create_access.__ext4_journal_get_create_access.ext4_getblk.ext4_bread
17.84 ± 8% +37.7 55.53 ± 5% perf-profile.calltrace.cycles-pp.find_revoke_record.jbd2_journal_cancel_revoke.jbd2_journal_get_create_access.__ext4_journal_get_create_access.ext4_getblk
18.44 ± 6% -11.3 7.14 ± 37% perf-profile.children.cycles-pp.unlink
17.61 ± 6% -10.7 6.94 ± 38% perf-profile.children.cycles-pp.__x64_sys_unlink
17.46 ± 6% -10.6 6.90 ± 39% perf-profile.children.cycles-pp.do_unlinkat
12.32 ± 8% -6.7 5.61 ± 48% perf-profile.children.cycles-pp.evict
8.76 ± 5% -6.7 2.06 ± 14% perf-profile.children.cycles-pp.__statfs
12.07 ± 8% -6.5 5.56 ± 49% perf-profile.children.cycles-pp.ext4_evict_inode
7.58 ± 5% -5.6 1.97 ± 13% perf-profile.children.cycles-pp.ext4_map_blocks
7.08 ± 4% -5.3 1.77 ± 9% perf-profile.children.cycles-pp.__ext4_mark_inode_dirty
6.56 ± 5% -4.8 1.72 ± 14% perf-profile.children.cycles-pp.ext4_ext_map_blocks
5.91 ± 5% -4.5 1.40 ± 14% perf-profile.children.cycles-pp.__do_sys_statfs
5.92 ± 4% -4.4 1.49 ± 9% perf-profile.children.cycles-pp.ext4_mark_iloc_dirty
5.69 ± 5% -4.3 1.35 ± 14% perf-profile.children.cycles-pp.user_statfs
5.43 ± 3% -4.1 1.35 ± 10% perf-profile.children.cycles-pp.ext4_do_update_inode
5.46 ± 5% -4.0 1.45 ± 14% perf-profile.children.cycles-pp.ext4_mb_new_blocks
4.23 ± 5% -3.2 1.06 ± 10% perf-profile.children.cycles-pp.vfs_unlink
4.10 ± 5% -3.0 1.06 ± 12% perf-profile.children.cycles-pp.crc32c_pcl_intel_update
3.96 ± 6% -3.0 0.94 ± 12% perf-profile.children.cycles-pp.statfs_by_dentry
4.16 ± 5% -3.0 1.14 ± 5% perf-profile.children.cycles-pp.ext4_add_nondir
3.98 ± 5% -3.0 1.00 ± 11% perf-profile.children.cycles-pp.ext4_unlink
3.96 ± 5% -3.0 0.99 ± 11% perf-profile.children.cycles-pp.__ext4_unlink
3.88 ± 6% -3.0 0.92 ± 13% perf-profile.children.cycles-pp.ext4_statfs
3.96 ± 3% -3.0 1.00 ± 11% perf-profile.children.cycles-pp.ext4_fill_raw_inode
3.64 ± 6% -2.8 0.86 ± 13% perf-profile.children.cycles-pp.__percpu_counter_sum
3.57 ± 5% -2.6 0.99 ± 5% perf-profile.children.cycles-pp.ext4_add_entry
3.45 ± 5% -2.5 0.96 ± 5% perf-profile.children.cycles-pp.ext4_dx_add_entry
3.36 ± 5% -2.4 0.97 ± 14% perf-profile.children.cycles-pp.ext4_mb_regular_allocator
3.14 ± 5% -2.4 0.78 ± 11% perf-profile.children.cycles-pp.__ext4_new_inode
2.85 ± 3% -2.1 0.70 ± 13% perf-profile.children.cycles-pp.ext4_inode_csum_set
2.75 ± 4% -2.1 0.64 ± 8% perf-profile.children.cycles-pp.__find_get_block
2.68 ± 3% -2.0 0.66 ± 13% perf-profile.children.cycles-pp.ext4_inode_csum
2.64 ± 7% -1.9 0.72 ± 7% perf-profile.children.cycles-pp.syscall_exit_to_user_mode
2.54 ± 7% -1.9 0.65 ± 10% perf-profile.children.cycles-pp.ext4_mb_clear_bb
2.54 ± 5% -1.9 0.66 ± 12% perf-profile.children.cycles-pp.syscall_return_via_sysret
2.48 ± 4% -1.8 0.65 ± 5% perf-profile.children.cycles-pp.add_dirent_to_buf
2.42 ± 5% -1.8 0.61 ± 14% perf-profile.children.cycles-pp.ext4_reserve_inode_write
2.29 ± 6% -1.7 0.56 ± 16% perf-profile.children.cycles-pp.user_path_at_empty
2.32 ± 4% -1.7 0.63 ± 7% perf-profile.children.cycles-pp.__filemap_get_folio
1.92 ± 5% -1.4 0.49 ± 12% perf-profile.children.cycles-pp.ext4_mb_complex_scan_group
1.82 ± 4% -1.3 0.47 ± 10% perf-profile.children.cycles-pp.open64
1.71 ± 6% -1.3 0.44 ± 10% perf-profile.children.cycles-pp.do_readlinkat
1.63 ± 8% -1.2 0.40 ± 15% perf-profile.children.cycles-pp.filename_lookup
1.66 ± 7% -1.2 0.44 ± 11% perf-profile.children.cycles-pp.readlinkat
1.66 ± 7% -1.2 0.44 ± 9% perf-profile.children.cycles-pp.crc_pcl
1.60 ± 3% -1.2 0.38 ± 3% perf-profile.children.cycles-pp.filemap_get_entry
1.57 ± 6% -1.2 0.36 ± 11% perf-profile.children.cycles-pp.syscall
1.58 ± 3% -1.2 0.40 ± 7% perf-profile.children.cycles-pp.ext4_find_dest_de
1.63 ± 6% -1.2 0.46 ± 9% perf-profile.children.cycles-pp.__getblk_gfp
1.29 ± 15% -1.1 0.14 ± 22% perf-profile.children.cycles-pp.ret_from_fork_asm
1.29 ± 15% -1.1 0.14 ± 22% perf-profile.children.cycles-pp.ret_from_fork
1.29 ± 15% -1.1 0.14 ± 22% perf-profile.children.cycles-pp.kthread
1.48 ± 8% -1.1 0.36 ± 16% perf-profile.children.cycles-pp.path_lookupat
1.49 ± 5% -1.1 0.38 ± 14% perf-profile.children.cycles-pp.getname_flags
1.41 ± 5% -1.0 0.36 ± 13% perf-profile.children.cycles-pp.__lxstat64
1.42 ± 4% -1.0 0.37 ± 15% perf-profile.children.cycles-pp.ext4_get_inode_loc
1.37 ± 4% -1.0 0.36 ± 15% perf-profile.children.cycles-pp.__ext4_get_inode_loc
1.27 ± 8% -1.0 0.30 ± 20% perf-profile.children.cycles-pp.__ext4_journal_get_write_access
1.29 ± 7% -1.0 0.32 ± 13% perf-profile.children.cycles-pp.__mark_inode_dirty
1.26 ± 5% -0.9 0.31 ± 15% perf-profile.children.cycles-pp.__entry_text_start
1.28 ± 7% -0.9 0.38 ± 12% perf-profile.children.cycles-pp.__ext4_read_dirblock
1.24 ± 4% -0.9 0.35 ± 10% perf-profile.children.cycles-pp._find_next_bit
1.16 ± 9% -0.9 0.30 ± 10% perf-profile.children.cycles-pp.ext4_orphan_del
1.12 ± 6% -0.9 0.27 ± 10% perf-profile.children.cycles-pp.__close
1.09 ± 2% -0.8 0.30 ± 9% perf-profile.children.cycles-pp.__x64_sys_openat
1.05 ± 9% -0.8 0.26 ± 14% perf-profile.children.cycles-pp.ext4_orphan_add
1.02 ± 7% -0.8 0.24 ± 13% perf-profile.children.cycles-pp.ext4_mb_mark_diskspace_used
1.07 -0.8 0.29 ± 8% perf-profile.children.cycles-pp.do_sys_openat2
1.03 ± 8% -0.8 0.26 ± 13% perf-profile.children.cycles-pp.ext4_dirty_inode
0.99 ± 5% -0.7 0.26 ± 10% perf-profile.children.cycles-pp.kmem_cache_alloc
0.96 ± 6% -0.7 0.24 ± 16% perf-profile.children.cycles-pp.strncpy_from_user
1.09 ± 6% -0.7 0.37 ± 11% perf-profile.children.cycles-pp.__getblk_slow
0.92 ± 9% -0.7 0.22 ± 17% perf-profile.children.cycles-pp.__ext4_find_entry
0.88 ± 7% -0.7 0.21 ± 16% perf-profile.children.cycles-pp.ext4_dx_find_entry
0.89 ± 10% -0.7 0.22 ± 15% perf-profile.children.cycles-pp._find_next_or_bit
0.91 ± 5% -0.6 0.26 ± 9% perf-profile.children.cycles-pp.__x64_sys_readlinkat
0.84 ± 7% -0.6 0.19 ± 12% perf-profile.children.cycles-pp.__x64_sys_readlink
0.86 ± 7% -0.6 0.22 ± 9% perf-profile.children.cycles-pp.__ext4_handle_dirty_metadata
0.89 ± 8% -0.6 0.26 ± 18% perf-profile.children.cycles-pp.ext4_block_bitmap_csum_set
0.84 ± 6% -0.6 0.22 ± 15% perf-profile.children.cycles-pp.link_path_walk
0.80 ± 4% -0.6 0.20 ± 6% perf-profile.children.cycles-pp.xas_load
0.81 ± 2% -0.6 0.21 ± 12% perf-profile.children.cycles-pp.do_filp_open
0.77 ± 3% -0.6 0.20 ± 12% perf-profile.children.cycles-pp.path_openat
0.74 ± 7% -0.6 0.18 ± 11% perf-profile.children.cycles-pp.ext4_free_inode
0.70 ± 9% -0.6 0.14 ± 21% perf-profile.children.cycles-pp.jbd2__journal_start
0.77 ± 7% -0.6 0.21 ± 8% perf-profile.children.cycles-pp.dx_probe
0.69 ± 7% -0.5 0.15 ± 18% perf-profile.children.cycles-pp.complete_walk
0.71 ± 4% -0.5 0.18 ± 13% perf-profile.children.cycles-pp.__check_object_size
0.69 ± 8% -0.5 0.18 ± 15% perf-profile.children.cycles-pp.__do_sys_newlstat
0.64 ± 8% -0.5 0.14 ± 18% perf-profile.children.cycles-pp.try_to_unlazy
0.64 ± 5% -0.5 0.15 ± 10% perf-profile.children.cycles-pp.vfs_readlink
0.77 ± 12% -0.5 0.31 ± 13% perf-profile.children.cycles-pp.__do_softirq
0.66 ± 7% -0.5 0.19 ± 10% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.64 ± 3% -0.5 0.18 ± 19% perf-profile.children.cycles-pp.ext4_handle_dirty_dirblock
0.59 ± 8% -0.5 0.13 ± 13% perf-profile.children.cycles-pp.filename_create
0.63 ± 8% -0.5 0.18 ± 12% perf-profile.children.cycles-pp.new_inode
0.57 ± 9% -0.4 0.13 ± 18% perf-profile.children.cycles-pp.__legitimize_path
0.60 ± 9% -0.4 0.16 ± 12% perf-profile.children.cycles-pp.vfs_fstatat
0.55 ± 7% -0.4 0.13 ± 15% perf-profile.children.cycles-pp.__cond_resched
0.58 ± 14% -0.4 0.16 ± 18% perf-profile.children.cycles-pp.rcu_core
0.54 ± 10% -0.4 0.13 ± 12% perf-profile.children.cycles-pp.ext4_es_remove_extent
0.51 ± 8% -0.4 0.10 ± 21% perf-profile.children.cycles-pp.start_this_handle
0.55 ± 8% -0.4 0.14 ± 12% perf-profile.children.cycles-pp.ext4_mb_load_buddy_gfp
0.52 ? 6% -0.4 0.13 ? 23% perf-profile.children.cycles-pp.lookup_one_qstr_excl
0.48 ? 3% -0.4 0.09 ? 22% perf-profile.children.cycles-pp.percpu_counter_add_batch
0.53 ? 9% -0.4 0.14 ? 14% perf-profile.children.cycles-pp.__ext4_journal_stop
0.55 ? 8% -0.4 0.16 ? 13% perf-profile.children.cycles-pp.alloc_inode
0.52 ? 7% -0.4 0.13 ? 23% perf-profile.children.cycles-pp.lookup_dcache
0.51 ? 9% -0.4 0.12 ? 10% perf-profile.children.cycles-pp.__es_remove_extent
0.51 ? 8% -0.4 0.12 ? 14% perf-profile.children.cycles-pp.__filename_parentat
0.54 ? 15% -0.4 0.15 ? 17% perf-profile.children.cycles-pp.rcu_do_batch
0.51 ? 6% -0.4 0.13 ? 23% perf-profile.children.cycles-pp.d_lookup
0.51 ? 6% -0.4 0.13 ? 23% perf-profile.children.cycles-pp.__d_lookup
0.50 ? 10% -0.4 0.12 ? 17% perf-profile.children.cycles-pp.ext4_ext_insert_extent
0.46 ? 3% -0.4 0.10 ? 11% perf-profile.children.cycles-pp.__ext4_check_dir_entry
0.47 ? 8% -0.4 0.11 ? 16% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.49 ? 10% -0.4 0.14 ? 17% perf-profile.children.cycles-pp.ext4_alloc_inode
0.48 ? 9% -0.4 0.13 ? 14% perf-profile.children.cycles-pp.jbd2_journal_stop
0.48 ? 10% -0.3 0.13 ? 8% perf-profile.children.cycles-pp.ext4_es_lookup_extent
0.44 ? 9% -0.3 0.10 ? 17% perf-profile.children.cycles-pp.path_parentat
0.45 ? 6% -0.3 0.11 ? 18% perf-profile.children.cycles-pp.kmem_cache_free
0.42 ? 8% -0.3 0.08 ? 22% perf-profile.children.cycles-pp.jbd2_journal_revoke
0.47 ? 4% -0.3 0.14 ? 25% perf-profile.children.cycles-pp.jbd2_journal_get_write_access
0.46 ? 9% -0.3 0.13 ? 12% perf-profile.children.cycles-pp.pagecache_get_page
0.44 ? 6% -0.3 0.12 ? 14% perf-profile.children.cycles-pp.ext4_delete_entry
0.42 ? 13% -0.3 0.09 ? 11% perf-profile.children.cycles-pp._raw_read_lock
0.44 ? 10% -0.3 0.12 ? 14% perf-profile.children.cycles-pp.vfs_statx
0.42 ? 8% -0.3 0.11 ? 16% perf-profile.children.cycles-pp.jbd2_journal_dirty_metadata
0.43 ? 11% -0.3 0.12 ? 11% perf-profile.children.cycles-pp.kmem_cache_alloc_lru
0.41 ? 4% -0.3 0.10 ? 18% perf-profile.children.cycles-pp.check_heap_object
0.41 ? 10% -0.3 0.10 ? 13% perf-profile.children.cycles-pp.xas_descend
0.39 ? 10% -0.3 0.09 ? 12% perf-profile.children.cycles-pp.__brelse
0.37 ? 8% -0.3 0.08 ? 19% perf-profile.children.cycles-pp.ext4_es_insert_extent
0.40 ? 11% -0.3 0.12 ? 23% perf-profile.children.cycles-pp.smpboot_thread_fn
0.39 ? 8% -0.3 0.10 ? 16% perf-profile.children.cycles-pp.fscrypt_match_name
0.36 ? 7% -0.3 0.08 ? 22% perf-profile.children.cycles-pp.dput
0.38 ? 12% -0.3 0.11 ? 22% perf-profile.children.cycles-pp.run_ksoftirqd
0.36 ? 7% -0.3 0.09 ? 23% perf-profile.children.cycles-pp.ext4_group_desc_csum_set
0.36 ? 8% -0.3 0.09 ? 21% perf-profile.children.cycles-pp.mb_find_extent
0.36 ? 16% -0.3 0.09 ? 24% perf-profile.children.cycles-pp.lockref_get_not_dead
0.35 ? 9% -0.3 0.08 ? 16% perf-profile.children.cycles-pp.ext4_get_link
0.34 ? 7% -0.3 0.08 ? 23% perf-profile.children.cycles-pp.ext4_group_desc_csum
0.30 ? 8% -0.3 0.05 ? 48% perf-profile.children.cycles-pp.ext4_get_group_desc
0.34 ? 12% -0.2 0.10 ? 21% perf-profile.children.cycles-pp.inode_permission
0.36 ? 6% -0.2 0.11 ? 26% perf-profile.children.cycles-pp.jbd2_write_access_granted
0.30 ? 6% -0.2 0.06 ? 51% perf-profile.children.cycles-pp.jbd2_journal_forget
0.32 ? 7% -0.2 0.08 ? 20% perf-profile.children.cycles-pp.__slab_free
0.31 ? 6% -0.2 0.08 ? 20% perf-profile.children.cycles-pp._IO_default_xsputn
0.31 ? 10% -0.2 0.08 ? 19% perf-profile.children.cycles-pp.alloc_empty_file
0.29 ? 11% -0.2 0.06 ? 15% perf-profile.children.cycles-pp.__es_insert_extent
0.31 ? 8% -0.2 0.08 ? 14% perf-profile.children.cycles-pp.mb_find_order_for_block
0.32 ? 2% -0.2 0.09 ? 10% perf-profile.children.cycles-pp.memset_orig
0.31 ? 13% -0.2 0.09 ? 32% perf-profile.children.cycles-pp._raw_spin_trylock
0.34 ? 11% -0.2 0.13 ? 11% perf-profile.children.cycles-pp.filemap_add_folio
0.27 ? 5% -0.2 0.06 ? 11% perf-profile.children.cycles-pp.crypto_shash_update
0.25 ? 13% -0.2 0.04 ? 71% perf-profile.children.cycles-pp.path_init
0.29 ? 9% -0.2 0.08 ? 11% perf-profile.children.cycles-pp.___slab_alloc
0.24 ? 8% -0.2 0.03 ?101% perf-profile.children.cycles-pp.path_put
0.26 ? 11% -0.2 0.06 ? 17% perf-profile.children.cycles-pp.exit_to_user_mode_loop
0.44 ? 13% -0.2 0.25 ? 12% perf-profile.children.cycles-pp.__irq_exit_rcu
0.25 ? 10% -0.2 0.06 ? 47% perf-profile.children.cycles-pp.ext4_read_block_bitmap
0.28 ? 11% -0.2 0.08 ? 22% perf-profile.children.cycles-pp.ext4_mb_free_metadata
0.23 ? 18% -0.2 0.04 ? 72% perf-profile.children.cycles-pp.ext4_inode_bitmap_csum_set
0.29 ? 13% -0.2 0.10 ? 40% perf-profile.children.cycles-pp.ext4_mb_prefetch
0.22 ? 6% -0.2 0.03 ?105% perf-profile.children.cycles-pp.lockref_put_return
0.22 ? 12% -0.2 0.04 ? 72% perf-profile.children.cycles-pp.task_work_run
0.24 ? 10% -0.2 0.06 ? 47% perf-profile.children.cycles-pp.ext4_read_block_bitmap_nowait
0.21 ? 9% -0.2 0.03 ?100% perf-profile.children.cycles-pp.ext4_es_free_extent
0.22 ? 10% -0.2 0.04 ? 72% perf-profile.children.cycles-pp.__check_block_validity
0.22 ? 13% -0.2 0.04 ? 77% perf-profile.children.cycles-pp.generic_permission
0.24 ? 8% -0.2 0.07 ? 20% perf-profile.children.cycles-pp.ext4_sb_block_valid
0.24 ? 4% -0.2 0.07 ? 17% perf-profile.children.cycles-pp.__virt_addr_valid
0.24 ? 14% -0.2 0.07 ? 14% perf-profile.children.cycles-pp.allocate_slab
0.20 ? 9% -0.2 0.02 ? 99% perf-profile.children.cycles-pp.__check_heap_object
0.20 ? 14% -0.2 0.03 ?102% perf-profile.children.cycles-pp.walk_component
0.25 ? 10% -0.2 0.08 ? 22% perf-profile.children.cycles-pp.ext4_get_group_info
0.23 ? 11% -0.2 0.06 ? 14% perf-profile.children.cycles-pp.do_open
0.22 ? 9% -0.2 0.06 ? 13% perf-profile.children.cycles-pp.ext4_mb_use_best_found
0.22 ? 9% -0.2 0.06 ? 46% perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
0.22 ? 11% -0.2 0.06 ? 15% perf-profile.children.cycles-pp.readlink_copy
0.21 ? 10% -0.2 0.04 ? 45% perf-profile.children.cycles-pp.__ext4fs_dirhash
0.25 ? 11% -0.2 0.10 ? 11% perf-profile.children.cycles-pp.__filemap_add_folio
0.20 ? 9% -0.2 0.04 ? 45% perf-profile.children.cycles-pp.ext4_mb_prefetch_fini
0.20 ? 18% -0.2 0.05 ? 46% perf-profile.children.cycles-pp.shuffle_freelist
0.22 ? 11% -0.2 0.07 ? 21% perf-profile.children.cycles-pp.folio_mark_accessed
0.19 ? 16% -0.1 0.04 ? 45% perf-profile.children.cycles-pp.__list_del_entry_valid
0.20 ? 12% -0.1 0.06 ? 19% perf-profile.children.cycles-pp._copy_to_user
0.19 ? 3% -0.1 0.05 ? 46% perf-profile.children.cycles-pp.__jbd2_journal_file_buffer
0.18 ? 18% -0.1 0.04 ? 45% perf-profile.children.cycles-pp.setup_object
0.17 ? 15% -0.1 0.05 ? 46% perf-profile.children.cycles-pp.ext4_fc_init_inode
0.16 ? 10% -0.1 0.04 ? 45% perf-profile.children.cycles-pp.memcg_slab_post_alloc_hook
0.16 ? 13% -0.1 0.05 ? 75% perf-profile.children.cycles-pp.folio_alloc
0.17 ? 13% -0.1 0.06 ? 53% perf-profile.children.cycles-pp.__alloc_pages
0.13 ? 19% -0.1 0.03 ?103% perf-profile.children.cycles-pp.get_page_from_freelist
0.27 ? 12% -0.1 0.18 ? 17% perf-profile.children.cycles-pp.ext4_mb_find_good_group_avg_frag_lists
0.12 ? 8% -0.1 0.04 ? 73% perf-profile.children.cycles-pp.touch_atime
0.11 ? 14% -0.1 0.04 ? 72% perf-profile.children.cycles-pp._raw_spin_lock_irq
36.65 ? 5% +23.9 60.52 ? 5% perf-profile.children.cycles-pp.symlink
35.81 ? 5% +24.5 60.32 ? 5% perf-profile.children.cycles-pp.__x64_sys_symlink
35.54 ? 5% +24.7 60.24 ? 5% perf-profile.children.cycles-pp.do_symlinkat
34.80 ? 5% +25.3 60.07 ? 5% perf-profile.children.cycles-pp.vfs_symlink
34.73 ? 5% +25.3 60.06 ? 5% perf-profile.children.cycles-pp.ext4_symlink
8.16 ? 6% +28.3 36.49 ? 4% perf-profile.children.cycles-pp._raw_spin_lock
6.76 ? 6% +29.4 36.14 ? 4% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
28.56 ? 6% +29.9 58.46 ? 5% perf-profile.children.cycles-pp.ext4_bread
28.54 ? 6% +29.9 58.46 ? 5% perf-profile.children.cycles-pp.ext4_getblk
27.14 ? 6% +30.9 58.06 ? 5% perf-profile.children.cycles-pp.ext4_init_symlink_block
18.24 ? 8% +37.4 55.63 ? 5% perf-profile.children.cycles-pp.__ext4_journal_get_create_access
18.24 ? 8% +37.4 55.63 ? 5% perf-profile.children.cycles-pp.jbd2_journal_get_create_access
17.96 ? 8% +37.6 55.56 ? 5% perf-profile.children.cycles-pp.jbd2_journal_cancel_revoke
17.85 ? 8% +37.7 55.53 ? 5% perf-profile.children.cycles-pp.find_revoke_record
2.53 ? 5% -1.9 0.66 ? 12% perf-profile.self.cycles-pp.syscall_return_via_sysret
2.23 ? 5% -1.7 0.54 ? 15% perf-profile.self.cycles-pp.__percpu_counter_sum
2.19 ? 4% -1.6 0.56 ? 13% perf-profile.self.cycles-pp.crc32c_pcl_intel_update
2.15 ? 7% -1.5 0.60 ? 8% perf-profile.self.cycles-pp.syscall_exit_to_user_mode
1.64 ? 7% -1.2 0.44 ? 9% perf-profile.self.cycles-pp.crc_pcl
1.47 ? 4% -1.1 0.40 ? 15% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
1.41 ? 8% -1.0 0.36 ? 8% perf-profile.self.cycles-pp._raw_spin_lock
1.28 ? 6% -1.0 0.30 ? 15% perf-profile.self.cycles-pp.__find_get_block
1.24 ? 4% -0.9 0.35 ? 11% perf-profile.self.cycles-pp._find_next_bit
1.10 ? 5% -0.8 0.27 ? 16% perf-profile.self.cycles-pp.__entry_text_start
0.96 ? 5% -0.7 0.26 ? 13% perf-profile.self.cycles-pp.ext4_fill_raw_inode
0.82 ? 6% -0.6 0.19 ? 11% perf-profile.self.cycles-pp.filemap_get_entry
0.76 ? 9% -0.6 0.18 ? 17% perf-profile.self.cycles-pp._find_next_or_bit
0.76 ? 4% -0.6 0.20 ? 11% perf-profile.self.cycles-pp.ext4_find_dest_de
0.66 ? 6% -0.5 0.17 ? 16% perf-profile.self.cycles-pp.__ext4_get_inode_loc
0.60 ? 5% -0.5 0.14 ? 3% perf-profile.self.cycles-pp.kmem_cache_alloc
0.62 ? 8% -0.4 0.18 ? 10% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
0.53 ? 8% -0.4 0.12 ? 17% perf-profile.self.cycles-pp.ext4_inode_csum
0.46 ? 3% -0.4 0.08 ? 21% perf-profile.self.cycles-pp.percpu_counter_add_batch
0.46 ? 3% -0.4 0.10 ? 11% perf-profile.self.cycles-pp.__ext4_check_dir_entry
0.47 ? 13% -0.4 0.10 ? 18% perf-profile.self.cycles-pp.__ext4_journal_get_write_access
0.47 ? 8% -0.4 0.12 ? 22% perf-profile.self.cycles-pp.ext4_do_update_inode
0.41 ? 12% -0.3 0.09 ? 10% perf-profile.self.cycles-pp._raw_read_lock
0.38 ? 9% -0.3 0.08 ? 14% perf-profile.self.cycles-pp.__brelse
0.39 ? 9% -0.3 0.10 ? 27% perf-profile.self.cycles-pp.__d_lookup
0.38 ? 11% -0.3 0.10 ? 12% perf-profile.self.cycles-pp.xas_descend
0.37 ? 10% -0.3 0.09 ? 14% perf-profile.self.cycles-pp.link_path_walk
0.37 ? 4% -0.3 0.10 ? 18% perf-profile.self.cycles-pp.kmem_cache_free
0.36 ? 11% -0.3 0.08 ? 18% perf-profile.self.cycles-pp.__cond_resched
0.37 ? 9% -0.3 0.11 ? 9% perf-profile.self.cycles-pp.ext4_mark_iloc_dirty
0.35 ? 15% -0.3 0.09 ? 24% perf-profile.self.cycles-pp.lockref_get_not_dead
0.34 ? 9% -0.3 0.09 ? 22% perf-profile.self.cycles-pp.strncpy_from_user
0.30 ? 8% -0.2 0.05 ? 48% perf-profile.self.cycles-pp.ext4_get_group_desc
0.34 ? 11% -0.2 0.10 ? 21% perf-profile.self.cycles-pp.__ext4_handle_dirty_metadata
0.34 ? 6% -0.2 0.10 ? 28% perf-profile.self.cycles-pp.jbd2_write_access_granted
0.32 ? 7% -0.2 0.08 ? 20% perf-profile.self.cycles-pp.__slab_free
0.31 ? 4% -0.2 0.08 ? 15% perf-profile.self.cycles-pp.xas_load
0.31 ? 6% -0.2 0.08 ? 20% perf-profile.self.cycles-pp._IO_default_xsputn
0.32 ? 2% -0.2 0.09 ? 10% perf-profile.self.cycles-pp.memset_orig
0.31 ? 7% -0.2 0.08 ? 14% perf-profile.self.cycles-pp.mb_find_order_for_block
0.31 ? 14% -0.2 0.09 ? 32% perf-profile.self.cycles-pp._raw_spin_trylock
0.25 ? 11% -0.2 0.04 ? 73% perf-profile.self.cycles-pp.ext4_statfs
0.26 ? 12% -0.2 0.05 ? 49% perf-profile.self.cycles-pp.__ext4_new_inode
0.26 ? 4% -0.2 0.06 ? 13% perf-profile.self.cycles-pp.crypto_shash_update
0.23 ? 13% -0.2 0.04 ? 71% perf-profile.self.cycles-pp.path_init
0.23 ? 14% -0.2 0.03 ?100% perf-profile.self.cycles-pp.jbd2_journal_dirty_metadata
0.25 ? 8% -0.2 0.06 ? 46% perf-profile.self.cycles-pp.fscrypt_match_name
0.22 ? 6% -0.2 0.03 ?105% perf-profile.self.cycles-pp.lockref_put_return
0.21 ? 6% -0.2 0.03 ?100% perf-profile.self.cycles-pp.__ext4_mark_inode_dirty
0.25 ? 14% -0.2 0.08 ? 18% perf-profile.self.cycles-pp.ext4_es_lookup_extent
0.24 ? 7% -0.2 0.06 ? 21% perf-profile.self.cycles-pp.ext4_sb_block_valid
0.20 ? 10% -0.2 0.02 ? 99% perf-profile.self.cycles-pp.__check_heap_object
0.23 ? 10% -0.2 0.06 ? 52% perf-profile.self.cycles-pp.ext4_get_group_info
0.20 ? 6% -0.2 0.04 ? 71% perf-profile.self.cycles-pp.ext4_reserve_inode_write
0.23 ? 4% -0.2 0.06 ? 23% perf-profile.self.cycles-pp.__virt_addr_valid
0.19 ? 15% -0.2 0.03 ?102% perf-profile.self.cycles-pp.ext4_getblk
0.19 ? 16% -0.1 0.04 ? 45% perf-profile.self.cycles-pp.__list_del_entry_valid
0.18 ? 5% -0.1 0.04 ? 45% perf-profile.self.cycles-pp.dx_probe
0.20 ? 12% -0.1 0.08 ? 72% perf-profile.self.cycles-pp.ext4_mb_prefetch
0.17 ? 16% -0.1 0.05 ? 46% perf-profile.self.cycles-pp.ext4_fc_init_inode
0.12 ? 8% -0.1 0.03 ?100% perf-profile.self.cycles-pp.do_syscall_64
0.10 ? 18% -0.1 0.03 ?102% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.10 ? 23% -0.0 0.07 ? 20% perf-profile.self.cycles-pp.ext4_mb_find_good_group_avg_frag_lists
13.44 ? 5% +9.0 22.43 ? 4% perf-profile.self.cycles-pp.find_revoke_record
6.72 ? 6% +29.2 35.97 ? 4% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
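
For reading the table above: perf-profile.children.cycles-pp.FN is the
percentage of sampled cycles spent in FN or any function it calls
(inclusive), while perf-profile.self.cycles-pp.FN counts only cycles in
FN's own body (exclusive). So the jump in both children and self time for
find_revoke_record and native_queued_spin_lock_slowpath points at lock
contention inside the jbd2 revoke-record lookup itself. The robot gathers
these numbers through the lkp-tests harness; a minimal sketch of how an
equivalent system-wide profile could be collected by hand (the exact
flags here are illustrative, not the robot's pipeline):

	# sample all CPUs with call graphs while the workload runs
	perf record -a -g -- sleep 10
	# inclusive view, comparable to the .children rows
	perf report --children --stdio
	# exclusive view, comparable to the .self rows
	perf report --no-children --stdio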




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki