2020-05-04 13:31:38

by Konstantin Khlebnikov

Subject: [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock()

The RCU lock is required only in blk_account_io_start(), to look up the
partition. After that the request holds a reference to the related
hd_struct.

Replace get_cpu() with preempt_disable() - the returned CPU index is
unused.

The non-SMP case also needs preempt_disable(), otherwise the statistics
update could be non-atomic. Previously that was provided by
rcu_read_lock().

Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
 block/blk-core.c          | 3 +++
 include/linux/part_stat.h | 7 +++----
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7f11560bfddb..45ddf7238c06 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1362,6 +1362,7 @@ void blk_account_io_start(struct request *rq, bool new_io)
 		part = rq->part;
 		part_stat_inc(part, merges[rw]);
 	} else {
+		rcu_read_lock();
 		part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
 		if (!hd_struct_try_get(part)) {
 			/*
@@ -1375,6 +1376,8 @@ void blk_account_io_start(struct request *rq, bool new_io)
 			part = &rq->rq_disk->part0;
 			hd_struct_get(part);
 		}
+		rcu_read_unlock();
+
 		part_inc_in_flight(rq->q, part, rw);
 		rq->part = part;
 	}
diff --git a/include/linux/part_stat.h b/include/linux/part_stat.h
index ece607607a86..755a01f0fd61 100644
--- a/include/linux/part_stat.h
+++ b/include/linux/part_stat.h
@@ -16,9 +16,10 @@
  * part_stat_{add|set_all}() and {init|free}_part_stats are for
  * internal use only.
  */
+#define part_stat_lock()	preempt_disable()
+#define part_stat_unlock()	preempt_enable()
+
 #ifdef	CONFIG_SMP
-#define part_stat_lock()	({ rcu_read_lock(); get_cpu(); })
-#define part_stat_unlock()	do { put_cpu(); rcu_read_unlock(); } while (0)
 
 #define part_stat_get_cpu(part, field, cpu)				\
 	(per_cpu_ptr((part)->dkstats, (cpu))->field)
@@ -58,8 +59,6 @@ static inline void free_part_stats(struct hd_struct *part)
 }
 
 #else /* !CONFIG_SMP */
-#define part_stat_lock()	({ rcu_read_lock(); 0; })
-#define part_stat_unlock()	rcu_read_unlock()
 
 #define part_stat_get(part, field)	((part)->dkstats.field)
 #define part_stat_get_cpu(part, field, cpu)	part_stat_get(part, field)

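For context, a part_stat_lock() user now follows the pattern sketched
below; part_account_sectors() is a made-up name, modeled on
blk_account_io_completion(), and is not part of the patch. Disabling
preemption is all the counters need: in the SMP case part_stat_add()
only touches the local CPU's dkstats, and in the non-SMP case it keeps
the read-modify-write of the plain fields from being preempted halfway.

/* Illustrative sketch, modeled on blk_account_io_completion() */
static void part_account_sectors(struct hd_struct *part, int sgrp,
				 unsigned int bytes)
{
	part_stat_lock();		/* now just preempt_disable() */
	part_stat_add(part, sectors[sgrp], bytes >> 9);
	part_stat_unlock();		/* preempt_enable() */
}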

2020-05-04 13:33:15

by Konstantin Khlebnikov

Subject: [PATCH 3/4] block/part_stat: account merge of two requests

A merge of two requests is currently not reflected in the partition
merge counters at all; account it. Also rename blk_account_io_merge()
into blk_account_io_merge_request() to distinguish it from merging a
request and a bio.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
block/blk-merge.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index a04e991b5ded..37bced39bae8 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -662,20 +662,23 @@ void blk_rq_set_mixed_merge(struct request *rq)
 	rq->rq_flags |= RQF_MIXED_MERGE;
 }
 
-static void blk_account_io_merge(struct request *req)
+static void blk_account_io_merge_request(struct request *req)
 {
 	if (blk_do_io_stat(req)) {
+		const int sgrp = op_stat_group(req_op(req));
 		struct hd_struct *part;
 
 		part_stat_lock();
 		part = req->part;
 
+		part_stat_inc(part, merges[sgrp]);
 		part_dec_in_flight(req->q, part, rq_data_dir(req));
 
 		hd_struct_put(part);
 		part_stat_unlock();
 	}
 }
+
 /*
  * Two cases of handling DISCARD merge:
  * If max_discard_segments > 1, the driver takes every bio
@@ -787,7 +790,7 @@ static struct request *attempt_merge(struct request_queue *q,
 	/*
 	 * 'next' is going away, so update stats accordingly
 	 */
-	blk_account_io_merge(next);
+	blk_account_io_merge_request(next);
 
 	/*
 	 * ownership of bio passed from next to req, return 'next' for

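For reference, sgrp comes from op_stat_group(), which gives discards
their own stat row instead of folding them into writes. Roughly as
defined in include/linux/blk_types.h at this point (quoted from memory,
so treat it as a sketch):

static inline int op_stat_group(unsigned int op)
{
	if (op_is_discard(op))
		return STAT_DISCARD;
	return op_is_write(op);	/* STAT_WRITE (1) or STAT_READ (0) */
}

That is why the new counter is merges[sgrp], while the in-flight
accounting keeps using rq_data_dir(req).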
2020-05-04 13:33:38

by Konstantin Khlebnikov

Subject: [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio()

Move the non-"new_io" branch of blk_account_io_start() into a separate
function. This also fixes merge accounting for discards: they were
counted as write merges, because the merges[] index came from
rq_data_dir() rather than op_stat_group().

Unlike blk_account_io_start(), blk_account_io_merge_bio() doesn't call
update_io_ticks(): there is no reason to update them when a bio merges
into an already started request.

Signed-off-by: Konstantin Khlebnikov <[email protected]>
---
 block/blk-core.c | 39 ++++++++++++++++++++++-----------------
 block/blk-exec.c |  2 +-
 block/blk-mq.c   |  2 +-
 block/blk.h      |  2 +-
 4 files changed, 25 insertions(+), 20 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 45ddf7238c06..18fb42eb2f18 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -622,6 +622,17 @@ void blk_put_request(struct request *req)
 }
 EXPORT_SYMBOL(blk_put_request);
 
+static void blk_account_io_merge_bio(struct request *req)
+{
+	if (blk_do_io_stat(req)) {
+		const int sgrp = op_stat_group(req_op(req));
+
+		part_stat_lock();
+		part_stat_inc(req->part, merges[sgrp]);
+		part_stat_unlock();
+	}
+}
+
 bool bio_attempt_back_merge(struct request *req, struct bio *bio,
 		unsigned int nr_segs)
 {
@@ -640,7 +651,7 @@ bool bio_attempt_back_merge(struct request *req, struct bio *bio,
 	req->biotail = bio;
 	req->__data_len += bio->bi_iter.bi_size;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 }
 
@@ -664,7 +675,7 @@ bool bio_attempt_front_merge(struct request *req, struct bio *bio,
 	req->__sector = bio->bi_iter.bi_sector;
 	req->__data_len += bio->bi_iter.bi_size;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 }
 
@@ -686,7 +697,7 @@ bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
 	req->__data_len += bio->bi_iter.bi_size;
 	req->nr_phys_segments = segments + 1;
 
-	blk_account_io_start(req, false);
+	blk_account_io_merge_bio(req);
 	return true;
 no_merge:
 	req_set_nomerge(q, req);
@@ -1258,7 +1269,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		return BLK_STS_IOERR;
 
 	if (blk_queue_io_stat(q))
-		blk_account_io_start(rq, true);
+		blk_account_io_start(rq);
 
 	/*
 	 * Since we have a scheduler attached on the top device,
@@ -1348,20 +1359,14 @@ void blk_account_io_done(struct request *req, u64 now)
 	}
 }
 
-void blk_account_io_start(struct request *rq, bool new_io)
+void blk_account_io_start(struct request *rq)
 {
 	struct hd_struct *part;
 	int rw = rq_data_dir(rq);
 
-	if (!blk_do_io_stat(rq))
-		return;
-
-	part_stat_lock();
+	if (blk_do_io_stat(rq)) {
+		part_stat_lock();
 
-	if (!new_io) {
-		part = rq->part;
-		part_stat_inc(part, merges[rw]);
-	} else {
 		rcu_read_lock();
 		part = disk_map_sector_rcu(rq->rq_disk, blk_rq_pos(rq));
 		if (!hd_struct_try_get(part)) {
@@ -1378,13 +1383,13 @@ void blk_account_io_start(struct request *rq, bool new_io)
 		}
 		rcu_read_unlock();
 
-		part_inc_in_flight(rq->q, part, rw);
 		rq->part = part;
-	}
 
-	update_io_ticks(part, jiffies, false);
+		part_inc_in_flight(rq->q, part, rw);
+		update_io_ticks(part, jiffies, false);
 
-	part_stat_unlock();
+		part_stat_unlock();
+	}
 }
 
 /*
diff --git a/block/blk-exec.c b/block/blk-exec.c
index e20a852ae432..85324d53d072 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -55,7 +55,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	rq->rq_disk = bd_disk;
 	rq->end_io = done;
 
-	blk_account_io_start(rq, true);
+	blk_account_io_start(rq);
 
 	/*
 	 * don't check dying flag for MQ because the request won't
diff --git a/block/blk-mq.c b/block/blk-mq.c
index bcc3a2397d4a..049c4f9417c3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1794,7 +1794,7 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
 	rq->write_hint = bio->bi_write_hint;
 	blk_rq_bio_prep(rq, bio, nr_segs);
 
-	blk_account_io_start(rq, true);
+	blk_account_io_start(rq);
 }
 
 static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,

static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
diff --git a/block/blk.h b/block/blk.h
index 73bd3b1c6938..06cd57cc10fb 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -195,7 +195,7 @@ bool bio_attempt_discard_merge(struct request_queue *q, struct request *req,
 bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 		unsigned int nr_segs, struct request **same_queue_rq);
 
-void blk_account_io_start(struct request *req, bool new_io);
+void blk_account_io_start(struct request *req);
 void blk_account_io_completion(struct request *req, unsigned int bytes);
 void blk_account_io_done(struct request *req, u64 now);

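The discard part of the fix comes down to the merges[] index; a
before/after sketch (illustrative, condensed from the diffs above):

	/* before, in blk_account_io_start(): the index is the data
	 * direction, so a discard bio merged into a request was
	 * counted as a write merge (rq_data_dir() maps DISCARD to WRITE)
	 */
	part_stat_inc(part, merges[rq_data_dir(req)]);

	/* after, in blk_account_io_merge_bio(): index by stat group,
	 * so the same merge lands in merges[STAT_DISCARD]
	 */
	part_stat_inc(req->part, merges[op_stat_group(req_op(req))]);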

2020-05-04 14:05:26

by Christoph Hellwig

Subject: Re: [PATCH 1/4] block/part_stat: remove rcu_read_lock() from part_stat_lock()

On Mon, May 04, 2020 at 04:29:29PM +0300, Konstantin Khlebnikov wrote:
> The RCU lock is required only in blk_account_io_start(), to look up the
> partition. After that the request holds a reference to the related
> hd_struct.
>
> Replace get_cpu() with preempt_disable() - the returned CPU index is
> unused.
>
> The non-SMP case also needs preempt_disable(), otherwise the statistics
> update could be non-atomic. Previously that was provided by
> rcu_read_lock().
>
> Signed-off-by: Konstantin Khlebnikov <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

although I wonder if we should just kill off part_stat_lock and
part_stat_unlock.
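
One hypothetical way that could look - a sketch only, not something
this series does - is folding the safety into the accessor itself with
a this_cpu op, so a single counter update needs no lock/unlock pair:

/* Hypothetical helper: this_cpu_inc() is preemption-safe on its own,
 * so no surrounding part_stat_lock()/part_stat_unlock() is needed.
 */
#define part_stat_inc_unlocked(part, field) \
	this_cpu_inc((part)->dkstats->field)

Batched updates like blk_account_io_done() might still want a single
preempt section so all fields land in one CPU's buffer, so the pair may
have to survive in some form.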

2020-05-04 14:07:52

by Christoph Hellwig

Subject: Re: [PATCH 3/4] block/part_stat: account merge of two requests

On Mon, May 04, 2020 at 04:30:52PM +0300, Konstantin Khlebnikov wrote:
> A merge of two requests is currently not reflected in the partition
> merge counters at all; account it. Also rename blk_account_io_merge()
> into blk_account_io_merge_request() to distinguish it from merging a
> request and a bio.
>
> Signed-off-by: Konstantin Khlebnikov <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

2020-05-04 14:12:50

by Christoph Hellwig

Subject: Re: [PATCH 4/4] block/part_stat: add helper blk_account_io_merge_bio()

On Mon, May 04, 2020 at 04:31:04PM +0300, Konstantin Khlebnikov wrote:
> Move the non-"new_io" branch of blk_account_io_start() into a separate
> function. This also fixes merge accounting for discards: they were
> counted as write merges, because the merges[] index came from
> rq_data_dir() rather than op_stat_group().
>
> Unlike blk_account_io_start(), blk_account_io_merge_bio() doesn't call
> update_io_ticks(): there is no reason to update them when a bio merges
> into an already started request.
>
> Signed-off-by: Konstantin Khlebnikov <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

Nitpick below:

> +void blk_account_io_start(struct request *rq)
>  {
>  	struct hd_struct *part;
>  	int rw = rq_data_dir(rq);
>
> +	if (blk_do_io_stat(rq)) {

part and rw probably should move inside this branch.
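
i.e. something like this (sketch):

void blk_account_io_start(struct request *rq)
{
	if (blk_do_io_stat(rq)) {
		struct hd_struct *part;
		int rw = rq_data_dir(rq);

		part_stat_lock();
		/* ... partition lookup and accounting as above ... */
		part_stat_unlock();
	}
}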