2020-06-27 07:33:25

by Christoph Hellwig

[permalink] [raw]
Subject: drive-by blk-cgroup cleanups

Hi all,

while looking into another "project" I ended up wading through the
blkcq code for research and found a bunch of lose ends. So here is
a bunch of drive-by cleanups for the code.

Diffstat:
block/bio.c | 143 +----------------------------------
block/blk-cgroup.c | 182 +++++++++++++++++++++++++++++++++++++--------
block/blk-core.c | 7 +
block/blk-throttle.c | 10 +-
block/blk.h | 2
drivers/md/dm.c | 5 -
include/linux/bio.h | 9 --
include/linux/blk-cgroup.h | 101 ------------------------
kernel/cgroup/rstat.c | 1
mm/page_io.c | 17 ++++
10 files changed, 192 insertions(+), 285 deletions(-)


2020-06-27 07:33:29

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 05/14] block: move bio_associate_blkg_from_page to mm/page_io.c

bio_associate_blkg_from_page is a special purpose helper for swap bios
that doesn't need access to bio internals. Move it to the swap code
instead of having it in bio.c.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/bio.c | 26 --------------------------
include/linux/bio.h | 7 -------
mm/page_io.c | 17 +++++++++++++++++
3 files changed, 17 insertions(+), 33 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index bc8de2432e3645..901d22715dd4f3 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1659,32 +1659,6 @@ void bio_associate_blkg_from_css(struct bio *bio,
}
EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);

-#ifdef CONFIG_MEMCG
-/**
- * bio_associate_blkg_from_page - associate a bio with the page's blkg
- * @bio: target bio
- * @page: the page to lookup the blkcg from
- *
- * Associate @bio with the blkg from @page's owning memcg and the respective
- * request_queue. If cgroup_e_css returns %NULL, fall back to the queue's
- * root_blkg.
- */
-void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
-{
- struct cgroup_subsys_state *css;
-
- if (!page->mem_cgroup)
- return;
-
- rcu_read_lock();
-
- css = cgroup_e_css(page->mem_cgroup->css.cgroup, &io_cgrp_subsys);
- bio_associate_blkg_from_css(bio, css);
-
- rcu_read_unlock();
-}
-#endif /* CONFIG_MEMCG */
-
/**
* bio_associate_blkg - associate a bio with a blkg
* @bio: target bio
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 4cd229e175c058..c6d76538292644 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -489,13 +489,6 @@ do { \
#define bio_dev(bio) \
disk_devt((bio)->bi_disk)

-#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
-void bio_associate_blkg_from_page(struct bio *bio, struct page *page);
-#else
-static inline void bio_associate_blkg_from_page(struct bio *bio,
- struct page *page) { }
-#endif
-
#ifdef CONFIG_BLK_CGROUP
void bio_associate_blkg(struct bio *bio);
void bio_associate_blkg_from_css(struct bio *bio,
diff --git a/mm/page_io.c b/mm/page_io.c
index e8726f3e3820bf..ccda7679008851 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -277,6 +277,23 @@ static inline void count_swpout_vm_event(struct page *page)
count_vm_events(PSWPOUT, hpage_nr_pages(page));
}

+#if defined(CONFIG_MEMCG) && defined(CONFIG_BLK_CGROUP)
+static void bio_associate_blkg_from_page(struct bio *bio, struct page *page)
+{
+ struct cgroup_subsys_state *css;
+
+ if (!page->mem_cgroup)
+ return;
+
+ rcu_read_lock();
+ css = cgroup_e_css(page->mem_cgroup->css.cgroup, &io_cgrp_subsys);
+ bio_associate_blkg_from_css(bio, css);
+ rcu_read_unlock();
+}
+#else
+#define bio_associate_blkg_from_page(bio, page) do { } while (0)
+#endif /* CONFIG_MEMCG && CONFIG_BLK_CGROUP */
+
int __swap_writepage(struct page *page, struct writeback_control *wbc,
bio_end_io_t end_write_func)
{
--
2.26.2

2020-06-27 07:33:34

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 04/14] block: merge __bio_associate_blkg into bio_associate_blkg_from_css

Merge __bio_associate_blkg into the only caller, which allows to slightly
reduce the RCU crticial section and better explain the code flow.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/bio.c | 45 +++++++++++++--------------------------------
1 file changed, 13 insertions(+), 32 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index e1d01acce8070c..bc8de2432e3645 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1628,52 +1628,33 @@ int bioset_init_from_src(struct bio_set *bs, struct bio_set *src)
EXPORT_SYMBOL(bioset_init_from_src);

#ifdef CONFIG_BLK_CGROUP
-
-/**
- * __bio_associate_blkg - associate a bio with the a blkg
- * @bio: target bio
- * @blkg: the blkg to associate
- *
- * This tries to associate @bio with the specified @blkg. Association failure
- * is handled by walking up the blkg tree. Therefore, the blkg associated can
- * be anything between @blkg and the root_blkg. This situation only happens
- * when a cgroup is dying and then the remaining bios will spill to the closest
- * alive blkg.
- *
- * A reference will be taken on the @blkg and will be released when @bio is
- * freed.
- */
-static void __bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg)
-{
- if (bio->bi_blkg)
- blkg_put(bio->bi_blkg);
- bio->bi_blkg = blkg_tryget_closest(blkg);
-}
-
/**
* bio_associate_blkg_from_css - associate a bio with a specified css
* @bio: target bio
* @css: target css
*
* Associate @bio with the blkg found by combining the css's blkg and the
- * request_queue of the @bio. This falls back to the queue's root_blkg if
- * the association fails with the css.
+ * request_queue of the @bio. An association failure is handled by walking up
+ * the blkg tree. Therefore, the blkg associated can be anything between @blkg
+ * and q->root_blkg. This situation only happens when a cgroup is dying and
+ * then the remaining bios will spill to the closest alive blkg.
+ *
+ * A reference will be taken on the blkg and will be released when @bio is
+ * freed.
*/
void bio_associate_blkg_from_css(struct bio *bio,
struct cgroup_subsys_state *css)
{
struct request_queue *q = bio->bi_disk->queue;
- struct blkcg_gq *blkg;
+ struct blkcg_gq *blkg = q->root_blkg;

- rcu_read_lock();
+ if (bio->bi_blkg)
+ blkg_put(bio->bi_blkg);

- if (!css || !css->parent)
- blkg = q->root_blkg;
- else
+ rcu_read_lock();
+ if (css && css->parent)
blkg = blkg_lookup_create(css_to_blkcg(css), q);
-
- __bio_associate_blkg(bio, blkg);
-
+ bio->bi_blkg = blkg_tryget_closest(blkg);
rcu_read_unlock();
}
EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);
--
2.26.2

2020-06-27 07:33:40

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 07/14] block: merge blkg_lookup_create and __blkg_lookup_create

No good reason to keep these two functions split.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-cgroup.c | 49 +++++++++++++++++-----------------------------
1 file changed, 18 insertions(+), 31 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bb0607bfd771cd..6aedb73ffbf473 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -316,30 +316,35 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
}

/**
- * __blkg_lookup_create - lookup blkg, try to create one if not there
+ * blkg_lookup_create - lookup blkg, try to create one if not there
* @blkcg: blkcg of interest
* @q: request_queue of interest
*
* Lookup blkg for the @blkcg - @q pair. If it doesn't exist, try to
* create one. blkg creation is performed recursively from blkcg_root such
* that all non-root blkg's have access to the parent blkg. This function
- * should be called under RCU read lock and @q->queue_lock.
+ * should be called under RCU read lock and takes @q->queue_lock.
*
* Returns the blkg or the closest blkg if blkg_create() fails as it walks
* down from root.
*/
-static struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q)
+static struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
+ struct request_queue *q)
{
struct blkcg_gq *blkg;
+ unsigned long flags;

WARN_ON_ONCE(!rcu_read_lock_held());
- lockdep_assert_held(&q->queue_lock);

- blkg = __blkg_lookup(blkcg, q, true);
+ blkg = blkg_lookup(blkcg, q);
if (blkg)
return blkg;

+ spin_lock_irqsave(&q->queue_lock, flags);
+ blkg = __blkg_lookup(blkcg, q, true);
+ if (blkg)
+ goto found;
+
/*
* Create blkgs walking down from blkcg_root to @blkcg, so that all
* non-root blkgs have access to their parents. Returns the closest
@@ -362,34 +367,16 @@ static struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
}

blkg = blkg_create(pos, q, NULL);
- if (IS_ERR(blkg))
- return ret_blkg;
+ if (IS_ERR(blkg)) {
+ blkg = ret_blkg;
+ break;
+ }
if (pos == blkcg)
- return blkg;
- }
-}
-
-/**
- * blkg_lookup_create - find or create a blkg
- * @blkcg: target block cgroup
- * @q: target request_queue
- *
- * This looks up or creates the blkg representing the unique pair
- * of the blkcg and the request_queue.
- */
-static struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q)
-{
- struct blkcg_gq *blkg = blkg_lookup(blkcg, q);
-
- if (unlikely(!blkg)) {
- unsigned long flags;
-
- spin_lock_irqsave(&q->queue_lock, flags);
- blkg = __blkg_lookup_create(blkcg, q);
- spin_unlock_irqrestore(&q->queue_lock, flags);
+ break;
}

+found:
+ spin_unlock_irqrestore(&q->queue_lock, flags);
return blkg;
}

--
2.26.2

2020-06-27 07:33:51

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 08/14] block: bypass blkg_tryget_closest for the root_blkg

The root_blkg is only torn down at the very end of removing a queue.
So in the I/O submission path is always has a life reference and we
can just grab another one using blkg_get instead of doing a tryget
and parent walk that won't lead anywhere.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-cgroup.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 6aedb73ffbf473..0912820d4f8194 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1758,16 +1758,21 @@ void bio_associate_blkg_from_css(struct bio *bio,
struct cgroup_subsys_state *css)
{
struct request_queue *q = bio->bi_disk->queue;
- struct blkcg_gq *blkg = q->root_blkg;

if (bio->bi_blkg)
blkg_put(bio->bi_blkg);

- rcu_read_lock();
- if (css && css->parent)
+ if (css && css->parent) {
+ struct blkcg_gq *blkg;
+
+ rcu_read_lock();
blkg = blkg_lookup_create(css_to_blkcg(css), q);
- bio->bi_blkg = blkg_tryget_closest(blkg);
- rcu_read_unlock();
+ bio->bi_blkg = blkg_tryget_closest(blkg);
+ rcu_read_unlock();
+ } else {
+ blkg_get(q->root_blkg);
+ bio->bi_blkg = q->root_blkg;
+ }
}
EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);

--
2.26.2

2020-06-27 07:33:56

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 09/14] block: move the initial blkg lookup into blkg_tryget_closest

By moving the initial blkg lookup into blkg_tryget_closest we get
a nicely self contained routines that does all the RCU locking.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-cgroup.c | 33 ++++++++++++++-------------------
1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0912820d4f8194..d21ec2acd716e7 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1716,19 +1716,20 @@ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta)

/**
* blkg_tryget_closest - try and get a blkg ref on the closet blkg
- * @blkg: blkg to get
+ * @bio: target bio
+ * @css: target css
*
- * This needs to be called rcu protected. As the failure mode here is to walk
- * up the blkg tree, this ensure that the blkg->parent pointers are always
- * valid. This returns the blkg that it ended up taking a reference on or %NULL
- * if no reference was taken.
+ * As the failure mode here is to walk up the blkg tree, this ensure that the
+ * blkg->parent pointers are always valid. This returns the blkg that it ended
+ * up taking a reference on or %NULL if no reference was taken.
*/
-static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
+static inline struct blkcg_gq *blkg_tryget_closest(struct bio *bio,
+ struct cgroup_subsys_state *css)
{
- struct blkcg_gq *ret_blkg = NULL;
-
- WARN_ON_ONCE(!rcu_read_lock_held());
+ struct blkcg_gq *blkg, *ret_blkg = NULL;

+ rcu_read_lock();
+ blkg = blkg_lookup_create(css_to_blkcg(css), bio->bi_disk->queue);
while (blkg) {
if (blkg_tryget(blkg)) {
ret_blkg = blkg;
@@ -1736,6 +1737,7 @@ static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
}
blkg = blkg->parent;
}
+ rcu_read_unlock();

return ret_blkg;
}
@@ -1757,21 +1759,14 @@ static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
void bio_associate_blkg_from_css(struct bio *bio,
struct cgroup_subsys_state *css)
{
- struct request_queue *q = bio->bi_disk->queue;
-
if (bio->bi_blkg)
blkg_put(bio->bi_blkg);

if (css && css->parent) {
- struct blkcg_gq *blkg;
-
- rcu_read_lock();
- blkg = blkg_lookup_create(css_to_blkcg(css), q);
- bio->bi_blkg = blkg_tryget_closest(blkg);
- rcu_read_unlock();
+ bio->bi_blkg = blkg_tryget_closest(bio, css);
} else {
- blkg_get(q->root_blkg);
- bio->bi_blkg = q->root_blkg;
+ blkg_get(bio->bi_disk->queue->root_blkg);
+ bio->bi_blkg = bio->bi_disk->queue->root_blkg;
}
}
EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);
--
2.26.2

2020-06-27 07:33:59

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 11/14] cgroup: unexport cgroup_rstat_updated

cgroup_rstat_updated is only used by core block code, no need to
export it.

Signed-off-by: Christoph Hellwig <[email protected]>
---
kernel/cgroup/rstat.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index b6397a186ce9c8..d51175cedfca4f 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -64,7 +64,6 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)

raw_spin_unlock_irqrestore(cpu_lock, flags);
}
-EXPORT_SYMBOL_GPL(cgroup_rstat_updated);

/**
* cgroup_rstat_cpu_pop_updated - iterate and dismantle rstat_cpu updated tree
--
2.26.2

2020-06-27 07:34:09

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 10/14] blk-cgroup: remove the !bio->bi_blkg check in blkcg_bio_issue_check

This is purely a sanity check for grave programming errors. Remove it
to simplify further work in this area.

Signed-off-by: Christoph Hellwig <[email protected]>
---
include/linux/blk-cgroup.h | 15 +--------------
1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 60df97202314c7..8e86b598316c10 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -543,24 +543,11 @@ static inline void blkcg_bio_issue_init(struct bio *bio)
static inline bool blkcg_bio_issue_check(struct request_queue *q,
struct bio *bio)
{
- struct blkcg_gq *blkg;
+ struct blkcg_gq *blkg = bio->bi_blkg;
bool throtl = false;

rcu_read_lock();
-
- if (!bio->bi_blkg) {
- char b[BDEVNAME_SIZE];
-
- WARN_ONCE(1,
- "no blkg associated for bio on block-device: %s\n",
- bio_devname(bio, b));
- bio_associate_blkg(bio);
- }
-
- blkg = bio->bi_blkg;
-
throtl = blk_throtl_bio(q, blkg, bio);
-
if (!throtl) {
struct blkg_iostat_set *bis;
int rwd, cpu;
--
2.26.2

2020-06-27 07:34:39

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 14/14] blk-cgroup: remove a dead check in blk_throtl_bio

bios must have a valid block group by the time they are submitted.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-throttle.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 9d00f62c05ecdf..ad37043297ed58 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2163,7 +2163,7 @@ bool blk_throtl_bio(struct bio *bio)
struct request_queue *q = bio->bi_disk->queue;
struct blkcg_gq *blkg = bio->bi_blkg;
struct throtl_qnode *qn = NULL;
- struct throtl_grp *tg = blkg_to_tg(blkg ?: q->root_blkg);
+ struct throtl_grp *tg = blkg_to_tg(blkg);
struct throtl_service_queue *sq;
bool rw = bio_data_dir(bio);
bool throttled = false;
--
2.26.2

2020-06-27 07:35:04

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 06/14] block: move the bio cgroup associatation helpers to blk-cgroup.c

Keep the cgroup code together.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/bio.c | 75 ---------------------------
block/blk-cgroup.c | 103 ++++++++++++++++++++++++++++++++++++-
include/linux/blk-cgroup.h | 30 -----------
3 files changed, 101 insertions(+), 107 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 901d22715dd4f3..fc1299f9d86a24 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1627,81 +1627,6 @@ int bioset_init_from_src(struct bio_set *bs, struct bio_set *src)
}
EXPORT_SYMBOL(bioset_init_from_src);

-#ifdef CONFIG_BLK_CGROUP
-/**
- * bio_associate_blkg_from_css - associate a bio with a specified css
- * @bio: target bio
- * @css: target css
- *
- * Associate @bio with the blkg found by combining the css's blkg and the
- * request_queue of the @bio. An association failure is handled by walking up
- * the blkg tree. Therefore, the blkg associated can be anything between @blkg
- * and q->root_blkg. This situation only happens when a cgroup is dying and
- * then the remaining bios will spill to the closest alive blkg.
- *
- * A reference will be taken on the blkg and will be released when @bio is
- * freed.
- */
-void bio_associate_blkg_from_css(struct bio *bio,
- struct cgroup_subsys_state *css)
-{
- struct request_queue *q = bio->bi_disk->queue;
- struct blkcg_gq *blkg = q->root_blkg;
-
- if (bio->bi_blkg)
- blkg_put(bio->bi_blkg);
-
- rcu_read_lock();
- if (css && css->parent)
- blkg = blkg_lookup_create(css_to_blkcg(css), q);
- bio->bi_blkg = blkg_tryget_closest(blkg);
- rcu_read_unlock();
-}
-EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);
-
-/**
- * bio_associate_blkg - associate a bio with a blkg
- * @bio: target bio
- *
- * Associate @bio with the blkg found from the bio's css and request_queue.
- * If one is not found, bio_lookup_blkg() creates the blkg. If a blkg is
- * already associated, the css is reused and association redone as the
- * request_queue may have changed.
- */
-void bio_associate_blkg(struct bio *bio)
-{
- struct cgroup_subsys_state *css;
-
- rcu_read_lock();
-
- if (bio->bi_blkg)
- css = &bio_blkcg(bio)->css;
- else
- css = blkcg_css();
-
- bio_associate_blkg_from_css(bio, css);
-
- rcu_read_unlock();
-}
-EXPORT_SYMBOL_GPL(bio_associate_blkg);
-
-/**
- * bio_clone_blkg_association - clone blkg association from src to dst bio
- * @dst: destination bio
- * @src: source bio
- */
-void bio_clone_blkg_association(struct bio *dst, struct bio *src)
-{
- if (src->bi_blkg) {
- if (dst->bi_blkg)
- blkg_put(dst->bi_blkg);
- blkg_get(src->bi_blkg);
- dst->bi_blkg = src->bi_blkg;
- }
-}
-EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
-#endif /* CONFIG_BLK_CGROUP */
-
static void __init biovec_init_slabs(void)
{
int i;
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 0ecc897b225c96..bb0607bfd771cd 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -328,7 +328,7 @@ static struct blkcg_gq *blkg_create(struct blkcg *blkcg,
* Returns the blkg or the closest blkg if blkg_create() fails as it walks
* down from root.
*/
-struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
+static struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
struct request_queue *q)
{
struct blkcg_gq *blkg;
@@ -377,7 +377,7 @@ struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
* This looks up or creates the blkg representing the unique pair
* of the blkcg and the request_queue.
*/
-struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
+static struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
struct request_queue *q)
{
struct blkcg_gq *blkg = blkg_lookup(blkcg, q);
@@ -1727,6 +1727,105 @@ void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta)
atomic64_add(delta, &blkg->delay_nsec);
}

+/**
+ * blkg_tryget_closest - try and get a blkg ref on the closet blkg
+ * @blkg: blkg to get
+ *
+ * This needs to be called rcu protected. As the failure mode here is to walk
+ * up the blkg tree, this ensure that the blkg->parent pointers are always
+ * valid. This returns the blkg that it ended up taking a reference on or %NULL
+ * if no reference was taken.
+ */
+static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
+{
+ struct blkcg_gq *ret_blkg = NULL;
+
+ WARN_ON_ONCE(!rcu_read_lock_held());
+
+ while (blkg) {
+ if (blkg_tryget(blkg)) {
+ ret_blkg = blkg;
+ break;
+ }
+ blkg = blkg->parent;
+ }
+
+ return ret_blkg;
+}
+
+/**
+ * bio_associate_blkg_from_css - associate a bio with a specified css
+ * @bio: target bio
+ * @css: target css
+ *
+ * Associate @bio with the blkg found by combining the css's blkg and the
+ * request_queue of the @bio. An association failure is handled by walking up
+ * the blkg tree. Therefore, the blkg associated can be anything between @blkg
+ * and q->root_blkg. This situation only happens when a cgroup is dying and
+ * then the remaining bios will spill to the closest alive blkg.
+ *
+ * A reference will be taken on the blkg and will be released when @bio is
+ * freed.
+ */
+void bio_associate_blkg_from_css(struct bio *bio,
+ struct cgroup_subsys_state *css)
+{
+ struct request_queue *q = bio->bi_disk->queue;
+ struct blkcg_gq *blkg = q->root_blkg;
+
+ if (bio->bi_blkg)
+ blkg_put(bio->bi_blkg);
+
+ rcu_read_lock();
+ if (css && css->parent)
+ blkg = blkg_lookup_create(css_to_blkcg(css), q);
+ bio->bi_blkg = blkg_tryget_closest(blkg);
+ rcu_read_unlock();
+}
+EXPORT_SYMBOL_GPL(bio_associate_blkg_from_css);
+
+/**
+ * bio_associate_blkg - associate a bio with a blkg
+ * @bio: target bio
+ *
+ * Associate @bio with the blkg found from the bio's css and request_queue.
+ * If one is not found, bio_lookup_blkg() creates the blkg. If a blkg is
+ * already associated, the css is reused and association redone as the
+ * request_queue may have changed.
+ */
+void bio_associate_blkg(struct bio *bio)
+{
+ struct cgroup_subsys_state *css;
+
+ rcu_read_lock();
+
+ if (bio->bi_blkg)
+ css = &bio_blkcg(bio)->css;
+ else
+ css = blkcg_css();
+
+ bio_associate_blkg_from_css(bio, css);
+
+ rcu_read_unlock();
+}
+EXPORT_SYMBOL_GPL(bio_associate_blkg);
+
+/**
+ * bio_clone_blkg_association - clone blkg association from src to dst bio
+ * @dst: destination bio
+ * @src: source bio
+ */
+void bio_clone_blkg_association(struct bio *dst, struct bio *src)
+{
+ if (src->bi_blkg) {
+ if (dst->bi_blkg)
+ blkg_put(dst->bi_blkg);
+ blkg_get(src->bi_blkg);
+ dst->bi_blkg = src->bi_blkg;
+ }
+}
+EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
+
static int __init blkcg_init(void)
{
blkcg_punt_bio_wq = alloc_workqueue("blkcg_punt_bio",
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index a57ebe2f00abd8..60df97202314c7 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -183,10 +183,6 @@ extern bool blkcg_debug_stats;

struct blkcg_gq *blkg_lookup_slowpath(struct blkcg *blkcg,
struct request_queue *q, bool update_hint);
-struct blkcg_gq *__blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q);
-struct blkcg_gq *blkg_lookup_create(struct blkcg *blkcg,
- struct request_queue *q);
int blkcg_init_queue(struct request_queue *q);
void blkcg_exit_queue(struct request_queue *q);

@@ -480,32 +476,6 @@ static inline bool blkg_tryget(struct blkcg_gq *blkg)
return blkg && percpu_ref_tryget(&blkg->refcnt);
}

-/**
- * blkg_tryget_closest - try and get a blkg ref on the closet blkg
- * @blkg: blkg to get
- *
- * This needs to be called rcu protected. As the failure mode here is to walk
- * up the blkg tree, this ensure that the blkg->parent pointers are always
- * valid. This returns the blkg that it ended up taking a reference on or %NULL
- * if no reference was taken.
- */
-static inline struct blkcg_gq *blkg_tryget_closest(struct blkcg_gq *blkg)
-{
- struct blkcg_gq *ret_blkg = NULL;
-
- WARN_ON_ONCE(!rcu_read_lock_held());
-
- while (blkg) {
- if (blkg_tryget(blkg)) {
- ret_blkg = blkg;
- break;
- }
- blkg = blkg->parent;
- }
-
- return ret_blkg;
-}
-
/**
* blkg_put - put a blkg reference
* @blkg: blkg to put
--
2.26.2

2020-06-27 07:36:17

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 02/14] block: remove bio_disassociate_blkg

bio_disassociate_blkg has two callers, of which one immediately assigns
a new value to >bi_blkg. Just open code the function in the two callers.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/bio.c | 27 ++++++++-------------------
include/linux/bio.h | 2 --
2 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fb5533416fa670..8aef4460b32e0e 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -234,8 +234,12 @@ struct bio_vec *bvec_alloc(gfp_t gfp_mask, int nr, unsigned long *idx,

void bio_uninit(struct bio *bio)
{
- bio_disassociate_blkg(bio);
-
+#ifdef CONFIG_BLK_CGROUP
+ if (bio->bi_blkg) {
+ blkg_put(bio->bi_blkg);
+ bio->bi_blkg = NULL;
+ }
+#endif
if (bio_integrity(bio))
bio_integrity_free(bio);

@@ -1625,21 +1629,6 @@ EXPORT_SYMBOL(bioset_init_from_src);

#ifdef CONFIG_BLK_CGROUP

-/**
- * bio_disassociate_blkg - puts back the blkg reference if associated
- * @bio: target bio
- *
- * Helper to disassociate the blkg from @bio if a blkg is associated.
- */
-void bio_disassociate_blkg(struct bio *bio)
-{
- if (bio->bi_blkg) {
- blkg_put(bio->bi_blkg);
- bio->bi_blkg = NULL;
- }
-}
-EXPORT_SYMBOL_GPL(bio_disassociate_blkg);
-
/**
* __bio_associate_blkg - associate a bio with the a blkg
* @bio: target bio
@@ -1656,8 +1645,8 @@ EXPORT_SYMBOL_GPL(bio_disassociate_blkg);
*/
static void __bio_associate_blkg(struct bio *bio, struct blkcg_gq *blkg)
{
- bio_disassociate_blkg(bio);
-
+ if (bio->bi_blkg)
+ blkg_put(bio->bi_blkg);
bio->bi_blkg = blkg_tryget_closest(blkg);
}

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 0282f8aa85935c..4cd229e175c058 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -497,13 +497,11 @@ static inline void bio_associate_blkg_from_page(struct bio *bio,
#endif

#ifdef CONFIG_BLK_CGROUP
-void bio_disassociate_blkg(struct bio *bio);
void bio_associate_blkg(struct bio *bio);
void bio_associate_blkg_from_css(struct bio *bio,
struct cgroup_subsys_state *css);
void bio_clone_blkg_association(struct bio *dst, struct bio *src);
#else /* CONFIG_BLK_CGROUP */
-static inline void bio_disassociate_blkg(struct bio *bio) { }
static inline void bio_associate_blkg(struct bio *bio) { }
static inline void bio_associate_blkg_from_css(struct bio *bio,
struct cgroup_subsys_state *css)
--
2.26.2

2020-06-27 07:36:28

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 12/14] blk-cgroup: move rcu locking from blkcg_bio_issue_check to blk_throtl_bio

The only thing in blkcg_bio_issue_check that needs to be under
rcu_read_lock is blk_throtl_bio, so move the locking there.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-throttle.c | 3 ++-
include/linux/blk-cgroup.h | 2 --
2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 209fdd8939fba6..ac008345050010 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2168,7 +2168,7 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
bool throttled = false;
struct throtl_data *td = tg->td;

- WARN_ON_ONCE(!rcu_read_lock_held());
+ rcu_read_lock();

/* see throtl_charge_bio() */
if (bio_flagged(bio, BIO_THROTTLED))
@@ -2273,6 +2273,7 @@ bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
if (throttled || !td->track_bio_latency)
bio->bi_issue.value |= BIO_ISSUE_THROTL_SKIP_LATENCY;
#endif
+ rcu_read_unlock();
return throttled;
}

diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 8e86b598316c10..8ab043c911f233 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -546,7 +546,6 @@ static inline bool blkcg_bio_issue_check(struct request_queue *q,
struct blkcg_gq *blkg = bio->bi_blkg;
bool throtl = false;

- rcu_read_lock();
throtl = blk_throtl_bio(q, blkg, bio);
if (!throtl) {
struct blkg_iostat_set *bis;
@@ -582,7 +581,6 @@ static inline bool blkcg_bio_issue_check(struct request_queue *q,

blkcg_bio_issue_init(bio);

- rcu_read_unlock();
return !throtl;
}

--
2.26.2

2020-06-27 07:36:34

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH 13/14] blk-cgroup: remove blkcg_bio_issue_check

blkcg_bio_issue_check is a giant inline function that does three entirely
different things. Factor out the blk-cgroup related bio initalization
into a new helper, and the open code the sequence in the only caller,
relying on the fact that all the actual functionality is stubbed out for
non-cgroup builds.

Signed-off-by: Christoph Hellwig <[email protected]>
---
block/blk-cgroup.c | 34 +++++++++++++++++++++++
block/blk-core.c | 7 ++++-
block/blk-throttle.c | 5 ++--
block/blk.h | 2 ++
include/linux/blk-cgroup.h | 56 ++------------------------------------
5 files changed, 47 insertions(+), 57 deletions(-)

diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index d21ec2acd716e7..1ce94afc03bcfd 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1813,6 +1813,40 @@ void bio_clone_blkg_association(struct bio *dst, struct bio *src)
}
EXPORT_SYMBOL_GPL(bio_clone_blkg_association);

+static int blk_cgroup_io_type(struct bio *bio)
+{
+ if (op_is_discard(bio->bi_opf))
+ return BLKG_IOSTAT_DISCARD;
+ if (op_is_write(bio->bi_opf))
+ return BLKG_IOSTAT_WRITE;
+ return BLKG_IOSTAT_READ;
+}
+
+void blk_cgroup_bio_start(struct bio *bio)
+{
+ int rwd = blk_cgroup_io_type(bio), cpu;
+ struct blkg_iostat_set *bis;
+
+ cpu = get_cpu();
+ bis = per_cpu_ptr(bio->bi_blkg->iostat_cpu, cpu);
+ u64_stats_update_begin(&bis->sync);
+
+ /*
+ * If the bio is flagged with BIO_CGROUP_ACCT it means this is a split
+ * bio and we would have already accounted for the size of the bio.
+ */
+ if (!bio_flagged(bio, BIO_CGROUP_ACCT)) {
+ bio_set_flag(bio, BIO_CGROUP_ACCT);
+ bis->cur.bytes[rwd] += bio->bi_iter.bi_size;
+ }
+ bis->cur.ios[rwd]++;
+
+ u64_stats_update_end(&bis->sync);
+ if (cgroup_subsys_on_dfl(io_cgrp_subsys))
+ cgroup_rstat_updated(bio->bi_blkg->blkcg->css.cgroup, cpu);
+ put_cpu();
+}
+
static int __init blkcg_init(void)
{
blkcg_punt_bio_wq = alloc_workqueue("blkcg_punt_bio",
diff --git a/block/blk-core.c b/block/blk-core.c
index a9769c1a287546..76cfd5709f66cd 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1073,8 +1073,13 @@ generic_make_request_checks(struct bio *bio)
if (unlikely(!current->io_context))
create_task_io_context(current, GFP_ATOMIC, q->node);

- if (!blkcg_bio_issue_check(q, bio))
+ if (blk_throtl_bio(bio)) {
+ blkcg_bio_issue_init(bio);
return false;
+ }
+
+ blk_cgroup_bio_start(bio);
+ blkcg_bio_issue_init(bio);

if (!bio_flagged(bio, BIO_TRACE_COMPLETION)) {
trace_block_bio_queue(q, bio);
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index ac008345050010..9d00f62c05ecdf 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -2158,9 +2158,10 @@ static inline void throtl_update_latency_buckets(struct throtl_data *td)
}
#endif

-bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
- struct bio *bio)
+bool blk_throtl_bio(struct bio *bio)
{
+ struct request_queue *q = bio->bi_disk->queue;
+ struct blkcg_gq *blkg = bio->bi_blkg;
struct throtl_qnode *qn = NULL;
struct throtl_grp *tg = blkg_to_tg(blkg ?: q->root_blkg);
struct throtl_service_queue *sq;
diff --git a/block/blk.h b/block/blk.h
index 3a120a070dac82..41a50880c94e98 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -288,10 +288,12 @@ int create_task_io_context(struct task_struct *task, gfp_t gfp_mask, int node);
extern int blk_throtl_init(struct request_queue *q);
extern void blk_throtl_exit(struct request_queue *q);
extern void blk_throtl_register_queue(struct request_queue *q);
+bool blk_throtl_bio(struct bio *bio);
#else /* CONFIG_BLK_DEV_THROTTLING */
static inline int blk_throtl_init(struct request_queue *q) { return 0; }
static inline void blk_throtl_exit(struct request_queue *q) { }
static inline void blk_throtl_register_queue(struct request_queue *q) { }
+static inline bool blk_throtl_bio(struct bio *bio) { return false; }
#endif /* CONFIG_BLK_DEV_THROTTLING */
#ifdef CONFIG_BLK_DEV_THROTTLING_LOW
extern ssize_t blk_throtl_sample_time_show(struct request_queue *q, char *page);
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 8ab043c911f233..431b2d18bf4074 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -517,14 +517,6 @@ static inline void blkg_put(struct blkcg_gq *blkg)
if (((d_blkg) = __blkg_lookup(css_to_blkcg(pos_css), \
(p_blkg)->q, false)))

-#ifdef CONFIG_BLK_DEV_THROTTLING
-extern bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
- struct bio *bio);
-#else
-static inline bool blk_throtl_bio(struct request_queue *q, struct blkcg_gq *blkg,
- struct bio *bio) { return false; }
-#endif
-
bool __blkcg_punt_bio_submit(struct bio *bio);

static inline bool blkcg_punt_bio_submit(struct bio *bio)
@@ -540,50 +532,6 @@ static inline void blkcg_bio_issue_init(struct bio *bio)
bio_issue_init(&bio->bi_issue, bio_sectors(bio));
}

-static inline bool blkcg_bio_issue_check(struct request_queue *q,
- struct bio *bio)
-{
- struct blkcg_gq *blkg = bio->bi_blkg;
- bool throtl = false;
-
- throtl = blk_throtl_bio(q, blkg, bio);
- if (!throtl) {
- struct blkg_iostat_set *bis;
- int rwd, cpu;
-
- if (op_is_discard(bio->bi_opf))
- rwd = BLKG_IOSTAT_DISCARD;
- else if (op_is_write(bio->bi_opf))
- rwd = BLKG_IOSTAT_WRITE;
- else
- rwd = BLKG_IOSTAT_READ;
-
- cpu = get_cpu();
- bis = per_cpu_ptr(blkg->iostat_cpu, cpu);
- u64_stats_update_begin(&bis->sync);
-
- /*
- * If the bio is flagged with BIO_CGROUP_ACCT it means this is a
- * split bio and we would have already accounted for the size of
- * the bio.
- */
- if (!bio_flagged(bio, BIO_CGROUP_ACCT)) {
- bio_set_flag(bio, BIO_CGROUP_ACCT);
- bis->cur.bytes[rwd] += bio->bi_iter.bi_size;
- }
- bis->cur.ios[rwd]++;
-
- u64_stats_update_end(&bis->sync);
- if (cgroup_subsys_on_dfl(io_cgrp_subsys))
- cgroup_rstat_updated(blkg->blkcg->css.cgroup, cpu);
- put_cpu();
- }
-
- blkcg_bio_issue_init(bio);
-
- return !throtl;
-}
-
static inline void blkcg_use_delay(struct blkcg_gq *blkg)
{
if (WARN_ON_ONCE(atomic_read(&blkg->use_delay) < 0))
@@ -657,6 +605,7 @@ static inline void blkcg_clear_delay(struct blkcg_gq *blkg)
atomic_dec(&blkg->blkcg->css.cgroup->congestion_count);
}

+void blk_cgroup_bio_start(struct bio *bio);
void blkcg_add_delay(struct blkcg_gq *blkg, u64 now, u64 delta);
void blkcg_schedule_throttle(struct request_queue *q, bool use_memdelay);
void blkcg_maybe_throttle_current(void);
@@ -710,8 +659,7 @@ static inline void blkg_put(struct blkcg_gq *blkg) { }

static inline bool blkcg_punt_bio_submit(struct bio *bio) { return false; }
static inline void blkcg_bio_issue_init(struct bio *bio) { }
-static inline bool blkcg_bio_issue_check(struct request_queue *q,
- struct bio *bio) { return true; }
+static inline void blk_cgroup_bio_start(struct bio *bio) { }

#define blk_queue_for_each_rl(rl, q) \
for ((rl) = &(q)->root_rl; (rl); (rl) = NULL)
--
2.26.2

2020-06-29 18:41:03

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: drive-by blk-cgroup cleanups

On 29/06/2020 10:08, Christoph Hellwig wrote:
> You'll have to ask Jens :) Note that your patch 2 overlaps with this
> series.

Should have removed you from the To: field, sorry.

> I thik my version is a little nicer, given that
> blkcg_bio_issue_check is a very strange function doing multiple things
> as-is.
>
>

Agreed, I'll go through your series later today. Patches 1 and 3 are still
applicable though I think.

2020-06-29 18:41:53

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: drive-by blk-cgroup cleanups

On 27/06/2020 09:33, Christoph Hellwig wrote:
> Hi all,
>
> while looking into another "project" I ended up wading through the
> blkcq code for research and found a bunch of lose ends. So here is
> a bunch of drive-by cleanups for the code.
>
> Diffstat:
> block/bio.c | 143 +----------------------------------
> block/blk-cgroup.c | 182 +++++++++++++++++++++++++++++++++++++--------
> block/blk-core.c | 7 +
> block/blk-throttle.c | 10 +-
> block/blk.h | 2
> drivers/md/dm.c | 5 -
> include/linux/bio.h | 9 --
> include/linux/blk-cgroup.h | 101 ------------------------
> kernel/cgroup/rstat.c | 1
> mm/page_io.c | 17 ++++
> 10 files changed, 192 insertions(+), 285 deletions(-)
>

Btw what ever happened to https://lore.kernel.org/r/[email protected]?

2020-06-29 18:43:51

by Tejun Heo

[permalink] [raw]
Subject: Re: drive-by blk-cgroup cleanups

On Sat, Jun 27, 2020 at 09:31:45AM +0200, Christoph Hellwig wrote:
> Hi all,
>
> while looking into another "project" I ended up wading through the
> blkcq code for research and found a bunch of lose ends. So here is
> a bunch of drive-by cleanups for the code.

The whole series looks great to me.

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun

2020-06-29 18:49:06

by Christoph Hellwig

[permalink] [raw]
Subject: Re: drive-by blk-cgroup cleanups

On Mon, Jun 29, 2020 at 08:05:07AM +0000, Johannes Thumshirn wrote:
> Btw what ever happened to https://lore.kernel.org/r/[email protected]?

You'll have to ask Jens :) Note that your patch 2 overlaps with this
series. I thik my version is a little nicer, given that
blkcg_bio_issue_check is a very strange function doing multiple things
as-is.

2020-06-29 18:49:37

by Jens Axboe

[permalink] [raw]
Subject: Re: drive-by blk-cgroup cleanups

On 6/27/20 1:31 AM, Christoph Hellwig wrote:
> Hi all,
>
> while looking into another "project" I ended up wading through the
> blkcq code for research and found a bunch of lose ends. So here is
> a bunch of drive-by cleanups for the code.

Applied, thanks.

--
Jens Axboe

2020-06-29 18:55:46

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH 02/14] block: remove bio_disassociate_blkg

On 27/06/2020 09:33, Christoph Hellwig wrote:
> a new value to >bi_blkg. Just open code the function in the two callers.
Nit: ->bi_blkg

Otherwise,
Reviewed-by: Johannes Thumshirn <[email protected]>

2020-06-29 19:18:13

by Johannes Thumshirn

[permalink] [raw]
Subject: Re: [PATCH 05/14] block: move bio_associate_blkg_from_page to mm/page_io.c

On 27/06/2020 09:33, Christoph Hellwig wrote:
> bio_associate_blkg_from_page is a special purpose helper for swap bios

'for swapping' or 'to swap' I think.

Otherwise,
Reviewed-by: Johannes Thumshirn <[email protected]>