Hi Jens,
These are a few fixes for the block throttle code and for blk_cleanup_queue().
These should be good candidates for 2.6.39.
Please let me know if you have any concerns.
Thanks
Vivek
Vivek Goyal (3):
block: Initialize ->queue_lock to internal lock at queue allocation
time
loop: No need to initialize ->queue_lock explicitly before calling
blk_cleanup_queue()
block: Move blk_throtl_exit() call to blk_cleanup_queue()
block/blk-core.c | 23 +++++++++++++++++++++--
block/blk-settings.c | 7 -------
block/blk-sysfs.c | 2 --
block/blk-throttle.c | 6 +++---
drivers/block/loop.c | 3 ---
include/linux/blkdev.h | 2 --
6 files changed, 24 insertions(+), 19 deletions(-)
--
1.7.2.3
o Move blk_throtl_exit() into blk_cleanup_queue(), as blk_throtl_exit() is
written in such a way that it needs the queue lock. In blk_release_queue()
there is no guarantee that ->queue_lock is still around.
o Initially blk_throtl_exit() was in blk_cleanup_queue() but Ingo reported
one problem.
https://lkml.org/lkml/2010/10/23/86
And a quick fix moved blk_throtl_exit() to blk_release_queue().
commit 7ad58c028652753814054f4e3ac58f925e7343f4
Author: Jens Axboe <[email protected]>
Date: Sat Oct 23 20:40:26 2010 +0200
block: fix use-after-free bug in blk throttle code
o This patch reverts the above change and does not try to shut down the
throtl work in blk_sync_queue(). By avoiding the call to
throtl_shutdown_timer_wq() from blk_sync_queue(), we should also avoid
the problem reported by Ingo.
o blk_sync_queue() seems to be used only by md driver and it seems to be
using it to make sure q->unplug_fn is not called as md registers its
own unplug functions and it is about to free up the data structures
used by unplug_fn(). Block throttle does not call back into unplug_fn()
or into md. So there is no need to cancel blk throttle work.
In fact I think cancelling block throttle work is bad because it might
happen that some bios are throttled and scheduled to be dispatched later
with the help of pending work and if work is cancelled, these bios might
never be dispatched.
Block layer also uses blk_sync_queue() during blk_cleanup_queue() and
blk_release_queue() time. That should be safe as we are also calling
blk_throtl_exit() which should make sure all the throttling related
data structures are cleaned up.
Signed-off-by: Vivek Goyal <[email protected]>
---
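For illustration only (not part of the patch): a minimal userspace sketch
of the concern above, that cancelling throttle work from blk_sync_queue()
could leave throttled bios undispatched. Every name here (thr_queue,
thr_throttle_bio() and so on) is hypothetical and only models the idea; it
is not the kernel's throttling code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queue with throttled bios; not a kernel type. */
struct thr_queue {
	int  nr_throttled;	/* bios held back by the throttler */
	int  nr_dispatched;	/* bios that reached the driver */
	bool work_pending;	/* delayed dispatch work is queued */
};

/* Throttle one bio and arm the work that will dispatch it later. */
static void thr_throttle_bio(struct thr_queue *q)
{
	assert(q != NULL);
	q->nr_throttled++;
	q->work_pending = true;
}

/* The delayed work: dispatch everything that was throttled. */
static void thr_run_pending_work(struct thr_queue *q)
{
	if (!q->work_pending)
		return;
	q->nr_dispatched += q->nr_throttled;
	q->nr_throttled = 0;
	q->work_pending = false;
}

/* Models the old blk_sync_queue(): the throttle work is cancelled too. */
static void thr_sync_queue_old(struct thr_queue *q)
{
	q->work_pending = false;
}

/* Models the new blk_sync_queue(): the throttle work is left alone. */
static void thr_sync_queue_new(struct thr_queue *q)
{
	(void)q;
}

/* Returns how many bios are still stuck after a sync and a work run. */
int thr_stranded_after_sync(bool cancel_throttle_work)
{
	struct thr_queue q = {0};

	thr_throttle_bio(&q);
	thr_throttle_bio(&q);

	if (cancel_throttle_work)
		thr_sync_queue_old(&q);
	else
		thr_sync_queue_new(&q);

	thr_run_pending_work(&q);
	return q.nr_throttled;
}
```

In this model, thr_stranded_after_sync(true) leaves both bios queued,
while thr_stranded_after_sync(false) dispatches them.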
block/blk-core.c | 7 ++++++-
block/blk-sysfs.c | 2 --
block/blk-throttle.c | 6 +++---
include/linux/blkdev.h | 2 --
4 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index bc2b7c5..accff29 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -380,13 +380,16 @@ EXPORT_SYMBOL(blk_stop_queue);
* that its ->make_request_fn will not re-add plugging prior to calling
* this function.
*
+ * This function does not cancel any asynchronous activity arising
+ * out of elevator or throttling code. That would require elevator_exit()
+ * and blk_throtl_exit() to be called with queue lock initialized.
+ *
*/
void blk_sync_queue(struct request_queue *q)
{
del_timer_sync(&q->unplug_timer);
del_timer_sync(&q->timeout);
cancel_work_sync(&q->unplug_work);
- throtl_shutdown_timer_wq(q);
}
EXPORT_SYMBOL(blk_sync_queue);
@@ -469,6 +472,8 @@ void blk_cleanup_queue(struct request_queue *q)
if (q->elevator)
elevator_exit(q->elevator);
+ blk_throtl_exit(q);
+
blk_put_queue(q);
}
EXPORT_SYMBOL(blk_cleanup_queue);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 41fb691..261c75c 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -471,8 +471,6 @@ static void blk_release_queue(struct kobject *kobj)
blk_sync_queue(q);
- blk_throtl_exit(q);
-
if (rl->rq_pool)
mempool_destroy(rl->rq_pool);
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a89043a..c0f6237 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -965,7 +965,7 @@ static void throtl_update_blkio_group_write_iops(void *key,
throtl_schedule_delayed_work(td->queue, 0);
}
-void throtl_shutdown_timer_wq(struct request_queue *q)
+static void throtl_shutdown_wq(struct request_queue *q)
{
struct throtl_data *td = q->td;
@@ -1099,7 +1099,7 @@ void blk_throtl_exit(struct request_queue *q)
BUG_ON(!td);
- throtl_shutdown_timer_wq(q);
+ throtl_shutdown_wq(q);
spin_lock_irq(q->queue_lock);
throtl_release_tgs(td);
@@ -1129,7 +1129,7 @@ void blk_throtl_exit(struct request_queue *q)
* update limits through cgroup and another work got queued, cancel
* it.
*/
- throtl_shutdown_timer_wq(q);
+ throtl_shutdown_wq(q);
throtl_td_free(td);
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e3ee74f..23fb925 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1144,7 +1144,6 @@ extern int blk_throtl_init(struct request_queue *q);
extern void blk_throtl_exit(struct request_queue *q);
extern int blk_throtl_bio(struct request_queue *q, struct bio **bio);
extern void throtl_schedule_delayed_work(struct request_queue *q, unsigned long delay);
-extern void throtl_shutdown_timer_wq(struct request_queue *q);
#else /* CONFIG_BLK_DEV_THROTTLING */
static inline int blk_throtl_bio(struct request_queue *q, struct bio **bio)
{
@@ -1154,7 +1153,6 @@ static inline int blk_throtl_bio(struct request_queue *q, struct bio **bio)
static inline int blk_throtl_init(struct request_queue *q) { return 0; }
static inline int blk_throtl_exit(struct request_queue *q) { return 0; }
static inline void throtl_schedule_delayed_work(struct request_queue *q, unsigned long delay) {}
-static inline void throtl_shutdown_timer_wq(struct request_queue *q) {}
#endif /* CONFIG_BLK_DEV_THROTTLING */
#define MODULE_ALIAS_BLOCKDEV(major,minor) \
--
1.7.2.3
o Now we initialize ->queue_lock at queue allocation time so driver does
not have to worry about initializing it before calling blk_cleanup_queue().
---
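For illustration only (not part of the patch): a small userspace sketch of
the error path that the removed loop_free() fixup was covering. The types
and functions (errp_queue, errp_alloc_queue() and so on) are hypothetical
stand-ins, and the cleanup returns -1 where the real code would have
dereferenced a NULL lock pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct errp_lock {
	int initialized;
};

struct errp_queue {
	struct errp_lock  internal_lock;	/* role of __queue_lock */
	struct errp_lock *queue_lock;		/* role of ->queue_lock */
};

/* Allocation as modelled before (set_default = 0) and after (1) the series. */
void errp_alloc_queue(struct errp_queue *q, int set_default)
{
	q->internal_lock.initialized = 1;
	q->queue_lock = set_default ? &q->internal_lock : NULL;
}

/* Cleanup needs a usable lock: 0 on success, -1 if no lock was set. */
int errp_cleanup_queue(struct errp_queue *q)
{
	if (q->queue_lock == NULL)
		return -1;
	assert(q->queue_lock->initialized);
	return 0;
}

/* The driver allocates a queue, fails elsewhere, and cleans up at once. */
int errp_driver_error_path(int set_default, int driver_fixup)
{
	struct errp_queue q;

	errp_alloc_queue(&q, set_default);

	/* What loop_free() used to do by hand before this patch. */
	if (driver_fixup && q.queue_lock == NULL)
		q.queue_lock = &q.internal_lock;

	return errp_cleanup_queue(&q);
}
```

With the default set at allocation time, errp_driver_error_path(1, 0)
succeeds without any driver-side fixup, which is why the three lines in
loop_free() can go.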
drivers/block/loop.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 49e6a54..44e18c0 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1641,9 +1641,6 @@ out:
static void loop_free(struct loop_device *lo)
{
- if (!lo->lo_queue->queue_lock)
- lo->lo_queue->queue_lock = &lo->lo_queue->__queue_lock;
-
blk_cleanup_queue(lo->lo_queue);
put_disk(lo->lo_disk);
list_del(&lo->lo_list);
--
1.7.2.3
o There does not seem to be a clear convention whether q->queue_lock is
initialized or not when blk_cleanup_queue() is called. In the past
it was not necessary but now blk_throtl_exit() takes up queue lock
by default and needs queue lock to be available.
In fact elevator_exit() code also has similar requirement just that
it is less stringent in the sense that elevator_exit() is called only
if elevator is initialized.
o Two problems have been noticed because of ambiguity about spin lock
status.
- If a driver calls blk_alloc_queue() and then calls
blk_cleanup_queue() almost immediately (because some other driver
structure allocation failed or some other error happened), then
blk_throtl_exit() will run into issues as the queue lock is not
initialized. The loop driver ran into this issue recently and I noticed
such error paths in the md driver too. Similar error paths should exist
in other drivers too.
- If some driver provided external spin lock and zapped the lock
before blk_cleanup_queue(), then it can lead to issues.
o So this patch initializes the default queue lock at queue allocation time.
The block throttling code is one of the users of the queue lock and it is
initialized at queue allocation time, so it makes sense to initialize
->queue_lock to the internal lock there as well. A driver can override that
lock later. With this, a driver does not have to worry about initializing
the queue lock to the default before calling blk_cleanup_queue().
Signed-off-by: Vivek Goyal <[email protected]>
---
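For illustration only (not part of the patch): a userspace sketch of the
lock selection rule this patch introduces. The internal lock is the
default from allocation onwards, and a driver-supplied lock replaces it
only when one is actually passed. The names (ovr_queue, ovr_alloc() and
so on) are hypothetical and merely mirror the roles of
blk_alloc_queue_node() and blk_init_allocated_queue_node().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct ovr_lock {
	int id;
};

struct ovr_queue {
	struct ovr_lock  internal_lock;	/* role of __queue_lock */
	struct ovr_lock *queue_lock;	/* role of ->queue_lock */
};

/* Allocation: point ->queue_lock at the internal lock straight away. */
void ovr_alloc(struct ovr_queue *q)
{
	q->internal_lock.id = 0;
	q->queue_lock = &q->internal_lock;
}

/* Initialization: override the default only if the driver passed a lock. */
void ovr_init_allocated(struct ovr_queue *q, struct ovr_lock *lock)
{
	/* The default from allocation time must already be in place. */
	assert(q->queue_lock != NULL);

	if (lock)
		q->queue_lock = lock;
}

/* Returns 1 if the queue ends up on its internal lock, 0 otherwise. */
int ovr_uses_internal_lock(struct ovr_queue *q, struct ovr_lock *driver_lock)
{
	ovr_alloc(q);
	ovr_init_allocated(q, driver_lock);
	return q->queue_lock == &q->internal_lock;
}
```

Passing NULL keeps the internal lock, and passing a driver lock switches
the queue over to it. Either way ->queue_lock is never NULL after
allocation in this model.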
block/blk-core.c | 16 +++++++++++++++-
block/blk-settings.c | 7 -------
2 files changed, 15 insertions(+), 8 deletions(-)
diff --git a/block/blk-core.c b/block/blk-core.c
index 3cc17e6..bc2b7c5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -446,6 +446,11 @@ void blk_put_queue(struct request_queue *q)
kobject_put(&q->kobj);
}
+/*
+ * Note: If a driver supplied the queue lock, it should not zap that lock
+ * unexpectedly as some queue cleanup components like elevator_exit() and
+ * blk_throtl_exit() need queue lock.
+ */
void blk_cleanup_queue(struct request_queue *q)
{
/*
@@ -540,6 +545,12 @@ struct request_queue *blk_alloc_queue_node(gfp_t gfp_mask, int node_id)
mutex_init(&q->sysfs_lock);
spin_lock_init(&q->__queue_lock);
+ /*
+ * By default initialize queue_lock to internal lock and driver can
+ * override it later if need be.
+ */
+ q->queue_lock = &q->__queue_lock;
+
return q;
}
EXPORT_SYMBOL(blk_alloc_queue_node);
@@ -624,7 +635,10 @@ blk_init_allocated_queue_node(struct request_queue *q, request_fn_proc *rfn,
q->unprep_rq_fn = NULL;
q->unplug_fn = generic_unplug_device;
q->queue_flags = QUEUE_FLAG_DEFAULT;
- q->queue_lock = lock;
+
+ /* Override internal queue lock with supplied lock pointer */
+ if (lock)
+ q->queue_lock = lock;
/*
* This also sets hw/phys segments, boundary and size
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 36c8c1f..df649fa 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -176,13 +176,6 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
blk_queue_max_hw_sectors(q, BLK_SAFE_MAX_SECTORS);
/*
- * If the caller didn't supply a lock, fall back to our embedded
- * per-queue locks
- */
- if (!q->queue_lock)
- q->queue_lock = &q->__queue_lock;
-
- /*
* by default assume old behaviour and bounce for any highmem page
*/
blk_queue_bounce_limit(q, BLK_BOUNCE_HIGH);
--
1.7.2.3
On 2011-02-28 14:25, Vivek Goyal wrote:
> Hi Jens,
>
> These are few fixes for block throtl code and for blk_cleanup_queue(). These
> should be a good candidate for 2.6.39.
They look good, thanks.
A small plea to revisit your changelog style. Let's drop the bullet
points, and it's best to keep lines at e.g. 72 chars so that the output is
easily readable in git log. I sometimes change these for you, but it would
be nice if they followed the normal style.
--
Jens Axboe
On Wed, Mar 02, 2011 at 06:57:37PM -0500, Jens Axboe wrote:
> On 2011-02-28 14:25, Vivek Goyal wrote:
> > Hi Jens,
> >
> > These are few fixes for block throtl code and for blk_cleanup_queue(). These
> > should be a good candidate for 2.6.39.
>
> They look good, thanks.
>
> A small please to revisit your changelog style. Lets drop the bullet
> points, and it's best to keep at eg 72 chars so that the output is
> readable in git log easily. I sometimes change these for you, would be
> nice if they followed normal style though.
Sure. I will drop bullet style and also limit line width to 72 characters
in future postings.
Do you want me to repost this patch series with changelogs corrected?
Thanks
Vivek
On 2011-03-02 19:09, Vivek Goyal wrote:
> On Wed, Mar 02, 2011 at 06:57:37PM -0500, Jens Axboe wrote:
>> On 2011-02-28 14:25, Vivek Goyal wrote:
>>> Hi Jens,
>>>
>>> These are few fixes for block throtl code and for blk_cleanup_queue(). These
>>> should be a good candidate for 2.6.39.
>>
>> They look good, thanks.
>>
>> A small please to revisit your changelog style. Lets drop the bullet
>> points, and it's best to keep at eg 72 chars so that the output is
>> readable in git log easily. I sometimes change these for you, would be
>> nice if they followed normal style though.
>
> Sure. I will drop bullet style and also limit line width to 72 characters
> in future postings.
Thanks!
> Do you want me to repost this patch series with changelogs corrected?
No, I just fixed these up. It's in for-2.6.39 now.
--
Jens Axboe