2012-08-10 03:22:45

by Chanho Min

[permalink] [raw]
Subject: [PATCH][SCSI] remove the queue unlock in scsi_requset_fn

We don't need to unlock the queue before put_device in scsi_request_fn()
If we trigger the ->remove() function, It occur a oops from the caller.
So sdev reference count should not be dropped to zero here.
Also It was added before scsi_device_dev_release() was moved
to user context, so it is outdated.

Signed-off-by: Chanho Min <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
---
drivers/scsi/scsi_lib.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..cb2185a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1626,11 +1626,7 @@ out_delay:
if (sdev->device_busy == 0)
blk_delay_queue(q, SCSI_QUEUE_DELAY);
out:
- /* must be careful here...if we trigger the ->remove() function
- * we cannot be holding the q lock */
- spin_unlock_irq(q->queue_lock);
put_device(&sdev->sdev_gendev);
- spin_lock_irq(q->queue_lock);
}

u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
--
1.7.0.4


2012-08-13 17:47:09

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH][SCSI] remove the queue unlock in scsi_requset_fn

On 08/10/12 03:22, Chanho Min wrote:
> We don't need to unlock the queue before put_device in scsi_request_fn()

It looks like there is a typo in the patch subject ? Also, you can omit
"[SCSI]" from the patch subject - AFAIK James has a script that inserts
that text automatically.

Bart.

2012-08-14 09:48:06

by Chanho Min

[permalink] [raw]
Subject: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

We don't need to unlock the queue before put_device in scsi_request_fn()
If we trigger the ->remove() function, It occur a oops from the caller.
So sdev reference count should not be dropped to zero here.
Also It was added before scsi_device_dev_release() was moved
to user context, so it is outdated.

Signed-off-by: Chanho Min <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
---
drivers/scsi/scsi_lib.c | 4 ----
1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..cb2185a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1626,11 +1626,7 @@ out_delay:
if (sdev->device_busy == 0)
blk_delay_queue(q, SCSI_QUEUE_DELAY);
out:
- /* must be careful here...if we trigger the ->remove() function
- * we cannot be holding the q lock */
- spin_unlock_irq(q->queue_lock);
put_device(&sdev->sdev_gendev);
- spin_lock_irq(q->queue_lock);
}

u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
--
1.7.0.4

2012-08-14 12:07:58

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

On Tue, 2012-08-14 at 18:48 +0900, Chanho Min wrote:
> We don't need to unlock the queue before put_device in scsi_request_fn()
> If we trigger the ->remove() function, It occur a oops from the caller.
> So sdev reference count should not be dropped to zero here.
> Also It was added before scsi_device_dev_release() was moved
> to user context, so it is outdated.

None of this sounds to be correct. The user context comment is
irrelevant because if we happen to be in user context, all the release
functions will occur in line. I also don't see why the sdev reference
couldn't drop to zero here.

The reason I think we could remove the lock drop is because the queue
reference cannot drop to zero here because the block layer is holding a
reference to run the queue. It's only the queue ->release function that
would take the queue lock and therefore we're safe to hold it.

James


> Signed-off-by: Chanho Min <[email protected]>
> Reviewed-by: Bart Van Assche <[email protected]>
> ---
> drivers/scsi/scsi_lib.c | 4 ----
> 1 files changed, 0 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index ffd7773..cb2185a 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1626,11 +1626,7 @@ out_delay:
> if (sdev->device_busy == 0)
> blk_delay_queue(q, SCSI_QUEUE_DELAY);
> out:
> - /* must be careful here...if we trigger the ->remove() function
> - * we cannot be holding the q lock */
> - spin_unlock_irq(q->queue_lock);
> put_device(&sdev->sdev_gendev);
> - spin_lock_irq(q->queue_lock);
> }
>
> u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)

2012-08-16 01:36:07

by Chanho Min

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

> functions will occur in line. I also don't see why the sdev reference
> couldn't drop to zero here.
scsi_request_fn is called under the lock of request_queue->queue_lock.
If we drop the sdev reference to zero here,
scsi_device_dev_release_usercontext is
invoked and make request_queue to NULL. When caller of scsi_request_fn try to
unlock request_queue->queue_lock, the oops is occurred.

Thanks, Chanho

2012-08-16 07:52:44

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

On 08/16/12 01:35, Chanho Min wrote:
>> functions will occur in line. I also don't see why the sdev reference
>> couldn't drop to zero here.
> scsi_request_fn is called under the lock of request_queue->queue_lock.
> If we drop the sdev reference to zero here,
> scsi_device_dev_release_usercontext is
> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> unlock request_queue->queue_lock, the oops is occurred.

Whether or not your patch is applied, if the put_device() call in
scsi_request_fn() decreases the sdev reference count to zero, the
scsi_request_fn() caller will still try to unlock the queue lock after
scsi_request_fn() finished and hence will trigger a use-after-free. I'm
afraid the only real solution is to modify the SCSI and/or block layers
such that scsi_remove_device() can't finish while scsi_request_fn() is
in progress. And once that is guaranteed the get_device() / put_device()
pair can be left out from scsi_request_fn().

Bart.

2012-08-16 07:56:25

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

On Thu, 2012-08-16 at 10:35 +0900, Chanho Min wrote:
> > functions will occur in line. I also don't see why the sdev reference
> > couldn't drop to zero here.
> scsi_request_fn is called under the lock of request_queue->queue_lock.
> If we drop the sdev reference to zero here,
> scsi_device_dev_release_usercontext is
> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> unlock request_queue->queue_lock, the oops is occurred.

I don't understand this explanation.

sdev->request_queue goes to NULL if the sdev refcount goes to zero (and
blk. We have a copy though in the q variable, which is what we unlock.
That q variable only goes invalid if the queue ref count goes to zero.
If that happens, the queue release function will try to take the lock to
free the elevator and your patch will cause a deadlock.

There are only two possibilities here:

1. The queue refcount can never reach zero within a request
function because block ensures that it can unlock the queue lock
on exit. We could then remove this lock drop and acquire on the
grounds that it is superfluous.
2. The queue refcount does indeed go to zero and the queue gets
released. This would mean that all our lock; request_fn; unlock
patterns do a use after free (in the block layer). Your
proposed patch doesn't fix this (and indeed would cause a
deadlock on the release path).

I've cc'd Jens, because I don't entirely see why our

lock; request_fn; unlock

is safe against a racing blk_cleanup_queue().

James

2012-08-16 08:10:23

by James Bottomley

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

On Thu, 2012-08-16 at 07:52 +0000, Bart Van Assche wrote:
> On 08/16/12 01:35, Chanho Min wrote:
> >> functions will occur in line. I also don't see why the sdev reference
> >> couldn't drop to zero here.
> > scsi_request_fn is called under the lock of request_queue->queue_lock.
> > If we drop the sdev reference to zero here,
> > scsi_device_dev_release_usercontext is
> > invoked and make request_queue to NULL. When caller of scsi_request_fn try to
> > unlock request_queue->queue_lock, the oops is occurred.
>
> Whether or not your patch is applied, if the put_device() call in
> scsi_request_fn() decreases the sdev reference count to zero, the
> scsi_request_fn() caller will still try to unlock the queue lock after
> scsi_request_fn() finished and hence will trigger a use-after-free. I'm
> afraid the only real solution is to modify the SCSI and/or block layers
> such that scsi_remove_device() can't finish while scsi_request_fn() is
> in progress. And once that is guaranteed the get_device() / put_device()
> pair can be left out from scsi_request_fn().

Well, no. The only way to destroy a queue is with blk_cleanup_queue()
which does the final put. blk_cleanup_queue has a rather clunky drain
check that looks at both queued and in-flight requests. Even if we have
a scsi_remove_device() racing with the scsi_request_fn() and it gets to
blk_cleanup_queue(), that call gets held at the drain wait until the
scsi_request_fn() has exited. The same is true for all other request
functions, so we're safe.

James

2012-08-18 11:56:58

by Bart Van Assche

[permalink] [raw]
Subject: Re: [PATCH RESEND] remove the queue unlock in scsi_requset_fn

On 08/16/12 07:52, Bart Van Assche wrote:
> On 08/16/12 01:35, Chanho Min wrote:
>>> functions will occur in line. I also don't see why the sdev reference
>>> couldn't drop to zero here.
>> scsi_request_fn is called under the lock of request_queue->queue_lock.
>> If we drop the sdev reference to zero here,
>> scsi_device_dev_release_usercontext is
>> invoked and make request_queue to NULL. When caller of scsi_request_fn try to
>> unlock request_queue->queue_lock, the oops is occurred.
>
> Whether or not your patch is applied, if the put_device() call in
> scsi_request_fn() decreases the sdev reference count to zero, the
> scsi_request_fn() caller will still try to unlock the queue lock after
> scsi_request_fn() finished and hence will trigger a use-after-free. I'm
> afraid the only real solution is to modify the SCSI and/or block layers
> such that scsi_remove_device() can't finish while scsi_request_fn() is
> in progress. And once that is guaranteed the get_device() / put_device()
> pair can be left out from scsi_request_fn().

(replying to my own e-mail)

How about the patch below ?

[PATCH] Fix device removal race

If the put_device() call in scsi_request_fn() drops the sdev refcount
to zero then the spin_lock_irq() call after the put_device() call
triggers a use-after-free. Avoid that by making sure that blk_cleanup_queue()
can only finish after all active scsi_request_fn() calls have returned.
---
block/blk-core.c | 1 +
drivers/scsi/scsi_lib.c | 10 ++--------
include/linux/blkdev.h | 5 +++++
3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 4b4dbdf..0e4da3b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -388,6 +388,7 @@ void blk_drain_queue(struct request_queue *q, bool drain_all)
__blk_run_queue(q);

drain |= q->nr_rqs_elvpriv;
+ drain |= q->request_fn_active;

/*
* Unfortunately, requests are queued at and tracked from
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ffd7773..10bb348 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1514,9 +1514,7 @@ static void scsi_request_fn(struct request_queue *q)
struct scsi_cmnd *cmd;
struct request *req;

- if(!get_device(&sdev->sdev_gendev))
- /* We must be tearing the block queue down already */
- return;
+ q->request_fn_active++;

/*
* To start with, we keep looping until the queue is empty, or until
@@ -1626,11 +1624,7 @@ out_delay:
if (sdev->device_busy == 0)
blk_delay_queue(q, SCSI_QUEUE_DELAY);
out:
- /* must be careful here...if we trigger the ->remove() function
- * we cannot be holding the q lock */
- spin_unlock_irq(q->queue_lock);
- put_device(&sdev->sdev_gendev);
- spin_lock_irq(q->queue_lock);
+ q->request_fn_active--;
}

u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 4e72a9d..11c1987 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -377,6 +377,11 @@ struct request_queue {

unsigned int nr_sorted;
unsigned int in_flight[2];
+ /*
+ * Number of active request_fn() calls for those request_fn()
+ * implementations that unlock the queue_lock, e.g. scsi_request_fn().
+ */
+ unsigned int request_fn_active;

unsigned int rq_timeout;
struct timer_list timeout;
--
1.7.7