On Sat, Nov 19, 2005 at 11:02:13AM -0500, Paul Clements wrote:
>
> Thanks for this. It looks correct, for the most part, and seems to deal
> with the various race conditions well (probably more cleanly than we
> were doing before). There is one problem with this patch, though. You
> have removed the acquiring of queue_lock from nbd_clear_que. I don't
> know if that was intentional or not, but the queue_lock must be taken
> when the queue is being modified. For instance, if nbd-client -d
> (disconnect) gets called on an active nbd device, then you'll have a
> race between requests being completed and pulled off the queue
> (nbd_find_request) and nbd-client -d (which calls nbd_clear_que).
This is intentional actually. nbd_clear_queue never races against
nbd_find_request because the ioctl is protected by the BKL. If it
weren't, then we have much worse problems to worry about (e.g.,
while you're clearing the queue someone else could have set the
socket again and started queueing requests).
> >+ /*
> >+ * lo->sock == NULL guarantees that the queue will not grow beyond the
> >+ * current active request.
> >+ */
This comment is why we can do without the locking. The only other
function that modifies the list (apart from nbd_find_request and
nbd_clear_queue) is nbd_do_request.
When we enter nbd_clear_queue, lo->sock must have been set to NULL.
In fact, we always take the tx_lock while setting lo->sock. So we
have two possibilities:
CPU0                                    CPU1
                                        nbd_do_request
down(tx_lock)
lo->sock = NULL
up(tx_lock)
                                        down(tx_lock)
                                        bail as lo->sock is NULL
                                        up(tx_lock)
nbd_clear_queue
If any request occurs after the setting, then clearly the list will not
be modified.
CPU0                                    CPU1
                                        nbd_do_request
                                        down(tx_lock)
                                        add req to list
                                        up(tx_lock)
down(tx_lock)
lo->sock = NULL
up(tx_lock)
nbd_clear_queue
If any request occurs before the setting, the list has been modified
already. The up/down also gives us the required memory barriers so
that we can operate on the list safely.
In fact, I just realised that for the same reason active_req must be
NULL as well. So here is an updated patch which removes the active_req
wait in nbd_clear_queue and the associated memory barrier.
I've also clarified this in the comment.
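To make the above concrete, here is a rough sketch of the pattern being relied upon. It is illustrative only, not the patch itself; the *_sketch helpers are invented names, and the real logic lives in do_nbd_request and nbd_clear_que:

/* Sketch only -- simplified field and helper names, not the actual patch. */

static void nbd_clear_sock_sketch(struct nbd_device *lo)
{
        down(&lo->tx_lock);     /* tx_lock is always held while setting lo->sock */
        lo->sock = NULL;
        up(&lo->tx_lock);       /* up/down supply the required memory barriers */
}

/* Request path (in reality part of do_nbd_request), under the same lock. */
static void nbd_queue_one_req_sketch(struct nbd_device *lo, struct request *req)
{
        down(&lo->tx_lock);
        if (!lo->sock) {
                /* Socket already cleared: bail without touching the queue. */
                up(&lo->tx_lock);
                req->errors++;
                nbd_end_request(req);
                return;
        }
        /* ... transmit req and add it to lo->queue_head ... */
        up(&lo->tx_lock);
}

/*
 * Once lo->sock == NULL has been published under tx_lock, no new requests
 * can be queued, so the list can be torn down without taking queue_lock.
 */
static void nbd_clear_que_sketch(struct nbd_device *lo)
{
        struct request *req;

        while (!list_empty(&lo->queue_head)) {
                req = list_entry(lo->queue_head.next, struct request,
                                 queuelist);
                list_del_init(&req->queuelist);
                req->errors++;
                nbd_end_request(req);
        }
}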
[NBD] Fix TX/RX race condition
Janos Haar of First NetCenter Bt. reported numerous crashes involving the
NBD driver. With his help, this was tracked down to bogus bio vectors
which in turn was the result of a race condition between the
receive/transmit routines in the NBD driver.
The bug manifests itself like this:
CPU0                                    CPU1
do_nbd_request
    add req to queuelist
nbd_send_request
    send req head
    for each bio
        kmap
        send
                                        nbd_read_stat
                                            nbd_find_request
                                            nbd_end_request
        kunmap
When CPU1 finishes nbd_end_request, the request and all its associated
bio's are freed. So when CPU0 calls kunmap whose argument is derived from
the last bio, it may crash.
Under normal circumstances, the race occurs only on the last bio. However,
if an error is encountered on the remote NBD server (such as an incorrect
magic number in the request), or if there were a bug in the server, it is
possible for the nbd_end_request to occur any time after the request's
addition to the queuelist.
The following patch fixes this problem by making sure that requests are not
added to the queuelist until after their transmission has completed.
In order for the receiving side to be ready for responses involving
requests still being transmitted, the patch introduces the concept of the
active request.
When a response matches the current active request, its processing is
delayed until after the transmission has come to a stop.
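For illustration, the scheme might look roughly like the sketch below; the *_sketch names are invented and the details are simplified, so this is an outline of the idea rather than the patch itself:

/* Transmit side (under tx_lock): expose the request currently being sent
 * as lo->active_req, and queue it only once transmission has finished. */
static void nbd_send_one_req_sketch(struct nbd_device *lo, struct request *req)
{
        down(&lo->tx_lock);
        lo->active_req = req;

        if (nbd_send_req(lo, req) != 0) {       /* assume nonzero means failure */
                req->errors++;
                nbd_end_request(req);
        } else {
                /* Only now does the request become findable by the receiver. */
                spin_lock(&lo->queue_lock);
                list_add_tail(&req->queuelist, &lo->queue_head);
                spin_unlock(&lo->queue_lock);
        }

        lo->active_req = NULL;
        up(&lo->tx_lock);
        wake_up_all(&lo->active_wq);
}

/* Receive side: if the reply matches the request still being transmitted,
 * wait for the transmitter to finish before completing it. */
static struct request *nbd_find_request_sketch(struct nbd_device *lo,
                                               struct request *xreq)
{
        struct request *req;

        wait_event(lo->active_wq, lo->active_req != xreq);

        spin_lock(&lo->queue_lock);
        list_for_each_entry(req, &lo->queue_head, queuelist) {
                if (req != xreq)
                        continue;
                list_del_init(&req->queuelist);
                spin_unlock(&lo->queue_lock);
                return req;
        }
        spin_unlock(&lo->queue_lock);

        return NULL;
}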
This has been tested by Janos and it has been successful in curing this
race condition.
Signed-off-by: Herbert Xu <[email protected]>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sun, Nov 20, 2005 at 09:34:19AM +1100, herbert wrote:
>
> This is intentional actually. nbd_clear_queue never races against
> nbd_find_request because the ioctl is protected by the BKL. If it
> weren't, then we have much worse problems to worry about (e.g.,
> while you're clearing the queue someone else could have set the
> socket again and started queueing requests).
Actually, we do have bigger problems :) The BKL is dropped every
time you sleep, and nbd_do_it is definitely a frequent sleeper :)
This isn't really an issue in practice though because the NBD
client program is single-threaded and doesn't share its file
descriptors with anyone else.
However, we shouldn't make it too easy for the user to shoot himself
in the foot. If he's going to do that, let him at least pay for the
bullet :)
So here is a patch to use a per-device semaphore instead of the
BKL to protect the ioctl's against each other.
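The shape of the change, as a sketch (the ioctl_lock field and the __nbd_ioctl helper are assumed names, not necessarily what the patch uses):

static int nbd_ioctl(struct inode *inode, struct file *file,
                     unsigned int cmd, unsigned long arg)
{
        struct nbd_device *lo = inode->i_bdev->bd_disk->private_data;
        int error;

        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;

        /* Serialise ioctls per device instead of relying on the BKL. */
        down(&lo->ioctl_lock);
        error = __nbd_ioctl(lo, cmd, arg);      /* the existing ioctl body */
        up(&lo->ioctl_lock);

        return error;
}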
Signed-off-by: Herbert Xu <[email protected]>
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu wrote:
> On Sun, Nov 20, 2005 at 09:34:19AM +1100, herbert wrote:
>
>>This is intentional actually. nbd_clear_queue never races against
>>nbd_find_request because the ioctl is protected by the BKL. If it
>>weren't, then we have much worse problems to worry about (e.g.,
>>while you're clearing the queue someone else could have set the
>>socket again and started queueing requests).
>
>
> Actually, we do have bigger problems :) The BKL is dropped every
> time you sleep, and nbd_do_it is definitely a frequent sleeper :)
The dropping of the lock in nbd_do_it is actually critical to the way
nbd functions. nbd_do_it runs for the lifetime of the nbd device, so if
nbd_do_it were holding some lock (BKL or otherwise), we'd have big problems.
> This isn't really an issue in practice though because the NBD
> client program is single-threaded and doesn't share its file
> descriptors with anyone else.
Right, there's no problem in practice.
> However, we shouldn't make it too easy for the user to shoot himself
> in the foot. If he's going to do that, let him at least pay for the
> bullet :)
>
> So here is a patch to use a per-device semaphore instead of the
> BKL to protect the ioctl's against each other.
The problem with this patch is that no ioctls can come in once nbd_do_it
starts because nbd_do_it runs for the lifetime of the device.
I think we really just need to add the acquiring of queue_lock in
nbd_clear_que to your previous patch and leave it at that. I'll code
that up and test it.
Thanks,
Paul
On Sun, Nov 20, 2005 at 12:19:17PM -0500, Paul Clements wrote:
>
> The dropping of the lock in nbd_do_it is actually critical to the way
> nbd functions. nbd_do_it runs for the lifetime of the nbd device, so if
> nbd_do_it were holding some lock (BKL or otherwise), we'd have big problems.
Why would you want to issue an ioctl from a different process while
nbd-client is still running?
Allowing ioctl's while nbd_do_it is in progress is a *serious* bug. For a
start, if someone else clears the socket then nbd_read_stat will crash.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu wrote:
> On Sun, Nov 20, 2005 at 12:19:17PM -0500, Paul Clements wrote:
>
>>The dropping of the lock in nbd_do_it is actually critical to the way
>>nbd functions. nbd_do_it runs for the lifetime of the nbd device, so if
>>nbd_do_it were holding some lock (BKL or otherwise), we'd have big problems.
>
>
> Why would you want to issue an ioctl from a different process while
> nbd-client is still running?
nbd-client -d (disconnect) is the main reason -- this functionality
would be broken if we didn't allow ioctls anymore.
> Allowing ioctl's while nbd_do_it is in progress is a *serious* bug. For a
Certain ioctls, yes. There's no harm in NBD_DISCONNECT, though.
> start, if someone else clears the socket then nbd_read_stat will crash.
I agree, NBD_CLEAR_SOCK from another process while nbd_do_it is running
would cause problems. But, if the user-level tools are coded properly,
this is not an issue.
Perhaps your ioctl lock scheme, with an exemption for NBD_DISCONNECT,
would work?
--
Paul
On Sun, Nov 20, 2005 at 04:42:23PM -0500, Paul Clements wrote:
>
> nbd-client -d (disconnect) is the main reason -- this functionality
> would be broken if we didn't allow ioctls anymore
Good point. In fact my initial patch missed a pair of locks around
this case as well. Here is a patch to move NBD_DISCONNECT outside
the ioctl_lock and add a tx_lock around it.
[NBD] Use per-device semaphore instead of BKL
Actually, we do have bigger problems :) The BKL is dropped every
time you sleep, and nbd_do_it is definitely a frequent sleeper :)
This isn't really an issue in practice though because the NBD
client program is single-threaded and doesn't share its file
descriptors with anyone else.
However, we shouldn't make it too easy for the user to shoot himself
in the foot. If he's going to do that, let him at least pay for the
bullet :)
So here is a patch to use a per-device semaphore instead of the
BKL to protect the ioctl's against each other.
In order to preserve compatibility the ioctls NBD_DISCONNECT,
NBD_CLEAR_QUE and NBD_PRINT_DEBUG may be invoked without taking
the semaphore.
This patch also adds back the tx_lock for NBD_DISCONNECT that
went AWOL after the previous patch moved tx_lock outside of
nbd_send_req.
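Schematically, the dispatch then looks something like this (a sketch only; nbd_send_disconnect and nbd_do_ioctl are invented helper names, the point being which commands bypass ioctl_lock and that NBD_DISCONNECT is sent under tx_lock):

static int nbd_ioctl(struct inode *inode, struct file *file,
                     unsigned int cmd, unsigned long arg)
{
        struct nbd_device *lo = inode->i_bdev->bd_disk->private_data;
        int error;

        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;

        switch (cmd) {
        case NBD_DISCONNECT:
                /* Must keep working while nbd_do_it holds ioctl_lock; take
                 * tx_lock so the disconnect request cannot interleave with
                 * a normal request. */
                if (!lo->sock)
                        return -EINVAL;
                down(&lo->tx_lock);
                error = nbd_send_disconnect(lo);
                up(&lo->tx_lock);
                return error;

        case NBD_CLEAR_QUE:
        case NBD_PRINT_DEBUG:
                /* Also allowed while nbd_do_it is running. */
                return nbd_do_ioctl(lo, cmd, arg);
        }

        /* Everything else is serialised by the per-device semaphore. */
        down(&lo->ioctl_lock);
        error = nbd_do_ioctl(lo, cmd, arg);
        up(&lo->ioctl_lock);

        return error;
}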
Signed-off-by: Herbert Xu <[email protected]>
Thanks,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Herbert Xu wrote:
> On Sun, Nov 20, 2005 at 04:42:23PM -0500, Paul Clements wrote:
>
>>nbd-client -d (disconnect) is the main reason -- this functionality
>>would be broken if we didn't allow ioctls anymore.
>
>
> Good point. In fact my initial patch missed a pair of locks around
> this case as well. Here is a patch to move NBD_DISCONNECT outside
> the ioctl_lock and add a tx_lock around it.
Yep, I think that will do it. I'll test these two patches some and let
you know if I see any problems.
Thanks again.
--
Paul