Hi Christoph,

bl_resolve_deviceid() has:

	add_wait_queue(&nn->bl_wq, &wq);
	rc = rpc_queue_upcall(nn->bl_device_pipe, msg);
	if (rc < 0) {
		remove_wait_queue(&nn->bl_wq, &wq);
		goto out_free_data;
	}

	set_current_state(TASK_UNINTERRUPTIBLE);
	schedule();
	remove_wait_queue(&nn->bl_wq, &wq);

Doesn't that call to 'set_current_state()' need to come before the
rpc_queue_upcall() if you want the wait for the downcall to be
race-free? It looks to me as if the right thing to do here is to
replace the above with a prepare_to_wait()/finish_wait() pair...
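
To illustrate the ordering concern, here is a minimal userspace toy model
of the window in question. It is not kernel code and not the actual patch:
every toy_* name and both wait_state_*() helpers are hypothetical stand-ins,
and the "upcall" is assumed to be answered instantly so that the wakeup
lands before the task reaches schedule().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these types and helpers are hypothetical, not kernel API. */
enum toy_state { TOY_RUNNING, TOY_UNINTERRUPTIBLE };

struct toy_task {
	enum toy_state state;
	bool on_queue;
};

/* Models wake_up(): a task found on the wait queue becomes runnable. */
static void toy_wake_up(struct toy_task *t)
{
	if (t->on_queue)
		t->state = TOY_RUNNING;
}

/* Models schedule(): the task only sleeps if it is not marked runnable. */
static bool toy_schedule_would_sleep(const struct toy_task *t)
{
	return t->state != TOY_RUNNING;
}

/* Assumption: the downcall arrives before the upcall even returns. */
static void toy_upcall_with_instant_reply(struct toy_task *t)
{
	toy_wake_up(t);
}

/* Ordering from the snippet above: state is set after the upcall. */
static bool wait_state_after_upcall(void)
{
	struct toy_task t = { TOY_RUNNING, false };

	t.on_queue = true;                 /* add_wait_queue() */
	toy_upcall_with_instant_reply(&t); /* wakeup hits a running task */
	t.state = TOY_UNINTERRUPTIBLE;     /* set_current_state(), too late */
	return toy_schedule_would_sleep(&t);
}

/* Suggested ordering: queue and set state first, as prepare_to_wait() does. */
static bool wait_state_before_upcall(void)
{
	struct toy_task t = { TOY_RUNNING, false };

	t.on_queue = true;                 /* prepare_to_wait(): enqueue ... */
	t.state = TOY_UNINTERRUPTIBLE;     /* ... and set the task state */
	toy_upcall_with_instant_reply(&t); /* wakeup resets state to running */
	return toy_schedule_would_sleep(&t);
}

/* Runs both orderings and checks the outcome described in the comments. */
static void toy_demo(void)
{
	assert(wait_state_after_upcall());   /* sleeps, wakeup already spent */
	assert(!wait_state_before_upcall()); /* schedule() returns at once */
}
```

In the first ordering the early wakeup is spent on a task that is still
running, so the later schedule() has nothing left to wake it. In the second
the wakeup flips the state back and schedule() returns immediately.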
--
Trond Myklebust
Linux NFS client maintainer, PrimaryData
[email protected]
On Wed, Jan 14, 2015 at 11:51:37AM -0500, Trond Myklebust wrote:
> Hi Christoph,
>
> bl_resolve_deviceid() has:
>
> 	add_wait_queue(&nn->bl_wq, &wq);
> 	rc = rpc_queue_upcall(nn->bl_device_pipe, msg);
> 	if (rc < 0) {
> 		remove_wait_queue(&nn->bl_wq, &wq);
> 		goto out_free_data;
> 	}
>
> 	set_current_state(TASK_UNINTERRUPTIBLE);
> 	schedule();
> 	remove_wait_queue(&nn->bl_wq, &wq);
>
>
> Doesn't that call to 'set_current_state()' need to come before the
> rpc_queue_upcall() if you want the wait for the downcall to be
> race-free? It looks to me as if the right thing to do here is to
> replace the above with a prepare_to_wait()/finish_wait() pair...
That code predates my involvement with the block layout driver,
but from a quick inspection I'd say you're right. Let me cook up
a patch and run it through testing.
On Wed, Jan 14, 2015 at 11:51:37AM -0500, Trond Myklebust wrote:
> Doesn't that call to 'set_current_state()' need to come before the
> rpc_queue_upcall() if you want the wait for the downcall to be
> race-free? It looks to me as if the right thing to do here is to
> replace the above with a prepare_to_wait()/finish_wait() pair...
After the trivial switch to prepare_to_wait()/finish_wait() the
thread asking for the deviceid never gets woken. I'll need more
time to understand all the code around the rpc_pipefs upcalls.