On Fri, Jul 12, 2019 at 12:10:12PM +0200, Steffen Klassert wrote:
> On Fri, Jul 12, 2019 at 06:06:36PM +0800, Herbert Xu wrote:
> > Daniel Jordan <[email protected]> wrote:
> > >
> > > CPU0 CPU1
> > >
> > > padata_reorder padata_do_serial
> > > LOAD reorder_objects // 0
> > > INC reorder_objects // 1
> > > padata_reorder
> > > TRYLOCK pd->lock // failed
> > > UNLOCK pd->lock
> >
> > I think this can't happen because CPU1 won't call padata_reorder
> > at all as it's the wrong CPU so it gets pushed back onto a work
> > queue which will go back to CPU0.
Thanks for looking at this.
When you say the wrong CPU, I think you're referring to the reorder_via_wq
logic in padata_do_serial. If so, I think my explanation was unclear, because
there were two padata jobs in flight and my tracepoints showed neither of them
punted padata_reorder to a workqueue. Let me expand on this, hopefully it
helps.
pcrypt used CPU 5 for its callback CPU. The first job in question, with
sequence number 16581, hashed to CPU 21 on my system. This is a more complete
version of what led to the hanging modprobe command.
modprobe (CPU2) kworker/21:1-293 (CPU21) kworker/5:2-276 (CPU5)
-------------------------- ------------------------ ----------------------
<submit job, seq_nr=16581>
...
padata_do_parallel
queue_work_on(21, ...)
<sleeps>
padata_parallel_worker
pcrypt_aead_dec
padata_do_serial
padata_reorder
| padata_get_next // returns job, seq_nr=16581
| // serialize seq_nr=16581
| queue_work_on(5, ...)
| padata_get_next // returns -EINPROGRESS
// padata_reorder doesn't return yet!
| | padata_serial_worker
| | pcrypt_aead_serial
| | <wake up modprobe>
| | <worker finishes>
<submit job, seq_nr=16582> | |
... | |
padata_do_parallel | |
queue_work_on(22, ...) | | (LOAD reorder_objects as 0 somewhere
<sleeps> | | in here, before the INC)
| | kworker/22:1-291 (CPU22)
| | ------------------------
| | padata_parallel_worker
| | pcrypt_aead_dec
| | padata_do_serial
| | // INC reorder_objects to 1
| | padata_reorder
| | // trylock fail, CPU21 _should_
| | // serialize 16582 but doesn't
| | <worker finishes>
| // deletes pd->timer
// padata_reorder returns
<worker finishes>
<keeps on sleeping lazily>
My tracepoints showed CPU22 increased reorder_objects to 1 but CPU21 read it as
0.
I think adding the full barrier guarantees the following ordering, and the
memory model people can correct me if I'm wrong:
CPU21 CPU22
------------------------ --------------------------
UNLOCK pd->lock
smp_mb()
LOAD reorder_objects
INC reorder_objects
spin_unlock(&pqueue->reorder.lock) // release barrier
TRYLOCK pd->lock
So if CPU22 has incremented reorder_objects but CPU21 reads 0 for it, CPU21
should also have unlocked pd->lock so CPU22 can get it and serialize any
remaining jobs.
On Fri, Jul 12, 2019 at 12:07:37PM -0400, Daniel Jordan wrote:
>
> modprobe (CPU2) kworker/21:1-293 (CPU21) kworker/5:2-276 (CPU5)
> -------------------------- ------------------------ ----------------------
> <submit job, seq_nr=16581>
> ...
> padata_do_parallel
> queue_work_on(21, ...)
> <sleeps>
> padata_parallel_worker
> pcrypt_aead_dec
> padata_do_serial
> padata_reorder
This can't happen because if the job started on CPU2 then it must
go back to CPU2 for completion. IOW padata_do_serial should be
punting this to a work queue for CPU2 rather than calling
padata_reorder on CPU21.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Sat, Jul 13, 2019 at 01:03:21PM +0800, Herbert Xu wrote:
> On Fri, Jul 12, 2019 at 12:07:37PM -0400, Daniel Jordan wrote:
> >
> > modprobe (CPU2) kworker/21:1-293 (CPU21) kworker/5:2-276 (CPU5)
> > -------------------------- ------------------------ ----------------------
> > <submit job, seq_nr=16581>
> > ...
> > padata_do_parallel
> > queue_work_on(21, ...)
> > <sleeps>
> > padata_parallel_worker
> > pcrypt_aead_dec
> > padata_do_serial
> > padata_reorder
>
> This can't happen because if the job started on CPU2 then it must
> go back to CPU2 for completion. IOW padata_do_serial should be
> punting this to a work queue for CPU2 rather than calling
> padata_reorder on CPU21.
I've been wrong before plenty of times, and there's nothing preventing this
from being one of those times :) , but in this case I believe what I'm showing
is correct.
The padata_do_serial call for a given job ensures padata_reorder runs on the
CPU that the job hashed to in padata_do_parallel, which is not necessarily the
same CPU as the one that padata_do_parallel itself ran on.
In this case, the padata job in question started via padata_do_parallel, where
it hashed to CPU 21:
padata_do_parallel // ran on CPU 2
...
target_cpu = padata_cpu_hash(pd); // target_cpu == 21
padata->cpu = target_cpu;
...
queue_work_on(21, ...)
The corresponding kworker then started:
padata_parallel_worker // bound to CPU 21
pcrypt_aead_dec
padata_do_serial
padata_reorder
Daniel