2020-05-19 13:54:58

by Ben Hutchings

Subject: Backporting "padata: Remove broken queue flushing"

I noticed that commit 07928d9bfc81 "padata: Remove broken queue
flushing" has been backported to most stable branches, but commit
6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in
padata_reorder" has not.

Is this correct? What prevents the parallel_data ref-count from
dropping to 0 while the timer is scheduled?

Ben.

--
Ben Hutchings
Larkinson's Law: All laws are basically false.



2020-05-19 20:02:14

by Daniel Jordan

Subject: Re: Backporting "padata: Remove broken queue flushing"

Hello Ben,

On Tue, May 19, 2020 at 02:53:05PM +0100, Ben Hutchings wrote:
> I noticed that commit 07928d9bfc81 "padata: Remove broken queue
> flushing" has been backported to most stable branches, but commit
> 6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in
> padata_reorder" has not.
>
> Is this correct? What prevents the parallel_data ref-count from
> dropping to 0 while the timer is scheduled?

Doesn't seem like anything does, looking at 4.19.

I can see a race where the timer function uses a parallel_data after free
whether or not the refcount goes to 0. Don't think it's likely to happen in
practice because of how small the window is between the serial callback
finishing and the timer being deactivated.


task1:
padata_reorder
                                task2:
                                padata_do_serial
                                  // object arrives in reorder queue
  // sees reorder_objects > 0,
  //   set timer for 1 second
  mod_timer
  return
                                padata_reorder
                                  // queue serial work, which finishes
                                  //   (now possibly no more objects
                                  //    left)
                                                                        |
task1:                                                                  |
// pd is freed one of two ways:                                         |
//   1) pcrypt is unloaded                                              |
//   2) padata_replace triggered                                        |
//      from userspace                                                  | (small window)
                                                                        |
task3:                                                                  |
padata_reorder_timer                                                    |
  // uses pd after free                                                 |
                                                                        |
                                del_timer  // too late


If I got this right we might want to backport the commit you mentioned to be on
the safe side.
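
For readers following the diagram, the interleaving can be replayed as a small userspace model in C. This is only a sketch under stated assumptions, not kernel code: struct fake_pd, the model_* helpers and run_interleaving are hypothetical names invented for illustration, and the "free" is modelled by setting a flag so the program never touches freed memory.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct parallel_data (not the kernel struct). */
struct fake_pd {
	int reorder_objects;	/* objects still waiting in the reorder queue */
	bool timer_pending;	/* models an armed reorder timer */
	bool freed;		/* set instead of really freeing the object */
};

/* task1: padata_reorder sees reorder_objects > 0 and calls mod_timer. */
static void model_arm_timer(struct fake_pd *pd)
{
	if (pd->reorder_objects > 0)
		pd->timer_pending = true;
}

/* task2: padata_reorder queues the serial work, which then finishes. */
static void model_serial_done(struct fake_pd *pd)
{
	pd->reorder_objects = 0;
}

/* task1: pd goes away (pcrypt unload, or padata_replace from userspace). */
static void model_free(struct fake_pd *pd)
{
	pd->freed = true;
}

/* task2: del_timer, which only helps if the timer has not fired yet. */
static void model_del_timer(struct fake_pd *pd)
{
	pd->timer_pending = false;
}

/* task3: the timer fires; returns true if it would touch a freed pd. */
static bool model_timer_fires(struct fake_pd *pd)
{
	if (!pd->timer_pending)
		return false;
	pd->timer_pending = false;
	return pd->freed;
}

/*
 * Replays the events in diagram order. del_timer_before_free selects
 * whether task2 reaches del_timer before pd is freed (window closed)
 * or only afterwards (the "too late" case in the diagram).
 */
bool run_interleaving(bool del_timer_before_free)
{
	struct fake_pd pd = { .reorder_objects = 1 };
	bool use_after_free;

	model_arm_timer(&pd);
	model_serial_done(&pd);
	if (del_timer_before_free)
		model_del_timer(&pd);
	model_free(&pd);
	use_after_free = model_timer_fires(&pd);
	model_del_timer(&pd);	/* too late in the racy ordering */

	return use_after_free;
}
```

In this model, run_interleaving(false) reports the use after free and run_interleaving(true) does not, which matches the small window marked in the diagram. It says nothing about how likely the real race is.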

2020-05-20 14:34:52

by Ben Hutchings

Subject: Re: Backporting "padata: Remove broken queue flushing"

On Tue, 2020-05-19 at 16:00 -0400, Daniel Jordan wrote:
> Hello Ben,
>
> On Tue, May 19, 2020 at 02:53:05PM +0100, Ben Hutchings wrote:
> > I noticed that commit 07928d9bfc81 "padata: Remove broken queue
> > flushing" has been backported to most stable branches, but commit
> > 6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in
> > padata_reorder" has not.
> >
> > Is this correct? What prevents the parallel_data ref-count from
> > dropping to 0 while the timer is scheduled?
>
> Doesn't seem like anything does, looking at 4.19.

OK, so it looks like the following commits should be backported:

[3.16-4.9] 119a0798dc42 padata: Remove unused but set variables
[3.16] de5540d088fe padata: avoid race in reordering
[3.16-4.9] 69b348449bda padata: get_next is never NULL
[3.16-4.14] cf5868c8a22d padata: ensure the reorder timer callback runs on the correct CPU
[3.16-4.14] 350ef88e7e92 padata: ensure padata_do_serial() runs on the correct CPU
[3.16-4.19] 6fc4dbcf0276 padata: Replace delayed timer with immediate workqueue in padata_reorder
[3.16-4.19] ec9c7d19336e padata: initialize pd->cpu with effective cpumask
[3.16-4.19] 065cf577135a padata: purge get_cpu and reorder_via_wq from padata_do_serial

Ben.

> I can see a race where the timer function uses a parallel_data after free
> whether or not the refcount goes to 0. Don't think it's likely to happen in
> practice because of how small the window is between the serial callback
> finishing and the timer being deactivated.
>
>
> task1:
> padata_reorder
>                                 task2:
>                                 padata_do_serial
>                                   // object arrives in reorder queue
>   // sees reorder_objects > 0,
>   //   set timer for 1 second
>   mod_timer
>   return
>                                 padata_reorder
>                                   // queue serial work, which finishes
>                                   //   (now possibly no more objects
>                                   //    left)
>                                                                         |
> task1:                                                                  |
> // pd is freed one of two ways:                                         |
> //   1) pcrypt is unloaded                                              |
> //   2) padata_replace triggered                                        |
> //      from userspace                                                  | (small window)
>                                                                         |
> task3:                                                                  |
> padata_reorder_timer                                                    |
>   // uses pd after free                                                 |
>                                                                         |
>                                 del_timer  // too late
>
>
> If I got this right we might want to backport the commit you mentioned to be on
> the safe side.
--
Ben Hutchings
All the simple programs have been written, and all the good names taken



2020-05-21 08:02:17

by Greg Kroah-Hartman

Subject: Re: Backporting "padata: Remove broken queue flushing"

On Wed, May 20, 2020 at 03:33:44PM +0100, Ben Hutchings wrote:
> On Tue, 2020-05-19 at 16:00 -0400, Daniel Jordan wrote:
> > Hello Ben,
> >
> > On Tue, May 19, 2020 at 02:53:05PM +0100, Ben Hutchings wrote:
> > > I noticed that commit 07928d9bfc81 "padata: Remove broken queue
> > > flushing" has been backported to most stable branches, but commit
> > > 6fc4dbcf0276 "padata: Replace delayed timer with immediate workqueue in
> > > padata_reorder" has not.
> > >
> > > Is this correct? What prevents the parallel_data ref-count from
> > > dropping to 0 while the timer is scheduled?
> >
> > Doesn't seem like anything does, looking at 4.19.
>
> OK, so it looks like the following commits should be backported:
>
> [3.16-4.9] 119a0798dc42 padata: Remove unused but set variables
> [3.16] de5540d088fe padata: avoid race in reordering
> [3.16-4.9] 69b348449bda padata: get_next is never NULL
> [3.16-4.14] cf5868c8a22d padata: ensure the reorder timer callback runs on the correct CPU
> [3.16-4.14] 350ef88e7e92 padata: ensure padata_do_serial() runs on the correct CPU

These all applied cleanly to the needed trees, but these:

> [3.16-4.19] 6fc4dbcf0276 padata: Replace delayed timer with immediate workqueue in padata_reorder
> [3.16-4.19] ec9c7d19336e padata: initialize pd->cpu with effective cpumask
> [3.16-4.19] 065cf577135a padata: purge get_cpu and reorder_via_wq from padata_do_serial

need some non-trivial backporting. Can you, or someone else, do it so I
can queue them up? I don't have the free time at the moment, sorry.

thanks,

greg k-h

2020-05-21 13:32:48

by Daniel Jordan

Subject: Re: Backporting "padata: Remove broken queue flushing"

On Thu, May 21, 2020 at 10:00:46AM +0200, Greg Kroah-Hartman wrote:
> but these:
>
> > [3.16-4.19] 6fc4dbcf0276 padata: Replace delayed timer with immediate workqueue in padata_reorder
> > [3.16-4.19] ec9c7d19336e padata: initialize pd->cpu with effective cpumask
> > [3.16-4.19] 065cf577135a padata: purge get_cpu and reorder_via_wq from padata_do_serial
>
> need some non-trivial backporting. Can you, or someone else, do it so I
> can queue them up? I don't have the free time at the moment, sorry.

Sure, I'll do these three.

Daniel