David, you have to fix the locking scheme used in kernel/workqueue.c,
you absolutely cannot assume that cmpxchg() is available on all
platforms. This breaks the build on the platforms that don't
have such an instruction, and no it cannot emulated.
Also, because Alan Cox's machine (zeniv) went down, a few folks such
as Al Viro (CC:'d) had no opportunity to comment on your changes
before they went in. This mess would have been avoided if Al had a
chance to read over this, in particular since he does cross sparc32
builds he knows that cmpxchg is not available there.
On Thu, Dec 07, 2006 at 12:09:50AM -0800, David Miller wrote:
>
> David, you have to fix the locking scheme used in kernel/workqueue.c,
> you absolutely cannot assume that cmpxchg() is available on all
> platforms. This breaks the build on the platforms that don't
> have such an instruction, and no it cannot emulated.
>
> Also, because Alan Cox's machine (zeniv) went down, a few folks such
> as Al Viro (CC:'d) had no opportunity to comment on your changes
> before they went in. This mess would have been avoided if Al had a
> chance to read over this, in particular since he does cross sparc32
> builds he knows that cmpxchg is not available there.
FWIW, the *real* problem with that (and several other recent breakage
incidents) would be avoided if massive cross-arch patchsets would be
posted to linux-arch first.
It wouldn't catch all crap, but at least it would get folks to check
if the damn thing builds.
David Miller <[email protected]> wrote:
> David, you have to fix the locking scheme used in kernel/workqueue.c,
> you absolutely cannot assume that cmpxchg() is available on all
> platforms. This breaks the build on the platforms that don't
> have such an instruction, and no it cannot emulated.
Yeah, I've figured that one out. Also, having considered things last night, I
think the use of cmpxchg() is unnecessary.
I was trying to handle against two possibilities:
(1) The pending flag may have been unset or may be cleared. However, given
where it's called, the pending flag is _always_ set. I don't think it
can be unset whilst we're in set_wq_data().
Once the work is enqueued to be actually run, the only way off the queue
is for it to be actually run.
If it's a delayed work item, then the bit can't be cleared by the timer
because we haven't started the timer yet. Also, the pending bit can't be
cleared by cancelling the delayed work _until_ the work item has had its
timer started.
(2) The workqueue pointer might change. This can only happen in two cases:
(a) The work item has just been queued to actually run, and so we're
protected by the appropriate workqueue spinlock.
(b) A delayed work item is being queued, and so the timer hasn't been
started yet, and so no one else knows about the work item or can
access it (the pending bit protects us).
Besides, set_wq_data() _sets_ the workqueue pointer unconditionally, so
it can be assigned instead.
So, I think replacing the set_wq_data() with a straight assignment would be
okay in most cases. The problem is where we end up tangling with
test_and_set_bit() emulated using spinlocks, and even then it's not a problem
_provided_ test_and_set_bit() doesn't attempt to modify the word if the bit
was set.
David
On Thu, 07 Dec 2006 11:03:49 +0000
David Howells <[email protected]> wrote:
> David Miller <[email protected]> wrote:
>
> > David, you have to fix the locking scheme used in kernel/workqueue.c,
> > you absolutely cannot assume that cmpxchg() is available on all
> > platforms. This breaks the build on the platforms that don't
> > have such an instruction, and no it cannot emulated.
>
> Yeah, I've figured that one out. Also, having considered things last night, I
> think the use of cmpxchg() is unnecessary.
>
> I was trying to handle against two possibilities:
>
> (1) The pending flag may have been unset or may be cleared. However, given
> where it's called, the pending flag is _always_ set. I don't think it
> can be unset whilst we're in set_wq_data().
>
> Once the work is enqueued to be actually run, the only way off the queue
> is for it to be actually run.
>
> If it's a delayed work item, then the bit can't be cleared by the timer
> because we haven't started the timer yet. Also, the pending bit can't be
> cleared by cancelling the delayed work _until_ the work item has had its
> timer started.
>
> (2) The workqueue pointer might change. This can only happen in two cases:
>
> (a) The work item has just been queued to actually run, and so we're
> protected by the appropriate workqueue spinlock.
>
> (b) A delayed work item is being queued, and so the timer hasn't been
> started yet, and so no one else knows about the work item or can
> access it (the pending bit protects us).
>
> Besides, set_wq_data() _sets_ the workqueue pointer unconditionally, so
> it can be assigned instead.
>
> So, I think replacing the set_wq_data() with a straight assignment would be
> okay in most cases. The problem is where we end up tangling with
> test_and_set_bit() emulated using spinlocks, and even then it's not a problem
> _provided_ test_and_set_bit() doesn't attempt to modify the word if the bit
> was set.
>
I don't see why the 2.6.19 logic needed changing.
a) Nobody should be freeing the work_struct itself without running
flush_scheduled_work() and
b) even if the work_struct _did_ get freed, the callback function won't
care, because there's nothing in that work_struct which it's interested
in.
Andrew Morton <[email protected]> wrote:
> I don't see why the 2.6.19 logic needed changing.
>
> a) Nobody should be freeing the work_struct itself without running
> flush_scheduled_work() and
>
> b) even if the work_struct _did_ get freed, the callback function won't
> care, because there's nothing in that work_struct which it's interested
> in.
Erm... Did you mean that in reply to my suggestion that we don't need to use
cmpxchg()?
We might want to avoid cmpxchg() because it isn't available on all platforms
under all circumstances, and besides I'm not sure it's actually necessary.
David
[adding linux-arch to the CC list]
David Miller <[email protected]> wrote:
> David, you have to fix the locking scheme used in kernel/workqueue.c,
> you absolutely cannot assume that cmpxchg() is available on all
> platforms. This breaks the build on the platforms that don't
> have such an instruction, and no it cannot emulated.
Yeah, I've figured that one out. Also, having considered things last night, I
think the use of cmpxchg() is unnecessary.
I was trying to handle against two possibilities:
(1) The pending flag may have been unset or may be cleared. However, given
where it's called, the pending flag is _always_ set. I don't think it
can be unset whilst we're in set_wq_data().
Once the work is enqueued to be actually run, the only way off the queue
is for it to be actually run.
If it's a delayed work item, then the bit can't be cleared by the timer
because we haven't started the timer yet. Also, the pending bit can't be
cleared by cancelling the delayed work _until_ the work item has had its
timer started.
(2) The workqueue pointer might change. This can only happen in two cases:
(a) The work item has just been queued to actually run, and so we're
protected by the appropriate workqueue spinlock.
(b) A delayed work item is being queued, and so the timer hasn't been
started yet, and so no one else knows about the work item or can
access it (the pending bit protects us).
Besides, set_wq_data() _sets_ the workqueue pointer unconditionally, so
it can be assigned instead.
So, I think replacing the set_wq_data() with a straight assignment would be
okay in most cases. The problem is where we end up tangling with
test_and_set_bit() emulated using spinlocks, and even then it's not a problem
_provided_ test_and_set_bit() doesn't attempt to modify the word if the bit
was set.
David
On Thu, Dec 07, 2006 at 12:09:50AM -0800, David Miller wrote:
> Also, because Alan Cox's machine (zeniv) went down, a few folks such
> as Al Viro (CC:'d) had no opportunity to comment on your changes
> before they went in.
Special thanks should be given to Vince Sanders and Leslie Mitchell for
sourcing the replacement PSU and going to the data centre at rather short
notice to resolve the problem, without whom ZenIV would still probably
be down.
I should also correct you - it was Bryce (Philip Copeland) put ZenIV
together.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
On Thu, 07 Dec 2006 11:46:24 +0000
David Howells <[email protected]> wrote:
> Andrew Morton <[email protected]> wrote:
>
> > I don't see why the 2.6.19 logic needed changing.
> >
> > a) Nobody should be freeing the work_struct itself without running
> > flush_scheduled_work() and
> >
> > b) even if the work_struct _did_ get freed, the callback function won't
> > care, because there's nothing in that work_struct which it's interested
> > in.
>
> Erm... Did you mean that in reply to my suggestion that we don't need to use
> cmpxchg()?
I was referring to the core logic change in run-workqueue():
if (!test_bit(WORK_STRUCT_NOAUTOREL, &work->management))
work_release(work);
f(work);