This is a corrected patch for the autofs4 daedlock problem I posted about
yesterday.
As before comments please.
diff -Nur autofs4-2.6.orig/fs/autofs4/autofs_i.h autofs4-2.6.deadlock/fs/autofs4/autofs_i.h
--- autofs4-2.6.orig/fs/autofs4/autofs_i.h 2003-09-09 03:50:14.000000000 +0800
+++ autofs4-2.6.deadlock/fs/autofs4/autofs_i.h 2003-09-22 22:48:11.000000000 +0800
@@ -82,6 +82,7 @@
char *name;
/* This is for status reporting upon return */
int status;
+ struct task_struct *owner;
int wait_ctr;
};
diff -Nur autofs4-2.6.orig/fs/autofs4/waitq.c autofs4-2.6.deadlock/fs/autofs4/waitq.c
--- autofs4-2.6.orig/fs/autofs4/waitq.c 2003-09-09 03:50:31.000000000 +0800
+++ autofs4-2.6.deadlock/fs/autofs4/waitq.c 2003-09-24 00:10:38.000000000 +0800
@@ -165,6 +165,7 @@
wq->status = -EINTR; /* Status return if interrupted */
memcpy(wq->name, name->name, name->len);
wq->next = sbi->queues;
+ wq->owner = current;
sbi->queues = wq;
DPRINTK(("autofs_wait: new wait id = 0x%08lx, name = %.*s, nfy=%d\n",
@@ -206,6 +207,11 @@
interruptible_sleep_on(&wq->queue);
+ if (waitqueue_active(&wq->queue) && current != wq->owner) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ schedule_timeout(wq->wait_ctr * (HZ/10));
+ }
+
spin_lock_irqsave(¤t->sighand->siglock, irqflags);
current->blocked = oldset;
recalc_sigpending();
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/
On Wed, 2003-09-24 at 15:01, Ian Kent wrote:
> This is a corrected patch for the autofs4 daedlock problem I posted about
> @@ -206,6 +207,11 @@
>
> interruptible_sleep_on(&wq->queue);
>
> + if (waitqueue_active(&wq->queue) && current != wq->owner) {
> + set_current_state(TASK_INTERRUPTIBLE);
> + schedule_timeout(wq->wait_ctr * (HZ/10));
> + }
> +
this really really looks like you're trying to pamper over a bug by
changing the timing somewhere instead of fixing it...
also are you sure the deadlock isn't because of the racey use of
interruptible_sleep_on ?
On Wed, 24 Sep 2003, Arjan van de Ven wrote:
> On Wed, 2003-09-24 at 15:01, Ian Kent wrote:
> > This is a corrected patch for the autofs4 daedlock problem I posted about
> > @@ -206,6 +207,11 @@
> >
> > interruptible_sleep_on(&wq->queue);
> >
> > + if (waitqueue_active(&wq->queue) && current != wq->owner) {
> > + set_current_state(TASK_INTERRUPTIBLE);
> > + schedule_timeout(wq->wait_ctr * (HZ/10));
> > + }
> > +
>
> this really really looks like you're trying to pamper over a bug by
> changing the timing somewhere instead of fixing it...
Agreed.
>
> also are you sure the deadlock isn't because of the racey use of
> interruptible_sleep_on ?
>
OK so maybe I should have suggestions instead of comments.
Please elaborate.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/
On Wed, Sep 24, 2003 at 09:38:16PM +0800, Ian Kent wrote:
> > interruptible_sleep_on ?
> >
>
> OK so maybe I should have suggestions instead of comments.
instead of interruptible_sleep_on(), it looks like you really want
to use completions for this code..
see kernel/workqueue.c for how those are used
On Wed, 24 Sep 2003, Arjan van de Ven wrote:
> On Wed, Sep 24, 2003 at 09:38:16PM +0800, Ian Kent wrote:
> > > interruptible_sleep_on ?
> > >
> >
> > OK so maybe I should have suggestions instead of comments.
>
> instead of interruptible_sleep_on(), it looks like you really want
> to use completions for this code..
> see kernel/workqueue.c for how those are used
>
I did try that but thinking again ...
The actual expire really needs to be interrutible.
I didn't like the idea that the additional waiters would be
uninterruptible but perhaps I have no choice.
Given that do you think that using an interruptible_sleep_on for the
expire and a completion for the additional waiters could give an
acceptable solution?
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/
===== waitq.c 1.6 vs edited =====
--- 1.6/fs/autofs4/waitq.c Fri Feb 7 12:25:20 2003
+++ edited/waitq.c Wed Sep 24 15:48:30 2003
@@ -204,7 +204,7 @@
recalc_sigpending();
spin_unlock_irqrestore(¤t->sighand->siglock, irqflags);
- interruptible_sleep_on(&wq->queue);
+ wait_event_interruptible(wq->queue, wq->name == NULL);
spin_lock_irqsave(¤t->sighand->siglock, irqflags);
current->blocked = oldset;
On Wed, Sep 24, 2003 at 10:46:43PM +0800, Ian Kent wrote:
> The actual expire really needs to be interrutible.
> I didn't like the idea that the additional waiters would be
> uninterruptible but perhaps I have no choice.
>
> Given that do you think that using an interruptible_sleep_on for the
> expire and a completion for the additional waiters could give an
> acceptable solution?
No...
what happens if the person waking you does so inbetween the
if ( wq->name ) {
test and the
interruptible_sleep_on(&wq->queue);
.. ?
On Wed, 24 Sep 2003, Mike Waychison wrote:
>
> I think the deadlock itself needs to be properly identified.
>
> Could you explain where the deadlock is actually occuring? I briefed
> over the automount 4 code as well as autofs4 and I don't see the
> deadlock. The 'owner' in the case of an expiry will be a child process
> of the daemon, within a call to ioctl(EXPIRE_MULTI), correct? Having it
> be released from the waitqueue first should not affect flow of execution
> and released from deadlock.
Yes. I am having trouble defining the actual problem also.
I believe that the deadlock occures because of the sequence of calls
between the expire and Naultilus, each execution path taking and releasing
the BKL.
I will try and get more evidence as I work on it.
>
> I don't see how having it wake up before before any other racing
> processes solves anything.
I thought this was a side effect of the O(1) scheduler and that the
design of the wait handling left it open to a sequence of calls problem.
I first got the impression that it was related to the scheduler (and
felt that it was a deadlock) when I bumped the priority of the expire to
see what would happen and it work fine every time I tried it.
>
> I think Arjan is right in that the race is do to the nautilus process
> entering the sleep_on after the a call to wake_up(&wq->queue). I don't
That needs fixing for sure, apart from anything else.
> know if a change to using a workqueue is best.. how about refactoring
> that chunk of code to use wait_event_interruptible on the queue, which
> should be clear of any waitqueue/sleep_on races.
Exaclty what I thought after pondering Arjans' mail last night.
The expire must be interruptible.
This will show if there actually is a timing problem (I hope).
>
> >
> >
> > OK so maybe I should have suggestions instead of comments.
> >
> > Please elaborate.
> >
>
> How about you try out this quick patch I threw together.
Oops. Replied without looking at the patch.
I'll check it out after work tonight.
Thanks to all for the help.
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/
On Thu, 25 Sep 2003, Ian Kent wrote:
Seem to have lost the original message.
Please forgive the double indent.
I have tried the patch you recommended and have exactly the same symptoms.
> On Wed, 24 Sep 2003, Mike Waychison wrote:
>
> >
> > I think the deadlock itself needs to be properly identified.
> >
> > Could you explain where the deadlock is actually occuring? I briefed
> > over the automount 4 code as well as autofs4 and I don't see the
> > deadlock. The 'owner' in the case of an expiry will be a child process
> > of the daemon, within a call to ioctl(EXPIRE_MULTI), correct? Having it
> > be released from the waitqueue first should not affect flow of execution
> > and released from deadlock.
I have had some time to clear my thoughts on this now.
First, the expire is done using a while in the daemon. It continues until
the ioctl returns non zero.
I believe the sequence of events is:
Expire locates an expireable dentry and sends a message to the daemon to
do the umount and returns 0. The ioctl runs with the BKL.
The Naultilus process triggers a revalidate during the expire and waits
till that iteration of the expire finishes before continuing. The
revalidate leads to a lookup which takes the BKL.
Meanwhile the expire enters the ioctl again but is blocked on the BKL.
The lookup sends a mount request to the daemon which it cannot hear
because it is still waiting for the expire to finish.
While this is a plausible description with the code as it is, I
have found that even if I bracket the revalidate with a BKL the problem
still occurs.
So there is something else I am missing. I don't know what it is.
> > I don't see how having it wake up before before any other racing
> > processes solves anything.
If my description above where correct then forcing the expire to finish
would lead to the correct behaviour.
>
> I thought this was a side effect of the O(1) scheduler and that the
> design of the wait handling left it open to a sequence of calls problem.
> I first got the impression that it was related to the scheduler (and
> felt that it was a deadlock) when I bumped the priority of the expire to
> see what would happen and it work fine every time I tried it.
>
I still think that it is a O(1) effect. When the sleeping processes awake
it is possible that the scheduler does not pick the expire process as it
was the most recent of the processes to run.
> >
> > I think Arjan is right in that the race is do to the nautilus process
> > entering the sleep_on after the a call to wake_up(&wq->queue). I don't
>
> That needs fixing for sure, apart from anything else.
>
> > know if a change to using a workqueue is best.. how about refactoring
> > that chunk of code to use wait_event_interruptible on the queue, which
> > should be clear of any waitqueue/sleep_on races.
Again if my explanation is close to correct then the question is what is
the proper way to force the execution order? Using a completion as well as
the wait perhaps?
--
,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/