2004-09-19 16:03:47

by Vladimir B. Savkin

Subject: 2.6.9-rc2 hangs in posix_locks_deadlock

I was experiencing kernel hangs with versions 2.6.9-rc2 and
2.6.9-rc2-mm1 on two different boxes.

Today I managed to see the output of Alt+SysRq+P on the
hung box and wrote down the call trace (from the screen, so it is incomplete).

EIP (c015da89) was in function posix_locks_deadlock,
and the call trace was:
__posix_lock_file
fcntl_setlk


The offending process was saslauthd (version 2.1.15).

~
:wq
With best regards,
Vladimir Savkin.
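
For readers unfamiliar with the trace above: fcntl_setlk is the kernel-side handler for the fcntl() record-locking commands (F_SETLK / F_SETLKW). The following is a minimal userspace sketch of such a call, not a reproducer for the hang. The function name take_posix_lock and the file path in the usage note are hypothetical, and it uses the non-blocking F_SETLK so that it cannot wait on another process.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Illustrative only: take an exclusive POSIX (fcntl) lock on a whole file,
 * then release it. Returns 0 on success, -1 on any failure.
 * "take_posix_lock" is a hypothetical helper name, not part of saslauthd.
 */
int take_posix_lock(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 means "through end of file" */

    /* Non-blocking request: fails instead of sleeping on a conflict. */
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }

    /* Release the lock again before closing. */
    fl.l_type = F_UNLCK;
    int rc = fcntl(fd, F_SETLK, &fl);
    close(fd);
    return rc < 0 ? -1 : 0;
}
```

Calling take_posix_lock("/tmp/lock_demo.tmp") should return 0 on a healthy system. A daemon that blocks on such locks would use F_SETLKW instead, which is the variant that can sleep.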


2004-09-19 20:05:33

by Vladimir B. Savkin

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

On Sun, Sep 19, 2004 at 08:03:42PM +0400, Vladimir B. Savkin wrote:
> I was experiencing kernel hangs with versions 2.6.9-rc2 and
> 2.6.9-rc2-mm1 on two different boxes.

FYI: I have reverted the posix-locking-* patches (as found in the
2.6.9-rc2-mm1 patch set), and there have been no hangs since then.

>
> Today I managed to see the output of Alt+SysRq+P on the
> hung box and wrote down the call trace (from the screen, so it is incomplete).
>
> EIP (c015da89) was in function posix_locks_deadlock,
> and the call trace was:
> __posix_lock_file
> fcntl_setlk
~
:wq
With best regards,
Vladimir Savkin.

2004-09-19 20:32:35

by Trond Myklebust

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

On Sun, 19/09/2004 at 13:05, Vladimir B. Savkin wrote:
> >
> > Today I managed to see the output of Alt+SysRq+P on the
> > hung box and wrote down the call trace (from the screen, so it is incomplete).
> >
> > EIP (c015da89) was in function posix_locks_deadlock,
> > and the call trace was:
> > __posix_lock_file
> > fcntl_setlk

What filesystems are you using on that box?

Cheers,
Trond

2004-09-19 20:36:34

by Vladimir B. Savkin

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

On Sun, Sep 19, 2004 at 01:32:08PM -0700, Trond Myklebust wrote:
> On Sun, 19/09/2004 at 13:05, Vladimir B. Savkin wrote:
> > >
> > > Today I managed to see the output of Alt+SysRq+P on the
> > > hung box and wrote down the call trace (from the screen, so it is incomplete).
> > >
> > > EIP (c015da89) was in function posix_locks_deadlock,
> > > and the call trace was:
> > > __posix_lock_file
> > > fcntl_setlk
>
> What filesystems are you using on that box?

reiserfs on all partitions except / and /boot, which are ext2.

~
:wq
With best regards,
Vladimir Savkin.

2004-09-19 22:51:58

by Trond Myklebust

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

[PATCH] fix posix_locks_deadlock().

"blocked_list" may contain both leases and flock locks. Since the latter in
particular do not initialize the fl_owner field, we have to beware not to
call posix_same_owner() on them.

Signed-off-by: Trond Myklebust <[email protected]>
---
locks.c | 7 +++----
1 files changed, 3 insertions(+), 4 deletions(-)

Index: linux-2.6.9-rc2-up/fs/locks.c
===================================================================
--- linux-2.6.9-rc2-up.orig/fs/locks.c 2004-09-19 13:55:33.680258334 -0700
+++ linux-2.6.9-rc2-up/fs/locks.c 2004-09-19 15:37:32.595634679 -0700
@@ -634,14 +634,13 @@
 int posix_locks_deadlock(struct file_lock *caller_fl,
 				struct file_lock *block_fl)
 {
-	struct list_head *tmp;
+	struct file_lock *fl;
 
 next_task:
 	if (posix_same_owner(caller_fl, block_fl))
 		return 1;
-	list_for_each(tmp, &blocked_list) {
-		struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
-		if (posix_same_owner(fl, block_fl)) {
+	list_for_each_entry(fl, &blocked_list, fl_link) {
+		if (IS_POSIX(fl) && posix_same_owner(fl, block_fl)) {
 			fl = fl->fl_next;
 			block_fl = fl;
 			goto next_task;


Attachments:
fix_posix_locks_deadlock.dif (1.14 kB)
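
To illustrate why the IS_POSIX() check matters, here is a simplified userspace model of the deadlock walk. It is only a sketch: struct model_lock, enum lock_kind, same_owner() and model_deadlock() are hypothetical stand-ins, not the kernel's definitions, and the hop limit is a safety guard added for the model only.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's lock bookkeeping. */
enum lock_kind { KIND_POSIX, KIND_FLOCK, KIND_LEASE };

struct model_lock {
    enum lock_kind kind;
    const void *owner;              /* only meaningful for KIND_POSIX */
    const struct model_lock *next;  /* for a waiter: the lock it waits on */
};

static int same_owner(const struct model_lock *a, const struct model_lock *b)
{
    return a->owner == b->owner;
}

/*
 * Returns 1 if letting `caller` wait on `block` would close a cycle of
 * waiting owners, 0 otherwise. `blocked` models the global list of
 * waiters, which may also hold non-POSIX entries.
 */
int model_deadlock(const struct model_lock *caller,
                   const struct model_lock *block,
                   const struct model_lock *const blocked[], size_t n)
{
    /* Model-only guard: never follow more than n hops. */
    for (size_t hops = 0; hops <= n && block != NULL; hops++) {
        if (same_owner(caller, block))
            return 1;

        /* Is the owner of `block` itself waiting on something? */
        const struct model_lock *waiter = NULL;
        for (size_t i = 0; i < n; i++) {
            /* Skip flock and lease entries: their owner is not valid. */
            if (blocked[i]->kind == KIND_POSIX &&
                same_owner(blocked[i], block)) {
                waiter = blocked[i];
                break;
            }
        }
        if (waiter == NULL)
            return 0;
        block = waiter->next;       /* follow the chain one hop */
    }
    return 0;
}
```

In this model, a flock entry whose stale owner value happens to match is ignored rather than followed, which is the behaviour the kind check is meant to guarantee.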

2004-09-20 11:47:49

by Vladimir B. Savkin

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

On Sun, Sep 19, 2004 at 03:51:43PM -0700, Trond Myklebust wrote:
> Hmm... It appears that it is indeed possible for both leases and flocks
> to be on the global "blocked_list", so the appended check is *not*
> redundant.
> Since flocks in particular do not initialize fl_owner, I suspect that
> you might be seeing weird loops that were previously being avoided due
> to the ->fl_pid checks...
>
> Cheers,
> Trond
>

> [PATCH] fix posix_locks_deadlock().

2.6.9-rc2-mm1 with this patch seems to be doing fine, thanks

>
~
:wq
With best regards,
Vladimir Savkin.

2004-10-30 10:39:40

by Vladimir B. Savkin

Subject: Re: 2.6.9-rc2 hangs in posix_locks_deadlock

On Sun, Sep 19, 2004 at 03:51:43PM -0700, Trond Myklebust wrote:
> Hmm... It appears that it is indeed possible for both leases and flocks
> to be on the global "blocked_list", so the appended check is *not*
> redundant.
> Since flocks in particular do not initialize fl_owner, I suspect that
> you might be seeing weird loops that were previously being avoided due
> to the ->fl_pid checks...

I just noticed that this fix didn't make it into 2.6.9.

> [PATCH] fix posix_locks_deadlock().
>
> "blocked_list" may contain both leases and flock locks. Since the latter in
> particular do not initialize the fl_owner field, we have to beware not to
> call posix_same_owner() on them.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> locks.c | 7 +++----
> 1 files changed, 3 insertions(+), 4 deletions(-)
>
> Index: linux-2.6.9-rc2-up/fs/locks.c
> ===================================================================
> --- linux-2.6.9-rc2-up.orig/fs/locks.c 2004-09-19 13:55:33.680258334 -0700
> +++ linux-2.6.9-rc2-up/fs/locks.c 2004-09-19 15:37:32.595634679 -0700
> @@ -634,14 +634,13 @@
>  int posix_locks_deadlock(struct file_lock *caller_fl,
>  				struct file_lock *block_fl)
>  {
> -	struct list_head *tmp;
> +	struct file_lock *fl;
>  
>  next_task:
>  	if (posix_same_owner(caller_fl, block_fl))
>  		return 1;
> -	list_for_each(tmp, &blocked_list) {
> -		struct file_lock *fl = list_entry(tmp, struct file_lock, fl_link);
> -		if (posix_same_owner(fl, block_fl)) {
> +	list_for_each_entry(fl, &blocked_list, fl_link) {
> +		if (IS_POSIX(fl) && posix_same_owner(fl, block_fl)) {
>  			fl = fl->fl_next;
>  			block_fl = fl;
>  			goto next_task;

~
:wq
With best regards,
Vladimir Savkin.