2007-08-01 17:46:16

by Chris Rankin

Subject: [BUG] Linux 2.6.22 - Atomic counter underflow in NFS

Hi,

I am running a 2.6.22 kernel on a dual P4 Xeon (HT enabled) with 2 GB RAM, and I have just found
this BUG in my dmesg log:

nfsd: last server has exited
nfsd: unexporting all filesystems
BUG: atomic counter underflow at:
[<c01a382a>] kref_put+0x66/0x84
[<c0188bc2>] sysfs_hash_and_remove+0xb8/0x12d
[<c0188bca>] sysfs_hash_and_remove+0xc0/0x12d
[<c01522e2>] sysfs_slab_alias+0x19/0x5c
[<c0152438>] sysfs_slab_add+0x113/0x124
[<c01527de>] kmem_cache_create+0x12d/0x1d3
[<f8d9df1b>] nfs4_state_start+0x3b/0x1a7 [nfsd]
[<f8d8a485>] nfsd_svc+0x51/0x10d [nfsd]
[<f8d8ae76>] write_threads+0x65/0x96 [nfsd]
[<c016c279>] simple_transaction_get+0x70/0x82
[<f8d8ae11>] write_threads+0x0/0x96 [nfsd]
[<f8d8ab9f>] nfsctl_transaction_write+0x36/0x5c [nfsd]
[<f8d8ab69>] nfsctl_transaction_write+0x0/0x5c [nfsd]
[<c0155392>] vfs_write+0x8a/0x10c
[<c0155901>] sys_write+0x41/0x67
[<c010266e>] sysenter_past_esp+0x5f/0x85
=======================
NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
NFSD: starting 90-second grace period

Cheers,
Chris





2007-08-01 20:59:16

by Satyam Sharma

Subject: Re: [BUG] Linux 2.6.22 - Atomic counter underflow in NFS

Hi Chris,


On Wed, 1 Aug 2007, Chris Rankin wrote:

> I am running a 2.6.22 kernel on a dual P4 Xeon (HT enabled) with 2 GB RAM, and I have just found
> this BUG in my dmesg log:
>
> nfsd: last server has exited
> nfsd: unexporting all filesystems
> BUG: atomic counter underflow at:
> [<c01a382a>] kref_put+0x66/0x84
> [<c0188bc2>] sysfs_hash_and_remove+0xb8/0x12d
> [<c0188bca>] sysfs_hash_and_remove+0xc0/0x12d
> [<c01522e2>] sysfs_slab_alias+0x19/0x5c
> [<c0152438>] sysfs_slab_add+0x113/0x124
> [<c01527de>] kmem_cache_create+0x12d/0x1d3
> [<f8d9df1b>] nfs4_state_start+0x3b/0x1a7 [nfsd]
> [<f8d8a485>] nfsd_svc+0x51/0x10d [nfsd]
> [<f8d8ae76>] write_threads+0x65/0x96 [nfsd]
> [<c016c279>] simple_transaction_get+0x70/0x82
> [<f8d8ae11>] write_threads+0x0/0x96 [nfsd]
> [<f8d8ab9f>] nfsctl_transaction_write+0x36/0x5c [nfsd]
> [<f8d8ab69>] nfsctl_transaction_write+0x0/0x5c [nfsd]
> [<c0155392>] vfs_write+0x8a/0x10c
> [<c0155901>] sys_write+0x41/0x67
> [<c010266e>] sysenter_past_esp+0x5f/0x85
> =======================
> NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
> NFSD: starting 90-second grace period

I expect this is easy to reproduce at will (when shutting down nfs
services, probably), right?

Please try the latest mainline -git kernel (say 2.6.23-rc1-git10 from
kernel.org) and let us know if this still occurs? There were some fixes
regarding sysfs symlink refcounting that went in recently, and this looks
like one of those cases to me.


Thanks,
Satyam

2007-08-01 21:11:22

by Chris Rankin

Subject: Re: [BUG] Linux 2.6.22 - Atomic counter underflow in NFS

--- Satyam Sharma wrote:
> I expect this is easy to reproduce at will (when shutting down nfs
> services, probably), right?

I'm not sure about the "at will" part because this is the first time I've seen it since 2.6.22 was
released. However, I was upgrading my Fedora 7 nfs-utils package at the time so it probably did
happen when the NFS service was being shut down.

> Please try the latest mainline -git kernel (say 2.6.23-rc1-git10 from
> kernel.org) and let us know if this still occurs? There were some fixes
> regarding sysfs symlink refcounting that went in recently, and this looks
> like one of those cases to me.

Do you have the actual patches instead?

Cheers,
Chris




2007-08-01 21:25:41

by Satyam Sharma

Subject: Re: [BUG] Linux 2.6.22 - Atomic counter underflow in NFS



On Wed, 1 Aug 2007, Chris Rankin wrote:

> --- Satyam Sharma wrote:
> > I expect this is easy to reproduce at will (when shutting down nfs
> > services, probably), right?
>
> I'm not sure about the "at will" part because this is the first time I've seen it since 2.6.22 was
> released. However, I was upgrading my Fedora 7 nfs-utils package at the time so it probably did
> happen when the NFS service was being shut down.

Hmm, the backtrace suggests it should be reproducible fairly frequently.

> > Please try the latest mainline -git kernel (say 2.6.23-rc1-git10 from
> > kernel.org) and let us know if this still occurs? There were some fixes
> > regarding sysfs symlink refcounting that went in recently, and this looks
> > like one of those cases to me.
>
> Do you have the actual patches instead?

I was thinking maybe this one: http://lkml.org/lkml/diff/2007/7/18/48/1

The kref_put() there happens on the sysfs_remove_link() path, and the
patch above fixed an extra put() during create_link(), which led to
the final put() during remove_link() being the one that causes this underflow.
However, a lot changed in sysfs in the past few months, and I'm not
really sure that patch would apply cleanly. Hence, if you can try with
the latest -git ...


Thanks,
Satyam