2010-12-08 19:03:40

by Trond Myklebust

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing

On Wed, 2010-12-08 at 18:30 +0000, [email protected]
wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>
>
> Andrew Morton <[email protected]> changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> Component|Other |NFS
> AssignedTo|[email protected] |[email protected]
> |rg |
>
>
>
>
> --- Comment #8 from Andrew Morton <[email protected]> 2010-12-08 18:30:32 ---
> OK, I reassigned it to NFS. If that was wrong then at least the NFS guys
> should be able to help point things in the right direction.

<Switching to email interface. Please do not edit the bugzilla entry
directly, since that will lose the above Cc information>


Chuck,

Stefan appears to be hitting a panic in the nfs_umount() call from
nfs_walk_authlist(). Can you take a look, please?

Cheers
Trond



2010-12-08 20:37:02

by Chuck Lever III

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing


On Dec 8, 2010, at 2:03 PM, Trond Myklebust wrote:

> On Wed, 2010-12-08 at 18:30 +0000, [email protected]
> wrote:
>> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>>
>>
>> Andrew Morton <[email protected]> changed:
>>
>> What |Removed |Added
>> ----------------------------------------------------------------------------
>> Component|Other |NFS
>> AssignedTo|[email protected] |[email protected]
>> |rg |
>>
>>
>>
>>
>> --- Comment #8 from Andrew Morton <[email protected]> 2010-12-08 18:30:32 ---
>> OK, I reassigned it to NFS. If that was wrong then at least the NFS guys
>> should be able to help point things in the right direction.
>
> <Switching to email interface. Please do not edit the bugzilla entry
> directly, since that will lose the above Cc information>
>
>
> Chuck,
>
> Stefan appears to be hitting a panic in the nfs_umount() call from
> nfs_walk_authlist(). Can you take a look, please?

Recv'd. I'll have a look.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2010-12-09 17:28:22

by Chuck Lever III

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing


On Dec 9, 2010, at 12:25 PM, Stefan Bader wrote:

> On 12/09/2010 04:23 PM, Chuck Lever wrote:
>>
>> On Dec 9, 2010, at 5:12 AM, Stefan Bader wrote:
>>
>>> On 12/09/2010 01:19 AM, Chuck Lever wrote:
>>>> Hi Stefan-
>>>>
>>>> On Dec 8, 2010, at 3:35 PM, Chuck Lever wrote:
>>>>
>>>>>
>>>>> On Dec 8, 2010, at 2:03 PM, Trond Myklebust wrote:
>>>>>
>>>>>> On Wed, 2010-12-08 at 18:30 +0000, [email protected]
>>>>>> wrote:
>>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>>>>>>>
>>>>>>>
>>>>>>> Andrew Morton <[email protected]> changed:
>>>>>>>
>>>>>>> What |Removed |Added
>>>>>>> ----------------------------------------------------------------------------
>>>>>>> Component|Other |NFS
>>>>>>> AssignedTo|[email protected] |[email protected]
>>>>>>> |rg |
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --- Comment #8 from Andrew Morton <[email protected]>
>>>>>>> 2010-12-08 18:30:32 --- OK, I reassigned it to NFS. If that was wrong
>>>>>>> then at least the NFS guys should be able to help point things in the
>>>>>>> right direction.
>>>>>>
>>>>>> <Switching to email interface. Please do not edit the bugzilla entry
>>>>>> directly, since that will lose the above Cc information>
>>>
>>> I missed the note above, so I have to duplicate things a bit, sorry. I removed
>>> the bugzilla Cc so the report does not get spammed.
>>>
>>>>>>
>>>>>>
>>>>>> Chuck,
>>>>>>
>>>>>> Stefan appears to be hitting a panic in the nfs_umount() call from
>>>>>> nfs_walk_authlist(). Can you take a look, please?
>>>>>
>>>>> Recv'd. I'll have a look.
>>>>
>>>> Apologies in advance for the attachment. There are a few other clean ups
>>>> that can be done, but this seems to be the minimal fix. Please try this and
>>>> let us know if it addresses your panic.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> The faulty commit went in a while back, so this patch is probably appropriate
>>>> for stable kernels (back to 2.6.31, where this was likely introduced).
>>>>
>>>
>>> I can confirm that this was the root cause of the crash. The SHA referenced as
>>> the offending commit went in with a 2.6.32-rc, so I added the stable info
>>> accordingly. I also added my Tested-by and slightly modified the bug reference
>>> format (Chuck, I hope this is ok).
>>
>> Looks good to me. I assume, since Trond still owns the kernel.org bugzilla, that he will close this loop? Let me know if I need to do anything else.
>>
> Actually, someone spotted that the second ARRAY_SIZE should probably use
> mnt_procedures3 instead of mnt_procedures. They happen to be the same size,
> so the testing is still valid.

Oops. Copy-n-paste error, I thought I had fixed that. Good catch.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2010-12-09 15:25:14

by Chuck Lever III

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing


On Dec 9, 2010, at 5:12 AM, Stefan Bader wrote:

> On 12/09/2010 01:19 AM, Chuck Lever wrote:
>> Hi Stefan-
>>
>> On Dec 8, 2010, at 3:35 PM, Chuck Lever wrote:
>>
>>>
>>> On Dec 8, 2010, at 2:03 PM, Trond Myklebust wrote:
>>>
>>>> On Wed, 2010-12-08 at 18:30 +0000, [email protected]
>>>> wrote:
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>>>>>
>>>>>
>>>>> Andrew Morton <[email protected]> changed:
>>>>>
>>>>> What |Removed |Added
>>>>> ----------------------------------------------------------------------------
>>>>> Component|Other |NFS
>>>>> AssignedTo|[email protected] |[email protected]
>>>>> |rg |
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --- Comment #8 from Andrew Morton <[email protected]>
>>>>> 2010-12-08 18:30:32 --- OK, I reassigned it to NFS. If that was wrong
>>>>> then at least the NFS guys should be able to help point things in the
>>>>> right direction.
>>>>
>>>> <Switching to email interface. Please do not edit the bugzilla entry
>>>> directly, since that will lose the above Cc information>
>
> I missed the note above, so I have to duplicate things a bit, sorry. I removed
> the bugzilla Cc so the report does not get spammed.
>
>>>>
>>>>
>>>> Chuck,
>>>>
>>>> Stefan appears to be hitting a panic in the nfs_umount() call from
>>>> nfs_walk_authlist(). Can you take a look, please?
>>>
>>> Recv'd. I'll have a look.
>>
>> Apologies in advance for the attachment. There are a few other clean ups
>> that can be done, but this seems to be the minimal fix. Please try this and
>> let us know if it addresses your panic.
>>
>>
>>
>>
>>
>>
>> The faulty commit went in a while back, so this patch is probably appropriate
>> for stable kernels (back to 2.6.31, where this was likely introduced).
>>
>
> I can confirm that this was the root cause of the crash. The SHA referenced as
> the offending commit went in with a 2.6.32-rc, so I added the stable info
> accordingly. I also added my Tested-by and slightly modified the bug reference
> format (Chuck, I hope this is ok).

Looks good to me. I assume, since Trond still owns the kernel.org bugzilla, that he will close this loop? Let me know if I need to do anything else.

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2010-12-09 17:25:36

by Stefan Bader

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing

On 12/09/2010 04:23 PM, Chuck Lever wrote:
>
> On Dec 9, 2010, at 5:12 AM, Stefan Bader wrote:
>
>> On 12/09/2010 01:19 AM, Chuck Lever wrote:
>>> Hi Stefan-
>>>
>>> On Dec 8, 2010, at 3:35 PM, Chuck Lever wrote:
>>>
>>>>
>>>> On Dec 8, 2010, at 2:03 PM, Trond Myklebust wrote:
>>>>
>>>>> On Wed, 2010-12-08 at 18:30 +0000, [email protected]
>>>>> wrote:
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>>>>>>
>>>>>>
>>>>>> Andrew Morton <[email protected]> changed:
>>>>>>
>>>>>> What |Removed |Added
>>>>>> ----------------------------------------------------------------------------
>>>>>> Component|Other |NFS
>>>>>> AssignedTo|[email protected] |[email protected]
>>>>>> |rg |
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --- Comment #8 from Andrew Morton <[email protected]>
>>>>>> 2010-12-08 18:30:32 --- OK, I reassigned it to NFS. If that was wrong
>>>>>> then at least the NFS guys should be able to help point things in the
>>>>>> right direction.
>>>>>
>>>>> <Switching to email interface. Please do not edit the bugzilla entry
>>>>> directly, since that will lose the above Cc information>
>>
>> I missed the note above, so I have to duplicate things a bit, sorry. I removed
>> the bugzilla Cc so the report does not get spammed.
>>
>>>>>
>>>>>
>>>>> Chuck,
>>>>>
>>>>> Stefan appears to be hitting a panic in the nfs_umount() call from
>>>>> nfs_walk_authlist(). Can you take a look, please?
>>>>
>>>> Recv'd. I'll have a look.
>>>
>>> Apologies in advance for the attachment. There are a few other clean ups
>>> that can be done, but this seems to be the minimal fix. Please try this and
>>> let us know if it addresses your panic.
>>>
>>>
>>>
>>>
>>>
>>>
>>> The faulty commit went in a while back, so this patch is probably appropriate
>>> for stable kernels (back to 2.6.31, where this was likely introduced).
>>>
>>
>> I can confirm that this was the root cause of the crash. The SHA referenced as
>> the offending commit went in with a 2.6.32-rc, so I added the stable info
>> accordingly. I also added my Tested-by and slightly modified the bug reference
>> format (Chuck, I hope this is ok).
>
> Looks good to me. I assume, since Trond still owns the kernel.org bugzilla, that he will close this loop? Let me know if I need to do anything else.
>
Actually, someone spotted that the second ARRAY_SIZE should probably use
mnt_procedures3 instead of mnt_procedures. They happen to be the same size,
so the testing is still valid.

-Stefan
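
The ARRAY_SIZE mix-up described above can be illustrated with a small
standalone C sketch. This is not the kernel code and not the attached patch:
the struct layouts, the mnt_version1 and mnt_version3 names, the procedure
names and indices, and the proc_name() helper are all assumptions made for
illustration. Only mnt_procedures and mnt_procedures3 are names taken from
this thread. The point is that each version's entry count should be derived
from its own table, so that a later change to one table cannot silently leave
the other with a stale bound.

```c
#include <stddef.h>

/* Element count of a true array; same idea as the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical, simplified stand-ins for a procedure table entry
 * and a per-version descriptor. */
struct proc_info {
    const char *name;
};

struct version_info {
    unsigned int number;
    size_t nrprocs;                 /* number of slots in procs[] */
    const struct proc_info *procs;
};

/* Sparse tables: the highest designated index (3) makes each array
 * four slots long; unnamed slots are zero-filled. */
static const struct proc_info mnt_procedures[] = {
    [1] = { "MNT" },
    [3] = { "UMNT" },
};

static const struct proc_info mnt_procedures3[] = {
    [1] = { "MNT3" },
    [3] = { "UMNT3" },
};

static const struct version_info mnt_version1 = {
    .number  = 1,
    .nrprocs = ARRAY_SIZE(mnt_procedures),
    .procs   = mnt_procedures,
};

static const struct version_info mnt_version3 = {
    .number  = 3,
    /* Sized from its own table, not from mnt_procedures. */
    .nrprocs = ARRAY_SIZE(mnt_procedures3),
    .procs   = mnt_procedures3,
};

/* Bounds-checked lookup: returns NULL for an out-of-range index
 * or for a slot that was never filled in. */
static const char *proc_name(const struct version_info *v, size_t idx)
{
    if (idx >= v->nrprocs)
        return NULL;
    return v->procs[idx].name;
}
```

In this sketch both tables happen to have four slots, so using
ARRAY_SIZE(mnt_procedures) for version 3 would give the same number. That
matches the observation above that the mistake does not invalidate the testing,
but it would stop being harmless as soon as the two tables diverged.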

2010-12-09 10:12:53

by Stefan Bader

Subject: Re: [Bug 24302] Kernel crashes when repeatedly trying to mount nfs share that is failing

On 12/09/2010 01:19 AM, Chuck Lever wrote:
> Hi Stefan-
>
> On Dec 8, 2010, at 3:35 PM, Chuck Lever wrote:
>
>>
>> On Dec 8, 2010, at 2:03 PM, Trond Myklebust wrote:
>>
>>> On Wed, 2010-12-08 at 18:30 +0000, [email protected]
>>> wrote:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=24302
>>>>
>>>>
>>>> Andrew Morton <[email protected]> changed:
>>>>
>>>> What |Removed |Added
>>>> ----------------------------------------------------------------------------
>>>> Component|Other |NFS
>>>> AssignedTo|[email protected] |[email protected]
>>>> |rg |
>>>>
>>>>
>>>>
>>>>
>>>> --- Comment #8 from Andrew Morton <[email protected]>
>>>> 2010-12-08 18:30:32 --- OK, I reassigned it to NFS. If that was wrong
>>>> then at least the NFS guys should be able to help point things in the
>>>> right direction.
>>>
>>> <Switching to email interface. Please do not edit the bugzilla entry
>>> directly, since that will lose the above Cc information>

I missed the note above, so I have to duplicate things a bit, sorry. I removed
the bugzilla Cc so the report does not get spammed.

>>>
>>>
>>> Chuck,
>>>
>>> Stefan appears to be hitting a panic in the nfs_umount() call from
>>> nfs_walk_authlist(). Can you take a look, please?
>>
>> Recv'd. I'll have a look.
>
> Apologies in advance for the attachment. There are a few other clean ups
> that can be done, but this seems to be the minimal fix. Please try this and
> let us know if it addresses your panic.
>
>
>
>
>
>
> The faulty commit went in a while back, so this patch is probably appropriate
> for stable kernels (back to 2.6.31, where this was likely introduced).
>

I can confirm that this was the root cause of the crash. The SHA referenced as
the offending commit went in with a 2.6.32-rc, so I added the stable info
accordingly. I also added my Tested-by and slightly modified the bug reference
format (Chuck, I hope this is ok).

-Stefan


Attachments:
0001-NFS-Fix-panic-after-nfs_umount.patch (1.78 kB)