2022-09-22 21:50:48

by Kurt Garloff

Subject: linux-5.15.69 breaks nfs client

Hi,

a freshly compiled 5.15.69 kernel showed hangs with NFS.
Typically mkdir would end up in a 'D' process state, but I
have seen ls -l hanging as well.
Server is kernel NFS 5.15.69.

After reverting the last three NFS related commits,
a68a734b19af NFS: Fix WARN_ON due to unionization of nfs_inode.nrequests
3b97deb4abf5 NFS: Fix another fsync() issue after a server reboot
31b992b3c39b NFS: Save some space in the inode

things work normally again.

As you can see, I suspected 31b992b3c39b ...

I know this report is light on details; if nothing like this has been
reported yet, let me know and I'll try to find some time to investigate
further.

PS: Please keep me on Cc, I'm not subscribed to linux-nfs.

Best,

--
Kurt Garloff <[email protected]>
Cologne, Germany


Subject: Re: linux-5.15.69 breaks nfs client

Hi, this is your Linux kernel regression tracker. CCing the regression
mailing list, as it should be in the loop for all regressions, as
explained here:
https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
Also CCing the stable ml, the NFS maintainers, and the authors of
31b992b3c39b, too.

On 22.09.22 23:46, Kurt Garloff wrote:
>
> a freshly compiled 5.15.69 kernel showed hangs with NFS.
> Typically mkdir would end up in a 'D' process state, but I
> have seen ls -l hanging as well.
> Server is kernel NFS 5.15.69.
>
> After reverting the last three NFS related commits,
> a68a734b19af NFS: Fix WARN_ON due to unionization of nfs_inode.nrequests
> 3b97deb4abf5 NFS: Fix another fsync() issue after a server reboot
> 31b992b3c39b NFS: Save some space in the inode
>
> things work normally again.
>
> As you can see, I suspected 31b992b3c39b ...

FWIW, that's e591b298d7ec in mainline.

> I know this report is light on details; if nothing like this has been
> reported yet, let me know and I'll try to find some time to investigate
> further.
>
> PS: Please keep me on Cc, I'm not subscribed to linux-nfs.

I wonder if this is a dup of this report:

https://lore.kernel.org/all/[email protected]/

In that thread Trond mentioned
```
I believe this is a dependency that was introduced by the back port of
commit e591b298d7ec ("NFS: Save some space in the inode") into 5.15.68.
So the reason it wasn't seen is because the change is very recent.

FYI Greg and Sasha: please also consider pulling 6e176d47160c ("NFSv4:
Fixes for nfs4_inode_return_delegation()") into that stable series.
```

Anyway, for the rest of this mail:
[TLDR: I'm adding this regression report to the list of tracked
regressions; all text from me you find below is based on a few template
paragraphs you might have encountered already in similar form.]

Thanks for the report. To be sure the issue below doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression
tracking bot:

#regzbot ^introduced 31b992b3c39b
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply -- ideally with also
telling regzbot about it, as explained here:
https://linux-regtracking.leemhuis.info/tracked-regression/

Reminder for developers: When fixing the issue, add 'Link:' tags
pointing to the report (the mail this one replies to), as explained
in the Linux kernel's documentation; the webpage above explains why this is
important for tracked regressions.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply; it's in everyone's interest to set the public record straight.

Subject: Re: linux-5.15.69 breaks nfs client #forregzbot

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 23.09.22 09:46, Thorsten Leemhuis wrote:
> Hi, this is your Linux kernel regression tracker. CCing the regression
> mailing list, as it should be in the loop for all regressions, as
> explained here:
> https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html
> Also CCing the stable ml, the NFS maintainers, and the authors of
> 31b992b3c39b, too.
>
> On 22.09.22 23:46, Kurt Garloff wrote:
>>
>> a freshly compiled 5.15.69 kernel showed hangs with NFS.
>> Typically mkdir would end up in a 'D' process state, but I
>> have seen ls -l hanging as well.
>> Server is kernel NFS 5.15.69.
>>
>> After reverting the last three NFS related commits,
>> a68a734b19af NFS: Fix WARN_ON due to unionization of nfs_inode.nrequests
>> 3b97deb4abf5 NFS: Fix another fsync() issue after a server reboot
>> 31b992b3c39b NFS: Save some space in the inode
>>
>> things work normally again.
>>
>> As you can see, I suspected 31b992b3c39b ...
>
> FWIW, that's e591b298d7ec in mainline.
>
>> I know this report is light on details; if nothing like this has been
>> reported yet, let me know and I'll try to find some time to investigate
>> further.
>>
>> PS: Please keep me on Cc, I'm not subscribed to linux-nfs.
>
> [...]
>
> #regzbot ^introduced 31b992b3c39b
> #regzbot ignore-activity

#regzbot fixed-by: 27bf7a5d11987