Hi,
Bisect identified commit 7fd461c47c6c ("NFSv4.2: Change the default
KConfig value for READ_PLUS") as the one leading to a NULL pointer
exception when mounting an NFS root on an NFSv4 client:
[ 25.739003] systemd[1]: Set hostname to <odroidhc1>.
[ 25.771714] systemd[1]: Failed to bump fs.file-max, ignoring: Invalid
argument
[ 26.199478] 8<--- cut here ---
[ 26.201366] Unable to handle kernel NULL pointer dereference at
virtual address 00000004
...
[ 26.555522] mmiocpy from xdr_inline_decode+0xec/0x16c
[ 26.560628] xdr_inline_decode from nfs4_xdr_dec_read_plus+0x178/0x358
[ 26.567130] nfs4_xdr_dec_read_plus from call_decode+0x204/0x304
Full OOPS attached. Full log available here:
https://krzk.eu/#/builders/21/builds/3901/steps/15/logs/serial0
Disabling NFS_V4_2_READ_PLUS fixes the issue, so the commit itself is
obviously not the root cause; rather, enabling the option by default
exposed the regression.
I have not yet bisected which commit actually introduced it, with
NFS_V4_2_READ_PLUS enabled in every config.
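For anyone needing an interim workaround, it is a one-line change in the
client kernel configuration (a .config fragment; option name as in
mainline):

```
# Workaround: build the NFS client without READ_PLUS support
# CONFIG_NFS_V4_2_READ_PLUS is not set
```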
Some details about the platform:
1. Arch ARM Linux
2. exynos_defconfig
3. Odroid HC1 board with ARMv7, octa-core (Cortex-A7+A15), Exynos5422 SoC
4. systemd, boot up with static IP set in kernel command line
5. No swap
6. Kernel, DTB and initramfs are downloaded with TFTP
7. NFS root (NFS client) mounted from a NFSv4 server
Best regards,
Krzysztof
[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]
On 07.01.23 16:44, Krzysztof Kozlowski wrote:
> [...]
Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:
#regzbot ^introduced 7fd461c47
#regzbot title nfs: NULL pointer dereference since NFS_V4_2_READ_PLUS is
enabled by default
#regzbot ignore-activity
This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.
Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
Hi Krzysztof,
> On Jan 8, 2023, at 08:25, Krzysztof Kozlowski <[email protected]> wrote:
>
> [...]
>
> When every kernel is built with NFS_V4_2_READ_PLUS, bisect pointed to:
> d3b00a802c84 ("NFS: Replace the READ_PLUS decoding code")
>
> commit d3b00a802c845a6021148ce2e669b5a0b5729959
> Author: Anna Schumaker <[email protected]>
> Date: Thu Jul 21 14:21:34 2022 -0400
>
> NFS: Replace the READ_PLUS decoding code
>
> We now take a 2-step process that allows us to place data and hole
> segments directly at their final position in the xdr_stream without
> needing to do a bunch of redundant copies to expand holes. Due to the
> variable lengths of each segment, the xdr metadata might cross page
> boundaries which I account for by setting a small scratch buffer so
> xdr_inline_decode() won't fail.
>
> Signed-off-by: Anna Schumaker <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> With a trace:
> [ 25.898462] systemd[1]: Set hostname to <odroidhc1>.
> [ 25.933746] systemd[1]: Failed to bump fs.file-max, ignoring: Invalid
> argument
> [ 25.986237] random: crng init done
> [ 26.264564] 8<--- cut here ---
> [ 26.266823] Unable to handle kernel NULL pointer dereference at
> virtual address 00000fe8
> ...
> [ 26.597263] nfs4_xdr_dec_read_plus from call_decode+0x204/0x304
> [ 26.603222] call_decode from __rpc_execute+0xd0/0x890
> [ 26.608328] __rpc_execute from rpc_async_schedule+0x1c/0x34
> [ 26.613960] rpc_async_schedule from process_one_work+0x294/0x790
> [ 26.620030] process_one_work from worker_thread+0x54/0x518
> [ 26.625570] worker_thread from kthread+0xf4/0x128
> [ 26.630336] kthread from ret_from_fork+0x14/0x2c
>
Is this test being run against a 6.2-rc2 server, or is it an older server platform? We know there were bugs in older server implementations, so the question is whether this might be a problem with handling a bad/corrupt RPC reply from the server, or whether it is happening against code that is supposed to have been fixed?
Thanks
Trond
_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]
On 08/01/2023 18:09, Trond Myklebust wrote:
> [...]
>
> Is this test being run against a 6.2-rc2 server, or is it an older server platform? We know there were bugs in older server implementations, so the question is whether this might be a problem with handling a bad/corrupt RPC reply from the server, or whether it is happening against code that is supposed to have been fixed?
I would say that a buggy server should not cause NULL pointer
dereferences on the client. Otherwise this is a perfect recipe for a
rogue server on the network to start crashing clients and running
exploits... Imagine a machine in a local company network, compromised
through some other means, now running a server with an NFS share named
"HR salary data" or "HR planned layoffs"; unsuspecting people on that
network access it, leading to an exploit of the NFS code on their side...
The server is a Raspberry Pi 3, kernel 5.10.92-2-rpi-legacy-ARCH,
which means it is not the latest stable, so I need to update anyway.
Best regards,
Krzysztof
On 09/01/2023 09:14, Krzysztof Kozlowski wrote:
> [...]
>
I updated the server kernel to 5.15.84-3-rpi-ARCH, which is close to the
latest stable, and I can still reproduce the issue. Therefore:
1. It is visible with two stable kernels (one old, one new) on the server.
2. A buggy or rogue server should not cause NULL pointer dereferences on
remote devices...
Best regards,
Krzysztof
> On Jan 9, 2023, at 03:42, Krzysztof Kozlowski <[email protected]> wrote:
>
> [...]
>
The nfsd READ_PLUS code is borked up until 6.2-rc1. I thought we had a server option to disable that code, but it seems that is not the case.
Chuck + Anna, can we please send a stable patch to rip out that code altogether from all the older versions? If we want to enable READ_PLUS by default on the client, then we cannot do so if the majority of NFSv4.2 servers out there are running a borked implementation.
I do agree that we cannot allow buggy servers to cause memory corruption in the client code, so I propose that we revert the Kconfig default setting change again until both the client code and the legacy servers have been fixed. Anna?
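For reference, the proposed revert amounts to flipping the default back
in fs/nfs/Kconfig; roughly (the prompt text and dependencies below are
approximate, from memory rather than checked against the current tree):

```
config NFS_V4_2_READ_PLUS
	bool "NFS: Enable support for the NFSv4.2 READ_PLUS operation"
	depends on NFS_V4_2
	default n
```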
Thanks
Trond
> On Jan 9, 2023, at 9:44 AM, Trond Myklebust <[email protected]> wrote:
>
> [...]
>
> The nfsd READ_PLUS code is borked up and until 6.2-rc1. I thought we had a server option to disable that code, but it seems as if that is not the case.
> Chuck + Anna, can we please send a stable patch to rip out that code altogether from all the older versions? If we want to enable READ_PLUS by default on the client, then we cannot do so if the majority of NFSv4.2 servers out there are running a borked implementation.
>
> I do agree that we cannot allow buggy
or malicious, or non-Linux,
> servers to cause memory corruption in the client code, so I propose that we revert the Kconfig default setting change again until both the client code and the legacy servers have been fixed.
I stand ready to receive and apply server-side fixes, as you suggested.
However IMO it would be most responsible to go straight for a client code fix. The Kconfig setting is a red herring (as delicious as that might sound). Any thoughts about how difficult that fix might be?
--
Chuck Lever
On Mon, Jan 9, 2023 at 10:12 AM Chuck Lever III <[email protected]> wrote:
>
> [...]
>
> I stand ready to receive and apply server-side fixes, as you suggested.
>
> However IMO it would be most responsible to go straight for a client code fix. The Kconfig setting is a red herring (as delicious as that might sound). Any thoughts about how difficult that fix might be?
I'm wondering about how hard the fix might be as well, since it could
be a legitimate bug or some error checking that I've overlooked. I've
gotten as far as installing a 5.15 server in my testing setup, so
we'll see if I'm able to reproduce the crash this morning.
Anna
>
>
> --
> Chuck Lever
>
>
>
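As an illustration of the client-side hardening being discussed (rejecting a short or inconsistent READ_PLUS reply instead of dereferencing it), here is a minimal Python sketch. The segment layout loosely follows RFC 7862's data_content4 (DATA = 0, HOLE = 1) with XDR opaque padding omitted; all names here are illustrative, not the kernel's.

```python
import struct

NFS4_CONTENT_DATA, NFS4_CONTENT_HOLE = 0, 1

def decode_read_plus_segments(buf: bytes):
    """Decode a simplified READ_PLUS segment list, validating every
    length field before use so a truncated or garbage reply raises an
    error instead of reading past the end of the buffer."""
    view, pos, segments = memoryview(buf), 0, []

    def take(n):
        # Bounds-check before consuming n bytes; reject, don't crash.
        nonlocal pos
        if pos + n > len(view):
            raise ValueError("reply truncated")
        chunk = view[pos:pos + n]
        pos += n
        return chunk

    (count,) = struct.unpack(">I", take(4))
    for _ in range(count):
        (seg_type,) = struct.unpack(">I", take(4))
        (offset,) = struct.unpack(">Q", take(8))
        if seg_type == NFS4_CONTENT_DATA:
            (length,) = struct.unpack(">I", take(4))
            segments.append(("data", offset, bytes(take(length))))
        elif seg_type == NFS4_CONTENT_HOLE:
            (length,) = struct.unpack(">Q", take(8))
            segments.append(("hole", offset, length))
        else:
            raise ValueError("unknown segment type")
    return segments
```

A reply whose segment count promises more bytes than the buffer actually carries fails the bounds check in `take()` and is rejected with ValueError, which stands in for the kind of defensive check being discussed.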
> On Jan 9, 2023, at 10:38 AM, Trond Myklebust <[email protected]> wrote:
>
> [...]
>
> A client fix is necessary, but not sufficient.
No argument from me. There is a problem on both sides.
> The older server READ_PLUS code is slower than an ordinary read for several filesystems, since it relies upon the (often poor) performance of lseek(SEEK_HOLE). Leaving legacy servers as-is would therefore still cause regressions for clients that default to trying READ_PLUS, even if the client code is fixed to not Oops.
Backporting eeadcb757945 ("NFSD: Simplify READ_PLUS") to stable kernels would address both concerns.
--
Chuck Lever
> On Jan 9, 2023, at 12:11 PM, Chuck Lever III <[email protected]> wrote:
>
> [...]
>
> Backporting eeadcb757945 ("NFSD: Simplify READ_PLUS") to stable kernels would address both concerns.
And by "both concerns" I mean it addresses the possible performance
regression and the server's malformed RPC reply.
The client code still needs to be fixed.
--
Chuck Lever
> On Jan 9, 2023, at 10:26 AM, Anna Schumaker <[email protected]> wrote:
>
> [...]
>
> I'm wondering about how hard the fix might be as well, since it could
> be a legitimate bug or some error checking that I've overlooked. I've
> gotten as far as installing a 5.15 server in my testing setup, so
> we'll see if I'm able to reproduce the crash this morning.
All current long-term stable kernels have 72d78717c6d0 ("nfsd: Fixes for nfsd4_encode_read_plus_data()"), and so does v5.10.92. I'm not aware of
any other problem with READ_PLUS except for performance. Let me know (or
file a bug) if I'm missing something!
For the possible performance regression, backporting eeadcb757945
("NFSD: Simplify READ_PLUS") is the proposed stable fix.
* v6.1.3 - eeadcb757945 applies cleanly
* v6.0.18 - eeadcb757945 applies with offsets
* v5.15.86 - eeadcb757945 needs 1e37d0e5bda4, then applies with offsets
* v5.10.162 - would need a hand-written version of eeadcb757945
* v5.4.228 - does not implement READ_PLUS
* v4.19.269 - does not implement READ_PLUS
* v4.14.302 - does not implement READ_PLUS
* v4.9.337 - EOL
If someone volunteers to build and test, I would not object to
a request to stable@ to apply eeadcb757945 to 5.15, 6.0, and 6.1.
--
Chuck Lever
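The per-branch applicability checks above can be mechanized. A rough shell sketch (the repo path and branch names in the usage line are assumptions; `cherry-pick -n` stages without committing, and the tree is reset afterwards):

```shell
# check_backport: report whether FIX cherry-picks cleanly onto BRANCH in REPO,
# without committing anything or leaving the working tree dirty.
check_backport() {
    repo=$1 branch=$2 fix=$3
    git -C "$repo" checkout -q "$branch" || return 1
    if git -C "$repo" cherry-pick -n "$fix" >/dev/null 2>&1; then
        echo "$branch: applies"
    else
        git -C "$repo" cherry-pick --abort >/dev/null 2>&1
        echo "$branch: needs manual backport"
    fi
    git -C "$repo" reset --hard -q    # drop the uncommitted pick
}
```

Usage might look like `for b in linux-6.1.y linux-6.0.y linux-5.15.y; do check_backport ~/src/linux-stable "$b" eeadcb757945; done`; actually building and testing each result is still the part that needs a volunteer.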
Hi Krzysztof,
On Mon, Jan 9, 2023 at 3:46 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> [...]
>
> I updated the kernel to 5.15.84-3-rpi-ARCH which is pretty close to
> latest stable and I can reproduce the issue. Therefore:
> 1. It is visible on two stable (one new, one old) kernels on the server,
> 2. Buggy or rogue server should not cause NULL pointer on remote devices...
I'm still working on this issue, but I haven't been able to reproduce
it with my setup yet. I was hoping I could ask you a couple of
questions:
- Are both client and server run on a Raspberry Pi 3?
- Is there a specific workload that triggers the bug, or is it just
during boot when using nfsroot?
- Would it be possible to get a Wireshark network trace of the crash?
(You'll have to run this on the server, since the client uses nfsroot.)
- Can you share your export options from /etc/exports and the mount
options that the client uses?
Thanks,
Anna
>
> Best regards,
> Krzysztof
>
On 22/01/2023 23:25, Anna Schumaker wrote:
> Hi Krzysztof,
>
> On Mon, Jan 9, 2023 at 3:46 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>> [...]
>>
>> I updated the kernel to 5.15.84-3-rpi-ARCH which is pretty close to
>> latest stable and I can reproduce the issue. Therefore:
>> 1. It is visible on two stable (one new, one old) kernels on the server,
>> 2. Buggy or rogue server should not cause NULL pointer on remote devices...
>
> I'm still working on this issue, but I haven't been able to reproduce
> it with my setup at all yet. I was hoping I could ask you a couple of
> questions?
>
> - Are both client and server run on a Raspberry Pi 3?
No, server is Rpi3, client is Odroid HC1.
> - Is there a specific workload that triggers the bug, or is it just
> during boot when using nfsroot?
No specific workload - just mounting the system with nfsroot. You have the
full logs there, as it is reproducible every time.
> - Would it be possible to get a Wireshark network trace of the crash
> (you'll have to run this on the server due to nfsroot).
I'll check.
> - Can you share your export options from /etc/exports and the mount
> options that the client uses?
The exports have been the same for five years or more:
/srv/nfs/odroidhc1 192.168.1.0/24(rw,async,no_root_squash,no_subtree_check)
/srv/nfs/odroidhc1-home 192.168.1.0/24(rw,async,no_root_squash,no_subtree_check)
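The client side is the standard nfsroot kernel command line with a static
IP; roughly like the following (the addresses below are placeholders for
illustration, not the exact values from my setup):

```
root=/dev/nfs rw ip=192.168.1.3::192.168.1.1:255.255.255.0:odroidhc1:eth0:off
nfsroot=192.168.1.2:/srv/nfs/odroidhc1,vers=4.2
```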
>
> Thanks,
> Anna
>
>>
>> Best regards,
>> Krzysztof
>>
Best regards,
Krzysztof
Hi Krzysztof,
On Mon, Jan 23, 2023 at 2:58 AM Krzysztof Kozlowski
<[email protected]> wrote:
> > - Would it be possible to get a Wireshark network trace of the crash
> > (you'll have to run this on the server due to nfsroot).
>
> I'll check.
Any luck getting the wireshark trace? I don't have access to the
Odroid HC1 board, so all my attempts at reproducing the problem have
been with qemu & libvirt, which doesn't seem to be hitting this issue.
I was also wondering if it would be possible to turn on KASAN in your
kernel, which should give us a little more info?
Thanks,
Anna
On 09/02/2023 19:22, Anna Schumaker wrote:
> Hi Krzysztof,
>
> On Mon, Jan 23, 2023 at 2:58 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>
> Any luck getting the wireshark trace? I don't have access to the
> Odroid HC1 board, so all my attempts at reproducing the problem have
> been with qemu & libvirt, which doesn't seem to be hitting this issue.
I'll send the pcap dump off-list. The failure is in a similar place:
[ 23.876714] systemd[1]: Hostname set to <odroidhc1>.
[ 24.061568] systemd[1]: bpf-lsm: BPF LSM hook not enabled in the
kernel, BPF LSM not supported
[ 24.078774] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL,
pid=1 'systemd'
[ 24.220531] 8<--- cut here ---
[ 24.222333] Unable to handle kernel NULL pointer dereference at
virtual address 00000004 when read
>
> I was also wondering if it would be possible to turn on KASAN in your
> kernel, which should give us a little more info?
I'll try with KASAN.
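For reference, this amounts to enabling the following on top of
exynos_defconfig (generic KASAN should be available on 32-bit ARM in
recent kernels, as far as I know):

```
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
```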
Best regards,
Krzysztof
On 10/02/2023 09:41, Krzysztof Kozlowski wrote:
>
>
>>
>> I was also wondering if it would be possible to turn on KASAN in your
>> kernel, which should give us a little more info?
>
> I'll try with KASAN.
Not much new from KASAN, except that the kernel
continues to boot and run:
[ 44.722846] vdd_vmem: disabling
[ 44.793465] systemd[1]: Hostname set to <odroidhc1>.
[ 45.357929] systemd[1]: bpf-lsm: BPF LSM hook not enabled in the kernel, BPF LSM not supported
[ 45.980476] ==================================================================
[ 45.986372] BUG: KASAN: null-ptr-deref in xdr_inline_decode+0x140/0x200
[ 45.992929] Read of size 4092 at addr 00000004 by task kworker/u16:3/71
[ 45.999513]
[ 46.000940] CPU: 6 PID: 71 Comm: kworker/u16:3 Not tainted 6.2.0-rc7-00018-g0983f6bf2bfc #222
[ 46.009504] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 46.015542] Workqueue: rpciod rpc_async_schedule
[ 46.020123] unwind_backtrace from show_stack+0x10/0x14
[ 46.025323] show_stack from dump_stack_lvl+0x58/0x70
[ 46.030301] dump_stack_lvl from kasan_report+0xa8/0xe0
[ 46.035501] kasan_report from kasan_check_range+0x94/0x1a0
[ 46.041048] kasan_check_range from memcpy+0x28/0x68
[ 46.045985] memcpy from xdr_inline_decode+0x140/0x200
[ 46.051098] xdr_inline_decode from nfs4_xdr_dec_read_plus+0x298/0x5b8
[ 46.057602] nfs4_xdr_dec_read_plus from call_decode+0x39c/0x530
[ 46.063581] call_decode from __rpc_execute+0x1f4/0xc5c
[ 46.068776] __rpc_execute from rpc_async_schedule+0x2c/0x4c
[ 46.074411] rpc_async_schedule from process_one_work+0x51c/0xc44
[ 46.080478] process_one_work from worker_thread+0x9c/0x7c0
[ 46.086022] worker_thread from kthread+0x16c/0x1b8
[ 46.090872] kthread from ret_from_fork+0x14/0x2c
[ 46.095550] Exception stack(0xf0ba3fb0 to 0xf0ba3ff8)
[ 46.100580] 3fa0: 00000000 00000000 00000000 00000000
[ 46.108740] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 46.116885] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 46.123461] ==================================================================
[ 46.130774] Disabling lock debugging due to kernel taint
[ 50.848579] systemd[1]: Queued start job for default target Graphical Interface.
[ 50.877068] systemd[1]: Created slice Slice /system/getty.
The decoded stacktrace is (this is the master branch):
[ 46.020123] unwind_backtrace from show_stack (arch/arm/kernel/traps.c:258)
[ 46.025323] show_stack from dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 46.030301] dump_stack_lvl from kasan_report (mm/kasan/report.c:184 mm/kasan/report.c:519)
[ 46.035501] kasan_report from kasan_check_range (mm/kasan/generic.c:173 mm/kasan/generic.c:189)
[ 46.041048] kasan_check_range from memcpy (mm/kasan/shadow.c:65)
[ 46.045985] memcpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
[ 46.051098] xdr_inline_decode from nfs4_xdr_dec_read_plus (fs/nfs/nfs42xdr.c:1063 fs/nfs/nfs42xdr.c:1149 fs/nfs/nfs42xdr.c:1360 fs/nfs/nfs42xdr.c:1343)
[ 46.057602] nfs4_xdr_dec_read_plus from call_decode (net/sunrpc/clnt.c:2595)
[ 46.063581] call_decode from __rpc_execute (include/asm-generic/bitops/generic-non-atomic.h:128 net/sunrpc/sched.c:954)
[ 46.068776] __rpc_execute from rpc_async_schedule (include/linux/sched/mm.h:336 net/sunrpc/sched.c:1035)
[ 46.074411] rpc_async_schedule from process_one_work (kernel/workqueue.c:2294)
[ 46.080478] process_one_work from worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
[ 46.086022] worker_thread from kthread (kernel/kthread.c:376)
[ 46.090872] kthread from ret_from_fork (arch/arm/kernel/entry-common.S:149)
Best regards,
Krzysztof
Hi Krzysztof,
On Fri, Feb 10, 2023 at 3:53 AM Krzysztof Kozlowski
<[email protected]> wrote:
> [ 46.045985] memcpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
Actually, this part is really useful. net/sunrpc/xdr.c:1419 points to
the memcpy in xdr_copy_to_scratch(), which has me wondering if I'm
incorrectly setting up the xdr scratch buffer that READ_PLUS uses for
decoding. Can you try this patch and let me know if it helps?
From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
From: Anna Schumaker <[email protected]>
Date: Fri, 10 Feb 2023 15:50:22 -0500
Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
Instead of using a tiny scratch buffer, we should use a full scratch
page to match how other NFSv4 operations handle scratch data.
Signed-off-by: Anna Schumaker <[email protected]>
---
fs/nfs/nfs42xdr.c | 4 ++--
fs/nfs/nfs4proc.c | 14 ++++++++++----
include/linux/nfs_xdr.h | 1 +
3 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/fs/nfs/nfs42xdr.c b/fs/nfs/nfs42xdr.c
index d80ee88ca996..702567d5b1db 100644
--- a/fs/nfs/nfs42xdr.c
+++ b/fs/nfs/nfs42xdr.c
@@ -1122,7 +1122,6 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
uint32_t segments;
struct read_plus_segment *segs;
int status, i;
- char scratch_buf[16];
__be32 *p;
status = decode_op_hdr(xdr, OP_READ_PLUS);
@@ -1143,7 +1142,6 @@ static int decode_read_plus(struct xdr_stream *xdr, struct nfs_pgio_res *res)
if (!segs)
return -ENOMEM;
- xdr_set_scratch_buffer(xdr, &scratch_buf, sizeof(scratch_buf));
status = -EIO;
for (i = 0; i < segments; i++) {
status = decode_read_plus_segment(xdr, &segs[i]);
@@ -1348,6 +1346,8 @@ static int nfs4_xdr_dec_read_plus(struct rpc_rqst *rqstp,
struct compound_hdr hdr;
int status;
+ xdr_set_scratch_page(xdr, res->scratch);
+
status = decode_compound_hdr(xdr, &hdr);
if (status)
goto out;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 40d749f29ed3..5c589d0bd9e9 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5440,6 +5440,8 @@ static bool nfs4_read_plus_not_supported(struct rpc_task *task,
static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
{
+ if (hdr->res.scratch)
+ __free_page(hdr->res.scratch);
if (!nfs4_sequence_done(task, &hdr->res.seq_res))
return -EAGAIN;
if (nfs4_read_stateid_changed(task, &hdr->args))
@@ -5453,12 +5455,16 @@ static int nfs4_read_done(struct rpc_task *task, struct nfs_pgio_header *hdr)
}
#if defined CONFIG_NFS_V4_2 && defined CONFIG_NFS_V4_2_READ_PLUS
-static void nfs42_read_plus_support(struct nfs_pgio_header *hdr,
+static bool nfs42_read_plus_support(struct nfs_pgio_header *hdr,
struct rpc_message *msg)
{
/* Note: We don't use READ_PLUS with pNFS yet */
- if (nfs_server_capable(hdr->inode, NFS_CAP_READ_PLUS) && !hdr->ds_clp)
+ if (nfs_server_capable(hdr->inode, NFS_CAP_READ_PLUS) && !hdr->ds_clp) {
msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ_PLUS];
+ hdr->res.scratch = alloc_page(GFP_KERNEL);
+ return hdr->res.scratch != NULL;
+ }
+ return false;
}
#else
static void nfs42_read_plus_support(struct nfs_pgio_header *hdr,
@@ -5473,8 +5479,8 @@ static void nfs4_proc_read_setup(struct nfs_pgio_header *hdr,
hdr->timestamp = jiffies;
if (!hdr->pgio_done_cb)
hdr->pgio_done_cb = nfs4_read_done_cb;
- msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
- nfs42_read_plus_support(hdr, msg);
+ if (!nfs42_read_plus_support(hdr, msg))
+ msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_READ];
nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0);
}
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index e86cf6642d21..6d821aaf0b1a 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -670,6 +670,7 @@ struct nfs_pgio_res {
struct {
unsigned int replen; /* used by read */
int eof; /* used by read */
+ struct page *scratch; /* used by read */
};
struct {
struct nfs_writeverf * verf; /* used by write */
--
2.39.1
Thanks,
Anna
On 10/02/2023 21:55, Anna Schumaker wrote:
> Hi Krzysztof,
>
> On Fri, Feb 10, 2023 at 3:53 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>> [ 46.045985] memcpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
>
> Actually, this part is really useful. net/sunrpc/xdr.c:1419 points to
> the memcpy in xdr_copy_to_scratch(), which has me wondering if I'm
> incorrectly setting up the xdr scratch buffer that READ_PLUS uses for
> decoding. Can you try this patch and let me know if it helps?
>
> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> From: Anna Schumaker <[email protected]>
> Date: Fri, 10 Feb 2023 15:50:22 -0500
> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
>
The patch is corrupted - maybe the mail program reformatted it when sending:
Applying: NFSv4.2: Rework scratch handling for READ_PLUS
error: corrupt patch at line 12
Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
Best regards,
Krzysztof
On Sat, Feb 11, 2023 at 6:23 AM Krzysztof Kozlowski
<[email protected]> wrote:
> > From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> > From: Anna Schumaker <[email protected]>
> > Date: Fri, 10 Feb 2023 15:50:22 -0500
> > Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
> >
>
> Patch is corrupted - maybe mail program reformatted it when sending:
>
> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
> error: corrupt patch at line 12
> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
That's weird. I wasn't expecting gmail to reformat the patch but I
guess it did. I've added it as an attachment so that shouldn't happen
again.
Sorry about that!
Anna
>
>
> Best regards,
> Krzysztof
>
On 12/02/2023 15:05, Anna Schumaker wrote:
>>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
>>> From: Anna Schumaker <[email protected]>
>>> Date: Fri, 10 Feb 2023 15:50:22 -0500
>>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
>>>
>>
>> Patch is corrupted - maybe mail program reformatted it when sending:
>>
>> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
>> error: corrupt patch at line 12
>> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
>
> That's weird. I wasn't expecting gmail to reformat the patch but I
> guess it did. I've added it as an attachment so that shouldn't happen
> again.
Still null ptr (built on 420b2d4 with your patch):
[ 144.690844] mmiocpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
[ 144.695950] xdr_inline_decode from nfs4_xdr_dec_read_plus (fs/nfs/nfs42xdr.c:1063 fs/nfs/nfs42xdr.c:1147 fs/nfs/nfs42xdr.c:1360 fs/nfs/nfs42xdr.c:1341)
[ 144.702452] nfs4_xdr_dec_read_plus from call_decode (net/sunrpc/clnt.c:2595)
[ 144.708429] call_decode from __rpc_execute (include/asm-generic/bitops/generic-non-atomic.h:128 net/sunrpc/sched.c:954)
[ 144.713538] __rpc_execute from rpc_async_schedule (include/linux/sched/mm.h:336 net/sunrpc/sched.c:1035)
[ 144.719170] rpc_async_schedule from process_one_work (include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
[ 144.725238] process_one_work from worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
[ 144.730782] worker_thread from kthread (kernel/kthread.c:378)
[ 144.735547] kthread from ret_from_fork (arch/arm/kernel/entry-common.S:149)
Best regards,
Krzysztof
On Tue, Feb 14, 2023 at 6:08 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 12/02/2023 15:05, Anna Schumaker wrote:
> >>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> >>> From: Anna Schumaker <[email protected]>
> >>> Date: Fri, 10 Feb 2023 15:50:22 -0500
> >>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
> >>>
> >>
> >> Patch is corrupted - maybe mail program reformatted it when sending:
> >>
> >> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
> >> error: corrupt patch at line 12
> >> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
> >
> > That's weird. I wasn't expecting gmail to reformat the patch but I
> > guess it did. I've added it as an attachment so that shouldn't happen
> > again.
>
> Still null ptr (built on 420b2d4 with your patch):
>
> [ 144.690844] mmiocpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
> [ 144.695950] xdr_inline_decode from nfs4_xdr_dec_read_plus (fs/nfs/nfs42xdr.c:1063 fs/nfs/nfs42xdr.c:1147 fs/nfs/nfs42xdr.c:1360 fs/nfs/nfs42xdr.c:1341)
> [ 144.702452] nfs4_xdr_dec_read_plus from call_decode (net/sunrpc/clnt.c:2595)
> [ 144.708429] call_decode from __rpc_execute (include/asm-generic/bitops/generic-non-atomic.h:128 net/sunrpc/sched.c:954)
> [ 144.713538] __rpc_execute from rpc_async_schedule (include/linux/sched/mm.h:336 net/sunrpc/sched.c:1035)
> [ 144.719170] rpc_async_schedule from process_one_work (include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
> [ 144.725238] process_one_work from worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
> [ 144.730782] worker_thread from kthread (kernel/kthread.c:378)
> [ 144.735547] kthread from ret_from_fork (arch/arm/kernel/entry-common.S:149)
My 2 cents...
From what I can tell, READ_PLUS only calls xdr_inline_decode() for
"numbers" (eof, #segs, type, offset, length), and we always expect
__xdr_inline_decode() to return a non-NULL "p". But if
__xdr_inline_decode() returns NULL, the code falls back to
xdr_copy_to_scratch(), which ultimately calls the memcpy().
xdr_copy_to_scratch() expects the scratch buffer to be set up. However,
as I said, for the decode of numbers we don't set up the scratch
space, which then leads to this oops. Now, the reason
__xdr_inline_decode() would return a NULL pointer is that it ran out
of the xdr space it was provided, which comes from decode_read_plus_maxsz.
#define NFS42_READ_PLUS_DATA_SEGMENT_SIZE \
        (1 /* data_content4 */ + \
         2 /* data_info4.di_offset */ + \
         1 /* data_info4.di_length */)
#define decode_read_plus_maxsz  (op_decode_hdr_maxsz + \
                                 1 /* rpr_eof */ + \
                                 1 /* rpr_contents count */ + \
                                 NFS42_READ_PLUS_DATA_SEGMENT_SIZE)
While a data segment needs (2) + (1), a hole segment needs (2) + (2),
as both its offset and length are longs. And while a "correct" maxsz is
important for page alignment for reads, it might mean we are not
providing enough space when there are hole segments? It seems weird
that in the spec the hole length and the data length have different
types (long and int).
>
>
>
> Best regards,
> Krzysztof
>
[TLDR: This mail is primarily relevant for Linux regression tracking. A
change or fix related to the regression discussed in this thread was
posted or applied, but it did not use a Link: tag to point to the
report, as Linus and the documentation call for. Things happen, no
worries -- but now the regression tracking bot needs to be told manually
about the fix. See link in footer if these mails annoy you.]
On 08.01.23 09:52, Linux kernel regression tracking (#adding) wrote:
> On 07.01.23 16:44, Krzysztof Kozlowski wrote:
>>
>> Bisect identified commit 7fd461c47c6c ("NFSv4.2: Change the default
>> KConfig value for READ_PLUS") as one leading to NULL pointer exception
>> when mounting NFS root on NFSv4 client:
> [...]
> Thanks for the report. To be sure the issue doesn't fall through the
> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
> tracking bot:
>
> #regzbot ^introduced 7fd461c47
> #regzbot title nfs: NULL pointer dereference since NFS_V4_2_READ_PLUS is
> enabled by default
> #regzbot ignore-activity
#regzbot fix: 896e090eefedeb8a715ea19938a2791c32679
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
#regzbot ignore-activity
On 18/02/2023 05:42, Linux regression tracking #update (Thorsten
Leemhuis) wrote:
> [TLDR: This mail in primarily relevant for Linux regression tracking. A
> change or fix related to the regression discussed in this thread was
> posted or applied, but it did not use a Link: tag to point to the
> report, as Linus and the documentation call for. Things happen, no
> worries -- but now the regression tracking bot needs to be told manually
> about the fix. See link in footer if these mails annoy you.]
>
> On 08.01.23 09:52, Linux kernel regression tracking (#adding) wrote:
>> On 07.01.23 16:44, Krzysztof Kozlowski wrote:
>>>
>>> Bisect identified commit 7fd461c47c6c ("NFSv4.2: Change the default
>>> KConfig value for READ_PLUS") as one leading to NULL pointer exception
>>> when mounting NFS root on NFSv4 client:
>> [...]
>> Thanks for the report. To be sure the issue doesn't fall through the
>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>> tracking bot:
>>
>> #regzbot ^introduced 7fd461c47
>> #regzbot title nfs: NULL pointer dereference since NFS_V4_2_READ_PLUS is
>> enabled by default
>> #regzbot ignore-activity
>
> #regzbot fix: 896e090eefedeb8a715ea19938a2791c32679
I see it was posted and merged as "Revert "NFSv4.2: Change the default
KConfig value for READ_PLUS"". It's nice to give credit to people who
report bugs with a "Reported-by" tag.
Best regards,
Krzysztof
On 18.02.23 16:09, Krzysztof Kozlowski wrote:
> On 18/02/2023 05:42, Linux regression tracking #update (Thorsten
> Leemhuis) wrote:
>> [TLDR: This mail in primarily relevant for Linux regression tracking. A
>> change or fix related to the regression discussed in this thread was
>> posted or applied, but it did not use a Link: tag to point to the
>> report, as Linus and the documentation call for. Things happen, no
>> worries -- but now the regression tracking bot needs to be told manually
>> about the fix. See link in footer if these mails annoy you.]
>>
>> On 08.01.23 09:52, Linux kernel regression tracking (#adding) wrote:
>>> On 07.01.23 16:44, Krzysztof Kozlowski wrote:
>>>>
>>>> Bisect identified commit 7fd461c47c6c ("NFSv4.2: Change the default
>>>> KConfig value for READ_PLUS") as one leading to NULL pointer exception
>>>> when mounting NFS root on NFSv4 client:
>>> [...]
>>> Thanks for the report. To be sure the issue doesn't fall through the
>>> cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
>>> tracking bot:
>>>
>>> #regzbot ^introduced 7fd461c47
>>> #regzbot title nfs: NULL pointer dereference since NFS_V4_2_READ_PLUS is
>>> enabled by default
>>> #regzbot ignore-activity
>>
>> #regzbot fix: 896e090eefedeb8a715ea19938a2791c32679
>
> I see it was posted and merged as "Revert "NFSv4.2: Change the default
> KConfig value for READ_PLUS"". It's nice to give credits to people who
> report bugs with "Reported-by" tag.
Yup. And a "Link:" with the url to the report is missing as well (Linus
wants those, automatic regression tracking needs those [IOW: if they are
missing it causes me trouble, that's why I care], and the docs explain
this as well).
That's why I asked for those two tags, but I didn't even get a reply:
https://lore.kernel.org/all/[email protected]/
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
Hi Krzysztof,
On Tue, Feb 14, 2023 at 6:02 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 12/02/2023 15:05, Anna Schumaker wrote:
> >>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> >>> From: Anna Schumaker <[email protected]>
> >>> Date: Fri, 10 Feb 2023 15:50:22 -0500
> >>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
> >>>
> >>
> >> Patch is corrupted - maybe mail program reformatted it when sending:
> >>
> >> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
> >> error: corrupt patch at line 12
> >> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
> >
> > That's weird. I wasn't expecting gmail to reformat the patch but I
> > guess it did. I've added it as an attachment so that shouldn't happen
> > again.
>
> Still null ptr (built on 420b2d4 with your patch):
We're through the merge window and at rc1 now, so I can spend more
time scratching my head over your bug again. We've come up with a
patch (attached) that adds a bunch of printks to show us what the
kernel thinks is going on. Do you mind trying it out and letting us
know what gets printed out? You'll need to make sure
CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
Thanks,
Anna
>
> [ 144.690844] mmiocpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
> [ 144.695950] xdr_inline_decode from nfs4_xdr_dec_read_plus (fs/nfs/nfs42xdr.c:1063 fs/nfs/nfs42xdr.c:1147 fs/nfs/nfs42xdr.c:1360 fs/nfs/nfs42xdr.c:1341)
> [ 144.702452] nfs4_xdr_dec_read_plus from call_decode (net/sunrpc/clnt.c:2595)
> [ 144.708429] call_decode from __rpc_execute (include/asm-generic/bitops/generic-non-atomic.h:128 net/sunrpc/sched.c:954)
> [ 144.713538] __rpc_execute from rpc_async_schedule (include/linux/sched/mm.h:336 net/sunrpc/sched.c:1035)
> [ 144.719170] rpc_async_schedule from process_one_work (include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
> [ 144.725238] process_one_work from worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
> [ 144.730782] worker_thread from kthread (kernel/kthread.c:378)
> [ 144.735547] kthread from ret_from_fork (arch/arm/kernel/entry-common.S:149)
>
>
>
> Best regards,
> Krzysztof
>
On Mon, Mar 6, 2023 at 12:12 PM Anna Schumaker <[email protected]> wrote:
>
> Hi Krzysztof,
>
> On Tue, Feb 14, 2023 at 6:02 AM Krzysztof Kozlowski
> <[email protected]> wrote:
> >
> > On 12/02/2023 15:05, Anna Schumaker wrote:
> > >>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> > >>> From: Anna Schumaker <[email protected]>
> > >>> Date: Fri, 10 Feb 2023 15:50:22 -0500
> > >>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
> > >>>
> > >>
> > >> Patch is corrupted - maybe mail program reformatted it when sending:
> > >>
> > >> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
> > >> error: corrupt patch at line 12
> > >> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
> > >
> > > That's weird. I wasn't expecting gmail to reformat the patch but I
> > > guess it did. I've added it as an attachment so that shouldn't happen
> > > again.
> >
> > Still null ptr (built on 420b2d4 with your patch):
>
> We're through the merge window and at rc1 now, so I can spend more
> time scratching my head over your bug again. We've come up with a
> patch (attached) that adds a bunch of printks to show us what the
> kernel thinks is going on. Do you mind trying it out and letting us
> know what gets printed out? You'll need to make sure
> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
Hi Krzysztof,
Since you are the only one hitting the problem, could you be so kind
as to help with getting this resolved?
Thank you.
>
> Thanks,
> Anna
>
> >
> > [ 144.690844] mmiocpy from xdr_inline_decode (net/sunrpc/xdr.c:1419 net/sunrpc/xdr.c:1454)
> > [ 144.695950] xdr_inline_decode from nfs4_xdr_dec_read_plus (fs/nfs/nfs42xdr.c:1063 fs/nfs/nfs42xdr.c:1147 fs/nfs/nfs42xdr.c:1360 fs/nfs/nfs42xdr.c:1341)
> > [ 144.702452] nfs4_xdr_dec_read_plus from call_decode (net/sunrpc/clnt.c:2595)
> > [ 144.708429] call_decode from __rpc_execute (include/asm-generic/bitops/generic-non-atomic.h:128 net/sunrpc/sched.c:954)
> > [ 144.713538] __rpc_execute from rpc_async_schedule (include/linux/sched/mm.h:336 net/sunrpc/sched.c:1035)
> > [ 144.719170] rpc_async_schedule from process_one_work (include/linux/jump_label.h:260 include/linux/jump_label.h:270 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
> > [ 144.725238] process_one_work from worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
> > [ 144.730782] worker_thread from kthread (kernel/kthread.c:378)
> > [ 144.735547] kthread from ret_from_fork (arch/arm/kernel/entry-common.S:149)
> >
> >
> >
> > Best regards,
> > Krzysztof
> >
On 06/03/2023 18:09, Anna Schumaker wrote:
> Hi Krzysztof,
>
> On Tue, Feb 14, 2023 at 6:02 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>>
>> On 12/02/2023 15:05, Anna Schumaker wrote:
>>>>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
>>>>> From: Anna Schumaker <[email protected]>
>>>>> Date: Fri, 10 Feb 2023 15:50:22 -0500
>>>>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
>>>>>
>>>>
>>>> Patch is corrupted - maybe mail program reformatted it when sending:
>>>>
>>>> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
>>>> error: corrupt patch at line 12
>>>> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
>>>
>>> That's weird. I wasn't expecting gmail to reformat the patch but I
>>> guess it did. I've added it as an attachment so that shouldn't happen
>>> again.
>>
>> Still null ptr (built on 420b2d4 with your patch):
>
> We're through the merge window and at rc1 now, so I can spend more
> time scratching my head over your bug again. We've come up with a
> patch (attached) that adds a bunch of printks to show us what the
> kernel thinks is going on. Do you mind trying it out and letting us
> know what gets printed out? You'll need to make sure
> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
Best regards,
Krzysztof
Hi Krzysztof,
On Sat, Jun 10, 2023 at 6:15 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 06/03/2023 18:09, Anna Schumaker wrote:
> > Hi Krzysztof,
> >
> > On Tue, Feb 14, 2023 at 6:02 AM Krzysztof Kozlowski
> > <[email protected]> wrote:
> >>
> >> On 12/02/2023 15:05, Anna Schumaker wrote:
> >>>>> From ac2d6c501dbcdb306480edaee625b5496f1fb4f5 Mon Sep 17 00:00:00 2001
> >>>>> From: Anna Schumaker <[email protected]>
> >>>>> Date: Fri, 10 Feb 2023 15:50:22 -0500
> >>>>> Subject: [PATCH] NFSv4.2: Rework scratch handling for READ_PLUS
> >>>>>
> >>>>
> >>>> Patch is corrupted - maybe mail program reformatted it when sending:
> >>>>
> >>>> Applying: NFSv4.2: Rework scratch handling for READ_PLUS
> >>>> error: corrupt patch at line 12
> >>>> Patch failed at 0001 NFSv4.2: Rework scratch handling for READ_PLUS
> >>>
> >>> That's weird. I wasn't expecting gmail to reformat the patch but I
> >>> guess it did. I've added it as an attachment so that shouldn't happen
> >>> again.
> >>
> >> Still null ptr (built on 420b2d4 with your patch):
> >
> > We're through the merge window and at rc1 now, so I can spend more
> > time scratching my head over your bug again. We've come up with a
> > patch (attached) that adds a bunch of printks to show us what the
> > kernel thinks is going on. Do you mind trying it out and letting us
> > know what gets printed out? You'll need to make sure
> > CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
>
> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
Can you try the attached patch on top of my 3-patch series from the
other day, and let me know what gets printed out? It adds a bunch of
printk()s at strategic points to print out what is going on with the
xdr scratch buffer since it's suddenly a bad memory address after
working for a bit on your machine.
Thanks,
Anna
>
> Best regards,
> Krzysztof
>
On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> On 14/06/2023 22:55, Anna Schumaker wrote:
>>>>> Still null ptr (built on 420b2d4 with your patch):
>>>>
>>>> We're through the merge window and at rc1 now, so I can spend more
>>>> time scratching my head over your bug again. We've come up with a
>>>> patch (attached) that adds a bunch of printks to show us what the
>>>> kernel thinks is going on. Do you mind trying it out and letting us
>>>> know what gets printed out? You'll need to make sure
>>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
>>>
>>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
>>
>> Can you try the attached patch on top of my 3-patch series from the
>> other day, and let me know what gets printed out? It adds a bunch of
>> printk()s at strategic points to print out what is going on with the
>> xdr scratch buffer since it's suddenly a bad memory address after
>> working for a bit on your machine.
>>
>
> Here you have entire log - attached (113 kB, I hope goes past mailing
> lists/spam filters).
As expected this bounced from the mailing lists, but I hope you got it.
If not, let me know.
Best regards,
Krzysztof
On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> > On 14/06/2023 22:55, Anna Schumaker wrote:
> >>>>> Still null ptr (built on 420b2d4 with your patch):
> >>>>
> >>>> We're through the merge window and at rc1 now, so I can spend more
> >>>> time scratching my head over your bug again. We've come up with a
> >>>> patch (attached) that adds a bunch of printks to show us what the
> >>>> kernel thinks is going on. Do you mind trying it out and letting us
> >>>> know what gets printed out? You'll need to make sure
> >>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
> >>>
> >>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
> >>
> >> Can you try the attached patch on top of my 3-patch series from the
> >> other day, and let me know what gets printed out? It adds a bunch of
> >> printk()s at strategic points to print out what is going on with the
> >> xdr scratch buffer since it's suddenly a bad memory address after
> >> working for a bit on your machine.
> >>
> >
> > Here you have entire log - attached (113 kB, I hope goes past mailing
> > lists/spam filters).
>
> As expected this bounced from the mailing lists, but I hope you got it.
> If not, let me know.
I did still receive it. Thanks!
Anna
>
> Best regards,
> Krzysztof
>
On Thu, Jun 15, 2023 at 9:01 AM Anna Schumaker <[email protected]> wrote:
>
> On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
> <[email protected]> wrote:
> >
> > On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> > > On 14/06/2023 22:55, Anna Schumaker wrote:
> > >>>>> Still null ptr (built on 420b2d4 with your patch):
> > >>>>
> > >>>> We're through the merge window and at rc1 now, so I can spend more
> > >>>> time scratching my head over your bug again. We've come up with a
> > >>>> patch (attached) that adds a bunch of printks to show us what the
> > >>>> kernel thinks is going on. Do you mind trying it out and letting us
> > >>>> know what gets printed out? You'll need to make sure
> > >>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
> > >>>
> > >>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
> > >>
> > >> Can you try the attached patch on top of my 3-patch series from the
> > >> other day, and let me know what gets printed out? It adds a bunch of
> > >> printk()s at strategic points to print out what is going on with the
> > >> xdr scratch buffer since it's suddenly a bad memory address after
> > >> working for a bit on your machine.
> > >>
> > >
> > > Here you have entire log - attached (113 kB, I hope goes past mailing
> > > lists/spam filters).
> >
> > As expected this bounced from the mailing lists, but I hope you got it.
> > If not, let me know.
>
> I did still receive it. Thanks!
Can you swap out yesterday's patch with this patch? I've adjusted what
gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
starting to think that the xdr scratch buffer is fine, and that it's
the other pointer passed to memcpy() in that function that's the
problem, and the output from this patch will confirm that for me.
Thanks,
Anna
>
> Anna
> >
> > Best regards,
> > Krzysztof
> >
On Thu, Jun 15, 2023 at 1:04 PM Anna Schumaker <[email protected]> wrote:
>
> On Thu, Jun 15, 2023 at 9:01 AM Anna Schumaker <[email protected]> wrote:
> >
> > On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
> > <[email protected]> wrote:
> > >
> > > On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> > > > On 14/06/2023 22:55, Anna Schumaker wrote:
> > > >>>>> Still null ptr (built on 420b2d4 with your patch):
> > > >>>>
> > > >>>> We're through the merge window and at rc1 now, so I can spend more
> > > >>>> time scratching my head over your bug again. We've come up with a
> > > >>>> patch (attached) that adds a bunch of printks to show us what the
> > > >>>> kernel thinks is going on. Do you mind trying it out and letting us
> > > >>>> know what gets printed out? You'll need to make sure
> > > >>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
> > > >>>
> > > >>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
> > > >>
> > > >> Can you try the attached patch on top of my 3-patch series from the
> > > >> other day, and let me know what gets printed out? It adds a bunch of
> > > >> printk()s at strategic points to print out what is going on with the
> > > >> xdr scratch buffer since it's suddenly a bad memory address after
> > > >> working for a bit on your machine.
> > > >>
> > > >
> > > > Here you have entire log - attached (113 kB, I hope goes past mailing
> > > > lists/spam filters).
> > >
> > > As expected this bounced from the mailing lists, but I hope you got it.
> > > If not, let me know.
> >
> > I did still receive it. Thanks!
>
> Can you swap out yesterday's patch with this patch? I've adjusted what
> gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
> starting to think that the xdr scratch buffer is fine, and that it's
> the other pointer passed to memcpy() in that function that's the
> problem, and the output from this patch will confirm for me.
Oh, and can you add this one on top of the v2 patch as well?
Thanks,
Anna
>
> Thanks,
> Anna
>
> >
> > Anna
> > >
> > > Best regards,
> > > Krzysztof
> > >
On Sat, Jun 17, 2023 at 6:09 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 15/06/2023 21:38, Anna Schumaker wrote:
> > On Thu, Jun 15, 2023 at 1:16 PM Anna Schumaker <[email protected]> wrote:
> >>
> >> On Thu, Jun 15, 2023 at 1:04 PM Anna Schumaker <[email protected]> wrote:
> >>>
> >>> On Thu, Jun 15, 2023 at 9:01 AM Anna Schumaker <[email protected]> wrote:
> >>>>
> >>>> On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
> >>>> <[email protected]> wrote:
> >>>>>
> >>>>> On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> >>>>>> On 14/06/2023 22:55, Anna Schumaker wrote:
> >>>>>>>>>> Still null ptr (built on 420b2d4 with your patch):
> >>>>>>>>>
> >>>>>>>>> We're through the merge window and at rc1 now, so I can spend more
> >>>>>>>>> time scratching my head over your bug again. We've come up with a
> >>>>>>>>> patch (attached) that adds a bunch of printks to show us what the
> >>>>>>>>> kernel thinks is going on. Do you mind trying it out and letting us
> >>>>>>>>> know what gets printed out? You'll need to make sure
> >>>>>>>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
> >>>>>>>>
> >>>>>>>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
> >>>>>>>
> >>>>>>> Can you try the attached patch on top of my 3-patch series from the
> >>>>>>> other day, and let me know what gets printed out? It adds a bunch of
> >>>>>>> printk()s at strategic points to print out what is going on with the
> >>>>>>> xdr scratch buffer since it's suddenly a bad memory address after
> >>>>>>> working for a bit on your machine.
> >>>>>>>
> >>>>>>
> >>>>>> Here you have entire log - attached (113 kB, I hope goes past mailing
> >>>>>> lists/spam filters).
> >>>>>
> >>>>> As expected this bounced from the mailing lists, but I hope you got it.
> >>>>> If not, let me know.
> >>>>
> >>>> I did still receive it. Thanks!
> >>>
> >>> Can you swap out yesterday's patch with this patch? I've adjusted what
> >>> gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
> >>> starting to think that the xdr scratch buffer is fine, and that it's
> >>> the other pointer passed to memcpy() in that function that's the
> >>> problem, and the output from this patch will confirm for me.
> >>
> >> Oh, and can you add this one on top of the v2 patch as well?
> >
> > Sorry about the noise today. Can you use this patch instead of the two
> > I attached earlier? I cleaned up the output and cut down on extra
> > output..
> >
>
> Here you have - attached.
This is good, thanks! I was finally able to figure out how to hit the
bug using a 32-bit x86 VM, so hopefully the next thing you hear from me
is a patch fixing the bug!
Anna
>
>
> Best regards,
> Krzysztof
On 21/06/2023 14:49, Anna Schumaker wrote:
> On Sat, Jun 17, 2023 at 6:09 AM Krzysztof Kozlowski
> <[email protected]> wrote:
>>
>> On 15/06/2023 21:38, Anna Schumaker wrote:
>>> On Thu, Jun 15, 2023 at 1:16 PM Anna Schumaker <[email protected]> wrote:
>>>>
>>>> On Thu, Jun 15, 2023 at 1:04 PM Anna Schumaker <[email protected]> wrote:
>>>>>
>>>>> On Thu, Jun 15, 2023 at 9:01 AM Anna Schumaker <[email protected]> wrote:
>>>>>>
>>>>>> On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
>>>>>>>> On 14/06/2023 22:55, Anna Schumaker wrote:
>>>>>>>>>>>> Still null ptr (built on 420b2d4 with your patch):
>>>>>>>>>>>
>>>>>>>>>>> We're through the merge window and at rc1 now, so I can spend more
>>>>>>>>>>> time scratching my head over your bug again. We've come up with a
>>>>>>>>>>> patch (attached) that adds a bunch of printks to show us what the
>>>>>>>>>>> kernel thinks is going on. Do you mind trying it out and letting us
>>>>>>>>>>> know what gets printed out? You'll need to make sure
>>>>>>>>>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
>>>>>>>>>>
>>>>>>>>>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
>>>>>>>>>
>>>>>>>>> Can you try the attached patch on top of my 3-patch series from the
>>>>>>>>> other day, and let me know what gets printed out? It adds a bunch of
>>>>>>>>> printk()s at strategic points to print out what is going on with the
>>>>>>>>> xdr scratch buffer since it's suddenly a bad memory address after
>>>>>>>>> working for a bit on your machine.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Here you have entire log - attached (113 kB, I hope goes past mailing
>>>>>>>> lists/spam filters).
>>>>>>>
>>>>>>> As expected this bounced from the mailing lists, but I hope you got it.
>>>>>>> If not, let me know.
>>>>>>
>>>>>> I did still receive it. Thanks!
>>>>>
>>>>> Can you swap out yesterday's patch with this patch? I've adjusted what
>>>>> gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
>>>>> starting to think that the xdr scratch buffer is fine, and that it's
>>>>> the other pointer passed to memcpy() in that function that's the
>>>>> problem, and the output from this patch will confirm for me.
>>>>
>>>> Oh, and can you add this one on top of the v2 patch as well?
>>>
>>> Sorry about the noise today. Can you use this patch instead of the two
>>> I attached earlier? I cleaned up the output and cut down on extra
>>> output..
>>>
>>
>> Here you have - attached.
>
> This is good, thanks! I was finally able to figure out how to hit the
> bug using a 32bit x86 VM, so hopefully the next thing you hear from me
> is a patch fixing the bug!
QEMU also has 32-bit ARM and x86...
Best regards,
Krzysztof
Hi Krzysztof,
On Wed, Jun 21, 2023 at 9:27 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 21/06/2023 14:49, Anna Schumaker wrote:
> > On Sat, Jun 17, 2023 at 6:09 AM Krzysztof Kozlowski
> > <[email protected]> wrote:
> >>
> >> On 15/06/2023 21:38, Anna Schumaker wrote:
> >>> On Thu, Jun 15, 2023 at 1:16 PM Anna Schumaker <[email protected]> wrote:
> >>>>
> >>>> On Thu, Jun 15, 2023 at 1:04 PM Anna Schumaker <[email protected]> wrote:
> >>>>>
> >>>>> On Thu, Jun 15, 2023 at 9:01 AM Anna Schumaker <[email protected]> wrote:
> >>>>>>
> >>>>>> On Thu, Jun 15, 2023 at 4:55 AM Krzysztof Kozlowski
> >>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>> On 15/06/2023 10:52, Krzysztof Kozlowski wrote:
> >>>>>>>> On 14/06/2023 22:55, Anna Schumaker wrote:
> >>>>>>>>>>>> Still null ptr (built on 420b2d4 with your patch):
> >>>>>>>>>>>
> >>>>>>>>>>> We're through the merge window and at rc1 now, so I can spend more
> >>>>>>>>>>> time scratching my head over your bug again. We've come up with a
> >>>>>>>>>>> patch (attached) that adds a bunch of printks to show us what the
> >>>>>>>>>>> kernel thinks is going on. Do you mind trying it out and letting us
> >>>>>>>>>>> know what gets printed out? You'll need to make sure
> >>>>>>>>>>> CONFIG_NFS_V4_2_READ_PLUS is enabled when compiling the kernel.
> >>>>>>>>>>
> >>>>>>>>>> The patch does not apply. I tried: v6.4-rc1, v6.4-rc5, next-20230609.
> >>>>>>>>>
> >>>>>>>>> Can you try the attached patch on top of my 3-patch series from the
> >>>>>>>>> other day, and let me know what gets printed out? It adds a bunch of
> >>>>>>>>> printk()s at strategic points to print out what is going on with the
> >>>>>>>>> xdr scratch buffer since it's suddenly a bad memory address after
> >>>>>>>>> working for a bit on your machine.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Here you have the entire log - attached (113 kB; I hope it gets past the
> >>>>>>>> mailing lists' spam filters).
> >>>>>>>
> >>>>>>> As expected, this bounced from the mailing lists, but I hope you got it.
> >>>>>>> If not, let me know.
> >>>>>>
> >>>>>> I did still receive it. Thanks!
> >>>>>
> >>>>> Can you swap out yesterday's patch with this patch? I've adjusted what
> >>>>> gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
> >>>>> starting to think that the xdr scratch buffer is fine, and that it's
> >>>>> the other pointer passed to memcpy() in that function that's the
> >>>>> problem, and the output from this patch will confirm for me.
> >>>>
> >>>> Oh, and can you add this one on top of the v2 patch as well?
> >>>
> >>> Sorry about the noise today. Can you use this patch instead of the two
> >>> I attached earlier? I cleaned it up and cut down on the extra output.
> >>>
> >>
> >> Here it is - attached.
> >
> > This is good, thanks! I was finally able to figure out how to hit the
> > bug using a 32-bit x86 VM, so hopefully the next thing you hear from me
> > is a patch fixing the bug!
I'm really hopeful that the attached patch finally fixes the issue.
Can you try it and let me know?
Thanks,
Anna
>
> QEMU also has 32-bit ARM and x86...
>
> Best regards,
> Krzysztof
>
On 23/06/2023 19:59, Anna Schumaker wrote:
>>>>>>>
>>>>>>> Can you swap out yesterday's patch with this patch? I've adjusted what
>>>>>>> gets printed out, and added printk()s to xdr_copy_to_scratch(). I'm
>>>>>>> starting to think that the xdr scratch buffer is fine, and that it's
>>>>>>> the other pointer passed to memcpy() in that function that's the
>>>>>>> problem, and the output from this patch will confirm for me.
>>>>>>
>>>>>> Oh, and can you add this one on top of the v2 patch as well?
>>>>>
>>>>> Sorry about the noise today. Can you use this patch instead of the two
>>>>> I attached earlier? I cleaned it up and cut down on the extra output.
>>>>>
>>>>
>>>> Here it is - attached.
>>>
>>> This is good, thanks! I was finally able to figure out how to hit the
>>> bug using a 32-bit x86 VM, so hopefully the next thing you hear from me
>>> is a patch fixing the bug!
>
> I'm really hopeful that the attached patch finally fixes the issue.
> Can you try it and let me know?
Just test it yourself on a 32-bit system... There is absolutely nothing
special about the system I reproduced it on. Nothing.
IP-Config: eth0 hardware address 00:1e:06:30:bf:ac mtu 1500
IP-Config: eth0 guessed broadcast address 192.168.1.255
IP-Config: eth0 complete (from 192.168.1.10):
address: 192.168.1.12 broadcast: 192.168.1.255 netmask: 255.255.255.0
gateway: 192.168.1.1 dns0 : 0.0.0.0 dns1 : 0.0.0.0
rootserver: 192.168.1.10 rootpath:
filename :
NFS-Mount: 192.168.1.10:/srv/nfs/odroidhc1
Waiting 10 seconds for device /dev/nfs ...
ERROR: device '/dev/nfs' not found. Skipping fsck.
Mount cmd:
mount.nfs4 -o vers=4,nolock 192.168.1.10:/srv/nfs/odroidhc1 /new_root
[ 21.800626] ------------[ cut here ]------------
[ 21.803891] WARNING: CPU: 7 PID: 154 at mm/highmem.c:603 xdr_stream_unmap_current_page+0x18/0x24
[ 21.812729] Modules linked in:
[ 21.815642] CPU: 7 PID: 154 Comm: mount.nfs4 Not tainted 6.4.0-00001-gfbb103bb8df0 #8
[ 21.823444] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 21.829525] unwind_backtrace from show_stack+0x10/0x14
[ 21.834698] show_stack from dump_stack_lvl+0x58/0x70
[ 21.839730] dump_stack_lvl from __warn+0x7c/0x1bc
[ 21.844491] __warn from warn_slowpath_fmt+0xbc/0x1b8
[ 21.849518] warn_slowpath_fmt from xdr_stream_unmap_current_page+0x18/0x24
[ 21.856437] xdr_stream_unmap_current_page from call_decode+0x210/0x2c8
[ 21.863020] call_decode from __rpc_execute+0xf8/0x764
[ 21.868134] __rpc_execute from rpc_execute+0xc0/0x1d0
[ 21.873243] rpc_execute from rpc_run_task+0x148/0x190
[ 21.878348] rpc_run_task from rpc_create_xprt+0x1a4/0x284
[ 21.883805] rpc_create_xprt from rpc_create+0xf8/0x254
[ 21.889004] rpc_create from nfs_create_rpc_client+0x150/0x17c
[ 21.894812] nfs_create_rpc_client from nfs4_alloc_client+0x360/0x374
[ 21.901226] nfs4_alloc_client from nfs_get_client+0x16c/0x3e8
[ 21.907030] nfs_get_client from nfs4_set_client+0xfc/0x1a4
[ 21.912574] nfs4_set_client from nfs4_create_server+0x11c/0x2fc
[ 21.918554] nfs4_create_server from nfs4_try_get_tree+0x10/0x50
[ 21.924534] nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
[ 21.929993] vfs_get_tree from path_mount+0x3e8/0xb04
[ 21.935019] path_mount from sys_mount+0x20c/0x254
[ 21.939784] sys_mount from ret_fast_syscall+0x0/0x1c
[ 21.944809] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
[ 21.949837] 9fa0: 0047ebe0 00479c64 0047e960 0047e9b8 0047e9c8 00000000
[ 21.957986] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08 00000010 00479c64 00466bfc
[ 21.966139] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
[ 21.971245] irq event stamp: 0
[ 21.974188] hardirqs last enabled at (0): [<00000000>] 0x0
[ 21.979736] hardirqs last disabled at (0): [<c012357c>] copy_process+0x810/0x1ffc
[ 21.987227] softirqs last enabled at (0): [<c012357c>] copy_process+0x810/0x1ffc
[ 21.994679] softirqs last disabled at (0): [<00000000>] 0x0
[ 22.000264] ---[ end trace 0000000000000000 ]---
[ 22.004781] BUG: sleeping function called from invalid context at net/sunrpc/sched.c:953
[ 22.012876] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 154, name: mount.nfs4
[ 22.020936] preempt_count: 1, expected: 0
[ 22.024881] RCU nest depth: 0, expected: 0
[ 22.028955] INFO: lockdep is turned off.
[ 22.032889] CPU: 7 PID: 154 Comm: mount.nfs4 Tainted: G W 6.4.0-00001-gfbb103bb8df0 #8
[ 22.042131] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 22.048196] unwind_backtrace from show_stack+0x10/0x14
[ 22.053393] show_stack from dump_stack_lvl+0x58/0x70
[ 22.058417] dump_stack_lvl from __might_resched+0x194/0x260
[ 22.064054] __might_resched from __rpc_execute+0x118/0x764
[ 22.069596] __rpc_execute from rpc_execute+0xc0/0x1d0
[ 22.074708] rpc_execute from rpc_run_task+0x148/0x190
[ 22.079821] rpc_run_task from rpc_create_xprt+0x1a4/0x284
[ 22.085281] rpc_create_xprt from rpc_create+0xf8/0x254
[ 22.090483] rpc_create from nfs_create_rpc_client+0x150/0x17c
[ 22.096286] nfs_create_rpc_client from nfs4_alloc_client+0x360/0x374
[ 22.102700] nfs4_alloc_client from nfs_get_client+0x16c/0x3e8
[ 22.108504] nfs_get_client from nfs4_set_client+0xfc/0x1a4
[ 22.114050] nfs4_set_client from nfs4_create_server+0x11c/0x2fc
[ 22.120029] nfs4_create_server from nfs4_try_get_tree+0x10/0x50
[ 22.126009] nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
[ 22.131467] vfs_get_tree from path_mount+0x3e8/0xb04
[ 22.136493] path_mount from sys_mount+0x20c/0x254
[ 22.141258] sys_mount from ret_fast_syscall+0x0/0x1c
[ 22.146284] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
[ 22.151322] 9fa0: 0047ebe0 00479c64 0047e960 0047e9b8 0047e9c8 00000000
[ 22.159461] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08 00000010 00479c64 00466bfc
[ 22.167606] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
[ 22.172820] BUG: scheduling while atomic: mount.nfs4/154/0x00000002
[ 22.178871] INFO: lockdep is turned off.
[ 22.182803] Modules linked in:
[ 22.185798] CPU: 7 PID: 154 Comm: mount.nfs4 Tainted: G W 6.4.0-00001-gfbb103bb8df0 #8
[ 22.195076] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 22.201139] unwind_backtrace from show_stack+0x10/0x14
[ 22.206337] show_stack from dump_stack_lvl+0x58/0x70
[ 22.211365] dump_stack_lvl from __schedule_bug+0x70/0x84
[ 22.216736] __schedule_bug from __schedule+0x9c0/0xc80
[ 22.221936] __schedule from schedule+0x58/0xf8
[ 22.226439] schedule from schedule_timeout+0x134/0x200
[ 22.231641] schedule_timeout from __wait_for_common+0xac/0x1f8
[ 22.237533] __wait_for_common from wait_for_completion_killable+0x18/0x24
[ 22.244379] wait_for_completion_killable from __kthread_create_on_node+0xe0/0x168
[ 22.251923] __kthread_create_on_node from kthread_create_on_node+0x30/0x60
[ 22.258851] kthread_create_on_node from svc_set_num_threads+0x1c8/0x420
[ 22.265525] svc_set_num_threads from nfs_callback_up+0x150/0x3c0
[ 22.271597] nfs_callback_up from nfs4_init_client+0x98/0x144
[ 22.277306] nfs4_init_client from nfs4_set_client+0xfc/0x1a4
[ 22.283026] nfs4_set_client from nfs4_create_server+0x11c/0x2fc
[ 22.289005] nfs4_create_server from nfs4_try_get_tree+0x10/0x50
[ 22.294985] nfs4_try_get_tree from vfs_get_tree+0x24/0xe4
[ 22.300444] vfs_get_tree from path_mount+0x3e8/0xb04
[ 22.305468] path_mount from sys_mount+0x20c/0x254
[ 22.310249] sys_mount from ret_fast_syscall+0x0/0x1c
[ 22.315261] Exception stack(0xf0cf9fa8 to 0xf0cf9ff0)
[ 22.320300] 9fa0: 0047ebe0 00479c64 0047e960 0047e9b8 0047e9c8 00000000
[ 22.328438] 9fc0: 0047ebe0 00479c64 b6f058c8 00000015 00466c08 00000010 00479c64 00466bfc
[ 22.336582] 9fe0: 00479e70 befb69b0 0045a708 b6dca610
:: running cleanup hook [udev]
[ 26.235349] systemd[1]: System time before build time, advancing clock.
[ 26.435536] systemd[1]: systemd 253.4-1-arch running in system mode (+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD +BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified)
[ 26.466749] systemd[1]: Detected architecture arm.
Best regards,
Krzysztof