It was reported that the riscv kernel hangs while executing the test
in [1].
Indeed, the test hangs when trying to write a buffer to a file. The
problem is that the riscv implementation of raw_copy_from_user() does not
return the number of bytes written when an exception happens and is fixed
up.
generic_perform_write() pre-faults the user pages and bails out if nothing
can be written, otherwise it will access the userspace buffer: here the
riscv implementation keeps returning it was not able to copy any byte
though the pre-faulting indicates otherwise. So generic_perform_write()
keeps retrying to access the user memory and ends up in an infinite
loop.
Note that before the commit mentioned in [1] that introduced this
regression, it worked because generic_perform_write() would bail out if
only one byte could not be written.
So fix this by returning the number of bytes effectively written in
__asm_copy_[to|from]_user() and __clear_user(), as it is expected.
[1] https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/
Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
Reported-by: Bo YU <[email protected]>
Closes: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/#t
Reported-by: Aurelien Jarno <[email protected]>
Closes: https://lore.kernel.org/linux-riscv/[email protected]/
Signed-off-by: Alexandre Ghiti <[email protected]>
---
arch/riscv/lib/uaccess.S | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
index ec486e5369d9..09b47ebacf2e 100644
--- a/arch/riscv/lib/uaccess.S
+++ b/arch/riscv/lib/uaccess.S
@@ -17,8 +17,11 @@ ENTRY(__asm_copy_from_user)
li t6, SR_SUM
csrs CSR_STATUS, t6
- /* Save for return value */
- mv t5, a2
+ /*
+ * Save the terminal address which will be used to compute the number
+ * of bytes copied in case of a fixup exception.
+ */
+ add t5, a0, a2
/*
* Register allocation for code below:
@@ -176,7 +179,7 @@ ENTRY(__asm_copy_from_user)
10:
/* Disable access to user memory */
csrc CSR_STATUS, t6
- mv a0, t5
+ sub a0, t5, a0
ret
ENDPROC(__asm_copy_to_user)
ENDPROC(__asm_copy_from_user)
@@ -228,7 +231,7 @@ ENTRY(__clear_user)
11:
/* Disable access to user memory */
csrc CSR_STATUS, t6
- mv a0, a1
+ sub a0, a3, a0
ret
ENDPROC(__clear_user)
EXPORT_SYMBOL(__clear_user)
--
2.39.2
On 11/08/2023 13:03, Alexandre Ghiti wrote:
> It was reported that the riscv kernel hangs while executing the test
> in [1].
>
> Indeed, the test hangs when trying to write a buffer to a file. The
> problem is that the riscv implementation of raw_copy_from_user() does not
> return the number of bytes written when an exception happens and is fixed
> up.
I'll respin another version as the changelog and the title are
incorrect: the uaccess routines should not return the number of bytes
copied but actually the number of bytes not copied (this is what this
patch implements).
I'll wait for feedbacks before doing so!
Sorry about that!
Alex
>
> generic_perform_write() pre-faults the user pages and bails out if nothing
> can be written, otherwise it will access the userspace buffer: here the
> riscv implementation keeps returning it was not able to copy any byte
> though the pre-faulting indicates otherwise. So generic_perform_write()
> keeps retrying to access the user memory and ends up in an infinite
> loop.
>
> Note that before the commit mentioned in [1] that introduced this
> regression, it worked because generic_perform_write() would bail out if
> only one byte could not be written.
>
> So fix this by returning the number of bytes effectively written in
> __asm_copy_[to|from]_user() and __clear_user(), as it is expected.
>
> [1] https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/
>
> Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
> Reported-by: Bo YU <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/#t
> Reported-by: Aurelien Jarno <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/[email protected]/
> Signed-off-by: Alexandre Ghiti <[email protected]>
> ---
> arch/riscv/lib/uaccess.S | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
> index ec486e5369d9..09b47ebacf2e 100644
> --- a/arch/riscv/lib/uaccess.S
> +++ b/arch/riscv/lib/uaccess.S
> @@ -17,8 +17,11 @@ ENTRY(__asm_copy_from_user)
> li t6, SR_SUM
> csrs CSR_STATUS, t6
>
> - /* Save for return value */
> - mv t5, a2
> + /*
> + * Save the terminal address which will be used to compute the number
> + * of bytes copied in case of a fixup exception.
> + */
> + add t5, a0, a2
>
> /*
> * Register allocation for code below:
> @@ -176,7 +179,7 @@ ENTRY(__asm_copy_from_user)
> 10:
> /* Disable access to user memory */
> csrc CSR_STATUS, t6
> - mv a0, t5
> + sub a0, t5, a0
> ret
> ENDPROC(__asm_copy_to_user)
> ENDPROC(__asm_copy_from_user)
> @@ -228,7 +231,7 @@ ENTRY(__clear_user)
> 11:
> /* Disable access to user memory */
> csrc CSR_STATUS, t6
> - mv a0, a1
> + sub a0, a3, a0
> ret
> ENDPROC(__clear_user)
> EXPORT_SYMBOL(__clear_user)
On Fri, Aug 11, 2023 at 01:39:16PM +0200, Alexandre Ghiti wrote:
> On 11/08/2023 13:03, Alexandre Ghiti wrote:
> > It was reported that the riscv kernel hangs while executing the test
> > in [1].
> >
> > Indeed, the test hangs when trying to write a buffer to a file. The
> > problem is that the riscv implementation of raw_copy_from_user() does not
> > return the number of bytes written when an exception happens and is fixed
> > up.
>
>
> I'll respin another version as the changelog and the title are incorrect:
> the uaccess routines should not return the number of bytes copied but
> actually the number of bytes not copied (this is what this patch
> implements).
>
> I'll wait for feedbacks before doing so!
While you're editing te changelog, can you make the link a regular Link:
tag? You can do `Link: <foo> [1]`.
Alexandre Ghiti <[email protected]> writes:
> On 11/08/2023 13:03, Alexandre Ghiti wrote:
>> It was reported that the riscv kernel hangs while executing the test
>> in [1].
>>
>> Indeed, the test hangs when trying to write a buffer to a file. The
>> problem is that the riscv implementation of raw_copy_from_user() does not
>> return the number of bytes written when an exception happens and is fixed
>> up.
>
>
> I'll respin another version as the changelog and the title are
> incorrect: the uaccess routines should not return the number of bytes
> copied but actually the number of bytes not copied (this is what this
> patch implements).
>
> I'll wait for feedbacks before doing so!
Yikes! Nice catch.
Functions like fault_in_readable() will fail horribly w/o this. I wonder
why we haven't seen this problem more?
Feel free to add my RB to your next spin!
Reviewed-by: Björn Töpel <[email protected]>
Hi Alexandre,
On 2023-08-11 13:03, Alexandre Ghiti wrote:
> It was reported that the riscv kernel hangs while executing the test
> in [1].
>
> Indeed, the test hangs when trying to write a buffer to a file. The
> problem is that the riscv implementation of raw_copy_from_user() does not
> return the number of bytes written when an exception happens and is fixed
> up.
>
> generic_perform_write() pre-faults the user pages and bails out if nothing
> can be written, otherwise it will access the userspace buffer: here the
> riscv implementation keeps returning it was not able to copy any byte
> though the pre-faulting indicates otherwise. So generic_perform_write()
> keeps retrying to access the user memory and ends up in an infinite
> loop.
>
> Note that before the commit mentioned in [1] that introduced this
> regression, it worked because generic_perform_write() would bail out if
> only one byte could not be written.
>
> So fix this by returning the number of bytes effectively written in
> __asm_copy_[to|from]_user() and __clear_user(), as it is expected.
>
> [1] https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/
>
> Fixes: ebcbd75e3962 ("riscv: Fix the bug in memory access fixup code")
> Reported-by: Bo YU <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/20230309151841.bomov6hq3ybyp42a@debian/#t
> Reported-by: Aurelien Jarno <[email protected]>
> Closes: https://lore.kernel.org/linux-riscv/[email protected]/
> Signed-off-by: Alexandre Ghiti <[email protected]>
> ---
> arch/riscv/lib/uaccess.S | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
Thanks for providing a patch so fast. I confirm that the issue I
reported is fixed, but also that the full strace testsuite now passes
successfully.
The code looks all good to me, so:
Tested-by: Aurelien Jarno <[email protected]>
Reviewed-by: Aurelien Jarno <[email protected]>
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
[email protected] http://aurel32.net