2016-03-10 19:27:18

by Mika Penttilä

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()



On 18.02.2016 21:47, Tony Luck wrote:
> Make use of the EXTABLE_FAULT exception table entries to write
> a kernel copy routine that doesn't crash the system if it
> encounters a machine check. Prime use case for this is to copy
> from large arrays of non-volatile memory used as storage.
>
> We have to use an unrolled copy loop for now because current
> hardware implementations treat a machine check in "rep mov"
> as fatal. When that is fixed we can simplify.
>
> Signed-off-by: Tony Luck <[email protected]>
> ---
>
> Is this what we want now? Return type is a "bool". True means
> that we copied OK, false means that it didn't (this is all that
> Dan says that he needs). Dropped all the complex code to figure
> out how many bytes we didn't copy as Linus says this isn't the
> right place to do this (and besides we should just make "rep mov"
>
> +
> + /* Copy successful. Return true */
> +.L_done_memcpy_trap:
> + xorq %rax, %rax
> + ret
> +ENDPROC(memcpy_mcsafe)
> +
> + .section .fixup, "ax"
> + /* Return false for any failure */
> +.L_memcpy_mcsafe_fail:
> + mov $1, %rax
> + ret
> +
>
But you return 0 == false for success and 1 == true for failure.

--Mika


2016-03-10 19:37:57

by Tony Luck

[permalink] [raw]
Subject: RE: [PATCH v14] x86, mce: Add memcpy_mcsafe()

> But you return 0 == false for success and 1 == true for failure.

Aaargh! -ETOOMUCHSHELLSCRIPTPROGRAMMING

-Tony

2016-03-11 22:10:23

by Tony Luck

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()

On Thu, Mar 10, 2016 at 11:37 AM, Luck, Tony <[email protected]> wrote:
>> But you return 0 == false for success and 1 == true for failure.
>
> Aaargh! -ETOOMUCHSHELLSCRIPTPROGRAMMING
>
> -Tony

Options to fix this:
1) Just change the comments in the code.
This seems like it would confuse people as I thing most people
would expect the "true" return to mean the copy succeeded.
2) Reverse the return values.
Better that option 1 - but doesn't leave scope to return a count
if some future user does want to know where the copy failed.
3) Change the return type back from "bool" to "int"
0 == success, non-zero == fail (with option to put the non-copied
byte count in later).
4) Something else

-Tony

2016-03-11 22:14:48

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()

On Fri, Mar 11, 2016 at 2:10 PM, Tony Luck <[email protected]> wrote:
> On Thu, Mar 10, 2016 at 11:37 AM, Luck, Tony <[email protected]> wrote:
>>> But you return 0 == false for success and 1 == true for failure.
>>
>> Aaargh! -ETOOMUCHSHELLSCRIPTPROGRAMMING
>>
>> -Tony
>
> Options to fix this:
> 1) Just change the comments in the code.
> This seems like it would confuse people as I thing most people
> would expect the "true" return to mean the copy succeeded.
> 2) Reverse the return values.
> Better that option 1 - but doesn't leave scope to return a count
> if some future user does want to know where the copy failed.
> 3) Change the return type back from "bool" to "int"
> 0 == success, non-zero == fail (with option to put the non-copied
> byte count in later).
> 4) Something else
>

You could return 0 for success and -EIO for failure (positive value
for 'location' if that support ever resurfaces).

That would at least let me drop the ternary conversion I have in
memcpy_from_pmem() [1] to convert bool to an error code.

[1]: https://patchwork.kernel.org/patch/8559521/

2016-03-12 17:17:01

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()


* Tony Luck <[email protected]> wrote:

> On Thu, Mar 10, 2016 at 11:37 AM, Luck, Tony <[email protected]> wrote:
> >> But you return 0 == false for success and 1 == true for failure.
> >
> > Aaargh! -ETOOMUCHSHELLSCRIPTPROGRAMMING
> >
> > -Tony
>
> Options to fix this:
> 1) Just change the comments in the code.
> This seems like it would confuse people as I thing most people
> would expect the "true" return to mean the copy succeeded.
> 2) Reverse the return values.
> Better that option 1 - but doesn't leave scope to return a count
> if some future user does want to know where the copy failed.
> 3) Change the return type back from "bool" to "int"
> 0 == success, non-zero == fail (with option to put the non-copied
> byte count in later).
> 4) Something else

Please use the copy_*_user() memory copying API semantics, which are: return
negative code (-EFAULT) on error, 0 on success.

Don't return 1 and please don't use bools for any memory copy library
functionality.

Thanks,

Ingo

2016-03-13 01:12:47

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()

On Sat, Mar 12, 2016 at 9:16 AM, Ingo Molnar <[email protected]> wrote:
>
> Please use the copy_*_user() memory copying API semantics, which are: return
> negative code (-EFAULT) on error, 0 on success.

Those are the get_user/put_user() semantics.

copy_*_user() has those annoying "bytes left uncopied" return values
that I really wouldn't encourage anybody else use unless they really
have to. There are (good) reasons why it's done that way, but it's
still not something that should be aped without the same kind of major
reasons.

Linus

2016-03-13 09:25:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH v14] x86, mce: Add memcpy_mcsafe()


* Linus Torvalds <[email protected]> wrote:

> On Sat, Mar 12, 2016 at 9:16 AM, Ingo Molnar <[email protected]> wrote:
> >
> > Please use the copy_*_user() memory copying API semantics, which are: return
> > negative code (-EFAULT) on error, 0 on success.
>
> Those are the get_user/put_user() semantics.
>
> copy_*_user() has those annoying "bytes left uncopied" return values
> that I really wouldn't encourage anybody else use unless they really
> have to. [...]

Indeed, and I even grepped for it because I tend to get it wrong ... I suck!

> [...] There are (good) reasons why it's done that way, but it's still not
> something that should be aped without the same kind of major reasons.

Yeah, so IIRC the main reason being security and the ability to know which bits of
a possibly uninitialized area are not yet copied on.

But with primary memcpy_mcsafe() usage being in-kernel (MM)IO space management I
don't think we have the 'uninitialized' problem in nearly as marked a fashion, and
we very much have the (largely uncompensated) cost of more complex assembly.

Nevertheless, wrt. error handling, the bit I feel very strongly about is the
following:

We have the existing 'mempcy API check for failure' pattern of:

if (put_user()) {
... it didn't succeed ...
}

if (copy_from_user()) {
... it didn't succeed ...
}

... which should be kept IMHO:

if (memcpy_mcsafe()) {
... it didn't succeed ...
}

... and what I most disagree with most is one of the suggestions in the original
mail I replied to:

>> 2) Reverse the return values.

(== return 1 on success, 0 on failure)

... that's crazy IMHO and reverses the pattern of almost every memcpy-with-check
(or syscall-success) API we have!

So I still think we should have the regular '0 on success, negative error value on
failure' semantics - but I'd not be hugely against a '0 on success, bytes left
uncopied on failure' variant either in principle, except that I think it unduly
complicates the assembly side to recover the bytes.

The '0/1' current implementation is really neither of these but still fits the
common error checking pattern, but it's still much better than the '1/0'
suggestion which is crazy!

So I think we should move from 0/1 to 0/-EFAULT for now, and maybe (maybe) move to
a 0/bytes_left return code in the future, if there comes a strong usecase. Most
error checks will have to be converted in that case anyway to make use of the
return code, so it's not like we introduce an extra cost here by going to
0/-EFAULT.

Also, by going to 0/-EFAULT we have the option to also allow other negative error
codes, which would allow the distinction of #MC traps from regular page fault
traps for example. (a driver might want to log them differently.)

Thanks,

Ingo

2016-03-14 22:33:54

by Tony Luck

[permalink] [raw]
Subject: [PATCH] x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe()

Returning a 'bool' was very unpopular. Doubly so because the
code was just wrong (returning zero for true, one for false;
great for shell programming, not so good for C).

Change return type to "int". Keep zero as the success indicator
because it matches other similar code and people may be more
comfortable writing:

if (memcpy_mcsafe(to, from, count)) {
printk("Sad panda, copy failed\n");
...
}

Make the failure return value -EFAULT for now.

Reported by: Mika Penttilä <[email protected]>
Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Signed-off-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/string_64.h | 4 ++--
arch/x86/lib/memcpy_64.S | 7 ++++---
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index ca6ba3607705..90dbbd9666d4 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -87,9 +87,9 @@ int strcmp(const char *cs, const char *ct);
*
* Low level memory copy function that catches machine checks
*
- * Return true for success, false for fail
+ * Return 0 for success, -EFAULT for fail
*/
-bool memcpy_mcsafe(void *dst, const void *src, size_t cnt);
+int memcpy_mcsafe(void *dst, const void *src, size_t cnt);

#endif /* __KERNEL__ */

diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index cbb8ee5830ff..2ec0b0abbfaa 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -1,6 +1,7 @@
/* Copyright 2002 Andi Kleen */

#include <linux/linkage.h>
+#include <asm/errno.h>
#include <asm/cpufeatures.h>
#include <asm/alternative-asm.h>

@@ -268,16 +269,16 @@ ENTRY(memcpy_mcsafe)
decl %ecx
jnz .L_copy_trailing_bytes

- /* Copy successful. Return true */
+ /* Copy successful. Return zero */
.L_done_memcpy_trap:
xorq %rax, %rax
ret
ENDPROC(memcpy_mcsafe)

.section .fixup, "ax"
- /* Return false for any failure */
+ /* Return -EFAULT for any failure */
.L_memcpy_mcsafe_fail:
- mov $1, %rax
+ mov $-EFAULT, %rax
ret

.previous
--
2.5.0

Subject: [tip:x86/urgent] x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe()

Commit-ID: cbf8b5a2b649a501758291cb4d4ba1e5711771ba
Gitweb: http://git.kernel.org/tip/cbf8b5a2b649a501758291cb4d4ba1e5711771ba
Author: Tony Luck <[email protected]>
AuthorDate: Mon, 14 Mar 2016 15:33:39 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 16 Mar 2016 09:02:18 +0100

x86/mm, x86/mce: Fix return type/value for memcpy_mcsafe()

Returning a 'bool' was very unpopular. Doubly so because the
code was just wrong (returning zero for true, one for false;
great for shell programming, not so good for C).

Change return type to "int". Keep zero as the success indicator
because it matches other similar code and people may be more
comfortable writing:

if (memcpy_mcsafe(to, from, count)) {
printk("Sad panda, copy failed\n");
...
}

Make the failure return value -EFAULT for now.

Reported by: Mika Penttilä <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Link: http://lkml.kernel.org/r/695f14233fa7a54fcac4406c706d7fec228e3f4c.1457993040.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/include/asm/string_64.h | 4 ++--
arch/x86/lib/memcpy_64.S | 7 ++++---
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/string_64.h b/arch/x86/include/asm/string_64.h
index ca6ba36..90dbbd9 100644
--- a/arch/x86/include/asm/string_64.h
+++ b/arch/x86/include/asm/string_64.h
@@ -87,9 +87,9 @@ int strcmp(const char *cs, const char *ct);
*
* Low level memory copy function that catches machine checks
*
- * Return true for success, false for fail
+ * Return 0 for success, -EFAULT for fail
*/
-bool memcpy_mcsafe(void *dst, const void *src, size_t cnt);
+int memcpy_mcsafe(void *dst, const void *src, size_t cnt);

#endif /* __KERNEL__ */

diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index cbb8ee5..2ec0b0abb 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -1,6 +1,7 @@
/* Copyright 2002 Andi Kleen */

#include <linux/linkage.h>
+#include <asm/errno.h>
#include <asm/cpufeatures.h>
#include <asm/alternative-asm.h>

@@ -268,16 +269,16 @@ ENTRY(memcpy_mcsafe)
decl %ecx
jnz .L_copy_trailing_bytes

- /* Copy successful. Return true */
+ /* Copy successful. Return zero */
.L_done_memcpy_trap:
xorq %rax, %rax
ret
ENDPROC(memcpy_mcsafe)

.section .fixup, "ax"
- /* Return false for any failure */
+ /* Return -EFAULT for any failure */
.L_memcpy_mcsafe_fail:
- mov $1, %rax
+ mov $-EFAULT, %rax
ret

.previous