LinuxLists.cc - [PATCH] hex2bin: make the function hex_to

2022-04-24 21:50:26

Subject: [PATCH] hex2bin: make the function hex_to_bin constant-time

The function hex2bin is used to load cryptographic keys into device mapper
targets dm-crypt and dm-integrity. It should take constant time
independent on the processed data, so that concurrently running
unprivileged code can't infer any information about the keys via
microarchitectural convert channels.

This patch changes the function hex_to_bin so that it contains no branches
and no memory accesses.

Note that this shouldn't cause performance degradation because the size of
the new function is the same as the size of the old function (on x86-64) -
and the new function causes no branch misprediction penalties.

I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64
i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32
sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64
powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are no
branches in the generated code.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]

---
lib/hexdump.c | 27 +++++++++++++++++++++------
1 file changed, 21 insertions(+), 6 deletions(-)

Index: linux-2.6/lib/hexdump.c
===================================================================
--- linux-2.6.orig/lib/hexdump.c 2022-04-24 18:51:20.000000000 +0200
+++ linux-2.6/lib/hexdump.c 2022-04-24 18:51:20.000000000 +0200
@@ -22,15 +22,30 @@ EXPORT_SYMBOL(hex_asc_upper);
*
* hex_to_bin() converts one hex digit to its actual value or -1 in case of bad
* input.
+ *
+ * This function is used to load cryptographic keys, so it is coded in such a
+ * way that there are no conditions or memory accesses that depend on data.
+ *
+ * Explanation of the logic:
+ * (ch - '9' - 1) is negative if ch <= '9'
+ * ('0' - 1 - ch) is negative if ch >= '0'
+ * we "and" these two values, so the result is negative if ch is in the range
+ * '0' ... '9'
+ * we are only interested in the sign, so we do a shift ">> 8" --- we have -1 if
+ * ch is in the range '0' ... '9', 0 otherwise
+ * we "and" this value with (ch - '0' + 1) --- we have a value 1 ... 10 if ch is
+ * in the range '0' ... '9', 0 otherwise
+ * we add this value to -1 --- we have a value 0 ... 9 if ch is in the range '0'
+ * ... '9', -1 otherwise
+ * the next line is similar to the previous one, but we need to decode both
+ * uppercase and lowercase letters, so we use (ch & 0xdf), which converts
+ * lowercase to uppercase
*/
int hex_to_bin(char ch)
{
- if ((ch >= '0') && (ch <= '9'))
- return ch - '0';
- ch = tolower(ch);
- if ((ch >= 'a') && (ch <= 'f'))
- return ch - 'a' + 10;
- return -1;
+ return -1 +
+ ((ch - '0' + 1) & (((ch - '9' - 1) & ('0' - 1 - ch)) >> 8)) +
+ (((ch & 0xdf) - 'A' + 11) & ((((ch & 0xdf) - 'F' - 1) & ('A' - 1 - (ch & 0xdf))) >> 8));
}
EXPORT_SYMBOL(hex_to_bin);

2022-04-25 17:51:20

by David Laight

[permalink] [raw]

Subject: RE: [PATCH] hex2bin: make the function hex_to_bin constant-time

From: Mikulas Patocka
> Sent: 25 April 2022 12:04
>
> On Mon, 25 Apr 2022, David Laight wrote:
>
> > From: Linus Torvalds
> > > Sent: 24 April 2022 22:42
> > >
> > > On Sun, Apr 24, 2022 at 2:37 PM Linus Torvalds
> > > <[email protected]> wrote:
> > > >
> > > > Finally, for the same reason - please don't use ">> 8". Because I do
> > > > not believe that bit 8 is well-defined in your arithmetic. The *sign*
> > > > bit will be, but I'm not convinced bit 8 is.
> > >
> > > Hmm.. I think it's ok. It can indeed overflow in 'char' and change the
> > > sign in bit #7, but I suspect bit #8 is always fine.
> > >
> > > Still, If you want to just extend the sign bit, ">> 31" _is_ the
> > > obvious thing to use (yeah, yeah, properly "sizeof(int)*8-1" or
> > > whatever, you get my drift).
> >
> > Except that right shifts of signed values are UB.
> > In particular it has always been valid to do an unsigned
> > shift right on a 2's compliment negative number.
> >
> > David
>
> Yes. All the standard versions (C89, C99, C11, C2X) say that right shift
> of a negative value is implementation-defined.
>
> So, we should cast it to "unsigned" before shifting it.

Except that the intent appears to be to replicate the sign bit.

If it is 'implementation defined' (rather than suddenly being UB)
it might be that the linux kernel requires sign propagating
right shifts of negative values.
This is typically what happens on 2's compliment systems.
But not all small cpu have the required shift instruction.
OTOH all the ones bit enough to run Linux probably do.
(And gcc doesn't support '1's compliment' or 'sign overpunch' cpus.)

The problem is that the compiler writers seem to be entering
a mindset where they are optimising code based on UB behaviour.
So given:
void foo(int x)
{
if (x >> 1 < 0)
return;
do_something();
}
they decide the test is UB, so can always be assumed to be true
and thus do_something() is compiled away.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)