This is a resend of v2 with the crypto people properly CC'd.
The original v1 can be found here:
http://lkml.kernel.org/g/[email protected]
While v1 (hopefully) fixed some issues in mpi_write_sgl() and
mpi_read_buffer() introduced by
commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") and by
commit 9cbe21d8f89d ("lib/mpi: only require buffers as big as needed for
the integer"),
I missed that there are some, including out-of-bounds buffer accesses,
in mpi_read_raw_from_sgl() as well.
Hence v2, which includes the original stuff from v1 plus my new fixes to
mpi_read_raw_from_sgl().
Applicable to linux-next-20160318.
Changes to v1:
- [1-8/14]
former [1-8/8], unchanged.
- [9-14/14]
Added in v2. Fixes to mpi_read_raw_from_sgl().
Nicolai Stange (14):
lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
lib/mpi: mpi_write_sgl(): fix style issue with lzero decrement
lib/mpi: mpi_write_sgl(): purge redundant pointer arithmetic
lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access
lib/mpi: mpi_write_sgl(): replace open coded endian conversion
lib/mpi: mpi_read_buffer(): optimize skipping of leading zero limbs
lib/mpi: mpi_read_buffer(): replace open coded endian conversion
lib/mpi: mpi_read_buffer(): fix buffer overflow
lib/mpi: mpi_read_raw_from_sgl(): replace len argument by nbytes
lib/mpi: mpi_read_raw_from_sgl(): don't include leading zero SGEs in
nbytes
lib/mpi: mpi_read_raw_from_sgl(): purge redundant clearing of nbits
lib/mpi: mpi_read_raw_from_sgl(): fix nbits calculation
lib/mpi: mpi_read_raw_from_sgl(): sanitize meaning of indices
lib/mpi: mpi_read_raw_from_sgl(): fix out-of-bounds buffer access
lib/mpi/mpicoder.c | 122 +++++++++++++++++++----------------------------------
1 file changed, 43 insertions(+), 79 deletions(-)
--
2.7.3
Currently, the endian conversion from CPU order to BE is open coded in
mpi_read_buffer().
Replace this by the centrally provided cpu_to_be*() macros.
Copy from the temporary storage on stack to the destination buffer
by means of memcpy().
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 27 ++++++++++++---------------
1 file changed, 12 insertions(+), 15 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 2fd8d41..a999ee1 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -21,6 +21,7 @@
#include <linux/bitops.h>
#include <linux/count_zeros.h>
#include <linux/byteorder/generic.h>
+#include <linux/string.h>
#include "mpi-internal.h"
#define MAX_EXTERN_MPI_BITS 16384
@@ -164,7 +165,13 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
int *sign)
{
uint8_t *p;
- mpi_limb_t alimb;
+#if BYTES_PER_MPI_LIMB == 4
+ __be32 alimb;
+#elif BYTES_PER_MPI_LIMB == 8
+ __be64 alimb;
+#else
+#error please implement for this limb size.
+#endif
unsigned int n = mpi_get_size(a);
int i, lzeros;
@@ -187,25 +194,15 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
for (i = a->nlimbs - 1 - lzeros / BYTES_PER_MPI_LIMB,
lzeros %= BYTES_PER_MPI_LIMB;
i >= 0; i--) {
- alimb = a->d[i];
#if BYTES_PER_MPI_LIMB == 4
- *p++ = alimb >> 24;
- *p++ = alimb >> 16;
- *p++ = alimb >> 8;
- *p++ = alimb;
+ alimb = cpu_to_be32(a->d[i]);
#elif BYTES_PER_MPI_LIMB == 8
- *p++ = alimb >> 56;
- *p++ = alimb >> 48;
- *p++ = alimb >> 40;
- *p++ = alimb >> 32;
- *p++ = alimb >> 24;
- *p++ = alimb >> 16;
- *p++ = alimb >> 8;
- *p++ = alimb;
+ alimb = cpu_to_be64(a->d[i]);
#else
#error please implement for this limb size.
#endif
-
+ memcpy(p, &alimb, BYTES_PER_MPI_LIMB);
+ p += BYTES_PER_MPI_LIMB;
if (lzeros > 0) {
mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
--
2.7.3
Currently, if the number of leading zeros is greater than fits into a
complete limb, mpi_read_buffer() skips them by iterating over them
limb-wise.
Instead of skipping the high order zero limbs within the loop as shown
above, adjust the copying loop's bounds.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 18 ++++++++----------
1 file changed, 8 insertions(+), 10 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 623439e..2fd8d41 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -184,7 +184,9 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
p = buf;
*nbytes = n - lzeros;
- for (i = a->nlimbs - 1; i >= 0; i--) {
+ for (i = a->nlimbs - 1 - lzeros / BYTES_PER_MPI_LIMB,
+ lzeros %= BYTES_PER_MPI_LIMB;
+ i >= 0; i--) {
alimb = a->d[i];
#if BYTES_PER_MPI_LIMB == 4
*p++ = alimb >> 24;
@@ -205,15 +207,11 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
#endif
if (lzeros > 0) {
- if (lzeros >= sizeof(alimb)) {
- p -= sizeof(alimb);
- } else {
- mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
- mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
- + lzeros;
- *limb1 = *limb2;
- p -= lzeros;
- }
+ mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
+ mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
+ + lzeros;
+ *limb1 = *limb2;
+ p -= lzeros;
lzeros -= sizeof(alimb);
}
}
--
2.7.3
Currently, the nbytes local variable is calculated from the len argument
as follows:
... mpi_read_raw_from_sgl(..., unsigned int len)
{
unsigned nbytes;
...
if (!ents)
nbytes = 0;
else
nbytes = len - lzeros;
...
}
Given that nbytes is derived from len in a trivial way and that the len
argument is shadowed by a local len variable in several loops, this is just
confusing.
Rename the len argument to nbytes and get rid of the nbytes local variable.
Do the nbytes calculation in place.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 27582e2..2de2c7d 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -418,15 +418,15 @@ EXPORT_SYMBOL_GPL(mpi_write_to_sgl);
* a new MPI and reads the content of the sgl to the MPI.
*
* @sgl: scatterlist to read from
- * @len: number of bytes to read
+ * @nbytes: number of bytes to read
*
* Return: Pointer to a new MPI or NULL on error
*/
-MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int len)
+MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
{
struct scatterlist *sg;
int x, i, j, z, lzeros, ents;
- unsigned int nbits, nlimbs, nbytes;
+ unsigned int nbits, nlimbs;
mpi_limb_t a;
MPI val = NULL;
@@ -455,7 +455,7 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int len)
if (!ents)
nbytes = 0;
else
- nbytes = len - lzeros;
+ nbytes -= lzeros;
nbits = nbytes * 8;
if (nbits > MAX_EXTERN_MPI_BITS) {
--
2.7.3
In mpi_read_raw_from_sgl(), unsigned nbits is calculated as follows:
nbits = nbytes * 8;
and redundantly cleared later on if nbytes == 0:
if (nbytes > 0)
...
else
nbits = 0;
Purge this redundant clearing for the sake of clarity.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index add9e81..4cf8516 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -461,8 +461,6 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
if (nbytes > 0)
nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros));
- else
- nbits = 0;
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
--
2.7.3
The number of bits, nbits, is calculated in mpi_read_raw_from_sgl() as
follows:
nbits = nbytes * 8;
Afterwards, the number of leading zero bits of the first byte get
subtracted:
nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros));
However, count_leading_zeros() takes an unsigned long and thus,
the u8 gets promoted to an unsigned long.
Thus, the above doesn't subtract the number of leading zeros in the most
significant nonzero input byte from nbits, but the number of leading
zeros of the most significant nonzero input byte promoted to unsigned long,
i.e. BITS_PER_LONG - 8 too many.
Fix this by subtracting
count_leading_zeros(...) - (BITS_PER_LONG - 8)
from nbits only.
Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 4cf8516..5d02efe 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -460,7 +460,8 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
}
if (nbytes > 0)
- nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros));
+ nbits -= count_leading_zeros(*(u8 *)(sg_virt(sgl) + lzeros)) -
+ (BITS_PER_LONG - 8);
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
--
2.7.3
At the very beginning of mpi_read_raw_from_sgl(), the leading zeros of
the input scatterlist are counted:
lzeros = 0;
for_each_sg(sgl, sg, ents, i) {
...
if (/* sg contains nonzero bytes */)
break;
/* sg contains nothing but zeros here */
ents--;
lzeros = 0;
}
Later on, the total number of trailing nonzero bytes is calculated by
subtracting the number of leading zero bytes from the total number of input
bytes:
nbytes -= lzeros;
However, since lzeros gets reset to zero for each completely zero leading
sg in the loop above, it doesn't include those.
Besides wasting resources by allocating a too large output buffer,
this mistake propagates into the calculation of x, the number of
leading zeros within the most significant output limb:
x = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;
What's more, the low order bytes of the output, equal in number to the
extra bytes in nbytes, are left uninitialized.
Fix this by adjusting nbytes for each completely zero leading scatterlist
entry.
Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 2de2c7d..add9e81 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -447,16 +447,12 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
break;
ents--;
+ nbytes -= lzeros;
lzeros = 0;
}
sgl = sg;
-
- if (!ents)
- nbytes = 0;
- else
- nbytes -= lzeros;
-
+ nbytes -= lzeros;
nbits = nbytes * 8;
if (nbits > MAX_EXTERN_MPI_BITS) {
pr_info("MPI: mpi too large (%u bits)\n", nbits);
--
2.7.3
Currently, mpi_read_buffer() writes full limbs to the output buffer
and moves memory around to purge leading zero limbs afterwards.
However, with
commit 9cbe21d8f89d ("lib/mpi: only require buffers as big as needed for
the integer")
the caller is only required to provide a buffer large enough to hold the
result without the leading zeros.
This might result in a buffer overflow for small MP numbers with leading
zeros.
Fix this by coping the result to its final destination within the output
buffer and not copying the leading zeros at all.
Fixes: 9cbe21d8f89d ("lib/mpi: only require buffers as big as needed for
the integer")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index a999ee1..27582e2 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -201,16 +201,9 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
#else
#error please implement for this limb size.
#endif
- memcpy(p, &alimb, BYTES_PER_MPI_LIMB);
- p += BYTES_PER_MPI_LIMB;
- if (lzeros > 0) {
- mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
- mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
- + lzeros;
- *limb1 = *limb2;
- p -= lzeros;
- lzeros -= sizeof(alimb);
- }
+ memcpy(p, &alimb + lzeros, BYTES_PER_MPI_LIMB - lzeros);
+ p += BYTES_PER_MPI_LIMB - lzeros;
+ lzeros = 0;
}
return 0;
}
--
2.7.3
Currently, the endian conversion from CPU order to BE is open coded in
mpi_write_sgl().
Replace this by the centrally provided cpu_to_be*() macros.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 27 +++++++++++----------------
1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index b05d390..623439e 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -20,6 +20,7 @@
#include <linux/bitops.h>
#include <linux/count_zeros.h>
+#include <linux/byteorder/generic.h>
#include "mpi-internal.h"
#define MAX_EXTERN_MPI_BITS 16384
@@ -359,7 +360,13 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
int *sign)
{
u8 *p, *p2;
- mpi_limb_t alimb, alimb2;
+#if BYTES_PER_MPI_LIMB == 4
+ __be32 alimb;
+#elif BYTES_PER_MPI_LIMB == 8
+ __be64 alimb;
+#else
+#error please implement for this limb size.
+#endif
unsigned int n = mpi_get_size(a);
int i, x, y = 0, lzeros, buf_len;
@@ -383,22 +390,10 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
for (i = a->nlimbs - 1 - lzeros / BYTES_PER_MPI_LIMB,
lzeros %= BYTES_PER_MPI_LIMB;
i >= 0; i--) {
- alimb = a->d[i];
- p = (u8 *)&alimb2;
#if BYTES_PER_MPI_LIMB == 4
- *p++ = alimb >> 24;
- *p++ = alimb >> 16;
- *p++ = alimb >> 8;
- *p++ = alimb;
+ alimb = cpu_to_be32(a->d[i]);
#elif BYTES_PER_MPI_LIMB == 8
- *p++ = alimb >> 56;
- *p++ = alimb >> 48;
- *p++ = alimb >> 40;
- *p++ = alimb >> 32;
- *p++ = alimb >> 24;
- *p++ = alimb >> 16;
- *p++ = alimb >> 8;
- *p++ = alimb;
+ alimb = cpu_to_be64(a->d[i]);
#else
#error please implement for this limb size.
#endif
@@ -407,7 +402,7 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
lzeros = 0;
}
- p = p - sizeof(alimb) + y;
+ p = (u8 *)&alimb + y;
for (x = 0; x < sizeof(alimb) - y; x++) {
if (!buf_len) {
--
2.7.3
Within the copying loop in mpi_read_raw_from_sgl(), the last input SGE's
byte count gets artificially extended as follows:
if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB))
len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB);
Within the following byte copying loop, this causes reads beyond that
SGE's allocated buffer:
BUG: KASAN: slab-out-of-bounds in mpi_read_raw_from_sgl+0x331/0x650
at addr ffff8801e168d4d8
Read of size 1 by task systemd-udevd/721
[...]
Call Trace:
[<ffffffff818c4d35>] dump_stack+0xbc/0x117
[<ffffffff818c4c79>] ? _atomic_dec_and_lock+0x169/0x169
[<ffffffff814af5d1>] ? print_section+0x61/0xb0
[<ffffffff814b1109>] print_trailer+0x179/0x2c0
[<ffffffff814bc524>] object_err+0x34/0x40
[<ffffffff814bfdc7>] kasan_report_error+0x307/0x8c0
[<ffffffff814bf315>] ? kasan_unpoison_shadow+0x35/0x50
[<ffffffff814bf38e>] ? kasan_kmalloc+0x5e/0x70
[<ffffffff814c0ad1>] kasan_report+0x71/0xa0
[<ffffffff81938171>] ? mpi_read_raw_from_sgl+0x331/0x650
[<ffffffff814bf1a6>] __asan_load1+0x46/0x50
[<ffffffff81938171>] mpi_read_raw_from_sgl+0x331/0x650
[<ffffffff817f41b6>] rsa_verify+0x106/0x260
[<ffffffff817f40b0>] ? rsa_set_pub_key+0xf0/0xf0
[<ffffffff818edc79>] ? sg_init_table+0x29/0x50
[<ffffffff817f4d22>] ? pkcs1pad_sg_set_buf+0xb2/0x2e0
[<ffffffff817f5b74>] pkcs1pad_verify+0x1f4/0x2b0
[<ffffffff81831057>] public_key_verify_signature+0x3a7/0x5e0
[<ffffffff81830cb0>] ? public_key_describe+0x80/0x80
[<ffffffff817830f0>] ? keyring_search_aux+0x150/0x150
[<ffffffff818334a4>] ? x509_request_asymmetric_key+0x114/0x370
[<ffffffff814b83f0>] ? kfree+0x220/0x370
[<ffffffff818312c2>] public_key_verify_signature_2+0x32/0x50
[<ffffffff81830b5c>] verify_signature+0x7c/0xb0
[<ffffffff81835d0c>] pkcs7_validate_trust+0x42c/0x5f0
[<ffffffff813c391a>] system_verify_data+0xca/0x170
[<ffffffff813c3850>] ? top_trace_array+0x9b/0x9b
[<ffffffff81510b29>] ? __vfs_read+0x279/0x3d0
[<ffffffff8129372f>] mod_verify_sig+0x1ff/0x290
[...]
The exact purpose of the len extension isn't clear to me, but due to
its form, I suspect that it's a leftover somehow accounting for leading
zero bytes within the most significant output limb.
Note however that without that len adjustement, the total number of bytes
ever processed by the inner loop equals nbytes and thus, the last output
limb gets written at this point. Thus the net effect of the len adjustement
cited above is just to keep the inner loop running for some more
iterations, namely < BYTES_PER_MPI_LIMB ones, reading some extra bytes from
beyond the last SGE's buffer and discarding them afterwards.
Fix this issue by purging the extension of len beyond the last input SGE's
buffer length.
Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 3f114d2..0c534ac 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -484,9 +484,6 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
const u8 *buffer = sg_virt(sg) + lzeros;
int len = sg->length - lzeros;
- if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB))
- len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB);
-
for (x = 0; x < len; x++) {
a <<= 8;
a |= *buffer++;
--
2.7.3
Within the byte reading loop in mpi_read_raw_sgl(), there are two
housekeeping indices used, z and x.
At all times, the index z represents the number of output bytes covered
by the input SGEs for which processing has completed so far. This includes
any leading zero bytes within the most significant limb.
The index x changes its meaning after the first outer loop's iteration
though: while processing the first input SGE, it represents
"number of leading zero bytes in most significant output limb" +
"current position within current SGE"
For the remaining SGEs OTOH, x corresponds just to
"current position within current SGE"
After all, it is only the sum of z and x that has any meaning for the
output buffer and thus, the
"number of leading zero bytes in most significant output limb"
part can be moved away from x into z from the beginning, opening up the
opportunity for cleaner code.
Before the outer loop iterating over the SGEs, don't initialize z with
zero, but with the number of leading zero bytes in the most significant
output limb. For the inner loop iterating over a single SGE's bytes,
get rid of the buf_shift offset to x' bounds and let x run from zero to
sg->length - 1.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 5d02efe..3f114d2 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -477,19 +477,17 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
j = nlimbs - 1;
a = 0;
- z = 0;
- x = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;
- x %= BYTES_PER_MPI_LIMB;
+ z = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;
+ z %= BYTES_PER_MPI_LIMB;
for_each_sg(sgl, sg, ents, i) {
const u8 *buffer = sg_virt(sg) + lzeros;
int len = sg->length - lzeros;
- int buf_shift = x;
if (sg_is_last(sg) && (len % BYTES_PER_MPI_LIMB))
len += BYTES_PER_MPI_LIMB - (len % BYTES_PER_MPI_LIMB);
- for (; x < len + buf_shift; x++) {
+ for (x = 0; x < len; x++) {
a <<= 8;
a |= *buffer++;
if (((z + x + 1) % BYTES_PER_MPI_LIMB) == 0) {
@@ -498,7 +496,6 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
}
}
z += x;
- x = 0;
lzeros = 0;
}
return val;
--
2.7.3
Within the copying loop in mpi_write_sgl(), we have
if (lzeros) {
...
p -= lzeros;
y = lzeros;
}
p = p - (sizeof(alimb) - y);
If lzeros == 0, then y == 0, too. Thus, lzeros gets subtracted and added
back again to p.
Purge this redundancy.
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index d8b372b..78ec4e1 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -407,12 +407,11 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
+ lzeros;
*limb1 = *limb2;
- p -= lzeros;
y = lzeros;
lzeros = 0;
}
- p = p - (sizeof(alimb) - y);
+ p = p - sizeof(alimb);
for (x = 0; x < sizeof(alimb) - y; x++) {
if (!buf_len) {
--
2.7.3
Within the copying loop in mpi_write_sgl(), we have
if (lzeros) {
mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
+ lzeros;
*limb1 = *limb2;
...
}
where p points past the end of alimb2 which lives on the stack and contains
the current limb in BE order.
The purpose of the above is to shift the non-zero bytes of alimb2 to its
beginning in memory, i.e. to skip its leading zero bytes.
However, limb2 points somewhere into the middle of alimb2 and thus, reading
*limb2 pulls in lzero bytes from somewhere.
Indeed, KASAN splats:
BUG: KASAN: stack-out-of-bounds in mpi_write_to_sgl+0x4e3/0x6f0
at addr ffff8800cb04f601
Read of size 8 by task systemd-udevd/391
page:ffffea00032c13c0 count:0 mapcount:0 mapping: (null) index:0x0
flags: 0x3fff8000000000()
page dumped because: kasan: bad access detected
CPU: 3 PID: 391 Comm: systemd-udevd Tainted: G B L
4.5.0-next-20160316+ #12
[...]
Call Trace:
[<ffffffff8194889e>] dump_stack+0xdc/0x15e
[<ffffffff819487c2>] ? _atomic_dec_and_lock+0xa2/0xa2
[<ffffffff814892b5>] ? __dump_page+0x185/0x330
[<ffffffff8150ffd6>] kasan_report_error+0x5e6/0x8b0
[<ffffffff814724cd>] ? kzfree+0x2d/0x40
[<ffffffff819c5bce>] ? mpi_free_limb_space+0xe/0x20
[<ffffffff819c469e>] ? mpi_powm+0x37e/0x16f0
[<ffffffff815109f1>] kasan_report+0x71/0xa0
[<ffffffff819c0353>] ? mpi_write_to_sgl+0x4e3/0x6f0
[<ffffffff8150ed34>] __asan_load8+0x64/0x70
[<ffffffff819c0353>] mpi_write_to_sgl+0x4e3/0x6f0
[<ffffffff819bfe70>] ? mpi_set_buffer+0x620/0x620
[<ffffffff819c0e6f>] ? mpi_cmp+0xbf/0x180
[<ffffffff8186e282>] rsa_verify+0x202/0x260
What's more, since lzeros can be anything from 1 to sizeof(mpi_limb_t)-1,
the above will cause unaligned accesses which is bad on non-x86 archs.
Fix the issue, by preparing the starting point p for the upcoming copy
operation instead of shifting the source memory, i.e. alimb2.
Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 78ec4e1..b05d390 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -403,15 +403,11 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
#error please implement for this limb size.
#endif
if (lzeros) {
- mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
- mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
- + lzeros;
- *limb1 = *limb2;
y = lzeros;
lzeros = 0;
}
- p = p - sizeof(alimb);
+ p = p - sizeof(alimb) + y;
for (x = 0; x < sizeof(alimb) - y; x++) {
if (!buf_len) {
--
2.7.3
Within the copying loop in mpi_write_sgl(), we have
if (lzeros > 0) {
...
lzeros -= sizeof(alimb);
}
However, at this point, lzeros < sizeof(alimb) holds. Make this fact
explicit by rewriting the above to
if (lzeros) {
...
lzeros = 0;
}
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 6bb52be..d8b372b 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -402,14 +402,14 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
#else
#error please implement for this limb size.
#endif
- if (lzeros > 0) {
+ if (lzeros) {
mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
+ lzeros;
*limb1 = *limb2;
p -= lzeros;
y = lzeros;
- lzeros -= sizeof(alimb);
+ lzeros = 0;
}
p = p - (sizeof(alimb) - y);
--
2.7.3
Currently, if the number of leading zeros is greater than fits into a
complete limb, mpi_write_sgl() skips them by iterating over them limb-wise.
However, it fails to adjust its internal leading zeros tracking variable,
lzeros, accordingly: it does a
p -= sizeof(alimb);
continue;
which should really have been a
lzeros -= sizeof(alimb);
continue;
Since lzeros never decreases if its initial value >= sizeof(alimb), nothing
gets copied by mpi_write_sgl() in that case.
Instead of skipping the high order zero limbs within the loop as shown
above, fix the issue by adjusting the copying loop's bounds.
Fixes: 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers")
Signed-off-by: Nicolai Stange <[email protected]>
---
lib/mpi/mpicoder.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index eb15e7d..6bb52be 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -380,7 +380,9 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
buf_len = sgl->length;
p2 = sg_virt(sgl);
- for (i = a->nlimbs - 1; i >= 0; i--) {
+ for (i = a->nlimbs - 1 - lzeros / BYTES_PER_MPI_LIMB,
+ lzeros %= BYTES_PER_MPI_LIMB;
+ i >= 0; i--) {
alimb = a->d[i];
p = (u8 *)&alimb2;
#if BYTES_PER_MPI_LIMB == 4
@@ -401,17 +403,12 @@ int mpi_write_to_sgl(MPI a, struct scatterlist *sgl, unsigned *nbytes,
#error please implement for this limb size.
#endif
if (lzeros > 0) {
- if (lzeros >= sizeof(alimb)) {
- p -= sizeof(alimb);
- continue;
- } else {
- mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
- mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
- + lzeros;
- *limb1 = *limb2;
- p -= lzeros;
- y = lzeros;
- }
+ mpi_limb_t *limb1 = (void *)p - sizeof(alimb);
+ mpi_limb_t *limb2 = (void *)p - sizeof(alimb)
+ + lzeros;
+ *limb1 = *limb2;
+ p -= lzeros;
+ y = lzeros;
lzeros -= sizeof(alimb);
}
--
2.7.3
Hi Nicolai,
On 03/21/2016 06:26 AM, Nicolai Stange wrote:
> This is a resend of v2 with the crypto people properly CC'd.
>
> The original v1 can be found here:
>
> http://lkml.kernel.org/g/[email protected]
>
>
> While v1 (hopefully) fixed some issues in mpi_write_sgl() and
> mpi_read_buffer() introduced by
> commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") and by
> commit 9cbe21d8f89d ("lib/mpi: only require buffers as big as needed for
> the integer"),
> I missed that there are some, including out-of-bounds buffer accesses,
> in mpi_read_raw_from_sgl() as well.
>
> Hence v2, which includes the original stuff from v1 plus my new fixes to
> mpi_read_raw_from_sgl().
>
>
> Applicable to linux-next-20160318.
>
>
> Changes to v1:
> - [1-8/14]
> former [1-8/8], unchanged.
>
> - [9-14/14]
> Added in v2. Fixes to mpi_read_raw_from_sgl().
>
> Nicolai Stange (14):
> lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
> lib/mpi: mpi_write_sgl(): fix style issue with lzero decrement
> lib/mpi: mpi_write_sgl(): purge redundant pointer arithmetic
> lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access
> lib/mpi: mpi_write_sgl(): replace open coded endian conversion
> lib/mpi: mpi_read_buffer(): optimize skipping of leading zero limbs
> lib/mpi: mpi_read_buffer(): replace open coded endian conversion
> lib/mpi: mpi_read_buffer(): fix buffer overflow
> lib/mpi: mpi_read_raw_from_sgl(): replace len argument by nbytes
> lib/mpi: mpi_read_raw_from_sgl(): don't include leading zero SGEs in
> nbytes
> lib/mpi: mpi_read_raw_from_sgl(): purge redundant clearing of nbits
> lib/mpi: mpi_read_raw_from_sgl(): fix nbits calculation
> lib/mpi: mpi_read_raw_from_sgl(): sanitize meaning of indices
> lib/mpi: mpi_read_raw_from_sgl(): fix out-of-bounds buffer access
>
> lib/mpi/mpicoder.c | 122 +++++++++++++++++++----------------------------------
> 1 file changed, 43 insertions(+), 79 deletions(-)
Thanks for sending this. Nice work. In fact the mpi_write_sgl() function
worked only because the mpi_normalize() stripped all MSB zero limbs.
Given that this will hold for all cases we can simplify this even more.
Unfortunately I don't know if this will be true for mpi_sub() or
mpi_set_ui() because they are declared in include/linux/mpi.h,
but never defined anywhere.
I've found one problem in 08/14 in mpi_read_buffer()
It's a pointer arithmetic issue, which causes the first limb to be
written at an invalid address if lzeros > 0. This incremental patch
fixes it.
---8<---
Subject: [PATCH] lib/mpi: fix pointer arithmetic issue in mpi_read_buffer
Fix pointer arithmetic issue, which causes the first limb to be
written at invalid address if lzeros > 0.
Signed-off-by: Tadeusz Struk <[email protected]>
---
lib/mpi/mpicoder.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 0c534ac..747606f 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -201,7 +201,7 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
#else
#error please implement for this limb size.
#endif
- memcpy(p, &alimb + lzeros, BYTES_PER_MPI_LIMB - lzeros);
+ memcpy(p, (u8 *)&alimb + lzeros, BYTES_PER_MPI_LIMB - lzeros);
p += BYTES_PER_MPI_LIMB - lzeros;
lzeros = 0;
}
---
Other than that please include my
Tested-by: Tadeusz Struk <[email protected]>
for the whole series.
Thanks,
--
TS
Hi Tadeusz,
thank you very much for your quick reply!
Tadeusz Struk <[email protected]> writes:
> On 03/21/2016 06:26 AM, Nicolai Stange wrote:
>> This is a resend of v2 with the crypto people properly CC'd.
>>
>> The original v1 can be found here:
>>
>> http://lkml.kernel.org/g/[email protected]
>>
>>
>> While v1 (hopefully) fixed some issues in mpi_write_sgl() and
>> mpi_read_buffer() introduced by
>> commit 2d4d1eea540b ("lib/mpi: Add mpi sgl helpers") and by
>> commit 9cbe21d8f89d ("lib/mpi: only require buffers as big as needed for
>> the integer"),
>> I missed that there are some, including out-of-bounds buffer accesses,
>> in mpi_read_raw_from_sgl() as well.
>>
>> Hence v2, which includes the original stuff from v1 plus my new fixes to
>> mpi_read_raw_from_sgl().
>>
>>
>> Applicable to linux-next-20160318.
>>
>>
>> Changes to v1:
>> - [1-8/14]
>> former [1-8/8], unchanged.
>>
>> - [9-14/14]
>> Added in v2. Fixes to mpi_read_raw_from_sgl().
>>
>> Nicolai Stange (14):
>> lib/mpi: mpi_write_sgl(): fix skipping of leading zero limbs
>> lib/mpi: mpi_write_sgl(): fix style issue with lzero decrement
>> lib/mpi: mpi_write_sgl(): purge redundant pointer arithmetic
>> lib/mpi: mpi_write_sgl(): fix out-of-bounds stack access
>> lib/mpi: mpi_write_sgl(): replace open coded endian conversion
>> lib/mpi: mpi_read_buffer(): optimize skipping of leading zero limbs
>> lib/mpi: mpi_read_buffer(): replace open coded endian conversion
>> lib/mpi: mpi_read_buffer(): fix buffer overflow
>> lib/mpi: mpi_read_raw_from_sgl(): replace len argument by nbytes
>> lib/mpi: mpi_read_raw_from_sgl(): don't include leading zero SGEs in
>> nbytes
>> lib/mpi: mpi_read_raw_from_sgl(): purge redundant clearing of nbits
>> lib/mpi: mpi_read_raw_from_sgl(): fix nbits calculation
>> lib/mpi: mpi_read_raw_from_sgl(): sanitize meaning of indices
>> lib/mpi: mpi_read_raw_from_sgl(): fix out-of-bounds buffer access
>>
>> lib/mpi/mpicoder.c | 122 +++++++++++++++++++----------------------------------
>> 1 file changed, 43 insertions(+), 79 deletions(-)
>
> Thanks for sending this. Nice work. In fact the mpi_write_sgl() function
> worked only because the mpi_normalize() stripped all MSB zero limbs.
> Given that this will hold for all cases we can simplify this even more.
> Unfortunately I don't know if this will be true for mpi_sub() or
> mpi_set_ui() because they are declared in include/linux/mpi.h,
> but never defined anywhere.
>
> I've found one problem in 08/14 in mpi_read_buffer()
> It's a pointer arithmetic issue, which causes the first limb to be
> written at an invalid address if lzeros > 0. This incremental patch
> fixes it.
Ugh. I'll send a v3 fixing this up during the course of the day. Or do
you prefer to apply your incremental patch below to this v2 as it
stands?
Anyway, thank you for spotting this!
>
> ---8<---
> Subject: [PATCH] lib/mpi: fix pointer arithmetic issue in mpi_read_buffer
>
> Fix pointer arithmetic issue, which causes the first limb to be
> written at invalid address if lzeros > 0.
>
> Signed-off-by: Tadeusz Struk <[email protected]>
> ---
> lib/mpi/mpicoder.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
> index 0c534ac..747606f 100644
> --- a/lib/mpi/mpicoder.c
> +++ b/lib/mpi/mpicoder.c
> @@ -201,7 +201,7 @@ int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
> #else
> #error please implement for this limb size.
> #endif
> - memcpy(p, &alimb + lzeros, BYTES_PER_MPI_LIMB - lzeros);
> + memcpy(p, (u8 *)&alimb + lzeros, BYTES_PER_MPI_LIMB - lzeros);
> p += BYTES_PER_MPI_LIMB - lzeros;
> lzeros = 0;
> }
>
> ---
> Other than that please include my
> Tested-by: Tadeusz Struk <[email protected]>
> for the whole series.
Glad to get that one :)
Nicolai
On 03/22/2016 12:06 AM, Nicolai Stange wrote:
> Ugh. I'll send a v3 fixing this up during the course of the day. Or do
> you prefer to apply your incremental patch below to this v2 as it
> stands?
Either way is fine with me.
Thanks,
--
TS