2001-10-27 04:23:17

by Ren

[permalink] [raw]
Subject: [PATCH] random.c bugfix

Hi,

there's a bug in random.c, I think. The third argument of
extract_entropy() is supposed to be the number of _bytes_ to extract,
while nwords contains the number of _bytes_ we want. This seems to lead
us to transfer n bytes of entropy and credit for n*4 bytes.

Ren?



--- linux-2.4.14-pre2/drivers/char/random.c Fri Oct 26 23:07:16 2001
+++ linux-2.4.14-pre2-rs/drivers/char/random.c Sat Oct 27 05:36:23 2001
@@ -1253,7 +1253,7 @@
r == sec_random_state ? "secondary" : "unknown",
r->entropy_count, nbytes * 8);

- extract_entropy(random_state, tmp, nwords, 0);
+ extract_entropy(random_state, tmp, nwords * 4, 0);
add_entropy_words(r, tmp, nwords);
credit_entropy_store(r, nwords * 32);
}


2001-10-27 06:22:06

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 27, 2001 06:21 +0200, Ren? Scharfe wrote:
> there's a bug in random.c, I think. The third argument of
> extract_entropy() is supposed to be the number of _bytes_ to extract,
> while nwords contains the number of _bytes_ we want. This seems to lead
^^^^^ words, I think you mean ;-)
> us to transfer n bytes of entropy and credit for n*4 bytes.

OK, my bad. At least the random variable-name cleanups let you SEE where
we are supposed to be using word sizes and byte sizes. Even you were
confused about it ;-)

> --- linux-2.4.14-pre2/drivers/char/random.c Fri Oct 26 23:07:16 2001
> +++ linux-2.4.14-pre2-rs/drivers/char/random.c Sat Oct 27 05:36:23 2001
> @@ -1253,7 +1253,7 @@
> r == sec_random_state ? "secondary" : "unknown",
> r->entropy_count, nbytes * 8);
>
> - extract_entropy(random_state, tmp, nwords, 0);
> + extract_entropy(random_state, tmp, nwords * 4, 0);
> add_entropy_words(r, tmp, nwords);
> credit_entropy_store(r, nwords * 32);
> }

The patch looks correct, though.

> The rest of the patch is just there for consistency and because it just
> looks better to me. Those sizeof()s were introduced in kernel 2.4.13, I
> just can't imagine why. Care to explain anyone?

> @@ -1260,9 +1260,9 @@
> if (r->extract_count > 1024) {
> DEBUG_ENT("reseeding %s with %d from primary\n",
> r == sec_random_state ? "secondary" : "unknown",
> - sizeof(tmp) * 8);
> - extract_entropy(random_state, tmp, sizeof(tmp), 0);
> - add_entropy_words(r, tmp, sizeof(tmp) / 4);
> + TMP_BUF_SIZE * 32);
> + extract_entropy(random_state, tmp, TMP_BUF_SIZE * 4, 0);
> + add_entropy_words(r, tmp, TMP_BUF_SIZE);
> r->extract_count = 0;
> }
> }

Well, this is a matter of taste. With my code, it is correct regardless
of how tmp is declared, while with your code you assume tmp is TMP_BUF_SIZE
words, and that it is declared with a 4-byte type. Both ways are resolved
at compile time, so using "sizeof(tmp)/4" or "sizeof(tmp)*8" doesn't add
any run-time overhead.

I don't have a strong opinion either way, if Linus and/or Alan have a
preference to do it one way or the other.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-10-27 06:35:49

by Robert Love

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Sat, 2001-10-27 at 02:21, Andreas Dilger wrote:
> OK, my bad. At least the random variable-name cleanups let you SEE where
> we are supposed to be using word sizes and byte sizes. Even you were
> confused about it ;-)

I went over your original patch good; I am surprised I missed this. :/
Nonetheless, only with the new cleanups could anyone spot this.

> Well, this is a matter of taste. With my code, it is correct regardless
> of how tmp is declared, while with your code you assume tmp is TMP_BUF_SIZE
> words, and that it is declared with a 4-byte type. Both ways are resolved
> at compile time, so using "sizeof(tmp)/4" or "sizeof(tmp)*8" doesn't add
> any run-time overhead.

I think I prefer your sizeof() method, if for nothing else but that we
can keep it consistent -- we can always take the sizeof a variable and
not everything has its size in a define.

Furthermore, sizeof(tmp) certainly means "size of the variable temp"
while TMP_BUF_SIZE could be the size of anything related to tmp -- the
buffer it points to (if it were a pointer), a buffer in it (if it were a
struct), etc. Since it all compiles to the same, it is not a huge
issue. Just my two bits...

> I don't have a strong opinion either way, if Linus and/or Alan have a
> preference to do it one way or the other.

...but I'm not Alan or Linus ;)

Robert Love

2001-10-29 00:04:37

by Horst von Brand

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

Andreas Dilger <[email protected]> said:

[...]

> OK, my bad. At least the random variable-name cleanups let you SEE where
> we are supposed to be using word sizes and byte sizes. Even you were
> confused about it ;-)

I have now seen various bits and pieces about this flying around. To get it
right will be hard, as over/under estimates will show up only under unusual
circumstances; and as you _can't_ really know how much "entropy" there
should be, testing this is very hard. So the only way to get it right is
make it "obviously" right.

How hard would it be to change the API to talk _all_ in the same units, be
it bits, bytes, words, or whatever? I believe bits or bytes would be best,
as word sizes differ. Bytes have the advantage that a simple sizeof() will
give size in bytes for any random variable.
--
Horst von Brand [email protected]
Casilla 9G, Vin~a del Mar, Chile +56 32 672616

2001-10-29 05:41:07

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 28, 2001 20:57 -0300, Horst von Brand wrote:
> Andreas Dilger <[email protected]> said:
> > OK, my bad. At least the random variable-name cleanups let you SEE where
> > we are supposed to be using word sizes and byte sizes. Even you were
> > confused about it ;-)
>
> I have now seen various bits and pieces about this flying around. To get it
> right will be hard, as over/under estimates will show up only under unusual
> circumstances; and as you _can't_ really know how much "entropy" there
> should be, testing this is very hard. So the only way to get it right is
> make it "obviously" right.

Yes, I have suspected the same thing. When testing these changes, I almost
always got 9-12 bits of entropy per keypress, and hundreds of bits for a
mouse movement. Granted, it may be reasonable given that I have a TSC,
which implies nearly random low order bits, but you would think that mouse
interrupts are at a fairly standard rate, so should have low entropy?

I left discussion about the actual entropy DATA out of my patch, and
focussed only on getting the accounting right. We still "discard" some
entropy in places, but it is orders of magnitude smaller than before,
and discarding entropy is not a danger, unless you are forced to use
less-secure data as a result of not having enough entropy.

> How hard would it be to change the API to talk _all_ in the same units, be
> it bits, bytes, words, or whatever? I believe bits or bytes would be best,
> as word sizes differ. Bytes have the advantage that a simple sizeof() will
> give size in bytes for any random variable.

Not too hard (basic patch below). This has the drawback of passing a
byte count to add_entropy(), but still using a word array for values.
It wouldn't be too hard to change add_entropy() to accept a byte array
(which we cast internally to a word array, don't really care about endian
issues here), and then do something* with any extra bytes. As it is,
add_entropy() "uses" the unset bytes at the end of the last word if we
are called from random_write(). We also can't make credit_entropy_store()
use bytes as a parameter, so we haven't totally fixed the problem.

(*) I don't know enough about the hash functions to know how to add a
few odd bytes into the store in a useful and safe way. We don't
really want to discard them either - think if a user-space random
daemon on an otherwise entropy-free system only writes one byte at
a time...

The second patch (incremental to the first) changes poolwords to poolbytes
in all places, but still doesn't fix add_entropy() to handle a byte array as
input.

Cheers, Andreas
==========================================================================
--- linux.orig/drivers/char/random.c Thu Oct 25 03:04:34 2001
+++ linux/drivers/char/random.c Sun Oct 28 21:28:23 2001
@@ -557,13 +553,13 @@
* it's cheap to do so and helps slightly in the expected case where
* the entropy is concentrated in the low-order bits.
*/
-static void add_entropy_words(struct entropy_store *r, const __u32 *in,
- int nwords)
+static void add_entropy(struct entropy_store *r, const __u32 *in, int nbytes)
{
static __u32 const twist_table[8] = {
0, 0x3b6e20c8, 0x76dc4190, 0x4db26158,
0xedb88320, 0xd6d6a3e8, 0x9b64c2b0, 0xa00ae278 };
unsigned i;
+ int nwords = nbytes / 4;
int new_rotate;
int wordmask = r->poolinfo.poolwords - 1;
__u32 w;
@@ -693,7 +689,7 @@
sec_random_state;
max_entropy = r->poolinfo.POOLBITS;
}
- add_entropy_words(r, batch_entropy_pool + 2*batch_tail, 2);
+ add_entropy(r, batch_entropy_pool + 2*batch_tail, 8);
credit_entropy_store(r, batch_entropy_credit[batch_tail]);
batch_tail = (batch_tail+1) & (batch_max-1);
}
@@ -1245,24 +1241,24 @@

if (r->entropy_count < nbytes * 8 &&
r->entropy_count < r->poolinfo.POOLBITS) {
- int nwords = min(r->poolinfo.poolwords - r->entropy_count/32,
- sizeof(tmp) / 4);
+ int cbytes = min(r->poolinfo.POOLBYTES - r->entropy_count/8,
+ sizeof(tmp));

DEBUG_ENT("xfer %d from primary to %s (have %d, need %d)\n",
- nwords * 32,
+ cbytes * 8,
r == sec_random_state ? "secondary" : "unknown",
r->entropy_count, nbytes * 8);

- extract_entropy(random_state, tmp, nwords, 0);
- add_entropy_words(r, tmp, nwords);
- credit_entropy_store(r, nwords * 32);
+ extract_entropy(random_state, tmp, cbytes, 0);
+ add_entropy(r, tmp, cbytes);
+ credit_entropy_store(r, cbytes * 8);
}
if (r->extract_count > 1024) {
DEBUG_ENT("reseeding %s with %d from primary\n",
r == sec_random_state ? "secondary" : "unknown",
sizeof(tmp) * 8);
extract_entropy(random_state, tmp, sizeof(tmp), 0);
- add_entropy_words(r, tmp, sizeof(tmp) / 4);
+ add_entropy(r, tmp, sizeof(tmp));
r->extract_count = 0;
}
}
@@ -1343,7 +1339,7 @@
*/
for (i = 0, x = 0; i < r->poolinfo.poolwords; i += 16, x+=2) {
HASH_TRANSFORM(tmp, r->pool+i);
- add_entropy_words(r, &tmp[x%HASH_BUFFER_SIZE], 1);
+ add_entropy(r, &tmp[x%HASH_BUFFER_SIZE], 4);
}

/*
@@ -1418,7 +1414,7 @@
do_gettimeofday(&tv);
words[0] = tv.tv_sec;
words[1] = tv.tv_usec;
- add_entropy_words(r, words, 2);
+ add_entropy(r, words, sizeof(words));

/*
* This doesn't lock system.utsname. However, we are generating
@@ -1427,7 +1423,7 @@
p = (char *) &system_utsname;
for (i = sizeof(system_utsname) / sizeof(words); i; i--) {
memcpy(words, p, sizeof(words));
- add_entropy_words(r, words, sizeof(words)/4);
+ add_entropy(r, words, sizeof(words));
p += sizeof(words);
}
}
@@ -1596,7 +1591,7 @@
c -= bytes;
p += bytes;

- add_entropy_words(random_state, buf, (bytes + 3) / 4);
+ add_entropy(random_state, buf, bytes);
}
if (p == buffer) {
return (ssize_t)ret;
@@ -1747,8 +1742,8 @@
if ((ret = create_entropy_store(poolsize, &new_store)))
return ret;

- add_entropy_words(new_store, random_state->pool,
- random_state->poolinfo.poolwords);
+ add_entropy(new_store, random_state->pool,
+ random_state->poolinfo.POOLBYTES);
credit_entropy_store(new_store, random_state->entropy_count);

sysctl_init_random(new_store);
============================================================================
--- linux/drivers/char/random.c~ Sun Oct 28 21:28:23 2001
+++ linux/drivers/char/random.c Sun Oct 28 21:53:46 2001
@@ -279,57 +279,56 @@
static int random_write_wakeup_thresh = 128;

/*
- * A pool of size .poolwords is stirred with a primitive polynomial
- * of degree .poolwords over GF(2). The taps for various sizes are
+ * A pool of size .poolbytes is stirred with a primitive polynomial
+ * of degree .poolbytes over GF(2). The taps for various sizes are
* defined below. They are chosen to be evenly spaced (minimum RMS
* distance from evenly spaced; the numbers in the comments are a
* scaled squared error sum) except for the last tap, which is 1 to
* get the twisting happening as fast as possible.
*/
static struct poolinfo {
- int poolwords;
+ int poolbytes;
int tap1, tap2, tap3, tap4, tap5;
} poolinfo_table[] = {
/* x^2048 + x^1638 + x^1231 + x^819 + x^411 + x + 1 -- 115 */
- { 2048, 1638, 1231, 819, 411, 1 },
+ { 16384, 1638, 1231, 819, 411, 1 },

/* x^1024 + x^817 + x^615 + x^412 + x^204 + x + 1 -- 290 */
- { 1024, 817, 615, 412, 204, 1 },
+ { 8192, 817, 615, 412, 204, 1 },
#if 0 /* Alternate polynomial */
/* x^1024 + x^819 + x^616 + x^410 + x^207 + x^2 + 1 -- 115 */
- { 1024, 819, 616, 410, 207, 2 },
+ { 8192, 819, 616, 410, 207, 2 },
#endif

/* x^512 + x^411 + x^308 + x^208 + x^104 + x + 1 -- 225 */
- { 512, 411, 308, 208, 104, 1 },
+ { 4096, 411, 308, 208, 104, 1 },
#if 0 /* Alternates */
/* x^512 + x^409 + x^307 + x^206 + x^102 + x^2 + 1 -- 95 */
- { 512, 409, 307, 206, 102, 2 },
+ { 4096, 409, 307, 206, 102, 2 },
/* x^512 + x^409 + x^309 + x^205 + x^103 + x^2 + 1 -- 95 */
- { 512, 409, 309, 205, 103, 2 },
+ { 4096, 409, 309, 205, 103, 2 },
#endif

/* x^256 + x^205 + x^155 + x^101 + x^52 + x + 1 -- 125 */
- { 256, 205, 155, 101, 52, 1 },
+ { 2048, 205, 155, 101, 52, 1 },

/* x^128 + x^103 + x^76 + x^51 +x^25 + x + 1 -- 105 */
- { 128, 103, 76, 51, 25, 1 },
+ { 1024, 103, 76, 51, 25, 1 },
#if 0 /* Alternate polynomial */
/* x^128 + x^103 + x^78 + x^51 + x^27 + x^2 + 1 -- 70 */
- { 128, 103, 78, 51, 27, 2 },
+ { 1024, 103, 78, 51, 27, 2 },
#endif

/* x^64 + x^52 + x^39 + x^26 + x^14 + x + 1 -- 15 */
- { 64, 52, 39, 26, 14, 1 },
+ { 512, 52, 39, 26, 14, 1 },

/* x^32 + x^26 + x^20 + x^14 + x^7 + x + 1 -- 15 */
- { 32, 26, 20, 14, 7, 1 },
+ { 256, 26, 20, 14, 7, 1 },

{ 0, 0, 0, 0, 0, 0 },
};

-#define POOLBITS poolwords*32
-#define POOLBYTES poolwords*4
+#define POOLBITS poolbytes*8

/*
* For the purposes of better mixing, we use the CRC-32 polynomial as
@@ -496,17 +495,16 @@
{
struct entropy_store *r;
struct poolinfo *p;
- int poolwords;
+ int poolbytes;

- poolwords = (size + 3) / 4; /* Convert bytes->words */
- /* The pool size must be a multiple of 16 32-bit words */
- poolwords = ((poolwords + 15) / 16) * 16;
+ /* The pool size must be a multiple of 16 32-bit words (64 bytes) */
+ poolbytes = ((poolbytes + 63) / 64) * 64;

- for (p = poolinfo_table; p->poolwords; p++) {
- if (poolwords == p->poolwords)
+ for (p = poolinfo_table; p->poolbytes; p++) {
+ if (poolbytes == p->poolbytes)
break;
}
- if (p->poolwords == 0)
+ if (p->poolbytes == 0)
return -EINVAL;

r = kmalloc(sizeof(struct entropy_store), GFP_KERNEL);
@@ -516,12 +514,12 @@
memset (r, 0, sizeof(struct entropy_store));
r->poolinfo = *p;

- r->pool = kmalloc(POOLBYTES, GFP_KERNEL);
+ r->pool = kmalloc(poolbytes, GFP_KERNEL);
if (!r->pool) {
kfree(r);
return -ENOMEM;
}
- memset(r->pool, 0, POOLBYTES);
+ memset(r->pool, 0, poolbytes);
*ret_bucket = r;
return 0;
}
@@ -533,7 +531,7 @@
r->entropy_count = 0;
r->input_rotate = 0;
r->extract_count = 0;
- memset(r->pool, 0, r->poolinfo.POOLBYTES);
+ memset(r->pool, 0, r->poolinfo.poolbytes);
}

static void free_entropy_store(struct entropy_store *r)
@@ -561,7 +559,7 @@
unsigned i;
int nwords = nbytes / 4;
int new_rotate;
- int wordmask = r->poolinfo.poolwords - 1;
+ int wordmask = r->poolinfo.poolbytes * 4 - 1;
__u32 w;

while (nwords--) {
@@ -1241,7 +1239,7 @@

if (r->entropy_count < nbytes * 8 &&
r->entropy_count < r->poolinfo.POOLBITS) {
- int cbytes = min(r->poolinfo.POOLBYTES - r->entropy_count/8,
+ int cbytes = min(r->poolinfo.poolbytes - r->entropy_count/8,
sizeof(tmp));

DEBUG_ENT("xfer %d from primary to %s (have %d, need %d)\n",
@@ -1274,7 +1272,8 @@
* extracting entropy from the secondary pool, and can refill from the
* primary pool if needed.
*
- * Note: extract_entropy() assumes that .poolwords is a multiple of 16 words.
+ * Note: extract_entropy() assumes that .poolbytes is a multiple of
+ * 16 32-bit words (64 bytes).
*/
static ssize_t extract_entropy(struct entropy_store *r, void * buf,
size_t nbytes, int flags)
@@ -1337,7 +1336,7 @@
* attempts to find previous ouputs), unless the hash
* function can be inverted.
*/
- for (i = 0, x = 0; i < r->poolinfo.poolwords; i += 16, x+=2) {
+ for (i = 0, x = 0; i < r->poolinfo.poolbytes; i += 64, x+=2) {
HASH_TRANSFORM(tmp, r->pool+i);
add_entropy(r, &tmp[x%HASH_BUFFER_SIZE], 4);
}
@@ -1635,13 +1634,14 @@
ent_count = random_state->entropy_count;
if (put_user(ent_count, p++) ||
get_user(size, p) ||
- put_user(random_state->poolinfo.poolwords, p++))
+ put_user(random_state->poolinfo.poolbytes / 4, p++))
return -EFAULT;
+ size *= 4;
if (size < 0)
return -EINVAL;
- if (size > random_state->poolinfo.poolwords)
- size = random_state->poolinfo.poolwords;
- if (copy_to_user(p, random_state->pool, size * 4))
+ if (size > random_state->poolinfo.poolbytes)
+ size = random_state->poolinfo.poolbytes;
+ if (copy_to_user(p, random_state->pool, size))
return -EFAULT;
return 0;
case RNDADDENTROPY:
@@ -1743,7 +1743,7 @@
return ret;

add_entropy(new_store, random_state->pool,
- random_state->poolinfo.POOLBYTES);
+ random_state->poolinfo.poolbytes);
credit_entropy_store(new_store, random_state->entropy_count);

sysctl_init_random(new_store);
@@ -1758,11 +1758,11 @@
{
int ret;

- sysctl_poolsize = random_state->poolinfo.POOLBYTES;
+ sysctl_poolsize = random_state->poolinfo.poolbytes;

ret = proc_dointvec(table, write, filp, buffer, lenp);
if (ret || !write ||
- (sysctl_poolsize == random_state->poolinfo.POOLBYTES))
+ (sysctl_poolsize == random_state->poolinfo.poolbytes))
return ret;

return change_poolsize(sysctl_poolsize);
@@ -1774,7 +1774,7 @@
{
int len;

- sysctl_poolsize = random_state->poolinfo.POOLBYTES;
+ sysctl_poolsize = random_state->poolinfo.poolbytes;

/*
* We only handle the write case, since the read case gets
@@ -1789,7 +1789,7 @@
return -EFAULT;
}

- if (sysctl_poolsize != random_state->poolinfo.POOLBYTES)
+ if (sysctl_poolsize != random_state->poolinfo.poolbytes)
return change_poolsize(sysctl_poolsize);

return 0;
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-10-29 16:16:22

by Horst H. von Brand

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

Andreas Dilger <[email protected]> said:

[...]

> (*) I don't know enough about the hash functions to know how to add a
> few odd bytes into the store in a useful and safe way. We don't
> really want to discard them either - think if a user-space random
> daemon on an otherwise entropy-free system only writes one byte at
> a time...

I'm no expert either, but padding with anything (zeroes?) to get the right
length should be safe, no?
--
Dr. Horst H. von Brand Usuario #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

2001-10-29 16:54:52

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Mon, 29 Oct 2001, Horst von Brand wrote:

> Andreas Dilger <[email protected]> said:
>
> [...]
>
> > (*) I don't know enough about the hash functions to know how to add a
> > few odd bytes into the store in a useful and safe way. We don't
> > really want to discard them either - think if a user-space random
> > daemon on an otherwise entropy-free system only writes one byte at
> > a time...
>
> I'm no expert either, but padding with anything (zeroes?) to get the right
> length should be safe, no?

No. A 4-byte accumulator is the right answer. We have to be careful here
though - the actual entropy might be in the partial words, we have to
account for it conservatively.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2001-10-29 23:39:30

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 29, 2001 10:58 -0600, Oliver Xymoron wrote:
> > > (*) I don't know enough about the hash functions to know how to add a
> > > few odd bytes into the store in a useful and safe way. We don't
> > > really want to discard them either - think if a user-space random
> > > daemon on an otherwise entropy-free system only writes one byte at
> > > a time...
> >
> > I'm no expert either, but padding with anything (zeroes?) to get the right
> > length should be safe, no?
>
> No. A 4-byte accumulator is the right answer. We have to be careful here
> though - the actual entropy might be in the partial words, we have to
> account for it conservatively.

In a large majority of the cases, there are only full-word entropy additions.
The only time we need to deal with sub-word additions is from random_write()
and from the equivalent ioctl. It also appears that we do this when filling
the secondary pool, but that is OK because we periodically dump far more
entropy into the secondary pool than we could possibly lose through rounding
errors.

Having an accumulator would only handle a rarely-used corner case. We
could just as easily fix any user-space entropy daemon to write at least
4 bytes at a time. Alternately, we could "pad" with enough bytes from
the random pool, and not accumulate at all.

In any case, this is in the noise compared to not using the input data
at all (which I fixed in the other patch).

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-10-30 00:19:31

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Mon, 29 Oct 2001, Andreas Dilger wrote:

> On Oct 29, 2001 10:58 -0600, Oliver Xymoron wrote:
> > > > (*) I don't know enough about the hash functions to know how to add a
> > > > few odd bytes into the store in a useful and safe way. We don't
> > > > really want to discard them either - think if a user-space random
> > > > daemon on an otherwise entropy-free system only writes one byte at
> > > > a time...
> > >
> > > I'm no expert either, but padding with anything (zeroes?) to get the right
> > > length should be safe, no?
> >
> > No. A 4-byte accumulator is the right answer. We have to be careful here
> > though - the actual entropy might be in the partial words, we have to
> > account for it conservatively.
>
> In a large majority of the cases, there are only full-word entropy additions.
> The only time we need to deal with sub-word additions is from random_write()
> and from the equivalent ioctl. It also appears that we do this when filling
> the secondary pool, but that is OK because we periodically dump far more
> entropy into the secondary pool than we could possibly lose through rounding
> errors.

They're not rounding errors per se though. This is what I mean by
conservative accounting. If you have three extra bytes, you must assume
that they're worth a full 24 bits of entropy. Throwing them away is fairly
expensive.

> Having an accumulator would only handle a rarely-used corner case. We
> could just as easily fix any user-space entropy daemon to write at least
> 4 bytes at a time. Alternately, we could "pad" with enough bytes from
> the random pool, and not accumulate at all.

Padding -from the pool- is acceptable (and simple enough a slow path to
add to the low-level function). Padding with constants is bad and padding
with zeros tends to be really bad.

> In any case, this is in the noise compared to not using the input data
> at all (which I fixed in the other patch).

Any of this made it into recent kernels yet? Backport to 2.2 might be a
good idea too..

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."

2001-10-30 03:50:11

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 29, 2001 18:23 -0600, Oliver Xymoron wrote:
> On Mon, 29 Oct 2001, Andreas Dilger wrote:
> > Having an accumulator would only handle a rarely-used corner case. We
> > could just as easily fix any user-space entropy daemon to write at least
> > 4 bytes at a time. Alternately, we could "pad" with enough bytes from
> > the random pool, and not accumulate at all.
>
> Padding -from the pool- is acceptable (and simple enough a slow path to
> add to the low-level function). Padding with constants is bad and padding
> with zeros tends to be really bad.

I would rather fix this at the caller and not in the low-level code, because
a large majority of the callers (esp. on the fast path) only use full word
inputs. Even for the random_write() case, we would only need this at most
a single time per write, instead of once per "hunk" (=64 bytes). What I
have done is to accumulate bytes in random_write() until we get a full word.

> > In any case, this is in the noise compared to not using the input data
> > at all (which I fixed in the other patch).
>
> Any of this made it into recent kernels yet? Backport to 2.2 might be a
> good idea too..

Well, I just saw that the "in++" and "nwords * 4" patches went into -pre4.
These are the only real non-cosmetic parts of what has been sent. The
other patches were not officially submitted to Linus yet (using bytes
as parameters, and removing poolwords from the struct). I have reverted
those patches in my tree, and gone back to using words as units for
add_entropy(), since it doesn't make sense to take bytes as a parameter
and then require a multiple of 4 bytes for input sizes.

The following patch changes random_write() to accumulate partial words.
It should apply against any recent kernel, and assumes the two "byte"
patches I previously sent out are not applied. Haven't had a chance
to test it yet, but it compiles.

Cheers, Andreas
===========================================================================
--- linux.orig/drivers/char/random.c Thu Oct 25 03:04:34 2001
+++ linux/drivers/char/random.c Mon Oct 29 20:31:42 2001
@@ -1576,34 +1576,56 @@
}

static ssize_t
-random_write(struct file * file, const char * buffer,
- size_t count, loff_t *ppos)
+random_write(struct file *file, const char *buffer, size_t count, loff_t *ppos)
{
+ static char save[4];
+ static int nsave = 0;
int ret = 0;
- size_t bytes;
- __u32 buf[16];
- const char *p = buffer;
- size_t c = count;
+ const __u32 *p = (__u32 *)buffer;
+ size_t w = count / 4;
+ int tail = count & 3;
+ int extra = 0;
+
+ while (w > 0) {
+ size_t words;
+ __u32 buf[16];

- while (c > 0) {
- bytes = MIN(c, sizeof(buf));
+ words = min(w, sizeof(buf) / 4);

- bytes -= copy_from_user(&buf, p, bytes);
- if (!bytes) {
+ if (copy_from_user(buf, p, words * 4)) {
ret = -EFAULT;
- break;
+ goto out;
}
- c -= bytes;
- p += bytes;
+ w -= words;
+ p += words;

- add_entropy_words(random_state, buf, (bytes + 3) / 4);
+ add_entropy_words(random_state, p, words);
+ }
+
+ /* Accumulate partial words until we get a full word for addition */
+ while (tail) {
+ int bytes = min(tail, 4 - nsave);
+
+ if (copy_from_user(save + nsave, (char *)p + extra, bytes)) {
+ ret = -EFAULT;
+ goto out;
+ }
+ nsave += bytes;
+ extra += bytes;
+ tail -= bytes;
+
+ if (nsave >= 4) {
+ add_entropy_words(random_state, (__u32 *)save, 1);
+ nsave = 0;
+ }
}
- if (p == buffer) {
+out:
+ if (p == (__u32 *)buffer && extra == 0) {
return (ssize_t)ret;
} else {
file->f_dentry->d_inode->i_mtime = CURRENT_TIME;
mark_inode_dirty(file->f_dentry->d_inode);
- return (ssize_t)(p - buffer);
+ return (ssize_t)((p - (__u32 *)buffer) * 4 + extra);
}
}

--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-10-30 04:49:30

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 29, 2001 18:23 -0600, Oliver Xymoron wrote:
> Backport to 2.2 might be a good idea too..

Note that the 2.2 kernel code is (AFAIK) quite different from the 2.4
kernel code for drivers/char/random.c. The serious bug - not incrementing
"in" was introduced in 2.3.16. The other real bug - trunctating the pool
entropy to the number of words in the pool, does not exist in 2.2. It
correctly compares it agains POOLBITS.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-10-31 05:38:01

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Mon, Oct 29, 2001 at 08:50:05PM -0700, Andreas Dilger wrote:
>
> Well, I just saw that the "in++" and "nwords * 4" patches went into -pre4.
> These are the only real non-cosmetic parts of what has been sent. The
> other patches were not officially submitted to Linus yet (using bytes
> as parameters, and removing poolwords from the struct). I have reverted
> those patches in my tree, and gone back to using words as units for
> add_entropy(), since it doesn't make sense to take bytes as a parameter
> and then require a multiple of 4 bytes for input sizes.

Oops, ouch. Thanks for catching the in++ bug; I can't believe that
remained unnoticed for so long.

Could you send me a pointer to the proposed change to remove poolwords
from the struct? I'm not sure why that wwould be a good thing at all.

Also, the reason why add_entropy_words did stuff in multiple of 4
bytes was simply because it made the code much more efficient.
Zero-padding isn't a problem, since it's perfectly safe to mix in zero
bytes into the pool. It is an issue for the entropy credit
calculation, but that's completely separate from add_entropy_words()....

- Ted

2001-10-31 06:20:20

by Andreas Dilger

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Oct 30, 2001 11:07 -0500, Theodore Tso wrote:
> On Mon, Oct 29, 2001 at 08:50:05PM -0700, Andreas Dilger wrote:
> > Well, I just saw that the "in++" and "nwords * 4" patches went into -pre4.
> > These are the only real non-cosmetic parts of what has been sent. The
> > other patches were not officially submitted to Linus yet (using bytes
> > as parameters, and removing poolwords from the struct). I have reverted
> > those patches in my tree, and gone back to using words as units for
> > add_entropy(), since it doesn't make sense to take bytes as a parameter
> > and then require a multiple of 4 bytes for input sizes.
>
> Oops, ouch. Thanks for catching the in++ bug; I can't believe that
> remained unnoticed for so long.
>
> Could you send me a pointer to the proposed change to remove poolwords
> from the struct? I'm not sure why that wwould be a good thing at all.

No point - I reverted them and will not use them. The goal was to
simplify the "unit" issue in random.c becuase the previous bug about
truncating "entropy_count" to "poolwords" instead of "poolbits" was
caused specifically by a misunderstanding about the unit sizes of
function parameters.

add_entropy_words -> words == 4 bytes
credit_entropy_store -> bits
extract_entropy -> bytes

What I have done instead of changing "poolwords" to "poolbytes"
everywhere, is rename "credit_entropy_store" to "credit_entropy_bits"
and rename the struct field "entropy_count" to "entropy_bits".

> Also, the reason why add_entropy_words did stuff in multiple of 4
> bytes was simply because it made the code much more efficient.

I didn't disagree with that at all. All that I tried (and ended up not
using) was to change all the word units to be bytes. The algorithm was
not changed. In the end, one of the reasons to not use it was that it
introduced a few cases where we scaled a variable value instead of a
constant value, and introduced a small run-time overhead. It also didn't
make sense to pass a byte value to add_entropy_words() and then restrict
it to being a multiple of 4. Ugh.

> Zero-padding isn't a problem, since it's perfectly safe to mix in zero
> bytes into the pool.

Well, Oliver tends to disagree. I don't know enough either way. It _does_
seem bad that if you wrote continually wrote 1-byte values into /dev/random
and padded out the end of the word that it would be bad. However, in the
end this is no worse than cat /dev/zero > /dev/random, which is also allowed.

The only place were we submit non-word-sized values to add_entropy_words()
is in random_write(), and I fixed that to accumulate bytes until we
have a full word to mix in. One possible "optimization" that could
be done in that function is instead of copy_from_user(), we could use
verify_area() instead, directly add any fully-aligned words into the pool,
and then "accumulate" any misaligned or partial words. This saves us a
memcpy() of all the data being written into /dev/random, but whether the
performance improvement is noticable on this non-fast-path is important
is questionable. Another question is whether "rounding" a char pointer
to a multiple of 4 is legal/portable. Maybe I'll put in a comment and
wait until someone else wants to do it...

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-10-31 14:39:30

by Oliver Xymoron

[permalink] [raw]
Subject: Re: [PATCH] random.c bugfix

On Tue, 30 Oct 2001, Andreas Dilger wrote:

> On Oct 30, 2001 11:07 -0500, Theodore Tso wrote:
> > Zero-padding isn't a problem, since it's perfectly safe to mix in zero
> > bytes into the pool.
>
> Well, Oliver tends to disagree. I don't know enough either way. It _does_
> seem bad that if you wrote continually wrote 1-byte values into /dev/random
> and padded out the end of the word that it would be bad. However, in the
> end this is no worse than cat /dev/zero > /dev/random, which is also allowed.

That was just conservatism on my part. There are a large number of hashes
and ciphers for which zero inputs are suboptimal so my gut feel was that
it was a bad idea. That was silly of me, given the way the mixing works.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."