I have a PRNG that I want to use within the Linux random(4) driver. It
looks remarkably strong to me, but analysis from others is needed.
(The spinlocks here are a kluge for out-of-kernel testing.)
/*************************************************************************
* use xtea to create a pseudorandom 64-bit output
*
* xtea is fast & uses little storage
* See https://en.wikipedia.org/wiki/XTEA and papers it links
*
* tea is the original block cipher, Tiny Encryption Algorithm
* xtea is an improved version preventing some published attacks
* both are in linux/crypto/tea.c
*************************************************************************/
static spinlock_t xtea_lock;
/*
* The initialisations here may not be needed, but they are
* more-or-less free and can do no harm.
* constants taken from SHA-512
*/
static u64 tea_mask = 0x7137449123ef65cd ;
static u64 tea_counter = 0xb5c0fbcfec4d3b2f ;
/*
* 128-bit key
* cipher itself uses 32-bit operations
* but rekeying uses 64-bit
*/
static u64 tea_key64[2] = {0x0fc19dc68b8cd5b5, 0xe9b5dba58189dbbc} ;
static u32 *tea_key = (u32 *) &tea_key64[0] ;
/*
* simplified version of code fron crypto/tea.c
* xtea_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
*
* does not use struct
* does no endianess conversions
* no *src or *dst, encrypt a 64-bit block in place
*/
#define XTEA_ROUNDS 32
#define XTEA_DELTA 0x9e3779b9
static void xtea_block(u64 *x)
{
u32 i, y, z, sum, *p ;
p = (u32 *) x ;
y = p[0] ;
z = p[1] ;
for( i = 0, sum = 0 ; i < XTEA_ROUNDS ; i++ ) {
y += ((z << 4 ^ z >> 5) + z) ^ (sum + tea_key[sum&3]);
sum += XTEA_DELTA;
z += ((y << 4 ^ y >> 5) + y) ^ (sum + tea_key[sum>>11 &3]);
}
p[0] = y ;
p[1] = z ;
}
/*
* For counter mode see RFC 4086 section 6.2.1
* Add a constant instead of just incrementing
* to change more bits
*
* Even and Mansour proved proved a lower bound
* for an XOR-permutation-XOR sequence.
* S. Even, Y. Mansour, Asiacrypt 1991
* A Construction of a Cipher From a Single Pseudorandom Permutation
*
* For an n-bit cipher and D known or chosen plaintexts,
* time T to break it is bounded by DT >= 2^n.
*
* This applies even if the enemy knows the permutation,
* for a block cipher even if he or she knows the key.
* All the proof requires is that the permutation be
* nonlinear.
*
* Here the main calling function changes the key a bit
* on every iteration and updates key, mask and counter
* after TEA_REKEY iterations, so any possible D for
* the same key, mask and counter sequence is small.
*
* With TEA_REKEY = 127 ~= 2^7, for each sequence of
* 127 outputs between rekeyings the enemy needs
* time T > 2^57
*
* Assuming proper keying and that the enemy cannot
* peek into the running kernel, this can be
* considered effectively unbreakable, even if
* xtea itself were found to be weak.
*/
#define COUNTER_DELTA 0x240ca1cc77ac9c65
static u64 xtea_counter()
{
u64 x ;
x = tea_counter ^ tea_mask ;
xtea_block(&x) ;
x ^= tea_mask ;
tea_counter += COUNTER_DELTA ;
return x ;
}
/*
* Interval for full rekey can be moderately long
* because there is incremental key change as well
*/
#define TEA_REKEY 127
static int xtea_iterations = 0 ;
static void get_xtea_long(u64 *out)
{
int a ;
int flag = 0 ;
spin_lock(&xtea_lock) ;
if (xtea_iterations >= TEA_REKEY)
xtea_iterations = 0 ;
if (xtea_iterations == 0)
flag = 1 ;
spin_unlock(&xtea_lock) ;
if (flag)
xtea_rekey() ;
spin_lock(&xtea_lock) ;
a = xtea_iterations & 3 ;
tea_key[a] += rol32(tea_key[(a+1)&3], 5) ;
*out = xtea_counter() ;
xtea_iterations++ ;
spin_unlock(&xtea_lock) ;
}
> On Sep 16, 2021, at 20:18, Sandy Harris <[email protected]> wrote:
>
> I have a PRNG that I want to use within the Linux random(4) driver. It
> looks remarkably strong to me, but analysis from others is needed.
A good block cipher in counter mode makes a pretty-okay PRNG. I say pretty-okay only because I would like my PRNG not to be invertible. Iterated hash functions are better. However, they are slower, and a property you want in a PRNG is that it's fast. I did a system PRNG that was intentionally faster than arc4random() and close to linear-congruential because then there's no excuse for not using it. A mildly evil person would replace both of those with a fast real PRNG. (Mildly evil because if some user knew the internals and was counting on it acting the way the internals specified, they might be disappointed.)
XTEA is an okay block cipher. Not great, okay. Probably good enough for a PRNG. But -- why wouldn't you use AES? An obvious answer is that you don't have it in hardware and see previous 'graph commentary about speed. Note that XTEA is a 64-bit block, so if you compare to AES, take that in mind.
The NIST AES_CTR_DRBG is mostly-okay. It's got a few issues (see <https://eprint.iacr.org/2020/619.pdf>), but you can get around them. I don't like that it takes a simple concept (a good block cipher is a good PRP/PRF) and throws enough scaffolding around it to make it hard to understand. I understand the reasons (they are essentially the same as your rekey protection and some more), it just bugs me.
So why not just use your CPU's AES (assuming you have one), or further wrap XTEA in the DRBG stuff for the added misuse protection? Or even iterate MD5, or, ummm, you knew this was coming, didn't you, use the PRNG mode of Skein-512 (which is about twice as fast as software AES on a 64-bit CPU)? Yeah, I know, driver. So we're back to why not AES? What's the design consideration?
Jon
On Oct 2, 2021, at 5:08 PM, Jon Callas <[email protected]> wrote:
>
>
>> On Sep 16, 2021, at 20:18, Sandy Harris <[email protected]> wrote:
>>
>> I have a PRNG that I want to use within the Linux random(4) driver. It
>> looks remarkably strong to me, but analysis from others is needed.
>
> A good block cipher in counter mode makes a pretty-okay PRNG. I say pretty-okay only because I would like my PRNG not to be invertible. Iterated hash functions are better.
Whatever you use you want to truncate the output, otherwise you won?t get repetitions, which you actually want from a good PRNG.
rg
Jon Callas <[email protected]> wrote:
> > On Sep 16, 2021, at 20:18, Sandy Harris <[email protected]> wrote:
> >
> > I have a PRNG that I want to use within the Linux random(4) driver. It
> > looks remarkably strong to me, but analysis from others is needed.
>
> A good block cipher in counter mode makes a pretty-okay PRNG. I say pretty-okay only because I would like my PRNG not to be invertible.
I'm using counter mode inside an Even-Mansour XOR-permutation-XOR
structure which, among other things, makes it non-invertible.
> Iterated hash functions are better. However, they are slower, and a property you want in a PRNG is that it's fast. I did a system PRNG that was intentionally faster than arc4random() and close to linear-congruential because then there's no excuse for not using it. A mildly evil person would replace both of those with a fast real PRNG. (Mildly evil because if some user knew the internals and was counting on it acting the way the internals specified, they might be disappointed.)
I'm doing this within the Linux random(4) driver which iterates chacha
to generate output. This prng will only generate values for internal
use, like rekeying chacha or dumping data into the input pool. In
fact, if an instruction like Intel RDRAND or a hardware rng exist the
code mostly uses those, only injecting xtea output once in a while to
avoid tructing the other source completely or falling back to xtea if
the other fails.
> XTEA is an okay block cipher. Not great, okay. Probably good enough for a PRNG.
With the Even-Mansour construction it seems good enough to me. Code
includes this comment:
* Even and Mansour proved proved a lower bound
* for an XOR-permutation-XOR sequence.
* S. Even, Y. Mansour, Asiacrypt 1991
* A Construction of a Cipher From a Single Pseudorandom Permutation
*
* For an n-bit cipher and D known or chosen plaintexts,
* time T to break it is bounded by DT >= 2^n.
*
* This applies even if the enemy knows the permutation,
* for a block cipher even if he or she knows the key.
* All the proof requires is that the permutation be
* nonlinear.
*
* The main calling function does a full rekey (update
* key, mask and counter) when xtea_iterations >= 127
* We therefore have D ~= 2^7 and T >= 2^57 to break
* each sequence of 127 outputs between rekeyings.
*
* Assuming proper keying and that the enemy cannot
* peek into the running kernel, this can be
* considered effectively unbreakable, even if
* xtea itself were found to be flawed.
> But -- why wouldn't you use AES? An obvious answer is that you don't have it in hardware ...
I wanted something that would be reasonably fast on anything Linux
runs on & wanted to avoid using kernel memory for the S-box & round
keys.
> On Oct 3, 2021, at 00:04, Sandy Harris <[email protected]> wrote:
>
> I'm using counter mode inside an Even-Mansour XOR-permutation-XOR
> structure which, among other things, makes it non-invertible.
>
And also takes care of other quibbles that were rattling around in my head.
>
> I'm doing this within the Linux random(4) driver which iterates chacha
> to generate output. This prng will only generate values for internal
> use, like rekeying chacha or dumping data into the input pool. In
> fact, if an instruction like Intel RDRAND or a hardware rng exist the
> code mostly uses those, only injecting xtea output once in a while to
> avoid tructing the other source completely or falling back to xtea if
> the other fails.
Then the constraints are even better. If that output isn't directly exposed, it can be a bit sloppier.
>
>> XTEA is an okay block cipher. Not great, okay. Probably good enough for a PRNG.
>
> With the Even-Mansour construction it seems good enough to me. [...]
Yeah, I was thinking it would be direct output. If it's direct output, then a ciphertext-only attack on the cipher is an attack on the PRNG. The way you're using it, it's a cheap way to provide another safety net. Remember, though, that E-M assumes a PRP, and many breaks on ciphers imply that it's not a PRP, and yet you don't care about any of that. Sounds fine to me.
>> But -- why wouldn't you use AES? An obvious answer is that you don't have it in hardware ...
>
> I wanted something that would be reasonably fast on anything Linux
> runs on & wanted to avoid using kernel memory for the S-box & round
> keys.
The speed aspect on low-end CPUs is what I was thinking of. Most importantly, you are doing something inside the whole box, so it doesn't even really need full pseudorandomness.
Overall, sounds fine to me.
Jon
On Fri, Sep 17, 2021 at 8:31 AM Sandy Harris <[email protected]> wrote:
>
> I have a PRNG that I want to use within the Linux random(4) driver. It
> looks remarkably strong to me, but analysis from others is needed.
How do you plan to get this into the Linux community for testing?
In a cluster of machines a PRNG kernel update and verification process
seems tedious and difficult to tidy up after a problem is found or
suspected.
First benchmark it.
If it slows the system down without advantage it will not be welcome
except as an optional service and device.
Spinlocks ... hmmm... now I need to look at how the kernel can ignore
or not need them?
--
T o m M i t c h e l l (on NiftyEgg[.]com )
On Mon, Oct 4, 2021 Tom Mitchell <[email protected]> wrote:
> How do you plan to get this into the Linux community for testing?
> In a cluster of machines a PRNG kernel update and verification process
> seems tedious and difficult to tidy up after a problem is found or
> suspected.
I'll submit it as a kernel patch; there's a procedure in place to test
those before acceptance. I'm doing this RFC version first to try
and be certain I have it right before submitting.
> Spinlocks ... hmmm... now I need to look at how the kernel can ignore
> or not need them?
The spinlocks in my code are rather dubious emulations used
for out-of-kernel testing. My comments include:
* The locking here is completely untested and quite likely
* to be partly wrong. I just define emulations for spin_lock()
* and spin_unlock() and stick them in as reminders of where
* locks are needed.
....
* this will catch some basic blunders
* like taking a lock that the calling code holds
*
* it is definitely not enough for serious lock tests
The current driver uses spinlocks fairly extensively.
There was a thread on the linux crypto list about
getting rid of those locks:
https://www.spinics.net/lists/linux-crypto/msg55255.html
but as far as I know the prposal went nowhere.
Another comment in my code is:
* I add two new locks, one for xtea and and an output lock
* which all output routines take. Contexts are updated
* within that lock, so duplicate outputs are astronomically
* unlikely.
*
* In the rest of the code, I only lock data structures which
* are being written to, never ones that are only read. If
* someone writes while we are reading then read output becomes
* indeterminate, and in most applications that would be a Bad
* Thing. In an rng, though, it is at worst harmless, and
* arguably even a Good Thing.