From: Jean-Philippe Aumasson
Subject: Re: [PATCH v5 1/4] siphash: add cryptographically secure PRF
Date: Mon, 19 Dec 2016 20:18:07 +0000
References: <063D6719AE5E284EB5DD2968C1650D6DB0242669@AcuExch.aculab.com> <20161219181040.25441.qmail@ns.sciencehorizons.net>
In-Reply-To: <20161219181040.25441.qmail@ns.sciencehorizons.net>
Reply-To: kernel-hardening@lists.openwall.com
To: George Spelvin, David.Laight@aculab.com, tom@herbertland.com
Cc: ak@linux.intel.com, davem@davemloft.net, djb@cr.yp.to, ebiggers3@gmail.com, hannes@stressinduktion.org, Jason@zx2c4.com, kernel-hardening@lists.openwall.com, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org, luto@amacapital.net, netdev@vger.kernel.org, torvalds@linux-foundation.org, tytso@mit.edu, vegard.nossum@gmail.com
List-Id: linux-crypto.vger.kernel.org

FTR, I agree that SipHash may benefit from security/performance
optimizations (there are always optimizations), but I'm not sure that
this is the right time/place to discuss it. Cryptanalysis is hard, and
any major change should be backed by a rigorous security analysis and/or
security proof. For example, the redundancy in the initial state prevents
trivial weaknesses that would occur if four key words were used
(something I've seen proposed in this thread).

In the paper at http://aumasson.jp/siphash/siphash.pdf we explain the
rationale behind most of our design choices. The cryptanalysis paper at
https://eprint.iacr.org/2014/722.pdf analyzes difference propagation and
presents attacks on reduced versions of SipHash.

On Mon, Dec 19, 2016 at 7:10 PM George Spelvin <linux@sciencehorizons.net> wrote:

> David Laight wrote:
> > From: George Spelvin
> ...
> >> uint32_t
> >> hsiphash24(char const *in, size_t len, uint32_t const key[2])
> >> {
> >>       uint32_t c = key[0];
> >>       uint32_t d = key[1];
> >>       uint32_t a =     0x6c796765 ^ 0x736f6d65;
> >>       uint32_t b = d ^ 0x74656462 ^ 0x646f7261;
>
> > I've not looked closely, but is that (in some sense) duplicating
> > the key length?
> > So you could set a = key[2] and b = key[3] and still have a
> > working hash - albeit not exactly the one specified.
>
> That's tempting, but not necessarily effective.  (A similar unsuccessful
> idea can be found in discussions of "DES with independent round keys".
> Or see the design discussion of Salsa20 and the constants in its input.)
>
> You can increase the key size, but that might not increase the *security*
> any.
>
> The big issue is that there are a *lot* of square root attacks in
> cryptanalysis.  Because SipHash's state is twice the size of the key,
> such an attack will have the same complexity as key exhaustion and need
> not be considered.  To make a stronger security claim, you need to start
> working through them all and show that they don't apply.
>
> For SipHash in particular, an important property is asymmetry of the
> internal state.  That's what duplicating the key with XORs guarantees.
> If the two halves of the state end up identical, the mixing is much
> weaker.
>
> Now the probability of ending up in a "mirror state" is the square
> root of the state size (1 in 2^64 for HalfSipHash's 128-bit state),
> which is the same probability as guessing a key, so it's not a
> problem that has to be considered when making a 64-bit security claim.
>
> But if you want a higher security level, you have to think about
> what can happen.
>
> That said, I have been thinking very hard about
>
>         a = c ^ 0x48536970;     /* 'HSip' */
>         d = key[2];
>
> By guaranteeing that a and c are different, we get the desired
> asymmetry, and the XOR of b and d is determined by the first word of
> the message anyway, so this isn't weakening anything.
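To make the proposal quoted above concrete, here is a minimal sketch of what a 96-bit-key initialization might look like. Only the two assignments `a = c ^ 0x48536970` and `d = key[2]` come from the quoted mail. The struct, the function names, and the choice to keep b derived from key[1] are assumptions made purely for illustration, and none of this is an analyzed or endorsed design.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four 32-bit state words, named a..d
 * as in the quoted snippet. */
struct hsip_state {
	uint32_t a, b, c, d;
};

/* Sketch of the proposed 96-bit-key setup.  The lines for a and d are
 * taken from the mail; the line for b is an assumption (the mail does
 * not say how b would be set in this variant). */
static struct hsip_state hsip96_init(const uint32_t key[3])
{
	struct hsip_state s;

	s.c = key[0];
	s.a = s.c ^ 0x48536970u;	/* 'HSip': a differs from c for every key */
	s.b = key[1] ^ 0x74656462u ^ 0x646f7261u;	/* assumption */
	s.d = key[2];
	return s;
}

/* Usage: whatever the key, a and c differ by a fixed nonzero constant,
 * so the two cannot be equal in the initial state. */
static void hsip96_check(void)
{
	const uint32_t key[3] = { 0xdeadbeefu, 0x01234567u, 0x89abcdefu };
	struct hsip_state s = hsip96_init(key);

	assert(s.a != s.c);
	assert((s.a ^ s.c) == 0x48536970u);
	assert(s.d == key[2]);
}
```

This only illustrates the asymmetry argument for the initial state. It says nothing about how the state evolves under the round function, which is where the actual cryptanalysis would have to happen.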
>
> 96 bits is far beyond the reach of any brute-force attack, and if a
> more sophisticated 64-bit attack exists, it's at least out of the reach
> of the script kiddies, and will almost certainly have a non-negligible
> constant factor and more limits on when it can be applied.
>
> > Is it worth using the 32bit hash for IP addresses on 64bit systems that
> > can't do misaligned accesses?
>
> Not a good idea.  To hash 64 bits of input:
>
> * Full SipHash has to do two loads, a shift, an or, and two rounds of
>   mixing.
> * HalfSipHash has to do a load, two rounds, another load, and two more
>   rounds.
>
> In other words, in addition to being less secure, it's half the speed.
>
> Also, what systems are you thinking about?  x86, ARMv8, PowerPC, and
> S390 (and ia64, if anyone cares) all handle unaligned loads.  MIPS has
> efficient support.  Alpha and HPPA are for retrocomputing fans, not
> people who care about performance.
>
> So you're down to SPARC.  Which conveniently has the same maintainer as
> the networking code, so I figure DaveM can take care of that himself. :-)
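The "two loads, a shift, an or" step from the comparison above can be sketched as follows, next to a portable unaligned load for targets that handle one. The function names are hypothetical and not taken from the patch series. This is only meant to show the shape of the two approaches, not how the kernel code is written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two 32-bit loads, a shift, and an or: builds one 64-bit word from two
 * 32-bit words, treating p[0] as the low half. */
static uint64_t load64_from_two32(const uint32_t *p)
{
	uint64_t lo = p[0];
	uint64_t hi = p[1];

	return lo | (hi << 32);
}

/* Portable unaligned 64-bit load in host byte order.  memcpy avoids
 * undefined behaviour on misaligned pointers and lets the compiler pick
 * a single load where the target allows it. */
static uint64_t load64_unaligned(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Usage: combine two words, then round-trip a value through an
 * odd (misaligned) offset in a byte buffer. */
static void load64_check(void)
{
	const uint32_t words[2] = { 0x11223344u, 0x55667788u };
	const uint64_t value = 0x0123456789abcdefULL;
	unsigned char buf[sizeof(value) + 1];

	assert(load64_from_two32(words) == 0x5566778811223344ULL);

	memcpy(buf + 1, &value, sizeof(value));
	assert(load64_unaligned(buf + 1) == value);
}
```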