2022-04-29 06:38:14

by Willy Tarreau

[permalink] [raw]
Subject: [PATCH v2 net 3/7] tcp: resalt the secret every 10 seconds

From: Eric Dumazet <[email protected]>

In order to limit the ability for an observer to recognize the source
ports sequence used to contact a set of destinations, we should
periodically shuffle the secret. 10 seconds looks effective enough
without causing particular issues.

Cc: Moshe Kol <[email protected]>
Cc: Yossi Gilad <[email protected]>
Cc: Amit Klein <[email protected]>
Tested-by: Willy Tarreau <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
---
net/core/secure_seq.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 55aa5cc258e3..2c76aab20403 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -22,6 +22,8 @@
static siphash_aligned_key_t net_secret;
static siphash_aligned_key_t ts_secret;

+#define EPHEMERAL_PORT_SHUFFLE_PERIOD (10 * HZ)
+
static __always_inline void net_secret_init(void)
{
net_get_random_once(&net_secret, sizeof(net_secret));
@@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
struct in6_addr saddr;
struct in6_addr daddr;
__be16 dport;
+ unsigned int timeseed;
} __aligned(SIPHASH_ALIGNMENT) combined = {
.saddr = *(struct in6_addr *)saddr,
.daddr = *(struct in6_addr *)daddr,
- .dport = dport
+ .dport = dport,
+ .timeseed = jiffies / EPHEMERAL_PORT_SHUFFLE_PERIOD,
};
net_secret_init();
return siphash(&combined, offsetofend(typeof(combined), dport),
@@ -145,8 +149,10 @@ EXPORT_SYMBOL_GPL(secure_tcp_seq);
u64 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
{
net_secret_init();
- return siphash_3u32((__force u32)saddr, (__force u32)daddr,
- (__force u16)dport, &net_secret);
+ return siphash_4u32((__force u32)saddr, (__force u32)daddr,
+ (__force u16)dport,
+ jiffies / EPHEMERAL_PORT_SHUFFLE_PERIOD,
+ &net_secret);
}
EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
#endif
--
2.17.5


2022-04-29 15:48:43

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [PATCH v2 net 3/7] tcp: resalt the secret every 10 seconds

Hi Eric,

On Thu, Apr 28, 2022 at 02:39:57PM +0200, Willy Tarreau wrote:
> From: Eric Dumazet <[email protected]>
>
> In order to limit the ability for an observer to recognize the source
> ports sequence used to contact a set of destinations, we should
> periodically shuffle the secret. 10 seconds looks effective enough
Nit: "periodically re-salt the input".
> without causing particular issues.

Just FYI, moving from siphash_3u32 to siphash_4u32 is not free, as it
bumps us up from siphash_3u32 to siphash_2u64, which does two more
siphash rounds. Maybe this doesn't matter much, but just FYI.

I wonder, though, about your "10 seconds looks effective enough without
causing particular issues." I surmise from that sentence that a lower
value might cause particular issues, but that you found 10 seconds to be
okay in practice. Fine. But what happens if one caller hits this at
second 9 and the next caller hits it at second 0? In that case, the
interval might have been 1 second, not 10. In other words, if you need
a certain minimum quantization for this to not cause "particular
issues", it might not work the way you wanted it to.

Additionally, that problem aside, if you round EPHEMERAL_PORT_SHUFFLE_PERIOD
to the nearest power of two, you can turn the expensive division into a
bit shift right.

Regards,
Jason

2022-04-29 21:01:55

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH v2 net 3/7] tcp: resalt the secret every 10 seconds

Hi Jason,

On Fri, Apr 29, 2022 at 04:37:12PM +0200, Jason A. Donenfeld wrote:
> Hi Eric,
>
> On Thu, Apr 28, 2022 at 02:39:57PM +0200, Willy Tarreau wrote:
> > From: Eric Dumazet <[email protected]>
> >
> > In order to limit the ability for an observer to recognize the source
> > ports sequence used to contact a set of destinations, we should
> > periodically shuffle the secret. 10 seconds looks effective enough
> Nit: "periodically re-salt the input".
> > without causing particular issues.
>
> Just FYI, moving from siphash_3u32 to siphash_4u32 is not free, as it
> bumps us up from siphash_3u32 to siphash_2u64, which does two more
> siphash rounds. Maybe this doesn't matter much, but just FYI.
>
> I wonder, though, about your "10 seconds looks effective enough without
> causing particular issues." I surmise from that sentence that a lower
> value might cause particular issues, but that you found 10 seconds to be
> okay in practice. Fine. But what happens if one caller hits this at
> second 9 and the next caller hits it at second 0? In that case, the
> interval might have been 1 second, not 10. In other words, if you need
> a certain minimum quantization for this to not cause "particular
> issues", it might not work the way you wanted it to.

I can partially respond on this one because as you've probably seen at
first I had some reserves on this one, which vanished.

The problem caused by reusing a source port too early almost essentially
concerns clients closing first, either with a regular shutdown() where
a FIN is sent first, or by forcing an RST by disabling lingering / breaking
the association with connect(AF_UNSPEC). In both cases, the risk is that
the packet supposed to finish to close the connection (e.g. last ACK that
turns the socket to TIME_WAIT or the RST) is lost, releasing the local
entry too early and making the client send a SYN from a port that's still
seen as in use on the other side.

For local TIME_WAIT, there's no issue in practice because 1) the port is
not yet usable, and 2) clients closing first cannot sustain high connection
rates anyway, they're limited to #ports/60s.

For those dealing with high connection rates (e.g. proxies) which seldom
have to abort connections, they have no other choice but closing using an
RST to avoid blocking the port in TIME_WAIT. These ones already face the
risk that the RST is lost on the path and that a subsequent SYN from the
same port is delivered to a connection that's in ESTABLISHED or FIN_WAIT
state anyway. This *does* happen due to losses, albeit at quite a low
rate in general. In such cases, the server returns an out-of-window ACK,
the client sends an RST and a SYN again (which roughly looks like a
retransmit as the RST). Such cases may occasionally fail through firewalls
if the RST is lost between the firewall and the server. The firewall will
then block the response ACK from the server and the client will have to
time out on the SYN and try again. This is the current situation even
without the patch (hence why closing with RST must be avoided as much as
possible).

The change of the salting here can slightly increase the risk that this
happens during the transition from the 9th to the 10th second as you said,
and only for the case above. However this will only be a problem if the
SYN arrives before the RST or if the RST is lost. The second case already
exists and the difference is that before the patch, we could have hoped
that the other side would finally have closed on timeout before sending a
new SYN (which is already not granted, e.g. ESTABLISHED) or unlikely to
happen above #ports/60s connections per second (e.g. FIN_WAIT with a
longer timeout). The first case (SYN arrives before RST) may happen on
randomly load-balanced links (where TCP already doesn't work well), and
while this does increase the risk, the risk only concerns just a few SYN
packets every 10s which each have a chance over #port of reusing the same
port as the last one, or ~#port/N of reusing the same port as one of the
last N ones for small values of N. This remains extremely low and is not
higher than the risk of losing the RST before the patch anyway.

As such all these risks are very low but they increase as the timer
decreases. This would work well at 60s and could probably still work well
enough at 1s. But 10s is enough to make Amit's attack impractical, so
better not risk to degrade anything. And the tests I ran showed only 2
failed connections after 13 billion closed by the client, which is
perfectly within what I'm used to observe in such cases.

> Additionally, that problem aside, if you round EPHEMERAL_PORT_SHUFFLE_PERIOD
> to the nearest power of two, you can turn the expensive division into a
> bit shift right.

I've been wondering if it would help to use 8 or 16 here, but that doesn't
change anything, it's a constant so it's converted to a reciprocal divide
by the compiler, i.e. just a 32-bit multiply and a shift.

Thanks!
Willy

2022-05-03 00:34:58

by Jason A. Donenfeld

[permalink] [raw]
Subject: Re: [PATCH v2 net 3/7] tcp: resalt the secret every 10 seconds

On Thu, Apr 28, 2022 at 2:40 PM Willy Tarreau <[email protected]> wrote:
> @@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
> struct in6_addr saddr;
> struct in6_addr daddr;
> __be16 dport;
> + unsigned int timeseed;

Also, does the struct packing (or lack thereof) lead to problems here?
Uninitialized bytes might not make a stable hash.

2022-05-03 00:58:00

by Willy Tarreau

[permalink] [raw]
Subject: Re: [PATCH v2 net 3/7] tcp: resalt the secret every 10 seconds

On Fri, Apr 29, 2022 at 04:48:52PM +0200, Jason A. Donenfeld wrote:
> On Thu, Apr 28, 2022 at 2:40 PM Willy Tarreau <[email protected]> wrote:
> > @@ -101,10 +103,12 @@ u64 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
> > struct in6_addr saddr;
> > struct in6_addr daddr;
> > __be16 dport;
> > + unsigned int timeseed;
>
> Also, does the struct packing (or lack thereof) lead to problems here?
> Uninitialized bytes might not make a stable hash.

Hmmm, I didn't notice, and I think you're right indeed. I did test in IPv6
without noticing any problem but it doesn't mean that the hash is perfectly
stable.

I'll send an update for this one, thank you!
Willy