2021-11-05 01:54:26

by Dmitry Safonov

Subject: [PATCH 0/5] tcp/md5: Generic tcp_sig_pool

3 small fixes for unlikely issues.

The last patch adds tcp_sig_pool, which can be used to manage ahashes
for algorithms besides md5. It reuses the tcp_md5sig_pool code and in my
view is a better alternative to proposal [1], which uses shash for
hashing segments and frags. It also avoids introducing an enum of
supported algorithms, which makes it possible to supply any
crypto-supported hashing algorithm from the socket option syscall (like
struct xfrm_algo does in IPsec), reducing needless kernel code.

[1]: https://lore.kernel.org/all/5245f35901015acc6a41d1da92deb96f3e593b7c.1635784253.git.cdleonard@gmail.com/T/#u
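
To illustrate the uapi this enables (a hypothetical sketch, not part of
this series): the hashing algorithm is selected by its crypto API name,
the way struct xfrm_algo carries alg_name, instead of a kernel-side enum:

    /* Hypothetical sockopt payload: the hash is named by a crypto API
     * string instead of an enum of supported algorithms.
     */
    struct tcp_hash_opt {
            char    alg_name[64];   /* e.g. "hmac(sha1)", "cmac(aes)" */
            __u8    keylen;
            __u8    key[TCP_MD5SIG_MAXKEYLEN];
    };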

Cc: Andy Lutomirski <[email protected]>
Cc: David Ahern <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Francesco Ruggeri <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Leonard Crestez <[email protected]>
Cc: [email protected]
Cc: [email protected]

Dmitry Safonov (5):
tcp/md5: Don't BUG_ON() failed kmemdup()
tcp/md5: Don't leak ahash in OOM
tcp/md5: Alloc tcp_md5sig_pool only in setsockopt()
tcp/md5: Use tcp_md5sig_pool_* naming scheme
tcp/md5: Make more generic tcp_sig_pool

include/net/tcp.h | 23 +++--
net/ipv4/tcp.c | 193 ++++++++++++++++++++++++++++-----------
net/ipv4/tcp_ipv4.c | 45 ++++-----
net/ipv4/tcp_minisocks.c | 5 +-
net/ipv6/tcp_ipv6.c | 43 +++++----
5 files changed, 207 insertions(+), 102 deletions(-)


base-commit: 8a796a1dfca2780321755033a74bca2bbe651680
--
2.33.1


2021-11-05 01:55:13

by Dmitry Safonov

Subject: [PATCH 2/5] tcp/md5: Don't leak ahash in OOM

In the quite unlikely scenario where __tcp_alloc_md5sig_pool() succeeds
in crypto_alloc_ahash() but later fails to allocate a per-cpu request or
scratch area, the ahash will be leaked.
In theory this can happen multiple times under OOM conditions, once for
every setsockopt(TCP_MD5SIG{,_EXT}).

Add a clean-up path to free the ahash.
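
For reference, the object being leaked is the tfm allocated at the top
of __tcp_alloc_md5sig_pool():

    hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
    /* if a later per-cpu scratch kmalloc_node() or
     * ahash_request_alloc() fails, the function returns
     * without crypto_free_ahash(hash)
     */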

Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp.c | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c0856a6af9f5..eb478028b1ea 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4276,15 +4276,13 @@ static void __tcp_alloc_md5sig_pool(void)
GFP_KERNEL,
cpu_to_node(cpu));
if (!scratch)
- return;
+ goto out_free;
per_cpu(tcp_md5sig_pool, cpu).scratch = scratch;
}
- if (per_cpu(tcp_md5sig_pool, cpu).md5_req)
- continue;

req = ahash_request_alloc(hash, GFP_KERNEL);
if (!req)
- return;
+ goto out_free;

ahash_request_set_callback(req, 0, NULL, NULL);

@@ -4295,6 +4293,16 @@ static void __tcp_alloc_md5sig_pool(void)
*/
smp_wmb();
tcp_md5sig_pool_populated = true;
+ return;
+
+out_free:
+ for_each_possible_cpu(cpu) {
+ if (per_cpu(tcp_md5sig_pool, cpu).md5_req == NULL)
+ break;
+ ahash_request_free(per_cpu(tcp_md5sig_pool, cpu).md5_req);
+ per_cpu(tcp_md5sig_pool, cpu).md5_req = NULL;
+ }
+ crypto_free_ahash(hash);
}

bool tcp_alloc_md5sig_pool(void)
--
2.33.1

2021-11-05 01:55:17

by Dmitry Safonov

Subject: [PATCH 3/5] tcp/md5: Alloc tcp_md5sig_pool only in setsockopt()

Besides setsockopt(), tcp_md5_do_add() can be called from
tcp_v4_syn_recv_sock()/tcp_v6_syn_recv_sock().
If it is called from there, tcp_md5_do_lookup() has succeeded, which
means that the md5 static key is enabled. That makes this
tcp_alloc_md5sig_pool() call a nop for any caller except
tcp_v4_parse_md5_keys()/tcp_v6_parse_md5_keys().

tcp_alloc_md5sig_pool() can sleep if tcp_md5sig_pool hasn't been
populated, so if anything changes, tcp_md5_do_add() may start sleeping
in atomic context.
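
For reference, a simplified view of the atomic call path in question:

    tcp_v4_rcv()                       /* softirq, BHs disabled */
      tcp_check_req()
        tcp_v4_syn_recv_sock()
          tcp_md5_do_add()
            tcp_alloc_md5sig_pool()    /* may mutex_lock() + GFP_KERNEL */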

Let's leave the check for tcp_md5sig_pool in tcp_md5_do_add(), but
intentionally call tcp_alloc_md5sig_pool() only from sleepable
setsockopt() syscall context.

Signed-off-by: Dmitry Safonov <[email protected]>
---
net/ipv4/tcp_ipv4.c | 5 ++++-
net/ipv6/tcp_ipv6.c | 3 +++
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 13d868c43284..6a8ff9ab1cbc 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1190,7 +1190,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
if (!key)
return -ENOMEM;
- if (!tcp_alloc_md5sig_pool()) {
+ if (WARN_ON_ONCE(!tcp_md5sig_pool_ready())) {
sock_kfree_s(sk, key, sizeof(*key));
return -ENOMEM;
}
@@ -1294,6 +1294,9 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

+ if (!tcp_alloc_md5sig_pool())
+ return -ENOMEM;
+
return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags,
cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 2cc9b0e53ad1..3af13bd6fed0 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -654,6 +654,9 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

+ if (!tcp_alloc_md5sig_pool())
+ return -ENOMEM;
+
if (ipv6_addr_v4mapped(&sin6->sin6_addr))
return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
AF_INET, prefixlen, l3index, flags,
--
2.33.1

2021-11-05 01:55:19

by Dmitry Safonov

Subject: [PATCH 4/5] tcp/md5: Use tcp_md5sig_pool_* naming scheme

Use common prefixes for operations with (struct tcp_md5sig_pool).

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp.h | 6 +++---
net/ipv4/tcp.c | 18 +++++++++---------
net/ipv4/tcp_ipv4.c | 14 +++++++-------
net/ipv6/tcp_ipv6.c | 14 +++++++-------
4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3e5423a10a74..27eb71dd7ff8 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1671,11 +1671,11 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
#define tcp_twsk_md5_key(twsk) NULL
#endif

-bool tcp_alloc_md5sig_pool(void);
+bool tcp_md5sig_pool_alloc(void);
bool tcp_md5sig_pool_ready(void);

-struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
-static inline void tcp_put_md5sig_pool(void)
+struct tcp_md5sig_pool *tcp_md5sig_pool_get(void);
+static inline void tcp_md5sig_pool_put(void)
{
local_bh_enable();
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index eb478028b1ea..8d8692fc9cd5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4257,7 +4257,7 @@ static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
static DEFINE_MUTEX(tcp_md5sig_mutex);
static bool tcp_md5sig_pool_populated = false;

-static void __tcp_alloc_md5sig_pool(void)
+static void __tcp_md5sig_pool_alloc(void)
{
struct crypto_ahash *hash;
int cpu;
@@ -4289,7 +4289,7 @@ static void __tcp_alloc_md5sig_pool(void)
per_cpu(tcp_md5sig_pool, cpu).md5_req = req;
}
/* before setting tcp_md5sig_pool_populated, we must commit all writes
- * to memory. See smp_rmb() in tcp_get_md5sig_pool()
+ * to memory. See smp_rmb() in tcp_md5sig_pool_get()
*/
smp_wmb();
tcp_md5sig_pool_populated = true;
@@ -4305,13 +4305,13 @@ static void __tcp_alloc_md5sig_pool(void)
crypto_free_ahash(hash);
}

-bool tcp_alloc_md5sig_pool(void)
+bool tcp_md5sig_pool_alloc(void)
{
if (unlikely(!tcp_md5sig_pool_populated)) {
mutex_lock(&tcp_md5sig_mutex);

if (!tcp_md5sig_pool_populated) {
- __tcp_alloc_md5sig_pool();
+ __tcp_md5sig_pool_alloc();
if (tcp_md5sig_pool_populated)
static_branch_inc(&tcp_md5_needed);
}
@@ -4320,7 +4320,7 @@ bool tcp_alloc_md5sig_pool(void)
}
return tcp_md5sig_pool_populated;
}
-EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
+EXPORT_SYMBOL(tcp_md5sig_pool_alloc);

bool tcp_md5sig_pool_ready(void)
{
@@ -4329,25 +4329,25 @@ bool tcp_md5sig_pool_ready(void)
EXPORT_SYMBOL(tcp_md5sig_pool_ready);

/**
- * tcp_get_md5sig_pool - get md5sig_pool for this user
+ * tcp_md5sig_pool_get - get md5sig_pool for this user
*
* We use percpu structure, so if we succeed, we exit with preemption
* and BH disabled, to make sure another thread or softirq handling
* wont try to get same context.
*/
-struct tcp_md5sig_pool *tcp_get_md5sig_pool(void)
+struct tcp_md5sig_pool *tcp_md5sig_pool_get(void)
{
local_bh_disable();

if (tcp_md5sig_pool_populated) {
- /* coupled with smp_wmb() in __tcp_alloc_md5sig_pool() */
+ /* coupled with smp_wmb() in __tcp_md5sig_pool_alloc() */
smp_rmb();
return this_cpu_ptr(&tcp_md5sig_pool);
}
local_bh_enable();
return NULL;
}
-EXPORT_SYMBOL(tcp_get_md5sig_pool);
+EXPORT_SYMBOL(tcp_md5sig_pool_get);

int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
const struct sk_buff *skb, unsigned int header_len)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 6a8ff9ab1cbc..44db9afa17fc 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1294,7 +1294,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

- if (!tcp_alloc_md5sig_pool())
+ if (!tcp_md5sig_pool_alloc())
return -ENOMEM;

return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags,
@@ -1332,7 +1332,7 @@ static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
struct tcp_md5sig_pool *hp;
struct ahash_request *req;

- hp = tcp_get_md5sig_pool();
+ hp = tcp_md5sig_pool_get();
if (!hp)
goto clear_hash_noput;
req = hp->md5_req;
@@ -1347,11 +1347,11 @@ static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
return 0;

clear_hash:
- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
@@ -1375,7 +1375,7 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
daddr = iph->daddr;
}

- hp = tcp_get_md5sig_pool();
+ hp = tcp_md5sig_pool_get();
if (!hp)
goto clear_hash_noput;
req = hp->md5_req;
@@ -1393,11 +1393,11 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
return 0;

clear_hash:
- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3af13bd6fed0..9147f9f69196 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -654,7 +654,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

- if (!tcp_alloc_md5sig_pool())
+ if (!tcp_md5sig_pool_alloc())
return -ENOMEM;

if (ipv6_addr_v4mapped(&sin6->sin6_addr))
@@ -701,7 +701,7 @@ static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
struct tcp_md5sig_pool *hp;
struct ahash_request *req;

- hp = tcp_get_md5sig_pool();
+ hp = tcp_md5sig_pool_get();
if (!hp)
goto clear_hash_noput;
req = hp->md5_req;
@@ -716,11 +716,11 @@ static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
return 0;

clear_hash:
- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
@@ -745,7 +745,7 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
daddr = &ip6h->daddr;
}

- hp = tcp_get_md5sig_pool();
+ hp = tcp_md5sig_pool_get();
if (!hp)
goto clear_hash_noput;
req = hp->md5_req;
@@ -763,11 +763,11 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
return 0;

clear_hash:
- tcp_put_md5sig_pool();
+ tcp_md5sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
--
2.33.1

2021-11-05 01:55:33

by Dmitry Safonov

Subject: [PATCH 1/5] tcp/md5: Don't BUG_ON() failed kmemdup()

static_branch_unlikely(&tcp_md5_needed) is enabled by
tcp_alloc_md5sig_pool(), so as long as the code doesn't change,
tcp_md5sig_pool has already been populated if this code is being
executed.

In case the tcptw->tw_md5_key allocation fails, there is no reason to
crash the kernel: tcp_{v4,v6}_send_ack() will send an unsigned segment
and the connection won't be established. That is bad enough, but in an
OOM situation it is totally acceptable and better than a kernel crash.

Introduce a tcp_md5sig_pool_ready() helper.
tcp_alloc_md5sig_pool() usage is intentionally avoided here, as this is
a fast path and the check is a sanity check rather than the point of
actual pool allocation. That will allow having a generic slow-path
allocator for the TCP crypto pool.
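
For reference, the ordering that makes the old BUG_ON() condition
unreachable (see tcp_alloc_md5sig_pool() in net/ipv4/tcp.c):

    __tcp_alloc_md5sig_pool();                  /* populates the pool */
    if (tcp_md5sig_pool_populated)
            static_branch_inc(&tcp_md5_needed); /* only after success */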

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp.h | 1 +
net/ipv4/tcp.c | 5 +++++
net/ipv4/tcp_minisocks.c | 5 +++--
3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 4da22b41bde6..3e5423a10a74 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1672,6 +1672,7 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
#endif

bool tcp_alloc_md5sig_pool(void);
+bool tcp_md5sig_pool_ready(void);

struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
static inline void tcp_put_md5sig_pool(void)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b7796b4cf0a0..c0856a6af9f5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4314,6 +4314,11 @@ bool tcp_alloc_md5sig_pool(void)
}
EXPORT_SYMBOL(tcp_alloc_md5sig_pool);

+bool tcp_md5sig_pool_ready(void)
+{
+ return tcp_md5sig_pool_populated;
+}
+EXPORT_SYMBOL(tcp_md5sig_pool_ready);

/**
* tcp_get_md5sig_pool - get md5sig_pool for this user
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index cf913a66df17..c99cdb529902 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -293,11 +293,12 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
tcptw->tw_md5_key = NULL;
if (static_branch_unlikely(&tcp_md5_needed)) {
struct tcp_md5sig_key *key;
+ bool err = WARN_ON(!tcp_md5sig_pool_ready());

key = tp->af_specific->md5_lookup(sk, sk);
- if (key) {
+ if (key && !err) {
tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
- BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
+ WARN_ON_ONCE(tcptw->tw_md5_key == NULL);
}
}
} while (0);
--
2.33.1

2021-11-05 01:56:07

by Dmitry Safonov

Subject: [PATCH 5/5] tcp/md5: Make more generic tcp_sig_pool

Convert tcp_md5sig_pool to a more generic tcp_sig_pool.
Now tcp_sig_pool_alloc(const char *alg) can be used to allocate a
per-cpu ahash request for hashing algorithms besides md5.
tcp_sig_pool_get() and tcp_sig_pool_put() should be used to get the
ahash_request and scratch area.

Make tcp_sig_pool reusable for TCP Authentication Option support
(TCP-AO, RFC5925), where RFC5926[1] requires HMAC-SHA1 and AES-128_CMAC
hashing at least.
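
The calling convention on the hashing path then becomes (this mirrors
the tcp_ipv4.c/tcp_ipv6.c hunks below):

    struct tcp_sig_pool hp;

    if (tcp_sig_pool_get(TCP_MD5_SIG_ID, &hp))
            goto clear_hash_noput;      /* pool not allocated/ready */
    /* ... crypto_ahash_init(hp.req), _update(), _final() ... */
    tcp_sig_pool_put();                 /* re-enables BHs */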

Signed-off-by: Dmitry Safonov <[email protected]>
---
include/net/tcp.h | 24 +++--
net/ipv4/tcp.c | 192 +++++++++++++++++++++++++++------------
net/ipv4/tcp_ipv4.c | 46 +++++-----
net/ipv4/tcp_minisocks.c | 4 +-
net/ipv6/tcp_ipv6.c | 44 ++++-----
5 files changed, 197 insertions(+), 113 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 27eb71dd7ff8..2d868ce8736d 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1628,10 +1628,10 @@ union tcp_md5sum_block {
#endif
};

-/* - pool: digest algorithm, hash description and scratch buffer */
-struct tcp_md5sig_pool {
- struct ahash_request *md5_req;
+struct tcp_sig_pool {
+ struct ahash_request *req;
void *scratch;
+#define SCRATCH_SIZE (sizeof(union tcp_md5sum_block) + sizeof(struct tcphdr))
};

/* - functions */
@@ -1671,18 +1671,24 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
#define tcp_twsk_md5_key(twsk) NULL
#endif

-bool tcp_md5sig_pool_alloc(void);
-bool tcp_md5sig_pool_ready(void);

-struct tcp_md5sig_pool *tcp_md5sig_pool_get(void);
-static inline void tcp_md5sig_pool_put(void)
+/* TCP MD5 supports only one hash function, set MD5 id in stone
+ * to avoid needlessly storing the MD5 id in (struct tcp_md5sig_info).
+ */
+#define TCP_MD5_SIG_ID 0
+
+int tcp_sig_pool_alloc(const char *alg);
+bool tcp_sig_pool_ready(unsigned int id);
+
+int tcp_sig_pool_get(unsigned int id, struct tcp_sig_pool *tsp);
+static inline void tcp_sig_pool_put(void)
{
local_bh_enable();
}

-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *, const struct sk_buff *,
+int tcp_md5_hash_skb_data(struct tcp_sig_pool *, const struct sk_buff *,
unsigned int header_len);
-int tcp_md5_hash_key(struct tcp_md5sig_pool *hp,
+int tcp_md5_hash_key(struct tcp_sig_pool *hp,
const struct tcp_md5sig_key *key);

/* From tcp_fastopen.c */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8d8692fc9cd5..8307c7b91d09 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4253,32 +4253,79 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval,
EXPORT_SYMBOL(tcp_getsockopt);

#ifdef CONFIG_TCP_MD5SIG
-static DEFINE_PER_CPU(struct tcp_md5sig_pool, tcp_md5sig_pool);
-static DEFINE_MUTEX(tcp_md5sig_mutex);
-static bool tcp_md5sig_pool_populated = false;
+struct tcp_sig_crypto {
+ struct ahash_request * __percpu *req;
+ const char *alg;
+ bool ready;
+};
+
+#define TCP_SIG_POOL_MAX 8
+static struct tcp_sig_pool_priv_t {
+ struct tcp_sig_crypto cryptos[TCP_SIG_POOL_MAX];
+ unsigned int cryptos_nr;
+} tcp_sig_pool_priv = {
+ .cryptos_nr = 1,
+ .cryptos[TCP_MD5_SIG_ID].alg = "md5",
+};
+
+static DEFINE_PER_CPU(void *, tcp_sig_pool_scratch);
+static DEFINE_MUTEX(tcp_sig_pool_mutex);

-static void __tcp_md5sig_pool_alloc(void)
+static int tcp_sig_pool_alloc_scratch(void)
{
- struct crypto_ahash *hash;
int cpu;

- hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
- if (IS_ERR(hash))
- return;
+ for_each_possible_cpu(cpu) {
+ void *scratch = per_cpu(tcp_sig_pool_scratch, cpu);
+
+ if (scratch)
+ continue;
+
+ scratch = kmalloc_node(SCRATCH_SIZE, GFP_KERNEL,
+ cpu_to_node(cpu));
+ if (!scratch)
+ return -ENOMEM;
+ per_cpu(tcp_sig_pool_scratch, cpu) = scratch;
+ }
+ return 0;
+}
+
+bool tcp_sig_pool_ready(unsigned int id)
+{
+ return tcp_sig_pool_priv.cryptos[id].ready;
+}
+EXPORT_SYMBOL(tcp_sig_pool_ready);
+
+static int __tcp_sig_pool_alloc(struct tcp_sig_crypto *crypto, const char *alg)
+{
+ struct crypto_ahash *hash;
+ bool needs_key;
+ int cpu, ret = -ENOMEM;
+
+ crypto->alg = kstrdup(alg, GFP_KERNEL);
+ if (!crypto->alg)
+ return -ENOMEM;
+
+ crypto->req = alloc_percpu(struct ahash_request *);
+ if (!crypto->req)
+ goto out_free_alg;
+
+ hash = crypto_alloc_ahash(alg, 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(hash)) {
+ ret = PTR_ERR(hash);
+ goto out_free_req;
+ }
+
+ /* If hash has .setkey(), allocate ahash per-cpu, not only request */
+ needs_key = crypto_ahash_get_flags(hash) & CRYPTO_TFM_NEED_KEY;

for_each_possible_cpu(cpu) {
- void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch;
struct ahash_request *req;

- if (!scratch) {
- scratch = kmalloc_node(sizeof(union tcp_md5sum_block) +
- sizeof(struct tcphdr),
- GFP_KERNEL,
- cpu_to_node(cpu));
- if (!scratch)
- goto out_free;
- per_cpu(tcp_md5sig_pool, cpu).scratch = scratch;
- }
+ if (hash == NULL)
+ hash = crypto_alloc_ahash(alg, 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(hash))
+ goto out_free;

req = ahash_request_alloc(hash, GFP_KERNEL);
if (!req)
@@ -4286,75 +4333,106 @@ static void __tcp_md5sig_pool_alloc(void)

ahash_request_set_callback(req, 0, NULL, NULL);

- per_cpu(tcp_md5sig_pool, cpu).md5_req = req;
+ *per_cpu_ptr(crypto->req, cpu) = req;
+
+ if (needs_key)
+ hash = NULL;
}
- /* before setting tcp_md5sig_pool_populated, we must commit all writes
- * to memory. See smp_rmb() in tcp_md5sig_pool_get()
+ /* before setting crypto->ready, we must commit all writes
+ * to memory. See smp_rmb() in tcp_sig_pool_get()
*/
smp_wmb();
- tcp_md5sig_pool_populated = true;
- return;
+ crypto->ready = true;
+ return 0;

out_free:
+ if (!IS_ERR_OR_NULL(hash) && needs_key)
+ crypto_free_ahash(hash);
+
for_each_possible_cpu(cpu) {
- if (per_cpu(tcp_md5sig_pool, cpu).md5_req == NULL)
+ if (*per_cpu_ptr(crypto->req, cpu) == NULL)
break;
- ahash_request_free(per_cpu(tcp_md5sig_pool, cpu).md5_req);
- per_cpu(tcp_md5sig_pool, cpu).md5_req = NULL;
+ hash = crypto_ahash_reqtfm(*per_cpu_ptr(crypto->req, cpu));
+ ahash_request_free(*per_cpu_ptr(crypto->req, cpu));
+ if (needs_key) {
+ crypto_free_ahash(hash);
+ hash = NULL;
+ }
}
- crypto_free_ahash(hash);
+
+ if (hash)
+ crypto_free_ahash(hash);
+out_free_req:
+ free_percpu(crypto->req);
+out_free_alg:
+ kfree(crypto->alg);
+ return ret;
}

-bool tcp_md5sig_pool_alloc(void)
+int tcp_sig_pool_alloc(const char *alg)
{
- if (unlikely(!tcp_md5sig_pool_populated)) {
- mutex_lock(&tcp_md5sig_mutex);
+ int i, err;

- if (!tcp_md5sig_pool_populated) {
- __tcp_md5sig_pool_alloc();
- if (tcp_md5sig_pool_populated)
- static_branch_inc(&tcp_md5_needed);
- }
+ /* slow-path: once per setsockopt() */
+ mutex_lock(&tcp_sig_pool_mutex);

- mutex_unlock(&tcp_md5sig_mutex);
+ err = tcp_sig_pool_alloc_scratch();
+ if (err)
+ goto out;
+
+ for (i = 0; i < tcp_sig_pool_priv.cryptos_nr; i++) {
+ if (!strcmp(tcp_sig_pool_priv.cryptos[i].alg, alg))
+ break;
}
- return tcp_md5sig_pool_populated;
-}
-EXPORT_SYMBOL(tcp_md5sig_pool_alloc);

-bool tcp_md5sig_pool_ready(void)
-{
- return tcp_md5sig_pool_populated;
+ if (i >= TCP_SIG_POOL_MAX) {
+ i = -ENOSPC;
+ goto out;
+ }
+
+ if (tcp_sig_pool_priv.cryptos[i].ready)
+ goto out;
+
+ err = __tcp_sig_pool_alloc(&tcp_sig_pool_priv.cryptos[i], alg);
+ if (!err && i == tcp_sig_pool_priv.cryptos_nr)
+ tcp_sig_pool_priv.cryptos_nr++;
+ if (!err && i == TCP_MD5_SIG_ID)
+ static_branch_inc(&tcp_md5_needed);
+
+out:
+ mutex_unlock(&tcp_sig_pool_mutex);
+ return err ?: i;
}
-EXPORT_SYMBOL(tcp_md5sig_pool_ready);
+EXPORT_SYMBOL(tcp_sig_pool_alloc);

/**
- * tcp_md5sig_pool_get - get md5sig_pool for this user
+ * tcp_sig_pool_get - get tcp_sig_pool for this user
*
* We use percpu structure, so if we succeed, we exit with preemption
* and BH disabled, to make sure another thread or softirq handling
* wont try to get same context.
*/
-struct tcp_md5sig_pool *tcp_md5sig_pool_get(void)
+int tcp_sig_pool_get(unsigned int id, struct tcp_sig_pool *tsp)
{
local_bh_disable();

- if (tcp_md5sig_pool_populated) {
- /* coupled with smp_wmb() in __tcp_md5sig_pool_alloc() */
- smp_rmb();
- return this_cpu_ptr(&tcp_md5sig_pool);
+ if (id >= tcp_sig_pool_priv.cryptos_nr || !tcp_sig_pool_ready(id)) {
+ local_bh_enable();
+ return -EINVAL;
}
- local_bh_enable();
- return NULL;
+
+ /* coupled with smp_wmb() in __tcp_sig_pool_alloc() */
+ smp_rmb();
+ tsp->req = *this_cpu_ptr(tcp_sig_pool_priv.cryptos[id].req);
+ tsp->scratch = *this_cpu_ptr(&tcp_sig_pool_scratch);
+ return 0;
}
-EXPORT_SYMBOL(tcp_md5sig_pool_get);
+EXPORT_SYMBOL(tcp_sig_pool_get);

-int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
+int tcp_md5_hash_skb_data(struct tcp_sig_pool *hp,
const struct sk_buff *skb, unsigned int header_len)
{
struct scatterlist sg;
const struct tcphdr *tp = tcp_hdr(skb);
- struct ahash_request *req = hp->md5_req;
+ struct ahash_request *req = hp->req;
unsigned int i;
const unsigned int head_data_len = skb_headlen(skb) > header_len ?
skb_headlen(skb) - header_len : 0;
@@ -4388,16 +4466,16 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
}
EXPORT_SYMBOL(tcp_md5_hash_skb_data);

-int tcp_md5_hash_key(struct tcp_md5sig_pool *hp, const struct tcp_md5sig_key *key)
+int tcp_md5_hash_key(struct tcp_sig_pool *hp, const struct tcp_md5sig_key *key)
{
u8 keylen = READ_ONCE(key->keylen); /* paired with WRITE_ONCE() in tcp_md5_do_add */
struct scatterlist sg;

sg_init_one(&sg, key->key, keylen);
- ahash_request_set_crypt(hp->md5_req, &sg, NULL, keylen);
+ ahash_request_set_crypt(hp->req, &sg, NULL, keylen);

/* We use data_race() because tcp_md5_do_add() might change key->key under us */
- return data_race(crypto_ahash_update(hp->md5_req));
+ return data_race(crypto_ahash_update(hp->req));
}
EXPORT_SYMBOL(tcp_md5_hash_key);

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 44db9afa17fc..7e18806c6d82 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1190,7 +1190,7 @@ int tcp_md5_do_add(struct sock *sk, const union tcp_md5_addr *addr,
key = sock_kmalloc(sk, sizeof(*key), gfp | __GFP_ZERO);
if (!key)
return -ENOMEM;
- if (WARN_ON_ONCE(!tcp_md5sig_pool_ready())) {
+ if (WARN_ON_ONCE(!tcp_sig_pool_ready(TCP_MD5_SIG_ID))) {
sock_kfree_s(sk, key, sizeof(*key));
return -ENOMEM;
}
@@ -1249,6 +1249,7 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
u8 prefixlen = 32;
int l3index = 0;
u8 flags;
+ int err;

if (optlen < sizeof(cmd))
return -EINVAL;
@@ -1294,14 +1295,15 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

- if (!tcp_md5sig_pool_alloc())
- return -ENOMEM;
+ err = tcp_sig_pool_alloc("md5");
+ if (err < 0 || WARN_ON_ONCE(err != TCP_MD5_SIG_ID))
+ return err;

return tcp_md5_do_add(sk, addr, AF_INET, prefixlen, l3index, flags,
cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
}

-static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
+static int tcp_v4_md5_hash_headers(struct tcp_sig_pool *hp,
__be32 daddr, __be32 saddr,
const struct tcphdr *th, int nbytes)
{
@@ -1321,37 +1323,36 @@ static int tcp_v4_md5_hash_headers(struct tcp_md5sig_pool *hp,
_th->check = 0;

sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th));
- ahash_request_set_crypt(hp->md5_req, &sg, NULL,
+ ahash_request_set_crypt(hp->req, &sg, NULL,
sizeof(*bp) + sizeof(*th));
- return crypto_ahash_update(hp->md5_req);
+ return crypto_ahash_update(hp->req);
}

static int tcp_v4_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
__be32 daddr, __be32 saddr, const struct tcphdr *th)
{
- struct tcp_md5sig_pool *hp;
+ struct tcp_sig_pool hp;
struct ahash_request *req;

- hp = tcp_md5sig_pool_get();
- if (!hp)
+ if (tcp_sig_pool_get(TCP_MD5_SIG_ID, &hp))
goto clear_hash_noput;
- req = hp->md5_req;
+ req = hp.req;

if (crypto_ahash_init(req))
goto clear_hash;
- if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2))
+ if (tcp_v4_md5_hash_headers(&hp, daddr, saddr, th, th->doff << 2))
goto clear_hash;
- if (tcp_md5_hash_key(hp, key))
+ if (tcp_md5_hash_key(&hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
return 0;

clear_hash:
- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
@@ -1361,7 +1362,7 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
const struct sock *sk,
const struct sk_buff *skb)
{
- struct tcp_md5sig_pool *hp;
+ struct tcp_sig_pool hp;
struct ahash_request *req;
const struct tcphdr *th = tcp_hdr(skb);
__be32 saddr, daddr;
@@ -1375,29 +1376,28 @@ int tcp_v4_md5_hash_skb(char *md5_hash, const struct tcp_md5sig_key *key,
daddr = iph->daddr;
}

- hp = tcp_md5sig_pool_get();
- if (!hp)
+ if (tcp_sig_pool_get(TCP_MD5_SIG_ID, &hp))
goto clear_hash_noput;
- req = hp->md5_req;
+ req = hp.req;

if (crypto_ahash_init(req))
goto clear_hash;

- if (tcp_v4_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+ if (tcp_v4_md5_hash_headers(&hp, daddr, saddr, th, skb->len))
goto clear_hash;
- if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+ if (tcp_md5_hash_skb_data(&hp, skb, th->doff << 2))
goto clear_hash;
- if (tcp_md5_hash_key(hp, key))
+ if (tcp_md5_hash_key(&hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
return 0;

clear_hash:
- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index c99cdb529902..7bfc5a42ce98 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -293,10 +293,10 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
tcptw->tw_md5_key = NULL;
if (static_branch_unlikely(&tcp_md5_needed)) {
struct tcp_md5sig_key *key;
- bool err = WARN_ON(!tcp_md5sig_pool_ready());
+ bool err = !tcp_sig_pool_ready(TCP_MD5_SIG_ID);

key = tp->af_specific->md5_lookup(sk, sk);
- if (key && !err) {
+ if (key && !WARN_ON(err)) {
tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
WARN_ON_ONCE(tcptw->tw_md5_key == NULL);
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 9147f9f69196..0abc28937058 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -603,6 +603,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
int l3index = 0;
u8 prefixlen;
u8 flags;
+ int err;

if (optlen < sizeof(cmd))
return -EINVAL;
@@ -654,8 +655,9 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
if (cmd.tcpm_keylen > TCP_MD5SIG_MAXKEYLEN)
return -EINVAL;

- if (!tcp_md5sig_pool_alloc())
- return -ENOMEM;
+ err = tcp_sig_pool_alloc("md5");
+ if (err < 0 || WARN_ON_ONCE(err != TCP_MD5_SIG_ID))
+ return err;

if (ipv6_addr_v4mapped(&sin6->sin6_addr))
return tcp_md5_do_add(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
@@ -668,7 +670,7 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
cmd.tcpm_key, cmd.tcpm_keylen, GFP_KERNEL);
}

-static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp,
+static int tcp_v6_md5_hash_headers(struct tcp_sig_pool *hp,
const struct in6_addr *daddr,
const struct in6_addr *saddr,
const struct tcphdr *th, int nbytes)
@@ -689,38 +691,37 @@ static int tcp_v6_md5_hash_headers(struct tcp_md5sig_pool *hp,
_th->check = 0;

sg_init_one(&sg, bp, sizeof(*bp) + sizeof(*th));
- ahash_request_set_crypt(hp->md5_req, &sg, NULL,
+ ahash_request_set_crypt(hp->req, &sg, NULL,
sizeof(*bp) + sizeof(*th));
- return crypto_ahash_update(hp->md5_req);
+ return crypto_ahash_update(hp->req);
}

static int tcp_v6_md5_hash_hdr(char *md5_hash, const struct tcp_md5sig_key *key,
const struct in6_addr *daddr, struct in6_addr *saddr,
const struct tcphdr *th)
{
- struct tcp_md5sig_pool *hp;
+ struct tcp_sig_pool hp;
struct ahash_request *req;

- hp = tcp_md5sig_pool_get();
- if (!hp)
+ if (tcp_sig_pool_get(TCP_MD5_SIG_ID, &hp))
goto clear_hash_noput;
- req = hp->md5_req;
+ req = hp.req;

if (crypto_ahash_init(req))
goto clear_hash;
- if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, th->doff << 2))
+ if (tcp_v6_md5_hash_headers(&hp, daddr, saddr, th, th->doff << 2))
goto clear_hash;
- if (tcp_md5_hash_key(hp, key))
+ if (tcp_md5_hash_key(&hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
return 0;

clear_hash:
- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
@@ -732,7 +733,7 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
const struct sk_buff *skb)
{
const struct in6_addr *saddr, *daddr;
- struct tcp_md5sig_pool *hp;
+ struct tcp_sig_pool hp;
struct ahash_request *req;
const struct tcphdr *th = tcp_hdr(skb);

@@ -745,29 +746,28 @@ static int tcp_v6_md5_hash_skb(char *md5_hash,
daddr = &ip6h->daddr;
}

- hp = tcp_md5sig_pool_get();
- if (!hp)
+ if (tcp_sig_pool_get(TCP_MD5_SIG_ID, &hp))
goto clear_hash_noput;
- req = hp->md5_req;
+ req = hp.req;

if (crypto_ahash_init(req))
goto clear_hash;

- if (tcp_v6_md5_hash_headers(hp, daddr, saddr, th, skb->len))
+ if (tcp_v6_md5_hash_headers(&hp, daddr, saddr, th, skb->len))
goto clear_hash;
- if (tcp_md5_hash_skb_data(hp, skb, th->doff << 2))
+ if (tcp_md5_hash_skb_data(&hp, skb, th->doff << 2))
goto clear_hash;
- if (tcp_md5_hash_key(hp, key))
+ if (tcp_md5_hash_key(&hp, key))
goto clear_hash;
ahash_request_set_crypt(req, NULL, md5_hash, 0);
if (crypto_ahash_final(req))
goto clear_hash;

- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
return 0;

clear_hash:
- tcp_md5sig_pool_put();
+ tcp_sig_pool_put();
clear_hash_noput:
memset(md5_hash, 0, 16);
return 1;
--
2.33.1

2021-11-05 03:25:13

by Eric Dumazet

Subject: Re: [PATCH 2/5] tcp/md5: Don't leak ahash in OOM

On Thu, Nov 4, 2021 at 6:50 PM Dmitry Safonov <[email protected]> wrote:
>
> In the quite unlikely scenario where __tcp_alloc_md5sig_pool() succeeds
> in crypto_alloc_ahash() but later fails to allocate a per-cpu request or
> scratch area, the ahash will be leaked.
> In theory this can happen multiple times under OOM conditions, once for
> every setsockopt(TCP_MD5SIG{,_EXT}).

Then store it in a global, like the other parts?

This makes the patch smaller, and hopefully the allocations will
eventually succeed, one at a time.

Bug fixes should target the net tree, with a Fixes: tag, not be buried
in a patch series targeting net-next (which is closed btw).

Thanks.

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index b461ae573afc82a2c37321b13c2d76f61cd13b53..e2353e35693935fb5abd7da4531c98b86fd35e1c
100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4260,13 +4260,14 @@ static bool tcp_md5sig_pool_populated = false;

static void __tcp_alloc_md5sig_pool(void)
{
- struct crypto_ahash *hash;
+ static struct crypto_ahash *hash;
int cpu;

- hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
- if (IS_ERR(hash))
- return;
-
+ if (IS_ERR_OR_NULL(hash)) {
+ hash = crypto_alloc_ahash("md5", 0, CRYPTO_ALG_ASYNC);
+ if (IS_ERR(hash))
+ return;
+ }
for_each_possible_cpu(cpu) {
void *scratch = per_cpu(tcp_md5sig_pool, cpu).scratch;
struct ahash_request *req;

2021-11-05 03:43:55

by Eric Dumazet

Subject: Re: [PATCH 1/5] tcp/md5: Don't BUG_ON() failed kmemdup()



On 11/4/21 6:49 PM, Dmitry Safonov wrote:
> static_branch_unlikely(&tcp_md5_needed) is enabled by
> tcp_alloc_md5sig_pool(), so as long as the code doesn't change,
> tcp_md5sig_pool has already been populated if this code is being
> executed.
>
> In case the tcptw->tw_md5_key allocation fails, there is no reason to
> crash the kernel: tcp_{v4,v6}_send_ack() will send an unsigned segment
> and the connection won't be established. That is bad enough, but in an
> OOM situation it is totally acceptable and better than a kernel crash.
>
> Introduce a tcp_md5sig_pool_ready() helper.
> tcp_alloc_md5sig_pool() usage is intentionally avoided here, as this is
> a fast path and the check is a sanity check rather than the point of
> actual pool allocation. That will allow having a generic slow-path
> allocator for the TCP crypto pool.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> include/net/tcp.h | 1 +
> net/ipv4/tcp.c | 5 +++++
> net/ipv4/tcp_minisocks.c | 5 +++--
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 4da22b41bde6..3e5423a10a74 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -1672,6 +1672,7 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
> #endif
>
> bool tcp_alloc_md5sig_pool(void);
> +bool tcp_md5sig_pool_ready(void);
>
> struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
> static inline void tcp_put_md5sig_pool(void)
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index b7796b4cf0a0..c0856a6af9f5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -4314,6 +4314,11 @@ bool tcp_alloc_md5sig_pool(void)
> }
> EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
>
> +bool tcp_md5sig_pool_ready(void)
> +{
> + return tcp_md5sig_pool_populated;
> +}
> +EXPORT_SYMBOL(tcp_md5sig_pool_ready);
>
> /**
> * tcp_get_md5sig_pool - get md5sig_pool for this user
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index cf913a66df17..c99cdb529902 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -293,11 +293,12 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
> tcptw->tw_md5_key = NULL;
> if (static_branch_unlikely(&tcp_md5_needed)) {
> struct tcp_md5sig_key *key;
> + bool err = WARN_ON(!tcp_md5sig_pool_ready());
>
> key = tp->af_specific->md5_lookup(sk, sk);
> - if (key) {
> + if (key && !err) {
> tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
> - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
> + WARN_ON_ONCE(tcptw->tw_md5_key == NULL);
> }
> }
> } while (0);
>

Hmmm.... how could this BUG_ON() trigger exactly?

tcp_md5_needed can only be enabled after __tcp_alloc_md5sig_pool() has
succeeded.

This patch, sent during the merge window, is a distraction, sorry.

About renaming: it looks nice, but it is a disaster for backports
done for stable releases. Please refrain from doing this.

2021-11-05 09:16:53

by Leonard Crestez

Subject: Re: [PATCH 1/5] tcp/md5: Don't BUG_ON() failed kmemdup()

On 11/5/21 4:55 AM, Eric Dumazet wrote:
>
>
> On 11/4/21 6:49 PM, Dmitry Safonov wrote:
>> static_branch_unlikely(&tcp_md5_needed) is enabled by
>> tcp_alloc_md5sig_pool(), so as long as the code doesn't change,
>> tcp_md5sig_pool has already been populated if this code is being
>> executed.
>>
>> In case the tcptw->tw_md5_key allocation fails, there is no reason to
>> crash the kernel: tcp_{v4,v6}_send_ack() will send an unsigned segment
>> and the connection won't be established. That is bad enough, but in an
>> OOM situation it is totally acceptable and better than a kernel crash.
>>
>> Introduce a tcp_md5sig_pool_ready() helper.
>> tcp_alloc_md5sig_pool() usage is intentionally avoided here, as this is
>> a fast path and the check is a sanity check rather than the point of
>> actual pool allocation. That will allow having a generic slow-path
>> allocator for the TCP crypto pool.
>>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>> include/net/tcp.h | 1 +
>> net/ipv4/tcp.c | 5 +++++
>> net/ipv4/tcp_minisocks.c | 5 +++--
>> 3 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index 4da22b41bde6..3e5423a10a74 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -1672,6 +1672,7 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
>> #endif
>>
>> bool tcp_alloc_md5sig_pool(void);
>> +bool tcp_md5sig_pool_ready(void);
>>
>> struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
>> static inline void tcp_put_md5sig_pool(void)
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index b7796b4cf0a0..c0856a6af9f5 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -4314,6 +4314,11 @@ bool tcp_alloc_md5sig_pool(void)
>> }
>> EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
>>
>> +bool tcp_md5sig_pool_ready(void)
>> +{
>> + return tcp_md5sig_pool_populated;
>> +}
>> +EXPORT_SYMBOL(tcp_md5sig_pool_ready);
>>
>> /**
>> * tcp_get_md5sig_pool - get md5sig_pool for this user
>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
>> index cf913a66df17..c99cdb529902 100644
>> --- a/net/ipv4/tcp_minisocks.c
>> +++ b/net/ipv4/tcp_minisocks.c
>> @@ -293,11 +293,12 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
>> tcptw->tw_md5_key = NULL;
>> if (static_branch_unlikely(&tcp_md5_needed)) {
>> struct tcp_md5sig_key *key;
>> + bool err = WARN_ON(!tcp_md5sig_pool_ready());
>>
>> key = tp->af_specific->md5_lookup(sk, sk);
>> - if (key) {
>> + if (key && !err) {
>> tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
>> - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
>> + WARN_ON_ONCE(tcptw->tw_md5_key == NULL);
>> }
>> }
>> } while (0);
>>
>
> Hmmm.... how could this BUG_ON() trigger exactly?
>
> tcp_md5_needed can only be enabled after __tcp_alloc_md5sig_pool() has
> succeeded.

The tcp_alloc_md5sig_pool() here should just be removed; it no longer
does anything since md5 pool reference counting and freeing were removed
back in 2013 by commit 71cea17ed39f ("tcp: md5: remove spinlock usage in
fast path").

2021-11-05 09:55:48

by Leonard Crestez

Subject: Re: [PATCH 5/5] tcp/md5: Make more generic tcp_sig_pool

On 11/5/21 3:49 AM, Dmitry Safonov wrote:
> Convert tcp_md5sig_pool to a more generic tcp_sig_pool.
> Now tcp_sig_pool_alloc(const char *alg) can be used to allocate a
> per-cpu ahash request for hashing algorithms besides md5.
> tcp_sig_pool_get() and tcp_sig_pool_put() should be used to get the
> ahash_request and scratch area.

This pool pattern is a workaround for crypto-api only being able to
allocate transforms from user context.

It would be useful for this "one-transform-per-cpu" object to be part
of the crypto api itself; there is nothing TCP-specific here other than
the size of the scratch buffer.
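
Something along these lines, as a rough sketch (all names hypothetical,
nothing like this exists in the crypto API today):

    struct crypto_ahash_pool;   /* one tfm/request per possible CPU */

    struct crypto_ahash_pool *crypto_alloc_ahash_pool(const char *alg,
                                                      size_t scratch_size);
    /* disables BHs, hands out this CPU's request and scratch buffer */
    struct ahash_request *crypto_ahash_pool_get(struct crypto_ahash_pool *pool,
                                                void **scratch);
    void crypto_ahash_pool_put(void);   /* re-enables BHs */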

> Make tcp_sig_pool reusable for TCP Authentication Option support
> (TCP-AO, RFC5925), where RFC5926[1] requires HMAC-SHA1 and AES-128_CMAC
>> hashing at least.

Additional work would be required to support options of arbitrary size
and I don't think anyone would use non-standard crypto algorithms.

Is RFC5926 conformance really insufficient?

My knowledge of cryptography doesn't go much beyond "data goes in
signature goes out" but there are many recent arguments from that cipher
agility is outright harmful and recent protocols like WireGuard don't
support any algorithm choices.

> +#define TCP_SIG_POOL_MAX 8
> +static struct tcp_sig_pool_priv_t {
> + struct tcp_sig_crypto cryptos[TCP_SIG_POOL_MAX];
> + unsigned int cryptos_nr;
> +} tcp_sig_pool_priv = {
> + .cryptos_nr = 1,
> + .cryptos[TCP_MD5_SIG_ID].alg = "md5",
> +};

Why an array of 8? Better to use an arbitrary list.

--
Regards,
Leonard

2021-11-05 13:31:44

by Dmitry Safonov

Subject: Re: [PATCH 1/5] tcp/md5: Don't BUG_ON() failed kmemdup()

On 11/5/21 02:55, Eric Dumazet wrote:
>
>
> On 11/4/21 6:49 PM, Dmitry Safonov wrote:
>> static_branch_unlikely(&tcp_md5_needed) is enabled by
>> tcp_alloc_md5sig_pool(), so as long as the code doesn't change,
>> tcp_md5sig_pool has already been populated if this code is being
>> executed.
>>
>> In case the tcptw->tw_md5_key allocation fails, there is no reason to
>> crash the kernel: tcp_{v4,v6}_send_ack() will send an unsigned segment
>> and the connection won't be established. That is bad enough, but in an
>> OOM situation it is totally acceptable and better than a kernel crash.
>>
>> Introduce a tcp_md5sig_pool_ready() helper.
>> tcp_alloc_md5sig_pool() usage is intentionally avoided here, as this is
>> a fast path and the check is a sanity check rather than the point of
>> actual pool allocation. That will allow having a generic slow-path
>> allocator for the TCP crypto pool.
>>
>> Signed-off-by: Dmitry Safonov <[email protected]>
>> ---
>> include/net/tcp.h | 1 +
>> net/ipv4/tcp.c | 5 +++++
>> net/ipv4/tcp_minisocks.c | 5 +++--
>> 3 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/net/tcp.h b/include/net/tcp.h
>> index 4da22b41bde6..3e5423a10a74 100644
>> --- a/include/net/tcp.h
>> +++ b/include/net/tcp.h
>> @@ -1672,6 +1672,7 @@ tcp_md5_do_lookup(const struct sock *sk, int l3index,
>> #endif
>>
>> bool tcp_alloc_md5sig_pool(void);
>> +bool tcp_md5sig_pool_ready(void);
>>
>> struct tcp_md5sig_pool *tcp_get_md5sig_pool(void);
>> static inline void tcp_put_md5sig_pool(void)
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index b7796b4cf0a0..c0856a6af9f5 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -4314,6 +4314,11 @@ bool tcp_alloc_md5sig_pool(void)
>> }
>> EXPORT_SYMBOL(tcp_alloc_md5sig_pool);
>>
>> +bool tcp_md5sig_pool_ready(void)
>> +{
>> + return tcp_md5sig_pool_populated;
>> +}
>> +EXPORT_SYMBOL(tcp_md5sig_pool_ready);
>>
>> /**
>> * tcp_get_md5sig_pool - get md5sig_pool for this user
>> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
>> index cf913a66df17..c99cdb529902 100644
>> --- a/net/ipv4/tcp_minisocks.c
>> +++ b/net/ipv4/tcp_minisocks.c
>> @@ -293,11 +293,12 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
>> tcptw->tw_md5_key = NULL;
>> if (static_branch_unlikely(&tcp_md5_needed)) {
>> struct tcp_md5sig_key *key;
>> + bool err = WARN_ON(!tcp_md5sig_pool_ready());
>>
>> key = tp->af_specific->md5_lookup(sk, sk);
>> - if (key) {
>> + if (key && !err) {
>> tcptw->tw_md5_key = kmemdup(key, sizeof(*key), GFP_ATOMIC);
>> - BUG_ON(tcptw->tw_md5_key && !tcp_alloc_md5sig_pool());
>> + WARN_ON_ONCE(tcptw->tw_md5_key == NULL);
>> }
>> }
>> } while (0);
>>
>
> Hmmm.... how could this BUG_ON() trigger exactly?
>
> tcp_md5_needed can only be enabled after __tcp_alloc_md5sig_pool() has
> succeeded.

Yeah, I've misread this part as
: BUG_ON(!tcptw->tw_md5_key || !tcp_alloc_md5sig_pool());

Still, there is an issue with checking tcp_alloc_md5sig_pool() here:
currently the condition is never true, but if it ever becomes true, the
tcp_alloc_md5sig_pool() call may cause tcp_time_wait() to sleep with BHs
disabled (i.e. from __tcp_close()).

I'll squash this with patch 3 and send when the merge window closes.

Thanks,
Dmitry

2021-11-05 14:01:06

by Dmitry Safonov

Subject: Re: [PATCH 5/5] tcp/md5: Make more generic tcp_sig_pool

On 11/5/21 09:54, Leonard Crestez wrote:
> On 11/5/21 3:49 AM, Dmitry Safonov wrote:
>> Convert tcp_md5sig_pool to a more generic tcp_sig_pool.
>> Now tcp_sig_pool_alloc(const char *alg) can be used to allocate a
>> per-cpu ahash request for hashing algorithms besides md5.
>> tcp_sig_pool_get() and tcp_sig_pool_put() should be used to get the
>> ahash_request and scratch area.
>
> This pool pattern is a workaround for crypto-api only being able to
> allocate transforms from user context.
> It would be useful for this "one-transform-per-cpu" object to be part
> of the crypto api itself; there is nothing TCP-specific here other than
> the size of the scratch buffer.

Agreed, it would be nice to have something like this as part of the
crypto API. The intention here is to reuse the md5 sig pool rather than
introduce another similar one.

>> Make tcp_sig_pool reusable for TCP Authentication Option support
>> (TCP-AO, RFC5925), where RFC5926[1] requires HMAC-SHA1 and AES-128_CMAC
>> hashing at least.
> Additional work would be required to support options of arbitrary size
> and I don't think anyone would use non-standard crypto algorithms.
>
> Is RFC5926 conformance really insufficient?

For the resulting hash, the scratch buffer can be used.

Honestly, I just don't see much benefit in introducing more code and
structures in order to limit hash algorithms. If anything,

:if (strcmp("hmac(sha1)", opts.algo) && strcmp("cmac(aes)", opts.algo))
: return -EPROTONOSUPPORT;

and passing the string straight to crypto seems to be better than adding
new structures.

On the other side, those two hashes MUST be supported to comply with
the RFC, others MAY. As users can already configure conflicting
receive/send ids for MKTs, I don't see a point in not allowing any hash
algorithm supported by crypto.

> My knowledge of cryptography doesn't go much beyond "data goes in,
> signature goes out", but there are many recent arguments that cipher
> agility is outright harmful, and recent protocols like WireGuard don't
> support any algorithm choices.

You already limit usage with the root-enabled sysctl, so I don't
see big concerns here.

>
>> +#define TCP_SIG_POOL_MAX 8
>> +static struct tcp_sig_pool_priv_t {
>> + struct tcp_sig_crypto cryptos[TCP_SIG_POOL_MAX];
>> + unsigned int cryptos_nr;
>> +} tcp_sig_pool_priv = {
>> + .cryptos_nr = 1,
>> + .cryptos[TCP_MD5_SIG_ID].alg = "md5",
>> +};
>
> Why an array of 8? Better to use an arbitrary list.

Some reasonable limit, maybe 16 or whatever, in order to avoid
dynamically (re-)allocating the array while keeping O(1) lookups.

Thanks,
Dmitry

2021-11-05 17:21:50

by Leonard Crestez

Subject: Re: [PATCH 5/5] tcp/md5: Make more generic tcp_sig_pool

On 11/5/21 3:59 PM, Dmitry Safonov wrote:
> On 11/5/21 09:54, Leonard Crestez wrote:
>> On 11/5/21 3:49 AM, Dmitry Safonov wrote:
>>> Convert tcp_md5sig_pool to a more generic tcp_sig_pool.
>>> Now tcp_sig_pool_alloc(const char *alg) can be used to allocate a
>>> per-cpu ahash request for hashing algorithms besides md5.
>>> tcp_sig_pool_get() and tcp_sig_pool_put() should be used to get the
>>> ahash_request and scratch area.
>>
>> This pool pattern is a workaround for crypto-api only being able to
>> allocate transforms from user context.
>> It would be useful for this "one-transform-per-cpu" object to be part
>> of the crypto api itself; there is nothing TCP-specific here other than
>> the size of the scratch buffer.
>
> Agreed, it would be nice to have something like this as part of the
> crypto API. The intention here is to reuse the md5 sig pool rather than
> introduce another similar one.
>
>>> Make tcp_sig_pool reusable for TCP Authentication Option support
>>> (TCP-AO, RFC5925), where RFC5926[1] requires HMAC-SHA1 and AES-128_CMAC
>>> hashing at least.
>> Additional work would be required to support options of arbitrary size
>> and I don't think anyone would use non-standard crypto algorithms.
>>
>> Is RFC5926 conformance really insufficient?
>
> For the resulting hash, the scratch buffer can be used.
>
> Honestly, I just don't see much benefit in introducing more code and
> structures in order to limit hash algorithms. If anything,
>
> :if (strcmp("hmac(sha1)", opts.algo) && strcmp("cmac(aes)", opts.algo))
> : return -EPROTONOSUPPORT;
>
> and passing the string straight to crypto seems to be better than adding
> new structures.
>
> On the other side, those two hashes MUST be supported to comply with
> the RFC, others MAY. As users can already configure conflicting
> receive/send ids for MKTs, I don't see a point in not allowing any hash
> algorithm supported by crypto.

The algorithm enum controls not just the algorithm but the length in
bytes of the traffic key and signatures. Supporting arbitrary algorithms
requires supporting arbitrary byte lengths everywhere and increasing a
couple of stack allocations (including some in the TCP core).

An earlier version actually had u8 fields for the length but those were
later removed.

Adding this is not difficult but requires careful testing of the new
corner cases. All this will be for functionality that is never going to
be used.

>> My knowledge of cryptography doesn't go much beyond "data goes in,
>> signature goes out", but there are many recent arguments that cipher
>> agility is outright harmful, and recent protocols like WireGuard don't
>> support any algorithm choices.
>
> You already limit usage with the root-enabled sysctl, so I don't
> see big concerns here.
>
>>
>>> +#define TCP_SIG_POOL_MAX 8
>>> +static struct tcp_sig_pool_priv_t {
>>> + struct tcp_sig_crypto cryptos[TCP_SIG_POOL_MAX];
>>> + unsigned int cryptos_nr;
>>> +} tcp_sig_pool_priv = {
>>> + .cryptos_nr = 1,
>>> + .cryptos[TCP_MD5_SIG_ID].alg = "md5",
>>> +};
>>
>> Why an array of 8? Better to use an arbitrary list.
>
> Some reasonable limit, maybe 16 or whatever, in order to avoid
> dynamically (re-)allocating the array while keeping O(1) lookups.

Defining an arbitrary array length limit is an underhanded way of making
lookup O(1).

2021-11-06 10:43:18

by Herbert Xu

Subject: Re: [PATCH 5/5] tcp/md5: Make more generic tcp_sig_pool

On Fri, Nov 05, 2021 at 01:59:35PM +0000, Dmitry Safonov wrote:
> On 11/5/21 09:54, Leonard Crestez wrote:
>
> > This pool pattern is a workaround for crypto-api only being able to
> > allocate transforms from user context.
> > It would be useful for this "one-transform-per-cpu" object to be part
> > of the crypto api itself; there is nothing TCP-specific here other than
> > the size of the scratch buffer.
>
> Agreed, it would be nice to have something like this as part of the
> crypto API. The intention here is to reuse the md5 sig pool rather than
> introduce another similar one.

As I said before, I'm happy to see the ahash/shash interface modified
so that we allow the key to be in the request object in addition to the
tfm. However, I don't really have the time to work on that and
nobody else from the crypto side seems interested in this.
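
Roughly the idea (the request-level setter below is hypothetical; today
the key can only be set on the tfm):

    /* today: the key is a property of the shared tfm */
    crypto_ahash_setkey(tfm, key, keylen);

    /* proposed: carry the key in the request object, so one tfm can
     * be shared across CPUs and keys (hypothetical helper)
     */
    ahash_request_set_key(req, key, keylen);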

So if you guys have the time and are willing to work on it then I'm
more than happy to help you.

Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt