With net.ipv4.tcp_l3mdev_accept=1 it is possible for a listen socket to
accept connection from the same client address in different VRFs. It is
also possible to set different MD5 keys for these clients which
different only in the tcpm_l3index field.
This appears to work when distinguishing between different VRFs but not
between non-VRF and VRF connections. In particular:
* tcp_md5_do_lookup_exact will match a non-vrf key against a vrf key.
This means that adding a key with l3index != 0 after a key with l3index
== 0 will cause the earlier key to be deleted. Both keys can be present
if the non-vrf key is added later.
* _tcp_md5_do_lookup can match a non-vrf key before a vrf key. This
casues failures if the passwords differ.
This was found while working on TCP-AO tests, the exact failing tests
are among test_vrf_overlap*_md5 from this file:
https://github.com/cdleonard/tcp-authopt-test/blob/main/tcp_authopt_test/test_vrf_bind.py
There is also a test for overlapping between vrfs
(test_vrf_overlap12_md5) and that works correctly without this change.
Reproducing this with nettest and fcnal-test.sh would require support
for multiple keys inside the nettest server.
Signed-off-by: Leonard Crestez <[email protected]>
---
net/ipv4/tcp_ipv4.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
The fact that keys with l3index==0 affect VRF connections might not be
desirable at all. It might make more sense to have an option to completely
ignore keys outside the vrf for connections inside the VRF.
At least this patch makes it behave in a well defined manner that doesn't
depend on the order of key addition.
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 29a57bd159f0..a9a6a6d598c6 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1035,10 +1035,24 @@ static void tcp_v4_reqsk_destructor(struct request_sock *req)
*/
DEFINE_STATIC_KEY_FALSE(tcp_md5_needed);
EXPORT_SYMBOL(tcp_md5_needed);
+static bool better_md5_match(struct tcp_md5sig_key *old, struct tcp_md5sig_key *new)
+{
+ if (!old)
+ return true;
+
+ /* l3index always overrides non-l3index */
+ if (old->l3index && new->l3index == 0)
+ return false;
+ if (old->l3index == 0 && new->l3index)
+ return true;
+
+ return old->prefixlen < new->prefixlen;
+}
+
/* Find the Key structure for an address. */
struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index,
const union tcp_md5_addr *addr,
int family)
{
@@ -1072,12 +1086,11 @@ struct tcp_md5sig_key *__tcp_md5_do_lookup(const struct sock *sk, int l3index,
#endif
} else {
match = false;
}
- if (match && (!best_match ||
- key->prefixlen > best_match->prefixlen))
+ if (match && better_md5_match(best_match, key))
best_match = key;
}
return best_match;
}
EXPORT_SYMBOL(__tcp_md5_do_lookup);
@@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk,
#endif
hlist_for_each_entry_rcu(key, &md5sig->head, node,
lockdep_sock_is_held(sk)) {
if (key->family != family)
continue;
- if (key->l3index && key->l3index != l3index)
+ if (key->l3index != l3index)
continue;
if (!memcmp(&key->addr, addr, size) &&
key->prefixlen == prefixlen)
return key;
}
base-commit: 9cbfc51af026f5b721a1b36cf622ada591b3c5de
--
2.25.1
On 10/6/21 11:48 AM, Leonard Crestez wrote:
> @@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk,
> #endif
> hlist_for_each_entry_rcu(key, &md5sig->head, node,
> lockdep_sock_is_held(sk)) {
> if (key->family != family)
> continue;
> - if (key->l3index && key->l3index != l3index)
> + if (key->l3index != l3index)
That seems like the bug fix there. The L3 reference needs to match for
new key and existing key. I think the same change is needed in
__tcp_md5_do_lookup.
> continue;
> if (!memcmp(&key->addr, addr, size) &&
> key->prefixlen == prefixlen)
> return key;
> }
>
> base-commit: 9cbfc51af026f5b721a1b36cf622ada591b3c5de
>
On 07.10.2021 04:14, David Ahern wrote:
> On 10/6/21 11:48 AM, Leonard Crestez wrote:
>> @@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key *tcp_md5_do_lookup_exact(const struct sock *sk,
>> #endif
>> hlist_for_each_entry_rcu(key, &md5sig->head, node,
>> lockdep_sock_is_held(sk)) {
>> if (key->family != family)
>> continue;
>> - if (key->l3index && key->l3index != l3index)
>> + if (key->l3index != l3index)
>
> That seems like the bug fix there. The L3 reference needs to match for
> new key and existing key. I think the same change is needed in
> __tcp_md5_do_lookup.
Current behavior is that keys added without tcpm_ifindex will match
connections both inside and outside VRFs. Changing this might break real
applications, is it really OK to claim that this behavior was a bug all
along?
The approach with most backward compatibility would be to add a new flag
for keys that only match non-vrf connections.
Alternatively (TCP_MD5SIG_FLAG_IFINDEX && tcpm_ifindex == 0) could be
defined as "only non-vrf connections" while TCP_MD5SIG_FLAG_IFINDEX
missing could be "either".
--
Regards,
Leonard
On 10/7/21 12:41 AM, Leonard Crestez wrote:
>
>
> On 07.10.2021 04:14, David Ahern wrote:
>> On 10/6/21 11:48 AM, Leonard Crestez wrote:
>>> @@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key
>>> *tcp_md5_do_lookup_exact(const struct sock *sk,
>>> #endif
>>> hlist_for_each_entry_rcu(key, &md5sig->head, node,
>>> lockdep_sock_is_held(sk)) {
>>> if (key->family != family)
>>> continue;
>>> - if (key->l3index && key->l3index != l3index)
>>> + if (key->l3index != l3index)
>>
>> That seems like the bug fix there. The L3 reference needs to match for
>> new key and existing key. I think the same change is needed in
>> __tcp_md5_do_lookup.
>
> Current behavior is that keys added without tcpm_ifindex will match
> connections both inside and outside VRFs. Changing this might break real
> applications, is it really OK to claim that this behavior was a bug all
> along?
no.
It's been a few years. I need to refresh on the logic and that is not
going to happen before this weekend.
On 07.10.2021 21:27, David Ahern wrote:
> On 10/7/21 12:41 AM, Leonard Crestez wrote:
>> On 07.10.2021 04:14, David Ahern wrote:
>>> On 10/6/21 11:48 AM, Leonard Crestez wrote:
>>>> @@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key
>>>> *tcp_md5_do_lookup_exact(const struct sock *sk,
>>>> #endif
>>>> hlist_for_each_entry_rcu(key, &md5sig->head, node,
>>>> lockdep_sock_is_held(sk)) {
>>>> if (key->family != family)
>>>> continue;
>>>> - if (key->l3index && key->l3index != l3index)
>>>> + if (key->l3index != l3index)
>>>
>>> That seems like the bug fix there. The L3 reference needs to match for
>>> new key and existing key. I think the same change is needed in
>>> __tcp_md5_do_lookup.
>>
>> Current behavior is that keys added without tcpm_ifindex will match
>> connections both inside and outside VRFs. Changing this might break real
>> applications, is it really OK to claim that this behavior was a bug all
>> along?
>
> no.
>
> It's been a few years. I need to refresh on the logic and that is not
> going to happen before this weekend.
It seems that always doing a strict key->l3index != l3index condition
inside of __tcp_md5_do_lookup breaks the usecase of binding one listener
to each VRF and not specifying the ifindex for each key.
This is a very valid usecase, maybe the most common way to use md5 with vrf.
Ways to fix this:
* Make this comparison only take effect if TCP_MD5SIG_FLAG_IFINDEX is set.
* Make this comparison only take effect if tcp_l3mdev_accept=1
* Add a new flag?
Right now passing TCP_MD5SIG_FLAG_IFINDEX and ifindex == 0 results in an
error but maybe it should be accepted to mean "key applies only for
default VRF".
--
Regards,
Leonard
On 10/8/21 9:51 AM, Leonard Crestez wrote:
> On 07.10.2021 21:27, David Ahern wrote:
>> On 10/7/21 12:41 AM, Leonard Crestez wrote:
>>> On 07.10.2021 04:14, David Ahern wrote:
>>>> On 10/6/21 11:48 AM, Leonard Crestez wrote:
>>>>> @@ -1103,11 +1116,11 @@ static struct tcp_md5sig_key
>>>>> *tcp_md5_do_lookup_exact(const struct sock *sk,
>>>>> #endif
>>>>> hlist_for_each_entry_rcu(key, &md5sig->head, node,
>>>>> lockdep_sock_is_held(sk)) {
>>>>> if (key->family != family)
>>>>> continue;
>>>>> - if (key->l3index && key->l3index != l3index)
>>>>> + if (key->l3index != l3index)
>>>>
>>>> That seems like the bug fix there. The L3 reference needs to match for
>>>> new key and existing key. I think the same change is needed in
>>>> __tcp_md5_do_lookup.
>>>
>>> Current behavior is that keys added without tcpm_ifindex will match
>>> connections both inside and outside VRFs. Changing this might break real
>>> applications, is it really OK to claim that this behavior was a bug all
>>> along?
>>
>> no.
>>
>> It's been a few years. I need to refresh on the logic and that is not
>> going to happen before this weekend.
>
> It seems that always doing a strict key->l3index != l3index condition
> inside of __tcp_md5_do_lookup breaks the usecase of binding one listener
> to each VRF and not specifying the ifindex for each key.
>
> This is a very valid usecase, maybe the most common way to use md5 with
> vrf.
>
> Ways to fix this:
> * Make this comparison only take effect if TCP_MD5SIG_FLAG_IFINDEX is set.
> * Make this comparison only take effect if tcp_l3mdev_accept=1
> * Add a new flag?
>
> Right now passing TCP_MD5SIG_FLAG_IFINDEX and ifindex == 0 results in an
> error but maybe it should be accepted to mean "key applies only for
> default VRF".
>
I think I remember the history now: prior to the set
98c8147648fa..5cad8bce26e0 MD5 lookups for VRF and default VRF both
succeed on an address or prefix match because the L3 domain was not
checked. That set did not want to break the legacy behavior which is why
the change is based on db key having l3index set and matching the
ingress domain.
That means the limitation (hole depending on perspective) is a default
VRF and a VRF having overlap addresses with a key installed with the L3
index set (can't since default VRF does not have one) - which I believe
is your point.
So, yes, one option is to have a flag that indicates strict checking to
close the legacy path.