2015-05-05 10:44:00

by Markus Stenberg

[permalink] [raw]
Subject: [PATCH] ipv6: Fixed source specific default route handling.

If there are only IPv6 source specific default routes present, the
host gets -ENETUNREACH on e.g. connect() because ip6_dst_lookup_tail
calls ip6_route_output first, and given source address any, it fails,
and ip6_route_get_saddr is never called.

The change is to use the ip6_route_get_saddr, even if the initial
ip6_route_output fails, and then doing ip6_route_output _again_ after
we have appropriate source address available.

Note that this is '99% fix' to the problem; a correct fix would be to
do route lookups only within addrconf.c when picking a source address,
and never call ip6_route_output before source address has been
populated.

Signed-off-by: Markus Stenberg <[email protected]>
---
net/ipv6/ip6_output.c | 39 +++++++++++++++++++++++++++++++--------
net/ipv6/route.c | 5 +++--
2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7fde1f2..c217775 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -886,22 +886,45 @@ static int ip6_dst_lookup_tail(struct sock *sk,
#endif
int err;

- if (!*dst)
- *dst = ip6_route_output(net, sk, fl6);
-
- err = (*dst)->error;
- if (err)
- goto out_err_release;
+ /* The correct way to handle this would be to do
+ * ip6_route_get_saddr, and then ip6_route_output; however,
+ * the route-specific preferred source forces the
+ * ip6_route_output call _before_ ip6_route_get_saddr.
+ *
+ * In source specific routing (no src=any default route),
+ * ip6_route_output will fail given src=any saddr, though, so
+ * that's why we try it again later.
+ */
+ if (ipv6_addr_any(&fl6->saddr) && (!*dst || !(*dst)->error)) {
+ struct rt6_info *rt;
+ bool had_dst = *dst != NULL;

- if (ipv6_addr_any(&fl6->saddr)) {
- struct rt6_info *rt = (struct rt6_info *) *dst;
+ if (!had_dst)
+ *dst = ip6_route_output(net, sk, fl6);
+ rt = (*dst)->error ? NULL : (struct rt6_info *)*dst;
err = ip6_route_get_saddr(net, rt, &fl6->daddr,
sk ? inet6_sk(sk)->srcprefs : 0,
&fl6->saddr);
if (err)
goto out_err_release;
+
+ /* If we had an erroneous initial result, pretend it
+ * never existed and let the SA-enabled version take
+ * over.
+ */
+ if (!had_dst && (*dst)->error) {
+ dst_release(*dst);
+ *dst = NULL;
+ }
}

+ if (!*dst)
+ *dst = ip6_route_output(net, sk, fl6);
+
+ err = (*dst)->error;
+ if (err)
+ goto out_err_release;
+
#ifdef CONFIG_IPV6_OPTIMISTIC_DAD
/*
* Here if the dst entry we've looked up
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 5c48293..d358888 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2245,9 +2245,10 @@ int ip6_route_get_saddr(struct net *net,
unsigned int prefs,
struct in6_addr *saddr)
{
- struct inet6_dev *idev = ip6_dst_idev((struct dst_entry *)rt);
+ struct inet6_dev *idev =
+ rt ? ip6_dst_idev((struct dst_entry *)rt) : NULL;
int err = 0;
- if (rt->rt6i_prefsrc.plen)
+ if (rt && rt->rt6i_prefsrc.plen)
*saddr = rt->rt6i_prefsrc.addr;
else
err = ipv6_dev_get_saddr(net, idev ? idev->dev : NULL,
--
2.3.2 (Apple Git-55)


2015-05-09 19:59:38

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

From: Markus Stenberg <[email protected]>
Date: Tue, 5 May 2015 13:36:59 +0300

> If there are only IPv6 source specific default routes present, the
> host gets -ENETUNREACH on e.g. connect() because ip6_dst_lookup_tail
> calls ip6_route_output first, and given source address any, it fails,
> and ip6_route_get_saddr is never called.
>
> The change is to use the ip6_route_get_saddr, even if the initial
> ip6_route_output fails, and then doing ip6_route_output _again_ after
> we have appropriate source address available.
>
> Note that this is '99% fix' to the problem; a correct fix would be to
> do route lookups only within addrconf.c when picking a source address,
> and never call ip6_route_output before source address has been
> populated.
>
> Signed-off-by: Markus Stenberg <[email protected]>

Applied, but would like to see the more complete fix at some
point.

2015-06-21 17:18:53

by Matthias Schiffer

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

On 05/05/2015 12:36 PM, Markus Stenberg wrote:
> If there are only IPv6 source specific default routes present, the
> host gets -ENETUNREACH on e.g. connect() because ip6_dst_lookup_tail
> calls ip6_route_output first, and given source address any, it fails,
> and ip6_route_get_saddr is never called.
>
> The change is to use the ip6_route_get_saddr, even if the initial
> ip6_route_output fails, and then doing ip6_route_output _again_ after
> we have appropriate source address available.
>
> Note that this is '99% fix' to the problem; a correct fix would be to
> do route lookups only within addrconf.c when picking a source address,
> and never call ip6_route_output before source address has been
> populated.
>
> Signed-off-by: Markus Stenberg <[email protected]>
> ---
> net/ipv6/ip6_output.c | 39 +++++++++++++++++++++++++++++++--------
> net/ipv6/route.c | 5 +++--
> 2 files changed, 34 insertions(+), 10 deletions(-)
>
...

So... how does ip6_route_get_saddr() select the source address when no
route is given? OpenWrt has recently started relying on this patch and
I'm seeing quite weird source address selection behaviour (@Steven:
especially since OpenWrt commit r45941).

Steps to reproduce:

ip l add test link eth0 type macvlan
ip l set test up
ip a add fd00::20/64 dev eth0
ip a add fd00::1/128 dev test

Upto here everything is okay,

ping6 fd00::10

will use the correct source address fd00::20.

Now I add an additional source-specific route:

ip r add fd00::/64 from fd00::/64 dev eth0

(this certainly looks like a contrived example, but the configuration
OpenWrt's netifd/odhcp6c creates is similar)

Without this patch, the ping will fail with 'network unreachable' (which
is weird by itself - why does the source-specific route shadow the
generic route even though no source address has been chosen?). But after
applying this patch, the kernel will now choose the address with the
longest common prefix with the destination, which is fd00::1 - even
though this address is assigned as /128.

Adding a prefsrc attribute to the source-specific route doesn't have an
effect as ip6_route_get_saddr() doesn't even get the route when the
non-source-specific ip6_route_output() has failed. Thus, there's
currently no way (to my knowledge) to specify the source address to use
when source-specific routes are involved...

Matthias


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2015-06-21 22:12:24

by Markus Stenberg

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

Prefsrc is essentially historic non IPv6 construct. IPv6 SAS is based on dst, src, metric ordered lookup just like the routing is too ( lookup rfc, some src specific routing drafts for details ).

Therefore I do not see a problem. If you want specific SA, add same route with higher metric and/or (more) specific src match.

There might be bugs there tho, but that is how it should work. As SAS is supposed to happen before routing ( see rfc ) the prefsrc is .. Cough.

-Markus

2015-06-21 22:36:03

by Matthias Schiffer

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

On 06/22/2015 12:05 AM, Markus Stenberg wrote:
> Prefsrc is essentially historic non IPv6 construct. IPv6 SAS is based on dst, src, metric ordered lookup just like the routing is too ( lookup rfc, some src specific routing drafts for details ).
>
> Therefore I do not see a problem. If you want specific SA, add same route with higher metric and/or (more) specific src match.
>
> There might be bugs there tho, but that is how it should work. As SAS is supposed to happen before routing ( see rfc ) the prefsrc is .. Cough.
>
> -Markus
>

Could you explain in detail what you mean with "If you want specific SA,
add same route with higher metric and/or (more) specific src match."?
Routes aren't bound to specific addresses except via the "src" attribute
(which is called prefsrc in the kernel), which is exactly what it not
working. I can't control the chosen source address at all when
source-specific routes are involved.

Also, metric-based route selection is broken when source-specific routes
are involved. The commands mentioned in my first mail will create the
following configuration:

# ip a
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state
UNKNOWN group default qlen 500
link/ether 22:46:f4:9c:9e:3a brd ff:ff:ff:ff:ff:ff
inet6 fd00::20/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::2046:f4ff:fe9c:9e3a/64 scope link
valid_lft forever preferred_lft forever
4: test@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
state UNKNOWN group default
link/ether ae:2b:02:16:23:0f brd ff:ff:ff:ff:ff:ff
inet6 fd00::1/128 scope global
valid_lft forever preferred_lft forever
inet6 fe80::ac2b:2ff:fe16:230f/64 scope link
valid_lft forever preferred_lft forever

# ip -6 r
fd00::/64 from fd00::/64 dev eth0 metric 1024
fd00::1 dev test proto kernel metric 256
fd00::/64 dev eth0 proto kernel metric 256

The only route I have added manually is the source-specific one, the
other two have been created by address assignment. Adding a "src"
address to the source-specific route has no effect.

Even though the source-specific route has a higher metric than the
generic one, the source-specific one shadows the generic route.

Thanks for your reply,
Matthias


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2015-06-21 23:13:15

by Markus Stenberg

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

You have /128 dst. To override it you need dst/128 AND src/>0 route. ( or just /128 dst and higher metric).

Sorry if I am bit unclear - can explain better given real keyboard but that is avail only week from now.

-Markus

(on the road, via iPhone)

21.6.2015 23.35$B!"(BMatthias Schiffer <[email protected]> $B$N%a%C%;!<%8(B:

>> On 06/22/2015 12:05 AM, Markus Stenberg wrote:
>> Prefsrc is essentially historic non IPv6 construct. IPv6 SAS is based on dst, src, metric ordered lookup just like the routing is too ( lookup rfc, some src specific routing drafts for details ).
>>
>> Therefore I do not see a problem. If you want specific SA, add same route with higher metric and/or (more) specific src match.
>>
>> There might be bugs there tho, but that is how it should work. As SAS is supposed to happen before routing ( see rfc ) the prefsrc is .. Cough.
>>
>> -Markus
>
> Could you explain in detail what you mean with "If you want specific SA,
> add same route with higher metric and/or (more) specific src match."?
> Routes aren't bound to specific addresses except via the "src" attribute
> (which is called prefsrc in the kernel), which is exactly what it not
> working. I can't control the chosen source address at all when
> source-specific routes are involved.
>
> Also, metric-based route selection is broken when source-specific routes
> are involved. The commands mentioned in my first mail will create the
> following configuration:
>
> # ip a
> 3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state
> UNKNOWN group default qlen 500
> link/ether 22:46:f4:9c:9e:3a brd ff:ff:ff:ff:ff:ff
> inet6 fd00::20/64 scope global
> valid_lft forever preferred_lft forever
> inet6 fe80::2046:f4ff:fe9c:9e3a/64 scope link
> valid_lft forever preferred_lft forever
> 4: test@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
> state UNKNOWN group default
> link/ether ae:2b:02:16:23:0f brd ff:ff:ff:ff:ff:ff
> inet6 fd00::1/128 scope global
> valid_lft forever preferred_lft forever
> inet6 fe80::ac2b:2ff:fe16:230f/64 scope link
> valid_lft forever preferred_lft forever
>
> # ip -6 r
> fd00::/64 from fd00::/64 dev eth0 metric 1024
> fd00::1 dev test proto kernel metric 256
> fd00::/64 dev eth0 proto kernel metric 256
>
> The only route I have added manually is the source-specific one, the
> other two have been created by address assignment. Adding a "src"
> address to the source-specific route has no effect.
>
> Even though the source-specific route has a higher metric than the
> generic one, the source-specific one shadows the generic route.
>
> Thanks for your reply,
> Matthias
>

2015-06-22 06:29:36

by Steven Barth

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

On 22.06.2015 00:35, Matthias Schiffer wrote:
> Could you explain in detail what you mean with "If you want specific SA,
> add same route with higher metric and/or (more) specific src match."?
> Routes aren't bound to specific addresses except via the "src" attribute
> (which is called prefsrc in the kernel), which is exactly what it not
> working. I can't control the chosen source address at all when
> source-specific routes are involved.
Except that prefsrc and src are two different beasts and usually ip route from transates to
RTA_SRC instead of RTA_PREFSOURCE when used with a prefix length.

Try adding two routes to the same destination with the same metric but different source values with PREFSRC (e.g. IPv4) and then
try doing the same with SRC (e.g. IPv6). The former will fail but the latter will succeed.


https://tools.ietf.org/html/draft-troan-homenet-sadr-01
was the original draft for source-address dependent routing IIRC so might be a good read.


>
> Even though the source-specific route has a higher metric than the
> generic one, the source-specific one shadows the generic route.

(was a bit ago since I read into this so please correct me if I am wrong)
IIRC this is intentional since longest-prefix-match beats metric here
and the source-address match counts to being more-specific here. See also above difference between PREFSRC and SRC.



Cheers,

Steven


2015-06-22 17:51:25

by Matthias Schiffer

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

On 06/22/2015 07:58 AM, Steven Barth wrote:
> On 22.06.2015 00:35, Matthias Schiffer wrote:
>> Could you explain in detail what you mean with "If you want specific SA,
>> add same route with higher metric and/or (more) specific src match."?
>> Routes aren't bound to specific addresses except via the "src" attribute
>> (which is called prefsrc in the kernel), which is exactly what it not
>> working. I can't control the chosen source address at all when
>> source-specific routes are involved.
> Except that prefsrc and src are two different beasts and usually ip route from transates to
> RTA_SRC instead of RTA_PREFSOURCE when used with a prefix length.
>
> Try adding two routes to the same destination with the same metric but different source values with PREFSRC (e.g. IPv4) and then
> try doing the same with SRC (e.g. IPv6). The former will fail but the latter will succeed.

Ah sorry, I didn't know that "src" and "prefsrc" were distinct concepts.
I meant to refer to "src" whenever I wrote "prefsrc". What are the
precise semantics of the "src" attribute? Any RFC I can read, or is this
a Linux-specific concept?

>
>
> https://tools.ietf.org/html/draft-troan-homenet-sadr-01
> was the original draft for source-address dependent routing IIRC so might be a good read.

Thanks for the link, that helps a bit.

>
>
>>
>> Even though the source-specific route has a higher metric than the
>> generic one, the source-specific one shadows the generic route.
>
> (was a bit ago since I read into this so please correct me if I am wrong)
> IIRC this is intentional since longest-prefix-match beats metric here
> and the source-address match counts to being more-specific here. See also above difference between PREFSRC and SRC.

Ah, that would explain the metric issue. I looks like the source of my
confusion is that for source-specific routes *all* addresses are in the
candidate set, not only the addresses of the outgoing interface (which
makes sense as ip6_route_get_saddr() is called with a NULL rt6_info in
the source-specific case).

I'm not sure if this can be fixed in a sane way (as there seems to be a
dependency cycle: source address should depend on outgoing interface,
which depends on the chosen route, which depends on the source address),
but it leads to highly unintuitive source address selection :(

Markus suggested in the commit message not to call ip6_route_output at
all before the source address has been selected. Wouldn't this make it
impossible to choose the source address depending on the outgoing
interface in the non-source-specific case as well?

> Cheers,
>
> Steven

Thanks for the explanation,
Matthias


Attachments:
signature.asc (819.00 B)
OpenPGP digital signature

2015-06-23 01:54:17

by Hideaki Yoshifuji

[permalink] [raw]
Subject: Re: [PATCH] ipv6: Fixed source specific default route handling.

Matthias Schiffer wrote:
> On 06/22/2015 07:58 AM, Steven Barth wrote:
>> On 22.06.2015 00:35, Matthias Schiffer wrote:
>>> Could you explain in detail what you mean with "If you want specific SA,
>>> add same route with higher metric and/or (more) specific src match."?
>>> Routes aren't bound to specific addresses except via the "src" attribute
>>> (which is called prefsrc in the kernel), which is exactly what it not
>>> working. I can't control the chosen source address at all when
>>> source-specific routes are involved.
>> Except that prefsrc and src are two different beasts and usually ip route from transates to
>> RTA_SRC instead of RTA_PREFSOURCE when used with a prefix length.
>>
>> Try adding two routes to the same destination with the same metric but different source values with PREFSRC (e.g. IPv4) and then
>> try doing the same with SRC (e.g. IPv6). The former will fail but the latter will succeed.
>
> Ah sorry, I didn't know that "src" and "prefsrc" were distinct concepts.
> I meant to refer to "src" whenever I wrote "prefsrc". What are the
> precise semantics of the "src" attribute? Any RFC I can read, or is this
> a Linux-specific concept?
>

"src" is long-lived feature which is usually used with mutiple routing
tables by "ip rule".

--yoshfuji

>>
>>
>> https://tools.ietf.org/html/draft-troan-homenet-sadr-01
>> was the original draft for source-address dependent routing IIRC so might be a good read.
>
> Thanks for the link, that helps a bit.
>
>>
>>
>>>
>>> Even though the source-specific route has a higher metric than the
>>> generic one, the source-specific one shadows the generic route.
>>
>> (was a bit ago since I read into this so please correct me if I am wrong)
>> IIRC this is intentional since longest-prefix-match beats metric here
>> and the source-address match counts to being more-specific here. See also above difference between PREFSRC and SRC.
>
> Ah, that would explain the metric issue. I looks like the source of my
> confusion is that for source-specific routes *all* addresses are in the
> candidate set, not only the addresses of the outgoing interface (which
> makes sense as ip6_route_get_saddr() is called with a NULL rt6_info in
> the source-specific case).
>
> I'm not sure if this can be fixed in a sane way (as there seems to be a
> dependency cycle: source address should depend on outgoing interface,
> which depends on the chosen route, which depends on the source address),
> but it leads to highly unintuitive source address selection :(
>
> Markus suggested in the commit message not to call ip6_route_output at
> all before the source address has been selected. Wouldn't this make it
> impossible to choose the source address depending on the outgoing
> interface in the non-source-specific case as well?
>
>> Cheers,
>>
>> Steven
>
> Thanks for the explanation,
> Matthias
>

--
吉藤英明 <[email protected]>
ミラクル・リナックス株式会社 技術本部 サポート部