2020-11-25 11:47:31

by Phil Sutter

Subject: XFRM interface and NF_INET_LOCAL_OUT hook

Hi Steffen,

I am working on a ticket complaining about netfilter policy match
missing packets in OUTPUT chain if XFRM interface is being used.

I don't fully understand the relevant code path, but it seems like
skb_dst(skb)->xfrm is not yet assigned when the skb is routed towards
XFRM interface and already cleared again (by xfrm_output_one?) before it
makes its way towards the real output interface. NF_INET_POST_ROUTING
hook works though.

Is this a bug or an expected quirk when using XFRM interface?

Cheers, Phil


2020-11-26 11:35:27

by Steffen Klassert

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

Hi Phil,

On Wed, Nov 25, 2020 at 12:23:42PM +0100, Phil Sutter wrote:
> Hi Steffen,
>
> I am working on a ticket complaining about netfilter policy match
> missing packets in OUTPUT chain if XFRM interface is being used.
>
> I don't fully understand the relevant code path, but it seems like
> skb_dst(skb)->xfrm is not yet assigned when the skb is routed towards
> XFRM interface and already cleared again (by xfrm_output_one?) before it
> makes its way towards the real output interface. NF_INET_POST_ROUTING
> hook works though.
>
> Is this a bug or an expected quirk when using XFRM interface?

This is expected behaviour. The xfrm interfaces are plaintext devices,
the plaintext packets are routed to the xfrm interface which guarantees
transformation. So the lookup that assigns skb_dst(skb)->xfrm
happens 'behind' the interface. After transformation,
skb_dst(skb)->xfrm will be cleared. So this assignment exists just
inside xfrm in that case.

Does netfilter match against skb_dst(skb)->xfrm? What is the exact case
that does not work?

2020-11-26 13:12:44

by Phil Sutter

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

Hi Steffen,

On Thu, Nov 26, 2020 at 10:40:21AM +0100, Steffen Klassert wrote:
> On Wed, Nov 25, 2020 at 12:23:42PM +0100, Phil Sutter wrote:
> > I am working on a ticket complaining about netfilter policy match
> > missing packets in OUTPUT chain if XFRM interface is being used.
> >
> > I don't fully understand the relevant code path, but it seems like
> > skb_dst(skb)->xfrm is not yet assigned when the skb is routed towards
> > XFRM interface and already cleared again (by xfrm_output_one?) before it
> > makes its way towards the real output interface. NF_INET_POST_ROUTING
> > hook works though.
> >
> > Is this a bug or an expected quirk when using XFRM interface?
>
> This is expected behaviour. The xfrm interfaces are plaintext devices,
> the plaintext packets are routed to the xfrm interface which guarantees
> transformation. So the lookup that assigns skb_dst(skb)->xfrm
> happens 'behind' the interface. After transformation,
> skb_dst(skb)->xfrm will be cleared. So this assignment exists just
> inside xfrm in that case.

OK, thanks for the clarification.

> Does netfilter match against skb_dst(skb)->xfrm? What is the exact case
> that does not work?

The reported use-case is a match against tunnel data in output hook:

| table t {
| 	chain c {
| 		type filter hook output priority filter
| 		oifname eth0 ipsec out ip daddr 192.168.1.2
| 	}
| }

The ipsec expression tries to extract that data from skb_dst(skb)->xfrm
if present. In xt_policy (for iptables), code is equivalent. The above
works when not using xfrm_interface. Initially I assumed one just needs
to adjust the oifname match, but even dropping it doesn't help.
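
To make that dependency concrete, here is a minimal user-space sketch in C. It is not kernel code: the `model_*` structures and the helper `model_ipsec_out_daddr_match()` are hypothetical stand-ins, and the only thing they are meant to show is that a match of this kind has nothing to compare against when the packet's route carries no xfrm state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for kernel structures; NOT the real definitions. */
struct model_xfrm_state {
	uint32_t tunnel_daddr;	/* outer destination address, host byte order */
};

struct model_dst {
	const struct model_xfrm_state *xfrm;	/* NULL if no transform is attached */
};

struct model_skb {
	const struct model_dst *dst;
};

/*
 * Hypothetical helper mirroring the idea behind "ipsec out ip daddr":
 * the comparison is only possible if the route attached to the packet
 * carries an xfrm state at the time the hook runs.
 */
static bool model_ipsec_out_daddr_match(const struct model_skb *skb,
					uint32_t want_daddr)
{
	assert(skb != NULL);
	if (!skb->dst || !skb->dst->xfrm)
		return false;	/* no IPsec context visible: the rule cannot match */
	return skb->dst->xfrm->tunnel_daddr == want_daddr;
}
```

With a plain policy the route seen in the output hook would correspond to a `model_dst` with `xfrm` set; with an xfrm interface it would correspond to one where `xfrm` is still NULL, so the helper returns false regardless of the address.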

Cheers, Phil

2020-11-27 09:58:17

by Steffen Klassert

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

On Thu, Nov 26, 2020 at 02:12:00PM +0100, Phil Sutter wrote:
> > >
> > > Is this a bug or an expected quirk when using XFRM interface?
> >
> > This is expected behaviour. The xfrm interfaces are plaintext devices,
> > the plaintext packets are routed to the xfrm interface which guarantees
> > transformation. So the lookup that assigns skb_dst(skb)->xfrm
> > happens 'behind' the interface. After transformation,
> > skb_dst(skb)->xfrm will be cleared. So this assignment exists just
> > inside xfrm in that case.
>
> OK, thanks for the clarification.
>
> > Does netfilter match against skb_dst(skb)->xfrm? What is the exact case
> > that does not work?
>
> The reported use-case is a match against tunnel data in output hook:
>
> | table t {
> | 	chain c {
> | 		type filter hook output priority filter
> | 		oifname eth0 ipsec out ip daddr 192.168.1.2
> | 	}
> | }
>
> The ipsec expression tries to extract that data from skb_dst(skb)->xfrm
> if present. In xt_policy (for iptables), code is equivalent. The above
> works when not using xfrm_interface. Initially I assumed one just needs
> to adjust the oifname match, but even dropping it doesn't help.

Yes, this does not work with xfrm interfaces. As said, they are plaintext
devices that guarantee transformation.

Maybe you can try to match after transformation by using the secpath,
but not sure if that is what you need.

2020-11-27 14:11:41

by Phil Sutter

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

On Fri, Nov 27, 2020 at 10:55:11AM +0100, Steffen Klassert wrote:
> On Thu, Nov 26, 2020 at 02:12:00PM +0100, Phil Sutter wrote:
> > > >
> > > > Is this a bug or an expected quirk when using XFRM interface?
> > >
> > > This is expected behaviour. The xfrm interfaces are plaintext devices,
> > > the plaintext packets are routed to the xfrm interface which guarantees
> > > transformation. So the lookup that assigns skb_dst(skb)->xfrm
> > > happens 'behind' the interface. After transformation,
> > > skb_dst(skb)->xfrm will be cleared. So this assignment exists just
> > > inside xfrm in that case.
> >
> > OK, thanks for the clarification.
> >
> > > Does netfilter match against skb_dst(skb)->xfrm? What is the exact case
> > > that does not work?
> >
> > The reported use-case is a match against tunnel data in output hook:
> >
> > | table t {
> > | 	chain c {
> > | 		type filter hook output priority filter
> > | 		oifname eth0 ipsec out ip daddr 192.168.1.2
> > | 	}
> > | }
> >
> > The ipsec expression tries to extract that data from skb_dst(skb)->xfrm
> > if present. In xt_policy (for iptables), code is equivalent. The above
> > works when not using xfrm_interface. Initially I assumed one just needs
> > to adjust the oifname match, but even dropping it doesn't help.
>
> Yes, this does not work with xfrm interfaces. As said, they are plaintext
> devices that guarantee transformation.
>
> Maybe you can try to match after transformation by using the secpath,
> but not sure if that is what you need.

Secpath is used for input only, no?

I played a bit more with xfrm_interface and noticed that when used,
NF_INET_LOCAL_OUT hook sees the packet (an ICMP reply) only once instead
of twice as without xfrm_interface. I don't think using it should change
behaviour that much apart from packets without matching policy being
dropped. What do you think about the following fix? I checked forwarding
packets as well and it looks like behaviour is identical to plain
policy:

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index aa4cdcf69d471..24af61c95b4d4 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -317,7 +317,8 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	skb_dst_set(skb, dst);
 	skb->dev = tdev;

-	err = dst_output(xi->net, skb->sk, skb);
+	err = NF_HOOK(skb_dst(skb)->ops->family, NF_INET_LOCAL_OUT, xi->net,
+		      skb->sk, skb, NULL, skb_dst(skb)->dev, dst_output);
 	if (net_xmit_eval(err) == 0) {
 		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
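
The intent of the change can be illustrated with a small user-space model in C. This is only a sketch under stated assumptions: the `toy_*` types and functions are hypothetical, the error value for a dropped packet is a placeholder, and the model assumes that the dst installed by skb_dst_set() carries the xfrm state. It mirrors the NF_HOOK() contract, in which the hook runs first and the output function is called only if the hook accepts the packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not the kernel's struct dst_entry / struct sk_buff. */
struct toy_dst {
	bool has_xfrm;		/* models skb_dst(skb)->xfrm != NULL */
};

struct toy_skb {
	const struct toy_dst *dst;
	int seen_with_xfrm;	/* hook invocations that saw dst->xfrm set */
	int sent;		/* times the packet reached the output function */
};

typedef bool (*toy_hook_fn)(struct toy_skb *skb);	/* true means accept */

static int toy_dst_output(struct toy_skb *skb)
{
	skb->sent++;
	return 0;
}

/* Models the NF_HOOK() contract: run the hook, continue only on accept. */
static int toy_nf_hook(toy_hook_fn hook, struct toy_skb *skb)
{
	if (hook && !hook(skb))
		return -1;	/* placeholder error code for a dropped packet */
	return toy_dst_output(skb);
}

/* Example hook: records whether IPsec context was visible, then accepts. */
static bool toy_record_hook(struct toy_skb *skb)
{
	if (skb->dst && skb->dst->has_xfrm)
		skb->seen_with_xfrm++;
	return true;
}

/* Example hook that drops every packet. */
static bool toy_drop_hook(struct toy_skb *skb)
{
	(void)skb;
	return false;
}

/* Models the tail of the transmit path with and without the change. */
static int toy_xmit_tail(struct toy_skb *skb, const struct toy_dst *bundle,
			 toy_hook_fn hook, bool patched)
{
	assert(skb != NULL && bundle != NULL);
	skb->dst = bundle;	/* corresponds to skb_dst_set(skb, dst) */
	if (!patched)
		return toy_dst_output(skb);	/* the hook never runs here */
	return toy_nf_hook(hook, skb);
}
```

Calling `toy_xmit_tail()` with `patched` set to false sends the packet without the hook ever seeing the xfrm state; with `patched` set to true, `toy_record_hook` observes it once before output, and `toy_drop_hook` prevents the packet from being sent at all.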

Thanks, Phil

2020-12-07 10:07:00

by Steffen Klassert

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

On Fri, Nov 27, 2020 at 03:10:48PM +0100, Phil Sutter wrote:
> On Fri, Nov 27, 2020 at 10:55:11AM +0100, Steffen Klassert wrote:
> > On Thu, Nov 26, 2020 at 02:12:00PM +0100, Phil Sutter wrote:
> > > > >
> > > > > Is this a bug or an expected quirk when using XFRM interface?
> > > >
> > > > This is expected behaviour. The xfrm interfaces are plaintext devices,
> > > > the plaintext packets are routed to the xfrm interface which guarantees
> > > > transformation. So the lookup that assigns skb_dst(skb)->xfrm
> > > > happens 'behind' the interface. After transformation,
> > > > skb_dst(skb)->xfrm will be cleared. So this assignment exists just
> > > > inside xfrm in that case.
> > >
> > > OK, thanks for the clarification.
> > >
> > > > Does netfilter match against skb_dst(skb)->xfrm? What is the exact case
> > > > that does not work?
> > >
> > > The reported use-case is a match against tunnel data in output hook:
> > >
> > > | table t {
> > > | 	chain c {
> > > | 		type filter hook output priority filter
> > > | 		oifname eth0 ipsec out ip daddr 192.168.1.2
> > > | 	}
> > > | }
> > >
> > > The ipsec expression tries to extract that data from skb_dst(skb)->xfrm
> > > if present. In xt_policy (for iptables), code is equivalent. The above
> > > works when not using xfrm_interface. Initially I assumed one just needs
> > > to adjust the oifname match, but even dropping it doesn't help.
> >
> > Yes, this does not work with xfrm interfaces. As said, they are plaintext
> > devices that guarantee transformation.
> >
> > Maybe you can try to match after transformation by using the secpath,
> > but not sure if that is what you need.
>
> Secpath is used for input only, no?

Yes, apparently :-/

There are cases where we have a secpath for output, but you can't rely
on it.

> I played a bit more with xfrm_interface and noticed that when used,
> NF_INET_LOCAL_OUT hook sees the packet (an ICMP reply) only once instead
> of twice as without xfrm_interface. I don't think using it should change
> behaviour that much apart from packets without matching policy being
> dropped. What do you think about the following fix? I checked forwarding
> packets as well and it looks like behaviour is identical to plain
> policy:
>
> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> index aa4cdcf69d471..24af61c95b4d4 100644
> --- a/net/xfrm/xfrm_interface.c
> +++ b/net/xfrm/xfrm_interface.c
> @@ -317,7 +317,8 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>  	skb_dst_set(skb, dst);
>  	skb->dev = tdev;
>
> -	err = dst_output(xi->net, skb->sk, skb);
> +	err = NF_HOOK(skb_dst(skb)->ops->family, NF_INET_LOCAL_OUT, xi->net,
> +		      skb->sk, skb, NULL, skb_dst(skb)->dev, dst_output);
>  	if (net_xmit_eval(err) == 0) {
>  		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);

I don't mind that change, but we have to be careful on namespace transition.
xi->net is the namespace 'behind' the xfrm interface. I guess this is the
namespace where you want to do the match because that is the namespace
that has the policies and states for the xfrm interface. So I think that
change is correct, I just wanted to point that out explicitly.
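
A minimal user-space sketch in C of the namespace point, under stated assumptions: the `toy_*` names are hypothetical, `link_net` plays the role of xi->net, and `dev_net` stands for the namespace the device currently lives in. The model only shows that moving the device changes the latter and not the former, so a hook run in xi->net keeps being accounted to the namespace that holds the policies and states.

```c
#include <assert.h>
#include <stddef.h>

/* Toy network namespace: counts hook invocations that carry IPsec context. */
struct toy_net {
	const char *name;
	int local_out_with_ipsec;
};

/* Toy xfrm interface with the two namespaces it is associated with. */
struct toy_xfrmi {
	struct toy_net *link_net;	/* plays the role of xi->net */
	struct toy_net *dev_net;	/* namespace the device currently lives in */
};

/* At creation time both namespaces are the same. */
static void toy_xfrmi_init(struct toy_xfrmi *xi, struct toy_net *net)
{
	xi->link_net = net;
	xi->dev_net = net;
}

/* Moving the device changes only where the device is visible. */
static void toy_xfrmi_move(struct toy_xfrmi *xi, struct toy_net *to)
{
	xi->dev_net = to;
}

/* Models the hook being invoked in xi->net; returns the namespace used. */
static struct toy_net *toy_xfrmi_xmit(struct toy_xfrmi *xi)
{
	assert(xi != NULL && xi->link_net != NULL);
	xi->link_net->local_out_with_ipsec++;
	return xi->link_net;
}
```

After `toy_xfrmi_init()` in one namespace and `toy_xfrmi_move()` into another, `toy_xfrmi_xmit()` still increments the counter of the original namespace, while the counter of the namespace the device was moved into stays at zero.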

2020-12-07 12:37:59

by Phil Sutter

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

Hi Steffen,

On Wed, Dec 02, 2020 at 02:18:47PM +0100, Steffen Klassert wrote:
> On Fri, Nov 27, 2020 at 03:10:48PM +0100, Phil Sutter wrote:
[...]
> > diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
> > index aa4cdcf69d471..24af61c95b4d4 100644
> > --- a/net/xfrm/xfrm_interface.c
> > +++ b/net/xfrm/xfrm_interface.c
> > @@ -317,7 +317,8 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
> >  	skb_dst_set(skb, dst);
> >  	skb->dev = tdev;
> >
> > -	err = dst_output(xi->net, skb->sk, skb);
> > +	err = NF_HOOK(skb_dst(skb)->ops->family, NF_INET_LOCAL_OUT, xi->net,
> > +		      skb->sk, skb, NULL, skb_dst(skb)->dev, dst_output);
> >  	if (net_xmit_eval(err) == 0) {
> >  		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
>
> I don't mind that change, but we have to be careful on namespace transition.
> xi->net is the namespace 'behind' the xfrm interface. I guess this is the
> namespace where you want to do the match because that is the namespace
> that has the policies and states for the xfrm interface. So I think that
> change is correct, I just wanted to point that out explicitly.

Thanks for the heads-up, I didn't consider this at all! But indeed I
think it makes sense. I can move the xfrm interface into a netns after
setting things up, then inside that netns netfilter only sees the plain
"inner" packets and no associated ipsec context. This is correct as the
netns doesn't have any knowledge of the policies present in the initial
netns.

I'll submit the patch formally.

Thanks, Phil

2020-12-07 12:40:44

by Nicolas Dichtel

Subject: Re: XFRM interface and NF_INET_LOCAL_OUT hook

On 02/12/2020 at 14:18, Steffen Klassert wrote:
> On Fri, Nov 27, 2020 at 03:10:48PM +0100, Phil Sutter wrote:
[snip]
>> diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
>> index aa4cdcf69d471..24af61c95b4d4 100644
>> --- a/net/xfrm/xfrm_interface.c
>> +++ b/net/xfrm/xfrm_interface.c
>> @@ -317,7 +317,8 @@ xfrmi_xmit2(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>>  	skb_dst_set(skb, dst);
>>  	skb->dev = tdev;
>>
>> -	err = dst_output(xi->net, skb->sk, skb);
>> +	err = NF_HOOK(skb_dst(skb)->ops->family, NF_INET_LOCAL_OUT, xi->net,
>> +		      skb->sk, skb, NULL, skb_dst(skb)->dev, dst_output);
>>  	if (net_xmit_eval(err) == 0) {
>>  		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
>
> I don't mind that change, but we have to be careful on namespace transition.
> xi->net is the namespace 'behind' the xfrm interface. I guess this is the
> namespace where you want to do the match because that is the namespace
> that has the policies and states for the xfrm interface. So I think that
> change is correct, I just wanted to point that out explicitly.
>
I also agree with the change and the x-netns case.