Netlink currently has partial support for acting on interfaces outside
the current namespace. This patch extends RTM_SETLINK with this
functionality.
The current implementation has an unfortunate semantic ambiguity in the
IFLA_TARGET_NETNSID attribute. For setting the interface namespace, one
may pass the IFLA_TARGET_NETNSID attribute with the namespace to move the
interface to. This conflicts with the meaning of this attribute for all
other methods where IFLA_TARGET_NETNSID identifies the namespace in
which to search for the interface to act upon: the pair (namespace,
ifindex) is generally given by (IFLA_TARGET_NETNSID, ifi->ifi_index).
In order to change the namespace of an interface outside the current
namespace, we would need to specify both an IFLA_TARGET_NETNSID
attribute and a namespace to move to using IFLA_NET_NS_[PID|FD]. This is
currently now allowed as only one of these three flags may be specified.
This patch loosens the restrictions a bit but tries to maintain
compatibility with the previous behaviour:
i) IFLA_TARGET_NETNSID may be passed together with one of
IFLA_NET_NS_[PID|FD]
ii) IFLA_TARGET_NETNSID is primarily defined to be the namespace in
which to find the interface to act upon
iii) In order to maintain backwards compatibility, if the device is not
found in the specified namespace, we also look for it in the current
namespace
iv) If only IFLA_TARGET_NETNSID is given, the device is still moved to
that namespace, as before; and, as before, IFLA_NET_NS_[PID|FD] take
precedence as namespace selectors
Ideally, IFLA_TARGET_NETNSID would only ever have been used to select the
namespace of the device to act upon. A separate flag, IFLA_NET_NS_ID
would have been made available for changing namespaces
Signed-off-by: Jonas Bonn <[email protected]>
Acked-by: Nicolas Dichtel <[email protected]>
---
net/core/rtnetlink.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c81cd80114d9..aa3924c9813c 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -2109,13 +2109,7 @@ static int rtnl_ensure_unique_netns(struct nlattr *tb[],
return -EOPNOTSUPP;
}
- if (tb[IFLA_TARGET_NETNSID] && (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]))
- goto invalid_attr;
-
- if (tb[IFLA_NET_NS_PID] && (tb[IFLA_TARGET_NETNSID] || tb[IFLA_NET_NS_FD]))
- goto invalid_attr;
-
- if (tb[IFLA_NET_NS_FD] && (tb[IFLA_TARGET_NETNSID] || tb[IFLA_NET_NS_PID]))
+ if (tb[IFLA_NET_NS_PID] && tb[IFLA_NET_NS_FD])
goto invalid_attr;
return 0;
@@ -2727,6 +2721,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
struct netlink_ext_ack *extack)
{
struct net *net = sock_net(skb->sk);
+ struct net *tgt_net = NULL;
struct ifinfomsg *ifm;
struct net_device *dev;
int err;
@@ -2742,6 +2737,15 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
if (err < 0)
goto errout;
+ if (tb[IFLA_TARGET_NETNSID]) {
+ s32 netnsid = nla_get_s32(tb[IFLA_TARGET_NETNSID]);
+
+ tgt_net = rtnl_get_net_ns_capable(NETLINK_CB(skb).sk, netnsid);
+ if (IS_ERR(net))
+ return PTR_ERR(net);
+ net = tgt_net;
+ }
+
if (tb[IFLA_IFNAME])
nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
else
@@ -2756,6 +2760,23 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
else
goto errout;
+ /* A hack to preserve kernel<->userspace interface.
+ * It was previously allowed to pass the IFLA_TARGET_NETNSID
+ * attribute as a way to _set_ the network namespace. In this
+ * case, the device interface was assumed to be in the _current_
+ * namespace.
+ * If the device cannot be found in the target namespace then we
+ * assume that the request is to set the device in the current
+ * namespace and thus we attempt to find the device there.
+ */
+ if (!dev && tgt_net) {
+ net = sock_net(skb->sk);
+ if (ifm->ifi_index > 0)
+ dev = __dev_get_by_index(net, ifm->ifi_index);
+ else if (tb[IFLA_IFNAME])
+ dev = __dev_get_by_name(net, ifname);
+ }
+
if (dev == NULL) {
err = -ENODEV;
goto errout;
@@ -2763,6 +2784,8 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
err = do_setlink(skb, dev, ifm, extack, tb, ifname, 0);
errout:
+ if (tgt_net)
+ put_net(tgt_net);
return err;
}
--
2.20.1
On Thu, Nov 7, 2019 at 5:30 AM Jonas Bonn <[email protected]> wrote:
>
> Netlink currently has partial support for acting on interfaces outside
> the current namespace. This patch extends RTM_SETLINK with this
> functionality.
>
> The current implementation has an unfortunate semantic ambiguity in the
> IFLA_TARGET_NETNSID attribute. For setting the interface namespace, one
> may pass the IFLA_TARGET_NETNSID attribute with the namespace to move the
> interface to. This conflicts with the meaning of this attribute for all
> other methods where IFLA_TARGET_NETNSID identifies the namespace in
> which to search for the interface to act upon: the pair (namespace,
> ifindex) is generally given by (IFLA_TARGET_NETNSID, ifi->ifi_index).
>
> In order to change the namespace of an interface outside the current
> namespace, we would need to specify both an IFLA_TARGET_NETNSID
> attribute and a namespace to move to using IFLA_NET_NS_[PID|FD]. This is
> currently now allowed as only one of these three flags may be specified.
>
> This patch loosens the restrictions a bit but tries to maintain
> compatibility with the previous behaviour:
> i) IFLA_TARGET_NETNSID may be passed together with one of
> IFLA_NET_NS_[PID|FD]
> ii) IFLA_TARGET_NETNSID is primarily defined to be the namespace in
> which to find the interface to act upon
> iii) In order to maintain backwards compatibility, if the device is not
> found in the specified namespace, we also look for it in the current
> namespace
> iv) If only IFLA_TARGET_NETNSID is given, the device is still moved to
> that namespace, as before; and, as before, IFLA_NET_NS_[PID|FD] take
> precedence as namespace selectors
>
> Ideally, IFLA_TARGET_NETNSID would only ever have been used to select the
> namespace of the device to act upon. A separate flag, IFLA_NET_NS_ID
> would have been made available for changing namespaces
>
> Signed-off-by: Jonas Bonn <[email protected]>
> Acked-by: Nicolas Dichtel <[email protected]>
> ---
> net/core/rtnetlink.c | 37 ++++++++++++++++++++++++++++++-------
> 1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index c81cd80114d9..aa3924c9813c 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2109,13 +2109,7 @@ static int rtnl_ensure_unique_netns(struct nlattr *tb[],
> return -EOPNOTSUPP;
> }
>
> - if (tb[IFLA_TARGET_NETNSID] && (tb[IFLA_NET_NS_PID] || tb[IFLA_NET_NS_FD]))
> - goto invalid_attr;
> -
> - if (tb[IFLA_NET_NS_PID] && (tb[IFLA_TARGET_NETNSID] || tb[IFLA_NET_NS_FD]))
> - goto invalid_attr;
> -
> - if (tb[IFLA_NET_NS_FD] && (tb[IFLA_TARGET_NETNSID] || tb[IFLA_NET_NS_PID]))
> + if (tb[IFLA_NET_NS_PID] && tb[IFLA_NET_NS_FD])
> goto invalid_attr;
>
> return 0;
> @@ -2727,6 +2721,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
> struct netlink_ext_ack *extack)
> {
> struct net *net = sock_net(skb->sk);
> + struct net *tgt_net = NULL;
> struct ifinfomsg *ifm;
> struct net_device *dev;
> int err;
> @@ -2742,6 +2737,15 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
> if (err < 0)
> goto errout;
>
> + if (tb[IFLA_TARGET_NETNSID]) {
> + s32 netnsid = nla_get_s32(tb[IFLA_TARGET_NETNSID]);
> +
> + tgt_net = rtnl_get_net_ns_capable(NETLINK_CB(skb).sk, netnsid);
> + if (IS_ERR(net))
> + return PTR_ERR(net);
> + net = tgt_net;
> + }
> +
> if (tb[IFLA_IFNAME])
> nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
> else
> @@ -2756,6 +2760,23 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
> else
> goto errout;
>
> + /* A hack to preserve kernel<->userspace interface.
> + * It was previously allowed to pass the IFLA_TARGET_NETNSID
> + * attribute as a way to _set_ the network namespace. In this
> + * case, the device interface was assumed to be in the _current_
> + * namespace.
> + * If the device cannot be found in the target namespace then we
> + * assume that the request is to set the device in the current
> + * namespace and thus we attempt to find the device there.
> + */
Could this bypasses the ns_capable() check? i.e. if the target is
"foo" but your current ns is bar. The process may be "capable" is foo
but the interface is not found in foo but present in bar and ends up
modifying it (especially when you are not capable in bar)?
> + if (!dev && tgt_net) {
> + net = sock_net(skb->sk);
> + if (ifm->ifi_index > 0)
> + dev = __dev_get_by_index(net, ifm->ifi_index);
> + else if (tb[IFLA_IFNAME])
> + dev = __dev_get_by_name(net, ifname);
> + }
> +
> if (dev == NULL) {
> err = -ENODEV;
> goto errout;
> @@ -2763,6 +2784,8 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
>
> err = do_setlink(skb, dev, ifm, extack, tb, ifname, 0);
> errout:
> + if (tgt_net)
> + put_net(tgt_net);
> return err;
> }
>
> --
> 2.20.1
>
Hi Mahesh,
On 07/11/2019 21:36, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Nov 7, 2019 at 5:30 AM Jonas Bonn <[email protected]> wrote:
>>
>>
>> + /* A hack to preserve kernel<->userspace interface.
>> + * It was previously allowed to pass the IFLA_TARGET_NETNSID
>> + * attribute as a way to _set_ the network namespace. In this
>> + * case, the device interface was assumed to be in the _current_
>> + * namespace.
>> + * If the device cannot be found in the target namespace then we
>> + * assume that the request is to set the device in the current
>> + * namespace and thus we attempt to find the device there.
>> + */
> Could this bypasses the ns_capable() check? i.e. if the target is
> "foo" but your current ns is bar. The process may be "capable" is foo
> but the interface is not found in foo but present in bar and ends up
> modifying it (especially when you are not capable in bar)?
I don't think so. There was never any capable-check for the "current"
namespace so there's no change in that regard.
I do think there is an issue with this hack that I can't see any
workaround for. If the user specifies an interface (by name or index)
for another namespace that doesn't exist, there's a potential problem if
that name/index happens to exist in the "current" namespace. In that
case, one many end up inadvertently modifying the interface in the
current namespace. I don't see how to avoid that while maintaining the
backwards compatibility.
My absolute preference would be to drop this compat-hack altogether.
iproute2 doesn't use a bare TARGET_NETNSID in this manner (for changing
namespaces) and I didn't find any other users by a quick search of other
prominent Netlink users: systemd, network-manager, connman. This
compat-hack is there for the _potential ab-user_ of the interface, not
for any known such.
>
>> + if (!dev && tgt_net) {
>> + net = sock_net(skb->sk);
>> + if (ifm->ifi_index > 0)
>> + dev = __dev_get_by_index(net, ifm->ifi_index);
>> + else if (tb[IFLA_IFNAME])
>> + dev = __dev_get_by_name(net, ifname);
>> + }
/Jonas
Hi Jonas, thanks for the response.
On Fri, Nov 8, 2019 at 12:20 AM Jonas Bonn <[email protected]> wrote:
>
> Hi Mahesh,
>
> On 07/11/2019 21:36, Mahesh Bandewar (महेश बंडेवार) wrote:
> > On Thu, Nov 7, 2019 at 5:30 AM Jonas Bonn <[email protected]> wrote:
> >>
> >>
> >> + /* A hack to preserve kernel<->userspace interface.
> >> + * It was previously allowed to pass the IFLA_TARGET_NETNSID
> >> + * attribute as a way to _set_ the network namespace. In this
> >> + * case, the device interface was assumed to be in the _current_
> >> + * namespace.
> >> + * If the device cannot be found in the target namespace then we
> >> + * assume that the request is to set the device in the current
> >> + * namespace and thus we attempt to find the device there.
> >> + */
> > Could this bypasses the ns_capable() check? i.e. if the target is
> > "foo" but your current ns is bar. The process may be "capable" is foo
> > but the interface is not found in foo but present in bar and ends up
> > modifying it (especially when you are not capable in bar)?
>
> I don't think so. There was never any capable-check for the "current"
> namespace so there's no change in that regard.
>
not having capable-check seems wrong as we don't want random
not-capable processes to alter settings. However, it may be at the API
entry level, which will provide necessary protection (haven't
checked!). Having said that, this could be bad for the stuff that you
are implementing since I could be in "foo" and attempting to change
"bar". For this I must be capable in "bar" but the top-level capable
check will by default check me in "foo" as well which is not required
and could potentially block me from performing legal operation in
"bar".
Not saying this is a problem, but without having an implementation to
use this would be hard to try. You would most likely have a way to
verify this, so please check it.
> I do think there is an issue with this hack that I can't see any
> workaround for. If the user specifies an interface (by name or index)
> for another namespace that doesn't exist, there's a potential problem if
> that name/index happens to exist in the "current" namespace. In that
> case, one many end up inadvertently modifying the interface in the
> current namespace. I don't see how to avoid that while maintaining the
> backwards compatibility.
>
This could very well be the case always for single digit ifindex
values. (We recently suffered a local scare because of something very
similar).
> My absolute preference would be to drop this compat-hack altogether.
> iproute2 doesn't use a bare TARGET_NETNSID in this manner (for changing
> namespaces) and I didn't find any other users by a quick search of other
> prominent Netlink users: systemd, network-manager, connman. This
> compat-hack is there for the _potential ab-user_ of the interface, not
> for any known such.
>
what is forcing you keeping you keeping / implementing this hack? I
would also prefer simple solution without creating a potential problem
/ vulnerability (problem: potentially modifying unintended interface,
vulnerability: potentially allow changing without proper credentials;
both not proven but are possibilities) down the line. One possibility
is to drop the compatibility hack and keep it as a backup if something
breaks / someone complains.
thanks,
--mahesh..
> >
> >> + if (!dev && tgt_net) {
> >> + net = sock_net(skb->sk);
> >> + if (ifm->ifi_index > 0)
> >> + dev = __dev_get_by_index(net, ifm->ifi_index);
> >> + else if (tb[IFLA_IFNAME])
> >> + dev = __dev_get_by_name(net, ifname);
> >> + }
>
>
> /Jonas
Hi Mahesh,
Thanks for the detailed response. It provided valuable insight.
On 08/11/2019 19:55, Mahesh Bandewar (महेश बंडेवार) wrote:
> Hi Jonas, thanks for the response.
>
> On Fri, Nov 8, 2019 at 12:20 AM Jonas Bonn <[email protected]> wrote:
>>
>> Hi Mahesh,
>>
>> On 07/11/2019 21:36, Mahesh Bandewar (महेश बंडेवार) wrote:
>>> On Thu, Nov 7, 2019 at 5:30 AM Jonas Bonn <[email protected]> wrote:
>>>>
>>>>
>>>> + /* A hack to preserve kernel<->userspace interface.
>>>> + * It was previously allowed to pass the IFLA_TARGET_NETNSID
>>>> + * attribute as a way to _set_ the network namespace. In this
>>>> + * case, the device interface was assumed to be in the _current_
>>>> + * namespace.
>>>> + * If the device cannot be found in the target namespace then we
>>>> + * assume that the request is to set the device in the current
>>>> + * namespace and thus we attempt to find the device there.
>>>> + */
>>> Could this bypasses the ns_capable() check? i.e. if the target is
>>> "foo" but your current ns is bar. The process may be "capable" is foo
>>> but the interface is not found in foo but present in bar and ends up
>>> modifying it (especially when you are not capable in bar)?
>>
>> I don't think so. There was never any capable-check for the "current"
>> namespace so there's no change in that regard.
I was wrong on this point. There IS a capable-check for the "current"
net. The code to create interfaces in 'other' namespaces was already in
place before my patch and that code does the right thing with respect to
checking NS capabilities on the "destination" and "link" nets.
My patch is mostly just accounting for the "setlink" aspect of NEWLINK
where the device already exists in a foreign namespace and needs to be
searched for there. Even in that code path, all the ns-capable checks
are in place and the behaviour is the same as before.
>>
> not having capable-check seems wrong as we don't want random
> not-capable processes to alter settings. However, it may be at the API
> entry level, which will provide necessary protection (haven't
> checked!). Having said that, this could be bad for the stuff that you
> are implementing since I could be in "foo" and attempting to change
> "bar". For this I must be capable in "bar" but the top-level capable
> check will by default check me in "foo" as well which is not required
> and could potentially block me from performing legal operation in
> "bar".
>
> Not saying this is a problem, but without having an implementation to
> use this would be hard to try. You would most likely have a way to
> verify this, so please check it.
The above shouldn't be an issue with the current implementation.
>
>> I do think there is an issue with this hack that I can't see any
>> workaround for. If the user specifies an interface (by name or index)
>> for another namespace that doesn't exist, there's a potential problem if
>> that name/index happens to exist in the "current" namespace. In that
>> case, one many end up inadvertently modifying the interface in the
>> current namespace. I don't see how to avoid that while maintaining the
>> backwards compatibility.
>>
> This could very well be the case always for single digit ifindex
> values. (We recently suffered a local scare because of something very
> similar).
>
>> My absolute preference would be to drop this compat-hack altogether.
>> iproute2 doesn't use a bare TARGET_NETNSID in this manner (for changing
>> namespaces) and I didn't find any other users by a quick search of other
>> prominent Netlink users: systemd, network-manager, connman. This
>> compat-hack is there for the _potential ab-user_ of the interface, not
>> for any known such.
>>
> what is forcing you keeping you keeping / implementing this hack? I
> would also prefer simple solution without creating a potential problem
> / vulnerability (problem: potentially modifying unintended interface,
> vulnerability: potentially allow changing without proper credentials;
> both not proven but are possibilities) down the line. One possibility
> is to drop the compatibility hack and keep it as a backup if something
> breaks / someone complains.
OK, this would be my preference, too. If we can work on the assumption
that this isn't actually providing compatibility for anybody in
practice, then we can drop it. With that, the potential problem of
inadvertently modifying the wrong device disappears. There's no problem
of being able to access a namespace that one isn't capable in, but
leaving a hole through which the user may end up doing something
unexpected is pretty ugly.
I'll remove this and repost the series.
Thanks for your insight into this issue. It was helpful.
/Jonas
On Sat, Nov 9, 2019 at 6:17 AM Jonas Bonn <[email protected]> wrote:
>
> Hi Mahesh,
>
> Thanks for the detailed response. It provided valuable insight.
>
> On 08/11/2019 19:55, Mahesh Bandewar (महेश बंडेवार) wrote:
> > Hi Jonas, thanks for the response.
> >
> > On Fri, Nov 8, 2019 at 12:20 AM Jonas Bonn <[email protected]> wrote:
> >>
> >> Hi Mahesh,
> >>
> >> On 07/11/2019 21:36, Mahesh Bandewar (महेश बंडेवार) wrote:
> >>> On Thu, Nov 7, 2019 at 5:30 AM Jonas Bonn <[email protected]> wrote:
> >>>>
> >>>>
> >>>> + /* A hack to preserve kernel<->userspace interface.
> >>>> + * It was previously allowed to pass the IFLA_TARGET_NETNSID
> >>>> + * attribute as a way to _set_ the network namespace. In this
> >>>> + * case, the device interface was assumed to be in the _current_
> >>>> + * namespace.
> >>>> + * If the device cannot be found in the target namespace then we
> >>>> + * assume that the request is to set the device in the current
> >>>> + * namespace and thus we attempt to find the device there.
> >>>> + */
> >>> Could this bypasses the ns_capable() check? i.e. if the target is
> >>> "foo" but your current ns is bar. The process may be "capable" is foo
> >>> but the interface is not found in foo but present in bar and ends up
> >>> modifying it (especially when you are not capable in bar)?
> >>
> >> I don't think so. There was never any capable-check for the "current"
> >> namespace so there's no change in that regard.
>
> I was wrong on this point. There IS a capable-check for the "current"
> net. The code to create interfaces in 'other' namespaces was already in
> place before my patch and that code does the right thing with respect to
> checking NS capabilities on the "destination" and "link" nets.
>
> My patch is mostly just accounting for the "setlink" aspect of NEWLINK
> where the device already exists in a foreign namespace and needs to be
> searched for there. Even in that code path, all the ns-capable checks
> are in place and the behaviour is the same as before.
>
> >>
> > not having capable-check seems wrong as we don't want random
> > not-capable processes to alter settings. However, it may be at the API
> > entry level, which will provide necessary protection (haven't
> > checked!). Having said that, this could be bad for the stuff that you
> > are implementing since I could be in "foo" and attempting to change
> > "bar". For this I must be capable in "bar" but the top-level capable
> > check will by default check me in "foo" as well which is not required
> > and could potentially block me from performing legal operation in
> > "bar".
> >
> > Not saying this is a problem, but without having an implementation to
> > use this would be hard to try. You would most likely have a way to
> > verify this, so please check it.
>
> The above shouldn't be an issue with the current implementation.
>
> >
> >> I do think there is an issue with this hack that I can't see any
> >> workaround for. If the user specifies an interface (by name or index)
> >> for another namespace that doesn't exist, there's a potential problem if
> >> that name/index happens to exist in the "current" namespace. In that
> >> case, one many end up inadvertently modifying the interface in the
> >> current namespace. I don't see how to avoid that while maintaining the
> >> backwards compatibility.
> >>
> > This could very well be the case always for single digit ifindex
> > values. (We recently suffered a local scare because of something very
> > similar).
> >
> >> My absolute preference would be to drop this compat-hack altogether.
> >> iproute2 doesn't use a bare TARGET_NETNSID in this manner (for changing
> >> namespaces) and I didn't find any other users by a quick search of other
> >> prominent Netlink users: systemd, network-manager, connman. This
> >> compat-hack is there for the _potential ab-user_ of the interface, not
> >> for any known such.
> >>
> > what is forcing you keeping you keeping / implementing this hack? I
> > would also prefer simple solution without creating a potential problem
> > / vulnerability (problem: potentially modifying unintended interface,
> > vulnerability: potentially allow changing without proper credentials;
> > both not proven but are possibilities) down the line. One possibility
> > is to drop the compatibility hack and keep it as a backup if something
> > breaks / someone complains.
>
> OK, this would be my preference, too. If we can work on the assumption
> that this isn't actually providing compatibility for anybody in
> practice, then we can drop it. With that, the potential problem of
> inadvertently modifying the wrong device disappears. There's no problem
> of being able to access a namespace that one isn't capable in, but
> leaving a hole through which the user may end up doing something
> unexpected is pretty ugly.
>
> I'll remove this and repost the series.
>
sgtm
thanks,
--mahesh..
> Thanks for your insight into this issue. It was helpful.
>
> /Jonas