2011-04-27 23:20:57

by John Gardiner Myers

[permalink] [raw]
Subject: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

When the last ip address is deleted, the kernel disables IPv6 on the
interface. (Not sure why, but that's beside the point.) The call that
does this is over-aggressive--it indicates the interface is about to
be removed even though that isn't necessarily so.

This causes IPv6 to, among other things, unregister its sysctl
parameters for the interface. Thus, the "accept_ra" and "addrconf"
settings can't be set on the interface until after the interface has
been brought back up, which is too late.

Signed-off-by: John Gardiner Myers <[email protected]>
Cc: [email protected]

---

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 1493534..042d0aa 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2215,7 +2215,7 @@ static int inet6_addr_del(struct net *net, int ifindex, struct in6_addr *pfx,
disable IPv6 on this interface.
*/
if (list_empty(&idev->addr_list))
- addrconf_ifdown(idev->dev, 1);
+ addrconf_ifdown(idev->dev, 0);
return 0;
}
}


2011-04-29 20:45:58

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

From: John Myers <[email protected]>
Date: Wed, 27 Apr 2011 16:12:38 -0700

> When the last ip address is deleted, the kernel disables IPv6 on the
> interface. (Not sure why, but that's beside the point.) The call that
> does this is over-aggressive--it indicates the interface is about to
> be removed even though that isn't necessarily so.
>
> This causes IPv6 to, among other things, unregister its sysctl
> parameters for the interface. Thus, the "accept_ra" and "addrconf"
> settings can't be set on the interface until after the interface has
> been brought back up, which is too late.
>
> Signed-off-by: John Gardiner Myers <[email protected]>
> Cc: [email protected]

I'm not applying this, at least without some more discussion.

I can't see what you gain from this change.

First of all, when the machine boots up, you already have the problem
that you cannot set the accept_ra and addrconf sysctl settings before
the first ipv6 address is added to the interface.

So by definition you already cannot make the settings before it is
"too late" and the device is already engaging in ipv6 activity.

Giving you the capability to handle this across full ipv6 address
deletions on the device later on doesn't add anything, and at best it
gives people a false sense of security about being able to preserve
these settings across an ipv6 disable on the device.

If people are going to use this new behavior to do some trick like:

1) Let device come up and assign ipv6 addresses so that sysctls appear
2) Set ipv6 sysctls how actually desired
3) Delete all ipv6 addresses
4) Add them all back

Then I doubly do not want to set a precedent for this kind of usage
by applying this patch. Fix the real problem.

This behavior has been here, and intentionally so, since Alexey added
the "how" parameter to addrconf_ifdown() back in 1997.

Furthermore, there are other side effects to changing the 'how' parameter
to zero in this case and I haven't seen any analysis that those won't
cause any other undesirable side effects either.

So again, I'm not applying this patch, sorry.

2011-04-29 21:59:43

by John Gardiner Myers

[permalink] [raw]
Subject: Re: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

On 4/29/2011 1:45 PM, David Miller wrote:
> First of all, when the machine boots up, you already have the problem
> that you cannot set the accept_ra and addrconf sysctl settings before
> the first ipv6 address is added to the interface.
What do you mean? I see no problem.

One does not need to set the accept_ra and addrconf sysctl settings
before the first ipv6 address is added to the interface, one needs to
set them before the interface is brought up. DAD (and thus router
solicitation) does not happen on down interfaces.

When the machine boots up and the interface is discovered, it starts in
the down state but with the sysctls registered. There is no problem
adjusting the settings before bringing the interface up.

> So by definition you already cannot make the settings before it is
> "too late" and the device is already engaging in ipv6 activity.
>
> Giving you the capability to handle this across full ipv6 address
> deletions on the device later on doesn't add anything, and at best it
> gives people a false sense of security about being able to preserve
> these settings across an ipv6 disable on the device.
>
> If people are going to use this new behavior to do some trick like:
>
> 1) Let device come up and assign ipv6 addresses so that sysctls appear
> 2) Set ipv6 sysctls how actually desired
> 3) Delete all ipv6 addresses
> 4) Add them all back
>
> Then I doubly do not want to set a precedent for this kind of usage
> by applying this patch. Fix the real problem.
This is all nonsense.
> This behavior has been here, and intentionally so, since Alexey added
> the "how" parameter to addrconf_ifdown() back in 1997.
The "how" parameter indicates the device is being deleted. In this case,
the device is not being deleted.

This does bring up the issue that the call to addrconf_ifdown() when the
MTU goes below IPV6_MIN_MTU probably also needs to be fixed.
Furthermore, there are other side effects to changing the 'how' parameter
> to zero in this case and I haven't seen any analysis that those won't
> cause any other undesirable side effects either.
If the device isn't going away, then the ip6_ptr shouldn't be zeroed,
the /proc/net/dev_snmp6 entry shouldn't be deregistered, the stateless
addrconf flags should be cleared, the regen timer shouldn't be deleted,
ipv6 multicast shouldn't mark the device down instead of being
destroyed, and the nd_parms shouldn't be freed.

2011-04-30 00:14:03

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

On Fri, Apr 29, 2011 at 02:58:24PM -0700, John Gardiner Myers wrote:
> If the device isn't going away, then the ip6_ptr shouldn't be zeroed,
> the /proc/net/dev_snmp6 entry shouldn't be deregistered,

Actually, you are right. Tuned interface parameters and disabling/enabling IPv6
(or IP, or whatever) should be different things. We just did not have an interface
to disable protocol, but leave in*_dev, so that they were merged.

When doing this just keep in mind that addrconf_ifdown(how = 0) did _not_ mean
disabling IPv6. (Probably, it does now in fact, I do not know. But it definitely
did not mean this in the past).

Look, addrconf_ifdown(how = 0) was executed only when the physical device is down, so that we could
neither receive nor send over this interface. If the device is UP, addrconf_ifdown(how = 0)
did not prohibit sending/receiving IPv6. Actually, logically, addrconf_ifdown(how = 0)
on UP interface must be followed by immediate restart of autoconfiguration,
because interface is still actually UP. See?

So, to implement this you should verify that IPv6 packets are not sent/received over
disabled interface (at least over interface with illegal mtu :-)). And add some flag in in6_dev
meaning that IPv6 is actually disabled. So that f.e. after occasional
ifconfig eth0 down; ifconfig eth0 up autoconfiguration would not resume IPv6
(the thing which we could not even implement with destroying in6_dev, but definitely wanted).

Alexey

2011-04-30 03:47:49

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

From: John Gardiner Myers <[email protected]>
Date: Fri, 29 Apr 2011 14:58:24 -0700

> The "how" parameter indicates the device is being deleted. In this
> case, the device is not being deleted.

It means that ipv6 is being disabled on the device.

Which is what happens when one of the following happens:

1) the MTU is too small to support ipv6 properly

2) no configured ipv6 addresses remain on the device

3) the device is being unregistered

2011-05-01 09:59:22

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH] ipv6: fix incorrect unregistration of sysctl when last ip deleted

Alexey Kuznetsov <[email protected]> writes:

> On Fri, Apr 29, 2011 at 02:58:24PM -0700, John Gardiner Myers wrote:
>> If the device isn't going away, then the ip6_ptr shouldn't be zeroed,
>> the /proc/net/dev_snmp6 entry shouldn't be deregistered,
>
> Actually, you are right. Tuned interface parameters and disabling/enabling IPv6
> (or IP, or whatever) should be different things. We just did not have an interface
> to disable protocol, but leave in*_dev, so that they were merged.
>
> When doing this just keep in mind that addrconf_ifdown(how = 0) did _not_ mean
> disabling IPv6. (Probably, it does now in fact, I do not know. But it definitely
> did not mean this in the past).
>
> Look, addrconf_ifdown(how = 0) was executed only when the physical device is down, so that we could
> neither receive nor send over this interface. If the device is UP, addrconf_ifdown(how = 0)
> did not prohibit sending/receiving IPv6. Actually, logically, addrconf_ifdown(how = 0)
> on UP interface must be followed by immediate restart of autoconfiguration,
> because interface is still actually UP. See?
>
> So, to implement this you should verify that IPv6 packets are not sent/received over
> disabled interface (at least over interface with illegal mtu :-)). And add some flag in in6_dev
> meaning that IPv6 is actually disabled. So that f.e. after occasional
> ifconfig eth0 down; ifconfig eth0 up autoconfiguration would not resume IPv6
> (the thing which we could not even implement with destroying in6_dev,
> but definitely wanted).

I played with this a while ago with on 2.6.37 and I cooked up the patch
below. I don't know if it still applies, but I think something like
this is what we want, as it makes the no ipv6 address case and the
ipv6 mtu too small case match the disable_ipv6 case.

Eric


>From e97b017fe210db4e0a1d5553449da140a0794bb0 Mon Sep 17 00:00:00 2001
From: Eric W. Biederman <[email protected]>
Date: Wed, 8 Dec 2010 11:59:36 -0800
Subject: [PATCH] addrconf: Force logical down for bad MTU or no ipv6 addresses as well.

Don't loose our ipv6 state (like disable_ipv6) when we delete the last
address from an ipv6 interface or when we set an mtu on the interface
that is smaller than we support.

As well as making things more comprehensible to userspace this
makes the code simpler.

Signed-off-by: Eric W. Biederman <[email protected]>
---
net/ipv6/addrconf.c | 79 ++++++++++++++++++++------------------------------
1 files changed, 32 insertions(+), 47 deletions(-)

Index: linux-2.6.37-rc5.x86_64/net/ipv6/addrconf.c
===================================================================
--- linux-2.6.37-rc5.x86_64.orig/net/ipv6/addrconf.c
+++ linux-2.6.37-rc5.x86_64/net/ipv6/addrconf.c
@@ -147,6 +147,7 @@ static void addrconf_leave_anycast(struc
static void addrconf_type_change(struct net_device *dev,
unsigned long event);
static int addrconf_ifdown(struct net_device *dev, int how);
+static void dev_disable_change(struct inet6_dev *idev);

static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags);
static void addrconf_dad_timer(unsigned long data);
@@ -2221,7 +2222,7 @@ static int inet6_addr_del(struct net *ne
disable IPv6 on this interface.
*/
if (list_empty(&idev->addr_list))
- addrconf_ifdown(idev->dev, 1);
+ dev_disable_change(idev);
return 0;
}
}
@@ -2499,7 +2500,7 @@ static int addrconf_notify(struct notifi

switch (event) {
case NETDEV_REGISTER:
- if (!idev && dev->mtu >= IPV6_MIN_MTU) {
+ if (!idev) {
idev = ipv6_add_dev(dev);
if (!idev)
return notifier_from_errno(-ENOMEM);
@@ -2520,26 +2521,18 @@ static int addrconf_notify(struct notifi
dev->name);
break;
}
-
- if (!idev && dev->mtu >= IPV6_MIN_MTU)
- idev = ipv6_add_dev(dev);
-
- if (idev) {
- idev->if_flags |= IF_READY;
- run_pending = 1;
- }
+ idev->if_flags |= IF_READY;
+ run_pending = 1;
} else {
if (!addrconf_qdisc_ok(dev)) {
/* device is still not ready. */
break;
}

- if (idev) {
- if (idev->if_flags & IF_READY)
- /* device is already configured. */
- break;
- idev->if_flags |= IF_READY;
- }
+ if (idev->if_flags & IF_READY)
+ /* device is already configured. */
+ break;
+ idev->if_flags |= IF_READY;

printk(KERN_INFO
"ADDRCONF(NETDEV_CHANGE): %s: "
@@ -2567,45 +2560,37 @@ static int addrconf_notify(struct notifi
break;
}

- if (idev) {
- if (run_pending)
- addrconf_dad_run(idev);
-
- /*
- * If the MTU changed during the interface down,
- * when the interface up, the changed MTU must be
- * reflected in the idev as well as routers.
- */
- if (idev->cnf.mtu6 != dev->mtu &&
- dev->mtu >= IPV6_MIN_MTU) {
- rt6_mtu_change(dev, dev->mtu);
- idev->cnf.mtu6 = dev->mtu;
- }
- idev->tstamp = jiffies;
- inet6_ifinfo_notify(RTM_NEWLINK, idev);
+ if (run_pending)
+ addrconf_dad_run(idev);

- /*
- * If the changed mtu during down is lower than
- * IPV6_MIN_MTU stop IPv6 on this interface.
- */
- if (dev->mtu < IPV6_MIN_MTU)
- addrconf_ifdown(dev, 1);
+ /*
+ * If the MTU changed during the interface down,
+ * when the interface up, the changed MTU must be
+ * reflected in the idev as well as routers.
+ */
+ if (idev->cnf.mtu6 != dev->mtu &&
+ dev->mtu >= IPV6_MIN_MTU) {
+ rt6_mtu_change(dev, dev->mtu);
+ idev->cnf.mtu6 = dev->mtu;
}
+ idev->tstamp = jiffies;
+ inet6_ifinfo_notify(RTM_NEWLINK, idev);
+
+ /*
+ * If the changed mtu during down is lower than
+ * IPV6_MIN_MTU stop IPv6 on this interface.
+ */
+ if (dev->mtu < IPV6_MIN_MTU)
+ dev_disable_change(idev);
break;

case NETDEV_CHANGEMTU:
- if (idev && dev->mtu >= IPV6_MIN_MTU) {
+ if (dev->mtu >= IPV6_MIN_MTU) {
rt6_mtu_change(dev, dev->mtu);
idev->cnf.mtu6 = dev->mtu;
break;
}

- if (!idev && dev->mtu >= IPV6_MIN_MTU) {
- idev = ipv6_add_dev(dev);
- if (idev)
- break;
- }
-
/*
* MTU falled under IPV6_MIN_MTU.
* Stop IPv6 on this interface.
@@ -2616,7 +2601,7 @@ static int addrconf_notify(struct notifi
/*
* Remove all addresses from this interface.
*/
- addrconf_ifdown(dev, event != NETDEV_DOWN);
+ addrconf_ifdown(dev, event == NETDEV_UNREGISTER);
break;

case NETDEV_CHANGENAME:
@@ -4137,6 +4122,18 @@ static void ipv6_ifa_notify(int event, s
rcu_read_unlock_bh();
}

+static void dev_disable_change(struct inet6_dev *idev)
+{
+ if (!idev || !idev->dev)
+ return;
+
+ if (idev->cnf.disable_ipv6 || (list_empty(&idev->addr_list)) ||
+ (idev->dev->mtu < IPV6_MIN_MTU))
+ addrconf_notify(NULL, NETDEV_DOWN, idev->dev);
+ else
+ addrconf_notify(NULL, NETDEV_UP, idev->dev);
+}
+
#ifdef CONFIG_SYSCTL

static
@@ -4157,17 +4154,6 @@ int addrconf_sysctl_forward(ctl_table *c
return ret;
}

-static void dev_disable_change(struct inet6_dev *idev)
-{
- if (!idev || !idev->dev)
- return;
-
- if (idev->cnf.disable_ipv6)
- addrconf_notify(NULL, NETDEV_DOWN, idev->dev);
- else
- addrconf_notify(NULL, NETDEV_UP, idev->dev);
-}
-
static void addrconf_disable_change(struct net *net, __s32 newf)
{
struct net_device *dev;