I started running v6.7-rc5 on a desktop and began having problems
where Chromium would frequently fail to load pages and give an
"ERR_NETWORK_CHANGED" message instead. I also noticed instability in
avahi-daemon (it would stop resolving local names and/or consume 100%
CPU). Eventually I discovered that what is happening is that new
temporary IPv6 addresses for a ULA address are being generated once
every second, with very short preferred lifetimes (and I had an
interface with thousands of such temporary addresses). I also found
that it seems to be triggered when one of the devices on the network
sends a router advertisement with a prefix that has a preferred
lifetime of 0 (presumably it's sending that because it wants to
deprecate that prefix).
I bisected it to commit 629df6701c8a ("net: ipv6/addrconf: clamp
preferred_lft to the minimum required"). Upon reviewing that change, I
see that it has changed when generation of temporary addresses will be
allowed. I believe that change might have inadvertently caused the
kernel to violate RFC 4941 and might need to be reverted.
In particular RFC 4941 specifies that the preferred lifetime of a
temporary address must not be greater than the preferred lifetime of
the public address it is derived from. However, this change allows a
temporary address to be generated with a preferred lifetime greater
than the public address' preferred lifetime.
From RFC 4941:
4. When creating a temporary address, the lifetime values MUST be
derived from the corresponding prefix as follows:
* Its Valid Lifetime is the lower of the Valid Lifetime of the
public address or TEMP_VALID_LIFETIME.
* Its Preferred Lifetime is the lower of the Preferred Lifetime
of the public address or TEMP_PREFERRED_LIFETIME -
DESYNC_FACTOR.
Previously temporary addresses would not be generated for an interface
if the administratively configured preferred lifetime on that
interface was too short. This change tries to avoid that, and allow
generating temporary addresses even on interfaces with very short
configured lifetimes, by simply increasing the preferred lifetime of
the generated address. However, doing so runs afoul of the above
requirement. It allows the preferred lifetime of the temporary address
to be increased to a value that is larger than the public address'
preferred lifetime. For example, in my case where the router
advertisement causes the public address' preferred lifetime to be set
to 0, the current code allows a temporary address to be generated with
a preferred lifetime of (regen_advance + age + 1), which is obviously
greater than 0. It also, in my case, leads to new temporary addresses
with very short lifetimes being generated, about once every second,
leading to the application-level issues I described above.
Cheers,
-- Dan
On Thu, Dec 21, 2023 at 04:10:57PM -0700, Dan Moulding wrote:
> I started running v6.7-rc5 on a desktop and began having problems
> where Chromium would frequently fail to load pages and give an
> "ERR_NETWORK_CHANGED" message instead. I also noticed instability in
> avahi-daemon (it would stop resolving local names and/or consume 100%
> CPU). Eventually I discovered that what is happening is that new
> temporary IPv6 addresses for a ULA address are being generated once
> every second, with very short preferred lifetimes (and I had an
> interface with thousands of such temporary addresses). I also found
> that it seems to be triggered when one of the devices on the network
> sends a router advertisement with a prefix that has a preferred
> lifetime of 0 (presumably it's sending that because it wants to
> deprecate that prefix).
>
> I bisected it to commit 629df6701c8a ("net: ipv6/addrconf: clamp
> preferred_lft to the minimum required"). Upon reviewing that change, I
> see that it has changed when generation of temporary addresses will be
> allowed. I believe that change might have inadvertently caused the
> kernel to violate RFC 4941 and might need to be reverted.
Can you verify that by actually reverting 629df6701c8a91 on top of net
tree?
--
An old man doll... just what I always wanted! - Clara
> > I bisected it to commit 629df6701c8a ("net: ipv6/addrconf: clamp
> > preferred_lft to the minimum required"). Upon reviewing that change, I
> > see that it has changed when generation of temporary addresses will be
> > allowed. I believe that change might have inadvertently caused the
> > kernel to violate RFC 4941 and might need to be reverted.
>
> Can you verify that by actually reverting 629df6701c8a91 on top of net
> tree?
Yes, after bisecting it to that commit, I did revert it on top of
v6.7-rc6 and verified that the problem goes away. It doesn't start
accumulating addresses once every second. Instead the single temporary
address that was generated for the deprecated prefix is still there,
with a preferred lifetime of 0, like I'd expect, and no new addresses
get generated. The application-level problems are also gone (Chromium
loads pages without issue, and avahi-daemon hasn't jumped to 100%
CPU).
I'm now running it this way on several machines and everything is
looking good again.
-- Dan
On Thu, Dec 21, 2023 at 4:12 PM Dan Moulding <[email protected]> wrote:
>
> I started running v6.7-rc5 on a desktop and began having problems
> where Chromium would frequently fail to load pages and give an
> "ERR_NETWORK_CHANGED" message instead. I also noticed instability in
> avahi-daemon (it would stop resolving local names and/or consume 100%
> CPU). Eventually I discovered that what is happening is that new
> temporary IPv6 addresses for a ULA address are being generated once
> every second, with very short preferred lifetimes (and I had an
> interface with thousands of such temporary addresses). I also found
> that it seems to be triggered when one of the devices on the network
> sends a router advertisement with a prefix that has a preferred
> lifetime of 0 (presumably it's sending that because it wants to
> deprecate that prefix).
>
> I bisected it to commit 629df6701c8a ("net: ipv6/addrconf: clamp
> preferred_lft to the minimum required"). Upon reviewing that change, I
> see that it has changed when generation of temporary addresses will be
> allowed. I believe that change might have inadvertently caused the
> kernel to violate RFC 4941 and might need to be reverted.
>
> In particular RFC 4941 specifies that the preferred lifetime of a
> temporary address must not be greater than the preferred lifetime of
> the public address it is derived from. However, this change allows a
> temporary address to be generated with a preferred lifetime greater
> than the public address' preferred lifetime.
>
> From RFC 4941:
>
> 4. When creating a temporary address, the lifetime values MUST be
> derived from the corresponding prefix as follows:
>
> * Its Valid Lifetime is the lower of the Valid Lifetime of the
> public address or TEMP_VALID_LIFETIME.
>
> * Its Preferred Lifetime is the lower of the Preferred Lifetime
> of the public address or TEMP_PREFERRED_LIFETIME -
> DESYNC_FACTOR.
>
> Previously temporary addresses would not be generated for an interface
> if the administratively configured preferred lifetime on that
> interface was too short. This change tries to avoid that, and allow
> generating temporary addresses even on interfaces with very short
> configured lifetimes, by simply increasing the preferred lifetime of
> the generated address. However, doing so runs afoul of the above
> requirement. It allows the preferred lifetime of the temporary address
> to be increased to a value that is larger than the public address'
> preferred lifetime. For example, in my case where the router
> advertisement causes the public address' preferred lifetime to be set
> to 0, the current code allows a temporary address to be generated with
> a preferred lifetime of (regen_advance + age + 1), which is obviously
> greater than 0. It also, in my case, leads to new temporary addresses
> with very short lifetimes being generated, about once every second,
> leading to the application-level issues I described above.
Sorry for the unintended consequences, and thank you for the detailed
explanation. Does this patch fix the problem for you?
-Alex
Alex Henrie (1):
net: ipv6/addrconf: clamp prefered_lft to the public address preferred
lifetime
net/ipv6/addrconf.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
--
2.43.0
Fixes: 629df6701c8a ("net: ipv6/addrconf: clamp preferred_lft to the minimum required")
Reported-by: Dan Moulding <[email protected]>
Closes: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Alex Henrie <[email protected]>
---
net/ipv6/addrconf.c | 36 ++++++++++++++++++++++++++----------
1 file changed, 26 insertions(+), 10 deletions(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 2692a7b24c40..37141d3417fe 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1337,7 +1337,7 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block)
unsigned long tmp_tstamp, age;
unsigned long regen_advance;
unsigned long now = jiffies;
- s32 cnf_temp_preferred_lft;
+ s32 cnf_temp_preferred_lft, if_public_preferred_lft;
struct inet6_ifaddr *ift;
struct ifa6_config cfg;
long max_desync_factor;
@@ -1394,11 +1394,13 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block)
}
}
+ if_public_preferred_lft = ifp->prefered_lft;
+
memset(&cfg, 0, sizeof(cfg));
cfg.valid_lft = min_t(__u32, ifp->valid_lft,
idev->cnf.temp_valid_lft + age);
cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor;
- cfg.preferred_lft = min_t(__u32, ifp->prefered_lft, cfg.preferred_lft);
+ cfg.preferred_lft = min_t(__u32, if_public_preferred_lft, cfg.preferred_lft);
cfg.preferred_lft = min_t(__u32, cfg.valid_lft, cfg.preferred_lft);
cfg.plen = ifp->prefix_len;
@@ -1414,20 +1416,34 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp, bool block)
* particular, an implementation must not create a temporary address
* with a zero Preferred Lifetime.
*
- * Clamp the preferred lifetime to a minimum of regen_advance, unless
- * that would exceed valid_lft.
+ * ...
+ *
+ * When creating a temporary address, the lifetime values MUST be
+ * derived from the corresponding prefix as follows:
+ *
+ * ...
+ *
+ * * Its Preferred Lifetime is the lower of the Preferred Lifetime
+ * of the public address or TEMP_PREFERRED_LIFETIME -
+ * DESYNC_FACTOR.
+ *
+ * To comply with the RFC's requirements, clamp the preferred lifetime
+ * to a minimum of regen_advance, unless that would exceed valid_lft or
+ * ifp->prefered_lft.
*
* Use age calculation as in addrconf_verify to avoid unnecessary
* temporary addresses being generated.
*/
age = (now - tmp_tstamp + ADDRCONF_TIMER_FUZZ_MINUS) / HZ;
- if (cfg.preferred_lft <= regen_advance + age)
+ if (cfg.preferred_lft <= regen_advance + age) {
cfg.preferred_lft = regen_advance + age + 1;
- if (cfg.preferred_lft > cfg.valid_lft) {
- in6_ifa_put(ifp);
- in6_dev_put(idev);
- ret = -1;
- goto out;
+ if (cfg.preferred_lft > cfg.valid_lft ||
+ cfg.preferred_lft > if_public_preferred_lft) {
+ in6_ifa_put(ifp);
+ in6_dev_put(idev);
+ ret = -1;
+ goto out;
+ }
}
cfg.ifa_flags = IFA_F_TEMPORARY;
--
2.43.0
On 22.12.23 00:10, Dan Moulding wrote:
> I started running v6.7-rc5 on a desktop and began having problems
> where Chromium would frequently fail to load pages and give an
> "ERR_NETWORK_CHANGED" message instead.
This is already handled, but to ensure that his does not fall through
the cracks due to the festive days let me quickly add it to the tracking:
#regzbot ^introduced 629df6701c8a
#regzbot title net/ipv6/addrconf: Temporary addresses with short
lifetimes generating when they shouldn't, causing applications to fail
#regzbot ignore-activity
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.
> Sorry for the unintended consequences, and thank you for the detailed
> explanation. Does this patch fix the problem for you?
Thanks. I think this patch may resolve the application-level issues
I'm seeing.
However, it looks to me like this would still violate the RFC. The
temoporary address' preferred liftime must be lower than /both/ the
preferred lifetime of the public address and TEMP_PREFERRED_LIFETIME -
DESYNC_FACTOR.
These two existing lines ensure that it will meet the requirement:
cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor;
cfg.preferred_lft = min_t(__u32, ifp->prefered_lft, cfg.preferred_lft);
Once that has been computed, cfg.preferred_lft is already at its
maximum allowed value. There is no case where the RFC allows
increasing that value after doing that computation.
I think the safest thing to do is revert this change, and try to find
some other way to achieve the goal of preventing the user from
administratively setting a preferred lifetime that prevents temporary
addresses from being generated, when the user wants to use the privacy
extensions. For example, this could be done where administratively
configured values are accepted (wherever that is), and either generate
a warning or reject the change, if the value provided by the user is
lower than REGEN_ADVANCE.
-- Dan
On Sat, Dec 23, 2023 at 8:22 AM Dan Moulding <[email protected]> wrote:
>
> > Sorry for the unintended consequences, and thank you for the detailed
> > explanation. Does this patch fix the problem for you?
>
> Thanks. I think this patch may resolve the application-level issues
> I'm seeing.
>
> However, it looks to me like this would still violate the RFC. The
> temoporary address' preferred liftime must be lower than /both/ the
> preferred lifetime of the public address and TEMP_PREFERRED_LIFETIME -
> DESYNC_FACTOR.
>
> These two existing lines ensure that it will meet the requirement:
>
> cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor;
> cfg.preferred_lft = min_t(__u32, ifp->prefered_lft, cfg.preferred_lft);
>
> Once that has been computed, cfg.preferred_lft is already at its
> maximum allowed value. There is no case where the RFC allows
> increasing that value after doing that computation.
TEMP_PREFERRED_LIFETIME is an administratively set variable: The user
can change it to whatever they want whenever they want, and the
operating system can adjust it automatically too. It might be more
clear to increase cnf_temp_preferred_lft before those two lines, but
the effect is the same as increasing cfg.preferred_lft afterwards.
Unless there's something else I'm missing, I don't see how this
approach could violate the RFC.
> I think the safest thing to do is revert this change, and try to find
> some other way to achieve the goal of preventing the user from
> administratively setting a preferred lifetime that prevents temporary
> addresses from being generated, when the user wants to use the privacy
> extensions. For example, this could be done where administratively
> configured values are accepted (wherever that is), and either generate
> a warning or reject the change, if the value provided by the user is
> lower than REGEN_ADVANCE.
It's fine to revert the commit for version 6.7 (after all, I think
everyone wants a break for the holidays). Hopefully by version 6.8 we
can agree on a way to support users who want to randomize their IPv6
address as frequently as the network allows.
-Alex
I think a maintainer will probably need to make a call here and decide
how to proceed.
> TEMP_PREFERRED_LIFETIME is an administratively set variable: The user
> can change it to whatever they want whenever they want, and the
> operating system can adjust it automatically too.
Agreed. And the behavior it seems you really want is to prevent the
user from administratively setting it to a value that is lower than
REGEN_ADVANCE, so that it won't stop generating new temporary
addresses altogether.
But preventing the user from configuring it to a value that is too low
is different from generating new temporary addresses with preferred
lifetimes that are greater than the currently configured value of
TEMP_PREFERRED_LIFETIME. I still believe it would be better, and would
be in conformance with the RFC, to simply not allow the user to
configure a too-short TEMP_PREFERRED_LIFETIME instead of tinkering
with the lifetimes of generated temporary addresses.
> It's fine to revert the commit for version 6.7 (after all, I think
> everyone wants a break for the holidays). Hopefully by version 6.8 we
> can agree on a way to support users who want to randomize their IPv6
> address as frequently as the network allows.
FWIW, I think the desired effect you are seeking makes sense and is
the right thing to do. I'm just not convinced this is the correct way
to do it. But I'm not a maintainer and also not an expert in IPv6, so
I'm definitely not the right person to make that call.
-- Dan
On 12/29/23 11:33 AM, Dan Moulding wrote:
> I think a maintainer will probably need to make a call here and decide
> how to proceed.
>
>> TEMP_PREFERRED_LIFETIME is an administratively set variable: The user
>> can change it to whatever they want whenever they want, and the
>> operating system can adjust it automatically too.
>
> Agreed. And the behavior it seems you really want is to prevent the
> user from administratively setting it to a value that is lower than
> REGEN_ADVANCE, so that it won't stop generating new temporary
> addresses altogether.
>
> But preventing the user from configuring it to a value that is too low
> is different from generating new temporary addresses with preferred
> lifetimes that are greater than the currently configured value of
> TEMP_PREFERRED_LIFETIME. I still believe it would be better, and would
> be in conformance with the RFC, to simply not allow the user to
> configure a too-short TEMP_PREFERRED_LIFETIME instead of tinkering
> with the lifetimes of generated temporary addresses.
>
>> It's fine to revert the commit for version 6.7 (after all, I think
>> everyone wants a break for the holidays). Hopefully by version 6.8 we
>> can agree on a way to support users who want to randomize their IPv6
>> address as frequently as the network allows.
>
> FWIW, I think the desired effect you are seeking makes sense and is
> the right thing to do. I'm just not convinced this is the correct way
> to do it. But I'm not a maintainer and also not an expert in IPv6, so
> I'm definitely not the right person to make that call.
>
Send a revert before 6.7 is released which will most likely be this
weekend.
On Fri, Dec 29, 2023 at 9:33 AM Dan Moulding <[email protected]> wrote:
>
> I think a maintainer will probably need to make a call here and decide
> how to proceed.
>
> > TEMP_PREFERRED_LIFETIME is an administratively set variable: The user
> > can change it to whatever they want whenever they want, and the
> > operating system can adjust it automatically too.
>
> Agreed. And the behavior it seems you really want is to prevent the
> user from administratively setting it to a value that is lower than
> REGEN_ADVANCE, so that it won't stop generating new temporary
> addresses altogether.
>
> But preventing the user from configuring it to a value that is too low
> is different from generating new temporary addresses with preferred
> lifetimes that are greater than the currently configured value of
> TEMP_PREFERRED_LIFETIME. I still believe it would be better, and would
> be in conformance with the RFC, to simply not allow the user to
> configure a too-short TEMP_PREFERRED_LIFETIME instead of tinkering
> with the lifetimes of generated temporary addresses.
In RFC 4941, REGEN_ADVANCE is a constant value of 5 seconds.[1]
However, Linux uses a variable regen_advance that depends on the
Retrans Timer value in the router advertisement.[2][3][4] Let's
imagine that when the user tries to set
/proc/sys/net/ipv6/conf/*/temp_prefered_lft to 3 seconds, they get an
error message that says "Sorry, the network requires at least 4
seconds." After a few minutes, network conditions change, and now 5
seconds is the minimum. Should the kernel just give up on using
private addresses? Or, the minimum might drop to 3 seconds, which is
what the user really wanted. Should the operating system tell the user
to change the value?
What I think you're getting at is that Linux might be violating the
spec by allowing TEMP_PREFERRED_LIFETIME to be less than 5 seconds.
The RFC says: [5]
> When processing a Router Advertisement with a Prefix
> Information option carrying a global scope prefix for the purposes of
> address autoconfiguration (i.e., the A bit is set), the node MUST
> perform the following steps:
> 5. A temporary address is created only if this calculated Preferred
> Lifetime is greater than REGEN_ADVANCE time units.
The right solution might be to make Linux use a constant value for
REGEN_ADVANCE instead of a variable. I think that's how it used to
work before commit 76506a986dc3 (IPv6: fix DESYNC_FACTOR,
2016-10-13).[6] If regen_advance can't change depending on network
conditions then it can't cause private address generation to randomly
stop working. I don't understand why the protocol would require
REGEN_ADVANCE to be a constant, but interpreting the RFC literally, it
would seem that Linux is technically non-compliant.
-Alex
[1] https://datatracker.ietf.org/doc/html/rfc4941#section-5
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv6/addrconf.c?h=v6.7-rc7#n1377
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/ipv6/ndisc.c?h=v6.7-rc7#n1438
[4] https://datatracker.ietf.org/doc/html/rfc4861#section-4.2
[5] https://datatracker.ietf.org/doc/html/rfc4941#section-3.3
[6] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76506a986dc31394fd1f2741db037d29c7e57843