Date: Sat, 5 Nov 2005 02:07:40 +0100
From: Thomas Graf
To: Patrick McHardy
Cc: Brian Pomerantz, netdev@vger.kernel.org, davem@davemloft.net,
	kuznet@ms2.inr.ac.ru, pekkas@netcore.fi, jmorris@namei.org,
	yoshfuji@linux-ipv6.org, kaber@coreworks.de,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] [IPV4] Fix secondary IP addresses after promotion
Message-ID: <20051105010740.GR23537@postel.suug.ch>
References: <20051104184633.GA16256@skull.piratehaven.org> <436BFE08.6030906@trash.net>
In-Reply-To: <436BFE08.6030906@trash.net>

* Patrick McHardy 2005-11-05 01:34
> Brian Pomerantz wrote:
> >When 3 or more IP addresses in the same subnet exist on a device and the
> >first one is removed, only the promoted IP address can be reached.  Just
> >after promotion of the next IP address, this fix spins through any more
> >IP addresses on the interface and sends a NETDEV_UP notification for
> >that address.  This repopulates the FIB with the proper route
> >information.
> >
> >@@ -294,7 +294,13 @@ static void inet_del_ifa(struct in_devic
> > 		/* not sure if we should send a delete notify first? */
> > 		promote->ifa_flags &= ~IFA_F_SECONDARY;
> > 		rtmsg_ifa(RTM_NEWADDR, promote);
> >-		notifier_call_chain(&inetaddr_chain, NETDEV_UP, promote);
> >+
> >+		/* update fib in the rest of this address list */
> >+		ifa = promote;
> >+		while (ifa != NULL) {
> >+			notifier_call_chain(&inetaddr_chain, NETDEV_UP, ifa);
> >+			ifa = ifa->ifa_next;
> >+		}
> > 	}
> > }
>
> You assume all addresses following the primary addresses are secondary
> addresses of the primary, which is not true with multiple primaries.
> This patch (untested) makes sure only to send notification for real
> secondaries of the deleted address.

Even this corrected version is only a workaround. The real bug is that,
for whatever reason, all local routes of secondaries get deleted upon an
address promotion. I started debugging it a bit by looking at the
requests generated by fib_magic() and the resulting notifications; the
local routes just disappear when they shouldn't.

Situation is: 10.0.0.[1-4]/24 on dev0, 10.0.0.1 is the primary address
and gets deleted while address promotion is enabled.

The following happens:

[Format:]
Request generated by fib_magic()
  Notification event received

RTM_DELROUTE 10.0.0.0/24 dev eth0 scope link unicast table main protocol 2 preferred-src 10.0.0.1
  RTM_DELROUTE 10.0.0.0/24 dev eth0 scope link unicast table main protocol 2 preferred-src 10.0.0.1

RTM_DELROUTE 10.0.0.255 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.1
  RTM_DELROUTE 10.0.0.255 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.1

RTM_DELROUTE 10.0.0.0 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.1
  RTM_DELROUTE 10.0.0.0 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.1

RTM_DELROUTE 10.0.0.1 dev eth0 scope host local table local protocol 2 preferred-src 10.0.0.1
  RTM_DELROUTE 10.0.0.1 dev eth0 scope host local table local protocol 2 preferred-src 10.0.0.1

RTM_NEWROUTE 10.0.0.2 dev eth0 scope host local table local protocol 2 preferred-src 10.0.0.2
  RTM_NEWROUTE 10.0.0.2 dev eth0 scope host local table local protocol 2 preferred-src 10.0.0.2

RTM_NEWROUTE 10.0.0.0/24 dev eth0 scope link unicast table main protocol 2 preferred-src 10.0.0.2
  RTM_NEWROUTE 10.0.0.0/24 dev eth0 scope link unicast table main protocol 2 preferred-src 10.0.0.2

RTM_NEWROUTE 10.0.0.0 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.2
  RTM_NEWROUTE 10.0.0.0 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.2

RTM_NEWROUTE 10.0.0.255 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.2
  RTM_NEWROUTE 10.0.0.255 dev eth0 scope link broadcast table local protocol 2 preferred-src 10.0.0.2

State afterwards:

4: eth0: mtu 1500 qdisc pfifo_fast qlen 1000
    inet 10.0.0.2/24 scope global eth0
    inet 10.0.0.3/24 scope global secondary eth0
    inet 10.0.0.4/24 scope global secondary eth0

broadcast 10.0.0.0 proto kernel scope link src 10.0.0.2
local 10.0.0.2 proto kernel scope host src 10.0.0.2
broadcast 10.0.0.255 proto kernel scope link src 10.0.0.2

Local routes for 10.0.0.3 and 10.0.0.4 have disappeared _without_ any
notification. I think the correct way to fix this is to prevent the
deletion of the local routes, not just readding them. _If_ the deletion
of them is intended, which I doubt, then at least notifications must be
sent out.

Code to get fib_magic() requests to userspace:

Index: linux-2.6/include/linux/rtnetlink.h
===================================================================
--- linux-2.6.orig/include/linux/rtnetlink.h
+++ linux-2.6/include/linux/rtnetlink.h
@@ -880,6 +880,8 @@ enum rtnetlink_groups {
 #define RTNLGRP_DECnet_ROUTE	RTNLGRP_DECnet_ROUTE
 	RTNLGRP_IPV6_PREFIX,
 #define RTNLGRP_IPV6_PREFIX	RTNLGRP_IPV6_PREFIX
+	RTNLGRP_FIB_MAGIC,
+#define RTNLGRP_FIB_MAGIC	RTNLGRP_FIB_MAGIC
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
Index: linux-2.6/net/ipv4/fib_frontend.c
===================================================================
--- linux-2.6.orig/net/ipv4/fib_frontend.c
+++ linux-2.6/net/ipv4/fib_frontend.c
@@ -359,6 +359,48 @@ int inet_dump_fib(struct sk_buff *skb, s
 	return skb->len;
 }
 
+static int fib_magic_build(struct sk_buff *skb, int type, struct nlmsghdr *nlh,
+			   struct rtmsg *rtm, struct kern_rta *rta)
+{
+	struct nlmsghdr *dst = NULL;
+	struct rtmsg *rtm_dst;
+
+	dst = NLMSG_NEW(skb, current->pid, 0, type, sizeof(*rtm), 0);
+	memcpy(dst, nlh, sizeof(*nlh));
+
+	rtm_dst = NLMSG_DATA(dst);
+	memcpy(rtm_dst, rtm, sizeof(*rtm));
+	rtm_dst->rtm_family = AF_INET;
+
+	RTA_PUT(skb, RTA_DST, 4, rta->rta_dst);
+	RTA_PUT(skb, RTA_PREFSRC, 4, rta->rta_prefsrc);
+	RTA_PUT(skb, RTA_OIF, 4, rta->rta_oif);
+
+	return NLMSG_END(skb, dst);
+rtattr_failure:
+nlmsg_failure:
+	return NLMSG_CANCEL(skb, dst);
+}
+
+static void fib_magic_event(int type, struct nlmsghdr *nlh, struct rtmsg *rtm,
+			    struct kern_rta *rta)
+{
+	struct sk_buff *skb;
+
+	skb = alloc_skb(NLMSG_SPACE(sizeof(struct rtmsg) + 256), GFP_KERNEL);
+	if (!skb)
+		return;
+
+	if (fib_magic_build(skb, type, nlh, rtm, rta) < 0) {
+		kfree_skb(skb);
+		return;
+	}
+
+	NETLINK_CB(skb).dst_group = RTNLGRP_FIB_MAGIC;
+	netlink_broadcast(rtnl, skb, 0, RTNLGRP_FIB_MAGIC, GFP_KERNEL);
+}
+
+
 /* Prepare and feed intra-kernel routing request.
    Really, it should be netlink message, but :-( netlink
    can be not configured, so that we feed it directly
@@ -402,6 +444,8 @@ static void fib_magic(int cmd, int type,
 	rta.rta_prefsrc = &ifa->ifa_local;
 	rta.rta_oif = &ifa->ifa_dev->dev->ifindex;
 
+	fib_magic_event(cmd, &req.nlh, &req.rtm, &rta);
+
 	if (cmd == RTM_NEWROUTE)
 		tb->tb_insert(tb, &req.rtm, &rta, &req.nlh, NULL);
 	else
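
For illustration, the check Patrick describes in the quoted text above (only "real secondaries of the deleted address" should be re-notified) can be sketched in plain userspace C. This is only a sketch under stated assumptions: struct sketch_ifa, SKETCH_F_SECONDARY and the sketch_*() helpers are made-up names, not kernel symbols; addresses are kept in host byte order for readability; and the subnet test is modelled on what inet_ifa_match() appears to do, not copied from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct in_ifaddr; NOT the kernel type. */
#define SKETCH_F_SECONDARY 0x01u

struct sketch_ifa {
	uint32_t address;	/* host byte order, for readability only */
	uint32_t mask;
	unsigned int flags;
};

/* Build an address from its dotted-quad parts. */
static uint32_t sketch_ip(unsigned int a, unsigned int b,
			  unsigned int c, unsigned int d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/*
 * Non-zero if ifa is flagged secondary, uses the same mask as primary
 * and lies in the same subnet as primary.
 */
static int sketch_is_secondary_of(const struct sketch_ifa *primary,
				  const struct sketch_ifa *ifa)
{
	if (!(ifa->flags & SKETCH_F_SECONDARY))
		return 0;
	if (ifa->mask != primary->mask)
		return 0;
	return ((primary->address ^ ifa->address) & ifa->mask) == 0;
}

/*
 * Count the entries of list that are secondaries of deleted, i.e. the
 * addresses that would need a NETDEV_UP once deleted is removed.
 */
static size_t sketch_count_renotify(const struct sketch_ifa *list, size_t n,
				    const struct sketch_ifa *deleted)
{
	size_t i, hits = 0;

	assert(deleted != NULL);
	for (i = 0; i < n; i++)
		if (sketch_is_secondary_of(deleted, &list[i]))
			hits++;
	return hits;
}
```

With 10.0.0.1/24 as the deleted primary and 10.0.0.[2-4]/24 plus an unrelated 192.168.1.1/24 primary and its 192.168.1.2/24 secondary on the same device, only the three 10.0.0.x secondaries match; the addresses belonging to the other primary are left alone, which is the case the simple "walk the rest of the list" loop gets wrong.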