2011-05-15 17:00:34

by Veaceslav Falico

[permalink] [raw]
Subject: [PATCH 1/1] igmp: fix ip_mc_clear_src to not reset ip_mc_list->sf{mode,count}

ip_mc_clear_src resets the imc->sfcount and imc->sfmode, without taking into
account the current number of sockets listening on that multicast struct, which
can lead to bogus routes for local listeners.

On NETDEV_DOWN/UP event, if there were 3 multicast listeners for that interface's
address, the imc->sfcount[MCAST_EXCLUDE] will be reset to 1. And after that a
listener socket destroys, multicast traffic will not be delivered to local
listeners because __mkroute_output drops the local flag for the route (by
checking ip_check_mc).

Signed-off-by: Veaceslav Falico <[email protected]>

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 1fd3d9c..b14f371 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1775,9 +1775,6 @@ static void ip_mc_clear_src(struct ip_mc_list *pmc)
kfree(psf);
}
pmc->sources = NULL;
- pmc->sfmode = MCAST_EXCLUDE;
- pmc->sfcount[MCAST_INCLUDE] = 0;
- pmc->sfcount[MCAST_EXCLUDE] = 1;
}


2011-05-16 18:08:03

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 1/1] igmp: fix ip_mc_clear_src to not reset ip_mc_list->sf{mode,count}

From: Veaceslav Falico <[email protected]>
Date: Sun, 15 May 2011 18:59:45 +0200

> ip_mc_clear_src resets the imc->sfcount and imc->sfmode, without taking into
> account the current number of sockets listening on that multicast struct, which
> can lead to bogus routes for local listeners.
>
> On NETDEV_DOWN/UP event, if there were 3 multicast listeners for that interface's
> address, the imc->sfcount[MCAST_EXCLUDE] will be reset to 1. And after that a
> listener socket destroys, multicast traffic will not be delivered to local
> listeners because __mkroute_output drops the local flag for the route (by
> checking ip_check_mc).
>
> Signed-off-by: Veaceslav Falico <[email protected]>

David, please take a look at this. Thanks.

> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 1fd3d9c..b14f371 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -1775,9 +1775,6 @@ static void ip_mc_clear_src(struct ip_mc_list *pmc)
> kfree(psf);
> }
> pmc->sources = NULL;
> - pmc->sfmode = MCAST_EXCLUDE;
> - pmc->sfcount[MCAST_INCLUDE] = 0;
> - pmc->sfcount[MCAST_EXCLUDE] = 1;
> }
>
>

2011-05-16 20:42:34

by David Stevens

[permalink] [raw]
Subject: Re: [PATCH 1/1] igmp: fix ip_mc_clear_src to not reset ip_mc_list->sf{mode,count}

> From: Veaceslav Falico <[email protected]>
> Date: Sun, 15 May 2011 18:59:45 +0200
>
> > ip_mc_clear_src resets the imc->sfcount and imc->sfmode, without
taking into
> > account the current number of sockets listening on that multicast
> struct, which
> > can lead to bogus routes for local listeners.
> >
> > On NETDEV_DOWN/UP event, if there were 3 multicast listeners for
> that interface's
> > address, the imc->sfcount[MCAST_EXCLUDE] will be reset to 1. And
> after that a
> > listener socket destroys, multicast traffic will not be delivered to
local
> > listeners because __mkroute_output drops the local flag for the route
(by
> > checking ip_check_mc).

On NETDEV_DOWN, all group memberships are dropped.
ip_mc_clear_src()
is simply freeing all the source filters and turning it into an "EXCLUDE
nobody"
membership (ie, the same as an ordinary join without source filtering).
This
ordinarily happens when you are deleting the group entirely (when the
reference
count goes to 0), but is also called on device down.
This patch is not appropriate; when the groups are deleted, the
source
filters are deleted, and the filter counts have to reflect the source
filters
in the list. If you had an "INCLUDE A" filter, for example, that would
become
an "INCLUDE nobody" filter and drop all traffic (from A or not). The
number
of source filters is not related to the number of listener sockets, and
the
function of ip_mc_clear_src() is to make it 0 (with the special case of 1
for
EXCLUDE), so setting the counts has to be done for proper functioning.
I don't quite understand the problem you're trying to solve here
--
when the device comes back up, the group should be re-added with
{EXCLUDE,nobody} and
ip_check_mc() should therefore return 1. Of course, while the interface is
down, the mc_list is empty and it'd return 0 in that case.
Do you have a small test program to demonstrate the problem?

For the patch, I have to say NACK.

+-DLS

2011-05-17 13:31:38

by Veaceslav Falico

[permalink] [raw]
Subject: Re: [PATCH 1/1] igmp: fix ip_mc_clear_src to not reset ip_mc_list->sf{mode,count}

On Mon, May 16, 2011 at 01:42:11PM -0700, David Stevens wrote:
>
> On NETDEV_DOWN, all group memberships are dropped.
> ip_mc_clear_src()
> is simply freeing all the source filters and turning it into an "EXCLUDE
> nobody"
> membership (ie, the same as an ordinary join without source filtering).
> This
> ordinarily happens when you are deleting the group entirely (when the
> reference
> count goes to 0), but is also called on device down.
> This patch is not appropriate; when the groups are deleted, the
> source
> filters are deleted, and the filter counts have to reflect the source
> filters
> in the list. If you had an "INCLUDE A" filter, for example, that would
> become
> an "INCLUDE nobody" filter and drop all traffic (from A or not). The
> number
> of source filters is not related to the number of listener sockets, and
> the
> function of ip_mc_clear_src() is to make it 0 (with the special case of 1
> for
> EXCLUDE), so setting the counts has to be done for proper functioning.
> I don't quite understand the problem you're trying to solve here
> --
> when the device comes back up, the group should be re-added with
> {EXCLUDE,nobody} and
> ip_check_mc() should therefore return 1. Of course, while the interface is
> down, the mc_list is empty and it'd return 0 in that case.
> Do you have a small test program to demonstrate the problem?

Yes, attached are two programs, one sender and one receiver, they bind both
to localhost and send each other traffic. To reproduce, start the sender
and two instances of receivers, then do an ifconfig lo up; ifconfig lo
down;, restart the sender program (both of the receivers should once again
receive the multicast traffic). Then kill one receiver (the MCAST_EXCLUDE
will become 0), and do an "ip route flush cache". The new route cache will
be without the local flag on, and the remaining receiver will stop
receiving traffic.

What happens:

1) When both receivers start, ip_mc_list->sfcount[MCAST_EXCLUDE] == 2
2) On NETDEV_DOWN event, groups are dropped and sfmode = MCAST_EXCLUDE,
sfcount[MCAST_EXCLUDE] = 1
3) On NETDEV_UP, the group is re-joined, but kernel thinks that there's
only one listener (sfcount[MCAST_EXCLUDE]).
4) On socket destroy (when one receiver is terminated), the count is 0.
5) On route cache flush, __mkroute_output() doesn't see the remaining
listener, and creates a route cache without RTCF_LOCAL flag, thus not
allowing any traffic on that group to local listeners.

The igmp_group_dropped() (the actual routine that drops a group) is called
when:

1) ip_mc_dec_group() is called and im->users == 0
2) ip_mc_unmap()
3) ip_mc_down()
4) ip_mc_destroy_dev()

The 1) we call either on socket destroy or when the socket actually asks to
leave a group. In this case, we need to "reset" the state on no listeners.

2),3),4) are called on various device modifications
(NETDEV_PRE_TYPE_CHANGE, NETDEV_DOWN and NETDEV_UNREGISTER) - but the group
can be rejoined on their next events - NETDEV_POST_TYPE_CHANGE, NETDEV_UP
and NETDEV_REGISTER, which will cause the ip_mc_list to loose track of
existing listeners.

So, I tend to think that we must clear the sources only on 1).

Will send the patch shortly.

Thank you!


Attachments:
(No filename) (3.17 kB)
mcsend.c (3.51 kB)
mcreceive.c (3.73 kB)
Download all attachments

2011-05-17 13:38:52

by Veaceslav Falico

[permalink] [raw]
Subject: [PATCH v2 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.

To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group().

Signed-off-by: Veaceslav Falico <[email protected]>

---
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 1fd3d9c..732e30b 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1182,7 +1182,6 @@ static void igmp_group_dropped(struct ip_mc_list *im)
}
done:
#endif
- ip_mc_clear_src(im);
}

static void igmp_group_added(struct ip_mc_list *im)
@@ -1319,6 +1318,7 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
*ip = i->next_rcu;
in_dev->mc_count--;
igmp_group_dropped(i);
+ ip_mc_clear_src(i);

if (!in_dev->dead)
ip_rt_multicast_event(in_dev);

2011-05-17 14:39:07

by Veaceslav Falico

[permalink] [raw]
Subject: [PATCH v3 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.

To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group().

Correct version of the patch.

Signed-off-by: Veaceslav Falico <[email protected]>
---
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 1fd3d9c..142ca0d 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1169,20 +1169,18 @@ static void igmp_group_dropped(struct ip_mc_list *im)

if (!in_dev->dead) {
if (IGMP_V1_SEEN(in_dev))
- goto done;
+ return;
if (IGMP_V2_SEEN(in_dev)) {
if (reporter)
igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE);
- goto done;
+ return;
}
/* IGMPv3 */
igmpv3_add_delrec(in_dev, im);

igmp_ifc_event(in_dev);
}
-done:
#endif
- ip_mc_clear_src(im);
}

static void igmp_group_added(struct ip_mc_list *im)
@@ -1319,6 +1317,7 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
*ip = i->next_rcu;
in_dev->mc_count--;
igmp_group_dropped(i);
+ ip_mc_clear_src(i);

if (!in_dev->dead)
ip_rt_multicast_event(in_dev);

2011-05-17 17:43:38

by David Stevens

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

Veaceslav,
It looks to me like this will leak the source filters if we are
called from ip_mc_destroy_dev(),
Even with your previous patch, you're assuming that we don't free the
ip_mc_list and so we have the
same one when we up the device, but if there are no timers running, it
looks like refcnt canl go to 0 and free
it. If we can ever free the ip_mc_list when users != 0 (or going to 0
immediately after the drop), we
have to do the ip_mc_clear_src() or leak the list. I haven't looked at
this code in years, so I'll need
to refresh my memory.
So, I'll look at that a bit more; at a minimum, I think you need
to do the clear_src
also in the destroy case. We could lose the filters and set the exclude
count to users, instead
of 1; but I like the idea of keeping the source filters across a down/up,
if we can be sure there
are no cases where we free the ip_mc_list without first freeing all the
filters.

+-DLS

Veaceslav Falico <[email protected]> wrote on 05/17/2011 07:37:56 AM:

> From: Veaceslav Falico <[email protected]>
> To: David Stevens/Beaverton/IBM@IBMUS
> Cc: David Miller <[email protected]>, [email protected],
> [email protected], [email protected], [email protected],
> [email protected], [email protected],
> [email protected], [email protected], [email protected]
> Date: 05/17/2011 07:39 AM
> Subject: [PATCH v3 1/1] igmp: call ip_mc_clear_src() only when we
> have no users of ip_mc_list
>
> In igmp_group_dropped() we call ip_mc_clear_src(), which resets the
number
> of source filters per mulitcast. However, igmp_group_dropped() is also
> called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER,
which
> means that the group might get added back on NETDEV_UP, NETDEV_REGISTER
and
> NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
> filters.
>
> To fix that, we must clear the source filters only when there are no
users
> in the ip_mc_list, i.e. in ip_mc_dec_group().
>
> Correct version of the patch.
>
> Signed-off-by: Veaceslav Falico <[email protected]>
> ---
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 1fd3d9c..142ca0d 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -1169,20 +1169,18 @@ static void igmp_group_dropped(struct ip_mc_list
*im)
>
> if (!in_dev->dead) {
> if (IGMP_V1_SEEN(in_dev))
> - goto done;
> + return;
> if (IGMP_V2_SEEN(in_dev)) {
> if (reporter)
> igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE);
> - goto done;
> + return;
> }
> /* IGMPv3 */
> igmpv3_add_delrec(in_dev, im);
>
> igmp_ifc_event(in_dev);
> }
> -done:
> #endif
> - ip_mc_clear_src(im);
> }
>
> static void igmp_group_added(struct ip_mc_list *im)
> @@ -1319,6 +1317,7 @@ void ip_mc_dec_group(struct in_device *in_dev,
> __be32 addr)
> *ip = i->next_rcu;
> in_dev->mc_count--;
> igmp_group_dropped(i);
> + ip_mc_clear_src(i);
>
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);

2011-05-20 16:28:26

by Veaceslav Falico

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

On Tue, May 17, 2011 at 10:42:59AM -0700, David Stevens wrote:
> Veaceslav,
> It looks to me like this will leak the source filters if we are
> called from ip_mc_destroy_dev(),
> Even with your previous patch, you're assuming that we don't free the
> ip_mc_list and so we have the
> same one when we up the device, but if there are no timers running, it
> looks like refcnt canl go to 0 and free
> it. If we can ever free the ip_mc_list when users != 0 (or going to 0
> immediately after the drop), we
> have to do the ip_mc_clear_src() or leak the list. I haven't looked at
> this code in years, so I'll need
> to refresh my memory.
> So, I'll look at that a bit more; at a minimum, I think you need
> to do the clear_src
> also in the destroy case. We could lose the filters and set the exclude
> count to users, instead
> of 1; but I like the idea of keeping the source filters across a down/up,
> if we can be sure there
> are no cases where we free the ip_mc_list without first freeing all the
> filters.
>
> +-DLS

Yes, you are completely right, we can leak the sources on
ip_mc_destroy_dev() when we've ip_ma_put() it inside all the timers. Also,
I've seen that we called igmp_group_dropped() for every mc in dev->mc_list,
however we've done it already in ip_mc_down() before, which wouldn't lead
to anything (cause the device is already ->dead, and all timers are
stopped), but just would be a waste of time.

So, does this patch seem ok? If yes, I'll send it with the changelog.

---
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 1fd3d9c..57ca93a 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1169,20 +1169,18 @@ static void igmp_group_dropped(struct ip_mc_list *im)

if (!in_dev->dead) {
if (IGMP_V1_SEEN(in_dev))
- goto done;
+ return;
if (IGMP_V2_SEEN(in_dev)) {
if (reporter)
igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE);
- goto done;
+ return;
}
/* IGMPv3 */
igmpv3_add_delrec(in_dev, im);

igmp_ifc_event(in_dev);
}
-done:
#endif
- ip_mc_clear_src(im);
}

static void igmp_group_added(struct ip_mc_list *im)
@@ -1319,6 +1317,7 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
*ip = i->next_rcu;
in_dev->mc_count--;
igmp_group_dropped(i);
+ ip_mc_clear_src(i);

if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
@@ -1428,7 +1427,8 @@ void ip_mc_destroy_dev(struct in_device *in_dev)
in_dev->mc_list = i->next_rcu;
in_dev->mc_count--;

- igmp_group_dropped(i);
+ /* We've dropped the groups in ip_mc_down already */
+ ip_mc_clear_src(i);
ip_ma_put(i);
}
}

2011-05-23 17:41:47

by David Stevens

[permalink] [raw]
Subject: Re: [PATCH v3 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

[email protected] wrote on 05/20/2011 09:27:09 AM:

> From: Veaceslav Falico <[email protected]>

Looks ok to me:

Acked-by: David L Stevens <[email protected]>

>
> So, does this patch seem ok? If yes, I'll send it with the changelog.
>
> ---
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 1fd3d9c..57ca93a 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -1169,20 +1169,18 @@ static void igmp_group_dropped(struct ip_mc_list
*im)
>
> if (!in_dev->dead) {
> if (IGMP_V1_SEEN(in_dev))
> - goto done;
> + return;
> if (IGMP_V2_SEEN(in_dev)) {
> if (reporter)
> igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE);
> - goto done;
> + return;
> }
> /* IGMPv3 */
> igmpv3_add_delrec(in_dev, im);
>
> igmp_ifc_event(in_dev);
> }
> -done:
> #endif
> - ip_mc_clear_src(im);
> }
>
> static void igmp_group_added(struct ip_mc_list *im)
> @@ -1319,6 +1317,7 @@ void ip_mc_dec_group(struct in_device *in_dev,
> __be32 addr)
> *ip = i->next_rcu;
> in_dev->mc_count--;
> igmp_group_dropped(i);
> + ip_mc_clear_src(i);
>
> if (!in_dev->dead)
> ip_rt_multicast_event(in_dev);
> @@ -1428,7 +1427,8 @@ void ip_mc_destroy_dev(struct in_device *in_dev)
> in_dev->mc_list = i->next_rcu;
> in_dev->mc_count--;
>
> - igmp_group_dropped(i);
> + /* We've dropped the groups in ip_mc_down already */
> + ip_mc_clear_src(i);
> ip_ma_put(i);
> }
> }
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2011-05-24 09:16:06

by Veaceslav Falico

[permalink] [raw]
Subject: [PATCH v4 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
of source filters per mulitcast. However, igmp_group_dropped() is also
called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
filters.

To fix that, we must clear the source filters only when there are no users
in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy.

Acked-by: David L Stevens <[email protected]>
Signed-off-by: Veaceslav Falico <[email protected]>

---
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 1fd3d9c..57ca93a 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1169,20 +1169,18 @@ static void igmp_group_dropped(struct ip_mc_list *im)

if (!in_dev->dead) {
if (IGMP_V1_SEEN(in_dev))
- goto done;
+ return;
if (IGMP_V2_SEEN(in_dev)) {
if (reporter)
igmp_send_report(in_dev, im, IGMP_HOST_LEAVE_MESSAGE);
- goto done;
+ return;
}
/* IGMPv3 */
igmpv3_add_delrec(in_dev, im);

igmp_ifc_event(in_dev);
}
-done:
#endif
- ip_mc_clear_src(im);
}

static void igmp_group_added(struct ip_mc_list *im)
@@ -1319,6 +1317,7 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
*ip = i->next_rcu;
in_dev->mc_count--;
igmp_group_dropped(i);
+ ip_mc_clear_src(i);

if (!in_dev->dead)
ip_rt_multicast_event(in_dev);
@@ -1428,7 +1427,8 @@ void ip_mc_destroy_dev(struct in_device *in_dev)
in_dev->mc_list = i->next_rcu;
in_dev->mc_count--;

- igmp_group_dropped(i);
+ /* We've dropped the groups in ip_mc_down already */
+ ip_mc_clear_src(i);
ip_ma_put(i);
}
}

2011-05-24 17:32:43

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v4 1/1] igmp: call ip_mc_clear_src() only when we have no users of ip_mc_list

From: Veaceslav Falico <[email protected]>
Date: Tue, 24 May 2011 11:15:05 +0200

> In igmp_group_dropped() we call ip_mc_clear_src(), which resets the number
> of source filters per mulitcast. However, igmp_group_dropped() is also
> called on NETDEV_DOWN, NETDEV_PRE_TYPE_CHANGE and NETDEV_UNREGISTER, which
> means that the group might get added back on NETDEV_UP, NETDEV_REGISTER and
> NETDEV_POST_TYPE_CHANGE respectively, leaving us with broken source
> filters.
>
> To fix that, we must clear the source filters only when there are no users
> in the ip_mc_list, i.e. in ip_mc_dec_group() and on device destroy.
>
> Acked-by: David L Stevens <[email protected]>
> Signed-off-by: Veaceslav Falico <[email protected]>

Applied.