2001-07-26 02:05:26

by Nat Ersoz

Subject: IGMP join/leave time variability

Greetings,

I'm encountering time variability with IGMP joins and leaves. I'm working
with the 2.2.19 kernel. I've placed gettimeofday() printf's within the user
space program and do_gettimeofday() printk's within the ethernet driver.

So far, what I've found is typical of this captured data:

--- user space timestamps
996133011.376224 +UserCloseSource
996133011.377821 -UserCloseSource
996133011.378296 +UserOpenSource: 224.0.17.103:2001
calls:
socket()
setsockopt(REUSEADDR)
bind()
setsockopt(IP_ADD_MEMBERSHIP)
996133011.379933 -UserOpenSource: 224.0.17.103:2001: result=0

---- tcpdump output:
00:36:43.335501 > stb_nat.et.myrio.com > 224.0.17.104: igmp nreport 224.0.17.104 [ttl 1]
00:36:45.245501 > stb_nat.et.myrio.com > 224.0.17.104: igmp nreport 224.0.17.104 [ttl 1]
00:36:51.376707 > stb_nat.et.myrio.com > all-routers.mcast.net: igmp leave 224.0.17.104 [ttl 1]
00:36:52.275523 > stb_nat.et.myrio.com > 224.0.17.103: igmp nreport 224.0.17.103 [ttl 1]
00:36:53.705502 > stb_nat.et.myrio.com > 224.0.17.103: igmp nreport 224.0.17.103 [ttl 1]
00:37:02.495500 > stb_nat.et.myrio.com > 224.0.17.103: igmp nreport 224.0.17.103 [ttl 1]

---- ethernet driver timestamps (natsemi.o, modified)
Jul 26 00:36:35 stb_nat kernel: eth0: Add Multicast 996132995.817524
Jul 26 00:36:35 stb_nat kernel: ^I1.0.94.0.0.1
Jul 26 00:36:35 stb_nat kernel: eth0: Add Multicast 996132995.819686
Jul 26 00:36:35 stb_nat kernel: 1.0.94.0.17.104
Jul 26 00:36:35 stb_nat kernel: 1.0.94.0.0.1
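
Editorial note: the dotted-decimal values in the driver log appear to be
Ethernet multicast addresses printed one byte at a time. The standard
IPv4-to-Ethernet multicast mapping (01:00:5e plus the low 23 bits of the
group) turns 224.0.17.104 into 1.0.94.0.17.104, which matches the log. A
minimal sketch of that mapping follows; the function names are hypothetical
and this is not the natsemi driver's code.

```c
#include <stdint.h>
#include <stdio.h>

/* Map an IPv4 multicast group (host byte order) to its Ethernet
 * multicast address: 01:00:5e followed by the low 23 bits of the group. */
static void mcast_ip_to_mac(uint32_t group, uint8_t mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (group >> 16) & 0x7f;  /* top bit of this octet is dropped */
    mac[4] = (group >> 8) & 0xff;
    mac[5] = group & 0xff;
}

/* Format the address in the dotted-decimal style seen in the driver log.
 * Returns the number of characters snprintf() wanted to write. */
static int format_mac_decimal(const uint8_t mac[6], char *buf, size_t len)
{
    return snprintf(buf, len, "%u.%u.%u.%u.%u.%u",
                    mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
}
```

Calling mcast_ip_to_mac(0xE0001168, mac) corresponds to 224.0.17.104, and
0xE0000001 (224.0.0.1, all-hosts) corresponds to the 1.0.94.0.0.1 entry.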

==== Some notes:
1. The user space socket() calls take less than 4 ms to complete.
2. The ethernet multicast filter gets set very quickly: less than 2 ms.
3. Tcpdump reports that the time between this leave and join is about
900 ms for this particular transaction. We have correlated tcpdump's
results with actual traffic on the ethernet wire using a network analyzer
and found tcpdump to be accurate.

==== Linux 2.2.19 code:
I have dug into the code, and it seems that the function igmp_group_added(),
found in linux/net/ipv4/igmp.c, is where things really happen. The function
igmp_start_timer() gets called with an IGMP_Initial_Report_Delay value of
(1*HZ). From what I can tell, this amounts to up to 1 second of delay,
depending on what net_random() returns in igmp_start_timer(), which agrees
with our measurements of IGMP joins varying from "very short" delays to
something a bit over a second.
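
Editorial note: the behaviour described above can be modelled in user space
as a random number of jiffies below the configured maximum. The sketch below
is an assumed model only, not the kernel's igmp_start_timer(): rand() stands
in for net_random(), HZ is assumed to be 100, and any fixed offset or timer
granularity the kernel adds is ignored. The function names are hypothetical.

```c
#include <stdlib.h>

#define HZ 100  /* assumption: timer ticks per second */

/* Assumed model: the first report is delayed by a random number of
 * jiffies in [0, max_delay). rand() stands in for net_random(). */
static int initial_report_delay_jiffies(int max_delay)
{
    return rand() % max_delay;
}

/* Largest delay this model can produce, in milliseconds. */
static int max_delay_ms(int max_delay)
{
    return (max_delay - 1) * 1000 / HZ;
}
```

Under these assumptions, a maximum of 1*HZ allows delays of up to 990 ms, a
maximum of 5 allows up to 40 ms, and a maximum of 1 always yields 0 jiffies,
so any remaining spread would have to come from timer granularity.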

==== Questions:
For our application, it would be desirable to have the leave/join occur
ASAP with respect to the user mode calls.
1. What would be the harm if I set IGMP_Initial_Report_Delay to something
very small like 5 to 10 (jiffies)? No need for net_random() I'd expect in
that case?
2. I'm guessing that modifying igmp_start_timer() to call
igmp_timer_expire() directly is not a good idea, since the timers provide
safety against race conditions. (?)

Thanks for wading through this. I looked at the 2.4.3 igmp.c code and
noticed that it is somewhat similar. Right now our app is at 2.2.19, however.

Thanks for any help and thoughts you may offer.

Nat

________________________________________
Nat Ersoz Myrio Corporation
[email protected]
Phone: 425.897.7278 Fax:425.897.5600
3500 Carillon Point Kirkland, WA 98033


2001-07-26 11:58:07

by Alan

Subject: Re: IGMP join/leave time variability

> ASAP with respect to the user mode calls.
> 1. What would be the harm if I set IGMP_Initial_Report_Delay to something
> very small like 5 to 10 (jiffies)? No need for net_random() I'd expect in
> that case?

Read the IGMP RFC documents; they discuss in detail the cases where time
delays and randomness are needed and important.

2001-07-26 17:47:47

by Torrey Hoffman

Subject: RE: IGMP join/leave time variability


Alan Cox wrote:

> Read the IGMP RFC documents; they discuss in detail the cases
> where time delays and randomness are needed and important.

I'm one of Nat's co-workers, also looking at this problem.

RFC 2236, the IGMPv2 spec, states:
"
When a host joins a multicast group, it should immediately transmit
an unsolicited Version 2 Membership Report for that group, in case it
is the first member of that group on the network. To cover the
possibility of the initial Membership Report being lost or damaged,
it is recommended that it be repeated once or twice after short
delays [Unsolicited Report Interval].
"

From this, I infer that there should be _no_ initial delay on sending
the IGMP join. In fact, a quick peek at the source confirms this:
(net/ipv4/igmp.c):

#define IGMP_Initial_Report_Delay (1*HZ)

/* IGMP_Initial_Report_Delay is not from IGMP specs!
* IGMP specs require to report membership immediately after
* joining a group, but we delay the first report by a
* small interval. It seems more natural and still does not
* contradict to specs provided this delay is small enough.
*/

But this "small interval" is actually very noticeable in our application.

I think we'll take it out of our version, and I believe it should be
removed from the standard kernel.

Regards,

Torrey Hoffman

2001-07-26 17:50:57

by Alan

Subject: Re: IGMP join/leave time variability

> From this, I infer that there should be _no_ initial delay on sending
> the IGMP join. In fact, a quick peek at the source confirms this:
> (net/ipv4/igmp.c):
>
> #define IGMP_Initial_Report_Delay (1*HZ)
>
> /* IGMP_Initial_Report_Delay is not from IGMP specs!
> * IGMP specs require to report membership immediately after
> * joining a group, but we delay the first report by a
> * small interval. It seems more natural and still does not
> * contradict to specs provided this delay is small enough.
> */
>
> But this "small interval" is actually very noticeable in our application.

I suspect the small interval for the first one should be 1, not 1*HZ. That
would keep a little bit of jitter, which is good for avoiding the multicast
receive/join group problem:

[Lots of clients all running an app listening for multicast packets, one
packet says 'do xyz on this group' and they all then send joins at the same
instant]

Alan

2001-07-26 18:07:27

by Nat Ersoz

Subject: RE: IGMP join/leave time variability

This morning, after dealing with other bugs, I was able to verify (using
tcpdump) that

#define IGMP_Initial_Report_Delay 5

fixed the "long" (1 sec) variable delay associated with the first igmp join
message. This resolves a specific bug in our application (excessive IGMP
join delay as compared to an NT client).

Can we get something like this moved into future versions of ipv4/igmp.c?

Thanks,

Nat


2001-07-27 09:28:09

by David Miller

Subject: Re: IGMP join/leave time variability


Alan Cox writes:
> > But this "small interval" is actually very noticeable in our application.
>
> I suspect the small interval for the first one should be 1, not 1*HZ. That
> would keep a little bit of jitter, which is good for avoiding the multicast
> receive/join group problem.

I've changed it to "1" in my sources. Thanks.

Later,
David S. Miller
[email protected]