2009-09-14 13:21:49

by Stephan von Krawczynski

Subject: ipv4 regression in 2.6.31 ?

Hello all,

today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
at least some incompatibility with former 2.6.30.X kernels.

We have the following situation:

                                    ---------- vlan1@eth0 192.168.2.1/24
                                   /
host A 192.168.1.1/24 eth0 -------<router>                               host B
                                   \
                                    ---------- eth1 192.168.3.1/24


Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
reply is generated on either vlan1 or eth1.
Kernels 2.6.30.X and below do not show this behaviour.
Is this intended? Do we need to reconfigure something to restore the old
behaviour?

--
Regards,
Stephan


2009-09-14 13:57:08

by Eric Dumazet

Subject: Re: ipv4 regression in 2.6.31 ?

Stephan von Krawczynski wrote:
> Hello all,
>
> today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
> at least some incompatibility with former 2.6.30.X kernels.
>
> We have the following situation:
>
> ---------- vlan1@eth0 192.168.2.1/24
> /
> host A 192.168.1.1/24 eth0 -------<router> host B
> \
> ---------- eth1 192.168.3.1/24
>
>
> Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
> host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
> interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
> With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
> reply being generated neither on vlan1 nor eth1.
> Kernels 2.6.30.X and below do not show this behaviour.
> Is this intended? Do we need to reconfigure something to restore the old
> behaviour?
>

Asymmetric routing?

Check your rp_filter settings

grep . `find /proc/sys/net -name rp_filter`

rp_filter - INTEGER
    0 - No source validation.
    1 - Strict mode as defined in RFC3704 Strict Reverse Path
        Each incoming packet is tested against the FIB and if the interface
        is not the best reverse path the packet check will fail.
        By default failed packets are discarded.
    2 - Loose mode as defined in RFC3704 Loose Reverse Path
        Each incoming packet's source address is also tested against the FIB
        and if the source address is not reachable via any interface
        the packet check will fail.

    Current recommended practice in RFC3704 is to enable strict mode
    to prevent IP spoofing from DDos attacks. If using asymmetric routing
    or other complicated routing, then loose mode is recommended.

    conf/all/rp_filter must also be set to non-zero to do source validation
    on the interface

    Default value is 0. Note that some distributions enable it
    in startup scripts.
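
To make the strict/loose distinction concrete for the setup in this thread,
here is a minimal C sketch of a reverse-path check. It is not kernel code:
the struct route table, lookup(), rp_check(), the IP() macro and the
interface numbers are all hypothetical names chosen for illustration, and a
flat array stands in for the FIB.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a host-byte-order IPv4 address from dotted-quad parts (illustrative). */
#define IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical, simplified route entry: prefix/len -> outgoing interface. */
struct route {
    uint32_t prefix;   /* network address, host byte order */
    int      plen;     /* prefix length, 0..32 */
    int      ifindex;  /* interface the route points at */
};

enum { RP_OFF = 0, RP_STRICT = 1, RP_LOOSE = 2 };

/* Longest-prefix match over the table; returns the best route or NULL. */
const struct route *lookup(const struct route *tbl, size_t n, uint32_t addr)
{
    const struct route *best = NULL;

    for (size_t i = 0; i < n; i++) {
        uint32_t mask = tbl[i].plen ? UINT32_MAX << (32 - tbl[i].plen) : 0;

        if ((addr & mask) == (tbl[i].prefix & mask) &&
            (!best || tbl[i].plen > best->plen))
            best = &tbl[i];
    }
    return best;
}

/* Returns 1 if the packet passes the reverse-path check, 0 if it is dropped. */
int rp_check(const struct route *tbl, size_t n, int mode,
             uint32_t saddr, int in_ifindex)
{
    const struct route *r;

    if (mode == RP_OFF)
        return 1;                     /* 0: no source validation */
    r = lookup(tbl, n, saddr);
    if (!r)
        return 0;                     /* source not reachable at all */
    if (mode == RP_LOOSE)
        return 1;                     /* 2: reachable via any interface */
    return r->ifindex == in_ifindex;  /* 1: must be the best reverse path */
}
```

With a single route for 192.168.1.0/24 pointing at eth1 (say ifindex 2) and a
packet from 192.168.1.1 arriving on vlan1 (say ifindex 3), rp_check() rejects
the packet in strict mode and accepts it in loose mode or with filtering off,
which matches the symptom reported above.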

2009-09-14 15:14:29

by Stephan von Krawczynski

Subject: Re: ipv4 regression in 2.6.31 ?

On Mon, 14 Sep 2009 15:57:03 +0200
Eric Dumazet <[email protected]> wrote:

> Stephan von Krawczynski wrote:
> > Hello all,
> >
> > today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
> > at least some incompatibility with former 2.6.30.X kernels.
> >
> > We have the following situation:
> >
> > ---------- vlan1@eth0 192.168.2.1/24
> > /
> > host A 192.168.1.1/24 eth0 -------<router> host B
> > \
> > ---------- eth1 192.168.3.1/24
> >
> >
> > Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
> > host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
> > interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
> > With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
> > reply being generated neither on vlan1 nor eth1.
> > Kernels 2.6.30.X and below do not show this behaviour.
> > Is this intended? Do we need to reconfigure something to restore the old
> > behaviour?
> >
>
> Asymetric routing ?
>
> Check your rp_filter settings
>
> grep . `find /proc/sys/net -name rp_filter`
>
> rp_filter - INTEGER
> 0 - No source validation.
> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> Each incoming packet is tested against the FIB and if the interface
> is not the best reverse path the packet check will fail.
> By default failed packets are discarded.
> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> Each incoming packet's source address is also tested against the FIB
> and if the source address is not reachable via any interface
> the packet check will fail.
>
> Current recommended practice in RFC3704 is to enable strict mode
> to prevent IP spoofing from DDos attacks. If using asymmetric routing
> or other complicated routing, then loose mode is recommended.
>
> conf/all/rp_filter must also be set to non-zero to do source validation
> on the interface
>
> Default value is 0. Note that some distributions enable it
> in startup scripts.

Problem is this:
Kernel 2.6.30.X and below work flawlessly in this setup; only kernel 2.6.31
acts differently. Is this an intended change in policy?

--
Regards,
Stephan

2009-09-14 15:21:22

by Eric Dumazet

Subject: Re: ipv4 regression in 2.6.31 ?

Stephan von Krawczynski wrote:
> On Mon, 14 Sep 2009 15:57:03 +0200
> Eric Dumazet <[email protected]> wrote:
>
>> Stephan von Krawczynski wrote:
>>> Hello all,
>>>
>>> today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
>>> at least some incompatibility with former 2.6.30.X kernels.
>>>
>>> We have the following situation:
>>>
>>> ---------- vlan1@eth0 192.168.2.1/24
>>> /
>>> host A 192.168.1.1/24 eth0 -------<router> host B
>>> \
>>> ---------- eth1 192.168.3.1/24
>>>
>>>
>>> Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
>>> host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
>>> interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
>>> With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
>>> reply being generated neither on vlan1 nor eth1.
>>> Kernels 2.6.30.X and below do not show this behaviour.
>>> Is this intended? Do we need to reconfigure something to restore the old
>>> behaviour?
>>>
>> Asymetric routing ?
>>
>> Check your rp_filter settings
>>
>> grep . `find /proc/sys/net -name rp_filter`
>>
>> rp_filter - INTEGER
>> 0 - No source validation.
>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
>> Each incoming packet is tested against the FIB and if the interface
>> is not the best reverse path the packet check will fail.
>> By default failed packets are discarded.
>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
>> Each incoming packet's source address is also tested against the FIB
>> and if the source address is not reachable via any interface
>> the packet check will fail.
>>
>> Current recommended practice in RFC3704 is to enable strict mode
>> to prevent IP spoofing from DDos attacks. If using asymmetric routing
>> or other complicated routing, then loose mode is recommended.
>>
>> conf/all/rp_filter must also be set to non-zero to do source validation
>> on the interface
>>
>> Default value is 0. Note that some distributions enable it
>> in startup scripts.
>
> Problem is this:
> Kernel 2.6.30.X and below work flawlessly in this setup, only kernel 2.6.31
> acts different. Is this an intended change in policy?
>

Here, it depends only on the rp_filter settings, whether on kernel 2.6.30 or 2.6.31.

Please give your settings for further investigations, for all hosts involved.

2009-09-14 15:55:12

by Stephan von Krawczynski

Subject: Re: ipv4 regression in 2.6.31 ?

On Mon, 14 Sep 2009 15:57:03 +0200
Eric Dumazet <[email protected]> wrote:

> Stephan von Krawczynski wrote:
> > Hello all,
> >
> > today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
> > at least some incompatibility with former 2.6.30.X kernels.
> >
> > We have the following situation:
> >
> > ---------- vlan1@eth0 192.168.2.1/24
> > /
> > host A 192.168.1.1/24 eth0 -------<router> host B
> > \
> > ---------- eth1 192.168.3.1/24
> >
> >
> > Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
> > host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
> > interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
> > With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
> > reply being generated neither on vlan1 nor eth1.
> > Kernels 2.6.30.X and below do not show this behaviour.
> > Is this intended? Do we need to reconfigure something to restore the old
> > behaviour?
> >
>
> Asymetric routing ?
>
> Check your rp_filter settings
>
> grep . `find /proc/sys/net -name rp_filter`
>
> rp_filter - INTEGER
> 0 - No source validation.
> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> Each incoming packet is tested against the FIB and if the interface
> is not the best reverse path the packet check will fail.
> By default failed packets are discarded.
> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> Each incoming packet's source address is also tested against the FIB
> and if the source address is not reachable via any interface
> the packet check will fail.
>
> Current recommended practice in RFC3704 is to enable strict mode
> to prevent IP spoofing from DDos attacks. If using asymmetric routing
> or other complicated routing, then loose mode is recommended.
>
> conf/all/rp_filter must also be set to non-zero to do source validation
> on the interface
>
> Default value is 0. Note that some distributions enable it
> in startup scripts.

Ok, here you can see 2.6.31 values from the discussed box:
(remember, no ping reply in this setup)

/proc/sys/net/ipv4/conf/all/rp_filter:1
/proc/sys/net/ipv4/conf/default/rp_filter:0
/proc/sys/net/ipv4/conf/lo/rp_filter:0
/proc/sys/net/ipv4/conf/eth2/rp_filter:0
/proc/sys/net/ipv4/conf/eth0/rp_filter:0
/proc/sys/net/ipv4/conf/eth1/rp_filter:0
/proc/sys/net/ipv4/conf/vlan1/rp_filter:0


And these are from the same box with 2.6.30.5:
(ping reply works)

/proc/sys/net/ipv4/conf/all/rp_filter:1
/proc/sys/net/ipv4/conf/default/rp_filter:0
/proc/sys/net/ipv4/conf/lo/rp_filter:0
/proc/sys/net/ipv4/conf/eth2/rp_filter:0
/proc/sys/net/ipv4/conf/eth0/rp_filter:0
/proc/sys/net/ipv4/conf/eth1/rp_filter:0
/proc/sys/net/ipv4/conf/vlan1/rp_filter:0

As you can see they're all the same. Does this mean that rp_filter never
really worked as intended before 2.6.31 ? Or does it mean that rp_filter=0
(eth1 and vlan1) gets overridden by all/rp_filter=1 in 2.6.31 and not before?

--
Regards,
Stephan

2009-09-14 16:29:49

by Richard B. Johnson

Subject: Re: Paravirtualization

On Linux-2.6.30.5, if I turn on paravirtualization in ".config" and build
the kernel, the new kernel boots okay. However, many programs seg-fault. For
instance, it is impossible to rebuild the kernel without rebooting to
another Linux version. Is this a known problem?

I have attached some configuration information.

Cheers,
Richard B. Johnson
http://Route495Software.com/


Attachments:
config.gz (24.84 kB)
cpuinfo.txt (485.00 B)
gcc-vers.txt (245.00 B)

2009-09-14 16:10:31

by Eric Dumazet

Subject: Re: ipv4 regression in 2.6.31 ?

Stephan von Krawczynski wrote:
> On Mon, 14 Sep 2009 15:57:03 +0200
> Eric Dumazet <[email protected]> wrote:
>
>> Stephan von Krawczynski wrote:
>>> Hello all,
>>>
>>> today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
>>> at least some incompatibility with former 2.6.30.X kernels.
>>>
>>> We have the following situation:
>>>
>>> ---------- vlan1@eth0 192.168.2.1/24
>>> /
>>> host A 192.168.1.1/24 eth0 -------<router> host B
>>> \
>>> ---------- eth1 192.168.3.1/24
>>>
>>>
>>> Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
>>> host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
>>> interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
>>> With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
>>> reply being generated neither on vlan1 nor eth1.
>>> Kernels 2.6.30.X and below do not show this behaviour.
>>> Is this intended? Do we need to reconfigure something to restore the old
>>> behaviour?
>>>
>> Asymetric routing ?
>>
>> Check your rp_filter settings
>>
>> grep . `find /proc/sys/net -name rp_filter`
>>
>> rp_filter - INTEGER
>> 0 - No source validation.
>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
>> Each incoming packet is tested against the FIB and if the interface
>> is not the best reverse path the packet check will fail.
>> By default failed packets are discarded.
>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
>> Each incoming packet's source address is also tested against the FIB
>> and if the source address is not reachable via any interface
>> the packet check will fail.
>>
>> Current recommended practice in RFC3704 is to enable strict mode
>> to prevent IP spoofing from DDos attacks. If using asymmetric routing
>> or other complicated routing, then loose mode is recommended.
>>
>> conf/all/rp_filter must also be set to non-zero to do source validation
>> on the interface
>>
>> Default value is 0. Note that some distributions enable it
>> in startup scripts.
>
> Ok, here you can see 2.6.31 values from the discussed box:
> (remember, no ping reply in this setup)
>
> /proc/sys/net/ipv4/conf/all/rp_filter:1
> /proc/sys/net/ipv4/conf/default/rp_filter:0
> /proc/sys/net/ipv4/conf/lo/rp_filter:0
> /proc/sys/net/ipv4/conf/eth2/rp_filter:0
> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
> /proc/sys/net/ipv4/conf/eth1/rp_filter:0
> /proc/sys/net/ipv4/conf/vlan1/rp_filter:0
>
>
> And these are from the same box with 2.6.30.5:
> (ping reply works)
>
> /proc/sys/net/ipv4/conf/all/rp_filter:1
> /proc/sys/net/ipv4/conf/default/rp_filter:0
> /proc/sys/net/ipv4/conf/lo/rp_filter:0
> /proc/sys/net/ipv4/conf/eth2/rp_filter:0
> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
> /proc/sys/net/ipv4/conf/eth1/rp_filter:0
> /proc/sys/net/ipv4/conf/vlan1/rp_filter:0
>
> As you can see they're all the same. Does this mean that rp_filter never
> really worked as intended before 2.6.31 ? Or does it mean that rp_filter=0
> (eth1 and vlan1) gets overriden by all/rp_filter=1 in 2.6.31 and not before?
>

Yes, previous kernels ignored the /proc/sys/net/ipv4/conf/all/rp_filter value; that was a bug.

commit 27fed4175acf81ddd91d9a4ee2fd298981f60295
Author: Stephen Hemminger <[email protected]>
Date: Mon Jul 27 18:39:45 2009 -0700

ip: fix logic of reverse path filter sysctl

Even though reverse path filter was changed from simple boolean to
trinary control, the loose mode only works if both all and device are
configured because of this logic error.

Signed-off-by: Stephen Hemminger <[email protected]>
Signed-off-by: David S. Miller <[email protected]>


In your case, you *need*
echo 0 >/proc/sys/net/ipv4/conf/all/rp_filter
or
echo 2 >/proc/sys/net/ipv4/conf/all/rp_filter

2009-09-14 16:31:36

by Stephen Hemminger

Subject: Re: ipv4 regression in 2.6.31 ?

On Mon, 14 Sep 2009 17:55:05 +0200
Stephan von Krawczynski <[email protected]> wrote:

> On Mon, 14 Sep 2009 15:57:03 +0200
> Eric Dumazet <[email protected]> wrote:
>
> > Stephan von Krawczynski wrote:
> > > Hello all,
> > >
> > > today we experienced some sort of regression in 2.6.31 ipv4 implementation, or
> > > at least some incompatibility with former 2.6.30.X kernels.
> > >
> > > We have the following situation:
> > >
> > > ---------- vlan1@eth0 192.168.2.1/24
> > > /
> > > host A 192.168.1.1/24 eth0 -------<router> host B
> > > \
> > > ---------- eth1 192.168.3.1/24
> > >
> > >
> > > Now, if you route 192.168.1.0/24 via interface vlan1@eth0 on host B and let
> > > host A ping 192.168.2.1 everything works. But if you route 192.168.1.0/24 via
> > > interface eth1 on host B and let host A ping 192.168.2.1 you get no reply.
> > > With tcpdump we see the icmp packets arrive at vlan1@eth0, but no icmp echo
> > > reply being generated neither on vlan1 nor eth1.
> > > Kernels 2.6.30.X and below do not show this behaviour.
> > > Is this intended? Do we need to reconfigure something to restore the old
> > > behaviour?
> > >
> >
> > Asymetric routing ?
> >
> > Check your rp_filter settings
> >
> > grep . `find /proc/sys/net -name rp_filter`
> >
> > rp_filter - INTEGER
> > 0 - No source validation.
> > 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> > Each incoming packet is tested against the FIB and if the interface
> > is not the best reverse path the packet check will fail.
> > By default failed packets are discarded.
> > 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> > Each incoming packet's source address is also tested against the FIB
> > and if the source address is not reachable via any interface
> > the packet check will fail.
> >
> > Current recommended practice in RFC3704 is to enable strict mode
> > to prevent IP spoofing from DDos attacks. If using asymmetric routing
> > or other complicated routing, then loose mode is recommended.
> >
> > conf/all/rp_filter must also be set to non-zero to do source validation
> > on the interface
> >
> > Default value is 0. Note that some distributions enable it
> > in startup scripts.
>
> Ok, here you can see 2.6.31 values from the discussed box:
> (remember, no ping reply in this setup)
>
> /proc/sys/net/ipv4/conf/all/rp_filter:1
> /proc/sys/net/ipv4/conf/default/rp_filter:0
> /proc/sys/net/ipv4/conf/lo/rp_filter:0
> /proc/sys/net/ipv4/conf/eth2/rp_filter:0
> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
> /proc/sys/net/ipv4/conf/eth1/rp_filter:0
> /proc/sys/net/ipv4/conf/vlan1/rp_filter:0
>
>
> And these are from the same box with 2.6.30.5:
> (ping reply works)
>
> /proc/sys/net/ipv4/conf/all/rp_filter:1
> /proc/sys/net/ipv4/conf/default/rp_filter:0
> /proc/sys/net/ipv4/conf/lo/rp_filter:0
> /proc/sys/net/ipv4/conf/eth2/rp_filter:0
> /proc/sys/net/ipv4/conf/eth0/rp_filter:0
> /proc/sys/net/ipv4/conf/eth1/rp_filter:0
> /proc/sys/net/ipv4/conf/vlan1/rp_filter:0
>
> As you can see they're all the same. Does this mean that rp_filter never
> really worked as intended before 2.6.31 ? Or does it mean that rp_filter=0
> (eth1 and vlan1) gets overriden by all/rp_filter=1 in 2.6.31 and not before?

RP filter did not work correctly in 2.6.30. The code added for the loose
mode caused a bug; the rp_filter value was being computed as:
rp_filter = interface_value & all_value;
So in order to get reverse path filtering, both would have to be set.

In 2.6.31 this was changed to:
rp_filter = max(interface_value, all_value);

This was the intended behaviour: if the user wants rp filtering turned on
for all interfaces, set /proc/sys/net/ipv4/conf/all/rp_filter = 1;
to turn it on for just one interface, set it for just that interface.

Sorry for any confusion this caused.
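
The two formulas in this message can be compared side by side with a minimal
C sketch. The function names are hypothetical and the bodies simply restate
the expressions as written in this message; the actual kernel macros may be
spelled differently.

```c
/* 2.6.30 behaviour as described in this message: AND of the two sysctls. */
int rp_filter_2_6_30(int interface_value, int all_value)
{
    return interface_value & all_value;
}

/* 2.6.31 behaviour as described in this message: the larger value wins. */
int rp_filter_2_6_31(int interface_value, int all_value)
{
    return interface_value > all_value ? interface_value : all_value;
}
```

For the reported configuration (all/rp_filter=1, vlan1/rp_filter=0) the first
function yields 0, so nothing is filtered, while the second yields 1, so
strict filtering applies on vlan1 and the asymmetrically routed ping is
dropped.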



--

2009-09-15 00:01:39

by Julian Anastasov

Subject: Re: ipv4 regression in 2.6.31 ?


Hello,

On Mon, 14 Sep 2009, Stephen Hemminger wrote:

> RP filter did not work correctly in 2.6.30. The code added to to the loose
> mode caused a bug; the rp_filter value was being computed as:
> rp_filter = interface_value & all_value;
> So in order to get reverse path filter both would have to be set.

Maybe we can add IN_DEV_MASKCONF as a better
option (all & dev). All loose-mode fans just need to set
all/rp_filter to 3 to allow both strict and loose mode and then
DEV/rp_filter will be restricted to the allowed modes. This way
compatibility is preserved (all/rp_filter will mean "allowed modes")
and you can add other loose-mode variants as explained in RFC 3704.
Then strict mode will have priority to all loose modes when checking
the sender address. Or if we really want to help asymmetric routing
we should not play with loose modes but with solutions like
rp_filter_mask:

http://www.ssi.bg/~ja/#rp_filter_mask

where we can use the DEV/medium_id knowledge for rp_filter, not
just for proxy_arp. The drawback is that currently it is
limited to 31 mediums. Still, it serves the main goal of
RFC 3704: 2.3. Feasible Path Reverse Path Forwarding.
Then users can use loose mode to fight against martians
or rp_filter_mask for setups with asymmetric routing.
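
A minimal C sketch of the "allowed modes" idea, assuming IN_DEV_MASKCONF
would reduce to a plain bitwise AND of the two sysctls. The macro does not
exist in the kernel at the time of this thread, so the function name and the
enum below are purely illustrative.

```c
/* Mode bits as used by rp_filter: 1 = strict, 2 = loose (3 = both allowed). */
enum { RP_MASK_STRICT = 1, RP_MASK_LOOSE = 2 };

/*
 * Hypothetical "allowed modes" semantics: all/rp_filter acts as a mask
 * that restricts what DEV/rp_filter may enable.
 */
int rp_filter_maskconf(int dev_value, int all_value)
{
    return dev_value & all_value;
}
```

Under this reading, all/rp_filter=3 lets each device pick strict (1) or
loose (2), while all/rp_filter=1 with DEV/rp_filter=1 still gives strict
mode, as it did with the older AND logic.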

Regards

--
Julian Anastasov <[email protected]>

2009-09-15 08:14:08

by Jarek Poplawski

Subject: Re: ipv4 regression in 2.6.31 ?

On 14-09-2009 18:31, Stephen Hemminger wrote:
> On Mon, 14 Sep 2009 17:55:05 +0200
> Stephan von Krawczynski <[email protected]> wrote:
>
>> On Mon, 14 Sep 2009 15:57:03 +0200
>> Eric Dumazet <[email protected]> wrote:
>>
>>> Stephan von Krawczynski wrote:
>>>> Hello all,
...
>>> rp_filter - INTEGER
>>> 0 - No source validation.
>>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
>>> Each incoming packet is tested against the FIB and if the interface
>>> is not the best reverse path the packet check will fail.
>>> By default failed packets are discarded.
>>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
>>> Each incoming packet's source address is also tested against the FIB
>>> and if the source address is not reachable via any interface
>>> the packet check will fail.
...
> RP filter did not work correctly in 2.6.30. The code added to to the loose
> mode caused a bug; the rp_filter value was being computed as:
> rp_filter = interface_value & all_value;
> So in order to get reverse path filter both would have to be set.
>
> In 2.6.31 this was change to:
> rp_filter = max(interface_value, all_value);
>
> This was the intended behaviour, if user asks all interfaces to have rp
> filtering turned on, then set /proc/sys/net/ipv4/conf/all/rp_filter = 1
> or to turn on just one interface, set it for just that interface.

Alas this max() formula handles also cases where both values are set
and it doesn't look very natural/"user friendly" to me. Especially
with something like this: all_value = 2; interface_value = 1
Why would anybody care to bother with interface_value in such a case?

"All" suggests "default" in this context, so I'd rather expect
something like:
rp_filter = interface_value ? : all_value;
which gives "the intended behaviour" too, plus more...

We'd only need to add e.g.:
0 - Default ("all") validation. (No source validation if "all" is 0).
3 - No source validation on this interface.
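
A minimal C sketch of this proposal, assuming the two extra meanings listed
above (0 = inherit from "all", 3 = no validation on this interface). The
enum and function name are hypothetical, and the GNU "?:" shorthand is
written out as a standard conditional.

```c
/* Hypothetical per-interface values under the proposal. */
enum { RPF_DEFAULT = 0, RPF_STRICT = 1, RPF_LOOSE = 2, RPF_NONE = 3 };

/* Resolve the mode actually applied: the interface wins unless it is 0. */
int rp_filter_default_conf(int interface_value, int all_value)
{
    int v = interface_value ? interface_value : all_value;

    return v == RPF_NONE ? 0 : v;  /* 3 means "no source validation" */
}
```

With all_value = 2 and interface_value = 1 this yields strict mode on that
interface, whereas the max() formula would yield loose mode; with
interface_value = 3 the interface opts out regardless of "all".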

Jarek P.

2009-09-15 22:59:31

by Stephen Hemminger

Subject: Re: ipv4 regression in 2.6.31 ?

On Tue, 15 Sep 2009 08:13:55 +0000
Jarek Poplawski <[email protected]> wrote:

> On 14-09-2009 18:31, Stephen Hemminger wrote:
> > On Mon, 14 Sep 2009 17:55:05 +0200
> > Stephan von Krawczynski <[email protected]> wrote:
> >
> >> On Mon, 14 Sep 2009 15:57:03 +0200
> >> Eric Dumazet <[email protected]> wrote:
> >>
> >>> Stephan von Krawczynski wrote:
> >>>> Hello all,
> ...
> >>> rp_filter - INTEGER
> >>> 0 - No source validation.
> >>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> >>> Each incoming packet is tested against the FIB and if the interface
> >>> is not the best reverse path the packet check will fail.
> >>> By default failed packets are discarded.
> >>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> >>> Each incoming packet's source address is also tested against the FIB
> >>> and if the source address is not reachable via any interface
> >>> the packet check will fail.
> ...
> > RP filter did not work correctly in 2.6.30. The code added to to the loose
> > mode caused a bug; the rp_filter value was being computed as:
> > rp_filter = interface_value & all_value;
> > So in order to get reverse path filter both would have to be set.
> >
> > In 2.6.31 this was change to:
> > rp_filter = max(interface_value, all_value);
> >
> > This was the intended behaviour, if user asks all interfaces to have rp
> > filtering turned on, then set /proc/sys/net/ipv4/conf/all/rp_filter = 1
> > or to turn on just one interface, set it for just that interface.
>
> Alas this max() formula handles also cases where both values are set
> and it doesn't look very natural/"user friendly" to me. Especially
> with something like this: all_value = 2; interface_value = 1
> Why would anybody care to bother with interface_value in such a case?
>
> "All" suggests "default" in this context, so I'd rather expect
> something like:
> rp_filter = interface_value ? : all_value;
> which gives "the inteded behaviour" too, plus more...
>
> We'd only need to add e.g.:
> 0 - Default ("all") validation. (No source validation if "all" is 0).
> 3 - No source validation on this interface.

More values == more confusion.
I chose the maxconf() method to make rp_filter consistent with other
multi valued variables (arp_announce and arp_ignore).

--------
Subject: [PATCH] Document rp_filter behaviour

Signed-off-by: Stephen Hemminger <[email protected]>


--- a/Documentation/networking/ip-sysctl.txt	2009-09-15 15:54:25.844934373 -0700
+++ b/Documentation/networking/ip-sysctl.txt	2009-09-15 15:55:40.709205883 -0700
@@ -744,6 +744,8 @@ rp_filter - INTEGER
 	Default value is 0. Note that some distributions enable it
 	in startup scripts.
 
+	The max value from conf/{all,interface}/rp_filter is used.
+
 arp_filter - BOOLEAN
 	1 - Allows you to have multiple network interfaces on the same
 	    subnet, and have the ARPs for each interface be answered







--

2009-09-16 05:23:23

by Jarek Poplawski

Subject: Re: ipv4 regression in 2.6.31 ?

On Tue, Sep 15, 2009 at 03:57:19PM -0700, Stephen Hemminger wrote:
> On Tue, 15 Sep 2009 08:13:55 +0000
> Jarek Poplawski <[email protected]> wrote:
>
> > On 14-09-2009 18:31, Stephen Hemminger wrote:
> > > On Mon, 14 Sep 2009 17:55:05 +0200
> > > Stephan von Krawczynski <[email protected]> wrote:
> > >
> > >> On Mon, 14 Sep 2009 15:57:03 +0200
> > >> Eric Dumazet <[email protected]> wrote:
> > >>
> > >>> Stephan von Krawczynski wrote:
> > >>>> Hello all,
> > ...
> > >>> rp_filter - INTEGER
> > >>> 0 - No source validation.
> > >>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> > >>> Each incoming packet is tested against the FIB and if the interface
> > >>> is not the best reverse path the packet check will fail.
> > >>> By default failed packets are discarded.
> > >>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> > >>> Each incoming packet's source address is also tested against the FIB
> > >>> and if the source address is not reachable via any interface
> > >>> the packet check will fail.
> > ...
> > > RP filter did not work correctly in 2.6.30. The code added to to the loose
> > > mode caused a bug; the rp_filter value was being computed as:
> > > rp_filter = interface_value & all_value;
> > > So in order to get reverse path filter both would have to be set.
> > >
> > > In 2.6.31 this was change to:
> > > rp_filter = max(interface_value, all_value);
> > >
> > > This was the intended behaviour, if user asks all interfaces to have rp
> > > filtering turned on, then set /proc/sys/net/ipv4/conf/all/rp_filter = 1
> > > or to turn on just one interface, set it for just that interface.
> >
> > Alas this max() formula handles also cases where both values are set
> > and it doesn't look very natural/"user friendly" to me. Especially
> > with something like this: all_value = 2; interface_value = 1
> > Why would anybody care to bother with interface_value in such a case?
> >
> > "All" suggests "default" in this context, so I'd rather expect
> > something like:
> > rp_filter = interface_value ? : all_value;
> > which gives "the inteded behaviour" too, plus more...
> >
> > We'd only need to add e.g.:
> > 0 - Default ("all") validation. (No source validation if "all" is 0).
> > 3 - No source validation on this interface.
>
> More values == more confusion.
> I chose the maxconf() method to make rp_filter consistent with other
> multi valued variables (arp_announce and arp_ignore).

This additional value is not necessary (it'd give us superpowers).
Max seems logical to me only when values are sorted (especially if
max is the strictest).

Jarek P.

>
> --------
> Subject: [PATCH] Document rp_filter behaviour
>
> Signed-off-by: Stephen Hemminger <[email protected]>
>
>
> --- a/Documentation/networking/ip-sysctl.txt 2009-09-15 15:54:25.844934373 -0700
> +++ b/Documentation/networking/ip-sysctl.txt 2009-09-15 15:55:40.709205883 -0700
> @@ -744,6 +744,8 @@ rp_filter - INTEGER
> Default value is 0. Note that some distributions enable it
> in startup scripts.
>
> + The max value from conf/{all,interface}/rp_filter is used.
> +
> arp_filter - BOOLEAN
> 1 - Allows you to have multiple network interfaces on the same
> subnet, and have the ARPs for each interface be answered
>
>
>
>
>
>
>
> --

2009-09-16 17:00:36

by Stephen Hemminger

Subject: Re: ipv4 regression in 2.6.31 ?

On Wed, 16 Sep 2009 05:23:04 +0000
Jarek Poplawski <[email protected]> wrote:

> On Tue, Sep 15, 2009 at 03:57:19PM -0700, Stephen Hemminger wrote:
> > On Tue, 15 Sep 2009 08:13:55 +0000
> > Jarek Poplawski <[email protected]> wrote:
> >
> > > On 14-09-2009 18:31, Stephen Hemminger wrote:
> > > > On Mon, 14 Sep 2009 17:55:05 +0200
> > > > Stephan von Krawczynski <[email protected]> wrote:
> > > >
> > > >> On Mon, 14 Sep 2009 15:57:03 +0200
> > > >> Eric Dumazet <[email protected]> wrote:
> > > >>
> > > >>> Stephan von Krawczynski wrote:
> > > >>>> Hello all,
> > > ...
> > > >>> rp_filter - INTEGER
> > > >>> 0 - No source validation.
> > > >>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> > > >>> Each incoming packet is tested against the FIB and if the interface
> > > >>> is not the best reverse path the packet check will fail.
> > > >>> By default failed packets are discarded.
> > > >>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> > > >>> Each incoming packet's source address is also tested against the FIB
> > > >>> and if the source address is not reachable via any interface
> > > >>> the packet check will fail.
> > > ...
> > > > RP filter did not work correctly in 2.6.30. The code added to to the loose
> > > > mode caused a bug; the rp_filter value was being computed as:
> > > > rp_filter = interface_value & all_value;
> > > > So in order to get reverse path filter both would have to be set.
> > > >
> > > > In 2.6.31 this was change to:
> > > > rp_filter = max(interface_value, all_value);
> > > >
> > > > This was the intended behaviour, if user asks all interfaces to have rp
> > > > filtering turned on, then set /proc/sys/net/ipv4/conf/all/rp_filter = 1
> > > > or to turn on just one interface, set it for just that interface.
> > >
> > > Alas this max() formula handles also cases where both values are set
> > > and it doesn't look very natural/"user friendly" to me. Especially
> > > with something like this: all_value = 2; interface_value = 1
> > > Why would anybody care to bother with interface_value in such a case?
> > >
> > > "All" suggests "default" in this context, so I'd rather expect
> > > something like:
> > > rp_filter = interface_value ? : all_value;
> > > which gives "the inteded behaviour" too, plus more...
> > >
> > > We'd only need to add e.g.:
> > > 0 - Default ("all") validation. (No source validation if "all" is 0).
> > > 3 - No source validation on this interface.
> >
> > More values == more confusion.
> > I chose the maxconf() method to make rp_filter consistent with other
> > multi valued variables (arp_announce and arp_ignore).
>
> This additional value is not necessary (it'd give as superpowers).
> Max seems logical to me only when values are sorted (especially if
> max is the strictest).

The values had to be unsorted because of the requirement to retain
interface compatibility with older releases.
--

2009-09-18 08:43:41

by Stephan von Krawczynski

Subject: Re: ipv4 regression in 2.6.31 ?

On Wed, 16 Sep 2009 10:00:28 -0700
Stephen Hemminger <[email protected]> wrote:

> On Wed, 16 Sep 2009 05:23:04 +0000
> Jarek Poplawski <[email protected]> wrote:
>
> > On Tue, Sep 15, 2009 at 03:57:19PM -0700, Stephen Hemminger wrote:
> > > On Tue, 15 Sep 2009 08:13:55 +0000
> > > Jarek Poplawski <[email protected]> wrote:
> > >
> > > > On 14-09-2009 18:31, Stephen Hemminger wrote:
> > > > > On Mon, 14 Sep 2009 17:55:05 +0200
> > > > > Stephan von Krawczynski <[email protected]> wrote:
> > > > >
> > > > >> On Mon, 14 Sep 2009 15:57:03 +0200
> > > > >> Eric Dumazet <[email protected]> wrote:
> > > > >>
> > > > >>> Stephan von Krawczynski wrote:
> > > > >>>> Hello all,
> > > > ...
> > > > >>> rp_filter - INTEGER
> > > > >>> 0 - No source validation.
> > > > >>> 1 - Strict mode as defined in RFC3704 Strict Reverse Path
> > > > >>> Each incoming packet is tested against the FIB and if the interface
> > > > >>> is not the best reverse path the packet check will fail.
> > > > >>> By default failed packets are discarded.
> > > > >>> 2 - Loose mode as defined in RFC3704 Loose Reverse Path
> > > > >>> Each incoming packet's source address is also tested against the FIB
> > > > >>> and if the source address is not reachable via any interface
> > > > >>> the packet check will fail.
> > > > ...
> > > > > RP filter did not work correctly in 2.6.30. The code added to to the loose
> > > > > mode caused a bug; the rp_filter value was being computed as:
> > > > > rp_filter = interface_value & all_value;
> > > > > So in order to get reverse path filter both would have to be set.
> > > > >
> > > > > In 2.6.31 this was change to:
> > > > > rp_filter = max(interface_value, all_value);
> > > > >
> > > > > This was the intended behaviour, if user asks all interfaces to have rp
> > > > > filtering turned on, then set /proc/sys/net/ipv4/conf/all/rp_filter = 1
> > > > > or to turn on just one interface, set it for just that interface.
> > > >
> > > > Alas this max() formula handles also cases where both values are set
> > > > and it doesn't look very natural/"user friendly" to me. Especially
> > > > with something like this: all_value = 2; interface_value = 1
> > > > Why would anybody care to bother with interface_value in such a case?
> > > >
> > > > "All" suggests "default" in this context, so I'd rather expect
> > > > something like:
> > > > rp_filter = interface_value ? : all_value;
> > > > which gives "the inteded behaviour" too, plus more...
> > > >
> > > > We'd only need to add e.g.:
> > > > 0 - Default ("all") validation. (No source validation if "all" is 0).
> > > > 3 - No source validation on this interface.
> > >
> > > More values == more confusion.
> > > I chose the maxconf() method to make rp_filter consistent with other
> > > multi valued variables (arp_announce and arp_ignore).
> >
> > This additional value is not necessary (it'd give as superpowers).
> > Max seems logical to me only when values are sorted (especially if
> > max is the strictest).
>
> The values had to be unsorted because of the requirement to retain
> interface compatibility with older releases.

The parameters are the same (I guess this is what you call interface
compatibility), but the function came out different, meaning you broke
functional compatibility with 2.6.31 instead. I just wanted to mention that,
though the argument is lightweight, since the whole thing was already
somewhat broken before the bugfix.

--
Regards,
Stephan