Hi,
I have 2 servers which are connected to a gateway machine. The gateway and one server are running
Linux 2.6.16.2, while the third machine is running 2.6.16.5. The two ethernet ports on the gateway
which are connected to the servers are combined into a single ethernet bridge device.
Ever since 2.6.16, I have noticed that I can no longer cross-mount the two servers' /home
directories via UDP NFS. Which is to say that the mount command succeeds, but that trying to
access the filesystem makes the process hang and the "NFS server not responding" message to appear
in the console log. This is true regardless of which machine is the NFS server and which is the
NFS client.
It all works fine if I use TCP NFS instead.
Also, UDP NFS works OK between any server and the gateway itself, so it only goes wrong when UDP
NFS traffic is forwarded across the bridge. (I have not changed my firewall rules, which just tell
the gateway to forward all traffic coming in from the bridge device anyway.)
Can anyone reproduce this, please? I obviously have a workaround (using TCP instead of UDP) but it
sounds like there's a bug somewhere.
Cheers,
Chris
-------------------------------------------------------
This SF.Net email is sponsored by xPML, a groundbreaking scripting language
that extends applications into web and mobile media. Attend the live webcast
and join the prime developer group breaking into this new coding territory!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs
On Fri, 14 Apr 2006 14:42:20 +0100 (BST)
Chris Rankin <[email protected]> wrote:
> Hi,
>
> I have 2 servers which are connected to a gateway machine. The gateway and one server are running
> Linux 2.6.16.2, while the third machine is running 2.6.16.5. The two ethernet ports on the gateway
> which are connected to the servers are combined into a single ethernet bridge device.
>
> Ever since 2.6.16, I have noticed that I can no longer cross-mount the two servers' /home
> directories via UDP NFS. Which is to say that the mount command succeeds, but that trying to
> access the filesystem makes the process hang and the "NFS server not responding" message to appear
> in the console log. This is true regardless of which machine is the NFS server and which is the
> NFS client.
>
> It all works fine if I use TCP NFS instead.
>
> Also, UDP NFS works OK between any server and the gateway itself, so it only goes wrong when UDP
> NFS traffic is forwarded across the bridge. (I have not changed my firewall rules, which just tell
> the gateway to forward all traffic coming in from the bridge device anyway.)
>
> Can anyone reproduce this, please? I obviously have a workaround (using TCP instead of UDP) but it
> sounds like there's a bug somewhere.
>
> Cheers,
> Chris
Most likely the problem is that the MTU on the two devices in the bridge is different.
The bridge will silently drop packets if they are too large for the destination port (it's in the 802.1d
standard). TCP has path mtu discovery and is smart enough to recover. UDP doesn't do
that.
Anyway, don't run NFS over UDP unless you want data corruption. There are sequence number wraparound
issues that are unsolvable when running NFS over UDP/IP on faster links.
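The message does not say which field wraps. One common reading, taken here as an assumption, is the 16-bit IPv4 Identification field that the receiver uses to match fragments during reassembly. If a sender reuses an ID while fragments of an older datagram are still waiting for reassembly, fragments of two different datagrams can be combined. The rough arithmetic below ignores header overhead and assumes one ID per 8 KiB datagram sent to a single destination. The 30 second window matches the default Linux reassembly timeout (ipfrag_time). The function names are made up for this sketch.

```python
IP_ID_SPACE = 2 ** 16  # the IPv4 Identification field is 16 bits wide


def seconds_until_ip_id_wrap(link_bits_per_sec, datagram_bytes):
    """Time for a sender saturating the link to run through every IP ID once."""
    datagrams_per_sec = (link_bits_per_sec / 8) / datagram_bytes
    return IP_ID_SPACE / datagrams_per_sec


def wraps_within_reassembly_window(link_bits_per_sec, datagram_bytes,
                                   reassembly_timeout_sec=30.0):
    """True if IDs can repeat while stale fragments may still be queued."""
    wrap = seconds_until_ip_id_wrap(link_bits_per_sec, datagram_bytes)
    return wrap < reassembly_timeout_sec


if __name__ == "__main__":
    for label, rate in (("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)):
        wrap = seconds_until_ip_id_wrap(rate, 8192)
        risky = wraps_within_reassembly_window(rate, 8192)
        print(f"{label}: IDs repeat after about {wrap:.1f} s, within timeout: {risky}")
```

Under these assumptions the ID space lasts longer than the reassembly window at 100 Mbit/s but not at gigabit speed, which fits the remark that the problem shows up on faster links.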
--- Stephen Hemminger <[email protected]> wrote:
> Most likely the problem is that the MTU on the two devices in the bridge is different.
> The bridge will silently drop packets if they are too large for the destination port (it's in
> the 802.1d standard). TCP has path mtu discovery and is smart enough to recover. UDP doesn't do
> that.
Hi,
Thanks for the tip about the dangers of NFS and UDP. However, I don't think that the MTU is the
problem. All the ethernet devices (including the bridge) have an MTU of 1500, and according to my
routing table, only the default route has a lower MTU. Both servers are configured like this:
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.x
169.254.0.0/16 dev eth0 scope link
default via 192.168.0.1 dev eth0 src 192.168.0.x advmss 1452
eth0 Link encap:Ethernet HWaddr nn:nn:nn:nn:nn:nn
inet addr:192.168.0.x Bcast:192.168.0.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:6817 errors:0 dropped:0 overruns:0 frame:0
TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1840267 (1.7 MiB) TX bytes:678873 (662.9 KiB)
Base address:0xdcc0 Memory:ff6e0000-ff700000
So all traffic between 192.168.0.x machines should be routed with an MTU of 1500.
Cheers,
Chris
On Fri, 14 Apr 2006 20:26:56 +0100 (BST)
Chris Rankin <[email protected]> wrote:
> --- Stephen Hemminger <[email protected]> wrote:
> > Most likely the problem is that the MTU on the two devices in the bridge is different.
> > The bridge will silently drop packets if they are too large for the destination port (it's in
> > the 802.1d standard). TCP has path mtu discovery and is smart enough to recover. UDP doesn't do
> > that.
>
> Hi,
>
> Thanks for the tip about the dangers of NFS and UDP. However, I don't think that the MTU is the
> problem. All the ethernet devices (including the bridge) have an MTU of 1500, and according to my
> routing table, only the default route has a lower MTU. Both servers are configured like this:
>
> 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.x
> 169.254.0.0/16 dev eth0 scope link
> default via 192.168.0.1 dev eth0 src 192.168.0.x advmss 1452
>
> eth0 Link encap:Ethernet HWaddr nn:nn:nn:nn:nn:nn
> inet addr:192.168.0.x Bcast:192.168.0.255 Mask:255.255.255.0
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:6817 errors:0 dropped:0 overruns:0 frame:0
> TX packets:4951 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:1840267 (1.7 MiB) TX bytes:678873 (662.9 KiB)
> Base address:0xdcc0 Memory:ff6e0000-ff700000
>
> So all traffic between 192.168.0.x machines should be routed with an MTU of 1500.
>
> Cheers,
> Chris
What is the mtu of eth0 and eth1 on the bridge?
--
Stephen Hemminger <[email protected]>
OSDL http://developer.osdl.org/~shemminger
--- Stephen Hemminger <[email protected]> wrote:
> What is the mtu of eth0 and eth1 on the bridge?
1500 on both eth0 and eth1, and on the actual bridge device too.
Cheers,
Chris
On Fri, 14 Apr 2006 21:58:15 +0100 (BST)
Chris Rankin <[email protected]> wrote:
> --- Stephen Hemminger <[email protected]> wrote:
> > What is the mtu of eth0 and eth1 on the bridge?
>
> 1500 on both eth0 and eth1, and on the actual bridge device too.
Are you doing brouting or filtering? or vlan's?
Are the ethernet devices on the bridge doing hardware checksumming?
What version of kernel and configuration are running on the bridge?
--- Stephen Hemminger <[email protected]> wrote:
> Are you doing brouting or filtering? or vlan's?
> Are the ethernet devices on the bridge doing hardware checksumming?
> What version of kernel and configuration are running on the bridge?
The gateway is running 2.6.16.2, and all its ethernet devices are e100s. They might well do
hardware checksumming, but the configuration used to work fine. There is no brouting / ebfilters,
and I don't think that there's any vlan either (on the basis that I can't have it if I don't know
what it is ;-).)
Cheers,
Chris
On Fri, 14 Apr 2006 23:13:12 +0100 (BST)
Chris Rankin <[email protected]> wrote:
> --- Stephen Hemminger <[email protected]> wrote:
> > Are you doing brouting or filtering? or vlan's?
> > Are the ethernet devices on the bridge doing hardware checksumming?
> > What version of kernel and configuration are running on the bridge?
>
> The gateway is running 2.6.16.2, and all its ethernet devices are e100s. They might well do
> hardware checksumming, but the configuration used to work fine. There is no brouting / ebfilters,
> and I don't think that there's any vlan either (on the basis that I can't have it if I don't know
> what it is ;-).)
>
If you have the patience, then "git bisect" will pin down the regression.
--- Stephen Hemminger <[email protected]> wrote:
> If you have the patience, then "git bisect" will pin down the regression.
And git, of course. Did I mention that the relevant machine is the gateway server?
Then get a packet trace of a failing session with tcpdump. You may need to get two,
one on the client and one on the server, to be able to see which packet isn't getting
past the bridge.
There are tools to sanitize tcpdump files if you are paranoid about IP addresses, etc.
Stephen Hemminger wrote:
> Then get a packet trace of a failing session with tcpdump. You may need to get two,
> one on the client and one on the server, to be able to see which packet isn't getting
> past the bridge.
I only saw half of this thread (Chris' mails haven't made it to the list
yet), but in case you're using bridge-netfilter and conntrack, it's most
likely because of conntrack fragmentation changes in 2.6.16. Conntrack
defragments packets, but relies on the IP layer to do the
refragmentation now. With purely bridged traffic, the packets don't go
through the IP layer, so they exceed the MTU of the outgoing bridge
port. 2.6.16.6 will include a fix for this problem:
[patch 06/22] NETFILTER: Fix fragmentation issues with bridge netfilter
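The failure described here can be illustrated with a toy model. It is not kernel code. The function names, the 8220-byte packet (8192 bytes of data plus 8 bytes of UDP header and 20 bytes of IP header) and the option-free 20-byte IPv4 header are assumptions chosen for illustration. Fragments arrive at the bridge, conntrack reassembles them into one large packet, and the outgoing port can only carry that packet if something fragments it again.

```python
IP_HEADER = 20  # bytes, IPv4 header without options


def fragment_sizes(packet_len, mtu):
    """Split an IPv4 packet (header included) into on-wire fragment lengths."""
    if packet_len <= mtu:
        return [packet_len]
    payload = packet_len - IP_HEADER
    # Every fragment except the last carries a multiple of 8 payload bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IP_HEADER)
        payload -= chunk
    return sizes


def defragment(sizes):
    """Length of the single packet rebuilt from the fragments (conntrack's view)."""
    return sum(size - IP_HEADER for size in sizes) + IP_HEADER


def bridge_forward(packet_len, mtu, refragment):
    """Frame lengths leaving the bridge port; an empty list means the packet is lost."""
    if packet_len <= mtu:
        return [packet_len]
    return fragment_sizes(packet_len, mtu) if refragment else []


if __name__ == "__main__":
    on_wire = fragment_sizes(8220, 1500)     # what the NFS server sends
    rebuilt = defragment(on_wire)            # what conntrack hands to the bridge
    print(on_wire, rebuilt)
    print(bridge_forward(rebuilt, 1500, refragment=False))  # unpatched behaviour in this model
    print(bridge_forward(rebuilt, 1500, refragment=True))   # behaviour with refragmentation
```

In this model small packets pass either way, which is consistent with the mount succeeding while large UDP replies never reach the client.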
--- Patrick McHardy <[email protected]> wrote:
> I only saw half of this thread (Chris' mails haven't made it to the list
> yet), but in case you're using bridge-netfilter and conntrack, it's most
> likely because of conntrack fragmentation changes in 2.6.16. Conntrack
> defragments packets, but relies on the IP layer to do the
> refragmentation now. With purely bridged traffic, the packets don't go
> through the IP layer, so they exceed the MTU of the outgoing bridge
> port. 2.6.16.6 will include a fix for this problem:
>
> [patch 06/22] NETFILTER: Fix fragmentation issues with bridge netfilter
I emailed the packet dumps to Stephen privately, but what was happening was that the server was
receiving the request and was fragmenting the reply. However, the client was never receiving the
reply packets for some reason.
Yes, I am using connection tracking and netfilter, and the br0 interface is referenced in my
iptables rules. I am not using / have not loaded the ebtables modules, although I did compile
them.
Cheers,
Chris
-stable review patch. If anyone has any objections, please let us know.
------------------
[NETFILTER]: Fix fragmentation issues with bridge netfilter
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.
Signed-off-by: Patrick McHardy <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/ip.h | 1 +
net/bridge/br_netfilter.c | 13 +++++++++++--
net/ipv4/ip_output.c | 6 +++---
3 files changed, 15 insertions(+), 5 deletions(-)
--- linux-2.6.16.5.orig/include/net/ip.h
+++ linux-2.6.16.5/include/net/ip.h
@@ -95,6 +95,7 @@ extern int ip_local_deliver(struct sk_b
 extern int ip_mr_input(struct sk_buff *skb);
 extern int ip_output(struct sk_buff *skb);
 extern int ip_mc_output(struct sk_buff *skb);
+extern int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff *));
 extern int ip_do_nat(struct sk_buff *skb);
 extern void ip_send_check(struct iphdr *ip);
 extern int ip_queue_xmit(struct sk_buff *skb, int ipfragok);
--- linux-2.6.16.5.orig/net/bridge/br_netfilter.c
+++ linux-2.6.16.5/net/bridge/br_netfilter.c
@@ -739,6 +739,15 @@ out:
 	return NF_STOLEN;
 }
+static int br_nf_dev_queue_xmit(struct sk_buff *skb)
+{
+	if (skb->protocol == htons(ETH_P_IP) &&
+	    skb->len > skb->dev->mtu &&
+	    !(skb_shinfo(skb)->ufo_size || skb_shinfo(skb)->tso_size))
+		return ip_fragment(skb, br_dev_queue_push_xmit);
+	else
+		return br_dev_queue_push_xmit(skb);
+}
 /* PF_BRIDGE/POST_ROUTING ********************************************/
 static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff **pskb,
@@ -798,7 +807,7 @@ static unsigned int br_nf_post_routing(u
 		realoutdev = nf_bridge->netoutdev;
 #endif
 	NF_HOOK(pf, NF_IP_POST_ROUTING, skb, NULL, realoutdev,
-		br_dev_queue_push_xmit);
+		br_nf_dev_queue_xmit);
 	return NF_STOLEN;
@@ -843,7 +852,7 @@ static unsigned int ip_sabotage_out(unsi
 	if ((out->hard_start_xmit == br_dev_xmit &&
 	     okfn != br_nf_forward_finish &&
 	     okfn != br_nf_local_out_finish &&
-	     okfn != br_dev_queue_push_xmit)
+	     okfn != br_nf_dev_queue_xmit)
 #if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
 	    || ((out->priv_flags & IFF_802_1Q_VLAN) &&
 	        VLAN_DEV_INFO(out)->real_dev->hard_start_xmit == br_dev_xmit)
--- linux-2.6.16.5.orig/net/ipv4/ip_output.c
+++ linux-2.6.16.5/net/ipv4/ip_output.c
@@ -86,8 +86,6 @@
 int sysctl_ip_default_ttl = IPDEFTTL;
-static int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*));
-
 /* Generate a checksum for an outgoing IP datagram. */
 __inline__ void ip_send_check(struct iphdr *iph)
 {
@@ -421,7 +419,7 @@ static void ip_copy_metadata(struct sk_b
  * single device frame, and queue such a frame for sending.
  */
-static int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*))
+int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*))
 {
 	struct iphdr *iph;
 	int raw = 0;
@@ -673,6 +671,8 @@ fail:
 	return err;
 }
+EXPORT_SYMBOL(ip_fragment);
+
 int
 ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {
--
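The condition in the new br_nf_dev_queue_xmit() can be restated outside the kernel as a small decision function. This is only a reading aid. The Skb class and xmit_path() are hypothetical stand-ins for the few sk_buff fields the patch inspects, and byte order (the htons() call) is not modelled. Packets with ufo_size or tso_size set are passed through unfragmented, presumably because segmentation offload will split them later.

```python
from dataclasses import dataclass

ETH_P_IP = 0x0800  # EtherType for IPv4


@dataclass
class Skb:
    """Hypothetical stand-in for the sk_buff fields the patch looks at."""
    protocol: int
    length: int
    dev_mtu: int
    ufo_size: int = 0
    tso_size: int = 0


def xmit_path(skb):
    """Name the function the patched bridge code would call for this packet."""
    oversized = skb.length > skb.dev_mtu
    offloaded = bool(skb.ufo_size or skb.tso_size)
    if skb.protocol == ETH_P_IP and oversized and not offloaded:
        return "ip_fragment"
    return "br_dev_queue_push_xmit"


if __name__ == "__main__":
    print(xmit_path(Skb(ETH_P_IP, 8220, 1500)))  # reassembled NFS reply
    print(xmit_path(Skb(ETH_P_IP, 1500, 1500)))  # packet that already fits
```

Only oversized IPv4 packets without offload take the ip_fragment() path. Everything else is transmitted as before.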
--- Patrick McHardy <[email protected]> wrote:
> >>[patch 06/22] NETFILTER: Fix fragmentation issues with bridge netfilter
Thanks, this patch fixes the "NFS via UDP" problem.
Cheers,
Chris