2002-09-18 12:41:45

by Peter Niessen

Subject: NFS/NAT(MASQUERADING) trouble

Hi,

after days of searching the net, my frustration has reached a level
that urges me to write to the experts. I have the following problem:

We've set up a PC farm of 9 clients and a master machine, running
kernel 2.4.18 (Debian), in the following way:


alpha server = NFS/NIS server
        \
         \
          \
       XXXXXXXX switch
          /
         /
        /                                   eth0=public ip,
Mastermachine with two interfaces eth0/eth1
        \                                   eth1=192.168.0.199
         \
          \
           \
            \
     XXXXXXXXX...XXXXXXXX switch
     |  |  |  ...  |  |  |
192.168.0.1 .2 .3 ... .9

iptables is used to run the masquerading/NAT on the master machine.
After rebooting, everything runs smoothly: the client machines and the
master machine can see the NFS disks on the alpha server, and NIS
works o.k. After a while, i.e. several hours in which the client
machines access the NFS server (running jobs under the PBS batch
system), the master machine, and only this one, gets into trouble.
First, it will lose the disk which the clients write to (1000 files
of ~2MB size, roughly ten of them within a minute in 10 minute
intervals + a ~kB logfile and an empty status file), then other disks
on the server. The clients are still able to see the server through
the master, NFS as well as NIS.

I use the NAT-script at the end of the message, copied from c't 4/2002.

Now, I found that people report similar behaviour when using Intel
boards with certain NICs (cf. e.g.
http://sourceforge.net/mailarchive/message.php?msg_id=550848). But why
would the client machines still see the NFS server?

If anyone has a similar set-up, I'd be glad to hear/read about it.

Cheers and thanks in advance, Peter.

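#!/bin/sh

# Enable IPv4 forwarding between eth1 (farm) and eth0 (public).
echo 1 > /proc/sys/net/ipv4/ip_forward

# One modprobe call per module: arguments after the first module name
# would otherwise be taken as module options.
for m in ip_tables iptable_filter ip_conntrack ip_conntrack_ftp \
         iptable_nat ip_nat_ftp ipt_LOG ipt_MASQUERADE; do
    modprobe "$m"
done
# Masquerade traffic from the private network leaving via eth0.
iptables -t nat -A POSTROUTING -o eth0 -s 192.168.0.0/16 -j MASQUERADE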

and iptables -t nat -L gives

master:~# iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target      prot opt source               destination

Chain POSTROUTING (policy ACCEPT)
target      prot opt source               destination
MASQUERADE  all  --  192.168.0.0/16       anywhere

Chain OUTPUT (policy ACCEPT)
target      prot opt source               destination
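
A listing like the one above can also be checked automatically, e.g. after each reboot, to confirm that the masquerading rule is really in place. The following is only a sketch: the function name `has_masquerade_rule` is made up for illustration, and it assumes the listing format shown above (chain headers starting with "Chain", target name in the first column).

```sh
#!/bin/sh
# Hypothetical helper (not part of the original NAT script): reads the
# output of `iptables -t nat -L` on stdin and succeeds (exit status 0)
# only if the POSTROUTING chain contains a rule with target MASQUERADE.
has_masquerade_rule() {
    awk '
        # A "Chain <name> ..." header starts a new chain section.
        /^Chain /                     { in_post = ($2 == "POSTROUTING") }
        # Inside POSTROUTING, look for the MASQUERADE target in column 1.
        in_post && $1 == "MASQUERADE" { found = 1 }
        # Exit status 0 if found, 1 otherwise.
        END                           { exit !found }
    '
}
```

It could be used as `iptables -t nat -L | has_masquerade_rule && echo "NAT rule present"`.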

Peter Nießen
VUB Brussels
Dienst ELEM
Pleinlaan 2
B-1050 Brussels
Belgium

Tel (+32)2/629-3554, -3651 (lab)
Fax (+32)2/629-3816
e-mail [email protected]
www http://www.ifh.de/~niessen

/"\ ASCII ribbon campaign
\ / ---------------------
 X  against HTML mail
/ \ and postings

pgp public key at

http://www.ifh.de/www_users/amanda/niessen/www/pgp_pubkey.asc



-------------------------------------------------------
This SF.NET email is sponsored by: AMD - Your access to the experts
on Hammer Technology! Open Source & Linux Developers, register now
for the AMD Developer Symposium. Code: EX8664
http://www.developwithamd.com/developerlab
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-09-22 20:43:54

by Chip Salzenberg

Subject: Re: NFS/NAT(MASQUERADING) trouble

According to Peter Niessen:
> the master machine, and only this one, gets into trouble.
> First, it will lose the disk which the clients write to (1000 files
> of ~2MB size, roughly ten of them within a minute in 10 minute
> intervals + a ~kB logfile and an empty status file), then other disks
> on the server. The clients are still able to see the server through
> the master, NFS as well as NIS.

What do you mean by the 'master' machine "losing" the disks?
--
Chip Salzenberg - a.k.a. - <[email protected]>
"It furthers one to have somewhere to go."



2002-09-23 07:35:02

by Peter Niessen

Subject: Re: NFS/NAT(MASQUERADING) trouble

On Sat, 21 Sep 2002, Chip Salzenberg wrote:

Hi Chip,

thanks for your mail.

> According to Peter Niessen:
> > the master machine, and only this one, gets into trouble.
....
>
> What do you mean by the 'master' machine "losing" the disks?

It means that the NFS link breaks and I get messages like

nfs server my_nfs_server not responding

and

nfs server my_nfs_server: task 12345 can't get a request slot.
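
To get a feeling for how often these messages occur, the kernel log can be scanned for them. A minimal sketch, assuming the messages show up in the `dmesg` output and contain the phrases "not responding" and "can't get a request slot"; the function name `count_nfs_timeouts` is invented for illustration.

```sh
#!/bin/sh
# Hypothetical helper: counts lines on stdin that look like NFS client
# timeout messages. The two patterns are assumptions about the wording
# of the kernel messages and may need adjusting.
count_nfs_timeouts() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c -e 'not responding' -e "can't get a request slot"
}
```

A possible call would be `dmesg | count_nfs_timeouts`.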

Some more information:

If I run df -h every 120 s on the master, NFS stays stable.
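
That periodic `df` can be sketched as a small loop. This only reproduces the workaround and says nothing about the underlying cause; the function name `nfs_keepalive` and its two arguments are hypothetical.

```sh
#!/bin/sh
# Hypothetical keepalive loop: run `df -h` at a fixed interval so that
# the master keeps talking to the NFS server.
#   $1 = seconds between runs (default 120)
#   $2 = number of runs, 0 meaning "forever" (default 0)
nfs_keepalive() {
    interval="${1:-120}"
    runs="${2:-0}"
    done_runs=0
    while [ "$runs" -eq 0 ] || [ "$done_runs" -lt "$runs" ]; do
        # The output is irrelevant, only the NFS traffic matters;
        # errors from df are ignored.
        df -h > /dev/null 2>&1 || true
        done_runs=$((done_runs + 1))
        # Skip the sleep after the last run of a finite series.
        if [ "$runs" -eq 0 ] || [ "$done_runs" -lt "$runs" ]; then
            sleep "$interval"
        fi
    done
}
```

Started in the background as `nfs_keepalive 120 &`, it matches the 120 s interval mentioned above.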

Cheers, Peter.

> --
> Chip Salzenberg - a.k.a. - <[email protected]>
> "It furthers one to have somewhere to go."
>


