2002-05-14 10:25:36

by Steffen Persvold

Subject: Input/output error on hard mounted NFS directory

Hi all,

I guess I've reported this before, but here we go again:

I'm compiling up a large project which resides on a NFS mounted
ext2 filesystem. In this case the NFS server is running RH 7.2
kernel-smp-2.4.9-21 (i686 version) nfs-utils-0.3.1-13.7.2.1, and the
client is running RH 6.2 kernel-enterprise-2.2.19-6.2.7 (i686 version).

The relevant /proc/mounts looks like this :

huey:/export/home/sp /home/sp nfs rw,v3,rsize=8192,wsize=8192,intr,addr=huey 0 0


What happens is that when all object files are compiled (this goes fine),
a static library is about to be created. This is when the client aborts :

# Creating static library Linux2.i86pc.gnu/libmpi.a ...
ar: Linux2.i86pc.gnu/libmpi.a: Input/output error
make[4]: *** [Linux2.i86pc.gnu/libmpi.a] Error 1

If I try once more (just creating the static library) everything goes
fine. Another thing is that this is unfortunately not always reproducible.
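Because the failure is intermittent, it may help to measure how often it happens before and after changing anything. The following is only a minimal sketch, not part of the original report: `count_failures` is a hypothetical helper, and the command passed to it would have to remove the old archive and rerun the `ar` step each time, otherwise make has nothing to rebuild.

```python
import subprocess


def count_failures(cmd, runs):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # Output is discarded; only the exit status matters here.
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Hypothetical usage (the script name is an assumption, not from the thread):
#   count_failures(["sh", "./rebuild-archive.sh"], 50)
```

A count taken on the NFS mount can then be compared with a count taken after a kernel change, which is more informative than a single successful retry.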

Is there a known bug in some of my components which could result in such an
error ? I would very much appreciate some pointers to what could be wrong
(not just 'upgrade to latest version, maybe it is fixed').

Thanks,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency


_______________________________________________________________

Have big pipes? SourceForge.net is looking for download mirrors. We supply
the hardware. You get the recognition. Email Us: [email protected]
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2002-05-24 11:49:36

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On Tue, 14 May 2002, Trond Myklebust wrote:

> Tuesday 14 May 2002 13:15, Steffen Persvold wrote:
>
> > > If you want to run a 2.2. client, then grab 2.2.21-rc2, and apply the
> > > patch from
> > >
> > > http://www.fys.uio.no/~trondmy/src/2.2.21-rc2/linux-2.2.21-NFS.dif
> >
> > Do I have to ?
>
> Not if you can live with the bugs ;-)
>
> > Will it fix my problem ? Other known NFS bugs ?
>
> The 2.2.21-NFS patch fixes all the EIO bugs that were found in 2.4.x. You
> might of course have found totally different bugs, but I really can't tell
> until you eliminate the first possibility.
>


Hi again,

I have some updates on this problem. As you may remember I had a RH 7.2
NFS server (i686) running the 2.4.9-21smp kernel, exporting an ext2
directory (~70GB).

Sometimes I discovered problems when linking together a large static
library on two clients which had different architecture (ia32 ia64), both
running RedHat but with different kernels (ia32 was running 2.2.19-6.2.7
and ia64 was running a stock 2.4.18 with ia64 patchset). The error message
I got on the ia32 client was :

ar: Linux2.i86pc.gnu.dbg/libmpi.a: Input/output error
gmake[4]: *** [Linux2.i86pc.gnu.dbg/libmpi.a] Error 1

and on the ia64 client it was :

ar: Linux2.ia64.gnu.dbg/libmpi.a: Stale NFS file handle
gmake[4]: *** [Linux2.ia64.gnu.dbg/libmpi.a] Error 1

Now, since both these clients got almost the same problem (ok the 2.2.19
client reported -EIO, and the 2.4.18 client -ESTALE), I tried upgrading my
server to a stock 2.4.18 kernel and that actually helped ...
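For reference, the two messages correspond to two distinct errno values. The sketch below is illustrative only: `errno_name` is a hypothetical helper, and the message strings are the ones printed by the tools on these clients (other C library versions may word ESTALE differently).

```python
import errno

# Message text as printed by ar on the two clients, mapped to errno values.
MESSAGES = {
    "Input/output error": errno.EIO,
    "Stale NFS file handle": errno.ESTALE,
}


def errno_name(line):
    """Return the errno name for a tool error line, or None if unrecognised."""
    stripped = line.rstrip()
    for text, code in MESSAGES.items():
        if stripped.endswith(text):
            return errno.errorcode[code]
    return None
```

Feeding the two `ar:` lines above through `errno_name` gives `EIO` for the ia32 client and `ESTALE` for the ia64 client.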

Can you give me any more details on why upgrading my NFS server
to 2.4.18 helped ?

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency



_______________________________________________________________

Don't miss the 2002 Sprint PCS Application Developer's Conference
August 25-28 in Las Vegas -- http://devcon.sprintpcs.com/adp/index.cfm


2002-05-25 22:13:46

by Trond Myklebust

Subject: Re: Input/output error on hard mounted NFS directory

>>>>> " " == Steffen Persvold <[email protected]> writes:

> Can you give me any more details on why upgrading my NFS
> server to 2.4.18 helped ?

See the linux-kernel and NFS@sourceforge archives. Those races were
discovered by the 'fsx' filesystem exerciser.

Cheers,
Trond


2002-05-14 10:42:20

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On Tue, 14 May 2002, Steffen Persvold wrote:

> Hi all,
>
> I guess I've reported this before, but here we go again:
>
> I'm compiling up a large project which resides on a NFS mounted
> ext2 filesystem. In this case the NFS server is running RH 7.2
> kernel-smp-2.4.9-21 (i686 version) nfs-utils-0.3.1-13.7.2.1, and the
> client is running RH 6.2 kernel-enterprise-2.2.19-6.2.7 (i686 version).
>
> The relevant /proc/mounts looks like this :
>
> huey:/export/home/sp /home/sp nfs rw,v3,rsize=8192,wsize=8192,intr,addr=huey 0 0
>
>
> What happens is that when all object files are compiled (this goes fine),
> a static library is about to be created. This is when the client aborts :
>
> # Creating static library Linux2.i86pc.gnu/libmpi.a ...
> ar: Linux2.i86pc.gnu/libmpi.a: Input/output error
> make[4]: *** [Linux2.i86pc.gnu/libmpi.a] Error 1
>
> If I try once more (just creating the static library) everything goes
> fine. Another thing is that this is unfortunately not always reproducible.
>
> Is there a known bug in some of my components which could result in such an
> error ? I would very much appreciate some pointers to what could be wrong
> (not just 'upgrade to latest version, maybe it is fixed').
>


I have also sort of the same problem on another architecture (IA64).
This machine runs a stock 2.4.18 kernel (with ia64 patchset). The filesystem is
also mounted differently (noac):

huey:/export/home/sp /home/sp nfs rw,sync,v3,rsize=8192,wsize=8192,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,hard,intr,udp,noac,lock,addr=huey 0 0

The error I get here is :

# Creating static library Linux2.ia64.gnu/libmpi.a ...
# Creating shared library Linux2.ia64.gnu/libmpi.so ...
/usr/bin/ld: final link failed: Stale NFS file handle
collect2: ld returned 1 exit status
make[4]: *** [Linux2.ia64.gnu/libmpi.a] Error 1


Do you think the two different errors are related ?

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency



2002-05-14 10:50:39

by Trond Myklebust

Subject: Re: Input/output error on hard mounted NFS directory

>>>>> " " == Steffen Persvold <[email protected]> writes:

> The error I get here is :

> # Creating static library Linux2.ia64.gnu/libmpi.a ...
> # Creating shared library Linux2.ia64.gnu/libmpi.so ...
> /usr/bin/ld: final link failed: Stale NFS file handle
> collect2: ld returned 1 exit status
> make[4]: *** [Linux2.ia64.gnu/libmpi.a] Error 1


Stale file handle has nothing to do with EIO. That looks more like a
server bug. Are you perchance using nfs-server?

Cheers,
Trond


2002-05-14 10:52:10

by Trond Myklebust

Subject: Re: Input/output error on hard mounted NFS directory

>>>>> " " == Steffen Persvold <[email protected]> writes:

> I'm compiling up a large project which resides on a NFS mounted
> ext2 filesystem. In this case the NFS server is running RH 7.2
> kernel-smp-2.4.9-21 (i686 version) nfs-utils-0.3.1-13.7.2.1,
> and the client is running RH 6.2 kernel-enterprise-2.2.19-6.2.7
> (i686 version).

If you want to run a 2.2. client, then grab 2.2.21-rc2, and apply the
patch from

http://www.fys.uio.no/~trondmy/src/2.2.21-rc2/linux-2.2.21-NFS.dif

Cheers,
Trond


2002-05-14 11:12:11

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On 14 May 2002, Trond Myklebust wrote:

> >>>>> " " == Steffen Persvold <[email protected]> writes:
>
> > The error I get here is :
>
> > # Creating static library Linux2.ia64.gnu/libmpi.a ...
> > # Creating shared library Linux2.ia64.gnu/libmpi.so ...
> > /usr/bin/ld: final link failed: Stale NFS file handle
> > collect2: ld returned 1 exit status
> > make[4]: *** [Linux2.ia64.gnu/libmpi.a] Error 1
>
>
> Stale file handle has nothing to do with EIO. That looks more like a
> server bug. Are you perchance using nfs-server?

No, I'm using a standard RH 7.2 server with knfsd
(and nfs-utils-0.3.1-13.7.2.1). rpc.nfsd is started with 48 servers.

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency



2002-05-14 11:15:21

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On 14 May 2002, Trond Myklebust wrote:

> >>>>> " " == Steffen Persvold <[email protected]> writes:
>
> > I'm compiling up a large project which resides on a NFS mounted
> > ext2 filesystem. In this case the NFS server is running RH 7.2
> > kernel-smp-2.4.9-21 (i686 version) nfs-utils-0.3.1-13.7.2.1,
> > and the client is running RH 6.2 kernel-enterprise-2.2.19-6.2.7
> > (i686 version).
>
> If you want to run a 2.2. client, then grab 2.2.21-rc2, and apply the
> patch from
>
> http://www.fys.uio.no/~trondmy/src/2.2.21-rc2/linux-2.2.21-NFS.dif
>

Do I have to ? Will it fix my problem ? Other known NFS bugs ?

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency



2002-05-14 12:38:15

by Trond Myklebust

Subject: Re: Input/output error on hard mounted NFS directory

Tuesday 14 May 2002 13:15, Steffen Persvold wrote:

> > If you want to run a 2.2. client, then grab 2.2.21-rc2, and apply the
> > patch from
> >
> > http://www.fys.uio.no/~trondmy/src/2.2.21-rc2/linux-2.2.21-NFS.dif
>
> Do I have to ?

Not if you can live with the bugs ;-)

> Will it fix my problem ? Other known NFS bugs ?

The 2.2.21-NFS patch fixes all the EIO bugs that were found in 2.4.x. You
might of course have found totally different bugs, but I really can't tell
until you eliminate the first possibility.

Cheers,
Trond


2002-05-14 12:52:43

by Trond Myklebust

Subject: Re: Input/output error on hard mounted NFS directory

>>>>> " " == Steffen Persvold <[email protected]> writes:

> On 14 May 2002, Trond Myklebust wrote:
>> >>>>> " " == Steffen Persvold <[email protected]> writes:
>>
>> > The error I get here is :
>>
>> > # Creating static library Linux2.ia64.gnu/libmpi.a ...
>> > # Creating shared library Linux2.ia64.gnu/libmpi.so ...
>> > /usr/bin/ld: final link failed: Stale NFS file handle
>> > collect2: ld returned 1 exit status
>> > make[4]: *** [Linux2.ia64.gnu/libmpi.a] Error 1
>>
>>
>> Stale file handle has nothing to do with EIO. That looks more
>> like a server bug. Are you perchance using nfs-server?

> No, I'm using a standard RH 7.2 server with knfsd (and
> nfs-utils-0.3.1-13.7.2.1). rpc.nfsd is started with 48 servers.

What is the filesystem you are using on the server? Would it be an old
ReiserFS 3.5 format disk? If so, you should consider upgrading it to a
3.6 format disk. See the manpage for 'mount -oconv' at http://www.namesys.com

Cheers,
Trond


2002-05-14 13:32:22

by Paul Heinlein

Subject: Re: Input/output error on hard mounted NFS directory

On Tue, 14 May 2002, Steffen Persvold wrote:

> The relevant /proc/mounts looks like this :
>
> huey:/export/home/sp /home/sp nfs rw,v3,rsize=8192,wsize=8192,intr,addr=huey 0 0

Pardon my asking a question that's likely irrelevant -- but why are
your {r,w}size settings stuck at 8k? I've never done any transfer-size
testing under a 2.2.x kernel, but won't it negotiate something closer
to 32k when running nfs v3?
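To check what a client actually ended up with, the options column of /proc/mounts can be read directly. This is a rough sketch under the assumption that the file has the field layout shown in the quoted line; `nfs_transfer_sizes` is a hypothetical helper, not an existing tool.

```python
def nfs_transfer_sizes(mounts_text):
    """Return {mount_point: (rsize, wsize)} for NFS entries in /proc/mounts text."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Keep only key=value options such as rsize=8192.
        opts = dict(o.split("=", 1) for o in fields[3].split(",") if "=" in o)
        rsize = int(opts["rsize"]) if "rsize" in opts else None
        wsize = int(opts["wsize"]) if "wsize" in opts else None
        sizes[fields[1]] = (rsize, wsize)
    return sizes


# Typical use on a Linux client:
#   with open("/proc/mounts") as f:
#       print(nfs_transfer_sizes(f.read()))
```

Applied to the line quoted above, this reports 8192 for both rsize and wsize on /home/sp.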

+----------------------------------------------+----------------------+
| Paul Heinlein | [email protected] |
| Research Systems Engineer | +1 503 748-1472 |
| Department of Computer Science & Engineering | 20000 NW Walker Road |
| OGI School of Science & Engineering | Beaverton, OR 97006 |
| Oregon Health & Science University | USA |
+----------------------------------------------+----------------------+



2002-05-14 14:05:39

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On 14 May 2002, Trond Myklebust wrote:

> >>>>> " " == Steffen Persvold <[email protected]> writes:
>
> > On 14 May 2002, Trond Myklebust wrote:
> >> >>>>> " " == Steffen Persvold <[email protected]> writes:
> >>
> >> > The error I get here is :
> >>
> >> > # Creating static library Linux2.ia64.gnu/libmpi.a ...
> >> > # Creating shared library Linux2.ia64.gnu/libmpi.so ...
> >> > /usr/bin/ld: final link failed: Stale NFS file handle
> >> > collect2: ld returned 1 exit status
> >> > make[4]: *** [Linux2.ia64.gnu/libmpi.a] Error 1
> >>
> >>
> >> Stale file handle has nothing to do with EIO. That looks more
> >> like a server bug. Are you perchance using nfs-server?
>
> > No, I'm using a standard RH 7.2 server with knfsd (and
> > nfs-utils-0.3.1-13.7.2.1). rpc.nfsd is started with 48 servers.
>
> What is the filesystem you are using on the server? Would it be an old
> ReiserFS 3.5 format disk? If so, you should consider upgrading it to a
> 3.6 format disk. See the manpage for 'mount -oconv' at http://www.namesys.com
>

It is ext2 (used to be ext3, but I changed back).

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency



2002-05-14 14:06:57

by Steffen Persvold

Subject: Re: Input/output error on hard mounted NFS directory

On Tue, 14 May 2002, Paul Heinlein wrote:

> On Tue, 14 May 2002, Steffen Persvold wrote:
>
> > The relevant /proc/mounts looks like this :
> >
> > huey:/export/home/sp /home/sp nfs rw,v3,rsize=8192,wsize=8192,intr,addr=huey 0 0
>
> Pardon my asking a question that's likely irrelevant -- but why are
> your {r,w}size settings stuck at 8k? I've never done any transfer-size
> testing under a 2.2.x kernel, but won't it negotiate something closer
> to 32k when running nfs v3?
>

I believe the maximum NFS message size is still 8k (unless you have
Trond's rpc-twaks.diff).

Regards,
--
Steffen Persvold | Scalable Linux Systems | Try out the world's best
mailto:[email protected] | http://www.scali.com | performing MPI implementation:
Tel: (+47) 2262 8950 | Olaf Helsets vei 6 | - ScaMPI 1.13.8 -
Fax: (+47) 2262 8951 | N0621 Oslo, NORWAY | >320MBytes/s and <4uS latency

