2005-09-14 00:02:10

by Stefan Wuerthner

Subject: Sporadic timeout problems during streaming to nfs server

- I have the following setup:

DBox (DVB-C receiver, 10BaseT) connected to
100MBit-Switch connected over 10m TP to
100MBit-Switch connected over 5m TP to
Netwinder (Linux server, 100BaseT)

- Export options on the Netwinder (Kernel 2.4.19-rmk7-nw1):

dbox2(rw,insecure,all_squash)

- Mount options on the DBox (Kernel 2.4.31-dbox2) to stream the DVB-TS:

rw,soft,udp,nolock,rsize=8192,wsize=8192

- This results in:

192.168.24.50:/home/wuerthne on /mnt/filme type nfs (rw,v3,rsize=8192,
wsize=8192,soft,udp,nolock,addr=192.168.24.50)
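
For context on the combination of udp and wsize=8192: each 8 KB NFS write travels as one UDP datagram, which IP has to split into several fragments on ordinary Ethernet, and losing any single fragment loses the whole write. A rough sketch of that arithmetic follows. The 1500-byte MTU, the 150-byte RPC/NFS header size, and the 0.1% frame loss rate are assumptions for illustration, not values measured on this setup.

```python
import math

MTU = 1500            # assumed standard Ethernet MTU
IP_HEADER = 20        # IPv4 header without options
UDP_HEADER = 8
RPC_NFS_HEADER = 150  # rough guess for RPC + NFSv3 WRITE headers; varies in practice


def fragments_per_write(wsize, mtu=MTU):
    """Number of IP fragments needed for one NFS WRITE sent over UDP."""
    datagram = UDP_HEADER + RPC_NFS_HEADER + wsize
    # Fragment payloads (except the last) must be multiples of 8 bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)


def write_loss_probability(frame_loss, wsize, mtu=MTU):
    """Chance that a whole WRITE is lost if frames are dropped independently."""
    n = fragments_per_write(wsize, mtu)
    return 1 - (1 - frame_loss) ** n


if __name__ == "__main__":
    for wsize in (1024, 4096, 8192):
        n = fragments_per_write(wsize)
        p = write_loss_probability(0.001, wsize)  # 0.1% frame loss, assumed
        print(f"wsize={wsize}: {n} fragment(s), loss per write ~{p:.4%}")
```

Under these assumptions an 8192-byte write needs six fragments, so a smaller wsize trades throughput for a lower chance that one dropped frame costs a whole write.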


Unfortunately, I sometimes get a timeout error in the serial logfile of my
DBox:

[Tue Sep 13 19:15:49 2005]nfs: server 192.168.24.50 not responding, timed out
[Tue Sep 13 19:15:49 2005][stream2file]: error in write: Input/output error
[Tue Sep 13 19:15:51 2005][stream2file] pthreads exit code: 4294967293
[Tue Sep 13 19:15:56 2005][neutrino.cpp] executing /var/tuxbox/config/recording.start.
[Tue Sep 13 19:15:57 2005]Record channel_id: 44d00016dcd epg: 44d00016dcda60c, apids mode 1
[Tue Sep 13 19:46:43 2005]nfs: server 192.168.24.50 not responding, timed out
[Tue Sep 13 19:46:43 2005][stream2file]: error in write: Input/output error
[Tue Sep 13 19:46:44 2005][stream2file] pthreads exit code: 4294967293
[Tue Sep 13 19:46:49 2005][neutrino.cpp] executing /var/tuxbox/config/recording.start.

That is, the client detects a timeout and restarts the recording. This
results in two or more fragments, which is not desirable.
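
The exit code 4294967293 in the log is consistent with a small negative return value printed as an unsigned 32-bit number. A minimal sketch of the conversion follows; what the resulting value means inside stream2file is not stated in the log and is not assumed here.

```python
def as_signed32(value):
    """Reinterpret an unsigned 32-bit integer as a signed two's-complement value."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


if __name__ == "__main__":
    print(as_signed32(4294967293))  # prints -3
```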

For some time now I have been experimenting with different NFS options,
swapping switches, and so on, but to no avail. Sometimes I can stream 5 GB
without problems; sometimes the timeout occurs after a short time.


Any ideas what I can change or how I can diagnose this problem further to
avoid the timeouts?


Stefan

--
-----------------------------------------------------------------------------
Stefan Wuerthner web http://wuerthner.dyndns.org
-----------------------------------------------------------------------------


-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server.
Download it for free - -and be entered to win a 42" plasma tv or your very
own Sony(tm)PSP. Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2005-09-14 02:17:07

by Lever, Charles

Subject: RE: Sporadic timeout problems during streaming to nfs server

don't use the "soft" option -- use "hard" instead. "soft" with "udp" is
a sure recipe for data corruption and the very symptoms you describe.
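
To illustrate why a soft UDP mount gives up after only a few seconds of silence, here is a rough sketch of the retry schedule. It assumes the defaults described in the nfs(5) manual page of that era (timeo=7, i.e. 0.7 s, doubling after each timeout up to 60 s, and retrans=3). The mount line in this thread does not set these options, and the exact accounting differs between kernel versions, so treat the numbers as an approximation.

```python
def minor_timeouts(timeo_ds=7, retrans=3, cap_s=60.0):
    """Return the successive wait times (in seconds) before a major timeout.

    timeo_ds is given in tenths of a second, like the timeo mount option.
    """
    waits = []
    wait = timeo_ds / 10.0
    for _ in range(retrans):
        waits.append(min(wait, cap_s))
        wait *= 2  # the timeout doubles after every unanswered request
    return waits


def time_to_major_timeout(timeo_ds=7, retrans=3, cap_s=60.0):
    """Total silence (in seconds) tolerated before a major timeout."""
    return sum(minor_timeouts(timeo_ds, retrans, cap_s))


if __name__ == "__main__":
    print(minor_timeouts())                    # [0.7, 1.4, 2.8]
    print(round(time_to_major_timeout(), 1))   # 4.9
```

On a soft mount the pending write fails with an I/O error once this window has passed; on a hard mount the client logs "still trying" and keeps retrying. Larger timeo or retrans values stretch the window.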




2005-09-14 09:59:12

by Stefan Wuerthner

Subject: RE: Sporadic timeout problems during streaming to nfs server

In message <[email protected]> you wrote:

> don't use the "soft" option -- use "hard" instead. "soft" with "udp" is
> a sure recipe for data corruption and the very symptoms you describe
> below.
>

I already tried "hard", but without success:

The serial log tells me:

[Fri Sep 9 11:50:20 2005]
nfs: server 192.168.24.50 not responding, still trying
[Fri Sep 9 12:58:53 2005]
PANIC: not enough space in ringbuffer, available 42887, needed 93225


That is, the client cannot reach the server, so the client's ring buffer
overflows. This results in a total lockup of the client.
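
For a sense of scale, the PANIC line can be turned into time. The sketch below assumes that the two numbers are bytes and that the stream runs at about 8 Mbit/s; the log states neither, so both are assumptions.

```python
def shortfall(available, needed):
    """Bytes missing when a chunk of `needed` bytes must fit into `available`."""
    return max(0, needed - available)


def stream_seconds(num_bytes, bitrate_bps):
    """Stream time represented by num_bytes at the given bitrate."""
    return num_bytes * 8 / bitrate_bps


if __name__ == "__main__":
    available, needed = 42887, 93225   # taken from the PANIC line
    bitrate = 8_000_000                # assumed stream rate in bit/s
    print(shortfall(available, needed))  # 50338
    print(f"{stream_seconds(needed, bitrate) * 1000:.0f} ms of stream")
```

At the assumed rate, the requested chunk corresponds to roughly a tenth of a second of video.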




2005-09-14 14:00:55

by Lever, Charles

Subject: RE: Sporadic timeout problems during streaming to nfs server

> In message <[email protected]> you wrote:
>
> > don't use the "soft" option -- use "hard" instead. "soft" with "udp" is
> > a sure recipe for data corruption and the very symptoms you describe
> > below.
> >
>
> I already tried "hard", but without success:
>
> The serial log tells me:
>
> [Fri Sep 9 11:50:20 2005]
> nfs: server 192.168.24.50 not responding, still trying
> [Fri Sep 9 12:58:53 2005]
> PANIC: not enough space in ringbuffer, available 42887, needed 93225
>
>
> E.g. the client cannot reach the server and therefore the client ringbuffer
> overflows. This results in a total lockup of the client...

a ring buffer overflow is a NIC driver problem. the NFS timeout issues
suggest a network problem (either the link or the NICs or...). i would
start looking closely at your client-side network (driver version,
hardware, cabling, switch, etc).

in the long run you want to use NFS over TCP rather than UDP (and don't
use "soft").




2005-09-14 21:11:03

by Stefan Wuerthner

Subject: RE: Sporadic timeout problems during streaming to nfs server

In message <[email protected]> you wrote:

> > In message
> > <[email protected]>
> > you wrote:
> >
> > > don't use the "soft" option -- use "hard" instead. "soft" with "udp" is
> > > a sure recipe for data corruption and the very symptoms you describe
> > > below.
> > >
> >
> > I already tried "hard", but without success:
> >
> > The serial log tells me:
> >
> > [Fri Sep 9 11:50:20 2005]
> > nfs: server 192.168.24.50 not responding, still trying
> > [Fri Sep 9 12:58:53 2005]
> > PANIC: not enough space in ringbuffer, available 42887, needed 93225
> >
> >
> > E.g. the client cannot reach the server and therefore the
> > client ringbuffer
> > overflows. This results in a total lockup of the client...
>
> a ring buffer overflow is a NIC driver problem. the NFS timeout issues
> suggest a network problem (either the link or the NICs or...). i would
> start looking closely at your client-side network (driver version,
> hardware, cabling, switch, etc).
>
> in the long run you want to use NFS over TCP rather than UDP (and don't
> use "soft").
>

The client side is more or less fixed, because it is not a computer but a
consumer device running BusyBox. The Ethernet controller is partly
implemented by the CPU (a slow PPC).

As far as I know, the ring buffer is not part of the NIC; it was added
some time ago to buffer peak bitrates in the TS stream. The hardware
should be OK (Cat5e+ cabling, Allied Telesyn switches).

The difficult points are:

1. Real-time application: video streaming (MPEG2 stream)
2. Receiver NIC is limited to 10 MBit/s half-duplex
3. Video streams can peak at 8-10 MBit/s
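
The margin on a 10 MBit/s link can be estimated with a back-of-the-envelope calculation. The sketch below counts only the fixed Ethernet and IPv4 framing overhead for full-size frames, assuming a 1500-byte MTU. It ignores UDP/RPC/NFS headers, server replies, and half-duplex collisions, all of which reduce the usable rate further.

```python
LINK_BPS = 10_000_000
MTU = 1500  # assumed standard Ethernet MTU
# Bytes on the wire per frame beyond the IP packet itself:
# preamble + SFD (8) + Ethernet header (14) + FCS (4) + inter-frame gap (12)
ETH_OVERHEAD = 8 + 14 + 4 + 12
IP_HEADER = 20


def max_ip_payload_bps(link_bps=LINK_BPS, mtu=MTU):
    """Upper bound on IP payload throughput with back-to-back full-size frames."""
    return link_bps * (mtu - IP_HEADER) / (mtu + ETH_OVERHEAD)


def headroom(stream_bps, link_bps=LINK_BPS):
    """Fraction of the payload ceiling left over (negative means overload)."""
    ceiling = max_ip_payload_bps(link_bps)
    return (ceiling - stream_bps) / ceiling


if __name__ == "__main__":
    print(f"ceiling: {max_ip_payload_bps() / 1e6:.2f} Mbit/s")
    for rate in (8_000_000, 10_000_000):
        print(f"{rate / 1e6:.0f} Mbit/s stream: headroom {headroom(rate):+.1%}")
```

Under these assumptions a sustained 10 MBit/s peak does not fit on the link at all, and an 8 MBit/s peak leaves less than 20% of slack before any protocol headers or retransmissions are counted.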

So:

1. NFS over TCP is too slow
2. - "hard" is not an option, because in the end it stops writing the stream
     entirely
   - "soft" allows the server to restart the recording after several seconds,
     but there is a small interruption in the final recording

The interesting questions are:

Why does streaming work for, e.g., 30 min or 65 min and then fail with a
timeout?

How can I get more information on the server side about what exactly happened
at that point in time? I could not find any logs related to server-side NFS.






2005-09-14 21:23:37

by Peter Staubach

Subject: Re: Sporadic timeout problems during streaming to nfs server

Stefan Wuerthner wrote:

>In message <[email protected]> you wrote:
>
>>>In message
>>><[email protected]>
>>> you wrote:
>>>
>>>>don't use the "soft" option -- use "hard" instead. "soft" with "udp" is
>>>>a sure recipe for data corruption and the very symptoms you describe
>>>>below.
>>>>
>>>I already tried "hard", but without success:
>>>
>>>The serial log tells me:
>>>
>>>[Fri Sep 9 11:50:20 2005]
>>>nfs: server 192.168.24.50 not responding, still trying
>>>[Fri Sep 9 12:58:53 2005]
>>>PANIC: not enough space in ringbuffer, available 42887, needed 93225
>>>
>>>
>>>E.g. the client cannot reach the server and therefore the
>>>client ringbuffer
>>>overflows. This results in a total lockup of the client...
>>>
>>>
>>a ring buffer overflow is a NIC driver problem. the NFS timeout issues
>>suggest a network problem (either the link or the NICs or...). i would
>>start looking closely at your client-side network (driver version,
>>hardware, cabling, switch, etc).
>>
>>in the long run you want to use NFS over TCP rather than UDP (and don't
>>use "soft").
>>
>>
>>
>
>Client side is more or less fixed because it's not a computer but a
>consumer device running busybox. Ethernet controller is partially
>formed by the CPU (slow PPC)...
>
>The ring buffer is afaik not part of the NIC, but has been added
>some time ago to buffer peak bitrates in the TS stream. Hardware
>should be o.k. (Cat5e+ cabling, Allied Telesyn switches).
>
>The difficult points are:
>
>1. Real-time application: video streaming (MPEG2 stream)
>2. Receiver NIC is limited to 10MBit half-duplex
>3. video streams can peak to 8-10MBit...
>
>So:
>
>1. NFS over TCP is too slow
>2. - "hard" is no option, because it stops writing the stream finally
> - "soft" allows the server to restart the recording after several seconds
> but there is a small interruption in the final recording
>
>The interesting questions are:
>
>Why does streaming work for e.g. 30min or 65min and then fail with
>'timeout'?
>
>How can I get more information on the server side what happened exactly at
>this point of time. I could not find any logs related to serverside NFS.
>

10 Mb/s is really slow by today's standards. Almost anything ought to
be able to drive the network at full speed, even with TCP.

Can you try this without the ring buffer?

I would suggest using ethereal or tcpdump on the server to watch the
traffic and see if anything appears unusual when the situation occurs.
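
To know which part of a capture to inspect, it helps to pull the exact timeout moments out of the DBox serial log first. A small sketch follows, assuming the single-line log format shown in the first message of this thread (bracketed timestamp followed directly by the "not responding" message). The function name is illustrative, and log entries that put the timestamp on its own line would need extra handling.

```python
import re
from datetime import datetime

# Matches lines such as:
# [Tue Sep 13 19:15:49 2005]nfs: server 192.168.24.50 not responding, timed out
PATTERN = re.compile(r"^\[(?P<ts>[^\]]+)\]nfs: server (?P<host>\S+) not responding")


def timeout_events(lines):
    """Return (timestamp, server) pairs for every NFS timeout line."""
    events = []
    for line in lines:
        m = PATTERN.match(line.strip())
        if m:
            # English day/month names are assumed (the default C locale).
            ts = datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")
            events.append((ts, m.group("host")))
    return events


if __name__ == "__main__":
    sample = [
        "[Tue Sep 13 19:15:49 2005]nfs: server 192.168.24.50 not responding, timed out",
        "[Tue Sep 13 19:15:49 2005][stream2file]: error in write: Input/output error",
        "[Tue Sep 13 19:46:43 2005]nfs: server 192.168.24.50 not responding, timed out",
    ]
    events = timeout_events(sample)
    for (t0, _), (t1, _) in zip(events, events[1:]):
        print(f"{t0:%H:%M:%S} -> {t1:%H:%M:%S}: {t1 - t0} between timeouts")
```

The extracted timestamps give the windows to look at in a server-side capture, provided the clocks of the two machines are reasonably in sync.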

ps

